Notes on the Multinomial distribution
15.02.2018
The multinomial distribution is a multidimensional generalization of the binomial distribution. Assume that each experiment has $k$ possible outcomes (enumerated by $1, 2, \ldots, k$). The probabilities of these outcomes are $p_1, p_2, \ldots, p_k$, so that $p_1, \ldots, p_k \ge 0$ and $p_1 + \ldots + p_k = 1$. The multinomial distribution describes the number of outcomes of each type in $n$ independent repetitions of the experiment:
$(X_1, \ldots, X_k) = (m_1, \ldots, m_k)$ if exactly $m_1$ experiments resulted in outcome 1, exactly $m_2$ experiments resulted in outcome 2, …, and exactly $m_k$ experiments resulted in outcome $k$.
Parameters: $n$ – number of experiments, $k$ – number of outcomes in each experiment, $(p_1, \ldots, p_k)$ – probability distribution of the outcome in each experiment.
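The definition can be illustrated by simulating the $n$ experiments directly and counting how often each outcome occurs. The following is a minimal sketch, not a reference implementation; the function name `sample_multinomial` is hypothetical, and outcomes are labelled 0, …, k−1 in the code instead of 1, …, k.

```python
import random

def sample_multinomial(n, p, rng=random):
    """Draw one vector (m_1, ..., m_k) by running n independent experiments.

    p is the tuple of outcome probabilities; outcomes are labelled 0..k-1 here
    (an assumption of this sketch; the text labels them 1..k).
    """
    k = len(p)
    # Each experiment picks one outcome with the given probabilities.
    outcomes = rng.choices(range(k), weights=p, k=n)
    # X_j is the number of experiments that produced outcome j.
    return tuple(outcomes.count(j) for j in range(k))

rng = random.Random(0)  # fixed seed only to make the run reproducible
m = sample_multinomial(10, (0.5, 0.3, 0.2), rng)
print(m, sum(m))  # the counts always sum to n = 10
```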
Values: all sequences $(m_1, \ldots, m_k)$ of non-negative integers that sum to $n$ (there are $\binom{n+k-1}{n}$ such sequences).
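The count of possible values can be checked by brute force for small $n$ and $k$. This sketch (the helper name `count_vectors` is made up for illustration) enumerates all candidate tuples and compares the count with the binomial coefficient $\binom{n+k-1}{n}$.

```python
from itertools import product
from math import comb

def count_vectors(n, k):
    """Count k-tuples of non-negative integers that sum to n, by enumeration."""
    return sum(1 for m in product(range(n + 1), repeat=k) if sum(m) == n)

# Compare the brute-force count with C(n + k - 1, n) on a few small cases.
for n, k in [(3, 2), (4, 3), (5, 4)]:
    assert count_vectors(n, k) == comb(n + k - 1, n)
```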
Probability mass function: $P(X_1 = m_1, \ldots, X_k = m_k) = \frac{n!}{m_1! \cdots m_k!} \, p_1^{m_1} \cdots p_k^{m_k}$.
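As a small illustration, the probability mass function can be evaluated directly from this formula. The function name `multinomial_pmf` is an assumption of this sketch; $n$ is taken to be the sum of the counts.

```python
from math import factorial, prod

def multinomial_pmf(m, p):
    """P(X_1 = m_1, ..., X_k = m_k) for counts m and outcome probabilities p."""
    n = sum(m)
    # Multinomial coefficient n! / (m_1! ... m_k!), kept as an exact integer.
    coef = factorial(n)
    for mj in m:
        coef //= factorial(mj)
    return coef * prod(pj ** mj for pj, mj in zip(p, m))

# With k = 2 the formula reduces to the binomial case: C(3, 2) * 0.5**3.
print(multinomial_pmf((2, 1), (0.5, 0.5)))  # 0.375
```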
Derivation
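Let $\xi_l$, $l = 1, 2, \ldots, n$, be the result of the $l$-th experiment, $\xi_l \in \{1, 2, \ldots, k\}$. The event $\{X_1 = m_1, \ldots, X_k = m_k\}$ means that exactly $m_1$ of the variables $\xi_l$ equal 1, exactly $m_2$ equal 2, …, and exactly $m_k$ equal $k$. Once it is fixed which of the variables $\xi_1, \ldots, \xi_n$ take which value, independence gives the probability $p_1^{m_1} \cdots p_k^{m_k}$. The number of such assignments, that is, the number of partitions of $n$ elements into $k$ labelled groups of sizes $m_1, m_2, \ldots, m_k$, is $\frac{n!}{m_1! \cdots m_k!}$. Multiplying the two gives the probability mass function.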
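The counting argument can be verified numerically for small $n$ by enumerating all $k^n$ sequences of experiment results, adding up their probabilities per count vector, and comparing with the formula. This is only a sketch; the names `pmf_by_enumeration` and `pmf_by_formula` are hypothetical, and outcomes are labelled 0, …, k−1 in the code.

```python
from collections import Counter
from itertools import product
from math import factorial, isclose, prod

def pmf_by_enumeration(n, p):
    """Sum the probabilities of all k**n result sequences, grouped by counts."""
    k = len(p)
    dist = Counter()
    for xi in product(range(k), repeat=n):           # one sequence of results
        counts = tuple(xi.count(j) for j in range(k))
        dist[counts] += prod(p[j] for j in xi)       # independence of experiments
    return dict(dist)

def pmf_by_formula(m, p):
    """n! / (m_1! ... m_k!) * p_1**m_1 * ... * p_k**m_k."""
    coef = factorial(sum(m))
    for mj in m:
        coef //= factorial(mj)
    return coef * prod(pj ** mj for pj, mj in zip(p, m))

p = (0.5, 0.3, 0.2)
dist = pmf_by_enumeration(4, p)
assert all(isclose(prob, pmf_by_formula(m, p)) for m, prob in dist.items())
```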
Moment generating function: $M(t_1, \ldots, t_k) = E e^{t_1 X_1 + \ldots + t_k X_k} = (p_1 e^{t_1} + \ldots + p_k e^{t_k})^n$.
Proof
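Using the probability mass function,
$$M(t_1, \ldots, t_k) = E e^{t_1 X_1 + \ldots + t_k X_k} = \sum_{m_1 + \ldots + m_k = n} \frac{n!}{m_1! \cdots m_k!} \, p_1^{m_1} \cdots p_k^{m_k} \, e^{t_1 m_1 + \ldots + t_k m_k} = \sum_{m_1 + \ldots + m_k = n} \frac{n!}{m_1! \cdots m_k!} \, (p_1 e^{t_1})^{m_1} \cdots (p_k e^{t_k})^{m_k},$$
which by the multinomial formula equals
$$(p_1 e^{t_1} + \ldots + p_k e^{t_k})^n.$$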
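The identity in the proof can be checked numerically at a sample point $(t_1, \ldots, t_k)$ by comparing the sum over all count vectors with the closed form. This is a sketch under the assumption that $n$ and $k$ are small enough for enumeration; the function names are made up for illustration.

```python
from itertools import product
from math import exp, factorial, isclose, prod

def pmf(m, p):
    """Multinomial probability mass function for counts m and probabilities p."""
    coef = factorial(sum(m))
    for mj in m:
        coef //= factorial(mj)
    return coef * prod(pj ** mj for pj, mj in zip(p, m))

def mgf_by_sum(t, n, p):
    """E exp(t_1 X_1 + ... + t_k X_k), summed over all count vectors."""
    total = 0.0
    for m in product(range(n + 1), repeat=len(p)):
        if sum(m) == n:
            total += pmf(m, p) * exp(sum(tj * mj for tj, mj in zip(t, m)))
    return total

def mgf_closed_form(t, n, p):
    """(p_1 e^{t_1} + ... + p_k e^{t_k}) ** n."""
    return sum(pj * exp(tj) for pj, tj in zip(p, t)) ** n

t, n, p = (0.3, -0.2, 0.1), 5, (0.5, 0.3, 0.2)
assert isclose(mgf_by_sum(t, n, p), mgf_closed_form(t, n, p))
```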
Expectation: $E X_j = n p_j$, $1 \le j \le k$.
Variance: $V(X_j) = n p_j (1 - p_j)$, $1 \le j \le k$.
Covariance: $\mathrm{cov}(X_i, X_j) = -n p_i p_j$, $1 \le i < j \le k$.
Derivation
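The moment generating function of a single variable $X_j$ is obtained from $M(t_1, \ldots, t_k)$ by setting $t_i = 0$ for $i \ne j$. Using $p_1 + \ldots + p_k = 1$,
$$E e^{t_j X_j} = M(0, \ldots, 0, t_j, 0, \ldots, 0) = (p_1 + \ldots + p_{j-1} + p_j e^{t_j} + p_{j+1} + \ldots + p_k)^n = (1 - p_j + p_j e^{t_j})^n.$$
This is exactly the moment generating function of a binomial distribution with parameters $n, p_j$, so $X_j \sim \mathrm{Binomial}(n, p_j)$. In particular, $E X_j = n p_j$ and $V(X_j) = n p_j (1 - p_j)$.
To compute the expectation of the product $E X_i X_j$, $i \ne j$, we take the second mixed derivative of $M$ at zero:
$$\frac{\partial M}{\partial t_i} = n p_i e^{t_i} (p_1 e^{t_1} + \ldots + p_k e^{t_k})^{n-1},$$
$$\frac{\partial^2 M}{\partial t_i \partial t_j} = n (n-1) p_i p_j e^{t_i + t_j} (p_1 e^{t_1} + \ldots + p_k e^{t_k})^{n-2}.$$
Setting $t_1 = \ldots = t_k = 0$ gives $E X_i X_j = n (n-1) p_i p_j$. So the covariance is
$$\mathrm{cov}(X_i, X_j) = E X_i X_j - E X_i \, E X_j = n (n-1) p_i p_j - n^2 p_i p_j = -n p_i p_j.$$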
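The formulas for the expectation, variance and covariance can be confirmed exactly on a small example by summing over all possible count vectors with rational arithmetic. The sketch below is illustrative only; `exact_moments` is a hypothetical helper, and the probabilities are assumed to be given as `Fraction` values so that no rounding occurs.

```python
from fractions import Fraction
from itertools import product
from math import factorial

def exact_moments(n, p):
    """Return (means, covariance matrix) computed by summing over all count vectors."""
    k = len(p)
    mean = [Fraction(0)] * k
    second = [[Fraction(0)] * k for _ in range(k)]   # E X_i X_j
    for m in product(range(n + 1), repeat=k):
        if sum(m) != n:
            continue
        # Probability mass n!/(m_1!...m_k!) * p_1^m_1 ... p_k^m_k as a Fraction.
        prob = Fraction(factorial(n))
        for mj, pj in zip(m, p):
            prob = prob / factorial(mj) * pj ** mj
        for i in range(k):
            mean[i] += prob * m[i]
            for j in range(k):
                second[i][j] += prob * m[i] * m[j]
    cov = [[second[i][j] - mean[i] * mean[j] for j in range(k)] for i in range(k)]
    return mean, cov

n, p = 5, (Fraction(1, 2), Fraction(1, 3), Fraction(1, 6))
mean, cov = exact_moments(n, p)
for i in range(3):
    assert mean[i] == n * p[i]                       # E X_j = n p_j
    assert cov[i][i] == n * p[i] * (1 - p[i])        # V(X_j) = n p_j (1 - p_j)
    for j in range(3):
        if i != j:
            assert cov[i][j] == -n * p[i] * p[j]     # cov(X_i, X_j) = -n p_i p_j
```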