A multidimensional generalization of the binomial distribution. Assume that in each experiment there are $k$ possible outcomes (enumerated by $1, 2, \ldots, k$). The probabilities of these outcomes are $p_1, p_2, \ldots, p_k$, so that $p_1, \ldots, p_k \geq 0$ and $p_1 + \ldots + p_k = 1$. The multinomial distribution describes the number of outcomes of each type in $n$ independent repetitions of the experiment:

$(X_1, \ldots, X_k) = (m_1, \ldots, m_k)$ if exactly $m_1$ experiments resulted in outcome 1, exactly $m_2$ experiments resulted in outcome 2, $\ldots$, and exactly $m_k$ experiments resulted in outcome $k$.

Parameters: $n$ – number of experiments, $k$ – number of outcomes in each experiment, $(p_1, \ldots, p_k)$ – probability distribution of the outcome in each experiment.

Values: all sequences $(m_1, \ldots, m_k)$ of non-negative integers that sum to $n$ (there are $\binom{n+k-1}{n}$ such sequences).
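The count of admissible sequences can be checked by brute force. The following is a minimal sketch: the helper name `compositions` and the sample sizes $n = 4$, $k = 3$ are illustrative assumptions, not part of the original text.

```python
from math import comb


def compositions(n, k):
    """Yield every tuple of k non-negative integers that sums to n."""
    if k == 1:
        yield (n,)
        return
    for first in range(n + 1):
        # Fix the first coordinate, then split the remainder among k - 1 slots.
        for rest in compositions(n - first, k - 1):
            yield (first,) + rest


values = list(compositions(4, 3))
print(len(values), comb(4 + 3 - 1, 4))  # prints: 15 15
```

Enumerating the support this way is only practical for small $n$ and $k$, since the number of sequences grows quickly with both.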

Probability mass function: $$P(X_1 = m_1, \ldots, X_k = m_k) = \frac{n!}{m_1! \cdots m_k!}\, p_1^{m_1} \cdots p_k^{m_k}.$$
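As an illustration, the formula translates directly into a few lines of Python. This is a sketch under stated assumptions: the function name `multinomial_pmf` and the example probabilities are hypothetical, and $n$ is taken to be the sum of the counts.

```python
from math import factorial, prod


def multinomial_pmf(m, p):
    """P(X_1 = m_1, ..., X_k = m_k) for counts m and outcome probabilities p."""
    n = sum(m)
    # Multinomial coefficient n! / (m_1! ... m_k!), kept as an exact integer.
    coef = factorial(n)
    for mj in m:
        coef //= factorial(mj)
    return coef * prod(pj ** mj for pj, mj in zip(p, m))


# Three outcomes with probabilities 0.5, 0.3, 0.2, each observed once in n = 3 trials.
print(multinomial_pmf((1, 1, 1), (0.5, 0.3, 0.2)))
# With k = 2 the formula reduces to the binomial probability C(3, 2) * 0.5^3.
print(multinomial_pmf((2, 1), (0.5, 0.5)))
```

The coefficient is computed with integer division at each step, which stays exact because every intermediate quotient is itself an integer.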

Derivation

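Let $\xi_l$, $l = 1, 2, \ldots, n$, be the result of the $l$-th experiment, $\xi_l \in \{1, 2, \ldots, k\}$. The event $\{X_1 = m_1, \ldots, X_k = m_k\}$ means that exactly $m_1$ of the variables satisfy $\xi_l = 1$, exactly $m_2$ satisfy $\xi_l = 2$, $\ldots$, and exactly $m_k$ satisfy $\xi_l = k$. Once the variables $\xi_1, \ldots, \xi_n$ are grouped according to their values, the probability of that particular grouping is, by independence, $p_1^{m_1} \cdots p_k^{m_k}$. The number of partitions of $n$ elements into $k$ groups of sizes $m_1, m_2, \ldots, m_k$ is $\frac{n!}{m_1! \cdots m_k!}$. Multiplying the two gives the probability mass function.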
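The counting argument can be verified on a small case by listing all $k^n$ result sequences $(\xi_1, \ldots, \xi_n)$ and adding up their probabilities by count vector. The sketch below assumes illustrative probabilities $(1/2, 1/3, 1/6)$ and $n = 4$, and uses exact fractions so the comparison is not affected by rounding.

```python
from collections import Counter
from fractions import Fraction
from itertools import product
from math import factorial

p = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]  # illustrative probabilities
n, k = 4, len(p)

# Accumulate the probability of every sequence (xi_1, ..., xi_n) under its count vector.
by_counts = Counter()
for xi in product(range(k), repeat=n):
    counts = tuple(xi.count(j) for j in range(k))
    prob = Fraction(1)
    for outcome in xi:
        prob *= p[outcome]  # independence: probabilities multiply
    by_counts[counts] += prob


def pmf(m):
    """Closed form: multinomial coefficient times p_1^m_1 ... p_k^m_k."""
    coef = factorial(n)
    for mj in m:
        coef //= factorial(mj)
    value = Fraction(coef)
    for pj, mj in zip(p, m):
        value *= pj ** mj
    return value


assert all(by_counts[m] == pmf(m) for m in by_counts)
```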

Moment generating function: $$M(t_1, \ldots, t_k) = E e^{t_1 X_1 + \ldots + t_k X_k} = (p_1 e^{t_1} + \ldots + p_k e^{t_k})^n.$$

Proof

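Using the probability mass function and then the multinomial formula,

$$\begin{aligned}
M(t_1, \ldots, t_k) &= E e^{t_1 X_1 + \ldots + t_k X_k} \\
&= \sum_{m_1 + \ldots + m_k = n} \frac{n!}{m_1! \cdots m_k!}\, p_1^{m_1} \cdots p_k^{m_k}\, e^{t_1 m_1 + \ldots + t_k m_k} \\
&= \sum_{m_1 + \ldots + m_k = n} \frac{n!}{m_1! \cdots m_k!}\, (p_1 e^{t_1})^{m_1} \cdots (p_k e^{t_k})^{m_k} \\
&= (p_1 e^{t_1} + \ldots + p_k e^{t_k})^n.
\end{aligned}$$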
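A numerical spot check of this identity compares the explicit sum over the support with the closed form. This is only a sketch: the function names and the sample values of $p$, $t$ and $n$ are assumptions chosen for illustration.

```python
from itertools import product
from math import exp, factorial, isclose


def mgf_by_sum(t, p, n):
    """Evaluate the MGF as the explicit sum over all (m_1, ..., m_k) summing to n."""
    total = 0.0
    for m in product(range(n + 1), repeat=len(p)):
        if sum(m) != n:
            continue
        coef = factorial(n)
        for mj in m:
            coef //= factorial(mj)
        term = float(coef)
        for pj, tj, mj in zip(p, t, m):
            term *= (pj * exp(tj)) ** mj
        total += term
    return total


def mgf_closed_form(t, p, n):
    """Evaluate (p_1 e^{t_1} + ... + p_k e^{t_k})^n."""
    return sum(pj * exp(tj) for pj, tj in zip(p, t)) ** n


p = (0.5, 0.3, 0.2)
t = (0.1, -0.2, 0.3)
n = 4
assert isclose(mgf_by_sum(t, p, n), mgf_closed_form(t, p, n), rel_tol=1e-9)
```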

Expectation: $E X_j = n p_j$, $1 \leq j \leq k$.

Variance: $V(X_j) = n p_j (1 - p_j)$, $1 \leq j \leq k$.

Covariance: $\mathrm{cov}(X_i, X_j) = -n p_i p_j$, $1 \leq i < j \leq k$.

Derivation

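The moment generating function of a single variable $X_j$ is obtained from $M(t_1, \ldots, t_k)$ by letting $t_i = 0$ for $i \neq j$. That is,

$$E e^{t_j X_j} = M(0, \ldots, 0, t_j, 0, \ldots, 0) = (p_1 + \ldots + p_{j-1} + p_j e^{t_j} + p_{j+1} + \ldots + p_k)^n = (1 - p_j + p_j e^{t_j})^n,$$

where the last equality uses $p_1 + \ldots + p_k = 1$. This is exactly the moment generating function of a binomial distribution with parameters $n, p_j$:

$$X_j \sim \mathrm{Binomial}(n, p_j).$$

In particular, $E X_j = n p_j$ and $V(X_j) = n p_j (1 - p_j)$.

To compute the expectation of the product $E X_i X_j$, $i \neq j$, we take the second mixed derivative of $M$ at zero:

$$\frac{\partial M}{\partial t_i} = n p_i e^{t_i} (p_1 e^{t_1} + \ldots + p_k e^{t_k})^{n-1},$$

$$\frac{\partial^2 M}{\partial t_i \, \partial t_j} = n(n-1)\, p_i p_j\, e^{t_i + t_j} (p_1 e^{t_1} + \ldots + p_k e^{t_k})^{n-2}.$$

Put $t_1 = \ldots = t_k = 0$ and get

$$E X_i X_j = n(n-1)\, p_i p_j.$$

So the covariance is

$$\mathrm{cov}(X_i, X_j) = E X_i X_j - E X_i \, E X_j = n(n-1)\, p_i p_j - n^2 p_i p_j = -n p_i p_j.$$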
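The three moment formulas can also be confirmed by summing directly over the support with exact rational arithmetic. The sketch below assumes illustrative parameters $n = 5$ and $p = (1/2, 1/3, 1/6)$; the helper names `pmf` and `expect` are hypothetical.

```python
from fractions import Fraction
from itertools import product
from math import factorial

p = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]  # illustrative probabilities
n, k = 5, len(p)


def pmf(m):
    """Multinomial probability of the count vector m, as an exact fraction."""
    coef = factorial(n)
    for mj in m:
        coef //= factorial(mj)
    value = Fraction(coef)
    for pj, mj in zip(p, m):
        value *= pj ** mj
    return value


# All count vectors of non-negative integers that sum to n.
support = [m for m in product(range(n + 1), repeat=k) if sum(m) == n]


def expect(f):
    """Expectation of f(X_1, ..., X_k) by direct summation over the support."""
    return sum(f(m) * pmf(m) for m in support)


mean = [expect(lambda m, j=j: m[j]) for j in range(k)]
var = [expect(lambda m, j=j: m[j] ** 2) - mean[j] ** 2 for j in range(k)]
cov01 = expect(lambda m: m[0] * m[1]) - mean[0] * mean[1]

assert mean == [n * pj for pj in p]            # E X_j = n p_j
assert var == [n * pj * (1 - pj) for pj in p]  # V(X_j) = n p_j (1 - p_j)
assert cov01 == -n * p[0] * p[1]               # cov(X_1, X_2) = -n p_1 p_2
```

The negative covariance reflects the constraint $X_1 + \ldots + X_k = n$: more outcomes of one type leave fewer trials for the others.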