A multidimensional generalization of the binomial distribution. Assume that each experiment has \(k\) possible outcomes (numbered \(1,2,\ldots,k\)). The probabilities of these outcomes are \(p_1,p_2,\ldots,p_k,\) so that \[ p_1,\ldots,p_k\geq 0, \ p_1+\ldots+p_k=1. \] The multinomial distribution describes the numbers of outcomes of each type in \(n\) independent repetitions of the experiment:

\((X_1,\ldots,X_k)=(m_1,\ldots,m_k)\) if exactly \(m_1\) experiments resulted in outcome \(1,\) exactly \(m_2\) experiments resulted in outcome \(2,\ldots,\) exactly \(m_k\) experiments resulted in outcome \(k.\)
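As a quick illustration, here is a minimal Python sketch that draws one multinomial vector by literally running \(n\) independent experiments and counting the outcomes (the helper name sample_multinomial is ours, not a standard API):

```python
import random
from collections import Counter

def sample_multinomial(n, p, rng=random):
    """Draw (X_1, ..., X_k): run n independent experiments, each giving
    outcome j (1-based) with probability p[j-1], then count the outcomes."""
    outcomes = rng.choices(range(1, len(p) + 1), weights=p, k=n)
    counts = Counter(outcomes)
    return tuple(counts[j] for j in range(1, len(p) + 1))

print(sample_multinomial(10, [0.5, 0.3, 0.2]))  # e.g. (5, 3, 2)
```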

Parameters: \(n\) – number of experiments, \(k\) – number of outcomes in each experiment, \((p_1,\ldots,p_k)\) – probability distribution of an outcome in each experiment.

Values: all sequences \((m_1,\ldots,m_k)\) of non-negative integers that sum to \(n\) (there are \({n+k-1\choose k-1}\) such sequences, by stars and bars).
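For small parameters the support can be enumerated directly and the stars-and-bars count verified; a short sketch (the recursive compositions helper is written just for this note):

```python
from math import comb

def compositions(n, k):
    """Enumerate all (m_1, ..., m_k) with m_i >= 0 and m_1 + ... + m_k = n."""
    if k == 1:
        yield (n,)
        return
    for m in range(n + 1):
        for rest in compositions(n - m, k - 1):
            yield (m,) + rest

n, k = 4, 3
support = list(compositions(n, k))
assert len(support) == comb(n + k - 1, k - 1)  # 15 sequences for n = 4, k = 3
```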

Probability mass function: \[ P(X_1=m_1,\ldots,X_k=m_k)=\frac{n!}{m_1!\ldots m_k!}p^{m_1}_{1}\ldots p^{m_k}_k. \]
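The formula transcribes directly into code; a minimal sketch (the helper name multinomial_pmf is ours; if SciPy is available, scipy.stats.multinomial should give the same values):

```python
from math import factorial, prod

def multinomial_pmf(ms, p):
    """n!/(m_1! ... m_k!) * p_1^m_1 * ... * p_k^m_k, with n = m_1 + ... + m_k."""
    coeff = factorial(sum(ms))
    for m in ms:
        coeff //= factorial(m)
    return coeff * prod(pj ** mj for pj, mj in zip(p, ms))

print(multinomial_pmf((2, 1, 1), (0.5, 0.3, 0.2)))  # 12 * 0.015 = 0.18
```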

Derivation

Let \(\xi_l,\) \(l=1,2,\ldots,n,\) be the result of the \(l\)-th experiment, \(\xi_l\in\{1,2,\ldots,k\}.\) The event \[ \{X_1=m_1,\ldots,X_k=m_k\} \] means that exactly \(m_1\) of the variables \(\xi_l\) equal \(1,\) exactly \(m_2\) equal \(2,\) \(\ldots,\) exactly \(m_k\) equal \(k.\) For any fixed assignment of values to \(\xi_1,\ldots,\xi_n\) with these counts, the probability is \(p^{m_1}_1\ldots p^{m_k}_k\) by independence. The number of ways to partition \(n\) elements into \(k\) groups of sizes \(m_1,m_2,\ldots,m_k\) is \(\frac{n!}{m_1!\ldots m_k!}\), which gives the formula.
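The counting step can be checked by brute force for a small concrete case (values chosen arbitrarily): list all distinct orderings of a fixed multiset of outcomes and compare with the multinomial coefficient.

```python
from itertools import permutations
from math import factorial

# One concrete run of n = 4 experiments with counts (m_1, m_2, m_3) = (2, 1, 1).
results = (1, 1, 2, 3)
distinct_orderings = set(permutations(results))
# 4! / (2! * 1! * 1!) = 12 distinct ways to place these outcomes among the runs.
assert len(distinct_orderings) == factorial(4) // factorial(2)  # 12
```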

Moment generating function: \[ M(t_1,\ldots,t_k)=Ee^{t_1X_1+\ldots+t_kX_k}=(p_1e^{t_1}+\ldots+p_k e^{t_k})^n. \]

Proof

\[ M(t_1,\ldots,t_k)=Ee^{t_1X_1+\ldots+t_kX_k}= \] using the probability mass function \[ =\sum_{m_1+\ldots+m_k=n}\frac{n!}{m_1!\ldots m_k!}p^{m_1}_1\ldots p^{m_k}_k e^{t_1m_1+\ldots +t_km_k}= \] \[ =\sum_{m_1+\ldots+m_k=n}\frac{n!}{m_1!\ldots m_k!}(p_1e^{t_1})^{m_1}\ldots (p_ke^{t_k})^{m_k}= \] by the multinomial formula \[ =(p_1 e^{t_1}+\ldots+p_k e^{t_k})^n \]
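For small \(n\) the identity can also be checked numerically by summing over all \(k^n\) raw outcome sequences; a brute-force sketch with arbitrary test values (not an efficient method):

```python
from itertools import product
from math import exp, prod, isclose

n, p, t = 3, (0.5, 0.3, 0.2), (0.1, -0.2, 0.4)

# E exp(t_1 X_1 + ... + t_k X_k): a sequence (xi_1, ..., xi_n) has probability
# prod_l p[xi_l] and contributes exp(sum_l t[xi_l]), since t . X equals the
# sum of t[xi_l] over the n experiments.
lhs = sum(prod(p[j] for j in seq) * exp(sum(t[j] for j in seq))
          for seq in product(range(len(p)), repeat=n))
rhs = sum(pj * exp(tj) for pj, tj in zip(p, t)) ** n  # closed form
assert isclose(lhs, rhs)
```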

Expectation: \(EX_j=np_j,\) \(1\leq j\leq k.\)

Variance: \(V(X_j)=np_j(1-p_j),\) \(1\leq j\leq k.\)

Covariance: \(cov(X_i,X_j)=-np_ip_j,\) \(1\leq i<j\leq k.\)
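All three formulas can be verified exactly (up to floating-point error) by the same brute-force enumeration over all \(k^n\) outcome sequences; a sketch with arbitrary test parameters:

```python
from itertools import product
from math import prod, isclose

n, p = 4, (0.5, 0.3, 0.2)
k = len(p)

def E(f):
    """Exact expectation of f(X_1, ..., X_k), summing over all k^n sequences."""
    return sum(prod(p[j] for j in seq) * f([seq.count(j) for j in range(k)])
               for seq in product(range(k), repeat=n))

assert isclose(E(lambda x: x[0]), n * p[0])                    # EX_1 = n p_1
assert isclose(E(lambda x: x[0] ** 2) - (n * p[0]) ** 2,
               n * p[0] * (1 - p[0]))                          # V(X_1)
assert isclose(E(lambda x: x[0] * x[1]) - n ** 2 * p[0] * p[1],
               -n * p[0] * p[1])                               # cov(X_1, X_2)
```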

Derivation

The moment generating function of a single variable \(X_j\) is obtained from \(M(t_1,\ldots,t_k)\) by setting \(t_i=0\) for \(i\ne j.\) That is, \[ Ee^{t_jX_j}=M(0,\ldots,0,t_j,0,\ldots,0)=(p_1+\ldots+p_{j-1}+p_j e^{t_j}+p_{j+1}+\ldots+p_k)^n= \] using that \(p_1+\ldots+p_k=1\) \[ =(1-p_j+p_je^{t_j})^n. \] This is exactly the moment generating function of a binomial distribution with parameters \(n,p_j:\) \[ X_j\sim Binomial(n,p_j) \] In particular, \[ EX_j=np_j, \ V(X_j)=np_j(1-p_j). \] To compute the expectation of the product \(EX_iX_j,\) \(i\ne j,\) we take the second mixed derivative of \(M\) at zero: \[ \frac{\partial M}{\partial t_i}=np_ie^{t_i}(p_1 e^{t_1}+\ldots+p_k e^{t_k})^{n-1} \] \[ \frac{\partial^2 M}{\partial t_i\partial t_j}=n(n-1)p_ip_je^{t_i+t_j}(p_1 e^{t_1}+\ldots+p_k e^{t_k})^{n-2} \] Setting \(t_1=\ldots=t_k=0\) gives \[ E X_iX_j=n(n-1)p_ip_j. \] So the covariance is \[ cov(X_i,X_j)=EX_iX_j-EX_iEX_j= \] \[ =n(n-1)p_ip_j-n^2p_ip_j=-np_ip_j. \]
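For readers with SymPy at hand, the differentiation step can also be checked symbolically; a sketch for the concrete case \(k=3\) with general \(n\) (eliminating \(p_3=1-p_1-p_2\) enforces the constraint):

```python
import sympy as sp

t1, t2, t3 = sp.symbols('t1 t2 t3')
p1, p2 = sp.symbols('p1 p2', positive=True)
n = sp.Symbol('n', integer=True, positive=True)
p3 = 1 - p1 - p2  # enforce p1 + p2 + p3 = 1

M = (p1 * sp.exp(t1) + p2 * sp.exp(t2) + p3 * sp.exp(t3)) ** n

# Second mixed derivative at zero: should equal E[X_1 X_2] = n(n-1) p1 p2.
EX1X2 = sp.diff(M, t1, t2).subs({t1: 0, t2: 0, t3: 0})
assert sp.simplify(EX1X2 - n * (n - 1) * p1 * p2) == 0

# Covariance: E[X_1 X_2] - EX_1 EX_2 = -n p1 p2.
assert sp.simplify(EX1X2 - n * p1 * n * p2 + n * p1 * p2) == 0
```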