In statistics, we often estimate [[Expectation]] using averages. The Law of Large Numbers ("LLN") ensures that the sample mean $\bar X_n$ converges to the true mean $\mu$ as the number of observations $n$ increases, provided certain assumptions are met.

$$
\bar X_n = \frac{1}{n} \sum_{i=1}^n X_i \xrightarrow[n \to \infty]{\mathbf P / \text{a.s.}} \mu
$$

**Assumptions:**

- All [[Random Variable|random variables]] $X_i$ are [[Independence and Identical Distribution|i.i.d.]].
- Each $X_i$ has a finite expected value $\mu$ and a finite variance $\sigma^2$.
- Convergence only becomes apparent as $n$ grows.

**Forms of Law of Large Numbers:**

- *Weak law of large numbers ("WLLN"):* It relies on [[Modes of Convergence#Convergence in Probability|Convergence in Probability]]: the larger the number $n$ of i.i.d. r.v.'s, the smaller the probability that the sample mean is far off from the true mean.
- *Strong law of large numbers ("SLLN"):* It relies on [[Modes of Convergence#Convergence Almost Surely|Convergence Almost Surely]]: with probability $1$, the sequence of sample means converges to the true mean.

This justifies treating observed frequencies as probabilities when $n$ is large enough.

## Expectation of the Sample Mean

We can show that the expectation of the sample mean $\bar X_n$ equals the theoretical mean $\mu$.

$$
\begin{align}
\mathbb{E}[\bar X_n]&=\frac{\mathbb{E}[X_1+ \dots + X_n]}{n} \tag{1}\\[2pt]
&=\frac{\mathbb{E}[X_1] + \dots + \mathbb{E}[X_n]}{n} \tag{2}\\[2pt]
&=\frac{n \mu}{n} \tag{3}\\[4pt]
&=\mu \tag{4}
\end{align}
$$

where:

- (1) As $n$ is a constant, it can be pulled out of the expectation.
- (2) [[Linearity of Expectations]] allows us to split the expectation of a sum into a sum of expectations.
- (3) Each $\mathbb E[X_i]= \mu$.

**Interpretation:** As an example, consider the average of $n$ coin flips as a single experiment with outcome $\bar X_n$. Taking the expectation of that says the average over many such experiment results tends towards the true mean $\mu$.

## Variance of the Sample Mean

We can show that as $n$ increases, the variance of the sample mean $\bar X_n$ decreases, which leads to more stable sample means across single experiments.

$$
\begin{align}
\mathrm{Var}(\bar X_n) &= \mathrm{Var} \left(\frac{1}{n}(X_1+\dots+X_n) \right) \tag{1}\\[4pt]
&= \frac{1}{n^2}\,\mathrm{Var}(X_1+ \dots + X_n) \tag{2}\\[6pt]
&=\frac{1}{n^2}\left( \mathrm{Var}(X_1)+ \dots + \mathrm{Var}(X_n)\right) \tag{3}\\[4pt]
&=\frac{n\sigma^2}{n^{2}}\tag{4}\\[6pt]
&=\frac{\sigma^2}{n} \tag{5}
\end{align}
$$

where:

- (2) Pulling a constant factor out of the variance squares it ([[Variance after Linear Transformation#Scaling by Factor|factor from the variance]]).
- (3) Since all $X_i$ are independent, the [[Variance of Sum of Random Variables#Special Case of Independence|Variance of Sum of Random Variables]] equals the sum of the individual variances.

**Interpretation:** The longer the sequence of coin flips in each single experiment, the less volatile the resulting sample means.
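To make these two facts concrete, here is a minimal simulation sketch (assuming NumPy and a fair coin, so $\mu = 0.5$ and $\sigma^2 = 0.25$; the experiment sizes are illustrative). It repeats the $n$-flip experiment many times and compares the empirical mean and variance of the resulting sample means against $\mu$ and $\sigma^2 / n$.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

n = 100                  # coin flips per single experiment
experiments = 10_000     # number of repeated experiments
mu, sigma2 = 0.5, 0.25   # true mean and variance of one fair coin flip

# Each row is one experiment of n flips; each row mean is one sample mean Xbar_n.
flips = rng.integers(0, 2, size=(experiments, n), dtype=np.uint8)
sample_means = flips.mean(axis=1)

print(f"E[Xbar_n]   ~ {sample_means.mean():.4f}   (theory: {mu})")
print(f"Var(Xbar_n) ~ {sample_means.var():.5f}  (theory: {sigma2 / n})")
```

With these settings the empirical values should land close to $0.5$ and $0.0025$, in line with the derivations above.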
**Chebyshev inequality:** The [[Modes of Convergence#Convergence in Probability|Convergence in Probability]] can be shown via the [[Chebyshev Inequality]]. We simply plug in $\bar X_n$ and $\mathrm{Var}(\bar X_n)$: the probability of deviating from $\mu$ by at least a threshold $c$ converges to zero as $n \to \infty$.

$$
\begin{align}
\mathbf P\big(\lvert \bar X_n- \mu \rvert \ge c\big) &\le \frac{\mathrm{Var}(\bar X_n)}{c^2}\\[4pt]
&= \frac{\sigma^2}{n c^2} \xrightarrow[n\to \infty]{}0
\end{align}
$$
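As a quick numerical check of this bound (again a sketch with the fair-coin setup and an illustrative threshold $c = 0.05$), the empirical deviation probability stays below $\sigma^2 / (n c^2)$, and both shrink towards zero as $n$ grows:

```python
import numpy as np

rng = np.random.default_rng(seed=0)

mu, sigma2 = 0.5, 0.25   # fair coin: true mean and variance
c = 0.05                 # deviation threshold (illustrative choice)
experiments = 5_000

for n in (10, 100, 1_000, 10_000):
    flips = rng.integers(0, 2, size=(experiments, n), dtype=np.uint8)
    sample_means = flips.mean(axis=1)
    empirical = np.mean(np.abs(sample_means - mu) >= c)
    bound = sigma2 / (n * c**2)
    print(f"n={n:>6}   P(|Xbar_n - mu| >= c) ~ {empirical:.4f}   Chebyshev bound: {bound:.4f}")
```

Note that for small $n$ the bound exceeds $1$ and is vacuous; the point of interest is the $1/n$ decay of both the bound and the empirical probability as $n$ grows.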