In statistics, we often estimate [[Expectation]] using averages. The Law of Large Numbers ("LLN") ensures that the sample mean $\bar X_n$ converges to the true mean $\mu$ as the number of observations $n$ increases, provided specific assumptions are met.
$ \bar X_n = \frac{1}{n} \sum_{i=1}^n X_i \xrightarrow[n \to \infty]{\mathbf P / \text{a.s.}} \mu $
**Assumptions:**
- All [[Random Variable|random variables]] $X_i$ are [[Independence and Identical Distribution|i.i.d.]].
- Each $X_i$ has a finite expected value $\mu$ and a finite variance $\sigma^2$.
- Convergence becomes more apparent as $n$ grows (a consequence of the law rather than an assumption).
**Forms of Law of Large Numbers:**
- *Weak law of large numbers ("WLLN"):* It relies on [[Modes of Convergence#Convergence in Probability|Convergence in Probability]], which states that the larger the number $n$ of i.i.d. r.v.'s, the smaller the probability that the sample mean deviates far from the true mean (see the simulation sketch below).
- *Strong law of large numbers ("SLLN"):* It relies on [[Modes of Convergence#Convergence Almost Surely|Convergence Almost Surely]], which states that with probability $1$ the sequence of sample means converges to the true mean. This justifies treating observed frequencies as probabilities when $n$ is large enough.
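A minimal simulation sketch of the law in action, assuming fair-coin flips (so $\mu = 0.5$) and using `numpy`; the seed and the sample sizes are arbitrary choices for illustration.

```python
import numpy as np

# Track the running sample mean of i.i.d. fair-coin flips (0/1, true mean 0.5)
# and watch it settle near the true mean as n grows.
rng = np.random.default_rng(seed=0)

n = 100_000
flips = rng.integers(0, 2, size=n)                     # i.i.d. Bernoulli(0.5) draws
running_mean = np.cumsum(flips) / np.arange(1, n + 1)  # sample mean after each flip

for k in (10, 100, 1_000, 10_000, 100_000):
    print(f"n = {k:>7}: sample mean = {running_mean[k - 1]:.4f}")
# The printed sample means drift toward the true mean 0.5 as n increases.
```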
## Expectation of the Sample Mean
We can show that the expectation of the sample mean $\bar X_n$ equals the theoretical mean $\mu$, i.e. the mean across many repeated experiments.
$
\begin{align}
\mathbb{E}[\bar X_n]&=\frac{\mathbb{E}[X_1+ \dots + X_n]}{n} \tag{1}\\[2pt]
&=\frac{\mathbb{E}[X_1] + \dots + \mathbb{E}[X_n]}{n} \tag{2}\\[2pt]
&=\frac{n \mu}{n} \tag{3}\\[4pt]
&=\mu \tag{4}
\end{align}
$
where:
- (1) As $n$ is a constant, the factor $\frac{1}{n}$ can be pulled out of the expectation.
- (2) [[Linearity of Expectations]] allows us to split the expectation of a sum into a sum of expectations.
- (3) Each $\mathbb E[X_i]= \mu$.
**Interpretation:** As an example, consider the average of $n$ coin flips as a single experiment with result $\bar X_n$. Taking the expectation of $\bar X_n$ tells us that the average over many such experiment results tends towards the true mean $\mu$.
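A short sketch of this interpretation, assuming fair-coin flips so the true mean is $\mu = 0.5$; the number of flips per experiment and the number of experiments are arbitrary illustrative choices.

```python
import numpy as np

# Each "experiment" is the sample mean of n coin flips; averaging the results
# of many such experiments recovers the true mean mu = 0.5.
rng = np.random.default_rng(seed=1)

n_flips = 50            # flips per experiment (arbitrary choice)
n_experiments = 10_000  # number of repeated experiments (arbitrary choice)

sample_means = rng.integers(0, 2, size=(n_experiments, n_flips)).mean(axis=1)
print(f"Average of {n_experiments} sample means: {sample_means.mean():.4f}")
# Expected output: a value very close to the true mean 0.5.
```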
## Variance of the Sample Mean
We can show that as $n$ increases, the variance of the sample mean $\bar X_n$ decreases, which leads to more stable sample means from single experiments.
$
\begin{align}
\mathrm{Var}(\bar X_n) &= \mathrm{Var} \left(\frac{1}{n}(X_1+\dots+X_n) \right) \tag{1}\\[4pt]
&= \frac{1}{n^2}\,\mathrm{Var}(X_1+ \dots + X_n) \tag{2}\\[6pt]
&=\frac{1}{n^2} \left( \mathrm{Var}(X_1)+ \dots + \mathrm{Var}(X_n) \right) \tag{3}\\[4pt]
&=\frac{n\sigma^2}{n^{2}}\tag{4}\\[6pt]
&=\frac{\sigma^2}{n} \tag{5}
\end{align}
$
where:
- (2) Pulling a constant [[Variance after Linear Transformation#Scaling by Factor|factor out of the variance]] squares it.
- (3) Since all $X_i$ are independent, the [[Variance of Sum of Random Variables#Special Case of Independence|Variance of Sum of Random Variables]] equals the sum of the variances of the individual terms.
**Interpretation:** The longer the sequence of coin flips in each single experiment, the less volatile the resulting sample means are.
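A sketch comparing the empirical variance of $\bar X_n$ with the theoretical $\sigma^2 / n$, again assuming fair-coin flips (so $\sigma^2 = 0.25$); the sample sizes are arbitrary choices.

```python
import numpy as np

# Empirical variance of the sample mean for growing n, compared with the
# theoretical sigma^2 / n (for a fair coin, sigma^2 = 0.25).
rng = np.random.default_rng(seed=2)
sigma2 = 0.25
n_experiments = 20_000  # experiments per setting (arbitrary choice)

for n_flips in (10, 100, 1_000):
    means = rng.integers(0, 2, size=(n_experiments, n_flips)).mean(axis=1)
    print(f"n = {n_flips:>5}: empirical Var = {means.var():.5f}, "
          f"sigma^2/n = {sigma2 / n_flips:.5f}")
# The empirical variance shrinks roughly like sigma^2 / n.
```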
**Chebyshev inequality:** The [[Modes of Convergence#Convergence in Probability|Convergence in Probability]] can be shown with the [[Chebyshev Inequality]] by plugging in $\bar X_n$ and $\mathrm{Var}(\bar X_n)$. The probability that $\bar X_n$ deviates from $\mu$ by at least a threshold $c$ converges to zero as $n \to \infty$.
$
\begin{align}
\mathbf P\big(\lvert \bar X_n- \mu \rvert \ge c\big)
&\le \frac{\mathrm{Var}(\bar X_n)}{c^2} \\[4pt]
&= \frac{\sigma^2}{n c^2} \xrightarrow[n\to \infty]{}0
\end{align}
$
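A numerical sketch of the bound, assuming a fair coin ($\mu = 0.5$, $\sigma^2 = 0.25$) and an arbitrary threshold $c = 0.05$; using the binomial distribution for the sum of coin flips is just an implementation shortcut.

```python
import numpy as np

# Compare the empirical probability P(|Xbar_n - mu| >= c) with the Chebyshev
# bound sigma^2 / (n * c^2) for a fair coin (mu = 0.5, sigma^2 = 0.25).
rng = np.random.default_rng(seed=3)
mu, sigma2, c = 0.5, 0.25, 0.05   # threshold c is an arbitrary choice
n_experiments = 50_000            # experiments per setting (arbitrary choice)

for n in (100, 1_000, 10_000):
    # Sample mean of n fair-coin flips = Binomial(n, 0.5) / n.
    means = rng.binomial(n, 0.5, size=n_experiments) / n
    empirical = np.mean(np.abs(means - mu) >= c)
    bound = sigma2 / (n * c**2)
    print(f"n = {n:>6}: empirical = {empirical:.4f}, Chebyshev bound = {bound:.4f}")
# Both shrink as n grows, and the empirical probability stays below the bound.
```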