Conditional variance can be expressed as a function $g$ that takes any value $y$ as input and returns the variance of $X$ under the condition $Y=y$.

$$
g(y)=\mathrm{Var}(X \vert Y=y)
$$

To get from the conditional variance to the unconditional variance, we can use the law of total variance.

$$
\mathrm{Var}(X) = \underbrace{\mathbb{E}[\mathrm{Var}(X \vert Y)]}_{\text{Within-group variability}}+ \underbrace{\mathrm{Var}(\mathbb{E}[X \vert Y])}_{\text{Between-group variability}}
$$

The formula decomposes into two terms:
- *Within-group variability:* Measures the average [[Variance]] of $X$ within each group $y$.
- *Between-group variability:* Measures the variability of the group means $\mathbb E[X \vert Y]$ across all groups $y$.

## Derivation of Law of Total Variance

**Conditional Variance:** The variance of $X$ conditioned on $Y$ is:

$$
\begin{align}
\mathrm{Var}(X)&= \mathbb{E}[X^2] - (\mathbb{E}[X])^2 \tag{1}\\
\mathrm{Var}(X \vert Y) &= \mathbb{E}[X^2 \vert Y] - (\mathbb{E}[X \vert Y])^2 \tag{2}
\end{align}
$$

where:
- (1) General variance formula.
- (2) Since a conditional r.v. has the same properties as an unconditional one, we can apply the regular variance formula with every expectation conditioned on $Y$.

**Expectation of Conditional Variance:** Each different $y$ gives a different variance of $X$. Consider the expectation of these different variances.

$$
\begin{align}
\mathbb E[\mathrm{Var}(X \vert Y)] &= \mathbb E \big[\mathbb{E}[X^2 \vert Y] \big] - \mathbb E \big[(\mathbb{E}[X \vert Y])^2 \big] \tag{3}\\
\mathbb E [\mathrm{Var}(X \vert Y)] &= \mathbb E[X^2] - \mathbb E \big[(\mathbb{E}[X \vert Y])^2 \big] \tag{4}
\end{align}
$$

where:
- (3) Taking the [[Expectation]] of both sides of (2). [[Linearity of Expectations]] allows us to write the two terms as separate expectations.
- (4) Applying the [[Law of Iterated Expectations]] to the first term.

**Variance of Conditional Expectation:** Each $y$ gives a different expectation of $X$. Consider the variance of these different expectations.
$$
\begin{align}
\mathrm{Var}(X)&= \mathbb{E}[X^2] - (\mathbb{E}[X])^2 \tag{5}\\
\mathrm{Var}(\mathbb{E}[X \vert Y]) &= \mathbb{E} \big[(\mathbb{E}[X \vert Y])^2 \big] - \Big(\mathbb{E} \big[\mathbb{E}[X \vert Y] \big] \Big)^2 \tag{6}\\
&= \mathbb{E} \big[(\mathbb{E}[X \vert Y])^2 \big] -(\mathbb{E}[X])^2 \tag{7}
\end{align}
$$

where:
- (5) General variance formula, as in (1).
- (6) Inserting the conditional expectation into the general variance formula.
- (7) Applying the law of iterated expectations to the second term.

**Combining Components:** Adding the two components (4) and (7) recovers the original definition of the variance, which confirms the law of total variance.

$$
\begin{align}
\mathrm{Var}(X) &= \mathbb E [\mathrm{Var}(X \vert Y)]+\mathrm{Var}(\mathbb{E}[X \vert Y]) \\[2pt]
&= \mathbb{E}[X^2] - \mathbb E \big[(\mathbb{E}[X \vert Y])^2\big] + \mathbb{E} \big[(\mathbb{E}[X\vert Y])^2 \big]-(\mathbb{E}[X])^2 \\[2pt]
&= \mathbb{E}[X^2] - (\mathbb{E}[X])^2
\end{align}
$$

## Example: Student Scores in Two Sections

Let $X$ be the score of a student picked uniformly at random from a class. The class is split into two sections, $y=1$ (10 students) and $y=2$ (20 students), so $P(Y=1)=\frac{1}{3}$ and $P(Y=2)=\frac{2}{3}$. We know that:

**Conditional Expectations:**
- Mean student score in section 1 → $\mathbb{E}[X \vert Y=1]=90$
- Mean student score in section 2 → $\mathbb{E}[X \vert Y=2]=60$

**Conditional Variances:**
- Variance of student scores in section 1 → $\mathrm{Var}(X \vert Y=1) = 15$
- Variance of student scores in section 2 → $\mathrm{Var}(X \vert Y=2) = 30$

**Unconditional Expectation:**
- $\mathbb{E}[X \vert Y]$ is a random variable that takes values in $\{60, 90\}$
- $\mathbb{E}\big[\mathbb{E}[X \vert Y]\big]$ is the expectation of the r.v. above → $\frac{1}{3}\cdot 90 + \frac{2}{3}\cdot 60=70$
- $\mathbb{E}[X]=70$, as it is the same as $\mathbb{E}\big[\mathbb{E}[X \vert Y]\big]$ by the law of iterated expectations

**Calculating Components:**
- *Within-group variability:* Each section has a variance of scores.
Take the expectation of these section variances → $\mathbb{E}[\mathrm{Var}(X \vert Y)] = \frac{1}{3}\cdot 15+ \frac{2}{3}\cdot 30=25$
- *Between-group variability:* Each section has an expectation (average) of scores. Take the variance of these averages → $\mathrm{Var}(\mathbb{E}[X \vert Y]) = \frac{1}{3}\cdot(90-70)^2+\frac{2}{3}\cdot(60-70)^2=200$
- *Total variance:* Adding both components gives $\mathrm{Var}(X) = 25 + 200 = 225$
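
The arithmetic of this example can be checked with a few lines of Python. The following is a minimal sketch, not part of the original note: the variable names (`sizes`, `cond_mean`, `cond_var`, and so on) are illustrative, and exact fractions are used so that the thirds do not introduce rounding error. It computes both components, adds them, and cross-checks the result against $\mathbb{E}[X^2] - (\mathbb{E}[X])^2$, using $\mathbb{E}[X^2 \vert Y=y] = \mathrm{Var}(X \vert Y=y) + (\mathbb{E}[X \vert Y=y])^2$ from (2).

```python
from fractions import Fraction

# Section sizes, conditional means, and conditional variances from the example.
sizes = {1: 10, 2: 20}
cond_mean = {1: Fraction(90), 2: Fraction(60)}
cond_var = {1: Fraction(15), 2: Fraction(30)}

# P(Y = y): a uniformly picked student is in section y with probability size / total.
total_students = sum(sizes.values())
p = {y: Fraction(n, total_students) for y, n in sizes.items()}

# E[X] via the law of iterated expectations.
mean_x = sum(p[y] * cond_mean[y] for y in p)

# Within-group variability: E[Var(X | Y)].
within = sum(p[y] * cond_var[y] for y in p)

# Between-group variability: Var(E[X | Y]).
between = sum(p[y] * (cond_mean[y] - mean_x) ** 2 for y in p)

# Law of total variance.
var_x = within + between

# Cross-check with Var(X) = E[X^2] - (E[X])^2,
# where E[X^2 | Y=y] = Var(X | Y=y) + (E[X | Y=y])^2.
second_moment = sum(p[y] * (cond_var[y] + cond_mean[y] ** 2) for y in p)
var_x_direct = second_moment - mean_x ** 2

print(mean_x, within, between, var_x, var_x_direct)
```

Both routes should give the same value for $\mathrm{Var}(X)$, which is the content of the derivation above applied to concrete numbers.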