We have demonstrated in [[Linear Regression with LSE]] that $\hat \beta$ follows a [[Gaussian Distribution]]. To test the significance of a single coefficient from the $\hat \beta$ vector, we leverage this distribution and standardize it.

$$
\hat \beta \sim \mathcal N_p\Big(\beta^\star,\sigma^2(\mathbb X^T \mathbb X)^{-1}\Big)
$$

**Isolating Beta Element:**

To extract a single element $\hat \beta_j$, we form the product $u^T\hat \beta$, where $u$ is a zero vector with $1$ at the $j$-th position. For $j=1$:

$$
\hat \beta_1=u^T\hat \beta, \quad \text{where } u^T= \begin{bmatrix} 1&0& \cdots &0 \end{bmatrix}
$$

The distribution of $\hat \beta_j$ can be expressed as:

$$
\begin{align}
\hat \beta_j &\sim \mathcal N \Big( u^T \beta^\star,\; u^T\big[\sigma^2(\mathbb X^T\mathbb X)^{-1}\big]\, u \Big) \tag{1}\\[10pt]
\hat \beta_j &\sim \mathcal N\Big(\beta^\star_j,\;\sigma^2 \underbrace{\Big[(\mathbb X^T \mathbb X)^{-1}\Big]_{jj}}_{\gamma_j}\Big) \tag{2}
\end{align}
$$

(2) The [[Variance]] term of $\mathcal N$ simplifies to the product $\sigma^2 \gamma_j$, where $\gamma_j$ is the $(j,j)$-th coordinate of the matrix $(\mathbb X^T \mathbb X)^{-1}$, which has dimension $p \times p$.

**Scaling Beta Element:**

To test the null hypothesis $H_0: \beta^\star_j=0$, we need to standardize $\hat \beta_j$.

$$
\frac{\hat \beta_j-\beta^\star_j}{\sqrt{\sigma^2 \gamma_j}} \sim \mathcal N(0,1)
$$

However, we do not observe the population parameter $\sigma^2$. Replacing $\sigma^2$ with its unbiased estimator $\hat \sigma^2$ and applying [[T-Test#Cochran's Theorem|Cochran's Theorem]], we obtain:

$$
\frac{\hat \beta_j-\beta^\star_j}{\sqrt{\hat \sigma^2 \gamma_j}} = \frac{\dfrac{\hat \beta_j-\beta^\star_j}{\sqrt{\sigma^2 \gamma_j}}}{\sqrt{\dfrac{\hat \sigma^2}{\sigma^2}}} \sim \frac{\mathcal N(0,1)}{\sqrt{\dfrac{\chi^2_{n-p}}{n-p}}} \sim t_{n-p}
$$

By definition, the ratio of a standard Gaussian to the square root of a [[Chi-Square Distribution]] divided by its degrees of freedom follows a $t_{n-p}$ [[Student T-Distribution]].
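The derivation above can be checked numerically. A minimal sketch with NumPy and SciPy, using simulated data (all names and the simulation setup are illustrative, not from the source):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Simulated design with intercept column: n observations, p = 3 coefficients.
n, p = 100, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
beta_true = np.array([1.0, 2.0, 0.0])          # last coefficient is 0 under H0
y = X @ beta_true + rng.normal(scale=1.5, size=n)

# Least-squares estimate and the unbiased variance estimator (divide by n - p).
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
resid = y - X @ beta_hat
sigma2_hat = resid @ resid / (n - p)

# gamma_j = [(X^T X)^{-1}]_{jj}; t-statistic for H0: beta_j^* = 0 (0-based j).
M = np.linalg.inv(X.T @ X)
j = 2
t_stat = beta_hat[j] / np.sqrt(sigma2_hat * M[j, j])
p_value = 2 * stats.t.sf(abs(t_stat), df=n - p)   # two-sided p-value from t_{n-p}
print(t_stat, p_value)
```

The key detail is that the estimated standard error $\sqrt{\hat\sigma^2\gamma_j}$, not the unknown $\sqrt{\sigma^2\gamma_j}$, appears in the denominator, which is exactly why the reference distribution is $t_{n-p}$ rather than $\mathcal N(0,1)$.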
This allows us to perform a [[T-Test]] on the significance of each parameter $\hat \beta_j$.

>[!note]
>If we knew the population parameter $\sigma^2$, the test statistic would follow a $\mathcal N(0,1)$, and we could simply look up the quantiles in a standard Gaussian table ("Z-test") for significance testing.

## Two Sample-Test

When we compare the difference between two coefficients, the matrix multiplication in the variance term gets a bit more involved. Assume we test the null hypothesis $H_0: \beta_2^\star - \beta_1^\star \le 0$ against the one-sided alternative:

$$
H_1: \beta_2^\star - \beta_1^\star>0
$$

We adjust the vector to single out only the parameters of interest, the first and second elements, with their respective signs:

$$
u^T=\begin{bmatrix} -1 &1&0& \cdots&0 \end{bmatrix}
$$

The difference $\hat \beta_2-\hat \beta_1$ follows:

$$
\begin{aligned}
\hat \beta_2- \hat \beta_1 &\sim \mathcal N \Big( u^T \beta^\star,\; u^T\big[\sigma^2\overbrace{(\mathbb X^T\mathbb X)^{-1}}^{M}\big]\, u \Big) \\[8pt]
\hat \beta_2- \hat \beta_1&\sim \mathcal N \Big( \beta_2^\star- \beta_1^\star,\; \sigma^2 (M_{11}+M_{22}-2M_{12}) \Big)
\end{aligned}
$$

The second line uses the symmetry of $M$, i.e. $M_{12}=M_{21}$. The subsequent steps follow the regular procedure of a [[Two Sample T-Test]].
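The quadratic form $u^T M u$ can be verified against the expanded expression $M_{11}+M_{22}-2M_{12}$ directly. A minimal sketch, reusing the same kind of simulated regression setup as above (names and data are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated regression: n observations, p = 3 coefficients.
n, p = 100, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([1.0, 2.0, 2.5]) + rng.normal(size=n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
resid = y - X @ beta_hat
sigma2_hat = resid @ resid / (n - p)
M = np.linalg.inv(X.T @ X)

# Contrast vector u picks out beta_2 - beta_1 (0-based indices 0 and 1 here).
u = np.zeros(p)
u[0], u[1] = -1.0, 1.0

# Variance factor two ways: full quadratic form vs. the expanded identity.
quad = u @ M @ u
expanded = M[0, 0] + M[1, 1] - 2 * M[0, 1]     # valid because M is symmetric

diff = u @ beta_hat                            # estimate of beta_2 - beta_1
t_stat = diff / np.sqrt(sigma2_hat * quad)
print(quad, expanded, t_stat)
```

Computing the variance through $u^T M u$ generalizes to any linear contrast of coefficients; the expanded three-term form is just the special case for a difference of two.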