## Test Statistic
The Wald test is used to check the statistical significance of parameters estimated by a model. It compares the difference between the parameter estimate $\hat \theta$ and the hypothesized parameter $\theta_0$ under $H_0$ to the standard error of the parameter estimate $\text{SE}$. The test statistic can be written in two forms:
**Unsquared Form:**
$ W=\frac{\hat \theta - \theta_0}{\sqrt{\widehat{\text{var}}(\hat \theta)}} \xrightarrow[n \to \infty]{(d)}\mathcal N(0,1) $
- *Reasons for convergence:* The [[Modes of Convergence#Convergence in Distribution|Convergence in Distribution]] to a [[Gaussian Distribution|Gaussian]] holds under some mild regularity conditions, provided our estimator $\hat \theta$ is [[Properties of an Estimator#Key Properties of an Estimator|consistent]]. Consistency means $\hat \theta \xrightarrow[]{\mathbf P}\theta$, and under $H_0$ (i.e. assuming $\theta = \theta_0$) this is the same as $\hat \theta \xrightarrow[]{\mathbf P}\theta_0$. Thus the numerator $\to 0$, while dividing by the $\text{SE}$ rescales the shrinking difference so that the [[Properties of an Estimator#Variance of an Estimator|variance of the estimator]] is normalized to $1$.
- *Standard error:* We denote by $\widehat{\text{var}}(\hat \theta)$ the estimated variance of $\hat \theta$; its square root is the standard error of the estimate. Mind the hat symbol in two places: the $\widehat{\text{var}}$ indicates that we do not work with the actual variance of the estimator $\hat \theta$ (which is unknown), but with an estimate of it. Note that we cannot have unknown quantities in a test statistic.
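As a small illustration, here is a minimal Python sketch of the unsquared statistic, assuming a Bernoulli model where $\hat \theta$ is the sample mean and the variance is estimated by the plug-in formula $\hat \theta (1 - \hat \theta)/n$ (the sample size and true parameter below are made-up values):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
x = rng.binomial(1, 0.55, size=n)            # simulated Bernoulli data, true theta = 0.55

theta_0 = 0.5                                # hypothesized value under H_0
theta_hat = x.mean()                         # consistent estimator (the MLE)
var_hat = theta_hat * (1 - theta_hat) / n    # estimated variance of theta_hat

W = (theta_hat - theta_0) / np.sqrt(var_hat) # unsquared Wald statistic
print(W)                                     # approximately N(0, 1) under H_0
```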
**Squared Form:**
$ W=\frac{(\hat \theta - \theta_0)^2}{\widehat{\text{var}}(\hat \theta)} \xrightarrow[n \to \infty]{(d)}\chi^2_1 $
We can also write the Wald statistic in a squared form. However, it then converges to a [[Chi-Square Distribution]] with $1$ degree of freedom, $\chi_1^2$. This is because a chi-squared distribution with $k$ degrees of freedom is just the sum of $k$ squared independent standard Gaussians; in the univariate case here, we square only a single Gaussian.
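The two forms are equivalent at the level of p-values; the sketch below (with a made-up observed value $W^{\text{obs}} = 2.1$) checks that the two-sided Gaussian tail probability of $W$ equals the $\chi^2_1$ tail probability of $W^2$:

```python
from scipy.stats import norm, chi2

W_obs = 2.1                           # hypothetical observed statistic

p_gauss = 2 * norm.sf(abs(W_obs))     # P(|Z| > |W_obs|), two-sided Gaussian
p_chi2 = chi2.sf(W_obs**2, df=1)      # P(chi^2_1 > W_obs^2)
print(p_gauss, p_chi2)                # identical (~0.0357), since Z^2 ~ chi^2_1
```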
## Decision Rule
- As the test statistic converges to a Gaussian (or chi-squared) distribution asymptotically, we can simply compare the observed test-statistic value to the asymptotic quantiles found in probability tables.
- We find the asymptotic [[P-Value]] by checking at which theoretical quantile the observed test-statistic value is located.
| Hypothesis | Test $\psi$ | Asymptotic P-Value |
| -------------------------------------------------------------------------------- | ----------------------------------------------------- | --------------------------------------------------------------------- |
| $\begin{cases} H_0: \theta = \theta_0\\ H_1: \theta \not = \theta_0 \end{cases}$ | $\mathbf 1\big\{\lvert W \rvert > q_{\alpha/2}\big\}$ | $\mathbf P \big(\lvert Z \rvert > \lvert W^{\text{obs}} \rvert \big)$ |
| $\begin{cases} H_0: \theta \le \theta_0\\ H_1: \theta > \theta_0 \end{cases}$ | $\mathbf 1\big\{W > q_{\alpha}\big\}$ | $\mathbf P \big(Z > W^{\text{obs}} \big)$ |
| $\begin{cases} H_0: \theta \ge \theta_0\\ H_1: \theta < \theta_0 \end{cases}$ | $\mathbf 1\big\{W < -q_{\alpha}\big\}$ | $\mathbf P \big(Z < W^{\text{obs}} \big)$ |
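These decision rules and p-values translate directly into code; here is a sketch using SciPy's standard Gaussian (the observed statistic and level are made-up values):

```python
from scipy.stats import norm

W_obs = 1.75                     # hypothetical observed Wald statistic
alpha = 0.05

# Rejection rules (True means reject H_0)
q_two = norm.ppf(1 - alpha / 2)  # q_{alpha/2}
q_one = norm.ppf(1 - alpha)      # q_{alpha}
print(abs(W_obs) > q_two)        # H_1: theta != theta_0
print(W_obs > q_one)             # H_1: theta >  theta_0
print(W_obs < -q_one)            # H_1: theta <  theta_0

# Asymptotic p-values
print(2 * norm.sf(abs(W_obs)))   # P(|Z| > |W_obs|)
print(norm.sf(W_obs))            # P(Z > W_obs)
print(norm.cdf(W_obs))           # P(Z < W_obs)
```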
## Derivation of Asymptotic Level
To show the asymptotic level of the Wald test, we use the fact that the Wald test statistic satisfies $W \xrightarrow[]{(d)} \mathcal N(0,1)$ under $H_0$.
When $\theta = \theta_0$:
$
\begin{align}
\lim_{n \to \infty} \mathbf P_{\theta_0}\big(\psi=1 \big)
&=\lim_{n \to \infty} \mathbf P_{\theta_0}\big(\lvert W\rvert> q_{\alpha/2}\big) \tag{1} \\[8pt]
&=\mathbf P\big(\lvert Z \rvert> q_{\alpha/2}\big) \tag{2} \\[8pt]
&= \alpha \tag{3}
\end{align}
$
When $\theta \le \theta_0$:
$
\begin{align}
\lim_{n \to \infty} \mathbf P_\theta(\psi=1) &= \lim_{n \to \infty} \mathbf P_\theta(W > q_\alpha) \tag{4}\\[4pt]
&=\lim_{n\to \infty} \mathbf P_{\theta} \left(\frac{\hat \theta - \theta_0}{\sqrt{\widehat{\text{var}}(\hat \theta)}}> q_\alpha \right) \tag{5}\\[8pt]
&\le\lim_{n\to \infty} \mathbf P_{\theta} \left(\frac{\hat \theta - \theta}{\sqrt{\widehat{\text{var}}(\hat \theta)}}> q_\alpha \right) \tag{6}\\[14pt]
&=\mathbf P \left(Z> q_\alpha \right) \tag{7}\\[10pt]
&= \alpha \tag{8}
\end{align}
$
where:
- (2) Since $W$ converges in distribution to $\mathcal N(0,1)$, in the limit we can replace it with a standard Gaussian r.v. $Z$; the probability then no longer depends on $n$, so the limit disappears.
- (5) It is necessary to have the same $\theta$ in $\mathbf P_\theta$ and in the statistic. Only then can we legitimately replace the standardized quantity with $Z$.
- (6) Assuming that $\theta \le \theta_0$, replacing $\theta_0$ with $\theta$ makes the numerator larger, so the event occurs more often and the probability can only increase.
- (7) As in (2), the standardized quantity in (6) converges in distribution to $\mathcal N(0,1)$ under $\mathbf P_\theta$, so in the limit it can be replaced by $Z$; finally, $\mathbf P(Z > q_\alpha) = \alpha$ by definition of the quantile $q_\alpha$.
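The derivation can be checked empirically with a quick Monte Carlo sketch, again assuming the Bernoulli model with the plug-in variance estimate (all numbers below are made-up choices): under $\theta = \theta_0$ the rejection rate of the two-sided test should be close to $\alpha$, while for the one-sided test with a true $\theta$ strictly below $\theta_0$ it falls below $\alpha$, matching the inequality in (6).

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
alpha, theta_0, n, n_sim = 0.05, 0.5, 1000, 5_000

def rejection_rate(theta, two_sided):
    """Empirical rejection rate of the Wald test over n_sim simulated datasets."""
    x = rng.binomial(1, theta, size=(n_sim, n))
    theta_hat = x.mean(axis=1)
    se = np.sqrt(theta_hat * (1 - theta_hat) / n)
    W = (theta_hat - theta_0) / se
    if two_sided:
        return (np.abs(W) > norm.ppf(1 - alpha / 2)).mean()
    return (W > norm.ppf(1 - alpha)).mean()

print(rejection_rate(theta_0, two_sided=True))   # ~0.05 = alpha
print(rejection_rate(0.45, two_sided=False))     # well below alpha (essentially 0)
```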