**Type of Test Errors:**

|                                       | Fail to Reject $H_0$         | Reject $H_0$             |
| ------------------------------------- | ---------------------------- | ------------------------ |
| $H_0$ is true $(\theta \in \Theta_0)$ | Correctly not rejected $H_0$ | Type 1 Error             |
| $H_1$ is true $(\theta \in \Theta_1)$ | Type 2 Error                 | Correctly rejected $H_0$ |

**Quantifying test errors:** The probability of rejecting $H_0$ is the probability, under the data-generating distribution $\mathbf P_\theta$, that our [[Statistical Test]] returns $\psi=1$.

- Probability to reject: $\mathbf P_\theta(\psi=1)$
- Probability to fail to reject: $1-\mathbf P_\theta(\psi=1)$

## Type 1 Error

The $\alpha$ parameter is also called the level of the test. It means that a false rejection of $H_0$ should happen with probability at most $\alpha$.

**Why at most?** If our null hypothesis is e.g. $\theta \le \theta_0$, then our test has the highest chance of giving a wrong conclusion when $\theta$ is closest to $\theta_0$. So even in this "worst case", the type 1 error has to be $\le \alpha$.

$
\begin{align}
\mathbf P_\theta(\psi=1) &\le \alpha \quad \forall \theta \in \Theta_0 \\[8pt]
\max_{\theta \in \Theta_0} \mathbf P_\theta(\psi=1) &\le \alpha
\end{align}
$

| Notation                             | Explanation                                                                  |
| ------------------------------------ | ---------------------------------------------------------------------------- |
| $\forall \theta \in \Theta_0$        | For all $\theta$ where the null hypothesis is actually true.                 |
| $(\psi=1)$                           | The decision rule rejects $H_0$.                                             |
| $\mathbf P_\theta(\cdot) \le \alpha$ | We allow this misclassification to happen with at most $\alpha$ probability. |

## Type 2 Error

A type 2 error occurs when $H_1$ is true but our test does not reject $H_0$. It is less common to directly control a test by $\beta$.
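Both error probabilities can be estimated by Monte Carlo simulation. A minimal sketch for a one-sided z-test with known variance; the hypotheses ($H_0:\theta \le 0$), sample size, and the alternative value $\theta = 0.3$ are illustrative assumptions, not from the text above:

```python
import numpy as np

rng = np.random.default_rng(0)
alpha = 0.05
z_crit = 1.6449          # standard normal 95% quantile, so P(Z > z_crit) = alpha
n, trials = 100, 20_000

def reject(theta: float) -> np.ndarray:
    """One-sided z-test psi for H0: theta <= 0 vs H1: theta > 0, with sigma = 1.
    Returns a boolean array over trials: True where the test rejects H0."""
    x = rng.normal(theta, 1.0, size=(trials, n))
    z = np.sqrt(n) * x.mean(axis=1)   # test statistic, standardized at theta_0 = 0
    return z > z_crit

# Type 1 error at the worst case theta = theta_0 = 0: should be close to alpha.
type1 = reject(0.0).mean()

# Type 2 error at some theta in Theta_1, e.g. theta = 0.3: P_theta(psi = 0).
type2 = 1.0 - reject(0.3).mean()

print(f"type 1 error ~ {type1:.3f} (target <= {alpha})")
print(f"type 2 error ~ {type2:.3f}, power ~ {1 - type2:.3f}")
```

Moving $\theta$ inside $\Theta_0$ further away from $\theta_0 = 0$ (e.g. `reject(-0.2)`) drives the rejection rate below $\alpha$, which is why the level constraint only binds at the worst case.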
$
\mathbf P_\theta(\psi=0) \le \beta \quad \forall \theta \in \Theta_1
$

| Notation                            | Explanation                                                                 |
| ----------------------------------- | --------------------------------------------------------------------------- |
| $\forall \theta \in \Theta_1$       | For all $\theta$ where the alternative hypothesis is actually true.         |
| $(\psi=0)$                          | The decision rule does not reject $H_0$.                                    |
| $\mathbf P_\theta(\cdot) \le \beta$ | We allow this misclassification to happen with at most $\beta$ probability. |

**Power of a test:** Power is written as $(1-\beta)$ and denotes the probability of finding evidence for $H_1$ given that $H_1$ is actually true (i.e. the probability of rightfully rejecting).

## Neyman-Pearson Paradigm

Ideally our statistical test would minimize both types of error. However, the two error types are in conflict with each other. The Neyman-Pearson paradigm resolves this by fixing the type 1 error at level $\alpha$ and minimizing the type 2 error within this constraint.

The type 2 error can be decreased by increasing the sample size: as $n\to \infty$, $\beta \to 0$, since the standard deviation of the estimator decreases.

## Duality To Confidence Intervals

**Constructing a confidence interval:** Given data generated from a distribution $\mathbf P_\theta$, we construct an interval $I=[A,B]$ such that the probability of the unknown $\theta$ being $\in I$ is at least $(1-\alpha)$.

$
\mathbf P_\theta \big(\theta \in I\big) \ge 1- \alpha
$

**One-sample two-sided test:** Our null hypothesis is $\theta=\theta_0$, and we want to build a test $\psi$ based on this interval. Since the interval has been constructed from the data (which comes from $\mathbf P_\theta$), we would expect $\theta_0$ to be within the interval if $\theta_0 \approx \theta$. Thus we only have evidence to reject $H_0$ when $\theta_0$ is outside the interval.

$
\begin{cases}
H_0:\theta = \theta_0 \\
H_1:\theta \not = \theta_0 \\
\end{cases}
\quad \implies \quad
\psi= \mathbf 1 \{ \theta_0 \not \in I\}
$

**Level of test:** Under $H_0$ we assume that $\theta_0=\theta$.
Given this assumption, we can equivalently replace $\mathbf P_{\theta}(\theta \in I)$ by $\mathbf P_{\theta_0}(\theta_0 \in I)$.

$
\begin{aligned}
\mathbf P_{\theta_0}(\psi=1) &= \mathbf P_{\theta_0}(\theta_0 \not \in I) =1- \overbrace{\mathbf P_{\theta_0}(\theta_0 \in I)}^{\ge 1- \alpha} \le \overbrace{1-(1-\alpha)}^{\alpha} \\[6pt]
\mathbf P_{\theta_0}(\psi=1) &\le \alpha
\end{aligned}
$

**Conclusion:** Under the assumption that the data comes from $\mathbf P_{\theta_0}$, our test falsely rejects (as $\psi=1$) with probability at most $\alpha$. This is exactly the definition of a test at level $\alpha$, which we have constructed from a confidence level of $(1-\alpha)$.

>[!note]
>The complement of the rejection region of the test $\psi$ is the same as the confidence interval $I$.
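The duality can be checked by simulation: build a $(1-\alpha)$ confidence interval for a normal mean, define $\psi = \mathbf 1\{\theta_0 \not\in I\}$, and verify that under $H_0$ it rejects with rate close to $\alpha$. A minimal sketch, assuming known variance $\sigma = 1$; the sample size, $\theta_0$, and seed are illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(1)
alpha = 0.05
z = 1.96                 # standard normal (1 - alpha/2) quantile
n, trials = 50, 20_000
theta_0 = 2.0            # H0: theta = theta_0; we draw the data under H0

x = rng.normal(theta_0, 1.0, size=(trials, n))   # sigma = 1, known
xbar = x.mean(axis=1)

# (1 - alpha) confidence interval I = [xbar - z/sqrt(n), xbar + z/sqrt(n)]
lo = xbar - z / np.sqrt(n)
hi = xbar + z / np.sqrt(n)

# Test built from the interval: psi = 1{theta_0 not in I}
psi = (theta_0 < lo) | (theta_0 > hi)

# The same rule, written as the familiar two-sided z-test: reject iff
# |sqrt(n) (xbar - theta_0)| > z.  The two decisions agree trial by trial.
z_test = np.abs(np.sqrt(n) * (xbar - theta_0)) > z
assert np.array_equal(psi, z_test)

# Under H0 the CI covers theta_0 with rate ~ (1 - alpha), so psi has level ~ alpha.
print(f"coverage  ~ {1 - psi.mean():.3f} (target >= {1 - alpha})")
print(f"rejection ~ {psi.mean():.3f} (target <= {alpha})")
```

The `assert` makes the note's point concrete: the set of samples where the test accepts is exactly the set where $\theta_0$ lies inside $I$.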