**Motivation:**
- In an [[Autoregressive Model|AR Model]] $\text{AR}(p)$ we should only include the lags that individually contribute, i.e. that help describe the underlying data-generating process.
- When we only look at the ACF, we will see correlations of $X_t$ with prior $X_{t-h}$ due to the dependency structure in the autoregressive process, even though some of these variables do not contribute individually.

**Example:** By definition, an $\text{AR}(1)$ model only needs $X_{t-1}$ to be fully defined. However, you can rewrite it in terms of $X_{t-2}$ and some white-noise terms:

$$
\begin{align}
\text{AR}(1): X_t &= \phi X_{t-1}+W_t \\[2pt]
&= \phi (\phi X_{t-2}+W_{t-1})+W_t\\[2pt]
&=\phi^2X_{t-2}+ \phi W_{t-1}+W_t
\end{align}
$$

Below you see the proof that $X_{t-2}$ is correlated with $X_t$ in an $\text{AR}(1)$ model: covariance is linear in its first argument, and the white-noise terms $W_{t-1}, W_t$ are uncorrelated with $X_{t-2}$,

$$
\begin{align}
\operatorname{Cov}(X_t, X_{t-2})&=\operatorname{Cov}(\phi^2X_{t-2}+ \phi W_{t-1}+W_t,\; X_{t-2}) \\[4pt]
&= \operatorname{Cov}(\phi^2X_{t-2}, X_{t-2}) + \operatorname{Cov}(\phi W_{t-1},X_{t-2}) + \operatorname{Cov}(W_t,X_{t-2}) \\[4pt]
&=\phi^2 \gamma(0),
\end{align}
$$

and dividing by $\gamma(0)$ (stationarity gives $\operatorname{Var}(X_t)=\operatorname{Var}(X_{t-2})=\gamma(0)$) yields $\rho(X_t, X_{t-2})=\phi^2\neq 0$. Therefore we instead need to look at the partial autocorrelation function (PACF).

## Partial Correlation

**General Setup:** We have [[Random Variable|r.v.'s]] $X,Y$ and both are influenced by $Z$. The partial correlation of $X,Y$ tells us their correlation beyond the common impact that $Z$ has on both. To factor out $Z$, we condition the correlation on it $(\rho_{X,Y \mid Z})$. Therefore we regress $X$ and $Y$ separately on $Z$ to get the impact of $Z$ on each of them:
- Regress $X$ on $Z$ to get $\hat X$ ($X$ is the dependent variable)
- Regress $Y$ on $Z$ to get $\hat Y$ ($Y$ is the dependent variable)

Finally, we subtract this impact in the [[Correlation]], so that we are effectively comparing the residuals beyond the impact of $Z$:

$$
\rho_{X,Y \mid Z}=\rho(X-\hat X, \, Y-\hat Y)
$$

**Time Series Setup:** For time series, we apply the same concept to the autocorrelation between $X_t$ and $X_{t-h}$. We need to factor out all terms "in between", $X_{t-1}, \dots , X_{t-h+1}$, to identify the individual contribution of $X_{t-h}$. Again, to factor out variables we regress on them (both routes are sketched in code at the end of this note).
- Fit an $\text{AR}(h-1)$, i.e. regress $X_t$ on the in-between lags, to get $\hat X_t$; regress $X_{t-h}$ on the same lags to get $\hat X_{t-h}$
- Compute the correlation of the residuals, $\rho(X_t-\hat X_t, \, X_{t-h}-\hat X_{t-h})$
- If this correlation is above a certain threshold, the additional lag $X_{t-h}$ has some descriptive power of its own

**Frisch-Waugh-Lovell theorem:** The above approach is equivalent to fitting an $\mathrm{AR}(h)$ model, where the coefficient $\phi_h$ is the partial autocorrelation. The interpretation of a linear regression coefficient is: "How much does the output $(X_t)$ change when I change this regressor $(X_{t-h})$, holding everything else equal?"

>[!note]
>The partial autocorrelation function is then the collection of partial autocorrelations, plotted for all lags.
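
As a minimal sketch of the general partial-correlation setup above (the variable names and coefficients are illustrative assumptions, not from this note), the following computes $\rho_{X,Y \mid Z}$ as the correlation of regression residuals:

```python
# Partial correlation via regression residuals: X and Y are both driven by a
# common factor Z, so their plain correlation is spurious; the partial
# correlation given Z should be near zero.
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

Z = rng.normal(size=n)
X = 2.0 * Z + rng.normal(size=n)   # X is driven by Z
Y = -1.5 * Z + rng.normal(size=n)  # Y is driven by Z, but not by X directly

def residuals(target, regressor):
    """Regress target on regressor (with intercept) and return the residuals."""
    A = np.column_stack([np.ones_like(regressor), regressor])
    coef, *_ = np.linalg.lstsq(A, target, rcond=None)
    return target - A @ coef

corr_xy = np.corrcoef(X, Y)[0, 1]                              # spurious, strongly negative
partial = np.corrcoef(residuals(X, Z), residuals(Y, Z))[0, 1]  # close to zero

print(f"rho(X, Y)     = {corr_xy:+.3f}")
print(f"rho(X, Y | Z) = {partial:+.3f}")
```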
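
Under the same assumptions (simulated data, hypothetical helper names), this second sketch applies the concept to a time series: it computes the PACF at lag $h$ both via the residual-regression route and via the last coefficient of an $\text{AR}(h)$ fit, as in the Frisch-Waugh-Lovell paragraph. For an $\text{AR}(1)$ the two agree closely: roughly $\phi$ at lag 1 and roughly zero beyond, even though the plain ACF at lag 2 stays near $\phi^2$.

```python
# PACF two ways: (1) correlate the residuals of X_t and X_{t-h} after
# regressing both on the in-between lags; (2) read off the last coefficient
# phi_h of a fitted AR(h) model (Frisch-Waugh-Lovell).
import numpy as np

rng = np.random.default_rng(1)
n, phi = 10_000, 0.7

# Simulate an AR(1): X_t = phi * X_{t-1} + W_t
x = np.zeros(n)
for t in range(1, n):
    x[t] = phi * x[t - 1] + rng.normal()

def lag_matrix(x, h):
    """Rows: [X_t, X_{t-1}, ..., X_{t-h}] for all valid t."""
    return np.column_stack([x[h - k : len(x) - k] for k in range(h + 1)])

def pacf_via_residuals(x, h):
    M = lag_matrix(x, h)
    xt, between, xth = M[:, 0], M[:, 1:h], M[:, h]
    A = np.column_stack([np.ones(len(M)), between])   # intercept + in-between lags
    res_t  = xt  - A @ np.linalg.lstsq(A, xt,  rcond=None)[0]
    res_th = xth - A @ np.linalg.lstsq(A, xth, rcond=None)[0]
    return np.corrcoef(res_t, res_th)[0, 1]

def pacf_via_ar_fit(x, h):
    M = lag_matrix(x, h)
    A = np.column_stack([np.ones(len(M)), M[:, 1:]])  # regress X_t on lags 1..h
    coef = np.linalg.lstsq(A, M[:, 0], rcond=None)[0]
    return coef[-1]                                   # phi_h

for h in (1, 2, 3):
    print(f"h={h}: residual route {pacf_via_residuals(x, h):+.3f}, "
          f"AR({h}) coefficient {pacf_via_ar_fit(x, h):+.3f}")

# The motivating example: the plain ACF at lag 2 is near phi**2 = 0.49,
# even though the PACF at lag 2 is near zero.
print(f"ACF(2) = {np.corrcoef(x[2:], x[:-2])[0, 1]:+.3f}")
```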