The moving average $\text{MA}(q)$ model looks similar in structure to the [[Autoregressive Model]] $\text{AR}(p)$. However, it is fundamentally different in terms of [[Stationarity]].

$$
\begin{align}
X_t&=W_t+\phi_1X_{t-1}+ \dots+\phi_pX_{t-p} && \text{(AR)} \\[6pt]
X_t&=W_t+\theta_1W_{t-1}+ \dots+\theta_qW_{t-q} && \text{(MA)}
\end{align}
$$

The $\text{MA}(q)$ is a weighted average of the last $q$ [[White Noise Model|White Noise]] terms. As we move from $X_t$ to $X_{t+1}$, the averaging windows largely overlap. The larger $q$, the more $W_i$ terms are shared by consecutive observations, and hence the stronger the dependency structure in the [[Time Series as Stochastic Process|Time Series]].

![[moving-average-model.svg|center]]

## Checking for Stationarity

To satisfy [[Stationarity#Weak Stationarity|Weak Stationarity]], we need to show a constant mean, a constant variance, and an autocovariance that depends only on the lag.

**Expectation:** The expectation of $X_t$ is a weighted average of $W_i$ terms, which all have zero mean.

$$
\mathbb E[X_t]= \underbrace{\mathbb E[W_t]}_{=0}+ \theta_1 \underbrace{\mathbb E[W_{t-1}]}_{=0}+\dots+\theta_q \underbrace{\mathbb E[W_{t-q}]}_{=0} =0
$$

Compared to the $\text{AR}(p)$ model, the $\text{MA}(q)$ model does not capture the full path (how it got from $X_0$ to $X_t$). It forgets everything that happened more than $q$ steps in the past.

>[!Note:]
>The expectation is constant over time.

**Variance:** Assume the following moving average model, which weights the last 3 noise terms equally.

$$
X_t = \frac{1}{3}(W_t+W_{t-1}+W_{t-2})
$$

The variance of $X_t$ can be written as follows:

$$
\begin{align}
\mathrm{cov}(X_t, X_t) =\mathrm{var}(X_t)& = \mathrm{var}\Big(\tfrac{1}{3}W_t\Big)+\mathrm{var}\Big(\tfrac{1}{3}W_{t-1}\Big)+\mathrm{var}\Big(\tfrac{1}{3}W_{t-2}\Big) \tag{1}\\[8pt]
&=\frac{1}{9}\mathrm{var}(W_t)+\frac{1}{9}\mathrm{var}(W_{t-1})+\frac{1}{9}\mathrm{var}(W_{t-2}) \tag{2}\\[8pt]
&=\frac{3}{9} \sigma^2 \tag{3}
\end{align}
$$

where:

- (1) The variance of a [[Sum of Independent Random Variables]] equals the sum of the variances.
- (2) We can factor out the $\frac{1}{3}$ from the variance (it gets squared).
- (3) Each $W_i$ has the same variance $\sigma^2$.

>[!note:]
>We conclude that the variance is constant over time.

**Covariance:** Assume the same moving average model as above. Here we rely on the [[Covariance#Covariance after Linear Transformation|linearity of covariance]] property.

$$
\begin{align}
\mathrm{cov}(X_t, X_{t-1})&=\mathrm{cov}\Big( \frac{1}{3}(W_t+W_{t-1}+W_{t-2}), \frac{1}{3} (W_{t-1}+W_{t-2}+W_{t-3})\Big) \tag{1}\\[8pt]
&=\frac{1}{9}\mathrm{cov}(W_{t-1}, W_{t-1}+W_{t-2})+\frac{1}{9}\mathrm{cov}(W_{t-2}, W_{t-1}+W_{t-2}) \tag{2}\\[8pt]
&=\frac{1}{9}\mathrm{cov}(W_{t-1}, W_{t-1})+\frac{1}{9}\mathrm{cov}(W_{t-1}, W_{t-2})+ \tag{3}\\[8pt]
&\quad \,\,\frac{1}{9}\mathrm{cov}(W_{t-2}, W_{t-2})+\frac{1}{9}\mathrm{cov}(W_{t-2}, W_{t-1}) \tag{4}\\[8pt]
&=\frac{1}{9}\mathrm{var}(W_{t-1})+\frac{1}{9}\mathrm{var}(W_{t-2}) \tag{5}\\[8pt]
&=\frac{2}{9}\sigma^2 \tag{6}
\end{align}
$$

where:

- (2) Noise terms that appear on only one side ($W_t$ and $W_{t-3}$) are independent of everything on the other side and do not contribute to the covariance; the factors $\frac{1}{3}\cdot\frac{1}{3}=\frac{1}{9}$ are pulled out front.
- (3), (4) Applying linearity of covariance to expand the remaining terms.
- (5) Covariances of independent terms are zero, covariances of the same r.v. turn into variances.
- (6) Each $W_i$ has the same variance $\sigma^2$.

>[!note:]
>We conclude that the autocovariance $\gamma$ only depends on the gap between $s,t$ and not their absolute position in the time series. Together with the constant mean and variance, this satisfies weak stationarity.
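As a quick numerical sanity check, here is a minimal simulation sketch of the 3-term average above (assuming standard normal white noise, i.e. $\sigma^2=1$, and a hypothetical sample size of $100{,}000$). The sample mean should land near $0$, the sample variance near $\frac{3}{9}\sigma^2 \approx 0.33$, and the sample lag-1 autocovariance near $\frac{2}{9}\sigma^2 \approx 0.22$.

```python
import numpy as np

rng = np.random.default_rng(42)

# White noise W_t ~ N(0, 1), i.e. sigma^2 = 1 (assumed for this sketch)
n = 100_000
w = rng.standard_normal(n)

# X_t = (W_t + W_{t-1} + W_{t-2}) / 3: equally weighted 3-term moving average
x = (w[2:] + w[1:-1] + w[:-2]) / 3

x_centered = x - x.mean()

print("sample mean:         ", x.mean())   # theory: 0
print("sample variance:     ", x.var())    # theory: 3/9 = 0.333...
print("sample lag-1 autocov:", np.mean(x_centered[1:] * x_centered[:-1]))  # theory: 2/9 = 0.222...
```

Re-running with different seeds only changes the estimates by sampling noise; the values stay close to the theoretical ones regardless of where in time the window sits, which is exactly the stationarity statement above.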
## Autocovariance for MA(1)

The moving average model of order $1$, denoted as $\text{MA}(1)$, is defined as:

$$
X_t=W_t+ \theta W_{t-1}
$$

The autocovariance at lag $0$ corresponds to the variance of $X_t$. Since the noise terms $W_i$ are [[Independence and Identical Distribution|i.i.d.]], the expression simplifies.

$$
\begin{align}
\mathrm{Cov}(X_t, X_t) &=\mathrm{Cov}(W_t+ \theta W_{t-1},W_t+ \theta W_{t-1})\\[6pt]
&=\mathrm{Cov}(W_t,W_t) + 2\,\mathrm{Cov}(W_t,\theta W_{t-1}) + \mathrm{Cov}(\theta W_{t-1},\theta W_{t-1})\\[6pt]
&=\sigma^2+2\theta\, \mathrm{Cov}(W_t, W_{t-1})+\theta^2 \,\mathrm{Cov}(W_{t-1},W_{t-1})\\[6pt]
&=\sigma^2+\theta^2\sigma^2\\[6pt]
&=\sigma^2(1+\theta^2)
\end{align}
$$

The autocovariance at lag $1$ uses the same properties and can be written as:

$$
\begin{align}
\mathrm{Cov}(X_t, X_{t-1}) &=\mathrm{Cov}(W_t+ \theta W_{t-1}, W_{t-1}+ \theta W_{t-2})\\[6pt]
&=\mathrm{Cov}(W_t,W_{t-1}) + \theta\,\mathrm{Cov}(W_t, W_{t-2}) \\[6pt]
&\phantom{==} +\, \theta \,\mathrm{Cov}(W_{t-1}, W_{t-1})+ \theta^2 \,\mathrm{Cov}(W_{t-1}, W_{t-2})\\[6pt]
&= \theta \sigma^2
\end{align}
$$

Only the $\theta \,\mathrm{Cov}(W_{t-1}, W_{t-1})=\theta\sigma^2$ term survives; all other pairs involve distinct, independent noise terms and therefore have zero covariance.
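The same result can be checked numerically. Below is a minimal simulation sketch, again assuming standard normal white noise ($\sigma^2=1$) and an illustrative coefficient $\theta=0.7$; the sample autocovariances should land near $\sigma^2(1+\theta^2)=1.49$ at lag $0$ and $\theta\sigma^2=0.7$ at lag $1$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed parameters for this sketch: theta = 0.7, sigma^2 = 1
theta = 0.7
n = 100_000
w = rng.standard_normal(n)

# MA(1): X_t = W_t + theta * W_{t-1}
x = w[1:] + theta * w[:-1]

x_centered = x - x.mean()

lag0 = x_centered @ x_centered / len(x)                 # sample variance (lag-0 autocovariance)
lag1 = x_centered[1:] @ x_centered[:-1] / (len(x) - 1)  # sample lag-1 autocovariance

print("lag 0:", lag0, "| theory:", 1 + theta**2)  # sigma^2 (1 + theta^2) = 1.49
print("lag 1:", lag1, "| theory:", theta)         # theta * sigma^2       = 0.7
```

At lags $2$ and beyond the estimates hover around $0$, which matches the fact that $X_t$ and $X_{t-h}$ share no noise terms once $h>1$.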