The method of moments estimates the parameters of a distribution by matching its theoretical moments $m_k(\theta)$ with the sample moments $\hat m_k$ computed from observed data.
**Process:**
1. Identify a distribution $\mathbf P_\theta$ that fits the data well.
2. Identify the number of parameters $k$ that define $\mathbf P_\theta$ (e.g. for a [[Gaussian Distribution|Gaussian]] we have $k=2$).
3. Obtain the first $k$ theoretical moments of the distribution (e.g. via the [[Moment Generating Function|MGF]]).
4. Calculate the first $k$ sample moments from observed data.
5. Equate the theoretical and sample moments. This gives a system of $k$ equations that we solve for the components of $\theta$ (e.g. $\mu, \sigma^2$ for $\mathcal N$).
## Theoretical Moments
The $k$-th theoretical moment of a [[Random Variable]] $X$ with distribution $\mathbf P_\theta$ is defined as:
$ m_k(\theta)=\mathbb E_\theta[X^k] $
If the MGF $M_X(t)$ exists, the $k$-th theoretical moment is its $k$-th derivative evaluated at $t=0$.
$ \mathbb E[X^k] = \frac{d^k}{dt^k}\Big[M_X(t)\Big]_{t=0} $
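As a worked illustration of the derivative formula, take the Gaussian case. The sketch below assumes the standard MGF of $\mathcal N(\mu, \sigma^2)$ and differentiates it twice.
$ M_X(t) = \exp\Big(\mu t + \tfrac{1}{2}\sigma^2 t^2\Big) $
$ \begin{aligned} M_X'(t) &= (\mu + \sigma^2 t)\, M_X(t) &&\Rightarrow\quad m_1(\theta) = M_X'(0) = \mu \\ M_X''(t) &= \Big(\sigma^2 + (\mu + \sigma^2 t)^2\Big) M_X(t) &&\Rightarrow\quad m_2(\theta) = M_X''(0) = \sigma^2 + \mu^2 \end{aligned} $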
## Sample Moments
The sample moments are obtained by replacing the [[Expectation]] with an average over the observations. This is justified by the [[Law of Large Numbers|LLN]]: for i.i.d. data, $\hat m_k$ converges to $m_k(\theta)$ as $n \to \infty$, provided that moment exists. The sample moments no longer depend on $\theta$ (the expectation is gone), only on the collected sample data.
$ \hat m_k= \frac{1}{n} \sum_{i=1}^n X_i^k $
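The formula above translates directly into code. The following is a minimal sketch in plain Python; the function name `sample_moment` and the data values are hypothetical and only serve as an illustration.

```python
def sample_moment(xs, k):
    """Return the k-th sample moment (1/n) * sum(x_i ** k)."""
    n = len(xs)
    if n == 0:
        raise ValueError("need at least one observation")
    return sum(x ** k for x in xs) / n


data = [1.0, 2.0, 3.0, 4.0]   # hypothetical observations
m1 = sample_moment(data, 1)   # (1 + 2 + 3 + 4) / 4 = 2.5
m2 = sample_moment(data, 2)   # (1 + 4 + 9 + 16) / 4 = 7.5
```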
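Suppose the parameter $\theta$ lies in $\mathbb R^d$, so $d$ is the number of parameters (called $k$ in the process above). We match the first $d$ theoretical moments to the first $d$ sample moments and solve the resulting system of $d$ equations for the estimator $\hat \theta_n$.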
$ \begin{aligned} m_1(\hat \theta_n) &=\hat m_1 \\ m_2(\hat \theta_n) &=\hat m_2 \\ \vdots\space &= \space\vdots \\ m_d(\hat \theta_n) &=\hat m_d \end{aligned} $
**Example:** For $X\sim \mathcal N(\mu, \sigma^2)$ there are two parameters to estimate, so we need the first two moments. Since $m_1(\theta)=\mu$ and $m_2(\theta)=\sigma^2+\mu^2$, solving the two moment equations gives:
$ (\hat \mu_n, \hat \sigma^2_n)=\Big (\hat m_1, \hat m_2-(\hat m_1)^2\Big) $
$
\begin{aligned}
\hat \mu &= \frac{1}{n}\sum_{i=1}^n x_i\\[8pt]
\hat \sigma^2 &= \frac{1}{n}\sum_{i=1}^n x_i^2 - \left(\frac{1}{n}\sum_{i=1}^n x_i\right)^2
\end{aligned}
$
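The Gaussian estimators can be checked on simulated data. The following is a minimal sketch using only the Python standard library; the function name `gaussian_mom`, the true parameters ($\mu=2$, $\sigma=3$), the seed and the sample size are all illustrative assumptions.

```python
import random


def gaussian_mom(xs):
    """Method-of-moments estimates (mu_hat, sigma2_hat) for a Gaussian sample."""
    n = len(xs)
    if n == 0:
        raise ValueError("need at least one observation")
    m1 = sum(xs) / n                  # first sample moment
    m2 = sum(x * x for x in xs) / n   # second sample moment
    return m1, m2 - m1 ** 2           # (mu_hat, sigma2_hat)


# Simulated data: the true parameters (mu=2, sigma=3) and the seed are
# illustrative assumptions, not values from the text.
rng = random.Random(0)
sample = [rng.gauss(2.0, 3.0) for _ in range(100_000)]
mu_hat, sigma2_hat = gaussian_mom(sample)
print(mu_hat, sigma2_hat)  # should land near 2 and 9
```

Note that $\hat \sigma^2$ here divides by $n$, so it is the biased variance estimator rather than the $\frac{1}{n-1}$ version. Computing it as $\hat m_2 - \hat m_1^2$ can also lose precision when the mean is large relative to the spread.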