**Exponential Family:** A distribution belongs to the [[Exponential Family]] if its density has the following form:

$$
f_p (y) = h(y) \exp \Big\{ T(y)\,\eta (p)- B(p) \Big\}
$$

**Canonical Exponential Family:** When there is only a single parameter $\theta$ and $T(y)=y$, the distribution is a member of the canonical exponential family:

$$
f_\theta(y) = \exp \left\{ \frac{y \theta - b(\theta )}{\phi } + c(y,\phi ) \right\}
$$

| Symbol       | Name                   | Comment                                                                                             |
| ------------ | ---------------------- | --------------------------------------------------------------------------------------------------- |
| $\phi$       | Dispersion parameter   | Assumed to be known for simplicity, so we treat it as a constant.                                   |
| $\theta$     | Canonical parameter    | Formerly denoted $\eta(p)$.                                                                         |
| $b(\theta)$  | Log-partition function | Since $\theta = \eta(p)$ is itself a function of $p$, $b(\theta)$ is a composition of functions.    |
| $c(y, \phi)$ | Normalization          | Formerly $h(y)$, moved into the exponent (as its logarithm).                                        |

## General to Canonical Form

To understand what $b(\theta)$ is in terms of $B(p)$, we need to express $p$ in terms of $\theta$. Since we know that $\theta = \eta(p)$, we simply need to take the inverse $\eta^{-1}(\theta)$ for that translation.
$$
\begin{aligned}
\eta(p):p &\mapsto \theta \\[2pt]
\eta^{-1}(\theta):\theta &\mapsto p \\[4pt]
b(\theta) &= b\big(\underbrace{\eta(p)}_{\theta}\big) = B\big(\underbrace{\eta^{-1}(\theta)}_{p}\big)
\end{aligned}
$$

**Example:** The density $f_\theta(y)$ of a [[Bernoulli Distribution|Bernoulli]] can be written as follows:

$$
f_\theta(y) = \exp \left\{ y \cdot \underbrace{\ln\Big(\frac{p}{1-p}\Big)}_{\theta} - \Big(\underbrace{-\ln(1-p)}_{B(p)}\Big) \right\}
$$

Taking the inverse of the canonical parameter:

$$
\begin{aligned}
\eta(p):\theta &= \ln\Big(\frac{p}{1-p}\Big) \\[8pt]
e^\theta &= \frac{p}{1-p} \\[10pt]
e^\theta\,(1-p) &= p \\[2pt]
e^\theta &= p\,(1+e^\theta) \\[2pt]
\eta^{-1}(\theta):p &= \frac{e^\theta}{1+e^\theta}
\end{aligned}
$$

Substituting the inverse into $B(p)$ gives the log-partition function $b(\theta)$:

$$
\begin{aligned}
b(\theta) &= -\ln\big(1-\eta^{-1}(\theta)\big) \\
&= -\ln\Big(1-\frac{e^\theta}{1+e^\theta}\Big) \\
&= -\ln\Big(\frac{1+e^\theta}{1+e^\theta} -\frac{e^\theta}{1+e^\theta} \Big) \\
&= -\ln\Big(\frac{1}{1+e^\theta}\Big) \\[10pt]
&= \ln(1+e^\theta)
\end{aligned}
$$

## Calculate Moments

Taking the logarithm of the canonical density removes the exponential, so the log-likelihood of a single observation $y_i$ is:

$$
\ell(\theta) = \log\big(f_\theta(y_i)\big)=\frac{y_i \theta - b(\theta )}{\phi } + c(y_i,\phi )
$$

By relying on the first two [[Identities of Log-Likelihood]], we can derive the [[Expectation]] and [[Variance]] from this form of the [[Probability Density Function|PDF]].
**Expectation:**

$$
\begin{aligned}
\frac{\partial \ell}{\partial \theta} &= \frac{Y-b^\prime(\theta)}{\phi} \\[6pt]
0=\mathbb E\Big[\frac{\partial \ell}{\partial \theta}\Big] &= \frac{\mathbb E[Y]-b^\prime(\theta)}{\phi} \\[6pt]
\mathbb E[Y] &= b^\prime(\theta)
\end{aligned}
$$

**Variance:**

$$
\begin{aligned}
\frac{\partial^2 \ell}{\partial \theta^2}+ \Big(\frac{\partial \ell}{\partial \theta}\Big)^2 &= -\frac{b^{\prime \prime}(\theta)}{\phi}+\Big(\frac{Y-\overbrace{b^\prime(\theta)}^{\mathbb E[Y]}}{\phi}\Big)^2 \\[6pt]
0=\mathbb E\Big[\frac{\partial^2 \ell}{\partial \theta^2}+ \Big(\frac{\partial \ell}{\partial \theta}\Big)^2\Big] &= -\frac{b^{\prime \prime}(\theta)}{\phi}+\frac{\mathrm{var}(Y)}{\phi^2} \\[8pt]
\mathrm{var}(Y) &= \phi\, b^{\prime \prime}(\theta)
\end{aligned}
$$
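
**Numerical check:** The two results can be sanity-checked on the Bernoulli example, where $\phi = 1$, $\mathbb E[Y] = p$ and $\mathrm{var}(Y) = p(1-p)$. The following is a minimal sketch and not part of the derivation itself: the helper names (`eta`, `eta_inv`, `derivative`, `second_derivative`), the test value `p = 0.3`, and the step size `h` are illustrative assumptions, and the derivatives of $b(\theta)$ are only approximated with central finite differences.

```python
import math


def eta(p):
    """Canonical parameter of the Bernoulli: theta = ln(p / (1 - p))."""
    return math.log(p / (1.0 - p))


def eta_inv(theta):
    """Inverse map: p = e^theta / (1 + e^theta)."""
    return math.exp(theta) / (1.0 + math.exp(theta))


def B(p):
    """Log-partition term in the general form: B(p) = -ln(1 - p)."""
    return -math.log(1.0 - p)


def b(theta):
    """Log-partition function in canonical form: b(theta) = ln(1 + e^theta)."""
    return math.log(1.0 + math.exp(theta))


def derivative(f, x, h=1e-4):
    """Central finite-difference approximation of f'(x) (step h is an assumption)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def second_derivative(f, x, h=1e-4):
    """Central finite-difference approximation of f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)


p = 0.3      # illustrative success probability
phi = 1.0    # dispersion parameter of the Bernoulli
theta = eta(p)

mean_from_b = derivative(b, theta)               # should approximate E[Y] = p
var_from_b = phi * second_derivative(b, theta)   # should approximate p * (1 - p)

print("b(theta) vs B(p):   ", b(theta), B(p))
print("b'(theta) vs p:     ", mean_from_b, p)
print("phi*b''(theta) vs p(1-p):", var_from_b, p * (1.0 - p))
```

Each printed pair should agree up to the finite-difference error, which mirrors $b(\theta) = B(\eta^{-1}(\theta))$, $\mathbb E[Y] = b^\prime(\theta)$ and $\mathrm{var}(Y) = \phi\, b^{\prime\prime}(\theta)$.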