The [[Cumulative Density Function|CDF]] can be expressed as the expectation of an indicator variable.
$ F(t)= \mathbf P(X \le t)=\mathbb E\big[\mathbf 1\{X \le t\}\big] $ ^ddb620
The empirical CDF ("sample CDF") replaces the expectation with a sample average, effectively counting the number of observations where $X_i \le t$ and dividing by the total number of observations $n$.
$ F_n(t)=\frac{1}{n} \sum_{i=1}^n \mathbf 1 \{X_i \le t\} $ ^5957fc
We can compute this empirical CDF at every value of $t$, which results in a step function: $F_n$ jumps by $\frac{1}{n}$ at every observation $X_i$.
![[empirical-cdf.png|center|500]]
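As a quick illustration (not part of the derivation), here is a minimal NumPy sketch of $F_n$; the sample and evaluation points are made up.
```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=100)  # hypothetical i.i.d. sample X_1, ..., X_n

def ecdf(sample, t):
    """Empirical CDF F_n(t): fraction of observations with X_i <= t."""
    sample = np.asarray(sample)
    t = np.atleast_1d(t)
    # indicator 1{X_i <= t}, averaged over the sample for each value of t
    return (sample[None, :] <= t[:, None]).mean(axis=1)

# Evaluating at the sorted observations shows the jumps of size 1/n.
print(ecdf(x, np.sort(x))[:5])  # 0.01, 0.02, 0.03, ...
```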
## Convergence Properties
**Pointwise Convergence:**
For every fixed $t$, the empirical CDF [[Modes of Convergence#Convergence Almost Surely|converges almost surely]] to the true CDF. This is because of [[Law of Large Numbers|LLN]], which says that a sample average converges to the expectation as $n$ grows larger.
$
\begin{align}
\frac{1}{n} \sum_{i=1}^n \mathbf 1 \{X_i \le t\} &\xrightarrow[n \to \infty]{a.s.} \mathbb E\big[\mathbf 1\{X \le t\}\big] \\[12pt]
F_n(t) &\xrightarrow[n \to \infty]{a.s.}F(t)
\end{align}
$
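A small simulation sketch of the pointwise statement (my own, assuming a standard normal $X$, so $F(t)=\Phi(t)$): for a fixed $t$, $F_n(t)$ gets closer to $F(t)$ as $n$ grows.
```python
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(1)
t = 1.0
F_t = 0.5 * (1 + erf(t / sqrt(2)))  # standard normal CDF at t, ~0.8413

# F_n(t) is a sample average of indicators, so the LLN applies for each fixed t.
for n in [100, 1_000, 10_000, 100_000]:
    x = rng.normal(size=n)
    Fn_t = np.mean(x <= t)
    print(f"n={n:>6}  F_n(t)={Fn_t:.4f}  |F_n(t)-F(t)|={abs(Fn_t - F_t):.4f}")
```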
**Uniform Convergence:**
This states that the function $F_n$ converges to the function $F$ as a whole, at a rate that is uniform across all $t$. This behavior is *not implied* by pointwise convergence alone.
- *Example:* At $n=100$ we can see that $F_n(t=1.6)$ has already converged, while $F_n(t=2.5)$ has not, due to the few observations in that region. Since there are infinitely many points $t \in \mathbb R$, convergence at each fixed $t$ does not by itself control the worst-case gap over all $t$, which is what uniform convergence requires.
**Glivenko-Cantelli Theorem:**
The theorem states that if the empirical CDF comes from [[Independence and Identical Distribution|i.i.d.]] [[Random Variable|r.v's.]], then it converges uniformly to the true CDF. Technically speaking, the supremum of the absolute difference between the empirical and the true CDF goes to $0$ almost surely, even for the “worst” choice of $t$. ^d0ca51
$ \sup_{t \in \mathbb R} \Big \lvert F_n(t) -F(t) \Big \rvert \xrightarrow[n \to \infty]{a.s.}0 $
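Numerically, the uniform statement can be checked by tracking the worst-case gap; for a continuous $F$ the supremum is attained at the order statistics, where $F_n$ jumps. A sketch under the same standard-normal assumption as above:
```python
import numpy as np
from math import erf, sqrt

def norm_cdf(t):
    return 0.5 * (1 + erf(t / sqrt(2)))  # standard normal CDF

def sup_diff(sample):
    """sup over t of |F_n(t) - F(t)|, evaluated at the jump points of F_n."""
    x = np.sort(np.asarray(sample))
    n = len(x)
    F = np.array([norm_cdf(v) for v in x])
    d_plus = np.arange(1, n + 1) / n - F   # gap just after each jump
    d_minus = F - np.arange(0, n) / n      # gap just before each jump
    return max(d_plus.max(), d_minus.max())

rng = np.random.default_rng(2)
for n in [100, 1_000, 10_000]:
    print(n, round(sup_diff(rng.normal(size=n)), 4))  # shrinks toward 0
```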
## Asymptotic Properties
**Normality:**
We showed above that $F_n(t)$ is the sample average of indicator (i.e. [[Bernoulli Distribution|Bernoulli]]) r.v’s. for a given $t$, each with success probability $p=\mathbf P(X_i \le t)=F(t)$. Therefore the count $n\,F_n(t)$ follows a [[Binomial Distribution]], and $F_n(t)$ itself is a Binomial scaled by $1\over n$.
$ F_n(t) = \frac{1}{n} \sum_{i=1}^n \mathbf 1 \{X_i \le t\} \sim \frac{1}{n}*\mathrm{Binom}\big(n,F(t)\big) $
With large enough $n$, this sample average is approximately distributed as a [[Gaussian Distribution]] by the [[Central Limit Theorem|CLT]]; the precise centered and scaled statement is derived below.
$ F_n(t) \overset{\text{approx.}}{\sim} \mathcal N(\mu, \sigma^2) $
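A sketch checking the Binomial claim by simulation (my own, assuming a standard normal $X$, $t=1$, and that SciPy is available for the reference pmf): the count $n\,F_n(t)$ should be distributed as $\mathrm{Binom}(n, F(t))$.
```python
import numpy as np
from math import erf, sqrt
from scipy.stats import binom  # reference Binomial pmf (assumes SciPy is installed)

rng = np.random.default_rng(3)
n, t = 20, 1.0
p = 0.5 * (1 + erf(t / sqrt(2)))  # p = F(t) for a standard normal X

# Simulate the count n * F_n(t) many times and compare frequencies to Binom(n, p).
reps = 100_000
counts = (rng.normal(size=(reps, n)) <= t).sum(axis=1)
for k in [14, 16, 18]:
    print(k, round(np.mean(counts == k), 4), round(binom.pmf(k, n, p), 4))
```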
**Variance:**
$
\begin{align}
\mathrm{Var}\big(F_n(t)\big)
&=\mathrm{Var}\Big(\text{Binom}(n,p)*\frac{1}{n}\Big) \tag{1}\\[8pt]
&=\frac{1}{n^2}*\mathrm{Var}\big(\text{Binom}(n,p)\big) \tag{2}\\[12pt]
&=\frac{np(1-p)}{n^{2}} \tag{3}\\[10pt]
&=\frac{F(t)\big(1-F(t)\big)}{n} \tag{4}
\end{align}
$
where:
- (1) The variance of $F_n(t)$ is just the variance of a scaled Binomial.
- (2) Constant factors are pulled out of the variance squared.
- (3) Apply the [[Binomial Distribution#Variance|Variance of a Binomial]] with parameters $n,p$.
- (4) The success probability of each indicator is $p=\mathbf P(X_i \le t)=F(t)$.
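A quick simulation check of result (4) (my own sketch, again assuming a standard normal $X$): the variance of $F_n(t)$ across independent replications should match $F(t)\big(1-F(t)\big)/n$.
```python
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(4)
n, t = 200, 1.0
F_t = 0.5 * (1 + erf(t / sqrt(2)))  # F(t) for a standard normal X

# Estimate Var(F_n(t)) from many independent samples of size n.
reps = 50_000
Fn_t = (rng.normal(size=(reps, n)) <= t).mean(axis=1)
print("simulated: ", round(Fn_t.var(), 6))
print("formula:   ", round(F_t * (1 - F_t) / n, 6))
```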
$
\begin{align}
\big(F_n(t) -F(t) \big) &\overset{\text{approx.}}{\sim} \mathcal N(0, \sigma^2) \tag{5}\\[10pt]
\big(F_n(t) -F(t) \big) &\overset{\text{approx.}}{\sim} \mathcal N\Big(0, \frac{F(t)\big(1-F(t)\big)}{n}\Big) \tag{6}\\[10pt]
\sqrt n* \big(F_n(t) -F(t) \big) &\xrightarrow[n \to \infty]{(d)} \mathcal N\Big(0,\, F(t)\big(1-F(t)\big)\Big) \tag{7}
\end{align}
$
where:
- (5) By the CLT, $F_n(t)$ is approximately normal around its mean $F(t)$ for a fixed $t$. Subtracting the deterministic $F(t)$ centers the distribution at $0$ and has no effect on the variance.
- (6) Insert the asymptotic variance derived above.
- (7) Scaling by $\sqrt n$ multiplies the variance by $n$, which cancels the $\frac{1}{n}$ and yields a non-degenerate limiting distribution.
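The scaled statement (7) can be checked the same way (my own sketch, standard-normal assumption): the standard deviation of $\sqrt n\,\big(F_n(t)-F(t)\big)$ across replications should approach $\sqrt{F(t)\big(1-F(t)\big)}$.
```python
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(5)
n, t = 1_000, 1.0
F_t = 0.5 * (1 + erf(t / sqrt(2)))  # F(t) for a standard normal X

reps = 10_000
Fn_t = (rng.normal(size=(reps, n)) <= t).mean(axis=1)
z = sqrt(n) * (Fn_t - F_t)  # should be approximately N(0, F(t)(1 - F(t)))

print("simulated std: ", round(z.std(), 4))
print("asymptotic std:", round(sqrt(F_t * (1 - F_t)), 4))
```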
## Donsker’s Theorem
Above we have established pointwise asymptotic normality. To make uniform statements for every $t \in \mathbb R$, we need to rely on Donsker’s theorem.
Given $F$ is continuous, the supremum of the absolute difference, scaled by $\sqrt n$, converges in distribution to the supremum of the absolute value of a Brownian bridge $\mathbb B$ over the fixed interval $[0,1]$. Since this limiting distribution does not depend on anything unknown (i.e. it is pivotal), we can look up its quantiles in tables (or software).
$
\sqrt{n} * \sup_{t \in \mathbb R} \Big \lvert F_n(t) - F(t) \Big \rvert
\xrightarrow [n \to \infty ]{(d)}
\sup_{0 \le x \le 1} \Big \lvert \mathbb B(x) \Big \rvert
$
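This limit is the basis of the Kolmogorov-Smirnov statistic: $\sqrt n\,\sup_t|F_n(t)-F(t)|$ can be compared against the quantiles of $\sup|\mathbb B|$. A sketch (my own, standard-normal assumption), using SciPy's `kstwobign` for the limiting distribution, assuming that distribution object is available:
```python
import numpy as np
from math import erf, sqrt
from scipy.stats import kstwobign  # limiting distribution of sqrt(n) * sup|F_n - F|

def norm_cdf(t):
    return 0.5 * (1 + erf(t / sqrt(2)))

rng = np.random.default_rng(6)
n = 1_000
x = np.sort(rng.normal(size=n))
F = np.array([norm_cdf(v) for v in x])

# sup_t |F_n(t) - F(t)|, attained at the jump points of F_n
D_n = max((np.arange(1, n + 1) / n - F).max(), (F - np.arange(0, n) / n).max())

stat = sqrt(n) * D_n
print("scaled statistic:      ", round(stat, 3))
print("95% quantile of sup|B|:", round(float(kstwobign.ppf(0.95)), 3))  # ~1.358
```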