The Markov inequality is useful for bounding the probability that a non-negative [[Random Variable]] $X$ takes on extreme values. It applies when we know the [[Expectation]] of $X$ but lack specific details about its distribution, especially in the tails. If $X \ge 0$ and $a > 0$, then:

$$
\mathbf P(X \ge a) \leq \frac{\mathbb{E}[X]}{a}
$$

The larger the constant $a$, the smaller the bound on the probability that $X$ exceeds $a$, and vice versa.

**Derivation:**

$$
\begin{align}
\mathbb{E}[X] = \int_0^\infty x f_X(x) \, dx &\ge \int_a^\infty x f_X(x) \,dx \tag{1}\\
&\ge \int_a^\infty a f_X(x) \,dx \tag{2}\\
&= a \int_a^\infty f_X(x) \,dx \tag{3}\\[10pt]
&= a \cdot \mathbf P(X \ge a) \tag{4}
\end{align}
$$

where:

- (1) Since $X \ge 0$, its expectation is the integral of $x f_X(x)$ over $[0, \infty)$. This is at least as large as the same integral taken over the shorter range $[a, \infty)$, because the part being dropped is non-negative.
- (2) When integrating over $[a, \infty)$, $a$ is the smallest value $x$ can take, so replacing $x$ with $a$ can only make the integrand smaller, and the inequality still holds.
- (3) Since $a$ is a constant, it can be pulled out of the integral.
- (4) The remaining integral is exactly the tail probability $\mathbf P(X \ge a)$.

Dividing both sides by $a > 0$ yields the inequality. (The same argument works for a discrete $X$, with the integral replaced by a sum.)

**Example:**

Let $X$ come from an [[Exponential Distribution]] with rate $\lambda = 1$. Since we know its full distribution, we can compare the exact tail probability with the bound given by the Markov inequality.

Exact tail probability: $\mathbf P(X \ge a) = e^{-\lambda a}$

Markov bound: $\mathbf P(X \ge a) \le \mathbb E[X]/a = \frac{1}{\lambda a}$

When $\lambda = 1$, then as $a$ increases:

* The exact tail probability decreases exponentially.
* The Markov bound decreases only like $1/a$.

> [!note]
> Since exponential decay is much faster than $1/a$ decay, the Markov inequality provides only a loose bound for large $a$. The bigger $a$ gets, the further the bound is from the true tail probability, and the less informative it becomes.
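To make the comparison concrete, here is a minimal Python sketch (assuming NumPy is available; the rate `lam`, the sample size, and the grid of $a$ values are illustrative choices, not from the note) that evaluates the exact tail $e^{-\lambda a}$, the Markov bound $\mathbb E[X]/a$, and a Monte Carlo estimate of the tail:

```python
import numpy as np

rng = np.random.default_rng(0)
lam = 1.0                       # rate of the exponential distribution (assumed)
mean = 1.0 / lam                # E[X] = 1/lambda for Exponential(lambda)
samples = rng.exponential(scale=1.0 / lam, size=1_000_000)

for a in [1, 2, 5, 10]:
    exact = np.exp(-lam * a)           # exact tail: P(X >= a) = e^{-lambda a}
    bound = mean / a                   # Markov bound: E[X] / a
    empirical = np.mean(samples >= a)  # Monte Carlo estimate of the tail
    # The inequality should hold for the empirical tail as well
    # (up to Monte Carlo noise, which is tiny here relative to the gap).
    assert empirical <= bound
    print(f"a={a:>2}  exact={exact:.5f}  markov={bound:.5f}  empirical={empirical:.5f}")
```

Running this shows the gap widening exactly as the note describes: at $a = 10$ the Markov bound is $0.1$ while the true tail is $e^{-10} \approx 4.5 \times 10^{-5}$.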