When we do optimization, we need to identify if a function is convex. If this is the case, we know that the found a global minimum at $f^\prime(\theta)=0$. In non-convex cases, this does not need to be true. There are 3 ways to identify convexity. ## Linear Lower Bounds A function $f$ is convex if at any point $w$ we can construct a tangent, and this tangent gives a lower bound of the function (at any point $u$). $ f(u) \ge f(w)+f^\prime(w)*(u-w) \quad \forall \,(u, w) $ ![[linear-lower-bounds.jpeg|center|350]] ## Chord Across the Bowl A function $f$ is convex if the cord spun between any two points $u,w$ is always above the function within the range of $u$ and $w$. $ f\big(\underbrace{\lambda (w+(1-\lambda)u}_{v}\big) \le \underbrace{\lambda f(w)+(1-\lambda)f(u)}_{\star} \quad \forall \,(0 \le \lambda \le 1) $ ![[chord-across-bowl.jpeg|center|350]] ## Second Derivative A function $f$ is convex if it is twice differentiable and the second derivative is non-negative. This can be shown via the [[Taylor Series]], which lets us fully describe a function $f$ as the sum of all derivatives evaluated at point $a$, which is in close proximity to $x$. $ f(x) = f(a)+f^\prime(a)*(x-a)+\frac{1}{2}f^{\prime\prime}(a)(x-a)^2+\mathcal O\big(\lvert x-a\rvert^3\big) $ When we assume that $a$ is a critical point, meaning that $f^\prime(a)=0$, then the Taylor series is governed by the second derivative $f^{\prime \prime}(a)$. $ f(x) = f(a)+\frac{1}{2}f^{\prime\prime}(a)\underbrace{(x-a)^2}_{> 0}+\mathcal O\big(\lvert x-a\rvert^3\big) $ When we now assume that $f^{\prime \prime}(a) >0$, then we see that the function $f(x)$, will be larger at any point $x$ compared to $a$. Hence, we can say that $a$ is a minimum of the function. $ f(x) > f(a) \quad \forall x $ **Cases:** All these inequalities of the second derivatives have to be true for all $x \in \Omega$. | | General | Strictly | | ------- | --------------------------- | ------------------------- | | Concave | $f^{\prime \prime}(x)\le 0$ | $f^{\prime \prime}(x)< 0$ | | Convex | $f^{\prime \prime}(x)\ge 0$ | $f^{\prime \prime}(x)> 0$ | **Example:** The domain of the $\log$ function contains all non-negative numbers denoted by the sample space $\Omega$. We can prove that $f(x)$ is concave since plugging into the second derivative any number from $\Omega$ gives a negative value. $ \begin{align} \Omega=(0, \infty), \, f(x)&=\log x \tag{1} \\[6pt] f^\prime(x)&=\frac{1}{x} \tag{2} \\[4pt] f^{\prime \prime}(x)&=-\frac{1}{x^2} \implies \text{(concave)} \tag{3} \end{align} $ (3) No matter what you fill into $x$, the denominator will always be positive, and hence $f^{\prime \prime}(x)$ will always be negative.