Lagrange multipliers provide a method to find the minima or maxima of a multivariate function $f(x,y)$ subject to constraints. This is useful for optimization problems where the solution is restricted to a specific line, surface, or hyperplane.

**Key Idea:** To minimize or maximize a function $f(x,y)$ under a constraint, we first formulate the constraint as a function $g(x,y)=0$. Then we can combine $f$ and $g$ into a single equation. The critical insight is that the minima/maxima are found where the gradient of $f$ is parallel to the gradient of $g$:

$$
\nabla f(x,y)=\lambda \nabla g(x,y)
$$

Here $\lambda$ is the *Lagrange multiplier*, a scalar that accounts for the different lengths (and possibly opposite orientation) of the two gradients.

## Example

Maximize the function $f(x,y)=x^2y$ subject to the constraint $x^2+y^2=1$. Below, the light-blue contour lines depict different $[x,y]$ values yielding the same function value for $f$.

![[lagrange-multipliers-1.png|center|400]]

**Constraint as a function:** To combine function $f$ with the constraint, we need to express the constraint as a function of $x,y$ that returns $0$ wherever the constraint is satisfied.

$$
x^2+y^2=1 \quad \mapsto \quad g(x,y)=x^2+y^2-1
$$

**Intersection of contours:** From the contour lines above we can see that the minima/maxima lie where the contours “kiss” the constraint line. At these points of tangency, the contour of $f$ and the constraint curve $g(x,y)=0$ share the same tangent.

![[lagrange-multipliers-2.png|center|400]]

Since the gradient of a function is perpendicular to its contour lines, we know that the [[Gradient Descent#Gradient Vector|Gradient vectors]] $\nabla f$ and $\nabla g$ are parallel at these points, i.e. they point in the same or in exactly opposite directions.
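
The tangency argument can be checked numerically. The following is a minimal sketch, not part of the original derivation: it assumes a brute-force scan over the unit circle (the helper `best_point_on_circle` and the sample count are illustrative choices) to locate the largest value of $f$, and then tests whether $\nabla f$ and $\nabla g$ are parallel there by computing their 2D cross product, which vanishes for parallel vectors.

```python
import math

def f(x, y):
    return x**2 * y

def grad_f(x, y):
    # Partial derivatives of f(x, y) = x^2 * y
    return (2 * x * y, x**2)

def grad_g(x, y):
    # Partial derivatives of g(x, y) = x^2 + y^2 - 1
    return (2 * x, 2 * y)

def best_point_on_circle(samples=100_000):
    """Brute-force scan of the unit circle for the largest value of f.

    Hypothetical helper for illustration; every sampled point satisfies
    the constraint because it is parametrized as (cos t, sin t).
    """
    best = None
    for k in range(samples):
        theta = 2 * math.pi * k / samples
        x, y = math.cos(theta), math.sin(theta)
        value = f(x, y)
        if best is None or value > best[0]:
            best = (value, x, y)
    return best

best_f, bx, by = best_point_on_circle()
fx, fy = grad_f(bx, by)
gx, gy = grad_g(bx, by)

# The 2D cross product is zero exactly when the two vectors are parallel.
cross = fx * gy - fy * gx

print(f"best point: ({bx:.4f}, {by:.4f}), f = {best_f:.4f}")
print(f"cross product of gradients: {cross:.2e}")
```

Because the scan only samples the circle, the cross product comes out close to zero rather than exactly zero; a finer sampling shrinks it further.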

Set up the gradients:

$$
\begin{align}
\nabla f(x,y) &= \begin{bmatrix} \frac{\partial f}{\partial x} \\[6pt] \frac{\partial f}{\partial y} \end{bmatrix} = \begin{bmatrix}2xy \\ x^2\end{bmatrix}\\[6pt]
\nabla g(x,y) &= \begin{bmatrix} \frac{\partial g}{\partial x} \\[6pt] \frac{\partial g}{\partial y} \end{bmatrix} = \begin{bmatrix}2x \\ 2y\end{bmatrix}
\end{align}
$$

Applying the Lagrange multiplier condition:

$$
\begin{align}
\frac{\partial f}{\partial x} &= \lambda \frac{\partial g}{\partial x}\\[6pt]
\frac{\partial f}{\partial y} &= \lambda \frac{\partial g}{\partial y}
\end{align}
$$

Substituting the gradients:

$$
\begin{align}
2xy &= 2x\lambda \\
x^2 &= 2y\lambda
\end{align}
$$

By adding the constraint $x^2+y^2=1$ as a third equation, we have a system of three equations that can be solved for $x,y$ and $\lambda$.

> [!note]
> In this plot we see 4 tangent points, but we do not know beforehand which of them are minima or maxima. After identifying their respective $[x,y]$ values, we have to compute $f(x,y)$ at each of them to see which point returns the lowest/highest function value.

**Useful links:**

* [Lagrange multipliers, using tangency to solve constrained optimization](https://www.youtube.com/watch?v=yuqB-d5MjZA)
* [Lagrange Multipliers | Geometric Meaning & Full Example](https://www.youtube.com/watch?v=8mjcnxGMwFo)
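
Returning to the system of three equations from the example, the last step (evaluating $f$ at every candidate) can be sketched in a few lines. The candidate values below are an assumption in the sense that they were worked out by hand and do not appear in the text above: for $x \neq 0$ the first equation gives $\lambda = y$ and hence $x^2 = 2y^2$, while $x = 0$ gives $y = \pm 1$ with $\lambda = 0$. The script first confirms that each candidate satisfies all three equations and then compares the function values.

```python
import math

def f(x, y):
    return x**2 * y

def residuals(x, y, lam):
    """Left side minus right side of each equation; all zero at a solution."""
    return (
        2 * x * y - lam * 2 * x,  # df/dx = lambda * dg/dx
        x**2 - lam * 2 * y,       # df/dy = lambda * dg/dy
        x**2 + y**2 - 1,          # constraint g(x, y) = 0
    )

a = math.sqrt(2 / 3)
b = 1 / math.sqrt(3)

# Hand-derived candidates (x, y, lambda); an assumption of this sketch.
candidates = [
    (a, b, b), (-a, b, b), (a, -b, -b), (-a, -b, -b),  # x != 0, so lambda = y
    (0.0, 1.0, 0.0), (0.0, -1.0, 0.0),                 # x = 0, so lambda = 0
]

# Check that every candidate solves the system up to rounding error.
all_valid = all(
    abs(r) < 1e-12
    for x, y, lam in candidates
    for r in residuals(x, y, lam)
)

# Evaluate f at each candidate to tell maxima from minima.
values = [(f(x, y), x, y) for x, y, _ in candidates]
f_max = max(values)[0]
f_min = min(values)[0]

print("all candidates satisfy the system:", all_valid)
for value, x, y in sorted(values, reverse=True):
    print(f"f({x:+.4f}, {y:+.4f}) = {value:+.4f}")
```

The two candidates with $x = 0$ satisfy the condition only because $\nabla f$ vanishes there, so $\lambda = 0$ works trivially; $f$ is $0$ at both, which lies strictly between the maximum and minimum found at the other four points.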