Constrained extremes: Lagrange multipliers

Let us consider a function of $N$ variables, $F(x_1,\dots,x_N)$. If the variables $x_1,\dots,x_N$ are all independent, an extremum of $F$ with respect to their variations is obtained from the condition:

$\displaystyle \frac{\partial F}{\partial x_i}= 0; \quad \forall x_i,$ (13.1)

and the partial derivative w.r.t. $x_i$ has to be taken with all the other variables $x_j, j\ne i$, held constant. If, however, the variables $x_1,\dots,x_N$ are subject to a constraint, e.g. $f(x_1,\dots,x_N) = k$, with $k$ some constant, then their variations are not all independent. Let us define the function

$\displaystyle G(x_1,\dots,x_N,\lambda) = F(x_1,\dots,x_N) + \lambda [ f(x_1,\dots,x_N) - k ].$ (13.2)

If the constraint is satisfied we have $G = F$. The gradient of $G$ is:

$\displaystyle \nabla G = \left( \frac{\partial G}{\partial x_1}, \dots, \frac{\partial G}{\partial x_N}, \frac{\partial G}{\partial \lambda} \right)$
$\displaystyle = \left( \frac{\partial F}{\partial x_1} + \lambda \frac{\partial f}{\partial x_1}, \dots, \frac{\partial F}{\partial x_N} + \lambda \frac{\partial f}{\partial x_N}, f(x_1,\dots,x_N) - k \right),$ (13.3)

and the partial derivatives mean that the variation is done by holding all the other variables constant. The condition $\nabla G = 0$ does not require $\nabla F = 0$: it requires $F$ to be stationary only with respect to variations that respect the constraint. To see this, note that the last component, $\frac{\partial G}{\partial \lambda} = 0$, gives $f(x_1,\dots,x_N) - k = 0$, so the constraint is satisfied. Any allowed variation $\delta x_i$ must keep $f$ fixed, i.e. $\sum_i \frac{\partial f}{\partial x_i} \delta x_i = 0$, and the first $N$ components of $\nabla G = 0$ then give $\delta F = \sum_i \frac{\partial F}{\partial x_i} \delta x_i = -\lambda \sum_i \frac{\partial f}{\partial x_i} \delta x_i = 0$. Hence $F$ is stationary with respect to every variation compatible with the constraint, which is the condition for a constrained extremum. The first $N$ components of $\nabla G = 0$ read:

$\displaystyle \frac{\partial F}{\partial x_i} + \lambda \frac{\partial f}{\partial x_i} = 0; \quad \forall x_i.$ (13.4)
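As a small numerical illustration of (13.2) and (13.4), the sketch below uses an example that is not taken from the text: $F(x,y) = xy$ with the constraint $x + y = 2$. The names `F`, `f`, `K`, `G`, `grad_G` and `F_on_constraint` are illustrative choices. Solving (13.4) by hand together with the constraint gives $x = y = 1$, $\lambda = -1$, and the code checks that all three components of $\nabla G$ vanish there.

```python
# Illustrative example (an assumption, not from the text):
# F(x, y) = x*y subject to the constraint f(x, y) = x + y = K, with K = 2.

def F(x, y):
    return x * y

def f(x, y):
    return x + y

K = 2.0

def G(x, y, lam):
    # Eq. (13.2): G = F + lambda * (f - k)
    return F(x, y) + lam * (f(x, y) - K)

def grad_G(x, y, lam, h=1e-6):
    """Central-difference estimate of the gradient of G w.r.t. (x, y, lambda)."""
    return (
        (G(x + h, y, lam) - G(x - h, y, lam)) / (2 * h),
        (G(x, y + h, lam) - G(x, y - h, lam)) / (2 * h),
        (G(x, y, lam + h) - G(x, y, lam - h)) / (2 * h),
    )

# Solving y + lam = 0, x + lam = 0, x + y = 2 by hand gives this point.
x0, y0, lam0 = 1.0, 1.0, -1.0
g = grad_G(x0, y0, lam0)
print("grad G at candidate point:", g)

def F_on_constraint(x):
    """F evaluated on the constraint line y = K - x."""
    return F(x, K - x)

# Moving along the constraint in either direction does not increase F,
# so the candidate is a constrained maximum in this example.
print(F_on_constraint(x0), F_on_constraint(x0 + 0.1), F_on_constraint(x0 - 0.1))
```

At this point $\nabla F = (1, 1)$ and $\nabla f = (1, 1)$, so $\nabla F = -\lambda \nabla f$ with $\lambda = -1$, as (13.4) requires.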

This can be generalised to an extremisation with $M$ constraints, expressed by the conditions $f_1(x_1,\dots,x_N) - k_1 = 0, \dots, f_M(x_1,\dots,x_N) - k_M = 0$, which give:

$\displaystyle \frac{\partial F}{\partial x_i} + \sum_{j=1}^M \lambda_j \frac{\partial f_j}{\partial x_i} = 0; \quad \forall x_i.$ (13.5)
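To make (13.5) concrete, here is a minimal sketch for $N = 3$ variables and $M = 2$ constraints. The example is an assumption chosen for illustration, not one from the text: $F = x^2 + y^2 + z^2$ with $f_1 = x + y + z = 1$ and $f_2 = x - y = 0$. For this choice the three equations (13.5) and the two constraints form a linear system in $(x, y, z, \lambda_1, \lambda_2)$, which the hypothetical helper `solve_linear` solves by Gaussian elimination.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [A | b], copied so the inputs are not modified.
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        # Pick the row with the largest pivot to limit rounding error.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution.
    sol = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * sol[c] for c in range(r + 1, n))
        sol[r] = (M[r][n] - s) / M[r][r]
    return sol

# Unknowns ordered as (x, y, z, lam1, lam2).
A = [
    [2, 0, 0, 1, 1],    # (13.5), i = x: 2x + lam1 + lam2 = 0
    [0, 2, 0, 1, -1],   # (13.5), i = y: 2y + lam1 - lam2 = 0
    [0, 0, 2, 1, 0],    # (13.5), i = z: 2z + lam1 = 0
    [1, 1, 1, 0, 0],    # constraint f1: x + y + z = 1
    [1, -1, 0, 0, 0],   # constraint f2: x - y = 0
]
b = [0, 0, 0, 1, 0]

x, y, z, lam1, lam2 = solve_linear(A, b)
print(x, y, z, lam1, lam2)
```

Solving by hand gives $x = y = z = 1/3$, $\lambda_1 = -2/3$ and $\lambda_2 = 0$. The vanishing $\lambda_2$ reflects that, in this particular example, the point that extremises $F$ under the first constraint alone already satisfies $x = y$. For nonlinear $F$ or $f_j$ the system is no longer linear and an iterative root finder would be needed instead.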

The parameters $\lambda_j$ are known as Lagrange multipliers.