Constrained extremes: Lagrange multipliers

Let us consider a function of $N$ variables, $F(x_1,\dots,x_N)$. If the variables $x_1,\dots,x_N$ are all independent, the extreme of $F$ w.r.t. their variations is obtained from the condition:

$\displaystyle \frac{\partial F}{\partial x_i}= 0; \quad \forall x_i,$

(13.1)

and the partial derivative w.r.t. $x_i$ has to be taken with all the other variables $x_j$, $j\ne i$, held constant. If, however, the variables $x_1,\dots,x_N$ are subject to a constraint, e.g. $f(x_1,\dots,x_N) = k$, with $k$ some constant, then their variations are not all independent. Let us define the function

$\displaystyle G(x_1,\dots,x_N,\lambda) = F(x_1,\dots,x_N) + \lambda [ f(x_1,\dots,x_N) - k ].$

(13.2)

If the constraint is satisfied we have $G = F$. The gradient of $G$ is:

$\displaystyle \nabla G = \left( \frac{\partial G}{\partial x_1}, \dots, \frac{\partial G}{\partial x_N}, \frac{\partial G}{\partial \lambda} \right)$
$\displaystyle = \left( \frac{\partial F}{\partial x_1} + \lambda \frac{\partial f}{\partial x_1}, \dots, \frac{\partial F}{\partial x_N} + \lambda \frac{\partial f}{\partial x_N}, f(x_1,\dots,x_N) - k \right),$

(13.3)

and each partial derivative is taken with all the other variables, including $\lambda$, held constant. The condition $\nabla G = 0$ does not require $\nabla F = 0$; it requires that $F$ be stationary under every variation of the $x_i$ that respects the constraint. Indeed, $\frac{\partial G}{\partial \lambda} = 0$ gives $f(x_1,\dots,x_N) - k = 0$, so the constraint is satisfied and $G = F$ there. A variation $\delta x_i$ that preserves the constraint has $\sum_i \frac{\partial f}{\partial x_i} \delta x_i = 0$, so the vanishing of the first $N$ components of $\nabla G$ gives $\delta F = \sum_i \frac{\partial F}{\partial x_i} \delta x_i = -\lambda \sum_i \frac{\partial f}{\partial x_i} \delta x_i = 0$. The condition $\nabla G = 0$ therefore means that $F$ is at an extreme w.r.t. the allowed variations and that it satisfies the constraint. The first $N$ components of this condition read:

$\displaystyle \frac{\partial F}{\partial x_i} + \lambda \frac{\partial f}{\partial x_i} = 0; \quad \forall x_i.$

(13.4)
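
As a minimal sketch of how the conditions above can be used in practice, consider a hypothetical example that is not taken from the text: $F(x,y) = xy$ with the constraint $f(x,y) = x + y = 2$. Here $\nabla G = 0$ is a linear system in the unknowns $(x, y, \lambda)$, which can be solved directly. The helper `solve_linear` and the variable names are illustrative assumptions, not part of any library.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are left untouched.
    m = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Pick the row with the largest entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        # Scale the pivot row so that the pivot entry becomes 1.
        m[col] = [v / m[col][col] for v in m[col]]
        # Remove this column from every other row.
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [m[r][n] for r in range(n)]


# Hypothetical example: F(x, y) = x*y, constraint f(x, y) = x + y = k, k = 2.
# Unknowns are ordered (x, y, lambda); each row is one component of grad G = 0.
coefficients = [
    [0, 1, 1],  # dG/dx      : y + lambda = 0
    [1, 0, 1],  # dG/dy      : x + lambda = 0
    [1, 1, 0],  # dG/dlambda : x + y = k
]
rhs = [0, 0, 2]

x, y, lam = solve_linear(coefficients, rhs)
print(x, y, lam)
```

Solving the same system by hand gives $x = y = 1$ and $\lambda = -1$. Note that $\nabla F = (y, x) = (1, 1)$ does not vanish at this point: $F$ is extremal only along the constraint, where $F = x(2 - x)$ has its maximum at $x = 1$.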

This can be generalised to an extremisation with $M$ constraints, expressed by the conditions $f_1(x_1,\dots,x_N) - k_1 = 0, \dots, f_M(x_1,\dots,x_N) - k_M = 0$, which give:

$\displaystyle \frac{\partial F}{\partial x_i} + \sum_{j=1}^M \lambda_j \frac{\partial f_j}{\partial x_i} = 0; \quad \forall x_i.$

(13.5)

The parameters $\lambda_j$ are known as Lagrange multipliers.
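
The multi-constraint condition can be checked numerically on a small case. The sketch below assumes a hypothetical example, not taken from the text: $F = x^2 + y^2 + z^2$ with the two constraints $x + y + z = 1$ and $x - y = 1$. Solving (13.5) together with the constraints by hand gives $(x, y, z) = (5/6, -1/6, 1/3)$, $\lambda_1 = -2/3$ and $\lambda_2 = -1$. The code estimates the partial derivatives by central differences and evaluates the left-hand side of (13.5) at that point. The function names are illustrative.

```python
def partial(func, point, i, h=1e-6):
    """Central-difference estimate of d(func)/d(x_i) at `point`,
    varying x_i only and holding all other variables constant."""
    up = list(point)
    down = list(point)
    up[i] += h
    down[i] -= h
    return (func(up) - func(down)) / (2 * h)


def F(p):
    x, y, z = p
    return x**2 + y**2 + z**2


def f1(p):
    return p[0] + p[1] + p[2]


def f2(p):
    return p[0] - p[1]


# Each constraint is stored as (f_j, k_j), meaning f_j(x) - k_j = 0.
constraints = [(f1, 1.0), (f2, 1.0)]

# Candidate extreme and multipliers obtained by hand for this example.
point = [5 / 6, -1 / 6, 1 / 3]
lambdas = [-2 / 3, -1.0]

# Left-hand side of (13.5) for each x_i: dF/dx_i + sum_j lambda_j df_j/dx_i.
residuals = [
    partial(F, point, i)
    + sum(lam * partial(f, point, i) for lam, (f, _) in zip(lambdas, constraints))
    for i in range(len(point))
]

# How far each constraint is from being satisfied: f_j(x) - k_j.
violations = [f(point) - k for f, k in constraints]

print(residuals)
print(violations)
```

Both lists should be zero up to rounding error, which indicates that the point satisfies (13.5) and both constraints. As in the single-constraint case, $\nabla F$ itself does not vanish there.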