2.5 Determinants

It is straightforward to check that a \(2\times 2\) matrix \(A=\begin{pmatrix} a&b\\c&d \end{pmatrix}\) is invertible if and only if \(ad-bc\neq 0\). The quantity \(ad-bc\) is called the determinant of \(A\), and we want to generalize this to larger matrices.


Definition 2.12 Let \(A=(a_{ij})\) be \(n\times n\). The determinant of \(A\), written \(\det (A)\), is the number

\[ \sum_{\sigma \in S_n} \sgn(\sigma) a_{1,\sigma(1)}a_{2,\sigma(2)}\cdots a_{n,\sigma(n)}. \]
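The definition translates directly into code. The following is a minimal Python sketch (an illustration, not part of the notes; the names `sgn` and `det` are my own) that computes the determinant by summing over all of \(S_n\), with the sign obtained by counting inversions:

```python
from itertools import permutations
from math import prod

def sgn(p):
    """Sign of a permutation given as a tuple: (-1)^(number of inversions)."""
    inversions = sum(p[i] > p[j] for i in range(len(p)) for j in range(i + 1, len(p)))
    return -1 if inversions % 2 else 1

def det(A):
    """Determinant via the permutation sum of Definition 2.12 (0-indexed)."""
    n = len(A)
    return sum(sgn(p) * prod(A[i][p[i]] for i in range(n))
               for p in permutations(range(n)))

print(det([[1, 2], [3, 4]]))  # 1*4 - 2*3 = -2
```

Note that this loops over all \(n!\) permutations, which is exactly why the section goes on to develop faster ways of computing \(\det A\).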


Example 2.6

  • \(S_2=\{\id, (1,2)\}\) so the determinant of \(A=(a_{ij})\) is \[ a_{11}a_{22} - a_{12}a_{21}\] where the first term is the one arising from \(\sigma=\id\) and the second is from \(\sigma=(1,2)\). This is equivalent to the expression \(ad-bc\) at the start of this subsection.
  • \(S_3=\{\id, (1,2,3), (1,3,2), (1,2), (1,3), (2,3)\}\) and the determinant of \(A=(a_{ij})\) is \[ a_{11}a_{22}a_{33}+a_{12}a_{23}a_{31}+a_{13}a_{21}a_{32} -a_{12}a_{21}a_{33}-a_{13}a_{22}a_{31} - a_{11}a_{23}a_{32}. \]
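As a quick sanity check (an illustration only, with a made-up sample matrix), the six-term \(3\times 3\) formula can be evaluated directly; the value \(-3\) below agrees with a hand computation of the same determinant by the \(2\times 2\) rule applied along the first row.

```python
# Sample matrix, 0-indexed: a[i][j] corresponds to a_{i+1, j+1} in the notes.
a = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 10]]

# The six terms of the 3x3 determinant, one per permutation in S_3.
det3 = (a[0][0]*a[1][1]*a[2][2] + a[0][1]*a[1][2]*a[2][0] + a[0][2]*a[1][0]*a[2][1]
        - a[0][1]*a[1][0]*a[2][2] - a[0][2]*a[1][1]*a[2][0] - a[0][0]*a[1][2]*a[2][1])

print(det3)  # -3
```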

Our aims in this section are to show that \(A\) is invertible if and only if \(\det A\neq 0\) for square matrices \(A\) of any size, and to investigate other ways of computing \(\det A\): summing over all elements \(\sigma \in S_n\) quickly becomes unmanageable as \(|S_n|=n!\) can be very large.


Proposition 2.8 Let \(A=(a_{ij})\) be a square matrix.

  • If \(A\) has a row or a column of zeroes, then \(\det A=0\).
  • \(\det A = \det A^T\).
  • If \(A\) has two equal columns or two equal rows, then \(\det A=0\).
  • \(\det I_n = 1\).

Proof.

  • Suppose the \(i\)th row of \(A\) is zero, so \(a_{ij}=0\) for all \(j\). Every term in the sum defining \(\det A\) contains the factor \(a_{i,\sigma(i)}=0\), so every term is zero and \(\det A=0\). The argument for a column of zeroes is similar.
  • The \(i,j\) entry of \(A^T\) is \(a_{ji}\), so \[ \det A^T = \sum_{\sigma \in S_n} \sgn(\sigma) a_{\sigma(1),1}\cdots a_{\sigma(n),n}.\] Notice that \[ a_{\sigma(1),1}\cdots a_{\sigma(n),n} = a_{1,\sigma^{-1}(1)}\cdots a_{n,\sigma^{-1}(n)}\] as the terms on each side are the same, just written in a different order. Therefore \[\begin{align*} \det A^T& = \sum_{\sigma\in S_n} \sgn(\sigma) a_{\sigma(1),1}\cdots a_{\sigma(n),n} \\ &= \sum_{\sigma \in S_n} \sgn(\sigma) a_{1,\sigma^{-1}(1)}\cdots a_{n,\sigma^{-1}(n)} & \text{as above} \\ &= \sum_{\sigma \in S_n} \sgn(\sigma^{-1})a_{1,\sigma^{-1}(1)}\cdots a_{n,\sigma^{-1}(n)} & \text{by problem sheet 4, Q1 iv)} \\ &= \sum_{\sigma \in S_n} \sgn(\sigma)a_{1,\sigma(1)}\cdots a_{n,\sigma(n)} \\ &= \det A \end{align*}\] where the second-to-last equality is because the permutations \(\sigma^{-1}\) for \(\sigma \in S_n\) are just the permutations in \(S_n\), written in a different order.
  • Suppose \(p<q\) and \(a_{pj}=a_{qj}\) for all \(j\). Let \(\sigma \in S_n\) and consider the terms in the sum defining \(\det A\) corresponding to \(\sigma\) and \(\sigma (p,q)\). Since \(a_{p,\sigma(p)}=a_{q, \sigma(p)}= a_{q, \sigma (p,q) (q)}\) and \(a_{q,\sigma(q)}=a_{p,\sigma(q)}=a_{p,\sigma(p,q)(p)}\), the product of entries of \(A\) is exactly the same for \(\sigma (p,q)\) as it is for \(\sigma\), but \(\sgn \sigma = - \sgn (\sigma (p,q))\). It follows that the two terms add to zero. The map \(\sigma \mapsto \sigma(p,q)\) splits \(S_n\) into disjoint pairs, so pairing up all terms in the sum this way shows that \(\det A=0\). If \(A\) has two equal columns, then \(\det A=\det A^T\) by the last part, and \(A^T\) has two equal rows so has determinant zero by the argument above.
  • Let \(I_n = (\delta_{ij})\), so \(\delta_{ij}=1\) if \(i=j\) and 0 otherwise. For \[ \delta_{1,\sigma(1)}\cdots \delta_{n,\sigma(n)} \neq 0\] we must have \(\sigma(1)=1\), \(\sigma(2)=2\), and so on, so the only nonzero term in the sum defining \(\det I_n\) is the \(\sigma=\id\) term, which has the value \[\sgn(\id)\delta_{11}\cdots \delta_{nn}=1. \]
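These properties can be spot-checked numerically. The sketch below (an illustration with made-up example matrices, not part of the notes) re-implements the permutation-sum determinant from Definition 2.12 and verifies each part of Proposition 2.8:

```python
from itertools import permutations
from math import prod

def det(A):
    """Permutation-sum determinant; sign computed via inversion count."""
    n = len(A)
    return sum((-1) ** sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n))
               * prod(A[i][p[i]] for i in range(n))
               for p in permutations(range(n)))

def transpose(A):
    return [list(row) for row in zip(*A)]

A = [[1, 2, 3], [4, 5, 6], [7, 8, 10]]

assert det([[0, 0, 0], [4, 5, 6], [7, 8, 10]]) == 0   # row of zeroes
assert det(A) == det(transpose(A))                     # det A = det A^T
assert det([[1, 2, 3], [1, 2, 3], [7, 8, 10]]) == 0   # two equal rows
assert det([[1, 0, 0], [0, 1, 0], [0, 0, 1]]) == 1    # det I_3 = 1
print("all four properties verified")
```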

Proposition 2.9 Let \(A=(a_{ij})\) be a square matrix and \(\lambda\) a number.

  • \(\det r_i(\lambda)(A) = \lambda \det A\).
  • \(\det r_{ij}(\lambda)(A)=\det A\).
  • \(\det r_{ij}(A)=-\det A\).

Proof.

  • Each term in the sum defining \(\det r_i(\lambda)(A)\) is the corresponding term in \(\det A\) multiplied by exactly one factor of \(\lambda\), coming from the entry in row \(i\). Taking this factor outside the sum gives \(\det r_i(\lambda)(A) = \lambda \det A\).
  • Let \(r_{ij}(\lambda)(A)=(b_{pq})\), so \(b_{pq}=a_{pq}\) if \(p \neq j\) and \(b_{jq}=a_{jq}+\lambda a_{iq}\). The term \[ b_{1,\sigma(1)} \cdots b_{n,\sigma(n)}\] from the sum defining the determinant of \(r_{ij}(\lambda)(A)\) then equals \[\begin{multline*} a_{1,\sigma(1)}\cdots a_{j-1, \sigma(j-1)} (a_{j, \sigma(j)}+\lambda a_{i, \sigma(j)}) a_{j+1,\sigma(j+1)}\cdots a_{n,\sigma(n)} =\\ a_{1,\sigma(1)}\cdots a_{n,\sigma(n)} + \lambda a_{1,\sigma(1)}\cdots a_{j-1,\sigma(j-1)} a_{i,\sigma(j)} a_{j+1,\sigma(j+1)}\cdots a_{n,\sigma(n)}.\end{multline*}\] The first term on the right of this equation is the term in the sum defining \(\det A\) coming from the permutation \(\sigma\). The second term is \(\lambda\) times the term coming from \(\sigma\) in the sum defining the determinant of the matrix \(B\) which is the same as \(A\), except with the \(i\)th row of \(A\) repeated in place of the \(j\)th row. Summing over all \(\sigma\in S_n\) gives \[\det r_{ij}(\lambda)(A)= \det A + \lambda \det B, \] and \(\det B=0\) since \(B\) has two equal rows.
  • \(r_{ij}(A)\) is the matrix whose \(p,q\) entry is \(a_{pq}\) if \(p\neq i,j\), \(a_{jq}\) if \(p=i\), and \(a_{iq}\) if \(p=j\). The term in the sum defining \(\det r_{ij}(A)\) corresponding to \(\sigma\in S_n\) is \[ \sgn(\sigma) a_{1, \sigma(1)} \cdots a_{i-1, \sigma(i-1)} a_{j,\sigma(i)} a_{i+1,\sigma(i+1)}\cdots a_{j-1, \sigma(j-1)} a_{i, \sigma(j)} a_{j+1,\sigma(j+1)} \cdots a_{n,\sigma(n)}.\] The product of \(a_{rs}\)s appearing here is the same as the one in the term of the sum defining \(\det A\) corresponding to \(\sigma (i,j)\). But \(\sgn (\sigma) = - \sgn (\sigma (i,j))\), so each term in the sum for \(\det r_{ij}(A)\) appears with the opposite sign in the sum for \(\det A\). Summing over all \(\sigma \in S_n\) gives \(\det r_{ij}(A) = -\det A\).
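The three effects can be checked on an example. The sketch below (an illustration with a made-up matrix; the list manipulations stand in for \(r_1(5)\), \(r_{13}(7)\) and \(r_{12}\) in 0-indexed form) re-implements the permutation-sum determinant and applies each row operation:

```python
from itertools import permutations
from math import prod

def det(A):
    """Permutation-sum determinant; sign computed via inversion count."""
    n = len(A)
    return sum((-1) ** sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n))
               * prod(A[i][p[i]] for i in range(n))
               for p in permutations(range(n)))

A = [[1, 2, 3], [4, 5, 6], [7, 8, 10]]

# r_i(lambda): multiply row i by lambda  ->  det scales by lambda
B = [row[:] for row in A]
B[0] = [5 * x for x in B[0]]
assert det(B) == 5 * det(A)

# r_ij(lambda): add lambda * row i to row j  ->  det unchanged
C = [row[:] for row in A]
C[2] = [c + 7 * a for a, c in zip(C[0], C[2])]
assert det(C) == det(A)

# r_ij: swap rows i and j  ->  det changes sign
D = [row[:] for row in A]
D[0], D[1] = D[1], D[0]
assert det(D) == -det(A)
print("row-operation effects verified")
```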

Lemma 2.4 Let \(A\) be a square matrix and \(r\) a row op. Then \(\det (E(r)A)=\det E(r) \det A\). Furthermore, \(\det A=0\) if and only if \(\det r(A)=0\).

Proof. Firstly, \(E(r) = r(I_n)\), so by the previous proposition \(\det E(r)=\lambda\det I_n = \lambda\) if \(r=r_i(\lambda)\), \(\det E(r)=\det I_n=1\) if \(r=r_{ij}(\lambda)\) and \(\det E(r)=-\det I_n=-1\) if \(r=r_{ij}\). The first part of the lemma is now immediate from the last proposition. The second part follows from the first, because \(r(A)=E(r)A\) and each \(E(r)\) has nonzero determinant.


Theorem 2.2 Let \(A\) be a square matrix. Then \(A\) is invertible if and only if \(\det A \neq 0\).

Proof. The previous lemma shows that doing a row operation to \(A\) doesn’t affect whether or not its determinant is zero, so applying it repeatedly shows that \(\det A \neq 0\) if and only if the determinant of the RRE form of \(A\) is nonzero.

If \(A\) is invertible its RRE form is \(I_n\) which has nonzero determinant, so \(A\) has nonzero determinant. If \(A\) is not invertible its RRE form has a row of zeroes, so its determinant is zero by Proposition 2.8, so \(\det A=0\).

Theorem 2.3 Let \(A\) and \(B\) be \(n\times n\) matrices. Then \(\det (AB)=\det A \det B\).

Proof. Case 1. \(A\) and \(B\) are both invertible. As in the proof of Theorem 2.1, we can write \(A\) and \(B\) as products of elementary matrices: \[\begin{align*} A &= E(r_1)\cdots E(r_l) \\ B &= E(s_1)\cdots E(s_m). \end{align*}\] Now by Lemma 2.4, for any two row ops \(r\) and \(s\) we have \(\det E(r)E(s)=\det E(r) \det E(s)\). Using this repeatedly: \[\begin{align*} \det A &= \det E(r_1)\cdots \det E(r_l) \\ \det B &= \det E(s_1)\cdots \det E(s_m) \\ \det (AB)&= \det (E(r_1)\cdots E(r_l)E(s_1)\cdots E(s_m))\\&= \det E(r_1) \cdots \det E(r_l) \det E(s_1) \cdots \det E(s_m) \\ &= \det A \det B. \end{align*}\]

Case 2. \(B\) isn’t invertible. By Corollary 2.2, there is a nonzero vector \(\mathbf{v}\) with \(B \mathbf{v}=\mathbf{0}\). Then \(AB \mathbf{v}=A\mathbf{0}=\mathbf{0}\), so \(AB\) isn’t invertible. Thus \(\det AB=0\) and \(\det B=0\) so certainly \(\det AB=\det A \det B\).

Case 3. \(B\) is invertible but \(A\) isn’t. By Corollary 2.2 again, there is a nonzero vector \(\mathbf{v}\) with \(A\mathbf{v}=\mathbf{0}\). Then \(B^{-1}\mathbf{v}\) is nonzero too, and \(AB (B^{-1}\mathbf{v})=\mathbf{0}\) so \(AB\) isn’t invertible and \(\det AB=\det A \det B\) as in case 2.
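Multiplicativity can also be checked directly on examples covering both the invertible and the singular cases (an illustration with made-up matrices, not part of the notes; `matmul` is my own helper):

```python
from itertools import permutations
from math import prod

def det(A):
    """Permutation-sum determinant; sign computed via inversion count."""
    n = len(A)
    return sum((-1) ** sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n))
               * prod(A[i][p[i]] for i in range(n))
               for p in permutations(range(n)))

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

A = [[1, 2, 3], [4, 5, 6], [7, 8, 10]]   # invertible
B = [[2, 0, 1], [1, 3, 0], [0, 1, 4]]    # invertible
S = [[1, 2, 3], [2, 4, 6], [7, 8, 10]]   # singular: row 2 = 2 * row 1

assert det(matmul(A, B)) == det(A) * det(B)   # Case 1: both invertible
assert det(matmul(A, S)) == det(A) * det(S)   # Case 2: S not invertible
print("det(AB) = det(A)det(B) verified")
```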

Corollary 2.3

  • If \(A\) is invertible then \(\det(A^{-1})=(\det A)^{-1}\).
  • If a square matrix \(A\) has a left or right inverse then it is invertible.

Proof.

  • Apply the previous result to \(AA^{-1}=I_n\).
  • If \(AB=I_n\) then \(\det(AB)=\det(A)\det(B)=1\), so \(\det(A)\neq 0\), so \(A\) is invertible. The case of a left-inverse is similar.

Finally we look at computing determinants by row and column expansions.


Definition 2.13 The \(i,j\) minor of a square matrix \(A\), written \(A_{ij}\), is the determinant of the matrix obtained from \(A\) by deleting row \(i\) and column \(j\).

Definition 2.14 For \(1 \leq i \leq n\) the \(i\)th row expansion of an \(n\times n\) matrix \(A= (a_{ij})\) is \[\begin{equation*} r_i(A) =\sum_{r=1}^n (-1)^{i+r}a_{ir}A_{ir}. \end{equation*}\]

Definition 2.15 For \(1 \leq j \leq n\) the \(j\)th column expansion of an \(n\times n\) matrix \(A= (a_{ij})\) is \[\begin{equation*} c_j(A)= \sum_{r=1}^n (-1)^{r+j} a_{rj} A_{rj}. \end{equation*}\]
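The two expansions can be implemented recursively, since each minor is itself a determinant of a smaller matrix. A minimal 0-indexed sketch (an illustration, not part of the notes; `delete_rc`, `row_expansion` and `col_expansion` are my own names, and the signs \((-1)^{i+r}\) have the same parity in 0-indexed as in 1-indexed form):

```python
def delete_rc(A, i, j):
    """Matrix obtained from A by deleting row i and column j (0-indexed)."""
    return [row[:j] + row[j+1:] for r, row in enumerate(A) if r != i]

def det(A):
    """Determinant via first-row expansion (base case: 1x1)."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** r * A[0][r] * det(delete_rc(A, 0, r)) for r in range(len(A)))

def row_expansion(A, i):
    """The i-th row expansion of Definition 2.14."""
    return sum((-1) ** (i + r) * A[i][r] * det(delete_rc(A, i, r)) for r in range(len(A)))

def col_expansion(A, j):
    """The j-th column expansion of Definition 2.15."""
    return sum((-1) ** (r + j) * A[r][j] * det(delete_rc(A, r, j)) for r in range(len(A)))

A = [[1, 2, 3], [4, 5, 6], [7, 8, 10]]
print([row_expansion(A, i) for i in range(3)])  # every entry equals det A
print([col_expansion(A, j) for j in range(3)])
```

For this sample matrix every row expansion and every column expansion returns the same number, which is what the results below predict in general.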

It turns out that each of the row and column expansions actually equals \(\det A\).


Example 2.7 Let’s look at the \(2\times 2\) case to try and convince ourselves that these definitions really do all produce the determinant. Let \[A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}. \] The first row expansion is \[\begin{align*} \sum_{r=1}^2 (-1)^{1+r}a_{1r}A_{1r} &= (-1)^{1+1}a_{11} A_{11} + (-1)^{1+2}a_{12}A_{12}\\ &= a_{11}a_{22}-a_{12}a_{21} \end{align*}\] which agrees with our old definition of the determinant. Now the second column expansion: \[\begin{align*} \sum_{r=1}^2 (-1)^{r+2} a_{r2} A_{r2} &= (-1)^{1+2}a_{12} A_{12} + (-1)^{2+2}a_{22}A_{22} \\ &= -a_{12}a_{21} + a_{22}a_{11} \end{align*}\] which again agrees with our previous definition.

The general proof that row and column expansions really do equal the determinant is slightly messy. Here’s a special case:


Proposition 2.10 Let \(A\) be a square matrix. Then \(c_n(A)=\det A\).
Proof. \[\begin{align*} c_n(A)&= \sum_{r=1}^n (-1)^{n+r} a_{rn} A_{rn} \\ &= \sum_{r=1}^n (-1)^{n+r}a_{rn}\sum_{\sigma\in S_{n-1}} \sgn(\sigma) a_{1,\sigma(1)} \cdots a_{r-1,\sigma(r-1)}a_{r+1,\sigma(r)}\cdots a_{n,\sigma(n-1)} \\ &= \sum_{r=1}^n \sum_{\sigma\in S_{n-1}} (-1)^{n+r}\sgn(\sigma)a_{1,\sigma(1)}\cdots a_{r-1,\sigma(r-1)}a_{r,n}a_{r+1,\sigma(r)}\cdots a_{n,\sigma(n-1)} \end{align*}\] The permutation \[ \begin{pmatrix} 1 & 2 & \cdots & r-1 & r & r+1 & \cdots & n \\ \sigma(1) & \sigma(2) & \cdots & \sigma(r-1) & n & \sigma(r) & \cdots & \sigma(n-1) \end{pmatrix}\] obtained by inserting an \(n\) into the bottom row of \(\sigma\) at position \(r\), equals \(\sigma (n,n-1,\ldots,r+1,r)\). The cycle has length \(n-r+1\) so the sign of \(\sigma (n,n-1,\ldots, r+1,r)\) is \(\sgn(\sigma) (-1)^{n-r}= \sgn(\sigma)(-1)^{n+r}\). Every permutation in \(S_n\) can be obtained in exactly one way by inserting \(n\) into the bottom row of a permutation from \(S_{n-1}\). So our sum is \[ \sum _{\tau \in S_n} \sgn(\tau) a_{1,\tau(1)}\cdots a_{n,\tau(n)} \] which is \(\det A\).