# 2.6: The Matrix Inverse

### Inverse Matrix

Compute the inverse of a 3-by-3 matrix.

Check the results. Ideally, Y*X (with Y = inv(X)) produces the identity matrix. Since inv performs the matrix inversion using floating-point computations, in practice Y*X is close to, but not exactly equal to, the identity matrix eye(size(X)).
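For readers without MATLAB, the same check can be sketched in Python. This assumes NumPy is available and uses `np.linalg.inv` as the counterpart of inv; the 3-by-3 matrix below is an arbitrary illustrative choice, not the one from the original example.

```python
import numpy as np

# Illustrative 3-by-3 matrix (an assumption; any nonsingular matrix works).
X = np.array([[1.0, 0.0, 2.0],
              [-1.0, 5.0, 0.0],
              [0.0, 3.0, -9.0]])

Y = np.linalg.inv(X)           # floating-point inverse
product = Y @ X                # ideally the identity matrix
identity = np.eye(X.shape[0])

# Largest deviation from the identity: tiny, but not necessarily zero.
max_deviation = np.max(np.abs(product - identity))
print(product)
print("max deviation from identity:", max_deviation)
```

The deviation is typically a small multiple of machine precision for a well-conditioned matrix like this one.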

### Solve Linear System

Examine why solving a linear system by inverting the matrix using inv(A)*b is inferior to solving it directly using the backslash operator, x = A\b.

Create a random matrix A of order 500 that is constructed so that its condition number, cond(A), is 1e10, and its norm, norm(A), is 1. The exact solution x is a random vector of length 500, and the right side is b = A*x. Thus the system of linear equations is badly conditioned, but consistent.

Solve the linear system A*x = b by inverting the coefficient matrix A. Use tic and toc to get timing information.

Find the absolute and residual error of the calculation.

Now, solve the same linear system using the backslash operator \.

The backslash calculation is quicker and has less residual error by several orders of magnitude. The fact that err_inv and err_bs are both on the order of 1e-6 simply reflects the condition number of the matrix.

The behavior of this example is typical. Using A\b instead of inv(A)*b is two to three times faster, and produces residuals on the order of machine accuracy relative to the magnitude of the data.
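The experiment can be approximated in Python as a rough sketch. It assumes NumPy, uses `np.linalg.solve` as a stand-in for the backslash operator, and builds the test matrix from random orthogonal factors and prescribed singular values, which is one possible construction and not necessarily the one used in the original example. Timings and error values will vary by machine.

```python
import time
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Build A = U * diag(s) * V^T with singular values from 1 down to 1e-10,
# so that norm(A) = 1 and cond(A) = 1e10 (up to rounding).
U, _ = np.linalg.qr(rng.standard_normal((n, n)))
V, _ = np.linalg.qr(rng.standard_normal((n, n)))
s = np.logspace(0, -10, n)
A = (U * s) @ V.T

x = rng.standard_normal(n)     # exact solution
b = A @ x                      # consistent right-hand side

t0 = time.perf_counter()
x_inv = np.linalg.inv(A) @ b   # explicit inverse, then multiply
t_inv = time.perf_counter() - t0

t0 = time.perf_counter()
x_bs = np.linalg.solve(A, b)   # factorization-based solve, analogous to A\b
t_bs = time.perf_counter() - t0

err_inv = np.linalg.norm(x_inv - x)       # absolute errors
err_bs = np.linalg.norm(x_bs - x)
res_inv = np.linalg.norm(A @ x_inv - b)   # residual errors
res_bs = np.linalg.norm(A @ x_bs - b)

print(f"inv:   time={t_inv:.4f}s  err={err_inv:.2e}  res={res_inv:.2e}")
print(f"solve: time={t_bs:.4f}s  err={err_bs:.2e}  res={res_bs:.2e}")
```

The residual of the direct solve should come out several orders of magnitude smaller than that of the explicit inverse, while the two absolute errors stay comparable because both are limited by the condition number.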

## Jared Antrobus @ University of Kentucky

Let $A$ be a square matrix of size $n \times n$. If $A$ row-reduces to the $n \times n$ identity matrix $I_n$, then $A$ is invertible. That is, there exists some $n \times n$ matrix $A^{-1}$ (read, "$A$ inverse") such that $AA^{-1}=A^{-1}A=I_n$. Inverses can only exist for square matrices, but not every square matrix has an inverse. (Sometimes non-square matrices have left-inverses or right-inverses, but that is beyond the scope of this class.)

We return once again to the problem of solving a system of linear equations. Suppose we have a system of $n$ linear equations in $n$ variables.

$$\begin{aligned} a_{11}x_1+a_{12}x_2+\ldots+a_{1n}x_n&=b_1\\ a_{21}x_1+a_{22}x_2+\ldots+a_{2n}x_n&=b_2\\ &\;\;\vdots\\ a_{n1}x_1+a_{n2}x_2+\ldots+a_{nn}x_n&=b_n \end{aligned}$$

Let $A$ be the $n \times n$ matrix with entries $a_{ij}$. (Recall that $A$ is the coefficient matrix of the above system.) Then this system can be written as the matrix equation $Ax=b$, where $x$ and $b$ are the column vectors with entries $x_1,\ldots,x_n$ and $b_1,\ldots,b_n$, respectively.

$$Ax = \begin{bmatrix} a_{11}x_1+a_{12}x_2+\ldots+a_{1n}x_n\\ a_{21}x_1+a_{22}x_2+\ldots+a_{2n}x_n\\ \vdots\\ a_{n1}x_1+a_{n2}x_2+\ldots+a_{nn}x_n \end{bmatrix} = \begin{bmatrix}b_1\\b_2\\\vdots\\b_n\end{bmatrix} = b$$

If $A$ is an invertible matrix, this is actually quite amazing. For any vector $b$, we can use $A^{-1}$ to solve for $x$!

$$\begin{aligned} Ax&=b\\ A^{-1}Ax&=A^{-1}b\\ x&=A^{-1}b \end{aligned}$$

Great, so how can we find the inverse of a square matrix? How do we even know if a matrix is invertible? Return to Gauss-Jordan elimination. The same sequence of row operations which takes $A$ to $I_n$ also takes $I_n$ to $A^{-1}$. Thus we can set up an augmented matrix $\left[\begin{array}{c|c}A&I_n\end{array}\right]$ and row-reduce. If $A$ reduces to the identity, then the right-hand side of the augmented matrix will be $A^{-1}$. If $A$ does not row-reduce to the identity, then $A$ is not invertible (that is, $A$ is singular).

$$\left[\begin{array}{c|c}A&I_n\end{array}\right]\longrightarrow \left[\begin{array}{c|c}I_n&A^{-1}\end{array}\right]$$

Example. Let $A=egin1&2-1&3end$. To find $A^<-1>$ (if it exists), we reduce the augmented matrix $eginA&I_2end$. $egin &left[egin 1&2&1&0 -1&3&0&1 end ight] R_1+R_2 ightarrow R_2 &left[egin 1&2&1&0 0&5&1&1 end ight] frac<1><5>R_2 ightarrow R_2 &left[egin 1&2&1&0 0&1&1/5&1/5 end ight] -2R_2+R_1 ightarrow R_1 &left[egin 1&0&3/5&-2/5 0&1&1/5&1/5 end ight] end$ So $A^<-1>=egin3/5&-2/51/5&1/5end$. We can check this by multiplying: $AA^<-1>=egin1&2-1&3end egin3/5&-2/51/5&1/5end = egin1&0&1end$

Theorem. For $2 \times 2$ matrices only, there is an easy shortcut for determining the inverse. Suppose $A=\begin{bmatrix}a&b\\c&d\end{bmatrix}$. Then the determinant of $A$ is $\det(A)=ad-bc$. Whenever $\det(A)$ is nonzero, $A$ is invertible and $A^{-1}=\frac{1}{ad-bc}\begin{bmatrix}d&-b\\-c&a\end{bmatrix}.$
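The shortcut is easy to express in code. The sketch below uses a hypothetical helper named `inverse_2x2` and exact fractions so the result can be compared with hand calculations.

```python
from fractions import Fraction

def inverse_2x2(a, b, c, d):
    """Inverse of [[a, b], [c, d]] via the determinant formula, or None if det = 0."""
    det = Fraction(a) * d - Fraction(b) * c
    if det == 0:
        return None  # not invertible
    # Swap a and d, negate b and c, divide everything by the determinant.
    return [[d / det, -b / det],
            [-c / det, a / det]]

print(inverse_2x2(1, 2, -1, 3))
```

Applied to the matrix from the earlier example (determinant 5), this reproduces the inverse obtained by row reduction.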

I previously mentioned that knowing the inverse of a matrix can help us solve systems of equations. Let's see an example of this technique.

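Example. Consider this system of linear equations.

$$\begin{aligned} x+2y&=5\\ -x+3y&=10 \end{aligned}$$

Note that this is equivalent to the matrix equation $\begin{bmatrix}1&2\\-1&3\end{bmatrix}\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}5\\10\end{bmatrix}.$ From the previous example, we know the inverse of this matrix. We left-multiply by this inverse.

$$\begin{aligned} \begin{bmatrix}3/5&-2/5\\1/5&1/5\end{bmatrix} \begin{bmatrix}1&2\\-1&3\end{bmatrix} \begin{bmatrix}x\\y\end{bmatrix} &=\begin{bmatrix}3/5&-2/5\\1/5&1/5\end{bmatrix} \begin{bmatrix}5\\10\end{bmatrix}\\ \begin{bmatrix}1&0\\0&1\end{bmatrix} \begin{bmatrix}x\\y\end{bmatrix} &=\begin{bmatrix}-1\\3\end{bmatrix}\\ \begin{bmatrix}x\\y\end{bmatrix} &=\begin{bmatrix}-1\\3\end{bmatrix} \end{aligned}$$

So the solution to the system is $(-1,3)$.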
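The multiplication $x = A^{-1}b$ from this example can be reproduced with a short Python sketch. The variable names are illustrative, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction as F

# Inverse found in the earlier example, and the right-hand side of the system.
A_inv = [[F(3, 5), F(-2, 5)],
         [F(1, 5), F(1, 5)]]
b = [5, 10]

# x = A^{-1} b, computed one row at a time.
x = [sum(entry * bj for entry, bj in zip(row, b)) for row in A_inv]
print(x)

# Substitute back into the original equations x + 2y = 5 and -x + 3y = 10.
assert x[0] + 2 * x[1] == 5
assert -x[0] + 3 * x[1] == 10
```

The substitution check at the end is a cheap way to confirm that the computed vector really solves the system.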

I take this opportunity to remind you that while matrices do not commute in general, we always have $AA^{-1}=A^{-1}A=I_n$. That is to say, invertible matrices do commute with their inverses.

## 2.6.2. Shrunk Covariance

### 2.6.2.1. Basic shrinkage

Despite being an asymptotically unbiased estimator of the covariance matrix, the Maximum Likelihood Estimator is not a good estimator of the eigenvalues of the covariance matrix, so the precision matrix obtained from its inversion is not accurate. Sometimes the empirical covariance matrix cannot even be inverted for numerical reasons. To avoid such an inversion problem, a transformation of the empirical covariance matrix has been introduced: shrinkage.

In scikit-learn, this transformation (with a user-defined shrinkage coefficient) can be directly applied to a pre-computed covariance with the shrunk_covariance function. Also, a shrunk estimator of the covariance can be fitted to data with a ShrunkCovariance object and its ShrunkCovariance.fit method. Again, results depend on whether the data are centered, so one may want to set the assume_centered parameter appropriately.

Mathematically, this shrinkage consists in reducing the ratio between the smallest and the largest eigenvalues of the empirical covariance matrix. It can be done by simply shifting every eigenvalue according to a given offset, which is equivalent to finding the l2-penalized Maximum Likelihood Estimator of the covariance matrix. In practice, shrinkage boils down to a simple convex transformation: $\Sigma_{\rm shrunk} = (1-\alpha)\hat{\Sigma} + \alpha\frac{{\rm Tr}\hat{\Sigma}}{p}{\rm Id}$, where $p$ is the number of features.

Choosing the amount of shrinkage, $\alpha$, amounts to setting a bias/variance trade-off, and is discussed below.
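As a small sketch of basic shrinkage, the snippet below applies `shrunk_covariance` to an empirical covariance and compares it with the convex combination written out by hand, $(1-\alpha)\hat{\Sigma} + \alpha\,\mathrm{Tr}(\hat{\Sigma})/p \cdot \mathrm{Id}$. It assumes scikit-learn and NumPy are installed; the synthetic data and the value of `alpha` are arbitrary choices for illustration.

```python
import numpy as np
from sklearn.covariance import empirical_covariance, shrunk_covariance

rng = np.random.default_rng(0)
X = rng.standard_normal((20, 5))       # 20 samples, 5 features (synthetic data)

emp_cov = empirical_covariance(X)      # maximum likelihood estimate
alpha = 0.1                            # user-chosen shrinkage coefficient
shrunk = shrunk_covariance(emp_cov, shrinkage=alpha)

# The same transformation written out explicitly:
# (1 - alpha) * emp_cov + alpha * trace(emp_cov) / p * identity
p = emp_cov.shape[0]
manual = (1 - alpha) * emp_cov + alpha * np.trace(emp_cov) / p * np.eye(p)

print("matches formula:", np.allclose(shrunk, manual))
print("condition number before:", np.linalg.cond(emp_cov))
print("condition number after: ", np.linalg.cond(shrunk))
```

Because every eigenvalue is pulled toward the mean eigenvalue, the condition number of the shrunk matrix is no larger than that of the empirical covariance, which is what makes its inversion better behaved.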

### 2.6.2.2. Ledoit-Wolf shrinkage

In their 2004 paper [1], O. Ledoit and M. Wolf propose a formula to compute the optimal shrinkage coefficient $\alpha$ that minimizes the Mean Squared Error between the estimated and the real covariance matrix.

The Ledoit-Wolf estimator of the covariance matrix can be computed on a sample with the ledoit_wolf function of the sklearn.covariance package, or it can be otherwise obtained by fitting a LedoitWolf object to the same sample.
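Both routes can be tried side by side. The following is a minimal sketch assuming scikit-learn and NumPy are installed, with synthetic Gaussian data standing in for a real sample.

```python
import numpy as np
from sklearn.covariance import LedoitWolf, ledoit_wolf

rng = np.random.default_rng(0)
X = rng.standard_normal((30, 10))      # 30 samples, 10 features (synthetic data)

# Function form: returns the shrunk covariance and the estimated coefficient.
cov_fn, alpha_fn = ledoit_wolf(X)

# Estimator form: the same quantities are exposed as fitted attributes.
lw = LedoitWolf().fit(X)

print("shrinkage (function): ", alpha_fn)
print("shrinkage (estimator):", lw.shrinkage_)
print("same covariance:", np.allclose(cov_fn, lw.covariance_))
```

The estimator object is convenient when the fitted covariance is reused later, for example through its `precision_` attribute, while the function is handy for a one-off computation.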

Case when population covariance matrix is isotropic

It is important to note that when the number of samples is much larger than the number of features, one would expect that no shrinkage would be necessary. The intuition behind this is that if the population covariance is full rank, then as the number of samples grows, the sample covariance will also become positive definite. As a result, no shrinkage would be necessary and the method should automatically detect this.

This, however, is not the case in the Ledoit-Wolf procedure when the population covariance happens to be a multiple of the identity matrix. In this case, the Ledoit-Wolf shrinkage estimate approaches 1 as the number of samples increases. This indicates that the optimal estimate of the covariance matrix in the Ledoit-Wolf sense is a multiple of the identity. Since the population covariance is already a multiple of the identity matrix, the Ledoit-Wolf solution is indeed a reasonable estimate.

See Shrinkage covariance estimation: LedoitWolf vs OAS and max-likelihood for an example on how to fit a LedoitWolf object to data and for visualizing the performances of the Ledoit-Wolf estimator in terms of likelihood.

[1] O. Ledoit and M. Wolf, “A Well-Conditioned Estimator for Large-Dimensional Covariance Matrices”, Journal of Multivariate Analysis, Volume 88, Issue 2, February 2004, pages 365-411.

### 2.6.2.3. Oracle Approximating Shrinkage

Under the assumption that the data are Gaussian distributed, Chen et al. [2] derived a formula aimed at choosing a shrinkage coefficient that yields a smaller Mean Squared Error than the one given by Ledoit and Wolf’s formula. The resulting estimator is known as the Oracle Approximating Shrinkage (OAS) estimator of the covariance.

The OAS estimator of the covariance matrix can be computed on a sample with the oas function of the sklearn.covariance package, or it can be otherwise obtained by fitting an OAS object to the same sample.
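A brief sketch of both forms, again assuming scikit-learn and NumPy and using synthetic Gaussian data, might look like this. The Ledoit-Wolf coefficient is printed only for comparison on the same sample.

```python
import numpy as np
from sklearn.covariance import OAS, LedoitWolf, oas

rng = np.random.default_rng(0)
X = rng.standard_normal((30, 10))      # Gaussian sample, as the OAS derivation assumes

# Function form: shrunk covariance and estimated shrinkage coefficient.
cov_oas, alpha_oas = oas(X)

# Estimator form, fitted to the same sample.
oas_est = OAS().fit(X)
lw_est = LedoitWolf().fit(X)

print("OAS shrinkage:        ", alpha_oas)
print("Ledoit-Wolf shrinkage:", lw_est.shrinkage_)
print("same covariance:", np.allclose(cov_oas, oas_est.covariance_))
```

The two coefficients generally differ, and which one gives the lower error depends on the data and sample size.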

Figure: Bias-variance trade-off when setting the shrinkage: comparing the choices of Ledoit-Wolf and OAS estimators.

[2] Chen et al., “Shrinkage Algorithms for MMSE Covariance Estimation”, IEEE Trans. on Sign. Proc., Volume 58, Issue 10, October 2010.

See Ledoit-Wolf vs OAS estimation to visualize the Mean Squared Error difference between a LedoitWolf and an OAS estimator of the covariance.

## Example (3 × 3)

Find the inverse of the matrix A using Gauss-Jordan elimination.

$$A = \begin{bmatrix}12 & 9 & 11\\ 3 & 13 & 10\\ 14 & 4 & 15\end{bmatrix}$$

### Our Procedure

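We write matrix A on the left and the identity matrix I on its right, separated by a line, as follows. The result is called an augmented matrix.

We include row numbers to make it clearer.

$$\begin{array}{lccc|ccc}\text{Row[1]} & 12 & 9 & 11 & 1 & 0 & 0\\ \text{Row[2]} & 3 & 13 & 10 & 0 & 1 & 0\\ \text{Row[3]} & 14 & 4 & 15 & 0 & 0 & 1\end{array}$$

Next we do several row operations on the 2 matrices, and our aim is to end up with the identity matrix on the left, like this:

$$\begin{array}{lccc|ccc}\text{Row[1]} & 1 & 0 & 0 & ? & ? & ?\\ \text{Row[2]} & 0 & 1 & 0 & ? & ? & ?\\ \text{Row[3]} & 0 & 0 & 1 & ? & ? & ?\end{array}$$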

(Technically, we are reducing matrix A to reduced row echelon form, also called row canonical form).

The resulting matrix on the right will be the inverse matrix of A.

Our row operations procedure is as follows:

1. We get a "1" in the top left corner by dividing the first row
2. Then we get "0" in the rest of the first column
3. Then we need to get "1" in the second row, second column
4. Then we make all the other entries in the second column "0".

We keep going like this until we are left with the identity matrix on the left.

Let's now go ahead and find the inverse.

### New Row [1]

Divide Row [1] by 12 (to give us a "1" in the desired position):

### New Row [2]

Row[2] − 3 × Row[1] (to give us 0 in the desired position):

This gives us our new Row [2]:

### New Row [3]

Row[3] − 14 × Row[1] (to give us 0 in the desired position):

This gives us our new Row [3]:

### New Row [2]

Divide Row [2] by 10.75 (to give us a "1" in the desired position):

### New Row [1]

Row[1] − 0.75 × Row[2] (to give us 0 in the desired position):

1 − 0.75 × 0 = 1
0.75 − 0.75 × 1 = 0
0.9167 − 0.75 × 0.6744 = 0.4109
0.0833 − 0.75 × (−0.0233) = 0.1008
0 − 0.75 × 0.093 = −0.0698
0 − 0.75 × 0 = 0

This gives us our new Row [1]: 1, 0, 0.4109, 0.1008, −0.0698, 0.

### New Row [3]

Row[3] − (−6.5) × Row[2] (to give us 0 in the desired position):

This gives us our new Row [3]:

### New Row [3]

Divide Row [3] by 6.5504 (to give us a "1" in the desired position):

### New Row [1]

Row[1] − 0.4109 × Row[3] (to give us 0 in the desired position):

1 − 0.4109 × 0 = 1
0 − 0.4109 × 0 = 0
0.4109 − 0.4109 × 1 = 0
0.1008 − 0.4109 × (−0.2012) = 0.1834
−0.0698 − 0.4109 × 0.0923 = −0.1077
0 − 0.4109 × 0.1527 = −0.0627

This gives us our new Row [1]: 1, 0, 0, 0.1834, −0.1077, −0.0627.

### New Row [2]

Row[2] − 0.6744 × Row[3] (to give us 0 in the desired position):

0 − 0.6744 × 0 = 0
1 − 0.6744 × 0 = 1
0.6744 − 0.6744 × 1 = 0
−0.0233 − 0.6744 × (−0.2012) = 0.1124
0.093 − 0.6744 × 0.0923 = 0.0308
0 − 0.6744 × 0.1527 = −0.103

This gives us our new Row [2]: 0, 1, 0, 0.1124, 0.0308, −0.103.

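We have achieved our goal of producing the identity matrix on the left. So we can conclude that the inverse of the matrix A is the right-hand portion of the augmented matrix:

$$A^{-1} = \begin{bmatrix}0.1834 & -0.1077 & -0.0627\\ 0.1124 & 0.0308 & -0.103\\ -0.2012 & 0.0923 & 0.1527\end{bmatrix}$$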
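As a quick sanity check on the elimination, the rounded inverse can be multiplied by A to see whether the product is close to the identity. The sketch below uses plain Python lists, and `matmul` is a small helper written for this check. The entries are the four-decimal values from the final rows of the elimination, so the product is only approximately the identity.

```python
A = [[12, 9, 11],
     [3, 13, 10],
     [14, 4, 15]]

# Rounded inverse assembled from the final rows of the elimination.
A_inv = [[0.1834, -0.1077, -0.0627],
         [0.1124, 0.0308, -0.1030],
         [-0.2012, 0.0923, 0.1527]]

def matmul(P, Q):
    """Plain-Python product of two square matrices given as lists of rows."""
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

product = matmul(A, A_inv)
for row in product:
    print([round(v, 3) for v in row])
```

Small deviations from exact 0s and 1s are expected here and come from rounding the inverse to four decimal places, not from an error in the method.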

Calculate $$\left[\begin{array}{cc}2 & 1\\1 & 3\end{array}\right]^{-1}$$ using the Gauss-Jordan elimination.

To find the inverse matrix, augment it with the identity matrix and perform row operations, trying to produce the identity matrix on the left. The inverse matrix will then be on the right.

So, augment the matrix with the identity matrix:

$$\left[\begin{array}{cc|cc}2 & 1 & 1 & 0\\1 & 3 & 0 & 1\end{array}\right]$$

Divide row $$1$$ by $$2$$: $$R_{1} = \frac{R_{1}}{2}$$.

Subtract row $$1$$ from row $$2$$: $$R_{2} = R_{2} - R_{1}$$.

Multiply row $$2$$ by $$\frac{2}{5}$$: $$R_{2} = \frac{2 R_{2}}{5}$$.

Subtract row $$2$$ multiplied by $$\frac{1}{2}$$ from row $$1$$: $$R_{1} = R_{1} - \frac{R_{2}}{2}$$.

We are done. On the left is the identity matrix. On the right is the inverse matrix.
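For reference, the four row operations can be written out with the intermediate augmented matrices. This is a worked derivation computed from the steps listed, shown here as a LaTeX sketch:

```latex
\begin{aligned}
\left[\begin{array}{cc|cc} 2 & 1 & 1 & 0 \\ 1 & 3 & 0 & 1 \end{array}\right]
&\xrightarrow{R_1 = R_1/2}
\left[\begin{array}{cc|cc} 1 & 1/2 & 1/2 & 0 \\ 1 & 3 & 0 & 1 \end{array}\right] \\
&\xrightarrow{R_2 = R_2 - R_1}
\left[\begin{array}{cc|cc} 1 & 1/2 & 1/2 & 0 \\ 0 & 5/2 & -1/2 & 1 \end{array}\right] \\
&\xrightarrow{R_2 = 2R_2/5}
\left[\begin{array}{cc|cc} 1 & 1/2 & 1/2 & 0 \\ 0 & 1 & -1/5 & 2/5 \end{array}\right] \\
&\xrightarrow{R_1 = R_1 - R_2/2}
\left[\begin{array}{cc|cc} 1 & 0 & 3/5 & -1/5 \\ 0 & 1 & -1/5 & 2/5 \end{array}\right]
\end{aligned}
```

Reading off the right-hand block gives the inverse with rows $(3/5, -1/5)$ and $(-1/5, 2/5)$, which agrees with the $2 \times 2$ determinant shortcut since the determinant is $2 \cdot 3 - 1 \cdot 1 = 5$.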

## Finding the Inverse of a Matrix with the TI83 / TI84

By taking any advanced math course or even scanning through this website, you quickly learn how powerful a graphing calculator can be. A more “theoretical” course like linear algebra is no exception. In fact, once you know how to do something like finding an inverse matrix by hand, the calculator can free you up from that calculation and let you focus on the big picture.

Remember, not every matrix has an inverse. The matrix picked below is invertible, meaning it does in fact have an inverse. We will talk about what happens when it isn’t invertible a little later on. Here is the matrix we will use for our example:

( left[ egin 8 & 2 & 1 & 6 8 & 4 & 1 & 1 0 & 2 & 6 & 4 15 & 8 & 9 & 20 end ight])

### Step 1: Get to the Matrix Editing Menu

This is a much more involved step than it sounds like! If you have a TI83, there is simply a button that says “MATRIX”. This is the button you will press to get into the edit menu. If you have a TI84, you will have to press [2ND] and [$x^{-1}$]. This will take you into the matrix menu. Move your cursor to “EDIT” at the top.

Now you will select matrix A (technically you can select any of them, but for now, A is easier to deal with). To do this, just hit [ENTER].

### Step 2: Enter the Matrix

First, you must tell the calculator how large your matrix is. Just remember to keep it in order of “rows” and “columns”. For example, our example matrix has 4 rows and 4 columns, so I type 4 [ENTER] 4 [ENTER].

Now you can enter the numbers from left to right. After each number, press [ENTER] to get to the next spot.

Now, before we get to the next step. On some calculators, you will get into a strange loop if you don’t quit out of this menu now. So, press [2ND] and [MODE] to quit. When you do this, it will go back to the main screen.

### Step 3: Select the Matrix Under the NAMES Menu

After you have quit by pressing [2ND] and [MODE], go back into the matrix menu by pressing [2ND] and [$x^{-1}$] (or just the matrix button if you have a TI83). This time, select A from the NAMES menu by pressing [ENTER].

### Step 4: Press the Inverse Key [$x^{-1}$] and Press Enter

The easiest step yet! All you need to do now, is tell the calculator what to do with matrix A. Since we want to find an inverse, that is the button we will use.

At this stage, you can press the right arrow key to see the entire matrix. As you can see, our inverse here is really messy. The next step can help us along if we need it.

### Step 5: (OPTIONAL) Convert Everything to Fractions

While the inverse is on the screen, if you press [MATH], 1: Frac, and then [ENTER], you will convert everything in the matrix to fractions. Then, as before, you can press the right arrow key to see the whole thing.

That’s it! It sounds like a lot, but it is actually simple to get used to. It’s useful too: being able to enter matrices into the calculator lets you add them, multiply them, etc! Nice!

Oh yeah – so what happens if your matrix is singular (or NOT invertible)? In other words, what happens if your matrix doesn’t have an inverse?

Your calculator will TELL YOU: instead of an inverse, it displays an error message. How nice is that?

## 2.6: The Matrix Inverse

Given a matrix, the task is to find the inverse of this matrix using the Gauss-Jordan method.
What is a matrix?

A matrix is an ordered rectangular array of numbers.

### Inverse of a matrix:

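Given a square matrix A that is non-singular (meaning the determinant of A is nonzero), there exists a matrix $A^{-1}$, called the inverse of A. The inverse exists only when the following hold:

1. The matrix must be a square matrix.
2. The matrix must be a non-singular matrix, and
3. There exists an identity matrix I for which $AA^{-1} = A^{-1}A = I$.

In general, the inverse of an n × n matrix A can be found using this simple formula: $A^{-1} = \frac{\operatorname{adj}(A)}{\det(A)}$, where adj(A) is the adjugate of A.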

### Methods for finding Inverse of Matrix:

1. Elementary Row Operation (Gauss-Jordan Method) (Efficient)
2. Minors, Cofactors and Adjugate Method (Inefficient)

### Elementary Row Operation (Gauss – Jordan Method):

The Gauss-Jordan method is a variant of Gaussian elimination in which row reduction is used to find the inverse of a matrix.
To find the inverse of a matrix using the Gauss-Jordan method, follow these steps:

1. Form the augmented matrix by appending the identity matrix to the given matrix.
2. Perform row reduction on this augmented matrix to bring it to reduced row echelon form.
3. The following row operations are performed on the augmented matrix when required:
• Interchange any two rows.
• Multiply each element of a row by a non-zero scalar.
• Replace a row by the sum of itself and a constant multiple of another row of the matrix.

These steps translate directly into a short program that applies the row operations to the augmented matrix.
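As a minimal sketch in Python (chosen here for brevity; the function name `invert` and the tolerance `tol` are illustrative assumptions), the three row operations might be combined as follows. Rows are interchanged so that the largest available entry becomes the pivot, a common safeguard in floating-point arithmetic.

```python
def invert(matrix, tol=1e-12):
    """Gauss-Jordan inversion with row interchanges; raises ValueError if singular."""
    n = len(matrix)
    # Step 1: augment the matrix with the identity matrix.
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # Row operation: interchange rows so the largest candidate is the pivot.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < tol:
            raise ValueError("matrix is singular to working precision")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Row operation: multiply the pivot row by a non-zero scalar (1 / pivot).
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        # Row operation: add a multiple of the pivot row to every other row.
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    # The left block is now the identity; the right block is the inverse.
    return [row[n:] for row in aug]

# The top-left entry is 0, so this input needs a row interchange.
inverse = invert([[0, 2], [1, 3]])
print(inverse)
```

Raising an exception for a singular input is one design choice among several; returning a flag or `None` would work equally well.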

## *2.6: Matrix Inversion

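The inverse of an $n \times n$ matrix $A$ can be computed by solving the equations

$$AX = I \qquad (2.33)$$

where $I$ is the $n \times n$ identity matrix. The solution $X$, also of size $n \times n$, will be the inverse of $A$. The proof is simple: after we premultiply both sides of Eq. (2.33) by $A^{-1}$ we have $A^{-1}AX = A^{-1}I$, which reduces to $X = A^{-1}$.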

Inversion of large matrices should be avoided whenever possible due to its high cost. As seen from Eq. (2.33), inversion of $A$ is equivalent to solving $Ax_i = b_i$ with $i = 1, 2, \ldots, n$, where $b_i$ is the $i$th column of $I$. If LU decomposition is employed in the solution, the solution phase (forward and back substitution) must be repeated $n$ times, once for each $b_i$. Since the cost of computation is proportional to $n^3$ for the decomposition phase and $n^2$ for each vector of the solution phase, inversion is considerably more expensive than the solution of $Ax = b$ (single constant vector $b$).

Matrix inversion has another serious drawback: a banded matrix loses its structure during inversion. In other words, if $A$ is banded or otherwise sparse, its inverse is in general fully populated.
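Both points can be illustrated with a short NumPy sketch. The tridiagonal test matrix is an arbitrary choice for illustration, and `np.linalg.solve` with a matrix right-hand side stands in for an LU factorization followed by one solution phase per column of $I$.

```python
import numpy as np

n = 6
# Tridiagonal (banded) test matrix: 2 on the diagonal, -1 beside it.
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

# Inversion as the solution of A X = I: one factorization of A, then one
# solve per column b_i of the identity matrix.
I = np.eye(n)
X = np.linalg.solve(A, I)

print("A X = I holds:   ", np.allclose(A @ X, I))
print("nonzeros in A:   ", np.count_nonzero(A))
print("nonzeros in A^-1:", np.count_nonzero(np.abs(X) > 1e-12))
```

The banded matrix stores only a few nonzeros per row, while every entry of its inverse is nonzero, which is why solving $Ax = b$ directly is preferred for banded or sparse systems.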

To find the inverse of a matrix $A$ using Gauss-Jordan elimination, find the sequence of elementary row operations that reduces $A$ to the identity, and then perform the same operations on $I_n$ to obtain $A^{-1}$.

### Inverse of 2 $\times$ 2 matrices

Example 1: Find the inverse of

Step 1: Adjoin the identity matrix to the right side of $A$:

Step 2: Apply row operations to this matrix until the left side is reduced to $I$. The computations are:

Step 3: Conclusion: The inverse matrix is:

### Non-invertible matrix

If $A$ is not invertible, then a zero row will show up on the left side.

Example 2: Find the inverse of

Step 1: Adjoin the identity matrix to the right side of A:

Step 2: Apply row operations

Step 3: Conclusion: This matrix is not invertible.
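To illustrate how the zero row appears, here is a worked reduction for a hypothetical singular matrix (chosen for illustration; it is not necessarily the matrix of Example 2), written as a LaTeX sketch:

```latex
\left[\begin{array}{cc|cc} 1 & 2 & 1 & 0 \\ 2 & 4 & 0 & 1 \end{array}\right]
\xrightarrow{R_2 = R_2 - 2R_1}
\left[\begin{array}{cc|cc} 1 & 2 & 1 & 0 \\ 0 & 0 & -2 & 1 \end{array}\right]
```

The second row of the left block is entirely zero, so no further row operations can produce the identity there, and the matrix has no inverse. This is consistent with its determinant, $1 \cdot 4 - 2 \cdot 2 = 0$.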

### Inverse of 3 $\times$ 3 matrices

Example 1: Find the inverse of

Step 1: Adjoin the identity matrix to the right side of A:

Step 2: Apply row operations to this matrix until the left side is reduced to I. The computations are: