# 4.2: Properties of Eigenvalues and Eigenvectors - Mathematics

## Eigenvalues and Eigenvectors

#### 1. Definitions

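Let $A$ be a square matrix. An eigenvalue of $A$ is a scalar $\lambda$ such that

$$A x = \lambda x$$

for some nonzero column vector $x$, which is called an eigenvector of $A$ corresponding to $\lambda$.

The above equation can also be written:

$$(A - \lambda I)\, x = 0$$

The alternative form of the equation is satisfied by some nonzero $x$ if, and only if, the matrix $A - \lambda I$ is singular.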

This gives us alternative definitions of the eigenvalue. An eigenvalue of the matrix A is a root of the polynomial det( A - z I ) , which is called the characteristic polynomial of the matrix, or a solution of the characteristic equation det( A - z I ) = 0 . The eigenvalue itself is also called a characteristic root of the matrix.

#### 2. Basic Properties

A matrix over an algebraically closed field, such as the field of complex numbers, always has at least one eigenvalue. Since the characteristic polynomial of an $n \times n$ matrix is of degree $n$, the matrix cannot have more than $n$ eigenvalues.

Even if a field is not algebraically closed, the eigenvalues of a matrix exist in the splitting field of its characteristic polynomial.

Theorem 2.1. A square matrix has the same characteristic polynomial and the same eigenvalues as its transpose.

Proof. The characteristic polynomials of $A$ and $A^T$ are equal for all values of the independent variable:

$$\det(A^T - zI) = \det\big((A - zI)^T\big) = \det(A - zI)$$

In an infinite field, such as the field of real numbers, this is sufficient to show that the polynomials are identical, i.e., they have equal coefficients. However, each coefficient is itself a polynomial in the entries of $A$ and $A^T$. These can be equal for all values of the entries only if they have equal coefficients. Hence the two polynomials are identical in any field.

Identical characteristic polynomials yield identical eigenvalues. █

Theorem 2.2. Similar matrices have the same characteristic polynomial and the same eigenvalues.

Proof. If $Q$ is nonsingular, then $\det(Q^{-1}) \det(Q) = 1$, and the characteristic polynomials of $A$ and $Q^{-1} A Q$ are equal for all values of the independent variable:

$$\det(Q^{-1} A Q - zI) = \det\big(Q^{-1}(A - zI)Q\big) = \det(Q^{-1}) \det(A - zI) \det(Q) = \det(A - zI)$$

The rest of the argument is the same as for Theorem 2.1, except that not all values of the entries of $Q$ will make $Q$ nonsingular. However, all values within some interval will, so the argument is still valid. █

Since similar matrices can be interpreted as matrices of the same linear transformation with respect to different bases, it makes sense to define the characteristic polynomial and eigenvalues of a linear transformation of a finite-dimensional vector space, without regard to any particular basis.

Theorem 2.3. The matrices AB and BA have the same characteristic polynomial and the same eigenvalues.

Proof. If $A$ is nonsingular, then $AB$ and $BA$ are similar:

$$BA = A^{-1}(AB)A$$

and the desired result follows from Theorem 2.2. The extension to all $A$ uses a similar argument. █
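Theorems 2.1 to 2.3 can be spot-checked numerically. The following is a minimal sketch, assuming NumPy is available; the matrices `A`, `B`, and `Q` are hypothetical values chosen only for illustration, and `char_poly` is a helper name introduced here. Note that `np.poly` returns the coefficients of $\det(zI - M)$, which differs from $\det(M - zI)$ only by the factor $(-1)^n$.

```python
import numpy as np

# Hypothetical sample matrices, chosen only for illustration.
A = np.array([[2.0, 1.0, 0.0],
              [0.0, 3.0, 1.0],
              [1.0, 0.0, 1.0]])
B = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 1.0, 0.0]])
Q = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 1.0]])  # unit upper triangular, hence nonsingular


def char_poly(M):
    """Coefficients of det(zI - M), highest power first."""
    return np.poly(M)


# Theorem 2.1: a matrix and its transpose.
same_as_transpose = np.allclose(char_poly(A), char_poly(A.T))
# Theorem 2.2: a matrix and a similar matrix Q^{-1} A Q.
same_as_similar = np.allclose(char_poly(A), char_poly(np.linalg.inv(Q) @ A @ Q))
# Theorem 2.3: the products AB and BA.
same_for_products = np.allclose(char_poly(A @ B), char_poly(B @ A))

print(same_as_transpose, same_as_similar, same_for_products)
```

A floating-point check like this illustrates the theorems on one instance; it is not a substitute for the proofs.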

If $\lambda$ is an eigenvalue of the matrix $A$, then the set $\{\, x \mid Ax = \lambda x \,\}$ is a nontrivial subspace, which is called the eigenspace corresponding to $\lambda$.

Every matrix has at least one eigenvalue (perhaps in an extension field), but since the characteristic polynomial may have multiple roots, the number of distinct eigenvalues of an $n \times n$ matrix may be less than $n$. It can never be greater than $n$.

Every eigenvalue $\lambda$ of a matrix $A$ has an algebraic multiplicity, which is the number of times $z - \lambda$ appears in a complete factorization of the characteristic polynomial. It also has a geometric multiplicity, which is the dimension of its eigenspace. The algebraic multiplicities of all the eigenvalues of an $n \times n$ matrix always add up to $n$. The geometric multiplicity of an eigenvalue cannot exceed its algebraic multiplicity. The proof of this fact will come later. The geometric multiplicity may be strictly less, as in the following simple example: the matrix $\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$ has the single eigenvalue 1 with algebraic multiplicity 2, but its eigenspace is spanned by $(1, 0)^T$ alone, so the geometric multiplicity is 1.
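The gap between the two multiplicities can be checked numerically. This is a minimal sketch, assuming NumPy and assuming the shear matrix below is a suitable illustration; the variable names are introduced here. The geometric multiplicity is computed as $n - \operatorname{rank}(A - \lambda I)$, the dimension of the null space.

```python
import numpy as np

# Assumed example: a 2x2 shear matrix with the repeated eigenvalue 1.
A = np.array([[1.0, 1.0],
              [0.0, 1.0]])
lam = 1.0
n = A.shape[0]

# Algebraic multiplicity: how many eigenvalues (roots of the
# characteristic polynomial) coincide with lam.
eigenvalues = np.linalg.eigvals(A)
algebraic = int(np.sum(np.isclose(eigenvalues, lam)))

# Geometric multiplicity: dimension of the eigenspace,
# i.e. n minus the rank of (A - lam * I).
geometric = int(n - np.linalg.matrix_rank(A - lam * np.eye(n)))

print(algebraic, geometric)
```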

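The characteristic polynomial can be factored into linear factors (in its splitting field, if necessary):

$$\det(A - zI) = (\lambda_1 - z)(\lambda_2 - z) \cdots (\lambda_n - z)$$

Substitution of $z = 0$ shows that the determinant is the product of the eigenvalues:

$$\det(A) = \lambda_1 \lambda_2 \cdots \lambda_n$$

The coefficient of $z^{n-1}$ in the determinant is $(-1)^{n-1}$ times the sum of the diagonal entries in the matrix. The coefficient of $z^{n-1}$ in the product of linear factors is $(-1)^{n-1}$ times the sum of the eigenvalues. Hence:

$$a_{11} + a_{22} + \cdots + a_{nn} = \lambda_1 + \lambda_2 + \cdots + \lambda_n$$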

The sum of the diagonal entries in a square matrix is called the trace of the matrix. Since it is the sum of the eigenvalues, it is the same for similar matrices, and hence it can be considered a property of a linear transformation.
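The two identities (determinant as the product of the eigenvalues, trace as their sum) are easy to confirm on a concrete matrix. The sketch below assumes NumPy; the matrix `A` is a hypothetical example, not one taken from the text.

```python
import numpy as np

# Hypothetical sample matrix, for illustration only.
A = np.array([[4.0, 1.0, 2.0],
              [0.0, 3.0, 1.0],
              [1.0, 0.0, 2.0]])

# Eigenvalues may be complex in general; sums and products of
# conjugate pairs are real up to rounding.
eigenvalues = np.linalg.eigvals(A)

trace_matches = bool(np.isclose(np.trace(A), eigenvalues.sum()))
det_matches = bool(np.isclose(np.linalg.det(A), eigenvalues.prod()))

print(trace_matches, det_matches)
```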

The characteristic polynomial of a direct sum of matrices is the product of their characteristic polynomials:

$$\det\big((A \oplus B) - zI\big) = \det(A - zI)\,\det(B - zI)$$

Its eigenvalues are clearly those of the constituent matrices.

#### 3. Triangular and Diagonal Matrices

If the square matrix $A$ is triangular, then $A - zI$ is also triangular, and its determinant is the product of its diagonal elements:

$$\det(A - zI) = (a_{11} - z)(a_{22} - z) \cdots (a_{nn} - z)$$

This makes the characteristic equation particularly easy to solve. The eigenvalues of A are its diagonal elements.

Theorem 3.1. Every square matrix is similar (over the splitting field of its characteristic polynomial) to an upper triangular matrix.

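Proof. We use induction on the size of the matrix. For a $1 \times 1$ matrix the result is trivial.

Now assume that every $(n-1) \times (n-1)$ matrix is similar to an upper triangular matrix, and let $A$ be $n \times n$.

Let $\lambda$ be an eigenvalue of $A$ and let $r_1$ be a corresponding eigenvector. Find additional column vectors so that $r_1, r_2, \ldots, r_n$ are linearly independent. Let these column vectors be the columns of the matrix $R$.

Now let $e_1$ be the $n \times 1$ column vector with 1 as its first element and zeros elsewhere. Then it is easily shown that for any $n \times n$ matrix $M$, the product $M e_1$ is the first column of $M$.

In particular, $R e_1 = r_1$ and $R^{-1} r_1 = e_1$.

Then the first column of $R^{-1} A R$ is $R^{-1} A R e_1 = R^{-1} A r_1 = \lambda R^{-1} r_1 = \lambda e_1$. Hence $R^{-1} A R$ can be partitioned as follows:

$$R^{-1} A R = \begin{pmatrix} \lambda & v \\ 0 & B \end{pmatrix}$$

Here $v$ is $1 \times (n-1)$, $B$ is $(n-1) \times (n-1)$, and $0$ consists of zeros.

By the inductive hypothesis there is a nonsingular $(n-1) \times (n-1)$ matrix $S$ such that $S^{-1} B S$ is upper triangular. Hence

$$\begin{pmatrix} 1 & 0 \\ 0 & S \end{pmatrix}^{-1} \begin{pmatrix} \lambda & v \\ 0 & B \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & S \end{pmatrix} = \begin{pmatrix} \lambda & vS \\ 0 & S^{-1} B S \end{pmatrix}$$

is upper triangular and similar to $A$. █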

Of course, a similar argument shows that every square matrix is similar to a lower triangular matrix.

Theorem 3.2. Eigenvectors corresponding to distinct eigenvalues are linearly independent.

Proof. We use induction on the number $m$ of eigenvectors (which may be less than the size of the matrix). For $m = 1$ the assertion is trivial.

Now assume that the assertion holds for $m-1$ eigenvectors, that $x_1, x_2, \ldots, x_m$ are eigenvectors corresponding to the distinct eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_m$, and let

$$c_1 x_1 + c_2 x_2 + \cdots + c_m x_m = 0$$

Multiply both sides by the matrix $A - \lambda_m I$ to obtain:

$$c_1 (\lambda_1 - \lambda_m) x_1 + c_2 (\lambda_2 - \lambda_m) x_2 + \cdots + c_{m-1} (\lambda_{m-1} - \lambda_m) x_{m-1} = 0$$

By inductive hypothesis, $c_k (\lambda_k - \lambda_m) = 0$ for every $1 \le k \le m-1$. Since $\lambda_k \ne \lambda_m$, it follows that $c_k = 0$ in every case. Therefore, $c_m x_m = 0$. Since $x_m$ is nonzero, $c_m$ must also be zero, and the eigenvectors are linearly independent. █

Corollary 3.3. A matrix with distinct eigenvalues is similar (over the splitting field of its characteristic polynomial) to a diagonal matrix whose diagonal elements are the eigenvalues.
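Corollary 3.3 can be illustrated numerically: when the eigenvalues are distinct, the matrix $R$ whose columns are eigenvectors is nonsingular and $R^{-1} A R$ is diagonal. This is a minimal sketch assuming NumPy; the matrix `A` is a hypothetical example with eigenvalues 2 and 5.

```python
import numpy as np

# Hypothetical matrix with distinct eigenvalues 2 and 5.
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

# Columns of R are eigenvectors; eigenvalues[i] belongs to R[:, i].
eigenvalues, R = np.linalg.eig(A)

# Theorem 3.2: eigenvectors for distinct eigenvalues are independent,
# so R has full rank and can be inverted.
independent = bool(np.linalg.matrix_rank(R) == A.shape[0])

# Corollary 3.3: the similarity transform yields a diagonal matrix
# whose diagonal elements are the eigenvalues.
D = np.linalg.inv(R) @ A @ R
is_diagonal = bool(np.allclose(D, np.diag(eigenvalues)))

print(independent, is_diagonal)
```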

#### 4. Matrix Polynomials

For a square matrix $A$, we can define matrix powers as repeated multiplication of $A$ or $A^{-1}$ (if $A$ is nonsingular):

$$A^0 = I, \qquad A^k = A\,A^{k-1}, \qquad A^{-k} = (A^{-1})^k \qquad (k = 1, 2, 3, \ldots)$$

Let $p(z) = c_m z^m + c_{m-1} z^{m-1} + \cdots + c_1 z + c_0$ be a polynomial and let $A$ be a square matrix, both over the same field. Although a polynomial is usually evaluated only for scalars, it is possible to substitute the matrix $A$ for $z$ (and $c_0 I$ for the constant term) to obtain:

$$p(A) = c_m A^m + c_{m-1} A^{m-1} + \cdots + c_1 A + c_0 I$$

Matrix polynomials have many of the properties of polynomials in general. Multiplication of polynomials in the same matrix is commutative.

Since similarity preserves matrix operations, application of the same polynomial to similar matrices produces similar results:

$$p(R^{-1} A R) = R^{-1} p(A) R$$

Theorem 4.1. For every square matrix A , there is a unique monic polynomial m(z) (over the same field) of minimal degree for which m( A ) = 0 . It divides every polynomial p(z) (over the same field or an extension field) for which p( A ) = 0 .

Proof. If A is the zero matrix, then clearly m(z) = z . (The argument below is valid in this case, but it is hard to follow.)

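In other cases, the matrices $I, A, A^2, A^3, \ldots$ are vectors in a finite-dimensional vector space (with the usual definitions of addition and scalar multiplication), so there is a minimum value of $k$ such that $I, A, A^2, \ldots, A^k$ are linearly dependent. Then

$$c_k A^k + c_{k-1} A^{k-1} + \cdots + c_1 A + c_0 I = 0$$

for some scalars $c_0, c_1, \ldots, c_k$, not all zero. By the minimality of $k$, $c_k \ne 0$, so dividing by $c_k$ gives a monic polynomial $m(z)$ of degree $k$ with $m(A) = 0$, and no nonzero polynomial of lower degree has this property.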

Now let p(z) be any polynomial with p( A ) = 0 . By the division algorithm, p(z) = m(z)q(z) + r(z) , where r(z) is zero or of lower degree than m(z) . But substitution of A for z shows that r( A ) = 0 , which is impossible for a nonzero r(z) because m(z) is the nonzero polynomial of lowest degree with this property. Hence r(z) is the zero polynomial and m(z) divides p(z) .

The uniqueness of m(z) follows from the fact that monic polynomials of the same degree which divide each other must be identical. █

The polynomial m(z) is called the minimum polynomial of the matrix A . It is easily seen that similar matrices have the same minimum polynomial. Hence it makes sense to define the minimum polynomial of a linear transformation of a finite-dimensional vector space, without regard to the basis.

The following generalization of Theorem 4.1 shows that there is a minimum polynomial for every nontrivial subspace, relative to a given square matrix.

Theorem 4.2. For every $n \times n$ matrix $A$ and every nontrivial subspace $S$ of the vector space of $n$-dimensional column vectors (over the same field), there is a unique monic polynomial $m_S(z)$ (over the same field) of minimal degree such that $m_S(A)\,x = 0$ for every $x \in S$. It divides every polynomial $p(z)$ (over the same field or an extension field) such that $p(A)\,x = 0$ for every $x \in S$. Moreover, if $S$ is a subspace of $T$, then $m_S(z)$ divides $m_T(z)$.

Proof. Clearly, such a minimum polynomial exists, because the minimum polynomial of the matrix has the desired property, although its degree may not be minimal.

Now let $p(z)$ be any polynomial with $p(A)\,x = 0$ for every $x \in S$ (which includes $m_T(z)$ whenever $S$ is a subspace of $T$). By the division algorithm, $p(z) = m_S(z) q(z) + r(z)$, where $r(z)$ is zero or of lower degree than $m_S(z)$. But substitution of $A$ for $z$ shows that $r(A)\,x = 0$ for every $x \in S$, which is impossible for a nonzero $r(z)$ because $m_S(z)$ is the nonzero polynomial of lowest degree with this property. Hence $r(z)$ is the zero polynomial and $m_S(z)$ divides $p(z)$. █

The minimum polynomial of the whole space is also the minimum polynomial of the matrix, because $p(A)\,x = 0$ for all $x$ if and only if $p(A) = 0$. The minimum polynomial of the space of all scalar multiples of a nonzero vector $x$ is often called the minimum polynomial of $x$.

Of course, the minimum polynomial is always defined with respect to a matrix (or its associated linear transformation). Since most discussions involving minimum polynomials refer to only a single matrix, it is usually understood.

The following theorem is called the Cayley-Hamilton Theorem .

Theorem 4.3. Let p(z) be the characteristic polynomial of a square matrix A . Then p( A ) = 0 , and the minimum polynomial of A divides p(z).

Proof. We first show that the theorem holds for any diagonal matrix. Operations on diagonal matrices are especially simple. The sum and product of two diagonal matrices are obtained by adding and multiplying corresponding diagonal entries, respectively. Hence $p(A)$ is the diagonal matrix whose $i$-th diagonal entry is $p(a_{ii})$. But $a_{ii}$ is an eigenvalue of $A$, so $p(a_{ii}) = 0$. Hence $p(A) = 0$.

Now the theorem also holds for any matrix similar to a diagonal matrix, because $p(R^{-1} A R) = R^{-1} p(A) R$ and similar matrices have the same characteristic polynomial. In particular, the theorem holds for any matrix with distinct eigenvalues.

If every square matrix had distinct eigenvalues, the proof would end here.

We can use a continuity argument to extend the theorem to complex matrices that do not have distinct eigenvalues. Although the matrix $A$ may not be similar to a diagonal matrix, it is similar to an upper triangular matrix $T$. The eigenvalues of this matrix appear along its main diagonal, as is the case with any upper triangular matrix. It is easy to construct a sequence $T_1, T_2, T_3, \ldots$ of upper triangular matrices with distinct eigenvalues whose limit is $T$. If $p_k(z)$ is the characteristic polynomial of $T_k$, then $p_k(T_k) = 0$. Since the coefficients of $p_k$ depend continuously on the entries of $T_k$, $\lim_{k \to \infty} p_k(T_k) = p(T) = 0$. Since $A$ is similar to $T$, $p(A) = 0$.

Finally, we can extend the result to matrices over any field. In the complex case, each entry in p( A ) is a polynomial in the entries in A which takes on the value zero for all values of the independent variables. This is possible only if all coefficients are zero, when like terms are combined. Hence the polynomial evaluates to zero in any field. (In fact, it evaluates to zero in any commutative ring.) █
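The Cayley-Hamilton Theorem can be checked on a matrix that is not diagonalizable, which is exactly the case the continuity argument is needed for. The sketch below assumes NumPy; the matrix `A` is a hypothetical example with the repeated eigenvalue 2. `np.poly` returns the coefficients of $\det(zI - A)$, which vanishes at $A$ if and only if $\det(A - zI)$ does.

```python
import numpy as np

# Hypothetical matrix: repeated eigenvalue 2 with a single eigenvector,
# so it is not similar to a diagonal matrix.
A = np.array([[2.0, 1.0, 0.0],
              [0.0, 2.0, 0.0],
              [0.0, 0.0, 3.0]])
n = A.shape[0]

# Coefficients of det(zI - A), highest power first.
coeffs = np.poly(A)

# Horner's rule with matrix arithmetic; each scalar coefficient c
# enters as c * I, so the constant term becomes c0 * I.
P = np.zeros_like(A)
for c in coeffs:
    P = P @ A + c * np.eye(n)

holds = bool(np.allclose(P, 0.0, atol=1e-9))
print(holds)
```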

#### 5. The First Matrix Decomposition Theorem

It has been shown that any matrix with distinct eigenvalues is similar to a diagonal matrix. A matrix with repeated eigenvalues may or may not be similar to a diagonal matrix, but it is similar to a matrix which is nearly diagonal, according to the two decomposition theorems.

Theorem 5.1. A square matrix is similar (over the splitting field of its characteristic polynomial) to a direct sum of matrices, each of which has only a single eigenvalue, which is an eigenvalue of the original matrix, and whose size is the algebraic multiplicity of the eigenvalue.

Proof. The proof is by induction on the number of distinct eigenvalues. If there is only one, the theorem is trivially true.

Let $\lambda$ be an eigenvalue of the $n \times n$ matrix $A$ of algebraic multiplicity $k$, and let $\lambda_1, \lambda_2, \ldots, \lambda_{n-k}$ be its other eigenvalues (repeated according to multiplicity). Then its characteristic polynomial can be written (up to sign) as $p(z) = q(z) r(z)$, where $q(z) = (z - \lambda)^k$ and $r(z) = (z - \lambda_1)(z - \lambda_2) \cdots (z - \lambda_{n-k})$.

Now let x be any vector in the null space of q( A ) , i.e. q( A ) x = 0 . Since multiplication of polynomials in the same matrix is commutative, q( A )( Ax ) = A q( A ) x = 0 , so Ax is also in the null space. Hence the null space of q( A ) is invariant under the linear transformation whose matrix is A .

Similarly, the null space of r( A ) is also invariant under the linear transformation whose matrix is A .

Since $q(z)$ and $r(z)$ are relatively prime, by the Euclidean algorithm there are polynomials $s(z)$ and $t(z)$ such that $s(z) q(z) + t(z) r(z) = 1$ and $s(A) q(A) + t(A) r(A) = I$, and consequently $s(A) q(A)\,x + t(A) r(A)\,x = x$ for any vector $x$.

Now s( A )q( A ) x is in the null space of r( A ) because r( A )s( A )q( A ) x = s( A )q( A )r( A ) x = s( A )p( A ) x and p( A ) = 0 by the Cayley-Hamilton Theorem.

Similarly, t( A )r( A ) x is in the null space of q( A ) .

Hence an arbitrary vector x can be expressed as a sum of elements of the two null spaces.

Moreover, the only common element of the null spaces is zero, because $q(A)\,x = 0$ and $r(A)\,x = 0$ imply that $x = s(A) q(A)\,x + t(A) r(A)\,x = 0$.

Therefore, the entire vector space is the direct sum of the two null spaces, each of which is an invariant subspace under the linear transformation whose matrix is A .

It follows that $A$ is similar to the direct sum $A_q \oplus A_r$ of the matrices of the transformation restricted to the null spaces of $q(A)$ and $r(A)$, respectively, and the characteristic polynomial of $A$ is the product of the characteristic polynomials of $A_q$ and $A_r$.

Now let $\omega$ be an eigenvalue of $A_q$ and let $x$ be the corresponding eigenvector (expressed in the original basis). Then $0 = q(A)\,x = (A - \lambda I)^k x = (\omega - \lambda)^k x$, which implies that $\omega = \lambda$. A similar argument shows that $\lambda$ cannot be an eigenvalue of $A_r$.

Hence the characteristic polynomials of $A_q$ and $A_r$ are $q(z)$ and $r(z)$, respectively, and $A_q$ has the required properties.

To complete the proof, we apply the inductive hypothesis to $A_r$. █

#### 6. Decomposition of Nilpotent Matrices

A nilpotent matrix is a square matrix with some power equal to the zero matrix, i.e., the square matrix $A$ is nilpotent if and only if $A^m = 0$ for some positive integer $m$. It is easily shown that all eigenvalues of $A$ must be zero. Conversely, if all eigenvalues of a matrix are zero, the Cayley-Hamilton Theorem shows that the matrix is nilpotent. The linear transformation associated with a nilpotent matrix is also said to be nilpotent.

Most of the concepts used in this section are also defined for matrices that are not nilpotent, but we will not need the more general definitions.

The minimum polynomial of a nilpotent matrix is always a power of $z$, and so are the minimum polynomials of nontrivial subspaces and nonzero vectors. Therefore, we shall in most cases use the degree of the minimum polynomial (abbreviated DMP) to refer to the minimum polynomial of a nilpotent matrix, the linear transformation associated with it, or a nontrivial subspace or nonzero vector.

Let $m$ be the DMP of the vector $x$ with respect to the nilpotent matrix $A$. Then $A^m x = 0$ but $A^{m-1} x \ne 0$.

We first show that the vectors $x, Ax, A^2 x, \ldots, A^{m-1} x$ are linearly independent. To this end, let $c_0 x + c_1 A x + c_2 A^2 x + \cdots + c_{m-1} A^{m-1} x = 0$. If we multiply both sides by $A^{m-1}$, we obtain $c_0 A^{m-1} x = 0$. Hence $c_0 = 0$. Now we multiply both sides by $A^{m-2}$ to obtain $c_1 A^{m-1} x = 0$. Hence $c_1 = 0$. Continuing in this way, we show that all coefficients must be zero.

The $m$-dimensional subspace spanned by $x, Ax, A^2 x, \ldots, A^{m-1} x$ is an invariant subspace. It is called a cyclic invariant subspace of $A$. Notice that its DMP and dimension are always equal.

Lemma 6.1. Given a nilpotent linear transformation of a finite-dimensional vector space, there is always a cyclic invariant subspace whose DMP is the same as that of the transformation.

Proof. The DMP of the transformation is always the maximum DMP of any of the vectors in a basis for the vector space. Since the basis is finite, it is equal to the DMP of at least one basis vector. █

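Lemma 6.2. If $e$ and $f$ are two nonnegative integers whose sum is less than or equal to the dimension of a cyclic invariant subspace for the nilpotent matrix $A$, and $A^e s = 0$ for a vector $s$ in the subspace, then there is a vector $r$ in the subspace for which $A^f r = s$.

Proof. Let $x, Ax, A^2 x, \ldots, A^{m-1} x$ be a basis for the subspace. Then

$$s = c_0 x + c_1 A x + \cdots + c_{m-1} A^{m-1} x$$

for some scalars $c_0, \ldots, c_{m-1}$. Since $A^m x = 0$, applying $A^e$ gives $0 = A^e s = c_0 A^e x + c_1 A^{e+1} x + \cdots + c_{m-e-1} A^{m-1} x$, and the linear independence of the basis vectors forces $c_j = 0$ for every $j < m - e$. Because $f \le m - e$, every remaining term $c_j A^j x$ has $j \ge f$, so

$$r = \sum_{j=m-e}^{m-1} c_j A^{j-f} x$$

is a vector in the subspace with $A^f r = s$. █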

A generalization of the following theorem is often called the second decomposition theorem.

Theorem 6.3. Given a nilpotent linear transformation of a finite-dimensional vector space, the space is a direct sum of cyclic invariant subspaces of the transformation.

Proof. The proof is by induction on the dimension n of the vector space. It is trivial for n = 1 (and also fairly easy to prove for n = 2 ).

When n > 1 , let A be the matrix of the transformation (relative to any basis), and let m be its DMP.

By Lemma 6.1, there is a cyclic invariant subspace S whose DMP is m . If m = n , S is the whole space and the proof is complete.

If $m < n$, consider the relation between vectors defined by $x \sim y$ if $x - y \in S$. It is easily shown that it is an equivalence relation. Moreover, vector operations on equivalent vectors give equivalent results, i.e., if $x \sim x'$, $y \sim y'$ and $c$ is any scalar, then $x + y \sim x' + y'$ and $cx \sim cx'$.

Now consider the set $T^*$ of equivalence classes. For every vector $x$, let $x^*$ be the class containing it. The vector $x$ is called a representative of the class. It is not unique; clearly $(x + s)^* = x^*$ for any $s \in S$.

Then it is easily shown that the following definitions make $T^*$ a vector space (over the same field as $A$):

$$x^* + y^* = (x + y)^*, \qquad c\,x^* = (cx)^*$$

Let $u_1, u_2, \ldots, u_m$ be any basis for $S$ and let $v_1^*, v_2^*, \ldots, v_k^*$ be any basis for $T^*$. It is easily shown that $u_1, u_2, \ldots, u_m, v_1, v_2, \ldots, v_k$ is a basis for the whole $n$-dimensional vector space. Moreover, each vector $v_j$ can be replaced by any other representative of the same class, and the same assertions will still hold. Consequently, the dimension of $T^*$ is $n - m$.

Since $S$ is an invariant subspace with respect to $A$, we can define a linear transformation of $T^*$ by

$$A^* x^* = (Ax)^*$$

This transformation is obviously nilpotent. Let $p$ be its DMP.

Now $A^m = 0$, which is equivalent to $A^m x = 0$ for every vector $x$. Then clearly $(A^*)^m x^* = 0^*$ for every equivalence class $x^*$, and $p \le m$.

Since the dimension of $T^*$ is less than $n$, the inductive hypothesis states that it is a direct sum of cyclic invariant subspaces. The DMP of every one is less than or equal to $p$, which is less than or equal to $m$.

Consider one such subspace whose basis is $h^*, (Ah)^*, (A^2 h)^*, \ldots, (A^{r-1} h)^*$, where $r \le m$.

Now $(A^r h)^* = 0^*$, which is equivalent to $A^r h \in S$.

Now multiply by $A^{m-r}$ (this is where $r \le m$ is used) to obtain $A^{m-r} A^r h = A^m h$. Since $m$ is the DMP of $A$, $A^m h = 0$.

By Lemma 6.2 (with $e = m - r$, $f = r$, and $s = A^r h$), there must be a vector $t \in S$ such that $A^r t = A^r h$.

Let $g = h - t$. Then the basis for the cyclic invariant subspace of $T^*$ can be rewritten as $g^*, (Ag)^*, (A^2 g)^*, \ldots, (A^{r-1} g)^*$, where $A^r g = 0$.

Then $g, Ag, A^2 g, \ldots, A^{r-1} g$ are a basis for a cyclic invariant subspace of the whole space.

This can be done for every cyclic invariant subspace of T * . The resulting subspaces, together with S , yield the required decomposition. █

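The matrix of a nilpotent transformation over a cyclic invariant subspace relative to the basis $A^{m-1} x, \ldots, A^2 x, Ax, x$ (the cyclic basis listed in reverse order) is particularly simple. It has ones just above the main diagonal and zeros elsewhere, as in this four-dimensional example:

$$\begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{pmatrix}$$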

The matrix of a nilpotent linear transformation relative to a basis consisting of the combined bases of its cyclic invariant subspaces is a direct sum of such matrices. Therefore, we have proven:

Theorem 6.4. A nilpotent matrix is similar to a direct sum of matrices, each of which has ones just above the main diagonal and zeros elsewhere.
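The structure described in this section can be seen directly on a single block. The following is a minimal sketch, assuming NumPy; the block size 4 and the names `N`, `x`, `dmp`, and `krylov` are illustrative choices. It builds the matrix with ones just above the main diagonal, finds the degree of its minimum polynomial (the smallest power that vanishes), and confirms that $x, Nx, N^2 x, N^3 x$ are linearly independent for a vector $x$ of maximal DMP.

```python
import numpy as np

m = 4
# Ones just above the main diagonal, zeros elsewhere.
N = np.diag(np.ones(m - 1, dtype=int), k=1)

# Powers N^0, N^1, ..., N^m.
powers = [np.linalg.matrix_power(N, k) for k in range(m + 1)]

# DMP of N: the smallest k >= 1 with N^k = 0.
dmp = next(k for k in range(1, m + 1) if not powers[k].any())

# The last standard basis vector has DMP m, so it generates the
# whole space as a cyclic invariant subspace.
x = np.zeros(m, dtype=int)
x[-1] = 1
krylov = np.column_stack([powers[k] @ x for k in range(m)])
rank = int(np.linalg.matrix_rank(krylov))

print(dmp, rank)
```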

#### 7. The Jordan Canonical Form

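A Jordan block is an upper triangular matrix with a single eigenvalue, which appears on the main diagonal, ones just above the main diagonal, and zeros elsewhere:

$$\begin{pmatrix} \lambda & 1 & 0 & \cdots & 0 \\ 0 & \lambda & 1 & \cdots & 0 \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda & 1 \\ 0 & 0 & \cdots & 0 & \lambda \end{pmatrix}$$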

Theorem 7.1. Every square matrix is similar (over the splitting field of its characteristic polynomial) to a direct sum of Jordan blocks.

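Proof. By Theorem 5.1, a matrix is similar to a direct sum of matrices, each with a single eigenvalue:

$$R^{-1} A R = A_1 \oplus A_2 \oplus \cdots \oplus A_j$$

Let $\lambda_k$ be the eigenvalue of $A_k$. Then $A_k - \lambda_k I$ is nilpotent, so by Theorem 6.4 it is similar to a direct sum of matrices, each of which has ones just above its main diagonal and zeros elsewhere:

$$R_k^{-1} (A_k - \lambda_k I) R_k = N_{k,1} \oplus N_{k,2} \oplus \cdots$$

Adding $\lambda_k I$ to both sides shows that $R_k^{-1} A_k R_k$ is a direct sum of Jordan blocks with eigenvalue $\lambda_k$. Hence $A$ is similar to a direct sum of Jordan blocks. █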

The direct sum of Jordan blocks is called the Jordan canonical form , or the Jordan normal form , of a matrix. It can be shown that the Jordan canonical form of a matrix is unique, except for the order of the blocks. Moreover, it is essentially the same for similar matrices, so it can be considered a property of the associated linear transformation.

The basis which gives rise to the Jordan canonical form is not unique, even if rearrangements of the basis vectors are allowed. For example, an identity matrix is its own Jordan canonical form, regardless of the basis used.

## Variance

Variance is another measure of the spread of data in a data set. In fact, it is simply the standard deviation squared. The formula is:

$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$

We use variance and standard deviation only when dealing with 1-dimensional data; for two dimensions we use covariance. If you think of a stock price as 1-dimensional data moving with time on the x-axis, then we can compare how two stocks move together using covariance.

Covariance is always measured between 2 dimensions. If you calculate the covariance between one dimension and itself, you get the variance. So, if you had a 3-dimensional data set (x, y, z), then you could measure the covariance between the x and y dimensions, the x and z dimensions, and the y and z dimensions. Measuring the covariance between x and x, or y and y, or z and z would give you the variance of the x, y, and z dimensions.

When dealing with more than one variable, our variance (or, as you may call it, covariance) becomes:

$$\operatorname{cov}(x, y) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$$

Let’s take for example the movement of two stocks that I think may be correlated, HP and Dell, and try to find the covariance. Below is a table which shows their closing prices for one month. We will use this data to find whether the covariance is positive or negative. If the value is positive, it indicates that both dimensions are increasing together. If the value is negative, then as one dimension increases, the other decreases.

As you can see, the covariance is approximately 0.23, which is a positive number, so we can assume the two stocks are moving together.
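The covariance calculation can be sketched in a few lines. This is a minimal sketch, assuming NumPy; the two price series below are invented for illustration and are not the HP and Dell data, and `sample_cov` is a helper name introduced here. It uses the $n - 1$ denominator, which is also the default of `np.cov`.

```python
import numpy as np

# Hypothetical closing prices, invented for illustration only.
stock_a = np.array([10.0, 11.0, 12.0, 11.5, 13.0])
stock_b = np.array([20.0, 20.5, 21.5, 21.0, 22.5])


def sample_cov(x, y):
    """Sample covariance of two equal-length series (n - 1 denominator)."""
    return np.sum((x - x.mean()) * (y - y.mean())) / (len(x) - 1)


cov_ab = sample_cov(stock_a, stock_b)   # positive: the series rise together
var_a = sample_cov(stock_a, stock_a)    # covariance with itself = variance

print(cov_ab, var_a)
```

A positive `cov_ab` only says the two series tend to move in the same direction; its size depends on the units of the data.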

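### Covariance Matrix

If your dataset has more than 2 dimensions then it can have more than one covariance measurement. For example, if you have a dataset with 3 dimensions x, y and z, then the covariance matrix of this dataset is given by:

$$C = \begin{pmatrix} \operatorname{cov}(x,x) & \operatorname{cov}(x,y) & \operatorname{cov}(x,z) \\ \operatorname{cov}(y,x) & \operatorname{cov}(y,y) & \operatorname{cov}(y,z) \\ \operatorname{cov}(z,x) & \operatorname{cov}(z,y) & \operatorname{cov}(z,z) \end{pmatrix}$$

Let’s see what the covariance matrix looks like when we add another stock to the analysis. This time let’s pick VMware price data.

    148.800003, 144.419998, 144.589996, 144.580002, 148.070007, 149.449997, 151.380005, 153.059998, 152.839996, 153.759995, 153.210007, 151.940002, 152.25, 152.039993, 151.929993, 150.880005, 151.619995, 151.679993, 154.279999, 154.770004, 151.369995 ]

    # Pass the three series as one stacked array: np.cov's second and third
    # positional parameters are `y` and `rowvar`, not additional variables.
    print(np.cov([price_HP, price_DELL, price_VMWARE]))

The output is the $3 \times 3$ covariance matrix of the three price series.

### Eigenvectors and Eigenvalues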

Understanding these two properties is the most important part of understanding PCA. An eigenvector is a vector whose direction remains unchanged when a linear transformation is applied to it. In mathematical terms, multiplying a matrix by one of its eigenvectors gives the eigenvector itself multiplied by a scalar, the eigenvalue.

Now, let’s do some maths and find the eigenvector and eigenvalue of a sample matrix. As you can see in our above calculations, [1,1] is the eigenvector and 2 is the eigenvalue.

Now, let’s see how we can find the eigenpairs of a sample matrix $A$. Replacing the value of our matrix $A$ in the characteristic equation $\det(A - \lambda I) = 0$ we get:

With the eigenvalues found, let’s try to find the corresponding eigenvectors which satisfy $AX = \lambda X$.

In short, the eigenvectors of the covariance matrix give the directions onto which we project our dataset, and the eigenvector with the highest eigenvalue is the principal component of the data set.

### Principal Component Analysis (PCA)

1. Get the dataset.
2. Subtract from each column its mean. For PCA to work, we need to center the data points around the origin by subtracting the mean from each point.
3. Find the covariance matrix
4. Find the Eigenvectors and Eigenvalues of the covariance matrix
5. Choose the eigenvectors (principal components) with the highest eigenvalues and then multiply them with our data matrix.

We will attempt to project a 2-dimensional dataset onto 1 dimension.

Step 2: Subtract from each column its mean.

Step 3: Find the covariance matrix.

Step 4: Find the eigenvectors and eigenvalues of the covariance matrix.

Step 5: Choose the eigenvectors (principal components) with the highest eigenvalues and then multiply them with our data matrix.

Of the two eigenvalues 0.0490834 and 1.28402771, 1.28402771 is greater, so its corresponding eigenvector is the principal component. The above array is the lower-dimensional (1-dimensional) representation of our original 2D dataset.
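The five steps can be sketched end to end with NumPy. This is a minimal sketch under stated assumptions: the ten-point dataset below is assumed for illustration (it happens to yield covariance eigenvalues matching the two values quoted above), the variable names are introduced here, and the projection uses the mean-centered data, which is a common choice rather than something the walkthrough specifies.

```python
import numpy as np

# Step 1: an assumed 2-D dataset (rows are observations).
X = np.array([[2.5, 2.4], [0.5, 0.7], [2.2, 2.9], [1.9, 2.2], [3.1, 3.0],
              [2.3, 2.7], [2.0, 1.6], [1.0, 1.1], [1.5, 1.6], [1.1, 0.9]])

# Step 2: centre each column on its mean.
X_centered = X - X.mean(axis=0)

# Step 3: covariance matrix of the columns (n - 1 denominator).
C = np.cov(X_centered, rowvar=False)

# Step 4: eigh is meant for symmetric matrices and returns the
# eigenvalues in ascending order, eigenvectors as columns.
eigenvalues, eigenvectors = np.linalg.eigh(C)

# Step 5: project onto the eigenvector with the largest eigenvalue.
principal = eigenvectors[:, -1]
projected = X_centered @ principal   # 1-D representation, shape (10,)

print(eigenvalues)
print(projected.shape)
```

The sign of `principal` is arbitrary, so the projected values may come out negated relative to another implementation; their variance equals the largest eigenvalue either way.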

## 4.2: Properties of Eigenvalues and Eigenvectors - Mathematics

It’s now time to start solving systems of differential equations. We’ve seen that solutions to the system,

$$\vec x' = A \vec x$$

will be of the form

$$\vec x = \vec \eta\, e^{\lambda t}$$

where $\lambda$ and $\vec \eta$ are eigenvalues and eigenvectors of the matrix $A$. We will be working with $2 \times 2$ systems so this means that we are going to be looking for two solutions, $\vec x_1(t)$ and $\vec x_2(t)$, where the determinant of the matrix,

$$X = \begin{pmatrix} \vec x_1 & \vec x_2 \end{pmatrix}$$

is nonzero.

We are going to start by looking at the case where our two eigenvalues, $\lambda_1$ and $\lambda_2$, are real and distinct. In other words, they will be real, simple eigenvalues. Recall as well that the eigenvectors for simple eigenvalues are linearly independent. This means that the solutions we get from these will also be linearly independent. If the solutions are linearly independent the matrix $X$ must be nonsingular and hence these two solutions will be a fundamental set of solutions. The general solution in this case will then be,

$$\vec x(t) = c_1 e^{\lambda_1 t}\, \vec \eta^{(1)} + c_2 e^{\lambda_2 t}\, \vec \eta^{(2)}$$

Note that each of our examples will actually be broken into two examples. The first example will be solving the system and the second example will be sketching the phase portrait for the system. Phase portraits are not always taught in a differential equations course and so we’ll strip those out of the solution process so that if you haven’t covered them in your class you can ignore the phase portrait example for the system.

So, the first thing that we need to do is find the eigenvalues for the matrix.

Now let’s find the eigenvectors for each of these.

The eigenvector in this case is,

The eigenvector in this case is,

The general solution is then,

Now, we need to find the constants. To do this we simply need to apply the initial conditions.

All we need to do now is multiply the constants through and we then get two equations (one for each row) that we can solve for the constants. This gives,
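The whole procedure (eigenvalues, eigenvectors, general solution, constants from the initial condition) can be sketched numerically. This is a minimal sketch, assuming NumPy; the matrix `A` and the initial condition `x0` are assumptions chosen for illustration, and the function name `x` is introduced here.

```python
import numpy as np

# Illustrative assumptions: a 2x2 system x' = A x with x(0) = x0.
A = np.array([[1.0, 2.0],
              [3.0, 2.0]])
x0 = np.array([0.0, -4.0])

# Columns of eta are the eigenvectors; eigenvalues[i] goes with eta[:, i].
eigenvalues, eta = np.linalg.eig(A)

# Applying the initial condition gives the linear system
# c1 * eta1 + c2 * eta2 = x0 for the constants.
c = np.linalg.solve(eta, x0)


def x(t):
    """General solution c1*exp(l1*t)*eta1 + c2*exp(l2*t)*eta2 at time t."""
    return eta @ (c * np.exp(eigenvalues * t))


print(np.sort(eigenvalues))   # one negative and one positive eigenvalue
print(x(0.0))                 # reproduces the initial condition
```

NumPy normalizes eigenvectors to unit length, so the constants `c` differ from those obtained with hand-picked integer eigenvectors, while the solution `x(t)` is the same.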

Now, let’s take a look at the phase portrait for the system.

From the last example we know that the eigenvalues and eigenvectors for this system are,

It turns out that this is all the information that we will need to sketch the direction field. We will relate things back to our solution however so that we can see that things are going correctly.

We’ll start by sketching lines that follow the direction of the two eigenvectors. This gives,

Now, from the first example our general solution is

If we have $c_2 = 0$ then the solution is an exponential times a vector and all that the exponential does is affect the magnitude of the vector and the constant $c_1$ will affect both the sign and the magnitude of the vector. In other words, the trajectory in this case will be a straight line that is parallel to the vector, $\vec \eta^{(1)}$. Also notice that as $t$ increases the exponential will get smaller and smaller and hence the trajectory will be moving in towards the origin. If $c_1 > 0$ the trajectory will be in Quadrant II and if $c_1 < 0$ the trajectory will be in Quadrant IV.

So, the line in the graph above marked with $\vec \eta^{(1)}$ will be a sketch of the trajectory corresponding to $c_2 = 0$ and this trajectory will approach the origin as $t$ increases.

If we now turn things around and look at the solution corresponding to having $c_1 = 0$ we will have a trajectory that is parallel to $\vec \eta^{(2)}$. Also, since the exponential will increase as $t$ increases, in this case the trajectory will now move away from the origin as $t$ increases. We will denote this with arrows on the lines in the graph above.

Notice that we could have gotten this information without actually going to the solution. All we really need to do is look at the eigenvalues. Eigenvalues that are negative will correspond to solutions that will move towards the origin as $t$ increases in a direction that is parallel to its eigenvector. Likewise, eigenvalues that are positive move away from the origin as $t$ increases in a direction that will be parallel to its eigenvector.

If both constants are in the solution we will have a combination of these behaviors. For large negative $t$’s the solution will be dominated by the portion that has the negative eigenvalue since in these cases the exponent will be large and positive. Trajectories for large negative $t$’s will be parallel to $\vec \eta^{(1)}$ and moving in the same direction.

Solutions for large positive $t$’s will be dominated by the portion with the positive eigenvalue. Trajectories in this case will be parallel to $\vec \eta^{(2)}$ and moving in the same direction.

In general, it looks like trajectories will start “near” $\vec \eta^{(1)}$, move in towards the origin and then as they get closer to the origin they will start moving towards $\vec \eta^{(2)}$ and then continue up along this vector. Sketching some of these in will give the following phase portrait. Here is a sketch of this with the trajectories corresponding to the eigenvectors marked in blue.

In this case the equilibrium solution $(0, 0)$ is called a saddle point and is unstable. In this case unstable means that solutions move away from it as $t$ increases.
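The rule that the sign of each eigenvalue fixes the direction of travel along its eigenvector can be sketched as a small helper. This is a minimal sketch, assuming NumPy and assuming real eigenvalues; the function names `eigen_directions` and `is_saddle` and the two sample matrices are hypothetical, introduced only for illustration.

```python
import numpy as np


def eigen_directions(A):
    """For each real eigenvalue, say whether trajectories along its
    eigenvector move toward or away from the origin as t increases."""
    eigenvalues, eigenvectors = np.linalg.eig(A)
    if np.iscomplexobj(eigenvalues):
        raise ValueError("this sketch only handles real eigenvalues")
    report = []
    for lam, eta in zip(eigenvalues, eigenvectors.T):
        if lam < 0:
            direction = "toward origin"
        elif lam > 0:
            direction = "away from origin"
        else:
            direction = "stationary"
        report.append((float(lam), eta, direction))
    return report


def is_saddle(A):
    """True when one real eigenvalue is negative and the other positive."""
    eigenvalues = np.linalg.eigvals(A)
    return bool(eigenvalues.min() < 0 < eigenvalues.max())


# Hypothetical systems: one with eigenvalues of opposite sign,
# one with two negative eigenvalues.
A_mixed = np.array([[1.0, 2.0], [3.0, 2.0]])
A_negative = np.array([[-5.0, 1.0], [4.0, -2.0]])

for lam, eta, direction in eigen_directions(A_mixed):
    print(lam, direction)
print(is_saddle(A_mixed), is_saddle(A_negative))
```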

So, we’ve solved a system in matrix form, but remember that we started out without the systems in matrix form. Now let’s take a quick look at an example of a system that isn’t in matrix form initially.

We first need to convert this into matrix form. This is easy enough. Here is the matrix form of the system.

This is just the system from the first example and so we’ve already got the solution to this system. Here it is.

Now, since we want the solution to the system not in matrix form, let’s go one step further here. Let’s multiply the constants and exponentials into the vectors and then add up the two vectors.

So, the solution to the system is then,

Let’s work another example.

So, the first thing that we need to do is find the eigenvalues for the matrix.

Now let’s find the eigenvectors for each of these.

The eigenvector in this case is,

The eigenvector in this case is,

The general solution is then,

Now, we need to find the constants. To do this we simply need to apply the initial conditions.

Now solve the system for the constants.
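Applying the initial condition at \(t = 0\) reduces to a \(2 \times 2\) linear system for the constants, since every exponential equals one there. Here is a small sketch of that step using Cramer's rule with exact fractions. The eigenvectors and initial condition are hypothetical placeholders, not the values from this example, and `solve_constants` is a name introduced here.

```python
from fractions import Fraction

def solve_constants(eta1, eta2, x0):
    """Solve c1*eta1 + c2*eta2 = x0 for (c1, c2) by Cramer's rule."""
    det = eta1[0] * eta2[1] - eta2[0] * eta1[1]
    if det == 0:
        raise ValueError("eigenvectors are not linearly independent")
    c1 = Fraction(x0[0] * eta2[1] - eta2[0] * x0[1], det)
    c2 = Fraction(eta1[0] * x0[1] - x0[0] * eta1[1], det)
    return c1, c2

# Hypothetical eigenvectors and initial condition x(0) = x0.
eta1, eta2, x0 = (-1, 1), (2, 3), (0, -4)
c1, c2 = solve_constants(eta1, eta2, x0)

# Substituting back should reproduce the initial condition exactly.
check = (c1 * eta1[0] + c2 * eta2[0], c1 * eta1[1] + c2 * eta2[1])
print(c1, c2, check)
```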

Now let’s find the phase portrait for this system.

From the last example we know that the eigenvalues and eigenvectors for this system are,

This one is a little different from the first one. However, it starts in the same way. We’ll first sketch the trajectories corresponding to the eigenvectors. Notice as well that both of the eigenvalues are negative and so trajectories for these will move in towards the origin as \(t\) increases. When we sketch the trajectories we’ll add in arrows to denote the direction they take as \(t\) increases. Here is the sketch of these trajectories.

Now, here is where the slight difference from the first phase portrait comes up. All of the trajectories will move in towards the origin as \(t\) increases since both of the eigenvalues are negative. The issue that we need to decide upon is just how they do this. This is actually easier than it might appear to be at first.

The second eigenvalue is larger in magnitude (more negative) than the first. For large and positive \(t\)’s this means that the solution for this eigenvalue will be smaller than the solution for the first eigenvalue. Therefore, as \(t\) increases the trajectory will move in towards the origin and do so parallel to \(\vec{\eta}^{(1)}\). Likewise, since the second eigenvalue is larger in magnitude than the first, its solution will dominate for large and negative \(t\)’s. Therefore, as we decrease \(t\) the trajectory will move away from the origin and do so parallel to \(\vec{\eta}^{(2)}\).

Adding in some trajectories gives the following sketch. In these cases we call the equilibrium solution \(\left(0,0\right)\) a node and it is asymptotically stable. Equilibrium solutions are asymptotically stable if all the trajectories move in towards it as \(t\) increases.

Note that nodes can also be unstable. In the last example if both of the eigenvalues had been positive all the trajectories would have moved away from the origin and in this case the equilibrium solution would have been unstable.
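The sign rules described so far can be collected into a short helper. This is only a sketch covering the case discussed here (two real, distinct, nonzero eigenvalues); the function name `classify_equilibrium` is introduced for illustration.

```python
def classify_equilibrium(lam1, lam2):
    """Classify the origin of x' = A x from two real, distinct, nonzero eigenvalues."""
    if lam1 == lam2 or lam1 == 0 or lam2 == 0:
        raise ValueError("sketch only covers real, distinct, nonzero eigenvalues")
    if lam1 * lam2 < 0:
        # One solution moves in, the other moves out.
        return "saddle point (unstable)"
    if lam1 < 0:
        # Both negative: every trajectory moves in towards the origin.
        return "node (asymptotically stable)"
    # Both positive: every trajectory moves away from the origin.
    return "node (unstable)"

print(classify_equilibrium(-1, 4))
print(classify_equilibrium(-1, -6))
print(classify_equilibrium(2, 5))
```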

Before moving on to the next section we need to do one more example. When we first started talking about systems it was mentioned that we can convert a higher order differential equation into a system. We need to do an example like this so we can see how to solve higher order differential equations using systems.

So, we first need to convert this into a system. Here’s the change of variables,

Now we need to find the eigenvalues for the matrix.

Now let’s find the eigenvectors.

The eigenvector in this case is,

The eigenvector in this case is,

The general solution is then,

Apply the initial condition.

This gives the system of equations that we can solve for the constants.

The actual solution to the system is then,

We can see that the solution to the original differential equation is just the top row of the solution to the matrix system. The solution to the original differential equation is then,

Notice that as a check, in this case, the bottom row should be the derivative of the top row.
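To make the conversion and the final check concrete, here is a sketch for a second order equation \(y'' + py' + qy = 0\). The coefficients and initial conditions are hypothetical and are not the ones from this example. With \(x_1 = y\) and \(x_2 = y'\), the matrix is \(\begin{bmatrix} 0 & 1 \\ -q & -p \end{bmatrix}\), and an eigenvector for an eigenvalue \(r\) can be taken as \((1, r)\).

```python
import math

# Hypothetical equation: y'' + 3y' - 10y = 0 with y(0) = 3, y'(0) = -1.
p, q = 3.0, -10.0
y0, dy0 = 3.0, -1.0

# With x1 = y and x2 = y', the system is x' = [[0, 1], [-q, -p]] x.
# Its characteristic equation is r^2 + p*r + q = 0.
disc = math.sqrt(p * p - 4.0 * q)   # assumes real, distinct roots
r1, r2 = (-p + disc) / 2.0, (-p - disc) / 2.0

# An eigenvector for eigenvalue r is (1, r), so the initial condition gives
# c1 + c2 = y0 and c1*r1 + c2*r2 = dy0.
c1 = (dy0 - r2 * y0) / (r1 - r2)
c2 = (r1 * y0 - dy0) / (r1 - r2)

def x(t):
    """Return (y(t), y'(t)), the two rows of the system's solution."""
    e1, e2 = math.exp(r1 * t), math.exp(r2 * t)
    return (c1 * e1 + c2 * e2, c1 * r1 * e1 + c2 * r2 * e2)

# Check: the bottom row should match a numerical derivative of the top row.
t, h = 0.3, 1e-6
numeric_derivative = (x(t + h)[0] - x(t - h)[0]) / (2 * h)
print(r1, r2, c1, c2)
print(numeric_derivative, x(t)[1])
```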

## Lecture Notes for Math 3410, with Computational Examples

For inner product spaces, the above is taken as the definition of what it means for an operator to be symmetric.

###### Exercise 4.2.1 .

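Prove that if \(\mathbf{x}\cdot(A\mathbf{y})=(A\mathbf{x})\cdot \mathbf{y}\) for any \(\mathbf{x},\mathbf{y}\in\mathbb{R}^n\), then \(A\) is symmetric.

Take \(\mathbf{x}=\mathbf{e}_i\) and \(\mathbf{y}=\mathbf{e}_j\), where \(\{\mathbf{e}_1,\ldots,\mathbf{e}_n\}\) is the standard basis for \(\mathbb{R}^n\). Then with \(A = [a_{ij}]\) we have

\[a_{ij} = \mathbf{e}_i\cdot(A\mathbf{e}_j) = (A\mathbf{e}_i)\cdot\mathbf{e}_j = a_{ji}.\]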

A useful property of symmetric matrices, mentioned earlier, is that eigenvectors corresponding to distinct eigenvalues are orthogonal.

###### Theorem 4.2.2 .

If \(A\) is a symmetric matrix, then eigenvectors corresponding to distinct eigenvalues are orthogonal.

###### Proof .

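To see this, suppose \(A\) is symmetric, and that we have

\[A\mathbf{x}_1=\lambda_1\mathbf{x}_1 \quad\text{and}\quad A\mathbf{x}_2=\lambda_2\mathbf{x}_2,\]

with \(\mathbf{x}_1\neq\mathbf{0}\), \(\mathbf{x}_2\neq \mathbf{0}\), and \(\lambda_1\neq \lambda_2\). We then have, since \(A\) is symmetric, and using the result above,

\[\lambda_1(\mathbf{x}_1\cdot\mathbf{x}_2) = (A\mathbf{x}_1)\cdot\mathbf{x}_2 = \mathbf{x}_1\cdot(A\mathbf{x}_2) = \lambda_2(\mathbf{x}_1\cdot\mathbf{x}_2).\]

It follows that \((\lambda_1-\lambda_2)(\mathbf{x}_1\cdot \mathbf{x}_2)=0\), and since \(\lambda_1\neq \lambda_2\), we must have \(\mathbf{x}_1\cdot \mathbf{x}_2=0\).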
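The identity at the heart of this proof can be tried on a small case. The sketch below uses a hypothetical \(2 \times 2\) symmetric matrix with eigenvalues 1 and 3 (not one from these notes), and the helpers `matvec` and `dot` are introduced here.

```python
# Hypothetical symmetric matrix with distinct eigenvalues 1 and 3.
A = [[2, 1],
     [1, 2]]

def matvec(M, v):
    """Matrix-vector product M v."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def dot(u, v):
    """Standard dot product."""
    return sum(a * b for a, b in zip(u, v))

x1, lam1 = [1, -1], 1
x2, lam2 = [1, 1], 3

# Confirm both are eigenpairs of A.
assert matvec(A, x1) == [lam1 * c for c in x1]
assert matvec(A, x2) == [lam2 * c for c in x2]

lhs = dot(matvec(A, x1), x2)   # (A x1) . x2 = lam1 * (x1 . x2)
rhs = dot(x1, matvec(A, x2))   # x1 . (A x2) = lam2 * (x1 . x2)

# Symmetry forces lhs == rhs; with lam1 != lam2 that leaves x1 . x2 = 0.
print(lhs, rhs, dot(x1, x2))
```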

The procedure for diagonalizing a matrix is as follows: assuming that \(\dim E_\lambda(A)\) is equal to the multiplicity of \(\lambda\) for each distinct eigenvalue \(\lambda\), we find a basis for \(E_\lambda(A)\). The union of the bases for each eigenspace is then a basis of eigenvectors for \(\mathbb{R}^n\), and the matrix \(P\) whose columns are those eigenvectors will satisfy \(P^{-1}AP = D\), where \(D\) is a diagonal matrix whose diagonal entries are the eigenvalues of \(A\).

If \(A\) is symmetric, we know that eigenvectors from different eigenspaces will be orthogonal to each other. If we further choose an orthogonal basis of eigenvectors for each eigenspace (which is possible via the Gram-Schmidt procedure), then we can construct an orthogonal basis of eigenvectors for \(\mathbb{R}^n\). Furthermore, if we normalize each vector, then we'll have an orthonormal basis. The matrix \(P\) whose columns consist of these orthonormal basis vectors has a name.

###### Definition 4.2.3 .

A matrix \(P\) is called orthogonal if \(P^T = P^{-1}\).

###### Theorem 4.2.4 .

A matrix \(P\) is orthogonal if and only if the columns of \(P\) form an orthonormal basis for \(\mathbb{R}^n\).

A fun fact is that if the columns of \(P\) are orthonormal, then so are the rows. But this is not true if we ask for the columns to be merely orthogonal. For example, the columns of \(A = \begin{bmatrix} 1 & 0 & 5\\ -2 & 1 & 2\\ 1 & 2 & -1\end{bmatrix}\) are orthogonal, but the rows certainly are not. But if we normalize the columns, we get

\[P = \begin{bmatrix} 1/\sqrt{6} & 0 & 5/\sqrt{30}\\ -2/\sqrt{6} & 1/\sqrt{5} & 2/\sqrt{30}\\ 1/\sqrt{6} & 2/\sqrt{5} & -1/\sqrt{30}\end{bmatrix},\]

which, as you can confirm, is an orthogonal matrix.
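One way to carry out that confirmation is sketched below in plain Python. It checks that the columns of the matrix \(A\) from this paragraph are orthogonal, that its rows are not, and that after normalizing the columns the rows become orthonormal as well. The helper names are introduced here for illustration.

```python
import math

A = [[1, 0, 5],
     [-2, 1, 2],
     [1, 2, -1]]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def columns(M):
    """Return the columns of M as a list of lists (i.e. the transpose)."""
    return [list(col) for col in zip(*M)]

def pairwise_orthogonal(vectors, tol=1e-12):
    return all(abs(dot(vectors[i], vectors[j])) < tol
               for i in range(len(vectors))
               for j in range(i + 1, len(vectors)))

cols_ok = pairwise_orthogonal(columns(A))   # columns of A
rows_ok = pairwise_orthogonal(A)            # rows of A

# Normalize each column of A, then reassemble the matrix row by row.
unit_cols = [[c / math.sqrt(dot(col, col)) for c in col] for col in columns(A)]
P = columns(unit_cols)

# The rows of P should now be orthonormal too.
P_rows_ok = pairwise_orthogonal(P) and all(abs(dot(r, r) - 1) < 1e-12 for r in P)
print(cols_ok, rows_ok, P_rows_ok)
```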

###### Definition 4.2.5 .

An \(n\times n\) matrix \(A\) is said to be orthogonally diagonalizable if there exists an orthogonal matrix \(P\) such that \(P^TAP\) is diagonal.

The above definition leads to the following result, also known as the Principal Axes Theorem.

###### Theorem 4.2.6 . Real Spectral Theorem.

The following are equivalent for a real \(n\times n\) matrix \(A\):

1. \(A\) is symmetric.

2. There is an orthonormal basis for \(\mathbb{R}^n\) consisting of eigenvectors of \(A\).

3. \(A\) is orthogonally diagonalizable.

###### Exercise 4.2.7 .

Determine the eigenvalues of \(A=\begin{bmatrix} 5 & -2 & -4\\ -2 & 8 & -2\\ -4 & -2 & 5\end{bmatrix}\), and find an orthogonal matrix \(P\) such that \(P^TAP\) is diagonal.

We'll solve this problem with the help of the computer.

We get \(c_A(x)=x(x-9)^2\), so our eigenvalues are \(0\) and \(9\). For \(0\) we have \(E_0(A) = \operatorname{null}(A)\):

For \(9\) we have \(E_9(A) = \operatorname{null}(A-9I)\).
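As a cross-check that does not depend on a computer algebra system, the characteristic polynomial of a \(3\times 3\) matrix can be assembled from its trace, its principal \(2\times 2\) minors, and its determinant. The sketch below is not the computation used in these notes; the function names are introduced here, and the two eigenvectors tested are candidates picked by hand.

```python
A = [[5, -2, -4],
     [-2, 8, -2],
     [-4, -2, 5]]

def det2(a, b, c, d):
    """Determinant of [[a, b], [c, d]]."""
    return a * d - b * c

def char_poly_3x3(M):
    """Coefficients [1, c2, c1, c0] of det(xI - M) = x^3 + c2 x^2 + c1 x + c0."""
    trace = M[0][0] + M[1][1] + M[2][2]
    minors = (det2(M[0][0], M[0][1], M[1][0], M[1][1])
              + det2(M[0][0], M[0][2], M[2][0], M[2][2])
              + det2(M[1][1], M[1][2], M[2][1], M[2][2]))
    det = (M[0][0] * det2(M[1][1], M[1][2], M[2][1], M[2][2])
           - M[0][1] * det2(M[1][0], M[1][2], M[2][0], M[2][2])
           + M[0][2] * det2(M[1][0], M[1][1], M[2][0], M[2][1]))
    return [1, -trace, minors, -det]

def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(3)) for i in range(3)]

coeffs = char_poly_3x3(A)          # x^3 - 18x^2 + 81x = x(x - 9)^2
in_E0 = matvec(A, [2, 1, 2])       # should be the zero vector
in_E9 = matvec(A, [-1, 0, 1])      # should be 9 times the input
print(coeffs, in_E0, in_E9)
```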

The approach above is useful as we're trying to remind ourselves how eigenvalues and eigenvectors are defined and computed. Eventually we might want to be more efficient. Fortunately, there's a command for that.

Note that the output above lists each eigenvalue, followed by its multiplicity, and then the associated eigenvectors.

This gives us a basis for \(\mathbb{R}^3\) consisting of eigenvectors of \(A\), but we want an orthogonal basis. Note that the eigenvector corresponding to \(\lambda = 0\) is orthogonal to both of the eigenvectors corresponding to \(\lambda =9\). But these eigenvectors are not orthogonal to each other. To get an orthogonal basis for \(E_9(A)\), we apply the Gram-Schmidt algorithm.

This gives us an orthogonal basis of eigenvectors. Scaling to clear fractions, we have

From here, we need to normalize each vector to get the matrix \(P\). But we might not like that the last vector has norm \(\sqrt{45}\). One option to consider is to apply Gram-Schmidt with the vectors in the other order.
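The effect of the ordering can be seen with a small exact-arithmetic version of Gram-Schmidt. This is a sketch under one assumption: the starting basis for \(E_9(A)\) below was chosen by hand (any basis of that eigenspace would do) and may differ from the one the computer returned.

```python
from fractions import Fraction

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(vectors):
    """Orthogonalize (without normalizing) a list of independent vectors."""
    basis = []
    for v in vectors:
        w = [Fraction(c) for c in v]
        for b in basis:
            # Subtract the projection of w onto each earlier basis vector.
            coeff = dot(w, b) / dot(b, b)
            w = [wi - coeff * bi for wi, bi in zip(w, b)]
        basis.append(w)
    return basis

# Assumed starting basis for E_9(A); both vectors satisfy A v = 9 v.
u, w = [-1, 2, 0], [-1, 0, 1]

first_order = gram_schmidt([u, w])    # second vector is (-4/5, -2/5, 1)
second_order = gram_schmidt([w, u])   # second vector is (-1/2, 2, -1/2)

# Clearing fractions gives (-4, -2, 5) with norm sqrt(45) in the first case
# and a multiple of (1, -4, 1) with norm sqrt(18) in the second.
print(first_order[1], second_order[1])
```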

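That gives us the (slightly nicer) basis

\[\left\{\begin{bmatrix}2\\1\\2\end{bmatrix}, \begin{bmatrix}-1\\0\\1\end{bmatrix}, \begin{bmatrix}1\\-4\\1\end{bmatrix}\right\}.\]

The corresponding orthonormal basis is

\[\left\{\frac{1}{3}\begin{bmatrix}2\\1\\2\end{bmatrix}, \frac{1}{\sqrt{2}}\begin{bmatrix}-1\\0\\1\end{bmatrix}, \frac{1}{\sqrt{18}}\begin{bmatrix}1\\-4\\1\end{bmatrix}\right\}.\]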

This gives us the matrix \(P=\begin{bmatrix} 2/3 & -1/\sqrt{2} & 1/\sqrt{18}\\ 1/3 & 0 & -4/\sqrt{18}\\ 2/3 & 1/\sqrt{2} & 1/\sqrt{18}\end{bmatrix}\). Let's confirm that \(P\) is orthogonal.

Since \(PP^T=I_3\), we can conclude that \(P^T=P^{-1}\), so \(P\) is orthogonal, as required. Finally, we diagonalize \(A\).
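Both checks can also be done in floating point with only the standard library. The sketch below is not the cell from these notes; it rebuilds \(P\) from the entries given above and compares \(PP^T\) with the identity and \(P^TAP\) with \(\operatorname{diag}(0, 9, 9)\) up to a small tolerance. The helper names are introduced here.

```python
import math

A = [[5, -2, -4],
     [-2, 8, -2],
     [-4, -2, 5]]

s2, s18 = math.sqrt(2), math.sqrt(18)
P = [[2/3, -1/s2, 1/s18],
     [1/3, 0.0, -4/s18],
     [2/3, 1/s2, 1/s18]]

def transpose(M):
    return [list(row) for row in zip(*M)]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def close(X, Y, tol=1e-12):
    """Entrywise comparison of two matrices up to a tolerance."""
    return all(abs(x - y) < tol for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
D = matmul(matmul(transpose(P), A), P)

is_orthogonal = close(matmul(P, transpose(P)), I3)
is_diagonalized = close(D, [[0, 0, 0], [0, 9, 0], [0, 0, 9]])
print(is_orthogonal, is_diagonalized)
```

The diagonal entries of \(P^TAP\) appear in the same order as the eigenvectors in the columns of \(P\), here \(0, 9, 9\).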

Incidentally, the SymPy library for Python does have a diagonalization routine; however, it does not do orthogonal diagonalization by default. Here is what it provides for our matrix \(A\).

## Eigenvalues and Eigenvectors

Many problems present themselves in terms of an eigenvalue problem:

A·v=λ·v

In this equation A is an n-by-n matrix, v is a non-zero n-by-1 vector and λ is a scalar (which may be either real or complex). Any value of λ for which this equation has a solution is known as an eigenvalue of the matrix A. It is sometimes also called the characteristic value. The vector, v, which corresponds to this value is called an eigenvector. The eigenvalue problem can be rewritten as

A·v-λ·v=0
A·v-λ·I·v=0
(A-λ·I)·v=0

If v is non-zero, this equation will only have a solution if

|A-λ·I|=0

This equation is called the characteristic equation of A, and is an nth-order polynomial in λ with n roots. These roots are called the eigenvalues of A. We will only deal with the case of n distinct roots, although in general roots may be repeated. For each eigenvalue there will be an eigenvector for which the eigenvalue equation is true. This is most easily demonstrated by example.

##### Example: Find Eigenvalues and Eigenvectors of a 2x2 Matrix

then the characteristic equation is

and the two eigenvalues are

All that's left is to find the two eigenvectors. Let's find the eigenvector, v1, associated with the eigenvalue, λ1=-1, first.

so clearly from the top row of the equations we get

Note that if we took the second row we would get

In either case we find that the first eigenvector is any 2 element column vector in which the two elements have equal magnitude and opposite sign.

where k1 is an arbitrary constant. Note that we didn't have to use +1 and -1; we could have used any two quantities of equal magnitude and opposite sign.

Going through the same procedure for the second eigenvalue:

Again, the choice of +1 and -2 for the eigenvector was arbitrary; only their ratio is important. This is demonstrated in the MatLab code below.

#### Using MatLab

The eigenvalues are the diagonal of the "d" matrix

The eigenvectors are the columns of the "v" matrix.

Note that MatLab chose different values for the eigenvectors than the ones we chose. However, the ratio of v1,1 to v1,2 and the ratio of v2,1 to v2,2 are the same as in our solution; the chosen eigenvectors of a system are not unique, but the ratio of their elements is. (MatLab chooses the values such that the sum of the squares of the elements of each eigenvector equals unity.)
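The same point about ratios and unit-length scaling can be illustrated without MatLab. The Python sketch below is not the MatLab session referred to above. It assumes a matrix, `[[0, 1], [-2, -3]]`, chosen only because it is consistent with the eigenvalue -1 and the eigenvector ratios 1:-1 and 1:-2 quoted in the text; the function `eig2x2` is introduced here and handles only real, distinct eigenvalues.

```python
import math

def eig2x2(A):
    """Eigenvalues and unit eigenvectors of a 2x2 matrix with real, distinct eigenvalues."""
    (a, b), (c, d) = A
    trace, det = a + d, a * d - b * c
    disc = trace * trace - 4 * det
    if disc <= 0:
        raise ValueError("sketch assumes real, distinct eigenvalues")
    root = math.sqrt(disc)
    pairs = []
    for lam in ((trace + root) / 2, (trace - root) / 2):
        # Solve (A - lam*I) v = 0: take v orthogonal to a nonzero row of A - lam*I.
        v = (b, lam - a) if (b != 0 or lam != a) else (lam - d, c)
        norm = math.hypot(*v)
        # Scale so the squares of the elements sum to one.
        pairs.append((lam, (v[0] / norm, v[1] / norm)))
    return pairs

# Assumed matrix for illustration (see the note above).
A = [[0, 1], [-2, -3]]
pairs = eig2x2(A)
for lam, v in pairs:
    print(lam, v, v[1] / v[0])
```

The printed vectors differ from hand-picked ones such as (1, -1) and (1, -2) only by a scale factor, so the ratio of the elements is unchanged.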

## Finding eigenvalues via the characteristic polynomial

How do we find eigenvalues and eigenvectors?

Suppose that \(x\) is an eigenvector of \(A\) with eigenvalue \(\lambda\). Then \(Ax = \lambda x\) can be written as \[Ax - (\lambda I_n) x = 0.\] (Recall that \(I_n\) is the \(n \times n\) identity matrix.) Rewriting the last equality gives \[(A - \lambda I_n) x = 0.\] Hence, \(x\) is a nonzero vector in the nullspace of \(A - \lambda I_n\). For the example above, we have \(A - 2I_n = \begin{bmatrix} -1 & 2 \\ 1 & -2\end{bmatrix}\), the nullspace of which contains the nonzero vector \(x = \begin{bmatrix} 2 \\ 1\end{bmatrix}\).

In other words, for any eigenvalue \(\lambda\) of \(A\), the matrix \(A - \lambda I_n\) is singular, implying that its determinant is zero. Note that \(\det(A-\lambda I_n)\) is a polynomial in \(\lambda\) of degree \(n\) and is called the characteristic polynomial of \(A\), denoted by \(p_A\). (Some books define the characteristic polynomial of \(A\) as \(\det(\lambda I_n -A)\) instead. Since \(\lambda I_n -A\) is singular iff \(A - \lambda I_n\) is, either definition will give the same roots.)

For the example above, \(p_A = \det(A-\lambda I_2) = \left|\begin{array}{cc} 1 -\lambda & 2 \\ 1 & -\lambda\end{array}\right| = (1 -\lambda)(-\lambda) - 2\cdot 1 = \lambda^2 - \lambda - 2 = (\lambda - 2)(\lambda + 1)\). Note that \(-1\) is another root of this polynomial.

Is \(-1\) an eigenvalue of \(A\)? The answer is “yes”. To see this, we have to find a nonzero vector \(x\) such that \((A - (-1)I_2)x = 0\). Now, \(A - (-1)I_2 = \begin{bmatrix} 2 & 2 \\ 1 & 1\end{bmatrix}\). By inspection, one sees that \(\begin{bmatrix} 1 \\ -1\end{bmatrix}\) is a nonzero vector in the nullspace of \(A-(-1)I_2\). Hence, \(\begin{bmatrix} 1 \\ -1\end{bmatrix}\) is an eigenvector of \(A\) with eigenvalue \(-1\).

There are no other eigenvalues of \(A\) because we have found all the roots of the polynomial. (A single-variable quadratic polynomial can have no more than two distinct roots.)

In general, every root of the characteristic polynomial is an eigenvalue. If \(\lambda\) is such that \(\det(A-\lambda I_n) = 0\), then \(A- \lambda I_n\) is singular and, therefore, its nullspace has a nonzero vector. Such a vector by definition gives an eigenvector.
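The worked example can be replayed in a few lines of Python. This sketch uses the matrix whose determinant is expanded above, with rows (1, 2) and (1, 0). The integer scan for roots is a shortcut that only suits this small example, and the names `p_A` and `residual` are introduced here.

```python
A = [[1, 2],
     [1, 0]]

def p_A(lam):
    """Characteristic polynomial det(A - lam*I) of the 2x2 matrix A."""
    return (A[0][0] - lam) * (A[1][1] - lam) - A[0][1] * A[1][0]

def residual(lam, x):
    """Compute (A - lam*I) x; the zero vector means x is an eigenvector for lam."""
    return [(A[0][0] - lam) * x[0] + A[0][1] * x[1],
            A[1][0] * x[0] + (A[1][1] - lam) * x[1]]

# Scan a small range of integers for roots (enough for this example only).
roots = [lam for lam in range(-5, 6) if p_A(lam) == 0]

# The eigenvectors found by inspection in the text.
checks = {2: residual(2, [2, 1]), -1: residual(-1, [1, -1])}
print(roots, checks)
```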

## Mathematics and Education

Keywords: max-plus algebra, earliest starting times, project network, interval.

### ITERATIVE SYSTEM OF FUZZY NUMBER MAX-PLUS LINEAR EQUATIONS

Proceedings of the International Conference on Mathematics and Natural Science 2008. FMIPA ITB Bandung, 28-30 October 2008.

M. Andy Rudhito, Sri Wahyuni, Ari Suparwanto, F. Susilo

vectors. Moreover, the solution is unique if the square matrix of the systems is definite.

Keywords: Max-Plus Algebra, System of Linear Equations, Fuzzy Number

### 4. Eigenvalues and Eigenvectors of Matrices over Fuzzy Number Max-Plus Algebra

In this section we assume that readers know some basic concepts of fuzzy sets and fuzzy numbers. Further details can be found in Zimmermann, H.J. (1991), Lee, K.H. (2005) and Susilo, F. (2006).

### 3. Eigenvalues and Eigenvectors of Matrices over Interval Max-Plus Algebra

In this section we will review some basic concepts of interval max-plus algebra, matrices over interval max-plus algebra, and the existence and uniqueness of interval max-plus eigenvalues. Further details can be found in Litvinov, L.G., et al. (2001) and Rudhito, A. et al. (2008a, 2008b).

### 2. EIGENVALUES AND EIGENVECTORS OF MATRICES OVER MAX-PLUS ALGEBRA

In this section we will review some basic concepts of max-plus algebra, matrices over max-plus algebra and their relation to graph theory, and the existence and uniqueness of max-plus eigenvalues. Further details can be found in Baccelli et al. (1992) and Rudhito, A. (2003).

### EIGENVALUES AND EIGENVECTORS OF MATRICES OVER FUZZY NUMBER MAX-PLUS ALGEBRA (Introduction)

The max-plus algebra can be used to model and analyze networks such as project scheduling, production systems, and queueing networks (Baccelli, et al. (2001), Rudhito, A. (2003), Krivulin, N.K. (2001)). A network modelled with the max-plus algebra approach usually leads to a system of max-plus linear equations, which can be written as a matrix equation. The periodic properties of the network dynamics can be analyzed through the max-plus eigenvalues and eigenvectors of the matrices in the model.

Recently, fuzzy network modelling has been developed. In this paper, a fuzzy network refers to a network whose activity times are fuzzy numbers. Fuzzy scheduling is discussed in Chanas, S., Zielinski, P. (2001), and Soltoni, A., Haji, R. (2007). Fuzzy queueing networks are discussed in Lüthi, J., Haring, G. (1997), and Pardoa & Fuente (2007).

Following the approach of modelling and analyzing networks with max-plus algebra, the periodic properties of fuzzy network dynamics can be analyzed through the eigenvalues and eigenvectors of the matrices over fuzzy number max-plus algebra that appear in the model. For this reason, this paper discusses eigenvalues and eigenvectors of matrices over fuzzy number max-plus algebra.

Before we proceed to the main discussion, we will review the notions of eigenvalues and eigenvectors of matrices over max-plus algebra, and of eigenvalues and eigenvectors of matrices over interval max-plus algebra.

Outline article (next sections/posting):

2. Eigenvalues and Eigenvectors of Matrices over Max-plus algebra

3. Eigenvalues and Eigenvectors of Matrices over Interval Max-Plus Algebra

4. Eigenvalues and Eigenvectors of Matrices over Fuzzy Number Max-Plus Algebra