PART B
Linear Algebra. Vector Calculus

CHAPTER 7   Linear Algebra: Matrices, Vectors, Determinants. Linear Systems
CHAPTER 8   Linear Algebra: Matrix Eigenvalue Problems
CHAPTER 9   Vector Differential Calculus. Grad, Div, Curl
CHAPTER 10  Vector Integral Calculus. Integral Theorems

Linear algebra in Chaps. 7 and 8 consists of the theory and application of vectors and matrices, mainly related to linear systems of equations, eigenvalue problems, and linear transformations. Linear algebra is of growing importance in engineering research and teaching because it forms a foundation of numeric methods (see Chaps. 20-22), and its main instruments, matrices, can hold enormous amounts of data (think of a net of millions of telephone connections) in a form readily accessible by the computer.

Linear analysis in Chaps. 9 and 10, usually called vector calculus, extends differentiation of functions of one variable to functions of several variables; this includes the vector differential operations grad, div, and curl. It also generalizes integration to integrals over curves, surfaces, and solids, with transformations of these integrals into one another by the basic theorems of Gauss, Green, and Stokes (Chap. 10).

Software suitable for linear algebra (Lapack, Maple, Mathematica, Matlab) can be found in the list at the opening of Part E of the book if needed.

Numeric linear algebra (Chap. 20) can be studied directly after Chap. 7 or 8 because Chap. 20 is independent of the other chapters in Part E on numerics.

CHAPTER 7
Linear Algebra: Matrices, Vectors, Determinants. Linear Systems

This is the first of two chapters on linear algebra, which concerns mainly systems of linear equations and linear transformations (to be discussed in this chapter) and eigenvalue problems (to follow in Chap. 8).

Systems of linear equations, briefly called linear systems, arise in electrical networks, mechanical frameworks, economic models, optimization problems, numerics for differential equations (as we shall see in Chaps. 21-23), and so on.

As main tools, linear algebra uses matrices (rectangular arrays of numbers or functions) and vectors. Calculations with matrices handle matrices as single objects, denote them by single letters, and calculate with them in a very compact form, almost as with numbers, so that matrix calculations constitute a powerful "mathematical shorthand."

Calculations with matrices and vectors are defined and explained in Secs. 7.1-7.2. Sections 7.3-7.8 center around linear systems, with a thorough discussion of Gauss elimination, the role of rank, the existence and uniqueness problem for solutions (Sec. 7.5), and matrix inversion. This also includes determinants (Cramer's rule) in Sec. 7.6 (for quick reference) and Sec. 7.7. Applications are considered throughout this chapter. The last section (Sec. 7.9), on vector spaces, inner product spaces, and linear transformations, is more abstract. Eigenvalue problems follow in Chap. 8.

COMMENT. Numeric linear algebra (Secs. 20.1-20.5) can be studied immediately after this chapter.

Prerequisite: None.
Sections that may be omitted in a shorter course: 7.5, 7.9.
References and Answers to Problems: App. 1 Part B, and App. 2.

7.1 Matrices, Vectors: Addition and Scalar Multiplication

In this section and the next one we introduce the basic concepts and rules of matrix and vector algebra. The main application to linear systems (systems of linear equations) begins in Sec. 7.3.
7.1 Matrices, Vectors: Addition and Scalar Multiplication A matrix is a rectangular affay of numbers (or functions) enclosed in brackets. These numbers (or functions) are called the entries (or sometimes the elements) of the matrix. For example, t,; -,o, ,:] 'o','f ' are matrices. The first matrix has two rows (horizontal lines of entries) and three columns (vertical lines). The second and third matrices are square matrices, that is, each has as many rows as columns (3 and 2, respectively). The entries of the second matrix have two indices giving the location of the entry. The first index is the number of the row and the second is the number of the column in which the entry stands. Thus, a2g (read a two three) is in Row 2 and Column 3, etc. This notation is standard, regardless of whether a matrix is square or not. Matrices having just a single row or column are called vectors. Thus the fourth matrix in (1) has just one row and is called a row vector. The last matrix in (1) has just one column and is called a column vector. We shall see that matrices are practical in various applications for storing and processing data. As a first illustration let us consider two simple but typical examples. E XA M P LE l Linear Systems, a Maior Application of Matrices In a system of linear equations, briefly called a linear system, such as 4x; -| 6x2 -| 9xg: 6 6xt - Zxg:29 5x1-8x2* x3:10 the coefficients of the unknowns xy x2, x3 zíe the entries of the coefficient matrix, call it A, is obtained by augmenting A by the right sides of the linear system and is called the augmented matrix of the system. In A the coefficients of the system zrre displayed in the pattern of the equations. That is, their position in A corresponds to that in the system when written as shown. The same is true for Á. We shall see that the augmented matrix Á contains all the information about the solutions of a system, so that we can solve a system just by calculations on its augmented matrix. We shall discuss this in great detail, beginning in Sec. 7.3. Meanwhile you may verify by substitution that the solution is -r1 : 3, x2: +, 1 /Q - l. The notation xy x2, "r3 for the unknowns is practical but not essential; we could choose x, j, z or some other 1etters. fan atz asf I o^ azz orrl. Lr, asz ,rr_] |o, a2 as], t;] f e-" lru- (l) ^[l:;] Thematrix ^: [i : : j] tr 274 ExAMPLE 7 CHAP.7 Linear Algebra: Matrices, Vectors, Determinants, Linear Systems Sales Figures in Matrix Form Sales figures for three products I, II, m in a store on Monday (M), Tuesday (T), , , ,may for each week be arranged in a matrix [ +oo 330 810 0 210 A: l o Lzo 78o 500 5o0 I L roo 0 0 27o 43o M 4701 I I 960 l II 780] III If the company has ten stores, we can set up ten such matrices, one for each store. Then bY adding corresPonding entries of these matrices we can get a matrix showing the total sales of each product on each daY, Can You think of other data for which matrices are feasible? For instance, in transportation oi storage problems? or in recording phone calls, or in listing distances in a network of roads? l Generat Concepts and Notations We shal| denote matrices by capital boldface letters A, B, C, , , , , or by writing the general entry in brackets; thus A : |ait"], and So on. By an m X n matrix (read m by n matrix) we mean a matrix with m.owš ána n columns-rows come alwaYs first! 
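On a computer such matrices are stored as two-dimensional arrays, with the row index written first, just as in the double-subscript notation. The following minimal sketch uses Python with NumPy (one possible tool; the text's list of suitable software names Lapack, Maple, Mathematica, and Matlab). It sets up the coefficient matrix and the augmented matrix of Example 1 and checks the solution x1 = 3, x2 = 1/2, x3 = -1 quoted there by substitution; the right side of the second equation is taken as 20, the value consistent with that solution.

import numpy as np

# Coefficient matrix A of the linear system in Example 1:
#     4x1 + 6x2 + 9x3 =  6
#     6x1       - 2x3 = 20
#     5x1 - 8x2 +  x3 = 10
# The entry in Row j and Column k is addressed as A[j-1, k-1] (array indices start at 0).
A = np.array([[4.0,  6.0,  9.0],
              [6.0,  0.0, -2.0],
              [5.0, -8.0,  1.0]])

b = np.array([6.0, 20.0, 10.0])        # the right sides
A_tilde = np.column_stack([A, b])      # augmented matrix: A with b appended as a column
print(A_tilde.shape)                   # (3, 4): three rows, four columns
print(A[1, 2])                         # -2.0, the entry in Row 2, Column 3

x = np.array([3.0, 0.5, -1.0])         # the solution quoted in Example 1
print(A @ x)                           # [ 6. 20. 10.]: substitution reproduces b
print(np.allclose(A @ x, b))           # True

Obtaining such a solution systematically by operating on the augmented matrix is the subject of Sec. 7.3.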
m X n is called thesizeofthematrix.ThusanmXnmatrixisoftheform |':"!^ (2) 6 : |a7"7 -- atz azz atn2 I Thematricesin(l)areof sizes 2x3,3 X 3,2x2,1 X 3, anď2 X 1,respectively, Each entry in (2) has two subscripts. rn" first is the row number and the second is the column nwmber. Thus ct27 is the entry in Row 2 and Column 1, If m : n, wecall A an n X n ,qou." matrix. Then its diagonal containing the entries a11, a22, . . . , onn ircalled the main diagonal of A, Thus the main diagonals of the two ,áu*"Luffices in (1) zía a11, a22, a33 aíd e-*,4x, respectively, Square matrices *" purtiŇlarly important, as we shall see, A matrix that is not square is called a rectangular matrix, Vectors A vector is a matrix with only one row or column. Its entries are called the components of the vector. We shall denote vectors by lowercase boldface letters &' b' ' ' ' or bY its general component in brackets, a -_ |or]i,, and so on, our special Vectors in (1) suggest hut u (general) row vector is of the form Forinstance, a:|-2 5 0,8 0 1], , anf.u -- |o, a2 ____1 SEC. 7.1 Matrices, Vectors: Addition and Scalar Multiplication A column vector is of the form b- bI b2 : brn DEFlNlTloN 275 For instance, ,:Il] Matrix Addition and Scalar Multiptication What makes matrices and vectors really useful and particularly suitable for computers is the fact that we can calculate with them almost as easily as with numbers. Indeed, we now introduce rules for addition and for scalar multiplication (multiplication by numbers) that were suggested by practical applications. (Multiplication of matrices by matrices follows in the next section.) We first need the concept of equality. Equality of Matrices Two matrices A : |oio] and B : Ibio] are equal, written A : B, if and only if they have the same size and the corresponding entries aíeequal, that is, all : b11, a12 : bI2, and so on. Matrices that are not equal are called different. Thus, matrices of different sizes are always different. EXAMPLE 3 Equalityof Matrices Let Then DEFlNlTloN [; :] [i :] f+ ll L, ,_] t;;:] [; if and onlv if The following matrices are all different. Explain! As a special case, the sum a have the same number of components. f4 0l B: I l [: -r_] : 4, aI2: 0, : 3, azz: -1. fol A:I Lo^ o,'r'rf aIl a2I 1 4 'l .z) Addition of Matrices The sum of two matrices L : |airc-] and B : |b7"] of the same size is written A + B and has the entries ai1x l bip obtained by adding the corresponding entries of A and B. Matrices of different sizes cannot be added. i b of two row vectors or two column vectors, which must components, is obtained by adding the corresponding 276 CHAP.7 Linear Algebra: Matrices, Vectors, Determinants. Linear Systems EXAMPLE 4 Addition of Matrices and Vectors lf ^:[-o u '-] and u:['-l o] . L o l 2) L: l 0]' A in Example 3 and our present A cannot be added. If a a*b:t-l 9 2l. An application of matrix addition was suggested in Example 2. |-t then A+R:I L: :[5 7 2]andb: Many others will fbllow. 5 2 t-6 ]l ,) 2 0], then l DEFlNlTloN Scalar Multiplication (Multiplication by a Number} The product of any m X n matrix A : |a7"] and any scalar c (number c) is written cA and is the m X n matrix cA : |caq") obtained by multiplying each entry of A by ,. Here (-1)A is simply written -A and is called the negative of A. Similarly, (-k)A is written -kA. Also, A + (-B) is written A - B and is called the difference of A and B (which must have the same size!). EX A M P LE 5 Scalar Multiplication Il ^:['J jll then -A-|:' jll +^:[; il .^:[: :l Ln,o -4 5_] L-n o 4 5.] 
' L ,o -,] L. ..] If a matrix B shows the distances between some cities in miles, 1,6098 gives these distances in kilometers. I Rules for Matrix Addition and Scalar Multiplication. From the familiar laws for the addition of numbers we obtain similar laws for the addition of matrices of the same size m X n, namely, (a) (b) (c) (d) A-lB:B (A+B)+C:A A*O:A A+(-A):0. +A +(B+C) (3) (writtenA+B+C) Here 0 denotes the zero matrix (of size m X n), that is, the m X n mattix with all entries zero. (The last matrix in Example 5 is a zero matrix.) Hence matrix addition is commutative anď associative [by (3a) and (3b)]. Similarly, for scalar multiplication we obtain the rules (4\ (a) c(A + B) (b) (c + ft)A (c) c(kA) (d) 1A :cAfcB :cA+kA : (ck)A -A. (written ckA) E Let SEC. 7.1 Matrices, Vectors: Addition and Scalar Multiplication ADD|T!oN AND scALAR MULTIPLICATIoN OF MATRICES AND VECTORS A: Find the following expressions or give íeasons why they are undefined. 1. C + D, D + C,6(D - C), 6C - 6D 2. 4C, 2D, 4C + 2D, 8C - 0D 3. A + C - D, C - D,D - C,B + 2C + 4D 4. 2(^ + B), 2^ + 2B, 5A - ln,A + B + C 5. 3C - 8D, 4(3A), (4.3)^, * - -ro 6. 5A - 3C, A - B t D, 4(B - 6A), 4B * 24^ 7. 33u, 4v ]- 9u, 4(v -f 2.25u), u - v 8. A + a, I2u f 10v, 0(B - v), 0B * u 9. (Linear system) Write down a linear system (as in Example 1) whose augmented matrix is the matrix B in this problem set, 10. (Scalar multiplication) The matrix A in Example 2 shows the numbers of items sold. Find the matrix showing the number of units sold if a unit consists of (a) 5 items, (b) 10 items? 11. (Double subscript notation) Write the entries of A in Example 2 in the general notation shown in (2). 12. (Sizes, diagonal) What sizes do A, B, C, D, u, v in this problem set have? What are the main diagonals of A and B, and what about C? 13. (Equality) Give reasons why the five matrices in Example 3 are different. 14. (Addition of vectors) Can you add (a) row vectors whose numbers of components are different, (b) a row and a column vector with the same number of components, (c) a vector and a scalar? (General rules) Prove (3) and (4) for general 3 X 2 matrices and scalars c and k. TEAM PROJECT. Matrices in Modeling Networks. Matrices have various applications, as we shall see, in a form that these problems can be efficiently handled on the computer. For instance, they can be used to characíerize connections in electrical networks, in nets of roads, in production processes, etc., as follows. (a) Nodal incidence matrix. The network in Fig. 152 consists of 5 branches or edges (connections, numbered I, 2, , , ,, 5) and 4 nodes (points where two or more branches come together), with one node being grounded. We number the nodes and branches and give each branch a direction shown by an arrow. This we do arbitrarily. The network can now be described by a "nodctl incidence ntatrix" L : |a3p), where Í+l r branch k leaves node OI ajk: ]-r irbranchkentersnode OI I O ir branch /c does not touch O Show that for the network in Fig, 152 the matrix A has the given form Branch Node @ Node @ Node @ Node @ Fig. l52. Network and nodal incidence matrix in Team Project 16(a) (b) Find the nodal incidence matrices of the networks in Fig. 153. l5. 16. i^] -:[ ; ,,^:] 1],:[_l r] l]":L1.1] ":I [: 1-1 _1 0 0-1 0 1 0 , ,l 0 0 1 0 _1 | 100_1 0_] 277 Linear Algebra: Matrices, Vectors, Determinants. 
Linear Systems -,:I + 1 if branch k is in mesh I i l and has the same orientation - 1 if branch k is in mesh l-y--l and has the opposite orientation 0 if branch k is not in mesh E o oi J 1@: Fig.153. Networks in Team Project 16(b) (c) Graph the three networks corresponding to the nodal incidence matrices and a mesh is a loop with no branch in its interior (or in its exterior). Here, the meshes are numbered and directed (oriented) in an arbitrary fashion. Show that in Fig. 154 the matrix M corresponds to the given figure, where Row 1 corresponds to mesh 1, etc. 10_].0 001-1 -1 10]_ 0100 Network and matrix M in Team Project 16(d) (e) Number the nodes in Fig. 154 from left to right 1, Z, 3 and the low node by 4. Find the corresponding nodal incidence matrix. |-l-,-,l t;::]|.0-,1,1:l;lL-, 1 0 0] L; ;:] tl l t 0 0.1 I o -t 0 0 -| ,| |-r 0 0 1 l 0| L 0 0 -l -l 0 -l_] (d) Mesh incidence matrix. A network can also be characterized by the mesh incidence matrixM : |m7"7, where .l ;I 1_] Ir lo u=l lo L, Fig. l54. 7.2 Matrix Muttiplication Matrix multiplication means multiplication of matrices by matrices. This is the last algebraic operation to be defined (except for transposition, which is of lesser imPortance). Now matrices are addedby adding coíTesponding entries.Inmultiplication, do we multiplY colTesponding entries? The answer is no. Why not? Such an operation would not be Of much use in applications. The standard definition of multiplication looks artificial, but wi1l be fully motivated later in this section by the use of matrices in "linear transformations," by which this multiplication is suggested. SEC. 7.2 Matrix Multiplication DEFlNlTloN 279 Multiplication of a Matrix by a Matrix The product C : AB (in this order) of an m X nmatrix A : |a4"ftimes an r X p matrix B : |b7"] is defined if and only if r : n and is then the m X p matrix C : |circ] with entries (1) Cjk : 3 orrro : ailbut. * aizba"+''' * a3nbntt j: l,"',ffi k:I,"',P.L_I \ The condition r : n mearIs that the second factor, B, must have as many rows as the first factor has columns, namely n. As a diagram of sizes (denoted as shown): AB:C ímXn]lnXr]:|mXr]. cipin (1) is obtained by multiplying each entry in the7th row of A by the corresponding entry in the kh column of B and then adding these n products. For instance, czt : aztbr. + azzbzt + " , + a2nbn1, and so on. One calls this briefly a "multiplication of rows into columrzs." See the illustration in Fig. 155, where n : 3. }-. n=3 p=2 p=2 ( fo,t a12 o,.l l-ó,, brr] l-.,, .,rl l l o,, 0,, azg l l b^ brrl = | ,r, ,r, l .=o1l"., ,:; o',", ll u,',;,:l-|"',:, ":;l I L';; ao2 "o._.] ' J L"o, 'o,_) Fig.l55. Notations in a product AB : C EX A M P L E 1 Matrix Multiplication Herec11 :3.2 + 5.5 + (-1),9:22, and soon. Theentryintheboxis c23: 4,3 + 0,7 + 2,I: The product BA is not defined. EXAMPLE 2 Multiplication of a Matrix and aVector 14. l [i ;] t:] : [i ]-; :] : ti]] whereas 3 6 1[i] :,,,, [;] ,, 6 1 [,1 ,,:,| l t;] ti ;] is undefined I EXAMPLE 3 Products of Row and Column Vectors 28o CHAp. 7 Linear Algebra: Matrices, Vectors, Determinants. Linear Systems ExAMpLE 4 cAUTloN! Matrix Multiplication ls Not Commutative, AB + BA in General This is illustrated by Examples 1 and 2, where one of the two products is not even defined, and by ExamPle 3, where the two products have different sizes. But it also holds for square matrices. 
For instance, [, ,l[- |l:[, ol but t ,lI l:t99::lL,oo I00_] L | -l_] Lo 0_] L r -r_] Lroo lo0] L-gs -99] ItisinterestingthatthisalsoshowsthatAB:0 T, and 0.5 for T ---> N, hence 0.5 for T ---> T. If today there is no trouble, what is the probability of N two days after today? Three days after today? 27. (Profit vector) Two factory outlets F1 and F2 in New York and Los Angeles sell sofas (S), chairs (C), and tables (T) with a profit of $110, $45, and $80, respectively. Let the sales in a cerlain week be given by the matrix sCT [600 400 l00l F1 A: I l L:oo 820 2o5 _) F2 Introduce a "profit vector" p such that the components of v : Ap give the total profits of F, and F2. 28. TEAM PROJECT. Special Linear Transformations. Rotations have various applications. We show in this project how they can be handled by matrices. (a) Rotation in the plane. Show that the linear transformation y : Ax with matrix is a counterclockwise rotation of the Cartesian xlxrcoordinate system in the plane about the origin, where 0 is the angle of rotation. (b) Rotation through n0. Show that in (a) Is this plausible? Explain this in words. |-cos 0 -sin 0l [",l A:| l and x:| l, I sin d cos áJ L*r_] [yrl y: I l Ly,_] fcos * -sin al [cos É-sin É-.l I rin * .o, o] [ sin B .o, B_.] : [.o, (a + B\ -sin (" + rll Isin(a+P cos(a+B)] [ :i; .:};] []: : .J;] [:lí .Ti |][-cos n0 -sin n0l An:l l L sin n9 cos n0_.] 7.3 Linear Systems of Equations. Gauss E[imination The most important use of matrices occurs in the solution of systems of linear equations, briefly called linear systems. Such systems model various problems, for instance, in frameworks, electrical networks, traffic flow, economics, statistics, and many others. In this section we show an important solution method, the Gauss elimination. General properties of solutions will be discussed in the next sections. 287 (c) Addition formulas for cosine and sine. By geometry we should have Derive from this the addition formulas (6) in App, A3.1. (d) Computer graphics. To visualize a threedimensional object with plane faces (e.g., a cube), we may store the position vectoís of the vertices with respect to a suitable xp2x3-coordinate system (and a list of the connecting edges) and then obtain a twodimensional image on a video screen by projecting the object onto a coordinate plane, for instance, onto the ;rrxr-plane by setting x, : 0. To change the appearance of the image, we can impose a linear transformation on the position vectors stored. Show that a diagonal matrix D with main diagonal entries 3,I,+, gives from an x : [xr] the new position vector y : Dx, where )r : 3íl(stretch in the xl-direction by a factor 3), yz : x2 (unchanged), y3 : Lr" (contraction in the x3-direction), What effect would a scalar matrix have? (e) Rotations in space. Explain y : Ax geometrically when A is one of the three matrices what effect would these transformations have in situations such as that described in (d)? ____- @- 20x, :39 0:0 25 -20 pivot l0 -------------- [; Eliminate rr- L: 1 25 -20 0 -l l| 0l l0 zsl, 90 l 0 -95|-'nol *o*3-3Row2 il0 0l 0] 0-{1 - ^2T ,t3 - 10x2-| 25xr: 90 - 95x3 : _190 0: 0 x3:i3:2IA] xr: s(lo - 25xg) : iz: 4 LA] XI: X2 - xg * i1.: 2 [A] where A stands for "amperes." This is the answer to our problem. The solution is unique. l :::_L Pivot l ------------->@- xzl x3 : 0 Eliminate ------> + x2- 3: 0 l0x2 * 25x=: 99 -1- 10x2 : 80. |-t -l I |0 0 (3) l lo l0 I L0 30 To eliminate x2, do: Add -3 times the pivot equation to the third equation. 
The result is (4)|i -954: _190 70x2 -| 25x3: 90 J1- xzl í3: 0 Back Substitution. Determination of xg, x2, xí(in this order) Working backward from the last to the first equation of this "triangular" system (4), we can now readily find x3, then x2, dfld then x1: Step 2. Elimination of x2 The first equation remains as it is. We want the new second equation to serve as the next pivot equation. But since it has no,r2-term (in fact, it is 0 : 0), we must first change the order of the equations and the corresponding rows of the new matrix. We put 0 : 0 at the end and move the third equation and the fourth equation one place up. This is called partial pivoting (as opposed to the rarely lsed total pivoting, in which also the order of the unknowns is changed). It gives -1 @ tr 0 THEoREM t CHAP. 7 Linear Algebra: Matrices, Vectors, Determinants. Linear Systems Elementary Row Operations. Row-Equivalent Systems Example 2 illustrates the operations of the Gauss elimination. These are the first two of three operations, which are called Elementary Row Operations for Matrices: Interchan7e oí two rows Addition of a constant multiple of one row to another row Multiplication of a row by a nonzero constant c. CAUTION! These operations are for rows, not for columns! They coffespond to the fo11owing Elementary Operations for Equations: Interchange oí two equations Addition of a constant multiple of one equation to another equation Multiplication of an equation by a nonzero constant c. Clearly, the interchange of two equations does not alter the solution set. Neither does that addition because we can undo it by a coíTesponding subtraction. Similarly for that multiplication, which we can undo by multiplying the new equation by llc (since c * 0), producing the original equation. We now call a linear system ,S, row-equivalent. to a linear system 52 if ,S1 can be obtained from S, by (finitely many!) row operations. Thus we have proved the following result, which also justifies the Gauss elimination. Row-Equivalent Systems .., Row-equivalent linear systems have the same set of solutions. ,,/ Because of this theorem, systems having the same solution sets are often called equivalent systems. But note well that we are dealing with row operations. No column operations on the augmented matrix are permitted in this context because they would generally alter the solution set. A linear system (1) is called overdetermined if it has more equations than unknowns, as in Example 2, determined if m : n, as in Example 1, and underdetermined if it has fewer equations than unknowns. Furthermore, a system (1) is called consistent if it has at least one solution (thus, one solution or infinitely many solutions), but inconsistent if it has no solutions at a11, as ír i x2 : I, x1 l x2: 0 in Example 1. Gauss Elimination: The Three Possible Cases of Systems The Gauss elimination can take care of linear systems with a unique solution (see Example 2), with infinitely many solutions (Example 3, below), and without solutions (inconsistent systems; see Example 4). r SEC. 7.3 Linear Systems of Equations. Gauss Elimination EXAMPLE 3 Gauss Elimination if lnfinitely Many Solutions Exist Solve the following linear systems of three equations in four unknowns whose augmented matrix is 293 I-3.0 2.0 2.0 -5.0 I (5) 10.6 1.5 1.5 -5.4 I L|.2 -0.3 -0.3 2.4 ;ll 2.I _l Thus, @ * 2.0x2-1 2.04 - 5.0xn: 8.0 -l 7.5x2 * 1.5;13 - 5.4xn: 2.1 - 0.3x2 - 0.3x3-1 2.4xa: 2.1. 3.0x1 * 2.0x2-| 2.0xg - 5.0xa: Solution. 
As in the previous example, we circle pivots and box terms of equations and coruesponding entries to be eliminated. We indicate the operations in terms of equations and operate on both equations and matrices. Step 1. Elimination of xlfrom the second and third equations by adding - 0.6i3.0 : -0.2 times the first equation to the second equation, - I.2l3.0: -0.4 times the first equation to the third equation. This gives the following, in which the pivot of the next step is circled. |-3.0 2.0 2.o -5.0 I l 0 l,l 1.1 -4.4 l L 0 -1.1 -1.1 4.4 8.0l 1.1 l Row 2 - 0.2 Row l I - |.l _] Row 3 - 0.4 Row l 8.0l lll a] Row3*Row2 3l . r| Row2-3Rowl 0_] Row3-2Rowl 8.0 1,1 This gives |-3.0 2.0 2,0 -5.0 l (1) l o 1.1 1.1 -4.4 I L0 0 0 0 (6) Step 2. This gives t; ,, ,, Lr24 [': 3.0x1 f 2.0x2 * 2.0ry - 5.0xn: 8.0 1.1x2 * 1.1x3 - 4.4xn: 1.1 0- 0 l] @+ 2x2 * xz: 3 [-f,+ xzl xs:O ll |6xr|+ 2x2 * 4xg: 6. Step 1. Elimination of xlfrom the second and third equations by adding -t times the first equation to the second equation, -3: -Z times the first equation to the third equation. 1 1 3 2 3x1 ,| 2x2,| x3: é**)+ *", :óJ 7 zrr]+ zrr: J a -Z 0. (ilfi)+ I.lxc - 4.4xa F -]- 1.1x3 * 4.4xn: -1.1 Elimination of x2 írom the third equation of (6) by adding 1.1l1.I : 1 times the second equation to the third equation. Back Substitution. From the second equation, x2 : I - xs l 4x4. From this and the first equation, xt : 2 - 14. Since x3 and .T4 remain arbitrary, we have infinitely many solutions. If we choose a value of .í3and a value of x4, then the corresponding values of x1 and x2 zíe-uniquely determined. On Notation. If unknowns remain arbitrary, it is also customary to denote them by other letters t1, t2, , , , . Inthisexamplewemaythuswrite xt:2 2- tz,x2: I --rs * 4*4: I - tt+ 4t2,x3: /1 (first arbitrary unknown), xq: t2 (second arbitrary unknown). l EXAM PLE 4 Gauss Elimination if no Solution Exists What will happen if we apply the Gauss elimination to a linear system that has no solution? The answer is that in this case the method will show this fact by producing a contradiction. For instance, consider dtt alz 294 CHAp. 7 Linear Al6ebra: Matrices, Vectors, Determinants. Linear Systems Step 2. Elimination of x2 from the third equation gives 21 -1 1 00 ,':) Row3-6Row2 3x1 -l 2x2* J3: 3 - *r, + i,r: -2 0: 12. The false Statement o : 12 shows that the System has no solution. l Row Echelon Form and lnformation From lt At the end of the Gauss elimination the form of the coefficient matrix, the augmented matrix, and the system itself are called the row echelon form. In it, rows of zeros, if present, are the 1ast rows, and in each nonzero row the leftmost nonzero entrY is farther io the right than in the previous row. For instance, in Example 4 the coefficient matrix and its augmented in row echelon form are Note that we do not require that the leftmost nonzero entries be 1 since this would have no theoretic or numeric advantage. (The so-called reduced echelon form, in which those entries are I, will be discussed in Sec. 7.8.) At the end of the Gauss elimination (before the back substitution) the row echelon form of the augmented matrix will be ;l ,;) t; 'r;l and t; 'r L; ; ;] L. 0 0 ló, ',T, I |' |-l$_ i,!'-' I I ', b* o,.In c^zn : Ll"ILrr lvrll(8) Here, rš ntanďa1 * O, c22+ O,. ,kr, * 0, and a1l the entries inthe blue triangle as well as in the blue rectangle are zero. 
From this we see that with resPect to solutions of the system with augmented matrix (8) (and thus with respect to the originally given system) there are three possible cases: (a) Exactly one solution tf r : n andĎr*l" ,Ďrn,if present, are zeío. To get the solution, solve the nth equation corresponding to (8) (which is knnxn : Ď) for xr, then the (n - 1)st equation for xn_1, and so on up the line. See Example2, where r : n : 3 andm:4. (b) Inftnitely many solutions if r < n anďĚr*t, , , , ,Ď-, if present, aíe zero. To obtain any of these solutions, choose values of xr*l, , , , , xnarbitrarily. Then solve the fih equation for x,, then the (r - 1)st equation for xr_', and so on up the line. See ExamPle 3. (c) 1o solutiontf r +",+ Clry@) Z(2): c2lv,rr-| CzzyQ) +",+ C2ry(:r) :::: a(rn): CmtY(t)l CrrzY +... * C-rYe). These are vector equations for rows. To switch to columns, we write (3) in terms of components as n such systems, with k : I, . . ., h, alk: CllU'1"l CtzUzk +...+ CIrUrk (4) azk: Czt.Uu"t Czz.t)zk +",+ czr.Urk :::: aynk: CrnlUyr -| CmzU2k + " ' l Crn Urk and collect components in columns. Indeed, we can write (4) as f alxf l-.,,l f crr1 l-c,,'l I | ,,ol l Cztl I,,,l l ,,,,ll( (5) |.- |:r,o| ,'|+r*|-'.." |+ lu,kl ','I I , ,--.] L.-,] l ua" L.-,] L.-, ]|a where k -- 1,, " , n. Now the vector on the left is the kth column vector of A. We see that each of these n columns is a linear combination of the same r columns on the right. Hence A cannot have more linearly independent columns than rows, whose number is rank A : r. Now rows of A are columns of the transpose AT. For AT our conclusion is that AT cannot have more linearly independent columns than rows, so that A cannot have more linearly independent rows than columns. Together, the number of linearly independent columns of A must be r, the rank of A. This completes the proof. l EXAMPLE 4 lllustration of Theorem 3 The matrix in (2) has rank 2. From Example 3 we see that the first two row vectors are linearly independent and by "working backward" we can verify that Row 3 : 6 Row 1 -j Row 2. Similarly, the first two columns are linearly independent, and by reducing the last matrix in Example 3 by columns we find that Column3 :{Column 1 + Column2 and Column 4 : t Column 1 + 2}Column 2, l (3) 300 CHAP. 7 Linear Algebra: Matrices, Vectors, Determinants. Linear Systems Combining Theorems 2 and 3 we obtain THEoREM 4 PRooF Vector Space The following related concepts are of general interest in linear algebra. In the Present context they provide a clarification of essential properties of matrices and their role in connection with linear systems, A vector space is a (nonempty) set V of vectors such that with any two vectors a and b in V al1 their linear combinations a a + Bb (a, F any real numbers) are elements of V, and these vectors satisfy the laws (3) and (4) in Sec. 7.1 (written in lowercase letters a, b, u, . . . , which is our notation for vectors). (This definition is PresentlY sufficient, GeneralvectorspaceswillbediscussedinSec.7.9.) The maximum number of linearly independent vectors in V is called the dimension of nl*i*"":ili3:i!.'"..*"u,,u*ethedimensiontobefinite;infinitedimension A linearly independent set in v consisting of a maximum possible number of vectors in v is called a basis for v. Thus the number of vectors of a basis for v equals dim v, The set of all linear combinations of given vectors 8(1), , , , ?(p) with the same number of components is called the span of these vectors. Obviously, a SPan is a vector Space. 
By a subspace of a vector space V we mean a nonempty subset of V (including V itself; ;1xlFfi,'.,#,",,J"1, :"Jriífi.rJ'íJ.".J.",:ij:i1;.''" algebraic oPerations (addition and EXAMPLE 5 Vector Space, Dimension, Basis The span of the three vectors in Example 1 is a vector space of dimension 2, and a basis is a11;, &i2;, for instance, Or ái1;, i13;, etC. We further note the simple THEoREM 5 The matrix A with those p vectors as row Theorem 3 it has rank A šn 1p, which vectors has p rows and n < p columns; hence by implies linear dependence by Theorem2, l PRooF A basis of n vectors is flrrl : [1 0 ei2;:[0 0 1]. In the case of a matrix A we call the span of the row span of the column vectors the column space of A, vectors the row space of A and the 0],a.21 : t0 10 0],",, l Linear Dependence of Vectors p vectors with n 1 p components are always linearly dependent. Vector Space ť The vector Space Rn consisting of all vectors with n components (n real numbers) has dimension n. SEC.7,4 Linear lndependence. Rank of a Matrix. Vector Space Now, Theorem 3 shows that a matrix A has as many linearly independent rows as columns. By the definition of dimension, their number is the dimension of the row space or the column space of A. This proves THEoREM ó Row Space and Column Space The row space and the column space of a matrix A have the same dimension, equal to rank A. Finally, for a given matrix A the solution set of the homogeneous system Ax : 0 is a vector space, called the null space of A, and its dimension is called the nullity of A. In the next section we motivate and prove the basic relation (6) rank A -l nullity A : Number of columns of A. J 4 5 6 48 84 816 168 0-1 05 50 02 FJol LINEARINDEIENDENCE Are the following sets of vectors linearly independent? (Show the details.) 13. [3 -2 0 4], [5 0 0 1], [-6 1 0 1], 2003] 1,4. LI 1 0], [1 0 0], [1 1 15. [6 0 3 I 4 2],l0 -1 l12 3 0 *I9 8 -11l 16. [3 4 ]1,12 0 3], [8 2 17. |0.2 I.2 5.3 2.8 1.6], l4.3 3,4 0.9 2.0 -4.3l t] : L;o, l,: |: t: Ll the its 10. torS 11. ol sI ul ,) ,:1 ,^) ;l ;] 12. 1] 27 0 5], 3], [5 5 6] @ RANK, RoW spAcE, coLUMN spAcE l- t -z1 1. l. .l L-, . ] |-o -2 3. l, 4 l L55 r0 3 5. l-, 0 L-. 5 |-s 0 l. 2 7. l l+ 0 L.4 |-t 0 l. 5 9. l l: 8 L. -3] : :] 4l: ,o :] 2L{ 1 :':] CHAP. 7 Linear Algebra: Matrices, Vectors, Determinants, Linear Systems 18. t9. t3 2 1],[0 0 [r + + i],B t+áá+] 20. t1 2 3 4),I2 |4561] 0], [4 3 6] + i +l.|+ i + ál. 3 4 5],[3 4 5 6], 21. CAS Experiment. Rank, (a) Show experimentally that the n X nmatrix A,: |ap)with ap: j + k -_I has rank 2 for any n. (Probiem 20 shows n : 4,) Try to prove it. (b) Do the same when ajk : j + k * c, where c is any positive integer, (c) What is rank A, tf a7, - 2j+k-21Try to find other large matrices of low rank independent of n, ,ez_N, PRoPERTlEs oF RANK AND CONSEQUENCES Show the following. 22. rankBTAT : rank AB, (Note the order!) 23. rankA : rank B does not tmply rank A2 : rank 82, (Give a counterexample,) 24. IfA is not square, either the row vectors or the column vectors of A are linearly dependent, 7.5 Solutions of Linear Systems: If the row vectors of independent, so are conversely. a square matrix are linearly the column vectors, and 26. Gtve examples showing that the rank of a product of matrices cannot exceed the rank of either factor, @ vEcToR spAcEs t,'t" giu"n set of vectors a vector space? (Give reason,) If you, Jnr*er is yes, determine the dimension and find a basis. (l)1,1)z,, , , denote components,) 27. Allvectors in R3 such that u1 l u2: 0 28. 
Al1 vectors in Ra such that}u2 - 3ua: k 29. Al|vectors in R3 with ur ž0, u2: -4Ll3 30. A1l vectors in R2 with utš uz 31. A1l vectors in R3 with 4u, l u3 : 0, 3u2: ug 32, Altvectors in Ra with U,l, _ Uz: 0, U3 : 5U1, Uq: o 33. All vectors in R'with ]ui| < 1 for j : I" " ,n 34. Al1 ordered quadruples of positive real numbers 35. All vectofs in R5 with l)1 : 2l)2 : 3l)s : 4uq: 5I)s 36. All vectors in Ra with 3ul - u3 : 0, 2u1 * 3uz - 4ua: O Existence, Uniqueness THEO'R,EM I Rank as just defined gives complete information about existence, uniqueness' and general structure of the solution set of linear systems as follows, A linear system of equations in n ,rÁorn, has a unique solution if F: coefficient matrix and the augmented ma;ix have the same rankn,and infinitelY manY solution if that common rank is less than n. Thesystem has no solution if those two matrices have different rank' Tostatethispreciselyandproveit,weshallusethe(generallyimportant)conceptof a submatrix of A. By this we mean any matrix obtained from A bY omitting some roWS or columns (or both). By definition this includes A itself (as the matrix obtained bY omitting no rows or columns); this is practical, Fundamental Theorem for Linear Systems (a) Existence. A linear system of m equations in n wnknowll'S X1' ' ' " Xn atlXt * al2x2 + , , , l alnXn -- b1 aztXt* a22X2+ "' * a2nXn: bz arnlX1 l arn2x2 + "' * an"nxn: bn (1) 303 is consistent, that is, has solutions, augmented matrix Á, have the same aln AatnL a*n if and only if the cofficient matrix A and the rank. Here, att and Á: att amI (b) Uniqueness. The system (I) has precisely one solution if and only if this common rank r of A and Á, equals n. (c) Infinitely many solutions. If this common rank r is less than n, the system (l) has infinitely many solutions. All of these solutions are obtained by determining r suitable wnknowns (whose submatrix of cofficients must have rank r) in terms of the remaining n - r unknowns, to which arbitrary values can be assigned. (See Example 3 in Sec. 7.3.) (d) Gauss elimination (Sec. 7.3). Iísolutions exist, they can atl be obtained by the Gauss eliminatiott. (This method will automatically reveal whether or not solutions exist; see Sec. 7.3.) PROOF (a) We can write the system (1) in vector form Ax : b or in terms of column vectors C(1),'",C(n)Of A: Crl;-Tr * crrrx2 + ", l crn,Xr.:b. Á is obtained by augmenting A by a single column b. Hence, by Theorem 3 in Sec.'7.4, rank Á equals rank A or rank A + 1. Now if (1) has a solution x, then (2) shows that b must be a linear combination of those column vectors, so that Á and A have the same maximum number of linearly independent column vectors and thus the same rank. Conversely, if rank Á : rank A, then b must be a linear combination of the column vectors of A, say, (2) (2*) b : alc11; i l anc6,, since otherwise rank Á : rank A + 1. But (2*) means that (l) has a solution, namely, xl : Q7, , , , , xn : an, as can be seen by comparing (2*) and (2). (b) If rank A : n, the rz column vectors in (2) are linearly independent by Theorern 3 in Sec. 7.4. We claim that then the representation (2) of b is unique because otherwise C, . . . , iné. Expressing these vectors in terms of the vectors of K and collecting terms, we can thus write the system in the form (3) Ó*r+t,"',Črnrtn; here, j : I,'. . ., r.Since the system has a solution, there &fe }1,,,,,!,satisfying (3). These scalars are unique since K is linearly independent. Choosing *r+L, , , *n fixes the B, and corresponding ij : lj - Fi,where j : l,,,,,T. 
(d) This was discussed in Sec. 7.3 and is restated here as a reminder. l The theorem is illustrated in Sec. ].3. In Example 2 there is a unique solution since rank Á : rank A: n: 3 (as can be seen from the last matrix in the example). In ExamPle 3 we have rank Á : rank A : 2 < n - 4 andcan choosex3 and xn arbitrarily. In ExamPle 4 there is no solution because rank A : 2 < rank Á : 3. Homo8eneous Linear System Recall from Sec. 7.3 that a linear system (1) is called homogeneous if all the bjs are zero, and nonhomogeneous if one or several bj s are not zero. For the homogeneous system we obtain from the Fundamental Theorem the following results. Homogeneous Linear System A homogeneous linear system attXt * al2x2+,,, l alnxn : 0 aztXt l a22X2 +''' * a2nxn : 0 (4) am,7Xt * an2x2 +,,,,| arrnxn : 0 always has the trivial solution x1 : 0, , xtl,: 0. Nontrivial solutions exist if and onty if rankA 1n. If rartkA : r 1n,these solutions, togetherwithx: Orform a vector space (see Sec. 1.Q of dimension ll - T, called the solution Space of (4). In paríicular, if x61 antl xQ) are solution vectors of (4), then x : ctx(1) * c2x21 with any scalars c1 and. c2 is a solution vector oí(4). (This does not hold for nonhomogeneous systems. Also, the term solwtion space is used for homogeneous systems only.) THEoREM z SEC. 7.5 Solutions of Linear Systems: Existence, Uniqueness P R O O F The first proposition can be seen directly from the system. It agrees with the fact that b : 0 implies that rank Á : rank A, so that a homogeneous system is always consistent, If rank A: n, the trivial solution is the unique solution according to (b) in Theorem 1. If rank A 1 n, there are nontrivial solutions according to (c) in Theorem 1. The solutions form a vector space because if x11; and xlr, are any of them, then Ax,1) : 0, Ax12; : 0, and this implies A(xrrl i xrr>) : Axcu i Ax,r; : 0 as well as A(cx11;) : cAx11; : 0, where c is arbitrary. If rank A : r 1 n, Theorem 1 (c) implies that we can choose n - r suitable unknowns, call them xr+l, , , , , xll, in an arbitrary fashion, and every solution is obtained in this way. Hence a basis for the solution space, briefly called a basis of solutions of (4), is yZby D -- o,,, D: ailCi, l a3zC32+ "'* ainCin (j:I,2,",,orn) (3b) D : al"Cro l a2*Car i ", l a6"Cn1" (k : I, 2,, ", or n) Here, Ciu -- ?I)j*kMirc and M,"is a determinant of order n - I, namely, the determinant of the submatrix of A obtained from A by omitting the row and column of the entry a7", that is, the jth row and the kth column. In this way, D is defined in terms of n determinants of order fl - I, each of which is, in turn, defined in terms of n - 1 determinants of order ft - 2, and so on; We finally arrive at second-order determinants, in which those submatrices consist of single entries whose determinant is defined to be the entry itself, From the definition it follows that we may expand D by any row or column, that is, choose in (3) the entries in any row of column, similarly when exPanding the C7"' s in (3), and so on. This delinition is unambiguous, thatis, yields the same value for D no matter which columns or roWS we choose in expanding. A proof is given in App, 4, atz aln azz a2n (1) (3a) r SEC.7.7 Determinants. Cramer's Rule (4a) (4b) ExAMPLE 1 ' : Žr(-I)j*ooioMjo ŤL D : > ?I)j*OoioMjo j_7 |ol atsl Mzz- l l. I os, ass l 309 (j:I,2,, ,,orn) (fr:1,2,...,orn). Terms used in connection with determinants are taken from matrices. 
In D we have n2 entries a7", zlson rows and n columns, and a main diagonal on which a11, a22, , , , , anlt stand. Two terms are new: Mip is called the minor of a7" in D, and C7, the cofactor of a7" in D. For later use we note that (3) may also be written in terms of minors Minors and cofactors of a Third-order Determinant In (4) of the previous section the minors and cofactors of the entries in the first column can be seen directly. For the entries in the second row the minors are latz atsl Mzt: l l. |on assl |ott apl Mzs: l l |osl aszl and the cofactors are C21 : - Mzt, C22 : * Mzz, and C; : - Mzs. Similarly for the third row-write these down yourself. And verify that the signs in C7" form a checkerboard pattern + l+ EXAM PLE 2 Expansions of a Third-Order Determinant ExAMPtE 3 l:;:l:-, l-, 2 5l Inspired by this, can you formulate a little theorem matrices? " Ii ':':l :,r :l-,|,,:l-,|-', , l(l2 - 0) - 3(4 + 4) + 0(0 + 6): -I2. This is the expansion by the first row. The expansion by the third column is :l ,:o|_,, :l .l ; ;l -,l] :l :.-l2+O:-l2 Verify that the other four expansions also give the value - 12. Determinant of a Triangular Matrix l l: :l - -3,4,5: -60. on determinants of triangular matrices? Of diagonal l '=- l 310 cHAp. 7 Linear Algebra: Matrices, Vectors, Determinants. Linear Systems General Properties of Determinants To obtain the value of a determinant (1), we can first simplify it systematically bY elementary row operations, similar to those for matrices in Sec. 7.3, as follows. Behavior of an nth-Order Determinant under Elementary Row Operations (a) Interchange of two rows multiplies the value of the determinant bY -I. (b) Addition of a multiple of a row to another row does not alter the value of the determinant. (c) Muttiplication of a row by a nonzero constant c multiplies the value of the determinant by c. (This holds also when c: O, but gives no longer an elementarY row operation.) PRooF (a) By induction. The statementholds forn: 2because THEoREM l : ad - bc, but (5) p:Ž (-I)j*koi,"Miu, h:1 l,l, ,,l dl |:bc-ad. bl We now make the induction hypothesis that (a) holds for determinants of order n - I > 2 and show that it then holds for determinants of order n. Let D be of order n. Let E be obtained from D by the interchange of two rows. Expand D and E by a row that is not one of those interchanged, call it the jth row. Then by (4a), E : i (-l)j*kay,Nit" k_l where N,i. is obtained íiomthe minor Mip of aip in D by the interchange of those two rows whlch have been interchanged in D (and which N3p must both contain because we expand by another row!). Now these minors are of order n - 1. Hence the induction hypothesis applies and gives Nju: -Mio. Thus E : -D by (5). (b) Add c times Row i to Row j. Let D be the new determinant. Its entries in Row j are a.x -| ca11".If we expand Ď ay this Row j, we see that we can write it as D : Dt l cD2, where Dt: D has in Row i the aiu, whereas D2has in that Row j the ap from the addition. Hence D, has al"inboth Row l and Row7. Interchanging these two ťows gives D2back, but on the other hand it gives -Drby (a). Together D2 - -D2: 0, so that D : Dt - D_ (c) Expand the determinant by the row that has been multiplied. CAUTION! det (cA) : cn det A (not c det A). Explain why. l ExAMpLE 4 Evaluation of Determinants by Reduction to Triangular Form Because of Theorem 1 we may evaluate determinants by reduction to triangular form, as in the Gauss elimination for a matrix. 
For instance (with the blue explanations always referring to the preceding determinant) lz. o -4 6 l ll 5 l 0 o:I l o 2 6 -| I 1-3 8 9 l SEC.7.7 Determinants. Cramer's Ru[e 3tl 20 05 02 08 20 05 00 00 20 05 00 00 -4 9 6 _) -4 9 2,4 - lI.4 -4 9 2.4 -0 l0 6 -12 6 -12 -1 3,8 29.2 Row2-2Row1 Row 4 * 1.5 Row l ] Row 3 - 0.4 Row 2 Row 4 - 1,6 Row 2 -,: l 38 l 47.25l Row 4 + 4.]5Row 3 : 2. 5 .2.4. 41.25 : 1134, TH],EoREM 2 Further Properties of nth-Order Determinants (a)-(c) in Theorem l hold also for columns. (d) Transposition leaves the value of a determinant unaltered. (e) Á zero row or column renders the value of a determinant zero. (f) Proportional rows or columns render the value of a determinant zero. In particular, a determinant with two identical rows or columns has the value zero. P R O O F (a)-(e) follow directly from the fact that a determinant can be expanded by any row column. In (d), transposition is defined as for matrices, that is, the 7th row becomes the ith column of the transpose. (0 If Row7 : c times Row i, then D : cD1, where D, has Row j : Row l. Hence an interchange of these rows reproduces D1, but it also gives -Dtby Theorem 1(a). Hence Dt : 0 and D : cD1: 0. Similarly for columns. l It is quite remarkable that the important concept of the rank of a matrix A, which is the maximum number of linearly independent row or column vectors of A (see Sec. 7.4), can be related to determinants. Here we may assume that rank A ) 0 because the only matrices with rank 0 are the zero matrices (see Sec. 7.4). THEoREM 3 Rank in Terms of Determinants An m X n matrix A : |aiuf has rank r > 7 if and onty if A has an r X r submatrix with nonzero determinant, whereas every square submatrix wíth more than r rows that A has (or does not have!) has determinant equal to zero. In particular, if A is square, n X n, it has rank n if and only if detA t 0. l ,=__--- r i t), so that det Š: 0 bY Theorem 2. This proves the theorem for an m X n matrix, Inparticular,ifAissquare,nXn,thenrankA:nlfandonlyifAcontainsannXn submatrix with nonzero ^determinant. But the only such submatrix can be A itself, hence detA * 0. l Cramer's Ru[e Theorem 3 opens the way to the classical solution formula for linear sYstems known as cru,,'..]".ol.r, which giu", solutions as quotients of determinants. Cramer's rule is not practicalin computatioň.s (for which the methods in Secs -7.3 anď20.I-z0,3 are suitable), but is of theoreticalinteresl in differential equations (Secs. z.I0,3.3) and other theories that have engineering applications, Cramer,s Theorem (Solution of Linear Systems by Determinants) (a) Iía linear system of n equations in the same number of unknowns X1,, , , , Xn attXt* al2X2 + "' * alnXn: b1 aztXt* a22X2 + "' l a2nXn: b2 antXt* an2x2 + ", * annxn: bn has a nonzero cofficient determinant D : det A,, the system has precisely one solution. This solution is given by the formwlas (6) D1 Ý:-^lD D2 Dn xz: D :i": D (Cramer's rule) (7) where D1" is the determinant obtained from D by replacing in D the kth column by the column with the entries by , , , ,bn, (b) Hence if the system (6) ,s homogeneous and D + O, it has only the trivial solutioníl :0, X2:0,... ,Xn: o.i7o __ 0,thehomogeneous Systemalsohas nontrivial solutions. THEoREM 4 2GRBRIEL CRAMER (l704-I'7 5z), Swiss mathematician, SEC.7.7 Determinants. Cramer's Rule P R O O F The augmented matrix Á of the system (6) is of size n at most n. Now if 313 X (n * 1). Hence its rank can be a7natt (8) D: detA: +0, ant ann then rank A : n by Theorem 3. Thus rank Á : rank A. 
Hence, by the Fundamental Theorem in Sec. 7.5, the system (6) has a unique solution. Let us now prove (1). Expanding Dby its tth column, we obtain D : ay"Cro -l a2l"C2k +''' l an1"Cn1", where C6 is the cofactor of entry ap tn D. If we replace the entries in the frth column of D by any other numbers, we obtain a new determinant, say, Ď. Clearty, its expansion by the kth column will be of the form (9), with alk, , , , , ankreplaced by those new numbers and the cofactors Cp as before. In particular, if we choose as new numbers the entries all,,,,, anlof the /th column of D (where l + k),we have a new determinant Ó which has twice the columnfau ^ ant]r, once as its /th column, and once as its kth because of the replacement. Hence Ď : 0 by Theore m 2(f).If we now expa nd Ď by the column that has been replaced (the kth column), we thus obtain auCu" l a2lC2p l l a6Cn1" : 0 (l + k). We now multiply the first equation in (6) by Cu" on both sides, the second by Cr,r,. . . , the last by Cnt", and add the resulting equations. This gives Cg(alxy+ , , , * alnxn) + , , , l Cnlr(anrlr f . . . + annxn) : btCu" + ," l bnC61. Collecting terms with the same xi,lNe can write the left side as xl(alCy" * a2lC21" + , , , l anlCn1") + , , , l xn(atnCu" l a2nC2p + . . . l annC6"). From this we see that xo is multiplied by au"Cu"* a27"C21" + ", l an1"Cn1". Equation (9) shows that this equals D. Similarly, xt is multiplied by auCu"l a2lC27" + ", l a6Cnp. Equation (10) shows that this is zero when l + k. Accordingly, the left side of (11) equals simply x1"D, so that (11) becomes xkD: btCr1r+ b2c2k + ", l bnCn1". (9) (10) (11) 3l4 CHAp. 7 Linear Algebra: Matrices, Vectors, Determinants. Linear SYstems Now the right side of this is D6 as defined in the theorem, expanded bY its frth column, so that division by D gives (7). This proves Cramer,s rule. If (6) is homogeneous and D * 0, then each D1, has a column of zeros, so that Drc : 0 by Theorem 2(e), and (7) gives the trivial solution, Finally, if (6) is homogeneous and D : 0, then rank A < n by Theorem 3, so that nontrivial solutions exist by Theorem 2 in Sec, 7,5, l Illustrations of Theorem 4 for n : 2 and 3 are given in Sec. 7 -6, and an imPortant application of the present formulas will follow in the next section, 1. (Second-order determinant) Expand a general secondorder determinant in four possible ways and show that the results agree. 2. (Minors, cofactors) Complete the list of minors and cofactors in Example 1. 3. (Third-order determinant) Do the task indicated in Example 2. Also evaluate D by reduction to triangular form. 4. (Scalar multiplicaiion) Show that det (kA) : k" det A (not k det A), where A is any n X n matrix, Give an example. @ EvALuATloN oF DETERMINANTs Evaluate, showing the details of your work, (Expansion numerically impractical) Show that the computation of an nth-order determinant by expansion involves n! multiplications, which if a multiplication takes 10-9 sec would take these times: 15. I200 3400 0056 00,78 0-2 1 2 0-2 -1 2 0 0 -4 -1 16. 1 0 17. 13 a -L ;l 6. l cos n0 sin n0| l -sin n0 cos n0l ll ':,\ 10n z015 25 . 0.004 22 71 0.5, 109 Time sec mln years years lcos a sin al 7. l l I sin B cos É| ,l"l:'l1ll 10. l,: |:, CRAMEťS RULE Solu" by Cramer's rule and check by Gauss elimination and back substitution. (Show details.) 18.2x -5y:23 4x+6y--2 19. 3y -l 4z : 14.8 4x-l 2y- z:-6.3 x- y-l5z:13.5 20. 
w +2x -3z:30 4x-5yf-2z:13 2w -F8;r-4y+ z:42 3w + y-5z:35 @ RANK By DETERMINANTs Find the rank by Theorem 3 (which is not a very practical way) and check by row reduction. (Show details,) "\',, 11. l; l; -C a -/- 13. 14. 00 50 15 24 2t[:1] SEC. 7.8 lnverse of a Matrix. Gauss-Jordan Elimination TBAM PROJECT. Geometrical Applications: Curves and Surfaces Through Given Points. The idea is to get an equation from the vanishing of the determinant of a homogeneous linear system as the condition for a nontrivial solution in cramer's theorem. We explain the trick for obtaining such a system for the case of a line Z through two given points Pi (It yt) and P2: (xz, yz). The unknown line is ax l by : -c, say. We write it as ax + by + c, I : 0. To get a nontrivial solution a, b, c, the determinant of the "coefficients" x, y, 1 must be zero. The system is ax*by *c.1:0 (LineL) (I2) ax1 *byr-|c,1:0 (PronZ) ax2l by, -| c, 1 : 0 (P2on L). 315 (a) Line through two points. Derive from D : 0 in (12) the familiar formula X*X,s, _ !- jt X,:, - Xz !l, - lz (b) Plane. Find the analog of (12) for a plane through three given points. Apply it when the points are (1, 1, 1), (3,2, 6), (5, 0, 5). (c) Circle. Find a similar formula for a circle in the plane through three given points. Find and sketch the circle through (2, 6), (6, 4), (1, I). (d) Sphere. Find the analog of the formula in (c) for a sphere through four given points. Find the sphere through (0, 0, 5), (4,0, I), (0, 4, I), (0, 0, -3) by this formula or by inspection. (e) General conic section. Find a formula for a general conic section (the vanishing of a determinant of 6th order). Try it out for a quadratic parabola and for a more general conic section of your own choice. 25. WRITING PROJECT. General Properties of Determinants. Illustrate each statement in Theorems 1 and 2 with an example of your choice. 26. CAS EXPERIMENT. Determinant of Zeros and ones. Find the value of the determinant of the n x n matrix A,, with main diagonal entries all0 and all others 1. Try to find a formula for this. Try to prove it by induction. Interpret ,{3 and A.4as "incidence matrices" (as in Problem Set 7.1 but without the minuses) of a triangle and a tetrahedron, respectively; similarly for an " n-simplex" , having n vertices and n(n - I) 12 edges (and spanning R'-1, fl: 5,6, , , ,). [:],i];] [T :: ,:^ ť] 7.8 |nverse of a Matrix. Gauss-J ordan E[im i nation In this section we consi.der square matňces exclusively. The inverse of ann X nmatríx A: |a1"]is denotedby A-'and is annX nmatrtx such that AA_1 :A-lA:I where I is the n X n unit matrix (see Sec. 7.2). If A has an inverse, then A is called a nonsingular matrix. If A has no inverse, then A is called a singular matrix. If A has an inverse, the inverse is wnique. Indeed, if both B and C are inverses of A, then AB : f and CA : I, so that we obtain the uniqueness from B : IB : (CA)B : C(AB) : CI - C. (1) 316 CHAP. 7 Linear Algebra: Matrices, Vectors, Determinants, Linear Systems we prove next that A has an inverse (is nonsingular) if and only if it has maximum possible rank n.The proof will also show that Ax : b Ímptes x : A-lb provided A-1 exists, and will thus give a motivation for the inverse as well as a relation to linear systems' (But this wIII notgive a good method of solving Ax : b numerically because the Gauss elimination in Sec. 7.3 requires fewer computations,) LetAbeagiven (2) n X n matrix and consider the linear system THEoREM 1 PRooF If the inverse A-1 exists, then gives Ax:b. 
multiplication from the left on both sides and use of (1) A-lAx:x:A-lb. This shows that (2) has a unique solution x. Hence A must have rank nby the Fundamental Theorem in Sec. 7.5. Conversely, let rank A : n. Then by the Same theorem, the system (2) has a unique solution x for any b. Now the back subsiitution following the Gauss elimination (Sec. 7.3) shows that the components x7 of x aíe linear combinations of those of b. Hence we can write x:Bb(3) with B to be determined. Substitution into (2) gives Ax:A(Bb):(AB)b:Cb:b (C : AB) for any b. Hence C : AB : I, the unit matrix. Similarly, if we substitute (2) into (3) we get x:Bb:B(Ax):(BA)x for any x (and b : Ax). Hence BA : L Togetheí, B : A-1 exists, -"*rr-*r*RDAN (1842-1899), German mathematician and geodesist. |See American Mathematical Monthly 94 (1981), l30-I42,) we do not recommend it asa method for solving systems of linear equations, since the number of operations in addition to those of the Gauss elimination t rarger than that for back substitution' which the Gauss-Jordan elimination avoids. See also Sec, 20,1, l Existence of the lnverse The inverse A-' of an n X n matrix A, exists if and onlY if rank A : n' thus (bY Theorem 3, Sec. l.i1 r7 ana only if detA + 0. Hence L is nonsingular if rank L : n, and is singular if rank L 1 n, ._----< SEC. 7.8 lnverse of a Matrix. Gauss-Jordan Elimination Determination of the lnverse by the Gauss-Jordan Method For the practical determination of the inverse A-1 of a nonsingular n X n matrix A we can use the Gauss elimination (Sec. 7.3), actlally a variant of it, called the Gauss-Jordan elimination3 (footnote of p. 316). The idea of the method is as follows. Using A, we form n linear systems AX atbt:alb1 *...+ anbn. llL_7 Lu") 376 DEFlNlTloN cHAp. 7 Linear Algebra: Matrices, Vectors, Determinants. Linear Systems We now extend this concept to general real vector spaces by taking basic ProPerties of (a, b) as axioms for an "abstract inner product" (a, b) as follows. Vectors whose inner product is zero are called orthogonal. The length or norm of a vectot ínV is defined by (2) llull : \/(a, Ď (= 0), A vector of norm 1 is From these axioms (3) From this follows (4) called a unit vector. and from (2) one can derive the basic inequality |(a, n)| = ll"ll lln|| Gauchy-Schwar7í inequality), llu + bll= l|a|l + |ln|| (Trian gle ine quality). A simple direct calculation gives (5) llu + nll '+ llu - bll ' : zt llall '+ llnll 'l (P ar all e l o 8 r am e quality) . aDAvtp HILBERT (1862_1943), gíeatGerman mathematician, taught at Kónigsberg and Góttingen and was the creator of the famous Góttingen mathematical school. He is known for his basic work in algebra, the calculus of variations, integral equations, functional analysis, and mathematical logic. His "Foundations of GeometrY" helped the axiomatic method to gain general recognition, His famous 23 Problems (Presented in 1900 at the International Congress of Mathematicians in Paris) considerably influenced the develoPment of modern mathematics. If V is finite dimensional, it is actually a so-called Hitbert space; see Ref. [GR7], P. 73, listed in APP, I, sHBxvtRNN AMANDUS SCHWARZ (I843-I92I). German mathematician, known by his work in comPlex analysis (conformal mapping) and differential geometry. For Cauchy see Sec, 2,5, Real lnner Product Space A real vector space V is called a real inner product space (or real pre-Hilberta space) if it has the following property. 
With every pair of vectors a and b in 7there is associated a real number, which is denoted by (a, b) and is called the inner product of a and b, such that the following axioms are satisfied. I. For al1 scalars q1 and q2 anď all vectors a, b, c ínV, (qp ,| Qzb, c) : Qt(L, c) + q2(b, c) II. For all vectors a and b in V, (a, b) : (b, a) III. For every a tn V, (a, a) > 0, (a, a) : 0 if and only if a (P o s it iv e - definit ene s s) . (Linearity). (Symmetry). ,} SEC. 7.9 Vector Spaces, lnner Product Spaces, Linear Transformations OptionaI 327 EXA M P LE 3 n-Dimensional Euclidean Space R' with the inner product (a,b) : uTb : alb. -| . . . l anbn (where both a and b are column vectors) is called the z-dimensional Euclidean space and is denoted by E' or agaín simply by R". Axioms I-III hold, as direct calculation shows. Equation (2) gives the "Euclidean norm" (1) ll"ll :\/(a,a):Ý{":\r1'+...+"'. l EXAMPLE 4 An lnner Product for Functions. Function Space Thesetofallreal-valuedcontinuousfunctions/("r),s(r)""onagiveninterva],a 1, decrease if ^ < 1). The characteristic equation is (develoP the characteristic determinant by the first column) det(L - ^I) : -^3 - 0.6(-2,3^- 0,3,0,4) : -^3 + 1,38^ + 0,0,72 : 0, A positive root is found to be (for instance, by Newton's method, Sec. 19.2) ^: I.z. A corresponding eigenvector x can be determined from the characteristic matrix A, - 1.2I: where Xs : 0.125 is chosen, xz : 0.5 then follows from 0.3x2 - I.2xg : 0, and 11 : 1 from -1.2x1 -| 2.3x2 -l 0.4xg : 0. To get an initial population of 1200 as before, we multiply x by I20ol0 + 0.5 + 0.125) : 738. Answer: Proportional growth of the numbers of females in the three classes willoccurif theinitialvalues are738,369,92inclasses I,2,3,respectively.ThegrowthratewillbeI.2per 3 years, l EXAM PLE 4 Vibrating System of Two Masses on Two Springs (Fig, t59} Mass-spring systems involving several masses and springs can be treated as eigenvalue Problems. For instance, the mechanical system in Fig. 159 is governed by the system of oDEs y'l: -5yl * Zyz ll ^ lz: zjt - Zlz 3 years, T] Lffi]:L]l] I; ],;':I] , say, -:L.{;,] where y1 and y2 arc the displacements of the masses from rest, as shown in the figure, and Primes denote derivatives with respect to time /. In vector form, this becomes ,,,, _ lríl: o,, : [-' zl [y,-l (7) '" : |"r;) L , -r) Lr,_l X(3):*-: L* (6) ffiI= I hz= 2 ^2= I (Net change in spring length ^, \ t2 System in static equilibrium Fig. l59. Masses on springs in Example 4 System in motion SEC. 8.2 Some Applications of Eigenvalue Problems We try a vector solution of the form (8) Y : xe-t. This is suggested by a mechanical system of a single mass on a spring (Sec. 2.4), whose motion is given by exponential functions (and sines and cosines). Substitution into (7) gives @2xe't : Axe-t. Dividing by ," and writing a2 : l, we see that our mechanical system leads to the eigenvalue problem (9) where ), : a2. 343 From Example 1 in Sec. 8.1 we see that A has the eigenvalues ir : -l and ),2 : -6. Consequently, @: \/=: +i and '\t:6: *i\/6,respectively. Corresponding eigenvectors are (10) -, : [;] and ",: [ 1) From (8) we thus obtain the four complex solutions [see (10), Sec. Z.2] *1"'i' : xl(cos t + i sin t), *r"-i ': x2(cos l/6 t * i sinÝ6 ). By addition and subtraction (see Sec. 2.2) we get the four real solutions x1 cos /, x1 sin /, "2 co, \6 r. x2 sinÝ6 t. 
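Before forming the general solution, the eigenvalues and eigenvectors quoted for the matrix in (7) can be checked numerically. The following sketch is not part of the original example; it assumes NumPy is available and simply recomputes the spectrum of the coefficient matrix and the circular frequencies ω = ±√λ used above.

```python
import numpy as np

# Coefficient matrix of y'' = Ay in (7): two masses on two springs
A = np.array([[-5.0,  2.0],
              [ 2.0, -2.0]])

# Eigenvalues and eigenvectors; expected spectrum: {-1, -6}
lam, X = np.linalg.eig(A)
print(lam)   # -1 and -6 (order may vary)
print(X)     # columns proportional to [1, 2]^T and [2, -1]^T (up to sign)

# Substituting y = x e^{omega t} gives omega^2 = lambda, so omega is imaginary
omega = np.sqrt(lam.astype(complex))
print(omega)  # roughly 1j and 2.449j, i.e. +-i and +-i*sqrt(6)
```

The four real solutions x1 cos t, x1 sin t, x2 cos √6 t, x2 sin √6 t obtained above correspond to the two circular frequencies 1 and √6 computed here.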
A general solution is obtained by taking a linear combination of these, y : xt(at cos / + bl sint) -| x2(a2cos \6 í+ b2 sin 16 4 with arbitrary constants ay by, a2, b2 (to which values can be assigned by prescribing initial displacement and initial velocity of each of the two masses). By (10), the components of y are lt : at cos / + b, sin t -| 2a2 "o, \6 t + 2b2 sin \6 r yz: 2al cos / + 2bl sint - az"o, V6 t - bz sin V6 r. These functions describe harmonic oscillations of the two masses. Physically, this had to be expected because we have neglected damping. l @ ELAsTlc DEFoRMATIoNs Given A in a deformation y : Ax, find the principal directions and corresponding factors of extension or contraction. Show the details. [r 5l r 0.4 0.8l 7, L, , ] 8, L,, ;;] g. l''' ' 'l 10. [' 4f L r.s 6.5_] L+ 1l _] lt. l ' Ý6] 12. [' 21 LÝ6 2) Lz l3_] r LINEARTRANsFoRMATloNs Find the matrix A in the indicated linear transformation y : Ax. Explain the geometric significance of the eigenvalues and eigenvectors of A. Show the details. 1. Reflection about the y-axis in R2 2. Reflection about the xy-plane in R3 3. Orthogonal projection (perpendicular projection) of R2 onto the x-axis 4. Orthogonal projection of R3 onto the plane y : x 5. Dilatation (uniform stretching) in R2 by a factor 5 6. Counterclockwise rotation through the angle tl2 about the origin in R2 Ax : ,\x r-2 3l 13. l l 14. L3-2) I to.s vx/z1 1_1Lttr/z to.o J ''LT 0.1 0.1 0.8 l1]15. (Leontiefl input-output model) Suppose that three industries are interrelated so that their outputs are used as inputs by themselves, according to the 3 X 3 consumption matrix where a,i7, \s the fraction of the output of industry k .onru..róá (purchased) by industry j,Let p, be the price charged by industry7 for its total output, A problem is to find prices so that for each industry, total expenditures equal total income, Show that this leads to Ap : po where p : |pl pz pg]T, and find a solution p with nonnegativl py pz, pz, 16. Show that a consumption matrix as considered in Prob, 15 must have column sums 1 and always has the eigenvalue 1. 1,7. (Open Leontief input-output model) If not the whole output but only a portion of it is consumed by the industries themselves, then instead of Ax : x (as in Prob. 15), we have x - Ax : y, where x : [íl x2 x,]' is produced, Ax is consumed by the industries, and, thus, y is the net production available for other consumers, Find for what production x a given demand vector y : [0.136 0.2'72 0.136]T can be achieved if the consumption matrix is @ populATloN MoDEL wlTH AGE sPEclFlcATloN Find the growth rate in the Leslie model (see Example 3) with the matrix as given. (Show details,) r 0 3.45 0.60l l"I2I. lo.qo 0 0 lal,l"-"I L0 0.45 0J 24. TEAM PROJECT. General Properties of Eigenvalues and Eigenvectors, Prove the following staiements and illustrate them with examples of your own choice. Here, il, , , ,l,-are the (not necessarily distinct) eigenvalues of a given n X n matrix A,: |a3l"], (a)Trace.Thesumofthemaindiagonalentriesiscalled the trace of A. It equals the sum of the eigenvalues, (b) "Spectral shift." A - kI has the eigenvalues Á,1 - k, , , , ,ln - k and the same eigenvectors as A, (c) Scalar multiples, powers, kA has the eigenvalues klt,, , , ,kln,L* (m: I,2,, , ,) has the eigenvalues lrh,, , , , ln*. 
The eigenvectors are those of A, (d) Spectral mapping theorem, The "polynomial matrix" p(A) : k*L- + k*_1A*-1 +,,, + klA + kol has the eigenvalues p(Xi) : k*Xj* + k,,n_t^j*-1 +,,, + k|^j + ko where j : l,,,,, fl,and the same eigenvectors as A, (e) Perron's theorem. Show that a Leslie matrix L with positive l12,!tz, lrr, l"zhas a positive eigenvalue, (This isaspecialcaseofthefamousPerron-Frobeniustheorem inSec.20.7,whichisdifficulttoproveinitsgeneralform.) f0.2 0.5 0l A,:|aluf:lou 0 o,1 Lo., 0.5 0.7] l- 0 l2.o 0l zz.|ols 0 .l L". o ro 0] t 0 1.280 2.915] 23. lo.s6o 0 0 l L0 0.420 0] f o.2 0.4 l n: lo.1 0 I Lo.z o.4 MARKOV PROCESSES l1] Pi"a n"it states of the Markov processes modeled by the following matrices. (Show the details,) |-0.1 0.4l 18.1 l Lo.q 0.6_] l;] ,,Ll]11 lwlsstLy LEONTIEF (1906-1999). American economist at New York UniversitY. For his inPut-outPut analysis he was awarded the Nobel Prlze in 1913, CHAP.8LinearA[6ebra:MatrixEigenvalueProblems SEC. 8.3 Symmetric, Skew-Symmetric, and Orthogonal Matrices 345 8.3 Symmetric, Skew-Symmetric, and Orthotonat Matrices We consider three classes of real squaíe matrices that occur quite frequently in applications because they have several remarkable properties which we shall now discuss. The first two of these classes have already been mentioned in Sec. 7.2. DEFlNlTloNs (1) Symmetric, Skew-Symmetric, and Orthogonal Matrices A real square matrix A : |airc] is called symmetric if transposition leaves it unchanged, AT : A, thus a1ri : aik; skew-symmetric if transposition gives the negative of A, rT -A' : -A, thus akj: -aik, orthogonal if transposition gives the inverse of A, AT : A-1. (2) (3) ExAM PLE l Symmetric, Skew-Symmetric, and orthogonal Matrices The matrices [-3 l 5l [0 9-t2f le + 3l | , 0 -2 l. |-n o ,o|. l-. . ,| |, : .l l.,^ ": ,:l l-] ,, =,| L s -2 +) Ltz -20 o_] L 5 , -._.] are SYmmetric, skew-symmetric, and orthogonal, respectively, as you should verify. Every skew-symmetric matrix has all main diagonal entries zero. (Can you prove this?) l *:L:TLLT,T: ffil,i i f,?..oe written as the sum of a symmetric matrix R and a R:}(l+A') and S:}(a-Ar). EXAMPLE 2 lllustration of Formuh (a) (4) 346 THEoREM l THEoREM z CHAP. 8 Linear Algebra: Matrix Eigenvalue Problems Eigenvalues of symmetric and skew-symmetric Matrices (a) The eigenvalues of a symmetric matrix are real, (b) The eigenvalues of a skew-symmetric matrix are pure imaginary or zero. This basic theorem (and an extension of it) will be proved in Sec, 8,5, ExAMpLE 3 Eigenvalues of Symmetric and Skew-Symmetric Matrices The matrices in (i) and (7) of Sec. 8.2 are symmetric and have real eigenvalues. The skew-sYmmetric matrix in Example 1 has the eigenvalues O, -25i, and25l. (Verify this.) The following matrix has the real eigenvalues 1 and 5 but is not symmetric. Does this contradict Theorem 1? Ii] l Orthogonal Transformations and Orthogonal Matrices Orthogonal transformations are transformations (5) y : Ax where A is an orthogonal matrix. With each vector x in R' such a transformation assigns a vector Y in R'. For instance, the plane rotation through an angle 0 ,: [];] : [:;: :: l [;;] is an orthogonal transformation. It can be shown that any orthogonal transformation in the plane oi in three-dimensional space is a rotation (possibly combined with a reflection in a straight line or a plane, respectively). The main reason for the importance of orthogonal matrices is as follows. 
(6) lnvariance of lnner product An orthogonal transformation preserves the value of the inner PrOduct of vectors a andb in R", defined by f br1 Il(7) a,b : aTb : lat o.] l : I La.) That is, for any a andb in Rn, orthogonal n X n matrix L, and u : Aa, v : Ab we have u,y : a,b. Hence the transformation also preserves the length or norm of anY vector a in Rn given by (8) ll u ll : \/*i: \Fi. SEC. 8.3 Symmetric, Skew-Symmetric, and Orthogonal Matrices 347 PROOF Let A be orthogonal. Let u : Aa and v : Ab. We must show that u.v : a.b. Now (Aa)- : aTAT by (10d) in Sec. 7.2 and, ATA : A-lA : I by (3). Hence (9) u.y: uTy: (Aa)TAn: aTATAb: aTIb: aTb: a.b. From this the invariance of || a || roUows if we set b : a. Orthogonal matrices have further interesting properties as follows. THEoREM 3 Orthonormality of Column and Row Vectors A real square matrix is orthogonal if and only if its column vectors z1,, . . . , an (and also its row vectors) form an orthonormal system, that is, (10) ?j.Zk: 'jrao: {o if j + k Ll if j:k. PROOF (a) LetA be orthogonal. ThenA-lA: ATA: I,interms of column vectors a1, , an, (11) [: A-lA : ATA : ,a,-f: aJ.az..."'. Zn'az, , , dnT The last equality implies (10), by the definition of the n X n unit matrix I. From (3) it follows that the inverse of an orthogonal matrix is orthogonal (see CAS Experiment 20). Now the column vectors of A-1 (: AT) are the row vectors of A. Hence the row vectors of A also form an orthonormal system. (b) Conversely, if the column vectors of A satisfy (10), the off-diagonal entries in (11) must be 0 and the diagonal entries 1. Hence ATA : I, as (11) shows. Similarly, AAT : I. This implies AT : A-1 because also A-lA : AA-1 : r and the inverse is unique. Hence A is orthogonal. Similarly when the row vectors of A form an orthonormal system, by what has been said at the end of part (a). THEoREM 4 Determinant of an Orthogonal Matrix The determinant of an orthogonal matrix has the value -lí or -I. PROOF FromdetAB: detAdetB (Sec.7.8,Theorem4)anddetAT: detA (Sec.7.7,Theorem 2d), we get for an orthogonal matrix 1 : detl: det(AA-l): det(AAT): detAdetAT: (detA)2. l EXAMPLE 4 lllustration of Theorems 3 and 4 The last matrix in Example 1 and the matrix in (6) illustrate Theorems 3 and 4 because their determinants are -1 and *1, as you should verify. l l-",', Lu,-u,[:] ,", tr 348 Tl{E.oREM 5 PRooF E.XAiMPLE 5 CHAP.8 Linear Algebra: Matrix Eigenvalue Problems The first part of the statement holds for any real matrix A because its characteristic polynomial has real coefficients, so that its zeros (the eigenvalues of A) must be as indicated, rt,e ctaim th; |^l : 1 will be proved in Sec, 8,5, l Eigenvalues of an Orthogonal Matrix The orthogonal matrix in Example 1 has the characteristic equation -^3 + ?x'* t,t - t : o, Now one o[ the eigenvalues must be real (why? ). hence * | o= l , Trying, *:^ P, - l , Division by ,\ + l gives _(i2 _ 5N3 + 1) : 0 and the two eigenvalues (5 + i\/i)t6 and (5 _ t.\/Il)ta, which have absolute value 1. Verify all of this, Looking back at this section, you will find that the numerous basic results it contains have relatively short, straightforward proofs. This is typical of large portions of matrix eigenvalue theory. Eigenvalues of an Orthogonal Matrix The eigenvalues of an orthogonal matrix L are real or complex conjugates in pairs and have absolute value I, 1. (Verification) Verify the statements in Example 1, 2. Verify the statements in Examples 3 and 4, 3. 
Are the eigenvalues of A + B of the form i3 + llj, where n7 and Fi are the eigenvalues of A and B, respectively? 4. (Orthogonality) Prove that eigenvectors of a symmetric matrix corresponding to different eigenvalues are orthogonal, Give an example, 5. (Skew-symmetric matrix) Show that the inverse of a skew-symmetric matrix is skew-symmetric, 6. Do there exist nonsingular skew-symmetric n X n matrices with odd n? 7. (Orthogonal matrix) Do there exist skew-symmetric orthogonal 3 X 3 matrices? 8. (Symmetric matrix) Do there exist nondiagonal symmetric 3 X 3 matrices that are orthogonal? @ ElGENvALuEs oF syMMETRlc, sKEwSYMMETRIC, AND ORTHOGONAL MATRlcEs Are the following matrices symmetric, skew-symmetric, or orthogonal? Find their spectrum (thereby illustrating Theoiems 1 and 5). (Show the details of your work,) T0.96 -0.28l l- o bl 9. l l 10. l lll Lo.zs 0.96-] L-a a) 11.[' 'l L- t 1_j "|'^_:, ,,LT: :;; l] ,oo, ,|"L,l ,u],-1] ,,Ll : il ,.L-l 1il ,|':,', ") 18.(Rotationinspace)Giveageometricinterpretationof the transformation y : Ax with A as in Prob, 12 and x and y referred to a Cartesian coordinate system, 19. WRITING PROJECT, Section Summary, Summarize the main concepts and facts in this section, with illustrative examples of your own, PlR:O::B:LĚ:ffi;; ;E.ffi:::8:lB SEq 8:4 Eigenbasgs. Diagonalization. Qu.adlatic |o|._r_ 20. CAS EXPERIMENT. Orthogonal Matrices. (a) Products. Inverse. Prove that the product of two orthogonal matrices is orthogonal, and so is the inverse of an orthogonal matrix. What does this mean in terms of rotations? (b) Rotation. Show that (6) is an orthogonal transformation. Verify that it satisfies Theorem 3. Find the inverse transformation. (c) Powers. Write a program for computing powers L- (m: I,2,...) of a2 X 2matrix A and their spectra. Apply it to the matrix in Prob. 9 (call it A). To what rotation does A correspond? Do the eigenvalues of L- have a limit as m ---> a? (d) Compute the eigenvalues of (0.9A)-, where A is the matrix in Prob. 9. Plot them as points. What is their limit? Along what kind of curve do these points approach the limit? (e) Find A such that y : Ax is a counterclockwise rotation through 30' in the plane. ,-,,-$y' Eigenbases. DiagonaIization. (1) ]T'Fi,EOREM ],l P R O O F A11 we have to show is that x1, . . )xn aíe Let r be the largest integer such that {xr, . r 1 n and the set {x1, ... ,x, xr*1} is Ct, . . . , cr+l, not all zero, such that linearly independent. Suppose they are not. , , xr} is a linearly independent set. Then linearly dependent. Thus there are scalars (2) clx, *...+ crllxr11 :0 (see Sec. 7.4). Multiplying both sides by A and using A*j : ),7x7, we obtain clilx1 +,, . l Cr*lÁ,r*tXr+l : 0.(3) Quadratic Forms So far we have emphasized properties of eigenvalues. We now turn to general properties of eigenvectors. Eigenvectors of ann X nmaftix A may (or may not!) form a basis for R". If we are interested in a transformation y : Ax, such an "eigenbasis" (basis of eigenvectors)-if it exists-is of great advantage because then we can represent any x in R'uniquely as a linear combination of the eigenvectors x1, . . . ,xn, say, X: ClX1 l c2x2 +... l cnxn. And, denoting the corresponding (not necessarily distinct) eigenvalues of the matrix A by ir, , , , , hn, we have Arj : ,ň.7x7, so that we simply obtain y : Ax : A(clx1 + . . . l cnxn) :CIAX1 +",*cnLxn :c1,[1X1 +",lcnhnxn. 
This shows that we have decomposed the complicated action of A on an arbitrary vector x into a sum of simple actions (multiplication by scalars) on the eigenvectors of A. This is the point of an eigenbasis. Now if the n eigenvalues are all different, we do obtain a basis:

THEOREM 1    Basis of Eigenvectors

If an n × n matrix A has n distinct eigenvalues, then A has a basis of eigenvectors x_1, ..., x_n for R^n.

PROOF (continued)    To get rid of the last term, we subtract λ_{r+1} times (2) from (3), obtaining

$$c_1(\lambda_1 - \lambda_{r+1})\mathbf{x}_1 + \cdots + c_r(\lambda_r - \lambda_{r+1})\mathbf{x}_r = \mathbf{0}.$$

Here c_1(λ_1 − λ_{r+1}) = 0, ..., c_r(λ_r − λ_{r+1}) = 0 since {x_1, ..., x_r} is linearly independent. Hence c_1 = ... = c_r = 0, since all the eigenvalues are distinct. But with this, (2) reduces to c_{r+1}x_{r+1} = 0, hence c_{r+1} = 0, since x_{r+1} ≠ 0 (an eigenvector!). This contradicts the fact that not all scalars in (2) are zero. Hence the conclusion of the theorem must hold.

EXAMPLE 1    Eigenbasis. Nondistinct Eigenvalues. Nonexistence

The matrix $A = \begin{bmatrix} 5 & 3 \\ 3 & 5 \end{bmatrix}$ has a basis of eigenvectors $\begin{bmatrix} 1 \\ 1 \end{bmatrix}$, $\begin{bmatrix} 1 \\ -1 \end{bmatrix}$ corresponding to the eigenvalues λ_1 = 8, λ_2 = 2. (See Example 1 in Sec. 8.2.)

Even if not all n eigenvalues are different, a matrix A may still provide an eigenbasis for R^n. See Example 2 in Sec. 8.1, where n = 3.

On the other hand, A may not have enough linearly independent eigenvectors to make up a basis. For instance, A in Example 3 of Sec. 8.1 is $A = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}$ and has only one eigenvector $\begin{bmatrix} k \\ 0 \end{bmatrix}$ (k ≠ 0, arbitrary).

Actually, eigenbases exist under much more general conditions than those in Theorem 1. An important case is the following.

THEOREM 2    Symmetric Matrices

A symmetric matrix has an orthonormal basis of eigenvectors for R^n.

For a proof (which is involved) see Ref. [B3], vol. 1, pp. 270–272.

EXAMPLE 2    Orthonormal Basis of Eigenvectors

The first matrix in Example 1 is symmetric, and an orthonormal basis of eigenvectors is $[1/\sqrt{2}\;\;\; 1/\sqrt{2}]^T$, $[1/\sqrt{2}\;\;\; -1/\sqrt{2}]^T$.

Diagonalization of Matrices

Eigenbases also play a role in reducing a matrix A to a diagonal matrix whose entries are the eigenvalues of A. This is done by a "similarity transformation," which is defined as follows (and will have various applications in numerics in Chap. 20).

DEFINITION    Similar Matrices. Similarity Transformation

An n × n matrix Â is called similar to an n × n matrix A if

(4)    Â = P^{-1}AP

for some (nonsingular!) n × n matrix P. This transformation, which gives Â from A, is called a similarity transformation.

The key property of this transformation is that it preserves the eigenvalues of A:

THEOREM 3    Eigenvalues and Eigenvectors of Similar Matrices

If Â is similar to A, then Â has the same eigenvalues as A. Furthermore, if x is an eigenvector of A, then y = P^{-1}x is an eigenvector of Â corresponding to the same eigenvalue.

PROOF    From Ax = λx (λ an eigenvalue, x ≠ 0) we get P^{-1}Ax = λP^{-1}x. Now I = PP^{-1}. By this "identity trick" the previous equation gives

$$P^{-1}A\mathbf{x} = P^{-1}AI\mathbf{x} = P^{-1}APP^{-1}\mathbf{x} = \hat{A}(P^{-1}\mathbf{x}) = \lambda P^{-1}\mathbf{x}.$$

Hence λ is an eigenvalue of Â and P^{-1}x a corresponding eigenvector. Indeed, P^{-1}x = 0 would give x = Ix = PP^{-1}x = P0 = 0, contradicting x ≠ 0.

EXAMPLE 3    Eigenvalues and Vectors of Similar Matrices

Let

$$A = \begin{bmatrix} 6 & -3 \\ 4 & -1 \end{bmatrix} \quad\text{and}\quad P = \begin{bmatrix} 1 & 3 \\ 1 & 4 \end{bmatrix}.$$

Then

$$\hat{A} = P^{-1}AP = \begin{bmatrix} 4 & -3 \\ -1 & 1 \end{bmatrix}\begin{bmatrix} 6 & -3 \\ 4 & -1 \end{bmatrix}\begin{bmatrix} 1 & 3 \\ 1 & 4 \end{bmatrix} = \begin{bmatrix} 3 & 0 \\ 0 & 2 \end{bmatrix}.$$

Here P^{-1} was obtained from (4*) in Sec. 7.8 with det P = 1. We see that Â has the eigenvalues λ_1 = 3, λ_2 = 2. The characteristic equation of A is (6 − λ)(−1 − λ) + 12 = λ² − 5λ + 6 = 0.
Ithastheroots(the eigenvalues of A) ),1 :3, h2: 2, confirming the first part of Theorem 3. We confirm the second part. From the first component of (A - nl)x : 0 we have (6 - n)xr - 3x2 : g. For.tr:3thisgives 3x1_ 3x2:0,say,xr: [1 1]T.For t-2itgives4x1 -3xr: O,say, xz:í3 4]T. In Theorem 3 we thus have Yr:P-lxr:Ii]]tl]:t;] Indeed, these are eigenvectors of the diagonal matrix Á. !z: P-lxz: t i ;] t;] :[:] THEoREM 4 Perhaps we see that x1 and x2 are the columns of P. This suggests the general method of transforming a matrix A to diagonal form D by using P : X, the matrix with eigenvectors as columns: l Diagonalization of a Matrix If an n X n matrix A has a basis of eigenvectors, then (5) D : X-IAX is diagonal, with the eigenvalues of A as the entries on the main diagonal. Here X is the matrix with these eigenvectors as column vectors. Also, (5*) D* : x-lA-x (m : 2,3, . - .). I 352 CHAP. 8 Linear Algebra: Matrix Eigenvalue Problems pRooF Let x1, ... ,xn constitute a basis of eigenvectors of A for R'. Let the corresPonding eigenvalues of A be it" , , Xn, respectively, so that Ax1 : ).1x1, , A1, : lnxn, Then X : [r,, *r1 tu, ranki, by Theorem 3 in Sec.7.4. Hence X-1 exists bY Theorem 1 in Sec. 7.8. We claim that (6) AX : A[x1 xrr] : [Ax1 Ax,"] : [irxr lnxn]: XD where D is the diagonal matrix as in (5). The fourth equality in (6) follows bY direct calculation. (Try it for n : 2 and then for general n.) The third equality uses Axrc : XuXu, The second equality results if we note that the first column of AX is A times the first column of X, and so on. For instance, when n : 2 and we write X1 : ["l, xzl]-, xz : íxp xzzfT, we have AX : A[x1 xz] : |"o,,,o '".,,,)[,;, f anxl l al2x21 l 1_orrrr, * a22x21 Column 1 ,,,:,,,f attXtz -l al2X22 aztXtz * a22X22 Column 2 If we multiply (6) by X-l from the left, we obtain (5). Since transformation, Theorem 3 implies that D has the same eigenvalues follows if we note that D2 : DD : x-lAxx-lAx : x-lAAX : X-lArX, Diagonalization Diagonalize x-1 : calculating Ax and multiplying by x-' from the left, we thus obtain D : X-IAX: : [Axr Ax2]. (5) is a similarity as A. Equation (5*) l ExAMPLE 4 [ ,, 02 -r r-l a:|-l1.5 1.0 ,rI L nl 1.8 -9.3_.] Solution. The characteristic determinant gives the characteristic equation -,\3 - ^2 + Iz^ : O. The roots (eigenvalues of A) are ir : 3, lz: -4, is : 9.By the Gauss elimination applied to (A - ,\I)x : 0 with ^ ] ^r, ,\2, ,\3 we find eigenvectors and then X-'by the Gauss-Jordan elimination (Sec. 7.8, ExamPle 1). The results are [']: ],,,1 11]Li][r]L1],x:Lir1] [,]1 ],,,1 11] L 1 -,o: l] Ll : l] l SEC. 8.4 Eigenbases. Diagonalization. Quadratic Forms Quadratic Forms. Transformation to Principal Axes By definition, aquadraticform Qinthe components/1, ",,xnof avectorxis asum of n2 terms, namely, o: xTAx :iš o,or,*o j:l k_| : attX1,2 -lal2xlx2 +",lalnxlxn * a2lx2x1+ a22x22 + ", l a2nx2xn +... l anlxnx1l an2xnxz+ ", + annxnz. A : |a4"fis called the coefficient matrix of the form. We may assume that A is symmetric, because we can take off-diagonal terms together in pairs and write the result as a sum of two equal terms; see the following example. EXAMPLE 5 Quadratic Form. Symmetric Coefficient Matrix Let xTAx: [xr ",, [: i] t;;] :3xl2 *4xp2*6x2x1+2x22:3,] *l0xp2+2x22. Here4+6:l0:5*5.FromthecorrespondingsymmetricmatrixC:|qrc],wherec7k:l@3rctou), thus c11 : 3, cn: c2l: 5, czz: 2, we get the same result; indeed, xTCx : [xr ,, [; ]] t;;] _ 3,12 l 5xp2l 5x2x1 + 2x22 : 3*t2 -| 10xp2 + 2x22. 
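The symmetrization step in Example 5 is easy to verify numerically. The sketch below is only an illustration (it assumes NumPy; the test vectors are random and not taken from the text): it forms the symmetric part C = ½(A + Aᵀ) of the nonsymmetric coefficient matrix and checks that both matrices produce the same quadratic form.

```python
import numpy as np

# Nonsymmetric coefficient matrix of Example 5 and its symmetric part
A = np.array([[3.0, 4.0],
              [6.0, 2.0]])
C = (A + A.T) / 2              # c_jk = (a_jk + a_kj)/2, here [[3, 5], [5, 2]]

# x^T A x = x^T C x for every x, so the quadratic form is unchanged
rng = np.random.default_rng(seed=1)
for _ in range(3):
    x = rng.standard_normal(2)
    assert np.isclose(x @ A @ x, x @ C @ x)
print(C)                       # [[3. 5.], [5. 2.]]
```

The values agree because the skew-symmetric part S = ½(A − Aᵀ) contributes nothing to the form: xᵀSx = 0 for every x.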
l Quadratic forms occur in physics and geometry, for instance, in connection with conic sections (ellipses xr2laz + xr2lb2: 1, etc.) and quadratic surfaces (cones, etc.). Their transformation to principal axes is an important practical task related to the diagonalization of matrices, as follows. By Theorem 2 the symmetric coefficient matrix A of (7) has an orthonormal basis of eigenvectors. Hence if we take these as column vectors, we obtain a matrix X that is orthogonal, so that X-1 : XT. From (5) we thus have A : XDX-1 : XDXT. Substitution into (7) gives (8) 0 : xTXDxTx. If we set XTx : y, then, since XT : X-1, we get (9) x : Xy. Furthermore, in (8) we have xTX : 1XTx)T : yT and XTx : yl so that Q becomes simply (10) Q:y-Dy:htltz*Xzyz2+...ll,.!n2. (7) 353 354 CHAP. 8 Linear Algebra: Matrix Eigenvalue Problems This proves the following basic theorem. THEoREM 5 Principal Axes Theorem The substitution (9) transforTns a quadratic form O : x'Ax : i }, ororr*t" (arc3 : a7") j:l k:I to the principal axes form or canonical form (IO), where it, , , , hn are the (not necessarily distinct) eigenvalues of the (symmetric!) matrix A, and X is an orthogonal matrix with corresponding eigenvectoís x1, ",,xn, respectively, as column vectors. EXAM PLE 6 Transformation to Principal Axes. Conic Sections Find out what type of conic section the following quadratic form represents and transform it to PrinciPal axes: Q : I'\x] - 30xp2 + I7x22 : I28, Solution. We have Q : xÍ Ax, where f ll - l5l [*,l a:| l x:| | L-ts 17] L,r) This gives the characteristic equation (17 - D2 - I52 : 0. It has the roots \ : 2, X2 : 32, Hence (10) becomes Q: 2y + 32y22. We see that Q : I28 represents the ellipse 2yl2 + 32yr2 : l28, that is, o9 j,t' !z' 82 ' 2'-l' If we want to know the direction of the principal axes in the xlx2-coordinates, we have to determine normalized eigenvectors from (A - nl)x : 0 with ), : r\r : Z and l : lz : 32 anďthen use (9), We get f ttÝ11 l -tn/11 l _| anO l _l Lvx,5) L ttÝz) hence f ltxT -ll{2l [y,l *, : yltÝ1 - y2lÝ1 :Xy:l - -ll l."J LltvT llÝ1) Lrr_.] ' x2: y|Ý1 + y2lÝ1, This is a 45o rotation. our results agree with those in Sec. 8.2, Example 1, except for the notations, See also Fig. 158 in that example. I diagonalize. (Show the details.) 1 l: :1 ,|oo SEC. 8.4 Eigenbases. Diagonalization. Quadratic Forms r DlAGoNALlzATloN oF MATRlcEs Find an eigenbasis (a basis of eigenvectors) and (d) Diagonalization. What can you do in (5) if you want to change the order of the eigenvalues in D, for instance, interchange dl : i1 and d22: Á,2? @ stMlLAR MATRIcEs HAvE EQuAL sPEcTRA Verify this for A and Á : P-IAP. Find eigenvectors y of Á. Sho, that x : Py are eigenvectors of A. (Show the details of your work.) |--sOlr4-2] ts.A:I l,p:l l L 0 2) L-: l_] 17. ^ 18. A ELr8] TRANsFoRMATIoN To pRlNclpAL AxEs. coNlc sEcTloNs What kind of conic section (or pair of straight lines) is given by the quadratic form? Transform it to principal axes. Express x' : lxr x2f ínterms of the new coordinate vector yT : [yr y2f, as in Example 6. 19. x12 l 24xp2 - 6xr2 : 5 20. 3x * 4Ý5xlx2 * 7xr2 : 9 21,. 3xt2 * 8xp2 - 3xr2 : g 22. 6x12 * I6xp2 - 6rr' : 20 23. 4x] i 2Ý5xp, l 2x22 : 10 24. 7x12 - 24xp2 : I44 25. x12 - Izxlx2 * xz2 : 35 ,:] ,, :^] ,,f |-s ll 4. t3'L, ,] LI t.o 6.0l f z 5.1 l 6.1 L r.s 1.0_] Lo 7[i : ,) ,L;: l ,l] :|,,:,I l] "[ : j]10. [::,i]-:[:i|] 12. l4 A: [; 1] -: [; i] 15 A :|_: -:],-: [] :] |-: 0l l-s 21 16'A:L, -r)'P:Ll 4_] 8 [_] ': :] 11. (Orthonormal basis) Illustrate Theorem 2 w ith further examples. 
(No basis) Find further 2 X 2 and 3 X 3 matrices without eigenbases. PROJECT. Similarity of Matrices. Similarity is basic, for instance in designing numeric methods. (a) Trace. By definition, the trace of an n X n matrix A, : |a7] is the sum of the diagonal entries, trace A : all l az,z + , , , * ann. Show that the trace equals the sum of the eigenvalues, each counted as often as its algebraic multiplicity indicates. Illustrate this with the matrices in Probs. 1, 3, 5, J,9. (b) Trace of product. Let B : |bio]be n X n. Show that similar matrices have equal traces, by first proving trace AB : i; airbtl : trace BA. i:l L:I (c) Find a relationship between Á in (4) and Á : PAP-i. 355 356 26. 27. 28. 29. CHAP. 8 Linear Algebra: Matrix Eigenvalue Problems 3xr2+22xg2*3xr2:g !2xj -f 32xlx2 * !2x22 : Il2 6.5xj * 5.0xg2 + 6.5 xz2 : 36 (Definiteness) A quadratic form Q(x) : xTAx and its (symmetric!) matrix A are called (a) positive definite if O(x) } 0 for all x * 0, (b) negative definite if QG) < 0 for all x * 0, (c) indefinite if Q(x) takes both positive and negative values. (See Fig, 160,) tQ(x) and A aíecalled positive semidefinite (negative semidefinite) if 0(x) > 0 (0(x) < 0) for all x,] A necessary and sufficient condition for positive definiteness is that all the "principal minors" are positive (see Ref. [B3], vol. 1, p. 306), that is, lal avl a11)0. l l=0. lae azzl (o) positive definite íorm (ó) Negative def inite form (c) lndefinite form Fig.16O. Quadratic forms in two variables \':',":]: :\ >0, 8.5 Complex Matrices and Forms. Optional Show that the form in Prob. 23 is positive definite, whereas that in Prob. 19 is indefinite. 30. (Definiteness) Show that necessary and sufficient for (a), (b), (c) in Prob .29 is that the eigenvalues of A are (a) all positive, (b) all negative, (c) both positive and negative. Hint, |Jse Theorem 5. detA > 0. ;-;,] then u:[';'' The three classes of real matrices in Sec. 8.3 have complex counterParts that are of Practical interest in certain applications, mainly because of their sPectra (see Theorem 1 in this section), for instance, in quantum mechanics. To define these classes, we need the following standard EXAMPLE l Notations IfA _[3+4i Lo ;l;,] and .:[i ;:' ul. 2 + 5i) Notations Á : |aip] is obtained from A - |aip] by replacing each (o, Freal) with its complex conjugatěaro: a - i\,Also, ÁT of Á, hence the conjugate transpose of A. entryajk:a+iB : |doifis the transpose SEC. 8.5 Complex Matrices and Forms. Optional 357 DEFlNlTloN Hermitian, Skew-Hermitian, and Unitary Matrices A square matrix A : |ooif is called Hermitian if Á' : A, that is, al"i : aitc skew-Hermitian if ď : -A, that is, al"i : -a3tt unitary if ď:A-1. The first two classes are named after Hermite (see footnote 13 in Problem Set 5.8). From the definitions we see the following. If A is Hermitian, the entries on the main diagonal must satisfy dj j : aifi that is, they are real. Similarly, if A is skew-Hermitian, thenaii: -ajj. If we setaii: a l iB, thisbecomes q,- iP: -(a + iD. Hence d : 0, so that a3i must be pure imaginary or 0. EXAMPLE 2 Hermitian, Skew-Hermitian, and Unitary Matrices ^ : [ ,1r, '-,"f ' f3i B:I L-z+i 2 + if l+i +\,5] l, C:| l -i _] L+* }i .] are Hermitian, skew-Hermitian, and unitary matrices, respectively, as you may verify by using the definitions. I If a Hermitian matrix is real, then ÁT : AT : A. Hence a real Hermitian matrix is a symmetric matrix (Sec. 8.3.). Similarly, if a skew-Hermitian matrix is real, then ÁT : A- : -A. 
Hence a real skew-Hermitian matrix is a skew-symmetric matrix. Finally, if a unitary matrix is real, then ÁT : AT : A-1. Hence a real unitary matrix is an orthogonal matrix. This shows that Hermitian, skew-Hermitian, and unitaly matrices generalize symmetric, skew-symmetric, and orthogonal matrices, respectively. EigenvaIues It is quite remarkable that the matrices under consideration have spectra (sets of eigenvalues; see Sec. 8.1) that can be characterizedin a general way as follows (see Fig. 161). Skew-Hermitian (skew-symmetric) Unitary (orthogonal) Hermitian (symmetric) Fig. 16l. Location of the eigenvalues of Hermitian, skew-Hermitian, and unitary matrices in the complex ,tr-plane 1 ReL 358 THEoREM l CHAP.8 Linear Algebra: Matrix Eigenvalue Problems Eigenvalues (a) The eigenvalues of a Hermitian matrix (and thus of a symmetric matrix) are real. (b) The eigenvalues of a skew-Hermitian matrix (and thus of a skew-symmetric matrix) are pure imaginary or Zero. (c) The eigenvalues of a unitary matrix (and thus of an orthogonal matrix) have absolute value I. ExAMPLE 3 lllustration of Theorem l For the matrices in Example 2 we find by direct calculation ano|t}V3 ++il':fi+i:L l p R o o F We prove Theorem 1. Let ,tr be an eigenvalue and x an eigenvector of A. MultiPlY Ax : Ax from the left by *-, thus íTAx : ),íTx, and divide by íTx : ítxl, + , , , ,l ínxn : |"rl' + . . . + |x,,|2, which is real and not 0 because x * 0. This gives (1) n:$ (a) If A is Hermitian, ď : A or AT : Á and we show that then the numerator in (1) is real, which makes .tr real. xTAx is a scalar; hence taking the transPose has no effect. Thus (2) xTAx : 1íTAx)T : xTATx : *'Áx : (xTAx). Hence, xTAx equals its complex conjugate, So that it must be real. (a + ib : a - ib impliesb:0.) _ tnl rr A is skew_Hermitian, AT : -Á and instead of (2) we obtain xTAx : - 1Xrax;(3) so that xTAx equals minus its complex conjugate and is pure imaginary or 0, (a + ib : -(a - ib) implies a:0.) (c) Let A be unitary. We take Ax : ,trx and its conjugate transpose (nD':(,l,X)-:^Xand multiply the two left sides and the two right sides, (lx;''{x : ^^X'x : |,\|2XTx. Characteristic Equation Eigenvalues A B C Hermitian Skew-Hermitian Unitary ^2-Lli*18:0 ^2 - 2i^ * 8 :0 ^2 - i^ - 1 :0 9,2 4i, -2i lxE + Lri, -!Ý5 + li SEC. 8.5 Complex Matrices and Forms. Optional But A is unitary, Á- : A-1, so that on the left we obtain (Áx)Ttx : x'Á-Ax : xTA-lAx : xTIx : íTx. Together, íTx: |,l|2XTx. We now divide by xTx (+ 0) to get lrtl' : 1. Hence lrt| : t. This proves Theorem l as well as Theorems 1 and 5 in Sec. 8.3. l Key properties of orthogonal matrices (invariance of the inner product, orthonormality of rows and columns; see Sec. 8.3) generalize to unitary matrices in a remarkable way. To see this, instead of R' we now use the complex vector space Cn of all complex vectors with n complex numbers as components, and complex numbers as scalars. For such complex vectors the inner product is defined by (note the overbar for the complex conjugate) (4) a.b : áTb. The length or norm of such a complex vector is a real number defined by (5) llull : \6Á : \F; : ffi : m. THfoREM 2 lnvariance of lnner product Á unitary transformation, that is, y : Ax with a unitary matrix A, preserves the value of the inner product (4), hence also the norm (5). P R O O F The proof is the same as that of Theorem 2 tn Sec. 8.3, which the theorem generalizes. In the analog of (9), Sec. 8.3, we now have bars, u.v: úTv: 1Áa;Ten : a-ďAb : áTIb : áTb : a.b. 
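The characteristic equations and eigenvalues tabulated above for the matrices A, B, C of Example 2 can be confirmed numerically, and doing so also illustrates the three statements of Theorem 1 (real, pure imaginary, and unit-modulus spectra). A minimal sketch, assuming NumPy:

```python
import numpy as np

# Hermitian, skew-Hermitian, and unitary matrices of Example 2
A = np.array([[4, 1 - 3j], [1 + 3j, 7]])
B = np.array([[3j, 2 + 1j], [-2 + 1j, -1j]])
C = np.array([[1j / 2, np.sqrt(3) / 2], [np.sqrt(3) / 2, 1j / 2]])

for name, M in [("A", A), ("B", B), ("C", C)]:
    lam = np.linalg.eigvals(M)
    print(name, np.round(lam, 4))
# A: 9 and 2 (real); B: 4i and -2i (pure imaginary);
# C: +-sqrt(3)/2 + i/2, both of absolute value 1
print(np.abs(np.linalg.eigvals(C)))   # [1. 1.]
```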
l The complex analog of an orthonormal systems of real vectors (see Sec. 8.3) is defined as follows. DEFlNlTloN Theorem 3 in Sec. 8.3 extends to complex as follows. THI-O,RE'.M' 3 Unitary Systems of Column and Row Vectors A complex square matrix is unitary if and only if its column vectors (and also its row vectors) form a unitary system. 359 Unitary System A unitary system is a set of complex vect Lj'?,l": áj'Uo .ors satisfying the relationships [o if j+k -l-1 [t if j:k. (6) 360 CHAP. 8 Linear Atgebra: Matrix Eigenvalue Problems p R o o F The proof is the same as that of Theorem 3 in Sec. 8.3, except for the bars required in ^T A-1 o_Á in (A\ onÁ t'Á\ nf fhe nrecr lA : A-1 and in (4) and (6) of the present section, THEoREM 4 Determinant of a Unitary Matrix Let A, be a unitary matrix. Then its determinant has absolute value one, that is, |det n| : 1. PRooF Similarly as in Sec. 8.3 we obtain 1 : det (AA-1) : det (,rÁ') : det A det ÁT : det A det Á : det A det A : |det A|2. Hence |det A| : 1 (where det A may now be complex), l ExAMPLE 4 Unitary Matrix tllustrating Theorems lc and 2-4 Forthevectors aT:12 _l]and5r:11 + l 4i]we getáT :|2 i]TandaTb:2(I + i)_ 4: _2+2i and with [o.si 0 6 lA:| l Lo.o 0.8l] also ^" : [;] T-0.8 + 3.2if and Ab:| l. L-2.6 + 0.6i _] as one can readily verify. This gives (ÁDTnn -- -2 * 2i, illustrating Theorem 2. The matrix is unitarY. Its columns form a unitary system, ár-u, : -0.8i,0.8l + 0,62 : I, ár'u2: -0,8',0,6 + 0,6,0,8i : 0, -ur' ur : 0.62 + (-0.8,)0,8, : 1 and so do its rows. Also, detA : -1. The eigenvalues are 0.6 + 0.8' and -0.6 + 0.8i, with eigenvectors t1 1]T and [1 - 1]T, respectively. - l Theorem 2 in Sec. 8.4 on the existence of an eigenbasis extends to complex matrices as follows. THEoREM 5 Basis of Eigenvectors A Hermitian, skew-Hermitian, or unitary matrix has a basis of eigenvectors for C" that is a unitary system. For a proof see Ref. [B3], vol. 1 , pp. 270_2]2 and p. 244 (Definition 2). EXA M P LE 5 Unitary Eigenbases The matrices A, B, C in Example2have the following unitary systems of eigenvectors, as you should verify. 1 A: ftl - zi 5]T (,\ : 9), l ftl - zi -2]T (^: 2) 1 - I -- B: ftl - zi -5]T (l : -2i), .,fu,, l + 2ilT (t : 4i) l - . r l ..-. C: ut l' IIT rn : }tr * v6ll. 6l - llT (^: +u - rÁll, l SEC. 8.5 Complex Matrices and Forms. Optional Hermitian and Skew-Hermitian Forms The concept of a quadratic form (Sec. 8.4) can be extended to complex. We call the numerator XTAx in (1) a form in the components .tr1, . . , xlt of x, which may now be complex. This form is again a sum of n2 terms ,lL fL j_7 k:I : ar171x1+ " , l alnílxn -| a2ní2Xn* a2lí2x1 * l an ínx1l l annxnxn. A is called its coefficient matrix. The form is called a Hermitian or skew_Hermitian form if A is Hermitian or skew-Hermitian, respectively. The value of a Hermitian form is real, and that of a skew-Hermitian form is pure imaginary or zero. This can be seen directly from (2) and (3) and accounts for the importance of these forms in physics. Note that (2) and (3) are valid for any vectors because in the proof of (2) and (3) we did not use that x is an eigenvector but only that xTx is real and not 0. EXAMPLE 6 Hermitian Form For A in Example 2 anď, say, x : [1 + l 5i]T we get x|Ax:Ll_i _5ij [ 4 l -3il[l +il [+tt +1)+(1_3l),5,1 L,*r, , ]L5i _.l :[l -i -si] Lrl +3ixl +t)*r.r,_] :223' l Clearly, if A and x in (4) are real, then (7) reduces to a quadratic form, as discussed in the last section. 1. (Verifrcation) Verify the statements in Examples 2 and 3. 2. 
(Product) Show 6Áf : -AB for A and B in Example 2. For any n X n Hermitian A and skew-Hermitian B. 3. Show that 1ABC;T : -C-IBA for any n x n Hermitian A, skew-Hermitian B, and unitary C. 4. (Eigenvectors) Find eigenvectors of A, B, C in Examples 2 and 3. g. E ElGENvALuEs AND ElGENvEcToRs Are the matrices in Probs. 5-11 Hermitian? SkewHermitian? Unitary? Find their eigenvalues (thereby verifying Theorem 1) and eigenvectors. 36l (7) il Yl -i: ;] \,5 ) tllr lv2 7. l |,l _ LÝz 5 |:, ,) 1-0 6.Í Lzi 00l I 0 .51 l I 5i 0_.] I+i 0 I-i 10.,I ,:,] I |,: , ,,Ll:i] CHAP. 8 Linear Algebra: Matrix Eigenvalue Problems 12. PROJECT. Complex Matrices (a) Decomposition. Show that any square matrix may be written as the sum of a Hermitian and a skew-Hermitian matrix. Give examples, (b) Normal matrix. This important concept denotes a matrix that commutes with its conjugate transpose, Aď : Á'A. P.oue that Hermitian, skew-Hermitian, and unitary matrices are normal, Give corresponding examples of your own. (c) Normality criterion. Prove that A is normal if and only if the Hermitian and skew-Hermitian matrices in (a) commute. (d) Find a simple matrix that is not normal, Find a normal matrix that is not Hermitian, skew-Hermitian, or unitary. (e) Unitary matrices. Prove that the product of two unitary n X n matrices and the inverse of a unitary matrix are unitary. Give examples, (f) Powers of unita,ry matrices in applications may sometimes be very simple. Show that C12 : I in Example 2. Find further examples. @ coMpLEx FoRMs -sthe given matrix (call it A) Hermitian or skew_Hermitian? Find íTAx.(Show all the details.) a, b, c, k are real 16. (Pauti spin matrices) Find the eigenvalues and eigenvectors of the so-calledPauli spinmatrices and show Ň SrS, : iS", SgS" : -iS, S"2 - Sr' : Sr2 : I, where ,. [-,,, ;,] ,": [1 ]:] ,o. |, _,,, *o',f, - : [;;] ",["_, ,l,] ,-: [;,] ,": [: ;] ',: [: ..: [; :] ;] 1. In solving an eigenvalue problem, what is given and what is sought? 2, Do there exist square matrices without eigenvalues? Eigenvectors corresponding to mofe than one eigenvalue of a given matrix? 3. What is the defect? Why is it important? Give examples, 4. Can a complex matrix have real eigenvalues? Real eigenvectors? Give reasons. 5. What is diagonalization of a matrix? Transformation of a form to principal axes? 6. What is an eigenbasis? When does it exist? Why is it important? 7. Does a 3 X 3 matrix always have a real eigenvalue? 8. Give a few typical applications in which eigenvalue problems occur. E DlAGoNALlzATloN Find an eigenbasis and diagonalize. (Show the details,) t l0l 121 9. l l L-l++ - l03_.l |- I4.4 10. l L-tt.z -11.21 102.6_] 11. |-14 10 l L-,o i t.] |- 15 4 1.10 l L-tz -2 1],^: ,-12. z1 ;l -s] ]l ^:, ,] t5 3 13.|, ? L-. -a ,.L;,: _.,_-- Summary of Chapter 8 @ slMlLARlTy Verify that A and Á : Here, A, P are: 15. ['' 241 |l Lz.+ o.zl'Lz P-IAP have the same spectrum. 17[1 ',,_:] [i 1| 363 Transformation to Canonical Form. Reduce the quadratic form to principal axes. 18. 11.56xr2 + 2o.I6x 2 -| I7.44x22: 100 19. 1.09x12 - 0.06xrx2 l l.\lxr2 : 1 20. I4x12 l 24xp2 - 4xr2 : 29 i] [ :| "|',1 ::;] The practical importance The problems are defined (1) of matrix eigenvalue problems can hardly be overrated. by the vector equation Ax : ),x. A is a given square matrix. A1l matrices in this chapter ate square. ,tr is a scalar. 
To solve the problem (1) means to determine values of λ, called eigenvalues (or characteristic values) of A, such that (1) has a nontrivial solution x (that is, x ≠ 0), called an eigenvector of A corresponding to that λ. An n × n matrix has at least one and at most n numerically different eigenvalues. These are the solutions of the characteristic equation (Sec. 8.1)

(2)    $$D(\lambda) = \det(A - \lambda I) = \begin{vmatrix} a_{11}-\lambda & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22}-\lambda & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn}-\lambda \end{vmatrix} = 0.$$

D(λ) is called the characteristic determinant of A. By expanding it we get the characteristic polynomial of A, which is of degree n in λ. Some typical applications are shown in Sec. 8.2.

Section 8.3 is devoted to eigenvalue problems for symmetric (Aᵀ = A), skew-symmetric (Aᵀ = −A), and orthogonal matrices (Aᵀ = A⁻¹). Section 8.4 concerns the diagonalization of matrices and the transformation of quadratic forms to principal axes and its relation to eigenvalues. Section 8.5 extends Sec. 8.3 to the complex analogs of those real matrices, called Hermitian (Āᵀ = A), skew-Hermitian (Āᵀ = −A), and unitary matrices (Āᵀ = A⁻¹). All the eigenvalues of a Hermitian matrix (and a symmetric one) are real. For a skew-Hermitian (and a skew-symmetric) matrix they are pure imaginary or zero. For a unitary (and an orthogonal) matrix they have absolute value 1.
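As a closing numerical illustration of this summary (a sketch only, assuming NumPy), the eigenvalues of the matrix from Example 1 of Sec. 8.4 can be obtained either as roots of the characteristic polynomial (2) or directly, and each eigenpair can be checked against the defining equation (1):

```python
import numpy as np

A = np.array([[5.0, 3.0],
              [3.0, 5.0]])       # symmetric, so real eigenvalues are expected

# Route 1: roots of the characteristic polynomial det(A - lambda*I) = 0
coeffs = np.poly(A)              # characteristic polynomial coefficients
print(np.roots(coeffs))          # 8 and 2

# Route 2: eigenvalues and eigenvectors directly
lam, X = np.linalg.eig(A)
print(lam)                       # 8 and 2 (order may vary)

# Check the defining equation Ax = lambda*x for each eigenpair
for j in range(len(lam)):
    assert np.allclose(A @ X[:, j], lam[j] * X[:, j])
```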