PART B
Linear Algebra. Vector Calculus

CHAPTER 7   Linear Algebra: Matrices, Vectors, Determinants. Linear Systems
CHAPTER 8   Linear Algebra: Matrix Eigenvalue Problems
CHAPTER 9   Vector Differential Calculus. Grad, Div, Curl
CHAPTER 10  Vector Integral Calculus. Integral Theorems

Linear algebra in Chaps. 7 and 8 consists of the theory and application of vectors and matrices, mainly related to linear systems of equations, eigenvalue problems, and linear transformations. Linear algebra is of growing importance in engineering research and teaching because it forms a foundation of numeric methods (see Chaps. 20-22), and its main instruments, matrices, can hold enormous amounts of data (think of a net of millions of telephone connections) in a form readily accessible by the computer.

Linear analysis in Chaps. 9 and 10, usually called vector calculus, extends differentiation of functions of one variable to functions of several variables; this includes the vector differential operations grad, div, and curl. It also generalizes integration to integrals over curves, surfaces, and solids, with transformations of these integrals into one another by the basic theorems of Gauss, Green, and Stokes (Chap. 10).

Software suitable for linear algebra (Lapack, Maple, Mathematica, Matlab) can be found in the list at the opening of Part E of the book if needed.

Numeric linear algebra (Chap. 20) can be studied directly after Chap. 7 or 8 because Chap. 20 is independent of the other chapters in Part E on numerics.

CHAPTER 7
Linear Algebra: Matrices, Vectors, Determinants. Linear Systems

This is the first of two chapters on linear algebra, which concerns mainly systems of linear equations and linear transformations (to be discussed in this chapter) and eigenvalue problems (to follow in Chap. 8).

Systems of linear equations, briefly called linear systems, arise in electrical networks, mechanical frameworks, economic models, optimization problems, numerics for differential equations (as we shall see in Chaps. 21-23), and so on.

As main tools, linear algebra uses matrices (rectangular arrays of numbers or functions) and vectors. Calculations with matrices handle matrices as single objects, denote them by single letters, and calculate with them in a very compact form, almost as with numbers, so that matrix calculations constitute a powerful "mathematical shorthand."

Calculations with matrices and vectors are defined and explained in Secs. 7.1-7.2. Sections 7.3-7.8 center around linear systems, with a thorough discussion of Gauss elimination, the role of rank, the existence and uniqueness problem for solutions (Sec. 7.5), and matrix inversion. This also includes determinants (Cramer's rule) in Sec. 7.6 (for quick reference) and Sec. 7.7. Applications are considered throughout this chapter. The last section (Sec. 7.9), on vector spaces, inner product spaces, and linear transformations, is more abstract. Eigenvalue problems follow in Chap. 8.

COMMENT. Numeric linear algebra (Secs. 20.1-20.5) can be studied immediately after this chapter.

Prerequisite: None.
Sections that may be omitted in a shorter course: 7.5, 7.9.
References and Answers to Problems: App. 1 Part B, and App. 2.

7.1 Matrices, Vectors: Addition and Scalar Multiplication

In this section and the next one we introduce the basic concepts and rules of matrix and vector algebra. The main application to linear systems (systems of linear equations) begins in Sec. 7.3.
7.1 Matrices, Vectors: Addition and Scalar Multiplication A matrix is a rectangular affay of numbers (or functions) enclosed in brackets. These numbers (or functions) are called the entries (or sometimes the elements) of the matrix. For example, t,; -,o, ,:] 'o','f ' are matrices. The first matrix has two rows (horizontal lines of entries) and three columns (vertical lines). The second and third matrices are square matrices, that is, each has as many rows as columns (3 and 2, respectively). The entries of the second matrix have two indices giving the location of the entry. The first index is the number of the row and the second is the number of the column in which the entry stands. Thus, a2g (read a two three) is in Row 2 and Column 3, etc. This notation is standard, regardless of whether a matrix is square or not. Matrices having just a single row or column are called vectors. Thus the fourth matrix in (1) has just one row and is called a row vector. The last matrix in (1) has just one column and is called a column vector. We shall see that matrices are practical in various applications for storing and processing data. As a first illustration let us consider two simple but typical examples. E XA M P LE l Linear Systems, a Maior Application of Matrices In a system of linear equations, briefly called a linear system, such as 4x; -| 6x2 -| 9xg: 6 6xt - Zxg:29 5x1-8x2* x3:10 the coefficients of the unknowns xy x2, x3 zíe the entries of the coefficient matrix, call it A, is obtained by augmenting A by the right sides of the linear system and is called the augmented matrix of the system. In A the coefficients of the system zrre displayed in the pattern of the equations. That is, their position in A corresponds to that in the system when written as shown. The same is true for Á. We shall see that the augmented matrix Á contains all the information about the solutions of a system, so that we can solve a system just by calculations on its augmented matrix. We shall discuss this in great detail, beginning in Sec. 7.3. Meanwhile you may verify by substitution that the solution is -r1 : 3, x2: +, 1 /Q - l. The notation xy x2, "r3 for the unknowns is practical but not essential; we could choose x, j, z or some other 1etters. fan atz asf I o^ azz orrl. Lr, asz ,rr_] |o, a2 as], t;] f e-" lru- (l) ^[l:;] Thematrix ^: [i : : j] tr 274 ExAMPLE 7 CHAP.7 Linear Algebra: Matrices, Vectors, Determinants, Linear Systems Sales Figures in Matrix Form Sales figures for three products I, II, m in a store on Monday (M), Tuesday (T), , , ,may for each week be arranged in a matrix [ +oo 330 810 0 210 A: l o Lzo 78o 500 5o0 I L roo 0 0 27o 43o M 4701 I I 960 l II 780] III If the company has ten stores, we can set up ten such matrices, one for each store. Then bY adding corresPonding entries of these matrices we can get a matrix showing the total sales of each product on each daY, Can You think of other data for which matrices are feasible? For instance, in transportation oi storage problems? or in recording phone calls, or in listing distances in a network of roads? l Generat Concepts and Notations We shal| denote matrices by capital boldface letters A, B, C, , , , , or by writing the general entry in brackets; thus A : |ait"], and So on. By an m X n matrix (read m by n matrix) we mean a matrix with m.owš ána n columns-rows come alwaYs first! 
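On a computer such matrices are stored as two-dimensional arrays, with the row index written first, just as in the double-subscript notation. The following minimal sketch uses Python with NumPy (one possible tool; the text's list of suitable software names Lapack, Maple, Mathematica, and Matlab). It sets up the coefficient matrix and the augmented matrix of Example 1 and checks the solution x1 = 3, x2 = 1/2, x3 = -1 quoted there by substitution; the right side of the second equation is taken as 20, the value consistent with that solution.

import numpy as np

# Coefficient matrix A of the linear system in Example 1:
#     4x1 + 6x2 + 9x3 =  6
#     6x1       - 2x3 = 20
#     5x1 - 8x2 +  x3 = 10
# The entry in Row j and Column k is addressed as A[j-1, k-1] (array indices start at 0).
A = np.array([[4.0,  6.0,  9.0],
              [6.0,  0.0, -2.0],
              [5.0, -8.0,  1.0]])

b = np.array([6.0, 20.0, 10.0])        # the right sides
A_tilde = np.column_stack([A, b])      # augmented matrix: A with b appended as a column
print(A_tilde.shape)                   # (3, 4): three rows, four columns
print(A[1, 2])                         # -2.0, the entry in Row 2, Column 3

x = np.array([3.0, 0.5, -1.0])         # the solution quoted in Example 1
print(A @ x)                           # [ 6. 20. 10.]: substitution reproduces b
print(np.allclose(A @ x, b))           # True

Obtaining such a solution systematically by operating on the augmented matrix is the subject of Sec. 7.3.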
m X n is called thesizeofthematrix.ThusanmXnmatrixisoftheform |':"!^ (2) 6 : |a7"7 -- atz azz atn2 I Thematricesin(l)areof sizes 2x3,3 X 3,2x2,1 X 3, anď2 X 1,respectively, Each entry in (2) has two subscripts. rn" first is the row number and the second is the column nwmber. Thus ct27 is the entry in Row 2 and Column 1, If m : n, wecall A an n X n ,qou." matrix. Then its diagonal containing the entries a11, a22, . . . , onn ircalled the main diagonal of A, Thus the main diagonals of the two ,áu*"Luffices in (1) zía a11, a22, a33 aíd e-*,4x, respectively, Square matrices *" purtiŇlarly important, as we shall see, A matrix that is not square is called a rectangular matrix, Vectors A vector is a matrix with only one row or column. Its entries are called the components of the vector. We shall denote vectors by lowercase boldface letters &' b' ' ' ' or bY its general component in brackets, a -_ |or]i,, and so on, our special Vectors in (1) suggest hut u (general) row vector is of the form Forinstance, a:|-2 5 0,8 0 1], , anf.u -- |o, a2 ____1 SEC. 7.1 Matrices, Vectors: Addition and Scalar Multiplication A column vector is of the form b- bI b2 : brn DEFlNlTloN 275 For instance, ,:Il] Matrix Addition and Scalar Multiptication What makes matrices and vectors really useful and particularly suitable for computers is the fact that we can calculate with them almost as easily as with numbers. Indeed, we now introduce rules for addition and for scalar multiplication (multiplication by numbers) that were suggested by practical applications. (Multiplication of matrices by matrices follows in the next section.) We first need the concept of equality. Equality of Matrices Two matrices A : |oio] and B : Ibio] are equal, written A : B, if and only if they have the same size and the corresponding entries aíeequal, that is, all : b11, a12 : bI2, and so on. Matrices that are not equal are called different. Thus, matrices of different sizes are always different. EXAMPLE 3 Equalityof Matrices Let Then DEFlNlTloN [; :] [i :] f+ ll L, ,_] t;;:] [; if and onlv if The following matrices are all different. Explain! As a special case, the sum a have the same number of components. f4 0l B: I l [: -r_] : 4, aI2: 0, : 3, azz: -1. fol A:I Lo^ o,'r'rf aIl a2I 1 4 'l .z) Addition of Matrices The sum of two matrices L : |airc-] and B : |b7"] of the same size is written A + B and has the entries ai1x l bip obtained by adding the corresponding entries of A and B. Matrices of different sizes cannot be added. i b of two row vectors or two column vectors, which must components, is obtained by adding the corresponding 276 CHAP.7 Linear Algebra: Matrices, Vectors, Determinants. Linear Systems EXAMPLE 4 Addition of Matrices and Vectors lf ^:[-o u '-] and u:['-l o] . L o l 2) L: l 0]' A in Example 3 and our present A cannot be added. If a a*b:t-l 9 2l. An application of matrix addition was suggested in Example 2. |-t then A+R:I L: :[5 7 2]andb: Many others will fbllow. 5 2 t-6 ]l ,) 2 0], then l DEFlNlTloN Scalar Multiplication (Multiplication by a Number} The product of any m X n matrix A : |a7"] and any scalar c (number c) is written cA and is the m X n matrix cA : |caq") obtained by multiplying each entry of A by ,. Here (-1)A is simply written -A and is called the negative of A. Similarly, (-k)A is written -kA. Also, A + (-B) is written A - B and is called the difference of A and B (which must have the same size!). EX A M P LE 5 Scalar Multiplication Il ^:['J jll then -A-|:' jll +^:[; il .^:[: :l Ln,o -4 5_] L-n o 4 5.] 
' L ,o -,] L. ..] If a matrix B shows the distances between some cities in miles, 1,6098 gives these distances in kilometers. I Rules for Matrix Addition and Scalar Multiplication. From the familiar laws for the addition of numbers we obtain similar laws for the addition of matrices of the same size m X n, namely, (a) (b) (c) (d) A-lB:B (A+B)+C:A A*O:A A+(-A):0. +A +(B+C) (3) (writtenA+B+C) Here 0 denotes the zero matrix (of size m X n), that is, the m X n mattix with all entries zero. (The last matrix in Example 5 is a zero matrix.) Hence matrix addition is commutative anď associative [by (3a) and (3b)]. Similarly, for scalar multiplication we obtain the rules (4\ (a) c(A + B) (b) (c + ft)A (c) c(kA) (d) 1A :cAfcB :cA+kA : (ck)A -A. (written ckA) E Let SEC. 7.1 Matrices, Vectors: Addition and Scalar Multiplication ADD|T!oN AND scALAR MULTIPLICATIoN OF MATRICES AND VECTORS A: Find the following expressions or give íeasons why they are undefined. 1. C + D, D + C,6(D - C), 6C - 6D 2. 4C, 2D, 4C + 2D, 8C - 0D 3. A + C - D, C - D,D - C,B + 2C + 4D 4. 2(^ + B), 2^ + 2B, 5A - ln,A + B + C 5. 3C - 8D, 4(3A), (4.3)^, * - -ro 6. 5A - 3C, A - B t D, 4(B - 6A), 4B * 24^ 7. 33u, 4v ]- 9u, 4(v -f 2.25u), u - v 8. A + a, I2u f 10v, 0(B - v), 0B * u 9. (Linear system) Write down a linear system (as in Example 1) whose augmented matrix is the matrix B in this problem set, 10. (Scalar multiplication) The matrix A in Example 2 shows the numbers of items sold. Find the matrix showing the number of units sold if a unit consists of (a) 5 items, (b) 10 items? 11. (Double subscript notation) Write the entries of A in Example 2 in the general notation shown in (2). 12. (Sizes, diagonal) What sizes do A, B, C, D, u, v in this problem set have? What are the main diagonals of A and B, and what about C? 13. (Equality) Give reasons why the five matrices in Example 3 are different. 14. (Addition of vectors) Can you add (a) row vectors whose numbers of components are different, (b) a row and a column vector with the same number of components, (c) a vector and a scalar? (General rules) Prove (3) and (4) for general 3 X 2 matrices and scalars c and k. TEAM PROJECT. Matrices in Modeling Networks. Matrices have various applications, as we shall see, in a form that these problems can be efficiently handled on the computer. For instance, they can be used to characíerize connections in electrical networks, in nets of roads, in production processes, etc., as follows. (a) Nodal incidence matrix. The network in Fig. 152 consists of 5 branches or edges (connections, numbered I, 2, , , ,, 5) and 4 nodes (points where two or more branches come together), with one node being grounded. We number the nodes and branches and give each branch a direction shown by an arrow. This we do arbitrarily. The network can now be described by a "nodctl incidence ntatrix" L : |a3p), where Í+l r branch k leaves node OI ajk: ]-r irbranchkentersnode OI I O ir branch /c does not touch O Show that for the network in Fig, 152 the matrix A has the given form Branch Node @ Node @ Node @ Node @ Fig. l52. Network and nodal incidence matrix in Team Project 16(a) (b) Find the nodal incidence matrices of the networks in Fig. 153. l5. 16. i^] -:[ ; ,,^:] 1],:[_l r] l]":L1.1] ":I [: 1-1 _1 0 0-1 0 1 0 , ,l 0 0 1 0 _1 | 100_1 0_] 277 Linear Algebra: Matrices, Vectors, Determinants. 
Linear Systems -,:I + 1 if branch k is in mesh I i l and has the same orientation - 1 if branch k is in mesh l-y--l and has the opposite orientation 0 if branch k is not in mesh E o oi J 1@: Fig.153. Networks in Team Project 16(b) (c) Graph the three networks corresponding to the nodal incidence matrices and a mesh is a loop with no branch in its interior (or in its exterior). Here, the meshes are numbered and directed (oriented) in an arbitrary fashion. Show that in Fig. 154 the matrix M corresponds to the given figure, where Row 1 corresponds to mesh 1, etc. 10_].0 001-1 -1 10]_ 0100 Network and matrix M in Team Project 16(d) (e) Number the nodes in Fig. 154 from left to right 1, Z, 3 and the low node by 4. Find the corresponding nodal incidence matrix. |-l-,-,l t;::]|.0-,1,1:l;lL-, 1 0 0] L; ;:] tl l t 0 0.1 I o -t 0 0 -| ,| |-r 0 0 1 l 0| L 0 0 -l -l 0 -l_] (d) Mesh incidence matrix. A network can also be characterized by the mesh incidence matrixM : |m7"7, where .l ;I 1_] Ir lo u=l lo L, Fig. l54. 7.2 Matrix Muttiplication Matrix multiplication means multiplication of matrices by matrices. This is the last algebraic operation to be defined (except for transposition, which is of lesser imPortance). Now matrices are addedby adding coíTesponding entries.Inmultiplication, do we multiplY colTesponding entries? The answer is no. Why not? Such an operation would not be Of much use in applications. The standard definition of multiplication looks artificial, but wi1l be fully motivated later in this section by the use of matrices in "linear transformations," by which this multiplication is suggested. SEC. 7.2 Matrix Multiplication DEFlNlTloN 279 Multiplication of a Matrix by a Matrix The product C : AB (in this order) of an m X nmatrix A : |a4"ftimes an r X p matrix B : |b7"] is defined if and only if r : n and is then the m X p matrix C : |circ] with entries (1) Cjk : 3 orrro : ailbut. * aizba"+''' * a3nbntt j: l,"',ffi k:I,"',P.L_I \ The condition r : n mearIs that the second factor, B, must have as many rows as the first factor has columns, namely n. As a diagram of sizes (denoted as shown): AB:C ímXn]lnXr]:|mXr]. cipin (1) is obtained by multiplying each entry in the7th row of A by the corresponding entry in the kh column of B and then adding these n products. For instance, czt : aztbr. + azzbzt + " , + a2nbn1, and so on. One calls this briefly a "multiplication of rows into columrzs." See the illustration in Fig. 155, where n : 3. }-. n=3 p=2 p=2 ( fo,t a12 o,.l l-ó,, brr] l-.,, .,rl l l o,, 0,, azg l l b^ brrl = | ,r, ,r, l .=o1l"., ,:; o',", ll u,',;,:l-|"',:, ":;l I L';; ao2 "o._.] ' J L"o, 'o,_) Fig.l55. Notations in a product AB : C EX A M P L E 1 Matrix Multiplication Herec11 :3.2 + 5.5 + (-1),9:22, and soon. Theentryintheboxis c23: 4,3 + 0,7 + 2,I: The product BA is not defined. EXAMPLE 2 Multiplication of a Matrix and aVector 14. l [i ;] t:] : [i ]-; :] : ti]] whereas 3 6 1[i] :,,,, [;] ,, 6 1 [,1 ,,:,| l t;] ti ;] is undefined I EXAMPLE 3 Products of Row and Column Vectors 28o CHAp. 7 Linear Algebra: Matrices, Vectors, Determinants. Linear Systems ExAMpLE 4 cAUTloN! Matrix Multiplication ls Not Commutative, AB + BA in General This is illustrated by Examples 1 and 2, where one of the two products is not even defined, and by ExamPle 3, where the two products have different sizes. But it also holds for square matrices. 
For instance, [, ,l[- |l:[, ol but t ,lI l:t99::lL,oo I00_] L | -l_] Lo 0_] L r -r_] Lroo lo0] L-gs -99] ItisinterestingthatthisalsoshowsthatAB:0 T, and 0.5 for T ---> N, hence 0.5 for T ---> T. If today there is no trouble, what is the probability of N two days after today? Three days after today? 27. (Profit vector) Two factory outlets F1 and F2 in New York and Los Angeles sell sofas (S), chairs (C), and tables (T) with a profit of $110, $45, and $80, respectively. Let the sales in a cerlain week be given by the matrix sCT [600 400 l00l F1 A: I l L:oo 820 2o5 _) F2 Introduce a "profit vector" p such that the components of v : Ap give the total profits of F, and F2. 28. TEAM PROJECT. Special Linear Transformations. Rotations have various applications. We show in this project how they can be handled by matrices. (a) Rotation in the plane. Show that the linear transformation y : Ax with matrix is a counterclockwise rotation of the Cartesian xlxrcoordinate system in the plane about the origin, where 0 is the angle of rotation. (b) Rotation through n0. Show that in (a) Is this plausible? Explain this in words. |-cos 0 -sin 0l [",l A:| l and x:| l, I sin d cos áJ L*r_] [yrl y: I l Ly,_] fcos * -sin al [cos É-sin É-.l I rin * .o, o] [ sin B .o, B_.] : [.o, (a + B\ -sin (" + rll Isin(a+P cos(a+B)] [ :i; .:};] []: : .J;] [:lí .Ti |][-cos n0 -sin n0l An:l l L sin n9 cos n0_.] 7.3 Linear Systems of Equations. Gauss E[imination The most important use of matrices occurs in the solution of systems of linear equations, briefly called linear systems. Such systems model various problems, for instance, in frameworks, electrical networks, traffic flow, economics, statistics, and many others. In this section we show an important solution method, the Gauss elimination. General properties of solutions will be discussed in the next sections. 287 (c) Addition formulas for cosine and sine. By geometry we should have Derive from this the addition formulas (6) in App, A3.1. (d) Computer graphics. To visualize a threedimensional object with plane faces (e.g., a cube), we may store the position vectoís of the vertices with respect to a suitable xp2x3-coordinate system (and a list of the connecting edges) and then obtain a twodimensional image on a video screen by projecting the object onto a coordinate plane, for instance, onto the ;rrxr-plane by setting x, : 0. To change the appearance of the image, we can impose a linear transformation on the position vectors stored. Show that a diagonal matrix D with main diagonal entries 3,I,+, gives from an x : [xr] the new position vector y : Dx, where )r : 3íl(stretch in the xl-direction by a factor 3), yz : x2 (unchanged), y3 : Lr" (contraction in the x3-direction), What effect would a scalar matrix have? (e) Rotations in space. Explain y : Ax geometrically when A is one of the three matrices what effect would these transformations have in situations such as that described in (d)? ____- @- 20x, :39 0:0 25 -20 pivot l0 -------------- [; Eliminate rr- L: 1 25 -20 0 -l l| 0l l0 zsl, 90 l 0 -95|-'nol *o*3-3Row2 il0 0l 0] 0-{1 - ^2T ,t3 - 10x2-| 25xr: 90 - 95x3 : _190 0: 0 x3:i3:2IA] xr: s(lo - 25xg) : iz: 4 LA] XI: X2 - xg * i1.: 2 [A] where A stands for "amperes." This is the answer to our problem. The solution is unique. l :::_L Pivot l ------------->@- xzl x3 : 0 Eliminate ------> + x2- 3: 0 l0x2 * 25x=: 99 -1- 10x2 : 80. |-t -l I |0 0 (3) l lo l0 I L0 30 To eliminate x2, do: Add -3 times the pivot equation to the third equation. 
The result is (4)|i -954: _190 70x2 -| 25x3: 90 J1- xzl í3: 0 Back Substitution. Determination of xg, x2, xí(in this order) Working backward from the last to the first equation of this "triangular" system (4), we can now readily find x3, then x2, dfld then x1: Step 2. Elimination of x2 The first equation remains as it is. We want the new second equation to serve as the next pivot equation. But since it has no,r2-term (in fact, it is 0 : 0), we must first change the order of the equations and the corresponding rows of the new matrix. We put 0 : 0 at the end and move the third equation and the fourth equation one place up. This is called partial pivoting (as opposed to the rarely lsed total pivoting, in which also the order of the unknowns is changed). It gives -1 @ tr 0 THEoREM t CHAP. 7 Linear Algebra: Matrices, Vectors, Determinants. Linear Systems Elementary Row Operations. Row-Equivalent Systems Example 2 illustrates the operations of the Gauss elimination. These are the first two of three operations, which are called Elementary Row Operations for Matrices: Interchan7e oí two rows Addition of a constant multiple of one row to another row Multiplication of a row by a nonzero constant c. CAUTION! These operations are for rows, not for columns! They coffespond to the fo11owing Elementary Operations for Equations: Interchange oí two equations Addition of a constant multiple of one equation to another equation Multiplication of an equation by a nonzero constant c. Clearly, the interchange of two equations does not alter the solution set. Neither does that addition because we can undo it by a coíTesponding subtraction. Similarly for that multiplication, which we can undo by multiplying the new equation by llc (since c * 0), producing the original equation. We now call a linear system ,S, row-equivalent. to a linear system 52 if ,S1 can be obtained from S, by (finitely many!) row operations. Thus we have proved the following result, which also justifies the Gauss elimination. Row-Equivalent Systems .., Row-equivalent linear systems have the same set of solutions. ,,/ Because of this theorem, systems having the same solution sets are often called equivalent systems. But note well that we are dealing with row operations. No column operations on the augmented matrix are permitted in this context because they would generally alter the solution set. A linear system (1) is called overdetermined if it has more equations than unknowns, as in Example 2, determined if m : n, as in Example 1, and underdetermined if it has fewer equations than unknowns. Furthermore, a system (1) is called consistent if it has at least one solution (thus, one solution or infinitely many solutions), but inconsistent if it has no solutions at a11, as ír i x2 : I, x1 l x2: 0 in Example 1. Gauss Elimination: The Three Possible Cases of Systems The Gauss elimination can take care of linear systems with a unique solution (see Example 2), with infinitely many solutions (Example 3, below), and without solutions (inconsistent systems; see Example 4). r SEC. 7.3 Linear Systems of Equations. Gauss Elimination EXAMPLE 3 Gauss Elimination if lnfinitely Many Solutions Exist Solve the following linear systems of three equations in four unknowns whose augmented matrix is 293 I-3.0 2.0 2.0 -5.0 I (5) 10.6 1.5 1.5 -5.4 I L|.2 -0.3 -0.3 2.4 ;ll 2.I _l Thus, @ * 2.0x2-1 2.04 - 5.0xn: 8.0 -l 7.5x2 * 1.5;13 - 5.4xn: 2.1 - 0.3x2 - 0.3x3-1 2.4xa: 2.1. 3.0x1 * 2.0x2-| 2.0xg - 5.0xa: Solution. 
As in the previous example, we circle pivots and box terms of equations and coruesponding entries to be eliminated. We indicate the operations in terms of equations and operate on both equations and matrices. Step 1. Elimination of xlfrom the second and third equations by adding - 0.6i3.0 : -0.2 times the first equation to the second equation, - I.2l3.0: -0.4 times the first equation to the third equation. This gives the following, in which the pivot of the next step is circled. |-3.0 2.0 2.o -5.0 I l 0 l,l 1.1 -4.4 l L 0 -1.1 -1.1 4.4 8.0l 1.1 l Row 2 - 0.2 Row l I - |.l _] Row 3 - 0.4 Row l 8.0l lll a] Row3*Row2 3l . r| Row2-3Rowl 0_] Row3-2Rowl 8.0 1,1 This gives |-3.0 2.0 2,0 -5.0 l (1) l o 1.1 1.1 -4.4 I L0 0 0 0 (6) Step 2. This gives t; ,, ,, Lr24 [': 3.0x1 f 2.0x2 * 2.0ry - 5.0xn: 8.0 1.1x2 * 1.1x3 - 4.4xn: 1.1 0- 0 l] @+ 2x2 * xz: 3 [-f,+ xzl xs:O ll |6xr|+ 2x2 * 4xg: 6. Step 1. Elimination of xlfrom the second and third equations by adding -t times the first equation to the second equation, -3: -Z times the first equation to the third equation. 1 1 3 2 3x1 ,| 2x2,| x3: é**)+ *", :óJ 7 zrr]+ zrr: J a -Z 0. (ilfi)+ I.lxc - 4.4xa F -]- 1.1x3 * 4.4xn: -1.1 Elimination of x2 írom the third equation of (6) by adding 1.1l1.I : 1 times the second equation to the third equation. Back Substitution. From the second equation, x2 : I - xs l 4x4. From this and the first equation, xt : 2 - 14. Since x3 and .T4 remain arbitrary, we have infinitely many solutions. If we choose a value of .í3and a value of x4, then the corresponding values of x1 and x2 zíe-uniquely determined. On Notation. If unknowns remain arbitrary, it is also customary to denote them by other letters t1, t2, , , , . Inthisexamplewemaythuswrite xt:2 2- tz,x2: I --rs * 4*4: I - tt+ 4t2,x3: /1 (first arbitrary unknown), xq: t2 (second arbitrary unknown). l EXAM PLE 4 Gauss Elimination if no Solution Exists What will happen if we apply the Gauss elimination to a linear system that has no solution? The answer is that in this case the method will show this fact by producing a contradiction. For instance, consider dtt alz 294 CHAp. 7 Linear Al6ebra: Matrices, Vectors, Determinants. Linear Systems Step 2. Elimination of x2 from the third equation gives 21 -1 1 00 ,':) Row3-6Row2 3x1 -l 2x2* J3: 3 - *r, + i,r: -2 0: 12. The false Statement o : 12 shows that the System has no solution. l Row Echelon Form and lnformation From lt At the end of the Gauss elimination the form of the coefficient matrix, the augmented matrix, and the system itself are called the row echelon form. In it, rows of zeros, if present, are the 1ast rows, and in each nonzero row the leftmost nonzero entrY is farther io the right than in the previous row. For instance, in Example 4 the coefficient matrix and its augmented in row echelon form are Note that we do not require that the leftmost nonzero entries be 1 since this would have no theoretic or numeric advantage. (The so-called reduced echelon form, in which those entries are I, will be discussed in Sec. 7.8.) At the end of the Gauss elimination (before the back substitution) the row echelon form of the augmented matrix will be ;l ,;) t; 'r;l and t; 'r L; ; ;] L. 0 0 ló, ',T, I |' |-l$_ i,!'-' I I ', b* o,.In c^zn : Ll"ILrr lvrll(8) Here, rš ntanďa1 * O, c22+ O,. ,kr, * 0, and a1l the entries inthe blue triangle as well as in the blue rectangle are zero. 
From this we see that with resPect to solutions of the system with augmented matrix (8) (and thus with respect to the originally given system) there are three possible cases: (a) Exactly one solution tf r : n andĎr*l" ,Ďrn,if present, are zeío. To get the solution, solve the nth equation corresponding to (8) (which is knnxn : Ď) for xr, then the (n - 1)st equation for xn_1, and so on up the line. See Example2, where r : n : 3 andm:4. (b) Inftnitely many solutions if r < n anďĚr*t, , , , ,Ď-, if present, aíe zero. To obtain any of these solutions, choose values of xr*l, , , , , xnarbitrarily. Then solve the fih equation for x,, then the (r - 1)st equation for xr_', and so on up the line. See ExamPle 3. (c) 1o solutiontf r +",+ Clry@) Z(2): c2lv,rr-| CzzyQ) +",+ C2ry(:r) :::: a(rn): CmtY(t)l CrrzY +... * C-rYe). These are vector equations for rows. To switch to columns, we write (3) in terms of components as n such systems, with k : I, . . ., h, alk: CllU'1"l CtzUzk +...+ CIrUrk (4) azk: Czt.Uu"t Czz.t)zk +",+ czr.Urk :::: aynk: CrnlUyr -| CmzU2k + " ' l Crn Urk and collect components in columns. Indeed, we can write (4) as f alxf l-.,,l f crr1 l-c,,'l I | ,,ol l Cztl I,,,l l ,,,,ll( (5) |.- |:r,o| ,'|+r*|-'.." |+ lu,kl ','I I , ,--.] L.-,] l ua" L.-,] L.-, ]|a where k -- 1,, " , n. Now the vector on the left is the kth column vector of A. We see that each of these n columns is a linear combination of the same r columns on the right. Hence A cannot have more linearly independent columns than rows, whose number is rank A : r. Now rows of A are columns of the transpose AT. For AT our conclusion is that AT cannot have more linearly independent columns than rows, so that A cannot have more linearly independent rows than columns. Together, the number of linearly independent columns of A must be r, the rank of A. This completes the proof. l EXAMPLE 4 lllustration of Theorem 3 The matrix in (2) has rank 2. From Example 3 we see that the first two row vectors are linearly independent and by "working backward" we can verify that Row 3 : 6 Row 1 -j Row 2. Similarly, the first two columns are linearly independent, and by reducing the last matrix in Example 3 by columns we find that Column3 :{Column 1 + Column2 and Column 4 : t Column 1 + 2}Column 2, l (3) 300 CHAP. 7 Linear Algebra: Matrices, Vectors, Determinants. Linear Systems Combining Theorems 2 and 3 we obtain THEoREM 4 PRooF Vector Space The following related concepts are of general interest in linear algebra. In the Present context they provide a clarification of essential properties of matrices and their role in connection with linear systems, A vector space is a (nonempty) set V of vectors such that with any two vectors a and b in V al1 their linear combinations a a + Bb (a, F any real numbers) are elements of V, and these vectors satisfy the laws (3) and (4) in Sec. 7.1 (written in lowercase letters a, b, u, . . . , which is our notation for vectors). (This definition is PresentlY sufficient, GeneralvectorspaceswillbediscussedinSec.7.9.) The maximum number of linearly independent vectors in V is called the dimension of nl*i*"":ili3:i!.'"..*"u,,u*ethedimensiontobefinite;infinitedimension A linearly independent set in v consisting of a maximum possible number of vectors in v is called a basis for v. Thus the number of vectors of a basis for v equals dim v, The set of all linear combinations of given vectors 8(1), , , , ?(p) with the same number of components is called the span of these vectors. Obviously, a SPan is a vector Space. 
By a subspace of a vector space V we mean a nonempty subset of V (including V itself; ;1xlFfi,'.,#,",,J"1, :"Jriífi.rJ'íJ.".J.",:ij:i1;.''" algebraic oPerations (addition and EXAMPLE 5 Vector Space, Dimension, Basis The span of the three vectors in Example 1 is a vector space of dimension 2, and a basis is a11;, &i2;, for instance, Or ái1;, i13;, etC. We further note the simple THEoREM 5 The matrix A with those p vectors as row Theorem 3 it has rank A šn 1p, which vectors has p rows and n < p columns; hence by implies linear dependence by Theorem2, l PRooF A basis of n vectors is flrrl : [1 0 ei2;:[0 0 1]. In the case of a matrix A we call the span of the row span of the column vectors the column space of A, vectors the row space of A and the 0],a.21 : t0 10 0],",, l Linear Dependence of Vectors p vectors with n 1 p components are always linearly dependent. Vector Space ť The vector Space Rn consisting of all vectors with n components (n real numbers) has dimension n. SEC.7,4 Linear lndependence. Rank of a Matrix. Vector Space Now, Theorem 3 shows that a matrix A has as many linearly independent rows as columns. By the definition of dimension, their number is the dimension of the row space or the column space of A. This proves THEoREM ó Row Space and Column Space The row space and the column space of a matrix A have the same dimension, equal to rank A. Finally, for a given matrix A the solution set of the homogeneous system Ax : 0 is a vector space, called the null space of A, and its dimension is called the nullity of A. In the next section we motivate and prove the basic relation (6) rank A -l nullity A : Number of columns of A. J 4 5 6 48 84 816 168 0-1 05 50 02 FJol LINEARINDEIENDENCE Are the following sets of vectors linearly independent? (Show the details.) 13. [3 -2 0 4], [5 0 0 1], [-6 1 0 1], 2003] 1,4. LI 1 0], [1 0 0], [1 1 15. [6 0 3 I 4 2],l0 -1 l12 3 0 *I9 8 -11l 16. [3 4 ]1,12 0 3], [8 2 17. |0.2 I.2 5.3 2.8 1.6], l4.3 3,4 0.9 2.0 -4.3l t] : L;o, l,: |: t: Ll the its 10. torS 11. ol sI ul ,) ,:1 ,^) ;l ;] 12. 1] 27 0 5], 3], [5 5 6] @ RANK, RoW spAcE, coLUMN spAcE l- t -z1 1. l. .l L-, . ] |-o -2 3. l, 4 l L55 r0 3 5. l-, 0 L-. 5 |-s 0 l. 2 7. l l+ 0 L.4 |-t 0 l. 5 9. l l: 8 L. -3] : :] 4l: ,o :] 2L{ 1 :':] CHAP. 7 Linear Algebra: Matrices, Vectors, Determinants, Linear Systems 18. t9. t3 2 1],[0 0 [r + + i],B t+áá+] 20. t1 2 3 4),I2 |4561] 0], [4 3 6] + i +l.|+ i + ál. 3 4 5],[3 4 5 6], 21. CAS Experiment. Rank, (a) Show experimentally that the n X nmatrix A,: |ap)with ap: j + k -_I has rank 2 for any n. (Probiem 20 shows n : 4,) Try to prove it. (b) Do the same when ajk : j + k * c, where c is any positive integer, (c) What is rank A, tf a7, - 2j+k-21Try to find other large matrices of low rank independent of n, ,ez_N, PRoPERTlEs oF RANK AND CONSEQUENCES Show the following. 22. rankBTAT : rank AB, (Note the order!) 23. rankA : rank B does not tmply rank A2 : rank 82, (Give a counterexample,) 24. IfA is not square, either the row vectors or the column vectors of A are linearly dependent, 7.5 Solutions of Linear Systems: If the row vectors of independent, so are conversely. a square matrix are linearly the column vectors, and 26. Gtve examples showing that the rank of a product of matrices cannot exceed the rank of either factor, @ vEcToR spAcEs t,'t" giu"n set of vectors a vector space? (Give reason,) If you, Jnr*er is yes, determine the dimension and find a basis. (l)1,1)z,, , , denote components,) 27. Allvectors in R3 such that u1 l u2: 0 28. 
Al1 vectors in Ra such that}u2 - 3ua: k 29. Al|vectors in R3 with ur ž0, u2: -4Ll3 30. A1l vectors in R2 with utš uz 31. A1l vectors in R3 with 4u, l u3 : 0, 3u2: ug 32, Altvectors in Ra with U,l, _ Uz: 0, U3 : 5U1, Uq: o 33. All vectors in R'with ]ui| < 1 for j : I" " ,n 34. Al1 ordered quadruples of positive real numbers 35. All vectofs in R5 with l)1 : 2l)2 : 3l)s : 4uq: 5I)s 36. All vectors in Ra with 3ul - u3 : 0, 2u1 * 3uz - 4ua: O Existence, Uniqueness THEO'R,EM I Rank as just defined gives complete information about existence, uniqueness' and general structure of the solution set of linear systems as follows, A linear system of equations in n ,rÁorn, has a unique solution if F: coefficient matrix and the augmented ma;ix have the same rankn,and infinitelY manY solution if that common rank is less than n. Thesystem has no solution if those two matrices have different rank' Tostatethispreciselyandproveit,weshallusethe(generallyimportant)conceptof a submatrix of A. By this we mean any matrix obtained from A bY omitting some roWS or columns (or both). By definition this includes A itself (as the matrix obtained bY omitting no rows or columns); this is practical, Fundamental Theorem for Linear Systems (a) Existence. A linear system of m equations in n wnknowll'S X1' ' ' " Xn atlXt * al2x2 + , , , l alnXn -- b1 aztXt* a22X2+ "' * a2nXn: bz arnlX1 l arn2x2 + "' * an"nxn: bn (1) 303 is consistent, that is, has solutions, augmented matrix Á, have the same aln AatnL a*n if and only if the cofficient matrix A and the rank. Here, att and Á: att amI (b) Uniqueness. The system (I) has precisely one solution if and only if this common rank r of A and Á, equals n. (c) Infinitely many solutions. If this common rank r is less than n, the system (l) has infinitely many solutions. All of these solutions are obtained by determining r suitable wnknowns (whose submatrix of cofficients must have rank r) in terms of the remaining n - r unknowns, to which arbitrary values can be assigned. (See Example 3 in Sec. 7.3.) (d) Gauss elimination (Sec. 7.3). Iísolutions exist, they can atl be obtained by the Gauss eliminatiott. (This method will automatically reveal whether or not solutions exist; see Sec. 7.3.) PROOF (a) We can write the system (1) in vector form Ax : b or in terms of column vectors C(1),'",C(n)Of A: Crl;-Tr * crrrx2 + ", l crn,Xr.:b. Á is obtained by augmenting A by a single column b. Hence, by Theorem 3 in Sec.'7.4, rank Á equals rank A or rank A + 1. Now if (1) has a solution x, then (2) shows that b must be a linear combination of those column vectors, so that Á and A have the same maximum number of linearly independent column vectors and thus the same rank. Conversely, if rank Á : rank A, then b must be a linear combination of the column vectors of A, say, (2) (2*) b : alc11; i l anc6,, since otherwise rank Á : rank A + 1. But (2*) means that (l) has a solution, namely, xl : Q7, , , , , xn : an, as can be seen by comparing (2*) and (2). (b) If rank A : n, the rz column vectors in (2) are linearly independent by Theorern 3 in Sec. 7.4. We claim that then the representation (2) of b is unique because otherwise C, . . . , iné. Expressing these vectors in terms of the vectors of K and collecting terms, we can thus write the system in the form (3) Ó*r+t,"',Črnrtn; here, j : I,'. . ., r.Since the system has a solution, there &fe }1,,,,,!,satisfying (3). These scalars are unique since K is linearly independent. Choosing *r+L, , , *n fixes the B, and corresponding ij : lj - Fi,where j : l,,,,,T. 
(d) This was discussed in Sec. 7.3 and is restated here as a reminder. l The theorem is illustrated in Sec. ].3. In Example 2 there is a unique solution since rank Á : rank A: n: 3 (as can be seen from the last matrix in the example). In ExamPle 3 we have rank Á : rank A : 2 < n - 4 andcan choosex3 and xn arbitrarily. In ExamPle 4 there is no solution because rank A : 2 < rank Á : 3. Homo8eneous Linear System Recall from Sec. 7.3 that a linear system (1) is called homogeneous if all the bjs are zero, and nonhomogeneous if one or several bj s are not zero. For the homogeneous system we obtain from the Fundamental Theorem the following results. Homogeneous Linear System A homogeneous linear system attXt * al2x2+,,, l alnxn : 0 aztXt l a22X2 +''' * a2nxn : 0 (4) am,7Xt * an2x2 +,,,,| arrnxn : 0 always has the trivial solution x1 : 0, , xtl,: 0. Nontrivial solutions exist if and onty if rankA 1n. If rartkA : r 1n,these solutions, togetherwithx: Orform a vector space (see Sec. 1.Q of dimension ll - T, called the solution Space of (4). In paríicular, if x61 antl xQ) are solution vectors of (4), then x : ctx(1) * c2x21 with any scalars c1 and. c2 is a solution vector oí(4). (This does not hold for nonhomogeneous systems. Also, the term solwtion space is used for homogeneous systems only.) THEoREM z SEC. 7.5 Solutions of Linear Systems: Existence, Uniqueness P R O O F The first proposition can be seen directly from the system. It agrees with the fact that b : 0 implies that rank Á : rank A, so that a homogeneous system is always consistent, If rank A: n, the trivial solution is the unique solution according to (b) in Theorem 1. If rank A 1 n, there are nontrivial solutions according to (c) in Theorem 1. The solutions form a vector space because if x11; and xlr, are any of them, then Ax,1) : 0, Ax12; : 0, and this implies A(xrrl i xrr>) : Axcu i Ax,r; : 0 as well as A(cx11;) : cAx11; : 0, where c is arbitrary. If rank A : r 1 n, Theorem 1 (c) implies that we can choose n - r suitable unknowns, call them xr+l, , , , , xll, in an arbitrary fashion, and every solution is obtained in this way. Hence a basis for the solution space, briefly called a basis of solutions of (4), is yZby D -- o,,, D: ailCi, l a3zC32+ "'* ainCin (j:I,2,",,orn) (3b) D : al"Cro l a2*Car i ", l a6"Cn1" (k : I, 2,, ", or n) Here, Ciu -- ?I)j*kMirc and M,"is a determinant of order n - I, namely, the determinant of the submatrix of A obtained from A by omitting the row and column of the entry a7", that is, the jth row and the kth column. In this way, D is defined in terms of n determinants of order fl - I, each of which is, in turn, defined in terms of n - 1 determinants of order ft - 2, and so on; We finally arrive at second-order determinants, in which those submatrices consist of single entries whose determinant is defined to be the entry itself, From the definition it follows that we may expand D by any row or column, that is, choose in (3) the entries in any row of column, similarly when exPanding the C7"' s in (3), and so on. This delinition is unambiguous, thatis, yields the same value for D no matter which columns or roWS we choose in expanding. A proof is given in App, 4, atz aln azz a2n (1) (3a) r SEC.7.7 Determinants. Cramer's Rule (4a) (4b) ExAMPLE 1 ' : Žr(-I)j*ooioMjo ŤL D : > ?I)j*OoioMjo j_7 |ol atsl Mzz- l l. I os, ass l 309 (j:I,2,, ,,orn) (fr:1,2,...,orn). Terms used in connection with determinants are taken from matrices. 
In D we have n2 entries a7", zlson rows and n columns, and a main diagonal on which a11, a22, , , , , anlt stand. Two terms are new: Mip is called the minor of a7" in D, and C7, the cofactor of a7" in D. For later use we note that (3) may also be written in terms of minors Minors and cofactors of a Third-order Determinant In (4) of the previous section the minors and cofactors of the entries in the first column can be seen directly. For the entries in the second row the minors are latz atsl Mzt: l l. |on assl |ott apl Mzs: l l |osl aszl and the cofactors are C21 : - Mzt, C22 : * Mzz, and C; : - Mzs. Similarly for the third row-write these down yourself. And verify that the signs in C7" form a checkerboard pattern + l+ EXAM PLE 2 Expansions of a Third-Order Determinant ExAMPtE 3 l:;:l:-, l-, 2 5l Inspired by this, can you formulate a little theorem matrices? " Ii ':':l :,r :l-,|,,:l-,|-', , l(l2 - 0) - 3(4 + 4) + 0(0 + 6): -I2. This is the expansion by the first row. The expansion by the third column is :l ,:o|_,, :l .l ; ;l -,l] :l :.-l2+O:-l2 Verify that the other four expansions also give the value - 12. Determinant of a Triangular Matrix l l: :l - -3,4,5: -60. on determinants of triangular matrices? Of diagonal l '=- l 310 cHAp. 7 Linear Algebra: Matrices, Vectors, Determinants. Linear Systems General Properties of Determinants To obtain the value of a determinant (1), we can first simplify it systematically bY elementary row operations, similar to those for matrices in Sec. 7.3, as follows. Behavior of an nth-Order Determinant under Elementary Row Operations (a) Interchange of two rows multiplies the value of the determinant bY -I. (b) Addition of a multiple of a row to another row does not alter the value of the determinant. (c) Muttiplication of a row by a nonzero constant c multiplies the value of the determinant by c. (This holds also when c: O, but gives no longer an elementarY row operation.) PRooF (a) By induction. The statementholds forn: 2because THEoREM l : ad - bc, but (5) p:Ž (-I)j*koi,"Miu, h:1 l,l, ,,l dl |:bc-ad. bl We now make the induction hypothesis that (a) holds for determinants of order n - I > 2 and show that it then holds for determinants of order n. Let D be of order n. Let E be obtained from D by the interchange of two rows. Expand D and E by a row that is not one of those interchanged, call it the jth row. Then by (4a), E : i (-l)j*kay,Nit" k_l where N,i. is obtained íiomthe minor Mip of aip in D by the interchange of those two rows whlch have been interchanged in D (and which N3p must both contain because we expand by another row!). Now these minors are of order n - 1. Hence the induction hypothesis applies and gives Nju: -Mio. Thus E : -D by (5). (b) Add c times Row i to Row j. Let D be the new determinant. Its entries in Row j are a.x -| ca11".If we expand Ď ay this Row j, we see that we can write it as D : Dt l cD2, where Dt: D has in Row i the aiu, whereas D2has in that Row j the ap from the addition. Hence D, has al"inboth Row l and Row7. Interchanging these two ťows gives D2back, but on the other hand it gives -Drby (a). Together D2 - -D2: 0, so that D : Dt - D_ (c) Expand the determinant by the row that has been multiplied. CAUTION! det (cA) : cn det A (not c det A). Explain why. l ExAMpLE 4 Evaluation of Determinants by Reduction to Triangular Form Because of Theorem 1 we may evaluate determinants by reduction to triangular form, as in the Gauss elimination for a matrix. 
For instance (with the blue explanations always referring to the preceding determinant) lz. o -4 6 l ll 5 l 0 o:I l o 2 6 -| I 1-3 8 9 l SEC.7.7 Determinants. Cramer's Ru[e 3tl 20 05 02 08 20 05 00 00 20 05 00 00 -4 9 6 _) -4 9 2,4 - lI.4 -4 9 2.4 -0 l0 6 -12 6 -12 -1 3,8 29.2 Row2-2Row1 Row 4 * 1.5 Row l ] Row 3 - 0.4 Row 2 Row 4 - 1,6 Row 2 -,: l 38 l 47.25l Row 4 + 4.]5Row 3 : 2. 5 .2.4. 41.25 : 1134, TH],EoREM 2 Further Properties of nth-Order Determinants (a)-(c) in Theorem l hold also for columns. (d) Transposition leaves the value of a determinant unaltered. (e) Á zero row or column renders the value of a determinant zero. (f) Proportional rows or columns render the value of a determinant zero. In particular, a determinant with two identical rows or columns has the value zero. P R O O F (a)-(e) follow directly from the fact that a determinant can be expanded by any row column. In (d), transposition is defined as for matrices, that is, the 7th row becomes the ith column of the transpose. (0 If Row7 : c times Row i, then D : cD1, where D, has Row j : Row l. Hence an interchange of these rows reproduces D1, but it also gives -Dtby Theorem 1(a). Hence Dt : 0 and D : cD1: 0. Similarly for columns. l It is quite remarkable that the important concept of the rank of a matrix A, which is the maximum number of linearly independent row or column vectors of A (see Sec. 7.4), can be related to determinants. Here we may assume that rank A ) 0 because the only matrices with rank 0 are the zero matrices (see Sec. 7.4). THEoREM 3 Rank in Terms of Determinants An m X n matrix A : |aiuf has rank r > 7 if and onty if A has an r X r submatrix with nonzero determinant, whereas every square submatrix wíth more than r rows that A has (or does not have!) has determinant equal to zero. In particular, if A is square, n X n, it has rank n if and only if detA t 0. l ,=__--- r i t), so that det Š: 0 bY Theorem 2. This proves the theorem for an m X n matrix, Inparticular,ifAissquare,nXn,thenrankA:nlfandonlyifAcontainsannXn submatrix with nonzero ^determinant. But the only such submatrix can be A itself, hence detA * 0. l Cramer's Ru[e Theorem 3 opens the way to the classical solution formula for linear sYstems known as cru,,'..]".ol.r, which giu", solutions as quotients of determinants. Cramer's rule is not practicalin computatioň.s (for which the methods in Secs -7.3 anď20.I-z0,3 are suitable), but is of theoreticalinteresl in differential equations (Secs. z.I0,3.3) and other theories that have engineering applications, Cramer,s Theorem (Solution of Linear Systems by Determinants) (a) Iía linear system of n equations in the same number of unknowns X1,, , , , Xn attXt* al2X2 + "' * alnXn: b1 aztXt* a22X2 + "' l a2nXn: b2 antXt* an2x2 + ", * annxn: bn has a nonzero cofficient determinant D : det A,, the system has precisely one solution. This solution is given by the formwlas (6) D1 Ý:-^lD D2 Dn xz: D :i": D (Cramer's rule) (7) where D1" is the determinant obtained from D by replacing in D the kth column by the column with the entries by , , , ,bn, (b) Hence if the system (6) ,s homogeneous and D + O, it has only the trivial solutioníl :0, X2:0,... ,Xn: o.i7o __ 0,thehomogeneous Systemalsohas nontrivial solutions. THEoREM 4 2GRBRIEL CRAMER (l704-I'7 5z), Swiss mathematician, SEC.7.7 Determinants. Cramer's Rule P R O O F The augmented matrix Á of the system (6) is of size n at most n. Now if 313 X (n * 1). Hence its rank can be a7natt (8) D: detA: +0, ant ann then rank A : n by Theorem 3. Thus rank Á : rank A. 
Hence, by the Fundamental Theorem in Sec. 7.5, the system (6) has a unique solution. Let us now prove (1). Expanding Dby its tth column, we obtain D : ay"Cro -l a2l"C2k +''' l an1"Cn1", where C6 is the cofactor of entry ap tn D. If we replace the entries in the frth column of D by any other numbers, we obtain a new determinant, say, Ď. Clearty, its expansion by the kth column will be of the form (9), with alk, , , , , ankreplaced by those new numbers and the cofactors Cp as before. In particular, if we choose as new numbers the entries all,,,,, anlof the /th column of D (where l + k),we have a new determinant Ó which has twice the columnfau ^ ant]r, once as its /th column, and once as its kth because of the replacement. Hence Ď : 0 by Theore m 2(f).If we now expa nd Ď by the column that has been replaced (the kth column), we thus obtain auCu" l a2lC2p l l a6Cn1" : 0 (l + k). We now multiply the first equation in (6) by Cu" on both sides, the second by Cr,r,. . . , the last by Cnt", and add the resulting equations. This gives Cg(alxy+ , , , * alnxn) + , , , l Cnlr(anrlr f . . . + annxn) : btCu" + ," l bnC61. Collecting terms with the same xi,lNe can write the left side as xl(alCy" * a2lC21" + , , , l anlCn1") + , , , l xn(atnCu" l a2nC2p + . . . l annC6"). From this we see that xo is multiplied by au"Cu"* a27"C21" + ", l an1"Cn1". Equation (9) shows that this equals D. Similarly, xt is multiplied by auCu"l a2lC27" + ", l a6Cnp. Equation (10) shows that this is zero when l + k. Accordingly, the left side of (11) equals simply x1"D, so that (11) becomes xkD: btCr1r+ b2c2k + ", l bnCn1". (9) (10) (11) 3l4 CHAp. 7 Linear Algebra: Matrices, Vectors, Determinants. Linear SYstems Now the right side of this is D6 as defined in the theorem, expanded bY its frth column, so that division by D gives (7). This proves Cramer,s rule. If (6) is homogeneous and D * 0, then each D1, has a column of zeros, so that Drc : 0 by Theorem 2(e), and (7) gives the trivial solution, Finally, if (6) is homogeneous and D : 0, then rank A < n by Theorem 3, so that nontrivial solutions exist by Theorem 2 in Sec, 7,5, l Illustrations of Theorem 4 for n : 2 and 3 are given in Sec. 7 -6, and an imPortant application of the present formulas will follow in the next section, 1. (Second-order determinant) Expand a general secondorder determinant in four possible ways and show that the results agree. 2. (Minors, cofactors) Complete the list of minors and cofactors in Example 1. 3. (Third-order determinant) Do the task indicated in Example 2. Also evaluate D by reduction to triangular form. 4. (Scalar multiplicaiion) Show that det (kA) : k" det A (not k det A), where A is any n X n matrix, Give an example. @ EvALuATloN oF DETERMINANTs Evaluate, showing the details of your work, (Expansion numerically impractical) Show that the computation of an nth-order determinant by expansion involves n! multiplications, which if a multiplication takes 10-9 sec would take these times: 15. I200 3400 0056 00,78 0-2 1 2 0-2 -1 2 0 0 -4 -1 16. 1 0 17. 13 a -L ;l 6. l cos n0 sin n0| l -sin n0 cos n0l ll ':,\ 10n z015 25 . 0.004 22 71 0.5, 109 Time sec mln years years lcos a sin al 7. l l I sin B cos É| ,l"l:'l1ll 10. l,: |:, CRAMEťS RULE Solu" by Cramer's rule and check by Gauss elimination and back substitution. (Show details.) 18.2x -5y:23 4x+6y--2 19. 3y -l 4z : 14.8 4x-l 2y- z:-6.3 x- y-l5z:13.5 20. 
w +2x -3z:30 4x-5yf-2z:13 2w -F8;r-4y+ z:42 3w + y-5z:35 @ RANK By DETERMINANTs Find the rank by Theorem 3 (which is not a very practical way) and check by row reduction. (Show details,) "\',, 11. l; l; -C a -/- 13. 14. 00 50 15 24 2t[:1] SEC. 7.8 lnverse of a Matrix. Gauss-Jordan Elimination TBAM PROJECT. Geometrical Applications: Curves and Surfaces Through Given Points. The idea is to get an equation from the vanishing of the determinant of a homogeneous linear system as the condition for a nontrivial solution in cramer's theorem. We explain the trick for obtaining such a system for the case of a line Z through two given points Pi (It yt) and P2: (xz, yz). The unknown line is ax l by : -c, say. We write it as ax + by + c, I : 0. To get a nontrivial solution a, b, c, the determinant of the "coefficients" x, y, 1 must be zero. The system is ax*by *c.1:0 (LineL) (I2) ax1 *byr-|c,1:0 (PronZ) ax2l by, -| c, 1 : 0 (P2on L). 315 (a) Line through two points. Derive from D : 0 in (12) the familiar formula X*X,s, _ !- jt X,:, - Xz !l, - lz (b) Plane. Find the analog of (12) for a plane through three given points. Apply it when the points are (1, 1, 1), (3,2, 6), (5, 0, 5). (c) Circle. Find a similar formula for a circle in the plane through three given points. Find and sketch the circle through (2, 6), (6, 4), (1, I). (d) Sphere. Find the analog of the formula in (c) for a sphere through four given points. Find the sphere through (0, 0, 5), (4,0, I), (0, 4, I), (0, 0, -3) by this formula or by inspection. (e) General conic section. Find a formula for a general conic section (the vanishing of a determinant of 6th order). Try it out for a quadratic parabola and for a more general conic section of your own choice. 25. WRITING PROJECT. General Properties of Determinants. Illustrate each statement in Theorems 1 and 2 with an example of your choice. 26. CAS EXPERIMENT. Determinant of Zeros and ones. Find the value of the determinant of the n x n matrix A,, with main diagonal entries all0 and all others 1. Try to find a formula for this. Try to prove it by induction. Interpret ,{3 and A.4as "incidence matrices" (as in Problem Set 7.1 but without the minuses) of a triangle and a tetrahedron, respectively; similarly for an " n-simplex" , having n vertices and n(n - I) 12 edges (and spanning R'-1, fl: 5,6, , , ,). [:],i];] [T :: ,:^ ť] 7.8 |nverse of a Matrix. Gauss-J ordan E[im i nation In this section we consi.der square matňces exclusively. The inverse of ann X nmatríx A: |a1"]is denotedby A-'and is annX nmatrtx such that AA_1 :A-lA:I where I is the n X n unit matrix (see Sec. 7.2). If A has an inverse, then A is called a nonsingular matrix. If A has no inverse, then A is called a singular matrix. If A has an inverse, the inverse is wnique. Indeed, if both B and C are inverses of A, then AB : f and CA : I, so that we obtain the uniqueness from B : IB : (CA)B : C(AB) : CI - C. (1) 316 CHAP. 7 Linear Algebra: Matrices, Vectors, Determinants, Linear Systems we prove next that A has an inverse (is nonsingular) if and only if it has maximum possible rank n.The proof will also show that Ax : b Ímptes x : A-lb provided A-1 exists, and will thus give a motivation for the inverse as well as a relation to linear systems' (But this wIII notgive a good method of solving Ax : b numerically because the Gauss elimination in Sec. 7.3 requires fewer computations,) LetAbeagiven (2) n X n matrix and consider the linear system THEoREM 1 PRooF If the inverse A-1 exists, then gives Ax:b. 
multiplication from the left on both sides and use of (1) A-lAx:x:A-lb. This shows that (2) has a unique solution x. Hence A must have rank nby the Fundamental Theorem in Sec. 7.5. Conversely, let rank A : n. Then by the Same theorem, the system (2) has a unique solution x for any b. Now the back subsiitution following the Gauss elimination (Sec. 7.3) shows that the components x7 of x aíe linear combinations of those of b. Hence we can write x:Bb(3) with B to be determined. Substitution into (2) gives Ax:A(Bb):(AB)b:Cb:b (C : AB) for any b. Hence C : AB : I, the unit matrix. Similarly, if we substitute (2) into (3) we get x:Bb:B(Ax):(BA)x for any x (and b : Ax). Hence BA : L Togetheí, B : A-1 exists, -"*rr-*r*RDAN (1842-1899), German mathematician and geodesist. |See American Mathematical Monthly 94 (1981), l30-I42,) we do not recommend it asa method for solving systems of linear equations, since the number of operations in addition to those of the Gauss elimination t rarger than that for back substitution' which the Gauss-Jordan elimination avoids. See also Sec, 20,1, l Existence of the lnverse The inverse A-' of an n X n matrix A, exists if and onlY if rank A : n' thus (bY Theorem 3, Sec. l.i1 r7 ana only if detA + 0. Hence L is nonsingular if rank L : n, and is singular if rank L 1 n, ._----< SEC. 7.8 lnverse of a Matrix. Gauss-Jordan Elimination Determination of the lnverse by the Gauss-Jordan Method For the practical determination of the inverse A-1 of a nonsingular n X n matrix A we can use the Gauss elimination (Sec. 7.3), actlally a variant of it, called the Gauss-Jordan elimination3 (footnote of p. 316). The idea of the method is as follows. Using A, we form n linear systems AX atbt:alb1 *...+ anbn. llL_7 Lu") 376 DEFlNlTloN cHAp. 7 Linear Algebra: Matrices, Vectors, Determinants. Linear Systems We now extend this concept to general real vector spaces by taking basic ProPerties of (a, b) as axioms for an "abstract inner product" (a, b) as follows. Vectors whose inner product is zero are called orthogonal. The length or norm of a vectot ínV is defined by (2) llull : \/(a, Ď (= 0), A vector of norm 1 is From these axioms (3) From this follows (4) called a unit vector. and from (2) one can derive the basic inequality |(a, n)| = ll"ll lln|| Gauchy-Schwar7í inequality), llu + bll= l|a|l + |ln|| (Trian gle ine quality). A simple direct calculation gives (5) llu + nll '+ llu - bll ' : zt llall '+ llnll 'l (P ar all e l o 8 r am e quality) . aDAvtp HILBERT (1862_1943), gíeatGerman mathematician, taught at Kónigsberg and Góttingen and was the creator of the famous Góttingen mathematical school. He is known for his basic work in algebra, the calculus of variations, integral equations, functional analysis, and mathematical logic. His "Foundations of GeometrY" helped the axiomatic method to gain general recognition, His famous 23 Problems (Presented in 1900 at the International Congress of Mathematicians in Paris) considerably influenced the develoPment of modern mathematics. If V is finite dimensional, it is actually a so-called Hitbert space; see Ref. [GR7], P. 73, listed in APP, I, sHBxvtRNN AMANDUS SCHWARZ (I843-I92I). German mathematician, known by his work in comPlex analysis (conformal mapping) and differential geometry. For Cauchy see Sec, 2,5, Real lnner Product Space A real vector space V is called a real inner product space (or real pre-Hilberta space) if it has the following property. 
With every pair of vectors a and b in 7there is associated a real number, which is denoted by (a, b) and is called the inner product of a and b, such that the following axioms are satisfied. I. For al1 scalars q1 and q2 anď all vectors a, b, c ínV, (qp ,| Qzb, c) : Qt(L, c) + q2(b, c) II. For all vectors a and b in V, (a, b) : (b, a) III. For every a tn V, (a, a) > 0, (a, a) : 0 if and only if a (P o s it iv e - definit ene s s) . (Linearity). (Symmetry). ,} SEC. 7.9 Vector Spaces, lnner Product Spaces, Linear Transformations OptionaI 327 EXA M P LE 3 n-Dimensional Euclidean Space R' with the inner product (a,b) : uTb : alb. -| . . . l anbn (where both a and b are column vectors) is called the z-dimensional Euclidean space and is denoted by E' or agaín simply by R". Axioms I-III hold, as direct calculation shows. Equation (2) gives the "Euclidean norm" (1) ll"ll :\/(a,a):Ý{":\r1'+...+"'. l EXAMPLE 4 An lnner Product for Functions. Function Space Thesetofallreal-valuedcontinuousfunctions/("r),s(r)""onagiveninterva],a 1, decrease if ^ < 1). The characteristic equation is (develoP the characteristic determinant by the first column) det(L - ^I) : -^3 - 0.6(-2,3^- 0,3,0,4) : -^3 + 1,38^ + 0,0,72 : 0, A positive root is found to be (for instance, by Newton's method, Sec. 19.2) ^: I.z. A corresponding eigenvector x can be determined from the characteristic matrix A, - 1.2I: where Xs : 0.125 is chosen, xz : 0.5 then follows from 0.3x2 - I.2xg : 0, and 11 : 1 from -1.2x1 -| 2.3x2 -l 0.4xg : 0. To get an initial population of 1200 as before, we multiply x by I20ol0 + 0.5 + 0.125) : 738. Answer: Proportional growth of the numbers of females in the three classes willoccurif theinitialvalues are738,369,92inclasses I,2,3,respectively.ThegrowthratewillbeI.2per 3 years, l EXAM PLE 4 Vibrating System of Two Masses on Two Springs (Fig, t59} Mass-spring systems involving several masses and springs can be treated as eigenvalue Problems. For instance, the mechanical system in Fig. 159 is governed by the system of oDEs y'l: -5yl * Zyz ll ^ lz: zjt - Zlz 3 years, T] Lffi]:L]l] I; ],;':I] , say, -:L.{;,] where y1 and y2 arc the displacements of the masses from rest, as shown in the figure, and Primes denote derivatives with respect to time /. In vector form, this becomes ,,,, _ lríl: o,, : [-' zl [y,-l (7) '" : |"r;) L , -r) Lr,_l X(3):*-: L* (6) ffiI= I hz= 2 ^2= I (Net change in spring length ^, \ t2 System in static equilibrium Fig. l59. Masses on springs in Example 4 System in motion SEC. 8.2 Some Applications of Eigenvalue Problems We try a vector solution of the form (8) Y : xe-t. This is suggested by a mechanical system of a single mass on a spring (Sec. 2.4), whose motion is given by exponential functions (and sines and cosines). Substitution into (7) gives @2xe't : Axe-t. Dividing by ," and writing a2 : l, we see that our mechanical system leads to the eigenvalue problem (9) where ), : a2. 343 From Example 1 in Sec. 8.1 we see that A has the eigenvalues ir : -l and ),2 : -6. Consequently, @: \/=: +i and '\t:6: *i\/6,respectively. Corresponding eigenvectors are (10) -, : [;] and ",: [ 1) From (8) we thus obtain the four complex solutions [see (10), Sec. Z.2] *1"'i' : xl(cos t + i sin t), *r"-i ': x2(cos l/6 t * i sinÝ6 ). By addition and subtraction (see Sec. 2.2) we get the four real solutions x1 cos /, x1 sin /, "2 co, \6 r. x2 sinÝ6 t. 
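Before forming the general solution, the eigenvalues and eigenvectors quoted for the matrix in (7) can be checked numerically. The following sketch is not part of the original example; it assumes NumPy is available and simply recomputes the spectrum of the coefficient matrix and the circular frequencies ω = ±√λ used above.

```python
import numpy as np

# Coefficient matrix of y'' = Ay in (7): two masses on two springs
A = np.array([[-5.0,  2.0],
              [ 2.0, -2.0]])

# Eigenvalues and eigenvectors; expected spectrum: {-1, -6}
lam, X = np.linalg.eig(A)
print(lam)   # -1 and -6 (order may vary)
print(X)     # columns proportional to [1, 2]^T and [2, -1]^T (up to sign)

# Substituting y = x e^{omega t} gives omega^2 = lambda, so omega is imaginary
omega = np.sqrt(lam.astype(complex))
print(omega)  # roughly 1j and 2.449j, i.e. +-i and +-i*sqrt(6)
```

The four real solutions x1 cos t, x1 sin t, x2 cos √6 t, x2 sin √6 t obtained above correspond to the two circular frequencies 1 and √6 computed here.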
A general solution is obtained by taking a linear combination of these, y : xt(at cos / + bl sint) -| x2(a2cos \6 í+ b2 sin 16 4 with arbitrary constants ay by, a2, b2 (to which values can be assigned by prescribing initial displacement and initial velocity of each of the two masses). By (10), the components of y are lt : at cos / + b, sin t -| 2a2 "o, \6 t + 2b2 sin \6 r yz: 2al cos / + 2bl sint - az"o, V6 t - bz sin V6 r. These functions describe harmonic oscillations of the two masses. Physically, this had to be expected because we have neglected damping. l @ ELAsTlc DEFoRMATIoNs Given A in a deformation y : Ax, find the principal directions and corresponding factors of extension or contraction. Show the details. [r 5l r 0.4 0.8l 7, L, , ] 8, L,, ;;] g. l''' ' 'l 10. [' 4f L r.s 6.5_] L+ 1l _] lt. l ' Ý6] 12. [' 21 LÝ6 2) Lz l3_] r LINEARTRANsFoRMATloNs Find the matrix A in the indicated linear transformation y : Ax. Explain the geometric significance of the eigenvalues and eigenvectors of A. Show the details. 1. Reflection about the y-axis in R2 2. Reflection about the xy-plane in R3 3. Orthogonal projection (perpendicular projection) of R2 onto the x-axis 4. Orthogonal projection of R3 onto the plane y : x 5. Dilatation (uniform stretching) in R2 by a factor 5 6. Counterclockwise rotation through the angle tl2 about the origin in R2 Ax : ,\x r-2 3l 13. l l 14. L3-2) I to.s vx/z1 1_1Lttr/z to.o J ''LT 0.1 0.1 0.8 l1]15. (Leontiefl input-output model) Suppose that three industries are interrelated so that their outputs are used as inputs by themselves, according to the 3 X 3 consumption matrix where a,i7, \s the fraction of the output of industry k .onru..róá (purchased) by industry j,Let p, be the price charged by industry7 for its total output, A problem is to find prices so that for each industry, total expenditures equal total income, Show that this leads to Ap : po where p : |pl pz pg]T, and find a solution p with nonnegativl py pz, pz, 16. Show that a consumption matrix as considered in Prob, 15 must have column sums 1 and always has the eigenvalue 1. 1,7. (Open Leontief input-output model) If not the whole output but only a portion of it is consumed by the industries themselves, then instead of Ax : x (as in Prob. 15), we have x - Ax : y, where x : [íl x2 x,]' is produced, Ax is consumed by the industries, and, thus, y is the net production available for other consumers, Find for what production x a given demand vector y : [0.136 0.2'72 0.136]T can be achieved if the consumption matrix is @ populATloN MoDEL wlTH AGE sPEclFlcATloN Find the growth rate in the Leslie model (see Example 3) with the matrix as given. (Show details,) r 0 3.45 0.60l l"I2I. lo.qo 0 0 lal,l"-"I L0 0.45 0J 24. TEAM PROJECT. General Properties of Eigenvalues and Eigenvectors, Prove the following staiements and illustrate them with examples of your own choice. Here, il, , , ,l,-are the (not necessarily distinct) eigenvalues of a given n X n matrix A,: |a3l"], (a)Trace.Thesumofthemaindiagonalentriesiscalled the trace of A. It equals the sum of the eigenvalues, (b) "Spectral shift." A - kI has the eigenvalues Á,1 - k, , , , ,ln - k and the same eigenvectors as A, (c) Scalar multiples, powers, kA has the eigenvalues klt,, , , ,kln,L* (m: I,2,, , ,) has the eigenvalues lrh,, , , , ln*. 
The eigenvectors are those of A, (d) Spectral mapping theorem, The "polynomial matrix" p(A) : k*L- + k*_1A*-1 +,,, + klA + kol has the eigenvalues p(Xi) : k*Xj* + k,,n_t^j*-1 +,,, + k|^j + ko where j : l,,,,, fl,and the same eigenvectors as A, (e) Perron's theorem. Show that a Leslie matrix L with positive l12,!tz, lrr, l"zhas a positive eigenvalue, (This isaspecialcaseofthefamousPerron-Frobeniustheorem inSec.20.7,whichisdifficulttoproveinitsgeneralform.) f0.2 0.5 0l A,:|aluf:lou 0 o,1 Lo., 0.5 0.7] l- 0 l2.o 0l zz.|ols 0 .l L". o ro 0] t 0 1.280 2.915] 23. lo.s6o 0 0 l L0 0.420 0] f o.2 0.4 l n: lo.1 0 I Lo.z o.4 MARKOV PROCESSES l1] Pi"a n"it states of the Markov processes modeled by the following matrices. (Show the details,) |-0.1 0.4l 18.1 l Lo.q 0.6_] l;] ,,Ll]11 lwlsstLy LEONTIEF (1906-1999). American economist at New York UniversitY. For his inPut-outPut analysis he was awarded the Nobel Prlze in 1913, CHAP.8LinearA[6ebra:MatrixEigenvalueProblems SEC. 8.3 Symmetric, Skew-Symmetric, and Orthogonal Matrices 345 8.3 Symmetric, Skew-Symmetric, and Orthotonat Matrices We consider three classes of real squaíe matrices that occur quite frequently in applications because they have several remarkable properties which we shall now discuss. The first two of these classes have already been mentioned in Sec. 7.2. DEFlNlTloNs (1) Symmetric, Skew-Symmetric, and Orthogonal Matrices A real square matrix A : |airc] is called symmetric if transposition leaves it unchanged, AT : A, thus a1ri : aik; skew-symmetric if transposition gives the negative of A, rT -A' : -A, thus akj: -aik, orthogonal if transposition gives the inverse of A, AT : A-1. (2) (3) ExAM PLE l Symmetric, Skew-Symmetric, and orthogonal Matrices The matrices [-3 l 5l [0 9-t2f le + 3l | , 0 -2 l. |-n o ,o|. l-. . ,| |, : .l l.,^ ": ,:l l-] ,, =,| L s -2 +) Ltz -20 o_] L 5 , -._.] are SYmmetric, skew-symmetric, and orthogonal, respectively, as you should verify. Every skew-symmetric matrix has all main diagonal entries zero. (Can you prove this?) l *:L:TLLT,T: ffil,i i f,?..oe written as the sum of a symmetric matrix R and a R:}(l+A') and S:}(a-Ar). EXAMPLE 2 lllustration of Formuh (a) (4) 346 THEoREM l THEoREM z CHAP. 8 Linear Algebra: Matrix Eigenvalue Problems Eigenvalues of symmetric and skew-symmetric Matrices (a) The eigenvalues of a symmetric matrix are real, (b) The eigenvalues of a skew-symmetric matrix are pure imaginary or zero. This basic theorem (and an extension of it) will be proved in Sec, 8,5, ExAMpLE 3 Eigenvalues of Symmetric and Skew-Symmetric Matrices The matrices in (i) and (7) of Sec. 8.2 are symmetric and have real eigenvalues. The skew-sYmmetric matrix in Example 1 has the eigenvalues O, -25i, and25l. (Verify this.) The following matrix has the real eigenvalues 1 and 5 but is not symmetric. Does this contradict Theorem 1? Ii] l Orthogonal Transformations and Orthogonal Matrices Orthogonal transformations are transformations (5) y : Ax where A is an orthogonal matrix. With each vector x in R' such a transformation assigns a vector Y in R'. For instance, the plane rotation through an angle 0 ,: [];] : [:;: :: l [;;] is an orthogonal transformation. It can be shown that any orthogonal transformation in the plane oi in three-dimensional space is a rotation (possibly combined with a reflection in a straight line or a plane, respectively). The main reason for the importance of orthogonal matrices is as follows. 
(6) lnvariance of lnner product An orthogonal transformation preserves the value of the inner PrOduct of vectors a andb in R", defined by f br1 Il(7) a,b : aTb : lat o.] l : I La.) That is, for any a andb in Rn, orthogonal n X n matrix L, and u : Aa, v : Ab we have u,y : a,b. Hence the transformation also preserves the length or norm of anY vector a in Rn given by (8) ll u ll : \/*i: \Fi. SEC. 8.3 Symmetric, Skew-Symmetric, and Orthogonal Matrices 347 PROOF Let A be orthogonal. Let u : Aa and v : Ab. We must show that u.v : a.b. Now (Aa)- : aTAT by (10d) in Sec. 7.2 and, ATA : A-lA : I by (3). Hence (9) u.y: uTy: (Aa)TAn: aTATAb: aTIb: aTb: a.b. From this the invariance of || a || roUows if we set b : a. Orthogonal matrices have further interesting properties as follows. THEoREM 3 Orthonormality of Column and Row Vectors A real square matrix is orthogonal if and only if its column vectors z1,, . . . , an (and also its row vectors) form an orthonormal system, that is, (10) ?j.Zk: 'jrao: {o if j + k Ll if j:k. PROOF (a) LetA be orthogonal. ThenA-lA: ATA: I,interms of column vectors a1, , an, (11) [: A-lA : ATA : ,a,-f: aJ.az..."'. Zn'az, , , dnT The last equality implies (10), by the definition of the n X n unit matrix I. From (3) it follows that the inverse of an orthogonal matrix is orthogonal (see CAS Experiment 20). Now the column vectors of A-1 (: AT) are the row vectors of A. Hence the row vectors of A also form an orthonormal system. (b) Conversely, if the column vectors of A satisfy (10), the off-diagonal entries in (11) must be 0 and the diagonal entries 1. Hence ATA : I, as (11) shows. Similarly, AAT : I. This implies AT : A-1 because also A-lA : AA-1 : r and the inverse is unique. Hence A is orthogonal. Similarly when the row vectors of A form an orthonormal system, by what has been said at the end of part (a). THEoREM 4 Determinant of an Orthogonal Matrix The determinant of an orthogonal matrix has the value -lí or -I. PROOF FromdetAB: detAdetB (Sec.7.8,Theorem4)anddetAT: detA (Sec.7.7,Theorem 2d), we get for an orthogonal matrix 1 : detl: det(AA-l): det(AAT): detAdetAT: (detA)2. l EXAMPLE 4 lllustration of Theorems 3 and 4 The last matrix in Example 1 and the matrix in (6) illustrate Theorems 3 and 4 because their determinants are -1 and *1, as you should verify. l l-",', Lu,-u,[:] ,", tr 348 Tl{E.oREM 5 PRooF E.XAiMPLE 5 CHAP.8 Linear Algebra: Matrix Eigenvalue Problems The first part of the statement holds for any real matrix A because its characteristic polynomial has real coefficients, so that its zeros (the eigenvalues of A) must be as indicated, rt,e ctaim th; |^l : 1 will be proved in Sec, 8,5, l Eigenvalues of an Orthogonal Matrix The orthogonal matrix in Example 1 has the characteristic equation -^3 + ?x'* t,t - t : o, Now one o[ the eigenvalues must be real (why? ). hence * | o= l , Trying, *:^ P, - l , Division by ,\ + l gives _(i2 _ 5N3 + 1) : 0 and the two eigenvalues (5 + i\/i)t6 and (5 _ t.\/Il)ta, which have absolute value 1. Verify all of this, Looking back at this section, you will find that the numerous basic results it contains have relatively short, straightforward proofs. This is typical of large portions of matrix eigenvalue theory. Eigenvalues of an Orthogonal Matrix The eigenvalues of an orthogonal matrix L are real or complex conjugates in pairs and have absolute value I, 1. (Verification) Verify the statements in Example 1, 2. Verify the statements in Examples 3 and 4, 3. 
Are the eigenvalues of A + B of the form i3 + llj, where n7 and Fi are the eigenvalues of A and B, respectively? 4. (Orthogonality) Prove that eigenvectors of a symmetric matrix corresponding to different eigenvalues are orthogonal, Give an example, 5. (Skew-symmetric matrix) Show that the inverse of a skew-symmetric matrix is skew-symmetric, 6. Do there exist nonsingular skew-symmetric n X n matrices with odd n? 7. (Orthogonal matrix) Do there exist skew-symmetric orthogonal 3 X 3 matrices? 8. (Symmetric matrix) Do there exist nondiagonal symmetric 3 X 3 matrices that are orthogonal? @ ElGENvALuEs oF syMMETRlc, sKEwSYMMETRIC, AND ORTHOGONAL MATRlcEs Are the following matrices symmetric, skew-symmetric, or orthogonal? Find their spectrum (thereby illustrating Theoiems 1 and 5). (Show the details of your work,) T0.96 -0.28l l- o bl 9. l l 10. l lll Lo.zs 0.96-] L-a a) 11.[' 'l L- t 1_j "|'^_:, ,,LT: :;; l] ,oo, ,|"L,l ,u],-1] ,,Ll : il ,.L-l 1il ,|':,', ") 18.(Rotationinspace)Giveageometricinterpretationof the transformation y : Ax with A as in Prob, 12 and x and y referred to a Cartesian coordinate system, 19. WRITING PROJECT, Section Summary, Summarize the main concepts and facts in this section, with illustrative examples of your own, PlR:O::B:LĚ:ffi;; ;E.ffi:::8:lB SEq 8:4 Eigenbasgs. Diagonalization. Qu.adlatic |o|._r_ 20. CAS EXPERIMENT. Orthogonal Matrices. (a) Products. Inverse. Prove that the product of two orthogonal matrices is orthogonal, and so is the inverse of an orthogonal matrix. What does this mean in terms of rotations? (b) Rotation. Show that (6) is an orthogonal transformation. Verify that it satisfies Theorem 3. Find the inverse transformation. (c) Powers. Write a program for computing powers L- (m: I,2,...) of a2 X 2matrix A and their spectra. Apply it to the matrix in Prob. 9 (call it A). To what rotation does A correspond? Do the eigenvalues of L- have a limit as m ---> a? (d) Compute the eigenvalues of (0.9A)-, where A is the matrix in Prob. 9. Plot them as points. What is their limit? Along what kind of curve do these points approach the limit? (e) Find A such that y : Ax is a counterclockwise rotation through 30' in the plane. ,-,,-$y' Eigenbases. DiagonaIization. (1) ]T'Fi,EOREM ],l P R O O F A11 we have to show is that x1, . . )xn aíe Let r be the largest integer such that {xr, . r 1 n and the set {x1, ... ,x, xr*1} is Ct, . . . , cr+l, not all zero, such that linearly independent. Suppose they are not. , , xr} is a linearly independent set. Then linearly dependent. Thus there are scalars (2) clx, *...+ crllxr11 :0 (see Sec. 7.4). Multiplying both sides by A and using A*j : ),7x7, we obtain clilx1 +,, . l Cr*lÁ,r*tXr+l : 0.(3) Quadratic Forms So far we have emphasized properties of eigenvalues. We now turn to general properties of eigenvectors. Eigenvectors of ann X nmaftix A may (or may not!) form a basis for R". If we are interested in a transformation y : Ax, such an "eigenbasis" (basis of eigenvectors)-if it exists-is of great advantage because then we can represent any x in R'uniquely as a linear combination of the eigenvectors x1, . . . ,xn, say, X: ClX1 l c2x2 +... l cnxn. And, denoting the corresponding (not necessarily distinct) eigenvalues of the matrix A by ir, , , , , hn, we have Arj : ,ň.7x7, so that we simply obtain y : Ax : A(clx1 + . . . l cnxn) :CIAX1 +",*cnLxn :c1,[1X1 +",lcnhnxn. 
This shows that we have decomposed the complicated action of A on an arbitrary vector x into a sum of simple actions (multiplication by scalars) on the eigenvectors of A. This is the point of an eigenbasis. Now if the n eigenvalues are all different, we do obtain a basis:

THEOREM 1    Basis of Eigenvectors

If an n × n matrix A has n distinct eigenvalues, then A has a basis of eigenvectors x_1, ..., x_n for R^n.

PROOF (continued)    To get rid of the last term, we subtract λ_{r+1} times (2) from (3), obtaining

$$c_1(\lambda_1 - \lambda_{r+1})\mathbf{x}_1 + \cdots + c_r(\lambda_r - \lambda_{r+1})\mathbf{x}_r = \mathbf{0}.$$

Here c_1(λ_1 − λ_{r+1}) = 0, ..., c_r(λ_r − λ_{r+1}) = 0 since {x_1, ..., x_r} is linearly independent. Hence c_1 = ... = c_r = 0, since all the eigenvalues are distinct. But with this, (2) reduces to c_{r+1}x_{r+1} = 0, hence c_{r+1} = 0, since x_{r+1} ≠ 0 (an eigenvector!). This contradicts the fact that not all scalars in (2) are zero. Hence the conclusion of the theorem must hold.

EXAMPLE 1    Eigenbasis. Nondistinct Eigenvalues. Nonexistence

The matrix $A = \begin{bmatrix} 5 & 3 \\ 3 & 5 \end{bmatrix}$ has a basis of eigenvectors $\begin{bmatrix} 1 \\ 1 \end{bmatrix}$, $\begin{bmatrix} 1 \\ -1 \end{bmatrix}$ corresponding to the eigenvalues λ_1 = 8, λ_2 = 2. (See Example 1 in Sec. 8.2.)

Even if not all n eigenvalues are different, a matrix A may still provide an eigenbasis for R^n. See Example 2 in Sec. 8.1, where n = 3.

On the other hand, A may not have enough linearly independent eigenvectors to make up a basis. For instance, A in Example 3 of Sec. 8.1 is $A = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}$ and has only one eigenvector $\begin{bmatrix} k \\ 0 \end{bmatrix}$ (k ≠ 0, arbitrary).

Actually, eigenbases exist under much more general conditions than those in Theorem 1. An important case is the following.

THEOREM 2    Symmetric Matrices

A symmetric matrix has an orthonormal basis of eigenvectors for R^n.

For a proof (which is involved) see Ref. [B3], vol. 1, pp. 270–272.

EXAMPLE 2    Orthonormal Basis of Eigenvectors

The first matrix in Example 1 is symmetric, and an orthonormal basis of eigenvectors is $[1/\sqrt{2}\;\;\; 1/\sqrt{2}]^T$, $[1/\sqrt{2}\;\;\; -1/\sqrt{2}]^T$.

Diagonalization of Matrices

Eigenbases also play a role in reducing a matrix A to a diagonal matrix whose entries are the eigenvalues of A. This is done by a "similarity transformation," which is defined as follows (and will have various applications in numerics in Chap. 20).

DEFINITION    Similar Matrices. Similarity Transformation

An n × n matrix Â is called similar to an n × n matrix A if

(4)    Â = P^{-1}AP

for some (nonsingular!) n × n matrix P. This transformation, which gives Â from A, is called a similarity transformation.

The key property of this transformation is that it preserves the eigenvalues of A:

THEOREM 3    Eigenvalues and Eigenvectors of Similar Matrices

If Â is similar to A, then Â has the same eigenvalues as A. Furthermore, if x is an eigenvector of A, then y = P^{-1}x is an eigenvector of Â corresponding to the same eigenvalue.

PROOF    From Ax = λx (λ an eigenvalue, x ≠ 0) we get P^{-1}Ax = λP^{-1}x. Now I = PP^{-1}. By this "identity trick" the previous equation gives

$$P^{-1}A\mathbf{x} = P^{-1}AI\mathbf{x} = P^{-1}APP^{-1}\mathbf{x} = \hat{A}(P^{-1}\mathbf{x}) = \lambda P^{-1}\mathbf{x}.$$

Hence λ is an eigenvalue of Â and P^{-1}x a corresponding eigenvector. Indeed, P^{-1}x = 0 would give x = Ix = PP^{-1}x = P0 = 0, contradicting x ≠ 0.

EXAMPLE 3    Eigenvalues and Vectors of Similar Matrices

Let

$$A = \begin{bmatrix} 6 & -3 \\ 4 & -1 \end{bmatrix} \quad\text{and}\quad P = \begin{bmatrix} 1 & 3 \\ 1 & 4 \end{bmatrix}.$$

Then

$$\hat{A} = P^{-1}AP = \begin{bmatrix} 4 & -3 \\ -1 & 1 \end{bmatrix}\begin{bmatrix} 6 & -3 \\ 4 & -1 \end{bmatrix}\begin{bmatrix} 1 & 3 \\ 1 & 4 \end{bmatrix} = \begin{bmatrix} 3 & 0 \\ 0 & 2 \end{bmatrix}.$$

Here P^{-1} was obtained from (4*) in Sec. 7.8 with det P = 1. We see that Â has the eigenvalues λ_1 = 3, λ_2 = 2. The characteristic equation of A is (6 − λ)(−1 − λ) + 12 = λ² − 5λ + 6 = 0.
Ithastheroots(the eigenvalues of A) ),1 :3, h2: 2, confirming the first part of Theorem 3. We confirm the second part. From the first component of (A - nl)x : 0 we have (6 - n)xr - 3x2 : g. For.tr:3thisgives 3x1_ 3x2:0,say,xr: [1 1]T.For t-2itgives4x1 -3xr: O,say, xz:í3 4]T. In Theorem 3 we thus have Yr:P-lxr:Ii]]tl]:t;] Indeed, these are eigenvectors of the diagonal matrix Á. !z: P-lxz: t i ;] t;] :[:] THEoREM 4 Perhaps we see that x1 and x2 are the columns of P. This suggests the general method of transforming a matrix A to diagonal form D by using P : X, the matrix with eigenvectors as columns: l Diagonalization of a Matrix If an n X n matrix A has a basis of eigenvectors, then (5) D : X-IAX is diagonal, with the eigenvalues of A as the entries on the main diagonal. Here X is the matrix with these eigenvectors as column vectors. Also, (5*) D* : x-lA-x (m : 2,3, . - .). I 352 CHAP. 8 Linear Algebra: Matrix Eigenvalue Problems pRooF Let x1, ... ,xn constitute a basis of eigenvectors of A for R'. Let the corresPonding eigenvalues of A be it" , , Xn, respectively, so that Ax1 : ).1x1, , A1, : lnxn, Then X : [r,, *r1 tu, ranki, by Theorem 3 in Sec.7.4. Hence X-1 exists bY Theorem 1 in Sec. 7.8. We claim that (6) AX : A[x1 xrr] : [Ax1 Ax,"] : [irxr lnxn]: XD where D is the diagonal matrix as in (5). The fourth equality in (6) follows bY direct calculation. (Try it for n : 2 and then for general n.) The third equality uses Axrc : XuXu, The second equality results if we note that the first column of AX is A times the first column of X, and so on. For instance, when n : 2 and we write X1 : ["l, xzl]-, xz : íxp xzzfT, we have AX : A[x1 xz] : |"o,,,o '".,,,)[,;, f anxl l al2x21 l 1_orrrr, * a22x21 Column 1 ,,,:,,,f attXtz -l al2X22 aztXtz * a22X22 Column 2 If we multiply (6) by X-l from the left, we obtain (5). Since transformation, Theorem 3 implies that D has the same eigenvalues follows if we note that D2 : DD : x-lAxx-lAx : x-lAAX : X-lArX, Diagonalization Diagonalize x-1 : calculating Ax and multiplying by x-' from the left, we thus obtain D : X-IAX: : [Axr Ax2]. (5) is a similarity as A. Equation (5*) l ExAMPLE 4 [ ,, 02 -r r-l a:|-l1.5 1.0 ,rI L nl 1.8 -9.3_.] Solution. The characteristic determinant gives the characteristic equation -,\3 - ^2 + Iz^ : O. The roots (eigenvalues of A) are ir : 3, lz: -4, is : 9.By the Gauss elimination applied to (A - ,\I)x : 0 with ^ ] ^r, ,\2, ,\3 we find eigenvectors and then X-'by the Gauss-Jordan elimination (Sec. 7.8, ExamPle 1). The results are [']: ],,,1 11]Li][r]L1],x:Lir1] [,]1 ],,,1 11] L 1 -,o: l] Ll : l] l SEC. 8.4 Eigenbases. Diagonalization. Quadratic Forms Quadratic Forms. Transformation to Principal Axes By definition, aquadraticform Qinthe components/1, ",,xnof avectorxis asum of n2 terms, namely, o: xTAx :iš o,or,*o j:l k_| : attX1,2 -lal2xlx2 +",lalnxlxn * a2lx2x1+ a22x22 + ", l a2nx2xn +... l anlxnx1l an2xnxz+ ", + annxnz. A : |a4"fis called the coefficient matrix of the form. We may assume that A is symmetric, because we can take off-diagonal terms together in pairs and write the result as a sum of two equal terms; see the following example. EXAMPLE 5 Quadratic Form. Symmetric Coefficient Matrix Let xTAx: [xr ",, [: i] t;;] :3xl2 *4xp2*6x2x1+2x22:3,] *l0xp2+2x22. Here4+6:l0:5*5.FromthecorrespondingsymmetricmatrixC:|qrc],wherec7k:l@3rctou), thus c11 : 3, cn: c2l: 5, czz: 2, we get the same result; indeed, xTCx : [xr ,, [; ]] t;;] _ 3,12 l 5xp2l 5x2x1 + 2x22 : 3*t2 -| 10xp2 + 2x22. 
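The symmetrization step in Example 5 is easy to verify numerically. The sketch below is only an illustration (it assumes NumPy; the test vectors are random and not taken from the text): it forms the symmetric part C = ½(A + Aᵀ) of the nonsymmetric coefficient matrix and checks that both matrices produce the same quadratic form.

```python
import numpy as np

# Nonsymmetric coefficient matrix of Example 5 and its symmetric part
A = np.array([[3.0, 4.0],
              [6.0, 2.0]])
C = (A + A.T) / 2              # c_jk = (a_jk + a_kj)/2, here [[3, 5], [5, 2]]

# x^T A x = x^T C x for every x, so the quadratic form is unchanged
rng = np.random.default_rng(seed=1)
for _ in range(3):
    x = rng.standard_normal(2)
    assert np.isclose(x @ A @ x, x @ C @ x)
print(C)                       # [[3. 5.], [5. 2.]]
```

The values agree because the skew-symmetric part S = ½(A − Aᵀ) contributes nothing to the form: xᵀSx = 0 for every x.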
l Quadratic forms occur in physics and geometry, for instance, in connection with conic sections (ellipses xr2laz + xr2lb2: 1, etc.) and quadratic surfaces (cones, etc.). Their transformation to principal axes is an important practical task related to the diagonalization of matrices, as follows. By Theorem 2 the symmetric coefficient matrix A of (7) has an orthonormal basis of eigenvectors. Hence if we take these as column vectors, we obtain a matrix X that is orthogonal, so that X-1 : XT. From (5) we thus have A : XDX-1 : XDXT. Substitution into (7) gives (8) 0 : xTXDxTx. If we set XTx : y, then, since XT : X-1, we get (9) x : Xy. Furthermore, in (8) we have xTX : 1XTx)T : yT and XTx : yl so that Q becomes simply (10) Q:y-Dy:htltz*Xzyz2+...ll,.!n2. (7) 353 354 CHAP. 8 Linear Algebra: Matrix Eigenvalue Problems This proves the following basic theorem. THEoREM 5 Principal Axes Theorem The substitution (9) transforTns a quadratic form O : x'Ax : i }, ororr*t" (arc3 : a7") j:l k:I to the principal axes form or canonical form (IO), where it, , , , hn are the (not necessarily distinct) eigenvalues of the (symmetric!) matrix A, and X is an orthogonal matrix with corresponding eigenvectoís x1, ",,xn, respectively, as column vectors. EXAM PLE 6 Transformation to Principal Axes. Conic Sections Find out what type of conic section the following quadratic form represents and transform it to PrinciPal axes: Q : I'\x] - 30xp2 + I7x22 : I28, Solution. We have Q : xÍ Ax, where f ll - l5l [*,l a:| l x:| | L-ts 17] L,r) This gives the characteristic equation (17 - D2 - I52 : 0. It has the roots \ : 2, X2 : 32, Hence (10) becomes Q: 2y + 32y22. We see that Q : I28 represents the ellipse 2yl2 + 32yr2 : l28, that is, o9 j,t' !z' 82 ' 2'-l' If we want to know the direction of the principal axes in the xlx2-coordinates, we have to determine normalized eigenvectors from (A - nl)x : 0 with ), : r\r : Z and l : lz : 32 anďthen use (9), We get f ttÝ11 l -tn/11 l _| anO l _l Lvx,5) L ttÝz) hence f ltxT -ll{2l [y,l *, : yltÝ1 - y2lÝ1 :Xy:l - -ll l."J LltvT llÝ1) Lrr_.] ' x2: y|Ý1 + y2lÝ1, This is a 45o rotation. our results agree with those in Sec. 8.2, Example 1, except for the notations, See also Fig. 158 in that example. I diagonalize. (Show the details.) 1 l: :1 ,|oo SEC. 8.4 Eigenbases. Diagonalization. Quadratic Forms r DlAGoNALlzATloN oF MATRlcEs Find an eigenbasis (a basis of eigenvectors) and (d) Diagonalization. What can you do in (5) if you want to change the order of the eigenvalues in D, for instance, interchange dl : i1 and d22: Á,2? @ stMlLAR MATRIcEs HAvE EQuAL sPEcTRA Verify this for A and Á : P-IAP. Find eigenvectors y of Á. Sho, that x : Py are eigenvectors of A. (Show the details of your work.) |--sOlr4-2] ts.A:I l,p:l l L 0 2) L-: l_] 17. ^ 18. A ELr8] TRANsFoRMATIoN To pRlNclpAL AxEs. coNlc sEcTloNs What kind of conic section (or pair of straight lines) is given by the quadratic form? Transform it to principal axes. Express x' : lxr x2f ínterms of the new coordinate vector yT : [yr y2f, as in Example 6. 19. x12 l 24xp2 - 6xr2 : 5 20. 3x * 4Ý5xlx2 * 7xr2 : 9 21,. 3xt2 * 8xp2 - 3xr2 : g 22. 6x12 * I6xp2 - 6rr' : 20 23. 4x] i 2Ý5xp, l 2x22 : 10 24. 7x12 - 24xp2 : I44 25. x12 - Izxlx2 * xz2 : 35 ,:] ,, :^] ,,f |-s ll 4. t3'L, ,] LI t.o 6.0l f z 5.1 l 6.1 L r.s 1.0_] Lo 7[i : ,) ,L;: l ,l] :|,,:,I l] "[ : j]10. [::,i]-:[:i|] 12. l4 A: [; 1] -: [; i] 15 A :|_: -:],-: [] :] |-: 0l l-s 21 16'A:L, -r)'P:Ll 4_] 8 [_] ': :] 11. (Orthonormal basis) Illustrate Theorem 2 w ith further examples. 
(No basis) Find further 2 X 2 and 3 X 3 matrices without eigenbases. PROJECT. Similarity of Matrices. Similarity is basic, for instance in designing numeric methods. (a) Trace. By definition, the trace of an n X n matrix A, : |a7] is the sum of the diagonal entries, trace A : all l az,z + , , , * ann. Show that the trace equals the sum of the eigenvalues, each counted as often as its algebraic multiplicity indicates. Illustrate this with the matrices in Probs. 1, 3, 5, J,9. (b) Trace of product. Let B : |bio]be n X n. Show that similar matrices have equal traces, by first proving trace AB : i; airbtl : trace BA. i:l L:I (c) Find a relationship between Á in (4) and Á : PAP-i. 355 356 26. 27. 28. 29. CHAP. 8 Linear Algebra: Matrix Eigenvalue Problems 3xr2+22xg2*3xr2:g !2xj -f 32xlx2 * !2x22 : Il2 6.5xj * 5.0xg2 + 6.5 xz2 : 36 (Definiteness) A quadratic form Q(x) : xTAx and its (symmetric!) matrix A are called (a) positive definite if O(x) } 0 for all x * 0, (b) negative definite if QG) < 0 for all x * 0, (c) indefinite if Q(x) takes both positive and negative values. (See Fig, 160,) tQ(x) and A aíecalled positive semidefinite (negative semidefinite) if 0(x) > 0 (0(x) < 0) for all x,] A necessary and sufficient condition for positive definiteness is that all the "principal minors" are positive (see Ref. [B3], vol. 1, p. 306), that is, lal avl a11)0. l l=0. lae azzl (o) positive definite íorm (ó) Negative def inite form (c) lndefinite form Fig.16O. Quadratic forms in two variables \':',":]: :\ >0, 8.5 Complex Matrices and Forms. Optional Show that the form in Prob. 23 is positive definite, whereas that in Prob. 19 is indefinite. 30. (Definiteness) Show that necessary and sufficient for (a), (b), (c) in Prob .29 is that the eigenvalues of A are (a) all positive, (b) all negative, (c) both positive and negative. Hint, |Jse Theorem 5. detA > 0. ;-;,] then u:[';'' The three classes of real matrices in Sec. 8.3 have complex counterParts that are of Practical interest in certain applications, mainly because of their sPectra (see Theorem 1 in this section), for instance, in quantum mechanics. To define these classes, we need the following standard EXAMPLE l Notations IfA _[3+4i Lo ;l;,] and .:[i ;:' ul. 2 + 5i) Notations Á : |aip] is obtained from A - |aip] by replacing each (o, Freal) with its complex conjugatěaro: a - i\,Also, ÁT of Á, hence the conjugate transpose of A. entryajk:a+iB : |doifis the transpose SEC. 8.5 Complex Matrices and Forms. Optional 357 DEFlNlTloN Hermitian, Skew-Hermitian, and Unitary Matrices A square matrix A : |ooif is called Hermitian if Á' : A, that is, al"i : aitc skew-Hermitian if ď : -A, that is, al"i : -a3tt unitary if ď:A-1. The first two classes are named after Hermite (see footnote 13 in Problem Set 5.8). From the definitions we see the following. If A is Hermitian, the entries on the main diagonal must satisfy dj j : aifi that is, they are real. Similarly, if A is skew-Hermitian, thenaii: -ajj. If we setaii: a l iB, thisbecomes q,- iP: -(a + iD. Hence d : 0, so that a3i must be pure imaginary or 0. EXAMPLE 2 Hermitian, Skew-Hermitian, and Unitary Matrices ^ : [ ,1r, '-,"f ' f3i B:I L-z+i 2 + if l+i +\,5] l, C:| l -i _] L+* }i .] are Hermitian, skew-Hermitian, and unitary matrices, respectively, as you may verify by using the definitions. I If a Hermitian matrix is real, then ÁT : AT : A. Hence a real Hermitian matrix is a symmetric matrix (Sec. 8.3.). Similarly, if a skew-Hermitian matrix is real, then ÁT : A- : -A. 
Hence a real skew-Hermitian matrix is a skew-symmetric matrix. Finally, if a unitary matrix is real, then ÁT : AT : A-1. Hence a real unitary matrix is an orthogonal matrix. This shows that Hermitian, skew-Hermitian, and unitaly matrices generalize symmetric, skew-symmetric, and orthogonal matrices, respectively. EigenvaIues It is quite remarkable that the matrices under consideration have spectra (sets of eigenvalues; see Sec. 8.1) that can be characterizedin a general way as follows (see Fig. 161). Skew-Hermitian (skew-symmetric) Unitary (orthogonal) Hermitian (symmetric) Fig. 16l. Location of the eigenvalues of Hermitian, skew-Hermitian, and unitary matrices in the complex ,tr-plane 1 ReL 358 THEoREM l CHAP.8 Linear Algebra: Matrix Eigenvalue Problems Eigenvalues (a) The eigenvalues of a Hermitian matrix (and thus of a symmetric matrix) are real. (b) The eigenvalues of a skew-Hermitian matrix (and thus of a skew-symmetric matrix) are pure imaginary or Zero. (c) The eigenvalues of a unitary matrix (and thus of an orthogonal matrix) have absolute value I. ExAMPLE 3 lllustration of Theorem l For the matrices in Example 2 we find by direct calculation ano|t}V3 ++il':fi+i:L l p R o o F We prove Theorem 1. Let ,tr be an eigenvalue and x an eigenvector of A. MultiPlY Ax : Ax from the left by *-, thus íTAx : ),íTx, and divide by íTx : ítxl, + , , , ,l ínxn : |"rl' + . . . + |x,,|2, which is real and not 0 because x * 0. This gives (1) n:$ (a) If A is Hermitian, ď : A or AT : Á and we show that then the numerator in (1) is real, which makes .tr real. xTAx is a scalar; hence taking the transPose has no effect. Thus (2) xTAx : 1íTAx)T : xTATx : *'Áx : (xTAx). Hence, xTAx equals its complex conjugate, So that it must be real. (a + ib : a - ib impliesb:0.) _ tnl rr A is skew_Hermitian, AT : -Á and instead of (2) we obtain xTAx : - 1Xrax;(3) so that xTAx equals minus its complex conjugate and is pure imaginary or 0, (a + ib : -(a - ib) implies a:0.) (c) Let A be unitary. We take Ax : ,trx and its conjugate transpose (nD':(,l,X)-:^Xand multiply the two left sides and the two right sides, (lx;''{x : ^^X'x : |,\|2XTx. Characteristic Equation Eigenvalues A B C Hermitian Skew-Hermitian Unitary ^2-Lli*18:0 ^2 - 2i^ * 8 :0 ^2 - i^ - 1 :0 9,2 4i, -2i lxE + Lri, -!Ý5 + li SEC. 8.5 Complex Matrices and Forms. Optional But A is unitary, Á- : A-1, so that on the left we obtain (Áx)Ttx : x'Á-Ax : xTA-lAx : xTIx : íTx. Together, íTx: |,l|2XTx. We now divide by xTx (+ 0) to get lrtl' : 1. Hence lrt| : t. This proves Theorem l as well as Theorems 1 and 5 in Sec. 8.3. l Key properties of orthogonal matrices (invariance of the inner product, orthonormality of rows and columns; see Sec. 8.3) generalize to unitary matrices in a remarkable way. To see this, instead of R' we now use the complex vector space Cn of all complex vectors with n complex numbers as components, and complex numbers as scalars. For such complex vectors the inner product is defined by (note the overbar for the complex conjugate) (4) a.b : áTb. The length or norm of such a complex vector is a real number defined by (5) llull : \6Á : \F; : ffi : m. THfoREM 2 lnvariance of lnner product Á unitary transformation, that is, y : Ax with a unitary matrix A, preserves the value of the inner product (4), hence also the norm (5). P R O O F The proof is the same as that of Theorem 2 tn Sec. 8.3, which the theorem generalizes. In the analog of (9), Sec. 8.3, we now have bars, u.v: úTv: 1Áa;Ten : a-ďAb : áTIb : áTb : a.b. 
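The characteristic equations and eigenvalues tabulated above for the matrices A, B, C of Example 2 can be confirmed numerically, and doing so also illustrates the three statements of Theorem 1 (real, pure imaginary, and unit-modulus spectra). A minimal sketch, assuming NumPy:

```python
import numpy as np

# Hermitian, skew-Hermitian, and unitary matrices of Example 2
A = np.array([[4, 1 - 3j], [1 + 3j, 7]])
B = np.array([[3j, 2 + 1j], [-2 + 1j, -1j]])
C = np.array([[1j / 2, np.sqrt(3) / 2], [np.sqrt(3) / 2, 1j / 2]])

for name, M in [("A", A), ("B", B), ("C", C)]:
    lam = np.linalg.eigvals(M)
    print(name, np.round(lam, 4))
# A: 9 and 2 (real); B: 4i and -2i (pure imaginary);
# C: +-sqrt(3)/2 + i/2, both of absolute value 1
print(np.abs(np.linalg.eigvals(C)))   # [1. 1.]
```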
l The complex analog of an orthonormal systems of real vectors (see Sec. 8.3) is defined as follows. DEFlNlTloN Theorem 3 in Sec. 8.3 extends to complex as follows. THI-O,RE'.M' 3 Unitary Systems of Column and Row Vectors A complex square matrix is unitary if and only if its column vectors (and also its row vectors) form a unitary system. 359 Unitary System A unitary system is a set of complex vect Lj'?,l": áj'Uo .ors satisfying the relationships [o if j+k -l-1 [t if j:k. (6) 360 CHAP. 8 Linear Atgebra: Matrix Eigenvalue Problems p R o o F The proof is the same as that of Theorem 3 in Sec. 8.3, except for the bars required in ^T A-1 o_Á in (A\ onÁ t'Á\ nf fhe nrecr lA : A-1 and in (4) and (6) of the present section, THEoREM 4 Determinant of a Unitary Matrix Let A, be a unitary matrix. Then its determinant has absolute value one, that is, |det n| : 1. PRooF Similarly as in Sec. 8.3 we obtain 1 : det (AA-1) : det (,rÁ') : det A det ÁT : det A det Á : det A det A : |det A|2. Hence |det A| : 1 (where det A may now be complex), l ExAMPLE 4 Unitary Matrix tllustrating Theorems lc and 2-4 Forthevectors aT:12 _l]and5r:11 + l 4i]we getáT :|2 i]TandaTb:2(I + i)_ 4: _2+2i and with [o.si 0 6 lA:| l Lo.o 0.8l] also ^" : [;] T-0.8 + 3.2if and Ab:| l. L-2.6 + 0.6i _] as one can readily verify. This gives (ÁDTnn -- -2 * 2i, illustrating Theorem 2. The matrix is unitarY. Its columns form a unitary system, ár-u, : -0.8i,0.8l + 0,62 : I, ár'u2: -0,8',0,6 + 0,6,0,8i : 0, -ur' ur : 0.62 + (-0.8,)0,8, : 1 and so do its rows. Also, detA : -1. The eigenvalues are 0.6 + 0.8' and -0.6 + 0.8i, with eigenvectors t1 1]T and [1 - 1]T, respectively. - l Theorem 2 in Sec. 8.4 on the existence of an eigenbasis extends to complex matrices as follows. THEoREM 5 Basis of Eigenvectors A Hermitian, skew-Hermitian, or unitary matrix has a basis of eigenvectors for C" that is a unitary system. For a proof see Ref. [B3], vol. 1 , pp. 270_2]2 and p. 244 (Definition 2). EXA M P LE 5 Unitary Eigenbases The matrices A, B, C in Example2have the following unitary systems of eigenvectors, as you should verify. 1 A: ftl - zi 5]T (,\ : 9), l ftl - zi -2]T (^: 2) 1 - I -- B: ftl - zi -5]T (l : -2i), .,fu,, l + 2ilT (t : 4i) l - . r l ..-. C: ut l' IIT rn : }tr * v6ll. 6l - llT (^: +u - rÁll, l SEC. 8.5 Complex Matrices and Forms. Optional Hermitian and Skew-Hermitian Forms The concept of a quadratic form (Sec. 8.4) can be extended to complex. We call the numerator XTAx in (1) a form in the components .tr1, . . , xlt of x, which may now be complex. This form is again a sum of n2 terms ,lL fL j_7 k:I : ar171x1+ " , l alnílxn -| a2ní2Xn* a2lí2x1 * l an ínx1l l annxnxn. A is called its coefficient matrix. The form is called a Hermitian or skew_Hermitian form if A is Hermitian or skew-Hermitian, respectively. The value of a Hermitian form is real, and that of a skew-Hermitian form is pure imaginary or zero. This can be seen directly from (2) and (3) and accounts for the importance of these forms in physics. Note that (2) and (3) are valid for any vectors because in the proof of (2) and (3) we did not use that x is an eigenvector but only that xTx is real and not 0. EXAMPLE 6 Hermitian Form For A in Example 2 anď, say, x : [1 + l 5i]T we get x|Ax:Ll_i _5ij [ 4 l -3il[l +il [+tt +1)+(1_3l),5,1 L,*r, , ]L5i _.l :[l -i -si] Lrl +3ixl +t)*r.r,_] :223' l Clearly, if A and x in (4) are real, then (7) reduces to a quadratic form, as discussed in the last section. 1. (Verifrcation) Verify the statements in Examples 2 and 3. 2. 
(Product) Show 6Áf : -AB for A and B in Example 2. For any n X n Hermitian A and skew-Hermitian B. 3. Show that 1ABC;T : -C-IBA for any n x n Hermitian A, skew-Hermitian B, and unitary C. 4. (Eigenvectors) Find eigenvectors of A, B, C in Examples 2 and 3. g. E ElGENvALuEs AND ElGENvEcToRs Are the matrices in Probs. 5-11 Hermitian? SkewHermitian? Unitary? Find their eigenvalues (thereby verifying Theorem 1) and eigenvectors. 36l (7) il Yl -i: ;] \,5 ) tllr lv2 7. l |,l _ LÝz 5 |:, ,) 1-0 6.Í Lzi 00l I 0 .51 l I 5i 0_.] I+i 0 I-i 10.,I ,:,] I |,: , ,,Ll:i] CHAP. 8 Linear Algebra: Matrix Eigenvalue Problems 12. PROJECT. Complex Matrices (a) Decomposition. Show that any square matrix may be written as the sum of a Hermitian and a skew-Hermitian matrix. Give examples, (b) Normal matrix. This important concept denotes a matrix that commutes with its conjugate transpose, Aď : Á'A. P.oue that Hermitian, skew-Hermitian, and unitary matrices are normal, Give corresponding examples of your own. (c) Normality criterion. Prove that A is normal if and only if the Hermitian and skew-Hermitian matrices in (a) commute. (d) Find a simple matrix that is not normal, Find a normal matrix that is not Hermitian, skew-Hermitian, or unitary. (e) Unitary matrices. Prove that the product of two unitary n X n matrices and the inverse of a unitary matrix are unitary. Give examples, (f) Powers of unita,ry matrices in applications may sometimes be very simple. Show that C12 : I in Example 2. Find further examples. @ coMpLEx FoRMs -sthe given matrix (call it A) Hermitian or skew_Hermitian? Find íTAx.(Show all the details.) a, b, c, k are real 16. (Pauti spin matrices) Find the eigenvalues and eigenvectors of the so-calledPauli spinmatrices and show Ň SrS, : iS", SgS" : -iS, S"2 - Sr' : Sr2 : I, where ,. [-,,, ;,] ,": [1 ]:] ,o. |, _,,, *o',f, - : [;;] ",["_, ,l,] ,-: [;,] ,": [: ;] ',: [: ..: [; :] ;] 1. In solving an eigenvalue problem, what is given and what is sought? 2, Do there exist square matrices without eigenvalues? Eigenvectors corresponding to mofe than one eigenvalue of a given matrix? 3. What is the defect? Why is it important? Give examples, 4. Can a complex matrix have real eigenvalues? Real eigenvectors? Give reasons. 5. What is diagonalization of a matrix? Transformation of a form to principal axes? 6. What is an eigenbasis? When does it exist? Why is it important? 7. Does a 3 X 3 matrix always have a real eigenvalue? 8. Give a few typical applications in which eigenvalue problems occur. E DlAGoNALlzATloN Find an eigenbasis and diagonalize. (Show the details,) t l0l 121 9. l l L-l++ - l03_.l |- I4.4 10. l L-tt.z -11.21 102.6_] 11. |-14 10 l L-,o i t.] |- 15 4 1.10 l L-tz -2 1],^: ,-12. z1 ;l -s] ]l ^:, ,] t5 3 13.|, ? L-. -a ,.L;,: _.,_-- Summary of Chapter 8 @ slMlLARlTy Verify that A and Á : Here, A, P are: 15. ['' 241 |l Lz.+ o.zl'Lz P-IAP have the same spectrum. 17[1 ',,_:] [i 1| 363 Transformation to Canonical Form. Reduce the quadratic form to principal axes. 18. 11.56xr2 + 2o.I6x 2 -| I7.44x22: 100 19. 1.09x12 - 0.06xrx2 l l.\lxr2 : 1 20. I4x12 l 24xp2 - 4xr2 : 29 i] [ :| "|',1 ::;] The practical importance The problems are defined (1) of matrix eigenvalue problems can hardly be overrated. by the vector equation Ax : ),x. A is a given square matrix. A1l matrices in this chapter ate square. ,tr is a scalar. 
To solve the problem (1) means to determine values of λ, called eigenvalues (or characteristic values) of A, such that (1) has a nontrivial solution x (that is, x ≠ 0), called an eigenvector of A corresponding to that λ. An n × n matrix has at least one and at most n numerically different eigenvalues. These are the solutions of the characteristic equation (Sec. 8.1)

(2)    $$D(\lambda) = \det(A - \lambda I) = \begin{vmatrix} a_{11}-\lambda & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22}-\lambda & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn}-\lambda \end{vmatrix} = 0.$$

D(λ) is called the characteristic determinant of A. By expanding it we get the characteristic polynomial of A, which is of degree n in λ. Some typical applications are shown in Sec. 8.2.

Section 8.3 is devoted to eigenvalue problems for symmetric (Aᵀ = A), skew-symmetric (Aᵀ = −A), and orthogonal matrices (Aᵀ = A⁻¹). Section 8.4 concerns the diagonalization of matrices and the transformation of quadratic forms to principal axes and its relation to eigenvalues. Section 8.5 extends Sec. 8.3 to the complex analogs of those real matrices, called Hermitian (Āᵀ = A), skew-Hermitian (Āᵀ = −A), and unitary matrices (Āᵀ = A⁻¹). All the eigenvalues of a Hermitian matrix (and a symmetric one) are real. For a skew-Hermitian (and a skew-symmetric) matrix they are pure imaginary or zero. For a unitary (and an orthogonal) matrix they have absolute value 1.
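As a closing numerical illustration of this summary (a sketch only, assuming NumPy), the eigenvalues of the matrix from Example 1 of Sec. 8.4 can be obtained either as roots of the characteristic polynomial (2) or directly, and each eigenpair can be checked against the defining equation (1):

```python
import numpy as np

A = np.array([[5.0, 3.0],
              [3.0, 5.0]])       # symmetric, so real eigenvalues are expected

# Route 1: roots of the characteristic polynomial det(A - lambda*I) = 0
coeffs = np.poly(A)              # characteristic polynomial coefficients
print(np.roots(coeffs))          # 8 and 2

# Route 2: eigenvalues and eigenvectors directly
lam, X = np.linalg.eig(A)
print(lam)                       # 8 and 2 (order may vary)

# Check the defining equation Ax = lambda*x for each eigenpair
for j in range(len(lam)):
    assert np.allclose(A @ X[:, j], lam[j] * X[:, j])
```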