Eigenvalue problems Existence, uniqueness and conditioning Computation Bi7740: Scientific computing Eigenvalues and eigenvectors Vlad Popovici popovici@iba.muni.cz Institute of Biostatistics and Analyses Masaryk University, Brno Vlad Bi7740: Scientific computing Eigenvalue problems Existence, uniqueness and conditioning Computation Outline 1 Eigenvalue problems Eigenvalue problems 2 Existence, uniqueness and conditioning 3 Computation Special forms Power iteration Generalized eigenvalue problem Vlad Bi7740: Scientific computing Eigenvalue problems Existence, uniqueness and conditioning Computation Supplemental bibliography - online http: //web.eecs.utk.edu/~dongarra/etemplates/book.html Vlad Bi7740: Scientific computing Eigenvalue problems Existence, uniqueness and conditioning Computation Eigenvalue problems Outline 1 Eigenvalue problems Eigenvalue problems 2 Existence, uniqueness and conditioning 3 Computation Special forms Power iteration Generalized eigenvalue problem Vlad Bi7740: Scientific computing Eigenvalue problems Existence, uniqueness and conditioning Computation Eigenvalue problems Eigenvalue problems Standard eigenvalue problem Given a square matrix A ∈ Mn×n(R), find a scalar λ and a vector x ∈ Rn , x 0, such that Ax = λx. λ is called eigenvalue and x is called eigenvector a similar "left" eigenvector can be defined as yT A = λyT , but this would be equivalent to a "right" eigenvalue problem (as above) with AT as matrix the definition can be extended to complex-valued matrices λ can be complex, even if A ∈ Mn×n(R) Vlad Bi7740: Scientific computing Eigenvalue problems Existence, uniqueness and conditioning Computation Eigenvalue problems Characteristic polynomial previous eq. is equivalent to (A − λI)x = 0 which admits nonzero solutions if and only if (A − λI) is singular, i.e. det(A − λI) = 0 det(. . . ) is the characteristic polynomial of matrix A and its roots λi are the eigenvalues of A (from Fundamental Theorem of Algebra) for an n × n matrix there are n eigenvalues (may not all be real or distinct) Vlad Bi7740: Scientific computing Eigenvalue problems Existence, uniqueness and conditioning Computation Eigenvalue problems reciprocal: a polynomial p(λ) = c0 + c1λ + cn−1λn−1 + λn has a companion matrix   0 0 . . . 0 −c0 1 0 . . . 0 −c1 ... ... ... ... ... 0 0 . . . 1 −cn−1   the characteristic polynomial is not used in numerical computation, because: finding its roots may imply an infinite number of steps of the sensitivity of the coefficients too much work to compute the coefficients and find the roots Vlad Bi7740: Scientific computing Eigenvalue problems Existence, uniqueness and conditioning Computation Eigenvalue problems Example Let A = 0 1 0 −1 . The characteristic equation is det(A − λI) = 0 ⇔ λ2 + λ = 0 with solutions λ1 = 0 and λ2 = −1. For eigenvectors v1, v2 (non-null!): (A − λ1I)v1 = 0 1 0 −1 v11 v21 = v21 −v21 := 0 0 so v21 = 0. We choose v11 such that v1 = 1, so v11 = 1. Similarly, for λ2 = −1 we get v2 = 1/ √ 2 −1/ √ 2 . Vlad Bi7740: Scientific computing Eigenvalue problems Existence, uniqueness and conditioning Computation Eigenvalue problems Example - in Matlab 1 >> A = [0 1; 0 −1]; 2 >> [V, L] = eig(A) % V: eigenvectors, L: eigenvalues 3 V = 4 1.0000 −0.7071 5 0 0.7071 6 L = 7 0 0 8 0 −1 9 >> eig(A) % only eigenvalues 10 11 >> roots(poly(A)) % not the way to go normally Vlad Bi7740: Scientific computing Eigenvalue problems Existence, uniqueness and conditioning Computation Eigenvalue problems Sensitivity of the characteristic polynomial let A = 1 1 with > 0 and slightly smaller than mach the exact eigenvalues are 1 + and 1 − in floating-point arithmetic, det(A − λI) = λ2 − 2λ + (1 − 2 ) = λ2 − 2λ + 1 with the solution 1 (double root) Vlad Bi7740: Scientific computing Eigenvalue problems Existence, uniqueness and conditioning Computation Eigenvalue problems a simple eigenvalue is a simple solution of the characteristic polynomial (multiplicity of the root is 1) a defective matrix has eigenvalues with multiplicity larger than 1, meaning less than n independent eigenvectors a nondefective matrix has exactly n linearly independent eigenvectors and can be diagonalized Q−1 AQ = Λ where Q is a nonsingular matrix of eigenvectors Matlab: Adiag = inv(Q)*A*Q Vlad Bi7740: Scientific computing Eigenvalue problems Existence, uniqueness and conditioning Computation Eigenvalue problems Eigen-decomposition it follows that if A admits n independent eigenvectors, it can be decomposed (factorized) as A = QΛQ−1 with Q having the eigenvectors of A as columns, and Λ a diagonal matrix with eigenvalues on the diagonal theoretically, A−1 = QΛ−1 Q−1 (if λi 0 and all eigenvalues are distinct) if A is normal (AH A = AH A) then Q becomes unitary if A is real symmetric, then Q is orthogonal Vlad Bi7740: Scientific computing Eigenvalue problems Existence, uniqueness and conditioning Computation Eigenvalue problems Eigenvectors the eigenvectors can be arbitrarily scaled usually, the eigenvectors are normalized, x = 1 the eigenspace is Sλ = {x|Ax = λx} a subspace S ⊂ Rn is invariant if AS ⊆ S for xi eigenvectors, span({xi}) is an invariant subspace Vlad Bi7740: Scientific computing Eigenvalue problems Existence, uniqueness and conditioning Computation Eigenvalue problems Some useful properties det(A) = N i=1 λni i , where ni is the multiplicity of eigenvalue λi tr(A) = N i=1 niλi the eigenvalues of A−1 are λ−1 i (for λi 0) the eigenvectors of A−1 are the same as those of A A admits an eigen-decomposition if all eigenvalues are distinct if A is invertible it does not imply that it can be eigen-decomposed; reciprocally, if A admits an eigen-decomposition, it does not imply it can be inverted A can be inverted if and only if λi 0, ∀i Vlad Bi7740: Scientific computing Eigenvalue problems Existence, uniqueness and conditioning Computation Eigenvalue problems Before solving an eigenvalue problem... do I need all the eigenvalues? do I need the eigenvectors as well? is A real or complex? is A small, dense or large and sparse? is there anything special about A? e.g.: symmetric, diagonal, orthogonal, Hermitian, etc etc Vlad Bi7740: Scientific computing Eigenvalue problems Existence, uniqueness and conditioning Computation Special forms Power iteration Generalized eigenvalue problem conditioning of EV problem is different than conditioning of linear systems for the same matrix sensitivity is "not uniform" among eigenvectors/eigenvalues for a simple eigenvalue λ, the condition is 1/ yH x|, where x and y are the corresponding right and left normalized eigenvectors (and yH is the conjugate transpose) so the condition is 1/ cos(x, y) a perturbation of order in A may perturb the eigenvalue λ by as much as / cos(x, y) for special cases of A, special forms of conditioning can be derived Vlad Bi7740: Scientific computing Eigenvalue problems Existence, uniqueness and conditioning Computation Special forms Power iteration Generalized eigenvalue problem Outline 1 Eigenvalue problems Eigenvalue problems 2 Existence, uniqueness and conditioning 3 Computation Special forms Power iteration Generalized eigenvalue problem Vlad Bi7740: Scientific computing Eigenvalue problems Existence, uniqueness and conditioning Computation Special forms Power iteration Generalized eigenvalue problem Computation - general ideas a matrix B is similar to A if there exists a nonsingular matrix T such that B = T−1 AT if y is an eigenvector of B then x = Ty is an eigenvector of A and HOMEWORK: prove that A and B have the same eigenvalues transformations: shift: A ← A − σI inversion: A ← A−1 (if A is nonsingular) power: A ← Ak polynomial: let p be a polynomial, then A ← p(A) Vlad Bi7740: Scientific computing Eigenvalue problems Existence, uniqueness and conditioning Computation Special forms Power iteration Generalized eigenvalue problem Forms attainable by similarity For a matrix A with given property, the matrices T and B exist such that B = T−1 AT has the desired property: A T B distinct eigenvalues nonsingular diagonal real symmetric orthogonal real diagonal complex Hermitian unitary real diagonal normal unitary diagonal arbitrary real orthogonal real block triangular (Schur) arbitrary unitary upper triangular (Schur) arbitrary nonsingular almost diagonal Vlad Bi7740: Scientific computing Eigenvalue problems Existence, uniqueness and conditioning Computation Special forms Power iteration Generalized eigenvalue problem If A is diagonal... the eigenvalues are the diagonal entries the eigenvectors are the columns of the identity matrix If a matrix is not diagonalizable, one can obtain a Jordan form:   λ1 1 λ1 1 λ1 λ2 1 λ2 λ3 ... λk 1 λk   Vlad Bi7740: Scientific computing Eigenvalue problems Existence, uniqueness and conditioning Computation Special forms Power iteration Generalized eigenvalue problem If A is triangular (Schur form, in general)... eigenvalues are the elements on the diagonal eigenvectors are obtained as follows: If A − λI =   U11 u U13 0 0 vT O 0 U33   is triangular, then U11y = u can be solved for y, so that x =   y −1 0   is the corresponding eigenvector Vlad Bi7740: Scientific computing Eigenvalue problems Existence, uniqueness and conditioning Computation Special forms Power iteration Generalized eigenvalue problem Symmetric matrices - Jacobi method idea: start with a symetric matrix A0 and iteratively form Ak+1 = JT k Ak Jk , where Jk is a plane rotation chose to annihilate a symmetric pair of entries in Ak with the goal of diagonalizing A a rotation matrix has the form cos θ sin θ − sin θ cos θ the problem is to find θ for A = a b b c and requiring that JT AJ is diagonal, we obtain 1 + tan θ a − c b − tan2 θ = 0 from which we use the root with the smallest magnitude Vlad Bi7740: Scientific computing Eigenvalue problems Existence, uniqueness and conditioning Computation Special forms Power iteration Generalized eigenvalue problem Outline 1 Eigenvalue problems Eigenvalue problems 2 Existence, uniqueness and conditioning 3 Computation Special forms Power iteration Generalized eigenvalue problem Vlad Bi7740: Scientific computing Eigenvalue problems Existence, uniqueness and conditioning Computation Special forms Power iteration Generalized eigenvalue problem Power iterations is the simplest metod to compute one eigenvalue-eigenvector pair the matrix is repeatedly multiplied by an intial starting vector let λ1 be the absolute largest eigenvalue of A, with the corresponding eigenvector v1 start with x0 0 and iterate: xk = Axk−1 k = 1, 2, . . . the process converges to a scaled version of v1 corresponding to the eigenvalue λ1 Vlad Bi7740: Scientific computing Eigenvalue problems Existence, uniqueness and conditioning Computation Special forms Power iteration Generalized eigenvalue problem Power iterations - geometrical interpretation v1v2 x0 x1 xn Vlad Bi7740: Scientific computing Eigenvalue problems Existence, uniqueness and conditioning Computation Special forms Power iteration Generalized eigenvalue problem Power iterations - convergence Let v1, . . . , vn be the eigenvectors of A. Then, any vector x0 can be written as x0 = n i=1 αivi. Then xk =Axk−1 = · · · = Ak x0 = n i=1 λk i αivi =λk 1  α1v1 + n i=2 (λi/λ1)k αivi   and limk→∞(λi/λ1)k → 0 since |λi/λ1| < 1. Vlad Bi7740: Scientific computing Eigenvalue problems Existence, uniqueness and conditioning Computation Special forms Power iteration Generalized eigenvalue problem Power iterations, cont’d theoretically, it can happen that x0 has no component in v1 (i.e. α1 = 0) the iterations cannot converge to a complex solution there might be several equally large and maximal eigenvalues, so the iterations converge to a linear combination of the corresponding eigenvectors the values of xk grow geometrically with k and this can lead to over-/under-flow → use normalization: at each step normalize xk by xk ∞ the rate of convergence depends on |λ2/λ1|: smaller the ratio, faster the convergence → it might be possible to find a shift by σ such that |(λ2 − σ)/(λ1 − σ)| < |λ2/λ1| which accelerates convergence Vlad Bi7740: Scientific computing Eigenvalue problems Existence, uniqueness and conditioning Computation Special forms Power iteration Generalized eigenvalue problem Power iterations: Exercise Implement the following procedure in Matlab, to find the largest eigenvalue and the corresponding eignvector for a matrix A: start with an initial vector x0 0, λ0 = 0 for k = 1, 2, . . . compute the new approximation of the eigenvector: xk = Axk−1 xk−1 ∞ eigenvalue: λk = max{x1k , . . . , xnk } stop iterating if a maximum number of iterations has been attained or if the changes between two consecutive iterations is below a threshold: xk − xk−1 < and |λk − λk−1| < scale the final approximation such that xK = 1 Scaling at each iteration prevents over-/under-flow and ensures that the largest component of xk is λk . Vlad Bi7740: Scientific computing Eigenvalue problems Existence, uniqueness and conditioning Computation Special forms Power iteration Generalized eigenvalue problem Inverse iteration if the smallest eigenvalue is needed: the eigenvalues of A are the reciprocals of the eigenvalues of A−1 . Try: [v,l] = eig_power(inv(A)); l = 1/l; inverse iteration scheme: Ayk = xk−1 xk = yk / yk ∞ this is equivalent to power iterations applied to A A−1 is not computed explicitly factorization of A is used to solve the system of linear eqs. converges to the eigenvector corresponding to the smallest eigenvalue the shifting strategy can also be applied Vlad Bi7740: Scientific computing Eigenvalue problems Existence, uniqueness and conditioning Computation Special forms Power iteration Generalized eigenvalue problem Shifted inverse power iterations if one wants eigenvalues close to a certain value s (not equal to any eigenvalue): transform the problem: Av = λv −→ (A − sI)v = (λ − s)v the eigenvalue sought is λs = 1 largest eigenvalue of (A − sI)−1 + s this method works only if there is a single eigenvalue λs try: [v,l] = eig_power(inv(A − ... s*eye(size(A)))); l = 1/l+s; Vlad Bi7740: Scientific computing Eigenvalue problems Existence, uniqueness and conditioning Computation Special forms Power iteration Generalized eigenvalue problem Rayleigh quotient let A be a real matrix with x an approximate eigenvector to find the corresponding eigenvalue λ one can solve the system Ax λx for λ unknown (n × 1 least squares approx. problem) form normal eqs.: xT Ax = λxT x and obtain the LS solution λ = xT Ax xT x this is the Rayleigh quotient Vlad Bi7740: Scientific computing Eigenvalue problems Existence, uniqueness and conditioning Computation Special forms Power iteration Generalized eigenvalue problem Rayleigh q., cont’d R.q. gives a good estimate of the eigenvalue corresponding to an eigenvector R.q. can be used as a shift to speed up convergence of the inverse iteration Rayleigh quotient iteration: for some x0 0, σk = xT k Axk xT k xk solve (A − σk I)yk+1 = xk xk+1 = yk+1/ yk+1 ∞ usually 2-3 iterations are enough Vlad Bi7740: Scientific computing Eigenvalue problems Existence, uniqueness and conditioning Computation Special forms Power iteration Generalized eigenvalue problem Rayleigh q., cont’d R.q. iteration is very efficient for symmetric matrices solving a different system (different shift) at each iteration introduces some overhead, depending on the form of the matrix the method can be extended to complex matrices using the conjugate transpose the R.q. has values between the minimum and maximum eigenvalues of A - this is sometimes called numerical range (or field of values) of the matrix A Vlad Bi7740: Scientific computing Eigenvalue problems Existence, uniqueness and conditioning Computation Special forms Power iteration Generalized eigenvalue problem Deflation computes sequentially each of the (eigenvalue, eigenvector) pairs if λ1 and x1 are already computed, then transform the matrix to remove them and proceed to compute λ2 and x2; iterate this process is known as deflation let H be a nonsingular matrix such as Hx1 = α1e1 (e.g. a Householder transformation) apply this transformation to A: HAH−1 = λ1 bT 0 B where B is a matrix of order n − 1 with eigenvalues λ2, . . . , λn Vlad Bi7740: Scientific computing Eigenvalue problems Existence, uniqueness and conditioning Computation Special forms Power iteration Generalized eigenvalue problem Deflation, cont’d then, use B to compute λ2 eigenvectors and eigenvalues of B are linked to those of A as follows: if y2 is an eigenvector of B corresponding to λ2, then x2 = H−1 α y2 where α = bT y2 λ2 − λ1 , provided that λ1 λ2. ...and repeat... Vlad Bi7740: Scientific computing Eigenvalue problems Existence, uniqueness and conditioning Computation Special forms Power iteration Generalized eigenvalue problem Alternative deflation let u1 be a vector such that uT 1 x1 = λ1 it follows that A − x1uT 1 has the eigenvalues 0, λ2, . . . , λn examples of u vectors: u1 = λ1x1 if A is symmetric and x1 2 = 1 u1 = λ1y1 where y1 is the left eigenvector normalized such that yT 1 x1 = 1 u1 = AT ek , is x1 is normalized such that x1 ∞ = 1 and the k−th component of x1 is 1. Vlad Bi7740: Scientific computing Eigenvalue problems Existence, uniqueness and conditioning Computation Special forms Power iteration Generalized eigenvalue problem Simultaneous iteration in power iteration method, xk = Axk−1 converged to an eigevector why not using a matrix, such that Xk = AXk−1 would converge simultaneously to several eigenvectors? start with a n × p matrix X0 of rank p and iterate Xk = AXk−1 span(bXk ) converges to an invariant space determined by the p largest eigenvalues of A, provided that |λp| > |λp+1| the method is called simultaneous iteration of subspace iteration Vlad Bi7740: Scientific computing Eigenvalue problems Existence, uniqueness and conditioning Computation Special forms Power iteration Generalized eigenvalue problem Simultaneous iteration, cont’d to avoid over-/under-flow and bad conditioning with increasing k, the columns of Xk need to be normalized using the reduced QR decomposition avoids these problems: Qk Rk = Xk−1 QR decomposition of previous X Xk = AQk this is the orthogonal iteration scheme if the eigenvalues are distinct in modulus, the process converges to a block trinagular form if p = n and X0 = I, the series of matrices Ak = QH k AQk converges to (block) triangular form yielding all the eigenvalues of A Vlad Bi7740: Scientific computing Eigenvalue problems Existence, uniqueness and conditioning Computation Special forms Power iteration Generalized eigenvalue problem Simultaneous iteration, cont’d alternative: QR iteration: compute Ak without forming the product explicitly start with A0 = A and at step k: Qk Rk = Ak−1 compute the QR decomposition Ak = Rk Qk form the inverse product diagonal entries (or eigenvalues of diagonal blocks) converge to eigenvalues of A the product of orthogonal matrices Qk converges to the matrix of corresponding eigenvectors if A is symmetric, Ak converges to a diagonal matrix special forms of A lead to faster convergence → use some pretransformations of the matrix to speed up Vlad Bi7740: Scientific computing Eigenvalue problems Existence, uniqueness and conditioning Computation Special forms Power iteration Generalized eigenvalue problem cost of QR iteration method: for symmetric matrices: ∼ 4 3 n3 for eigenvalues only and ∼ 9n3 for both eigenvalues and eigenvectors for general matrices: ∼ 10n3 for eigenvalues only and ∼ 25n3 for both eigenvalues and eigenvectors other methods: Krylov subspace methods: reduce A to a tridiagonal matrix and find eigen-values/-vectors by QR Lanczos method for symmetric matrices spectrum-slicing: for real symmetric matrices, it can find how many eigenvalues are below a given σ ∈ R → by "slicing" the space, the eigenvalues can be isolated; see also the Sturm sequence Vlad Bi7740: Scientific computing Eigenvalue problems Existence, uniqueness and conditioning Computation Special forms Power iteration Generalized eigenvalue problem Outline 1 Eigenvalue problems Eigenvalue problems 2 Existence, uniqueness and conditioning 3 Computation Special forms Power iteration Generalized eigenvalue problem Vlad Bi7740: Scientific computing Eigenvalue problems Existence, uniqueness and conditioning Computation Special forms Power iteration Generalized eigenvalue problem The generalized eigenvalue problem it has the form Ax = λBx where A and B are n × n matrices if any of A or B is nonsingular, the problem can be transformed in a standard eigenvalue problem this is not recommended because loss of accuracy (roundoff errors) and loss of symmetry (if one of A or B is symmetric better: use the QZ algorithm Vlad Bi7740: Scientific computing Eigenvalue problems Existence, uniqueness and conditioning Computation Special forms Power iteration Generalized eigenvalue problem QZ algorithm if A and B are triangular, the eigenvalues are λi = aii/bii, for bii 0 the QZ algorithm reduces A and B simultaneously to upper triangular matrices by orthogonal transformations Vlad Bi7740: Scientific computing Eigenvalue problems Existence, uniqueness and conditioning Computation Special forms Power iteration Generalized eigenvalue problem Singular Value Decomposition - again we saw that SVD of a m × n matrix A has the form A = UΣVT where U is m × m orthogonal matrix and V is n × n orthogonal matrix and Σ is m × n diagonal matrix with non-negative elements on the diagonal this is a eigenvalue-like problem the columns of U and V are the left and right singular vectors, respectively and σii are the singular values Vlad Bi7740: Scientific computing Eigenvalue problems Existence, uniqueness and conditioning Computation Special forms Power iteration Generalized eigenvalue problem The relation between SVD and the eigen-decomposition SVD can be applied to any m × n matrix, while the eigen-decomposition is applied only to square matrices the singular values are non-negative while the eigenvalues can be negative let A = UΣVT be SVD of A ⇒ AT A = (VΣT UT )(UΣVT ) = VΣT ΣVT also, AT A is symmetric real matrix, so it has a eigendecomposition AT A = QΛQT , with Q orthogonal. By unicity of decompositions, it follows that ΣT Σ = Λ V = Q so σi = √ λi Vlad Bi7740: Scientific computing