8 Codes Code Uectors [I]fthe machine can detect an error, why can't it locate the position of the error and correct it? —Richard W. Hamming Quoted in Thomas M. Thompson From Error-Correcting Codes Through Sphere Packings to Simple Groups Cams Mathematical Monographs 21, Mathematical Association of America, 1983, p. 17 The modern theory of codes originated with the work of the American mathematician and computer scientist CI )1), whose 1937 thesis showed how algebra could play a role in the design and analysis of electrical circuits. Shannon would later be instrumental in the formation of the field of information theory and give the theoretical basis for what are now called error-correcting codes. 620 Throughout history, people have transmitted information using codes. Sometimes the intent is to disguise the message being sent, such as when each letter in a word is replaced by a different letter according to a substitution rule. Although fascinating, these secret codes, or ciphers, are not of concern here; they are the focus of the field of cryptography. Rather, we will concentrate on codes that are used when data must be transmitted electronically. A familiar example of such a code is Morse code, with its system of dots and dashes. The advent of digital computers in the 20th century led to the need to transmit massive amounts of data quickly and accurately. Computers are designed to encode data as sequences of 0s and Is. Many recent technological advancements depend on codes, and we encounter them every day without being aware of them: Satellite communications, compact disc players, the universal product codes (UPC) associated with the bar codes found on merchandise, and the international standard book numbers (ISBN) found on every book published today are but a few examples. In this section, we will use vectors to design codes for detecting errors that may occur in the transmission of data. In later sections, we will construct codes that can not only detect but also correct errors. The vectors that arise in the study of codes are not vectors in W but vectors in Z"2 or, more generally, Z"m. We first encountered such vectors in Section 1.1. Since computers represent data in terns of 0s and Is (which can be interpreted as off/on, closed/open, false/true, or no/yes), we begin by considering binary codes, which consist of vectors with entries in Z2. In practice, we have a message (consisting of words, numbers, or symbols) that we wish to transmit. We begin by encoding each "word" of the message as a binary vector. A binary code is a set of binary vectors (of the same length) called code vectors. The process of converting a message into code vectors is called encoding, and the reverse process is called decoding. 63247_08_ch08_p620-649.indd 620 01/11/13 12:54 PM Section 8.1 Code Vectors Example 8.1 As we will see, it is highly desirable that a code have other properties as well, such as the ability to spot when an error has occurred in the transmission of a code vector and, if possible, to suggest how to correct the error. Suppose that we have already encoded a message as a set of binary code vectors. We now want to send the binary code vectors across a channel (such as a radio transmitter, a telephone line, a fiber optic cable, or a CD laser). Unfortunately, the channel may be "noisy" (because of electrical interference, competing signals, or dirt and scratches). As a result, errors may be introduced: Some of the Os may be changed to Is, and vice versa. How can we guard against this problem? We wish to encode and transmit a message consisting of one of the words up, down, left, or right. We decide to use the four vectors in Z2 as our binary code, as shown in Table 8.1. If the receiver has this table too and the encoded message is transmitted without error, decoding is trivial. However, lets suppose that a single error occurred. (By an error, we mean that one component of the code vector changed.) For example, suppose we sent the message "down" encoded as [0, 1] but an error occurred in the transmission of the first component and the 0 changed to a 1. The receiver would then see [1,1] instead and decode the message as "right." (We will only concern ourselves with the case of single errors such as this one. In practice, it is usually assumed that the probability of multiple errors is negligibly small.) Even if the receiver knew (somehow) that a single error had occurred, he or she would not know whether the correct code vector was [0,1 ] or [ 1,0]. Table 8.1 Message up down left right Code [0,0] [0,1] [1.0] [1.1] t suppose we sent the messa a binary code of length 3, as ;e using a code that shown in Table 8.2. was a subset ofZ32—in Table 8.2 Message up down left right Code [0, 0, 0] [0, 1, 1] [1,0, 1] [1, 1,0] The term parity comes from the Latin word par, meaning "equal" or "even." Two integers are said to have the same parity if they are both even or both odd. This code can detect any single error. For example, if "down" was sent as [0, 1, 1] and an error occurred in one component, the receiver would read either [1, 1, 1] or [0, 0, 1] or [0, 1, 0], none of which is a code vector. So the receiver would know that an error had occurred (but not where) and could ask that the encoded message be retransmitted. (Why wouldn't the receiver know where the error was?) The code in Table 8.2 is an example of an error-detecting code. Until the 1940s, this was the best that could be achieved. The advent of digital computers led to the development of codes that could correct as well as detect errors. We will consider these in Sections 8.2, 8.4, and 8.5. 63247_08_ch08_p620-649.indd 621 01/11/13 12:54 PM Chapter 8 Codes The message to be transmitted may itself consist of binary vectors. In this case, a simple but useful error-detecting code is a parity check code, which is created by appending an extra component—called a check digit—to each vector so that the parity (the total number of Is) is even. Example 8.2 If the message to be sent is the binary vector [1, 0, 0, 1, 0, 1], which has an odd number of Is, then the check digit will be 1 (in order to make the total number of Is in the code vector even) and the code vector will be [1, 0, 0, 1, 0, 1, 1]. Note that a single error will be detected, since it will cause the parity of the code vector to change from even to odd. For example, if an error occurred in the third component, the code vector would be received as [1, 0, 1, 1, 0, 1, 1], whose parity is odd because it has five Is. Let's look at this concept a bit more formally. Suppose the message is the binary vector b = [bl,b2,..., b„] in Z2. Then the parity check code vector is v = [bl,b2,..., bn, d] in Z£+1, where the check digit d is chosen so that by + b2 + ■ • ■ + bn + d = 0 in Z2 or, equivalently, so that 1-v = 0 where l = [l,l,...,l],a vector whose every component is 1. The vector 1 is called a check vector. If vector v' is received and 1-v' = 1, then we can be certain that an error has occurred. (Although we are not considering the possibility of more than one error, observe that this scheme will not detect an even number of errors.) Parity check codes are a special case of the more general check digit codes, which we will consider after first extending the foregoing ideas to more general settings. Codes using m-ary vectors are called m-ary codes. The next example is a direct extension of Example 8.2 to ternary codes. Example 8.3 Let b = [by, b2, . . . , b„] be a vector in Z". Then a check digit code vector may be defined by v = [by, b2,..., bn, d] (in Z"+1), with the check digit d chosen so that 1-v 0 (where the check vector 1 digit satisfies [1,1... 1] is the vector of Is in Z"+1); that is, the check by + b2 + ■ • ■ + bn + d = 0 in Z3 For example, consider the vector u = [2, 2, 0, 1,2]. The sum of its components is 2 + 2 + 0 + 1+ 2 = 1, so the check digit must be 2 (since 1+2 = 0). Therefore, the associated code vector is v = [2, 2, 0, 1, 2, 2]. While simple check digit codes will detect single errors, it is often important to catch other common types of errors as well, such as the accidental interchange, or transposition, of two adjacent components. (For example, transposing the second 63247_08_ch08_p620-649.indd 622 01/11/13 12:54 PM Section 8.1 Code Vectors and third components of v in Example 8.3 would result in the incorrect vector v' = [2, 0, 2, 1, 2, 2].) For such purposes, other types of check digit codes have been designed. Many of these simply replace the check vector 1 by some other carefully chosen vector c. Example 8.4 0 IMI74927"0209 Figure 8.1 A Universal Product Code Example 8.5 The Universal Product Code, or UPC (Figure 8.1), is a code associated with the bar codes found on many types of merchandise. The black and white bars that are scanned by a laser at a store's checkout counter correspond to a 10-ary vector u = [ult u2,..., un, d] of length 12. The first 11 components form a vector in Z[J that gives manufacturer and product information; the last component d is a check digit chosen so that c • u is the vector [3,1,3,1,3,1,3, 1, 3, 1, 3, 1]. That is, after rearrangin 0 in Z10, where the check vector c 3(m! + u3 + u5 + u7 + u9 + un) + (u2 + m4 + u6 + u& + ul0) + d = 0 where d is the check digit. In other words, the check digit is chosen so that the left-hand side of this expression is a multiple of 10. For the UPC shown in Figure 8.1, we can determine that the check digit is 6, performing all calculations in Z10: c-u = 3-0 + 7 + 3-4 + 9 + 3-2 + 7 + 3-0 + 2 + 3-0 + 9 + 3-4 + d = 3(0 + 4 + 2 + 0 + 0 + 4) + (7 + 9 + 7 + 2 + 9) + d = 3(0) + 4 + d = 4 + d The check digit d must be 6 to make the result of the calculation 0 in Z10. (Another way to think of the check digit in this example is that it is chosen so that c • u will be a multiple of 10.) The Universal Product Code will detect all single errors and most transposition errors in adjacent components. To see this last point, suppose that the UPC in Example 8.4 were incorrectly written as u' = [0, 7, 4, 2, 9, 7, 0, 2, 0, 9, 4, 6], with the fourth and fifth components transposed. When we applied the check vector, we would have c • u' = 4 + 0 (verify this!), alerting us to the fact that there had been an error. (See Exercises 15 and 16.) The 10-digit International Standard Book Number (ISBN-10) code is another widely used check digit code. It is designed to detect more types of errors than the Universal Product Code and, consequently, is slightly more complicated. Yet the basic principle is the same. The code vector is a vector in Z [J. The first nine components give country, publisher, and book information; the tenth component is the check digit. Suppose the ISBN-10 for a book is 0-534-34450-X. It is recorded as the vector b = [0, 5, 3,4, 3, 4, 4, 5,0, X] where the check "digit" is the letter X. For the ISBN-10 code, the check vector is the vector c = [10, 9, 8, 7,6, 5, 4, 3, 2,1], and we require that c • b = 0 in Zn. Let's determine the check digit for the vector b in this example. We must compute c-b=10-0 + 9-5 + 8-3 + 7-4 + 6-3+5-4 + 4-4 + 3-5+2-0+d 63247_08_ch08_p620-649.indd 623 01/11/13 12:54 PM 0074927020946 Chapter 8 Codes where d is the check digit. We begin by performing all of the multiplications in Zn. (For example, 9 • 5 =1, since 45 is 1 more than the closest smaller multiple of 11 — namely, 44. On an 11-hour clock, 45 o'clock is 1 o'clock.) The simplified sum is 0+1+2+6+7+9+5+4+0+d and adding in Zn leaves us with 1 + d. The check digit d must now be chosen so that the final result is 0; therefore, in Zn, d = 10. (Equivalently, d must be chosen so that c • b will be a multiple of 11.) But since it is preferable that each component of an ISBN-10 be a single digit, the Roman numeral X is used for 10 whenever it occurs as a check digit, as it does here. i -<-- The ISBN code will detect all single errors and adjacent transposition errors (see Exercises 19-21). The ISBN-10 code has been used since 1970. However, since 2007, most books have also been identified with a 13-digit ISBN code. This ISBN-13 code is compatible with the 13-digit European Article Number code (EAN-13) and uses a check digit scheme similar to the UPC. Specifically, an ISBN-13 code is a vector in Zjo where the last digit is the check digit and the check vector is [1, 3, 1, 3,..., 3, 1] in Z|q. Like its UPC cousin, the ISBN-13 code will detect all single errors but not all adjacent transposition errors. Exercises 8.1 Code Vectors In Exercises 1 and 2, find the parity check code vector for the binary vector u. 1. u= [1, 0, 1,1] 2. u = [1, 1, 0, 1, 1] In Exercises 3-6, a parity check code vector x is given. Determine whether a single error could have occurred in the transmission of v. 3. v= [1,0, 1,0] 4. v = [1, 1, 1,0, 1, 1] 5. v = [0, 1, 0, 1, 1, 1] 6. v = [1, 1, 0, 1, 0, 1, 1, 1] Exercises 7-10 refer to check digit codes in which the check vector is the vector c = [1, 1,..., 1] of the appropriate length. In each case, find the check digit d that would be appended to the vector u. 7. u = [1, 2, 2, 2] in Z\ 8. u = [3, 4, 2, 3] in Z\ 9. u = [1, 5, 6, 4, 5] in 13, 10. u = [3, 0, 7, 5,6, 8] in 1% 11. Prove that for any positive integers m and n, the check digit code in 7Lnm with check vector c = 1 = [1, 1,..., 1] will detect all single errors. (That is, prove that if vectors u and v in ZjJ, differ in exactly one entry, then c • u + c • v.) In Exercises 12 and 13, find the check digit d in the given Universal Product Code. 12. [0, 5, 9, 4, 6, 4, 7, 0, 0, 2, 7, d] 13. [0, 1,4,0,1,4,1,8, 4, 1,2, d] 14. Consider the UPC [0, 4, 6, 9, 5, 6, 1, 8, 2, 0, 1, 5]. (a) Show that this UPC cannot be correct. (b) Assuming that a single error was made and that the incorrect digit is the 6 in the third entry, find the correct UPC. 15. Prove that the Universal Product Code will detect all single errors. 16. (a) Prove that if a transposition error is made in the second and third entries of the UPC [0, 7, 4, 9, 2, 7, 0, 2, 0, 9, 4, 6], the error will be detected. (b) Show that there is a transposition involving two adjacent entries of the UPC in part (a) that would not be detected. (c) In general, when will the Universal Product Code not detect a transposition error involving two adjacent entries? 63247_08_ch08_p620-649.indd 624 01/11/13 12:54 PM Section 8.1 Code Vectors In Exercises 17 and 18, find the check digit d in the given International Standard Book Number (ISBN-10). 17. [0, 3, 8, 7, 9, 7, 9, 9, 3, d] 18. [0, 3, 9, 4, 7, 5, 6, 8, 2, d] 19. Consider the ISBN-10 [0, 4, 4, 9, 5, 0, 8, 3, 5, 6]. (a) Show that this ISBN-10 cannot be correct. (b) Assuming that a single error was made and that the incorrect digit is the 5 in the fifth entry, find the correct ISBN-10. 20. (a) Prove that if a transposition error is made in the fourth and fifth entries of the ISBN-10 [0, 6, 7, 9, 7, 6, 2, 9, 0, 6], the error will be detected. (b) Prove that if a transposition error is made in any two adjacent entries of the ISBN-10 in part (a), the error will be detected. (c) Prove, in general, that the ISBN-10 code will always detect a transposition error involving two adjacent entries. 21. Consider the ISBN-10 [0, 8, 3, 7, 0, 9, 9, 0, 2, 6]. (a) Show that this ISBN-10 cannot be correct. (b) Assuming that the error was a transposition error involving two adjacent entries, find the correct ISBN-10. (c) Give an example of an ISBN-10 for which a transposition error involving two adjacent entries will be detected but will not be correctable. 63247_08_ch08_p620-649.indd 625 01/11/13 12:54 PM Vignette The Codabar System Every credit card and ATM card is uniquely identified by a 16-digit number that represents a check digit code. As in the examples in this section, the first 15 digits are assigned by the bank issuing the card, whereas the last digit is a check digit determined by a formula that uses modular arithmetic. All the major banks use a system called Codabar to assign the check digit. It is a slight variation on the method of the Universal Product Code and is based on an algorithm devised by IBM computer scientist Hans Peter Luhn in the 1950s. in Z\q. The Codabar system uses the check vector c = [2, 1, 2, 1, 2, 1,2, 1, 2, 1, 2, 1,2, 1, 2, 1], but instead of requiring that c • x = 0 in Z10, an extra calculation is added to increase the error-detecting capability of the code. Let h count the number of digits in odd positions that are greater than 4. In this example, these digits are 5, 5, 7, and 9, so h = 4. It is now required that c • x + h = 0 in Z10. Thus, in the example, we have, rearranging and working modulo 10, c-x + /z = (2-5 + 4 + 2- l+ 2 + 2- 3+ 4 + 2- 5 + 6 + 2- 7 + 8 + 2- 9 + 0+ 2- 4 + 3+ 2- 2 + d) + 4 = 2 (5+ 1+ 3 + 5 + 7 + 9 + 4 +2)+ (4+ 2 + 4 + 6 + 8 + 0 + 3 + d) + 4 = 2(6) + 7 + d + 4 = 3 + d Thus, the check digit d for this card must be 7, so the result of the calculation is 0 in Z10. The Codabar system is one of the most efficient error-detecting methods. It will detect all single-digit errors and most other common errors such as adjacent transposition errors. Suppose that the first 15 digits of your card are 5412 3456 7890 432 and that the check digit is d. This corresponds to the vector x = [5, 4, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 4, 3, 2, d] 626 63247_08_ch08_p620-649.indd 626 01/11/13 12:54 PM Section 8.2 Error-Correcting Codes 621 Error-Correcting Codes Section 8.1 discussed examples of error-detecting codes. We turn now to the problem of designing codes that can correct as well as detect certain types of errors. Our message will be a vector x in Z* for some k, and we will encode it by using a matrix transformation T : Z* —> Z^ for some n > k. The vector T(x) will be called a code vector. A simple example will serve to illustrate the approach we will take, which is a generalization of the parity check vectors in Example 8.1. Example 8.6 Suppose the message is a single binary digit: 0 or 1. If we encode the message by simply repeating it twice, then the code vectors are [0, 0] and [1, 1]. This code can detect single errors. For example, if we transmit [0, 0] and an error occurs in the first component, then [1, 0] is received and an error is detected, because this is not a legal code vector. However, the receiver cannot correct the error, since [1,0] would also be the result of an error in the second component if [1, 1] had been transmitted. We can solve this problem by making the code vectors longer—repeating the message digit three times instead of two. Thus, 0 and 1 are encoded as [0, 0, 0] and [1, 1, 1], respectively. Now if a single error occurs, we cannot only detect it but also correct it. For example, if [0, 1, 0] is received, then we know it must have been the result of a single error in the transmission of [0, 0, 0], since a single error in [1, 1, 1] could not have produced it. and define T: Z? —> Z? Note that the code in Example 8.6 can be achieved by means of a matrix rr transformation, albeit a particularly trivial one. Let G = 1 _1_ by T(x) = Gx. (Here we are thinking of the elements of Z2 as 1 x 1 matrices.) The matrix G is called a generator matrix for the code. To tell whether a received vector is a code vector, we perform not one but two par- c, ity checks. We require that the received vector c satisfies c1 = c2 = c3. We can write these equations as a linear system over Z2: (1) If we let P 1 1 0 1 0 1 then (1) is equivalent to Pc = 0. The matrix P is called a parity check matrix for the code. Observe that PG O. To see how these matrices come into play in the correction of errors, suppose V we send 1 as 1 = [ 1 1 1 ]T, but a single error causes it to be received as 1 63247_08_ch08_p620-649.indd 627 01/11/13 12:54 PM Chapter 8 Codes [ 1 0 1 ]T. We compute Pc' 1 1 0 1 0 1 ¥= 0 so we know that c' cannot be a code vector. Where is the error? Notice that Pc' is the second column of the parity check matrix P—this tells us that the error is in the second component of c' (which we will prove in Theorem 8.1 below) and allows us to correct the error. (Of course, in this example we could find the error faster without using matrices, but the idea is a useful one.) To generalize the ideas in the last example, we make the following definitions. where A is If k < n, then any n x k matrix of the form G = ^ IA an (n — k) x k matrix over Z2, is called a standard generator matrix for an (n, k) binary code T : Z2 —> Z2. Any (n — k) x n matrix of the form P = [B In-k], where B is an (n — k) x k matrix over Z2, is called a standard parity check matrix. The code is said to have length n and dimension k. Theorem 8.1 Here is what we need to know: (a) When is G the standard generator matrix for an error-correcting binary code? (b) Given G, how do we find an associated standard parity check matrix P? It turns out that the answers are quite easy, as shown by the following theorem. If G is a standard generator matrix and P = [B i„-J is a standard parity check matrix, then P is the parity check matrix associated with G if and only if A = B. The corresponding (n, k) binary code is (single) error-correcting if and only if the columns of P are nonzero and distinct. Example 8.7 Before we prove the theorem, lets consider another, less trivial example that illustrates it. Suppose we want to design an error-correcting code that uses three parity check equations. Since these equations give rise to the rows of P, we have n — k = 3 and k = n — 3. The message vectors come from Z*, so we would like k (and therefore n) to be as large as possible in order that we may transmit as much information as possible. By Theorem 8.1, the n columns of P need to be nonzero and distinct, so the maximum occurs when they consist of all the 23 — 1 = 7 nonzero vectors of Z2~1 = Z2, One such candidate is 110 110 0 10 110 10 0 1110 0 1 63247_08_ch08_p620-649.indd 628 01/11/13 12:54 PM Section 8.2 Error-Correcting Codes 629 This means that 110 1 10 11 0 111 and thus, by Theorem 8.1, a standard generator matrix for this code is 10 0 0 0 10 0 0 0 10 0 0 0 1 110 1 10 11 0 111 As an example of how the generator matrix works, suppose we encode x [0 1 0 1 ]T to get the code vector Gx [0 1 0 1 0 1 0]T If this vector is received, it is seen to be correct, since Pc = 0. On the other hand, if c'=[0 1 1 1 0 1 0 ]T is received, we compute 0" 1 i To 1 = 1 0 l_i 1 0 Pc' 110 110 0 10 110 10 0 1110 0 1 which we recognize as column 3 of P. Therefore, the error is in the third component of c', and by changing it we recover the correct code vector c. We also know that the first four components of a code vector are the original message vector, so in this case we decode c to get the original x = [0 1 0 1]T. The code in Example 8.7 is called the (7, 4) Hamming code. Any binary code constructed in this fashion is called an (n, k) Hamming code. Observe that, by construction, an (n, k) Hamming code has n = 2"~k — 1. I 8.1 (Throughout this proof we denote by a, the ;th column of a matrix A.) With P and G as in the statement of the theorem, assume first that they are standard parity check and generator matrices for the same binary code. Therefore, for every x in Z*, PGx = 0. In terms of block multiplication, [B I] 0 for all x in Z\ 63247_08_ch08_p620-649.indd 629 01/11/13 12:54 PM Chapter 8 Codes Richard W. Hamming (1915- i) received his Ph.D. in mathematics from the University of Illinois at Urbana-Champaign in 1942. His mathematical research interests were in the fields of differential equations and numerical analysis. From 1946 to 1976, he worked at Bell Labs, after which he joined the faculty at the U.S. Naval Postgraduate School in Monterey, California. In 1950, he published his fundamental paper on error-correcting codes, giving an explicit construction for the optimal codes Claude Shannon had proven theoretically possible in 1948. Equivalently, for all x in Z* we have Bx + Ax = (B + A)x = (BI + IA)x = [B I] x = 0 or Bx = Ax If we now take x = e;, the rth standard basis vector in Z2; we see that Be, Aet = Hi for all i Therefore, B = A. Conversely, it is easy to check that if B = A, then PGx = 0 for every x in 1\ (see Exercise 14). To see that such a pair determines an error-correcting code if the columns of P are nonzero and distinct, let x be a message vector in Z2 and let the corresponding code vector be c = Gx. Then Pc = 0. Suppose there has been an error in the rth component, resulting in the vector c'. It follows that c' = c + e;. We now compute Pc' = P(c + ej = Pc + Pet= 0 + pj = pj which pinpoints the error in the rth component. On the other hand, if p, = 0, then an error in the rth component will not be detected (i.e., Pc' =0), and if p, = p^, then we cannot determine whether an error occurred in the rth or the jth component (Exercise 15). The main ideas of this section are summarized below. 1. For n > k, an n x k matrix G and an (n — k) x n matrix P (with entries in Z2) are a standard generator matrix and a standard parity check matrix, respectively, for an (n, k) binary code if and only if, in block form, and P = [A In_k] for some (n — k) x k matrix A over Z2. 2. G encodes a message vector x in Z2 as a code vector c in Z2 via c = Gx. 3. G is error-correcting if and only if the columns of P are nonzero and distinct. A vector c' in Z2 is a code vector if and only if Pc' = 0. In this case, the corresponding message vector is the vector x in Z2 consisting of the first k components of c'. If Pc' + 0, then c' is not a code vector and Pc' is one of the columns of P. If Pc' is the rth column of P, then the error is in the rth component of c' and we can recover the correct code vector (and hence the message) by changing this component. 63247_08_ch08_p620-649.indd 630 01/11/13 12:54 PM Section 8.2 Error-Correcting Codes 631 Exercises 8.2 Error-Correcting Codes 1. Suppose we encode the four vectors in J}2 by repeating the vector twice. Thus, we have [0,0] [0,1] [1.0] [1.1] [0, 0, 0,0] [0, 1,0, 1] [1,0, 1,0] [1,1,1,1] Show that this code is not error-correcting. 2. Suppose we encode the binary digits 0 and 1 by repeating each digit five times. Thus, 0 [0,0,0,0,0] 1 [1, 1, 1, 1, 1] Show that this code can correct double errors. What is the result of encoding the messages in Exercises 3-5 using the (7, 4) Hamming code of Example 8.7? '1' 1 1 1 3. x "0" 1 1 4. x = 0 1 _0_ _1_ 5. x When the (7, 4) Hamming code of Example 8.7 is used, suppose the messages c' in Exercises 6-8 are received. Apply the standard parity check matrix to c' to determine whether an error has occurred and correctly decode c' to recover the original message vector x. 0 6. c' 7. c' 8. c' [0 [1 [0 If of of 9. The parity check code in Example 8.1 is a code 1L\ 7L\. (a) Find a standard parity check matrix for this code. (b) Find a standard generator matrix. (c) Apply Theorem 8.1 to explain why this code is not error-correcting. 10. Define a code ~I\ Z\ using the standard generator matrix 1 0 0 1 1 0 0 1 1 1 (a) List all four code words. (b) Find the associated standard parity check matrix for this code. Is this code (single) error-correcting? 11. Define a code J\ —> I\ using the standard generator matrix 1 0 0 0 1 0 0 0 1 1 0 0 1 1 0 .1 1 1. (a) List all eight code words. (b) Find the associated standard parity check matrix for this code. Is this code (single) error-correcting? 12. Show that the code in Example 8.6 is a (3, 1) Hamming code. 13. Construct standard parity check and generator matrices for a (15, 11) Hamming code. 14. In Theorem 8.1, prove that if B = A, then PGx = 0 for every x in Z*. 15. In Theorem 8.1, prove that if p, = pj, then we cannot determine whether an error occurs in the rth or the jth component of the received vector. 63247_08_ch08_p620-649.indd 631 01/11/13 12:54 PM 632 Chapter 8 Codes Dual Codes There are many ways of constructing new codes from old ones. In this section, we consider one of the most important of these. First, we need to generalize the concepts of a generator and a parity check matrix for a code. Recall from Section 8.2 that a standard generator matrix for a code is an n x m matrix of the form and a standard parity check matrix is an (n — m) x n matrix of the form Observe that the form of these matrices guarantees that the columns of G are linearly independent and the rows of P are linearly independent. (Why?) In proving Theorem 8.1, we showed that G and P are associated with the same code if and only if A = B, which is equivalent to requiring that PG = 0. We use these properties as the basis for the following definition. For n > m, an n x m matrix G and an (n — k) x n matrix P (with entries in Z2) are a generator matrix and a parity check matrix, respectively, for an (n, k) binary code C if the following conditions are all satisfied: 1. The columns of G are linearly independent. 2. The rows of P are linearly independent. 3. PG = 0 Notice that property (3) implies that every column v of G satisfies Pv = 0 and so is a code vector in C. Also, a vector y is in C if and only if it is obtained from the generator matrix as y = Gu for some vector u in Z2. In other words, C is the column space of G. To understand the relationship between different generator matrices for the same code, we only need to recall that, just as elementary row operations do not affect the row space of a matrix (by Theorem 3.20), elementary column operations do not affect the column space. For a matrix over Z2, there are only two relevant operations: interchange two columns (CI) and add one column to another column (C2). (Why are these the only two elementary column operations on matrices over Z2?) Similarly, elementary row operations preserve the linear independence of the rows of P. Moreover, if £ is an elementary matrix and c is a code vector, then It follows that EP is also a parity check matrix for C. Thus, any parity check matrix can be converted into another one by means of a sequence of row operations: interchange two rows (Rl) and add one row to another row (R2). We are interested in showing that any generator or parity check matrix can be brought into standard form. There is one other definition we need. We will call two codes Cj and C2 equivalent if there is a permutation matrix M such that P = IB \ J„_ J {EP)c = E{Pc) =£0 = 0 {Mx^inCj = C2 63247_08_ch08_p620-649.indd 632 01/11/13 12:54 PM Section 8.3 Dual Codes In other words, if we permute the entries of the vectors in Q (all in the same way), we can obtain C2. For example, 0 1 1 0 0 , 0 , 0 , 0 0 0 1 1 and C, 0 0 1 1 0 , 1 , 1 , 0 0 0 0 0 are equivalent via the permutation matrix M . Permuting the entries Example 8.8 of code vectors corresponds to permuting the rows of a generator matrix and permuting the columns of a parity check matrix for the code. (Why?) We can bring any generator matrix for a code into standard form by means of operations CI, C2, and Rl. If Rl has not been used, then we have the same code; if Rl has been used, then we have an equivalent code. We can bring any parity check matrix for a code into standard form by means of operations Rl, R2, and CI. If CI has not been used, then we have the same code; if CI has been used, then we have an equivalent code. The following examples illustrate these points. (a) Bring the generator matrix ~1 0 1 0 .0 1. into standard form and find an associated parity check matrix. (b) Bring the parity check matrix "l 0 0 f P = |_0 1 1 1_ into standard form and find an associated generator matrix. (a) We can bring the generator matrix G into standard form as follows: (Do you see why it is not possible to obtain standard form without using Rl?) Hence, A = [1 0],so P = [A\I] = [1 0 1] is an associated parity check matrix, by Theorem 8.1. (b) We use elementary row operations to bring P into standard form, keeping in mind that we want to create an identity matrix on the right—not on the left, as in Gauss-Jordan elimination. We compute "l o" ~1 o" R2<->.R3 1 0 -> 0 1 _0 1_ _1 0_ 10 0 1 0 111 0 111 10 0 1 1110 10 0 1 [All] 63247_08_ch08_p620-649.indd 633 01/11/13 12:54 PM Chapter 8 Codes Thus, A 1 1 1 0 so, by Theorem 8.1, an associated generator matrix is 1 0' 0 1 1 1 1 0 In part (a), it is instructive to verify that G and G' generate equivalent, but not identical, codes. Check that this is so by computing {Gx: x in Z2} and {G'x : x in Z2}. We now turn our attention to the main topic of this section, the notion of a dual code. Let C be a set of code vectors in Z2. The orthogonal complement of C is called the dual code of C and is denoted CL. That is, C1 = {x in Z"2: c ■ x = 0 for all c in C} Example 8.9 The dot product in Z2 behaves like the dot product in IR", with one important exception: Property (d) of Theorem 1.2 is no longer true. In other words, in Z2, a nonzero vector can be orthogonal to itself! As an example, take x = in Zi Then x-x = 1 + 1=0. Find the dual code of the code C in Example 8.8(b). Solution The code C is C = {Gx:xinZ22} 0 , G 0 , G 1 , G 1 0 1 0 1 0 0 1 1 0 1 0 1 0 1 1 0 0 0 1 1 [Alternatively, C = {c in Z4: Pc = 0} = null(P). Check that this really does give the same code.] To find CL, we need those vectors in Z2 that are orthogonal to all four vectors in C. Since there are only 16 vectors altogether in Z2, we could proceed by trial and error—but here is a better method. Let y = [yt y2 y} y4]Tbe'm CL. Since y • c = 0 for each c in C, we have four equations, one of which we can ignore, since it just says 0 = 0. The other three are y2+ y3 =0 7i + 73 + 74 = 0 yx+ y2+ y4 = 0 63247_08_ch08_p620-649.indd 634 01/11/13 12:54 PM Section 8.3 Dual Codes Solving this system, we obtain ~0 1 1 0 o" ~1 0 1 1 0 1 0 1 1 0 -> 0 1 1 0 0 _1 10 1 0_ _0 0 0 0 0 (Check this.) It follows that^ = 73 + yA and y2 = 73> so >3 + 74 73 73 . 74 . 73 + 74 We now examine the relationship between the generator and parity check matrices of a code and its dual. Theorem 8.2 If C is an (n, k) binary code with generator matrix G and parity check matrix P, then CL is an (n, n — k) binary code such that a. GT is a parity check matrix for CL. b. PT is a generator matrix for CL. Example 8.10 By definition, G is an n x k matrix with linearly independent columns, P is an (n — k) x n matrix with linearly independent rows, and PG = O. Therefore, the rows of GT are linearly independent, the columns of PT are linearly independent, and GTPT = {PGf = 0T = O This shows that GT is a parity check matrix for CL and PT is a generator matrix for CL. Since PT is n x (n — k), CL is an (n, n — k) code. Find generator and parity check matrices for the dual code CL from Example 8.9. There are two ways to proceed. We will illustrate both approaches. Method 1: According to Theorem 8.2(b), a generator matrix GL for CL is given by '1 0" 1 0 0 1 0 111 PT 0 1 0 1 1 1 This matrix is in standard form with A [A I] 0 1 1 1 , so a parity check matrix for CL is 0 110 110 1 by Theorem 8.1. 63247_08_ch08_p620-649.indd 635 01/11/13 12:54 PM Chapter 8 Codes Method 2: Using Theorem 8.2(a) and referring to Example 8.8(b), we obtain a parity check matrix PL for CL as follows: "I (TT 0 1 1 1 GT 1 0 10 11 0 110 This matrix is not in standard form, so we use elementary row operations to convert it to 10 11 0 110 0 110 110 1 [A I] = Pt Now we can use Theorem 8.1 to obtain a generator matrix GL for CL: '1 0" A 0 1 0 1 1 1 Example 8.11 Let C be the code with generator matrix 1 0' 0 1 1 0 0 1 List the vectors in C and C . The code C is C ={Gx:3 0 1 0 1 0 0 1 1 0 1 0 1 0 0 1 1 (Note that C is a double repetition code that encodes vectors from 1}2 as vectors in Z* by writing the entries twice. See Exercise 1 in Section 8.2.) Using Theorem 8.2, we find the parity check matrix PL for CL to be "I 0^T GT 0 1 1 0 0 1 10 10 0 10 1 Thus, P has the form [A I], where A = I, so a generator matrix G for C is '1 0" I 0 1 1 0 0 1_ Hence, CL has the same generator matrix as C, so CL = C! 63247_08_ch08_p620-649.indd 636 01/11/13 12:54 PM Section 8.3 Dual Codes '0) was one of the pioneers of coding theory. She received her B.A. and M.A. from Cambridge University in 1938-39, following which she studied in the United States at Johns Hopkins University and Harvard University. After marrying and raising a family, MacWilliams took a job as a computer programmer at Bell Laboratories in Murray Hill, New Jersey, in 1958, where she became interested in coding theory. In 1961, she returned to Harvard for a year and obtained a Ph.D. Her thesis contains one of the most powerful theorems in coding theory. Now known as the MacWilliams Identities, this theorem relates the weight distribution (the number of codewords of each possible weight) of a linear code to the weight distribution of its dual code. The MacWilliams Identities are widely used by coding theorists, both to obtain new theoretical information about error-correcting codes and to determine the weight distributions of specific codes. MacWilliams is perhaps best known for her book The Theory of Error-Correcting Codes (1977), written with N. J. A. Sloane of Bell Labs. This book is often referred to as the "bible of coding theory." In 1980, MacWilliams gave the inaugural Emmy Noether Lecture of the Association for Women in Mathematics. A code C with the property that CL = C is called self-dual. We can check that the code in Example 8.11 is self-dual by showing that every vector in C is orthogonal to all the vectors in C, including itself. (Do this.) You may have noticed that in the self-dual code in Example 8.11, every vector in C has an even number of Is. We will prove that this is true for every self-dual code. The following definition is useful. Let x be a vector in Z£. The weight of x, denoted w(x), is the number of Is in x. For example, w( [1 10 10 0 1]T) = 4. If we temporarily think of x as a vector in K", then we can give the following alternative descriptions of w (x). Let 1 denote the vector (of the same length as x) all of whose entries are 1. Then w (x) = x • 1 and w(x) = x • x. We can now prove the following interesting facts about self-dual codes. Theorem 8.3 IfCisaself-dualcode, then: a. Every vector in C has even weight. b. 1 is in C. (a) A vector x in ZJ! has even weight if and only if w (x) = 0 in Z2. But w(x) = x-x = 0 since C is self-dual. (b) Using property (a), we have 1 • x = w (x) = 0 in Z2 for all x in C. This means that 1 is orthogonal to every vector in C, so 1 is in CL = C, as required. 63247_08_ch08_p620-649.indd 637 01/11/13 12:54 PM Chapter 8 Codes Exercises 8.3 Dual Codes In Exercises 1-4, G is a generator matrix for a code C. Bring G into standard form and determine whether the corresponding code is equal to C. 1. G 1 0 1 1 1 0 2. G 3. G "0 0 0" 1 0 1 0 1 1 1 1 1 4. G 1 0 1 1 0 0 0 1 1 1 1 0 0 0 1 1 1 1 1 1 0 0 0 1 0 In Exercises 5-8, P is a parity check matrix for a code C. Bring P into standard form and determine whether the corresponding code is equal to C. 5.P 7.P 8.P [1 1 0] 6. P 110 1 1111 0 1110 110 0 1 .00111 0 10 1 10 0 1 In Exercises 9-12, find the dual code CL of the code C. 9. C 10. C "o" "o" 1 0 1 _0_ _0_ J 0 1 0 1 0 1 0 1 0 0 1 1 11. c 0 0 0 0 0 1 1 0 0 0 0 0 0 0 1 1 12. C = < 0 0 1 1 1 0 1 0 1 0 1 0 1 0 0 1 1 0 1 0 1 J In Exercises 13-16, either a generator matrix G or a parity check matrix P is given for a code C. Find a generator matrix GL and a parity check matrix PL for the dual code ofC. 13. G 15. P 1110 0 10 1 14. G 16. P 10 110 0 10 0 1 0 0 10 1 17. Find generator and parity check matrices for the dual of the (7, 4) Hamming code in Example 8.7. The even parity code E„ is the subset of 1.1 consisting of all vectors with even weight. The n-times repetition code Repn is the subset of 11 consisting of just the two vectors 0 and 1 (all zeros and all Is, respectively). 18. (a) Find generator and parity check matrices for E3 and Rep3. (b) Show that E3 and Rep3 are dual to each other. 19. Show that E„ and Rep„ are dual to each other. 20. If C and D are codes and CCD, show that DL C C\ 21. Show that if C is a code with a generator matrix, then (CL)L = C. 22. Find a self-dual code of length 6. 63247_08_ch08_p620-649.indd 638 01/11/13 12:54 PM Section 8.4 Linear Codes 639 Linear Codes We now turn our attention to the most important, and most widely used, class of codes: linear codes. In fact, many of the examples we have already looked at fall into this category. NASA has made extensive use of linear codes to transmit pictures from outer space. In 1972, the Mariner 9 spacecraft used a type of linear code called a Reed-Muller code to transmit black-and-white images of Mars (Figure 8.2). Then, between 1979 and 1981, Voyager 1 and Voyager 2 were able to send back remarkable color pictures of Jupiter and Saturn (reproduced in black and white in Figure 8.3) using a Golay code, another linear code. A p-ary linear code is a subspace C of Z£. As usual, our main interest is the case p = 2, the binary linear codes. Checking to see whether a subset C of Z2 is a subspace involves showing that C satisfies the conditions of Theorem 6.2. Since in Z2 the only scalars are 0 and 1, checking to see whether C is closed under scalar multiplication only involves showing that C contains the zero vector. All that remains is to check that C is closed under addition. Figure 8.2 The southern polar cap of Mars Figure u.o Jupiter's red spot and the rings of Saturn Example 8.12 Are Q linear codes? U u 1 1 0 0 1 1 0 1 0 1 0 1 0 1 and C, 0 1 1 0 , 0 , 0 0 0 1 (binary) 63247_08_ch08_p620-649.indd 639 01/11/13 12:54 PM Chapter 8 Codes Example 8.13 G\ clearly contains the zero vector and is closed under addition, so it is a linear code. C2 is not closed under addition, since it does not contain 1 1 0 0 + 0 = 0 0 1 1 Hence, C2 is not linear. For the remainder of this section, we will dispense with the adjective "binary," since all of the codes we will be considering will be binary codes. If a linear code C is a /c-dimensional subspace of Z2, then we say that C is an (n, k) code. (a) The code G\ in Example 8.12 is a subspace of Z2 and has dimension 2, since ( "0" "1" 0 1 1 1 0 { _1_ _0_ is a basis for Q. (In fact, G\ has exactly three different two-element bases. What are the other two? See Exercise 9.) Hence, G\ is a (4, 2) code. (b) The (7,4) Hamming code H introduced in Section 8.2 is a (7, 4) linear code (fortunately!), in our new terminology. It is linear because it has a generator matrix G, so its vectors are all the vectors of the form Gx, where x is in Z\. But this is just the column space of the 7X4 matrix G and so is a subspace of Z 2. Since the four columns of G are linearly independent (why?), they form a basis for H. Therefore, H is a (7, 4) code. (c) The codes Í ~o~ ~1~ 1 H 0 1 > and C1 i _0_ _1_ J 0 1 1 0 0 , 1 , 0 , 1 0 0 1 1 are dual codes. It is easy to see that each of these is a linear code, that dim C = 1, and that dim C1 = 2. (Check these claims.) Therefore, C is a (3, 1) code and C1 is a (3, 2) code. The fact that 3 = 1 + 2 is not an accident, as the next theorem shows. Theorem 8.4 Let cbe an (n, k) linear code. a. The dual code C1 is an (n, n — k) linear code. b. C contains 2k vectors, and C1 contains 2"~k vectors. (a) Since C is an (n, k) linear code, it is a /c-dimensional subspace of Z2. Its dual C1 is the orthogonal complement of C and so is also a subspace of Z2, by Theorem 5.9(a). Thus, C1 is a linear code. 63247_08_ch08_p620-649.indd 640 01/11/13 12:54 PM Section 8.4 Linear Codes 641 The Reed-Muller codes are named after the computer scientists Irving S. Reed and David E. Muller, who published papers, independently, about these codes in 1954. Recall that the binary, or base two, representation of a number arises from writing it as a sum of distinct powers of two. If n = bk • 2k +----h bx • 2 + b0, where each fo, is 0 or 1, then in base two n is represented as « = &,■••■ bxb0. For example, 25 = 16 + 8+ 1 = l-24 + l-23 + 0 • 22 + 0 • 2 + 1, so the binary representation of 25 is 11001. Now we can apply Theorem 5.13 to show that dim C1 = n — dim C = n — k (Note: Theorems 5.9 and 5.13 are true if W is replaced by Z2. This is the case for most of the nongeometric results about orthogonality.) It follows that C1 is an (n, n — k) code. (b) Let {v1; . . . , Vfc} be a basis for C. Then the vectors in C are all the vectors of the form v = qv! + c2v2 + ■ • ■ + ckwk where each ci is either 0 or 1. Therefore, there are two possibilities for cl and, for each of these, two possibilities for c2, and so on, making the total number of possibilities for v 2 x 2 x ■ ■ ■ X 2 = 2k Thus, C contains exactly 2 vectors. Applying this formula to its (n, n — k) dual code, we see that C1 has 2"~k vectors. We now construct one of the oldest families of linear codes, the Reed-Muller codes. As mentioned earlier, this is the type of code that was used by the Mariner 9 spacecraft to transmit pictures of Mars. In order to be transmitted, each photograph had to be broken down into picture elements, or pixels. This was done by overlaying the photograph with a 700 x 832 pixel grid and then assigning to each pixel one of 64 shades of gray, ranging from white (0) to black (63). Since 64 = 26, we can use binary arithmetic to represent each of these shades: white is 000000 and black is 111111. We can then rewrite these 64 binary numbers as vectors in Z2 and encode them using a code that corrects as many errors as possible. The code that was chosen for use by Mariner 9 belongs to a large family of codes that are most easily defined inductively. The (first-order) Reed-Muller codes R„ are defined inductively as follows: 1. For n = 0, R0 = Z2 = {0, 1}. 2. For n £ 1, R„ is the subspace of Z2 whose basis consists of all vectors of the form u "0 and u 1 where u is a basis vector in R„-i, 0 is the zero vector in Z2 , and 1 is the vector of Is in Zf '. To get a sense of what vectors these codes contain, let's use the definition to construct Rt and R2. A basis for R0 = Z2 is just {1}, so a basis for Rt is 63247_08_ch08_p620-649.indd 641 01/11/13 12:54 PM 642 Chapter 8 Codes Thus, by closure under addition, Rt must also contain the vectors 1 0 1 1 + and + 1 1 : 1 1 0 It is easy to check that no other vectors can be obtained by addition, so 0 0 1 1 0 1 0 1 Similarly, a basis for R2 is "0" "0" 1 1 0 1 0 1 1 1 1 and, by closure under addition, it is easy to check that the 8 Notice that in Rt every code vector except 0 and 1 has weight 1, and in R2 every code vector except 0 and 1 has weight 2. This is a general property of the Reed-Muller codes, and we prove it as part of the next theorem. But first, note that the complement of a vector x in Z2 is the vector x obtained by changing all the zeros to Is and vice versa. For example, "1" "0" 1 0 x = 0 1 _1_ _0_ Observe that x = x + 1, where 1 is the vector consisting entirely of Is. Theorem 8.5 For n£l, the Reed-Muller code Rn is a (2", n + 1) linear code in which every code vector except 0 and 1 has weight 2"_1. We will prove this theorem by induction on n. For n = 1, we have already seen that R1 = ~L\ is a (2, 2) = (2 , 1 + 1) linear code in which every code vector except 0 and 1 has weight 1 = 21"1. Assume that the result is true for n = k; that is, assume that Rk is a (2k, k + 1) linear code in which every code vector except 0 and 1 has weight 2k Now consider Rk+i- By construction, Rk+1 has a basis consisting of vectors of the form To' in Rk, together with the vector where u is By the induction hypothesis, the vectors u, 0, and 1 are in J}2; hence, the basis vectors for Rk+i are in J}2 . Moreover, the dimension of 63247_08_ch08_p620-649.indd 642 01/11/13 12:54 PM Section 8.4 Linear Codes 643 Ri is k + 1, so there are k + 1 vectors of the form and one more, ,k+l . It follows + Ck+2 0 ck+l 1 that the dimension of -R/c+1 is /c + 2, and therefore -R/c+1 is a (2 , /c + 2) linear code. For the final assertion, note that the vectors in Rk+i are obtained as linear combinations of the basis vectors and so are of the form + where {u1; . . . , uk+l} is a basis for Rk, 0 and 1 are in Z2, and each ci is 0 or 1. Suppose v + 0, 1 and let u = quj + ■ • ■ + ck+luk+l. (Hence, u is in Rk.) If q.+2 = 0, then u i= 0, 1, so, by the induction hypothesis, u has weight 2k~l. But then v has weight 2 • 2k~l = 2k. If ck+2 = 1, then v has the form where u is in Rk. Since u "0 i] 1] + u 1 _u + 1. u (why?), we have w(u) = 2k - w(u) w(v) = w(u) + w(u) = 2k as required. This completes the induction, and we conclude that the theorem is true for all «2 1. As noted, Mariner 9 required a code with 64 = 26 vectors. By Theorem 8.5, the Reed-Muller code _R5 has dimension 6 over Z2. As you will see in the next section, it is also capable of detecting and correcting multiple errors. That is why the Reed-Muller code was the one that NASA used for the transmission of the Mariner photographs. Exercises 13-16 explore further aspects of this important class of codes. Exercises 8.4 Linear Codes Which of the codes in Exercises 1-8 are linear codes? 1. C 2. C 3. C 4. C 1 0 1 1 0 1 ' 1 / "0" r 1 _0_ 1. J 0 1 1 0 0 0 1 1 0 1 0 0 0 1 0 0 0 0 1 0 0 0 0 1 5. C 6. C 0 0 0 0 0 1 1 0 0 0 1 1 0 1 0 1 0 0 1 1 0 1 0 0 0 0 1 0 7. The even parity code E„ (See Exercise 18 in Section 8.3.) 8. The odd parity code 0„ consisting of all vectors in Z2 with odd weight 9. Find the other two bases for the code Q in Example 8.13. 63247_08_ch08_p620-649.indd 643 01/11/13 12:54 PM 644 Chapter 8 Codes 10. (a) If a (9, 4) linear code has generator matrix G and parity check matrix P, what are the dimensions of GandP? (b) Repeat part (a) for an (n, k) linear code. 11. For a linear code C, show that (C1)1 = C without using matrices. 12. If C is an (n, k) linear code that is self dual, prove that n must be even. [Hint: Use the analogue in Z2 of Theorem 5.13.] 13. Write out the vectors in the Reed-Muller code _R3. 14. Define a family of matrices inductively as follows: G0 = [1] and, for nil, where 0 is a zero vector and 1 is a vector consisting entirely of ones. (a) Write out Gu G2, and G3. (b) Using induction, prove that for all n £ 0, G„ is a generator matrix for the Reed-Muller code Rn. 15. Find a parity check matrix for R2. 16. Find a parity check matrix for _R3. 17. Prove that, for a linear code C, either all the code vectors have even weight or exactly half of them do. [Hint: Let E be the set of vectors in C with even weight and O the set of vectors in C with odd weight. If O is not empty, let c0 be in O and consider O' = {c0 + e : e in £}. Show that O' = O.] The Minimum Distance of a Code Consider the triple repetition code f ~o~ 1 C = {c„, cj = < 0 1 1 _0_ _i_ J If one or two errors occur in the transmission of either of these code vectors, the resulting vector cannot be another vector in C. So C can detect up to two errors. For example, if errors occur in the first and second entries when c0 is transmitted, then the vector r 1 .0. is received. However, the receiver has no way of correcting the error, since c' would also result if a single error occurred during the transmission of But any single error can be corrected, since the resulting vector can have arisen in only one way. For example, if "0" 1 _0_ is received and we know that at most one error has occurred, then the original vector must have been c0, since c" cannot arise from C! via a single error. We will now generalize these ideas. As you will see, the notion of Hamming distance plays a crucial role in the definition. Let C be a (binary) code. The minimum distance of C is the smallest Hamming distance between any two distinct vectors in C. That is, d(C) = min{dH(x, y): x + y in C} 63247_08_ch08_p620-649.indd 644 01/11/13 12:54 PM Section 8.5 The Minimum Distance of a Code 645 Clearly, the minimum distance of the triple repetition code C above is 3. Example 8.14 Find the minimum distance of the code C = {c0, q, c2, c3} where "0" "0" "1" "1" 0 1 0 1 , q = , c3 = 0 0 1 1 0 1 0 1 We need to compute the Hamming distance between each pair of distinct vectors. [There are four vectors, so there are dH(c0, ct) = 2 dH(c1; c2) = 4 dH(c0, c2) = 2 dH(c1; c3) = 2 6 pairs.] We find that: dH(c„, c3) = 4 dH(c2, c3) = 2 Therefore, d(C) = 2. It is possible to picture the notions of minimum distance and error correction geometrically. In the case of the triple repetition code C, we have a subset (actually, a subspace) of Z2. We can represent the vectors in Z2 as the vertices of a unit cube, as shown in Figure 8.4(a). The Hamming distance between any two vectors x and y is just the number of edges in a shortest path from x to y. The code C corresponds to two of these vertices, c0 and cx. The fact that d(C) = 3 corresponds to the fact that c0 and ct are three units apart, as shown in Figure 8.4(b). If a received vector x is within one unit of either of these code vectors and we know that at most one error has occurred, we can correctly decode x as the nearest code vector. In Figure 8.4(b), x would be decoded as c0, and y would be decoded as cv This agrees with the fact that C can correct single but not double errors. (a) (b) Figure 8.4 In Exercise 13, you are asked to draw a picture that illustrates the situation in Example 8.14. In general, we cannot draw pictures of Z2, but a Euclidean analogy is helpful. If a code can correct up to k errors, think of the code vectors as the centers 63247_08_ch08_p620-649.indd 645 01/11/13 12:54 PM Chapter 8 Codes Figure 8.5 of spheres of radius k. The code vectors themselves are separated by at least d units. Then, if a received vector x is inside one of these spheres, it will be decoded as the vector corresponding to the center of that sphere. In Figure 8.5, x will be decoded as c0. This process is known as nearest neighbor decoding. Figure 8.5 suggests that if a code is able to correct k errors, then the "spheres" centered at the code vectors cannot touch or overlap; that is, we must have d > 2k. This turns out to be correct, as we now make precise. A code is said to detect k errors if, for each code vector c and each vector c' obtained by changing up to k of the entries of c, c' is not a code vector. A code is said to correct k errors if, for each code vector c and each vector c' obtained by changing up to k of the entries of c, nearest neighbor decoding of c' produces c. Theorem 8.6 Let C be a (binary) code with minimum distance d. a. C detects k errors if and only if d £ k + 1. b. C corrects k errors if and only if d £ 2k + 1. (a) Assume that d £ k + 1 and let c be a vector in C. If up to k errors are introduced into c, then the resulting vector c' has the property that dH(c, c') £ k. But then c' cannot be a code vector, since if it were, we would have k + 1 < d < dH(c, c') < k which is impossible. Conversely, if C can detect up to k errors, then the minimum distance between any two code vectors must be greater than k. (Why?) It follows that d £ k + 1. (b) Assume that d £ 2k + 1 and let c be a vector in C. As in the proof of property (a), let c' be a vector such that dH(c, c') £ k. Let b be another vector in C. Then dH(c, b) £ il22)c+ 1, so, by the Triangle Inequality, dH(c, c') + dH(c',b) > dH(c,b) > 2k + 1 Therefore, dH(c', b) > 2k + 1 - dH(c, c')>2Hl-l = Hl> dH(c', c) So c' is closer to c than to b, and nearest neighbor decoding correctly decodes c' as c. Conversely, assume that C can correct up to k errors. We will show that if d < 2k + 1 (i.e., d £ 2k), then we obtain a contradiction. To do this, we will find a code vector c and a vector c' such that dH (c, c') £ k yet nearest neighbor decoding decodes c' as the wrong code vector b + c. Let b and c be any code vectors in C such that dH(b,c) = d < 2k There is no harm in assuming that these d errors occur in the first d entries of b. (Otherwise, we can just permute the entries of all the vectors until this is true.) Assuming that the code vectors in C have length n, we construct a vector c' in 7L\ as follows. Make c' agree with b in the first k entries, agree with c in the next d — k entries (why is d £ A:?), and agree with both b and c in the last n — d entries. In other words, the entries of c' satisfy (bt + Cj if i = 1, ...,k c( = ( Cj i= bt if i = k + 1,..., d I b, = C-. if i = d + 1,..., n 63247_08_ch08_p620-649.indd 646 01/11/13 12:54 PM Section 8.5 The Minimum Distance of a Code 641 Some books call such a code an (n, 2k, d) code or, more generally, an (n, M, d) code, where n is the length of the vectors, M is the number of code vectors, and d is the minimum distance. Now dH(c, c') = k and dH(c', b) = d - k < k. (Why?) Therefore, dH(c', b) < dH(c', c), so either we have equality and it is impossible to decide whether c' should be decoded as b or c or the inequality is strict and c' will be incorrectly decoded as b. In either case, we have shown that C cannot correct k errors, which contradicts our hypothesis. We conclude that rf22t+ 1. In the case of a linear code, we have the following notation: If an (n, k) linear code has minimum distance d, we refer to it as an (n, k, d) code. For example, the code in Example 8.14 is a (4, 2, 2) code. Linear codes have the advantage that their minimum distance can be easily determined. In Exercise 14, you are asked to show that the minimum distance of a linear code is the same as the minimum weight of a nonzero code vector. It is also possible to determine d(C) by examining a parity check matrix for C. Theorem 8.7 Let C be an (n, k) linear code with parity check matrix P. Then the minimum distance of C is the smallest integer d for which P has d linearly dependent columns. Assume that d(C) = d. The parity check matrix P is an (n — k) x n matrix with the property that, for any vector x in Z2, Px = 0 if and only if x is in C. As you will be asked to show in Exercise 14, C contains a vector c of weight d. Then Pc is a linear combination of exactly d columns of P. But, since Pc = 0, this implies that some set of d columns of P is linearly dependent. On the other hand, suppose some set of d — 1 columns of P is linearly dependent—say, P„ + P,2 + ■' ■ + P,_, = o Let x be a vector in Z2 with Is in positions it, . . . , and zeros elsewhere. Then x is a vector of weight d — 1 such that Px = 0. Hence, x is a code vector of weight d — 1 < d = d(C). This is impossible, by Exercise 14, so we deduce that rank (P) = d — 1. Conversely, assume that any d — 1 columns of P are linearly independent but some set of d columns of P is linearly dependent. Since Px is a linear combination of those columns of P corresponding to the positions of the Is in x, Px + 0 for any vector x of weight d — 1 or less. Therefore, there are no nonzero code vectors of weight less than d. But some set of d columns of P is linearly dependent, so there exists a vector x of weight d such that Px = 0. Hence, this x is a code vector of weight d. By Exercise 14 again, we deduce that d(C) = d. Example 8.15 Show that the Hamming codes all have minimum distance 3. Recall that the (n, k) Hamming code has an (n — k) x n parity check matrix P whose columns are all of the nonzero vectors of Z2~\ arranged so that the identity matrix occupies the last n — k columns. For example, the (7, 4) Hamming code has parity check matrix 110 110 0 10 110 10 0 1110 0 1 63247_08_ch08_p620-649.indd 647 01/11/13 12:54 PM Chapter 8 Codes We can always find three linearly dependent columns: Just take the columns corresponding to e1; e2, and e! + e2. (In the matrix on the previous page, these would be columns 5, 6, and 1, respectively.) But any two columns are linearly independent. By Theorem 8.7, this means the Hamming codes have minimum distance 3. Example 8.15, combined with Theorem 8.6, tells us that the Hamming codes are all single error-correcting. The other major type of linear code that we have considered is the family of Reed-Muller codes. These are capable of correcting many errors, which is one of the reasons they were chosen to transmit photographs from space. Example 8.16 Show that the Reed-Muller code R„ has minimum distance 2" for n£l. By Theorem 8.5, every vector in R„ except 0 and 1 has weig ht2"_1. Since 1 has weight 2", this means that the minimum weight of a nonzero code vector in R„ is 2"_1. Hence, d(R„) = 2"~\by Exercise 14. Exercises 8.5 The Minimum Distance of a Code Find the minimum distance of the codes in Exercises 1-6. l.C 2. C 0 0 1 0 1 1 0 0 0 u 1 1 0 1 0 1 0 1 1 0 0 In Exercises 7 and 8, compute the minimum distance of the code C and decode the vectors u, v, and w using nearest neighbor decoding. 7. C= < 1 1 0 0 V "o" 0 1 0 1 1 1 0 , 0 , 1 , 1 0 , v = 0 1 0 0 1 0 0 0 1 1 0 _0_ _0_ 3. The even parity code En 4. The n-times repetition code Rep„ 5. The code with parity check matrix P = [ I \ A ], where 110 110 1' 10 11110 0 1110 0 0 0 0 0 1 1 1 1 6. The code with parity check matrix 110 0 P = 1 1 1 1 10 0 1 0 0 0 0 1 8. C has generator matrix "1 0 0" "1" "f "0" 0 1 0 0 1 0 0 0 1 0 1 0 1 1 0 0 , v = 0 , w = 0 1 0 1 1 0 1 0 1 1 1 0 1 _1 1 1_ _0_ _0_ _1_ 63247_08_ch08_p620-649.indd 648 01/11/13 12:54 PM Section 8.5 The Minimum Distance of a Code 649 In Exercises 9-12, construct a linear (n, k, d) code or prove that no such code exists. 9. n = 8, it = 1, cf = 8 10. n = 8, k = 2, d = 8 11. n = 8, k = 5, d = 5 12. n = 8, k = 4, d = 4 13. Draw a picture (similar to Figure 8.4) to illustrate Example 8.14. 63247_08_ch08_p620-649.indd 649 14. Let C be a linear code. Show that the minimum distance of C is equal to the minimum weight of a nonzero code vector. 15. Show that d — 1 £ n — k for any linear (n, k, d) code. 16. Let C be a linear (n, k, d) code with parity check matrix P. Prove that d = n — /c+lif and only if every n — k columns of P are linearly independent.