: 𝒟(R) → ℱ(R) given by φ(f) = f′. 𝒟(R) is the group of differentiable functions from R to R; f′ is the derivative of f.
3 The function f : R × R → R given by f(x, y) = x + y.
4 The function f : R* → R^pos defined by f(x) = |x|.
5 The function f : C* → R^pos defined by f(a + bi) = √(a² + b²).
6 Let G be the multiplicative group of all 2 × 2 matrices [a b; c d] satisfying ad − bc ≠ 0. Let f : G → R* be given by f(A) = determinant of A = ad − bc.
C. Elementary Properties of Homomorphisms
Let G, H, and K be groups. Prove the following:
1 If f : G → H and g : H → K are homomorphisms, then their composite g ∘ f : G → K is a homomorphism.
2 If f : G → H is a homomorphism with kernel K, then f is injective iff K = {e}.
3 If f : G → H is a homomorphism and K is any subgroup of G, then f(K) = {f(x) : x ∈ K} is a subgroup of H.
4 If f : G → H is a homomorphism and J is any subgroup of H, then f⁻¹(J) = {x ∈ G : f(x) ∈ J} is a subgroup of G. Furthermore, ker f ⊆ f⁻¹(J).
5 If f : G → H is a homomorphism with kernel K, and J is a subgroup of G, let f_J designate the restriction of f to J. (In other words, f_J is the same function as f, except that its domain is restricted to J.) Then ker f_J = J ∩ K.
6 For any group G, the function f : G → G defined by f(x) = e is a homomorphism.
7 For any group G, {e} and G are homomorphic images of G.
8 The function f : G → G defined by f(x) = x² is a homomorphism iff G is abelian.
9 The functions f₁(x, y) = x and f₂(x, y) = y, from G × H to G and H, respectively, are homomorphisms.
D. Basic Properties of Normal Subgroups
In the following, let G denote an arbitrary group.
1 Find all the normal subgroups (a) of S₃ and (b) of D₄.
Prove the following:
2 Every subgroup of an abelian group is normal.
3 The center of any group G is a normal subgroup of G.
4 Let H be a subgroup of G. H is normal iff it has the following property: For all a and b in G, ab ∈ H iff ba ∈ H.
5 Let H be a subgroup of G. H is normal iff aH = Ha for every a ∈ G.
6 Any intersection of normal subgroups of G is a normal subgroup of G.
144 chapter fourteen
HOMOMORPHISMS 145
E. Further Properties of Normal Subgroups
Let G denote a group, and H a subgroup of G. Prove the following:
# 1 If H has index 2 in G, then H is normal. (Hint: Use Exercise D5.)
2 Suppose an element a ∈ G has order 2. Then ⟨a⟩ is a normal subgroup of G iff a is in the center of G.
3 If a is any element of G, ⟨a⟩ is a normal subgroup of G iff a has the following property: For any x ∈ G, there is a positive integer k such that xa = aᵏx.
4 In a group G, a commutator is any product of the form aba⁻¹b⁻¹, where a and b are any elements of G. If a subgroup H of G contains all the commutators of G, then H is normal.
5 If H and K are subgroups of G, and K is normal, then HK is a subgroup of G. (HK denotes the set of all products hk as h ranges over H and k ranges over K.)
# 6 Let S be the union of all the cosets Ha such that Ha = aH. Then S is a subgroup of G, and H is a normal subgroup of S.
F. Homomorphism and the Order of Elements
If f : G → H is a homomorphism, prove each of the following:
1 For each element a ∈ G, the order of f(a) is a divisor of the order of a.
2 The order of any element b ≠ e in the range of f is a common divisor of |G| and |H|. (Use part 1.)
3 If the range of f has n elements, then xⁿ ∈ ker f for every x ∈ G.
4 Let m be an integer such that m and |H| are relatively prime. For any x ∈ G, if xᵐ ∈ ker f, then x ∈ ker f.
5 Let the range of f have m elements. If a ∈ G has order n, where m and n are relatively prime, then a is in the kernel of f. (Use part 1.)
6 Let p be a prime. If ran f has an element of order p, then G has an element of order p.
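Parts 1 and 2 can be spot-checked on a small case before attempting the proofs. The following is only a sketch under assumptions that are not in the text: it takes the additive groups G = Z₁₂ and H = Z₄ with f(x) = x mod 4, and the helper names are illustrative.

```python
import math

G_ORDER, H_ORDER = 12, 4  # assumed example: G = Z_12, H = Z_4, written additively

def f(x):
    """Reduction mod 4; a homomorphism from Z_12 onto Z_4 because 4 divides 12."""
    return x % H_ORDER

def order(a, n):
    """Order of the element a in the additive group Z_n."""
    return n // math.gcd(a, n)

# Pairs (ord(a), ord(f(a))) for every a in G.
orders = [(order(a, G_ORDER), order(f(a), H_ORDER)) for a in range(G_ORDER)]

# Part 1: ord(f(a)) divides ord(a).
part1_holds = all(ord_a % ord_fa == 0 for ord_a, ord_fa in orders)

# Part 2: ord(f(a)) is a common divisor of |G| and |H|.
part2_holds = all(math.gcd(G_ORDER, H_ORDER) % ord_fa == 0 for _, ord_fa in orders)

print(part1_holds, part2_holds)
```

A check of this kind confirms the statements for one homomorphism only; it is no substitute for the general argument.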
G. Properties Preserved under Homomorphism
A property of groups is said to be "preserved under homomorphism" if, whenever a group G has that property, every homomorphic image of G does also. In this exercise set, we will survey a few typical properties preserved under homomorphism. If f : G → H is a homomorphism of G onto H, prove each of the following:
1 If G is abelian, then H is abelian.
2 If G is cyclic, then H is cyclic.
3 If every element of G has finite order, then every element of H has finite order.
4 If every element of G is its own inverse, every element of H is its own inverse.
5 If every element of G has a square root, then every element of H has a square root.
6 If G is finitely generated, then H is finitely generated. (A group is said to be "finitely generated" if it is generated by finitely many of its elements.)
† H. Inner Direct Products
If G is any group, let H and K be normal subgroups of G such that H ∩ K = {e}. Prove the following:
1 Let h₁ and h₂ be any two elements of H, and k₁ and k₂ any two elements of K.
h₁k₁ = h₂k₂ implies h₁ = h₂ and k₁ = k₂
(Hint: If h₁k₁ = h₂k₂, then h₂⁻¹h₁ ∈ H ∩ K and k₂k₁⁻¹ ∈ H ∩ K. Explain why.)
2 For any h ∈ H and k ∈ K, hk = kh. (Hint: hk = kh iff hkh⁻¹k⁻¹ = e. Use the fact that H and K are normal.)
3 Now, make the additional assumption that G = HK; that is, every x in G can be written as x = hk for some h ∈ H and k ∈ K. Then the function […]
⟨6⟩ + 0 = {…, −12, −6, 0, 6, 12, …}
⟨6⟩ + 1 = {…, −11, −5, 1, 7, 13, …}
⟨6⟩ + 2 = {…, −10, −4, 2, 8, 14, …}
⟨6⟩ + 3 = {…, −9, −3, 3, 9, 15, …}
⟨6⟩ + 4 = {…, −8, −2, 4, 10, 16, …}
⟨6⟩ + 5 = {…, −7, −1, 5, 11, 17, …}
These are all the different cosets of ⟨6⟩, for it is easy to see that ⟨6⟩ + 6 = ⟨6⟩ + 0, ⟨6⟩ + 7 = ⟨6⟩ + 1, ⟨6⟩ + 8 = ⟨6⟩ + 2, and so on.
Now, the operation on Z is denoted by +, and therefore we will call the operation on the cosets coset addition rather than coset multiplication. But nothing is changed except the name; for example, the coset ⟨6⟩ + 1 added to the coset ⟨6⟩ + 2 is the coset ⟨6⟩ + 3. The coset ⟨6⟩ + 3 added to the coset ⟨6⟩ + 4 is the coset ⟨6⟩ + 7, which is the same as ⟨6⟩ + 1. To simplify our notation, let us agree to write the cosets in the following shorter form:
0̄ = ⟨6⟩ + 0   1̄ = ⟨6⟩ + 1   2̄ = ⟨6⟩ + 2
3̄ = ⟨6⟩ + 3   4̄ = ⟨6⟩ + 4   5̄ = ⟨6⟩ + 5
Then Z/⟨6⟩ consists of the six elements 0̄, 1̄, 2̄, 3̄, 4̄, and 5̄, and its operation is summarized in the following table:
+  0̄ 1̄ 2̄ 3̄ 4̄ 5̄
0̄  0̄ 1̄ 2̄ 3̄ 4̄ 5̄
1̄  1̄ 2̄ 3̄ 4̄ 5̄ 0̄
2̄  2̄ 3̄ 4̄ 5̄ 0̄ 1̄
3̄  3̄ 4̄ 5̄ 0̄ 1̄ 2̄
4̄  4̄ 5̄ 0̄ 1̄ 2̄ 3̄
5̄  5̄ 0̄ 1̄ 2̄ 3̄ 4̄
The reader will perceive immediately the similarity between this group and Z₆. As a matter of fact, the quotient group construction of Z/⟨6⟩ is considered to be the rigorous way of constructing Z₆. So from now on, we will consider Z₆ to be the same as Z/⟨6⟩; and, in general, we will consider Zₙ to be the same as Z/⟨n⟩. In particular, we can see that for any n, Zₙ is a homomorphic image of Z.
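The coset addition table for Z/⟨6⟩ can be reproduced mechanically. The sketch below is an illustration only; it assumes each coset ⟨6⟩ + a is stored through a representative a, and the function names are hypothetical. It also checks that the sum does not depend on which representatives are chosen.

```python
N = 6  # the subgroup is <6>, the set of all multiples of 6

def coset(a, window=range(-18, 19)):
    """Finite window onto the coset <6> + a (the true coset is infinite)."""
    return frozenset(x for x in window if (x - a) % N == 0)

def add_cosets(a, b):
    """Coset addition via representatives: (<6> + a) + (<6> + b) = <6> + (a + b)."""
    return (a + b) % N

# Well-definedness: shifting a representative by a multiple of 6 changes nothing.
for a in range(N):
    for b in range(N):
        for shift_a in (-12, -6, 6, 12):
            for shift_b in (-12, -6, 6, 12):
                assert add_cosets(a + shift_a, b + shift_b) == add_cosets(a, b)

# Rows and columns are indexed by the representatives 0, 1, ..., 5.
table = [[add_cosets(a, b) for b in range(N)] for a in range(N)]
for a, row in enumerate(table):
    print(a, "|", *row)
```

Reducing the representative mod 6 is a design choice that picks one canonical name for each coset; any other choice of representatives would give the same table up to relabeling.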
Let us repeat: The motive for the quotient group construction is that it gives us a way of actually producing all the homomorphic images of any group G. However, what is even more fascinating about the quotient group construction is that, in practical instances, we can often choose H so as to "factor out" unwanted properties of G, and preserve in G/H only "desirable" traits. (By "desirable" we mean desirable within the context of some specific application or use.) Let us look at a few examples.
First, we will need two simple properties of cosets, which are given in the next theorem.
Theorem 5 Let G be a group and H a subgroup of G. Then
(i) Ha = Hb iff ab⁻¹ ∈ H and
(ii) Ha = H iff a ∈ H
Proof: If Ha = Hb, then a ∈ Hb, so a = hb for some h ∈ H. Thus, ab⁻¹ = h ∈ H. […]
[…] = {∅, {1}}. (P₃ is the group of subsets of {1, 2, 3}.)
B. Examples of Quotient Groups of R × R
In each of the following, G = R × R and H is a subset of R × R.
(a) Prove that H is a normal subgroup of R × R. (Remember that every subgroup of an abelian group is normal.)
(b) In geometrical terms, describe the elements of the quotient group G/H.
(c) In geometrical terms or otherwise, describe the operation of G/H.
1 H = {(x, 0) : x ∈ R}
2 H = {(x, y) : y = −x}
3 H = {(x, y) : y = 2x}
C. Relating Properties of H to Properties of G/H
In parts 1-5 below, G is a group and H is a normal subgroup of G. Prove the following (Theorem 5 will play a crucial role):
1 If x² ∈ H for every x ∈ G, then every element of G/H is its own inverse. Conversely, if every element of G/H is its own inverse, then x² ∈ H for all x ∈ G.
2 Let m be a fixed integer. If xᵐ ∈ H for every x ∈ G, then the order of every element in G/H is a divisor of m. Conversely, if the order of every element in G/H is a divisor of m, then xᵐ ∈ H for every x ∈ G.
3 Suppose that for every x ∈ G, there is an integer n such that xⁿ ∈ H; then every element of G/H has finite order. Conversely, if every element of G/H has finite order, then for every x ∈ G there is an integer n such that xⁿ ∈ H.
# 4 Every element of G/H has a square root iff for every x ∈ G, there is some y ∈ G such that xy² ∈ H.
5 G/H is cyclic iff there is an element a ∈ G with the following property: for every x ∈ G, there is some integer n such that xaⁿ ∈ H.
6 If G is an abelian group, let Hₚ be the set of all x ∈ G whose order is a power of p. Prove that Hₚ is a subgroup of G. Prove that G/Hₚ has no elements whose order is a nonzero power of p.
7 (a) If G/H is abelian, prove that H contains all the commutators of G.
(b) Let K be a normal subgroup of G, and H a normal subgroup of K. If G/H is abelian, prove that G/K and K/H are both abelian.
154 CHAPTER FIFTEEN
quotient groups 155
D. Properties of G Determined by Properties of G/H and H
There are some group properties which, if they are true in G/H and in H, must be true in G. Here is a sampling. Let G be a group, and H a normal subgroup of G. Prove the following:
1 If every element of G/H has finite order, and every element of H has finite order, then every element of G has finite order.
2 If every element of G/H has a square root, and every element of H has a square root, then every element of G has a square root. (Assume G is abelian.)
3 Let p be a prime number. If G/H and H are p-groups, then G is a p-group. A group G is called a p-group if the order of every element x in G is a power of p.
4 If G/H and H are finitely generated, then G is finitely generated. (A group is said to be finitely generated if it is generated by a finite subset of its elements.)
E. Order of Elements in Quotient Groups
Let G be a group, and H a normal subgroup of G. Prove the following:
1 For each element a ∈ G, the order of the element Ha in G/H is a divisor of the order of a in G. (Hint: Use Chapter 14, Exercise F1.)
2 If (G : H) = m, the order of every element of G/H is a divisor of m.
3 If (G : H) = p, where p is a prime, then the order of every element a ∉ H in G is a multiple of p. (Use part 1.)
4 If G has a normal subgroup of index p, where p is a prime, then G has at least one element of order p.
5 If (G : H) = m, then aᵐ ∈ H for every a ∈ G.
# 6 In Q/Z, every element has finite order.
† F. Quotient of a Group by Its Center
The center of a group G is the normal subgroup C of G consisting of all those elements of G which commute with every element of G. Suppose the quotient group G/C is a cyclic group; say it is generated by the element Ca of G/C. Prove parts 1-3:
1 For every x ∈ G, there is some integer m such that Cx = Caᵐ.
2 For every x ∈ G, there is some integer m such that x = caᵐ, where c ∈ C.
3 For any two elements x and y in G, xy = yx. (Hint: Use part 2 to write x = caᵐ, y = c′aⁿ, and remember that c, c′ ∈ C.)
4 Conclude that if G/C is cyclic, then G is abelian.
† G. Using the Class Equation to Determine the Size of the Center
(Prerequisite: Chapter 13, Exercise I.)
Let G be a finite group. Elements a and b in G are called conjugates of one another (in symbols, a ~ b) iff a = xbx⁻¹ for some x ∈ G (this is the same as b = x⁻¹ax). The relation ~ is an equivalence relation in G; the equivalence class of any element a is called its conjugacy class. Hence G is partitioned into conjugacy classes (as shown in the diagram); the size of each conjugacy class divides the order of G. (For these facts, see Chapter 13, Exercise I.)
[Figure: G partitioned into conjugacy classes. Caption: "Each element of the center C is alone in its conjugacy class."]
Let S₁, S₂, ..., Sₜ be the distinct conjugacy classes of G, and let k₁, k₂, ..., kₜ be their sizes. Then |G| = k₁ + k₂ + ⋯ + kₜ. (This is called the class equation of G.)
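To see the class equation at work, one can compute conjugacy classes by brute force. The sketch below makes assumptions that are not in the text: it models D₄ as permutations of the four vertices of a square, generated by a rotation r and a reflection s, and all function names are illustrative.

```python
def compose(p, q):
    """Return the permutation p∘q (apply q first, then p); p[i] is the image of i."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def generate(generators):
    """Close a set of permutations under multiplication (enough in a finite group)."""
    identity = tuple(range(len(generators[0])))
    group, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = compose(g, s)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return sorted(group)

r = (1, 2, 3, 0)  # assumed model: rotation of the square, i -> i + 1 (mod 4)
s = (0, 3, 2, 1)  # assumed model: reflection, i -> -i (mod 4)
G = generate([r, s])

def conjugacy_class(a):
    """All conjugates x a x^-1 as x ranges over G."""
    return frozenset(compose(compose(x, a), inverse(x)) for x in G)

classes = {conjugacy_class(a) for a in G}
sizes = sorted(len(c) for c in classes)
center = [a for a in G if all(compose(a, x) == compose(x, a) for x in G)]

print(len(G), sizes, len(center))
```

Here |G| = 8 = 2³, so this group also illustrates the situation of parts 1-4 below: the one-element classes are exactly the elements of the center.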
Let G be a group whose order is a power of a prime p, say |G| = pⁿ. Let C denote the center of G. Prove parts 1-3:
1 The conjugacy class of a contains a (and no other element) iff a ∈ C.
2 Let c be the order of C. Then |G| = c + kₛ + kₛ₊₁ + ⋯ + kₜ, where kₛ, ..., kₜ are the sizes of all the distinct conjugacy classes of elements x ∉ C.
3 For each i ∈ {s, s + 1, ..., t}, kᵢ is equal to a power of p. (See Chapter 13, Exercise I6.)
4 Solving the equation |G| = c + kₛ + ⋯ + kₜ for c, explain why c is a multiple of p.
We may conclude from part 4 that C must contain more than just the one element e; in fact, |C| is a multiple of p.
5 Prove: If |G| = p², G must be abelian. (Use the preceding Exercise F.)
# 6 Prove: If |G| = p², then either G ≅ Z_{p²} or G ≅ Zₚ × Zₚ.
† H. Induction on |G|: An Example
Many theorems of mathematics are of the form "P(n) is true for every positive integer n." [Here, P(n) is used as a symbol to denote some statement involving n.] Such theorems can be proved by induction as follows:
(a) Show that P(n) is true for n = 1.
(b) For any fixed positive integer k, show that, if P(n) is true for every n < k, then P(n) must also be true for n = k.
If we can show (a) and (b), we may safely conclude that P(n) is true for all positive integers n.
Some theorems of algebra can be proved by induction on the order n of a group. Here is a classical example: Let G be a finite abelian group. We will show that G must contain at least one element of order p, for every prime factor p of |G|. If |G| = 1, this is true by default, since no prime p can be a factor of 1. Next, let |G| = k, and suppose our claim is true for every abelian group whose order is less than k. Let p be a prime factor of k.
Take any element a ≠ e in G. If ord(a) = p or a multiple of p, we are done!
1 If ord(a) = tp (for some positive integer t), what element of G has order p?
2 Suppose ord(a) is not equal to a multiple of p. Then G/⟨a⟩ is a group having fewer than k elements. (Explain why.) The order of G/⟨a⟩ is a multiple of p. (Explain why.)
3 Why must G/⟨a⟩ have an element of order p?
4 Conclude that G has an element of order p. (Hint: Use Exercise E1.)
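The conclusion of this exercise set can be tested on cyclic groups, where orders are easy to compute. The following sketch assumes G = Zₙ written additively, so that the "t-th power" of a is the multiple t·a; the function names are illustrative and not part of the text. It checks only the claim itself and does not reproduce the induction.

```python
import math

def order(a, n):
    """Order of a in the additive group Z_n."""
    return n // math.gcd(a, n)

def prime_factors(n):
    """Distinct prime factors of n, by trial division."""
    factors, d = [], 2
    while d * d <= n:
        if n % d == 0:
            factors.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors

def multiple_of_order_p(a, n, p):
    """If ord(a) = t*p, return t*a (the additive analog of a^t); otherwise None."""
    k = order(a, n)
    if k % p != 0:
        return None
    t = k // p
    return (t * a) % n

# For every prime factor p of n, each a whose order is a multiple of p
# yields an element of order exactly p.
for n in range(2, 61):
    for p in prime_factors(n):
        for a in range(1, n):
            b = multiple_of_order_p(a, n, p)
            if b is not None:
                assert order(b, n) == p
```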
CHAPTER SIXTEEN
THE FUNDAMENTAL HOMOMORPHISM THEOREM
Let G be any group. In Chapter 15 we saw that every quotient group of G is a homomorphic image of G. Now we will see that, conversely, every homomorphic image of G is a quotient group of G. More exactly, every homomorphic image of G is isomorphic to a quotient group of G.
It will follow that, for any groups G and H, H is a homomorphic image of G iff H is (or is isomorphic to) a quotient group of G. Therefore, the notions of homomorphic image and of quotient group are interchangeable.
The thread of our reasoning begins with a simple theorem.
Theorem 1 Let f : G → H be a homomorphism with kernel K. Then f(a) = f(b) iff Ka = Kb
(In other words, any two elements a and b in G have the same image under f iff they are in the same coset of K.) Indeed,
f(a) = f(b) iff f(a)[f(b)]⁻¹ = e iff f(ab⁻¹) = e iff ab⁻¹ ∈ K
iff Ka = Kb (by Chapter 15, Theorem 5i)
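Theorem 1 can be checked by brute force on a small nonabelian group. The sketch below is an illustration under assumptions of our own: G is S₃ stored as tuples, and f is the sign homomorphism onto {1, −1}, whose kernel is the set of even permutations. None of these names come from the text.

```python
from itertools import permutations

G = list(permutations(range(3)))  # S3; p[i] is the image of i under p

def compose(p, q):
    """Return p∘q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

def sign(p):
    """Sign of a permutation: +1 if the number of inversions is even, else -1."""
    inversions = sum(
        1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j]
    )
    return -1 if inversions % 2 else 1

K = [p for p in G if sign(p) == 1]  # kernel of the sign homomorphism

def right_coset(a):
    """The coset Ka = {ka : k in K}."""
    return frozenset(compose(k, a) for k in K)

# Theorem 1: f(a) = f(b) exactly when Ka = Kb.
theorem_holds = all(
    (sign(a) == sign(b)) == (right_coset(a) == right_coset(b))
    for a in G
    for b in G
)
print(theorem_holds)
```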
What does this theorem really tell us? It says that if f is a homomorphism from G to H with kernel K, then all the elements in any fixed coset of K have the same image, and, conversely, elements which have the same image are in the same coset of K.
[Figure: the cosets K = Ke and Ka = Kb of K.]
It is therefore clear already that there is a one-to-one correspondence matching cosets of K with elements in H. It remains only to show that this correspondence is an isomorphism. But first, how exactly does this correspondence match up specific cosets of K with specific elements of H? Clearly, for each x, the coset Kx is matched with the element f(x). Once this is understood, the next theorem is easy.
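Theorem 2 Let f : G → H be a homomorphism of G onto H. If K is the kernel of f, then
H ≅ G/K
Proof: To show that G/K is isomorphic to H, we must look for an isomorphism from G/K to H. We have just seen that there is a function from G/K to H which matches each coset Kx with the element f(x); call this function φ. Thus, φ is defined by
φ(Kx) = f(x)
This definition does not make it obvious that φ(Kx) is uniquely defined. We must make sure that if Ka is the same coset as Kb, then φ(Ka) is the same as φ(Kb); that is, if Ka = Kb, then f(a) = f(b). This is true by Theorem 1. Now let us show that φ is an isomorphism:
φ is injective: If φ(Ka) = φ(Kb), then f(a) = f(b); so by Theorem 1, Ka = Kb.
φ is surjective, because every element of H is of the form f(x) = φ(Kx).
Finally, φ(Ka · Kb) = φ(Kab) = f(ab) = f(a)f(b) = φ(Ka)φ(Kb).
Thus, φ is an isomorphism from G/K onto H. ■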
Theorem 2 is often called the fundamental homomorphism theorem. It asserts that every homomorphic image of G is isomorphic to a quotient group of G. Which specific quotient group of G? Well, if f is a homomorphism from G onto H, then H is isomorphic to the quotient group of G by the kernel of f.
The fact that f is a homomorphism from G onto H may be symbolized by writing
f : G ↠ H
Furthermore, the fact that K is the kernel of this homomorphism may be indicated by writing K beneath the arrow:
f : G ↠_K H
Thus, in capsule form, the fundamental homomorphism theorem says that
If f : G ↠_K H then H ≅ G/K
Let us see a few examples:
We saw in the opening paragraph of Chapter 14 that
f = ( 0 1 2 3 4 5
      0 1 2 0 1 2 )
is a homomorphism from Z₆ onto Z₃. Visibly, the kernel of f is {0, 3}, which is the subgroup of Z₆ generated by 3, that is, the subgroup ⟨3⟩. This situation may be symbolized by writing
f : Z₆ ↠_⟨3⟩ Z₃
We conclude by Theorem 2 that
Z₃ ≅ Z₆/⟨3⟩
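The isomorphism in this example can be built explicitly, following the recipe φ(Kx) = f(x) from the proof of Theorem 2. The sketch below is illustrative only; it assumes Z₆ and Z₃ are represented by the integers 0-5 and 0-2 with addition modulo 6 and 3, and the variable names are our own.

```python
G = range(6)  # Z_6 under addition mod 6

def f(x):
    """The homomorphism of the example: 0,1,2,3,4,5 -> 0,1,2,0,1,2."""
    return x % 3

K = frozenset(x for x in G if f(x) == 0)  # kernel of f

def coset(a):
    """The coset K + a in Z_6."""
    return frozenset((k + a) % 6 for k in K)

# Define phi on cosets by phi(K + a) = f(a), checking that every
# representative of a coset gives the same value (well-definedness).
phi = {}
for a in G:
    assert phi.setdefault(coset(a), f(a)) == f(a)

# phi is a bijection from Z_6/K onto Z_3 ...
is_bijection = sorted(phi.values()) == [0, 1, 2]

# ... and it respects the operations: phi((K+a) + (K+b)) = phi(K+a) + phi(K+b).
is_homomorphism = all(
    phi[coset((a + b) % 6)] == (phi[coset(a)] + phi[coset(b)]) % 3
    for a in G
    for b in G
)
print(sorted(K), is_bijection, is_homomorphism)
```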
For another kind of example, let G and H be any groups and consider their direct product G × H. Remember that G × H consists of all the ordered pairs (x, y) as x ranges over G and y ranges over H. You multiply ordered pairs by multiplying corresponding components: that is, the operation on G × H is given by
(a, b) · (c, d) = (ac, bd)
Now, let f be the function from G × H onto H given by
f(x, y) = y
It is easy to check that f is a homomorphism. Furthermore, (x, y) is in the kernel of f iff f(x, y) = y = e. This means that the kernel of f consists of all the ordered pairs whose second component is e. Call this kernel G*; then
G* = {(x, e) : x ∈ G}
We symbolize all this by writing
f : G × H ↠_G* H
By the fundamental homomorphism theorem, we deduce that H ≅ (G × H)/G*. [It is easy to see that G* is an isomorphic copy of G; thus, identifying G* with G, we have shown that, roughly speaking, (G × H)/G ≅ H.]
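A concrete instance of this projection example can be enumerated directly. The sketch below assumes G = Z₂ and H = Z₃, written additively so that the identity e is 0; these choices and the names used are illustrative and not from the text.

```python
from itertools import product

G, H = range(2), range(3)   # assumed example: G = Z_2, H = Z_3
GxH = list(product(G, H))   # all ordered pairs (x, y)

def mul(p, q):
    """Componentwise operation on G x H."""
    return ((p[0] + q[0]) % 2, (p[1] + q[1]) % 3)

def f(p):
    """Projection onto the second component."""
    return p[1]

# The kernel G* consists of the pairs whose second component is the identity.
kernel = frozenset(p for p in GxH if f(p) == 0)

def coset(p):
    """The coset G* p."""
    return frozenset(mul(k, p) for k in kernel)

# phi(G* p) = f(p) is well defined on cosets ...
phi = {}
for p in GxH:
    assert phi.setdefault(coset(p), f(p)) == f(p)

# ... and matches the cosets of G* one-to-one with the elements of H.
print(sorted(kernel), sorted(phi.values()))
```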
Other uses of the fundamental homomorphism theorem are given in the exercises.
EXERCISES
In the exercises which follow, FHT will be used as an abbreviation for fundamental homomorphism theorem.
A. Examples of the FHT Applied to Finite Groups
In each of the following, use the fundamental homomorphism theorem to prove that the two given groups are isomorphic. Then display their tables.
Example Z₂ and Z₆/⟨2⟩.
f = ( 0 1 2 3 4 5
      0 1 0 1 0 1 )
is a homomorphism from Z₆ onto Z₂. (Do not prove that f is a homomorphism.) The kernel of f is {0, 2, 4} = ⟨2⟩. Thus,
f : Z₆ ↠_⟨2⟩ Z₂
It follows by the FHT that Z₂ ≅ Z₆/⟨2⟩.
1 Z₅ and Z₂₀/⟨5⟩.
2 Z3 and Z„/(3).
3 Z₂ and S₃/{ε, β, δ}.
4 P₂ and P₃/K, where K = {∅, {c}}. [Hint: Consider the function f(C) = C ∩ {a, b}. P₃ is the group of subsets of {a, b, c}, and P₂ of {a, b}.]
5 Z₃ and (Z₃ × Z₃)/K, where K = {(0, 0), (1, 1), (2, 2)}. [Hint: Consider the function f(a, b) = a − b from Z₃ × Z₃ to Z₃.]
B. Example of the FHT Applied to ℱ(R)
Let α : ℱ(R) → R be defined by α(f) = f(1) and let β : ℱ(R) → R be defined by β(f) = f(2).
1 Prove that α and β are homomorphisms from ℱ(R) onto R.
2 Let J be the set of all the functions from R to R whose graph passes through the point (1, 0) and let K be the set of all the functions whose graph passes through (2, 0). Use the FHT to prove that R ≅ ℱ(R)/J and R ≅ ℱ(R)/K.
3 Conclude that ℱ(R)/J ≅ ℱ(R)/K.
C. Example of the FHT Applied to Abelian Groups
Let G be an abelian group. Let H = {x² : x ∈ G} and K = {x ∈ G : x² = e}.
1 Prove that f(x) = x² is a homomorphism of G onto H.
2 Find the kernel of f.
3 Use the FHT to conclude that H ≅ G/K.
† D. Group of Inner Automorphisms of a Group G
Let G be a group. By an automorphism of G we mean an isomorphism f : G → G.
# 1 The symbol Aut(G) is used to designate the set of all the automorphisms of G. Prove that the set Aut(G), with the operation ∘ of composition, is a group by proving that Aut(G) is a subgroup of S_G.
2 By an inner automorphism of G we mean any function φₐ of the following form:
for every x ∈ G, φₐ(x) = axa⁻¹
Prove that every inner automorphism of G is an automorphism of G.
3 Prove that, for arbitrary a, b ∈ G,
φₐ ∘ φ_b = φ_ab and (φₐ)⁻¹ = φ_{a⁻¹}
4 Let I(G) designate the set of all the inner automorphisms of G. That is, I(G) = {φₐ : a ∈ G}. Use part 3 to prove that I(G) is a subgroup of Aut(G). Explain why I(G) is a group.
5 By the center of G we mean the set of all those elements of G which commute with every element of G, that is, the set C defined by
C = {a ∈ G : ax = xa for every x ∈ G}
Prove that a ∈ C if and only if axa⁻¹ = x for every x ∈ G.
6 Let h : G → I(G) be the function defined by h(a) = φₐ. Prove that h is a homomorphism from G onto I(G) and that C is its kernel.
7 Use the FHT to conclude that I(G) is isomorphic with G/C.
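Before proving parts 6 and 7, it can help to watch them hold in one small group. The sketch below is an illustration under our own assumptions: G is D₄ modeled as permutations of the vertices of a square, each inner automorphism is stored as the tuple of its values on a fixed listing of G, and all names are hypothetical.

```python
def compose(p, q):
    """Return p∘q (apply q first, then p); p[i] is the image of i."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def generate(generators):
    """Close a set of permutations under multiplication (enough in a finite group)."""
    identity = tuple(range(len(generators[0])))
    group, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = compose(g, s)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return sorted(group)

G = generate([(1, 2, 3, 0), (0, 3, 2, 1)])  # assumed model of D4

def conj(a, x):
    """The value of the inner automorphism determined by a at x: a x a^-1."""
    return compose(compose(a, x), inverse(a))

def phi(a):
    """The inner automorphism determined by a, as the tuple of its values on G."""
    return tuple(conj(a, x) for x in G)

inner = {phi(a) for a in G}
center = [a for a in G if all(compose(a, x) == compose(x, a) for x in G)]
kernel = [a for a in G if phi(a) == tuple(G)]  # a with phi(a) the identity map

# h(a) = phi(a) is a homomorphism: the map for ab is the composite of the maps.
h_is_homomorphism = all(
    conj(compose(a, b), x) == conj(a, conj(b, x)) for a in G for b in G for x in G
)
print(len(G), len(inner), len(center), h_is_homomorphism)
```

The count |I(G)| · |C| = |G| is what the FHT predicts once h is known to be onto I(G) with kernel C.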
† E. The FHT Applied to Direct Products of Groups
Let G and H be groups. Suppose J is a normal subgroup of G and K is a normal subgroup of H.
1 Show that the function f(x, y) = (Jx, Ky) is a homomorphism from G × H onto (G/J) × (H/K).
2 Find the kernel of f.
3 Use the FHT to conclude that (G × H)/(J × K) ≅ (G/J) × (H/K).
† F. First Isomorphism Theorem
Let G be a group; let H and K be subgroups of G, with H a normal subgroup of G. Prove the following:
1 H ∩ K is a normal subgroup of K.
# 2 If HK = {xy : x ∈ H and y ∈ K}, then HK is a subgroup of G.
3 H is a normal subgroup of HK.
4 Every member of the quotient group HK/H may be written in the form Hk for some k ∈ K.
5 The function f(k) = Hk is a homomorphism from K onto HK/H, and its kernel is H ∩ K.
6 By the FHT, K/(H ∩ K) ≅ HK/H. (This is referred to as the first isomorphism theorem.)
† G. A Sharper Cayley Theorem
If H is a subgroup of a group G, let X designate the set of all the left cosets of H in G. For each element a ∈ G, define ρₐ : X → X as follows:
ρₐ(xH) = (ax)H
1 Prove that each ρₐ is a permutation of X.
2 Prove that h : G → S_X defined by h(a) = ρₐ is a homomorphism.
# 3 Prove that the set {a ∈ H : xax⁻¹ ∈ H for every x ∈ G}, that is, the set of all the elements of H whose conjugates are all in H, is the kernel of h.
4 Prove that if H contains no normal subgroup of G except {e}, then G is isomorphic to a subgroup of S_X.
† H. Quotient Groups Isomorphic to the Circle Group
Every complex number a + bi may be represented as a point in the complex plane.
[Figure: the complex plane with its real and imaginary axes, showing the point a + bi and a point cos x + i sin x on the unit circle.]
The unit circle in the complex plane consists of all the complex numbers whose distance from the origin is 1; thus, clearly, the unit circle consists of all the complex numbers which can be written in the form
cos x + i sin x
for some real number x.
# 1 For each x ∈ R, it is conventional to write cis x = cos x + i sin x. Prove that cis (x + y) = (cis x)(cis y).
2 Let T designate the set {cis x : x ∈ R}, that is, the set of all the complex numbers lying on the unit circle, with the operation of multiplication. Use part 1 to prove that T is a group. (T is called the circle group.)
3 Prove that f(x) = cis x is a homomorphism from R onto T.
4 Prove that ker f = {2nπ : n ∈ Z} = ⟨2π⟩.
5 Use the FHT to conclude that T ≅ R/⟨2π⟩.
6 Prove that g(x) = cis 2πx is a homomorphism from R onto T, with kernel Z.
7 Conclude that T ≅ R/Z.
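The identities behind the circle group exercise can be checked numerically, up to floating-point error. This is a sketch only: the sample points and tolerances are arbitrary choices of ours, and a numerical check is of course not a proof.

```python
import cmath
import math

def cis(x):
    """cis x = cos x + i sin x, a point on the unit circle."""
    return complex(math.cos(x), math.sin(x))

def g(x):
    """g(x) = cis(2*pi*x); integers should map to 1."""
    return cis(2 * math.pi * x)

samples = [0.0, 0.7, 1.9, -2.4, 5.0]  # arbitrary real test points

# Part 1: cis(x + y) = (cis x)(cis y).
additive = all(
    cmath.isclose(cis(x + y), cis(x) * cis(y), abs_tol=1e-12)
    for x in samples
    for y in samples
)

# Every value of cis has distance 1 from the origin.
on_circle = all(abs(abs(cis(x)) - 1.0) < 1e-12 for x in samples)

# Multiples of 2*pi lie in the kernel of cis; integers lie in the kernel of g.
kernel_f = all(cmath.isclose(cis(2 * math.pi * n), 1, abs_tol=1e-9) for n in range(-3, 4))
kernel_g = all(cmath.isclose(g(n), 1, abs_tol=1e-9) for n in range(-3, 4))

print(additive, on_circle, kernel_f, kernel_g)
```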
† I. The Second Isomorphism Theorem
Let H and K be normal subgroups of a group G, with H ⊆ K. Define φ : G/H → G/K by φ(Ha) = Ka. Prove parts 1-4:
1 φ is a well-defined function. [That is, if Ha = Hb, then φ(Ha) = φ(Hb).]
2 φ is a homomorphism.
3 φ is surjective.
4 ker φ = K/H.
5 Conclude (using the FHT) that (G/H)/(K/H) ≅ G/K.
† J. The Correspondence Theorem
Let f be a homomorphism from G onto H with kernel K:
f : G ↠_K H
If S is any subgroup of H, let S* = {x ∈ G : f(x) ∈ S}. Prove:
1 S* is a subgroup of G.
2 K ⊆ S*.
3 Let g be the restriction of f to S*. [That is, g(x) = f(x) for every x ∈ S*, and S* is the domain of g.] Then g is a homomorphism from S* onto S, and K = ker g.
4 S ≅ S*/K.
† K. Cauchy's Theorem
Prerequisites: Chapter 13, Exercise I, and Chapter 15, Exercises G and H.
If G is a group and p is any prime divisor of |G|, it will be shown here that G has at least one element of order p. This has already been shown for abelian groups in Chapter 15, Exercise H4. Thus, assume here that G is not abelian. The argument will proceed by induction; thus, let |G| = k, and assume our claim is true for any group of order less than k. Let C be the center of G, let Cₐ be the centralizer of a for each a ∈ G, and let k = c + kₛ + ⋯ + kₜ be the class equation of G, as in Chapter 15, Exercise G2.
1 Prove: If p is a factor of |Cₐ| for any a ∈ G, where a ∉ C, we are done. (Explain why.)
2 Prove that for any a ∉ C in G, if p is not a factor of |Cₐ|, then p is a factor of (G : Cₐ).
3 Solving the equation k = c + kₛ + ⋯ + kₜ for c, explain why p is a factor of c. We are now done. (Explain why.)
† L. Subgroups of p-Groups (Prelude to Sylow)
Prerequisites: Exercise J; Chapter 15, Exercises G and H.
Let p be a prime number. A p-group is any group whose order is a power of p. It will be shown here that if |G| = pᵏ then G has a normal subgroup of order pᵐ for every m between 1 and k. The proof is by induction on |G|; we therefore assume our result is true for all p-groups smaller than G. Prove parts 1 and 2:
1 There is an element a in the center of G such that ord(a) = p. (See Chapter 15, Exercises G and H.)
2 ⟨a⟩ is a normal subgroup of G.
3 Explain why it may be assumed that G/⟨a⟩ has a normal subgroup of order pᵐ⁻¹.
# 4 Use Exercise J4 to prove that G has a normal subgroup of order pᵐ.
SUPPLEMENTARY EXERCISES
Exercise sets M through Q are included as a challenge for the ambitious reader. Two important results of group theory are proved in these exercises: one is called Sylow's theorem, the other is called the basis theorem of finite abelian groups.
† M. p-Sylow Subgroups
Prerequisites: Exercises J and K of this Chapter, Exercise I1 of Chapter 14, and Exercise D3 of Chapter 15.
Let p be a prime number. A finite group G is called a p-group if the order of every element x in G is a power of p. (The orders of different elements may be different powers of p.) If H is a subgroup of any finite group G, and H is a p-group, we call H a p-subgroup of G. Finally, if K is a p-subgroup of G, and K is maximal (in the sense that K is not contained in any larger p-subgroup of G), then K is called a p-Sylow subgroup of G.
1 Prove that the order of any p-group is a power of p. (Hint: Use Exercise K.)
2 Prove that every conjugate of a p-Sylow subgroup of G is a p-Sylow subgroup of G.
Let K be a p-Sylow subgroup of G, and N = N(K) the normalizer of K.
3 Let a ∈ N, and suppose the order of Ka in N/K is a power of p. Let S = ⟨Ka⟩ be the cyclic subgroup of N/K generated by Ka. Prove that N has a subgroup S* such that S*/K is a p-group. (Hint: See Exercise J4.)
4 Prove that S* is a p-subgroup of G (use Exercise D3, Chapter 15). Then explain why S* = K, and why it follows that Ka = K.
5 Use parts 3 and 4 to prove: no element of N/K has order a power of p (except, trivially, the identity element).
6 If a ∈ N and the order of a is a power of p, then the order of Ka (in N/K) is also a power of p. (Why?) Thus, Ka = K. (Why?)
7 Use part 6 to prove: if aKa⁻¹ = K and the order of a is a power of p, then a ∈ K.
† N. Sylow's Theorem
Prerequisites: Exercises K and M of this Chapter and Exercise I of Chapter 14.
Let G be a finite group, and K a p-Sylow subgroup of G. Let X be the set of all the conjugates of K. See Exercise M2. If C₁, C₂ ∈ X, let C₁ ~ C₂ iff C₁ = aC₂a⁻¹ for some a ∈ K.
1 Prove that ~ is an equivalence relation on X.
Thus, ~ partitions X into equivalence classes. If C ∈ X, let the equivalence class of C be denoted by [C].
[Figure caption: K is the only member of its class.]
2 For each C ∈ X, prove that the number of elements in [C] is a divisor of |K|. (Hint: Use Exercise I10 of Chapter 14.) Conclude that for each C ∈ X, the number of elements in [C] is either 1 or a power of p.
3 Use Exercise M7 to prove that the only class with a single element is [K].
4 Use parts 2 and 3 to prove that the number of elements in X is kp + 1, for some integer k.
5 Use part 4 to prove that (G : N) is not a multiple of p.
6 Prove that (N : K) is not a multiple of p. (Use Exercises K and M5.)
7 Use parts 5 and 6 to prove that (G : K) is not a multiple of p.
8 Conclude: Let G be a finite group of order pᵏm, where p is not a factor of m. Every p-Sylow subgroup K of G has order pᵏ.
Combining part 8 with Exercise L gives
Let G be a finite group and let p be a prime number. For each n such that pⁿ divides |G|, G has a subgroup of order pⁿ.
This is known as Sylow's theorem.
† O. Lifting Elements from Cosets
The purpose of this exercise is to prove a property of cosets which is needed in Exercise Q. Let G be a finite abelian group, and let a be an element of G such that ord(a) is a multiple of ord(x) for every x ∈ G. Let H = ⟨a⟩. We will prove:
For every x ∈ G, there is some y ∈ G such that Hx = Hy and ord(y) = ord(Hy).
This means that every coset of H contains an element y whose order is the same as the coset's order.
Let x be any element in G, and let ord(a) = t, ord(x) = s, and ord(Hx) = r.
1 Explain why r is the least positive integer such that xʳ equals some power of a, say xʳ = aᵐ.
2 Deduce from our hypotheses that r divides s, and s divides t.
Thus, we may write s = ru and t = sv, so in particular, t = ruv.
3 Explain why a^(mu) = e, and why it follows that mu = tz for some integer z. Then explain why m = rvz.
4 Setting y = xa^(−vz), prove that Hx = Hy and ord(y) = r, as required.
† P. Decomposition of a Finite Abelian Group into p-Groups
Let G be an abelian group of order pᵏm, where pᵏ and m are relatively prime (that is, pᵏ and m have no common factors except ±1). (Remark: If two integers s and k are relatively prime, then there are integers j and t such that sj + tk = 1. This is proved on page 220.)
Let G_{pᵏ} be the subgroup of G consisting of all elements whose order divides pᵏ. Let G_m be the subgroup of G consisting of all elements whose order divides m. Prove:
1 For any x ∈ G and integers s and t, x^(spᵏ) ∈ G_m and x^(tm) ∈ G_{pᵏ}.
2 For every x ∈ G, there are y ∈ G_{pᵏ} and z ∈ G_m such that x = yz.
3 G_{pᵏ} ∩ G_m = {e}.
4 G ≅ G_{pᵏ} × G_m. (See Exercise H, Chapter 14.)
5 Suppose |G| has the following factorization into primes: |G| = p₁^(k₁) p₂^(k₂) ⋯ pₙ^(kₙ). Then G ≅ G₁ × G₂ × ⋯ × Gₙ where for each i = 1, ..., n, Gᵢ is a pᵢ-group.
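The remark about relatively prime integers, and the splitting x = yz of part 2, can both be made concrete. The sketch below is an illustration under assumptions of our own: G is Z₁₂ written additively (so x = yz becomes x = y + z), with pᵏ = 4 and m = 3, and the coefficients come from the extended Euclidean algorithm. The function names are hypothetical.

```python
import math

def extended_gcd(a, b):
    """Return (g, j, t) with g = gcd(a, b) and a*j + b*t == g."""
    if b == 0:
        return a, 1, 0
    g, j, t = extended_gcd(b, a % b)
    return g, t, j - (a // b) * t

n, pk, m = 12, 4, 3             # assumed example: |G| = 12 = 4 * 3
g, j, t = extended_gcd(pk, m)   # pk*j + m*t == 1 because gcd(4, 3) = 1

def order(x):
    """Order of x in the additive group Z_n."""
    return n // math.gcd(x, n)

def split(x):
    """Write x = y + z with ord(y) dividing pk and ord(z) dividing m."""
    y = (m * t * x) % n    # pk * y is a multiple of pk*m = n, so ord(y) divides pk
    z = (pk * j * x) % n   # m * z is a multiple of n, so ord(z) divides m
    return y, z

for x in range(n):
    y, z = split(x)
    assert (y + z) % n == x
    assert pk % order(y) == 0
    assert m % order(z) == 0
```

The same two coefficients j and t work for every x, which is why the decomposition in part 2 needs no case analysis.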
Q. Basis Theorem for Finite Abelian Groups
Prerequisite: Exercise P.
As a provisional definition, let us call a finite abelian group G "decomposable" if there are elements a₁, ..., aₙ ∈ G such that:
(D1) For every x ∈ G, there are integers k₁, ..., kₙ such that x = a₁^(k₁) a₂^(k₂) ⋯ aₙ^(kₙ).
(D2) If there are integers l₁, ..., lₙ such that a₁^(l₁) a₂^(l₂) ⋯ aₙ^(lₙ) = e, then a₁^(l₁) = a₂^(l₂) = ⋯ = aₙ^(lₙ) = e.
If (D1) and (D2) hold, we will write G = [a₁, a₂, ..., aₙ]. Assume this in parts 1 and 2.
1 Let G′ be the set of all products a₂^(l₂) ⋯ aₙ^(lₙ), as l₂, ..., lₙ range over Z. Prove that G′ is a subgroup of G, and G′ = [a₂, ..., aₙ].
2 Prove: G ≅ ⟨a₁⟩ × G′. Conclude that G ≅ ⟨a₁⟩ × ⟨a₂⟩ × ⋯ × ⟨aₙ⟩.
In the remaining exercises of this set, let p be a prime number, and assume G is a finite abelian group such that the order of every element in G is some power of p. Let a ∈ G be an element whose order is the highest possible in G. We will argue by induction to prove that G is "decomposable." Let H = ⟨a⟩.
3 Explain why we may assume that G/H = [Hb₁, ..., Hbₙ] for some b₁, ..., bₙ ∈ G.
By Exercise O, we may assume that for each i = 1, ..., n, ord(bᵢ) = ord(Hbᵢ). We will show that G = [a, b₁, ..., bₙ].
4 Prove that for every x ∈ G, there are integers k₀, k₁, ..., kₙ such that x = a^(k₀) b₁^(k₁) ⋯ bₙ^(kₙ).
5 Prove that if a^(l₀) b₁^(l₁) ⋯ bₙ^(lₙ) = e, then a^(l₀) = b₁^(l₁) = ⋯ = bₙ^(lₙ) = e. Conclude that G = [a, b₁, ..., bₙ].
6 Use Exercise P5, together with parts 2 and 5 above, to prove: Every finite abelian group G is a direct product of cyclic groups of prime power order. (This is called the basis theorem of finite abelian groups.)
It can be proved that the above decomposition of a finite abelian group into cyclic p-groups is unique, except for the order of the factors. We leave it to the ambitious reader to supply the proof of uniqueness.
CHAPTER
SEVENTEEN
RINGS: DEFINITIONS AND ELEMENTARY PROPERTIES
In presenting scientific knowledge it is elegant as well as enlightening to begin with the simple and move toward the more complex. If we build upon a knowledge of the simplest things, it is easier to understand the more complex ones. In the first part of this book we dedicated ourselves to the study of groups—surely one of the simplest and most fundamental of all algebraic systems. We will now move on, and, using the knowledge and insights gained in the study of groups, we will begin to examine algebraic systems which have two operations instead of just one.
The most basic of the two-operational systems is called a ring: it will be defined in a moment. The surprising fact about rings is that, despite their having two operations and being more complex than groups, their fundamental properties follow exactly the pattern already laid out for groups. With remarkable, almost compelling ease, we will find two-operational analogs of the notions of subgroup and quotient group, homomorphism and isomorphism—as well as other algebraic notions— and we will discover that rings behave just like groups with respect to these notions.
The two operations of a ring are traditionally called addition and multiplication, and are denoted as usual by + and •, respectively. We must remember, however, that the elements of a ring are not necessarily numbers (for example, there are rings of functions, rings of switching circuits, and so on); and therefore "addition" does not necessarily refer to the conventional addition of numbers, nor does multiplication necessarily refer to the conventional operation of multiplying numbers. In fact,
+ and • are nothing more than symbols denoting the two operations of a ring.
By a ring we mean a set A with operations called addition and multiplication which satisfy the following axioms:
(i) A with addition alone is an abelian group.
(ii) Multiplication is associative.
(iii) Multiplication is distributive over addition. That is, for all a, b, and c in A,
a(b + c) = ab + ac and (b + c)a = ba + ca
Since A with addition alone is an abelian group, there is in A a neutral element for addition: it is called the zero element and is written 0. Also, every element has an additive inverse called its negative; the negative of a is denoted by -a. Subtraction is defined by
a - b = a + (-b)
The easiest examples of rings are the traditional number systems. The set Z of the integers, with conventional addition and multiplication, is a ring called the ring of the integers. We designate this ring simply with the letter Z. (The context will make it clear whether we are referring to the ring of the integers or the additive group of the integers.)
Similarly, Q is the ring of the rational numbers, R the ring of the real numbers, and C the ring of the complex numbers. In each case, the operations are conventional addition and multiplication.
Remember that ℱ(R) represents the set of all the functions from R to R; that is, the set of all real-valued functions of a real variable. In calculus we learned to add and multiply functions: if f and g are any two functions from R to R, their sum f + g and their product fg are defined as follows:
[f + g](x) = f(x) + g(x) for every real number x
and
[fg](x) = f(x)g(x) for every real number x
ℱ(R) with these operations for adding and multiplying functions is a ring called the ring of real functions. It is written simply as ℱ(R). On page 46 we saw that ℱ(R) with only addition of functions is an abelian group. It is left as an exercise for you to verify that multiplication of functions is associative and distributive over addition of functions.
The rings Z, Q, R, C, and ℱ(R) are all infinite rings, that is, rings with infinitely many elements. There are also finite rings: rings with a finite number of elements. As an important example, consider the group Z_n, and define an operation of multiplication on Z_n by allowing the product ab to be the remainder of the usual product of integers a and b after division by n. (For example, in Z5, 2 • 4 = 3, 3 • 3 = 4, and 4 • 3 = 2.) This operation is called multiplication modulo n. Z_n with addition and multiplication modulo n is a ring: the details are given in Chapter 19.
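Multiplication modulo n is easy to experiment with on a computer. The following is a minimal Python sketch, not part of the text; the function names mul_mod and mul_table are illustrative choices. It reproduces the three products in Z5 quoted above and prints the whole multiplication table.

```python
def mul_mod(a: int, b: int, n: int) -> int:
    """Multiply a and b in Z_n: the remainder of the ordinary product after division by n."""
    return (a * b) % n


def mul_table(n: int) -> list[list[int]]:
    """Full multiplication table of Z_n; entry [a][b] is a*b modulo n."""
    return [[mul_mod(a, b, n) for b in range(n)] for a in range(n)]


# The three products quoted in the text for Z_5.
print(mul_mod(2, 4, 5), mul_mod(3, 3, 5), mul_mod(4, 3, 5))  # prints: 3 4 2

# Print the table row by row.
for row in mul_table(5):
    print(row)
```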
Let A be any ring. Since A with addition alone is an abelian group, everything we know about abelian groups applies to it. However, it is important to remember that A with addition is an abelian group in additive notation and, therefore, before applying theorems about groups to A, these theorems must be translated into additive notation. For example, Theorems 1, 2, and 3 of Chapter 4 read as follows when the notation is additive and the group is abelian:
a + b = a + c implies b = c (1)
a + b = 0 implies a= -b and b = -a (2)
-(a + b) = (-a) + (-b) and -(-a) = a (3)
Therefore Conditions (1), (2), and (3) are true in every ring.
What happens in a ring when we multiply elements by zero? What happens when we multiply elements by the negatives of other elements? The next theorem answers these questions.
Theorem 1 Let a and b be any elements of a ring A.
(i) a0 = 0 and 0a = 0
(ii) a(-b) = -(ab) and (-a)b = -(ab)
(iii) (-a)(-b) = ab
Part (i) asserts that multiplication by zero always yields zero, and parts (ii) and (iii) state the familiar rules of signs.
Proof: To prove (i) we note that
aa + 0 = aa
= a(a + 0) because a = a + 0
= aa + a0 by the distributive law
Thus, aa + 0 = aa + a0. By Condition (1) above we may eliminate the term aa on both sides of this equation, and therefore 0 = a0. To prove (ii), we have
a(-b) + ab = a[(-b) + b] by the distributive law
= a0
= 0 by part (i)
Thus, a(-b) + ab = 0. By Condition (2) above we deduce that a(-b) = -(ab). The twin formula (-a)b = -(ab) is deduced analogously.
We prove part (iii) by using part (ii) twice:
(-a)(-b) = -[a(-b)] = -[-(ab)] = ab ■
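As a sanity check (not a proof), the three parts of Theorem 1 can be tested by brute force in the finite rings Z_n. The sketch below is an illustration only; the helper name check_theorem_1 and the range of n tested are arbitrary choices.

```python
def check_theorem_1(n: int) -> bool:
    """Check the three parts of Theorem 1 for every pair (a, b) in Z_n."""
    neg = lambda x: (-x) % n          # additive inverse in Z_n
    mul = lambda x, y: (x * y) % n    # multiplication modulo n
    for a in range(n):
        if mul(a, 0) != 0 or mul(0, a) != 0:        # part (i)
            return False
        for b in range(n):
            if mul(a, neg(b)) != neg(mul(a, b)):    # part (ii), first formula
                return False
            if mul(neg(a), b) != neg(mul(a, b)):    # part (ii), twin formula
                return False
            if mul(neg(a), neg(b)) != mul(a, b):    # part (iii)
                return False
    return True


print(all(check_theorem_1(n) for n in range(1, 13)))  # prints: True
```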
The general definition of a ring is sparse and simple. However, particular rings may also have "optional features" which make them more versatile and interesting. Some of these options are described next.
By definition, addition is commutative in every ring but multiplication need not be. When multiplication also is commutative in a ring, we call that ring a commutative ring.
A ring A does not necessarily have a neutral element for multiplication. If there is in A a neutral element for multiplication, it is called the unity of A, and is denoted by the symbol 1. Thus, a • 1 = a and 1 • a = a for every a in A. If A has a unity, we call A a ring with unity. The rings Z, Q, R, C, and ℱ(R) are all examples of commutative rings with unity.
Incidentally, a ring whose only element is 0 is called a trivial ring; a ring with more than one element is nontrivial. In a nontrivial ring with unity, necessarily 1 ≠ 0. This is true because if 1 = 0 and x is any element of the ring, then
x = x1 = x0 = 0
In other words, if 1 = 0 then every element of the ring is equal to 0; hence 0 is the only element of the ring.
If A is a ring with unity, there may be elements in A which have a multiplicative inverse. Such elements are said to be invertible. Thus, an element a is invertible in a ring if there is some x in the ring such that
ax = xa = 1
For example, in R every nonzero element is invertible: its multiplicative inverse is its reciprocal. On the other hand, in Z the only invertible elements are 1 and -1.
Zero is never an invertible element of a ring except if the ring is trivial; for if zero had a multiplicative inverse x, we would have 0x = 1, that is, 0 = 1.
If A is a commutative ring with unity in which every nonzero element is invertible, A is called a field. Fields are of the utmost importance in mathematics; for example, Q, R, and C are fields. There are also finite fields, such as Z5 (it is easy to check that every nonzero element of Z5 is invertible). Finite fields have beautiful properties and fascinating applications, which will be examined later in this book.
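The claim about Z5 can be checked by direct search. Here is a small Python sketch under the assumption that we only look at rings of the form Z_n with n at least 2 (which are commutative rings with unity); the names invertible_elements and is_field are illustrative.

```python
def invertible_elements(n: int) -> list[int]:
    """Elements a of Z_n for which some x in Z_n satisfies a*x = 1 (mod n)."""
    return [a for a in range(n) if any((a * x) % n == 1 for x in range(n))]


def is_field(n: int) -> bool:
    """For n >= 2, Z_n is a field exactly when every nonzero element is invertible."""
    return invertible_elements(n) == list(range(1, n))


print(invertible_elements(5))    # prints: [1, 2, 3, 4]
print(invertible_elements(6))    # prints: [1, 5]
print(is_field(5), is_field(6))  # prints: True False
```

Note that 0 never appears in the output, in agreement with the remark that zero is not invertible in a nontrivial ring.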
In elementary mathematics we learned the commandment that if the product of two numbers is equal to zero, say
ab = 0
then one of the two factors, either a or b (or both) must be equal to zero.
This is certainly true if the numbers are real (or even complex) numbers, but the rule is not inviolable in every ring. For example, in Z6,
2 • 3 = 0
even though the factors 2 and 3 are both nonzero. Such numbers, when they exist, are called divisors of zero.
In any ring, a nonzero element a is called a divisor of zero if there is a nonzero element b in the ring such that the product ab or ba is equal to zero.
(Note carefully that both factors have to be nonzero.) Thus, 2 and 3 are divisors of zero in Z6; 4 is also a divisor of zero in Z6, because 4 • 3 = 0. For another example, let M2(R) designate the set of all 2 × 2 matrices of real numbers, with addition and multiplication of matrices as described on page 8. The simple task of checking that M2(R) satisfies the ring axioms is assigned as Exercise C1 at the end of this chapter. M2(R) is rampant with examples of divisors of zero. For instance,
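\begin{pmatrix} 0 & 1 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}
hence
\begin{pmatrix} 0 & 1 \\ 0 & 1 \end{pmatrix} and \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix}
are both divisors of zero in M2(R).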
Of course, there are rings which have no divisors of zero at all! For example, Z, Q, R, and C do not have any divisors of zero. It is important to note carefully what it means for a ring to have no divisors of zero: it means that if the product of two elements in the ring is equal to zero, at least one of the factors is zero. (Our commandment from elementary mathematics!)
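To make the notion concrete, divisors of zero in Z_n can be listed by exhaustive search, and a product of two nonzero 2 × 2 matrices that equals the zero matrix can be multiplied out directly. This is a minimal sketch; zero_divisors and mat_mul are illustrative helper names, and matrices are represented as nested lists.

```python
def zero_divisors(n: int) -> list[int]:
    """Nonzero a in Z_n for which some nonzero b gives a*b = 0 (mod n)."""
    return [a for a in range(1, n)
            if any((a * b) % n == 0 for b in range(1, n))]


def mat_mul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


print(zero_divisors(6))  # prints: [2, 3, 4]
print(zero_divisors(5))  # prints: []

# Two nonzero matrices whose product is the zero matrix.
print(mat_mul([[0, 1], [0, 1]], [[1, 1], [0, 0]]))  # prints: [[0, 0], [0, 0]]
```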
It is also decreed in elementary algebra that a nonzero number a may be canceled in the equation ax = ay to yield x = y. While undeniably true in the number systems of mathematics, this rule is not true in every ring. For example, in Z6,
2 • 5 = 2 • 2
yet we cannot cancel the common factor 2. A similar example involving 2 × 2 matrices may be seen on page 9. When cancellation is possible, we say the ring has the "cancellation property."
A ring is said to have the cancellation property if
ab = ac or ba = ca implies b = c
for any elements a, b, and c in the ring if a ≠ 0.
There is a surprising and unexpected connection between the cancellation property and divisors of zero:
Theorem 2 A ring has the cancellation property iff it has no divisors of zero.
Proof: The proof is very straightforward. Let A be a ring, and suppose first that A has the cancellation property. To prove that A has no divisors of zero we begin by letting ab = 0, and show that a or b is equal to 0. If a = 0, we are done. Otherwise, we have
ab = 0 = a0
so by the cancellation property (cancelling a), b = 0.
Conversely, assume A has no divisors of zero. To prove that A has the cancellation property, suppose ab = ac where a ≠ 0. Then
ab - ac = a(b - c) = 0
Remember, there are no divisors of zero! Since a ≠ 0, necessarily b - c = 0, so b = c. ■
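Theorem 2 can be illustrated, though of course not proved, by testing both properties in the rings Z_n for small n. The sketch below assumes we restrict attention to Z_n, which is commutative, so checking ab = ac covers ba = ca as well; the function names are illustrative.

```python
def has_zero_divisors(n: int) -> bool:
    """True if some pair of nonzero elements of Z_n has product 0."""
    return any((a * b) % n == 0 for a in range(1, n) for b in range(1, n))


def has_cancellation(n: int) -> bool:
    """True if ab = ac with a != 0 always forces b = c in Z_n."""
    return all(b == c
               for a in range(1, n)
               for b in range(n)
               for c in range(n)
               if (a * b) % n == (a * c) % n)


# The two properties agree for every n tested, as Theorem 2 predicts.
for n in range(2, 13):
    assert has_cancellation(n) == (not has_zero_divisors(n))

print([n for n in range(2, 13) if has_cancellation(n)])  # prints: [2, 3, 5, 7, 11]
```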
An integral domain is defined to be a commutative ring with unity having the cancellation property. By Theorem 2, an integral domain may also be defined as a commutative ring with unity having no divisors of zero. It is easy to see that every field is an integral domain. The converse, however, is not true: for example, Z is an integral domain but not a field. We will have a lot to say about integral domains in the following chapters.
EXERCISES
A. Examples of Rings
In each of the following, a set A with operations of addition and multiplication is given. Prove that A satisfies all the axioms to be a commutative ring with unity. Indicate the zero element, the unity, and the negative of an arbitrary a.
1 A is the set Z of the integers, with the following "addition" ⊕ and "multiplication" ⊙:
a ⊕ b = a + b - 1
a ⊙ b = ab - (a + b) + 2
2 A is the set Q of the rational numbers, and the operations are ⊕ and ⊙ defined as follows:
a ⊕ b = a + b + 1
a ⊙ b = ab + a + b
# 3 A is the set Q × Q of ordered pairs of rational numbers, and the operations are the following addition ⊕ and multiplication ⊙:
(a, b) ⊕ (c, d) = (a + c, b + d)
(a, b) ⊙ (c, d) = (ac - bd, ad + bc)
4 A = {x + y√2 : x, y ∈ Z} with conventional addition and multiplication.
5 Prove that the ring in part 1 is an integral domain.
6 Prove that the ring in part 2 is a field, and indicate the multiplicative inverse of an arbitrary nonzero element.
7 Do the same for the ring in part 3.
B. Ring of Real Functions
1 Verify that ℱ(R) satisfies all the axioms for being a commutative ring with unity. Indicate the zero and unity, and describe the negative of any f.
2 Describe the divisors of zero in ℱ(R).
3 Describe the invertible elements in ℱ(R).
4 Explain why ℱ(R) is neither a field nor an integral domain.
C. Ring of 2 x 2 Matrices
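Let M2(R) designate the set of all 2 × 2 matrices
\begin{pmatrix} a & b \\ c & d \end{pmatrix}
whose entries are real numbers a, b, c, and d, with the following addition and multiplication:
\begin{pmatrix} a & b \\ c & d \end{pmatrix} + \begin{pmatrix} r & s \\ t & u \end{pmatrix} = \begin{pmatrix} a + r & b + s \\ c + t & d + u \end{pmatrix}
and
\begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} r & s \\ t & u \end{pmatrix} = \begin{pmatrix} ar + bt & as + bu \\ cr + dt & cs + du \end{pmatrix}
1 Verify that M2(R) satisfies the ring axioms.
2 Show that M2(R) is not commutative and has a unity.
3 Explain why M2(R) is not an integral domain or a field.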
D. Rings of Subsets of a Set
If D is a set, then the power set of D is the set P_D of all the subsets of D. Addition and multiplication are defined as follows: If A and B are elements of P_D (that is, subsets of D), then
A + B = (A - B) ∪ (B - A) and AB = A ∩ B
It was shown in Chapter 3, Exercise C, that P_D with addition alone is an abelian group.
# 1 Prove: P_D is a commutative ring with unity. (You may assume ∩ is associative; for the distributive law, use the same diagram and approach as was used to prove that addition is associative in Chapter 3, Exercise C.)
2 Describe the divisors of zero in P_D.
3 Describe the invertible elements in P_D.
4 Explain why P_D is neither a field nor an integral domain. (Assume D has more than one element.)
5 Give the tables of P3, that is, P_D where D = {a, b, c}.
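Before working these parts by hand, it can help to experiment. The following Python sketch models P_D for D = {a, b, c} with frozensets; it is an exploratory aid under that assumption, not a proof, and the names P, add, and mul are illustrative. It confirms the distributive law by checking all 512 triples.

```python
from itertools import combinations

D = frozenset({"a", "b", "c"})
# All 8 subsets of D, that is, the elements of P_D.
P = [frozenset(s) for r in range(len(D) + 1) for s in combinations(sorted(D), r)]


def add(A, B):
    """Ring addition on P_D: (A - B) union (B - A)."""
    return (A - B) | (B - A)


def mul(A, B):
    """Ring multiplication on P_D: A intersect B."""
    return A & B


# Check A(B + C) = AB + AC for every triple of subsets.
distributive = all(mul(A, add(B, C)) == add(mul(A, B), mul(A, C))
                   for A in P for B in P for C in P)
# The empty set acts as zero and D acts as unity.
empty_is_zero = all(add(A, frozenset()) == A for A in P)
D_is_unity = all(mul(A, D) == A for A in P)

print(len(P), distributive, empty_is_zero, D_is_unity)  # prints: 8 True True True
```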
E. Ring of Quaternions
A quaternion (in matrix form) is a 2 x 2 matrix of complex numbers of the form
α = \begin{pmatrix} a + bi & c + di \\ -c + di & a - bi \end{pmatrix}
1 Prove that the set of all the quaternions, with the matrix addition and multiplication explained on pages 7 and 8, is a ring with unity. This ring is denoted by the symbol 𝒬. Find an example to show that 𝒬 is not commutative. (You may assume matrix addition and multiplication are associative and obey the distributive law.)
2 Let
𝟏 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, 𝐢 = \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}, 𝐣 = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, 𝐤 = \begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix}
Show that the quaternion α, defined previously, may be written in the form
α = a𝟏 + b𝐢 + c𝐣 + d𝐤
(This is the standard notation for quaternions.)
# 3 Prove the following formulas:
𝐢^2 = 𝐣^2 = 𝐤^2 = -𝟏    𝐢𝐣 = -𝐣𝐢 = 𝐤    𝐣𝐤 = -𝐤𝐣 = 𝐢    𝐤𝐢 = -𝐢𝐤 = 𝐣
4 The conjugate of α is
ᾱ = \begin{pmatrix} a - bi & -c - di \\ c - di & a + bi \end{pmatrix}
The norm of α is a^2 + b^2 + c^2 + d^2, and is written ||α||. Show directly (by matrix multiplication) that
ᾱα = αᾱ = \begin{pmatrix} t & 0 \\ 0 & t \end{pmatrix} where t = ||α||
Conclude that the multiplicative inverse of α is (1/t)ᾱ.
5 A skew field is a (not necessarily commutative) ring with unity in which every nonzero element has a multiplicative inverse. Conclude from parts 1 and 4 that 𝒬 is a skew field.
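The quaternion identities can be explored numerically with Python's built-in complex numbers. The sketch below assumes the usual matrix forms of the four quaternion units (identity, diag(i, -i), and the two off-diagonal matrices); the variable names ONE, I, J, K and the sample values a, b, c, d = 1, 2, 3, 4 are illustrative. It is a numerical spot check, not a substitute for the proofs requested above.

```python
def mat_mul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[r][m] * Y[m][c] for m in range(2)) for c in range(2)]
            for r in range(2)]


def neg(X):
    """Negative of a 2x2 matrix."""
    return [[-x for x in row] for row in X]


# Assumed matrix forms of the quaternion units.
ONE = [[1, 0], [0, 1]]
I = [[1j, 0], [0, -1j]]
J = [[0, 1], [-1, 0]]
K = [[0, 1j], [1j, 0]]

print(mat_mul(I, I) == neg(ONE), mat_mul(J, J) == neg(ONE), mat_mul(K, K) == neg(ONE))
print(mat_mul(I, J) == K, mat_mul(J, I) == neg(K))   # ij = k, ji = -k
print(mat_mul(J, K) == I, mat_mul(K, I) == J)        # jk = i, ki = j

# Norm check for the sample quaternion with a, b, c, d = 1, 2, 3, 4.
alpha = [[1 + 2j, 3 + 4j], [-3 + 4j, 1 - 2j]]
alpha_bar = [[1 - 2j, -3 - 4j], [3 - 4j, 1 + 2j]]
print(mat_mul(alpha, alpha_bar) == [[30, 0], [0, 30]])  # t = 1 + 4 + 9 + 16 = 30
```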
F. Ring of Endomorphisms
Let G be an abelian group in additive notation. An endomorphism of G is a homomorphism from G to G. Let End(G) denote the set of all the endomorphisms of G, and define addition and multiplication of endomorphisms as follows:
[f + g](x) = f(x) + g(x) for every x in G
fg = f ∘ g the composite of f and g
1 Prove that End(G) with these operations is a ring with unity.
2 List the elements of End(Z4), then give the addition and multiplication tables for End(Z4).
Remark: The endomorphisms of Z4 are easy to find. Any endomorphism of Z4 will carry 1 to either 0, 1, 2, or 3. For example, take the last case: if
1 → 3
then necessarily
1 + 1 → 3 + 3 = 2, 1 + 1 + 1 → 3 + 3 + 3 = 1, and 0 → 0
hence f is completely determined by the fact that
1 → 3
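The remark can be turned into a short computation: an endomorphism of Z4 that carries 1 to c must carry x to cx modulo 4. The Python sketch below stores each endomorphism as the tuple of its values; the names endo, add, and compose are illustrative, and the code is meant as an aid for exploring part 2, not as its solution.

```python
N = 4


def endo(c):
    """Endomorphism of Z_4 carrying 1 to c, stored as the tuple (f(0), f(1), f(2), f(3))."""
    return tuple((c * x) % N for x in range(N))


def is_endomorphism(f):
    """Check f(x + y) = f(x) + f(y) for all x, y in Z_4."""
    return all(f[(x + y) % N] == (f[x] + f[y]) % N
               for x in range(N) for y in range(N))


def add(f, g):
    """Pointwise sum of two endomorphisms."""
    return tuple((f[x] + g[x]) % N for x in range(N))


def compose(f, g):
    """Composite f o g: apply g first, then f."""
    return tuple(f[g[x]] for x in range(N))


endos = [endo(c) for c in range(N)]
print(endo(3))                                  # prints: (0, 3, 2, 1)
print(all(is_endomorphism(f) for f in endos))   # prints: True
print(compose(endo(3), endo(3)) == endo(1))     # prints: True
print(add(endo(3), endo(2)) == endo(1))         # prints: True
```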
G. Direct Product of Rings
If A and B are rings, their direct product is a new ring, denoted by A × B, and defined as follows: A × B consists of all the ordered pairs (x, y) where x is in A and y is in B. Addition in A × B consists of adding corresponding components:
(x_1, y_1) + (x_2, y_2) = (x_1 + x_2, y_1 + y_2)
Multiplication in A × B consists of multiplying corresponding components:
(x_1, y_1) • (x_2, y_2) = (x_1 x_2, y_1 y_2)
1 If A and B are rings, verify that A × B is a ring.
2 If A and B are commutative, show that A × B is commutative. If A and B each has a unity, show that A × B has a unity.
3 Describe carefully the divisors of zero in A × B.
# 4 Describe the invertible elements in A × B.
5 Explain why A × B can never be an integral domain or a field. (Assume A and B each have more than one element.)
H. Elementary Properties of Rings
Prove parts 1-4:
1 In any ring, a(b - c) = ab - ac and (b - c)a = ba - ca.
2 In any ring, if ab = -ba, then (a + b)^2 = (a - b)^2 = a^2 + b^2.
3 In any integral domain, if a^2 = b^2, then a = ±b.
4 In any integral domain, only 1 and -1 are their own multiplicative inverses. (Note that x = x^{-1} iff x^2 = 1.)
5 Show that the commutative law for addition need not be assumed in defining a ring with unity: it may be proved from the other axioms. [Hint: Use the distributive law to expand (a + b)(1 + 1) in two different ways.]
# 6 Let A be any ring. Prove that if the additive group of A is cyclic, then A is a commutative ring.
7 Prove: In any integral domain, if a^n = 0 for some integer n, then a = 0.
I. Properties of Invertible Elements
Prove that parts 1-5 are true in a nontrivial ring with unity.
1 If a is invertible and ab = ac, then b = c.
2 An element a can have no more than one multiplicative inverse.
3 If a^2 = 0 then a + 1 and a - 1 are invertible.
4 If a and b are invertible, their product ab is invertible.
5 The set S of all the invertible elements in a ring is a multiplicative group.
6 By part 5, the set of all the nonzero elements in a field is a multiplicative group. Now use Lagrange's theorem to prove that in a finite field with m elements, x^{m-1} = 1 for every x ≠ 0.
7 If ax = 1, x is a right inverse of a; if ya = 1, y is a left inverse of a. Prove that if a has a right inverse x and a left inverse y, then a is invertible, and its inverse is equal to x and to y. (First show that yaxa = 1.)
8 Prove: In a commutative ring, if ab is invertible, then a and b are both invertible.
J. Properties of Divisors of Zero
Prove that each of the following is true in a nontrivial ring.
1 If a ≠ ±1 and a^2 = 1, then a + 1 and a - 1 are divisors of zero.
# 2 If ab is a divisor of zero, then a or b is a divisor of zero.
3 In a commutative ring with unity, a divisor of zero cannot be invertible.
4 Suppose ab ≠ 0 in a commutative ring. If either a or b is a divisor of zero, so is ab.
5 Suppose a is neither 0 nor a divisor of zero. If ab = ac, then b = c.
6 A × B always has divisors of zero.
K. Boolean Rings
A ring A is a boolean ring if a^2 = a for every a ∈ A. Prove that parts 1 and 2 are true in any boolean ring A.
1 For every a ∈ A, a = -a. [Hint: Expand (a + a)^2.]
2 Use part 1 to prove that A is a commutative ring. [Hint: Expand (a + b)^2.]
In parts 3 and 4, assume A has a unity and prove:
3 Every element except 0 and 1 is a divisor of zero. [Consider x(x - 1).]
4 1 is the only invertible element in A.
5 Letting a ∨ b = a + b + ab we have the following in A:
a ∨ bc = (a ∨ b)(a ∨ c)    a ∨ (1 + a) = 1    a ∨ a = a    a(a ∨ b) = a
L. The Binomial Formula
An important formula in elementary algebra is the binomial expansion formula for an expression (a + b)^n. The formula is as follows:
(a + b)^n = \sum_{k=0}^{n} \binom{n}{k} a^{n-k} b^k
where the binomial coefficient
\binom{n}{k} = \frac{n(n-1)(n-2) \cdots (n-k+1)}{k!}
This theorem is true in every commutative ring. (If k is any positive integer and a is an element of a ring, ka refers to the sum a + a + ··· + a with k terms, as in elementary algebra.) The proof of the binomial theorem in a commutative ring is no different from the proof in elementary algebra. We shall review it here.
The proof of the binomial formula is by induction on the exponent n. The formula is trivially true for n = 1. In the induction step, we assume the expansion for (a + b)^n is as above, and we must prove that
(a + b)^{n+1} = \sum_{k=0}^{n+1} \binom{n+1}{k} a^{n+1-k} b^k
Now,
(a + b)^{n+1} = (a + b)(a + b)^n
= (a + b) \sum_{k=0}^{n} \binom{n}{k} a^{n-k} b^k
Collecting terms, we find that the coefficient of a^{n+1-k} b^k is
\binom{n}{k} + \binom{n}{k-1}
By direct computation, show that
\binom{n}{k} + \binom{n}{k-1} = \binom{n+1}{k}
It will follow that (a + b)^{n+1} is as claimed, and the proof is complete.
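Both the identity for binomial coefficients and the binomial formula itself can be spot-checked numerically. The sketch below uses math.comb from the Python standard library and tests the formula in the commutative ring Z_m; the function names and the particular values of m and n are illustrative, and a numerical check of this kind does not replace the direct computation asked for above.

```python
from math import comb


def pascal_rule_holds(n: int) -> bool:
    """Check C(n, k) + C(n, k-1) = C(n+1, k) for k = 1, ..., n."""
    return all(comb(n, k) + comb(n, k - 1) == comb(n + 1, k)
               for k in range(1, n + 1))


def binomial_holds_mod(m: int, n: int) -> bool:
    """Check (a + b)^n = sum of C(n, k) a^(n-k) b^k for all a, b in Z_m."""
    return all(
        pow(a + b, n, m)
        == sum(comb(n, k) * a ** (n - k) * b ** k for k in range(n + 1)) % m
        for a in range(m)
        for b in range(m)
    )


print(all(pascal_rule_holds(n) for n in range(1, 20)))  # prints: True
print(binomial_holds_mod(6, 5))                         # prints: True
```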
M. Nilpotent and Unipotent Elements
An element a of a ring is nilpotent if a^n = 0 for some positive integer n.
1 In a ring with unity, prove that if a is nilpotent, then a + 1 and a - 1 are both invertible. [Hint: Use the factorization
1 - a^n = (1 - a)(1 + a + a^2 + ··· + a^{n-1})
for 1 - a, and a similar formula for 1 + a.]
2 In a commutative ring, prove that any product xa of a nilpotent element a by any element x is nilpotent.
# 3 In a commutative ring, prove that the sum of two nilpotent elements is nilpotent. (Hint: You must use the binomial formula; see Exercise L.)
An element a of a ring is unipotent iff 1 - a is nilpotent.
4 In a commutative ring, prove that the product of two unipotent elements a and b is unipotent. [Hint: Use the binomial formula to expand 1 - ab = (1 - a) + a(1 - b) to power n + m.]
5 In a ring with unity, prove that every unipotent element is invertible. (Hint: Use Part 1.)
CHAPTER
EIGHTEEN
IDEALS AND HOMOMORPHISMS
We have already seen several examples of smaller rings contained within larger rings. For example, Z is a ring inside the larger ring Q, and Q itself is a ring inside the larger ring R. When a ring B is part of a larger ring A, we call B a subring of A. The notion of subring is the precise analog for rings of the notion of subgroup for groups. Here are the relevant definitions:
Let A be a ring, and B a nonempty subset of A. If the sum of any two elements of B is again in B, then B is closed with respect to addition. If the negative of every element of B is in B, then B is closed with respect to negatives. Finally, if the product of any two elements of B is again in B, then B is closed with respect to multiplication. B is called a subring of A if B is closed with respect to addition, multiplication, and negatives. Why is B then called a subring of A? Quite elementary:
If a nonempty subset B ⊆ A is closed with respect to addition, multiplication, and negatives, then B with the operations of A is a ring.
This fact is easy to check: If a, b, and c are any three elements of B, then a, b, and c are also elements of A because B ⊆ A. [...] A homomorphism from a ring A to a ring B is a function f : A → B satisfying the identities
f(x_1 + x_2) = f(x_1) + f(x_2)
and
f(x_1 x_2) = f(x_1) f(x_2)
There is a longer but more informative way of writing these two identities:
1. If f(x_1) = y_1 and f(x_2) = y_2, then f(x_1 + x_2) = y_1 + y_2.
2. If f(x_1) = y_1 and f(x_2) = y_2, then f(x_1 x_2) = y_1 y_2.
In other words, if f happens to carry x_1 to y_1 and x_2 to y_2, then, necessarily, it must carry x_1 + x_2 to y_1 + y_2 and x_1 x_2 to y_1 y_2. Symbolically,
if x_1 → y_1 and x_2 → y_2, then necessarily
x_1 + x_2 → y_1 + y_2
and
x_1 x_2 → y_1 y_2
One can easily confirm for oneself that a function / with this property will transform the addition and multiplication tables of its domain into the addition and multiplication tables of its range. (We may imagine infinite rings to have "nonterminating" tables.) Thus, a homomorphism from a ring A onto a ring B is a function which transforms A into B. For example, the ring Z6 is transformed into the ring Z3 by
f = \begin{pmatrix} 0 & 1 & 2 & 3 & 4 & 5 \\ 0 & 1 & 2 & 0 & 1 & 2 \end{pmatrix}
as we may verify by comparing their tables. The addition tables are
compared on page 136, and we may do the same with their multiplication tables:
Multiplication table of Z6:
• | 0 1 2 3 4 5
0 | 0 0 0 0 0 0
1 | 0 1 2 3 4 5
2 | 0 2 4 0 2 4
3 | 0 3 0 3 0 3
4 | 0 4 2 0 4 2
5 | 0 5 4 3 2 1
Replace x by f(x):
• | 0 1 2 0 1 2
0 | 0 0 0 0 0 0
1 | 0 1 2 0 1 2
2 | 0 2 1 0 2 1
0 | 0 0 0 0 0 0
1 | 0 1 2 0 1 2
2 | 0 2 1 0 2 1
Eliminate duplicate information to obtain the multiplication table of Z3:
• | 0 1 2
0 | 0 0 0
1 | 0 1 2
2 | 0 2 1
(For example, 2 • 2 = 1 appears four separate times in the second table above.)
If there is a homomorphism from A onto B, we call B a homomorphic image of A. If f is a homomorphism from a ring A to a ring B, not necessarily onto, the range of f is a subring of B. (This fact is routine to verify.) Thus, the range of a ring homomorphism is always a ring. And obviously, the range of a homomorphism is always a homomorphic image of its domain.
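The map from Z6 onto Z3 described above can be checked mechanically instead of by comparing tables. The sketch below assumes the map is f(x) = x mod 3, which matches the relabeling 0, 1, 2, 0, 1, 2 of the elements of Z6; the variable names are illustrative.

```python
def f(x: int) -> int:
    """Assumed map from Z_6 to Z_3: reduce modulo 3."""
    return x % 3


# f(x + y) = f(x) + f(y), with sums taken in Z_6 on the left and Z_3 on the right.
preserves_sums = all(f((x + y) % 6) == (f(x) + f(y)) % 3
                     for x in range(6) for y in range(6))
# f(xy) = f(x) f(y), with products taken in Z_6 on the left and Z_3 on the right.
preserves_products = all(f((x * y) % 6) == (f(x) * f(y)) % 3
                         for x in range(6) for y in range(6))
# The range of f is all of Z_3, so f is onto.
image = sorted({f(x) for x in range(6)})

print(preserves_sums, preserves_products, image)  # prints: True True [0, 1, 2]
```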
Intuitively, if B is a homomorphic image of A, this means that certain features of A are faithfully preserved in B while others are deliberately lost. This may be illustrated by developing further an example described in Chapter 14. The parity ring P consists of two elements, e and o, with addition and multiplication given by the tables
+ | e o         • | e o
e | e o   and   e | e e
o | o e         o | e o
We should think of e as "even" and o as "odd," and the tables as describing the rules for adding and multiplying odd and even integers. For example, even + odd = odd, even times odd = even, and so on.
The function f : Z → P which carries every even integer to e and every odd integer to o is easily seen to be a homomorphism from Z to P; this is made clear on page 137. Thus, P is a homomorphic image of Z. Although the ring P is very much smaller than the ring Z, and therefore few of the features of Z can be expected to reappear in P, nevertheless one aspect of the structure of Z is retained absolutely intact in P, namely, the structure of odd and even numbers. As we pass from Z to P, the parity of the integers (their being even or odd), with its arithmetic, is faithfully preserved while all else is lost. Other examples will be given in the exercises.
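The parity homomorphism lends itself to a quick experiment. In the sketch below the two tables of P are stored as dictionaries and the homomorphism property is tested on a finite sample of integers; the names ADD, MUL, and parity, and the sample range, are illustrative choices, and a finite sample only illustrates the claim.

```python
# Addition and multiplication tables of the parity ring P.
ADD = {("e", "e"): "e", ("e", "o"): "o", ("o", "e"): "o", ("o", "o"): "e"}
MUL = {("e", "e"): "e", ("e", "o"): "e", ("o", "e"): "e", ("o", "o"): "o"}


def parity(n: int) -> str:
    """Carry every even integer to 'e' and every odd integer to 'o'."""
    return "e" if n % 2 == 0 else "o"


sample = range(-10, 11)
is_homomorphism = all(
    parity(x + y) == ADD[parity(x), parity(y)]
    and parity(x * y) == MUL[parity(x), parity(y)]
    for x in sample
    for y in sample
)
print(is_homomorphism)  # prints: True
```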
If f is a homomorphism from a ring A to a ring B, the kernel of f is the set of all the elements of A which are carried by f onto the zero element of B. In symbols, the kernel of f is the set
K = {x ∈ A : f(x) = 0}
It is a very important fact that the kernel of f is an ideal of A. (The simple verification of this fact is left as an exercise.)
If A and B are rings, an isomorphism from A to B is a homomorphism which is a one-to-one correspondence from A to B. In other words, it is an injective and surjective homomorphism. If there is an isomorphism from A to B we say that A is isomorphic to B, and this fact is expressed by writing
A ≅ B
EXERCISES
A. Examples of Subrings
Prove that each of the following is a subring of the indicated ring:
1 {x + √3 y : x, y ∈ Z} is a subring of R.
2 {x + 2^{1/3} y + 2^{2/3} z : x, y, z ∈ Z} is a subring of R.
3 {x 2^y : x, y ∈ Z} is a subring of R.
# 4 Let 𝒞(R) be the set of all the functions from R to R which are continuous on (-∞, ∞), and let 𝒟(R) be the set of all the functions from R to R which are differentiable on (-∞, ∞). Then [...]
1 φ : ℱ(R) → R given by φ(f) = f(0).
2 h : R × R → R given by h(x, y) = x.
3 h : R → M2(R) given by h(x) = [...]
4 h : R × R → M2(R) given by
h(x, y) = \begin{pmatrix} x & 0 \\ 0 & y \end{pmatrix}
# 5 Let A be the set R × R with the usual addition and the following "multiplication":
(a, b) ⊙ (c, d) = (ac, bc)
Granting that A is a ring, let f : A → M2(R) be given by
f(x, y) = \begin{pmatrix} x & 0 \\ y & 0 \end{pmatrix}
6 h : P_C → P_C given by h(A) = A ∩ D, where D is a fixed subset of C.
7 List all the homomorphisms from Z2 to Z4; from Z3 to Z6.
F. Elementary Properties of Homomorphisms
Let A and B be rings, and f : A → B a homomorphism. Prove each of the following:
1 f(A) = {f(x) : x ∈ A} is a subring of B.
2 The kernel of f is an ideal of A.
3 f(0) = 0, and for every a ∈ A, f(-a) = -f(a).
4 f is injective iff its kernel is equal to {0}.
5 If B is an integral domain, then either f(1) = 1 or f(1) = 0. If f(1) = 0, then f(x) = 0 for every x ∈ A. If f(1) = 1, the image of every invertible element of A is an invertible element of B.
6 Any homomorphic image of a commutative ring is a commutative ring. Any homomorphic image of a field is a field.
7 If the domain A of the homomorphism / is a field, and if the range of / has more than one element, then / is injective. (Hint: Use Exercise D6.)
G. Examples of Isomorphisms
1 Let A be the ring of Exercise A2 in Chapter 17. Show that the function f(x) = x - 1 is an isomorphism from Q to A; hence Q ≅ A.
2 Let 𝒮 be the following subset of M2(R):
𝒮 = { \begin{pmatrix} a & b \\ -b & a \end{pmatrix} : a, b ∈ R }
Prove that the function
f(a + bi) = \begin{pmatrix} a & b \\ -b & a \end{pmatrix}
is an isomorphism from C to 𝒮. [Remark: You must begin by checking that f is a well-defined function; that is, if a + bi = c + di, then f(a + bi) = f(c + di). To do this, note that if a + bi = c + di then a - c = (d - b)i; this last equation is impossible unless both sides are equal to zero, for otherwise it would assert that a given real number is equal to an imaginary number.]
3 Prove that {(x, x) : x ∈ Z} is a subring of Z × Z, and show {(x, x) : x ∈ Z} ≅ Z.
4 Show that the set of all 2 × 2 matrices of the form
\begin{pmatrix} 0 & 0 \\ 0 & x \end{pmatrix}
is a subring of M2(R), then prove this subring is isomorphic to R.
For any integer k, let kZ designate the subring of Z which consists of all the multiples of k.
5 Prove that Z ≇ 2Z; then prove that 2Z ≇ 3Z. Finally, explain why if k ≠ l, then kZ ≇ lZ. (Remember: How do you show that two rings, or groups, are not isomorphic?)
H. Further Properties of Ideals
Let A be a ring, and let J and K be ideals of A.
Prove parts 1-4. (In parts 2-4, assume A is a commutative ring.)
1 If J ∩ K = {0}, then jk = 0 for every j ∈ J and k ∈ K.
2 For any a ∈ A, I_a = {ax + j + k : x ∈ A, j ∈ J, k ∈ K} is an ideal of A.
# 3 The radical of J is the set rad J = {a ∈ A : a^n ∈ J for some n ∈ Z}. For any ideal J, rad J is an ideal of A.
4 For any a [...]
5 If φ : A → 𝒜 is given by φ(a) = π_a, then φ is a homomorphism.
6 If A has a unity, then φ is an isomorphism. Similarly, if A has no divisors of zero then φ is an isomorphism.
CHAPTER
NINETEEN
QUOTIENT RINGS
We continue our journey into the elementary theory of rings, traveling a road which runs parallel to the familiar landscape of groups. In our study of groups we discovered a way of actually constructing all the homomorphic images of any group G. We constructed quotient groups of G, and showed that every quotient group of G is a homomorphic image of G. We will now imitate this procedure and construct quotient rings. We begin by defining cosets of rings:
Let A be a ring, and J an ideal of A. For any element a ∈ A, the symbol J + a denotes the set of all sums j + a, as a remains fixed and j ranges over J. That is,
J + a = {j + a : j ∈ J}
J + a is called a coset of J in A.
It is important to note that, if we provisionally ignore multiplication, A with addition alone is an abelian group and J is a subgroup of A. Thus, the cosets we have just defined are (if we ignore multiplication) precisely the cosets of the subgroup J in the group A, with the notation being additive. Consequently, everything we already know about group cosets continues to apply in the present case; only, care must be taken to translate known facts about group cosets into additive notation. For example, Property (1) of Chapter 13, with Theorem 5 of Chapter 15, reads as follows in additive notation:
a ∈ J + b iff J + a = J + b (1)
J + a = J + b iff a - b ∈ J (2)
J + a = J iff a ∈ J (3)
We also know, by the reasoning which leads up to Lagrange's theorem, that the family of all the cosets J + a, as a ranges over A, is a partition of A.
There is a way of adding and multiplying cosets which works as follows:
(J + a) + (J + b) = J + (a + b)
(J + a)(J + b) = J + ab
In other words, the sum of the coset of a and the coset of b is the coset of a + b; the product of the coset of a and the coset of b is the coset of ab.
It is important to know that the sum and product of cosets, defined in this fashion, are determined without ambiguity. Remember that J + a may be the same coset as J + c [by Condition (1) this happens iff c is an element of J + a], and, likewise, J + b may be the same coset as J + d. Therefore, we have the equations
(J + a) + (J + b) = J + (a + b)          (J + a)(J + b) = J + ab
   ||        ||                    and       ||       ||
(J + c) + (J + d) = J + (c + d)          (J + c)(J + d) = J + cd
Obviously we must be absolutely certain that J + (a + b) = J + (c + d) and J + ab = J + cd. The next theorem provides us with this important guarantee.
Theorem 1 Let J be an ideal of A. If J + a = J + c and J + b = J + d, then
(i) J + (a + b) = J + (c + d), and
(ii) J + ab = J + cd.
Proof: We are given that J + a = J + c and J + b = J + d; hence by Condition (2),
a − c ∈ J and b − d ∈ J
Since J is closed with respect to addition, (a − c) + (b − d) = (a + b) − (c + d) is in J. It follows by Condition (2) that J + (a + b) = J + (c + d), which proves (i). On the other hand, since J absorbs products in A,
(a − c)b ∈ J and c(b − d) ∈ J
where (a − c)b = ab − cb and c(b − d) = cb − cd,
and therefore (ab - cb) + (cb - cd) = ab - cd is in J. It follows by Condition (2) that J + ab = J + cd. This proves (ii). ■
Now, think of the set which consists of all the cosets of J in A. This set is conventionally denoted by the symbol A/J. For example, if J + a, J + b, J + c, ... are cosets of J, then
A/J = {J + a, J + b, J + c, ...}
We have just seen that coset addition and multiplication are valid operations on this set. In fact,
Theorem 2 A/J with coset addition and multiplication is a ring.
Proof: Coset addition and multiplication are associative, and multiplication is distributive over addition. (These facts may be routinely checked.) The zero element of A/J is the coset J = J + 0, for if J + a is any coset,
(J + a) + (J + 0) = J + (a + 0) = J + a
Finally, the negative of J + a is J + (−a), because
(J + a) + (J + (−a)) = J + (a + (−a)) = J + 0 ■
The ring A/J is called the quotient ring of A by J.
And now, the crucial connection between quotient rings and homomorphisms:
Theorem 3 A/J is a homomorphic image of A.
Following the plan already laid out for groups, the natural homomorphism from A onto A/J is the function f which carries every element to its own coset, that is, the function f given by
f(x) = J + x
This function is very easily seen to be a homomorphism.
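The verification can be written out in one line per operation, using only the definitions of coset addition and multiplication given above (a worked check, not part of the original text):

```latex
\begin{align*}
f(a + b) &= J + (a + b) = (J + a) + (J + b) = f(a) + f(b),\\
f(ab)    &= J + ab      = (J + a)(J + b)    = f(a)\,f(b).
\end{align*}
```

Moreover, f is onto A/J because every coset J + x is the image f(x) of its own representative x.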
Thus, when we construct quotient rings of A, we are, in fact, constructing homomorphic images of A. The quotient ring construction is
useful because it is a way of actually manufacturing homomorphic images of any ring A.
The quotient ring construction is now illustrated with an important example. Let Z be the ring of the integers, and let ⟨6⟩ be the ideal of Z which consists of all the multiples of the number 6. The elements of the quotient ring Z/⟨6⟩ are all the cosets of the ideal ⟨6⟩, namely:
⟨6⟩ + 0 = {..., −18, −12, −6, 0, 6, 12, 18, ...} = 0̄
⟨6⟩ + 1 = {..., −17, −11, −5, 1, 7, 13, 19, ...} = 1̄
⟨6⟩ + 2 = {..., −16, −10, −4, 2, 8, 14, 20, ...} = 2̄
⟨6⟩ + 3 = {..., −15, −9, −3, 3, 9, 15, 21, ...} = 3̄
⟨6⟩ + 4 = {..., −14, −8, −2, 4, 10, 16, 22, ...} = 4̄
⟨6⟩ + 5 = {..., −13, −7, −1, 5, 11, 17, 23, ...} = 5̄
We will represent these cosets by means of the simplified notation 0̄, 1̄, 2̄, 3̄, 4̄, 5̄. The rules for adding and multiplying cosets give us the following tables (bars omitted in the table entries):
+ | 0 1 2 3 4 5          · | 0 1 2 3 4 5
--+------------          --+------------
0 | 0 1 2 3 4 5          0 | 0 0 0 0 0 0
1 | 1 2 3 4 5 0          1 | 0 1 2 3 4 5
2 | 2 3 4 5 0 1          2 | 0 2 4 0 2 4
3 | 3 4 5 0 1 2          3 | 0 3 0 3 0 3
4 | 4 5 0 1 2 3          4 | 0 4 2 0 4 2
5 | 5 0 1 2 3 4          5 | 0 5 4 3 2 1
One cannot fail to notice the analogy between the quotient ring Z/⟨6⟩ and the ring Z_6. In fact, we will regard them as one and the same. More generally, for every positive integer n, we consider Z_n to be the same as Z/⟨n⟩. In particular, this makes it clear that Z_n is a homomorphic image of Z.
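The coset arithmetic of this example can be sketched in a few lines of Python. This is only an illustration under one assumption: each coset of the ideal of multiples of 6 is represented by its least nonnegative element. The names `coset`, `add`, and `mul` are invented for the sketch.

```python
# Sketch: cosets of the ideal of multiples of 6 in Z, each represented
# by its least nonnegative element. All names here are illustrative.
N = 6

def coset(a, n=N):
    """Representative in {0, ..., n-1} of the coset of a."""
    return a % n

def add(a, b, n=N):
    """Coset of a plus coset of b is the coset of a + b."""
    return coset(a + b, n)

def mul(a, b, n=N):
    """Coset of a times coset of b is the coset of ab."""
    return coset(a * b, n)

# Theorem 1 in action: 2 and 14 name the same coset, as do 5 and -1,
# so sums and products do not depend on the representatives chosen.
assert coset(2) == coset(14) and coset(5) == coset(-1)
assert add(2, 5) == add(14, -1)
assert mul(2, 5) == mul(14, -1)

add_table = [[add(a, b) for b in range(N)] for a in range(N)]
mul_table = [[mul(a, b) for b in range(N)] for a in range(N)]

for row in mul_table:
    print(*row)
```

The printed rows reproduce the multiplication table of the example; `add_table` holds the addition table in the same layout.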
By Theorem 3, any quotient ring A/J is a homomorphic image of A. Therefore the quotient ring construction is a way of actually producing homomorphic images of any ring A. In fact, as we will now see, it is a way of producing all the homomorphic images of A.
Theorem 4 Let f: A → B be a homomorphism from a ring A onto a ring B, and let K be the kernel of f. Then B ≅ A/K.
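Proof: To show that A/K is isomorphic with B, we must look for an isomorphism from A/K to B. Mimicking the procedure which worked successfully for groups, we let φ be the function from A/K to B which matches each coset K + x with the element f(x); that is, φ(K + x) = f(x). As in the case of groups, φ is a well-defined, bijective function from A/K to B. Finally,
φ((K + a) + (K + b)) = φ(K + (a + b)) = f(a + b) = f(a) + f(b) = φ(K + a) + φ(K + b)
and
φ((K + a)(K + b)) = φ(K + ab) = f(ab) = f(a)f(b) = φ(K + a)φ(K + b)
Thus, φ is an isomorphism from A/K onto B.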
/<2>.
1 Z_5 and Z_20/⟨5⟩.
2 Z_3 and Z_6/⟨3⟩.
3 P_2 and P_3/K, where K = {∅, {c}}. [Hint: See Chapter 18, Exercise E6. Consider the function f(X) = X ∩ {a, b}.]
4 Z_2 and (Z_2 × Z_2)/K, where K = {(0, 0), (0, 1)}.
C. Quotient Rings and Homomorphic Images in ℱ(R)
1 Let φ be the function from ℱ(R) to R × R defined by φ(f) = (f(0), f(1)). Prove that φ is a homomorphism from ℱ(R) onto R × R, and describe its kernel.
2 Let J be the subset of ℱ(R) consisting of all f whose graph passes through the points (0, 0) and (1, 0). Referring to part 1, explain why J is an ideal of ℱ(R), and ℱ(R)/J ≅ R × R.
3 Let φ be the function from ℱ(R) to ℱ(Q, R) defined as follows:
φ(f) = f_Q = the restriction of f to Q
(Note: The domain of f_Q is Q, and on this domain f_Q is the same function as f.) Prove that φ is a homomorphism from ℱ(R) onto ℱ(Q, R), and describe the kernel of φ. [ℱ(Q, R) is the ring of functions from Q to R.]
4 Let J be the subset of ℱ(R) consisting of all f such that f(x) = 0 for every rational x. Referring to part 3, explain why J is an ideal of ℱ(R) and ℱ(R)/J ≅ ℱ(Q, R).
D. Elementary Applications of the Fundamental Homomorphism Theorem
In each of the following let A be a commutative ring. If a ∈ A and n is a positive integer, the notation na will stand for
a + a + ··· + a (n terms)
1 Suppose 2x = 0 for every x ∈ A. Prove that (x + y)² = x² + y² for all x and y in A. Conclude that the function h(x) = x² is a homomorphism from A to A. If J = {x ∈ A : x² = 0} and B = {x² : x ∈ A}, explain why J is an ideal of A, B is a subring of A, and A/J ≅ B.
2 Suppose 6x = 0 for every x ∈ A. Prove that the function h(x) = 3x is a homomorphism from A to A. If J = {x : 3x = 0} and B = {3x : x ∈ A}, explain why J is an ideal of A, B is a subring of A, and A/J ≅ B.
3 If a is an idempotent element of A (that is, a² = a), prove that the function π_a(x) = ax is a homomorphism from A into A. Show that the kernel of π_a is I_a, the annihilator of a (defined in Exercise H4 of Chapter 18). Show that the range of π_a is ⟨a⟩. Conclude by the FHT that A/I_a ≅ ⟨a⟩.
4 For each a ∈ A, let π_a be the function given by π_a(x) = ax. Define the following addition and multiplication on Ā = {π_a : a ∈ A}:
π_a + π_b = π_(a+b)
and
π_a π_b = π_(ab)
(Ā is a ring; however, do not prove this.) Show that the function φ(a) = π_a is a homomorphism from A onto Ā. Let I designate the annihilating ideal of A (defined in Exercise H4 of Chapter 18). Use the FHT to show that A/I ≅ Ā.
E. Properties of Quotient Rings A/J in Relation to Properties of J
Let A be a ring and J an ideal of A. Use Conditions (1), (2), and (3) of this chapter. Prove each of the following:
# 1 Every element of A/J has a square root iff for every x ∈ A, there is some y ∈ A such that x − y² ∈ J.
2 Every element of A/J is its own negative iff x + x ∈ J for every x ∈ A.
3 A/J is a boolean ring iff x² − x ∈ J for every x ∈ A. (A ring S is called a boolean ring iff s² = s for every s ∈ S.)
4 If J is the ideal of all the nilpotent elements of a commutative ring A, then A/J has no nilpotent elements (except zero). (Nilpotent elements are defined in Chapter 17, Exercise M; by M2 and M3 they form an ideal.)
5 Every element of A/J is nilpotent iff J has the following property: for every x ∈ A, there is a positive integer n such that x^n ∈ J.
# 6 A/J has a unity element iff there exists an element a ∈ A such that ax − x ∈ J and xa − x ∈ J for every x ∈ A.
F. Prime and Maximal Ideals
Let A be a commutative ring with unity, and J an ideal of A. Prove each of the following:
1 A/J is a commutative ring with unity.
2 J is a prime ideal iff A/J is an integral domain.
3 Every maximal ideal of A is a prime ideal. (Hint: Use the fact, proved in this chapter, that if J is a maximal ideal then A/J is a field.)
4 If A/J is a field, then J is a maximal ideal. (Hint: See Exercise 12 of Chapter 18.)
G. Further Properties of Quotient Rings in Relation to Their Ideals
Let A be a ring and J an ideal of A. (In parts 1-3 and 5 assume that A is a commutative ring with unity.)
1 Prove that A/J is a field iff for every element a ∈ A, where a ∉ J, there is some b ∈ A such that ab − 1 ∈ J.
2 Prove that every nonzero element of A/J is either invertible or a divisor of zero iff the following property holds, where a, x ∈ A: For every a ∉ J, there is some x ∉ J such that either ax ∈ J or ax − 1 ∈ J.
3 An ideal J of a ring A is called primary iff for all a, b ∈ A, if ab ∈ J, then either a ∈ J or b^n ∈ J for some positive integer n. Prove that every zero divisor in A/J is nilpotent iff J is primary.
4 An ideal J of a ring A is called semiprime iff it has the following property: For every a ∈ A, if a^n ∈ J for some positive integer n, then necessarily a ∈ J. Prove that J is semiprime iff A/J has no nilpotent elements (except zero).
5 Prove that an integral domain can have no nonzero nilpotent elements. Then use part 4, together with Exercise F2, to prove that every prime ideal in a commutative ring is semiprime.
H. Z_n as a Homomorphic Image of Z
Recall that the function
f(a) = ā
is the natural homomorphism from Z onto Z_n. If a polynomial equation p = 0 is satisfied in Z, necessarily f(p) = f(0) is true in Z_n. Let us take a specific example; there are integers x and y satisfying 11x² − 8y² + 29 = 0 (we may take x = 3 and y = 4). It follows that there must be elements x̄ and ȳ in Z_6 which satisfy 1̄1̄x̄² − 8̄ȳ² + 2̄9̄ = 0̄ in Z_6, that is, 5̄x̄² − 2̄ȳ² + 5̄ = 0̄. (We take x̄ = 3̄ and ȳ = 4̄.) The problems which follow are based on this observation.
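The observation can be checked mechanically. The sketch below first confirms the worked example from the text, then applies the contrapositive to a small equation that is not taken from the text (x² + y² − 3 = 0, chosen only for illustration): if an equation has no solution in Z_n, it has no solution in Z. The helper name `has_solution_mod` is invented.

```python
def has_solution_mod(poly, n):
    """True if poly(x, y) is 0 in Z_n for some residues x, y."""
    return any(poly(x, y) % n == 0 for x in range(n) for y in range(n))

# The example from the text: an integer solution, and its image in Z_6.
p = lambda x, y: 11 * x**2 - 8 * y**2 + 29
assert p(3, 4) == 0
assert p(3, 4) % 6 == 0

# Same equation with coefficients reduced mod 6 (11 -> 5, -8 -> -2, 29 -> 5).
p_reduced = lambda x, y: 5 * x**2 - 2 * y**2 + 5
assert p_reduced(3, 4) % 6 == 0

# Contrapositive on an illustrative equation (not from the text):
# squares in Z_4 are 0 or 1, so x^2 + y^2 is never 3 in Z_4.
q = lambda x, y: x**2 + y**2 - 3
no_solution_mod_4 = not has_solution_mod(q, 4)
print(no_solution_mod_4)
```

A search over residues is finite, which is what makes the method effective: one exhaustive check in Z_n rules out infinitely many candidate integer solutions.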
1 Prove that the equation x² − 7y² − 24 = 0 has no integer solutions. (Hint: If there are integers x and y satisfying this equation, what equation will x̄ and ȳ satisfy in Z_7?)
2 Prove that x² + (x + 1)² + (x + 2)² = y² has no integer solutions.
3 Prove that x² + 10y² = n (where n is an integer) has no integer solutions if the last digit of n is 2, 3, 7, or 8.
4 Prove that the sequence 3, 8, 13, 18, 23, . . . does not include the square of any integer. (Hint: The image of each number on this list, under the natural homomorphism from Z to Z5, is 3.)
5 Prove that the sequence 2, 10, 18, 26,. . . does not include the cube of any integer.
6 Prove that the sequence 3, 11, 19, 27, . . .does not include the sum of two squares of integers.
7 Prove that if n is a product of two consecutive integers, its units digit must be 0, 2, or 6.
8 Prove that if n is the product of three consecutive integers, its units digit must be 0, 4, or 6.
CHAPTER TWENTY
INTEGRAL DOMAINS
Let us recall that an integral domain is a commutative ring with unity having the cancellation property, that is,
if a ≠ 0 and ab = ac, then b = c   (1)
At the end of Chapter 17 we saw that an integral domain may also be defined as a commutative ring with unity having no divisors of zero, which is to say that
if ab = 0, then a = 0 or b = 0   (2)
for as we saw, (1) and (2) are equivalent properties in any commutative ring.
The system Z of the integers is the exemplar and prototype of integral domains. In fact, the term "integral domain" means a system of algebra ("domain") having integerlike properties. However, Z is not the only integral domain: there are a great many integral domains different from Z.
Our first few comments will apply to rings generally. To begin with, we introduce a convenient notation for multiples, which parallels the exponent notation for powers. Additively, the sum
a + a + ··· + a
of n equal terms is written as n · a. We also define 0 · a to be 0, and let (−n) · a = −(n · a) for all positive integers n. Then
m · a + n · a = (m + n) · a and m · (n · a) = (mn) · a
for every element a of a ring and all integers m and n. These formulas are the translations into additive notation of the laws of exponents given in Chapter 10.
If A is a ring, A with addition alone is a group. Remember that in additive notation the order of an element a in A is the least positive integer n such that n • a = 0. If there is no such positive integer n, then a is said to have order infinity. To emphasize the fact that we are referring to the order of a in terms of addition, we will call it the additive order of a.
In a ring with unity, if 1 has additive order n, we say the ring has "characteristic n." In other words, if A is a ring with unity,
the characteristic of A is the least positive integer n such that 1 + 1 + ··· + 1 = 0 (with n terms on the left).
If there is no such positive integer n, A has characteristic 0.
These concepts are especially simple in an integral domain. Indeed,
Theorem 1 All the nonzero elements in an integral domain have the same additive order.
Proof: That is, every a ≠ 0 has the same additive order as the additive order of 1. The truth of this statement becomes transparently clear as soon as we observe that
n · a = a + a + ··· + a = 1a + ··· + 1a = (1 + ··· + 1)a = (n · 1)a
hence n · a = 0 iff n · 1 = 0. (Remember that in an integral domain, if the product of two factors is equal to 0, at least one factor must be 0.) ■
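A small computation illustrates Theorem 1 and shows why the hypothesis matters. The sketch assumes the standard facts that Z_7 is an integral domain and that Z_6 is not (in Z_6, 2 · 3 = 0); the function name `additive_order` is invented for the illustration.

```python
def additive_order(a, n):
    """Least positive k such that k·a = 0 in Z_n."""
    k, total = 1, a % n
    while total != 0:
        total = (total + a) % n
        k += 1
    return k

# In the integral domain Z_7 every nonzero element has additive order 7.
orders_z7 = {additive_order(a, 7) for a in range(1, 7)}

# Z_6 has divisors of zero, and the orders of nonzero elements differ.
orders_z6 = {additive_order(a, 6) for a in range(1, 6)}

print(orders_z7, orders_z6)
```

In Z_6 the element 2 has order 3 and the element 3 has order 2, while 1 has order 6, so the conclusion of the theorem fails without the integral domain hypothesis.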
It follows, in particular, that if the characteristic of an integral domain is a positive integer n, then
n · x = 0
for every element x in the domain. Furthermore,
Theorem 2 In an integral domain with nonzero characteristic, the characteristic is a prime number.
Proof: If the characteristic were a composite number mn, then by the distributive law,
(m · 1)(n · 1) = (1 + ··· + 1)(1 + ··· + 1) = 1 + 1 + ··· + 1 = (mn) · 1 = 0
where the two factors in parentheses have m terms and n terms, respectively, and their product has mn terms.
Thus, either m · 1 = 0 or n · 1 = 0, which is impossible because mn was chosen to be the least positive integer such that (mn) · 1 = 0. ■
A very interesting rule of arithmetic is valid in integral domains whose characteristic is not zero.
Theorem 3 In any integral domain of characteristic p, (a + b)^p = a^p + b^p for all elements a and b.
Proof: This formula becomes clear when we look at the binomial expansion of (a + b)^p. Remember that by the binomial formula,
(a + b)^p = a^p + C(p, 1) · a^(p−1) b + ··· + C(p, p−1) · a b^(p−1) + b^p
where the binomial coefficient C(p, k) ("p choose k") is
C(p, k) = p(p − 1)(p − 2) ··· (p − k + 1) / k!
It is demonstrated in Exercise L of Chapter 17 that the binomial formula is correct in every commutative ring.
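As a numerical aside (a sketch, not part of the proof), the binomial coefficients in this expansion can be computed with Python's `math.comb`, and the identity of Theorem 3 can be spot-checked in Z_p. The function names are invented, and Z_4 is included only as a contrast: it is not an integral domain.

```python
from math import comb

def middle_binomials(n):
    """The coefficients C(n, 1), ..., C(n, n-1) of the binomial expansion."""
    return [comb(n, k) for k in range(1, n)]

def power_identity_holds(n):
    """Check (a + b)^n = a^n + b^n for all a, b in Z_n."""
    return all(
        pow(a + b, n, n) == (pow(a, n, n) + pow(b, n, n)) % n
        for a in range(n)
        for b in range(n)
    )

# For the prime 5, every middle coefficient is a multiple of 5.
assert all(c % 5 == 0 for c in middle_binomials(5))

# For the composite 4, C(4, 2) = 6 is not a multiple of 4,
# and the identity fails in Z_4 (take a = b = 1).
assert comb(4, 2) % 4 != 0

print(power_identity_holds(5), power_identity_holds(7), power_identity_holds(4))
```

The check over Z_5 and Z_7 is exhaustive for those rings but is of course no substitute for the general argument.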
Note that if p is a prime number and 0 is an isomorphism from A to A', so A* contains an isomorphic copy A' of A.
EXERCISES
A. Characteristic of an Integral Domain
Let A be an integral domain. Prove each of the following:
1 Let a be any nonzero element of A. If n · a = 0, where n ≠ 0, then n is a multiple of the characteristic of A.
2 If A has characteristic zero, n ≠ 0, and n · a = 0, then a = 0.
3 If A has characteristic 3, and 5 • a = 0, then a = 0.
4 If there is a nonzero element a in A such that 256 ■ a = 0, then A has characteristic 2.
5 If there are distinct nonzero elements a and b in A such that 125 • a = 125 • b, then A has characteristic 5.
6 If there are nonzero elements a and b in A such that (a + b)² = a² + b², then A has characteristic 2.
7 If there are nonzero elements a and b in A such that 10a = 0 and 14b = 0, then A has characteristic 2.
B. Characteristic of a Finite Integral Domain
Let A be a finite integral domain. Prove each of the following:
1 If A has characteristic q, then q is a divisor of the order of A.
2 If the order of A is a prime number p, then the characteristic of A must be equal to p.
3 If the order of A is p^m, where p is a prime, the characteristic of A must be equal to p.
4 If A has 81 elements, its characteristic is 3.
5 If A, with addition alone, is a cyclic group, the order of A is a prime number.
C. Finite Rings
Let A be a finite commutative ring with unity.
1 Prove: Every nonzero element of A is either a divisor of zero or invertible. (Hint: Use an argument analogous to the proof of Theorem 4.)
2 Prove: If a ≠ 0 is not a divisor of zero, then some positive power of a is equal to 1. (Hint: Consider a, a², a³, .... Since A is finite, there must be positive integers n < m such that a^n = a^m.)
3 Use part 2 to prove: If a is invertible, then a⁻¹ is equal to a positive power of a.
D. Field of Quotients of an Integral Domain
The following questions refer to the construction of a field of quotients of A, as outlined on pages 203 to 205.
1 If [a, b] = [r, s] and [c, d] = [t, u], prove that [a, b] + [c, d] = [r, s] + [t, u].
2 If [a, b] = [r, s] and [c, d] = [t, u], prove that [a, b][c, d] = [r, s][t, u].
3 If (a, b) ~ (c, d) means ad = bc, prove that ~ is an equivalence relation on S.
4 Prove that addition in A* is associative and commutative.
5 Prove that multiplication in A* is associative and commutative.
6 Prove the distributive law in A*.
7 Verify that b has the same meaning as b < a. Furthermore, a ≤ b means "a < b or a = b," and b ≥ a means the same as a ≤ b.
In an ordered integral domain A, an element a is called positive if a > 0. If a < 0 we call a negative. Note that if a is positive then −a is negative. (Proof: Add −a to both sides of the inequality a > 0.) Similarly, if a is negative, then −a is positive.
In any ordered integral domain, the square of every nonzero element is positive. Indeed, if c is nonzero, then either c > 0 or c < 0. If c > 0, then, multiplying both sides of the inequality c > 0 by c,
cc > c0 = 0
so c² > 0. On the other hand, if c < 0, then
(−c) > 0
hence (−c)(−c) > 0(−c) = 0
But (−c)(−c) = c², so once again, c² > 0.
In particular, since 1 = 1², 1 is always positive. From the fact that 1 > 0, we immediately deduce that 1 + 1 > 1, 1 + 1 + 1 > 1 + 1, and so on. In general, for any positive integer n,
(n + 1) · 1 > n · 1
where n · 1 designates the unity element of the ring A added to itself n times. Thus, in any ordered integral domain A, the set of all the multiples of 1 is ordered as in Z: namely
··· < (−2) · 1 < (−1) · 1 < 0 < 1 < 2 · 1 < 3 · 1 < ···
The set of all the positive elements of A is denoted by A+. An ordered integral domain A is called an integral system if every nonempty subset of A+ has a least element. In other words, if every nonempty set of positive elements of A has a least element. This property is called the well-ordering property for A+.
It is obvious that Z is an integral system, for every nonempty set of positive integers contains a least number. For example, the smallest element of the set of all the positive even integers is 2. Note that Q and R are not integral systems. For although both are ordered integral domains, they contain sets of positive numbers, such as {x: 00 and there are no elements of A between 0 and 1, so b > 1. (Remember that b cannot be equal to 1 because b is not a multiple of 1.) Since b > 1, it follows that b − 1 > 0. But b - 1 1.
Thus, b − 1 > 0, and b − 1 ∈ K. But then, by Condition (ii), b ∈ K, which is impossible. ■
Let the symbol S_n represent any statement about the positive integer n. For example, S_n might stand for "n is odd," or "n is a prime," or it might represent an equation such as (n − 1)(n + 1) = n² − 1 or an inequality such as n ≤ n². If, let us say, S_n stands for n ≤ n², then S_1 asserts that 1 ≤ 1², S_2 asserts that 2 ≤ 2², S_3 asserts that 3 ≤ 3², and so on.
Theorem 2: Principle of mathematical induction Consider the following conditions:
(i) S_1 is true.
(ii) For any positive integer k, if S_k is true, then also S_(k+1) is true.
If Conditions (i) and (ii) are satisfied, then S_n is true for every positive integer n.
Proof: Indeed, if K is the set of all the positive integers k such that S_k is true, then K complies with the conditions of Theorem 1. Thus, K contains all the positive integers. This means that S_n is true for every n. ■
As a simple illustration of how the principle of mathematical induction is applied, let S_n be the statement that
1 + 2 + ··· + n = n(n + 1)/2
that is, the sum of the first n positive integers is equal to n(n + 1)/2. Then S_1 is simply
1 = (1 · 2)/2
which is clearly true. Suppose, next, that k is any positive integer and that S_k is true. In other words,
1 + 2 + ··· + k = k(k + 1)/2
Then, by adding k + 1 to both sides of this equation, we obtain
1 + 2 + ··· + k + (k + 1) = k(k + 1)/2 + (k + 1)
that is,
1 + 2 + ··· + (k + 1) = (k + 1)(k + 2)/2
However, this last equation is exactly S_(k+1). We have therefore verified that whenever S_k is true, S_(k+1) also is true. Now, the principle of mathematical induction allows us to conclude that
1 + 2 + ··· + n = n(n + 1)/2
for every positive integer n.
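The two ingredients of this induction, the base case and the step from k to k + 1, can be mirrored in a short Python check. This is a finite sanity check under an arbitrary bound of 200, not a proof; the names `sum_first` and `formula` are invented.

```python
def sum_first(n):
    """1 + 2 + ... + n, computed by direct addition."""
    return sum(range(1, n + 1))

def formula(n):
    """The closed form n(n + 1)/2 (always an integer)."""
    return n * (n + 1) // 2

# Base case: the statement for n = 1.
assert sum_first(1) == formula(1)

# Inductive step, checked for k = 1, ..., 199: adding k + 1 to the
# closed form for k gives the closed form for k + 1.
for k in range(1, 200):
    assert formula(k) + (k + 1) == formula(k + 1)
    assert sum_first(k) == formula(k)

print(formula(100))
```

The algebra in the text establishes the step for every k at once, which is exactly what a finite loop cannot do.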
A variant of the principle of mathematical induction, called the principle of strong induction, asserts that S_n is true for every positive integer n on the conditions that
(i) S_1 is true, and
(ii) For any positive integer k, if S_i is true for every i < k, then S_k is true.
The details are outlined in Exercise H at the end of this chapter.
One of the most important facts about the integers is that any integer m may be divided by any positive integer n to yield a quotient q and a nonnegative remainder r. (The remainder is less than the divisor n.) For example, 25 may be divided by 8 to give a quotient of 3 and a remainder of 1:
25 = 8 × 3 + 1 (here m = 25, n = 8, q = 3, r = 1)
This process is known as the division algorithm. It is stated in a precise manner as follows:
Theorem 3: Division algorithm If m and n are integers and n is positive, there exist unique integers q and r such that
m = nq + r and 0 ≤ r < n
We call q the quotient, and r the remainder, in the division of m by n.
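For a positive divisor n, Python's built-in `divmod` uses floor division, so it returns a pair satisfying exactly the two conditions of the theorem, including for negative m. The wrapper name `divide` below is invented for this sketch.

```python
def divide(m, n):
    """Return (q, r) with m == n*q + r and 0 <= r < n, for n > 0."""
    if n <= 0:
        raise ValueError("the divisor n must be positive")
    q, r = divmod(m, n)  # floor division: r is nonnegative when n > 0
    return q, r

# The example from the text.
print(divide(25, 8))

# A negative dividend still yields a nonnegative remainder.
q, r = divide(-7, 3)
assert -7 == 3 * q + r and 0 <= r < 3
```

Note that some languages truncate toward zero instead, which would give a negative remainder for m = −7; the theorem's remainder is the nonnegative one.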
Proof: We begin by showing a simple fact:
There exists an integer x such that xn ≤ m.   (*)
Remember that n is positive; hence n ≥ 1. As for m, either m ≥ 0 or m < 0. We consider these two cases separately: Suppose m ≥ 0. Then
0 ≤ m hence (0)n ≤ m (here x = 0)
Suppose m < 0. We may multiply both sides of n ≥ 1 by the positive integer −m to get (−m)n ≥ −m. Adding mn + m to both sides yields
mn ≤ m (here x = m)
Thus, regardless of whether m is positive or negative, there is some integer x such that xn ≤ m.
Let W be the subset of Z consisting of all the nonnegative integers which are expressible in the form m − xn, where x is any integer. By (*) W is not empty; hence by the well-ordering property, W contains a least integer r. Because r ∈ W, r is nonnegative and is expressible in the form m − nq for some integer q. That is,
r ≥ 0 and r = m − nq
Thus, we already have m = nq + r and 0 ≤ r. It remains only to verify that rc < ac.
5 a 0, then aab, if a2 + 62^0 5« + K«6 + l, if a, b>\
6 ab + ac + bc + 1 < a + b + c + abc, if a, b, c > 1
C. Uses of Induction
Prove parts 1-7, using the principle of mathematical induction. (Assume n is a positive integer.)
1 1 + 3 + 5 + ··· + (2n − 1) = n² (The sum of the first n odd integers is n².)
2 1³ + 2³ + ··· + n³ = (1 + 2 + ··· + n)²
3 1-
22 +
+ (n - 1)2< y 0,
F_(n+1) F_(n+2) − F_n F_(n+3) = (−1)^n
D. Every Integral System Is Isomorphic to Z
Let A be an integral system. Let h: Z → A be defined by: h(n) = n · 1. The purpose of this exercise is to prove that h is an isomorphism, from which it follows that A ≅ Z.
1 Prove: For every positive integer n, n · 1 > 0. What is the characteristic of A?
2 Prove that h is injective and surjective.
3 Prove that h is an isomorphism.
E. Absolute Values
In any ordered integral domain, define |a| by
|a| = a if a ≥ 0, and |a| = −a if a < 0
Using this definition, prove the following:
1 |-a| = |a|
2 a ≤ |a|
3 a ≥ −|a|
4 If b > 0, |a| ≤ b iff −b ≤ a ≤ b
# 5 |a + b| ≤ |a| + |b|
6 |a − b| ≤ |a| + |b|
7 |ab| = |a| · |b|
# 8 |a| − |b| ≤ |a − b|
9 ||a| − |b|| ≤ |a − b|
F. Problems on the Division Algorithm
Prove parts 1-3, where k, m, n, q, and r designate integers.
1 Let n > 0 and k > 0. If q is the quotient and r is the remainder when m is divided by n, then q is the quotient and kr is the remainder when km is divided by kn.
2 Let n > 0 and k > 0. If q is the quotient when m is divided by n, and q_1 is the quotient when q is divided by k, then q_1 is the quotient when m is divided by nk.
3 If n ≠ 0, there exist q and r such that m = nq + r and 0 ≤ r < |n|. (Use Theorem 3, and consider the case when n < 0.)
4 In Theorem 3, suppose m = nq_1 + r_1 = nq_2 + r_2 where 0 ≤ r_1, r_2 < n. Prove that r_1 − r_2 = 0. [Hint: Consider the difference (nq_1 + r_1) − (nq_2 + r_2).]
5 Use part 4 to prove that q_1 − q_2 = 0. Conclude that the quotient and remainder, in the division algorithm, are unique.
6 If r is the remainder when m is divided by n, prove that m̄ = r̄ in Z_n.
G. Laws of Multiples
The purpose of this exercise is to give rigorous proofs (using induction) of the basic identities involved in the use of exponents or multiples. If A is a ring and a ∈ A, we define n · a (where n is any positive integer) by the pair of conditions:
(i) 1 · a = a, and (ii) (n + 1) · a = n · a + a
Use mathematical induction (with the above definition) to prove that the following are true for all positive integers n and all elements a, b ∈ A:
1 n · (a + b) = n · a + n · b
2 (n + m) · a = n · a + m · a
3 (n · a)b = a(n · b) = n · (ab)
4 m · (n · a) = (mn) · a
# 5 n · a = (n · 1)a where 1 is the unity element of A
6 (n · a)(m · b) = (nm) · ab (Use parts 3 and 4.)
H. Principle of Strong Induction
Prove the following in Z:
1 Let K denote a set of positive integers. Consider the following conditions:
(i) 1 ∈ K.
(ii) For any positive integer k, if every positive integer less than k is in K, then k ∈ K.
If K satisfies these two conditions, prove that K contains all the positive integers.
2 Let S_n represent any statement about the positive integer n. Consider the following conditions:
(i) S_1 is true.
(ii) For any positive integer k, if S_i is true for every i 1. In case r > 1 we may multiply both sides of 1 < r by s to get s < rs = 1; this is impossible because s cannot be positive and < 1. Thus, it must be that r = 1; hence 1 = rs = 1s = s, so also s = 1.
If r and s are both negative, then —r and —s are positive. Thus,
1 = rs = (-r)(-s) and by the preceding case, —r = — s = 1. Thus, r = s = — 1. ■
A pair of integers r and s are called associates if they divide each other, that is, if r | s and s | r. If r and s are associates, this means there are integers k and l such that r = ks and s = lr. Thus, r = ks = klr, hence kl = 1. By Theorem 2, k and l are ±1, and therefore r = ±s. Thus, we have shown that
If r and s are associates in Z, then r = ±s.   (1)
An integer t is called a common divisor of integers r and s if t | r and t | s. A greatest common divisor of r and s is an integer t such that
(i) t | r and t | s, and
(ii) For any integer u, if u | r and u | s, then u | t.
In other words, t is a greatest common divisor of r and s if t is a common divisor of r and s, and every other common divisor of r and s divides t. Note that the adjective "greatest" in this definition does not mean primarily that t is greater in magnitude than any other common divisor, but, rather, that it is a multiple of any other common divisor.
The words "greatest common divisor" are familiarly abbreviated by gcd. As an example, 2 is a gcd of 8 and 10; but -2 also is a gcd of 8 and 10. According to the definition, two different gcd's must divide each other; hence by Property (1) above, they differ only in sign. Of the two possible gcd's ± t for r and s, we select the positive one, call it the gcd of r and s, and denote it by
gcd(r, s)
Does every pair r, s of integers have a gcd? Our experience with the integers tells us that the answer is "yes." We can easily prove this, and more:
Theorem 3 Any two nonzero integers r and s have a greatest common divisor t. Furthermore, t is equal to a "linear combination" of r and s. That is,
t = kr + ls
for some integers k and l.
Proof: Let J be the set of all the linear combinations of r and s, that is, the set of all ur + vs as u and v range over Z. J is closed with respect to addition and negatives and absorbs products because
(u_1 r + v_1 s) + (u_2 r + v_2 s) = (u_1 + u_2)r + (v_1 + v_2)s
−(ur + vs) = (−u)r + (−v)s
and w(ur + vs) = (wu)r + (wv)s
Thus, J is an ideal of Z. By Theorem 1, J is a principal ideal of Z, say J = ⟨t⟩. (J consists of all the multiples of t.)
Now t is in J, which means that t is a linear combination of r and s:
t = kr + ls
Furthermore, r = 1r + 0s and s = 0r + 1s, so r and s are linear combinations of r and s; thus r and s are in J. But all the elements of J are multiples of t, so r and s are multiples of t. That is,
t | r and t | s
Now, if u is any common divisor of r and s, this means that r = xu and s = yu for some integers x and y. Thus,
t = kr + ls = kxu + lyu = u(kx + ly)
It follows that u | t. This confirms that t is the gcd of r and s. ■
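The proof above is an existence argument based on ideals and does not say how to find k and l. One standard way to compute them, swapped in here and not taken from the text, is the extended Euclidean algorithm. The following is a minimal sketch; the function name `extended_gcd` and its variable names are invented.

```python
def extended_gcd(r, s):
    """Return (t, k, l) with t = gcd(r, s) >= 0 and t == k*r + l*s.

    Extended Euclidean algorithm; throughout the loop,
    old_r == old_k*r + old_l*s and cur_r == cur_k*r + cur_l*s.
    """
    old_r, cur_r = r, s
    old_k, cur_k = 1, 0
    old_l, cur_l = 0, 1
    while cur_r != 0:
        q = old_r // cur_r
        old_r, cur_r = cur_r, old_r - q * cur_r
        old_k, cur_k = cur_k, old_k - q * cur_k
        old_l, cur_l = cur_l, old_l - q * cur_l
    if old_r < 0:  # choose the positive one of the two gcd's
        old_r, old_k, old_l = -old_r, -old_k, -old_l
    return old_r, old_k, old_l

t, k, l = extended_gcd(15, 6)
assert t == k * 15 + l * 6
print(t, k, l)
```

The coefficients k and l are not unique; any pair satisfying t = kr + ls serves the purposes of the theorem.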
A word of warning: the fact that an integer m is a linear combination of r and s does not necessarily imply that m is the gcd of r and s. For example, 3 = (1)15 + (−2)6, and 3 is the gcd of 15 and 6. On the other hand, 27 = (1)15 + (2)6, yet 27 is not a gcd of 15 and 6.
A pair of integers r and s are said to be relatively prime if they have no common divisors except ±1. For example, 4 and 15 are relatively prime. If r and s are relatively prime, their gcd is equal to 1; so by Theorem 3, there are integers k and l such that kr + ls = 1. Actually, the converse of this statement is true too: if some linear combination of r and s is equal to 1 (that is, if there are integers k and l such that kr + ls = 1), then r and s are relatively prime. The simple proof of this fact is left as an exercise.
If m is any integer, it is obvious that ±1 and ±m are factors of m. We call these the trivial factors of m. If m has any other factors, we call them proper factors of m. For example, ±1 and ±6 are the trivial factors of 6, whereas ±2 and ±3 are proper factors of 6.
If an integer m has proper factors, m is called composite. If an integer p ≠ 0, 1 has no proper factors (that is, if all its factors are trivial), then we call p a prime. For example, 6 is composite, whereas 7 is a prime.
Composite number lemma If a positive integer m is composite, then m = rs where 1 < r < m and 1 < s < m.
Theorem 4 Every integer n > 1 can be expressed as a product of positive primes. That is, there are one or more primes p_1, ..., p_r such that
n = p_1 p_2 ··· p_r
Proof: Let K represent the set of all the positive integers greater than 1 which cannot be written as a product of one or more primes. We will assume there are such integers, and derive a contradiction.
By the well-ordering principle, K contains a least integer m; m cannot be a prime, because if it were a prime it would not be in K. Thus, m is composite; so by the composite number lemma,
m = rs
for positive integers r and s less than m and greater than 1; r and s are not in K because m is the least integer in K. This means that r and 5 can be expressed as products of primes, hence so can m = rs. This contradiction proves that K is empty; hence every n > 1 can be expressed as a product of primes. ■
Theorem 5: Unique factorization Suppose n can be factored into positive primes in two ways, namely,
n = p_1 ··· p_r = q_1 ··· q_t
Then r = t, and the p_i are the same numbers as the q_j except, possibly, for the order in which they appear.
Proof: In the equation p_1 ··· p_r = q_1 ··· q_t, let us cancel common factors from each side, one by one, until we can do no more canceling. If all the factors are canceled on both sides, this proves the theorem. Otherwise, we are left with some factors on each side, say
p_i ··· p_k = q_j ··· q_m
Now, p_i is a factor of p_i ··· p_k, so p_i | q_j ··· q_m. Thus, by Corollary 2 to Euclid's lemma, p_i is equal to one of the factors q_j, ..., q_m, which is impossible because we assumed we can do no more canceling. ■
It follows from Theorems 4 and 5 that every integer m can be factored into primes, and that the prime factors of m are unique (except for the order in which we happen to list them).
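A factorization of this kind can be computed by trial division. This is a simple method chosen for the sketch rather than one described in the text, and it is practical only for small n; the function name `prime_factors` is invented.

```python
def prime_factors(n):
    """Return the positive prime factors of n > 1 in nondecreasing order."""
    if n <= 1:
        raise ValueError("n must be greater than 1")
    factors = []
    d = 2
    while d * d <= n:
        # Any d that divides n at this point is prime, because all
        # smaller prime factors have already been divided out.
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:  # whatever remains has no factor <= its square root
        factors.append(n)
    return factors

print(prime_factors(360))
```

Listing the factors in nondecreasing order gives a canonical form, so by unique factorization two integers greater than 1 are equal exactly when their lists are equal.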
EXERCISES
A. Properties of the Relation "a Divides b"
Prove the following, for any integers a, b, and c:
1 If a | b and b | c, then a | c.
2 a | b iff a | (−b) iff (−a) | b.
3 1 | a and (−1) | a.
4 a | 0.
5 If c | a and c | b, then c | (ax + by) for all x, y ∈ Z.
6 If a > 0 and b > 0 and a | b, then a ≤ b.
7 a | b iff ac | bc, when c ≠ 0.
8 If a | b and c | d, then ac | bd.
9 Let p be a prime. If p | a^n for some n > 0, then p | a.
B. Properties of the gcd
Prove the following, for any integers a, b, and c. For each of these problems, you will need only the definition of the gcd.
# 1 If a > 0 and a | b, then gcd(a, b) = a.
2 gcd(a, 0) = a, if a >0.
3 gcd(a, b) = gcd(a, b + xa) for any $x \in \mathbb{Z}$.
4 Let p be a prime. Then gcd(a, p) = 1 or p. (Explain.)
5 Suppose every common divisor of a and b is a common divisor of c and d, and vice versa. Then gcd(a, b) = gcd(c, d).
6 If gcd(ab, c) = 1, then gcd(a, c) = 1 and gcd(b, c) = 1.
7 Let gcd(a, b) = c. Write a = ca' and b = cb'. Then gcd(a', b') = 1.
C. Properties of Relatively Prime Integers
Prove the following, for all integers a, b, c, d, r, and s. (Theorem 3 will be helpful.)
1 If there are integers r and s such that ra + sb = 1, then a and b are relatively prime.
2 If gcd(a, c) = 1 and $c \mid ab$, then $c \mid b$. (Reason as in the proof of Euclid's lemma.)
3 If $a \mid d$ and $c \mid d$ and gcd(a, c) = 1, then $ac \mid d$.
4 If $d \mid ab$ and $d \mid cb$, where gcd(a, c) = 1, then $d \mid b$.
5 If d = gcd(a, b) where a = dr and b = ds, then gcd(r, s) = 1.
6 If gcd(a, c) = 1 and gcd(b, c) = 1, then gcd(ab, c) = 1.
D. Further Properties of gcd's and Relatively Prime Integers
Prove the following, for all integers a, b, c, d, r, and s:
1 Suppose $a \mid b$ and $c \mid b$ and gcd(a, c) = d. Then $ac \mid bd$.
2 If $ac \mid b$ and $ad \mid b$ and gcd(c, d) = 1, then $acd \mid b$.
3 Let d = gcd(a, b). For any integer x, $d \mid x$ iff x is a linear combination of a and b.
4 Suppose that for all integers x, $x \mid a$ and $x \mid b$ iff $x \mid c$. Then c = gcd(a, b).
5 For all n > 0, if gcd(a, b) = 1, then gcd(a, $b^n$) = 1. (Prove by induction.)
6 Suppose gcd(a, b) = 1 and $c \mid ab$. Then there exist integers r and s such that c = rs, $r \mid a$, $s \mid b$, and gcd(r, s) = 1.
E. A Property of the gcd
Let a and b be integers. Prove parts 1 and 2:
# 1 Suppose a is odd and b is even, or vice versa. Then gcd(a, b) = gcd(a + b, a - b).
2 Suppose a and b are both odd. Then 2 gcd(a, b) = gcd(a + b, a - b).
3 If a and b are both even, explain why either of the two previous conclusions is possible.
F. Least Common Multiples
A least common multiple of two integers a and b is a positive integer c such that (i) $a \mid c$ and $b \mid c$; (ii) if $a \mid x$ and $b \mid x$, then $c \mid x$.
1 Prove: The set of all the common multiples of a and b is an ideal of Z.
2 Prove: Every pair of integers a and b has a least common multiple. (Hint: Use part 1.)
The positive least common multiple of a and b is denoted by lcm(a, b). Prove the following for all positive integers a, b, and c:
# 3 a · lcm(b, c) = lcm(ab, ac).
4 If $a = a_1 c$ and $b = b_1 c$ where c = gcd(a, b), then lcm(a, b) = $a_1 b_1 c$.
5 lcm(a, ab) = ab.
6 If gcd(a, b) = 1, then lcm(a, b) = ab.
7 If lcm(a, b) = ab, then gcd(a, b) = 1.
8 Let gcd(a, b) = c. Then lcm(a, b) = ab/c.
9 Let gcd(a, b) = c and lcm(a, b) = d. Then cd = ab.
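Before attempting the proofs in this set, it can help to test the identities on small numbers. The sketch below is a numerical sanity check, not a proof: it finds the least common multiple by direct search (the helper names `lcm_by_search` and `check_gcd_lcm_product` are assumptions made for this illustration) and compares gcd(a, b) · lcm(a, b) with ab.

```python
from math import gcd


def lcm_by_search(a, b):
    """Smallest positive common multiple of the positive integers a and b."""
    if a <= 0 or b <= 0:
        raise ValueError("a and b must be positive")
    big, small = max(a, b), min(a, b)
    multiple = big
    # Step through the multiples of the larger number until the smaller divides one.
    while multiple % small != 0:
        multiple += big
    return multiple


def check_gcd_lcm_product(limit=40):
    """Check gcd(a, b) * lcm(a, b) == a * b for all 1 <= a, b <= limit."""
    return all(
        gcd(a, b) * lcm_by_search(a, b) == a * b
        for a in range(1, limit + 1)
        for b in range(1, limit + 1)
    )


print(lcm_by_search(4, 6))        # 12
print(check_gcd_lcm_product())    # True
```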
G. Ideals in Z
Prove the following:
1 $\langle n \rangle$ is a prime ideal iff n is a prime number.
2 Every prime ideal of Z is a maximal ideal. [Hint: If $\langle p \rangle \subseteq \langle a \rangle$, but $\langle p \rangle \ne \langle a \rangle$, explain why gcd(p, a) = 1 and conclude that $1 \in \langle a \rangle$.] Assume the ideal is not $\langle 0 \rangle$.
3 For every prime number p, $\mathbb{Z}_p$ is a field. (Hint: Remember $\mathbb{Z}_p = \mathbb{Z}/\langle p \rangle$. Use Exercise 4, Chapter 19.)
4 If c = lcm(a, b), then $\langle a \rangle \cap \langle b \rangle = \langle c \rangle$.
5 Every homomorphic image of Z is isomorphic to $\mathbb{Z}_n$ for some n.
6 Let G be a group and let $a, b \in G$. Then $S = \{n \in \mathbb{Z} : ab^n = b^n a\}$ is an ideal of Z.
7 Let G be a group, H a subgroup of G, and $a \in G$. Then $S = \{n \in \mathbb{Z} : a^n \in H\}$ is an ideal of Z.
8 If gcd(a, b) = d, then $\langle a \rangle + \langle b \rangle = \langle d \rangle$. (Note: If J and K are ideals of a ring A, then $J + K = \{x + y : x \in J \text{ and } y \in K\}$.)
H. The gcd and the lcm as Operations on Z
For any two integers a and b, let $a * b = \gcd(a, b)$ and $a \circ b = \operatorname{lcm}(a, b)$. Prove the following properties of these operations:
1 $*$ and $\circ$ are associative.
2 There is an identity element for $\circ$, but not for $*$ (on the set of positive integers).
3 Which integers have inverses with respect to $\circ$?
4 Prove: $a * (b \circ c) = (a * b) \circ (a * c)$.
CHAPTER
TWENTY-THREE
ELEMENTS OF NUMBER THEORY (OPTIONAL)
Almost as soon as children are able to count, they learn to distinguish between even numbers and odd numbers. The distinction between even and odd is the most elemental of all concepts relating to numbers. It is also the starting point of the modern science of number theory.
From a sophisticated standpoint, a number is even if the remainder, after dividing the number by 2, is 0. The number is odd if that remainder is 1.
This notion may be generalized in an obvious way. Let n be any positive integer: a number is said to be congruent to 0, modulo n if the remainder, when the number is divided by n, is 0. The number is said to be congruent to 1, modulo n if the remainder, when the number is divided by n, is 1. Similarly, the number is congruent to 2, modulo n if the remainder after division by n is 2; and so on. This is the natural way of generalizing the distinction between odd and even.
Note that "even" is the same as "congruent to 0, modulo 2"; and "odd" is the same as "congruent to 1, modulo 2."
In short, the distinction between odd and even is only one special case of a more general notion. We shall now define this notion formally:
[Figure: number lines for the integers modulo 2, modulo 3, and modulo 4.]
Let n be any positive integer. If a and b are any two integers, we shall say that a is congruent to b, modulo n if a and b, when they are divided by n, leave the same remainder r. That is, if we use the division algorithm to divide a and b by n, then
$$a = nq_1 + r \quad \text{and} \quad b = nq_2 + r$$
where the remainder r is the same in both equations. Subtracting these two equations, we see that
$$a - b = (nq_1 + r) - (nq_2 + r) = n(q_1 - q_2)$$
Therefore we get the following important fact:
a is congruent to b, modulo n iff n divides a - b   (1)
If a is congruent to b, modulo n, we express this fact in symbols by writing
$$a \equiv b \pmod{n}$$
which should be read "a is congruent to b, modulo n." We refer to this relation as congruence modulo n.
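Condition (1) can be checked numerically on a range of integers. The following Python sketch is an illustration under the assumption that remainders are taken as in the division algorithm (Python's `%` operator does this when n > 0); the function names are hypothetical.

```python
def same_remainder(a, b, n):
    """True if a and b leave the same remainder on division by n > 0."""
    # For n > 0, Python's % always returns a remainder r with 0 <= r < n.
    return a % n == b % n


def divides_difference(a, b, n):
    """True if n divides a - b."""
    return (a - b) % n == 0


def condition_1_holds(n, bound=25):
    """Compare both descriptions of congruence for all |a|, |b| <= bound."""
    values = range(-bound, bound + 1)
    return all(
        same_remainder(a, b, n) == divides_difference(a, b, n)
        for a in values
        for b in values
    )


print(same_remainder(17, 5, 6))                        # True, since 17 - 5 = 12
print(all(condition_1_holds(n) for n in range(1, 10)))  # True
```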
By using Condition (1), it is easily verified that congruence modulo n is a reflexive, symmetric, and transitive relation on Z. It is also easy to check that for any n > 0 and any integers a, b, and c,
$$a \equiv b \pmod{n} \quad \text{implies} \quad a + c \equiv b + c \pmod{n}$$
and
$$a \equiv b \pmod{n} \quad \text{implies} \quad ac \equiv bc \pmod{n}$$
(The proofs, which are exceedingly easy, are assigned as Exercise C at the end of this chapter.) Recall that
$$\langle n \rangle = \{\ldots, -3n, -2n, -n, 0, n, 2n, 3n, \ldots\}$$
is the ideal of Z which consists of all the multiples of n. The quotient ring $\mathbb{Z}/\langle n \rangle$ is usually denoted by $\mathbb{Z}_n$, and its elements are denoted by $\bar{0}, \bar{1}, \bar{2}, \ldots, \overline{n-1}$. These elements are cosets:
$$\bar{0} = \langle n \rangle + 0 = \{\ldots, -2n, -n, 0, n, 2n, \ldots\}$$
$$\bar{1} = \langle n \rangle + 1 = \{\ldots, -2n + 1, -n + 1, 1, n + 1, 2n + 1, \ldots\}$$
$$\bar{2} = \langle n \rangle + 2 = \{\ldots, -2n + 2, -n + 2, 2, n + 2, 2n + 2, \ldots\}$$
and so on. It is clear by inspection that different integers are in the same coset iff they differ from each other by a multiple of n. That is,
a and b are in the same coset iff n divides a - b iff $a \equiv b \pmod{n}$   (2)
If a is any integer, the coset (in $\mathbb{Z}_n$) which contains a will be denoted by $\bar{a}$. For example, in $\mathbb{Z}_6$,
$$\bar{0} = \bar{6} = \overline{-6} = \overline{12} = \overline{18} = \cdots$$
$$\bar{1} = \bar{7} = \overline{-5} = \overline{13} = \cdots$$
$$\bar{2} = \bar{8} = \overline{-4} = \overline{14} = \cdots \quad \text{etc.}$$
In particular, $\bar{a} = \bar{b}$ means that a and b are in the same coset. It follows by Condition (2) that
$\bar{a} = \bar{b}$ in $\mathbb{Z}_n$ iff $a \equiv b \pmod{n}$   (3)
On account of this fundamental connection between congruence modulo n and equality in $\mathbb{Z}_n$, most facts about congruence can be discovered by examining the rings $\mathbb{Z}_n$. These rings have very simple properties, which are presented next. From these properties we will then be able to deduce all we need to know about congruences.
Let n be a positive integer. It is an important fact that for any integer a,
$\bar{a}$ is invertible in $\mathbb{Z}_n$ iff a and n are relatively prime.   (4)
Indeed, if a and n are relatively prime, their gcd is equal to 1. Therefore, by Theorem 3 of Chapter 22, there are integers s and t such that sa + tn = 1. It follows that
$$1 - sa = tn \in \langle n \rangle$$
so by Condition (2) of Chapter 19, 1 and sa belong to the same coset in $\mathbb{Z}/\langle n \rangle$. This is the same as saying that $\bar{1} = \overline{sa} = \bar{s}\,\bar{a}$; hence $\bar{s}$ is the multiplicative inverse of $\bar{a}$ in $\mathbb{Z}_n$. The converse is proved by reversing the steps of this argument.
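The argument above is constructive: the integers s and t with sa + tn = 1 can be found by the extended Euclidean algorithm, and s then represents the inverse of a in Z_n. The Python sketch below illustrates this; the function names `extended_gcd` and `inverse_mod` are assumptions made for the example.

```python
from math import gcd


def extended_gcd(a, b):
    """Return (g, s, t) with g = gcd(a, b) and s*a + t*b == g, for a, b >= 0."""
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    return old_r, old_s, old_t


def inverse_mod(a, n):
    """Return the inverse of a in Z_n as an integer in range(n), or None."""
    g, s, _ = extended_gcd(a % n, n)
    if g != 1:
        return None  # a and n are not relatively prime, so no inverse exists
    return s % n


print(inverse_mod(5, 12))  # 5, because 5 * 5 = 25 = 2 * 12 + 1
print(inverse_mod(4, 12))  # None, because gcd(4, 12) = 4
```

Running `inverse_mod` over every a from 1 to n - 1 and comparing with `gcd(a, n) == 1` gives a direct numerical check of Condition (4) for a chosen n.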
It follows from Condition (4) above, that if n is a prime number, every nonzero element of $\mathbb{Z}_n$ is invertible! Thus,
$\mathbb{Z}_p$ is a field for every prime number p.   (5)
In any field, the set of all the nonzero elements, with multiplication as the only operation (ignore addition), is a group. Indeed, the product of any two nonzero elements is nonzero, and the multiplicative inverse of any nonzero element is nonzero. Thus, in Zp, the set
$$\mathbb{Z}_p^* = \{\bar{1}, \bar{2}, \ldots, \overline{p-1}\}$$
with multiplication as its only operation, is a group of order p - 1.
Remember that if G is a group whose order is, let us say, m, then $x^m = e$ for every x in G. (This is true by Theorem 5 of Chapter 13.) Now, $\mathbb{Z}_p^*$ has order p - 1 and its identity element is $\bar{1}$, so $\bar{a}^{\,p-1} = \bar{1}$ for every $\bar{a} \ne \bar{0}$ in $\mathbb{Z}_p$. If we use Condition (3) to translate this equality into a congruence, we get a classical result of number theory:
Little theorem of Fermat Let p be a prime. Then,
$$a^{p-1} \equiv 1 \pmod{p} \quad \text{for every } a \not\equiv 0 \pmod{p}$$
Corollary $a^p \equiv a \pmod{p}$ for every integer a.
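Both statements are easy to test for small primes with Python's built-in three-argument `pow`, which computes powers modulo n. The sketch below is a numerical illustration only; the function names are hypothetical.

```python
def fermat_holds(p):
    """Check a^(p-1) = 1 (mod p) for every a with 1 <= a <= p - 1."""
    return all(pow(a, p - 1, p) == 1 for a in range(1, p))


def corollary_holds(p, bound=50):
    """Check a^p = a (mod p) for every integer a with |a| <= bound."""
    return all(pow(a, p, p) == a % p for a in range(-bound, bound + 1))


print(all(fermat_holds(p) for p in (2, 3, 5, 7, 11, 13)))     # True
print(all(corollary_holds(p) for p in (2, 3, 5, 7, 11, 13)))  # True
print(fermat_holds(9))  # False: 9 is not prime, and 3^8 = 0 (mod 9)
```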
Actually, a version of this theorem is true in $\mathbb{Z}_n$ even where n is not a prime number. In this case, let $V_n$ denote the set of all the invertible elements in $\mathbb{Z}_n$. Clearly, $V_n$ is a group with respect to multiplication. (Reason: The product of two invertible elements is invertible, and, if a is invertible, so is its inverse.) For any positive integer n, let 0.)
7 If $a^2 \equiv 1 \pmod{2}$, then $a^2 \equiv 1 \pmod{4}$.
8 If $a \equiv b \pmod{n}$, then $a^2 + b^2 \equiv 2ab \pmod{n^2}$, and conversely.
9 If $a \equiv 1 \pmod{m}$, then a and m are relatively prime.
E. Consequences of Fermat's Theorem
1 If p is a prime, find $\phi(p)$.
2 If p > 2 is a prime and $a \not\equiv 0 \pmod{p}$, then
$$a^{(p-1)/2} \equiv \pm 1 \pmod{p}$$
3 (a) Let p be a prime > 2. If $p \equiv 3 \pmod{4}$, then (p - 1)/2 is odd.
(b) Let p > 2 be a prime such that $p \equiv 3 \pmod{4}$. Then there is no solution to the congruence $x^2 + 1 \equiv 0 \pmod{p}$. [Hint: Raise both sides of $x^2 \equiv -1 \pmod{p}$ to the power (p - 1)/2, and use Fermat's little theorem.]
# 4 Let p and q be distinct primes. Then $p^{q-1} + q^{p-1} \equiv 1 \pmod{pq}$.
5 Let p be a prime.
(a) If $(p - 1) \mid m$, then $a^m \equiv 1 \pmod{p}$ provided that $p \nmid a$.
(b) If $(p - 1) \mid m$, then $a^{m+1} \equiv a \pmod{p}$ for all integers a.
# 6 Let p and q be distinct primes.
(a) If $(p - 1) \mid m$ and $(q - 1) \mid m$, then $a^m \equiv 1 \pmod{pq}$ for any a such that $p \nmid a$ and $q \nmid a$.
(b) If $(p - 1) \mid m$ and $(q - 1) \mid m$, then $a^{m+1} \equiv a \pmod{pq}$ for integers a.
7 Generalize the result of part 6 to n distinct primes, $p_1, \ldots, p_n$. (State your result, but do not prove it.)
8 Use part 6 to explain why the following are true:
# (a) $a^{19} \equiv a \pmod{133}$.
(b) $a^{10} \equiv 1 \pmod{66}$, provided a is not a multiple of 2, 3, or 11.
(c) $a^{13} \equiv a \pmod{105}$.
(d) $a^{49} \equiv a \pmod{1547}$. (Hint: 1547 = 7 x 13 x 17.)
9 Find the following integers x:
(a) $x \equiv 8^{38} \pmod{210}$ (b) $x \equiv 7^{57} \pmod{133}$ (c) x ≡ 5" (mod 66)
F. Consequences of Euler's Theorem
Prove parts 1-6:
1 If gcd(a, n) = 1, the solution modulo n of $ax \equiv b \pmod{n}$ is $x \equiv a^{\phi(n)-1} b \pmod{n}$.
2 If gcd(a, n) = 1, then $a^{m\phi(n)} \equiv 1 \pmod{n}$ for all values of m.
3 If gcd(m, n) = gcd(a, mn) = 1, then $a^{\phi(m)\phi(n)} \equiv 1 \pmod{mn}$.
4 If p is a prime, $\phi(p^n) = p^n - p^{n-1} = p^{n-1}(p - 1)$. (Hint: For any integer a, a and $p^n$ have a common divisor $\ne \pm 1$ iff a is a multiple of p. There are exactly $p^{n-1}$ multiples of p between 1 and $p^n$.)
5 For every $a \not\equiv 0 \pmod{p}$, $a^{p^n(p-1)} \equiv 1 \pmod{p^{n+1}}$, where p is a prime.
6 Under the conditions of part 3, if t is a common multiple of $\phi(m)$ and $\phi(n)$, then $a^t \equiv 1 \pmod{mn}$. Generalize to three integers l, m, and n.
7 Use parts 4 and 6 to explain why the following are true:
(a) $a^{12} \equiv 1 \pmod{180}$ for every a such that gcd(a, 180) = 1.
(b) $a^{42} \equiv 1 \pmod{1764}$ if gcd(a, 1764) = 1. (Remark: 1764 = 4 x 9 x 49.)
(c) $a^{60} \equiv 1 \pmod{1800}$ if gcd(a, 1800) = 1.
# 8 If gcd(m, n) = 1, prove that $n^{\phi(m)} + m^{\phi(n)} \equiv 1 \pmod{mn}$.
9 If l, m, n are relatively prime in pairs, prove that $(mn)^{\phi(l)} + (ln)^{\phi(m)} + (lm)^{\phi(n)} \equiv 1 \pmod{lmn}$.
G. Wilson's Theorem, and Some Consequences
In any integral domain, if $x^2 = 1$, then $x^2 - 1 = (x + 1)(x - 1) = 0$; hence $x = \pm 1$. Thus, an element $x \ne \pm 1$ cannot be its own multiplicative inverse. As a consequence, in $\mathbb{Z}_p$ the integers $\bar{2}, \bar{3}, \ldots, \overline{p-2}$ may be arranged in pairs, each one being paired off with its multiplicative inverse. Prove the following:
1 In $\mathbb{Z}_p$, $\bar{2} \cdot \bar{3} \cdots \overline{p-2} = \bar{1}$.
2 $(p - 2)! \equiv 1 \pmod{p}$ for any prime number p.
3 $(p - 1)! + 1 \equiv 0 \pmod{p}$ for any prime number p. This is known as Wilson's theorem.
4 For any composite number $n \ne 4$, $(n - 1)! \equiv 0 \pmod{n}$. [Hint: If p is any prime factor of n, then p is a factor of (n - 1)! Why?]
Before going on to the remaining exercises, we make the following observations: Let p > 2 be a prime. Then
$$(p - 1)! = 1 \cdot 2 \cdots \frac{p-1}{2} \cdot \frac{p+1}{2} \cdots (p - 2) \cdot (p - 1)$$
Consequently,
$$(p - 1)! \equiv (-1)^{(p-1)/2} \left( 1 \cdot 2 \cdots \frac{p-1}{2} \right)^2 \pmod{p}$$
Reason: $p - 1 \equiv -1 \pmod{p}$, $p - 2 \equiv -2 \pmod{p}$, ..., $(p + 1)/2 \equiv -(p - 1)/2 \pmod{p}$.
With this result, prove the following:
5 $\left[ \left( \frac{p-1}{2} \right)! \right]^2 \equiv (-1)^{(p+1)/2} \pmod{p}$, for any prime p > 2. (Hint: Use Wilson's theorem.)
6 If $p \equiv 1 \pmod{4}$, then (p + 1)/2 is odd. (Why?) Conclude that
$$\left[ \left( \frac{p-1}{2} \right)! \right]^2 \equiv -1 \pmod{p}$$
7 If $p \equiv 3 \pmod{4}$, then (p + 1)/2 is even. (Why?) Conclude that
$$\left[ \left( \frac{p-1}{2} \right)! \right]^2 \equiv 1 \pmod{p}$$
8 When p > 2 is a prime, the congruence $x^2 + 1 \equiv 0 \pmod{p}$ has a solution if $p \equiv 1 \pmod{4}$.
9 For any prime p > 2, $x^2 \equiv -1 \pmod{p}$ has a solution iff $p \not\equiv 3 \pmod{4}$. (Hint: Use part 8 and Exercise E3.)
H. Quadratic Residues
An integer a is called a quadratic residue modulo m if there is an integer x such that $x^2 \equiv a \pmod{m}$. This is the same as saying that $\bar{a}$ is a square in $\mathbb{Z}_m$. If a is not a quadratic residue modulo m, then a is called a quadratic nonresidue modulo m. Quadratic residues are important for solving quadratic congruences, for studying sums of squares, etc. Here, we will examine quadratic residues modulo an arbitrary prime p > 2.
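For a small prime p the nonzero quadratic residues can simply be listed by squaring every nonzero element of Z_p. The Python sketch below does this; the function name `quadratic_residues` is an assumption made for the illustration, and the count it produces is an observation on examples, not a proof.

```python
def quadratic_residues(p):
    """Nonzero quadratic residues modulo p, as sorted integers in range(1, p)."""
    return sorted({(x * x) % p for x in range(1, p)})


print(quadratic_residues(7))        # [1, 2, 4]
print(quadratic_residues(11))       # [1, 3, 4, 5, 9]
print(len(quadratic_residues(13)))  # 6, that is (13 - 1) / 2
```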
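Let $h : \mathbb{Z}_p^* \to \mathbb{Z}_p^*$ be defined by $h(\bar{a}) = \bar{a}^2$.
1 Prove h is a homomorphism. Its kernel is $\{\pm\bar{1}\}$.
# 2 The range of h has (p - 1)/2 elements. Prove: If ran h = R, R is a subgroup of $\mathbb{Z}_p^*$ having two cosets. One contains all the residues, the other all the nonresidues.
The Legendre symbol is defined as follows:
$$\left( \frac{a}{p} \right) = \begin{cases} +1 & \text{if } p \nmid a \text{ and } a \text{ is a residue mod } p \\ -1 & \text{if } p \nmid a \text{ and } a \text{ is a nonresidue mod } p \\ 0 & \text{if } p \mid a \end{cases}$$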
3 Referring to part 2, let the two cosets of R be called 1 and -1. Then $\mathbb{Z}_p^*/R = \{1, -1\}$. Explain why
$$\left( \frac{a}{p} \right) = \bar{a}R$$
for every integer a which is not a multiple of p. # 4 E.,l»..« (g); (|); ); (|).
5 Prove: if a = 6 (mod p), then = ^In particular, = (")■
'("?)"{-> If-)S<) (Hint: Use Erercises G6 and 7.)
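The most important rule for computing
$$\left( \frac{a}{p} \right)$$
is the law of quadratic reciprocity, which asserts that for distinct primes p, q > 2,
$$\left( \frac{p}{q} \right) = \begin{cases} -\left( \dfrac{q}{p} \right) & \text{if } p, q \text{ are both} \equiv 3 \pmod{4} \\ \left( \dfrac{q}{p} \right) & \text{otherwise} \end{cases}$$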
(The proof may be found in any textbook on number theory, for example,
Fundamentals of Number Theory by W. J. LeVeque.)
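8 Use parts 5 to 7 and the law of quadratic reciprocity to find:
$$\left( \frac{30}{101} \right) \quad \left( \frac{10}{151} \right) \quad \left( \frac{15}{41} \right) \quad \left( \frac{14}{59} \right) \quad \left( \frac{379}{401} \right)$$
Is 14 a quadratic residue, modulo 59?
9 Which of the following congruences is solvable?
(a) $x^2 \equiv 30 \pmod{101}$ (b) $x^2 \equiv 6 \pmod{103}$ (c) $2x^2 \equiv 70 \pmod{106}$
Note: $x^2 \equiv a \pmod{p}$ is solvable iff a is a quadratic residue modulo p iff $\left( \frac{a}{p} \right) = 1$.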
I. Primitive Roots
Recall that $V_n$ is the multiplicative group of all the invertible elements in $\mathbb{Z}_n$. If $V_n$ happens to be cyclic, say $V_n = \langle \bar{m} \rangle$, then any integer $a \equiv m \pmod{n}$ is called a primitive root of n.
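For small n one can search for primitive roots directly by computing the order of each invertible element and comparing it with the number of elements of V_n (this is the criterion stated in part 1 below). The Python sketch is an illustration only; the names `units`, `order`, and `primitive_roots` are assumptions introduced here, and n is assumed to be at least 2.

```python
from math import gcd


def units(n):
    """Representatives in range(1, n) of the invertible elements of Z_n."""
    return [a for a in range(1, n) if gcd(a, n) == 1]


def order(a, n):
    """Multiplicative order of a modulo n; a must be relatively prime to n."""
    if gcd(a, n) != 1:
        raise ValueError("a must be relatively prime to n")
    k, x = 1, a % n
    while x != 1:
        x = (x * a) % n
        k += 1
    return k


def primitive_roots(n):
    """All a in range(1, n) whose order in V_n equals the size of V_n."""
    group = units(n)
    return [a for a in group if order(a, n) == len(group)]


print(primitive_roots(7))  # [3, 5]
print(primitive_roots(8))  # [], since every element of V_8 has order 1 or 2
```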
1 Prove that a is a primitive root of n iff the order of $\bar{a}$ in $V_n$ is $\phi(n)$.
2 Prove that every prime number p has a primitive root. (Hint: For every prime p, $\mathbb{Z}_p^*$ is a cyclic group. The simple proof of this fact is given as Theorem 1 in Chapter 33.)
3 Find primitive roots of the following integers (if there are none, say so): 6, 10, 12, 14, 15.
4 Suppose a is a primitive root of m. Prove: If b is any integer which is relatively prime to m, then $b \equiv a^k \pmod{m}$ for some $k \ge 1$.
5 Suppose m has a primitive root, and let n be relatively prime to $\phi(m)$. (Suppose n > 0.) Prove that if a is relatively prime to m, then $x^n \equiv a \pmod{m}$ has a solution.
6 Let p > 2 be a prime. Prove that every primitive root of p is a quadratic nonresidue, modulo p. (Hint: Suppose a primitive root a is a residue; then every power of a is a residue.)
7 A prime p of the form $p = 2^m + 1$ is called a Fermat prime. Let p be a Fermat prime. Prove that every quadratic nonresidue mod p is a primitive root of p. (Hint: How many primitive roots are there? How many residues? Compare.)
CHAPTER
TWENTY-FOUR
RINGS OF POLYNOMIALS
In elementary algebra an important role is played by polynomials in an unknown x. These are expressions such as
, 2
2xJ
\x + 3
whose terms are grouped in powers of x. The exponents, of course, are positive integers and the coefficients are real or complex numbers.
Polynomials are involved in countless applications—applications of every kind and description. For example, polynomial functions are the easiest functions to compute, and therefore one commonly attempts to approximate arbitrary functions by polynomial functions. A great deal of effort has been expended by mathematicians to find ways of achieving this.
Aside from their uses in science and computation, polynomials come up very naturally in the general study of rings, as the following example will show:
Suppose we wish to enlarge the ring Z by adding to it the number $\pi$. It is easy to see that we will have to adjoin to Z other new numbers besides just $\pi$; for the enlarged ring (containing $\pi$ as well as all the integers) will also contain such things as $-\pi$, $\pi + 7$, $6\pi - 11$, and so on.
As a matter of fact, any ring which contains Z as a subring and which also contains the number $\pi$ will have to contain every number of the form
$$a\pi^n + b\pi^{n-1} + \cdots + k\pi + l$$
where a, b, ..., k, l are integers. In other words, it will contain all the polynomial expressions in $\pi$ with integer coefficients.
But the set of all the polynomial expressions in $\pi$ with integer coefficients is a ring. (It is a subring of R because it is obvious that the sum and product of any two polynomials in $\pi$ is again a polynomial in $\pi$.) This ring contains Z because every integer a is a polynomial with a constant term only, and it also contains $\pi$.
Thus, if we wish to enlarge the ring Z by adjoining to it the new number $\pi$, it turns out that the "next largest" ring after Z which contains Z as a subring and includes $\pi$, is exactly the ring of all the polynomials in $\pi$ with coefficients in Z.
As this example shows, aside from their practical applications, polynomials play an important role in the scheme of ring theory because they are precisely what we need when we wish to enlarge a ring by adding new elements to it.
In elementary algebra one considers polynomials whose coefficients are real numbers, or in some cases, complex numbers. As a matter of fact, the properties of polynomials are pretty much independent of the exact nature of their coefficients. All we need to know is that the coefficients are contained in some ring. For convenience, we will assume this ring is a commutative ring with unity.
Let A be a commutative ring with unity. Up to now we have used letters to denote elements or sets, but now we will use the letter x in a different way. In a polynomial expression such as $ax^2 + bx + c$, where $a, b, c \in A$, we do not consider x to be an element of A, but rather x is a symbol which we use in an entirely formal way. Later we will allow the substitution of other things for x, but at present x is simply a placeholder.
Notationally, the terms of a polynomial may be listed in either ascending or descending order. For example, $4x^3 - 3x^2 + x + 1$ and $1 + x - 3x^2 + 4x^3$ denote the same polynomial. In elementary algebra descending order is preferred, but for our purposes ascending order is more convenient.
Let A be a commutative ring with unity, and x an arbitrary symbol. Every expression of the form
$$a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n$$
is called a polynomial in x with coefficients in A, or more simply, a polynomial in x over A. The expressions $a_k x^k$, for $k \in \{1, \ldots, n\}$, are called the terms of the polynomial.
Polynomials in x are designated by symbols such as a(x), b(x), q(x), and so on. If $a(x) = a_0 + a_1 x + \cdots + a_n x^n$ is any polynomial and $a_k x^k$ is any one of its terms, $a_k$ is called the coefficient of $x^k$. By the degree of a polynomial a(x) we mean the greatest n such that the coefficient of $x^n$ is not zero. In other words, if a(x) has degree n, this means that $a_n \ne 0$ but $a_m = 0$ for every m > n. The degree of a(x) is symbolized by
deg a(x)
For example, $1 + 2x - 3x^2 + x^3$ is a polynomial of degree 3.
The polynomial $0 + 0x + 0x^2 + \cdots$ all of whose coefficients are equal to zero is called the zero polynomial, and is symbolized by 0. It is the only polynomial whose degree is not defined (because it has no nonzero coefficient).
If a nonzero polynomial $a(x) = a_0 + a_1 x + \cdots + a_n x^n$ has degree n, then $a_n$ is called its leading coefficient: it is the last nonzero coefficient of a(x). The term $a_n x^n$ is then called its leading term, while $a_0$ is called its constant term.
If a polynomial a(x) has degree zero, this means that its constant term $a_0$ is its only nonzero term: a(x) is a constant polynomial. Beware of confusing a polynomial of degree zero with the zero polynomial.
Two polynomials a(x) and b(x) are equal if they have the same degree and corresponding coefficients are equal. Thus, if $a(x) = a_0 + \cdots + a_n x^n$ is of degree n, and $b(x) = b_0 + \cdots + b_m x^m$ is of degree m, then a(x) = b(x) iff n = m and $a_k = b_k$ for each k from 0 to n.
The familiar sigma notation for sums is useful for polynomials. Thus,
$$a(x) = a_0 + a_1 x + \cdots + a_n x^n = \sum_{k=0}^{n} a_k x^k$$
with the understanding that $x^0 = 1$.
Addition and multiplication of polynomials is familiar from elementary algebra. We will now define these operations formally. Throughout these definitions we let a(x) and b(x) stand for the following polynomials:
$$a(x) = a_0 + a_1 x + \cdots + a_n x^n$$
$$b(x) = b_0 + b_1 x + \cdots + b_n x^n$$
Here we do not assume that a(x) and b(x) have the same degree, but allow ourselves to insert zero coefficients if necessary to achieve uniformity of appearance.
We add polynomials by adding corresponding coefficients. Thus,
$$a(x) + b(x) = (a_0 + b_0) + (a_1 + b_1)x + \cdots + (a_n + b_n)x^n$$
Note that the degree of a(x) + b(x) is less than or equal to the higher of the two degrees, deg a(x) and deg b(x). Multiplication is more difficult, but quite familiar:
$$a(x)b(x) = a_0 b_0 + (a_0 b_1 + b_0 a_1)x + (a_0 b_2 + a_1 b_1 + a_2 b_0)x^2 + \cdots + a_n b_n x^{2n}$$
In other words, the product of a(x) and b(x) is the polynomial
$$c(x) = c_0 + c_1 x + \cdots + c_{2n} x^{2n}$$
whose kth coefficient (for any k from 0 to 2n) is
$$c_k = \sum_{i+j=k} a_i b_j$$
This is the sum of all the $a_i b_j$ for which i + j = k. Note that $\deg[a(x)b(x)] \le \deg a(x) + \deg b(x)$.
If A is any ring, the symbol
$$A[x]$$
designates the set of all the polynomials in x whose coefficients are in A, with addition and multiplication of polynomials as we have just defined them.
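The two definitions translate directly into code if a polynomial is stored as its list of coefficients in ascending order. The Python sketch below is an illustration under that assumption; the names `trim`, `poly_add`, and `poly_mul`, and the optional `modulus` argument (used to work with coefficients in Z_n), are hypothetical choices made for this example. The zero polynomial is represented by the empty list.

```python
def trim(coeffs):
    """Drop trailing zero coefficients, so the last entry is the leading coefficient."""
    coeffs = list(coeffs)
    while coeffs and coeffs[-1] == 0:
        coeffs.pop()
    return coeffs


def poly_add(a, b, modulus=None):
    """Add polynomials given as coefficient lists [a_0, a_1, ..., a_n]."""
    size = max(len(a), len(b))
    a = list(a) + [0] * (size - len(a))  # insert zero coefficients as needed
    b = list(b) + [0] * (size - len(b))
    total = [x + y for x, y in zip(a, b)]
    if modulus is not None:
        total = [c % modulus for c in total]
    return trim(total)


def poly_mul(a, b, modulus=None):
    """Multiply polynomials: the kth coefficient is the sum of a_i * b_j with i + j = k."""
    if not a or not b:
        return []
    product = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            product[i + j] += ai * bj
    if modulus is not None:
        product = [c % modulus for c in product]
    return trim(product)


print(poly_mul([1, 1], [1, 1]))                # [1, 2, 1], i.e. (1 + x)^2 = 1 + 2x + x^2
print(poly_mul([2, 2], [0, 2], modulus=4))     # [], i.e. (2 + 2x)(2x) = 0 in Z_4[x]
```

The second call shows why the degree of a product can be strictly smaller than the sum of the degrees when the coefficient ring has divisors of zero.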
Theorem 1 Let A be a commutative ring with unity. Then A[x] is a commutative ring with unity.
Proof: To prove this theorem, we must show systematically that A[x] satisfies all the axioms of a commutative ring with unity. Throughout the proof, let a(x), b(x), and c(x) stand for the following polynomials:
$$a(x) = a_0 + a_1 x + \cdots + a_n x^n$$
$$b(x) = b_0 + b_1 x + \cdots + b_n x^n$$
and
$$c(x) = c_0 + c_1 x + \cdots + c_n x^n$$
The axioms which involve only addition are easy to check: for example, addition is commutative because
$$a(x) + b(x) = (a_0 + b_0) + (a_1 + b_1)x + \cdots + (a_n + b_n)x^n$$
$$= (b_0 + a_0) + (b_1 + a_1)x + \cdots + (b_n + a_n)x^n$$
$$= b(x) + a(x)$$
The associative law of addition is proved similarly, and is left as an exercise. The zero polynomial has already been described, and the negative of a(x) is
$$-a(x) = (-a_0) + (-a_1)x + \cdots + (-a_n)x^n$$
To prove that multiplication is associative requires some care. Let b(x)c(x) = d(x), where $d(x) = d_0 + d_1 x + \cdots + d_{2n} x^{2n}$. By the definition of polynomial multiplication, the kth coefficient of b(x)c(x) is
$$d_k = \sum_{i+j=k} b_i c_j$$
Then a(x)[b(x)c(x)] = a(x)d(x) = e(x), where $e(x) = e_0 + e_1 x + \cdots + e_{3n} x^{3n}$. Now, the lth coefficient of a(x)d(x) is
$$e_l = \sum_{h+k=l} a_h d_k = \sum_{h+k=l} a_h \left( \sum_{i+j=k} b_i c_j \right)$$
It is easy to see that the sum on the right consists of all the terms $a_h b_i c_j$ such that h + i + j = l. Thus,
$$e_l = \sum_{h+i+j=l} a_h b_i c_j$$
For each l from 0 to 3n, $e_l$ is the lth coefficient of a(x)[b(x)c(x)].
If we repeat this process to find the lth coefficient of [a(x)b(x)]c(x), we discover that it, too, is $e_l$. Thus,
$$a(x)[b(x)c(x)] = [a(x)b(x)]c(x)$$
To prove the distributive law, let a(x)[b(x) + c(x)] = d(x) where $d(x) = d_0 + d_1 x + \cdots + d_{2n} x^{2n}$. By the definitions of polynomial addition and multiplication, the kth coefficient of a(x)[b(x) + c(x)] is
$$d_k = \sum_{i+j=k} a_i (b_j + c_j) = \sum_{i+j=k} (a_i b_j + a_i c_j) = \sum_{i+j=k} a_i b_j + \sum_{i+j=k} a_i c_j$$
But $\sum_{i+j=k} a_i b_j$ is exactly the kth coefficient of a(x)b(x), and $\sum_{i+j=k} a_i c_j$ is the kth coefficient of a(x)c(x), hence $d_k$ is equal to the kth coefficient of a(x)b(x) + a(x)c(x). This proves that
$$a(x)[b(x) + c(x)] = a(x)b(x) + a(x)c(x)$$
The commutative law of multiplication is simple to verify and is left to the student. Finally, the unity polynomial is the constant polynomial 1. ■
Theorem 2 If A is an integral domain, then A[x] is an integral domain.
Proof: If a(x) and b(x) are nonzero polynomials, we must show that their product a(x)b(x) is not zero. Let $a_n$ be the leading coefficient of a(x), and $b_m$ the leading coefficient of b(x). By definition, $a_n \ne 0$ and $b_m \ne 0$. Thus $a_n b_m \ne 0$ because A is an integral domain. It follows that a(x)b(x) has a nonzero coefficient (namely, $a_n b_m$), so it is not the zero polynomial. ■
If A is an integral domain, we refer to A[x] as a domain of polynomials, because A[x] is an integral domain. Note that by the preceding proof, if $a_n$ and $b_m$ are the leading coefficients of a(x) and b(x), then $a_n b_m$ is the leading coefficient of a(x)b(x). Thus, deg a(x)b(x) = n + m: In a domain of polynomials A[x], where A is an integral domain,
$$\deg[a(x) \cdot b(x)] = \deg a(x) + \deg b(x)$$
In the remainder of this chapter we will look at a property of polynomials which is of special interest when all the coefficients lie in a field. Thus, from this point forward, let F be a field, and let us consider polynomials belonging to F[x].
It would be tempting to believe that if F is a field then F[x] also is a field. However, this is not so, for one can easily see that the multiplicative inverse of a polynomial is not generally a polynomial. Nevertheless, by Theorem 2, F[x] is an integral domain.
Domains of polynomials over a field do, however, have a very special property: any polynomial a(x) may be divided by any nonzero polynomial b(x) to yield a quotient q(x) and a remainder r(x). The remainder is either 0, or if not, its degree is less than the degree of the divisor b(x). For example, $x^2$ may be divided by x - 2 to give a quotient of x + 2 and a remainder of 4:
$$\underbrace{x^2}_{a(x)} = \underbrace{(x - 2)}_{b(x)} \underbrace{(x + 2)}_{q(x)} + \underbrace{4}_{r(x)}$$
This kind of polynomial division is familiar to every student of elementary algebra. It is customarily set up as follows:
$$\begin{array}{rll} & x + 2 & \leftarrow \text{Quotient } q(x) \\ \text{Divisor } b(x) \rightarrow \ x - 2 \ \big) & x^2 & \leftarrow \text{Dividend } a(x) \\ & x^2 - 2x & \\ & \qquad 2x & \\ & \qquad 2x - 4 & \\ & \qquad\qquad 4 & \leftarrow \text{Remainder } r(x) \end{array}$$
The process of polynomial division is formalized in the next theorem.
Theorem 3: Division algorithm for polynomials If a(x) and b(x) are polynomials over a field F, and $b(x) \ne 0$, there exist polynomials q(x) and r(x) over F such that
$$a(x) = b(x)q(x) + r(x)$$
and
$$r(x) = 0 \quad \text{or} \quad \deg r(x) < \deg b(x)$$
Proof: Let b(x) remain fixed, and let us show that every polynomial a(x) satisfies the following condition:
There exist polynomials q(x) and r(x) over F such that a(x) = b(x)q(x) + r(x), and r(x) = 0 or deg r(x) < deg b(x).
We will assume there are polynomials a(x) which do not fulfill the condition, and from this assumption we will derive a contradiction. Let a(x) be a polynomial of lowest degree which fails to satisfy the condition. Note that a(x) cannot be zero, because we can express 0 as $0 = b(x) \cdot 0 + 0$, whereby a(x) would satisfy the condition. Furthermore, $\deg a(x) \ge \deg b(x)$, for if deg a(x) < deg b(x) then we could write $a(x) = b(x) \cdot 0 + a(x)$, so again a(x) would satisfy the given condition.
Let $a(x) = a_0 + \cdots + a_n x^n$ and $b(x) = b_0 + \cdots + b_m x^m$. Define a new polynomial
$$A(x) = a(x) - \frac{a_n}{b_m} x^{n-m} b(x) \qquad (1)$$
$$= a(x) - \left( b_0 \frac{a_n}{b_m} x^{n-m} + b_1 \frac{a_n}{b_m} x^{n-m+1} + \cdots + a_n x^n \right)$$
This expression is the difference of two polynomials both of degree n and both having the same leading term $a_n x^n$. Because $a_n x^n$ cancels in the subtraction, A(x) has degree less than n.
Remember that a(x) is a polynomial of least degree which fails to satisfy the given condition; hence A(x) does satisfy it. This means there are polynomials p(x) and r(x) such that
$$A(x) = b(x)p(x) + r(x)$$
where r(x) = 0 or deg r(x) < deg b(x). But then
$$a(x) = A(x) + \frac{a_n}{b_m} x^{n-m} b(x) \qquad \text{by Equation (1)}$$
$$= b(x)p(x) + r(x) + \frac{a_n}{b_m} x^{n-m} b(x)$$
$$= b(x) \left[ p(x) + \frac{a_n}{b_m} x^{n-m} \right] + r(x)$$
If we let $p(x) + (a_n/b_m)x^{n-m}$ be renamed q(x), then a(x) = b(x)q(x) + r(x), so a(x) fulfills the given condition. This is a contradiction, as required. ■
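The key step of the proof, subtracting $(a_n/b_m)x^{n-m}b(x)$ to cancel the leading term, is exactly one step of long division, and repeating it until the degree drops below deg b(x) yields q(x) and r(x). The Python sketch below carries this out over the field of rational numbers, using `fractions.Fraction` for exact arithmetic. Polynomials are coefficient lists in ascending order, and the names `strip` and `poly_divmod` are assumptions made for this illustration.

```python
from fractions import Fraction


def strip(p):
    """Convert coefficients to Fractions and drop trailing zeros."""
    p = [Fraction(c) for c in p]
    while p and p[-1] == 0:
        p.pop()
    return p


def poly_divmod(a, b):
    """Return (q, r) with a = b*q + r and r = 0 or deg r < deg b, over Q."""
    b = strip(b)
    if not b:
        raise ZeroDivisionError("b(x) must be a nonzero polynomial")
    r = strip(a)
    q = [Fraction(0)] * max(len(r) - len(b) + 1, 0)
    while len(r) >= len(b):
        shift = len(r) - len(b)      # this is n - m in the proof
        factor = r[-1] / b[-1]       # this is a_n / b_m in the proof
        q[shift] = factor
        for i, coeff in enumerate(b):
            r[shift + i] -= factor * coeff  # subtract factor * x^shift * b(x)
        r = strip(r)                 # the leading term has cancelled
    return q, r


q, r = poly_divmod([0, 0, 1], [-2, 1])  # divide x^2 by x - 2
print(q == [2, 1], r == [4])            # True True: quotient x + 2, remainder 4
```

The division by `b[-1]` is the only place where the field property is used, which matches the role of $a_n/b_m$ in Equation (1).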
EXERCISES
A. Elementary Computation in Domains of Polynomials
Remark on notation: In some of the problems which follow, we consider polynomials with coefficients in $\mathbb{Z}_n$ for various n. To simplify notation, we denote the elements of $\mathbb{Z}_n$ by 1, 2, ..., n - 1 rather than the more correct $\bar{1}, \bar{2}, \ldots, \overline{n-1}$.
# 1 Let $a(x) = 2x^2 + 3x + 1$ and $b(x) = x^3 + 5x^2 + x$. Compute a(x) + b(x), a(x) - b(x) and a(x)b(x) in $\mathbb{Z}[x]$, $\mathbb{Z}_5[x]$, $\mathbb{Z}_6[x]$, and $\mathbb{Z}_7[x]$.
2 Find the quotient and remainder when $x^3 + x^2 + x + 1$ is divided by $x^2 + 3x + 2$ in $\mathbb{Z}[x]$ and in $\mathbb{Z}_5[x]$.
3 Find the quotient and remainder when $x^3 + 2$ is divided by $2x^2 + 3x + 4$ in $\mathbb{Z}[x]$, in $\mathbb{Z}_3[x]$, and in $\mathbb{Z}_5[x]$.
We call b(x) a factor of a(x) if a(x) = b(x)q(x) for some q(x), that is, if the remainder when a(x) is divided by b(x) is equal to zero.
4 Show that the following is true in A[x] for any ring A: For any odd n,
(a) x + 1 is a factor of $x^n + 1$.
(b) x + 1 is a factor of $x^n + x^{n-1} + \cdots + x + 1$.
5 Prove the following: In $\mathbb{Z}_3[x]$, x + 2 is a factor of $x^m + 2$, for all m. In $\mathbb{Z}_n[x]$, x + (n - 1) is a factor of $x^m + (n - 1)$, for all m and n.
6 Prove that there is no integer m such that $3x^2 + 4x + m$ is a factor of $6x^4 + 50$ in $\mathbb{Z}[x]$.
7 For what values of n is $x^2 + 1$ a factor of $x^5 + 5x + 6$ in $\mathbb{Z}_n[x]$?
B. Problems Involving Concepts and Definitions
1 Is x* + 1 = x3 + 1 in Zs[jc]? Explain your answer.
2 Is there any ring A such that in A[x], some polynomial of degree 2 is equal to a polynomial of degree 4? Explain.
# 3 Write all the quadratic polynomials in $\mathbb{Z}_3[x]$. How many are there? How many cubic polynomials are there in $\mathbb{Z}_5[x]$? More generally, how many polynomials of degree m are there in $\mathbb{Z}_n[x]$?
4 Let A be an integral domain; prove the following:
If $(x + 1)^2 = x^2 + 1$ in A[x], then A must have characteristic 2. If $(x + 1)^4 = x^4 + 1$ in A[x], then A must have characteristic 2. If $(x + 1)^6 = x^6 + 2x^3 + 1$ in A[x], then A must have characteristic 3.
5 Find an example of each of the following in Za[x]: a divisor of zero, an invertible element. (Find nonconstant examples.)
6 Explain why x cannot be invertible in any A[x], hence no domain of polynomials can ever be a field.
7 There are rings such as $P_3$ in which every element $\ne 0, 1$ is a divisor of zero. Explain why this cannot happen in any ring of polynomials A[x], even when A is not an integral domain.
8 Show that in every A[x], there are elements $\ne 0, 1$ which are not idempotent, and elements $\ne 0, 1$ which are not nilpotent.
C. Rings A[x] Where A Is Not an Integral Domain
1 Prove: If A is not an integral domain, neither is A[x].
2 Give examples of divisors of zero, of degrees 0, 1, and 2, in $\mathbb{Z}_4[x]$.
3 In $\mathbb{Z}_{10}[x]$, $(2x + 2)(2x + 2) = (2x + 2)(5x^2 + 2x + 2)$, yet (2x + 2) cannot be canceled in this equation. Explain why this is possible in $\mathbb{Z}_{10}[x]$, but not in $\mathbb{Z}_5[x]$.
4 Give examples in Z_4[x], in Z_6[x], and in Z_9[x] of polynomials a(x) and b(x) such that deg a(x)b(x) < deg a(x) + deg b(x).
5 If A is an integral domain, we have seen that in A[x],
deg a(x)b(x) = deg a(x) + deg b(x)
Show that if A is not an integral domain, we can always find polynomials a(x) and b(x) such that deg a(x)b(x) < deg a(x) + deg b(x).
6 Show that if A is an integral domain, the only invertible elements in A[x] are the constant polynomials with inverses in A. Then show that in Z_4[x] there are invertible polynomials of all degrees.
# 7 Give all the ways of factoring x^2 into polynomials of degree 1 in Z_9[x]; in Z_5[x]. Explain the difference in behavior.
8 Find all the square roots of x^2 + x + 4 in Z_5[x]. Show that in Z_8[x], there are infinitely many square roots of 1.
D. Domains A[x] Where A Has Finite Characteristic
In each of the following, let A be an integral domain:
1 Prove that if A has characteristic p, then A[x] has characteristic p.
2 Use part 1 to give an example of an infinite integral domain with finite characteristic.
3 Prove: If A has characteristic 3, then x + 2 is a factor of x^m + 2 for all m. More generally, if A has characteristic p, then x + (p - 1) is a factor of x^m + (p - 1) for all m.
4 Prove that if A has characteristic p, then in A[x], (x + c)^p = x^p + c^p. (You may use essentially the same argument as in the proof of Theorem 3, Chapter 20.)
5 Explain why the following "proof" of part 4 is not valid: (x + c)^p = x^p + c^p in A[x] because (a + c)^p = a^p + c^p for all a, c ∈ A. (Note the following example: in Z_2, a^2 + 1 = a^4 + 1 for every a, yet x^2 + 1 ≠ x^4 + 1 in Z_2[x].)
# 6 Use the same argument as in part 4 to prove that if A has characteristic p, then [a(x) + b(x)]^p = a(x)^p + b(x)^p for any a(x), b(x) ∈ A[x]. Use this to prove:
(a_0 + a_1 x + ··· + a_n x^n)^p = a_0^p + a_1^p x^p + ··· + a_n^p x^{np}
E. Subrings and Ideals in A[x]
1 Show that if B is a subring of A, then B[x] is a subring of A[x].
2 If B is an ideal of A, B[x] is an ideal of A[x].
3 Let S be the set of all the polynomials a(x) in A[x] for which every coefficient a_i for odd i is equal to zero. Show that S is a subring of A[x]. Why is the same not true when "odd" is replaced by "even"?
4 Let J consist of all the elements in A[x] whose constant coefficient is equal to zero. Prove that J is an ideal of A[x].
# 5 Let J consist of all the polynomials a_0 + a_1 x + ··· + a_n x^n in A[x] such that a_0 + a_1 + ··· + a_n = 0. Prove that J is an ideal of A[x].
6 Prove that the ideals in both parts 4 and 5 are prime ideals. (Assume A is an integral domain.)
F. Homomorphisms of Domains of Polynomials
Let A be an integral domain.
1 Let h : A[x] → A map every polynomial to its constant coefficient; that is,
h(a_0 + a_1 x + ··· + a_n x^n) = a_0
Prove that h is a homomorphism from A[x] onto A, and describe its kernel.
2 Explain why the kernel of h in part 1 consists of all the products xa(x), for all a(x) ∈ A[x]. Why is this the same as the principal ideal (x) in A[x]?
3 Using parts 1 and 2, explain why A[x]/(x) ≅ A.
4 Let g : A[x] → A send every polynomial to the sum of its coefficients. Prove that g is a surjective homomorphism, and describe its kernel.
5 If c ∈ A, let h : A[x] → A[x] be defined by h(a(x)) = a(cx), that is,
h(a_0 + a_1 x + ··· + a_n x^n) = a_0 + a_1 cx + a_2 c^2 x^2 + ··· + a_n c^n x^n
Prove that h is a homomorphism and describe its kernel.
6 If h is the homomorphism of part 5, prove that h is an automorphism (isomorphism from A[x] to itself) iff c is invertible.
G. Homomorphisms of Polynomial Domains Induced by a Homomorphism of the Ring of Coefficients
Let A and B be rings and let h : A → B be a homomorphism with kernel K. Define h̄ : A[x] → B[x] by
h̄(a_0 + a_1 x + ··· + a_n x^n) = h(a_0) + h(a_1)x + ··· + h(a_n)x^n
(We say that h̄ is induced by h.)
1 Prove that h̄ is a homomorphism from A[x] to B[x].
2 Describe the kernel K̄ of h̄.
# 3 Prove that h̄ is surjective iff h is surjective.
4 Prove that h̄ is injective iff h is injective.
5 Prove that if a(x) is a factor of b(x), then h̄(a(x)) is a factor of h̄(b(x)).
6 If h : Z → Z_n is the natural homomorphism, let h̄ : Z[x] → Z_n[x] be the homomorphism induced by h. Prove that h̄(a(x)) = 0 iff n divides every coefficient of a(x).
7 Let h̄ be as in part 6, and let n be a prime. Prove that if a(x)b(x) ∈ ker h̄, then either a(x) or b(x) is in ker h̄. (Hint: Use Exercise F2 of Chapter 19.)
H. Polynomials in Several Variables
A[x_1, x_2] denotes the ring of all the polynomials in two letters x_1 and x_2 with coefficients in A. For example, x^2 - 2xy + y^2 + x - 5 is a quadratic polynomial in Q[x, y]. More generally, A[x_1, ..., x_n] is the ring of the polynomials in n letters x_1, ..., x_n with coefficients in A. Formally it is defined as follows: Let A[x_1] be denoted by A_1; then A_1[x_2] is A[x_1, x_2]. Continuing in this fashion, we may adjoin one new letter x_i at a time, to get A[x_1, ..., x_n].
1 Prove that if A is an integral domain, then A[x_1, ..., x_n] is an integral domain.
2 Give a reasonable definition of the degree of any polynomial p(x, y) in A[x, y] and then list all the polynomials of degree ≤ 3 in Z,[x, y].
Let us denote an arbitrary polynomial p(x, y) in A[x, y] by Σ a_ij x^i y^j where Σ ranges over some pairs i, j of nonnegative integers.
3 Imitating the definitions of sum and product of polynomials in A[x], give a definition of sum and product of polynomials in A[x, y].
4 Prove that deg a(x, y)b(x, y) = deg a(x, y) + deg b(x, y) if A is an integral domain.
I. Fields of Polynomial Quotients
Let A be an integral domain. By the closing part of Chapter 20, every integral domain can be extended to a "field of quotients." Thus, A[x] can be extended to a field of polynomial quotients, which is denoted by A(x). Note that A(x) consists of all the fractions a(x)/b(x) for a(x) and b(x) ≠ 0 in A[x], and these fractions are added, subtracted, multiplied, and divided in the customary way.
1 Show that A(x) has the same characteristic as A.
2 Using part 1, explain why there is an infinite field of characteristic p, for every prime p.
3 If A and B are integral domains and h : A → B is an isomorphism, prove that h determines an isomorphism h̄ : A(x) → B(x).
J. Division Algorithm: Uniqueness of Quotient and Remainder
In the division algorithm, prove that q(x) and r(x) are uniquely determined. [Hint: Suppose a(x) = b(x)q_1(x) + r_1(x) = b(x)q_2(x) + r_2(x), and subtract these two expressions, which are both equal to a(x).]
CHAPTER TWENTY-FIVE
FACTORING POLYNOMIALS
Just as every integer can be factored into primes, so every polynomial can be factored into "irreducible" polynomials which cannot be factored further. As a matter of fact, polynomials behave very much like integers when it comes to factoring them. This is especially true when the polynomials have all their coefficients in a field.
Throughout this chapter, we let F represent some field and we consider polynomials over F. It will be found that F[x] has a considerable number of properties in common with Z. To begin with, all the ideals of F[x] are principal ideals, which was also the case for the ideals of Z.
Note carefully that in F[x], the principal ideal generated by a polynomial a(x) consists of all the products a(x)s(x) as a(x) remains fixed and s(x) ranges over all the members of F[x].
Theorem 1 Every ideal of F[x] is principal.
Proof: Let J be any ideal of F[x]. If J contains nothing but the zero polynomial, J is the principal ideal generated by 0. If there are nonzero polynomials in J, let b(x) be any nonzero polynomial of lowest degree in J. We will show that J = (b(x)), which is to say that every element of J is a polynomial multiple b(x)q(x) of b(x).
Indeed, if a(x) is any element of J, we may use the division algorithm to write a(x) = b(x)q(x) + r(x), where r(x) = 0 or deg r(x) < deg b(x). Now, r(x) = a(x) - b(x)q(x); but a(x) was chosen in J, and b(x) ∈ J; hence b(x)q(x) ∈ J. It follows that r(x) is in J.
If r(x) ≠ 0, its degree is less than the degree of b(x). But this is impossible because b(x) is a polynomial of lowest degree in J. Therefore, of necessity, r(x) = 0.
Thus, finally, a(x) = b(x)q(x); so every member of J is a multiple of b(x), as claimed. ■
It follows that every ideal J of F[x] is principal. In fact, as the proof above indicates, J is generated by any one of its members of lowest degree.
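The proof of Theorem 1 leans entirely on the division algorithm. As a rough illustration only, here is a minimal Python sketch of that algorithm over F = Q. The function name poly_divmod and the convention of storing coefficients lowest degree first are assumptions made for this example, not notation from the text.

```python
from fractions import Fraction

def poly_divmod(a, b):
    """Divide a(x) by b(x) over Q.

    Polynomials are coefficient lists, lowest degree first
    (an illustrative convention). Returns (q, r) with
    a = b*q + r and r == [] (zero) or deg r < deg b.
    """
    b = [Fraction(c) for c in b]
    while b and b[-1] == 0:
        b.pop()                      # drop zero leading terms
    if not b:
        raise ZeroDivisionError("division by the zero polynomial")
    r = [Fraction(c) for c in a]
    while r and r[-1] == 0:
        r.pop()
    q = [Fraction(0)] * max(len(r) - len(b) + 1, 0)
    while len(r) >= len(b):
        shift = len(r) - len(b)
        coeff = r[-1] / b[-1]        # possible because F is a field
        q[shift] = coeff
        for i, bc in enumerate(b):
            r[i + shift] -= coeff * bc
        while r and r[-1] == 0:
            r.pop()                  # the leading term has cancelled
    return q, r

# x^3 + 2 divided by 2x^2 + 3x + 4 over Q
q, r = poly_divmod([2, 0, 0, 1], [4, 3, 2])
```

Here q represents x/2 - 3/4 and r represents x/4 + 5. The step `r[-1] / b[-1]` is the point at which the field hypothesis is used: the leading coefficient of b(x) must be invertible.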
Throughout the discussion which follows, remember that we are considering polynomials in a fixed domain F[x] where F is a field.
Let a(x) and b(x) be in F[x]. We say that b(x) is a multiple of a(x) if
b(x) = a(x)s(x)
for some polynomial s(x) in F[x]. If b(x) is a multiple of a(x), we also say that a(x) is a factor of b(x), or that a(x) divides b(x). In symbols, we write
a(x) | b(x)
Every nonzero constant polynomial divides every polynomial. For if c ≠ 0 is constant and a(x) = a_0 + ··· + a_n x^n, then
a_0 + a_1 x + ··· + a_n x^n = c(a_0/c + (a_1/c)x + ··· + (a_n/c)x^n)
hence c | a(x). A polynomial a(x) is invertible iff it is a divisor of the unity polynomial 1. But if a(x)b(x) = 1, this means that a(x) and b(x) both have degree 0, that is, are constant polynomials: a(x) = a, b(x) = b, and ab = 1. Thus,
the invertible elements of F[x] are all the nonzero constant polynomials.
A pair of nonzero polynomials a(x) and b(x) are called associates if they divide one another: a(x) | b(x) and b(x) | a(x). That is to say,
a(x) = b(x)c(x) and b(x) = a(x)d(x)
for some c(x) and d(x). If this happens to be the case, then
a(x) = b(x)c(x) = a(x)d(x)c(x)
hence d(x)c(x) = 1 because F[x] is an integral domain. But then c(x) and d(x) are constant polynomials, and therefore a(x) and b(x) are constant multiples of each other. Thus, in F[x],
a(x) and b(x) are associates iff they are constant multiples of each other.
If a(x) = a_0 + ··· + a_n x^n, the associates of a(x) are all its nonzero constant multiples. Among these multiples is the polynomial
a_0/a_n + (a_1/a_n)x + ··· + x^n
which is equal to (1/a_n)a(x), and which has 1 as its leading coefficient. Any polynomial whose leading coefficient is equal to 1 is called monic. Thus, every nonzero polynomial a(x) has a unique monic associate. For example, the monic associate of 3 + 4x + 2x^2 is 3/2 + 2x + x^2.
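Computing a monic associate amounts to dividing every coefficient by the leading one. The short Python sketch below does this over Q. The name monic_associate and the lowest-degree-first coefficient lists are illustrative assumptions, not the book's notation.

```python
from fractions import Fraction

def monic_associate(a):
    """Return (1/a_n) * a(x) for a nonzero polynomial a(x) over Q.

    Coefficients are listed lowest degree first (illustrative convention).
    """
    coeffs = [Fraction(c) for c in a]
    while coeffs and coeffs[-1] == 0:
        coeffs.pop()                 # ignore zero leading terms
    if not coeffs:
        raise ValueError("the zero polynomial has no monic associate")
    lead = coeffs[-1]
    return [c / lead for c in coeffs]

# 3 + 4x + 2x^2  ->  3/2 + 2x + x^2
result = monic_associate([3, 4, 2])
```

Two nonzero polynomials are associates exactly when this function returns the same list for both, which gives a quick way to test the relation in small examples.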
A polynomial d(x) is called a greatest common divisor of a(x) and b(x) if d(x) divides a(x) and b(x), and is a multiple of any other common divisor of a(x) and b(x); in other words,
(i) d(x) | a(x) and d(x) | b(x), and
(ii) For any u(x) in F[x], if u(x) | a(x) and u(x) | b(x), then u(x) | d(x).
According to this definition, two different gcd's of a(x) and b(x) divide each other, that is, are associates. Of all the possible gcd's of a(x) and b(x), we select the monic one, call it the gcd of a(x) and b(x), and denote it by gcd[a(x), b(x)].
It is important to know that any pair of polynomials always has a greatest common divisor.
Theorem 2 Any two nonzero polynomials a(x) and b(x) in F[x] have a gcd d(x). Furthermore, d(x) can be expressed as a "linear combination"
d(x) = r(x)a(x) + s(x)b(x)
where r(x) and s(x) are in F[x].
Proof: The proof is analogous to the proof of the corresponding theorem for integers. If J is the set of all the linear combinations
u(x)a(x) + v(x)b(x)
as u(x) and v(x) range over F[x], then J is an ideal of F[x], say the ideal (d(x)) generated by d(x). Now a(x) = 1a(x) + 0b(x) and b(x) = 0a(x) + 1b(x), so a(x) and b(x) are in J. But every element of J is a multiple of d(x), so
d(x) | a(x) and d(x) | b(x)
If k(x) is any common divisor of a(x) and b(x), this means there are polynomials f(x) and g(x) such that a(x) = k(x)f(x) and b(x) = k(x)g(x). Now, d(x) ∈ J, so d(x) can be written as a linear combination
d(x) = r(x)a(x) + s(x)b(x)
= r(x)k(x)f(x) + s(x)k(x)g(x)
= k(x)[r(x)f(x) + s(x)g(x)]
hence k(x) | d(x). This confirms that d(x) is the gcd of a(x) and b(x). ■
Polynomials a(x) and b(x) in F[x] are said to be relatively prime if their gcd is equal to 1. (This is equivalent to saying that their only common factors are constants in F.)
A polynomial a(x) of positive degree is said to be reducible over F if there are polynomials b(x) and c(x) in F[x], both of positive degree, such that
a(x) = b(x)c(x)
Because b(x) and c(x) both have positive degrees, and the sum of their degrees is deg a(x), each has degree less than deg a(x).
A polynomial p(x) of positive degree in F[x] is said to be irreducible over F if it cannot be expressed as the product of two polynomials of positive degree in F[x]. Thus, p(x) is irreducible iff it is not reducible.
When we say that a polynomial p(x) is irreducible, it is important that we specify irreducible over the field F. A polynomial may be irreducible over F, yet reducible over a larger field E. For example, p(x) = x^2 + 1 is irreducible over R; but over C it has factors (x + i)(x - i).
We next state the analogs for polynomials of Euclid's lemma and its corollaries. The proofs are almost identical to their counterparts in Z; therefore they are left as exercises.
Euclid's lemma for polynomials Let p(x) be irreducible. If p(x) | a(x)b(x), then p(x) | a(x) or p(x) | b(x).
Corollary 1 Let p(x) be irreducible. If p(x) | a_1(x)a_2(x) ··· a_n(x), then p(x) | a_i(x) for one of the factors a_i(x) among a_1(x), ..., a_n(x).
Corollary 2 Let q_1(x), ..., q_r(x) and p(x) be monic irreducible polynomials. If p(x) | q_1(x) ··· q_r(x), then p(x) is equal to one of the factors q_1(x), ..., q_r(x).
Theorem 3: Factorization into irreducible polynomials Every polynomial a(x) of positive degree in F[x] can be written as a product
a(x) = k p_1(x)p_2(x) ··· p_r(x)
where k is a constant in F and p_1(x), ..., p_r(x) are monic irreducible polynomials of F[x].
If this were not true, we could choose a polynomial a(x) of lowest degree among those which cannot be factored into irreducibles. Then a(x) is reducible, so a(x) = b(x)c(x) where b(x) and c(x) have lower degree than a(x). But this means that b(x) and c(x) can be factored into irreducibles, and therefore a(x) can also.
Theorem 4: Unique factorization If a(x) can be written in two ways as a product of monic irreducibles, say
a(x) = k p_1(x) ··· p_r(x) = l q_1(x) ··· q_s(x)
then k = l, r = s, and each p_i(x) is equal to a q_j(x).
The proof is the same, in all major respects, as the corresponding proof for Z; it is left as an exercise.
In the next chapter we will be able to improve somewhat on the last two results in the special cases of R[x] and C[x]. Also, we will learn more about factoring polynomials into irreducibles.
EXERCISES
A. Examples of Factoring into Irreducible Factors
1 Factor x* - 4 into irreducible factors over Q, over R, and over C.
2 Factor x" - 16 into irreducible factors over Q, over R, and over C.
3 Find all the irreducible polynomials of degree ≤ 4 in Z_2[x].
# 4 Show that x^2 + 2 is irreducible in Z_5[x]. Then factor x^4 - 4 into irreducible factors in Z_5[x]. (By Theorem 3, it is sufficient to search for monic factors.)
5 Factor 2x^3 + 4x + 1 in Z,[x]. (Factor it as in Theorem 3.)
6 In Z_6[x], factor each of the following into two polynomials of degree 1: x, x + 2, x + 3. Why is this possible?
B. Short Questions Relating to Irreducible Polynomials
Let F be a field. Explain why each of the following is true in F[x]:
1 Every polynomial of degree 1 is irreducible.
2 If a(x) and b(x) are distinct monic polynomials, they cannot be associates.
3 Any two distinct irreducible polynomials are relatively prime.
4 If a(x) is irreducible, any associate of a(x) is irreducible.
5 If a(x) ≠ 0, a(x) cannot be an associate of 0.
6 In Z_p[x], every nonzero polynomial has exactly p - 1 associates.
7 x^2 + 1 is reducible in Z_p[x] iff p = a + b where ab ≡ 1 (mod p).
C. Number of Irreducible Quadratics over a Finite Field
1 Without finding them, determine how many reducible monic quadratics there are in Z_5[x]. [Hint: Every reducible monic quadratic can be uniquely factored as (x + a)(x + b).]
2 How many reducible quadratics are there in Z_5[x]? How many irreducible quadratics?
3 Generalize: How many irreducible quadratics are there over a finite field of n elements?
4 How many irreducible cubics are there over a field of n elements?
D. Ideals in Domains of Polynomials
Let F be a field, and let J designate any ideal of F[x]. Prove parts 1-4.
1 Any two generators of J are associates.
2 J has a unique monic generator m(x). An arbitrary polynomial a(x) ∈ F[x] is in J iff m(x) | a(x).
3 J is a prime ideal iff it has an irreducible generator.
# 4 If p(x) is irreducible, then ( p(x)) is a maximal ideal of F[x]. (See Chapter 18, Exercise H5.)
5 Let S be the set of all polynomials a_0 + a_1 x + ··· + a_n x^n in F[x] which satisfy a_0 + a_1 + ··· + a_n = 0. It has been shown (Chapter 24, Exercise E5) that S is an ideal of F[x]. Prove that x - 1 ∈ S, and explain why it follows that S = (x - 1).
6 Conclude from part 5 that F[x]/(x - 1) ≅ F. (See Chapter 24, Exercise F4.)
7 Let F[x, y] denote the domain of all the polynomials Σ a_ij x^i y^j in two letters x and y, with coefficients in F. Let J be the ideal of F[x, y] which contains all the polynomials whose constant coefficient is zero. Prove that J is not a principal ideal. Conclude that Theorem 1 is not true in F[x, y].
E. Proof of the Unique Factorization Theorem
1 Prove Euclid's lemma for polynomials.
2 Prove the two corollaries of Euclid's lemma.
3 Prove the unique factorization theorem for polynomials.
F. A Method for Computing the gcd
Let a(x) and b(x) be polynomials of positive degree. By the division algorithm, we may divide a(x) by b(x):
a(x) = b(x)q_1(x) + r_1(x)
1 Prove that every common divisor of a(x) and b(x) is a common divisor of b(x) and r_1(x).
It follows from part 1 that the gcd of a(x) and b(x) is the same as the gcd of b(x) and r_1(x). This procedure can now be repeated on b(x) and r_1(x); divide b(x) by r_1(x):
b(x) = r_1(x)q_2(x) + r_2(x)
Next
r_1(x) = r_2(x)q_3(x) + r_3(x)
Finally,
r_{n-1}(x) = r_n(x)q_{n+1}(x) + 0
In other words, we continue to divide each remainder by the succeeding remainder. Since the remainders continually decrease in degree, there must ultimately be a zero remainder. But we have seen that
gcd[a(x), b(x)] = gcd[b(x), r_1(x)] = ··· = gcd[r_{n-1}(x), r_n(x)]
Since r_n(x) is a divisor of r_{n-1}(x), it must be the gcd of r_n(x) and r_{n-1}(x). Thus,
r_n(x) = gcd[a(x), b(x)]
This method is called the euclidean algorithm for finding the gcd.
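The repeated-division procedure just described can be sketched in a few lines of Python. The version below works in Z_p[x] for a prime p and normalizes the last nonzero remainder to its monic associate. The helper names trim, poly_mod, and poly_gcd, and the lowest-degree-first coefficient lists, are assumptions made for this illustration. The three-argument pow with exponent -1 requires Python 3.8 or later.

```python
def trim(a):
    """Copy of a with zero leading coefficients removed."""
    a = a[:]
    while a and a[-1] == 0:
        a.pop()
    return a

def poly_mod(a, b, p):
    """Remainder of a(x) divided by b(x) in Z_p[x]; p prime, b nonzero."""
    a = trim([c % p for c in a])
    b = trim([c % p for c in b])
    inv_lead = pow(b[-1], -1, p)     # inverse of the leading coefficient mod p
    while len(a) >= len(b):
        shift = len(a) - len(b)
        coeff = (a[-1] * inv_lead) % p
        for i, bc in enumerate(b):
            a[i + shift] = (a[i + shift] - coeff * bc) % p
        a = trim(a)
    return a

def poly_gcd(a, b, p):
    """Monic gcd of nonzero a(x), b(x) in Z_p[x] by the euclidean algorithm."""
    a = trim([c % p for c in a])
    b = trim([c % p for c in b])
    while b:                         # stop at the first zero remainder
        a, b = b, poly_mod(a, b, p)
    inv_lead = pow(a[-1], -1, p)
    return [(c * inv_lead) % p for c in a]

# gcd of x^2 - 1 and x^2 + 3x + 2 in Z_5[x]; both are divisible by x + 1
g = poly_gcd([4, 0, 1], [2, 3, 1], 5)
```

In this example g is [1, 1], that is, x + 1. Keeping track of the quotients as well would yield the coefficients r(x) and s(x) of the linear combination, which this sketch omits for brevity.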
# 2 Find the gcd of x^3 + 1 and x^4 + x^3 + 2x^2 + x - 1. Express this gcd as a linear combination of the two polynomials.
3 Do the same for xu - 1 and x15 - 1.
4 Find the gcd of x^3 + x^2 + x + 1 and x^4 + x^3 + 2x^2 + 2x in Z_3[x].
G. A Transformation of F[x]
Let G be the subset of F[x] consisting of all polynomials whose constant term is nonzero. Let h : G → G be defined by
h(a_0 + a_1 x + ··· + a_n x^n) = a_n + a_{n-1} x + ··· + a_0 x^n
Prove parts 1-3:
1 h preserves multiplication, that is, h[a(x)b(x)] = h[a(x)]h[b(x)].
2 h is injective and surjective and h ∘ h = ε.
3 a_0 + a_1 x + ··· + a_n x^n is irreducible iff a_n + a_{n-1} x + ··· + a_0 x^n is irreducible.
4 Let a_0 + a_1 x + ··· + a_n x^n = (b_0 + ··· + b_m x^m)(c_0 + ··· + c_q x^q). Factor
a_n + a_{n-1} x + ··· + a_0 x^n
5 Let a(x) = a_0 + a_1 x + ··· + a_n x^n and d(x) = a_n + a_{n-1} x + ··· + a_0 x^n. If c ∈ F, prove that a(c) = 0 iff d(1/c) = 0.
CHAPTER TWENTY-SIX
SUBSTITUTION IN POLYNOMIALS
Up to now we have treated polynomials as formal expressions. If a(x) is a polynomial over a field F, say
a(x) = a_0 + a_1 x + ··· + a_n x^n
this means that the coefficients a_0, a_1, ..., a_n are elements of the field F, while the letter x is a placeholder which plays no other role than to occupy a given position.
When we dealt with polynomials in elementary algebra, it was quite different. The letter x was called an unknown and was allowed to assume numerical values. This made a(x) into a function having x as its independent variable. Such a function is called a polynomial function.
This chapter is devoted to the study of polynomial functions. We begin with a few careful definitions.
Let a(x) = a_0 + a_1 x + ··· + a_n x^n be a polynomial over F. If c is any element of F, then
a_0 + a_1 c + ··· + a_n c^n
is also an element of F, obtained by substituting c for x in the polynomial a(x). This element is denoted by a(c). Thus,
a(c) = a_0 + a_1 c + ··· + a_n c^n
Since we may substitute any element of F for x, we may regard a(x) as a function from F to F. As such, it is called a polynomial function on F.
The difference between a polynomial and a polynomial function is mainly a difference of viewpoint. Given a(x) with coefficients in F: if x is regarded merely as a placeholder, then a(x) is a polynomial; if x is allowed to assume values in F, then a(x) is a polynomial function. The difference is a small one, and we will not make an issue of it.
If a(x) is a polynomial with coefficients in F, and c is an element of F such that
a(c) = 0
then we call c a root of a(x). For example, 2 is a root of the polynomial 3x^2 + x - 14 ∈ R[x], because 3 · 2^2 + 2 - 14 = 0.
There is an absolutely fundamental connection between roots of a polynomial and factors of that polynomial. This connection is explored in the following pages, beginning with the next theorem:
Let a(x) be a polynomial over a field F.
Theorem 1 c is a root of a(x) iff x - c is a factor of a(x).
Proof: If x - c is a factor of a(x), this means that a(x) = (x - c)q(x) for some q(x). Thus, a(c) = (c - c)q(c) = 0, so c is a root of a(x). Conversely, if c is a root of a(x), we may use the division algorithm to divide a(x) by x - c: a(x) = (x - c)q(x) + r(x). The remainder r(x) is either 0 or a polynomial of lower degree than x - c; but lower degree than x - c means that r(x) is a constant polynomial: r(x) = r. Then
0 = a(c) = (c - c)q(c) + r = 0 + r = r
Thus, r = 0, and therefore x - c is a factor of a(x). ■
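The proof shows in passing that the remainder on dividing a(x) by x - c is the constant a(c). A small Python sketch of division by a linear factor (synthetic division) makes this concrete. The function name divide_by_linear and the lowest-degree-first coefficient lists are assumptions for this example, and the field is taken to be Q.

```python
from fractions import Fraction

def divide_by_linear(a, c):
    """Divide a(x) by (x - c) over Q using synthetic division.

    a lists coefficients lowest degree first and must be nonempty.
    Returns (q, r) with a(x) = (x - c) q(x) + r, where r is the constant a(c).
    """
    c = Fraction(c)
    acc = Fraction(0)
    partial = []
    for coeff in reversed(a):        # process the highest degree first
        acc = acc * c + Fraction(coeff)
        partial.append(acc)
    r = partial.pop()                # the last accumulated value is a(c)
    q = list(reversed(partial))      # remaining values are the quotient
    return q, r

# 3x^2 + x - 14 divided by x - 2: the remainder is 0, so 2 is a root
q, r = divide_by_linear([-14, 1, 3], 2)
```

Here q represents 3x + 7 and r is 0, in agreement with 3x^2 + x - 14 = (x - 2)(3x + 7).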
Theorem 1 tells us that if c is a root of a(x), then x - c is a factor of a(x) (and vice versa). This is easily extended: if c_1 and c_2 are two roots of a(x), then x - c_1 and x - c_2 are two factors of a(x). Similarly, three roots give rise to three factors, four roots to four factors, and so on. This is stated concisely in the next theorem.
Theorem 2 If a(x) has distinct roots c_1, ..., c_m in F, then (x - c_1)(x - c_2) ··· (x - c_m) is a factor of a(x).
Proof: To prove this, let us first make a simple observation: if a polynomial a(x) can be factored, any root of a(x) must be a root of one of its factors. Indeed, if a(x) = s(x)t(x) and a(c) = 0, then s(c)t(c) = 0, and therefore either s(c) = 0 or t(c) = 0.
Let c_1, ..., c_m be distinct roots of a(x). By Theorem 1,
a(x) = (x - c_1)q_1(x)
By our observation in the preceding paragraph, c_2 must be a root of x - c_1 or of q_1(x). It cannot be a root of x - c_1 because c_2 - c_1 ≠ 0; so c_2 is a root of q_1(x). Thus, q_1(x) = (x - c_2)q_2(x), and therefore
a(x) = (x - c_1)(x - c_2)q_2(x)
Repeating this argument for each of the remaining roots gives us our result. ■
An immediate consequence is the following important fact:
Theorem 3 If a(x) has degree n, it has at most n roots.
Proof: If a(x) had n + 1 distinct roots c_1, ..., c_{n+1}, then by Theorem 2, (x - c_1) ··· (x - c_{n+1}) would be a factor of a(x), and the degree of a(x) would therefore be at least n + 1. ■
It was stated earlier in this chapter that the difference between polynomials and polynomial functions is mainly a difference of viewpoint. Mainly, but not entirely! Remember that two polynomials a(x) and b(x) are equal iff corresponding coefficients are equal, whereas two functions a(x) and b(x) are equal iff a(x) = b(x) for every x in their domain. These two notions of equality do not always coincide!
For example, consider the following two polynomials in Z_5[x]:
a(x) = x^5 + 1    b(x) = x - 4
You may check that a(0) = b(0), a(1) = b(1), ..., a(4) = b(4); hence a(x) and b(x) are equal functions from Z_5 to Z_5. But as polynomials, a(x) and b(x) are quite distinct! (They do not even have the same degree.)
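The check suggested above is easy to carry out mechanically. The following Python sketch evaluates both polynomials at every element of Z_5 and then compares their coefficient lists. The helper name evaluate and the lowest-degree-first coefficient lists are illustrative assumptions.

```python
def evaluate(coeffs, c, p):
    """Value of the polynomial at c, computed in Z_p by Horner's rule."""
    value = 0
    for coeff in reversed(coeffs):   # highest degree first
        value = (value * c + coeff) % p
    return value

p = 5
a = [1, 0, 0, 0, 0, 1]               # x^5 + 1
b = [-4, 1]                          # x - 4

# Equal as functions: the values agree at every element of Z_5.
same_function = all(evaluate(a, c, p) == evaluate(b, c, p) for c in range(p))

# Equal as polynomials: corresponding coefficients agree in Z_5.
same_polynomial = [c % p for c in a] == [c % p for c in b]
```

Running this gives same_function equal to True and same_polynomial equal to False, which is exactly the distinction drawn in the text.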
It is reassuring to know that this cannot happen when the field F is infinite. Suppose a(x) and b(x) are polynomials over a field F which has infinitely many elements. If a(x) and b(x) are equal as functions, this means that a(c) = b(c) for every c ∈ F. Define the polynomial d(x) to be the difference of a(x) and b(x): d(x) = a(x) - b(x). Then d(c) = 0 for every c ∈ F. Now, if d(x) were not the zero polynomial, it would be a polynomial (with some finite degree n) having infinitely many roots, and by Theorem 3 this is impossible! Thus, d(x) is the zero polynomial (all its coefficients are equal to zero), and therefore a(x) is the same polynomial as b(x). (They have the same coefficients.)
This tells us that if F is a field with infinitely many elements (such as
Q, R, or C), there is no need to distinguish between polynomials and polynomial functions. The difference is, indeed, just a difference of viewpoint.
POLYNOMIALS OVER Z AND Q
In scientific computation a great many functions can be approximated by polynomials, usually polynomials whose coefficients are integers or rational numbers. Such polynomials are therefore of great practical interest. It is easy to find the rational roots of such polynomials, and to determine if a polynomial over Q is irreducible over Q. We will do these things next.
First, let us make an important observation: Let a(x) be a polynomial with rational coefficients, say
a(x) = k_0/l_0 + (k_1/l_1)x + ··· + (k_n/l_n)x^n
We may factor out 1/(l_0 l_1 ··· l_n) to get
a(x) = [1/(l_0 ··· l_n)] (k_0 l_1 ··· l_n + ··· + k_n l_0 ··· l_{n-1} x^n)
where the polynomial in parentheses is denoted b(x).
The polynomial b(x) has integer coefficients; and since it differs from a(x) only by a constant factor, it has the same roots as a(x). Thus, for every polynomial with rational coefficients, there is a polynomial with integer coefficients having the same roots. Therefore, for the present we will confine our attention to polynomials with integer coefficients. The next theorem makes it easy to find all the rational roots of such polynomials:
Let s/t be a rational number in simplest form (that is, the integers s and t do not have a common factor greater than 1). Let a(x) = a_0 + ··· + a_n x^n be a polynomial with integer coefficients.
Theorem 4 If s/t is a root of a(x), then s | a_0 and t | a_n.
Proof: If s/t is a root of a(x), this means that
a_0 + a_1(s/t) + ··· + a_n(s^n/t^n) = 0
Multiplying both sides of this equation by t^n we get
a_0 t^n + a_1 s t^{n-1} + ··· + a_n s^n = 0   (1)
We may now factor out s from all but the first term to get
-a_0 t^n = s(a_1 t^{n-1} + ··· + a_n s^{n-1})
Thus, s | a_0 t^n; and since s and t have no common factors, s | a_0. Similarly, in Equation (1), we may factor out t from all but the last term to get
t(a_0 t^{n-1} + ··· + a_{n-1} s^{n-1}) = -a_n s^n
Thus, t | a_n s^n; and since s and t have no common factors, t | a_n. ■
As an example of the way Theorem 4 may be used, let us find the rational roots of a(x) = 2x^4 + 7x^3 + 5x^2 + 7x + 3. Any rational root must be a fraction s/t where s is a factor of 3 and t is a factor of 2. The possible roots are therefore ±1, ±3, ±1/2, and ±3/2. Testing each of these numbers by direct substitution into the equation a(x) = 0, we find that -1/2 and -3 are roots.
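The search described in this example can be automated directly from Theorem 4: list the fractions ±s/t with s dividing a_0 and t dividing a_n, then test each by substitution. The Python sketch below does this with exact rational arithmetic. The names divisors and rational_roots, and the lowest-degree-first coefficient lists, are assumptions made for illustration.

```python
from fractions import Fraction

def divisors(n):
    """Positive divisors of a nonzero integer n."""
    n = abs(n)
    return [d for d in range(1, n + 1) if n % d == 0]

def rational_roots(a):
    """Rational roots of a polynomial with integer coefficients.

    a lists coefficients lowest degree first, with a[0] != 0 and a[-1] != 0.
    """
    candidates = set()
    for s in divisors(a[0]):         # s must divide the constant term
        for t in divisors(a[-1]):    # t must divide the leading coefficient
            candidates.add(Fraction(s, t))
            candidates.add(Fraction(-s, t))

    def value(x):
        acc = Fraction(0)
        for coeff in reversed(a):    # Horner's rule
            acc = acc * x + coeff
        return acc

    return sorted(x for x in candidates if value(x) == 0)

# 2x^4 + 7x^3 + 5x^2 + 7x + 3
roots = rational_roots([3, 7, 5, 7, 2])
```

For this polynomial the function returns -3 and -1/2 and nothing else, since the remaining factor x^2 + 1 has no rational roots.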
Before going to the next step in our discussion we note a simple but fairly surprising fact.
Lemma Let a(x) = b(x)c(x), where a(x), b(x), and c(x) have integer coefficients. If a prime number p divides every coefficient of a(x), it either divides every coefficient of b(x) or every coefficient of c(x).
Proof: If this is not the case, let b_r be the first coefficient of b(x) not divisible by p, and let c_t be the first coefficient of c(x) not divisible by p.
Now, a(x) = b(x)c(x), so
a_{r+t} = b_0 c_{r+t} + ··· + b_r c_t + ··· + b_{r+t} c_0
Each term on the right, except b_r c_t, is a product b_i c_j where either i < r or j < t. By our choice of b_r and c_t, if i < r then p | b_i, and if j < t then p | c_j. Thus, p is a factor of every term on the right with the possible exception of b_r c_t, but p is also a factor of a_{r+t}. Thus, p must be a factor of b_r c_t, hence of either b_r or c_t, and this is impossible. ■
We saw (in the discussion immediately preceding Theorem 4) that any polynomial a(x) with rational coefficients has a constant multiple ka(x), with integer coefficients, which has the same roots as a(x). We can go one better; let a(x) ∈ Z[x].
Theorem 5 Suppose a(x) can be factored as a(x) = b(x)c(x), where b(x) and c(x) have rational coefficients. Then there are polynomials B(x) and C(x) with integer coefficients, which are constant multiples of b(x) and c(x), respectively, such that a(x) = B(x)C(x).
Proof: Let k and l be integers such that kb(x) and lc(x) have integer coefficients. Then kla(x) = [kb(x)][lc(x)]. By the lemma, each prime factor of kl may now be canceled with a factor of either kb(x) or lc(x). ■
Remember that a polynomial a(x) of positive degree is said to be reducible over F if there are polynomials b(x) and c(x) in F[x], both of positive degree, such that a(x) = b{x)c(x). If there are no such polynomials, then a(x) is irreducible over F.
If we use this terminology, Theorem 5 states that any polynomial with integer coefficients which is reducible over Q is reducible already over Z.
In Chapter 25 we saw that every polynomial can be factored into irreducible polynomials. In order to factor a polynomial completely (that is, into irreducibles), we must be able to recognize an irreducible polynomial when we see one! This is not always an easy matter. But there is a method which works remarkably well for recognizing when a polynomial is irreducible over Q:
Theorem 6: Eisenstein's irreducibility criterion Let
a(x) = a_0 + a_1 x + ··· + a_n x^n
be a polynomial with integer coefficients. Suppose there is a prime number p which divides every coefficient of a(x) except the leading coefficient a_n; suppose p does not divide a_n and p^2 does not divide a_0. Then a(x) is irreducible over Q.
Proof: If a(x) can be factored over Q as a(x) = b(x)c(x), then by Theorem 5 we may assume b(x) and c(x) have integer coefficients: say
b(x) = b_0 + ··· + b_k x^k and c(x) = c_0 + ··· + c_m x^m
Now, a_0 = b_0 c_0; p divides a_0 but p^2 does not, so only one of b_0, c_0 is divisible by p. Say p | c_0 and p ∤ b_0. Next, a_n = b_k c_m and p ∤ a_n, so p ∤ c_m. Let s be the smallest integer such that p ∤ c_s. We have
a_s = b_0 c_s + b_1 c_{s-1} + ··· + b_s c_0
and by our choice of c_s, every term on the right except b_0 c_s is divisible by p. But a_s also is divisible by p (because s ≤ m < n), and therefore b_0 c_s must be divisible by p. This is impossible because p ∤ b_0 and p ∤ c_s. Thus, a(x) cannot be factored. ■
For example, x^3 + 2x^2 + 4x + 2 is irreducible over Q because p = 2 satisfies the conditions of Eisenstein's criterion.
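The three hypotheses of Eisenstein's criterion are simple divisibility checks, so they are easy to test by machine. The Python sketch below does so for a given prime. The function name eisenstein_applies and the lowest-degree-first coefficient lists are assumptions for this example, and the primality of p is taken on trust rather than verified.

```python
def eisenstein_applies(a, p):
    """Check the hypotheses of Eisenstein's criterion for the prime p.

    a lists integer coefficients lowest degree first.
    True means a(x) is irreducible over Q. False means only that the
    criterion does not apply with this p; it says nothing about reducibility.
    """
    lead, const, lower = a[-1], a[0], a[:-1]
    return (
        all(c % p == 0 for c in lower)   # p divides every non-leading coefficient
        and lead % p != 0                # p does not divide a_n
        and const % (p * p) != 0         # p^2 does not divide a_0
    )

# x^3 + 2x^2 + 4x + 2 with p = 2
ok = eisenstein_applies([2, 4, 2, 1], 2)
```

Here ok is True. By contrast, eisenstein_applies([4, 0, 1], 2) is False for x^2 + 4, even though that polynomial is irreducible over Q, which shows that the criterion is sufficient but not necessary.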
POLYNOMIALS OVER R AND C
One of the most far-reaching theorems of classical mathematics concerns polynomials with complex coefficients. It is so important in the framework of traditional algebra that it is called the fundamental theorem of algebra. It states the following:
Every nonconstant polynomial with complex coefficients has a complex root.
(The proof of this theorem is based upon techniques of calculus and can be found in most books on complex analysis. It is omitted here.)
It follows immediately that the irreducible polynomials in C[x] are exactly the polynomials of degree 1. For if a(x) is a polynomial of degree greater than 1 in C[x], then by the fundamental theorem of algebra it has a root c and therefore a factor x - c.
Now, every polynomial in C[x] can be factored into irreducibles. Since the irreducible polynomials are all of degree 1, it follows that if a(x) is a polynomial of degree n over C, it can be factored into
a(x) = k(x - c_1)(x - c_2) ··· (x - c_n)
In particular, if a(x) has degree n it has n (not necessarily distinct) complex roots c_1, ..., c_n.
Since every real number a is a complex number (a = a + 0i), what has just been stated applies equally to polynomials with real coefficients. Specifically, if a(x) is a polynomial of degree n with real coefficients, it can be factored into a(x) = k(x - c_1) ··· (x - c_n), where c_1, ..., c_n are complex numbers (some of which may be real).
For our closing comments, we need the following lemma:
Lemma Suppose $a(x) \in R[x]$. If a + bi is a root of a(x), so is a − bi.
Proof: Remember that a − bi is called the conjugate of a + bi. If r is any complex number, we write $\bar{r}$ for its conjugate. It is easy to see that the function $f(r) = \bar{r}$ is a homomorphism from C to C (in fact, it is an isomorphism). For every real number a, f(a) = a. Thus, if a(x) has real coefficients, then $f(a_0 + a_1 r + \cdots + a_n r^n) = a_0 + a_1 \bar{r} + \cdots + a_n \bar{r}^n$. Since f(0) = 0, it follows that if r is a root of a(x), so is $\bar{r}$. ■
Now let a(x) be any polynomial with real coefficients, and let r = a + bi be a complex root of a(x). Then $\bar{r}$ is also a root of a(x), so
$$(x - r)(x - \bar{r}) = x^2 - 2ax + (a^2 + b^2)$$
and this is a quadratic polynomial with real coefficients. We have thus shown that any polynomial with real coefficients can be factored into polynomials of degree 1 or 2 in R[x]. In particular, the irreducible polynomials of R[x] are the linear polynomials and the irreducible quadratics (that is, the $ax^2 + bx + c$ where $b^2 - 4ac < 0$).
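The identity $(x - r)(x - \bar{r}) = x^2 - 2ax + (a^2 + b^2)$ can be spot-checked numerically. The sketch below uses Python's built-in complex numbers; the helper names `real_quadratic_from_root` and `evaluate` are hypothetical, and the sample root 1 + 2i is an arbitrary choice, not taken from the text.

```python
def real_quadratic_from_root(r):
    """For r = a + bi, return the coefficients (1, -2a, a^2 + b^2)
    of the real quadratic (x - r)(x - conjugate(r)), highest degree first."""
    a, b = r.real, r.imag
    return (1.0, -2.0 * a, a * a + b * b)

def evaluate(coeffs, x):
    """Evaluate c2*x^2 + c1*x + c0 for coeffs = (c2, c1, c0)."""
    c2, c1, c0 = coeffs
    return c2 * x * x + c1 * x + c0

r = 1 + 2j
q = real_quadratic_from_root(r)  # coefficients of x^2 - 2x + 5
# Both r and its conjugate should be roots of q.
print(q, evaluate(q, r), evaluate(q, r.conjugate()))
```

Note that all three coefficients are real even though r is not, which is the point of the argument above.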
EXERCISES
A. Finding Roots of Polynomials over Finite Fields
In order to find a root of a(x) in a finite field F, the simplest method (if F is small) is to test every element of F by substitution into the equation a(x) = 0.
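The substitution test just described can be sketched in a few lines for the fields $Z_p$. This is only an illustration, assuming p is prime and coefficients are listed from the constant term upward; the name `roots_mod` is hypothetical, and the sample polynomial $x^2 + 1$ is chosen so as not to give away the exercises below.

```python
def roots_mod(coeffs, p):
    """Return every c in Z_p with a(c) = 0, found by testing each element.

    coeffs = [a_0, a_1, ..., a_n], lowest degree first.
    """
    def value(c):
        result = 0
        for a in reversed(coeffs):  # Horner's rule, reducing mod p at each step
            result = (result * c + a) % p
        return result

    return [c for c in range(p) if value(c) == 0]

print(roots_mod([1, 0, 1], 5))  # x^2 + 1 over Z_5
print(roots_mod([1, 0, 1], 3))  # x^2 + 1 over Z_3
```

Each root c found this way gives a factor x − c, which is how the factorizations asked for in the exercises can be started.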
1 Find all the roots of the following polynomials in $Z_5[x]$, and factor the polynomials:
$x^3 + x^2 + x + 1$; $3x^4 + x^2 + 1$; $x^5 + 1$; $x^4 + 1$; $x^4 + 4$
# 2 Use Fermat's theorem to find all the roots of the following polynomials in Z,[x]:
1;
3*98 + x19 + 3;
2x:
- jt" + 2x + 6
3 Using Fermat's theorem, find polynomials of degree $\le 6$ which determine the same functions as the following polynomials in $Z_7[x]$:
3x'
- Sx54 + 2*1
4xlm+6x>
-2xs
3x"
-3x$
4 Explain why every polynomial in Zp[x] has the same roots as a polynomial of degree
(c) for n values of c, prove that a(x) = ft(x).
7 There are infinitely many irreducible polynomials in Z5[x].
# 8 How many roots does x2 - x have in Z10? In Z„? Explain the difference.
D. Irreducible Polynomials in Q[x] by Eisenstein's Criterion (and Variations on the Theme)
1 Show that each of the following polynomials is irreducible over Q:
2 1
3x4 - 8x3 + 6x2 - Ax + 6; r x5 + ^ *"
2x2 + J
5*
1 3 2
3* ~3X + 1-
1 4 4 , 2 2 , ,
2* + 3* "3* +1
2 It often happens that a polynomial a(y), as it stands, does not satisfy the conditions of Eisenstein's criterion, but with a simple change of variable y = x + c, it does. It is important to note that if a(x) can be factored into p(x)q(x), then certainly a(x + c) can be factored into p(x + c)q(x + c). Thus, the irreducibility of a(x + c) implies the irreducibility of a(x).
(a) Use the change of variable y = x + 1 to show that $x^4 + 4x + 1$ is irreducible in Q[x]. [In other words, test $(x + 1)^4 + 4(x + 1) + 1$ by Eisenstein's criterion.]
(fr) Find an appropriate change of variable to prove that the following are irreducible in £„\x] be defined by
$$h(a_0 + a_1 x + \cdots + a_n x^n) = h(a_0) + h(a_1)x + \cdots + h(a_n)x^n$$
In Chapter 24, Exercise G, it is proved that h is a homomorphism. Assume this fact and prove:
# 1 If h(a(x)) is irreducible in $Z_n[x]$ and a(x) is monic, then a(x) is irreducible in Z[x].
2 x + lux3 + 7 is irreducible in Q[x] by using the natural homomorphism from Z
to Z5.
3 The following are irreducible in Q[x] (find the right value of n and use the natural homomorphism from Z to Zn):
$x^4 - 10x^2 + 1$; $x^4 + 7x^3 + 14x^2 + 3$; $x^5 + 1$
G. Roots and Factors in A[x] When A Is an Integral Domain
It is a useful fact that Theorems 1, 2, and 3 are still true in A[x] when A is not a field, but merely an integral domain. The proof of Theorem 1 must be altered a bit to avoid using the division algorithm. We proceed as follows: If $a(x) = a_0 + a_1 x + \cdots + a_n x^n$ and c is a root of a(x), consider
$$a(x) - a(c) = a_1(x - c) + a_2(x^2 - c^2) + \cdots + a_n(x^n - c^n)$$
1 Prove that for k = 1, ..., n:
$$a_k(x^k - c^k) = a_k(x - c)(x^{k-1} + x^{k-2}c + \cdots + c^{k-1})$$
2 Conclude from part 1 that a(x) - a(c) = (x - c)q(x) for some q(x).
3 Complete the proof of Theorem 1, explaining why this particular proof is valid when A is an integral domain, not necessarily a field.
4 Check that Theorems 2 and 3 are true in A[x] when A is an integral domain.
H. Polynomial Interpolation
One of the most important applications of polynomials is to problems where we are given several values of x (say, $x = a_0, a_1, \ldots, a_n$) and corresponding values of y (say, $y = b_0, b_1, \ldots, b_n$), and we need to find a function y = f(x) such that $f(a_0) = b_0, f(a_1) = b_1, \ldots, f(a_n) = b_n$. The simplest and most useful kind of function for this purpose is a polynomial function of the lowest possible degree.
We now consider a commonly used technique for constructing a polynomial p(x) of degree n which assumes given values $b_0, b_1, \ldots, b_n$ at given points $a_0, a_1, \ldots, a_n$. That is,
$$p(a_0) = b_0, \quad p(a_1) = b_1, \quad \ldots, \quad p(a_n) = b_n$$
First, for each i = 0, 1, ..., n, let
$$q_i(x) = (x - a_0) \cdots (x - a_{i-1})(x - a_{i+1}) \cdots (x - a_n)$$
1 Show that $q_i(a_j) = 0$ for $j \ne i$, and $q_i(a_i) \ne 0$.
Let $q_i(a_i) = c_i$, and define p(x) as follows:
$$p(x) = \sum_{i=0}^{n} \frac{b_i}{c_i}\, q_i(x) = \frac{b_0}{c_0}\, q_0(x) + \cdots + \frac{b_n}{c_n}\, q_n(x)$$
(This is called the Lagrange interpolation formula.)
2 Explain why $p(a_0) = b_0, p(a_1) = b_1, \ldots, p(a_n) = b_n$.
3 Prove that there is one and only one polynomial p(x) of degree $\le n$ such that $p(a_0) = b_0, \ldots, p(a_n) = b_n$.
4 Use the Lagrange interpolation formula to prove that if F is a finite field, every function from F to F is equal to a polynomial function. (In fact, the degree of this polynomial is less than the number of elements in F.)
5 If t(x) is any polynomial in F[x], and $a_0, \ldots, a_n \in F$, the unique polynomial p(x) of degree $\le n$ such that $p(a_0) = t(a_0), \ldots, p(a_n) = t(a_n)$ is called the Lagrange interpolator for t(x) and $a_0, \ldots, a_n$. Prove that the remainder, when t(x) is divided by $(x - a_0)(x - a_1) \cdots (x - a_n)$, is the Lagrange interpolator.
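The Lagrange interpolation formula of this exercise set can be tried out numerically. The sketch below evaluates p(x) at a single point using the quantities $q_i(x)$ and $c_i = q_i(a_i)$ defined above. It assumes the field is Q, represented exactly with Python's `fractions.Fraction`; the function name `lagrange_value` and the sample data points are hypothetical.

```python
from fractions import Fraction

def lagrange_value(points, x):
    """Evaluate at x the Lagrange polynomial through points = [(a_i, b_i), ...].

    The a_i must be distinct; arithmetic is exact over the rationals.
    """
    total = Fraction(0)
    for i, (ai, bi) in enumerate(points):
        qi_x = Fraction(1)  # q_i(x): product of (x - a_j) over j != i
        ci = Fraction(1)    # c_i = q_i(a_i): product of (a_i - a_j) over j != i
        for j, (aj, _) in enumerate(points):
            if j != i:
                qi_x *= (x - aj)
                ci *= (ai - aj)
        total += Fraction(bi) * qi_x / ci
    return total

# Three sample points lying on 3x^2 - x + 1.
points = [(0, 1), (1, 3), (2, 11)]
print([lagrange_value(points, a) for a, _ in points])  # reproduces the b_i
print(lagrange_value(points, 3))                       # value at a new point
```

Evaluating at each $a_i$ returns the corresponding $b_i$ because every term except the i-th vanishes there, which is the content of parts 1 and 2.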
I. Polynomial Functions over a Finite Field
1 Find three polynomials in $Z_5[x]$ which determine the same function as
$$x^2 - x + 1$$
2 Prove that $x^p - x$ has p roots in $Z_p$, for any prime p. Draw the conclusion that in $Z_p[x]$, $x^p - x$ can be factored as
$$x^p - x = x(x - 1)(x - 2) \cdots [x - (p - 1)]$$
3 Prove that if a(x) and b(x) determine the same function in $Z_p[x]$, then
$$(x^p - x) \mid (a(x) - b(x))$$
In the next four parts, let F be any finite field.
# 4 Let a(x) and b(x) be in F[x]. Prove that if a(x) and b(x) determine the same function, and if the number of elements in F exceeds the degree of a(x) as well as the degree of b(x), then a(x) = b(x).
5 Prove: The set of all a(x) which determine the zero function is an ideal of F[x]. What is its generator?
6 Let $\mathcal{F}(F)$ be the ring of all functions from F to F, defined in the same way as $\mathcal{F}(R)$. Let $h : F[x] \to \mathcal{F}(F)$ send every polynomial a(x) to the polynomial function which it determines. Show that h is a homomorphism from F[x] onto $\mathcal{F}(F)$. (Note: To show that h is onto, use Exercise H4.)
7 Let $F = \{c_1, \ldots, c_n\}$ and $p(x) = (x - c_1) \cdots (x - c_n)$. Prove that
$$F[x]/\langle p(x) \rangle \cong \mathcal{F}(F)$$
CHAPTER
TWENTY-SEVEN
EXTENSIONS OF FIELDS
In the first 26 chapters of this book we introduced the cast and set the scene on a vast and complex stage. Now it is time for the action to begin. We will be surprised to discover that none of our effort has been wasted; for every notion which was defined with such meticulous care, every subtlety, every fine distinction will have its use and play its prescribed role in the story which is about to unfold.
We will see modern algebra reaching out and merging with other disciplines of mathematics; we will see its machinery put to use for solving a wide range of problems which, on the surface, have nothing whatever to do with modern algebra. Some of these problems—ancient problems of geometry, riddles about numbers, questions concerning the solutions of equations—reach back to the very beginnings of mathematics. Great masters of the art of mathematics puzzled over them in every age and left them unsolved, for the machinery to solve them was not there. Now, with a light touch modern algebra uncovers the answers.
Modern algebra was not built in an ivory tower but was created part and parcel with the rest of mathematics, tied to it, drawing from it, and offering it solutions. Clearly it did not develop as methodically as it has been presented here. It would be pointless, in a first course in abstract algebra, to replicate all the currents and crosscurrents, all the hits and misses and false starts. Instead, we are provided with a finished product in which the agonies and efforts that went into creating it cannot be discerned. There is a disadvantage to this: without knowing the origin of a given concept, without knowing the specific problems which gave it birth, the student often wonders what it means and why it was ever invented.
We hope, beginning now, to shed light on that kind of question, to justify what we have already done, and to demonstrate that the concepts introduced in earlier chapters are correctly designed for their intended purposes.
Most of classical mathematics is set in a framework consisting of fields, especially Q, R, and C. The theory of equations deals with polynomials over R and C, calculus is concerned with functions over R, and plane geometry is set in R x R. It is not surprising, therefore, that modern efforts to generalize and unify these subjects should also center around the study of fields. It turns out that a great variety of problems, ranging from geometry to practical computation, can be translated into the language of fields and formulated entirely in terms of the theory of fields. The study of fields will therefore be our central concern in the remaining chapters, though we will see other themes merging and flowing into it like the tributaries of a great river.
If F is a field, then a subfield of F is any nonempty subset of F which is closed with respect to addition and subtraction, multiplication and division. (It would be equivalent to say: closed with respect to addition and negatives, multiplication and multiplicative inverses.) As we already know, if K is a subfield of F, then K is a field in its own right.
If K is a subfield of F, we say also that F is an extension field of K. When it is clear in context that both F and K are fields, we say simply that F is an extension of K.
Given a field F, we may look inward from F at all the subfields of F. On the other hand, we may look outward from F at all the extensions of F. Just as there are relationships between F and its subfields, there are also interesting relationships between F and its extensions. One of these relationships, as we shall see later, is highly reminiscent of Lagrange's theorem—an inside-out version of it.
Why should we be interested in looking at the extensions of fields? There are several reasons, but one is very special. If F is an arbitrary field, there are, in general, polynomials over F which have no roots in F. For example, $x^2 + 1$ has no roots in R. This situation is unfortunate but, it turns out, not hopeless. For, as we shall soon see, every polynomial over any field F has roots. If these roots are not already in F, they are in a suitable extension of F. For example, $x^2 + 1 = 0$ has solutions in C.
In the matter of factoring polynomials and extracting their roots, C is Utopia! In C every polynomial a(x) of degree n has exactly n roots $c_1, \ldots, c_n$ and can therefore be factored as $a(x) = k(x - c_1)(x - c_2) \cdots (x - c_n)$. This ideal situation is not enjoyed by all fields; far from it! In an arbitrary field F, a polynomial of degree n may have any number of roots, from no roots to n roots, and there may be irreducible polynomials of any degree whatever. This is a messy situation, which does not hold the promise of an elegant theory of solutions to polynomial equations. However, it turns out that F always has a suitable extension E such that any polynomial a(x) of degree n over F has exactly n solutions in E. Therefore, a(x) can be factored in E[x] as
$$a(x) = k(x - c_1)(x - c_2) \cdots (x - c_n)$$
Thus, paradise is regained by the expedient of enlarging the field F. This is one of the strongest reasons for our interest in field extensions. They will give us a trim and elegant theory of solutions to polynomial equations.
Now, let us get to work! Let E be a field, F a subfield of E, and c any element of E. We define the substitution function $\sigma_c$ as follows: For every polynomial a(x) in F[x],
$$\sigma_c(a(x)) = a(c)$$
Thus, $\sigma_c$ is the function "substitute c for x." It is a function from F[x] into E. In fact, $\sigma_c$ is a homomorphism. This is true because
(* + c)) = F[x]l{p(x)).
5 Let p(x) be irreducible, and let a be a root of p(cx). Then $F[x]/\langle p(cx) \rangle \cong F(a)$ and $F[x]/\langle p(x) \rangle \cong F(ca)$. Conclude that $F[x]/\langle p(cx) \rangle \cong F[x]/\langle p(x) \rangle$.
6 Use parts 4 and 5 to prove the following:
(a) $Z_{11}[x]/\langle x^2 + 1 \rangle \cong Z_{11}[x]/\langle x^2 + x + 4 \rangle$.
(b) If a is a root of $x^2 - 2$ and b is a root of $x^2 - 4x + 2$, then $Q(a) \cong Q(b)$.
(c) If a is a root of $x^2 - 2$ and b is a root of $x^2 - \frac{1}{2}$, then $Q(a) \cong Q(b)$.
† F. Quadratic Extensions
If the minimum polynomial of a over F has degree 2, we call F(a) a quadratic extension of F.
1 Prove that, if F is a field whose characteristic is $\ne 2$, any quadratic extension of F is of the form $F(\sqrt{a})$, for some $a \in F$. (Hint: Complete the square, and use Exercise E4.)
Let F be a finite field, and $F^*$ the multiplicative group of nonzero elements of F. Obviously $H = \{x^2 : x \in F^*\}$ is a subgroup of $F^*$; since every square $x^2$ in $F^*$ is the square of only two different elements, namely $\pm x$, exactly half the elements of $F^*$ are in H. Thus, H has exactly two cosets: H itself, containing all the squares, and aH (where $a \notin H$), containing all the nonsquares. If a and b are nonsquares, then by Chapter 15, Theorem 5(i),
$$\frac{a}{b} = ab^{-1} \in H$$
Thus: if a and b are nonsquares, a/b is a square. Use these remarks in the following:
2 Let F be a finite field. If $a, b \in F$, let $p(x) = x^2 - a$ and $q(x) = x^2 - b$ be irreducible in F[x], and let $\sqrt{a}$ and $\sqrt{b}$ denote roots of p(x) and q(x) in an extension of F. Explain why a/b is a square, say $a/b = c^2$ for some $c \in F$. Prove that $\sqrt{b}$ is a root of p(cx).
3 Use part 2 to prove that $F[x]/\langle p(cx) \rangle \cong F(\sqrt{b})$; then use Exercise E5 to conclude that $F(\sqrt{a}) \cong F(\sqrt{b})$.
4 Use part 3 to prove: Any two quadratic extensions of a finite field are isomorphic.
5 If a and b are nonsquares in R, a/b is a square (why?). Use the same argument as in part 4 to prove that any two simple extensions of R are isomorphic (hence isomorphic to C).
G. Questions Relating to Transcendental Elements
Let F be a field, and let c be transcendental over F. Prove the following:
1 $\{a(c) : a(x) \in F[x]\}$ is an integral domain isomorphic to F[x].
# 2 F(c) is the field of quotients of $\{a(c) : a(x) \in F[x]\}$, and is isomorphic to F(x), the field of quotients of F[x].
3 If c is transcendental over F, so are c + 1, kc (where $k \in F$ and $k \ne 0$), $c^2$.
4 If c is transcendental over F, every element in F(c) but not in F is transcendental over F.
† H. Common Factors of Two Polynomials: Over F and over Extensions of F
Let F be a field, and let $a(x), b(x) \in F[x]$. Prove the following:
1 If a(x) and b(x) have a common root c in some extension of F, they have a common factor of positive degree in F[x]. [Use the fact that $a(x), b(x) \in \ker \sigma_c$.]
2 If a(x) and b(x) are relatively prime in F[x], they are relatively prime in K[x], for any extension K of F. Conversely, if they are relatively prime in K[x], then they are relatively prime in F[x].
† I. Derivatives and Their Properties
Let $a(x) = a_0 + a_1 x + \cdots + a_n x^n \in F[x]$. The derivative of a(x) is the following polynomial $a'(x) \in F[x]$:
$$a'(x) = a_1 + 2a_2 x + \cdots + n a_n x^{n-1}$$
(This is the same as the derivative of a polynomial in calculus.) We now prove the analogs of the formal rules of differentiation, familiar from calculus. Let $a(x), b(x) \in F[x]$, and let $k \in F$.
Prove parts 1-4:
1 $[a(x) + b(x)]' = a'(x) + b'(x)$
2 $[a(x)b(x)]' = a'(x)b(x) + a(x)b'(x)$
3 $[ka(x)]' = ka'(x)$
4 If F has characteristic 0 and a'(x) = 0, then a(x) is a constant polynomial. Why is this conclusion not necessarily true if F has characteristic $p \ne 0$?
5 Find the derivative of the following polynomials in $Z_5[x]$:
$x^6 + 2x^3 + x + 1$; $x^5 + 3x^2 + 1$; $x^{15} + 3x^{10} + 4x^5 + 1$
6 If F has characteristic $p \ne 0$, and a'(x) = 0, prove that the only nonzero terms of a(x) are of the form $a_{mp}x^{mp}$ for some m. [That is, a(x) is a polynomial in powers of $x^p$.]
† J. Multiple Roots
Suppose $a(x) \in F[x]$, and K is an extension of F. An element $c \in K$ is called a multiple root of a(x) if $(x - c)^m \mid a(x)$ for some m > 1. It is often important to know if all the roots of a polynomial are different, or not. We now consider a method for determining whether an arbitrary polynomial $a(x) \in F[x]$ has multiple roots in any extension of F.
Let K be any field containing all the roots of a(x). Suppose a(x) has a multiple root c.
1 Prove that $a(x) = (x - c)^2 q(x)$ in K[x].
2 Compute a'(x), using part 1.
3 Show that x − c is a common factor of a(x) and a'(x). Use Exercise H1 to conclude that a(x) and a'(x) have a common factor of degree $\ge 1$ in F[x].
Thus, if a(x) has a multiple root, then a(x) and a'(x) have a common factor in F[x]. To prove the converse, suppose a(x) has no multiple roots. Then a(x) can be factored as $a(x) = (x - c_1) \cdots (x - c_n)$ where $c_1, \ldots, c_n$ are all different.
4 Explain why a'(x) is a sum of terms of the form
$$(x - c_1) \cdots (x - c_{i-1})(x - c_{i+1}) \cdots (x - c_n)$$
5 Using part 4, explain why none of the roots $c_1, \ldots, c_n$ of a(x) are roots of a'(x).
6 Conclude that a(x) and a'(x) have no common factor of degree $\ge 1$ in F[x].
7 Show that each of the following polynomials has no multiple roots in any extension of its field of coefficients:
$x^3 - 7x^2 + 8 \in Q[x]$; $x^2 + x + 1 \in Z_5[x]$; $x^{100} - 1 \in Z_7[x]$
The preceding example is most interesting: it shows that there are 100 different hundredth roots of 1 over $Z_7$. (The roots ±1 are in $Z_7$, while the remaining 98 roots are in extensions of $Z_7$.) Corresponding results hold for most other fields.
This important result is stated as follows: A polynomial a(x) in F[x] has a multiple root iff a(x) and a'(x) have a common factor of degree $\ge 1$ in F[x].
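The criterion just stated can be turned into a computation: take the formal derivative and compute a greatest common divisor by the Euclidean algorithm. The following is a rough sketch for the fields $Z_p$ only, assuming p is prime, coefficients are listed from the constant term upward, and Python 3.8 or later is available (for the modular inverse `pow(x, -1, p)`). All function names are hypothetical, and the Euclidean algorithm is a standard technique brought in for illustration rather than something developed in this exercise set.

```python
def trim(a):
    """Drop trailing zero coefficients in place; the zero polynomial is []."""
    while a and a[-1] == 0:
        a.pop()
    return a

def derivative(a, p):
    """Formal derivative a_1 + 2 a_2 x + ... + n a_n x^(n-1), reduced mod p."""
    return trim([(k * a[k]) % p for k in range(1, len(a))])

def poly_mod(a, b, p):
    """Remainder of a divided by b in Z_p[x]; b must be trimmed and nonzero."""
    a = trim([c % p for c in a])
    inv = pow(b[-1], -1, p)  # inverse of the leading coefficient of b
    while len(a) >= len(b):
        factor = (a[-1] * inv) % p
        shift = len(a) - len(b)
        for i, c in enumerate(b):
            a[shift + i] = (a[shift + i] - factor * c) % p
        trim(a)  # the leading term has been cancelled
    return a

def poly_gcd(a, b, p):
    """A greatest common divisor of a and b in Z_p[x] (not normalized to be monic)."""
    a, b = trim([c % p for c in a]), trim([c % p for c in b])
    while b:
        a, b = b, poly_mod(a, b, p)
    return a

def has_multiple_root(a, p):
    """True iff a(x) and a'(x) share a factor of degree >= 1 in Z_p[x]."""
    return len(poly_gcd(a, derivative(a, p), p)) > 1

# x^100 - 1 over Z_7, the example discussed above.
print(has_multiple_root([-1] + [0] * 99 + [1], 7))
# (x - 1)^2 = x^2 - 2x + 1 over Z_5, which visibly has a repeated root.
print(has_multiple_root([1, -2, 1], 5))
```

When a'(x) is the zero polynomial the gcd is a(x) itself, so the sketch reports a multiple root for any nonconstant a(x) with vanishing derivative; this matches the behavior of polynomials such as $x^p - 1 = (x - 1)^p$ over $Z_p$.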
CHAPTER
TWENTY-EIGHT
VECTOR SPACES
Many physical quantities, such as length, area, weight, and temperature, are completely described by a single real number. On the other hand, many other quantities arising in scientific measurement and everyday reckoning are best described by a combination of several numbers. For example, a point in space is specified by giving its three coordinates with respect to an xyz coordinate system.
Here is an example of a different kind: A store handles 100 items; its monthly inventory is a sequence of 100 numbers $(a_1, a_2, \ldots, a_{100})$ specifying the quantities of each of the 100 items currently in stock. Such a sequence of numbers is usually called a vector. When the store is restocked, a vector is added to the current inventory vector. At the end of a good month of sales, a vector is subtracted.
As this example shows, it is natural to add vectors by adding corresponding components, and subtract vectors by subtracting corresponding components. If the store manager in the preceding example decided to double inventory, each component of the inventory vector would be multiplied by 2. This shows that a natural way of multiplying a vector by a real number k is to multiply each component by k. This kind of multiplication is commonly called scalar multiplication.
Historically, as the use of vectors became widespread and they came to be an indispensable tool of science, vector algebra grew to be one of the major branches of mathematics. Today it forms the basis for much of advanced calculus, the theory and practice of differential equations, statistics, and vast areas of applied mathematics. Scientific computation is enormously simplified by vector methods; for example, 3, or 300, or 3000 individual readings of scientific instruments can be expressed as a single vector.
In any branch of mathematics it is elegant and desirable (but not always possible) to find a simple list of axioms from which all the required theorems may be proved. In the specific case of vector algebra, we wish to select as axioms only those particular properties of vectors which are absolutely necessary for proving further properties of vectors. And we must select a sufficiently complete list of axioms so that, by using them and them alone, we can prove all the properties of vectors needed in mathematics.
A delightfully simple list of axioms is available for vector algebra. The remarkable fact about this axiom system is that, although we conceive of vectors as finite sequences $(a_1, a_2, \ldots, a_n)$ of numbers, nothing in the axioms actually requires them to be such sequences! Instead, vectors are treated simply as elements in a set, satisfying certain equations. Here is our basic definition:
A vector space over a field F is a set V, with two operations + and • called vector addition and scalar multiplication, such that
1. V with vector addition is an abelian group.
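2. For any $k \in F$ and $\mathbf{a} \in V$, the scalar product $k\mathbf{a}$ is an element of V, subject to the following conditions: for all $k, l \in F$ and $\mathbf{a}, \mathbf{b} \in V$,
(a) $k(\mathbf{a} + \mathbf{b}) = k\mathbf{a} + k\mathbf{b}$,
(b) $(k + l)\mathbf{a} = k\mathbf{a} + l\mathbf{a}$,
(c) $k(l\mathbf{a}) = (kl)\mathbf{a}$,
(d) $1\mathbf{a} = \mathbf{a}$.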
The elements of V are called vectors and the elements of the field F are called scalars.
In the following exposition the field F will not be specifically referred to unless the context requires it. For notational clarity, vectors will be written in bold type and scalars in italics.
The traditional example of a vector space is the set $R^n$ of all n-tuples of real numbers, $(a_1, a_2, \ldots, a_n)$, with the operations
$$(a_1, a_2, \ldots, a_n) + (b_1, b_2, \ldots, b_n) = (a_1 + b_1, a_2 + b_2, \ldots, a_n + b_n)$$
and
$$k(a_1, a_2, \ldots, a_n) = (ka_1, ka_2, \ldots, ka_n)$$
For example, $R^2$ is the set of all two-dimensional vectors (a, b), while $R^3$ is the set of all vectors (a, b, c) in euclidean space. (See the figure on the next page.)
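The componentwise operations on $R^n$ are easy to mirror in code. The sketch below represents vectors as Python tuples and spot-checks condition (a) of the definition on sample values; the names `vec_add` and `scalar_mul` are hypothetical, and integer components are used so that the comparison is exact.

```python
def vec_add(a, b):
    """Componentwise sum of two n-tuples of the same length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of components")
    return tuple(x + y for x, y in zip(a, b))

def scalar_mul(k, a):
    """Multiply every component of the n-tuple a by the scalar k."""
    return tuple(k * x for x in a)

a, b, k = (1, 2, 3), (4, 5, 6), 2
print(vec_add(a, b))
print(scalar_mul(k, a))
# Condition (a): k(a + b) = ka + kb, checked on this one sample.
print(scalar_mul(k, vec_add(a, b)) == vec_add(scalar_mul(k, a), scalar_mul(k, b)))
```

A check on sample values is of course not a proof; Exercise A1 asks for the general argument.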
However, these are not the only vector spaces! Our definition of vector space is so very simple that many other things, quite different in appearance from the traditional vector spaces, satisfy the conditions of our definition and are therefore, legitimately, vector spaces.
[Figure: the vector (a, b) in $R^2$ and the vector (a, b, c) in $R^3$.]
For example, $\mathcal{F}(R)$, you may recall, is the set of all functions from R to R. We define the sum f + g of two functions by the rule
$$[f + g](x) = f(x) + g(x)$$
and we define the product af, of a real number a and a function f, by
$$[af](x) = af(x)$$
It is very easy to verify that $\mathcal{F}(R)$, with these operations, satisfies all the conditions needed in order to be a vector space over the field R.
As another example, let $\mathcal{P}$ denote the set of all polynomials with real coefficients. Polynomials are added as usual, and scalar multiplication is defined by
$$k(a_0 + a_1 x + \cdots + a_n x^n) = (ka_0) + (ka_1)x + \cdots + (ka_n)x^n$$
Again, it is not hard to see that $\mathcal{P}$ is a vector space over R.
Let V be a vector space. Since V with addition alone is an abelian group, there is a zero element in V called the zero vector, written as 0. Every vector a in V has a negative, written as -a. Finally, since V with vector addition is an abelian group, it satisfies the following conditions which are true in all abelian groups:
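$$\mathbf{a} + \mathbf{b} = \mathbf{a} + \mathbf{c} \ \text{ implies } \ \mathbf{b} = \mathbf{c} \qquad (1)$$
$$\mathbf{a} + \mathbf{b} = \mathbf{0} \ \text{ implies } \ \mathbf{a} = -\mathbf{b} \ \text{ and } \ \mathbf{b} = -\mathbf{a} \qquad (2)$$
$$-(\mathbf{a} + \mathbf{b}) = (-\mathbf{a}) + (-\mathbf{b}) \ \text{ and } \ -(-\mathbf{a}) = \mathbf{a} \qquad (3)$$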
There are simple, obvious rules for multiplication by zero and by negative scalars. They are contained in the next theorem.
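Theorem 1 If V is a vector space, then:
(i) $0\mathbf{a} = \mathbf{0}$, for every $\mathbf{a} \in V$.
(ii) $k\mathbf{0} = \mathbf{0}$, for every scalar k.
(iii) If $k\mathbf{a} = \mathbf{0}$, then k = 0 or $\mathbf{a} = \mathbf{0}$.
(iv) $(-1)\mathbf{a} = -\mathbf{a}$ for every $\mathbf{a} \in V$.
To prove Rule (i), we observe that
$$0\mathbf{a} = (0 + 0)\mathbf{a} = 0\mathbf{a} + 0\mathbf{a}$$
hence $\mathbf{0} + 0\mathbf{a} = 0\mathbf{a} + 0\mathbf{a}$. It follows by Condition (1) that $\mathbf{0} = 0\mathbf{a}$.
Rule (ii) is proved similarly. As for Rule (iii), if k = 0, we are done. If $k \ne 0$, we may multiply $k\mathbf{a} = \mathbf{0}$ by 1/k to get $\mathbf{a} = \mathbf{0}$. Finally, for Rule (iv), we have
$$\mathbf{a} + (-1)\mathbf{a} = 1\mathbf{a} + (-1)\mathbf{a} = (1 + (-1))\mathbf{a} = 0\mathbf{a} = \mathbf{0}$$
so by Condition (2), $(-1)\mathbf{a} = -\mathbf{a}$.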
Let V be a vector space, and $U \subseteq V$. We say that U is closed with respect to scalar multiplication if $k\mathbf{a} \in U$ for every scalar k and every $\mathbf{a} \in U$. We call U a subspace of V if U is closed with respect to addition and scalar multiplication. It is easy to see that if V is a vector space over the field F, and U is a subspace of V, then U is a vector space over the same field F.
If $\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n$ are in V and $k_1, k_2, \ldots, k_n$ are scalars, then the vector
$$k_1\mathbf{a}_1 + k_2\mathbf{a}_2 + \cdots + k_n\mathbf{a}_n$$
is called a linear combination of $\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n$. The set of all the linear combinations of $\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n$ is a subspace of V. (This fact is exceedingly easy to verify.)
If U is the subspace consisting of all the linear combinations of $\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n$, we call U the subspace spanned by $\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n$. An equivalent way of saying the same thing is as follows: a space (or subspace) U is spanned by $\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n$ iff every vector in U is a linear combination of $\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n$.
If U is spanned by $\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n$, we also say that $\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n$ span U.
Let $S = \{\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n\}$ be a set of distinct vectors in a vector space V. Then S is said to be linearly dependent if there are scalars $k_1, \ldots, k_n$, not all zero, such that
$$k_1\mathbf{a}_1 + k_2\mathbf{a}_2 + \cdots + k_n\mathbf{a}_n = \mathbf{0} \qquad (4)$$
Obviously this is the same as saying that at least one of the vectors in S is a linear combination of the remaining ones. [Solve for any vector $\mathbf{a}_i$ in Equation (4) having a nonzero coefficient.]
If $S = \{\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n\}$ is not linearly dependent, then it is linearly independent. That is, S is linearly independent iff
$$k_1\mathbf{a}_1 + k_2\mathbf{a}_2 + \cdots + k_n\mathbf{a}_n = \mathbf{0} \ \text{ implies } \ k_1 = k_2 = \cdots = k_n = 0$$
This is the same as saying that no vector in S is equal to a linear combination of the other vectors in S.
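For vectors in $Q^n$ (or $R^n$ with rational entries), linear independence can be tested by row reduction: the vectors are independent exactly when elimination leaves no zero row, that is, when the rank equals the number of vectors. Row reduction is a standard technique from linear algebra that is not developed in this chapter; the sketch below uses it as a stand-in for solving Equation (4) by hand. The function names `rank` and `is_linearly_independent` are hypothetical.

```python
from fractions import Fraction

def rank(vectors):
    """Rank of the list of row vectors, by Gaussian elimination over Q."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0  # number of pivot rows found so far
    for col in range(ncols):
        # Look for a row at or below position r with a nonzero entry in this column.
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r

def is_linearly_independent(vectors):
    """True iff only the trivial combination of the vectors gives the zero vector."""
    return rank(vectors) == len(vectors)

print(is_linearly_independent([(1, 0, 0), (0, 1, 0), (0, 0, 1)]))
print(is_linearly_independent([(1, 2, 3), (2, 4, 6)]))  # second = 2 * first
```

Exact rational arithmetic is used so that a row which should reduce to zero is not kept alive by floating-point rounding.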
It is obvious from these definitions that any set of vectors containing the zero vector is linearly dependent. Furthermore, the set {a}, containing a single nonzero vector a, is linearly independent.
The next two lemmas, although very easy and at first glance rather trite, are used to prove the most fundamental theorems of this subject.
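Lemma 1 If $\{\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n\}$ is linearly dependent, then some $\mathbf{a}_i$ is a linear combination of the preceding ones, $\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_{i-1}$.
Proof: Indeed, if $\{\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n\}$ is linearly dependent, then $k_1\mathbf{a}_1 + \cdots + k_n\mathbf{a}_n = \mathbf{0}$ for coefficients $k_1, k_2, \ldots, k_n$ which are not all zero. If $k_i$ is the last nonzero coefficient among them, then $k_1\mathbf{a}_1 + \cdots + k_i\mathbf{a}_i = \mathbf{0}$, and this equation can be used to solve for $\mathbf{a}_i$ in terms of $\mathbf{a}_1, \ldots, \mathbf{a}_{i-1}$. ■
Let $\{\mathbf{a}_1, \mathbf{a}_2, \ldots, \cancel{\mathbf{a}_i}, \ldots, \mathbf{a}_n\}$ denote the set $\{\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n\}$ after removal of $\mathbf{a}_i$.
Lemma 2 If $\{\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n\}$ spans V, and $\mathbf{a}_i$ is a linear combination of preceding vectors, then $\{\mathbf{a}_1, \ldots, \cancel{\mathbf{a}_i}, \ldots, \mathbf{a}_n\}$ still spans V.
Proof: Our assumption is that $\mathbf{a}_i = k_1\mathbf{a}_1 + \cdots + k_{i-1}\mathbf{a}_{i-1}$ for some scalars $k_1, \ldots, k_{i-1}$. Since every vector $\mathbf{b} \in V$ is a linear combination
$$\mathbf{b} = l_1\mathbf{a}_1 + \cdots + l_i\mathbf{a}_i + \cdots + l_n\mathbf{a}_n$$
it can also be written as a linear combination
$$\mathbf{b} = l_1\mathbf{a}_1 + \cdots + l_i(k_1\mathbf{a}_1 + \cdots + k_{i-1}\mathbf{a}_{i-1}) + \cdots + l_n\mathbf{a}_n$$
in which $\mathbf{a}_i$ does not figure. ■
A set of vectors $\{\mathbf{a}_1, \ldots, \mathbf{a}_n\}$ in V is called a basis of V if it is linearly independent and spans V.
For example, the vectors $\mathbf{e}_1 = (1, 0, 0)$, $\mathbf{e}_2 = (0, 1, 0)$, and $\mathbf{e}_3 = (0, 0, 1)$ form a basis of $R^3$. They are linearly independent because, obviously, no vector in $\{\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3\}$ is equal to a linear combination of preceding ones. [Any linear combination of $\mathbf{e}_1$ and $\mathbf{e}_2$ is of the form $a\mathbf{e}_1 + b\mathbf{e}_2 = (a, b, 0)$, whereas $\mathbf{e}_3$ is not of this form; similarly, any linear combination of $\mathbf{e}_1$ alone is of the form $a\mathbf{e}_1 = (a, 0, 0)$, and $\mathbf{e}_2$ is not of that form.] The vectors $\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3$ span $R^3$ because any vector (a, b, c) in $R^3$ can be written as $(a, b, c) = a\mathbf{e}_1 + b\mathbf{e}_2 + c\mathbf{e}_3$.
Actually, $\{\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3\}$ is not the only basis of $R^3$. Another basis of $R^3$ consists of the vectors (1, 2, 3), (1, 0, 2), and (3, 2, 1); in fact, there are infinitely many different bases of $R^3$. Nevertheless, all bases of $R^3$ have one thing in common: they contain exactly three vectors! This is a consequence of our next theorem: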
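The claim that (1, 2, 3), (1, 0, 2), and (3, 2, 1) form a basis of $R^3$ can be checked with a 3 × 3 determinant: three vectors in $R^3$ form a basis exactly when the determinant of the matrix having them as rows is nonzero. Determinants are not introduced in this chapter, so this is a standard fact from linear algebra used here only as a quick check; the helper name `det3` is hypothetical.

```python
def det3(u, v, w):
    """Determinant of the 3x3 matrix with rows u, v, w (expansion along the first row)."""
    return (u[0] * (v[1] * w[2] - v[2] * w[1])
            - u[1] * (v[0] * w[2] - v[2] * w[0])
            + u[2] * (v[0] * w[1] - v[1] * w[0]))

# The alternative basis mentioned in the text.
print(det3((1, 2, 3), (1, 0, 2), (3, 2, 1)))
# The standard basis e1, e2, e3.
print(det3((1, 0, 0), (0, 1, 0), (0, 0, 1)))
# A dependent triple: the third vector is the sum of the first two.
print(det3((1, 2, 3), (1, 0, 2), (2, 2, 5)))
```

A nonzero value for the first two triples and zero for the third is what one expects if the first two are bases and the third is not.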
Theorem 2 Any two bases of a vector space V have the same number of elements.
Proof: Suppose, on the contrary, that V has a basis A = {als..., an} and a basis B = {blt. .., bm} where m^n. To be specific, suppose n V is a homomorphism if it satisfies the following two conditions:
$$h(\mathbf{a} + \mathbf{b}) = h(\mathbf{a}) + h(\mathbf{b}) \quad \text{and} \quad h(k\mathbf{a}) = kh(\mathbf{a})$$
A homomorphism of vector spaces is also called a linear transformation.
If $h : U \to V$ is a linear transformation, its kernel [that is, the set of all $\mathbf{a} \in U$ such that $h(\mathbf{a}) = \mathbf{0}$] is a subspace of U, called the null space of h.
Homomorphisms of vector spaces behave very much like homomorphisms of groups and rings. Their properties are presented in the exercises.
EXERCISES
A. Examples of Vector Spaces
1 Prove that $R^n$, as defined earlier in this chapter, satisfies all the conditions for being a vector space over R.
2 Prove that $\mathcal{F}(R)$, as defined earlier in this chapter, is a vector space over R.
3 Prove that $\mathcal{P}$, as defined earlier in this chapter, is a vector space over R.
4 Prove that $M_2(R)$, the set of all 2 × 2 matrices of real numbers, with matrix addition and the scalar multiplication
$$k\begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} ka & kb \\ kc & kd \end{pmatrix}$$
is a vector space over R.
B. Examples of Subspaces
# 1 Prove that {(a, b, c) : la - 3b + c = 0} is a subspace of R3.
2 Prove that the set of all $(x, y, z) \in R^3$ which satisfy the pair of equations ax + by + cz = 0, dx + ey + fz = 0 is a subspace of $R^3$.
3 Prove that $\{f : f(1) = 0\}$ is a subspace of $\mathcal{F}(R)$.
4 Prove that $\{f : f \text{ is a constant on the interval } [0, 1]\}$ is a subspace of $\mathcal{F}(R)$.
5 Prove that the set of all even functions [that is, functions f such that f(x) = f(−x)] is a subspace of $\mathcal{F}(R)$. Is the same true for the set of all the odd functions [that is, functions f such that f(−x) = −f(x)]?
6 Prove that the set of all polynomials of degree $\le n$ is a subspace of $\mathcal{P}$.
C. Examples of Linear Independence and Bases
1 Prove that {(0,0,0,1), (0,0,1,1), (0,1,1,1), (1,1,1,1)} is a basis of $R^4$.
2 If a = (1, 2, 3, 4) and b = (4, 3, 2, 1), explain why {a, b} may be extended to a basis of $R^4$. Then find a basis of $R^4$ which includes a and b.
3 Let A be the set of eight vectors (x, y, z) where x, y, z = 1, 2. Prove that A spans $R^3$, and find a subset of A which is a basis of $R^3$.
4 If $\mathcal{P}_n$ is the subspace of $\mathcal{P}$ consisting of all polynomials of degree $\le n$, prove that $\{1, x, x^2, \ldots, x^n\}$ is a basis of $\mathcal{P}_n$. Then find another basis of $\mathcal{P}_n$.
5 Find a basis for each of the following subspaces of $R^3$:
# (a) $S_1 = \{(x, y, z) : 3x - 2y + z = 0\}$
(b) $S_2 = \{(x, y, z) : x + y - z = 0 \text{ and } 2x - y + z = 0\}$
6 Find a basis for the subspace of $R^3$ spanned by the set of vectors (x, y, z) such that $x^2 + y^2 + z^2 = 1$.
7 Let U be the subspace of $\mathcal{F}(R)$ spanned by $\{\cos^2 x, \sin^2 x, \cos 2x\}$. Find the dimension of U, and then find a basis of U.
8 Find a basis for the subspace of $\mathcal{P}$ spanned by
$$\{x^3 + x^2 + x + 1, \ x^2 + 1, \ x^3 - x^2 + x - 1, \ x^2 - 1\}$$
D. Properties of Subspaces and Bases
Let V be a finite-dimensional vector space. Let dim V designate the dimension of V. Prove each of the following:
1 If U is a subspace of V, then dim U $\le$ dim V.
2 If U is a subspace of V, and dim U = dim V, then U = V.
3 Any set of vectors containing 0 is linearly dependent.
4 The set {a}, containing only one nonzero vector a, is linearly independent.
5 Any subset of an independent set is independent. Any set of vectors containing a dependent set is dependent.
# 6 If {a, b, c} is linearly independent, so is {a + b, b + c, a + c}.
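7 If $\{\mathbf{a}_1, \ldots, \mathbf{a}_n\}$ is a basis of V, so is $\{k_1\mathbf{a}_1, \ldots, k_n\mathbf{a}_n\}$ for any nonzero scalars $k_1, \ldots, k_n$.
8 The space spanned by $\{\mathbf{a}_1, \ldots, \mathbf{a}_n\}$ is the same as the space spanned by $\{\mathbf{b}_1, \ldots, \mathbf{b}_m\}$ iff each $\mathbf{a}_i$ is a linear combination of $\mathbf{b}_1, \ldots, \mathbf{b}_m$, and each $\mathbf{b}_j$ is a linear combination of $\mathbf{a}_1, \ldots, \mathbf{a}_n$.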
E. Properties of Linear Transformations
Let U and V be finite-dimensional vector spaces over a field F, and let h : U-be a linear transformation. Prove parts 1-3:
1 The kernel of h is a subspace of U. (It is called the null space of h.)
2 The range of h is a subspace of V. (It is called the range space of h.)
3 h is injective iff the null space of h is equal to {0}.
Let $\mathcal{N}$ be the null space of h, and $\mathcal{R}$ the range space of h. Let $\{a_1, \ldots, a_r\}$ be a basis of $\mathcal{N}$. Extend it to a basis $\{a_1, \ldots, a_r, \ldots, a_n\}$ of U. Prove parts 4-6:
4 Every vector $b \in \mathcal{R}$ is a linear combination of $h(a_{r+1}), \ldots, h(a_n)$.
# 5 $\{h(a_{r+1}), \ldots, h(a_n)\}$ is linearly independent.
6 The dimension of $\mathcal{R}$ is n - r.
7 Conclude as follows: for any linear transformation h, dim (domain of h) = dim (null space of h) + dim (range space of h).
8 Let U and V have the same dimension n. Use part 7 to prove that h is injective iff h is surjective.
F. Isomorphism of Vector Spaces
Let U and V be vector spaces over the field F, with dim U = n and dim V= m. Let h : U—> V be a homomorphism. Prove the following:
1 Let h be injective. If $\{a_1, \ldots, a_r\}$ is a linearly independent subset of U, then $\{h(a_1), \ldots, h(a_r)\}$ is a linearly independent subset of V.
# 2 h is injective iff dim U = dim h(U).
3 Suppose dim U = dim V; h is an isomorphism (that is, a bijective homomorphism) iff h is injective iff h is surjective.
4 Any n-dimensional vector space V over F is isomorphic to the space $F^n$ of all n-tuples of elements of F.
† G. Sums of Vector Spaces
Let T and U be subspaces of V. The sum of T and U, denoted by T + U, is the set of all vectors a + b, where $a \in T$ and $b \in U$.
1 Prove that T + U and $T \cap U$ are subspaces of V.
V is said to be the direct sum of T and U if V = T + U and $T \cap U = \{0\}$. In that case, we write $V = T \oplus U$.
# 2 Prove: $V = T \oplus U$ iff every vector $c \in V$ can be written, in a unique manner, as a sum c = a + b where $a \in T$ and $b \in U$.
3 Let T be a k-dimensional subspace of an n-dimensional space V. Prove that an (n - k)-dimensional subspace U exists such that $V = T \oplus U$.
4 If T and U are arbitrary subspaces of V, prove that
$$\dim(T + U) = \dim T + \dim U - \dim(T \cap U)$$
CHAPTER
TWENTY-NINE
DEGREES OF FIELD EXTENSIONS
In this chapter we will see how the machinery of vector spaces can be applied to the study of field extensions.
Let F and K be fields. If K is an extension of F, we may regard K as being a vector space over F. We may treat the elements in K as "vectors" and the elements in F as "scalars." That is, when we add elements in K, we think of it as vector addition; when we add and multiply elements in F, we think of this as addition and multiplication of scalars; and finally, when we multiply an element of F by an element of K, we think of it as scalar multiplication.
We will be especially interested in the case where the resulting vector space is of finite dimension. If K, as a vector space over F, is of finite dimension, we call K a finite extension of F. If the dimension of the vector space K is n, we say that K is an extension of degree n over F. This is symbolized by writing
$$[K : F] = n$$
which should be read, "the degree of K over F is equal to n."
Let us recall that F(c) denotes the smallest field which contains F and c. This means that F(c) contains F and c, and that any other field K containing F and c must contain F(c). We saw in Chapter 27 that if c is algebraic over F, then F(c) consists of all the elements of the form a(c), for all a(x) in F[x]. Since F(c) is an extension of F, we may regard it as a vector space over F. Is F(c) a finite extension of F?
Well, let c be algebraic over F, and let p(x) be the minimum polynomial of c over F. [That is, p(x) is the monic polynomial of lowest degree having c as a root.] Let the degree of the polynomial p(x) be equal to n. It turns out, then, that the n elements
$$1, c, c^2, \ldots, c^{n-1}$$
are linearly independent and span F(c). We will prove this fact in a moment, but meanwhile let us record what it means. It means that the set of n "vectors" $\{1, c, c^2, \ldots, c^{n-1}\}$ is a basis of F(c); hence F(c) is a vector space of dimension n over the field F. This may be summed up concisely as follows:
Theorem 1 The degree of F(c) over F is equal to the degree of the minimum polynomial of c over F.
Proof: It remains only to show that the n elements $1, c, \ldots, c^{n-1}$ span F(c) and are linearly independent. Well, if a(c) is any element of F(c), use the division algorithm to divide a(x) by p(x):
$$a(x) = p(x)q(x) + r(x) \quad \text{where } \deg r(x) \le n - 1$$
Therefore, since p(c) = 0,
$$a(c) = p(c)q(c) + r(c) = 0 + r(c) = r(c)$$
This shows that every element of F(c) is of the form r(c) where r(x) has degree n - 1 or less. Thus, every element of F(c) can be written in the form
$$a_0 + a_1 c + \cdots + a_{n-1} c^{n-1}$$
which is a linear combination of $1, c, c^2, \ldots, c^{n-1}$.
Finally, to prove that $1, c, c^2, \ldots, c^{n-1}$ are linearly independent, suppose that $a_0 + a_1 c + \cdots + a_{n-1} c^{n-1} = 0$. If the coefficients $a_0, a_1, \ldots, a_{n-1}$ were not all zero, c would be the root of a nonzero polynomial of degree n - 1 or less, which is impossible because the minimum polynomial of c over F has degree n. Thus, $a_0 = a_1 = \cdots = a_{n-1} = 0$. ■
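The spanning half of the proof can be tried out numerically. The following is a minimal sketch, not part of the original text: the helper name `poly_rem` and the sample polynomials are illustrative assumptions. It takes $p(x) = x^2 - 2$, whose root is $c = \sqrt{2}$, and reduces $a(x) = x^5 + 3x + 1$ to its remainder $r(x)$, so that $a(c) = r(c)$ is a linear combination of 1 and c.

```python
from fractions import Fraction

def poly_rem(a, p):
    """Remainder of a(x) divided by p(x).

    Polynomials are lists of coefficients, constant term first.
    (Hypothetical helper written for this illustration.)
    """
    r = [Fraction(x) for x in a]
    p = [Fraction(x) for x in p]
    while len(r) >= len(p):
        lead = r[-1] / p[-1]          # coefficient needed to cancel the top term
        shift = len(r) - len(p)
        for i, coeff in enumerate(p):
            r[shift + i] -= lead * coeff
        r.pop()                       # the leading term is now zero
    return r

p = [-2, 0, 1]            # p(x) = x^2 - 2, minimum polynomial of sqrt(2) over Q
a = [1, 3, 0, 0, 0, 1]    # a(x) = x^5 + 3x + 1
r = poly_rem(a, p)
print(r)                  # coefficients of r(x) = 1 + 7x, so a(sqrt 2) = 1 + 7*sqrt(2)
```

The remainder has degree at most 1 = n - 1, which is exactly why two basis elements suffice for $\mathbb{Q}(\sqrt{2})$.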
For example, let us look at $\mathbb{Q}(\sqrt{2})$: the number $\sqrt{2}$ is not a root of any monic polynomial of degree 1 over $\mathbb{Q}$. For such a polynomial would have to be $x - \sqrt{2}$, and the latter is not in $\mathbb{Q}[x]$ because $\sqrt{2}$ is irrational. However, $\sqrt{2}$ is a root of $x^2 - 2$, which is therefore the minimum polynomial of $\sqrt{2}$ over $\mathbb{Q}$, and which has degree 2. Thus,
$$[\mathbb{Q}(\sqrt{2}) : \mathbb{Q}] = 2$$
In particular, every element in $\mathbb{Q}(\sqrt{2})$ is therefore a linear combination of 1 and $\sqrt{2}$, that is, a number of the form $a + b\sqrt{2}$ where $a, b \in \mathbb{Q}$.
As another example, i is a root of the irreducible polynomial $x^2 + 1$ in $\mathbb{R}[x]$. Therefore $x^2 + 1$ is the minimum polynomial of i over $\mathbb{R}$; $x^2 + 1$ has degree 2, so $[\mathbb{R}(i) : \mathbb{R}] = 2$. Thus, $\mathbb{R}(i)$ consists of all the linear combinations of 1 and i with real coefficients, that is, all the a + bi where $a, b \in \mathbb{R}$. Clearly then, $\mathbb{R}(i) = \mathbb{C}$, so the degree of $\mathbb{C}$ over $\mathbb{R}$ is equal to 2.
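To make the "vector space of dimension 2" picture concrete, here is a small sketch that stores an element $a + b\sqrt{2}$ of $\mathbb{Q}(\sqrt{2})$ as its coordinate pair (a, b) relative to the basis $\{1, \sqrt{2}\}$. The class name `QSqrt2` and its methods are assumptions made for illustration only.

```python
from fractions import Fraction

class QSqrt2:
    """Element a + b*sqrt(2) of Q(sqrt 2), stored by its coordinates (a, b)."""

    def __init__(self, a, b=0):
        self.a, self.b = Fraction(a), Fraction(b)

    def __add__(self, other):
        # Vector addition: add coordinates.
        return QSqrt2(self.a + other.a, self.b + other.b)

    def __mul__(self, other):
        # (a + b√2)(c + d√2) = (ac + 2bd) + (ad + bc)√2, using (√2)^2 = 2.
        return QSqrt2(self.a * other.a + 2 * self.b * other.b,
                      self.a * other.b + self.b * other.a)

    def inverse(self):
        # Multiply by the conjugate a - b√2; the norm a^2 - 2b^2 is nonzero
        # for every nonzero element because √2 is irrational.
        n = self.a ** 2 - 2 * self.b ** 2
        return QSqrt2(self.a / n, -self.b / n)

    def __eq__(self, other):
        return (self.a, self.b) == (other.a, other.b)

    def __repr__(self):
        return f"{self.a} + {self.b}*sqrt(2)"

u = QSqrt2(1, 1)            # 1 + √2
print(u * u)                # (1 + √2)^2 = 3 + 2√2
print(u * u.inverse())      # the multiplicative identity
```

Products and inverses never leave the set of pairs (a, b), which reflects the fact that every element of the field is a linear combination of 1 and $\sqrt{2}$.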
In the sequel we will often encounter the following situation: E is a finite extension of K, where K is a finite extension of F. If we know the degree of E over K and the degree of K over F, can we determine the degree of E over F? This is a question of major importance! Fortunately, it has an easy answer, based on the following lemma:
Lemma Let $a_1, a_2, \ldots, a_m$ be a basis of the vector space K over F, and let $b_1, b_2, \ldots, b_n$ be a basis of the vector space E over K. Then the set of mn products $\{a_i b_j\}$ is a basis of the vector space E over the field F.
Proof: To prove that the set $\{a_i b_j\}$ spans E, note that each element c in E can be written as a linear combination $c = k_1 b_1 + \cdots + k_n b_n$ with coefficients $k_j$ in K. But each $k_j$, because it is in K, is a linear combination
$$k_j = l_{j1} a_1 + \cdots + l_{jm} a_m$$
with coefficients $l_{ji}$ in F. Substituting,
$$c = (l_{11} a_1 + \cdots + l_{1m} a_m) b_1 + \cdots + (l_{n1} a_1 + \cdots + l_{nm} a_m) b_n$$
and this is a linear combination of the products $a_i b_j$ with coefficients $l_{ji}$ in F.
To prove that $\{a_i b_j\}$ is linearly independent, suppose $\sum l_{ji} a_i b_j = 0$. This can be written as
$$(l_{11} a_1 + \cdots + l_{1m} a_m) b_1 + \cdots + (l_{n1} a_1 + \cdots + l_{nm} a_m) b_n = 0$$
and since $b_1, \ldots, b_n$ are independent, $l_{j1} a_1 + \cdots + l_{jm} a_m = 0$ for each j. But $a_1, \ldots, a_m$ are also independent, so every $l_{ji} = 0$. ■
With this result we can now conclude the following:
Theorem 2 Suppose $F \subseteq K \subseteq E$ where E is a finite extension of K and K is a finite extension of F. Then E is a finite extension of F, and
$$[E : F] = [E : K][K : F]$$
This theorem is a powerful tool in our study of fields. It plays a role in field theory analogous to the role of Lagrange's theorem in group theory. See what it says about any two extensions, K and E, of a fixed "base field" F: If K is a subfield of E, then the degree of K (over F) divides the degree of E (over F).
If c is algebraic over F, we say that F(c) is obtained by adjoining c to F. If c and d are algebraic over F, we may first adjoin c to F, thereby obtaining F(c), and then adjoin d to F(c). The resulting field is denoted F(c, d), and is the smallest field containing F, c and d. [Indeed, any field containing F, c and d must contain F(c), hence also F(c, d).] It does not matter whether we first adjoin c and then d, or vice versa.
If $c_1, \ldots, c_n$ are algebraic over F, we let $F(c_1, \ldots, c_n)$ be the smallest field containing F and $c_1, \ldots, c_n$. We call it the field obtained by adjoining $c_1, \ldots, c_n$ to F. We may form $F(c_1, \ldots, c_n)$ step by step, adjoining one $c_i$ at a time, and the order of adjoining the $c_i$ is irrelevant.
An extension F(c) formed by adjoining a single element to F is called a simple extension of F. An extension $F(c_1, \ldots, c_n)$, formed by adjoining a finite number of elements $c_1, \ldots, c_n$, is called an iterated extension. It is called "iterated" because it can be formed step by step, one simple extension at a time:
$$F \subseteq F(c_1) \subseteq F(c_1, c_2) \subseteq F(c_1, c_2, c_3) \subseteq \cdots \subseteq F(c_1, \ldots, c_n) \qquad (1)$$
If $c_1, \ldots, c_n$ are algebraic over F, then by Theorem 1, each extension in Condition (1) is a finite extension. By Theorem 2, $F(c_1, c_2)$ is a finite extension of F; applying Theorem 2 again, $F(c_1, c_2, c_3)$ is a finite extension of F; and so on. So finally, if $c_1, \ldots, c_n$ are algebraic over F, then $F(c_1, \ldots, c_n)$ is a finite extension of F.
Actually, the converse is true too: every finite extension is an iterated extension. This is obvious: for if K is a finite extension of F, say an extension of degree n, then K has a basis $\{a_1, \ldots, a_n\}$ over F. This means that every element in K is a linear combination of $a_1, \ldots, a_n$ with coefficients in F; but any field containing F and $a_1, \ldots, a_n$ obviously contains all the linear combinations of $a_1, \ldots, a_n$; hence K is the smallest field containing F and $a_1, \ldots, a_n$. That is, $K = F(a_1, \ldots, a_n)$.
In fact, if K is a finite extension of F and $K = F(a_1, \ldots, a_n)$, then $a_1, \ldots, a_n$ have to be algebraic over F. This is a consequence of a simple but important little theorem:
Theorem 3 If K is a finite extension of F, every element of K is algebraic over F.
Proof: Indeed, suppose K is of degree n over F, and let c be any element of K. Then the set $\{1, c, c^2, \ldots, c^n\}$ is linearly dependent, because it has n + 1 elements in a vector space K of dimension n. Consequently, there are scalars $a_0, \ldots, a_n \in F$, not all zero, such that $a_0 + a_1 c + \cdots + a_n c^n = 0$. Therefore c is a root of the polynomial $a(x) = a_0 + a_1 x + \cdots + a_n x^n$ in F[x]. ■
Let us sum up: Every iterated extension $F(c_1, \ldots, c_n)$, where $c_1, \ldots, c_n$ are algebraic over F, is a finite extension of F. Conversely, every finite extension of F is an iterated extension $F(c_1, \ldots, c_n)$, where $c_1, \ldots, c_n$ are algebraic over F.
Here is an example of the concepts presented in this chapter. We have already seen that $\mathbb{Q}(\sqrt{2})$ is of degree 2 over $\mathbb{Q}$, and therefore $\mathbb{Q}(\sqrt{2})$ consists of all the numbers $a + b\sqrt{2}$ where $a, b \in \mathbb{Q}$. Observe that $\sqrt{3}$ cannot be in $\mathbb{Q}(\sqrt{2})$; for if it were, we would have $\sqrt{3} = a + b\sqrt{2}$ for rational a and b; squaring both sides and solving for $\sqrt{2}$ would give us $\sqrt{2}$ = a rational number, which is impossible.
Since $\sqrt{3}$ is not in $\mathbb{Q}(\sqrt{2})$, $\sqrt{3}$ cannot be a root of a polynomial of degree 1 over $\mathbb{Q}(\sqrt{2})$ (such a polynomial would have to be $x - \sqrt{3}$). But $\sqrt{3}$ is a root of $x^2 - 3$, which is therefore the minimum polynomial of $\sqrt{3}$ over $\mathbb{Q}(\sqrt{2})$. Thus, $\mathbb{Q}(\sqrt{2}, \sqrt{3})$ is of degree 2 over $\mathbb{Q}(\sqrt{2})$, and therefore by Theorem 2, $\mathbb{Q}(\sqrt{2}, \sqrt{3})$ is of degree 4 over $\mathbb{Q}$.
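The lemma of this chapter says that the products of the basis $\{1, \sqrt{2}\}$ with the basis $\{1, \sqrt{3}\}$, namely $1, \sqrt{2}, \sqrt{3}, \sqrt{6}$, form a basis of $\mathbb{Q}(\sqrt{2}, \sqrt{3})$ over $\mathbb{Q}$. The sketch below (an illustration only; the dictionary representation and the name `mul` are assumptions) computes with coordinates relative to that basis and checks that $x = \sqrt{2} + \sqrt{3}$ satisfies $x^4 - 10x^2 + 1 = 0$, a polynomial of degree 4, in keeping with the degree found above.

```python
from fractions import Fraction
from itertools import product

# A key (i, j) with i, j in {0, 1} stands for the basis element sqrt(2)**i * sqrt(3)**j,
# so (0, 0) is 1, (1, 0) is sqrt 2, (0, 1) is sqrt 3, and (1, 1) is sqrt 6.
def mul(u, v):
    """Multiply two elements given as {basis key: rational coefficient}."""
    out = {key: Fraction(0) for key in product((0, 1), repeat=2)}
    for (i1, j1), x in u.items():
        for (i2, j2), y in v.items():
            coeff = Fraction(x) * Fraction(y)
            i, j = i1 + i2, j1 + j2
            if i == 2:            # sqrt(2)^2 = 2
                coeff *= 2
                i = 0
            if j == 2:            # sqrt(3)^2 = 3
                coeff *= 3
                j = 0
            out[(i, j)] += coeff
    return out

x = {(1, 0): 1, (0, 1): 1}        # sqrt 2 + sqrt 3
x2 = mul(x, x)                    # 5 + 2*sqrt 6
x4 = mul(x2, x2)                  # 49 + 20*sqrt 6
# Coordinates of x^4 - 10 x^2 + 1 in the four-element basis.
value = {k: x4[k] - 10 * x2[k] + (1 if k == (0, 0) else 0) for k in x4}
print(value)                      # every coordinate is zero
```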
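By the comments preceding Theorem 1, $\{1, \sqrt{2}\}$ is a basis of $\mathbb{Q}(\sqrt{2})$ over $\mathbb{Q}$, and $\{1, \sqrt{3}\}$ is a basis of $\mathbb{Q}(\sqrt{2}, \sqrt{3})$ over $\mathbb{Q}(\sqrt{2})$.
) = $F(\sqrt{a}, \sqrt{b})$.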
2 Prove that if $b \ne x^2 a$ for any $x \in F$, then $\sqrt{b} \notin F(\sqrt{a})$. Conclude that $F(\sqrt{a}, \sqrt{b})$ is of degree 4 over F.
3 Show that $x = \sqrt{a} + \sqrt{b}$ satisfies $x^4 - 2(a + b)x^2 + (a - b)^2 = 0$. Show that $x = \sqrt{a + b + 2\sqrt{ab}}$ also satisfies this equation. Conclude that
$$F\left(\sqrt{a + b + 2\sqrt{ab}}\right) = F(\sqrt{a}, \sqrt{b})$$
4 Using parts 1 to 3, find an uncomplicated basis for $\mathbb{Q}(d)$ over $\mathbb{Q}$, where d is a root of $x^4 - 14x^2 + 9$. Then find a basis for $\mathbb{Q}\left(\sqrt{7 + 2\sqrt{10}}\right)$ over $\mathbb{Q}$.
C. Finite Extensions of Finite Fields
By the proof of the basic theorem of field extensions, if p(x) is an irreducible polynomial of degree n in F[x], then $F[x]/\langle p(x) \rangle \cong F(c)$ where c is a root of p(x). By Theorem 1 in this chapter, F(c) is of degree n over F. Using the paragraph preceding Theorem 1:
1 Prove that every element of F(c) can be written uniquely as $a_0 + a_1 c + \cdots + a_{n-1} c^{n-1}$, for some $a_0, \ldots, a_{n-1} \in F$.
# 2 Construct a field of four elements. (It is to be an extension of $\mathbb{Z}_2$.) Describe its elements, and supply its addition and multiplication tables.
3 Construct a field of eight elements. (It is to be an extension of $\mathbb{Z}_2$.)
4 Prove that if F has q elements, and a is algebraic over F of degree n, then F(a) has $q^n$ elements.
5 Prove that for every prime number p, there is an irreducible quadratic in $\mathbb{Z}_p[x]$. Conclude that for every prime p, there is a field with $p^2$ elements.
D. Degrees of Extensions (Applications of Theorem 2)
Let F be a field, and K a field extension of F. Prove the following:
1 [K : F] = 1 iff K = F.
# 2 If [K : F] is a prime number, there is no field properly between F and K (that is, there is no field L such that $F \subsetneq L \subsetneq K$).
3 If [K : F] is a prime, then K = F(a) for every $a \in K - F$.
4 Suppose $a, b \in K$ are algebraic over F with degrees m and n, where m and n are relatively prime. Then:
(a) F(a, b) is of degree mn over F. (b) $F(a) \cap F(b) = F$.
5 If the degree of F(a) over F is a prime, then $F(a) = F(a^n)$ for any n (on the condition that $a^n \notin F$).
6 If an irreducible polynomial $p(x) \in F[x]$ has a root in K, then $\deg p(x) \mid [K : F]$.
E. Short Questions Relating to Degrees of Extensions
Let F be a field. Prove parts 1-3:
1 The degree of a over F is the same as the degree of 1/a over F. It is also the same as the degrees of a + c and ac over F, for any $c \in F$.
2 a is of degree 1 over F iff $a \in F$.
3 If a real number c is a root of an irreducible polynomial of degree > 1 in $\mathbb{Q}[x]$, then c is irrational.
4 Use part 3 and Eisenstein's irreducibility criterion to prove that $\sqrt{m/n}$ (where $m, n \in \mathbb{Z}$) is irrational if there is a prime number which divides m but not n, and whose square does not divide m.
5 Show that part 4 remains true for $\sqrt[q]{m/n}$, where q > 1.
6 If a and b are algebraic over F, prove that F(a, b) is a finite extension of F.
† F. Further Properties of Degrees of Extensions
Let F be a field, and K a finite extension of F. Prove each of the following:
1 Any element algebraic over K is algebraic over F, and conversely.
2 If b is algebraic over K, then $[F(b) : F] \mid [K(b) : F]$.
3 If b is algebraic over K, then $[K(b) : K] \le [F(b) : F]$. (Hint: The minimum polynomial of b over F may factor in K[x], and b will then be a root of one of its irreducible factors.)
# 4 If b is algebraic over K, then $[K(b) : F(b)] \le [K : F]$. [Hint: Note that $F \subseteq K \subseteq K(b)$ and $F \subseteq F(b) \subseteq K(b)$. Relate the degrees of the four extensions involved here, using part 3.]
# 5 Let p(x) be irreducible in F[x]. If [K : F] and deg p(x) are relatively prime, then p(x) is irreducible in K[x].
† G. Fields of Algebraic Elements: Algebraic Numbers
Let $F \subseteq K$ and $a, b \in K$. We have seen on page 295 that if a and b are algebraic over F, then F(a, b) is a finite extension of F. Use the above to prove parts 1 and 2.
1 If a and b are algebraic over F, then a + b, a - b, ab, and a/b are algebraic over F. (In the last case, assume $b \ne 0$.)
2 The set $\{x \in K : x \text{ is algebraic over } F\}$ is a subfield of K, containing F.
Any complex number which is algebraic over Q is called an algebraic number. By part 2, the set of all the algebraic numbers is a field, which we shall designate by A.
Let $a(x) = a_0 + a_1 x + \cdots + a_n x^n$ be in A[x], and let c be any root of a(x). We will prove that $c \in A$. To begin with, all the coefficients of a(x) are in $\mathbb{Q}(a_0, a_1, \ldots, a_n)$.
3 Prove: $\mathbb{Q}(a_0, a_1, \ldots, a_n)$ is a finite extension of $\mathbb{Q}$.
Let $\mathbb{Q}(a_0, \ldots, a_n) = \mathbb{Q}_1$. Since $a(x) \in \mathbb{Q}_1[x]$, c is algebraic over $\mathbb{Q}_1$. Prove parts 4 and 5:
4 $\mathbb{Q}_1(c)$ is a finite extension of $\mathbb{Q}_1$, hence a finite extension of $\mathbb{Q}$. (Why?)
5 $c \in A$.
Conclusion: The roots of any polynomial whose coefficients are algebraic numbers are themselves algebraic numbers.
A field F is called algebraically closed if the roots of every polynomial in F[x] are in F. We have thus proved that A is algebraically closed.
CHAPTER
THIRTY
RULER AND COMPASS
The ancient Greek geometers considered the circle and straight line to be the most basic of all geometric figures, other figures being merely variants and combinations of these basic ones. To understand this view we must remember that construction played a very important role in Greek geometry: when a figure was defined, a method was also given for constructing it. Certainly the circle and the straight line are the easiest figures to construct, for they require only the most rudimentary of all geometric instruments: the ruler and the compass. Furthermore, the ruler, in this case, is a simple, unmarked straightedge.
Rudimentary as these instruments may be, they can be used to carry out a surprising variety of geometric constructions. Lines can be divided into any number of equal segments, and any angle can be bisected. From any polygon it is possible to construct a square having the same area, or twice or three times the area. With amazing ingenuity, Greek geometers devised ways to cleverly use the ruler and compass, unaided by any other instrument, to perform all kinds of intricate and beautiful constructions. They were so successful that it was hard to believe they were unable to perform three little tasks which, at first sight, appear to be very simple: doubling the cube, trisecting any angle, and squaring the circle. The first task demands that a cube be constructed having twice the volume of a given cube. The second asks that any angle be divided into three equal parts. The third requires the construction of a square whose area is equal to that of a given circle. Remember, only a ruler and compass are to be used!
Mathematicians, in Greek antiquity and throughout the Renaissance, devoted a great deal of attention to these problems, and came up with many brilliant ideas. But they never found ways of performing the above three constructions. This is not surprising, for these constructions are impossible! Of course, the Greeks had no way of knowing that fact, for the mathematical machinery needed to prove that these constructions are impossible—in fact, the very notion that one could prove a construction to be impossible—was still two millennia away.
The final resolution of these problems, by proving that the required constructions are impossible, came from a most unlikely source: it was a by-product of the arcane study of field extensions, in the upper reaches of modern algebra.
To understand how all this works, we will see how the process of ruler-and-compass constructions can be placed in the framework of field theory. Clearly, we will be making use of analytic geometry.
If $\mathcal{A}$ is any set of points in the plane, consider operations of the following two kinds:
1. Ruler operation: Through any two points in $\mathcal{A}$, draw a straight line.
2. Compass operation: Given three points A, B, and C in $\mathcal{A}$, draw a circle with center C and radius equal in length to the segment AB.
The points of intersection of any two of these figures (line-line, line-circle, or circle-circle) are said to be constructible in one step from $\mathcal{A}$. A point P is called constructible from $\mathcal{A}$ if there are points $P_1, P_2, \ldots, P_n = P$ such that $P_1$ is constructible in one step from $\mathcal{A}$, $P_2$ is constructible in one step from $\mathcal{A} \cup \{P_1\}$, and so on, so that $P_i$ is constructible in one step from $\mathcal{A} \cup \{P_1, \ldots, P_{i-1}\}$.
As a simple example, let us see that the midpoint of a line segment AB is constructible from the two points A and B in the above sense. Well, given A and B, first draw the line AB. Then, draw the circle with center A and radius AB and the circle with center B and radius AB; let C and D be the points of intersection of these circles. C and D are constructible in one step from {A, B}. Finally, draw the line through C and D; the intersection of this line with AB is the required midpoint. It is constructible from {A, B}.
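The midpoint construction can be imitated step by step with coordinates. The following sketch is only an illustration using floating-point arithmetic: the function names `circle_circle` and `line_line` and the sample points are assumptions, not notation from the text.

```python
import math

def circle_circle(c1, r1, c2, r2):
    """Two intersection points of two circles (assumes they cross in two points)."""
    (x1, y1), (x2, y2) = c1, c2
    d = math.hypot(x2 - x1, y2 - y1)
    a = (r1 ** 2 - r2 ** 2 + d ** 2) / (2 * d)   # distance from c1 to the common chord
    h = math.sqrt(r1 ** 2 - a ** 2)              # half the length of the common chord
    mx, my = x1 + a * (x2 - x1) / d, y1 + a * (y2 - y1) / d
    ox, oy = -(y2 - y1) * h / d, (x2 - x1) * h / d
    return (mx + ox, my + oy), (mx - ox, my - oy)

def line_line(p1, p2, p3, p4):
    """Intersection of the line p1p2 with the line p3p4 (assumes they are not parallel)."""
    (x1, y1), (x2, y2), (x3, y3), (x4, y4) = p1, p2, p3, p4
    den = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    px = ((x1 * y2 - y1 * x2) * (x3 - x4) - (x1 - x2) * (x3 * y4 - y3 * x4)) / den
    py = ((x1 * y2 - y1 * x2) * (y3 - y4) - (y1 - y2) * (x3 * y4 - y3 * x4)) / den
    return px, py

A, B = (0.0, 0.0), (4.0, 2.0)
r = math.dist(A, B)                    # compass opening equal to the segment AB
C, D = circle_circle(A, r, B, r)       # compass operations: the two circles meet at C and D
M = line_line(A, B, C, D)              # ruler operations: line AB meets line CD
print(M)                               # approximately the midpoint of AB
```

Each call corresponds to one of the allowed intersections (circle-circle, then line-line), so M is reached from {A, B} in two steps.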
As this example shows, the notion of constructible points is the correct formalization of the intuitive idea of ruler-and-compass constructions.
We call a point in the plane constructible if it is constructible from $\mathbb{Q} \times \mathbb{Q}$, that is, from the set of all points in the plane with rational coordinates.
How does field theory fit into this scheme? Obviously by associating with every point its coordinates. More exactly, with every constructible point P we associate a certain field extension of $\mathbb{Q}$, obtained as follows:
Suppose P has coordinates (a, b) and is constructed from $\mathbb{Q} \times \mathbb{Q}$ in one step. We associate with P the field $\mathbb{Q}(a, b)$, obtained by adjoining to $\mathbb{Q}$ the coordinates of P. More generally, suppose P is constructible from $\mathbb{Q} \times \mathbb{Q}$ in n steps: there are then n points $P_1, P_2, \ldots, P_n = P$ such that each $P_i$ is constructible in one step from $\mathbb{Q} \times \mathbb{Q} \cup \{P_1, \ldots, P_{i-1}\}$. Let the coordinates of $P_1, \ldots, P_n$ be $(a_1, b_1), \ldots, (a_n, b_n)$, respectively. With the points $P_1, \ldots, P_n$ we associate fields $K_1, \ldots, K_n$ where $K_1 = \mathbb{Q}(a_1, b_1)$, and for each i > 1,
$$K_i = K_{i-1}(a_i, b_i)$$
Thus, $K_1 = \mathbb{Q}(a_1, b_1)$, $K_2 = K_1(a_2, b_2)$, and so on: beginning with $\mathbb{Q}$, we adjoin first the coordinates of $P_1$, then the coordinates of $P_2$, and so on successively, yielding the sequence of extensions
$$\mathbb{Q} \subseteq K_1 \subseteq K_2 \subseteq \cdots \subseteq K_n = K$$
We call K the field extension associated with the point P.
Everything we will have to say in the sequel follows easily from the next lemma.
Lemma If $K_1, \ldots, K_n$ are as defined previously, then $[K_i : K_{i-1}] = 1$, 2, or 4.
Proof: Remember that $K_{i-1}$ already contains the coordinates of $P_1, \ldots, P_{i-1}$, and $K_i$ is obtained by adjoining to $K_{i-1}$ the coordinates $x_i, y_i$ of $P_i$. But $P_i$ is constructible in one step from $\mathbb{Q} \times \mathbb{Q} \cup \{P_1, \ldots, P_{i-1}\}$, so we must consider three cases, corresponding to the three kinds of intersection which may produce $P_i$, namely: line intersects line, line intersects circle, and circle intersects circle.
Line intersects line: Suppose one line passes through the points $(a_1, a_2)$ and $(b_1, b_2)$, and the other line passes through $(c_1, c_2)$ and $(d_1, d_2)$. We may write equations for these lines in terms of the constants $a_1, a_2, b_1, b_2, c_1, c_2$ and $d_1, d_2$ (all of which are in $K_{i-1}$), and then solve these equations simultaneously to give the coordinates x, y of the point of intersection. Clearly, these values of x and y are expressed in terms of $a_1, a_2, b_1, b_2, c_1, c_2, d_1, d_2$, hence are still in $K_{i-1}$. Thus, $K_i = K_{i-1}$.
Line intersects circle: Consider the line AB and the circle with center C and radius equal to the distance k = DE. Let A, B, C have coordinates $(a_1, a_2)$, $(b_1, b_2)$, and $(c_1, c_2)$, respectively. By hypothesis, $K_{i-1}$ contains the numbers $a_1, a_2, b_1, b_2, c_1, c_2$, as well as $k^2$ = the square of the distance DE. (To understand the last assertion, remember that $K_{i-1}$ contains the coordinates of D and E; see the figure and use the Pythagorean theorem.)
Now, the line AB has equation
$$\frac{y - b_2}{x - b_1} = \frac{b_2 - a_2}{b_1 - a_1} \qquad (1)$$
and the circle has equation
$$(x - c_1)^2 + (y - c_2)^2 = k^2 \qquad (2)$$
Solving for y in (1) and substituting into (2) gives
$$(x - c_1)^2 + \left(\frac{b_2 - a_2}{b_1 - a_1}(x - b_1) + b_2 - c_2\right)^2 = k^2$$
This is obviously a quadratic equation, and its roots are the x coordinates of S and T. Thus, the x coordinates of both points of intersection are roots of a quadratic polynomial with coefficients in $K_{i-1}$. The same is true of the y coordinates. Thus, if $K_i = K_{i-1}(x_i, y_i)$ where $(x_i, y_i)$ is one of the points of intersection, then
$$[K_{i-1}(x_i, y_i) : K_{i-1}] = [K_{i-1}(x_i, y_i) : K_{i-1}(x_i)][K_{i-1}(x_i) : K_{i-1}] = 2 \times 2 = 4$$
{This assumes that $x_i, y_i \notin K_{i-1}$. If either $x_i$ or $y_i$ or both are already in $K_{i-1}$, then $[K_{i-1}(x_i, y_i) : K_{i-1}] = 1$ or 2.}
Circle intersects circle: Suppose the two circles have equations
$$x^2 + y^2 + ax + by + c = 0 \qquad (3)$$
and
$$x^2 + y^2 + dx + ey + f = 0 \qquad (4)$$
Then both points of intersection satisfy
$$(a - d)x + (b - e)y + (c - f) = 0 \qquad (5)$$
obtained simply by subtracting (4) from (3). Thus, x and y may be found by solving (4) and (5) simultaneously, which is exactly the preceding case. ■
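The reduction of the circle-circle case to the line-circle case can be sketched with exact rational arithmetic. In the illustration below, a circle $x^2 + y^2 + ax + by + c = 0$ is stored as the triple (a, b, c); the function names and the two sample circles are assumptions chosen for the example, and the second function assumes $b \ne e$ so that the line (5) can be solved for y.

```python
from fractions import Fraction

def radical_line(circ1, circ2):
    """Subtract circle (4) from circle (3): returns (A, B, C) with A*x + B*y + C = 0."""
    a, b, c = circ1
    d, e, f = circ2
    return (a - d, b - e, c - f)

def x_quadratic(circ1, circ2):
    """Coefficients (leading first) of the quadratic in x satisfied by both
    intersection points, found by substituting line (5) into circle (4).
    Assumes b != e."""
    A, B, C = (Fraction(v) for v in radical_line(circ1, circ2))
    d, e, f = (Fraction(v) for v in circ2)
    m, t = -A / B, -C / B                    # line (5) rewritten as y = m*x + t
    return (1 + m * m, 2 * m * t + d + e * m, t * t + e * t + f)

circle3 = (0, 0, -4)      # x^2 + y^2 - 4 = 0: center (0, 0), radius 2
circle4 = (-4, -4, 4)     # x^2 + y^2 - 4x - 4y + 4 = 0: center (2, 2), radius 2
print(radical_line(circle3, circle4))   # the line 4x + 4y - 8 = 0
print(x_quadratic(circle3, circle4))    # 2x^2 - 4x = 0, so x = 0 or x = 2
```

All coefficients stay in the field generated by the circles' coefficients, so the intersection coordinates are roots of quadratics over that field, as the lemma requires.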
We are now in a position to prove the main result of this chapter:
Theorem 1: Basic theorem on constructible points If the point with coordinates (a, b) is constructible, then the degree of $\mathbb{Q}(a)$ over $\mathbb{Q}$ is a power of 2, and likewise for the degree of $\mathbb{Q}(b)$ over $\mathbb{Q}$.
Proof: Let P be a constructible point; by definition, there are points $P_1, \ldots, P_n$ with coordinates $(a_1, b_1), \ldots, (a_n, b_n)$ such that each $P_i$ is constructible in one step from $\mathbb{Q} \times \mathbb{Q} \cup \{P_1, \ldots, P_{i-1}\}$, and $P_n = P$. Let the fields associated with $P_1, \ldots, P_n$ be $K_1, \ldots, K_n$. Then
$$[K_n : \mathbb{Q}] = [K_n : K_{n-1}][K_{n-1} : K_{n-2}] \cdots [K_1 : \mathbb{Q}]$$
and by the preceding lemma this is a power of 2, say $2^m$. But
$$[K_n : \mathbb{Q}] = [K_n : \mathbb{Q}(a)][\mathbb{Q}(a) : \mathbb{Q}]$$
hence $[\mathbb{Q}(a) : \mathbb{Q}]$ is a factor of $2^m$, hence also a power of 2. ■
We will now use this theorem to prove that ruler-and-compass constructions cannot possibly exist for the three classical problems described in the opening to this chapter.
Theorem 2 "Doubling the cube" is impossible by ruler and compass.
Proof: Let us place the cube on a coordinate system so that one edge of the cube coincides with the unit interval on the x axis. That is, its endpoints are (0, 0) and (1, 0).
[Figure: a cube with one edge on the x axis, with endpoints (0, 0) and (1, 0).]
If we were able to double the cube by ruler and compass, this means we could construct a point (c, 0) such that $c^3 = 2$. However, by Theorem 1, $[\mathbb{Q}(c) : \mathbb{Q}]$ would have to be a power of 2, whereas in fact it is 3, because $x^3 - 2$ is irreducible over $\mathbb{Q}$ (for example, by Eisenstein's criterion with the prime 2) and is therefore the minimum polynomial of c. This contradiction proves that it is impossible to double the cube using only a ruler and compass. ■
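The claim that the degree is 3 rests on $x^3 - 2$ being irreducible over $\mathbb{Q}$. A cubic that factors over $\mathbb{Q}$ must have a linear factor, hence a rational root, so it is enough to check that there is none. The sketch below does this with the rational root test; the helper names are assumptions made for the illustration, and the function assumes integer coefficients with nonzero constant and leading terms.

```python
from fractions import Fraction

def divisors(n):
    """Positive divisors of a nonzero integer."""
    n = abs(n)
    return [d for d in range(1, n + 1) if n % d == 0]

def rational_roots(coeffs):
    """Rational roots of an integer polynomial, coefficients listed constant term first.

    Any rational root p/q in lowest terms has p dividing the constant term
    and q dividing the leading coefficient.
    """
    candidates = {Fraction(s * p, q)
                  for p in divisors(coeffs[0])
                  for q in divisors(coeffs[-1])
                  for s in (1, -1)}
    return sorted(r for r in candidates
                  if sum(c * r ** k for k, c in enumerate(coeffs)) == 0)

print(rational_roots([-2, 0, 0, 1]))   # x^3 - 2: no rational roots
print(rational_roots([-8, 0, 0, 1]))   # x^3 - 8: the root 2, for comparison
```

Since $x^3 - 2$ has no rational root, it is the minimum polynomial of c, and 3 is not a power of 2.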
Theorem 3 "Trisecting the angle" by ruler and compass is impossible. That is, there exist angles which cannot be trisected using a ruler and compass.
Proof: We will show specifically that an angle of 60° cannot be trisected. If we could trisect an angle of 60°, we would be able to construct a point (c, 0) (see figure) where c = cos 20°; hence certainly we could construct (b, 0) where b = 2 cos 20°.
[Figure: the point (c, 0).]
But from elementary trigonometry
$$\cos 3\theta = 4\cos^3\theta - 3\cos\theta$$
hence
$$\cos 60^\circ = 4\cos^3 20^\circ - 3\cos 20^\circ$$
Thus, b = 2 cos 20° satisfies $b^3 - 3b - 1 = 0$. The polynomial
$$p(x) = x^3 - 3x - 1$$
is irreducible over $\mathbb{Q}$ because $p(x + 1) = x^3 + 3x^2 - 3$ is irreducible by Eisenstein's criterion. It follows that $\mathbb{Q}(b)$ has degree 3 over $\mathbb{Q}$, contradicting the requirement (in Theorem 1) that this degree has to be a power of 2. ■
Theorem 4 "Squaring the circle" by ruler and compass is impossible.
Proof: If we were able to square the circle by ruler and compass, it would be possible to construct the point $(0, \sqrt{\pi})$; hence by Theorem 1, $[\mathbb{Q}(\sqrt{\pi}) : \mathbb{Q}]$ would be a power of 2. But it is well known that $\pi$ is transcendental over $\mathbb{Q}$. By Theorem 3 of Chapter 29, the square of an algebraic element is algebraic; hence $\sqrt{\pi}$ is transcendental. It follows that $\mathbb{Q}(\sqrt{\pi})$ is not even a finite extension of $\mathbb{Q}$, much less an extension of some degree $2^m$ as required. ■
EXERCISES
† A. Constructible Numbers
If O and I are any two points in the plane, consider a coordinate system such that the interval OI coincides with the unit interval on the x axis. Let D be the set of real numbers such that $a \in D$ iff the point (a, 0) is constructible from {O, I}. Prove the following:
1 If $a, b \in D$, then $a + b \in D$ and $a - b \in D$.
2 If $a, b \in D$, then $ab \in D$. (Hint: Use similar triangles. See the accompanying figure.)
3 If $a, b \in D$, then $a/b \in D$. (Use the same figure as in part 2.)
4 If a > 0 and $a \in D$, then $\sqrt{a} \in D$. (Hint: In the accompanying figure, AB is the diameter of a circle. Use an elementary property of chords of a circle to show that $x = \sqrt{a}$.)
It follows from parts 1 to 4 that D is a field, closed with respect to taking square roots of positive numbers. D is called the field of constructible numbers.
5 $\mathbb{Q} \subseteq D$.
6 If a is a real root of any quadratic polynomial with coefficients in D, then $a \in D$. (Hint: Complete the square and use part 4.)
† B. Constructible Points and Constructible Numbers
Prove each of the following:
1 Let $\mathcal{A}$ be any set of points in the plane; (a, b) is constructible from $\mathcal{A}$ iff (a, 0) and (0, b) are constructible from $\mathcal{A}$.
2 If a point P is constructible from {O, I} [that is, from (0, 0) and (1, 0)], then P is constructible from $\mathbb{Q} \times \mathbb{Q}$.
# 3 Every point in $\mathbb{Q} \times \mathbb{Q}$ is constructible from {O, I}. (Use Exercise A5 and the definition of D.)
4 If a point P is constructible from $\mathbb{Q} \times \mathbb{Q}$, it is constructible from {O, I}.
By combining parts 2 and 4, we get the following important fact: Any point P is constructible from $\mathbb{Q} \times \mathbb{Q}$ iff P is constructible from {O, I}. Thus, we may define a point to be constructible iff it is constructible from {O, I}.
5 A point P is constructible iff both its coordinates are constructible numbers.
† C. Constructible Angles
An angle $\alpha$ is called constructible iff there exist constructible points A, B, and C such that $\angle ABC = \alpha$. Prove the following:
1 The angle $\alpha$ is constructible iff $\sin \alpha$ and $\cos \alpha$ are constructible numbers.
2 $\cos \alpha \in D$ iff $\sin \alpha \in D$.
3 If $\cos \alpha, \cos \beta \in D$, then $\cos(\alpha + \beta), \cos(\alpha - \beta) \in D$.
4 $\cos(2\alpha) \in D$ iff $\cos \alpha \in D$.
5 If $\alpha$ and $\beta$ are constructible angles, so are $\alpha + \beta$, $\alpha - \beta$, $\frac{1}{2}\alpha$, and $n\alpha$ for any positive integer n.
# 6 The following angles are constructible: 30°, 75°, 222°.
7 The following angles are not constructible: 20°, 40°, 140°. (Hint: Use the proof of Theorem 3.)
D. Constructible Polygons
A polygon is called constructible iff its vertices are constructible points. Prove the following:
# 1 The regular n-gon is constructible iff the angle $2\pi/n$ is constructible.
2 The regular hexagon is constructible.
3 The regular polygon of nine sides is not constructible.
† E. A Constructible Polygon
We will show that $2\pi/5$ is a constructible angle, and it will follow that the regular pentagon is constructible.
1 If r = cos k + i sin k is a complex number, prove that 1/r = cos k - i sin k. Conclude that r + 1/r = 2 cos k.
By de Moivre's theorem,
$$\omega = \cos\frac{2\pi}{5} + i\sin\frac{2\pi}{5}$$
is a complex fifth root of unity. Since
$$x^5 - 1 = (x - 1)(x^4 + x^3 + x^2 + x + 1)$$
$\omega$ is a root of $p(x) = x^4 + x^3 + x^2 + x + 1$.
2 Prove that $\omega^2 + \omega + 1 + \omega^{-1} + \omega^{-2} = 0$.
3 Prove that
$$4\cos^2\frac{2\pi}{5} + 2\cos\frac{2\pi}{5} - 1 = 0$$
(Hint: Use parts 1 and 2.) Conclude that $\cos(2\pi/5)$ is a root of the quadratic $4x^2 + 2x - 1$.
4 Use part 3 and A6 to prove that $\cos(2\pi/5)$ is a constructible number.
5 Prove that $2\pi/5$ is a constructible angle.
6 Prove that the regular pentagon is constructible.
† F. A Nonconstructible Polygon
By de Moivre's theorem,
$$\omega = \cos\frac{2\pi}{7} + i\sin\frac{2\pi}{7}$$
is a complex seventh root of unity. Since
$$x^7 - 1 = (x - 1)(x^6 + x^5 + x^4 + x^3 + x^2 + x + 1)$$
$\omega$ is a root of $x^6 + x^5 + x^4 + x^3 + x^2 + x + 1$.
1 Prove that $\omega^3 + \omega^2 + \omega + 1 + \omega^{-1} + \omega^{-2} + \omega^{-3} = 0$.
2 Prove that
$$8\cos^3\frac{2\pi}{7} + 4\cos^2\frac{2\pi}{7} - 4\cos\frac{2\pi}{7} - 1 = 0$$
(Use part 1 and Exercise E1.) Conclude that $\cos(2\pi/7)$ is a root of $8x^3 + 4x^2 - 4x - 1$.
3 Prove that $8x^3 + 4x^2 - 4x - 1$ has no rational roots. Conclude that it is irreducible over $\mathbb{Q}$.
4 Conclude from part 3 that $\cos(2\pi/7)$ is not a constructible number.
5 Prove that $2\pi/7$ is not a constructible angle.
6 Prove that the regular polygon of seven sides is not constructible.
G. Further Properties of Constructible Numbers and Figures
Prove each of the following:
1 If the number a is a root of an irreducible polynomial $p(x) \in \mathbb{Q}[x]$ whose degree is not a power of 2, then a is not a constructible number.
2 Any constructible number can be obtained from rational numbers by repeated addition, subtraction, multiplication, division, and taking square roots of positive numbers.
3 D is the smallest field extension of $\mathbb{Q}$ closed with respect to square roots of positive numbers (that is, any field extension of $\mathbb{Q}$ closed with respect to square roots contains D). (Use part 2 and Exercise A.)
4 All the roots of the polynomial $x^4 - 3x^2 + 1$ are constructible numbers.
A line is called constructible if it passes through two constructible points. A circle is called constructible if its center and radius are constructible.
5 The line ax + by + c = 0 is constructible if $a, b, c \in D$.
6 The circle $x^2 + y^2 + ax + by + c = 0$ is constructible if $a, b, c \in D$.
CHAPTER
THIRTY-ONE
GALOIS THEORY: PREAMBLE
Field extensions were used in Chapter 30 to settle some of the most puzzling questions of classical geometry. Now they will be used to solve a problem equally ancient and important: they will give us a definite and elegant theory of solutions of polynomial equations.
We will be concerned not so much with finding solutions (which is a problem of computation) as with the nature and properties of these solutions. As we shall discover, these properties turn out to depend less on the polynomials themselves than on the fields which contain their solutions. This fact should be kept in mind if we want to clearly understand the discussions in this chapter and Chapter 32. We will be speaking of field extensions, but polynomials will always be lurking in the background. Every extension will be generated by roots of a polynomial, and every theorem about these extensions will actually be saying something about the polynomials.
Let us quickly review what we already know of field extensions, filling in a gap or two as we go along. Let F be a field; an element a (in an extension of F) is algebraic over F if a is a root of some polynomial with its coefficients in F. The minimum polynomial of a over F is the monic polynomial of lowest degree in F[x] having a as a root; every other polynomial in F[x] having a as a root is a multiple of the minimum polynomial.
The basic theorem of field extensions tells us that any polynomial of degree n in F[x] has exactly n roots in a suitable extension of F. However, this does not necessarily mean n distinct roots. For example, in $\mathbb{R}[x]$ the polynomial $(x - 2)^5$ has five roots all equal to 2. Such roots are
called multiple roots. It is perfectly obvious that we can come up with polynomials such as $(x - 2)^5$ having multiple roots; but are there any irreducible polynomials with multiple roots? Certainly the answer is not obvious. Here it is:
Theorem 1 If F has characteristic 0, irreducible polynomials over F can never have multiple roots.
Proof: To prove this, we must define the derivative of the polynomial $a(x) = a_0 + a_1 x + \cdots + a_n x^n$. It is $a'(x) = a_1 + 2a_2 x + \cdots + n a_n x^{n-1}$. As in elementary calculus, it is easily checked that for any two polynomials f(x) and g(x),
$$(f + g)' = f' + g' \quad\text{and}\quad (fg)' = fg' + f'g$$
Now suppose a(x) is irreducible in F[x] and has a multiple root c: then in a suitable extension we can factor a(x) as $a(x) = (x - c)^2 q(x)$, and therefore $a'(x) = 2(x - c)q(x) + (x - c)^2 q'(x)$. So x - c is a factor of a'(x), and therefore c is a root of a'(x). Let p(x) be the minimum polynomial of c over F; since both a(x) and a'(x) have c as a root, they are both multiples of p(x).
But a(x) is irreducible: its only nonconstant divisor is itself; so p(x) must be a(x). However, a(x) cannot divide a'(x) unless a'(x) = 0 because a'(x) is of lower degree than a(x). So a'(x) = 0 and therefore its coefficient $n a_n$ is 0. Here is where characteristic 0 comes in: if $n a_n = 0$ then $a_n = 0$, and this is impossible because $a_n$ is the leading coefficient of a(x). ■
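The proof shows that a multiple root of a(x) is always a root of a'(x) as well. A standard computational restatement (assumed here, not stated in the text) is that, over a field of characteristic 0, a(x) has a multiple root exactly when gcd(a, a') is nonconstant. The sketch below tests this over Q with exact rational arithmetic; polynomials are stored as coefficient lists, lowest degree first, and all function names are illustrative.

```python
from fractions import Fraction

def trim(p):
    """Drop zero coefficients from the high-degree end."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def derivative(p):
    """Formal derivative: a_1 + 2 a_2 x + ... + n a_n x^(n-1)."""
    return trim([k * c for k, c in enumerate(p)][1:])

def poly_mod(a, b):
    """Remainder of a divided by b (b nonzero), coefficients in Q."""
    a, b = trim(a), trim(b)
    while len(a) >= len(b):
        q = Fraction(a[-1]) / Fraction(b[-1])
        shift = len(a) - len(b)
        for i, c in enumerate(b):
            a[i + shift] -= q * c  # cancels the leading term of a
        a = trim(a)
    return a

def poly_gcd(a, b):
    """Monic gcd of a and b by the Euclidean algorithm."""
    a, b = trim(a), trim(b)
    while b:
        a, b = b, poly_mod(a, b)
    return [Fraction(c) / Fraction(a[-1]) for c in a]

def has_multiple_root(p):
    """True when gcd(p, p') is nonconstant (characteristic 0 assumed)."""
    return len(poly_gcd(p, derivative(p))) > 1

irreducible = [-2, 0, 1]                   # x^2 - 2
repeated = [-32, 80, -80, 40, -10, 1]      # (x - 2)^5
```

For `irreducible` the gcd is the constant 1, in agreement with Theorem 1, while for `repeated` the gcd is $(x - 2)^4$.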
In the remaining three chapters we will confine our attention to fields of characteristic 0. Thus, by Theorem 1, any irreducible polynomial of degree n has n distinct roots.
Let us move on with our review. Let E be an extension of F. We call E a finite extension of F if E, as a vector space with scalars in F, has finite dimension. Specifically, if E has dimension n, we say that the degree of E over F is equal to n, and we symbolize this by writing [E: F] = n. If c is algebraic over F, the degree of F(c) over F turns out to be equal to the degree of p(x), the minimum polynomial of c over F.
F(c), obtained by adjoining an algebraic element c to F, is called a simple extension of F. $F(c_1, \ldots, c_n)$, obtained by adjoining n algebraic elements in succession to F, is called an iterated extension of F. Any iterated extension of F is finite, and, conversely, any finite extension of F is an iterated extension $F(c_1, \ldots, c_n)$. In fact, even more is true; let F be of characteristic 0.
Theorem 2 Every finite extension of F is a simple extension F(c).
Proof: We already know that every finite extension is an iterated extension. We will now show that any extension F(a, b) is equal to F(c) for some c. Using this result several times in succession yields our theorem. (At each stage, we reduce by 1 the number of elements that must be adjoined to F in order to get the desired extension.)
Well, given F(a, b), let A(x) be the minimum polynomial of a over F, and let B(x) be the minimum polynomial of b over F. Let K denote any extension of F which contains all the roots $a_1, \ldots, a_n$ of A(x) as well as all the roots $b_1, \ldots, b_m$ of B(x). Let $a_1$ be a and let $b_1$ be b.
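Let t be any nonzero element of F such that
$$t \neq \frac{a_i - a}{b - b_j} \quad \text{for every } i \neq 1 \text{ and } j \neq 1$$
Cross multiplying and setting $c = a + tb$, it follows that $c \neq a_i + tb_j$, that is,
$$c - tb_j \neq a_i \quad \text{for all } i \neq 1 \text{ and } j \neq 1$$
(For $i = 1$ the same inequality holds because $t \neq 0$ and $b_j \neq b$.) Define h(x) by letting $h(x) = A(c - tx)$; then
$$h(b) = A(c - tb) = A(a) = 0$$
while for every $j \neq 1$,
$$h(b_j) = A(c - tb_j) \neq 0$$
because $c - tb_j$ is not equal to any $a_i$.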
Thus, b is the only common root of h(x) and B(x).
We will prove that $b \in F(c)$, hence also $a = c - tb \in F(c)$, and therefore $F(a, b) \subseteq F(c)$. But $c \in F(a, b)$, so $F(c) \subseteq F(a, b)$. Thus F(a, b) = F(c).
So, it remains only to prove that $b \in F(c)$. Let p(x) be the minimum polynomial of b over F(c). If the degree of p(x) is 1, then p(x) is x - b, so $b \in F(c)$, and we are done. Let us suppose $\deg p(x) \geq 2$ and get a contradiction: observe that h(x) and B(x) must both be multiples of p(x) because both have b as a root, and p(x) is the minimum polynomial of b. But if h(x) and B(x) have a common factor of degree $\geq 2$, they must have two or more roots in common, contrary to the fact that b is their only common root. Our proof is complete. ■
For example, we may apply this theorem directly to $\mathbb{Q}(\sqrt{2}, \sqrt{3})$. Taking t = 1, we get $c = \sqrt{2} + \sqrt{3}$, hence $\mathbb{Q}(\sqrt{2}, \sqrt{3}) = \mathbb{Q}(\sqrt{2} + \sqrt{3})$.
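A short numerical sketch makes this example concrete. Expanding $c^3$ by hand gives $c^3 = 11\sqrt{2} + 9\sqrt{3}$, so both square roots can be written as polynomials in c, which is what membership in Q(c) requires. The variable names below are illustrative, the checks are floating-point (approximate, not a proof), and `forbidden` is the single value excluded by the condition on t in the proof when a = √2 and b = √3.

```python
import math

r2, r3 = math.sqrt(2), math.sqrt(3)

# The proof requires t != (a_i - a)/(b - b_j) for the other roots
# a_i = -sqrt(2) of x^2 - 2 and b_j = -sqrt(3) of x^2 - 3.
forbidden = (-r2 - r2) / (r3 - (-r3))   # equals -sqrt(2)/sqrt(3)
t = 1
c = r2 + t * r3

# c^3 = 11*sqrt(2) + 9*sqrt(3), hence:
sqrt2_from_c = (c**3 - 9 * c) / 2
sqrt3_from_c = (11 * c - c**3) / 2

# c should be a root of x^4 - 10x^2 + 1.
residual = c**4 - 10 * c**2 + 1
```

The two recovered values agree with √2 and √3 up to rounding, and `residual` is zero up to rounding.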
If a(x) is a polynomial of degree n in F[x], let its roots be $c_1, \ldots, c_n$. Then $F(c_1, \ldots, c_n)$ is clearly the smallest extension of F containing all the roots of a(x). $F(c_1, \ldots, c_n)$ is called the root field of a(x) over F. We will have a great deal to say about root fields in this and subsequent chapters.
Isomorphisms were important when we were dealing with groups, and they are important also for fields. You will remember that if $F_1$ and $F_2$ are fields, an isomorphism from $F_1$ to $F_2$ is a bijective function $h : F_1 \to F_2$ satisfying
$$h(a + b) = h(a) + h(b) \quad\text{and}\quad h(ab) = h(a)h(b)$$
From these equations it follows that $h(0) = 0$, $h(1) = 1$, $h(-a) = -h(a)$, and $h(a^{-1}) = (h(a))^{-1}$.
Suppose $F_1$ and $F_2$ are fields, and $h : F_1 \to F_2$ is an isomorphism. Let $K_1$ and $K_2$ be extensions of $F_1$ and $F_2$, and let $\bar{h} : K_1 \to K_2$ also be an isomorphism. We call $\bar{h}$ an extension of h if $\bar{h}(x) = h(x)$ for every x in $F_1$, that is, if h and $\bar{h}$ are the same on $F_1$. ($\bar{h}$ is an extension of h in the plain sense that it is formed by "adding on" to h.)
As an example, given any isomorphism $h : F_1 \to F_2$, we can extend h to an isomorphism $\bar{h} : F_1[x] \to F_2[x]$. (Note that F[x] is an extension of F when we think of the elements of F as constant polynomials; of course, F[x] is not a field, simply an integral domain, but in the present example this fact is unimportant.) Now we ask: What is an obvious and natural way of extending h? The answer, quite clearly, is to let $\bar{h}$ send the polynomial with coefficients $a_0, a_1, \ldots, a_n$ to the polynomial with coefficients $h(a_0), h(a_1), \ldots, h(a_n)$:
$$\bar{h}(a_0 + a_1 x + \cdots + a_n x^n) = h(a_0) + h(a_1)x + \cdots + h(a_n)x^n$$
It is child's play to verify formally that $\bar{h}$ is an isomorphism from $F_1[x]$ to $F_2[x]$. In the sequel, the polynomial $\bar{h}(a(x))$, obtained in this fashion, will be denoted simply by ha(x). Because $\bar{h}$ is an isomorphism, a(x) is irreducible iff ha(x) is irreducible.
A very similar isomorphism extension is given in the next theorem.
Theorem 3 Let $h : F_1 \to F_2$ be an isomorphism, and let p(x) be irreducible in $F_1[x]$. Suppose a is a root of p(x), and b a root of hp(x). Then h can be extended to an isomorphism
$$\bar{h} : F_1(a) \to F_2(b)$$
Furthermore, $\bar{h}(a) = b$.
Proof: Remember that every element of $F_1(a)$ is of the form
$$c_0 + c_1 a + \cdots + c_n a^n$$
where $c_0, \ldots, c_n$ are in $F_1$, and every element of $F_2(b)$ is of the form $d_0 + d_1 b + \cdots + d_n b^n$ where $d_0, \ldots, d_n$ are in $F_2$. Imitating what we did successfully in the preceding example, we let $\bar{h}$ send the expression with coefficients $c_0, \ldots, c_n$ to the expression with coefficients $h(c_0), \ldots, h(c_n)$:
$$\bar{h}(c_0 + c_1 a + \cdots + c_n a^n) = h(c_0) + h(c_1)b + \cdots + h(c_n)b^n$$
Again, it is routine to verify that $\bar{h}$ is an isomorphism. Details are laid out in Exercise H at the end of the chapter. ■
Most often we use Theorem 3 in the special case where $F_1$ and $F_2$ are the same field (let us call it F) and h is the identity function $\varepsilon : F \to F$. [Remember that the identity function is $\varepsilon(x) = x$.] When we apply Theorem 3 to the identity function $\varepsilon : F \to F$, we get
Theorem 4 Suppose a and b are roots of the same irreducible polynomial p(x) in F[x]. Then there is an isomorphism $g : F(a) \to F(b)$ such that g(x) = x for every x in F, and g(a) = b.
Now let us consider the following situation: K and K' are finite extensions of F, and K and K' have a common extension E. If $h : K \to K'$ is an isomorphism such that h(x) = x for every x in F, we say that h fixes F. Let c be an element of K; if h fixes F, and c is a root of some polynomial $a(x) = a_0 + \cdots + a_n x^n$ in F[x], h(c) also is a root of a(x). It is easy to see why: the coefficients of a(x) are in F and are therefore not changed by h. So if a(c) = 0, then
$$a(h(c)) = a_0 + a_1 h(c) + \cdots + a_n h(c)^n = h(a_0 + a_1 c + \cdots + a_n c^n) = h(0) = 0$$
What we have just shown may be expressed as follows:
(*) Let a(x) be any polynomial in F[x]. Any isomorphism which fixes F sends roots of a(x) to roots of a(x).
If K happens to be the root field of a(x) over F, the situation becomes even more interesting. Say $K = F(c_1, c_2, \ldots, c_n)$, where $c_1, c_2, \ldots, c_n$ are the roots of a(x). If $h : K \to K'$ is any isomorphism which fixes F, then by (*), h permutes $c_1, c_2, \ldots, c_n$. Now, by the brief discussion headed "For later reference" on page 296, every element of $F(c_1, \ldots, c_n)$ is a sum of terms of the form
$$k\, c_1^{i_1} c_2^{i_2} \cdots c_n^{i_n}$$
where the coefficient k is in F. Because h fixes F, h(k) = k. Furthermore, $c_1, c_2, \ldots, c_n$ are the roots of a(x), so by (*), the product $c_1^{i_1} c_2^{i_2} \cdots c_n^{i_n}$ is transformed by h into another product of the same form. Thus, h sends every element of $F(c_1, c_2, \ldots, c_n)$ to another element of $F(c_1, c_2, \ldots, c_n)$.
The above comments are summarized in the next theorem.
Theorem 5 Let K and K' be finite extensions of F. Assume K is the root field of some polynomial over F. If $h : K \to K'$ is an isomorphism which fixes F, then K = K'.
Proof: From Theorem 2, K and K' are simple extensions of F, say K = F(a) and K' = F(b). Then E = F(a, b) is a common extension of K and K'. By the comments preceding this theorem, h maps every element of K to an element of K; since h maps K onto K', this gives $K' \subseteq K$. Since the same argument may be carried out for $h^{-1}$, we also have $K \subseteq K'$. ■
Theorem 5 is often used in tandem with the following (see the figure on the next page):
Theorem 6 Let L and L' be finite extensions of F. Let K be an extension of L such that K is a root field over F. Any isomorphism $h : L \to L'$ which fixes F can be extended to an isomorphism $\bar{h} : K \to K$.
Proof: From Theorem 2, K is a simple extension of L, say K = L(c). Now we can use Theorem 3 to extend the isomorphism $h : L \to L'$ to an isomorphism
$$\bar{h} : L(c) \to L'(d)$$
where $L(c) = K$ and we write $K' = L'(d)$. By Theorem 5 applied to $\bar{h}$, K = K'. ■
Remark: It follows from the theorem that $L' \subseteq K$, since $\operatorname{ran} h \subseteq \operatorname{ran} \bar{h} = K$.
For later reference. The following results, which are of a somewhat technical nature, will be needed later. The first presents a surprisingly strong property of root fields.
Theorem 7 Let K be the root field of some polynomial over F. For every irreducible polynomial p(x) in F[x], if p(x) has one root in K, then p(x) must have all of its roots in K.
Proof: Indeed, suppose p(x) has a root a in K, and let b be any other root of p(x). From Theorem 4, there is an isomorphism $h : F(a) \to F(b)$ fixing F. But $F(a) \subseteq K$; so from Theorem 6 and the remark following it $F(b) \subseteq K$; hence $b \in K$. ■
Theorem 8 Suppose $I \subseteq E \subseteq K$, where E is a finite extension of I and K is a finite extension of E. If K is the root field of some polynomial over I, then K is also the root field of some polynomial over E.
Proof: Suppose K is a root field of some polynomial over I. Then K is a root field of the same polynomial over E. ■
EXERCISES
A. Examples of Root Fields over Q
Example Find the root field of $a(x) = (x^2 - 3)(x^3 - 1)$ over Q.
Solution The complex roots of a(x) are $\pm\sqrt{3}$, $1$, $\tfrac{1}{2}(-1 \pm \sqrt{3}\,i)$, so the root field is $\mathbb{Q}(\pm\sqrt{3}, 1, \tfrac{1}{2}(-1 \pm \sqrt{3}\,i))$. The same field can be written more simply as $\mathbb{Q}(\sqrt{3}, i)$.
1 Show that $\mathbb{Q}(\sqrt{3}, i)$ is the root field of $(x^2 - 2x - 2)(x^2 + 1)$ over Q.
Comparing part 1 with the example, we note that different polynomials may have the same root field. This is true even if the polynomials are irreducible.
2 Prove that $x^2 - 3$ and $x^2 - 2x - 2$ are both irreducible over Q. Then find their root fields over Q and show they are the same.
3 Find the root field of $x^4 - 2$, first over Q, then over R.
4 Explain: $\mathbb{Q}(i, \sqrt{2})$ is the root field of $x^4 - 2x^2 + 9$ over Q, and is the root field of $x^2 - 2\sqrt{2}\,x + 3$ over $\mathbb{Q}(\sqrt{2})$.
5 Find irreducible polynomials a(x) over Q, and b(x) over Q(i), such that Q(i, V3) is the root field of a(x) over Q, and is the root field of b(x) over Q(i). Then do the same for ©(V^, V^).
# 6 Which of the following extensions are root fields over Q? Justify your answer: $\mathbb{Q}(i)$; $\mathbb{Q}(\sqrt{2})$; $\mathbb{Q}(\sqrt[3]{2})$, where $\sqrt[3]{2}$ is the real cube root of 2; Q(2 + VE); $\mathbb{Q}(i + \sqrt{3})$; Q(i, VI, V3).
B. Examples of Root Fields over Zp
Example Find the root field of $x^2 + 1$ over $\mathbb{Z}_3$.
Solution By the basic theorem of field extensions,
$$\mathbb{Z}_3[x] / \langle x^2 + 1 \rangle \cong \mathbb{Z}_3(u)$$
where u is a root of $x^2 + 1$. In $\mathbb{Z}_3(u)$, $x^2 + 1 = (x + u)(x - u)$, because $u^2 + 1 = 0$. Since $\mathbb{Z}_3(u)$ contains $\pm u$, it is the root field of $x^2 + 1$ over $\mathbb{Z}_3$. Note that $\mathbb{Z}_3(u)$ has nine elements, and its addition and multiplication tables are easy to construct. (See Chapter 27, Exercise C4.)
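The nine-element field in this example is small enough to build by brute force. The sketch below represents a + bu as the pair (a, b) with entries in Z_3 and uses the relation u² = -1; the representation and the function names are illustrative choices, not notation from the text.

```python
from itertools import product

P = 3  # arithmetic is over Z_3

def add(x, y):
    """(a + b u) + (c + d u)."""
    return ((x[0] + y[0]) % P, (x[1] + y[1]) % P)

def mul(x, y):
    """(a + b u)(c + d u), reduced with u^2 = -1."""
    a, b = x
    c, d = y
    return ((a * c - b * d) % P, (a * d + b * c) % P)

elements = list(product(range(P), repeat=2))  # (a, b) stands for a + b u
ONE, U = (1, 0), (0, 1)

def f(x):
    """Evaluate x^2 + 1 at an element of Z_3(u)."""
    return add(mul(x, x), ONE)

# Roots of x^2 + 1 inside Z_3(u).
roots = [x for x in elements if f(x) == (0, 0)]

# Every nonzero element has a multiplicative inverse, so this is a field.
inverses = {x: next(y for y in elements if mul(x, y) == ONE)
            for x in elements if x != (0, 0)}
```

The two roots found are (0, 1) and (0, 2), that is, u and 2u = -u, matching the factorization (x + u)(x - u).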
1 Show that, in any extension of $\mathbb{Z}_3$ which contains a root u of
$$a(x) = x^3 + 2x + 1 \in \mathbb{Z}_3[x]$$
it happens that u + 1 and u + 2 are the remaining two roots of a(x). Use this fact to find the root field of $x^3 + 2x + 1$ over $\mathbb{Z}_3$. List the elements of the root field.
2 Find the root field of $x^2 + x + 2$ over $\mathbb{Z}_3$, and write its addition and multiplication tables.
3 Find the root field of $x^3 + x^2 + 1 \in \mathbb{Z}_2[x]$ over $\mathbb{Z}_2$. Write its addition and multiplication tables.
4 Find the root field over $\mathbb{Z}_2$ of $x^3 + x + 1 \in \mathbb{Z}_2[x]$. (Caution: This will prove to be a little more difficult than part 3.)
# 5 Find the root field of $x^3 + x^2 + x + 2$ over $\mathbb{Z}_3$. Find a basis for this root field over $\mathbb{Z}_3$.
C. Short Questions Relating to Root Field
Prove each of the following
1 Every extension of degree 2 is a root field.
2 If $F \subseteq I \subseteq K$ and K is a root field of a(x) over F, then K is a root field of a(x) over I.
3 The root field over R of any polynomial in $\mathbb{R}[x]$ is R or C.
4 If c is a complex root of a cubic $a(x) \in \mathbb{Q}[x]$, then Q(c) is the root field of a(x) over Q.
# 5 If $p(x) = x^4 + ax^2 + b$ is irreducible in F[x], then $F[x]/\langle p(x) \rangle$ is the root field of p(x) over F.
6 If K = F(a) and K is the root field of some polynomial over F, then K is the root field of the minimum polynomial of a over F.
7 Every root field over F is the root field of some irreducible polynomial over F. (Hint: Use part 6 and Theorem 2.)
8 Suppose [K : F] = n, where K is a root field over F. Then K is the root field over F of every irreducible polynomial of degree n in F[x] having a root in K.
9 If a(x) is a polynomial of degree n in F[x], and K is the root field of a(x) over F, then [K : F] divides n!.
D. Reducing Iterated Extensions to Simple Extensions
1 Find c such that Q(V2, V^3) = Q(c). Do the same for Q(V2,V2)
2 Let a be a root of $x^3 - x + 1$, and b a root of $x^2 - 2x - 1$. Find c such that Q(a, b) = Q(c). (Hint: Use calculus to show that $x^3 - x + 1$ has one real and two complex roots, and explain why no two of these may differ by a real number.)
# 3 Find c such that Q(V2, V3, V^S) = Q(c).
4 Find an irreducible polynomial p(x) such that Q(V2, V5) is the root field of p(x) over Q. (Hint: Use Exercise C6.)
5 Do the same as in part 4 for Q(V2, V3, V^5).
† E. Roots of Unity and Radical Extensions
De Moivre's theorem provides an explicit formula to write the n complex nth roots of 1. (See Chapter 16, Exercise H.) By de Moivre's formula, the nth roots of unity consist of $\omega = \cos(2\pi/n) + i\sin(2\pi/n)$ and its first n powers, namely, $1, \omega, \omega^2, \ldots, \omega^{n-1}$. We call ): Q] = n - 1.
3 If n is a prime, $\omega^{n-1}$ is equal to a linear combination of $1, \omega, \ldots, \omega^{n-2}$ with rational coefficients.
4 Find $[\mathbb{Q}(\omega) : \mathbb{Q}]$, where $\omega$ is a primitive nth root of unity, for n = 6, 7, and 8.
5 Prove that for any $r \in \{1, 2, \ldots, n - 1\}$, $\sqrt[n]{a}\,\omega^r$ is an nth root of a. Conclude that \/fl, V"fl<, v'a) is the root field of $x^n - a$ over Q.
7 Find the degree of $\mathbb{Q}(\omega, \sqrt[3]{2})$ over Q, where $\omega$ is a primitive cube root of 1. Also show that $\mathbb{Q}(\omega, \sqrt[3]{2}) = \mathbb{Q}(\sqrt[3]{2}, i\sqrt{3})$. (Hint: Compute $\omega$.)
8 Prove that if K is the root field of any polynomial over Q, and K contains an nth root of any number a, then K contains all the nth roots of unity.
† F. Separable and Inseparable Polynomials
Let F be a field. An irreducible polynomial p(x) in F[x] is said to be separable over F if it has no multiple roots in any extension of F. If p(x) does have a multiple root in some extension, it is inseparable over F.
1 Prove that if F has characteristic 0, every irreducible polynomial in F[x] is separable.
Thus, for characteristic 0, there is no question whether an irreducible polynomial is separable or not. However, for characteristic $p \neq 0$, it is different. This case is treated next. In the following problems, let F be a field of characteristic $p \neq 0$.
2 If a'(x) = 0, prove that the only nonzero terms of a(x) are of the form $a_{mp} x^{mp}$ for some m. [In other words, a(x) is a polynomial in powers of $x^p$.]
3 Prove that if an irreducible polynomial a(x) is inseparable over F, then a(x) is a polynomial in powers of xp. (Hint: Use part 2, and reason as in the proof of Theorem 1.)
4 Use Chapter 27, Exercise J (especially the conclusion following J6) to prove the converse of part 3.
Thus, if F is a field of characteristic $p \neq 0$, an irreducible polynomial $a(x) \in F[x]$ is inseparable iff a(x) is a polynomial in powers of $x^p$. For finite fields, we can say even more:
5 Prove that if F is any field of characteristic $p \neq 0$, then in F[x],
$$(a_0 + a_1 x + \cdots + a_n x^n)^p = a_0^p + a_1^p x^p + \cdots + a_n^p x^{np}$$
(Hint: See Chapter 24, Exercise D6.)
6 If F is a finite field of characteristic $p \neq 0$, prove that, in F[x], every polynomial $a(x^p)$ is equal to $[b(x)]^p$ for some b(x). [Hint: Use part 5 and the fact that in a finite field of characteristic p, every element has a pth root (see Chapter 20, Exercise F).]
7 Use parts 3 and 6 to prove: In any finite field, every irreducible polynomial is separable.
Thus, fields of characteristic 0 and finite fields share the property that irreducible polynomials have no multiple roots. The only remaining case is that of infinite fields with finite characteristic. It is treated in the next exercise set.
† G. Multiple Roots over Infinite Fields of Nonzero Characteristic
If $\mathbb{Z}_p[y]$ is the domain of polynomials (in the letter y) over $\mathbb{Z}_p$, let $E = \mathbb{Z}_p(y)$ be the field of quotients of $\mathbb{Z}_p[y]$. Let K denote the subfield $\mathbb{Z}_p(y^p)$ of $\mathbb{Z}_p(y)$.
1 Explain why $\mathbb{Z}_p(y)$ and $\mathbb{Z}_p(y^p)$ are infinite fields of characteristic p.
2 Prove that $a(x) = x^p - y^p$ has the factorization $x^p - y^p = (x - y)^p$ in E[x], but is irreducible in K[x]. Conclude that there is an irreducible polynomial a(x) in K[x] with a root whose multiplicity is p.
Thus, over an infinite field of nonzero characteristic, an irreducible polynomial may have multiple roots. Even these fields, however, have a remarkable property: all the roots of any irreducible polynomial have the same multiplicity. The details follow: Let F be any field, p(x) irreducible in F[x], a and b two distinct roots of p(x), and K the root field of p(x) over F. Let $i : K \to i(K) = K'$ be the isomorphism of Theorem 4, and $\bar{\imath} : K[x] \to K'[x]$ the isomorphism described immediately preceding Theorem 3.
3 Prove that $\bar{\imath}$ leaves p(x) fixed.
4 Prove that $\bar{\imath}((x - a)^m) = (x - b)^m$.
5 Prove that a and b have the same multiplicity.
† H. An Isomorphism Extension Theorem (Proof of Theorem 3)
Let $F_1$, $F_2$, h, p(x), a, b, and $\bar{h}$ be as in the statement of Theorem 3. To prove that $\bar{h}$ is an isomorphism, it must first be shown that it is properly defined: that is, if c(a) = d(a) in $F_1(a)$, then $\bar{h}(c(a)) = \bar{h}(d(a))$.
1 If c(a) = d(a), prove that c(x) - d(x) is a multiple of p(x). Deduce from this that hc(x) - hd(x) is a multiple of hp(x).
# 2 Use part 1 to prove that $\bar{h}(c(a)) = \bar{h}(d(a))$.
3 Reversing the steps of the preceding argument, show that $\bar{h}$ is injective.
4 Show that $\bar{h}$ is surjective.
5 Show that $\bar{h}$ is a homomorphism.
† I. Uniqueness of the Root Field
Let $h : F_1 \to F_2$ be an isomorphism. If $a(x) \in F_1[x]$, let $K_1$ be the root field of a(x) over $F_1$, and $K_2$ the root field of ha(x) over $F_2$.
1 Prove: If p(x) is an irreducible factor of a(x), $u \in K_1$ is a root of p(x), and $v \in K_2$ is a root of hp(x), then $F_1(u) \cong F_2(v)$.
2 $F_1(u) = K_1$ iff $F_2(v) = K_2$.
# 3 Use parts 1 and 2 to form an inductive proof that $K_1 \cong K_2$.
4 Draw the following conclusion: The root field of a polynomial a(x) over a field F is unique up to isomorphism.
† J. Extending Isomorphisms
In the following, let F be a subfield of C. An injective homomorphism $h : F \to \mathbb{C}$ is called a monomorphism; it is obviously an isomorphism $F \to h(F)$.
1 Let $\omega$ be a complex pth root of unity (where p is a prime), and let $h : \mathbb{Q}(\omega) \to \mathbb{C}$ be a monomorphism fixing Q. Explain why h is completely determined by the value of $h(\omega)$. Then prove that there exist exactly p - 1 monomorphisms $\mathbb{Q}(\omega) \to \mathbb{C}$ which fix Q.
# 2 Let p(x) be irreducible in F[x], and c a complex root of p(x). Let $h : F \to \mathbb{C}$ be a monomorphism. If deg p(x) = n, prove that there are exactly n monomorphisms $F(c) \to \mathbb{C}$ which are extensions of h.
3 Let $F \subseteq K \subseteq \mathbb{C}$, with [K : F] = n. If $h : F \to \mathbb{C}$ is a monomorphism, prove that there are exactly n monomorphisms $K \to \mathbb{C}$ which are extensions of h.
# 4 Prove: The only possible monomorphism $h : \mathbb{Q} \to \mathbb{C}$ is h(x) = x. Thus, any monomorphism $h : \mathbb{Q}(a) \to \mathbb{C}$ necessarily fixes Q.
5 Prove: There are exactly three monomorphisms $\mathbb{Q}(\sqrt[3]{2}) \to \mathbb{C}$, and they are determined by the conditions: $\sqrt[3]{2} \to \sqrt[3]{2}$; $\sqrt[3]{2} \to \sqrt[3]{2}\,\omega$; $\sqrt[3]{2} \to \sqrt[3]{2}\,\omega^2$, where $\omega$ is a primitive cube root of unity.
K. Normal Extensions
If K is the root field of some polynomial a(x) over F, K is also called a normal extension of F. There are other possible ways of defining normal extensions, which are equivalent to the above. We consider the two most common ones here: they are precisely the properties expressed in Theorems 7 and 6. Let K be a finite extension of F.
1 Suppose that for every irreducible polynomial p(x) in F[x], if p(x) has one root in K, then p(x) must have all its roots in K. Prove that K is a normal extension of F.
2 Suppose that, if h is any isomorphism with domain K which fixes F, then $h(K) \subseteq K$. Prove that K is a normal extension of F.
CHAPTER
THIRTY-TWO
GALOIS THEORY: THE HEART OF
THE MATTER
If K is a field and h is an isomorphism from K to K, we call h an automorphism of K (automorphism = "self-isomorphism").
We begin this chapter by restating Theorems 5 and 6 of Chapter 31:
Let K be the root field of some polynomial over F; suppose $a \in K$:
(i) Any isomorphism with domain K which fixes F is an automorphism of K.
(ii) If a and b are roots of an irreducible polynomial p(x) in F[x], there is an automorphism of K fixing F and sending a to b.
Rule (i) is merely a restatement of Theorem 5 of Chapter 31, using the notion of automorphism. Rule (ii) is a result of combining Theorem 4 of Chapter 31 [which asserts that there exists an F-fixing isomorphism from L = F(a) to L' = F(b)] with Theorem 6 of the same chapter.
Let K be the root field of a polynomial a(x) in F[x]. If $c_1, c_2, \ldots, c_n$ are the roots of a(x), then $K = F(c_1, c_2, \ldots, c_n)$, and, by (*) on page 316, any automorphism h of K which fixes F permutes $c_1, c_2, \ldots, c_n$. On the other hand, remember that every element a in $F(c_1, c_2, \ldots, c_n)$ is a sum of terms of the form
$$k\, c_1^{i_1} c_2^{i_2} \cdots c_n^{i_n}$$
where the coefficient k of each term is in F. If h is an automorphism which fixes F, h does not change the coefficients, so h(a) is completely determined once we know $h(c_1), \ldots, h(c_n)$. Thus, every automorphism
of K fixing F is completely determined by a permutation of the roots of a(x).
This is very important!
What it means is that we may identify the automorphisms of K which fix F with permutations of the roots of a(x).
It must be pointed out here that, just as the symmetries of geometric figures determine their geometric properties, so the symmetries of equations (that is, permutations of their roots) give us all the vital information needed to analyze their solutions. Thus, if K is the root field of our polynomial a(x) over F, we will now pay very close attention to the automorphisms of K which fix F.
To begin with, how many such automorphisms are there? The answer is a classic example of mathematical elegance and simplicity.
Theorem 1 Let K be the root field of some polynomial over F. The number of automorphisms of K fixing F is equal to the degree of K over F.
Proof: Let [K : F] = n, and let us show that K has exactly n automorphisms fixing F. By Theorem 2 of Chapter 31, K = F(a) for some $a \in K$. Let p(x) be the minimum polynomial of a over F; if b is any root of p(x), then by (ii) on the previous page, there is an automorphism of K fixing F and sending a to b. Since p(x) has n roots, there are exactly n choices of b, and therefore n automorphisms of K fixing F.
[Remember that every automorphism h which fixes F permutes the roots of p(x) and therefore sends a to some root of p(x); and h is completely determined once we have chosen h(a).] ■
For example, we have already seen that $\mathbb{Q}(\sqrt{2})$ is of degree 2 over Q. $\mathbb{Q}(\sqrt{2})$ is the root field of $x^2 - 2$ over Q because $\mathbb{Q}(\sqrt{2})$ contains both roots of $x^2 - 2$, namely $\pm\sqrt{2}$. By Theorem 1, there are exactly two automorphisms of $\mathbb{Q}(\sqrt{2})$ fixing Q: one sends $\sqrt{2}$ to $\sqrt{2}$; it is the identity function. The other sends $\sqrt{2}$ to $-\sqrt{2}$, and is therefore the function $a + b\sqrt{2} \to a - b\sqrt{2}$.
Similarly, we saw that C = R(i), and C is of degree 2 over R. The two automorphisms of C which fix R are the identity function and the function $a + bi \to a - bi$ which sends every complex number to its complex conjugate.
As a final example, we have seen that $\mathbb{Q}(\sqrt{2}, \sqrt{3})$ is an extension of degree 4 over Q, so by Theorem 1, there are four automorphisms of $\mathbb{Q}(\sqrt{2}, \sqrt{3})$ which fix Q. Now, $\mathbb{Q}(\sqrt{2}, \sqrt{3})$ is the root field of $(x^2 - 2)(x^2 - 3)$ over Q, for it contains the roots of this polynomial, and any extension of Q containing the roots of $(x^2 - 2)(x^2 - 3)$ certainly contains $\sqrt{2}$ and $\sqrt{3}$. Thus, by (*) on page 316, each of the four automorphisms which fix Q sends roots of $x^2 - 2$ to roots of $x^2 - 2$, and roots of $x^2 - 3$ to roots of $x^2 - 3$. But there are only four possible ways of doing this,
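namely,
$$\{\sqrt{2} \to \sqrt{2},\ \sqrt{3} \to \sqrt{3}\}, \quad \{\sqrt{2} \to -\sqrt{2},\ \sqrt{3} \to \sqrt{3}\}, \quad \{\sqrt{2} \to \sqrt{2},\ \sqrt{3} \to -\sqrt{3}\}, \quad\text{and}\quad \{\sqrt{2} \to -\sqrt{2},\ \sqrt{3} \to -\sqrt{3}\}$$
Since every element of $\mathbb{Q}(\sqrt{2}, \sqrt{3})$ is of the form $a + b\sqrt{2} + c\sqrt{3} + d\sqrt{6}$, these four automorphisms (we shall call them $\varepsilon$, $\alpha$, $\beta$, and $\gamma$) are the following:
$$a + b\sqrt{2} + c\sqrt{3} + d\sqrt{6} \xrightarrow{\ \varepsilon\ } a + b\sqrt{2} + c\sqrt{3} + d\sqrt{6}$$
$$a + b\sqrt{2} + c\sqrt{3} + d\sqrt{6} \xrightarrow{\ \alpha\ } a - b\sqrt{2} + c\sqrt{3} - d\sqrt{6}$$
$$a + b\sqrt{2} + c\sqrt{3} + d\sqrt{6} \xrightarrow{\ \beta\ } a + b\sqrt{2} - c\sqrt{3} - d\sqrt{6}$$
$$a + b\sqrt{2} + c\sqrt{3} + d\sqrt{6} \xrightarrow{\ \gamma\ } a - b\sqrt{2} - c\sqrt{3} + d\sqrt{6}$$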
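If K is an extension of F, the automorphisms of K which fix F form a group. (The operation, of course, is composition.) This is perfectly obvious: for if g and h fix F, then for every x in F,
$$x \xrightarrow{\ h\ } x \quad\text{and}\quad x \xrightarrow{\ g\ } x \quad\text{so}\quad x \xrightarrow{\ h\ } x \xrightarrow{\ g\ } x$$
that is, $g \circ h$ fixes F. Furthermore, if
$$x \xrightarrow{\ h\ } x \quad\text{then}\quad x \xrightarrow{\ h^{-1}\ } x$$
that is, if h fixes F so does $h^{-1}$.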
This fact is perfectly obvious, but nonetheless of great importance, for it means that we can now use all of our accumulated knowledge about groups to help us analyze the solutions of polynomial equations. And that is precisely what Galois theory is all about.
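If K is the root field of a polynomial a(x) in F[x], the group of all the automorphisms of K which fix F is called the Galois group of a(x). We also call it the Galois group of K over F, and designate it by the symbol
$$\mathrm{Gal}(K : F)$$
In our last example we saw that there are four automorphisms of $\mathbb{Q}(\sqrt{2}, \sqrt{3})$ which fix Q. We called them $\varepsilon$, $\alpha$, $\beta$, and $\gamma$. Thus, the Galois group of $\mathbb{Q}(\sqrt{2}, \sqrt{3})$ over Q is $\mathrm{Gal}(\mathbb{Q}(\sqrt{2}, \sqrt{3}) : \mathbb{Q}) = \{\varepsilon, \alpha, \beta, \gamma\}$; the operation is composition, giving us the table
$$\begin{array}{c|cccc}
\circ & \varepsilon & \alpha & \beta & \gamma \\ \hline
\varepsilon & \varepsilon & \alpha & \beta & \gamma \\
\alpha & \alpha & \varepsilon & \gamma & \beta \\
\beta & \beta & \gamma & \varepsilon & \alpha \\
\gamma & \gamma & \beta & \alpha & \varepsilon
\end{array}$$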
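The composition table can be recomputed mechanically. In the sketch below an element $a + b\sqrt{2} + c\sqrt{3} + d\sqrt{6}$ is stored as the tuple (a, b, c, d), and each automorphism is described by the signs it applies to √2 and √3. The tuple representation, the ASCII names `e`, `alpha`, `beta`, `gamma`, and the helper functions are illustrative assumptions, not notation from the text.

```python
from itertools import product

def mul(x, y):
    """Product in Q(sqrt2, sqrt3) on the basis 1, sqrt2, sqrt3, sqrt6."""
    a, b, c, d = x
    e, f, g, h = y
    return (a*e + 2*b*f + 3*c*g + 6*d*h,
            a*f + b*e + 3*(c*h + d*g),
            a*g + c*e + 2*(b*h + d*f),
            a*h + d*e + b*g + c*f)

# Signs applied to (sqrt2, sqrt3); sqrt6 picks up the product of the two.
SIGNS = {"e": (1, 1), "alpha": (-1, 1), "beta": (1, -1), "gamma": (-1, -1)}

def apply(name, x):
    """Image of x = (a, b, c, d) under the named automorphism."""
    s2, s3 = SIGNS[name]
    a, b, c, d = x
    return (a, s2 * b, s3 * c, s2 * s3 * d)

def compose(g, h):
    """Name of g∘h (h first, then g); the sign pairs simply multiply."""
    s = (SIGNS[g][0] * SIGNS[h][0], SIGNS[g][1] * SIGNS[h][1])
    return next(n for n, t in SIGNS.items() if t == s)

# Full composition table, keyed by (row, column).
table = {(g, h): compose(g, h) for g, h in product(SIGNS, repeat=2)}
```

Each diagonal entry of `table` is the identity and the table is symmetric, which is the structure of Z2 x Z2 noted in the text.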
As one can see, this is an abelian group in which every element is its own inverse; almost at a glance one can verify that it is isomorphic to Z2 x Z2.
Let K be the root field of a(x), where a(x) is in F[x]. In our earlier discussion we saw that every automorphism of K fixing F [that is, every member of the Galois group of a(x)] may be identified with a permutation of the roots of a(x). However, it is important to note that not every permutation of the roots of a(x) need be in the Galois group of a(x), even when a(x) is irreducible. For example, we saw that $\mathbb{Q}(\sqrt{2}, \sqrt{3}) = \mathbb{Q}(\sqrt{2} + \sqrt{3})$, where $\sqrt{2} + \sqrt{3}$ is a root of the irreducible polynomial $x^4 - 10x^2 + 1$ over Q. Since $x^4 - 10x^2 + 1$ has four roots, there are 4! = 24 permutations of its roots, only four of which are in its Galois group. This is because only four of the permutations are genuine symmetries of $x^4 - 10x^2 + 1$, in the sense that they determine automorphisms of the root field.
In the discussion throughout the remainder of this chapter, let F and K remain fixed. F is an arbitrary field and K is the root field of some polynomial a(x) in F[x]. The thread of our reasoning will lead us to speak about fields I where $F \subseteq I \subseteq K$, that is, fields "between" F and K. We will refer to them as intermediate fields. Since K is the root field of a(x) over F, it is also the root field of a(x) over I for every intermediate field I.
The letter G will denote the Galois group of K over F. With each intermediate field I, we associate the group
$$I^* = \mathrm{Gal}(K : I)$$
that is, the group of all the automorphisms of K which fix I. It is obviously a subgroup of G. We will call $I^*$ the fixer of I.
Conversely, with each subgroup H of G we associate the subfield of K containing all the a in K which are not changed by any $\pi \in H$. That is,
$$\{a \in K : \pi(a) = a \text{ for every } \pi \in H\}$$
One verifies in a trice that this is a subfield of K. It obviously contains F, and is therefore one of the intermediate fields. It is called the fixed field of H. For brevity and euphony we call it the fixfield of H.
Let us recapitulate: Every subgroup H of G fixes an intermediate field I, called the fixfield of H. Every intermediate field I is fixed by a subgroup H of G, called the fixer of I. This suggests very strongly that there is a one-to-one correspondence between the subgroups of G and the fields intermediate between F and K. Indeed, this is correct. This one-to-one correspondence is at the very heart of Galois theory, because it provides the tie-in between properties of field extensions and properties of subgroups.
Just as, in Chapter 29, we were able to use vector algebra to prove new things about field extensions, now we will be able to use group theory to explore field extensions. The vector-space connection was a relative lightweight. The connection with group theory, on the other hand, gives us a tool of tremendous power to study field extensions.
We have not yet proved that the connection between subgroups of G and intermediate fields is a one-to-one correspondence. The next two theorems will do that.
Theorem 2 If H is the fixer of I, then I is the fixfield of H.
Proof: Let H be the fixer of I, and I' be the fixfield of H. It follows from the definitions of fixer and fixfield that $I \subseteq I'$, so we must now show that $I' \subseteq I$. We will do this by proving that $a \notin I$ implies $a \notin I'$. Well, if a is an element of K which is not in I, the minimum polynomial p(x) of a over I must have degree $\geq 2$ (for otherwise, $a \in I$). Thus, p(x) has another root b. By Rule (ii) given at the beginning of this chapter, there is an automorphism of K fixing I and sending a to b. This automorphism moves a, so $a \notin I'$. ■
Lemma Let H be a subgroup of G, and I the fixfield of H. The number of elements in H is equal to [K : I].
Proof: Let H have r elements, namely, hx,..., hr. Let K = 1(a). Much of our proof will revolve around the following polynomial:
b{x) = [x- h{(a)][x - h2(a)\ --[x- hr(a)}
Since one of the h, is the identity function, one factor of b(x) is (x - a), and therefore a is a root ofb(x). In the next paragraph we will see that all the coefficients of b(x) are in /, so b(x)£I[x]. It follows that b(x) is a multiple of the minimum polynomial of a over /, whose degree is exactly [K : /]. Since b(x) is of degree r, this means that r & [K : I], which is half our theorem.
Well, let us show that all the coefficients of b(x) are in I. We saw on page 314 that every isomorphism $h_i : K \to K$ can be extended to an isomorphism $\bar{h}_i : K[x] \to K[x]$. Because $\bar{h}_i$ is an isomorphism of polynomials, we get
$$\bar{h}_i(b(x)) = \bar{h}_i(x - h_1(a))\,\bar{h}_i(x - h_2(a)) \cdots \bar{h}_i(x - h_r(a))$$
$$= (x - h_i \circ h_1(a)) \cdots (x - h_i \circ h_r(a))$$
But $h_i \circ h_1, h_i \circ h_2, \ldots, h_i \circ h_r$ are r distinct elements of H, and H has exactly r elements, so they are all the elements of H (that is, they are $h_1, \ldots, h_r$, possibly in a different order). So the factors of $\bar{h}_i(b(x))$ are the same as the factors of b(x), merely in a different order, and therefore $\bar{h}_i(b(x)) = b(x)$. Since equal polynomials have equal coefficients, $h_i$ leaves the coefficients of b(x) invariant. Thus, every coefficient of b(x) is in the fixfield of H, that is, in I.
We have just shown that $[K : I] \leq r$. For the opposite inequality, remember that by Theorem 1, [K : I] is equal to the number of I-fixing automorphisms of K. But there are at least r such automorphisms, namely $h_1, \ldots, h_r$. Thus, $[K : I] \geq r$, and we are done.
Theorem 3 If I is the fixfield of H, then H is the fixer of I.
Proof: Let I be the fixfield of H, and I* the fixer of I. It follows from the definitions of fixer and fixfield that $H \subseteq I^*$. We will prove equality by showing that there are as many elements in H as in I*. By the lemma, the order of H is equal to [K : I]. By Theorem 2, I is the fixfield of I*, so by the lemma again, the order of I* is also equal to [K : I]. ■
It follows immediately from Theorems 2 and 3 that there is a one-to-one correspondence between the subgroups of Gal(K : F) and the intermediate fields between K and F. This correspondence, which matches every subgroup with its fixfield (or, equivalently, matches every intermediate field with its fixer) is called a Galois correspondence. It is worth observing that larger subfields correspond to smaller subgroups; that is,
$$I_1 \subseteq I_2 \quad \text{iff} \quad I_2^* \subseteq I_1^*$$
As an example, we have seen that the Galois group of $\mathbb{Q}(\sqrt{2}, \sqrt{3})$ over Q is G = {ε, α, β, γ} with the table given on page 325. This group has exactly five subgroups, namely, {ε}, {ε, α}, {ε, β}, {ε, γ}, and the whole group G. They may be represented in the "inclusion diagram":
[Figure: inclusion diagram of the subgroups of G.]
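The correspondence in this example can be checked by a small computation. The following is only a sketch under stated assumptions: each automorphism is modeled as a pair of signs (s2, s3), meaning √2 ↦ s2·√2 and √3 ↦ s3·√3, and the names `BASIS`, `subgroups`, and `fixfield_basis` are illustrative, not the book's notation. Because every automorphism multiplies each basis element 1, √2, √3, √6 by ±1, the fixfield of a subgroup H is spanned by the basis elements that every member of H leaves unchanged.

```python
from itertools import product

# Basis of Q(sqrt2, sqrt3) over Q; (i, j) stands for sqrt2**i * sqrt3**j.
BASIS = {"1": (0, 0), "sqrt2": (1, 0), "sqrt3": (0, 1), "sqrt6": (1, 1)}

# Assumed model: an automorphism is a sign pair (s2, s3).
G = list(product((1, -1), repeat=2))

def compose(g, h):
    return (g[0] * h[0], g[1] * h[1])

def is_subgroup(H):
    # In a finite group, a nonempty subset closed under the operation is a subgroup.
    return all(compose(g, h) in H for g in H for h in H)

def subgroups():
    found = []
    for mask in range(1, 2 ** len(G)):
        H = frozenset(G[i] for i in range(len(G)) if mask >> i & 1)
        if is_subgroup(H):
            found.append(H)
    return found

def fixfield_basis(H):
    # sqrt2**i * sqrt3**j is multiplied by s2**i * s3**j under (s2, s3).
    return [name for name, (i, j) in BASIS.items()
            if all(s2 ** i * s3 ** j == 1 for (s2, s3) in H)]

correspondence = {H: fixfield_basis(H) for H in subgroups()}
for H, basis in sorted(correspondence.items(), key=lambda kv: len(kv[0])):
    print(sorted(H), "fixes the span of", basis)
```

In this model the order of each subgroup times the dimension of its fixfield over Q equals 4, in agreement with the lemma.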
In order to effectively tie in subgroups of G with extensions of the field F, we need one more fact, to be presented next.
Suppose $E \subseteq I \subseteq K$, where K is a root field over E and I is a root field over E. (Hence by Theorem 8 of Chapter 31, K is a root field over I.) If $h \in \mathrm{Gal}(K : E)$, h is an automorphism of K fixing E. Consider the restriction of h to I, that is, h restricted to the smaller domain I. It is an isomorphism with domain I fixing E, so by Rule (i) given at the beginning of this chapter, it is an automorphism of I, still fixing E. We have just shown that if $h \in \mathrm{Gal}(K : E)$, then the restriction of h to I is in Gal(I : E). This permits us to define a function $\mu : \mathrm{Gal}(K : E) \to \mathrm{Gal}(I : E)$ by the rule
$$\mu(h) = \text{the restriction of } h \text{ to } I$$
It is very easy to check that μ is a homomorphism. μ is surjective, because every E-fixing automorphism of I can be extended to an E-fixing automorphism of K, by Theorem 6 in Chapter 31.
Finally, if $h \in \mathrm{Gal}(K : E)$, the restriction of h to I is the identity function iff h(x) = x for every $x \in I$, that is, iff h fixes I. This proves that the kernel of μ is Gal(K : I).
To recapitulate: μ is a homomorphism from Gal(K : E) onto Gal(I : E) with kernel Gal(K : I). By the FHT, we immediately conclude as follows:
Theorem 4 Suppose $E \subseteq I \subseteq K$, where I is a root field over E and K is a root field over E. Then
$$\mathrm{Gal}(I : E) \cong \frac{\mathrm{Gal}(K : E)}{\mathrm{Gal}(K : I)}$$
It follows, in particular, that Gal(K : I) is a normal subgroup of Gal(K : E).
EXERCISES
† A. Computing a Galois Group
1 Show that $\mathbb{Q}(i, \sqrt{2})$ is the root field of $(x^2 + 1)(x^2 - 2)$ over Q.
# 2 Find the degree of $\mathbb{Q}(i, \sqrt{2})$ over Q.
3 List the elements of $\mathrm{Gal}(\mathbb{Q}(i, \sqrt{2}) : \mathbb{Q})$ and exhibit its table.
4 Write the inclusion diagram for the subgroups of $\mathrm{Gal}(\mathbb{Q}(i, \sqrt{2}) : \mathbb{Q})$, and the inclusion diagram for the fields intermediate between Q and $\mathbb{Q}(i, \sqrt{2})$. Indicate the Galois correspondence.
† B. Computing a Galois Group of Eight Elements
1 Show that $\mathbb{Q}(\sqrt{2}, \sqrt{3}, \sqrt{5})$ is the root field of $(x^2 - 2)(x^2 - 3)(x^2 - 5)$ over Q.
2 Show that the degree of $\mathbb{Q}(\sqrt{2}, \sqrt{3}, \sqrt{5})$ over Q is 8.
3 List the eight elements of $G = \mathrm{Gal}(\mathbb{Q}(\sqrt{2}, \sqrt{3}, \sqrt{5}) : \mathbb{Q})$ and write its table.
4 List the subgroups of G. (By Lagrange's theorem, any proper subgroup of G has either two or four elements.)
5 For each subgroup of G, find its fixfield.
6 Indicate the Galois correspondence by means of a diagram like the one on page 329.
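† C. A Galois Group Equal to $S_3$
1 Show that $\mathbb{Q}(\sqrt[3]{2}, i\sqrt{3})$ is the root field of $x^3 - 2$ over Q, where $\sqrt[3]{2}$ designates the real cube root of 2. (Hint: Compute the complex cube roots of unity.)
2 Show that $[\mathbb{Q}(\sqrt[3]{2}) : \mathbb{Q}] = 3$.
3 Explain why $x^2 + 3$ is irreducible over $\mathbb{Q}(\sqrt[3]{2})$, then show that $[\mathbb{Q}(\sqrt[3]{2}, i\sqrt{3}) : \mathbb{Q}(\sqrt[3]{2})] = 2$. Conclude that $[\mathbb{Q}(\sqrt[3]{2}, i\sqrt{3}) : \mathbb{Q}] = 6$.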
4 Use part 3 to explain why $\mathrm{Gal}(\mathbb{Q}(\sqrt[3]{2}, i\sqrt{3}) : \mathbb{Q})$ has six elements. Then use the discussion following Rule (ii) on page 323 to explain why every element of $\mathrm{Gal}(\mathbb{Q}(\sqrt[3]{2}, i\sqrt{3}) : \mathbb{Q})$ may be identified with a permutation of the three cube roots of 2.
5 Use part 4 to prove that $\mathrm{Gal}(\mathbb{Q}(\sqrt[3]{2}, i\sqrt{3}) : \mathbb{Q}) \cong S_3$.
† D. A Galois Group Equal to $D_4$
If $a = \sqrt[4]{2}$ is a real fourth root of 2, then the four fourth roots of 2 are $\pm a$ and $\pm ia$. […] the group of symmetries of the square.
† E. A Cyclic Galois Group
# 1 Describe the root field K of $x^7 - 1$ over Q. Explain why [K : Q] = 6.
2 Explain: If a is a primitive seventh root of unity, any $h \in \mathrm{Gal}(K : \mathbb{Q})$ must send a to a seventh root of unity. In fact, h is determined by h(a).
3 Use part 2 to list explicitly the six elements of Gal(K : Q). Then write the table of Gal(K : Q) and show that it is cyclic.
4 List all the subgroups of Gal(K : Q), with their fixfields. Exhibit the Galois correspondence.
5 Describe the root field L of $x^6 - 1$ over Q, and show that [L : Q] = 2. Explain why it follows that there are no intermediate fields between Q and L (except for Q and L themselves).
# 6 Let L be the root field of $x^6 - 2$ over Q. List the elements of Gal(L : Q) and write its table.
† F. A Galois Group Isomorphic to $S_5$
Let $a(x) = x^5 - 4x^4 + 2x + 2 \in \mathbb{Q}[x]$, and let $r_1, \ldots, r_5$ be the roots of a(x) in C. Let $K = \mathbb{Q}(r_1, \ldots, r_5)$ be the root field of a(x) over Q.
Prove parts 1-3:
1 a(x) is irreducible in Q[x].
2 a(x) has three real and two complex roots. [Hint: Use calculus to sketch the graph of y = a(x), and show that it crosses the x axis three times.]
3 If $r_1$ denotes a real root of a(x), $[\mathbb{Q}(r_1) : \mathbb{Q}] = 5$. Use this to prove that [K : Q] is a multiple of 5.
4 Use part 3 and Cauchy's theorem (Chapter 13, Exercise E) to prove that there is an element α of order 5 in Gal(K : Q). Since α may be identified with a permutation of $\{r_1, \ldots, r_5\}$, explain why it must be a cycle of length 5. (Hint: Any product of disjoint cycles on $\{r_1, \ldots, r_5\}$ has order $\neq 5$.)
5 Explain why there is a transposition in Gal(K : Q). [It permutes the conjugate pair of complex roots of a(x).]
6 Prove: Any subgroup of S5 which contains a cycle of length 5 and a transposition must contain all possible transpositions in S5, hence all of S5. Thus, Gal(K : Q) = S5.
G. Shorter Questions Relating to Automorphisms and Galois Groups
Let F be a field, and K a finite extension of F. Suppose $a, b \in K$. Prove parts 1-3:
1 If an automorphism h of K fixes F and a, then h fixes F(a).
2 $F(a, b)^* = F(a)^* \cap F(b)^*$.
3 Aside from the identity function, there are no Q-fixing automorphisms of $\mathbb{Q}(\sqrt[3]{2})$. [Hint: Note that $\mathbb{Q}(\sqrt[3]{2})$ contains only real numbers.]
4 Explain why the conclusion of part 3 does not contradict Theorem 1.
In the next three parts, let ω be a primitive pth root of unity, where p is a prime.
5 Prove: If $h \in \mathrm{Gal}(\mathbb{Q}(\omega) : \mathbb{Q})$, then $h(\omega) = \omega^k$ for some k where $1 \leq k \leq p - 1$.
6 Use part 5 to prove that $\mathrm{Gal}(\mathbb{Q}(\omega) : \mathbb{Q})$ is an abelian group.
7 Use part 5 to prove that $\mathrm{Gal}(\mathbb{Q}(\omega) : \mathbb{Q})$ is a cyclic group.
† H. The Group of Automorphisms of C
1 Prove: The only automorphism of Q is the identity function. [Hint: If h is an automorphism, h(1) = 1; hence h(2) = 2, and so on.]
2 Prove: Any automorphism of R sends squares of numbers to squares of numbers, hence positive numbers to positive numbers.
3 Using part 2, prove that if h is any automorphism of R, a) = b. Conclude that
[…] shown to be futile, but a criterion is made available to test any equation and determine if it has solutions given by a radical formula. All this will be made clear in the following pages.
Every quadratic equation $ax^2 + bx + c = 0$ has its roots given by the formula
$$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$$
Equations of degree 3 and 4 can be solved by similar formulas. For example, the cubic equation $x^3 + ax + b = 0$ has a solution given by
$$x = \sqrt[3]{-\frac{b}{2} + \sqrt{D}} + \sqrt[3]{-\frac{b}{2} - \sqrt{D}} \quad \text{where } D = \frac{a^3}{27} + \frac{b^2}{4} \tag{1}$$
Such expressions are built up from the coefficients of the given polynomials by repeated addition, subtraction, multiplication, division, and taking roots. Because of their use of radicals, they are called radical expressions or radical formulas. A polynomial a(x) is solvable by radicals if there is a radical expression giving its roots in terms of its coefficients.
Let us return to the example of $x^3 + ax + b = 0$, where a and b are rational, and look again at Formula (1). We may interpret this formula to assert that if we start with the field of coefficients Q, adjoin the square root $\sqrt{D}$, then adjoin the cube roots $\sqrt[3]{-b/2 \pm \sqrt{D}}$, we reach a field in which $x^3 + ax + b = 0$ has its roots.
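Formula (1) can be tried numerically. The sketch below is an illustration only: it assumes D ≥ 0, so that the square root and both cube roots can be taken among the real numbers, and the helper names `real_cbrt` and `cardano_root` are hypothetical. The sample equation x³ + 3x − 4 = 0 is chosen because x = 1 is visibly a root.

```python
import math

def real_cbrt(v):
    """Real cube root, valid for negative arguments as well."""
    return math.copysign(abs(v) ** (1.0 / 3.0), v)

def cardano_root(a, b):
    """One real root of x^3 + a*x + b = 0 via Formula (1), assuming D >= 0."""
    D = a ** 3 / 27 + b ** 2 / 4
    if D < 0:
        raise ValueError("D < 0: this sketch only covers the case of a real square root")
    sqrt_D = math.sqrt(D)          # first adjoin the square root of D
    return real_cbrt(-b / 2 + sqrt_D) + real_cbrt(-b / 2 - sqrt_D)  # then the cube roots

root = cardano_root(3, -4)         # x^3 + 3x - 4 = 0, where D = 5
residual = root ** 3 + 3 * root - 4
print(root, residual)
```

Up to floating-point rounding, the computed root is 1 and the residual is 0.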
In general, to say that the roots of a(x) are given by a radical expression is the same as saying that we can extend the field of coefficients of a(x) by successively adjoining nth roots (for various n), and in this way obtain a field which contains the roots of a(x). We will express this notion formally now, in the language of field theory.
$F(c_1, \ldots, c_n)$ is called a radical extension of F if, for each i, some power of $c_i$ is in $F(c_1, \ldots, c_{i-1})$. In other words, $F(c_1, \ldots, c_n)$ is an iterated extension of F obtained by successively adjoining nth roots, for various n. We say that a polynomial a(x) in F[x] is solvable by radicals if there is a radical extension of F containing all the roots of a(x), that is, containing the root field of a(x).
To deal effectively with nth roots we must know a little about them. To begin with, the nth roots of 1, called nth roots of unity, are, of course, the solutions of $x^n - 1 = 0$. Thus, for each n, there are exactly n nth roots of unity. As we shall see, everything we need to know about roots will follow from properties of the roots of unity.
In C the nth roots of unity are obtained by de Moivre's theorem. They consist of a number ω and its first n powers: $1 = \omega^0, \omega, \omega^2, \ldots, \omega^{n-1}$. We will not review de Moivre's theorem here because, remarkably, the main facts about roots of unity are true in every field of characteristic zero. Everything we need to know emerges from the following theorem:
Theorem 1 Any finite group of nonzero elements in a field is a cyclic group. (The operation in the group is the field's multiplication.)
Proof: If F* denotes the set of nonzero elements of F, suppose that $G \subseteq F^*$, and that G, with the field's "multiply" operation, is a group of n elements. We will compare G with $\mathbb{Z}_n$ and show that G, like $\mathbb{Z}_n$, has an element of order n and is therefore cyclic.
For any integer k, let g(k) be the number of elements of order k in G, and let z(k) be the number of elements of order k in $\mathbb{Z}_n$. For every positive integer k which is a factor of n, the equation $x^k = 1$ has at most k solutions in F; thus,
(*) G contains at most k elements whose order is a factor of k.
If G has an element a of order k, then $\langle a \rangle = \{e, a, a^2, \ldots, a^{k-1}\}$ are all the distinct elements of G whose order is a factor of k. [By (*), there cannot be any others.] In $\mathbb{Z}_n$, the subgroup
$$\left\langle \frac{n}{k} \right\rangle = \left\{0, \frac{n}{k}, \frac{2n}{k}, \ldots, \frac{(k-1)n}{k}\right\}$$
contains all the elements of $\mathbb{Z}_n$ whose order is a factor of k.
Since $\langle a \rangle$ and $\langle n/k \rangle$ are cyclic groups with the same number of elements, they are isomorphic; thus, the number of elements of order k in $\langle a \rangle$ is the same as the number of elements of order k in $\langle n/k \rangle$. Thus, g(k) = z(k).
Let us recapitulate: if G has an element of order k, then g(k) = z(k); but if G has no such elements, then g(k) = 0. Thus, for each positive integer k which is a factor of n, the number of elements of order k in G is less than (or equal to) the number of elements of order k in $\mathbb{Z}_n$.
Now, every element of G (as well as every element of $\mathbb{Z}_n$) has a well-defined order, which is a divisor of n. Imagine the elements of both groups to be partitioned into classes according to their order, and compare the classes in G with the corresponding classes in $\mathbb{Z}_n$. For each k, G has as many or fewer elements of order k than $\mathbb{Z}_n$ does. So if G had no elements of order n (while $\mathbb{Z}_n$ does have one), this would mean that G has fewer elements than $\mathbb{Z}_n$, which is false. Thus, G must have an element of order n, and therefore G is cyclic. ■
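The counting argument of this proof can be watched at work in a small case. As an illustrative assumption, take F to be the field of integers modulo the prime 13 and G the group of all its nonzero elements, so n = 12. The sketch tabulates g(k) and z(k) as defined in the proof; the function names are hypothetical.

```python
from math import gcd

def element_order_mult(x, p):
    """Order of x in the multiplicative group of nonzero residues mod a prime p."""
    k, y = 1, x % p
    while y != 1:
        y = y * x % p
        k += 1
    return k

def element_order_add(x, n):
    """Order of x in the additive group Z_n."""
    return n // gcd(x, n)

def order_counts(orders):
    counts = {}
    for k in orders:
        counts[k] = counts.get(k, 0) + 1
    return counts

p = 13
n = p - 1
g = order_counts(element_order_mult(x, p) for x in range(1, p))  # g(k) for G
z = order_counts(element_order_add(x, n) for x in range(n))      # z(k) for Z_n
generators = [x for x in range(1, p) if element_order_mult(x, p) == n]

print("g:", dict(sorted(g.items())))
print("z:", dict(sorted(z.items())))
print("elements of order n:", generators)
```

In this case the two tables agree for every k, and the elements of order n are exactly the generators of G.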
The nth roots of unity (which are contained in F or a suitable extension of F) obviously form a group with respect to multiplication. By Theorem 1, it is a cyclic group. Any generator of this group is called a primitive nth root of unity. Thus, if ω is a primitive nth root of unity, the set of all the nth roots of unity is
$$1, \omega, \omega^2, \ldots, \omega^{n-1}$$
If ω is a primitive nth root of unity, F(ω) is an abelian extension of F in the sense that $g \circ h = h \circ g$ for any two F-fixing automorphisms g and h of F(ω). Indeed, any automorphism must obviously send nth roots of unity to nth roots of unity. So if $g(\omega) = \omega^r$ and $h(\omega) = \omega^s$, then $g \circ h(\omega) = g(\omega^s) = \omega^{rs}$, and analogously, $h \circ g(\omega) = \omega^{rs}$. Thus, $g \circ h(\omega) = h \circ g(\omega)$.
Since g and h fix F, and every element of F(ω) […]. Indeed, if c is any other nth root of a, then clearly c/b is an nth root of 1, say $\omega^r$; hence $c = b\omega^r$. We may infer from the above that if F contains a primitive nth root of unity, and b is an nth root of a, then F(b) is the root field of $x^n - a$ over F.
In particular, F(b) is an abelian extension of F. Indeed, any F-fixing automorphism of F(b) must send nth roots of a to nth roots of a: for if c is any nth root of a and g is an F-fixing automorphism, then $g(c)^n = g(c^n) = g(a) = a$; hence g(c) is an nth root of a. So if $g(b) = b\omega^r$ and $h(b) = b\omega^s$, then
$$g \circ h(b) = g(b\omega^s) = b\omega^r\omega^s = b\omega^{r+s}$$
and
$$h \circ g(b) = h(b\omega^r) = b\omega^s\omega^r = b\omega^{r+s}$$
hence $g \circ h(b) = h \circ g(b)$. Since g and h fix F, and every element in F(b) is a linear combination of powers of b with coefficients in F, it follows that $g \circ h = h \circ g$.
If a(x) is in F[x], remember that a(x) is solvable by radicals just as long as there exists some radical extension of F containing the roots of a(x). [Any radical extension of F containing the roots of a(x) will do.] Thus, we may as well assume that any radical extension used here begins by adjoining to F the appropriate roots of unity; henceforth we will make this assumption. Thus, if $K = F(c_1, \ldots, c_n)$ is a radical extension of F, then
$$F \subseteq F(c_1) \subseteq F(c_1, c_2) \subseteq \cdots \subseteq F(c_1, \ldots, c_n) \tag{2}$$
is a sequence of simple abelian extensions. (The extensions are all abelian by the comments in the preceding three paragraphs.)
Still, this is not quite enough for our purposes: In order to use the machinery which was set up in the previous chapter, we must be able to say that each field in (2) is a root field over F. This may be accomplished as follows: Suppose we have already constructed the extensions $I_0 \subseteq I_1 \subseteq \cdots \subseteq I_q$ in (2) so that $I_q$ is a root field over F. We must extend $I_q$ to $I_{q+1}$, so that $I_{q+1}$ is a root field over F. Also, $I_{q+1}$ must include the element $c_{q+1}$, which is the nth root of some element $a \in I_q$.
Let $H = \{h_1, \ldots, h_r\}$ be the group of all the F-fixing automorphisms of $I_q$, and consider the polynomial
$$b(x) = [x^n - h_1(a)][x^n - h_2(a)] \cdots [x^n - h_r(a)]$$
By the proof of the lemma on page 327, one factor of b(x) is $(x^n - a)$; hence $c_{q+1}$ is a root of b(x). Moreover, by the same lemma, every coefficient of b(x) is in the fixfield of H, that is, in F. We now define $I_{q+1}$ to be the root field of b(x) over F. Since all the roots of b(x) are nth roots of elements in $I_q$, it follows that $I_{q+1}$ is a radical extension of $I_q$. The roots may be adjoined one by one, yielding a succession of abelian extensions, as discussed previously. To conclude, we may assume in (2) that K is a root field over F.
If G denotes the Galois group of K over F, each of these fields Ik has a fixer which is a subgroup of G. These fixers form a sequence
$$K^* \subseteq \cdots \subseteq I_{k+1}^* \subseteq I_k^* \subseteq \cdots \subseteq F^*$$
For each k, by Theorem 4 of Chapter 32, $I_{k+1}^*$ is a normal subgroup of $I_k^*$, and $I_k^* / I_{k+1}^* \cong \mathrm{Gal}(I_{k+1} : I_k)$, which is abelian because $I_{k+1}$ is an abelian extension of $I_k$. The following definition was invented precisely to account for this situation.
A group G is called solvable if it has a sequence of subgroups $\{e\} = H_0 \subseteq H_1 \subseteq \cdots \subseteq H_m = G$ such that for each k, $H_k$ is a normal subgroup of $H_{k+1}$ and $H_{k+1}/H_k$ is abelian.
We have shown that if K is a radical extension of F, then Gal(K : F) is a solvable group. We wish to go further and prove that if a(x) is any polynomial which is solvable by radicals, its Galois group is solvable. To do so, we must first prove that any homomorphic image of a solvable group is solvable. A key ingredient of our proof is the following simple fact, which was explained on page 152: G/H is abelian iff H contains all the products $xyx^{-1}y^{-1}$, for all x and y in G. (The products $xyx^{-1}y^{-1}$ are called "commutators" of G.)
Theorem 2 Any homomorphic image of a solvable group is a solvable group.
Proof: Let G be a solvable group, with a sequence of subgroups
$$\{e\} = H_0 \subseteq H_1 \subseteq \cdots \subseteq H_m = G$$
as specified in the definition. Let $f : G \to X$ be a homomorphism from G onto a group X. Then $f(H_0), f(H_1), \ldots, f(H_m)$ are subgroups of X, and clearly $\{e\} \subseteq f(H_0) \subseteq f(H_1) \subseteq \cdots \subseteq f(H_m) = X$. For each i we have the following: if $f(a) \in f(H_i)$ and $f(x) \in f(H_{i+1})$, then $a \in H_i$ and $x \in H_{i+1}$; hence $xax^{-1} \in H_i$ and therefore $f(x)f(a)f(x)^{-1} \in f(H_i)$. So $f(H_i)$ is a normal subgroup of $f(H_{i+1})$. Finally, since $H_{i+1}/H_i$ is abelian, every commutator $xyx^{-1}y^{-1}$ (for all x and y in $H_{i+1}$) is in $H_i$; hence every $f(x)f(y)f(x)^{-1}f(y)^{-1}$ is in $f(H_i)$. Thus, $f(H_{i+1})/f(H_i)$ is abelian. ■
Now we can prove the main result of this chapter:
Theorem 3 Let a(x) be a polynomial over a field F. If a(x) is solvable by radicals, its Galois group is a solvable group.
Proof: By definition, if K is the root field of a(x), there is an extension by radicals $F(c_1, \ldots, c_n)$ such that $F \subseteq K \subseteq F(c_1, \ldots, c_n)$. It follows by Theorem 4 of Chapter 32 that $\mathrm{Gal}(F(c_1, \ldots, c_n) : F) / \mathrm{Gal}(F(c_1, \ldots, c_n) : K) \cong \mathrm{Gal}(K : F)$; hence by that theorem, Gal(K : F) is a homomorphic image of $\mathrm{Gal}(F(c_1, \ldots, c_n) : F)$, which we know to be solvable. Thus, by Theorem 2, Gal(K : F) is solvable. ■
Actually, the converse of Theorem 3 is true also. All we need to show is that, if K is an extension of F whose Galois group over F is solvable, then K may be further extended to a radical extension of F. The details are not too difficult and are assigned as Exercise E at the end of this chapter.
Theorem 3 together with its converse say that a polynomial a(x) is solvable by radicals iff its Galois group is solvable.
We bring this chapter to a close by showing that there exist groups which are not solvable, and there exist polynomials having such groups as their Galois group. In other words, there are unsolvable polynomials. First, here is an unsolvable group:
Theorem 4 The symmetric group $S_5$ is not a solvable group.
Proof: Suppose $S_5$ has a sequence of subgroups
$$\{e\} = H_0 \subseteq H_1 \subseteq \cdots \subseteq H_m = S_5$$
as in the definition of solvable group. Consider the subset of $S_5$ containing all the cycles (ijk) of length 3. We will show that if $H_i$ contains all the cycles of length 3, so does the next smaller group $H_{i-1}$. It would follow in m steps that $H_0 = \{e\}$ contains all the cycles of length 3, which is absurd.
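So let $H_i$ contain all the cycles of length 3 in $S_5$. Remember that if α and β are in $H_i$, then their commutator $\alpha\beta\alpha^{-1}\beta^{-1}$ is in $H_{i-1}$. But any cycle (ijk) is equal to the commutator
$$(jkm)(ilj)(jkm)^{-1}(ilj)^{-1} = (jkm)(ilj)(mkj)(jli) = (ijk)$$
where l and m are the two remaining symbols and products are composed from right to left; hence every (ijk) is in $H_{i-1}$, as claimed. ■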
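The key step of this proof, that every cycle of length 3 in $S_5$ is a commutator of two cycles of length 3, can be confirmed by exhaustive computation. The sketch below rests on stated assumptions: the five symbols are taken to be 0, 1, 2, 3, 4, permutations are stored as tuples, and products are composed from right to left, so `compose(p, q)` applies q first. It also evaluates the single commutator (jkm)(ilj)(jkm)⁻¹(ilj)⁻¹ for i, j, k, l, m = 0, 1, 2, 3, 4.

```python
from itertools import permutations

N = 5
IDENTITY = tuple(range(N))

def compose(p, q):
    """Right-to-left product: apply q first, then p."""
    return tuple(p[q[x]] for x in range(N))

def inverse(p):
    inv = [0] * N
    for x, y in enumerate(p):
        inv[y] = x
    return tuple(inv)

def cycle(*pts):
    """The cycle sending each listed point to the next, and the last to the first."""
    p = list(IDENTITY)
    for a, b in zip(pts, pts[1:] + pts[:1]):
        p[a] = b
    return tuple(p)

def commutator(a, b):
    """a b a^-1 b^-1"""
    return compose(compose(a, b), compose(inverse(a), inverse(b)))

three_cycles = {cycle(i, j, k) for i, j, k in permutations(range(N), 3)}
commutators = {commutator(a, b) for a in three_cycles for b in three_cycles}

i, j, k, l, m = 0, 1, 2, 3, 4
explicit = commutator(cycle(j, k, m), cycle(i, l, j))

print(explicit == cycle(i, j, k))
print(three_cycles <= commutators)
```

Both checks succeed, so any subgroup of $S_5$ that contains all the 3-cycles also has all of them among its commutators.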
Before drawing our argument toward a close, we need to know one more fact about groups; it is contained in the following classical result of group theory:
Cauchy's theorem Let G be a finite group of n elements. If p is any prime number which divides n, then G has an element of order p.
For example, if G is a group of 30 elements, it has elements of orders 2, 3, and 5. To give our proof a trimmer appearance, we will prove Cauchy's theorem specifically for p = 5 (the only case we will use here, anyway). However, the same argument works for any value of p.
Proof: Consider all possible 5-tuples (a, b, c, d, k) of elements of G whose product abcdk = e. How many distinct 5-tuples of this kind are there? Well, if we select a, b, c, and d at random, there is a unique $k = d^{-1}c^{-1}b^{-1}a^{-1}$ in G making abcdk = e. Thus, there are $n^4$ such 5-tuples.
Call two 5-tuples equivalent if one is merely a cyclic permutation of the other. Thus, (a, b, c, d, k) is equivalent to exactly five distinct 5-tuples, namely, (a,b,c,d,k), (b,c,d,k,a), (c,d,k,a,b), (d, k, a, b, c) and (k, a, b, c, d). The only exception occurs when a 5-tuple is of the form (a, a, a, a, a) with all its components equal; it is equivalent only to itself. Thus, the equivalence class of any 5-tuple of the form (a, a, a, a, a) has a single member, while all the other equivalence classes have five members.
Are there any equivalence classes, other than {(e, e, e, e, e)}, with a single member? If not, then $5 \mid (n^4 - 1)$ [for there are $n^4$ 5-tuples under consideration, less (e, e, e, e, e)]; hence $n^4 \equiv 1 \pmod{5}$. But we are assuming that $5 \mid n$; hence $n^4 \equiv 0 \pmod{5}$, which is a contradiction.
This contradiction shows that there must be a 5-tuple $(a, a, a, a, a) \neq (e, e, e, e, e)$ such that $aaaaa = a^5 = e$. Thus, there is an element $a \in G$ of order 5. ■
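The count used in this proof can be reproduced by brute force on a small group. As an illustrative assumption, G is taken to be the group of symmetries of a regular pentagon (10 elements), built as permutations of the vertices 0 to 4; all names in the sketch are hypothetical.

```python
from itertools import product

N = 5
E = tuple(range(N))  # identity permutation

def compose(p, q):
    """Apply q first, then p."""
    return tuple(p[q[x]] for x in range(N))

def generate(generators):
    """Close a set of permutations under multiplication."""
    group, frontier = {E}, [E]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = compose(s, g)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group

def product_of(elements):
    result = E
    for g in elements:
        result = compose(result, g)
    return result

rotation = tuple((x + 1) % N for x in range(N))
reflection = tuple((-x) % N for x in range(N))
G = sorted(generate([rotation, reflection]))
n = len(G)

# All 5-tuples (a, b, c, d, k) with abcdk = e, found by exhaustive search.
tuples = [t for t in product(G, repeat=5) if product_of(t) == E]
# Tuples of the form (a, a, a, a, a): their equivalence classes have one member.
constant = [t for t in tuples if len(set(t)) == 1]
order_5 = [t[0] for t in constant if t[0] != E]

print(n, len(tuples), len(constant), len(order_5))
```

Here the number of 5-tuples is n⁴, the non-constant ones fall into classes of five, and the constant tuples other than (e, e, e, e, e) give the elements of order 5.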
We will now exhibit a polynomial in Q[x] having $S_5$ as its Galois group (remember that $S_5$ is not a solvable group).
Let $a(x) = x^5 - 5x - 2$. By Eisenstein's criterion, a(x + 2) is irreducible over Q; hence a(x) also is irreducible over Q. By elementary calculus, a(x) has a single maximum at (−1, 2), a single minimum at (1, −6), and a single point of inflection at (0, −2). Thus (see figure), its graph intersects the x axis exactly three times. This means that a(x) has three real roots, $r_1$, $r_2$, and $r_3$, and therefore two complex roots, $r_4$ and $r_5$, which must be complex conjugates of each other.
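Both claims about a(x) are easy to check by machine. The following sketch, with hypothetical helper names, expands a(x + 2), tests Eisenstein's criterion for the prime 5, and counts sign changes of a(x) at the sample points −2, −1, 0, 1, 2. Polynomials are stored as coefficient lists, lowest degree first.

```python
from math import comb

def shift(coeffs, c):
    """Coefficients of a(x + c), by the binomial theorem."""
    out = [0] * len(coeffs)
    for d, a_d in enumerate(coeffs):
        for i in range(d + 1):
            out[i] += a_d * comb(d, i) * c ** (d - i)
    return out

def eisenstein(coeffs, p):
    """True if Eisenstein's criterion applies to the polynomial for the prime p."""
    *lower, lead = coeffs
    return (lead % p != 0
            and all(c % p == 0 for c in lower)
            and lower[0] % (p * p) != 0)

def evaluate(coeffs, x):
    return sum(c * x ** d for d, c in enumerate(coeffs))

a = [-2, -5, 0, 0, 0, 1]                 # x^5 - 5x - 2
shifted = shift(a, 2)                    # coefficients of a(x + 2)
values = [evaluate(a, x) for x in (-2, -1, 0, 1, 2)]
sign_changes = sum(1 for u, v in zip(values, values[1:]) if u * v < 0)

print(shifted, eisenstein(shifted, 5))
print(values, sign_changes)
```

The three sign changes locate three real roots; that there are no more follows from the calculus argument in the text, since the derivative 5x⁴ − 5 vanishes only at x = ±1.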
Let K denote the root field of a(x) over Q, and G the Galois group of a(x). As we have already noted, every element of G may be identified with a permutation of the roots $r_1, r_2, r_3, r_4, r_5$ of a(x), so G may be viewed as a subgroup of $S_5$. We will show that G is all of $S_5$.
Now, $[\mathbb{Q}(r_1) : \mathbb{Q}] = 5$ because $r_1$ is a root of an irreducible polynomial of degree 5 over Q. Since [K : Q] = [K : […]
[…] a primitive pth root of unity in the field F.
1 If d is any root of $x^p - a \in F[x]$, show that F(ω, d) is a root field of $x^p - a$.
Suppose $x^p - a$ is not irreducible in F[x].
2 Explain why $x^p - a$ factors in F[x] as $x^p - a = p(x)f(x)$, where both factors have degree $\geq 2$.
# 3 If deg p(x) = m, explain why the constant term of p(x) (let us call it b) is equal to the product of m pth roots of a. Conclude that $b = \omega^k d^m$ for some k.
4 Use part 3 to prove that $b^p = a^m$.
5 Explain why m and p are relatively prime. Explain why it follows that there are integers s and t such that sm + tp = 1.
6 Explain why $b^{sp} = a^{sm}$. Use this to show that $(b^s a^t)^p = a$.
7 Conclude: If $x^p - a$ is not irreducible in F[x], it has a root (namely, $b^s a^t$) in F.
We have proved: $x^p - a$ either has a root in F or is irreducible over F.
† D. Another Way of Defining Solvable Groups
Let G be a group. The symbol $H \lhd G$ should be read, "H is a normal subgroup of G." A maximal normal subgroup of G is an $H \lhd G$ such that, if $H \lhd J \lhd G$, then necessarily J = H or J = G. Prove the following:
1 If G is a finite group, every normal subgroup of G is contained in a maximal normal subgroup.
2 Let $f : G \to H$ be a homomorphism. If $J \lhd H$, then $f^{-1}(J) \lhd G$.
# 3 Let $K \lhd G$. If $\mathcal{J}$ is a subgroup of G/K, let J denote the union of all the cosets which are members of $\mathcal{J}$. If $\mathcal{J} \lhd G/K$, then […]
24 $A \times (B - D) = (A \times B) - (A \times D)$.
APPENDIX B
REVIEW OF THE INTEGERS
One of the most important facts about the integers is that any integer m can be divided by any positive integer n to yield a quotient q and a nonnegative remainder r. (The remainder is less than the divisor n.) For example, 25 may be divided by 8 to give a quotient of 3 and a remainder of 1:
$$25 = 8 \times 3 + 1$$
This process is known as the division algorithm. It may be stated precisely as follows:
Theorem 1: Division algorithm If m and n are integers and n is positive, there exist unique integers q and r such that
$$m = nq + r \quad \text{and} \quad 0 \leq r < n$$
We call q the quotient and r the remainder when m is divided by n.
Here we shall take the division algorithm as a postulate of the system of the integers. (In Chapter 21 we started with a more fundamental premise and proved the division algorithm from it.)
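For experimenting with the division algorithm, Python's built-in `divmod` can serve as a quick check. This is only a sketch: the wrapper name `divide` is hypothetical, and it relies on the fact that Python rounds the quotient toward negative infinity, so the remainder satisfies 0 ≤ r < n whenever n is positive, even for negative m.

```python
def divide(m, n):
    """Return (q, r) with m == n*q + r and 0 <= r < n, for a positive divisor n."""
    if n <= 0:
        raise ValueError("n must be positive")
    q, r = divmod(m, n)  # floor division keeps the remainder in [0, n)
    return q, r

for m in (25, -25, 0, 7):
    q, r = divide(m, 8)
    print(f"{m} = 8 * {q} + {r}")
```

For m = 25 this reproduces the quotient 3 and remainder 1 of the example above; for m = −25 the quotient is −4 and the remainder is 7.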
If r and s are integers, we say that s is a multiple of r if there is an integer k such that
s = rk
In this case, we also say that r is a factor of s, or r divides s, and we symbolize this relationship by writing
$$r \mid s$$
For example, 3 is a factor of 12, so we write: $3 \mid 12$. Some of the elementary properties of divisibility are stated in the next theorem.
Theorem 2: The following are true for all integers a, b, and c:
(i) If $a \mid b$ and $b \mid c$, then $a \mid c$.
(ii) $1 \mid a$.
(iii) $a \mid 0$.
(iv) If $c \mid a$ and $c \mid b$, then $c \mid (ax + by)$ for all integers x and y.
(v) If $a \mid b$ and $c \mid d$, then $ac \mid bd$.
The proofs of these relationships follow from the definition of divisibility. For instance, we give here the proof of (iv): If $c \mid a$ and $c \mid b$, this means that a = kc and b = lc for some k and l. Then
$$ax + by = kcx + lcy = c(kx + ly)$$
Visibly, c is a factor of c(kx + ly), and hence a factor of ax + by. In symbols, $c \mid (ax + by)$, and we are done.
An integer t is called a common divisor of integers r and s if $t \mid r$ and $t \mid s$. A greatest common divisor of r and s is an integer t such that (i) $t \mid r$ and $t \mid s$, and (ii) For any integer u, if $u \mid r$ and $u \mid s$, then $u \mid t$.
In other words, t is a greatest common divisor of r and s if t is a common divisor of r and s and every other common divisor of r and s divides t.
It is an important fact that any two nonzero integers r and s always have a positive greatest common divisor:
Theorem 3: Any two nonzero integers r and s have a unique positive greatest common divisor t. Moreover, t is equal to a "linear combination" of r and s. That is,
$$t = kr + ls$$
for some integers k and l.
The unique positive greatest common divisor of r and s is denoted by the symbol gcd(r, s).
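The integers k and l of Theorem 3 can be computed explicitly. The sketch below uses the extended Euclidean algorithm, which is a standard technique and not a method stated in this appendix; the function name `extended_gcd` is hypothetical.

```python
def extended_gcd(r, s):
    """Return (t, k, l) with t = gcd(r, s) > 0 and t == k*r + l*s, for nonzero r, s."""
    # Invariant: old_t == old_k*r + old_l*s and t == k*r + l*s at every step.
    old_t, t = r, s
    old_k, k = 1, 0
    old_l, l = 0, 1
    while t != 0:
        q = old_t // t
        old_t, t = t, old_t - q * t
        old_k, k = k, old_k - q * k
        old_l, l = l, old_l - q * l
    if old_t < 0:  # normalize so that the gcd is positive
        old_t, old_k, old_l = -old_t, -old_k, -old_l
    return old_t, old_k, old_l

for r, s in ((4, 15), (12, 18), (-21, 14)):
    t, k, l = extended_gcd(r, s)
    print(f"gcd({r}, {s}) = {t} = ({k})*({r}) + ({l})*({s})")
```

Each printed line exhibits the greatest common divisor as a linear combination of r and s, as the theorem asserts.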
A pair of integers r and s are said to be relatively prime if they have no common divisors except ±1. For example, 4 and 15 are relatively prime. If r and s are relatively prime, their gcd is equal to 1. So by Theorem 3, there are integers k and l such that
$$kr + ls = 1$$
Actually, the converse of this statement is true too, and is stated in the next theorem:
Theorem 4: Two integers r and s are relatively prime if and only if there are integers k and l such that kr + ls = 1.
The proof of this theorem is left as an exercise. From Theorem 4, we deduce the following:
Theorem 5 If r and s are relatively prime, and $r \mid st$, then $r \mid t$.
Proof From Theorem 4 we know there are integers k and l such that kr + ls = 1. Multiplying through by t, we get
$$krt + lst = t \tag{1}$$
But we are given the fact that $r \mid st$; that is, st is a multiple of r, say st = mr. Substitution into Equation (1) gives krt + lmr = t, that is, r(kt + lm) = t. This shows that r is a factor of t, as required. ■
If an integer m has factors other than ±1 and ±m, we say that m is composite. If a positive integer $m \neq 1$ is not composite, we call it a prime. For example, 6 is composite (it has factors ±2 and ±3), while 7 is prime.
If p is a prime, then for any integer n, either p divides n or p and n are relatively prime. Thus, Theorem 5 has the following corollary:
Corollary Let m and n be integers, and let p be a prime: If $p \mid mn$, then either $p \mid m$ or $p \mid n$.
It is a major fact about integers that every positive integer m > 1 can be written, uniquely, as a product of primes. (The proof is given in Chapter 22.)
By a least common multiple of two integers r and s we mean a positive integer m such that
(i) $r \mid m$ and $s \mid m$, and
(ii) If $r \mid x$ and $s \mid x$, then $m \mid x$.
In other words, m is a common multiple of r and s and m is a factor of every other common multiple of r and s. In Chapter 22 it is shown that every pair of integers r and s has a unique least common multiple.
The least common multiple of r and s is denoted by lcm(r, s). The least common multiple has the following properties:
Theorem 6 For any integers a, b, and c,
(i) If gcd(a, b) = 1, then lcm(a, b) = ab.
(ii) Conversely, if lcm(a, b) = ab, then gcd(a, b) = 1.
(iii) If gcd(a, b) = c, then lcm(a, b) = ab/c.
(iv) lcm(a, ab) = ab.
The proofs are left as exercises.
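Before attempting the proofs, the four parts of Theorem 6 can be spot-checked numerically. The sketch below is a sanity check, not a proof, and assumes a and b are positive; the least common multiple is found by direct search so that the check does not presuppose part (iii). The function names are hypothetical.

```python
from math import gcd

def lcm_by_search(a, b):
    """Smallest positive common multiple of positive integers a and b."""
    return next(m for m in range(a, a * b + 1, a) if m % b == 0)

def check_theorem_6(limit):
    """Check parts (i) to (iv) for all positive a, b below the given limit."""
    for a in range(1, limit):
        for b in range(1, limit):
            m, c = lcm_by_search(a, b), gcd(a, b)
            if (c == 1) != (m == a * b):          # parts (i) and (ii)
                return False
            if m != a * b // c:                   # part (iii)
                return False
            if lcm_by_search(a, a * b) != a * b:  # part (iv)
                return False
    return True

print(check_theorem_6(40))
```

A result of True means no counterexample was found in the tested range.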
EXERCISES
Prove that the following are true for any integers a, b, and c:
1 If $a \mid b$ and $b \mid c$, then $a \mid c$.
2 If $a \mid b$, then $a \mid (-b)$ and $(-a) \mid b$.
3 $1 \mid a$ and $(-1) \mid a$.
4 $a \mid 0$.
5 If $a \mid b$, then $ac \mid bc$.
6 If a > 0, then gcd(a, 0) = a.
7 If gcd(ab, c) = 1, then gcd(a, c) = 1 and gcd(b, c) = 1.
8 If there are integers k and l such that ka + lb = 1, then a and b are relatively prime.
9 If $a \mid d$ and $c \mid d$ and gcd(a, c) = 1, then $ac \mid d$.
10 If $d \mid ab$ and $d \mid cb$ and gcd(a, c) = 1, then $d \mid b$.
11 If gcd(a, b) = 1, then lcm(a, b) = ab.
12 If lcm(a, b) = ab, then gcd(a, b) = 1.
13 If gcd(a, b) = c, then lcm(a, b) = ab/c.
14 lcm(a, ab) = ab.
15 a · lcm(b, c) = lcm(ab, ac).
APPENDIX C
REVIEW OF MATHEMATICAL INDUCTION
The basic assumption one makes about the ordering of the integers is the following:
Well-ordering principle. Every nonempty set of positive integers has a least element.
From this assumption, it is easy to prove the following theorem, which underlies the method of proof by induction:
Theorem 1: Principle of mathematical induction Let A represent a set of positive integers. Consider the following two conditions:
(i) 1 is in A.
(ii) For any positive integer k, if k is in A, then k + 1 is in A.
If A is any set of positive integers satisfying these two conditions, then A consists of all the positive integers.
Proof: If A does not contain all the positive integers, then by the well-ordering principle (above), the set of all the positive integers which are not in A has a least element; call it b. From Condition (i), $b \neq 1$; hence b > 1.
Thus, b − 1 > 0, and $b - 1 \in A$ because b is the least positive integer not in A. But then, from Condition (ii), $b \in A$, which is impossible. ■
Here is an example of how the principle of mathematical induction is used: We shall prove the identity
$$1 + 2 + \cdots + n = \frac{n(n+1)}{2} \tag{1}$$
that is, the sum of the first n positive integers is equal to n(n + 1)/2.
Let A consist of all the positive integers n for which Equation (1) is true. Then 1 is in A because
1 :
1-2
ANSWERS TO SELECTED EXERCISES
Next, suppose that k is any positive integer in A; we must show that, in that case, k + 1 also is in A. To say that k is in A means that
$$1 + 2 + \cdots + k = \frac{k(k + 1)}{2}$$
By adding k + 1 to both sides of this equation, we get
$$1 + 2 + \cdots + k + (k + 1) = \frac{k(k + 1)}{2} + (k + 1)$$
that is,
$$1 + 2 + \cdots + (k + 1) = \frac{(k + 1)(k + 2)}{2}$$
From this last equation, $k + 1 \in A$.
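The two parts of the induction argument can also be checked numerically. The following is a small sketch (the function names are illustrative, not from the text): it compares the direct sum with the closed form and verifies the algebra of the induction step for a range of k.

```python
def sum_first(n: int) -> int:
    """Add 1 + 2 + ... + n term by term."""
    total = 0
    for i in range(1, n + 1):
        total += i
    return total


def closed_form(n: int) -> int:
    """The right-hand side of Equation (1): n(n + 1)/2."""
    return n * (n + 1) // 2


def induction_step_holds(k: int) -> bool:
    """Check that k(k + 1)/2 + (k + 1) equals (k + 1)(k + 2)/2."""
    return closed_form(k) + (k + 1) == (k + 1) * (k + 2) // 2


# Base case and a range of further cases.
print(sum_first(1) == closed_form(1))
print(all(sum_first(n) == closed_form(n) for n in range(1, 200)))
print(all(induction_step_holds(k) for k in range(1, 200)))
```

A finite check of this kind does not replace the induction proof; it only confirms that the formula and the step behave as claimed on the cases tried.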
We have shown that $1 \in A$, and moreover, that if k [...]
[...] 0 such that $x^n \in H$ and $y^m \in H$. Since H is a subgroup of G, it is closed under products, and hence under exponentiation (which is repeated multiplication of an element by itself). Thus $(x^n)^m \in H$ and $(y^m)^n \in H$. Set q = mn. Since G is abelian, $(xy)^q = x^q y^q = x^{mn} y^{mn} = (x^n)^m (y^m)^n \in H$ since both $(x^n)^m$ and $(y^m)^n$ are in H. Complete the problem.
D5 $S = \{a_1, \ldots, a_n\}$ has n elements. The n products $a_1a_1, a_1a_2, \ldots, a_1a_n$ are elements of S (why?) and no two of them can be equal (why?). Hence every element of S is equal to one of these products. In particular, $a_1 = a_1a_k$ for some k. Thus, $a_1e = a_1a_k$, and hence $e = a_k$. This shows that $e \in S$. Now complete the problem.
D7 (a) Suppose $x \in K$. Then
(i) if $a \in H$, then $xax^{-1} \in H$, and
(ii) if $xbx^{-1} \in H$, then $b \in H$.
We shall prove that $x^{-1} \in K$: we must first show that if $a \in H$, then $x^{-1}a(x^{-1})^{-1} = x^{-1}ax \in H$. Well, $a = x(x^{-1}ax)x^{-1}$, and if $x(x^{-1}ax)x^{-1} \in H$, then $x^{-1}ax \in H$ by (ii) above. [Use (ii) with $x^{-1}ax$ replacing b.] Conversely, we must show that if $x^{-1}ax \in H$, then $a \in H$. Well, if $x^{-1}ax \in H$, then by (i) above, $x(x^{-1}ax)x^{-1} = a \in H$.
E7 We begin by listing all the elements of Z2 x Z4 obtained by adding (1,1) to itself repeatedly: (1,1); (1,1) + (1,1) = (0, 2); (1,1) + (1,1) + (1,1) = (1,3); (1,1) + (1,1) + (1,1) + (1,1) = (0,0). If we continue adding (1,1) to multiples of itself, we simply obtain the above four pairs over and over again. Thus, (1, 1) is not a generator of all of Z2 x Z4.
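The repeated addition described above can be sketched in a few lines. In this sketch, elements of Z2 x Z4 are modeled as pairs of integers added componentwise modulo 2 and 4; the function names `generated_by` and `is_cyclic` are illustrative.

```python
from itertools import product


def generated_by(g, moduli=(2, 4)):
    """List g, g + g, g + g + g, ... until the list starts to repeat."""
    seen = []
    current = g
    while current not in seen:
        seen.append(current)
        current = tuple((x + y) % m for x, y, m in zip(current, g, moduli))
    return seen


def is_cyclic(moduli=(2, 4)) -> bool:
    """True if some single element generates the whole product group."""
    size = 1
    for m in moduli:
        size *= m
    elements = product(*(range(m) for m in moduli))
    return any(len(generated_by(g, moduli)) == size for g in elements)


print(generated_by((1, 1)))  # the four pairs listed in the text
print(is_cyclic())
```

The first line of output reproduces the four pairs obtained from (1, 1); the second reports whether any of the eight elements generates all of Z2 x Z4.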
This process is repeated for every element of Z2 x Z4. None is a generator of Z2 x Z4; hence Z2 x Z4 is not cyclic.
F1 The table of G is as follows:
        e      a      b      b^2    ab     ab^2
e       e      a      b      b^2    ab     ab^2
a       a      e      ab     ab^2   b      b^2
b       b      ab^2   b^2    e      a      ab
b^2     b^2    ab     e      b      ab^2   a
ab      ab     b^2    ab^2   a      e      b
ab^2    ab^2   b      a      ab     b^2    e
Using the defining equations $a^2 = e$, $b^3 = e$, and $ba = ab^2$, we compute the product of $ab$ and $ab^2$ in this way:
$$(ab)(ab^2) = a(ba)b^2 = a(ab^2)b^2 = a^2b^3b = eeb = b$$
Complete the problem by exhibiting the computation of all the table entries.
H3 Recall the definition of the operation + in Chapter 3, Exercise F: x + y has 1s in those positions where x and y differ, and 0s elsewhere.
CHAPTER 6
A4 From calculus, the function $f(x) = x^3 - 3x$ is continuous and unbounded. Its graph is shown below. [f is unbounded because $f(x) = x(x^2 - 3)$ is an arbitrarily large positive number for sufficiently large positive values of x, and an arbitrarily large negative number (large in absolute value) for sufficiently large negative values of x.] Because f is continuous and unbounded, the range of f is $\mathbb{R}$. Thus, f is surjective. Now determine whether f is injective, and prove your answer.
[Figure: graph of $f(x) = x^3 - 3x$]
A6 f is injective: To prove this, note first that if x is an integer then f(x) is an integer, and if x is not an integer, then f(x) is not an integer. Thus, if f(x) = f(y), then x and y are either both integers or both nonintegers. Case 1, both integers: then f(x) = 2x, f(y) = 2y, and 2x = 2y; so x = y. Case 2, both nonintegers: ... (Complete the problem. Determine whether f is surjective.)
F5 Let $A = \{a_1, a_2, \ldots, a_n\}$. If f is any function from A to A, there are n possible values for $f(a_1)$, namely, $a_1, a_2, \ldots, a_n$. Similarly there are n possible values for $f(a_2)$. Thus there are $n^2$ pairs consisting of a value of $f(a_1)$ together with a value of $f(a_2)$. Similarly there are $n^3$ triples consisting of a value of $f(a_1)$, a value of $f(a_2)$, and a value of $f(a_3)$. We may continue in this fashion and conclude as follows: Since a function f is specified by giving a value for $f(a_1)$, a value for $f(a_2)$, and so on up to a value for $f(a_n)$, there are $n^n$ functions from A to A. Now, by reasoning in a similar fashion, how many bijective functions are there?
H1 The following is one example (though not the only possible one) of a machine capable of carrying out the prescribed task:
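$A = \{a, b, c, d\}$   $S = \{s_0, s_1, s_2, s_3, s_4\}$
The next-state function is described by the following table:
        a      b      c      d
s_0     s_1    s_0    s_0    s_0
s_1     s_2    s_1    s_1    s_1
s_2     s_3    s_2    s_2    s_2
s_3     s_4    s_3    s_3    s_3
s_4     s_4    s_4    s_4    s_4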
To explain why the machine carries out the prescribed function, note first that the letters b, c, and d never cause the machine to change its state; only a does. If the machine begins reading a sequence in state $s_0$, it will be in state $s_3$ after exactly three a's have been read. Any subsequent a's will leave the machine in state $s_4$. Thus, if the machine ends in $s_3$, the sequence it has read contains exactly three a's. The machine's state diagram is illustrated below:
[Figure: state diagram. Each of the states $s_0$, $s_1$, $s_2$, $s_3$ has a loop labeled b, c, d; the state $s_4$ has a loop labeled a, b, c, d.]
I4 $M_1$ has only two distinct transition functions, which we shall denote by $T_o$ and $T_e$ (where o may be any sequence with an odd number of 1s, and e any sequence with an even number of 1s). $T_o$ and $T_e$ may be described as follows:
$$T_o(s_0) = s_1, \quad T_o(s_1) = s_0, \qquad T_e(s_0) = s_0, \quad T_e(s_1) = s_1$$
Now, $T_e \circ T_o = T_{oe}$ by part 3. Since e is a sequence with an even number of 1s and o is a sequence with an odd number of 1s, oe is a sequence with an odd number of 1s; hence $T_{oe} = T_o$. Thus, $T_e \circ T_o = T_o$. Similarly, $T_o \circ T_e = T_o$, $T_e \circ T_e = T_e$, and $T_o \circ T_o = T_e$.
In brief, the table of $\mathcal{S}(M_1)$ is as follows:
∘       T_e    T_o
T_e     T_e    T_o
T_o     T_o    T_e
The table shows that $\mathcal{S}(M_1)$ is a two-element group. The identity element is $T_e$, and $T_o$ is its own inverse.
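The composition table above can be reproduced with a short simulation. This sketch assumes that $M_1$ is the two-state parity machine in which reading a 1 toggles the state and reading a 0 leaves it unchanged; that description, the encoding of $s_0$, $s_1$ as 0, 1, and the function names are all assumptions made for illustration.

```python
def next_state(state: int, symbol: str) -> int:
    """Assumed parity machine: '1' toggles the state, '0' leaves it alone."""
    return state ^ 1 if symbol == "1" else state


def transition(word: str) -> tuple:
    """Return T_word as the pair (image of s0, image of s1)."""
    images = []
    for start in (0, 1):
        state = start
        for symbol in word:
            state = next_state(state, symbol)
        images.append(state)
    return tuple(images)


def compose(second: tuple, first: tuple) -> tuple:
    """Composite second ∘ first: apply `first`, then `second`."""
    return tuple(second[first[s]] for s in (0, 1))


T_e = transition("0110")   # a word with an even number of 1s
T_o = transition("100")    # a word with an odd number of 1s

print(T_e, T_o)
print(compose(T_e, T_o) == T_o, compose(T_o, T_o) == T_e)
```

Under these assumptions every word yields one of only two transition functions, and composing them follows the table of a two-element group.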
CHAPTER 7
D2 $f_m$ is the function defined by the formula
$$f_m(x) = x + m$$
Use this fact to show that $f_n \circ f_m = f_{n+m}$.
In order to show that $f_{-n}$ is the inverse of $f_n$, we must verify that $f_n \circ f_{-n} = \varepsilon$ and $f_{-n} \circ f_n = \varepsilon$. We verify the first equation as follows: $f_{-n}(x) = x + (-n)$; hence
$$[f_n \circ f_{-n}](x) = f_n(f_{-n}(x)) = f_n(x - n) = x - n + n = x = \varepsilon(x)$$
Since $[f_n \circ f_{-n}](x) = \varepsilon(x)$ for every x in $\mathbb{R}$, it follows that $f_n \circ f_{-n} = \varepsilon$.
E3 To prove that $f_{1/a,\,-b/a}$ is the inverse of $f_{a,b}$, we must verify that
$$f_{a,b} \circ f_{1/a,\,-b/a} = \varepsilon \quad \text{and} \quad f_{1/a,\,-b/a} \circ f_{a,b} = \varepsilon$$
To verify the first equation, we begin with the fact that $f_{1/a,\,-b/a}(x) = x/a - b/a$. Now complete the problem.
H2 If $f, g \in G$, then f moves a finite number of elements of A, say $a_1, \ldots, a_n$, and g moves a finite number of elements of A, say $b_1, \ldots, b_m$. Now if x is any element of A which is not one of the elements $a_1, \ldots, a_n, b_1, \ldots, b_m$, then
$$[f \circ g](x) = f(g(x)) = f(x) = x$$
Thus, $f \circ g$ moves at most the finite set of elements $\{a_1, \ldots, a_n, b_1, \ldots, b_m\}$.
CHAPTER 8
Al(e) (I
8 2
5 6 6 5 1
7 8^ 7 4y
A4(f) $\gamma^3 = \gamma \circ \gamma \circ \gamma = (1\ 2\ 3\ 4\ 5)$; $\alpha^{-1} = (4\ 1\ 7\ 3)$; thus,
$$\gamma^3 \circ \alpha^{-1} = (1\ 2\ 3\ 4\ 5) \circ (4\ 1\ 7\ 3) = (1\ 7\ 4\ 2\ 3\ 5)$$
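The product of cycles above can be verified mechanically. The sketch below represents a permutation as a dictionary and composes right to left (the right-hand factor is applied first), which is the convention the computation above follows. It assumes the permutations act on {1, ..., 7}, since 7 is the largest symbol that appears; the helper names are illustrative.

```python
def cycle_to_map(cycle, n):
    """Turn a cycle such as (1, 2, 3) into a mapping on {1, ..., n}."""
    mapping = {i: i for i in range(1, n + 1)}
    for i, x in enumerate(cycle):
        mapping[x] = cycle[(i + 1) % len(cycle)]
    return mapping


def compose(f, g):
    """Return f ∘ g: apply g first, then f."""
    return {x: f[g[x]] for x in g}


def to_cycles(perm):
    """Write a permutation as a list of cycles, omitting fixed points."""
    seen, cycles = set(), []
    for start in sorted(perm):
        if start in seen:
            continue
        cycle, x = [], start
        while x not in seen:
            seen.add(x)
            cycle.append(x)
            x = perm[x]
        if len(cycle) > 1:
            cycles.append(tuple(cycle))
    return cycles


gamma_cubed = cycle_to_map((1, 2, 3, 4, 5), 7)
alpha_inverse = cycle_to_map((4, 1, 7, 3), 7)
print(to_cycles(compose(gamma_cubed, alpha_inverse)))  # [(1, 7, 4, 2, 3, 5)]
```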
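B2 $\alpha = \begin{pmatrix} a_1 & a_2 & \cdots & a_s \\ a_2 & a_3 & \cdots & a_1 \end{pmatrix}$, $\alpha^2 = \begin{pmatrix} a_1 & a_2 & \cdots & a_s \\ a_3 & a_4 & \cdots & a_2 \end{pmatrix}$
and so on. Note that $\alpha^2(a_1) = a_3$, $\alpha^2(a_2) = a_4$, ..., $\alpha^2(a_{s-2}) = a_s$. Finally, $\alpha^2(a_{s-1}) = a_1$ and $\alpha^2(a_s) = a_2$. Thus, $\alpha^2(a_i) = $ ? Complete the problem using addition modulo s, page 27.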
B4 Let $\alpha = (a_1\ a_2 \cdots a_s)$ where s is odd. Then $\alpha^2 = $ [...]
E2 If $\alpha$ and $\beta$ are cycles of the same length, $\alpha = (a_1 \cdots a_s)$ and $\beta = (b_1 \cdots b_s)$, let $\pi$ be the following permutation: $\pi(a_i) = b_i$ for $i = 1, \ldots, s$ and $\pi(k) = k$ for $k \neq a_1, \ldots, a_s, b_1, \ldots, b_s$. Finally, let $\pi$ map distinct elements of $\{b_1, \ldots, b_s\} - \{a_1, \ldots, a_s\}$ to distinct elements of $\{a_1, \ldots, a_s\} - \{b_1, \ldots, b_s\}$. Now complete the problem, supplying details.
F1 $\alpha^k = \begin{pmatrix} a_1 & a_2 & \cdots & a_s \\ a_{1+k} & a_{2+k} & \cdots & a_{s+k} \end{pmatrix}$ when k is a positive integer, k < s.
For what values of k can you have $\alpha^k = \varepsilon$?
H2 Use Exercise H1 and the fact that $(i\ j) = (1\ i)(1\ j)(1\ i)$.
CHAPTER 9
C1 The group tables for G and H are as follows:
Table for G
        I     V     H     D
I       I     V     H     D
V       V     I     D     H
H       H     D     I     V
D       D     H     V     I
Table for H
        1     -1    i     -i
1       1     -1    i     -i
-1      -1    1     -i    i
i       i     -i    -1    1
-i      -i    i     1     -1
G and H are not isomorphic because in G every element is its own inverse (VV = I, HH = I, and DD = I), whereas in H there are elements not equal to their inverse; for example, $(-i)(-i) = -1 \neq 1$. Find at least one other difference which shows that $G \not\cong H$.
C4 The group tables for G and H are as follows:
Table for G
        ε     α     β     γ     δ     κ
ε       ε     α     β     γ     δ     κ
α       α     ε     γ     β     κ     δ
β       β     κ     δ     α     ε     γ
γ       γ     δ     κ     ε     α     β
δ       δ     γ     ε     κ     β     α
κ       κ     β     α     δ     γ     ε
Table for H
        I     A     B     C     D     K
I       I     A     B     C     D     K
A       A     I     C     B     K     D
B       B     K     D     A     I     C
C       C     D     K     I     A     B
D       D     C     I     K     B     A
K       K     B     A     D     C     I
G and H are isomorphic. Indeed, let the function $f : G \to H$ be defined by
$$f = \begin{pmatrix} \varepsilon & \alpha & \beta & \gamma & \delta & \kappa \\ I & A & B & C & D & K \end{pmatrix}$$
By inspection, f transforms the table of G into the table of H. Thus, f is an isomorphism from G to H.
E1 Show that the function $f : \mathbb{Z} \to E$ given by $f(n) = 2n$ is bijective and that $f(n + m) = f(n) + f(m)$.
F1 Check that $(2\ 4)^2 = \varepsilon$, $(1\ 2\ 3\ 4)^4 = \varepsilon$, and $(1\ 2\ 3\ 4)(2\ 4) = (2\ 4)(1\ 2\ 3\ 4)^3$. Now explain why $G \cong G'$.
H2 Let $f_G : G_1 \to G_2$ and $f_H : H_1 \to H_2$ be isomorphisms, and find an isomorphism $f : G_1 \times H_1 \to G_2 \times H_2$.
CHAPTER 10
A1(c) If m < 0 and n < 0, let m = -k and n = -l, where k, l > 0. Then m + n = -(k + l). Now,
$$a^m = a^{-k} = (a^{-1})^k \quad \text{and} \quad a^n = a^{-l} = (a^{-1})^l$$
Hence
$$a^m a^n = (a^{-1})^k (a^{-1})^l = (a^{-1})^{k+l} = a^{-(k+l)} = a^{m+n}$$
B3 The order of f is 4. Explain why.
C4 For any positive integer k, if $a^k = e$, then
$$(bab^{-1})^k = ba^kb^{-1} \text{ (why?)} = bb^{-1} = e$$
Conversely, if $(bab^{-1})^k = ba^kb^{-1} = e$, then $a^k = e$. (Why?) Thus, for any positive integer k, $a^k = e$ iff $(bab^{-1})^k = e$. Now complete the problem.
D2 Let the order of a be equal to n. Then $(a^k)^n = a^{nk} = (a^n)^k = e^k = e$. Now use Theorem 5.
F2 The order of $a^8$ is 3. Explain why.
H2 The order is 24. Explain why.
CHAPTER 11
A6 If k is a generator of $\mathbb{Z}$, this means that $\mathbb{Z}$ consists of all the multiples of k; that is, k, 2k, 3k, etc., as well as 0 and -k, -2k, -3k, etc.:
[Figure: number line marking the multiples of k, including -2k, 0, and 2k]
B1 Let G be a group of order n, and suppose G is cyclic, say $G = \langle a \rangle$. Then a, the generator of G, is an element of order n. (This follows from the discussion on the first two pages of this chapter.) Conversely, let G be a group of order n (that is, a group with n elements), and suppose G has an element a of order n. Prove that G is cyclic.
C4 By Exercise B4, there is an element b of order m in $\langle a \rangle$, and $b \in C_m$. Since $C_m$ is a subgroup of $\langle a \rangle$, which is cyclic, we know from Theorem 2 that $C_m$ is cyclic. Since every element x in $C_m$ satisfies $x^m = e$, no element in $C_m$ can have order greater than m. Now complete the argument.
C7 First, assume that $\operatorname{ord}(a^r) = m$. Then $(a^r)^m = a^{rm} = e$. Use Theorem 5 of Chapter 10 to show that r = kl for some integer l. To show that l and m are relatively prime, assume on the contrary that l and m have a common factor q; that is,
m = hq   h [...]
D6
F3
Slope = 2
Complete the solution, supplying details.
D5 If $ab^{-1}$ commutes with every x in G, then we can show that $ba^{-1}$ commutes with every x in G:
$$ba^{-1}x = (x^{-1}ab^{-1})^{-1} = (ab^{-1}x^{-1})^{-1} \text{ (why?)}$$
CHAPTER 13
B1 Note first that the operation in the case of the group $\mathbb{Z}$ is addition. The subgroup ⟨3⟩ consists of all the multiples of 3, that is,
⟨3⟩ = {..., -9, -6, -3, 0, 3, 6, 9, ...}
The cosets of ⟨3⟩ are ⟨3⟩ + 0 = ⟨3⟩, as well as
⟨3⟩ + 1 = {..., -8, -5, -2, 1, 4, 7, 10, ...}
⟨3⟩ + 2 = {..., -7, -4, -1, 2, 5, 8, 11, ...}
Note that ⟨3⟩ + 3 = ⟨3⟩, ⟨3⟩ + 4 = ⟨3⟩ + 1, and so on; hence there are only three cosets of ⟨3⟩, namely,
⟨3⟩ = ⟨3⟩ + 0     ⟨3⟩ + 1     ⟨3⟩ + 2
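The cosets listed above can be generated directly. Since each coset is infinite, this sketch only lists the members that fall in a finite window; the window bounds and the function name `coset` are arbitrary choices made for illustration.

```python
def coset(r: int, modulus: int = 3, lo: int = -9, hi: int = 11) -> list:
    """Members of the coset <modulus> + r that lie in the window [lo, hi]."""
    return [x for x in range(lo, hi + 1) if (x - r) % modulus == 0]


for r in range(3):
    print(r, coset(r))

# Shifting by further integers produces no new cosets.
distinct = {tuple(coset(r)) for r in range(-10, 11)}
print(len(distinct))  # 3
```

Within the window, `coset(3)` coincides with `coset(0)` and `coset(4)` with `coset(1)`, matching the remark that only three cosets occur.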
C6 Every element a of order p belongs in a subgroup ⟨a⟩. The subgroup ⟨a⟩ has p - 1 elements other than e (why?), and each of these elements has order p (why?). Complete the solution.
D6 For one part of the problem, use Lagrange's theorem. For another part, use the result of Exercise F4, Chapter 11.
E4 To say that aH = Ha is to say that every element in aH is in Ha and conversely. That is, for any $h \in H$, there is some $k \in H$ such that ah = ka and there is some $l \in H$ such that ha = al. (Explain why this is equivalent to aH = Ha.) Now, an arbitrary element of $a^{-1}H$ is of the form $a^{-1}h = (h^{-1}a)^{-1}$. Complete the solution.
J3 $O(1) = O(2) = \{1, 2, 3, 4\}$; $G_1 = \{\varepsilon, \beta\}$; $G_2 = \{\varepsilon, \alpha\beta\alpha\}$. Complete the problem, supplying details.
CHAPTER 14
A6 We use the following properties of sets: For any three sets X, Y, and Z,
(i) $(X \cup Y) \cap Z = (X \cap Z) \cup (Y \cap Z)$
(ii) $(X - Y) \cap Z = (X \cap Z) - (Y \cap Z)$
Now here is the proof that h is a homomorphism: Let C and D be any subsets of A; then
$h(C + D) = h[(C - D) \cup (D - C)]$   by def. of the operation +
$= [(C - D) \cup (D - C)] \cap B$   by def. of h
Now complete, using (i) and (ii).
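The claim being proved can be tested exhaustively on a small example. The sketch below takes $h(C) = C \cap B$, as the computation above indicates, and models the operation + as symmetric difference of sets; the particular sets A and B and the function names are assumptions chosen for illustration.

```python
from itertools import combinations


def subsets(s):
    """All subsets of s, as frozensets."""
    items = sorted(s)
    return [frozenset(c)
            for size in range(len(items) + 1)
            for c in combinations(items, size)]


def h(c, b):
    """The map under study: intersect a subset with the fixed set b."""
    return frozenset(c) & frozenset(b)


def is_homomorphism(a, b) -> bool:
    """Check h(C + D) == h(C) + h(D) for all subsets C, D of a."""
    for c in subsets(a):
        for d in subsets(a):
            # ^ is symmetric difference: (C - D) ∪ (D - C).
            if h(c ^ d, b) != h(c, b) ^ h(d, b):
                return False
    return True


print(is_homomorphism({1, 2, 3, 4}, {2, 3}))
```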
C1 Let f be injective. To show that K = {e}, take any arbitrary element $x \in K$ and show that necessarily x = e. Well, since $x \in K$, f(x) = e = f(e). Now complete, and prove the converse: Assume K = {e} ....
D6 Consider the following family of subsets of G: $\{H_i : i \in I\}$, where each $H_i$ is a normal subgroup of G. Show that $H = \bigcap H_i$ is a normal subgroup of G. First, show that H is closed under the group operation: well, if $a, b \in H$, then $a \in H_i$ and $b \in H_i$ for every $i \in I$. Since each $H_i$ is a subgroup of G, $ab \in H_i$ for every $i \in I$; hence $ab \in \bigcap H_i$. Now complete.
El If H has index 2, then G is partitioned into exactly two right cosets of H; also G is partitioned into exactly two left cosets of H. One of the cosets in each case is H.
E6 First, show that if $x \in S$ and $y \in S$, then $xy \in S$. Well, if $x \in S$, then $x \in Ha = aH$ for some $a \in G$. And if $y \in S$, then $y \in Hb = bH$ for some $b \in G$. Show that $xy \in H(ab)$ and that $H(ab) = (ab)H$ and then complete the problem.
I4 It is easy to show that $aHa^{-1} \subseteq H$. Show it. What does Exercise I2 tell you about the number of elements in $aHa^{-1}$?
I8 Let $X = \{aHa^{-1} : a \in G\}$ be the set of all the conjugates of H, and let $Y = \{aN : a \in G\}$ be the set of all the cosets of N. Find a function $f : X \to Y$ and show that f is bijective.
CHAPTER 15
C4 Every element of G/H is a coset Hx. Assume every element of G/H has a square root: this means that for every $x \in G$,
$$Hx = (Hy)^2$$
for some $y \in G$. Avail yourself of Theorem 5 in this chapter.
D4 Let H be generated by $\{h_1, \ldots, h_n\}$ and let G/H be generated by $\{Ha_1, \ldots, Ha_m\}$. Show that G is generated by
$$\{a_1, \ldots, a_m, h_1, \ldots, h_n\}$$
that is, every x in G is a product of the above elements and their inverses.
E6 Every element of $\mathbb{Q}/\mathbb{Z}$ is a coset $\mathbb{Z} + (m/n)$.
G6 If G is cyclic, then necessarily $G \cong \mathbb{Z}_{p^2}$. (Why?)
If G is not cyclic, then every element $x \neq e$ in G has order p. (Why?) Take any two elements $a \neq e$ and $b \neq e$ in G where b is not a power of a. Complete the problem.
CHAPTER 16
D1 Let $f \in \operatorname{Aut}(G)$; that is, let f be an isomorphism from G onto G. We shall prove that $f^{-1} \in \operatorname{Aut}(G)$; that is, $f^{-1}$ is an isomorphism from G onto G. To begin with, it follows from the last paragraph of Chapter 6 that $f^{-1}$ is a bijective function from G onto G. It remains to show that $f^{-1}$ is a homomorphism. Let $f^{-1}(c) = a$ and $f^{-1}(d) = b$, so that c = f(a) and d = f(b). Then cd = f(a)f(b) = f(ab), whence $f^{-1}(cd) = ab$. Thus,
$$f^{-1}(cd) = ab = f^{-1}(c)f^{-1}(d)$$
which shows that $f^{-1}$ is a homomorphism.
F2 If $a, b \in HK$, then $a = h_1k_1$ and $b = h_2k_2$, where $h_1, h_2 \in H$ and $k_1, k_2 \in K$. Then $ab = h_1k_1h_2k_2 = h_1(k_1h_2k_1^{-1})k_1k_2$.
G3 Note that the range of h is a group of functions. What is its identity element?
HI From calculus, cos(x + y) = cosx cos y - sin x sin y, and sin (x + y) = sin x cos y + cos x sin y.
L4 The natural homomorphism (Theorem 4, Chapter 15) is a homomorphism $f : G \to G/\langle a \rangle$ with kernel $\langle a \rangle$. Let S be the normal subgroup of $G/\langle a \rangle$ whose order is $p^{m-1}$. (The existence of S is assured by part 3 of this exercise set.) Referring to Exercise J, show that $S^*$ is a normal subgroup of G, and that the order of $S^*$ is $p^m$.
CHAPTER 17
A3 We prove that $\odot$ is associative:
$(a, b) \odot [(c, d) \odot (p, q)] = (a, b) \odot (cp - dq,\ cq + dp)$
$= (acp - adq - bcq - bdp,\ acq + adp + bcp - bdq)$
$[(a, b) \odot (c, d)] \odot (p, q) = (ac - bd,\ ad + bc) \odot (p, q)$
$= (acp - bdp - adq - bcq,\ acq - bdq + adp + bcp)$
Thus, $(a, b) \odot [(c, d) \odot (p, q)] = [(a, b) \odot (c, d)] \odot (p, q)$.
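The algebra above can be spot-checked on random integer pairs. In this sketch the operation is taken to be $(a, b) \odot (c, d) = (ac - bd,\ ad + bc)$, as used in the computation above; the function names, sample range, and seed are illustrative choices.

```python
import random


def odot(u, v):
    """The product (a, b) ⊙ (c, d) = (ac - bd, ad + bc)."""
    a, b = u
    c, d = v
    return (a * c - b * d, a * d + b * c)


def associative_on_samples(trials: int = 1000, seed: int = 0) -> bool:
    """Compare both groupings of a triple product on random integer pairs."""
    rng = random.Random(seed)
    for _ in range(trials):
        x, y, z = [(rng.randint(-20, 20), rng.randint(-20, 20))
                   for _ in range(3)]
        if odot(x, odot(y, z)) != odot(odot(x, y), z):
            return False
    return True


print(odot((1, 2), (3, 4)))        # (-5, 10)
print(associative_on_samples())
```

Random testing only samples the identity; the term-by-term expansion above is what proves it.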
B2 A nonzero function f is a divisor of zero if there exists some nonzero function g such that fg = 0, where 0 is the zero function (page 46). The equation fg = 0 means that f(x)g(x) = 0(x) for every $x \in \mathbb{R}$. Very precisely, what functions f have this property?
D1 For the distributive law, refer to the diagram on page 30, and show that
$A \cap (B + C) = (A \cap B) + (A \cap C)$:
B + C consists of the regions 2, 3, 4, and 7; $A \cap (B + C)$ consists of the regions 2 and 4. Now complete the problem.
E3
",)(; °)-("o' -,)-«?)
G4 (a, b) is an invertible element of $A \times B$ iff there is an ordered pair (c, d) in $A \times B$ satisfying $(a, b) \cdot (c, d) = (1, 1)$. Now complete.
H6 If A is a ring, then, as we have seen, A with addition as the only operation is an abelian group: this group is called the additive group of the ring A. Now, suppose the additive group of A is a cyclic group, and its generator is c. If a and b are any two elements of A, then
$a = c + c + \cdots + c$ (m terms)
and
$b = c + c + \cdots + c$ (n terms)
for some positive integers m and n.
J2 If ab is a divisor of 0, this means that $ab \neq 0$ and there is some $x \neq 0$ such that abx = 0. Moreover, $a \neq 0$ and $b \neq 0$, for otherwise ab = 0.
M3 Suppose $a^m = 0$ and $b^n = 0$. Show that $(a + b)^{m+n} = 0$. Explain why, in every term of the binomial expansion of $(a + b)^{m+n}$, either a is raised to a power $\geq m$, or b is raised to a power $\geq n$.
CHAPTER 18
A4 From calculus, the sum and product of continuous functions are continuous.
B3 The proof hinges on the fact that if k and a are any two elements of $\mathbb{Z}_n$, then
$ka = a + a + \cdots + a$ (k terms)
C4 If the cancellation law holds in A, it must hold in B. (Why?) Why is it necessary to include the condition that B contains 1?
C5 Let B be a subring of a field F. If b is any nonzero element of B, then $b^{-1}$ is in F, though not necessarily in B. (Why?) Complete the argument.
E5 f(x,y)f(U,V)-(;;)(; J)-
f[x, y)Q(u, v)) =
Complete the problem.
H3 If $a^n \in J$ and $b^m \in J$, show that $(a + b)^{n+m} \in J$. (See the solution of Exercise M3, Chapter 17.) Complete the solution.
CHAPTER 19
E1 To say that the coset J + x has a square root is to say that for some element y in A, $J + x = (J + y)(J + y) = J + y^2$.
E6 A unity element of A/J is a coset J + a such that for any $x \in A$,
$(J + a)(J + x) = J + x$ and $(J + x)(J + a) = J + x$
G1 To say that $a \notin J$ is equivalent to saying that $J + a \neq J$; that is, J + a is not the zero element of A/J. Explain and complete.
CHAPTER 20
E5 Restrict your attention to A with addition alone, and see Chapter 13, Theorem 4.
E6 For n = 2 you have
$(a + b)^{p^2} = [(a + b)^p]^p = [a^p + b^p]^p$   by Theorem 3
$= (a^p)^p + (b^p)^p$   by Theorem 3
$= a^{p^2} + b^{p^2}$
Prove the required formula by reasoning similarly and using induction: assume the formula is true for n = k, and prove for n = k + 1.
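The formula can be observed in the simplest setting of characteristic p, namely the integers modulo a prime p. This is only one instance (and there the identity also follows from Fermat's little theorem), but the sketch shows how the identity fails when the modulus is not prime. The function name is illustrative; it relies on Python's built-in three-argument `pow` for modular exponentiation.

```python
def power_sum_identity_holds(p: int, n: int) -> bool:
    """Check (a + b)^(p^n) == a^(p^n) + b^(p^n) for all a, b modulo p."""
    q = p ** n
    return all(
        pow(a + b, q, p) == (pow(a, q, p) + pow(b, q, p)) % p
        for a in range(p)
        for b in range(p)
    )


print(power_sum_identity_holds(5, 2))   # prime modulus
print(power_sum_identity_holds(7, 3))   # prime modulus
print(power_sum_identity_holds(4, 1))   # 4 is not prime
```

With modulus 4 the check fails already at a = b = 1, since $(1 + 1)^4 = 16 \equiv 0$ while $1^4 + 1^4 = 2$.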
CHAPTER 21
B5 Use the product (a - l)(b - 1).
C8 In the induction step, you assume the formula is true for n = k and prove it is true for n = k + 1. That is, you assume
$$F_{k+1}F_{k+2} - F_kF_{k+3} = (-1)^k$$
and prove
$$F_{k+2}F_{k+3} - F_{k+1}F_{k+4} = (-1)^{k+1}$$
Recall that by the definition of the Fibonacci sequence, $F_{n+2} = F_{n+1} + F_n$ for every n > 2. Thus, $F_{k+2} = F_{k+1} + F_k$ and $F_{k+4} = F_{k+3} + F_{k+2}$. Substitute these in the second of the equations above.
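An identity of this shape is easy to test before proving it. The sketch below assumes the usual starting values $F_1 = F_2 = 1$ (the starting values are not restated in this answer, so that is an assumption) and checks $F_{n+1}F_{n+2} - F_nF_{n+3} = (-1)^n$ for small n; the function names are illustrative.

```python
def fib(n: int) -> int:
    """n-th Fibonacci number, assuming F_1 = F_2 = 1."""
    a, b = 1, 1
    for _ in range(n - 1):
        a, b = b, a + b
    return a


def identity_holds(n: int) -> bool:
    """Check F_{n+1} F_{n+2} - F_n F_{n+3} == (-1)^n."""
    return fib(n + 1) * fib(n + 2) - fib(n) * fib(n + 3) == (-1) ** n


print([fib(n) for n in range(1, 9)])            # [1, 1, 2, 3, 5, 8, 13, 21]
print(all(identity_holds(n) for n in range(1, 50)))
```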
E5 An elegant way to prove this inequality is by use of part 4, with a + b in the place of a, and |a| + |b| in the place of b.
E8 This can be proved easily using part 5.
F2 You are given that $m = nq + r$ and $q = kq_1 + r_1$, where $0 \leq r < n$ and $0 \leq r_1 < k$. (Explain.) Thus,
$$m = n(kq_1 + r_1) + r = (nk)q_1 + (nr_1 + r)$$
You must show that $nr_1 + r < nk$. (Why?) Begin by noting that $k - r_1 > 0$; hence $k - r_1 \geq 1$, so $n(k - r_1) \geq n$.
G5 For the induction step, assume $k \cdot a = (k \cdot 1)a$, and prove $(k + 1) \cdot a = [(k + 1) \cdot 1]a$. From (ii) in this exercise, $(k + 1) \cdot 1 = k \cdot 1 + 1$.
CHAPTER 22
B1 Assume a > 0 and $a \mid b$. To solve the problem, show that a is the greatest common divisor of a and b. First, a is a common divisor of a and b: $a \mid a$ and $a \mid b$. Next, suppose t is any common divisor of a and b: $t \mid a$ and $t \mid b$. Explain why you are now done.
D3 From the proof of Theorem 3, d is the generator of the ideal consisting of all the linear combinations of a and b.
E1 Suppose a is odd and b is even. Then a + b is odd and a - b is odd. Moreover, if t is any common divisor of a - b and a + b, then t is odd. (Why?) Note also that if t is a common divisor of a - b and a + b, then t divides the sum and difference of a + b and a - b.
F3 If $l = \operatorname{lcm}(ab, ac)$, then l = abx = acy for some integers x and y. From these equations you can see that a is a factor of l, say l = am. Thus, the equations become am = abx = acy. Cancel a, then show that $m = \operatorname{lcm}(b, c)$.
G8 Look at the proof of Theorem 3.
CHAPTER 23
A4(f) $3x^2 - 6x + 6 = 3(x^2 - 2x + 1) + 3 = 3(x - 1)^2 + 3$. Thus, we are to solve the congruence $3(x - 1)^2 \equiv -3 \pmod{15}$. We begin by solving $3y \equiv -3 \pmod{15}$, then we will set $y = (x - 1)^2$.
We note first that by Condition (6), in a congruence $ax \equiv b \pmod{n}$, if the three numbers a, b, and n have a common factor d, then
$$ax \equiv b \pmod{n} \quad \text{is equivalent to} \quad \frac{a}{d}x \equiv \frac{b}{d} \pmod{\frac{n}{d}}$$
(That is, all three numbers a, b, and n may be divided by the common factor d.) Applying this observation to our congruence $3y \equiv -3 \pmod{15}$, we get
$$3y \equiv -3 \pmod{15} \quad \text{is equivalent to} \quad y \equiv -1 \pmod{5}$$
This is the same as $y \equiv 4 \pmod{5}$, because in $\mathbb{Z}_5$ the negative of 1 is 4. Thus, our quadratic congruence is equivalent to
$$(x - 1)^2 \equiv 4 \pmod{5}$$
In $\mathbb{Z}_5$, $2^2 = 4$ and $3^2 = 4$; hence the solutions are $x - 1 \equiv 2 \pmod{5}$ and $x - 1 \equiv 3 \pmod{5}$, or finally,
$$x \equiv 3 \pmod{5} \quad \text{and} \quad x \equiv 4 \pmod{5}$$
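The result can be confirmed by brute force over one full set of residues. This sketch assumes the original congruence is $3x^2 - 6x + 6 \equiv 0 \pmod{15}$, which is what the rewritten form $3(x - 1)^2 \equiv -3 \pmod{15}$ above implies; the function name is illustrative.

```python
def solve_congruence(modulus: int = 15) -> list:
    """All x in 0..modulus-1 with 3x^2 - 6x + 6 ≡ 0 (mod modulus)."""
    return [x for x in range(modulus)
            if (3 * x * x - 6 * x + 6) % modulus == 0]


solutions = solve_congruence()
print(solutions)                              # [3, 4, 8, 9, 13, 14]
print(sorted({x % 5 for x in solutions}))     # [3, 4]
```

Reducing the six solutions modulo 5 leaves exactly the two residue classes found above.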
A6(d) We begin by finding all the solutions of 30z + 24y = 18, then set $z = x^2$. Now,
$$30z + 24y = 18 \quad \text{iff} \quad 24y = 18 - 30z \quad \text{iff} \quad 30z \equiv 18 \pmod{24}$$
By comments in the previous solution, this is equivalent to $5z \equiv 3 \pmod{4}$. But in $\mathbb{Z}_4$, 5 = 1; hence $5z \equiv 3 \pmod{4}$ is the same as $z \equiv 3 \pmod{4}$.
Now set $z = x^2$. Then the solution is $x^2 \equiv 3 \pmod{4}$. But this last congruence has no solution, because in $\mathbb{Z}_4$, 3 is not the square of any number. Thus, the Diophantine equation has no solutions.
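Both halves of this argument can be checked by direct computation. The sketch below assumes the equation under study is $30x^2 + 24y = 18$, as the substitution $z = x^2$ above indicates; the function names and the search range are illustrative.

```python
def squares_mod(m: int) -> list:
    """The set of squares in Z_m, as a sorted list."""
    return sorted({(x * x) % m for x in range(m)})


def y_exists_for(x: int) -> bool:
    """True if 30x^2 + 24y = 18 has an integer solution y for this x."""
    return (18 - 30 * x * x) % 24 == 0


print(squares_mod(4))                                  # [0, 1]
print(any(y_exists_for(x) for x in range(-100, 101)))  # False
```

The first line shows that 3 is not a square in $\mathbb{Z}_4$; the second finds no admissible x in the range searched, in agreement with the conclusion above.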
B3 Here is the idea: By Theorem 3, the first two equations have a simultaneous solution; by Theorem 4, it is of the form $x \equiv c \pmod{t}$, where $t = \operatorname{lcm}(m_1, m_2)$. To solve the latter simultaneously with $x \equiv c_3 \pmod{m_3}$, you will need to know that $c_3 \equiv c \pmod{\gcd(t, m_3)}$. But $\gcd(t, m_3) = \operatorname{lcm}(d_{13}, d_{23})$. (Explain this carefully, using the result of Exercise H4 in Chapter 22.) From Theorem 4 (especially the last paragraph of its proof), it is clear that since $c_3 \equiv c_1 \pmod{d_{13}}$ and $c_3 \equiv c_2 \pmod{d_{23}}$, therefore $c_3 \equiv c$ [mod lcm( [...]
[...] $\alpha\omega$, $\alpha\omega^2$, $\alpha\omega^3$, $\alpha\omega^4$, and $\alpha\omega^5$. Any automorphism of $\mathbb{Q}(\alpha, \sqrt{3}i)$ fixing $\mathbb{Q}$ maps sixth roots of 2 to sixth roots of 2, at the same time mapping $\sqrt{3}i$ to $\pm\sqrt{3}i$ (and hence mapping $\omega$ to