394 Chapter 7 Inferences Based on the Normal Distribution

Questions

7.3.1. Show directly, without appealing to the fact that $\chi^2_n$ is a gamma random variable, that $f_U(u)$ as stated in Definition 7.3.1 is a true probability density function.

7.3.2. Find the moment-generating function for a chi square random variable and use it to show that $E(\chi^2_n) = n$ and $\text{Var}(\chi^2_n) = 2n$.

7.3.3. Is it believable that the numbers 65, 30, and 55 are a random sample of size 3 from a normal distribution with $\mu = 50$ and $\sigma = 10$? Answer the question by using a chi square distribution. [Hint: Let $Z_i = (Y_i - 50)/10$ and use Theorem 7.3.1.]

7.3.4. Use the fact that $(n-1)S^2/\sigma^2$ is a chi square random variable with $n - 1$ df to prove that
$$\text{Var}(S^2) = \frac{2\sigma^4}{n-1}$$
(Hint: Use the fact that the variance of a chi square random variable with $k$ df is $2k$.)

7.3.5. Let $Y_1, Y_2, \ldots, Y_n$ be a random sample from a normal distribution. Use the statement of Question 7.3.4 to prove that $S^2$ is consistent for $\sigma^2$.

7.3.6. If $Y$ is a chi square random variable with $n$ degrees of freedom, the pdf of $(Y - n)/\sqrt{2n}$ converges to $f_Z(z)$ as $n$ goes to infinity (recall Question 7.3.2). Use the asymptotic normality of $(Y - n)/\sqrt{2n}$ to approximate the fortieth percentile of a chi square random variable with 200 degrees of freedom.

7.3.7. Use Appendix Table A.4 to find
(a) $F_{.50,6,7}$
(b) $F_{.001,15,5}$
(c) $F_{.90,2,2}$

7.3.8. Let $V$ and $U$ be independent chi square random variables with 7 and 9 degrees of freedom, respectively. Is it more likely that $\frac{V/7}{U/9}$ will be between (1) 2.51 and 3.29 or (2) 3.29 and 4.20?

7.3.9. Use Appendix Table A.4 to find the values of $x$ that satisfy the following equations: (a) P(0.109 5.35) = 0.01 (d) P(0A15 oo, the pdf of a Student t random variable with $n$ df converges to $f_Z(z)$. (Hint: To show that the constant term in the pdf for $T_n$ converges to $1/\sqrt{2\pi}$, use Stirling's formula, $n! \doteq \sqrt{2\pi n}\, n^n e^{-n}$.) Also, recall that $\lim_{n \to \infty} \left(1 + \frac{a}{n}\right)^n = e^a$.

7.3.14. Evaluate the integral
$$\int_0^\infty \frac{1}{1 + x^2}\, dx$$
using the Student t distribution.

7.3.15.
For a Student t random variable $Y$ with $n$ degrees of freedom and any positive integer $k$, show that $E(Y^{2k})$ exists if $2k < n$. (Hint: Integrals of the form
$$\int_0^\infty \frac{1}{(1 + y^{\alpha})^{\beta}}\, dy$$
are finite if $\alpha > 0$, $\beta > 0$, and $\alpha\beta > 1$.)

7.4 Drawing Inferences About $\mu$

One of the most common of all statistical objectives is to draw inferences about the mean of the population being represented by a set of data. Indeed, we already took a first look at that problem in Section 6.2. If the $Y_i$'s come from a normal distribution where $\sigma$ is known, the null hypothesis $H_0: \mu = \mu_0$ can be tested by calculating a Z ratio, $\frac{\bar{Y} - \mu_0}{\sigma/\sqrt{n}}$ (recall Theorem 6.2.1).

Implicit in that solution, though, is an assumption not likely to be satisfied: rarely does the experimenter actually know the value of $\sigma$. Section 7.3 dealt with precisely that scenario and derived the pdf of the ratio $T_{n-1} = \frac{\bar{Y} - \mu}{S/\sqrt{n}}$, where $\sigma$ has been replaced by an estimator, $S$. Given $T_{n-1}$ (which we learned has a Student t distribution with $n - 1$ degrees of freedom), we now have the tools necessary to draw inferences about $\mu$ in the all-important case where $\sigma$ is not known. Section 7.4 illustrates these various techniques and also examines the key assumption underlying the "t test" and looks at what happens when that assumption is not satisfied.

t Tables

We have already seen that doing hypothesis tests and constructing confidence intervals using $\frac{\bar{Y} - \mu}{\sigma/\sqrt{n}}$ or some other Z ratio requires that we know certain upper and/or lower percentiles from the standard normal distribution. There will be a similar need to identify appropriate "cutoffs" from Student t distributions when the inference procedure is based on $\frac{\bar{Y} - \mu}{S/\sqrt{n}}$, or some other t ratio.

Figure 7.4.1 shows a portion of the t table that appears in the back of every statistics book. Each row corresponds to a different Student t pdf. The column headings give the area to the right of the number appearing in the body of the table.
Figure 7.4.1

                              α
 df    .20     .15     .10     .05      .025     .01      .005
  1    1.376   1.963   3.078   6.3138   12.706   31.821   63.657
  2    1.061   1.386   1.886   2.9200   4.3027   6.965    9.9248
  3    0.978   1.250   1.638   2.3534   3.1825   4.541    5.8409
  4    0.941   1.190   1.533   2.1318   2.7764   3.747    4.6041
  5    0.920   1.156   1.476   2.0150   2.5706   3.365    4.0321
  6    0.906   1.134   1.440   1.9432   2.4469   3.143    3.7074
 30    0.854   1.055   1.310   1.6973   2.0423   2.457    2.7500
  ∞    0.84    1.04    1.28    1.64     1.96     2.33     2.58

For example, the entry 4.541 listed in the $\alpha = .01$ column and the df = 3 row has the property that $P(T_3 > 4.541) = 0.01$.

More generally, we will use the symbol $t_{\alpha,n}$ to denote the $100(1 - \alpha)$th percentile of $f_{T_n}(t)$. That is, $P(T_n > t_{\alpha,n}) = \alpha$ (see Figure 7.4.2). No lower percentiles of Student t curves need to be tabulated because the symmetry of $f_{T_n}(t)$ implies that $P(T_n < -t_{\alpha,n}) = \alpha$.

The number of different Student t pdfs summarized in a t table varies considerably. Many tables will provide cutoffs for degrees of freedom ranging only from 1 to 30; others will include df values from 1 to 50, or even from 1 to 100. The last row in any t table, though, is always labeled "∞": Those entries, of course, correspond to $z_\alpha$.

Figure 7.4.2 [graph of $f_{T_n}(t)$; the shaded area to the right of $t_{\alpha,n}$ equals $\alpha = P(T_n > t_{\alpha,n})$]

Constructing a Confidence Interval for $\mu$

The fact that $\frac{\bar{Y} - \mu}{S/\sqrt{n}}$ has a Student t distribution with $n - 1$ degrees of freedom justifies the statement that
$$P\left(-t_{\alpha/2,n-1} \le \frac{\bar{Y} - \mu}{S/\sqrt{n}} \le t_{\alpha/2,n-1}\right) = 1 - \alpha$$
or, equivalently, that
$$P\left(\bar{Y} - t_{\alpha/2,n-1} \cdot \frac{S}{\sqrt{n}} \le \mu \le \bar{Y} + t_{\alpha/2,n-1} \cdot \frac{S}{\sqrt{n}}\right) = 1 - \alpha$$

1.134) (b) $P(T_{15} < 0.866)$ (c) $P(T_3 > -1.250)$ (d) P(-1.055x) = 0.85 (c) P(T26x)= 0.025

7.4.3. Which of the following differences is larger? Explain.

7.4.4. A random sample of size $n = 9$ is drawn from a normal distribution with $\mu = 27.6$. Within what interval $(-a, +a)$ can we expect to find $\frac{\bar{Y} - 27.6}{S/\sqrt{9}}$ 80% of the time? 90% of the time?

7.4.5. Suppose a random sample of size $n = 11$ is drawn from a normal distribution with $\mu = 15.0$. For what value of $k$ is the following true? 15.0 >k = 0.05

7.4.6.
Let $\bar{Y}$ and $S$ denote the sample mean and sample standard deviation, respectively, based on a set of $n = 20$ measurements taken from a normal distribution with $\mu = 90.6$. Find the function $k(S)$ for which P[90.6 - k(S) ,2

(a) Find the 95% confidence interval for the mean monthly precipitation.
(b) The table below gives a frequency distribution for the Dismal Swamp precipitation data. Does this distribution raise questions about using Theorem 7.4.1?

 Rainfall in inches   Frequency
 0-1                  85
 1-2                  38
 2-3                  35
 3-4                  41
 4-5                  28
 5-6                  24
 6-7                  18
 7-8                  16
 8-9                  16
 9-10                  5
 10-11                 9
 11-12                21

Source: www.wcc.nrcs.usda.gov.

Testing $H_0: \mu = \mu_0$ (The One-Sample t Test)

Suppose a normally distributed random sample of size $n$ is observed for the purpose of testing the null hypothesis that $\mu = \mu_0$. If $\sigma$ is unknown, which is usually the case, the procedure we use is called a one-sample t test. Conceptually, the latter is much like the Z test of Theorem 6.2.1, except that the decision rule is defined in terms of $t = \frac{\bar{y} - \mu_0}{s/\sqrt{n}}$ rather than $z = \frac{\bar{y} - \mu_0}{\sigma/\sqrt{n}}$ [which requires that the critical values come from $f_{T_{n-1}}(t)$ rather than $f_Z(z)$].

Theorem 7.4.2 Let $y_1, y_2, \ldots, y_n$ be a random sample of size $n$ from a normal distribution where $\sigma$ is unknown. Let $t = \frac{\bar{y} - \mu_0}{s/\sqrt{n}}$.
a. To test $H_0: \mu = \mu_0$ versus $H_1: \mu > \mu_0$ at the $\alpha$ level of significance, reject $H_0$ if $t \ge t_{\alpha,n-1}$.
b. To test $H_0: \mu = \mu_0$ versus $H_1: \mu < \mu_0$ at the $\alpha$ level of significance, reject $H_0$ if $t \le -t_{\alpha,n-1}$.
c. To test $H_0: \mu = \mu_0$ versus $H_1: \mu \ne \mu_0$ at the $\alpha$ level of significance, reject $H_0$ if $t$ is either (1) $\le -t_{\alpha/2,n-1}$ or (2) $\ge t_{\alpha/2,n-1}$.

Proof Appendix 7.A.3 gives the complete derivation that justifies using the procedure described in Theorem 7.4.2. In short, the test statistic $t = \frac{\bar{y} - \mu_0}{s/\sqrt{n}}$ is a monotonic function of the $\lambda$ that appears in Definition 6.5.2, which makes the one-sample t test a GLRT. □

Case Study 7.4.2

Not all rectangles are created equal.
Since antiquity, societies have expressed aesthetic preferences for rectangles having certain width ($w$) to length ($l$) ratios. One "standard" calls for the width-to-length ratio to be equal to the ratio of the length to the sum of the width and the length. That is,
$$\frac{w}{l} = \frac{l}{w + l} \qquad (7.4.2)$$
Equation 7.4.2 implies that the width is $\frac{1}{2}(\sqrt{5} - 1)$, or approximately 0.618, times as long as the length. The Greeks called this the golden rectangle and used it often in their architecture (see Figure 7.4.4). Many other cultures were similarly inclined. The Egyptians, for example, built their pyramids out of stones whose faces were golden rectangles. Today in our society, the golden rectangle remains an architectural and artistic standard, and even items such as driver's licenses, business cards, and picture frames often have $w/l$ ratios close to 0.618.

Figure 7.4.4 A golden rectangle ($\frac{w}{l} = \frac{l}{w + l}$)

The fact that many societies have embraced the golden rectangle as an aesthetic standard has two possible explanations. One, they "learned" to like it because of the profound influence that Greek writers, philosophers, and artists have had on cultures all over the world. Or two, there is something unique about human perception that predisposes a preference for the golden rectangle.

Researchers in the field of experimental aesthetics have tried to test the plausibility of those two hypotheses by seeing whether the golden rectangle is accorded any special status by societies that had no contact whatsoever with the Greeks or with their legacy. One such study (37) examined the $w/l$ ratios of beaded rectangles sewn by the Shoshoni Indians as decorations on their blankets and clothes. Table 7.4.2 lists the ratios found for twenty such rectangles.

If, indeed, the Shoshonis also had a preference for golden rectangles, we would expect their ratios to be "close" to 0.618.
The average value of the entries in Table 7.4.2, though, is 0.661. What does that imply? Is 0.661 close enough to 0.618 to support the position that liking the golden rectangle is a human characteristic, or is 0.661 so far from 0.618 that the only prudent conclusion is that the Shoshonis did not agree with the aesthetics espoused by the Greeks?

Table 7.4.2 Width-to-Length Ratios of Shoshoni Rectangles

 0.693   0.749   0.654   0.670
 0.662   0.672   0.615   0.606
 0.690   0.628   0.668   0.611
 0.606   0.609   0.601   0.553
 0.570   0.844   0.576   0.933

Let $\mu$ denote the true average width-to-length ratio of Shoshoni rectangles. The hypotheses to be tested are
$$H_0: \mu = 0.618 \quad \text{versus} \quad H_1: \mu \ne 0.618$$
For tests of this nature, the value of $\alpha = 0.05$ is often used. For that value of $\alpha$ and a two-sided test, the critical values, using part (c) of Theorem 7.4.2 and Appendix Table A.2, are $t_{.025,19} = 2.0930$ and $-t_{.025,19} = -2.0930$.

The data in Table 7.4.2 have $\bar{y} = 0.661$ and $s = 0.093$. Substituting these values into the t ratio gives a test statistic that lies just inside of the interval between -2.0930 and 2.0930:
$$t = \frac{\bar{y} - \mu_0}{s/\sqrt{n}} = \frac{0.661 - 0.618}{0.093/\sqrt{20}} = 2.068$$
Thus, these data do not rule out the possibility that the Shoshoni Indians also embraced the golden rectangle as an aesthetic standard.

About the Data Like $\pi$ and $e$, the ratio $w/l$ for golden rectangles (more commonly referred to as either phi or the golden ratio) is an irrational number with all sorts of fascinating properties and connections. Algebraically, the solution of the equation
$$\frac{w}{l} = \frac{l}{w + l}$$
is the continued fraction
$$\frac{w}{l} = \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{1 + \cdots}}}$$
Among the curiosities associated with phi is its relationship with the Fibonacci series. The latter, of course, is the famous sequence in which each term is the sum of its two predecessors, that is,

 1  1  2  3  5  8  13  21  34  55  89  ...

Example 7.4.2

Three banks serve a metropolitan area's inner-city neighborhoods: Federal Trust, American United, and Third Union.
The state banking commission is concerned that loan applications from inner-city residents are not being accorded the same consideration that comparable requests have received from individuals in rural areas. Both constituencies claim to have anecdotal evidence suggesting that the other group is being given preferential treatment.

Records show that last year these three banks approved 62% of all the home mortgage applications filed by rural residents. Listed in Table 7.4.3 are the approval rates posted over that same period by the twelve branch offices of Federal Trust (FT), American United (AU), and Third Union (TU) that work primarily with the inner-city community. Do these figures lend any credence to the contention that the banks are treating inner-city residents and rural residents differently? Analyze the data using an $\alpha = 0.05$ level of significance.

Table 7.4.3

 Bank   Location              Affiliation   Percent Approved
  1     3rd & Morgan          AU            59
  2     Jefferson Pike        TU            65
  3     East 150th & Clark    TU            69
  4     Midway Mall           FT            53
  5     N. Charter Highway    FT            60
  6     Lewis & Abbot         AU            53
  7     West 10th & Lorain    FT            58
  8     Highway 70            FT            64
  9     Parkway Northwest     AU            46
 10     Lanier & Tower        TU            67
 11     King & Tara Court     AU            51
 12     Bluedot Corners       FT            59

As a starting point, we might want to test
$$H_0: \mu = 62 \quad \text{versus} \quad H_1: \mu \ne 62$$
where $\mu$ is the true average approval rate for all inner-city banks. Table 7.4.4 summarizes the analysis. The two critical values are $\pm t_{.025,11} = \pm 2.2010$, and the observed t ratio is $-1.66 \left(= \frac{58.667 - 62}{6.946/\sqrt{12}}\right)$, so our decision is "Fail to reject $H_0$."

Table 7.4.4

 Banks   n    ȳ        s       t Ratio   Critical Value   Reject H0?
 All     12   58.667   6.946   -1.66     ±2.2010          No

About the Data The "overall" analysis of Table 7.4.4, though, may be too simplistic. Common sense would tell us to look also at the three banks separately. What emerges, then, is an entirely different picture (see Table 7.4.5).
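The t ratios for the pooled analysis and for each bank separately can be checked directly from the approval rates in Table 7.4.3. The following Python sketch is one way to do that. The function name `t_ratio` and the dictionary layout are illustrative choices, not part of the text, and the critical values are copied from the Student t table rather than computed.

```python
from math import sqrt
from statistics import mean, stdev

# Percent approved, grouped by affiliation (values from Table 7.4.3).
rates = {
    "American United": [59, 53, 46, 51],
    "Federal Trust": [53, 60, 58, 64, 59],
    "Third Union": [65, 69, 67],
}

# Two-sided cutoffs t_{.025, df}, copied from a t table (not computed here).
critical = {2: 4.3027, 3: 3.1825, 4: 2.7764, 11: 2.2010}

MU_0 = 62  # approval rate (%) for rural applicants


def t_ratio(y, mu0):
    """One-sample t ratio: (ybar - mu0) / (s / sqrt(n))."""
    return (mean(y) - mu0) / (stdev(y) / sqrt(len(y)))


# Pooled sample first, then each bank on its own.
groups = {"All": [r for ys in rates.values() for r in ys], **rates}

results = {}
for name, y in groups.items():
    t = t_ratio(y, MU_0)
    reject = abs(t) >= critical[len(y) - 1]  # two-sided rule, alpha = 0.05
    results[name] = (t, reject)
    print(f"{name:16s} n = {len(y):2d}  t = {t:6.2f}  reject H0: {reject}")
```

Pooling the twelve branches hides the fact that two of the banks deviate from 62% in opposite directions, which is why the per-bank decisions can differ from the pooled one.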
Now we can see why both groups felt discriminated against: American United ($t = -3.63$) and Third Union ($t = +4.33$) each had rates that differed significantly from 62%, but in opposite directions! Only Federal Trust seems to be dealing with inner-city residents and rural residents in an even-handed way.

Table 7.4.5

 Banks             n   ȳ       s      t Ratio   Critical Value   Reject H0?
 American United   4   52.25   5.38   -3.63     ±3.1825          Yes
 Federal Trust     5   58.80   3.96   -1.81     ±2.7764          No
 Third Union       3   67.00   2.00   +4.33     ±4.3027          Yes

Questions

7.4.17. Recall the Bacillus subtilis data in Question 5.3.2. Test the null hypothesis that exposure to the enzyme does not affect a worker's respiratory capacity (as measured by the FEV$_1$/VC ratio). Use a one-sided $H_1$ and let $\alpha = 0.05$. Assume that $\sigma$ is not known.

7.4.18. Recall Case Study 5.3.1. Assess the credibility of the theory that Etruscans were native Italians by testing an appropriate $H_0$ against a two-sided $H_1$. Set $\alpha$ equal to 0.05. Use 143.8 mm and 6.0 mm for $\bar{y}$ and $s$, respectively, and let $\mu_0 = 132.4$. Do these data appear to satisfy the distribution assumption made by the t test? Explain.

7.4.19. MBAs R Us advertises that its program increases a person's score on the GMAT by an average of forty points. As a way of checking the validity of that claim, a consumer watchdog group hired fifteen students to take both the review course and the GMAT. Prior to starting the course, the fifteen students were given a diagnostic test that predicted how well they would do on the GMAT in the absence of any special training. The following table gives each student's actual GMAT score minus his or her predicted score. Set up and carry out an appropriate hypothesis test. Use the 0.05 level of significance.

 Subject   y_i = act. GMAT - pre. GMAT   y_i^2
 SA        35                            1225
 LG        37                            1369
 SH        33                            1089
 KN        34                            1156
 DF        38                            1444
 SH        40                            1600
 ML        35                            1225
 JG        36                            1296
 KH        38                            1444
 HS        33                            1089
 LL        28                             784
 CE        34                            1156
 KK        47                            2209
 CW        42                            1764
 DP        46                            2116

7.4.20.
In addition to the Shoshoni data of Case Study 7.4.2, a set of rectangles that might tend to the golden ratio are national flags. The table below gives the width-to-length ratios for a random sample of the flags of thirty-four countries. Let $\mu$ be the width-to-length ratio for national flags. At the $\alpha = 0.01$ level, test $H_0: \mu = 0.618$ versus $H_1: \mu \ne 0.618$.

 Country        Ratio Width to Height   Country          Ratio Width to Height
 Afghanistan    0.500                   Iceland          0.720
 Albania        0.714                   Iran             0.571
 Algeria        0.667                   Israel           0.727
 Angola         0.667                   Laos             0.667
 Argentina      0.667                   Lebanon          0.667
 Bahamas        0.500                   Liberia          0.526
 Denmark        0.757                   Macedonia        0.500
 Djibouti       0.553                   Mexico           0.571
 Ecuador        0.500                   Monaco           0.800
 Egypt          0.667                   Namibia          0.667
 El Salvador    0.600                   Nepal            1.250
 Estonia        0.667                   Romania          0.667
 Ethiopia       0.500                   Rwanda           0.667
 Gabon          0.750                   South Africa     0.667
 Fiji           0.500                   St. Helena       0.500
 France         0.667                   Sweden           0.625
 Honduras       0.500                   United Kingdom   0.500

Source: http://www.anyflag.com/country/costaric.php.

7.4.21. A manufacturer of pipe for laying underground electrical cables is concerned about the pipe's rate of corrosion and whether a special coating may retard that rate. As a way of measuring corrosion, the manufacturer examines a short length of pipe and records the depth of the maximum pit. The manufacturer's tests have shown that in a year's time in the particular kind of soil the manufacturer must deal with, the average depth of the maximum pit in a foot of pipe is 0.0042 inch. To see whether that average can be reduced, ten pipes are coated with a new plastic and buried in the same soil. After one year, the following maximum pit depths are recorded (in inches): 0.0039, 0.0041, 0.0038, 0.0044, 0.0040, 0.0036, 0.0034, 0.0036, 0.0046, and 0.0036. Given that the sample standard deviation for these ten measurements is 0.000383 inch, can it be concluded at the $\alpha = 0.05$ level of significance that the plastic coating is beneficial?

7.4.22.
The first analysis done in Example 7.4.2 (using all $n = 12$ banks with $\bar{y} = 58.667$) failed to reject $H_0: \mu = 62$ at the $\alpha = 0.05$ level. Had $\mu_0$ been, say, 61.7 or 58.6, the same conclusion would have been reached. What do we call the entire set of $\mu_0$'s for which $H_0: \mu = \mu_0$ would not be rejected at the $\alpha = 0.05$ level?

Testing $H_0: \mu = \mu_0$ When the Normality Assumption Is Not Met

Every t test makes the same explicit assumption, namely, that the set of $n$ $y_i$'s is normally distributed. But suppose the normality assumption is not true. What are the consequences? Is the validity of the t test compromised?

Figure 7.4.5 addresses the first question. We know that if the normality assumption is true, the pdf describing the variation of the t ratio, $\frac{\bar{Y} - \mu_0}{S/\sqrt{n}}$, is $f_{T_{n-1}}(t)$. The latter, of course, provides the decision rule's critical values. If $H_0: \mu = \mu_0$ is to be tested against $H_1: \mu \ne \mu_0$, for example, the null hypothesis is rejected if $t$ is either (1) $\le -t_{\alpha/2,n-1}$ or (2) $\ge t_{\alpha/2,n-1}$ (which makes the Type I error probability equal to $\alpha$).

Figure 7.4.5 [graph comparing $f_{T_{n-1}}(t)$ with $f_{T^*}(t)$; the two "Reject $H_0$" regions lie in the tails beyond $\pm t_{\alpha/2,n-1}$]

If the normality assumption is not true, the pdf of $\frac{\bar{Y} - \mu_0}{S/\sqrt{n}}$ will not be $f_{T_{n-1}}(t)$ but some other pdf, denoted here by $f_{T^*}(t)$, and the probability that the t ratio falls in the rejection region will not necessarily equal $\alpha$.

In effect, violating the normality assumption creates two $\alpha$'s: The "nominal" $\alpha$ is the Type I error probability we specify at the outset, typically 0.05 or 0.01. The "true" $\alpha$ is the actual probability that $\frac{\bar{Y} - \mu_0}{S/\sqrt{n}}$ falls in the rejection region (when $H_0$ is true). For the two-sided decision rule pictured in Figure 7.4.5,
$$\text{true } \alpha = \int_{-\infty}^{-t_{\alpha/2,n-1}} f_{T^*}(t)\, dt + \int_{t_{\alpha/2,n-1}}^{\infty} f_{T^*}(t)\, dt$$
Whether or not the validity of the t test is "compromised" by the normality assumption being violated depends on the numerical difference between the two $\alpha$'s. If $f_{T^*}(t)$ is, in fact, quite similar in shape and location to $f_{T_{n-1}}(t)$, then the true $\alpha$ will be approximately equal to the nominal $\alpha$. In that case, the fact that the $y_i$'s are not normally distributed would be essentially irrelevant.
On the other hand, if $f_{T^*}(t)$ and $f_{T_{n-1}}(t)$ are dramatically different (as they appear to be in Figure 7.4.5), it would follow that the normality assumption is critical, and establishing the "significance" of a t ratio becomes problematic.

Unfortunately, getting an exact expression for $f_{T^*}(t)$ is essentially impossible, because the distribution depends on the pdf being sampled, and there is seldom any way of knowing precisely what that pdf might be. However, we can still meaningfully explore the sensitivity of the t ratio to violations of the normality assumption by simulating samples of size $n$ from selected distributions and comparing the resulting histogram of t ratios to $f_{T_{n-1}}(t)$.

Figure 7.4.6 shows four such simulations, using Minitab; the first three consist of one hundred random samples of size $n = 6$. In Figure 7.4.6(a), the samples come from a uniform pdf defined over the interval [0, 1]; in Figure 7.4.6(b), the underlying pdf is the exponential with $\lambda = 1$; and in Figure 7.4.6(c), the data are coming from a Poisson pdf with $\lambda = 5$.

If the normality assumption were true, t ratios based on samples of size 6 would vary in accordance with the Student t distribution with 5 df. On pp. 407-408, $f_{T_5}(t)$ has been superimposed over the histograms of the t ratios coming from the three different pdfs. What we see there is really quite remarkable. The t ratios based on $y_i$'s coming from a uniform pdf, for example, are behaving much the same way as t ratios would vary if the $y_i$'s were normally distributed; that is, $f_{T^*}(t)$ in this case appears to be very similar to $f_{T_5}(t)$. The same is true for samples coming from a Poisson distribution (see Theorem 4.2.2). For both of those underlying pdfs, in other words, the true $\alpha$ would not be much different from the nominal $\alpha$.

Figure 7.4.6(b) tells a slightly different story.
When samples of size 6 are drawn from an exponential pdf, the t ratios are not in particularly close agreement with $f_{T_5}(t)$. Specifically, very negative t ratios are occurring much more often than the Student t curve would predict, while large positive t ratios are occurring less often (see Question 7.4.23). But look at Figure 7.4.6(d). When the sample size is increased to $n = 15$, the skewness so prominent in Figure 7.4.6(b) is mostly gone.

Figure 7.4.6

(a) Uniform pdf, $f_Y(y) = 1$, $0 \le y \le 1$

    MTB > random 100 c1-c6;
    SUBC> uniform 0 1.
    MTB > rmean c1-c6 c7
    MTB > rstdev c1-c6 c8
    MTB > let c9 = sqrt(6)*(((c7)-0.5)/(c8))
    MTB > histogram c9

    The let command calculates $\frac{\bar{y} - \mu}{s/\sqrt{n}} = \frac{\bar{y} - 0.5}{s/\sqrt{6}}$.
    [Histogram of the t ratios ($n = 6$), labeled "Sample distribution," with $f_{T_5}(t)$ superimposed]

(b) Exponential pdf, $f_Y(y) = e^{-y}$, $y > 0$

    MTB > random 100 c1-c6;
    SUBC> exponential 1.
    MTB > rmean c1-c6 c7
    MTB > rstdev c1-c6 c8
    MTB > let c9 = sqrt(6)*(((c7)-1.0)/(c8))
    MTB > histogram c9

    The let command calculates $\frac{\bar{y} - \mu}{s/\sqrt{n}} = \frac{\bar{y} - 1.0}{s/\sqrt{6}}$.
    [Histogram of the t ratios ($n = 6$), labeled "Sample distribution," with $f_{T_5}(t)$ superimposed]

(c) Poisson pdf, $p_X(k) = \frac{e^{-5} 5^k}{k!}$

    MTB > random 100 c1-c6;
    SUBC> poisson 5.
    MTB > rmean c1-c6 c7
    MTB > rstdev c1-c6 c8
    MTB > let c9 = sqrt(6)*(((c7)-5.0)/(c8))
    MTB > histogram c9

    [Histogram of the t ratios ($n = 6$), labeled "Sample distribution," with $f_{T_5}(t)$ superimposed]

(d) Exponential pdf, samples of size $n = 15$

    MTB > random 100 c1-c15;
    SUBC> exponential 1.
    MTB > rmean c1-c15 c16
    MTB > rstdev c1-c15 c17
    MTB > let c18 = sqrt(15)*(((c16)-1.0)/(c17))
    MTB > histogram c18

    [Histogram of the t ratios ($n = 15$), labeled "Sample distribution," with $f_{T_{14}}(t)$ superimposed]

Reflected in these specific simulations are some general properties of the t ratio:

1. The distribution of $\frac{\bar{Y} - \mu}{S/\sqrt{n}}$ is relatively unaffected by the pdf of the $y_i$'s [provided $f_Y(y)$ is not too skewed and $n$ is not too small].
2.
As $n$ increases, the pdf of $\frac{\bar{Y} - \mu}{S/\sqrt{n}}$ becomes increasingly similar to $f_{T_{n-1}}(t)$.

In mathematical statistics, the term robust is used to describe a procedure that is not heavily dependent on whatever assumptions it makes. Figure 7.4.6 shows that the t test is robust with respect to departures from normality.

From a practical standpoint, it would be difficult to overstate the importance of the t test being robust. If the pdf of $\frac{\bar{Y} - \mu}{S/\sqrt{n}}$ varied dramatically depending on the origin of the $y_i$'s, we would never know if the true $\alpha$ associated with, say, a 0.05 decision rule was anywhere near 0.05. That degree of uncertainty would make the t test virtually worthless.

Questions

7.4.23. Explain why the distribution of t ratios calculated from small samples drawn from the exponential pdf, $f_Y(y) = e^{-y}$, $y > 0$, will be skewed to the left [recall Figure 7.4.6(b)]. [Hint: What does the shape of $f_Y(y)$ imply about the possibility of each $y_i$ being close to 0? If the entire sample did consist of $y_i$'s close to 0, what value would the t ratio have?]

7.4.24. Suppose one hundred samples of size $n = 3$ are taken from each of the pdfs

$\sum_{i=1}^{19} y_i = 5261$ and $\sum_{i=1}^{19} y_i^2 = 1{,}469{,}945$, so the sample variance is 733.4:
$$s^2 = \frac{19(1{,}469{,}945) - (5261)^2}{19(18)} = 733.4$$
Since $n = 19$, the critical values appearing in the left-hand and right-hand limits of the $\sigma$ confidence interval come from the chi square pdf with 18 df. According to Appendix Table A.3, P(8.23 $\sigma_0^2$ at the $\alpha$ level of significance, reject $H_0$ if $\chi^2 \ge \chi^2_{1-\alpha,n-1}$.
b. To test $H_0: \sigma^2 = \sigma_0^2$ versus $H_1: \sigma^2 < \sigma_0^2$ at the $\alpha$ level of significance, reject $H_0$ if $\chi^2 \le \chi^2_{\alpha,n-1}$.
c. To test $H_0: \sigma^2 = \sigma_0^2$ versus $H_1: \sigma^2 \ne \sigma_0^2$ at the $\alpha$ level of significance, reject $H_0$ if $\chi^2$ is either (1) $\le \chi^2_{\alpha/2,n-1}$ or (2) $\ge \chi^2_{1-\alpha/2,n-1}$. □

Case Study 7.5.2

Mutual funds are investment vehicles consisting of a portfolio of various types of investments.
If such an investment is to meet annual spending needs, the owner of shares in the fund is interested in the average of the annual returns of the fund. Investors are also concerned with the volatility of the annual returns, measured by the variance or standard deviation. One common method of evaluating a mutual fund is to compare it to a benchmark, the Lipper Average being one of these. This index number is the average of returns from a universe of mutual funds.

The Global Rock Fund is a typical mutual fund, with heavy investments in international funds. It claimed to best the Lipper Average in terms of volatility over the period from 1989 through 2007. Its returns are given in the table below.

 Year   Investment Return %   Year   Investment Return %
 1989   15.32                 1999   27.43
 1990    1.62                 2000    8.57
 1991   28.43                 2001    1.88
 1992   11.91                 2002   -7.96
 1993   20.71                 2003   35.98
 1994   -2.15                 2004   14.27
 1995   23.29                 2005   10.33
 1996   15.96                 2006   15.94
 1997   11.12                 2007   16.71
 1998    0.37

The standard deviation for these returns is 11.28%, while the corresponding figure for the Lipper Average is 11.67%. Now, clearly, the Global Rock Fund has a smaller standard deviation than the Lipper Average, but is this small difference due just to random variation? The hypothesis test is meant to answer such questions.

Let $\sigma^2$ denote the variance of the population represented by the return percentages shown in the table above. To judge whether an observed standard deviation less than 11.67 is significant requires that we test
$$H_0: \sigma^2 = (11.67)^2 \quad \text{versus} \quad H_1: \sigma^2 < (11.67)^2$$
Let $\alpha = 0.05$. With $n = 19$, the critical value for the chi square ratio [from part (b) of Theorem 7.5.2] is $\chi^2_{\alpha,n-1} = \chi^2_{.05,18} = 9.390$ (see Figure 7.5.3). But
$$\chi^2 = \frac{(n - 1)s^2}{\sigma_0^2} = \frac{(19 - 1)(11.28)^2}{(11.67)^2} = 16.82$$
so our decision is clear: Do not reject $H_0$.
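The arithmetic in this case study can be reproduced from the table of returns. The Python sketch below recomputes the sample standard deviation and the chi square ratio; the function name `chi_square_ratio` is a hypothetical helper introduced only for illustration, and the cutoff 9.390 is copied from the text rather than computed.

```python
from statistics import stdev

# Annual returns (%) of the Global Rock Fund, 1989 through 2007.
returns = [15.32, 1.62, 28.43, 11.91, 20.71, -2.15, 23.29, 15.96, 11.12, 0.37,
           27.43, 8.57, 1.88, -7.96, 35.98, 14.27, 10.33, 15.94, 16.71]

SIGMA_0 = 11.67   # Lipper Average standard deviation (%)
CRITICAL = 9.390  # lower 5% cutoff of chi square with 18 df, from the text


def chi_square_ratio(y, sigma0):
    """Test statistic (n - 1) * s^2 / sigma0^2 for H0: sigma^2 = sigma0^2."""
    n = len(y)
    return (n - 1) * stdev(y) ** 2 / sigma0 ** 2


s = stdev(returns)
chi2 = chi_square_ratio(returns, SIGMA_0)

# One-sided test against H1: sigma^2 < sigma0^2, so small ratios reject H0.
reject = chi2 <= CRITICAL
print(f"s = {s:.2f}, chi square ratio = {chi2:.2f}, reject H0: {reject}")
```

Because the alternative is one-sided to the left, the rejection region is the lower tail, so a ratio well above the cutoff leaves $H_0$ standing.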
Figure 7.5.3 [graph of the chi square pdf with 18 df; the area to the left of 9.390 equals 0.05 and marks the "Reject $H_0$" region]

Questions

7.5.1. Use Appendix Table A.3 to find the following cutoffs and indicate their location on the graph of the appropriate chi square distribution.
(a) $\chi^2_{.95,14}$
(b) X|0,2
(c) $\chi^2_{.025,9}$

7.5.2. Evaluate the following probabilities:
(a) $P(\chi^2_{17} \ge 8.672)$
(b) $P(\chi^2_{6} < 10.645)$
(c) P(9.591y) = 0.99 (b) P(x125 5.009) =0.975 (b) P(21.2040.95 (Hint: Use a trial-and-error method.)

7.5.7. Start with the fact that $(n - 1)S^2/\sigma^2$ has a chi square distribution with $n - 1$ df (if the $Y_i$'s are normally distributed) and derive the confidence interval formulas given in Theorem 7.5.1.

7.5.8. A random sample of size $n = 19$ is drawn from a normal distribution for which $\sigma^2 = 12.0$. In what range are we likely to find the sample variance, $s^2$? Answer the question by finding two numbers $a$ and $b$ such that P(a,'s in Table 7.5.1. How does this confidence interval compare with the one in Case Study 7.5.1?

7.5.13. If a 90% confidence interval for 0; 6>0
(a) Use moment-generating functions to show that the ratio $2n\bar{Y}/\theta$ has a chi square distribution with $2n$ df.
(b) Use the result in part (a) to derive a $100(1 - \alpha)\%$ confidence interval for $\theta$.

7.5.15. Another method for dating rocks was used before the advent of the potassium-argon method described in Case Study 7.5.1. Because of a mineral's lead content, it was capable of yielding estimates for this same time period with a standard deviation of 30.4 million years. The potassium-argon method in Case Study 7.5.1 had a smaller sample standard deviation of $\sqrt{733.4} = 27.1$ million years. Is this "proof" that the potassium-argon method is more precise? Using the data in Table 7.5.1, test at the 0.05 level whether the potassium-argon method has a smaller standard deviation than the older procedure using lead.

7.5.16.
When working properly, the amounts of cement that a filling machine puts into 25-kg bags have a standard deviation ( 1 using the $\alpha = 0.05$ level of significance. Assume that the weights are normally distributed.

 26.18   24.22   24.22   25.30   26.48   24.49
 25.18   23.97   25.68   24.54   25.83   26.01
 25.14   25.05   25.50   25.44   26.24   25.84
 24.49   25.46   26.09   25.01   25.01   25.21
 25.12   24.71   26.04   25.67   25.27   25.23

Use the following sums:
$$\sum_{i=1}^{30} y_i = 758.62 \quad \text{and} \quad \sum_{i=1}^{30} y_i^2 = 19{,}195.7938$$

7.5.17. A stock analyst claims to have devised a mathematical technique for selecting high-quality mutual funds and promises that a client's portfolio will have higher average ten-year annualized returns and lower volatility; that is, a smaller standard deviation. After ten years, one of the analyst's twenty-four-stock portfolios showed an average ten-year annualized return of 11.50% and a standard deviation of 10.17%. The benchmarks for the type of funds considered are a mean of 10.10% and a standard deviation of 15.67%.
(a) Let $\mu$ be the mean for a twenty-four-stock portfolio selected by the analyst's method. Test at the 0.05 level that the portfolio beat the benchmark; that is, test $H_0: \mu = 10.1$ versus $H_1: \mu > 10.1$.
(b) Let $\sigma$ be the standard deviation for a twenty-four-stock portfolio selected by the analyst's method. Test at the 0.05 level that the portfolio beat the benchmark; that is, test $H_0: \sigma = 15.67$ versus $H_1: \sigma < 15.67$.

7.6 Taking a Second Look at Statistics (Type II Error)

For data that are normal, and when the variance $\sigma^2$ is known, both Type I errors and Type II errors can be determined, staying within the family of normal distributions. (See Example 6.4.1, for instance.) As the material in this chapter shows, the situation changes radically when $\sigma^2$ is not known. With the development of the Student t distribution, tests of a given level of significance $\alpha$ can be constructed. But what is the Type II error of such a test?
To answer this question, let us first recall the form of the test statistic and critical region testing, for example,
$$H_0: \mu = \mu_0 \quad \text{versus} \quad H_1: \mu > \mu_0$$
The null hypothesis is rejected if
$$\frac{\bar{Y} - \mu_0}{S/\sqrt{n}} \ge t_{\alpha,n-1}$$
The probability of the Type II error, $\beta$, of the test at some value $\mu_1 > \mu_0$ is
$$P\left(\frac{\bar{Y} - \mu_0}{S/\sqrt{n}} < t_{\alpha,n-1}\right)$$
However, since $\mu_0$ is not the mean of $\bar{Y}$ under $H_1$, the distribution of $\frac{\bar{Y} - \mu_0}{S/\sqrt{n}}$ is not Student t. Indeed, a new distribution is called for.

The following algebraic manipulations help to place the needed density into a recognizable form.
$$\frac{\bar{Y} - \mu_0}{S/\sqrt{n}} = \frac{\bar{Y} - \mu_1 + (\mu_1 - \mu_0)}{S/\sqrt{n}} = \frac{\dfrac{\bar{Y} - \mu_1}{\sigma/\sqrt{n}} + \dfrac{\mu_1 - \mu_0}{\sigma/\sqrt{n}}}{S/\sigma}$$
$$= \frac{\dfrac{\bar{Y} - \mu_1}{\sigma/\sqrt{n}} + \dfrac{\mu_1 - \mu_0}{\sigma/\sqrt{n}}}{\sqrt{\dfrac{(n-1)S^2/\sigma^2}{n-1}}} = \frac{\dfrac{\bar{Y} - \mu_1}{\sigma/\sqrt{n}} + \delta}{\sqrt{\dfrac{(n-1)S^2/\sigma^2}{n-1}}} = \frac{Z + \delta}{\sqrt{\dfrac{U}{n-1}}}$$
where $Z = \frac{\bar{Y} - \mu_1}{\sigma/\sqrt{n}}$ is normal, $U = \frac{(n-1)S^2}{\sigma^2}$ is a chi square variable with $n - 1$ degrees of freedom, and $\delta = \frac{\mu_1 - \mu_0}{\sigma/\sqrt{n}}$ is an (unknown) constant. Note that the random variable $\frac{Z + \delta}{\sqrt{U/(n-1)}}$ differs from the Student t with $n - 1$ degrees of freedom, $\frac{Z}{\sqrt{U/(n-1)}}$, only because of the additive term

7.6.1 $\mu_0$ at the $\alpha = 0.05$ level of significance. Let $n = 20$. In this case the test is to reject $H_0$ if the test statistic is greater than $t_{.05,19} = 1.7291$. What will be the Type II error if the mean has shifted by 0.5 standard deviation to the right of $\mu_0$?

Saying that the mean has shifted by 0.5 standard deviation to the right of $\mu_0$ is equivalent to setting $\frac{\mu_1 - \mu_0}{\sigma} = 0.5$. In that case, the noncentrality parameter is $\delta = \frac{\mu_1 - \mu_0}{\sigma/\sqrt{n}} = (0.5) \cdot \sqrt{20} = 2.236$.

The probability of a Type II error is $P(T_{19,2.236} < 1.7291)$, where $T_{19,2.236}$ is a noncentral t variable with 19 degrees of freedom and noncentrality parameter 2.236.

To calculate this quantity, we need the cdf of $T_{19,2.236}$. Fortunately, many statistical software programs have this function.
The Minitab commands for calculating the desired probability are

    MTB > CDF 1.7291;
    SUBC > T 19 2.236.

with output

    Cumulative Distribution Function
    Student's t distribution with 19 DF and noncentrality parameter 2.236
         x    P(X <= x)
    1.7291     0.304828

The sought-after Type II error to three decimal places is 0.305.

Simulations

As we have seen, with enough distribution theory, the tools for finding Type II errors for the Student t test exist. Also, there are noncentral chi square and F distributions. However, the assumption that the underlying data are normally distributed is necessary for such results. In the case of Type I errors, we have seen that the t test is somewhat robust with regard to the data deviating from normality. (See Section 7.4.) In the case of the noncentral t, dealing with departures from normality presents significant analytical challenges. But the empirical approach of using simulations can bypass such difficulties and still give meaningful results.

To start, consider a simulation of the problem presented in Example 7.6.1. Suppose the data have a normal distribution with $\mu_0 = 5$ and $\sigma = 3$. The sample size is $n = 20$. Suppose we want to find the Type II error when the true $\delta = 2.236$. For the given $\sigma = 3$, this is equivalent to

\[
2.236 = \frac{\mu_1 - \mu_0}{\sigma/\sqrt{n}} = \frac{\mu_1 - 5}{3/\sqrt{20}}
\]

or $\mu_1 = 6.5$. A Type II error occurs if the test statistic is less than 1.7291. In this case, $H_0$ would be accepted when rejection is the proper decision.

Using Minitab, two hundred samples of size 20 from the normal distribution with $\mu = 6.5$ and $\sigma^2 = 9$ are generated. Minitab produces a 200 × 20 array. For each row of the array, the test statistic is calculated and placed in Column 21. If this value is less than 1.7291, a 1 is placed in that row of Column 22; otherwise a 0 goes there. The sum of the entries in Column 22 gives the observed number of Type II errors. Based on the computed value of the Type II error, 0.305, for the assumed value of $\delta$, this observed number should be approximately 200(0.305) = 61.
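The procedure just described (generate samples, compute each t ratio, flag the ones below the critical value, and count) can also be sketched in Python using only the standard library. This is an illustrative sketch, not a transcription of the Minitab session. The function name, the `draw` argument, and the seed are assumptions introduced here, and the count obtained depends on the random number stream, so it will generally differ from any count Minitab reports.

```python
import math
import random
import statistics


def count_type_ii_errors(num_samples, n, mu0, critical_value, draw, seed=1):
    """Count samples whose t ratio falls below the critical value.

    draw(rng) returns one observation from the true (alternative)
    distribution; it is a hypothetical hook added for this sketch.
    """
    rng = random.Random(seed)
    errors = 0
    for _ in range(num_samples):
        sample = [draw(rng) for _ in range(n)]
        y_bar = statistics.fmean(sample)
        s = statistics.stdev(sample)          # divisor n - 1
        t_ratio = (y_bar - mu0) / (s / math.sqrt(n))
        if t_ratio < critical_value:          # H0 not rejected: a Type II error
            errors += 1
    return errors


def normal_draw(rng):
    """One observation from the normal distribution with mean 6.5 and sd 3."""
    return rng.gauss(6.5, 3.0)


errors = count_type_ii_errors(200, 20, 5.0, 1.7291, normal_draw)
print(errors, "Type II errors in 200 samples")
```

Because the sampling distribution enters only through `draw`, a different generator with the same mean and variance can be substituted to see how the count changes when the data are not normal.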
The Minitab simulation gave sixty-four observed Type II errors, a figure very close to what was expected.

Analyzing the robustness of the t test with respect to Type II errors can lead to analytical thickets. However, simulation can again shed some light on Type II errors in some cases. As an example, suppose the data are not normal, but gamma with $r = 4.694$ and $\lambda = 0.722$. Even though the distribution is skewed, these values make the mean $\mu = 6.5$ and the variance $\sigma^2 = 9$, as in the normal case above. Again relying on Minitab to give two hundred random samples of size 20, the observed number of Type II errors is sixty, so the test has some robustness for Type II errors in that case. Even though the data are not normal, the key statistic in the analysis, $\bar{y}$, will be approximately normal by the central limit theorem. If the distribution of the underlying data is unknown or extremely skewed, nonparametric tests, like the ones covered in Chapter 14 and in (28), are advised.

Appendix 7.A.1 Minitab Applications

Many statistical procedures, including several featured in this chapter, require that the sample mean and sample standard deviation be calculated. Minitab's DESCRIBE command gives $\bar{y}$ and $s$, along with several other useful numerical characteristics of a sample. Figure 7.A.1.1 shows the DESCRIBE input and output for the twenty observations cited in Example 7.4.1.
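For readers without access to Minitab, the same numerical summaries can be approximated with Python's standard library. The sketch below is an illustration under stated assumptions: the data are the twenty observations as entered in Figure 7.A.1.1, and the "exclusive" quantile method is assumed to correspond to the interpolation rule behind Minitab's quartiles, which it appears to do for this data set.

```python
import math
import statistics

# The twenty observations entered in Figure 7.A.1.1.
data = [2.5, 3.2, 0.5, 0.4, 0.3, 0.1, 0.1, 0.2, 7.4, 8.6, 0.2, 0.1,
        0.4, 1.8, 0.3, 1.3, 1.4, 11.2, 2.1, 10.1]

n = len(data)
mean = statistics.fmean(data)            # sample mean
s = statistics.stdev(data)               # sample standard deviation, divisor n - 1
se_mean = s / math.sqrt(n)               # standard error of the mean
# Assumption: the "exclusive" method interpolates at position (n + 1)p.
q1, median, q3 = statistics.quantiles(data, n=4, method="exclusive")

print(n, round(mean, 3), round(se_mean, 3), round(s, 3))
print(min(data), round(q1, 3), round(median, 3), round(q3, 3), max(data))
```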
Figure 7.A.1.1

    MTB > set c1
    DATA > 2.5 3.2 0.5 0.4 0.3 0.1 0.1 0.2 7.4 8.6 0.2 0.1
    DATA > 0.4 1.8 0.3 1.3 1.4 11.2 2.1 10.1
    DATA > end
    MTB > describe c1

    Descriptive Statistics: C1

    Variable   N  N*   Mean  SE Mean  StDev  Minimum     Q1  Median     Q3  Maximum
    C1        20   0  2.610    0.809  3.617    0.100  0.225   0.900  3.025   11.200

Here,
N = sample size
N* = number of observations missing from c1 (that is, the number of "interior" blanks)
Mean = sample mean = $\bar{y}$
SE Mean = standard error of the mean = $s/\sqrt{n}$
StDev = sample standard deviation = $s$
Minimum = smallest observation
Q1 = first quartile = 25th percentile
Median = middle observation (in terms of magnitude), or average of the middle two if n is even
Q3 = third quartile = 75th percentile
Maximum = largest observation

Describing Samples Using Minitab Windows
1. Enter data under C1 in the WORKSHEET. Click on STAT, then on BASIC STATISTICS, then on DISPLAY DESCRIPTIVE STATISTICS.
2. Type C1 in VARIABLES box; click on OK.

Percentiles of chi square, t, and F distributions can be obtained using the INVCDF command introduced in Appendix 3.A.1. Figure 7.A.1.2 shows the syntax for printing out $\chi^2_{.95,6}$ (= 12.5916) and $F_{.01,4,7}$ (= 0.0667746).
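An inverse cdf lookup of this kind amounts to solving $P(X \leq x) = p$ for $x$. As a rough illustration of the idea, the Python sketch below inverts a chi square cdf by bisection. It is not how INVCDF is implemented and it is deliberately limited: it assumes an even number of degrees of freedom, for which the chi square cdf has a closed form, and the function names and tolerance are choices made for this example.

```python
import math


def chi_square_cdf_even_df(x, df):
    """P(chi square with even df <= x), using the closed-form finite sum."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch handles positive even df only")
    if x <= 0.0:
        return 0.0
    half_x = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half_x / k          # term is (x/2)^k / k!
        total += term
    return 1.0 - math.exp(-half_x) * total


def chi_square_percentile(p, df, tol=1e-10):
    """Find x with P(chi square <= x) = p by bisection (assumes 0 < p < 1)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    low, high = 0.0, 1.0
    while chi_square_cdf_even_df(high, df) < p:   # widen until p is bracketed
        high *= 2.0
    while high - low > tol:
        mid = (low + high) / 2.0
        if chi_square_cdf_even_df(mid, df) < p:
            low = mid
        else:
            high = mid
    return (low + high) / 2.0


print(round(chi_square_percentile(0.95, 6), 4))
```

Odd degrees of freedom and the t and F distributions need a general incomplete gamma or beta routine, which is where a statistical package earns its keep.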