The Analysis of Two-Factor Interactions in Fixed Effects Linear Models Author(s): Robert J\ Boik Reviewed work(s): Source: Journal of Educational Statistics, Vol. 18, No. 1 (Spring, 1993), pp. 1-40 Published by: American Educational Research Association and American Statistical Association Stable URL: http://www.jstor.Org/stable/l 16518T Accessed: 22/12/2011 06:24 Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at http://www.jstor.org/page/info/about/policies/terms.jsp JSTOR is a not-for-profit service that helps scholars, researchers, and students discover, use, and build upon a wide range of content in a trusted digital archive. We use information technology and tools to increase productivity and facilitate new forms of scholarship. For more information about JSTOR, please contact support@jstor.org. 9 STOR American Educational Research Association and American Statistical Association are collaborating with JSTOR to digitize, preserve and extend access to Journal of Educational Statistics. http://www.jstor.org Journal of Educational Statistics Spring 1993, Vol. 18, No. 1, pp. 1-40 The Analysis of Two-Factor Interactions in Fixed Effects Linear Models Robert J. Boik Montana State University Key words: simultaneous inference, multiple comparisons, product contrasts This article considers two related issues concerning the analysis of interactions in complex linear models. The first issue concerns the omnibus test for interaction. Apparently, it is not well known that the usual F test for interaction can be replaced, in many applications, by a test that is more powerful against a certain class of alternatives. The competing test is based on the maximal product interaction contrast F statistic and achieves its power advantage by focusing solely on product contrasts. The maximal product interaction F test is reviewed and three new results are reported: (a) An extended table of exact critical values is computed, (b) a table of moment functions useful for approximating the p-value corresponding to an observed maximal F statistic is computed, and (c) a simulation study concerning the null distribution of the maximal F statistic when data are unbalanced or covariates are present is reported. It is conjectured that lack of balance or presence of covariates has no effect on the null distribution. The simulation results support the conjecture. The second issue concerns follow-up tests when the omnibus test is significant. It appears that researchers, in general, do not perform coherent follow-up tests on interactions. To make it easier for researchers to do so, an exposition on the use of product interaction contrasts and partial interactions in complex fixed-effects models is provided. The recommended omnibus and follow-up tests are illustrated on an educational data set analyzed using SAS (SAS Institute, 1988) and SPSS (1990). Hypotheses in an analysis of variance (anova) or an analysis of covariance (ancova) model are typically categorized into a small number of families. A two-way classification with covariates, for instance, might have four families: row effects, column effects, row x column interaction effects, and covariate effects. Associated with each family is a composite hypothesis stating that the null form of all subhypotheses in the family is true. The conventional strategy begins by testing the composite hypothesis; if it is The author thanks Carol Bittinger for her assistance with SAS and SPSS. Appreciation is also expressed to Steve Cherry, Don Daly, and Karen Summers for their comments on an earlier draft of this article. 1 Boik rejected, then subhypotheses implied by the composite are tested. Gabriel (1969) refers to such a strategy as logically coherent. For example, in a one-way classification, the usual composite hypothesis states that all population means are identical. This composite hypothesis implies that every contrast among the population means is equal to zero. Accordingly, testing contrasts among means after rejection of the composite hypothesis is a coherent strategy. The usual composite hypothesis for a two-factor interaction states that contrasts among the levels of one factor do not differ between levels of the other factor. In one strategy, rejection of the composite interaction hypothesis is followed by tests of simple effects contrasts. A simple effects contrast is a contrast among the levels of one factor at a specific level of the other factor. It is well known that this strategy is not coherent (Betz & Gabriel, 1978). That is, simple effects hypotheses are not implied by the composite interaction hypothesis. Testing simple effects following a significant interaction produce what Marascuilo and Levin (1970) call a Type IV error: "the incorrect interpretation of a correctly rejected hypothesis" (p. 398). Rosnow and Rosenthal (1989a), in a survey of studies employing factorial anova, documented the widespread practice of following a significant interaction by tests of simple effects contrasts. Rosnow and Rosenthal (1989b) suggested that one reason for the high frequency of incoherent analyses is that, for the analysis of interactions, researchers are poorly served by standard software packages. While I sympathize with (and have empathy for) software users, I am not in complete agreement. I suspect that interactions are rarely analyzed correctly for the following three reasons, (a) Descriptions of coherent procedures for analyzing interactions have been, with few exceptions, restricted to balanced data without covariates. This is true in the statistical (Boik, 1986; Bradu & Gabriel, 1974; Gabriel, Putter, & Wax, 1973), psychological (Boik, 1979; Keppel, 1973; Keppel & Zedeck, 1989), as well as educational (Betz & Gabriel, 1978; Betz & Levin, 1982; Marascuilo & Levin, 1970) literature. As a consequence, most researchers are unaware that methods for analyzing two-factor interactions are applicable to unbalanced as well as balanced data and to models that include covariates as well as higher order interactions, (b) Most researchers are unaware that standard software can compute detailed analyses of two-factor interactions, (c) Most researchers are unaware that specialized multiple comparison procedures for interaction have been developed. This article attempts to correct the preceding misconceptions. In particular, the analysis of interactions in unbalanced data with covariates is described and illustrated with SAS (SAS Institute, 1985, 1988) and SPSS (1990). These software packages were selected because they are, to the author's knowledge, the only widely available packages that include both a 2 Analysis of Interactions flexible linear models procedure and a matrix procedure capable of computing the maximal product contrast F statistic. An extensive table of critical values for the maximal product contrast F statistic is given along with a table to facilitate computation of the associated p-values. Simulation evidence that the critical values and p-values are applicable when data are not balanced or covariates are present is reported. This article also compares the analysis strategy based on the maximal product contrast F statistic to the Lutz and Cundari (1987) strategy based on the most significant parametric function. To enhance readability, mathematical details have been relegated to the Appendix. Also, long technical phrases have been abbreviated to short technical phrases (second best, after short nontechnical phrases). For instance, Factor A simple effects contrast is shortened to simple-A contrast, and Factor B main effects contrast is shortened to main-B contrast. Adjusted Means and Main Effects Tests Adjusted Means Consider a fixed effects linear model that includes two factors, A and B, and their interaction. The model may also include other factors, interactions, and covariates. Factor A has a levels, and Factor B has b levels. The data need not be balanced, provided that each cell in the model is observed at least once, and the mean square error has at least one degree of freedom. All information concerning Factors A and B is contained in two matrices: the matrix of estimated means (adjusted, if covariates are present) and the matrix of estimated covariances among the estimated means. The corresponding model is M = M + E or p. = p. + e, where M is the a x b matrix of estimated (adjusted) means, M is the corresponding matrix of population (adjusted) means, and E is the a x b matrix of random residuals. The vectors fl, p., and e are each ab x 1 and are obtained by stacking the columns of M, M, and E, respectively. This operation is denoted by p. = vec(M), p. = vec(M), and e = vec(E). The entries in M are called least-squares means by SAS (SAS Institute, 1988) and adjusted means by SPSS (1990). Estimation of adjusted means is described in the Appendix. The matrix of covariances among the entires of M can be written as var(|l) = tr22 for £ in (Al) and where cr2 is an unknown scalar. The covariance matrix is estimated by var((l) = a-2 2, where a-2 is the mean square error (MSE) obtained from fitting the full model and has v degrees of freedom. 3 Boik Likelihood Ratio Tests The usual hypotheses associated with a two-way classification can be written as H0: C'ji = 0, where C is a known ab x s coefficient matrix and where C denotes the transpose of C. The linear function, C'(i> could consist of a set of main effects contrasts, simple effects contrasts, or interaction contrasts depending on the choice of C. The likelihood ratio test (LRT) statistic for H0: C'|i = 0 is an F statistic, is denoted by F(C), and is given in (A3). The principal disadvantage of expressing hypotheses as H0: C'|i = 0 is that the appropriate choice of C is not always apparent. Fortunately, most hypotheses of interest can be expressed, somewhat more transparently, as H0:CAMCB = 0, where CA and CB are known coefficient matrices. The matrix CA operates on Factor A while the matrix CB operates on Factor B. If the hypothesis concerns an effect averaged over the levels of Factor A, then CA is an a x 1 vector with each element equal to a"1. If the hypothesis concerns differences among the levels of Factor A, then each column of CA consists of the coefficients associated with a particular contrast among the levels of Factor A. The Factor B coefficient matrix is constructed similarly. For example, suppose a = 3, b = 4, and the difference between Ax and A3, averaged over B, is of interest (a main-^4 contrast). To average over columns, CB is equated to (.25 .25 .25 .25)'. To compare rows 1 and 3, CA is equated to (1 0 -1)'. Regardless of the particular choice of CA and CB, the LRT statistic is still an F statistic (or proportional to an F statistic). To emphasize the hypothesis being tested, the LRT statistic is written as T(CA,CB). An expression for T(CA,CB) is given in (A4). In general, T(CA,CB) is equal to the Fstatistic for testing H0: CAMCB = 0 multiplied by the numerator degrees of freedom. That is, the numerator is the hypothesis sum of squares, and the denominator is MSE. Subscripts are used to distinguish between the coefficient matrices when multiple hypotheses are tested. Factor A coefficient matrices are denoted by CAW, CA(2), and so forth. The matrix CA(i) concerns the ith hypothesis involving Factor A; it does not refer to the /th level of Factor A. Factor B coefficient matrices are labeled in the same way. Small cs, c^,) and cB(;1, are used if the coefficient matrix is a vector. If the coefficient vector is a column of ones, it is denoted by la or lb. Main Effects Tests Main effects hypotheses concern contrasts among the row or column means of M. In computing these marginal means, rows and columns of M are weighted equally. The A means and their estimators are Ha = MUZr1 and pLA = Mlbb~l, 4 Analysis of Interactions respectively. Similarly, the B means and their estimators are ftB = M'laa"1 and \iB = M'laa~l. Let \\iA be a contrast among the A means, and let \\iA be the corresponding estimator. That is, i|m = z'a\*-a and iiu = c'A\*.a, where cA is an a x 1 coefficient vector whose elements sum to zero. For example, suppose that a = 4 and that the difference between A{ and the average of.A2 and A3 is of interest. The contrast is 4 contrasts is a set of a - 1 contrasts whose coefficient vectors are linearly independent. The vectors need not be orthogonal. If the coefficient vectors are arranged into a matrix, CU = (c,4(i) cA(2) ■ ■ ■ cA(a - i)), then the composite null can be written as H0: nA = 0 or, equivalently, as H0: C'A Mlfc = 0. For example, if a = 4, then a suitable matrix is CA = / 1 1 »1 -1 0 0 -1 0 \ o 0 -1/ The columns of are said to form a basis set of coefficients. In the remainder of this article, CA and CB, without additional subscripts, denote matrices forming basis sets of coefficients for main-v4 and main-fl contrasts. The LRT statistic for H0:CAM16 = 0 is denoted by T(CA, lb) to emphasize that a basis set of contrasts among rows, summed over columns, is 5 Boik being tested. The statistic is identical to a - 1 times the usual F statistic for testing row effects. For an a level test, the composite null is rejected if (a - ly'TiCA, 1„) s= F\ Z. where FlaZ"v is the upper 100(1 - ^percentile of the F distribution with a - 1 and v degrees of freedom. Scheffe's (1953) method can be used to control familywise Type I error rate for follow-up tests: H0:cAMl6 = 0 is rejected if T(cA,lb) s (a - l)F]l?,v. Furthermore, if the composite null is rejected, then Scheffe's method is guaranteed to find at least one significant main-A contrast because max T(cA,lb) = T(CA,lb), where the maximization is over all vectors that sum to zero. Main-B contrasts are tested in an analogous manner. Interaction Tests Partial Interaction Hypotheses Let \\>B be a main-B contrast: i|/fl = c'b\lb, where cBlb = 0. Associated with each main-B contrast is a set of simple-B contrasts, one at each level of Factor A. The simple-B contrast at the rth level of Factor A is denoted by tyB(Ai)'- *\>b(Aj) = ^/*= iC/|x,7, where c; is the jth element of cfl. In matrix terms, the vector of simple-B contrasts and its estimator are and ^) = Mcfl= *™ B{A„)I respectively. A main-B contrast and its associated vector of simple-B contrasts are related in a straightforward manner: The main-B contrast is the mean of the associated simple-B contrasts. The question of interaction is also straightforward: Are the simple-B contrasts identical at all levels of A, or do they differ? A main-B contrast is said to interact with A if the simple-B contrasts are not identical. A main-B contrast does not interact with A if the simple-B contrasts are identical. A main-B contrast that does not interact with A can be interpreted without regard for any AB interactions that might exist. To help determine if \\iB interacts with A, equality of the simple-B contrasts can be tested. The corresponding null is H0: <]>b(ai) = ^b(Aj) for all /,;'. Boik (1979) called this a partial interaction hypothesis. The partial interaction hypothesis implies that all contrasts among the simple-B contrasts are equal to zero. Thus, the null can be written as H0: CA«|»b(a) = 0 or, equivalently, Ho:CAMcB = 0. The LRT statistic for H0:C;Mcs = 0 is T(CA,cB). The notation emphasizes that a basis set of row contrasts among a set of simple-B contrasts is being tested. For a priori cB, T(CA,cB) is distributed as a - 1 times an F distribution with a - 1 and v degrees of freedom. 6 Analysis of Interactions The distinction between simple effects hypotheses and partial interaction hypotheses is an important one and warrants repeating. The partial interaction hypothesis H0: CA McB = 0 states that the a simple-B contrasts are each equal to the same value; but this value need not be zero. The simple-B hypothesis, H0: ^fB(A) = 0, or, equivalently, H0: McB = 0, states that the a simple-B contrasts are each equal to the same value and that this value is 0. The LRT statistic is T(la,cB), and, for a priori cB, is distributed as a times an Fwith a and v degrees of freedom. The simple-B hypothesis is false if some simple-fi contrast, or some combination of the simple-B contrasts, is nonzero. The partial interaction hypothesis is false if some difference among the simple-B contrasts is nonzero. Composite Interaction Hypothesis: Likelihood Ratio Test If a priori partial interaction hypotheses have not been specified, then a composite interaction null is usually tested. The composite null states that, for any main-B contrast, the associated simple-B contrasts are identical at all levels of A. The null can be written as H0: C'A MCB = 0. The LRT statistic for the composite interaction null is T(CA,CB) and is identical to (a — 1) (b - 1) times the usual F statistic for interaction. For an a level test, H0 is rejected if FAB > F\a\\b - I)jV, where FAB = [(a - l)(b - 1)] 1 T(CA,CB)- A composite interaction null, in many applications, can be tested by a test that is more powerful against a certain class of alternatives than the LRT. The competing test is based on the maximal product interaction contrast F statistic. To understand the rationale underlying the maximal F statistic, some background on interaction contrasts is needed. Interaction Contrasts A variety of coherent follow-up tests can be conducted if the composite interaction null is rejected. The composite null implies that all interaction contrasts are zero. The general form of an interaction contrast is a b is = 2 S c,7(x,7, or, equivalently, >\iAB = trace(CAB M), i = iy = i where CAB is an a x b matrix with elements {c,7}; each row and each column of CAB sums to zero. The LRT statistic for Ho: trace(CAe M) = 0 is a special case of (A3) and can be written as F[vec(CAB)] = J^= [trace(C;flM)f var(*M var[trace(CU*M)]' W where var[trace(C/4BM)] is given in (A5). If CAB is specified a priori, then F[\ec(CAB)] has an F distribution with 1 and v degrees of freedom. In practice, attention can often be restricted to a subset of interaction contrasts called product interaction contrasts. A product contrast is an inter- 7 Boik action contrast for which the coefficient matrix can be written as Cab ~ Ca Cb, where c„ and cB are coefficient vectors that sum to zero. The contrast is called a product contrast because the j'y'th coefficient in k^ab is given by the product of the ith coefficient in cA and the y'th coefficient in cB. A product contrast can be written as *\iab = CaMcb, and the LRT statistic for H0: c'AMcfl = 0 is T(cA,cB). If min(a, b) > 2, then product contrasts are only a subset of interaction contrasts. Consequently, some components of the interaction are ignored if attention is restricted to product contrasts. Nevertheless, substantial information is not likely to be lost because nonproduct: contrasts are very difficult to interpret. Product contrasts, on the other hand, are frequently easy to interpret. The difficulty of interpreting nonproduct contrasts is illustrated in a later section that compares the Lutz and Cundari (1987) approach to the present approach. To interpret a product contrast, c^Mcfl, consider, first, the associated main-/? contrast: \\iB = c'ByiB. A complete interpretation of the main-/? contrast entails a statement about its value, averaged over the levels of A, plus a statement about how it differs among the levels of A. Testing the partial interaction, using T(CA, cB), helps to determine if the contrast differs among the levels of A. If it is concluded that the simple-/? contrasts do differ among the levels of A, then a natural follow-up strategy is to examine specific differences among the simple-/? contrasts. This is where product contrasts are useful. A product contrast is a specific difference among the simple-/? contrasts. Hence, to interpret a product contrast, one need only interpret a difference among simple-/? contrasts. Of course, if the partial interaction null cannot be rejected, then product contrasts need not be examined; the simple-/? contrasts do not differ significantly. Product contrasts can also be interpreted as a difference among simple-/! contrasts. As an illustration, consider the example from Rosnow and Rosenthal (1989a): The sample means reflect the effects of a fictitious treatment, ralphing, on the performance (number of hits) of baseball players. Factor A has levels A\. control and A2: ralphed. Factor B has levels B\. inexperienced players and B2: experienced players. The two main effects and their interaction are significant. There is only one contrast in a two-level factor, so this analysis is somewhat mechanical. For = (-1 1)', the estimated simple-,4 contrasts are tyAW = (2 4)', and the average contrast is \\>A = 3. The performance improvement due to ralphing is estimated to be 2 hits for inexperienced players, 4 hits for experienced players, and 3 hits on the average. For /?! B2 8 Analysis of Interactions cB = (-1 1)', the estimated product contrast is \\tAB = 2. Because the interaction has just one degree of freedom, this product contrast is the entire interaction. The interpretations are straightforward. On the average, the performance improvement due to ralphing is 3 hits, but experienced players benefit more (by two hits) than inexperienced players. Composite Interaction Hypothesis: Maximal F Test The LRT test of H0: CA MCB = 0 is not recommended when attention is restricted to product contrasts. It is not as powerful for product contrasts as a competing test which considers only product contrasts. The recommended test is based on Roy's (1953) union-intersection principle and rejects the composite null for large R, where R = max T(cA,cB), (2) and where the maximization is over all vectors that sum to zero. The test statistic, R, is the maximal Fcorresponding to a product interaction contrast. When data are balanced and there are no covariates, the exact null distribution of R is known. Boik (1985,1986) referred to the distribution of R as the Studentized maximum root (SMR) distribution. The 100(1 - a) percentile of the SMR distribution is denoted by Rp~qav, where p = min (a - 1, b - 1) and q = max(a - 1, b - 1). Tables of Rxp ~° for 2 < p < 5, p < q < 6, a = .05, and a = .01 are given in Boik (1986). There is no need for special tables corresponding to p = 1 because R\,q" = qF *~a. The SMR percentiles can still be used when data are unbalanced or covariates are present, but the percentiles are, perhaps, no longer exact. The accuracy of the SMR percentiles for unbalanced data or ancova is discussed in a following section. Interaction Contrasts Versus Corrected Cell Means Rosnow and Rosenthal (1989a, 1989b) argued that to correctly interpret an interaction "the exercise of looking at the 'corrected' cell means is absolutely essential" (1989b, p. 1282). Corrected cell means are sometimes called interaction effects and are obtained by removing row, column, and grand mean effects from the cell means. The z';'th corrected cell mean is Itj = M-/y ~ fa. "TO - (F./ -]!..)- jl.. = V-ij ~ -}!./+ ji.., using the usual dot and overbar notation to denote averaging. The a x b matrix of corrected cell means is T = {-y(|-} = I^MH,, where Hfl = I„ - a"11„ \'a and H6 = l„ - b"116 \'b. From the expression for T, it can be deduced that a corrected cell mean is a product contrast. In particular, 7,, = cJi(,)McB(y), where cAV) is the ith 9 Boik column of H„ and cfl(y) is the ;th column of Hb. For example, if a = 4 and b = 5, then the coefficient vectors corresponding to 723 are cAp) = (-.25 .75 -.25 -.25)'andcfl(3) = (-.2 -.2 .8 -.2 -.2)'. It is not clear why Rosnow and Rosenthal insisted that one must examine the corrected cell means. The corrected cell means are merely one set of product contrasts. In a particular study, other interaction contrasts may be more meaningful. Rosenthal and Rosnow (1985, p. 28-36) also examined more general product contrasts (they call them crossed contrasts). They computed the product contrasts on the corrected cell means, T, rather than on the uncorrected cell means, M. This is not erroneous, but it is unnecessary. Interaction contrasts (product or otherwise) are identical whether computed on the corrected or uncorrected cell means. That is, trace(CAs T) = trace(CAsM) for all matrices, CAB, in which each row and each column sums to zero. Thus, corrected cell means need not be computed to examine interaction contrasts. Multiple Comparison Procedures for Interactions It is asumed that Type I error rate is to be controlled for some set (i.e., family) of contrasts. Power for testing a particular contrast depends, in part, on the size of the set the contrast belongs to. Large sets translate into small power for individual contrasts. Power can be increased by restricting tests to smaller sets of contrasts. This trade-off between generality and power is typical of multiple comparison procedures. Hochberg and Tamhane (1987, sec. 10.5) review multiple comparison procedures for interaction in balanced two-way classifications without covariates. This section reviews selected procedures that can be employed in more complex linear models where data need not be balanced and covariates may be present. Family 1: All Interaction Contrasts If the set of interest consists of all interaction contrasts, then the recommended test of the composite null, Ho: C'A MCB = 0, is the LRT: reject H0 if FAB > F\~-\b _ n>v. Scheffe's (1953) method can be used to control family-wise Type I error rate of any follow-up tests of interaction contrasts. That is, Ho:trace(CABM) = 0 is rejected if F[\ec(CAB)] > (a - l)(b - 1) F}a~A)(b - i),v Furthermore, if the composite null is rejected, then Scheffe's method is guaranteed to find at least one significant interaction contrast because max F[vec(CAB)] = T(CA,CB) = (a - l)(b - \)FAB. (3) For a proof of (3), see Johnson (1973). The associated simultaneous confidence intervals are given by trace(CABM) ± V(fl - l)(b - 1) F(V-ai)(b - iXvvar[trace(CABM)] ■ 10 Analysis of Interactions Family 2: All Product Interaction Contrasts If the set of interest consists of all product interaction contrasts, then the recommended test of the composite null, H0: C'AMCB = 0, is the maximal F test: reject H0 if R ^ Rp ~q,Familywise Type I error is controlled at a if a partial interaction null, H0:CAMcB = 0, is rejected whenever T(CA,cB) 2= Rp~q,av. Similarly, Ho : c'AMCB = 0 is rejected whenever T(cA, CB) > R p~q", and a product contrast null, H0: cA McB = 0, is rejected whenever T(cA,cB) ^ By construction, a significant maximal Ftest guarantees the existence of at least one significant product contrast. Significant partial interactions are also guaranteed because R is the maximal statistic for testing a partial interaction as well as the maximal Ffor a product contrast: R = max T(cA,cB) = max T(cA,CB) = max T(CA,cB), *a-cb *a cb and the maximization is over all coefficient vectors that sum to zero. Simultaneous confidence intervals for product contrasts are given by cAMcß ± Vi?>--var(c;Mcj,), for var(c^ McB) of (A6). The increase in sensitivity purchased by restricting attention to product contrasts can be gauged by comparing the Scheffé and SMR critical values. For example, if a = 6, b = 7, v = 100, and a = .05, then the Scheffé critical value for tests of interaction contrasts is 30 Fx,\ao = 47.197. The corresponding SMR critical value for product contrasts, from Boik (1986), is RfXioo = 25.571. The SMR simultaneous confidence intervals are only 100V25.571/47.197 = 74% as wide as the Scheffé intervals. Family 3: An A Priori Set of Partial Interactions Sensitivity is increased further if attention is restricted to a small set of a priori main effect contrasts and their associated interactions. The Bonfer-roni inequality provides a straightforward way of controlling the per family Type I error rate (an upper bound on the familywise error rate) in this situation. The procedure consists of allocating a portion of a to each a priori test in the family. Suppose, for example, that one of the factors^—say Factor B—has quantitative levels and that it is sensible to partition Factor B according to polynomial trend contrasts. For b = 3, the a priori main-ß hypotheses are H0: l'a McB(1) = 0 and Ho: Va McB(2) = 0, where cB(1) = (-1 0 1)' and cB(2) = (1 -2 1)'. If each of the a priori hypotheses is tested at the a/2 level, then the per family Type I error rate for Factor B is a. The interaction can be partitioned similarly. The questions to be answered are whether the linear effect of Factor B varies over the levels of A and whether the quadratic effect of Factor B varies over the levels of A. If 11 Boik Ho: CAMcB(1) = O and H0: CAMcB(2) = O are each tested at level a/2, then the per family Type I error rate for the interaction is a. The appropriate critical value for the test statistics r(C,i,cfl(i)) and T{CA,cfl(2)) is (a — 1) F\ - \ \- The critical value for follow-up tests of product contrasts also is (a-\)Flzfv. The gain in sensitivity purchased by restricting attention to the two trend contrasts can be gauged by comparing the critical values. Suppose, as above, that b = 3. Also, suppose that a = 5, v = 50, and a = .05. If all interaction contrasts are of interest, the critical value for an interaction test is 8 F$9550 = 17.040. If attention is restricted to product contrasts, the corresponding critical value is #2^50 = 13.876. Finally, if attention is restricted to the two trend contrasts, then the critical value is only 4F°%5 = 12.218. Extension of SMR Percentiles and Computation of P-Values If min(a,b) > 6 or max(a,b) > 7, the tables in Boik (1986) cannot be used. New percentage points corresponding to larger a and/or b are given in Table 1. The entries in Table 1 were extracted from a larger set of exact upper percentiles. The complete set is available from the author and includes denominator degrees of freedom 1(1)30,32(2)50,55(5)100,125,150, 200(100)1000, and 00. The upper percentiles in Table 1 were computed by using the mathematical results of Krishnaiah and Chang (1971) to evaluate Equation 4.1 in Boik (1986). Reasonably accurate interpolation between tabled values, R^'X < R\-* < can be accomplished as follows: d1~« ^ p1-« _l_/pl-0i _ /? 1 ~ a n —_^2 J n-p.q.v np,q,v2 "T" W,9,vi np,q,v2)\ -1 , —1 I- \Vi V2 / For example, the exact value of /?6,'*iso is 35.759; interpolation yields RtZso * 33.404 + (36.970 - 33.404)(|^_| ~ °j = 35.781. In practice, many researchers like to compute the p-value corresponding to an observed test statistic. Computation of exact p-values for the SMR distribution is quite complicated, but relatively simple approximations have been proposed. Johnson (1976) approximated the distribution of the numerator of R by a multiple of a \2 random variable. The multiplier and degrees-of-freedom parameters were obtained by matching the first two moments. Boik (1985) obtained a 3-moment approximation by matching the moments oiR to those of a multiple of an Frandom variable. Moment functions for using Boik's (1985) approximation are given in Table 2. Table 2 represents a simplification and extension of Table 1 in Boik (1985). The moment functions in Table 2 assume that v > 6 and are defined by _ (v ~ 2)E(J?) _ (v - 4)E(ft2) (v - 6)E(R3) "l — j "2 — 7--i\ri-/- r»\n2 ' anQ "3 —--- (v-2)[E(K)]2' "3 (v - 2)E(R)E(R2) 12 Analysis of Interactions The 3-moment F approximation to the SMR distribution is where 62(v - 2) - (v - 4) v2 = 6 + 4(v - 6) Vl = 93(v - 2)(v - 4) - 202(v - 2)(v - 6) + (v - 4)(v - 6). 2(v - 4)(v2 - 2) A1 _ BlV(v2 - 2) , and A; 02(v - 2)(v2 - 4) - (v - 4)(v2 - 2) v2(v - 2) If v s 6, Johnson's (1976) approximation can be obtained by letting v2 = v, vi = 2/(62 - 1), and k = 8i. Critical values are approximated by As an illustration, #2 12>36 = 39.330. The 3-moment approximation yields *40;?U = 22.15 FS&.32.17 = 39.296 and Pr(/?4,i2,36 s 39.330) - Pr(F37.75.32.17 ^ f^j) = -9503. Distribution of the Maximal F When Data Are Unbalanced Equation A7 in the Appendix gives a sufficient condition for R to follow the SMR distribution. The condition in (A7) is satisfied, for example, when there are no covariates and when data are balanced or sample sizes are proportional. It is not known if (A7) is a necessary condition. I suspect that R follows the SMR distribution regardless of lack of balance or presence of covariates. Of course, I could be wrong. For the case of unbalanced data without covariates, Boik (1989) showed, theoretically, that as sample size increases, R converges in distribution to the SMR distribution. Simulation evidence that the null distribution of R is accurately approximated by the SMR distribution for the case of unbalanced data with covariates is given in this section. A two-way classification with a = 6 and b = 7 was selected for the simulation. The condition in (A7) does not depend on a2 or error degrees of freedom, so, for convenience, cr2 was equated to 1 and assumed known. Each of 5,000 trials in the simulation consisted of (a) randomly generating a 30 x 30 covariance matrix, <&; (b) randomly generating a 30 x 1 vector, vec(CJiMCfl), from a multivariate normal distribution with mean 0 and variance 3>; and (c) computing the test statistic R. The covariance matrices were generated to represent a wide variety of structures not satisfying the sufficient condition in (A7). In each trial, the 13 TABLE 1 Upper percentiles of the studentized maximum root distribution v a 2,7 2,8 2,9 2,10 2,11 2,12 2,13 2,14 2,15 3,7 3,8 1 .05 2490.5 2805.5 3116.5 3424.4 3729.4 4032.1 4332.7 4631.5 4928.6 3146.6 3502.2 .01 62350. 70234. 78021. 85726. 93362. 100938 108463 115941 123379 78774. 87673. 2 .05 198.02 222.12 245.93 269.50 292.86 316.04 339.06 361.94 384.70 247.73 274.92 .01 1014.1 1137.1 1258.5 1378.8 1498.0 1616.3 1733.8 1850.5 1966.7 1267.5 1406.2 3 .05 89.147 99.681 110.09 120.38 130.59 140.72 150.79 160.79 170.74 110.70 122.56 .01 275.77 308.02 339.89 371.42 402.69 433.72 464.54 495.18 525.65 341.59 377.91 4 .05 60.246 67.197 74.064 80.860 87.597 94.282 100.92 107.52 114.09 74.374 82.198 .01 146.49 163.10 179.50 195.74 211.85 227.83 243.70 259.48 275.18 180.09 198.78 5 .05 47.659 53.053 58.379 63.651 68.876 74.061 79.211 84.331 89.423 58.560 64.622 .01 100.77 111.89 122.89 133.76 144.55 155.25 165.88 176.45 186.97 123.11 135.62 6 .05 40.748 45.284 49, .764 54.197 58.590 62.949 67.278 71.582 75.862 49.872 54.966 .01 78.659 87.147 95. .531 103.83 112.05 120.22 128.33 136.39 144.41 95.596 105.12 7 .05 36.413 40.410 44. .356 48.261 52.130 55.969 59.782 63.571 67.340 44.419 48.902 .01 65.935 72.910 79. .797 86.614 93.370 100.07 106.73 113.35 119.94 79.772 87.589 8 .05 33.450 37.077 40. .658 44.200 47.709 51.191 54.648 58.085 61.502 40.688 44.752 .01 57.766 63.770 69, .698 75.564 81.378 87.146 92.875 98.570 104.23 69.617 76.337 9 .05 31.300 34.658 37, ,972 41.250 44.497 47.718 50.917 54.095 57.256 37.978 41.737 .01 52.115 57.448 62, .713 67.921 73.081 78.201 83.287 88.341 93.368 62.592 68.554 10 .05 29.670 32.823 35, .935 39.011 42.058 45.081 48.082 51.064 54.029 35.922 39.448 .01 47.990 52.833 57, .612 62.340 67.023 71.669 76.283 80.869 85.429 57.464 62.871 12 .05 27.365 30.227 33, ,049 35.839 38.602 41.341 44.060 46.762 49.448 33.010 36.204 .01 42.392 46.570 50. .690 54.764 58.798 62.800 66.772 70.720 74.645 50.504 55.157 15 .05 25.214 27.802 30. .352 32.871 35.365 37.836 40.289 42.725 45.147 30.286 33.166 .01 37.425 41.011 44. ,544 48.035 51.491 54.917 58.316 61.693 65.050 44.326 48.305 20 .05 23.203 25.532 27.824 30.086 32.324 34.540 36.738 38.921 41.090 27.730 30.312 .01 33.013 36.070 39.078 42.047 44.984 47.892 50.777 53.641 56.487 38.830 42.206 30 .05 21.318 23.400 25.446 27.462 29.454 31.424 33.378 35.315 37.239 25.325 27.620 .01 29.086 31.669 34.206 36.704 39.172 41.613 44.031 46.430 48.811 33.931 36.763 50 .05 19.893 21.784 23.639 25.464 27.264 29.042 30.802 32.547 34.277 23.495 25.567 .01 26.256 28.493 30.684 32.838 34.961 37.056 39.129 41.183 43.218 30.392 32.824 100 .05 18.868 20.620 22.334 24.017 25.674 27.309 28.924 30.523 32.108 22.172 24.077 .01 24.297 26.293 28.242 30.152 32.031 33.881 35.708 37.515 39.302 27.937 30.087 00 .05 17.878 19.492 21.066 22.607 24.121 25.611 27.080 28.531 29.965 20.886 22.624 .01 22.467 24.234 25.953 27.631 29.276 30.891 32.480 34.046 35.592 25.639 27.518 3,9 3,10 3,11 3,12 3,13 3,14 3,15 4,7 4,8 4,9 4,10 1 .05 3851.0 4194.4 4533.1 4867.7 5198.9 5527.1 5852.5 3724.3 4111.9 4490.7 4862.1 .01 96405. 104999 113477 121855 130145 138359 146505 93234. 102936 112416 121714 2 .05 301.60 327.87 353.79 379.41 404.76 429.89 454.81 291.67 321.30 350.26 378.68 .01 1542.3 1676.4 1808.6 1939.4 2068.8 2197.0 2324.1 1491.6 1642.7 1790.5 1935.5 3 .05 134.22 145.69 157.01 168.20 179.28 190.26 201.15 129.80 142.73 155.37 167.77 .01 413.59 448.71 483.39 517.66 551.59 585.21 618.56 399.98 439.56 478.26 516.25 4 .05 89.880 97.445 104.91 112.29 119.60 126.84 134.02 86.922 95.442 103.77 111.95 .01 217.13 235.21 253.05 270.69 288.15 305.46 322.63 209.99 230.34 250.24 269.78 5 .05 70.576 76.440 82.227 87.949 93.613 99.227 104.80 68.254 74.854 81.310 87.646 .01 147.90 159.99 171.94 183.74 195.44 207.02 218.52 143.04 156.64 169.96 183.03 6 .05 59.969 64.895 69.758 74.565 79.325 84.041 88.720 57.996 63.538 68.960 74.283 .01 114.48 123.70 132.80 141.80 150.71 159.54 168.31 110.72 121.09 131.23 141.18 7 .05 53.305 57.642 61.922 66.153 70.342 74.493 78.611 51.553 56.429 61.199 65.882 .01 95.269 102.83 110.30 117.69 125.00 132.25 139.45 92.146 100.64 108.96 117.13 8 .05 48.744 52.674 56.554 60.389 64.185 67.948 71.681 47.142 51.560 55.882 60.125 .01 82.940 89.444 95.866 102.22 108.50 114.74 120.92 80.225 87.526 94.674 101.69 9 .05 45.428 49.063 52.649 56.195 59.706 63.185 66.636 43.936 48.019 52.014 55.936 .01 74.411 80.180 85.876 91.509 97.086 102.61 108.10 71.978 78.451 84.789 91.012 TABLE 1 (Continued,) p,q v a 3,9 3,10 3,11 3,12 3,13 3,14 3,15 4,7 4,8 4,9 4,10 10 .05 42.910 46.319 49.683 53.008 56.300 59.562 62.798 41.501 45.330 49.075 52.752 .01 68.182 73.415 78.580 83.687 88.745 93.758 98.731 65.956 71.824 77.568 83.209 12 .05 39.339 42.426 45.471 48.481 51.461 54.413 57.342 38.048 41.513 44.902 48.229 .01 59.726 64.226 68.669 73.061 77.409 81.720 85.996 57.781 62.823 67.759 72.607 15 .05 35.992 38.774 41.517 44.229 46.912 49.571 52.208 34.813 37.933 40.984 43.978 .01 52.212 56.058 59.854 63.606 67.320 71.001 74.653 50.517 54.822 59.035 63.173 20 .05 32.844 35.334 37.789 40.215 42.615 44.992 47.349 31.769 34.560 37.288 39.964 .01 45.518 48.776 51.990 55.166 58.308 61.422 64.510 44.047 47.690 51.252 54.750 30 .05 29.868 32.076 34.252 36.401 38.525 40.627 42.712 28.893 31.365 33.780 36.147 .01 39.536 42.262 44.947 47.598 50.219 52.814 55.386 38.269 41.309 44.281 47.195 50 .05 27.592 29.579 31.534 33.462 35.367 37.251 39.117 26.694 28.915 31.082 33.203 .01 35.200 37.530 39.823 42.083 44.314 46.522 48.707 34.081 36.677 39.208 41.688 100 .05 25.936 27.756 29.545 31.306 33.044 34.761 36.459 25.095 27.127 29.106 31.041 .01 32.181 34.230 36.242 38.221 40.173 42.099 44.004 31.169 33.447 35.664 37.831 00 .05 24.315 25.966 27.584 29.174 30.738 32.280 33.803 23.530 25.371 27.157 28.899 .01 29.342 31.121 32.860 34.565 36.241 37.891 39.517 28.433 30.406 32.318 34.179 4,11 4,12 4,13 4,14 4,15 5,7 5,8 5,9 5,10 5,11 5,12 1 .05 5227.4 5587.4 5942.8 6294.1 6641.9 4256.1 4670.9 5075.1 5470.4 5858.3 6239.8 .01 130857 139868 148764 157558 166263 106543 116928 127045 136941 146650 156199 2 .05 406.63 434.18 461.38 488.27 514.90 332.20 363.91 394.81 425.05 454.73 483.92 .01 2078.1 2218.7 2357.5 2494.7 2630.6 1698.3 1860.1 2017.7 2172.0 2323.5 2472.4 3 .05 179.98 192.01 203.90 215.65 227.28 147.44 161.28 174.76 187.96 200.92 213.67 .01 553.62 590.47 626.86 662.85 698.48 453.95 496.30 537.59 578.01 617.69 656.73 4 .05 120.00 127.93 135.77 143.52 151.19 98.525 107.64 116.53 125.23 133.77 142.18 .01 289.01 307.97 326.70 345.23 363.57 237.67 259.44 280.67 301.46 321.87 341.96 5 .05 93.882 100.03 106.10 112.11 118.06 77.228 84.287 91.172 97.913 104.53 111.05 .01 195.90 208.59 221.12 233.52 245.80 161.50 176.06 190.26 204.16 217.82 231.26 6 .05 79.520 84.686 89.788 94.833 99.830 65.521 71.447 77.228 82.889 88.448 93.919 .01 150.99 160.66 170.21 179.66 189.01 124.76 135.84 146.65 157.24 167.64 177.88 7 .05 70.490 75.035 79.525 83.965 88.362 58.164 63.377 68.462 73.441 78.331 83.145 .01 125.17 133.11 140.94 148.70 156.37 103.64 112.72 121.59 130.27 138.81 147.21 8 .05 64.301 68.419 72.487 76.511 80.495 53.125 57.847 62.453 66.964 71.394 75.755 .01 108.61 115.42 122.16 128.82 135.42 90.081 97.884 105.50 112.96 120.29 127.51 9 .05 59.796 63.603 67.364 71.083 74.765 49.460 53.823 58.080 62.249 66.343 70.373 .01 97.141 103.19 109.16 115.07 120.92 80.703 87.618 94.369 100.98 107.48 113.88 10 .05 56.371 59.940 63.465 66.952 70.404 46.676 50.765 54.755 58.662 62.500 66.278 .01 88.764 94.244 99.658 105.02 110.32 73.853 80.119 86.237 92.231 98.121 103.92 12 .05 51.503 54.732 57.921 61.076 64.199 42.723 46.421 50.030 53.563 57.034 60.451 .01 77.380 82.089 86.742 91.345 95.904 64.551 69.931 75.185 80.334 85.393 90.374 15 .05 46.925 49.831 52.701 55.540 58.350 39.013 42.340 45.586 48.765 51.887 54.961 .01 67.247 71.265 75.236 79.163 83.053 56.279 60.868 65.348 69.739 74.053 78.302 20 .05 42.598 45.193 47.757 50.292 52.801 35.516 38.488 41.386 44.225 47.012 49.755 .01 58.192 61.587 64.941 68.258 71.543 48.903 52.778 56.560 60.267 63.908 67.493 30 .05 38.475 40.768 43.032 45.270 47.485 32.199 34.826 37.386 39.892 42.351 44.771 .01 50.061 52.886 55.676 58.433 61.163 42.300 45.525 48.671 51.751 54.776 57.753 50 .05 35.287 37.339 39.363 41.362 43.340 29.652 32.005 34.295 36.534 38.731 40.891 .01 44.123 46.520 48.885 51.221 53.531 37.502 40.243 42.912 45.523 48.083 50.602 100 .05 32.938 34.804 36.643 38.457 40.250 27.790 29.934 32.018 34.053 36.046 38.004 .01 39.955 42.042 44.098 46.125 48.128 34.154 36.547 38.873 41.143 43.366 45.549 00 .05 30.603 32.274 33.917 35.535 37.130 25.957 27.887 29.756 31.576 33.355 35.097 .01 35.997 37.777 39.526 41.245 42.939 30.998 33.053 35.041 36.975 38.862 40.709 5,13 5,14 5,15 6,6 6,7 6,8 6,9 6,10 6,11 6,12 6,13 1 .05 6615.8 6986.8 7353.5 4302.3 4756.9 5195.9 5622.5 6039.0 6446.8 6847.3 7241.4 TABLE 1 (Continued) p,q V a 5, 13 5, 14 5, 15 6, 6 6, 7 6, 8 6, 9 6, 10 6, 11 6, 12 6, 13 .01 165609 174897 184077 107702 119079 130068 140748 151172 161381 171406 181271 2 .05 512.69 541.10 569.17 335.70 370.42 403.97 436.59 468.44 499.64 530.29 560.45 .01 2619.2 2764.2 2907.5 1716.1 1893.2 2064.4 2230.9 2393.4 2552.6 2709.0 2862.9 3 .05 226.24 238.65 250.91 148.96 164.10 178.73 192.97 206.87 220.50 233.88 247.06 .01 695.22 733.22 770.79 458.57 504.91 549.72 593.30 635.88 677.60 718.58 758.93 4 .05 150.47 158.65 166.74 99.518 109.49 119.13 128.51 137.68 146.66 155.48 164.17 .01 361.76 381.32 400.66 240.03 263.83 286.86 309.26 331.16 352.62 373.71 394.47 5 .05 117.47 123.81 130.08 77.992 85.709 93.176 100.44 107.54 114.50 121.34 128.07 .01 244.51 257.60 270.55 163.07 178.97 194.37 209.35 224.00 238.35 252.46 266.35 6 .05 99.313 104.64 109.91 66.159 72.636 78.904 85.004 90.965 96.808 102.55 108.20 .01 187.98 197.96 207.82 125.94 138.04 149.76 161.17 172.32 183.26 194.01 204.59 7 .05 87.891 92.578 97.212 58.723 64.418 69.930 75.295 80.538 85.678 90.729 95.702 .01 155.49 163.67 171.77 104.60 114.52 124.12 133.48 142.62 151.59 160.41 169.09 8 .05 80.055 84.302 88.501 53.629 58.786 63.778 68.637 73.386 78.042 82.618 87.124 .01 134.63 141.67 148.62 90.904 99.419 107.67 115.70 123.56 131.27 138.84 146.30 9 .05 74.348 78.272 82.153 49.924 54.688 59.300 63.790 68.178 72.481 76.709 80.874 .01 120.20 126.43 132.60 81.428 88.972 96.282 103.40 110.37 117.19 123.91 130.52 10 .05 70.003 73.682 77.320 47.109 51.573 55.895 60.103 64.215 68.248 72.211 76.114 .01 109.64 115.29 120.88 74.508 81.341 87.963 94.413 100.72 106.91 112.99 118.99 12 .05 63.820 67.148 70.438 43.113 47.148 51.055 54.859 58.578 62.224 65.807 69.336 .01 95.289 100.14 104.95 65.108 70.972 76.656 82.194 87.610 92.924 98.149 103.30 15 .05 57.991 60.984 63.944 39.361 42.989 46.503 49.923 53.266 56.545 59.767 62.941 .01 82.493 86.633 90.729 56.749 61.746 66.590 71.309 75.926 80.455 84.910 89.298 20 .05 52.460 55.131 57.772 35.824 39.062 42.197 45.249 48.232 51.156 54.031 56.862 .01 71.030 74.524 77.979 49.294 53.509 57.594 61.575 65.468 69.288 73.044 76.744 30 .05 47.157 49.512 51.840 32.468 35.327 38.093 40.784 43.414 45.992 48.525 51.019 .01 60.689 63.589 66.456 42.620 46.122 49.514 52.818 56.048 59.216 62.330 65.398 50 .05 43.018 45.118 47.192 29.890 32.446 34.917 37.320 39.665 41.963 44.220 46.441 .01 53.083 55.532 57.952 37.768 40.739 43.612 46.406 49.136 51.811 54.439 57.026 100 .05 39.931 41.830 43.706 28.004 30.330 32.575 34.755 36.880 38.959 41.000 43.006 .01 47.696 49.814 51.903 34.382 36.970 39.468 41.892 44.257 46.570 48.839 51.071 00 .05 36.808 38.492 40.150 26.146 28.235 30.244 32.188 34.079 35.925 37.731 39.503 .01 42.521 44.301 46.054 31.189 33.404 35.533 37.591 39.591 41.540 43.447 45.315 6,14 6,15 7,7 7,8 7,9 7,10 7,11 7,12 7,13 7,14 7,15 1 .05 7629.9 8013.4 5235.1 5696.0 6143.0 6578.6 7004.5 7422.2 7832.8 8237.0 8635.6 .01 190995 200594 131049 142586 153775 164678 175340 185796 196072 206191 216170 2 .05 590.19 619.55 406.94 442.17 476.34 509.66 542.24 574.21 605.63 636.57 667.09 .01 3014.6 3164.5 2079.6 2259.3 2433.7 2603.7 2769.9 2933.0 3093.4 3251.3 3407.0 3 .05 260.05 272.87 180.02 195.39 210.30 224.85 239.07 253.03 266.76 280.27 293.61 .01 798.71 837.99 553.66 600.70 646.36 690.89 734.46 777.20 819.23 860.63 901.46 4 .05 172.73 181.19 119.98 130.10 139.93 149.51 158.89 168.09 177.14 186.05 194.84 .01 414.94 435.16 288.87 313.04 336.52 359.42 381.83 403.82 425.45 446.75 467.77 5 .05 134.71 141.26 93.828 101.67 109.28 116.70 123.97 131.10 138.11 145.01 151.83 .01 280.06 293.59 195.70 211.86 227.56 242.87 257.87 272.58 287.05 301.31 315.37 6 .05 113.78 119.29 79.449 86.027 92.417 98.650 104.75 110.74 116.63 122.43 128.15 .01 215.03 225.34 150.77 163.07 175.02 186.69 198.10 209.31 220.34 231.20 241.92 7 .05 100.61 105.45 70.407 76.192 81.811 87.293 92.660 97.927 103.11 108.21 113.25 .01 177.66 186.12 124.95 135.03 144.83 154.39 163.76 172.95 181.99 190.90 199.70 8 .05 91.569 95.958 64.209 69.447 74.536 79.501 84.363 89.134 93.827 98.450 103.01 .01 153.66 160.93 108.37 117.03 125.45 133.66 141.71 149.60 157.37 165.03 172.59 9 .05 84.981 89.038 59.697 64.536 69.238 73.825 78.317 82.726 87.062 91.335 95.551 .01 137.05 143.49 96.905 104.57 112.03 119.31 126.44 133.44 140.33 147.12 153.82 10 .05 79.964 83.766 56.266 60.800 65.205 69.505 73.714 77.846 81.910 85.915 89.867 .01 124.90 130.74 88.524 95.471 102.22 108.82 115.28 121.62 127.86 134.02 140.09 VD r- O tj- o 104.72 o oo m Cvl r- VO VO vo C-4 vO O Cvl ^t-m o O Ov Os Ov r- VO r—t VO C-4 O r~ *-t oo (N T* r- vo vo Ov oo OV i/l VO r~ Tt-m vo VO Os 00 m m Tf O d tj-VO Os 00 oo Cvl u-> vO VO 00 m m U") O Tf VD tj" o VO r—* o Ö o t—t Os oo m m C-4 O 00 r—t Tj" 00 t); *-1 m tj- O o m VO Ov C-4 t-; o Cvl VO OV m 00 55 r-' vo O VO VO oo in m Tj"' Ov tJ- cí vo in Tí U") T—f Tí- Ov tj" oó ro Tj" tj- m in fi C") m m VO m Cvl m 00 n o in CN (N m m tJ- r~ o m VO tj-Os in r—f oo (N m o vo O VO (N in OV r~ tj" vo m Cvl tj" Os VO tj- C-l Ov 00 vo o Ov o o r~ m Ov oo Cvl o Tj-oo t-; VO O CJ i^i tí- C-4 VO (N Ov Tt T—f O ov m 00 00 ■t' U") r-' oó tj- vo vo tj- VD m Ö Tt Ov ■t m ? Tt d Tj- 00 tj- m OV O tj-00 tj- r~ o m OV O (N T—í Cvl m C") o o r~ 00 OV o VO Cvl oo tj- o Ov 00 m 00 o (N r~ m (N vo r—f m m r~ VO i/i m (N Tt (N m r~ Os oo p-> m o oo ■t Os Os T—i SD tJ- m Ov (N f) Ov oo r~ o (S o tí-00 r~ m r~ C-4 VO VO m o tj- Ov VO m vo tJ- VO VO (N tj- m oo ro Ov m ro tT C-J <^> Ov m o u-> m VO m (N Os m vo VO O Os (N tJ- o o Ov O Ov Cvl Ov vo Ov O C-4 m Os r-tj" T—i VO Ov r~ Ov VO r~ T—t Ov VO f~~ Os C-4 VO tj-oo m o m (S VO vo m m C-4 tj- oo tj" oo 00 00 oó r~ O oo (N vq U") vo vo Ov ro Os r-tj; in (Nl r—t vq r~ r~ ití (N 00 Ov Ov VO (N m Tj-(N m (S r~ O VO VO ro Ov Ov in Ö 00 m oó vO oó oš in Tf Tf ro in Tt tj- 1/1 o o m o T—í o m o o m o O m o O m o o m o O (N m r—t o Cvl o o m o o 8 Os oo o ro Tí o m 00 Ov Cvl O oó tj; p 00 (N ro n-j c-i (N m vo VO o Ov Cv| ro ro OV C-4 tj- m VO O OV 00 VO Ov vo 00 r-' VO r~ r-; O d r~ Ov Ov oó 00 Cvl oo tj" Ov Os 00 Cvl vo ro Cvl Ov r—H m O O m Cv| oo f) vo Os m Os o oó u-i vd Ov tj- CO 00 00 oo C-4 VO CO Cvl 00 o oo CO tj; r~ CO r~ Cvi T, VI (N CO CO Os VO tj" d Ov Os 00 r- VO ro Cvl oo ,—' in VO in C-4 oo VO C4 oó O r~ o oó oo Ov VO C") oo VO oó d 1^1 00 oo Ov m oo m (N C-4 Ov r~ Ov Ti; o Ov u-i CN VD in vo 00 Ov r~ Ov O (N d O tj" oo tj- vO 1^- ITi (N Cvl r- ,—' 1^- Tt CO in O (N oo C-4 oó OV Ov Tj- VO Tj- C-4 vd r~ d VO VD T—f VD Cvl Ov m VO ' 1 m (N C-4 VO O VO OV m Ov VO Cvl VD Ov vo tj" t-; vq r- tj- 00 tj" t—i Ov d r- Tf tj- vO tj- Cvl (N vO in T—i i/l o O o O O O O .01 338.27 362.73 386.56 409.84 432.67 455.09 477.16 498.91 5 .05 109.84 117.77 125.50 133.05 140.45 147.72 154.87 161.92 .01 228.73 245.08 261.01 276.59 291.86 306.86 321.63 336.19 6 .05 92.892 99.549 106.03 112.37 118.59 124.69 130.70 136.62 .01 175.91 188.36 200.49 212.35 223.99 235.42 246.67 257.76 7 .05 82.227 88.081 93.784 99.359 104.83 110.20 115.48 120.69 .01 145.55 155.76 165.71 175.43 184.98 194.35 203.58 212.68 8 .05 74.912 80.213 85.378 90.429 95.380 100.24 105.03 109.76 .01 126.06 134.83 143.37 151.73 159.93 167.98 175.92 183.74 9 .05 69.584 74.481 79.253 83.919 88.494 92.990 97.416 101.78 .01 112.58 120.34 127.91 135.32 142.59 149.73 156.76 163.69 10 .05 65.530 70.118 74.590 78.962 83.250 87.463 91.610 95.700 .01 102.72 109.75 116.61 123.32 129.90 136.37 142.75 149.03 12 .05 59.760 63.907 67.949 71.901 75.777 79.586 83.336 87.033 .01 89.312 95.348 101.24 107.00 112.65 118.20 123.67 129.07 15 .05 54.324 58.050 61.682 65.235 68.718 72.142 75.513 78.837 .01 77.364 82.505 87.519 92.427 97.241 101.98 106.64 111.24 20 .05 49.167 52.489 55.726 58.893 61.998 65.050 68.056 71.019 .01 66.666 70.996 75.220 79.354 83.411 87.400 91.329 95.204 30 .05 44.230 47.153 50.002 52.789 55.522 58.207 60.852 63.459 .01 57.025 60.609 64.105 67.527 70.884 74.185 77.437 80.644 50 .05 40.383 42.984 45.518 47.995 50.424 52.810 55.158 57.474 .01 49.945 52.963 55.906 58.784 61.607 64.382 67.115 69.810 100 .05 37.519 39.868 42.154 44.387 46.574 48.721 50.834 52.915 .01 44.941 47.543 50.076 52.551 54.976 57.357 59.700 62.008 00 .05 34.634 36.710 38.725 40.689 42.608 44.488 46.334 48.150 .01 40.149 42.330 44.446 46.507 48.519 50.490 52.423 54.323 TABLE 2 Moment functions for approximating the SMR distribution p,q 6i e2 e3 6i e2 e3 2,2 3.5708 1. 5234 2.0571 3,14 22. .0874 1 .0655 1 .1330 2,3 5.0000 1. 3600 1.7294 3,15 23. .3654 1 .0616 1 .1251 2,4 6.3562 1. 2762 1.5604 3,16 24. .6338 1 .0581 1. .1180 2,5 7.6667 1. 2250 1.4565 3,17 25. .8937 1 .0550 1 .1117 2,6 8.9452 1. 1902 1.3859 3,18 27 .1458 1 .0523 1 .1061 2,7 10.2000 1. 1649 1.3347 3,19 28 .3908 1 .0498 1 .1010 2,8 11.4361 1. .1458 1.2957 3,20 29 .6292 1 .0475 1 .0964 2,9 12.6571 1. .1307 1.2651 3,21 30 .8615 1 .0455 1 .0922 2,10 13.8656 1 .1185 1.2403 3,22 32 .0882 1 .0436 1 .0884 2,11 15.0635 1 .1085 1.2199 3,23 33. .3097 1 .0419 1 .0849 2,12 16.2522 1 .1000 1.2027 3.24 34 .5263 1 .0403 1 .0817 2,13 17.4329 1. .0928 1.1881 3,25 35. .7383 1 .0388 1 .0787 2,14 18.6065 1 .0866 1.1755 3,26 36 .9460 1 .0375 1 .0759 2,15 19.7739 1 .0812 1.1645 3,27 38 .1497 1 .0362 1 .0733 2,16 20.9356 1 .0765 1.1548 3,28 39 .3495 1 .0350 1 .0709 2,17 22.0922 1 .0722 1.1462 3,29 40 .5458 1 .0339 1 .0687 2,18 23.2441 1 .0685 1.1385 3,30 41. .7386 1 .0329 1 .0665 2,19 24.3917 1 .0651 1.1316 4,4 10 .1312 1 .1549 1 .3159 2,20 25.5354 1 .0620 1.1254 4,5 11. .8210 1 .1286 1 .2624 2,21 26.6755 1 .0592 1.1198 4,6 13 .4368 1 .1104 1 .2253 2,22 27.8122 1 .0567 1.1146 4,7 14 .9982 1 .0970 1 .1979 2,23 28.9457 1 .0544 1.1099 4,8 16 .5173 1 .0867 1 .1767 2,24 30.0764 1 .0522 1.1055 4,9 18 .0024 1 .0785 1 .1599 2,25 31.2042 1 .0502 1.1015 4,10 19 .4593 1 .0717 1 .1462 2,26 32.3295 1 .0484 1.0978 4,11 20 .8926 1 .0661 1 .1347 2,27 33.4524 1 .0467 1.0944 4,12 22. .3054 1. .0614 1. .1250 2,28 34.5730 1 .0451 1.0912 4,13 23. .7004 1 .0573 1 .1166 2,29 35.6914 1 .0437 1.0882 4,14 25. .0797 1 .0538 1 .1094 2,30 36.8077 1 .0423 1.0854 4,15 26. .4452 1 .0507 1. .1030 3,3 6.7321 1 .2527 1.5140 4,16 27. .7981 1. .0479 1. ,0974 3,4 8.3333 1 .1968 1.4009 4,17 29, .1397 1, .0454 1. .0924 3,5 9.8547 1 .1621 1.3303 4,18 30, .4710 1. .0432 1. ,0878 3,6 11.3210 1 .1383 1.2817 4,19 31. ,7929 1. .0412 1. ,0838 3,7 12.7465 1 .1208 1.2461 4,20 33. .1061 1. .0394 1. ,0801 3,8 14.1404 1 .1075 1.2189 4,21 34. ,4113 1. .0378 1, ,0767 3,9 15.5086 1 .0969 1.1972 4,22 35. ,7092 1. ,0363 1. ,0736 3,10 16.8557 1 .0883 1.1797 4,23 37. .0001 1. .0349 1. ,0708 3,11 18.1849 1 .0811 1.1651 4,24 38. ,2846 1. 0336 1, ,0682 3,12 19.4986 1 .0751 1.1527 4,25 39. 5631 1. 0324 1. ,0657 3,13 20.7990 1 .0699 1.1422 4,26 40. 8359 1. .0313 1. ,0635 TABLE 2 (Continued) 6! 8: z e 3 p,q 6i e2 83 4,27 42. .1033 1. .0303 1. .0614 6,16 33.2982 1.0364 1 .0741 4,28 43. .3658 1. .0293 1 .0594 6,17 34 .7707 1 .0347 1 .0705 4,29 44, .6235 1 .0284 1 .0576 6,18 36 .2289 1 .0331 1 .0672 4,30 45, .8767 1 .0276 1 .0559 6,19 37, .6739 1 .0316 1 .0642 5,5 13 .6547 1 .1074 1 .2193 6,20 39 .1070 1 .0303 1 .0615 5,6 15 .3982 1 .0927 1 .1892 6,21 40 .5290 1 .0291 1 .0590 5,7 17 .0754 1 .0818 1 .1669 6,22 41. .9407 1 .0280 1 .0568 5,8 18 .7013 1 .0733 1 .1496 6,23 43 .3428 1 .0269 1 .0547 5,9 20 .2858 1 .0666 1 .1358 6,24 44 .7359 1 .0260 1 .0527 5,10 21 .8364 1 .0610 1 .1244 6,25 46 .1208 1 .0251 1 .0509 5,11 23 .3581 1 .0564 1 .1149 6,26 47, .4977 1 .0243 1 .0493 5,12 24 .8552 1 .0525 1 .1069 6,27 48 .8673 1 .0235 1 .0477 5,13 26 .3308 1 .0491 1 .0999 6,28 50 .2299 1 .0228 1 .0462 5,14 27, .7875 1 .0461 1 .0939 6,29 51 .5859 1 .0221 1 .0449 5,15 29 .2273 1 .0435 1 .0886 6,30 52, .9357 1 .0215 1 .0436 5,16 30 .6520 1 .0412 1 .0839 7,7 20 .9073 1 .0631 1 .1288 5,17 32, .0631 1 .0392 1 .0796 7,8 22, .7133 1 .0569 1 .1160 5,18 33, .4617 1 .0373 1 .0758 7,9 24 .4657 1 .0519 1 .1058 5,19 34 .8490 1 .0356 1 .0724 7,10 26 .1740 1 .0478 1 .0973 5,20 36 .2259 1 .0341 1 .0693 7,11 27, .8450 1 .0443 1 .0902 5,21 37, .5931 1 .0327 1 .0665 7,12 29, .4842 1. .0413 1. .0842 5,22 38 .9514 1 .0314 1 .0638 7,13 31, .0955 1 .0388 1. .0790 5,23 40 .3013 1 .0303 1 .0614 7,14 32, .6824 1 .0366 1. .0744 5,24 41 .6435 1 .0292 1 .0592 7,15 34, .2475 1 .0346 1 .0703 5,25 42, .9785 1 .0282 1 .0572 7,16 35, .7932 1 .0328 1. .0667 5,26 44 .3066 1 .0272 1 .0552 7,17 37, .3213 1 .0312 1. .0635 5,27 45, .6283 1 .0263 1 .0534 7,18 38. .8333 1 .0298 1. .0606 5,28 46, .9439 1 .0255 1 .0518 7,19 40. ,3307 1 .0285 1. .0580 5,29 48, .2538 1 .0248 1. .0502 7,20 41. ,8146 1 .0274 1. .0556 5,30 49, .5583 1 .0240 1 .0487 7,21 43. .2862 1 .0263 1. .0534 6,6 17, .2548 1 .0803 1 .1639 7,22 44. .7462 1. .0253 1. .0514 6,7 19, .0346 1 .0711 1 .1451 7,23 46. 1955 1. .0244 1. .0495 6,8 20, .7549 1 .0639 1 .1304 7,24 47. 6348 1. .0235 1, .0478 6,9 22. .4276 1, .0582 1. .1187 7,25 49. 0647 1. .0228 1. .0462 6,10 24. .0609 1. .0535 1. .1090 7,26 50. 4859 1. ,0220 1. .0447 6,11 25. 6610 1, ,0495 1. .1009 7,27 51. 8988 1. .0213 1. .0433 6,12 27. ,2326 1, ,0461 1. .0940 7,28 53. 3039 1. 0207 1. ,0420 6,13 28. ,7794 1. .0432 1, .0880 7,29 54. 7017 1. .0201 1. .0408 6,14 30. 3044 1. .0407 1. .0828 7,30 56. 0924 1. 0195 1. .0396 6,15 31. 8100 1. .0384 1. .0782 8,8 24. 5981 1. 0514 1. ,1048 Boik TABLE 2 (Continued) p,q 6i e2 e3 p,q 8i e2 e3 8,9 26.4240 1. .0470 1 .0957 8,20 44. .3890 1. .0250 1 .0509 8,10 28.2013 1. .0433 1 .0882 8,21 45. .9061 1. .0241 1 .0489 8,11 29.9376 1 .0402 1 .0819 8,22 47. .4105 1. .0232 1 .0471 8,12 31.6389 1 .0376 1 .0765 8,23 48 .9033 1. .0224 1 .0454 8,13 33.3096 1 .0353 1 .0718 1 .0677 8,24 50. .3850 1. .0216 1 .0438 8,14 34.9534 1 .0333 8,25 51 .8565 1. .0209 1 .0424 8,15 36.5734 1 .0315 1 .0641 8,26 53, .3183 1. .0202 1 .0410 8,16 38.1719 1 .0300 1 .0609 8,27 54 .7711 1. .0196 1 .0398 8,17 39.7511 1 .0285 1 .0580 8,28 56 .2153 1. .0190 1 .0386 8,18 41.3127 1 .0273 1 .0554 8,29 57. .6514 1. .0185 1 .0375 8,19 42.8582 1 .0261 1 .0530 8,30 59 .0799 1 .0180 1 .0365 covariance matrix was generated as the sum of two component matrices: = I30 + S. The first is an identity matrix and would be the only component if data were balanced and no covariates were present. The second component is a random matrix with distribution 31 x S ~ W30(31,I). The second component reflects the contribution of unbalanced data and covariates to . This method of generating the s yields covariance matrices more deviant from (A7) than those likely to be encountered in practice. Figure 1 presents an empirical cumulative distribution plot of the results of the simulation experiment. Also plotted is a simultaneous 95% acceptance region for testing the hypothesis that R follows the SMR distribution. The acceptance region is based on inverting the Kolmogorov test. The entire empirical distribution function falls inside the 95% acceptance region. The computed Kolmogorov statistic is .0152 (p ~ .12). As Figure 1 shows, the R percentiles are accurately approximated by the SMR percentiles. For example, 94.78% of the R statistics were smaller than the 95th SMR percentile, R 5G;965»=23.954, and 98.76% of the R statistics were smaller than the 99th SMR percentile, fl5°;?*=28.862. Maximal Product Contrast F Versus Most Significant Parametric Function A competing strategy for selecting interaction contrasts for further examination after rejecting the composite null was described by Lutz and Cundari (1987). If the composite null is rejected by the LRT, they suggested examining the coefficient matrix, CAB, that maximizes F[vec(CAB)] in (1). The corresponding interaction contrast is necessarily significant according to Scheffe's (1953) method because of (3). Direct interpretation of the maxi- 24 Analysis of Interactions 35.0 40.0 Maximal F Statistic: R FIGURE 1. Empirical distribution function of the maximal F. The area between the upper and lower curves is a 95% acceptance region for testing that the maximal F follows the SMR distribution mizing coefficient matrix is likely to be elusive, so they simplify the coefficients (by rescaling and rounding) and interpret the simplified interaction contrast. To illustrate their approach, Lutz and Cundari used a study conducted by Beatty (1984). Learning disabled (LD) students from Grades 3, 4, and 5 were assigned to treatment (summer reading program) or control groups. Non-LD students from each grade also served as controls. The data were analyzed according to a 3 (Grades 3, 4, and 5) x 3 (LD treatment, LD control, non-LD control) fixed effects model. The interactionp-value from the LRT was 0.043. The maximizing coefficient matrix and its simplification are / 25.207 - 0.603 - 24.603\ / 0.5 0.0 -0.5\ CAB = 20.462 -21.231 0.769 « C*AB = 50 x 0.5 -0.5 0.0. \-45.669 21.834 23.834/ \-1.0 0.5 0.5/ The simplified interaction contrast, trace (CAB'M), is also significant, but its meaning is still elusive. To interpret the interaction, Lutz and Cundari further simplify the coefficient matrix to 25 Boik I 0.50 -0.25 -0.25\ 50-1C*ab~CTb = \ 0.50 -0.25 -0.25. \-1.00 0.50 0.50/ The resulting interaction contrast, trace(C^a'M), is not significant according to Scheffe's (1953) method, but Lutz and Cundari were able to make an interpretation: The difference between fifth-grade students and the average of third- and fourth-grade students depends on whether students participated in the summer reading program. Note that the contrast that Lutz and Cundari were finally able to interpret is a product contrast. The row (grade) coefficient vector is * = (0.5 0.5 -1.0)', and the column (group) coefficient vector is c%* = (1.0 -0.5 -0.5)'. Apparently, the nonproduct contrasts were uninterpretable. Reanalysis of the data by the proposed method leads to the same contrast, but it does so more directly. The computed test statistic is R = 9.38 which, by coincidence, has the same p-value as the LRT (p = 0.043). The maximizing vectors in (2) are c„ = (0.46 0.35 -0.81)' and cfl = (0.81 -0.37 -0.44)'. Simplification yields c%* and c$*■ Furthermore, the product interaction c^*'Mc|* is significant by the proposed method: T(eX*,ci*) = 9.33, p = 0.044. Analyses of Interaction With SAS and SPSS Project TALENT Project TALENT was a large scale survey conducted to assess the abilities, interests, and personality characteristics of American high-school students. The present analysis is concerned with modeling interest in physical science as a function of size of high school (4 levels), geographic region of the country (9 levels), plans for attending college (5 levels), and gender. Socioeconomic status, results of a mathematics test, and results of a mechanical reasoning test served as covariates. Cooley and Lohnes (1971, Appendix B) list a subset of measures from 505 high-school seniors enrolled in the project (a 2% random sample of all enrolled seniors). Female case 215 was dropped because of missing data. The number of high-school sizes was reduced to three by merging students from the smallest high schools (n = 9) with students from the second smallest high schools (n = 144). The number of geographic regions was reduced to eight by merging students from Alaska and Hawaii (n = 2) with students from the far western states (n = 41). Preliminary tests suggested that some two-factor, all three-factor, and the four-factor interactions can be eliminated from the model. An ancova based on the reduced model is summarized in Table 3. All sums of squares are SAS Type III. Most of the families are significant and, in practice, would merit follow-up tests. For present purposes, attention is focused on the college plans main effect and the plans x size of high-school interaction. 26 Analysis of Interactions TABLE 3 ancova summary table of physical science interest inventory Source SS df MS F P -Value Covariates 3534. ,44 3 1178.15 24.79 P < 0.01 Mathematics test 1124, ,83 1 1124.83 23.67 P <0.01 Mechanical reasoning test 693, ,43 1 693.43 14.59 P < 0.01 Socioeconomic status index 11. .39 1 11.39 0.24 P = 0.62 Gender 1452, ,45 1 1452.45 30.56 P < 0.01 College plans 913 ,63 4 228.41 4.81 P < 0.01 Geographic region 741, .21 7 105.89 2.23 P = 0.03 Size of high school 96, .21 2 48.10 1.01 P = 0.36 Gender x plans 305 .71 4 76.43 1.61 P = 0.17 Gender x region 700 .60 7 100.09 2.11 P = 0.04 Gender x size 332, .27 2 166.14 3.50 P = 0.03 Plans x size 1234, .51 8 154.31 3.25 P < 0.01 Error 22100 .13 465 47.53 Total 46294 .66 503 Computation of the Maximal F Statistic If the usual Ftest is nonsignificant andpq FAB < R\ ~ °, then the maximal F test need not be performed because the outcome (nonsignificance) is known. Conversely, if the Ftest is nonsignificant butpqFAB > Rlp,qav, then the maximal F test ought to be performed because significant product contrasts might exist. See Boik (1986) for an example. If, as in the present case, the Ftest is significant, then one could bypass the maximal Ftest and proceed directly to follow-up tests. Nevertheless, this strategy is not recommended. Computing the maximal F statistic automatically produces the maximizing vectors, and cfl. These vectors are quite useful when selecting follow-up tests of partial interactions and interaction contrasts. In addition, unless the maximal Ftest is performed, one cannot be sure that follow-up tests on product contrasts are necessary. It is unlikely, but possible, for the usual F test to detect a significant nonproduct contrast while the maximal Ftest declares all product contrasts to be nonsignificant. The interpretation of such an interaction would be difficult. Table 4 lists a SAS program for computing the maximal F statistic for the college plans x size of high-school interaction. The computation requires two steps. First, the model is fit using proc glm (SAS Institute, 1988), and the estimated adjusted means (covariates equated to their means) and corresponding covariances are saved. The estimated adjusted means are displayed in Table 5 and plotted in Figure 2. In Step two, an alternating least-squares algorithm (Boik, 1989) is used to compute the maximal F 27 Boik TABLE 4 SAS program to compute maximal F statistic data; infile talent; input size region gender plan mech math physics ses; proc glm; class plan size gender region; model physics = math mech ses plan|size plan|gender size|gender gender|region; lsmeans plan*size/ cov out = means; proc iml; use means; reset noname; read all var _num_ into X; a = ncol(design(X[,2])); p = a - 1; b = ncol(design(X[,l])); q = b - 1; Sigma = X[,6:a*b + 5]; mu = X[,3]; Ha = 1(a) - J(a,a,l/a); Ca = Ha[,l:p]; Hb = 1(b) - J(b,b,l/b); Cb = Hb[,l:q]; Phi = (Cb@Ca)'*Sigma*(Cb@Ca); Psi = Ca'*shape(mu,b,a)'*Cb; call svd(U,D,V,Psi); wp = U[,l]; psi = shape(Psi',p*q,l); start als; wq = inv((I(q)@wp)'*Phi*(I(q)@wp))*(I(q)@wp)'*psi; wp = inv((wq@I(p))'*Phi*(wq@I(p)))*(wq@I(p))'*psi; epsi = psi'*(wq@wp) - R; R = R + epsi; finish; epsi = 1; R = 0; start iterate; do while(epsi > = .00001); run als; end; finish; run iterate; print "Maximal Contrast Coeff.: Treat. A" (Ca*wp/sqrt(wp'*Ca'*Ca*wp)); print "Maximal Contrast Coeff.: Treat. B" (Cb*wq/sqrt(wq'*Cb'*Cb*wq)); print "Maximal F Ratio for Product Contrast" R; statistic. The second step involves matrix computations and.is performed by proc iml, the interactive matrix language (SAS Institute, 1985). The proc iml statements can be applied to other data sets without modification. The computed test statistic is R = 23.08. Designating college plans as Factor A and high-school size as Factor B, the maximizing coefficients are -0.49 0.11 0.75 0.06 -0.43/ and cB = Interpolation in Tables 1 and 2 of Boik (1986) yields R^X^es = 12.80 and ^2,4?465 = 16.97. Using the 3-moment approximation,/? = 8.3 x 10"4. 28 Analysis of Interactions TABLE 5 Estimated adjusted means: College plans x size of high school Size of high school Row means College plans Small Medium Large Definitely will go 15.76 18.02 19.57 17.78 Almost sure to go 19.31 17.76 18.64 18.57 Likely to go 21.87 14.93 11.99 16.26 Not likely to go 14.53 15.35 15.28 15.06 Definitely will not go 13.37 12.55 16.49 14.14 Column means 16.96 15.72 16.39 16.36 Table 6 lists SPSS programs (SPSS, 1990, Release 4.0) to compute the maximal F statistic. The analysis requires two SPSS runs. In Run 1, the estimated adjusted means and corresponding standard errors, correlations, and covariance factors (covariances divided by MSE) are computed. Because of a bug in Release 4.0, multiple covariates, if they exist, must be specified on the design command rather than on the analysis subcommand. Yes 2 3 4 No Full—Time College Plans FIGURE 2. Profile plot of estimated adjusted means: college plans x high school size 29 TABLE 6 SPSS program to compute maximal F statistic Computation of adjusted means and corresponding correlation/covariance matrix descriptives variables = math mech ses/ save. manova physics by plan(l,5) size(l,3) gender(l,2) region(l,8) with zmath zmech zses/ analysis physics/ print = parameters(estim cor)/ design = muplus plan by size gender plan by gender size by gender region gender by region zmath zmech zses. Computation of maximal F data list file = adjust free/ mean se covl to covl5. matrix. get X. compute a = 3. compute b = 5. compute p = a - 1. compute q = b - 1. compute mu = X(:,l). compute Sigma = mdiag(X(:,2)&*X(:,2)). compute k = a*b + 2. compute Corr = X(:,3:k). loop i = 2 to a*b. + loop j = 1 to i — 1. + compute Sigma(i,j) = Corr(i, j)*X(i,2)*X( j,2). + compute Sigma(j,i) = Sigma(i,j). + end loop, end loop. compute Ca = Ident(a,a - 1) - make(a,a - 1,1/a). compute Cb = Ident(b,b - 1) - make(b,b - 1,1/b). compute Psi = t(Ca)*t(reshape(mu,b,a))*Cb. call svd(Psi,U,D,V). compute wp = U(:,l). compute Phi = t(Kroneker(Cb,Ca))*Sigma*Kroneker(Cb,Ca). compute psi = reshape(t(Psi),p*q,l). compute R = 0. compute epsi = 1. loop. + compute C = Kroneker(Ident(q),wp). + compute wq = inv(t(C)*Phi*C)*t(C)*psi. + compute C = Kroneker(wq,Ident(p)). + compute wp = inv(t(C)*Phi*C)*t(C)*psi. + compute epsi = t(psi)*Kroneker(wq,wp) - R. + compute R = R + epsi. end loop if (epsi It .00001). print (Ca*wp/sqrt(t(Ca*wp)*Ca*wp))/ title "Maximal Contrast Coeff.: Treat. A", print (Cb*wq/sqrt(t(Cb*wq)*Cb*wq))/ title "Maximal Contrast Coeff.: Treat. B". print R/title "Maximal F Ratio for Product Contrast", end matrix. Analysis of Interactions Otherwise, incorrect standard errors and covariances are obtained. Specifying covariates on the design command ordinarily produces adjusted means in which covariates are equated to zero. By centering the covariates at zero (performed by the descriptives command), adjusted means in which covariates are equated to their means can be obtained. The output is edited to produce a file containing only the estimated means, the standard errors, and the correlation/covariance matrix. If the design contains all higher order interactions and there are no empty cells, then the pmeans subcommand can be used to obtain estimated adjusted means (covariates equated to averaged unweighted means). Nevertheless, the muplus keyword is still required to obtain correlations among the estimated adjusted means. In Run 2, the file containing means, standard errors, and correlations/covariances is read, and matrix—end matrix commands (SPSS, 1990) are used to compute the maximal F. To apply the matrix—end matrix program to other data sets, a and b must be set to their correct values (lines 4 and 5). Factor B precedes Factor A in the manova command. Also, the variable name covl5 (line 1) should be changed, if necessary, so that SPSS reads ab correlations/covariances after each (mean, standard error) pair. In some applications, the numerical accuracy of the Run 2 output can be somewhat degraded because of its dependence on the accuracy of the printed Run 1 output. For the TALENT data, the maximal F, computed by SPSS, is correct to two decimal places. Follow-Up Tests This section examines selected partial interactions and interaction contrasts related to the college plans by high-school size interaction. SAS and SPSS programs to perform the analyses appear after the description of the tests. The Factor A (college plans) coefficient vector associated with the maximal F statistic primarily reflects a comparison between students who are decided about their college plans (levels 1 and 5) and students who are relatively undecided (level 3). That is, cAm = (-.5 0 10 -.5)'appears to be a near maximizer of the product contrast F statistic. The corresponding main-A and simple-A contrast estimates are / 17.78 18.57 16.26 = 0.30 15.06 \l4.14/ 4U(i) = c'a(1)(la = (-.5 0 10 -.5) and «IU(ix« = M'c„(1) = (7.30 -0.35 -6.04)', respectively. Averaged over school sizes, it appears that decided students (mean = 15.96) and undecided students (mean = 16.26) are about equally 31 Boik interested in physical science. The corresponding main effect contrast is not significant: T(cAW,lb) = 0.07 < 4F°A%5 = 9.564. The AmB partial interaction, however, is significant, T(cAW,CB) = 22.49 > R°Xa6s, indicating that the difference between decided and undecided students depends on high-school size. This partial interaction is said to be disordinal (Hager & Westermann, 1983) because the simple-^ contrasts do not have the same algebraic sign for all school sizes. In general, disordinal interactions are more difficult to interpret than ordinal interactions. Virtually all of the AmB partial interaction can be accounted for by a contrast between small and large high schools. The associated coefficient vector is cB(u = (1 0 -1)', and the product contrast estimate is tya(\)b(\) = c'A(\) McB(i) = 13.34. The hypothesis ^o)B(i) = 0 is rejected because TXc^D, cB(1)) = 22.44 exceeds the a = 0.01 SMR critical value of 16.97. The corresponding 99% confidence interval is 1.74 < < 24.95.^ Table 7 displays the estimated adjusted means that correspond to 4u(i)s(i)-To interpret a product contrast, I usually begin with a direct transcription. The product contrast estimate says that, with respect to interest in physical science, the difference between undecided and decided students (undecided - decided) is 13.34 points larger at small high schools than at large high schools. Equivalently, the product contrast estimate says that the difference between small and large high schools (small - large) is 13.34 points larger among undecided students than among decided students. Often, literal translations such as these are sufficient to interpret the contrast (e.g., effects of ralphing on baseball players). In this case, however, the literal translations are not very satisfying, possibly because they do not suggest a plausible underlying mechanism or because of the disordinal nature of the interaction. Interpretations beyond a literal translation require caution. In an uncontrolled observational study such as project TALENT, conclusions regarding cause-effect cannot be made. Tentative explanations that are consistent with the data can, of course, be proposed. Their validity, however, must await further research. One such explanation is the following. It seems TABLE 7 Estimated adjusted means corresponding to *\>A0)Bil) Size of high school College plans Small Large Difference Undecided Decided 21.87 14.57 11.99 18.03 9.88 -3.46 Difference 7.30 -6.04 13.34 32 Analysis of Interactions reasonable to assume that interest in physical science (or lack thereof) precedes and affects college enrollment decisions rather than vice versa. It may be that students at large high schools are more likely to base their career choices on interest patterns than are students at small high schools. If so, a student who has definite interests and is from a large high school is more likely to be sure of his/her college plans than is a comparable student from a small high school. Strong interest in physical science may actually make college decisions more difficult for students from small high schools. Additional analyses in which college plans is the response variable (e.g., log-linear models, logistic regression) could be informative. Some researchers might choose to ignore the interaction contrast in Table 7 and, instead, test the associated simple effect contrasts. Tests of these four simple effect contrasts, however, are not part of a coherent strategy unless the model is changed. The strategy is coherent if the three families (A,B, and AB) are combined to form a single family (Betz & Gabriel, 1978). The composite null now states that there are no differences among the ab adjusted means. A follow-up test of H0: c'p = 0 is judged to be significant if F(c) > (ab - 1) Flb\aI 4F°%5 = 13.439. Averaged over school sizes, high-school students most likely to attend college are more interested in physical science than are high-school students least likely to attend college. The corresponding A^B partial interaction is not significant, T(cMT),CB) = 0.17 < ^°,45465, indicating that the difference between students most and least likely to attend college does not depend on high-school size. The follow-up tests are summarized in Table 8. Table 9 lists the SAS commands (SAS Institute, 1988) to compute the analysis. Coefficients of an orthogonal basis set of Factor A (college plans) contrasts are assigned in the data step. Proc glm computes an ancova in which the plans main effect is partitioned according to four contrasts each having 1 df while the plans x size interaction is partitioned into four partial interactions each having 2 df. The basis set of coefficients must be orthogonal; otherwise, the correct partitioning is not obtained. To partition main effects and interactions according to nonorthogonal contrasts, multiple proc glm executions 33 Boik TABLE 8 Follow-up tests on college plans x size of high school Source SS df T(CA{i), Cfl(,)) p -Value Factor A; College plans 913.63 4 Aay. Decided vs. undecided 3.17 1 0.07 p > 0.50 Ai2)' Most likely vs. least likely 784.99 1 16.52 p < 0.01 Factor B: Size of high school 96.21 2 Bm: Large vs. small 16.84 1 0.35 p > 0.50 AB Interaction: Plans x size 1234.51 8 Maximal product contrast 1097.09 1 23.08 p < 0.01 AmB 1069.07 2 22.49 p < 0.01 AmB(i) 1066.58 1 22.44 p < 0.01 A(2)B 7.97 2 0.17 p > 0.50 are required. The contrast coefficients employed in each proc glm must constitute an orthogonal basis set. In the present case, a single proc glm is sufficient because coefficients of the two contrasts of interest, i\ia([) and <\ia(2), happen to be orthogonal. Contrast estimates and standard errors are obtained by an estimate statement. Note that a scaling factor of 3 is used for \\iAW and that a scaling factor of 2 is used for i^). This is because of the model parameterization. If a coefficient vector—say —is assigned in the data step, then the coefficient vector that actually corresponds to the contrast is cA -T- cAcA. In the present case, to obtain (-.5 0 1 0 -.5), TABLE 9 SAS program to compute follow-up test statistics data; infile talent; input size region gender plan mech math physics ses; if plan = 1 then do; Al = -1; A2 = 1; A3 = -1; A4 = 2; end; if plan = 2 then do; Al = 0; A2 = 1; A3 = 1; A4 = -3; end; if plan = 3 then do; Al = 2; A2 = 0; A3 = 0; A4 = 2; end; if plan = 4 then do; Al = 0; A2 = -1; A3 = -1; A4 = -3; end; if plan = 5 then do; Al = -1; A2 = -1; A3 = 1; A4 = 2; end; Proc glm; class plan size gender region; model physics = math mech ses Al|gender A2|gender A3|gender A4|gender size|gender gender|region Al*size A2*size A3*size A4*size; estimate 'Decided vs Undecided' Al 3; estimate 'Most vs Least Likely' A2 2; estimate 'Bl: Large vs Small' size 10-1; contrast 'Bl: Large vs Small' size 10-1; estimate 'Al x Bl' Al*size 3 0 -3; contrast 'Al x Bl' Al*size 10-1; 34 Analysis of Interactions «U(i> ci(i)Cx(i) must be multiplied by 3. Contrast sums of squares are obtained by using a contrast statement. An excellent discussion on the use of SPSS" (1983) to partition interactions when one or both factors are repeated measures can be found in O'Brien and Kaiser (1985, pp. 323-329). Certain modifications are required to partition interactions when neither factor represents repeated measures. The SPSS (1990) subcommands to perform this partitioning are listed in Table 10. The covariates need not be centered to obtain correct follow-up tests by SPSS. Contrast coefficients are assigned by a contrast subcommand. The first row of the contrast subcommand is a vector of ones which weights college plans (sizes) equally when averaging to obtain means for sizes (plans). The remaining rows must form a basis set of contrast coefficient vectors. The rows need not be orthogonal as they are in Table 10. The effect of plans is partitioned into three components (1,1, and 2 df) that correspond to row 2, row 3, and rows 4 and 5, respectively, of the contrast subcommand. The effect of size is partitioned into two components (1 df each). Sums of squares for partial interactions are produced by the first design subcommand. Sums of squares for product interaction contrasts are produced by the second design subcommand. Concluding Comments Although each has relative strengths and weaknesses, either of the two software packages can be used to compute detailed analyses of two-factor interactions. SAS's (SAS Institute, 1985,1988) strength is that the maximal TABLE 10 SPSS program to compute follow-up test statistics manova physics by plan(l,5) gender(l,2) size(l,3) region(l,8) with math mech ses/ contrast(plan) = special( 11111 -1 0 2 0-1 1 1 0-1-1 -1 1 0-1 1 2 -3 2 -3 2)1 partition(plan) = (1,1,2)/ analysis physics/ design = plan(l) plan(2) plan(3) gender size region plan by gender size by gender gender by region plan(l) by size plan(2) by size plan(3) by size math mech ses/ contrast(size) = special( 111 1 0 -1 1 -2 1)/ partition(size) = (1,1)/ analysis physics/ design = plan gender size region plan by gender size by gender gender by region plan(l) by size(l) plan(l) by size(2) plan(2) by size plan(3) by size math mech ses/ 35 Boik Fstatistic can be computed in a single run; there is no need to edit an output file. SAS's weakness is that, to perform follow-up tests, orthogonal basis sets of contrast coefficients must be specified. The main strength of SPSS (1990) is its straightforward syntax for partitioning an effect into multiple components. Coefficient vectors need not be orthogonal, but a complete basis set must be specified. In addition, SPSS can compute the maximal F statistic, but the computations require two runs. One goal of this article was to demonstrate the usefulness of partial interactions and product contrasts for interpreting significant interactions. I do not claim that partial interactions and product contrasts always lead to straightforward interpretations (disordinal interactions can be particularly troublesome), nor do I contend that simple effects contrasts should never be tested after detection of a significant interaction. Rather, I suggest that when interaction is detected, some effort ought to be expended to find out why. That is, the initial follow-up procedures should test hypotheses which are implied by the composite interaction hypothesis. If the interaction resists interpretation by a coherent strategy and the study is exploratory in nature, then one is certainly free to test other hypotheses, more amenable to interpretation. If this means that simple contrasts are tested after detection of interaction, then so be it. Testing simple contrasts after detection of an interaction, however, implies that the factorial model has been discarded and that an alternative (nested or one-way) model has been adopted. Naturally, the model change should be reported. Otherwise, readers might be misled into believing that the interaction is being interpreted in terms of simple effects contrasts. If the study is strictly confirmatory, a model change may be difficult to justify. APPENDIX Kronecker products Let F and G be matrices of size p * q and rxs, respectively. Then FG is a pr x qs matrix and is given by Adjusted means The data analytic methods in this article are based on the linear model where X is an n x d design matrix, Z is an n x t matrix of covariates, rank(X) = r, rank(Z) = t, n> rank(X Z) = r + t, and e is an n x 1 vector of residuals with a y = Xp + Zy + e, 36 Analysis of Interactions multivariate normal distribution: e~N(0,o-2i). The design matrix must code uniquely for each of the ab combinations of Factor A x Factor B. The design matrix may code for additional factors and interactions. If the model contains no covariates, then the ab cell means are linear combinations of the elements in p. In particular, the /y'th mean is the expectation of the /y'th treatment combination, averaged over levels of other factors (e.g., C, D) and interactions (e.g., AC, BC, CD). For example, in a three-way classification having no three factor interaction, the entries in p can be partitioned as a,, p,, yk, (aP),7, (ay),k, and ($y)Jk for / = 1,... ,a, y = b, and k = 1,... ,c. The /y'th mean is n,,-,- = u. + a,■ + p, + y. + (aP);, + (cry),. + (p7),., where, for example, (p7),. = c-12* = , (p7)>*. In general, the ab xl vector of means can be obtained as u. = f'P, where f is a d x ab matrix with rank ab. The addition of covariates requires minimal modifications. The /y'th adjusted mean is the average expectation of the /y'th treatment combination, conditional on the t x 1 vector of covariates being equal to a specified vector—say, z<>. Typically, z0 is equated to the vector of means: z0 = z = Z'l„ n~l or to the vector of averaged unweighted means: Zo= Z'lab (ab)~l, where Z is the ab x t matrix of unweighted cell means of the / covariates: Z = f'(x'x)~x'Zand where (x'x)" is any generalized inverse of x'x. The adjusted means and their estimators are (a = f'P + lab Zo y and lI = f'P + \ab z07, respectively, where (p 7') is a solution to the normal equations. Searle, Speed, and Milliken (1980) refer to ll as a vector of population marginal means and to (i as a vector of estimated marginal means. It can be shown that var(ji) = ■<■■-'■-'"», (A2) n - r - t where Pz x = (i„ - p,) Z [Z'(i„ - px) Z]"1 Z'(i„ - p,). Likelihood ratio test statistics Let C be a known ab x s matrix of constants with rank s. The LRT of H0: C ja = 0 rejects H0 for large values of , s ii'cic'Xcy'c'iL F{c)= 5a2- • (A3) where 2 is given in (Al) and a2 is given in (A2). The test statistic has distribution /•(C)~FJvk, where \ =-5-. a An important special case consists of linear functions, C'ti, in which C has the Kronecker structure C = CB <8> CA, where C„ is a x 5t, CB is 6 x j2, and 5 = s2. The 37 Boik corresponding null can be written as H0: CA MCB = 0. It follows that the LRT of Ho: C'AMCb = 0 rejects H0 for large values of ^ [vec(CA MCB)]' [(CB <8> CA) 'S(CB <8> C„)] -1 vec(CA MCB) T(Ca , Cfl) — |p . (A4) The test statistic has distribution ncA,cB) --JVv.X, s where [vec(CA MCB)]'[(CB ®CA)'S(CB ®CA)]~l vec(CA MCfl) X- Variance of interaction contrast estimator The variance of an interaction contrast estimator, trace(CAsM), is var[trace(CAB M)] = a2[vec(Q4B)]'2vec(Cy,B), where X is given in (Al). The estimator of the variance is var[trace(CABM)] = a2[vec(C„B)]' X \ec{CAB), (A5) where a2 is given in (A2). For a product contrast, the variance and estimator of the variance are and respectively. var(cA McB) = a2 (cB <8> cA)' X(cB ® cA) var(cA McB) = a2 (cB <8>cA)' X(cB ® c„), (A6) Sufficient condition for R to follow the SMR distribution The covariance matrix for a basis set of interaction contrasts, CAMCB, is O = var[vec(CA MCB)] = a2 (CB ®CA)' 2(CB <8> CA). It can be shown that, if the composite interaction null is true, then R follows the SMR distribution whenever «I> satisfies = 4>B®B and some a - 1 x a - 1 matrix &A. References Beatty, K. L. (1984). Maintaining the reading levels of learning disabled students during the summer. Unpublished doctoral dissertation, Lehigh University, Bethlehem, PA. Betz, M. A., & Gabriel, K. R. (1978). Type IV errors and analysis of simple effects. Journal of Educational Statistics, 3, 121-143. 38 Analysis of Interactions Betz, M. A., & Levin, J. R. (1982). Coherent analysis-of-variance hypothesis testing strategies: A general approach. Journal of Educational Statistics, 7, 193-206. Boik, R. J. (1979). Interactions, partial interactions, and interaction contrasts in the analysis of variance. Psychological Bulletin, 86, 1084-1089. Boik, R. J. (1985). A new approximation to the distribution function of the Studen- tized maximum root. Communications in Statistics—Simulation & Computation, 14, 759-767. Boik, R. J. (1986). Testing the rank of a matrix with applications to the analysis of interactions in ANOVA. Journal of the American Statistical Association, 81, 243-248. Boik, R. J. (1989). Reduced-rank models for interaction in unequally replicated two-way classifications. Journal of Multivariate Analysis, 28, 69-87. Bradu, D., & Gabriel, K. R. (1974). Simultaneous statistical inference on interactions in two-way analysis of variance. Journal of the American Statistical Association, 69, 428-436. Cooley, W. W., & Lohnes, P. R. (1971). Multivariate data analysis. New York: Wiley. Gabriel, K. R. (1969). Simultaneous test procedures—some theory of multiple comparisons. The Annals of Mathematical Statistics, 40, 224-250. Gabriel, K. R., Putter, J., & Wax, Y. (1973). Simultaneous confidence intervals for product-type interaction contrasts. Journal of the Royal Statistical Society Series B, 35, 234-244. Hager, W., & Westermann, R. (1983). Ordinality and disordinality of first order interaction in ANOVA. Archiv für Psychologie, 135, 341-359. Hochberg, Y., & Tamhane, A. C. (1987). Multiple comparison procedures. New York: Wiley. Johnson, D. E. (1973). A derivation of Scheffe's S-method by maximizing a quadratic form. The American Statistician, 27, 27-29. Johnson, D. E. (1976). Some new multiple comparison procedures for the two-way AOV model with interaction. Biometrics, 32, 929-934. Keppel, G. (1973). Design and analysis: A researcher's handbook. Englewood Cliffs, NJ: Prentice-Hall. Keppel, G., & Zedeck, S. (1989). Data analysis for research designs. New York: Freeman. Krishnaiah, P. R., & Chang, T. C. (1971). On the exact distributions of the extreme roots of the Wishart and MANOVA matrices. Journal of Multivariate Analysis, 1, 108-117. Lutz, J. G., & Cundari, L. A. (1987). Determining the most significant parametric function for a given linear hypothesis. Journal of Educational Statistics, 12, 225-233. Marascuilo, L. A., & Levin, J. R. (1970). Appropriate post hoc comparisons for interaction and nested hypotheses in analysis of variance designs: The elimination of Type IV errors. American Educational Research Journal, 7, 397-421. O'Brien, R. G., & Kaiser, M. K. (1985). MANOVA method for analyzing repeated measures designs: An extensive primer. Psychological Bulletin, 97, 316-333. Rosenthal, R., & Rosnow, R. L. (1985). Contrast analysis: Focused comparisons in the analysis of variance. Cambridge, England: Cambridge University Press. 39 Boik Rosnow, R. L. & Rosenthal, R. (1989a). Definition and interpretation of interaction effects. Psychological Bulletin, 105, 143-146. Rosnow, R. L., & Rosenthal, R. (1989b). Statistical procedures and the justification of knowledge in psychological science. American Psychologist, 44, 1276-1284. Roy, S. N. (1953). On a heuristic method of test construction and its use in multivariate analysis. Annals of Mathematical Statistics, 24, 220-238. SAS Institute. (1985). SASIIML user's guide (Version 5 ed.). Cary, NC: Author. SAS Institute. (1988). SASI STAT user's guide{Release 6.03 ed.). Cary, NC: Author. Scheffe, H. (1953). A method for judging all contrasts in the analysis of variance. Biometrika, 40, 87-104. Searle, S. R., Speed, F. M., & Milliken, G. A. (1980). Population marginal means in the linear model: An alternative to least squares means. The American Statistician, 34, 216-221. SPSS. (1990). SPSS reference guide. Chicago: Author. SPSS. (1983). SPSS": User's guide. New York: McGraw-Hill. Author ROBERT J. BOIK is Associate Professor, Department of Mathematical Sciences, Montana State University, Bozeman, MT 59717-0240. He specializes in linear models and multivariate statistics. 40