STATISTICAL TESTS BASED ON RANKS

Jana Jurečková
Charles University, Prague

Contents

1 Basic concepts of hypotheses testing in nonparametric setup
  1.1 Introduction
  1.2 Principle of invariance in hypotheses testing
2 Properties of ranks and of order statistics
3 Locally most powerful rank tests
4 Selected two-sample rank tests
  4.1 Two-sample tests of location
  4.2 Two-sample rank tests of scale
  4.3 Rank tests of H_0 against general two-sample alternatives based on the empirical distribution functions
  4.4 Modification of tests in the presence of ties
5 Tests for comparison of the treatments based on paired observations
  5.1 Rank tests of H_1
  5.2 One-sample Wilcoxon test
  5.3 Sign test
6 Tests of independence in bivariate population
  6.1 Spearman test
  6.2 Quadrant test
7 Rank tests for comparison of several treatments
  7.1 One-way classification
  7.2 Kruskal-Wallis rank test
  7.3 Two-way classification (random blocks)

Chapter 1 Basic concepts of hypotheses testing in nonparametric setup

1.1 Introduction

Let X = (X_1, ..., X_n) be a random vector (vector of observations) and let H and K be two disjoint sets of probability distributions on (R^n, B^n). We say that X fulfills the hypothesis if the distribution of X belongs to H, and that X fulfills the alternative if its distribution belongs to K. We shall use the same symbols H and K to denote either the hypotheses or the sets.
The hypothesis usually expresses homogeneity, symmetry, or independence, while the alternative means inhomogeneity, asymmetry, dependence, etc. The problem is to decide between the hypothesis and the alternative on the basis of the observations X_1, ..., X_n. Every rule which assigns exactly one of the decisions "accept H" or "reject H" to each point x = (x_1, ..., x_n) is called a (nonrandomized) test of the hypothesis H against the alternative K. Such a test partitions the sample space X into two complementary parts: the critical region (rejection region) A_K and the acceptance region A_H. The test rejects H if x ∈ A_K and accepts H if x ∈ A_H.

If we perform the test on the basis of observations x, then either our decision is correct or we make one of the following two kinds of errors: (1) we reject H even though it is correct (error of the first kind); (2) we accept H even though it is incorrect (error of the second kind). It is desirable to use the test with the smallest possible probabilities of both errors.

If the true distribution P of X satisfies P ∈ H, then the probability of the error of the first kind equals P(X ∈ A_K), and sup_{P∈H} P(X ∈ A_K) is called the size of the test with the critical region A_K. If the true distribution Q of X satisfies Q ∈ K, then the probability of the error of the second kind equals Q(X ∈ A_H) = 1 − Q(X ∈ A_K). The probability β(Q) = Q(X ∈ A_K), Q ∈ K, is called the power of the test against the alternative Q. The function β : K → [0, 1] is called the power function of the test. The desirable test maximizes the power function uniformly over the whole alternative and has a small probability of the error of the first kind for all distributions from the hypothesis.

The testing theory, and the search for an optimal test, simplifies considerably when we supplement the family of tests by randomized tests. A randomized test, observing x, rejects H with probability Φ(x) and accepts H with probability 1 − Φ(x), where 0 ≤ Φ(x) ≤ 1 for all x; Φ is called the test function.
The set of randomized tests coincides with the set {Φ : 0 ≤ Φ ≤ 1}; hence it is convex and weakly compact.

If X has distribution P, then the test Φ rejects H with probability

β_Φ(P) = E_P(Φ(X)) = ∫_X Φ(x) dP(x).

Intuitively, the best test should satisfy β_Φ(Q) = E_Q(Φ(X)) := max for every Q ∈ K.

1.2 Principle of invariance in hypotheses testing

Many testing problems remain unchanged under a group G of transformations of the sample space; it is then natural to restrict attention to tests invariant with respect to G, i.e. tests satisfying Φ(gx) = Φ(x) for all g ∈ G. A statistic T is called a maximal invariant with respect to G if it is invariant and T(x_1) = T(x_2) implies x_2 = gx_1 for some g ∈ G.

THEOREM 1.2.1 Let T be a maximal invariant with respect to G. A test Φ is invariant with respect to G if and only if there exists a function h such that Φ(x) = h(T(x)) for all x ∈ X.

PROOF. (i) If Φ(x) = h(T(x)) for all x, then Φ(gx) = h(T(gx)) = h(T(x)) = Φ(x) for every g ∈ G, hence Φ is invariant. (ii) Let Φ be invariant and let T(x_1) = T(x_2). Then, by the definition of a maximal invariant, x_2 = gx_1 for some g ∈ G and hence Φ(x_2) = Φ(x_1). ∎

Examples of maximal invariants

1. Let x = (x_1, ..., x_n) and let G be the group of translations gx = (x_1 + c, ..., x_n + c), c ∈ R^1. Then a maximal invariant is, e.g., T(x) = (x_2 − x_1, ..., x_n − x_1).

2. Let G be the group of orthogonal transformations R^n → R^n. Then T(x) = Σ_{i=1}^n x_i² is a maximal invariant.

3. Let G be the set of n! permutations of x_1, ..., x_n. Then the vector of ordered components of x (vector of order statistics) T(x) = (x_{n:1} ≤ x_{n:2} ≤ ... ≤ x_{n:n}) is a maximal invariant with respect to G.

4. Let G be the set of transformations x'_i = f(x_i), i = 1, ..., n, where f : R^1 → R^1 is a continuous and strictly increasing function. Consider only the points of the sample space X with different components. Let R_i be the rank of x_i among x_1, ..., x_n, i.e. R_i = Σ_{j=1}^n I[x_j ≤ x_i], i = 1, ..., n. Then T(x) = (R_1, ..., R_n) is a maximal invariant for G. Indeed, a continuous and increasing function does not change the ranks of the components of x, i.e. T is invariant to G. On the other hand, let two different vectors x and x' have the same vector of ranks R_1, ..., R_n. Put f(x_i) = x'_i, i = 1, ..., n, let f be linear on the intervals [x_{n:1}, x_{n:2}], ..., [x_{n:n−1}, x_{n:n}], and define f on the rest of the real line so that it is strictly increasing. Such f always exists; hence T is a maximal invariant.
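Example 4 can be checked directly in code. The sketch below computes the ranks R_i = Σ_j I[x_j ≤ x_i] of a small vector with distinct components and verifies that a continuous strictly increasing transformation (here exp, an arbitrary choice for illustration) leaves the ranks unchanged.

```python
from math import exp

def ranks(x):
    # R_i = number of components x_j with x_j <= x_i (components assumed distinct)
    return [sum(1 for xj in x if xj <= xi) for xi in x]

x = [2.1, -0.5, 3.7, 1.0]
assert ranks(x) == [3, 1, 4, 2]

# A continuous, strictly increasing transformation does not change the ranks:
assert ranks([exp(xi) for xi in x]) == ranks(x)
```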
Chapter 2 Properties of ranks and of order statistics

Let X = (X_1, ..., X_n) be the vector of observations; denote X_{n:1} ≤ X_{n:2} ≤ ... ≤ X_{n:n} the components of X ordered according to increasing magnitude. The vector X_(.) = (X_{n:1}, ..., X_{n:n}) is called the vector of order statistics and X_{n:i} is called the i-th order statistic. Assume that the components of X are different and define the rank of X_i as R_i = Σ_{j=1}^n I[X_j ≤ X_i]. Then the vector R of ranks of X takes on values in the set R of n! permutations (r_1, ..., r_n) of (1, ..., n). The first property of X_(.) and R is described in the following proposition.

Proposition 1. The pair (X_(.), R) is a sufficient statistic for any family of absolutely continuous probability distributions of X.

PROOF. If X_(.) = x_(.) and R = r are prescribed, then

P(X ∈ A | X_(.) = x_(.), R = r) = P{(X_{n:r_1}, ..., X_{n:r_n}) ∈ A | X_(.) = x_(.), R = r} = 0 or 1

depending on whether (x_{n:r_1}, ..., x_{n:r_n}) is an element of A or not; this probability does not depend on the original distribution of X, and this is the property defining sufficiency. ∎

DEFINITION 2.0.1 We say that the random vector X satisfies the hypothesis of randomness H_0 if it has a probability distribution with density of the form

p(x) = Π_{i=1}^n f(x_i), x ∈ R^n,

where f is an arbitrary one-dimensional density. In other words, X satisfies the hypothesis of randomness provided its components are independent identically distributed (i.i.d.) random variables with an absolutely continuous distribution.

The following theorem gives the general form of the distribution of X_(.) and of R.

THEOREM 2.0.1 Let X have the density p_n(x_1, ..., x_n).

(i) The vector X_(.) of order statistics has the distribution with the density

p̄(x_{n:1}, ..., x_{n:n}) = Σ_{r∈R} p(x_{n:r_1}, ..., x_{n:r_n}) for x_{n:1} ≤ ... ≤ x_{n:n}, and = 0 otherwise. (2.1)

(ii) The conditional distribution of R given X_(.) = x_(.)
has the form

Pr(R = r | X_(.) = x_(.)) = p(x_{n:r_1}, ..., x_{n:r_n}) / p̄(x_{n:1}, ..., x_{n:n}) (2.2)

for any r ∈ R and any x_{n:1} ≤ ... ≤ x_{n:n}.

The distributions of X_(.) and R simplify considerably under the hypothesis H_0; this is described in the following theorem.

THEOREM 2.0.2 If X satisfies the hypothesis of randomness H_0, then X_(.) and R are independent, the vector of ranks R has the uniform discrete distribution

Pr(R = r) = 1/n!, r ∈ R, (2.3)

and the distribution of X_(.) has the density

p̄(x_{n:1}, ..., x_{n:n}) = n! Π_{i=1}^n f(x_{n:i}) for x_{n:1} ≤ ... ≤ x_{n:n}, and = 0 otherwise. (2.4)

Finally, the following theorem summarizes some properties of the marginal distributions of the random vectors R and X_(.) under H_0.

THEOREM 2.0.3 Let X satisfy the hypothesis H_0. Then

(i) Pr(R_i = j) = 1/n for all i, j = 1, ..., n;

(ii) Pr(R_i = k, R_j = m) = 1/(n(n−1)) for all k ≠ m, 1 ≤ k, m ≤ n, i ≠ j.

Chapter 3 Locally most powerful rank tests

Because the vector of ranks has the uniform distribution (2.3) under H_0, the most powerful rank α-test of H_0 against a simple alternative Q rejects H_0 for large values of n!Q(R = r); the critical value k_α and the randomization constant γ are determined by

#{r : n!Q(R = r) > k_α} + γ #{r : n!Q(R = r) = k_α} = n!α, 0 < α < 1.

However, many composite alternatives of practical interest are too rich, and uniformly most powerful rank tests against such alternatives do not exist. Then we may resort to local tests and look for a rank test most powerful locally in a neighborhood of the hypothesis.

DEFINITION 3.0.1 Let d(Q) be a measure of the distance of the alternative Q ∈ K from the hypothesis H. The α-test Φ_0 is called locally most powerful in the class M of α-tests of H against K if, given any other Φ ∈ M, there exists ε > 0 such that β_{Φ_0}(Q) ≥ β_Φ(Q) for all Q satisfying 0 < d(Q) < ε.

We shall illustrate the structure of the locally most powerful rank tests of H_0 against a class of alternatives covering the shift and regression in location and scale.

THEOREM 3.0.2 Let A = {g(x, θ) : θ ∈ J} be a class of densities such that

(a) J ⊂ R^1 is an open interval, 0 ∈ J;

(b) g(x, θ) is absolutely continuous in θ for almost all x;

(c) for almost all x, there exists the limit

ġ(x, 0) = lim_{θ→0} (1/θ)[g(x, θ) − g(x, 0)]

and

lim_{θ→0} ∫_{−∞}^{∞} |(g(x, θ) − g(x, 0))/θ| dx = ∫_{−∞}^{∞} |ġ(x, 0)| dx.
Consider the alternative K = {q_Δ : Δ > 0}, where

q_Δ(x_1, ..., x_n) = Π_{i=1}^n g(x_i, Δc_i), c_1, ..., c_n given numbers.

Then the test with the critical region

Σ_{i=1}^n c_i a_n(R_i, g) > k

is the locally most powerful rank test of H_0 against K with the significance level α = P(Σ_{i=1}^n c_i a_n(R_i, g) > k), where P is any distribution satisfying H_0,

a_n(i, g) = E[ġ(X_{n:i}, 0)/g(X_{n:i}, 0)], i = 1, ..., n,

and X_{n:1}, ..., X_{n:n} are the order statistics corresponding to a random sample of size n from the population with the density g(x, 0).

Let us apply the theorem to find the locally most powerful rank tests of H_0 against some standard alternatives.

I. We start with the alternative of shift in location and test H_0 on the random vector (X_1, ..., X_N) against the alternative K_1 = {q_Δ : Δ > 0}, where

q_Δ(x_1, ..., x_N) = Π_{i=1}^m f(x_i) · Π_{i=m+1}^N f(x_i − Δ), (3.1)

where f is a fixed absolutely continuous density such that

∫_{−∞}^{∞} |f'(x)| dx < ∞. (3.2)

Then the family of densities A with g(x, θ) = f(x − θ) and J = R^1 fulfills the conditions (a)-(c) of Theorem 3.0.2, and the locally most powerful rank α-test of H_0 against K_1 has the critical region

Σ_{i=m+1}^N a_N(R_i, f) > k, (3.3)

where k satisfies the condition P(Σ_{i=m+1}^N a_N(R_i, f) > k) = α, P ∈ H_0,

a_N(i, f) = E[−f'(X_{N:i})/f(X_{N:i})], i = 1, ..., N, (3.4)

and X_{N:1} ≤ ... ≤ X_{N:N} are the order statistics corresponding to the sample of size N from the distribution with the density f. The scores (3.4) may also be written as

a_N(i, f) = E φ(U_{N:i}, f), i = 1, ..., N, (3.5)

where φ(u, f) = −f'(F^{−1}(u))/f(F^{−1}(u)), 0 < u < 1, and U_{N:1}, ..., U_{N:N} are the order statistics corresponding to the sample of size N from the uniform R(0,1) distribution. The scores (3.4) can also be expressed in the form

a_N(i, f) = −N·C(N−1, i−1) ∫_{−∞}^{∞} f'(x) F^{i−1}(x) (1 − F(x))^{N−i} dx. (3.6)

Remark. The computation of the scores (3.4) (see also (3.6)) is difficult for some densities; if there are no tables of the scores at our disposal, they are often replaced by the approximate scores

a_N(i, f) ≈ φ(i/(N+1), f), i = 1, ..., N. (3.7)

II. Consider now testing H_0 against the alternative of shift in scale, K_2 = {q_Δ : Δ > 0}, where

q_Δ(x_1, ..., x_N) = Π_{i=1}^m f(x_i − μ) · Π_{i=m+1}^N e^{−Δ} f((x_i − μ)e^{−Δ}), Δ > 0, (3.8)

where f is an absolutely continuous density satisfying ∫_{−∞}^{∞} |x f'(x)| dx < ∞ and μ is a nuisance parameter. Then the family of densities A with g(x, θ) = e^{−θ} f((x − μ)e^{−θ}), J = R^1, fulfills the conditions (a)-(c) of Theorem 3.0.2 and the locally most powerful test has the critical region

Σ_{i=m+1}^N a_{1N}(R_i, f) > k, (3.9)

where k is determined by the condition P(Σ_{i=m+1}^N a_{1N}(R_i, f) > k) = α, P ∈ H_0, and the scores have the form

a_{1N}(i, f) = E[−1 − X_{N:i} f'(X_{N:i})/f(X_{N:i})] = E φ_1(U_{N:i}, f), (3.10)

i = 1, ..., N, where φ_1(u, f) = −1 − F^{−1}(u) f'(F^{−1}(u))/f(F^{−1}(u)), 0 < u < 1. In this case, too, we could replace the scores (3.10) by the approximate scores a_{1N}(i, f) ≈ φ_1(i/(N+1), f), i = 1, ..., N.

III. Tests of H_0 against the alternative of simple regression. Consider H_0 against the alternative K_3 = {q_Δ : Δ > 0}, where

q_Δ(x_1, ..., x_N) = Π_{i=1}^N f(x_i − Δc_i)

with a fixed absolutely continuous density f satisfying (3.2) and with given constants c_1, ..., c_N, Σ_{i=1}^N c_i² > 0. Then the locally most powerful test has the critical region

Σ_{i=1}^N c_i a_N(R_i, f) > k (3.11)

with the scores (3.5) and with k determined by the condition P(Σ_{i=1}^N c_i a_N(R_i, f) > k) = α.

Chapter 4 Selected two-sample rank tests

Consider two random samples (X_1, ..., X_m) and (Y_1, ..., Y_n) with the respective distribution functions F and G. For the sake of brevity, we shall also denote (X_1, ..., X_m, Y_1, ..., Y_n) = (Z_1, ..., Z_N) with N = m + n. The hypothesis of randomness for the vector (Z_1, ..., Z_N) in this special case can be reformulated as H_0 : F = G. Consider first testing H_0 against the alternative

K_1 : G(x) ≤ F(x) for all x ∈ R^1, G(x) ≠ F(x) for at least one x.

K_1 is a one-sided alternative stating that the random variable Y is stochastically larger than X. The problem of testing H_0 against K_1 is invariant to the group G of transformations z'_i = g(z_i), i = 1, ..., N, where g is any continuous strictly increasing function.
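Before turning to the particular two-sample tests, the score machinery above can be illustrated numerically. For the logistic density, the score function φ(u, f) of (3.5) reduces to the linear function 2u − 1 (as follows by differentiating f), which is why the Wilcoxon test described below is locally most powerful for logistic F. A small sketch, using a numerical derivative of f:

```python
from math import exp, log

def f(x):
    # logistic density f(x) = e^{-x} / (1 + e^{-x})^2
    return exp(-x) / (1 + exp(-x))**2

def f_prime(x, h=1e-6):
    # central-difference numerical derivative of f
    return (f(x + h) - f(x - h)) / (2 * h)

def phi(u):
    # phi(u, f) = -f'(F^{-1}(u)) / f(F^{-1}(u)); for the logistic
    # distribution F^{-1}(u) = log(u / (1 - u))
    x = log(u / (1 - u))
    return -f_prime(x) / f(x)

# For logistic f the score function is linear: phi(u) = 2u - 1,
# so the approximate scores (3.7) are a_N(i) = 2i/(N+1) - 1 (Wilcoxon scores).
for u in (0.1, 0.3, 0.5, 0.9):
    assert abs(phi(u) - (2 * u - 1)) < 1e-4
```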
As we have seen before, the vector of ranks R_1, ..., R_N of Z_1, ..., Z_N is the maximal invariant with respect to G. Then, by Theorem 1.2.1, the class of invariant tests coincides with that of rank tests; hence, we shall restrict our considerations to rank tests. However, we can still reduce the class of tests by the following considerations. Because both (X_1, ..., X_m) and (Y_1, ..., Y_n) are random samples, the distribution of the vector of ranks (R_1, ..., R_m, R_{m+1}, ..., R_{m+n}) is symmetric in the first m and in the last n arguments under all pairs of distributions F and G. Hence, a sufficient statistic for the vector (R_1, ..., R_m, R_{m+1}, ..., R_{m+n}) consists of the two vectors of ordered ranks of the first and of the second sample.

4.1 Two-sample tests of location

An important special case of the alternative K_1 is the alternative of shift in location, G(x) = F(x − Δ), Δ > 0. If we knew that F is normal, we would use the two-sample t-test. Generally, the test statistic of any rank test is a function of the ordered ranks of the second sample. Theorem 3.0.2 and the preceding example I show that the locally most powerful test generally has the critical region of the form

Σ_{i=m+1}^N a_N(R_i) > k;

hence the test criterion really depends only on the ordered ranks of the Y_j's. The scores a_N(i) = E φ(U_{N:i}) (which may be approximated by a_N(i) = φ(i/(N+1))), i = 1, ..., N, are generated by an appropriate score function φ : (0, 1) → R^1. We shall now describe three basic tests of this type, the ones most often used in practice. Each of them is locally most powerful for some special F, but the probabilities of the error of the first kind are the same for all F ∈ H_0.

(i) Wilcoxon (Mann-Whitney) test. The Wilcoxon test has the critical region

W = Σ_{i=m+1}^N R_i > k_α, (4.2)

i.e., the test function

Φ(x) = 1 if W > k_α, = 0 if W < k_α, = γ if W = k_α,

where k_α is determined so that P_{H_0}(W > k_α) + γ P_{H_0}(W = k_α) = α, 0 < α < 1 (typically α = 0.05 or α = 0.01). This test is locally most powerful against K_1 with F logistic, with the density

f(x) = e^{−x}/(1 + e^{−x})².

For small m and n, the critical value k_α can be determined directly: for each combination s_1 < ...
< s_n of the numbers 1, ..., N we calculate Σ_{i=1}^n s_i and order these values in increasing magnitude. The critical region is formed by the M_N largest sums, where M_N = α·C(N, n); if there is no integer M_N satisfying this condition, we find the largest integer M_N less than α·C(N, n) and randomize the combination which leads to the (M_N + 1)-st largest value. However, this systematic procedure, though precise, becomes difficult for large N, where we should use tables of critical values. There exist various tables for the Wilcoxon test, organized in various ways. Many tables provide the critical values of the Mann-Whitney statistic

U_N = Σ_{i=m+1}^N Σ_{j=1}^m I[Z_i > Z_j];

we can easily see that U_N and W_N are in a one-to-one relation: W_N = U_N + n(n+1)/2.

For an application of the Wilcoxon test, we could alternatively use the dual form of the Wilcoxon statistic: let Z_{N:1} < ... < Z_{N:N} be the order statistics and define V_1, ..., V_N in the following way: V_i = 0 if Z_{N:i} belongs to the first sample and V_i = 1 if Z_{N:i} belongs to the second sample. Then W_N = Σ_{i=1}^N i·V_i.

For large m and n, where there are no tables, we use the normal approximation of W_N. If m, n → ∞, then, under H_0, W_N has asymptotically normal distribution in the following sense:

lim_{m,n→∞} P_{H_0}((W_N − E W_N)/√(var W_N) ≤ x) = Φ(x), x ∈ R^1, (4.3)

where Φ is the standard normal distribution function. To be able to use the normal approximation (4.3), we must know the expectation and variance of W_N under H_0. The following theorem gives the expectation and the variance of a more general linear rank statistic, covering the Wilcoxon as well as other rank tests.

THEOREM 4.1.1 Let the random vector (R_1, ..., R_N) have the discrete uniform distribution on the set R of all permutations of the numbers 1, ..., N, i.e. Pr(R = r) = 1/N!, r ∈ R; let c_1, ..., c_N and a_1 = a(1), ..., a_N = a(N) be arbitrary constants.
Then the expectation and variance of the linear rank statistic

S_N = Σ_{i=1}^N c_i a(R_i) (4.4)

are

E S_N = (1/N) Σ_{i=1}^N c_i Σ_{j=1}^N a_j = N c̄ ā, (4.5)

var S_N = (1/(N−1)) Σ_{i=1}^N (c_i − c̄)² Σ_{j=1}^N (a_j − ā)², (4.6)

where c̄ = (1/N) Σ_{i=1}^N c_i and ā = (1/N) Σ_{j=1}^N a_j.

PROOF. Since R_i is uniformly distributed on {1, ..., N}, E a(R_i) = ā, hence E S_N = ā Σ_{i=1}^N c_i = N c̄ ā. Further,

var S_N = Σ_{i=1}^N c_i² var a(R_i) + Σ_{i≠j} c_i c_j cov(a(R_i), a(R_j))
        = var a(R_1) Σ_{i=1}^N c_i² + cov(a(R_1), a(R_2)) Σ_{i≠j} c_i c_j.

Theorem 2.0.3 further gives

var a(R_1) = (1/N) Σ_{i=1}^N (a_i − ā)²,
cov(a(R_1), a(R_2)) = −(1/(N(N−1))) Σ_{i=1}^N (a_i − ā)²,

hence, using Σ_{i≠j} c_i c_j = N²c̄² − Σ_{i=1}^N c_i²,

var S_N = (1/N) Σ_{j=1}^N (a_j − ā)² [Σ_{i=1}^N c_i² − (N²c̄² − Σ_{i=1}^N c_i²)/(N−1)]
        = (1/(N−1)) Σ_{i=1}^N (c_i − c̄)² Σ_{j=1}^N (a_j − ā)². ∎

As a special case, we get the parameters of the Wilcoxon statistic under H_0:

E W_N = n(N+1)/2, var W_N = mn(N+1)/12. (4.7)

The tables of critical values exploit the fact that the distribution of W_N under H_0 is symmetric around E W_N. If we test H_0 against the left-sided alternative (Δ < 0, the second sample shifted to the left with respect to the first one), we reject H_0 if W_N < 2 E W_N − k_α. A sufficient condition for the symmetry of a linear rank statistic, covering the Wilcoxon case, follows from the following theorem:

THEOREM 4.1.2 Let (R_1, ..., R_N) be a random vector with discrete uniform distribution on the set R of permutations of 1, ..., N. Let c_1, ..., c_N and a_1 = a(1), ..., a_N = a(N) be constants such that either

a_i + a_{N−i+1} = K = const, i = 1, ..., N, (4.8)

or

c_i + c_{N−i+1} = K = const, i = 1, ..., N. (4.9)

Then the distribution of the statistic S_N = Σ_{i=1}^N c_i a(R_i) is symmetric around E S_N, i.e. S_N − E S_N and −(S_N − E S_N) have the same distribution.

PROOF. Under (4.8), 2Nā = Σ_{i=1}^N a_i + Σ_{i=1}^N a_{N−i+1} = NK, hence a_i + a_{N−i+1} = 2ā, i = 1, ..., N.
Because (N − R_1 + 1, ..., N − R_N + 1) and (R_1, ..., R_N) have the same distribution, S'_N = Σ_{i=1}^N c_i a(N − R_i + 1) has the same distribution as S_N, and

S'_N = Σ_{i=1}^N c_i (2ā − a(R_i)) = 2ā Σ_{i=1}^N c_i − S_N = 2 E S_N − S_N,

hence S'_N − E S'_N = E S_N − S_N and

Pr(S_N − E S_N = s) = Pr(S'_N − E S'_N = s) = Pr(E S_N − S_N = s)

holds for any s. Analogously, under (4.9), c_i + c_{N−i+1} = 2c̄, i = 1, ..., N, and (R_N, ..., R_1) has the same distribution as (R_1, ..., R_N). Hence S''_N = Σ_{i=1}^N c_{N−i+1} a(R_i) = 2c̄ Σ_{i=1}^N a_i − S_N = 2 E S_N − S_N has the same distribution as S_N, so S''_N − E S_N = E S_N − S_N. The rest of the proof follows the steps of the first part. ∎

(ii) van der Waerden test. Consider the test criterion (3.3) with the approximate scores (3.7) corresponding to the score function φ(u) = Φ^{−1}(u), 0 < u < 1, where Φ is the standard normal distribution function. The van der Waerden test is convenient for testing H_0 against K_1 if the distribution function F has approximately normal tails. In fact, the test is asymptotically optimal for H_0 against the normal alternatives, and its relative asymptotic efficiency (Pitman efficiency) with respect to the t-test is equal to 1 under normal F and ≥ 1 under all nonnormal F. For these good properties the test can be recommended; for large m, n, if we do not have tables at our disposal, we can use critical values based on the normal approximation N(E S_N, var S_N), where in the van der Waerden case, by Theorem 4.1.1,

E S_N = 0, var S_N = (mn/(N(N−1))) Σ_{i=1}^N [Φ^{−1}(i/(N+1))]².

Moreover, by Theorem 4.1.2, the distribution of S_N under H_0 is symmetric around 0.

(iii) Median test. The median test is based on the criterion (3.3) with the scores (3.7) generated by the score function

φ(u) = 0 for 0 < u < 1/2, = 1/2 for u = 1/2, = 1 for 1/2 < u < 1.

4.2 Two-sample rank tests of scale

Consider now testing H_0 against the two-sample scale alternative K_4 = {q_Δ : Δ > 0} of the form (3.8). The locally most powerful rank test against K_4 is given by (3.9) and (3.10).
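The moment formulas (4.5)-(4.7) can be verified by brute-force enumeration for small samples. The sketch below takes the Wilcoxon case with m = 3, n = 2: the constants c_i select the second sample, the scores are a(i) = i, and all N! rank vectors are equally likely under H_0.

```python
from itertools import permutations

m, n = 3, 2
N = m + n
c = [0] * m + [1] * n          # regression constants selecting the second sample
a = list(range(1, N + 1))      # Wilcoxon scores a(i) = i

# Enumerate all N! equally likely rank vectors and compute S_N = sum_i c_i a(R_i).
vals = [sum(ci * a[r - 1] for ci, r in zip(c, perm))
        for perm in permutations(range(1, N + 1))]
mean = sum(vals) / len(vals)
var = sum((v - mean)**2 for v in vals) / len(vals)

assert mean == n * (N + 1) / 2        # E W_N = n(N+1)/2 = 6
assert var == m * n * (N + 1) / 12    # var W_N = mn(N+1)/12 = 3
```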
However, instead of tests optimal against some special shapes of F, with complicated forms of the scores, we shall rather describe tests with simple scores which are really used in practice. Notice, by (3.10), that the score function φ_1 for the scale alternatives is no longer monotone but U-shaped, and the test statistics are of the form

S_N = Σ_{i=m+1}^N φ_1(R_i/(N+1)). (4.11)

(i) The Siegel-Tukey test. This test is based on a reordering of the observations, leading to new ranks and to a test statistic whose distribution under H_0 is the same as that of the Wilcoxon statistic. Let Z_{N:1} < Z_{N:2} < ... < Z_{N:N} be the order statistics corresponding to the pooled sample of N = m + n variables. Reorder this vector in the following way:

Z_{N:1}, Z_{N:N}, Z_{N:N−1}, Z_{N:2}, Z_{N:3}, Z_{N:N−2}, Z_{N:N−3}, Z_{N:4}, Z_{N:5}, ... (4.12)

and denote R*_i the new rank of Z_i with respect to the order (4.12), i = 1, ..., N. The critical region of the Siegel-Tukey test has the form

S'_N = Σ_{i=m+1}^N R*_i ≤ k'_α,

where k'_α is determined so that P_{H_0}(S'_N < k'_α) + γ P_{H_0}(S'_N = k'_α) = α. The distribution of S'_N under H_0 coincides with the distribution of the Wilcoxon statistic; hence we can use the tables of the Wilcoxon test. However, unlike in the case of the Wilcoxon test, the Pitman efficiency of the Siegel-Tukey test with respect to the F-test is rather low under normal F, namely 6/π² ≈ 0.608. Anyway, we should not use the two-sample F-test of scale unless we are sure of the normality; this test is very sensitive to deviations from the normal distribution.

(ii) Quartile test. Put in (4.11)

φ_1(u) = 0 for 0.25 < u < 0.75, = 0.5 for u = 0.25 and u = 0.75, = 1 for 0 < u < 0.25 and 0.75 < u < 1,

and we get the test statistic

S_N = Σ_{i=m+1}^N φ_1(R_i/(N+1)) (4.13)

and reject H_0 for large values of S_N. The value of S_N is, unless N + 1 is divisible by 4, the number of observations of the Y-sample which belong either to the first or to the fourth quartile of the pooled sample.
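The reordering (4.12) can be generated mechanically. The sketch below (siegel_tukey_order is a hypothetical helper name) returns, for k = 1, ..., N, the index of the order statistic that receives the new rank k, alternating between the low and the high end of the pooled sample two at a time.

```python
def siegel_tukey_order(N):
    # Index of the order statistic Z_{N:.} receiving new rank k = 1, ..., N
    # under the reordering 1, N, N-1, 2, 3, N-2, N-3, 4, 5, ...
    lo, hi = 1, N
    out = []
    take_low = True
    while lo <= hi:
        if take_low:
            out.append(lo); lo += 1
            if lo <= hi:
                out.append(hi); hi -= 1
        else:
            out.append(hi); hi -= 1
            if lo <= hi:
                out.append(lo); lo += 1
        take_low = not take_low
    return out

assert siegel_tukey_order(8) == [1, 8, 7, 2, 3, 6, 5, 4]
# The result is always a permutation of 1, ..., N:
assert sorted(siegel_tukey_order(7)) == list(range(1, 8))
```

Extreme observations thus receive small new ranks, so under the scale alternative (the Y-sample more spread out) the sum of the new ranks of the Y's tends to be small, which is why the critical region is left-sided.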
If N is divisible by 4, then S_N has the hypergeometric distribution under H_0, analogously to the median test.

4.3 Rank tests of H_0 against general two-sample alternatives based on the empirical distribution functions

Again, X_1, ..., X_m and Y_1, ..., Y_n are two samples with the respective distribution functions F and G. We wish to test the hypothesis of randomness H_0 : F = G either against the one-sided alternative

K+ : G(x) ≤ F(x) for all x, F ≠ G,

or against the general alternative

K_5 : F ≠ G.

This case is not covered by Theorem 3.0.2; moreover, the problem of testing against K_5 is invariant to all continuous increasing transformations and there is no reasonable maximal invariant under this setup. In this case we usually use tests based on the empirical distribution functions, which are the maximum likelihood estimators of the theoretical distribution functions in this nonparametric setup. Among these tests, we shall describe the Kolmogorov-Smirnov tests; another well-known test of this type is the Cramér-von Mises test.

The empirical distribution function F_m corresponding to the sample X_1, ..., X_m is defined as

F_m(x) = (1/m) Σ_{i=1}^m I[X_i ≤ x], x ∈ R^1, (4.14)

and similarly G_n(x) = (1/n) Σ_{j=1}^n I[Y_j ≤ x] corresponds to the sample Y_1, ..., Y_n. The two-sided Kolmogorov-Smirnov test of H_0 against K_5 is based on the statistic

D_mn = max_x |F_m(x) − G_n(x)| (4.15)

and rejects H_0 for its large values; the test function is

Φ(X, Y) = 1 if D_mn > c_α, = γ if D_mn = c_α, = 0 if D_mn < c_α.

The statistic D_mn is a rank statistic, though not a linear one. To see this, consider the order statistics Z_{N:1} < ... < Z_{N:N} of the pooled sample and define the indicators V_1, ..., V_N, where V_i = 0 if Z_{N:i} comes from the X-sample and V_i = 1 otherwise. Because F_m and G_n are nondecreasing step functions, the maximum in (4.15) can be attained only at one of the points Z_{N:1}, ..., Z_{N:N}; moreover,

F_m(Z_{N:i}) − G_n(Z_{N:i}) = ((m+n)/(mn)) (ni/(m+n) − Σ_{j=1}^i V_j),

which gives the value of the test criterion

D_mn = ((m+n)/(mn)) · max_{1≤i≤N} |Σ_{j=1}^i V_j − ni/(m+n)|. (4.16)

Notice that V_i = 1 if and only if one of the ranks R_{m+1}, ..., R_N is equal to i, while V_i = 0 if and only if one of the ranks R_1, ..., R_m is equal to i. Thus V_1, ..., V_N depend only on the ranks, and so does D_mn. This implies that the distribution of D_mn under H_0 is the same for all continuous F.
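Because F_m and G_n are step functions jumping only at data points, D_mn can be computed by scanning the pooled observations; a sketch:

```python
def ks_2samp_stat(x, y):
    # D_mn = max over pooled observations of |F_m(t) - G_n(t)|;
    # the maximum of the difference of the two step functions
    # is attained at one of the pooled data points.
    m, n = len(x), len(y)
    D = 0.0
    for t in sorted(x + y):
        Fm = sum(xi <= t for xi in x) / m
        Gn = sum(yi <= t for yi in y) / n
        D = max(D, abs(Fm - Gn))
    return D

# Completely separated samples give D = 1:
assert ks_2samp_stat([1, 2, 3], [4, 5, 6]) == 1.0
# Interleaved samples give a small D:
assert ks_2samp_stat([1, 3, 5, 7], [2, 4, 6, 8]) == 0.25
```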
The expression (4.16) is also used for the calculation of D_mn. An analogous consideration applies to the one-sided Kolmogorov-Smirnov criterion D+_mn = max_x (F_m(x) − G_n(x)), which can be expressed in the form

D+_mn = ((m+n)/(mn)) · max_{1≤i≤N} (ni/(m+n) − Σ_{j=1}^i V_j),

with the asymptotic null distribution

lim_{m,n→∞} P_{H_0}(√(mn/(m+n)) D+_mn ≤ x) = 1 − e^{−2x²}, x > 0. (4.17)

4.4 Modification of tests in the presence of ties

If both distribution functions F and G are continuous, then all observations are different with probability 1 and the ranks are well defined. However, we round the observations to a finite number of decimal places and thus, in fact, we express all measurements on a countable grid. In such a case the possibility of ties cannot be ignored, and we should consider possible modifications of rank tests for this situation. Let us first make several general remarks:

• If the tied observations belong to the same sample, then their mutual ordering does not affect the value of the test criterion. Hence, we should mainly consider ties between observations from different samples.

• A small number of tied observations could be omitted, but this is paid for by a loss of information.

• Some test statistics are well defined even in the presence of ties; the ties may only change the probabilities of errors of the first and second kind. Let us mention the Kolmogorov-Smirnov test as an example: the definitions of the empirical distribution function (4.14) and of the test criterion (4.16) make sense even in the presence of ties. However, if we use the tabulated critical values of the Kolmogorov-Smirnov test in this situation, the size of the critical region will be less than the prescribed significance level. Actually, we may then consider our observations X_1, ..., X_m, Y_1, ..., Y_n as data rounded from the continuous data X*_1, ..., X*_m, Y*_1, ..., Y*_n.
Then the possible values of F_m(x) − G_n(x), x ∈ R^1, form a subset of the possible values of F*_m(x) − G*_n(x), x ∈ R^1, where F*_m and G*_n are the empirical distribution functions of the X*'s and Y*'s, respectively; hence

max_{x∈R^1} [F_m(x) − G_n(x)] ≤ max_{x∈R^1} [F*_m(x) − G*_n(x)],

and similarly for the maxima of the absolute values.

We shall describe two possible modifications of the rank tests in the presence of ties: randomization and the method of midranks.

Randomization. Let Z_1, ..., Z_N be the pooled sample. Take independent random variables U_1, ..., U_N, uniformly R(0,1) distributed and independent of Z_1, ..., Z_N, and order the pairs (Z_1, U_1), ..., (Z_N, U_N) lexicographically: (Z_i, U_i) < (Z_j, U_j) if either Z_i < Z_j, or Z_i = Z_j and U_i < U_j. Denote R*_1, ..., R*_N the ranks of the pairs (Z_1, U_1), ..., (Z_N, U_N) in this ordering. We shall say that Z_1, ..., Z_N satisfy the hypothesis H if they are independent and identically distributed (not necessarily with an absolutely continuous distribution). Then, under H, the vector (R*_1, ..., R*_N) is uniformly distributed over the set R of permutations of 1, ..., N. We shall demonstrate this in an important special case, when Z_1, ..., Z_N take on equidistant values, e.g. when the data are rounded to k decimal places.

THEOREM 4.4.1 Let Z_1, ..., Z_N be random variables satisfying the hypothesis H which take on values from the set {a + kd : k = 0, ±1, ±2, ...}, a ∈ R^1, d > 0. Then the vector R* = (R*_1, ..., R*_N) has the probability distribution

Pr(R* = r) = 1/N!, r ∈ R.

PROOF. We may assume, without loss of generality, that Z_1, ..., Z_N take on integer values. Then the random variable T_i = Z_i + U_i is equivalent to the pair (Z_i, U_i), because Z_i = [T_i] and U_i = T_i − [T_i] with probability 1, i = 1, ..., N. Because Pr(T_i = t) = 0 for every t ∈ R^1, the distribution function of T_i is continuous. Moreover, (Z_i, U_i) < (Z_j, U_j) ⟺ T_i < T_j, hence R* is the vector of ranks of the continuous random variables T_1, ..., T_N, and the claim follows from Theorem 2.0.2. ∎

The method of midranks replaces the ranks within each group of tied observations by their average; the null distribution of the test criterion then changes, and the critical values must be modified accordingly.

Chapter 5 Tests for comparison of the treatments based on paired observations

Suppose that two treatments are compared on n pairs of subjects, the i-th pair yielding the observations (X_i, Y_i), i = 1, ..., n; the pairs are assumed independent, with a continuous bivariate distribution function F(x, y). The hypothesis H_1 of no difference between the treatments states that F is symmetric in its arguments, F(x, y) = F(y, x) for all x, y. In this section we shall consider the rank tests of H_1 against various alternatives.
5.1 Rank tests of H_1

Apply the following transformation to (X_i, Y_i), i = 1, ..., n:

Z_i = Y_i − X_i, W_i = X_i + Y_i, i = 1, ..., n. (5.2)

Under H_1, the distribution of the vector (Z_1, W_1), ..., (Z_n, W_n) is symmetric around the w-axis, while under the alternative it is shifted in the direction of the positive half-axis z. The problem of testing H_1 against such an alternative is invariant with respect to the transformations z'_i = z_i, w'_i = g(w_i), i = 1, ..., n, where g is a 1:1 function with a finite number of discontinuities. The vector (Z_1, ..., Z_n) is the maximal invariant with respect to such transformations; hence the invariant tests are exactly the functions of (Z_1, ..., Z_n), which form a random sample from some one-dimensional distribution with a continuous distribution function D. The problem of testing H_1 is then equivalent to testing

H'_1 : D(z) + D(−z) = 1, z ∈ R^1, (5.3)

stating that the distribution D is symmetric around 0, against the alternative

K'_1 : D(z + Δ) + D(−z + Δ) = 1 for all z ∈ R^1, Δ > 0, (5.4)

which means that the distribution is shifted in the direction of positive z. The distribution D is uniquely determined by the triple (p, F_1, F_2) with p = Pr(Z < 0), F_1(z) = Pr(|Z| ≤ z | Z < 0) and F_2(z) = Pr(Z ≤ z | Z > 0). Equivalent expressions for H'_1 and K'_1 are

H''_1 : p = 1/2, F_2 = F_1; K''_1 : p < 1/2, F_2 < F_1.

This problem is invariant with respect to the group of transformations G : z'_i = g(z_i), i = 1, ..., n, where g is a continuous, odd and increasing function. We can easily see that the maximal invariant with respect to G is (S_1, ..., S_m, R_1, ..., R_n), where S_1, ..., S_m are the ranks of the absolute values of the negative Z's among |Z_1|, ..., |Z_n| and R_1, ..., R_n are the ranks of the positive Z's among |Z_1|, ..., |Z_n|. Moreover, the vectors S'_1 < ... < S'_m and R'_1 < ... < R'_n of ordered ranks are sufficient for (S_1, ..., S_m, R_1, ..., R_n) and, further, one of them uniquely determines the other; hence it finally suffices to consider only, e.g., R'_1 < ...
< R'_n, and the invariant tests of H_1 [or of H'_1] depend only on R'_1 < ... < R'_n.

Let ν be the number of positive components of (Z_1, ..., Z_N). Then ν is a binomial random variable B(N, π); π = 1/2 under H_1 and, for any fixed k,

P_{H_1}(R'_1 = r_1, ..., R'_ν = r_k, ν = k) = P_{H_1}(R'_1 = r_1, ..., R'_ν = r_k | ν = k) P_{H_1}(ν = k) = (1/C(N,k)) · C(N,k) (1/2)^N = (1/2)^N (5.5)

for any k-tuple (r_1, ..., r_k), 1 ≤ r_1 < ... < r_k ≤ N. The number of such tuples is Σ_{k=0}^N C(N,k) = 2^N. The critical region of any rank test of size α = k/2^N contains just k such points (r_1, ..., r_k). However, among such critical regions there generally does not exist a uniformly most powerful one for H''_1 against K''_1. We usually consider H''_1 against the alternative of shift in location under which (Z_1, ..., Z_N) has the density q_Δ, Δ > 0:

q_Δ(z_1, ..., z_N) = Π_{i=1}^N f(z_i − Δ), (5.6)

where f is a one-dimensional symmetric density, f(−x) = f(x), x ∈ R^1; Δ = 0 under H_1 [or H''_1]. The locally most powerful rank test of H_1 against (5.6) has the critical region

Σ_{i=1}^N a+_N(R+_i, f) sign Z_i ≥ k_α, (5.7)

where R+_i is the rank of |Z_i| among |Z_1|, ..., |Z_N| and the scores a+_N(i, f) have the form

a+_N(i, f) = E φ+(U_{N:i}, f), i = 1, ..., N, with φ+(u, f) = φ((1+u)/2, f) = −f'(F^{−1}((1+u)/2))/f(F^{−1}((1+u)/2)), 0 < u < 1.

5.2 One-sample Wilcoxon test

The one-sample Wilcoxon test is the test of the form (5.7) with the linear scores a+_N(i) = i, i.e. it is based on the statistic

W+_N = Σ_{i=1}^N R+_i sign Z_i.

Denote W++_N the sum of the ranks R+_i over the positive components Z_i; obviously W+_N = 2W++_N − N(N+1)/2. We reject H_1 if W+_N ≥ C_α, i.e. if the test criterion exceeds the critical value. For large N, when the tables of critical values are not available, we may use the normal approximation:

P_{H_1}((W+_N − E W+_N)/√(var W+_N) ≤ x) → Φ(x) as N → ∞, (5.11)

where

E W+_N = 0, var W+_N = N(N+1)(2N+1)/6. (5.12)

The parameters (5.12) follow from the following proposition:

Proposition 5.2.1 Let Z be a random variable with continuous distribution function symmetric around 0, i.e. F(z) + F(−z) = 1, z ∈ R^1. Then |Z| and sign Z are independent.

PROOF. Obviously P(sign Z = 1) = P(sign Z = −1) = 1/2. Then

P(sign Z = 1, |Z| ≤ z) = P(0 < Z ≤ z) = P(−z ≤ Z < 0) = P(sign Z = −1, |Z| ≤ z) = (1/2) P(|Z| ≤ z). ∎
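The parameters (5.12) can be checked by enumeration: under H_1 the signs of Z_1, ..., Z_N are i.i.d. ±1 with probability 1/2 each and independent of the ranks of |Z_i|, so all 2^N sign vectors are equally likely. A sketch for N = 4:

```python
from itertools import product

N = 4
# Enumerate W+_N = sum_i i * s_i over all 2^N equally likely sign vectors
# (the rank of |Z_i| plays the role of the coefficient i).
vals = [sum(i * s for i, s in zip(range(1, N + 1), signs))
        for signs in product((-1, 1), repeat=N)]
mean = sum(vals) / len(vals)
var = sum(v * v for v in vals) / len(vals)

assert mean == 0
assert var == N * (N + 1) * (2 * N + 1) / 6   # = 30 for N = 4
```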
Similarly to the two-sample Wilcoxon test, the one-sample Wilcoxon test is convenient for densities of the logistic type.

5.3 Sign test

Consider a more general situation in which Z_1,...,Z_N are independent random variables, Z_i distributed according to the distribution function D_i, and not necessarily all of D_1,...,D_N are equal. This situation occurs when we compare two treatments under different experimental conditions or using different methods. In this situation we want to verify the hypothesis

    H*_1 : D_i(z) + D_i(-z) = 1,   z in R^1,   i = 1,...,N   (5.13)

of symmetry of all distributions around 0, against the alternative that all distributions are shifted toward the positive values. Such a problem is invariant with respect to all transformations of the type z'_i = f_i(z_i), i = 1,...,N, where the f_i are continuous, increasing and odd functions. The maximal invariant with respect to such transformations is the number ν of positive components. The uniformly most powerful invariant test (most powerful among the tests depending only on ν) has the form

    Φ(ν) = 1   ... ν > C_α
           γ   ... ν = C_α   (5.14)
           0   ... ν < C_α

where C_α and γ are determined by the equation

    Σ_{k > C_α} C(N,k) (1/2)^N + γ C(N, C_α) (1/2)^N = α.   (5.15)

The criterion of the sign test is simply the number of positive components among Z_1,...,Z_N, and its distribution under H*_1 is binomial B(N, 1/2). For large N we can again use the normal approximation. If all distribution functions D_1,...,D_N coincide, the sign test is the locally most powerful rank test of H1 for D of the double-exponential type with the density d(z) = (1/2) e^{-|z-Δ|}, z in R^1. When we want to use the rank test, we need not know the exact values X_i, Y_i, i = 1,...,N; it is sufficient to know the signs of the differences Y_i - X_i. This is a very convenient property of the sign test: we can use this test even for qualitative observations of the type "drug A gives better pain relief than drug B". As a matter of fact, we do not have any better test under such conditions.
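The exact sign test is easy to carry out from the binomial distribution B(N, 1/2); the following sketch (with invented outcomes) computes the observed number of positive components and the exact one-sided p-value:

```python
from math import comb

def sign_test_pvalue(z):
    """Exact one-sided p-value of the sign test: P(nu >= observed) under B(N, 1/2)."""
    N = len(z)
    nu = sum(1 for zi in z if zi > 0)          # number of positive components
    p_value = sum(comb(N, k) for k in range(nu, N + 1)) / 2 ** N
    return nu, p_value

# hypothetical qualitative outcomes: +1 if "treatment B relieved pain better than A"
z = [1, 1, -1, 1, 1, 1, -1, 1, 1, 1]
nu, p = sign_test_pvalue(z)
```

Note that only the signs enter the computation, which is exactly why the test applies to purely qualitative comparisons.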
Chapter 6

Tests of independence in bivariate population

Let (X_1, Y_1),...,(X_n, Y_n) be a random sample from a bivariate distribution with a continuous distribution function F(x,y). We want to test the hypothesis of independence

    H2 : F(x,y) = F_1(x) F_2(y),   (6.1)

where F_1 and F_2 are arbitrary distribution functions. The most natural alternative to H2 is the positive [or negative] dependence, but it is too wide and we can hardly expect to find a uniformly most powerful test against such an alternative. Instead of it we consider the alternative

    K2 : X_i = X^0_i + Δ Z_i,   Y_i = Y^0_i + Δ Z_i,   Δ > 0,   i = 1,...,n,   (6.2)

where X^0_i, Y^0_i, Z_i, i = 1,...,n are independent and their distributions are independent of i. The independence then means that Δ = 0.

Let R_1,...,R_n be the ranks of X_1,...,X_n and let S_1,...,S_n be the ranks of Y_1,...,Y_n, respectively. Under the hypothesis of independence, the vectors (R_1,...,R_n) and (S_1,...,S_n) are independent and both have the uniform distribution on the set of permutations of 1,...,n. The locally most powerful rank test of H2 against the alternative K2, in which X^0_i has the density f_1 and Y^0_i the density f_2, respectively, both densities continuously differentiable, has the critical region

    Σ_{i=1}^{n} a_n(R_i, f_1) a_n(S_i, f_2) ≥ C_α   (6.3)

with the scores a_n(i, f) given in (3.4), which are usually replaced by the approximate scores (3.7). We shall briefly describe the two best-known rank tests of independence.

6.1 Spearman test

The Spearman test is based on the correlation coefficient of (R_1,...,R_n) and (S_1,...,S_n):

    r_S = [ (1/n) Σ_{i=1}^{n} R_i S_i - R̄ S̄ ] / { [ (1/n) Σ_{i=1}^{n} (R_i - R̄)^2 ] [ (1/n) Σ_{i=1}^{n} (S_i - S̄)^2 ] }^{1/2},   (6.4)

where

    R̄ = (1/n) Σ_{i=1}^{n} R_i = S̄ = (1/n) Σ_{i=1}^{n} S_i = (n+1)/2,
    (1/n) Σ_{i=1}^{n} (R_i - R̄)^2 = (1/n) Σ_{i=1}^{n} (S_i - S̄)^2 = (n^2 - 1)/12.   (6.5)

Then we can express (6.4) in the simpler form

    r_S = [12 / (n(n^2 - 1))] Σ_{i=1}^{n} R_i S_i - 3(n+1)/(n-1).   (6.6)

The Spearman test rejects H2 if r_S > C_α, or, equivalently, if S = Σ_{i=1}^{n} R_i S_i > C*_α. In some tables we find the critical values for the statistic

    S' = Σ_{i=1}^{n} (R_i - S_i)^2,   (6.7)

for which r_S = 1 - [6 / (n(n^2 - 1))] S'.
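The equivalence of the form (6.6) and of r_S = 1 - 6S'/(n(n^2-1)) can be checked numerically; the following sketch (with invented, tie-free data) computes r_S both ways:

```python
def ranks(x):
    """Ranks of x (assumed to have no ties)."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    r = [0] * len(x)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r

def spearman(x, y):
    """r_S via formula (6.6) and via S' = sum (R_i - S_i)^2; both must agree."""
    n = len(x)
    R, S = ranks(x), ranks(y)
    rs_66 = 12.0 * sum(r * s for r, s in zip(R, S)) / (n * (n * n - 1)) \
            - 3.0 * (n + 1) / (n - 1)
    s_prime = sum((r - s) ** 2 for r, s in zip(R, S))
    rs_sp = 1.0 - 6.0 * s_prime / (n * (n * n - 1))
    return rs_66, rs_sp

x = [3.1, 1.2, 5.4, 2.2, 4.8]   # hypothetical sample
y = [2.0, 1.1, 4.9, 3.0, 4.1]
a, b = spearman(x, y)
assert abs(a - b) < 1e-12       # the two expressions for r_S coincide
```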
The test based on S' rejects H2 if S' < C'_α. For large n we use the normal approximation with

    E S' = n(n^2 - 1)/6,   var S' = n^2 (n+1)^2 (n-1)/36.

The Spearman test is locally most powerful against alternatives of the logistic type.

6.2 Quadrant test

This test is based on the criterion

    Q = (1/4) Σ_{i=1}^{n} [sign(R_i - (n+1)/2) + 1][sign(S_i - (n+1)/2) + 1]   (6.8)

and rejects H2 for large values of Q. For even n, Q equals the number of pairs (X_i, Y_i) for which X_i lies above the X-median and Y_i lies above the Y-median. The statistic Q then has, under the hypothesis H2, the hypergeometric distribution

    Pr(Q = q) = C(m,q) C(m, m-q) / C(n,m)   (6.9)

for q = 0,1,...,m, m = n/2. For large n we use the normal approximation with the parameters

    E Q = n/4,   var Q = n^2 / (16(n-1)).   (6.10)

Chapter 7

Rank test for comparison of several treatments

7.1 One-way classification

We want to compare the effects of p treatments; the experiment is organized in such a way that the i-th treatment is applied to n_i subjects with the results x_{i1},...,x_{in_i}, i = 1,...,p, Σ_{i=1}^{p} n_i = n. Then x_{i1},...,x_{in_i} is a random sample from a distribution with a distribution function F_i, i = 1,...,p. The hypothesis of no difference between the treatments can then be expressed as the hypothesis of equality of p distribution functions, namely

    H2 : F_1 = F_2 = ... = F_p,   (7.1)

and we can consider this hypothesis either against the general alternative

    K2 : F_i(x) ≠ F_j(x)   (7.2)

at least for one pair i, j and at least for some x = x_0, or against a more special alternative

    K2 : F_i(x) = F(x - Δ_i),   i = 1,...,p,   (7.3)

with Δ_i ≠ Δ_j at least for one pair i, j. The alternative (7.3) claims that the effects of the treatments on the values of the observations are additive and that at least two treatments differ in their effects. The classical test for this situation is the F-test of the analysis of variance; this test works well if we can assume that F_i ~ N(μ + α_i, σ^2), i = 1,...,p.
We obtain the usual model of the analysis of variance

    X_{ij} = μ + α_i + e_{ij},   j = 1,...,n_i,   i = 1,...,p,   (7.4)

where the e_{ij} are independent random variables with the normal distribution N(0, σ^2). The hypothesis H2 can then be reformulated as H2 : α_1 = α_2 = ... = α_p = 0. The F-test rejects the hypothesis H2 provided

    F = [(n-p)/(p-1)] · Σ_{i=1}^{p} n_i (X̄_{i·} - X̄_{··})^2 / Σ_{i=1}^{p} Σ_{j=1}^{n_i} (X_{ij} - X̄_{i·})^2 > C_α,   (7.5)

where

    X̄_{i·} = (1/n_i) Σ_{j=1}^{n_i} X_{ij},   i = 1,...,p,   and   X̄_{··} = (1/n) Σ_{i=1}^{p} Σ_{j=1}^{n_i} X_{ij},

and where the critical value C_α is found in the tables of the F-distribution with (p-1, n-p) degrees of freedom.

7.2 Kruskal-Wallis rank test

Let us order all observations x_{11},...,x_{1n_1}, x_{21},...,x_{2n_2},...,x_{p1},...,x_{pn_p} according to increasing magnitude. Let R_{i1},...,R_{in_i} be the ranks of the observations x_{i1},...,x_{in_i} and let R*_{i1} < ... < R*_{in_i} be the same ranks ordered according to increasing magnitude. Then, under the hypothesis H2,

    P_{H2}(R*_{11} = r_{11},...,R*_{1n_1} = r_{1n_1},...,R*_{p1} = r_{p1},...,R*_{pn_p} = r_{pn_p}) = Π_{i=1}^{p} n_i! / n!   (7.6)

for any permutation (r_{11},...,r_{1n_1},...,r_{p1},...,r_{pn_p}) of the numbers 1,...,n such that r_{i1} < ... < r_{in_i} for all i = 1,...,p. Denote

    R̄_{i·} = (1/n_i) Σ_{j=1}^{n_i} R_{ij},   i = 1,...,p,   R̄_{··} = (1/n) Σ_{i=1}^{p} Σ_{j=1}^{n_i} R_{ij} = (n+1)/2.   (7.7)

If we replace X̄_{i·} and X̄_{··} in (7.5) by R̄_{i·} and R̄_{··}, respectively, i = 1,...,p, we obtain a statistic proportional to the criterion of the Kruskal-Wallis test,

    K = [12 / (n(n+1))] Σ_{i=1}^{p} n_i (R̄_{i·} - (n+1)/2)^2
      = [12 / (n(n+1))] Σ_{i=1}^{p} n_i R̄_{i·}^2 - 3(n+1).   (7.8)

In the special case p = 2, the Kruskal-Wallis test reduces to the two-sided (two-sample) Wilcoxon test. We reject the hypothesis H2 provided K > C_α, where the critical value C_α is either obtained from special tables or, if p ≥ 3 and n_i ≥ 5, i = 1,...,p, we use the asymptotic critical values: it can be shown that, under H2 and for large n_1,...,n_p, the criterion K has asymptotically the χ^2 distribution with p - 1 degrees of freedom.
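The second form of the criterion in (7.8) is the one used in practice, since it needs only the rank sums of the p samples; a minimal sketch with invented, tie-free data:

```python
def kruskal_wallis(samples):
    """Kruskal-Wallis criterion K of (7.8) for p samples (assumed free of ties)."""
    pooled = [(x, i) for i, sample in enumerate(samples) for x in sample]
    pooled.sort()                                  # overall ordering of all n observations
    n = len(pooled)
    rank_sums = [0] * len(samples)                 # n_i * (mean rank of sample i)
    for rank, (_, i) in enumerate(pooled, start=1):
        rank_sums[i] += rank
    # 12/(n(n+1)) * sum n_i * Rbar_i^2 - 3(n+1), with n_i * Rbar_i^2 = (rank sum)^2 / n_i
    return 12.0 / (n * (n + 1)) * sum(rs * rs / len(s)
                                      for rs, s in zip(rank_sums, samples)) - 3.0 * (n + 1)

# hypothetical results of p = 3 treatments
samples = [[6.2, 5.1, 5.9], [7.4, 8.0, 7.1], [4.0, 4.4, 3.6]]
K = kruskal_wallis(samples)
```

For p ≥ 3 and large samples, K would be compared with a quantile of the χ^2 distribution with p - 1 degrees of freedom.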
In case of ties between the observations we replace the ranks by the midranks, similarly as in the case of the Wilcoxon test.

7.3 Two-way classification (random blocks)

We want to compare p treatments, but simultaneously we want to reduce the influence of the non-homogeneity of the sample units. Then we can organize the experiment in such a way that we divide the subjects into n homogeneous groups, so-called blocks, and compare the effects of the treatments within each block separately. The subjects in a block are usually assigned to the treatments in a random way. Let us consider the simplest of these models with n blocks, each containing p elements, where each treatment is applied just once in each block. We assume that the blocks are independent of each other. The observations can be formally described by the following table:

    Block \ Treatment    1      2      3     ...    p
    1                   x_11   x_12   x_13   ...   x_1p
    2                   x_21   x_22   x_23   ...   x_2p
    ...
    n                   x_n1   x_n2   x_n3   ...   x_np

The observation x_ij is the measured effect of the j-th treatment applied in the i-th block. We assume that the X_ij are independent random variables and that X_ij has a continuous distribution function F_ij, i = 1,...,n; j = 1,...,p. We wish to verify the hypothesis that there is no significant difference among the treatments, hence

    H3 : F_{i1} = F_{i2} = ... = F_{ip},   i = 1,...,n,   (7.9)

against the alternative

    K3 : F_{ij} ≠ F_{ik}   (7.10)

at least for one i and at least for one pair j, k, or against a more special alternative

    K3 : F_{ij}(x) = F_i(x - Δ_j),   i = 1,...,n;  j = 1,...,p,   (7.11)

    with Δ_j ≠ Δ_k at least for one pair j, k.   (7.12)

The classical test of H3 is the F-test, corresponding to the model

    X_{ij} = μ + α_i + β_j + e_{ij},   i = 1,...,n;  j = 1,...,p,   (7.13)

where the e_{ij} are independent random variables with the normal distribution N(0, σ^2). The F-test rejects H3 provided

    F = (n-1) n Σ_{j=1}^{p} (X̄_{·j} - X̄_{··})^2 / Σ_{i=1}^{n} Σ_{j=1}^{p} (X_{ij} - X̄_{i·} - X̄_{·j} + X̄_{··})^2 > C_α,   (7.14)

where X̄_{i·} = (1/p) Σ_{j=1}^{p} X_{ij}, X̄_{·j} = (1/n) Σ_{i=1}^{n} X_{ij}, X̄_{··} = (1/(np)) Σ_{i=1}^{n} Σ_{j=1}^{p} X_{ij}, and C_α is the critical value of the F-distribution with p - 1 and (p-1)(n-1) degrees of freedom.

Friedman rank test

Order the observations within each block and denote the corresponding ranks R_{i1},...,R_{ip}; i = 1,...,n. We arrange the ranks in the following table:

    Block \ Treatment    1      2      3     ...    p      Row average
    1                   R_11   R_12   R_13   ...   R_1p    (p+1)/2
    2                   R_21   R_22   R_23   ...   R_2p    (p+1)/2
    ...
    n                   R_n1   R_n2   R_n3   ...   R_np    (p+1)/2
    Column average      R̄_·1   R̄_·2   R̄_·3   ...   R̄_·p    Overall average R̄_·· = (p+1)/2

where R̄_{·j} = (1/n) Σ_{i=1}^{n} R_{ij} and R̄_{··} = (1/(np)) Σ_{i=1}^{n} Σ_{j=1}^{p} R_{ij}. The Friedman test is based on the criterion

    Q_n = [12n / (p(p+1))] Σ_{j=1}^{p} (R̄_{·j} - (p+1)/2)^2

and rejects H3 for its large values. If n → ∞, then the distribution of Q_n is approximately χ^2 with p - 1 degrees of freedom. In the case p = 2, the Friedman test reduces to the two-sided sign test. The Friedman test is applicable for the comparison of p treatments even in the situation where we observe only the ranks rather than the exact values of the treatment effects.
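Since Q_n depends only on the within-block rank sums, it is easy to compute; the following sketch (with an invented n × p table of observations, n blocks in rows and p treatments in columns) uses the algebraically equivalent form Q_n = 12/(np(p+1)) Σ_j (Σ_i R_ij)^2 - 3n(p+1):

```python
def friedman(table):
    """Friedman criterion Q_n for an n x p table (ranked within blocks, no ties in a block)."""
    n, p = len(table), len(table[0])
    col_rank_sums = [0] * p                 # sum over blocks of the within-block ranks R_ij
    for row in table:
        order = sorted(range(p), key=lambda j: row[j])
        for rank, j in enumerate(order, start=1):
            col_rank_sums[j] += rank
    # 12n/(p(p+1)) * sum_j (Rbar_.j - (p+1)/2)^2, expanded in terms of the rank sums
    return 12.0 / (n * p * (p + 1)) * sum(rs * rs for rs in col_rank_sums) \
           - 3.0 * n * (p + 1)

# hypothetical observations: 3 blocks (rows) x 3 treatments (columns)
table = [[2.1, 3.5, 1.0],
         [1.9, 4.2, 1.1],
         [2.4, 3.0, 0.7]]
Q = friedman(table)
```

Only the ordering within each block enters the computation, in line with the remark that the Friedman test needs ranks only, not the exact values.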