STATISTICAL TESTS BASED ON RANKS

Jana Jurečková
Charles University, Prague

Contents

1 Basic concepts of hypotheses testing in nonparametric setup
  1.1 Introduction
  1.2 Principle of invariance in hypotheses testing
2 Properties of ranks and of order statistics
3 Locally most powerful rank tests
4 Selected two-sample rank tests
  4.1 Two-sample tests of location
  4.2 Two-sample rank tests of scale
  4.3 Rank tests of H_0 against general two-sample alternatives based on the empirical distribution functions
  4.4 Modification of tests in the presence of ties
5 Tests for comparison of the treatments based on paired observations
  5.1 Rank tests of H_1
  5.2 One-sample Wilcoxon test
  5.3 Sign test
6 Tests of independence in bivariate population
  6.1 Spearman test
  6.2 Quadrant test
7 Rank tests for comparison of several treatments
  7.1 One-way classification
  7.2 Kruskal-Wallis rank test
  7.3 Two-way classification (random blocks)

Chapter 1 Basic concepts of hypotheses testing in nonparametric setup

1.1 Introduction

Let X = (X_1, ..., X_n) be a random vector (vector of observations) and let H and K be two disjoint sets of probability distributions on (R^n, B^n). We say that X fulfills the hypothesis if the distribution of X belongs to H, and that X fulfills the alternative if its distribution belongs to K. We shall use the same symbols H and K to denote either the hypotheses or the sets.
The hypothesis usually expresses homogeneity, symmetry, or independence, while the alternative means inhomogeneity, asymmetry, dependence, etc. The problem is to decide between the hypothesis and the alternative on the basis of the observations X_1, ..., X_n. Every rule which assigns exactly one of the decisions "accept H" or "reject H" to each point x = (x_1, ..., x_n) is called a (nonrandomized) test of the hypothesis H against the alternative K. Such a test partitions the sample space X into two complementary parts: the critical region (rejection region) A_K and the acceptance region A_H. The test rejects H if x ∈ A_K and accepts H if x ∈ A_H.

If we perform the test on the basis of observations x, then either our decision is correct or we make one of the following two kinds of errors: (1) we reject H even though it is correct (error of the first kind); (2) we accept H even though it is incorrect (error of the second kind). It is desirable to use the test with the smallest possible probabilities of both errors.

If the true distribution P of X satisfies P ∈ H, then the probability of the error of the first kind equals P(X ∈ A_K), and sup_{P∈H} P(X ∈ A_K) is called the size of the test with the critical region A_K. If the true distribution Q of X satisfies Q ∈ K, then the probability of the error of the second kind equals Q(X ∈ A_H) = 1 − Q(X ∈ A_K). The probability β(Q) = Q(X ∈ A_K), Q ∈ K, is called the power of the test against the alternative Q. The function β : K → [0, 1] is called the power function of the test. The desirable test maximizes the power function uniformly over the whole alternative and has a small probability of the error of the first kind for all distributions from the hypothesis.

The testing theory, and the search for an optimal test, simplifies considerably when we supplement the family of tests by randomized tests. A randomized test, observing x, rejects H with probability Φ(x) and accepts H with probability 1 − Φ(x), where 0 ≤ Φ(x) ≤ 1 for all x; Φ is called the test function.
The set of randomized tests coincides with the set {Φ : 0 ≤ Φ ≤ 1}; hence it is convex and weakly compact.

If X has distribution P, then the test Φ rejects H with probability

β_Φ(P) = E_P(Φ(X)) = ∫_X Φ(x) dP(x).

Intuitively, the best test should satisfy β_Φ(Q) = E_Q(Φ(X)) := max for every Q ∈ K.

1.2 Principle of invariance in hypotheses testing

Many testing problems remain unchanged under a group G of transformations of the sample space; it is then natural to restrict attention to tests invariant with respect to G, i.e. tests satisfying Φ(gx) = Φ(x) for all g ∈ G. A statistic T is called a maximal invariant with respect to G if it is invariant and T(x_1) = T(x_2) implies x_2 = gx_1 for some g ∈ G.

THEOREM 1.2.1 Let T be a maximal invariant with respect to G. A test Φ is invariant with respect to G if and only if there exists a function h such that Φ(x) = h(T(x)) for all x ∈ X.

PROOF. (i) If Φ(x) = h(T(x)) for all x, then Φ(gx) = h(T(gx)) = h(T(x)) = Φ(x) for every g ∈ G, hence Φ is invariant. (ii) Let Φ be invariant and let T(x_1) = T(x_2). Then, by the definition of a maximal invariant, x_2 = gx_1 for some g ∈ G and hence Φ(x_2) = Φ(x_1). ∎

Examples of maximal invariants

1. Let x = (x_1, ..., x_n) and let G be the group of translations gx = (x_1 + c, ..., x_n + c), c ∈ R^1. Then a maximal invariant is, e.g., T(x) = (x_2 − x_1, ..., x_n − x_1).

2. Let G be the group of orthogonal transformations R^n → R^n. Then T(x) = Σ_{i=1}^n x_i² is a maximal invariant.

3. Let G be the set of n! permutations of x_1, ..., x_n. Then the vector of ordered components of x (vector of order statistics) T(x) = (x_{n:1} ≤ x_{n:2} ≤ ... ≤ x_{n:n}) is a maximal invariant with respect to G.

4. Let G be the set of transformations x'_i = f(x_i), i = 1, ..., n, where f : R^1 → R^1 is a continuous and strictly increasing function. Consider only the points of the sample space X with different components. Let R_i be the rank of x_i among x_1, ..., x_n, i.e. R_i = Σ_{j=1}^n I[x_j ≤ x_i], i = 1, ..., n. Then T(x) = (R_1, ..., R_n) is a maximal invariant for G. Indeed, a continuous and increasing function does not change the ranks of the components of x, i.e. T is invariant to G. On the other hand, let two different vectors x and x' have the same vector of ranks R_1, ..., R_n. Put f(x_i) = x'_i, i = 1, ..., n, let f be linear on the intervals [x_{n:1}, x_{n:2}], ..., [x_{n:n−1}, x_{n:n}], and define f on the rest of the real line so that it is strictly increasing. Such f always exists; hence T is a maximal invariant.
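Example 4 can be checked directly in code. The sketch below computes the ranks R_i = Σ_j I[x_j ≤ x_i] of a small vector with distinct components and verifies that a continuous strictly increasing transformation (here exp, an arbitrary choice for illustration) leaves the ranks unchanged.

```python
from math import exp

def ranks(x):
    # R_i = number of components x_j with x_j <= x_i (components assumed distinct)
    return [sum(1 for xj in x if xj <= xi) for xi in x]

x = [2.1, -0.5, 3.7, 1.0]
assert ranks(x) == [3, 1, 4, 2]

# A continuous, strictly increasing transformation does not change the ranks:
assert ranks([exp(xi) for xi in x]) == ranks(x)
```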
Chapter 2 Properties of ranks and of order statistics

Let X = (X_1, ..., X_n) be the vector of observations; denote X_{n:1} ≤ X_{n:2} ≤ ... ≤ X_{n:n} the components of X ordered according to increasing magnitude. The vector X_(.) = (X_{n:1}, ..., X_{n:n}) is called the vector of order statistics and X_{n:i} is called the i-th order statistic. Assume that the components of X are different and define the rank of X_i as R_i = Σ_{j=1}^n I[X_j ≤ X_i]. Then the vector R of ranks of X takes on values in the set R of n! permutations (r_1, ..., r_n) of (1, ..., n). The first property of X_(.) and R is described in the following proposition.

Proposition 1. The pair (X_(.), R) is a sufficient statistic for any family of absolutely continuous probability distributions of X.

PROOF. If X_(.) = x_(.) and R = r are prescribed, then

P(X ∈ A | X_(.) = x_(.), R = r) = P{(X_{n:r_1}, ..., X_{n:r_n}) ∈ A | X_(.) = x_(.), R = r} = 0 or 1

depending on whether (x_{n:r_1}, ..., x_{n:r_n}) is an element of A or not; this probability does not depend on the original distribution of X, and this is the property defining sufficiency. ∎

DEFINITION 2.0.1 We say that the random vector X satisfies the hypothesis of randomness H_0 if it has a probability distribution with density of the form

p(x) = Π_{i=1}^n f(x_i), x ∈ R^n,

where f is an arbitrary one-dimensional density. In other words, X satisfies the hypothesis of randomness provided its components are independent identically distributed (i.i.d.) random variables with an absolutely continuous distribution.

The following theorem gives the general form of the distribution of X_(.) and of R.

THEOREM 2.0.1 Let X have the density p_n(x_1, ..., x_n).

(i) The vector X_(.) of order statistics has the distribution with the density

p̄(x_{n:1}, ..., x_{n:n}) = Σ_{r∈R} p(x_{n:r_1}, ..., x_{n:r_n}) for x_{n:1} ≤ ... ≤ x_{n:n}, and = 0 otherwise. (2.1)

(ii) The conditional distribution of R given X_(.) = x_(.)
has the form

Pr(R = r | X_(.) = x_(.)) = p(x_{n:r_1}, ..., x_{n:r_n}) / p̄(x_{n:1}, ..., x_{n:n}) (2.2)

for any r ∈ R and any x_{n:1} ≤ ... ≤ x_{n:n}.

The distributions of X_(.) and R simplify considerably under the hypothesis H_0; this is described in the following theorem.

THEOREM 2.0.2 If X satisfies the hypothesis of randomness H_0, then X_(.) and R are independent, the vector of ranks R has the uniform discrete distribution

Pr(R = r) = 1/n!, r ∈ R, (2.3)

and the distribution of X_(.) has the density

p̄(x_{n:1}, ..., x_{n:n}) = n! Π_{i=1}^n f(x_{n:i}) for x_{n:1} ≤ ... ≤ x_{n:n}, and = 0 otherwise. (2.4)

Finally, the following theorem summarizes some properties of the marginal distributions of the random vectors R and X_(.) under H_0.

THEOREM 2.0.3 Let X satisfy the hypothesis H_0. Then

(i) Pr(R_i = j) = 1/n for all i, j = 1, ..., n;

(ii) Pr(R_i = k, R_j = m) = 1/(n(n−1)) for all k ≠ m, 1 ≤ k, m ≤ n, i ≠ j.

Chapter 3 Locally most powerful rank tests

Because the vector of ranks has the uniform distribution (2.3) under H_0, the most powerful rank α-test of H_0 against a simple alternative Q rejects H_0 for large values of n!Q(R = r); the critical value k_α and the randomization constant γ are determined by

#{r : n!Q(R = r) > k_α} + γ #{r : n!Q(R = r) = k_α} = n!α, 0 < α < 1.

However, many composite alternatives of practical interest are too rich, and uniformly most powerful rank tests against such alternatives do not exist. Then we may resort to local tests and look for a rank test most powerful locally in a neighborhood of the hypothesis.

DEFINITION 3.0.1 Let d(Q) be a measure of the distance of the alternative Q ∈ K from the hypothesis H. The α-test Φ_0 is called locally most powerful in the class M of α-tests of H against K if, given any other Φ ∈ M, there exists ε > 0 such that β_{Φ_0}(Q) ≥ β_Φ(Q) for all Q satisfying 0 < d(Q) < ε.

We shall illustrate the structure of the locally most powerful rank tests of H_0 against a class of alternatives covering the shift and regression in location and scale.

THEOREM 3.0.2 Let A = {g(x, θ) : θ ∈ J} be a class of densities such that

(a) J ⊂ R^1 is an open interval, 0 ∈ J;

(b) g(x, θ) is absolutely continuous in θ for almost all x;

(c) for almost all x, there exists the limit

ġ(x, 0) = lim_{θ→0} (1/θ)[g(x, θ) − g(x, 0)]

and

lim_{θ→0} ∫_{−∞}^{∞} |(g(x, θ) − g(x, 0))/θ| dx = ∫_{−∞}^{∞} |ġ(x, 0)| dx.
Consider the alternative K = {q_Δ : Δ > 0}, where

q_Δ(x_1, ..., x_n) = Π_{i=1}^n g(x_i, Δc_i), c_1, ..., c_n given numbers.

Then the test with the critical region

Σ_{i=1}^n c_i a_n(R_i, g) > k

is the locally most powerful rank test of H_0 against K with the significance level α = P(Σ_{i=1}^n c_i a_n(R_i, g) > k), where P is any distribution satisfying H_0,

a_n(i, g) = E[ġ(X_{n:i}, 0)/g(X_{n:i}, 0)], i = 1, ..., n,

and X_{n:1}, ..., X_{n:n} are the order statistics corresponding to a random sample of size n from the population with the density g(x, 0).

Let us apply the theorem to find the locally most powerful rank tests of H_0 against some standard alternatives.

I. We start with the alternative of shift in location and test H_0 on the random vector (X_1, ..., X_N) against the alternative K_1 = {q_Δ : Δ > 0}, where

q_Δ(x_1, ..., x_N) = Π_{i=1}^m f(x_i) · Π_{i=m+1}^N f(x_i − Δ), (3.1)

where f is a fixed absolutely continuous density such that

∫_{−∞}^{∞} |f'(x)| dx < ∞. (3.2)

Then the family of densities A with g(x, θ) = f(x − θ) and J = R^1 fulfills the conditions (a)-(c) of Theorem 3.0.2, and the locally most powerful rank α-test of H_0 against K_1 has the critical region

Σ_{i=m+1}^N a_N(R_i, f) > k, (3.3)

where k satisfies the condition P(Σ_{i=m+1}^N a_N(R_i, f) > k) = α, P ∈ H_0,

a_N(i, f) = E[−f'(X_{N:i})/f(X_{N:i})], i = 1, ..., N, (3.4)

and X_{N:1} ≤ ... ≤ X_{N:N} are the order statistics corresponding to the sample of size N from the distribution with the density f. The scores (3.4) may also be written as

a_N(i, f) = E φ(U_{N:i}, f), i = 1, ..., N, (3.5)

where φ(u, f) = −f'(F^{−1}(u))/f(F^{−1}(u)), 0 < u < 1, and U_{N:1}, ..., U_{N:N} are the order statistics corresponding to the sample of size N from the uniform R(0,1) distribution. The scores (3.4) can also be expressed in the form

a_N(i, f) = −N·C(N−1, i−1) ∫_{−∞}^{∞} f'(x) F^{i−1}(x) (1 − F(x))^{N−i} dx. (3.6)

Remark. The computation of the scores (3.4) (see also (3.6)) is difficult for some densities; if there are no tables of the scores at our disposal, they are often replaced by the approximate scores

a_N(i, f) ≈ φ(i/(N+1), f), i = 1, ..., N. (3.7)

II. Consider now testing H_0 against the alternative of shift in scale, K_2 = {q_Δ : Δ > 0}, where

q_Δ(x_1, ..., x_N) = Π_{i=1}^m f(x_i − μ) · Π_{i=m+1}^N e^{−Δ} f((x_i − μ)e^{−Δ}), Δ > 0, (3.8)

where f is an absolutely continuous density satisfying ∫_{−∞}^{∞} |x f'(x)| dx < ∞ and μ is a nuisance parameter. Then the family of densities A with g(x, θ) = e^{−θ} f((x − μ)e^{−θ}), J = R^1, fulfills the conditions (a)-(c) of Theorem 3.0.2 and the locally most powerful test has the critical region

Σ_{i=m+1}^N a_{1N}(R_i, f) > k, (3.9)

where k is determined by the condition P(Σ_{i=m+1}^N a_{1N}(R_i, f) > k) = α, P ∈ H_0, and the scores have the form

a_{1N}(i, f) = E[−1 − X_{N:i} f'(X_{N:i})/f(X_{N:i})] = E φ_1(U_{N:i}, f), (3.10)

i = 1, ..., N, where φ_1(u, f) = −1 − F^{−1}(u) f'(F^{−1}(u))/f(F^{−1}(u)), 0 < u < 1. In this case, too, we could replace the scores (3.10) by the approximate scores a_{1N}(i, f) ≈ φ_1(i/(N+1), f), i = 1, ..., N.

III. Tests of H_0 against the alternative of simple regression. Consider H_0 against the alternative K_3 = {q_Δ : Δ > 0}, where

q_Δ(x_1, ..., x_N) = Π_{i=1}^N f(x_i − Δc_i)

with a fixed absolutely continuous density f satisfying (3.2) and with given constants c_1, ..., c_N, Σ_{i=1}^N c_i² > 0. Then the locally most powerful test has the critical region

Σ_{i=1}^N c_i a_N(R_i, f) > k (3.11)

with the scores (3.5) and with k determined by the condition P(Σ_{i=1}^N c_i a_N(R_i, f) > k) = α.

Chapter 4 Selected two-sample rank tests

Consider two random samples (X_1, ..., X_m) and (Y_1, ..., Y_n) with the respective distribution functions F and G. For the sake of brevity, we shall also denote (X_1, ..., X_m, Y_1, ..., Y_n) = (Z_1, ..., Z_N) with N = m + n. The hypothesis of randomness for the vector (Z_1, ..., Z_N) in this special case can be reformulated as H_0 : F = G. Consider first testing H_0 against the alternative

K_1 : G(x) ≤ F(x) for all x ∈ R^1, G(x) ≠ F(x) for at least one x.

K_1 is a one-sided alternative stating that the random variable Y is stochastically larger than X. The problem of testing H_0 against K_1 is invariant to the group G of transformations z'_i = g(z_i), i = 1, ..., N, where g is any continuous strictly increasing function.
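Before turning to the particular two-sample tests, the score machinery above can be illustrated numerically. For the logistic density, the score function φ(u, f) of (3.5) reduces to the linear function 2u − 1 (as follows by differentiating f), which is why the Wilcoxon test described below is locally most powerful for logistic F. A small sketch, using a numerical derivative of f:

```python
from math import exp, log

def f(x):
    # logistic density f(x) = e^{-x} / (1 + e^{-x})^2
    return exp(-x) / (1 + exp(-x))**2

def f_prime(x, h=1e-6):
    # central-difference numerical derivative of f
    return (f(x + h) - f(x - h)) / (2 * h)

def phi(u):
    # phi(u, f) = -f'(F^{-1}(u)) / f(F^{-1}(u)); for the logistic
    # distribution F^{-1}(u) = log(u / (1 - u))
    x = log(u / (1 - u))
    return -f_prime(x) / f(x)

# For logistic f the score function is linear: phi(u) = 2u - 1,
# so the approximate scores (3.7) are a_N(i) = 2i/(N+1) - 1 (Wilcoxon scores).
for u in (0.1, 0.3, 0.5, 0.9):
    assert abs(phi(u) - (2 * u - 1)) < 1e-4
```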
As we have seen before, the vector of ranks R_1, ..., R_N of Z_1, ..., Z_N is the maximal invariant with respect to G. Then, by Theorem 1.2.1, the class of invariant tests coincides with that of rank tests; hence, we shall restrict our considerations to rank tests. However, we can still reduce the class of tests by the following considerations. Because both (X_1, ..., X_m) and (Y_1, ..., Y_n) are random samples, the distribution of the vector of ranks (R_1, ..., R_m, R_{m+1}, ..., R_{m+n}) is symmetric in the first m and in the last n arguments under all pairs of distributions F and G. Hence, a sufficient statistic for the vector (R_1, ..., R_m, R_{m+1}, ..., R_{m+n}) consists of the two vectors of ordered ranks of the first and of the second sample.

4.1 Two-sample tests of location

An important special case of the alternative K_1 is the alternative of shift in location, G(x) = F(x − Δ), Δ > 0. If we knew that F is normal, we would use the two-sample t-test. Generally, the test statistic of any rank test is a function of the ordered ranks of the second sample. Theorem 3.0.2 and the preceding example I show that the locally most powerful test generally has the critical region of the form

Σ_{i=m+1}^N a_N(R_i) > k;

hence the test criterion really depends only on the ordered ranks of the Y_j's. The scores a_N(i) = E φ(U_{N:i}) (which may be approximated by a_N(i) = φ(i/(N+1))), i = 1, ..., N, are generated by an appropriate score function φ : (0, 1) → R^1. We shall now describe three basic tests of this type, the ones most often used in practice. Each of them is locally most powerful for some special F, but the probabilities of the error of the first kind are the same for all F ∈ H_0.

(i) Wilcoxon (Mann-Whitney) test. The Wilcoxon test has the critical region

W = Σ_{i=m+1}^N R_i > k_α, (4.2)

i.e., the test function

Φ(x) = 1 if W > k_α, = 0 if W < k_α, = γ if W = k_α,

where k_α is determined so that P_{H_0}(W > k_α) + γ P_{H_0}(W = k_α) = α, 0 < α < 1 (typically α = 0.05 or α = 0.01). This test is locally most powerful against K_1 with F logistic, with the density

f(x) = e^{−x}/(1 + e^{−x})².

For small m and n, the critical value k_α can be determined directly: for each combination s_1 < ...
< s_n of the numbers 1, ..., N we calculate Σ_{i=1}^n s_i and order these values in increasing magnitude. The critical region is formed by the M_N largest sums, where M_N = α·C(N, n); if there is no integer M_N satisfying this condition, we find the largest integer M_N less than α·C(N, n) and randomize the combination which leads to the (M_N + 1)-st largest value. However, this systematic procedure, though precise, becomes difficult for large N, where we should use tables of critical values. There exist various tables for the Wilcoxon test, organized in various ways. Many tables provide the critical values of the Mann-Whitney statistic

U_N = Σ_{i=m+1}^N Σ_{j=1}^m I[Z_i > Z_j];

we can easily see that U_N and W_N are in a one-to-one relation: W_N = U_N + n(n+1)/2.

For an application of the Wilcoxon test, we could alternatively use the dual form of the Wilcoxon statistic: let Z_{N:1} < ... < Z_{N:N} be the order statistics and define V_1, ..., V_N in the following way: V_i = 0 if Z_{N:i} belongs to the first sample and V_i = 1 if Z_{N:i} belongs to the second sample. Then W_N = Σ_{i=1}^N i·V_i.

For large m and n, where there are no tables, we use the normal approximation of W_N. If m, n → ∞, then, under H_0, W_N has asymptotically normal distribution in the following sense:

lim_{m,n→∞} P_{H_0}((W_N − E W_N)/√(var W_N) ≤ x) = Φ(x), x ∈ R^1, (4.3)

where Φ is the standard normal distribution function. To be able to use the normal approximation (4.3), we must know the expectation and variance of W_N under H_0. The following theorem gives the expectation and the variance of a more general linear rank statistic, covering the Wilcoxon as well as other rank tests.

THEOREM 4.1.1 Let the random vector (R_1, ..., R_N) have the discrete uniform distribution on the set R of all permutations of the numbers 1, ..., N, i.e. Pr(R = r) = 1/N!, r ∈ R; let c_1, ..., c_N and a_1 = a(1), ..., a_N = a(N) be arbitrary constants.
Then the expectation and variance of the linear rank statistic

S_N = Σ_{i=1}^N c_i a(R_i) (4.4)

are

E S_N = (1/N) Σ_{i=1}^N c_i Σ_{j=1}^N a_j = N c̄ ā, (4.5)

var S_N = (1/(N−1)) Σ_{i=1}^N (c_i − c̄)² Σ_{j=1}^N (a_j − ā)², (4.6)

where c̄ = (1/N) Σ_{i=1}^N c_i and ā = (1/N) Σ_{j=1}^N a_j.

PROOF. Since R_i is uniformly distributed on {1, ..., N}, E a(R_i) = ā, hence E S_N = ā Σ_{i=1}^N c_i = N c̄ ā. Further,

var S_N = Σ_{i=1}^N c_i² var a(R_i) + Σ_{i≠j} c_i c_j cov(a(R_i), a(R_j))
        = var a(R_1) Σ_{i=1}^N c_i² + cov(a(R_1), a(R_2)) Σ_{i≠j} c_i c_j.

Theorem 2.0.3 further gives

var a(R_1) = (1/N) Σ_{i=1}^N (a_i − ā)²,
cov(a(R_1), a(R_2)) = −(1/(N(N−1))) Σ_{i=1}^N (a_i − ā)²,

hence, using Σ_{i≠j} c_i c_j = N²c̄² − Σ_{i=1}^N c_i²,

var S_N = (1/N) Σ_{j=1}^N (a_j − ā)² [Σ_{i=1}^N c_i² − (N²c̄² − Σ_{i=1}^N c_i²)/(N−1)]
        = (1/(N−1)) Σ_{i=1}^N (c_i − c̄)² Σ_{j=1}^N (a_j − ā)². ∎

As a special case, we get the parameters of the Wilcoxon statistic under H_0:

E W_N = n(N+1)/2, var W_N = mn(N+1)/12. (4.7)

The tables of critical values exploit the fact that the distribution of W_N under H_0 is symmetric around E W_N. If we test H_0 against the left-sided alternative (Δ < 0, the second sample shifted to the left with respect to the first one), we reject H_0 if W_N < 2 E W_N − k_α. A sufficient condition for the symmetry of a linear rank statistic, covering the Wilcoxon case, follows from the following theorem:

THEOREM 4.1.2 Let (R_1, ..., R_N) be a random vector with discrete uniform distribution on the set R of permutations of 1, ..., N. Let c_1, ..., c_N and a_1 = a(1), ..., a_N = a(N) be constants such that either

a_i + a_{N−i+1} = K = const, i = 1, ..., N, (4.8)

or

c_i + c_{N−i+1} = K = const, i = 1, ..., N. (4.9)

Then the distribution of the statistic S_N = Σ_{i=1}^N c_i a(R_i) is symmetric around E S_N, i.e. S_N − E S_N and −(S_N − E S_N) have the same distribution.

PROOF. Under (4.8), 2Nā = Σ_{i=1}^N a_i + Σ_{i=1}^N a_{N−i+1} = NK, hence a_i + a_{N−i+1} = 2ā, i = 1, ..., N.
Because (N − R_1 + 1, ..., N − R_N + 1) and (R_1, ..., R_N) have the same distribution, S'_N = Σ_{i=1}^N c_i a(N − R_i + 1) has the same distribution as S_N, and

S'_N = Σ_{i=1}^N c_i (2ā − a(R_i)) = 2ā Σ_{i=1}^N c_i − S_N = 2 E S_N − S_N,

hence S'_N − E S'_N = E S_N − S_N and

Pr(S_N − E S_N = s) = Pr(S'_N − E S'_N = s) = Pr(E S_N − S_N = s)

holds for any s. Analogously, under (4.9), c_i + c_{N−i+1} = 2c̄, i = 1, ..., N, and (R_N, ..., R_1) has the same distribution as (R_1, ..., R_N). Hence S''_N = Σ_{i=1}^N c_{N−i+1} a(R_i) = 2c̄ Σ_{i=1}^N a_i − S_N = 2 E S_N − S_N has the same distribution as S_N, so S''_N − E S_N = E S_N − S_N. The rest of the proof follows the steps of the first part. ∎

(ii) van der Waerden test. Consider the test criterion (3.3) with the approximate scores (3.7) corresponding to the score function φ(u) = Φ^{−1}(u), 0 < u < 1, where Φ is the standard normal distribution function. The van der Waerden test is convenient for testing H_0 against K_1 if the distribution function F has approximately normal tails. In fact, the test is asymptotically optimal for H_0 against the normal alternatives, and its relative asymptotic efficiency (Pitman efficiency) with respect to the t-test is equal to 1 under normal F and ≥ 1 under all nonnormal F. For these good properties the test can be recommended; for large m, n, if we do not have tables at our disposal, we can use critical values based on the normal approximation N(E S_N, var S_N), where in the van der Waerden case, by Theorem 4.1.1,

E S_N = 0, var S_N = (mn/(N(N−1))) Σ_{i=1}^N [Φ^{−1}(i/(N+1))]².

Moreover, by Theorem 4.1.2, the distribution of S_N under H_0 is symmetric around 0.

(iii) Median test. The median test is based on the criterion (3.3) with the scores (3.7) generated by the score function

φ(u) = 0 for 0 < u < 1/2, = 1/2 for u = 1/2, = 1 for 1/2 < u < 1.

4.2 Two-sample rank tests of scale

Consider now testing H_0 against the two-sample scale alternative K_4 = {q_Δ : Δ > 0} of the form (3.8). The locally most powerful rank test against K_4 is given by (3.9) and (3.10).
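The moment formulas (4.5)-(4.7) can be verified by brute-force enumeration for small samples. The sketch below takes the Wilcoxon case with m = 3, n = 2: the constants c_i select the second sample, the scores are a(i) = i, and all N! rank vectors are equally likely under H_0.

```python
from itertools import permutations

m, n = 3, 2
N = m + n
c = [0] * m + [1] * n          # regression constants selecting the second sample
a = list(range(1, N + 1))      # Wilcoxon scores a(i) = i

# Enumerate all N! equally likely rank vectors and compute S_N = sum_i c_i a(R_i).
vals = [sum(ci * a[r - 1] for ci, r in zip(c, perm))
        for perm in permutations(range(1, N + 1))]
mean = sum(vals) / len(vals)
var = sum((v - mean)**2 for v in vals) / len(vals)

assert mean == n * (N + 1) / 2        # E W_N = n(N+1)/2 = 6
assert var == m * n * (N + 1) / 12    # var W_N = mn(N+1)/12 = 3
```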
However, instead of tests optimal against some special shapes of F, with complicated forms of the scores, we shall rather describe tests with simple scores which are really used in practice. Notice, by (3.10), that the score function φ_1 for the scale alternatives is no longer monotone but U-shaped, and the test statistics are of the form

S_N = Σ_{i=m+1}^N φ_1(R_i/(N+1)). (4.11)

(i) The Siegel-Tukey test. This test is based on a reordering of the observations, leading to new ranks and to a test statistic whose distribution under H_0 is the same as that of the Wilcoxon statistic. Let Z_{N:1} < Z_{N:2} < ... < Z_{N:N} be the order statistics corresponding to the pooled sample of N = m + n variables. Reorder this vector in the following way:

Z_{N:1}, Z_{N:N}, Z_{N:N−1}, Z_{N:2}, Z_{N:3}, Z_{N:N−2}, Z_{N:N−3}, Z_{N:4}, Z_{N:5}, ... (4.12)

and denote R*_i the new rank of Z_i with respect to the order (4.12), i = 1, ..., N. The critical region of the Siegel-Tukey test has the form

S'_N = Σ_{i=m+1}^N R*_i ≤ k'_α,

where k'_α is determined so that P_{H_0}(S'_N < k'_α) + γ P_{H_0}(S'_N = k'_α) = α. The distribution of S'_N under H_0 coincides with the distribution of the Wilcoxon statistic; hence we can use the tables of the Wilcoxon test. However, unlike in the case of the Wilcoxon test, the Pitman efficiency of the Siegel-Tukey test with respect to the F-test is rather low under normal F, namely 6/π² ≈ 0.608. Anyway, we should not use the two-sample F-test of scale unless we are sure of the normality; this test is very sensitive to deviations from the normal distribution.

(ii) Quartile test. Put in (4.11)

φ_1(u) = 0 for 0.25 < u < 0.75, = 0.5 for u = 0.25 and u = 0.75, = 1 for 0 < u < 0.25 and 0.75 < u < 1,

and we get the test statistic

S_N = Σ_{i=m+1}^N φ_1(R_i/(N+1)) (4.13)

and reject H_0 for large values of S_N. The value of S_N is, unless N + 1 is divisible by 4, the number of observations of the Y-sample which belong either to the first or to the fourth quartile of the pooled sample.
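The reordering (4.12) can be generated mechanically. The sketch below (siegel_tukey_order is a hypothetical helper name) returns, for k = 1, ..., N, the index of the order statistic that receives the new rank k, alternating between the low and the high end of the pooled sample two at a time.

```python
def siegel_tukey_order(N):
    # Index of the order statistic Z_{N:.} receiving new rank k = 1, ..., N
    # under the reordering 1, N, N-1, 2, 3, N-2, N-3, 4, 5, ...
    lo, hi = 1, N
    out = []
    take_low = True
    while lo <= hi:
        if take_low:
            out.append(lo); lo += 1
            if lo <= hi:
                out.append(hi); hi -= 1
        else:
            out.append(hi); hi -= 1
            if lo <= hi:
                out.append(lo); lo += 1
        take_low = not take_low
    return out

assert siegel_tukey_order(8) == [1, 8, 7, 2, 3, 6, 5, 4]
# The result is always a permutation of 1, ..., N:
assert sorted(siegel_tukey_order(7)) == list(range(1, 8))
```

Extreme observations thus receive small new ranks, so under the scale alternative (the Y-sample more spread out) the sum of the new ranks of the Y's tends to be small, which is why the critical region is left-sided.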
If N is divisible by 4, then S_N has the hypergeometric distribution under H_0, analogously to the median test.

4.3 Rank tests of H_0 against general two-sample alternatives based on the empirical distribution functions

Again, X_1, ..., X_m and Y_1, ..., Y_n are two samples with the respective distribution functions F and G. We wish to test the hypothesis of randomness H_0 : F = G either against the one-sided alternative

K+ : G(x) ≤ F(x) for all x, F ≠ G,

or against the general alternative

K_5 : F ≠ G.

This case is not covered by Theorem 3.0.2; moreover, the problem of testing against K_5 is invariant to all continuous increasing transformations and there is no reasonable maximal invariant under this setup. In this case we usually use tests based on the empirical distribution functions, which are the maximum likelihood estimators of the theoretical distribution functions in this nonparametric setup. Among these tests, we shall describe the Kolmogorov-Smirnov tests; another well-known test of this type is the Cramér-von Mises test.

The empirical distribution function F_m corresponding to the sample X_1, ..., X_m is defined as

F_m(x) = (1/m) Σ_{i=1}^m I[X_i ≤ x], x ∈ R^1, (4.14)

and similarly G_n(x) = (1/n) Σ_{j=1}^n I[Y_j ≤ x] corresponds to the sample Y_1, ..., Y_n. The two-sided Kolmogorov-Smirnov test of H_0 against K_5 is based on the statistic

D_mn = max_x |F_m(x) − G_n(x)| (4.15)

and rejects H_0 for its large values; the test function is

Φ(X, Y) = 1 if D_mn > c_α, = γ if D_mn = c_α, = 0 if D_mn < c_α.

The statistic D_mn is a rank statistic, though not a linear one. To see this, consider the order statistics Z_{N:1} < ... < Z_{N:N} of the pooled sample and define the indicators V_1, ..., V_N, where V_i = 0 if Z_{N:i} comes from the X-sample and V_i = 1 otherwise. Because F_m and G_n are nondecreasing step functions, the maximum in (4.15) can be attained only at one of the points Z_{N:1}, ..., Z_{N:N}; moreover,

F_m(Z_{N:i}) − G_n(Z_{N:i}) = ((m+n)/(mn)) (ni/(m+n) − Σ_{j=1}^i V_j),

which gives the value of the test criterion

D_mn = ((m+n)/(mn)) · max_{1≤i≤N} |Σ_{j=1}^i V_j − ni/(m+n)|. (4.16)

Notice that V_i = 1 if and only if one of the ranks R_{m+1}, ..., R_N is equal to i, while V_i = 0 if and only if one of the ranks R_1, ..., R_m is equal to i. Thus V_1, ..., V_N depend only on the ranks, and so does D_mn. This implies that the distribution of D_mn under H_0 is the same for all continuous F.
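Because F_m and G_n are step functions jumping only at data points, D_mn can be computed by scanning the pooled observations; a sketch:

```python
def ks_2samp_stat(x, y):
    # D_mn = max over pooled observations of |F_m(t) - G_n(t)|;
    # the maximum of the difference of the two step functions
    # is attained at one of the pooled data points.
    m, n = len(x), len(y)
    D = 0.0
    for t in sorted(x + y):
        Fm = sum(xi <= t for xi in x) / m
        Gn = sum(yi <= t for yi in y) / n
        D = max(D, abs(Fm - Gn))
    return D

# Completely separated samples give D = 1:
assert ks_2samp_stat([1, 2, 3], [4, 5, 6]) == 1.0
# Interleaved samples give a small D:
assert ks_2samp_stat([1, 3, 5, 7], [2, 4, 6, 8]) == 0.25
```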
The expression (4.16) is also used for the calculation of D_mn. An analogous consideration applies to the one-sided Kolmogorov-Smirnov criterion D+_mn = max_x (F_m(x) − G_n(x)), which can be expressed in the form

D+_mn = ((m+n)/(mn)) · max_{1≤i≤N} (ni/(m+n) − Σ_{j=1}^i V_j),

with the asymptotic null distribution

lim_{m,n→∞} P_{H_0}(√(mn/(m+n)) D+_mn ≤ x) = 1 − e^{−2x²}, x > 0. (4.17)

4.4 Modification of tests in the presence of ties

If both distribution functions F and G are continuous, then all observations are different with probability 1 and the ranks are well defined. However, we round the observations to a finite number of decimal places and thus, in fact, we express all measurements on a countable grid. In such a case the possibility of ties cannot be ignored, and we should consider possible modifications of rank tests for this situation. Let us first make several general remarks:

• If the tied observations belong to the same sample, then their mutual ordering does not affect the value of the test criterion. Hence, we should mainly consider ties between observations from different samples.

• A small number of tied observations could be omitted, but this is paid for by a loss of information.

• Some test statistics are well defined even in the presence of ties; the ties may only change the probabilities of errors of the first and second kind. Let us mention the Kolmogorov-Smirnov test as an example: the definitions of the empirical distribution function (4.14) and of the test criterion (4.16) make sense even in the presence of ties. However, if we use the tabulated critical values of the Kolmogorov-Smirnov test in this situation, the size of the critical region will be less than the prescribed significance level. Actually, we may then consider our observations X_1, ..., X_m, Y_1, ..., Y_n as data rounded from the continuous data X*_1, ..., X*_m, Y*_1, ..., Y*_n.
Then the possible values of F_m(x) − G_n(x), x ∈ R^1, form a subset of the possible values of F*_m(x) − G*_n(x), x ∈ R^1, where F*_m and G*_n are the empirical distribution functions of the X*'s and Y*'s, respectively; hence

max_{x∈R^1} [F_m(x) − G_n(x)] ≤ max_{x∈R^1} [F*_m(x) − G*_n(x)],

and similarly for the maxima of the absolute values.

We shall describe two possible modifications of the rank tests in the presence of ties: randomization and the method of midranks.

Randomization. Let Z_1, ..., Z_N be the pooled sample. Take independent random variables U_1, ..., U_N, uniformly R(0,1) distributed and independent of Z_1, ..., Z_N, and order the pairs (Z_1, U_1), ..., (Z_N, U_N) lexicographically: (Z_i, U_i) < (Z_j, U_j) if either Z_i < Z_j, or Z_i = Z_j and U_i < U_j. Denote R*_1, ..., R*_N the ranks of the pairs (Z_1, U_1), ..., (Z_N, U_N) in this ordering. We shall say that Z_1, ..., Z_N satisfy the hypothesis H if they are independent and identically distributed (not necessarily with an absolutely continuous distribution). Then, under H, the vector (R*_1, ..., R*_N) is uniformly distributed over the set R of permutations of 1, ..., N. We shall demonstrate this in an important special case, when Z_1, ..., Z_N take on equidistant values, e.g. when the data are rounded to k decimal places.

THEOREM 4.4.1 Let Z_1, ..., Z_N be random variables satisfying the hypothesis H which take on values from the set {a + kd : k = 0, ±1, ±2, ...}, a ∈ R^1, d > 0. Then the vector R* = (R*_1, ..., R*_N) has the probability distribution

Pr(R* = r) = 1/N!, r ∈ R.

PROOF. We may assume, without loss of generality, that Z_1, ..., Z_N take on integer values. Then the random variable T_i = Z_i + U_i is equivalent to the pair (Z_i, U_i), because Z_i = [T_i] and U_i = T_i − [T_i] with probability 1, i = 1, ..., N. Because Pr(T_i = t) = 0 for every t ∈ R^1, the distribution function of T_i is continuous. Moreover, (Z_i, U_i) < (Z_j, U_j) ⟺ T_i < T_j, hence R* is the vector of ranks of the continuous random variables T_1, ..., T_N, and the claim follows from Theorem 2.0.2. ∎

The method of midranks replaces the ranks within each group of tied observations by their average; the null distribution of the test criterion then changes, and the critical values must be modified accordingly.

Chapter 5 Tests for comparison of the treatments based on paired observations

Suppose that two treatments are compared on n pairs of subjects, the i-th pair yielding the observations (X_i, Y_i), i = 1, ..., n; the pairs are assumed independent, with a continuous bivariate distribution function F(x, y). The hypothesis H_1 of no difference between the treatments states that F is symmetric in its arguments, F(x, y) = F(y, x) for all x, y. In this section we shall consider the rank tests of H_1 against various alternatives.
5.1 Rank tests of H_1

Apply the following transformation to (X_i, Y_i), i = 1, ..., n:

Z_i = Y_i − X_i, W_i = X_i + Y_i, i = 1, ..., n. (5.2)

Under H_1, the distribution of the vector (Z_1, W_1), ..., (Z_n, W_n) is symmetric around the w-axis, while under the alternative it is shifted in the direction of the positive half-axis z. The problem of testing H_1 against such an alternative is invariant with respect to the transformations z'_i = z_i, w'_i = g(w_i), i = 1, ..., n, where g is a 1:1 function with a finite number of discontinuities. The vector (Z_1, ..., Z_n) is the maximal invariant with respect to such transformations; hence the invariant tests are exactly the functions of (Z_1, ..., Z_n), which form a random sample from some one-dimensional distribution with a continuous distribution function D. The problem of testing H_1 is then equivalent to testing

H'_1 : D(z) + D(−z) = 1, z ∈ R^1, (5.3)

stating that the distribution D is symmetric around 0, against the alternative

K'_1 : D(z + Δ) + D(−z + Δ) = 1 for all z ∈ R^1, Δ > 0, (5.4)

which means that the distribution is shifted in the direction of positive z. The distribution D is uniquely determined by the triple (p, F_1, F_2) with p = Pr(Z < 0), F_1(z) = Pr(|Z| ≤ z | Z < 0) and F_2(z) = Pr(Z ≤ z | Z > 0). Equivalent expressions for H'_1 and K'_1 are

H''_1 : p = 1/2, F_2 = F_1; K''_1 : p < 1/2, F_2 < F_1.

This problem is invariant with respect to the group of transformations G : z'_i = g(z_i), i = 1, ..., n, where g is a continuous, odd and increasing function. We can easily see that the maximal invariant with respect to G is (S_1, ..., S_m, R_1, ..., R_n), where S_1, ..., S_m are the ranks of the absolute values of the negative Z's among |Z_1|, ..., |Z_n| and R_1, ..., R_n are the ranks of the positive Z's among |Z_1|, ..., |Z_n|. Moreover, the vectors S'_1 < ... < S'_m and R'_1 < ... < R'_n of ordered ranks are sufficient for (S_1, ..., S_m, R_1, ..., R_n) and, further, one of them uniquely determines the other; hence it finally suffices to consider only, e.g., R'_1 < ...
< R'_n, and the invariant tests of H_1 [or of H'_1] depend only on R'_1 < ... < R'_n.

Let ν be the number of positive components of (Z_1, ..., Z_N). Then ν is a binomial random variable B(N, π); π = 1/2 under H_1 and, for any fixed k,

P_{H_1}(R'_1 = r_1, ..., R'_ν = r_k, ν = k) = P_{H_1}(R'_1 = r_1, ..., R'_ν = r_k | ν = k) P_{H_1}(ν = k) = (1/C(N,k)) · C(N,k) (1/2)^N = (1/2)^N (5.5)

for any k-tuple (r_1, ..., r_k), 1 ≤ r_1 < ... < r_k ≤ N. The number of such tuples is Σ_{k=0}^N C(N,k) = 2^N. The critical region of any rank test of size α = k/2^N contains just k such points (r_1, ..., r_k). However, among such critical regions there generally does not exist a uniformly most powerful one for H''_1 against K''_1. We usually consider H''_1 against the alternative of shift in location under which (Z_1, ..., Z_N) has the density q_Δ, Δ > 0:

q_Δ(z_1, ..., z_N) = Π_{i=1}^N f(z_i − Δ), (5.6)

where f is a one-dimensional symmetric density, f(−x) = f(x), x ∈ R^1; Δ = 0 under H_1 [or H''_1]. The locally most powerful rank test of H_1 against (5.6) has the critical region

Σ_{i=1}^N a+_N(R+_i, f) sign Z_i ≥ k_α, (5.7)

where R+_i is the rank of |Z_i| among |Z_1|, ..., |Z_N| and the scores a+_N(i, f) have the form

a+_N(i, f) = E φ+(U_{N:i}, f), i = 1, ..., N, with φ+(u, f) = φ((1+u)/2, f) = −f'(F^{−1}((1+u)/2))/f(F^{−1}((1+u)/2)), 0 < u < 1.

5.2 One-sample Wilcoxon test

The one-sample Wilcoxon test is the test of the form (5.7) with the linear scores a+_N(i) = i, i.e. it is based on the statistic

W+_N = Σ_{i=1}^N R+_i sign Z_i.

Denote W++_N the sum of the ranks R+_i over the positive components Z_i; obviously W+_N = 2W++_N − N(N+1)/2. We reject H_1 if W+_N ≥ C_α, i.e. if the test criterion exceeds the critical value. For large N, when the tables of critical values are not available, we may use the normal approximation:

P_{H_1}((W+_N − E W+_N)/√(var W+_N) ≤ x) → Φ(x) as N → ∞, (5.11)

where

E W+_N = 0, var W+_N = N(N+1)(2N+1)/6. (5.12)

The parameters (5.12) follow from the following proposition:

Proposition 5.2.1 Let Z be a random variable with continuous distribution function symmetric around 0, i.e. F(z) + F(−z) = 1, z ∈ R^1. Then |Z| and sign Z are independent.

PROOF. Obviously P(sign Z = 1) = P(sign Z = −1) = 1/2. Then

P(sign Z = 1, |Z| ≤ z) = P(0 < Z ≤ z) = P(−z ≤ Z < 0) = P(sign Z = −1, |Z| ≤ z) = (1/2) P(|Z| ≤ z). ∎
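The parameters (5.12) can be checked by enumeration: under H_1 the signs of Z_1, ..., Z_N are i.i.d. ±1 with probability 1/2 each and independent of the ranks of |Z_i|, so all 2^N sign vectors are equally likely. A sketch for N = 4:

```python
from itertools import product

N = 4
# Enumerate W+_N = sum_i i * s_i over all 2^N equally likely sign vectors
# (the rank of |Z_i| plays the role of the coefficient i).
vals = [sum(i * s for i, s in zip(range(1, N + 1), signs))
        for signs in product((-1, 1), repeat=N)]
mean = sum(vals) / len(vals)
var = sum(v * v for v in vals) / len(vals)

assert mean == 0
assert var == N * (N + 1) * (2 * N + 1) / 6   # = 30 for N = 4
```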
Similarly to the two-sample Wilcoxon test, the one-sample Wilcoxon test is convenient for densities of the logistic type.

5.3 Sign test

Consider a more general situation in which Z_1,...,Z_N are independent random variables, Z_i distributed according to the distribution function D_i, and not necessarily all of D_1,...,D_N are equal. This situation occurs when we compare two treatments under different experimental conditions or using different methods. In this situation we want to verify the hypothesis

    H*_1 : D_i(z) + D_i(-z) = 1,   z in R^1,   i = 1,...,N   (5.13)

of symmetry of all distributions around 0, against the alternative that all distributions are shifted toward the positive values. Such a problem is invariant with respect to all transformations of the type z'_i = f_i(z_i), i = 1,...,N, where the f_i are continuous, increasing and odd functions. The maximal invariant with respect to such transformations is the number ν of positive components. The uniformly most powerful invariant test (most powerful among the tests depending only on ν) has the form

    Φ(ν) = 1   ... ν > C_α
           γ   ... ν = C_α   (5.14)
           0   ... ν < C_α

where C_α and γ are determined by the equation

    Σ_{k > C_α} C(N,k) (1/2)^N + γ C(N, C_α) (1/2)^N = α.   (5.15)

The criterion of the sign test is simply the number of positive components among Z_1,...,Z_N, and its distribution under H*_1 is binomial B(N, 1/2). For large N we can again use the normal approximation. If all distribution functions D_1,...,D_N coincide, the sign test is the locally most powerful rank test of H1 for D of the double-exponential type with the density d(z) = (1/2) e^{-|z-Δ|}, z in R^1. When we want to use the rank test, we need not know the exact values X_i, Y_i, i = 1,...,N; it is sufficient to know the signs of the differences Y_i - X_i. This is a very convenient property of the sign test: we can use this test even for qualitative observations of the type "drug A gives better pain relief than drug B". As a matter of fact, we do not have any better test under such conditions.
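The exact sign test is easy to carry out from the binomial distribution B(N, 1/2); the following sketch (with invented outcomes) computes the observed number of positive components and the exact one-sided p-value:

```python
from math import comb

def sign_test_pvalue(z):
    """Exact one-sided p-value of the sign test: P(nu >= observed) under B(N, 1/2)."""
    N = len(z)
    nu = sum(1 for zi in z if zi > 0)          # number of positive components
    p_value = sum(comb(N, k) for k in range(nu, N + 1)) / 2 ** N
    return nu, p_value

# hypothetical qualitative outcomes: +1 if "treatment B relieved pain better than A"
z = [1, 1, -1, 1, 1, 1, -1, 1, 1, 1]
nu, p = sign_test_pvalue(z)
```

Note that only the signs enter the computation, which is exactly why the test applies to purely qualitative comparisons.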
Chapter 6

Tests of independence in bivariate population

Let (X_1, Y_1),...,(X_n, Y_n) be a random sample from a bivariate distribution with a continuous distribution function F(x,y). We want to test the hypothesis of independence

    H2 : F(x,y) = F_1(x) F_2(y),   (6.1)

where F_1 and F_2 are arbitrary distribution functions. The most natural alternative to H2 is the positive [or negative] dependence, but it is too wide and we can hardly expect to find a uniformly most powerful test against such an alternative. Instead of it we consider the alternative

    K2 : X_i = X^0_i + Δ Z_i,   Y_i = Y^0_i + Δ Z_i,   Δ > 0,   i = 1,...,n,   (6.2)

where X^0_i, Y^0_i, Z_i, i = 1,...,n are independent and their distributions are independent of i. The independence then means that Δ = 0.

Let R_1,...,R_n be the ranks of X_1,...,X_n and let S_1,...,S_n be the ranks of Y_1,...,Y_n, respectively. Under the hypothesis of independence, the vectors (R_1,...,R_n) and (S_1,...,S_n) are independent and both have the uniform distribution on the set of permutations of 1,...,n. The locally most powerful rank test of H2 against the alternative K2, in which X^0_i has the density f_1 and Y^0_i the density f_2, respectively, both densities continuously differentiable, has the critical region

    Σ_{i=1}^{n} a_n(R_i, f_1) a_n(S_i, f_2) ≥ C_α   (6.3)

with the scores a_n(i, f) given in (3.4), which are usually replaced by the approximate scores (3.7). We shall briefly describe the two best-known rank tests of independence.

6.1 Spearman test

The Spearman test is based on the correlation coefficient of (R_1,...,R_n) and (S_1,...,S_n):

    r_S = [ (1/n) Σ_{i=1}^{n} R_i S_i - R̄ S̄ ] / { [ (1/n) Σ_{i=1}^{n} (R_i - R̄)^2 ] [ (1/n) Σ_{i=1}^{n} (S_i - S̄)^2 ] }^{1/2},   (6.4)

where

    R̄ = (1/n) Σ_{i=1}^{n} R_i = S̄ = (1/n) Σ_{i=1}^{n} S_i = (n+1)/2,
    (1/n) Σ_{i=1}^{n} (R_i - R̄)^2 = (1/n) Σ_{i=1}^{n} (S_i - S̄)^2 = (n^2 - 1)/12.   (6.5)

Then we can express (6.4) in the simpler form

    r_S = [12 / (n(n^2 - 1))] Σ_{i=1}^{n} R_i S_i - 3(n+1)/(n-1).   (6.6)

The Spearman test rejects H2 if r_S > C_α, or, equivalently, if S = Σ_{i=1}^{n} R_i S_i > C*_α. In some tables we find the critical values for the statistic

    S' = Σ_{i=1}^{n} (R_i - S_i)^2,   (6.7)

for which r_S = 1 - [6 / (n(n^2 - 1))] S'.
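The equivalence of the form (6.6) and of r_S = 1 - 6S'/(n(n^2-1)) can be checked numerically; the following sketch (with invented, tie-free data) computes r_S both ways:

```python
def ranks(x):
    """Ranks of x (assumed to have no ties)."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    r = [0] * len(x)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r

def spearman(x, y):
    """r_S via formula (6.6) and via S' = sum (R_i - S_i)^2; both must agree."""
    n = len(x)
    R, S = ranks(x), ranks(y)
    rs_66 = 12.0 * sum(r * s for r, s in zip(R, S)) / (n * (n * n - 1)) \
            - 3.0 * (n + 1) / (n - 1)
    s_prime = sum((r - s) ** 2 for r, s in zip(R, S))
    rs_sp = 1.0 - 6.0 * s_prime / (n * (n * n - 1))
    return rs_66, rs_sp

x = [3.1, 1.2, 5.4, 2.2, 4.8]   # hypothetical sample
y = [2.0, 1.1, 4.9, 3.0, 4.1]
a, b = spearman(x, y)
assert abs(a - b) < 1e-12       # the two expressions for r_S coincide
```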
The test based on S' rejects H2 if S' < C'_α. For large n we use the normal approximation with

    E S' = n(n^2 - 1)/6,   var S' = n^2 (n+1)^2 (n-1)/36.

The Spearman test is locally most powerful against alternatives of the logistic type.

6.2 Quadrant test

This test is based on the criterion

    Q = (1/4) Σ_{i=1}^{n} [sign(R_i - (n+1)/2) + 1][sign(S_i - (n+1)/2) + 1]   (6.8)

and rejects H2 for large values of Q. For even n, Q equals the number of pairs (X_i, Y_i) for which X_i lies above the X-median and Y_i lies above the Y-median. The statistic Q then has, under the hypothesis H2, the hypergeometric distribution

    Pr(Q = q) = C(m,q) C(m, m-q) / C(n,m)   (6.9)

for q = 0,1,...,m, m = n/2. For large n we use the normal approximation with the parameters

    E Q = n/4,   var Q = n^2 / (16(n-1)).   (6.10)

Chapter 7

Rank test for comparison of several treatments

7.1 One-way classification

We want to compare the effects of p treatments; the experiment is organized in such a way that the i-th treatment is applied to n_i subjects with the results x_{i1},...,x_{in_i}, i = 1,...,p, Σ_{i=1}^{p} n_i = n. Then x_{i1},...,x_{in_i} is a random sample from a distribution with a distribution function F_i, i = 1,...,p. The hypothesis of no difference between the treatments can then be expressed as the hypothesis of equality of p distribution functions, namely

    H2 : F_1 = F_2 = ... = F_p,   (7.1)

and we can consider this hypothesis either against the general alternative

    K2 : F_i(x) ≠ F_j(x)   (7.2)

at least for one pair i, j and at least for some x = x_0, or against a more special alternative

    K2 : F_i(x) = F(x - Δ_i),   i = 1,...,p,   (7.3)

with Δ_i ≠ Δ_j at least for one pair i, j. The alternative (7.3) claims that the effects of the treatments on the values of the observations are additive and that at least two treatments differ in their effects. The classical test for this situation is the F-test of the analysis of variance; this test works well if we can assume that F_i ~ N(μ + α_i, σ^2), i = 1,...,p.
We obtain the usual model of the analysis of variance

    X_{ij} = μ + α_i + e_{ij},   j = 1,...,n_i,   i = 1,...,p,   (7.4)

where the e_{ij} are independent random variables with the normal distribution N(0, σ^2). The hypothesis H2 can then be reformulated as H2 : α_1 = α_2 = ... = α_p = 0. The F-test rejects the hypothesis H2 provided

    F = [(n-p)/(p-1)] · Σ_{i=1}^{p} n_i (X̄_{i·} - X̄_{··})^2 / Σ_{i=1}^{p} Σ_{j=1}^{n_i} (X_{ij} - X̄_{i·})^2 > C_α,   (7.5)

where

    X̄_{i·} = (1/n_i) Σ_{j=1}^{n_i} X_{ij},   i = 1,...,p,   and   X̄_{··} = (1/n) Σ_{i=1}^{p} Σ_{j=1}^{n_i} X_{ij},

and where the critical value C_α is found in the tables of the F-distribution with (p-1, n-p) degrees of freedom.

7.2 Kruskal-Wallis rank test

Let us order all observations x_{11},...,x_{1n_1}, x_{21},...,x_{2n_2},...,x_{p1},...,x_{pn_p} according to increasing magnitude. Let R_{i1},...,R_{in_i} be the ranks of the observations x_{i1},...,x_{in_i} and let R*_{i1} < ... < R*_{in_i} be the same ranks ordered according to increasing magnitude. Then, under the hypothesis H2,

    P_{H2}(R*_{11} = r_{11},...,R*_{1n_1} = r_{1n_1},...,R*_{p1} = r_{p1},...,R*_{pn_p} = r_{pn_p}) = Π_{i=1}^{p} n_i! / n!   (7.6)

for any permutation (r_{11},...,r_{1n_1},...,r_{p1},...,r_{pn_p}) of the numbers 1,...,n such that r_{i1} < ... < r_{in_i} for all i = 1,...,p. Denote

    R̄_{i·} = (1/n_i) Σ_{j=1}^{n_i} R_{ij},   i = 1,...,p,   R̄_{··} = (1/n) Σ_{i=1}^{p} Σ_{j=1}^{n_i} R_{ij} = (n+1)/2.   (7.7)

If we replace X̄_{i·} and X̄_{··} in (7.5) by R̄_{i·} and R̄_{··}, respectively, i = 1,...,p, we obtain a statistic proportional to the criterion of the Kruskal-Wallis test,

    K = [12 / (n(n+1))] Σ_{i=1}^{p} n_i (R̄_{i·} - (n+1)/2)^2
      = [12 / (n(n+1))] Σ_{i=1}^{p} n_i R̄_{i·}^2 - 3(n+1).   (7.8)

In the special case p = 2, the Kruskal-Wallis test reduces to the two-sided (two-sample) Wilcoxon test. We reject the hypothesis H2 provided K > C_α, where the critical value C_α is either obtained from special tables or, if p ≥ 3 and n_i ≥ 5, i = 1,...,p, we use the asymptotic critical values: it can be shown that, under H2 and for large n_1,...,n_p, the criterion K has asymptotically the χ^2 distribution with p - 1 degrees of freedom.
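The second form of the criterion in (7.8) is the one used in practice, since it needs only the rank sums of the p samples; a minimal sketch with invented, tie-free data:

```python
def kruskal_wallis(samples):
    """Kruskal-Wallis criterion K of (7.8) for p samples (assumed free of ties)."""
    pooled = [(x, i) for i, sample in enumerate(samples) for x in sample]
    pooled.sort()                                  # overall ordering of all n observations
    n = len(pooled)
    rank_sums = [0] * len(samples)                 # n_i * (mean rank of sample i)
    for rank, (_, i) in enumerate(pooled, start=1):
        rank_sums[i] += rank
    # 12/(n(n+1)) * sum n_i * Rbar_i^2 - 3(n+1), with n_i * Rbar_i^2 = (rank sum)^2 / n_i
    return 12.0 / (n * (n + 1)) * sum(rs * rs / len(s)
                                      for rs, s in zip(rank_sums, samples)) - 3.0 * (n + 1)

# hypothetical results of p = 3 treatments
samples = [[6.2, 5.1, 5.9], [7.4, 8.0, 7.1], [4.0, 4.4, 3.6]]
K = kruskal_wallis(samples)
```

For p ≥ 3 and large samples, K would be compared with a quantile of the χ^2 distribution with p - 1 degrees of freedom.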
In case of ties between the observations we replace the ranks by the midranks, similarly as in the case of the Wilcoxon test.

7.3 Two-way classification (random blocks)

We want to compare p treatments, but simultaneously we want to reduce the influence of the non-homogeneity of the sample units. Then we can organize the experiment in such a way that we divide the subjects into n homogeneous groups, so-called blocks, and compare the effects of the treatments within each block separately. The subjects in a block are usually assigned to the treatments in a random way. Let us consider the simplest of these models with n blocks, each containing p elements, where each treatment is applied just once in each block. We assume that the blocks are independent of each other. The observations can be formally described by the following table:

    Block \ Treatment    1      2      3     ...    p
    1                   x_11   x_12   x_13   ...   x_1p
    2                   x_21   x_22   x_23   ...   x_2p
    ...
    n                   x_n1   x_n2   x_n3   ...   x_np

The observation x_ij is the measured effect of the j-th treatment applied in the i-th block. We assume that the X_ij are independent random variables and that X_ij has a continuous distribution function F_ij, i = 1,...,n; j = 1,...,p. We wish to verify the hypothesis that there is no significant difference among the treatments, hence

    H3 : F_{i1} = F_{i2} = ... = F_{ip},   i = 1,...,n,   (7.9)

against the alternative

    K3 : F_{ij} ≠ F_{ik}   (7.10)

at least for one i and at least for one pair j, k, or against a more special alternative

    K3 : F_{ij}(x) = F_i(x - Δ_j),   i = 1,...,n;  j = 1,...,p,   (7.11)

    with Δ_j ≠ Δ_k at least for one pair j, k.   (7.12)

The classical test of H3 is the F-test, corresponding to the model

    X_{ij} = μ + α_i + β_j + e_{ij},   i = 1,...,n;  j = 1,...,p,   (7.13)

where the e_{ij} are independent random variables with the normal distribution N(0, σ^2). The F-test rejects H3 provided

    F = (n-1) n Σ_{j=1}^{p} (X̄_{·j} - X̄_{··})^2 / Σ_{i=1}^{n} Σ_{j=1}^{p} (X_{ij} - X̄_{i·} - X̄_{·j} + X̄_{··})^2 > C_α,   (7.14)

where X̄_{i·} = (1/p) Σ_{j=1}^{p} X_{ij}, X̄_{·j} = (1/n) Σ_{i=1}^{n} X_{ij}, X̄_{··} = (1/(np)) Σ_{i=1}^{n} Σ_{j=1}^{p} X_{ij}, and C_α is the critical value of the F-distribution with p - 1 and (p-1)(n-1) degrees of freedom.

Friedman rank test

Order the observations within each block and denote the corresponding ranks R_{i1},...,R_{ip}; i = 1,...,n. We arrange the ranks in the following table:

    Block \ Treatment    1      2      3     ...    p      Row average
    1                   R_11   R_12   R_13   ...   R_1p    (p+1)/2
    2                   R_21   R_22   R_23   ...   R_2p    (p+1)/2
    ...
    n                   R_n1   R_n2   R_n3   ...   R_np    (p+1)/2
    Column average      R̄_·1   R̄_·2   R̄_·3   ...   R̄_·p    Overall average R̄_·· = (p+1)/2

where R̄_{·j} = (1/n) Σ_{i=1}^{n} R_{ij} and R̄_{··} = (1/(np)) Σ_{i=1}^{n} Σ_{j=1}^{p} R_{ij}. The Friedman test is based on the criterion

    Q_n = [12n / (p(p+1))] Σ_{j=1}^{p} (R̄_{·j} - (p+1)/2)^2

and rejects H3 for its large values. If n → ∞, then the distribution of Q_n is approximately χ^2 with p - 1 degrees of freedom. In the case p = 2, the Friedman test reduces to the two-sided sign test. The Friedman test is applicable for the comparison of p treatments even in the situation where we observe only the ranks rather than the exact values of the treatment effects.
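Since Q_n depends only on the within-block rank sums, it is easy to compute; the following sketch (with an invented n × p table of observations, n blocks in rows and p treatments in columns) uses the algebraically equivalent form Q_n = 12/(np(p+1)) Σ_j (Σ_i R_ij)^2 - 3n(p+1):

```python
def friedman(table):
    """Friedman criterion Q_n for an n x p table (ranked within blocks, no ties in a block)."""
    n, p = len(table), len(table[0])
    col_rank_sums = [0] * p                 # sum over blocks of the within-block ranks R_ij
    for row in table:
        order = sorted(range(p), key=lambda j: row[j])
        for rank, j in enumerate(order, start=1):
            col_rank_sums[j] += rank
    # 12n/(p(p+1)) * sum_j (Rbar_.j - (p+1)/2)^2, expanded in terms of the rank sums
    return 12.0 / (n * p * (p + 1)) * sum(rs * rs for rs in col_rank_sums) \
           - 3.0 * n * (p + 1)

# hypothetical observations: 3 blocks (rows) x 3 treatments (columns)
table = [[2.1, 3.5, 1.0],
         [1.9, 4.2, 1.1],
         [2.4, 3.0, 0.7]]
Q = friedman(table)
```

Only the ordering within each block enters the computation, in line with the remark that the Friedman test needs ranks only, not the exact values.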