Receiver Operating Characteristic (ROC) Curve: A Tool for Describing and Comparing Continuous Diagnostic Tests

Ivana Horová with cooperation of Jiří Zelinka
Department of Mathematics and Statistics, Faculty of Science
Masaryk University, Czech Republic

Contents

• Introduction
• Measure of Diagnostic Accuracy
• Receiver Operating Characteristic Curve
• Estimates of ROC curves
    Parametric methods
    Nonparametric methods
    Simulations
• Summary ROC measures
    Area under curve
    Partial area under curve
    Maximum improvement of sensitivity
• Application for real data

Introduction

ROC curves and their analysis are based on statistical decision theory; they were originally developed for electronic signal detection theory (see Zhou et al. 2002 for details).

The concept of ROC curves was introduced in medicine by Lee Lusted in 1971.

Recently, ROC methodology has been used increasingly in a wide range of disciplines.

Statistical aspects of ROC analysis: many excellent books and papers are available (e.g. Pepe 2003, Zhou et al. 2002, and the list of References).

Aim: To explain the definition, properties and construction of ROC curves and to make them accessible to a general scientific audience.

Diagnostic tests play an important role in medical care and contribute significantly to health care cost. A diagnostic test has two purposes:

1. to provide reliable information about the patient's condition
2. to influence the health care provider's plan for managing the patient

A test can serve these purposes only if the health care provider knows how to interpret it.
This information is acquired through an assessment of the test's diagnostic accuracy, which is the ability of a test to correctly detect a condition when it is actually present and to correctly rule it out when it is truly absent. Two basic measures of diagnostic accuracy are sensitivity and specificity.

Measure of Diagnostic Accuracy

• $G_1$: group of subjects with a condition
• $G_0$: group of subjects without a condition
• $D = 0, 1$: random variable denoting absence or presence of the condition
• $T = 1$: positive test result
• $T = 0$: negative test result

Test results (confusion matrix)

                 Positive test, T = 1    Negative test, T = 0    Total
G1 (D = 1)       True positive (a)       False negative (b)      a + b
G0 (D = 0)       False positive (c)      True negative (d)       c + d
Total            a + c                   b + d

The sensitivity (Se) of the test is its ability to detect the condition when it is present. $Se = P(T = 1 \mid D = 1)$ is the probability that the test result is positive ($T = 1$), given that the condition is present ($D = 1$):

\[ Se = \frac{a}{a + b} \]

The specificity (Sp) of a test is its ability to exclude the condition when it is absent. $Sp = P(T = 0 \mid D = 0)$ is the probability that the test result is negative ($T = 0$), given that the condition is absent ($D = 0$):

\[ Sp = \frac{d}{c + d}, \qquad FPR = 1 - Sp = \frac{c}{c + d}, \]

where FPR is the false positive rate.

Example

The accuracy of screening mammography test results:

• 30 patients with pathology-proven breast cancer
• 30 patients without disease

The mammogram was considered positive if the mammographer recommended additional diagnostic follow-up.
Test results

Cancer status    Positive    Negative    Total
Present          29          1           30
Absent           19          11          30
Total            48          12          60

\[ Se = \frac{29}{30} = 0.967, \qquad Sp = \frac{11}{30} = 0.367, \qquad FPR = \frac{19}{30} = 0.633 \]

Test results when the mammographer used a different decision threshold

Cancer status    Positive    Negative    Total
Present          23          7           30
Absent           8           22          30
Total            31          29          60

\[ Se = \frac{23}{30} = 0.767, \qquad Sp = \frac{22}{30} = 0.733, \qquad FPR = \frac{8}{30} = 0.267 \]

Receiver Operating Characteristic (ROC) Curve

The accuracy of a medical diagnostic test is often summarized in a Receiver Operating Characteristic (ROC) curve.

The ROC curve is defined as a plot of the probability of true classification (Se) of subjects from $G_1$ against the probability of false classification ($1 - Sp$) of subjects from $G_0$, across all possible cutoff values of the given test.

Explicit formula I

• $X$: the diagnostic test variable (a one-dimensional absolutely continuous random variable)
• The subject is classified as $G_1$ if $X \ge c$ and as $G_0$ otherwise, for a given cutoff point $c \in \mathbb{R}$
• The distribution functions of the two groups are
  \[ F_0(c) = P(X \le c \mid G_0) = \int_{-\infty}^{c} f_0(x)\,dx, \qquad F_1(c) = P(X \le c \mid G_1) = \int_{-\infty}^{c} f_1(x)\,dx, \]
  where $F_0$ and $F_1$ are the distribution functions of groups $G_0$ and $G_1$, respectively, and $f_0$ and $f_1$ are the corresponding density functions.
• $F_0(c)$: the specificity (Sp) of the test
• $1 - F_1(c)$: the sensitivity (Se) of the test
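The two mammography tables above show how moving the decision threshold trades sensitivity against specificity, which is the role the cutoff $c$ plays in $F_0(c)$ and $1 - F_1(c)$. The arithmetic can be checked with a minimal Python sketch; the function name `accuracy_measures` is an illustrative assumption and is not part of the original slides.

```python
def accuracy_measures(a, b, c, d):
    """Return (Se, Sp, FPR) from confusion-matrix counts.

    a: true positives, b: false negatives (subjects from G1)
    c: false positives, d: true negatives (subjects from G0)
    """
    se = a / (a + b)          # Se = a / (a + b)
    sp = d / (c + d)          # Sp = d / (c + d)
    return se, sp, 1.0 - sp   # FPR = 1 - Sp = c / (c + d)


# Counts taken from the two mammography tables
first = accuracy_measures(29, 1, 19, 11)
second = accuracy_measures(23, 7, 8, 22)

for label, (se, sp, fpr) in (("first threshold", first),
                             ("second threshold", second)):
    print(f"{label}: Se = {se:.3f}, Sp = {sp:.3f}, FPR = {fpr:.3f}")
```

The stricter second threshold lowers the false positive rate at the price of a lower sensitivity; an ROC curve records this trade-off for every possible threshold.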
• $p$: the probability of false classification of a subject from $G_0$
• $q$: the probability of true classification of a subject from $G_1$

\[ p = 1 - F_0(c) \;\Rightarrow\; c = F_0^{-1}(1 - p), \qquad 0 \le p \le 1 \]
\[ q = 1 - F_1(c) = 1 - F_1\big(F_0^{-1}(1 - p)\big), \qquad 0 \le p \le 1 \]
\[ ROC(p) = R(p) = 1 - F_1\big(F_0^{-1}(1 - p)\big), \qquad 0 \le p \le 1 \]

The ROC curve is displayed by plotting $1 - F_1(c)$ against $1 - F_0(c)$ for a range of cutoff points $c \in \mathbb{R}$.

Notation: $X_j$, $j = 0, 1$, denotes the random variable $X$ given $D = j$; $X_0$ and $X_1$ are independent.

[Figure: densities $f_0(x)$ and $f_1(x)$ with a cutoff $c$ and the regions TP, FP, FN, TN marked.]

TP: true positive, FP: false positive, FN: false negative, TN: true negative

\[ f_0(x) = \frac{1}{\sqrt{\pi}}\, e^{-x^2}, \qquad f_1(x) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{(x-2)^2}{2}} \]

[Figure: the densities with cutoff $c_1$ and the corresponding point $(p_1, q_1)$ in the ROC plot.]

\[ c_1 = -1: \quad p_1 = 1 - F_0(c_1) = \int_{-1}^{\infty} f_0(x)\,dx = 0.9214, \quad q_1 = 1 - F_1(c_1) = \int_{-1}^{\infty} f_1(x)\,dx = 0.9987, \quad Sp = 0.0786, \; Se = 0.9987 \]

[Figure: the densities with cutoff $c_2$; points $(p_1, q_1)$ and $(p_2, q_2)$ in the ROC plot.]

\[ c_2 = 0: \quad p_2 = 1 - F_0(c_2) = \int_{0}^{\infty} f_0(x)\,dx = 0.5, \quad q_2 = 1 - F_1(c_2) = \int_{0}^{\infty} f_1(x)\,dx = 0.9772, \quad Sp = 0.5, \; Se = 0.9772 \]

[Figure: the densities with cutoff $c_3$; points $(p_1, q_1)$ to $(p_3, q_3)$ in the ROC plot.]

\[ c_3 = 1: \quad p_3 = 1 - F_0(c_3) = \int_{1}^{\infty} f_0(x)\,dx = 0.0786, \quad q_3 = 1 - F_1(c_3) = \int_{1}^{\infty} f_1(x)\,dx = 0.8413, \quad Sp = 0.9214, \; Se = 0.8413 \]
[Figure: the densities with cutoff $c_4$; points $(p_1, q_1)$ to $(p_4, q_4)$ in the ROC plot.]

\[ c_4 = 2: \quad p_4 = 1 - F_0(c_4) = \int_{2}^{\infty} f_0(x)\,dx = 0.0023, \quad q_4 = 1 - F_1(c_4) = \int_{2}^{\infty} f_1(x)\,dx = 0.5, \quad Sp = 0.9977, \; Se = 0.5 \]

[Figure: the densities with cutoff $c_5$; points $(p_1, q_1)$ to $(p_5, q_5)$ in the ROC plot.]

\[ c_5 = 3: \quad p_5 = 1 - F_0(c_5) = \int_{3}^{\infty} f_0(x)\,dx = 0.00001, \quad q_5 = 1 - F_1(c_5) = \int_{3}^{\infty} f_1(x)\,dx = 0.1587, \quad Sp = 0.99999, \; Se = 0.1587 \]

[Figure: the ROC curve through the points $(p_1, q_1), \dots, (p_5, q_5)$.]

\[ p_i = 1 - F_0(c_i) = \int_{c_i}^{\infty} f_0(x)\,dx, \qquad q_i = 1 - F_1(c_i) = \int_{c_i}^{\infty} f_1(x)\,dx, \qquad i = 1, \dots, 5 \]

Point (1, 1): all subjects are classified as belonging to $G_1$.
Point (0, 0): all subjects are classified as belonging to $G_0$.

Extreme cases

[Figure: ROC plot, sensitivity against 1-specificity.]
A perfectly accurate test: sensitivity is 1.0 when 1-specificity is 0.0.

[Figure: ROC plot, sensitivity against 1-specificity.]
A perfectly inaccurate test: patients with the condition are incorrectly classified as negative and patients without the condition are incorrectly classified as positive.

[Figure: ROC plot, sensitivity against 1-specificity.]
The chance diagonal: the test is not usable for separating the patients.

Diagnostic tests with ROC curves above the chance diagonal have at least some ability to discriminate between patients with and without the condition.
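The five points $(p_i, q_i)$ of the cutoff example can be recomputed directly from $p = 1 - F_0(c)$ and $q = 1 - F_1(c)$. The Python sketch below is only an illustration: it reads the example densities as $f_0 \sim N(0, 1/2)$ and $f_1 \sim N(2, 1)$, and the helper names `roc_point` and `roc` are assumptions made here, not names from the slides.

```python
from statistics import NormalDist

# Example densities: f0(x) = exp(-x^2)/sqrt(pi) is N(0, 1/2), f1 is N(2, 1)
G0 = NormalDist(mu=0.0, sigma=0.5 ** 0.5)
G1 = NormalDist(mu=2.0, sigma=1.0)


def roc_point(c):
    """Return (p, q) = (1 - F0(c), 1 - F1(c)) for the cutoff c."""
    return 1.0 - G0.cdf(c), 1.0 - G1.cdf(c)


def roc(p):
    """R(p) = 1 - F1(F0^{-1}(1 - p)); inv_cdf requires 0 < p < 1."""
    return 1.0 - G1.cdf(G0.inv_cdf(1.0 - p))


for c in (-1, 0, 1, 2, 3):
    p, q = roc_point(c)
    print(f"c = {c:2d}: p = {p:.5f}, q = {q:.4f}, "
          f"Sp = {1 - p:.5f}, Se = {q:.4f}")
```

Evaluating `roc(p)` at the computed `p` returns the matching `q`, which is the sense in which the curve can be traced either by sweeping the cutoff or by the explicit formula.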
ROC curve close to the perfectly accurate one

[Figure: the densities $f_0$ and $f_1$ and the resulting ROC curve; axes: 1-specificity, sensitivity.]

\[ f_0(x) = \frac{1}{\sqrt{\pi}}\, e^{-(x+1)^2}, \qquad f_1(x) = \frac{1}{0.5\sqrt{2\pi}}\, e^{-\frac{(x-1)^2}{2 \cdot 0.5^2}} \]

Explicit formula II

• $ROC(p) = R(p)$ is the distribution function of $1 - F_0(X_1)$, i.e.
• $R(p)$ is the non-null distribution function of the p-value $1 - F_0(X_1)$ for testing the null hypothesis that an individual comes from $G_0$

\[ V = 1 - F_0(X_1) \]
\[ F_V(p) = P(V \le p) = P\big(1 - F_0(X_1) \le p\big) = P\big(X_1 \ge F_0^{-1}(1 - p)\big) = 1 - F_1\big(F_0^{-1}(1 - p)\big) = R(p) \]

Estimates of ROC curves

Parametric methods (see DeLong et al. 1988, Zhou et al. 2002, Pepe 2003)

$X$: diagnostic variable with density
\[ f_X(x) = \alpha_0 f_0(x) + \alpha_1 f_1(x), \qquad \alpha_0 + \alpha_1 = 1, \quad \alpha_0, \alpha_1 \ge 0. \]

$f_0$, $f_1$: normal (Gaussian) densities with means $\mu_0$, $\mu_1$ and variances $\sigma_0^2$, $\sigma_1^2$, respectively,
\[ f_j(x) = \frac{1}{\sigma_j \sqrt{2\pi}}\, e^{-\frac{(x - \mu_j)^2}{2\sigma_j^2}}, \qquad j = 0, 1. \]

The ROC curve:
\[ R(p) = \Phi\big(a + b\,\Phi^{-1}(p)\big), \qquad a = \frac{\mu_1 - \mu_0}{\sigma_1}, \quad b = \frac{\sigma_0}{\sigma_1}, \]
where $\Phi$ is the standard normal distribution function,
\[ \Phi(x) = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{t^2}{2}}\,dt. \]

Nonparametric methods

The empirical ROC curve: $F_0$ and $F_1$ are replaced by their empirical distribution functions.

Kernel methods: $F_0$ and $F_1$ are estimated by kernel methods (e.g. Azzalini 1981, Lejeune and Sarda 1992, Altman and Léger 1995, Zou, Hall and Shapiro 1997, Bowman et al. 1998, Lloyd 1998, Lloyd and Yong 1999, Hall and Hyndman 2003, Zhou et al. 2002, Zhou and Harezlak 2002, Peng and Zhou 2004).

Nonparametric estimates of distribution function

Let $Z_1, \dots, Z_n$ be a random sample from a random variable $Z$ with distribution function $F$.

Empirical distribution function:
\[ F_n(x) = \frac{1}{n} \sum_{i=1}^{n} I(Z_i \le x). \]

Kernel estimate of the distribution function:
\[ \hat F_h(x) = \frac{1}{n} \sum_{i=1}^{n} W\!\left(\frac{x - Z_i}{h}\right), \qquad W(x) = \int_{-1}^{x} K(t)\,dt \]

• $K$: a kernel, i.e. a non-negative symmetric function supported on $[-1, 1]$ and integrating to unity, for example
  \[ K(x) = \frac{15}{16}\,(1 - x^2)^2\, I_{[-1,1]}(x), \qquad K(x) = \frac{3}{4}\,(1 - x^2)\, I_{[-1,1]}(x) \]
• $h$: a smoothing parameter (bandwidth); $h = h(n)$ is a sequence of nonrandom positive numbers with $h \to 0$, $nh \to \infty$ as $n \to \infty$.

Nonparametric estimates of ROC curve

Notation
• Independent samples $X_{0,1}, \dots, X_{0,n_0}$ from $G_0$ and $X_{1,1}, \dots, X_{1,n_1}$ from $G_1$, with distribution functions $F_0$ and $F_1$, respectively, are at hand.

Empirical ROC curve

$F_0$ and $F_1$ are replaced by their empirical distribution functions
\[ F_{n_j}(x) = \frac{1}{n_j} \sum_{i=1}^{n_j} I(X_{j,i} \le x), \qquad j = 0, 1, \]
and, with $\hat F_j = F_{n_j}$,
\[ \hat R(p) = 1 - \hat F_1\big(\hat F_0^{-1}(1 - p)\big). \]

Smooth kernel estimate of ROC curve

Estimates of $F_0$ and $F_1$:
\[ \hat F_{j,h_j}(x) = \frac{1}{n_j} \sum_{i=1}^{n_j} W\!\left(\frac{x - X_{j,i}}{h_j}\right), \qquad j = 0, 1 \]

Kernel formula I
\[ \hat R(p) = 1 - \hat F_{1,h_1}\big(\hat F_{0,h_0}^{-1}(1 - p)\big), \qquad 0 \le p \le 1, \qquad h_j = O\big(n_j^{-1/3}\big), \quad j = 0, 1 \]

Bandwidths that are optimal for $\hat F_j$, $j = 0, 1$, need not be optimal for $\hat R(p)$.

Kernel formula II
\[ \hat R(p) = \hat F_{V,\tilde h_1}(p) = \frac{1}{n_1} \sum_{i=1}^{n_1} W\!\left(\frac{p - \big(1 - \hat F_{0,h_0}(X_{1,i})\big)}{\tilde h_1}\right), \]
\[ \hat F_{0,h_0}(X_{1,i}) = \frac{1}{n_0} \sum_{j=1}^{n_0} W\!\left(\frac{X_{1,i} - X_{0,j}}{h_0}\right), \qquad W(x) = \int_{-1}^{x} K(t)\,dt \]
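Kernel formula II amounts to a kernel distribution-function estimate applied to the estimated p-values $V_i = 1 - \hat F_{0,h_0}(X_{1,i})$, which the following Python sketch makes explicit. It is a minimal illustration under stated assumptions: the Epanechnikov kernel is used, the data are simulated from hypothetical groups $N(0, 1)$ and $N(1.5, 1)$, the bandwidths `h0 = 0.8` and `h1_tilde = 0.1` are arbitrary illustrative values rather than optimal ones, and the function names are invented for this example.

```python
import random
from statistics import NormalDist


def W(x):
    """Integral of the Epanechnikov kernel K(t) = 3/4 (1 - t^2) from -1 to x."""
    if x <= -1.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    return 0.25 * (2.0 + 3.0 * x - x ** 3)


def kernel_cdf(sample, h):
    """Kernel estimate x -> (1/n) * sum_i W((x - Z_i) / h)."""
    n = len(sample)
    return lambda x: sum(W((x - z) / h) for z in sample) / n


def kernel_roc_ii(x0, x1, h0, h1_tilde):
    """Kernel formula II: smooth the values V_i = 1 - F0_hat(X_1i)."""
    f0_hat = kernel_cdf(x0, h0)
    v = [1.0 - f0_hat(x) for x in x1]
    # R_hat(p) = (1/n1) * sum_i W((p - V_i) / h1_tilde)
    return kernel_cdf(v, h1_tilde)


# Hypothetical simulated data and illustrative (not optimal) bandwidths
rng = random.Random(0)
x0 = [rng.gauss(0.0, 1.0) for _ in range(200)]
x1 = [rng.gauss(1.5, 1.0) for _ in range(200)]
roc_hat = kernel_roc_ii(x0, x1, h0=0.8, h1_tilde=0.1)

# Binormal ROC curve of the generating model: R(p) = Phi(a + b * Phi^{-1}(p))
std = NormalDist()
a, b = 1.5 / 1.0, 1.0 / 1.0
for p in (0.1, 0.3, 0.5, 0.7, 0.9):
    true_r = std.cdf(a + b * std.inv_cdf(p))
    print(f"p = {p:.1f}: kernel estimate = {roc_hat(p):.3f}, "
          f"true R(p) = {true_r:.3f}")
```

Unlike Kernel formula I, this form needs no numerical inversion of $\hat F_{0,h_0}$, which is one reason it is convenient to implement; the price is a second bandwidth on the $[0, 1]$ scale.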
Problems with bandwidth selection

\[ \hat F_h(x) = \frac{1}{n} \sum_{i=1}^{n} W\!\left(\frac{x - Z_i}{h}\right), \qquad W(x) = \int_{-1}^{x} K(t)\,dt \]

Mean Integrated Square Error ($E$ denotes expectation):
\[ MISE(\hat F_h) = E \int \big(\hat F_h(x) - F(x)\big)^2\,dx \]

Optimal bandwidth minimizing (asymptotically) $MISE(\hat F_h)$, provided that $F \in C^2$:
\[ h_{opt} = n^{-1/3} \left(\frac{c_1}{\beta_2^2\, \psi_2}\right)^{1/3}, \]
\[ c_1 = \int_{-1}^{1} W(x)\big(1 - W(x)\big)\,dx > 0, \qquad \beta_2 = \int_{-1}^{1} x^2 K(x)\,dx, \qquad \psi_2 = \int \big(F''(x)\big)^2\,dx \]

Methods for estimation of the optimal bandwidth:

• Terrell and Scott (1985), Terrell (1990): the maximal smoothing principle
• Sarda (1993): a cross-validation method
• Altman and Léger (1995), Zhou and Harezlak (2002): the reference (Gaussian) density method
• Lloyd and Yong (1999): a more complex bandwidth selection procedure based on a two-stage plug-in method
• Zhou et al. (2002): the bandwidths optimal for density estimates
• Hall and Hyndman (2003): a method that allows interaction between the distributions of the two groups
• Peng and Zhou (2004): a method based on local linear smoothing
• Horová and Zelinka (2007): an iterative method
• Horová et al. (2007)

Simulations

Normal data

[Figure: true and estimated ROC curves; axes: 1-specificity, sensitivity.]

Legend:
• dash-dot line: true ROC curve, $f_0(x) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{x^2}{2}}$, $f_1(x) = \frac{1}{\sqrt{\pi}}\, e^{-(x-1)^2}$
• dashed line: Kernel formula I, $h_0 = 0.8595$, $h_1 = 0.6781$
• solid line: Kernel formula II, $h_0 = 0.8595$, $\tilde h_1 = 0.0982$
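The constants in the formula for $h_{opt}$ can be evaluated numerically. The Python sketch below computes $c_1$ and $\beta_2$ for the Epanechnikov kernel and then, as an assumption in the spirit of the reference (Gaussian) density method listed above, replaces $\psi_2$ by its value for a normal distribution with standard deviation $\sigma$, namely $\psi_2 = 1/(4\sqrt{\pi}\,\sigma^3)$. It is not necessarily the selector that produced the bandwidths reported on the slides, and the names `integrate` and `reference_bandwidth` are hypothetical.

```python
import math


def K(x):
    """Epanechnikov kernel on [-1, 1]."""
    return 0.75 * (1.0 - x * x) if abs(x) <= 1.0 else 0.0


def W(x):
    """Integral of K from -1 to x."""
    if x <= -1.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    return 0.25 * (2.0 + 3.0 * x - x ** 3)


def integrate(f, a, b, n=2000):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


c1 = integrate(lambda x: W(x) * (1.0 - W(x)), -1.0, 1.0)  # 9/35 for this kernel
beta2 = integrate(lambda x: x * x * K(x), -1.0, 1.0)      # 1/5 for this kernel


def reference_bandwidth(sample):
    """h_opt with psi_2 taken from a normal reference distribution (assumption)."""
    n = len(sample)
    mean = sum(sample) / n
    sigma = math.sqrt(sum((z - mean) ** 2 for z in sample) / (n - 1))
    psi2 = 1.0 / (4.0 * math.sqrt(math.pi) * sigma ** 3)
    return n ** (-1.0 / 3.0) * (c1 / (beta2 ** 2 * psi2)) ** (1.0 / 3.0)


print(f"c1 = {c1:.6f}, beta2 = {beta2:.6f}")
print(f"h for 100 equally spaced values on [0, 1]: "
      f"{reference_bandwidth([i / 99 for i in range(100)]):.4f}")
```

Under the normal reference the bandwidth is proportional to $\sigma\, n^{-1/3}$, so it scales with the spread of the data, which is consistent with the very different bandwidth magnitudes reported for the different data sets.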
Exponential data

[Figure: true and estimated ROC curves; axes: 1-specificity, sensitivity.]

Legend:
• dash-dot line: true ROC curve, $f_0(x) = e^{-x}$, $f_1(x) = \frac{1}{5}\, e^{-\frac{x}{5}}$
• dashed line: Kernel formula I, $h_0 = 0.3424$, $h_1 = 2.2254$
• solid line: Kernel formula II, $h_0 = 0.3424$, $\tilde h_1 = 0.0010$

Simulation study

We generated 1000 random samples of normally distributed random variables to test the quality of the kernel estimates of the ROC curve:
\[ X_{0,i} \sim N(0, 1), \qquad X_{1,i} \sim N(1.5, 0.5), \qquad i = 1, \dots, 100. \]

The following figures show the bounds (yellow area) containing all estimates of the ROC curve for both kernel formulae (dashed blue lines), together with the true ROC curve (solid red line).

[Figure: two panels, "Kernel formula I" and "Kernel formula II".]

Summary ROC measures

• Area under ROC curve (AUC)
• Partial area under ROC curve (PAUC)
• Specificity corresponding to maximum improvement of sensitivity (MIS)

Area under curve

• The most commonly used global index of diagnostic accuracy is the area under the ROC curve (AUC).
• The area under the ROC curve is the probability that a pair of individuals known to be from different groups will be correctly classified:
  \[ AUC(R(p)) = \int_0^1 R(p)\,dp \]
• A simple calculation shows that the area under the ROC curve is exactly equal to the probability $P(X_0 < X_1)$:
  \[ AUC(R(p)) = P(X_0 < X_1) \]
• Values of AUC close to 1.0 indicate that the test has high diagnostic accuracy.
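The identity $AUC = \int_0^1 R(p)\,dp = P(X_0 < X_1)$ can be checked numerically. The Python sketch below does so for two hypothetical normal groups, $N(0, 1/2)$ and $N(2, 1)$: one function integrates $R(p)$ with a midpoint rule and another estimates $P(X_0 < X_1)$ as the share of correctly ordered pairs in simulated samples. The function names, sample sizes and seed are assumptions chosen for illustration.

```python
import random
from statistics import NormalDist

# Two hypothetical groups: G0 ~ N(0, 1/2), G1 ~ N(2, 1)
G0 = NormalDist(mu=0.0, sigma=0.5 ** 0.5)
G1 = NormalDist(mu=2.0, sigma=1.0)


def roc(p):
    """R(p) = 1 - F1(F0^{-1}(1 - p)) for 0 < p < 1."""
    return 1.0 - G1.cdf(G0.inv_cdf(1.0 - p))


def auc_by_integration(n=20000):
    """Midpoint rule for the integral of R(p) over (0, 1)."""
    return sum(roc((k + 0.5) / n) for k in range(n)) / n


def auc_by_pairs(x0, x1):
    """Share of pairs with X0 < X1; ties are counted as one half."""
    wins = sum((u < v) + 0.5 * (u == v) for u in x0 for v in x1)
    return wins / (len(x0) * len(x1))


rng = random.Random(1)
x0 = [rng.gauss(G0.mean, G0.stdev) for _ in range(400)]
x1 = [rng.gauss(G1.mean, G1.stdev) for _ in range(400)]

print(f"integral of R(p):           {auc_by_integration():.4f}")
print(f"share of pairs with X0<X1:  {auc_by_pairs(x0, x1):.4f}")
```

The first number is deterministic, while the second varies with the simulated sample; the two should agree up to sampling error.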
Empirical AUC

The empirical AUC: calculate the trapezoidal area under each vertical slice of the empirical ROC curve having a straight-line segment as its top, then sum all the individual areas.

\[ AUC_{emp} = \frac{1}{n_0 n_1} \sum_{i=1}^{n_1} \sum_{j=1}^{n_0} \Psi(X_{0j}, X_{1i}), \]

where $X_{01}, \dots, X_{0n_0}$ and $X_{11}, \dots, X_{1n_1}$ are independent samples from $F_0$ and $F_1$, respectively, and

\[ \Psi(X_{0j}, X_{1i}) = \begin{cases} 1 & X_{1i} > X_{0j}, \\ \tfrac{1}{2} & X_{1i} = X_{0j}, \\ 0 & \text{otherwise,} \end{cases} \qquad j = 1, \dots, n_0, \quad i = 1, \dots, n_1. \]

Remark: It is analogous to the Mann-Whitney U-statistic.

AUC for binormal model

• $X$ (a diagnostic test variable) has density $f_X(x) = \alpha_0 f_0(x) + \alpha_1 f_1(x)$, with
  \[ f_j(x) = \frac{1}{\sigma_j \sqrt{2\pi}}\, e^{-\frac{(x - \mu_j)^2}{2\sigma_j^2}}, \qquad j = 0, 1. \]
  \[ R(p) = \Phi\big(a + b\,\Phi^{-1}(p)\big), \qquad a = \frac{\mu_1 - \mu_0}{\sigma_1}, \quad b = \frac{\sigma_0}{\sigma_1} \]
  \[ AUC = \Phi\!\left(\frac{a}{\sqrt{1 + b^2}}\right) \]
• $\Phi$: the standard normal distribution function

Nonparametric methods of estimates of AUC

Composite trapezoidal rule

The estimates of $F_0$ and $F_1$ are evaluated on some grid $\{x_r \in \mathbb{R};\ r = 0, \dots, N\}$, mostly $x_r = x_0 + r\,t$, $t > 0$. The kernel estimate $\hat R$ of the ROC curve is formed by the points $[p_r, \hat R(p_r)]$, where
\[ p_r = 1 - \hat F_{0,h_0}(x_r), \qquad \hat R(p_r) = 1 - \hat F_{1,h_1}(x_r), \qquad r = 0, \dots, N. \]
$p_r$ is non-increasing in $r$. The composite trapezoidal rule yields
\[ \widehat{AUC} = \sum_{r=1}^{N} \frac{1}{2}\,(p_{r-1} - p_r)\big(\hat R(p_{r-1}) + \hat R(p_r)\big) = \frac{1}{2} \sum_{r=1}^{N} \big(\hat F_{0,h_0}(x_r) - \hat F_{0,h_0}(x_{r-1})\big)\big(2 - \hat F_{1,h_1}(x_{r-1}) - \hat F_{1,h_1}(x_r)\big) \]

The 1st method of kernel estimation of AUC (Kernel 1)

In terms of distribution functions, AUC can be expressed as
\[ AUC = P(X_0 < X_1) = P(X_0 - X_1 < 0) = F_{X_0 - X_1}(0) = F^c(0), \]
where $F^c = F_{X_0 - X_1}$ is the distribution function of the random variable $Y = X_0 - X_1$.
Then a kernel estimate of $F^c$ is
\[ \hat F^c_{h_0,h_1}(x) = \frac{1}{n_0 n_1} \sum_{i=1}^{n_1} \sum_{j=1}^{n_0} W\!\left(\frac{x - (X_{0j} - X_{1i})}{\sqrt{h_0^2 + h_1^2}}\right), \]
where $h_0$ and $h_1$ are the bandwidths for $F_0$ and $F_1$, respectively (Lloyd 1998). Hence the kernel estimate $AUC_I$ of AUC is given by
\[ AUC_I = \hat F^c_{h_0,h_1}(0) = \frac{1}{n_0 n_1} \sum_{i=1}^{n_1} \sum_{j=1}^{n_0} W\!\left(\frac{X_{1i} - X_{0j}}{\sqrt{h_0^2 + h_1^2}}\right) \]

The 2nd method of kernel estimation of AUC (Kernel 2)

An estimate of $F^c$ by means of a single bandwidth $h$, i.e.
\[ \hat F^c_h(x) = \frac{1}{n_0 n_1} \sum_{i=1}^{n_1} \sum_{j=1}^{n_0} W\!\left(\frac{x - (X_{0j} - X_{1i})}{h}\right), \]
and
\[ h^{F^c}_{opt} = (n_0 n_1)^{-1/3} \left(\frac{c_1}{\beta_2^2\, \psi^c_2}\right)^{1/3}, \qquad h^{F^c}_{opt} = O\big((n_0 n_1)^{-1/3}\big), \]
where $\psi^c_2 = \int \big(F^{c\,\prime\prime}(x)\big)^2\,dx$, and
\[ AUC_{II} = \hat F^c_h(0) = \frac{1}{n_0 n_1} \sum_{i=1}^{n_1} \sum_{j=1}^{n_0} W\!\left(\frac{X_{1i} - X_{0j}}{h}\right). \]

The 3rd method of kernel estimation of AUC (Kernel 3)

This method uses Kernel formula II for the ROC estimate. By direct integration we get
\[ AUC_{III} = \int_0^1 \hat R(p)\,dp = \int_0^1 \hat F_{V,\tilde h_1}(p)\,dp = \frac{1}{n_1} \sum_{i=1}^{n_1} \int_0^1 W\!\left(\frac{p - \big(1 - \hat F_{0,h_0}(X_{1,i})\big)}{\tilde h_1}\right) dp, \]
\[ \hat F_{0,h_0}(X_{1,i}) = \frac{1}{n_0} \sum_{j=1}^{n_0} W\!\left(\frac{X_{1,i} - X_{0,j}}{h_0}\right), \qquad W(x) = \int_{-1}^{x} K(t)\,dt \]

This method is useful for evaluating the partial area under the curve.

Partial area under ROC curve

AUC: the average performance over the entire range of possible sensitivities and specificities.

Problems:
• Two different curves can have the same area
• Not all regions of the ROC curve are of equal clinical importance
• Clinically relevant sensitivities or specificities are often somewhere away from the ends of the ROC curve
• PAUC: a partial area under the curve, i.e. the area between two specificities or two sensitivities

\[ PAUC_I = \int_{p_1}^{p_2} R(p)\,dp, \qquad p_i = 1 - F_0(c_i), \quad i = 1, 2 \qquad \text{(between two specificities)} \]
\[ PAUC_{II} = \int_{\tilde p_1}^{\tilde p_2} R(p)\,dp, \qquad \tilde p_i = 1 - F_0\big(F_1^{-1}(1 - \tilde q_i)\big), \quad i = 1, 2 \qquad \text{(between two sensitivities } \tilde q_1, \tilde q_2\text{)} \]

The choice of the appropriate range depends on the clinical setting.

Maximum improvement of sensitivity over chance diagonal (MIS)

MIS: the maximum difference between the observed sensitivity and the sensitivity on the chance diagonal over all values of specificity. The corresponding 1-specificity is denoted by $p_{MIS}$.

[Figure: ROC curve with MIS and $p_{MIS}$ marked; axes: 1-specificity, sensitivity.]

\[ MIS = R(p_{MIS}) - p_{MIS} \]

A different point of view: Assume $R(p)$ is concave. $p_{MIS}$ is defined as the argument of the maximum of the function $Q(p) = R(p) - p$, i.e. the zero of $Q'(p)$:
\[ Q'(p) = R'(p) - 1, \qquad R'(p) = \frac{f_1(c)}{f_0(c)}, \qquad c = F_0^{-1}(1 - p) \]
\[ Q'(p) = 0: \quad \frac{f_1(\theta)}{f_0(\theta)} = 1 \;\Rightarrow\; f_1(\theta) = f_0(\theta) \]

[Figure: the densities $f_0$ and $f_1$ intersecting at $\theta$.]

\[ \theta = F_0^{-1}(1 - p_{MIS}), \qquad p_{MIS} = 1 - F_0(\theta) \]

$R''(p) < 0 \Rightarrow p_{MIS}$ realizes the maximum of $Q(p) = R(p) - p$.

Explanation: $p_{MIS}$ is the point where the tangent to the ROC curve has slope equal to 1.

Application for real data

Leukaemia data (Data provided by Faculty Hospital Brno)

Fusion gene (FG) is the most common chromosomal aberration in acute leukaemias. A detectable FG at the end of induction therapy predicts relapse with high probability. However, its detection with a sensitivity of at least one malignant cell among 10 000 normal cells is not successful in all patients. Wilms Tumour Gene (WT1) is a tumour suppressor gene expressed in malignant and normal hematopoietic progenitor cells.
Because WT1 has been shown to be expressed in the vast majority of patients with acute leukaemias, the relevance of WT1 mRNA expression for prognosis and possible prediction of relapse was investigated. WT1 expression and FG occurrence were followed in CD34+ peripheral blood progenitor cells collected from 59 leukaemic patients in first remission. 29 patients were in group $G_0$ (without FG) and 30 in group $G_1$.

The question: Does higher expression of WT1 indicate FG occurrence?

Leukaemia data

[Figure: empirical and kernel ROC curves; axes: 1-specificity, sensitivity.]

Legend, bandwidths and AUC:
• dash-dot line: empirical ROC curve
• dashed line: Kernel formula I, $h_0 = 3.9740$, $h_1 = 4.5904$, $AUC_I = 0.6397$
• solid line: Kernel formula II, $h_0 = 3.9740$, $\tilde h_1 = 0.0728$, $AUC_{III} = 0.6314$

Head trauma data (Source of the data: see Zhou et al. 2002)

The binormal model and the kernel method were used to process the second real data set. We consider the use of the CK-BB isoenzyme measured within 24 hours of injury for predicting the outcome of severe head trauma. We are interested in determining which patients have a poor outcome after suffering a severe head trauma.

60 patients: 19 had moderate to full recovery and 41 eventually had poor or no recovery.

We use the ROC curve to assess the discrimination between patients with and without a poor outcome.

Question: Is the CK-BB isoenzyme a good predictor of the outcome?

The data do not satisfy the normality assumption, and the binormal model gives worse results in this case. To improve them, some transformation of the data (logarithmic, Box-Cox) should be used.
Head trauma data

[Figure: empirical, kernel and binormal ROC curves; axes: 1-specificity, sensitivity.]

Legend, bandwidths and AUC:
• dash-dot line: empirical ROC curve
• dashed line: Kernel formula I, $h_0 = 175.5347$, $h_1 = 300.7402$, $AUC_I = 0.7896$
• solid line: Kernel formula II, $h_0 = 175.5347$, $\tilde h_1 = 0.0013$, $AUC_{III} = 0.8148$
• dash-dot line: binormal model

Pancreatic cancer data (Source of the data: see Zhou and Harezlak 2002)

The kernel methods were applied to a real data set from the Mayo Clinic, where sera from a group of 51 'control' patients with pancreatitis and 90 'case' patients with pancreatic cancer were studied with a carbohydrate antigen assay (CA19-9).

We study the relative accuracy of the biomarker CA19-9 for 90 patients with the condition and 51 patients without the condition.

Pancreatic cancer data

[Figure: empirical and kernel ROC curves; axes: 1-specificity, sensitivity.]

Legend, bandwidths and AUC:
• dash-dot line: empirical ROC curve
• dashed line: Kernel formula I, $h_0 = 1.0176$, $h_1 = 2.6694$, $AUC_I = 0.8460$
• solid line: Kernel formula II, $h_0 = 1.0176$, $\tilde h_1 = 0.000908$, $AUC_{III} = 0.8593$

Salmon data (Source of the data: see Johnson and Wichern 1992)

Salmon have a remarkable life cycle. They are born in freshwater streams and after a year or two swim into the ocean. After a couple of years in salt water they return to their place of birth to spawn and die. At the time they are about to return as mature fish, they are harvested while still in the ocean. To help regulate catches, samples of fish taken during the harvest must be identified as coming from Alaskan or Canadian waters.
The fish carry some information about their birthplace in the growth rings on their scales. Typically, the rings associated with freshwater growth are smaller for Alaskan-born than for Canadian-born salmon.

$X_0$: diameter of the rings for the first-year freshwater growth of Alaskan-born salmon (hundredths of an inch)
$X_1$: diameter of the rings for the first-year freshwater growth of Canadian-born salmon (hundredths of an inch)

Samples of sizes $n_0 = n_1 = 50$

Question: Is the diameter of the rings for the first-year freshwater growth a suitable indicator of the origin of the salmon?

Salmon data

[Figure: empirical and kernel ROC curves; axes: 1-specificity, sensitivity.]

Legend, bandwidths and AUC:
• dash-dot line: empirical ROC curve
• dashed line: Kernel formula I, $h_0 = 21.5009$, $h_1 = 24.2317$, $AUC_I = 0.9253$
• solid line: Kernel formula II, $h_0 = 21.5009$, $\tilde h_1 = 0.0026$, $AUC_{III} = 0.9371$

Hurricanes Data (Source of the data: http://sunsite.univie.ac.at/textbooks/statistics/stclatre.html)

Suppose you have records of the longitude and latitude coordinates at which 37 storms reached hurricane strength, for two classifications of hurricanes: Baro hurricanes and Trop hurricanes. The fictitious data were presented for illustrative purposes by Elsner, Lehmiller, and Kimberlain (1996), who investigated the differences between baroclinic and tropical North Atlantic hurricanes.

The longitude coordinates were taken as the response variable $X$ for the first ROC curve and the latitude coordinates for the second one.

$X_0$: longitude (latitude) coordinates for Trop hurricanes
$X_1$: longitude (latitude) coordinates for Baro hurricanes

Question: Are the longitude or latitude coordinates usable for classification of the hurricanes?
Hurricanes Data: longitude coordinates

[Figure: empirical and kernel ROC curves; axes: 1-specificity, sensitivity.]

Legend, bandwidths and AUC:
• dash-dot line: empirical ROC curve
• dashed line: Kernel formula I, $h_0 = 6.8773$, $h_1 = 7.5679$, $AUC_I = 0.4440$
• solid line: Kernel formula II, $h_0 = 6.8773$, $\tilde h_1 = 0.5611$, $AUC_{III} = 0.4456$

Hurricanes Data: latitude coordinates

[Figure: empirical and kernel ROC curves; axes: 1-specificity, sensitivity.]

Legend, bandwidths and AUC:
• dash-dot line: empirical ROC curve
• dashed line: Kernel formula I, $h_0 = 3.3794$, $h_1 = 3.5817$, $AUC_I = 0.9258$
• solid line: Kernel formula II, $h_0 = 3.3794$, $\tilde h_1 = 0.0983$, $AUC_{III} = 0.9441$

Conclusion

ROC curves have found useful application in diagnostic medicine. Ongoing development in ROC analysis will address more complex types of diagnostic situations and will likely expand the applicability of ROC analysis.

References

[1] Altman, N., Léger, Ch.: Bandwidth selection for kernel distribution function estimation, Journal of Stat. Planning and Inference, 46, pp. 195–214, 1995.
[2] Azzalini, A.: A note on the estimation of a distribution function and quantiles by a kernel method, Biometrika, 68, No. 1, pp. 326–328, 1981.
[3] Bowman, A., Hall, P., Prvan, T.: Bandwidth selection for the smoothing of distribution functions, Biometrika, 85, No. 4, pp. 799–808, 1998.
[4] Bradley, A. P.: The Use of the Area Under the ROC Curve in the Evaluation of Machine Learning Algorithms, Pattern Recognition, 30, No. 7, pp. 1145–1159, 1997.
[5] DeLong, E. R., DeLong, D. M., Clarke-Pearson, D. L.: Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach, Biometrics, 44, pp. 837–845, 1988.
[6] Elsner, J. B., Lehmiller, G. S., Kimberlain, T. B.: Objective classification of Atlantic basin hurricanes, Journal of Climate, 9, pp. 2880–2889, 1996.
[7] Eng, J.: Receiver Operating Characteristic Analysis: A Primer, Academic Radiology, 12, pp. 909–916, 1999.
[8] Faraggi, D., Reiser, B.: Estimation of the area under the ROC curve, Statistics in Medicine, 21, pp. 3093–3106, 2002.
[9] Forbelská, M.: Elliptically Contoured Models in Classification, Proceedings of Summer School Matlab'06, Masaryk University, pp. 77–94, 2007.
[10] Forbelská, M.: Elliptically Contoured Models in ROC Analysis, S.Co.2007, Fifth Conference, Mantovan P., Pastore A., Tonellato S. (Eds.), Book of Short Papers, pp. 243–248, 2007.
[11] Gu, J., Ghosal, S., Roy, A.: Non-parametric estimation of ROC curve, http://www4.stat.ncsu.edu/ sghosal/papers/ROCBB.pdf
[12] Hall, P. H., Hyndman, R. J.: Improved methods for bandwidth selection when estimating ROC curves, Statistics & Probability Letters, 64, pp. 181–189, 2003.
[13] Hanley, J. A., McNeil, B. J.: The meaning and use of the area under the receiver operating characteristic (ROC) curve, Radiology, 143, pp. 29–36, 1982.
[14] Hanley, J. A., McNeil, B. J.: A method of comparing the areas under receiver operating characteristic curves derived from the same cases, Radiology, 148, pp. 839–843, 1983.
[15] Horová, I., Zelinka, J.: Contribution to the bandwidth choice for kernel density estimates, Computational Statistics, 22, No. 1, pp. 31–47, 2007.
[16] Horová, I., Koláček, J., Zelinka, J., El-Shaarawi, A.: Smooth Estimates of Distribution Functions with Application in Environmental Studies, Accepted for MABE'08, 2007.
[17] Jiang, Y., Metz, C. E., Nishikawa, R. M.: A receiver operating characteristic partial area index for highly sensitive diagnostic tests, Radiology, 201, pp. 745–750, 1996.
[18] Johnson, R. A., Wichern, D. W.: Applied multivariate statistical analysis, Prentice-Hall, Inc., 1992.
[19] Lloyd, C. J.: Using smoothed receiver operating characteristic curves to summarize and compare diagnostic systems, JASA, 93, No. 444, pp. 1356–1364, 1998.
[20] Lloyd, C. J.: Estimation of a convex ROC curve, Statistics & Probability Letters, 59, pp. 99–111, 2002.
[21] Lloyd, C. J., Zhou Yong: Kernel estimators of the ROC curve are better than empirical, Statistics & Probability Letters, 44, pp. 221–228, 1999.
[22] Lusted, L. B.: Signal Detectability and Medical Decision-Making, Science, 171, pp. 1217–1219, 1971.
[23] O'Malley, A. J., Zou, K. H., Fielding, J. R., Tempany, C. M. C.: Bayesian Regression Methodology for Estimating a Receiver Operating Characteristic Curve with Two Radiologic Applications, Academic Radiology, 8, No. 8, pp. 713–725, 2001.
[24] Metz, C. E., Herman, B. A., Shen, J-H.: Maximum likelihood estimation of receiver operating characteristic (ROC) curves from continuously-distributed data, Statistics in Medicine, 17, pp. 1033–1053, 1998.
[25] Obuchowski, N. A.: Nonparametric Analysis of Clustered ROC Curve Data, Biometrics, 53, No. 2, pp. 567–578, 1997.
50 References [26] Peng L., Zhou X-H.: Local linear smoothing of receiver operating characteristic (ROC) curves, 118, pp. 129–143, 2004. [27] Pepe, M. S.: An Interpretation for the ROC Curve and Inference Using GLM Procedures, Biometrics, 56, No. 2, pp. 352–359, 2000. [28] Pepe, M. S.: The Statistical Evaluation of Medical Tests for Classification and Prediction, Oxford Statistical Science Series, Oxford University Press, 2003. [29] Pepe, M. S.: Receiver Operating Characteristic Methodology, JASA, 95, No. 449, 2000. [30] Qin J., Zhang B.: A goodness-of-fit test for logistic regression models based on case-control data, Biometrika, 84, No. 3, pp. 609–618, 1997. [31] Terrell, G. R.: The maximal smoothing principle in density estimation. Journal of the American Statistical Association. 85, No. 410, pp. 440-447, 1990. Receiver Operating Characteristic (ROC) Curve:A Tool for Describing and ComparingContinuous Diagnostic Tests – p. 50 References [32] Terrel, G. R., Scott, D. W.: Oversmoothed nonparametric density estimates. Journal of the American Statistical Association, 80, 209-214, 1985. [33] Venkatraman E. S., Begg C. B.: A distribution-free procedure for comparing receiver operating characteristic curves from a paired experiment, Biometrika, 83, No. 4, pp. 835–848, 1996. [34] Wan S., Zhang B.: Smooth semiparametric receiver operating characteristic curves for continuous diagnostic tests, Statistics in Medicine, 26, pp. 2565–2586, 2007. [35] Wand, I.P., Jones, I.C.: Kernel smoothing. Chapman & Hall, London, 1995. [36] Wieand S., Gail M. H., James B. R., James K. L.: A Family of Nonparametric Statistics for Comparing Diagnostic Markers with Paired or Unpaired Data, Biometrika, 76, No. 3, pp. 585–592, 1989. Receiver Operating Characteristic (ROC) Curve:A Tool for Describing and ComparingContinuous Diagnostic Tests – p. 50 References [37] Zhang D. D., Zhou X-H., Freeman D. H. Jr., Freeman J. 
L.: A non-parametric method for the comparison of partial areas under ROC curves and its application to large health care data sets, Statistics in Medicine, 21, pp. 701–715, 2002. [38] Zhou X.–H., Harezlak J.: Comparison of bandwidth selection methods for kernel smoothing of ROC curves, Statistics in Medicine, 21, 2045–2055, 2002. [39] Zhou X.–H.,Obuchowski N. A., McClish D. K.: Statistical Methods in Diagnostic Medicine, Wiley–Interscience, 2002. [40] Zou K. H., Hall W. J., Shapiro D. E.: Smooth non-parametric receiver operating characteristic (ROC) curves for continuous diagnostic tests, Statistics in Medicine, 16, 2143–2156, 1997. Receiver Operating Characteristic (ROC) Curve:A Tool for Describing and ComparingContinuous Diagnostic Tests – p. 50