Bi7740: Scientific computing
Resampling methods: bootstrapping and permutations
Vlad Popovici, popovici@iba.muni.cz
Institute of Biostatistics and Analyses, Masaryk University, Brno

Supplemental bibliography
- Efron, Tibshirani: An Introduction to the Bootstrap. Chapman & Hall, 1993.
- Good: Resampling Methods. A Practical Guide to Data Analysis. 3rd ed., Birkhäuser, 2006.

Outline
1 Bootstrapping
  - Introduction
  - Empirical distribution and the plug-in principle
  - Improved bootstrap confidence intervals
  - Bootstrapping for hypothesis tests
2 Permutation tests
  - Introduction
  - Example/exercise

Introduction
- resampling technique for statistical inference: assessing uncertainty
- especially useful when no assumptions can be made about the underlying model
- gives confidence intervals without distributional assumptions
- there are many versions of bootstrapping

Example (from Efron & Tibshirani, 1993):

  Group     Heart attacks   Subjects
  aspirin       104          11037
  placebo       189          11034

The odds ratio is

  $$\hat{\theta} = \frac{104/11037}{189/11034} = 0.55,$$

so it seems that aspirin reduced the incidence of heart attacks.

The log-odds ratio is approximately normally distributed, so we use it to construct a 95% CI. The standard error is

  $$SE(\log OR) = \sqrt{1/104 + 1/189 + 1/11037 + 1/11034} = 0.1228,$$

giving a 95% CI for $\log\theta$:

  $$\log\hat{\theta} \pm 1.96 \times SE(\log OR) = (-0.839, -0.357),$$

with the corresponding 95% CI for $\theta$ obtained by exponentiating: (0.432, 0.700).

At the same time, aspirin seems to have a detrimental effect on strokes:

  Group     Strokes   Subjects
  aspirin     119      11037
  placebo      98      11034

which leads to an odds ratio of $\hat{\theta} = 1.21$ with a 95% CI of (0.925, 1.583).
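As a quick sanity check (our own addition, not part of the original slides), the stroke CI can be reproduced in R using the same normal approximation:

  # Normal-approximation 95% CI for the stroke odds ratio (sketch)
  or <- (119/11037) / (98/11034)                # ~ 1.21
  se <- sqrt(1/119 + 1/98 + 1/11037 + 1/11034)  # SE of log(OR)
  exp(log(or) + c(-1, 1) * qnorm(0.975) * se)   # ~ (0.93, 1.58)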
...and how the bootstrap would proceed to infer the CI:
- create a sample for the treatment group (s1) and one for the placebo group (s2), as vectors containing as many 1s as there are cases
- draw with replacement a random sample from s1 and one from s2, each of the same size as its group
- compute the odds ratio based on the drawn samples
- repeat the process a number of times and record all the computed odds ratios
- using their empirical distribution, estimate the CI of interest

  set.seed(1)
  n1 <- 11037; n1.cases <- 119
  n2 <- 11034; n2.cases <- 98
  s1 <- c(rep(1, n1.cases), rep(0, n1 - n1.cases))
  s2 <- c(rep(1, n2.cases), rep(0, n2 - n2.cases))
  B <- 1000                              # number of bootstrap samples
  p <- n2 / n1
  theta <- rep(0, B)
  for (i in 1:B) {
    theta[i] <- p * sum(sample(s1, n1, replace = TRUE)) /
                    sum(sample(s2, n2, replace = TRUE))
  }
  hist(theta)
  quantile(theta, probs = c(0.025, 0.975))
  ##      2.5%     97.5%
  ## 0.9365309 1.5711275

[Figure: histogram of the bootstrap replicates theta]

- the CI estimated from the quantiles is neither the most precise nor the most efficient that can be obtained by bootstrapping
- it works for symmetric, close-to-normal distributions of the bootstrap replicates

The empirical distribution
- the underlying probability distribution F generates the observed sample: $F \to (x_1, \ldots, x_n) = \mathbf{x}$
- the empirical distribution $\hat{F}$ is the discrete distribution that puts probability $1/n$ on each value $x_i$, $i = 1, \ldots, n$
- $\hat{F}$ assigns to a set $A$ in the sample space of $\mathbf{x}$ its empirical probability:

  $$\mathrm{Prob}\{A\} = \frac{\#\{x_i \in A\}}{n} = \mathrm{Prob}_{\hat{F}}\{A\}$$
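A toy R illustration (our own example) of what $\hat{F}$ looks like:

  # Empirical distribution: mass 1/n at each observation (toy example)
  x <- c(1.2, 3.4, 2.2, 5.1, 3.3)
  Fhat <- ecdf(x)     # step function, jumping by 1/5 at each x_i
  Fhat(3)             # empirical Prob{X <= 3} = #{x_i <= 3}/n = 2/5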
- a parameter is a functional of the distribution function, $\theta = t(F)$; example: the mean, $\mu(F) = \int x \, dF(x)$
- a statistic is a function of the sample $\mathbf{x}$; example: the sample average, $\hat{\mu} = \frac{1}{n}\sum_{i=1}^{n} x_i$
- the plug-in estimate of a parameter $\theta = t(F)$ is defined to be $\hat{\theta} = t(\hat{F})$ (sometimes called a summary statistic, estimate, or estimator)

Bootstrap estimate of the standard error
- bootstrap sample: $\hat{F} \to (x^*_1, \ldots, x^*_n) = \mathbf{x}^*$ (resampling with replacement)
- let $\hat{\theta} = s(\mathbf{x})$ be an estimate of the parameter of interest
- the question is: what is the standard error of this estimate?
- a bootstrap replication of $\hat{\theta}$ is $\hat{\theta}^* = s(\mathbf{x}^*)$
- the ideal bootstrap estimate of the SE is $se_{\hat{F}}(\hat{\theta}^*)$, i.e. the standard error of $\hat{\theta}$ for data sets of size $n$ randomly sampled from $\hat{F}$
- unfortunately, closed-form formulas exist only for some estimators

General form of the bootstrap method
- by resampling with replacement from $\mathbf{x}$, one samples from the empirical distribution $\hat{F}$
- $\mathbf{x}^*_b$ are the bootstrap samples of size $n$
- $s(\mathbf{x}^*_b) = \hat{\theta}^*_b$ are the bootstrap replications of the parameter of interest $\theta$

Bootstrap estimation of standard errors
1 select B independent bootstrap samples $\mathbf{x}^*_1, \ldots, \mathbf{x}^*_B$
2 evaluate the bootstrap replicate on each bootstrap sample: $\hat{\theta}^*_b = s(\mathbf{x}^*_b)$, $b = 1, 2, \ldots, B$
3 estimate the standard error $se_{\hat{F}}(\hat{\theta})$ by the sample standard deviation of the B replications:

  $$\widehat{se}_B = \sqrt{\frac{1}{B-1}\sum_{b=1}^{B}\left(\hat{\theta}^*_b - \hat{\theta}^*_0\right)^2}, \qquad \text{where } \hat{\theta}^*_0 = \frac{1}{B}\sum_{b=1}^{B}\hat{\theta}^*_b$$

Implement the previous procedure in R (one possible solution is sketched below):
- write a function bstrap.nonparam(x, B, s, ...) which generates B bootstrap samples $\mathbf{x}^*_b$ and for each of them computes the bootstrap replicate of the parameter: $\hat{\theta}^*_b = s(\mathbf{x}^*_b, \ldots)$
- write a function bstrap.theta0(T) which computes the bootstrap estimate of the parameter ($\hat{\theta}^*_0$), given the bootstrap replicates in the vector T
- write a function bstrap.se(T) which computes the bootstrap estimate of the standard error of the parameter ($\widehat{se}_B$), given the bootstrap replicates in the vector T
- use the Rainfall data set to compute the bootstrap estimates of the mean and the median, and the corresponding standard errors
- HOMEWORK: compare with the textbook results! (discuss!)
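One possible solution (a sketch, assuming the Rainfall data comes from the bootstrap package; not the only valid implementation):

  # Nonparametric bootstrap helpers (one possible solution)
  bstrap.nonparam <- function(x, B, s, ...) {
    # B bootstrap replicates of the statistic s, evaluated on resamples of x
    sapply(1:B, function(b) s(sample(x, length(x), replace = TRUE), ...))
  }
  bstrap.theta0 <- function(T) mean(T)  # bootstrap estimate of the parameter
  bstrap.se <- function(T) sd(T)        # sample SD of the replicates (se_B)

  # Example use (assumes the Rainfall data from the 'bootstrap' package):
  # library(bootstrap)
  # T <- bstrap.nonparam(Rainfall, 2000, mean)
  # bstrap.theta0(T); bstrap.se(T)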
Bias-corrected and accelerated (BCa) CI
- the quantile-based CI is neither tight enough nor robust
- idea: better exploit the quantiles of the empirical distribution by correcting the bias and improving the convergence
- simple bootstrap quantile-based CI: for a $(1 - 2\alpha)$ coverage, the bounds of the CI are given by $(\hat{\theta}^{*(\alpha)}, \hat{\theta}^{*(1-\alpha)})$, where $\hat{\theta}^{*(q)}$ is the $q$-th quantile of the bootstrap replicates

The BCa CI is given by $(\hat{\theta}^{*(\alpha_1)}, \hat{\theta}^{*(\alpha_2)})$, where

  $$\alpha_1 = \Phi\left(\hat{z}_0 + \frac{\hat{z}_0 + z^{(\alpha)}}{1 - \hat{a}(\hat{z}_0 + z^{(\alpha)})}\right), \qquad \alpha_2 = \Phi\left(\hat{z}_0 + \frac{\hat{z}_0 + z^{(1-\alpha)}}{1 - \hat{a}(\hat{z}_0 + z^{(1-\alpha)})}\right)$$

and where
- $\Phi(\cdot)$ is the standard normal CDF
- $z^{(q)}$ is the $q$-th quantile of the standard normal distribution
- $\hat{a}$ and $\hat{z}_0$ are cleverly chosen, as follows

The parameters of the BCa CI:

  $$\hat{z}_0 = \Phi^{-1}\left(\frac{\#\{\hat{\theta}^*_b < \hat{\theta}\}}{B}\right), \qquad \hat{a} = \frac{\sum_{i=1}^{n}\left(\hat{\theta}_{(\cdot)} - \hat{\theta}_{(i)}\right)^3}{6\left[\sum_{i=1}^{n}\left(\hat{\theta}_{(\cdot)} - \hat{\theta}_{(i)}\right)^2\right]^{3/2}}$$

where $\hat{\theta}_{(i)}$ is the value of the parameter computed on the vector $\mathbf{x}$ with the $i$-th component removed (the jackknife values of the parameter), and $\hat{\theta}_{(\cdot)} = \sum_{i=1}^{n} \hat{\theta}_{(i)}/n$.

Exercise: implement the BCa procedure in R (a sketch follows):
- write a function bstrap.bca(x, B, s, ..., alpha=c(0.025, 0.05)) that returns the lower and upper bounds of the CI computed by the BCa method
- you can use (call) the previous function bstrap.nonparam
- compute the 90% and 95% BCa CIs for the mean of the Rainfall data: bstrap.bca(Rainfall, 2000, mean)
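A compact sketch following the formulas above (our own code; in practice one would rather use boot::boot.ci with type = "bca"):

  # BCa confidence interval (sketch following the formulas above)
  bstrap.bca <- function(x, B, s, ..., alpha = c(0.025, 0.05)) {
    theta <- s(x, ...)                      # estimate on the full sample
    T <- bstrap.nonparam(x, B, s, ...)      # bootstrap replicates
    z0 <- qnorm(sum(T < theta) / B)         # bias-correction term
    jack <- sapply(1:length(x), function(i) s(x[-i], ...))  # jackknife values
    d <- mean(jack) - jack
    a <- sum(d^3) / (6 * sum(d^2)^1.5)      # acceleration term
    zq <- qnorm(c(alpha, 1 - alpha))        # z^(alpha) and z^(1-alpha)
    probs <- pnorm(z0 + (z0 + zq) / (1 - a * (z0 + zq)))
    quantile(T, probs = probs)              # one pair of bounds per alpha
  }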
Important properties of BCa CIs
- transformation respecting: the bounds of the CI transform correctly if the parameter is changed by some function; e.g., the CI for the $\sqrt{\cdot}$-transformed parameter is obtained by taking $\sqrt{\cdot}$ of the bounds of the CI for the parameter itself
- second order accurate: convergence rate of $1/n$ to the true coverage

Bootstrapping for tests
- consider two possibly different distributions F and G, with $F \to \mathbf{z} = (z_1, \ldots, z_n)$ and $G \to \mathbf{y} = (y_1, \ldots, y_m)$
- hypotheses: $H_0: F = G$ vs. $H_1: F \neq G$
- $F = G \Leftrightarrow \mathrm{Prob}_F\{A\} = \mathrm{Prob}_G\{A\}$ for all sets $A$
- observe a test statistic $\hat{\theta}$ (e.g. the difference of means)
- achieved significance level (ASL): the probability of observing that large a value under $H_0$:

  $$ASL = \mathrm{Prob}_{H_0}\{\hat{\theta}^* \geq \hat{\theta}\}$$

Bootstrap hypothesis testing procedure
1 choose a test statistic (not necessarily a parameter): $t(\mathbf{x})$; for example, $t(\mathbf{x}) = \bar{z} - \bar{y}$
2 draw B samples of size $n + m$ from $\mathbf{x} = (\mathbf{z}, \mathbf{y})$ and call the first $n$ observations $\mathbf{z}^*$ and the remaining $m$ observations $\mathbf{y}^*$
3 evaluate $t(\cdot)$ on each sample: $t(\mathbf{x}^*_b)$ (for example, $t(\mathbf{x}^*_b) = \bar{z}^*_b - \bar{y}^*_b$), for $b = 1, 2, \ldots, B$
4 approximate the ASL by $ASL_{boot} = \#\{t(\mathbf{x}^*_b) \geq t(\mathbf{x})\}/B$

Exercise (a sketch follows):
- consider the data vectors mouse.c and mouse.t for the control and treatment arms of an experiment (some clinical variable)
- implement the bootstrap hypothesis testing procedure
- use the test statistic

  $$t(\mathbf{x}) = \frac{\bar{z} - \bar{y}}{\bar{\sigma}\sqrt{1/n + 1/m}}, \qquad \bar{\sigma} = \sqrt{\frac{\sum_{i=1}^{n}(z_i - \bar{z})^2 + \sum_{i=1}^{m}(y_i - \bar{y})^2}{n + m - 2}}$$
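A minimal sketch of the procedure (assuming mouse.c and mouse.t as shipped with the bootstrap package):

  # Bootstrap two-sample test (sketch); data from the 'bootstrap' package
  library(bootstrap)
  tstat <- function(z, y) {
    n <- length(z); m <- length(y)
    sigma <- sqrt((sum((z - mean(z))^2) + sum((y - mean(y))^2)) / (n + m - 2))
    (mean(z) - mean(y)) / (sigma * sqrt(1/n + 1/m))
  }
  set.seed(1)
  z <- mouse.t; y <- mouse.c
  n <- length(z); m <- length(y)
  x <- c(z, y)                               # pooled sample, as under H0: F = G
  t.obs <- tstat(z, y)
  t.star <- replicate(2000, {
    xb <- sample(x, n + m, replace = TRUE)   # resample from the pooled data
    tstat(xb[1:n], xb[(n + 1):(n + m)])
  })
  mean(t.star >= t.obs)                      # ASL.boot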
Implementations in R
- many R packages implement various bootstrapping procedures
- the bootstrap package contains the data sets and functions accompanying the book by Efron and Tibshirani
- the boot package contains many well-tested and well-documented functions

Permutation tests
- nonparametric testing procedure
- allow testing hypotheses when the properties of the test statistic under the null hypothesis are not known
- do not make assumptions about the data
- work on small data sets
- idea: generate the distribution of the test statistic under the null hypothesis from the data itself

- exact permutation test: for (very) small data sets, generate all permutations and compute the corresponding test statistics
- random (Monte Carlo) permutation test: for large data sets, generate a number of random permutations and compute the test statistic for each
- test procedure: count how many times the test statistic from the permutations is more extreme than the observed test statistic, and reject $H_0$ if this proportion is below the predefined $\alpha$-level

Example: two-population tests (a sketch follows)
- consider the data vectors mouse.c and mouse.t for the control and treatment arms of an experiment (some clinical variable)
- implement a permutation testing procedure for testing $H_0$: there is no difference in the clinical variable between control and treatment, vs. $H_1$: there is a difference
- which test statistic? what to permute? how many permutations?
- what should be changed if the test were about the superiority of treatment over control?

[Figure: histogram of the permutation distribution of the test statistic d]
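One possible permutation-test sketch for this exercise (our own code; the difference of means is used as the test statistic and a two-sided p-value is computed):

  # Random permutation test for mouse.t vs mouse.c (two-sided, sketch)
  library(bootstrap)                  # provides mouse.t, mouse.c
  set.seed(1)
  z <- mouse.t; y <- mouse.c
  n <- length(z); m <- length(y)
  x <- c(z, y)
  d.obs <- mean(z) - mean(y)          # observed difference of means
  d <- replicate(5000, {
    xp <- sample(x)                   # random permutation of group labels
    mean(xp[1:n]) - mean(xp[(n + 1):(n + m)])
  })
  mean(abs(d) >= abs(d.obs))          # two-sided permutation p-value
  hist(d)                             # permutation null distribution
  # for a one-sided (superiority) test, use mean(d >= d.obs) instead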