Bi7740: Scientific computing
Resampling methods: bootstrapping and permutations
Vlad Popovici, popovici@iba.muni.cz
Institute of Biostatistics and Analyses, Masaryk University, Brno

Supplemental bibliography
- Efron, Tibshirani: An Introduction to the Bootstrap. Chapman & Hall, 1993.
- Good: Resampling Methods. A Practical Guide to Data Analysis. 3rd ed., Birkhäuser, 2006.

Outline
1 Bootstrapping
  - Introduction
  - Empirical distribution and the plug-in principle
  - Improved bootstrap confidence intervals
  - Bootstrapping for hypothesis tests
2 Permutation tests
  - Introduction
  - Example/exercise

Introduction
- resampling technique for statistical inference: assessing uncertainty
- especially useful when no assumptions can be made about the underlying model
- gives confidence intervals without distributional assumptions
- there are many versions of bootstrapping

Example (from Efron & Tibshirani, 1993):

  Group     Heart attacks   Subjects
  aspirin       104          11037
  placebo       189          11034

The odds ratio is

  $$\hat{\theta} = \frac{104/11037}{189/11034} = 0.55,$$

so it seems that aspirin reduced the incidence of heart attacks.

The log-odds ratio is approximately normally distributed, so we use it to construct a 95% CI. The standard error is

  $$SE(\log OR) = \sqrt{1/104 + 1/189 + 1/11037 + 1/11034} = 0.1228,$$

giving a 95% CI for $\log\theta$:

  $$\log\hat{\theta} \pm 1.96 \times SE(\log OR) = (-0.839, -0.357),$$

with the corresponding 95% CI for $\theta$ obtained by exponentiating: (0.432, 0.700).

At the same time, aspirin seems to have a detrimental effect on strokes:

  Group     Strokes   Subjects
  aspirin     119      11037
  placebo      98      11034

which leads to an odds ratio of $\hat{\theta} = 1.21$ with a 95% CI of (0.925, 1.583).
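As a quick sanity check (our own addition, not part of the original slides), the stroke CI can be reproduced in R using the same normal approximation:

  # Normal-approximation 95% CI for the stroke odds ratio (sketch)
  or <- (119/11037) / (98/11034)                # ~ 1.21
  se <- sqrt(1/119 + 1/98 + 1/11037 + 1/11034)  # SE of log(OR)
  exp(log(or) + c(-1, 1) * qnorm(0.975) * se)   # ~ (0.93, 1.58)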
...and how the bootstrap would proceed to infer the CI:
- create a sample for the treatment group (s1) and one for the placebo group (s2), as vectors containing as many 1s as there are cases
- draw with replacement a random sample from s1 and one from s2, each of the same size as its group
- compute the odds ratio based on the drawn samples
- repeat the process a number of times and record all the computed odds ratios
- using their empirical distribution, estimate the CI of interest

  set.seed(1)
  n1 <- 11037; n1.cases <- 119
  n2 <- 11034; n2.cases <- 98
  s1 <- c(rep(1, n1.cases), rep(0, n1 - n1.cases))
  s2 <- c(rep(1, n2.cases), rep(0, n2 - n2.cases))
  B <- 1000                              # number of bootstrap samples
  p <- n2 / n1
  theta <- rep(0, B)
  for (i in 1:B) {
    theta[i] <- p * sum(sample(s1, n1, replace = TRUE)) /
                    sum(sample(s2, n2, replace = TRUE))
  }
  hist(theta)
  quantile(theta, probs = c(0.025, 0.975))
  ##      2.5%     97.5%
  ## 0.9365309 1.5711275

[Figure: histogram of the bootstrap replicates theta]

- the CI estimated from the quantiles is neither the most precise nor the most efficient that can be obtained by bootstrapping
- it works for symmetric, close-to-normal distributions of the bootstrap replicates

The empirical distribution
- the underlying probability distribution F generates the observed sample: $F \to (x_1, \ldots, x_n) = \mathbf{x}$
- the empirical distribution $\hat{F}$ is the discrete distribution that puts probability $1/n$ on each value $x_i$, $i = 1, \ldots, n$
- $\hat{F}$ assigns to a set $A$ in the sample space of $\mathbf{x}$ its empirical probability:

  $$\mathrm{Prob}\{A\} = \frac{\#\{x_i \in A\}}{n} = \mathrm{Prob}_{\hat{F}}\{A\}$$
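A toy R illustration (our own example) of what $\hat{F}$ looks like:

  # Empirical distribution: mass 1/n at each observation (toy example)
  x <- c(1.2, 3.4, 2.2, 5.1, 3.3)
  Fhat <- ecdf(x)     # step function, jumping by 1/5 at each x_i
  Fhat(3)             # empirical Prob{X <= 3} = #{x_i <= 3}/n = 2/5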
- a parameter is a functional of the distribution function, $\theta = t(F)$; example: the mean, $\mu(F) = \int x \, dF(x)$
- a statistic is a function of the sample $\mathbf{x}$; example: the sample average, $\hat{\mu} = \frac{1}{n}\sum_{i=1}^{n} x_i$
- the plug-in estimate of a parameter $\theta = t(F)$ is defined to be $\hat{\theta} = t(\hat{F})$ (sometimes called a summary statistic, estimate, or estimator)

Bootstrap estimate of the standard error
- bootstrap sample: $\hat{F} \to (x^*_1, \ldots, x^*_n) = \mathbf{x}^*$ (resampling with replacement)
- let $\hat{\theta} = s(\mathbf{x})$ be an estimate of the parameter of interest
- the question is: what is the standard error of this estimate?
- a bootstrap replication of $\hat{\theta}$ is $\hat{\theta}^* = s(\mathbf{x}^*)$
- the ideal bootstrap estimate of the SE is $se_{\hat{F}}(\hat{\theta}^*)$, i.e. the standard error of $\hat{\theta}$ for data sets of size $n$ randomly sampled from $\hat{F}$
- unfortunately, closed-form formulas exist only for some estimators

General form of the bootstrap method
- by resampling with replacement from $\mathbf{x}$, one samples from the empirical distribution $\hat{F}$
- $\mathbf{x}^*_b$ are the bootstrap samples of size $n$
- $s(\mathbf{x}^*_b) = \hat{\theta}^*_b$ are the bootstrap replications of the parameter of interest $\theta$

Bootstrap estimation of standard errors
1 select B independent bootstrap samples $\mathbf{x}^*_1, \ldots, \mathbf{x}^*_B$
2 evaluate the bootstrap replicate on each bootstrap sample: $\hat{\theta}^*_b = s(\mathbf{x}^*_b)$, $b = 1, 2, \ldots, B$
3 estimate the standard error $se_{\hat{F}}(\hat{\theta})$ by the sample standard deviation of the B replications:

  $$\widehat{se}_B = \sqrt{\frac{1}{B-1}\sum_{b=1}^{B}\left(\hat{\theta}^*_b - \hat{\theta}^*_0\right)^2}, \qquad \text{where } \hat{\theta}^*_0 = \frac{1}{B}\sum_{b=1}^{B}\hat{\theta}^*_b$$

Implement the previous procedure in R (one possible solution is sketched below):
- write a function bstrap.nonparam(x, B, s, ...) which generates B bootstrap samples $\mathbf{x}^*_b$ and for each of them computes the bootstrap replicate of the parameter: $\hat{\theta}^*_b = s(\mathbf{x}^*_b, \ldots)$
- write a function bstrap.theta0(T) which computes the bootstrap estimate of the parameter ($\hat{\theta}^*_0$), given the bootstrap replicates in the vector T
- write a function bstrap.se(T) which computes the bootstrap estimate of the standard error of the parameter ($\widehat{se}_B$), given the bootstrap replicates in the vector T
- use the Rainfall data set to compute the bootstrap estimates of the mean and the median, and the corresponding standard errors
- HOMEWORK: compare with the textbook results! (discuss!)
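One possible solution (a sketch, assuming the Rainfall data comes from the bootstrap package; not the only valid implementation):

  # Nonparametric bootstrap helpers (one possible solution)
  bstrap.nonparam <- function(x, B, s, ...) {
    # B bootstrap replicates of the statistic s, evaluated on resamples of x
    sapply(1:B, function(b) s(sample(x, length(x), replace = TRUE), ...))
  }
  bstrap.theta0 <- function(T) mean(T)  # bootstrap estimate of the parameter
  bstrap.se <- function(T) sd(T)        # sample SD of the replicates (se_B)

  # Example use (assumes the Rainfall data from the 'bootstrap' package):
  # library(bootstrap)
  # T <- bstrap.nonparam(Rainfall, 2000, mean)
  # bstrap.theta0(T); bstrap.se(T)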
Bias-corrected and accelerated (BCa) CI
- the quantile-based CI is neither tight enough nor robust
- idea: better exploit the quantiles of the empirical distribution by correcting the bias and improving the convergence
- simple bootstrap quantile-based CI: for a $(1 - 2\alpha)$ coverage, the bounds of the CI are given by $(\hat{\theta}^{*(\alpha)}, \hat{\theta}^{*(1-\alpha)})$, where $\hat{\theta}^{*(q)}$ is the $q$-th quantile of the bootstrap replicates

The BCa CI is given by $(\hat{\theta}^{*(\alpha_1)}, \hat{\theta}^{*(\alpha_2)})$, where

  $$\alpha_1 = \Phi\left(\hat{z}_0 + \frac{\hat{z}_0 + z^{(\alpha)}}{1 - \hat{a}(\hat{z}_0 + z^{(\alpha)})}\right), \qquad \alpha_2 = \Phi\left(\hat{z}_0 + \frac{\hat{z}_0 + z^{(1-\alpha)}}{1 - \hat{a}(\hat{z}_0 + z^{(1-\alpha)})}\right)$$

and where
- $\Phi(\cdot)$ is the standard normal CDF
- $z^{(q)}$ is the $q$-th quantile of the standard normal distribution
- $\hat{a}$ and $\hat{z}_0$ are cleverly chosen, as follows

The parameters of the BCa CI:

  $$\hat{z}_0 = \Phi^{-1}\left(\frac{\#\{\hat{\theta}^*_b < \hat{\theta}\}}{B}\right), \qquad \hat{a} = \frac{\sum_{i=1}^{n}\left(\hat{\theta}_{(\cdot)} - \hat{\theta}_{(i)}\right)^3}{6\left[\sum_{i=1}^{n}\left(\hat{\theta}_{(\cdot)} - \hat{\theta}_{(i)}\right)^2\right]^{3/2}}$$

where $\hat{\theta}_{(i)}$ is the value of the parameter computed on the vector $\mathbf{x}$ with the $i$-th component removed (the jackknife values of the parameter), and $\hat{\theta}_{(\cdot)} = \sum_{i=1}^{n} \hat{\theta}_{(i)}/n$.

Exercise: implement the BCa procedure in R (a sketch follows):
- write a function bstrap.bca(x, B, s, ..., alpha=c(0.025, 0.05)) that returns the lower and upper bounds of the CI computed by the BCa method
- you can use (call) the previous function bstrap.nonparam
- compute the 90% and 95% BCa CIs for the mean of the Rainfall data: bstrap.bca(Rainfall, 2000, mean)
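A compact sketch following the formulas above (our own code; in practice one would rather use boot::boot.ci with type = "bca"):

  # BCa confidence interval (sketch following the formulas above)
  bstrap.bca <- function(x, B, s, ..., alpha = c(0.025, 0.05)) {
    theta <- s(x, ...)                      # estimate on the full sample
    T <- bstrap.nonparam(x, B, s, ...)      # bootstrap replicates
    z0 <- qnorm(sum(T < theta) / B)         # bias-correction term
    jack <- sapply(1:length(x), function(i) s(x[-i], ...))  # jackknife values
    d <- mean(jack) - jack
    a <- sum(d^3) / (6 * sum(d^2)^1.5)      # acceleration term
    zq <- qnorm(c(alpha, 1 - alpha))        # z^(alpha) and z^(1-alpha)
    probs <- pnorm(z0 + (z0 + zq) / (1 - a * (z0 + zq)))
    quantile(T, probs = probs)              # one pair of bounds per alpha
  }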
Important properties of BCa CIs
- transformation respecting: the bounds of the CI transform correctly if the parameter is changed by some function; e.g., the CI for the $\sqrt{\cdot}$-transformed parameter is obtained by taking $\sqrt{\cdot}$ of the bounds of the CI for the parameter itself
- second order accurate: convergence rate of $1/n$ to the true coverage

Bootstrapping for tests
- consider two possibly different distributions F and G, with $F \to \mathbf{z} = (z_1, \ldots, z_n)$ and $G \to \mathbf{y} = (y_1, \ldots, y_m)$
- hypotheses: $H_0: F = G$ vs. $H_1: F \neq G$
- $F = G \Leftrightarrow \mathrm{Prob}_F\{A\} = \mathrm{Prob}_G\{A\}$ for all sets $A$
- observe a test statistic $\hat{\theta}$ (e.g. the difference of means)
- achieved significance level (ASL): the probability of observing that large a value under $H_0$:

  $$ASL = \mathrm{Prob}_{H_0}\{\hat{\theta}^* \geq \hat{\theta}\}$$

Bootstrap hypothesis testing procedure
1 choose a test statistic (not necessarily a parameter): $t(\mathbf{x})$; for example, $t(\mathbf{x}) = \bar{z} - \bar{y}$
2 draw B samples of size $n + m$ from $\mathbf{x} = (\mathbf{z}, \mathbf{y})$ and call the first $n$ observations $\mathbf{z}^*$ and the remaining $m$ observations $\mathbf{y}^*$
3 evaluate $t(\cdot)$ on each sample: $t(\mathbf{x}^*_b)$ (for example, $t(\mathbf{x}^*_b) = \bar{z}^*_b - \bar{y}^*_b$), for $b = 1, 2, \ldots, B$
4 approximate the ASL by $ASL_{boot} = \#\{t(\mathbf{x}^*_b) \geq t(\mathbf{x})\}/B$

Exercise (a sketch follows):
- consider the data vectors mouse.c and mouse.t for the control and treatment arms of an experiment (some clinical variable)
- implement the bootstrap hypothesis testing procedure
- use the test statistic

  $$t(\mathbf{x}) = \frac{\bar{z} - \bar{y}}{\bar{\sigma}\sqrt{1/n + 1/m}}, \qquad \bar{\sigma} = \sqrt{\frac{\sum_{i=1}^{n}(z_i - \bar{z})^2 + \sum_{i=1}^{m}(y_i - \bar{y})^2}{n + m - 2}}$$
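A minimal sketch of the procedure (assuming mouse.c and mouse.t as shipped with the bootstrap package):

  # Bootstrap two-sample test (sketch); data from the 'bootstrap' package
  library(bootstrap)
  tstat <- function(z, y) {
    n <- length(z); m <- length(y)
    sigma <- sqrt((sum((z - mean(z))^2) + sum((y - mean(y))^2)) / (n + m - 2))
    (mean(z) - mean(y)) / (sigma * sqrt(1/n + 1/m))
  }
  set.seed(1)
  z <- mouse.t; y <- mouse.c
  n <- length(z); m <- length(y)
  x <- c(z, y)                               # pooled sample, as under H0: F = G
  t.obs <- tstat(z, y)
  t.star <- replicate(2000, {
    xb <- sample(x, n + m, replace = TRUE)   # resample from the pooled data
    tstat(xb[1:n], xb[(n + 1):(n + m)])
  })
  mean(t.star >= t.obs)                      # ASL.boot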
Implementations in R
- many R packages implement various bootstrapping procedures
- the bootstrap package contains the data sets and functions accompanying the book by Efron and Tibshirani
- the boot package contains many well-tested and well-documented functions

Permutation tests
- nonparametric testing procedure
- allow testing hypotheses when the properties of the test statistic under the null hypothesis are not known
- do not make assumptions about the data
- work on small data sets
- idea: generate the distribution of the test statistic under the null hypothesis from the data itself

- exact permutation test: for (very) small data sets, generate all permutations and compute the corresponding test statistics
- random (Monte Carlo) permutation test: for large data sets, generate a number of random permutations and compute the test statistic for each
- test procedure: count how many times the test statistic from the permutations is more extreme than the observed test statistic, and reject $H_0$ if this proportion is below the predefined $\alpha$-level

Example: two-population tests (a sketch follows)
- consider the data vectors mouse.c and mouse.t for the control and treatment arms of an experiment (some clinical variable)
- implement a permutation testing procedure for testing $H_0$: there is no difference in the clinical variable between control and treatment, vs. $H_1$: there is a difference
- which test statistic? what to permute? how many permutations?
- what should be changed if the test were about the superiority of treatment over control?

[Figure: histogram of the permutation distribution of the test statistic d]
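One possible permutation-test sketch for this exercise (our own code; the difference of means is used as the test statistic and a two-sided p-value is computed):

  # Random permutation test for mouse.t vs mouse.c (two-sided, sketch)
  library(bootstrap)                  # provides mouse.t, mouse.c
  set.seed(1)
  z <- mouse.t; y <- mouse.c
  n <- length(z); m <- length(y)
  x <- c(z, y)
  d.obs <- mean(z) - mean(y)          # observed difference of means
  d <- replicate(5000, {
    xp <- sample(x)                   # random permutation of group labels
    mean(xp[1:n]) - mean(xp[(n + 1):(n + m)])
  })
  mean(abs(d) >= abs(d.obs))          # two-sided permutation p-value
  hist(d)                             # permutation null distribution
  # for a one-sided (superiority) test, use mean(d >= d.obs) instead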