Lecture 2: Basic ANOVA and regression
R101: A practical guide to making R your everyday statistical tool (PSY532)

Programme
•T-tests
•Linear regression
•ANOVA
•Repeated-measures ANOVA
•Logic of the analysis
•Hypotheses from our dataset:
─Regression: a hypothesis from a related but slightly different experiment
─ANOVA: as for regression, plus Hypothesis 2 from Lecture 1
─Repeated-measures ANOVA: Hypothesis 1a from Lecture 1
•Working together in R:
─Obtaining descriptive statistics
─Running the analysis
─Checking assumptions
•Reporting the analysis
•Seminar: repeated-measures ANOVA; bootstrapping
•Readings: LSR for everything except repeated-measures ANOVA

T-tests
•Used for comparing:
–two means that come from different groups with the same variance on a measure (Student t-test)
–two means that come from different groups with differing variances (Welch test)
–a group mean and a theoretical value (one-sample t-test)
–means recorded by the same people in different conditions (related-samples t-test)
•Quick demonstration of the Welch test: did participants who were asked to think aloud during the soccer game score higher on the measure of supernatural strategising (PostSupIoC)? (A code sketch appears at the end of the regression overview below.)
•R has other packages for running t-tests, but an advantage of the lsr package is that it calculates Cohen's d, a measure of effect size – i.e., of the size of the difference between two groups.
Reading: LSR, Ch 13

Linear regression
•Logic of the analysis – one predictor
•The model: Yi = b1Xi + b0 + εi, where εi is the residual for observation i of N (e.g., day 78 of 80)
•We use a sequence of calculations (maximum likelihood estimation; MLE) to draw a line that minimises the sum of the squared values of the residuals
•MLE makes two key assumptions:
─Residuals are normally distributed (with mean 0) and have a standard deviation that is the same at every value of the predicted/"outcome" variable (grumpiness)
─There is a linear relationship between the predictor (sleep) and the outcome (grumpiness)
Reading: LSR, Ch 15

•Logic of the analysis – one predictor (continued)
•R2 tells us the extent to which the sum of squared residuals is smaller than the total sum of squares – the sum of the squares of (each value of the outcome variable minus the mean of the outcome variable): R2 = 1 − SSres/SStot
•Two answers to the same question of whether there is a significant relationship between the predictor and the outcome:
─T-test to determine whether the slope of the regression line (the slope coefficient in the model) is significantly different from zero
─F-test (ANOVA) to determine whether the model performs better than an intercept-only model – i.e., an equation in which the slope coefficient equals zero and the intercept then equals the outcome variable's mean, giving a horizontal line at the mean of the outcome variable (grumpiness)
[Figure: the one-predictor model annotated with its slope coefficient (b1), intercept (b0) and residual (εi)]
[Figure: regression plane for an outcome Y (e.g., Depression) on two predictors X1 and X2 – http://www.ats.ucla.edu/stat/sas/teach/reg_int/reg_int_cont.htm]

•Logic of the analysis – two (or more) predictors
•The model: Yi = b2Xi2 + b1Xi1 + b0 + εi
•We use MLE to determine an equation that minimises the sum of the squared values of the residuals (for two predictors, the equation of a 3D plane)
•MLE makes the same key assumptions as for analyses with a single predictor
•Interactions between the predictors are possible; the model then becomes: Yi = b2Xi2 + b1Xi1 + b3Xi1Xi2 + b0 + εi

•Logic of the analysis – two predictors (continued)
•R2 has the same meaning, but you can also calculate adjusted R2, which is smaller than R2 if there are many predictors and/or the sample size is small
•Three associated hypothesis tests:
─T-tests for each coefficient in the model: is it significantly different from zero?
─F-test (ANOVA) to determine whether the model performs better than an intercept-only model (i.e., an equation in which all slope coefficients equal zero and the intercept is then equal to the outcome variable's mean)
─Hierarchical regression: F-test (ANOVA) to determine whether a model featuring one or more additional predictors performs better than the original model
[Figure: the intercept-only model as a horizontal plane at the mean of the outcome variable (Y), vs. a fitted plane with slopes b1 and b2 on predictors X1 and X2 (LSR, p. 481)]
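To make these tests concrete, here is a minimal sketch using the sleep/grumpiness example from LSR, Ch 15. The file and variable names (parenthood.Rdata, dan.grump, dan.sleep, baby.sleep) are the textbook's; the working directory is assumed to contain the file:

  # LSR's parenthood data: dan.sleep, baby.sleep, dan.grump, day
  load("parenthood.Rdata")

  # two predictors plus their interaction:
  # grumpiness = b1*sleep + b2*babysleep + b3*sleep*babysleep + b0 + residual
  model.int <- lm(dan.grump ~ dan.sleep * baby.sleep, data = parenthood)
  summary(model.int)    # t-test for each coefficient, R2 and adjusted R2

  # F-test against the intercept-only model (a horizontal plane at the mean)
  model.null <- lm(dan.grump ~ 1, data = parenthood)
  anova(model.null, model.int)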
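And looping back to the Welch-test demonstration flagged in the t-test section – a minimal sketch, assuming the data frame is called sf and the think-aloud manipulation is a two-level factor named ThinkAloud (both names are placeholders; see the script for the real ones):

  # Welch test: t.test() applies it by default (var.equal = FALSE)
  t.test(PostSupIoC ~ ThinkAloud, data = sf)

  # Cohen's d for groups with unequal variances, via the lsr package
  library(lsr)
  cohensD(PostSupIoC ~ ThinkAloud, data = sf, method = "unequal")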
Hypothesis from our dataset (actually from another, very similar dataset: SF in your Study Materials/Data folder)
•Experiment
─N = 97
─100 trials of the soccer-themed slot-machine task under one of five win-frequency conditions: 1 win per 2 trials, per 3 trials, per 4 trials, per 8 trials or per 16 trials
─Pre-game and post-game questionnaires almost identical to those in the Success-Slope experiment from Lecture 1; the final win amount (final credits) was also calculated
•Hypothesis
─In many causal judgement experiments, as the frequency with which two events co-occur increases, conclusions that one event causes the other have been found to increase in strength (example: treatment with a certain drug and recovery). Here, we expect the same to be the case for wins and the choices made during the game. As win frequency increases, choices should come to be considered more causally effective (i.e., more strategic): win frequency should predict natural or supernatural illusion of control (IoC).
•'Natural' IoC items:
1. My skill in playing the game.
2. I got better with practice.
3. I developed a logical strategy for playing.
4. Experience in playing computer games.
─Natural IoC variable: average of these items
•'Supernatural' IoC items:
1. I took advantage of moments when my luck was good.
2. I've always been a lucky kind of person.
3. I knew how to make my luck turn good.
4. A certain lucky way of playing just seemed to work for me.
5. The players I chose.
6. I learned how to predict the movements of the goalkeeper.
─Supernatural IoC variable: average of these items
•Post-game measure of the illusion of problem-solving – slight difference from the SS data (scale anchor: "It was all chance.")
•We will use supernatural IoC (PostSupIoC) as our outcome variable in this demonstration because it has more items and is therefore a potentially more reliable measure of the illusion of problem-solving.

Working together in R – descriptive statistics
•Graph: the ggplot2 commands are in the script
•Correlation table: as shown in the script, create a subset data frame of the variables you want to correlate, then use the correlate function in the lsr package. Include all possible predictors of the outcome variable for which data are available.
[Figures: RegrPlot1.png, RegrPlot2.png]

Working together in R – running the analysis
•Revised hypothesis based on the correlation table: once gambling-related beliefs and soccer interest assessed in the pre-game questionnaire (PreSoccerInterest; PreDBC_Sup) are accounted for, win frequency (LogWinFreqPerc) is a significant predictor of the illusion of supernatural control (PostSupIoC).
•See the script for a demonstration of a hierarchical regression approach to testing this hypothesis, and the sketch below. We use the lm and anova functions, for which you do not need to install a package (they come with base R). We also make use of the lsr package.
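A minimal sketch of that hierarchical regression; the variable names are from the slides, while the data frame name sf is an assumption (see the script for the actual object names):

  # Step 1: pre-game measures only
  model1 <- lm(PostSupIoC ~ PreDBC_Sup + PreSoccerInterest, data = sf)

  # Step 2: add win frequency
  model2 <- lm(PostSupIoC ~ PreDBC_Sup + PreSoccerInterest + LogWinFreqPerc, data = sf)

  # hierarchical F-test: does adding win frequency improve the model?
  anova(model1, model2)

  summary(model2)          # coefficients, t-tests, R2
  library(lsr)
  standardCoefs(model2)    # standardised coefficients (beta)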
Working together in R – checking assumptions
•Normality of residuals
─Checks: hist(residuals(model1), breaks = 20); plot(model1, which = 2); shapiro.test(residuals(model1))
─If the assumption is not met: transform one or more of the predictors
•Constant variance of residuals – lack of influential points and homogeneity of variance
─Checks: plot(model1, which = 4); plot(model1, which = 5); plot(model1, which = 3); ncvTest(model1) (car package)
─If the assumption is not met: run the regression without the influential points (see script), or run the regression with a heteroscedasticity-corrected covariance matrix (see script)
•Linearity of the relationship between the outcome and the predictor(s)
─Checks: plot of fitted values against observed values; plot(model1, which = 1); residualPlots(model1) (car package)
─If the assumption is not met: transform one or more of the predictors

Reporting the analysis
•Table showing the coefficients, R2, t-tests and hierarchical regression results (if any); standardised coefficients (β) tend to also be reported. A typical layout has columns for b, SE b, β, t, p and adjusted R2, with one row per predictor at each step:
─Step 1: Intercept, DBC total
─Step 2: Intercept, DBC total, Win-frequency
•Summary of results, given the hypothesis: overall, the analysis indicated that illusion-of-control ratings increased with increases in win frequency, once the influence of background beliefs and soccer interest was accounted for.

ANOVA: independent measures
•Logic of the analysis – one predictor (here, drug type, with 3 levels)
[Figure: individual scores and group means for the Anxifree, Joyzepam and Placebo conditions. Note: the data in the illustration do not correspond to the textbook.]
•We calculate two quantities:
–A sum of squares expressing the difference between each individual score and its group mean: SSw = Σk Σi (Yik − Ȳk)²
–A sum of squares expressing the difference between the group means and the grand mean – variability due to the factor (drug type): SSb = Σk Nk (Ȳk − Ȳ)²
•These enable us to compute an F-value, F = (SSb/(G − 1)) / (SSw/(N − G)), which can then be tested for significance.
Reading: LSR, Ch 14 and 16
•In these formulas: N is the number of participants; G is the number of groups; i is a participant number; k is an integer representing the group number/factor level; Ȳk is the mean of group k; Ȳ is the grand mean
•Effect size – eta-squared: η² = SSb/SStot, where SStot is the SS expressing the difference between all scores (regardless of group) and the grand mean

Logic of the analysis: the F-statistic as a model comparison
•The F-test, as it is used in both ANOVA and regression, is really a comparison of two statistical models.
•In an ANOVA with one predictor, the F-test is a comparison of an intercept-only model (M0, the null hypothesis) to a model involving the intercept and the predictor (M1, the alternative hypothesis).
[Figure (ANOVA plot 1.png): outcome scores plotted against win frequency, with the grand mean marked]

Logic of the analysis – ANOVA as regression (illustration for ANOVA with one predictor)
•When we use the aov function, a chosen group's mean is the intercept (baseline) in a "dummy coded" regression; the baseline group is selected by the researcher (e.g., the lowest win-frequency condition). In this case, the regression has four predictors (see the table below). Using the aov function additionally involves a model comparison (see script, ANOVA Example 1).
•Win-frequency data (first 6 cases) "dummy coded" with 1/16 as the reference group:

PNo  SupIoC  1/8 (X1)  1/4 (X2)  1/3 (X3)  1/2 (X4)
  2  0.8333      1         0         0         0
  3  0.0000      0         0         0         0
  4  2.5000      0         1         0         0
  5  4.1667      0         0         1         0
  6  0.6667      0         0         0         1
  7  4.5000      0         0         0         0

•The regression model: Yp = b1X1p + b2X2p + b3X3p + b4X4p + b0 + εp, where:
─Yp is the SupIoC score of participant p
─X1p is participant p's code on X1 (and similarly for X2–X4)
─b0 is the mean of the 1/16 group
─b4 is the difference between the means of the 1/2 group and the 1/16 group (and similarly for b1–b3)
•We determine the values of b0, b1, b2, b3 and b4 using the summary.lm function, as in the sketch below.
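A minimal sketch of this, assuming a data frame sf with a five-level win-frequency factor named WinFreq whose first level is the 1/16 condition (both names are placeholders):

  # one-way ANOVA via aov()
  model3 <- aov(PostSupIoC ~ WinFreq, data = sf)
  summary(model3)      # F-test: intercept-only model (M0) vs. intercept + predictor (M1)

  # the same model viewed as a dummy-coded regression: the intercept is the
  # 1/16 group's mean; each slope is a group mean minus that baseline
  summary.lm(model3)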
Other possible contrasts in the regression component
•The dummy coding on the previous slide contained a treatment contrast.
•Other possible contrasts include Helmert, sum-to-zero ("effect coding") and manually set orthogonal contrasts. (A sketch of how to set each coding in R follows the ANCOVA overview below.)
•Win-frequency data (first 6 cases) with a Helmert contrast and 1/16 as the reference group:

PNo  1/8 (X1)  1/4 (X2)  1/3 (X3)  1/2 (X4)
  2      1        -1        -1        -1
  3     -1        -1        -1        -1
  4      0         2        -1        -1
  5      0         0         3        -1
  6      0         0         0         4
  7     -1        -1        -1        -1

─This coding enables us to contrast the second level with the reference level, the third with the average of the first two, and so on.
•Win-frequency data (first 6 cases) with a sum-to-zero contrast ("effect coding") and 1/2 as the reference group (as per the script):

PNo  1/16 (X1)  1/8 (X2)  1/4 (X3)  1/3 (X4)
  2      0          1         0         0
  3      1          0         0         0
  4      0          0         1         0
  5      0          0         0         1
  6     -1         -1        -1        -1
  7      1          0         0         0

─The regression model: Yp = (1/5)b1X1p + (1/5)b2X2p + (1/5)b3X3p + (1/5)b4X4p + b0 + εp, where b0 is the (weighted) grand mean, b1 is the mean of the 1/16 group minus the weighted grand mean, b2 is the mean of the 1/8 group minus the weighted grand mean, and so on.
─This coding enables us to contrast the mean of each group except the reference group with the grand mean. The grand mean is "weighted" (see script) if the groups are not equal in sample size.

Rules for manually setting orthogonal contrasts
•Rules:
1. The weights within any contrast must sum to zero.
2. For any pair of contrasts, the dot product of the two sets of weights must be zero.
•Illustration:
─Contrast A = (a, b, c, d, e)
─Contrast B = (f, g, h, i, k)
─Contrast C = (l, m, n, o, p)
•If the rules are met:
1. a + b + c + d + e = 0, f + g + h + i + k = 0, and l + m + n + o + p = 0
2. a*f + b*g + c*h + d*i + e*k = 0, l*f + m*g + n*h + o*i + p*k = 0, and a*l + b*m + c*n + d*o + e*p = 0
•Each contrast should compare two sets of means (e.g., the mean of groups 1, 2 and 4 to the mean of groups 3 and 5). Chunks with a negative weight are compared to chunks with a positive weight. In this example we would assign weights such as (2, 2, −3, 2, −3) or (−2, −2, 3, −2, 3), so that the weights sum to zero, as rule 1 requires. For a worked example, see ANOVA Example 3 in the script.
Reading: Field, Ch 10

[Figure: example of an interaction between two predictors – http://www.theanalysisfactor.com/wp-content/uploads/2011/12/interaction-graphic-1.gif]

Logic of the analysis – ANCOVA (with one predictor and one covariate)
•The outcome is modelled from a categorical predictor (here with two levels – e.g., a win frequency of 1/2 vs. 1/16) and a covariate (e.g., beliefs in the value of strategies held even before the game – PreDBC_Sup), visualised as parallel regression lines.
•If the categorical predictor has more levels (e.g., 5, as in our example), there might be more parallel lines:
–The vertical distance between the lines represents the effect of the categorical predictor
•Two covariates could be visualised as parallel regression planes.
•Parallel slopes (i.e., no interaction between the predictor and the covariate) are assumed.
•Covariates can be categorical!
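Here is the promised sketch of how the contrast codings discussed above can be set in R before fitting the model (same placeholder WinFreq factor as in the earlier sketch, with levels ordered 1/16, 1/8, 1/4, 1/3, 1/2):

  # treatment contrast: R's default dummy coding (first level = reference)
  contrasts(sf$WinFreq) <- contr.treatment(5)

  # Helmert contrast: each level vs. the mean of the levels before it
  contrasts(sf$WinFreq) <- contr.helmert(5)

  # sum-to-zero ("effect") coding: each non-reference level vs. the grand mean
  contrasts(sf$WinFreq) <- contr.sum(5)

  # manually set orthogonal contrasts: each column sums to zero,
  # and the dot product of any two columns is zero
  cmat <- cbind(c(4, -1, -1, -1, -1),   # 1/16 vs. the other four conditions
                c(0,  3, -1, -1, -1),   # 1/8 vs. the remaining three
                c(0,  0,  2, -1, -1),   # 1/4 vs. the remaining two
                c(0,  0,  0,  1, -1))   # 1/3 vs. 1/2
  contrasts(sf$WinFreq) <- cmat

  # refit and inspect the coefficients under the chosen coding
  summary.lm(aov(PostSupIoC ~ WinFreq, data = sf))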
Logic of the analysis – ANOVA with two or more predictors
[Table: a 3 × 2 design – Factor A (3 levels) crossed with Factor B (2 levels) – showing the group means (e.g., for group 1,1), the row and column marginal means, and the grand mean]
From the same data that gives us this table, we can calculate:
•The total sum of squares, expressing the distance between all data points and the grand mean
•A sum of squares expressing the difference between the row marginal means and the grand mean – variability due to Factor A
•A sum of squares expressing the difference between the column marginal means and the grand mean – variability due to Factor B
•A sum of squares expressing the extent to which the group means cannot be predicted from the marginal means alone – variability due to the interaction between A and B (see the next slide)
•Four sets of degrees of freedom: for Factor A, for Factor B, for the interaction between A and B, and for the residuals
Using the first four quantities, we can calculate the residual sum of squares. This is all the information we need to compute the F-value for each predictor and interaction term. We can also compute an effect size (eta-squared) for each predictor/interaction – e.g., for Factor A: η²A = SSA/SStot.

Interactions
[Figure slide: example interaction plots – group means departing from what the marginal means alone would predict]

Logic of the analysis: different types of hypothesis tests (model comparisons) in unbalanced designs
•An issue to consider in any factorial ANOVA (i.e., an ANOVA with two or more predictors) where the group sample sizes are not equal (e.g., where group 1,1 has N = 25 and group 3,1 has N = 17)
•To do with the F-statistic being a model comparison (see the earlier slide)
•Type I Sums of Squares (R default)
─Model comparison method: sequential; the first term entered "grabs" all the variance in Y that it can, the second term grabs as much as possible of the remaining variance, and so on
─Recommended for: situations where cell sizes (1,1; 1,2; etc.) reflect differences in proportions in the population; situations where it is crucial to know the effect size (eta-squared)
─Not recommended for: situations where you do not have a theoretical justification for the ordering of the predictors
•Type II Sums of Squares
─Model comparison method: non-sequential, hierarchical; the null model always contains fewer terms, chosen so that the term whose significance we are trying to test is not part of a higher-order term in the model (i.e., an interaction)
─Recommended for: most situations
•Type III Sums of Squares (SPSS default)
─Model comparison method: non-sequential, unique; the null model always contains one fewer term, corresponding to the term whose significance we are trying to test
─Recommended for: situations where you expect a significant main effect and an interaction – though note that the main effects are meaningless when there is a significant interaction

Working together in R – descriptive statistics
•Interaction plot and descriptive statistics: as shown in the script, check for a correlation between the outcome variable and any proposed covariates. Also use the psych package to generate the relevant descriptive statistics, as we did in the last lecture.
[Figure (Anova plot 2.png): interaction plot – the plot suggests that there might be an interaction]

Working together in R – running the analysis
•A different hypothesis – this time from our SS data (Hypothesis 2): once gambling-related beliefs (PreDBC_Total) are accounted for, a higher percentage of wins (PostHowManySingleWins) should be remembered in the Descending condition relative to the others (SeqCond). Sequence condition could interact with question wording (PostHowManySingleCaptionType).
•See the script for a demonstration of a Type II Sums of Squares ANCOVA test of this hypothesis, and the sketch below. We use the lm function, for which you do not need to install a package. The Anova function we use is in the car package. We also make use of the psych package (describeBy) and the effects package (function: effect).
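A minimal sketch of that analysis, assuming the SS data sit in a data frame called ss (the variable names are from the slides; the object names are assumptions):

  library(car)      # Anova() with Type II sums of squares
  library(effects)  # effect() for inspecting and plotting model effects

  model_ss <- lm(PostHowManySingleWins ~ PreDBC_Total +
                   SeqCond * PostHowManySingleCaptionType, data = ss)

  Anova(model_ss, type = 2)   # Type II tests for covariate, main effects, interaction

  # visualise the sequence-condition x question-wording interaction
  plot(effect("SeqCond:PostHowManySingleCaptionType", model_ss))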
Working together in R – checking assumptions
•Normality of residuals
─Checks: hist(residuals(anova_SSHyp2)); shapiro.test(residuals(anova_SSHyp2))
─If the assumption is not met: try a generalised linear model – discussed in a few lectures' time
•Constant variance of residuals across predicted group means – homogeneity of variance
─Checks: leveneTest(formula) from the car package; the formula must specify a saturated model (i.e., a model with all possible main effects and interactions) and no covariates
─If the assumption is not met: use oneway.test() (which does not assume equal variances) or kruskal.test() (a non-parametric alternative)
•Homogeneity of regression slopes (ANCOVA)
─Checks: HRS <- aov(outcome ~ predictor*covariate) or, with multiple predictors, HRS <- aov(outcome ~ predictor1*predictor2*covariate), followed by Anova(HRS, type = 2); the predictor × covariate interaction terms should be non-significant
─If the assumption is not met: try a more complex model in which the covariate is a predictor
•Independence between the covariate and the predictor(s) (ANCOVA)
─Checks: aov(covariate ~ predictor1*predictor2); the covariate should not differ significantly across the predictor groups
─If the assumption is not met: try a more complex model in which the covariate is a predictor
(A sketch of these checks applied to our ANCOVA follows below.)
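The sketch, using the same assumed object names as in the previous one:

  library(car)

  # normality of residuals
  hist(residuals(model_ss))
  shapiro.test(residuals(model_ss))

  # homogeneity of variance: saturated model (all main effects and interactions), no covariate
  leveneTest(PostHowManySingleWins ~ SeqCond * PostHowManySingleCaptionType, data = ss)

  # homogeneity of regression slopes: predictor x covariate interactions should be non-significant
  HRS <- aov(PostHowManySingleWins ~ SeqCond * PostHowManySingleCaptionType * PreDBC_Total,
             data = ss)
  Anova(HRS, type = 2)

  # independence of covariate and predictors: covariate should not differ across groups
  summary(aov(PreDBC_Total ~ SeqCond * PostHowManySingleCaptionType, data = ss))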
Reporting the analysis – as in a Results section
•Table (or a very clear graph) showing means and SDs across factor levels, as in the interaction plot.
•In text: An ANCOVA (with Type II Sums of Squares) was conducted with percentage of remembered wins as the outcome variable, success-slope and question wording as predictors, and background beliefs (Drake Beliefs About Chance total score) as a covariate. After the significant influence of background beliefs was accounted for (F(1,325) = 11.32, p < .001, eta-squared = .03), the analysis revealed a significant main effect of success-slope (F(3,325) = 3.10, p = .03, eta-squared = .02), a significant main effect of question wording (F(1,325) = 38.08, p < .001, eta-squared = .09), and a significant interaction effect (F(3,325) = 3.83, p = .01, eta-squared = .03). Planned comparisons of the Descending condition's mean to those of the other groups under a treatment contrast revealed a significant difference between the Ascending and Descending groups (p = .05). As regards the interaction, the effect of question wording was found to be marginally significantly different in the Ascending, as compared to the Descending, condition (p = .07). As the descriptive statistics suggest, question wording was irrelevant to the win-frequency estimates of participants in the Ascending condition. Notably, the homogeneity of variance assumption was violated in the analysis.
•F-values (with degrees of freedom), p-values and effect sizes can also be reported in a table.
•A table showing estimated marginal means could also be included.

Discussing the analysis – as in a Discussion section
•The results suggest that more wins were remembered when most wins were concentrated early in the experienced sequence rather than late in the sequence. This is partly consistent with our expectation that memory for wins would resemble memory for word lists, where the words at the top of the list are remembered more clearly. Interestingly, the early-wins condition did not differ from the evenly-spaced and U-shaped conditions in terms of remembered wins. For the U-shaped condition, a likely explanation is that the early wins there were clearly remembered. For the evenly-spaced condition, it is possible that memory was boosted by the "spacing" of the wins. The effects of spacing are well known in the memory literature: words tend to be remembered better the wider their spacing across time. The spacing effect is also likely to have been responsible for the effects of question wording. People seem to have underestimated the frequency of losses, possibly because these were not as widely spaced as the wins. Why this effect of question wording was not observed in the late-wins (Ascending) condition is unclear.

Reading
•Navarro, D. J. (2014). Learning statistics with R: A tutorial for psychology students and other beginners. Available online: http://health.adelaide.edu.au/psychology/ccs/teaching/lsr/. Chapters 13–16.
•Baguley, T. (2012). Serious stats: A guide to advanced statistics for the behavioural sciences. Palgrave Macmillan: UK. Chapter 16, "Repeated Measures ANOVA" (pdf in Study Materials/Readings).
•Field, A., Miles, J., & Field, Z. (2012). Discovering statistics using R. Sage: UK. Chapter 10, "Comparing several means: ANOVA" (pdf in Study Materials/Readings).