We respect your right to privacy. You can choose not to allow some types of cookies. Your cookie preferences will apply across our website.
ANOVAs as mixed effects models for randomised block designs and repeated-measures analyses.
Six functions are available for fitting linear models or linear mixed effects models. These can be used instead of simple ANOVAs or ‘mixed’ ANOVAs when there are random factors to consider.
These functions rely on lm and lmer (latter implemented in package lmerTest).
This page examples of using linear mixed-effects models to get the:
mixed_anova, mixed_anova_slopes) with F and P valuesmixed_model, mixed_model_slopes), which is useful for diagnostics and posthoc comparisons.Both fit the full model with interaction terms when more than one fixed factors are provided. By default, variable intercepts only models are fit. For simplicity, grafify currently implements one version of variable slopes model through mixed_model_slopes and mixed_anova_slopes. Refer to lme4 package for more complex analyses with random effects.
ANOVA tables show type II sum of squares (type II SS). ANOVA tables with mixed_anova are generated using the anova() call through lmerTest that adds P values.
If you want Type III SS, first set contrasts using options(contrasts=c("contr.sum", "contr.poly") before fitting the model or getting the ANOVA table. Fitting the model without this and setting type = "III" will give incorrect results. The default contrasts in R can be obtained by running options(contrasts=c("contr.treatment", "contr.poly")).
See the data help page and ensure data table is in the long-format, and check that you have correctly converted categorical variables with as.factor and columns with numeric variables are in the correct format.
We will use data_1w_death and data_2w_Festing in these examples. In this vignette we use ordinary linear models; see the vignette on mixed-effects models for analyses of the same data where the random factor is taken into account.
Let’s look at these data first so you have an idea of whether or not there are differences between them.
First the data_1w_death (one-way ANOVA design) data from 5 independent in vitro experiments where death of host cells was measured (as percentage of total cells) after infection with three different strains of a pathogenic bacterium (factor Genotype).
head(data_1w_death, n = 6) #see first 6 rows of data
#> Experiment Genotype Death
#> 1 Exp_1 WT 25.012173
#> 2 Exp_1 KO_1 1.824973
#> 3 Exp_1 KO_2 14.294956
#> 4 Exp_2 WT 16.542609
#> 5 Exp_2 KO_1 2.131199
#> 6 Exp_2 KO_2 23.749439
A plot of these data shows that KO_1 induces very little death as compared to the other two bacterial strains.
plot_3d_scatterbox(data_1w_death, #data
Genotype, #fixed factor
Death, #Y variable
Experiment) #shapes

The data_2w_Festing data from Festing et al dataset has two fixed factors (“Treatment” & “Strain”). For the present analysis, I will ignore the blocking factor (“Block”), and we use the same data set in the vignette on mixed effects models (mixed_anova).
head(data_2w_Festing, n = 8) #see first 8 rows
#> Block Treatment Strain GST
#> 1 A Control NIH 444
#> 2 A Treated NIH 614
#> 3 A Control BALB/C 423
#> 4 A Treated BALB/C 625
#> 5 A Control A/J 408
#> 6 A Treated A/J 856
#> 7 A Control 129/Ola 447
#> 8 A Treated 129/Ola 719
The plot below of these data suggests that Treatment generally increases GST values irrespective of the mouse Strain (although the magnitude of increase appears to be different across the strains).
plot_4d_scatterbox(data_2w_Festing, #data
Strain, #factor 1
GST, #Y variable
Treatment, #factor 2
Block) #shapes

Let’s fit a linear model, use diagnostics and get ANOVA table.
In this vignette we use linear mixed-effects models; see the vignette on ordinary linear models for analyses of the same data where the random factor is ignored.
mixed_modelmixed_model to save models for diagnostics and posthoc_ functions. The mixed_model and mixed_anova calls have identically arguments.
mod_1wdeath <- mixed_model(data_1w_death, #data table
"Death", #quantitative variable
"Genotype", #fixed factor
"Experiment") #random factor
#diagnostic residual plot
plot_qqmodel(mod_1wdeath)

mod_2w <- mixed_model(data_2w_Festing,
"GST",
c("Strain", "Treatment"),
"Block")
mod_2wtdeath <- mixed_model(data_2w_Tdeath,
"logit(PI/100)", #logit transformation in formula
c("Genotype", "Time"),
c("Experiment", "Subject"))
Check model residuals with the plot_qqmodel function.
plot_qqmodel(mod_2w) #saved above

plot_qqmodel(mod_2wtdeath)

The ggResidPanel package is great for model diagnostics. Check it out.
mixed_anovaANOVA table with random factor taken into account.
mixed_anova(data_1w_death, #data table
"Death", #quantitative variable
"Genotype", #fixed factor
"Experiment") #random factor
#> Type II Analysis of Variance Table with Kenward-Roger's method
#> Sum Sq Mean Sq NumDF DenDF F value Pr(>F)
#> Genotype 1420.2 710.1 2 8 42.659 5.401e-05 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The output indicates a huge impact of Genotype on cell death. Compare this result with the ordinary linear model.
Let’s analyse the Festing et al dataset this time including the random factor (analysis as ordinary linear model here).
mixed_anova(data_2w_Festing, #data table
"GST", #quantitative variable
c("Strain", "Treatment"), #fixed factors
"Block") #random factor
#> Type II Analysis of Variance Table with Kenward-Roger's method
#> Sum Sq Mean Sq NumDF DenDF F value Pr(>F)
#> Strain 28613 9538 3 7 3.2252 0.09144 .
#> Treatment 227529 227529 1 7 76.9394 5.041e-05 ***
#> Strain:Treatment 49591 16530 3 7 5.5897 0.02832 *
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The effect of Treatment looks more marked, and there may be a hint of interaction, so dig deeper!
Currently, grafify implements one type of variable slopes model (mixed_model_slopes and mixed_anova_slopes). Refer to lme4 package to fit other kinds of random effects.
The two-way repeated-measures data_2w_Tdeath has two fixed factors (Genotype & Time) and two random factors, Experiment and Subject. The Subject indicates repeated-measure over Time. The data table also has a column with numeric “Time2” column that could be used for graphs, remember to use the column with categorical “Time” column for analysis.
#> Experiment Time Subject Genotype PI Time2
#> 1 e1 t100 s1 WT 20.47120 100
#> 2 e2 t100 s2 WT 28.88967 100
#> 3 e3 t100 s3 WT 11.55061 100
#> 4 e4 t100 s4 WT 23.24516 100
#> 5 e5 t100 s5 WT 30.20904 100
str(data_2w_Tdeath) #Time coded as categorical variable
#> 'data.frame': 24 obs. of 6 variables:
#> $ Experiment: Factor w/ 6 levels "e1","e2","e3",..: 1 2 3 4 5 6 1 2 3 4 ...
#> $ Time : Factor w/ 2 levels "t100","t300": 1 1 1 1 1 1 1 1 1 1 ...
#> $ Subject : Factor w/ 12 levels "s1","s10","s11",..: 1 5 6 7 8 9 10 11 12 2 ...
#> $ Genotype : Factor w/ 2 levels "WT","KO": 1 1 1 1 1 1 2 2 2 2 ...
#> $ PI : num 20.5 28.9 11.6 23.2 30.2 ...
#> $ Time2 : int 100 100 100 100 100 100 100 100 100 100 ...
mixed_anova(data_2w_Tdeath, #data table
"logit(PI/100)", #logit transformation in formula
c("Genotype", "Time"), #fixed factors
c("Experiment", "Subject")) #random factors
#> Type II Analysis of Variance Table with Kenward-Roger's method
#> Sum Sq Mean Sq NumDF DenDF F value Pr(>F)
#> Genotype 1.18171 1.18171 1 5 58.5419 0.0006071 ***
#> Time 1.84830 1.84830 1 10 91.5646 2.377e-06 ***
#> Genotype:Time 0.08288 0.08288 1 10 4.1059 0.0702377 .
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Results show a large effect of Genotype & Time, and questionable interaction effect.
The function plot_lm_predict is handy to check how close the values predicted from the linear model are to observed (or measured) values that were used to fit the model. A good model will predict numbers close to observed/real data.
This function takes in a fitted model, and needs X and Y variables that exactly match those used to fit the model.
Let us plot the predictions of models fitted above.
#plots for Festing data set
plot_lm_predict(Model = mod_2w, #fitted model
xcol = Treatment, #one independent factor
ycol = GST, #dependent variable
ByFactor = Strain) #second factor
