Singh et al. (2016) examined the response of copulatory traits in male fruitflies (Drosophila melanogaster) to cold stress. They had two selection treatments (populations selected for cold shock versus control populations, a fixed effect) crossed with five ancestral populations that experimental flies were derived from (random effect). The data we will analyse are from an experiment that compared files from the two selection treatments that were cold shocked, allowed to recover and then copulatory traits were measured after 4, 12 and 30 hours. Different flies were used for each period so there are no repeated measures. This fully-crossed design has two fixed effects (selection treatment, period) and one random effect (ancestral population, termed “block” by Singh et al. (2016), the sample size varied from 35 to 62 flies in each combination of selection treatment, period and ancestral population, and the response variable we will focus on was mating latency (minutes).
The paper is here
Singh, K., Samant, M. A., Tom, M. T. & Prasad, N. G. (2016). Evolution of pre- and post-copulatory traits in male Drosophila melanogaster as a correlated response to selection for resistance to cold stress. PLoS One, 11, e0153629.
First, load the required packages (afex, car, lattice, lme4, lmerTest, nlme, VCA, Rmisc, MuMin, emmeans)
Import singh data file (singh.csv)
singh <- read.csv("../data/singh.csv")
head(singh,5)
Set contrasts from afex
Convert period and block to factors
set_sum_contrasts()
setting contr.sum globally: options(contrasts=c('contr.sum', 'contr.poly'))
singh$block <- factor(singh$block)
singh$period <- factor(singh$period)
Check assumptions with boxplots - very skewed distributions
boxplot(matlat~selection*period, data=singh)
Fit OLS anova model to check residuals
singh.aov <- aov(matlat~selection*period*block, data=singh)
Check residuals - variance increases with mean
plot(singh.aov)
Transform response to logs (+1 to handle zero values)
singh$lmatlat <- log10(singh$matlat+1)
Check assumptions - much improved boxplots
boxplot(lmatlat~selection*period, data=singh)
Refit OLS model and check residuals (much better)
singh.aov1 <- aov(lmatlat~selection*period*block, data=singh)
plot(singh.aov1)
Although the original paper used untransformed data, our pre-analysis checks suggest that a data transformation, or, as in later chapters, a generalized linear model, might be a better fit
singh.lm1 <- lm(lmatlat ~ selection*period*block, data=singh)
summary(singh.lm1)
Call:
lm(formula = lmatlat ~ selection * period * block, data = singh)
Residuals:
Min 1Q Median 3Q Max
-1.0686 -0.2434 -0.0532 0.1889 1.2209
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 9.16e-01 8.94e-03 102.50 < 2e-16 ***
selection1 5.83e-02 8.94e-03 6.53 9.3e-11 ***
period1 1.63e-01 1.27e-02 12.83 < 2e-16 ***
period2 -1.72e-02 1.26e-02 -1.37 0.17192
block1 -7.79e-02 1.89e-02 -4.13 3.8e-05 ***
block2 6.64e-02 1.82e-02 3.66 0.00026 ***
block3 1.10e-01 1.74e-02 6.35 3.0e-10 ***
block4 3.52e-02 1.79e-02 1.97 0.04908 *
selection1:period1 6.08e-03 1.27e-02 0.48 0.63170
selection1:period2 6.95e-03 1.26e-02 0.55 0.58136
selection1:block1 -2.72e-02 1.89e-02 -1.44 0.14888
selection1:block2 1.77e-02 1.82e-02 0.98 0.32942
selection1:block3 2.62e-02 1.74e-02 1.50 0.13278
selection1:block4 -7.89e-03 1.79e-02 -0.44 0.65873
period1:block1 -9.68e-03 2.67e-02 -0.36 0.71704
period2:block1 5.37e-02 2.64e-02 2.04 0.04179 *
period1:block2 5.63e-02 2.61e-02 2.15 0.03133 *
period2:block2 -5.49e-02 2.56e-02 -2.15 0.03204 *
period1:block3 -3.02e-02 2.45e-02 -1.23 0.21859
period2:block3 5.54e-02 2.49e-02 2.23 0.02612 *
period1:block4 2.98e-02 2.55e-02 1.17 0.24249
period2:block4 -2.40e-02 2.50e-02 -0.96 0.33585
selection1:period1:block1 -5.95e-03 2.67e-02 -0.22 0.82391
selection1:period2:block1 3.47e-02 2.64e-02 1.32 0.18829
selection1:period1:block2 -4.56e-02 2.61e-02 -1.75 0.08098 .
selection1:period2:block2 -6.38e-03 2.56e-02 -0.25 0.80311
selection1:period1:block3 -1.76e-04 2.45e-02 -0.01 0.99427
selection1:period2:block3 -7.58e-05 2.49e-02 0.00 0.99757
selection1:period1:block4 4.79e-02 2.55e-02 1.88 0.06015 .
selection1:period2:block4 -3.49e-03 2.50e-02 -0.14 0.88895
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.341 on 1449 degrees of freedom
Multiple R-squared: 0.208, Adjusted R-squared: 0.192
F-statistic: 13.1 on 29 and 1449 DF, p-value: <2e-16
Anova(singh.lm1, type='III')
Anova Table (Type III tests)
Response: lmatlat
Sum Sq Df F value Pr(>F)
(Intercept) 1219 1 10507.20 < 2e-16 ***
selection 5 1 42.59 9.3e-11 ***
period 23 2 99.44 < 2e-16 ***
block 13 4 27.69 < 2e-16 ***
selection:period 0 2 0.53 0.5864
selection:block 1 4 1.14 0.3338
period:block 3 8 2.74 0.0054 **
selection:period:block 1 8 1.23 0.2755
Residuals 168 1449
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Get correct F-ratio and P value for selection x period
f = 0.03/0.14
f
[1] 0.214
1-pf(f, 2, 8, lower.tail = TRUE, log.p = FALSE)
[1] 0.812
Get correct F-ratio and P value for selection
f = 4.72/0.13
f
[1] 36.3
1-pf(f, 1, 4, lower.tail = TRUE, log.p = FALSE)
[1] 0.00382
Get correct F-ratio and P value for period
f = 11.09/0.33
f
[1] 33.6
1-pf(f, 2, 8, lower.tail = TRUE, log.p = FALSE)
[1] 0.000128
singh.vca <- anovaMM(lmatlat~selection + period + selection*period + (block) + (selection*block) + (period*block) + (selection*period*block), singh)
Convert variable selection from "character" to "factor"!
VCAinference(singh.vca, alpha=0.05, VarVC=TRUE, excludeNeg=FALSE, constrainCI=FALSE)
Inference from Mixed Model Fit
------------------------------
> VCA Result:
-------------
[Fixed Effects]
int selectionFCB selectionFSB period4 period12 period30 selectionFCB:period4 selectionFSB:period4
0.7235 0.0967 0.0000 0.2906 0.1100 0.0000 0.0332 0.0000
selectionFCB:period12 selectionFSB:period12 selectionFCB:period30 selectionFSB:period30
0.0322 0.0000 0.0000 0.0000
[Variance Components]
Name DF SS MS VC %Total SD CV[%] Var(VC)
1 total 419.9978 0.1282 100 0.358 39.263
2 block 4 12.7393 3.1848 0.0097 7.5916 0.0986 10.8181 1e-04
3 selection:block 4 0.523 0.1307 0* 0* 0* 0* 0
4 period:block 8 2.6243 0.328 0.0019 1.4688 0.0434 4.7585 0
5 selection:period:block 8 1.1446 0.1431 6e-04 0.4328 0.0236 2.5829 0
6 error 1449 168.1046 0.116 0.116 90.5068 0.3406 37.3529 0
Mean: 0.912 (N = 1479)
Experimental Design: unbalanced | Method: ANOVA | * VC set to 0 | adapted MS used for total DF
> VC:
-----
Estimate CI LCL CI UCL One-Sided LCL One-Sided UCL
total 0.1282 0.1125 0.1475 0.1148 0.1441
block 0.0097 0* 0.0248 0* 0.0224
selection:block 0 0* 0.0016 0* 0.0013
period:block 0.0019 0* 0.0055 0* 0.0049
selection:period:block 6e-04 0* 0.0034 0* 0.003
error 0.116 0.108 0.1249 0.1093 0.1235
> SD:
-----
Estimate CI LCL CI UCL One-Sided LCL One-Sided UCL
total 0.358 0.3354 0.384 0.3389 0.3797
block 0.0986 0* 0.1575 0* 0.1496
selection:block 0 0* 0.0395 0* 0.0362
period:block 0.0434 0* 0.074 0* 0.0699
selection:period:block 0.0236 0* 0.0586 0* 0.0545
error 0.3406 0.3286 0.3535 0.3305 0.3514
> CV[%]:
--------
Estimate CI LCL CI UCL One-Sided LCL One-Sided UCL
total 39.263 36.778 42.111 37.164 41.6359
block 10.8181 0* 17.2708 0* 16.4055
selection:block 0 0* 4.3363 0* 3.9724
period:block 4.7585 0* 8.1098 0* 7.6705
selection:period:block 2.5829 0* 6.4271 0* 5.9782
error 37.3529 36.0412 38.7644 36.2481 38.533
95% Confidence Level | * CI-limits constrained to be >= 0
SAS PROC MIXED method used for computing CIs
Get least squares means for selection and period
emmeans (singh.lm1, "selection")
NOTE: Results may be misleading due to involvement in interactions
selection emmean SE df lower.CL upper.CL
FCB 0.974 0.0130 1449 0.949 1.000
FSB 0.858 0.0123 1449 0.834 0.882
Results are averaged over the levels of: period, block
Confidence level used: 0.95
emmeans(singh.lm1, "period")
NOTE: Results may be misleading due to involvement in interactions
period emmean SE df lower.CL upper.CL
4 1.079 0.0156 1449 1.048 1.109
12 0.899 0.0154 1449 0.869 0.929
30 0.770 0.0155 1449 0.740 0.801
Results are averaged over the levels of: selection, block
Confidence level used: 0.95
FCB & FSB:
#means
10^.974
[1] 9.42
10^.858
[1] 7.21
#lower CIs
10^.949
[1] 8.89
10^ .834
[1] 6.82
#upper CIs
10^1
[1] 10
10^.882
[1] 7.62
Periods
10^1.079
[1] 12
10^.899
[1] 7.93
10^.77
[1] 5.89
singh.lmer1 <- lmer(lmatlat~selection + period +selection:period + (1|block) + (1|selection:block)
+ (1|period:block) + (1|selection:period:block), REML=TRUE, singh)
boundary (singular) fit: see help('isSingular')
summary(singh.lmer1, ddf="Kenward-Roger")
Linear mixed model fit by REML. t-tests use Kenward-Roger's method ['lmerModLmerTest']
Formula: lmatlat ~ selection + period + selection:period + (1 | block) + (1 | selection:block) + (1 | period:block) + (1 | selection:period:block)
Data: singh
REML criterion at convergence: 1072
Scaled residuals:
Min 1Q Median 3Q Max
-3.219 -0.706 -0.118 0.558 3.666
Random effects:
Groups Name Variance Std.Dev.
selection:period:block (Intercept) 0.000473 0.0217
period:block (Intercept) 0.001876 0.0433
selection:block (Intercept) 0.000000 0.0000
block (Intercept) 0.009328 0.0966
Residual 0.116030 0.3406
Number of obs: 1479, groups: selection:period:block, 30; period:block, 15; selection:block, 10; block, 5
Fixed effects:
Estimate Std. Error df t value Pr(>|t|)
(Intercept) 0.91634 0.04568 4.00065 20.06 3.6e-05 ***
selection1 0.05922 0.00977 3.95933 6.06 0.0039 **
period1 0.16275 0.02103 8.02693 7.74 5.4e-05 ***
period2 -0.01835 0.02098 7.97861 -0.87 0.4072
selection1:period1 0.00571 0.01384 7.97814 0.41 0.6909
selection1:period2 0.00519 0.01377 7.90483 0.38 0.7160
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Correlation of Fixed Effects:
(Intr) slctn1 perid1 perid2 slc1:1
selection1 0.010
period1 0.001 0.005
period2 -0.001 0.011 -0.500
slctn1:prd1 0.001 0.007 0.033 -0.026
slctn1:prd2 0.004 -0.004 -0.026 0.038 -0.501
optimizer (nloptwrap) convergence code: 0 (OK)
boundary (singular) fit: see help('isSingular')
anova(singh.lmer1, ddf="Kenward-Roger")
Type III Analysis of Variance Table with Kenward-Roger's method
Sum Sq Mean Sq NumDF DenDF F value Pr(>F)
selection 4.26 4.26 1 3.96 36.73 0.0039 **
period 8.34 4.17 2 8.00 35.96 0.0001 ***
selection:period 0.07 0.04 2 7.92 0.31 0.7400
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Model fit singular - over-fitted due to zero variance component for 3 way interaction
singh.ci1 <- confint.merMod(singh.lmer1, oldNames=FALSE)
Computing profile confidence intervals ...
Warning: slightly lower deviances (diff=-1.13687e-12) detectedWarning: Last two rows have identical or NA .zeta values: using minstepWarning: Last two rows have identical or NA .zeta values: using minstepWarning: non-monotonic profile for sd_(Intercept)|selection:period:blockWarning: unexpected decrease in profile: using minstepWarning: unexpected decrease in profile: using minstepWarning: unexpected decrease in profile: using minstepWarning: unexpected decrease in profile: using minstepWarning: non-monotonic profile for selection1Warning: bad spline fit for sd_(Intercept)|selection:period:block: falling back to linear interpolationWarning: collapsing to unique 'x' valuesWarning: bad spline fit for selection1: falling back to linear interpolation
singh.vc1 <- (singh.ci1)^2
print(singh.vc1)
2.5 % 97.5 %
sd_(Intercept)|selection:period:block 0.00e+00 0.002792
sd_(Intercept)|period:block 5.12e-05 0.006111
sd_(Intercept)|selection:block 0.00e+00 0.002572
sd_(Intercept)|block 2.10e-03 0.038632
sigma 1.08e-01 0.124796
(Intercept) 6.70e-01 1.029152
selection1 1.50e-03 0.006291
period1 1.50e-02 0.041501
period2 3.49e-03 0.000438
selection1:period1 3.65e-04 0.000920
selection1:period2 3.89e-04 0.000869
singh.lmer2 <- lmer(lmatlat~selection + period +selection:period + (1|block) + (1|selection:block)
+ (1|period:block), REML=TRUE, singh)
Compare models
anova(singh.lmer1, singh.lmer2)
refitting model(s) with ML (instead of REML)
Data: singh
Models:
singh.lmer2: lmatlat ~ selection + period + selection:period + (1 | block) + (1 | selection:block) + (1 | period:block)
singh.lmer1: lmatlat ~ selection + period + selection:period + (1 | block) + (1 | selection:block) + (1 | period:block) + (1 | selection:period:block)
npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
singh.lmer2 10 1054 1107 -517 1034
singh.lmer1 11 1056 1115 -517 1034 0 1 1
AICc(singh.lmer1, singh.lmer2)
Omitting the interaction has little effect on fit
summary(singh.lmer2, ddf="Kenward-Roger")
Linear mixed model fit by REML. t-tests use Kenward-Roger's method ['lmerModLmerTest']
Formula: lmatlat ~ selection + period + selection:period + (1 | block) + (1 | selection:block) + (1 | period:block)
Data: singh
REML criterion at convergence: 1073
Scaled residuals:
Min 1Q Median 3Q Max
-3.222 -0.705 -0.121 0.553 3.670
Random effects:
Groups Name Variance Std.Dev.
period:block (Intercept) 2.12e-03 0.04610
selection:block (Intercept) 8.77e-05 0.00937
block (Intercept) 9.29e-03 0.09640
Residual 1.16e-01 0.34085
Number of obs: 1479, groups: period:block, 15; selection:block, 10; block, 5
Fixed effects:
Estimate Std. Error df t value Pr(>|t|)
(Intercept) 9.16e-01 4.57e-02 4.00e+00 20.05 3.6e-05 ***
selection1 5.93e-02 9.41e-03 3.95e+00 6.31 0.0034 **
period1 1.63e-01 2.11e-02 8.03e+00 7.74 5.4e-05 ***
period2 -1.85e-02 2.10e-02 7.98e+00 -0.88 0.4035
selection1:period1 5.78e-03 1.26e-02 1.46e+03 0.46 0.6469
selection1:period2 4.89e-03 1.26e-02 1.46e+03 0.39 0.6972
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Correlation of Fixed Effects:
(Intr) slctn1 perid1 perid2 slc1:1
selection1 0.010
period1 0.001 0.005
period2 -0.001 0.011 -0.500
slctn1:prd1 0.002 0.008 0.036 -0.028
slctn1:prd2 0.004 -0.004 -0.028 0.041 -0.502
F tests of fixed effects (Type III SS) for this new model - same conclusions as above
anova(singh.lmer2, ddf="Kenward-Roger")
Type III Analysis of Variance Table with Kenward-Roger's method
Sum Sq Mean Sq NumDF DenDF F value Pr(>F)
selection 4.62 4.62 1 4 39.80 0.0034 **
period 8.34 4.17 2 8 35.89 0.0001 ***
selection:period 0.08 0.04 2 1462 0.36 0.6969
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
singh.ci2 <- confint.merMod(singh.lmer2, oldNames=FALSE)
Computing profile confidence intervals ...
Warning: slightly lower deviances (diff=-1.36424e-12) detectedWarning: unexpected decrease in profile: using minstepWarning: Last two rows have identical or NA .zeta values: using minstepWarning: non-monotonic profile for sd_(Intercept)|selection:blockWarning: bad spline fit for sd_(Intercept)|selection:block: falling back to linear interpolationWarning: collapsing to unique 'x' values
singh.vc2 <- (singh.ci2)^2
print(singh.vc2)
2.5 % 97.5 %
sd_(Intercept)|period:block 4.89e-05 0.006111
sd_(Intercept)|selection:block 0.00e+00 0.002659
sd_(Intercept)|block 2.07e-03 0.038632
sigma 1.08e-01 0.124796
(Intercept) 6.71e-01 1.027682
selection1 1.76e-03 0.005897
period1 1.57e-02 0.040132
period2 3.14e-03 0.000356
selection1:period1 3.64e-04 0.000920
selection1:period2 3.89e-04 0.000869
This is the approach used by the paper’s author’s. We’ve given the initial OLS fit, which you can adjust for the random effects by recalculating F-ratios as shown in the earlier code. You can easily also run the REML mixed effects model.
options(digits=4)
singh.lm2 <- lm(matlat~selection*period*block, data=singh)
summary(singh.lm2)
Call:
lm(formula = matlat ~ selection * period * block, data = singh)
Residuals:
Min 1Q Median 3Q Max
-22.80 -6.94 -3.38 1.22 100.30
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 11.9362 0.3983 29.97 < 2e-16 ***
selection1 2.0177 0.3983 5.07 4.6e-07 ***
period1 5.6324 0.5656 9.96 < 2e-16 ***
period2 -1.0621 0.5615 -1.89 0.0587 .
block1 -2.1440 0.8405 -2.55 0.0108 *
block2 1.8845 0.8090 2.33 0.0200 *
block3 3.2913 0.7750 4.25 2.3e-05 ***
block4 1.1268 0.7957 1.42 0.1569
selection1:period1 0.8168 0.5656 1.44 0.1489
selection1:period2 0.1700 0.5615 0.30 0.7621
selection1:block1 -1.3873 0.8405 -1.65 0.0990 .
selection1:block2 -0.1761 0.8090 -0.22 0.8277
selection1:block3 1.4173 0.7750 1.83 0.0676 .
selection1:block4 0.5771 0.7957 0.73 0.4684
period1:block1 -1.4099 1.1906 -1.18 0.2366
period2:block1 1.6193 1.1747 1.38 0.1683
period1:block2 3.4923 1.1639 3.00 0.0027 **
period2:block2 -2.4888 1.1397 -2.18 0.0291 *
period1:block3 -0.3059 1.0925 -0.28 0.7795
period2:block3 1.7296 1.1083 1.56 0.1188
period1:block4 0.2358 1.1360 0.21 0.8356
period2:block4 0.0091 1.1132 0.01 0.9935
selection1:period1:block1 -0.8870 1.1906 -0.75 0.4564
selection1:period2:block1 1.5682 1.1747 1.33 0.1821
selection1:period1:block2 -0.8039 1.1639 -0.69 0.4898
selection1:period2:block2 -0.6862 1.1397 -0.60 0.5472
selection1:period1:block3 -0.1096 1.0925 -0.10 0.9201
selection1:period2:block3 0.0743 1.1083 0.07 0.9465
selection1:period1:block4 1.7904 1.1360 1.58 0.1152
selection1:period2:block4 -0.0457 1.1132 -0.04 0.9673
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 15.2 on 1449 degrees of freedom
Multiple R-squared: 0.13, Adjusted R-squared: 0.113
F-statistic: 7.48 on 29 and 1449 DF, p-value: <2e-16
Anova(singh.lm2, type='III')
Anova Table (Type III tests)
Response: matlat
Sum Sq Df F value Pr(>F)
(Intercept) 206972 1 898.08 < 2e-16 ***
selection 5914 1 25.66 4.6e-07 ***
period 25814 2 56.01 < 2e-16 ***
block 11609 4 12.59 4.5e-10 ***
selection:period 805 2 1.75 0.175
selection:block 1278 4 1.39 0.236
period:block 4086 8 2.22 0.024 *
selection:period:block 1411 8 0.77 0.634
Residuals 333938 1449
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#use anova to show MS
anova(singh.lm2)
Analysis of Variance Table
Response: matlat
Df Sum Sq Mean Sq F value Pr(>F)
selection 1 5766 5766 25.02 6.4e-07 ***
period 2 25108 12554 54.47 < 2e-16 ***
block 4 11607 2902 12.59 4.5e-10 ***
selection:period 2 637 319 1.38 0.251
selection:block 4 1255 314 1.36 0.245
period:block 8 4178 522 2.27 0.021 *
selection:period:block 8 1411 176 0.77 0.634
Residuals 1449 333938 230
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
glance(singh.lm2)