Here we use the Medley and Clements (1998) data differently from the original analysis in Chapter 6. Now we are comparing diatom species diversity across streams. Streams are treated as a random factor, assuming these streams represent a random sample of all possible streams in this part of the Rocky Mountains. The design is unbalanced with between four and seven stations (level 1 units) on each stream (level 2 clusters). In addition to diatom diversity, zinc concentration was also recorded at each station (a level 1 covariate). We will do two analyses on these data. First, we will fit models that focus just on estimating the diatom diversity variance components for streams and for stations within streams, i.e. fit a null or random effects model. Second, we will include the covariate zinc concentration in our models and aim to estimate the variance associated with random intercepts and random slopes.
The data were used in the first edition, and the data file is here
First, load the required packages (afex, car, lattice, lme4, lmerTest, nlme, VCA)
Import the data file (medley_rand.csv)
medley_rand <- read.csv("../data/medley_rand.csv")
head(medley_rand,10)
Set contrasts from afex
setting contr.sum globally: options(contrasts=c('contr.sum', 'contr.poly'))
First fit OLS anova model with stream as random factor
medley_rand.aov <- aov(diversity~stream, medley_rand)
Check residuals (no pattern, all OK)
plot(medley_rand.aov)
Get anova results
summary(medley_rand.aov)
Df Sum Sq Mean Sq F value Pr(>F)
stream 5 1.828 0.3656 1.411 0.251
Residuals 28 7.255 0.2591
Note: With OLS estimation, CIs can be negative)
medley_rand.vca <- anovaMM(diversity~(stream), medley_rand)
Convert variable stream from "character" to "factor"!
medley_rand.vca
Result Variance Component Analysis:
-----------------------------------
Name DF SS MS VC %Total SD CV[%]
1 total 31.348886 0.278044 100 0.527299 31.125269
2 stream 5 1.827846 0.365569 0.01893 6.80842 0.137588 8.121501
3 error 28 7.255178 0.259113 0.259113 93.19158 0.509032 30.047023
Mean: 1.694118 (N = 34)
Experimental Design: unbalanced | Method: ANOVA
VCAinference(medley_rand.vca, alpha=0.05, VarVC=TRUE, excludeNeg=FALSE, constrainCI=FALSE)
Inference from (V)ariance (C)omponent (A)nalysis
------------------------------------------------
> VCA Result:
-------------
Name DF SS MS VC %Total SD CV[%] Var(VC)
1 total 31.3489 0.278 100 0.5273 31.1253
2 stream 5 1.8278 0.3656 0.0189 6.8084 0.1376 8.1215 0.0018
3 error 28 7.2552 0.2591 0.2591 93.1916 0.509 30.047 0.0048
Mean: 1.6941 (N = 34)
Experimental Design: unbalanced | Method: ANOVA
> VC:
-----
Estimate CI LCL CI UCL One-Sided LCL One-Sided UCL
total 0.2780 0.1791 0.4897 0.1920 0.4457
stream 0.0189 -0.0653 0.1032 -0.0518 0.0896
error 0.2591 0.1632 0.4740 0.1755 0.4286
> SD:
-----
Estimate CI LCL CI UCL One-Sided LCL One-Sided UCL
total 0.5273 0.4232 0.6998 0.4381 0.6676
stream 0.1376 -0.2555 0.3212 -0.2275 0.2994
error 0.5090 0.4040 0.6884 0.4189 0.6547
> CV[%]:
--------
Estimate CI LCL CI UCL One-Sided LCL One-Sided UCL
total 31.1253 24.9806 41.3053 25.8619 39.4077
stream 8.1215 -15.0829 18.9581 -13.4281 17.6701
error 30.0470 23.8447 40.6371 24.7292 38.6437
95% Confidence Level
SAS PROC MIXED method used for computing CIs
medley_rand.reml <- lmer(diversity~(1|stream), medley_rand)
summary(medley_rand.reml)
Linear mixed model fit by REML. t-tests use Satterthwaite's method ['lmerModLmerTest']
Formula: diversity ~ (1 | stream)
Data: medley_rand
REML criterion at convergence: 54.3
Scaled residuals:
Min 1Q Median 3Q Max
-2.0787 -0.7689 0.2107 0.6707 2.0945
Random effects:
Groups Name Variance Std.Dev.
stream (Intercept) 0.02054 0.1433
Residual 0.25756 0.5075
Number of obs: 34, groups: stream, 6
Fixed effects:
Estimate Std. Error df t value Pr(>|t|)
(Intercept) 1.6933 0.1053 5.1118 16.08 1.42e-05 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
medley_rand.ml <- lmer(diversity~(1|stream), medley_rand, REML=F)
summary(medley_rand.ml)
Linear mixed model fit by maximum likelihood . t-tests use Satterthwaite's method ['lmerModLmerTest']
Formula: diversity ~ (1 | stream)
Data: medley_rand
AIC BIC logLik deviance df.resid
57.5 62.1 -25.8 51.5 31
Scaled residuals:
Min 1Q Median 3Q Max
-2.0874 -0.7075 0.2287 0.7077 2.1584
Random effects:
Groups Name Variance Std.Dev.
stream (Intercept) 0.009928 0.09964
Residual 0.257183 0.50713
Number of obs: 34, groups: stream, 6
Fixed effects:
Estimate Std. Error df t value Pr(>|t|)
(Intercept) 1.69361 0.09629 6.07706 17.59 1.92e-06 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
CI on variance components
We need to square CIs from lmer, which are in SD units
medley_rand.ci <- confint.merMod(medley_rand.reml)
Computing profile confidence intervals ...
medley_rand.vc <- (medley_rand.ci)^2
print(medley_rand.vc)
2.5 % 97.5 %
.sig01 0.0000000 0.1744225
.sigma 0.1592729 0.4440721
(Intercept) 2.1608963 3.6686745
First plot the regressions for each stream
panel.smoother <- function(x, y) {
panel.xyplot(x, y)
panel.lmline(x, y)
}
xyplot(diversity~zn|stream,main="Scatterplots by Stream", ylab="diversity", xlab="zinc concentration", panel=panel.smoother, medley_rand)
Note: most slopes negative although sample size is very small
medley_randz.lmer1 <- lmer(diversity~zn +(zn|stream), REML=TRUE, medley_rand)
Warning: Model failed to converge with max|grad| = 1.06439 (tol = 0.002, component 1)Warning: Model is nearly unidentifiable: very large eigenvalue
- Rescale variables?
summary(medley_randz.lmer1)
Linear mixed model fit by REML. t-tests use Satterthwaite's method ['lmerModLmerTest']
Formula: diversity ~ zn + (zn | stream)
Data: medley_rand
REML criterion at convergence: 55.2
Scaled residuals:
Min 1Q Median 3Q Max
-2.1446 -0.7050 0.2234 0.6671 1.8617
Random effects:
Groups Name Variance Std.Dev. Corr
stream (Intercept) 9.796e-02 0.3129920
zn 3.551e-07 0.0005959 -0.72
Residual 1.455e-01 0.3815099
Number of obs: 34, groups: stream, 6
Fixed effects:
Estimate Std. Error df t value Pr(>|t|)
(Intercept) 1.9026528 0.1541741 3.3321553 12.341 0.000676 ***
zn -0.0011524 0.0004287 1.2726781 -2.688 0.181485
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Correlation of Fixed Effects:
(Intr)
zn -0.618
optimizer (nloptwrap) convergence code: 0 (OK)
Model failed to converge with max|grad| = 1.06439 (tol = 0.002, component 1)
Model is nearly unidentifiable: very large eigenvalue
- Rescale variables?
No convergence presumably due to small and unequal sample sizes - try random intercept model
medley_randz.lmer2 <- lmer(diversity~zn + (1|stream), REML=TRUE, medley_rand)
summary(medley_randz.lmer2, ddf="Kenward-Roger")
Linear mixed model fit by REML. t-tests use Kenward-Roger's method ['lmerModLmerTest']
Formula: diversity ~ zn + (1 | stream)
Data: medley_rand
REML criterion at convergence: 55.4
Scaled residuals:
Min 1Q Median 3Q Max
-2.1183 -0.6488 0.1606 0.6830 2.1086
Random effects:
Groups Name Variance Std.Dev.
stream (Intercept) 0.06032 0.2456
Residual 0.15512 0.3939
Number of obs: 34, groups: stream, 6
Fixed effects:
Estimate Std. Error df t value Pr(>|t|)
(Intercept) 1.8631212 0.1282947 5.9077521 14.522 7.6e-06 ***
zn -0.0009613 0.0002292 29.3025865 -4.194 0.000232 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Correlation of Fixed Effects:
(Intr)
zn -0.312
anova(medley_randz.lmer2, ddf="Kenward-Roger")
Type III Analysis of Variance Table with Kenward-Roger's method
Sum Sq Mean Sq NumDF DenDF F value Pr(>F)
zn 2.7285 2.7285 1 29.303 17.589 0.0002319 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Get CIs on varcomps from medley_rand.lmer2
medley_randz.ci <- confint.merMod(medley_randz.lmer2)
Computing profile confidence intervals ...
medley_randz.vc <- (medley_randz.ci)^2
print(medley_randz.vc)
2.5 % 97.5 %
.sig01 0.000000e+00 2.658587e-01
.sigma 9.257588e-02 2.657698e-01
(Intercept) 2.566196e+00 4.541983e+00
zn 1.951020e-06 2.421419e-07
medley_randz.lmer3 <- lmer(diversity~zn + (1|stream), REML=FALSE, medley_rand)
summary(medley_randz.lmer3)
Linear mixed model fit by maximum likelihood . t-tests use Satterthwaite's method ['lmerModLmerTest']
Formula: diversity ~ zn + (1 | stream)
Data: medley_rand
AIC BIC logLik deviance df.resid
45.9 52.0 -19.0 37.9 30
Scaled residuals:
Min 1Q Median 3Q Max
-2.1386 -0.6302 0.1412 0.7115 2.1838
Random effects:
Groups Name Variance Std.Dev.
stream (Intercept) 0.04592 0.2143
Residual 0.14982 0.3871
Number of obs: 34, groups: stream, 6
Fixed effects:
Estimate Std. Error df t value Pr(>|t|)
(Intercept) 1.8615113 0.1170068 7.4478940 15.909 5.09e-07 ***
zn -0.0009519 0.0002196 30.8384764 -4.335 0.000144 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Correlation of Fixed Effects:
(Intr)
zn -0.333
anova(medley_randz.lmer3, medley_rand.reml)
refitting model(s) with ML (instead of REML)
Data: medley_rand
Models:
medley_rand.reml: diversity ~ (1 | stream)
medley_randz.lmer3: diversity ~ zn + (1 | stream)
npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
medley_rand.reml 3 57.500 62.079 -25.750 51.500
medley_randz.lmer3 4 45.938 52.044 -18.969 37.938 13.562 1 0.0002308 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Try same model with nlme
medley_rand_zinc.lme2 <- lme(fixed=diversity~zn, random=~1|stream/zn, method="REML", data=medley_rand)
summary(medley_rand_zinc.lme2)
Linear mixed-effects model fit by REML
Data: medley_rand
Random effects:
Formula: ~1 | stream
(Intercept)
StdDev: 0.245595
Formula: ~1 | zn %in% stream
(Intercept) Residual
StdDev: 0.3655802 0.1465479
Fixed effects: diversity ~ zn
Correlation:
(Intr)
zn -0.312
Standardized Within-Group Residuals:
Min Q1 Med Q3 Max
-0.78816670 -0.24140519 0.05975661 0.25412012 0.78455703
Number of Observations: 34
Number of Groups:
stream zn %in% stream
6 34
Fit random intercept-only model using nlme (matches lme4 model above)
medley_rand_zinc.lme1 <- lme(fixed=diversity~zn, random=~1|stream, method="REML", data=medley_rand)
summary(medley_rand_zinc.lme1)
Linear mixed-effects model fit by REML
Data: medley_rand
Random effects:
Formula: ~1 | stream
(Intercept) Residual
StdDev: 0.2455997 0.3938584
Fixed effects: diversity ~ zn
Correlation:
(Intr)
zn -0.312
Standardized Within-Group Residuals:
Min Q1 Med Q3 Max
-2.1182693 -0.6488025 0.1606048 0.6829671 2.1085586
Number of Observations: 34
Number of Groups: 6