Constable (1993) studied the role of sutures in the shrinking of the test of the sea urchin Heliocidaris erythrogramma under different food regimes. The categorical predictor (factor) was food regime with three groups: high food regime, low food regime, and an initial sample. The response variable was the width of inter-radial sutures (mm) from each urchin and initial body volume (ml) was the covariate. There were 24 urchins in each group. Constable (1993) transformed body volume to cube roots to linearise the relationship between suture width and body volume, and while a scatterplot suggested the relationships were approximately linear (Figure ‎8.15), we will be consistent with the original paper and also apply transformation.
This example was used in the first edition; the data file is here
Constable, A. J. (1993). The role of sutures in shrinking of the test in Heliocidaris erythrogramma (Echinoidea: Echinometridae). Marine Biology, 117, 423-30.
ancova for urchin data First, load the required packages (car, kader, reghelper, interactions)
Import constable data file (constable.csv)
constable <- read.csv("../data/constable.csv")
head(constable,10)
constable$treatment<-factor(constable$treatment)
boxplot(sutwidth~treatment, data=constable)
boxplot(bodyvol~treatment, data=constable)
Boxplots don’t indicate major skewness or markedly unequal variances
scatterplot(sutwidth ~ bodyvol | treatment, data = constable)
Lines look reasonably linear (except maybe low) but not parallel (esp. low shallower than others) but original author transformed body volume to cube root so we will too.
constable$cbodyvol <- (constable$bodyvol)^(1/3)
Check that covariate not affected by treatments
cov.aov <- aov(cbodyvol~treatment, data=constable)
summary(cov.aov)
Df Sum Sq Mean Sq F value Pr(>F)
treatment 2 0.656 0.3278 1.363 0.263
Residuals 69 16.591 0.2404
All SS types produce same result for interaction
constable.aov1 <- aov(sutwidth~treatment+cbodyvol+treatment*cbodyvol, data=constable)
plot(constable.aov1)
Residual plot looks OK
summary(constable.aov1)
Df Sum Sq Mean Sq F value Pr(>F)
treatment 2 0.012380 0.006190 14.754 5.05e-06 ***
cbodyvol 1 0.010907 0.010907 25.997 3.09e-06 ***
treatment:cbodyvol 2 0.003944 0.001972 4.701 0.0123 *
Residuals 66 0.027690 0.000420
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Conclude evidence for non-parallel slopes
constable$treatment <- relevel(constable$treatment, ref="low")
constable.lm1 <- lm(sutwidth~treatment+cbodyvol+treatment*cbodyvol, data=constable)
anova(constable.lm1)
Analysis of Variance Table
Response: sutwidth
Df Sum Sq Mean Sq F value Pr(>F)
treatment 2 0.0123801 0.0061900 14.7542 5.053e-06 ***
cbodyvol 1 0.0109067 0.0109067 25.9965 3.094e-06 ***
treatment:cbodyvol 2 0.0039443 0.0019721 4.7007 0.01234 *
Residuals 66 0.0276899 0.0004195
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
summary(constable.lm1)
Call:
lm(formula = sutwidth ~ treatment + cbodyvol + treatment * cbodyvol,
data = constable)
Residuals:
Min 1Q Median 3Q Max
-0.047764 -0.012727 0.001153 0.012959 0.065303
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.034871 0.021224 1.643 0.1051
treatmenthigh -0.063073 0.037350 -1.689 0.0960 .
treatmentinitial -0.063663 0.030892 -2.061 0.0433 *
cbodyvol 0.009053 0.007522 1.204 0.2330
treatmenthigh:cbodyvol 0.037721 0.014161 2.664 0.0097 **
treatmentinitial:cbodyvol 0.026395 0.011117 2.374 0.0205 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.02048 on 66 degrees of freedom
Multiple R-squared: 0.4958, Adjusted R-squared: 0.4576
F-statistic: 12.98 on 5 and 66 DF, p-value: 8.431e-09
A nice plot is available from the interactions package
interact_plot(constable.lm1, pred=cbodyvol, modx=treatment, plot.points=TRUE)
Simple slopes
simple_slopes(constable.lm1)
treatment cbodyvol Test Estimate Std. Error t value df Pr(>|t|) Sig.
1 low sstest 0.0091 0.0075 1.2037 66 0.2330190
2 high sstest 0.0468 0.0120 3.8985 66 0.0002291 ***
3 initial sstest 0.0354 0.0082 4.3302 66 5.177e-05 ***
4 sstest (high) 2.17322 0.0189 0.0086 2.1975 66 0.0315008 *
5 sstest (initial) 2.17322 -0.0063 0.0085 -0.7373 66 0.4635775
6 sstest (high) 2.666077 0.0375 0.0062 6.0902 66 6.464e-08 ***
7 sstest (initial) 2.666077 0.0067 0.0060 1.1244 66 0.2648959
8 sstest (high) 3.158934 0.0561 0.0100 5.6304 66 3.988e-07 ***
9 sstest (initial) 3.158934 0.0197 0.0076 2.5853 66 0.0119446 *
Available in package interactions
Note for fixing; can’t get pairwise sets, so using subsets of data Not sure how to deal with categorical predictor, though it should be possible; manual not clear
# Look at low vs high
#create dummy variables
library(fastDummies)
Thank you for using fastDummies!
To acknowledge our work, please cite the package:
Kaplan, J. & Schlegel, B. (2023). fastDummies: Fast Creation of Dummy (Binary) Columns and Rows from Categorical Variables. Version 1.7.1. URL: https://github.com/jacobkap/fastDummies, https://jacobkap.github.io/fastDummies/.
df <- constable[constable$treatment %in% c('low','high'),]
df <- dummy_cols(df, select_columns = 'treatment')
Registered S3 method overwritten by 'data.table':
method from
print.data.table
constable.lm2 <- lm(sutwidth~treatment_high*cbodyvol, data = df)
anova(constable.lm2)
Analysis of Variance Table
Response: sutwidth
Df Sum Sq Mean Sq F value Pr(>F)
treatment_high 1 0.0112241 0.0112241 23.6210 1.528e-05 ***
cbodyvol 1 0.0040072 0.0040072 8.4332 0.005742 **
treatment_high:cbodyvol 1 0.0029769 0.0029769 6.2650 0.016101 *
Residuals 44 0.0209077 0.0004752
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
johnson_neyman(model = constable.lm2, pred = treatment_high, modx = cbodyvol)
JOHNSON-NEYMAN INTERVAL
When cbodyvol is OUTSIDE the interval [-2.25, 2.17], the slope of treatment_high
is p < .05.
Note: The range of observed values of cbodyvol is [1.71, 3.62]
# Now low vs initial
df2 <- constable[constable$treatment %in% c('low','initial'),]
#df2 <- subset(constable, treatment !='high')
df2 <- dummy_cols(df2, select_columns = 'treatment')
constable.lm3 <- lm(sutwidth~treatment_initial*cbodyvol, data = df2)
johnson_neyman(model = constable.lm3, pred = treatment_initial, modx = cbodyvol)
JOHNSON-NEYMAN INTERVAL
When cbodyvol is OUTSIDE the interval [1.51, 2.78], the slope of
treatment_initial is p < .05.
Note: The range of observed values of cbodyvol is [1.52, 3.62]
# Now high vs initial
df3 <- constable[constable$treatment %in% c('initial','high'),]
df3 <- dummy_cols(df3, select_columns = 'treatment')
constable.lm4 <- lm(sutwidth~treatment_initial*cbodyvol, data = df3)
johnson_neyman(model = constable.lm4, pred = treatment_initial, modx = cbodyvol)
JOHNSON-NEYMAN INTERVAL
When cbodyvol is INSIDE the interval [2.00, 3.91], the slope of treatment_initial
is p < .05.
Note: The range of observed values of cbodyvol is [1.52, 3.53]
Alternative package for J-N
devtools::install_github('kenstoyama/JNplots')
Skipping install of 'JNplots' from a github remote, the SHA1 (0bab1ace) has not changed since last install.
Use `force = TRUE` to force installation
library(JNplots)
Loading required package: scales
Attaching package: ‘scales’
The following object is masked from ‘package:viridis’:
viridis_pal
The following object is masked from ‘package:purrr’:
discard
The following object is masked from ‘package:readr’:
col_factor
Loading required package: ape
Attaching package: ‘ape’
The following object is masked from ‘package:dplyr’:
where
#Note that it seems to only deal with categorical predictors with 2 levels
jnt_cat(X='cbodyvol', Y='sutwidth', m='treatment', data=df)
$coeff
Generalized least squares fit by REML
Model: Yi ~ Xi * gi
Data: NULL
Coefficients:
Correlation:
(Intr) Xi gihigh
Xi -0.980
gihigh -0.568 0.557
Xi:gihigh 0.521 -0.531 -0.987
Standardized residuals:
Min Q1 Med Q3 Max
-2.19114183 -0.58031084 0.08608482 0.59449162 2.99577686
Residual standard error: 0.02179849
Degrees of freedom: 48 total; 44 residual
$`lower limit in X`
(Intercept)
1.325842
$`upper limit in X`
(Intercept)
2.019806
jnt_cat(X='cbodyvol', Y='sutwidth', m='treatment', data=df2)
$coeff
Generalized least squares fit by REML
Model: Yi ~ Xi * gi
Data: NULL
Coefficients:
Correlation:
(Intr) Xi giintl
Xi -0.980
giinitial -0.687 0.674
Xi:giinitial 0.663 -0.677 -0.981
Standardized residuals:
Min Q1 Med Q3 Max
-1.663619995 -0.729695423 0.008010786 0.755476536 2.013803900
Residual standard error: 0.01638333
Degrees of freedom: 48 total; 44 residual
$`lower limit in X`
(Intercept)
2.02381
$`upper limit in X`
(Intercept)
2.800654
jnt_cat(X='cbodyvol', Y='sutwidth', m='treatment', data=df3)
$coeff
Generalized least squares fit by REML
Model: Yi ~ Xi * gi
Data: NULL
Coefficients:
Correlation:
(Intr) Xi giintl
Xi -0.991
giinitial -0.808 0.800
Xi:giinitial 0.818 -0.826 -0.987
Standardized residuals:
Min Q1 Med Q3 Max
-2.10462809 -0.61110459 -0.03835316 0.59392696 2.87749339
Residual standard error: 0.02269455
Degrees of freedom: 48 total; 44 residual
$`lower limit in X`
(Intercept)
-1.188155
$`upper limit in X`
(Intercept)
1.131917