Polis et al. (1998) studied the factors that control spider populations on islands in the Gulf of California. We will use part of their data to model the presence/absence of lizards (Uta) against the ratio of perimeter to area (P/A, as a measure of input of marine detritus) for 19 islands in the Gulf of California. We fitted a GLM of the presence of Uta (binary) against P/A ratio.
Uta stansburiana. Judy Gallagher , via Wikimedia Commons
The data are in Table 1 of the paper, and our data file is here.
Polis, G. A., Hurd, S. D., Jackson, C. T. & Sanchez-Pinero, F. (1998). Multifactor population limitation: Variable spatial and temporal control of spiders on Gulf of California islands. Ecology, 79, 490-502.
First, load the required packages (car, performance, lmtest) + ggplot2
Import polis data file
polis <- read.csv("../data/polis.csv")
polis
Simple exploratory plot
plot(utanum~paratio, data=polis)
polis.glm <- glm(utanum ~ paratio,data=polis,family=binomial)
glance(polis.glm)
tidy(polis.glm, conf.int=TRUE)
Do LR test on slope
anova(polis.glm, test = 'Chisq')
Analysis of Deviance Table
Model: binomial, link: logit
Response: utanum
Terms added sequentially (first to last)
Df Deviance Resid. Df Resid. Dev Pr(>Chi)
NULL 18 26.3
paratio 1 12.1 17 14.2 0.00051 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Get and plot residuals - deviance residuals first then standardized residuals
residualPlots(polis.glm, type="deviance")
Test stat Pr(>|Test stat|)
paratio 0.43 0.51
residualPlots(polis.glm, type="rstandard")
Test stat Pr(>|Test stat|)
paratio 0.43 0.51
Check for overdispersion
c(deviance(polis.glm), df.residual(polis.glm))
[1] 14.2 17.0
Check influence diagnostics
rstandard(polis.glm, infl = influence(polis.glm), do.coef = FALSE,
type = c("deviance"))
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
1.14893 0.44858 2.27765 -1.37313 0.93511 -1.02996 -0.30093 -0.71754 -1.70588 0.81977 0.47127 0.31050 0.37771 -0.01192 -0.00837 -0.72523 -0.67179 0.24631 0.31997
augment(polis.glm)
There is one outlier (case 3 - cook D = 0.84) and deleting it changes P value for Wald test but not LR test
polis1 <- subset(polis, island != "Cerraja")
plot(utanum~paratio, data=polis1)
polis1.glm <- glm(utanum ~ paratio,data=polis1,family=binomial)
tidy(polis1.glm)
glance(polis1.glm)
anova(polis1.glm, test = 'Chisq')
Analysis of Deviance Table
Model: binomial, link: logit
Response: utanum
Terms added sequentially (first to last)
Df Deviance Resid. Df Resid. Dev Pr(>Chi)
NULL 17 24.95
paratio 1 17.4 16 7.53 3e-05 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Create reduced model for lrtest - same as anova result
polis2.glm <- glm(utanum~+1, data=polis, family=binomial)
lrtest(polis2.glm, polis.glm)
Likelihood ratio test
Model 1: utanum ~ +1
Model 2: utanum ~ paratio
#Df LogLik Df Chisq Pr(>Chisq)
1 1 -13.14
2 2 -7.11 1 12.1 0.00051 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Plot predicted probabilities from original model
plot(polis.glm$fitted.values~polis$paratio)
Get odds ratio with CI
exp(coef(polis.glm))
(Intercept) paratio
36.821 0.803
exp(confint.default(polis.glm))
2.5 % 97.5 %
(Intercept) 1.328 1021.180
paratio 0.659 0.978
Get H-L test and Tjur r2
performance_hosmer(polis.glm, n_bins=10)
# Hosmer-Lemeshow Goodness-of-Fit Test
Chi-squared: 7.898
df: 8
p-value: 0.443
Summary: model seems to fit well.
r2_tjur(polis.glm)
Tjur's R2
0.521
p1<-ggplot(polis.glm,aes(y = polis.glm$fitted.values, x = polis$paratio)) +
geom_point(color="grey60", size=2, alpha= 1) +
theme_qk()+labs(x = "Perimeter:Area ratio", y = "Probability (lizards present)",
)
p1