The aim of these exercises is to extend the analyses from Chapter 8, focusing on identifying the relative importance of predictors in linear models that include at least one continuous predictor.
Recall the example of La Rosa and Conner (2017) from the Chapter 8 exercises. They examined effects of up to six floral traits on fitness components of milkweeds, Asclepias spp. The fitness components were male and female pollination success and female reproductive success.
The data are available from Dryad here. Fitness component estimates were relativized by dividing by the mean, and the traits were standardized to a mean of zero and standard deviation of one. You can also get the data from larosa.csv.
library(magrittr)  # provides the %>% pipe used below
df <- read.csv("../data/larosa.csv")
knitr::kable(head(df, 10), booktabs = TRUE) %>%
  kableExtra::kable_styling(latex_options = c("HOLD_position", "scale_down", "striped"))
species | plant.id | gyn.width | hood.length | hood.height | horn.reach | slit.length | gap.width | display.flowers.1day | removals.per.flower | insertions.per.flower | fruits | geo_mean | relz.gyn.w | relz.hood.l | relz.hood.h | relz.horn.r | relz.slit.l | relz.gap.w | remins.std.gyn.w | remins.std.hood.l | remins.std.hood.h | remins.std.horn.r | remins.std.slit.l | remins.std.gap.w | std.floral.display.1day | rel.removal.per.flower | rel.insertion.per.flower | fruit.std.gyn.w | fruit.std.hood.l | fruit.std.hood.h | fruit.std.horn.r | fruit.std.slit.l | fruit.std.gap.w | fruit.std.display.size | rel.fruits | notes | total.flower.quantity | poll.duration.per10min.no.flies. | poll.visits.per10min.no.flies. |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Asyr | AS01 | 0.2190 | 0.3557 | 0.5095 | 0.2038 | 0.1918 | 0.0464 | 168 | 3.0308 | 0.3231 | 7 | 0.2954 | 0.7414 | 1.2042 | 1.7249 | 0.6900 | 0.6493 | 0.1571 | 0.0126 | 0.7453 | 0.0231 | -0.0052 | 0.1851 | -1.2587 | 0.8513 | 1.5032 | 0.7706 | -0.0338 | 0.6940 | -0.0218 | -0.0925 | 0.1087 | -1.2607 | 0.7907 | 1.176 | | 325.0 | 0.512 | 0.480 |
Asyr | AS02 | 0.2042 | 0.2776 | 0.5293 | 0.1820 | 0.1843 | 0.0597 | 120 | 3.0732 | 1.0244 | 3 | 0.2727 | 0.7488 | 1.0180 | 1.9410 | 0.6674 | 0.6758 | 0.2189 | -1.2374 | -1.1126 | 0.4765 | -0.6511 | -0.4590 | -0.1277 | 0.1527 | 1.5242 | 2.4433 | -1.2902 | -1.1224 | 0.4261 | -0.7732 | -0.5484 | -0.1345 | 0.1036 | 0.504 | | 369.0 | 0.375 | 2.413 |
Asyr | AS03 | 0.2233 | 0.2771 | 0.5083 | 0.1809 | 0.1687 | 0.0664 | 117 | 0.3226 | 0.1935 | 10 | 0.2699 | 0.8274 | 1.0267 | 1.8833 | 0.6703 | 0.6251 | 0.2460 | 0.3758 | -1.1245 | -0.0044 | -0.6837 | -1.7987 | 0.4420 | 0.1090 | 0.1600 | 0.4616 | 0.3313 | -1.1340 | -0.0489 | -0.8075 | -1.9150 | 0.4329 | 0.0607 | 1.680 | | 372.0 | 0.000 | 0.000 |
Asyr | AS04 | 0.2081 | 0.2403 | 0.4241 | 0.1520 | 0.1796 | 0.0515 | 76 | 1.2475 | 0.2970 | 3 | 0.2484 | 0.8377 | 0.9673 | 1.7071 | 0.6118 | 0.7229 | 0.2073 | -0.9080 | -1.9999 | -1.9328 | -1.5400 | -0.8626 | -0.8251 | -0.4878 | 0.6187 | 0.7084 | -0.9591 | -1.9899 | -1.9533 | -1.7100 | -0.9601 | -0.8288 | -0.5262 | 0.504 | | 303.0 | 0.188 | 0.665 |
Asyr | AS05 | 0.2136 | 0.3544 | 0.5368 | 0.1940 | 0.1842 | 0.0459 | 148 | 3.5573 | 0.9008 | 3 | 0.2941 | 0.7262 | 1.2049 | 1.8250 | 0.6596 | 0.6262 | 0.1560 | -0.4435 | 0.7144 | 0.6483 | -0.2956 | -0.4676 | -1.3013 | 0.5602 | 1.7643 | 2.1484 | -0.4922 | 0.6638 | 0.5957 | -0.3985 | -0.5571 | -1.3030 | 0.5044 | 0.504 | | 262.0 | 4.736 | 11.667 |
Asyr | AS07 | 0.2276 | 0.2984 | 0.4867 | 0.2161 | 0.1953 | 0.0840 | 61 | 1.5000 | 0.2400 | 1 | 0.2835 | 0.8030 | 1.0527 | 1.7170 | 0.7624 | 0.6890 | 0.2963 | 0.7389 | -0.6178 | -0.4991 | 0.3592 | 0.4857 | 1.9387 | -0.7061 | 0.7439 | 0.5724 | 0.6964 | -0.6386 | -0.5375 | 0.2916 | 0.4153 | 1.9232 | -0.7409 | 0.168 | | 150.0 | 2.138 | 0.668 |
Asyr | AS08 | 0.2103 | 0.3155 | 0.4390 | 0.1806 | 0.1967 | 0.0581 | 69 | 0.1833 | 0.3333 | 0 | 0.2751 | 0.7644 | 1.1468 | 1.5957 | 0.6564 | 0.7150 | 0.2112 | -0.7222 | -0.2110 | -1.5915 | -0.6926 | 0.6059 | -0.2638 | -0.5896 | 0.0909 | 0.7950 | -0.7724 | -0.2409 | -1.6163 | -0.8169 | 0.5379 | -0.2700 | -0.6264 | 0.000 | | 90.0 | 0.000 | 0.000 |
Asyr | AS09 | 0.2261 | 0.3385 | 0.5744 | 0.2101 | 0.1898 | 0.0597 | 388 | 2.5053 | 0.3915 | 1 | 0.3022 | 0.7481 | 1.1200 | 1.9005 | 0.6952 | 0.6280 | 0.1975 | 0.6122 | 0.3361 | 1.5094 | 0.1815 | 0.0134 | -0.1277 | 4.0535 | 1.2426 | 0.9337 | 0.5690 | 0.2940 | 1.4462 | 0.1042 | -0.0665 | -0.1345 | 3.9399 | 0.168 | | 421.5 | 8.181 | 5.167 |
Asyr | AS10 | 0.2214 | 0.3508 | 0.5406 | 0.2295 | 0.1997 | 0.0525 | 47 | 2.0200 | 0.5600 | 0 | 0.3026 | 0.7317 | 1.1593 | 1.7865 | 0.7584 | 0.6599 | 0.1735 | 0.2153 | 0.6287 | 0.7353 | 0.7563 | 0.8636 | -0.7400 | -0.9099 | 1.0018 | 1.3357 | 0.1700 | 0.5801 | 0.6817 | 0.7100 | 0.8008 | -0.7442 | -0.9413 | 0.000 | | 50.0 | 0.000 | 0.000 |
Asyr | AS11 | 0.2175 | 0.3166 | 0.4320 | 0.1501 | 0.1746 | 0.0535 | 35 | 0.1389 | 0.0278 | 0 | 0.2685 | 0.8102 | 1.1793 | 1.6092 | 0.5591 | 0.6504 | 0.1993 | -0.1141 | -0.1849 | -1.7518 | -1.5963 | -1.2920 | -0.6550 | -1.0845 | 0.0689 | 0.0663 | -0.1611 | -0.2154 | -1.7747 | -1.7693 | -1.3981 | -0.6595 | -1.1131 | 0.000 | | 36.0 | 3.410 | 0.628 |
df_syr <- subset(df, species == "Asyr")
df_vir <- subset(df, species == "Avir")
df_tub <- subset(df, species == "Atub")
Refit the linear models from the Chapter 8 exercises, but now use the methods recommended in this chapter (hierarchical partitioning or LMG, or PMVD) to assess the relative importance of each predictor in each model.
If you’re really keen, there’s a third species in the dataframe (Atub).
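To see what hierarchical partitioning (LMG) computes, the following is a minimal base-R sketch rather than a recommended workflow. It averages each predictor's gain in R² over every subset of the other predictors, weighted so that all orderings of entry count equally. The helper `lmg_shares()` is hypothetical (it is not from any package), and the built-in `mtcars` data stand in for the milkweed data so the block runs on its own. Packages such as relaimpo (`calc.relimp()` with `type = "lmg"`) are the usual route in practice.

```r
# Hypothetical helper: LMG / hierarchical-partitioning shares of R^2, base R only.
lmg_shares <- function(data, response, predictors) {
  p <- length(predictors)
  # R^2 of the model containing the predictors in `vars` (0 for the empty model)
  r2 <- function(vars) {
    if (length(vars) == 0) return(0)
    summary(lm(reformulate(vars, response = response), data = data))$r.squared
  }
  shares <- setNames(numeric(p), predictors)
  for (j in predictors) {
    others <- setdiff(predictors, j)
    for (k in 0:(p - 1)) {
      # all subsets of size k drawn from the other predictors
      subsets <- if (k == 0) list(character(0)) else combn(others, k, simplify = FALSE)
      # weight = proportion of orderings in which exactly these k predictors precede j
      w <- factorial(k) * factorial(p - k - 1) / factorial(p)
      for (s in subsets) {
        shares[j] <- shares[j] + w * (r2(c(s, j)) - r2(s))
      }
    }
  }
  shares
}

# Stand-in example (not the milkweed data)
shares <- lmg_shares(mtcars, "mpg", c("wt", "hp", "disp"))
full_r2 <- summary(lm(mpg ~ wt + hp + disp, data = mtcars))$r.squared
print(round(shares, 3))
print(round(c(sum_of_shares = sum(shares), full_r2 = full_r2), 3))
```

The shares sum to the R² of the full model, which is the defining property of this decomposition. For the exercises, pass `df_syr` or `df_vir` with the response and trait columns used in your Chapter 8 models. The number of models fitted grows quickly with the number of predictors, so this brute-force version only suits small predictor sets.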
Now use AIC and Akaike weights to find the most parsimonious model (best fit with fewest predictors) for each combination of fitness component and species. If several models have similar AICs, use full (zero) model averaging to produce a final model.
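The arithmetic behind Akaike weights and full (zero) averaging can be sketched in base R as follows. This is an illustration under stated assumptions: `mtcars` and the three predictors are stand-ins for the milkweed variables, and the averaging here runs over all subsets, whereas you may prefer to restrict it to the models with similar AICs. Packages such as MuMIn (`dredge()`, `model.avg()`) automate the same steps.

```r
# Stand-in data and predictors (assumptions; swap in the milkweed columns)
dat <- mtcars
response <- "mpg"
predictors <- c("wt", "hp", "qsec")

# Every subset of the predictors, starting with the intercept-only model
subsets <- list(character(0))
for (k in seq_along(predictors)) {
  subsets <- c(subsets, combn(predictors, k, simplify = FALSE))
}

fits <- lapply(subsets, function(v) {
  rhs <- if (length(v) == 0) "1" else v
  lm(reformulate(rhs, response = response), data = dat)
})

# AIC differences and Akaike weights
aic <- sapply(fits, AIC)
delta <- aic - min(aic)
w <- exp(-delta / 2) / sum(exp(-delta / 2))

# Full (zero) averaging: a coefficient counts as 0 in models that omit the predictor
coef_names <- c("(Intercept)", predictors)
coef_mat <- t(sapply(fits, function(f) {
  b <- setNames(numeric(length(coef_names)), coef_names)
  b[names(coef(f))] <- coef(f)
  b
}))
avg_coef <- colSums(coef_mat * w)  # row i of coef_mat is weighted by w[i]

model_table <- data.frame(
  model = sapply(subsets, function(v) if (length(v) == 0) "1" else paste(v, collapse = " + ")),
  AIC = aic, delta = delta, weight = w
)
print(model_table[order(model_table$delta), ], digits = 3)
print(round(avg_coef, 4))
```

Setting absent coefficients to zero shrinks the averaged estimates of predictors that appear only in poorly supported models, which is the point of the full rather than the conditional average.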
Peraza et al. (2023) studied what factors drove mercury accumulation in muscle tissues of a high-altitude carnivore, the wolverine (Gulo gulo). Wolverine muscle (for Hg) and hair (for N and C stable isotopes) samples were obtained from carcasses submitted by trappers and hair snags across four Canadian provinces. We will focus on total Hg concentration (µg/g dry weight) in muscle as the response and 14 predictor variables measured at the point of collection:
Start by reading in the data.
library(magrittr)  # provides the %>% pipe used below
peraza <- read.csv("../data/peraza clean.csv")
knitr::kable(head(peraza, 10), booktabs = TRUE) %>%
  kableExtra::kable_styling(latex_options = c("HOLD_position", "scale_down", "striped"))
ageclass | sex | long | lat | thg | delta15n | delta13c | hgdep | hgwet | prec | tempmax | tempmin | elev | dist | soc | sph10 | sph60 | X |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Adult | Male | -139.2 | 64.4 | 0.095 | 5.775 | -24.480 | 8.644 | 5.731 | 30.459 | 0.242 | -10.445 | 1205 | 496304.4 | 19.000 | 56 | 58 | NA |
Yearling | Female | -136.5 | 62.2 | 1.184 | 9.570 | -25.438 | 10.364 | 3.209 | 22.764 | 3.282 | -8.797 | 954 | 721306.5 | 144.104 | 56 | 60 | NA |
Adult | Female | -137.4 | 62.8 | 0.498 | 6.692 | -25.216 | 12.601 | 2.918 | 25.705 | 2.845 | -8.578 | 687 | 653101.6 | 114.319 | 54 | 60 | NA |
Yearling | Female | -133.1 | 60.7 | 0.308 | 5.747 | -25.773 | 9.165 | 3.253 | 28.955 | 4.186 | -7.525 | 926 | 883503.6 | 69.691 | 56 | 58 | NA |
Adult | Male | -135.4 | 63.3 | 0.058 | 6.330 | -25.684 | 9.067 | 2.700 | 31.001 | 3.142 | -9.092 | 598 | 590826.8 | 106.367 | 66 | 70 | NA |
Yearling | Male | -139.2 | 64.4 | 0.084 | 5.594 | -23.896 | 8.644 | 5.731 | 30.459 | 0.242 | -10.445 | 1205 | 496304.4 | 19.000 | 56 | 58 | NA |
Adult | Male | -140.6 | 64.6 | 0.120 | 4.000 | -33.459 | 11.238 | 2.964 | 25.016 | 3.363 | -8.605 | 442 | 501614.8 | 93.346 | 57 | 61 | NA |
Yearling | Male | -137.4 | 62.8 | 0.595 | 7.223 | -26.486 | 12.601 | 2.918 | 25.705 | 2.845 | -8.578 | 687 | 653101.6 | 114.319 | 54 | 60 | NA |
Yearling | Male | -137.4 | 62.8 | 0.309 | 7.280 | -25.037 | 12.601 | 2.918 | 25.705 | 2.845 | -8.578 | 687 | 653101.6 | 114.319 | 54 | 60 | NA |
Adult | Male | -132.0 | 61.3 | 0.052 | 4.728 | -25.333 | 6.673 | 3.062 | 31.631 | 3.376 | -9.474 | 1048 | 830036.0 | 81.455 | 55 | 58 | NA |
Do the usual pre-analysis checks of assumptions using boxplots, a scatterplot matrix and VIFs.
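As a reminder of what a VIF measures, here is a minimal base-R sketch: each predictor is regressed on all the others and VIF = 1 / (1 - R²). The helper `vif_manual()` is hypothetical and handles numeric predictors only, and the simulated data frame `sim` is an assumption so the block runs without the wolverine file. In practice `car::vif()` applied to the fitted model does the same job and also copes with factors such as `ageclass` and `sex`.

```r
# Hypothetical helper: variance inflation factors for numeric predictors
vif_manual <- function(data, predictors) {
  sapply(predictors, function(p) {
    # regress predictor p on all remaining predictors
    fit <- lm(reformulate(setdiff(predictors, p), response = p), data = data)
    1 / (1 - summary(fit)$r.squared)
  })
}

# Simulated stand-in: b is nearly a copy of a, c is independent of both
set.seed(42)
n <- 100
sim <- data.frame(a = rnorm(n))
sim$b <- sim$a + rnorm(n, sd = 0.1)
sim$c <- rnorm(n)

vifs <- vif_manual(sim, c("a", "b", "c"))
print(round(vifs, 2))
```

With the wolverine data, the same call would take `peraza` and the numeric predictor columns, alongside `boxplot()` for each variable and `pairs()` for the scatterplot matrix.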
Note the strong collinearity between minimum and maximum temperature, between soil pH at 10 and 60 cm, and between distance from the coast and latitude. The authors removed minimum temperature, latitude and pH at 60 cm from their model.
Fit a multiple regression model relating log total Hg to the remaining 11 predictors and check the residual plot.
Any indication of outliers of concern?
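One way to put numbers beside the residual plot is to tabulate studentized residuals and Cook's distances, as in the sketch below. The data are simulated (an assumption, so the block runs without the wolverine file), and the cut-offs of |residual| > 3 and Cook's D > 4/n are common rules of thumb rather than values taken from this chapter.

```r
# Simulated stand-in for a log-linear model of a positive response
set.seed(1)
n <- 60
sim <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
sim$y <- exp(0.5 + 0.8 * sim$x1 - 0.3 * sim$x2 + rnorm(n, sd = 0.4))

fit <- lm(log(y) ~ x1 + x2, data = sim)

# Per-observation diagnostics
checks <- data.frame(
  fitted = fitted(fit),
  rstud  = rstudent(fit),        # studentized residuals
  cook   = cooks.distance(fit)   # influence on the fitted coefficients
)

# Rule-of-thumb flags (assumed thresholds)
flagged <- which(abs(checks$rstud) > 3 | checks$cook > 4 / n)
print(checks[flagged, ])

# Residuals against fitted values, as in the usual residual plot
plot(checks$fitted, checks$rstud,
     xlab = "Fitted values", ylab = "Studentized residuals")
abline(h = 0, lty = 2)
```

For the exercise, replace `fit` with your model of log total Hg; flagged observations deserve a closer look but are not automatically candidates for removal.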
We recommend you proceed with the model fitted to all the data, but note that Peraza et al. omitted 17 observations as outliers, so your results will differ somewhat from theirs.
Tonkin et al. (2015) surveyed 80 freshwater stream sites in mid-latitude China to determine how different climate and catchment (watershed) variables predicted the richness of three different insect groups (Ephemeroptera, Plecoptera, Trichoptera; collectively abbreviated as EPT). They recorded 32 predictor variables in total but, to avoid collinearity (r > 0.7), only 17 variables were included in the analyses (see their Table 1).
The full dataset is available at http://dx.doi.org/10.6084/m9.figshare.1305679 but a tidied-up version including only the non-collinear predictors is available here. We will also not include region (a categorical variable) as a predictor, resulting in 16 predictors. Tonkin et al. used Poisson regression models to link richness to these predictors, but for the purposes of this chapter we will treat richness as normally distributed (its distribution wasn’t very skewed; you can use boxplots to see if you agree).
tonkin <- read.csv("../data/tonkin.csv")
head(tonkin, 10)
## sitecode region ept ephem plec trich trees_bl trees_nl shrub herbaceous
## 1 BS01 East 7 6 0 1 0.00000 100.00000 0.00000 0
## 2 BS02 East 14 10 1 3 0.00000 100.00000 0.00000 0
## 3 BS03 East 16 13 0 3 0.00000 84.61538 15.38462 0
## 4 BS04 East 17 14 0 3 0.00000 100.00000 0.00000 0
## 5 BS05 East 11 6 2 3 0.00000 100.00000 0.00000 0
## 6 BS06 East 12 10 1 1 0.00000 100.00000 0.00000 0
## 7 BS07 East 17 7 6 4 0.00000 100.00000 0.00000 0
## 8 BS08 East 18 12 4 2 42.85714 57.14286 0.00000 0
## 9 BS09 East 27 13 6 8 66.66667 33.33333 0.00000 0
## 10 BS10 East 9 7 1 1 0.00000 100.00000 0.00000 0
## cultivated water bio1 bio4 bio8 bio15 bio18 ai pet elevation slope
## 1 0 0 170 8420 211 55 573 14284 1153 99 2.631222
## 2 0 0 157 8406 198 53 588 15367 1062 292 2.318342
## 3 0 0 164 8476 205 54 578 14841 1103 186 2.871562
## 4 0 0 165 8503 206 54 556 14424 1094 145 1.416681
## 5 0 0 164 8354 205 54 585 14670 1124 211 3.630091
## 6 0 0 157 8264 197 54 615 15977 1084 338 5.192832
## 7 0 0 166 8407 206 55 580 14747 1127 157 2.908305
## 8 0 0 165 8400 206 54 582 15317 1104 167 6.563747
## 9 0 0 159 8327 199 54 603 15790 1081 285 6.270956
## 10 0 0 166 8487 207 54 572 14708 1115 126 3.333718
## catch_size
## 1 76
## 2 11
## 3 13
## 4 5
## 5 1
## 6 8
## 7 1
## 8 7
## 9 6
## 10 4
Which predictors had the strongest influence on EPT richness?
How do the results compare?
Use the same settings as they did (with the default bag fraction of 0.5):