The online versions of the worked examples are different from those in the book:
For a start, they are less polished. The book versions have detailed explanations of the analysis and what it means. Here, we just provide brief statements as a guide to any next steps.
They include the R code, so you can run the analysis for yourself or use it for similar data structures and models
In most cases, the R code includes preliminary steps in the data analysis and may show alternative ways to fit a particular model.
They include links to the published paper and, if necessary, the data repository. We’ve tried to use examples from journals such as PLOS ONE, where the data links are clear within the paper.
Where we think it’s helpful, we’ve included images of the study organism or system. We haven’t bothered if the target is a mouse, assuming that you’ll have some familiarity (;-)), but there are others that involve less common species, and some examples where we found that an intriguing title (e.g. Are teasel carnivorous?) led us to a widely-distributed weed. The images are mostly available under Creative Commons licenses and good for including in lectures.
We will update these online examples periodically to maintain compatibility with R upgrades, improve the code, and possibly add more material. It’s also worth checking our personal website and GitHub for minor updates, suggestions, discussion, etc.
The scripts should be fine for R version 4.3.1.
Each example is based on an Rmarkdown notebook, which has two important features:
You can show or hide the code throughout the document or for individual code chunks. Sometimes it’s good to just see the comments and the output, while at other times you really want the code. Mix and match!
The Rmarkdown files can be downloaded using the button at the top right of each screen. This isn’t working at the moment, but you can save the html file and open it in RStudio, and you’ll see the code.
The scripts rely on two external R scripts:
libraries.R is a small script that loads packages that are used across several examples, plus packages that make life easier, such as the tidyverse and broom. It’s simpler to add packages via this script than to add them individually to each example. The script should install any packages that aren’t present and then load all of the packages.
appearance.R is a collection of tweaks used to produce the figure layouts in the book. It functions as a style sheet for graphics, and defines a set of variables used for axis colour, symbol colours, etc. Changing this script is a quick way to change lots of figures, e.g. to colours suitable for teaching, rather than the gray scales predominantly used in the printed book. It’s invoked in ggplot using +theme_qk(). Default ggplot settings reappear if you disable this line in the code.
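The install-then-load behaviour described for libraries.R can be sketched in base R as below. This is only a minimal illustration of the pattern, not the contents of the actual script: the function names missing_packages and load_packages are hypothetical, and the package names in the example call are placeholders.

```r
# Hypothetical helpers illustrating an "install if missing, then load" pattern.
# They are assumptions for illustration, not the code in libraries.R.

# Return the subset of pkgs that is not currently installed
missing_packages <- function(pkgs) {
  # requireNamespace() returns FALSE if a package is absent (it does not attach it)
  is_installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!is_installed]
}

# Install whatever is missing, then attach every package in pkgs
load_packages <- function(pkgs) {
  to_install <- missing_packages(pkgs)
  if (length(to_install) > 0) {
    install.packages(to_install)
  }
  # character.only = TRUE lets library() accept the name as a string
  invisible(lapply(pkgs, library, character.only = TRUE))
}

# Example call (commented out so nothing is installed by accident):
# load_packages(c("tidyverse", "broom"))
```

Keeping the list of packages in one place means a new package only has to be added once, which is the convenience the shared script is aiming for.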
For many examples, we’ve created our own working files from the raw data. When that’s the case, you’ll see a link to our file on the example page, and generally in the scripts as well.
We’ll load those files, rather than importing directly from a journal repository, Dryad, etc. That’s largely for convenience, and because most files require a little housekeeping before loading into a dataframe. Often that’s just some tidying of the first few rows and renaming of variables, but it can be more extensive. Excel files are quite common, and they often have a few header rows. We could deal with that as part of the script for each example, but for teaching purposes, it’s cleaner to just be able to load a file where the variable names match the rest of the script exactly.
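As a rough sketch of the kind of housekeeping involved, the code below builds a small stand-in file with two note rows above the real header, reads it while skipping those rows, renames the variables, and saves a tidy working file. The file contents and the names site, length, and mass are invented for illustration; they do not come from any of the examples.

```r
# Build a small stand-in for a repository file: two note rows, then the data.
# The contents and column names are made up for illustration.
raw_file <- tempfile(fileext = ".csv")
writeLines(c(
  "Supplementary data file",
  "Units: cm and g",
  "Site ID,Body Length (cm),Mass (g)",
  "A,10.2,3.1",
  "B,11.5,3.8"
), raw_file)

# Skip the two note rows so the third row is read as the header
raw <- read.csv(raw_file, skip = 2, check.names = FALSE)

# Rename to short names that the rest of a script can rely on
names(raw) <- c("site", "length", "mass")

# Save a tidy working file, then load it the way an example script would
clean_file <- tempfile(fileext = ".csv")
write.csv(raw, clean_file, row.names = FALSE)
dat <- read.csv(clean_file)
str(dat)
```

Once the working file exists, each example script only needs the final read.csv() call, which keeps the teaching code free of file-specific clean-up.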
In a few cases, the analyses in the published papers used a more complex subset of a larger dataset that was provided online. There, such as for Box 13.9, we’ve included an R code chunk replicating the data subset in the paper. In Box 13.9, that code chunk was kindly provided by the authors.
In future, we may provide the example data files as a complete set, either as a zip or an R package.
The pages for each of the worked examples are in one folder, and the data, R scripts, and documents/images are in separate folders. These folders are shared with other material, so the directory structure is slightly unusual. The Rmarkdown files, if you download them, link to three “parent” folders, R, data, and media. If you download the examples as a zipped document, that structure will be created, and you’ll just need to use the examples folder as the session directory.
If you download individual files, make sure you recreate that structure: main directory has example.html and four folders:
examples will contain the html file (if you have it) and any Rmd files
R has relevant R scripts
data has csv files
media has any images
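If you are recreating the layout by hand, something like the following would set up the four folders. It is a sketch only: the main directory name worked_examples is hypothetical, a temporary directory stands in for wherever you keep the files, and the relative path at the end assumes that the scripts reach the other folders by going up one level from examples.

```r
# Create the folder layout under a temporary directory (a stand-in for
# wherever you keep the downloaded material).
main_dir <- file.path(tempdir(), "worked_examples")  # hypothetical name
folders <- c("examples", "R", "data", "media")

for (f in folders) {
  dir.create(file.path(main_dir, f), recursive = TRUE, showWarnings = FALSE)
}

# Check what was created
list.dirs(main_dir, full.names = FALSE, recursive = FALSE)

# Assumption: with examples as the session directory, sibling folders are
# reached by going up one level, e.g. "../data"
examples_dir <- file.path(main_dir, "examples")
data_dir <- file.path(examples_dir, "..", "data")
dir.exists(data_dir)
```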
This chapter has one example box; it uses summary data from Chapter 2, which you can get from Box 2.2.
No worked examples.
No worked examples. The code to replicate the exploratory procedures is available on request.
Box 6.1 Single-predictor linear regression: coarse woody debris in lakes
Box 6.2 Single-predictor linear regression: soldier production in aphids
Box 6.3 Single-random predictor linear regression: brain and body weights of mammals
Box 6.4 Single-factor design with ordered treatment groups: fatty acid production in salmon
Box 6.5 Single-factor design: diatom communities in metal affected streams
Box 6.8 Linear regression diagnostics and transformations: diversity in mussel clumps
Box 6.10 Power analysis for single-predictor linear regression: bryophyte diversity
Box 8.1, 8.5 & 8.6 Multiple-predictor linear regression: cricket jump distance
Box 8.2 Multiple-predictor linear regression: bird abundance in remnant forest patches
Box 8.12 Single-factor ANCOVA design: sex and fruit fly longevity
Box 8.15 Overlapping covariate ranges is a statistical and a biological issue
Box 10.2 Single random factor design: diatom communities in metal affected streams
Box 10.3 Single random factor design with continuous covariate: gobies along the Rhine River
Box 10.6 Three-level nested design: nutrition of Crown-of-thorns seastar larvae
Box 10.7 Two-factor mixed crossed design: urchin sperm and ocean acidity
Box 10.10 Three-factor mixed crossed design: cold stress and fruit fly mating
Box 10.11 Two-factor crossed block design: seagrass mutualisms
Box 13.1 Single-predictor logistic regression: presence/absence of lizards on islands
Box 13.2 Multiple-predictor logistic regression: determinants of butterfly color
Box 13.3 Dose response curves: you might already be using GLMs or GLMMs without knowing it
Box 13.5 2 x 2 two-way contingency table: problem bears, cubs
Box 13.6 R x C two-way contingency table: cat owners’ attitudes
Box 13.7 Three-way contingency table: cetacean sensory genes
Box 13.8 Generalized linear mixed model (GLMM): birds “backrest roosting”
Box 13.9 Generalized additive model (GAM): disease and an endangered bivalve
Box 15.1 Principal components analysis (PCA): soil characteristics and land use in China
Box 15.2 Correspondence analysis (CA): invertebrates in artificial ponds
Box 15.3 Constrained ordination: invertebrates in artificial ponds
Box 15.4 Linear discriminant (function) analysis: cryptic diversity in leopard frogs
Additional material: example to illustrate importance of careful colour selection