Chapter 7

The aim of these exercises is to make sure you’re comfortable with fitting linear models with two or more categorical predictors. We will focus on normally distributed responses.

For each of the 5 examples below, you should follow the sequence we’ve used previously:

  1. What is the biological question and what are the response and predictor variables?

  2. What distribution do you expect the response variable to follow?

  3. Are the predictors continuous or categorical?

  4. Write out the linear model corresponding to this question.

  5. What are the assumptions behind the statistical model you’ll fit?

    1. Are those assumptions satisfied?
  6. Fit the model

    1. How will you assess whether the model fits well?

    2. Can you detect an effect of the predictor?

    3. How do you measure the effect?

  7. What next steps would you suggest?

  8. What do you conclude (including any cautions)?

A. Psychostimulant effects on vole physiology

Hostetler, Phillips, and Ryabinin (2016) examined the effects of methamphetamine (MA) on hypothalamic neuropeptides in the brains of laboratory-housed prairie voles. Individual voles were the experimental units, and there were two fixed factors. Treatment, representing access to MA or water, was an experimental factor, crossed with sex (male vs. female), an observational factor, with five or six voles in each of the four combinations.

We’ll start with the oxytocin data as the response variable. The data are in the supplementary files of the paper or use our version. Import hostetler_OT data file (hostetler OT.csv), and we’ll make sure that sex and treatment are treated as factors.

host_ot <- read.csv("../data/hostetler OT.csv")
head(host_ot,10)
##    Animal.ID trtmt sex pvn1 pvn2 pvn3 pvn4 pvn5 pvn6 otmean
## 1   231.15-5     0   1   35   32   56   51   NA   NA   43.5
## 2   220.17-2     1   1   39   44   41   29   34   27   35.7
## 3    248.1-4     0   1   39   30   55   34   80   73   51.8
## 4    240.7-4     1   1   26   26   70   56   NA   NA   44.5
## 5   220.17-4     0   1   48   55   68   70   NA   NA   60.3
## 6   231.15-6     1   1   66   51   38   52   50   39   49.3
## 7   220.17-5     0   1   57   59   59   47   NA   NA   55.5
## 8    240.7-3     1   1   25   29   39   37   35   31   32.7
## 9   232.17-2     0   1   42   57   NA   NA   NA   NA   49.5
## 10  220.17-3     1   1   43   56   59   83   NA   NA   60.3
# make sex & trtmt factors
host_ot$sex<-as.factor(host_ot$sex)
host_ot$trtmt<-as.factor(host_ot$trtmt)

Hint. To help you analyse this data set in R, have a look at the online version of Box 7.3, which includes some relevant R code.

Would the same approach work for the ATV response?

B. Metabolic responses of salmon to temperature change

Kraskura et al. (2020) examined sex-specific differences in the metabolic responses of coho salmon (Oncorhynchus kisutch) to three different temperatures: 9°C (typical), 14°C (current maximum) and 18°C (climate change scenario). Fish were collected from a hatchery in British Columbia, Canada, and returned to a research laboratory and held in outdoor tanks at 9°C. Temperatures were raised slowly to the relevant test temperature and held for 2 days. Fish were then transferred individually to a swim tunnel respirometer where critical swim speed was measured by increasing the water velocity until the fish could no longer maintain its position in the water column. In addition, the maximum metabolic rate was also recorded during each trial. The fish were subsequently euthanized and their sex determined.

Analyze these data to examine the effects of temperature, sex and the interaction between these two fixed factors on absolute critical swim speed (cm s-1).

The data are available on Dryad. They are in an Excel document that needs some tidying before using in R, as sheets with data include columns for summary statistics, etc. A tidied version is here.

krask <- read.csv("../data/krask_swim.csv")
head(krask,10)
##       fish_id sex temperature bm_test_kg ucrit_bl_s ucrit_cm_s
## 1  Oct06fish1   F           9      2.461      2.836    167.756
## 2  Oct07fish2   F           9      1.525      2.488    113.567
## 3  Oct08fish1   F           9      1.885      3.592    135.208
## 4  Oct09fish1   F           9      1.614      3.184    112.879
## 5  Oct10fish2   F           9      1.984      3.647    155.070
## 6  Oct14fish1   F           9      2.142      3.132    160.883
## 7  Oct14fish2   F           9      2.028      2.767    105.653
## 8  Oct16fish1   F           9      2.581      2.936    168.327
## 9  Oct16fish2   F           9      2.708      2.638    140.878
## 10 Oct16fish3   F           9      3.255      3.086    154.580
# make sex & temperature factors
krask$sex<-as.factor(krask$sex)
krask$temperature<-as.factor(krask$temperature)

Would the same approach work for the maximum metabolic rate (mgO2 kg-1 min-1) response?

The tidied metabolic rates file is here; the variable we’re interested in is total_epoc

krask <- read.csv("../data/krask_metab.csv")
head(krask,10)
##       fish_id sex temperature bm_test_kg fl_test_cm    mmr   rmr    AAS    FAS
## 1  10_16_b1_1   F           9       1.63       52.7 12.160 1.380 10.780  8.812
## 2  10_16_b1_2   F           9       2.10       41.1 11.980 1.127 10.853 10.630
## 3  10_16_b1_3   F           9       1.62       55.0  9.750 1.099  8.651  8.872
## 4  10_16_b2_3   F           9       2.61       59.6  8.840 1.280  7.560  6.906
## 5  10_22_b1_1   F           9       2.54       60.8 11.916 1.262 10.654  9.442
## 6  10_22_b1_2   F           9       2.54       59.3 11.501 1.080 10.421 10.649
## 7  10_22_b2_1   F           9       2.56       60.1 13.595 0.989 12.606 13.746
## 8  10_22_b2_3   F           9       2.74       61.7 14.832 1.368 13.464 10.842
## 9  10_30_b1_1   F           9       2.17       57.8 10.100 1.225  8.875  8.245
## 10 10_30_b1_2   F           9       2.31       58.4  7.839 1.308  6.531  5.993
##    total_epoc
## 1      512.49
## 2      616.91
## 3      367.30
## 4      387.92
## 5      612.14
## 6      690.88
## 7      485.28
## 8     2782.92
## 9      698.77
## 10     622.76
# make sex & temperature factors
krask$sex<-as.factor(krask$sex)
krask$temperature<-as.factor(krask$temperature)

C. Dietary supplements and osteoarthritis in dogs

This exercise is a little more complex, designed to start thinking about using additional predictors to give a clearer picture of any signals from predictors of interest, i.e. to reduce background noise in the data. Two things to note:

  • We’ll assume you’ve completed the previous two exercises, so you’re familiar with multifactor models, the questions they answer, etc. We won’t go through the initial checklist again, but recommend that you do it anyway.

  • We’ll come back to this example in exercises for later chapters (10, 11, and 13).

The example we’ll use is from Martello et al. (2022), who tested whether dietary supplements, which are often expensive, can help with symptoms of osteoarthritis in dogs. Their study was experimental, with 40 dogs with chronic pain symptoms. Dogs were also screened and excluded if, for example, they had recent surgery, were taking steroidal medication, etc. The dogs were then allocated randomly to one of two treatments, a dietary supplement that included glucosamine sulfate and chondroitin, and a placebo. The study was completely blind, so administration of treatments and recording of data was done with no knowledge of what each dog had received. The supplements were administered for 60 days, and dogs received treatments orally at 2g/10kg body weight. The treatments are the variable GROUP.

Two response variables were used, both involving scoring of dogs’ behaviour and activity:

  1. Dog owners used a questionnaire, responses to which generated a value on a 40 point Helsinki Chronic Pain Index, and these assessments were done at the start of treatment, and after 40 and 60 days. In the data file, these measures are the variables HCPI, HCP.4 and HCP.6

  2. Veterinary assessments were done at the same 3 times. These assessments were on a 5-point scale reflecting increasing degrees of lameness. These variables are SEGNI.OA.VET, SEGNI.OA.VET.4, SEGNI.OA.VET.6

We won’t deal with them here, but veterinarians also measured a range of physiological and biochemical variables from a blood sample at 0 and 60 days.

The authors stated that osteoarthritis is affected by a range of other factors, including sex, body weight, breed, etc., and they also considered whether other things like desexing, with its hormonal changes, might be important. They measured several other potential predictors:

  • Sex
  • Sterilization (Yes/No)
  • Body weight (PESO.KG)
  • Breed (RAZZA)
  • Estimated age (ETA)
  • Body condition score, which combines all of these to produce a 4-point scale (BCS).

The data are in the supporting information (and a copy is here).

df <- read.csv("../data/martello.csv")
head(df,10)
##    group PAZIENTE FAR      breed wt eta ster sex REGIONE.ANATOMICA     Note
## 1    CTR       21   B   METICCIO 21  10    N   F           ANCA DX         
## 2    CTR       22   B  DOGUE DEB 47   6    N   M            GOMITI         
## 3    CTR       23   B  PASTORE T 23   9    N   M         GOMITO DX         
## 4    CTR       24   B   METICCIO 32   8    S   F        ANCA DX/SX         
## 5    CTR       25   B   METICCIO 29   4    N   M      GOMITO DX/SX         
## 6    CTR       26   B   METICCIO 35  12    N   M        ANCA DX/SX         
## 7    CTR       27   B   METICCIO 30   8    N   F         GOMITO DX CARPO DX
## 8    CTR       28   B LABRADOR R 28  11    N   F      GINOCCHIO DX         
## 9    CTR       29   B   GOLDEN R 24   7    S   F           ANCA SX         
## 10   CTR       30   B    LAGOTTO 18   9    S   F      GINOCCHIO DX TARSO DX
##    ESAMI.EMATICI RX bcs hcp0 hcp40 hcp60 vet0 vet40 vet60 clin.BASELINE.MV.0
## 1            NRM SI   3   25    25    25    2     2     2                  1
## 2            NRM SI   3   26    26    28    2     2     2                  1
## 3            NRM SI   2   32    32    30    3     3     3                  2
## 4            NRM SI   3   34    34    32    3     3     3                  2
## 5            NRM SI   3   20    20    22    1     1     2                  0
## 6            NRM SI   4   26    26    28    2     2     2                  1
## 7            NRM SI   4   28    28    28    2     2     2                  1
## 8            NRM SI   3   22    22    20    1     1     1                  0
## 9            NRM SI   3   38    28    28    3     3     2                  0
## 10           NRM SI   3   40    36    40    3     2     3                  3
##    clin.FUP.MV.2 clin.FUP.MV.4 clin.FUP.MV.6 clin.BASELINE.PR.0 clin.FUP.PR.2
## 1              1             1             2                  1             1
## 2              1             1             2                  1             1
## 3              1             2             2                  2             0
## 4              2             1             1                  3             3
## 5              0             1             2                  1             1
## 6              1             1             2                  1             1
## 7              1             1             1                  1             0
## 8              0             1             1                  0             1
## 9              0             1             2                  1             1
## 10             3             3             3                  2             2
##    clin.FUP.PR.4 clin.FUP.PR.6
## 1              1             1
## 2              1             1
## 3              1             2
## 4              3             3
## 5              1             1
## 6              1             1
## 7              0             1
## 8              0             0
## 9              0             0
## 10             2             1
#Tidy up the names
#df_names<-c(group="GROUP", ster="sterilizzato", bcs="BCS",eta="ETA", breed="RAZZA",
#                  hcp0="HCPI", hcp40="HCPI.4", hcp60="HCP.6",
#                  vet0="SEGNI.OA.VET", vet40="SEGNI.AO.VET.4", vet60="SEGNI.AO.VET.6")
#df<-rename(df, !!!df_names)
# make sex, treatment, and sterilization factors
df$sex<-as.factor(df$sex)
df$group<-as.factor(df$group)
df$ster<-as.factor(df$ster)

The question of most interest is whether the dietary supplements result in an improvement after 60 days, and for now we’ll include sex in the analysis, as there are good grounds for expecting a range of sex-based differences.

Identify an appropriate model for examining whether the dog’s condition, as assessed by their owners, after 60 days, depends on whether a dog received the supplement or a placebo and whether those results depended on sex of the dog.

Before starting analysis, we’d do a little tidying up of the data file, to stay consistent with using lower-case for variable names and doing away with periods, etc. It doesn’t really matter, but it keeps things tidy. As we read in the data, we’ll also declare a few of the variables as factors.

Start with boxplots, then fit model and look at residuals

Would including desexing in the analysis improve the model fit or the conclusions?

Anything else worth checking?

Have a think about the analysis and the experimental design. We’ve analyzed the dogs’ status at the end of the experiment, and relied on randomization to keep us away from misleading results. Think about how you might use the measurements at time 0 to do some additional checks or analyses. Think of a couple of ways and then investigate them.

For interest, you could try using the body condition score (bcs) instead of sex and sterilization, as they were chosen by the authors to play the same role.

Other things to consider

As you work through later chapters, we’ll use this data set to explore a few other things: 1. See whether body weight plays a role in the response. Dosage rates were adjusted for body weight, but there are other considerations around forces acting on joints, etc. 2. Make use of the 3 measurements on each dog and see if there are more subtle aspects of the response 3. Take a look at the vet data; with only 5 values, it might be better not to try and force it into an analysis that presumes a normal distribution.

D. Honeydew production in aphids

We’ll return to the example from the Chapter 6 exercises, where Vosteen, Gershenzon, and Kunert (2016) examined patterns of production of honeydew by different races of pea aphids (Acyrthosiphon pisum) and how that attracts ovipositioning hoverflies (Episyrphus balteatus) to create enemy-free space for the aphids. They measured honeydew production in response to combinations of aphid races and native/universal hosts, and treated these combinations as a single factor.

We could view aphid race and whether the aphids were on their native host or a universal one as two separate factors. Use the modified data file to see what happens when you separate these factors.

#Read in the data and assign it to an object df
df <- read.csv("../data/vosteen.csv")
head(df,10)
##    clone_plant_combination honeydew clone plant hostplant
## 1                  T_Vicia     1.08     T Vicia Universal
## 2                  T_Vicia     2.21     T Vicia Universal
## 3                  T_Vicia     2.63     T Vicia Universal
## 4                  T_Vicia     1.63     T Vicia Universal
## 5                  T_Vicia     3.51     T Vicia Universal
## 6                  T_Vicia     2.53     T Vicia Universal
## 7                  T_Vicia     2.92     T Vicia Universal
## 8                  T_Vicia     0.98     T Vicia Universal
## 9                  T_Vicia     2.39     T Vicia Universal
## 10                 T_Vicia     2.05     T Vicia Universal
#Make sure clone and hostplant are factors
df$clone<-as.factor(df$clone)
df$hostplant<-as.factor(df$hostplant)

References

Hostetler, C. M., T. J. Phillips, and A. E. Ryabinin. 2016. “Methamphetamine Consumption Inhibits Pair Bonding and Hypothalamic Oxytocin in Prairie Voles.” PLoS One 11 (7): e0158178. https://doi.org/10.1371/journal.pone.0158178.
Kraskura, K, E A Hardison, A G Little, T Dressler, T S Prystay, B Hendriks, A P Farrell, et al. 2020. “Sex-Specific Differences in Swimming, Aerobic Metabolism and Recovery from Exercise in Adult Coho Salmon ( Oncorhynchus Kisutch ) Across Ecologically Relevant Temperatures.” Edited by Craig Franklin. Conservation Physiology 9 (1): coab016. https://doi.org/gr9cmm.
Martello, Elisa, Mauro Bigliati, Raffaella Adami, Elena Biasibetti, Donal Bisanzio, Giorgia Meineri, and Natascia Bruni. 2022. “Efficacy of a Dietary Supplement in Dogs with Osteoarthritis: A Randomized Placebo-Controlled, Double-Blind Clinical Trial.” Edited by Angel Abuelo. PLOS ONE 17 (2): e0263971. https://doi.org/gr9cmt.
Vosteen, Ilka, Jonathan Gershenzon, and Grit Kunert. 2016. “Hoverfly Preference for High Honeydew Amounts Creates Enemy-Free Space for Aphids Colonizing Novel Host Plants.” Edited by Kim Cuddington. Journal of Animal Ecology 85 (5): 1286–97. https://doi.org/f9csdq.