Lemmens et al. (2015) studied the biotic communities of artificial ponds in Belgium in detail. They sampled 28 ponds representing different management types, which combined fish farming strategies (no fish, farming young fish, low-intensity management, no management) with drainage frequencies (> 10 years ago, occasional, annual). They quantified taxon abundances for fish, zooplankton, and macroinvertebrates (different families, and species within some groups), along with the cover of submerged, floating, and emergent vegetation. The macroinvertebrate dataset included only 23 ponds; we use these data to illustrate correspondence analysis (CA) by ordinating the macroinvertebrate community (abundances of families).
The paper is cited below, and the pond subset for this example is lemminvert2.csv.
Lemmens, P., Mergeay, J., Van Wichelen, J., De Meester, L. & Declerck, S. A. (2015). The impact of conservation management on the community composition of multiple organism groups in eutrophic interconnected man-made ponds. PLoS One, 10, e0139371.
Plots used for QK use the ggplot classic theme, with some tweaks. The tweaks are consolidated into theme_qk(); use this theme for figures and modify it in one place to avoid repetitive code changes.
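Packages: vegan. The code below also calls read_csv() (readr), ggplot() (ggplot2), and geom_text_repel() (ggrepel), and it combines plots with +, which presumably relies on patchwork.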
Convert data to contingency table
Get chi-square test of independence
lemminvert <- read_csv("../data/lemminvert2.csv")
Rows: 23 Columns: 32
── Column specification ────────────────────────────────────────────────────────────
Delimiter: ","
chr (2): site, manag
dbl (30): managsymb, ca, ba, ac, ly, pla, sp, vi, co, na, ne, no, ple, ga, as, c...
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
lemminvert1.tab <- as.table(as.matrix(lemminvert[,-(1:3)]))
chisq.test(lemminvert1.tab, correct = FALSE)
Warning: Chi-squared approximation may be incorrect
Pearson's Chi-squared test
data: lemminvert1.tab
X-squared = 24485.276, df = 616, p-value < 2.2204e-16
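The P value is very low (< 2.2e-16), so independence between ponds and taxa is rejected. The 616 degrees of freedom correspond to (23 − 1) × (29 − 1), for 23 ponds and 29 taxon columns. The warning indicates that some expected counts are small, so the chi-squared approximation should be treated with caution.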
lemminvert1 <- lemminvert[,-(1:3)]
lemmens1.ca <- cca(lemminvert1)
summary(lemmens1.ca, scaling=1)
Call:
cca(X = lemminvert1)
Partitioning of scaled Chi-square:
Inertia Proportion
Total 1.306439 1
Unconstrained 1.306439 1
Eigenvalues, and their contribution to the scaled Chi-square
Importance of components:
CA1 CA2 CA3 CA4 CA5 CA6
Eigenvalue 0.3646027 0.1563418 0.1471454 0.1350028 0.12477099 0.09851886
Proportion Explained 0.2790813 0.1196702 0.1126309 0.1033365 0.09550466 0.07541024
Cumulative Proportion 0.2790813 0.3987515 0.5113824 0.6147189 0.71022356 0.78563380
CA7 CA8 CA9 CA10 CA11
Eigenvalue 0.08073187 0.04604113 0.04071200 0.03138836 0.02620429
Proportion Explained 0.06179537 0.03524170 0.03116258 0.02402589 0.02005780
Cumulative Proportion 0.84742917 0.88267087 0.91383345 0.93785935 0.95791715
CA12 CA13 CA14 CA15 CA16
Eigenvalue 0.01864224 0.01378977 0.006884844 0.005143572 0.004453789
Proportion Explained 0.01426951 0.01055523 0.005269932 0.003937094 0.003409106
Cumulative Proportion 0.97218665 0.98274189 0.988011820 0.991948913 0.995358020
CA17 CA18 CA19 CA20
Eigenvalue 0.003260623 0.0011283312 0.0008856922 0.0005362529
Proportion Explained 0.002495810 0.0008636694 0.0006779439 0.0004104692
Cumulative Proportion 0.997853830 0.9987174993 0.9993954433 0.9998059125
CA21 CA22
Eigenvalue 0.0001551241 9.843940e-05
Proportion Explained 0.0001187381 7.534941e-05
Cumulative Proportion 0.9999246506 1.000000e+00
summary(lemmens1.ca, scaling=2)
Call:
cca(X = lemminvert1)
Partitioning of scaled Chi-square:
Inertia Proportion
Total 1.306439 1
Unconstrained 1.306439 1
Eigenvalues, and their contribution to the scaled Chi-square
Importance of components:
CA1 CA2 CA3 CA4 CA5 CA6
Eigenvalue 0.3646027 0.1563418 0.1471454 0.1350028 0.12477099 0.09851886
Proportion Explained 0.2790813 0.1196702 0.1126309 0.1033365 0.09550466 0.07541024
Cumulative Proportion 0.2790813 0.3987515 0.5113824 0.6147189 0.71022356 0.78563380
CA7 CA8 CA9 CA10 CA11
Eigenvalue 0.08073187 0.04604113 0.04071200 0.03138836 0.02620429
Proportion Explained 0.06179537 0.03524170 0.03116258 0.02402589 0.02005780
Cumulative Proportion 0.84742917 0.88267087 0.91383345 0.93785935 0.95791715
CA12 CA13 CA14 CA15 CA16
Eigenvalue 0.01864224 0.01378977 0.006884844 0.005143572 0.004453789
Proportion Explained 0.01426951 0.01055523 0.005269932 0.003937094 0.003409106
Cumulative Proportion 0.97218665 0.98274189 0.988011820 0.991948913 0.995358020
CA17 CA18 CA19 CA20
Eigenvalue 0.003260623 0.0011283312 0.0008856922 0.0005362529
Proportion Explained 0.002495810 0.0008636694 0.0006779439 0.0004104692
Cumulative Proportion 0.997853830 0.9987174993 0.9993954433 0.9998059125
CA21 CA22
Eigenvalue 0.0001551241 9.843940e-05
Proportion Explained 0.0001187381 7.534941e-05
Cumulative Proportion 0.9999246506 1.000000e+00
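The total inertia reported by both summaries (1.306439) is tied to the chi-square test run earlier: in CA, total inertia equals the Pearson chi-square statistic divided by the grand total of the table. The sketch below checks this identity in base R on a small made-up matrix (the object toy and its counts are hypothetical, not the pond data), computing the CA eigenvalues from the singular value decomposition of the standardized residuals rather than through vegan.

```r
# Toy abundance matrix (hypothetical counts, NOT the pond data)
toy <- matrix(c(10, 0, 3,
                 2, 8, 1,
                 0, 5, 9,
                 4, 4, 4),
              nrow = 4, byrow = TRUE,
              dimnames = list(paste0("site", 1:4),
                              c("taxA", "taxB", "taxC")))

N <- sum(toy)                    # grand total of counts
P <- toy / N                     # correspondence matrix (proportions)
row_mass <- rowSums(P)
col_mass <- colSums(P)
E <- outer(row_mass, col_mass)   # expected proportions under independence

# Standardized residuals; their squared singular values are the CA eigenvalues
S <- (P - E) / sqrt(E)
eig <- svd(S)$d^2
eig <- eig[eig > 1e-12]          # drop the trivial (numerically zero) axis

total_inertia <- sum(eig)

# Pearson chi-square on the same table; the small-expected-count warning
# is suppressed because this example is deliberately tiny
x2 <- unname(suppressWarnings(chisq.test(toy, correct = FALSE))$statistic)

length(eig)                        # number of axes: min(rows, columns) - 1
all.equal(total_inertia, x2 / N)   # inertia equals X-squared / N
```

Applied to the values reported above, 24485.276 / 1.306439 is roughly 18,742, which would be the grand total of individuals in the pond table if it follows the same identity. The same reasoning explains why 22 axes are listed: with 23 ponds and 29 taxon columns, min(23, 29) − 1 = 22.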
ordiplot(lemmens1.ca, scaling=1, type="text")
ordiplot(lemmens1.ca, scaling=2, type="text")
Get broken stick graphs
lemmens1.ca.eig <- lemmens1.ca$CA$eig
evplot(lemmens1.ca.eig) # evplot() is not a vegan function; it is assumed to have been sourced beforehand
screeplot(lemmens1.ca,bstick=TRUE)
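To make explicit what the broken-stick comparison does, here is a minimal base R sketch. It assumes the standard broken-stick expectation, where axis k of p axes is expected to account for (1/p) × Σ 1/i (summed from i = k to p) of the total, and it uses the 22 eigenvalues copied from the summary output above. The object names (eig, bstick, comparison) are chosen for illustration only.

```r
# Eigenvalues copied from the summary() output above (CA1 to CA22)
eig <- c(0.3646027, 0.1563418, 0.1471454, 0.1350028, 0.12477099, 0.09851886,
         0.08073187, 0.04604113, 0.04071200, 0.03138836, 0.02620429,
         0.01864224, 0.01378977, 0.006884844, 0.005143572, 0.004453789,
         0.003260623, 0.0011283312, 0.0008856922, 0.0005362529,
         0.0001551241, 9.843940e-05)

p <- length(eig)
prop <- eig / sum(eig)            # observed proportion explained per axis

# Broken-stick expectation for axis k: (1/p) * sum of 1/i for i = k..p
bstick <- rev(cumsum(rev(1 / seq_len(p)))) / p

comparison <- data.frame(
  axis         = paste0("CA", seq_len(p)),
  observed     = round(prop, 4),
  broken_stick = round(bstick, 4),
  exceeds      = prop > bstick    # TRUE where the axis beats the null expectation
)

head(comparison, 8)
```

Axes whose observed proportion exceeds the broken-stick value are the ones usually considered worth interpreting; the exceeds column flags them so they can be read against the screeplot.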
Do nice biplots
library(ggrepel)
# Extract scores into smaller data frames
a<-as.data.frame(lemmens1.ca$CA$u) #u is sites
b<-as.data.frame(lemmens1.ca$CA$v) #v is variables
b$fam<-row.names(b) #add family names for plotting
a<-cbind(lemminvert[c(1:3)],a) #Add site names & symbols from original data file
br=c("nm","li","nf","yf")
la=c("None", "Light", "No fish", "Young fish")
p1a<-ggplot(data=b, aes(x=CA1, y=CA2))+
geom_point()+
geom_text_repel(aes(label=fam), size=2, max.overlaps=25)+
theme_qk()+
xlim(-2,3)+
ylim(-2,10)
p2<-ggplot(data=a, aes(x=CA1, y=CA2, shape=manag))+
geom_point()+
labs(y=NULL)+
scale_shape_manual(values=sym4,
name="Management",
breaks=br,
labels=la,
guide =
guide_legend(label.theme = element_text(size=6),
title=NULL)
)+
xlim(-2,3)+
ylim(-2,10)+
theme_qk()
p3<-p1a+p2
p3
Colour version of the right-hand panel
p2a<-ggplot(data=a, aes(x=CA1, y=CA2, color=manag))+
geom_point()+
labs(y=NULL)+
scale_color_viridis_d(
name="Management",
breaks=br,
labels=la,
guide =
guide_legend(label.theme = element_text(size=6),
title=NULL)
)+
xlim(-2,3)+
ylim(-2,10)+
theme_qk()
p3c<-p1a+p2a
p3c