STATISTICAL ANALYSIS
Differences in egg size between high- and low-density sites were
assessed using a General Linear Mixed Model (GLMM). This included egg
size as the response variable, density as a predictor (fixed effect),
location (random effect) and female condition as a covariate. Similarly,
differences in sperm length between low- and high-density sites were
assessed using sperm length as the response variable, density as a
predictor (fixed effect), location (random effect) and male condition as
a covariate.
Principal component analysis (PCA) was used to reduce the motility
parameters to two PCs with eigenvalues more than 1, explaining
respectively 60.75% and 23.35% of the variance (total variance
explained: 84.10%). PC1 was predominantly loaded by the three velocity
measures (VAP, VSL & VCL) and is therefore interpreted as a measure of
sperm swimming speed; PC2 was loaded most strongly by STR and LIN and is
therefore interpreted as a measure of sperm swimming straightness. These
two PCs (PC1 and PC2) were used in all subsequent analyses, and their
relationship with the original motility parameters is detailed in Table
1. We analysed differences in sperm motility between low- and
high-density populations at the two sperm concentrations
(106 and 108, see above) using a
General Linear Mixed Model (GLMM) within the lme4 package of R using PC
scores (PC1 and PC2) as dependent variables, population density and
sperm density as fixed factors with their interactions, location as a
random effect and male condition as a covariate. The same model was run
to analyse the percentage of motile sperm (arcsine-transformed before
use). Means are presented with their standard errors (SE).
We analysed fertilisation success using Markov-chain Monte-Carlo
generalized linear mixed models in R (package MCMCglmm, (Hadfield
2010)). With this Bayesian mixed model approach, we modelled the
proportion of normally and abnormally fertilised eggs following a
binomial distribution and obtained both an estimate of the components of
variance and an estimate of the interval of credibility. All models were
run for 1.3 × 107 iterations, with a thinning interval
of 100 (i.e., only one iteration from every 100 in the Markov chain was
used to estimate the posterior distribution of the parameters to reduce
the occurrence of autocorrelation between successive iterations), and a
burn-in of 3 × 105 (i.e., we discarded the first 3 ×
105 models of the simulation to avoid issues with
autocorrelation).
For the Bayesian models we included the fixed effects of population (3
levels, Geelong, Portarlington, and Williamstown), gamete age (time
elapsed between spawning and fertilisation), body condition (see above),
sperm concentration (fitted both as a linear and quadratic term),
density of the male (high vs low), density of the female (high vs low),
their interaction with each other and with sperm concentration. The
random effects used in the model included block
(V block), identity of male
(V male), identity of female
(V female), and an interaction between male and
female IDs (V male × female). The residual
variance (Ve ) of the model represents the
variance between replicates within a given male and female pair.
A necessary step in Bayesian statistical analyses is to set priors
before running the models. The term prior refers to the prior
distribution of a parameter before the data are analysed. The level of
information of the prior can vary from noninformative to highly
informative. When knowledge about the relationship between the variables
in the model is low, it is best to run the model with different priors
and to check whether these different priors provide different posterior
distributions (Hadfield 2010). We therefore ran the models using inverse
Wishart priors (equivalent to an inverse gamma distribution with
shape=scale=0.001; V=1, nu=0.002), parameter expanded priors (V=1, nu=1,
alpha.mu=0, alpha.V=1000), and non-informative improper priors (V=1e-16,
nu=-2). Although we present results from the model using parameter
expanded priors, the conclusions did not qualitatively change according
to prior specifications. We inspected the 95% highest posterior density
(HPD) intervals associated with each fixed effect to check whether they
overlapped with zero. A 95% HPD interval contains most of the posterior
distribution and is analogous to a confidence interval in the
frequentist approach; we considered estimates with 95% HPD intervals
overlapping with zero as non-significant.
We ran an initial model for the proportion of normally fertilised eggs
while excluding data from the highest sperm concentration to facilitate
the interaction between sperm concentration and population density
(i.e., including the highest sperm concentration made the fertilisation
curve very complex). To model the proportion of abnormally fertilised
eggs, we ran a second model while keeping data in the highest sperm
concentration. The proportion of abnormally fertilised eggs was very low
at other sperm concentrations.
We ran a third model, this time including data from all sperm
concentrations and allowed variance components to vary with sperm
concentration. We used the “idh()” function to model heterogeneous
variance components for V male,V female, and V male ×
female according to each concentration. Hence, there were 20 variance
components estimated in this model (i.e., V male,V female, and V male ×
female estimated separately for sperm concentrations and a singleV block and Ve ). By
contrast to Gaussian data, with binomial data it is not recommended to
compare different models using the deviance information criteria (an
index produced by MCMCglmm models that balances the fit of the model
based on the number of parameters used in the model). Thus, we cannot
formally test whether model fit was improved by allowing heterogeneous
variance components across sperm concentrations. Instead, we inspected
the 95% HPD intervals associated with each random effect to check
whether they overlapped. We considered estimates with non-overlapping
95% HPD intervals as significantly different (Hadfield 2010). Note
that, as the lower limit of a variance component is bound to zero, its
lower 95% HPD can be extremely close to, but cannot overlap zero. Thus,
inspection of the HPDs cannot be formally used to test whether a
variance component is significantly greater than zero (Hadfield 2010).
Nevertheless, the 95% credible intervals around the variance estimates
provide a measure of the precision of the estimate and allowed us to
test whether V male × female differed across sperm
concentrations.
Data Availability statement
Data and associated r-code for the fertilisation analysis can be found
here:
https://datadryad.org/stash/share/Yx-wxEJsD-SjrHkHG9lVShv1Qw-KFTAT55vBMvUBVQ4