Estimating population-level flowering onsets and
terminations from simulated herbarium data
We generated phenoclimate models for each species from each set of
simulated specimen collection dates using quantile regression (Koenker
et al. 2018) in RStudio (Team 2020). In all cases, each model regressed
observed DOYs of the phenological snapshots of all sampled individuals
of a given species against mean annual temperature. From these 1450
models (representing each of the species-specific models for all 1200
species plus the additional 150 models exhibiting population-level
collection biases and the 100 models exhibiting individual-level
collection biases), we predicted the 10th,
50th, and 90th percentiles of
flowering DOYs for each species from mean annual temperatures
corresponding to the years and locations of their source populations. We
then calculated the mean absolute error (MAE) of the linear regression
of the known timing of the onset (or termination) of the peak flowering
period for each reference population on the predicted DOYs produced by
each phenoclimate model based on the simulated herbarium data. For each
metric of population-level phenology (i.e., flowering onset, peak (i.e.,
median DOY), and termination), we then used Tukey HSD tests to compare
the mean accuracies (estimated as MAE) of these predicted DOYs versus
the actual population-level metrics among models constructed from
species that differed in their phenological sensitivities to climate,
flowering durations, degrees of intrapopulation variation in
phenological timing, and collection biases.
Similarly, we tested whether the mean MAE of estimated peak flowering
onset and termination dates among groups of species that exhibited the
same flowering duration, phenological responsiveness, and
intrapopulation phenological variation differed significantly from the
mean MAE of estimated median flowering dates for each group of simulated
species that exhibited the same flowering duration, phenological
responsiveness, and intrapopulation phenological variation. We used
Tukey HSD tests to compare the accuracy of estimated onset, median, and
termination dates of the peak flowering period among all species
produced from each of the simulated datasets.
Finally, we re-fit all 1200 models (including all 24 combinations of
species parameters but excluding models constructed to test the effects
of collection biases) with randomly selected subsets of data (100–1000
specimens per species) to determine how sample size affected model
performance and predictive accuracy. To evaluate whether more data would
be needed when variation in phenology among populations is not perfectly
explained by the climate variables included in the model, we ran
additional simulations in which population-level mean DOYs (and
associated onset and termination DOYs of the flowering period) of each
species at each sampled location and year included random variation not
associated with local climate: adding either ±5 days (i.e., a low-noise
scenario) or ±15 days (i.e., a high-noise scenario) to the DOYs of the
onset, median, and termination of flowering DOYs. For each location and
year, the random offsets of the DOYs of flowering onset, median
flowering DOY, and flowering termination were identical, such that
random variation was incorporated only into the timing of flowering, and
not its duration.