Methods
What to measure? Diet-breadth dependent performance trade-offs could result from any number of mechanistic interactions between a herbivorous insect and a host plant. On a particular host plant, in comparison to a specialist, a generalist might have (1) a reduced ability to initiate feeding, (2) a lower feeding rate, (3) less efficient utilization of host nutrients, (4) greater susceptibility to host defenses, or (5) more exposure to natural enemies. No matter the mechanism, any trade-offs that drive the evolution of specialization would need to ultimately limit survival or fecundity. If specialization is an adaptive response to trade-offs between performance on alternative hosts, specialists should have higher survival or fecundity than generalists on shared resources. In the studied tropical forest plots, we were not able to measure survival or fecundity directly, but we were able to measure the abundance and patch occupancy of each diaspidid species on each host-plant species. As mentioned earlier, because after the crawler stage each diaspidid is stuck for life on one host, an observation of a second-instar or adult individual on a host is evidence of successful development on the host (Hill and Holmes 2009). Moreover, the relative abundance of diaspidid species on each host-plant species is an especially synthetic proxy for fitness – integrating across host-dependent differences in diaspidid fecundity and survival.
Sampling. We surveyed diaspidids at two tropical rainforest sites: (1) San Lorenzo National Park, Panama and (2) Lambir Hills National Park, Malaysia (on the island of Borneo). At each site, we used a crane to reach the forest canopy. We were not able to search each tree in each plot, so we used preexisting databases of the trees at each site to divide identified individual trees into sampling groups of one randomly-selected individual per tree species. Only trees over 10 cm diameter at breast height were considered. We did not sample any tree individual more than once, so tree species with only one individual were present only in the first round of sampling, those with two individuals were present in the first two rounds, and so on. This protocol allowed us to sample across the full diversity of host taxa while also getting multiple samples from common host species.
At each focal tree, 20 person-minutes were allocated to visual searching of accessible foliage. Any leaves and twigs that we saw were infested by diaspidids, we cut from the tree and collected. From each tree we also haphazardly took one 20 cm twig sample and one 20 cm2bark sample. Removed plant material was stored in plastic bags and transferred to the lab for processing under magnification; live diaspidids were cut from the surrounding plant material and preserved in 95% ethanol. Specimens were subsequently sorted to life stage and second-instars and adult females were regarded as evidence of successful establishment.
Phylogenetics. DNA was extracted from all second-instar and adult females and using Qiagen DNeasy Blood & Tissue kits (Qiagen, Valencia, CA) following the procedure outlined in Normark et al. (2014). We amplified three loci that have previously been used for diaspidid phylogenetics: elongation factor 1-α (EF1α), part of the large ribosomal subunit rDNA gene (28S), and a part of the mitochondrial genome spanning cytochrome c oxidase I and II (COI-II). PCR primers and protocols followed Andersen et al. (2010) and Gwiazdowski et al. (2011). PCR products were visualized using 1.5% agarose gels with SYBRsafe (Invitrogen, Carlsbad, CA, USA) and successful reactions were purified with Exo SAP-IT enzymatic digestion (Affymetrix, Cleveland, OH, USA). Sanger sequencing of the PCR products was completed by Macrogen (Cambridge, MA, USA) or Eton Biosciences (San Diego, CA, USA). Genbank Accessions are provided in S1.
Phylogenetic relationships among all sampled individuals were estimated from the DNA sequence data. Sequences from each genetic locus were aligned using PASTA (Mirarab et al. 2014), and alignments were trimmed to include only sites with non-gap sequence for at least 80% of specimens (Capella-Gutiérrez et al. 2009). Genealogies were inferred using the GTR+CAT model in RAxML (Stamatakis 2014). The three single-locus alignments were then combined as one supermatrix, from which we also inferred a phylogeny with RAxML. For use in comparative analyses, we made a version of the phylogeny with just one tip per species, and scaled branch lengths to time using an auto-correlated model of among-lineage rate variation, fit with penalized likelihood as implemented in treePL (Smith and O’Meara 2012), and constraining the armored scale root to be 50-75 million years old (Vea and Grimaldi 2016).
Species Delimitation and Identification. We delimited putative species with a version of the genealogical concordance method (as in Gwiazdowski et al. 2011). All clades shared by at least two gene trees, and not contradicted by the third gene tree, were considered evolutionarily independent lineages. Species were defined provisionally as the most inclusive independent lineages containing at least three terminal branches and no more exclusive independent lineages. This method precludes delimitation of species represented by fewer than three specimens. To work around this problem, we calculated the minimum divergence between provisional species clades, and used that value as a maximum threshold for within-species divergence. Any specimens separated by more than this distance from all other specimens were also considered distinct species. We also examined specimens and identified them according to standard morphological criteria to the extent that this was possible. Because second instars and adults were both included in this study, whereas standard keys and descriptions are based on adults only, direct morphological comparisons and identifications were not always possible. The analyses below were repeated for DNA-based and morphology-based species delimitations. We retained all specimens in both analyses, whether or not they were morphologically identifiable; for a few specimens that were not morphologically identifiable, in the morphology-based analysis we defaulted to the DNA-based species.
Statistical Analysis. We characterized host-use specialization by diaspidid species in two ways, each applied at three levels of host plant taxonomy (species, genus, and family). First, we quantified diet specificity; we asked whether diaspidids used less diverse hosts than expected by chance. Concretely, for each diaspidid species, we quantified host-taxon diversity using Simpson’s Reciprocal Diversity Index (RDI), which is essentially evenness-corrected host-taxon richness. We compared empirical RDIs to those expected under a null model of random host use. We simulated 1000 null data sets by randomly permuting the associations between diaspidid species and individual host trees; then for each permutation, we again calculated the mean RDI for the hosts of each diaspidid. With this approach, a diaspdid species is a specialist if its host RDI is lower than expected under the null model.
In a second view of host-use specialization, we calculated the phylogenetic conservatism of host use across diaspidid species. In other words, we asked if evolutionary history constrains host use. We used the R package (R Core Team 2017) MCMCglmm (Hadfield and Nakagawa 2010) to measure the phylogenetic signal of host use by estimating the proportion of variance in the binary use-or-non-use of each host taxon that could be explained by the diaspidid phylogeny. Empirical values for phylogenetic signal were then compared to those calculated under a null model. Null data sets were produced by randomly swapping associations between diaspidid species and host taxa until the associations were thoroughly shuffled (the number of random swaps was 10 times the overall number of associations). This preserved the empirical distribution of diet breadths while randomizing specific associations. P- values for the empirical phylogenetic signal values were calculated using aZ -test against each parameter’s null data set values (which were approximately normally distributed). We corrected for multiple comparisons by assigning statistical significance according to a false discovery rate (FDR; Benjamini & Hochberg 1995) of 0.05. The FDR procedure was conducted separately for each host-taxon level because these analyses were not independent, and must be interpreted as alternative configurations of the same data.
We investigated the strength of performance trade-offs by calculating for each host tree taxon the correlation between diaspidid diet breadth (count of host taxa) and mean abundance. If performance trade-offs are strong, on any given host taxon, we expect generalist diaspidids to be less abundant than specialists. We also investigated the relationship between diet breadth and the proportion of host trees of a taxon colonized at each site, as patch occupancy may be a better indicator of fitness than local abundance in a metapopulation of discrete colonies (Gyllenberg and Metz 2001). Using R, we fit generalized linear models. For local abundance, the response variable was the number of diaspidid individuals identified per host tree, assuming a Poisson distribution. For metapopulation colonization-rate, the response variable was the probability that an individual tree within each host taxon would be colonized by a diaspidid species, assuming a binomial distribution and excluding host taxa with fewer than 3 trees surveyed. Both models only incorporated data for host-taxon-by-diaspidid associations with at least one record. To assess statistical significance, we compared empirical coefficients to those estimated from 1000 null data sets, produced by randomly permuting the empirical data.
The scripts used for the analysis will be made available at Dryad.