Materials and Methods
Tissue dissection in lab-reared N.
lecontei
To obtain a large number of individuals for tissue dissection, we
established laboratory colonies of N. lecontei from a single
population of larvae collected in 2016 from a Pinus mugo bush in
Lexington, KY, USA (38.043602°N, 84.496549°W). We reared colonies onPinus foliage using standard rearing protocols (Bendall,
Vertacnik, & Linnen, 2017; Harper, Bagley, Thompson, & Linnen, 2016)
for a minimum of one full generation before collecting tissues. To
ensure that all larvae were haploid males, we collected male tissues
(all larval tissues and adult males) from the progeny of virgin females.
We collected adult female tissues from the progeny of mated females. To
obtain larval tissues, we collected different instars of haploid male
larvae from rearing boxes containing Pinus foliage. We
immediately placed live larvae on a petri dish containing frozen 1X PBS
buffer, removed the heads from the bodies, and flash-froze both tissues
on dry ice prior to storage at -80°C. To obtain adult tissues, we stored
adults at 4°C (a temperature that extends lifespan) for up to 20 days
until tissues could be collected. Before harvesting tissues, we warmed
adults to room temperature and enclosed them in a mesh cage containing
three Pinus seedlings for a minimum of 15 minutes to induce
normal host- and mate-seeking behavior. Then, to sedate adults prior to
dissection, we enclosed each individual in a deli cup and exposed it to
dry ice for approximately 1 minute. We placed the sedated adult on
frozen 1X PBS buffer to harvest six tissues: antennae, mouthparts, heads
(sans antennae and mouthparts), legs, thorax, and the terminal segment
of the abdomen (which includes the ovipositor in females or copulatory
organ in males). We flash-froze these tissues and stored them at -80°C
until RNA extraction.
RNA extraction, library preparation, and
sequencing
We extracted RNA from all tissues with RNeasy Plus Micro and Mini Kits
(Qiagen, Germantown, Mayland). To account for family-level variation in
gene-expression traits, each RNA extraction contained tissues from a
single lab-reared family (each represented by 2-18 individuals,
depending on the tissue). Prior to starting standard kit protocols, we
disrupted tissues using a TissueLyser LT bead mill (Qiagen, Germantown,
Mayland) and 5mm stainless steel beads. We disrupted tissues for up to
four 30-sec periods of 60 oscillations/sec, placing tissues on dry ice
between periods to ensure that the RNA did not degrade during the
disruption process. We then quantified RNA using Quant-iT RNA Assay Kits
(Invitrogen by ThermoScientific, Waltham, Massachusetts) and assessed
quality using a 2100 Bioanalyzer and RNA 6000 Nano Kits (Agilent, Santa
Clara, California).
To prepare RNA-seq libraries for Illumina sequencing, we used a TruSeq
Stranded mRNA High Throughput Kit (Illumina, San Diego, California),
following manufacturer protocols. For larval tissues, we made each
library from RNA extracted from members of a single family. For adult
tissues, which required more individuals to obtain sufficient material,
we used 1-3 families per library (with equal RNA contributions from each
family). For most tissues, we made libraries for four biological
replicates, with different families in each replicate. The exceptions
were male copulatory organs (for which only 3 high quality biological
replicates could be produced) and male thoraxes (for which only 2 high
quality biological replicates could be produced), resulting in a total
of 77 libraries (Supplemental Table 1). We quantified resulting cDNA
using Quant-iT DNA Assay Kits (Invitrogen by ThermoScientific, Waltham,
Massachusetts) and evaluated quality with a 2100 Bioanalyzer and DNA
1000 Kits (Agilent, Santa Clara, California). We then pooled libraries
that passed quality inspection and sequenced this pool with 150-bp
paired-end reads on an Illumina HiSeq4000 at the University of Illinois
at Urbana-Champaign Roy J. Carver Biotechnology Center.
Processing of RNAseq data to produce a
provisionally annotated de novotranscriptome
To de-multiplex and quality-trim raw reads, we used trimmomatic (Bolger,
Lohse, & Usadel, 2014) and a cut-off score of 30. We also removed any
remaining TruSeq indexed universal adapter present at the beginning of
reads using fastx_clipper (http://hannonlab.cshl.edu/fastx_toolkit/).
After filtering, we used tophat2 (Kim et al., 2013) to map retained
reads to the N. lecontei version 1.0 assembly (Vertacnik, Geib,
& Linnen, 2016). We then used a these mapped reads to produce a
genome-guided de novo transcriptome assembly using Trinity (Haas
et al., 2013). Because TRINITY is known to overestimate the actual
number of contigs present in transcriptomes
(https://github.com/trinityrnaseq/trinityrnaseq/wiki), we performed
additional filtering to retain meaningful contigs. First, we used CD-HIT
(W. Li & Godzik, 2006) to create non-redundant contigs from the TRINITY
contigs (minimum of 200bp). Next, we used Bowtie2 (Langmead & Salzberg,
2012) to map the reads to the non-redundant contigs and RSEM (B. Li &
Dewey, 2011) to identify contigs with at least one transcript per
million in at least two biological replicates.
To functionally annotate contigs that were retained after filtering, we
performed two sets of BLASTx (Altschul, Gish, Miller, Myers, & Lipman,
1990) searches. First, we BLASTed these transcripts against the
predicted N. lecontei non-redundant proteins available on NCBI
and a set of manually curated N. lecontei OBPs, ORs, and GRs
(Vertacnik et al. in prep ). For transcripts that did not map with
90% identity to putative N. lecontei genes, we performed an
additional BLAST search against the insect non-redundant protein
database (Pruitt, 2004) (e-value of 0.001). In all cases, we selected
the best BLAST match as a provisional annotation for these transcripts.
Testing predictions of the ADH
If gene-expression traits are “adaptively decoupled” there should be
variable and predictable patterns of decoupling across the
transcriptome. At a phenotypic level, gene-expression traits that are
highly decoupled across development are expected to exhibit: (1)
pronounced differences in expression levels, (2) minimal quantitative
genetic correlations, and (3) independent evolutionary trajectories
across phylogenies (Collet & Fellous, 2019; Medina et al., 2020;
Wollenberg Valero et al., 2017). At a genetic level, variation in
adaptively decoupled gene-expression traits within and between species
should be attributable to mutations with minimal pleiotropy across life
stages. Ideally, each of these four patterns (differential expression,
reduced quantitative genetic correlations, macroevolutionary
independence, and reduced levels of pleiotropy) would be evaluated
across the entire transcriptome. However, even with current sequencing
technologies and novel tools for functional genomics, transcriptome-wide
analyses would be prohibitively expensive for the large sample sizes
needed for quantitative genetic, comparative phylogenetic, and forward
or reverse genetic approaches. Therefore, as a first step to quantifying
transcriptome-wide patterns of gene-expression decoupling across
multiple metamorphic transitions, we used differential expression
between life stages and sexes of a single species as our measure of
decoupling.
To compare levels of differential expression across different
metamorphic transitions and among different types of genes, we first
used bowtie2 (Langmead & Salzberg, 2012) to align reads to the de
novo N. lecontei transcriptome described above. We then estimated the
abundance of each transcript using RSEM via the Trinity package utility
program align_and_estimate_abundance.pl (Haas et al., 2013; B. Li &
Dewey, 2011). Using the utility program
abundance_estimates_to_matrix.pl (Haas et al., 2013) we created a
complete count matrix of transcript abundance. This program also
produces matrices of normalized gene expression for comparisons of
relative abundances (transcripts per million) as well as correcting for
highly expressed genes to obtain absolute abundances (TMM,
third-quartile normalization) (Haas et al., 2013).
To determine whether gene-expression decoupling changes predictably
across different types of metamorphic transitions, we compared
whole-transcriptome profiles of different life stages and sexes in two
ways. First, to visualize overall similarity or dissimilarity of gene
expression across all tissues and life stages, we conducted a principle
component analysis (PCA) using the PtR scripts within the Trinity
package (Haas et al., 2013). Second, to quantify how decoupling changes
across different metamorphic boundaries and between the sexes, we
compared the number of differentially expressed genes (DEGs). To
determine the number of DEGs, we used the Trinity utility program
run_DE_analysis.pl to implement DESeq2 (Haas et al., 2013; Love,
Huber, & Anders, 2014) and obtain statistics of differential expression
between different tissue types and different life stages. To account for
multiple testing this program utilized the Benjamini-Hochberg
correction, requiring an adjusted p-value (padj ) of 0.05 or less
to be considered significantly differentially expressed. We compared the
numbers of differentially expressed genes between metamorphic
transitions using Fisher’s exact test and accounted for multiple testing
using “RVAideMemorire” package (v. 0.9-70, implemented in R 3.5.0).
To determine whether genes that mediate changing ecological interactions
show stronger and more variable levels of gene-expression decoupling, we
used both ad hoc and a priori approaches to evaluate candidate genes.
For our ad hoc approach, we compiled lists of the top differentially
expressed genes (i.e., lowest p-values) between each of the three
metamorphic transitions and between the sexes using the output of
DESeq2. For each comparison, we identified the top 10 genes that were
upregulated in each tissue and in each stage/sex and asked whether these
lists contained any genes that are likely to mediate ecological
interactions. Based on the ecology of N. lecontei (Figure 1), we
were specifically looking for genes involved in digestion,
detoxification, pigmentation, gregarious behavior, chemosensation,
immune function, and reproduction. We first asked whether any of our top
differentially expressed genes corresponded to candidate genes in
existing manually curated gene datasets for N. lecontei(chemosensation, detoxification, and immunity genes: (Vertacnik et al.,
in prep); pigmentation genes: (Linnen et al., 2018)). For remaining
genes, we identified the closest Drosophila orthologue and used
the gene ontology (GO) term database to determine the likely function
for each gene. When no Drosophila orthologue could be identified
or the orthologue did not have a predicted GO function, we used UniProt
to predict the possible function of the gene.
For our a priori approach, we compared patterns of gene-expression
decoupling for two comparably sized sets of genes that we expected to be
under drastically different selection regimes. The first was set of
manually curated chemosensory genes (including odorant receptors,
gustatory receptors, odorant-binding proteins; (Vertacnik et al., in
prep).), which are expected to experience more antagonistic selection
and, therefore, exhibit more extreme and more variable decoupling. The
second category consisted of a similarly sized family of housekeeping
genes (the ribosomal protein L genes, hereafter RPLs), which are
expected to show coupled expression across sexes and life stages. To
visualize how the degree of gene-expression decoupling of our “high”
and “low” decoupling categories compare to the rest of the
transcriptome, we first condensed each gene in the transcriptome to a
single expression value per stage/sex. For each gene, we calculated the
log2 of the average normalized expression level across
all tissues and replicates for each life stage. We then overlaid RPL and
chemosensory gene-expression values on transcriptome-wide values for
each metamorphic event and between the sexually dimorphic adults. To
determine whether the distribution of expression differences between the
stages differs between RPLs and chemosensory genes, we first calculated
the absolute value of the average log2-fold change for
each comparison with DESeq2. Then, we used non-parametric Mann-Whitney U
tests to compare the two groups of genes within each sexual/metamorphic
comparison.
Because our analyses revealed highly variable patterns of decoupling
across chemosensory genes (see below), we went a step further to
identify chemosensory genes with the highest and lowest levels of
decoupling. We investigated variation in decoupling among chemosensory
genes in two ways. First, we produced heatmaps to visualize variation in
patterns of decoupling across stages/sexes across individual
chemosensory genes. Because there was drastic variation in the maximum
expression of chemosensory genes, we first used the scale function so
that the expression of so that the total expression of each gene was
equal and the heatmap function to visualize the normalized expression
levels using R 3.3.2 (Team, 2016). Finally, we used custom Python
(v3.5.1) scripts to identify chemosensory genes that had expression
levels in the top 10th percentile in the feeding
larvae, adult males, and adult females to further pinpoint genes with
coupled or decoupled expression across life stages.