Whole-Genome Resequencing Analysis
We removed adapter sequences and quality-trimmed reads using
AdapterRemoval v2.1.7 (Schubert et al., 2016). We then aligned the reads to
a new chromosome-level assembly of the yellow-rumped warbler
(Setophaga coronata) (Baiz et al., 2021) using Bowtie2 (Langmead
& Salzberg, 2012). PCR duplicates were marked with Picard tools (Broad
Institute, 2021).
We analyzed the resultant alignments using the ANGSD bioinformatics
pipeline (Korneliussen et al., 2014), which works from genotype
likelihoods rather than called genotypes and is therefore well suited to
low-coverage data. To measure the extent of differentiation, we
calculated a global estimate of FST between main-range S. v. virens and
S. v. waynei. We then estimated FST in 10-kb windows across the genome
for the same comparison, to assess whether some regions of the genome
are more divergent between the two groups than others.
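The windowing step can be illustrated with a minimal sketch. It assumes that per-site FST numerator and denominator terms are already available (the tuples below are hypothetical values, not output from this study) and combines them as a ratio of sums within each 10-kb window, which is a common alternative to averaging per-site ratios. The function name and input layout are illustrative assumptions, not part of ANGSD.

```python
from collections import defaultdict

WINDOW = 10_000  # 10-kb non-overlapping windows, as in the text


def windowed_fst(sites, window=WINDOW):
    """Return {(chrom, window_start): fst} using a ratio of sums.

    `sites` is an iterable of (chrom, pos, num, den) tuples, where `num`
    and `den` are hypothetical per-site FST numerator and denominator
    terms and `pos` is a 1-based position.
    """
    sums = defaultdict(lambda: [0.0, 0.0])
    for chrom, pos, num, den in sites:
        # 0-based start of the window containing this 1-based position
        start = (pos - 1) // window * window
        sums[(chrom, start)][0] += num
        sums[(chrom, start)][1] += den
    # Skip windows whose summed denominator is zero (no informative sites)
    return {key: n / d for key, (n, d) in sums.items() if d > 0}


# Hypothetical per-site values, for illustration only
sites = [
    ("chr1", 100, 0.02, 0.40),
    ("chr1", 9_500, 0.06, 0.40),
    ("chr1", 10_001, 0.30, 0.50),
    ("chr1", 15_000, 0.10, 0.50),
]
fst = windowed_fst(sites)
```

Summing numerators and denominators before dividing keeps sites with little information from dominating a window, which matters when coverage is low.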
To identify genes within the divergent regions, we used the annotation
information associated with the S. coronata genome (Baiz et al.,
2021), which used SNAP gene predictions, trained on the zebra finch
(Taeniopygia guttata) genome, within the MAKER annotation
pipeline. We focused on coding sequences for which MAKER had either
transcript or protein matches with the zebra finch annotation. We first
identified genes within the two regions (see below) that were bounded by
10-kb FST windows > 0.15. We also
focused on two genes intersecting the two highest FST windows.
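One way to carry out this gene-by-region lookup is sketched below. It reads "bounded by 10-kb FST windows > 0.15" as merging adjacent windows that exceed the threshold into a region, which is an assumption about the procedure, and then reports genes whose coordinates overlap a region. The window values, gene names, and coordinates are hypothetical.

```python
THRESHOLD = 0.15  # FST cutoff given in the text
WINDOW = 10_000   # window size given in the text


def elevated_regions(windows, threshold=THRESHOLD, size=WINDOW):
    """Merge adjacent windows with FST above `threshold` into regions.

    `windows` is a list of (chrom, window_start, fst) tuples. Regions are
    returned as half-open (chrom, start, end) intervals.
    """
    regions = []
    for chrom, start, fst in sorted(windows):
        if fst <= threshold:
            continue
        if regions and regions[-1][0] == chrom and regions[-1][2] == start:
            # Window directly follows the current region: extend it
            regions[-1] = (chrom, regions[-1][1], start + size)
        else:
            regions.append((chrom, start, start + size))
    return regions


def genes_in_regions(genes, regions):
    """Return names of genes whose half-open interval overlaps any region."""
    hits = []
    for name, chrom, g_start, g_end in genes:
        if any(chrom == c and g_start < r_end and r_start < g_end
               for c, r_start, r_end in regions):
            hits.append(name)
    return hits


# Hypothetical windows and gene models, for illustration only
windows = [
    ("chr1", 0, 0.02),
    ("chr1", 10_000, 0.21),
    ("chr1", 20_000, 0.34),
    ("chr1", 30_000, 0.05),
    ("chr2", 0, 0.18),
]
genes = [  # (name, chrom, start, end)
    ("geneA", "chr1", 5_000, 9_000),
    ("geneB", "chr1", 18_000, 22_000),
    ("geneC", "chr2", 9_500, 12_000),
]
regions = elevated_regions(windows)
hits = genes_in_regions(genes, regions)
```

With real data, the gene intervals would come from the MAKER annotation and the windows from the windowed FST output.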
We next performed principal component analysis (PCA) to assess
genome-wide clustering among samples. We used PCAngsd (Meisner &
Albrechtsen, 2018) to generate a covariance matrix from genome-wide
genotype likelihoods, and R 4.0.5 (R Core Team, 2021) to calculate and
plot eigenvalues. Lastly, we constructed a bootstrapped phylogeny to
estimate relationships among the populations and to gauge roughly
when S. v. waynei separated from the main group. Genotype
likelihoods generated by ANGSD for 31,241,801 sites were analyzed using
ngsDist v1.0.8 (Vieira et al., 2016) with 100 bootstrap
replicates. We produced trees from the resultant distance matrices using
FastME v2.1.6.2 (Lefort et al., 2015) and mapped bootstrap support values
onto the main tree using RAxML v8.2.12 (Stamatakis, 2014).
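The support values placed on the main tree can be understood through a simplified sketch: for each clade in the main tree, count the percentage of bootstrap replicate trees that contain the same clade. This is not RAxML's implementation. It assumes each tree has already been reduced to a set of clades (sets of tip labels) and treats trees as rooted, whereas phylogenetic software typically compares bipartitions of unrooted trees. The tip labels A to D and the replicate trees are hypothetical.

```python
from collections import Counter


def clade_support(main_clades, replicate_trees):
    """Return {clade: percent of replicate trees that contain the clade}.

    `main_clades` is a set of frozensets of tip labels from the main tree;
    `replicate_trees` is a list of such sets, one per bootstrap replicate.
    """
    counts = Counter()
    for clades in replicate_trees:
        for clade in main_clades:
            if clade in clades:
                counts[clade] += 1
    n = len(replicate_trees)
    return {clade: 100.0 * counts[clade] / n for clade in main_clades}


# Hypothetical main tree and four bootstrap replicates
main_tree = {frozenset({"A", "B"}), frozenset({"A", "B", "C"})}
replicates = [
    {frozenset({"A", "B"}), frozenset({"A", "B", "C"})},
    {frozenset({"A", "B"}), frozenset({"A", "B", "D"})},
    {frozenset({"A", "C"}), frozenset({"A", "C", "D"})},
    {frozenset({"A", "B"}), frozenset({"A", "B", "C"})},
]
support = clade_support(main_tree, replicates)
```

With 100 replicates, as used here, each support value is simply the number of replicate trees that recover the clade.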