Mitochondrial Cytochrome b
Following amplification of either the entire cytochrome b gene
(tissue) or the first 300 base pairs (museum specimens), all tested
samples recovered sequences that mapped to the G. sabrinus reference,
although amplification success as assessed from gel electrophoresis
ranged from 0-100% (details in Appendix 1). Average coverage of the
museum specimens ranged from 22.7x to 1933x. Interestingly, the
highest coverage was from one of the poor quality museum specimens (MVZ
5211), as was the lowest (MVZ 2088). Regardless, all samples recovered
reliable cytochrome b sequences, and only the lowest-coverage sample
(MVZ 2088) had an error rate over 0.0001% (0.064%, Appendix 2).
Although gel visualization often failed to confirm amplification, all
samples recovered mitochondrial sequences, even when many PCRs had been
deemed failed
(see all poor quality sample results in Appendix 1). Average coverage
across all sample types was 624x, with 612x across the high quality
museum specimens and 760x across the poor quality museum specimens (a
value strongly inflated by the very high coverage recovered in MVZ 5211;
without this sample the average was 174x). Additional quality metrics for each
sample are detailed in Appendix 2. The quality scores across the samples
did not vary substantially. The sample with the lowest percentage of
bases at Q20 was HSU 1836 with 95.8%, and the highest was MVZ 2088 with
98.6%. At Q30 the lowest was again HSU 1836 and the highest was UMMZ
79755 with 98%. The tissue sample scored 98.2% at Q20 and 94% at Q30;
the latter is likely artificially low since this sample had the entire
cytochrome b gene amplified and then fragmented prior to library
preparation (Yuan, 2020). Regardless of sample type, the reads appear to be high quality as
determined by the quality metrics, expected error rates and Q
scores. It is also noteworthy that the cytochrome b data were
extracted solely from the pooled data, in which all microsatellite and
cytochrome b fragments were combined prior to library preparation.
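
For readers less familiar with the Q20 and Q30 thresholds reported above, the following is a minimal sketch of the standard Phred relationship, Q = -10 log10(P), where P is the per-base error probability. The function names and the toy list of quality scores are hypothetical and are not taken from this study's data or pipeline.

```python
import math


def phred_to_error_prob(q):
    """Convert a Phred quality score Q to a per-base error probability."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p):
    """Convert a per-base error probability P back to a Phred score."""
    return -10 * math.log10(p)


def fraction_at_or_above(quals, threshold):
    """Fraction of base calls whose Phred score meets the threshold."""
    return sum(q >= threshold for q in quals) / len(quals)


# Hypothetical per-base quality scores, for illustration only.
toy_quals = [36, 38, 31, 25, 40, 18, 33, 37, 29, 35]

q20_fraction = fraction_at_or_above(toy_quals, 20)
q30_fraction = fraction_at_or_above(toy_quals, 30)

print(f"Q20 error probability: {phred_to_error_prob(20)}")
print(f"Q30 error probability: {phred_to_error_prob(30)}")
print(f"Fraction of toy bases at Q20 or above: {q20_fraction}")
print(f"Fraction of toy bases at Q30 or above: {q30_fraction}")
```

Under this relationship, Q20 corresponds to an expected error of 1 in 100 bases and Q30 to 1 in 1000, so the percentages reported above are the share of base calls at or below those error probabilities.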
The samples included here recovered three closely related haplotypes,
separated by three to six substitutions. All of the G. o. californicus
samples recovered the same haplotype, the single G. o. lascivus sample
(HSU 8180) recovered a second, and the G. o. stephensi sample (HSU 1836)
recovered the third (Appendix 3). The haplotypes were common across a wider study
of G. oregonensis (Yuan, 2020), and the fact that none of the
LQMS recovered a unique haplotype provides support for the authenticity
of the data. Importantly, samples included in this study represent three
subspecies from four geographic locations, so numerous haplotypes were
expected.
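
As an illustration of how the number of substitutions separating haplotypes can be tallied, the sketch below counts pairwise differences between aligned sequences of equal length. It assumes the sequences are already aligned and ignores any position with a non-ACGT character; the function name and the three toy sequences are hypothetical and are not the haplotypes recovered in this study.

```python
from itertools import combinations


def count_substitutions(seq_a, seq_b):
    """Count sites where two aligned sequences differ.

    Positions where either sequence has a character other than
    A, C, G or T (for example N or a gap) are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    bases = set("ACGT")
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a in bases and b in bases and a != b
    )


# Hypothetical aligned haplotypes, for illustration only.
toy_haplotypes = {
    "hap1": "ACGTACGTACGTACGT",
    "hap2": "ATGTACATACGCACGT",
    "hap3": "ACGCACGTGCGTATGT",
}

pairwise = {
    (name_a, name_b): count_substitutions(
        toy_haplotypes[name_a], toy_haplotypes[name_b]
    )
    for name_a, name_b in combinations(toy_haplotypes, 2)
}

for pair, n_subs in pairwise.items():
    print(pair, n_subs)
```

Skipping ambiguous positions is one possible design choice; for low-coverage samples it avoids counting an uncalled base as a substitution.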