4.1 Methods of data analysis and model selection in landscape
genetics with whole-genome resequencing
The markers commonly used in landscape genetics are microsatellites,
mitochondrial DNA, amplified fragment length polymorphisms, and
Y-chromosome markers (Manel, Schwartz, Luikart, & Taberlet, 2003). In recent
years, single nucleotide polymorphisms (SNPs) have become another major
and widely used marker type. The greatest advantage of SNPs is that they
provide far more polymorphic sites than other molecular markers. However, the
larger number of markers also increases the difficulty of data analysis
and the time it requires. Selecting an appropriate analysis tool therefore
allows the genetic model in whole-genome resequencing to be obtained
accurately while effectively shortening the analysis time. Pairwise
FST is an important parameter in
population genetic analysis that conveniently summarizes
population structure (Weir & Cockerham, 1984). Pairwise
FST is generally calculated using GENEPOP (Rousset,
2008), the R package adegenet, or GenAlEx 6.5 (Peakall & Smouse,
2012). In this study, 4,152,751 SNPs were retained after filtering in
Shunchang and 3,298,993 in Xiapu. The R package adegenet was first used
to calculate pairwise FST; however, the calculation took approximately
one month. The same data were therefore used to calculate FST in
5,000-bp windows with VCFtools (Auton & Marcketta, 2015), and the
calculation required only 7 h, which showed that the windowed approach
can effectively shorten the calculation time for pairwise FST.
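
The idea of a windowed FST calculation can be illustrated with a minimal Python sketch. It is not the VCFtools implementation: for brevity it uses Hudson's estimator rather than the Weir and Cockerham (1984) estimator, and the function names (`hudson_terms`, `windowed_fst`) and the input layout (position, alternate-allele count, and number of sampled chromosomes per population) are assumptions made only for this illustration. Per-site numerators and denominators are summed within each 5,000-bp window and then divided, so the cost grows linearly with the number of SNPs.

```python
from collections import defaultdict


def hudson_terms(ac1, n1, ac2, n2):
    """Numerator and denominator of Hudson's FST estimator for one biallelic SNP.

    ac1, ac2: alternate-allele counts in populations 1 and 2.
    n1, n2:   numbers of sampled chromosomes in each population (must be > 1).
    """
    p1 = ac1 / n1
    p2 = ac2 / n2
    num = (p1 - p2) ** 2 - p1 * (1 - p1) / (n1 - 1) - p2 * (1 - p2) / (n2 - 1)
    den = p1 * (1 - p2) + p2 * (1 - p1)
    return num, den


def windowed_fst(sites, window=5000):
    """Ratio-of-sums FST per non-overlapping window.

    sites: iterable of (position, ac1, n1, ac2, n2), positions 1-based.
    Returns {window_start: fst}; windows with a zero denominator give NaN.
    """
    sums = defaultdict(lambda: [0.0, 0.0])
    for pos, ac1, n1, ac2, n2 in sites:
        start = (pos - 1) // window * window + 1  # 1, 5001, 10001, ...
        num, den = hudson_terms(ac1, n1, ac2, n2)
        sums[start][0] += num
        sums[start][1] += den
    return {
        start: (num / den if den > 0 else float("nan"))
        for start, (num, den) in sorted(sums.items())
    }


if __name__ == "__main__":
    # Hypothetical toy data: one fixed difference and one shared polymorphism.
    toy_sites = [(100, 10, 10, 0, 10), (6000, 5, 10, 5, 10)]
    print(windowed_fst(toy_sites))
```

Summing numerators and denominators before dividing (a ratio of sums) avoids the instability of averaging per-site ratios when many SNPs in a window have low diversity.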
Observed (HO) and expected (HE)
heterozygosity, fixation index (FIS), allele diversity
(A), and mean number of alleles per locus (K) are also important
parameters in population genetic analysis, which can be calculated by
software such as Arlequin 3.11 (Excoffier, Laval, & Schneider, 2005),
GENETIX 4.05 (Belkhir, Borsa, Chikhi, Raufaste, & Bonhomme, 2004), or
ADZE 1.0 (Szpiech, Jakobsson, & Rosenberg, 2008). The software commonly
used to calculate these parameters for restriction-site associated
DNA sequencing (RAD-seq) or genotyping-by-sequencing (GBS) data is
GenAlEx 6.5. However, because whole-genome resequencing usually yields
about a hundred times as many SNPs as RAD-seq or GBS, GenAlEx 6.5 did
not appear to be able to support such a large data set. Therefore, in
this case, Metapop2 (López-Cortegano et al., 2019) was selected; its
latest optimized version not only calculates the complete set of genetic
diversity parameters but also effectively analyzes a large number of
SNPs (e.g., >100,000 SNPs). The software calculated the
population genetic parameters in a total of 60 h, which indicated that
Metapop2 can effectively analyze millions or even tens of millions of
SNPs in a relatively short time.
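
For reference, the per-locus quantities named above can be written out in a few lines of Python. This is a sketch of the textbook definitions (HO as the proportion of heterozygous individuals, HE as one minus the sum of squared allele frequencies, and FIS = 1 - HO/HE), not a description of how Metapop2 or any other package computes them; the function name `locus_stats` and the genotype encoding (a pair of allele labels per individual, `None` for a missing call) are assumptions for this illustration, and no small-sample correction is applied.

```python
from collections import Counter


def locus_stats(genotypes):
    """Return (HO, HE, FIS) for one locus.

    genotypes: list of (allele_a, allele_b) tuples, or None for missing calls.
    HE is the uncorrected expected heterozygosity, 1 - sum(p_i ** 2).
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        raise ValueError("no called genotypes at this locus")

    # Observed heterozygosity: share of individuals carrying two different alleles.
    ho = sum(1 for a, b in called if a != b) / len(called)

    # Allele frequencies from all called chromosomes.
    counts = Counter(allele for g in called for allele in g)
    total = 2 * len(called)
    he = 1.0 - sum((c / total) ** 2 for c in counts.values())

    # FIS is undefined at a monomorphic locus (HE == 0).
    fis = 1.0 - ho / he if he > 0 else float("nan")
    return ho, he, fis


if __name__ == "__main__":
    # Hypothetical genotypes for four individuals at one biallelic SNP.
    print(locus_stats([(0, 0), (0, 1), (1, 1), (0, 1)]))
```

With millions of SNPs, each locus is processed independently, which is why tools designed for whole-genome data can stream through the variants instead of holding a full genotype matrix in memory.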
To quantify landscape structure and determine its effect on
population genetics, least-cost paths based on resistance surfaces
or straight-line transects are commonly used (Spear et al., 2010). To
overcome the shortcomings of these two methods, Van Strien et al. (2012)
developed least-cost transect analysis (LCTA) by combining them.
While remaining objective, this analysis uses buffers
to give the transects a width, within which the proportion of each
landscape type is quantified. In this study, LCTA was performed at a fine scale
(<10 km) based on whole-genome resequencing, which clearly
revealed the relationships between landscape types (different tree
types, urban areas, roads, and farmland) and the dispersal and gene
flow of M. alternatus. Thus, this analysis was effective with SNP markers and
could identify the effects of landscape types at a fine scale. Cleary et
al. (2017) used this analysis to describe the landscape genetics of two
frugivorous bats under agricultural intensification and likewise obtained a
clearer interpretation of landscape effects. The db-RDA model also effectively
described landscape genetics at a fine scale. This model overcomes
the limitation that conventional RDA can only be performed
with Euclidean distance and allows the use of Bray–Curtis or other
ecologically meaningful distance measures (Legendre & Anderson, 1999). LCTA
and db-RDA both explained the effects of different landscape types on
M. alternatus well at a fine scale, and because the results obtained
by the two models were relatively consistent, both are applicable to
landscape genetics at a fine scale under whole-genome resequencing.
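
The two steps that LCTA combines, finding a least-cost path on a resistance surface and buffering it into a transect whose landscape composition is quantified, can be sketched on a small raster. The code below is an illustration only and is not the workflow used in this study: the grid values, the 4-neighbour movement rule, the step cost (mean resistance of the two cells), the square buffer, and the function names `least_cost_path` and `transect_proportions` are all assumptions chosen to keep the example short.

```python
import heapq
from collections import Counter


def least_cost_path(resistance, start, goal):
    """Dijkstra least-cost path between two cells of a resistance grid.

    Moves are restricted to the 4 neighbours; the cost of a step is the
    mean resistance of the two cells involved (an assumption of this sketch).
    """
    rows, cols = len(resistance), len(resistance[0])
    dist = {start: 0.0}
    prev = {}
    heap = [(0.0, start)]
    while heap:
        d, cell = heapq.heappop(heap)
        if cell == goal:
            break
        if d > dist.get(cell, float("inf")):
            continue  # stale queue entry
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols:
                nd = d + (resistance[r][c] + resistance[nr][nc]) / 2
                if nd < dist.get((nr, nc), float("inf")):
                    dist[(nr, nc)] = nd
                    prev[(nr, nc)] = cell
                    heapq.heappush(heap, (nd, (nr, nc)))
    # Walk back from the goal to the start to recover the path.
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    return path[::-1]


def transect_proportions(landcover, path, half_width=1):
    """Proportion of each land-cover class within a square buffer around the path."""
    rows, cols = len(landcover), len(landcover[0])
    cells = set()
    for r, c in path:
        for dr in range(-half_width, half_width + 1):
            for dc in range(-half_width, half_width + 1):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    cells.add((nr, nc))
    counts = Counter(landcover[r][c] for r, c in cells)
    total = sum(counts.values())
    return {cls: n / total for cls, n in counts.items()}


if __name__ == "__main__":
    # Hypothetical 3 x 3 landscape: high resistance is assigned to urban cells.
    resistance = [[1, 1, 1], [9, 9, 1], [1, 1, 1]]
    landcover = [
        ["forest", "forest", "forest"],
        ["urban", "urban", "farmland"],
        ["forest", "forest", "forest"],
    ]
    path = least_cost_path(resistance, (0, 0), (2, 0))
    print(path)
    print(transect_proportions(landcover, path, half_width=0))
```

In an analysis of this kind, the class proportions from each pairwise transect would then serve as predictors of pairwise genetic distance; varying `half_width` corresponds to testing different transect widths.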