Genetic divergence and diversity estimation
To estimate absolute genetic divergence between populations, we computed
pairwise DXY following the formula derived by Nei
(Nei & Li, 1979). When calculating DXY, the two
alleles at each SNP were interpreted as two haplotypes and the corresponding
allele frequencies as haplotype frequencies. Pairwise DXY values were summed over all SNPs, and the sum
was normalized by the effective sequence length. For each pair of
populations, the effective sequence length was defined as the number of sites with no
missing data in either population. The resulting DXY matrix was used for multidimensional scaling
with the ‘cmdscale’ function in R (Figure 2) and to construct a
neighbor-joining tree in MEGA7 (Kumar, Stecher, &
Tamura, 2016). We also performed Principal Component Analysis (PCA) on
the SNP frequency matrix (summarizing the frequency of each SNP in each
population) using the ‘prcomp’ function in R (Venables & Ripley,
2002) to examine whether SNP frequencies differed among populations.
Finally, to assess the extent to which genetic polymorphisms were fixed, FST statistics were computed following a method
designed for many SNPs (Nei & Miller, 1990; Willing, Dreyer, & van Oosterhout,
2012).
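The DXY calculation described above can be illustrated with a minimal Python sketch. It assumes biallelic SNPs, so that the probability that an allele drawn from one population differs from an allele drawn from the other is p(1 − q) + q(1 − p). The function name and the `n_invariant_sites` argument are hypothetical; in particular, how invariant sites enter the effective sequence length is an assumption of this sketch and not a detail taken from the analysis.

```python
def dxy_between(freq_x, freq_y, n_invariant_sites=0):
    """Absolute divergence between two populations from biallelic SNP frequencies.

    freq_x, freq_y: per-SNP frequency of the same allele in each population;
    None marks a SNP with missing data in that population.
    n_invariant_sites: hypothetical count of non-variable sites covered in
    both populations; they add to the sequence length but not to the sum.
    """
    total = 0.0
    n_snps_used = 0
    for p, q in zip(freq_x, freq_y):
        if p is None or q is None:
            continue  # site is excluded from the effective sequence length
        # Probability that one allele from X differs from one allele from Y.
        total += p * (1.0 - q) + q * (1.0 - p)
        n_snps_used += 1
    effective_length = n_snps_used + n_invariant_sites
    if effective_length == 0:
        raise ValueError("no sites without missing data in both populations")
    return total / effective_length
```

For example, `dxy_between([1.0, 0.5, None], [0.0, 0.5, 0.2], n_invariant_sites=8)` sums the first two SNPs only and divides by an effective length of 10 sites.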
The levels of genetic diversity within populations were measured by π
and Watterson’s θ statistics. The statistic π
summarizes the average number of nucleotide differences between two
sequences randomly sampled from a population (Nei, 1987), while
Watterson’s θ estimates nucleotide polymorphism based on the number of
observed segregating sites (Watterson, 1977). To correct for systematic
errors in high-throughput sequencing, we computed θ values following a
published algorithm (He et al., 2013).
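As a point of reference for the two diversity statistics, the following Python sketch implements the textbook per-site estimators from allele counts. It assumes that n chromosomes were sampled at every site and that allele counts are known exactly; it does not reproduce the sequencing-error correction of He et al. (2013). Function and parameter names are hypothetical.

```python
def nucleotide_diversity(alt_counts, n, seq_length):
    """Per-site pi: mean pairwise differences between two sampled sequences.

    alt_counts: count of one allele among the n sampled chromosomes, per SNP.
    """
    total = 0.0
    for k in alt_counts:
        # Fraction of the n*(n-1)/2 chromosome pairs that differ at this SNP.
        total += 2.0 * k * (n - k) / (n * (n - 1))
    return total / seq_length


def watterson_theta(n_segregating, n, seq_length):
    """Per-site Watterson's theta from the number of segregating sites."""
    a_n = sum(1.0 / i for i in range(1, n))  # harmonic number of order n - 1
    return n_segregating / a_n / seq_length
```

Under the standard neutral model the two estimators have the same expectation, so comparing them on the same data indicates whether the frequency spectrum departs from that expectation.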
Analyses of molecular variance (AMOVA) based on DXY and FST were used to
test whether genetic variation was partitioned by subspecies or
geographical region. In the test for geographical region, the
populations were assigned to three groups with the Malay Peninsula and
Wallacea as the boundaries, which are two major discontinuities revealed
in mangrove species (Guo et al., 2018b, 2016; J. Li et al., 2016; Yang
et al., 2017). The first group included MC, PN and LS, the second group
included BB, CA, DW, BS and AK, and the last group included all the
other populations.
Mantel tests of DXY and FST against geographic distance were performed to
test the isolation-by-distance model. Geographic distances between
sampling sites were approximated either by great-circle (spherical) distance or by the dispersal
pathway along coasts (hereafter coastline distance). The coastline distance
was estimated from simulations of one-month oceanic dispersal
ability using the methods described by Van der Stocken, Carroll,
Menemenlis, Simard, and Koedam (2019), with a ruler length of approximately 350 km.
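The two ingredients of this test that do not depend on ocean simulations can be sketched in Python: a great-circle distance between sampling sites and a permutation-based Mantel test. This is a generic illustration under stated assumptions (haversine formula, mean Earth radius of 6371 km, Pearson correlation, one-sided p-value); the software and settings actually used are not specified here, and the coastline distance is not reproduced. All function names are hypothetical.

```python
import math
import random


def great_circle_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Haversine distance between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


def _pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mantel_test(gen, geo, n_perm=999, seed=0):
    """Correlate two square distance matrices; permute populations for the p-value."""
    n = len(gen)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    geo_vec = [geo[i][j] for i, j in pairs]
    observed = _pearson([gen[i][j] for i, j in pairs], geo_vec)
    rng = random.Random(seed)
    order = list(range(n))
    n_ge = 1  # the observed arrangement counts as one permutation
    for _ in range(n_perm):
        rng.shuffle(order)
        # Relabel populations in the genetic matrix, keeping rows and columns paired.
        permuted = [gen[order[i]][order[j]] for i, j in pairs]
        if _pearson(permuted, geo_vec) >= observed:
            n_ge += 1
    return observed, n_ge / (n_perm + 1)
```

Permuting population labels, and not individual matrix cells, preserves the dependence among entries that share a population, which is what distinguishes a Mantel test from an ordinary correlation test.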
Geographic barriers delineating the largest genetic discontinuities
between pairs of populations were identified using BARRIER 2.2 (Manni,
Guérard, & Heyer, 2004). To assess robustness, we randomly selected half of the 94 genes and
calculated one FST matrix from those 47 genes. We
repeated this process 100 times to obtain 100 FST matrices. The robustness of each inferred barrier
was then assessed across these 100 matrices.
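The gene-resampling step can be sketched in Python as follows. The sketch only draws the 100 random half-sets of genes; computing an FST matrix from each subset and passing the matrices to BARRIER are left out. The function name, the fixed seed, and the use of integer gene identifiers are assumptions made for illustration.

```python
import random


def resample_gene_subsets(gene_ids, n_replicates=100, seed=0):
    """Draw n_replicates random subsets, each containing half of the genes."""
    rng = random.Random(seed)  # fixed seed so the subsets are reproducible
    half = len(gene_ids) // 2
    # Sampling is without replacement within a subset, independent across subsets.
    return [sorted(rng.sample(gene_ids, half)) for _ in range(n_replicates)]


# 94 genes give 100 subsets of 47 genes each.
subsets = resample_gene_subsets(list(range(94)))
```

Each subset would then be used to compute one FST matrix, giving one matrix per replicate.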