Among-subspecies genetic divergence
We obtained 76 to 87 kb of DNA sequence covering 88 to 94 genes (Table 1). By mapping short reads to reference sequences, we identified 74 to 1657 segregating sites within each population (Table 1). We calculated pairwise DXY values among populations to assess genetic divergence and used the resulting distance matrix to construct a neighbor-joining tree. The DXY matrix shows clear divergence among the three subspecies, with the BB population the sole exception (Figure 2b). The largest DXY values were observed between the australasica populations and those of the other two subspecies, ranging from 7.7 to 9.9/kb (Table S6). Divergence was lower between eucalyptifolia and marina populations, with DXY values between 6.5 and 7.4/kb. By pooling populations within each subspecies, we estimated DXY to be 8.2/kb between eucalyptifolia and australasica, 6.7/kb between marina and eucalyptifolia, and 9.1/kb between marina and australasica.
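For readers unfamiliar with the statistic, DXY is the average number of differences between one sequence drawn from each of two populations, scaled here per kb. The following is a minimal sketch of that definition for biallelic sites, assuming allele frequencies are already available for each population; the function name, its arguments, and the toy numbers are hypothetical and are not taken from our pipeline.

```python
def dxy_per_kb(freqs_a, freqs_b, n_sites_total):
    """Average between-population pairwise differences, per kb.

    freqs_a, freqs_b: alternate-allele frequencies at the same biallelic
        variable sites in populations A and B (hypothetical inputs).
    n_sites_total: number of sequenced sites, variable plus invariant.
    """
    if len(freqs_a) != len(freqs_b):
        raise ValueError("frequency lists must cover the same sites")
    # Probability that a sequence from A and a sequence from B differ at a site
    diff = sum(p * (1 - q) + q * (1 - p) for p, q in zip(freqs_a, freqs_b))
    return 1000.0 * diff / n_sites_total


# Toy data (not from the study): 4 variable sites in a 2,000-bp region
example = dxy_per_kb([1.0, 0.0, 0.5, 0.2], [0.0, 0.0, 0.5, 0.2], 2000)
print(round(example, 2))  # 0.91
```

A fixed difference contributes 1 to the sum, whereas a polymorphism shared at equal frequency contributes less, so DXY reflects both fixed differences and polymorphism segregating within the two populations.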
Genetic divergence was generally lower among populations within a subspecies than among subspecies (Figure 2d). The two australasica populations diverged little from each other (DXY = 2.2/kb). The pair of eucalyptifolia populations diverged more, but still less than among subspecies (DXY = 5.48/kb). Within marina, we see two major geographical groups: one containing MC, LS, and PN (west of the Malay Peninsula) and the other containing TN, BK, SS, SY, WC, SB, CB, and BL (east of the Malay Peninsula, Figure S1). DXY ranges from 1.27 to 3.75/kb within the first group and from 0.94 to 4.69/kb within the second. Between the two geographical groups, DXY ranges from 4.32 to 5.69/kb, still lower than between subspecies. The BB population is an outlier that has diverged from the other marina populations (DXY = 7.76-8.43/kb) to a level comparable to that among subspecies. The AMOVA indicates that 65.1% of the variance in genetic divergence (DXY) is accounted for by subspecies division. In contrast, 50.8% of the DXY variance is accounted for by geographical division.
DXY measures how far populations have diverged from each other. We also measured the extent of divergence by comparing the allele frequencies of polymorphisms within populations (Cruickshank & Hahn, 2014). In a principal component plot of the allele frequency matrix, populations of each subspecies generally cluster together and separate from the other subspecies along PC2, except that the DW population (eucalyptifolia) lies close to the marina populations and the BB population (marina) is again distinct from all other marina populations (Figure 2c). Along PC1, only the DW population diverges strongly from all other populations. In addition, the CA population (eucalyptifolia) diverges from the other populations mainly along PC3 and PC4 (Figure S2).
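The principal component projection described above can be sketched as follows. This is an illustration only, assuming a populations-by-sites matrix of allele frequencies and a plain SVD on the column-centered matrix; the function name and the toy matrix are hypothetical, and any scaling or filtering applied in the actual analysis is not reproduced here.

```python
import numpy as np


def pca_scores(freq_matrix, n_components=2):
    """Project populations onto principal components of allele frequencies.

    freq_matrix: array of shape (n_populations, n_sites), hypothetical input.
    Returns (scores, explained_variance_fraction) for the leading components.
    """
    X = np.asarray(freq_matrix, dtype=float)
    X = X - X.mean(axis=0)  # center each site across populations
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]
    explained = S**2 / np.sum(S**2)
    return scores, explained[:n_components]


# Toy data (not from the study): two pairs of similar populations, 3 sites
toy = [
    [0.9, 0.8, 0.1],
    [1.0, 0.9, 0.0],
    [0.1, 0.2, 0.9],
    [0.0, 0.1, 1.0],
]
scores, explained = pca_scores(toy)
```

In the toy matrix the first two rows fall on one side of PC1 and the last two on the other, which is the kind of separation read off each axis in Figure 2c.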
The FST statistic quantifies these genetic differences. The 120 pairwise FST estimates calculated for the 16 populations are generally high, with a mean of 0.61 (first and third quartiles 0.50 and 0.76, respectively). Populations from the South China Sea, i.e., TN, BK, SS, SY, and WC (Figure S1), have relatively low pairwise differentiation. FST between the two populations on the west coast of the Malay Peninsula (LS and TN) is also low (Figure S1).
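The count of 120 pairwise estimates follows directly from the 16 populations, since 16 × 15 / 2 = 120. A short sketch of the enumeration, using the population codes that appear in this section (the variable names are ours and purely illustrative):

```python
from itertools import combinations

# Population codes as named in this section, grouped by subspecies
marina = ["MC", "LS", "PN", "TN", "BK", "SS", "SY", "WC", "SB", "CB", "BL", "BB"]
eucalyptifolia = ["DW", "CA"]
australasica = ["BS", "AK"]

populations = marina + eucalyptifolia + australasica
pairs = list(combinations(populations, 2))  # every unordered pair once
n_pairs = len(pairs)
print(len(populations), n_pairs)  # 16 120
```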
The Mantel tests show a significant relationship (P < 0.01) between genetic differentiation and geographic distance, regardless of whether geographic distance was estimated using the spherical or the coastline method (Figure S3; see Methods for details). All four tests have P values below 0.01 and survive multiple-test correction. This correlation indicates that geographical distance contributes, at least partly, to the high level of genetic differentiation among A. marina populations. However, the two geographical groups around the Malay Peninsula show greater genetic differentiation than expected from the distance separating them, indicating that other factors are also important (Figure S3).
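As an illustration of the logic behind a Mantel test, the sketch below correlates the upper triangles of two distance matrices and builds a null distribution by jointly permuting the rows and columns of one of them. It is a generic one-sided permutation version written for this illustration; the function name, the number of permutations, and the toy matrices are assumptions and do not describe the software or settings used in our analysis.

```python
import numpy as np


def mantel_test(dist_a, dist_b, n_perm=999, seed=0):
    """One-sided permutation Mantel test (illustrative sketch).

    dist_a, dist_b: symmetric (n, n) distance matrices, hypothetical inputs.
    Returns (Pearson r between upper triangles, permutation P value).
    """
    A = np.asarray(dist_a, dtype=float)
    B = np.asarray(dist_b, dtype=float)
    n = A.shape[0]
    iu = np.triu_indices(n, k=1)  # each population pair counted once
    r_obs = np.corrcoef(A[iu], B[iu])[0, 1]

    rng = np.random.default_rng(seed)
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(n)
        B_perm = B[np.ix_(perm, perm)]  # relabel populations in B
        r = np.corrcoef(A[iu], B_perm[iu])[0, 1]
        if r >= r_obs:
            count += 1
    p_value = (count + 1) / (n_perm + 1)
    return r_obs, p_value


# Toy data (not from the study): differentiation proportional to distance
coords = np.array([0.0, 1.0, 3.0, 6.0, 10.0, 15.0])
geo = np.abs(coords[:, None] - coords[None, :])
gen = 0.05 * geo
r_obs, p_value = mantel_test(gen, geo)
```

Permuting whole populations, rather than individual matrix cells, preserves the dependence among pairwise distances that share a population, which is why an ordinary correlation test is not appropriate for such data.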
The BARRIER analysis reveals that major barriers (with >80% bootstrap support) lie roughly along the Sunda Shelf and between Australasia and Southeast Asia. Minor barriers are also identified between Africa and Southeast Asia, as well as between Western Australia and Northern Australia. The major barrier in historic Sundaland corresponds to the marked deviation of FST values from the expectation based on distance alone (Figures S3 and S4).
Isolation among subspecies, as indicated by high divergence and inferred barriers, may influence genetic diversity within populations. Both nucleotide diversity (π) and Watterson's estimator of nucleotide polymorphism (θ) vary among populations. The two eucalyptifolia populations have the highest genetic diversity, with average θ (across segments) of 2.82 and 3.94/kb and π of 3.41 and 4.06/kb (Figure 3). In contrast, marina populations have low genetic diversity, with average θ ranging from 0.21 to 0.91/kb and π from 0.15 to 1.39/kb (Table 1, Figure 3). The BS population (australasica) has intermediate diversity, while the AK population (australasica) is unusually monomorphic (Table 1, Figure 3). The very low diversity of the AK population is likely due to its marginal location, similar to WC and SY.
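The two diversity statistics summarize different aspects of the same data: θ depends only on the number of segregating sites, whereas π weights each site by its allele frequencies. The sketch below gives the textbook forms of both for biallelic sites, assuming n individually sampled sequences and known allele counts; the function names, arguments, and toy numbers are hypothetical, and estimators adapted to other sequencing designs would differ.

```python
def watterson_theta_per_kb(n_segregating, n_samples, n_sites_total):
    """Watterson's theta per kb: S / a1 / L, with a1 = sum_{i=1}^{n-1} 1/i."""
    a1 = sum(1.0 / i for i in range(1, n_samples))
    return 1000.0 * n_segregating / a1 / n_sites_total


def nucleotide_diversity_per_kb(alt_counts, n_samples, n_sites_total):
    """Nucleotide diversity (pi) per kb from alternate-allele counts.

    alt_counts: for each biallelic segregating site, the number of sampled
        sequences carrying the alternate allele (hypothetical input).
    """
    n = n_samples
    # Expected differences per site between two sequences drawn without replacement
    total = sum(2.0 * k * (n - k) / (n * (n - 1)) for k in alt_counts)
    return 1000.0 * total / n_sites_total


# Toy data (not from the study): 4 sequences, 2 segregating sites, 1,000 bp
theta = watterson_theta_per_kb(2, 4, 1000)
pi = nucleotide_diversity_per_kb([1, 2], 4, 1000)
print(round(theta, 3), round(pi, 3))  # 1.091 1.167
```

Under a standard neutral model at equilibrium the two estimators have the same expectation, so a gap between them in a given population reflects a skew in the allele frequency spectrum.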