Discussion
Discriminating whether the most appropriate model to reconstruct the demographic history of a species is structured orunstructured should be the first step in empirical population genetics investigations, particularly when targeting species of conservation concerns. Even when an extensive spatial sampling is lacking, an ABC model selection approach can actually distinguish whether the sampled deme belongs or not to a meta-population (similarly to previous studies (Maisano Delser et al., 2019; Peter, Wegmann, & Excoffier, 2010)). This should inform on how to interpret the variation in Ne through time traced using unstructured models. Among the four species considered here, the tiger shark is the only panmictic. This means that the reconstructed stairwayplot has a direct biological interpretation, with the shark experiencing a mild ancestral expansion and a recent ~4-fold bottleneck around 2,000 years B.P. (consistent with the results of (Pirog et al., 2019)) (Figure 3). How to interpret the stairwayplot in the remaining three species? Coalescence simulations provided helpful and general insights into the understanding of the relation between the inferences performed under unstructured and structured models. We first focus on scenarios simulated under the SST, with parameters close to those estimated in real data. The first and most striking result is that we systematically observed a recent bottleneck under all simulated scenarios (Table 2, Figure 4, 5, 6, S1, and S2). This result could seem at a first glance surprising and due to an artefact. However, this is not the case, as: i) the signal does not depend on the inferential algorithm chosen to analyse the data (i.e., the stairwayplot ), since the normalized spectra showed a deficit in singletons compared to the other low frequency classes (Figure 4, 5, 6 and S1), which is typical of a recent population decline; ii) it is consistent with the distribution of the Inverse Instantaneous Coalescent Rate (IICR) computed in one diploid individual, which shows a signature of decline under similar meta-population models (Chikhi et al., 2018; O. Mazet, Rodríguez, Grusea, Boitard, & Chikhi, 2016; Rodríguez et al., 2018). The results of our simulations are consistent with the recent bottleneck observed in the three shark species (Figure 3), with its intensity inversely correlated to the estimated Nm (i.e., stronger forC. melanopterus and C. limbatus than for C. amblyrhynchos ). In our SST model there is an instantaneous colonization of the array of demes at Tcol , which corresponds also to a demographic expansion (i.e., the total number of individuals in the array of deme is larger than those in the ancestral deme). However, this signature is detected only for Nm ≥5 when Tcol is neither too recent nor too old (at equilibrium) (Figure 5, 6, S1 and S2). In these scenarios, the beginning of the expansion retrieved by thestairwayplot broadly corresponds to the simulated Tcol . This again corroborates the results obtained for the three shark species, since the two species with higher Nm displayed indeed an ancestral expansion in the stairwayplot with a timing consistent with the estimated Tcol (Table 1, Figure 2 and 3). Similarly, it explains why we could not retrieve the ancestral expansion for C. melanopterus nor estimate Tcol under the SST model: this appears to be a property of the coalescent pattern and it is not related to the amount of data available (see below).
It is now straightforward to frame all these findings under the coalescent perspective. The coalescent history of the lineages sampled from a single deme in an SST (or FIM) model can be separated for simplicity into three phases: the scattering , thecollecting and the ancestral phase. Going backward in time, lineages will coalesce in the sampled deme with a rate according to both Nm and N until all lineages either have coalesced or migrated to another deme. This is the scattering phase described in the seminal works of (Wakeley, 1998, 1999). Thescattering phase was considered instantaneous for mathematical tractability, with its outcome dependent on Nm only, but later works could disentangle the effect of N and m on the shape of the gene genealogy (Mona, 2017). The collecting phase starts when the lineages which did not coalesce have migrated to other demes of the array: they will then coalesce according to a Kingman process with a rate scaled by Nm and the number of demes of the array (Wakeley, 1999). Finally, all surviving lineages (in non-equilibrium model) will reach the ancestral deme at Tcol , where they will coalesce at a rate depending only on the Nanc parameter. In species with lowNm , the rate of coalescent during the scattering phase is very fast since lineages have low probability of emigrating from the sampled deme and high probability of coalescence due to the smallN . Once all the lineages are dispersed in the array of demes, there will be two possible outcomes: i) in equilibrium model, we shift to the collecting phase, where the rate of coalescent drops since lineages will hardly fall in the same deme again; ii) in non-equilibrium model, with the parameters we have simulated here, there will be very few (if any) coalescent events during the collecting phase and the transition will be directly from the scattering to theancestral phase. Both the collecting and theancestral phases have a rate of coalescent lower than thescattering phase, which determines the observed recent drop inNe for all simulated scenarios. Remarkably, the decline inNe is much stronger in equilibrium model, since the rate of coalescent is much lower in the collecting than in theancestral phase (Figure 4, 5, 6, S1, and S2). Low Nmspecies will therefore have only two coalescent phases, thescattering and either the collecting (in equilibrium model) or the ancestral (in non-equilibrium model) which is why the signature of the ancestral expansion is lost. For growing Nm , in equilibrium model there will be again only two coalescent phases, namely the scattering and collecting , with the latter having lower rate of coalescent than the former independently of the simulated parameters. This is why we observed always a strong bottleneck consistent with the distribution of the IICR statistics in any equilibrium model (Chikhi et al., 2018; Olivier Mazet et al., 2015; Rodríguez et al., 2018). In non-equilibrium model, there will be two different situations: a) Tcol (in generations) is of the same order of the deme size N . In this setting, going backward in time few lineages would have escaped the sampled demes before Tcol . This corresponds to a shift in the coalescent rate directly from thescattering to the ancestral phase, resulting in a bottleneck of lower intensity compared to an equilibrium model (Figure 4, 5, 6 and S1), for the same reasons as above; b) Tcol (in generations) is larger than N . In this setting, some coalescent events may occur during the collecting phase, at a rate much slower than the two other phases. This determines the hump observed in the stairwayplot (Figure 4, 5, 6) and explains why in this window of parameters it is also possible to correctly estimate Tcolusing our ABC framework. Further simulations under the FIM model confirmed those patterns even though the ancestral expansion could be detected for lower long-term Nm than the corresponding SST scenario (Figure S3). This is probably due to a higher apparent connectivity underlined the by FIM, where lineages can move more freely during the collecting phase in comparison to SST where migrants only come from the closest neighbours. If many coalescent events occur during the collecting phase, the change in coalescent rate will affect the resulting gene genealogy and it will be detected by thestairwayplot (or any other unstructured method based on coalescent theory).
Using coalescent arguments, we clarified why simple meta-population models with constant connectivity generate a gene genealogy harbouring a signature of a recent decline for any parameters’ combination. The signature of bottleneck detected by the stairwayplot in the three shark species best described by SST can be therefore interpreted as a consequence of the underlying structure. However, connectivity likely changes through time. For instance, human activities have likely impacted the evolutionary history of a large number of species either by decreasing their effective population size and/or by fragmenting their habitat (i.e., reducing migration rates between demes). This intuitively should exacerbate the signature of population decline in the resulting gene genealogy. However, it remains to be shown whether this signature is qualitatively and quantitatively distinguishable from models with constant connectivity. This is a question of fundamental importance to understand whether it is possible to detect recent bottleneck in structured populations. To this end, we further investigated by coalescent simulations the expected gene genealogy in SST-CH (and FIM-CH) models with a change in connectivity 10 or 50 generations B.P., which matches the beginning of extensive anthropogenic influence on biodiversity considering our species’ generation time (Ceballos et al., 2015). The resulting gene genealogies were poorly affected by the recent drop in connectivity, with both the normalized SFS and the inferredNe dynamic following the same trajectory of the corresponding scenario with the same long-term Nm and Tcol (Figures 7, 8, S5, S6, S7, S8, S9 and S10). We noticed the drop in N (Figure 8, S6, S9 and S10) had stronger influence than the drop in m(Figure 7, S5, S7 and S8), consistent with previous finding showing that the distribution of coalescent events depends not only by the Nmcompound parameter but also by their individuals values (Mona, 2017). These results imply that the simulated change in connectivity is too recent to significantly alter the pattern of coalescent events during the scattering phase and that a recent drop can be hardly detected on the basis of the SFS only. Our empirical data are consistent with these findings: when we compared SST vs. SST-CH models in the three shark species using the ABC framework, we failed to clearly distinguish the two models (Table S5, Table S6, Figure S4). This seems a paradox: we observed a recent bottleneck in species of conservation concern usingunstructured model, but we cannot exclude that this is just the consequence of population structure.
This study highlight once more the importance to explicitly test for meta-population structure before interpreting the demographic signals detected by unstructured models, similarly to what advocated previously by (Maisano Delser et al., 2019; Rodríguez et al., 2018). If the meta-population structure hypothesis is rejected, the variation ofNe through time can be directly interpreted as the demographic history of the population under investigation, such as the case of tiger shark. Otherwise, this variation is still related to demographic events, but it has to be explained in the light of population structure and its consequence on the rate of coalescent events. We showed by coalescent simulations how to interpret such variation: the recent bottleneck detected by the stairwayplot in demes belonging to a meta-population is a consequence of the coalescent process. In other words, any inferential method implementing an unstructured model will detect such decline (if enough data is available) since it is a property of the gene genealogy. Importantly, the gene genealogy is only slightly affected by recent changes in connectivity if the time of this change in generations is of the same order of the size of the deme.
Our study highlights a key issue in conservation genetics as a recent decline inferred by an unstructured model can be mis-interpreted as a consequence of recent anthropic pressures (Ceballos et al., 2015) when it actually results from meta-population structure. This is all the more alarming since the majority of species is likely organised in meta-populations across their range rather than panmictic at a large scale. We therefore stress the necessity for an educated choice of tools to correctly uncover the recent trend of a species and design proper conservation programs. For instance, detecting a recent bottleneck in meta-populations will require summary statistics measuring the linkage disequilibrium (Boitard, Rodríguez, Jay, Mona, & Austerlitz, 2016; Kerdoncuff, Lambert, & Achaz, 2020) and/or the inferential framework based on the IICR (Chikhi et al., 2018; Rodríguez et al., 2018) coupled with whole genome data. On a positive note, we showed that the colonization time of the array of demes can be estimated to some extent (and under some combinations of parameters) by unstructuredmodels. We believe that this is particularly important because it has been shown that the simple instantaneous colonization process we used here behaves similarly to a spatial explicit range expansion (Hamilton, Stoneking, & Excoffier, 2005; Mona, 2017), which is certainly a more realistic model but more difficult to investigate. We are aware that the meta-population models here tested are simple and the parameters chosen are specific of the three shark species we focused on. Nevertheless, the time-scale separation of the coalescent process is general, and it allows explaining intuitively any structured models. The four shark species here used as an example has the merit to cover a large spectrum of LHT and consequently a large spectrum of demographic scenarios, going from a highly structured to a panmictic population: this has strong implications on the distribution of coalescent times and therefore on the interpretation of the observed data.
In this study we found that population structure, independently from the degree of connectivity between demes and the migration matrix relating them, intrinsically determines a variation in the rate of coalescent events through time. We showed that the intensity and the direction(s) of such variation related to the demographic parameters of the meta-population in a predictable way. Our results highlight the importance of detecting population structure (which depends on LHT among other factors) before performing any demographic inferences but, at the same time, they reveal the utility of unstructured models to describe the shape of the gene genealogy, which is the final product of the evolutionary history of a species. A combination of structured andunstructured models is therefore the key to best characterize the evolutionary history of a species. We call for a change in perspective when investigating the demographic history of a species: the focus should be put in the reconstruction of the variation of both Nand m through time, which requires certainly new methodological development and probably more data.