Strain selection during artificial evolution
The strain selection process was tracked using strain-specific metabarcoding of a locus with high allelic richness (110 unique alleles among the 59 strains). At the start of the experiment, individual strains contributed between 1.9 and 6.3% of the cells in the starting population, based on microscopic counts. Both alleles of all strains were also detected initially (Fig. 4), although some were observed at up to three-fold higher or lower relative abundance compared with microscopic cell counts. Out of 669 strain observations across all samples, only 22 were of a strain in the wrong population, likely because of sequencing errors in key SNP positions or PCR chimeras, suggesting an FDR of 0.065. These false positive observations accounted for a negligible fraction of total strain observations (on average 0.02% [SD: 0.05] per sample) but could affect absent/present scoring and lead to an underestimate of the number of extinct strains. After 42 days, a single strain generally dominated, accounting for 30-99% of the total strain abundance, and replicates typically yielded the same dominant strain, but different strains were selected for in the copper and control treatments (Fig. 4).
The metabarcoding revealed that, as expected for the mining population, the tolerant strains VG1-2_81 and VG1-2_74 became the most abundant in all copper selection replicates, with a joint final abundance of 37 to 99% (Fig. S5). VG1-2_81 was initially highly competitive under control conditions, but VG1-2_103 eventually outcompeted it and the other strains in this treatment, making up 22-90% of all amplicons in all five replicates on day 42 (Table 1). VG1-2_103 was also the second fastest-growing strain and statistically indistinguishable from the other two when grown as a mono-clonal culture (Fig. S2).
The selection outcome of the reference population deviated more from expectations based on mono-clonal traits (Fig. 5). In the control, GP2-4_40, a strain with an observed average growth rate of only 1.32 ± 0.11 day⁻¹, dominated all five replicates after 42 days of selection (37-90%, Fig. 4), with several more strains retained at 1-25% abundance (Table 1, Fig. S4). In the copper treatment, GP2-4_27, which had a below-average EC50 (7.8 µM Cu; Fig. S6), outcompeted all other strains in four out of five replicates (93-97%, Fig. 3) and was thus identified as the strain responsible for the plastic tolerance developing in this population (Fig. 2 and 3). In the last replicate, GP2-4_57, a strain that went extinct in the other four replicates as well as in all the controls, became dominant (78%, Fig. S5). Importantly, like all strains at the beginning of the experiment, GP2-4_57 was heterozygous for the barcoding locus Sm_C12W1, but all 9,099 amplicons observed at day 42 were from only one of its two alleles (Fig. S6), indicating a loss of heterozygosity, which could be explained by inbreeding. Furthermore, this bottle replicate followed its own evolutionary trajectory and developed higher copper tolerance (EC50 11.2 µM Cu, versus 9.45-10.8) but slower growth (46 generations versus 54-59) than the GP2-4_27-dominated replicates (Table 1).
The metabarcoded relative abundances of strains were used to disentangle individual growth rates during co-cultivation. The barcoded copper tolerance traits were not correlated with other mono-clonal strain traits such as cellular surface-to-volume ratio, Fv/Fm, or growth rate (Fig. S7 and S8). Several strains had already gone extinct after nine days, especially in the mining population and copper treatment, where nine out of 30 were lost (Fig. 5). The remaining strains' growth rates in the reference population correlated poorly with those predicted from mono-clonal observations, with R² of 0.003, p = 0.8 (copper) and 0.07, p = 0.2 (control, without the outlier strain GP2-4_42). Correlations were higher in the mining experiments, with R² of 0.39, p = 0.002 (copper) and 0.24, p = 0.006 (control, without the outlier strain VG1-2_63), but it was still difficult to distinguish growth rates between many strains with confidence (Fig. 1, S1, and 5). Importantly, the precision of the metabarcode-derived growth rates was, on average, three times higher (95% confidence interval ± 0.038 day⁻¹) than that of the mono-clonally estimated rates (± 0.11 day⁻¹), showing that this approach has a much higher chance of detecting subtle strain differences in fitness. Furthermore, the evolutionary trajectories observed via metabarcoding on day nine generally persisted for the final 33 days of the evolution experiment (Fig. 4, 5, S4, and S5). Consequently, a short selection experiment with pooled populations of strains, combined with observations of strain abundance using intraspecific metabarcoding, appears to be a robust approach to estimate both the fitness of individual strains and the evolutionary potential of phytoplankton populations.
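To illustrate how strain-specific growth rates can in principle be derived from metabarcoded relative abundances, the following is a minimal sketch and not necessarily the calculation used in this study. It assumes that each strain's cell density can be approximated as its relative amplicon abundance multiplied by the total cell density, that the total density has already been corrected for any dilution, and that growth is exponential between the two time points. The function name, strain labels, and all numbers are hypothetical.

```python
import math


def strain_growth_rates(rel_start, rel_end, density_start, density_end, days):
    """Estimate specific growth rates (day^-1) for each strain in a mixed culture.

    Assumption (illustrative only): strain density = relative abundance
    * total cell density, with exponential growth between the time points.
    """
    rates = {}
    for strain, p0 in rel_start.items():
        p1 = rel_end.get(strain, 0.0)
        if p0 <= 0 or p1 <= 0:
            # Strain not detected at one time point: no rate can be computed.
            rates[strain] = None
            continue
        n0 = p0 * density_start  # strain cell density at the start
        n1 = p1 * density_end    # strain cell density at the end
        rates[strain] = math.log(n1 / n0) / days
    return rates


# Hypothetical example: total population growing at 1.0 day^-1 for nine days.
days = 9
density_start = 1.0e3                    # cells per mL (made-up value)
density_end = density_start * math.exp(1.0 * days)

rel_start = {"strain_A": 0.02, "strain_B": 0.02, "strain_C": 0.02}
rel_end = {"strain_A": 0.02, "strain_B": 0.04, "strain_C": 0.0}

rates = strain_growth_rates(rel_start, rel_end, density_start, density_end, days)
for strain, mu in rates.items():
    print(strain, "not detected" if mu is None else round(mu, 3))
```

In this sketch, a strain whose relative abundance stays constant grows at the population average, a strain whose relative abundance doubles over nine days grows faster by ln(2)/9 day⁻¹, and a strain that is no longer detected is reported without a rate rather than assigned an arbitrary value.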