Figure and Table Captions

Fig. 1. Analysis set-up and identification workflow
Fig. 2. Map of sequence class versus samples (DNA extraction pools). Class “Ambiguous” refers to unique sequence variants shared between samples of different taxonomic identity (Iranian and Greek F. orientalis treated as different taxa); class “Specific” refers to 5S-IGS variants exclusively found in a single taxon sample (or two samples, in the case of F. sylvatica s.str.)
Fig. 3. Circular cladogram based on maximum likelihood (ML) tree inference and bootstrap (BS) analysis (550 BS pseudoreplicates) of the 686-sequence matrix including all 5S-IGS variants with a total abundance ≥ 25. The tree is rooted on the genetically most distinct O-type lineage; numbers at branches give ML-BS support. Colours indicate sequence class based on their sample distribution (Fig. 2); very short variants and sequences selected for the 38-sequence matrix (cf. Fig. 5) are highlighted
Fig. 4. Neighbour-net for the 686-sequence matrix, inferred from uncorrected (Hamming) pairwise distances. Neighbourhoods delimited by well-defined interior “trunks” relate to prominent sorting events (bottleneck situations; evolutionary jumps) leading to coherent 5S-IGS lineages with high (near-unambiguous) root branch support in Fig. 3; “fans” represent poor sorting of more ancient 5S-IGS variants forming incoherent 5S-IGS lineages with ambiguous root branch support. Note the absolute genetic distance between the assumed outgroup, the ‘Japonica O’ type, and the ingroup types including the ‘Japonica I’ type. Colouration as in Fig. 3
Fig. 5. Maximum likelihood phylograms inferred from the selected 38-sequence matrix including the most common variants of each main lineage, and rare, shared variants (labelled as “A… ”) as well as highly divergent variants (“D_… ”). The left tree was inferred with generally length-polymorphic regions (LPR) included (as defined in Supplementary file S4, sheet Motives); the right tree with them excluded. Line thickness visualises non-parametric bootstrap (BS) support based on 10,000 BS pseudoreplicates
Fig. 6. Sample-wise maximum likelihood (ML) trees illustrating the bimodality of the 5S-IGS pool in each sample. Numbers give ML bootstrap support (based on 1000 pseudoreplicates) for selected major phylogenetic splits. Subtree and individual labels refer to lineages and tips introduced in Figs 3–5
Fig. 7. Amplicon GC content and length violin plots for the (co-)dominant lineages/main 5S-IGS types found in each sample. Horizontal black lines give the median value for each main type (based on all variants of the respective lineage). Width of violin plots adjusted to visualise the relative proportion (number of HTS reads) of each type within a sample; n gives the plots’ sample number (number of distinct variants). Pie-charts give the proportional abundance (PA) of the plotted types within each sample (see Supplementary file S1, appendix A, for absolute numbers). “O”, “I”, “A”, “B0”–“B3” refer to respective (sub)types/lineages labelled in the 686- and 38-guide trees (Figs 3, 5). Colouration gives affinity to main 5S-IGS lineages (Figs 3–5). *, excluding very rare Iranian F. orientalis- and F. crenata-specific B types (cf. Fig. 8)
Fig. 8. Doodle summarising the totality of our results and their interpretation in light of all available information (referenced in the text)
Table 1. Estimates of mean inter- and intra-lineage 5S-IGS divergence
Table 2. Main 5S-IGS sequence types observed in our dataset and their evolutionary interpretation