Discussion

5S-IGS divergence and paralogy/homeology

With the exception of few variants (‘Crenata A’, relict lineage variants, partially deleted very short O-types), there is no evidence for sequential decay. The relatively low GC contents (range: 33.2 – 45.1%) as compared to other Fagaceae such as oaks, appear to be characteristic of nuclear spacers of Fagus , and its nucleome in general (NCBI GenBank acc. no. QCXR00000000.1 and BKZX00000000.1; not yet annotated, accessed on 25/11/2020). Thus, pseudogeny can be largely ruled out as the cause for the detected 5S-IGS divergence. Rather, the presence of two stable co-dominant main sequence clusters in each sample (Fig. 6; Tables 2, 3) fits with the only available cytogenetic data for Fagus (Ribeiro et al., 2011), showing two paralogous, potentially functional 5S rDNA pericentromeric loci (and four terminal 35S rDNA loci) in F. sylvatica . This is a rare feature within Fagaceae, usually showing only one 5S rDNA locus (www.plantrdnadatabase.com; accessed 15/08/2020, Chokchaichamnankit & Anamthawat-Jonsson, 2015). Additional or odd numbers of (unpaired) loci were found in single individuals and attributed to hybridisation and auto-polyploidisation (Chokchaichamnankit et al., 2008, Ribeiro et al., 2011). Comprehensive data for comparisons with other families in the Fagales are currently lacking, with the exception of Corylus (Betulaceae) showing a single 5S locus and much lower intra-individual, intra-specific divergence (Forest & Bruneau, 2000).
In F. japonica, the longer, GC-rich O-type and shorter, GC-normal I-type, represent two highly divergent, only distantly related lineages; their divergence (Table 1) parallels that between subgenera of oaks (Denk & Grimm, 2010; Simeone et al., 2018; Piredda et al., 2020 for first HTS data) and between genera in Betulaceae (Forest et al., 2005). Both types are abundant: although HTS results cannot be generally considered quantitative (Lamb et al., 2019), I- and O-type variants appear co-dominant, while type X variants, representing a second ingroup-related lineage of ambiguous phylogenetic affinity, are rarer (Fig. 7; Supplementary file S1, section 4.3). This situation is similar to the one reported for cloned ITS data: 35S rDNA-cistron ITS regions ofF. japonica and its sister species F. engleriana (mainland China) and F. multinervis (Ullung Do, South Korea) show extreme length and sequence heterogeneity with up to three divergent main sequence types, while the ITS of the F. crenata-sylvatica lineage is more homogeneous and poorly sorted (Denk et al., 2005; Grimm et al., 2007).
In the 5S-IGS of the crenata-sylvatica lineage, the length and sequence differences are less pronounced, with the longer A-type variants being generally less derived and less abundant than the shorter, much more abundant and more diverse B-type variants (Figs 3–5; Supplementary file S1, appendix A). The divergence between these two types is also comparable to intra-sectional diversity in oaks and genus-level diversity in Betulaceae. The crenata-sylvaticasamples exhibit increasing dominance of type B over type A (Fig. 7): weakly developed in Iranian F. orientalis , increased in European samples following a (south)east/(north)west gradient (Greece – Italy – Germany), and strongest in F. crenata (type A almost absent). In addition, putative relicts (Fig. 5) can be found that lack/mix features of A- and B-type variants (Supplementary file S4) or are clearly related to the outgroup variants, type O (‘European O’; Figs 3, 4, 6). Hence, we detected up to three major length classes referring to four principle 5S-IGS types of disparate phylogenetic affinity.
While two (or more) paralogous (and/or homeologous) 5S loci may have facilitated ancient polymorphism (e.g. polyploidisation, hybridisation), intragenomic silencing of homeologues leading to pseudogeny (reviewed in Volkov et al., 2007; see Volkov et al., 2017 for a case of an ancient allopolyploid) may cause the observed detection differences. The HTS primers bind to the highly conserved 5S rDNA. If these are strongly degraded, the intergenic spacers of such arrays will not be in our sample. Compared to other studied Fagaceae (e.g. Denk & Grimm, 2010) and ITS data (Denk et al., 2005), reducing spacer length by reducing AT-dominated length-polymorphic regions may represent a general trend (root-tip distances in Fig. 5 reflect sequence derivation). Within beech, this trend is only obvious in the crenata-sylvaticalineage: the AT-richer A-types are slightly longer than the B-types replacing them. In F. japonica, longer, GC-richer O-types are rarer and less diverse than types I (Figs 6, 7; Supplementary files S1, S2). Based on the high structural and sequence diversity, counterbalanced by the large number of identical sequences detected in each sample, a combined effect of both concerted and birth-and-death evolution models must therefore be assumed for the 5S rRNA genes in beech (Nei & Rooney, 2005; Galian et al., 2014). Thus, even if there are two (or more) loci in all species of Fagus , they may not be paralogous (in a strict sense) but rather act as homeologues, i.e. they differ in position but not in function and are affected by inter-array (inter-loci) recombination and limited concerted evolution.

Formation of 5S-IGS gene pools in the F. crenata – F. sylvatica s.str. lineage

Based on the ML results of the 686-tip tree, the individual samples, and the 38 selected variants (Figs. 3, 5, 6), the potential disruption and assembly of ancient common gene pools can be discussed using the NNet network (Fig. 4). The closer two clusters (neighbourhoods, if defined by an edge bundle) are in the network, the more recent is their split. Mixed clusters typically comprise either shared ancestral variants or variants propagated during (past) gene flow. Pronounced ‘trunks’ imply genetic drift and isolation, while ‘fans’ reflect gradual radiation (see also Hipp et al., 2020 for oaks). The distances across the graph represent genetic distances, thus, the amount of genetic drift: the fixation rate of new mutations and their propagation within the genome and a population’s gene pool. Further, the network does not assume dichotomy, which may be a poor model for the evolution and propagation of 5S-IGS variants within a species’ gene pool. With currently available methods, it is impossible to date potentially paralogous-homoeologous multi-copy 5S IGS data. Considering the used taxon sample, method and data, the divergence estimates by Renner et al. (2016) provide maximum ages for lineage splitting in beeches. During the Eocene (56–33.9 Ma; Cohen et al., 2013, updated), the lineages leading to F. japonicaand F. crenata + F. sylvatica s.str. started to diverge and speciation in the crenata-sylvatica lineage probably started in the late Oligocene (Chattian; 27.82–23.03 Ma). The fossil record indicates that gene flow between the western and eastern populations of the crenata-sylvatica lineage became impossible after the middle Miocene, when the near-continuous northern Eurasian distribution of beech forests became fragmented, at around 15–10 Ma (Denk, 2004; Arkhipov et al., 2005). Gömöry et al. (2018) used allozymes and approximate Bayesian computation to examine the demographic history of western Eurasian beeches and suggested that F. sylvaticas.str. diverged from F. orientalis at ~1.2 Ma in the Pleistocene (Calabrian); this scenario is supported by the leaf fossil record (e.g. Follieri, 1958). The diversity seen in IranianF. orientalis (Figs 3, 5) may well represent the original 5S-IGS diversity within the western range of the Oligocene-Miocene precursor of the crenata-sylvatica lineage. The poorly sorted ‘Western Eurasian type B’ cluster is likely the direct result of a common origin of the western Eurasian beeches and the Japanese F. crenata and ongoing gene flow in the Miocene (relict ‘Crenata B0’; ‘Crenata B1’ grade; Fig. 3). It is possible that F. crenata (the easternmost species) replaced its type A 5S-IGS variants with specific ‘Crenata type B2’ sequences. Originally, Fagus had a (near)continuous range from Japan via central Asia to Europe, and it is therefore possible that within this continuous area (extinct) (sub)species of beech acquired 5S-IGS variants that survived within the 5S gene pool of populations like those observed in Iran. Beech forests persisted throughout the entire Neogene in this area (and in Caucasus as well) although experiencing severe bottlenecks (Shatilova et. al., 2011; Dagtekin et al., 2020). The sharing of rare variants is consistent with this scenario as they link isolated populations and otherwise distinct species to a once common gene pool (e.g. ‘Relict’ lineage: ‘European O’, ‘Cross-Asia, Crenata B2’; Figs 3–5). These variants are not part of the terminal subtrees, but instead reflect ancient, largely lost diversity that predates the formation of the modern species and possibly even the final split between ‘subgenus Engleriana’ (represented by F. japonica ) and ‘subgenus Fagus’ lineages (crenata-sylvaticalineage).
In Europe, both A-type and B-type 5S-IGS arrays were secondarily sorted and homogenized to a certain degree: ‘European A’ and ‘Western Eurasian B’ clades include ± high proportion of shared (“ambiguous”) 5S-IGS variants in contrast to the highly coherent ‘European B’ clade (Figs 3, 4). Unhindered gene flow lasted much longer between Greek F. orientalis and F. sylvatica s.str. than the Iranian F. orientalis and/or Japanese F. crenata . A side effect of the higher genetic exchange between the western populations, a putatively larger active population size and more dynamic history, is the retention of ancient 5S-IGS variants, which are likely relicts from a past diversification.

Inter-species relationships and status of Iranian F. orientalis

Our data confirm the close relationship of F. crenata withF. sylvatica s.l. and the deep split between the two Japanese species, belonging to different subgeneric lineages. Results also agree with (i ) previous morphological (Denk, 1999a, b) and population-scale isoenzyme and genetic studies (Gömöry & Paule, 2010; Bijarpasi et al., 2020), which identified a split between disjunct populations traditionally considered as F. orientalis in Europe and adjacent Asia Minor (Iran and Caucasus); and (ii ) the relatively recent contact and mixing between the westernmost F. orientalis (here represented by a Greek sample) and F. sylvaticas.str. (Papageorgiou et al., 2008; Müller et al., 2019).
Considering all evidence, the Iranian F. orientalis must have been isolated for a much longer time and likely deserves recognition as distinct species. However, a formalisation requires comparative data of populations in the Caucasus (including NE. Turkey; cf. Gömöry and Paule, 2010), where the holotype of F. orientalis comes from. According to Denk (1999b), Gömöry & Paule (2010) and Gömöry et al. (2018), the Caucasian populations are morphologically and genetically distinct from both the western F. orientalis (including SE. Bulgarian and NW. Turkish populations) and the Iranian populations. The high amount and diversity of Iran-unique 5S-IGS A- and B-type variants, while being geographically restricted, points to a complex history of the Iranian beech. The Caucasian populations are geographically closer to the Iranian, and they may have been in contact in the more recent past (via Cis-Caucasia and Azerbaijan). Hence, some of the Iran-specific variants may be shared by the Caucasian populations. Further genetic imprints in the Iranian species could be a legacy from extinct beech populations growing further east and the disjunct and taxonomically complex beeches from the Nur (Amanos) Mts in south-central Turkey. The fact thatF. crenata and Iranian F. orientalis still carry exclusively shared (‘Cross-Asia Crenata B2’) and related (‘Crenata A’; ‘Crenata B0’) 5S-IGS variants (Figs 3, 4) fits with a more or less continuous range of beech populations from the southern Ural, via Kazakhstan and Siberia, to the Far East until the end of the middle Miocene (see above).

Potential of 5S-IGS HTS data to detect past and recent reticulation events

Beech has a complex biogeographic history in the Northern Hemisphere. The crenata-sylvatica lineage represents the most evolved, recently diverged branch (Supplementary file S4; Denk & Grimm 2009; Renner et al., 2016). A reasonable model for beech evolution and diversification in western Eurasia would include phases of range contraction (isolation-speciation) and expansion (species mixing and homogenisation) to explain the diffuse morphological evolution in western Eurasian beeches. Such a scenario is supported by the observed 5S-IGS diversity (Fig. 8): speciation and isolation lead to the accumulation of new lineage-specific variants, which are then exchanged or propagated during episodes of favourable climate and massive radiation of beech forests. Without data from Chinese and/or Taiwanese species, it is impossible to assess whether the A- or B-types or the partly pseudogenic relict type lineage(s) represent the original stock of western Eurasian beeches and their Japanese sister species. Nonetheless, since the A-type is largely lost in F. crenata , and sequentially equally distant to the B-type than the F. japonicaingroup variants (I-type; Fig. 4), we postulate that both types are present at least in some of the Chinese species and represent an ancient polymorphism shared by all Eurasian ‘subgenus Fagus’ species.
Pronounced ancient nuclear polymorphism implies that at least some modern beeches are of hybrid or allopolyploid origin. For the species of ‘subgenus Engleriana’ a hybrid origin would make sense, with theF. japonica ingroup variants representing the common ancestry with the Eurasian clade of ‘subgenus Fagus’, while the outgroup variant represents another, potentially extinct, lineage of high-latitude beeches (Denk & Grimm, 2009; see also Fradkina et al., 2005; Grímsson et al., 2016). A hybrid/allopolyploid origin would also explain the extreme divergence observed in the ITS region of Fagus (Denk et al., 2005) and appears to be supported by fossil evidence. Oldest fossils unambiguously belonging to Fagus (dispersed pollen grains from the early Paleocene of Greenland, with small size and narrow colpi reaching almost to the poles; Grímsson et al., 2016) resemble ‘subgenus Engleriana’. The subsequent radiation of beeches involved western North America and East Asia, where fossil-taxa including morphological characteristics of both modern subgenera are also known. For example, the Eocene Fagus langevinii has branched cupule appendages as exclusively found in extant Fagus engleriana along with ‘subgenus Engleriana’ type pollen but resembles ‘subgenus Fagus’ in features of leaf and nut morphology (Manchester & Dillhoff, 2005). Dispersed pollen from the late Eocene of South China also resembles modern pollen of ‘subgenus Engleriana’ (Hoffman et al., 2019). This might reflect an early phase in the evolution of beeches, during which the modern subgenera were evolving, but not yet fully differentiated. By the early Oligocene, fossil-species can clearly be assigned to either ‘subgenus Fagus’ or ‘Engleriana’: F. pacifica from western North America resembles ‘subgenus Fagus’ in leaf architecture and the cupule/nut complex (Meyer & Manchester, 1997); cupules and leaves from the Russian Far East can unambiguously be ascribed to ‘subgenus Engleriana’ (Pavlyutkin et al., 2014; as Fagus palaeojaponica , F. evenensis , Fagus sp. 3).
Reticulation enriching the 5S-IGS pool is conceivable for the European species as well. The first beeches arrived in Europe in the Oligocene showing a general morphotype found across continental Eurasia (F. castaneifolia ; Denk, 2004; Denk & Grimm, 2009). During the Miocene,F. castaneifolia was gradually replaced by F. haidingeri , the ancestor and precursor of all contemporary western Eurasian beeches (Denk et al., 2005; Denk & Grimm, 2009). Like F. castaneifolia, F. haidingeri shows high morphological plasticity. Hence, this fossil-species may represent a species complex rather than a single species. In addition, in southern Europe, gene flow between F. haidingeri and another fossil-species, F. gussonii , might have occurred. Our data confirm the potential for (sub)recent reticulation and incomplete lineage sorting between and within the morphologically distinct Greek F. orientalis and F. sylvaticas.str., and similar processes likely occurred repeatedly since the Miocene. Notably, the Greek orientalis and F. sylvatica s.str. comprise three main 5S-IGS lineages compared to only two in the Iranian sample. In addition to the inherited polymorphism shared with the Iranian F. orientalis (relatively similar type A, ‘Original A’ and ‘European A’; shared ‘Western Eurasian B’), we found a highly abundant, moderately evolved but less diverse B-type (‘European B’ in Figs 3–6), exclusive to Greek orientalis and F. sylvatica s.str., likely reflecting the recent common origin of these populations. In the context of F. gussonii, the second fossil-species present in the Miocene of Europe (including Iceland) and of uncertain affinity (Denk & Grimm, 2009), data from North American extant species will be most valuable. Will they be clearly different (as found for the ITS region and 2nd LEAFY intron; Denk et al., 2005; Renner et al., 2016) or have they preserved affinities with one of the here established lineages?