Definition of phylogenetically-conserved candidate genes
PCCGs are genes identified by functional biologists as having major effects on traits, and whose sequence and function are (at least partly) conserved across a broad range of species. They include genes coding for ecologically important traits, for instance traits associated (directly or indirectly) with resource acquisition or with interactions with other organisms (Skovmand et al. 2018). Many PCCGs have been described, but this knowledge has barely percolated into our scientific community, with rare exceptions such as behavioural ecology (e.g., Fitzpatrick et al. 2005; Ducrest et al. 2008). We believe that we should build on this knowledge, and that PCCGs may be fundamental to unifying facets of biodiversity.
Seminal works from the 1990s have identified candidate genes underlying traits that matter for fitness (Andersen & Lübberstedt 2003; Meinke et al. 2008; Chu et al. 2011; Schwander et al. 2014; Hassani-Pak & Rawlings 2017; Anreiter & Sokolowski 2019). In animals, some of these genes code for functional traits, such as foraging behaviour, metabolism or stoichiometry, that are strongly related to the acquisition of resources and/or their conversion into biomass (Brown et al. 2004; Violle et al. 2007; Wolf & Weissing 2012). For instance, Sokolowski’s team identified a gene (the foraging gene, for) that strongly controls the foraging behaviour of Drosophila melanogaster (de Belle et al. 1989; Sokolowski 2001; Anreiter & Sokolowski 2019). This gene codes for a cGMP-dependent protein kinase (a signalling molecule) and underlies two main behavioural strategies: the rover strategy, in which Drosophila larvae travel long distances to feed, and the sitter strategy, in which Drosophila larvae feed in more restricted areas. This gene also affects the food intake of individuals (rover larvae have lower food intake) and their food preference (rover larvae absorb higher glucose quantities) (Anreiter & Sokolowski 2019). We can reasonably expect that variation in the expression of this gene will have consequences for trophic chains and ecosystem functioning. In plants, MADS-box genes, first described in Antirrhinum majus (Schwarz-Sommer et al. 1990), are a family of genes encoding transcription factors involved in flowering time, plant and floral architecture, and fruit, seed and root development (Schilling et al. 2018). MADS-box genes are key targets for improving crop yields, and they affect the short-term adaptation of plants to environmental changes (Cho et al. 2017; Theißen et al. 2018). For instance, Flowering Locus C and Flowering Locus T regulate flowering time in many plant species, an important trait for individual fitness and for pollination by insects (Schmidt et al. 2016).
This type of candidate gene is similar to (and therefore reinforces) the idea of “Ecologically Important Genes” (EIGs) (Skovmand et al. 2018), defined as genes contributing strongly to phenotypes that have a large effect on communities and ecosystems. Nonetheless, we stress that the purpose of our approach, contrary to that of Skovmand et al. (2018), is not to search for rare EIGs with disproportionately large effects (what they called Keystone Genes, KGs), but rather to consider the impacts of a large number of these candidate genes (a hundred or more) with small to large individual contributions to traits and to ecological dynamics. Our approach acknowledges the idea that phenotypes likely arise from the collective effect of many genes with small effect sizes (Falconer 1981). Focusing on a large number of candidate genes should also offer the opportunity to identify complementarity and redundancy (in terms of trait functions, see BOX 1) among genes or loci within a community, two important concepts for predicting the impacts of biodiversity on ecological processes (Loreau 1998).
An important aspect of our framework is that we focus on candidate genes that are phylogenetically conserved, meaning that they can be sequenced across a large range of species within communities. The fact that genes are ecologically important is not sufficient to warrant their integration across the intra- and interspecific facets of biodiversity; they must also be phylogenetically conserved. Notably, most candidate genes identified in model species are actually conserved (at least partly) across species. For instance, the for gene is extremely conserved, and its sequence can be retrieved from a large number of invertebrate species (Sokolowski 2001; Anreiter & Sokolowski 2019). An orthologous gene identified in vertebrates (PRKG1), i.e., a gene whose sequence has diverged over the course of evolution from a shared ancestral gene, was found to be associated with foraging-like behaviour in humans, amphibians and small mammals (Anreiter & Sokolowski 2019; Struk et al. 2019). Similarly, the MADS-box gene complex has been identified in many taxonomic groups including mosses, gymnosperms and angiosperms (Gramzow & Theißen 2013; Schilling et al. 2018). Conservation of candidate genes should actually be the norm rather than the exception, given their importance for essential biological functions (Marden et al. 2013; Barson et al. 2015; McGirr & Martin 2016; James et al. 2017).
Using PCCGs as targets for measuring biodiversity inclusively is particularly attractive because the dynamics of PCCGs are shaped by demographic and (micro- and macro-) evolutionary processes, and because PCCGs likely code for important ecological traits and functions linked to ecological processes. PCCGs are therefore at the intersection of ecological and evolutionary dynamics, which makes them an ideal basis for identifying new mechanisms linking the environment, biodiversity and the functioning of ecosystems. Hereafter, we provide insights into the concepts and tools currently available to characterise PCCG diversity across species, and we provide a technical framework that forms a basis for future research (Figure 2).