Introduction
Commonly occurring taxa within a particular habitat are thought to be
critical to that habitat’s and ecosystem’s functions (Hamady & Knight,
2009; Shade & Handelsman, 2012; Turnbaugh & Gordon, 2009; Turnbaugh et
al., 2009; Umaña, Zhang, Cao, Lin, & Swenson, 2017). A core set of taxa
has been defined as the consistent assemblage of organisms associated
with a certain niche space (Hamady & Knight, 2009), and cataloging core
taxa has been the focus of many recent microbiome studies, due to the
complex nature of high-throughput sequence data. Separating taxa into a
set of core and non-core or transient members serves the purpose of
simplifying multidimensional data and is thought to be advantageous for
performing statistical analyses. Support for this simplification stems
from empirical consistency between patterns observed with only the core
taxa and all taxa present (Delgado-Baquerizo et al., 2018), from the idea
that commonly occurring core taxa are responsible for community function
(Saunders, Albertsen, Vollertsen, & Nielsen, 2016), and from the
conservative practice of statistical testing for treatment effects by
examining only the most commonly occurring taxa (Wirth et al., 2018).
Examining the dynamics and patterns of variation of the core assemblage
is often seen as an important step in analyzing and understanding
complex community interactions.
The concept of a core community has been operationalized in various sets
of criteria that can be applied to identify taxa that could belong to
the core assemblage (Delgado-Baquerizo et al., 2018; Gray, Amjad, &
Gray, 1983; Lundberg et al., 2012; Shade & Handelsman, 2012; Shade &
Stopnisek, 2019; Soliveres et al., 2016; Turnbaugh & Gordon, 2009;
Turnbaugh et al., 2009). However, the assumption that a core set of taxa
can be accurately identified underlies all core methods, and it is
unclear to what extent the concept of a core assemblage is supported by
data. Shade and Handelsman (2012) reviewed different criteria for
defining the core microbiome including abundance, phylogeny, and
function. However, they did not evaluate evidence or support for the
concept of a core community. More recently, studies have indicated that
some habitats are not occupied by a consistent core set of taxa and
instead host transients (Hamady & Knight, 2009; Hammer, Janzen,
Hallwachs, Jaffe, & Fierer, 2017).
Beyond methodological considerations, focusing on a core subset of taxa
might overlook consequential effects of rare taxa. With attention
shifting from “who is there?” (i.e. taxonomic composition) to “what
are they doing?” (i.e. functionality), the contribution of rare taxa,
especially those that serve as hub taxa in complex microbial networks,
should not be disregarded simply due to lower abundances (Banerjee,
Schlaeppi, & van der Heijden, 2018; Shi et al., 2020). Certain narrowly
distributed microbial functions such as nitrification, denitrification,
methanogenesis, or sulfate reduction are performed by relatively rare
microbes (Jousset et al., 2017; Lynch & Neufeld, 2015). Community
analyses that examine only abundant or commonly occurring microbes
(i.e. core assignments) have the potential to overlook the taxa
responsible for important ecosystem functions such as those listed
above. In focusing solely on core taxa, the contributions of transient
or rare taxa are discounted and attributed to commonly occurring ones,
potentially overemphasizing the importance of common taxa while
simultaneously underestimating the contribution of rare taxa.
Given the considerable and growing interest in using molecular data to
characterize diverse communities across many samples and conditions
(e.g. Ahrendt et al., 2018; Delgado-Baquerizo et al., 2018; Desnues et
al., 2008; Geisen, Laros, Vizcaíno, Bonkowski, & de Groot, 2015;
Porazinska et al., 2010; Stat et al., 2017; Tedersoo et al., 2014) and
the frequent use of core community analyses, we evaluated the definition
of a core community and its consequences in three ways. First, we
compared different methods for defining core membership. Next, we used
the core assignments and the full datasets to determine whether the
interpretation of differences in community diversity (beta-diversity)
would be the same. Finally, we examined to what extent core
assignment methods could identify significant hub taxa as determined by
co-occurrence network analysis. Our study used microbial datasets from
the Human Microbiome Project (Turnbaugh et al., 2007) and soil
rhizosphere samples from Arabidopsis thaliana (Lundberg et al.,
2012) as well as simulations to examine the validity of splitting taxon
count data into two sets (core and non-core), while also assessing the
effects of varying criteria on core membership.
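
To make the idea of splitting taxon count data into core and non-core sets concrete, the following minimal sketch applies one commonly used style of criterion, an occupancy (prevalence) threshold, to a toy count table. It is an illustration only, not one of the specific methods compared in this study: the function name core_taxa, the parameters min_occupancy and min_count, and the example counts are all hypothetical.

```python
def core_taxa(counts, min_occupancy=0.5, min_count=1):
    """Split taxa into core and non-core sets by occupancy.

    counts: dict mapping taxon name -> list of per-sample counts
    (hypothetical input format). A taxon is assigned to the core
    if it is detected (count >= min_count) in at least
    min_occupancy of the samples.
    """
    core, non_core = [], []
    for taxon, row in counts.items():
        # Fraction of samples in which the taxon is detected
        occupancy = sum(1 for c in row if c >= min_count) / len(row)
        if occupancy >= min_occupancy:
            core.append(taxon)
        else:
            non_core.append(taxon)
    return core, non_core


# Toy data: three taxa counted in four samples (invented values)
counts = {
    "taxon_A": [12, 8, 15, 9],  # detected in every sample
    "taxon_B": [0, 3, 0, 0],    # detected in one sample
    "taxon_C": [1, 0, 2, 4],    # detected in three samples
}

# Core membership depends on the chosen threshold
for threshold in (0.5, 0.75, 1.0):
    core, non_core = core_taxa(counts, min_occupancy=threshold)
    print(threshold, core, non_core)
```

In this toy example, taxon_C is assigned to the core at the two lower thresholds but not when detection in every sample is required, which is the kind of sensitivity to the chosen criterion that motivates comparing core assignment methods.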