1. Different approaches to model forests

Different approaches have been developed to model forest ecosystems and community dynamics, as well as forest cover and tree species distributions. They range from basic theoretical models such as neutral models (Hubbell 2001), through models of growth patterns of individual trees, to forest stand or landscape models (Shifley et al. 2017), or global vegetation models (Prentice et al. 2007). Depending on the specific objectives of the developing scientists, the model representation of biogeochemical processes, vegetation structure, or biodiversity has been more or less detailed, through different degrees of aggregation or abstraction, or following various assumptions.
The three model types we briefly present here, species distribution models (SDMs), individual-based models (IBMs) and dynamic global vegetation models (DGVMs), have been developed by different disciplines. They cover a gradient from models that initially focused on a detailed representation of individual species, through models that gave initial emphasis to the representation of forest structure and tree demography, to models that focused on the representation of biogeochemical processes. We chose these model types, which have a long history and are all widely used, especially in the context of global change, to illustrate the variety of modelling approaches, but our general ideas also apply to other model types. In the following, we present these three approaches along a gradient of decreasing resolution of biodiversity representation and increasing resolution of biogeochemical process representation, acknowledging that other orderings could have been used.

Species distribution models

Species distribution models (SDMs; Booth et al. 2014; Guisan et al. 2017) focus on the spatial distribution of species and how it varies with environmental drivers. SDMs have their origin in flora distribution maps, which laid the conceptual foundations of biogeography (Humboldt 1849; Grisebach 1872). The development and increased usage of SDMs across a wide array of taxa and environments have relied on several technical advances (Guisan & Thuiller 2005; Elith & Leathwick 2009), namely statistical approaches (e.g. MaxEnt), methods for mapping the physical environment (e.g. remote sensing techniques), and better-coordinated efforts to compile species records. All these approaches have been boosted by geographic information systems (GIS).
SDMs rely on the concept of ecological niche (Hutchinson 1957; Guisan & Thuiller 2005; Soberón 2007), and can be described as a two-step process as follows. First, the ecological niche representation of a species is built in an environmental space, based on known records in places where environmental conditions have been described. Then each geographic location is assigned a probability of occurrence for the species, based on the niche model (Elith & Leathwick 2009).
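The two-step process described above can be sketched as follows. This is a minimal illustration, not the method of any cited study: the logistic model with quadratic terms, the plain gradient-descent fit, the assumption that environmental variables are standardized, and the function names `design_matrix`, `fit_niche` and `project` are all choices made for this example.

```python
import numpy as np

def design_matrix(env):
    """Intercept, linear and quadratic terms, so that the fitted
    response to each environmental variable can be unimodal."""
    env = np.atleast_2d(np.asarray(env, dtype=float))
    return np.hstack([np.ones((env.shape[0], 1)), env, env ** 2])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_niche(env, presence, lr=0.5, n_iter=5000):
    """Step 1: build the niche model in environmental space.

    env      : (n_records, n_variables) standardized environmental values
    presence : (n_records,) 1.0 where the species was recorded, else 0.0
    Returns the coefficients of a logistic regression (assumed model form).
    """
    X = design_matrix(env)
    y = np.asarray(presence, dtype=float)
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        # Gradient of the mean logistic loss
        w -= lr * X.T @ (sigmoid(X @ w) - y) / len(y)
    return w

def project(w, env_map):
    """Step 2: assign an occurrence probability to each geographic cell.

    env_map : (rows, cols, n_variables) environmental values per map cell
    Returns a (rows, cols) array of probabilities of occurrence.
    """
    rows, cols, n_var = env_map.shape
    X = design_matrix(env_map.reshape(-1, n_var))
    return sigmoid(X @ w).reshape(rows, cols)
```

Keeping the two steps in separate functions mirrors the logic of the approach: the niche is estimated once from records, and can then be projected onto any map of the same environmental variables, including maps of future conditions, with the caveats discussed in the next paragraph.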
SDMs thus require little information on the processes from which species distributions result. This can be an advantage, e.g. for poorly known taxa in need of conservation action. Also, by looking for a best model fit in species niche modelling, important environmental drivers of spatial species patterns may be revealed (e.g. Thuiller et al. 2003; Bertrand et al. 2012). SDMs have also been used to predict species distributions under future environmental conditions, such as species invasion or climate change (Thuiller 2003; Thuiller et al. 2005). However, key assumptions of SDMs, mainly that species are at equilibrium with their environment (Václavík & Meentemeyer 2019), and that the species-environment relationships are valid beyond the range of model calibration, may be violated under such applications (Svenning & Skov 2004; Araújo & Pearson 2005; Veloz et al. 2012). Classical SDMs are further limited to a species-by-species approach, and thus typically overlook the role of species interactions in shaping species distributions (Dormann et al. 2018). Additionally, the inherent spatial autocorrelation (SAC) of species distributions and environmental variables can bias the estimated performance of SDMs (Bahn & McGill 2007; Fourcade et al. 2018), calling for care when using extrapolations from SDMs (Sofaer et al. 2018). At the same time, accounting for SAC in SDMs by various methods (Dormann et al. 2007; Václavík et al. 2012) can improve the accuracy of SDMs, because SAC is often a result of important ecological processes (e.g. dispersal limitation, colonization time lag) that drive species distributions.
The integration of processes into SDMs is likely critical to infer species distributions in novel environments or under conditions with no present-day analogue (Kearney & Porter 2009; Dormann et al. 2012; Urban et al. 2016). Models that combine the traditional approach of SDMs with process-based information (Morin & Lechowicz 2008; Thuiller et al. 2008), such as dispersal limitation or phenology, have been developed (Stephenson 1990; Kleidon & Mooney 2000; Chuine & Beaubien 2001; Bykova et al. 2012; Nobis & Normand 2014; Duputié et al. 2015). Progress has also been made in integrating species competition as a biotic factor influencing species' realized niches (Leathwick & Austin 2001; Meier et al. 2011) and in extending these ideas to full ecological communities (Ferrier & Guisan 2006).

Individual-based forest models

There is a long tradition in ecology and forestry of using individual-based forest models to answer a broad range of scientific questions. This type of model simulates the development of each individual tree within a forest stand. A key component is the interaction between single trees (e.g. by shading), which is crucial for tree growth and influences community dynamics. Simulating individual trees makes it possible to capture not only forest structure but also tree species diversity. A widely known type of individual-based forest model is the forest gap model (Shugart 1984; Huston et al. 1988). First developed for forest stands in North America, gap models have since become among the most used model types in ecology (Botkin et al. 1972; Shugart & West 1977; Shugart et al. 2018).
In the gap model approach, a forest stand is described as a mosaic of forest patches (also named gaps). The dynamics of the forest emerge from the growth, mortality, establishment and competition of individual trees (Bugmann 2001; Porté & Bartelink 2002). Trees compete for light, water and nutrients. The vertical distribution of leaves is used to calculate the light availability for each tree, which affects growth and mortality. For competition with neighbouring trees, a competition range has to be assumed (the patch size, which should be large enough to contain a large tree), within which all trees compete with each other. Due to the individual-based concept, these models are able to describe the successional dynamics of forests (mosaic dynamics, e.g. Watt 1947) and the natural heterogeneity of forest stands (Knapp et al. 2018). The coupling of biogeochemical processes is modelled in an aggregated way in forest gap models, using the concept of limiting factors (affecting tree growth rates). Gap models can simulate the impact of temperature, precipitation, CO2 and light on tree dynamics, and thus on forest productivity, biomass and species composition (Solomon 1986; Pastor & Post 1988; Overpeck et al. 1990). Some early studies also included nutrient cycles (e.g. Pastor & Post 1986). Gap models can be applied with daily time steps, but are typically used with monthly or annual time steps.
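As an illustration of how the vertical distribution of leaves can translate into light availability and growth within one patch, consider the sketch below. It rests on assumptions that are not taken from any specific gap model: light is attenuated exponentially with the cumulative leaf area index above a tree (Beer-Lambert law), all leaves of a tree are placed at its top height, and light acts as a simple multiplicative limiting factor on growth. The function names and the extinction coefficient `k = 0.5` are illustrative.

```python
import numpy as np

def light_availability(heights, leaf_areas, patch_area, k=0.5):
    """Relative light (0 to 1) reaching the top of each tree in one patch.

    heights    : tree heights (m)
    leaf_areas : total leaf area per tree (m2)
    patch_area : ground area of the patch (m2)
    k          : light extinction coefficient (illustrative value)

    Simplification: every tree is shaded by the full leaf area of all
    strictly taller trees in the patch, regardless of horizontal position.
    """
    heights = np.asarray(heights, dtype=float)
    leaf_areas = np.asarray(leaf_areas, dtype=float)
    # taller[i, j] is True when tree j is taller than focal tree i
    taller = heights[None, :] > heights[:, None]
    lai_above = (taller * leaf_areas[None, :]).sum(axis=1) / patch_area
    return np.exp(-k * lai_above)

def realized_growth(max_growth, light):
    """Limiting-factor concept: potential growth scaled by light (0 to 1)."""
    return np.asarray(max_growth, dtype=float) * light

# Three trees sharing a 400 m2 patch (made-up values)
light = light_availability([30.0, 20.0, 5.0], [200.0, 100.0, 10.0], 400.0)
growth = realized_growth([1.0, 1.0, 1.0], light)
```

Because all trees in the patch shade each other irrespective of their horizontal position, the patch size directly sets the competition range, which is why it has to be chosen with care.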
Modules for forest management (e.g. Liu & Ashton 1995; Huth & Ditzer 2001; Mina et al. 2017) and disturbances like fire (Kercher & Axelrod 1984; Fischer 2013), browsing (Seagle & Liang 2001; Didion et al. 2009) or windthrow (Seidl et al. 2011, 2014a) have been included in subsequent studies. Tree mortality can thus be described as an exogenous process (e.g. by disturbances), but also as a growth-dependent and/or intrinsic process (e.g. Keane et al. 2001). Although gap models were first developed for temperate forests in the USA, they were soon also applied to European temperate forests (Kienast 1987; Bugmann 1996) and boreal forests (Leemans & Prentice 1989). Since the 1990s, forest gap models for tropical forests have also been developed (Bossel & Krieger 1991; Köhler & Huth 1998; Fischer et al. 2016). To simplify the high species richness of these forests, tropical gap models typically simulate forest succession by grouping tree species that share similar ecological features into several plant functional types (PFTs). The gap model approach was also extended to grasslands (Smith & Huston 1990; Taubert et al. 2012).
From the 1990s onwards, models have been developed that keep track of the position of each tree in a finer-grained grid (i.e. they are spatially explicit) and thus allow for a more detailed computation of tree light availability (Pacala et al. 1996; Chave 1999; Pretzsch et al. 2002; Maréchaux & Chave 2017). Other model developments have led to a more explicit representation of processes, for example by including a more detailed temperature and CO2 dependence of photosynthesis and respiration, or more detailed water and carbon cycles or site fertility (Fischer et al. 2016; Maréchaux & Chave 2017). Similarly, novel parameterizations have made it possible to simulate hundreds of species within communities (Maréchaux & Chave 2017; Rüger et al. 2019). Other stand-based models were designed to describe forest stand structure dynamics driven by ecophysiological processes in higher detail and at finer time scales (Kramer et al. 2002; Morales et al. 2005; Medlyn et al. 2007), although often at the cost of the temporal or spatial coverage of simulations. Individual-based forest models have since been used to address a variety of basic and applied research questions (Bugmann & Pfister 2000; Seidl et al. 2012; Bohn et al. 2014; Fischer et al. 2016; Shugart et al. 2018). Modern extensions of these models also allow simulations of forests at large spatial scales (e.g. for whole countries or continents, Xiaodong & Shugart 2005; Sato et al. 2007; Scherstjanoi et al. 2014; Rödig et al. 2017; Thom et al. 2017).

Dynamic global vegetation models

DGVMs have their origin in four different modelling research areas that were initially investigated separately: plant geography, biogeochemistry, vegetation dynamics, and biophysics (Prentice et al. 2007), with HYBRID, LPJ and TRIFFID being among the first DGVMs (Cramer et al. 2001). DGVMs were initially developed as independent models to represent the interaction between vegetation and the global carbon cycle, but also to represent vegetation dynamics in Global Circulation Models.
DGVMs simulate vegetation dynamics on daily to monthly time steps at the global scale, driven by climate, atmospheric CO2 concentration, and soil information, thereby using plant physiology and biogeochemistry to explain biogeography (Sitch et al. 2003; Krinner et al. 2005). This approach results in calculating the large-scale distribution of potential natural vegetation. The main components of each DGVM are process-based representations of photosynthesis, respiration, leaf transpiration, carbon allocation, mortality and disturbance. The exchange of carbon and water is represented at the leaf level through stomatal conductance (Ball et al. 1987; Collatz et al. 1991).
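To make the leaf-level coupling of carbon and water concrete, the sketch below implements the empirical stomatal conductance relation of Ball et al. (1987), g_s = g_0 + g_1 A h_s / c_s, in which conductance increases with net assimilation A and relative humidity at the leaf surface h_s, and decreases with the CO2 concentration at the leaf surface c_s. The default parameter values, the function name and the clipping of negative assimilation to zero are assumptions made for illustration. Individual DGVMs differ in the formulation and parameterization they use.

```python
def ball_berry_conductance(a_net, rel_humidity, co2_surface, g0=0.01, g1=9.0):
    """Stomatal conductance to water vapour (mol m-2 s-1).

    a_net        : net CO2 assimilation (umol m-2 s-1)
    rel_humidity : relative humidity at the leaf surface (fraction, 0 to 1)
    co2_surface  : CO2 mole fraction at the leaf surface (umol mol-1)
    g0, g1       : residual conductance and slope (illustrative values)
    """
    # Assumption: with zero or negative net assimilation, only the
    # residual conductance g0 remains.
    return g0 + g1 * max(a_net, 0.0) * rel_humidity / co2_surface

# Example with made-up leaf conditions
gs = ball_berry_conductance(a_net=10.0, rel_humidity=0.7, co2_surface=400.0)
```

Because assimilation itself depends on the CO2 supplied through the stomata, models typically solve this relation together with a photosynthesis model rather than evaluating it in isolation.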
Describing vegetation dynamics at the global scale inevitably entails strong model simplifications to represent vegetation. These models use PFTs to aggregate functionally similar species to represent functional properties at the biome scale. Usually global vegetation is described with 5 to 14 PFTs by differentiating life form, leaf form, phenology, or photosynthetic pathway, e.g. tropical broad-leaved raingreen tree or C3 grasses (Woodward & Cramer 1996; Prentice et al. 2007). Hence, these PFTs represent a less detailed description of species diversity within forest communities than the ones used in IBMs. Additionally, DGVMs often conduct simulations using a relatively coarse-grained grid (typically of 0.5° lat/lon resolution) in which characteristics of each cell are assumed to be homogeneous, and simulate an average individual per PFT, several of which can compete within one grid cell. Hence local competition processes are oversimplified and the influence of spatial structure within these coarse grid cells is neglected. Moreover, DGVMs typically apply the ‘big-leaf’ approach, whereby photosynthesis of the PFTs is simulated based on one photosynthetic surface throughout the grid cell. Most stand-alone DGVMs are not initialized with any observed vegetation distribution, nor with any values for the carbon and water pools. The global PFT and carbon-pool distribution is therefore determined by the given abiotic conditions and PFT-specific characteristics. Hence, each change in abiotic conditions (e.g. climate change) results in a non-prescribed reaction of the vegetation.
Although DGVMs were originally developed to simulate potential natural vegetation, including fire disturbance (Lenihan et al. 1998; Thonicke et al. 2001), they have been advanced by simulating land use (Bondeau et al. 2007; Boysen et al. 2016; Langerwisch et al. 2017; Rolinski et al. 2018), water management (Jägermeyr et al. 2015), and forest management (Bellassen et al. 2010). In order to account for the role of nutrient deposition in vegetation dynamics and its interaction with the global carbon cycle, several DGVMs have further developed an explicit representation of nitrogen and phosphorus cycles (Wang et al. 2010; Smith et al. 2014; Reed et al. 2015; Goll et al. 2017; von Bloh et al. 2018). Similarly, a more explicit representation of tree hydraulics and water flows has been developed in some DGVMs to better assess the effect of climatic changes on evapotranspiration and drought-related mortality (Hickler et al. 2006; Bonan et al. 2014; Langan et al. 2017; Joetzjer et al. 2018). The need for a more realistic representation of vegetation structure and biodiversity to improve the predictive power of DGVMs has been highlighted (Quillet et al. 2010; McMahon et al. 2011). To achieve this, several developments have been made to include a finer representation of vegetation demographic processes (Moorcroft et al. 2001; Smith et al. 2001; Hickler et al. 2012; Fisher et al. 2018) and functional diversity (Pavlick et al. 2013; Scheiter et al. 2013; Sakschewski et al. 2015; Verheijen et al. 2015). More recently, seed dispersal of trees, and thus the capacity for tree species migration, has also been implemented in hybrid DGVMs (Snell & Cowling 2015; Lehsten et al. 2019).
In the following parts, we use the terms “forest models” and “forest modelling” to describe the variety of models that have been used to simulate forest systems, among which the three model types described above are widely used examples, acknowledging that each model type is also used to simulate other ecological systems.