Material and methods
We used ground-based fuel hazard score observations and gridded
information on 15 environmental predictors related to climate (6),
topography (5), soil (3) and potential vegetation cover (1). We used the
ground-based fuel data to train and evaluate random forest models. We
then made projections of future fuel hazard using climate model
projections. The details of the data and models are outlined below.
Fuel hazard
observations
The study area was the state of Victoria in Southeast Australia, where
extensive fine resolution data were available to test our novel
machine-learning approach (Figure 1). The Victorian Department of
Environment, Land, Water and Planning (DELWP) provided field
observations of fuel hazard ratings over the period 1995 to 2017 with a
total of 47,245 individual records. The data contain categorical fuel
hazard scores for surface, near-surface, elevated and bark fuel strata,
with five levels: low (1), moderate (2), high (3), very high (4) and
extreme (5) (Hines et al. 2010). The records were mostly one-off
assessments for georeferenced plots of ca. 20m x 20 m.
We focused the analysis on the elevated stratum because of its high
consistency among different surveys (Watson et al., 2012) and high
impact on fire regimes (Hines et al. 2010). The elevated fuel hazard
score is based on the cover and horizontal connectivity of dead and live
plants that is close for biomass that may not be consumed by a flame
height of 0.5 m. The near-surface fuel hazard score is based on cover
and horizontal connectivity of dead and live plants that is close
(<0.5m) but not lying on the ground. The surface fuel hazard
score is based on litter depth, cover and horizontal connectivity lying
on the ground. Bark fuel hazard is specific to certain tree species and
alone does not contribute to the fire regime at landscape scale.
Therefore, we do not specifically model bark fuel stratum in this study.
Since we are interested in predicting potential fuel hazard score, we
used the observations made at least 10 years after the last fire; this
length of time allows the fuel to accumulate beyond 95% of capacity
(Peet et al., 1971; Fox et al., 1979; Burrows et al, 1994; Zazali et
al., 2021). This filtering resulted in a total of 27,799 observations
covering Victoria. Figure 1 shows the spatial distribution of the field
observations, with the temporal distribution shown in Figure S1.
Climate predictors
We chose a set of climate, soil and topographic predictors that have
been shown to drive fuel hazard in previous studies (e.g., Jenkins et
al., 2020; McColl‐Gausden et al., 2020). We used gridded daily climate
data at 0.05° x 0.05° resolution from 1994-2018 obtained from the SILO
project (Jeffrey et al., 2001; accessed at
http://www.longpaddock.qld.gov.au/silo/). We extracted daily
precipitation (PPT), maximum air temperature (Tmax) and
minimum relative humidity (RH) for the grid cells corresponding to the
locations of the field observations. We aggregated the climate data to
monthly time steps by taking the sum of precipitation and means for
other variables. The PPT, Tmax, minimum RH of the month
before the fuel hazard observations were used as predictors to represent
the impacts of short-term climate conditions. We calculated the mean
annual precipitation (MAP) and Tmax for 1994-2018 to
represent long-term climate conditions. Finally, a rainfall seasonality
index presented by Feng et al. (2013) was used to capture long-term
variation in the temporal distribution of precipitation over the twelve
months of the year; the index varies from 0 (even distribution over all
months) to 2.48 (all rainfall concentrated in a single month). The
details of meteorological data are in Table S1. In addition to the
climatic predictors used in the model, we obtained potential
evapotranspiration (PET) from SILO and averaged over 1994-2018. We then
calculated an Aridity index (AI) as PET/MAP. AI is not used as a
predictor in the models but rather as an indicator of long-term water
balance among the pixels in the following analysis.
Soil and terrain
attributes
Soil attributes, in particular proxies for water holding capacity and
soil fertility, are expected to constrain spatial variation of fuel
loads and fuel properties via their effect on vegetation composition,
density, and structure (McColl‐Gausden et al, 2020). We obtained gridded
bulk density, clay content for topsoil (0-5 cm) and Available Volumetric
Water Capacity (AWC; 0-200 cm) at 90 m resolution from the Whole Earth
product of the Soil and Landscape Grid of Australia (Malone, 2022).
We used fine scale topographic products containing wetness index
(Gallant and Austin, 2012a), adjusted monthly solar radiation in January
and July (Gallant et al., 2014), and plan/profile curvature (Gallant and
Austin, 2012 b,c) at 3 arcsecond resolution (~90 m). The
topographic wetness index, calculated as specific catchment area divided
by slope, is commonly used as an indicator of soil water availability
(Gallant and Austin, 2012a). Mean monthly shortwave radiation is the
mean shortwave radiation (MJ m-2d-1) received by a surface accounting for latitude,
day of year, average atmospheric conditions, and terrain effects (i.e.,
slope, aspect and topographic shading). We chose the shortwave radiation
in January and July to account for different energy inputs for summer
and winter. Plan and profile curvature, derived from the Smoothed
Digital Elevation Model (Geoscience Australia, 2015), added further
constraints on soil moisture availability and other variation in other
soil attributes (e.g., soil depth). The soil and topographic data used
are summarised in Table S1.
Future climate change
projections
The climate change projections are based on the downscaled output of
nine General Circulation Models (Clark et al., 2021): ACCESS1-0,
BNU-ESM, CSIRO-Mk3-6-0, GFDL-CM3, GFDL-ESM2G, GFDL-ESM2M, INM-CM4,
IPSL-CM5A-LR, MRI-CGCM3. The chosen projections were from a full list of
12 models in Clark et al. (2021), but we excluded three models because
they do not cover both Representative Concentration Pathways (RCP) 4.5
and 8.5. We used projections of PPT, Tmax and minimum RH
for the intermediate RCP 4.5 and high emission RCP 8.5 for the period
2000-2100. We obtained projected climate data for time periods of 16
years at the beginning (2000-2015), middle (2045-2060), and end
(2085-2100) of the 21st century. We calculated the mean monthly climate
for those three time periods for each of the nine climate models. The
data were then aggregated in the same way as the historical climate data
while the spatial resolution was kept at 0.036 degree
(~3.6 km). The MAP and mean Tmax were
the mean of each 16-year period. The mean of current and future climate
among the projections are shown in Figure S2. The future
Ca used in the study is shown in Table S2.
Optimal leaf area
index
Vegetation structural response to climate change was based on gridded
LAI layers simulated by an optimisation model (Yang et al., 2018).
Briefly, the model uses long-term mean precipitation, maximum air
temperature, vapour pressure deficit (calculated based on
Tmax and RH) and photosynthetically active radiation
(converted from monthly solar radiation) to predict LAI based on the
concept of ecohydrological equilibrium (Woodward,1987). The long-term
mean PPT is used to estimate the amount of water available to support
evapotranspiration. The optimal LAI is then calculated as the LAI that
maximises canopy carbon export (gross photosynthesis less leaf
construction and respiration costs) subject to this constraint on
evapotranspiration.
There are four advantages of the optimal LAI model (Yang et al., 2018)
which makes it suitable to incorporate into machine learning: (i) it has
detailed photosynthetic and stomatal processes to capture plant
responses to climate change and rising Ca; (ii) the
optimisation process accounts for potential changes in plant strategies
under future conditions; (iii) it showed a good agreement to
satellite-derived and ground-based observations over Australia during
2000-2011; (iv) it is parsimonious with minimum computational
requirements and high interpretability. The optimal LAI model therefore
allows us to capture essential aspects of the ‘fertilisation effect’ on
future fuel hazard. Current and future LAI are shown in Figure S2.
Random forest
models
We constructed three random forest models: one for each fuel stratum
(elevated, near-surface and surface). Each model predicted a fuel hazard
score of a stratum using 15 predictors including climate, soil,
topography, and LAI as shown in Table S1. The field-based fuel hazard
score data set was split into a training subset (random sample of 70%
of observations) and an evaluation data subset (remaining 30% of
observations). Due to the importance of elevated fuels for variation in
fire intensity and rate of spread, we focused our evaluation and
prediction on this stratum (Cheney et al., 2012). All of the data
process and analyses were conducted in R (4.1, R Core Team), with the
‘randomforest’ function from the ‘randomforest’ package used for model
fitting.
Model performance and importance of
inputs
We used observed accuracy and Fleiss’s Kappa coefficient (Fleiss, 1971)
to assess model performance. Briefly, observed accuracy captures the
agreement between model predictions and observations. However, due to
the imbalance in each score, accuracy could be driven by a single score
that has high frequency in the data. The Kappa coefficient addresses
this issue by balancing the frequency of score in the data (expected
accuracy) with the observed accuracy. A Kappa coefficient over 0.4 is
generally considered as good (McHugh, 2012). We also used
‘no-information rate’ (i.e., accuracy achievable by random sampling) as
a baseline of model performance. The more model accuracy exceeds the ‘no
information accuracy rate’ the better the model performance.
To assess the importance of each predictor, we used the Gini index (also
known as “Gini importance” or “mean decrease impurity”) defined as
the total decrease in node impurity (the proportion of a sample in each
node) averaged over all trees of the ensemble (Breiman, 2001). Since the
Gini index is in favour of variables with high categorical frequency, we
also reported the percent change of mean square error of each variable.
Future
projections
We explored the applicability of the field data to the whole region
(Note S1) and found that the sample sites covered most environmental
variation of the region (Figure S3). We thus predicted future fuel
hazard scores for each of the nine models and two RCPs, a total of 18
climate projections. The soil and topography layers were resampled to
the resolution of the climate model outputs (3.6 km) to make the
projection at state scale. A LAI layer was predicted for each climate
projection and period. Note that we also predicted fuel hazard scores
for 2000-2015 with climate projections to avoid potential discrepancies
between SILO and climate models. The projected fuel hazard scores varied
by month because of monthly climate predictors. Since past intense fires
in this region occurred mostly around January, we only report the fuel
hazard score for January. The future fuel projections targeted potential
fuel with the assumption of the land covered by natural vegetation. The
projections contained the probability of each fuel hazard score on an
ordinal categorical scale from 1 (low) to 5 (high). We predicted the
probability of a hazard score of 4 or 5 (P4_5) for the
elevated fuel stratum rather than the change in categories, since high
scores in this stratum have shown to strongly affect fire behaviour
(Hines et al., 2010).
Predictor impact
assessment
Although the importance of each predictor was reported by the Gini index
and increase of error, these metrics do not directly translate into the
magnitude of changes in the probability of a particular future fuel
hazard score. We therefore explored the impacts of two individual
predictors by making projections with the target predictor from
2000-2015 instead of the projected predictor while holding all other
predictors as projected (‘manipulated run’). For clarity, the
projections with the full setup are referred to as ‘projections’.
Comparing the ‘projections’ with ‘manipulated run’ allowed us to
separate the contributions of each predictor.
The two individual predictors that were assessed were the rainfall
seasonality and the optimal LAI. We assessed rainfall seasonality
because it showed high variability in the future and ranked high in the
importance list. The difference between ‘projection’ and ‘manipulated
run’ is referred to as the ‘rainfall seasonality effect’.
The optimal LAI was investigated because it represents the
‘fertilisation effect’ of rising Ca, a key innovation of
this study. The difference between ‘projections’ and ‘manipulated run’
is referred to as the ‘fertilisation effect’, hereafter. The fractional
contribution of the ‘fertilisation effect’ was calculated as the
‘fertilisation effect’ divided by the ‘projection’.
High resolution
projections
To explore model performance at fine spatial resolution, we focused on a
topographically heterogeneous region of about 20 km2near Dargo, Victoria (-37.33588, -37.38554,147.27566,147.32566). This
mountainous region features strong heterogeneity in environmental
conditions as indicated by the fine-scale patterns of the selected
terrain attributes (Figure S4). Models relying on climate only would
predict no variation in fuel hazard scores due to the resolution of
climate data (3.6-5.0 km). Here, we used topographic and soil data at 90
m resolution to demonstrate the capability of the model to predict fuel
hazard scores at this fine resolution. We used climate data from the
ACCESS 1-0 climate projection only, which was deemed sufficient for this
analysis. because the goal is to highlight the capability of the model
rather than predicting future change.