Birds were fitted with transmitters (Hunan Global Messenger Technology
Company, China) programmed to record GPS position and speed every 1-3
hours depending on the battery condition. Transmitters were solar
powered to enable the global system for mobile communication (GSM) to
transmit data via the short message service (SMS). These
back-pack design transmitters were 55x36x26 mm in size and weighed 22g
(appr. 1.6% of the bird’s body mass; Lei et al . 2019a). As
Mobile network coverage is sparse or non-existent in summering sites of
North-East Russia, the stored data obtained from that area were
downloaded when birds returned to China.
GPS records of locations (accuracy of <1000 m) were used in
the analysis of A. erythropus journeys to Russia. For
non-breeding A. erythropus (the longest one-way migration
recorded was 16,172 km in 60 days, Lei et al . 2019b), it was
assumed the spring migration turned to summering activities when the
trans-latitudinal movement became mostly trans-longitudinal. Like spring
migration, we assumed summering was terminated when a pronounced
southbound movement was detected. For breeding birds, the date of
arrival at a breeding site was used to indicate the start of summering.
The site was classified as staging if the bird stayed at a location for
more than four days.
Environmental predictors
To model the potential summering habitat, a range of
environmental variables were used
including bioclimatic, geomorphological, land production and human
disturbance.
Bioclimatic Bioclimatic variables were taken from the 30
second WorldClim (v2.1) climate data, downloaded from
http://www.worldclim.org, which were generated through interpolation of
monthly mean temperature and rainfall data from weather stations for the
period of 1970-2000 (Hijmans et al ., 2005). We selected five
variables that are relevant to geese summering including Max Temperature
of Warmest Month (Bio5), Mean Temperature of Wettest Quarter (Bio8),
Mean Temperature of Warmest Quarter (Bio10), Precipitation of Wettest
Month (Bio13) and Precipitation of Warmest Quarter (Bio18).
GeomorphologicalTopographic heterogeneity is important for species distribution (Austin
and Van Niel 2011). Three topographic variables were included in the
modelling, namely elevation, LDFG (Local Deviation from Global Mean) and
TRI (terrain ruggedness index). The global 1 km resolution digital
elevation model (DEM) for the study area was downloaded from
(http://srtm.csi.cgiar.org/) and cropped with the study. Based on the
DEM, LDFG and TRI were calculated as:
\(LDFG=y_{i}-\overset{\overline{}}{y}\) (1)
where \(\overset{\overline{}}{y}\) is mean evaluation of the 3 by 3
window, and \(y_{i}\) is the elevation of the focus grid. Positive LDFG
values represent locations that are higher than the average of their
surroundings, as defined by the neighborhood (ridges). Negative LDFG
values represent locations that are lower than their surroundings
(valleys). LDFG values near zero are either flat areas (where the slope
is near zero) or areas of constant slope (where the slope of the point
is significantly greater than zero).
\(TRI={(\sum{(Z_{c}-Z_{i})}^{2})}^{1/2}\ \) (2)
where Zc is the elevation of the central grid andZi is the elevation of one of the eight
neighboring grids. The terrain ruggedness index (TRI) is a topographic
measurement developed by Riley, et al . (1999) to quantify
topographic irregularities in a region.
As A. erythropus is ecologically dependent on wetlands, and often
observed breeding along river valleys (Solovieva et al . 2011), we
included a layer of distance to steams in the modelling. We generated
the raster using polylines in the Global River Widths from Landsat
(GRWL) dataset (George, 2018) as the central lines. The polylines were
checked to be a good represent of the rivers in the study area.
Land production To characterize land production, we
calculated three variables (EVImax,
EVIhom and EVIrange) using EVI (Enhanced
Vegetation Index) time series (2000-2009). The 10-day global EVI images
with 333 × 333 m resolution were downloaded from Copernicus Global Land
Service (https://land.copernicus.eu/global/products/ndvi, data
downloaded on 28 August 2019). EVImax is an indicator of
peak land productivity and was calculated as the 10-year mean of annual
max EVI. EVIrange is the range of land productivity
(i.e. EVImax -EVImin).
EVIhom is the similarity of EVI between adjacent eight
pixels, and was computed as (Tuanmu and Jetz 2015):
\(\text{EVI}_{\hom}=\sum_{i,j=1}^{m}\frac{P_{i,\ j}}{1+{(i-j)}^{2}}\)(3)
where m is the number of all possible scaled EVI values (i.e. 100) and\(P_{i,\ j}\) is the probability that two adjacent pixels have scaled
EVI values of i and j , respectively. Both
EVIhom and EVIrange can be indicator of
habitat diversity.
Human Disturbance Human disturbance can lead to declines
and local extinctions of avian species as well as habitat loss
(Vollstädt et al . 2017). The inclusion of human disturbance data
can increase the performance and accuracy of SDM (species distribution
model - Stevens and Conway, 2020). We compiled a database of all human
settlements including villages and towns in the study area (i.e.
Republic of Sakha, Magadanskaya Oblast and Chukotskiy Autonomous Okrug)
and generated a layer of distance to settlements as a proxy of human
disturbance. Settlements with zero registered inhabitants (abandoned and
closed before 2011) were excluded.
Land Cover Forcey et al . (2011) found that land
use has strong effects on waterbird distribution, and the percentage of
waterbird abundance is positively related to the area of wetland. In
this study, we used the 2015 global land cover map derived from
satellite observations by Land Cover Climate Change Initiative (CCI) and
available fromhttps://maps.elie.ucl.ac.be/CCI/viewer/download.php . The map
classifies the global terrestrial system into 28 major classes using
United Nations Food and Agriculture Organization’s land cover
classification system (Di Gregorio 2005).
R (R Core Team, 2019) packages “raster” (Hijmans et al . 2015)
and spatialEco (Evans and Ram, 2018) were used for raster manipulation
and calculation.
Modeling
A total of 96 geo-referenced records were compiled by combining the
tracking data and historical surveys (post 1999) (Table S2 in
Supplementary). To analyze the potential breeding range, maximum entropy
implemented in the Maxent package (version 3.4.1) was used. Maxent is
among the most robust and accurate SDM techniques (Elith et al., 2006).
In the past two decades, it has gained popularity in conservation
studies, partly because the technique is less sensitive to the number of
recorded sites and uses presence-only data (Elith et al., 2011). In
developing the SDM, the program was set to take 75% of the occurrence
records randomly for model training and the remaining 25% for model
testing. The mean area under the receiver operating characteristic curve
(AUC) was used to evaluate model performance, and AUC values
> 0.75 are considered as suitable for conservation planning
(Lobo et al., 2008). The modelling process was replicated 100 times and
we reported the mean as summering ranges to reduce the sampling bias
(Merow et al ., 2013).
Although collinearity is less of a problem for machine learning methods
in comparison with statistical methods (Elith et al ., 2011),
minimizing correlation among predictors prior to model building is
recommended (Merow et al ., 2013). We used VIF (Variance inflation
factor) to select predictors (Dupuis and Victoria-Feser, 2013). Nine
variables with VIF less than 10, including two bioclimatic variables
(Bio10 and Bio18), two topographic variables (DEM and LDFG), two
productivity variables (EVIhom and
EVIrange), land cover, Distance to stream, and Distance
to settlement, were included in model building.
Using the logistic outputs of MaxEnt, we applied the minimum training
presence threshold (MTP) to produce binary habitat map. MTP threshold
finds the lowest predicted suitability value for an occurrence point and
ensures that all occurrence points fall within the area of the resulting
binary model (Elith et al ., 2011).