Predicted distribution of suitable habitat
We used presence-only data analyzed under a maximum entropy approach to
develop present-day ecological niche models (ENMs). Our goal was to
evaluate the predicted distribution of Gonipterus platensis in
its introduced range throughout South America, with a focus on Ecuador.
We used climate data from 19 WorldClim variables summarizing temperature
and precipitation features (Fick & Hijmans, 2017) and elevation.
Environmental data were trimmed to the regional extent of South America
using the R package raster (Hijmans, 2023). The choice of
environmental background can influence the predictive ability of an ENM
(Elith et al., 2010). Therefore, we created a background extent to
calibrate the ENM by generating a buffer of 500 km around each observed
locality of G. platensis and sampling 10,000 random points
within that environmental extent. Final models were then projected onto
the regional extent of South America.
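
As an illustration only, the background-sampling step described above can be sketched in Python (the study itself used R). The sketch assumes great-circle distances on a spherical Earth and ignores land/sea masks and raster cells. The function names and the example coordinates in the usage note are hypothetical and are not the study's localities.

```python
import math
import random

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius (spherical approximation)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def sample_background(localities, n_points=10_000, buffer_km=500.0, seed=0):
    """Rejection-sample points lying within buffer_km of at least one locality.

    localities is a list of (lat, lon) tuples in decimal degrees.
    """
    rng = random.Random(seed)
    # Padded bounding box around all localities (adequate away from the poles).
    pad_lat = math.degrees(buffer_km / EARTH_RADIUS_KM)
    lat_min = max(min(lat for lat, _ in localities) - pad_lat, -89.0)
    lat_max = min(max(lat for lat, _ in localities) + pad_lat, 89.0)
    widest = max(abs(lat_min), abs(lat_max))
    pad_lon = pad_lat / math.cos(math.radians(widest))
    lon_min = min(lon for _, lon in localities) - pad_lon
    lon_max = max(lon for _, lon in localities) + pad_lon
    # Sampling sin(latitude) uniformly keeps the draw uniform by area.
    sin_lo = math.sin(math.radians(lat_min))
    sin_hi = math.sin(math.radians(lat_max))
    points = []
    while len(points) < n_points:
        lat = math.degrees(math.asin(rng.uniform(sin_lo, sin_hi)))
        lon = rng.uniform(lon_min, lon_max)
        # Keep the candidate only if it falls inside some locality's buffer.
        if any(haversine_km(lat, lon, la, lo) <= buffer_km for la, lo in localities):
            points.append((lat, lon))
    return points
```

With made-up coordinates such as `[(-0.2, -78.5), (-33.4, -70.6)]`, calling `sample_background(localities, n_points=10_000)` would return 10,000 (lat, lon) pairs inside the merged buffers. A real workflow would additionally discard points that fall in the sea or outside the environmental rasters.
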
ENMs were generated using Maxent v3.4.1; this method is widely used and
shows high predictive performance compared to other modeling methods
(Elith et al., 2006; Phillips et al., 2006). Species localities were
randomly partitioned into 75% training and 25% testing datasets, and
model calibration followed a cross-validation approach with k = 5. We
evaluated regularization multiplier (rm) values from 1 to 5 and six
feature class (fc) combinations built from up to four feature classes
(i.e., L, Q, H, LQ, LQH, and LQHP, where L = linear, Q = quadratic, H =
hinge, and P = product) in the R package ENMeval 2.0 (Kass et al., 2021). The best tuning
parameters for modeling were then selected using the Akaike Information
Criterion (AIC; Appendix, Table A1). Maxent uses regularization to
reduce model complexity, and the included variables contribute differentially
to the final model (Phillips & Dudík, 2008). Thus, we included all 19
WorldClim variables and elevation in the model and allowed the algorithm
to converge onto the variables with the greatest contribution. The final
model was calibrated using the background extent and the best tuning
parameters (i.e., fc = LQH and rm = 2) and was projected onto South
America and Ecuador. This approach allowed us to evaluate the predicted
distribution of G. platensis across the introduced range.
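
The bookkeeping behind the tuning procedure above can be sketched in Python as follows. This is not the ENMeval implementation: it only enumerates the candidate settings, partitions records, and ranks fitted models by AIC = 2k - 2 ln L. The step of 1 between regularization multipliers is an assumption (the text gives only the range 1 to 5), all function names are hypothetical, and the log-likelihoods and parameter counts would have to come from actual Maxent fits.

```python
import itertools
import random

FEATURE_CLASSES = ["L", "Q", "H", "LQ", "LQH", "LQHP"]
REG_MULTIPLIERS = [1, 2, 3, 4, 5]  # assumed step of 1 between values


def tuning_grid():
    """All (feature class, regularization multiplier) candidate settings."""
    return list(itertools.product(FEATURE_CLASSES, REG_MULTIPLIERS))


def split_train_test(records, train_fraction=0.75, seed=0):
    """Randomly partition records into training and testing subsets."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


def k_folds(records, k=5):
    """Deal records into k roughly equal folds for cross-validation."""
    return [records[i::k] for i in range(k)]


def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


def select_best(fits):
    """Return the setting with the lowest AIC.

    fits maps (fc, rm) -> (log_likelihood, n_params) for each fitted model.
    """
    return min(fits, key=lambda setting: aic(*fits[setting]))
```

The grid contains 6 x 5 = 30 candidate settings. Given a dictionary of fitted results, `select_best` returns the key of the lowest-AIC model, which in the study corresponded to fc = LQH and rm = 2.
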
Model performance was assessed using the area under the receiver
operating characteristic curve (AUC). AUC is a threshold-independent
measure that varies from 0 to 1, where a score of 1 represents perfect
discrimination and a score of 0.5 represents a model no better than
random (Peterson et al., 2011). We considered an AUC score greater than
0.7 to represent good model predictions (Peterson et al., 2011). Given
that AUC has been deemed unreliable for estimating performance of
presence-background models (e.g., Lobo et al., 2008), we separately
calculated the Boyce Index (BI) to assess model prediction using the R
package ecospat (Di Cola et al., 2017; Hirzel et al., 2006). The
BI uses a Spearman rank correlation coefficient, which varies from -1 to
1 (Hirzel et al., 2006). A positive BI value approaching one indicates
that model predictions are consistent with the evaluation dataset, zero
indicates random performance, and negative values indicate a poor match
with the evaluation dataset (Hirzel et al., 2006).
Because G. platensis is invasive in South America, the final
projected model implemented a lowest presence threshold of 95% (LPT95,
equivalent to the Minimum Training Presence threshold) obtained from the
model estimated by the Maxent cloglog output (Soto-Centeno & Steadman,
2015). Under this rule, prediction pixels with values equal to or higher
than the LPT95 were scored as suitable conditions where G.
platensis could sustain viable populations in the introduced range. We
chose LPT95 to provide a conservative prediction in which at least 95%
of training localities fall within suitable habitat
(i.e., a theoretical expectation of a 5% omission rate of the training
data; Pearson et al., 2007). This threshold also helped us determine
visually if our ENMs allowed enough sensitivity to examine novel areas
of environmental suitability where G. platensis could establish
populations in South America.
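
The thresholding rule can be illustrated with a short Python sketch. It assumes that LPT95 is the suitability value at or above which 95% of training localities fall, consistent with the 5% omission rate stated above. The function names and the toy values in the usage note are hypothetical.

```python
def lpt_threshold(training_scores, omission_percent=5):
    """Suitability value that omits at most omission_percent of training presences.

    training_scores are the model's suitability values at the training localities.
    """
    ranked = sorted(training_scores)
    # Number of lowest-scoring localities allowed to fall below the threshold.
    n_omit = (len(ranked) * omission_percent) // 100
    return ranked[n_omit]


def binarize(suitability_grid, threshold):
    """Score each pixel 1 (suitable) if its value is >= threshold, else 0."""
    return [[1 if cell >= threshold else 0 for cell in row] for row in suitability_grid]
```

For 20 training localities, one locality (5%) is allowed to fall below the threshold, so the second-lowest suitability value becomes the cutoff. Setting `omission_percent=0` would instead return the lowest training value.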