2.3 Morphometric data and analysis
We measured seven linear morphological characters in 151 preserved adult frogs (100 males, 51 females), following Castellano and Giacoma (1998): snout-vent length (SVL), tibia length (TL), femur length (FL), head length (HL), head width (HW), radio-ulnar length (RU), and hand length (HDL). All measurements were taken by the first author and repeated three times. We discarded the first set of measurements and tested the second and third sets for repeatability using the Pearson correlation coefficient, accepting them when r > 0.9. Once accepted, the third set was used for morphometric analyses.
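The repeatability check described above can be illustrated with a minimal sketch. It is written in Python for illustration only and is not the code used in this study; the function names and the SVL values are hypothetical, and only the r > 0.9 threshold comes from the text.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)

def accept_third_round(second, third, threshold=0.9):
    """Return (r, accepted); accepted is True when r exceeds the threshold."""
    r = pearson_r(second, third)
    return r, r > threshold

# Hypothetical SVL values (mm) for five frogs, measurement rounds two and three.
svl_round2 = [24.1, 26.3, 22.8, 27.5, 25.0]
svl_round3 = [24.0, 26.5, 22.9, 27.4, 25.2]

r, accepted = accept_third_round(svl_round2, svl_round3)
print(f"r = {r:.3f}, accepted = {accepted}")
```

In practice the same check would be run separately for each of the seven characters.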
We performed a principal component analysis (PCA) on the seven linear measurements and used the first principal component (PC1) as a proxy for body size. We also calculated relative leg length and generated a geometric morphometric variable for head shape using the program package SHAPE v.1.3 (Iwata & Ukai, 2002), based on photographs of the 142 preserved frogs (97 males, 45 females) that were in sufficiently good condition. SHAPE traces the contour from an image, describes it with elliptic Fourier descriptors (EFDs), and then performs a PCA of the EFDs to summarize the shape information (Iwata & Ukai, 2002). We retained the first principal component summarizing head shape for further analysis.
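As an illustration of how PC1 can serve as a size proxy, the sketch below extracts the first principal component of a small measurement table. It is a simplified stand-in, not the analysis used here: it assumes PCA on the covariance matrix of untransformed values (the text does not specify this), finds PC1 by power iteration rather than a full eigendecomposition, and uses three hypothetical columns instead of the seven measured characters.

```python
import math

def first_principal_component(rows, iterations=500):
    """Return (loadings, scores) for PC1 of column-centred data.

    PC1 is found by power iteration on the sample covariance matrix.
    """
    n, p = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(p)]
    centred = [[row[j] - means[j] for j in range(p)] for row in rows]
    # Sample covariance matrix (p x p).
    cov = [[sum(c[i] * c[j] for c in centred) / (n - 1) for j in range(p)]
           for i in range(p)]
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            raise ValueError("data have no variance")
        v = [x / norm for x in w]
    # Orient the axis so that larger measurements give larger scores.
    if sum(v) < 0:
        v = [-x for x in v]
    scores = [sum(c[j] * v[j] for j in range(p)) for c in centred]
    return v, scores

# Hypothetical measurements (mm): SVL, TL, HW for four frogs.
frogs = [
    [22.0, 10.1, 7.0],
    [25.0, 11.6, 8.1],
    [27.5, 12.9, 8.8],
    [30.0, 14.0, 9.7],
]
loadings, scores = first_principal_component(frogs)
print("PC1 loadings:", [round(x, 3) for x in loadings])
print("PC1 scores (body size proxy):", [round(s, 2) for s in scores])
```

When all characters load on PC1 with the same sign, as in this toy table, the PC1 score orders individuals from smallest to largest.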
We used a random forest (RF) model implemented in the randomForest R package (Liaw & Wiener, 2002) to determine the relative importance of each of the 12 uncorrelated environmental variables for body size, relative leg length, and head shape. RF is an ensemble learning method for nonlinear multiple regression. In comparisons with similar approaches, RF consistently outperformed other methods (Cutler et al., 2007) and was among those least sensitive to spatial autocorrelation (Marmion et al., 2009). For each analysis, the data were first divided into training (70%) and testing (30%) sets to determine the optimal number of variables to split on at each node in the tree, before running an RF regression analysis based on 10,000 trees. Predictor variables were ranked in order of importance based on the number of times a given metric decreased the mean squared error (MSE) of the model. We then used the rfPermute R package to estimate the significance of importance metrics in all subsequent RF analyses. The response variable was permuted 1,000 times on each of the 10,000 regression trees to create a null distribution against which the observed value was compared. Only significant (p < 0.05) environmental variables were retained in the final model, which was used to extrapolate patterns of environmentally associated morphological variation across the study area. To further explore the direction of associations between environmental predictor variables and body size, we fitted a multiple linear regression model using the significant predictor variables identified by RF modeling. Site of origin was included as a factor in all analyses.
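The permutation logic behind the significance test can be sketched as follows. This is not the rfPermute implementation: for brevity the RF importance metric is replaced by a stand-in statistic (the absolute Pearson correlation), the data are hypothetical, and the add-one correction in the p-value is a choice made for this sketch. Only the idea of permuting the response 1,000 times to build a null distribution comes from the text.

```python
import math
import random

def abs_correlation(x, y):
    """Stand-in importance statistic: absolute Pearson correlation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return abs(sxy) / math.sqrt(sxx * syy)

def permutation_p_value(predictor, response, statistic, n_perm=1000, seed=0):
    """Compare the observed statistic with a null built by shuffling the response."""
    rng = random.Random(seed)
    observed = statistic(predictor, response)
    shuffled = list(response)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if statistic(predictor, shuffled) >= observed:
            exceed += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (exceed + 1) / (n_perm + 1)

# Hypothetical site-level data: mean temperature (deg C) and a body size score.
temperature = [12.0, 13.5, 15.1, 16.0, 17.2, 18.8, 20.1, 21.5, 22.9, 24.0]
body_size = [30.2, 29.5, 29.1, 28.0, 27.6, 26.9, 26.1, 25.0, 24.6, 23.8]

observed, p_value = permutation_p_value(temperature, body_size, abs_correlation)
print(f"observed statistic = {observed:.3f}, p = {p_value:.4f}")
print("retained" if p_value < 0.05 else "dropped")
```

A predictor would be retained under the criterion in the text when its permutation p-value falls below 0.05.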