Environmental Association Analysis
Our intra-population randomization approach showed that the predictive
power of GEAM was much larger than expected by chance. Such high
predictive power was based on the genetic diversity found in five
closely located populations whose geographical distances were not
correlated with either genetic or environmental distances
(Llanos-Garrido et al. 2019). Yet, a small number of inter-population
randomizations and randomizations by neutral loci also yielded
significant models, as expected from a certain degree of environmental
pseudoreplication and genetic aggregation in our data. However, the rate
of significance was close to 5%, i.e. the conventional type I error
rate accepted in statistical tests. On the other hand, our
complete randomization approach, which included the critical outlier
selection step, produced a relatively large number of significant models
(25%, still much lower than the 100% obtained by the ‘correct’
intra-population approach). This confirmed that outlier analyses were
able to sort through the randomized SNP databases and identify the SNPs
that explain the greatest variance among arbitrary subgroups, in such a
way that the projection of that genetic variance into the environmental
PC-space around the five sampled populations resulted in significant
association models. However, the environmental signal
of these randomly genotyped SNPs was significantly smaller than that of
real data. This supports the idea that the particular SNPs selected by
our EAA could be good proxies for the genetic variability that is
involved in local adaptation to different environmental conditions at
each population. In addition, given the low standard error of parameter
estimates (Table 1), our final genotype-environment association model
should be regarded as robust.
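
The comparison of significance rates with the nominal 5% level discussed above can be illustrated with a minimal sketch. It is not the procedure used in this study: it assumes that randomization replicates are independent and that each one yields a significant model with probability 0.05 under the null, and the replicate count (100) is hypothetical, chosen only to reproduce rates of 5% and 25%. The function name binomial_upper_tail is likewise an illustrative choice.

```python
from math import comb


def binomial_upper_tail(k, n, alpha=0.05):
    """Probability of observing k or more significant models among n
    independent randomizations, each significant with probability alpha."""
    if not 0 <= k <= n:
        raise ValueError("k must lie between 0 and n")
    return sum(
        comb(n, i) * alpha**i * (1 - alpha) ** (n - i)
        for i in range(k, n + 1)
    )


# Hypothetical replicate count; the real number of randomizations is not
# taken from the study.
n_replicates = 100
for label, n_significant in [("rate near 5%", 5), ("rate of 25%", 25)]:
    p = binomial_upper_tail(n_significant, n_replicates)
    print(f"{label}: {n_significant}/{n_replicates} significant, "
          f"upper-tail probability = {p:.3g}")
```

Under these assumptions, a rate near 5% is compatible with type I error alone, whereas a rate of 25% is far too high to be explained by it, in line with the interpretation that the outlier selection step itself generates part of the signal.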