Analytical framework
Analyses were performed in R (R Core Team, 2018). Prior to model testing, we transformed continuous variables to improve the normality of model residuals (details in Appendix S2). FST was transformed using Tukey’s ladder of powers (Tukey, 1970) with the function transformTukey from the R package rcompanion (Mangiafico, 2018). Continuous predictors were natural-log transformed. We also estimated correlations (Plackett, 1983) and evaluated multicollinearity (Acock & Stavig, 1979; Fox & Monette, 1992) among predictor variables (Appendix S3). The multicollinearity tests indicated that all predictors could be included together in a multiple regression (Tables S2 and S3).
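The power transformation described above can be illustrated with a minimal sketch. It is written in Python rather than R and is not the transformTukey implementation: as a stand-in normality criterion it uses the correlation between the sorted values and expected normal quantiles (Blom plotting positions), which is an assumption of this sketch. The function names, the grid of candidate exponents and the example values are hypothetical, and the values must be strictly positive.

```python
import math
from statistics import NormalDist, mean


def tukey_power(values, lam):
    """Apply Tukey's ladder of powers to strictly positive values."""
    if lam > 0:
        return [v ** lam for v in values]
    if lam == 0:
        return [math.log(v) for v in values]
    # Negative powers are negated so the original ordering is preserved.
    return [-(v ** lam) for v in values]


def normal_plot_correlation(values):
    """Correlation between sorted values and expected normal quantiles.

    Stand-in normality score for this sketch (closer to 1 = more normal).
    """
    n = len(values)
    ordered = sorted(values)
    # Blom plotting positions (an assumption of this sketch).
    quantiles = [NormalDist().inv_cdf((i - 0.375) / (n + 0.25))
                 for i in range(1, n + 1)]
    mean_v, mean_q = mean(ordered), mean(quantiles)
    cov = sum((a - mean_v) * (b - mean_q) for a, b in zip(ordered, quantiles))
    var_v = sum((a - mean_v) ** 2 for a in ordered)
    var_q = sum((b - mean_q) ** 2 for b in quantiles)
    if var_v == 0:
        return 0.0  # constant input carries no information on shape
    return cov / math.sqrt(var_v * var_q)


def best_lambda(values, grid):
    """Return the exponent in `grid` that maximizes the normality score."""
    return max(grid,
               key=lambda lam: normal_plot_correlation(tukey_power(values, lam)))


# Hypothetical, strictly positive example values (not data from the study).
example = [0.02, 0.05, 0.08, 0.11, 0.15, 0.21, 0.30, 0.44, 0.62]
grid = [step / 40 for step in range(-80, 81)]  # -2 to 2 in steps of 0.025
chosen = best_lambda(example, grid)
transformed = tukey_power(example, chosen)
```

Negating the values for negative exponents keeps the transformation monotonic, so the ranking of observations is unchanged whichever exponent is selected.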
To estimate phylogenetic signal and subsequently fit models that correct for it (Freckleton, Harvey, & Pagel, 2002), we produced a species-level phylogeny (Fig. S1) with the R package V.PhyloMaker (Jin & Qian, 2019). This package prunes a custom list of species from the latest and most complete mega-tree of vascular plants (Smith & Brown, 2018) (see Appendix S4 for details). We then assessed phylogenetic signal in categorical predictors with Abouheif’s (1999) method (Jombart, Balloux, & Dray, 2010; Pavoine, Ollier, Pontier, & Chessel, 2008), and in FST values with Pagel’s (1999) λ (Molina-Venegas & Rodríguez, 2017; Revell, 2012) (Appendix S5). We found that closely related species tend to be more similar than expected by chance in their mating system, growth form, pollination mode, seed dispersal mode, latitudinal region and FST. The highest observed Moran’s I was that of growth form, followed by pollination mode, latitudinal region, seed dispersal mode, and lastly mating system (Fig. S2). FST values were also phylogenetically autocorrelated (Pagel’s λ=0.52, P<0.001 and Pagel’s λ=0.53, P<0.001 for raw and transformed FST values, respectively). Given the high levels of phylogenetic signal, we implemented phylogenetically informed multiple regressions (Symonds & Blomberg, 2014) with the function ‘phylolm’ from the R package phylolm (Ho & Ané, 2014). To fit the models, the likelihood of the parameters was calculated under a Brownian motion model of evolution (Ho & Ané, 2014) (Appendix S6).
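The logic of a phylogenetically informed regression under Brownian motion can be sketched as a generalized least squares fit in which the expected covariance between two species equals the branch length they share from the root. The Python sketch below is illustrative only and is not the phylolm implementation: the function names, the four-species tree and the trait values are hypothetical, and the small linear solver assumes a positive-definite covariance matrix. The sketch also shows how Pagel’s λ rescales the off-diagonal covariances.

```python
def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [u - factor * v for u, v in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def pagel_lambda(vcv, lam):
    """Multiply off-diagonal covariances by lambda; keep the diagonal."""
    return [[v if i == j else lam * v for j, v in enumerate(row)]
            for i, row in enumerate(vcv)]


def pgls(x_rows, y, vcv):
    """GLS coefficients from (X' V^-1 X) beta = X' V^-1 y."""
    n, k = len(x_rows), len(x_rows[0])
    vinv_y = solve(vcv, y)
    vinv_x = [solve(vcv, [row[j] for row in x_rows]) for j in range(k)]
    xtvx = [[sum(x_rows[i][a] * vinv_x[b][i] for i in range(n))
             for b in range(k)] for a in range(k)]
    xtvy = [sum(x_rows[i][a] * vinv_y[i] for i in range(n)) for a in range(k)]
    return solve(xtvx, xtvy)


# Hypothetical tree ((A:1,B:1):1,(C:1,D:1):1): under Brownian motion the
# covariance of two species is the branch length shared from the root.
bm_vcv = [[2.0, 1.0, 0.0, 0.0],
          [1.0, 2.0, 0.0, 0.0],
          [0.0, 0.0, 2.0, 1.0],
          [0.0, 0.0, 1.0, 2.0]]

# Hypothetical design matrix (intercept, one predictor) and response.
design = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
response = [1.1, 2.9, 5.2, 6.8]

coefficients = pgls(design, response, bm_vcv)
```

With λ = 0 the covariance matrix becomes diagonal and the fit reduces to ordinary least squares for this ultrametric tree, whereas λ = 1 leaves the Brownian motion covariances unchanged.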
Finally, for the categorical predictors with more than two levels, we chose reference levels based on exploratory analyses with phylogenetic ANOVA and post-hoc tests (Garland, Dickerman, Janis, & Jones, 1993; Revell, 2012). We selected the level whose mean differed most from those of the other levels (Tables S4 and S5). Reference levels were as follows: trees for growth form, small insects for pollination mode, gravity for dispersal mode, and temperate for latitudinal region.
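The idea of selecting a reference level and coding the remaining levels against it can be sketched as follows. This Python sketch is a simplification: it compares plain group means rather than running a phylogenetic ANOVA with post-hoc tests, and the function names and example values are hypothetical rather than taken from the study.

```python
from statistics import mean


def most_distinct_level(values_by_level):
    """Return the level whose mean is, on average, farthest from the others."""
    means = {level: mean(vals) for level, vals in values_by_level.items()}

    def separation(level):
        others = [m for other, m in means.items() if other != level]
        return mean([abs(means[level] - m) for m in others])

    return max(means, key=separation)


def dummy_code(labels, reference):
    """Build 0/1 indicator columns for every level except the reference."""
    levels = sorted(set(labels) - {reference})
    return {level: [1 if label == level else 0 for label in labels]
            for level in levels}


# Hypothetical transformed FST values grouped by growth form.
fst_by_growth_form = {
    "tree": [0.10, 0.12],
    "shrub": [0.30, 0.28],
    "herb": [0.35, 0.37],
}

reference = most_distinct_level(fst_by_growth_form)
indicators = dummy_code(["tree", "herb", "shrub", "tree"], reference)
```

With the reference level omitted from the indicator columns, each regression coefficient for a categorical predictor is read as the difference between that level and the reference.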