The preprint by Jardim et al. makes important advances in the assessment of missing trait data. They analysed how imputing simulated target and auxiliary functional traits under different types of missing data influences the descriptive statistics, model parameters, and phylogenetic signal estimates derived from these databases. They simulated coalescent phylogenies and missing data (missing completely at random, missing at random but phylogenetically structured, and missing at random but correlated with another variable) and found that the structure of the missing data, the evolutionary model used to simulate the phylogeny, and the percentage of missing data were important factors determining estimation errors. We found the manuscript very well written and the analyses strong. This work is of great importance to the field, but we have some suggestions, which are outlined below.
The manuscript makes a novel and important contribution to the ecological literature. Trait data are rarely available for entire communities of species, and most trait databases use imputation methods. Despite this common use, previous studies have not evaluated the impact of data imputation on common metrics of trait distribution and phylogenetic signal. This paper shows that missing trait data and data imputation can create biases in common ecological and evolutionary metrics, and suggests ways to minimize the problem when only incomplete data are available.