Bioinformatic analysis
Raw LC-MS data files were converted to mzXML format and processed with the XCMS, CAMERA and metaX toolboxes implemented in R. Each ion was identified by its combination of retention time (RT) and m/z. A three-dimensional matrix containing arbitrarily assigned peak indices (retention time-m/z pairs), sample names (observations) and ion intensities (variables) was generated, and the intensity of each peak was recorded. Metabolites were annotated by matching the exact molecular mass (m/z) of each feature against the online Kyoto Encyclopedia of Genes and Genomes (KEGG) and Human Metabolome Database (HMDB) databases. When the mass difference between the observed and database values was less than 10 ppm, the molecular formula of the metabolite was further identified and validated by isotopic distribution measurements. An in-house fragment spectrum library of metabolites was also used to validate the metabolite identifications.
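The 10 ppm mass-matching criterion can be illustrated with a minimal Python sketch. This is not the XCMS/CAMERA/metaX implementation; the function names and the reference entries (`candidate_A`, `candidate_B`) are hypothetical placeholders rather than values from KEGG or HMDB.

```python
def ppm_error(observed_mz, reference_mz):
    """Signed mass error of an observed m/z relative to a reference, in ppm."""
    return (observed_mz - reference_mz) / reference_mz * 1e6


def match_candidates(observed_mz, references, tol_ppm=10.0):
    """Return (name, ppm error) pairs whose absolute error is below tol_ppm.

    `references` maps a candidate name to its expected m/z. The entries
    used in the example below are hypothetical placeholders.
    """
    hits = []
    for name, ref_mz in references.items():
        err = ppm_error(observed_mz, ref_mz)
        if abs(err) < tol_ppm:
            hits.append((name, err))
    # Closest match first
    return sorted(hits, key=lambda hit: abs(hit[1]))


if __name__ == "__main__":
    library = {"candidate_A": 200.0, "candidate_B": 200.01}  # placeholders
    for name, err in match_candidates(200.001, library):
        print(f"{name}: {err:.2f} ppm")
```

In this sketch an observed m/z of 200.001 lies about 5 ppm from `candidate_A` and about 45 ppm from `candidate_B`, so only the first would pass the threshold.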
Peak intensity data were further preprocessed with metaX. Features detected in fewer than 50% of QC samples or fewer than 80% of biological samples were removed, and missing values in the remaining peaks were imputed with the k-nearest neighbor algorithm to improve data quality. Principal component analysis (PCA) was performed on the preprocessed dataset to detect outliers and evaluate batch effects. Quality control-based robust LOESS signal correction was fitted to the QC data with respect to injection order to minimize signal intensity drift over time. The relative standard deviation (RSD) of each metabolic feature was calculated across all QC samples, and features with an RSD > 30% were removed.
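The two feature-level filters described above (detection rate and QC RSD) can be sketched in Python as follows. This is a simplified illustration and not the metaX code: missing values are represented as `None`, the function names are hypothetical, and the imputation, PCA and LOESS correction steps are omitted.

```python
from statistics import mean, stdev


def detection_rate(values):
    """Fraction of samples in which the feature was detected (not None)."""
    return sum(v is not None for v in values) / len(values)


def rsd_percent(values):
    """Relative standard deviation (sample SD / mean) in percent."""
    present = [v for v in values if v is not None]
    if len(present) < 2:
        raise ValueError("RSD needs at least two detected values")
    return stdev(present) / mean(present) * 100.0


def passes_detection_filter(qc_values, sample_values,
                            min_qc=0.5, min_sample=0.8):
    """Keep features detected in >= 50% of QC and >= 80% of study samples."""
    return (detection_rate(qc_values) >= min_qc
            and detection_rate(sample_values) >= min_sample)


def passes_rsd_filter(qc_values, max_rsd=30.0):
    """Keep features whose RSD across QC samples does not exceed 30%."""
    return rsd_percent(qc_values) <= max_rsd


if __name__ == "__main__":
    qc = [90.0, 100.0, 110.0]                # illustrative intensities
    samples = [80.0, None, 95.0, 120.0, 101.0]
    print(passes_detection_filter(qc, samples), passes_rsd_filter(qc))
```

In the workflow described in the text, the RSD filter is applied after signal correction, so `passes_rsd_filter` would be evaluated on the corrected QC intensities.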
Student's t-tests were used to detect differences in metabolite concentrations between two phenotypes. P values were adjusted for multiple testing using the Benjamini–Hochberg false discovery rate (FDR), with an adjusted P ≤ 0.05 considered significant. Supervised partial least squares discriminant analysis (PLS-DA) was conducted with metaX to identify the variables that discriminate between groups. The variable importance in projection (VIP) value was calculated, and a VIP cut-off of 1.0 was used to select important features.
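For reference, the Benjamini–Hochberg adjustment can be written as a short Python sketch. It follows the standard step-up procedure and is not taken from metaX; the function name and the example P values are illustrative assumptions.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    raw = [0.01, 0.04, 0.03, 0.005]  # illustrative values only
    adj = benjamini_hochberg(raw)
    significant = [q <= 0.05 for q in adj]
    print(adj, significant)
```

Each raw P value is scaled by m/rank, where m is the number of tests and rank is its position in ascending order, and the running minimum keeps the adjusted values monotone. Features would then be flagged when the adjusted value is at most 0.05.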