Water quality effects on the transcriptome
A linear mixed model was used to estimate the percentage of transcriptional variance attributable to selected environmental parameters. The model was fit using the R package variancePartition v.1.17.6 (Hoffman & Schadt, 2016) and included lake as a categorical random effect and dissolved oxygen, conductivity, pH, alkalinity, chlorophyll a, suspended phosphorus, suspended carbon, and total dissolved nitrogen as continuous fixed effects. The response variable was gene expression, that is, the normalized transcriptome-wide gene count matrix. Not all environmental parameters were included in the model because some were collinear. For example, total dissolved nitrogen, NO3-N, and NH3-N were positively correlated; including correlated covariates together can produce misleading results and overestimate the contribution of these variables, so only total dissolved nitrogen was retained from this group. variancePartition fit a linear mixed model for each gene that jointly considered the contribution of all variables to that gene's expression. Within this multiple regression framework, variancePartition assessed the effect of each individual variable on gene expression while correcting for all other variables included in the model (see the variancePartition documentation for further statistical details).
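For orientation, the per-gene model and the variance fractions can be written roughly as follows. This is a sketch of the general form described by Hoffman & Schadt (2016), not a formula stated in this section. The notation is assumed for illustration: y_g is the normalized expression of gene g, X_j is the j-th continuous water quality covariate, and Z is the lake indicator matrix.

```latex
% Per-gene linear mixed model (notation assumed for illustration)
y_g = \sum_{j} X_j \beta_{j,g} + Z \alpha_g + \varepsilon_g,
\qquad
\alpha_g \sim \mathcal{N}\!\left(0, \sigma^2_{\alpha,g}\right),
\qquad
\varepsilon_g \sim \mathcal{N}\!\left(0, \sigma^2_{\varepsilon,g} I\right)

% Variance attributed to fixed effect j for gene g
\hat{\sigma}^2_{\beta_j,g} = \operatorname{var}\!\left(X_j \hat{\beta}_{j,g}\right)

% Fraction of expression variance in gene g explained by covariate j
f_{j,g} =
\frac{\hat{\sigma}^2_{\beta_j,g}}
     {\sum_{k} \hat{\sigma}^2_{\beta_k,g} + \hat{\sigma}^2_{\alpha,g} + \hat{\sigma}^2_{\varepsilon,g}}

% Fraction explained by lake (random effect)
f_{\mathrm{lake},g} =
\frac{\hat{\sigma}^2_{\alpha,g}}
     {\sum_{k} \hat{\sigma}^2_{\beta_k,g} + \hat{\sigma}^2_{\alpha,g} + \hat{\sigma}^2_{\varepsilon,g}}
```

Under this formulation the fractions for all covariates, lake, and the residual sum to one for each gene, which is why the results can be reported as percentages of transcriptional variance.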