2.8.3 Correlating gene expression profiles with
CO2 treatment and OA-affected behaviours
To analyse the correlation between gene expression and behavioural
traits of squid across CO2 treatments, we employed
weighted gene co-expression network analysis (WGCNA) followed by
canonical correlation analysis (CCA) on the CNS and eyes, separately
(Figure 1). Refer to Supplementary Text S3 for a detailed description of
the methods for gene co-expression network construction and module
detection, module eigengene correlation with behavioural traits, module
membership vs gene significance, and identification of hub genes
(Figures S3 – S21, Tables S1 and S2). Briefly, the gene-level counts
from DESeq2 (v1.30.1) (Love et al., 2014) were used in the WGCNA
package (v1.70-3) (Langfelder & Horvath, 2008) to construct a
co-expression network and detect modules of genes. A module eigengene
was calculated for each module of genes, representing the gene
expression profiles of that module. CCA using package CCA (v1.2.1)
(González et al., 2008) was used to explore the correlations
between the two sets of variables from the same individual squid: ME set
= module eigengenes from each module; traits set = CO2 level (current-day or elevated) and behavioural traits (active time (s),
distance (cm), speed (cm/s), time in Zone A (s), whether the squid
displayed an exploratory/aggressive interaction (yes/no), number of
exploratory/aggressive interactions). For those modules identified by
CCA to be correlated with each trait, the Pearson correlation of module
membership (MM, higher value indicates the gene is more highly connected
to the given module) and gene significance (GS, higher value indicates a
more biologically relevant gene) was calculated (Langfelder & Horvath,
2008). A correlation of GS and MM imply that genes more highly connected
with a given module also tend to be more highly correlated with the
given trait, providing another measure for the importance of this module
with the given trait (Langfelder & Horvath, 2008). This identified the
final modules of interest, within which the MM and GS values of each
gene were used as a screening method to identify biologically relevant,
highly interconnected hub genes (Fuller et al., 2007; Horvath &
Dong, 2008; Langfelder & Horvath, 2008), i.e. to find genes correlated
with CO2 treatment and each behavioural trait. Hub genes
were defined as those genes within the final modules of interest with a
very strong correlation with the module (MM > 0.8) and a
moderate correlation with the given trait (GS > 0.4).
All hub genes for CO2 treatment were compared across
tissues to identify hub genes for CO2 treatment that are
CNS-specific, eyes-specific or found in both tissues. Hub genes for
CO2 treatment that were also a hub gene for one or more
behavioural traits were identified as genes correlated with the
associated OA-induced behavioural change. Finally, functional enrichment
analysis was used for the CNS-specific CO2 treatment hub
genes that were also hub genes for all three activity traits in the CNS
using over-representation analysis (ORA) in clusterProfiler (v3.18.1)
(Yu et al., 2012) with the hypergeometric test. This determined
if any GO terms were significantly over-represented within the list of
hub genes. All other groups of CO2 treatment hub genes
had 25 or fewer genes and thus ORA was not used. P-values were adjusted
for multiple comparisons using the Benjamini-Hochberg method and a
significance threshold of padj < 0.05 was used.