Six baseline group-level networks were estimated separately for the following participant subgroups: 1) all depressed patients (n=465); 2) all never-depressed controls (n=1295); 3) matched patients (n=377); 4) matched controls (n=377); 5) relatively good responders to treatment (n=233); and 6) relatively poor responders to treatment (n=232). In addition, for longitudinal comparison, we estimated baseline and follow-up networks for a subset of adolescents who met diagnostic criteria for depression at baseline and had recovered at follow-up (n=232). All networks were estimated using the qgraph package in R (ref). Because the true network structure of depressive symptoms is unknown and may change with symptom levels (ref Eiko), and because we were interested in the stability of different network estimation methods, we estimated full correlation networks as well as partial correlation networks. We used the cor_auto function to estimate correlation coefficients between items across subjects. For each pair of items in the matrix, cor_auto detects the data distribution (in our case, with a four-point rating scale, mostly non-normal and skewed) and applies the appropriate parametric or non-parametric correlation coefficient (e.g. polychoric correlation). Next, the correlation matrix was inverted using the corpcor package to calculate partial correlations.
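The step from a correlation matrix to partial correlations can be illustrated with a minimal sketch. It is written in Python with NumPy rather than with the R packages named above, it uses Pearson correlations on simulated four-point scores as a stand-in for the polychoric correlations that cor_auto would select, and the function name partial_correlations is hypothetical. Each partial correlation is obtained by inverting the correlation matrix and standardizing the negated off-diagonal elements of the inverse.

```python
import numpy as np


def partial_correlations(corr):
    """Convert a correlation matrix into a partial correlation matrix.

    Hypothetical helper for illustration: inverts the correlation matrix
    and rescales the negated off-diagonal elements of the inverse.
    """
    precision = np.linalg.inv(corr)
    scale = np.sqrt(np.diag(precision))
    partial = -precision / np.outer(scale, scale)
    np.fill_diagonal(partial, 1.0)
    return partial


# Simulated data (an assumption, not study data): 200 subjects, 5 items, scores 0-3.
rng = np.random.default_rng(0)
scores = rng.integers(0, 4, size=(200, 5)).astype(float)

# Pearson correlations between items across subjects (stand-in for cor_auto).
corr = np.corrcoef(scores, rowvar=False)
pcorr = partial_correlations(corr)
```

Each off-diagonal entry of pcorr is the correlation between two items after conditioning on all remaining items, which is the quantity the partial correlation networks are built on.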
The covariance between two normally distributed variables with no floor or ceiling effects is technically independent of the means of both variables, provided that the variance of both variables remains unchanged. Therefore, when item scores are normally distributed, strong networks (and, similarly, weak networks) are equally likely to occur in people with high symptom levels and in people with low symptom levels. This independence does not hold for non-normally distributed variables. For example, in a variable with floor effects (e.g. a questionnaire item on which many people obtain the lowest possible score, i.e. 0 for 'never'), a lower mean score is associated with lower variance, which in turn results in lower covariance. Thus, network strength will vary as a function of distribution properties (i.e. mean, variance, skew, etc.) that we are not necessarily interested in. To filter out this effect, we thresholded the correlation matrix against a permuted null distribution of randomly generated correlations. These random correlations were generated by scrambling the order of item scores for each subject separately, such that the item dimension becomes meaningless while the distribution of scores is preserved. After scrambling items, we should be left with the covariance resulting from the dispersion of each subject from the mean, across all items. Similarly, if we were to scramble the order of subjects for each item separately, we should be left with the covariance resulting from the dispersion of each item from the mean, across all subjects. For details about the thresholding method and a simulation study showing how correlations can be artificially induced in fully random data by manipulating distribution parameters, see supplement X. The permutations result in a distribution of full correlation edge strengths and a distribution of partial correlation edge strengths, against which each of the real edge strengths was then compared using one-sample t-tests at an alpha of 0.05/465 ≈ 0.0001, where 465 is the number of edges. Significant edges were set to 1 and non-significant edges were set to 0. The result is a thresholded, unweighted network. Global network strength was calculated as the percentage of edges across the entire network that reached significance. Node strength was calculated for each item on the questionnaire as the percentage of edges connected to that item that reached significance.
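The thresholding and strength calculations described above can be sketched as follows. This is an illustration under stated assumptions rather than the procedure documented in supplement X: it is written in Python with NumPy, it uses Pearson full correlations on simulated four-point scores, and it substitutes an empirical two-sided permutation p-value for the one-sample t-test. The function names edge_values, permutation_threshold and network_strength are hypothetical.

```python
import numpy as np


def edge_values(matrix):
    """Return the unique (upper-triangle) edge values of a square matrix."""
    rows, cols = np.triu_indices_from(matrix, k=1)
    return matrix[rows, cols]


def permutation_threshold(scores, n_perm=1000, alpha=0.05, seed=0):
    """Binarize a correlation network against a within-subject permutation null."""
    rng = np.random.default_rng(seed)
    n_items = scores.shape[1]
    observed = edge_values(np.corrcoef(scores, rowvar=False))
    n_edges = observed.size

    null = np.empty((n_perm, n_edges))
    for p in range(n_perm):
        # Scramble item order independently within each subject (row).
        shuffled = rng.permuted(scores, axis=1)
        null[p] = edge_values(np.corrcoef(shuffled, rowvar=False))

    # Assumption: empirical two-sided p-value relative to the null mean,
    # used here in place of the one-sample t-test described in the text.
    center = null.mean(axis=0)
    exceed = (np.abs(null - center) >= np.abs(observed - center)).sum(axis=0)
    p_values = (exceed + 1) / (n_perm + 1)
    significant = p_values < alpha / n_edges  # Bonferroni over edges

    adjacency = np.zeros((n_items, n_items), dtype=int)
    rows, cols = np.triu_indices(n_items, k=1)
    adjacency[rows, cols] = significant
    adjacency[cols, rows] = significant
    return adjacency


def network_strength(adjacency):
    """Percentage of significant edges overall and per node."""
    n_items = adjacency.shape[0]
    upper = adjacency[np.triu_indices(n_items, k=1)]
    global_strength = 100.0 * upper.mean()
    node_strength = 100.0 * adjacency.sum(axis=1) / (n_items - 1)
    return global_strength, node_strength


# Simulated data (an assumption, not study data): items 0 and 1 share a latent factor.
rng = np.random.default_rng(1)
n_subjects = 300
latent = rng.normal(size=n_subjects)
raw = rng.normal(size=(n_subjects, 5))
raw[:, 0] += 2 * latent
raw[:, 1] += 2 * latent
scores = np.digitize(raw, bins=[-1.0, 0.0, 1.0]).astype(float)  # four-point scale, 0-3

adjacency = permutation_threshold(scores)
global_strength, node_strength = network_strength(adjacency)
```

With five items there are ten edges, so the corrected alpha in this sketch is 0.05/10; the default of 1000 permutations is chosen so that the smallest attainable empirical p-value (1/1001) lies below that threshold.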