2. Methods
2.1. Description of the Cohort
Our analyses used data from the PreDicta study (Post-infectious immune reprogramming and its association with persistence and chronicity of respiratory allergic diseases), a 2-year multi-center prospective cohort study [14]. Blood and nasopharyngeal samples were collected from 233 preschool children (4-6 years) with asthma (167 subjects) and matched healthy controls (66) across 5 major European climate regions, i.e. Greece (Athens), Poland (Lodz), Finland (Turku), Germany (Erlangen) and Belgium (Ghent) [14]. For this study, cytokine measurements and metagenomics data were used from 51 subjects of the whole cohort (32 asthmatics, 19 healthy controls) from 3 of the 5 centres, selected based on sample availability. Our population was generally representative of the whole cohort [12] (Supplementary Table 1). Written informed consent was obtained from the parents or legal guardians, and the study was approved by the Regional Ethics Committee of Karolinska Institutet, Stockholm, Sweden.
Cases had to have been diagnosed with asthma within the previous 2 years and to have had at least 3 wheezing episodes in the 12 months before study inclusion [12]. Exclusion criteria were severe asthma, >6 courses of oral steroids during the last 12 months, immunotherapy, chronic medication use, or a history of chronic respiratory disease other than asthma and/or allergic rhinitis. Control subjects had to have no history of wheezing/asthma [14]. Additionally, subjects had to have been free of asthma exacerbation and/or upper respiratory tract infection for at least 4 weeks before sample collection. Subjects were balanced for sex (50.9% males, 49.1% females) and were aged 4.95 ± 0.65 years (range 3.26 to 6.29). Comorbidities and other characteristics are shown in Supplementary Table 1.
2.2. Treatment of blood samples and cytokine measurements
Blood samples were collected in tubes with lithium heparin (Vacutainer) and diluted with an equal volume of warm PBS (Gibco, Invitrogen, Massachusetts). Peripheral blood mononuclear cells (PBMCs) were isolated by centrifugation at 800 g for 20 min at 18-20ºC on Biocoll separating solution (Biochrom AG, Germany). After three washes, PBMCs were resuspended in complete medium (CM) [RPMI-1640 with HEPES 25 mM and L-Glutamine (Gibco, Life Technologies Ltd, UK), supplemented with 10 ml/L Penicillin-Streptomycin USA, 50 μl/L 1M β-mercaptoethanol, 20 ml/L L-Glutamine plus MEM Vitamin, 20 ml/L Non-essential Amino Acid, Sodium Pyruvate and 10% heat-inactivated FBS (all from Sigma-Aldrich, Germany)] at a concentration of 10⁶ cells/ml.
RV1b stock was propagated in HeLa cells and purified by centrifugation at 2500 rpm for 10 minutes (4ºC). Suspension from HeLa lysates was used as a control. The same batch of RV1b and HeLa suspension was used throughout the study. A 1 ml aliquot of cell suspension (10⁶ PBMCs) was exposed to RV1b (6.7 titration) or HeLa suspension as follows: after centrifugation at 300 g for 15 min at room temperature (RT), the supernatant was carefully removed by aspiration, and cells were exposed to 0.5 ml of RV or HeLa suspension for 1 hour under rotation at RT. Subsequently, cells were washed twice with CM (300 g, 15 min, RT) and resuspended in 1 ml of CM.
The cell suspension was seeded in a flat-bottom 48-well tissue plate (Corning Incorporated, Costar, New York) at 5 × 10⁵ viable cells per well (500 μL). PBMCs were cultured in duplicate at 37ºC, 5% CO2, either with complete medium alone (unstimulated control, RV1b-exposed cells, and HeLa-exposed cells) or with one of the following stimulants: 4 μg/ml Resiquimod (R848), 5 μg/ml Endotoxin-free bacterial DNA (InvivoGen, France), 20 μg/ml Polyinosinic–polycytidylic acid potassium salt (Poly I:C), or 1 μg/ml Lipopolysaccharides from E. coli O111:B4 (LPS) (Sigma-Aldrich, Germany).
Cultures were harvested after 48 hours and, after centrifugation at 600g for 5 min, supernatants were stored at -80ºC until analysis. Cytokine expression levels in the culture supernatants were quantified by multiplex bead-based fluorometric immunoassay (Milliplex, Millipore) using Luminex xMAP (Luminex 200, Bio-Rad) at the Swiss Institute for Allergy and Asthma Research (SIAF) in Davos, Switzerland. The panel used contained IFNα2, IFNγ, IFNλ-2, IL-1β, IL-5, IL-6, IL-7, IL-9, IL-10, IL-12p70, IL-13, IL-17A, IL-23A, IL-25, IL-27, IL-33, CCL3, CCL4, CCL5, CXCL8, CXCL10, TNF-α.
2.3. Characterisation of the nasopharyngeal virome
The presence of prokaryotic and eukaryotic viruses in the upper respiratory tract (nasopharynx and anterior nares) was previously investigated using metagenomic sequencing in samples obtained from 19 healthy individuals and 32 patients with asthma [12]. Briefly, three virome profile groups (VPGs) were assigned based on the predominant viral families of these individuals: the Prokaryotic VPG (P-VPG, n=29) contained samples with a high prevalence of prokaryotic viruses and a low/intermediate prevalence of eukaryotic viruses and Anelloviridae; the Eukaryotic VPG (E-VPG, n=11) included samples with a high predominance of eukaryotic viruses and a low/intermediate prevalence of Anelloviridae; and the Anelloviridae VPG (A-VPG, n=11) contained samples with a high predominance of Anelloviridae. The virome characteristics of the 51 individuals are described in Supplementary Data 1.
2.4. Statistical analysis
The data used for analysis consisted of a set of 22 cytokine concentrations in control medium and their inductions by different stimuli. Inductions, defined as the ratio of the stimulated value to the baseline level, were used in the downstream analysis. Pre-processing, necessary for subsequent clustering, included the following steps. First, the few missing values were imputed using a random forest imputation algorithm [15]. Second, outlier detection and correction were performed: low outliers (induction values lower than 1) were set to 1 for the sake of biological validity, while any high outliers, identified according to the default boxplot definition, were replaced with the minimum outlier value. Then, all values were converted to z-scores, so that each variable had a mean of 0 and a standard deviation of 1. All such variables were found to deviate from normality according to the Shapiro-Wilk test for composite normality.
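The outlier correction and standardisation steps can be illustrated with a minimal sketch. The study's analysis was run in R; the Python below uses only the standard library, the function names `clean_inductions` and `z_scores` are hypothetical, and the upper fence is taken as Q3 + 1.5 × IQR with linearly interpolated quartiles, which may differ slightly from the hinges used by R's boxplot. Imputation of missing values is not shown.

```python
from statistics import mean, quantiles, stdev


def clean_inductions(stimulated, baseline):
    """Induction ratios for one cytokine, with low and high outliers corrected."""
    # Induction = stimulated value / baseline value; values below 1 are set to 1.
    inductions = [max(s / b, 1.0) for s, b in zip(stimulated, baseline)]

    # Assumed boxplot rule: a high outlier lies above Q3 + 1.5 * IQR.
    q1, _, q3 = quantiles(inductions, n=4, method="inclusive")
    upper_fence = q3 + 1.5 * (q3 - q1)

    # Replace every high outlier with the smallest of the outlying values.
    outliers = [v for v in inductions if v > upper_fence]
    if outliers:
        cap = min(outliers)
        inductions = [min(v, cap) for v in inductions]
    return inductions


def z_scores(values):
    """Standardise to mean 0 and (sample) standard deviation 1."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


# Illustrative numbers only, not study data.
stimulated = [0.5, 2, 3, 4, 5, 6, 7, 8, 50, 100]
baseline = [1.0] * 10
cleaned = clean_inductions(stimulated, baseline)
standardised = z_scores(cleaned)
```

In this toy example the first value is raised to 1, and the two high outliers (50 and 100) both end up at 50, the smaller of the two.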
Unsupervised cluster analysis was applied to group subjects according to subsets of spontaneously released or stimulated cytokines. Stimuli were grouped according to their nature, which yielded two major conditions alongside the baseline: the antiviral response (R848, Poly I:C and RV1b) and the antibacterial response (Endotoxin-free bacterial DNA and LPS). After pre-processing and prior to clustering, the optimal number of clusters was identified using a set of 27 criteria [16]. A hierarchical agglomerative clustering algorithm with Ward's linkage was then used to group subjects by similarity into the pre-identified number of groups. The clustering outcome was visualized by principal component analysis (PCA), plotted on the first two principal components.
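As an illustration of Ward's criterion, the sketch below merges, at each step, the pair of clusters whose fusion gives the smallest increase in within-cluster sum of squares. It is a naive pure-Python version written for clarity, not the routine used in the study (which was run in R); the function name `ward_clusters` and the toy points are hypothetical, and in practice a library implementation would be preferred.

```python
from itertools import combinations


def ward_clusters(points, n_clusters):
    """Agglomerative clustering with Ward's criterion; returns one label per point."""
    dim = len(points[0])
    clusters = [[i] for i in range(len(points))]  # start with singletons

    def centroid(members):
        return [sum(points[i][d] for i in members) / len(members) for d in range(dim)]

    def merge_cost(a, b):
        # Increase in within-cluster sum of squares if clusters a and b are merged.
        ca, cb = centroid(a), centroid(b)
        dist2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
        return len(a) * len(b) / (len(a) + len(b)) * dist2

    while len(clusters) > n_clusters:
        # Pick the pair (i < j) with the smallest merge cost and fuse it.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: merge_cost(clusters[pair[0]], clusters[pair[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]

    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for idx in members:
            labels[idx] = label
    return labels


# Toy z-scored profiles for six subjects (illustrative only).
profiles = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
labels = ward_clusters(profiles, n_clusters=2)
```

Here `n_clusters` stands in for the number of groups chosen beforehand from the cluster-number criteria.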
Groups identified by clustering were analysed to characterize different types of response to a stimulus or at homeostasis. Comparisons between groups regarding the presence of major viral families (Siphoviridae, Picornaviridae, Anelloviridae) were performed with Pearson's chi-squared test of independence (Supplementary Table 2). Furthermore, the studied categorical variables (Geography, Sex, VPGs, Rhinitis, Siphoviridae, Picornaviridae, Anelloviridae and Asthma) and the age of the donors were included in a multivariate regression. To limit multicollinearity, we first ran bivariate cross-tabulation tests (Pearson's chi-squared tests of independence) between the predictors (immune clusters) and all target variables, and retained only those variables that yielded a significant p-value for at least one of the predictors. In each regression, cluster 1 of the predictors and the first level of each target variable were used as the reference.
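For reference, Pearson's chi-squared test of independence on a cluster-by-presence contingency table can be sketched as follows. This is a standard-library Python illustration rather than the R code used in the study; the function names are hypothetical, no continuity correction is applied (R's `chisq.test` applies Yates' correction to 2 × 2 tables by default), and the counts are invented.

```python
from math import erfc, exp, pi, sqrt


def chi2_sf(x, df):
    """Upper-tail probability of the chi-squared distribution (closed-form series)."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= (x / 2) / k
            total += term
        return exp(-x / 2) * total
    term, total = 1.0, 0.0
    for k in range((df - 1) // 2):
        if k > 0:
            term *= x / (2 * k + 1)
        total += term
    return erfc(sqrt(x / 2)) + sqrt(2 * x / pi) * exp(-x / 2) * total


def chi2_independence(table):
    """Return (statistic, degrees of freedom, p-value) for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return stat, df, chi2_sf(stat, df)


# Rows: two immune clusters; columns: viral family present / absent (made-up counts).
stat, df, p = chi2_independence([[10, 20], [20, 10]])
```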
Additionally, stimulated cytokine values were compared between subjects with and without the presence of pre-specified viral families (Picornaviridae, Siphoviridae, Anelloviridae) using Wilcoxon's rank-sum test. Since all statistical tests were non-parametric, descriptive statistics were reported in non-parametric form as well, i.e., as median (Q1 - Q3). All statistical tests were two-sided, and p < 0.05 was considered statistically significant. The statistical analysis was implemented in the R language using the RStudio interface.
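The rank-sum comparison can be sketched as below. This is an assumption-laden illustration, not the study's R code: the two-sided p-value comes from the normal approximation without tie or continuity correction, whereas statistical packages often use an exact distribution for small samples, so results may differ slightly. The function name `rank_sum_test` and the data are hypothetical.

```python
from math import erfc, sqrt


def rank_sum_test(x, y):
    """Wilcoxon rank-sum (Mann-Whitney U) statistic for x and a two-sided p-value."""
    pooled = sorted((value, group) for group, sample in enumerate((x, y)) for value in sample)

    # Sum the ranks of group x, giving tied values their average rank.
    rank_sum_x = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        average_rank = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        rank_sum_x += average_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j

    n1, n2 = len(x), len(y)
    u = rank_sum_x - n1 * (n1 + 1) / 2
    # Normal approximation (no tie or continuity correction).
    z = (u - n1 * n2 / 2) / sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return u, erfc(abs(z) / sqrt(2))


# Made-up induction values for subjects without / with a given viral family.
u, p = rank_sum_test([1, 2, 3, 4, 5], [6, 7, 8, 9, 10])
```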