Statistical analysis
We performed agglomerative hierarchical clustering with Ward's
minimum-variance method to group comorbidity variables and identify
aggregated conditions. We used the hclust function in R with the
dissimilarity matrix defined by the Kendall distance, a rank-based
measure chosen because the variables were not assumed to follow a
parametric distribution (Figure 2) [13]. The pre-specified
dichotomous variables (COPD, dyslipidemia, liver disease, and dementia)
were assigned a value of one when a given comorbidity was
present and zero when it was absent. Categorical variables, such as
stroke and arrhythmia, took their values depending on their respective
categories. In the case of stroke, the following values were assigned:
absent = 0, transient ischemic attack = 1, hemorrhagic stroke = 2,
cardioembolic stroke = 3, and atherothrombotic stroke = 4. In the case of
arrhythmia, they were: sinus rhythm = 0, atrial fibrillation or flutter
(AF/flutter) = 1, atrioventricular block = 2, and other = 3. Finally,
the pre-specified quantitative variables (BMI, eGFR, LVEF, hemoglobin,
and SBP) retained their numerical values.
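The coding scheme and the rank-based dissimilarity described above can be illustrated with a minimal sketch. It is written in Python using only the standard library and is not the R code used in the study. The toy data, the function names, and the choice of 1 - tau-b as the "Kendall distance" are assumptions made for illustration; the text does not state which exact definition was used.

```python
from itertools import combinations
from math import sqrt

# Hypothetical toy data: one list per variable, one entry per patient.
# Dichotomous variables are coded 0/1, arrhythmia is coded 0-3 as in the
# text, and quantitative variables keep their numerical value.
variables = {
    "copd":       [0, 1, 0, 1, 0, 1],
    "arrhythmia": [0, 1, 1, 2, 0, 3],
    "bmi":        [24.1, 31.0, 27.5, 33.2, 22.8, 29.9],
}

def kendall_tau_b(x, y):
    """Kendall's tau-b rank correlation (the variant that corrects for ties)."""
    concordant = discordant = ties_x = ties_y = 0
    for i, j in combinations(range(len(x)), 2):
        dx, dy = x[i] - x[j], y[i] - y[j]
        if dx == 0 and dy == 0:
            continue              # tied in both variables: not counted
        elif dx == 0:
            ties_x += 1           # tied in x only
        elif dy == 0:
            ties_y += 1           # tied in y only
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    untied = concordant + discordant
    denom = sqrt((untied + ties_x) * (untied + ties_y))
    return (concordant - discordant) / denom if denom else 0.0

def kendall_distance_matrix(data):
    """Pairwise dissimilarity between variables, assumed here to be 1 - tau-b."""
    names = list(data)
    dist = {a: {b: 0.0 for b in names} for a in names}
    for a, b in combinations(names, 2):
        dist[a][b] = dist[b][a] = 1.0 - kendall_tau_b(data[a], data[b])
    return dist

dist = kendall_distance_matrix(variables)
```

A matrix of this kind is what an agglomerative procedure such as Ward's method would take as input; the clustering step itself is not reproduced here.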
Bootstrap resampling (1,000 replicates) was used to assess the
reproducibility of each hierarchical cluster, using the pvclust
function in R [14]. We computed the bootstrap probability (BP) value,
which corresponds to the frequency with which a cluster is identified
across bootstrap replicates, and the approximately unbiased (AU)
probability value, obtained by multiscale bootstrap resampling
(Figure 2). Clusters with AU ≥ 95% were considered to be strongly
supported by the data.
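The definition of the BP value given above can be made concrete with a short sketch, again in standard-library Python rather than the pvclust code used in the study. The function names and the example clusters are hypothetical. The AU value requires fitting a model across several resampling scales and is not shown.

```python
import random

def resample_patients(n_patients, rng):
    """Draw patient indices with replacement (one ordinary bootstrap sample)."""
    return [rng.randrange(n_patients) for _ in range(n_patients)]

def bootstrap_probability(cluster, replicate_clusterings):
    """Fraction of bootstrap replicates in which `cluster` reappears exactly."""
    target = frozenset(cluster)
    hits = sum(
        1 for clusters in replicate_clusterings
        if target in {frozenset(c) for c in clusters}
    )
    return hits / len(replicate_clusterings)

# Hypothetical clusters of variables recovered in four bootstrap replicates.
replicates = [
    [{"copd", "bmi"}, {"stroke", "dementia"}],
    [{"copd", "bmi"}, {"stroke", "egfr"}],
    [{"copd", "sbp"}, {"stroke", "dementia"}],
    [{"copd", "bmi"}, {"stroke", "dementia"}],
]

# The cluster {copd, bmi} appears in 3 of the 4 replicates.
bp = bootstrap_probability({"copd", "bmi"}, replicates)

# Example of one resampled index set for a cohort of 10 patients.
sample = resample_patients(10, random.Random(0))
```

In a full analysis each resampled data set would be clustered again before its clusters are compared with the original dendrogram.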
Once the clusters were built, we performed univariate comparisons
between them. Quantitative variables were expressed as mean ± standard
deviation if normally distributed, and as median and interquartile range
otherwise. Normally distributed variables were compared across clusters
by one-way analysis of variance followed by Tukey’s post hoc test for
multiple comparisons; variables that were not normally distributed were
compared with the Kruskal-Wallis test. Qualitative variables were
expressed as absolute numbers and percentages. Study groups were
compared using the chi-squared test.
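As an illustration of the comparison of qualitative variables, the Pearson chi-squared statistic for a contingency table can be sketched as follows. This is standard-library Python, not the SPSS or R procedure used in the study, and the counts are invented for the example.

```python
def chi_squared(table):
    """Pearson chi-squared statistic and degrees of freedom for an r x c table.

    `table` is a list of rows of observed counts, for example one row per
    cluster and one column per category of the qualitative variable.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof

# Hypothetical counts: rows are two clusters, columns are comorbidity
# present / absent.
observed = [[10, 20],
            [20, 10]]
statistic, dof = chi_squared(observed)
```

The p value is then read from the chi-squared distribution with `dof` degrees of freedom, a step that statistical packages perform automatically.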
Finally, a Cox proportional-hazards model was used to examine the
association between the clusters and time to hospitalization and death.
The model covariates were selected a priori based on previous prognostic
reports and clinical experience; variables that were significant in
the initial univariate comparisons were also included. Cumulative curves
were estimated by the Kaplan-Meier method and compared with the log-rank
test. A p value < 0.05 was considered statistically significant.
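The Kaplan-Meier estimate mentioned above can be illustrated with a minimal product-limit sketch in standard-library Python. It is not the software used in the study, the follow-up times are invented, and the Cox model and log-rank test are not reproduced.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    `times` are follow-up times; `events` are 1 when the event (for example
    hospitalization or death) was observed and 0 when the patient was
    censored. Returns a list of (time, survival) pairs, one per distinct
    time at which at least one event occurred.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = n_leaving = 0
        # Group all patients whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving  # events and censored patients leave the risk set
    return curve

# Hypothetical follow-up in months; the second patient is censored.
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
```

A cumulative event curve is obtained as one minus the survival value at each time point.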
Analyses were performed using SPSS and R.
RESULTS
A total of 1,934 patients were analyzed: 907 had T2DM (39.1% men, mean
age 78.4 ± 7.6 years) and 1,027 did not (39.9% men, mean age 81.4 ±
7.6 years). The most prevalent comorbidities were AF/flutter (67.4%),
dyslipidemia (52.4%), and COPD (24.9%). The similarity matrix and
significance by variable in the clusters are shown in Figure 2.