The coronavirus disease 2019 (COVID-19) spread rapidly in Italy from
March 2020, when the epidemic had been brought under control in China.
The reasons for the rapid outbreak and the overall case-fatality rate in
Italy have been studied and reported in the literature [1, 2, 3]. Clear
differences in epidemic spread and fatality rates exist among regions,
but the factors related to these spatial differences are unclear. It is
of interest to study this regional heterogeneity and the related factors.
Global COVID-19 data have been integrated by researchers and are
publicly available through the R package nCov2019 [4]. We downloaded and
extracted the data for Italy by region for our study. As of May 15,
2020, Lombardy ranks first among the 20 regions with 83820 cumulative
confirmed cases, while Basilicata has the fewest cumulative confirmed
cases (389). The number of deaths ranges from 22 in Molise to 15296 in
Lombardy. Demographic data by region for 2019, including population,
area, population density and human development index (HDI), were
downloaded from https://en.wikipedia.org/wiki/Regions_of_Italy.
The case rate (the proportion of confirmed cases in the regional
population) ranges from 0.0006 to 0.009 with a median of 0.0025, while
the death rate (the proportion of deaths in the regional population)
ranges from 0.00005 to 0.00152 with a median of 0.00026. HDI [5] is a
composite index of a long and healthy life, education and standard of
living, measured by life expectancy, expected and mean years of
schooling, and gross national income per capita, respectively. The
median HDI is 0.891, with a range from 0.845 to 0.919.
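As a small illustration of the two rates defined above, the sketch below
divides the Lombardy counts quoted in the text by a regional population.
The population figure is an approximate value assumed here for
illustration only; it is not taken from the study data, and the helper
name `rate` is hypothetical.

```python
def rate(count, population):
    """Proportion of the regional population counted in `count`."""
    return count / population

# Counts for Lombardy as of May 15, 2020, as quoted in the text.
lombardy_cases = 83820
lombardy_deaths = 15296

# ASSUMPTION: approximate 2019 population of Lombardy (about 10.06 million),
# used only to illustrate the arithmetic.
lombardy_population = 10_060_000

case_rate = rate(lombardy_cases, lombardy_population)
death_rate = rate(lombardy_deaths, lombardy_population)
print(f"case rate  = {case_rate:.5f}")
print(f"death rate = {death_rate:.5f}")
```

Under this assumed population, the death rate is close to the upper end
of the range reported above.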
It is reasonable to assume that people in the same region are
independent and share the same probability of being infected and
confirmed. Under this assumption, we performed a univariate logistic
regression of the cumulative confirmed cases on HDI. We found that HDI
is statistically significant (estimated coefficient on the log-odds
scale = 28.6476, p-value < 2 × 10^-16). If HDI increases by 0.1, the
odds of a confirmed case (that is, the probability that a person is a
confirmed case divided by the probability that a person is not a
confirmed case) are multiplied by exp(2.8648) = 17.5448.
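The grouped logistic regression described above can be sketched as
follows. This is a minimal illustration, not the code used in the study:
the function name `fit_binomial_logit`, the five regions, their
populations and the "true" coefficients are all hypothetical, and the
fit uses a plain Newton iteration in place of a statistics library.
Each region contributes a binomial count (cases out of population) with
logit(p) = b0 + b1 * HDI.

```python
import math

def fit_binomial_logit(x, successes, trials, n_iter=50, tol=1e-10):
    """Fit logit(p_i) = b0 + b1 * x_i to grouped binomial counts
    by Newton's method; returns (b0, b1)."""
    x_mean = sum(x) / len(x)
    xc = [xi - x_mean for xi in x]  # centre x for numerical stability
    p_bar = sum(successes) / sum(trials)
    a, b1 = math.log(p_bar / (1 - p_bar)), 0.0  # a: intercept on centred scale
    for _ in range(n_iter):
        p = [1 / (1 + math.exp(-(a + b1 * xi))) for xi in xc]
        # Score vector: gradient of the binomial log-likelihood.
        r = [y - n * pi for y, n, pi in zip(successes, trials, p)]
        g0 = sum(r)
        g1 = sum(xi * ri for xi, ri in zip(xc, r))
        # Fisher information: negative Hessian of the log-likelihood.
        w = [n * pi * (1 - pi) for n, pi in zip(trials, p)]
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, xc))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, xc))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system explicitly.
        da = (h11 * g0 - h01 * g1) / det
        db = (h00 * g1 - h01 * g0) / det
        a, b1 = a + da, b1 + db
        if abs(da) < tol and abs(db) < tol:
            break
    return a - b1 * x_mean, b1  # convert intercept back to the original scale

# HYPOTHETICAL data: five made-up regions, not the Italian regional data.
hdi = [0.85, 0.87, 0.89, 0.90, 0.92]
population = [500_000, 1_000_000, 2_000_000, 4_000_000, 8_000_000]
true_b0, true_b1 = -30.0, 27.0  # assumed coefficients used to simulate counts
cases = [round(n / (1 + math.exp(-(true_b0 + true_b1 * h))))
         for h, n in zip(hdi, population)]

b0, b1 = fit_binomial_logit(hdi, cases, population)
print(f"b0 = {b0:.3f}, b1 = {b1:.3f}")
```

Because the synthetic counts are generated from known coefficients, the
fitted slope should land close to the assumed value, which gives a quick
check that the procedure behaves as intended.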
Many studies have examined the case-fatality rate, defined as the
proportion of deaths among the confirmed cases. However, not all
infected people are diagnosed and counted among the confirmed cases, so
we consider deaths relative to the regional population instead. It is
natural to assume that people in the same region have the same
probability of being infected and dying from COVID-19, while this
probability differs among regions. We also performed a univariate
logistic regression of the cumulative deaths on HDI. HDI is again
significant (estimated coefficient on the log-odds scale = 36.7946,
p-value < 2 × 10^-16). An increase of 0.1 in HDI is associated with the
odds of death being multiplied by exp(3.6795) = 39.6230.
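The conversion from a logistic regression coefficient to an odds
multiplier can be checked with a few lines of arithmetic. The sketch
below uses the two coefficients reported in the text; the helper name
`odds_multiplier` is hypothetical. Since the observed HDI values span
only 0.845 to 0.919, a step of 0.01 is also shown as a possibly more
interpretable scale.

```python
import math

def odds_multiplier(coef, delta):
    """Factor by which the odds change when the predictor rises by `delta`,
    given a coefficient `coef` on the log-odds scale."""
    return math.exp(coef * delta)

# Coefficients as reported in the text.
coef_cases = 28.6476
coef_deaths = 36.7946

case_factor = odds_multiplier(coef_cases, 0.1)
death_factor = odds_multiplier(coef_deaths, 0.1)
print(f"0.1 increase in HDI:  cases x{case_factor:.3f}, deaths x{death_factor:.3f}")

# Same coefficients on a smaller step, closer to the observed HDI spread.
print(f"0.01 increase in HDI: cases x{odds_multiplier(coef_cases, 0.01):.3f}, "
      f"deaths x{odds_multiplier(coef_deaths, 0.01):.3f}")
```

The factors for a 0.1 step agree with the reported values up to rounding
of the coefficients.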
In summary, it is interesting to note that high HDI is associated with
a high case rate and a high fatality rate. This may be because more
elderly people and more professionals live in regions with higher HDI,
and because more business activities, including international business
trips, occur in those regions.