Methods

We now explain how we estimate the outbreak rate and why we include certain contextual factors; Table 1 summarizes the data sources.

Outbreak rate

We obtain COVID-19 outbreak data from USA Facts, as of April 14, 2020.1 Since January 22, this database has aggregated data from the CDC and other public health agencies. The 21 cases on the Grand Princess cruise ship are not attributed to any counties in California. We discard cases allocated only at the state level due to a lack of information. On average, these amount to only 308 cases per state, but a few states have as many as 4866 (New Jersey), 1300 (both Rhode Island and Georgia), or 1216 (Washington State). Following approaches by the Institute for Health Metrics and Evaluation at the University of Washington8 and the COVID-19 Modeling Consortium at the University of Texas at Austin,9 we model the outbreak using the exponential growth equation \(\frac{dy}{dt} = b\,y\), where \(b\) is a positive constant called the relative growth rate, with units of inverse time. Going forward, we simply refer to \(b\) as the outbreak rate. The shape of the trends in case counts enables us to see differences between counties.10 Solutions to this differential equation have the form \(y = a\,e^{bt}\), where \(a\) is the initial value of cases \(y\). The doubling time \(T_{d}\) can be calculated as \(T_{d} = \frac{\ln(2)}{b}\). We estimate the outbreak rate for the 1987 of 3142 counties in the 50 U.S. states that have a minimum of 10 reported cases. This is a statistical model, not an epidemiological one; that is, we are neither trying to model infection transmission nor estimate epidemiological parameters, such as the pathogen’s reproductive or attack rate. Instead, we are fitting curves to observed outbreak data at the county level.
A change-point analysis using the Fisher discriminant ratio as a kernel function does not show any significant change points in the outbreak, and therefore justifies modeling the COVID-19 outbreak as a phenomenon of unrestricted population growth.11 We cannot forecast outbreak dynamics with this statistical approach, though we do not require extrapolated data in our work.
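As an illustration of the curve fitting described above, the following minimal sketch estimates the outbreak rate b and the initial value a for a single county by ordinary least squares on the log-transformed case counts, then derives the doubling time from b. The log-linear fitting procedure, the function names, and the synthetic case series are assumptions made for this example; the text does not specify the fitting method actually used.

```python
import math


def fit_outbreak_rate(cases):
    """Fit y = a * exp(b * t) by least squares on ln(y) = ln(a) + b * t.

    `cases` is a list of cumulative case counts, one per day (t = 0, 1, ...).
    Days with zero cases are skipped because ln(0) is undefined.
    Returns the pair (a, b). This is an assumed procedure, not the paper's.
    """
    points = [(t, math.log(y)) for t, y in enumerate(cases) if y > 0]
    n = len(points)
    if n < 2:
        raise ValueError("need at least two days with positive case counts")
    mean_t = sum(t for t, _ in points) / n
    mean_ln_y = sum(v for _, v in points) / n
    s_tt = sum((t - mean_t) ** 2 for t, _ in points)
    s_ty = sum((t - mean_t) * (v - mean_ln_y) for t, v in points)
    b = s_ty / s_tt                         # slope: the outbreak rate
    a = math.exp(mean_ln_y - b * mean_t)    # intercept: the initial value
    return a, b


def doubling_time(b):
    """Doubling time T_d = ln(2) / b for a positive outbreak rate b."""
    return math.log(2) / b


# Hypothetical county: 10 initial cases growing at b = 0.2 per day.
synthetic_cases = [10 * math.exp(0.2 * t) for t in range(15)]
a_hat, b_hat = fit_outbreak_rate(synthetic_cases)
print(f"a = {a_hat:.2f}, b = {b_hat:.3f}, T_d = {doubling_time(b_hat):.2f} days")
```

Because the synthetic series is exactly exponential, the fit recovers the rate used to generate it; real county data are noisy, so the estimate of b would depend on the fitting window and on how early, low-count days are handled.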

Cultural values

Culture can be defined as a set of values that are shared in a given social group. While cultural values are often used to distinguish countries,12 more than 80% of cultural variation resides within countries.13 The original North American colonies were settled by people hailing from various countries, who have spread their influence across mutually exclusive areas. Their distinct cultures are still with us today.6 Although today’s U.S. states are not strictly synonymous with these cultural areas, there is abundant evidence that political boundaries can serve as useful proxies for culture.14
One of the most useful constructs to emerge from cultural social psychology is the individualism-collectivism bipolarity. It has proven useful in describing cultural variations in behaviors, attitudes, and values. Briefly, individualism is a preference for a loosely knit social framework, whereas collectivism represents a preference for a tightly knit framework in which members are interdependent and expected to look after each other in exchange for unquestioning loyalty. While the majority of research on collectivism involves comparing countries,12 we use an index developed at the state level solely within the U.S.5 Previous studies have shown that the regional prevalence of pathogens and international differences in the COVID-19 outbreak are positively associated with collectivism.14,15

Institutional confounders

In addition to culture, we include various institutional confounders at the state level, such as the political affiliation of a state’s governor, the gender of the governor, and government spending per capita. Government plays a critical role in policy development and implementation, and so state-level differences could influence the outbreak rate.16

Racial composition

While the first systematic reviews of COVID-19 incidence from China relied on ethnically homogeneous cohorts,17,18 ethnically diverse populations, such as those in the U.K. and U.S., may exhibit different susceptibility or response to infection because of socioeconomic, cultural, or lifestyle factors, genetic predisposition, and pathophysiological differences. Certain vitamin or mineral deficiencies, differences in insulin resistance, or vaccination policies in countries of birth may also be contributing factors.18 We include variables measuring the racial and ethnic composition of U.S. counties.

Income and education

Poverty is arguably the greatest risk factor for acquiring and succumbing to disease worldwide, but has historically received less attention from the medical community than genetic or environmental factors. The global HIV crisis brought into sharp relief the vulnerability of financially strapped health systems, and revealed disparities in health outcomes along economic fault lines.19 We include the median household income to quantify potential economic disparities between U.S. counties. In addition, we measure non-proficiency in English and the math performance of students. Lower educational levels may reduce the ability to understand and respond effectively to the pandemic.

Other demographics

Age and gender also play a potential role in a population’s susceptibility. During the aging process, immune functions decline, rendering the host more vulnerable to certain viruses.20 We use the percentage of the population below 18 years of age and the population’s median age to determine the potential effect of differences in mobility, response, and lifestyle factors. We also control for the percentage of the population that is female, as one COVID-19 study in Italy showed that about 82% of critically ill people admitted into intensive care were men.21

Personal health

Good overall personal health is a general indicator of disease resistance. Additionally, the health belief model suggests that a person’s belief in a personal threat of a disease, together with faith in the effectiveness of behavioral recommendations, predicts the likelihood of the person adopting the recommendation.22 We use the percentage of the population that reports an insufficient amount of sleep, is obese (as defined by a body mass index above 30), and smokes daily. Given that the latter two are publicized risk factors for COVID-19, there is a potential for greater caution following the value-expectancy concepts of the health belief model. Yet, medicinal nicotine has been identified as a potential protective factor against infection by SARS-CoV-2.23 We also measure the preventable hospitalization rate (that is, the rate of hospital stays for ambulatory-care sensitive conditions) as a potential indicator of poor personal health, and the social association rate (that is, the average number of membership associations), which is generally connected with positive mental health and happiness.

External health

Previous studies suggest that exposure to pollution can suppress immune responses and promote the transmission of infectious diseases,24 and that the COVID-19 mortality rate is associated with air pollution.25 However, the impact of air pollution on the spread of COVID-19 is not yet known.24 We use the 2014 average daily density of fine particulate matter (PM2.5) to measure air pollution across U.S. counties, and the percentage of the population living in rural areas to account for physical distancing being more prevalent in rural areas. In addition, the food environment index reflects access to grocery stores and healthy foods.

Other confounders

Population density and overcrowding are significant when considering public health crises, facilitating the spread of diseases in developing and developed countries alike.26 As the climate is another highly publicized confounder potentially influencing the COVID-19 transmission rate,27 we also include each county’s average temperature during February and March 2020. To control for the temporality of the outbreak, we include a variable representing the number of days between January 1 and the date on which the 10th confirmed case was reported.
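The temporality variable can be computed directly from a county's daily cumulative case series. The sketch below is one possible reading of that definition: it assumes the series starts on a known calendar date (January 22, 2020 in the example, matching the start of the database) and counts the days from January 1, 2020 to the first day on which the cumulative count reaches 10. The function name, parameters, and example counts are hypothetical.

```python
from datetime import date, timedelta


def days_to_tenth_case(series_start, cumulative_cases,
                       threshold=10, origin=date(2020, 1, 1)):
    """Days from `origin` to the first day with at least `threshold` cases.

    `series_start` is the calendar date of the first entry in
    `cumulative_cases`, which holds one cumulative count per day.
    Returns None if the threshold is never reached.
    """
    for offset, count in enumerate(cumulative_cases):
        if count >= threshold:
            reached_on = series_start + timedelta(days=offset)
            return (reached_on - origin).days
    return None


# Hypothetical county whose series begins on January 22, 2020.
example_counts = [0, 0, 1, 3, 7, 12, 20]
print(days_to_tenth_case(date(2020, 1, 22), example_counts))
```

Counties that never reach the threshold return None here, which mirrors their exclusion from the set of counties for which an outbreak rate is estimated.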