Discussion of results

In the absence of national-level data controlled for location and disaggregated by race and ethnicity, demographics, comorbidity, and other personal health variables, an ecological analysis provides an alternative way of measuring the disproportionate impact of COVID-19 across the U.S. and among segments of Americans. It may seem counterintuitive that the outbreak rate of a new pathogen that can infect virtually anyone shows contextual disparities. However, regional health disparities have been reported before for other conditions, such as HIV and cancer;33,34 and with the current study we show that contextual factors in the U.S. are also associated with variation in COVID-19 cases.
Our analysis indicates that higher outbreak rates can be found in U.S. states characterized by a higher cultural value of collectivism (coefficient 0.998, confidence interval [0.351; 1.645], p = 0.004). As Table 2 shows, collectivistic values are more prevalent in counties that are warmer (correlation with temperature 0.715, p < 0.001) and that have a higher percentage of Black/African American residents (correlation 0.539, p < 0.001). This mirrors findings from international cultural research.12 Conversely, we find no statistical evidence that government spending, the gender of the governor, or the party in control is linked to the outbreak. This finding is at odds with claims circulated in the popular media.
A disproportionately stronger outbreak of COVID-19 cases can be found in counties with a higher percentage of Black/African (1.158, [0.725; 1.591], p < 0.001) and Asian Americans (1.305, [0.166; 2.444], p = 0.025), which is consistent with prior infection and mortality studies in the U.S. and U.K.18,35 The former counties are also characterized by a higher rate of sleep deprivation (0.568, p < 0.001) and warmer temperatures (0.533, p < 0.001). The latter have a higher population density (0.553, p < 0.001). While we found sleep deprivation to be associated with a higher outbreak rate (1.557, [0.412; 2.702], p = 0.008), the positive associations with population density (0.050, [-0.009; 0.109], p = 0.095) and temperature (0.301, [-0.518; 1.120], p = 0.472) are only directionally informative and not statistically significant. In the first robustness test, higher average temperatures are positively and significantly related to the outbreak (1.027, [0.235; 1.861], p = 0.011), potentially because more time is spent indoors with air conditioning.
Conversely, counties with more Hispanic Americans are less affected by the pandemic, with borderline statistical significance (-0.447, [-0.915; 0.021], p = 0.061). However, we could not find a significant effect for counties with a higher Native American (0.763, [-0.209; 1.735], p = 0.124) or Hawaiian population (1.478, [0.506; 2.450], p = 0.538).
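As a reading aid for the estimates reported above, the following minimal sketch shows how a coefficient and its 95% confidence interval relate to statistical significance. It assumes a symmetric interval and a normal approximation, which is not necessarily the procedure behind the reported p-values; the function name `summarize` and its output keys are hypothetical and chosen only for illustration.

```python
from math import erfc, sqrt

# Two-sided 95% critical value of the standard normal distribution.
Z_95 = 1.959963984540054


def summarize(coef, ci_low, ci_high):
    """Approximate the standard error, z statistic, and two-sided p-value
    from a coefficient and its symmetric 95% confidence interval.

    Assumption: normal approximation; the original analysis may have
    used a different reference distribution.
    """
    se = (ci_high - ci_low) / (2 * Z_95)   # half-width divided by critical value
    z = coef / se
    p_approx = erfc(abs(z) / sqrt(2))      # two-sided normal tail probability
    ci_excludes_zero = ci_low > 0 or ci_high < 0
    return {
        "se": se,
        "z": z,
        "p_approx": p_approx,
        "ci_excludes_zero": ci_excludes_zero,
    }


# Values copied from the text above.
sleep = summarize(1.557, 0.412, 2.702)      # sleep deprivation
density = summarize(0.050, -0.009, 0.109)   # population density

for name, result in [("sleep deprivation", sleep), ("population density", density)]:
    print(f"{name}: z = {result['z']:.2f}, "
          f"approx. p = {result['p_approx']:.3f}, "
          f"CI excludes zero: {result['ci_excludes_zero']}")
```

Under these assumptions, the interval for sleep deprivation excludes zero and the approximate p-value falls below 0.05, whereas the interval for population density includes zero and the approximate p-value stays above 0.05. This is what "directionally informative, but not statistically significant" means for the latter estimate.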
We see that higher income and education levels are associated with a less aggressive outbreak (household income: -3.854, [-7.437; -0.271], p = 0.035; nonproficiency in English: 2.090, [0.547; 3.633], p = 0.008; math grade: -0.002, [-0.004; 0.000], p < 0.001). In counties with a higher household income, the obesity rate and the percentage of smokers tend to be lower (-0.518, p < 0.001 and -0.666, p < 0.001, respectively). Both are negatively associated with the outbreak rate. The effect of the obesity rate is highly significant (-1.093, [-1.828; -0.358], p = 0.004), but the effect of the percentage of smokers is only directionally informative (-0.784, [-3.150; 1.582], p = 0.516). Studies report that people with obesity are at increased risk of developing severe COVID-19 symptoms,36 but, to the best of our knowledge, a link to the infection rate has not yet been established. A potential explanation is that people with obesity heed the warnings issued by the CDC and are especially careful to avoid social contact, in line with the value expectancy concepts of the health belief model.22 Other studies report that smoking or medicinal nicotine might be a protective factor against infection by SARS-CoV-2;23 our ecological data do not contradict this finding. Other variables related to personal health are also associated with the outbreak rate: more social associations with a slower outbreak (-2.027, [-2.911; -1.143], p < 0.001) and more sleep deprivation with a faster one (1.557, [0.412; 2.702], p = 0.008), whereas the association with preventable hospitalization is not statistically significant (0.001, [-0.001; 0.003], p = 0.207).
Regarding age-related demographics, we confirm early observations that counties with an older population are more affected by the outbreak, with borderline significance (median age: 0.657, [-0.033; 1.347], p = 0.062). Notably, the percentage of persons under 18 years is positively associated with the outbreak rate, again with borderline significance (1.066, [-0.014; 2.146], p = 0.053). A possible reason is that younger people interact physically with their friends more often, at closer range, and for longer, thus contributing to the spread of the virus. Conversely, we find no effect of differences in gender (0.167, [-0.880; 1.214], p = 0.755). None of these demographic variables is strongly correlated with any other variable.
Air pollution is significantly associated with a stronger outbreak (3.329, [1.465; 5.193], p < 0.001), and, concurrently, counties with a rural environment experience a slower outbreak (-0.443, [-0.574; -0.312], p < 0.001). This calls for studies that link air pollution to COVID-19 lethality24,25 to include the outbreak rate as a potential confounding variable. By contrast, a better food environment is associated with a higher outbreak rate (5.996, [1.286; 10.706], p = 0.016). While the food environment index is usually associated with a healthier lifestyle, better access to grocery stores and supermarkets in the vicinity also means more interaction with other people, and thus an increased likelihood of transmission.
As a final point, we note that the associations we have presented between contextual factors and the COVID-19 outbreak are consistent with the deliberations leading to our research model. However, these associations, even when statistically significant, do not establish causality. Establishing causal inference is, of course, critical for our understanding of and fight against COVID-19, but this represents a direction for further research using more detailed data at the level of individual patients.