1. Introduction
Coronavirus disease 2019 (COVID-19), caused by the Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2), is a highly contagious disease that has caused widespread panic and concern across the globe. In the USA, COVID-19 was the third leading cause of death in 2020, and the death rate increased by 15.9% from 2019 to 2020 [
1]. As of September 2021, there had been 41 million confirmed cases and 660 thousand deaths due to COVID-19 in the USA [
1,
2,
3]. Additionally, COVID-19 has had a profound impact on social life and the economy, as business closures and social distancing have been common practices to slow the spread of the disease. The U.S. real GDP decreased by 3.5% in 2020 and was projected to lose at least $3.2 trillion due to COVID-19 over a two-year period [
4,
5].
The burdens of COVID-19 have not been borne equally. Some populations face increased risk for COVID-19 morbidity and mortality [
6]. Many studies have reported disparities in the clinical outcomes of patients with COVID-19. For example, studies using inpatient data found severe disease progression and poor clinical outcomes of COVID-19 patients to be associated with a set of underlying medical conditions (e.g., hypertension, diabetes, asthma, and heart, liver, and respiratory illnesses), demographics (e.g., male, older age, race/ethnic minority), and social determinants of health (SDOHs) (e.g., lower education and income) [
7,
8,
9,
10,
11]. A study of a large Louisiana cohort comprising 3,481 COVID-19 patients reported that 76.9% of the hospitalized cases and 70.6% of the deaths were among black patients, whereas only 31% of the state’s population is black [
12]. While these studies have provided a critical evidence base on disparities in COVID-19 clinical outcomes and on the implications for medical care, they offer limited insight into disparities in the risk of exposure to COVID-19, for the following reasons. First, because these studies are based on inpatient data, their findings apply to hospitalized patients but may not generalize to outpatients, individuals with mild symptoms, and asymptomatic individuals. Omitting outpatients and individuals with laboratory-confirmed COVID-19 infections but no clinic visits forfeits the opportunity to explore risk factors for these populations [
13]. Second, using disease severity as the outcome variable does not provide information on SARS-CoV-2 infection and transmission. For example, SARS-CoV-2 transmits more easily in regions with a large proportion of younger people, yet the elderly were found to be at a higher risk of developing poor clinical outcomes [
14].
Therefore, it is equally important to curate an evidence base for disparities in the risk of exposure to COVID-19 and the pre-infection determinants of risk (PIDRs) (e.g., demographics, socioeconomics, and prevalence of diseases related to COVID-19 infection) [
15,
16,
17,
18,
19,
20]. Such an evidence base can be used for understanding disease transmission patterns, identifying vulnerable populations, and proactively mitigating disparities in future pandemics [
21]. Existing studies have reported demographic and socioeconomic factors to be related to disparities in the risk of exposure to COVID-19. Different combinations of those determinants lead to different health attributes (e.g., health behaviors and physical conditions), thus influencing the spread of the virus. For example, high-deprivation areas have higher rates of hospitalization and testing [
17]. People with a higher income are more likely to engage in self-protecting behavior during the COVID-19 pandemic [
18]. Another study reported that wearing masks and practicing hand hygiene are associated with the female sex and a higher education level among students in the Chinese population [
19]. In a primary care cohort, researchers observed a higher risk of COVID-19 infection among people aged 40–64 years, of the male sex, of the black race, and living in urban areas [
15]. Incorporating census tract level data with the COVID-19 dataset, Hawkins and colleagues examined the association between socioeconomic indicators and COVID-19 cases at the county level across the USA and found a lower education level and a higher percentage of black residents to be risk factors for the infection [
16].
To further explore the associations between PIDRs and COVID-19 transmission, geospatial information is needed. Geographic differences exist across states, counties, and communities in the timing of the SARS-CoV-2 introduction, which are further characterized by population density, local policies, and population composition [
14]. In particular, understanding PIDRs and their geospatial epidemiology is urgently needed for rural states, such as South Carolina, that have a disproportionately low healthcare capacity and high disease burden. It may also provide timely information for post-COVID-19 care, given the emerging reports on the heterogeneity of symptoms in individuals with Post-Acute Sequelae of SARS-CoV-2 infection (PASC) [
22,
23]. Although the spatially dynamic nature of infectious diseases (e.g., different spatial patterns of transmission) makes geospatial analysis a valuable tool to unveil the epidemiology [
24,
25,
26,
27], there have been limited studies reporting the geospatial characteristics of PIDRs [
14,
28,
29,
30,
31]. Several studies have reported minority status, age, and other social vulnerabilities to be associated with a higher COVID-19 infection rate, yet spatial patterns were generally not included in the statistical models as independent variables [
28,
30,
31]. Fortaleza and colleagues used multivariate regression and found that population density and distance from the state capital are robust predictors of COVID-19 prevalence in Brazil [
29]. However, the results should be interpreted carefully since the association between population density and COVID-19 infection could be influenced by factors such as different policies being applied to smaller regions [
32]. Another study built a correlation matrix between socioeconomic determinants and COVID-19 case rates across the USA and found population density to be highly correlated with COVID-19 prevalence [
14].
Although the above studies have collectively suggested possible geospatial characteristics among the disparities in virus transmission, spatial autocorrelation is generally excluded from their statistical models, which limits the statistical power of the findings. Spatial autocorrelation is the correlation of a variable (e.g., COVID-19 incidence or a PIDR) with itself across neighboring regions, and it can be modeled with global or local approaches. Spatial global models assume a stationary correlation between a region and its neighbors, whereas spatial local models assume nonstationary correlations between a region and different neighbors. Among the few preliminary studies that adopted spatial autocorrelation, Mollalo and colleagues examined the association between the COVID-19 incidence rate and four county-level explanatory determinants (income inequality, median household income, the percentage of nurse practitioners, and the proportion of the black female population to the total female population) across the USA [
33]. The authors started with a set of 35 socioeconomic, behavioral, topographic, and demographic explanatory variables. After a stepwise forward procedure and correlation analysis, they chose to keep four of these variables in their final model and found that geographically weighted regression (GWR) models best explained the variations, suggesting the existence of spatial autocorrelation and different vulnerabilities across the counties. Despite the application of highly appropriate geospatial methods, the study could have better interpreted the disparity structure if demographic determinants such as age, sex, and race had been included in the analysis. Additionally, because these studies were based on analyses of cross-sectional data, they did not specify whether and how the observed relationships between COVID-19 outcomes and PIDRs varied at different points in time as the pandemic evolved. Moreover, there is increased endogeneity in these analyses because they focused on large geographic regions within which different regional policies might have a greater impact on COVID-19 prevalence than the explanatory variables. Existing evidence suggests that government responses and socioeconomic determinants have played an important role in the transmission of SARS-CoV-2, which differs geographically [
34]. Another similar study included demographics but still suffered from the same endogeneity problem [
35].
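The contrast between global (stationary) and local (nonstationary) models described above can be illustrated with a minimal sketch. It is not the implementation used in the cited studies: the data are synthetic, and the Gaussian kernel, the fixed bandwidth, and the function names `fit_wls` and `gwr_coefficients` are assumptions made only for illustration. A global model fits one coefficient vector for all regions, whereas a GWR-style model refits the regression at each location with weights that decay with distance.

```python
import numpy as np

def fit_wls(X, y, w):
    """Weighted least squares: solve (X^T W X) b = X^T W y."""
    Xw = X * w[:, None]  # scale each row of X by its weight
    return np.linalg.solve(X.T @ Xw, Xw.T @ y)

def gwr_coefficients(coords, X, y, bandwidth):
    """Fit one local regression per location using Gaussian distance weights."""
    coords = np.asarray(coords, dtype=float)
    betas = []
    for center in coords:
        dist = np.linalg.norm(coords - center, axis=1)
        w = np.exp(-0.5 * (dist / bandwidth) ** 2)  # nearer regions weigh more
        betas.append(fit_wls(X, y, w))
    return np.array(betas)

# Synthetic (hypothetical) data: 30 regions on a line whose true slope
# drifts from 1.0 to 3.9, i.e., a nonstationary relationship.
n = 30
idx = np.arange(n)
coords = np.column_stack([idx, np.zeros(n)])
x = (idx % 5) - 2.0                       # explanatory variable
y = (1.0 + 0.1 * idx) * x                 # outcome with a location-dependent slope
X = np.column_stack([np.ones(n), x])      # intercept plus one covariate

global_beta = fit_wls(X, y, np.ones(n))   # one slope for the whole study area
local_betas = gwr_coefficients(coords, X, y, bandwidth=2.0)

print("global slope:", global_beta[1])
print("local slope range:", local_betas[:, 1].min(), local_betas[:, 1].max())
```

In this toy setting the single global slope averages over the drift, while the local slopes recover the lower values at one end of the line and the higher values at the other. A narrow range of local coefficients would instead point to a largely stationary relationship.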
Building on these existing studies, we sought to assess the association between PIDRs (including demographics, socioeconomics, and prevalence of diseases related to COVID-19 infection) and COVID-19 infection at the county level in South Carolina at different timepoints amid the pandemic. The heterogeneity in the virus spread in South Carolina suggests that different PIDRs in certain areas could enhance or inhibit the transmission of COVID-19. Within the smaller geographic scale of one state, the heterogeneous impacts of different regional policies could be largely mitigated, and the multi-source South Carolina surveillance data were sufficient for conducting geospatial analyses. Although there has been no statewide mask mandate in South Carolina, regional mask ordinances covered most of the regions by July 2021 [
36]. The findings of this study form an evidence base for temporal geospatial disparities in the risk of exposure to COVID-19 and the associated PIDRs. The identified PIDRs may also shed light on the populations and regions vulnerable to PASC in South Carolina during post-COVID-19 care.
4. Discussion
In this geospatial study, we adopted the socioecological vulnerability index from Snyder and Parks and compiled 15 variables within four categories of the index which could potentially explain the geographic patterns of COVID-19 transmission in SC [
39]. Our study resulted in three principal findings. First, our study demonstrated the spatial autocorrelations of COVID-19 incidence at the county level in SC. The results from global models and local models were consistent with the initial observation of the distribution maps of covariates. Second, some PIDRs (e.g., male percentage, unemployment rate) had consistent spatial correlations with COVID-19 incidence over time while some other PIDRs (e.g., percentage of the white population, obesity rate) showed divergent spatial correlations at different times of the pandemic, suggesting a critical role of the temporal dimension in the geospatial epidemiology of COVID-19 transmission. Third, the geospatial effect of PIDRs was strong at the beginning of the pandemic and started to decline as the infection cases continued to surge, suggesting the importance of early identification of critical PIDRs and timely intervention for possible future outbreaks of infectious diseases.
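As a point of reference for the first finding, global spatial autocorrelation is commonly summarized with Moran's I. The sketch below is a minimal illustration under stated assumptions, not the procedure used in this study: the four-region adjacency matrix and the incidence values are hypothetical, and `morans_i` is an illustrative helper name.

```python
import numpy as np

def morans_i(values, weights):
    """Global Moran's I: (n / sum(w)) * sum_ij w_ij z_i z_j / sum_i z_i^2."""
    x = np.asarray(values, dtype=float)
    w = np.asarray(weights, dtype=float)
    z = x - x.mean()                       # deviations from the mean
    cross = (w * np.outer(z, z)).sum()     # weighted cross-products of neighbors
    return (len(x) / w.sum()) * (cross / (z ** 2).sum())

# Hypothetical example: four regions in a row, each adjacent to the next.
w = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)

clustered = [1.0, 2.0, 3.0, 4.0]     # similar values sit next to each other
alternating = [1.0, 4.0, 1.0, 4.0]   # neighbors are dissimilar

print(morans_i(clustered, w))    # positive: spatial clustering
print(morans_i(alternating, w))  # negative: spatial dispersion
```

Values near zero suggest no spatial structure. In practice, significance is usually assessed with a permutation test, which this sketch omits.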
Aligned with existing studies [
28,
30], two PIDRs (male percentage and unemployment rate) were found to be significantly associated with a higher risk of COVID-19 infection in the global models (i.e., SEM, SLM, and CAR). The higher risk of COVID-19 infection among the male population can be explained by several sex-related factors [
43]. Genetically, males have a higher expression of angiotensin-converting enzyme-2 (ACE2), which could be the receptor for SARS-CoV-2 [
44,
45]. The immunological response to SARS-CoV-2 may also differ between males and females [
46,
47]. In addition, females have been found to show more responsible attitudes and health behaviors towards COVID-19 than males [
19,
48]. A higher unemployment rate reflects a higher socioeconomic vulnerability to COVID-19 infection. People with the ability to work from home are less likely to be infected because of higher job security [
49,
50]. Interestingly, our results are different from an existing study from Johnson et al. [
51]. They found unemployment to be a protective factor against COVID-19 infection and argued that it might be related to the lack of transportation among the unemployed. The role of unemployment in COVID-19 transmission needs further investigation.
We found that the percentage of the white population was not statistically correlated with COVID-19 incidence from July to October and became positively correlated with COVID-19 incidence (all p < 0.01 for SEM, SLM, and CAR) in December. To the best of our knowledge, this finding has not been previously reported. We suspect that this finding is related to the fact that the COVID-19 incidence rate was higher in large metropolitan areas (e.g., urban, suburban) early on in the pandemic (i.e., March–May 2020) and later diffused to small and nonmetropolitan areas, where the proportions of white people are higher [
31]. Among the 26 counties that are classified as metropolitan areas in South Carolina, only three have a white population of less than 50%, and five have a white population of less than 60% [
52,
53]. Previous studies found that racial minorities had a higher risk of COVID-19 infection [
28,
30,
33], but these findings have not been tested or interpreted along the temporal dimension of the pandemic. Cunningham and Wigfall reported that racial attitudes towards COVID-19 had a significant impact on the likelihood of infection and mitigated the effect of racial difference, which could also explain our finding [
54]. In addition, our result could be related to the finding that a higher proportion of white people took COVID-19 tests than other races in the later months [
55]. Median age, college degree rate, obesity rate, uninsured rate, and NP abundance were not statistically correlated with the COVID-19 infection rate.
Our findings suggest that early measures could be related to the transmission of COVID-19 since the geographic differences in COVID-19 infection decreased over time, as indicated by the decreasing AIC values across models longitudinally (Table 6). The decrease in the AICs of the local model (i.e., the GWR model) over time indicated the persistence of the nonstationary spatial autocorrelation. Although the GWR models have lower AIC values than the global models, the coefficients of the variables in the GWR models did not vary substantially, indicating small nonstationary effects. The narrow geographic ranges of the coefficients could be related to the insufficient granularity of county-level data for a study area the size of South Carolina. Nevertheless, it is notable that the regional variances decreased over time within the study time frame.
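For readers less familiar with the criterion, the Akaike information criterion (AIC) is defined as 2k − 2 ln L, where k is the number of estimated parameters and L is the maximized likelihood; lower values indicate a better trade-off between fit and complexity. The sketch below assumes a Gaussian likelihood and uses made-up residuals for two hypothetical models; it does not reproduce the values in Table 6, and the helper names are illustrative.

```python
import math

def gaussian_log_likelihood(residuals):
    """Maximized Gaussian log-likelihood with the variance estimated as RSS / n."""
    n = len(residuals)
    sigma2 = sum(r * r for r in residuals) / n
    return -0.5 * n * (math.log(2 * math.pi * sigma2) + 1)

def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood

# Hypothetical residuals from two models fitted to the same outcome data.
residuals_a = [0.9, -1.1, 1.2, -0.8, 1.0, -1.0]   # looser fit
residuals_b = [0.3, -0.2, 0.4, -0.3, 0.2, -0.4]   # tighter fit

aic_a = aic(gaussian_log_likelihood(residuals_a), n_params=3)
aic_b = aic(gaussian_log_likelihood(residuals_b), n_params=3)
print(aic_a, aic_b)  # the tighter-fitting model has the lower AIC
```

AIC values are most directly comparable among models fitted to the same outcome data, which is the setting assumed in this sketch.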
This study is among the first to examine geospatial patterns in COVID-19 infection together with PIDRs. Most studies have focused on COVID-19 patients with different levels of disease severity, which limits opportunities for examining possible disparities and PIDRs in COVID-19 infection [
56]. For example, older adults, people with certain medical conditions, and pregnant women were found to be at a higher risk of severe COVID-19 illness, while our study found that the male population percentage and unemployment rate were risk factors for COVID-19 infection [
56]. Intuitively, the PIDR set for severe illnesses of COVID-19 is related to the physical condition of patients and the PIDR set for COVID-19 infection is jointly influenced by demographic and socioeconomic factors. Compared with PIDRs for severe illnesses, PIDRs for infection are highly sensitive to geographic regions and temporal dynamics of the pandemic because the transmission of COVID-19 is related to the activity of people. PIDRs for COVID-19 infection provide important information for developing interventions on targeted populations who share the same PIDRs at the beginning of the pandemic, which is imperative for containing the early-stage transmission and potential consequences in future infectious disease outbreaks.
Our study has several limitations. First, we did not use longitudinal measures of PIDRs due to limited surveillance data. Second, we used reported cases as a measure of COVID-19 prevalence. This measure could be potentially biased because testing rate and test positivity were not considered due to unavailable surveillance data. For example, data for COVID-19 testing rates for each race were not available for examining the racial differences [
57,
58,
59]. Third, due to the limited data access, we used county-level data in this study whereas using zip code-level data would have offered a better granularity of data in the statistical models. Fourth, mobility patterns have been identified as an important factor for COVID-19 transmission, which is not accounted for due to the limited data availability [
60]. Fifth, our methodology does not include the contrast between restrictions and temporal spatial patterns. Thus, implications drawn from temporal patterns should be interpreted with caution. Finally, the variables used in this study may not be exhaustive in terms of all possible contributing factors of COVID-19 infection, as this work is based on the framework from Snyder and Parks. Future studies could integrate variables such as the Social Vulnerability Index (SVI) for exploring the negative effects in communities towards hazardous events [
61,
62,
63].