Full-Coverage PM2.5 Mapping and Variation Assessment during the Three-Year Blue-Sky Action Plan Based on a Daily Adaptive Modeling Approach

He, Weihuan; Zhang, Songlin; Meng, Huan; Han, Jie; Zhou, Gaohui; Song, Hongquan; Zhou, Shenghui; Zheng, Hui

doi:10.3390/rs14153571

by Weihuan He ¹, Songlin Zhang ¹, Huan Meng ²,³, Jie Han ¹, Gaohui Zhou ¹, Hongquan Song ²,³, Shenghui Zhou ²,⁴ and Hui Zheng ²,³,⁴,*

¹ College of Surveying and Geo-Informatics, Tongji University, Shanghai 200092, China

² Key Laboratory of Geospatial Technology for Middle and Lower Yellow River Regions, Ministry of Education, College of Environment and Planning, Henan University, Kaifeng 475004, China

³ Henan Key Laboratory of Earth System Observation and Modeling, Henan University, Kaifeng 475004, China

⁴ Henan Key Laboratory of Integrated Air Pollution Control and Ecological Security, Kaifeng 475004, China

* Author to whom correspondence should be addressed.

Remote Sens. 2022, 14(15), 3571; https://doi.org/10.3390/rs14153571

Submission received: 7 June 2022 / Revised: 21 July 2022 / Accepted: 22 July 2022 / Published: 25 July 2022

Abstract: Owing to a series of air pollution prevention and control policies, China’s PM_2.5 pollution has greatly improved; however, long-term, spatially contiguous products that support analysis of the distribution and variation of PM_2.5 pollution remain insufficient. Because aerosol optical depth (AOD) products contain missing values, reconstructing full-coverage PM_2.5 concentrations remains challenging. In this study, we present a two-stage daily adaptive modeling framework, based on machine learning, to solve this problem. Annual models were built in the first stage; daily models, which incorporate a parameter and feature adaptive tuning strategy, were then constructed in the second stage from the output of the annual models. PM_2.5 concentrations were adaptively modeled and reconstructed daily based on the multi-angle implementation of atmospheric correction (MAIAC) AOD products and other ancillary data, such as meteorological factors, population, and elevation. Model validation showed excellent performance, with an overall R² = 0.91 and RMSE = 9.91 μg/m³ for the daily models, and site-based cross-validation R² and RMSE values of 0.86–0.87 and 12–12.33 μg/m³, respectively; these results indicate the reliability and feasibility of the proposed approach. The daily full-coverage PM_2.5 concentrations at 1 km resolution across China during the Three-Year Blue-Sky Action Plan were reconstructed in this study. We analyzed the distribution and variations of the reconstructed PM_2.5 at three different time scales. Overall, national PM_2.5 pollution has significantly improved, with the annual average concentration dropping from 33.67 to 28.03 μg/m³, which demonstrates that air pollution control policies are effective and beneficial. However, some areas still have severe PM_2.5 pollution problems that cannot be ignored. In conclusion, the approach proposed in this study can accurately present daily full-coverage PM_2.5 concentrations, and the research outcomes could provide a reference for subsequent air pollution prevention and control decision-making.

Keywords: PM_2.5; full-coverage; aerosol optical depth; air pollution; spatiotemporal variation; adaptive modeling

1. Introduction

In the past, China’s economic development largely relied on the consumption of fossil energy, resulting in increased carbon emissions and severe air pollution [1,2]. Fine particulate matter (PM_2.5) is of immense concern and is widely regarded as the most harmful air pollutant to the human body [3]. PM_2.5 enters the body through the respiratory system, causing health problems such as asthma and lung cancer [4,5,6,7]. In recent decades, the Chinese government has gradually realized the adverse impacts of PM_2.5 pollution on human health, resulting in a series of air pollution prevention and control actions, such as the Air Pollution Prevention and Control Action Plan promulgated in 2013 [8] and the Three-Year Action Plan for Winning the Blue Sky Defense Battle introduced in 2018 [9]. In recent years, the Beautiful China medium- and long-term goals for 2035 have been formulated, i.e., the annual PM_2.5 concentration in all cities should be lower than 35 μg/m³ by 2035, which requires that the air quality in China improves fundamentally. The government aims to establish a society characterized by sustainable development and ecological civilization. Thus, it is particularly essential to establish a spatiotemporal contiguous PM_2.5 dataset that can interpret PM_2.5 variations under existing policies, evaluate the effects of the policies, and provide references for the formulation of subsequent policies.

So far, many studies have successfully simulated surface PM_2.5 concentrations. These studies can mainly be divided into two categories. The first (Type I) uses numerical models that simulate the physical and chemical processes of pollutants in the atmosphere from emission inventories of pollution sources, represented by the WRF-Chem model [10] and the GEOS-Chem model [11,12]. The second (Type II) infers surface PM_2.5 concentrations from satellite remote sensing data, e.g., aerosol optical depth (AOD). Type II can be further divided into simple statistical models and machine learning models. Simple statistical models are represented by the linear mixed effects model [13,14], the geographically weighted regression model [15], and the geographically and temporally weighted regression (GTWR) model [16]. In recent years, machine learning methods have been successfully applied to the reconstruction of surface PM_2.5 concentrations and have gradually matured and been systematized, represented by random forest (RF) [17,18,19] and extreme gradient boosting (XGBoost) [20,21]. Compared to physical and chemical models, simple statistical models have the advantages of a simple model structure and high spatial resolution. However, they are slightly less accurate than machine learning models and become less computationally efficient as the sample size increases [22,23].

Previously, AOD-based studies generally obtained incomplete PM_2.5 maps due to missing values in the AOD data [24,25,26,27]; however, datasets with incomplete coverage may not provide a comprehensive view of the distribution and variation of PM_2.5 pollution, and their statistical results are likely biased. Research has thus begun to focus on spatially full-coverage PM_2.5 maps. Currently, one commonly used approach is to obtain full-coverage PM_2.5 data through AOD gap-filling, including interpolation based on the AOD data themselves [28] and prediction of missing values using multi-source data [29,30,31]. Another approach is to use other datasets instead of AOD products [32,33], but its results have coarser spatial or temporal resolution than AOD-based results. Although AOD gap-filling has become the mainstream approach to obtain full-coverage PM_2.5 maps, it presents problems such as uncertainty in the predicted missing values and the accumulation of errors; some studies even fill AOD gaps without validation [34]. Furthermore, AOD gap-filling is a substantial research topic in its own right [35,36].

Recent studies of surface PM_2.5 reconstruction, whether on an hourly, daily, or coarser time scale, have used all the available samples per year to build one or more models [37,38,39]. For AOD gap-filling, however, some studies have used a daily-scale modeling approach, that is, they have used daily samples to build a model for each day [34]. Both types of studies built models to predict unknowns, but at different time scales, for the following reasons. Satellite AOD data are spatially continuous, so a large number of samples can be obtained per day despite the existence of missing values. For PM_2.5, there are fewer than 2000 observation sites across China, resulting in limited daily observations. To obtain a stable and accurate model, the sample must be expanded, i.e., the daily samples are pooled for subsequent modeling. This approach, however, can be problematic: the total sample size increases, but the daily sample size does not and remains small. This raises the question of whether a limited and geographically fixed daily sample can accurately capture the complex variations of PM_2.5.

Based on the aforementioned studies, we posed two questions. First, is there a way to obtain full-coverage PM_2.5 datasets while using AOD but not filling AOD gaps? Second, is daily modeling for PM_2.5 feasible? This study addresses these two questions. We established a two-stage daily full-coverage PM_2.5 mapping framework based on a machine learning algorithm. Relying on this framework, we reconstructed daily full-coverage maps of PM_2.5 pollution at 1 km resolution during the Three-Year Blue-Sky Action Plan (2018–2020) (hereinafter referred to as the Action Plan) to understand PM_2.5 conditions and variations during the implementation phase of the policy. This information can be used to evaluate the effectiveness of the policy and provide references for subsequent policy-making regarding air pollution prevention and control.

2. Datasets and Processing

2.1. In-Situ PM_2.5 Measurements

Since 2013, the Ministry of Ecology and Environment of the People’s Republic of China has set up monitoring sites nationwide to conduct real-time monitoring of conventional air pollutants, including PM_2.5, regularly publishing this data on the urban air quality real-time release platform, i.e., the China National Environmental Monitoring Center (CNEMC). By the end of 2020, there were more than 1700 stations across China, mainly in densely populated coastal areas and major cities. In this study, hourly PM_2.5 measurements were obtained from the CNEMC from 1 January 2018 to 31 December 2020, which were then averaged to represent daily conditions. Figure S1 shows the distribution of PM_2.5 monitoring stations used in this study.
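The hourly-to-daily averaging described above can be illustrated with a minimal sketch. The record layout, the site names, and the rule of simply skipping missing hours are assumptions made for illustration; the paper does not state how incomplete days were handled.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def daily_means(records):
    """Average hourly PM2.5 readings into one value per (site, date).

    `records` is an iterable of (site_id, timestamp, value) tuples.
    Missing readings are given as None and skipped (an assumption).
    """
    buckets = defaultdict(list)
    for site_id, timestamp, value in records:
        if value is None:
            continue
        buckets[(site_id, timestamp.date())].append(value)
    return {key: mean(values) for key, values in buckets.items()}


# Hypothetical hourly readings in ug/m3, for illustration only.
hourly = [
    ("site_A", datetime(2018, 1, 1, 0), 80.0),
    ("site_A", datetime(2018, 1, 1, 1), 100.0),
    ("site_A", datetime(2018, 1, 1, 2), None),
    ("site_A", datetime(2018, 1, 2, 0), 40.0),
    ("site_B", datetime(2018, 1, 1, 0), 30.0),
]
daily = daily_means(hourly)
```

In practice a minimum number of valid hours per day would probably be required before a daily mean is accepted; that threshold is not given in the text.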

2.2. MAIAC AOD

AOD is highly correlated with surface PM_2.5 concentration and has thus been widely used in PM_2.5 retrieval [40,41,42]. The AOD data used in this study were retrieved using the multi-angle implementation of atmospheric correction (MAIAC) algorithm and were downloaded from NASA’s Level-1 and Atmosphere Archive and Distribution System (LAADS) Distributed Active Archive Center (DAAC) (https://ladsweb.modaps.eosdis.nasa.gov/, accessed on 10 September 2021). MAIAC AOD products have a spatial resolution of 1 km and a temporal resolution of 1 day. Compared to the Dark Target (DT) and Deep Blue (DB) algorithms, the AOD data retrieved using the MAIAC algorithm are more accurate and cover a wider range [43].

2.3. Auxiliary Data

Meteorological factors are regarded as important auxiliary information to be incorporated into PM_2.5 modeling schemes. Meteorological factors including 2-m temperature (TEM), surface pressure (SP), relative humidity (RH), 10-m wind speed (WS), total column ozone (TCO), and boundary layer height (BLH) were obtained from the fifth-generation ECMWF reanalysis dataset (ERA5). The ERA5 dataset can provide hourly estimates and has been gridded to 0.25°. In this study, daily meteorological datasets were obtained by averaging the hourly records for each day.

Additionally, monthly normalized difference vegetation index (NDVI) data with a 1-km spatial resolution, obtained from the Moderate Resolution Imaging Spectroradiometer (MODIS) Level-3 global product (MOD13A3), were used to represent land cover. Elevation data were obtained from the Shuttle Radar Topography Mission (SRTM) database at a 30-m resolution. The 1-km annual population distribution (POP) data from the LandScan Global Population Database were also included to account for the potential contributions of socioeconomic conditions to PM_2.5 pollution levels. All auxiliary data were resampled to a 1 km grid using the bilinear interpolation method. The data used in this study are summarized in Table S1.

3. Methodology

3.1. Daily Adaptive Modeling Scheme

In this study, we propose a daily adaptive modeling scheme based on a machine learning algorithm, i.e., the XGBoost algorithm, which is an efficient and systematic implementation of Gradient Boosting [44]. Compared to the gradient boosting decision tree that has been adopted in our previous studies [22,45], this algorithm supports column subsampling, thus effectively reducing the risk of overfitting [46,47]. One of the most time-consuming steps during the learning process of the decision tree algorithm is sorting features to determine the best split point. XGBoost supports the parallel processing of features, which can significantly improve its computational efficiency [20]. Figure 1 shows the framework of the proposed daily full-coverage PM_2.5 modeling scheme. The framework consists of two stages, i.e., the annual modeling and daily modeling stages.

3.1.1. Stage 1: Annual Modeling

At this stage, we established two separate annual PM_2.5 models: one with AOD as an input (i.e., Model 1) and the other without (i.e., Model 2). Variables with high temporal resolution (i.e., daily and hourly) were selected for the annual models. Taking Model 1 as an example, six meteorological variables (i.e., BLH, TEM, RH, SP, WS, and TCO) as well as AOD were used as inputs to establish the PM_2.5 retrieval model; day of year (DOY) was included as well. Because of the missing values in the AOD data, the maximum valid sample size of >500,000 per year (after all variables were matched spatially and temporally) drops to <200,000 if the AOD-missing samples are eliminated. Therefore, to increase the sample size and improve model performance, we first performed a simple gap-filling process on the matched samples. Using the valid matched samples (those with no missing variables), we built a model with AOD as the dependent variable and in-situ PM_2.5 combined with auxiliary data as the independent variables, and used it to fill in the AOD of samples with PM_2.5 present but AOD missing (see Figure S2). In this way, the sample volume used for the annual model reached the maximum number of matches (i.e., >500,000). Model 2 has a similar structure to Model 1 but with AOD removed; thus, the sample gap-filling step was not needed for Model 2.
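The sample gap-filling idea can be sketched as follows. This is only an illustration under strong simplifications: a one-variable least-squares line (the hypothetical `fit_line`) stands in for the learner, and PM_2.5 alone stands in for "in-situ PM_2.5 combined with auxiliary data". The dictionary keys and the `aod_filled` flag are likewise hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x.

    Stand-in for the actual learner, which the paper does not
    reduce to a single predictor.
    """
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def fill_missing_aod(samples):
    """Fit AOD on complete samples, then predict AOD where it is missing."""
    complete = [s for s in samples if s["aod"] is not None]
    predict = fit_line([s["pm25"] for s in complete],
                       [s["aod"] for s in complete])
    filled = []
    for sample in samples:
        row = dict(sample)  # copy so the input is not modified
        row["aod_filled"] = row["aod"] is None
        if row["aod_filled"]:
            row["aod"] = predict(row["pm25"])
        filled.append(row)
    return filled


# Hypothetical matched samples; None marks a missing AOD retrieval.
matched = [
    {"pm25": 20.0, "aod": 0.2},
    {"pm25": 40.0, "aod": 0.4},
    {"pm25": 60.0, "aod": 0.6},
    {"pm25": 50.0, "aod": None},
]
training_samples = fill_missing_aod(matched)
```

Keeping a flag on filled rows makes it possible to restrict evaluation to samples with valid AOD, as the comparison in Section 4.1 does.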

3.1.2. Stage 2: Daily Modeling

For each day of the year, 100,000 grids with no PM_2.5 observations were randomly selected (out of >10 million grids across China at a 1 km resolution), since selecting too many grids would consume excessive computing resources and selecting too few would defeat the purpose of daily modeling. These grids were then divided into two groups, one with AOD and the other without. The two corresponding models established in Stage 1 were applied to these two groups, respectively, to obtain initial PM_2.5 estimates. Following this, Model 3, i.e., the final daily adaptive PM_2.5 estimation model, was built. Assuming that the initial PM_2.5 estimates obtained from the previous step were the true values, they were used as the output to establish daily models with meteorological factors, DEM, POP, and NDVI as inputs. Daily situations, however, often vary, e.g., in the distributions of daily PM_2.5 high and low clusters and in the contributions of variables to the PM_2.5 pollution. An adaptive strategy was thus used to account for this. First, fixed parameters were not used in the daily model as they were in the annual model; instead, variable parameters were adopted to adapt to daily variations. We searched for the daily optimum of three important XGBoost parameters (i.e., n_estimator, learning_rate, and max_depth). Notably, to reduce the computation, finite candidate sets were established, and the best-performing candidate was selected. The candidate sets were {200, 300, 400, 500} for n_estimator, {0.05, 0.1, 0.3, 0.5} for learning_rate, and {5, 6, 7, 8, 9} for max_depth.
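To make the size of the daily parameter search concrete, the sketch below enumerates the candidate sets listed above and picks the best one according to a scoring function. It assumes an exhaustive search over the Cartesian product, which the paper does not state, and `toy_score` is a hypothetical placeholder for a real validation score. The dictionary keys follow the paper's spelling; in the XGBoost scikit-learn interface the first parameter is spelled `n_estimators`.

```python
from itertools import product

# Candidate sets as listed in the text.
PARAM_GRID = {
    "n_estimator": [200, 300, 400, 500],
    "learning_rate": [0.05, 0.1, 0.3, 0.5],
    "max_depth": [5, 6, 7, 8, 9],
}


def candidate_settings(grid):
    """Yield every combination of the candidate values as a dict."""
    names = list(grid)
    for values in product(*(grid[name] for name in names)):
        yield dict(zip(names, values))


def best_setting(grid, score):
    """Return the candidate with the highest score (higher is better)."""
    return max(candidate_settings(grid), key=score)


def toy_score(params):
    """Hypothetical placeholder, NOT a real cross-validation score."""
    return (-abs(params["max_depth"] - 7)
            - abs(params["learning_rate"] - 0.1)
            - abs(params["n_estimator"] - 300) / 100)


n_candidates = len(list(candidate_settings(PARAM_GRID)))
best = best_setting(PARAM_GRID, toy_score)
```

Under the exhaustive-search assumption, each day requires 4 × 4 × 5 = 80 model fits before feature tuning, which is consistent with the computational cost acknowledged in Section 4.3.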

For machine learning, feature selection is as important as parameter optimization; the larger the number of features, the more complex the model will be. Furthermore, its final performance may not be better than that of a simpler model. The goal of feature selection is to find the optimal feature subset in order to construct a better model. Common methods of feature selection include three categories: filter, wrapper, and embedded methods. In this study, we used feature tuning to optimize the daily models. The recursive feature elimination (RFE) method in the wrapper category was chosen; the main idea of RFE is to iteratively reduce the feature set. The process involves first training on the original feature set, then choosing which features need to be excluded based on their importance until all the features are traversed. The detailed steps within this process are as follows:

1. Initialize the feature set $F = \{f_{1}, f_{2}, \dots, f_{n}\}$ including all feature variables.
2. Train the model based on the current feature subset $F_{i}$ ( $F_{i} \subseteq F; i = 1, 2, \dots, n$ ). Calculate the importance of each feature and use the cross-validation method to obtain the subset score.
3. Remove the feature with the lowest importance from the current subset $F_{i}$ to obtain a new feature subset $F_{j}$ ( $F_{j} \subset F_{i}; j = 1, 2, \dots, n$ ).
4. Repeat steps 2 and 3 until the feature subset is empty or the number of features reaches a predetermined threshold.
5. Compare the scores of each feature subset and output the subset with the highest score.
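The steps above can be sketched in a few lines. This is a generic illustration rather than the authors' implementation: `evaluate` is a hypothetical callback that would train a model and return a cross-validated score together with per-feature importances, and `toy_evaluate` with its fixed importance values is invented purely so the sketch runs.

```python
def recursive_feature_elimination(features, evaluate, min_features=1):
    """Drop the least important feature each round; keep the best subset.

    `evaluate(subset)` must return (score, importances), where
    `importances` maps each feature in `subset` to a number.
    """
    current = list(features)
    best_subset, best_score = None, float("-inf")
    while len(current) >= min_features:
        score, importances = evaluate(current)
        if score > best_score:
            best_subset, best_score = list(current), score
        if len(current) == min_features:
            break  # threshold reached, stop eliminating
        weakest = min(current, key=lambda name: importances[name])
        current.remove(weakest)
    return best_subset, best_score


# Invented importances, for illustration only.
TOY_IMPORTANCE = {"BLH": 0.30, "TEM": 0.25, "RH": 0.20,
                  "WS": 0.15, "NDVI": 0.02, "POP": 0.01}


def toy_evaluate(subset):
    """Hypothetical scorer: rewards useful features, penalizes model size."""
    score = sum(TOY_IMPORTANCE[name] for name in subset) - 0.05 * len(subset)
    return score, {name: TOY_IMPORTANCE[name] for name in subset}


best_subset, best_score = recursive_feature_elimination(
    list(TOY_IMPORTANCE), toy_evaluate)
```

With these toy values the two weakest features are dropped and the remaining four are returned, because removing a third feature lowers the score.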

3.2. Model Validation

In this study, the models in Stage 1 and Stage 2 were evaluated separately. The commonly used site-based and time-based 10-fold cross-validation (CV) method, which can better reflect the model performance over space and time, was applied for the two models in Stage 1. Since final daily models in Stage 2 were not constructed based on the ground-based PM_2.5 measurements, all ground-based observations were used to validate the daily models for each day. However, evaluating models more objectively by eliminating any possible connections in the datasets is necessary. Thus, we also performed site-based and time-based CV for the daily model performance in Stage 2. The statistical indicators used to evaluate model performance were the coefficient of determination (R²), mean absolute error (MAE), and root mean square error (RMSE).
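For reference, the three indicators and a site-based fold assignment can be written out as below. Two assumptions are made: R² is computed as 1 − SS_res/SS_tot (some studies report the squared correlation coefficient instead, and the paper does not say which was used), and sites are assigned to folds round-robin rather than randomly, only to keep the example deterministic.

```python
from math import sqrt


def r2_score(observed, estimated):
    """Coefficient of determination, assumed here as 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - e) ** 2 for o, e in zip(observed, estimated))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


def mae(observed, estimated):
    """Mean absolute error."""
    return sum(abs(o - e) for o, e in zip(observed, estimated)) / len(observed)


def rmse(observed, estimated):
    """Root mean square error."""
    return sqrt(sum((o - e) ** 2 for o, e in zip(observed, estimated))
                / len(observed))


def site_based_folds(site_ids, k=10):
    """Map each unique site to one of k folds.

    All samples from a site share a fold, so a site is never in both
    the training and the test split.
    """
    unique_sites = sorted(set(site_ids))
    return {site: index % k for index, site in enumerate(unique_sites)}


# Hypothetical observations and estimates in ug/m3.
obs = [10.0, 20.0, 30.0, 40.0]
est = [12.0, 18.0, 33.0, 39.0]
folds = site_based_folds(["s3", "s1", "s2", "s1", "s3"], k=2)
```

Time-based CV follows the same pattern with dates in place of site identifiers, so that entire days are withheld from training.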

4. Results and Discussion

4.1. Model Performance

We first evaluated the effect of the sample gap-filling process on the model. Table S2 compares the performance of Model 1 trained with valid AOD samples (i.e., the original matched samples without gap-filling) and with gap-filled AOD samples. For a fair comparison, only test data with valid AOD were used. The results of the models trained on the samples before and after filling are not considerably different; in general, however, the model trained with gap-filled samples shows a slight improvement. We then evaluated the two models in Stage 1. Because the daily models in Stage 2 were built on the output of the Stage 1 models, the quality of the Stage 1 models affects the accuracy of the daily models. The evaluation results are shown in Table 1. The site-based CV R² values of Model 1 and Model 2 for the years 2018 to 2020 were 0.90–0.92 and 0.84–0.86, respectively, and the RMSEs were 9.05–10.70 μg/m³ and 11.97–13.27 μg/m³, respectively. These results indicate that the two models built in Stage 1 were robust and that the estimates derived from this stage are relatively reliable. Model 1 performs slightly better than Model 2 owing to the addition of AOD.

Next, we evaluated the performance of the proposed daily adaptive modeling scheme. The density scatter plots of the validation results for the daily estimated PM_2.5 concentrations are illustrated in Figure 2. For all matched samples during the study period (see Figure 2d), the daily PM_2.5 estimates generated by the proposed modeling scheme agreed with surface observations. The overall R² value between the measured and estimated PM_2.5 concentrations was 0.91, with RMSE and MAE values of 9.91 μg/m³ and 6.21 μg/m³, respectively. For annual results over space and time (see Table 1), the model performed well in terms of site-based CV, with R² values ranging from 0.86 to 0.87 and RMSE values ranging from 12 to 12.33 μg/m³. Time-based CV showed poor performance, with R² values lower than 0.6. This is likely due to the limitations of the models in Stage 1.

A highlight of our strategy is the daily modeling; thus, the evaluation of daily model performance is essential. Figure 3 shows the time series of the evaluation result of the daily models in Stage 2 from 2018 to 2020; it can be seen that the performance remained similar through the years. Almost all daily R² values exceeded 0.6, with RMSE values below 20 μg/m³, demonstrating that the daily modeling strategy proposed in our study is feasible and that applying this method can produce reliable daily PM_2.5 estimates. For the long-term sequence, R² and RMSE values continuously rose and fell, varying from high values on cold days to low values on warm days, which exhibits a stable cyclical trend. Figure 3 also shows daily averaged PM_2.5 concentrations, which varied in the same way as R² and RMSE. Thus, the rise and fall of R² and RMSE throughout the year may be associated with fluctuations in the overall PM_2.5 concentrations.

4.2. Mapping and Variation Analysis

We generated daily full-coverage PM_2.5 maps across China from 1 January 2018 to 31 December 2020 with 1 km resolution. First, all data were averaged by year, as demonstrated in Figure 4, to show the annual variation; Figure 4a–c illustrates the spatial distribution of annual mean PM_2.5 concentrations. Generally, the annual mean PM_2.5 concentrations showed similar distribution patterns from 2018 to 2020. High-value clusters mainly appeared in the Tarim Basin in Xinjiang and in the central and eastern regions, where the economy and population are concentrated. This is consistent with the results of previous studies [23,48,49]. Figure 4d shows the histogram of the regional averaged concentrations of 34 provinces, municipalities, and autonomous regions in China over the 3 years. Overall, the PM_2.5 concentration across China decreased by 16.74%, from 33.67 μg/m³ in 2018 to 28.03 μg/m³ in 2020. In terms of regions, the largest decrease occurred in Tibet: 51.61%, from 21.43 μg/m³ to 10.37 μg/m³. The three main economic zones, i.e., the Beijing–Tianjin–Hebei, Pearl River Delta, and Yangtze River Delta zones (see Figure S1 for the definition of the scope), saw declines of 14.81%, 19.46%, and 26.66%, respectively. Jilin Province, however, ran counter to this trend, with a decline of −0.02% (i.e., a slight increase). According to the Ambient Air Quality Standard (GB3095-2012) (hereinafter referred to as the standard), the proportion of the area that meets the standard (i.e., annual mean PM_2.5 concentration ≤35 μg/m³) increased from 62% in 2018 to 76% in 2020. Beijing, the political center, met the standard for the first time in 2020 as a result of strict control. These results show that the strict clean air policy implemented by the government is effective. However, some regions, such as Jilin Province, still showed a slight upward trend despite the overall decline. Future air pollution control efforts can thus focus more on these areas.
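The percentage declines quoted above are relative changes with respect to the 2018 value. A minimal sketch, where the function name is ours and the inputs are the rounded annual means from the text:

```python
def percent_decrease(start, end):
    """Relative decrease from `start` to `end`, in percent.

    A negative result means the value increased.
    """
    return (start - end) / start * 100


# Annual means in ug/m3 (2018 -> 2020), rounded values from the text.
tibet_decline = percent_decrease(21.43, 10.37)
national_decline = percent_decrease(33.67, 28.03)
```

The Tibet figure reproduces the reported 51.61%. The national figure computed from the rounded means comes out at roughly 16.75%, marginally different from the reported 16.74%, presumably because the published value was derived from unrounded concentrations.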

Next, the monthly variation was analyzed. Figure 5 shows the multi-year averaged estimates for each month. The PM_2.5 concentration varied significantly across the months. The beginning of the year (i.e., January) was the most serious period for PM_2.5 pollution, when most of China showed high levels of pollution. The PM_2.5 level then declined steadily, reached a trough in August, and started to rebound. In terms of temporal distribution, only June–September could be considered a “clean” period of the year, whereas severe PM_2.5 pollution occurred in winter (i.e., DJF). Notably, the annual results indicated severe conditions in the Tarim Basin, similar to those in the central and eastern regions, but the monthly results revealed that the two have different pollution periods. The most seriously polluted period in central and eastern China was from December to February, while in the Tarim Basin it was March and April. The conditions in the central and eastern regions were mainly due to their dense populations and abundant human activity, supplemented by unfavorable weather conditions [50,51]. The combination of these conditions results in the frequent occurrence of PM_2.5 pollution events in these areas during the winter. China’s largest desert, i.e., the Taklimakan Desert, is located in the Tarim Basin. Dust storms, which can increase the PM_2.5 concentrations in the air, mostly occur in spring (i.e., MAM) and increase as the climate warms [52]. Therefore, although pollution in the Tarim Basin also occurred under the adverse weather conditions of winter, it was less severe than in spring.

To further explore PM_2.5 variations during the Action Plan, we refined the temporal resolution to days. According to the standard, we calculated the number of polluted days (>75 μg/m³) (see Figure 6a–c) and clean days (≤35 μg/m³) (see Figure 6d–f) in one year for each 1 km grid. The clean days per grid steadily increased, suggesting that air pollution was improving; however, not all regions saw a steady decrease in polluted days. For example, polluted days in parts of Henan and Shandong Province had decreased overall as of 2020 but increased in 2019. The southwest regions of the Tarim Basin saw a slight rebound in 2020 after a decrease in 2019. In some areas (such as the three northeastern provinces), polluted days were on the rise throughout the study period. In addition, we calculated the daily proportion of areas with different PM_2.5 levels (see Figure 6g). The proportions of areas with daily PM_2.5 concentrations below 35 μg/m³ and between 35 and 75 μg/m³ showed opposite trends, with slopes of 0.017 and −0.016, respectively. Although polluted days (daily PM_2.5 >75 μg/m³) increased in some regions, as noted above, the proportion of areas exceeding 75 μg/m³ declined overall, with a slope of −0.003 in the daily variation statistics. In conclusion, the increase in clean areas from 2018 to 2020 came mainly from the decrease in moderate (35–75 μg/m³) areas. The reduction of polluted areas was slow; therefore, reducing high-level PM_2.5 pollution remains a significant challenge.
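The per-grid day counts and the trend slopes can be sketched as follows. The thresholds come from the text; fitting the slope by ordinary least squares against the day index is an assumption, since the paper does not describe how the slopes were obtained, and the sample values are invented.

```python
def count_days(daily_values, clean_max=35.0, polluted_min=75.0):
    """Return (clean_days, polluted_days) for one grid cell.

    Clean: value <= clean_max. Polluted: value > polluted_min.
    """
    clean = sum(1 for value in daily_values if value <= clean_max)
    polluted = sum(1 for value in daily_values if value > polluted_min)
    return clean, polluted


def linear_slope(series):
    """Least-squares slope of `series` against the index 0..n-1."""
    n = len(series)
    mean_x = (n - 1) / 2
    mean_y = sum(series) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(series))
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    return sxy / sxx


# Invented daily concentrations (ug/m3) for one grid cell.
clean_days, polluted_days = count_days([20.0, 35.0, 36.0, 75.0, 76.0, 120.0])

# Invented daily share (%) of area in one concentration class.
slope = linear_slope([60.0, 62.0, 64.0, 66.0])
```

Values of exactly 35 μg/m³ count as clean and values of exactly 75 μg/m³ do not count as polluted, matching the ≤35 and >75 definitions above.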

4.3. Strengths and Limitations

Our proposed daily adaptive modeling approach showed satisfactory performance in this experiment. The generated daily PM_2.5 maps have no gaps, which can capture the daily PM_2.5 variations more accurately and intuitively. Compared to the related previous works (summarized in Table 2), our model performance is excellent in terms of evaluation indicators (i.e., overall R² = 0.91, MAE = 6.21 μg/m³, RMSE = 9.91 μg/m³), which is better than statistical regression models such as the GTWR model [16] and the TSAM model [13], as well as some machine learning and deep learning models such as the RF model [37], the STET model [23], and the GRNN model [53]. Currently, only a few studies have succeeded in daily full-coverage mapping [32,34,48]. Compared to these studies, such as the TAP dataset (http://tapdata.org.cn/, accessed on 1 March 2022), our results show superiority in terms of resolution or model performance.

In addition to the model performance analyzed above, our proposed modeling scheme has three major strengths and innovations. The first is daily modeling. Almost all current studies used the entire matched sample set to build annual models and then predicted daily conditions. In line with this convention, we adopted this approach as well (i.e., Model 1 and Model 2); however, the temporal and spatial distribution of PM_2.5 is complex and the dominant factors vary from region to region [54,55]. It may not be possible to accurately ascertain the daily internal mechanism of PM_2.5 distribution in different regions by relying on only <2000 site samples per day. Consequently, we used a daily modeling strategy in Stage 2, i.e., randomly selecting sufficient samples each day to build a model suited to that specific day. Our daily model is based on the XGBoost algorithm, whose computational efficiency makes daily modeling practical. To the best of our knowledge, we are the first to propose daily modeling for PM_2.5 and to demonstrate that such an approach is practical.

The second strength is the adaptive strategy. Unlike annual models, daily modeling involves considerable uncertainty. First, no single set of machine learning parameters fits every day. An annual model can be applied to each day once its optimal parameters are obtained; with daily models, however, the optimal parameters for one day may perform poorly the next day. Second, not all factors contribute to the final daily PM_2.5 models. Many studies determine input factors based on evaluation results before modeling [19,56,57]. Considering these two points, we adopted an adaptive adjustment modeling strategy in this study, applying parameter tuning and feature tuning to establish the most suitable model for a given day. Although this is a large and time-consuming undertaking, it can achieve higher modeling accuracy and helps in reconstructing the daily PM_2.5 distributions. The evaluation results of this study support this.

Lastly, the proposed modeling scheme allows us to obtain daily full-coverage maps without completely filling the AOD gaps. In this study, we adopted AOD, but do not entirely rely on AOD. Full-coverage maps are still available in the case of AOD containing missing values. At present, most studies on PM_2.5 full-coverage mapping types are based on AOD gap-filling [30,34,48,58]. This is, of course, a good idea, but there is a certain degree of uncertainty inherent to gap-filling, which may cause the accumulation of the uncertainty of PM_2.5 predictions. Even in some studies where AOD was excluded, only non-missing data were used to obtain PM_2.5 full-coverage maps [32]. Owing to our novel modeling strategy, we retained AOD data in this study without filling the gaps within it. Note that AOD was only involved in Stage 1 to help build the annual model, and was not added to the daily model in Stage 2. Many studies have confirmed the contributions of AOD to PM_2.5 mapping [40,42,59], though the contributions of other factors such as meteorology should not be underestimated [60,61]. Therefore, we assumed that when modeling at a daily timescale and when the daily sample size is sufficient, the complex variations of PM_2.5 may well be detected without AOD. This conjecture has been confirmed experimentally in this study. To the best of our knowledge, this study is the first to obtain full-coverage results using information containing missing AODs.

Despite the aforementioned strengths, our study still has several limitations. First, in terms of evaluation results, although the overall and site-based CV performance was good, the time-based CV performance was relatively poor. This makes the model unsuitable for days without any information, which means it cannot be used for future times. Moreover, the PM_2.5 used in Stage 2 is the estimated output from Stage 1, and there is a deviation between the estimated and true values. Modeling on this basis may have the same error accumulation effect as the approaches based on AOD gap-filling. Second, our models were based on only a few factors, whereas previous studies considered various categories of data such as land use, road density, and nighttime light intensity [34,62,63]. With our adaptive strategy, however, adding factors increases the computation needed to find the optimum. Therefore, we only used a few strongly correlated and frequently used factors, so only a small number of high-resolution factors contribute to detailed model construction. Model 1, for example, has no high-resolution factor at all, which may cause the final reconstructed daily PM_2.5 maps to miss some details. Nevertheless, our work is fully applicable to coarse-resolution studies. Finally, also because of computational constraints, the daily optimal models we established are only local optima: not all candidate values were tested during model parameter tuning. In addition, daily modeling and the search for optimal values consume extensive time and computing resources, which is another unavoidable weakness.

5. Conclusions

In this study, we propose a two-stage daily adaptive modeling framework based on the XGBoost algorithm to reconstruct daily full-coverage PM_2.5 concentrations across China at a 1 km resolution. For model validation, we evaluated the models of the two stages separately. The Stage 1 models performed well and laid the foundation for the feasibility and reliability of the daily models in Stage 2. The final daily models achieved excellent performance, with an overall R² and RMSE of 0.91 and 9.91 μg/m³, respectively. Time-based cross-validation performed worse than site-based cross-validation. The day-to-day evaluation performance of the daily models was stable. Compared with previous related work, the estimation accuracy of the proposed approach is also satisfactory. Together, these results indicate that our modeling strategy is feasible.
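For readers who wish to compute comparable scores, the following Python sketch gives the usual definitions of the three evaluation metrics. It assumes that R² is the coefficient of determination (some PM_2.5 studies instead report the squared Pearson correlation between estimates and observations); the function names and the toy values are hypothetical and are not results from this study.

```python
from math import sqrt


def rmse(observed, estimated):
    """Root-mean-square error, in the units of the inputs (e.g., ug/m3)."""
    n = len(observed)
    return sqrt(sum((o - e) ** 2 for o, e in zip(observed, estimated)) / n)


def mae(observed, estimated):
    """Mean absolute error, in the units of the inputs."""
    n = len(observed)
    return sum(abs(o - e) for o, e in zip(observed, estimated)) / n


def r_squared(observed, estimated):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - e) ** 2 for o, e in zip(observed, estimated))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Toy values for illustration only.
obs = [10.0, 20.0, 30.0, 40.0]
est = [12.0, 18.0, 33.0, 39.0]
scores = (r_squared(obs, est), mae(obs, est), rmse(obs, est))
```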

Based on our proposed modeling approach, daily full-coverage PM_2.5 maps across China were generated for 2018–2020. We analyzed the distribution and variations of the national PM_2.5 concentrations during the Action Plan at yearly, monthly, and daily timescales. In general, China’s air pollution control policies have contributed significantly to reducing PM_2.5 concentrations. The national PM_2.5 pollution levels improved remarkably, with the annual average concentration dropping from 33.67 μg/m³ in 2018 to 28.03 μg/m³ in 2020. Nevertheless, PM_2.5 concentrations in some areas increased during this period; future air pollution control plans can focus on these areas. Reducing high-value PM_2.5 pollution continues to be a challenging field of study.
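The size of this national decrease can be checked with a short calculation. The Python snippet below uses only the two annual means quoted above; the variable names are illustrative.

```python
pm25_2018 = 33.67  # national annual mean in 2018, ug/m3 (from the text)
pm25_2020 = 28.03  # national annual mean in 2020, ug/m3 (from the text)

absolute_drop = pm25_2018 - pm25_2020
relative_drop_pct = 100.0 * absolute_drop / pm25_2018

print(f"{absolute_drop:.2f} ug/m3, {relative_drop_pct:.1f}%")
```

This corresponds to a decrease of about 5.64 μg/m³, or roughly 17% of the 2018 level.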

One characteristic innovation of our approach is daily modeling and its adaptive adjustment. To the best of our knowledge, this study is the first to propose daily modeling and to demonstrate that this strategy works. The second characteristic innovation of this study is the ability to obtain full-coverage results without filling the AOD gaps, which can save a substantial amount of work. In conclusion, we propose a novel method for PM_2.5 full-coverage modeling. In future studies, we will continue to improve upon this method and aim to generate long-term time series for PM_2.5.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/rs14153571/s1, Figure S1: Spatial distribution of PM_2.5 monitoring stations included in this study; Figure S2: The process of simple gap-filling on matched samples; Table S1: Summary of the data used in this study; Table S2: The site-based validation results of Model 1 based on the valid AOD and gap-filled AOD.

Author Contributions

Conceptualization: W.H. and H.Z.; methodology: W.H., H.M. and H.Z.; software: W.H., H.M., H.S., S.Z. (Shenghui Zhou), J.H. and G.Z.; validation: W.H.; writing—original draft: W.H.; writing—review and editing: S.Z. (Songlin Zhang) and H.Z.; project administration: S.Z. (Songlin Zhang), H.S., S.Z. (Shenghui Zhou) and H.Z.; funding acquisition: S.Z. (Songlin Zhang), S.Z. (Shenghui Zhou) and H.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This study was supported by the National Natural Science Foundation of China (42005102), the Shanghai Municipal Natural Science Foundation (19ZR1459700), and the Science and Technology Development Project of Henan Province China (222102110419).

Data Availability Statement

The datasets generated during and/or analyzed during the current study are open access or available from the corresponding author on reasonable request.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Diao, B.; Ding, L.; Cheng, J.; Fang, X. Impact of transboundary PM_2.5 pollution on health risks and economic compensation in China. J. Clean Prod. 2021, 326, 129312.
2. Yang, D.; Chen, Y.; Miao, C.; Liu, D. Spatiotemporal variation of PM_2.5 concentrations and its relationship to urbanization in the Yangtze river delta region, China. Atmos. Pollut. Res. 2020, 11, 491–498.
3. Burnett, R.; Chen, H.; Szyszkowicz, M.; Fann, N.; Hubbell, B.; Pope, C.A., III; Apte, J.S.; Brauer, M.; Cohen, A.; Weichenthal, S.; et al. Global estimates of mortality associated with long-term exposure to outdoor fine particulate matter. Proc. Natl. Acad. Sci. USA 2018, 115, 9592–9597.
4. Leiva, G.M.A.; Santibañez, D.A.; Ibarra, E.S.; Matus, C.P.; Seguel, R. A five-year study of particulate matter (PM_2.5) and cerebrovascular diseases. Environ. Pollut. 2013, 181, 1–6.
5. Lelieveld, J.; Evans, J.S.; Fnais, M.; Giannadaki, D.; Pozzer, A. The contribution of outdoor air pollution sources to premature mortality on a global scale. Nature 2015, 525, 367–371.
6. Yang, X.; Zhang, L.; Chen, X.; Liu, F.; Shan, A.; Liang, F.; Li, X.; Wu, H.; Yan, M.; Ma, Z.; et al. Long-term exposure to ambient PM_2.5 and stroke mortality among urban residents in northern China. Ecotox. Environ. Saf. 2021, 213, 112063.
7. Zhang, G.; Liu, X.; Zhai, S.; Song, G.; Song, H.; Liang, L.; Kong, Y.; Ma, R.; He, X. Rural-urban differences in associations between air pollution and cardiovascular hospital admissions in Guangxi, southwest China. Environ. Sci. Pollut. Res. 2022, 29, 40711–40723.
8. Chinese State Council. Action Plan on Air Pollution Prevention and Control (In Chinese). Available online: http://www.gov.cn/zwgk/2013-09/12/content_2486773.htm (accessed on 1 June 2021).
9. Chinese State Council. Three-Year Action Plan on Defending the Blue Sky (In Chinese). Available online: http://www.gov.cn/zhengce/content/2018-07/03/content_5303158.htm (accessed on 1 June 2021).
10. Goldberg, D.L.; Gupta, P.; Wang, K.; Jena, C.; Zhang, Y.; Lu, Z.; Streets, D.G. Using gap-filled MAIAC AOD and WRF-Chem to estimate daily PM_2.5 concentrations at 1 km resolution in the Eastern United States. Atmos. Environ. 2019, 199, 443–452.
11. Geng, G.; Zhang, Q.; Martin, R.V.; van Donkelaar, A.; Huo, H.; Che, H.; Lin, J.; He, K. Estimating long-term PM_2.5 concentrations in China using satellite-based aerosol optical depth and a chemical transport model. Remote Sens. Environ. 2015, 166, 262–270.
12. van Donkelaar, A.; Martin, R.V.; Brauer, M.; Kahn, R.; Levy, R.; Verduzco, C.; Villeneuve, P.J. Global Estimates of Ambient Fine Particulate Matter Concentrations from Satellite-Based Aerosol Optical Depth: Development and Application. Environ. Health Perspect. 2010, 118, 847–855.
13. Fang, X.; Zou, B.; Liu, X.; Sternberg, T.; Zhai, L. Satellite-based ground PM_2.5 estimation using timely structure adaptive modeling. Remote Sens. Environ. 2016, 186, 152–163.
14. Ma, Z.; Liu, Y.; Zhao, Q.; Liu, M.; Zhou, Y.; Bi, J. Satellite-derived high resolution PM_2.5 concentrations in Yangtze River Delta Region of China using improved linear mixed effects model. Atmos. Environ. 2016, 133, 156–164.
15. Ma, Z.; Hu, X.; Sayer, A.M.; Levy, R.; Zhang, Q.; Xue, Y.; Tong, S.; Bi, J.; Huang, L.; Liu, Y. Satellite-Based Spatiotemporal Trends in PM_2.5 Concentrations: China, 2004–2013. Environ. Health Perspect. 2016, 124, 184–192.
16. He, Q.; Huang, B. Satellite-based mapping of daily high-resolution ground PM_2.5 in China via space-time regression modeling. Remote Sens. Environ. 2018, 206, 72–83.
17. Brokamp, C.; Jandarov, R.; Hossain, M.; Ryan, P. Predicting Daily Urban Fine Particulate Matter Concentrations Using a Random Forest Model. Environ. Sci. Technol. 2018, 52, 4173–4179.
18. Stafoggia, M.; Bellander, T.; Bucci, S.; Davoli, M.; de Hoogh, K.; de Donato, F.; Gariazzo, C.; Lyapustin, A.; Michelozzi, P.; Renzi, M.; et al. Estimation of daily PM10 and PM_2.5 concentrations in Italy, 2013–2015, using a spatiotemporal land-use random-forest model. Environ. Int. 2019, 124, 170–179.
19. Wei, J.; Huang, W.; Li, Z.; Xue, W.; Peng, Y.; Sun, L.; Cribb, M. Estimating 1-km-resolution PM_2.5 concentrations across China using the space-time random forest approach. Remote Sens. Environ. 2019, 231, 111221.
20. Chen, Z.; Zhang, T.; Zhang, R.; Zhu, Z.; Yang, J.; Chen, P.; Ou, C.; Guo, Y. Extreme gradient boosting model to estimate PM_2.5 concentrations with missing-filled satellite data in China. Atmos. Environ. 2019, 202, 180–189.
21. Gui, K.; Che, H.; Zeng, Z.; Wang, Y.; Zhai, S.; Wang, Z.; Luo, M.; Zhang, L.; Liao, T.; Zhao, H.; et al. Construction of a virtual PM_2.5 observation network in China based on high-density surface meteorological observations using the Extreme Gradient Boosting model. Environ. Int. 2020, 141, 105801.
22. He, W.; Meng, H.; Han, J.; Zhou, G.; Zheng, H.; Zhang, S. Spatiotemporal PM_2.5 estimations in China from 2015 to 2020 using an improved gradient boosting decision tree. Chemosphere 2022, 296, 134003.
23. Wei, J.; Li, Z.; Lyapustin, A.; Sun, L.; Peng, Y.; Xue, W.; Su, T.; Cribb, M. Reconstructing 1-km-resolution high-quality PM_2.5 data records from 2000 to 2018 in China: Spatiotemporal variations and policy implications. Remote Sens. Environ. 2021, 252, 112136.
24. Chen, W.; Ran, H.; Cao, X.; Wang, J.; Teng, D.; Chen, J.; Zheng, X. Estimating PM_2.5 with high-resolution 1-km AOD data and an improved machine learning model over Shenzhen, China. Sci. Total Environ. 2020, 746, 141093.
25. He, Q.; Gu, Y.; Zhang, M. Spatiotemporal trends of PM_2.5 concentrations in central China from 2003 to 2018 based on MAIAC-derived high-resolution data. Environ. Int. 2020, 137, 105536.
26. Xue, T.; Zheng, Y.; Tong, D.; Zheng, B.; Li, X.; Zhu, T.; Zhang, Q. Spatiotemporal continuous estimates of PM_2.5 concentrations in China, 2000–2016: A machine learning method with inputs from satellites, chemical transport model, and ground observations. Environ. Int. 2019, 123, 345–357.
27. Yan, X.; Zang, Z.; Jiang, Y.; Shi, W.; Guo, Y.; Li, D.; Zhao, C.; Husi, L. A Spatial-Temporal Interpretable Deep Learning Model for improving interpretability and predictive accuracy of satellite-based PM_2.5. Environ. Pollut. 2021, 273, 116459.
28. Hu, H.; Hu, Z.; Zhong, K.; Xu, J.; Zhang, F.; Zhao, Y.; Wu, P. Satellite-based high-resolution mapping of ground-level PM_2.5 concentrations over East China using a spatiotemporal regression kriging model. Sci. Total Environ. 2019, 672, 479–490.
29. Jiang, T.; Chen, B.; Nie, Z.; Ren, Z.; Xu, B.; Tang, S. Estimation of hourly full-coverage PM_2.5 concentrations at 1-km resolution in China using a two-stage random forest model. Atmos. Res. 2021, 248, 105146.
30. Xiao, Q.; Wang, Y.; Chang, H.H.; Meng, X.; Geng, G.; Lyapustin, A.; Liu, Y. Full-coverage high-resolution daily PM_2.5 estimation using MAIAC AOD in the Yangtze River Delta of China. Remote Sens. Environ. 2017, 199, 437–446.
31. Zhang, R.; Di, B.; Luo, Y.; Deng, X.; Grieneisen, M.L.; Wang, Z.; Yao, G.; Zhan, Y. A nonparametric approach to filling gaps in satellite-retrieved aerosol optical depth for estimating ambient PM_2.5 levels. Environ. Pollut. 2018, 243, 998–1007.
32. Wang, Y.; Yuan, Q.; Li, T.; Tan, S.; Zhang, L. Full-coverage spatiotemporal mapping of ambient PM_2.5 and PM10 over China from Sentinel-5P and assimilated datasets: Considering the precursors and chemical compositions. Sci. Total Environ. 2021, 793, 148535.
33. Miri, M.; Ghassoun, Y.; Dovlatabadi, A.; Ebrahimnejad, A.; Löwner, M.-O. Estimate annual and seasonal PM1, PM_2.5 and PM10 concentrations using land use regression model. Ecotox. Environ. Saf. 2019, 174, 137–145.
34. Huang, C.; Hu, J.; Xue, T.; Xu, H.; Wang, M. High-Resolution Spatiotemporal Modeling for Ambient PM_2.5 Exposure Assessment in China from 2013 to 2019. Environ. Sci. Technol. 2021, 55, 2152–2162.
35. Olcese, L.E.; Palancar, G.G.; Toselli, B.M. A method to estimate missing AERONET AOD values based on artificial neural networks. Atmos. Environ. 2015, 113, 140–150.
36. Wang, Y.; Yuan, Q.; Li, T.; Shen, H.; Zheng, L.; Zhang, L. Large-scale MODIS AOD products recovery: Spatial-temporal hybrid fusion considering aerosol variation mitigation. ISPRS-J. Photogramm. Remote Sens. 2019, 157, 1–12.
37. Chen, G.; Li, S.; Knibbs, L.D.; Hamm, N.A.S.; Cao, W.; Li, T.; Guo, J.; Ren, H.; Abramson, M.J.; Guo, Y. A machine learning method to estimate PM_2.5 concentrations across China with remote sensing, meteorological and land use information. Sci. Total Environ. 2018, 636, 52–60.
38. Huang, K.; Xiao, Q.; Meng, X.; Geng, G.; Wang, Y.; Lyapustin, A.; Gu, D.; Liu, Y. Predicting monthly high-resolution PM_2.5 concentrations with random forest model in the North China Plain. Environ. Pollut. 2018, 242, 675–683.
39. Liu, Y.; Li, C.; Liu, D.; Tang, Y.; Seyler, B.C.; Zhou, Z.; Hu, X.; Yang, F.; Zhan, Y. Deriving hourly full-coverage PM_2.5 concentrations across China’s Sichuan Basin by fusing multisource satellite retrievals: A machine-learning approach. Atmos. Environ. 2022, 271, 118930.
40. Guo, J.; Zhang, X.; Che, H.; Gong, S.; An, X.; Cao, C.; Guang, J.; Zhang, H.; Wang, Y.; Zhang, X.; et al. Correlation between PM concentrations and aerosol optical depth in eastern China. Atmos. Environ. 2009, 43, 5876–5886.
41. Hu, X.; Waller, L.A.; Lyapustin, A.; Wang, Y.; Liu, Y. 10-year spatial and temporal trends of PM_2.5 concentrations in the southeastern US estimated using high-resolution satellite data. Atmos. Chem. Phys. 2014, 14, 6301–6314.
42. Xin, J.; Zhang, Q.; Wang, L.; Gong, C.; Wang, Y.; Liu, Z.; Gao, W. The empirical relationship between the PM_2.5 concentration and aerosol optical depth over the background of North China from 2009 to 2011. Atmos. Res. 2014, 138, 179–188.
43. Mhawish, A.; Banerjee, T.; Sorek-Hamer, M.; Lyapustin, A.; Broday, D.M.; Chatfield, R. Comparison and evaluation of MODIS Multi-angle Implementation of Atmospheric Correction (MAIAC) aerosol product over South Asia. Remote Sens. Environ. 2019, 224, 12–28.
44. Chen, T.; Guestrin, C. XGBoost: A Scalable Tree Boosting System. In Knowledge Discovery and Data Mining; Association for Computing Machinery: San Francisco, CA, USA, 2016; pp. 785–794.
45. Zhang, T.; He, W.; Zheng, H.; Cui, Y.; Song, H.; Fu, S. Satellite-based ground PM_2.5 estimation using a gradient boosting decision tree. Chemosphere 2021, 268, 128801.
46. Xiao, Q.; Chang, H.H.; Geng, G.; Liu, Y. An Ensemble Machine-Learning Model to Predict Historical PM_2.5 Concentrations in China from Satellite Data. Environ. Sci. Technol. 2018, 52, 13260–13269.
47. Xu, Y.; Ho, H.C.; Wong, M.S.; Deng, C.; Shi, Y.; Chan, T.-C.; Knudby, A. Evaluation of machine learning techniques with multiple remote sensing datasets in estimating monthly concentrations of ground-level PM_2.5. Environ. Pollut. 2018, 242, 1417–1426.
48. Geng, G.; Xiao, Q.; Liu, S.; Liu, X.; Cheng, J.; Zheng, Y.; Xue, T.; Tong, D.; Zheng, B.; Peng, Y.; et al. Tracking Air Pollution in China: Near Real-Time PM_2.5 Retrievals from Multisource Data Fusion. Environ. Sci. Technol. 2021, 55, 12106–12115.
49. He, Q.; Gao, K.; Zhang, L.; Song, Y.; Zhang, M. Satellite-derived 1-km estimates and long-term trends of PM_2.5 concentrations in China from 2000 to 2018. Environ. Int. 2021, 156, 106726.
50. Cao, J.; Shen, Z.; Chow, J.C.; Watson, J.G.; Lee, S.; Tie, X.; Ho, K.; Wang, G.; Han, Y. Winter and Summer PM_2.5 Chemical Compositions in Fourteen Chinese Cities. J. Air Waste Manag. Assoc. 2012, 62, 1214–1226.
51. Wang, Y.; Yao, L.; Wang, L.; Liu, Z.; Ji, D.; Tang, G.; Zhang, J.; Sun, Y.; Hu, B.; Xin, J. Mechanism for the formation of the January 2013 heavy haze pollution episode over central and eastern China. Sci. China-Earth Sci. 2014, 57, 14–25.
52. Chen, W.; Meng, H.; Song, H.; Zheng, H. Progress in Dust Modelling, Global Dust Budgets, and Soil Organic Carbon Dynamics. Land 2022, 11, 176.
53. Li, T.; Shen, H.; Zeng, C.; Yuan, Q.; Zhang, L. Point-surface fusion of station measurements and satellite observations for mapping PM_2.5 distribution in China: Methods and assessment. Atmos. Environ. 2017, 152, 477–489.
54. He, Z.; Liu, P.; Zhao, X.; He, X.; Liu, J.; Mu, Y. Responses of surface O₃ and PM_2.5 trends to changes of anthropogenic emissions in summer over Beijing during 2014–2019: A study based on multiple linear regression and WRF-Chem. Sci. Total Environ. 2022, 807, 150792.
55. Wang, X.; Dickinson, R.E.; Su, L.; Zhou, C.; Wang, K. PM_2.5 Pollution in China and How It Has Been Exacerbated by Terrain and Meteorological Conditions. Bull. Amer. Meteorol. Soc. 2018, 99, 105–119.
56. He, Q.; Huang, B. Satellite-based high-resolution PM_2.5 estimation over the Beijing-Tianjin-Hebei region of China using an improved geographically and temporally weighted regression model. Environ. Pollut. 2018, 236, 1027–1037.
57. Wei, J.; Li, Z.; Cribb, M.; Huang, W.; Xue, W.; Sun, L.; Guo, J.; Peng, Y.; Li, J.; Lyapustin, A.I.; et al. Improved 1 km resolution PM_2.5 estimates across China using enhanced space time extremely randomized trees. Atmos. Chem. Phys. 2020, 20, 3273–3289.
58. Bai, K.; Li, K.; Guo, J.; Chang, N. Multiscale and multisource data fusion for full-coverage PM_2.5 concentration mapping: Can spatial pattern recognition come with modeling accuracy? ISPRS-J. Photogramm. Remote Sens. 2022, 184, 31–44.
59. Wang, L.; Cai, K.; Si, Y.; Yu, C.; Zheng, H.; Li, S. Evaluation of Himawari-8 version 2.0 aerosol products against AERONET ground-based measurements over central and northern China. Atmos. Environ. 2020, 224, 117357.
60. Wang, T.; Song, H.; Wang, F.; Zhai, S.; Han, Z.; Wang, D.; Li, X.; Zhao, H.; Ma, R.; Zhang, G. Hysteretic effects of meteorological conditions and their interactions on particulate matter in Chinese cities. J. Clean Prod. 2020, 274, 122926.
61. Li, X.; Song, H.; Zhai, S.; Lu, S.; Kong, Y.; Xia, H.; Zhao, H. Particulate matter pollution in Chinese cities: Areal-temporal variations and their relationships with meteorological conditions (2015–2017). Environ. Pollut. 2019, 246, 11–18.
62. Yang, Q.; Yuan, Q.; Li, T.; Yue, L. Mapping PM_2.5 concentration at high resolution using a cascade random forest based downscaling model: Evaluation and application. J. Clean Prod. 2020, 277, 123887.
63. Yang, Q.; Yuan, Q.; Yue, L.; Li, T.; Shen, H.; Zhang, L. Mapping PM_2.5 concentration at a sub-km level resolution: A dual-scale retrieval approach. ISPRS-J. Photogramm. Remote Sens. 2020, 165, 140–151.

Figure 1. The framework of the proposed daily full-coverage PM_2.5 modeling scheme. The blue dashed box represents Stage 1 (annual modeling) and the outputs are the two annual models; the red dashed box represents Stage 2 (daily modeling) and the outputs are the estimated daily full-coverage PM_2.5 maps; T_i represents the day of the year.

Figure 2. Density scatter plots of the validation results for daily estimated PM_2.5 concentrations ((a–c) for each year and (d) for the total). The grey dashed line denotes the 1:1 line and the red solid line denotes the linear regression line.

Figure 3. Time series of evaluation results of the daily models and daily averaged in-situ PM_2.5 concentrations from 2018 to 2020. Red and blue points represent R² and RMSE, respectively, and green points represent PM_2.5 concentrations.

Figure 4. Spatial distribution (a–c) and regional numerical statistics (d) of annual mean estimated PM_2.5 concentration across China at 1 km resolution.

Figure 5. Multi-year averaged (2018–2020) estimated PM_2.5 concentrations for each month over the study area.

Figure 6. Number of days per year (a–f) and percentage of area per day (g) for three PM_2.5 pollution levels. All linear trends were significant at p < 0.05.

Table 1. Cross-validation results for the annual (M1 and M2) and daily (M3) models by year.

Year	N	Model	Site-Based CV R²	Site-Based CV MAE (μg/m³)	Site-Based CV RMSE (μg/m³)	Time-Based CV R²	Time-Based CV MAE (μg/m³)	Time-Based CV RMSE (μg/m³)
2018	528,400	M1	0.90	6.43	10.70	0.73	10.66	17.44
		M2	0.84	7.90	13.27	0.49	14.94	24.20
		M3	0.86	8.06	12.33	0.58	13.46	22.18
2019	527,248	M1	0.91	5.91	9.96	0.74	10.13	17.00
		M2	0.86	7.23	12.44	0.51	14.16	23.55
		M3	0.87	7.39	12.01	0.59	12.81	21.72
2020	532,453	M1	0.92	5.15	9.05	0.76	8.86	15.51
		M2	0.85	6.56	11.97	0.50	12.87	22.35
		M3	0.86	6.78	12.00	0.58	11.72	20.54

Table 2. Comparisons of model performance with previous studies at the national scale in China.

Reference	Spatial Resolution	Period	Full Coverage	Overall R²	Overall RMSE (μg/m³)	Site-Based R²	Site-Based RMSE (μg/m³)	Time-Based R²	Time-Based RMSE (μg/m³)
He and Huang (2018) [16]	3 km	2015	No	0.80	18.00	0.75	20.73	0.58	28.24
Fang et al. (2016) [13]	10 km	2013–2014	No	0.80	22.75	-	-	-	-
Chen et al. (2018) [37]	10 km	2014–2016	No	0.83	18.08	-	-	-	-
Li et al. (2017) [53]	3 km	2013–2014	No	0.67	20.93	-	-	-	-
Wei et al. (2021) [23]	1 km	2013–2018	No	0.86–0.90	10.0–18.4	-	-	-	-
Geng et al. (2021) [48]	10 km	2013–2020	Yes	0.80–0.88	13.9–22.1	0.69–0.83	14.6–26.4	0.58	27.5
Huang et al. (2021) [34]	1 km	2013–2019	Yes	0.88	15.73	0.92	5.05	0.85	12.90
Wang et al. (2021) [32]	5 km	2018–2020	Yes	0.93	8.98	0.88	11.55	0.73	17.44
This study	1 km	2018–2020	Yes	0.91	9.91	0.86–0.87	12–12.33	0.58–0.59	20.54–22.18

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

