Article

A Novel Robust Method for Solving CMB Receptor Model Based on Enhanced Sampling Monte Carlo Simulation

1
Lushan Binjiang Experimental School, Changsha 410013, China
2
School of Mathematics and Statistics, Central South University, Changsha 410083, China
3
School of Geoscience and Info-Physics, Central South University, Changsha 410083, China
4
School of Mathematics and Statistics, and Mobile E-business Collaborative Innovation Center of Hunan Province, Hunan University of Commerce, Changsha 410205, China
*
Author to whom correspondence should be addressed.
Processes 2019, 7(3), 169; https://doi.org/10.3390/pr7030169
Submission received: 15 February 2019 / Revised: 17 March 2019 / Accepted: 19 March 2019 / Published: 23 March 2019

Abstract

The traditional effective variance weighted least squares algorithms for solving CMB (Chemical Mass Balance) models have the following drawbacks: when there is collinearity among the sources, or the number of species is less than the number of sources, negative contribution values appear in the source apportionment results or the algorithm fails to converge. In this paper, a novel robust algorithm based on enhanced sampling Monte Carlo simulation and effective variance weighted least squares (ESMC-CMB) is proposed, which overcomes these weaknesses. In the practical source apportionment instances that follow, when nine species and nine sources with no collinearity among them are selected, EPA-CMB8.2 (U.S. Environmental Protection Agency-CMB8.2), NKCMB1.0 (NanKai University, China-CMB1.0), and ESMC-CMB obtain similar results. When the source Raise Dust is added to the source profiles, or nine sources and eight species are selected, EPA-CMB8.2 and NKCMB1.0 cannot solve the model, but the proposed ESMC-CMB algorithm achieves satisfactory results, which verifies the robustness and effectiveness of ESMC-CMB.

1. Introduction

Atmospheric particulate matter (PM10 and PM2.5, with diameters less than 10 μm and 2.5 μm, respectively) is a mixture of solid or liquid particles suspended in the air, and is an important air pollutant in urban environments [1,2,3]. Epidemiological studies have shown that PM2.5/PM10 exposure is closely related to increases in respiratory symptoms, lung cancer mortality, and cardiovascular disease [4,5,6,7,8,9,10]. China is one of the countries with the most serious PM2.5 pollution in the world. In recent years, a total of 28 provinces and cities have reported heavy PM2.5 pollution; on average, each province has nearly 20 days of heavy pollution per year.
At present, haze in China is frequent, widespread, and long-lasting. It inconveniences public life, threatens human health, and is of great concern to society and the government. Understanding and clarifying the potential sources of PM2.5 and their contributions is important [4]. Source apportionment of PM2.5 has become one of the core strategies in the prevention and control of atmospheric pollution.
The CMB (Chemical Mass Balance) air quality model [5,6] is the most important model in atmospheric particulate matter source apportionment [7]. It is recommended by the United States' EPA (Environmental Protection Agency) and is mainly used to study the sources of TSP (Total Suspended Particulate), PM2.5, PM10, VOC (Volatile Organic Compounds), and other pollutants, together with their contributions. CMB receptor models are established according to the principle of mass balance: the concentration of each chemical species at the receptor can be expressed as the sum of the products of the species abundances in the source profiles and the source contributions.
The CMB receptor model [8,9] is composed of a set of linear equations, which indicates that the receptor concentration of each chemical element is equal to the linear sum of the product of the element content and the source contribution concentration. The basic principle of CMB model is mass conservation. It is assumed that there are several sources (J) that contribute to atmospheric particulates in the receptor, and that: (1) compositions of source emissions are constant over the period of ambient and source sampling; (2) the number of sources or source categories is less than or equal to the number of species; (3) the chemical composition of the particulate matter emitted by the various sources is significantly different; (4) the chemical composition of the particulate matter emitted by the source class is relatively stable; (5) all sources that make an obvious contribution to the receptors have their respective emission characteristics; (6) there is no interaction between the particles emitted by the source class, so the change in the process of transmission can be ignored; and (7) measurement uncertainties are random, uncorrelated, and normally distributed. Then the total substance concentration measured on the receptor is the linear sum of the contribution of each source.
The methods for solving CMB equations mainly include: (1) trace element method [10]; (2) linear programming solution [11]; (3) ordinary weighted least squares method [12]; (4) ridge regression weighted least squares [13]; (5) partial least squares [14]; (6) neural networks [15]; and (7) effective variance weighted least squares (EVWLS) with or without an intercept [16].
At present, the most commonly used algorithm for solving the CMB model is the EVWLS method [17], which is derived by minimizing the weighted sum of the squares of the differences between the measured and the calculated values of $C_i$ and $F_{ij}$, and is a practical method for calculating the source contribution $S_j$ and its error $\sigma_{S_j}$:
$$\min m^2 = \sum_{i=1}^{I} \frac{\left( C_i - \sum_{j=1}^{J} F_{ij} S_j \right)^2}{V_{eff,i}},$$
where the effective variance is $V_{eff,i} = \sigma_{C_i}^2 + \sum_{j=1}^{J} \sigma_{F_{ij}}^2 S_j^2$, $\sigma_{S_j}$ (µg m$^{-3}$ or g/g) is the uncertainty in source contribution $S_j$ (µg m$^{-3}$ or g/g), $\sigma_{C_i}$ (µg m$^{-3}$ or g/g) is the uncertainty in the ambient concentration of species $i$, and $\sigma_{F_{ij}}$ is the uncertainty in the fraction of species $i$ in the source $j$ profile.
The EVWLS method is actually an improvement over the ordinary weighted least squares method to minimize the sum of squares of the differences between the weighted chemical composition measurements and the calculated values.
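As a minimal sketch of the objective just described, the following Python fragment evaluates the effective variance and the weighted sum of squares for a trial vector of source contributions. The function names, the toy profile matrix, and the 10% relative uncertainties are illustrative assumptions and are not taken from this study.

```python
import numpy as np

def effective_variance(sigma_c, sigma_f, s):
    """V_eff,i = sigma_Ci^2 + sum_j sigma_Fij^2 * S_j^2 (one value per species)."""
    return sigma_c**2 + (sigma_f**2) @ (s**2)

def ev_objective(c, f, s, sigma_c, sigma_f):
    """Effective-variance weighted sum of squared residuals m^2."""
    resid = c - f @ s
    return float(np.sum(resid**2 / effective_variance(sigma_c, sigma_f, s)))

# Hypothetical toy data: 3 species (rows), 2 sources (columns).
f = np.array([[0.5, 0.1],
              [0.2, 0.6],
              [0.1, 0.1]])
s_true = np.array([10.0, 5.0])
c = f @ s_true            # noise-free receptor concentrations
sigma_c = 0.1 * c         # assumed 10% measurement uncertainty
sigma_f = 0.1 * f         # assumed 10% profile uncertainty

m2_true = ev_objective(c, f, s_true, sigma_c, sigma_f)
m2_off = ev_objective(c, f, s_true * 1.2, sigma_c, sigma_f)
```

With noise-free data the objective vanishes at the true contributions and grows as the trial vector moves away from them, which is the property the minimization relies on.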
However, the above algorithms have some weaknesses: near-collinear sources lead to incorrect source contributions, and the number of chemical species must be greater than or equal to the number of sources. In addition, most of these algorithms are ultimately transformed into optimization problems, many of which are NP (Non-deterministic Polynomial) problems, so in general they return a locally optimal or suboptimal value instead of a globally optimal one. Instability is therefore a serious drawback of these algorithms: different runs of the same algorithm on the same input dataset may produce very different outputs, or exhibit high variance under the same diagnostic criteria.
The Monte Carlo method [18], also known as stochastic simulation or statistical experiments, is based on statistical theory, according to the law of large numbers, using computer simulation technology [19] to solve some practical problem that is difficult to figure out directly with mathematical or other methods. The Monte Carlo method uses computer programs and mathematical models [20] to simulate practical random phenomena, through simulation experiments to get experimental data, and then infers from the analysis to get the law of certain phenomena. Monte Carlo simulation [19] is a method for exploring the solution and sensitivity of a complex system by varying the parameters within the statistical constraints. It is widely used in many fields such as engineering [21], environmental science [22], statistical physics [23], biophysics [24], materials science [25], and financial engineering [26]. Many practical problems are often accompanied by many random factors. If we take these factors into account, the model will become too complex to solve. However, we can utilize the Monte Carlo method to generate a random number to simulate these complicated phenomena, and then find out the operation law. The validity of the Monte Carlo method relies on the sampling process in simulation. However, the simple Monte Carlo algorithm converges too slowly, and it is easy to converge to local extreme points.
In this paper, we explore a novel robust method for solving CMB receptor model based on enhanced sampling Monte Carlo simulation, which overcomes the shortcomings of the above algorithms. In other words, when collinearity exists in the source profiles or the number of source profiles is greater than the number of species, the ESMC-CMB (Enhanced Sampling Monte Carlo CMB) algorithm can come to the correct results for source apportionment. In general, these enhanced sampling methods can be employed to help us quickly find an optimal stable solution when the model is complex, nonlinear, or involves more than just a couple uncertain parameters.
This paper is organized as follows. Section 2 provides a literature review about the CMB model and enhanced sampling Monte Carlo simulation. In Section 3, the proposed enhanced sampling Monte Carlo CMB algorithm (ESMC-CMB) is described. Section 4 presents the related numerical experiments and a comparison with various traditional algorithms. Finally, conclusions are given in Section 5.

2. CMB Model and Enhanced Sampling Monte Carlo Simulation

Methods commonly used for the particulate source apportionment include receptor model, source emission inventory, and source dispersion models. The source emission inventory method determines its contribution rate by investigating and accounting for emission factors and activity levels for different source categories. The source dispersion model is a combination of meteorological conditions, emission sources, and chemical processes to assess the distribution and contribution of different source classes in three dimensions [27]. The receptor model is a commonly used model in source apportionment.
In general, due to source $j$ with constant emission rate $E_j$, the source contribution $S_j$ present at a receptor during a sampling period of length $T$ is
$$S_j = D_j E_j,$$
where:
$$D_j = \int_0^T d\left[ u(t), \sigma(t), x_j \right] dt.$$
$D_j$ is a dispersion factor depending on atmospheric stability ($\sigma$), wind velocity ($u$), and the location of source $j$ with respect to the receptor ($x_j$). All parameters in Equation (2) vary with time, so the instantaneous $D_j$ must be integrated over time period $T$ [27].
The CMB receptor model consists of the solution of a set of linear equations that express the concentration of each chemical species at the receptor as the sum of the products of source profile abundances and source contributions. Source profile abundances (i.e., mass fractions of certain chemicals or other properties emitted from each source) and receptor concentrations (estimated with appropriate uncertainties) are used as input data for CMB. In order to distinguish the contributions of source types, the measured chemical and physical properties must occur in different proportions in the emissions of different sources, and the changes in these proportions between source and receptor must be negligible or possible to approximate. The CMB model calculates the contribution of each source and the uncertainty of each contribution. The principle of the CMB receptor model is shown in Figure 1.
The receptor model was used to identify the source of the receptor and determine the quantitative contribution of various sources to the receptor by analyzing the chemical tracers of the source of the environmental samples and the emission sources. If there is no interaction between their emissions to cause mass removal, the total mass measured at the receptor C is a linear sum of the contributions of the individual sources S j :
$$C = \sum_{j=1}^{J} D_j E_j = \sum_{j=1}^{J} S_j.$$
Similarly, the mass concentration of elemental component i, Ci, will be
$$C_i = \sum_{j=1}^{J} F_{ij} S_j, \quad i = 1, 2, \ldots, I,$$
where Fij is the mass fraction of source contribution Sj composed of element i at the receptor. The number of chemical species (I) must be greater than or equal to the number of sources (J) for a unique solution to these equations.
Equations (4) and (5) are based on the conservation of matter and mass. In Equation (5), Ci and Fij are the inputs to the model, and Sj is the source contribution we need to calculate.
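To make the roles of the quantities in the mass balance concrete, the short Python sketch below builds species concentrations from a made-up profile matrix and made-up source contributions, then recovers the contributions by ordinary least squares. All numbers are hypothetical and serve only to illustrate the linear structure when the number of species is at least the number of sources.

```python
import numpy as np

# Hypothetical profile matrix F: I = 3 species (rows), J = 2 sources (columns).
F = np.array([[0.30, 0.05],
              [0.10, 0.40],
              [0.02, 0.10]])
S = np.array([20.0, 10.0])      # assumed source contributions

C_species = F @ S               # species balance: C_i = sum_j F_ij * S_j
C_total = S.sum()               # total mass balance: C = sum_j S_j

# With I >= J and distinct profiles, S is recoverable from C_i and F_ij.
S_recovered, *_ = np.linalg.lstsq(F, C_species, rcond=None)
```

Here the measured concentrations and the profiles play the role of data, and the contributions are the unknowns.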
There are several methods to solve the CMB receptor models: (1) the tracer element method [28]; (2) an ordinary weighted least squares solution [28]; (3) a linear programming solution [29], which maximizes the sum of the source contributions; (4) a ridge regression weighted least squares solution with or without an intercept [30] that is one approach for handling the multi-collinearity; (5) a neural networks solution; and (6) an EVWLS solution, which is the most common algorithm.
At present, the most commonly used algorithm to solve the CMB model is the effective variance least squares method, because this method is a practical method to calculate the error σ S j of source contribution S j . The effective variance least squares method is actually an improvement on the ordinary weighted least squares method, which minimizes the sum of squares of the difference between the measured and calculated values of the weighted chemical components:
$$\min m^2 = \sum_{i=1}^{I} \frac{\left( C_i - \sum_{j=1}^{J} F_{ij} S_j \right)^2}{V_{eff,i}},$$
where the effective variance is $V_{eff,i} = \sigma_{C_i}^2 + \sum_{j=1}^{J} \sigma_{F_{ij}}^2 S_j^2$, $\sigma_{S_j}$ (µg m$^{-3}$ or g/g) is the uncertainty in source contribution $S_j$ (µg m$^{-3}$ or g/g), $\sigma_{C_i}$ (µg m$^{-3}$ or g/g) is the uncertainty (i.e., measurement error) in the ambient concentration of species $i$, and $\sigma_{F_{ij}}$ is the uncertainty (i.e., measurement error) in the fraction of species $i$ in the source $j$ profile.
The matrix form of the CMB model is as follows:
$$C_{I \times 1} = F_{I \times J} \, S_{J \times 1}.$$
The steps of EVWLS iterative algorithm for solving the CMB model (Equation (7)) are as follows:
  • Step 1: Set the initial estimate of the source contributions equal to zero:
    $$S_j^{k=0} = 0, \quad j = 1, 2, \ldots, J.$$
  • Step 2: Calculate the diagonal components $V_{eff,i}$ of the effective variance matrix. All off-diagonal components of this matrix are equal to zero:
    $$V_{eff,i}^{k} = \sigma_{C_i}^2 + \sum_{j=1}^{J} \left( S_j^{k} \right)^2 \sigma_{F_{ij}}^2.$$
  • Step 3: Calculate the $(k+1)$th value of $S_j$, the $j$th component of
    $$S^{k+1} = \left( F^T \left( V_e^{k} \right)^{-1} F \right)^{-1} F^T \left( V_e^{k} \right)^{-1} C.$$
  • Step 4: If the relative change in the result of Equation (10) is greater than 1%, return to step 2; if it is 1% or less, terminate the iteration and go to step 5:
    $$\text{If } \left| S_j^{k+1} - S_j^{k} \right| / S_j^{k+1} > 0.01, \text{ go to step 2. If } \left| S_j^{k+1} - S_j^{k} \right| / S_j^{k+1} \leq 0.01, \text{ go to step 5.}$$
  • Step 5: Calculate the value of $\sigma_{S_j}$ at iteration $k+1$:
    $$\sigma_{S_j} = \left[ \left( F^T \left( V_e^{k+1} \right)^{-1} F \right)^{-1}_{jj} \right]^{1/2}, \quad j = 1, 2, \ldots, J,$$
where $C = (C_1, \ldots, C_I)^T$ is a column vector with $C_i$ as the $i$th component; $S = (S_1, \ldots, S_J)^T$ is a column vector with $S_j$ as the $j$th component; $F$ is an $I \times J$ matrix of $F_{ij}$, the source composition matrix; $\sigma_{C_i}$ is the one-standard-deviation uncertainty of the $C_i$ measurement; $\sigma_{F_{ij}}$ is the one-standard-deviation uncertainty of the $F_{ij}$ measurement; and $V_e$ is the diagonal matrix of effective variances.
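The iteration above can be sketched in a few lines of Python. This is an illustrative implementation under stated assumptions rather than the code behind EPA-CMB8.2 or NKCMB1.0: the function name `evwls`, the iteration cap `max_iter`, the guard against division by zero, and the toy data are all hypothetical.

```python
import numpy as np

def evwls(c, f, sigma_c, sigma_f, tol=0.01, max_iter=100):
    """Effective variance weighted least squares for C = F S (sketch)."""
    s = np.zeros(f.shape[1])                          # step 1: S^0 = 0
    for _ in range(max_iter):
        v_eff = sigma_c**2 + (sigma_f**2) @ (s**2)    # step 2: diagonal of V_e
        ftw = f.T / v_eff                             # F^T V_e^{-1}
        s_new = np.linalg.solve(ftw @ f, ftw @ c)     # step 3: weighted normal equations
        # step 4: relative change per source (tiny floor avoids division by zero)
        rel = np.abs(s_new - s) / np.maximum(np.abs(s_new), 1e-300)
        s = s_new
        if np.all(rel <= tol):
            break
    v_eff = sigma_c**2 + (sigma_f**2) @ (s**2)        # step 5: uncertainty of S
    cov = np.linalg.inv((f.T / v_eff) @ f)
    return s, np.sqrt(np.diag(cov))

# Hypothetical toy problem: 3 species, 2 sources, noise-free concentrations.
f = np.array([[0.5, 0.1],
              [0.1, 0.5],
              [0.2, 0.2]])
s_true = np.array([6.0, 3.0])
c = f @ s_true
s_hat, sigma_s = evwls(c, f, sigma_c=0.05 * c, sigma_f=0.1 * f)
```

The solve in step 3 requires the matrix $F^T V_e^{-1} F$ to be invertible, which is where collinear profiles or too few species cause this scheme to break down.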
The above algorithm shows that the input parameters of the model are: the measured concentrations $C_i$ of the chemical components at the receptor and their standard deviations $\sigma_{C_i}$, and the measured values $F_{ij}$ of the source chemical composition profiles and their standard deviations $\sigma_{F_{ij}}$. The output parameters of the model are: the calculated source contributions $S_j$ and their standard deviations $\sigma_{S_j}$, and the calculated source contributions to each chemical component $S_{ij}$ and their standard deviations $\sigma_{S_{ij}}$.
In the actual work of source apportionment, there are two commonly used software tools, EPA-CMB8.2 (V8.2, EPA, Washington, USA, 2004) and NKCMB1.0 (V1.0, Nankai University, Tianjin, China, 2005), which are the concrete implementation of above effective variance least squares algorithm for solving the CMB model.
The CMB receptor model is one of the standard methods used by the U.S. Environmental Protection Agency (EPA) to assess air quality. The practical tool software EPA-CMB8.2 based on the CMB model and the effective variance least squares algorithm is recommended by the EPA. NKCMB1.0 is a practical software tool for PM2.5 source apportionment, developed by the Key Laboratory of Urban Air Particulate Pollution Prevention and Control, Nankai University, Tianjin China, based on the CMB receptor mathematical model and the corresponding effective variance least squares algorithm. NKCMB1.0 is more suitable for source analysis and calculation in China’s more complex air quality environment.
As a stochastic method, Monte Carlo modeling can be used to describe and analyze complex problems by computer simulation sampling based on probability theory combined with certain statistical methods. Although the method emerged in the 1940s, it was limited to defense-related nuclear technology because it required sufficient computing resources to analyze the neutron behavior in matter [20]. With the rapid development of high-speed computers, the Monte Carlo simulation method is more and more widely used [19,20].
The basic idea of the Monte Carlo method is to establish an appropriate probability model or stochastic process whose parameters (such as the probability of an event or the mathematical expectation of a random variable) are equal to the solution of the problem. Repeated random sampling tests of the model or process are then carried out, and the approximate solution is obtained by statistical analysis of the results and a final calculation of the parameters.
For example, in a Monte Carlo simulation problem we represent the quantity we want to know as the expected value of a random variable $Y$, such as $\mu = E(Y)$. Then we generate values $Y_1, \ldots, Y_n$ randomly and independently from the distribution of $Y$ and take their average:
$$\hat{\mu} = \frac{1}{n} \sum_{i=1}^{n} Y_i,$$
as the estimate of $\mu$.
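A minimal sketch of this simple sampling estimator in Python follows. The target quantity, the expectation of the square of a uniform random variable on [0, 1] (which equals 1/3), is a hypothetical example chosen because its exact value is known; the helper name `mc_mean` is likewise an assumption.

```python
import numpy as np

def mc_mean(sample_y, n, rng):
    """Estimate mu = E(Y) by the sample average, with its standard error."""
    y = sample_y(rng, n)
    return y.mean(), y.std(ddof=1) / np.sqrt(n)

rng = np.random.default_rng(0)
# Y = U^2 with U ~ Uniform(0, 1); the exact expectation is 1/3.
mu_hat, se = mc_mean(lambda r, n: r.uniform(0.0, 1.0, n) ** 2, 100_000, rng)
```

The standard error shrinks only as $1/\sqrt{n}$, which is the slow convergence that motivates enhanced sampling.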
However, the convergence speed of the above simple sampling Monte Carlo method is too slow; for a large dimension sampling space, the time to complete the sampling calculation is intolerable.
This paper will explore a new, enhanced sampling method to accelerate the convergence of the algorithm from the following aspects.
Firstly, in the process of solving the receptor CMB model, if the diagnostic indicator $PM = \sum_{j=1}^{J} \eta_j = \sum_{j=1}^{J} S_j / C < \lambda$, the results do not meet the requirements. So we can sample in the space $PM = \sum_{j=1}^{J} \eta_j = \sum_{j=1}^{J} S_j / C \geq \lambda$, which reduces the size of the sample space to some extent; in the following experiment, $\lambda = 0.7$ is selected. In the new sampling space, the Gibbs sampling method is used.
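One simple way to draw candidate contributions that already satisfy the restricted space is sketched below. It is not the Gibbs sampler used in this paper: as a stand-in assumption, the total explained mass is drawn uniformly between $\lambda C$ and $C$ and split among sources with Dirichlet-distributed shares. The function name and the toy values are hypothetical.

```python
import numpy as np

def sample_restricted(c_total, n_sources, lam, rng):
    """Draw S_j >= 0 with lam * C <= sum_j S_j <= C (stand-in sampler)."""
    total = rng.uniform(lam * c_total, c_total)   # explained mass
    shares = rng.dirichlet(np.ones(n_sources))    # non-negative, sums to 1
    return total * shares

rng = np.random.default_rng(0)
draws = np.array([sample_restricted(100.0, 9, 0.7, rng) for _ in range(1000)])
pm = draws.sum(axis=1) / 100.0                    # diagnostic indicator PM per draw
```

Every draw has PM between 0.7 and 1, so no simulation effort is spent on candidates that would be rejected by the diagnostic.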
Gibbs sampling [31,32,33], or a Gibbs sampler, is an MCMC (Markov chain Monte Carlo) algorithm for obtaining a sequence of observations that are approximately drawn from a specified multivariate probability distribution. Gibbs sampling can be regarded as a special case of the Metropolis-Hastings algorithm; its sampling distribution can be deduced from the properties of the Markov chain and its transition probability matrix, and the chain finally converges to the joint distribution. The algorithm is named after Josiah Willard Gibbs and was proposed by the brothers Stuart and Donald Geman in 1984 [31,32,33]. Gibbs sampling is suitable for multivariate distributions whose conditional distributions are easier to sample than the marginal distributions. At the same time, in order to accelerate the convergence of the simulation process, in this paper we adopt the enhanced Gibbs sampling algorithm from [34], called the enhanced sampling algorithm for short.
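For readers unfamiliar with the technique, the following Python sketch shows a plain (not enhanced) Gibbs sampler on a textbook target: a standard bivariate normal with correlation `rho`, whose conditionals are $x \mid y \sim N(\rho y, 1 - \rho^2)$ and $y \mid x \sim N(\rho x, 1 - \rho^2)$. The target distribution and all names are illustrative assumptions and are unrelated to the CMB data.

```python
import numpy as np

def gibbs_bivariate_normal(rho, n_samples, burn_in, rng):
    """Alternate draws from the two full conditionals of a bivariate normal."""
    x, y = 0.0, 0.0
    sd = np.sqrt(1.0 - rho**2)
    out = np.empty((n_samples, 2))
    for t in range(burn_in + n_samples):
        x = rng.normal(rho * y, sd)     # draw x given the current y
        y = rng.normal(rho * x, sd)     # draw y given the new x
        if t >= burn_in:
            out[t - burn_in] = (x, y)
    return out

samples = gibbs_bivariate_normal(0.8, 20_000, 1_000, np.random.default_rng(1))
corr = np.corrcoef(samples.T)[0, 1]     # should be close to rho
```

Each update needs only a one-dimensional conditional draw, which is why the method suits problems where the joint distribution is awkward to sample directly.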
In order to overcome the shortcomings of the effective variance algorithm for solving the CMB model, in this paper, the EVWLS (effective variance weighted least square) algorithm will be combined with the Monte Carlo simulation algorithm of enhanced sampling to obtain a novel robust ESMC-CMB algorithm for solving the CMB receptor model. The algorithm is programmed by using MATLAB (V8.5, Mathworks, Natick USA, 2015) and implemented through numerical experiments with a real background. By comparing with the results of EPA-CMB 8.2 and NKCMB 1.0, the accuracy, robustness, and superiority of ESMC-CMB algorithm are fully verified.

3. Solving CMB Model Based on Enhanced Sampling Monte Carlo Simulation

For the CMB model with consideration of random error:
$$C_i = \sum_{j=1}^{J} F_{ij} S_j + \varepsilon_i, \quad i = 1, 2, \ldots, I,$$
where $C_i$ is the ambient concentration of species $i$, $S_j$ is the source contribution of source $j$, $F_{ij}$ is the fraction of species $i$ in source $j$, and $\varepsilon_i$ is the error term. The number of chemical species ($I$) must be equal to or greater than the number of sources ($J$) for a unique solution to these equations. Equation (13) is solved by an effective variance weighted least squares approach: minimizing $\chi^2$, where
$$\chi^2 = \sum_{i=1}^{I} \left[ \frac{\left( C_i - \sum_{j=1}^{J} F_{ij} S_j \right)^2}{\sigma_{C_i}^2 + \sum_{j=1}^{J} \sigma_{F_{ij}}^2 S_j^2} \right].$$
In the CMB model, uncertainties in the source contribution are estimated as
$$\sigma_{S_j} = \left( \sum_{i=1}^{I} \frac{F_{ij}^2}{\sigma_{C_i}^2 + \sum_{j=1}^{J} \sigma_{F_{ij}}^2 S_j^2} \right)^{-1/2},$$
where $\sigma_{S_j}$ (µg m$^{-3}$ or g/g) is the uncertainty in source contribution $S_j$ (µg m$^{-3}$ or g/g), $\sigma_{C_i}$ (µg m$^{-3}$ or g/g) is the uncertainty in the ambient concentration of species $i$, and $\sigma_{F_{ij}}$ is the uncertainty in the fraction of species $i$ in the source $j$ profile. Uncertainties in input variables are propagated by inversely weighting by the EV (effective variance).
In this paper, a new method for solving the CMB receptor model based on enhanced sampling Monte Carlo simulation is proposed as follows:
$$
\begin{cases}
\min \chi^2 = \displaystyle\sum_{i=1}^{I} \left[ \frac{\left( C_i - \sum_{j=1}^{J} F_{ij} S_j \right)^2}{\sigma_{C_i}^2 + \sum_{j=1}^{J} \sigma_{F_{ij}}^2 S_j^2} \right] \\
\text{s.t.} \begin{cases}
\text{Generate random inputs } S_j \text{ with the enhanced Gibbs sampler} \\
\sum_{j=1}^{J} S_j \leq C \\
S_j \geq 0 \\
PM = \sum_{j=1}^{J} \eta_j = \sum_{j=1}^{J} S_j / C \geq \lambda \\
i = 1, 2, \ldots, I; \quad j = 1, 2, \ldots, J
\end{cases} \\
\sigma_{S_j} = \left( \displaystyle\sum_{i=1}^{I} \frac{F_{ij}^2}{\sigma_{C_i}^2 + \sum_{j=1}^{J} \sigma_{F_{ij}}^2 S_j^2} \right)^{-1/2}.
\end{cases}
$$
Then we can get the following ESMC-CMB algorithm:
Algorithm ESMC-CMB: Given the initial receptor and source profile data $C_i$, $\sigma_{C_i}$, $F_{ij}$, $\sigma_{F_{ij}}$, $i = 1, 2, \ldots, I$, $j = 1, 2, \ldots, J$, the number of source and receptor components $I$, the number of sources $J$, $obj = 10^{100}$, the number of simulations $N$, and $n = 0$.
  • Step 1: Generate random variables with the enhanced sampling Monte Carlo method proposed in this paper: $S_j \geq 0$, $j = 1, 2, \ldots, J$.
  • Step 2: If $\sum_{j=1}^{J} S_j > C$, go to step 1.
  • Step 3: Set $n = n + 1$ and calculate $\chi^2$; if $\chi^2 < obj$, then set $obj = \chi^2$ and $obj\_S_j = S_j$.
  • Step 4: If $n < N$, go to step 1.
  • Step 5: Calculate $\chi^2$, $\eta_j = obj\_S_j / C$, and $\sigma_{S_j}$.
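The control flow of the steps above can be sketched in Python as follows. This is a simplified illustration and not the authors' MATLAB implementation: the enhanced Gibbs sampler is replaced by a plain stand-in proposal (uniform total mass between $\lambda C$ and $C$, Dirichlet shares), infinity is used in place of $10^{100}$ as the initial objective, and the function names and toy data are hypothetical.

```python
import numpy as np

def chi_square(c, f, s, sigma_c, sigma_f):
    """Effective-variance weighted chi-square for a candidate S."""
    v_eff = sigma_c**2 + (sigma_f**2) @ (s**2)
    return float(np.sum((c - f @ s) ** 2 / v_eff))

def mc_cmb(c, f, sigma_c, sigma_f, c_total, lam=0.7, n_sim=20_000, seed=0):
    rng = np.random.default_rng(seed)
    n_sources = f.shape[1]
    obj, obj_s = np.inf, None
    n = 0
    while n < n_sim:                                   # step 4: repeat N times
        # Step 1 (stand-in for the enhanced sampler): S_j >= 0, PM >= lam.
        total = rng.uniform(lam * c_total, c_total)
        s = total * rng.dirichlet(np.ones(n_sources))
        if s.sum() > c_total:                          # step 2: reject if sum > C
            continue
        n += 1                                         # step 3: keep the best draw
        chi2 = chi_square(c, f, s, sigma_c, sigma_f)
        if chi2 < obj:
            obj, obj_s = chi2, s
    # Step 5: contribution rates and uncertainties at the best draw.
    eta = obj_s / c_total
    v_eff = sigma_c**2 + (sigma_f**2) @ (obj_s**2)
    sigma_s = np.sum(f**2 / v_eff[:, None], axis=0) ** -0.5
    return obj_s, eta, sigma_s, obj

# Hypothetical toy problem: 3 species, 2 sources, total mass C = 10.
f = np.array([[0.5, 0.1],
              [0.1, 0.5],
              [0.2, 0.2]])
s_true = np.array([6.0, 3.0])
c = f @ s_true
best_s, eta, sigma_s, best_chi2 = mc_cmb(c, f, 0.05 * c, 0.1 * f, c_total=10.0)
```

Because no matrix is inverted during the search, the loop still returns a non-negative candidate when profiles are collinear or species are fewer than sources, although a random search of this kind gives an approximate minimizer rather than an exact one.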

4. Application to a Realistic Case

This realistic case focuses on the dataset from a city in China. The profiles of the receptor and source component are shown in Table 1 and Table 2.
EPA-CMB8.2 and NKCMB1.0 software can be used to solve the CMB model when the number of sources or source categories is less than or equal to the number of species. So, firstly, we select nine sources (Soil Dust, Construction Dust, Coal Combustion, Cooking Smoke, Biomass Burning, Industrial Processes, NO3, SO42−, Vehicular Emissions) and nine components (Al, Si, K, Ca, Fe, OC, EC, NO3, SO4), and use EPA-CMB8.2 and NKCMB1.0 to calculate source apportionment with the data in Table 1 and Table 2; the results are shown in Figure 2 and Figure 3.
With the same selection of the source profiles and receptor components and the same dataset, we use our proposed ESMC-CMB algorithm to calculate the source apportionment, and the results are shown in Figure 4. Table 3 shows the numerical comparison of the contribution rates of the above three algorithms.
From the results of Figure 2, Figure 3 and Figure 4 and Table 3, we can see that the results of source apportionment calculated with the above three algorithms are very close, and the correctness of the ESMC-CMB algorithm is verified.
If eight species (Al, Si, K, Ca, Fe, OC, EC, and NO3) are selected, then EPA-CMB8.2 and NKCMB1.0 cannot be used because the number of species is less than the number of sources, but the proposed ESMC-CMB algorithm can still calculate the results, which are shown in Figure 5.
As there is strong collinearity between the sources Raise Dust (RD) and Soil Dust, if RD is added to the source profiles (Soil Dust, Construction Dust, Coal Combustion, Cooking Smoke, Biomass Burning, Industrial Processes, NO3, SO42−, Vehicular Emissions) to participate in the calculation using EPA-CMB8.2 and NKCMB1.0, some values of source contribution will be negative, so correct results cannot be obtained. However, using our proposed ESMC-CMB algorithm, we can get the correct value of the source apportionment as shown in Figure 6.
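Why collinearity and a shortage of species defeat least squares solvers can be illustrated numerically. The Python sketch below uses a made-up profile matrix, not the data in Table 1 and Table 2: appending a column that is almost a scaled copy of an existing profile inflates the condition number, and dropping a species leaves fewer equations than unknowns.

```python
import numpy as np

# Hypothetical profiles: 4 species (rows), 3 distinct sources (columns).
f = np.array([[0.30, 0.05, 0.10],
              [0.10, 0.40, 0.05],
              [0.05, 0.10, 0.30],
              [0.20, 0.10, 0.10]])
cond_base = np.linalg.cond(f)

# Add a source whose profile is nearly a scaled copy of source 0.
near_copy = 1.02 * f[:, [0]] + 1e-3 * np.array([[1.0], [-1.0], [1.0], [-1.0]])
f_collinear = np.hstack([f, near_copy])
cond_collinear = np.linalg.cond(f_collinear)

# Drop one species: 3 equations for 4 unknown contributions.
f_under = f_collinear[:3]
rank_under = np.linalg.matrix_rank(f_under)
```

A large condition number means small measurement errors are amplified into large, possibly negative, contribution estimates, and a rank below the number of sources means the normal equations have no unique solution.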
A comparison of the above results is given in Table 4. As can be seen from Table 4, in the practical source apportionment instances, when nine species and nine sources with no collinearity among them are selected, EPA-CMB8.2, NKCMB1.0, and ESMC-CMB obtain similar results. However, because there is strong collinearity between the sources Raise Dust (RD) and Soil Dust, when Raise Dust is added to the source profiles, or when nine sources and eight species are selected, EPA-CMB8.2 and NKCMB1.0 cannot solve the model, whereas the proposed ESMC-CMB algorithm reaches satisfactory results, which verifies the robustness and effectiveness of ESMC-CMB.

5. Conclusions

In this paper, a new robust algorithm for a CMB receptor model based on enhanced sampling Monte Carlo simulation and the effective variance weighted least squares is proposed. Because of the weaknesses of the traditional algorithms and software for CMB receptor source apportionment model such as collinearity and the requirement that the number of chemical species be greater than or equal to the number of sources, in many cases, software such as EPA-CMB8.2 and NKCMB1.0 cannot obtain results for the source apportionment or some values of the source contribution are negative. However, the proposed robust novel ESMC-CMB algorithm can overcome the above weaknesses and achieve satisfactory results. In the realistic source apportionment experiments, firstly, we selected nine sources (Soil Dust, Construction Dust, Coal Combustion, Cooking Smoke, Biomass Burning, Industrial Processes, NO3, SO42−, Vehicular Emissions) with no collinearity among them and nine species (Al, Si, K, Ca, Fe, OC, EC, NO3, SO4), and used the EPA-CMB8.2, NKCMB1.0, and ESMC-CMB algorithms to calculate source contributions, and got similar results, but when we selected eight species and nine sources or added Raise Dust to the source profiles, because of the collinearity with Soil Dust, EPA-CMB8.2 and NKCMB1.0 could not obtain correct results; however, the proposed ESMC-CMB algorithm can calculate the right results for source apportionment. This has fully demonstrated the robustness and effectiveness of the ESMC-CMB algorithm.
Although the ESMC-CMB algorithm has many advantages, there is often missing data in the actual problem. How to further improve the ESMC-CMB algorithm in the case of missing data is the next area of research to tackle.
Due to the limitations of the CMB model, in practical studies of air pollution the source apportionment results from the ESMC-CMB algorithm should be cross-checked against the results of other models, such as PMF (Positive Matrix Factorization) and CMAQ (Community Multiscale Air Quality), to obtain more reasonable results.

Author Contributions

Conceptualization, M.H.; Data curation, W.H.; Formal analysis, M.H., W.H., and Y.Y.; Methodology, W.H. and Z.W.; Writing, Q.W. and X.X.

Funding

This study was funded by the Natural Science Foundation of China under Grants 61375063, 61271355, 11301549, and 11271378.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Shen, G.; Wang, W.; Yang, Y.; Zhu, C.; Min, Y.; Xue, M.; Ding, J.; Li, W.; Wang, B.; Shen, H.; et al. Emission factors and particulate matter size distribution of polycyclic aromatic hydrocarbons from residential coal combustions in rural Northern China. Atmos. Environ. 2010, 44, 5237–5243.
  2. Kong, S.; Ji, Y.; Lu, B.; Chen, L.; Han, B.; Li, Z.; Bai, Z. Characterization of PM10 source profiles for fugitive dust in Fushun-a city famous for coal. Atmos. Environ. 2011, 45, 5351–5365.
  3. Zheng, J.; Che, W.; Zheng, Z.; Chen, L.; Zhong, L. Analysis of Spatial and Temporal Variability of PM10 Concentrations Using MODIS Aerosol Optical Thickness in the Pearl River Delta Region, China. Aerosol Air Qual. Res. 2013, 13, 862–876.
  4. Zheng, M.; Salmon, L.G.; Schauer, J.J.; Zeng, L.; Kiang, C.S.; Zhang, Y.; Cass, G.R. Seasonal trends in PM2.5 source contributions in Beijing, China. Atmos. Environ. 2005, 39, 3967–3976.
  5. Friedlander, S.K. Chemical element balances and identification of air pollution sources. Environ. Sci. Technol. 1973, 7, 235–240.
  6. Cooper, J.A.; Watson, J.G., Jr. Receptor oriented methods of air particulate source apportionment. J. Air Pollut. Control Assoc. 1980, 30, 1116–1125.
  7. Gordon, G.E. Receptor models. Environ. Sci. Technol. 1988, 22, 1132–1142.
  8. Watson, J.G. Overview of receptor model principles. J. Air Pollut. Control Assoc. 1984, 34, 619–623.
  9. Hidy, G.M.; Venkataraman, C. The chemical mass balance method for estimating atmospheric particle sources in Southern California. Chem. Eng. Commun. 1996, 151, 187–209.
  10. Miller, M.; Friedlander, S.; Hidy, G. A chemical element balance for the Pasadena aerosol. J. Colloid Interface Sci. 1972, 39, 165–176.
  11. Hougland, E. Chemical element balance by linear programming. In Proceedings of the 73rd Annual Meeting of the Air Pollution Control Association, Atlanta, GA, USA, 19–24 June 1983.
  12. Gartrell, G.; Friedlander, S. Relating particulate pollution to sources: The 1972 California aerosol characterization study. Atmos. Environ. 1975, 9, 279–299.
  13. Watson, J.G.; Robinson, N.F.; Chow, J.C.; Henry, R.C.; Kim, B.; Nguyen, Q.T.; Meyer, E.L.; Pace, T.G. Receptor Model Technical Series, Vol. III (1989 Revision) CMB7 User’s Manual; US Environmental Protection Agency: Washington, DC, USA, 1990.
  14. Geladi, P.; Kowalski, B.R. Partial least-squares regression: A tutorial. Anal. Chim. Acta 1986, 185, 1–17.
  15. Song, X.-H.; Hopke, P.K. Solving the chemical mass balance problem using an artificial neural network. Environ. Sci. Technol. 1996, 30, 531–535.
  16. Watson, J.G.; Cooper, J.A.; Huntzicker, J.J. The effective variance weighting for least squares calculations applied to the mass balance receptor model. Atmos. Environ. 1984, 18, 1347–1355.
  17. Shi, G.L.; Zeng, F.; Li, X.; Feng, Y.C.; Wang, Y.Q.; Liu, G.X.; Zhu, T. Estimated contributions and uncertainties of PCA/MLR–CMB results: Source apportionment for synthetic and ambient datasets. Atmos. Environ. 2011, 45, 2811–2819.
  18. Mahadevan, S. Monte carlo simulation. In Mechanical Engineering-New York And Basel-Marcel Dekker; Marcel Dekker Inc.: New York, NY, USA, 1997; pp. 123–146.
  19. Brémaud, P. Markov Chains: Gibbs Fields, Monte Carlo Simulation, and Queues; Springer Science & Business Media: Berlin, Germany, 2013; Volume 31.
  20. Sobol, I.M. Global sensitivity indices for nonlinear mathematical models and their Monte Carlo estimates. Math. Comput. Simul. 2001, 55, 271–280.
  21. Smith, A. Sequential Monte Carlo Methods in Practice; Springer Science & Business Media: Berlin, Germany, 2013.
  22. Hanna, S.R.; Chang, J.C.; Fernau, M.E. Monte Carlo estimates of uncertainties in predictions by a photochemical grid model (UAM-IV) due to uncertainties in input variables. Atmos. Environ. 1998, 32, 3619–3628.
  23. Landau, D.P.; Binder, K. A Guide to Monte Carlo Simulations in Statistical Physics; Cambridge University Press: Cambridge, UK, 2014.
  24. Friedland, W.; Dingfelder, M.; Kundrát, P.; Jacob, P. Track structures, DNA targets and radiation effects in the biophysical Monte Carlo simulation code PARTRAC. Mutat. Res./Fund. Mol. Mech. Mutagen. 2011, 711, 28–40.
  25. Ohno, K.; Esfarjani, K.; Kawazoe, Y. Computational Materials Science: From AB Initio to Monte Carlo Methods; Springer Science & Business Media: Berlin, Germany, 2012; Volume 129.
  26. Glasserman, P. Monte Carlo Methods in Financial Engineering; Springer Science & Business Media: Berlin, Germany, 2003; Volume 53.
  27. Watson, J.G. Chemical Element Balance Receptor Model Methodology for Assessing the Sources of Fine and Total Suspended Particulate Matter in Portland, Oregon. Ph.D. Thesis, Department of Environmental Science, Oregon Graduate Center, Oregon City, OR, USA, 1979.
  28. Christensen, W.F.; Gunst, R.F. Measurement error models in chemical mass balance analysis of air quality data. Atmos. Environ. 2004, 38, 733–744.
  29. Cheng, M.; Hopke, P.K. Linear Programming Procedure and Regression Diagnostics for least-Squares Solution Using CMB Receptor Model, in Receptor Methods for Source Apportionment—Real World Issues and Applications; Air Pollution Control Association: Pittsburgh, PA, USA, 1986. [Google Scholar]
  30. Gleser, L.J. Some thoughts on chemical mass balance models. Chemom. Intell. Lab. Syst. 1997, 37, 15–22. [Google Scholar] [CrossRef]
  31. Yue, K.; Wu, H.; Liu, W.; Zhu, Y. Representing and processing lineages over uncertain data based on the Bayesian network. Appl. Soft Comput. 2015, 37, 345–362. [Google Scholar] [CrossRef]
  32. Kozumi, H.; Kobayashi, G. Gibbs sampling methods for Bayesian quantile regression. J. Stat. Comput. Simul. 2011, 81, 1565–1578. [Google Scholar] [CrossRef] [Green Version]
  33. Gilks, W.R.; Wild, P. Adaptive rejection sampling for Gibbs sampling. Appl. Stat. 1992, 41, 337–348. [Google Scholar] [CrossRef]
  34. Arroyo, D.; Emery, X.; Peláez, M. An enhanced Gibbs sampler algorithm for non-conditional simulation of Gaussian random vectors. Comput. Geosci. 2012, 46, 138–148. [Google Scholar] [CrossRef]
Figure 1. The principle of the Chemical Mass Balance (CMB) receptor model.
Figure 2. The result using EPA-CMB8.2 with nine sources and nine species.
Figure 3. The result using NKCMB1.0 with nine sources and nine species.
Figure 4. The result using the proposed ESMC-CMB algorithm with nine sources and nine species.
Figure 5. The result using ESMC-CMB with nine sources and eight species.
Figure 6. The result using ESMC-CMB with 10 sources and nine species, including RD (Raise Dust), which is collinear with Soil Dust.
Table 1. Receptor component profiles.

| Ele. | Conc. | STDE | Ele. | Conc. | STDE |
|---|---|---|---|---|---|
| TOT | 111.8677 | 54.19443 | Co | 0.000505 | 0.000458 |
| Na | 0.381248 | 0.149582 | Ni | 0.006908 | 0.00752 |
| Mg | 0.201556 | 0.094942 | Cu | 0.055663 | 0.076044 |
| Al | 2.647172 | 2.03143 | Zn | 0.237994 | 0.184731 |
| Si | 2.435858 | 1.56244 | Pb | 0.111147 | 0.091934 |
| P | 0.061124 | 0.039434 | OC | 20.2725 | 12.6826 |
| K | 1.372987 | 0.862706 | EC | 3.855547 | 2.132063 |
| Ca | 2.912185 | 1.292981 | Cl | 0.26934 | 0.560002 |
| Ti | 0.014792 | 0.00704 | NO3 | 4.703921 | 5.350789 |
| Cr | 0.018382 | 0.012077 | SO4 | 17.27229 | 7.314421 |
| Mn | 0.041736 | 0.035401 | NH4 | 9.960722 | 5.706486 |
| Fe | 4.122549 | 6.704566 | | | |

Note: Ele. = Elements, Conc. = Concentration (μg/m³), STDE = Standard Deviation.
Table 2. Source component profiles.

| Ele. | Raise Dust Conc. | Raise Dust STDE | Soil Dust Conc. | Soil Dust STDE | Construction Dust Conc. | Construction Dust STDE | Coal Combustion Conc. | Coal Combustion STDE | Cooking Smoke Conc. | Cooking Smoke STDE | Biomass Burning Conc. | Biomass Burning STDE | Industrial Processes Conc. | Industrial Processes STDE | NO₃⁻ Conc. | NO₃⁻ STDE | SO₄²⁻ Conc. | SO₄²⁻ STDE | Vehicular Emission Conc. | Vehicular Emission STDE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Na | 0.004722 | 0.001744 | 0.007309 | 0.004183 | 0.002478 | 0.000735 | 0.006365 | 0.004774 | 0.008617 | 0.005742 | 0.002959 | 0.002666 | 0.008600 | 0.000100 | 0 | 0.000001 | 0 | 0.000001 | 0.009363 | 0.005199 |
| Mg | 0.007276 | 0.001945 | 0.014675 | 0.006636 | 0.008546 | 0.002448 | 0.011922 | 0.008209 | 0.017405 | 0.012709 | 0.004460 | 0.003831 | 0.015600 | 0.000400 | 0 | 0.000001 | 0 | 0.000001 | 0.010941 | 0.006701 |
| Al | 0.088236 | 0.011353 | 0.118910 | 0.038482 | 0.069371 | 0.031392 | 0.239006 | 0.182281 | 0.012367 | 0.007308 | 0.031969 | 0.026745 | 0.005300 | 0.000100 | 0 | 0.000001 | 0 | 0.000001 | 0.010639 | 0.007285 |
| Si | 0.137211 | 0.033887 | 0.232882 | 0.076667 | 0.098363 | 0.022080 | 0.081033 | 0.081762 | 0.013954 | 0.014697 | 0.051409 | 0.047943 | 0.013100 | 0.001300 | 0 | 0.000001 | 0 | 0.000001 | 0.012261 | 0.006434 |
| P | 0.00081 | 0.000228 | 0.000874 | 0.000383 | 0.000264 | 0.000115 | 0.000311 | 0.000262 | 0.000321 | 0.000191 | 0.000126 | 0.000092 | 0.000000 | 0.000001 | 0 | 0.000001 | 0 | 0.000001 | 0.002077 | 0.000802 |
| K | 0.013932 | 0.003267 | 0.018596 | 0.006236 | 0.017332 | 0.002317 | 0.008941 | 0.007336 | 0.013851 | 0.015402 | 0.104925 | 0.065980 | 0.033000 | 0.000700 | 0 | 0.000001 | 0 | 0.000001 | 0.012172 | 0.005053 |
| Ca | 0.108035 | 0.028816 | 0.125479 | 0.085679 | 0.274893 | 0.043775 | 0.056683 | 0.086205 | 0.012212 | 0.007046 | 0.008350 | 0.007519 | 0.092000 | 0.001900 | 0 | 0.000001 | 0 | 0.000001 | 0.013024 | 0.006604 |
| Ti | 0.002224 | 0.000548 | 0.003509 | 0.001219 | 0.002643 | 0.001152 | 0.045129 | 0.030700 | 0.007512 | 0.006108 | 0.001460 | 0.001747 | 0.000400 | 0.000040 | 0 | 0.000001 | 0 | 0.000001 | 0.006982 | 0.003594 |
| Cr | 0.000138 | 0.000050 | 0.000322 | 0.000198 | 0.000084 | 0.000024 | 0.000795 | 0.000887 | 0.000449 | 0.000214 | 0.000872 | 0.001734 | 0.000300 | 0.000030 | 0 | 0.000001 | 0 | 0.000001 | 0.001887 | 0.002440 |
| Mn | 0.000501 | 0.000169 | 0.000722 | 0.000303 | 0.000322 | 0.000132 | 0.000193 | 0.000169 | 0.000187 | 0.000162 | 0.000079 | 0.000116 | 0.009800 | 0.000100 | 0 | 0.000001 | 0 | 0.000001 | 0.000901 | 0.001107 |
| Fe | 0.030867 | 0.009165 | 0.038558 | 0.016161 | 0.013179 | 0.007315 | 0.053666 | 0.032867 | 0.019319 | 0.014766 | 0.010202 | 0.008648 | 0.367000 | 0.000200 | 0 | 0.000001 | 0 | 0.000001 | 0.034335 | 0.022235 |
| Co | 0.000011 | 0.000003 | 0.000026 | 0.000012 | 0.000006 | 0.000004 | 0.000013 | 0.000017 | 0.000003 | 0.000003 | 0.000006 | 0.000013 | 0.000300 | 0.000030 | 0 | 0.000001 | 0 | 0.000001 | 0.000028 | 0.000033 |
| Ni | 0.000046 | 0.000019 | 0.000135 | 0.000082 | 0.000032 | 0.000004 | 0.000568 | 0.000988 | 0.000211 | 0.000131 | 0.000278 | 0.000550 | 0.000100 | 0.000100 | 0 | 0.000001 | 0 | 0.000001 | 0.001459 | 0.001336 |
| Cu | 0.000123 | 0.000048 | 0.000249 | 0.000149 | 0.000070 | 0.000019 | 0.000294 | 0.000233 | 0.000407 | 0.000332 | 0.000178 | 0.000147 | 0.000400 | 0.000100 | 0 | 0.000001 | 0 | 0.000001 | 0.001014 | 0.001473 |
| Zn | 0.000579 | 0.000181 | 0.000838 | 0.000476 | 0.000155 | 0.000050 | 0.000649 | 0.000530 | 0.001357 | 0.000918 | 0.000543 | 0.000496 | 0.009800 | 0.000900 | 0 | 0.000001 | 0 | 0.000001 | 0.000952 | 0.000729 |
| Pb | 0.000225 | 0.000127 | 0.000121 | 0.000065 | 0.000035 | 0.000006 | 0.000117 | 0.000088 | 0.000115 | 0.000085 | 0.000051 | 0.000036 | 0.003200 | 0.000300 | 0 | 0.000001 | 0 | 0.000001 | 0.000207 | 0.000262 |
| OC | 0.040941 | 0.007951 | 0.024068 | 0.005945 | 0.040341 | 0.007325 | 0.121996 | 0.115939 | 0.642280 | 0.409048 | 0.397684 | 0.092230 | 0.008200 | 0.000800 | 0 | 0.000001 | 0 | 0.000001 | 0.345982 | 0.172765 |
| EC | 0.006195 | 0.000620 | 0.000056 | 0.000006 | 0.001326 | 0.000133 | 0.013801 | 0.001380 | 0.018633 | 0.001863 | 0.042355 | 0.004236 | 0.004800 | 0.000480 | 0 | 0.000001 | 0 | 0.000001 | 0.147948 | 0.079996 |
| Cl | 0.002261 | 0.001599 | 0.004345 | 0.007539 | 0.001112 | 0.001547 | 0.005714 | 0.004869 | 0.008058 | 0.004448 | 0.169234 | 0.084954 | 0.007200 | 0.002300 | 0 | 0.000001 | 0 | 0.000001 | 0.004652 | 0.003824 |
| NO3 | 0.006385 | 0.001638 | 0.006381 | 0.004647 | 0.001362 | 0.000347 | 0.006629 | 0.006358 | 0.010071 | 0.008292 | 0.004203 | 0.004376 | 0.000000 | 0.000001 | 0.794872 | 0.079487 | 0 | 0.000001 | 0.009028 | 0.004308 |
| SO4 | 0.045996 | 0.014860 | 0.015446 | 0.006891 | 0.024699 | 0.004715 | 0.052591 | 0.047759 | 0.026123 | 0.020926 | 0.021658 | 0.015497 | 0.016600 | 0.002300 | 0 | 0.000001 | 0.727273 | 0.072727 | 0.013842 | 0.010331 |
| NH4 | 0.001191 | 0.000957 | 0.001302 | 0.000726 | 0.000190 | 0.000137 | 0.013185 | 0.014081 | 0.010247 | 0.014778 | 0.084053 | 0.051916 | 0.000000 | 0.000001 | 0.205128 | 0.020513 | 0.272727 | 0.027273 | 0.009194 | 0.007708 |

Note: Ele. = Elements, Conc. = Concentration (%), STDE = Standard Deviation.
Table 3. A numerical comparison of EPA-CMB8.2, NKCMB1.0, and ESMC-CMB.

| Source Contribution | EPA-CMB8.2 | NKCMB1.0 | ESMC-CMB |
|---|---|---|---|
| Soil Dust | 0.026698964 | 0.018464739 | 0.030721789 |
| Construction Dust | 0.030752527 | 0.027916899 | 0.038269989 |
| Coal Combustion | 0.051126227 | 0.050733701 | 0.04052869 |
| Cooking Smoke | 0.163715722 | 0.164181048 | 0.13384989 |
| Biomass Burning | 0.039397644 | 0.038893606 | 0.048647818 |
| Industrial Processes | 0.069811171 | 0.094471832 | 0.058153522 |
| SO₄²⁻ | 0.263039684 | 0.249173003 | 0.294499399 |
| NO₃⁻ | 0.054531719 | 0.051178676 | 0.04448467 |
| Vehicular Emissions | 0.202244424 | 0.202902245 | 0.214208926 |
| Other | 0.098681918 | 0.102084251 | 0.096635306 |
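
The values in Table 3 can be checked for internal consistency with a few lines of Python. The sketch below is illustrative only and is not part of the authors' software: it assumes the entries are fractional contributions (each column should then sum to about 1), and the names `TABLE3`, `column_sums`, and `largest_gap` are hypothetical helpers introduced here.

```python
# Source contributions transcribed from Table 3.
# Column order: EPA-CMB8.2, NKCMB1.0, ESMC-CMB.
# Assumption: values are fractions of total mass, not percentages.
ALGORITHMS = ("EPA-CMB8.2", "NKCMB1.0", "ESMC-CMB")

TABLE3 = {
    "Soil Dust": (0.026698964, 0.018464739, 0.030721789),
    "Construction Dust": (0.030752527, 0.027916899, 0.038269989),
    "Coal Combustion": (0.051126227, 0.050733701, 0.04052869),
    "Cooking Smoke": (0.163715722, 0.164181048, 0.13384989),
    "Biomass Burning": (0.039397644, 0.038893606, 0.048647818),
    "Industrial Processes": (0.069811171, 0.094471832, 0.058153522),
    "SO4": (0.263039684, 0.249173003, 0.294499399),  # secondary sulfate source
    "NO3": (0.054531719, 0.051178676, 0.04448467),   # secondary nitrate source
    "Vehicular Emissions": (0.202244424, 0.202902245, 0.214208926),
    "Other": (0.098681918, 0.102084251, 0.096635306),
}


def column_sums(table):
    """Return the total contribution reported by each algorithm."""
    return {
        name: sum(row[i] for row in table.values())
        for i, name in enumerate(ALGORITHMS)
    }


def largest_gap(table, a, b):
    """Return the source where algorithms a and b differ most, and by how much."""
    i, j = ALGORITHMS.index(a), ALGORITHMS.index(b)
    source = max(table, key=lambda s: abs(table[s][i] - table[s][j]))
    return source, abs(table[source][i] - table[source][j])


if __name__ == "__main__":
    for name, total in column_sums(TABLE3).items():
        print(f"{name}: total contribution = {total:.9f}")
    src, gap = largest_gap(TABLE3, "ESMC-CMB", "EPA-CMB8.2")
    print(f"Largest ESMC-CMB vs EPA-CMB8.2 difference: {src} ({gap:.9f})")
```

A check of this kind only confirms that the three columns are on the same scale and shows where the algorithms disagree most; it says nothing about which estimate is closer to the true apportionment.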
Table 4. A comparison of EPA-CMB8.2, NKCMB1.0, and ESMC-CMB under different conditions.

| Condition | EPA-CMB8.2 | NKCMB1.0 | ESMC-CMB |
|---|---|---|---|
| Number of sources ≤ number of species, with no collinearity | Results obtained | Results obtained | Results obtained |
| Number of sources > number of species | No results | No results | Results obtained |
| Collinearity exists among the sources | No results | No results | Results obtained |

Share and Cite

MDPI and ACS Style

Hou, W.; Yang, Y.; Wang, Z.; Hou, M.; Wu, Q.; Xie, X. A Novel Robust Method for Solving CMB Receptor Model Based on Enhanced Sampling Monte Carlo Simulation. Processes 2019, 7, 169. https://doi.org/10.3390/pr7030169
