Application of Neural Networks on Carbon Emission Prediction: A Systematic Review and Comparison

Feng, Wentao; Chen, Tailong; Li, Longsheng; Zhang, Le; Deng, Bingyan; Liu, Wei; Li, Jian; Cai, Dongsheng

doi:10.3390/en17071628


by Wentao Feng ¹, Tailong Chen ¹, Longsheng Li ¹, Le Zhang ¹, Bingyan Deng ¹, Wei Liu ², Jian Li ² and Dongsheng Cai ²,*

¹ State Grid Sichuan Information & Telecommunication Company, Chengdu 610095, China

² Sichuan Provincial Key Laboratory of Power System Wide-Area Measurement and Control, University of Electronic Science and Technology of China, Chengdu 611731, China

* Author to whom correspondence should be addressed.

Energies 2024, 17(7), 1628; https://doi.org/10.3390/en17071628

Submission received: 21 February 2024 / Revised: 21 March 2024 / Accepted: 26 March 2024 / Published: 28 March 2024

(This article belongs to the Special Issue Role of Hydrogen Energy in Renewable Energy Development/Integration and Global Decarbonization)


Abstract:

The greenhouse effect caused by massive carbon dioxide emissions has seriously harmed the Earth’s environment, and the power sector is one of the primary contributors to global greenhouse gas emissions. Reducing carbon emissions from electricity plays a pivotal role in minimizing greenhouse gas emissions and mitigating the ecological, economic, and social impacts of climate change, while carbon emission prediction provides a valuable point of reference for formulating policies to reduce carbon emissions from electricity. This article provides a detailed review of research results on deep learning-based carbon emission prediction. Firstly, the main neural networks applied to carbon emission forecasting at home and abroad, as well as models that combine neural networks with other methods, are introduced, and the main roles of the different methods when combined with neural networks are discussed. Secondly, neural networks are used to predict electricity carbon emissions, and the performance of different models is compared. Finally, the application of neural networks to carbon emission prediction is summarized, and future research directions are discussed. The article provides a reference for researchers to understand the research dynamics and development trends of deep learning in electricity carbon emission forecasting.

Keywords: carbon emission prediction; BP neural network; recurrent neural network; deep learning; hybrid models

1. Introduction

The release of large amounts of carbon dioxide into the atmosphere from industrial activities exacerbates the Earth’s greenhouse effect [1,2], raising greenhouse gas concentrations in the atmosphere and, in turn, the Earth’s surface temperature. Such changes have triggered several problems [3], such as extreme weather, reduced food production, and rising sea levels. In addition, the greenhouse effect has seriously damaged the global ecological balance, with many species facing threats to their survival and ecological diversity being seriously weakened. To deal with this issue, the international community urgently needs to adopt a sustainable model of industrial development that minimizes greenhouse gas emissions.

To protect the Earth, it is the responsibility of all countries to act proactively to reduce carbon dioxide emissions, an urgent task that requires global cooperation to meet the challenge of climate change. China’s rapid economic growth and industrialization have created a substantial reliance on coal for electricity generation, which has resulted in substantial carbon emissions. Coal has been a primary source of energy in China due to its abundance and affordability. However, coal combustion releases vast quantities of carbon dioxide, contributing to climate change and air pollution. China’s electricity carbon emissions make up over 33% of the total global carbon emissions, solidifying its position as the largest emitter of carbon dioxide worldwide, so reducing carbon emissions from the electric power system is of great significance to reaching carbon neutrality. Comprehensive and accurate carbon emission prediction for electricity plays a crucial guiding role in policy formulation. In this context, enterprise production and development also need to be structurally adjusted [4,5,6] and shifted toward green and low-carbon operation [7,8,9] to adapt to the new development trend. From a long-term perspective, a green and low-carbon strategic positioning not only adapts to future environmental trends but also has the potential to meet consumer and investor expectations of social responsibility.

Carbon emissions are a dynamic and time-dependent process, which makes their prediction a challenging task that involves forecasting the relationship between emissions and time. With the progress of science and technology, information acquisition hardware such as sensors is increasingly advanced, and the information collected is becoming richer. Carbon emission data are complex: they have many features, large volumes, and nonlinear behavior, so neural networks, which are suited to this kind of data, become a necessary choice for carbon emission prediction. However, comprehensive reviews of the use of neural networks for forecasting electricity carbon emissions are scarce. Given the limited direct application of neural network models to predicting electricity carbon emissions, this paper categorizes and summarizes the literature on neural network applications in carbon emissions more broadly. It aims to provide references for applying neural networks to electricity carbon emission prediction, to clarify the connections and differences among the research methods in this field, and to offer directions and insights for future research.

This article first analyzes and summarizes the latest applications of neural networks in the field of carbon emission prediction, including single neural networks, hybrid neural networks, and other methods. It discusses the strengths and weaknesses of different neural network methods and cites case studies from practical applications to illustrate the validity of the prediction methods in real-world environments. Secondly, a comparative analysis of carbon emission prediction experiments using neural networks is conducted. Finally, future trends and development directions in this research field are outlined, providing guidance for future studies and helping readers understand the latest developments in this area.

2. Review of Carbon Emission Prediction Methods

In the past few years, forecasting techniques for carbon emissions have been mainly based on statistical methods and machine learning methods. Statistical methods mainly include the autoregressive integrated moving average (ARIMA) model [10] and the exponential smoothing method [11]. These techniques build prediction models from historical data and trends by analyzing past carbon emission data and related factors. Statistical methods usually assume linear relationships and ignore possible nonlinear relationships, so they often do not meet the needs of the task. To reduce the margin of error in carbon emission forecasting, artificial intelligence-based methods have gradually become the mainstream approach. An analysis of the carbon emission literature over the past 5 years, shown in Figure 1, makes the increasing prevalence of artificial intelligence-based methods evident. Artificial intelligence technology continues to advance in the domain of carbon emission forecasting, including the application of techniques such as deep learning and reinforcement learning. This trend indicates that artificial intelligence has become the predominant technology in this field.

As research has deepened, prediction methods based on machine learning, such as support vector machines [12,13] and random forest [14], have been widely used in carbon emission prediction. Compared with statistical methods, machine learning approaches are more data-driven and can learn complex nonlinear relationships from large amounts of data. However, traditional machine learning models adapt relatively poorly to changes in data distribution and usually require nonlinear features to be introduced manually when dealing with nonlinear problems. Therefore, deep learning techniques that learn end-to-end and extract features automatically have gradually become a popular research direction, and neural networks overcome the difficulty that linear models have in extracting nonlinear relationships. Numerous researchers have employed them in the domain of carbon emission forecasting, yielding a wealth of valuable studies.

2.1. Prediction Method Based on BP Neural Network

The backpropagation (BP) neural network is a commonly used artificial neural network architecture for classification, regression, and other machine learning tasks, which performs error correction through the error backpropagation algorithm [15]. It is a supervised learning technique that adjusts the network’s weights using training data so that the network can predict new data accurately. The structure consists of multiple layers, usually an input layer, one or more hidden layers, and an output layer, as shown in Figure 2. With nonlinear transfer functions, BP neural networks can approximate nonlinear functions to arbitrary accuracy and are, therefore, also used in carbon emission prediction studies.

The architecture of a BP neural network is simple and intuitive, as seen in Figure 2. It usually contains one or more hidden layers, where each neuron in a hidden layer is connected to all the neurons in the previous layer. When using neural networks for carbon emission forecasting, the data are first preprocessed, typically by filling missing values and normalizing; an appropriate neural network model is then established, and the weights and biases within the network are initialized before training begins. During training, the predicted value is calculated by forward propagation, the predicted value is compared with the actual value to calculate the loss, and the weights and biases are adjusted by backpropagating the loss so as to reduce it. These steps are repeated until the loss no longer improves, while the hyperparameters are adjusted according to the actual situation to attain better generalization ability. In short, training a neural network for carbon emission prediction is an iterative optimization process in which careful parameter tuning and data processing gradually improve the precision and efficacy of the model.
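The training procedure described above (initialization, forward propagation, loss calculation, backpropagation, and repetition) can be illustrated with a minimal NumPy sketch. It is not taken from any of the reviewed studies: the single hidden layer of 8 tanh units, the learning rate, the number of epochs, and the synthetic data are all assumptions chosen only to keep the example small.

```python
import numpy as np

def train_bp(X, y, hidden=8, lr=0.1, epochs=2000, seed=0):
    """Train a one-hidden-layer network with full-batch gradient descent on MSE."""
    rng = np.random.default_rng(seed)
    # Initialise weights and biases before training begins.
    W1 = rng.normal(0.0, 0.5, size=(X.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.5, size=(hidden, 1))
    b2 = np.zeros(1)
    n = X.shape[0]
    losses = []
    for _ in range(epochs):
        # Forward propagation: hidden activations, then the prediction.
        h = np.tanh(X @ W1 + b1)
        y_hat = h @ W2 + b2
        err = y_hat - y
        losses.append(float(np.mean(err ** 2)))
        # Backpropagation: gradients of the MSE loss for each parameter.
        d_out = 2.0 * err / n
        grad_W2 = h.T @ d_out
        grad_b2 = d_out.sum(axis=0)
        d_hidden = (d_out @ W2.T) * (1.0 - h ** 2)  # derivative of tanh
        grad_W1 = X.T @ d_hidden
        grad_b1 = d_hidden.sum(axis=0)
        # Gradient-descent update of weights and biases.
        W1 -= lr * grad_W1
        b1 -= lr * grad_b1
        W2 -= lr * grad_W2
        b2 -= lr * grad_b2
    return (W1, b1, W2, b2), losses

def predict_bp(params, X):
    W1, b1, W2, b2 = params
    return np.tanh(X @ W1 + b1) @ W2 + b2

# Synthetic stand-in data (hypothetical): 3 inputs already scaled to [0, 1]
# and a nonlinear target; real studies would use measured emission data.
rng = np.random.default_rng(1)
X = rng.random((200, 3))
y = np.sin(3.0 * X[:, :1]) + X[:, 1:2] * X[:, 2:3]
params, losses = train_bp(X, y)
```

In practice, the loss on a held-out validation set would also be tracked to decide when to stop training and which hyperparameters to keep.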

Traditional carbon emission prediction at home and abroad is based on statistical methods such as the environmental Kuznets curve [16,17], the IPAT equation [18,19], and the STIRPAT model [20,21], which are mainly used for macro-field carbon emission forecasting. The volume of data for macro-field forecasting is relatively small, and most predictions are made on annual, quarterly, or monthly time scales. Although neural networks are mainly applied to big data, they also perform well when the amount of data is relatively small. Ref. [22] used BP neural networks for macro-field carbon emission prediction with significant results, which indicates that neural networks also perform well in macro-field prediction and can replace statistical methods. As the demand for prediction accuracy increases, a single model can no longer meet it, so more and more scholars study combinations of multiple models to improve accuracy. Ji Guangyue [23] combined gray correlation analysis [24] with a BP neural network, first using gray correlation analysis to screen the more influential variables in the data and then training the BP neural network on them; this hybrid approach improves the training speed and precision of the BP neural network and strengthens model robustness. Other models used to process the data include the Lasso regression model [25] and principal component analysis (PCA) [26]. In this kind of combination, the data are processed first, mainly by screening, compression, or dimensionality reduction, which improves the training speed and prediction precision of the model. There are also combinations of optimization algorithms, such as nonlinear PSO [27], IPSO [28], and the genetic algorithm [29], with the BP neural network. The principal objective of these algorithms is to optimize the parameters of the neural network model to improve its performance. The best parameters for the neural network are first calculated using the optimization algorithm, and the network is then configured with these parameters and trained. The process is shown in Figure 3. All these approaches significantly enhance the precision and speed of prediction.
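The variable-screening step mentioned above can be sketched with a plain-Python version of gray (grey) relational analysis. This is an illustrative sketch rather than the procedure of Ref. [23]: it assumes Deng's grey relational grade with the customary distinguishing coefficient of 0.5, min-max scaling of each series, and a hypothetical selection threshold of 0.7; the factor names and values are invented.

```python
def grey_relational_grades(reference, factors, rho=0.5):
    """Return the grey relational grade of each factor series w.r.t. the reference."""
    def scale(series):
        lo, hi = min(series), max(series)
        if hi == lo:
            raise ValueError("constant series cannot be min-max scaled")
        return [(v - lo) / (hi - lo) for v in series]

    ref = scale(reference)
    # Absolute differences between each scaled factor and the scaled reference.
    diffs = {
        name: [abs(r - v) for r, v in zip(ref, scale(series))]
        for name, series in factors.items()
    }
    d_min = min(min(d) for d in diffs.values())
    d_max = max(max(d) for d in diffs.values())
    if d_max == 0.0:
        # Every factor coincides with the reference after scaling.
        return {name: 1.0 for name in diffs}
    grades = {}
    for name, d in diffs.items():
        # Relational coefficient at each time point, averaged into a grade.
        coeffs = [(d_min + rho * d_max) / (dk + rho * d_max) for dk in d]
        grades[name] = sum(coeffs) / len(coeffs)
    return grades

# Hypothetical toy data: annual emissions and two candidate factors.
emissions = [10.0, 12.0, 15.0, 19.0, 24.0]
candidates = {
    "coal_use": [5.0, 6.0, 7.5, 9.5, 12.0],   # moves with emissions
    "unrelated": [3.0, 1.0, 4.0, 1.0, 5.0],   # no clear relation
}
grades = grey_relational_grades(emissions, candidates)
threshold = 0.7  # assumed cut-off for keeping a variable
selected = [name for name, g in grades.items() if g >= threshold]
```

Only the variables in `selected` would then be passed to the BP neural network as inputs.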

Carbon emission data differ between the micro- and macro-scales; micro-field data are characterized by many variables and a large amount of data, which neural networks are well suited to handle, so they tend to give good prediction results. Wang Heng, Wei Zijie, and others [30] used a BP neural network for carbon emission prediction of coal-fired plants, and the model achieved R² = 0.987 and MAPE = 0.0975, which shows that the BP neural network is highly precise when predicting in the micro-field. Zhao Jinyuan and Ma Zhen [31] applied this approach to iron and steel enterprises and compared the predictive performance of the BP neural network and a multivariate linear regression model on carbon emissions; the results showed that the BP neural network forecasts better than the multivariate linear regression model. Hu Zhen [32] and others used BP neural networks to forecast residential carbon emissions, which are strongly associated with the weather, holidays, personal routines, etc., so the data exhibit strong nonlinearity and unpredictability, which makes the prediction more difficult. The experimental findings indicate that the BP neural network is superior to the multiple linear regression model on all evaluation indexes.

The aforementioned studies demonstrate that the BP neural network gives superior results in the field of carbon emission prediction. This superiority stems from the fact that trends in carbon emissions often exhibit nonlinear characteristics, and neural networks show excellent mapping capabilities in handling nonlinear data. In the macro-field, carbon emissions are influenced by several complex factors, for instance, global economic development, energy consumption patterns, and policies and regulations, and there may be nonlinear interactions between these factors. Furthermore, including parameters such as power generation methods, power generation capacity, fuel types, and power generation efficiency has significantly improved the forecasting capability of the models. BP neural networks can learn these sophisticated nonlinear relationships and capture the overall trend of carbon emissions more accurately, thus providing significant guidance for the establishment of environmental policies and planning. In the micro-field, BP neural networks can learn from a substantial volume of carbon emission data, identify and extract nonlinear patterns and trends on different time scales, and achieve accurate predictions. Table 1 compares the advantages and disadvantages of the relevant models based on BP neural networks.

2.2. Prediction Method Based on Recurrent Neural Network

The recurrent neural network (RNN) [33] is a specialized architecture tailored for handling sequential information. Unlike conventional feedforward neural networks, an RNN incorporates feedback connections that allow information to circulate within the network, which enables it to model time series and other sequential data effectively. The main characteristic of RNNs is this recurrent structure, shown in Figure 4, which lets the network remember information from earlier inputs and use it when processing later inputs. This makes RNNs well suited to data with temporal dependencies, such as language, audio, video, stock prices, and other sequential data. However, the traditional RNN suffers from “gradient vanishing” and “gradient explosion” when dealing with long sequences. To address this problem, more advanced RNN variants were proposed [34,35], of which the most widely used is the long short-term memory (LSTM) network [36].

In LSTM [37], the most crucial innovation is the use of gate mechanisms to govern data flow, primarily the forget, input, and output gates. The structure is shown in Figure 5. These gating mechanisms allow the LSTM to dynamically and selectively memorize, forget, and output information. By learning the weights of these gates, LSTM can adapt to patterns and dependencies in different sequences while alleviating the gradient vanishing and gradient explosion problems of the RNN, and it therefore outperforms the RNN in time series prediction. The LSTM neural network was initially applied in the field of speech recognition. With the growing demand for sequence data processing, LSTM has rapidly gained widespread application in areas such as machine translation, time series prediction, and natural language processing and has become a crucial tool for modeling sequence data. Given that carbon emission data are time series, LSTM is naturally suitable for carbon emission prediction. The LSTM cell is computed with the following six formulas.

f_{t} = \sigma(W_{f} \cdot [h_{t-1}, x_{t}] + b_{f})

(1)

i_{t} = \sigma(W_{i} \cdot [h_{t-1}, x_{t}] + b_{i})

(2)

o_{t} = \sigma(W_{o} \cdot [h_{t-1}, x_{t}] + b_{o})

(3)

\tilde{C}_{t} = \tanh(W_{C} \cdot [h_{t-1}, x_{t}] + b_{C})

(4)

C_{t} = f_{t} \odot C_{t-1} + i_{t} \odot \tilde{C}_{t}

(5)

h_{t} = o_{t} \odot \tanh(C_{t})

(6)

where f_{t}, i_{t}, and o_{t} are the activation vectors of the forget, input, and output gates, respectively, \tilde{C}_{t} is the new candidate cell vector, C_{t} is the updated cell state vector, and h_{t} is the hidden state vector of the LSTM cell. \sigma denotes the sigmoid activation function, \tanh represents the hyperbolic tangent function, [h_{t-1}, x_{t}] is the concatenation of the previous hidden state and the current input, W_{f}, W_{i}, W_{o}, and W_{C} denote the weight matrices, and b_{f}, b_{i}, b_{o}, and b_{C} denote the corresponding bias vectors. The symbol \odot denotes element-wise (Hadamard) multiplication, not matrix multiplication.
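As a concrete reading of Equations (1)–(6), the following NumPy sketch computes one LSTM time step and runs it over a short sequence. It is only an illustration: the parameter dictionary, the random initialization, the hidden size of 4, and the random input sequence are assumptions, not settings from the reviewed studies.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step following Equations (1)-(6)."""
    z = np.concatenate([h_prev, x_t])                      # [h_{t-1}, x_t]
    f_t = sigmoid(params["W_f"] @ z + params["b_f"])       # Eq. (1): forget gate
    i_t = sigmoid(params["W_i"] @ z + params["b_i"])       # Eq. (2): input gate
    o_t = sigmoid(params["W_o"] @ z + params["b_o"])       # Eq. (3): output gate
    c_tilde = np.tanh(params["W_C"] @ z + params["b_C"])   # Eq. (4): candidate state
    c_t = f_t * c_prev + i_t * c_tilde                     # Eq. (5): element-wise update
    h_t = o_t * np.tanh(c_t)                               # Eq. (6): hidden state
    return h_t, c_t

def init_params(input_size, hidden_size, seed=0):
    """Small random weights and zero biases (an arbitrary choice for the sketch)."""
    rng = np.random.default_rng(seed)
    params = {}
    for gate in ("f", "i", "o", "C"):
        params[f"W_{gate}"] = rng.normal(
            0.0, 0.1, size=(hidden_size, hidden_size + input_size)
        )
        params[f"b_{gate}"] = np.zeros(hidden_size)
    return params

# Hypothetical sequence: 96 time steps with 9 features per step.
input_size, hidden_size = 9, 4
params = init_params(input_size, hidden_size)
h = np.zeros(hidden_size)
c = np.zeros(hidden_size)
sequence = np.random.default_rng(1).random((96, input_size))
for x_t in sequence:
    h, c = lstm_step(x_t, h, c, params)
```

After the loop, `h` summarizes the whole sequence and would normally be passed to an output layer that produces the forecast.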

The distinguishing feature of recurrent neural networks is that their structure takes into account not only the input at the present moment but also the output of previous moments, thus constructing an internal mechanism with a memory function. This design enables recurrent neural networks to capture the correlation between data points more effectively, which greatly enhances their learning ability and pattern recognition. Through this step-by-step iterative approach, recurrent neural networks can better understand the evolutionary trends and intrinsic structure of sequential data. Liu Chao [38] and others employed the LSTM model to forecast carbon emissions and achieved remarkable results, which provided strong support for research and application in this field. Chun-Sen Liu and Jian-Sheng Qu [39] used LSTM for carbon emission forecasting in the transport industry, and the results showed that LSTM outperforms the BP neural network and support vector regression (SVR), which demonstrates the strong performance of LSTM in carbon emission forecasting.

As with the BP neural network above, a single model cannot always meet the needs of the task, and many scholars have combined other models with LSTM to enhance prediction precision. Zhang [40] proposed a CNN-LSTM model for forecasting carbon emissions by combining a convolutional neural network [41,42] and LSTM; it uses the convolutional layer to extract the features of the carbon emission data and then filters out redundant information through pooled downsampling to optimize prediction performance. Han [43] coupled LSTM and CNN to forecast the carbon emissions of 30 Chinese provinces. The experimental comparison shows that incorporating spatial weights into the model improves prediction performance, making it superior to a single CNN or LSTM model. Duan [44] first decomposed and reconstructed the original wind power crosstalk features using variational modal decomposition and sample entropy and then replaced the mean-square error in the classical LSTM network with the maximum correlation entropy criterion for prediction. This method improved wind energy prediction and provides a useful reference for carbon emission prediction, where a similar approach could be applied to boost the predictive performance of the model. Tang [45] combined the sparrow search algorithm with LSTM to forecast the future carbon emission trend of the transport sector under different scenarios; the sparrow search algorithm was mainly used to optimize the parameters of the LSTM, and prediction performance improved compared with LSTM, the generalized regression neural network, and the BP neural network. Other optimization algorithms, such as the oscillating particle swarm optimization algorithm [46], the improved whale optimization algorithm [47], and the grey wolf optimization algorithm [48], play the same role as in BP neural networks and are predominantly used to optimize the neural network parameters. Ke [49] combined the beetle antennae search (BAS) algorithm [50] with LSTM: the original data are first decomposed using variational modal decomposition (VMD) [51], ensemble empirical modal decomposition (EEMD) [52] is then used to perform a secondary decomposition of the VMD residuals, and finally the beetle antennae search algorithm updates the number of hidden layers and the gradient threshold of the LSTM. This approach not only performs operations such as feature extraction on the raw data but also optimizes the number of hidden layers and the gradient threshold of the LSTM, which significantly enhances the forecasting accuracy and convergence rate of the model.

In summary, recurrent neural networks incorporate past information into current processing by introducing time-step connections into their structure, making them better at capturing long-term dependencies and patterns in sequential data. In hybrid model forecasting, data are typically pre-processed, filtered, or compressed using other models before being input into a neural network for prediction. Alternatively, optimization algorithms may be employed to initialize the model parameters of the neural network. This method of combining multiple models can leverage the benefits of each model to achieve more accurate predictions in practical applications. Table 2 shows a comparison of the relevant models based on LSTM neural networks. Whether in a single recurrent neural network or in a hybrid model, neural networks perform well in time-series data processing tasks such as carbon emission prediction, providing important support and reference for environmental protection and decision-making.

2.3. Other Neural Networks

At present, neural network-based carbon emission forecasting methods at home and abroad rely primarily on the two neural networks described above. Beyond these, the GMDH neural network has been proposed to forecast the carbon emissions of five countries in the Middle East [53]. The GMDH neural network is a data-driven modeling approach that combines automatic model selection with parameter optimization. It can automatically choose the optimal model structure from the given data without the number of layers and nodes in the neural network being defined in advance, thus avoiding the tedious process of manual parameter tuning. Mohammad Ghalandari [54] used the multilayer perceptron (MLP), a category of feed-forward neural network, for prediction. It is one of the most popular and traditional neural network architectures and has three kinds of layers: input, hidden, and output, where the hidden part may consist of one or several layers. The author also compared its predictive performance with that of the GMDH neural network, and the results revealed that the accuracy of the two models in carbon emission forecasting was nearly identical.

Additionally, Zheng [55] used a generalized regression neural network (GRNN) for carbon emission forecasting. The GRNN uses a radial basis function to assess the similarity between the input and each training sample and predicts with a locally weighted average, so that the training data closer to the input data have greater weight. Zhang [56] used a Kernel Extreme Learning Machine (KELM) neural network for prediction. KELM is a feed-forward neural network that uses a kernel function to map the data into a high-dimensional space, which allows the model to deal better with nonlinear relationships. Chi [57] used CNNs for carbon emission prediction. Convolutional neural networks are mainly used for feature extraction from image and video data, but there is also a one-dimensional variant (1D-CNN) designed to handle one-dimensional sequential data, for instance, text, time series, and audio. In a 1D-CNN, the convolutional kernel slides only along the sequence of time steps, which makes it effective in capturing the relationships and information between different features. This property also makes it suitable for carbon emission prediction.
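The locally weighted average used by a generalized regression neural network can be sketched in a few lines of plain Python. The sketch assumes a Gaussian radial basis function with a single smoothing parameter `sigma`; the function name, the toy data, and the value of `sigma` are illustrative and are not taken from Ref. [55].

```python
import math

def grnn_predict(x_query, X_train, y_train, sigma=0.5):
    """Predict with a Gaussian-kernel weighted average of the training targets."""
    weights = []
    for x in X_train:
        sq_dist = sum((a - b) ** 2 for a, b in zip(x_query, x))
        # Training samples closer to the query receive larger weights.
        weights.append(math.exp(-sq_dist / (2.0 * sigma ** 2)))
    total = sum(weights)
    if total == 0.0:
        # All weights underflowed: fall back to the plain mean of the targets.
        return sum(y_train) / len(y_train)
    return sum(w * y for w, y in zip(weights, y_train)) / total

# Hypothetical one-dimensional toy data.
X_train = [[0.0], [1.0], [2.0], [3.0]]
y_train = [0.0, 1.0, 4.0, 9.0]
near_sample = grnn_predict([1.0], X_train, y_train, sigma=0.1)
between = grnn_predict([1.5], X_train, y_train, sigma=0.5)
```

A small `sigma` makes the prediction follow the nearest training sample closely, while a larger `sigma` smooths over more neighbours; it is the main quantity to tune.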

In addition to BP and recurrent neural networks, numerous domestic and foreign studies have examined other neural network structures for carbon emission forecasting. The ongoing examination and comparison of different methodologies leaves wide room for development in this research area, and many additional viable neural network models still need further investigation to enhance the precision and reliability of carbon emission forecasting. Because only a small number of domestic and international studies currently use approaches other than BP and recurrent neural networks, they are not described in detail here.

3. Comparison of Neural Network Carbon Emission Prediction

The data for this comparison were sourced from the official website of the California Independent System Operator (California ISO), the power system operator in California, United States. The data were downloaded in CSV format and merged. They cover the period from 1 February 2023 to 5 April 2023, a total of 64 days, with one record every 5 min. The dataset includes nine variables: renewable energy, natural gas, large-scale hydropower, imports, battery, nuclear power, coal, load, and carbon dioxide emissions, as shown in Table 3. Part of the data is visualized in Figure 6.

A correlation analysis was performed on the California data, and the resulting heatmap is shown in Figure 7. The heatmap shows that natural gas, imports, battery storage, and load are the key factors affecting carbon emission prediction. These factors are of great significance in the prediction process and play a crucial role in improving prediction accuracy.
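A correlation analysis of this kind can be sketched as follows. The example assumes the Pearson correlation coefficient (the text does not state which coefficient underlies Figure 7), and the column names and values are invented for illustration rather than taken from the California ISO data.

```python
import math

def pearson(a, b):
    """Pearson correlation coefficient between two equally long series."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def rank_by_correlation(columns, target):
    """Rank all non-target columns by absolute correlation with the target."""
    scores = {
        name: pearson(values, columns[target])
        for name, values in columns.items()
        if name != target
    }
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)

# Hypothetical toy columns; real inputs would be the 5-min records.
natural_gas = [5.0, 7.0, 9.0, 8.0, 6.0, 4.0]
columns = {
    "natural_gas": natural_gas,
    "nuclear": [2.2, 2.3, 2.2, 2.3, 2.2, 2.3],
    "co2": [0.4 * g + 1.0 for g in natural_gas],
}
ranked = rank_by_correlation(columns, target="co2")
```

Computing `pearson` for every pair of columns gives the full matrix that a heatmap such as Figure 7 displays.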

The data were preprocessed by normalizing the model’s input values to the range [0, 1], where the normalization formula is as follows:

x = \frac{x_{i} - x_{\min}}{x_{\max} - x_{\min}}

(7)

where x_{i} denotes the i-th sample value, x_{\min} denotes the sample’s minimum value, x_{\max} is its maximum value, and x denotes the sample’s value following normalization. Normalizing the samples speeds up and stabilizes the model’s convergence and prevents problems such as a less-than-ideal prediction effect due to outliers in the data. Because the model outputs normalized data, back-normalization is necessary after training to recover the actual forecasted values for model evaluation. Back-normalization is calculated as follows:

x_{i} = x (x_{\max} - x_{\min}) + x_{\min}

(8)
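Equations (7) and (8) translate directly into code. In the sketch below, the function names and sample values are hypothetical, and taking the minimum and maximum from the training portion only is an assumption about good practice rather than something stated in the text.

```python
def fit_min_max(values):
    """Return (x_min, x_max) of a series, e.g. of the training portion only."""
    return min(values), max(values)

def normalize(values, x_min, x_max):
    """Equation (7): scale values to [0, 1] using x_min and x_max."""
    span = x_max - x_min
    if span == 0:
        raise ValueError("x_max equals x_min; the series is constant")
    return [(v - x_min) / span for v in values]

def denormalize(scaled, x_min, x_max):
    """Equation (8): map normalized values back to the original scale."""
    return [s * (x_max - x_min) + x_min for s in scaled]

# Hypothetical emission readings.
train_values = [120.0, 80.0, 200.0, 160.0]
x_min, x_max = fit_min_max(train_values)
scaled = normalize(train_values, x_min, x_max)
restored = denormalize(scaled, x_min, x_max)
```

The same `x_min` and `x_max` must be reused when back-normalizing model outputs, otherwise the recovered forecasts are on the wrong scale.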

The complete dataset is split into three sets: a training set comprising 60% of the data for model training, a validation set comprising 20%, and a test set comprising the remaining 20% for assessing the performance of the final trained model. Mean squared error is used as the loss function, given by the following formula:

\mathrm{loss}(x_{i}, x_{j}) = (x_{j} - x_{i})^{2}

(9)

The objective of the experiment is to reduce the loss value as much as feasible, where loss is the error between the predicted and true values, x_{i} is the predicted value, and x_{j} is the true value. Throughout the training phase, the model’s parameters were repeatedly tested and adjusted to identify the final hyperparameters and learners. After the data were entered into the model for training and testing, the MAE and MSE were chosen as the model assessment indices, where the MAE formula is as follows:

\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} | y_{i} - \hat{y}_{i} |

(10)

where y_{i} denotes the actual value of the i-th data sample, \hat{y}_{i} denotes the corresponding forecasted value, and n denotes the total number of samples. In contrast to MSE, which squares the difference, MAE only considers the absolute difference between the predicted value and the actual value. The closer the prediction is to the actual value, the smaller the MAE and the better the model performance. The formula for MSE is as follows:

\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_{i} - \hat{y}_{i})^{2}

(11)

where y_{i} represents the actual value of the i-th data sample, \hat{y}_{i} represents the corresponding predicted value, and n denotes the sample size. The smaller the MSE, the closer the model’s forecasts are to the actual values and the better the performance of the model. The models were tested with an input length of 96 time steps to predict horizons of {3, 6, 12, 24} steps, and the experimental results are shown in Table 4.
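The evaluation protocol described above can be sketched end to end in plain Python. Several simplifications are assumptions of the sketch, not of the experiments: the series is a univariate sine wave with one cycle per 288 five-minute steps rather than the multivariate California data, the 60/20/20 split is taken in chronological order, and the forecaster is a naive persistence baseline (repeat the last observed value) instead of a neural network.

```python
import math

def make_windows(series, input_len=96, horizon=3):
    """Cut (input window, target window) pairs from a series."""
    samples = []
    for start in range(len(series) - input_len - horizon + 1):
        split = start + input_len
        samples.append((series[start:split], series[split:split + horizon]))
    return samples

def chronological_split(samples, train_frac=0.6, val_frac=0.2):
    """Split samples into train/validation/test without shuffling."""
    n_train = int(len(samples) * train_frac)
    n_val = int(len(samples) * val_frac)
    return (
        samples[:n_train],
        samples[n_train:n_train + n_val],
        samples[n_train + n_val:],
    )

def mae(y_true, y_pred):
    """Equation (10): mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def mse(y_true, y_pred):
    """Equation (11): mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

# Synthetic stand-in series with a daily cycle (288 five-minute steps per day).
series = [math.sin(2.0 * math.pi * t / 288) for t in range(2000)]

results = {}
for horizon in (3, 6, 12, 24):
    samples = make_windows(series, input_len=96, horizon=horizon)
    train, val, test = chronological_split(samples)
    y_true, y_pred = [], []
    for inputs, targets in test:
        y_true.extend(targets)
        # Persistence baseline: repeat the last observed value.
        y_pred.extend([inputs[-1]] * horizon)
    results[horizon] = (mae(y_true, y_pred), mse(y_true, y_pred))
```

Replacing the persistence baseline with a trained model's predictions on `test` yields MAE and MSE values in the same layout as Table 4, one pair per horizon.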

The data in Table 4 show that the relative performance of the models varies with prediction length. Specifically, the BP neural network outperforms the other neural networks at prediction lengths of 3, 6, and 12, whereas both the LSTM and the RNN outperform the BP neural network when the prediction length increases to 24. This result may be due to the different network structures of the neural networks. For shorter prediction lengths, the BP neural network may be more effective because it trains and predicts better on shorter sequences and captures short-term patterns and trends more reliably.
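The comparison can be checked directly against the reported numbers. The sketch below copies the MSE and MAE values from Table 4 and selects the model with the lowest error at each prediction length; the dictionary layout and function names are illustrative assumptions.

```python
from typing import Dict

# Error values copied from Table 4, keyed by prediction length.
TABLE4_MSE: Dict[int, Dict[str, float]] = {
    3: {"BP": 0.0459, "LSTM": 0.2278, "RNN": 0.0564},
    6: {"BP": 0.0702, "LSTM": 0.2559, "RNN": 0.0877},
    12: {"BP": 0.1113, "LSTM": 0.2599, "RNN": 0.4014},
    24: {"BP": 0.6392, "LSTM": 0.3770, "RNN": 0.5008},
}
TABLE4_MAE: Dict[int, Dict[str, float]] = {
    3: {"BP": 0.1530, "LSTM": 0.3337, "RNN": 0.1560},
    6: {"BP": 0.1884, "LSTM": 0.3743, "RNN": 0.2060},
    12: {"BP": 0.2410, "LSTM": 0.3863, "RNN": 0.5037},
    24: {"BP": 0.6375, "LSTM": 0.4735, "RNN": 0.5661},
}


def best_model(scores: Dict[str, float]) -> str:
    """Return the model name with the smallest error."""
    return min(scores, key=scores.get)


def best_per_horizon(table: Dict[int, Dict[str, float]]) -> Dict[int, str]:
    """Map each prediction length to its lowest-error model."""
    return {horizon: best_model(scores) for horizon, scores in table.items()}


if __name__ == "__main__":
    print("MSE:", best_per_horizon(TABLE4_MSE))
    print("MAE:", best_per_horizon(TABLE4_MAE))
```

Both indices give the same ranking: BP has the lowest error at lengths 3, 6, and 12, and LSTM has the lowest error at length 24.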

As the prediction horizon increases, sequences become longer and involve longer-term dependencies. At this point, LSTM neural networks, with their memory capabilities, perform better: because they effectively capture and retain long-term correlations within sequences, they are better suited to longer-term time series prediction. However, at prediction horizons of 12 and 24, both neural networks provided low-quality predictions. This may be due to the presence of multiple variables in the data and the use of a multivariate input and output prediction approach; with a generic neural network model, it is challenging to capture the trend changes in multivariate predictions. Further improvements should involve designing more complex models to enhance prediction performance. Optimizing the data preprocessing method to reduce noise is also necessary.

To visualize the results, the predictions for a prediction window of 12 were chosen, as presented in Figure 8. The prediction results show that the BP neural network performs best at a prediction length of 12 and fits the real data well. In contrast, the other two neural networks gave poorer predictions, which is consistent with the results in Table 4. This observation provides further evidence that BP neural networks may be more capable of predicting sequences of length 12 because they better capture patterns and trends in the sequences, leading to more accurate forecasting. It is noteworthy that model performance depends heavily on the application scenario and the characteristics of the dataset. Other neural networks may perform well under different tasks and data conditions, or their performance may be improved by adjusting the hyperparameters and network structure.

In summary, based on the current observational results and the data in Table 4, it is possible to deduce that the BP neural network exhibits stronger predictive capabilities for short-term sequences, while the memory-enhanced LSTM neural network performs better when predicting long sequences.

4. Conclusions

This study reviews the application of neural networks to carbon emission prediction and experimentally compares the performance of three neural networks in power carbon emission prediction. The experimental results demonstrate that the BP neural network surpasses the LSTM and RNN in short-term sequence prediction but falls behind them in long-sequence forecasting. Carbon emission data are characterized by large volume and nonlinearity, and neural networks are well suited to this kind of data. Compared with traditional prediction models, neural networks can better capture the complex nonlinear relationships in the data and adapt to increasingly complex carbon emission data. In both domestic and international research, the BP neural network, the recurrent neural network, and combinations of traditional methods with these two networks are primarily employed in forecasting carbon emissions; when combined with neural networks, the traditional methods are mainly used for data processing and for optimizing neural network parameters. Overall, the strong performance of neural networks in carbon emission prediction and their ability to adapt to complex nonlinear data characteristics make them an essential choice of model for power carbon emission prediction. Meanwhile, the evolving deep learning technology will continue to promote the further application and development of neural networks in the realm of power carbon emission prediction.

Currently, there is insufficient research on long-sequence prediction in carbon emission forecasting. With the increasing availability of electricity carbon emission data, there is a need for deep architecture neural network models to capture more complex spatiotemporal relationships and features within the input data. Therefore, future research needs to pay close attention to studying and designing more complex models that can optimize decisions through interaction with the environment, enabling better handling of dynamic and complex carbon emission issues. For example, incorporating attention mechanisms into the model can automatically emphasize the most influential input features for prediction results, enhancing the model’s perception of important features and improving its predictive capabilities under long sequences and complex data scenarios. Additionally, data quality considerations such as the type of electricity, carbon emission factors, and seasonal variations need to be addressed. When making predictions, data containing features such as power generation, load, and climate are more conducive to accurate forecasting.
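To make the attention idea concrete, the following is a minimal sketch of scaled dot-product attention for a single query over a short sequence of feature vectors. The paper does not specify which attention variant it has in mind, so this form, the function names, and the toy vectors are all assumptions for illustration only.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(
    query: Sequence[float],
    keys: Sequence[Sequence[float]],
    values: Sequence[Sequence[float]],
) -> Tuple[List[float], List[float]]:
    """Scaled dot-product attention for one query vector.

    Returns the weighted sum of `values` and the attention weights.
    """
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    context = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim)]
    return context, weights


if __name__ == "__main__":
    # Hypothetical feature vectors for three time steps.
    steps = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
    context, weights = attend([1.0, 0.0], steps, steps)
    print(weights)   # the step most aligned with the query gets the largest weight
    print(context)
```

The weights sum to one, so the output is a convex combination of the inputs in which the entries most similar to the query dominate; in a forecasting model the query, keys, and values would be learned projections rather than fixed vectors.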

Author Contributions

Data collection, design, and writing: W.F. and W.L.; study conception and paper revision: D.C.; analysis and interpretation of results: T.C., L.L., L.Z. and B.D.; draft manuscript preparation: W.L., J.L. and D.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Project of State Grid Sichuan Electric Power Company, grant number B7194723R001.

Data Availability Statement

The data used in this paper are available at http://www.caiso.com/TodaysOutlook/Pages/emissions.html (accessed on 1 March 2023).

Conflicts of Interest

Authors Wentao Feng, Tailong Chen, Longsheng Li, Le Zhang, and Bingyan Deng were employed by the company State Grid Sichuan Information & Telecommunication Company. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Nomenclature

LSTM	Long short-term memory
BP	Backpropagation
PSO	Particle swarm optimization
IPSO	Improved particle swarm optimization
PCA	Principal component analysis
MAPE	Mean absolute percentage error
MAE	Mean absolute error
MSE	Mean square error
RMSE	Root mean square error
RNN	Recurrent neural network
SVR	Support vector regression
VMD	Variational mode decomposition
EEMD	Ensemble empirical mode decomposition

References

1. Ling, D.Y. Greenhouse Effect Harm and Control Measures. Taxation 2018, 13, 252.
2. Tavassoli, M.; Kamran-Pirzaman, A. Comparison of effective greenhouse gases and global warming. In Proceedings of the 2023 8th International Conference on Technology and Energy Management (ICTEM), Babol, Iran, 8–9 February 2023; pp. 1–5.
3. Jiang, Y.B. Behind frequent extreme weather. China Discipline Inspection and Supervision News, 18 August 2021.
4. Yuan, X.L.; Geng, H.Y.; Li, S.R.; Li, C.P. The current situation, challenges and countermeasures of realizing the “double carbon” goal of Chinese cities from the perspective of high-quality development. J. Xi’an Jiaotong Univ. (Soc. Sci. Ed.) 2022, 42, 30–38.
5. Fu, Y.; Ma, Y.H.; Liu, Y.J.; Niu, W.Y. Research on the development model of low-carbon economy. China Popul. Resour. Environ. 2008, 3, 14–19.
6. Lai, X.D.; Zhan, W.L. The impact of energy conservation and emission reduction policies of thousands of enterprises on corporate green technology innovation and its internal mechanism. China Popul. Resour. Environ. 2023, 33, 104–114.
7. Guo, M.X. Quantitative assessment of the contribution of fossil energy reduction to pollution reduction and carbon reduction. Ecol. Econ. 2023, 39, 184–190+207.
8. Chen, J.J. Discussion on energy conservation and emission reduction ideas in the refining and chemical industry under the background of “double carbon”. Mod. Chem. 2023, 43, 7–12.
9. Xu, F.; Pan, Q.; Wang, Y.N. Research on the impact of green and low-carbon transformation on corporate profitability under the “dual carbon” goal. Macroecon. Res. 2022, 1, 161–175.
10. Yang, H.; O’Connell, J.F. Short-term carbon emissions forecast for aviation industry in Shanghai. J. Clean. Prod. 2020, 275, 122734.
11. Li, Y.; Li, T.; Lu, S. Forecast of urban traffic carbon emission and analysis of influencing factors. Energy Effic. 2021, 14, 84.
12. Wen, L.; Cao, Y. Influencing factors analysis and forecasting of residential energy-related CO₂ emissions utilizing optimized support vector machine. J. Clean. Prod. 2020, 250, 119492.
13. AlKheder, S.; Almusalam, A. Forecasting of carbon dioxide emissions from power plants in Kuwait using United States Environmental Protection Agency, Intergovernmental panel on climate change, and machine learning methods. Renew. Energy 2022, 191, 819–827.
14. Wen, L.; Yuan, X. Forecasting CO₂ emissions in Chinas commercial department, through BP neural network based on random forest and PSO. Sci. Total Environ. 2020, 718, 137194.
15. Zhang, C.; Guo, Y.; Li, M. A review of the development and application of artificial neural network models. Comput. Eng. Appl. 2021, 57, 57–69.
16. Bai, Y.F.; Zhang, W.R.; Liu, J.P. Research on the prediction method of per capita carbon emissions in urban demonstration areas based on the environmental Kuznets curve. Ecol. Econ. 2022, 38, 35–42+84.
17. Zhao, Z.X.; Wang, R.; Voss, H.; Yan, Y.F. Forecasting the turning point of China’s carbon emissions based on the classic environmental Kuznets model. Financ. Trade Econ. 2013, 10, 81–88+48.
18. Zhu, Y.E.; Li, L.F.; He, S.S.; Li, H.; Wang, Y. Annual prediction of peak carbon emissions in Shanxi Province based on IPAT model and scenario analysis method. Resour. Sci. 2016, 38, 2316–2325.
19. Zhang, X.; Wang, R.; Liu, S.Q.; Wei, D.; Chang, Y.Y.; Zhang, Y.K. Carbon peak scenario analysis in Xuzhou based on the extended IPAT model. Chin. Mark. 2023, 8, 22–24.
20. Wang, N.; Han, C.Y.; Zhang, Y.; Gu, Z.L. Research on Regional Carbon Emissions Peaking Based on the Threshold-STIRPAT Extended Model—Taking East China as an Example. Environ. Eng. 2023, 10, 1–11. Available online: http://kns.cnki.net/kcms/detail/11.2097.X.20231026.1803.004.html (accessed on 15 March 2023).
21. Xiao, Y.H.; Lu, H.; Lu, D.Y. Analysis of Carbon Emission Characteristics and Carbon Reduction Potential of Campus Building Operations Based on STIRPAT Model. Environ. Eng. 2023, 10, 1–11. Available online: http://kns.cnki.net/kcms/detail/11.2097.X.20231013.1438.006.html (accessed on 15 March 2023).
22. Pan, S.Y.; Zhang, M.L. Research on carbon dioxide emission prediction and influencing factors in Gansu Province based on BP neural network. Environ. Eng. 2023, 41, 61–68+85.
23. Ji, G.Y. Application of BP neural network model based on gray correlation analysis in China’s carbon emission prediction. Pract. Underst. Math. 2014, 44, 243–249.
24. Chen, J.H.; Li, H.; Yang, S.; Zhou, Z.Y. Research on the driving factors and impacts of carbon emissions in the nonferrous metal mining and dressing industry in Hunan Province based on gray correlation analysis. Nonferrous Met. Eng. 2019, 9, 109–116.
25. Zhao, J.H.; Li, J.S.; Wang, P.L.; Hou, G.J. Research on carbon peak path in Henan Province based on Lasso-BP neural network model. Environ. Eng. 2022, 40, 151–156+164.
26. Yan, F.Y.; Liu, S.X.; Zhang, X.P. Research on land carbon emission prediction based on PCA-BP neural network. West. J. Hum. Settl. Environ. 2021, 36, 1–7.
27. Yang, J.Q.; Fan, X.J.; Zhao, Y.H.; Yuan, J. Carbon emission prediction in Shanxi Province based on PSO-BP neural network. J. Environ. Eng. Technol. 2023, 13, 1–15. Available online: http://kns.cnki.net/kcms/detail/11.5972.X.20230918.1150.002.html (accessed on 15 March 2023).
28. Zhang, D.; Wang, T.T.; Zhi, J.H. Carbon emission prediction and ecological economic analysis in Shandong Province based on IPSO-BP neural network model. Ecol. Sci. 2022, 41, 149–158.
29. Shi, W.; Yang, J.; Qiao, F.; Wang, C.; Dong, B.; Zhang, X.; Zhao, S.; Wang, W. CO₂ emission prediction based on carbon verification data of 17 thermal power enterprises in Gansu Province. Environ. Sci. Pollut. Res. 2024, 31, 2944–2959.
30. Wang, H.; Wei, Z.J.; Yao, Y.X.; Yu, S.S. Research on CO₂ emission prediction of coal-fired power plants based on BP neural network. In Proceedings of the 2022 Annual Science and Technology Conference of Chinese Society of Environmental Science—Environmental Engineering Technology Proceedings of the Innovation and Application Session; Environmental Engineering Branch of the Chinese Society of Environmental Science: Editorial Department of “Environmental Engineering”: Beijing, China, 2022; pp. 348–352.
31. Zhao, J.Y.; Ma, Z.; Tang, H.L. Comparison of BP neural network and multiple linear regression models in carbon emission prediction. Sci. Technol. Ind. 2020, 20, 172–176.
32. Hu, Z.; Gong, X.; Liu, H. Research on carbon emission prediction of household consumption in western cities based on BP model—Taking Xi’an as an example. Arid. Area Resour. Environ. 2020, 34, 82–89.
33. Liu, J.W.; Song, Z.Y. Review of recurrent neural network research. Control. Decis.-Mak. 2022, 37, 2753–2768.
34. Rocki, K. Recurrent memory array structures. arXiv 2021, arXiv:1607.03085.
35. Choi, J.; Kim, T.; Lee, S.-G. Cell-aware stacked LSTMs for modeling sentences. arXiv 2021, arXiv:1809.02279.
36. Cheng, D.M. Overview of LSTM research status. Inf. Syst. Eng. 2022, 337, 149–152.
37. Graves, A. Generating sequences with recurrent neural networks. arXiv 2013, arXiv:1308.0850.
38. Liu, C.; Wang, Z.L.; Yuan, C.J. The impact and trend prediction of independent technological innovation on industrial carbon emissions from a structural perspective. China Popul. Resour. Environ. 2022, 32, 12–21.
39. Liu, C.; Qu, J.; Ge, Y.; Tang, J.; Gao, X. Carbon emission prediction of China’s transportation industry based on LSTM model. Chin. Environ. Sci. 2023, 43, 2574–2582.
40. Zhang, X.Q.; Li, F.; Zhang, X.; Qiao, X.Y.; Li, X.Y. Research on real-time prediction of China’s carbon emissions based on CNN-LSTM model. China-Arab. Sci. Technol. Forum (Chin. Engl.) 2022, 44, 71–75.
41. Hu, W.; Huang, Y.; Wei, L.; Zhang, F.; Li, H. Deep Convolutional Neural Networks for Hyperspectral Image Classification. J. Sens. 2015, 2015, 1–12.
42. Han, Z.; Cui, B.; Xu, L.; Wang, J.; Guo, Z. Coupling LSTM and CNN Neural Networks for Accurate Carbon Emission Prediction in 30 Chinese Provinces. Sustainability 2023, 15, 13934.
43. Borovykh, A.; Bohte, S.; Oosterlee, C.W. Conditional time series forecasting with convolutional neural networks. arXiv 2017, arXiv:1703.04691.
44. Lu, W.; Duan, J.; Wang, P.; Ma, W.; Fang, S. Short-term Wind Power Forecasting Using the Hybrid Model of Improved Variational Mode Decomposition and Maximum Mixture Correntropy Long Short-term Memory Neural Network. Int. J. Electr. Power Energy Syst. 2023, 144, 108552.
45. Tang, J.; Gong, R.; Wang, H.; Liu, Y. Scenario analysis of transportation carbon emissions in China based on machine learning and deep neural network models. Environ. Res. Lett. 2023, 18, 064018.
46. Chen, Y.; Chen, Z.; Li, K.; Shi, T.; Chen, X.; Lei, J.; Wu, T.; Li, Y.; Liu, Q.; Shi, B.; et al. Research of Carbon Emission Prediction: An Oscillatory Particle Swarm Optimization for Long Short-Term Memory. Processes 2023, 11, 3011.
47. Shao, C.; Ning, J. Construction and application of a carbon emission prediction model for China’s textile and apparel industry based on improved WOA-LSTM. J. Beijing Inst. Fash. Technol. (Nat. Sci. Ed.) 2023, 43, 73–81.
48. Wang, W.; Pan, H.; Wang, G. Research on industrial carbon emission prediction and influencing factors in Liaoning Province based on GWO-LSTM model. Environ. Sci. Manag. 2024, 49, 28–33.
49. Ke, H.; Zhang, X.S.; Chen, Z.Z. Research on carbon emission prediction in Shaanxi Province based on quadratic decomposition BAS-LSTM. Oper. Manag. 2024, 144–152.
50. Jiang, X.; Li, S. BAS: Beetle Antennae Search Algorithm for Optimization Problems. Int. J. Robot. Control 2017, 1, 1–3.
51. Dragomiretskiy, K.; Zosso, D. Variational Mode Decomposition. IEEE Trans. Signal Process. 2013, 62, 531–544.
52. Wu, Z.; Huang, N.E. Ensemble Empirical Mode Decomposition: A Noise-Assisted Data Analysis Method. Adv. Adapt. Data Anal. 2009, 1, 1–41.
53. Ahmadi, M.H.; Jashnani, H.; Chau, K.W.; Kumar, R.; Rosen, M.A. Carbon dioxide emissions prediction of five Middle Eastern countries using artificial neural networks. Energy Sources Part A Recovery Util. Environ. Eff. 2019, 45, 9513–9525.
54. Ghalandari, M.; Fard, H.F.; Birjandi, A.K.; Mahariq, I. Energy-related carbon dioxide emission forecasting of four European countries by employing data-driven methods. J. Therm. Anal. Calorim. 2021, 144, 1999–2008.
55. Zheng, H.; Guo, X.; Guo, Z.; Guo, H.W.; Liu, Y.N. Carbon emission prediction of automotive parts production process based on IPSO-GRNN. Mech. Des. 2023, 40, 69–73.
56. Zhang, X.; Wei, Z.; Chen, Z.; Guo, Y.W. Research on industrial carbon emission prediction method based on LASSO-GWO-KELM. Environ. Eng. 2023, 41, 141–149.
57. Chi, X.; Quan, Z.; Jia, X.; Zhang, W.J. Carbon emission prediction of power plants based on WPD-ISSA-CA-CNN model. Control Engineering 2023, 1–8.

Figure 1. Literature analysis in the field of neural network carbon emission prediction.

Figure 2. BP neural network structure diagram.

Figure 3. Hybrid model flow chart.

Figure 4. RNN structure diagram.

Figure 5. LSTM structure diagram.

Figure 6. Partial data presentation diagram.

Figure 7. Data feature correlation analysis.

Figure 8. Visualization of the 12 prediction windows of the model.

Table 1. Summary comparison of advantages and disadvantages of related models based on BP neural networks.

Reference	Method	Advantages	Limitations
[22]	Multilayer feedforward neural network trained by the error backpropagation algorithm.	High prediction accuracy, the ability to learn nonlinear relationships in the data, and better interpretability.	Long training time, prone to overfitting, sensitive to initial parameters and learning rate, and requires careful parameter tuning.
[23,25,26]	The data variables are screened to select those with a high impact before being fed into the BP neural network for training.	Filtering for influential variables improves training speed and prediction accuracy and enhances model robustness.	Only variables with little influence are removed; when many variables remain, the error in the prediction results becomes larger.
[27,28]	Optimization of the parameters of the neural network using an optimization algorithm.	The parameters of the neural network can be accurately optimized, improving the prediction accuracy and generalization of the model.	The computational overhead is high, and the search may fall into local optimal solutions.

Table 2. Comparison of related models based on LSTM neural networks.

Reference	Method	Advantages	Limitations
[37]	Neural networks with memory capabilities process sequential data by introducing cyclic connections and recurrent units.	Ability to handle long-term dependencies, flexibility in handling multi-dimensional data, and high prediction accuracy.	Longer training times and a stronger reliance on past data; long-term memory loss problems may occur.
[40,43]	Combines local features of the data extracted by a CNN with global features of the data extracted by an LSTM.	Noise information can be filtered to accurately capture trends in time-series data.	Long training time and high computational overhead.
[49]	The data are first decomposed into modes of different frequencies, and then the parameters of the LSTM neural network are optimized using the BAS algorithm.	The problem of data nonlinearity and instability is addressed, and the precision and convergence rate of the model are effectively improved.	Data decomposition may result in loss of information and high computational overhead.
[45,46,47,48]	Optimization of the parameters of the neural network using an optimization algorithm.	Searching for globally optimal parameter settings speeds up convergence and improves the prediction accuracy and generalization of the model.	High computational resource requirements, risk of overfitting, and the possibility of falling into a local optimum.

Table 3. Dataset information.

Dataset	CO₂
Variables	9
Timesteps	18,432
Granularity	5 min
Start time	1 February 2023
Task type	Multi-step
Data partition	Train/Validation/Test: 6/2/2
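As a worked check of the figures in Table 3, the sketch below derives the size of each partition and the time span covered by the record. It assumes a chronological split, integer rounding with the remainder assigned to the test set, sampling that starts at midnight, and no gaps in the record; none of these details are stated in the paper.

```python
from datetime import datetime, timedelta
from typing import Tuple

TIMESTEPS = 18_432            # from Table 3
GRANULARITY_MIN = 5           # from Table 3
START = datetime(2023, 2, 1)  # date from Table 3; the midnight start is an assumption


def split_sizes(n: int, ratios: Tuple[int, int, int] = (6, 2, 2)) -> Tuple[int, int, int]:
    """Train/validation/test sizes; any rounding remainder goes to the test set."""
    total = sum(ratios)
    train = n * ratios[0] // total
    val = n * ratios[1] // total
    return train, val, n - train - val


def last_timestamp(start: datetime, steps: int, minutes: int) -> datetime:
    """Timestamp of the final sample, assuming a gap-free record."""
    return start + timedelta(minutes=minutes * (steps - 1))


if __name__ == "__main__":
    print(split_sizes(TIMESTEPS))
    print(TIMESTEPS * GRANULARITY_MIN / (60 * 24), "days")
    print(last_timestamp(START, TIMESTEPS, GRANULARITY_MIN))
```

Under the gap-free assumption, 18,432 steps at 5 min intervals correspond to exactly 64 days of data.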

Table 4. Experimental results.

Prediction length	BP MSE	BP MAE	LSTM MSE	LSTM MAE	RNN MSE	RNN MAE
3	0.0459	0.1530	0.2278	0.3337	0.0564	0.1560
6	0.0702	0.1884	0.2559	0.3743	0.0877	0.2060
12	0.1113	0.2410	0.2599	0.3863	0.4014	0.5037
24	0.6392	0.6375	0.3770	0.4735	0.5008	0.5661

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
