Article

Forecasting the Natural Gas Supply and Consumption in China Using a Novel Grey Wavelet Support Vector Regressor

1 School of Mathematics and Physics, Southwest University of Science and Technology, Mianyang 621010, China
2 Center for Information Management and Service Studies of Sichuan, Southwest University of Science and Technology, Mianyang 621010, China
3 School of Management Science and Real Estate, Chongqing University, Chongqing 400045, China
* Author to whom correspondence should be addressed.
Systems 2023, 11(8), 428; https://doi.org/10.3390/systems11080428
Submission received: 30 June 2023 / Revised: 25 July 2023 / Accepted: 11 August 2023 / Published: 15 August 2023

Abstract: Natural gas is playing an important role in the reconstruction of China's energy system. Forecasting natural gas supply and consumption indicators provides important decision-making support for the government and energy companies, and has attracted considerable interest from researchers in recent years. To deal with the complex features of the natural gas datasets in China, a Grey Wavelet Support Vector Regressor is proposed in this work. The model integrates the primary framework of the grey system model with the kernel representation employed in the support vector regression model. Through a series of mathematical transformations, the parameter optimization problem can be solved using the sequential minimal optimization algorithm. The Grey Wolf Optimizer is used to tune its hyperparameters within a nested cross-validation scheme, and a complete computational algorithm is built. Case studies are conducted with real-world datasets from 2003–2020 in China using the proposed model and 15 other models. The results show that the proposed model achieves significantly higher out-of-sample forecasting performance than all the other models, indicating its high potential for forecasting the natural gas supply and consumption in China.

1. Introduction

Energy forecasting is of great significance for understanding and anticipating the future dynamics of the global energy landscape. As a versatile and abundant primary energy source, natural gas plays a vital role in meeting the energy demands of various sectors, including power generation, heating, industrial processes, and transportation. Accurate natural gas forecasting is crucial for policymakers to develop effective energy strategies and make informed decisions about energy infrastructure, environmental regulations, and energy security. However, as countries strive to reduce their reliance on fossil fuels and transition towards cleaner energy alternatives, the demand for natural gas may fluctuate. Additionally, geopolitical factors, such as regional conflicts or trade agreements, can influence natural gas consumption patterns by affecting supply routes and prices. In short, the natural gas time series is influenced by multiple factors and exhibits typically complex and irregular characteristics, which traditional forecasting models struggle to capture accurately. The popular energy forecasting methods at present include statistical (or empirical) models, machine learning models, and grey system models.
Time series models are among the most widely used statistical models; they include the autoregressive integrated moving average (ARIMA) model, the autoregressive moving average (ARMA) model, and the seasonal autoregressive integrated moving average model [1]. However, the linear structure of time series models often prevents them from forecasting nonlinear time series accurately. Machine learning models have unique advantages in handling nonlinear time series; they mainly include neural network models [2,3,4], kernel-based models [5,6], and ensemble models [7]. Although machine learning models excel at nonlinear system problems, they require a significant amount of training data and impose certain hardware requirements. Furthermore, machine learning models possess intricate nonlinear parameters, which may lead to overfitting. Given the limitations of the aforementioned approaches, grey system models have gained wide applicability due to their ability to model with limited data and accommodate both linear and nonlinear problems.
Since Professor Deng [8] proposed the grey system theory, many researchers have contributed to the development of grey systems in different fields, including nuclear energy consumption [9,10,11,12], wind energy [13,14], electricity consumption [15,16], and oil production [17,18]. Currently, mainstream grey models can be broadly categorized into linear and nonlinear models. The GM(1,1) model is the most classical linear grey model, but it has inherent issues in its modeling mechanism. Therefore, many scholars have proposed other linear grey models based on the GM(1,1) model. For example, Chen et al. [19] improved the background value of the grey model using Gaussian–Legendre integration. Xie et al. [20] first proposed the idea of grey discrete modeling and established the discrete grey model. Xu et al. [21] first demonstrated that grey inputs can vary with time and possess dynamic variability. Then, Cui et al. [22] employed a linear function of time as the grey input instead of the original constant grey input. However, forecasting natural gas is a complex nonlinear problem, and linear grey models have significant limitations when facing such a problem.
Nonlinear grey models can better capture the nonlinear relationships in raw data. In recent years, an increasing number of studies have focused on nonlinear grey system models. For instance, Zhou et al. [23] proposed a novel discrete grey model considering nonlinearity and fluctuations. Chen et al. [24] proposed the fractional Hausdorff derivative grey model based on fractional order calculation. Qian et al. [25] proposed a grey model with time power and demonstrated its feasibility. Zeng et al. [26] presented a grey model that incorporates lagged dependent variables, linear correction terms, and stochastic disturbance terms. Liu et al. [27] proposed a grey neural network and input–output combined forecasting model for forecasting the primary energy consumption in Spanish economic sectors. Ma et al. [28] combined grey system modeling with wavelet kernel-based machine learning to propose a novel wavelet kernel-based grey system model, which exploits the nonlinearity and periodicity of the wavelet kernel, and applied it to urban natural gas consumption forecasting in Kunming, China. However, existing nonlinear grey system models are often constructed using specific functions, resulting in relatively fixed model structures that are less adaptable to more general nonlinear characteristics.
Hyperparameter optimization of nonlinear grey models has also emerged as a popular research topic in recent years. Intelligent optimization algorithms are a class of algorithms based on heuristic search strategies, capable of finding solutions close to the global optimum within a limited number of iterations. This characteristic makes them particularly effective for highly nonlinear, multimodal, and complex optimization problems, leading to their widespread application in recent years. At present, the most popular algorithms are Grey Wolf Optimization (GWO) [29], Particle Swarm Optimization (PSO) [30], and the Genetic Algorithm (GA) [31]. For instance, Cai et al. [32] introduced GWO for tuning the hyperparameters of long short-term memory networks. Heydari et al. [33] employed GWO to optimize the Generalized Regression Neural Network. Sebayang et al. [34] used GWO to tune the hyperparameters of an Artificial Neural Network. Barman et al. [35] compared the performance of GWO, PSO, and GA, showing that GWO outperformed the other algorithms.
In summary, nonlinear grey models can better fit complex data patterns, effectively handle outliers and noisy data, and improve prediction accuracy. In previous work, researchers employed specific functions as the grey input to construct nonlinear grey models. In order to address a broader spectrum of nonlinear characteristics, the kernel trick of support vector regression with a wavelet kernel is employed here to build a novel Grey Wavelet Support Vector Regressor. The $\varepsilon$-insensitive loss function is used to build the optimization problem for training the model, which is transformed into a convex quadratic program and then solved by the sequential minimal optimization algorithm. The GWO is incorporated to optimize the hyperparameters of the proposed model, thereby completing its training and forecasting process.
The subsequent sections of this paper are organized as follows: Section 2 presents the proposed Grey Wavelet Support Vector Regressor, including all the modeling procedures and algorithms; Section 3 presents the hyperparameter optimization by the grey wolf optimizer; Section 4 demonstrates the case studies conducted to forecast China’s total natural gas supply (available for consumption) and total natural gas consumption using the proposed model. A comparison with other grey system models is provided, and the findings are summarized in Section 5.

2. The Proposed Grey Wavelet Support Vector Regressor

In this section, the proposed Grey Wavelet Support Vector Regressor (GWSVR) is presented, which uses the main formulation of grey system models and the loss function of the support vector regression.

2.1. Grey System Model with Nonlinear Mapping and Its Solution

A general formulation of the grey system model with nonlinear effect by time can be represented by:
\frac{dy^{(1)}(t)}{dt} + b\,y^{(1)}(t) = g(t), \quad (1)
where $g(t)$ is a nonlinear function of time $t$, and $y^{(1)}(t) = \sum_{\tau=1}^{t} y^{(0)}(\tau)$ is the first-order accumulation, often abbreviated as 1-AGO. In previous work, researchers used specific functions, such as a linear function or an exponential function, to build particular grey system models. In order to deal with more general nonlinear features, a nonlinear mapping is used in this work, defined as follows:
\phi : \mathbb{R} \to F, \quad (2)
where $\mathbb{R}$ is the one-dimensional set of real numbers, and $F$ is the feature space, generally a high-dimensional linear space. Within this nonlinear mapping, the nonlinear function can be expressed linearly in the feature space as
g(t) = \omega^{T} \phi(t) + u, \quad (3)
where the vector $\omega \in F$ is the weight, and $u$ is a bias. Then, Equation (1) of the grey system model can be re-written as
\frac{dy^{(1)}(t)}{dt} + b\,y^{(1)}(t) = \omega^{T} \phi(t) + u. \quad (4)
The whitening Equation (4) is a typical ordinary differential equation, whose general solution is
y^{(1)}(t) = y^{(0)}(1)\,e^{-b(t-1)} + \int_{1}^{t} e^{-b(t-\tau)} \left[ \omega^{T} \phi(\tau) + u \right] \mathrm{d}\tau. \quad (5)

2.2. The ε -Insensitive Loss for the Proposed Model

In order to estimate the parameters and the nonlinear function of the grey system model presented above, we first derive its discrete formulation. In previous work, a common approach has been to integrate the whitening Equation (4) over a small interval (e.g., $[k-1, k]$). In this work, we make a small variation: we average the time $t$ inside the nonlinear mapping instead of averaging the nonlinear function, which makes the model simpler without loss of effectiveness. The discrete formulation can be written as
y^{(0)}(k) + b\,s^{(1)}(k) = \omega^{T} \phi\!\left(k - \tfrac{1}{2}\right) + u, \quad (6)
where $s^{(1)}(k) = \frac{1}{2}\left[ y^{(1)}(k) + y^{(1)}(k-1) \right]$, $k = 2, \ldots, n$, are called the background values.
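As a small numerical illustration, the 1-AGO series and the background values can be computed as follows (a minimal sketch; function and variable names are my own):

```python
import numpy as np

def ago_and_background(y0):
    """Compute the 1-AGO series y1 and the background values s1.

    y0 : 1-D array of raw observations y^(0)(1..n).
    Returns (y1, s1), where s1[k-2] corresponds to s^(1)(k) for k = 2..n.
    """
    y1 = np.cumsum(y0)              # y^(1)(t) = sum of y^(0) up to t
    s1 = 0.5 * (y1[1:] + y1[:-1])   # s^(1)(k) = [y^(1)(k) + y^(1)(k-1)] / 2
    return y1, s1

y0 = np.array([4.0, 5.0, 7.0, 10.0])
y1, s1 = ago_and_background(y0)
# y1 = [4, 9, 16, 26]; s1 = [6.5, 12.5, 21.0]
```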
Then, the $\varepsilon$-insensitive loss formulation used in this work is given by
\begin{aligned}
\min_{b, \omega, u} \quad & \frac{1}{2}\left( b^{2} + \|\omega\|^{2} \right) + C \sum_{i=2}^{n} \left( \varsigma_i + \varsigma_i^{*} \right) \\
\text{s.t.} \quad & y^{(0)}(i) + b\,s^{(1)}(i) - \omega^{T}\phi\!\left(i - \tfrac{1}{2}\right) - u \le \varepsilon + \varsigma_i, \\
& \omega^{T}\phi\!\left(i - \tfrac{1}{2}\right) + u - b\,s^{(1)}(i) - y^{(0)}(i) \le \varepsilon + \varsigma_i^{*}, \\
& \varsigma_i, \varsigma_i^{*} \ge 0.
\end{aligned} \quad (7)
The term $C$ is often called the regularization parameter, which balances the fitting error against the scale of the parameters $b$ and $\omega$. The $\varepsilon$ is a threshold that limits the fitting errors of the model, and $\varsigma_i, \varsigma_i^{*}$ are slack variables that make the problem more feasible. This formulation differs from the commonly used least squares method: it not only minimizes the training errors of the model but also reduces the scale of the parameters $b$ and $\omega$, so the model generalizes better in real-world cases.
However, Formulation (7) is not yet computable, as the nonlinear mapping is still unknown. To make it easier to solve, an extended nonlinear mapping is introduced as
\varphi : \mathbb{R} \times \mathbb{R} \to \mathbb{R} \times F, \quad (8)
which maps the vector $\left( s^{(1)}(k),\, k - \tfrac{1}{2} \right)^{T}$ to $\left( s^{(1)}(k),\, \phi\!\left(k - \tfrac{1}{2}\right) \right)^{T}$. Let the extended weight vector be $\tilde{\omega} = \left( b,\, \omega^{T} \right)^{T}$; we then have
b\,s^{(1)}(k) + \omega^{T}\phi\!\left(k - \tfrac{1}{2}\right) = \tilde{\omega}^{T} \varphi(k), \quad (9)
and then we can re-write Equation (7) as
\begin{aligned}
\min_{\tilde{\omega}, u} \quad & \frac{1}{2}\|\tilde{\omega}\|^{2} + C \sum_{i=2}^{n} \left( \varsigma_i + \varsigma_i^{*} \right) \\
\text{s.t.} \quad & y^{(0)}(i) - \tilde{\omega}^{T}\varphi(i) - u \le \varepsilon + \varsigma_i, \\
& \tilde{\omega}^{T}\varphi(i) + u - y^{(0)}(i) \le \varepsilon + \varsigma_i^{*}, \\
& \varsigma_i, \varsigma_i^{*} \ge 0.
\end{aligned} \quad (10)
Problem (10) is now essentially equivalent to the primal problem of the support vector regression introduced in [36]. Within this formulation, the corresponding Lagrangian function can be constructed as
L := \frac{1}{2}\|\tilde{\omega}\|^{2} + C \sum_{i=2}^{n} \left( \varsigma_i + \varsigma_i^{*} \right) - \sum_{i=2}^{n} \left( \eta_i \varsigma_i + \eta_i^{*} \varsigma_i^{*} \right) - \sum_{i=2}^{n} \iota_i \left[ \varepsilon + \varsigma_i - y^{(0)}(i) + \tilde{\omega}^{T}\varphi(i) + u \right] - \sum_{i=2}^{n} \iota_i^{*} \left[ \varepsilon + \varsigma_i^{*} - \tilde{\omega}^{T}\varphi(i) - u + y^{(0)}(i) \right], \quad (11)
where $\iota_i, \iota_i^{*}$ $(i = 2, \ldots, n)$ are the Lagrangian multipliers, which are nonnegative and satisfy $\iota_i \cdot \iota_i^{*} = 0$. By setting the partial derivatives $\frac{\partial L}{\partial u}$, $\frac{\partial L}{\partial \tilde{\omega}}$, and $\frac{\partial L}{\partial \varsigma_i^{(*)}}$ to zero, we obtain the following important equalities:
\sum_{i=2}^{n} \left( \iota_i - \iota_i^{*} \right) = 0, \qquad \tilde{\omega} = \sum_{i=2}^{n} \left( \iota_i - \iota_i^{*} \right) \varphi(i), \qquad \iota_i^{(*)} = C - \eta_i^{(*)}. \quad (12)
According to Mercer's condition, a Mercer kernel can be used as the inner product of the feature space, i.e., $K(t_1, t_2) = \phi^{T}(t_1)\,\phi(t_2)$. With this theorem, we can deduce the following equality:
\begin{aligned}
\tilde{\omega}^{T}\varphi(k) &= \sum_{i=2}^{n} \left( \iota_i - \iota_i^{*} \right) \varphi^{T}(i)\,\varphi(k) \\
&= \sum_{i=2}^{n} \left( \iota_i - \iota_i^{*} \right) \left[ s^{(1)}(i)\,s^{(1)}(k) + \phi^{T}\!\left(i - \tfrac{1}{2}\right)\phi\!\left(k - \tfrac{1}{2}\right) \right] \\
&= \sum_{i=2}^{n} \left( \iota_i - \iota_i^{*} \right) \left[ s^{(1)}(i)\,s^{(1)}(k) + K\!\left(i - \tfrac{1}{2},\, k - \tfrac{1}{2}\right) \right].
\end{aligned} \quad (13)
By substituting the equalities in (12) and (13) with a Mercer kernel, we obtain the dual formulation of the primal (10) as
\begin{aligned}
\max_{\iota, \iota^{*}} \quad & -\frac{1}{2} \sum_{i,j=2}^{n} \left( \iota_i - \iota_i^{*} \right)\left( \iota_j - \iota_j^{*} \right) \left[ s^{(1)}(i)\,s^{(1)}(j) + K\!\left(i - \tfrac{1}{2},\, j - \tfrac{1}{2}\right) \right] - \varepsilon \sum_{i=2}^{n} \left( \iota_i + \iota_i^{*} \right) + \sum_{i=2}^{n} y^{(0)}(i) \left( \iota_i - \iota_i^{*} \right) \\
\text{s.t.} \quad & \sum_{i=2}^{n} \left( \iota_i - \iota_i^{*} \right) = 0 \quad \text{and} \quad \iota_i, \iota_i^{*} \in [0, C].
\end{aligned} \quad (14)
It can be seen that the dual formulation involves only the Lagrangian multipliers, and it is now computationally feasible to solve.

2.3. The Kernel Representation and the Wavelet Kernel

Within the above results, the nonlinear function can now be explicitly written as
g(t) = \omega^{T}\phi(t) + u = \sum_{i=2}^{n} \left( \iota_i - \iota_i^{*} \right) K\!\left(i - \tfrac{1}{2},\, t\right) + u. \quad (15)
The wavelet kernel function is used to represent the nonlinear function, which is defined as
K(t_1, t_2) = \exp\!\left( -\frac{(t_1 - t_2)^{2}}{2\nu^{2}} \right) \cdot \cos\!\left( \frac{1.75\,(t_1 - t_2)}{\nu} \right), \quad (16)
where $\nu$ is the kernel parameter, which governs the degree of nonlinearity and periodicity exhibited by the wavelet kernel. The cosine term of the wavelet kernel reflects the periodicity of the samples, and the exponential part reflects their nonlinearity. Interestingly, the exponential part is mathematically equivalent to the Gaussian kernel, which performs well in dealing with nonlinearities. Given this structure, the wavelet kernel is often more flexible in dealing with different spectra of the datasets, which can be expected to make the forecasting model perform better on nonlinear time series.
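The wavelet kernel defined above translates directly into code (a sketch; the function name is my own):

```python
import numpy as np

def wavelet_kernel(x, z, nu=1.0):
    """Wavelet kernel: a Gaussian envelope times a cosine term.

    The exponential factor captures nonlinearity (it equals the Gaussian
    kernel), while the cosine factor captures periodicity."""
    d = np.asarray(x, dtype=float) - np.asarray(z, dtype=float)
    return np.exp(-d**2 / (2.0 * nu**2)) * np.cos(1.75 * d / nu)
```

For identical inputs the kernel equals 1, and it decays with oscillation as the distance between the inputs grows; it is also symmetric in its two arguments.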

2.4. Training Algorithm for the GWSVR

In order to make the model computationally feasible, we should solve the dual problem (14) to obtain the optimal values of the Lagrangian multipliers $\iota, \iota^{*}$. This formulation is mathematically equivalent to the dual problem of the standard support vector regression model introduced in [36]; thus, it can also be solved by the sequential minimal optimization (SMO) algorithm, whose key points are presented in this subsection.
For convenience, the following notation is used
q_{i,j} = s^{(1)}(i)\,s^{(1)}(j) + K\!\left(i - \tfrac{1}{2},\, j - \tfrac{1}{2}\right). \quad (17)
The main idea of the SMO is to minimize the objective function in a "pair-by-pair" way, optimizing only two variables during each iteration. For the regression problem, two subscripts are selected, and the corresponding variables are optimized at that iteration. Noticing that the two multipliers with the same subscript ($\iota_i$ and $\iota_i^{*}$) cannot both be nonzero at the same time, only two variables with different subscripts are updated during each iteration.
Let the two selected subscripts be 1 and 2; then the subproblem of the dual problem (14) can be written as
\begin{aligned}
\max_{\iota_1^{(*)}, \iota_2^{(*)}} \quad & -\frac{1}{2} \left[ \left( \iota_1 - \iota_1^{*} \right)^{2} q_{1,1} + \left( \iota_2 - \iota_2^{*} \right)^{2} q_{2,2} + 2 \left( \iota_1 - \iota_1^{*} \right)\left( \iota_2 - \iota_2^{*} \right) q_{1,2} \right] \\
& - \varepsilon \left( \iota_1 + \iota_1^{*} + \iota_2 + \iota_2^{*} \right) + y^{(0)}(1) \left( \iota_1 - \iota_1^{*} \right) + y^{(0)}(2) \left( \iota_2 - \iota_2^{*} \right) + r(\bar{\iota}),
\end{aligned} \quad (18)
where $\bar{\iota}$ is the vector of the other multipliers, and $r(\bar{\iota})$ is the sum of the remaining terms, which are independent of the variables $\iota_1^{(*)}, \iota_2^{(*)}$ and are regarded as constant during each iteration.
Noticing that the multipliers should satisfy the first equation in (12), the subproblem is essentially a constrained program. The SMO updates the two variables in the following way:
\iota_1^{(*)} \leftarrow \iota_1^{(*)} - t, \qquad \iota_2^{(*)} \leftarrow \iota_2^{(*)} + t. \quad (19)
With this updating rule, the multipliers always satisfy the constraints during the iterations. By substituting (19) into the subproblem (18), the objective function depends only on $t$, and the subproblem becomes a univariate optimization problem, which can be solved analytically.
The bias $u$ can be computed once the optimal multipliers are obtained; it can be approximated by the average value of the partial derivatives of the objective function in (14) with respect to each multiplier.
By comparing Equations (13) and (9), it is easy to obtain the explicit estimation of the parameter b, which can be written as
b = \sum_{i=2}^{n} \left( \iota_i - \iota_i^{*} \right) s^{(1)}(i). \quad (20)
In summary, all the parameters can be obtained using the SMO algorithm and the formulae presented above. The implementation of the SMO is based on the work of Lin et al. [37], which is one of the most stable versions at present.
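As a concrete sketch of this training step, the dual problem can be handed to libsvm's SMO solver (the implementation of Lin et al. [37]) through scikit-learn's precomputed-kernel interface, with the Gram matrix $q_{i,j}$ defined above playing the role of the kernel matrix. This is an illustrative shortcut rather than the authors' exact code; all names and default values are my own:

```python
import numpy as np
from sklearn.svm import SVR

def fit_gwsvr(y0, C=10.0, eps=0.01, nu=1.0):
    """Train a GWSVR-style model by solving the dual with an SMO solver.

    Builds the Gram matrix q_{i,j} = s1(i)*s1(j) + K(i-1/2, j-1/2)
    and fits an epsilon-SVR on it with targets y^(0)(k), k = 2..n."""
    y1 = np.cumsum(y0)
    s1 = 0.5 * (y1[1:] + y1[:-1])            # background values, k = 2..n
    t = np.arange(2, len(y0) + 1) - 0.5      # mid-point times k - 1/2
    d = t[:, None] - t[None, :]
    K = np.exp(-d**2 / (2.0 * nu**2)) * np.cos(1.75 * d / nu)
    Q = np.outer(s1, s1) + K                 # the q_{i,j} Gram matrix
    svr = SVR(kernel="precomputed", C=C, epsilon=eps)
    svr.fit(Q, y0[1:])
    # dual_coef_ holds (iota_i - iota_i^*) for the support vectors,
    # from which b is recovered as their weighted sum with s1.
    b = (svr.dual_coef_ @ s1[svr.support_]).item()
    return svr, b, svr.predict(Q)
```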

2.5. Response Function and the Predicted Values

In order to make predictions, the continuous general solution of the whitening Equation (4) should be discretized. In this work, we average the time point to approximate the convolution integral of the solution (5); its discrete formulation can then be obtained as
\begin{aligned}
\hat{y}^{(1)}(k) &= y^{(0)}(1)\,e^{-b(k-1)} + \sum_{\tau=2}^{k} e^{-b\left(k - \tau + \frac{1}{2}\right)} g\!\left(\tau - \tfrac{1}{2}\right) \\
&= y^{(0)}(1)\,e^{-b(k-1)} + \sum_{\tau=2}^{k} e^{-b\left(k - \tau + \frac{1}{2}\right)} \left[ \sum_{i=2}^{n} \left( \iota_i - \iota_i^{*} \right) K\!\left(i - \tfrac{1}{2},\, \tau - \tfrac{1}{2}\right) + u \right].
\end{aligned} \quad (21)
This formulation differs from that of previous work but is much easier to implement. The predicted values can then be obtained by first-order differencing as
\hat{y}^{(0)}(k) = \hat{y}^{(1)}(k) - \hat{y}^{(1)}(k-1), \quad k = 2, 3, \ldots, n, \quad (22)
and $\hat{y}^{(0)}(1) = y^{(0)}(1)$.
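Putting the discretized response and the differencing step together, a minimal sketch (with my own names) that restores the forecasts from an already-evaluated nonlinear term $g(\tau - 1/2)$ might look like:

```python
import numpy as np

def restore_predictions(y0_first, b, g_mid):
    """Discretized response and first-order differencing.

    y0_first : y^(0)(1), the first raw observation.
    b        : estimated parameter b.
    g_mid    : array with g_mid[tau-2] = g(tau - 1/2) for tau = 2..n,
               i.e. the kernel expansion evaluated at the mid-points.
    Returns the restored series yhat^(0)(1..n).
    """
    n_total = len(g_mid) + 1
    y1_hat = np.empty(n_total)
    y1_hat[0] = y0_first
    for k in range(2, n_total + 1):
        taus = np.arange(2, k + 1)
        y1_hat[k - 1] = (y0_first * np.exp(-b * (k - 1))
                         + np.sum(np.exp(-b * (k - taus + 0.5)) * g_mid[taus - 2]))
    y0_hat = np.empty(n_total)
    y0_hat[0] = y0_first             # yhat^(0)(1) = y^(0)(1)
    y0_hat[1:] = np.diff(y1_hat)     # first-order differencing
    return y0_hat
```

As a sanity check, with $b = 0$ and a constant nonlinear term the restored series after the first point is simply that constant.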

2.6. The Complete Overall Computational Algorithm

For ease of application, the complete algorithm of the proposed GWSVR (Algorithm 1) is presented in this subsection as pseudocode.
Algorithm 1: Overall computation algorithm of GWSVR
Following the above computational steps, one can obtain the values predicted by the GWSVR model from the initial point to the $(n+p)$-th point.

3. Hyperparameter Optimization by the Grey Wolf Optimizer

Owing to the high capacity of the kernel representation, the model can easily fall into overfitting. Hence, selecting appropriate hyperparameters is crucial to enhance the model's performance on out-of-sample datasets.

3.1. Construction of the Hyperparameter Optimization Scheme

In this study, a hold-out scheme is employed, which can be regarded as a modified version of nested cross-validation designed specifically for time series forecasting models. The criterion for model selection is minimizing the mean absolute percentage error (MAPE) on the validation set; the model is trained on the preceding series and evaluated on the validation set. Based on these considerations, the objective function is constructed as
\min_{C, \nu} \; \frac{1}{n_{\mathrm{valid}}} \sum_{i = n_{\mathrm{train}} + 1}^{n_{\mathrm{train}} + n_{\mathrm{valid}}} \left| \frac{\hat{y}^{(0)}(i) - y^{(0)}(i)}{y^{(0)}(i)} \right| \times 100\%. \quad (23)
The corresponding constraints can be presented as follows:
\begin{aligned}
& \hat{y}^{(0)}(1) = y^{(0)}(1), \qquad \iota_i^{(*)},\, b,\, u \ \text{obtained by Algorithm 1}, \\
& \hat{y}^{(1)}(k) = y^{(0)}(1)\,e^{-b(k-1)} + \sum_{\tau=2}^{k} e^{-b\left(k - \tau + \frac{1}{2}\right)} \left[ \sum_{i=2}^{n} \left( \iota_i - \iota_i^{*} \right) K\!\left(i - \tfrac{1}{2},\, \tau - \tfrac{1}{2}\right) + u \right], \\
& \hat{y}^{(0)}(k) = \hat{y}^{(1)}(k) - \hat{y}^{(1)}(k-1), \quad k = 2, 3, \ldots, n.
\end{aligned} \quad (24)
The optimization problem for selecting the hyperparameters is essentially a nonlinear program with equality constraints, which cannot be solved by traditional numerical optimization algorithms. Thus, an intelligent optimizer is used in this work.
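The validation criterion itself is simple to compute once a full predicted series is available; the sketch below (names are my own; arrays are 0-indexed, so the slice covers the validation points that follow the training segment) evaluates the MAPE over the hold-out slice:

```python
import numpy as np

def validation_mape(y_true, y_pred, n_train, n_valid):
    """MAPE (%) over the hold-out validation slice of the series."""
    sl = slice(n_train, n_train + n_valid)
    return float(np.mean(np.abs((y_pred[sl] - y_true[sl]) / y_true[sl])) * 100.0)
```

An intelligent optimizer then searches over $(C, \nu)$ to minimize this quantity, retraining the model for each candidate pair.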

3.2. The Grey Wolf Optimizer

The Grey Wolf Optimizer (GWO) is a metaheuristic algorithm introduced by Mirjalili et al. in 2014, inspired by the hunting behavior of grey wolves in nature [29]. This optimization method has gained significant attention due to its efficient handling of complex optimization problems. The GWO algorithm employs mathematical formulas to emulate the social hierarchy and hunting strategies of grey wolves.
The Grey Wolf Optimizer categorizes the wolves into four levels: $\alpha$, $\beta$, $\delta$, and $\omega$. These designations represent the wolves' fitness levels, ranked in descending order: the $\alpha$ wolf denotes the best solution, while the $\omega$ wolves represent the poorest solutions. The primary operation of the GWO algorithm involves updating the positions of the wolves, also referred to as agents, based on the current positions of the $\alpha$, $\beta$, and $\delta$ wolves. Initially, the distance $D$ between a wolf and the prey is mathematically defined as follows:
D = \left| 2 r_2 \cdot X_p(t) - X(t) \right|, \quad (25)
where $X$ represents the position of a grey wolf, $t$ indicates the current iteration, and $X_p$ denotes the prey's position vector.
Then, the position-updating behavior is defined as
X(t+1) = X_p(t) - \left( 2 a \cdot r_1 - a \right) \cdot D, \quad (26)
where $r_1, r_2$ are random vectors in $[0, 1]$, and the components of $a$ decrease linearly from 2 to 0 over the course of the iterations.
Based on the social hierarchy and prey-encircling processes, the hunting phase of the GWO algorithm applies the position update of Equations (25) and (26) with respect to the $\alpha$, $\beta$, and $\delta$ wolves, yielding three candidate positions $X_1$, $X_2$, and $X_3$. By taking their mean, the new position $X(t+1)$ is obtained as:
X(t+1) = \frac{X_1 + X_2 + X_3}{3}. \quad (27)
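A minimal, self-contained GWO sketch for box-constrained minimization, following the update rules above (agent count, iteration budget, and the elitist bookkeeping of the best-so-far solution are my own choices):

```python
import numpy as np

def gwo(f, lb, ub, n_agents=8, n_iter=50, seed=0):
    """Minimal Grey Wolf Optimizer sketch for minimizing f over a box [lb, ub]."""
    rng = np.random.default_rng(seed)
    dim = len(lb)
    X = rng.uniform(lb, ub, size=(n_agents, dim))
    best_x, best_f = None, np.inf
    for it in range(n_iter):
        fitness = np.array([f(x) for x in X])
        order = np.argsort(fitness)
        if fitness[order[0]] < best_f:               # keep best-so-far solution
            best_f, best_x = fitness[order[0]], X[order[0]].copy()
        alpha, beta, delta = X[order[0]], X[order[1]], X[order[2]]
        a = 2.0 * (1.0 - it / n_iter)                # decreases linearly from 2 to 0
        for i in range(n_agents):
            cand = []
            for leader in (alpha, beta, delta):
                r1, r2 = rng.random(dim), rng.random(dim)
                D = np.abs(2.0 * r2 * leader - X[i])          # distance to leader
                cand.append(leader - (2.0 * a * r1 - a) * D)  # position update
            X[i] = np.clip(np.mean(cand, axis=0), lb, ub)     # mean of the three updates
    return best_x, best_f
```

Applied to the hyperparameter problem, each agent would encode a candidate $(C, \nu)$ pair and $f$ would be the validation MAPE.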
The integral computational algorithm for hyperparameter tuning based on the GWO algorithm is outlined in Algorithm 2.   
Algorithm 2: Algorithm for solving the optimization problem by GWO

4. Applications

4.1. Data Collection and Preprocessing

For this study, annual data on the Total Natural Gas Available for Consumption (measured in 100 million cubic meters, abbreviated as 100 million cu.m) and the Total Natural Gas Consumption (100 million cu.m) were collected from the National Bureau of Statistics of China (NBS) (https://data.stats.gov.cn/english/easyquery.htm?cn=C01 (accessed on 27 June 2023)). The data span 2003 to 2020, with the years 2003 to 2014 designated as the in-sample set. Among the in-sample data, the years 2003 to 2012 were used for model fitting, while the years 2013 to 2014 served as a validation set for hyperparameter tuning. The data from 2015 to 2020 were reserved exclusively as the testing dataset, allowing evaluation of the model's predictive capabilities on unseen data. This approach provides a reliable measure of the model's performance in real-world scenarios. In addition, the data were normalized using MinMax scaling before the experiments, transforming them into the interval $[0, 1]$, which facilitates fair comparisons and prevents undue influence from variables with different scales or units. Furthermore, the time $t$ in the proposed model was accumulated in order to make the feature smoother.
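The preprocessing steps described above can be sketched as follows; the year boundaries follow the text, while the helper names are my own:

```python
import numpy as np

def minmax_scale(x):
    """Scale a series into [0, 1]; also return (min, range) to invert later."""
    lo, rng = x.min(), x.max() - x.min()
    return (x - lo) / rng, (lo, rng)

def minmax_inverse(x_scaled, params):
    """Map scaled values back to the original units."""
    lo, rng = params
    return x_scaled * rng + lo

years = np.arange(2003, 2021)                    # 2003-2020, 18 annual points
fit_mask   = (years <= 2012)                     # 2003-2012: model fitting
valid_mask = (years >= 2013) & (years <= 2014)   # 2013-2014: validation
test_mask  = (years >= 2015)                     # 2015-2020: out-of-sample test
```

Predictions produced in the scaled space are mapped back to physical units with the stored `(min, range)` pair before the evaluation metrics are computed.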

4.2. Benchmarked Models for Comparisons and Evaluation Metrics

To assess the performance of the proposed grey wavelet support vector regressor (GWSVR) model, fifteen regular benchmark grey system models were utilized for model performance comparison, along with ten evaluation metrics. These benchmark models included the conventional grey model (GM) [8], nonhomogeneous grey model (NGM) [22], discrete grey model (DGM) [38], nonhomogeneous discrete grey forecasting model (NDGM) [38], nonhomogeneous grey Bernoulli model (NGBM) [39], fractional-order grey model (FGM) [40], fractional-order nonhomogeneous grey model (FNGM) [41], fractional-order discrete grey model (FDGM) [42], fractional-order nonhomogeneous discrete grey model (FNDGM) [43], new information priority grey model (NIPGM) [44], new information priority nonhomogeneous grey model (NIPNGM) [45], new information priority discrete grey model (NIPDGM) [46], new information priority nonhomogeneous discrete grey model (NIPNDGM) [47], nonlinear grey Bernoulli model with fractional order accumulation (FNGBM) [48], and the new information priority nonlinear grey Bernoulli model (NIPNGBM). For fair comparisons, the benchmark models with hyperparameters were also optimized using the GWO algorithm. In addition, a comprehensive set of evaluation metrics (as shown in Table 1) was employed to quantitatively measure and analyze various aspects of model performance.
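Two of the metrics discussed in the case studies, MAPE and the coefficient of determination $R^2$, can be scripted directly (a sketch; the remaining metrics are listed in Table 1, and the function names are my own):

```python
import numpy as np

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    return float(np.mean(np.abs((y_pred - y_true) / y_true)) * 100.0)

def r2(y_true, y_pred):
    """Coefficient of determination; can be negative for very poor fits,
    as observed for some benchmark models in the case studies."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return float(1.0 - ss_res / ss_tot)
```

Note that predicting the mean of the series yields $R^2 = 0$, so a negative value means the model is worse than a constant predictor.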

4.3. Forecasting Results and Analysis

4.3.1. Case I: Forecast of the Total Natural Gas Available for Consumption in China

Figure 1 displays all the predicted values for the forecast of the total natural gas available for consumption. Additionally, the selected optimal hyperparameters and training time of all models are listed in Table 2, and Table 3 presents the evaluation metrics for the out-of-sample analysis. Notably, the GWSVR model demonstrates superior performance in out-of-sample forecasting, as evidenced by Figure 1. Conversely, the linear grey system models (GM and DGM) exhibit poor forecasting accuracy, with predicted values deviating significantly from the actual data. This discrepancy can be attributed to overfitting issues within these models. Additionally, the NGBM fails to capture the correct trend and exhibits an inaccurate predictive decay tendency. Furthermore, the other grey system models display a common limitation of poor out-of-sample prediction performance compared to their in-sample prediction performance, particularly struggling to accurately predict values near inflection points. In contrast, the GWSVR model effectively captures these inflection points, as evidenced by the close alignment of its predicted values with the original data.
Upon examining the out-of-sample performance evaluation metrics in Table 3, it becomes apparent that the GWSVR model consistently surpasses the other benchmarked grey system models. The GWSVR model reduces the MAPE relative to the benchmark models by margins ranging from 6.645278% to 66.479172%, indicating its superior performance across the evaluated scenarios. Notably, the $R^2$ values for the other grey system models are relatively small, with the NGBM model exhibiting a particularly low $R^2$ value of −18.7624, and none of the other grey system models achieve an $R^2$ value exceeding 0.9. Conversely, the GWSVR model achieves an $R^2$ value of 0.974704, providing compelling evidence of its proficiency in out-of-sample forecasting tasks. The comprehensive out-of-sample forecasting results of all the models are presented in Table 4; it is interesting to see that the maximum error of the GWSVR model is still less than the minimum error of every benchmarked model except the NIPNGBM model, demonstrating the superior capability of the GWSVR model. Furthermore, the median error of the GWSVR model is at least sevenfold lower than those of the other models, and most of the errors of the GWSVR model are the smallest.

4.3.2. Case II: Forecast of the Total Natural Gas Consumption in China

In the case of the total natural gas consumption forecasting, all the predicted values are shown in Figure 2, the selected optimal hyperparameters and training time of all models in Case II are listed in Table 5, and the out-of-sample metrics are listed in Table 6. Overall, the GWSVR model demonstrates superior performance, with predicted values closely aligned with the raw data, outperforming the other benchmarked models. Notably, only the GWSVR model successfully captures the fluctuations observed from 2014 onwards, whereas the other models exhibit limitations in handling these fluctuations. Additionally, the GWSVR model excels in following the overall trend and exhibits minimal difficulty in out-of-sample forecasting, in contrast to the other models, whose predicted values deviate significantly from the out-of-sample raw data. Most other models exhibit poor performance, indicating overfitting issues. Conversely, the GWSVR model demonstrates enhanced capability in addressing inflection points compared to the other models.
Based on the above discussion, the out-of-sample evaluation metrics presented in Table 6 provide empirical evidence supporting the superior performance of the GWSVR model compared with the other benchmarked grey system models in natural gas time series forecasting. Specifically, the GWSVR model exhibits a significantly smaller MAPE than the benchmark models, with reductions ranging from 6.316363% to 63.820175%. Furthermore, the GWSVR model achieves an $R^2$ value of 0.974481, which stands in stark contrast to most other models, which yield negative $R^2$ values. The out-of-sample predictive capability of the other models is notably inferior, as evidenced by MAPE values exceeding 9%. The comprehensive out-of-sample forecasting results of all the models are presented in Table 7. The median errors of the other benchmarked models exceed that of the GWSVR model by 197.28353% to 1912.94653%. Additionally, it is interesting to see that the maximum error of the GWSVR is three times less than the minimum error of the NGBM model; the NGBM model presents the worst performance in this case, as it is underfitting. In summary, the GWSVR model consistently demonstrates superior forecasting performance compared with the other grey system models.

4.4. Discussion

For further discussion, it is evident that the GWSVR model consistently outperforms all other models in terms of having the most accurate out-of-sample predictions that closely align with the original data, as well as superior evaluation metrics compared to the benchmarked grey system models. More detailed analyses of the model’s predictive ability and application performance are provided below.

4.4.1. Comparisons of the GWSVR Model and Linear Grey System Models

In comparison to the linear grey system models, the GWSVR model exhibits notable superiority in this study. The MAPE of the GWSVR model is ten times smaller than that of the GM and DGM models in both cases. This contrast clearly highlights the limitations of linear models, which fail to capture the correct out-of-sample trends. Additionally, linear models struggle to handle inflection points, often producing predicted values that deviate from the raw data. In contrast, the GWSVR model demonstrates remarkable adaptability in addressing these challenges and consistently delivers stable performance across all cases. Its strength lies in effectively extracting temporal features while simultaneously processing time series characteristics.

4.4.2. Comparisons of the GWSVR Model and Nonlinear Grey System Models

In this work, the nonlinear grey system models consistently outperform the linear grey system models in natural gas consumption forecasting, which agrees with existing work reporting that nonlinear grey system models generally yield better forecasting performance than linear ones. Among the benchmarked nonlinear models, the NIPNGBM and FNGBM offer the smallest MAPEs, 9.316916% in Case I and 9.073488% in Case II, respectively, whereas the GWSVR achieves MAPEs of 2.671638% and 2.757125% in the two cases. Moreover, the GWSVR model is more capable of capturing the inflection points, with predicted values tightly following the raw trend.
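The nonlinear benchmarks extend the grey scheme with a Bernoulli power term. The sketch below assumes the textbook NGBM(1,1) form x0(k) + a·z1(k) = b·[z1(k)]^n and uses a plain in-sample MAPE grid search over the power n as a stand-in for the dedicated optimizers used in the paper:

```python
import numpy as np

def ngbm_forecast(x, horizon, n_grid=np.linspace(-1.0, 6.0, 141)):
    """Sketch of an NGBM(1,1)-style model (assumed textbook formulation,
    not the authors' code)."""
    x = np.asarray(x, float)
    x1 = np.cumsum(x)                        # 1-AGO series
    z = 0.5 * (x1[1:] + x1[:-1])             # background values

    def fit_predict(n, steps):
        # least-squares estimate of (a, b) for a fixed power n
        B = np.column_stack([-z, z ** n])
        (a, b), *_ = np.linalg.lstsq(B, x[1:], rcond=None)
        k = np.arange(len(x) + steps)
        inner = (x[0] ** (1 - n) - b / a) * np.exp(-a * (1 - n) * k) + b / a
        x1_hat = inner ** (1.0 / (1 - n))    # time-response function
        return np.r_[x[0], np.diff(x1_hat)]  # back to the original series

    def insample_mape(n):
        with np.errstate(all="ignore"):
            m = np.mean(np.abs((fit_predict(n, 0)[: len(x)] - x) / x))
        return m if np.isfinite(m) else np.inf

    candidates = [n for n in n_grid if abs(n - 1.0) > 1e-6]  # n = 1 is singular
    best_n = min(candidates, key=insample_mape)
    return fit_predict(best_n, horizon)[len(x):]
```

The extra power term lets the curve bend, which is why such models track inflection points better than GM(1,1)-type models, though still less flexibly than a kernel-based regressor.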

5. Conclusions

In this work, a Grey Wavelet Support Vector Regressor (GWSVR) was proposed, with detailed modeling procedures and computational algorithms presented. Two real-world case studies on forecasting the natural gas supply and consumption in China were carried out using the GWSVR in comparison with 15 grey system models, including linear and nonlinear models. The results clearly indicate that most linear grey system models cannot perform well in these two cases, illustrating that the datasets are nonlinear. Although some existing nonlinear grey system models perform better than the linear models, their out-of-sample errors are significantly larger than those of the GWSVR. Overall, the proposed model demonstrates higher performance than the 15 existing grey system models, further indicating the suitability of the proposed modeling method and algorithm for building accurate grey system models. Additionally, the MAPEs of the GWSVR in both cases are smaller than 3%, indicating its high potential for forecasting the natural gas supply and demand in China based on recent historical datasets. These results suggest that the proposed GWSVR can serve as a reliable decision-making support tool in the future.
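The hyperparameter search underlying the proposed method (the GWSVR's C and ν are tuned with the Grey Wolf Optimizer) can be sketched generically. This is a minimal GWO in the canonical Mirjalili-style scheme, not the authors' implementation; the objective `f` and bounds are placeholders:

```python
import numpy as np

def gwo_minimize(f, bounds, n_wolves=12, n_iter=50, seed=0):
    """Minimal Grey Wolf Optimizer sketch: the pack moves toward the three
    best wolves (alpha, beta, delta). `f` maps a parameter vector, e.g.
    (C, nu), to a scalar loss; `bounds` is a list of (low, high) pairs."""
    rng = np.random.default_rng(seed)
    lo, hi = np.array(bounds, float).T
    dim = len(bounds)
    X = rng.uniform(lo, hi, size=(n_wolves, dim))
    best_x, best_f = None, np.inf
    for t in range(n_iter):
        fitness = np.array([f(x) for x in X])
        if fitness.min() < best_f:               # keep the best-ever solution
            best_f = float(fitness.min())
            best_x = X[np.argmin(fitness)].copy()
        alpha, beta, delta = X[np.argsort(fitness)[:3]]
        a = 2.0 - 2.0 * t / n_iter               # exploration factor: 2 -> 0
        for i in range(n_wolves):
            new = np.zeros(dim)
            for leader in (alpha, beta, delta):
                A = 2.0 * a * rng.random(dim) - a
                C = 2.0 * rng.random(dim)
                new += leader - A * np.abs(C * leader - X[i])
            X[i] = np.clip(new / 3.0, lo, hi)    # average pull of the leaders
    return best_x, best_f
```

In the paper's setting, `f` would wrap a nested cross-validation loss of the GWSVR for a given (C, ν) pair.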

Author Contributions

Conceptualization, methodology, writing—original draft preparation, writing—review and editing, software, funding acquisition, X.M.; writing—original draft preparation, writing—review and editing, software, H.Y. and Y.D. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Humanities and Social Science Fund of the Ministry of Education of China (19YJCZH119) and the Natural Science Foundation of Sichuan Province (No. 2022NSFSC1821).

Data Availability Statement

Publicly available datasets were analyzed in this study. The data can be found here: https://data.stats.gov.cn/english/easyquery.htm?cn=C01.

Conflicts of Interest

The authors declare no conflict of interest.

Figure 1. Predicted values of all models in Case I.
Figure 2. Predicted values of all models in Case II.
Table 1. Model performance metrics.
| Metric | Abbreviation | Expression |
|---|---|---|
| Mean Absolute Error | MAE | $\frac{1}{n}\sum_{k=1}^{n}\lvert y^{(0)}(k)-\hat{y}^{(0)}(k)\rvert$ |
| Mean Absolute Percentage Error | MAPE | $\frac{1}{n}\sum_{k=1}^{n}\frac{\lvert y^{(0)}(k)-\hat{y}^{(0)}(k)\rvert}{y^{(0)}(k)}\times 100\%$ |
| Mean Square Error | MSE | $\frac{1}{n}\sum_{k=1}^{n}\left(y^{(0)}(k)-\hat{y}^{(0)}(k)\right)^{2}$ |
| Root Mean Square Error | RMSE | $\sqrt{\frac{1}{n}\sum_{k=1}^{n}\left(y^{(0)}(k)-\hat{y}^{(0)}(k)\right)^{2}}$ |
| Theil U Statistic 1 | U1 | $\frac{\sqrt{\frac{1}{n}\sum_{k=1}^{n}\left(y^{(0)}(k)-\hat{y}^{(0)}(k)\right)^{2}}}{\sqrt{\frac{1}{n}\sum_{k=1}^{n}y^{(0)}(k)^{2}}+\sqrt{\frac{1}{n}\sum_{k=1}^{n}\hat{y}^{(0)}(k)^{2}}}$ |
| Theil U Statistic 2 | U2 | $\sqrt{\frac{\sum_{k=1}^{n}\left(y^{(0)}(k)-\hat{y}^{(0)}(k)\right)^{2}}{\sum_{k=1}^{n}y^{(0)}(k)^{2}}}$ |
| Median Absolute Error | MedAe | $\operatorname{median}_{k}\,\lvert y^{(0)}(k)-\hat{y}^{(0)}(k)\rvert$ |
| Index of Agreement | IA | $1-\frac{\sum_{k=1}^{n}\left(y^{(0)}(k)-\hat{y}^{(0)}(k)\right)^{2}}{\sum_{k=1}^{n}\left(\lvert y^{(0)}(k)-\bar{y}^{(0)}\rvert+\lvert\hat{y}^{(0)}(k)-\bar{\hat{y}}^{(0)}\rvert\right)^{2}}$ |
| Coefficient of Determination | R2 | $1-\frac{\sum_{k=1}^{n}\left(y^{(0)}(k)-\hat{y}^{(0)}(k)\right)^{2}}{\sum_{k=1}^{n}\left(y^{(0)}(k)-\bar{y}^{(0)}\right)^{2}}$ |
| Percent Bias | Pbias | $\frac{\sum_{k=1}^{n}\left(y^{(0)}(k)-\hat{y}^{(0)}(k)\right)}{\sum_{k=1}^{n}\hat{y}^{(0)}(k)}$ |
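The metrics above can be computed directly. The sketch below follows the formulas listed in Table 1 and assumes MedAe is the median absolute error, as its name indicates:

```python
import numpy as np

def evaluation_metrics(y, y_hat):
    """Compute the out-of-sample metrics of Table 1 for observed values
    y and predicted values y_hat (1-D arrays)."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    err = y - y_hat
    mae = np.mean(np.abs(err))
    mape = np.mean(np.abs(err) / y) * 100.0
    mse = np.mean(err ** 2)
    rmse = np.sqrt(mse)
    u1 = rmse / (np.sqrt(np.mean(y ** 2)) + np.sqrt(np.mean(y_hat ** 2)))
    u2 = np.sqrt(np.sum(err ** 2) / np.sum(y ** 2))
    medae = np.median(np.abs(err))
    ia = 1.0 - np.sum(err ** 2) / np.sum(
        (np.abs(y - y.mean()) + np.abs(y_hat - y_hat.mean())) ** 2)
    r2 = 1.0 - np.sum(err ** 2) / np.sum((y - y.mean()) ** 2)
    pbias = np.sum(err) / np.sum(y_hat)
    return {"MAE": mae, "MAPE": mape, "MSE": mse, "RMSE": rmse,
            "U1": u1, "U2": u2, "MedAe": medae, "IA": ia,
            "R2": r2, "Pbias": pbias}
```

A negative R2, as reported for several benchmark models, means the model predicts worse than a constant equal to the sample mean.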
Table 2. Optimal hyperparameters and training times of the models in Case I.
| Model | Abbreviation | Optimal Hyperparameters | Training Time (s) |
|---|---|---|---|
| Grey Wavelet Support Vector Regressor | GWSVR | C = 279.8859, ν = 0.2920 | 0.84743 |
| Nonhomogeneous Grey Bernoulli Model | NGBM | n = 5.8944 | 0.03722 |
| Fractional-order Grey Model | FGM | r = 0.2582 | 0.15140 |
| Fractional-order Nonhomogeneous Grey Model | FNGM | r = 1.3683 | 0.09306 |
| Fractional-order Discrete Grey Model | FDGM | r = 0.2377 | 0.10684 |
| Fractional-order Nonhomogeneous Discrete Grey Model | FNDGM | r = −1.0525 | 0.61533 |
| New Information Priority Grey Model | NIPGM | r = 0.1218 | 0.05067 |
| New Information Priority Nonhomogeneous Grey Model | NIPNGM | r = 0.1956 | 0.05541 |
| New Information Priority Discrete Grey Model | NIPDGM | r = 0.8181 | 0.04859 |
| New Information Priority Nonhomogeneous Discrete Grey Model | NIPNDGM | r = 0.7722 | 0.19020 |
| Nonlinear Grey Bernoulli Model with Fractional-order Accumulation | FNGBM | n = 2.0000, r = −0.0255 | 0.78871 |
| New Information Priority Nonlinear Grey Bernoulli Model | NIPNGBM | n = 2.0000, r = 0.0000 | 0.42991 |
Table 3. Out-of-sample forecasting metrics of all models in Case I.
| Metric | GWSVR | GM | NGM | DGM | NDGM | NGBM | FGM | FNGM | FDGM | FNDGM | NIPGM | NIPNGM | NIPDGM | NIPNDGM | FNGBM | NIPNGBM |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| MAE | 67.67645 | 838.3097 | 248.2112 | 856.624 | 353.4656 | 1931.666 | 681.2515 | 634.3292 | 671.4348 | 375.5022 | 720.9939 | 1633.579 | 573.2332 | 1294.021 | 294.8063 | 213.4934 |
| MAPE | 2.671638 | 30.90473 | 9.520339 | 31.5974 | 13.63293 | 69.15081 | 25.57323 | 23.52015 | 25.24756 | 14.80201 | 27.01877 | 60.17945 | 21.68454 | 47.20952 | 12.29786 | 9.316916 |
| MSE | 6210.548 | 851,978.1 | 71,475.37 | 887,239.9 | 136,967.9 | 4,851,922 | 530,154.6 | 480,953.7 | 512,394.5 | 146,942.1 | 596,039.9 | 3,214,052 | 368,567.4 | 2,103,858 | 93,267.14 | 57,426.36 |
| RMSE | 78.80703 | 923.0266 | 267.3488 | 941.9341 | 370.0917 | 2202.708 | 728.1172 | 693.5083 | 715.8174 | 383.3302 | 772.0362 | 1792.778 | 607.0975 | 1450.468 | 305.3967 | 239.638 |
| U1 | 0.014978 | 0.14954 | 0.048349 | 0.152132 | 0.065671 | 0.628413 | 0.121569 | 0.116579 | 0.119747 | 0.067848 | 0.127981 | 0.254925 | 0.103404 | 0.217042 | 0.054986 | 0.043823 |
| U2 | 0.029889 | 0.350072 | 0.101396 | 0.357243 | 0.140363 | 0.835411 | 0.27615 | 0.263024 | 0.271485 | 0.145384 | 0.292807 | 0.679939 | 0.230251 | 0.550112 | 0.115826 | 0.090886 |
| MedAe | 60.64835 | 726.0962 | 249.7691 | 743.6731 | 346.3844 | 2072.833 | 599.266 | 545.3121 | 591.2506 | 381.9754 | 636.0132 | 1447.303 | 502.216 | 1114.763 | 280.7869 | 200.0508 |
| IA | 0.993118 | 0.676358 | 0.939349 | 0.66958 | 0.893418 | 0.195224 | 0.739185 | 0.76097 | 0.743286 | 0.880073 | 0.722582 | 0.460159 | 0.787813 | 0.534152 | 0.904416 | 0.933144 |
| R2 | 0.974704 | −2.47021 | 0.708873 | −2.61383 | 0.442114 | −18.7624 | −1.15938 | −0.95898 | −1.08704 | 0.401488 | −1.42774 | −12.0912 | −0.50122 | −7.56926 | 0.620112 | 0.766096 |
| Pbias | 0.001949 | −0.24455 | −0.08746 | −0.24856 | −0.1201 | 2.935512 | −0.20827 | −0.19675 | −0.20589 | −0.12664 | −0.21778 | −0.3868 | −0.18123 | −0.33319 | −0.1022 | −0.07616 |
Table 4. Detailed results of the out-of-sample forecasting of all models in Case I.
| Year | Raw Data | GWSVR | Error | GM | Error | NGM | Error | DGM | Error | NDGM | Error | NGBM | Error | FGM | Error | FNGM | Error |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2015 | 1925 | 1962.351 | −37.3511 | 2286.432 | −361.432 | 2069.711 | −144.711 | 2297.588 | −372.588 | 2155.033 | −230.033 | 1468.653 | 456.3469 | 2281.862 | −356.862 | 2209.65 | −284.65 |
| 2016 | 2080.5 | 2177.216 | −96.7159 | 2654.098 | −573.598 | 2337.722 | −257.222 | 2667.468 | −586.968 | 2430.429 | −349.929 | 1300.472 | 780.0278 | 2614.386 | −533.886 | 2546.322 | −465.822 |
| 2017 | 2390.7 | 2415.607 | −24.9071 | 3080.887 | −690.187 | 2633.016 | −242.316 | 3096.894 | −706.194 | 2733.54 | −342.84 | 761.2975 | 1629.402 | 2989.801 | −599.101 | 2929.592 | −538.892 |
| 2018 | 2814.3 | 2680.101 | 134.1994 | 3576.305 | −762.005 | 2958.369 | −144.069 | 3595.452 | −781.152 | 3067.156 | −252.856 | 298.0357 | 2516.264 | 3413.731 | −599.431 | 3366.033 | −551.733 |
| 2019 | 3057.5 | 2973.554 | 83.94561 | 4151.389 | −1093.89 | 3316.842 | −259.342 | 4174.271 | −1116.77 | 3434.346 | −376.846 | 93.05441 | 2964.446 | 3892.534 | −835.034 | 3863.137 | −805.637 |
| 2020 | 3270.2 | 3299.14 | −28.9395 | 4818.947 | −1548.75 | 3711.806 | −441.606 | 4846.271 | −1576.07 | 3838.49 | −568.29 | 26.6897 | 3243.51 | 4433.394 | −1163.19 | 4429.442 | −1159.24 |
| Minimum Error | | | −24.9071 | | 361.4315 | | 144.0692 | | 372.5879 | | 230.0328 | | 456.3469 | | 356.862 | | 284.6495 |
| Maximum Error | | | 134.1994 | | 1548.747 | | 441.6064 | | 1576.071 | | 568.29 | | 3243.51 | | 1163.194 | | 1159.242 |
| Median Error | | | −26.9233 | | −726.096 | | −249.769 | | −743.673 | | −346.384 | | 2072.833 | | −599.266 | | −545.312 |

| Year | Raw Data | FDGM | Error | FNDGM | Error | NIPGM | Error | NIPNGM | Error | NIPDGM | Error | NIPNDGM | Error | FNGBM | Error | NIPNGBM | Error |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2015 | 1925 | 2283.078 | −358.078 | 2213.542 | −288.542 | 2296.67 | −371.67 | 2658.937 | −733.937 | 2246.545 | −321.545 | 2434.458 | −509.458 | 2241.281 | −316.281 | 2206.928 | −281.928 |
| 2016 | 2080.5 | 2612.751 | −532.251 | 2489.094 | −408.594 | 2636.458 | −555.958 | 3135.378 | −1054.88 | 2559.518 | −479.018 | 2871.576 | −791.076 | 2498.838 | −418.338 | 2451.233 | −370.733 |
| 2017 | 2390.7 | 2984.317 | −593.617 | 2783.428 | −392.728 | 3020.962 | −630.262 | 3706.396 | −1315.7 | 2909.279 | −518.579 | 3398.486 | −1007.79 | 2759.741 | −369.041 | 2694.717 | −304.017 |
| 2018 | 2814.3 | 3403.184 | −588.884 | 3096.616 | −282.316 | 3456.065 | −641.765 | 4393.209 | −1578.91 | 3300.153 | −485.853 | 4036.041 | −1221.74 | 3019.185 | −204.885 | 2932.474 | −118.174 |
| 2019 | 3057.5 | 3875.457 | −817.957 | 3428.723 | −371.223 | 3948.427 | −890.927 | 5221.795 | −2164.3 | 3736.97 | −679.47 | 4809.935 | −1752.44 | 3272.5 | −215 | 3160.051 | −102.551 |
| 2020 | 3270.2 | 4408.021 | −1137.82 | 3779.811 | −509.611 | 4505.582 | −1235.38 | 6223.96 | −2953.76 | 4225.133 | −954.933 | 5751.83 | −2481.63 | 3515.492 | −245.292 | 3373.759 | −103.559 |
| Minimum Error | | | 358.078 | | 282.3156 | | 371.6699 | | 733.9373 | | 321.5452 | | 509.4581 | | 204.8847 | | 102.5506 |
| Maximum Error | | | 1137.821 | | 509.6112 | | 1235.382 | | 2953.76 | | 954.933 | | 2481.63 | | 418.3375 | | 370.7326 |
| Median Error | | | −591.251 | | −381.975 | | −636.013 | | −1447.3 | | −502.216 | | −1114.76 | | −280.787 | | −200.051 |
Table 5. Optimal hyperparameters and training times of the models in Case II.
| Model | Abbreviation | Optimal Hyperparameters | Training Time (s) |
|---|---|---|---|
| Grey Wavelet Support Vector Regressor | GWSVR | C = 48.9141, ν = 0.2907 | 0.78489 |
| Nonhomogeneous Grey Bernoulli Model | NGBM | n = 4.2162 | 0.03195 |
| Fractional-order Grey Model | FGM | r = 0.1505 | 0.08470 |
| Fractional-order Nonhomogeneous Grey Model | FNGM | r = 1.0902 | 0.08795 |
| Fractional-order Discrete Grey Model | FDGM | r = 0.1354 | 0.08445 |
| Fractional-order Nonhomogeneous Discrete Grey Model | FNDGM | r = 0.7610 | 0.22519 |
| New Information Priority Grey Model | NIPGM | r = 0.8493 | 0.04955 |
| New Information Priority Nonhomogeneous Grey Model | NIPNGM | r = 0.9839 | 0.05257 |
| New Information Priority Discrete Grey Model | NIPDGM | r = 0.8481 | 0.04881 |
| New Information Priority Nonhomogeneous Discrete Grey Model | NIPNDGM | r = 0.9030 | 0.18856 |
| Nonlinear Grey Bernoulli Model with Fractional-order Accumulation | FNGBM | n = 2.0000, r = −1.0985 | 0.77943 |
| New Information Priority Nonlinear Grey Bernoulli Model | NIPNGBM | n = 2.0000, r = 0.0000 | 0.42851 |
Table 6. Out-of-sample forecasting metrics of all models in Case II.
| Metric | GWSVR | GM | NGM | DGM | NDGM | NGBM | FGM | FNGM | FDGM | FNDGM | NIPGM | NIPNGM | NIPDGM | NIPNDGM | FNGBM | NIPNGBM |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| MAE | 70.73226 | 819.5432 | 257.3194 | 837.7925 | 360.0807 | 1867.335 | 715.0994 | 829.0862 | 693.1967 | 1083.294 | 625.0661 | 933.0845 | 626.1443 | 992.5079 | 245.9249 | 212.6661 |
| MAPE | 2.757125 | 30.10073 | 9.851978 | 30.78744 | 13.84979 | 66.5773 | 26.7028 | 30.57583 | 25.96224 | 39.7818 | 23.46729 | 34.39759 | 23.52068 | 36.61266 | 9.073488 | 9.362476 |
| MSE | 6665.947 | 803,908.8 | 73,997.96 | 838,135.1 | 139,421.4 | 4,511,417 | 579,693.9 | 809,827.3 | 540,108.1 | 1,399,142 | 436,994.9 | 1,025,289 | 437,770.2 | 1,156,709 | 80,026.38 | 59,962.91 |
| RMSE | 81.64525 | 896.6096 | 272.0257 | 915.4972 | 373.3918 | 2124.01 | 761.3763 | 899.9041 | 734.9205 | 1182.853 | 661.0559 | 1012.566 | 661.6421 | 1075.504 | 282.8893 | 244.8732 |
| U1 | 0.015469 | 0.145006 | 0.048833 | 0.147606 | 0.065807 | 0.599254 | 0.125689 | 0.145397 | 0.12183 | 0.182914 | 0.110927 | 0.160705 | 0.111012 | 0.16901 | 0.054837 | 0.044542 |
| U2 | 0.030774 | 0.337951 | 0.102532 | 0.34507 | 0.140739 | 0.800584 | 0.286979 | 0.339193 | 0.277007 | 0.445842 | 0.249166 | 0.381657 | 0.249387 | 0.40538 | 0.106627 | 0.092298 |
| MedAe | 65.08349 | 718.5948 | 264.3201 | 736.1083 | 358.6198 | 1922.922 | 639.5297 | 735.5087 | 620.9857 | 963.1634 | 558.1595 | 831.6983 | 559.5865 | 886.4476 | 221.2416 | 207.2592 |
| IA | 0.992839 | 0.695052 | 0.94051 | 0.688252 | 0.896817 | 0.213942 | 0.7358 | 0.691661 | 0.744624 | 0.605304 | 0.774006 | 0.654143 | 0.773404 | 0.633879 | 0.866851 | 0.932628 |
| R2 | 0.974481 | −2.07761 | 0.716713 | −2.20864 | 0.466251 | −16.2711 | −1.21925 | −2.10027 | −1.0677 | −4.35635 | −0.67295 | −2.92513 | −0.67592 | −3.42824 | 0.693634 | 0.770443 |
| Pbias | 0.00705 | −0.23943 | −0.08995 | −0.24346 | −0.12151 | 2.536972 | −0.21549 | −0.24154 | −0.21028 | −0.29384 | −0.19361 | −0.26385 | −0.19388 | −0.27601 | 0.044384 | −0.07552 |
Table 7. Detailed results of the out-of-sample forecasting of all models in Case II.
| Year | Raw Data | GWSVR | Error | GM | Error | NGM | Error | DGM | Error | NDGM | Error | NGBM | Error | FGM | Error | FNGM | Error |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2015 | 1931.8 | 1964.998 | −33.1984 | 2284.76 | −352.96 | 2078.055 | −146.255 | 2295.869 | −364.069 | 2161.222 | −229.422 | 1445.775 | 486.0255 | 2298.388 | −366.588 | 2304.295 | −372.495 |
| 2016 | 2078.1 | 2179.253 | −101.153 | 2651.476 | −573.376 | 2349.998 | −271.898 | 2664.793 | −586.693 | 2440.45 | −362.35 | 1266.439 | 811.6608 | 2639.455 | −561.355 | 2673.058 | −594.958 |
| 2017 | 2393.7 | 2416.87 | −23.1698 | 3077.053 | −683.353 | 2650.442 | −256.742 | 3093 | −699.3 | 2748.59 | −354.89 | 876.4395 | 1517.261 | 3025.939 | −632.239 | 3097.102 | −703.402 |
| 2018 | 2817.1 | 2680.395 | 136.705 | 3570.937 | −753.837 | 2982.374 | −165.274 | 3590.016 | −772.916 | 3088.635 | −271.535 | 488.517 | 2328.583 | 3463.921 | −646.821 | 3584.716 | −767.616 |
| 2019 | 3059.7 | 2972.654 | 87.04622 | 4144.092 | −1084.39 | 3349.095 | −289.395 | 4166.898 | −1107.2 | 3463.889 | −404.189 | 234.8974 | 2824.803 | 3960.299 | −900.599 | 4145.432 | −1085.73 |
| 2020 | 3339.9 | 3296.779 | 43.12077 | 4809.242 | −1469.34 | 3754.251 | −414.351 | 4836.479 | −1496.58 | 3877.998 | −538.098 | 104.2238 | 3235.676 | 4522.894 | −1182.99 | 4790.215 | −1450.32 |
| Minimum Error | | | −23.1698 | | −352.96 | | −146.255 | | −364.069 | | −229.422 | | 486.0255 | | −366.588 | | −372.495 |
| Maximum Error | | | 136.705 | | −1469.34 | | −414.351 | | −1496.58 | | −538.098 | | 3235.676 | | −1182.99 | | −1450.32 |
| Median Error | | | 9.975466 | | −718.595 | | −264.32 | | −736.108 | | −358.62 | | 1922.922 | | −639.53 | | −735.509 |

| Year | Raw Data | FDGM | Error | FNDGM | Error | NIPGM | Error | NIPNGM | Error | NIPDGM | Error | NIPNDGM | Error | FNGBM | Error | NIPNGBM | Error |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2015 | 1931.8 | 2298.021 | −366.221 | 2408.667 | −476.867 | 2266.683 | −334.883 | 2354.288 | −422.488 | 2269.313 | −337.513 | 2388.746 | −456.946 | 2122.463 | −190.663 | 2218.179 | −286.379 |
| 2016 | 2078.1 | 2633.393 | −555.293 | 2814.976 | −736.876 | 2591.951 | −513.851 | 2737.896 | −659.796 | 2594.262 | −516.162 | 2778.55 | −700.45 | 2269.559 | −191.459 | 2462.651 | −384.551 |
| 2017 | 2393.7 | 3012.31 | −618.61 | 3290.358 | −896.658 | 2957.806 | −564.106 | 3181.093 | −787.393 | 2959.606 | −565.906 | 3230.151 | −836.451 | 2417.44 | −23.7398 | 2706.616 | −312.916 |
| 2018 | 2817.1 | 3440.461 | −623.361 | 3846.769 | −1029.67 | 3369.313 | −552.213 | 3693.104 | −876.004 | 3370.367 | −553.267 | 3753.544 | −936.444 | 2566.076 | 251.0242 | 2945.239 | −128.139 |
| 2019 | 3059.7 | 3924.275 | −864.575 | 4498.21 | −1438.51 | 3832.167 | −772.467 | 4284.585 | −1224.88 | 3832.191 | −772.491 | 4360.338 | −1300.64 | 2715.438 | 344.262 | 3174.102 | −114.402 |
| 2020 | 3339.9 | 4471.02 | −1131.12 | 5261.082 | −1921.18 | 4352.777 | −1012.88 | 4967.842 | −1627.94 | 4351.427 | −1011.53 | 5064.019 | −1724.12 | 2865.499 | 474.4009 | 3389.509 | −49.6092 |
| Minimum Error | | | −366.221 | | −476.867 | | −334.883 | | −422.488 | | −337.513 | | −456.946 | | −23.7398 | | −49.6092 |
| Maximum Error | | | −1131.12 | | −1921.18 | | −1012.88 | | −1627.94 | | −1011.53 | | −1724.12 | | 474.4009 | | −384.551 |
| Median Error | | | −620.986 | | −963.163 | | −558.16 | | −831.698 | | −559.586 | | −886.448 | | 113.6422 | | −207.259 |

Share and Cite

Ma, X.; Deng, Y.; Yuan, H. Forecasting the Natural Gas Supply and Consumption in China Using a Novel Grey Wavelet Support Vector Regressor. Systems 2023, 11, 428. https://doi.org/10.3390/systems11080428
