Article

Relative Error Linear Combination Forecasting Model Based on Uncertainty Theory

Hongmei Shi, Lin Wei, Cui Wang, Shuai Wang and Yufu Ning
1 School of Information Science and Engineering, Shandong Agriculture and Engineering University, Jinan 250100, China
2 School of Information Engineering, Shandong Youth University of Political Science, Jinan 250103, China
3 New Technology Research and Development Center of Intelligent Information Controlling, Universities of Shandong, Jinan 250103, China
4 Smart Healthcare Big Data Engineering and Ubiquitous Computing Characteristic Laboratory, Universities of Shandong, Jinan 250103, China
* Author to whom correspondence should be addressed.
Symmetry 2023, 15(7), 1379; https://doi.org/10.3390/sym15071379
Submission received: 7 June 2023 / Revised: 29 June 2023 / Accepted: 4 July 2023 / Published: 7 July 2023
(This article belongs to the Special Issue Fuzzy Set Theory and Uncertainty Theory—Volume II)

Abstract:
The traditional combination forecasting model has a good forecasting effect, but it needs precise historical data. In fact, many random events are uncertain, much of the available data are imprecise, and sometimes historical data are lacking, so combination forecasting problems need to be studied by means of uncertainty theory. Uncertain least squares estimation is an important technique of uncertain statistics, an important way to deal with imprecise data, and one of the best methods for solving the unknown parameters of uncertain linear regression equations. On the basis of the traditional combination forecasting method and uncertain least squares estimation, this paper proposes two kinds of uncertain combination forecasting models: the unary uncertain linear combination forecasting model and the uncertain relative error combination forecasting model. We set up several piecewise linear regression models according to the data of different periods and, according to certain weights, combine these piecewise linear regression models into a unary uncertain linear combination forecasting model with a better forecasting effect. The uncertain relative error combination forecasting model is a new forecasting model that combines the traditional relative error linear forecasting model with uncertain least squares estimation. Compared with the traditional forecasting model, it can better deal with the forecasting of imprecise data. We verify the feasibility of the uncertain combination forecasting models through a numerical example. According to the data analysis, the forecasting effect of the proposed models is better than that of the existing models.

1. Introduction

Regression analysis is a forecasting method for data analysis based on the causal relationship between changes in things; that is, according to the actual statistical data, the interdependent quantitative relationship between variables is determined through mathematical calculation, and a reasonable mathematical model is established to calculate the future values of variables. Linear regression is a widely used statistical analysis method that uses regression analysis in mathematical statistics to determine the interdependent quantitative relationship between two or more variables. Linear regression analysis is mainly used to analyze the observed values and fit a reasonable model; when a new value appears, it can be forecast using this model. The least squares method is a mathematical optimization technique and one of the most commonly used methods for solving the unknown parameters of linear regression. By minimizing the sum of squared errors, it finds the best-matching function for the data and yields a well-fitting linear regression equation. The combination forecasting model applies different single-item forecasting models to the same forecasting object, making full use of the information provided by the various single-item forecasting methods and assigning appropriate weighting coefficients to improve the forecasting accuracy. There are many kinds of combination forecasting models, including the linear regression model, exponential model, power function model, logistic model, and neural network. Each model has its own characteristics and scope of application. The idea of combining various models to achieve a better forecasting effect is the basis of combination forecasting. Many experts and scholars have conducted in-depth research on the linear combination forecasting model, derived forecasting models, achieved good results, and carried out practical applications [1,2,3,4,5]. We know that precision and imprecision are symmetrical: precise data are relative, and imprecise data are absolute. Many observed data are imprecise; in other words, in practice, an observation is often not a definite value and may only be given as an approximate range. In this case, the traditional combination forecasting model cannot solve these problems, but the uncertainty theory proposed by Liu [6] can.
The relation between certainty and uncertainty is symmetrical, and any random event involves uncertainty. We need to study these problems by means of uncertainty theory. Liu [7] founded uncertainty theory and gradually improved it [6,8,9,10]. Uncertainty theory is a branch of mathematics concerned with the analysis of degrees of belief. Its main concepts include the uncertain measure, uncertain variable, uncertainty distribution, uncertain inverse operation, and expected value. Uncertainty theory has become an important branch of axiomatic mathematics for dealing with uncertainty problems in reality. It has been widely used in uncertain programming, uncertain statistics, comprehensive evaluation, and production planning [11,12,13], has achieved fruitful results, and has attracted great attention. In 2010, Liu [6] began his research on uncertain statistics, which is a methodology for collecting and interpreting expert experience data through uncertainty theory. Uncertain statistics mainly includes uncertain regression equations, uncertain estimation, and uncertain hypothesis testing. Based on a keen interest in uncertain regression equations, many uncertain regression models have been proposed by experts and scholars [14,15,16,17,18]. Yao and Liu [19] proposed least squares estimation to solve the unknown parameters of the uncertain regression equation. Wang et al. [20,21,22] proposed new uncertain linear regression models. Shi et al. [23] proposed a total least squares estimation model based on uncertainty theory. Uncertain statistics also has real applications; when COVID-19 was spreading rapidly in most countries around the world, Liu Z. [24] proposed an uncertain growth model for the cumulative number of COVID-19 infections in China.
It is not easy to build a scientific forecasting model, because whether the forecasting model is scientific depends on the accuracy of the forecasting results on the one hand, and on the simplicity of the model itself on the other. However, these two aspects are contradictory: when the model is simple, the forecasting results are often not very accurate; when the forecasting is relatively accurate, the model is not very simple. On the basis of previous research [3,4,5] and uncertainty theory, this paper puts forward two kinds of uncertain combination forecasting models: the unary uncertain linear combination forecasting model and the uncertain relative error combination forecasting model. In general, the newer the data information, the greater its impact on the model, but the historical data are also a factor affecting the accuracy of the model. According to the principle of minimum error, the unary uncertain linear combination forecasting model combines the piecewise linear regressions of the data corresponding to different periods into a forecasting model with higher accuracy. The uncertain relative error combination forecasting model is based on the least squares principle and relative error, combined with uncertainty theory, so it can better deal with the regression and forecasting of imprecise data. Both kinds of uncertain linear combination forecasting models can be used for imprecise data as well as precise data, and the forecasting effect of the models is very good.
In this paper, we propose the unary uncertain linear combination forecasting model and the uncertain relative error combination forecasting model. Both models can better solve the regression equation of imprecisely observed data and have a better forecasting effect. The main organizational structure is as follows. In Section 2, we introduce the uncertain regression model and uncertain least squares estimation. In Section 3, we propose the unary uncertain linear combination forecasting model, which establishes several piecewise linear regression models according to the data of different periods and combines them into an uncertain combination forecasting model. In Section 4, we propose the uncertain relative error combination forecasting model, a new model that combines relative error and uncertainty theory and has a good forecasting effect. In Section 5, the feasibility of the uncertain linear regression combination forecasting models is verified by a numerical example, and the forecasting effect of the models is good. Finally, we summarize the proposed models and point out future research directions.

2. Uncertain Regression Model

Certainty and uncertainty are symmetrical, and precision and imprecision are also symmetrical. In order to resolve uncertainty problems such as imprecise data, Liu [7] founded uncertainty theory. The main content of uncertainty theory includes the basic theory of uncertain variables, uncertain measures, and uncertainty distributions, as well as the calculation methods of the uncertain operational laws and the expected value. Readers interested in uncertainty theory and uncertain statistics are referred to Reference [10]. In this section, we mainly introduce the uncertain least squares estimation method for the uncertain regression equation.
Assume that $(x_1, x_2, \ldots, x_n)$ is a vector of independent variables and $y$ is a dependent variable. If there is a functional relationship between $(x_1, x_2, \ldots, x_n)$ and $y$, then $y$ can be expressed by a regression model
$y = f(x_1, x_2, \ldots, x_n \mid \beta) + \varepsilon,$
where $\beta$ is an unknown vector of parameters and $\varepsilon$ is an uncertain disturbance term. If the regression equation fits well, the expected value $E[\varepsilon]$ should be 0 [10].
In particular, Liu [10] calls
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_n x_n + \varepsilon$
a linear regression model.
Assume that we have a set of imprecisely observed data,
$(\tilde{x}_{i1}, \tilde{x}_{i2}, \ldots, \tilde{x}_{in}, \tilde{y}_i), \quad i = 1, 2, \ldots, m,$
where $\tilde{x}_{i1}, \tilde{x}_{i2}, \ldots, \tilde{x}_{in}, \tilde{y}_i$ are independent uncertain variables with regular uncertainty distributions $\Phi_{i1}, \Phi_{i2}, \ldots, \Phi_{in}, \Psi_i$, $i = 1, 2, \ldots, m$, respectively.
Yao and Liu [19] proposed the least squares estimate of the unknown parameter $\beta$ of the regression model. The parameter $\beta$ is the solution of the following minimization problem:
$\min_{\beta} \sum_{i=1}^{m} E\left[\left(\tilde{y}_i - f(\tilde{x}_{i1}, \tilde{x}_{i2}, \ldots, \tilde{x}_{in} \mid \beta)\right)^2\right].$
If the minimization solution is $\beta^*$, then the fitted regression equation is determined by $y = f(x_1, x_2, \ldots, x_n \mid \beta^*)$. Then, for each index $i$ ($i = 1, 2, \ldots, m$), the term
$\tilde{\varepsilon}_i = \tilde{y}_i - f(\tilde{x}_{i1}, \tilde{x}_{i2}, \ldots, \tilde{x}_{in} \mid \beta^*)$
is called the ith residual.
Let the disturbance term $\varepsilon$ be an uncertain variable; its expected value and variance can be estimated as
$\hat{e} = \frac{1}{m} \sum_{i=1}^{m} E[\tilde{\varepsilon}_i]$
and
$\hat{\sigma}^2 = \frac{1}{m} \sum_{i=1}^{m} E\left[(\tilde{\varepsilon}_i - \hat{e})^2\right],$
where $\tilde{\varepsilon}_i$ is the $i$th residual, $i = 1, 2, \ldots, m$ [25].
Let $(x_1, x_2, \ldots, x_n)$ be a new vector of independent variables; the forecast uncertain variable of the dependent variable $y$ is
$\hat{y} = f(x_1, x_2, \ldots, x_n \mid \beta^*) + \hat{\varepsilon}, \quad \hat{\varepsilon} \sim \mathcal{N}(\hat{e}, \hat{\sigma}).$
Lio and Liu [25] suggested that the forecast value be defined as the expected value of the uncertain variable $\hat{y}$, i.e.,
$\hat{u} = f(x_1, x_2, \ldots, x_n \mid \beta^*) + \hat{e}.$
Taking $\alpha$ (e.g., 95%) as the confidence level, the confidence interval of the dependent variable $y$ is
$\hat{u} \pm \frac{\hat{\sigma}\sqrt{3}}{\pi} \ln\frac{1+\alpha}{1-\alpha}.$
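To make the last two formulas concrete, here is a minimal Python sketch that computes the forecast value $\hat{u}$ and its confidence interval from an already fitted value $f(x \mid \beta^*)$ and already estimated disturbance statistics $\hat{e}$ and $\hat{\sigma}^2$; the function and argument names are illustrative assumptions, not from the paper.

```python
import math

def forecast_with_interval(fitted_value, e_hat, sigma2_hat, alpha=0.95):
    """Forecast u_hat = f(x | beta*) + e_hat and its alpha-confidence interval."""
    u_hat = fitted_value + e_hat
    sigma_hat = math.sqrt(sigma2_hat)
    # Half-width: sigma_hat * sqrt(3) / pi * ln((1 + alpha) / (1 - alpha))
    half_width = sigma_hat * math.sqrt(3.0) / math.pi * math.log((1.0 + alpha) / (1.0 - alpha))
    return u_hat, (u_hat - half_width, u_hat + half_width)
```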

3. Unary Uncertain Linear Regression Combination Forecasting Model

In this section, we derive the unary uncertain linear combination forecasting model, which is abbreviated as UULCFM. We all know that recently updated data have a greater effect on the forecasting model, so when establishing the forecasting model scientifically, we should fully consider changes in time and conditions. The data of different periods have different influences on the model, and recent data are obviously more valuable than long-term historical data. The idea of the UULCFM is to establish m regression models by discarding different amounts of earlier historical data from the existing data, and then to combine the m regression models into a single forecasting model according to the principle of minimum error.
Assume that $(\tilde{x}_i, \tilde{y}_i)$ ($i = 1, 2, \ldots, n$) is a set of imprecise data, where $\tilde{x}_i, \tilde{y}_i$ are independent uncertain variables with regular uncertainty distributions $\Phi_i, \Psi_i$ ($i = 1, 2, \ldots, n$), respectively. We assume that there is a linear relationship between $\tilde{x}_i$ and $\tilde{y}_i$ ($i = 1, 2, \ldots, n$), so that $y$ can be expressed by an uncertain regression model $y = \alpha + \beta x + \varepsilon$, where $\alpha, \beta$ are unknown parameters and $\varepsilon$ is an uncertain disturbance term.
The main steps of the UULCFM are as follows.
Step 1.
For the original $n$ sets of data, we obtain the following unary linear regression model using uncertain least squares estimation:
$y_1 = \alpha_1 + \beta_1 x,$
where $\alpha_1$ and $\beta_1$ are unknown parameters.
Step 2.
Discarding the first $N_1$ sets of data, we can obtain the following unary linear regression model for the remaining $n - N_1$ sets of data through uncertain least squares estimation:
$y_2 = \alpha_2 + \beta_2 x,$
where $\alpha_2$ and $\beta_2$ are unknown parameters, and $N_1$ is a positive integer with $N_1 < n$.
Step 3.
Discarding the first $N_2$ sets of data, we can obtain the following unary linear regression model for the remaining $n - N_2$ sets of data through uncertain least squares estimation:
$y_3 = \alpha_3 + \beta_3 x,$
where $\alpha_3$ and $\beta_3$ are unknown parameters. Both $N_1, N_2$ are positive integers, and $N_1 < N_2 < n$.
By analogy, we can obtain the mth unary linear regression equation.
Step m. Discarding the first $N_{m-1}$ sets of data, we can obtain the following unary linear regression model for the remaining $n - N_{m-1}$ sets of data through uncertain least squares estimation:
$y_m = \alpha_m + \beta_m x,$
where $\alpha_m$ and $\beta_m$ are unknown parameters, $N_1, N_2, \ldots, N_{m-1}$ are positive integers, and $N_1 < N_2 < \cdots < N_{m-1} < n$.
In this way, $m$ unary linear regression models are obtained as follows:
$y_i = \alpha_i + \beta_i x, \quad i = 1, 2, 3, \ldots, m,$
where $\alpha_i, \beta_i$ are unknown parameters.
Each regression equation of Equation (5) is fitted to the remaining $n - N_{m-1}$ sets of data, and the resulting errors are, respectively,
$\varepsilon_{ij} = y_{ij} - \tilde{y}_j, \quad i = 1, 2, 3, \ldots, m, \; j = N_m, N_m+1, \ldots, n.$
Since $\tilde{y}_j$, $j = 1, 2, \ldots, n$, are imprecise data, Equation (6) is transformed into the following form according to the uncertain expected value formula [10]:
$\varepsilon_{ij} = y_{ij} - E[\tilde{y}_j] = y_{ij} - \int_0^1 \Psi_j^{-1}(\alpha)\,\mathrm{d}\alpha, \quad i = 1, 2, 3, \ldots, m, \; j = N_m, N_m+1, \ldots, n.$
The purpose is to find $m$ numbers $k_1, k_2, k_3, \ldots, k_m$ satisfying $k_1 + k_2 + k_3 + \cdots + k_m = 1$. Then, we construct the composite model
$y = \sum_{i=1}^{m} k_i y_i,$
whose weights minimize the sum of squares $R = \sum_{j=N_m}^{n} \varepsilon_j^2$ of the combined errors $\varepsilon_j = y_j - E[\tilde{y}_j]$, $j = N_m, N_m+1, \ldots, n$. This model is called the linear regression combination model.
We know from the above derivation
$\varepsilon_j = y_j - E[\tilde{y}_j] = \sum_{i=1}^{m} k_i y_{ij} - \sum_{i=1}^{m} k_i E[\tilde{y}_j] = \sum_{i=1}^{m} k_i \left( y_{ij} - E[\tilde{y}_j] \right) = \sum_{i=1}^{m} k_i \varepsilon_{ij} = [\varepsilon_{1j}, \varepsilon_{2j}, \ldots, \varepsilon_{mj}]\,[k_1, k_2, k_3, \ldots, k_m]^T = [k_1, k_2, k_3, \ldots, k_m]\,[\varepsilon_{1j}, \varepsilon_{2j}, \ldots, \varepsilon_{mj}]^T.$
So,
$\varepsilon_j^2 = [k_1, k_2, \ldots, k_m]\,[\varepsilon_{1j}, \varepsilon_{2j}, \ldots, \varepsilon_{mj}]^T\,[\varepsilon_{1j}, \varepsilon_{2j}, \ldots, \varepsilon_{mj}]\,[k_1, k_2, \ldots, k_m]^T = [k_1, k_2, \ldots, k_m] \begin{pmatrix} \varepsilon_{1j}^2 & \varepsilon_{1j}\varepsilon_{2j} & \cdots & \varepsilon_{1j}\varepsilon_{mj} \\ \varepsilon_{1j}\varepsilon_{2j} & \varepsilon_{2j}^2 & \cdots & \varepsilon_{2j}\varepsilon_{mj} \\ \vdots & \vdots & \ddots & \vdots \\ \varepsilon_{1j}\varepsilon_{mj} & \varepsilon_{2j}\varepsilon_{mj} & \cdots & \varepsilon_{mj}^2 \end{pmatrix} [k_1, k_2, \ldots, k_m]^T.$
Therefore, we obtain
$R = \sum_{j=N_m}^{n} \varepsilon_j^2 = [k_1, k_2, \ldots, k_m] \begin{pmatrix} \sum_{j=N_m}^{n}\varepsilon_{1j}^2 & \sum_{j=N_m}^{n}\varepsilon_{1j}\varepsilon_{2j} & \cdots & \sum_{j=N_m}^{n}\varepsilon_{1j}\varepsilon_{mj} \\ \sum_{j=N_m}^{n}\varepsilon_{1j}\varepsilon_{2j} & \sum_{j=N_m}^{n}\varepsilon_{2j}^2 & \cdots & \sum_{j=N_m}^{n}\varepsilon_{2j}\varepsilon_{mj} \\ \vdots & \vdots & \ddots & \vdots \\ \sum_{j=N_m}^{n}\varepsilon_{1j}\varepsilon_{mj} & \sum_{j=N_m}^{n}\varepsilon_{2j}\varepsilon_{mj} & \cdots & \sum_{j=N_m}^{n}\varepsilon_{mj}^2 \end{pmatrix} [k_1, k_2, \ldots, k_m]^T.$
Denote
$K = [k_1, k_2, k_3, \ldots, k_m], \quad E = \begin{pmatrix} \sum_{j=N_m}^{n}\varepsilon_{1j}^2 & \sum_{j=N_m}^{n}\varepsilon_{1j}\varepsilon_{2j} & \cdots & \sum_{j=N_m}^{n}\varepsilon_{1j}\varepsilon_{mj} \\ \sum_{j=N_m}^{n}\varepsilon_{1j}\varepsilon_{2j} & \sum_{j=N_m}^{n}\varepsilon_{2j}^2 & \cdots & \sum_{j=N_m}^{n}\varepsilon_{2j}\varepsilon_{mj} \\ \vdots & \vdots & \ddots & \vdots \\ \sum_{j=N_m}^{n}\varepsilon_{1j}\varepsilon_{mj} & \sum_{j=N_m}^{n}\varepsilon_{2j}\varepsilon_{mj} & \cdots & \sum_{j=N_m}^{n}\varepsilon_{mj}^2 \end{pmatrix}.$
Then,
$R = K E K^T.$
Assume that
$U = [1, 1, \ldots, 1]^T.$
So, the linear regression combination model becomes the problem of finding the minimum value of $R = K E K^T$ under the constraint $K U = 1$.
We use the Lagrange multiplier method to solve this constrained extremum problem and construct the Lagrange function
$L = K E K^T + \lambda (K U - 1).$
The Lagrange function $L$ is an elementary function of $K$ and $\lambda$, and its minimum point is a stationary point. Taking the first partial derivative of $L$ with respect to $K$, we obtain
$\frac{\partial L}{\partial K} = 2 E K^T + \lambda U = 0.$
Solving Equation (26), we obtain
$K^T = -\frac{1}{2} \lambda E^{-1} U.$
According to the constraint $K U = 1$, we can solve Equation (27) and obtain
$K^T = \frac{E^{-1} U}{U^T E^{-1} U}, \quad \lambda = -\frac{2}{U^T E^{-1} U}.$
In this way, the $m$ numbers $k_1, k_2, k_3, \ldots, k_m$ satisfying $k_1 + k_2 + k_3 + \cdots + k_m = 1$ are determined, and the linear regression combination model $y = \sum_{i=1}^{m} k_i y_i$ is obtained.
The derivation of the UULCFM involves the matrix inverse and elementary matrix transformations, so readers need some background in matrices and linear algebra.
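To illustrate the closed-form weights $K^T = E^{-1}U/(U^T E^{-1}U)$ numerically, the following Python sketch computes the combination weights from an already assembled error matrix $E$ and combines the piecewise-model forecasts; the function names are illustrative assumptions, not from the paper.

```python
import numpy as np

def combination_weights(E: np.ndarray) -> np.ndarray:
    """Weights k (summing to 1) that minimize R = K E K^T for a given error matrix E."""
    m = E.shape[0]
    U = np.ones(m)
    EinvU = np.linalg.solve(E, U)   # E^{-1} U without forming the inverse explicitly
    return EinvU / (U @ EinvU)      # normalize so that k_1 + ... + k_m = 1

def combined_forecast(K: np.ndarray, model_forecasts: np.ndarray) -> float:
    """Combine the m piecewise-model forecasts y_1, ..., y_m as y = sum_i k_i y_i."""
    return float(K @ model_forecasts)
```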

4. Uncertain Relative Error Linear Combination Forecasting Model

In this section, we derive the uncertain relative error linear combination forecasting model, which is abbreviated as URELCFM. Suppose that we have a set of imprecise data $\tilde{X} = (\tilde{x}_1, \tilde{x}_2, \ldots, \tilde{x}_n)^T$, where $\tilde{x}_1, \tilde{x}_2, \ldots, \tilde{x}_n$ are independent uncertain variables with regular uncertainty distributions $\Phi_1, \Phi_2, \ldots, \Phi_n$, respectively.
The basic principle of the URELCFM is as follows. The forecasting result of the $i$th ($i = 1, 2, \ldots, m$) forecasting method is $X_i = (x_{1i}, x_{2i}, \ldots, x_{ni})^T$. The linear combination of the $m$ forecasting results is
$Y = (y_1, y_2, \ldots, y_n)^T = \omega_1 X_1 + \omega_2 X_2 + \cdots + \omega_m X_m.$
The relative errors between the forecast values and the original data can be defined as
$E = (e_1, e_2, \ldots, e_n)^T,$
where
$e_j = \frac{|y_j - \tilde{x}_j|}{|\tilde{x}_j|} = \frac{\left|\sum_{i=1}^{m} \omega_i x_{ji} - \tilde{x}_j\right|}{|\tilde{x}_j|} = \left|\sum_{i=1}^{m} \omega_i \frac{x_{ji}}{\tilde{x}_j} - 1\right|, \quad j = 1, 2, \ldots, n.$
Since $(\tilde{x}_1, \tilde{x}_2, \ldots, \tilde{x}_n)^T$ are imprecise data, we have to evaluate Equation (31) by means of the uncertain expected value [10]:
$e_j = \left|\sum_{i=1}^{m} \omega_i E\!\left[\frac{x_{ji}}{\tilde{x}_j}\right] - 1\right| = \left|\sum_{i=1}^{m} \omega_i x_{ji} \int_0^1 \frac{1}{\Phi_j^{-1}(1-\alpha)}\,\mathrm{d}\alpha - 1\right|, \quad j = 1, 2, \ldots, n.$
The uncertain relative error linear combination forecasting model I (URELCFM I) with the minimum sum of squares of relative errors is
$\min Q = \sum_{j=1}^{n} e_j^2,$
and the constraint of the model is
$\sum_{i=1}^{m} \omega_i = 1.$
Denote
$Y_i = (y_{1i}, y_{2i}, \ldots, y_{ni})^T = \left(E\!\left[\frac{x_{1i}}{\tilde{x}_1}\right] - 1, \; E\!\left[\frac{x_{2i}}{\tilde{x}_2}\right] - 1, \; \ldots, \; E\!\left[\frac{x_{ni}}{\tilde{x}_n}\right] - 1\right)^T = \left(x_{1i}\int_0^1 \frac{\mathrm{d}\alpha}{\Phi_1^{-1}(1-\alpha)} - 1, \; x_{2i}\int_0^1 \frac{\mathrm{d}\alpha}{\Phi_2^{-1}(1-\alpha)} - 1, \; \ldots, \; x_{ni}\int_0^1 \frac{\mathrm{d}\alpha}{\Phi_n^{-1}(1-\alpha)} - 1\right)^T,$
and
$R = [1, 1, \ldots, 1]^T, \quad W = [\omega_1, \omega_2, \ldots, \omega_m]^T, \quad Y = \begin{pmatrix} Y_1^T Y_1 & Y_1^T Y_2 & Y_1^T Y_3 & \cdots & Y_1^T Y_m \\ Y_2^T Y_1 & Y_2^T Y_2 & Y_2^T Y_3 & \cdots & Y_2^T Y_m \\ Y_3^T Y_1 & Y_3^T Y_2 & Y_3^T Y_3 & \cdots & Y_3^T Y_m \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ Y_m^T Y_1 & Y_m^T Y_2 & Y_m^T Y_3 & \cdots & Y_m^T Y_m \end{pmatrix}.$
The sum of squares of relative errors is
$Q = \sum_{j=1}^{n} e_j^2 = \sum_{j=1}^{n}\left(\sum_{i=1}^{m}\omega_i E\!\left[\frac{x_{ji}}{\tilde{x}_j}\right] - 1\right)^2 = \sum_{j=1}^{n}\left(\sum_{i=1}^{m}\omega_i E\!\left[\frac{x_{ji}}{\tilde{x}_j}\right] - \sum_{i=1}^{m}\omega_i\right)^2 = \sum_{j=1}^{n}\left[\sum_{i=1}^{m}\omega_i\left(E\!\left[\frac{x_{ji}}{\tilde{x}_j}\right] - 1\right)\right]^2 = \sum_{j=1}^{n}\left[\sum_{i=1}^{m}\omega_i y_{ji}\right]^2 = \sum_{j=1}^{n}\left[W^T (y_{j1}, y_{j2}, \ldots, y_{jm})^T\right]^2 = \sum_{j=1}^{n} W^T (y_{j1}, y_{j2}, \ldots, y_{jm})^T (y_{j1}, y_{j2}, \ldots, y_{jm})\, W = W^T Y W.$
Equation (33) is transformed into
$\min Q = W^T Y W,$
and the constraint is transformed into
$R^T W = 1.$
According to the Lagrange multiplier method, the optimal coefficient vector $W^*$ is
$W^* = \frac{Y^{-1} R}{R^T Y^{-1} R}.$
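The following Python sketch shows one way to evaluate URELCFM I numerically, under the assumption that the imprecise observations follow linear uncertainty distributions $\mathcal{L}(a_j, b_j)$ with $0 < a_j < b_j$ (so that $E[1/\tilde{x}_j]$ is finite); the integral is approximated with a midpoint rule, and all function and variable names are illustrative, not from the paper.

```python
import numpy as np

def expected_reciprocal_linear(a: float, b: float, grid: int = 10_000) -> float:
    """Approximate E[1/x~] = int_0^1 1 / Phi^{-1}(1 - alpha) d(alpha) for x~ ~ L(a, b)."""
    alphas = (np.arange(grid) + 0.5) / grid
    inv_cdf = a + (b - a) * (1.0 - alphas)   # Phi^{-1}(1 - alpha) for a linear uncertain variable
    return float(np.mean(1.0 / inv_cdf))

def urelcfm1_weights(forecasts: np.ndarray, intervals) -> np.ndarray:
    """forecasts[j, i] = x_{ji} (method i at point j); intervals = [(a_j, b_j)] of the L(a_j, b_j) data."""
    n, m = forecasts.shape
    recs = np.array([expected_reciprocal_linear(a, b) for a, b in intervals])  # E[1/x~_j]
    Yvals = forecasts * recs[:, None] - 1.0   # y_{ji} = x_{ji} E[1/x~_j] - 1
    Y = Yvals.T @ Yvals                       # Y_{ik} = sum_j y_{ji} y_{jk}
    R = np.ones(m)
    YinvR = np.linalg.solve(Y, R)
    return YinvR / (R @ YinvR)                # W* = Y^{-1} R / (R^T Y^{-1} R)
```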
The solution of Model I sometimes has a negative component, which does not achieve the expected effect of the linear combination forecasting model. In order to overcome this limitation of URELCFM I, we put forward an uncertain combination forecasting model that minimizes the sum of squares of relative errors under non-negative weights, namely, the uncertain relative error linear combination forecasting model II (URELCFM II):
$\min Q = \sum_{j=1}^{n} e_j^2,$
and the constraint of the model is
$\sum_{i=1}^{m} \omega_i = 1, \quad \omega_i \geq 0, \; i = 1, 2, \ldots, m.$
According to the derivation of URELCFM I, we can obtain
$\min Q = W^T Y W,$
and the constraint is transformed into
$R^T W = 1, \quad W \geq 0.$
URELCFM II is a quadratic convex programming problem and can be solved by the simplex-type algorithm for quadratic convex programming, which reduces the problem to a finite number of linear programming steps; it can also be solved with the MATLAB optimization toolbox.
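As an alternative to the simplex-type quadratic programming or MATLAB toolbox mentioned above, the non-negative, sum-to-one weight problem can also be handled with a general constrained solver; the sketch below uses SciPy's SLSQP method and assumes the matrix $Y$ from URELCFM I has already been built.

```python
import numpy as np
from scipy.optimize import minimize

def urelcfm2_weights(Y: np.ndarray) -> np.ndarray:
    """Minimize Q = W^T Y W subject to sum(W) = 1 and W >= 0."""
    m = Y.shape[0]
    w0 = np.full(m, 1.0 / m)   # start from equal weights
    result = minimize(
        fun=lambda w: float(w @ Y @ w),
        x0=w0,
        method="SLSQP",
        bounds=[(0.0, None)] * m,                                # non-negative weights
        constraints=[{"type": "eq", "fun": lambda w: np.sum(w) - 1.0}],
    )
    return result.x
```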
Both URELCFM I and URELCFM II require that the sum of the weighting coefficients is 1. In fact, this limitation is not necessary: the weights can also be negative, and the goal is simply to minimize the sum of squares of the combined forecasting errors. Although negative weights may be controversial, they are common from a mathematical perspective; for example, multiple regression often yields negative coefficients. By removing the limitation on the weighting coefficients, we obtain the uncertain relative error linear combination forecasting model III (URELCFM III), which minimizes the sum of squares of relative errors:
$\min Q = \sum_{j=1}^{n} e_j^2.$
Define
$Z_i = (z_{1i}, z_{2i}, \ldots, z_{ni})^T = \left(E\!\left[\frac{x_{1i}}{\tilde{x}_1}\right], \; E\!\left[\frac{x_{2i}}{\tilde{x}_2}\right], \; \ldots, \; E\!\left[\frac{x_{ni}}{\tilde{x}_n}\right]\right)^T = \left(x_{1i}\int_0^1 \frac{\mathrm{d}\alpha}{\Phi_1^{-1}(1-\alpha)}, \; x_{2i}\int_0^1 \frac{\mathrm{d}\alpha}{\Phi_2^{-1}(1-\alpha)}, \; \ldots, \; x_{ni}\int_0^1 \frac{\mathrm{d}\alpha}{\Phi_n^{-1}(1-\alpha)}\right)^T, \quad i = 1, 2, \ldots, m.$
The sum of squares of relative errors is
$Q = \sum_{j=1}^{n} e_j^2 = \sum_{j=1}^{n}\left(\sum_{i=1}^{m}\omega_i E\!\left[\frac{x_{ji}}{\tilde{x}_j}\right] - 1\right)^2 = \sum_{j=1}^{n}\left(\sum_{i=1}^{m}\omega_i z_{ji} - 1\right)^2.$
$Q$ in Equation (47) is an elementary function. In order to find the minimum value of $Q$, we take its partial derivative with respect to each $\omega_i$ ($i = 1, 2, \ldots, m$) and set it equal to 0, which yields the following system of equations:
$\begin{cases} \omega_1 \sum_{i=1}^{n} z_{i1}^2 + \omega_2 \sum_{i=1}^{n} z_{i1} z_{i2} + \omega_3 \sum_{i=1}^{n} z_{i1} z_{i3} + \cdots + \omega_m \sum_{i=1}^{n} z_{i1} z_{im} = \sum_{i=1}^{n} z_{i1}, \\ \omega_1 \sum_{i=1}^{n} z_{i2} z_{i1} + \omega_2 \sum_{i=1}^{n} z_{i2}^2 + \omega_3 \sum_{i=1}^{n} z_{i2} z_{i3} + \cdots + \omega_m \sum_{i=1}^{n} z_{i2} z_{im} = \sum_{i=1}^{n} z_{i2}, \\ \omega_1 \sum_{i=1}^{n} z_{i3} z_{i1} + \omega_2 \sum_{i=1}^{n} z_{i3} z_{i2} + \omega_3 \sum_{i=1}^{n} z_{i3}^2 + \cdots + \omega_m \sum_{i=1}^{n} z_{i3} z_{im} = \sum_{i=1}^{n} z_{i3}, \\ \quad \vdots \\ \omega_1 \sum_{i=1}^{n} z_{im} z_{i1} + \omega_2 \sum_{i=1}^{n} z_{im} z_{i2} + \omega_3 \sum_{i=1}^{n} z_{im} z_{i3} + \cdots + \omega_m \sum_{i=1}^{n} z_{im}^2 = \sum_{i=1}^{n} z_{im}. \end{cases}$
Denote
$Z = \begin{pmatrix} Z_1^T Z_1 & Z_1^T Z_2 & Z_1^T Z_3 & \cdots & Z_1^T Z_m \\ Z_2^T Z_1 & Z_2^T Z_2 & Z_2^T Z_3 & \cdots & Z_2^T Z_m \\ Z_3^T Z_1 & Z_3^T Z_2 & Z_3^T Z_3 & \cdots & Z_3^T Z_m \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ Z_m^T Z_1 & Z_m^T Z_2 & Z_m^T Z_3 & \cdots & Z_m^T Z_m \end{pmatrix}, \quad M = \begin{pmatrix} Z_1^T R \\ Z_2^T R \\ Z_3^T R \\ \vdots \\ Z_m^T R \end{pmatrix}.$
Expressing the above system of equations in matrix form, we obtain
$Z W = M.$
The matrix Z is invertible and the solution W is
$W = Z^{-1} M.$
URELCFM I has more constraints than URELCFM III, so the fitting accuracy of URELCFM I is lower than that of URELCFM III, while URELCFM II has more constraints than URELCFM I. Therefore, URELCFM III has the highest accuracy; that is, the sums of squares of relative errors satisfy $Q_{III} \leq Q_{I} \leq Q_{II}$.
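Since URELCFM III is unconstrained, its weights follow directly from the normal equations; the Python sketch below solves $ZW = M$, assuming the ratios $z_{ji} = x_{ji} E[1/\tilde{x}_j]$ have been computed as in the URELCFM I sketch above (names are illustrative, not from the paper).

```python
import numpy as np

def urelcfm3_weights(Zvals: np.ndarray) -> np.ndarray:
    """Zvals[j, i] = z_{ji}; returns W = Z^{-1} M with Z = Zvals^T Zvals and M = Zvals^T 1."""
    n, m = Zvals.shape
    Z = Zvals.T @ Zvals            # Z_{ik} = sum_j z_{ji} z_{jk}
    M = Zvals.T @ np.ones(n)       # M_i = sum_j z_{ji} = Z_i^T R
    return np.linalg.solve(Z, M)   # weights are unconstrained in Model III
```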

5. Numerical Example

To verify the feasibility and effectiveness of the models proposed in this paper, we provide a numerical example with imprecise data. Moreover, we followed the numerical analysis method for the disturbance term in Reference [25], calculated the expected value and variance of the disturbance term, made a forecast, and solved for the confidence interval. The numerical results show that the models proposed in this paper lead to better forecasting.
Assume that $(\tilde{x}_i, \tilde{y}_i)$, $i = 1, 2, \ldots, 8$, are the imprecise data provided in Table 1, where $\tilde{x}_i, \tilde{y}_i$, $i = 1, 2, \ldots, 8$, are independent linear uncertain variables with regular uncertainty distributions $\Phi_i$ and $\Psi_i$, $i = 1, 2, \ldots, 8$, respectively.
We carried out linear regression using the uncertain slope mean method (USMM) [20] and the uncertain equation deformation method (UEDM) [21], respectively, and then solved the linear regression equations according to the combination forecasting models proposed in this paper. The results are shown in Table 2.
It can be seen from Table 2 that there are some differences between the coefficients of the linear regression equations obtained by USMM and UEDM. The coefficients of the linear regression equations of the models proposed in this paper are almost the same, the stability of the models is strong, and the differences in fitting effect are small enough to be ignored.
The estimated expected values and estimated variances of the disturbance term of each model are shown in Table 3.
As can be seen from Table 3, the estimated expected values of the disturbance terms of URELCFM I, URELCFM II, and URELCFM III are all 0.0000, and their variances are relatively small, indicating that the three models have a better fitting effect and a better forecasting effect, with URELCFM III performing best.
We forecast the data according to URELCFM III and obtained the confidence interval. We assumed that $\tilde{x} \sim \mathcal{L}(17, 19)$ is a new imprecise datum and took the confidence level $\alpha = 95\%$. According to Reference [25], the forecast value and confidence interval obtained are shown in Table 4.
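As a rough check (not part of the original analysis), the Table 4 values can be reproduced from the reported URELCFM III equation $y = 1.5355 + 1.1905x$ (Table 2) and disturbance statistics $\hat{e} = 0$, $\hat{\sigma}^2 = 1.6131$ (Table 3), using $E[\mathcal{L}(17, 19)] = (17 + 19)/2 = 18$:

```python
import math

e_hat, sigma2_hat, alpha = 0.0, 1.6131, 0.95
u_hat = 1.5355 + 1.1905 * 18 + e_hat           # forecast value: 22.9645
half_width = math.sqrt(sigma2_hat) * math.sqrt(3) / math.pi * math.log((1 + alpha) / (1 - alpha))
print(u_hat, half_width)                       # about 22.9645 and 2.57, matching Table 4 up to rounding
```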
From the perspective of the numerical example, all four models proposed in this paper are feasible. From the data analysis and the comparison with existing models, the forecasting effect of the four proposed models is better.

6. Conclusions

Traditional forecasting models all require the data to be precise. In fact, statistics can be imprecise. For example, after the college entrance examination, we invite a teacher to estimate the score of a certain candidate. If the teacher believes that the candidate's score is bound to exceed 500, we obtain the expert's experience datum (500, 0); if the teacher thinks the belief degree that the candidate's score is less than 520 is 0.3, we obtain the datum (520, 0.3); if the teacher thinks the belief degree that the score is less than 550 is 0.6, we obtain (550, 0.6); if the teacher thinks the belief degree that the score is less than 580 is 0.8, we obtain (580, 0.8); and if the teacher believes that the candidate will score no higher than 600, we obtain (600, 1). This gives us five pieces of expert experience data, (500, 0), (520, 0.3), (550, 0.6), (580, 0.8), and (600, 1), all of which are imprecise.
Based on traditional combination forecasting methods and uncertainty theory, this paper proposes two kinds of uncertain combination forecasting models. The forecasting models proposed in this paper are all aimed at imprecise data, and they rely on uncertainty theory for their solution. The unary uncertain linear combination forecasting model is a relatively basic linear model: it establishes several piecewise linear regression models based on data from different periods and combines them into an uncertain combination forecasting model with higher accuracy. The uncertain relative error combination forecasting model is based on the principle of minimizing the sum of squares of relative errors; by setting different weight restrictions, we obtain three uncertain relative error combination forecasting models with good forecasting results. The four models proposed in this paper are all feasible, and the data analysis shows that their forecasting effect is better than that of the existing models.
The numerical example in this paper is a univariate linear forecasting problem, so the model solution and data analysis are not too complicated. The derivation and calculation of a multivariable uncertain linear combination forecasting model are relatively complex and can only be realized with the help of computer programs, for example, in MATLAB.

Author Contributions

Conceptualization, Y.N.; methodology, S.W.; validation, C.W.; data curation, L.W.; writing—original draft preparation, H.S.; writing—review and editing, H.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Yao, C.; Wang, J. A comparative study on forecast models of apple output in China. J. Fruit Sci. 2007, 24, 682–684.
2. Meng, Y.; Wang, J.; Kang, G.; Chen, C. Analysis of apple production status in China. China Fruit Trees 2007, 1, 43–44.
3. Hou, L.; Sun, C. A neural network forecast model for apple yield. J. China Agric. Univ. Soc. Sci. Ed. 2001, 42, 51–53.
4. Gao, S.; Zhang, S.; Mei, L. Linear combination forecast based on relative error criterion. J. Syst. Eng. Electron. 2008, 30, 481–484.
5. Hao, L.; Wu, D. Linear regression combined forecasting model—Take the prediction of China's aging population as an example. J. Shenyang Univ. Soc. Sci. 2016, 18, 290–293.
6. Liu, B. Uncertainty Theory: A Branch of Mathematics for Modeling Human Uncertainty; Springer: Berlin/Heidelberg, Germany, 2010.
7. Liu, B. Uncertainty Theory, 2nd ed.; Springer: Berlin/Heidelberg, Germany, 2007.
8. Liu, B. Some research problems in uncertainty theory. J. Uncertain Syst. 2009, 3, 3–10.
9. Liu, B. Why is there a need for uncertainty theory? J. Uncertain Syst. 2012, 6, 3–10.
10. Liu, B. Uncertainty Theory, 5th ed.; Springer: Berlin/Heidelberg, Germany, 2021.
11. Wang, X.; Ning, Y. An uncertain currency model with floating interest rates. Soft Comput. 2017, 21, 6739–6754.
12. Ning, Y.; Pang, N.; Wang, X. An uncertain aggregate production planning model considering investment in vegetable preservation technology. Math. Probl. Eng. 2019.
13. Guo, H.; Wang, X.; Gao, Z. Uncertain linear regression model and its application. J. Intell. Manuf. 2014, 28, 559–564.
14. Wang, X.; Peng, Z. Method of moments for estimating uncertainty distributions. J. Uncertain. Anal. Appl. 2014, 2, 5.
15. Chen, D. Tukey's biweight estimation for uncertain regression model with imprecise observations. Soft Comput. 2020, 24, 16803–16809.
16. Song, Y.; Fu, Z. Uncertain multivariable regression model. Soft Comput. 2018, 22, 5861–5866.
17. Wang, X.; Li, H.; Guo, H. A new uncertain regression model and its application. Soft Comput. 2020, 24, 6297–6305.
18. Liu, Z. Least absolute deviations estimation for uncertain regression with imprecise observations. Fuzzy Optim. Decis. Mak. 2020, 19, 33–52.
19. Yao, K.; Liu, B. Uncertain regression analysis: An approach for imprecise observations. Soft Comput. 2018, 22, 5579–5582.
20. Wang, S.; Ning, Y.; Shi, H.; Chen, X. A new uncertain linear regression model based on slope mean. J. Intell. Fuzzy Syst. 2021, 40, 10465–10474.
21. Wang, S.; Ning, Y.; Shi, H. A new uncertain linear regression model based on equation deformation. Soft Comput. 2021, 25, 12817–12824.
22. Wang, S.; Ning, Y.; Huang, H. Uncertain least squares estimation model based on relative error. J. Intell. Fuzzy Syst. 2023, 44, 8281–8290.
23. Shi, H.; Sun, X.; Wang, S.; Ning, Y. Total least squares estimation model based on uncertainty theory. J. Ambient. Intell. Humaniz. Comput. 2023, 14, 10069–10075.
24. Liu, Z. Uncertain growth model for the cumulative number of COVID-19 infections in China. Fuzzy Optim. Decis. Mak. 2021, 20, 229–242.
25. Lio, W.; Liu, B. Residual and confidence interval for uncertain regression model with imprecise observations. J. Intell. Fuzzy Syst. 2018, 35, 2573–2583.
Table 1. Imprecise data (linear uncertainty distributions).

i        1          2          3          4
x̃_i     L(1, 3)    L(3, 5)    L(5, 7)    L(7, 9)
ỹ_i     L(4, 6)    L(5, 6)    L(7, 9)    L(10, 12)

i        5          6          7          8
x̃_i     L(9, 11)   L(11, 13)  L(13, 15)  L(15, 17)
ỹ_i     L(12, 14)  L(15, 16)  L(20, 22)  L(18, 20)
Table 2. The linear regression equations.

Model          Linear Regression Equation
UEDM           y = 1.6258 + 1.1805x
USMM           y = 1.5406 + 1.1901x
UULCFM         y = 1.5336 + 1.1906x
URELCFM I      y = 1.5346 + 1.1906x
URELCFM II     y = 1.5321 + 1.1907x
URELCFM III    y = 1.5355 + 1.1905x
Table 3. The expected value and variance of the disturbance term.

Model          Expected Value    Variance
UEDM           −0.2124           12.3274
USMM           −0.0120           5.6523
UULCFM         0.0080            4.4235
URELCFM I      0.0000            2.2146
URELCFM II     0.0000            2.4678
URELCFM III    0.0000            1.6131
Table 4. The forecast value and confidence interval.

Model          Forecast Value    Confidence Interval
URELCFM III    22.9645           22.9645 ± 2.5666