Article

Intelligent Fuzzy Models: WM, ANFIS, and Patch Learning for the Competitive Forecasting of Environmental Variables

by Panagiotis Korkidis * and Anastasios Dounis *
Department of Biomedical Engineering, University of West Attica, Egaleo Park Campus, 12243 Athens, Greece
* Authors to whom correspondence should be addressed.
Sustainability 2023, 15(10), 8032; https://doi.org/10.3390/su15108032
Submission received: 28 February 2023 / Revised: 14 April 2023 / Accepted: 10 May 2023 / Published: 15 May 2023
(This article belongs to the Section Sustainable Engineering and Science)

Abstract:
This paper focuses on the application of fuzzy modeling methods in the field of environmental engineering. Since predicting meteorological data is considered to be a challenging task, the current work aimed to assess the performance of various fuzzy models on temperature, solar radiation, and wind speed forecasting. The models studied were taken from the fuzzy systems literature, varying from well-established to the most recent methods. Four cases were considered: a Wang–Mendel (WM)-based fuzzy predictive model, an adaptive network fuzzy inference system (ANFIS), a fuzzy system ensemble, and patch learning (PL). The prediction systems were built from input/output data without any prior information, in a model-free approach. The ability of the models to display high performance on complex real datasets, provided by the National Observatory of Athens, was demonstrated through numerical studies. Patch learning managed to not only display a similar approximation ability to that of strong machine learning models, such as support vector machines and Gaussian processes, but also outperform them on the highly demanding problem of wind speed prediction. More accurately, as far as wind speed prediction is concerned, patch learning produced a 0.9211 root mean squared error for the training data and a value of 0.9841 for the testing data. The support vector machine provided a 0.9306 training root mean squared error and a 0.9891 testing value. The Gaussian process model resulted in a 0.9343 root mean squared error for the training data and a value of 0.9861 for the testing data. Finally, as shown by the numerical experiments, the fuzzy system ensemble exhibited the highest generalisation performance among all the intelligent models.

1. Introduction

Weather forecasting affects the lives and decisions of a large part of modern society. The prediction of meteorological variables such as temperature, solar radiation, relative humidity, wind speed, and rainfall is crucial for several reasons. The fishing industry, as an example, depends heavily on early information in order to avoid severe weather phenomena at sea and, at the same time, cut down on fuel consumption. Forecasting is equally important for agricultural areas, airports, and regions that utilise renewable energy resources. One of the main challenges in the field of time-series forecasting is to obtain reasonable and accurate future predictions based on historical data records.
There are two approaches in time-series prediction: model-based approaches and nonparametric methods. Model-based approaches are associated with linear or nonlinear models such as autoregressive moving average (ARMA) and nonlinear ARMA with exogenous input (NARMAX), assuming that sufficient prior information is available [1]. Despite the fact that such techniques are widely used, they lack accuracy when dealing with noisy data compared to machine learning techniques [2]. On the contrary, nonparametric methods treat forecasting as a black box problem, taking into account the past observations of the variable of interest. Such methods emerged from the use of computational intelligence and machine learning techniques, such as neural networks and fuzzy systems. The latter have an inherent ability to incorporate the complex characteristics of data and produce accurate time-series forecasts.
Wind and solar energy have the advantages of being clean, free, and sustainable, and they can be utilised for electrification. As a renewable energy resource, wind energy can save fossil energy and reduce greenhouse gas emissions. Solar energy is likewise sustainable and one of the most abundant renewable energy sources. Photovoltaic (PV) panels convert solar energy, in the form of solar radiation, into electrical energy. PV power generation follows a parametric model closely related to weather conditions, and it can be expressed as a function of various factors:

$$P_{PV} = f(\text{solar energy}, \text{ambient temperature}, \text{photovoltaic panel characteristics})$$

Wind turbines (WTs) convert the energy captured from wind into electrical energy; the wind power is proportional to the cube of the wind speed. Wind turbine power generation follows a parametric model that can be expressed as:

$$P_{WT} = f(\text{local wind speed}, \text{wind turbine characteristics})$$
Solar and wind energy display a highly intermittent nature, and their dynamics require an accurate and reliable forecasting mechanism for effective and efficient load management, cost reduction, and a green environment. Short-term forecast models (STFMs) of wind speed, ambient temperature, and solar radiation are also critical to the operation of wind turbines and photovoltaic arrays, so that the energy conversion efficiency can be increased. With the increasing importance of wind and photovoltaic power, the forecasting of wind speed, air temperature, and solar radiation have come to play a vital role in improving energy market efficiency by maximising the revenue from wind and photovoltaic energy sold on the electricity market.
Much research published in the current literature has studied the problem of time-series forecasting pertaining to meteorological variables. In [3], a thorough analysis of wind speed forecasts was presented through the lens of deep learning. The stages of data processing, feature extraction, and learning were comprehensively reviewed. The application of neural networks in wind power systems was studied in [4], and an analysis of the strengths and shortcomings of different artificial neural models was provided. Ensemble-based methods for forecasting solar power were reviewed in [5]. The presented forecasting techniques were classified into three main categories: forecasting methods based on statistical methods, e.g., neural networks and support vector machines (SVMs); forecasting methods based on physical methods; and ensemble methods. Wind power prediction using ensemble learning was also studied in [6], and the authors demonstrated through numerical studies that combining models resulted in superior approximation performance compared with the standalone models. A multi-intelligent model strategy for wind power prediction that fused deep learning, extreme kernel learning machines, and a heuristic optimisation algorithm was studied in [7].
Solano et al., in [8], studied an ensemble of forecasting models based on machine learning techniques. Their final model, combining SVM-based regression, extreme gradient boosting (XGB), and categorical boosting (CatBoost), was applied in solar radiation forecasting. The combination of methods was also beneficial in [9], where a novel approach for wind speed prediction was proposed. A radial basis function (RBF) took over the approximation task and was combined with a seasonal adjustment method and exponential smoothing. The power of SVMs was demonstrated in [10] for short-term wind speed forecasting, together with a genetic optimisation method for the model parameters and a wavelet transform approach for data analysis. Similarly, support vector machines for regression, optimised by an evolutionary algorithm known as grey wolf optimisation, were utilised for wind forecasting in [11].
A hybrid algorithm based on the seasonal autoregression integrated moving average and adaptive network fuzzy inference system was proposed in [12] for wind prediction. The time series was decomposed into its components, and function approximation models were utilised to predict each of these components. Liu et al. [13] studied neural network combined models and proposed a hybrid method for wind speed forecasting. The adaptation of neural networks for the short-term forecasting of wind speed was studied in [14]. Furthermore, a hybrid evolutionary optimised ANFIS was proposed for the same task in [15], together with singular spectrum analysis. In [16], a harmony search-optimised neural network was proposed for improving the performance of solar and wind speed predictions under the uncertain characteristics of these variables. Similarly, neural networks were used to predict solar energy in [17]. A novel hybrid method was proposed in [18], where the authors studied recurrent fuzzy neural networks combined with a particle swarm optimiser for time-series prediction. They applied their model on benchmark time-series problems as well as in wind speed forecasting.
Our aim was to study the problem of forecasting for ambient temperature, solar radiation, and wind speed from a fuzzy models perspective. To our knowledge, the current literature lacks detailed studies of this paper’s adopted models. Additionally, the numerical study of a wide range of fuzzy models is provided, in the sense that WM can be considered as a simple version of a fuzzy system and patch learning as one of the latest trends in the fuzzy research literature. An ensemble of ANFIS and patch learning has not yet been tested on real datasets pertaining to meteorological variable time series. The contributions of this study are the following:
- We conducted numerical studies on a wide range of fuzzy models, from simple to modern trends, in a demanding function approximation framework;
- We also performed numerical studies on the application of an ensemble of ANFIS and PL in an environmental engineering framework for the competitive forecasting of meteorological variables;
- We tested the approximation performance of patch learning on real datasets;
- We generated intelligent forecasting models that can be replicated for other types of variables applied to energy and environmental engineering.
This paper is organised as follows: In Section 2.1, the problem statement is described. The models are described in detail in Section 2.2, Section 2.3, Section 2.4 and Section 2.5. A short mathematical guide to fuzzy systems and the WM method is provided in Section 2.2. Section 2.3 is dedicated to the adaptive network fuzzy inference system. The ensemble methods framework, and more precisely bagging, is illustrated in Section 2.4. Patch learning is discussed in Section 2.5, and a pseudocode is provided. Numerical studies on the data provided by the National Observatory of Athens are illustrated in Section 3. Specifically, Section 3.1 discusses the adopted evaluation metrics, while Section 3.2 and Section 3.3 concern meteorological data and input selection. The forecast results for each dataset are presented in Section 3.4, Section 3.5 and Section 3.6. A comparison with powerful machine learning models is laid out in Section 3.7. Moreover, a comparison of the intelligent models with a naive predictor is provided in Section 3.8. A discussion on the results is given in Section 4. Conclusions and further research goals can be found in Section 5.

2. Materials and Methods

2.1. Problem Statement

Time series are stochastic processes that constitute measurements of variables of a physical system obtained at specific times. Mathematically speaking, the problem of time-series prediction can be viewed as learning a mapping from the feature space $\mathcal{X}$, given by the lagged values of the variable under study, $[x(t-D), \ldots, x(t-1)] \in \mathcal{X} \subseteq \mathbb{R}^D$, to the current value $x(t) \in \mathbb{R}$. The space $\mathcal{X}$ is generated by the selection of the appropriate lags, $D$, forming the necessary predictors of the problem. In the current research, time-series forecasting was studied from the perspective of fuzzy models. Consequently, the predicted value of a variable was based on its previous instances. Since fuzzy systems are equipped with a universal approximation property [19,20,21], they are able to capture the unknown map relating the $m$-tupled input data to the output data, i.e., the previous values of the variable under study and its future realisation.
The fuzzy models were designed from the available dataset, $\mathcal{D} = (X, Y)$, where $Y := (x(t))_{t \in T}$ denotes the future values and $T$ denotes the time index set over which the variable assumes values. The data were split into a training set, $\mathcal{D}_{tr}$, and a testing set. The model was trained using $\mathcal{D}_{tr}$, and it was assessed by testing the prediction performance on both the training and, more importantly, the testing set.
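As an illustration of this formulation, the construction of the lagged feature space from a raw series can be sketched as follows. This is a minimal example of ours; the function name and the choice of lags are illustrative, not part of the original study:

```python
import numpy as np

def make_lagged_dataset(series, lags):
    # series: 1-D array of observations x(0), ..., x(T-1)
    # lags: e.g. [3, 2, 1] selects the features x(t-3), x(t-2), x(t-1)
    D = max(lags)
    # For each lag h, take x(t-h) for every valid target index t = D, ..., T-1
    X = np.column_stack([series[D - h : len(series) - h] for h in lags])
    y = series[D:]  # the current value x(t) to be predicted
    return X, y
```

Each row of `X` then holds the lagged predictors of the corresponding entry of `y`, which is exactly the input/output pairing used to train the fuzzy models.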

2.2. Wang–Mendel Predictive Model

Fuzzy systems [22] are function approximators that have been widely adopted in science and engineering, due to their ability to encode knowledge in the form of IF-THEN rules and, at the same time, describe uncertainty [23]. The general form of a fuzzy rule (the so-called Zadeh rules) is:

$$R^l: \text{If } x_1 \text{ is } F_1^l \text{ and } \cdots \text{ and } x_m \text{ is } F_m^l, \text{ then } y \text{ is } G^l$$

where $F_i^l$ and $G^l$ are fuzzy sets defined in $\mathbb{R}$, and $l = 1, \ldots, M$ indexes the rules. The fuzzy system consists of a collection of fuzzy rules, together with a fuzzy inference engine and a defuzzification mechanism. In the fuzzy inference engine, the rules are combined to generate a map from the input fuzzy sets in $\mathcal{X}$ to the output fuzzy sets in $\mathcal{Y}$. Let $F_1^l \times \cdots \times F_m^l = A^l$, so that the $l$th fuzzy rule is expressed as $A^l \to G^l$; the rule is then described by the membership function $\mu_{R^l}(x, y) = \mu_{A^l \to G^l}(x, y)$. Since the prediction problem implies multiple antecedents, i.e., the input space is $m$-dimensional, we have:

$$\mu_{A^l \to G^l}(x, y) = \left[ T_{i=1}^{m} \mu_{F_i^l}(x_i) \right] \star \mu_{G^l}(y) \tag{1}$$
where $T$ and $\star$ denote t-norms; the latter is used to symbolise the implication operation. If an $m$-dimensional input to the fuzzy system in the form of a fuzzy set $A_x$ is considered, then, by the compositional rule of inference, the $l$th rule determines a fuzzy set in $\mathcal{Y}$ given by:

$$\mu_{B^l}(y) = \sup_{x \in \mathbb{R}^m} \left[ \mu_{A_x}(x) \star \mu_{A^l \to G^l}(x, y) \right], \quad y \in \mathcal{Y} \tag{2}$$
The aggregated fuzzy set $B$, obtained when taking into account the full set of fuzzy rules, i.e., $l = 1, \ldots, M$, is computed as:

$$\mu_B(y) = \max_{l=1}^{M} \left\{ \sup_{x \in \mathbb{R}^m} \left[ \mu_{A_x}(x) \star \mu_{A^l \to G^l}(x, y) \right] \right\} \tag{3}$$
In the case of singleton fuzzification, Equation (3) simplifies to:

$$\mu_B(y) = \max_{l=1}^{M} \left\{ T_{i=1}^{m} \mu_{F_i^l}(x_i) \star \mu_{G^l}(y) \right\}, \quad y \in \mathcal{Y} \tag{4}$$
The general form of the Mamdani fuzzy system $F(x)$, where $x \in \mathbb{R}^m$, can be written as the following function:

$$F(x) = \frac{\sum_{l=1}^{M} \bar{y}^l \left( T_{i=1}^{m} \mu_{F_i^l}(x_i) \right)}{\sum_{l=1}^{M} \left( T_{i=1}^{m} \mu_{F_i^l}(x_i) \right)} \tag{5}$$
where $\bar{y}^l$ is the centre of the fuzzy set $B^l$. Equation (5) can be interpreted as a linear expansion over a set of the so-called fuzzy basis functions [20], $\varphi_l(x)$, where:

$$\varphi_l(x) = \frac{T_{i=1}^{m} \mu_{F_i^l}(x_i)}{\sum_{l=1}^{M} \left( T_{i=1}^{m} \mu_{F_i^l}(x_i) \right)} \tag{6}$$
Throughout this study, a minimum t-norm was used. Fuzzy basis functions combine good characterisations of both local and global properties. As with all machine learning models, designing a fuzzy system involves a selection process in which the partial components are properly chosen; in particular, how the inference rules are created and how the parameters of the membership functions are computed must be resolved.
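Under Gaussian membership functions and the minimum t-norm used in this study, the fuzzy basis functions $\varphi_l(x)$ defined above can be sketched as follows (an illustrative implementation of ours; names and partition shapes are assumptions):

```python
import numpy as np

def gauss(x, c, s):
    # Gaussian membership degree of x in a set with centre c and spread s
    return np.exp(-0.5 * ((x - c) / s) ** 2)

def fuzzy_basis(x, centers, spreads):
    # centers, spreads: shape (M, m), one Gaussian per rule l and input i
    # Minimum t-norm across the m inputs gives each rule's firing strength
    w = np.min(gauss(x, centers, spreads), axis=1)
    return w / w.sum()  # normalisation yields the fuzzy basis functions
```

By construction the basis functions are non-negative and sum to one at every input, which is what makes the Mamdani output a convex combination of the rule centres.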
Wang [24] proposed one of the first and most popular methods of generating fuzzy rules from data, known as the Wang–Mendel (WM) method. This method is characterised by its simplicity, speed of computation, and good performance in many problems related to function approximation. The procedure of generating fuzzy rules from the given data is decomposed into the following steps:
  • Step 1: Define fuzzy partitions for each input variable of the system, covering the corresponding universe of discourse, as well as for the output.
  • Step 2: For each instance $p = 1, \ldots, N$ in $\mathcal{D}_{tr}$, generate a fuzzy rule. The procedure involves computing the membership degrees of the input vector in each fuzzy partition. For each instance, keep the fuzzy set with maximum membership for every input and for the output, forming the antecedent and consequent parts of the rule.
  • Step 3: Compute the degree of each fuzzy rule, $D(R^l)$, generated in Step 2, where $D(R^l) = \prod_{i=1}^{m} \mu_{F_i^l}(x_i)\, \mu_{B^l}(y)$, and $\mu_{F_i^l}$ and $\mu_{B^l}$ are the sets with maximum membership for each input and the output, respectively.
  • Step 4: Remove all conflicting fuzzy rules and determine the final fuzzy rule base. Since the cardinality of the generated rule base equals the number of instances in $\mathcal{D}_{tr}$, there will be rules with the same antecedent. Amongst these rules, only those with maximum $D(R^l)$ are kept for the final rule base.
  • Step 5: Choose the fuzzy system parameters for the fuzzy implication, aggregation, and defuzzification methods.
A WM-based model can be seen as a vanilla version of a fuzzy system designed from data. Its simplicity of implementation and adequate performance explain its wide adoption in the research literature.
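The rule-generation steps above (Steps 2 to 4) can be sketched in a few lines, assuming Gaussian partitions; the function name and data layout are ours, chosen purely for illustration:

```python
import numpy as np

def wm_rules(X, y, in_parts, out_parts):
    # in_parts: one (centers, spread) pair per input variable; out_parts likewise
    # Returns {antecedent index tuple: (consequent index, rule degree)}
    rules = {}
    for xp, yp in zip(X, y):
        ant, deg = [], 1.0
        # Step 2: pick the max-membership fuzzy set for every input
        for xi, (c, s) in zip(xp, in_parts):
            mu = np.exp(-0.5 * ((xi - c) / s) ** 2)
            j = int(np.argmax(mu))
            ant.append(j)
            deg *= mu[j]
        # ...and for the output
        c, s = out_parts
        mu_y = np.exp(-0.5 * ((yp - c) / s) ** 2)
        k = int(np.argmax(mu_y))
        deg *= mu_y[k]
        # Steps 3-4: among conflicting rules (same antecedent), keep max degree
        key = tuple(ant)
        if key not in rules or deg > rules[key][1]:
            rules[key] = (k, float(deg))
    return rules
```

Conflicting rules generated by different training instances collapse to the single rule with the highest degree, so the final rule base has at most one consequent per antecedent combination.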

2.3. Adaptive Network Fuzzy Inference System

The adaptive network fuzzy inference system (ANFIS) is a fuzzy system that can be interpreted as a neural network. It was proposed by Jang [25], and it is functionally equivalent to a Sugeno fuzzy inference system, either of zero or first order. Sugeno fuzzy rules have the same antecedent part as in the general form of a fuzzy rule, but the consequent part is described by a function of the inputs. More specifically, the rules of an ANFIS model are as follows:
$$R^l: \text{If } x_1 \text{ is } F_1^l \text{ and } \cdots \text{ and } x_m \text{ is } F_m^l, \text{ then } y^l(x) = b_{l,0} + \sum_{j=1}^{m} b_{l,j} x_j$$
where $b_{l,0}$ and $b_{l,j}$ are the zero- and first-order parameters of the consequent of the $l$th rule. The training of the neuro-fuzzy system involves the determination of both the antecedent parameters, i.e., the parameters of the membership functions generating the fuzzy partitions of the inputs, and the consequent parameters, i.e., the parameters of the linear functions. In the framework of neuro-fuzzy modeling, the training approach was borrowed from the neural network literature and involves a gradient-descent-based method for the determination of the antecedent parameters. The so-called hybrid training of the ANFIS computes the consequent parameters in batch mode via the least-squares method and, at the same time, determines the input membership function parameters by backpropagation. Compared to using gradient descent for all parameters, the hybrid method of training achieves faster convergence.
The output of the ANFIS can be written as an expansion of the local linear models $y^l(x)$, weighted by the fuzzy basis functions evaluated at $x \in \mathbb{R}^m$:

$$y_{\text{ANFIS}}(x) = \sum_{l=1}^{M} \bar{\varphi}_l(x)\, y^l(x) \tag{7}$$

where $\bar{\varphi}_l$ denotes the fuzzy basis functions computed using a product t-norm.
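A single forward pass of a first-order Sugeno system of this form can be sketched as follows. Gaussian memberships are assumed here for simplicity (the paper's experiments use generalised bell functions), and all names are illustrative:

```python
import numpy as np

def anfis_output(x, centers, spreads, b):
    # b: shape (M, m+1), holding [b_{l,0}, b_{l,1}, ..., b_{l,m}] per rule
    mu = np.exp(-0.5 * ((x - centers) / spreads) ** 2)  # Gaussian memberships
    w = np.prod(mu, axis=1)                             # product t-norm firing strengths
    phi = w / w.sum()                                   # normalised firing strengths
    y_local = b[:, 0] + b[:, 1:] @ x                    # per-rule linear consequents
    return float(phi @ y_local)                         # weighted expansion (output)
```

If every rule shares the same linear consequent, the normalised weights cancel out and the system reproduces that linear model exactly, which is a convenient sanity check.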

2.4. Fuzzy System Ensemble

In both the machine learning and computational intelligence communities, ensemble methods [26,27] are utilised to improve the performance of a given model and, at the same time, to make decisions in uncertain and noisy environments. Ensemble-method-based systems have proven to be very effective and versatile in a broad spectrum of problem domains. Their main feature is the improvement of accuracy for problems such as parameter estimation and prediction in function approximation, achieved by reducing the variance of the system. There are three pillars in designing an ensemble system: data sampling; the training of the base learners; and the combination of the learning members. An issue of paramount importance in ensemble systems is the diversity of the training subsets on which the learners are trained. In this research, bootstrapped samples of the original $\mathcal{D}_{tr}$ were produced and used to train the base model via supervised learning. The individual models chosen as base learners were based on the adaptive network fuzzy inference system. Finally, the trained models were combined using mean averaging (a common choice for regression problems) to form the final ensemble prediction.
The described methodology is widely known as bootstrapped aggregating, i.e., bagging. The combination has a positive effect on the so-called statistical issue. By using a bagging learning algorithm, the variance is reduced due to the combination procedure. The hypothesis space is often too large to explore for limited training data. Furthermore, several hypotheses provide the same accuracy for the training set. Thus, the risk of choosing a hypothesis with a poor generalisation ability is reduced by combining the hypotheses.
Let the ground truth, i.e., the latent function, be $f$, and assume that $B$ bootstrap samples $\mathcal{D}^*_{tr}$ are generated from $\mathcal{D}_{tr}$ according to the empirical distribution $\hat{P}$. For each set $\mathcal{D}^*_{tr}$, a model is trained to provide the prediction $y_b^*(x)$. The theoretical bagging estimate is defined as $E_{\hat{P}}[y^*(x)]$. The output of the ensemble system is given by:

$$y(x) = \frac{1}{B} \sum_{b} y_b^*(x) \tag{8}$$

which can be seen as a Monte Carlo estimate of $E_{\hat{P}}[y^*(x)]$ as $B \to \infty$. In an idealised setting, let $P$ be the population from which $\mathcal{D}_{tr}$ is drawn, and suppose that the bootstrap samples are drawn from the population rather than the observed data. The prediction of each learner can then be written as the unknown function plus an error term:

$$y_b^*(x) = f(x) + \varepsilon_b(x), \quad b = 1, \ldots, B \tag{9}$$
Thus, the mean squared error of $y_b^*(x)$ is given by:

$$\int \left( y_b^*(x) - f(x) \right)^2 P(x)\, dx = \int \varepsilon_b^2(x) P(x)\, dx \tag{10}$$

and the expected error of the ensemble over the distribution $P$ is given by:

$$E_{ens} = \int \left( \frac{1}{B} \sum_{b} y_b^*(x) - f(x) \right)^2 P(x)\, dx = \int \left( \frac{1}{B} \sum_{b} \varepsilon_b(x) \right)^2 P(x)\, dx \tag{11}$$

Hence, by using Jensen's inequality:

$$E_{ens} \le E_{avg} = \frac{1}{B} \sum_{b} \int \varepsilon_b^2(x) P(x)\, dx \tag{12}$$

where $E_{avg}$ stands for the average error of the individual learners; that is, the expected ensemble error is no larger than the averaged error of the base learners.
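This variance-reduction argument can be checked numerically: averaging several noisy predictors never yields a larger mean squared error than the learners' average error. The following is a synthetic sketch of ours, with an arbitrary sine target and noise level:

```python
import numpy as np

rng = np.random.default_rng(0)
f = np.sin(np.linspace(0, 2 * np.pi, 200))        # ground-truth latent function
preds = f + rng.normal(0, 0.3, size=(5, 200))     # five noisy base learners
mse_each = ((preds - f) ** 2).mean(axis=1)        # individual learner errors
mse_ens = ((preds.mean(axis=0) - f) ** 2).mean()  # error of the averaged model
# By Jensen's inequality, the ensemble error cannot exceed the average error
assert mse_ens <= mse_each.mean()
```

With independent errors the improvement approaches a factor of $B$; correlated base learners gain less, which is why bagging stresses the diversity of the training subsets.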
In the fuzzy systems literature, ensemble learning was proposed by Kim in [28], in combination with the genetic learning of a fuzzy rule base. Bagging and boosting models in terms of fuzzy modeling were also studied in [29]. It should be mentioned that there are many ways of averaging, e.g., using an ensemble of a weighted average. However, no studies have identified one method that should be favoured over another, and simple averaging suffers less from overfitting; hence, this approach was adopted in our study. The bagging methodology is illustrated in Algorithm 1.
Algorithm 1 Bagging Algorithm
1: Given training data $\mathcal{D}_{tr}$; $B \in \mathbb{Z}^+$, the maximum number of bootstrap samples; a choice of supervised learning algorithm for the base learners; and a choice of method for combining the individual base models
2: for each $b = 1, \ldots, B$ do
3:    Generate a bootstrap sample $\mathcal{D}^*_{tr}$ from $\mathcal{D}_{tr}$ by sampling with replacement
4:    Train the base model with a supervised learning algorithm using $\mathcal{D}^*_{tr}$ and generate the corresponding hypothesis $y_b^*(x)$
5:    Add $y_b^*(x)$ to the ensemble: $C \leftarrow C \cup y_b^*(x)$
6: end for
7: Return an average model $y(x) \leftarrow \bar{C}$ as the final model, together with the training and testing set predictions
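The steps above can be sketched in a few lines. Here a least-squares linear model stands in for the ANFIS base learner purely for illustration; all names are ours:

```python
import numpy as np

def bagging_fit_predict(X, y, fit, B=5, seed=0):
    # fit(Xb, yb) must return a predict(Xq) callable; the base learner is pluggable
    rng = np.random.default_rng(seed)
    models = []
    for _ in range(B):
        idx = rng.integers(0, len(X), size=len(X))  # bootstrap: sample with replacement
        models.append(fit(X[idx], y[idx]))
    # Mean averaging of the base hypotheses, as in the final step of the algorithm
    return lambda Xq: np.mean([m(Xq) for m in models], axis=0)

def linear_fit(Xb, yb):
    # Stand-in base learner: a least-squares linear model (illustrative only)
    w, *_ = np.linalg.lstsq(Xb, yb, rcond=None)
    return lambda Xq: Xq @ w
```

Calling `bagging_fit_predict(X, y, linear_fit)` returns the ensemble predictor; swapping `linear_fit` for an ANFIS trainer recovers the EnANFIS configuration used in the experiments.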

2.5. Patch Learning

Patch learning (PL) is a novel method recently introduced by Dongrui Wu [30] as a general framework to improve the performance of a system for function approximation.
PL can be decomposed into two design stages. The first stage is associated with the generation of a global model that best fits the training data D t r . The model is global in the sense that it is trained for the full available dataset. In the second stage, a necessary number of local models, i.e., the patch models, are trained for the data that fall into the patches and contribute the most to the training error. It should be noted that patches are domains in the input space X . The challenging task in PL is the determination of the patches, whose computation is based on the notion of type-1 first-order rule partitions [22]. Algorithm 2 demonstrates a pseudocode of PL.
Wu studied the performance of PL in regression problems, demonstrating its very high capabilities according to the fitting results, by using an ANFIS as the global and patch model. One important advantage is that within the framework of PL, just like in ensemble methods, any model can be used as the global and/or local model. This creates a promising opportunity for the exploration of designing intelligent models based on the PL theory.
Algorithm 2 Patch Learning
1: Given training data $\mathcal{D}_{tr}$, testing data $\mathcal{D}^*$, $L$ the maximum number of patches, and a choice of global and patch models
2: Generate a global model by a training procedure using all training instances
3: for each input do
4:    Compute the candidate patches by identifying the first-order fuzzy partitions
5: end for
6: Create a pool containing all the identified partitions
7: for all patches do
8:    Locate the patch associated with the largest approximation error on $\mathcal{D}_{tr}$
9:    Generate a local model via a training procedure using the instances that fall into the above identified patch, i.e., $\mathcal{D}^l_{tr}$, where $l$ is an index of the patch
10:   Remove the current patch from the candidate pool
11: end for
12: Update the global model using the training data $\mathcal{D}_{tr} \setminus \{\mathcal{D}^l_{tr}\}$
13: for all instances in the testing set $\mathcal{D}^*$ do
14:    for all patches do
15:       if the instance falls into the $l$th patch then
16:          use the $l$th patch model to make the prediction
17:       else
18:          use the global model to make the prediction
19:       end if
20:    end for
21: end for
22: Return the training and testing set predictions
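The prediction phase in the final loop of Algorithm 2 reduces to routing each test instance to its patch model, falling back to the global model otherwise. A minimal sketch of ours, with patches represented as hyper-boxes in the input space and models as plain callables:

```python
import numpy as np

def pl_predict(X, patches, patch_models, global_model):
    # patches: list of (lo, hi) hyper-boxes in the input space
    # patch_models: one trained local model per patch; global_model covers the rest
    preds = np.empty(len(X))
    for i, x in enumerate(X):
        for (lo, hi), model in zip(patches, patch_models):
            if np.all(x >= lo) and np.all(x <= hi):
                preds[i] = model(x)  # instance falls into the l-th patch
                break
        else:  # no patch matched this instance
            preds[i] = global_model(x)
    return preds
```

In the actual PL framework the patch boundaries come from the type-1 first-order rule partitions rather than hand-set boxes, but the routing logic is the same.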

3. Results

This section presents the results obtained by numerical experiments applying the fuzzy predictive models to meteorological data. All experiments were conducted on a MacBook Pro running MATLAB 2021a and macOS Catalina version 10.15.7, with a dual-core Intel Core i5 @ 2.5 GHz, 8 GB of memory, and a 128 GB SSD.

3.1. Evaluation Metrics

The fuzzy algorithms were implemented as forecasting models for the temperature, wind speed, and solar radiation meteorological datasets. In order to quantify the best performance among the models, two evaluation criteria were adopted: the root mean squared error (RMSE) and the mean absolute percentage error (MAPE), described by Equations (13) and (14), respectively.

$$E_{\text{RMSE}} = \left[ \frac{1}{|T|} \sum_{t \in T} \left( Y(t) - y_{\text{model}}(t) \right)^2 \right]^{1/2} \tag{13}$$

$$E_{\text{MAPE}} = \frac{1}{|T|} \sum_{t \in T} \left| \frac{Y(t) - y_{\text{model}}(t)}{Y(t)} \right| \tag{14}$$

where $Y(t)$ is the target of the training dataset $\mathcal{D}_{tr}$; $|T|$ is the cardinality of $\mathcal{D}_{tr}$, i.e., the number of training examples; and $y_{\text{model}}(t)$ represents the output of the adopted fuzzy model. A model with a smaller $E_{\text{RMSE}}$ produces more precise predictions. The $E_{\text{MAPE}}$ metric was adopted because it is scale-independent; furthermore, presenting the forecast performance as a percentage should help the reader obtain better insight into the models. The performance of the models was judged on the training, but more importantly the testing, error.
In addition, the coefficient of determination, $r^2$, taking values within $(-\infty, 1]$ and given by Equation (15), was also included as a performance metric, since it is a descriptive statistic that measures the proportion of the variance of the dependent variable $Y(t)$ explained by the suggested explanatory variables. A positive value of $r^2$ can be considered similar to the percentage correctness obtained by regression.

$$r^2 = 1 - \frac{\sum_{t \in T} \left( Y(t) - y_{\text{model}}(t) \right)^2}{\sum_{t \in T} \left( Y(t) - \bar{Y} \right)^2} \tag{15}$$

where $\bar{Y}$ denotes the mean value of the dataset's target. The latter metric indicates the extent to which the output is predictable, i.e., how well the model fits the observed data. A value close to unity indicates a better model.
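The three metrics of Equations (13)-(15) are straightforward to compute; a minimal sketch (function names are ours):

```python
import numpy as np

def rmse(y, yhat):
    # Root mean squared error, Equation (13)
    return float(np.sqrt(np.mean((y - yhat) ** 2)))

def mape(y, yhat):
    # Mean absolute percentage error, Equation (14); targets must be non-zero
    return float(np.mean(np.abs((y - yhat) / y)))

def r_squared(y, yhat):
    # Coefficient of determination, Equation (15)
    ss_res = np.sum((y - yhat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)
```

Note that MAPE is undefined when a target value is zero, which is one reason RMSE remains the primary comparison metric for variables such as solar radiation at night.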

3.2. Meteorological Data

The data used in this study were provided by the National Observatory of Athens, Greece, and consisted of observations corresponding to a two-year period (2018-2019). The data were split into training ($\mathcal{D}_{tr}$) and testing ($\mathcal{D}^*$) sets (70 and 30 percent, respectively). For each dataset, the training set began on the first day of January 2018 and ended on 27 May 2019. Similarly, the testing set spanned from the second hour of 27 May 2019 to 31 December 2019. As far as units are concerned, air temperature was measured in °C, wind speed in m/s, and solar radiation in watts per square metre (W/m²). The items included in the data were all numerical, and they were associated with the year, month, day, hour, and value of each environmental variable in the appropriate units.
The main cause of wind generation is the pressure gradient, which arises due to the fact that the barometric pressure is not the same in every area of the Earth’s surface. Essentially, wind speed is caused by the movement of air from high to low pressures mainly due to changes in temperature. The variation in solar radiation during the day depends on the effect of the atmosphere (absorption, scattering, and reflection) and geometrical factors (ray inclination and the sun–earth distance). Factors that shape the air temperature during the day include wind, humidity, cloud coverage, and solar radiation. The nature of the variability in air temperature, wind speed, and solar radiation is illustrated in Figure 1, from left to right, respectively. The figures depict the time span in months, from January 2018 to December 2019, as well as the training and testing sets for each variable. Each tick on the horizontal axis, labelled with the appropriate month, refers to 15 days of the corresponding month.

3.3. Input Selection

The generation of the feature space $\mathcal{X}$ that served as the input to the fuzzy approximation models was based on the notion of partial correlations. The task was to determine the lags with the highest statistical significance; thus, for each meteorological variable's $\mathcal{D}_{tr}$, the partial autocorrelations $\phi_{hh}$ were computed and plotted to reveal hidden structures. Put simply, the partial autocorrelation between $x(t)$ and $x(t-h)$ is the conditional correlation of $x(t)$ and $x(t-h)$, conditioned on $\{x(t-h+1), \ldots, x(t-1)\}$, i.e., the observations between the time instances $t$ and $t-h$. The partial autocorrelation function (PACF) comprises the sequence of estimates $\hat{\phi}_{11}, \hat{\phi}_{22}, \hat{\phi}_{33}, \ldots$, which are obtained by solving the Yule-Walker equations for $k = 1, 2, \ldots, D$.
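The PACF estimates can be obtained by solving the Yule-Walker system for each successive order $k$ and keeping the last coefficient. A minimal sketch using biased sample autocovariances (the implementation details are ours):

```python
import numpy as np

def pacf(x, max_lag):
    # Biased sample autocovariances r(0), ..., r(max_lag)
    x = np.asarray(x, float) - np.mean(x)
    n = len(x)
    r = np.array([x[: n - h] @ x[h:] for h in range(max_lag + 1)]) / n
    out = []
    for k in range(1, max_lag + 1):
        # Solve the order-k Yule-Walker system; the last AR coefficient
        # is the partial autocorrelation estimate phi_kk
        R = np.array([[r[abs(i - j)] for j in range(k)] for i in range(k)])
        phi = np.linalg.solve(R, r[1 : k + 1])
        out.append(phi[-1])
    return np.array(out)
```

For an AR(1) process the first estimate recovers the autoregressive coefficient while the higher-order estimates fall near zero, which is the cut-off pattern used to pick the significant lags.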
In Figure 2, the sample partial autocorrelations are plotted against the lags for air temperature, wind speed, and solar radiation. The most significant lags, which were chosen as the features for the fuzzy models, are depicted in purple.

3.4. Air Temperature Forecasting

For the first dataset, a vanilla version of a fuzzy system generated from the training data, i.e., the WM model, was adopted. The model was trained using the examples in the training set $\mathcal{D}_{tr}$, resulting in 59 fuzzy rules. The inputs $x(t-3)$, $x(t-2)$, and $x(t-1)$ were granulated using uniform fuzzy partitions generated by nine Gaussian membership functions. As the fuzzy connective, a minimum t-norm was used, and the implication was of the Larsen type. The aggregation operation was associated with a maximum s-norm, and the final prediction of the system emerged from centre-of-gravity defuzzification. The performance for both the training and testing instances could be considered reasonably good, taking into consideration that this is quite a simple version of a fuzzy model.
The second model studied in the current research was the adaptive network fuzzy inference system. The TSK-based model was trained using hybrid learning, which combines gradient descent for the antecedent parameters, e.g., the location and spread of the membership functions, with least squares for the consequent parameters of the rules. The number of membership functions was kept as low as possible, namely two per input dimension. The consequent part of each fuzzy rule was a linear function, and the membership functions were of the generalised bell type. Training lasted 200 iterations. The results of the forecasting models for the training and testing sets are presented in Table 1.
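The inference step of such a first-order TSK system can be illustrated as below. The antecedent and consequent parameters are assumed given; in ANFIS they would be fitted by gradient descent and least squares, respectively, which is not reproduced here.

```python
import itertools
import numpy as np

def gbell(x, a, b, c):
    # generalised bell membership function, as used in ANFIS
    return 1.0 / (1.0 + np.abs((x - c) / a) ** (2 * b))

def tsk_forward(x, mf_params, conseq):
    """Forward pass of a first-order TSK system with two MFs per input.

    mf_params: (dim, 2, 3) array of (a, b, c) per membership function.
    conseq: (n_rules, dim + 1) linear consequent coefficients (last = bias).
    The rule base is the full grid of MF combinations, and the firing
    strength is computed with the product t-norm."""
    dim = len(x)
    mus = np.array([[gbell(x[d], *mf_params[d, m]) for m in range(2)]
                    for d in range(dim)])
    w, y_rules = [], []
    for r, combo in enumerate(itertools.product(range(2), repeat=dim)):
        w.append(np.prod([mus[d, m] for d, m in enumerate(combo)]))
        y_rules.append(conseq[r, :dim] @ x + conseq[r, dim])
    w = np.array(w)
    # normalised weighted average of the rule outputs
    return float(w @ np.array(y_rules) / w.sum())
```

With two membership functions per input, a three-input model such as the air temperature predictor has 2³ = 8 rules, which keeps the consequent least-squares problem small.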
An ensemble of ANFIS models (EnANFIS) was also adopted to improve the forecasting capability of the standalone ANFIS. Each of the five base learners was an ANFIS with two generalised bell-shaped membership functions per input dimension, trained for 200 iterations. The EnANFIS was implemented in a bagging framework, with mean averaging used to generate the final system's predictions. Every base learner was trained using hybrid learning, and the rule consequents were linear.
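The bagging recipe is simple enough to sketch generically. In the snippet below, `fit` and `predict` stand in for the ANFIS training and inference routines, which are not reproduced here; only the bootstrap-and-average logic is shown.

```python
import numpy as np

def bagging_fit_predict(X, y, X_test, fit, predict, n_learners=5, seed=0):
    """Bagging sketch: train each base learner on a bootstrap sample of D_tr
    and mean-average the predictions (the EnANFIS recipe, with a generic
    base learner plugged in)."""
    rng = np.random.default_rng(seed)
    n = len(X)
    preds = []
    for _ in range(n_learners):
        idx = rng.integers(0, n, size=n)    # bootstrap sample, with replacement
        model = fit(X[idx], y[idx])
        preds.append(predict(model, X_test))
    return np.mean(preds, axis=0)           # mean averaging of the ensemble
```

Because each learner sees a different bootstrap sample, the averaged prediction has lower variance than a single learner, which is the effect behind the ensemble's generalisation gains reported below.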
The last fuzzy model tested on the air temperature forecasting problem was patch learning. The experiments were conducted using the algorithm that Dongrui Wu kindly made available through his GitHub repository (https://github.com/drwuHUST/Patch-Learning, accessed on 27 December 2022). Both the global and local models were adaptive network fuzzy inference systems. For each input, fuzzy partitions generated by two trapezoidal membership functions were defined; the use of trapezoidal membership functions stems from the fact that the definition and computation of patches is based on type-1 first-order rule partitions. As mentioned in [30], the number of patches should equal the number of base learners in an ensemble method to ensure a fair comparison; therefore, the number of patches was set to L = 4. The forecasting performance of PL was superior to that of the previous models, since it is not only composed of highly powerful ANFIS models but also combines them in a way that ensures better approximation results. Figure 3 graphically illustrates the WM and ANFIS predictions, as well as those of the EnANFIS and PL.
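The patch learning idea can be conveyed with a deliberately simplified one-dimensional sketch: fit a global model, locate the regions of largest error, fit a dedicated local model on each such patch, then re-fit the global model on the remaining data. Real PL derives the patch boundaries from first-order rule partitions of the fuzzy models [30]; a fixed interval width around the worst-fit point is an assumption made here for brevity.

```python
import numpy as np

def patch_learning_1d(x, y, fit, predict, n_patches=4, width=0.1):
    """A much-simplified patch learning sketch for 1-D inputs.

    `fit`/`predict` stand in for the ANFIS training and inference routines."""
    g = fit(x, y)
    err = np.abs(predict(g, x) - y)
    patches = []
    mask = np.ones(len(x), bool)
    for _ in range(n_patches):
        i = np.argmax(np.where(mask, err, -np.inf))   # worst remaining point
        lo, hi = x[i] - width, x[i] + width
        inside = (x >= lo) & (x <= hi) & mask
        patches.append(((lo, hi), fit(x[inside], y[inside])))
        mask &= ~inside
    g = fit(x[mask], y[mask])                         # refit global on the rest
    def model(xq):
        for (lo, hi), local in patches:
            if lo <= xq <= hi:                        # route to the patch model
                return predict(local, np.array([xq]))[0]
        return predict(g, np.array([xq]))[0]
    return model
```

Even with a trivial mean predictor as the base learner, the patch around a localised bump improves the fit there, which conveys why PL can outperform a single global model.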

3.5. Wind Speed Forecasting

The input space X was generated by selecting the first and third delays of the data, as shown in Figure 2b; thus, a two-dimensional input space was used for the numerical experiments presented below. It should be noted that, up to the 30th lag, the values of the partial autocorrelation function were of very small magnitude. Although the features {x(t−26), x(t−19)} together with x(t−1) were also considered, the results were better when {x(t−1), x(t−3)} was chosen. This can be explained by the fact that the distant lags had significantly low correlation values. It should be stressed that the selection of inputs based on the partial autocorrelation values was indicative and not definitive.
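Constructing the input space from selected lags amounts to the following; calling it with `lags=(1, 3)` reproduces the {x(t−1), x(t−3)} wind speed features.

```python
import numpy as np

def lagged_design(series, lags):
    """Build the input matrix X and target y from a series and chosen lags.

    Row t of X holds (x(t - l) for l in lags) and y[t] holds x(t), for every
    t at which all requested lags are available."""
    series = np.asarray(series, dtype=float)
    h = max(lags)
    X = np.column_stack([series[h - l:len(series) - l] for l in lags])
    y = series[h:]
    return X, y
```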
The WM-based predictive model was similar to that used for the temperature data, with eleven Gaussian membership functions fuzzifying each input. The ANFIS was trained using the hybrid algorithm, with three generalised bell-shaped membership functions per input and a linear function for each rule consequent. The bagging model comprised five ANFISs parametrised as above. Lastly, five patches were considered for patch learning to ensure a fair comparison.
The plots illustrating the forecasting results of the four fuzzy models on the wind speed data are presented in Figure 4, Figure 5, Figure 6 and Figure 7. For visualisation purposes, only a limited time span of the prediction is plotted: the left side of each figure illustrates the training prediction results, limited to 1200 training points, and the right side the testing results. The models' testing set predictions are indicated in magenta. Interestingly, as seen in Table 2, PL was the superior model in terms of the training error, and the ANFIS was superior in terms of the testing error.

3.6. Solar Radiation Forecasting

The last dataset studied concerned solar radiation. Based on the partial autocorrelation functions, a three-dimensional input space was selected, i.e., the lags x(t−1), x(t−2), and x(t−3).
Since solar radiation vanishes at night, the time series contained several zero values during these periods. Thus, the metrics considered for these experiments were the RMSE and r², since the MAPE is undefined when the target contains zeros. Similarly to Section 3.5, the plots (Figures 8–13) were drawn for a limited time span in order to improve readability. The results for E_RMSE^{D_tr} and E_RMSE^{D*} of each model are displayed in Table 3.
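The point about the MAPE can be made explicit in code. The sketch below implements the three metrics and raises an error on zero targets rather than silently dividing by zero, which is why only the RMSE and r² are reported for solar radiation.

```python
import numpy as np

def rmse(y, yhat):
    """Root mean squared error."""
    y, yhat = np.asarray(y, float), np.asarray(yhat, float)
    return float(np.sqrt(np.mean((y - yhat) ** 2)))

def r2(y, yhat):
    """Coefficient of determination."""
    y, yhat = np.asarray(y, float), np.asarray(yhat, float)
    return 1.0 - np.sum((y - yhat) ** 2) / np.sum((y - np.mean(y)) ** 2)

def mape(y, yhat):
    """Mean absolute percentage error; undefined when any target is zero,
    as happens with night-time solar radiation."""
    y, yhat = np.asarray(y, float), np.asarray(yhat, float)
    if np.any(y == 0):
        raise ValueError("MAPE undefined: target contains zeros")
    return float(np.mean(np.abs((y - yhat) / y)) * 100)
```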
The WM model was implemented using nine Gaussian membership functions, the minimum t-norm, Larsen implication, and maximum aggregation, with centre-of-gravity defuzzification; in total, 278 fuzzy rules were generated from the data. The ANFIS model utilised two generalised bell-shaped membership functions per input and was trained with a combination of gradient descent for the antecedent parameters and least squares for the consequent parameters. The final ANFIS prediction was passed through the absolute-value function |·| to discard negative values, which are physically meaningless. The ensemble of ANFIS used ten base learners, and nine patches in total were considered for the PL-based model.

3.7. Comparing Patch Learning with SVM and GP Models

Since the wind speed and solar radiation time-series data were considered rather demanding to forecast, our aim was also to compare the performance of patch learning with that of two of the most powerful machine learning methods in the literature. The two models considered, chosen for their well-known performance, were support vector machines and Gaussian processes (GPs). Both models are kernel-based, ensuring a high approximation capability, while the latter also emerges from a probabilistic Bayesian framework.
To keep the comparison as fair as possible, the inputs of the models were the same as in the previous sections: {x(t−1), x(t−3)} for wind speed and {x(t−1), x(t−2), x(t−3)} for solar radiation.
Our intention in the present section was not to dwell on the details of these models, which would be outside the scope of this study, but to perform numerical experiments for a direct comparison. The theoretical foundations of both SVMs and GPs can be found in standard machine learning references, such as [31,32]. For the simulations, the support vector machine used normalised data, and the kernel function was Gaussian. Due to the number of examples in D_tr, Gaussian process inference was carried out using the subset-of-data method rather than analytically. Furthermore, the covariance matrix of the GP was generated by a squared exponential function. Figure 14 depicts the SVM prediction for solar radiation. We should note that Figures 12 and 15 depict the Gaussian process mean prediction.
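A subset-of-data GP posterior mean with a squared-exponential covariance can be sketched in a few lines of NumPy. The hyperparameter values (length-scale, signal and noise variances) below are assumptions of ours, since the fitted values are not reported in the text.

```python
import numpy as np

def gp_fit_predict(X, y, X_test, subset=200, ell=1.0, sf=1.0, noise=0.1, seed=0):
    """Subset-of-data GP regression with a squared-exponential covariance.

    A random subset of D_tr is used in place of the full training set, so the
    O(n^3) solve stays tractable; hyperparameters are fixed, not optimised."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X), size=min(subset, len(X)), replace=False)
    Xs, ys = X[idx], y[idx]

    def k(A, B):
        # squared exponential kernel: sf^2 * exp(-|a - b|^2 / (2 ell^2))
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return sf ** 2 * np.exp(-0.5 * d2 / ell ** 2)

    K = k(Xs, Xs) + noise ** 2 * np.eye(len(Xs))
    alpha = np.linalg.solve(K, ys - ys.mean())
    return ys.mean() + k(X_test, Xs) @ alpha        # GP posterior mean
```

The SVM baseline was not re-implemented here; any RBF-kernel support vector regressor on standardised inputs matches the configuration described above.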
Training for wind speed: the Gaussian process model resulted in E_RMSE^{D_tr} = 0.9343 and E_MAPE^{D_tr} = 38.2557%, while the support vector machine yielded E_RMSE^{D_tr} = 0.9306 and E_MAPE^{D_tr} = 42.6937%. In terms of the coefficient of determination, the support vector machine achieved a value of 0.8171 and the Gaussian process a value of 0.8157.
Testing for wind speed: the Gaussian process model resulted in E_RMSE^{D*} = 0.9861 and E_MAPE^{D*} = 45.3835%, while the support vector machine yielded E_RMSE^{D*} = 0.9891 and E_MAPE^{D*} = 35.6805%. The values of r² for testing were 0.8195 and 0.8206 for the support vector machine and the Gaussian process, respectively.
Two illustrative plots summarising the performance of the PL, GP, and SVM models are provided in Figure 16. In the upper left plot, the short bars depict the training RMSE, whereas the tall bars refer to the testing RMSE. In the bottom left plot, the short bars refer to the testing error and the tall bars to the training error. The best-performing model's bars are depicted in purple. As can be seen, patch learning performed best in terms of RMSE and the support vector machine in terms of MAPE. We should note that patch learning outperformed the Gaussian process in both the training and testing cases, which we found quite impressive. In the case of solar radiation, the Gaussian process was superior in terms of training RMSE and PL in terms of testing RMSE, as can be seen in the bottom right plot.

3.8. Comparing Intelligent Models with Naive Predictor

Lastly, a comparison of the intelligent fuzzy models with a simple predictor is provided. A forecast of a meteorological variable may be obtained from the most recent information available; such a predictor is referred to as a naive forecast. In the naive prediction model (NPM), the forecast is the average of the m most recent values:
y_NPM(t) = (1/m) ∑_{j=1}^{m} y(t − j),   t ∈ T
The naive predictor requires minimal computational effort; it is one of the oldest and simplest ways to forecast wind speed and is also called a persistence model. It was introduced here to quantify the improvement provided by the intelligent forecasting techniques. The performance results of the NPM applied to the air temperature, wind speed, and solar radiation data are presented in Table 4.
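The NPM defined by the equation above is a one-liner:

```python
import numpy as np

def naive_forecast(series, m):
    """Naive/persistence prediction: the average of the m most recent values,
    i.e. y_NPM(t) = (1/m) * sum_{j=1..m} y(t - j)."""
    series = np.asarray(series, dtype=float)
    return np.array([series[t - m:t].mean() for t in range(m, len(series))])
```

With m = 1 this reduces to pure persistence, i.e., tomorrow equals today.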
As can be easily noticed, this model, despite its simplicity in implementation, failed to reach the performance levels achieved by the intelligent models. Therefore, the use of the fuzzy models for forecasting meteorological variables is supported.

4. Discussion

This research began with the well-known WM model from the fuzzy literature, which obtained strong results on air temperature forecasting but performed poorly on the wind speed and solar radiation data. The drawback of this method stems from its batch-learning nature, combined with the lack of further tuning of the antecedent and consequent parameters. We believe that, with optimisation methods, WM could provide much better approximation results. The ANFIS proved to be an effective model, since the combination of its deep roots in neural networks with an optimal method, i.e., least squares, provided a strong foundation for coping with difficult problems. Despite the simplicity of its implementation, its generalisation ability was remarkable.
A bagging algorithm was also studied. The heart of the ensemble lay in the base learners, which were realised by adaptive network fuzzy inference systems trained on bootstrap samples drawn from D_tr. The performance of the fuzzy ensemble depended heavily on the number of base learners, but also on other factors such as the nature of the data; in general, the higher the number of ANFIS learners, the better the final system's performance. As demonstrated in the numerical study, the fuzzy ensemble of ANFIS exhibited the greatest generalisation performance, i.e., it performed best on the testing set.
In all cases considered, patch learning displayed excellent forecasting results. It is worth noting that patch learning can be regarded as equally efficient, if not better, compared with high-performing models from the machine learning literature. In the demanding wind speed prediction problem, PL outperformed both baselines in terms of the training root mean squared error, achieving RMSEs of 0.9211 and 0.9841 for the training and testing sets, respectively, versus 0.9306 and 0.9891 for the support vector machine and 0.9343 and 0.9861 for the Gaussian process.
As far as the forecasting results are concerned, it should be mentioned that the models were implemented on the raw data without any preprocessing or decomposition techniques.

5. Conclusions

Since energy consumption reduction in the context of renewable energy systems and infrastructure is crucial, a framework for competitive forecasting was developed, and an intelligent systems approach to environmental variable forecasting was studied. The task involved training four fuzzy models on time-series data of air temperature, wind speed, and solar radiation, followed by testing of the prediction performance. The approximation performance was assessed in terms of the following regression metrics: RMSE, MAPE, and the coefficient of determination. To provide a complete comparison, the fuzzy models were compared with each other; with a simple naive predictor; and, finally, with highly effective models from the machine learning literature.
The models studied were all trained in a supervised learning framework and could be considered to cover a range of complexity, from simple, e.g., WM, to modern and sophisticated methods, e.g., PL. All the studied models performed quite well on the air temperature and solar radiation datasets. However, as can be seen from the numerical experiments, wind speed prediction proved to be a quite demanding task for all the models studied in the current paper.
The main challenge of this paper and, as far as we are concerned, its main contribution, was the investigation of a recently introduced approach called patch learning in the context of real environmental data prediction. Moreover, fuzzy ensemble methods are a subject missing from the environmental variable forecasting literature; thus, a bagging model with ANFIS-based base learners was proposed. Patch learning is a promising methodology for improving the performance of supervised learning algorithms for regression problems, and in all cases considered it displayed great forecasting results. Like all fuzzy models, PL is susceptible to the curse of dimensionality; thus, the feature space dimension for regression problems should be kept low.
Even though time-series prediction is a well-studied subject, many issues can arise for complex data, such as selecting an optimal feature space, selecting the best model, selecting the set of optimal model parameters, and reducing the dimensionality. Our future studies will focus on exploring decomposition methods in order to improve the approximation performance of the models in terms of wind speed forecasting, which is quite a demanding task. Furthermore, we aim to explore the performance of ensembles of WM in a bagging framework and the combination of evolutionary computing for the optimisation of the system's parameters. Additionally, to model the uncertainty of the predictions, Pythagorean and Fermatean fuzzy sets will be explored.

Author Contributions

Conceptualisation, P.K. and A.D.; methodology, P.K. and A.D.; software, P.K.; validation, P.K. and A.D.; formal analysis, P.K.; investigation, P.K. and A.D.; resources, A.D.; writing—original draft preparation, P.K.; writing—review and editing, P.K. and A.D.; visualisation, P.K.; supervision, A.D. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Acknowledgments

The authors wish to express their gratitude to the National Observatory of Athens for providing the data under study and the reviewers for their fruitful comments.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
ANFIS: Adaptive network fuzzy inference system
WM: Wang–Mendel method fuzzy system
EnANFIS: Ensemble of ANFIS
PL: Patch learning
GP: Gaussian process
ARMA: Autoregressive moving average
NARMAX: Nonlinear ARMA with exogenous input
STFM: Short-term forecast model
SVM: Support vector machine
RBF: Radial basis function
PACF: Partial autocorrelation function
TSK: Takagi–Sugeno–Kang
NPM: Naive prediction model

References

  1. Zhou, Z.; Dai, Y.; Xiao, J.; Liu, M.; Zhang, J.; Zhang, M. Research on Short-Time Wind Speed Prediction in Mountainous Areas Based on Improved ARIMA Model. Sustainability 2022, 14, 5301. [Google Scholar] [CrossRef]
  2. Jahangir, H.; Ahmadian, A.; Aliakbar Golkar, M.; Elkamel, A.; Almansoori, A. Solar irradiance forecasting based on the combination of Radial Basis Function Artificial Neural Network and Genetic Algorithm. In Proceedings of the 6th European Conference on Renewable Energy Systems, Istanbul, Turkey, 25–27 June 2018. [Google Scholar]
  3. Wang, Y.; Zou, R.; Liu, F.; Zhang, L.; Liu, Q. A review of wind speed and wind power forecasting with deep neural networks. Appl. Energy 2021, 304, 117766. [Google Scholar] [CrossRef]
  4. Marugán, A.P.; Márquez, F.P.G.; Perez, J.M.P.; Ruiz-Hernández, D. A survey of artificial neural network in wind energy systems. Appl. Energy 2018, 228, 1822–1836. [Google Scholar] [CrossRef]
  5. Rahimi, N.; Park, S.; Choi, W.; Oh, B.; Kim, S.; Cho, Y.h.; Ahn, S.; Chong, C.; Kim, D.; Jin, C.; et al. A Comprehensive Review on Ensemble Solar Power Forecasting Algorithms. J. Electr. Eng. Technol. 2023, 18, 719–733. [Google Scholar] [CrossRef]
  6. Lee, J.; Wang, W.; Harrou, F.; Sun, Y. Wind Power Prediction Using Ensemble Learning-Based Models. IEEE Access 2020, 8, 61517–61527. [Google Scholar] [CrossRef]
  7. Yan, H.; Wu, Z. A Hybrid Short-Term Wind Power Prediction Model Combining Data Processing, Multiple Parameters Optimization and Multi-Intelligent Models Apportion Strategy. IEEE Access 2020, 8, 227126–227140. [Google Scholar] [CrossRef]
  8. Solano, E.S.; Dehghanian, P.; Affonso, C.M. Solar Radiation Forecasting Using Machine Learning and Ensemble Feature Selection. Energies 2022, 15, 7049. [Google Scholar] [CrossRef]
  9. Wang, J.; Zhang, W.; Wang, J.; Han, T.; Kong, L. A novel hybrid approach for wind speed prediction. Inf. Sci. 2014, 273, 304–318. [Google Scholar] [CrossRef]
  10. Liu, D.; Niu, D.; Wang, H.; Fan, L. Short-term wind speed forecasting using wavelet transform and support vector machines optimized by genetic algorithm. Renew. Energy 2014, 62, 592–597. [Google Scholar] [CrossRef]
  11. Hameed, S.S.; Ramadoss, R.; Raju, K.; Shafiullah, G. A Framework-Based Wind Forecasting to Assess Wind Potential with Improved Grey Wolf Optimization and Support Vector Regression. Sustainability 2022, 14, 4235. [Google Scholar] [CrossRef]
  12. Zhang, J.; Wei, Y.; Tan, Z.f.; Ke, W.; Tian, W. A Hybrid Method for Short-Term Wind Speed Forecasting. Sustainability 2017, 9, 596. [Google Scholar] [CrossRef]
  13. Liu, Y.; Zhang, S.; Chen, X.; Wang, J. Artificial Combined Model Based on Hybrid Nonlinear Neural Network Models and Statistics Linear Models—Research and Application for Wind Speed Forecasting. Sustainability 2018, 10, 4601. [Google Scholar] [CrossRef]
  14. Manusov, V.; Matrenin, P.; Nazarov, M.; Beryozkina, S.; Safaraliev, M.; Zicmane, I.; Ghulomzoda, A. Short-Term Prediction of the Wind Speed Based on a Learning Process Control Algorithm in Isolated Power Systems. Sustainability 2023, 15, 1730. [Google Scholar] [CrossRef]
  15. Zhang, Z.; Song, Y.; Liu, F.; Liu, J. Daily Average Wind Power Interval Forecasts Based on an Optimal Adaptive-Network-Based Fuzzy Inference System and Singular Spectrum Analysis. Sustainability 2016, 8, 125. [Google Scholar] [CrossRef]
  16. Mohsin, S.M.; Maqsood, T.; Madani, S.A. Solar and Wind Energy Forecasting for Green and Intelligent Migration of Traditional Energy Sources. Sustainability 2022, 14, 6317. [Google Scholar] [CrossRef]
  17. Barrera, J.M.; Reina, A.; Maté, A.; Trujillo, J.C. Solar Energy Prediction Model Based on Artificial Neural Networks and Open Data. Sustainability 2020, 12, 6915. [Google Scholar] [CrossRef]
  18. Nasiri, H.; Ebadzadeh, M.M. MFRFNN: Multi-Functional Recurrent Fuzzy Neural Network for Chaotic Time Series Prediction. Neurocomputing 2022, 507, 292–310. [Google Scholar] [CrossRef]
  19. Kosko, B. Fuzzy systems as universal approximators. In Proceedings of the [1992 Proceedings] IEEE International Conference on Fuzzy Systems, San Diego, CA, USA, 8–12 March 1992; pp. 1143–1162. [Google Scholar]
  20. Wang, L.X.; Mendel, J. Fuzzy basis functions, universal approximation, and orthogonal least-squares learning. IEEE Trans. Neural Netw. 1992, 3, 807–814. [Google Scholar] [CrossRef]
  21. Cybenko, G. Approximation by Superpositions of a Sigmoidal Function. Math. Control. Signals Syst. 1989, 2, 303–314. [Google Scholar] [CrossRef]
  22. Mendel, J.M. Uncertain Rule-Based Fuzzy Systems: Introduction and New Directions, 2nd ed.; Springer: New York, NY, USA, 2017. [Google Scholar]
  23. Zadeh, L.A. Is there a need for fuzzy logic? Inf. Sci. 2008, 178, 2751–2779. [Google Scholar] [CrossRef]
  24. Wang, L.X. The WM method completed: A flexible fuzzy system approach to data mining. IEEE Trans. Fuzzy Syst. 2003, 11, 768–782. [Google Scholar] [CrossRef]
  25. Jang, J.S. ANFIS: Adaptive-network-based fuzzy inference system. IEEE Trans. Syst. Man, Cybern. 1993, 23, 665–685. [Google Scholar] [CrossRef]
  26. Zhou, Z.H. Ensemble Methods: Foundations and Algorithms, 1st ed.; Chapman Hall CRC: Boca Raton, FL, USA, 2012. [Google Scholar]
  27. Polikar, R. Ensemble Learning. In Ensemble Machine Learning: Methods and Applications; Zhang, C., Ma, Y., Eds.; Springer: Boston, MA, USA, 2012; pp. 1–34. [Google Scholar] [CrossRef]
  28. Kim, D. Improving the fuzzy system performance by fuzzy system ensemble. Fuzzy Sets Syst. 1998, 98, 43–56. [Google Scholar] [CrossRef]
  29. Hu, X.; Pedrycz, W.; Wang, X. Random ensemble of fuzzy rule-based models. Knowl. Based Syst. 2019, 181, 104768. [Google Scholar] [CrossRef]
  30. Wu, D.; Mendel, J.M. Patch Learning. IEEE Trans. Fuzzy Syst. 2020, 28, 1996–2008. [Google Scholar] [CrossRef]
  31. Murphy, K.P. Machine Learning: A Probabilistic Perspective; MIT Press: Cambridge, MA, USA, 2013. [Google Scholar]
  32. Rasmussen, C.E.; Williams, C.K.I. Gaussian Processes for Machine Learning; Adaptive Computation and Machine Learning; MIT Press: Cambridge, MA, USA, 2006; pp. I–XVIII, 1–248. [Google Scholar]
Figure 1. Meteorological data. From upper left to bottom right: (a) air temperature, (b) wind speed, and (c) solar radiation.
Figure 2. Sample partial autocorrelation function for meteorological data. From left to right: (a) PACF corresponding to the air temperature data, (b) PACF corresponding to the wind speed data, and (c) PACF corresponding to the solar radiation data.
Figure 3. (a) Upper left—WM-based air temperature forecast; (b) upper right—ANFIS-based air temperature forecast; (c) bottom left—EnANFIS-based air temperature forecast; (d) bottom right—PL-based air temperature forecast.
Figure 4. (a) WM-based wind speed forecasting for a snapshot of the training set, (b) WM-based wind speed forecasting for a snapshot of the testing set.
Figure 5. (a) ANFIS-based wind speed forecasting for a snapshot of the training set, (b) ANFIS-based wind speed forecasting for a snapshot of the testing set.
Figure 6. (a) EnANFIS-based wind speed forecasting for a snapshot of the training set, (b) EnANFIS-based wind speed forecasting for a snapshot of the testing set.
Figure 7. (a) PL-based wind speed forecasting for a snapshot of the training set, (b) PL-based wind speed forecasting for a snapshot of the testing set.
Figure 8. (a) WM-based solar radiation forecasting for a snapshot of the training set, (b) WM-based solar radiation forecasting for a snapshot of the testing set.
Figure 9. (a) ANFIS-based solar radiation forecasting for a snapshot of the training set, (b) ANFIS-based solar radiation forecasting for a snapshot of the testing set.
Figure 10. (a) EnANFIS-based solar radiation forecasting for a snapshot of the training set, (b) EnANFIS-based solar radiation forecasting for a snapshot of the testing set.
Figure 11. (a) PL-based solar radiation forecasting for a snapshot of the training set, (b) PL-based solar radiation forecasting for a snapshot of the testing set.
Figure 12. (a) GP-based wind speed forecasting for a snapshot of the training set, (b) GP-based wind speed forecasting for a snapshot of the testing set.
Figure 13. (a) SVM-based wind speed forecasting for a snapshot of the training set, (b) SVM-based wind speed forecasting for a snapshot of the testing set.
Figure 14. (a) SVM-based solar radiation forecasting for a snapshot of the training set, (b) SVM-based solar radiation forecasting for a snapshot of the testing set.
Figure 15. (a) GP-based solar radiation forecasting for a snapshot of the training set, (b) GP-based solar radiation forecasting for a snapshot of the testing set.
Figure 16. (a) E_RMSE^{D_tr} and E_RMSE^{D*} of the three compared models for wind speed data; (b) E_MAPE^{D_tr} and E_MAPE^{D*} of the three compared models for wind speed data; (c) E_RMSE^{D_tr} and E_RMSE^{D*} of the three compared models for solar radiation data.
Table 1. Models' air temperature forecasting performance based on the training set D_tr and testing set D*. Best values per column in bold.

| Model | E_RMSE^{D_tr} | E_MAPE^{D_tr} | E_RMSE^{D*} | E_MAPE^{D*} | r²_{D_tr} | r²_{D*} |
|---|---|---|---|---|---|---|
| WM | 0.9057 | 4.9417% | 1.0723 | 3.4941% | 0.9839 | 0.9295 |
| ANFIS | 0.6689 | 2.9883% | 0.7530 | 2.2648% | 0.9912 | 0.9709 |
| EnANFIS | 0.6694 | 2.9840% | **0.7491** | **1.9031%** | 0.9912 | **0.9710** |
| PL | **0.6598** | **2.9508%** | 0.7516 | 1.9170% | **0.9914** | 0.9707 |
Table 2. Models' wind speed forecasting performance based on the training set D_tr and testing set D*. Best values per column in bold.

| Model | E_RMSE^{D_tr} | E_MAPE^{D_tr} | E_RMSE^{D*} | E_MAPE^{D*} | r²_{D_tr} | r²_{D*} |
|---|---|---|---|---|---|---|
| WM | 1.0330 | 47.1689% | 1.1123 | 39.8697% | 0.7849 | 0.7901 |
| ANFIS | 0.9288 | 45.1588% | 0.9815 | **37.2538%** | 0.8172 | 0.8212 |
| EnANFIS | 0.9312 | 45.1000% | **0.9809** | 37.3190% | 0.8163 | **0.8214** |
| PL | **0.9211** | **44.7763%** | 0.9841 | 37.3200% | **0.8199** | 0.8202 |
Table 3. Models' solar radiation forecasting performance based on the training set D_tr and the testing set D*. Best values per column in bold.

| Model | E_RMSE^{D_tr} | E_RMSE^{D*} | r²_{D_tr} | r²_{D*} |
|---|---|---|---|---|
| WM | 107.8742 | 115.9108 | 0.8953 | 0.9164 |
| ANFIS | 70.5504 | **63.6497** | 0.9378 | 0.9661 |
| EnANFIS | 70.5875 | 63.6529 | 0.9377 | **0.9663** |
| PL | **64.8374** | 64.0434 | **0.9474** | 0.9658 |
Table 4. Naive prediction model forecasting performance based on the training set D_tr and testing set D*.

| Meteorological Variable | E_RMSE^{D_tr} | E_RMSE^{D*} | E_MAPE^{D_tr} | E_MAPE^{D*} |
|---|---|---|---|---|
| Air temperature | 1.4144 | 1.6958 | 6.6384% | 4.9629% |
| Wind speed | 1.1300 | 1.2501 | 49.9548% | 44.9695% |
| Solar radiation | 173.1735 | 205.5003 | - | - |