Article

Design of Ensemble Forecasting Models for Home Energy Management Systems

1 Faculty of Science & Technology, University of Algarve, 8005-294 Faro, Portugal
2 SIGER, Faculty of Sciences and Technology, Sidi Mohamed Ben Abdellah University, Fez 1049-001, Morocco
3 IDMEC, Instituto Superior Técnico, Universidade de Lisboa, 1950-044 Lisboa, Portugal
4 CISUC, University of Coimbra, 3030-290 Coimbra, Portugal
* Author to whom correspondence should be addressed.
Energies 2021, 14(22), 7664; https://doi.org/10.3390/en14227664
Submission received: 9 October 2021 / Revised: 10 November 2021 / Accepted: 11 November 2021 / Published: 16 November 2021
(This article belongs to the Special Issue Machine Learning Prediction Models in Energy Systems)

Abstract

The increasing level of energy consumption worldwide raises concerns about surpassing supply limits, severe effects on the environment, and the exhaustion of energy resources. Buildings are one of the most relevant sectors in terms of energy consumption; as such, efficient Home or Building Management Systems are an important topic of research. This study discusses the use of ensemble techniques to improve the performance of the artificial neural network models used for energy forecasting in residential houses. The case study is a residential house, located in Portugal, that is equipped with PV generation and battery storage and controlled by a Home Energy Management System (HEMS). It is shown that the ensemble forecasting results are superior to those of the selected single models, which were already excellent. A simple procedure is proposed for selecting the models to be used in the ensemble, together with a heuristic to determine their number.

1. Introduction

The increasing levels of energy consumption worldwide are raising issues with respect to surpassing supply limits, severe effects on the environment (global warming, depletion of the ozone layer, climate problems, and others), and the exhaustion of energy resources [1].
Academia and industry have spent the last decades proposing and discussing new sustainable energy systems and methods for integrating fluctuating renewable energy sources/storage into the electric power grid.
The impact of new approaches to design, develop, and manage energy systems while maintaining sustainability throughout their operation lifetime has been a significant challenge. Keywords such as “smart grids” and “nearly zero energy buildings” have emerged and are usually employed within the boundaries of the subsectors of energy systems, although they should be analyzed in the context of the overall energy system [2]. An example of this is the concept of intelligent energy systems, generally referred to as “Smart Energy Systems” (see [2] for a comprehensive discussion of the synergies between different energy systems with sustainability as their focus).
The full potential of control management through the use of computational resources in energy systems has yet to be reached, and it remains an important research topic. Individual advances in sub-energy systems are clearly essential, but the integration of all sub-energy systems in a global approach is also required. A study of the influence of one sub-system's operation on another may be found in [2].
A formal definition of a smart energy system is given in [3,4] and consists of “new technologies and infrastructures that create new forms of flexibility, primarily in the conversion stage of the energy system”. Different sectors, from electricity to transport, are combined to compensate for the lack of flexibility of renewable sources. The smartness of energy systems is strongly related to foreseeing their future behaviour, and in order to do so, modelling, simulation, and forecasting are of vital importance.

1.1. Background Information

This subsection provides a summary of background information concerning forecasting and prediction in energy systems. Instead of being exhaustive, it presents an overview and links it to references where readers can deepen their understanding of the subject. There is an increasing need for tools that allow modelling energy systems across the overall infrastructure. A variety of simulation approaches is available, and their selection must be adequate for the objective of the study.
Within the subject of modelling and simulating smart energy systems, energy forecasting plays an essential role in energy sector development, policy formulation, and the management of these systems [5]. A vast number of scientific reviews elaborate on the applications of forecasting models and techniques in different energy systems, as well as discussing future trends. A review of deterministic and probabilistic methods using deep learning for forecasting in renewable energy applications is developed in [6], where effectiveness, efficiency, and application potentials are explored. Specifically applied to wind energy, a review of the use of multi-objective optimization for forecasting is presented in [7]. The authors first introduce basic theories and methods, followed by a classification of forecasting objectives for different applications. In [8], forecasting applications for energy storage systems are evaluated, an important study since the operation of energy storage systems is not trivial due to their energy limitations and degradation behaviour. Forecasting algorithms and energy management strategies for microgrids are extensively presented in [9]. Considerations on real-time energy systems' applications of machine learning forecasting methods are given in [5]. Another review of the prediction and forecasting of energy consumption is presented in [10], focusing on the manufacturing industry. The authors categorize the reviewed studies according to system boundaries, modelling techniques, objective, purpose, and perspective of the forecasting, prediction horizon, and model output.
A critical analysis between forecasts and reality for the energy strategy is discussed in [11]. Here, the authors highlight that the accuracy of forecasting at a country scale strongly depends on the trends, scenarios, and risks assumed as contours for the modelling stage, since the expected key parameters and indicators may differ from the actually implemented values. Focusing on energy planning models, a review of forecasting method applications is delivered by [12], which focuses on different aspects of prediction scenario definition. Another systematic review, which compares conventional models with artificial intelligence-based models for energy forecasting, can be found in [13]. The authors discuss model performance based on factors such as prediction horizon, application area, model type, and forecasting accuracy.
In the context of energy forecasting by means of intelligence-based models, and considering that the building sector is a significant energy consumer, additional investigations within the scope of the current study should be referenced. A promising solution to attenuate the impact of buildings' energy consumption on the environment is the concept of low-energy or nearly zero-energy buildings, which can satisfy high standards of energy efficiency [14] by reducing energy consumption while installing (at a local or neighbourhood level) renewable energy resources and, possibly, storage systems. This new paradigm of low-energy buildings requires the selection and use of appropriate forecasting methodologies for evaluating the efficiency of the built environment's energy supply and the demand patterns of a single building or a small community of buildings [15]. Furthermore, the accurate analysis and interpretation of energy demand enable a deeper understanding of consumption patterns for building owners and decision-making entities [16], motivated by social and scientific sustainable development and equitable energy use.
It is crucial to consider the connection of distributed renewable energy, storage, and energy management [17] in buildings. Several challenges are associated with the intermittence of renewable energy and its storage in a single-building or small-community scenario. A key issue is the need for an accurate energy management system (EMS) that balances electricity supply and demand, aiming at the minimization of associated costs, a positive impact on the grid [17], and a contribution to the decarbonisation of energy in the building sector [16]. As part of the global approach to achieving sustainability in energy systems, increasing sustainability from a demand-side point of view has vast economic implications [16], and it requires the evaluation of energy efficiency across diverse design options and operational planning strategies [1,16,18]. These approaches require models for demand simulation and forecasting that are simultaneously accurate and computationally fast [18]. Forecasting the electricity demand of a building is an integral component of smart grids with respect to improving EMS efficiency concerning sustainability goals and cost reduction [1,17]. The accurate forecasting of electricity demand, meaningful for demand response, is explored in [16], where the importance of the availability of high-resolution smart metering infrastructure is also discussed.
Indeed, the greater availability of data at the single-building level, considering that an increasing number of buildings nowadays possess smart meters, magnifies the potential of EMS to have a positive impact on the energy sector as a whole.
For the use of the acquired data in the forecasting models, the system must be continuously monitored and managed with respect to the energy time series and the factors that most impact the building's energy performance (the exogenous variables of the time series). The challenges of load forecasting stem largely from the intrinsic nonlinearity, volatility, and stochastic nature of the real-time load profile and from its dependence on the occupancy pattern, especially in the residential case [16,19]. In recent years, the acceleration of Big Data platforms has increased the use of machine learning methods for delivering more accurate and faster electricity demand forecasts for residential EMS [17,19,20]. Machine learning (ML) facilitates an adequate mimicry of Building Performance Simulation (BPS) algorithms based on engineering methods, while being considerably faster in generating results than BPS [18,20]. Further details on ML methods will be provided in Section 2.

1.2. Objectives, Contributions, and Work Organization

In this context, the objective of this work is to explore the use of ensemble techniques to improve the forecasting performance of artificial neural network models originally designed by multi-objective genetic algorithms. These models are used for energy forecasting in residential houses with PV generation and battery storage, controlled by a Home Energy Management System (HEMS). The case study is a residential house located in Portugal.
The main contributions of this work are listed below:
(i)
A detailed review of ML techniques in energy forecasting in buildings and HEMS;
(ii)
A simple scheme to design ensemble models for forecasting the energy produced and consumed in residences with PV generation and battery storage. Notice that as PV generation forecasting also implies the forecast of solar irradiance and atmospheric temperature, four different forecasting models are needed for a HEMS.
This study is organised into six main sections. Section 1 describes the context of the work, objectives, and contributions. Section 2 presents the literature review, covering the most used ML algorithms for energy systems and focusing on the topics of energy forecasting in buildings and forecasting applications in Home Energy Management Systems. Section 3 presents the design methodology used in this study, and Section 4 describes the case study employed. Section 5 discusses the results achieved with a single model and an ensemble of models, comparing also the results obtained with other techniques. Section 6 concludes the paper and points out future research directions.

2. Literature Review

The literature review section is segmented into three subsections. The first aims to deliver an overview of the most used ML-based prediction methods for energy systems. The second aims to present a review of the related publications on the topic of energy forecasting in the building sector. The third will discuss applications of these prediction methods on Home Energy Management Systems (HEMS). The Clarivate Analytics Web-of-Science was the source of information used to develop the literature review.

2.1. Machine Learning (ML)-Based Prediction Methods for Energy Systems

The prediction of energy is essential due to many factors, as highlighted in the introduction. For ML methods, it is crucial to have access to extensive energy data in a time-series context. The analysis of these time-series data allows them to be assessed in a statistically meaningful manner while predicting future values from previous ones. Theoretical explanations of time-series analysis are extensively reported in the literature, for instance in [21,22,23,24]. A time series can be described as an ordered sequence of values sampled at equal time intervals, and its analysis is composed of two parts. First, the structure and underlying patterns of the observed data should be obtained. Second, in order to support future predictions, a model should be fitted to the sampled data.
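As a concrete illustration of this two-part analysis, the sketch below fits a simple autoregressive model to a toy series by least squares and uses it for a one-step-ahead prediction. The series, noise level, and model order are illustrative assumptions, not data from this study.

```python
import numpy as np

# Toy series sampled at equal time intervals: a noisy daily-like cycle.
rng = np.random.default_rng(0)
t = np.arange(200)
y = np.sin(2 * np.pi * t / 24) + 0.1 * rng.standard_normal(t.size)

# Fit an AR(p) model, y[k] ~ a1*y[k-1] + ... + ap*y[k-p], by least squares.
p = 3
X = np.column_stack([y[p - i - 1 : y.size - i - 1] for i in range(p)])
coef, *_ = np.linalg.lstsq(X, y[p:], rcond=None)

# One-step-ahead prediction from the p most recent values (newest first).
pred = coef @ y[-1 : -p - 1 : -1]
print(round(float(pred), 3))
```

Any model family could replace the AR model in the second step; the structure of the procedure (identify patterns, then fit and predict) stays the same.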
This study does not aim to elaborate on the fundamentals of these techniques since they are widely reported in the literature, as in [25,26,27]. ML approaches may be employed in supervised and unsupervised contexts. Four main steps should be followed in an ML-based approach: data collection, data pre-processing, model training, and model testing. ML-based prediction may use single, ensemble, and hybrid models, which are extensively described in [1]. The first employs one learning algorithm, the second combines multiple prediction models, and the third combines two or more ML techniques. Figure 1 presents a general structure of single, ensemble, and hybrid models for forecasting time series. This work focuses on ensemble models.
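The single/ensemble distinction can be sketched as follows: each single model produces its own forecast, and the ensemble combines them, here by simple averaging, which is one common combination rule. The three toy models below are hypothetical stand-ins for trained predictors.

```python
import numpy as np

def ensemble_forecast(models, x):
    # Average the forecasts produced by several single models.
    return float(np.mean([m(x) for m in models]))

# Hypothetical single models, each mapping an input vector to one forecast.
models = [
    lambda x: 1.00 * x.sum(),
    lambda x: 0.95 * x.sum() + 0.1,
    lambda x: 1.05 * x.sum() - 0.1,
]
x = np.array([1.0, 2.0])
print(ensemble_forecast(models, x))
```

Weighted averages or medians are common alternatives to the plain mean; the averaging step is what gives the ensemble its robustness to any single model's error.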
Examples of single prediction model techniques include Artificial Neural Networks (ANN), Support Vector Regression (SVR), Linear Regression (LR), and the autoregressive integrated moving average (ARIMA). These methods are extensively described in [1]. Studies that used ANN as an energy prediction method for building energy systems may be found in [28,29,30,31,32,33,34]. Applications employing SVR as an energy prediction method for buildings may be found in [35,36,37]. Nowadays, linear regression is often used as a baseline to evaluate the performance of more elaborate machine learning methods. Studies that used ARIMA as an energy prediction method for buildings may be found in [38,39,40].
Ensemble methods have gained substantial attention in recent years and are extensively used nowadays because of their favourable predictive performance; moreover, combining models may contribute to avoiding the overfitting that can result from selecting the best model in a single-model scenario [1]. Studies that used ensemble methods as energy demand prediction methods for buildings may be found in [41,42].
Hybrid models combine ML techniques with one another or with optimisation algorithms. They can be created with one or more phases, corresponding to different problem-solving goals, in order to overcome individual weaknesses, and they can deal with complex components. A hybrid approach for forecasting energy consumption is proposed in [1].
As noted by different authors, ML models are very well suited for energy systems forecasting, and they are described in the early statistics literature [43,44]. In [45], the authors provide a substantial review of the four main identified ML approaches: ANN, Support Vector Machines (SVM), Gaussian-based regressions, and clustering. These have commonly been applied to improve building energy forecasting performance. The authors in [46] reviewed state-of-the-art ML models used in the general application of energy consumption; the most relevant literature (to the date of their article) published in the field is classified according to ML modelling technique, energy type, prediction type, and application area. Another comprehensive review may be found in [47].

2.2. Forecasting of Energy Consumption in Buildings

This subsection presents a sample of studies related to the topic of this article. There are some interesting reviews on this topic, such as [48].
In [20], the study assesses entire building designs and design components based on a ML component-based approach. Test cases show that high prediction quality may be achieved, resulting in errors of 3.7% for cooling and 3.9% for heating.
In [19], the authors propose an innovative deep neural network-based energy prediction algorithm for forecasting the day-ahead hourly energy consumption profile while considering the occupancy rate. In [16], another deep neural network model is designed by optimising its hyperparameters in order to enhance performance in a residential building based on the occupancy rate. Among the compared methods, the authors highlighted that some can take hours or days to process the data and to create a prediction model, which is an inconvenience even when very good performance is reached. In [18], ML architectures are presented, and their suitability for design space exploration in building design is evaluated. Compared to traditional ANN, deep learning has the potential to increase the performance of forecasting models; an example of this is Multi-Task Learning, which can achieve more efficiency in the component development process.
The study in [22] performs feature selection for different energy systems in a residential building context, comparing more than five different methods. The findings help select proper models, sensors, and inputs for model-predictive control systems during the heating and cooling seasons. In [49], a Gaussian kernel regression model with random feature expansion and non-parametric k-NN models were assessed against many different criteria based on feature significance, for scenarios that vary the time intervals between samples.
In [50], the authors developed a forecasting system that combines a linear time-series model (a Seasonal Auto-Regressive Integrated Moving Average) with non-linear ML models (a least squares support vector regression model) in order to identify the historical pattern of energy consumption and to predict multi-step-ahead energy consumption. Optimisation algorithms were investigated using high-dimensional mathematical benchmark functions, and computational time and input needs were assessed.
In [14], the work focuses on predicting the energy consumption of low energy buildings in two different scenarios (employing the entire data or only relevant data). As expected, the results showed that the relevant data modelling approach that relies on small representative data selection has higher accuracy (R2 = 0.98; RMSE = 3.4) than all data modelling approaches (R2 = 0.93; RMSE = 7.1).
In [51], the Holt–Winters (HW) method and an Extreme Learning Machine (ELM) network were combined in a hybrid model for ultra-short-term predictions in a residential context, with a time scale of 15 min. The proposed model consistently demonstrated lower error than HW, ELM, and long short-term memory networks when predicting residential electricity consumption, with substantial reductions in RMSE (87.98%, 64.89%, and 53.39%, respectively).
In [52], a hybrid approach combining physical and data-driven methods is applied to model the building stock's heating and cooling energy consumption, including residential and non-residential buildings. Several machine learning-based models were assessed; among them, polynomial kernel support vector regression showed the best accuracy at the level of a single building, and Gaussian radial basis function kernel support vector regression performed best at the stock level. Another study comparing many machine learning-based models may be found in [53], where the models were validated against energy certifications (within the German regulation) for residential buildings; the data-driven approach is almost 50% more accurate than the certification-based approach.
In [54], the authors present a hybrid technique (a Convolutional Neural Network and a multi-layer bi-directional long short-term memory model) and test it in different scenarios, showing that better results may be obtained using 10-fold cross-validation and a hold-out method.
In [15], 380 buildings from the end of the last century were employed for a comparative analysis of the predictive modelling of heating energy consumption. The authors selected different groups of variables and assessed the methods for obtaining data against the quality of the forecasting results. Six ML methods were used.
In [55], the authors evaluated three learning algorithms in an ensemble fashion by considering their performance: extremely randomised trees (extra-trees), random forests, and gradient boosted regression trees. Among them, gradient boosting improved prediction accuracy by an average of 14% and 65% for heating and cooling loads, respectively, in comparison with another algorithm proposed in the literature.
In [56], the goal was to evaluate the energy demand rate for building heating, and the authors combined the BORUTA feature selection algorithm with rough set theory models, which proved to result in good prediction quality while limiting the number of input variables. In [57], the paper presents a recurrent ANN for medium-to-long-term predictions of electricity consumption profiles in buildings (one-hour resolution). The proposed method achieves lower relative errors than the conventional multi-layered perceptron neural network but presents differences when comparing residential and commercial contexts. In [58], the study's objective was to obtain an accurate one-hour-ahead prediction of heat demand by using hybrid models. The feature set was optimised by using Pearson and least absolute shrinkage and selection operator methods, and the final results were compared with traditional ANN and SVR models.
In [47], six decomposition-based evolutionary ANNs for city and building scale energy forecasting were examined. Several measures were used to improve performance, and the results show that they can obtain high fitting accuracy and low error rates for different prediction horizons of forecasting/planning tasks. In [59], the authors designed a probabilistic data-driven predictive model for predicting electricity demand. The model is based on the Bayesian network framework. Scenarios considering different temporal granularities and spatial resolutions were assessed. They concluded that the Bayesian network framework is efficient for highlighting the dependencies between variables in the considered scenarios.
In [60], ML methods were used to derive data-driven appliance models and usage patterns to predict energy demand, aiming for increased accuracy in predictions of comfort needs, energy costs, environmental impacts, and grid service availability. Energy savings of 7.6% were achieved without requiring substantial behavioural changes, and the algorithms responded with errors of 10% or lower during demand response events.
In [61], a method of coupling simulation with ML to predict indoor conditions and electricity demand in response to schedules and other factors was assessed. Potential spikes were identified based on predicted values. Coupling simulation techniques with ML reduced the need for costly and intrusive data collection methods. In [62], the authors applied extreme gradient boosting to predict and analyse electricity, gas, and water consumption and used SHapley Additive exPlanation to interpret the results. The methods were applied to three different models (electricity, gas, and water consumption), and a non-linear relationship was found between gas consumption and building intensity, due to an apparent relationship with the technology itself. Building type also significantly impacted the interrelationships, especially between electricity and water.
In [63], principal component analysis was performed for dimensionality reduction and for finding hidden patterns, providing the data in clusters. The clusters were associated with climatic variables to forecast power consumption using regression-based ML models. In [64], the study aimed to develop an improved SVM model (applying a Gaussian radial basis kernel function optimised by a genetic algorithm) to predict electricity demand under multiple scenario strategies. An average reduction of 12.1% in monthly electricity demand was achieved compared with conventional behavioural intervention.
In [65], the study aimed to identify the best data-driven method for quantifying the impacts of climatic and socioeconomic changes on electricity consumption in buildings. Four decades of data were used to train and validate the models. Monthly electricity consumption is predicted to decrease by 89.40% in the residential and commercial sectors compared with 2018 levels.
The authors proposed in [33,34] the use of Radial-Basis Function (RBF) networks, designed by a Multi-Objective Genetic Algorithm (MOGA) framework, for multi-step forecasting of residential load demand. They used three years of data collected in Honda Smart Home US [66]. A single chosen model was compared with an ensemble of models, the latter obtaining the best results.

2.3. Applications of ML-Based Energy Systems Forecasting in HEMS

Many studies are available in which applications of ML-based energy systems forecasting in the HEMS context are assessed. Contribution [67] addresses the problem of residential load scheduling by using optimisation techniques in a receding horizon approach with a seven-day-ahead prediction horizon. The proposed approach was compared with receding horizon and day-ahead scheduling techniques, and the obtained results were considered valid compared with the existing state-of-the-art approaches. In [68], the study proposed a ML platform on a smart-gateway-based smart grid in residential buildings, analysing occupant behaviours in a short-term load forecasting scheme. Based on the occupant behaviour profile and energy demand prediction, the proposed EMS can achieve up to 19.66% more peak load reduction and 26.41% more cost savings than the SVM approach. In [31], the current authors developed short-term multi-step PV power forecasts to be used in model-based predictive control for HEMS; MOGA-designed RBFs were employed. In [69], also for short-term prediction, the authors addressed load power forecasting for HEMS by considering a smart community. Solar power systems are also the object of study in [70], where forecasting is developed using a long short-term memory (LSTM) model over different time scales (e.g., 15 min, 30 min, and one day ahead). In [71], the authors addressed forecasting and HEMS optimization from the microgrid perspective, presenting a fully developed and implemented control scheme.
In [72], machine learning methods were used to predict the flexibility of a HEMS. User-behaviour prediction and its impact on HEMS are assessed in [73]. In [74], the authors approach forecasting techniques in HEMS from a prosumer perspective, indicating that the availability of different renewables highly influences optimal demand allocation, renewables-based energy allocation, and the charging–discharging cycle of energy storage and electric vehicles. In contribution [75], machine learning methods are used to improve load prediction in a HEMS context based on human behaviour pattern recognition. In [76], the authors implemented a self-learning HEMS based on the Internet of Things, focusing on price forecasting, price clustering, and a power alert system in order to enhance its functions. In [77], forecasting using deep learning is developed, aiming at improved automation efficiency in HEMS. In [78], the effect of the electric vehicle movement schedule in a system composed of a photovoltaic generator, home energy storage, and HEMS control was analysed.
Notice also that popular open-source solutions such as Prophet [79] or AtsPy [80] can be used to obtain time-series forecasts.

2.4. Future Applications for Schedulable and Non-Schedulable Appliance Consumption Forecasting Using NILM

Due to the limitations on the practical implementation of in-depth and expensive monitoring systems, non-intrusive load monitoring (NILM) is becoming a hot topic [81]. One of the key functions of a home energy management system (HEMS) is monitoring specific appliances, which aims to provide detailed information about the operating states and power consumption of specific devices in the house. Furthermore, it allows HEMS to schedule energy-consuming appliances in order to establish energy-saving strategies, such as reprogramming high-power appliances to operate during off-peak hours [82].
In fact, electric appliances can be classified as schedulable (deferrable) and non-schedulable (non-deferrable) [83]. Devices such as washing machines, dryers, and water pumps can have their operations deferred and can be inoperative during peak energy demand hours. This class of appliances also includes thermostatic devices such as heating/cooling systems and electric water heaters, which represent a significant fraction of overall household electricity usage [84]. Non-schedulable devices, on the other hand, comprise devices such as lighting, refrigerators, or cooking appliances, whose electrical energy needs cannot be postponed.
The purpose of appliance monitoring is to identify the operating states and electrical consumption of schedulable and non-schedulable devices in real time. This can be performed by installing one or more sensors in each load of interest, a process known as intrusive load monitoring. Due to its intrusive nature, which raises specific privacy concerns, the difficulty of installing and configuring several sensors, and its high cost, non-intrusive alternatives are preferred [83].
Non-intrusive load monitoring (NILM) aims to detect individual device usage from the aggregate total consumption collected by a smart meter at the building's entrance. In general, the NILM process includes data collection, feature extraction, event detection, and load identification [83,85]. In [81], the authors present an overview of the state of the art in residential electrical demand monitoring. Unlike previous reviews, the applications of load monitoring are addressed based on the technical challenges faced by the available residential systems. Contribution [86] proposed non-intrusive load monitoring based on a time window for HEMS application. The authors examined three machine learning algorithms (Decision Tree, k-Nearest Neighbor (kNN), and Random Forest) and obtained good performance using a low-frequency public dataset.
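As a minimal illustration of the event-detection step of this pipeline, the sketch below flags abrupt changes in a toy aggregate power signal with a simple edge detector; the threshold value and the signal are illustrative assumptions, not taken from the cited works.

```python
import numpy as np

def detect_events(aggregate, threshold=50.0):
    # Flag samples where aggregate power changes by more than `threshold` W
    # between consecutive samples (a basic edge-based event detector).
    delta = np.diff(aggregate)
    return np.flatnonzero(np.abs(delta) > threshold) + 1

# Toy aggregate signal: 100 W baseline; a 2000 W appliance switches on at
# sample 4 and off at sample 8.
power = np.array([100, 100, 100, 100, 2100, 2100, 2100, 2100, 100, 100], float)
events = detect_events(power)
print(events)  # -> [4 8]
```

Real NILM systems follow each detected event with feature extraction and load identification, typically using classifiers such as the Decision Tree, kNN, and Random Forest models examined in [86].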
An enhanced HEMS in residential power scheduling using a non-intrusive load monitoring approach and an automated nondominated sorting genetic algorithm-II (NSGA-II) was presented in [87]. The authors showed that the proposed advanced HEMS with the NILM approach is practicable and feasible in a real-world context. An analysis of device flexibility in the context of user behavior in a home energy management system using smart plugs was presented in [88]. The authors demonstrated that, by including consumer behavior-related features, the suggested approach obtained outstanding performance. Identifying the operation and energy consumption of the set of non-schedulable appliances and of each schedulable device is not, however, the end of the story: for an efficient HEMS, the consumption of each group of appliances should also be forecasted, using the algorithms described before. An example of the forecasting of these two groups of devices can be observed in [34].

3. Design Methodology

In this Section, the models and the methods used for predictive model design are briefly introduced. The reader is encouraged to inspect the referenced papers for a deeper description.

3.1. The Models

The models used are RBF-ANN models. Typically, a Gaussian radial type of function is employed by hidden neurons, and their outputs are linearly combined afterwards. The model output is given by (1):
$$y[k] = w_{l+1} + \sum_{j=1}^{l} w_j \, e^{-\frac{\left\| i[k] - C(j) \right\|_2^2}{2\sigma_j^2}} \quad (1)$$
where y[k] is the output at time instant k, i[k] is the input vector at k, w is the vector of linear weights, C(j) is the vector (extracted from a C matrix) of the centres associated with hidden neuron j, σ_j represents its spread, and ‖·‖_2 denotes the Euclidean distance.
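As an illustration, (1) can be sketched in a few lines of NumPy; the function and variable names below are ours, not from the paper:

```python
import numpy as np

def rbf_output(i_k, C, sigma, w):
    """Output of a Gaussian RBF network as in (1).

    i_k   : input vector at time k, shape (d,)
    C     : matrix of centres, one row per hidden neuron, shape (l, d)
    sigma : spreads of the l hidden neurons, shape (l,)
    w     : linear weights, shape (l + 1,); w[-1] is the bias term w_{l+1}
    """
    # Squared Euclidean distance of the input to each centre
    sq_dist = np.sum((C - i_k) ** 2, axis=1)
    phi = np.exp(-sq_dist / (2.0 * sigma ** 2))   # hidden-layer activations
    return w[-1] + phi @ w[:-1]                   # linear combination + bias
```

The linear weights enter only through the final dot product, which is what makes the linear–nonlinear parameter separation of Section 3.2.3 possible.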
As the use of the model here is for prediction, a dynamic model is needed. This is achieved by employing external feedback, assuming that (1) can be observed as follows.
$$y[k] = f\big(i[k]\big) \quad (2)$$
The use of delayed versions of the measured output in i[k] of (1) allows the interpretation of the RBF as a Nonlinear AutoRegressive (NAR) model:
$$\hat{y}[k] = f\big(y[k-d_{o_1}], \ldots, y[k-d_{o_n}]\big) \quad (3)$$
Employing delayed versions of external (eXogeneous) inputs, (3) is changed to a NARX model:
$$\hat{y}[k] = f\big(y[k-d_{o_1}], \ldots, y[k-d_{o_n}],\; v[k-d_{i_1}], \ldots, v[k-d_{i_n}]\big) \quad (4)$$
where, for the sake of simplicity, only one external input, v, was used.
As the evolution of the forecasts over a prediction horizon (PH) is the objective, (4) is iterated over that horizon. For k + 1, we have the following.
$$\hat{y}[k+1] = f\big(y[k+1-d_{o_1}], \ldots, y[k+1-d_{o_n}],\; v[k+1-d_{i_1}], \ldots, v[k+1-d_{i_n}]\big) \quad (5)$$
Measured values for one or more terms in the argument of (5) may not be available depending on the indexes selected for the delays. Thus, these values must be obtained by using previous predictions. In this manner, the computation of the predictions over a prediction horizon PH may require PH executions of the model (5), representing a multi-step predictive model.
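The iterated multi-step scheme of (5) can be sketched as follows; this is a generic illustration, with `f`, the lag lists, and the variable names chosen by us:

```python
def multi_step_forecast(f, y_hist, v_future, out_lags, in_lags, PH):
    """Iterate a one-step NARX model f over a prediction horizon PH.

    f        : callable implementing y^[k] = f(past outputs, past inputs)
    y_hist   : measured outputs up to time k-1 (most recent last)
    v_future : exogenous input values covering the needed lags and horizon
    out_lags, in_lags : delay indexes d_o, d_i (1 = previous sample)
    """
    y = list(y_hist)               # predictions are appended here and fed back
    preds = []
    for _ in range(PH):
        k = len(y)                 # current time index
        y_args = [y[k - d] for d in out_lags]         # measured or predicted
        v_args = [v_future[k - d] for d in in_lags]   # exogenous inputs
        y_next = f(y_args, v_args)
        y.append(y_next)           # feed the prediction back for the next step
        preds.append(y_next)
    return preds
```

Once the required delayed outputs are no longer measured, the model consumes its own earlier predictions, which is exactly why PH executions of (5) are needed.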

3.2. Model Design

A data-based model design is usually achieved by using the following three steps:
(i)
Using the available data, training, generalization or testing, and validation sets should be constructed. This phase is known as data selection.
(ii)
Once datasets have been built, the structure of the models, as well as their inputs, should be determined. This phase is known as structure selection.
(iii)
For each model determined in the previous step, its parameters should be estimated. This is the estimation step.
The training set is used to estimate the parameters of each model designed; the testing set is used to compare models during the model design phase or to terminate parameter estimation; both sets are employed in the last two phases. The validation set, which is not used in the model design cycle, is employed to compare the performance of different designed models. In this application, the design data consist of samples, each one using the current value of the modelled variable as a target, and delayed values of the modelled variable, as well as delayed values of every exogenous variable (if existent), as inputs.

3.2.1. Data Selection

We employ an approximate, stochastic convex-hull algorithm for data selection. This algorithm, denoted ApproxHull, was proposed in [89]. It determines the convex hull (CH) of the data while handling memory and time complexity efficiently. The CH vertices are compulsorily introduced in the training set, allowing the model to be designed with elements covering the entire operational range. The remaining samples of the training, testing, and validation sets are randomly extracted from the available data, excluding the CH samples. For further details on the ApproxHull incremental algorithm, please consult [89]. ApproxHull has been applied to different design problems [60,90,91] and also to online model adaptation [92].
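For illustration only, the following sketch mimics the spirit of ApproxHull: hull vertices are forced into the training set and the rest is split at random. It uses an exact 2-D hull (Andrew's monotone chain) in place of the approximate, high-dimensional algorithm of [89], and all names are ours:

```python
import numpy as np

def hull_vertices_2d(P):
    """Indices of the convex hull of 2-D points (Andrew's monotone chain)."""
    idx = sorted(range(len(P)), key=lambda i: (P[i][0], P[i][1]))
    def cross(o, a, b):
        return ((P[a][0] - P[o][0]) * (P[b][1] - P[o][1])
                - (P[a][1] - P[o][1]) * (P[b][0] - P[o][0]))
    hull = []
    for seq in (idx, idx[::-1]):                 # lower hull, then upper hull
        start = len(hull)
        for i in seq:
            while len(hull) - start >= 2 and cross(hull[-2], hull[-1], i) <= 0:
                hull.pop()
            hull.append(i)
        hull.pop()                               # endpoint repeats in next pass
    return sorted(set(hull))

def split_with_hull(X, train_frac=0.6, test_frac=0.2, seed=0):
    """Train/test/validation split in the spirit of ApproxHull [89]:
    hull vertices go compulsorily to the training set; the remaining
    samples are distributed at random."""
    rng = np.random.default_rng(seed)
    hull_idx = np.array(hull_vertices_2d(X))
    rest = np.setdiff1d(np.arange(len(X)), hull_idx)
    rng.shuffle(rest)
    n_tr = max(int(train_frac * len(X)) - len(hull_idx), 0)
    n_te = int(test_frac * len(X))
    train = np.concatenate([hull_idx, rest[:n_tr]])
    return train, rest[n_tr:n_tr + n_te], rest[n_tr + n_te:]
```

In the paper's setting the samples are high-dimensional (lagged inputs plus target), which is precisely why an approximate hull algorithm is needed instead of the exact one sketched here.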

3.2.2. Structure Selection

For feature and topology selection, this study uses a Multi-Objective Genetic Algorithm (MOGA). Within the scope of this study, model design is formulated as a multi-objective optimization problem, in which restrictions and priorities may be assigned to each defined objective. The evolutionary algorithm searches the admissible space of the number of neurons and the number of inputs (lags of the modelled and exogenous variables) of the RBF models. For a detailed explanation of MOGA operation, please consult [93].
In this case, different objectives must be defined prior to MOGA execution. The minimization objectives used in this work are the RMSEs of the training set (ε_Tr) and of the testing set (ε_Te), the model complexity (OM), and the forecasting performance ε_PH. This last criterion is obtained by summing the RMSEs along the PH (6), where D is a time series with p data points, and E is an error matrix (7).
$$\varepsilon_{PH}(D, PH) = \sum_{i=1}^{PH} RMSE\big(E(D, PH), i\big) \quad (6)$$
$$E(D, PH) = \begin{bmatrix} e[1,1] & e[1,2] & \cdots & e[1,PH] \\ e[2,1] & e[2,2] & \cdots & e[2,PH] \\ \vdots & \vdots & \ddots & \vdots \\ e[p-PH,1] & e[p-PH,2] & \cdots & e[p-PH,PH] \end{bmatrix} \quad (7)$$
In this manner, we compute the RMSE for each step-ahead prediction and sum those values. In order to compute this criterion, a forecasting period must be used. To differentiate between the forecasting period used in MOGA design and the one employed in model validation, two notations will be introduced: ε_PH^MOGA and ε_PH^VAL. Notice that the latter is computed on a time series that is not employed for model design.
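A minimal sketch of (6)–(7) follows; the names are ours, and `d_max` stands for the largest lag the model needs, so that forecasts can only start after it:

```python
import numpy as np

def eps_PH(model_step, y, PH, d_max):
    """Forecasting criterion: sum over the horizon of the RMSE of each
    step-ahead prediction, over a time series y with p points.

    model_step(history) -> next prediction; predictions are fed back so
    that column h of E holds the (h+1)-step-ahead errors."""
    p = len(y)
    E = np.empty((p - PH - d_max, PH))
    for row, k in enumerate(range(d_max, p - PH)):
        hist = list(y[:k])                      # measured data up to k-1
        for h in range(PH):
            y_hat = model_step(hist)
            E[row, h] = y[k + h] - y_hat        # (h+1)-step-ahead error
            hist.append(y_hat)                  # feed the prediction back
    rmse_per_step = np.sqrt(np.mean(E ** 2, axis=0))
    return rmse_per_step.sum(), E
```

Each row of `E` corresponds to one forecast origin; column-wise RMSEs give the per-step errors whose sum is the criterion.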

3.2.3. Parameter Estimation

Each model in the current population is a specific RBF, whose parameters must be estimated. A modified version of the Levenberg–Marquardt (LM) algorithm [94,95], which exploits linear–nonlinear parameter separation, is employed. Briefly, (1) can be expressed as follows:
$$y(X, v, u) = \Gamma(X, v)\, u \quad (8)$$
where X represents the input matrix, v the nonlinear parameters (the centres C and the spreads σ), and u represents the linear parameters (linear weights w). By using this parameter decomposition, the optimal value of the linear parameters can be obtained as follows (9):
$$\hat{u}(X, v, t) = \left(\Gamma^T \Gamma\right)^{-1} \Gamma^T t = \Gamma^+ t \quad (9)$$
where the symbol + denotes a pseudo-inverse operation. By incorporating this value in the usual training criterion (sum of the square of the errors), a new criterion is obtained:
$$\Psi(X, v, t) = \left\| t - \Gamma \hat{u} \right\|_2^2 = \left\| t - \Gamma \Gamma^+ t \right\|_2^2 = \left\| P_{\Gamma}^{\perp} t \right\|_2^2 \quad (10)$$
which is independent of the value of the linear parameters. In order to minimize (10) by using the LM algorithm, we need the Jacobian of the model, which can be obtained [96] as follows.
$$J_{\Psi}(X, v) = J(X, v)\big|_{u = \hat{u}} \quad (11)$$
In other words, this is the Jacobian matrix computed in the usual way, with the linear parameters set to their optimal values. Finally, the LM update, s[k], is computed as the solution of the following equation:
$$\left(J^T[k]\, J[k] + \lambda[k]\, I\right) s[k] = J^T[k]\, e[k] \quad (12)$$
where e is the error vector, and λ is the regularization parameter.
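One iteration of this separated scheme, combining the pseudo-inverse solution (9), the criterion (10), and the update equation for s[k], can be sketched as follows; this is an illustration with our own names, not the authors' implementation:

```python
import numpy as np

def separable_step(Gamma, J, t, lam):
    """One step of LM training with linear/nonlinear parameter separation.

    Gamma : hidden-layer output matrix for the current nonlinear parameters
    J     : Jacobian w.r.t. the nonlinear parameters, evaluated at u = u_hat
    t     : target vector
    lam   : regularization parameter lambda
    """
    u_hat = np.linalg.pinv(Gamma) @ t          # optimal linear weights (9)
    e = t - Gamma @ u_hat                      # residual; Psi = ||e||^2 (10)
    # LM update for the nonlinear parameters (centres and spreads)
    s = np.linalg.solve(J.T @ J + lam * np.eye(J.shape[1]), J.T @ e)
    return u_hat, float(e @ e), s
```

The benefit of the separation is that the search happens only over the nonlinear parameters, with the linear weights always set to their closed-form optimum.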
As the model is nonlinear, different initial values for the nonlinear parameters can lead to different final results; good initial values are therefore important. The MOGA framework allows employing either random values or a clustering method, the Optimal Adaptive K-Means (OAKM) algorithm [97]. A user-specified number of training trials can be set for each model in the population, each one with its own initial values. As a multi-objective formulation is used, the best model can be determined in several, also user-specified, ways. Finally, each parameter estimation procedure stops when either a user-specified number of iterations is reached or an early stopping criterion, based on the testing set, is met.
MOGA has been used successfully for different applications, such as HVAC control [90], detection of cerebral accidents [60], Ground Penetrating Radar (GPR) target detection [91], or river water level prediction [98].

3.3. Model Ensemble

As MOGA uses a multi-objective formulation, its result is not a single solution but a set of non-dominated or Pareto solutions. Therefore, the user must select, among these non-dominated models, one that presents a good trade-off among the different objectives, ε_Tr, ε_Te, OM, and ε_PH^MOGA, as well as on the validation data, which are not used for model design. Typically, ε_V and ε_PH^VAL are employed.
Since the number of non-dominated or preferable (i.e., meeting user-specified goals) solutions is typically large, the choice of this “best compromise” model is not a trivial process. On the other hand, non-dominated or preferable solutions are typically very good models, obtained with a huge computational effort in a computer cluster, and it seems wasteful to use only one of them.
For this reason, the use of some of these models for ensemble averaging was proposed in [34]. The idea is to use, as output, the median value of the outputs of the models in the ensemble. The median is preferable to the arithmetic mean because, as the forecasting performance is typically used as a minimization criterion and not as a restriction, even preferable models can occasionally produce large forecasting errors, to which the median is robust.
The selection of the models to be used in the ensemble was not addressed in [34]. It will be discussed in this paper together with a deeper analysis of the ensemble results.
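At each prediction step, the median-based ensemble of [34] reduces to the following sketch (names are ours):

```python
import numpy as np

def ensemble_output(model_outputs):
    """Ensemble averaging as proposed in [34]: the ensemble output is the
    median, at each prediction step, of the individual model outputs.
    model_outputs : array-like of shape (n_models, n_steps)."""
    return np.median(np.asarray(model_outputs), axis=0)
```

With step outputs of 1, 2, and 100 from three models, the median returns 2, whereas the arithmetic mean would be pulled to about 34.3 by the outlier; this is the robustness argument made above.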

4. Case Study Description

The residence employed in this study is located in Montenegro, Faro, Algarve, Portugal (37°0′55″ N, 7°56′6″ W). It has two floors with 20 different spaces. A detailed description of the case study may be found in [32].
A Schneider panel consisting of 16 monophasic circuit breakers, plus a triphasic one, is used as the electric panel. It has a solar photovoltaic system composed of 20 Sharp NU-AK panels [99], each with a maximum power of 300 W. The inverter is a Kostal Plenticore Plus converter [100] that also controls a BYD Battery Box HV H11.5 (capacity of 11.5 kWh) [101]. An intelligent weather station is used to acquire the relevant weather data, as well as to compute their evolution within a user-specified PH [102]. Wi-Fi power plugs [103] are used to monitor and control specific devices. The house also has a few Self-Powered Wireless Sensors (SPWS) available for measuring room climate variables and occupant activity [104].
A data acquisition system is implemented in order to monitor the variables related to electricity consumption. The data used for load demand are supplied by a Carlo Gavazzi EM340 three-phase energy meter [105]. Additional electricity-related variables, used for NILM purposes, are acquired by Circutor Wibeees (WBs) [106], plug-and-play wireless devices for measuring electric consumption. One hundred and ninety-eight variables are sampled by the WBs every second.
Several gateways and a technical wireless network handle data transmission from/to the measured devices. A diagram of the data acquisition system is shown in Figure 2. For a description of the implemented system, as mentioned, the reader is invited to consult [32].

5. Results

Six variables are used for the current study: total electric power demand (PD), PV DC power generated (PG), atmospheric air temperature (T), global solar irradiance (R), occupation (Occ), and day encoding (DE). The first four variables are measured by the data acquisition system. Occ represents the number of occupants present in the house each day. Day encoding, presented in Table 1, characterises each day of the week and the occurrence and severity of holidays based on the day they occur, as may be consulted in [107,108]. The regular day column shows the coding for the days of the week when these are not a holiday. The following column presents the values encoded when there is a holiday, and finally, the special column shows the values that substitute the regular day value in two special cases: for Mondays when Tuesday is a holiday and for Fridays when Thursday is a holiday.
Sixteen months of data were used, ranging from 1 May 2020 00:07:30 to 31 August 2021 23:52:30. The first four variables were averaged in 15 min steps, while a constant value was used for all samples within the same day for Occ and DE. The use of a 15 min time step is due to a previous study [109], where it was proved that this time step, together with a prediction horizon of 28 steps, enabled excellent HEMS performances.
Regarding PD, its maximum, mean, and minimum values are 7.0, 1.1, and 0.0 kW, respectively. Figure 3 shows the daily energy consumption in kWh. We can note that the maximum consumption occurs in winter. Figure 4 illustrates the evolution of PG, for some winter days. The maximum value of PG is 6.2 kW, occurring on 28 April 2021 12:22:30.
Regarding R, its maximum value is 1.18 kW/m2. The peak sunshine hours (or daily sunshine insolation) obtained for this location are represented in Figure 5; the maximum value is 8.5 h, occurring on 20 June 2020. The maximum, mean, and minimum values of T are, respectively, 40.3, 17.1, and −0.6 °C. Figure 6 shows the average daily temperature.
The next figure (Figure 7) shows the values of Occ throughout the entire period, while Figure 8 illustrates a snapshot of DE for the first two months.
As observed in Figure 1, Figure 2, Figure 3 and Figure 4, there are some gaps within the acquired data. This does not constitute a major problem when a static model is being designed but, for a dynamic model, the entire range of lags (dmax) must be present at every sample considered for the design. Moreover, if the goal is to design a predictive model to forecast the evolution of a certain variable over a Prediction Horizon (PH), each time instant requires, in addition to the dmax past values, PH posterior values. As the prediction performance should be assessed over a time series with, say, npred values, dmax + npred + PH consecutive values must be available for the modelled variable, as well as for every exogenous model input.
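The search for gap-free stretches long enough for design and assessment can be sketched as follows (an illustration; names are ours):

```python
def usable_segments(is_missing, d_max, n_pred, PH):
    """Locate stretches of gap-free data long enough for predictive-model
    assessment: d_max + n_pred + PH consecutive samples are needed.

    is_missing : boolean sequence marking gaps in the acquired data.
    Returns half-open (start, end) index pairs of usable runs."""
    need = d_max + n_pred + PH
    segments, start = [], None
    for i, missing in enumerate(list(is_missing) + [True]):  # sentinel gap
        if not missing and start is None:
            start = i                         # a gap-free run begins
        elif missing and start is not None:
            if i - start >= need:
                segments.append((start, i))   # run is long enough to keep
            start = None
    return segments
```

Runs shorter than dmax + npred + PH are discarded, which is why long contiguous periods (such as the two forecasting periods of Section 5.1) have to be identified explicitly.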

5.1. Data Sets Description

As four different forecasting models are needed in a HEMS, a decision was made to use the same forecasting time series for the design and validation of all models. The period used for forecasting during MOGA design will be from 7 July 2020 13:07:30 to 27 July 2020 03:07:30 (1881 samples); for validation, the period between 4 July 2021 05:37:30 and 24 July 2021 03:22:30 (1962 samples) will be employed.
Four models will be designed in this study, designated M1 to M4. The first model will forecast power demand (PD). It is a NARX model whose exogenous variables are T, Occ, and DE.
$$\hat{P}_D(k) = M_1\big(\overline{P_D}(k), \overline{T}(k), \overline{Occ}(k), \overline{DE}(k)\big) \quad (13)$$
The use of these exogenous variables was discussed and justified in previous publications of the authors (please see [32,33,110]). In (13), the bar superscript denotes a set of delayed values of the corresponding variable. Typically, the power demand at any instant within a day is correlated with the corresponding values one day before and, to a lesser extent, with the values observed one week before. For this reason, lags of the modelled and the exogenous variables will be collected from three periods: immediately before the sample, centred at the corresponding instant one day before, and centred at the corresponding instant one week before. For this particular model, we shall use [20, 9, 9] for PD, [20, 9, 0] for T, [1, 0, 0] for Occ, and [1, 0, 0] for DE. This means that, for PD, we shall consider the first 20 lags before the current sample, nine lags centred 24 h before, and nine lags centred one week before. For T, the same numbers of lags before the current sample and centred one day before will be used (but none from the third period), and for the other two variables only the first lag will be allowed. The total number of lags (dmax) that MOGA will consider is therefore (20 + 9 + 9) + (20 + 9) + (1) + (1), i.e., 69 lags.
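Building the candidate lag set from the three periods can be sketched as follows, assuming a 15 min step so that one day spans 96 samples and one week 672 (names are ours):

```python
def period_lags(n_now, n_day, n_week, day=96, week=672):
    """Candidate lag indexes drawn from three periods: immediately before
    the sample, centred one day before, and centred one week before."""
    lags = list(range(1, n_now + 1))                  # most recent lags
    for centre, n in ((day, n_day), (week, n_week)):
        if n:
            half = n // 2
            # n lags centred at the corresponding instant one day/week back
            lags += list(range(centre - half, centre - half + n))
    return lags
```

For the [20, 9, 9] specification of PD this yields 38 candidate lags, the largest being 676, which matches the lind value quoted below.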
As data averaged in 15 min intervals are used, one week of data consists of 4 × 24 × 7 = 672 samples. With the additional four lags before one week of data, we have the largest delay index, lind = 676 samples. As a PH of 28 steps ahead is used, for this model, we obtain the following: npred_MOGA = 1881 − 676 − 28 = 1161 samples (around 12 days of data in July 2020) and npred_VAL = 1962 − 676 − 28 = 1242 samples (around 13 days of data in July 2021).
The second model, M2, will forecast solar irradiance, R. It is a NAR model (with no exogenous inputs).
$$\hat{R}(k) = M_2\big(\overline{R}(k)\big) \quad (14)$$
This model uses [20, 9, 9] lags, which means that dmax = 38; lind, npred_MOGA, and npred_VAL are the same as for M1.
The third model, M3, forecasts atmospheric temperature, T, and is also a NAR model.
$$\hat{T}(k) = M_3\big(\overline{T}(k)\big) \quad (15)$$
As the lags used by this model are [20, 9, 0], the following is obtained: dmax = 29, lind = 100, npred_MOGA = 1745 (around 18 days), and npred_VAL = 1826 (around 19 days).
Finally, the fourth model is used to predict the electric power generated by the photovoltaic system. It is a NARX model, whose exogenous inputs are R and T (for the choice of the exogenous variables, please see [31]).
$$\hat{P}_G(k) = M_4\big(\overline{P_G}(k), \overline{R}(k), \overline{T}(k)\big) \quad (16)$$
This model uses [20, 9, 9] lags for PG, [20, 9, 9] lags for R, and [20, 9, 0] lags for T, which means that dmax = 105; lind, npred_MOGA, and npred_VAL are the same as for M1 and M2.

5.2. ApproxHull Results

The total number of samples supplied to ApproxHull is 17,098. They were divided into Training (Tr), Testing (Te), and Validation (V) sets, with 10,258, 3419, and 3421 samples, respectively. Notice, however, that the sample indexes of these sets are not identical for the four models, as the number and indexes of the Convex Hull (CH) points, which are mandatorily integrated into each Tr, change for each model. The numbers of CH points are 3630, 1835, 112, and 1306 for M1, M2, M3, and M4, respectively.

5.3. MOGA Results

For all problems, MOGA was parameterized with the following values:
  • Prediction Horizon: 28 steps (7 h);
  • Number of neurons: n_n ∈ [2, 10];
  • Initial parameter values: OAKM [97];
  • Number of training trials: five, best compromise solution;
  • Termination criterion: early stopping, with a maximum number of iterations of 50;
  • Number of generations: 100;
  • Population size: 100;
  • Proportion of random emigrants: 0.10;
  • Crossover rate: 0.70.
For models M1 and M4, the number of admissible inputs ranged from 1 to 30, while for M2 and M3 the range employed was from 1 to 20.
For all models, two MOGA executions were performed. In the first, the following objectives were minimized: ε_Tr, ε_Te, OM, and ε_PH^MOGA.
Based on the analysis of the results of the first MOGA execution, in the second execution some of these objectives are recast as restrictions, using a heuristic for that purpose. Notice that, unless stated otherwise, the results refer to data scaled to the range [−1, +1].

5.3.1. Single Solution

Model 1—Power Demand

In the first MOGA execution, 311 non-dominated models were found. The histogram of the usage of the lags of these models, for PD and T, can be found in Figure 9a,b, respectively.
Lags from the different periods are selected for these variables. Occupation and day encoding were less used. Among the 311 non-dominated models, only 16 used Occ, and 42 used DE.
By using the results obtained in the first execution, a second MOGA run was executed, this time with the following objectives:
  • ε_Tr < 0.15;
  • ε_Te < 0.12;
  • OM < 150;
  • Minimize ε_PH^MOGA.
In this execution, 341 non-dominated solutions and 204 preferable solutions were obtained.
The minimum values of ε_Tr, ε_Te, ε_V, and ε_PH^VAL are shown in the next table (Table 2) for the two executions.
From Table 2, we can observe that slightly better results were obtained for ε_PH^VAL in the second execution. One model, offering a good compromise between the different criteria, has been selected from the preferable solutions.
$$\hat{P}_D(k) = M_1\big(P_D(k-1), P_D(k-2), P_D(k-11), P_D(k-12), P_D(k-4), P_D(k-96), P_D(k-97), P_D(k-688), P_D(k-670), P_D(k-672), T(k-6), T(k-9), T(k-10), T(k-13), T(k-14), T(k-16)\big) \quad (17)$$
As can be observed, PD lags from the three periods are employed, while for T only lags from the first period were selected. The other exogenous variables were not employed. The model has seven neurons. In terms of prediction performance, ε_PH^MOGA for the first forecasting period was 4.55; its value for the second forecasting period was 4.79. The prediction performance for this second period was slightly worse than for the first one, but this can happen: firstly, the models were designed to minimize the prediction performance for the first period and not the second; secondly, no data from the second period were involved in the design; thirdly, computing ε_PH^VAL required forecasted values of T whenever measured ones were unavailable, which was not the case for the first period, where only measured data were employed.
Figure 10 shows the evolution of the RMSE over the prediction horizon, and Figure 11 shows the one-step-ahead prediction, this time in the original scale. Both graphs relate to the validation period.

Model 2—Solar Irradiance

The first MOGA execution obtained 216 non-dominated solutions. Figure 12 illustrates the lags’ usage.
By using the results obtained in the first execution, a second MOGA run was executed, this time with the following objectives:
  • ε_Tr < 0.12;
  • ε_Te < 0.10;
  • OM < 150;
  • Minimize ε_PH^MOGA.
In this execution, 341 non-dominated solutions and 204 preferable solutions were obtained.
The minimum values of ε_Tr, ε_Te, ε_V, and ε_PH^VAL are shown in Table 3.
Table 3 shows that the same results were obtained in the second MOGA execution. One model, offering a good compromise between the different criteria, has been selected from the preferable solutions.
$$\hat{R}(k) = M_2\big(R(k-1), R(k-2), R(k-92), R(k-95), R(k-668), R(k-670), R(k-675)\big) \quad (18)$$
As observed, lags from the three periods are employed. The model has seven neurons, which indicates that a simple model can be used for forecasting R. In terms of prediction performance, ε_PH^MOGA was 3.59, while ε_PH^VAL was 2.58. The prediction performance for this second period was much better than for the former simply because the first period included cloudy days, which did not occur in the latter (please see Figure 13).
Figure 14 demonstrates the evolution of the RMSE over the prediction horizon, and Figure 13 demonstrates the one-step-ahead prediction.

Model 3—Atmospheric Temperature

In the first MOGA execution, 175 non-dominated models were designed for M3. The next figure (Figure 15) shows the histogram of lags usage for M3.
By using the results obtained in the first execution, a second MOGA run was executed, this time with the following objectives:
  • ε_Tr < 0.02;
  • ε_Te < 0.02;
  • OM < 100;
  • Minimize ε_PH^MOGA.
In this execution, 341 non-dominated solutions and 204 preferable solutions were obtained.
The minimum values of ε_Tr, ε_Te, ε_V, and ε_PH^VAL are shown in Table 4 for the two executions.
The second MOGA execution obtained slightly better forecasting results. Model (19) was selected from the preferable solutions, employing lags from the two periods.
$$\hat{T}(k) = M_3\big(T(k-1), T(k-2), T(k-8), T(k-11), T(k-17), T(k-18), T(k-92), T(k-96), T(k-100)\big) \quad (19)$$
The model has nine neurons, which means that it has 90 nonlinear parameters. In terms of prediction performance, ε_PH^MOGA was 2.84; its value for the validation period was 2.70.
Figure 16 demonstrates the evolution of the RMSE over the prediction horizon, and Figure 17 demonstrates the one-step-ahead prediction in the original scale. Both graphs are related to the second forecasting period.

Model 4—Power Generated

Three hundred and fifty-six non-dominated models were obtained in the first MOGA execution. The histogram of the usage of the lags of these models for PG, R, and T can be found in Figure 18a–c, respectively.
By analysing Figure 18, we can observe that lags from the different periods are selected for these variables.
By using the results obtained in the first execution, a second MOGA run was executed, this time with the following objectives:
  • ε_Tr < 0.07;
  • ε_Te < 0.05;
  • OM < 100;
  • Minimize ε_PH^MOGA.
In this execution, 295 non-dominated solutions and 186 preferable solutions were obtained.
The minimum values of ε_Tr, ε_Te, ε_V, and ε_PH^VAL are shown in Table 5 for the two executions.
Slightly better results were obtained for ε_PH^VAL in the second execution. The following model has been selected from the preferable solutions.
$$\hat{P}_G(k) = M_4\big(P_G(k-1), P_G(k-20), P_G(k-95), R(k-2), R(k-17), R(k-93), R(k-671), T(k-1), T(k-98)\big) \quad (20)$$
For PG, model (20) employs lags from the first two periods, while for R lags from the three periods were chosen; for T, two lags are employed, one from the first period and one centred one day before. In terms of prediction performance, ε_PH^MOGA was 1.54, while ε_PH^VAL was 1.67. The prediction performance for this second period was slightly worse than for the first one.
Figure 19 demonstrates the evolution of RMSE over the prediction horizon, and Figure 20 demonstrates the one-step-ahead prediction. As observed, excellent results were obtained.

5.3.2. Ensemble Averaging

The number of models in the ensemble and the question of how to select them will be discussed in this Section. At the end of the second MOGA execution, the performances of the preferable models on the training, testing, and validation datasets, and on the first forecasting dataset, are available. Our goal is to select an ensemble of models whose forecasting performance over the entire time series is better than that of a single model. In order to assess this, we analyse the forecasting performance in the second forecasting period, which is not used for model design. We shall compare ε_PH^VAL, the forecasting performance of the models chosen previously, with the one obtained using the median of the outputs of the models in the ensemble, ε̄_PH^VAL(50%). In order to assess the dispersion of the results of a specific ensemble, we shall also evaluate the first and third quartile performances, ε̄_PH^VAL(25%) and ε̄_PH^VAL(75%), as well as the interquartile value, ε̄_PH^VAL(75%) − ε̄_PH^VAL(25%). Please note that these measures are used because they are robust to outliers.
If the models generalize well, it can be argued that the forecasting performance on the first forecasting dataset would translate into a similar performance throughout the time series. Using this assumption, we shall build ensembles from the models with the smallest 10, 25, and 50 values of ε_PH^MOGA. We shall denote these sets of models as {M*^{ε_PH^MOGA}(n)}, where * denotes the type of model, and n assumes the values 10, 25, and 50.
Another criterion that can be used for model selection is the 2-norm of the linear weight vector, ‖w‖_2, of each model. A large norm indicates that the output will change significantly with small changes in the basis function outputs. Following this line of thought, we shall also select the models with the smallest 10, 25, and 50 values of ‖w‖_2 and denote these sets as {M*^{w}(n)}.
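Both selection criteria, together with the robust dispersion measures introduced above, can be sketched as follows (the dict keys and function names are illustrative, not from the paper):

```python
import numpy as np

def select_ensemble(models, n, criterion="forecast"):
    """Keep the n preferable models with either the smallest design-period
    forecasting criterion or the smallest 2-norm of the linear weights.
    Each model is a dict with illustrative keys 'eps_PH' and 'w'."""
    if criterion == "forecast":
        key = lambda m: m["eps_PH"]            # smallest eps_PH^MOGA values
    else:
        key = lambda m: np.linalg.norm(m["w"], 2)   # smallest ||w||_2
    return sorted(models, key=key)[:n]

def ensemble_stats(outputs):
    """Median and interquartile range across ensemble members,
    the outlier-robust measures used to assess dispersion.
    outputs : array-like of shape (n_models, n_steps)."""
    q25, q50, q75 = np.percentile(outputs, [25, 50, 75], axis=0)
    return q50, q75 - q25
```

Sorting by the chosen key and truncating to n members mirrors the "smallest 10, 25, and 50 values" rule; the median row of `ensemble_stats` is the ensemble forecast itself.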

Model 1—Load Demand

The ensembles’ forecasting performances are shown in Table 6. Notice that ε_PH^VAL = 4.79.
By analysing the results shown in Table 6, we can conclude that using the median of the different ensembles always obtains better results than the single selected model. Comparing the two selection criteria, better results are always obtained with the ε_PH^MOGA criterion. Among those, the smallest ε̄_PH^VAL(50%) is achieved for {M1^{ε_PH^MOGA}(25)}, which also has a small dispersion of results. Figure 21 shows details of the measured load demand values, between 15 July 2021 06:52:30 and 15 July 2021 17:37:30, and a one-step-ahead box chart of the {M1^{ε_PH^MOGA}(25)} models. As observed, the dispersion between the model results is not large. Figure 22 shows the evolution of ε_PH^VAL for the single model, in red, and for the models belonging to the selected ensemble. The evolution of the median is shown in black; it can be observed that, for all prediction steps, its scaled RMSE is always inferior to that of the single model.

Model 2—Solar Irradiance

The ensembles’ forecasting performances for R are shown in Table 7. Notice that ε_PH^VAL, the forecasting performance of the single model, was 2.58.
The performance of the ensemble models is always superior to that of the single model. As in M1, the forecasting selection criterion performs better than the 2-norm criterion. For this criterion, in contrast with the previous case, the dispersion increases with the number of elements in the ensemble, and the smallest ensemble obtained the best results.
As observed by analysing Figure 23 and Figure 24, the one-step-ahead approximations obtained by {M2^{ε_PH^MOGA}(25)} are very accurate, and the dispersion between models is very small. As in the case of load demand, the RMSE evolution achieved by the ensemble is always better than that of the single model.

Model 3—Atmospheric Temperature

The ensembles’ forecasting performances are shown in Table 8. Notice that ε_PH^VAL = 2.70.
As with the previous models, the median forecasting performance of the ensemble models is always better than that of the single model. Ensembles whose models were selected with the forecasting criterion are better than those using the weight norm. The dispersion of the results decreases with the dimension of the ensemble, but the one with 25 elements obtains the best median value. The results are shown in Figure 25 and Figure 26. Figure 25 shows detailed measured T values on 15 July 2021, between 03:37:30 and 16:37:30, and a one-step-ahead box chart of the {M3^{ε_PH^MOGA}(25)} models. The approximation is very accurate; consequently, the dispersion of the ensemble values is very small. As observed in Figure 26, the RMSE of the single model is always superior to the ensemble median.

Model 4—Power Generated

The ensembles’ forecasting performances for PG are shown in Table 9. Notice that ε_PH^VAL = 1.67.
As with the previous models, the performance of all ensemble models was better than that of the single model; this time, it was significantly better. The forecasting criterion also produced better results than the weight norm. In contrast with the other models, the ensemble with 50 models achieved the best median forecasting performance this time, albeit with a larger dispersion. For this reason, we chose the {M4^{ε_PH^MOGA}(25)} ensemble.
Figure 27 shows detailed measured PG values on 15 July 2021, between 06:22:30 and 20:22:30, and the one-step-ahead box chart of the $\{M4\,\varepsilon_{PH}^{MOGA}(25)\}$ models. As observed, the approximation is very accurate; consequently, the dispersion of the ensemble values is very small. Figure 28 illustrates the evolution of the RMSE over the prediction horizon. Apart from the result for the first step ahead, all other values were significantly better than those of the selected model.
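RMSE-versus-horizon curves of this kind can be computed from a matrix of multi-step forecasts collected over many forecast origins. A minimal sketch (the function and variable names are ours, not from the paper):

```python
import numpy as np

def rmse_per_step(y_true, y_pred):
    """RMSE at each step of the prediction horizon.

    y_true, y_pred: arrays of shape (n_origins, horizon), where row i
    holds the measured/predicted values for the horizon starting at
    forecast origin i. Returns an array of shape (horizon,).
    """
    err = np.asarray(y_pred) - np.asarray(y_true)
    return np.sqrt(np.mean(err ** 2, axis=0))

# toy usage: 100 forecast origins, 28-step horizon (15-min steps, 7 h)
rng = np.random.default_rng(0)
truth = rng.normal(size=(100, 28))
pred = truth + rng.normal(scale=0.1, size=(100, 28))
curve = rmse_per_step(truth, pred)
assert curve.shape == (28,)
```

Plotting `curve` against the step index reproduces the kind of evolution shown over the prediction horizon.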

5.4. Discussion of the Results

As the results presented in Section 5.3.1 show, the forecasting results obtained with the MOGA single solution are very good and among the best presented in the literature. Despite that, the MOGA ensembles with averaged solutions, discussed in Section 5.3.2, significantly improved the prediction results for all models considered.
Two different criteria were introduced to select the elements of the ensemble: the forecasting criterion $\varepsilon_{PH}^{MOGA}$ and the weight norm $\|w\|_2$. For all ensembles generated, the forecasting criterion obtained better $\bar{\varepsilon}_{PH}^{VAL,50\%}$ results; for this reason, it should be the criterion used for model selection.
The number of models in the ensemble was also discussed for all model types. Among the three possibilities (10, 25, and 50), 25 models were chosen. Notice that this value is approximately 10% of the preferable solutions chosen by MOGA, which may serve as a rule of thumb. Additionally, in terms of computing time, this number does not translate into a large overhead relative to the single solution.
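The selection-and-aggregation procedure described above can be sketched as follows, assuming each candidate model has already been trained and scored by the chosen criterion; all names are illustrative, not the paper's code:

```python
import numpy as np

def select_ensemble(scores, k=25):
    """Return the indices of the k models with the lowest value of the
    selection criterion (standing in for eps_PH^MOGA or ||w||_2)."""
    return np.argsort(scores)[:k]

def ensemble_forecast(predictions):
    """Aggregate member forecasts: median across models, plus the
    interquartile range (IQR) as a simple dispersion/prediction band.

    predictions: shape (n_models, horizon).
    """
    med = np.median(predictions, axis=0)
    q1, q3 = np.percentile(predictions, [25, 75], axis=0)
    return med, q3 - q1

# toy usage: 250 preferable models, keep ~10% (the rule of thumb above)
rng = np.random.default_rng(1)
scores = rng.random(250)                 # criterion value per model
members = select_ensemble(scores, k=25)
all_preds = rng.normal(size=(250, 28))   # each model's 28-step forecast
median, iqr = ensemble_forecast(all_preds[members])
```

The median is the ensemble output compared against the single model; the IQR quantifies the dispersion reported in the tables.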

Comparison of Results

In this section, we compare the forecasting performance of the technique proposed in this paper with that of similar studies in the literature. We start with the authors' previous publications and subsequently address other authors' works. Only two of the four models are considered: M1, for load demand, and M4, for power generation, as the other two models are only required as inputs to these two.
In [110], data from the same house were employed to generate load demand predictive models. Data from January to July 2020 were employed for model design, and forecasting validation used two weeks of data, from 10 to 24 July 2020. The same time resolution, 15 min, was employed.
Although only 7 months of data were employed in that study, compared to the 15 months employed here, the minimum RMSE values in the different datasets were comparable to the ones presented in Table 2. Forecasting performance was assessed with a PH of 48 steps. The best RMSE obtained, considering only the first 28 prediction steps employed here, was 4.84. This should be compared with $\varepsilon_{PH}^{VAL} = 4.79$ and the best ensemble result, 4.61. It should be stressed that, although the validation periods used in the two papers differ, the single-solution results are comparable, and both are worse than the ensemble solution.
In [66], data collected in the Honda Smart Home US were employed; note that the energy consumption of the two houses is similar. Two different time steps were used, 15 min and 1 h. Three years of data were used for model design and validation. To analyse prediction results, one week of data, from 25 February to 1 March 2017, was employed. Using the first time resolution, the $\varepsilon_{PH}^{VAL}$ obtained was 9.24.
The forecasting performance of the R, T, and PG models was discussed in [31]. Data from the same house were employed, with the design period covering 19 May to 31 July 2020 (two and a half months), and the forecasting performance was assessed during the 14 June to 12 July period. The minimum RMSEs for the three datasets and the three models were equivalent to the values presented in Table 3, Table 4 and Table 5.
In terms of forecasting performance, a PH of 48 steps was also applied in [31]. Using only the first 28 prediction steps for the sum, a value of 2.60 was obtained for R; this should be compared with $\varepsilon_{PH}^{VAL} = 2.58$ here and the value of 2.47 obtained using $\{M2\,\varepsilon_{PH}^{MOGA}(25)\}$. For T, [31] obtained 2.69, compared with $\varepsilon_{PH}^{VAL} = 2.71$ and 2.58 for $\{M3\,\varepsilon_{PH}^{MOGA}(25)\}$. Finally, for the power generation models, [31] obtained 1.71, while we achieved $\varepsilon_{PH}^{VAL} = 1.68$ and a $\bar{\varepsilon}_{PH}^{VAL,50\%}$ of 1.41 using $\{M4\,\varepsilon_{PH}^{MOGA}(25)\}$. Again, the ensemble forecasting values are significantly better than those obtained in [31].
Other authors have proposed different techniques for load demand and power generation. As criteria other than the RMSE were employed in their studies, other performance criteria are defined below.
$$\mathrm{MAE}=\frac{\sum_{t=1}^{n}\left|y_{t}-\hat{y}_{t}\right|}{n}$$
$$\mathrm{MRE}=\frac{\mathrm{MAE}}{r}\times 100\%$$
$$\mathrm{MAPE}=\frac{1}{n}\sum_{t=1}^{n}\left|\frac{y_{t}-\hat{y}_{t}}{y_{t}}\right|\times 100\%$$
$$R^{2}=1-\frac{\sum_{t=1}^{n}\left(y_{t}-\hat{y}_{t}\right)^{2}}{\sum_{t=1}^{n}\left(y_{t}-\bar{y}\right)^{2}}$$
In the previous equations, $n$ is the number of samples, $y_t$ is the t-th measured value, $\hat{y}_t$ is the corresponding predicted value, $\bar{y}$ is the mean of the measured values, and $r$ is the range of the measured variable.
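The four criteria above translate directly into code. A minimal sketch using NumPy (function names are ours):

```python
import numpy as np

def mae(y, y_hat):
    """Mean Absolute Error: sum of |y_t - y_hat_t| over n."""
    return np.mean(np.abs(y - y_hat))

def mre(y, y_hat):
    """Mean Relative Error: MAE normalised by the measured range r, in %."""
    r = y.max() - y.min()
    return mae(y, y_hat) / r * 100.0

def mape(y, y_hat):
    """Mean Absolute Percentage Error, in % (assumes y_t != 0)."""
    return np.mean(np.abs((y - y_hat) / y)) * 100.0

def r2(y, y_hat):
    """Coefficient of determination."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot
```

Note that MAPE is undefined for measured values equal to zero, which matters for variables such as solar radiation and PV generation at night.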
Although several studies are available in the field of load demand prediction, the majority consider buildings or sets of households. The scale we are considering is very important, as an aggregation of load demands or energy consumptions is usually much smoother than an individual one.
As an extreme example, Figure 29 illustrates the Portuguese electricity demand for two consecutive weeks some years ago [111]. The daily profile should be compared with the one shown in Figure 11, demonstrating that the aggregated profile of a (very large) number of consumers is much smoother than an individual one. The interested reader can inspect the forecasting performance of Portuguese load demand in that publication and in [107,108,112].
For this reason, we only consider publications that address forecasting energy consumption of single households. This is the case of [113], where SVR is applied to hourly and daily energy forecasting of 15 houses in Ontario, Canada, from 2014 to 2016. Focusing on the hourly resolution, the one-step MAPE of energy consumption varies from 23.3% to 67.96%, depending on the accuracy category (from good hourly and daily accuracy down to poor hourly and daily accuracy). In our approach, we forecast power demand in steps of 15 min and use the first four forecasts to calculate the hourly energy. Notice that, as we use four predictions to calculate one single result, the comparison is unfavourable to our approach. Even so, our hourly consumption MAPE is 16.5%, which is much better than the results obtained in [113].
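The hourly energy referred to above is obtained from four consecutive 15-min forecasts. Assuming the forecasts are average power in kW over each interval (an assumption on our part; the paper does not give this computation explicitly), a minimal sketch:

```python
import numpy as np

def hourly_energy_kwh(power_kw_15min):
    """Hourly energy from 15-min average-power forecasts.

    Each hour's energy is the sum of its four 15-min powers times
    the interval length: E = sum(P_i) * 0.25 h.
    """
    p = np.asarray(power_kw_15min, dtype=float)
    assert p.size % 4 == 0, "need a whole number of hours"
    return p.reshape(-1, 4).sum(axis=1) * 0.25

# a constant 2 kW over one hour yields 2 kWh
energy = hourly_energy_kwh([2.0, 2.0, 2.0, 2.0])
```

Because four multi-step power predictions feed each hourly energy value, forecast errors at steps 1-4 all contribute, which is why the comparison with one-step hourly models is unfavourable to this approach.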
Wen and co-workers [114] employed data from residential buildings (single-family homes, townhomes, and apartments) from the Dataport website [115]. They produced hourly and daily consumption forecasts for aggregated consumptions using Deep Recursive Neural Networks (RNNs) and Long Short-Term Memory (LSTM) models. The hourly MAPE for a single house ranged between 11.3% and 17.97%, depending on the deep model used. However, as the authors state, the best deep RNN model has the limitation of requiring future weather values, which were not forecasted in their work, in contrast to the present paper.
In relation to the PV power generated, Rana and co-workers [116] employed multi-step forecasts, as we do here. Forecasts between 5 min and 1 h obtained MREs between 4.2% and 9.3%. Our approach obtains MRE values between 2.35% and 2.45% for forecasts between 15 min and 1 h.
The following studies supplied only one-step-ahead forecasts; comparing them with multi-step-ahead forecasts is unfavourable to our approach, especially when the one-step-ahead forecasts correspond to many steps ahead of the multi-step forecasts. In [117], forecasts of up to 1 h are produced, with MAPE values between 24.7% and 37.8%; the approach proposed in this paper obtains a MAPE of 15.6% for a 1 h forecast. Hossain and Mahmood [118] also employed the MAPE criterion: for summer months and a 6 h prediction step, they obtained a MAPE of 28.6%, whereas our approach obtained 16% for a 24-steps-ahead prediction (6 h ahead). Publication [119] employed the R² criterion: for summer, which is the case considered here, their coefficient of determination ranges from 0.99 to 0.96 between the 15 min and 180 min prediction horizons, while in our application, for the same PHs, R² ranged from 0.99 to 0.98.
The current methodology can be applied to any EMS, such as the one recently presented by the authors in [109]. The prediction of the different variables feeds the model-based predictive control algorithm. Indeed, most intelligent EMSs, regardless of how they are designed or implemented, need to be fed with forecasted inputs that impact their control.
As mentioned by different authors cited in the literature review, one drawback of models with very high prediction accuracy is the large computational time needed to design them; for some models, due to their complexity, the execution time is also high. In our study, model design is computationally costly, but execution is extremely fast because the models are very simple.
The data collected by the project within which the present study was developed will be made publicly available soon.

6. Conclusions

In this paper, an ensemble of RBF models, designed by MOGA, for forecasting variables used in HEMS has been proposed. It has been shown that the ensemble forecasting results are always superior to those of single selected models, which were already excellent. A simple procedure was proposed for selecting the models used in the ensemble, together with a heuristic to determine the number of models.
The models designed in this paper will be used in the HEMS algorithm proposed in [109], which employed measured data to simulate forecasting. That HEMS employed the Branch and Bound (BAB) algorithm to implement a Model-Based Predictive Control scheme, and its execution did not consider any uncertainty in the data. The interquartile range of the model ensemble output, together with the median value, can be employed as a prediction interval; in this manner, uncertainty can be introduced naturally and efficiently into the BAB algorithm. Finally, ensemble models will be used to forecast load demand, separated into non-schedulable and schedulable equipment using NILM techniques. This will allow the scheduling of deferrable appliances.

Author Contributions

Conceptualization, A.R. and M.d.G.R.; methodology, A.R., K.B. and M.d.G.R.; formal analysis, A.R., K.B., S.S., I.L. and M.d.G.R.; investigation, A.R., K.B., S.S., I.L. and M.d.G.R.; data curation, A.R., K.B. and S.S.; writing—original draft preparation, A.R., K.B. and I.L.; writing—review and editing, A.R., M.d.G.R. and K.B.; supervision, A.R. and M.d.G.R.; project administration, A.R. and M.d.G.R.; funding acquisition, A.R. and M.d.G.R. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Programa Operacional Portugal 2020 and Operational Program CRESC Algarve 2020, grant numbers 39578/2018 and 72581/2020. Antonio Ruano also acknowledges the support of Fundação para a Ciência e Tecnologia, grant UID/EMS/50022/2020, through IDMEC under LAETA.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

ANN: Artificial Neural Networks
ARIMA: Autoregressive Integrated Moving Average
BAB: Branch and Bound
BPS: Building Performance Simulation
CH: Convex Hull
DC: Direct Current
ELM: Extreme Learning Machines
EMS: Energy Management Systems
GPR: Ground Penetrating Radar
HEMS: Home Energy Management Systems
HVAC: Heat Ventilation Air Conditioning
HW: Holt-Winters
k-NN: k-Nearest Neighbors
LR: Linear Regression
LSTM: Long Short-Term Memory
ML: Machine Learning
MOGA: Multi-Objective Genetic Algorithm
NAR: Nonlinear Autoregressive
NARX: Nonlinear Autoregressive Exogenous
NILM: Non-Intrusive Load Monitoring
NSGA-II: Nondominated Sorting Genetic Algorithm-II
OAKM: Optimal Adaptive K-Means
PH: Prediction Horizon
PV: Photovoltaics
RBF: Radial Basis Function
RMSE: Root Mean Square Error
RNN: Recursive Neural Networks
SPWS: Self-Powered Wireless Sensors
SVM: Support Vector Machine
SVR: Support Vector Regression
WBs: Wibees

References

  1. Chou, J.S.; Tran, D.S. Forecasting energy consumption time series using machine learning techniques based on usage patterns of residential householders. Energy 2018, 165, 709–726. [Google Scholar] [CrossRef]
  2. Lund, H.; Østergaard, P.A.; Connolly, D.; Mathiesen, B.V. Smart energy and smart energy systems. Energy 2017, 137, 556–565. [Google Scholar] [CrossRef]
  3. Lund, H. Renewable Energy Systems: A Smart Energy Systems Approach to the Choice and Modeling of 100% Renewable Solutions; Academic Press: Cambridge, MA, USA, 2014. [Google Scholar]
  4. Connolly, D.; Lund, H.; Mathiesen, B.V.; Østergaard, P.A.; Möller, B.; Nielsen, S.; Ridjan, I.; Hvelplund, F.; Sperling, K.; Karnøe, P. Smart Energy Systems: Holistic and Integrated Energy Systems for the Era of 100% Renewable Energy. 2013. Available online: https://vbn.aau.dk/en/publications/smart-energy-systems-holistic-and-integrated-energy-systems-for-t/ (accessed on 14 October 2021).
  5. Ahmad, T.; Chen, H. A review on machine learning forecasting growth trends and their real-time applications in different energy systems. Sustain. Cities Soc. 2020, 54, 102010. [Google Scholar] [CrossRef]
  6. Wang, H.Z.; Lei, Z.X.; Zhang, X.; Zhou, B.; Peng, J.C. A review of deep learning for renewable energy forecasting. Energy Convers. Manag. 2019, 198, 111799. [Google Scholar] [CrossRef]
  7. Liu, H.; Li, Y.; Duan, Z.; Chen, C. A review on multi-objective optimization framework in wind energy forecasting techniques and applications. Energy Convers. Manag. 2020, 224, 113324. [Google Scholar] [CrossRef]
  8. Sharma, V.; Cortes, A.; Cali, U. Use of Forecasting in Energy Storage Applications: A Review. IEEE Access 2021, 9, 114690–114704. [Google Scholar] [CrossRef]
  9. Ma, J.; Ma, X.D. A review of forecasting algorithms and energy management strategies for microgrids. Syst. Sci. Control Eng. 2018, 6, 237–248. [Google Scholar] [CrossRef]
  10. Walther, J.; Weigold, M. A Systematic Review on Predicting and Forecasting the Electrical Energy Consumption in the Manufacturing Industry. Energies 2021, 14, 968. [Google Scholar] [CrossRef]
  11. Heyets, V.M.; Kyrylenko, O.V.; Basok, B.I.; Baseyev, Y.T. The Energy Strategy: Forecasts and Reality (Review). Sci. Innov. 2020, 16, 3–14. [Google Scholar] [CrossRef]
  12. Debnath, K.B.; Mourshed, M. Forecasting methods in energy planning models. Renew. Sustain. Energy Rev. 2018, 88, 297–325. [Google Scholar] [CrossRef] [Green Version]
  13. Wei, N.; Li, C.J.; Peng, X.L.; Zeng, F.H.; Lu, X.Q. Conventional models and artificial intelligence-based models for energy consumption forecasting: A review. J. Pet. Sci. Eng. 2019, 181, 106187. [Google Scholar] [CrossRef]
  14. Paudel, S.; Elmitri, M.; Couturier, S.; Nguyen, P.H.; Kamphuis, R.; Lacarriere, B.; Le Corre, O. A relevant data selection method for energy consumption prediction of low energy building based on support vector machine. Energy Build. 2017, 138, 240–256. [Google Scholar] [CrossRef]
  15. Szul, T.; Necka, K.; Mathia, T.G. Neural Methods Comparison for Prediction of Heating Energy Based on Few Hundreds Enhanced Buildings in Four Season’s Climate. Energies 2020, 13, 5453. [Google Scholar] [CrossRef]
  16. Khan, A.N.; Iqbal, N.; Rizwan, A.; Ahmad, R.; Kim, D.H. An Ensemble Energy Consumption Forecasting Model Based on Spatial-Temporal Clustering Analysis in Residential Buildings. Energies 2021, 14, 3020. [Google Scholar] [CrossRef]
  17. Aslam, S.; Herodotou, H.; Mohsin, S.M.; Javaid, N.; Ashraf, N.; Aslam, S. A survey on deep learning methods for power load and renewable energy forecasting in smart microgrids. Renew. Sustain. Energy Rev. 2021, 144, 23. [Google Scholar] [CrossRef]
  18. Singaravel, S.; Suykens, J.; Geyer, P. Deep-learning neural-network architectures and methods: Using component based models in building-design energy prediction. Adv. Eng. Inform. 2018, 38, 81–90. [Google Scholar] [CrossRef]
  19. Truong, L.M.; Chow, K.H.K.; Luevisadpaibul, R.; Thirunavukkarasu, G.S.; Seyedmahmoudian, M.; Horan, B.; Mekhilef, S.; Stojcevski, A. Accurate Prediction of Hourly Energy Consumption in a Residential Building Based on the Occupancy Rate Using Machine Learning Approaches. Appl. Sci. 2021, 11, 2229. [Google Scholar] [CrossRef]
  20. Geyer, P.; Singaravel, S. Component-based machine learning for performance prediction in building design. Appl. Energy 2018, 228, 1439–1453. [Google Scholar] [CrossRef]
  21. Chatfield, C.; Xing, H. The Analysis of Time Series: An Introduction with R; Chapman and hall/CRC: Boca Raton, FL, USA, 2019. [Google Scholar]
  22. Chatfield, C. The Analysis of Time Series: Theory and Practice; Springer: Berlin/Heidelberg, Germany, 2013. [Google Scholar]
  23. Divina, F.; Garcia Torres, M.; Goméz Vela, F.A.; Vazquez Noguera, J.L. A comparative study of time series forecasting methods for short term electric energy consumption prediction in smart buildings. Energies 2019, 12, 1934. [Google Scholar] [CrossRef] [Green Version]
  24. Webby, R.; O’Connor, M. Judgemental and statistical time series forecasting: A review of the literature. Int. J. Forecast. 1996, 12, 91–118. [Google Scholar] [CrossRef]
  25. Kelleher, J.D.; Mac Namee, B.; D’Arcy, A. Fundamentals of Machine Learning for Predictive Data Analytics: Algorithms, Worked Examples, and Case Studies; MIT Press: Cambridge, MA, USA, 2020. [Google Scholar]
  26. Deisenroth, M.P.; Faisal, A.A.; Ong, C.S. Mathematics for Machine Learning; Cambridge University Press: Cambridge, UK, 2020. [Google Scholar]
  27. Shalev-Shwartz, S.; Ben-David, S. Understanding Machine Learning: From Theory to Algorithms; Cambridge University Press: Cambridge, UK, 2014. [Google Scholar]
  28. Khalil, A.J.; Barhoom, A.M.; Abu-Nasser, B.S.; Musleh, M.M.; Abu-Naser, S.S. Energy Efficiency Prediction using Artificial Neural Network. Int. J. Acad. Pedagog. Res. (IJAPR) 2019, 3, 1–7. [Google Scholar]
  29. Ahmad, M.W.; Mourshed, M.; Rezgui, Y. Trees vs Neurons: Comparison between random forest and ANN for high-resolution prediction of building energy consumption. Energy Build. 2017, 147, 77–89. [Google Scholar] [CrossRef]
  30. Li, K.; Xie, X.; Xue, W.; Dai, X.; Chen, X.; Yang, X. A hybrid teaching-learning artificial neural network for building electrical energy consumption prediction. Energy Build. 2018, 174, 323–334. [Google Scholar] [CrossRef]
  31. Bot, K.; Ruano, A.; Ruano, M.d.G. Short-Term Forecasting Photovoltaic Solar Power for Home Energy Management Systems. Inventions 2021, 6, 12. [Google Scholar] [CrossRef]
  32. Ruano, A.; Bot, K.; Ruano, M.G. Home Energy Management System in an Algarve residence. First results. In CONTROLO 2020: Proceedings of the 14th APCA International Conference on Automatic Control and Soft Computing; Lecture Notes in Electrical Engineering; Springer Science and Business Media Deutschland GmbH: Bragança, Portugal, 2021; Volume 695, pp. 332–341. [Google Scholar]
  33. Bot, K.; Ruano, A.; Ruano, M.G. Forecasting Electricity Demand in Households using MOGA-designed Artificial Neural Networks. In Proceedings of the 21st IFAC World Congress, Berlin, Germany, 12–17 July 2020. [Google Scholar]
  34. Al-Dahidi, S.; Ayadi, O.; Alrbai, M.; Adeeb, J. Ensemble Approach of Optimized Artificial Neural Networks for Solar Photovoltaic Power Prediction. IEEE Access 2019, 7, 81741–81758. [Google Scholar] [CrossRef]
  35. Zhong, H.; Wang, J.; Jia, H.; Mu, Y.; Lv, S. Vector field-based support vector regression for building energy consumption prediction. Appl. Energy 2019, 242, 403–414. [Google Scholar] [CrossRef]
  36. Koschwitz, D.; Frisch, J.; Van Treeck, C. Data-driven heating and cooling load predictions for non-residential buildings based on support vector machine regression and NARX Recurrent Neural Network: A comparative study on district scale. Energy 2018, 165, 134–142. [Google Scholar] [CrossRef]
  37. Li, Y.; Cao, L.; Han, Y.; Shi, Y.; Zhang, Y. Short-Term Electric Load Forecasting with a Hybrid ARIMA, SVR, and IA Methodology; American Society of Civil Engineers: Reston, VA, USA, 2020; pp. 166–175. [Google Scholar]
  38. Nepal, B.; Yamaha, M.; Yokoe, A.; Yamaji, T. Electricity load forecasting using clustering and ARIMA model for energy management in buildings. Jpn. Archit. Rev. 2020, 3, 62–76. [Google Scholar] [CrossRef] [Green Version]
  39. Jagait, R.K.; Fekri, M.N.; Grolinger, K.; Mir, S. Load Forecasting Under Concept Drift: Online Ensemble Learning With Recurrent Neural Network and ARIMA. IEEE Access 2021, 9, 98992–99008. [Google Scholar] [CrossRef]
  40. Kandananond, K. Electricity Demand Forecasting in Buildings Based on ARIMA and ARX Models. In Proceedings of the 8th International Conference on Informatics, Environment, Energy and Applications, Osaka, Japan, 19 March 2019; pp. 268–271. [Google Scholar]
  41. Wang, Z.; Srinivasan, R.S. A review of artificial intelligence based building energy use prediction: Contrasting the capabilities of single and ensemble prediction models. Renew. Sustain. Energy Rev. 2017, 75, 796–808. [Google Scholar] [CrossRef]
  42. Tran, D.-H.; Luong, D.-L.; Chou, J.-S. Nature-inspired metaheuristic ensemble model for forecasting energy consumption in residential buildings. Energy 2020, 191, 116552. [Google Scholar] [CrossRef]
  43. Bontempi, G.; Taieb, S.B.; Le Borgne, Y.-A. Machine Learning Strategies for Time Series Forecasting; Springer: Berlin/Heidelberg, Germany, 2012; pp. 62–77. [Google Scholar]
  44. Ahmed, N.K.; Atiya, A.F.; Gayar, N.E.; El-Shishiny, H. An empirical comparison of machine learning models for time series forecasting. Econom. Rev. 2010, 29, 594–621. [Google Scholar] [CrossRef]
  45. Seyedzadeh, S.; Rahimian, F.P.; Glesk, I.; Roper, M. Machine learning for estimation of building energy consumption and performance: A review. Vis. Eng. 2018, 6, 1–20. [Google Scholar] [CrossRef]
  46. Mosavi, A.; Bahmani, A. Energy Consumption Prediction Using Machine Learning; a Review. Preprints 2019, 2019030131. [Google Scholar] [CrossRef]
  47. Fathi, S.; Srinivasan, R.; Fenner, A.; Fathi, S. Machine learning applications in urban building energy performance forecasting: A systematic review. Renew. Sustain. Energy Rev. 2020, 133, 110287. [Google Scholar] [CrossRef]
  48. Mariano-Hernández, D.; Hernández-Callejo, L.; García, F.S.; Duque-Perez, O.; Zorita-Lamadrid, A.L. A Review of Energy Consumption Forecasting in Smart Buildings: Methods, Input Variables, Forecasting Horizon and Metrics. Appl. Sci. 2020, 10, 8323. [Google Scholar] [CrossRef]
  49. Ahmad, T.; Zhang, H.C. Novel deep supervised ML models with feature selection approach for large-scale utilities and buildings short and medium-term load requirement forecasts. Energy 2020, 209, 16. [Google Scholar] [CrossRef]
  50. Chou, J.S.; Truong, D.N. Multistep energy consumption forecasting by metaheuristic optimization of time-series analysis and machine learning. Int. J. Energy Res. 2021, 45, 4581–4612. [Google Scholar] [CrossRef]
  51. Liu, C.; Sun, B.; Zhang, C.H.; Li, F. A hybrid prediction model for residential electricity consumption using holt-winters and extreme learning machine. Appl. Energy 2020, 275, 15. [Google Scholar] [CrossRef]
  52. Li, X.Y.; Yao, R.M. Modelling heating and cooling energy demand for building stock using a hybrid approach. Energy Build. 2021, 235, 110740. [Google Scholar] [CrossRef]
  53. Wenninger, S.; Wiethe, C. Benchmarking Energy Quantification Methods to Predict Heating Energy Performance of Residential Buildings in Germany. Bus. Inform. Syst. Eng. 2021, 63, 223–242. [Google Scholar] [CrossRef]
  54. Ullah, F.U.M.; Ullah, A.; Ul Haq, I.; Rho, S.; Baik, S.W. Short-Term Prediction of Residential Power Energy Consumption via CNN and Multi-Layer Bi-Directional LSTM Networks. IEEE Access 2020, 8, 123369–123380. [Google Scholar] [CrossRef]
  55. Papadopoulos, S.; Azar, E.; Woon, W.L.; Kontokosta, C.E. Evaluation of tree-based ensemble learning algorithms for building energy performanceestimation. J. Build. Perform. Simul. 2018, 11, 322–332. [Google Scholar] [CrossRef]
  56. Szul, T.; Tabor, S.; Pancerz, K. Application of the BORUTA Algorithm to Input Data Selection for a Model Based on Rough Set Theory (RST) to Prediction Energy Consumption for Building Heating. Energies 2021, 14, 2779. [Google Scholar] [CrossRef]
  57. Rahman, A.; Srikumar, V.; Smith, A.D. Predicting electricity consumption for commercial and residential buildings using deep recurrent neural networks. Appl. Energy 2018, 212, 372–385. [Google Scholar] [CrossRef]
  58. Gong, M.J.; Wang, J.; Bai, Y.; Li, B.; Zhang, L. Heat load prediction of residential buildings based on discrete wavelet transform and tree-based ensemble learning. J. Build. Eng. 2020, 32, 12. [Google Scholar] [CrossRef]
  59. Bassamzadeh, N.; Ghanem, R. Multiscale stochastic prediction of electricity demand in smart grids using Bayesian networks. Appl. Energy 2017, 193, 369–380. [Google Scholar] [CrossRef]
  60. Jin, X.; Baker, K.; Christensen, D.; Isley, S. Foresee: A user-centric home energy management system for energy efficiency and demand response. Appl. Energy 2017, 205, 1583–1595. [Google Scholar] [CrossRef]
  61. Mawson, V.J.; Hughes, B. Coupling simulation with artificial neural networks for the optimisation of HVAC controls in manufacturing environments. Optim. Eng. 2021, 22, 103–119. [Google Scholar] [CrossRef]
  62. Movahedi, A.; Derrible, S. Interrelationships between electricity, gas, and water consumption in large-scale buildings. J. Ind. Ecol. 2021, 25, 932–947. [Google Scholar] [CrossRef]
  63. Kaur, J.; Bala, A. Predicting power for home appliances based on climatic conditions. Int. J. Energy Sect. Manag. 2019, 13, 610–629. [Google Scholar] [CrossRef]
  64. Shen, M.; Lu, Y.J.; Wei, K.H.; Cui, Q.B. Prediction of household electricity consumption and effectiveness of concerted intervention strategies based on occupant behaviour and personality traits. Renew. Sustain. Energy Rev. 2020, 127, 109839. [Google Scholar] [CrossRef]
  65. Liu, S.; Zeng, A.; Lau, K.; Ren, C.; Chan, P.W.; Ng, E. Predicting long-term monthly electricity demand under future climatic and socioeconomic changes using data-driven methods: A case study of Hong Kong. Sustain. Cities Soc. 2021, 70, 102936. [Google Scholar] [CrossRef]
  66. Honda. Honda Smart Home US. Available online: https://www.hondasmarthome.com (accessed on 14 October 2019).
  67. Leitao, J.; Fonseca, C.M.; Gil, P.; Ribeiro, B.; Cardoso, A. A Compressive Receding Horizon Approach for Smart Home Energy Management. IEEE Access 2021, 9, 100407–100435. [Google Scholar] [CrossRef]
  68. Huang, H.T.; Xu, H.; Cai, Y.H.; Khalid, R.S.; Yu, H. Distributed Machine Learning on Smart-Gateway Network toward Real-Time Smart-Grid Energy Management with Behavior Cognition. ACM Transact. Des. Automat. Electron. Syst. 2018, 23, 26. [Google Scholar] [CrossRef]
  69. Aurangzeb, K. Short Term Power Load Forecasting using Machine Learning Models for energy management in a smart community. In Proceedings of the 2019 International Conference on Computer and Information Sciences (ICCIS), Sakaka, Saudi Arabia, 3–4 April 2019; pp. 1–6. [Google Scholar]
  70. Zaouali, K.; Rekik, R.; Bouallegue, R. Deep learning forecasting based on auto-lstm model for home solar power systems. In Proceedings of the 2018 IEEE 20th International Conference on High Performance Computing and Communications, Exeter, UK, 28–30 June 2018; pp. 235–242. [Google Scholar]
  71. Shakir, M.; Biletskiy, Y. Forecasting and optimisation for microgrid in home energy management systems. IET Gener. Transm. Distrib. 2020, 14, 3458–3468. [Google Scholar] [CrossRef]
  72. Ahmadiahangar, R.; Häring, T.; Rosin, A.; Korõtko, T.; Martins, J. Residential load forecasting for flexibility prediction using machine learning-based regression model. In Proceedings of the 2019 IEEE International Conference on Environment and Electrical Engineering and 2019 IEEE Industrial and Commercial Power Systems Europe (EEEIC / I&CPS Europe), Genova, Italy, 11–14 June 2019; pp. 1–4. [Google Scholar]
  73. Rajasekaran, R.G.; Manikandaraj, S.; Kamaleshwar, R. Implementation of machine learning algorithm for predicting user behavior and smart energy management. In Proceedings of the 2017 International Conference on Data Management, Analytics and Innovation (ICDMAI), Pune, India, 24–26 February 2017; pp. 24–30. [Google Scholar]
  74. Koltsaklis, N.; Panapakidis, I.P.; Pozo, D.; Christoforidis, G.C. A Prosumer Model Based on Smart Home Energy Management and Forecasting Techniques. Energies 2021, 14, 1724. [Google Scholar] [CrossRef]
  75. Fan, L.; Li, J.; Zhang, X.-P. Load prediction methods using machine learning for home energy management systems based on human behavior patterns recognition. CSEE J. Power Energy Syst. 2020, 6, 563–571. [Google Scholar]
  76. Li, W.; Logenthiran, T.; Phan, V.-T.; Woo, W.L. Implemented IoT-based self-learning home management system (SHMS) for Singapore. IEEE Internet Things J. 2018, 5, 2212–2219. [Google Scholar] [CrossRef]
  77. Khan, M.; Seo, J.; Kim, D. Towards Energy Efficient Home Automation: A Deep Learning Approach. Sensors 2020, 20, 7187. [Google Scholar] [CrossRef] [PubMed]
  78. Arens, S.; Derendorf, K.; Schuldt, F.; Maydell, K.V.; Agert, C. Effect of EV movement schedule and machine learning-based load forecasting on electricity cost of a single household. Energies 2018, 11, 2913. [Google Scholar] [CrossRef] [Green Version]
  79. Prophet. Prophet, Forecasting at Scale. Available online: https://facebook.github.io/prophet/ (accessed on 2 November 2021).
  80. AtsPy. AtsPy: Automated Time Series Models in Python. Available online: https://github.com/firmai/atspy (accessed on 2 November 2021).
  81. Yuan, X.M.; Han, P.; Duan, Y.; Alden, R.E.; Rallabandi, V.; Ionel, D.M. Residential Electrical Load Monitoring and Modeling—State of the Art and Future Trends for Smart Homes and Grids. Electr. Power Compon. Syst. 2020, 48, 1125–1143. [Google Scholar] [CrossRef]
  82. Laouali, I.H.; Qassemi, H.; Marzouq, M.; Ruano, A.; Bennani, S.D.; El Fadili, H. A Survey on Computational Intelligence Techniques For Non Intrusive Load Monitoring. In Proceedings of the 2020 IEEE 2nd International Conference on Electronics, Control, Optimization and Computer Science (ICECOCS), Kenitra, Morocco, 2–3 December 2020; pp. 1–6. [Google Scholar]
  83. Ruano, A.; Hernandez, A.; Ureña, J.; Ruano, M.; Garcia, J. NILM Techniques for Intelligent Home Energy Management and Ambient Assisted Living: A Review. Energies 2019, 12, 2203. [Google Scholar] [CrossRef] [Green Version]
  84. Hosseini, S.S.; Agbossou, K.; Kelouwani, S.; Cardenas, A. Non-intrusive load monitoring through home energy management systems: A comprehensive review. Renew. Sustain. Energy Rev. 2017, 79, 1266–1274. [Google Scholar] [CrossRef]
  85. Zeifman, M.; Roth, K. Nonintrusive appliance load monitoring: Review and outlook. IEEE Trans. Consum. Electron. 2011, 57, 76–84. [Google Scholar] [CrossRef]
  86. Lemes, D.A.M.; Cabral, T.W.; Fraidenraich, G.; Meloni, L.G.P.; De Lima, E.R.; Neto, F.B. Load Disaggregation Based on Time Window for HEMS Application. IEEE Access 2021, 9, 70746–70757. [Google Scholar] [CrossRef]
  87. Lin, Y.-H.; Tsai, M.-S. An advanced home energy management system facilitated by nonintrusive load monitoring with automated multiobjective power scheduling. IEEE Trans. Smart Grid 2015, 6, 1839–1851. [Google Scholar] [CrossRef]
  88. Zhai, S.; Wang, Z.; Yan, X.; He, G. Appliance flexibility analysis considering user behavior in home energy management system using smart plugs. IEEE Trans. Ind. Electron. 2018, 66, 1391–1401. [Google Scholar] [CrossRef]
  89. Khosravani, H.R.; Ruano, A.E.; Ferreira, P.M. A convex hull-based data selection method for data driven models. Appl. Soft Comput. 2016, 47, 515–533. [Google Scholar] [CrossRef]
  90. Ruano, A.E.; Pesteh, S.; Silva, S.; Duarte, H.; Mestre, G.; Ferreira, P.M.; Khosravani, H.R.; Horta, R. The IMBPC HVAC system: A complete MBPC solution for existing HVAC systems. Energy Build. 2016, 120, 145–158. [Google Scholar] [CrossRef]
  91. Siemens Smart Infrastructure. Energy Intelligence—Tapping the Potential of a Smart Energy World; Siemens Switzerland Ltd.: Zug, Switzerland, 2020. [Google Scholar]
  92. Gordillo-Orquera, R.; Lopez-Ramos, L.M.; Muñoz-Romero, S.; Iglesias-Casarrubios, P.; Arcos-Avilés, D.; Marques, A.G.; Rojo-Álvarez, J.L. Analyzing and Forecasting Electrical Load Consumption in Healthcare Buildings. Energies 2018, 11, 493. [Google Scholar] [CrossRef] [Green Version]
  93. Ferreira, P.; Ruano, A. Evolutionary Multiobjective Neural Network Models Identification: Evolving Task-Optimised Models. In New Advances in Intelligent Signal Processing; Ruano, A., Várkonyi-Kóczy, A., Eds.; Springer: Berlin/Heidelberg, Germany, 2011; Volume 372, pp. 21–53. [Google Scholar]
  94. Levenberg, K. A method for the solution of certain problems in least squares. Q. Appl. Math. 1944, 2, 164–168. [Google Scholar] [CrossRef] [Green Version]
  95. Marquardt, D. An algorithm for least-squares estimation of nonlinear parameters. SIAM J. Appl. Math. 1963, 11, 431–441. [Google Scholar] [CrossRef]
  96. Ruano, A.E.B.; Jones, D.I.; Fleming, P.J. A New Formulation of the Learning Problem for a Neural Network Controller. In Proceedings of the 30th IEEE Conference on Decision and Control, Brighton, UK, 11–13 December 1991; pp. 865–866. [Google Scholar]
  97. Chinrunngrueng, C.; Séquin, C.H. Optimal adaptive k-means algorithm with dynamic adjustment of learning rate. IEEE Trans. Neural Netw. 1995, 6, 157–169. [Google Scholar] [CrossRef] [PubMed]
  98. Lineros, M.L.; Luna, A.M.; Ferreira, P.M.; Ruano, A.E. Optimized Design of Neural Networks for a River Water Level Prediction System. Sensors 2021, 21, 6504. [Google Scholar] [CrossRef] [PubMed]
  99. Sharp NU-AK PV Panels. Available online: https://www.sharp.co.uk/cps/rde/xchg/gb/hs.xsl/-/html/product-details-solar-modules-2189.htm?product=NUAK300B (accessed on 14 October 2021).
  100. Kostal Plenticore Plus Inverter. Available online: https://www.kostal-solar-electric.com/en-gb/products/hybrid-inverters/plenticore-plus (accessed on 14 October 2021).
  101. BYD Battery Box HV. Available online: https://www.eft-systems.de/en/The%20B-BOX/product/Battery%20Box%20HV/3 (accessed on 14 October 2021).
  102. Mestre, G.; Ruano, A.; Duarte, H.; Silva, S.; Khosravani, H.; Pesteh, S.; Ferreira, P.; Horta, R. An Intelligent Weather Station. Sensors 2015, 15, 31005–31022. [Google Scholar] [CrossRef]
  103. TP-Link WiFi Smart Plugs. Available online: https://www.tp-link.com/pt/home-networking/smart-plug/hs100/ (accessed on 14 October 2021).
  104. Ruano, A.; Silva, S.; Duarte, H.; Ferreira, P.M. Wireless Sensors and IoT Platform for Intelligent HVAC Control. Appl. Sci. 2018, 8, 370. [Google Scholar] [CrossRef] [Green Version]
  105. Carlo Gavazzi EM340. Available online: https://www.carlogavazzi.co.uk/blog/carlo-gavazzi-energy-solutions/em340-utilises-touchscreen-technology (accessed on 14 October 2021).
  106. Wibeee Consumption Analyzers. Available online: http://circutor.com/en/products/measurement-and-control/fixed-power-analyzers/consumption-analyzers (accessed on 14 October 2021).
  107. Ferreira, P.M.; Ruano, A.E.; Pestana, R.; Koczy, L.T. Evolving RBF predictive models to forecast the Portuguese electricity consumption. IFAC Proc. 2009, 42, 414–419. [Google Scholar] [CrossRef]
  108. Ferreira, P.M.; Pestana, R.; Ruano, A.E. Improving the Identification of RBF Predictive Models to Forecast the Portuguese Electricity Consumption. IFAC Proc. 2010, 1, 208–213. [Google Scholar] [CrossRef]
  109. Bot, K.; Laouali, I.; Ruano, A.; Ruano, M.d.G. Home Energy Management Systems with Branch-and-Bound Model-Based Predictive Control Techniques. Energies 2021, 14, 5852. [Google Scholar] [CrossRef]
110. Ruano, A.; Bot, K.; Ruano, M.d.G. The Impact of Occupants in Thermal Comfort and Energy Efficiency in Buildings. In Occupant Behaviour in Buildings: Advances and Challenges; Bentham Science: Sharjah, United Arab Emirates, 2021; Volume 6, pp. 101–137. [Google Scholar]
111. Ferreira, P.M.; Ruano, A.E.; Pestana, R. Towards Online Operation of a RBF Neural Network Model to Forecast the Portuguese Electricity Consumption. In Proceedings of the 2011 IEEE 7th International Symposium on Intelligent Signal Processing (WISP), Floriana, Malta, 19–21 September 2011. [Google Scholar]
  112. Ferreira, P.M.; Cuambe, I.D.; Ruano, A.E.; Pestana, R. Forecasting the Portuguese Electricity Consumption using Least-Squares Support Vector Machines. IFAC Proc. 2013, 3, 411–416. [Google Scholar] [CrossRef]
  113. Zhang, X.M.; Grolinger, K.; Capretz, M.A.M.; Seewald, L. Forecasting Residential Energy Consumption: Single Household Perspective. In Proceedings of the 17th IEEE International Conference on Machine Learning and Applications (IEEE ICMLA), Orlando, FL, USA, 17–20 December 2018; pp. 110–117. [Google Scholar]
  114. Wen, L.; Zhou, K.; Yang, S. Load demand forecasting of residential buildings using a deep learning model. Electr. Power Syst. Res. 2020, 179, 106073. [Google Scholar] [CrossRef]
  115. Pecan Street Inc. Dataport. Available online: https://www.pecanstreet.org/dataport/ (accessed on 14 October 2021).
  116. Rana, M.; Rahman, A. Multiple steps ahead solar photovoltaic power forecasting based on univariate machine learning models and data re-sampling. Sustain. Energy Grids Netw. 2020, 21, 100286. [Google Scholar] [CrossRef]
  117. Hossain, M.S.; Mahmood, H. Short-Term Photovoltaic Power Forecasting Using an LSTM Neural Network and Synthetic Weather Forecast. IEEE Access 2020, 8, 172524–172533. [Google Scholar] [CrossRef]
  118. Wang, K.; Qi, X.; Liu, H. A comparison of day-ahead photovoltaic power forecasting models based on deep learning neural network. Appl. Energy 2019, 251, 113315. [Google Scholar] [CrossRef]
  119. Li, G.; Xie, S.; Wang, B.; Xin, J.; Li, Y.; Du, S. Photovoltaic Power Forecasting with a Hybrid Deep Learning Approach. IEEE Access 2020, 8, 175871–175880. [Google Scholar] [CrossRef]
Figure 1. Overall structure of single, ensemble, and hybrid models. Adapted from [1].
Figure 2. Schematic diagram of the acquisition system.
Figure 3. Daily energy consumption.
Figure 4. PV power generated.
Figure 5. Peak sunshine hours.
Figure 6. Mean daily temperature.
Figure 7. Occupation.
Figure 8. Day encoding.
Figure 9. Histogram of the lags’ usage in the non-dominated M1 models: (a) PD; (b) T.
Figure 10. RMSE over the prediction horizon—M1.
Figure 11. Measured (blue) and one-step-ahead predicted (red) load demand.
Figure 12. Histogram of the lags’ usage in the non-dominated M2 models.
Figure 13. Measured (blue) and one-step-ahead predicted (red) R.
Figure 14. Evolution of RMSE over the prediction horizon—M2.
Figure 15. Histogram of the lags’ usage in the non-dominated M3 models.
Figure 16. Evolution of RMSE over the prediction horizon—M3.
Figure 17. Measured (blue) and one-step-ahead predicted (red) air temperature.
Figure 18. Histogram of the lags’ usage in the non-dominated M4 models: (a) PG; (b) R; (c) T.
Figure 19. Evolution of RMSE over the prediction horizon—M4.
Figure 20. Measured (blue) and one-step-ahead predicted (red) PV power generated.
Figure 21. Load demand detail. Measured values (red) and one-step-ahead box plot of {M1_εPH^MOGA(25)} (blue).
Figure 22. Evolution of RMSE over the prediction horizon—M1. Selected model (red) and {M1_εPH^MOGA(25)} (blue).
Figure 23. Solar irradiance detail. Measured values (red) and one-step-ahead box plot of {M2_εPH^MOGA(25)} (blue).
Figure 24. Evolution of RMSE over the prediction horizon—M2. Selected model (red) and {M2_εPH^MOGA(25)} (blue).
Figure 25. Air temperature detail. Measured values (red) and one-step-ahead box plot of {M3_εPH^MOGA(25)} (blue).
Figure 26. Evolution of RMSE over the prediction horizon—M3. Selected model (red) and {M3_εPH^MOGA(25)} (blue).
Figure 27. Power generated detail. Measured values (red) and one-step-ahead box plot of {M4_εPH^MOGA(25)} (blue).
Figure 28. Evolution of RMSE over the prediction horizon—M4. Selected model (red) and {M4_εPH^MOGA(25)} (blue).
Figure 29. Snapshot of electricity consumption for Portugal for two consecutive weeks [111].
Table 1. Day encoding.

Day of the Week    Regular Day    Holiday    Special
Monday             0.05           0.40       0.70
Tuesday            0.10                      0.80
Wednesday          0.15                      0.50
Thursday           0.20                      1.00
Friday             0.25           0.60       0.90
Saturday           0.30                      0.30
Sunday             0.35                      0.35
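The day encoding of Table 1 can be expressed as a small lookup helper. The sketch below is purely illustrative (the dictionary and function names are hypothetical, and blank cells in the table are treated as undefined):

```python
# Day encoding from Table 1; None marks cells left blank in the table.
DAY_ENCODING = {
    "Monday":    {"regular": 0.05, "holiday": 0.40, "special": 0.70},
    "Tuesday":   {"regular": 0.10, "holiday": None, "special": 0.80},
    "Wednesday": {"regular": 0.15, "holiday": None, "special": 0.50},
    "Thursday":  {"regular": 0.20, "holiday": None, "special": 1.00},
    "Friday":    {"regular": 0.25, "holiday": 0.60, "special": 0.90},
    "Saturday":  {"regular": 0.30, "holiday": None, "special": 0.30},
    "Sunday":    {"regular": 0.35, "holiday": None, "special": 0.35},
}

def encode_day(day: str, kind: str = "regular") -> float:
    """Return the scalar code used as a model input for the given day type."""
    code = DAY_ENCODING[day][kind]
    if code is None:
        raise ValueError(f"No {kind} code defined for {day} in Table 1")
    return code
```

Encoding days as ordered scalars (rather than one-hot vectors) keeps the model input dimension small, which matters for the lag-selection procedure described in the paper.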
Table 2. Statistics for M1.

Execution    ε_Tr    ε_Te    ε_V     ε_PH^VAL
1st          0.14    0.12    0.12    4.81
2nd          0.15    0.12    0.12    4.79
Table 3. Statistics for M2.

Execution    ε_Tr    ε_Te    ε_V     ε_PH^VAL
1st          0.11    0.08    0.08    2.58
2nd          0.11    0.08    0.08    2.58
Table 4. Statistics for M3.

Execution    ε_Tr    ε_Te    ε_V     ε_PH^VAL
1st          0.02    0.02    0.02    2.72
2nd          0.02    0.02    0.02    2.70
Table 5. Statistics for M4.

Execution    ε_Tr    ε_Te    ε_V     ε_PH^VAL
1st          0.05    0.04    0.05    1.69
2nd          0.06    0.04    0.05    1.67
Table 6. Forecasting performances for M1.

Ensemble              ε̄_PH^VAL 25%    ε̄_PH^VAL 50%    ε̄_PH^VAL 75%    75% − 25%
{M1_εPH^MOGA(10)}     4.14             4.61             5.14             1.00
{M1_εPH^MOGA(25)}     4.16             4.61             5.10             0.94
{M1_εPH^MOGA(50)}     4.17             4.62             5.10             0.93
{M1_w2(10)}           4.04             4.67             5.53             1.49
{M1_w2(25)}           4.04             4.67             5.53             1.49
{M1_w2(50)}           4.18             4.65             5.17             0.99
Table 7. Forecasting performances for M2.

Ensemble              ε̄_PH^VAL 25%    ε̄_PH^VAL 50%    ε̄_PH^VAL 75%    75% − 25%
{M2_εPH^MOGA(10)}     2.27             2.46             2.66             0.35
{M2_εPH^MOGA(25)}     2.26             2.47             2.70             0.44
{M2_εPH^MOGA(50)}     2.25             2.48             2.72             0.47
{M2_w2(10)}           2.15             2.55             3.07             0.92
{M2_w2(25)}           2.25             2.53             2.84             0.59
{M2_w2(50)}           2.28             2.54             2.80             0.52
Table 8. Forecasting performances for M3.

Ensemble              ε̄_PH^VAL 25%    ε̄_PH^VAL 50%    ε̄_PH^VAL 75%    75% − 25%
{M3_εPH^MOGA(10)}     2.38             2.59             2.84             0.46
{M3_εPH^MOGA(25)}     2.40             2.58             2.78             0.38
{M3_εPH^MOGA(50)}     2.42             2.59             2.78             0.36
{M3_w2(10)}           2.35             2.62             2.92             0.57
{M3_w2(25)}           2.35             2.59             2.89             0.54
{M3_w2(50)}           2.38             2.59             2.85             0.47
Table 9. Forecasting performances for M4.

Ensemble              ε̄_PH^VAL 25%    ε̄_PH^VAL 50%    ε̄_PH^VAL 75%    75% − 25%
{M4_εPH^MOGA(10)}     1.19             1.44             1.75             0.56
{M4_εPH^MOGA(25)}     1.14             1.41             1.73             0.59
{M4_εPH^MOGA(50)}     1.09             1.38             1.74             0.65
{M4_w2(10)}           1.24             1.65             2.16             0.92
{M4_w2(25)}           1.18             1.56             2.05             0.87
{M4_w2(50)}           1.15             1.51             2.08             0.87
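The quartile columns reported in Tables 6–9 can be computed with a few lines of NumPy. This is a minimal sketch, assuming `eps_ph_val` holds one average prediction-horizon RMSE per ensemble realization (the function name is hypothetical, not from the paper):

```python
import numpy as np

def quartile_summary(eps_ph_val: np.ndarray) -> dict:
    """Summarize ensemble validation errors as in Tables 6-9:
    25th, 50th and 75th percentiles plus the inter-quartile spread."""
    q25, q50, q75 = np.percentile(eps_ph_val, [25, 50, 75])
    return {"25%": q25, "50%": q50, "75%": q75, "spread": q75 - q25}
```

A small inter-quartile spread indicates that the ensemble's performance is stable across realizations, which is the basis for comparing the εPH-based and w2-based model selection criteria in the tables above.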
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Bot, K.; Santos, S.; Laouali, I.; Ruano, A.; Ruano, M.d.G. Design of Ensemble Forecasting Models for Home Energy Management Systems. Energies 2021, 14, 7664. https://doi.org/10.3390/en14227664
