1. Introduction
The growth and development rates of countries around the world have been increasing significantly year after year [
1]. The extraordinary increase in the global population, related to economic advancement, industrialization, social advances, and expectations of prosperity, has had a significant impact on energy and environmental issues [
2]. As the human population grows, and with it the demand for housing and well-being, the development of countries and societies will also continue to advance. Sustaining this growth and development will require more energy generation to meet global demand while, at the same time, keeping the environment safe [
1,
2]. Population growth and changing consumption patterns have driven up energy consumption, which has been growing at a high annual rate. According to the U.S. Energy Information Administration (EIA), energy consumption is expected to grow by more than 50% by 2050 [
3]. Energy plays a vital role in the progress and socio-economic development of a country [
4]. One criterion for assessing the rate of development and the welfare rate of a country is to compute the amount of energy it consumes [
4,
5,
6].
The world, and developing countries in particular, faces a future increase in energy demand, as population growth and the demand for better welfare conditions put strong pressure on societies, deepen their dependence on fossil fuels, and intensify the challenges associated with climate change [
1]. This enormous growth is driven by the growing industrial revolution and urbanization [
7]. The introduction of new and advanced technologies increases dependence on electricity [
5]. The increase in population requires a rapid expansion of residential and commercial areas, which also contributes to an increase in energy consumption. At the same time, environmental issues should be considered in this development to reduce pollution, carbon footprints, and greenhouse effects [
1].
The latest annual report from the International Energy Agency (IEA) notes that electricity demand is growing faster in developing countries than in developed countries. Meeting such diversified demand therefore calls for robust, high-performance forecasting methods that support adequate power generation planning [
3].
To meet the objectives of the Paris Agreement and reduce Greenhouse Gas (GHG) emissions, it is essential to move towards a low-carbon energy system [
4]. The International Energy Agency (IEA) has identified energy efficiency as one of the measures to ensure the long-term decarbonization of the energy sector [
2]. One of the main solutions to reduce emissions is to reduce the intensity of primary energy through energy efficiency [
5] and electrification—based on renewable energy sources (RES)—a solution increasingly adopted for the industrial, commercial, residential, and transport sectors.
To mitigate the effects of climate change and protect the environment, developed countries have set themselves the goal of increasing their share of RES in energy production [
6,
8,
9]. Renewable energies such as wind and solar, whose penetration is increasing, are stochastic in nature. Most RES are variable and intermittent, which makes their power output difficult to predict (i.e., it depends on solar irradiation or wind speed). These features make the operation and management of energy systems more challenging because more flexibility is needed to safeguard their normal operation and stability [
9,
10].
However, because the production of renewable energy from sources such as wind or sun is intermittent, accurate forecasts of non-controllable electrical loads are needed to flexibly manage and achieve the balance between demand and supply. Therefore, in the ongoing energy transition, significant investments and developments in research are underway to quickly harness electricity generated from renewable sources [
2].
In addition, the coronavirus disease 2019 (COVID-19) pandemic had a notable and unexpected impact on electricity consumption patterns in several sectors, especially the residential sector. Following the guidance of the World Health Organization (WHO), people spent more time at home, particularly in developed countries, which raised electricity consumption in the domestic sector above past levels [
11].
People spend about 90% of their time in buildings, and 80 to 90% of the overall electricity consumption throughout the life cycle of a building is intended to fulfil internal tasks and building operations. Thus, residential buildings have become the largest consumers of electricity in the world, representing 39% of electricity consumption and 38% of greenhouse gas emissions [
2]. In Europe, energy consumption and total carbon dioxide emissions from buildings account for 40% and 36%, respectively [
5].
The scarcity of energy supply, increased global energy demand, pollution, and GHG emissions, together with the lack of studies on sustainable and clean energy systems, have led researchers to conclude that the management of electricity consumption is a key issue [
2]. Electricity is a necessity of life and represents a great challenge to all countries. Increasing electricity consumption at an alarming rate has become a challenge for distribution companies [
5]. Energy institutions and organizations in each country need to match supply to demand at competitive prices, avoid wasting resources, and deliver energy safely to their economy and their citizens [
12,
13].
These circumstances indicate the essential need for an effective electric load forecasting system to accommodate several uncertainties associated with changing end-user-related load consumption behavior and RES intermittency [
6,
8,
10]. In addition, an effective load forecasting system can help maintain the reliability, efficiency, and sustainability of the energy system by establishing more effective management, developing future economic planning, and promoting support systems for the decision-making of emerging generations. In the context of the impressive digital revolution promoted by Industry 4.0, several intelligent methods of load forecasting have recently been developed [
9].
The forecasting of the demand for resources within a power distribution network has relevant importance for the management of the limited availability of assets. An accurate load forecasting system can reduce the high cost of excessive contracts and subcontracts in balance markets due to load forecasting errors [
14]. Electricity demand forecasting has been decisive in electricity production capacity, transmission planning, and pricing [
12]. In addition, it keeps energy markets efficient and provides a better understanding of the dynamics of the monitored system [
14].
Short-term load forecasting (STLF) allows the power grid industry to manage many operating decisions. To this end, STLF is often used to improve system operations and provide reliable services [
2], being an active area of research, namely in the residential sector, where high electricity consumption requires Demand-Side Management (DSM) strategies. As mentioned above, electricity consumption in the residential sector represents a considerable share of total energy consumption. Thus, by providing accurate forecasts at the residential level, electric power service companies can optimize energy production and manage demand [
10].
Energy demand forecasting techniques can be grouped into three categories: correlation, extrapolation, and a combination of both. Extrapolation techniques fit trend curves to historical electricity demand data. The future value of the electricity demand is then obtained by evaluating the trend curve function at the desired future point. It is a simple technique, and its results are very realistic in some cases. Correlation techniques, by contrast, seek to relate the system load demand curve to several economic and demographic factors, that is, they look for the association between patterns of increased demand and other measurable factors. Their disadvantage is that the economic and demographic factors must themselves be forecast, which can be harder than forecasting demand itself. Normally, correlation techniques are used with data on population, building permits, heating, ventilation and cooling, employment, weather, and building typologies. In either case, proper planning and useful applications of electric load forecasting require a specific forecast period (lead time) [
15].
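As a minimal sketch of the extrapolation idea described above, the following Python code fits a straight-line trend to historical demand by ordinary least squares and evaluates it at a future point. The linear form of the trend curve, the function names, and the sample values are illustrative assumptions and are not taken from the reviewed studies.

```python
def fit_linear_trend(history):
    """Least-squares fit of demand = intercept + slope * t, with t = 0, 1, 2, ..."""
    n = len(history)
    if n < 2:
        raise ValueError("need at least two observations to fit a trend")
    t_mean = (n - 1) / 2
    y_mean = sum(history) / n
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(history))
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return intercept, slope


def extrapolate(history, steps_ahead):
    """Evaluate the fitted trend steps_ahead periods after the last observation."""
    intercept, slope = fit_linear_trend(history)
    t_future = len(history) - 1 + steps_ahead
    return intercept + slope * t_future


# Hypothetical annual demand values (arbitrary units), for illustration only.
demand = [100.0, 104.0, 108.0, 112.0]
forecast = extrapolate(demand, steps_ahead=2)
```

Other trend shapes (exponential, logistic) would follow the same pattern: fit the curve to the history, then evaluate it at the future time index.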
Electricity consumption varies in short-, medium-, and long-term periods, which cover three forecast time horizons. They range, respectively, from one hour to a week, one week to a year, and more than a year [
8,
9,
16]. They are based on several factors, such as climate, region, and sector (residential, industrial, commercial). Residential loads consume most of the electricity generated [
9,
17]. STLF is important for cost reduction, energy savings, fine scheduling, and safety assurance [
18].
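The three horizons can be expressed as a small lookup, sketched below in Python. The handling of the exact boundaries (one week and one year counted in the shorter horizon), the 365-day year, and the treatment of lead times under one hour as short-term are assumptions made for illustration.

```python
from datetime import timedelta

WEEK = timedelta(weeks=1)
YEAR = timedelta(days=365)  # assumption: a non-leap year


def classify_horizon(lead_time):
    """Map a forecast lead time to the short-, medium-, or long-term horizon."""
    if lead_time <= timedelta(0):
        raise ValueError("lead time must be positive")
    if lead_time <= WEEK:
        return "short-term"   # roughly one hour to a week
    if lead_time <= YEAR:
        return "medium-term"  # one week to a year
    return "long-term"        # more than a year


day_ahead = classify_horizon(timedelta(hours=24))
```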
To improve the accuracy of electricity demand forecasting, several computational and statistical methods have been applied [
19,
20]. Some researchers group the forecasting models into two classes: data-driven models, usually implemented with artificial intelligence techniques (equivalent to extrapolation techniques), and engineering methods (equivalent to correlation techniques). Even so, no single method is scientifically accepted as being better than the others in all situations [
11].
Over the years, many different forecasting models have been applied in energy systems, and they can be categorized into three methods: Engineering, Statistics, and AI [
1,
10]. Among them, the most applied are AI and statistical methods. Engineering methods are still applied, but some shortcomings have been identified, such as their complexity, which makes them difficult to apply in practice, and their lack of input information. Statistical methods, in turn, were found to lack accuracy and flexibility, which makes AI methods the preferred choice among researchers [
1]. There is no consensus on a specific forecasting model, and the use of one method over another is associated with the expert’s preference [
19]. Therefore, it is relevant to identify which model is adequate for a specific situation.
The purpose of this systematic review is to assess the quality of each method relative to other potential solutions in a specific forecasting scenario and to help users select a forecasting method by answering questions such as: “Is there any STLF model capable of forecasting electricity demand, between 1 h and 24 h, for a single residence with acceptable error?”.
In AI, Machine Learning (ML) is the most widely implemented approach due to its ability to handle nonlinear problems and its more accurate results. ML methods learn from historical patterns in the data and are also capable of pattern recognition and pattern classification. Although statistical forecasting methods are simple, they place many requirements on the stability of the original time series. They have difficulty adapting to the increasingly complex characteristics of energy consumption, which normally leads to lower forecasting accuracy [
21].
The following sections of this article are structured as follows:
Section 2 presents the motivation and the review scope.
Section 3 presents the methodology used in the paper for a systematic review.
Section 4 presents a literature review of STLF methods. Finally, concluding remarks are given in
Section 5, summarizing the findings and guidelines for future research.
2. Motivation and Review Scope
Electricity grid infrastructure has had difficulty managing the growing need for energy [
22] and the increased penetration of renewable energy systems [
21]. In Europe, electricity consumption in the residential sector accounts for a considerable share of total energy consumption, reaching 40% [
10]. In recent years, there has been a growing trend toward implementing electricity generation systems from renewable sources, and their rapid development has presented important challenges for energy system operators [
10]. The smart grid provides an infrastructure for the power management system and facilitates the smooth integration of RES [
4].
The introduction of smart grids also promoted the development of STLF techniques, along with the implementation of one of the main devices of a smart grid: smart metering [
4]. Smart grid technologies and real-time metering make it possible to forecast the electricity demand of individual residential consumers and to aggregate these forecasts into the total demand seen by the electric utility. This opportunity motivates new approaches to STLF in which residential electricity consumption is forecast per residence, per cluster, and per agglomerate. Residential demand can thus be forecast independently, and the future electricity demand of an individual residence can be communicated to both the electricity supplier and the consumer [
23].
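The aggregation of individual forecasts into cluster and total demand can be sketched as a simple bottom-up sum. The Python code below is only an illustration: the data layout (one list of hourly forecasts per residence), the residence and cluster identifiers, and the values are hypothetical.

```python
def aggregate_forecasts(per_residence, clusters):
    """Sum hourly forecasts per cluster and over all residences.

    per_residence: dict mapping residence id -> list of hourly forecasts (kWh)
    clusters: dict mapping cluster id -> list of residence ids
    """
    horizon = len(next(iter(per_residence.values())))
    if any(len(values) != horizon for values in per_residence.values()):
        raise ValueError("all residences must share the same forecast horizon")
    per_cluster = {
        cluster_id: [sum(per_residence[r][h] for r in members) for h in range(horizon)]
        for cluster_id, members in clusters.items()
    }
    total = [sum(values[h] for values in per_residence.values()) for h in range(horizon)]
    return per_cluster, total


# Hypothetical two-hour forecasts for three residences.
homes = {"A": [0.5, 1.0], "B": [1.5, 2.0], "C": [1.0, 1.0]}
groups = {"north": ["A", "B"], "south": ["C"]}
per_cluster, total = aggregate_forecasts(homes, groups)
```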
Electricity forecasting is an essential component of the smart grid and has attracted academic interest. However, with so many models developed, selecting one for a particular application is difficult [
19]. The promotion of accurate forecasts of electricity demand in the residential sector allows electric utilities to optimize energy production and manage demand, reducing congestion and overloads on the transmission network [
20]. Reducing the STLF error benefits the power generation plan and the management of electricity systems [
23]. Load forecasting is an important technique to ensure that the power grid operates safely and reliably [
21,
24]. STLF solutions in the residential sector are also essential to contribute to the decarbonization of the economy and the reduction of GHG emissions, and to balance the demand and supply of electrical systems. These contributions have increased the interest and the need for a systematic review to summarize the algorithms used for the different models applied in the STLF.
Most of the articles analyzed in this systematic review provide valuable contributions. However, they tend to focus only on the development of the algorithm-specific technique and application domain. The growing need for solutions in this area reinforces the need for a comprehensive review that tracks its evolution and acts as a guide to highlight the most promising STLF solutions used in the residential sector, based on the knowledge so far reported in existing publications, in its different typologies: individual, cluster, and agglomerate. The purpose of our analysis is as follows:
First, provide a comprehensive overview and a detailed assessment of previous state-of-the-art STLF solutions applied to electricity demand in the residential sector, a perspective of their evolution, and possible future research paths.
Second, present a concise summary of the useful characteristics of the STLF techniques compared, as applied to the electrical load.
Third, present a systematic analysis of the advantages and disadvantages of using a specific STLF solution.
This is a comprehensive review of the STLF solutions applied to electricity demand in the residential sector over the past 10 years. More specifically, it includes 38 studies/articles (summarized in tables in Appendix A, Appendix B, and Appendix C).
3. Methodology
This article presents a systematic review of the relevant literature on STLF of electricity demand in the residential sector. The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) protocol was applied due to its ability to increase the value and quality of systematic reviews compared to other forms of review [
11].
The PRISMA protocol is used for extensive research in the existing literature and meta-analysis of the consultation, reducing the effect of chance and biases. This protocol has a well-defined procedure that brings clarity, accuracy, and repeatability. In summary, PRISMA begins with the definition of the research question(s); the search criteria to be applied to select the correct publications are then defined. After the selection, the data are analyzed and finally the results are discussed.
The search criteria were selected according to the research question. To optimize and limit the search in the online databases adopted (Web of Science, IEEE Xplore, Scopus, and Science Direct), a few specific keywords that emerged naturally from the research question were identified, namely “STLF”, “Electricity”, and “Residential”. The databases were searched as follows: “STLF” AND “Electricity” AND “Residential”. Synonyms of the keyword “Residential” were also used, such as “domestic”, “house”, “household”, “dwelling”, or “building”, within the title, abstract, text, or keywords.
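The search string can be assembled programmatically, as in the Python sketch below. It assumes that the synonyms of “Residential” were combined with OR inside a single group; each database has its own query syntax and field tags, which are not reproduced here, and the helper name is hypothetical.

```python
RESIDENTIAL_TERMS = ["Residential", "domestic", "house", "household", "dwelling", "building"]


def build_query(required, alternatives):
    """Join required terms with AND and the alternative terms with OR."""
    quoted_required = [f'"{term}"' for term in required]
    alternatives_group = "(" + " OR ".join(f'"{term}"' for term in alternatives) + ")"
    return " AND ".join(quoted_required + [alternatives_group])


query = build_query(["STLF", "Electricity"], RESIDENTIAL_TERMS)
```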
The consultation of the databases mentioned made it possible to identify a total of 334 (three hundred and thirty-four) interesting scientific articles for this systematic review, collected and published in relevant journals and conferences. The articles were downloaded from the internet, and the titles, abstracts, and keywords were later carefully analyzed using the free web tool Rayyan. For assistance in systematic review and meta-analysis, the semi-automatic process of the Rayyan application sped up the initial screening by abstracts, titles, and keywords [
25].
The Rayyan tool automatically detects articles that are partially or totally (100%) duplicated and lets the reviewer either delete or keep each apparently duplicated article. Subsequently, the articles were sorted by year of publication, title, abstract, and keyword.
The date of publication was one of the exclusion criteria. Articles with a date before 2012 were excluded, and those published in the last 10 years were maintained.
Figure 1 shows that there is an interesting number of publications on the topic underlying the question during this period. Another exclusion criterion was “Review” articles. Articles were also excluded when their aim was not directly related to the “Residential” sector or when their results were not aligned with future electricity demand forecasting. All articles returned by the databases consulted were written in English, so none was excluded on language grounds. Thus, 38 articles remained for quantitative analysis. The PRISMA flow diagram shown in
Figure 2 identifies the different phases and exclusion criteria.
Therefore, the search of the databases identified 335 (three hundred and thirty-five) articles from scientific journals and conferences; 24 (twenty-four) duplicate articles were removed and 311 (three hundred and eleven) articles remained for the screening stage. In the screening stage, 1 (one) article was removed because the study was not related to the services of the electrical networks, and 16 (sixteen) articles were removed because they were published before 2012, reducing the articles eligible for analysis to 294 (two hundred and ninety-four). Thus, these 294 articles were used for qualitative analysis. Among these, 26 (twenty-six) articles were removed because they presented reviews of electrical load forecasting, reducing the number of articles eligible for qualitative analysis to 269 (two hundred and sixty-nine). Of the 269 articles that remained for analysis, 111 (one hundred and eleven) were rejected because the works did not cover the residential sector, and 120 (one hundred and twenty) were also removed because the results were not related to future electricity demand. Thirty-eight (38) articles were left for quantitative analysis, as shown in
Figure 2.
3.1. Cases Comparison–Results and Discussion
Studies of electricity consumption in the residential sector yield different results and take different approaches depending on whether they address a single residence, a cluster, or an agglomerate of residences. The residential or domestic sector refers to a residence or dwelling where people live on a day-to-day basis, alone or with family and friends, in a villa, in a condominium, or on a floor or block of buildings located in a rural or urban zone. The literature selected for this review of electricity demand forecasting presents works based on two main families of methods: AI methods and statistical methods. To harness the strengths of the different methods, some researchers use hybrid methods (two or more AI and statistical techniques) to improve the forecasting accuracy of the adopted models.
The selected works on electricity consumption forecasting seek to estimate the partial or total load (hourly, daily, weekly, monthly, seasonal, or annual) of a single residence, a cluster, or an agglomerate of residences that an electric power supplier must provide. The STLF models presented by most of the studies take as input several days, weeks, months, or years of historical electrical load data together with exogenous variables, e.g., meteorological data, calendar (weekdays, weekends, holidays), tariff, appliances, and socio-economic effects, and produce hourly, daily, weekly, monthly, seasonal, or annual load forecasts. The techniques used to evaluate the performance of the studied models are statistical. Most of the selected articles use MAPE, MAE, MSE, and RMSE. The performance results were classified as high, satisfactory, or low: high when the error reported by one or more evaluation metrics is below 5%, satisfactory when it is between 5% and 20%, and low when it is above 20%. Most of the works that focus on agglomerates achieve high performance, while those that focus on a single residence achieve satisfactory or low performance, given the volatility and occupancy behavior of the residents.
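The three performance classes can be written as a threshold rule, sketched below in Python. Assigning the exact boundary values (5% and 20%) to the satisfactory class is an assumption, since the text does not state which class they belong to; the function name is hypothetical.

```python
def classify_performance(error_percent):
    """Rate a model from its reported percentage error (e.g., MAPE)."""
    if error_percent < 0:
        raise ValueError("a percentage error cannot be negative")
    if error_percent < 5:
        return "high"
    if error_percent <= 20:  # assumption: both boundaries count as satisfactory
        return "satisfactory"
    return "low"


ratings = [classify_performance(e) for e in (3.2, 12.0, 27.5)]
```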
3.2. Summary of Statistical Analysis
The 38 articles were analyzed to allow the first observation.
Table 1 shows the number of articles published over the last decade on STLF for the residential sector. The largest number of publications (20 articles, more than 50%) occurred in the last two years. This is evidence of the scientific community’s growing interest in this subject.
Figure 3 shows that more than half of the publications are journal articles (63%), and the rest are articles published at conferences (37%).
Figure 4 shows the countries of origin of the analyzed publications. The three continents with the highest number of publications on STLF for energy demand in the residential sector are all in the northern hemisphere, namely Europe (43.3%), Asia (26.7%), and North America (26.7%). The countries with the most publications are the USA (16.7%) and France (13.3%), followed by Canada, Ireland, and Pakistan with the same number of articles each (10.0%) and, finally, Bangladesh and Portugal (6.7%).
Figure 5 shows the cities covered by the published articles. The cities of Sceaux (France), Lahore (Pakistan), Lisbon (Portugal), New York (USA), and Texas (USA) stand out.
3.3. A Highlight of Articles Published in Forecasting Models
The development of robust, effective, and accurate load forecasting models in the residential sector has been an opportunity to improve the responsiveness related to the operation of the electricity grid and the electricity demand. For suppliers and residential customers, the development of load forecasting solutions is very important to support the management decision-making, cost reduction and anticipation of possible problems with the operation of electricity grid systems. Thus, the task of forecasting demand for future energy consumption (load forecasting) has become an important component of any electric power management system [
2].
STLF solutions have become increasingly relevant in the face of the extensive proliferation of Internet-of-Things (IoT) and Metering Infrastructure (MI) technologies that allow greater granularity in the aggregation of electricity consumption data of each consumer and monitoring of relevant exogenous variables, such as the predominant data of the zone’s climate conditions [
2].
The analysis of the 38 articles shows that the input data used in the various models applied to STLF solutions increase the arbitrariness of the load/energy time series. The load is correlated with calendar effects, among them holidays, the same days of previous weeks, and the difference between working days and weekends. The series are also sensitive to variations in exogenous conditions, such as climate conditions, occupancy, and social behaviours.
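Calendar effects of this kind are typically encoded as input features. The Python sketch below derives a few of them for a given day; the feature names and the holiday set are hypothetical and serve only to illustrate the idea.

```python
from datetime import date, timedelta


def calendar_features(day, holidays):
    """Build simple calendar features for one day of a load time series."""
    return {
        "day_of_week": day.weekday(),          # Monday = 0 ... Sunday = 6
        "is_weekend": day.weekday() >= 5,
        "is_holiday": day in holidays,
        "same_day_last_week": day - timedelta(days=7),
    }


# Hypothetical holiday calendar containing a single date.
holidays = {date(2022, 12, 25)}
features = calendar_features(date(2022, 12, 25), holidays)
```

The `same_day_last_week` entry would be used to look up the load observed on the same weekday of the previous week, one of the correlations mentioned above.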
The methods presented in the 38 articles analyzed can be categorized under two labels, namely: AI methods and statistical methods. Analyzing the distribution of models through all articles allows having a reading of the current trend in the use of the forecasting model.
Figure 6 illustrates the distribution of the different forecasting methods analyzed. It is highlighted that in some articles, there are forecasting methods that apply various techniques to a specific solution and this systematic review seeks to provide a better reading of the actual trend.
Table 2 shows a clear trend in the use of STLF forecasting methods for the residential sector. The Deep Learning methodology is the most applied and is present in 22 of the 38 articles (57.9% of the articles). AI and statistics approaches are present in 9 articles (23.7% of the articles) in a hybrid algorithm. The statistical methods are present in 18.4% of the articles, i.e., 7 articles. The other models are singularities.
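The shares quoted for Table 2 follow directly from the article counts, as the short Python check below illustrates. The category labels are shortened for the example.

```python
TOTAL_ARTICLES = 38

# Article counts per method family, as reported in the text.
counts = {"Deep Learning": 22, "Hybrid (AI and statistics)": 9, "Statistical": 7}

# Share of the 38 reviewed articles, rounded to one decimal place.
shares = {name: round(100 * n / TOTAL_ARTICLES, 1) for name, n in counts.items()}
```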
Table 3 shows the methods used by categories of residences in the articles reviewed. In recent years, the application of STLF solutions to individual residences has been growing. In 2021 and 2022, 8 articles were presented, 4 in each year. Meanwhile, in the category of agglomerates or clusters of residences, the number has remained steady at 1 to 2 articles per year.
Table 4 and
Table 5 show the most used algorithms in the 38 analyzed articles. The algorithms can be applied to various specific forecasting models and the distribution of the algorithms across the reviewed articles provides a better view of the real trend. The distribution allows identifying the algorithms used in the applications in the reviewed articles and not the number of algorithms used in each application presented in the reviewed article.
Figure 7 shows the most widely used AI models, which are DL, LSTM, SVR, ANN, CFNN, and MLP. The predominant models are DL with 8 articles (year occurrence 2022), followed by LSTM (year occurrence 2021) and ANN (years occurrence 2014 to 2017) with 4 articles each. The prevailing statistical models are the Bayesian and MLR with 3 articles each, in 14 articles, as shown in
Figure 8.
The efficiency of each forecasting model is partially dependent on the input variables. The input variables used in the electrical load forecasting were also analyzed.
Figure 9 shows the input parameters used by the authors in the forecasting models. The analysis of the reviewed articles found that, as was to be expected, all of them use historical energy data.
Table 5 and
Table 6 characterize the type of input variables and the number of articles that used them. The large majority of the electricity demand forecasting studies in the residential sector use as inputs historical energy data (38 articles), the calendar effect (24 articles), and weather data (22 articles). Other data, such as energy tariffs, occupants’ lifestyles, household appliances, and socio-economic conditions, are used less often.
Table 7 shows that most of the authors of the reviewed articles applied the hourly load reference period (24 articles), from historical energy data, for the inputs to the forecasting models proposed for the energy demand of the residential sector.
3.4. Evaluation Metrics Used
The forecasting performance evaluation is an important part of the development of load forecasting models, and it allows researchers to evaluate the model developed in the modelling phase. Over the past few decades, researchers in the forecasting community have proposed many metrics to assess the performance and accuracy of their forecasting models. In the articles analyzed, the authors proposed several performance evaluation metrics for each model.
Table 8 shows that the metric most used by authors was MAPE (25 articles), mainly due to its simplicity, its ease of interpretation, its effectiveness, and its usefulness for short-term forecasting. The other most-used metrics are RMSE, MSE, and MAE.
Table 9 shows the second most used metrics.
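For reference, the four metrics named above can be computed as in the Python sketch below, which uses their standard definitions. The sample values are hypothetical. Note that MAPE is undefined when an actual value is zero, which can matter for the load of a single residence; the sketch raises an error in that case.

```python
import math


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mse(actual, predicted):
    """Mean squared error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(mse(actual, predicted))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    return 100 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical hourly loads (kWh) and their forecasts.
actual = [2.0, 4.0, 5.0]
predicted = [1.0, 5.0, 5.0]
scores = {
    "MAE": mae(actual, predicted),
    "MSE": mse(actual, predicted),
    "RMSE": rmse(actual, predicted),
    "MAPE": mape(actual, predicted),
}
```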
3.5. Data Pre-Processing
In some of the works presented (16 articles), the authors adopted data pre-processing techniques to increase the accuracy of the forecasting models proposed by them, avoiding or reducing the noise in the data used in their models.
Table 10 shows the three classifications, high (<5%), satisfactory (5% to 20%), and low (>20%), adopted to assess the performance of the energy demand forecasting models for the residential sector. In two articles it was not possible to identify whether pre-processing techniques were applied to the input data. At first sight, it seems that improving the accuracy of a proposed model depends more on the algorithm used in the modelling than on the pre-processing of the input data.
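As one illustration of noise-reducing pre-processing, the Python sketch below applies a sliding median filter to a load series. This particular filter is an assumed example and is not claimed to be the technique used in the reviewed articles; the window size and sample values are hypothetical.

```python
from statistics import median


def median_filter(series, window=3):
    """Replace each value by the median of a centred window (shorter at the edges)."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    smoothed = []
    for i in range(len(series)):
        start = max(0, i - half)
        stop = min(len(series), i + half + 1)
        smoothed.append(median(series[start:stop]))
    return smoothed


# Hypothetical hourly load with a single spike at the third reading.
raw = [1.0, 1.0, 9.0, 1.0, 1.0]
clean = median_filter(raw)
```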
Most of the mentioned methods, both AI and statistical, have performed well when applied to residential agglomerates. However, when applied to small clusters or individual households, they fail to model the nonlinear consumption patterns. The main reasons pointed out are the random behavior of residents throughout the days of the week and the failure to predict exogenous variables, such as weather data [
26,
27].
The analyzed articles base their short-term load forecasting methods on scenarios with between 2 weeks and 10 years of historical load data. In most cases, this is sufficient for the proposed load forecasting models. However, in scenarios where historical load data are scarce, the accuracy of the forecasting will drop significantly, which may even invalidate the study [
21,
28].
In recent years, data decomposition methods such as empirical mode decomposition and variational mode decomposition have gained popularity in load forecasting research. These techniques have been used in various studies to preprocess time-series data, particularly in the energy sector. However, the literature analyzed found that the most used data processing techniques for load forecasting in households are statistical methods.
The load of the residential sector is a typical time series; many statistical methods have been and continue to be used for STLF solutions, such as the auto-regressive integrated moving average (ARIMA) [
29], or the Bayesian [
12,
30] or Gaussian processes (GP) [
27]. However, due to the non-linearity of residential energy consumption behavior, the effectiveness of these models is often limited, and they require the original time series to be available; as a result, the statistical models present in the STLF solutions show low performance.
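To make the statistical family concrete, the Python sketch below fits a first-order autoregressive model, y[t] = c + phi * y[t-1], by ordinary least squares and iterates it forward. This is a deliberately simplified stand-in: it is not a full ARIMA model (no differencing or moving-average terms), and the sample series is synthetic.

```python
def fit_ar1(series):
    """Ordinary least-squares fit of y[t] = c + phi * y[t-1]."""
    x, y = series[:-1], series[1:]
    n = len(x)
    if n < 2:
        raise ValueError("need at least three observations")
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((a - x_mean) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("lagged series is constant; phi is not identifiable")
    phi = sum((a - x_mean) * (b - y_mean) for a, b in zip(x, y)) / sxx
    c = y_mean - phi * x_mean
    return c, phi


def forecast_ar1(series, steps):
    """Iterate the fitted recursion to produce multi-step forecasts."""
    c, phi = fit_ar1(series)
    forecasts = []
    last = series[-1]
    for _ in range(steps):
        last = c + phi * last
        forecasts.append(last)
    return forecasts


# Synthetic series generated by y[t] = 1 + 0.5 * y[t-1], starting at 0.
load = [0.0, 1.0, 1.5, 1.75, 1.875]
next_values = forecast_ar1(load, steps=2)
```

Because the recursion is linear, such a model cannot reproduce the abrupt, occupant-driven changes of a single household, which is consistent with the limitation noted above.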
Machine learning algorithms, such as Support Vector Regression (SVR) [
14,
15,
16] and Artificial Neural Networks (ANN) [
7,
17,
18,
19,
20], have been successfully applied to deal with the nonlinear characteristics of load forecasting. In SVR, an inappropriate kernel function or hyperparameter setting has a great impact and compromises the results of the STLF. ANNs applied to load forecasting have a certain learning and self-adaptive ability but easily converge to a local optimum, resulting in a large error between the predicted value and the true value. Most of the models that employ SVRs and ANNs to predict short-term (hourly or daily) electricity consumption for cluster or individual residence typologies show satisfactory or low results. Machine learning algorithms, such as SVR [
31,
32,
33] and ANN [
22,
29,
34,
35], have been successfully applied to deal with the nonlinear characteristics of load forecasting.
However, deep learning (DL) models have recently proven to be efficient techniques, and most of the analyzed studies report good results. Their authors emphasize the significant performance improvements that DL models bring to STLF compared with conventional methods. The most successful solutions are DL models that incorporate Convolutional Neural Networks (CNN) [
10,
21,
22,
23,
24,
25] and Long Short-Term Memory (LSTM) [
24,
26,
29], trained on large-scale historical data spanning 2 weeks to 10 years. DL approaches built on CNN models have been studied to predict daily or hourly energy consumption, while LSTM architectures have been studied over different forecasting horizons, e.g., 7 days or 1 month, because they can capture long-term dependencies in time series.
The third type of data-driven approach to STLF solutions is the hybrid method, which integrates several techniques to harness the strengths of different machine learning or statistical methods and aims to outperform both classic and recent models. The authors of the analyzed studies hybridized two or more ML [
8,
10,
25] and statistical [
28,
29,
30] techniques to improve the forecasting accuracy of their models.
Hybrid forecasting methods combine the advantages of two or more techniques to decrease forecasting errors in STLF solutions. In the search for better forecasting accuracy, these models seek to overcome the disadvantages of non-hybrid structures by increasing robustness and efficiency. Most of the hybrid models presented combine linear and nonlinear models for effective forecasting. The authors of the analyzed articles present hybrid models developed mainly by combining (1) traditional statistical methods and (2) machine learning algorithms.
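As an illustration of the linear-plus-nonlinear idea described above, the following minimal sketch fits a linear AR(1) model and then corrects it with the average residual observed at each position of a daily or weekly cycle. It is not taken from any of the reviewed studies: the function names, the choice of AR(1) as the linear part, and the per-position residual profile (a simple stand-in for a machine learning component) are all assumptions made for this example, as is the toy series.

```python
def fit_ar1(series):
    """Least-squares fit of y[t] = a + b * y[t-1] (the linear component)."""
    x, y = series[:-1], series[1:]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx if sxx else 0.0
    return my - b * mx, b


def fit_residual_profile(series, a, b, period):
    """Average AR(1) residual per position in the cycle.

    Assumption: this lookup table stands in for the nonlinear
    (machine learning) component of a hybrid model.
    """
    sums, counts = [0.0] * period, [0] * period
    for t in range(1, len(series)):
        resid = series[t] - (a + b * series[t - 1])
        sums[t % period] += resid
        counts[t % period] += 1
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]


def hybrid_forecast(series, period):
    """Return (linear-only, hybrid) one-step-ahead forecasts."""
    a, b = fit_ar1(series)
    profile = fit_residual_profile(series, a, b, period)
    t_next = len(series)
    linear = a + b * series[-1]
    return linear, linear + profile[t_next % period]


# Hypothetical load pattern with a spike at one position of a 4-step cycle.
load = [1.0, 1.0, 3.0, 1.0] * 6
linear_only, hybrid = hybrid_forecast(load, period=4)
```

On this toy series the linear model alone cannot tell where in the cycle it is, whereas the residual correction restores the periodic pattern. Real hybrid models would replace the residual profile with a trained nonlinear learner.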
5. Conclusions
From the large and diversified literature analyzed for STLF solutions in the residential sector, we have highlighted the modelling approaches for electricity demand that are fundamental tools of great interest to the energy sector. In the last decade, there has been an increase in model proposals with a variety of formulations and applications, and a wide temporal and spatial scale. The models proposed and analyzed each have their merits and advantages in their application cases, in terms of the methods, algorithms, and accuracy metrics employed. This systematic review seeks to provide a comprehensive overview of the modelling literature for STLF solutions in the residential sector.
The present systematic review sought to identify which modelling methods present the best performance and accuracy in predicting future energy demand in the residential sector. A direct comparison of the cases presented is difficult due to the numerous variables that influence the performance of each model. Most models are implemented for different locations and periods, with data of varying quality, and are supported by scripts that are more or less well written. Even the performance evaluation metrics used to determine the accuracy of the models differ (MAPE, MAE, RMSE, MSE, and others), which makes the comparison harder.
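To make the metrics named above concrete, the sketch below computes MAE, MSE, RMSE, and MAPE from their standard definitions for a pair of actual and predicted series. The function name, the handling of zero actual values, and the sample values are assumptions for illustration only and do not come from the reviewed studies.

```python
import math


def forecast_errors(actual, predicted):
    """Return MAE, MSE, RMSE and MAPE (in percent) for two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    # MAPE is undefined when an actual value is zero, which can happen for
    # single-household load; this sketch simply skips those points.
    pct = [abs(e / a) for e, a in zip(errors, actual) if a != 0]
    mape = 100.0 * sum(pct) / len(pct) if pct else float("nan")
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "MAPE": mape}


# Hypothetical hourly load values in kWh, for illustration only.
actual = [1.0, 2.0, 4.0, 2.0]
predicted = [1.5, 2.0, 3.0, 2.5]
scores = forecast_errors(actual, predicted)
```

Because MSE and RMSE weight large errors more heavily, while MAPE scales each error by the actual load, two models can rank differently depending on which metric a study reports. This is one reason results are hard to compare across papers.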
Given the number of articles published, it is reasonable to assume that the recurring use of certain forecasting models reflects good practice. Some models seem to be favored for forecasts of electricity consumption and load demand, such as time series and ML models. It is noteworthy that a large number of different ML models are being used, but some techniques seem more suitable than others for specific tasks. ML models are mainly used in short-term forecasting, where patterns of electricity load and energy consumption are more complex.
Statistical methods are often used for longer-period forecasting, where periodicity and changes are less significant. They are often associated with socio-economic variables, tariffs, and appliances that reflect the correlation between these variables and electricity consumption over longer periods. Statistical models are relatively easy to implement but have the disadvantages of lacking precision and flexibility.
AI models are suitable for solving nonlinear problems, which makes them very attractive for energy demand forecasting in the residential sector. They provide very accurate forecasting solutions, as long as the model selection and parameter configuration are well executed. The disadvantages of AI models are that they require enough historical data to perform well, can be extremely complex to implement, and produce results that are difficult to interpret.
The most commonly used AI models are DLs, ANNs, and SVMs. They are configured primarily with past energy consumption values, weather data, weekdays, and occupancy levels. The decomposition and classification of days of the week are particularly useful when the solution depends heavily on occupancy or when the time series has been decomposed into underlying patterns. Data decomposition and classification allow complex series to be split into simpler series, thus providing better forecasting performance.
To avoid noise in the time series used in the forecasting models, the authors also recommended pre-analysis and pre-processing of the input data to eliminate missing and atypical values, by smoothing errors in the time series and interpolating the missing values.
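The pre-processing recommended above can be sketched as two steps: linear interpolation of missing readings, followed by replacement of atypical values. The functions below are a minimal illustration under stated assumptions. Missing readings are assumed to be encoded as `None`, the outlier rule is a Hampel-style rolling-median filter chosen for this example rather than a method prescribed by the reviewed authors, and the window size, threshold, and sample readings are hypothetical.

```python
from statistics import median


def interpolate_missing(series):
    """Fill None gaps by linear interpolation; edge gaps copy the nearest known value."""
    original = list(series)
    known = [i for i, v in enumerate(original) if v is not None]
    if not known:
        raise ValueError("series has no observed values")
    filled = list(original)
    for i, v in enumerate(original):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = original[right]
        elif right is None:
            filled[i] = original[left]
        else:
            w = (i - left) / (right - left)
            filled[i] = original[left] + w * (original[right] - original[left])
    return filled


def replace_outliers(series, window=5, threshold=3.0):
    """Replace points far from the rolling median (Hampel-style rule) with that median."""
    half = window // 2
    cleaned = list(series)
    for i, v in enumerate(series):
        neighbours = series[max(0, i - half): i + half + 1]
        med = median(neighbours)
        mad = median([abs(x - med) for x in neighbours])
        # 1.4826 scales the median absolute deviation to a standard-deviation
        # equivalent under a normality assumption.
        if mad > 0 and abs(v - med) > threshold * 1.4826 * mad:
            cleaned[i] = med
    return cleaned


# Hypothetical meter readings in kWh with one gap and one spike.
raw = [1.0, 1.2, None, 1.6, 9.0, 1.8, 2.0]
filled = interpolate_missing(raw)
cleaned = replace_outliers(filled)
```

Interpolating before filtering keeps the rolling window free of gaps. In practice, the threshold would need tuning so that genuine consumption peaks are not removed along with measurement errors.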
When analyzing the papers presented, collecting the relevant information was a challenge. All authors declare that, when presenting their work, they explicitly provide the framework for implementing the forecasting model (period, time resolution, scale, inputs, outputs, pre-processing, etc.). To facilitate direct comparison between the papers, authors should use a common set of metrics for measuring error in all studies. However, this would be impractical; hence, it would be desirable to have a unified dataset together with a set of benchmarks with well-defined metrics, enabling a clear and accurate comparison of the modelling techniques and the corresponding algorithms for STLF. This would help to advance the field of STLF and enable researchers to identify the most effective techniques for specific applications.
The future of modelling for STLF solutions in energy demand for the residential sector seems to lie in AI methods based on ML models, used in conjunction with improved data patterns and computational advances, to estimate the demand for electricity from a single home, cluster, or agglomerate. Such models would use dependent variables, such as time series of historical electricity load and of past weather data, and independent variables, such as the effect of the weekly and annual calendar, level of occupancy, and household appliances. They would capture the complexity inherent in the daily variability promoted by the occupant’s behavior during the hours of day and night, weekdays, and weather conditions. This would allow optimization of the management and operating costs of the electricity grid. It would also allow optimization in the use of primary resources and, consequently, the reduction of GHG emissions, thus contributing to the environmental targets assumed by the international community in relation to the United Nations guidelines.
The forecast modelling of electricity demand in the field of STLF for the residential sector faces some challenges, including technical obstacles related to the complexity of the methods and the uncertainty of the models and variables to be used. Growing availability and opportunity in the methods and technology, including new data sources such as cloud computing and model-adjusted universal metrics, can contribute to enhancing forecasting quality.
The forecast modelling of electric load demand in STLF systems for the residential sector undoubtedly has great potential and will face several other new challenges that are not addressed here. However, this review has attempted to bring together a diverse body of practices in a recognizable field, which researchers and professionals can contribute to and develop further.