An Ultra-Short-Term PV Power Forecasting Method for Changeable Weather Based on Clustering and Signal Decomposition

by 1,2,*, 1,2,*, 3 and 3
State Key Laboratory of Reliability and Intelligence of Electrical Equipment, Hebei University of Technology, Tianjin 300401, China
School of Electrical Engineering, Hebei University of Technology, Tianjin 300401, China
State Grid Tianjin Wuqing Electric Power Supply Company, Tianjin 301700, China
Authors to whom correspondence should be addressed.
Energies 2023, 16(7), 3092
Received: 4 March 2023 / Revised: 24 March 2023 / Accepted: 27 March 2023 / Published: 28 March 2023


Photovoltaic (PV) power shows different fluctuation characteristics under different weather types, and it is strongly random and uncertain in changeable weather such as sunny turning cloudy or cloudy turning rainy, which results in low forecasting accuracy. For changeable weather, an ultra-short-term PV power forecasting method is proposed based on affinity propagation (AP) clustering, complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN), and a bi-directional long short-term memory (BiLSTM) network. First, the PV power output curve of the standard clear-sky day was extracted monthly from the historical data, and the PV power was normalized against it. Second, the changeable days were extracted from the various weather types using the AP clustering algorithm and the Euclidean distance, with the mean and variance of the clear-sky power coefficient (CSPC) as clustering indicators. Third, the CEEMDAN algorithm was used to decompose the data of changeable days to reduce their overall non-stationarity, and each component was forecasted with a BiLSTM network to obtain the PV forecast in changeable weather. Using the PV dataset obtained from Alice Springs, Australia, the presented method was verified by comparative experiments with the BP, BiLSTM, and CEEMDAN-BiLSTM models; the MAPE of the proposed method was 2.771%, lower than that of the other methods.

1. Introduction

With the deteriorating global climate and rapid growth of clean energy consumption [1], solar energy resources have received attention from many countries due to the fact that they are abundant, secure, and environmentally friendly [2]. The cost of PV power generation technology continues to decrease, and it is widely used in transportation, construction, and lighting industries, where its promotion has brought significant economic and environmental benefits to society [3]. However, PV power generation systems are affected by weather, resulting in uncertainty and intermittency [4], and their high penetration rate also brings many challenges to the safe and stable operation of power systems, potentially leading to voltage instability, a reduced power quality, and islanding effects [5].
Accurate PV power forecasting can effectively reduce the risk faced by the grid and improve the economic efficiency of the power system. PV forecasting methods can be classified into physical, statistical, and hybrid approaches [6,7,8]. In [8], different types of time series forecasting models were compared, with the following findings: the physical model has low accuracy in short-term PV power forecasting; the traditional statistical model struggles to fit nonlinear PV power series accurately, resulting in poor forecasting performance; the deep learning model can extract useful features from complex PV power series and forecasts better; and the hybrid model based on deep learning has become a research hotspot because of its excellent forecasting performance. Physical methods [9] construct simulation models from specific parameters of a PV system in order to calculate its output power. Statistical methods fall into two categories: traditional statistical models such as seasonal autoregressive integrated moving average (SARIMA) [10], support vector regression (SVR) [11], and gray models (GM) [12], and artificial intelligence methods such as convolutional neural networks (CNNs) [13] and long short-term memory (LSTM) [14]. Hybrid methods combine the advantages of several models in order to achieve an improved forecasting accuracy. Time series forecasting models are also widely applied in the fields of electricity prices, environment, and tourism [15,16,17,18]. For example, [17] proposed a functional autoregressive model of order P based on a two-component estimation procedure, which effectively improved the accuracy of electricity price forecasting.
In recent years, hybrid models have become a research hotspot in the field of PV power forecasting due to their high forecasting accuracy. At present, hybrid forecasting models are mainly divided into those based on clustering algorithms and those based on signal decomposition. The study in [19] established LSTM forecasting models under ideal weather conditions and divided non-ideal weather into three types: rainy, cloudy, and overcast. In addition, a combined discrete grey model (DGM)-LSTM forecasting model was established. In [20], an extreme random tree classification model was used to classify the PV data into four categories according to the meteorological conditions, and the power was then forecasted with full consideration of the changing weather conditions and the daily variation pattern of the PV power. The study in [21] used time-series generative adversarial networks (TimeGAN) to perform data enhancement and proposed a K-medoids clustering method based on soft dynamic time warping (soft-DTW) to classify the enhanced data into sunny days, cloudy days, and rainy days. The experimental results showed that the enhanced training data had a better clustering effect. The study in [22] used a self-organized map (SOM) to classify numerical weather forecast information and classified the local weather types for the next 24 h into sunny days, cloudy days, and rainy days before making the forecasts, which effectively improved the forecasting accuracy. In [23], 33 weather types were reclassified into 10 types based on generative adversarial networks and convolutional neural networks in order to achieve a more accurate classification.
Considering the high non-stationarity of PV power series affected by weather factors, many studies have combined signal decomposition algorithms with forecasting models. The study in [24] decomposed and reconstructed the PV power series into high-frequency and low-frequency components using ensemble empirical mode decomposition (EEMD) and constructed LSTM-SVR-BO hybrid models for both of them. In [25], historical data were decomposed into several subcomponents based on the variational mode decomposition (VMD) algorithm, and the subcomponents were then input into a hybrid forecasting model composed of a convolutional neural network (CNN) and bi-directional gated recurrent unit (BiGRU). The study in [26] used EEMD to decompose the original data, merged the subcomponents with similar sample entropy (SE), built LSTM networks for the reconstructed components, and optimized the LSTM network parameters using the sparrow search optimization algorithm (SSA). In [27], a forecasting model optimization method based on CEEMDAN and the multi-objective chameleon swarm algorithm (MOcsa) was proposed, effectively improving the effectiveness and stability of the forecasting model. The study in [28] used a random forest (RF) to calculate the weights of each factor, filtered similar days using an improved gray ideal value approximation (IGIVA), and attenuated the volatility of the power series using the CEEMD algorithm. The study in [29] used fuzzy entropy (FE) to reconstruct the sub-sequences generated by CEEMDAN decomposition and obtained the maximum, minimum, and average values of the reconstructed sequences using fuzzy information granulation (FIG), which extracted the signal characteristics more effectively and reduced the computational complexity at the same time.
Combined PV power forecasting methods based on signal decomposition decompose the original, highly volatile PV power sequence into subseries in different frequency bands, which can effectively improve the forecasting accuracy of the model.
The existing literature has studied weather classification and power forecasting for PV generation, but no study has focused specifically on changeable weather, under which the forecasting accuracy of PV power is generally low. For this reason, an ultra-short-term PV power forecasting method for changeable weather is proposed in this paper. First, the nonlinear daily PV power curve is linearized by means of the CSPC; next, the changeable days are extracted from the various weather types by AP clustering; finally, the PV power in changeable weather is forecasted by modal decomposition and separate forecasting of each component.
The rest of the paper is organized as follows. In Section 2, the clear-sky normalization method is proposed and the principle of the AP clustering algorithm is presented. In Section 3, the principles of the CEEMDAN algorithm and BiLSTM network are presented and a framework for PV ultra-short-term power forecasting is proposed. In Section 4, experiments are conducted with a PV plant in Alice Springs, Australia, and compared with other models for analysis. In Section 5, the conclusions and future research directions of this paper are given.

2. Weather Clustering Method Based on Photovoltaic Power Fluctuation Characteristics

2.1. Clear-Sky Normalization

In order to avoid the impact of the installed capacity of different power stations, it is necessary to normalize the photovoltaic power data. The data are usually normalized to the interval [0, 1] by performing a min–max normalization based on the maximum and minimum values of the PV power series. The formula is as follows [22]:
$$\tilde{P} = \frac{P - P_{\min}}{P_{\max} - P_{\min}}$$
where $\tilde{P}$ is the PV power value after normalization; $P$ is the original PV power sequence; $P_{\min}$ and $P_{\max}$ are the minimum and maximum values of the sequence, respectively.
The above method normalizes only against the maximum and minimum values and does not consider the uncertainty of the PV output variation; thus, the normalization result is random. Therefore, in this paper, normalization was carried out based on the uncertainty of PV power: the maximum PV power at each moment of each month was selected to form the clear-sky curve of the month, which represents the standard sunny-day PV power sequence of that month. Then, the PV power data were normalized to the CSPC, denoted $\sigma$, using the clear-sky curve as the standard:
$$\sigma_{ij} = \frac{P_{ij}}{C_j}$$
$$C_j = \max(P_{1j}, P_{2j}, \ldots, P_{mj})$$
where $\sigma_{ij}$ is the CSPC at time $j$ of day $i$, with $\sigma_{ij} \in [0, 1]$; $P_{ij}$ is the PV power at time $j$ of day $i$; $C_j$ is the PV power at time $j$ of the monthly clear-sky day; $m$ is the number of days in the month.
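The clear-sky normalization can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function name `clear_sky_normalize` and the (days × time steps) array layout are not from the paper, and time steps at which the clear-sky value is zero (night) are assigned a CSPC of zero, a choice the text does not specify.

```python
import numpy as np

def clear_sky_normalize(power):
    """Return (sigma, clear_sky) for one month of PV power.

    power: array of shape (m days, n time steps per day).
    clear_sky[j] is C_j, the maximum power observed at time step j.
    sigma[i, j] is the CSPC P_ij / C_j.
    """
    power = np.asarray(power, dtype=float)
    clear_sky = power.max(axis=0)  # C_j = max over the days of the month
    # Assumption: where C_j is zero (e.g. at night) the CSPC is set to zero.
    sigma = np.divide(power, clear_sky,
                      out=np.zeros_like(power), where=clear_sky > 0)
    return sigma, clear_sky

# Two toy "days" with three time steps each
sigma, clear_sky = clear_sky_normalize([[0.0, 2.0, 4.0],
                                        [0.0, 1.0, 8.0]])
```

Because each $C_j$ is the monthly maximum at that time step, every value of `sigma` falls in [0, 1] by construction.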

2.2. AP Clustering Algorithm

The AP algorithm is a message-passing clustering algorithm proposed by Frey et al. in 2007 [30]. Its advantage is that the number of clustering centers does not need to be set in advance (i.e., the clustering can be completed automatically when the number of clustering centers is unknown). All sample data points are regarded as potential clustering centers. By passing messages between data points and updating the attraction information (responsibility) and the degree of belonging (availability), the algorithm continually revises the number and location of the clustering centers, selects the optimal clustering centers from the data points, and allocates the remaining points to their corresponding clusters.
We defined the similarity matrix S(i, k) in order to describe the degree of similarity between two points, that is, the degree to which point k is suitable as the clustering center of point i [30]:
$$S(i, k) = -\|x_i - x_k\|^2$$
where $\|x_i - x_k\|$ is the Euclidean distance between point i and point k, and S(i, k) is the similarity between two points. The larger the value, the more suitable point k is as the clustering center of point i.
Based on the similarity matrix S, the attraction matrix r(i, k) was constructed in order to represent the attraction information of point k to point i. The formula is as follows [30]:
$$r_{t+1}(i, k) = \begin{cases} S(i, k) - \max\limits_{j \neq k} \{ a_t(i, j) + r_t(i, j) \}, & i \neq k \\ S(i, k) - \max\limits_{j \neq k} \{ S(i, j) \}, & i = k \end{cases}$$
where $r_t(i, j)$ is the degree to which points other than data point k at time t are suitable as the clustering center of point i, and the values in r(i, k) are all greater than zero. $a_t(i, j)$ is the degree to which point i selects other points, except point k, as the clustering center at time t, and the initial value is zero.
The attribution matrix a(i, k) was constructed to represent the attribution information of point i to point k. The specific formula is as follows [30]:
$$a_{t+1}(i, k) = \begin{cases} \min \Big\{ 0,\; r_{t+1}(k, k) + \sum\limits_{j \neq i, k} \max \{ r_{t+1}(j, k), 0 \} \Big\}, & i \neq k \\ \sum\limits_{j \neq k} \max \{ r_{t+1}(j, k), 0 \}, & i = k \end{cases}$$
where $a_{t+1}(i, k)$ is the degree to which point i selects point k as an appropriate clustering center at time t + 1, and $r_{t+1}(k, k)$ is the probability of point k being the clustering center.
To avoid oscillation, a damping coefficient λ, with a default value of 0.5, was introduced to update the iterative values of the attraction matrix r(i, k) and the attribution matrix a(i, k) at time t + 1 [30]:
$$r_{t+1}(i, k) = \lambda\, r_t(i, k) + (1 - \lambda)\, r_{t+1}(i, k)$$
$$a_{t+1}(i, k) = \lambda\, a_t(i, k) + (1 - \lambda)\, a_{t+1}(i, k)$$
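The message-passing loop can be illustrated with a compact NumPy sketch. It should be read as an assumption-laden illustration rather than the implementation used in this paper: it follows the update rules in the form published by Frey and Dueck [30], in which the maximum in the attraction update is taken over a(i, j) + S(i, j); the diagonal of S (the "preference") is set to the median similarity, a common default that this section does not specify; and a fixed number of iterations replaces a convergence check. The function name `affinity_propagation` is hypothetical.

```python
import numpy as np

def affinity_propagation(X, damping=0.5, n_iter=200, preference=None):
    """Return, for each row of X, the index of the exemplar it is assigned to."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    idx = np.arange(n)
    # Similarity: negative squared Euclidean distance between points
    S = -((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    if preference is None:
        # Assumption: self-similarity = median of off-diagonal similarities
        preference = np.median(S[~np.eye(n, dtype=bool)])
    S[idx, idx] = preference
    R = np.zeros((n, n))  # attraction (responsibility) matrix r(i, k)
    A = np.zeros((n, n))  # attribution (availability) matrix a(i, k)
    for _ in range(n_iter):
        # r(i,k) = S(i,k) - max_{j != k} [a(i,j) + S(i,j)]
        AS = A + S
        best = AS.argmax(axis=1)
        best_val = AS[idx, best]
        AS[idx, best] = -np.inf
        second_val = AS.max(axis=1)
        R_new = S - best_val[:, None]
        R_new[idx, best] = S[idx, best] - second_val
        R = damping * R + (1 - damping) * R_new
        # a(i,k) = min(0, r(k,k) + sum_{j not in {i,k}} max(r(j,k), 0));
        # a(k,k) = sum_{j != k} max(r(j,k), 0)
        Rp = np.maximum(R, 0)
        Rp[idx, idx] = R[idx, idx]
        A_new = Rp.sum(axis=0)[None, :] - Rp
        diag = A_new[idx, idx].copy()
        A_new = np.minimum(A_new, 0)
        A_new[idx, idx] = diag
        A = damping * A + (1 - damping) * A_new
    exemplars = np.flatnonzero(np.diag(A + R) > 0)
    if exemplars.size == 0:
        raise RuntimeError("no exemplar found; adjust damping or preference")
    labels = exemplars[S[:, exemplars].argmax(axis=1)]
    labels[exemplars] = exemplars  # each exemplar represents itself
    return labels
```

In the setting of this paper, each row of `X` would hold the (mean, variance) of one day's CSPC; days that share a label belong to the same weather cluster.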

3. Photovoltaic Power Ultra-Short-Term Forecast Portfolio Model

3.1. CEEMDAN Decomposition Algorithm

PV power in changeable weather is highly volatile, nonlinear, and non-stationary because of variable weather factors. It therefore needs to be stabilized in advance by decomposing the originally complex PV power series into individual components whose fluctuation characteristics are more concentrated in different frequency bands. A forecasting model is then built for each subcomponent.
Signal decomposition is a commonly used way to make a time series more stationary. Empirical mode decomposition (EMD) does not require a basis function to be defined before decomposition; instead, it generates intrinsic mode functions (IMFs) adaptively according to the characteristics of the original signal, decomposing the complex signal into several more stationary and regular IMF components that reflect the local characteristics of the original signal at different time scales. The CEEMDAN algorithm adds adaptive Gaussian white noise to the data to be decomposed at each stage and averages each order of component over the noise realizations, which not only effectively reduces the mode mixing of the EMD algorithm, but also solves the problem of the transfer of white noise from high to low frequencies and improves the computational speed [27].
Let Ei(·) be the i-th modal component obtained by EMD decomposition, ωj(t) be the j-th added white noise, ε0 be the standard deviation of the white noise, and x(t) be the original power signal. The calculation steps of the CEEMDAN algorithm are as follows [31]:
Step 1: Add the Gaussian white noise that obeys the standard normal distribution to the signal x(t) to be decomposed in order to obtain the new signal x′(t).
$$x'(t) = x(t) + \varepsilon_0\, \omega_j(t)$$
Step 2: Decompose x′(t) by EMD to obtain the first-order IMF component $\mathrm{IMF}_1^j$ and the residual signal $r_1(t)$; the first-order IMF component $\mathrm{IMF}_1(t)$ of the CEEMDAN decomposition is then the mean of $\mathrm{IMF}_1^j$ over the noise realizations.
$$\mathrm{IMF}_1(t) = \frac{1}{N} \sum_{j=1}^{N} \mathrm{IMF}_1^j$$
where N is the number of times that white noise is added.
$$r_1(t) = x(t) - \mathrm{IMF}_1(t)$$
Step 3: Add the white noise component after one EMD decomposition to the first-order residual signal r1(t), continue the EMD decomposition to obtain the second-order IMF component IMF2j and the residual r2(t), and derive the second-order component IMF2(t).
$$\mathrm{IMF}_2^j = E_1\big( r_1(t) + \varepsilon_1 E_1(\omega_j(t)) \big)$$
$$\mathrm{IMF}_2(t) = \frac{1}{N} \sum_{j=1}^{N} E_1\big( r_1(t) + \varepsilon_1 E_1(\omega_j(t)) \big)$$
$$r_2(t) = r_1(t) - \mathrm{IMF}_2(t)$$
Step 4: Repeat the above steps, calculating the nth-order IMF component IMFnj and the residual rn(t) to find the nth-order component IMFn(t).
$$\mathrm{IMF}_n^j = E_1\big( r_{n-1}(t) + \varepsilon_{n-1} E_{n-1}(\omega_j(t)) \big)$$
$$\mathrm{IMF}_n(t) = \frac{1}{N} \sum_{j=1}^{N} E_1\big( r_{n-1}(t) + \varepsilon_{n-1} E_{n-1}(\omega_j(t)) \big)$$
$$r_n(t) = r_{n-1}(t) - \mathrm{IMF}_n(t)$$
where $\varepsilon_{n-1}$ is the weight coefficient of the (n−1)th-order white noise.
Step 5: Repeat step 4 until the residuals have a monotonic trend, and then stop the iteration, at which point, the K-th order IMF component is obtained and the original signal x(t) is decomposed as [31]:
$$x(t) = \sum_{k=1}^{K} \mathrm{IMF}_k(t) + r_K(t)$$
where K is the total number of IMF components obtained from the CEEMDAN decomposition.
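The bookkeeping of Steps 1 to 5 can be sketched as follows. This is a structural illustration only, under several assumptions: the EMD operator $E_k(\cdot)$ is passed in as a callable `emd_mode(signal, k)` (a hypothetical interface); the toy `moving_average_mode` below is a stand-in that is not real EMD and serves only to make the sketch runnable; a single weight `eps` is used for every stage; and the loop stops after a fixed number of IMFs rather than testing whether the residual is monotonic.

```python
import numpy as np

def ceemdan_sketch(x, emd_mode, n_imfs, n_trials=50, eps=0.2, seed=0):
    """Follow Steps 1-5: average the first mode of noise-perturbed residuals.

    emd_mode(signal, k) must return the k-th mode E_k(signal).
    Returns (imfs, residual) with x == imfs.sum(axis=0) + residual.
    """
    x = np.asarray(x, dtype=float)
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal((n_trials, x.size))  # omega_j(t), j = 1..N
    imfs = []
    residual = x.copy()                              # r_0(t) = x(t)
    for n in range(1, n_imfs + 1):
        trial_modes = []
        for w in noise:
            # Stage 1 adds the raw noise; stage n adds E_{n-1}(omega_j)
            added = w if n == 1 else emd_mode(w, n - 1)
            trial_modes.append(emd_mode(residual + eps * added, 1))
        imf = np.mean(trial_modes, axis=0)           # IMF_n(t)
        imfs.append(imf)
        residual = residual - imf                    # r_n = r_{n-1} - IMF_n
    return np.array(imfs), residual

def moving_average_mode(signal, k):
    """Toy stand-in for E_k (NOT real EMD): the detail removed by the k-th
    smoother in a cascade of moving averages of width 3, 5, 9, ..."""
    r = np.asarray(signal, dtype=float)
    for level in range(1, k + 1):
        width = 2 ** level + 1
        padded = np.pad(r, width // 2, mode="edge")
        smooth = np.convolve(padded, np.ones(width) / width, mode="valid")
        detail, r = r - smooth, smooth
    return detail

t = np.linspace(0.0, 1.0, 96)
signal = np.sin(2 * np.pi * 3 * t) + 0.3 * np.sin(2 * np.pi * 20 * t)
imfs, residual = ceemdan_sketch(signal, moving_average_mode, n_imfs=3, n_trials=10)
```

Whatever operator is plugged in, the decomposition is exactly invertible because each residual is defined by subtraction: the IMFs plus the final residual sum back to the input signal.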

3.2. BiLSTM Neural Network

The LSTM network, widely used for time series, is a special kind of recurrent neural network (RNN) that can learn the long-term dependencies of time series and alleviates the vanishing and exploding gradient problems that occur when traditional RNNs are trained on long sequences. As shown in Figure 1, the memory cell of the LSTM consists of the forget gate, input gate, and output gate. The specific calculation process of LSTM is as follows [32]:
The forget gate controls how much redundant information is discarded from the cell state $C_{t-1}$ of the previous moment.
$$f_t = \sigma(W_f [h_{t-1}, x_t] + b_f)$$
The input gate determines how much of the new information $\tilde{C}_t$ is added to the cell state $C_t$ at the current moment.
$$i_t = \sigma(W_i [h_{t-1}, x_t] + b_i)$$
$$\tilde{C}_t = \tanh(W_c [h_{t-1}, x_t] + b_c)$$
$$C_t = \tilde{C}_t \odot i_t + C_{t-1} \odot f_t$$
The output gate determines the network output $h_t$ at the current moment based on the cell state $C_t$.
$$o_t = \sigma(W_o [h_{t-1}, x_t] + b_o)$$
$$h_t = \tanh(C_t) \odot o_t$$
where $x_t$ is the input at the current moment; $h_t$ is the output at the current moment; $[h_{t-1}, x_t]$ is the concatenation of the previous output and the current input; $i_t$, $W_i$, $b_i$ are the computed result, weight matrix, and bias term of the input gate, respectively; $o_t$, $W_o$, $b_o$ are the computed result, weight matrix, and bias term of the output gate, respectively; $f_t$, $W_f$, $b_f$ are the computed result, weight matrix, and bias term of the forget gate, respectively; $W_c$ and $b_c$ are the weight matrix and bias term of the candidate state; $C_t$ and $C_{t-1}$ denote the current and previous cell states, and $\tilde{C}_t$ denotes the candidate cell state; $\odot$ denotes element-wise multiplication; $\sigma(x)$ and $\tanh(x)$ represent the Sigmoid and Tanh activation functions, respectively, and can be expressed as follows:
$$\sigma(x) = \frac{1}{1 + e^{-x}}$$
$$\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$$
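A single LSTM time step, written directly from the gate equations, may help make the data flow concrete. This is a minimal sketch: the function name `lstm_step` and the dictionary layout of the weights are illustrative assumptions, and no training is shown.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM time step.

    W and b are dicts keyed by "f", "i", "c", "o"; each W[g] has shape
    (hidden, hidden + input) and acts on the concatenation [h_{t-1}, x_t].
    """
    z = np.concatenate([h_prev, x_t])
    f_t = sigmoid(W["f"] @ z + b["f"])        # forget gate
    i_t = sigmoid(W["i"] @ z + b["i"])        # input gate
    c_tilde = np.tanh(W["c"] @ z + b["c"])    # candidate cell state
    c_t = c_tilde * i_t + c_prev * f_t        # new cell state
    o_t = sigmoid(W["o"] @ z + b["o"])        # output gate
    h_t = np.tanh(c_t) * o_t                  # new hidden state / output
    return h_t, c_t

# Toy dimensions: 3 inputs, 2 hidden units, all weights zero
hidden, n_in = 2, 3
W = {g: np.zeros((hidden, hidden + n_in)) for g in "fico"}
b = {g: np.zeros(hidden) for g in "fico"}
h_t, c_t = lstm_step(np.ones(n_in), np.zeros(hidden), np.ones(hidden), W, b)
```

With all weights and biases at zero, every gate evaluates to 0.5 and the candidate state to 0, so the cell state is simply halved; this makes the step easy to check by hand.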
The BiLSTM network consists of a combination of forward and reverse LSTM networks. This structure makes the BiLSTM network more effective than the LSTM at capturing the bi-directional dependence information in the time series and at extracting the features of the PV power series. The BiLSTM network structure diagram is shown in Figure 2. As Figure 2 shows, the calculation process of the forward LSTM structure in the BiLSTM network is similar to that of a single LSTM network, and the hidden layer state of the BiLSTM network is obtained by combining the forward hidden layer state and the reverse hidden layer state. Its calculation formula is as follows:
$$\overrightarrow{h}_t = \mathrm{LSTM}(h_{t-1}, x_t, c_{t-1})$$
$$\overleftarrow{h}_t = \mathrm{LSTM}(h_{t+1}, x_t, c_{t+1})$$
$$h_t = \alpha \overrightarrow{h}_t + \beta \overleftarrow{h}_t$$
where $x_t$, $\overrightarrow{h}_t$, and $\overleftarrow{h}_t$ denote the input data at time t, the output of the forward LSTM hidden layer, and the output of the reverse LSTM hidden layer, respectively, and α and β are constant coefficients that denote the weights corresponding to $\overrightarrow{h}_t$ and $\overleftarrow{h}_t$, respectively.
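The bidirectional combination can be sketched independently of the cell internals. In the sketch below, `fwd_cell` and `bwd_cell` are hypothetical callables with the signature `cell(x_t, h, c) -> (h, c)`, standing for the forward and reverse LSTMs; the equal weights α = β = 0.5 are an illustrative default, not a value taken from this paper.

```python
import numpy as np

def bidirectional_states(xs, fwd_cell, bwd_cell, hidden_size, alpha=0.5, beta=0.5):
    """Combine forward and reverse hidden states: h_t = alpha*h_fwd + beta*h_bwd.

    xs: sequence of input vectors x_1 ... x_T.
    Returns an array of shape (T, hidden_size).
    """
    T = len(xs)
    fwd = np.zeros((T, hidden_size))
    bwd = np.zeros((T, hidden_size))
    # Forward pass: the state at t depends on t-1
    h, c = np.zeros(hidden_size), np.zeros(hidden_size)
    for t in range(T):
        h, c = fwd_cell(xs[t], h, c)
        fwd[t] = h
    # Reverse pass: the state at t depends on t+1
    h, c = np.zeros(hidden_size), np.zeros(hidden_size)
    for t in reversed(range(T)):
        h, c = bwd_cell(xs[t], h, c)
        bwd[t] = h
    return alpha * fwd + beta * bwd

# Toy cell (not an LSTM): the hidden state accumulates the inputs
def running_sum_cell(x_t, h, c):
    return h + x_t, c

xs = np.array([[1.0], [2.0], [3.0]])
states = bidirectional_states(xs, running_sum_cell, running_sum_cell, hidden_size=1)
```

With the toy accumulating cell, the forward states are 1, 3, 6 and the reverse states are 6, 5, 3, so each combined state mixes information from both the past and the future of the sequence.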

3.3. Combined Model Forecasting Process

This paper proposes an ultra-short-term PV power forecasting method based on a combined AP-CEEMDAN-BiLSTM model that takes the volatility characteristics of the PV output into account. First, the mean and variance of the daily PV power data are selected as clustering indicators, and the PV output is classified into sunny days, cloudy days, and changeable days by AP clustering. The CEEMDAN algorithm is then used to decompose the changeable weather data into K modal components to reduce the complexity of the input sequence; each component is input into a BiLSTM network for training and forecasting, and the forecasting results of all components are summed. The forecasting framework of the proposed method is shown in Figure 3 and can be divided into four parts: PV power normalization, weather clustering, decomposition forecasting, and denormalization. The specific forecasting process steps are as follows:
  • Clear-sky normalization: Using the PV power history data of the whole year as the dataset, the maximum value of each moment in each month in the dataset was extracted to form the monthly clear-sky curve, which represents the standard “clear-sky days” of each month. The historical power data and the preliminary forecasted value of future power were normalized with the clear-sky curve as the standard, and the CSPC (including the real value in the past and the forecasted value in the future) was obtained;
  • AP weather clustering: The mean and variance of daily CSPC were calculated and subsequently used as clustering indicators for AP clustering, classifying data points into three weather types based on PV output characteristics: sunny, cloudy, and changeable weather;
  • Combined CEEMDAN-BiLSTM model: The CEEMDAN decomposition algorithm was used to decompose the changeable day data into n IMF components and one residual component in order to reduce the non-stationarity of the data, and they were then input into the BiLSTM network for the forecasting;
  • Clear-sky denormalization: The CSPC was denormalized according to the clear-sky curve in order to obtain the final power forecasting results.
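Two of the steps above, the clustering indicators and the final recombination, are simple enough to sketch directly. The function names are hypothetical, the population variance (NumPy's default) is an assumption because the text does not say which variance estimator is used, and `persistence` is only a placeholder for a trained BiLSTM.

```python
import numpy as np

def daily_features(sigma):
    """Clustering indicators: mean and variance of each day's CSPC curve.

    sigma: array of shape (days, time steps per day).
    """
    sigma = np.asarray(sigma, dtype=float)
    # Assumption: population variance (ddof=0)
    return np.column_stack([sigma.mean(axis=1), sigma.var(axis=1)])

def forecast_power(components, forecasters, clear_sky_value):
    """Sum the per-component CSPC forecasts, then denormalize with C_j."""
    sigma_hat = sum(f(c) for f, c in zip(forecasters, components))
    return sigma_hat * clear_sky_value

def persistence(history):
    # Placeholder for a trained BiLSTM: repeat the last observed value
    return history[-1]

features = daily_features([[1.0, 1.0, 1.0],
                           [0.0, 1.0, 2.0]])
p_hat = forecast_power([[0.1, 0.2], [0.3, 0.4]],
                       [persistence, persistence],
                       clear_sky_value=2.0)
```

Forecasting each component separately and summing the results works because the decomposition is additive: the IMFs and the residual sum to the original CSPC series.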

4. Results and Analysis

4.1. Data Description

In order to verify the weather clustering method and combined model forecasting method in this paper, a site of the Desert Knowledge Australia Solar Center (DKASC) was taken as the research object [33]. DKASC is located in the town of Alice Springs in the Northern Territory of Australia, which has a dry desert climate and rich solar energy resources.
The measured data of the PV output for one year and four months were selected as the sample, in which a whole year’s data were used as the training set, two months of data as the verification set, and a month of data as the test set. The sampling interval was 15 min, and 96 data points were collected every day. The model uses a rolling forecast, using the values of the previous 24 h (96 values) as input to forecast the values for the next moment. Typical sunny, cloudy, and changeable days in each season were selected from the training set and are displayed in Figure 4.
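The rolling forecast setup (96 past values in, one value out) corresponds to a sliding-window construction of the training samples, sketched below. The function name `make_windows` is an illustrative assumption.

```python
import numpy as np

def make_windows(series, lookback=96):
    """Build (X, y) pairs: each row of X holds `lookback` consecutive values,
    and y holds the value that immediately follows each window."""
    series = np.asarray(series, dtype=float)
    n_samples = series.size - lookback
    if n_samples <= 0:
        raise ValueError("series must be longer than the lookback window")
    X = np.stack([series[i:i + lookback] for i in range(n_samples)])
    y = series[lookback:]
    return X, y

# 100 toy samples give 4 windows of 96 values each
X, y = make_windows(np.arange(100), lookback=96)
```

At a 15 min sampling interval, a lookback of 96 covers exactly the previous 24 h, and the window advances by one step for each new forecast.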
The computer hardware facilities used in the experiment were: AMD Ryzen 7 5800H CPU, NVIDIA GeForce RTX 3070 graphics card, and 16 GB memory.
From the various types of weather and its corresponding CSPC, as shown in Figure 4, it can be seen that the CSPC curve for sunny days was relatively smooth, with a magnitude close to 1. The volatility of the CSPC curve for cloudy days was small, with a slightly smaller magnitude. The CSPC for changeable weather had a large volatility, which was non-stationary from the point of view of the time series.

4.2. Model Evaluation Criteria

The commonly used evaluation indicators of regression models—the mean absolute error (MAE), mean absolute percentage error (MAPE) and root mean square error (RMSE)—were used to evaluate and compare the accuracy of the forecasting model. The calculation formulas are as follows [34]:
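$$\mathrm{MAE} = \frac{1}{N} \sum_{t=1}^{N} \left| P_t - \hat{P}_t \right|$$
$$\mathrm{MAPE} = \frac{1}{N} \sum_{t=1}^{N} \left| \frac{P_t - \hat{P}_t}{P_t} \right| \times 100\%$$
$$\mathrm{RMSE} = \sqrt{ \frac{1}{N} \sum_{t=1}^{N} \left( P_t - \hat{P}_t \right)^2 }$$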
where P t represents the actual value of power, P ^ t represents the forecasted value of power, and N represents the number of points of the forecasted value of future power.
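The three indicators can be computed as in the sketch below. One detail is an assumption rather than something stated in the text: because the MAPE is undefined when the actual power is zero (for example at night), samples with zero actual power are excluded from the MAPE average here. The function name `forecast_errors` is hypothetical.

```python
import numpy as np

def forecast_errors(actual, predicted):
    """Return (MAE, MAPE in percent, RMSE) for two equal-length sequences."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    err = actual - predicted
    mae = np.mean(np.abs(err))
    rmse = np.sqrt(np.mean(err ** 2))
    # Assumption: zero-power samples are skipped to avoid division by zero
    nonzero = actual != 0
    mape = np.mean(np.abs(err[nonzero] / actual[nonzero])) * 100.0
    return mae, mape, rmse

mae, mape, rmse = forecast_errors([100.0, 200.0], [110.0, 190.0])
```

For this toy input, both errors have magnitude 10, so the MAE and RMSE are both 10, while the MAPE averages 10% and 5% to give 7.5%.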

4.3. Experimental Results and Analysis

First, the PV power of the whole year of the training set was taken as the dataset, the maximum value at each time of day in each month was calculated, and the clear-sky curve of the “clear-sky day” for each of the 12 months was extracted, as shown in Figure 5. Australia is located in the Southern Hemisphere. Alice Springs lies in the Northern Territory, in the central part of Australia, and has an arid desert climate. Every year, September to November is spring, December to February is summer, March to May is autumn, and June to August is winter. It can be seen from Figure 5 that the PV output fluctuates seasonally. In summer, due to weather factors such as a long duration of sunshine, the PV output value of the clear-sky curve was significantly higher than that of other seasons. The PV modules start generating early and stop late each day, so the daily generation period is long. In winter, the PV power generation capacity is low and the daily generation period is also short.
The historical PV power data were input into the BiLSTM model for the preliminary forecasting, and the preliminary forecasted value of the PV power at the future time was obtained. According to the corresponding clear-sky curve, the historical PV power data and the forecasted value of the future PV power data were clear-sky normalized to obtain the CSPC series. Considering the mean and variance of the PV output at all times of the day as the clustering index, the CSPC was clustered based on the AP clustering algorithm, and three weather types were clustered: sunny, cloudy, and changeable. The clustering results are shown in Figure 6.
In Figure 6, the horizontal axis is the mean value of the CSPC, and the vertical axis is the variance. In the classification results, sunny days have a large CSPC mean and a small variance; the mean and variance of changeable days lie within a certain numerical range; cloudy days have a small CSPC mean. In the current classification, the distance between different dates is measured by the similarity matrix expressed by the 2-norm in Equation (4). Different ways of defining the similarity matrix will produce different weather type classifications. Therefore, a more detailed classification of weather types can be achieved by defining a more complex similarity matrix.
Considering the non-stationarity of the changeable day data, Gaussian white noise was added to the changeable day sequence, and the CEEMDAN algorithm was used to decompose the sequence into 12 IMF components and a residual component step by step. The specific decomposition results are shown in Figure 7. As can be seen, due to the strong non-stationarity of the abruptly changeable day data, many modes were obtained by decomposition, and the fluctuation characteristics differed greatly from each other. If the forecasting is made directly without decomposition, the forecasting accuracy of the model will decline. Among them, IMF1 to IMF4 showed the characteristics of high frequency and strong randomness, which makes it difficult to forecast, but it cannot be removed as the randomness component because of its large amplitude change, otherwise it will affect the forecasting accuracy. IMF5 to IMF12 had a lower frequency and certain periodic change pattern, which makes the forecasting less difficult; Res was the trend component, and its trend indicated the overall decreasing trend of PV power. Therefore, it is important to build BiLSTM models for each component separately for training and forecasting.
The BiLSTM models were established to forecast each modal component of the CSPC on changeable days, and the forecasting results were de-normalized according to the corresponding clear-sky curve to obtain the final photovoltaic power forecasting results. In order to verify the validity of the forecasting method proposed in this paper, the BP neural network, BiLSTM neural network, and CEEMDAN-BiLSTM forecasting model were established for changeable weather, respectively. According to the evaluation formula described in Section 4.2, the errors of various forecasting methods were compared, as shown in Table 1.
As can be seen from Table 1, traditional power forecasting algorithms such as BP and BiLSTM had a large deviation when forecasting the power in changeable weather, mainly because the PV power curve corresponding to changeable weather has obvious non-stationary characteristics, which makes it difficult for the neural network to learn its regularity. In the method proposed in this paper, the clear-sky normalization method converts the nonlinear daily PV output into the linear CSPC. At the same time, based on the CEEMDAN modal decomposition method, the non-stationary CSPC time series corresponding to changeable weather is divided into a number of modes, and the BiLSTM neural network is suitable for learning and forecasting this information. The MAE, MAPE, and RMSE of this method were 0.029 MW, 2.771%, and 5.530 MW, respectively, which were much smaller than those of the other models. Thus, for time series with non-stationary characteristics, linearization and mode decomposition help to improve the forecasting accuracy. The ACF and PACF of the final residuals for changeable days are shown in Figure 8. In Figure 8, the red dots in the left image represent the autocorrelation function of the final residual sample, the red dots in the right image represent the partial autocorrelation function of the final residual sample, the abscissa is the number of lags, and the blue lines represent the 95% confidence interval.
In the test set, data of a typical day during changeable weather were selected, and the forecasting results of the four methods were compared, as shown in Figure 9. It can be seen that the fluctuation in the power output curve of changeable weather was very strong, and the PV power at adjacent moments was significantly different, which is very difficult to forecast, resulting in the various methods having an uneven forecasting accuracy. The three comparison models had a large deviation at the moment of drastic changes in the photovoltaic power, whereas the curve of the model in this paper basically conformed to the real value.

5. Conclusions

The accuracy of power forecasting has become an important technical challenge due to the high uncertainty of PV power forecasting, especially in the case of changeable weather. In this paper, we used the PV power curve information of standard clear-sky days to normalize the daily PV power curve into CSPC. On this basis, different types of weather days were classified, changeable types of weather were selected, and the corresponding PV power was decomposed and forecasted. The following conclusions were drawn:
  • The normalized daily CSPC can reflect, to a certain extent, the weather changes that affect PV power generation. In this paper, the weather types were divided into sunny days, cloudy days, and changeable days; they can be further divided into more complex types based on the curve characteristics of the daily CSPC.
  • Due to the complexity of changeable days, the PV power curve is strongly non-stationary, which tends to cause low forecasting accuracy. The daily PV output power curve can be linearized by the clear-sky normalization method, and modal decomposition together with the strategy of forecasting each component separately helps to improve the accuracy.
The methods described in this article, namely, the linearization, classification, decomposition, and forecasting of non-stationary signals, can provide references for wind power and power grid load forecasting.
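The normalization and recombination steps summarized above can be sketched as follows. This is a simplified illustration and not the authors' implementation: it assumes the CSPC is the ratio of measured power to the clear-sky power at the same time step, that the daily mean and variance of the CSPC serve as clustering features, and that the component forecasts are summed and rescaled by the clear-sky power to recover a power forecast. All function names are hypothetical, and the decomposition and BiLSTM steps themselves are not shown.

```python
def cspc(power, clear_sky, eps=1e-6):
    """Clear-sky power coefficient per sample.

    Assumption: the coefficient is set to 0 where the clear-sky power is
    effectively zero (e.g. at night) to avoid division by zero.
    """
    return [p / c if c > eps else 0.0 for p, c in zip(power, clear_sky)]


def daily_features(coeffs):
    """Mean and (population) variance of one day's CSPC curve."""
    n = len(coeffs)
    mean = sum(coeffs) / n
    var = sum((v - mean) ** 2 for v in coeffs) / n
    return mean, var


def recombine(component_forecasts, clear_sky):
    """Sum per-component CSPC forecasts and rescale them to power."""
    total = [sum(vals) for vals in zip(*component_forecasts)]
    return [t * c for t, c in zip(total, clear_sky)]


# Illustrative values only, not data from the paper.
coeffs = cspc([0.0, 2.0, 3.0, 1.0], [0.0, 4.0, 4.0, 2.0])
features = daily_features(coeffs)
power_forecast = recombine([[0.25, 0.5], [0.25, 0.25]], [4.0, 2.0])
```

Summing the component forecasts relies on the decomposition being additive, so that the modes and the residue add back up to the original series.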

Author Contributions

Conceptualization, J.Z. and Y.H.; Methodology, Y.H.; Software, Y.H.; Validation, J.Z. and Y.H.; Formal analysis, J.Z., R.F. and Z.W.; Investigation, J.Z.; Resources, J.Z. and Z.W.; Data curation, J.Z.; Writing – original draft preparation, Y.H. and J.Z.; Writing – review and editing, J.Z., Y.H., R.F. and Z.W.; Visualization, Y.H.; Supervision, J.Z. and R.F.; Project administration, J.Z.; Funding acquisition, J.Z. All authors have read and agreed to the published version of the manuscript.


Funding

This study was supported by the “State Grid Tianjin Electric Power Company science and technology project: KJ22-1-22”.

Data Availability Statement

Not applicable.


Acknowledgments

We wish to thank Lingling Li, Fengxian Li, Xinrui Liu, and Chenxu Huang for providing technical support.

Conflicts of Interest

The authors declare no conflict of interest.


  1. Sobri, S.; Koohi-Kamali, S.; Rahim, N.A. Solar photovoltaic generation forecasting methods: A review. Energy Convers. Manag. 2018, 156, 459–497. [Google Scholar] [CrossRef]
  2. Zhang, Y.; Ma, T.; Yang, H. Grid-connected photovoltaic battery systems: A comprehensive review and perspectives. Appl. Energy 2022, 328, 120182. [Google Scholar] [CrossRef]
  3. Hao, D.; Qi, L.; Tairab, A.M.; Ahmed, A.; Azam, A.; Luo, D.; Pan, Y.; Zhang, Z.; Yan, J. Solar energy harvesting technologies for PV self-powered applications: A comprehensive review. Renew. Energy 2022, 188, 678–697. [Google Scholar] [CrossRef]
  4. Wang, M.; Wang, P.; Zhang, T. Evidential Extreme Learning Machine Algorithm-Based Day-Ahead Photovoltaic Power Forecasting. Energies 2022, 15, 3882. [Google Scholar] [CrossRef]
  5. Alcañiz, A.; Grzebyk, D.; Ziar, H.; Isabella, O. Trends and gaps in photovoltaic power forecasting with machine learning. Energy Rep. 2023, 9, 447–471. [Google Scholar] [CrossRef]
  6. Feng, C.; Liu, Y.; Zhang, J. A taxonomical review on recent artificial intelligence applications to PV integration into power grids. Int. J. Electr. Power Energy Syst. 2021, 132, 107176. [Google Scholar] [CrossRef]
  7. Mayer, M.J.; Gróf, G. Extensive comparison of physical models for photovoltaic power forecasting. Appl. Energy 2021, 283, 116239. [Google Scholar] [CrossRef]
  8. Salamanis, A.I.; Xanthopoulou, G.; Bezas, N.; Timplalexis, C.; Bintoudi, A.D.; Zyglakis, L.; Tsolakis, A.C.; Ioannidis, D.; Kehagias, D.; Tzovaras, D. Benchmark Comparison of Analytical, Data-Based and Hybrid Models for Multi-Step Short-Term Photovoltaic Power Generation Forecasting. Energies 2020, 13, 5978. [Google Scholar] [CrossRef]
  9. Pombo, D.V.; Rincón, M.J.; Bacher, P.; Bindner, H.W.; Spataru, S.V.; Sørensen, P.E. Assessing stacked physics-informed machine learning models for co-located wind–solar power forecasting. Sustain. Energy Grids Netw. 2022, 32, 100943. [Google Scholar] [CrossRef]
  10. Kushwaha, V.; Pindoriya, N.M. A SARIMA-RVFL hybrid model assisted by wavelet decomposition for very short-term solar PV power generation forecast. Renew. Energy 2019, 140, 124–139. [Google Scholar] [CrossRef]
  11. Wolff, B.; Kühnert, J.; Lorenz, E.; Kramer, O.; Heinemann, D. Comparing support vector regression for PV power forecasting to a physical modeling approach using measurement, numerical weather prediction, and cloud motion data. Sol. Energy 2016, 135, 197–208. [Google Scholar] [CrossRef]
  12. Wang, Y.; Chi, P.; Nie, R.; Ma, X.; Wu, W.; Guo, B. Self-adaptive discrete grey model based on a novel fractional order reverse accumulation sequence and its application in forecasting clean energy power generation in China. Energy 2022, 253, 124093. [Google Scholar] [CrossRef]
  13. Korkmaz, D. SolarNet: A hybrid reliable model based on convolutional neural network and variational mode decomposition for hourly photovoltaic power forecasting. Appl. Energy 2021, 300, 117410. [Google Scholar] [CrossRef]
  14. Limouni, T.; Yaagoubi, R.; Bouziane, K.; Guissi, K.; Baali, E.H. Accurate one step and multistep forecasting of very short-term PV power using LSTM-TCN model. Renew. Energy 2023, 205, 1010–1024. [Google Scholar] [CrossRef]
  15. Bibi, N.; Shah, I.; Alsubie, A.; Ali, S.; Lone, S.A. Electricity Spot Prices Forecasting Based on Ensemble Learning. IEEE Access 2021, 9, 150984–150992. [Google Scholar] [CrossRef]
  16. Dong, Y.; Xiao, L.; Wang, J.; Wang, J. A time series attention mechanism based model for tourism demand forecasting. Inf. Sci. 2023, 628, 269–290. [Google Scholar] [CrossRef]
  17. Jan, F.; Shah, I.; Ali, S. Short-Term Electricity Prices Forecasting Using Functional Time Series Analysis. Energies 2022, 15, 3423. [Google Scholar] [CrossRef]
  18. Wang, Z.; Gao, R.; Wang, P.; Chen, H. A new perspective on air quality index time series forecasting: A ternary interval decomposition ensemble learning paradigm. Technol. Forecast. Soc. Chang. 2023, 191, 122504. [Google Scholar] [CrossRef]
  19. Gao, M.; Li, J.; Hong, F.; Long, D. Day-ahead power forecasting in a large-scale photovoltaic plant based on weather classification using LSTM. Energy 2019, 187, 115838. [Google Scholar] [CrossRef]
  20. Wang, X.; Sun, Y.; Luo, D.; Peng, J. Comparative study of machine learning approaches for predicting short-term photovoltaic power output based on weather type classification. Energy 2022, 240, 122733. [Google Scholar] [CrossRef]
  21. Li, Q.; Zhang, X.; Ma, T.; Liu, D.; Wang, H.; Hu, W. A Multi-step ahead photovoltaic power forecasting model based on TimeGAN, Soft DTW-based K-medoids clustering, and a CNN-GRU hybrid neural network. Energy Rep. 2022, 8, 10346–10362. [Google Scholar] [CrossRef]
  22. Chen, C.; Duan, S.; Cai, T.; Liu, B. Online 24-h solar power forecasting based on weather type classification using artificial neural network. Sol. Energy 2011, 85, 2856–2870. [Google Scholar] [CrossRef]
  23. Wang, F.; Zhang, Z.; Liu, C.; Yu, Y.; Pang, S.; Duić, N.; Shafie-khah, M.; Catalão, J.P.S. Generative adversarial networks and convolutional neural networks based weather classification model for day ahead short-term photovoltaic power forecasting. Energy Convers. Manag. 2019, 181, 443–462. [Google Scholar] [CrossRef]
  24. Wang, L.; Mao, M.; Xie, J.; Liao, Z.; Zhang, H.; Li, H. Accurate solar PV power prediction interval method based on frequency-domain decomposition and LSTM model. Energy 2023, 262, 125592. [Google Scholar] [CrossRef]
  25. Zhang, C.; Peng, T.; Nazir, M.S. A novel integrated photovoltaic power forecasting model based on variational mode decomposition and CNN-BiGRU considering meteorological variables. Electr. Power Syst. Res. 2022, 213, 108796. [Google Scholar] [CrossRef]
  26. Li, Z.; Xu, R.; Luo, X.; Cao, X.; Du, S.; Sun, H. Short-term photovoltaic power prediction based on modal reconstruction and hybrid deep learning model. Energy Rep. 2022, 8, 9919–9932. [Google Scholar] [CrossRef]
  27. Zhou, Y.; Wang, J.; Li, Z.; Lu, H. Short-term photovoltaic power forecasting based on signal decomposition and machine learning optimization. Energy Convers. Manag. 2022, 267, 115944. [Google Scholar] [CrossRef]
  28. Niu, D.; Wang, K.; Sun, L.; Wu, J.; Xu, X. Short-term photovoltaic power generation forecasting based on random forest feature selection and CEEMD: A case study. Appl. Soft Comput. 2020, 93, 106389. [Google Scholar] [CrossRef]
  29. Zhang, J.; Liu, Z.; Chen, T. Interval prediction of ultra-short-term photovoltaic power based on a hybrid model. Electr. Power Syst. Res. 2023, 216, 109035. [Google Scholar] [CrossRef]
  30. Frey, B.J.; Dueck, D. Clustering by passing messages between data points. Science 2007, 315, 972–976. [Google Scholar] [CrossRef] [PubMed][Green Version]
  31. Gao, B.; Huang, X.; Shi, J.; Tai, Y.; Zhang, J. Hourly forecasting of solar irradiance based on CEEMDAN and multi-strategy CNN-LSTM neural networks. Renew. Energy 2020, 162, 1665–1683. [Google Scholar] [CrossRef]
  32. Peng, T.; Zhang, C.; Zhou, J.; Nazir, M.S. An integrated framework of Bi-directional long-short term memory (BiLSTM) based on sine cosine algorithm for hourly solar radiation forecasting. Energy 2021, 221, 119887. [Google Scholar] [CrossRef]
  33. DKASC. Alice Springs. Available online: (accessed on 11 December 2022).
  34. Liu, L.; Liu, F.; Zheng, Y. A Novel Ultra-Short-Term PV Power Forecasting Method Based on DBN-Based Takagi-Sugeno Fuzzy Model. Energies 2021, 14, 6447. [Google Scholar] [CrossRef]
Figure 1. The LSTM structure diagram.
Figure 2. The BiLSTM network structure diagram.
Figure 3. The ultra-short-term PV power forecasting framework based on weather type clustering.
Figure 4. Comparison of the PV power and the CSPC for the sunny, cloudy, and changeable days for each season.
Figure 5. Monthly clear-sky curve of the training set.
Figure 6. Results of the weather clustering.
Figure 7. The CEEMDAN results for changeable weather CSPC.
Figure 8. The ACF and PACF plots of the final residuals for changeable days.
Figure 9. Forecasting results of changeable days using different methods.
Table 1. Comparison of the PV power forecasting errors of different methods (changeable days).
Method | MAE (MW) | MAPE (%) | RMSE (MW)
The proposed method | 0.029 | 2.771 | 0.055
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.


Zhang, J.; Hao, Y.; Fan, R.; Wang, Z. An Ultra-Short-Term PV Power Forecasting Method for Changeable Weather Based on Clustering and Signal Decomposition. Energies 2023, 16, 3092.
