1. Introduction
Harmonic prediction in photovoltaic (PV) power plants is key to supporting the safe operation of power systems and the development of dispatching strategies [1]. Because voltage harmonics are easily disturbed by many factors, they are non-linear and non-stationary while remaining periodic, which makes harmonic prediction and detection more challenging. Predicting power harmonics from constructed similar-day sets can supply the real-time data required for regional smart grid regulation and control [2].
Current forecasting methods fall mainly into conventional statistical methods and machine learning techniques. Traditional statistical techniques include the time series method [3,4] and the Kalman filter [5]. Machine learning methods include artificial neural networks [6], Long Short-Term Memory (LSTM) neural networks [7,8], and Least Squares Support Vector Machines (LSSVM) [9]. Reference [10] proposed a short-term load prediction method based on a Kalman filter, but the algorithm is not suited to non-linear, non-stationary time series and its prediction accuracy is poor. Reference [11] used Complete Ensemble Empirical Mode Decomposition with Adaptive Noise (CEEMDAN) to extract the time-frequency features of wind speed sequences, trained an echo state network on these inputs, and then applied LSSVM to correct the error and improve the prediction accuracy. Reference [12] used an improved Particle Swarm Optimization (PSO) algorithm to optimize the parameters of a Multi-Kernel Extreme Learning Machine (MKELM) for prediction, but the global search capability and convergence speed of PSO were limited and the prediction results were unsatisfactory. In summary, traditional statistical methods map non-linear and non-stationary data poorly, whereas machine-learning-based methods generalize well to unknown non-linear time series but make the hyper-parameter search difficult.
To address the difficulty of hyper-parameter search in a single machine learning model, many scholars have proposed combined prediction models that use signal decomposition to mine time series features more deeply and swarm intelligence optimization algorithms to tune the model parameters [13,14,15]. In reference [16], wind speed data were decomposed by Wavelet Transform (WT) and fed into an LSTM network to capture their time-series characteristics. In reference [17], the time series were decomposed into several subseries by Variational Mode Decomposition (VMD), the subseries were filtered and reconstructed using sample entropy, and the reconstructed subseries were input to a Gated Recurrent Unit (GRU) network, which achieved better prediction results.
Historical data need to be pre-processed and standardized before input. Reference [18] screened the input data only for external factors and did not construct similar-day sets from the historical data, so the input contained data with little correlation to the day to be predicted, which interfered with the prediction results. In reference [19], which considered holiday forecasts, only time series factors were taken into account when constructing similar-day sets, and key factors such as weather conditions and temperature were ignored. Reference [20] used an improved Grey Relational Analysis (GRA) to construct a similar-day set but did not apply a secondary clustering selection of similar days, so the final similar-day set was not accurate enough. As a result, data cleaning and anomalous data removal were insufficient, which reduced the prediction accuracy.
In summary, traditional prediction methods cannot account for the influence of multiple external factors on harmonic prediction, whereas the Kernel Extreme Learning Machine (KELM) can learn the relationship between harmonic data and external data and handles non-linear and non-stationary problems better. GRA filters historical data by the geometric similarity of their curves, and K-means clustering selects historical data by the similarity of their values. Harris Hawk Optimization (HHO) offers fast convergence and a reliable global search. Therefore, a harmonic prediction method for photovoltaic power plants based on GRA and the VMD-HHO-KELM model is proposed. First, the significant influencing factors are chosen using the Pearson correlation coefficient, and then GRA and K-means clustering are used to create the final similar-day set. Next, VMD is applied to decompose the harmonic data of the similar-day set, and each decomposed subsequence is input separately to the constructed HHO-KELM network for prediction. Finally, the prediction results of all subseries are superimposed, numerical evaluation indicators are introduced, and the proposed approach is validated in simulation by comparison with LSTM, GRU, and other models.
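As a rough illustration of the first two stages of this pipeline, the following minimal Python sketch screens external factors by Pearson correlation and ranks historical days by their grey relational grade. The function names, the 0.3 correlation threshold, and the resolution coefficient ρ = 0.5 are illustrative assumptions rather than the settings used in this study, and the factor values are assumed to be normalized beforehand.

```python
import numpy as np

def pearson_screen(factors, harmonic, threshold=0.3):
    """Return the indices of factor columns whose |Pearson r| with the
    harmonic series reaches the (assumed) threshold, plus all r values."""
    factors = np.asarray(factors, dtype=float)
    harmonic = np.asarray(harmonic, dtype=float)
    r = np.array([np.corrcoef(factors[:, j], harmonic)[0, 1]
                  for j in range(factors.shape[1])])
    return np.flatnonzero(np.abs(r) >= threshold), r

def grey_relational_grade(reference, candidates, rho=0.5):
    """Grey relational grade of each candidate day (row) with respect to the
    reference day; rho is the resolution coefficient (assumed to be 0.5)."""
    reference = np.asarray(reference, dtype=float)
    candidates = np.asarray(candidates, dtype=float)
    delta = np.abs(candidates - reference)      # absolute differences per factor
    d_min, d_max = delta.min(), delta.max()
    if d_max == 0.0:                            # every candidate equals the reference
        return np.ones(candidates.shape[0])
    xi = (d_min + rho * d_max) / (delta + rho * d_max)   # relational coefficients
    return xi.mean(axis=1)                      # average over factors
```

Days with the highest grades would form the preliminary similar-day set, which K-means clustering could then refine.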
The main contributions of this study are summarized as follows:
- (1) The HHO algorithm has good convergence speed and search efficiency. Applying it to the KELM neural network overcomes the difficulty of selecting parameters and establishes a good prediction model for the harmonic content of infrastructure construction.
- (2) To address the unclear spatiotemporal trend flow of harmonics in PV power plants, the difficulty of tracing their sources, and the delay and poor accuracy of existing traditional prediction models, this paper constructs a GRA-VMD-HHO-KELM harmonic prediction model, which can eliminate the harm caused by the pollution of each harmonic component and accurately predict the individual harmonic orders and the harmonic content.
The structure of the paper is as follows. Section 2 describes the problem of harmonic prediction for photovoltaic power plants. Section 3 introduces the theoretical basis, including the KELM model, the harmonic detection method based on VMD, and the design of the HHO algorithm. Section 4 proposes the strategy structure for voltage harmonic signal prediction, which combines the above methods. Section 5 presents the experimental analysis and verifies the superiority of the proposed method by comparing the prediction results of different methods. Conclusions are drawn in Section 6.
3. Theoretical Basis
3.1. Kernel Extreme Learning Machine
In the kernel form of the ELM algorithm, only an appropriate kernel function K(u, v) must be provided; the hidden-layer feature mapping h(x) does not need to be known [23], and the number of hidden-layer nodes L does not need to be specified. This effectively mitigates the unsatisfactory generalization ability and stability caused by the randomly assigned hidden-layer parameters of ELM, and it greatly reduces the computational complexity because the number of nodes L no longer has to be optimized. Since the hidden-layer mapping h(x) is unknown, Huang et al. replaced $HH^{T}$ with a kernel matrix built from a kernel function that satisfies the Mercer condition [24]:

$$\Omega_{ELM} = HH^{T}, \qquad \Omega_{i,j} = h(x_i) \cdot h(x_j) = K(x_i, x_j)$$

where $\Omega_{ELM}$ is the kernel matrix and $\Omega_{i,j}$ is its element in row i and column j.
In the KELM algorithm, the mapping function h(x) does not need to be known and the number of nodes L does not need to be provided; only the corresponding kernel function K(u, v) is given. The KELM algorithm proceeds as follows [24].
Input: training samples $\{(x_i, t_i)\}_{i=1}^{N}$, kernel function K(u, v). Output:

$$f(x) = \left[ K(x, x_1), \ldots, K(x, x_N) \right] \left( \frac{I}{\lambda} + \Omega_{ELM} \right)^{-1} T$$

where K(·, ·) is the kernel function, and T and I are the target matrix and the identity matrix, respectively; λ is the regularization coefficient: the smaller the λ, the stronger the model generalization ability, and the larger the λ, the higher the model prediction accuracy. The selection of an appropriate λ is therefore crucial to the model. The HHO algorithm has few adjustable parameters, a strong global search capability, and high search efficiency, so it is introduced to optimize the KELM regularization coefficient λ and the kernel function parameters.
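The training and prediction steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not the implementation used in this study: the Gaussian (RBF) kernel and its width `sigma` are assumptions, since the kernel is not named in this section, and the `I/λ + Ω` form of the linear system is the common KELM formulation, chosen because it agrees with the role of λ described above.

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    """Gaussian kernel matrix K[i, j] = exp(-||a_i - b_j||^2 / (2 sigma^2))."""
    d2 = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma ** 2))

class KELM:
    """Kernel extreme learning machine with an (assumed) RBF kernel."""

    def __init__(self, lam=1e3, sigma=1.0):
        self.lam = lam        # regularization coefficient (lambda in the text)
        self.sigma = sigma    # kernel width (hypothetical kernel parameter)

    def fit(self, X, T):
        self.X_train = np.asarray(X, dtype=float)
        omega = rbf_kernel(self.X_train, self.X_train, self.sigma)   # kernel matrix
        n = omega.shape[0]
        # Solve (I / lambda + Omega) beta = T instead of forming the inverse
        self.beta = np.linalg.solve(np.eye(n) / self.lam + omega,
                                    np.asarray(T, dtype=float))
        return self

    def predict(self, X):
        k = rbf_kernel(np.asarray(X, dtype=float), self.X_train, self.sigma)
        return k @ self.beta
```

In the proposed model, `lam` and `sigma` would be the two quantities searched by HHO, with a validation error as the fitness value.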
3.2. Harmonic/Inter-Harmonic Detection Method Based on VMD
VMD is a fully non-recursive signal processing method proposed for non-stationary and non-linear signals [25]. Since the harmonic voltage signal of PV power plants is both periodic and volatile, K-means clustering is first applied to select the similar-day set, and the VMD method is then used to decompose the historical harmonic voltage curve into multiple relatively stationary subseries with different frequencies. This effectively reduces the complexity of the harmonic curve and the non-stationarity and non-linearity of the time series. The implementation steps are as follows.
Noise is handled according to the experimental requirements and the actual situation of the power station. Since noise is not the focus of this paper, a conventional filtering method was applied before the VMD decomposition, and higher harmonics above 750 Hz were eliminated. After this elimination, the measured harmonics were decomposed into the corresponding modes.
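The filtering method is not specified further here. As one possible stand-in, the sketch below removes all spectral content above 750 Hz with an ideal (brick-wall) FFT low-pass filter; the function name and the choice of a brick-wall filter are assumptions made only for illustration.

```python
import numpy as np

def fft_lowpass(signal, fs, cutoff_hz=750.0):
    """Zero every frequency bin above cutoff_hz and transform back.
    fs is the sampling rate in Hz."""
    signal = np.asarray(signal, dtype=float)
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(signal.size, d=1.0 / fs)   # bin frequencies in Hz
    spectrum[freqs > cutoff_hz] = 0.0
    return np.fft.irfft(spectrum, n=signal.size)
```

A filter with a smoother roll-off would usually be preferred on measured data to limit ringing near the cutoff.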
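To ensure that each mode is a finite-bandwidth component around a center frequency and that the sum of the estimated bandwidths of the modes is minimized, the constrained variational problem is [25]:

$$\min_{\{u_k\},\{\omega_k\}} \left\{ \sum_{k=1}^{K} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * u_k(t) \right] e^{-j\omega_k t} \right\|_2^2 \right\} \quad \text{s.t.} \quad \sum_{k=1}^{K} u_k(t) = f(t)$$

where $\{u_k\} = \{u_1, u_2, \ldots, u_K\}$ are the K modal components obtained from the decomposition; $\{\omega_k\} = \{\omega_1, \omega_2, \ldots, \omega_K\}$ are the center frequencies of the components; $\delta(t)$ is the impulse function; $f(t)$ is the original signal; and $*$ denotes convolution. The main iterative VMD solving process is as follows: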
Step 1. Initialize $\{u_k^1\}$, $\{\omega_k^1\}$, the Lagrange multiplier $\lambda^1$, and the maximum number of iterations n;
Step 2. Update $u_k$, $\omega_k$, and $\lambda$;
Step 3. Given the convergence tolerance $\varepsilon > 0$, if the iteration stopping condition is not satisfied, return to Step 2; otherwise, terminate the iteration and output each modal component $u_k$.
VMD parameter setting: the number of VMD components K strongly affects the decomposition. K is usually chosen between 3 and 8; in this paper K = 6. The VMD penalty factor, initial center frequency setting, and convergence tolerance are set to α = 2000, init = 1, and $\varepsilon = 10^{-7}$, respectively.
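The iteration can be illustrated with a deliberately simplified, VMD-style sketch in NumPy. It keeps the Wiener-filter mode update and the center-of-gravity frequency update, but it omits the mirror extension of the signal and fixes the Lagrange multiplier at zero (no dual ascent), so it is not a full VMD implementation. The function name `vmd_sketch` and the uniform spacing of the initial center frequencies are assumptions.

```python
import numpy as np

def vmd_sketch(signal, K=6, alpha=2000.0, tol=1e-7, max_iter=500):
    """Simplified VMD: returns (modes, center_frequencies) with centers
    in cycles per sample, sorted from low to high frequency."""
    f = np.asarray(signal, dtype=float)
    n = f.size
    f_hat = np.fft.rfft(f)                       # one-sided spectrum
    freqs = np.fft.rfftfreq(n)                   # 0 ... 0.5 cycles/sample
    omega = 0.5 * np.arange(K) / K               # uniformly spread initial centers
    u_hat = np.zeros((K, freqs.size), dtype=complex)
    tiny = np.finfo(float).eps
    for _ in range(max_iter):
        u_prev = u_hat.copy()
        for k in range(K):
            # Residual spectrum left for mode k once the other modes are removed
            residual = f_hat - u_hat.sum(axis=0) + u_hat[k]
            # Wiener-filter update centered on the current omega[k]
            u_hat[k] = residual / (1.0 + alpha * (freqs - omega[k]) ** 2)
            # New center frequency: center of gravity of the mode's power spectrum
            power = np.abs(u_hat[k]) ** 2
            omega[k] = np.sum(freqs * power) / (np.sum(power) + tiny)
        # Relative change of all modes between two consecutive iterations
        change = sum(np.sum(np.abs(u_hat[k] - u_prev[k]) ** 2)
                     / (np.sum(np.abs(u_prev[k]) ** 2) + tiny) for k in range(K))
        if change < tol:
            break
    order = np.argsort(omega)
    modes = np.fft.irfft(u_hat[order], n=n, axis=1)
    return modes, omega[order]
```

Each row of `modes` would then be predicted separately, and the per-mode predictions summed to give the final harmonic forecast.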
3.3. Design of the Harris Hawk Optimization Algorithm
The HHO algorithm was proposed for solving multi-dimensional complex problems. Compared with certain conventional swarm intelligence optimization methods, it is significantly improved in stability, convergence accuracy, and speed, and it performs especially well on high-dimensional multimodal problems [26].
- (1) Global search phase:
The roundup behavior of the Harris hawks changes depending on the prey's escape energy E, which is given by:

$$E = 2E_0\left(1 - \frac{t}{T}\right)$$

where t is the current iteration number, T is the maximum number of iterations, $E_0$ is a random number in the interval (−1, 1), and E represents the prey's current escape energy.
If $|E| \ge 1$, the Harris hawks disperse and search a larger range for the prey. A random number q distinguishes the cases in which the hawks have or have not found the prey, which gives the search phase equation:

$$X(t+1) = \begin{cases} X_{rand}(t) - r_1 \left| X_{rand}(t) - 2 r_2 X(t) \right|, & q \ge 0.5 \\ \left( X_{rabbit}(t) - X_m(t) \right) - r_3 \left( LB + r_4 (UB - LB) \right), & q < 0.5 \end{cases}$$

where $X(t+1)$ is the position vector of the Harris hawk at the next iteration, $X(t)$ is the current position vector of this hawk, $X_{rabbit}(t)$ is the prey position vector, $X_{rand}(t)$ is the position of a randomly selected individual in the hawk population, UB and LB are the upper and lower bounds of the variables, and $r_1$, $r_2$, $r_3$, $r_4$, and q are random numbers in (0, 1). The mean position of the hawks $X_m(t)$ is calculated as:

$$X_m(t) = \frac{1}{N} \sum_{i=1}^{N} X_i(t)$$

where N is the number of individuals in the hawk population and $X_i(t)$ denotes the position of hawk i in iteration t.
When q ≥ 0.5, the prey is not detected by any of the hawks, so each hawk randomly selects an individual in the population as a reference for its search and updates its own position. If q < 0.5, the prey is detected, and the Harris hawk targets the prey, circles around it, and updates its position.
- (2) Local exploitation phase:
Depending on the escape behavior of the prey and the pursuit strategy of the Harris hawks, let r denote the chance that the prey escapes before the raid, with a successful escape when r < 0.5 and an unsuccessful escape when r ≥ 0.5. When the prey's escape energy satisfies |E| < 1, the prey is physically weak and the hawks enter the siege phase, in which they use one of four siege strategies depending on whether |E| is at least 0.5 and whether the prey escapes the siege [27].
Circling roundup: when |E| ≥ 0.5 and r ≥ 0.5, the prey still has enough energy to escape but cannot break out of the encirclement. The Harris hawks circle around the prey and continue to consume its energy:

$$X(t+1) = \Delta X(t) - E \left| J X_{rabbit}(t) - X(t) \right|, \qquad \Delta X(t) = X_{rabbit}(t) - X(t)$$

where $X(t+1)$ is the position of the Harris hawk at the next iteration, $\Delta X(t)$ is the difference between the prey position and the Harris hawk position at iteration t, and J denotes the random jump strength of the prey.
Strong raid: when |E| < 0.5 and r ≥ 0.5, the Harris hawks consider the prey to be physically exhausted and make a final raid on it:

$$X(t+1) = X_{rabbit}(t) - E \left| \Delta X(t) \right|$$
Hovering roundup and progressive dive attack: when |E| ≥ 0.5 and r < 0.5, the prey is energetic and still has a chance to escape, and the HHO algorithm introduces Levy flight (LF) to model the deceptive, zigzag motion of the prey during the escape phase. The hawks evaluate their next move according to the following equation:

$$Y = X_{rabbit}(t) - E \left| J X_{rabbit}(t) - X(t) \right|$$

The fitness of the obtained Y is then compared with that of the current position to determine whether the hunt would be successful. If it would not, the hawks make irregular, abrupt dives and update their position based on LF, as follows:

$$Z = Y + S \times LF(D)$$

where S is a D-dimensional random vector and D is the problem dimension. With $F(\cdot)$ denoting the fitness function, the final position update for this phase is

$$X(t+1) = \begin{cases} Y, & F(Y) < F(X(t)) \\ Z, & F(Z) < F(X(t)) \end{cases}$$

Strong raid with progressive dive attack: when |E| < 0.5 and r < 0.5, the prey is low on energy but still has a chance to escape. The movement of the hawks is similar to the progressive dive attack in the hovering roundup, except that the hawks try to reduce their average distance to the prey. The following rules are therefore applied under strong siege conditions:

$$X(t+1) = \begin{cases} Y, & F(Y) < F(X(t)) \\ Z, & F(Z) < F(X(t)) \end{cases}, \qquad Y = X_{rabbit}(t) - E \left| J X_{rabbit}(t) - X_m(t) \right|, \quad Z = Y + S \times LF(D)$$
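Taken together, the global search phase and the four siege strategies can be sketched as a compact minimizer. The code below is an illustrative reading of the standard HHO update rules, not the implementation used in this study. The names `hho_minimize` and `levy_flight`, the population size, the Levy exponent β = 1.5, the 0.01 step scale, and the jump strength J = 2(1 − rand) are assumptions taken from common HHO reference code. Unlike some implementations, the best position is refreshed immediately after each hawk moves.

```python
import math
import numpy as np

def levy_flight(dim, rng, beta=1.5):
    """Levy-distributed step vector (Mantegna's method); beta and the
    0.01 scale are assumed values."""
    sigma = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
             / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = rng.normal(0.0, sigma, dim)
    v = rng.normal(0.0, 1.0, dim)
    return 0.01 * u / np.abs(v) ** (1 / beta)

def hho_minimize(fitness, lb, ub, n_hawks=20, max_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    lb, ub = np.asarray(lb, dtype=float), np.asarray(ub, dtype=float)
    dim = lb.size
    X = lb + rng.random((n_hawks, dim)) * (ub - lb)       # initial hawk positions
    fit = np.array([fitness(x) for x in X])
    best, best_fit = X[fit.argmin()].copy(), fit.min()    # prey (best-so-far) position
    history = []
    for t in range(max_iter):
        X_mean = X.mean(axis=0)                           # mean position of the hawks
        for i in range(n_hawks):
            E = 2.0 * rng.uniform(-1.0, 1.0) * (1.0 - t / max_iter)   # escape energy
            if abs(E) >= 1.0:                             # global search phase
                r1, r2, r3, r4, q = rng.random(5)
                if q >= 0.5:
                    X_rand = X[rng.integers(n_hawks)]
                    new = X_rand - r1 * np.abs(X_rand - 2.0 * r2 * X[i])
                else:
                    new = (best - X_mean) - r3 * (lb + r4 * (ub - lb))
            else:                                         # local exploitation phase
                r = rng.random()
                J = 2.0 * (1.0 - rng.random())            # random jump strength
                if r >= 0.5 and abs(E) >= 0.5:            # circling roundup
                    new = (best - X[i]) - E * np.abs(J * best - X[i])
                elif r >= 0.5:                            # strong raid
                    new = best - E * np.abs(best - X[i])
                else:                                     # progressive dive attacks
                    ref = X[i] if abs(E) >= 0.5 else X_mean
                    Y = np.clip(best - E * np.abs(J * best - ref), lb, ub)
                    Z = np.clip(Y + rng.random(dim) * levy_flight(dim, rng), lb, ub)
                    new = X[i]                            # keep position if neither improves
                    if fitness(Y) < fit[i]:
                        new = Y
                    elif fitness(Z) < fit[i]:
                        new = Z
            X[i] = np.clip(new, lb, ub)
            fit[i] = fitness(X[i])
            if fit[i] < best_fit:
                best, best_fit = X[i].copy(), fit[i]
        history.append(best_fit)
    return best, best_fit, history
```

For the HHO-KELM model, `fitness` would map a candidate pair of KELM hyper-parameters (the regularization coefficient and the kernel parameter) to a validation error, with `lb` and `ub` bounding the search range.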
6. Conclusions
This paper uses GRA and K-means clustering to screen and reconstruct the historical day data twice. After eliminating historical days that are weakly correlated with or irrelevant to the day to be predicted, the similar-day set is constructed and decomposed by VMD, the resulting IMF components are used as input data, and the HHO algorithm searches for the ELM/KELM hyper-parameters. The GRA-VMD-HHO-ELM/KELM power harmonic prediction model is thus constructed, the power harmonic curve of the test set for the day to be predicted is obtained, and numerical evaluation indexes are introduced to assess the effectiveness of the proposed method.
- (1) Similar day set construction: the Pearson correlation coefficient between the harmonic signal and each external factor is calculated and used to screen the external factors that have the greater influence on the harmonic value. The set of similar days is initially established, and the K-means clustering method is then used to partition it; the cluster whose center has the minimum Euclidean distance from the influencing factors of the day to be predicted is taken as the final set of similar days.
- (2) Improvement of ELM and KELM algorithms: the HHO algorithm is introduced and compared in simulation with other swarm intelligence optimization algorithms. The HHO algorithm converges in about 15 iterations, and its fitness function finally converges to 0. The other algorithms require more than 30 iterations to converge, and their converged fitness values are higher than 10. This comparison verifies the excellent convergence speed and search efficiency of the HHO algorithm.
- (3) Construction of the GRA-VMD-HHO-KELM harmonic prediction model: to address the unclear spatiotemporal trend flow of harmonics in PV power plants, the difficulty of tracing their sources, and the delay and poor accuracy of existing traditional prediction models, this paper constructs a GRA-VMD-HHO-KELM harmonic prediction model, which can eliminate the harm caused by the pollution of each harmonic component and accurately predict the individual harmonic orders and the harmonic content. The results show that the error of the prediction model is reduced by at least 39% compared with the conventional prediction method, so it can meet the need for harmonic content prediction in photovoltaic power plants.
Note that the proposed method can provide reliable data support and a theoretical basis for subsequent harmonic detection and treatment and for the dynamic operation and switching of reactive power compensation equipment. Its application effect will be verified in future work.