Label-Free Fault Detection Scheme for Inverters of PV Systems: Deep Reinforcement Learning-Based Dynamic Threshold

Seo, Giup; Yoon, Seungwook; Song, Junyoung; Srivastava, Ekta; Hwang, Euiseok

doi:10.3390/app13042470

by Giup Seo, Seungwook Yoon, Junyoung Song, Ekta Srivastava and Euiseok Hwang *

Gwangju Institute of Science and Technology (GIST), 123, Cheomdangwagi-ro, Buk-gu, Gwangju 61005, Republic of Korea

* Author to whom correspondence should be addressed.

Appl. Sci. 2023, 13(4), 2470; https://doi.org/10.3390/app13042470

Submission received: 15 December 2022 / Revised: 11 February 2023 / Accepted: 13 February 2023 / Published: 14 February 2023

(This article belongs to the Special Issue Industrial AI: Applications in Fault Detection, Diagnosis, and Prognosis)

Abstract: Generally, photovoltaic (PV) fault detection approaches can be divided into two groups: end-to-end and threshold methods. The end-to-end method typically uses a deep neural network (DNN) to learn fault patterns from labeled datasets, which directly detects whether faults occur or not. The threshold method first estimates power generation and uses thresholds to detect atypical deviations of measured values from estimated ones. The former method heavily relies on fault-labeled data and, therefore, requires the collection of abnormal event records, which is usually difficult, due to the sparseness of these events. The latter method typically uses statistical approaches, such as 3-sigma, to find thresholds, and it can be practically utilized without fault labels. However, setting a threshold with a proper confidence interval is still challenging, as PV power generation is sensitive to variations in environmental conditions, such as irradiance, ambient temperature, wind speed and humidity. In this paper, we propose a novel deep reinforcement learning (DRL)-based label-free fault detection scheme in which thresholds are dynamically assigned with suitable confidence intervals under varying environmental conditions. Various weather properties were used as input features (i.e., states) to a DRL agent, and proper thresholds were estimated in real time from the actions of the DRL agent. To this end, a reward function was designed for learning proper thresholds without fault labels under different weather conditions. To evaluate the performance of the proposed scheme, the PV dataset of the National Institute of Standards and Technology (NIST) was used, as it includes paired records of local weather and PV generation. The DRL-based scheme was compared with static and conventional dynamic threshold methods, based on statistical approaches. The results revealed that the proposed scheme outperformed the existing methods, providing a 5.67% higher $F_{1}$-score in the NIST dataset.

Keywords: label-free fault detection; photovoltaic systems; deep reinforcement learning; dynamic threshold

1. Introduction

Expanding the capacity of renewable energy sources is among the sustainable ways to mitigate global warming [1]. Photovoltaic (PV)-based energy generation systems are broadly deployed due to their affordability and growing environmental concerns [2]. PV systems are scalable and can be installed in various settings, such as households, buildings, and utilities. Various approaches have been investigated to integrate such PV systems effectively, including studies on energy forecasting [3,4], optimization [5,6], and monitoring [7]. In particular, developing PV health monitoring systems is essential for the stable management of PV systems, as PV faults may cause energy losses or fatal accidents. A site in the UK experienced energy losses of 18.9% in a year due to PV system faults [8]. Faults often cause fires and can cause fatal problems in power grids [9,10].

Various fault detection approaches for PV systems have been studied, based on modeling [11,12,13,14,15], electrical signal analysis [16,17,18,19,20], and electrical circuit simulation [21,22,23,24]. Each approach has its pros and cons according to user perspectives. In particular, model-based approaches are often employed because of their simplicity, compared with electrical signal analysis or circuit simulation approaches. Model-based approaches can typically be divided into end-to-end and threshold methods. First, the end-to-end methods directly decide whether faults occur or not, which requires a huge amount of fault-labeled data to train fault detection models, such as deep neural networks (DNNs). Recently, to mitigate this issue, semi-supervised learning schemes were studied for PV fault detection and showed performance comparable to that of fully supervised learning methods [25,26]. Second, the threshold methods estimate the power generation of PV systems and compare the estimates with the measured power generation. By analyzing the differences, users detect different abnormalities, such as shading effects [27], snow accumulation [28], maximum power point tracking error [29], and faulty conditions of DC–AC converters [15,30,31]. The threshold methods have the advantage of working well even without fault-labeled data. However, the thresholds are typically predetermined [31], or determined by monitoring data profiles [30], which leaves room for improvement.

The accurate detection of faults and the reduction of false alarms are crucial for the effective operation of PV systems. Therefore, properly setting threshold confidence intervals is among the essential processes of the model-based approaches. However, renewable energy sources, such as wind or PV systems, are significantly affected by local weather conditions. Thus, users experience difficulties in manually setting thresholds under various weather conditions. To deal with this problem, in this paper, we propose a deep reinforcement learning (DRL)-based dynamic threshold scheme, in which an AI agent continuously updates thresholds depending on weather changes. We employed a proximal policy optimization (PPO) agent and used various fields of weather data, including irradiance, ambient temperature, humidity, and wind speed. The agent consists of continuous state and action spaces. A reward function was designed to learn proper thresholds without fault labels, which is useful, since fault-labeled data is very limited. To evaluate the proposed scheme, we used the PV data of the National Institute of Standards and Technology (NIST) [32], which is publicly accessible. The proposed model was compared to the existing static and dynamic threshold schemes. To the best of our knowledge, this is the first study in which a DRL agent with continuous action space was used as a label-free fault detection scheme to detect inverter faults in PV systems. The contributions of this paper are summarized as follows:

A DRL-based dynamic threshold scheme is proposed for fault detection in PV inverters. The scheme automatically sets proper thresholds for the accurate detection of faults and the reduction of false alarms under various weather conditions. The scheme can mitigate the burdens of manually assigning confidence intervals for thresholds.
A customized reward function was designed for learning proper thresholds for the detection of inverter faults under various weather conditions without fault labels, which can be helpful under the scarcity of fault-labeled data.

The rest of this paper is structured as follows: Section 2 reviews related work. Section 3 presents the data analysis and introduces the proposed scheme. Section 4 discusses the experimental results. Section 5 concludes the study.

2. Related Works

2.1. Threshold-Based Fault Detection in PV Systems

Fault detection in PV systems has been investigated from various perspectives to avoid energy losses and accidents [33]. Image-based fault detection schemes were proposed with the rapid development of image sensors and AI approaches, such as convolutional neural networks (CNNs) [34,35,36,37,38,39,40]. Despite the accurate diagnostic ability of such schemes, the infrastructure they require is very expensive to install [41]. On the other hand, model-based detection schemes using thresholds have the advantages of low cost and rapid detection, and they are regarded as leading methods in fault detection, due to their simplicity and decent performance [42].

However, the model-based approaches using thresholds face many challenges, because the performance of fault detection significantly depends on threshold settings [43]. Properly setting thresholds is essential to ensure accurate detection. Thus, many threshold methods have been proposed to improve accuracy [15,42,44,45,46,47]. Despite the efforts, there are still some issues to cover. First, fixed threshold methods show difficulties in practical situations, in which volatility and uncertainty exist. For example, Bressan et al. proposed a scheme for setting thresholds and detecting shading faults, where faults were detected if there were drops greater than 10% [45]. Rouani et al. introduced a shading fault detection method in a grid-connected PV system, in which vertices principal component analysis (VPCA) was compared with a standard PCA [48]. However, a trade-off relation existed between less missed detection and more false alarms, which is a critical problem of fixed threshold methods.

To deal with this problem, Platon et al. proposed an online fault detection method, in which different dynamic thresholds were used according to solar irradiance intervals [15]. Wang et al. improved setting thresholds, based on solar irradiance, by considering more features, such as current and voltage [49]. Pan et al. introduced a fault diagnosis threshold method based on non-parametric kernel density estimation (NKDE), which obtained thresholds by assigning the confidence values of models [42]. The above approaches require assigning confidence intervals for setting thresholds. Therefore, there is still a limitation in regard to finding proper confidence intervals for the PV system and situations where volatility and uncertainty exist.

In this paper, we propose a DRL-based dynamic threshold scheme in which a DRL agent automatically finds proper thresholds without assigning confidence intervals (e.g., 3-sigma) for various situations, while considering the various weather conditions, such as ambient temperature, irradiance, humidity, and wind speed. Using the proposed scheme, users have no burdens for manually setting thresholds.

2.2. Deep Reinforcement Learning-Based Fault Detection

The Markov decision process (MDP) is a general framework of reinforcement learning (RL), which comprises an agent and environment where the agent receives reward signals, such as positive or negative feedback, caused by its action at each state [50]. The agent continuously learns what an optimal action is for each state through interactions with the environment. As decision makers, the advantage of RL agents is consideration of long-term planning while deciding on an optimal action. Therefore, they choose the actions that can maximize cumulative rewards.

Huang et al. first introduced a value-based DRL anomaly detector, in which a Deep Q-Function Network (DQN) algorithm was utilized, and two actions for classifying abnormal and normal states were given [51]. Yu and Sun proposed a policy-based anomaly detector, in which the Asynchronous Advantage Actor-Critic (A3C) algorithm was used, and the action space had two actions [52]. The policy-based anomaly detector outperformed the value-based approach. Khanzaeli applied a value-based anomaly detector with state-space models to structural health monitoring, where Q-learning was used, and artificial anomalies were generated for test purposes [53].

Although the existing RL-based approaches performed decently, they only used discrete action spaces with fault labels, which makes them equivalent to classification problems. Thus, it is difficult to apply these approaches when fault-labeled data is scarce. Moreover, these approaches do not utilize the continuous action space of DRL agents. In this study, we aimed to mitigate the difficulty of assigning threshold confidence by using a DRL-based dynamic threshold scheme. The scheme can be applied when fault-labeled data is scarce because it operates in an unsupervised manner (without fault labels). In the proposed scheme, continuous state and action spaces are used, and the threshold values of each state are outputs of the DRL agent.

3. Methodology

Model-based approaches for inverter faults distinguish fault conditions by comparing residuals with thresholds representing normal states [15,54]. In this study, we designed an estimation model that takes the dominant weather data as input and proposed a DRL-based dynamic threshold model that can learn proper thresholds for detecting inverter faults without fault labels under varying weather conditions. Figure 1 shows the configuration of the proposed threshold scheme, and Figure 2 compares the proposed scheme with conventional approaches. In the proposed scheme, we first preprocessed the data to handle missing values and to scale the data. For the missing data, we used linear interpolation and deleted the periods that could not be recovered because of insufficient information. For data scaling, we used minimum–maximum normalization. Then, we selected the weather data that affected power generation by analyzing the correlation coefficients between weather data and power generation. Once the weather data were selected, we trained a DNN estimator to estimate power generation values from the weather data. Afterwards, we trained a DRL agent to set dynamic thresholds for detecting the inverter faults of PV systems.
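The two preprocessing steps described above can be sketched in Python as follows. This is a minimal illustration that assumes NumPy and a one-dimensional series in which NaN marks a missing sample. The function names and the `max_gap` cutoff are hypothetical: the paper does not state how long a gap had to be before a period was treated as unrecoverable.

```python
import numpy as np

def interpolate_missing(values, max_gap=5):
    """Linearly interpolate NaN runs of at most max_gap samples.

    Longer runs stay NaN so the caller can delete those periods.
    max_gap is a hypothetical cutoff, not a value from the paper.
    """
    x = np.asarray(values, dtype=float)
    idx = np.arange(x.size)
    valid = ~np.isnan(x)
    filled = x.copy()
    filled[~valid] = np.interp(idx[~valid], idx[valid], x[valid])
    # Restore NaN for gaps that are too long to recover.
    start = None
    for i in range(x.size + 1):
        missing = i < x.size and not valid[i]
        if missing and start is None:
            start = i
        elif not missing and start is not None:
            if i - start > max_gap:
                filled[start:i] = np.nan
            start = None
    return filled

def min_max_scale(values):
    """Scale values to [0, 1], ignoring NaN when finding the range.

    Assumes the series is not constant (max > min).
    """
    x = np.asarray(values, dtype=float)
    lo, hi = np.nanmin(x), np.nanmax(x)
    return (x - lo) / (hi - lo)
```

In practice the scaling range would be taken from the training data and reused for validation and test data; the sketch scales a single series for brevity.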

3.1. Dataset

3.1.1. Data Description

The PV datasets of the NIST are publicly accessible, and they provide various measurement data, such as PV power generation and weather data. The measurement data were collected for four years, from January 2015, to December 2018, obtained at one-minute intervals. There are three types of PV arrays in the NIST dataset: canopy, ground, and roof. The specifications of PV arrays are explained in detail in [32].

3.1.2. Data Analysis

First, we analyzed the linear relationship between the PV generation data and weather data to identify which weather feature dominantly affected the power generation of PV systems in the NIST dataset, by investigating Pearson’s correlations as follows:

r_{X Y} = \frac{\sum_{i = 1}^{N} (X_{i} - \bar{X}) (Y_{i} - \bar{Y})}{\sqrt{\sum_{i = 1}^{N} {(X_{i} - \bar{X})}^{2}} \sqrt{\sum_{i = 1}^{N} {(Y_{i} - \bar{Y})}^{2}}}

(1)

where $r_{X Y}$ denotes the correlation coefficient between two random variables, $X$ denotes PV generation, $Y$ denotes an environment variable (irradiance, humidity, etc.), $i$ denotes a sample index, and $N$ is the number of samples for each random variable. $\bar{X}$ and $\bar{Y}$ denote the respective sample means.
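Equation (1) translates directly into a few lines of Python. The sketch below assumes NumPy; the function name `pearson_r` is illustrative only.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length samples, as in Equation (1)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()  # X_i - mean(X)
    dy = y - y.mean()  # Y_i - mean(Y)
    return float(np.sum(dx * dy) / (np.sqrt(np.sum(dx**2)) * np.sqrt(np.sum(dy**2))))
```

The result should agree with `np.corrcoef(x, y)[0, 1]`, which computes the same quantity.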

Irradiance, ambient temperature, relative humidity, wind speed, and snow depth were considered to be environment variables. Canopy was selected among the three PV types. The power generation data was measured at the inverter of the PV system. The heatmap of the correlation matrix for PV generation and the local weather data fields during 2015 in the NIST dataset are illustrated in Figure 3. We could identify that irradiances mainly affected the power generation of the PV system, while ambient temperature and humidity showed weaker correlations with power generation. Even though the ambient temperature showed positive correlations with the power generation, because irradiance effects were related to temperature, it was reported that the temperature increase actually reduced power generation output [55]. To observe this, we empirically divided the data into different irradiance intervals (e.g., 0–100, 100–200, …, 900–1000). We could observe that power generation decreased, and variation increased, when ambient temperature increased, as shown in Figure 4. This meant that the ambient temperature had negative correlations with power generation when it was decoupled from irradiance. Figure 5 also shows that the negative correlation relationship became stronger in high irradiance intervals. Especially, wind speed and snow depth seemed to have no correlations with power generation. Nevertheless, we observed that snow depth affected power generation in some periods, which is illustrated in Figure 6. The weather station, which measured the snow depth, was somewhat far from the PV systems. We had relatively fewer snow fall event periods compared to other weather conditions. This might have caused the very weak correlation between power generation and snow depth. For the implementation of our proposed scheme and benchmark models, we excluded the few periods when snow fell. Regarding wind speed, we could not directly observe an impact on power generation. 
However, it was reported that wind speed had an extreme effect on the temperature of PV cells, equivalent to a cooling effect of 15–20 °C at wind speeds of 10 m/s [56,57], and cell temperature is closely related to the efficiency of PV panels [55]. Therefore, we considered four environmental variables: irradiance, ambient temperature, humidity, and wind speed.
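The decoupling of temperature from irradiance described above can be illustrated by computing the correlation inside each irradiance interval separately. The following Python sketch assumes NumPy; the function name, the minimum sample count, and the bin edges passed by the caller are assumptions for illustration, not the authors' analysis code.

```python
import numpy as np

def binned_correlation(irradiance, temperature, power, edges):
    """Pearson r between temperature and power inside each irradiance interval.

    edges lists the interval boundaries, e.g. 0, 100, ..., 1000.
    Returns a dict mapping (low, high) to the coefficient.
    """
    irradiance = np.asarray(irradiance, dtype=float)
    temperature = np.asarray(temperature, dtype=float)
    power = np.asarray(power, dtype=float)
    result = {}
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (irradiance >= lo) & (irradiance < hi)
        if mask.sum() < 3:
            continue  # too few samples for a meaningful coefficient
        r = np.corrcoef(temperature[mask], power[mask])[0, 1]
        result[(lo, hi)] = float(r)
    return result
```

A positive coefficient over the whole dataset together with negative coefficients inside the individual intervals would correspond to the behavior reported for Figures 3 to 5.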

In addition, we observed some inverter faults in the dataset, and they were labeled by an expert who was managing the PV systems. It was also observed that AC power, current and voltage sharply decreased and increased in the PV system within a short time period, while irradiance showed a normal pattern, as illustrated in Figure 7.

3.2. Proposed Scheme

3.2.1. Estimation Model

We used a DNN as an estimation model to acquire residuals between the estimated and measured values, with three separate models for AC power, current, and voltage. The weather information (irradiance $D_{t}$, temperature $T_{t}$, wind speed $W_{t}$, humidity $H_{t}$) was input to the estimation models, and the outputs were the estimated values ${\hat{P}}_{t}$, ${\hat{I}}_{t}$ and ${\hat{V}}_{t}$. Each model consisted of 4 layers: an input layer with 4 nodes, two hidden layers with 128 nodes each, and an output layer with 1 node. The rectified linear unit (ReLU) function was used as the activation function for the hidden layers. We used the Adam optimizer as the learning algorithm, with a fixed learning rate of 0.0001.

{\hat{P}}_{t} = f_{P} (D_{t}, T_{t}, W_{t}, H_{t}),

(2)

{\hat{I}}_{t} = f_{I} (D_{t}, T_{t}, W_{t}, H_{t}),

(3)

{\hat{V}}_{t} = f_{V} (D_{t}, T_{t}, W_{t}, H_{t})

(4)
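The layer layout of one estimation model (4 inputs, two hidden layers of 128 ReLU units, 1 output) can be sketched with plain NumPy as below. Only the forward pass is shown. The weight initialisation and the linear output activation are assumptions, since the paper does not specify them, and training with Adam is omitted.

```python
import numpy as np

def init_mlp(rng, sizes=(4, 128, 128, 1)):
    """Create (weights, bias) pairs for each layer.

    He-style random initialisation is an assumption for this sketch.
    """
    params = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        w = rng.normal(0.0, np.sqrt(2.0 / n_in), size=(n_in, n_out))
        b = np.zeros(n_out)
        params.append((w, b))
    return params

def forward(params, x):
    """Estimate one quantity (power, current, or voltage) per sample.

    x has shape (batch, 4): irradiance, temperature, wind speed, humidity.
    """
    h = np.asarray(x, dtype=float)
    for w, b in params[:-1]:
        h = np.maximum(h @ w + b, 0.0)  # ReLU on the hidden layers
    w, b = params[-1]
    return (h @ w + b).squeeze(-1)  # linear output (assumed), one value per sample
```

Three such parameter sets, one each for $f_{P}$, $f_{I}$ and the voltage model, would mirror the three separate models described above.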

We split the NIST PV canopy dataset into training, validation, and test datasets for model learning. The canopy PV dataset of 2015 was used for training and validation, where we selected the last day of every week as the validation set (i.e., the training-to-validation ratio was 6:1). The dataset of 2016 was used for testing. When we trained and evaluated the estimation model, we considered only the time period when irradiance was present, from 9:00 a.m. to 4:00 p.m.
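A possible way to reproduce this split is sketched below using only the Python standard library. Treating Sunday as "the last day of every week" and including the 4:00 p.m. sample itself are both assumptions; the paper states neither.

```python
from datetime import date, datetime, timedelta

def split_days(year=2015, validation_weekday=6):
    """Return (train_days, val_days) for one calendar year.

    weekday 6 is Sunday in Python; using Sunday as the last day
    of the week is an assumption for this sketch.
    """
    day = date(year, 1, 1)
    train, val = [], []
    while day.year == year:
        (val if day.weekday() == validation_weekday else train).append(day)
        day += timedelta(days=1)
    return train, val

def in_daytime_window(ts, start_hour=9, end_hour=16):
    """True for timestamps from 9:00 a.m. up to and including 4:00 p.m."""
    minutes = ts.hour * 60 + ts.minute
    return start_hour * 60 <= minutes <= end_hour * 60

# A midday sample falls inside the window.
example_ok = in_daytime_window(datetime(2015, 6, 1, 12, 30))
```

With one validation day per week, the remaining six days go to training, which gives the 6:1 ratio mentioned above.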

We utilized mean absolute error (MAE) and symmetric mean absolute percentage error (SMAPE) as metrics for the estimation performance evaluation.

M A E = \frac{1}{N} \sum_{t = 1}^{N} | {\hat{X}}_{t} - X_{t} |

(5)

S M A P E = \frac{100}{N} \sum_{t = 1}^{N} \frac{| {\hat{X}}_{t} - X_{t} |}{(| X_{t} | + | {\hat{X}}_{t} |) / 2}

(6)
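Equations (5) and (6) can be written in Python as follows. The sketch assumes NumPy and that no sample has both the measured and the estimated value equal to zero, since the SMAPE denominator would then vanish.

```python
import numpy as np

def mae(estimated, measured):
    """Mean absolute error, Equation (5)."""
    e = np.asarray(estimated, dtype=float)
    m = np.asarray(measured, dtype=float)
    return float(np.mean(np.abs(e - m)))

def smape(estimated, measured):
    """Symmetric mean absolute percentage error in percent, Equation (6)."""
    e = np.asarray(estimated, dtype=float)
    m = np.asarray(measured, dtype=float)
    denom = (np.abs(m) + np.abs(e)) / 2.0
    return float(100.0 * np.mean(np.abs(e - m) / denom))
```

For instance, an estimate of 110 against a measurement of 90 has an absolute error of 20 and a mean magnitude of 100, so its SMAPE is 20%.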

3.2.2. Deep Reinforcement Learning-Based Dynamic Threshold

PPO is a DRL algorithm that can be used with both discrete and continuous action spaces. PPO was chosen because it outperformed other policy gradient methods, such as Advantage Actor–Critic (A2C) and Actor–Critic with Experience Replay (ACER) [58]. The objective function of PPO is the following:

L^{C L I P} (θ) = {\hat{E}}_{t} [\min (z_{t} (θ) {\hat{A}}_{t}, clip (z_{t} (θ), 1 - ϵ, 1 + ϵ) {\hat{A}}_{t})]

(7)

In the clip term, $ϵ$ is a hyperparameter for clipping the probability ratio $z_{t} (θ)$, and ${\hat{A}}_{t}$ is the advantage estimate. The default value of $ϵ$ is 0.2, which we used. Clipping prevents an excessively large policy update. The details of the algorithm are described in [58]. In our proposed scheme, we used PPO as the DRL agent for setting dynamic thresholds. We trained a separate agent for each estimation model and denote our DRL agent as DRL-DT.

In RL, an agent chooses an action for maximizing a cumulative reward after observing information on each state at a given weather condition. Therefore, designing a proper action space, state space, and reward function for a problem is essential. We designed a DRL scheme for detecting the inverter faults in the NIST PV dataset. Figure 8 illustrates the structure of the designed DRL scheme.

Action

For the action, we used a continuous action space with a Gaussian policy. The last layer of the policy function of the DRL-DT had two output nodes: one for $μ_{s_{t}}$ and the other for $σ_{s_{t}}$. A sigmoid function was used to constrain the range of $μ_{s_{t}}$, and a softplus function was used for $σ_{s_{t}}$. During training, dynamic threshold values were sampled from the Gaussian distribution with $μ_{s_{t}}$ and $σ_{s_{t}}$.

a_{t} \sim N (μ_{s_{t}}, σ_{s_{t}})

(8)

During testing, we used $μ_{s_{t}}$ as the threshold value, because sampling thresholds from the distribution caused performance uncertainties.

a_{t} = μ_{s_{t}}

(9)
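Equations (8) and (9) can be sketched in Python as below, assuming NumPy. The raw inputs stand for the two pre-activation outputs of the policy network. Treating $σ_{s_{t}}$ as a standard deviation is an assumption, and the paper does not say how negative samples were handled during training, so the sketch leaves them untouched.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softplus(x):
    # log(1 + exp(x)), written in a numerically stable form
    return np.logaddexp(0.0, x)

def policy_head(raw_mu, raw_sigma):
    """Map the two raw network outputs to the Gaussian parameters."""
    mu = sigmoid(raw_mu)         # constrained to (0, 1)
    sigma = softplus(raw_sigma)  # strictly positive
    return mu, sigma

def select_threshold(raw_mu, raw_sigma, rng=None, training=True):
    """Sample a threshold while training (Eq. 8); use the mean when testing (Eq. 9)."""
    mu, sigma = policy_head(raw_mu, raw_sigma)
    if training:
        return rng.normal(mu, sigma)
    return mu
```

Because the data were min–max normalized, a mean constrained to the unit interval is on the same scale as the residuals.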

State

The state used the same weather information as the input of the estimation model (irradiance $D_{t}$, ambient temperature $T_{t}$, wind speed $W_{t}$, humidity $H_{t}$). This observation provided the agent with the information necessary for properly setting dynamic thresholds under varying weather conditions. A state vector, representing an observation at a time step, is as follows:

s_{t} = (D_{t}, T_{t}, W_{t}, H_{t})

(10)

Reward

We designed a reward function for setting dynamic thresholds to detect inverter faults under the condition that there were no fault labels. Feedback signals from the reward function can play a key role in anomaly or fault detection tasks in systems in which fault events are sparse. Once an estimation model is trained, residual values can be obtained by using an estimated value as follows:

e_{t} = | X_{t} - {\hat{X}}_{t} |

(11)

In the fault detection task, if $e_{t}$ exceeded a threshold value $a_{t}$ ($a_{t} \geq 0, \forall a_{t}$), we considered that an inverter fault occurred at that time step. Therefore, if $\frac{e_{t}}{a_{t}}$ was larger than, or equal to, 1, it was considered that a fault had occurred. Otherwise, the PV system was considered to be operating normally.

Our proposed method was designed for cases in which fault labels are sparse or unavailable. For this reason, we trained our agent with only the normal data of the NIST canopy dataset. By using $\frac{e_{t}}{a_{t}}$, we could give the agent feedback on whether it had chosen proper thresholds, without fault labels. Our designed reward function was as follows:

r_{t + 1} = \begin{cases} β \left(\frac{e_{t} + ϵ}{a_{t} + ϵ}\right), & 0 \leq \frac{e_{t} + ϵ}{a_{t} + ϵ} < 1 \\ γ \left(- \frac{e_{t} + ϵ}{a_{t} + ϵ}\right), & \frac{e_{t} + ϵ}{a_{t} + ϵ} \geq 1 \end{cases}

(12)

We assumed that the training data for the agent was normal. If the agent chose a threshold value $a_{t}$ larger than the residual value $e_{t}$, it received a positive reward, $β (\frac{e_{t} + ϵ}{a_{t} + ϵ})$. Otherwise, it received a negative reward, $γ (- \frac{e_{t} + ϵ}{a_{t} + ϵ})$. Here, $ϵ$ was used to prevent division by zero and was set to 1; it is distinct from the PPO clipping parameter. Because the positive reward grows as $a_{t}$ approaches $e_{t}$ from above, while any threshold at or below the residual is penalized, the trained agent tended to choose a tight, but robust, threshold at each state under varying weather conditions. By setting $β$ and $γ$ as below, users could put more weight on the outcome they cared about more.

β + γ = 1

(13)

The reward function can be illustrated as in Figure 9.
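Equations (12) and (13) can be expressed as a short Python function. The default `beta=0.1` is the value reported as best in Section 4.1.2, and `eps=1.0` follows the text; the function name is illustrative.

```python
def reward(e_t, a_t, beta=0.1, eps=1.0):
    """Label-free reward of Equation (12), with gamma = 1 - beta from Equation (13)."""
    gamma = 1.0 - beta
    ratio = (e_t + eps) / (a_t + eps)
    if ratio < 1.0:
        # Threshold above the residual: positive, larger for tighter thresholds.
        return beta * ratio
    # Threshold at or below the residual: a false alarm on normal data.
    return -gamma * ratio
```

For example, with a residual of 0 and a threshold of 1 the ratio is 0.5 and the reward is 0.05, whereas a residual of 3 with the same threshold gives a ratio of 2 and a reward of -1.8.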

For the training process of the DRL agent:

1. We randomly selected a day as an episode from the same training dataset that was used for the DNN estimator.
   (a) The same weather information (irradiance $D_{t}$, ambient temperature $T_{t}$, wind speed $W_{t}$, and humidity $H_{t}$) was used as a state to the DRL agent.
   (b) The DRL agent generated an action that was used as a threshold value.
   (c) The threshold value was compared with the difference between a measured value and an estimated value.
   (d) The DRL agent received a reward value, based on the designed reward function.
2. The DRL agent updated its policy to get higher reward values.

Steps (a)–(d) were repeated until an episode was done. Steps 1–2 were repeated until the DRL agent’s policy converged to an optimal policy.
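Steps (a) to (d) for a single episode can be sketched in Python as below. The `policy` argument is a hypothetical stand-in callable for the DRL-DT agent, and the PPO policy update of step 2 is deliberately omitted, so this shows only how states, thresholds, and rewards are collected over one day.

```python
def run_episode(states, residuals, policy, rng=None, beta=0.1, eps=1.0):
    """Collect (state, action, reward) transitions for one day of data.

    states    : sequence of (D_t, T_t, W_t, H_t) tuples
    residuals : sequence of e_t values from the estimation model
    policy    : hypothetical callable (state, rng) -> threshold a_t
    """
    transitions = []
    for s_t, e_t in zip(states, residuals):        # (a) observe the weather state
        a_t = policy(s_t, rng)                     # (b) action used as the threshold
        ratio = (e_t + eps) / (a_t + eps)          # (c) compare threshold and residual
        if ratio < 1.0:                            # (d) reward from Equation (12)
            r = beta * ratio
        else:
            r = -(1.0 - beta) * ratio
        transitions.append((s_t, a_t, r))
    return transitions
```

The returned transitions are what a PPO implementation would consume to update the policy before the next randomly selected day.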

Even though the agent maximized the cumulative rewards, this did not guarantee that its detection performance improved in practice. Therefore, we used recall, precision, and the $F_{1}$ score to validate the performance of the proposed dynamic threshold scheme. The formulae of the metrics were as follows:

R e c a l l = \frac{T P}{T P + F N}

(14)

P r e c i s i o n = \frac{T P}{T P + F P}

(15)

F_{1} s c o r e = \frac{2 \times P r e c i s i o n \times R e c a l l}{P r e c i s i o n + R e c a l l}

(16)

True positives, false positives, and false negatives are denoted as TP, FP, and FN, respectively.
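The detection rule and Equations (14) to (16) can be combined in a short Python sketch, assuming NumPy. Returning 0 when a denominator is zero is a convention chosen here, not something stated in the paper.

```python
import numpy as np

def detect_faults(residuals, thresholds):
    """Flag a fault wherever e_t >= a_t, i.e. e_t / a_t >= 1 for a_t > 0."""
    return np.asarray(residuals, dtype=float) >= np.asarray(thresholds, dtype=float)

def recall_precision_f1(predicted, actual):
    """Recall, precision, and F1 score from boolean fault flags."""
    predicted = np.asarray(predicted, dtype=bool)
    actual = np.asarray(actual, dtype=bool)
    tp = int(np.sum(predicted & actual))
    fp = int(np.sum(predicted & ~actual))
    fn = int(np.sum(~predicted & actual))
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    if precision + recall == 0.0:
        return recall, precision, 0.0
    f1 = 2.0 * precision * recall / (precision + recall)
    return recall, precision, f1
```

The expert fault labels are needed only for this evaluation step; the thresholds themselves are learned without them.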

4. Results and Discussion

4.1. Performance Evaluation

4.1.1. Estimation Performance

We used the 2015 NIST canopy PV dataset for training and validation. For the evaluation metrics, we used MAE and SMAPE. Figure 10 shows the training and validation errors of estimating power generation. We saved an estimation model at each epoch and used the model that showed the best performance on the validation set. We tested the model using the 2016 NIST canopy PV dataset. For the power generation estimation, the training, validation, and test errors were 4.99, 5.48 and 8.26 in terms of MAE, and 4.34, 6.32 and 8.30 in terms of SMAPE, respectively. Figure 11 compares the measured and estimated profiles, where panels (a), (b) and (c) show the results of the AC power, current, and voltage estimations, respectively. Table 1 describes the performance details of each model.

4.1.2. Threshold Performance

After training the estimation model, we used it to calculate the residuals between the measured and estimated values. We trained the DRL-DT agent on the same dataset as the estimation model, that is, the dataset of 2015 for training and the dataset of 2016 for testing. To set $β$ and $γ$, we searched $β$ values of 0.01 and 0.05 and then from 0.1 to 0.9 in steps of 0.1, with $γ = 1 - β$ according to Equation (13). We found that the $F_{1}$ score was best when $β$ was 0.1. Figure 12 shows the performance of DRL-DT in terms of recall, precision, and $F_{1}$ score.

To compare our proposed method with benchmark methods, we implemented existing statistical threshold approaches [15,49], a DNN classifier, and a DRL classifier. The classifiers had 7 input nodes for features (irradiance $D_{t}$, ambient temperature $T_{t}$, wind speed $W_{t}$, humidity $H_{t}$, the error of the estimated power $e_{t}^{P}$, the error of the estimated current $e_{t}^{I}$, and the error of the estimated voltage $e_{t}^{V}$). Two output nodes were used to represent normal operation and inverter fault, with a softmax activation function in the output layer. The statistical approaches did not require fault labels to set thresholds, whereas the DNN and DRL classifiers did. Thus, we grouped the methods according to whether fault labels were used: the statistical approaches and our proposed scheme were label-free, while the DNN and DRL classifiers were trained with fault data and labels. We denoted the statistical dynamic threshold approaches as DT-1 [49] and DT-2 [15]; these approaches only used residuals between the estimated and measured values.

Figure 13 shows a comparison between the proposed scheme and benchmark approaches. The inverter fault data in the NIST canopy dataset had a relatively simple pattern, a sharp decrease within a specific time period. Thus, the supervised approaches (with fault labels) could almost completely separate inverter faults from normal data, showing an $F_{1}$ score of 0.99. Among the label-free approaches, our proposed method showed the best performance, which was closest to that of the supervised approaches, with an $F_{1}$ score of 0.94. As a result, the proposed scheme provided a 5.67% higher $F_{1}$-score compared with DT-2. A summary of the model performances is given in Table 2. The recall and precision curves are compared in Figure 14. We observed that DRL-DT had the best performance and that its performance was closest to that of the supervised approaches at two recall standards (0.91, 0.95).

DT-2 showed the second best performance. Both DT-2 and DT-1 only used irradiance for setting the thresholds. We shifted the confidence intervals by 1 sigma to search for the best threshold values at each irradiance interval, and used the best performances observed by both DT-2 and DT-1. The differences between them were as follows. DT-2 set thresholds for power, current, and voltage, while DT-1 set thresholds for only power. Our scheme was also applied to power, current, and voltage when comparing the performances (the number of threshold types was the same as DT-2). The irradiance interval of DT-2 (0–50, 50–100, …, 900–950, 950) was more granular than that of DT-1 (0–50, 50–250, 250–500, 500).
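The idea of an irradiance-interval look-up table can be illustrated with the Python sketch below, assuming NumPy. It sets each interval's threshold to the mean residual plus `k` standard deviations of normal data. This is a generic statistical rule chosen for illustration, not the exact procedure of [15] or [49], and the function names are hypothetical.

```python
import numpy as np

def fit_interval_thresholds(irradiance, residuals, edges, k=3.0):
    """Build a look-up table: interval index -> mean + k * std of the residuals.

    edges are the inner interval boundaries, e.g. [50, 250, 500] for the
    DT-1 style intervals 0-50, 50-250, 250-500, and 500 upward.
    """
    irradiance = np.asarray(irradiance, dtype=float)
    residuals = np.asarray(residuals, dtype=float)
    bins = np.digitize(irradiance, edges)  # interval index of every sample
    table = {}
    for b in np.unique(bins):
        r = residuals[bins == b]
        table[int(b)] = float(r.mean() + k * r.std())
    return table

def lookup_threshold(table, irradiance, edges):
    """Return the threshold of the interval that contains this irradiance value.

    Raises KeyError if the interval had no training samples.
    """
    return table[int(np.digitize(irradiance, edges))]
```

Such a table depends on irradiance alone; adding temperature, wind speed, and humidity would multiply the number of cells, which is the difficulty the learned threshold is meant to avoid.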

Figure 15 shows the threshold patterns of DT-2 and DRL-DT for $e^{P}$. We identified that DT-2 was more sensitive to the irradiance interval, while DRL-DT showed smoother threshold values. $e^{P}$ was almost the same as $e^{I}$. DT-2 and DT-1 showed decent performances by only considering irradiance intervals at cool temperature conditions (lower than 25 degrees). However, when the ambient temperature was higher than 25 degrees, the irradiance-based approaches clearly showed their weakness, as shown in Figure 16. Figure 17 shows false positive alarm occurrences with ambient temperatures for each label-free approach.

As a result, various environmental variables, such as temperature, should also be considered to set more accurate threshold values. Both DT-1 and DT-2 used threshold values from look-up tables based on irradiance intervals, which makes it difficult to consider various environmental variables. Our proposed scheme could mitigate such difficulties and improve the accuracy of threshold values by using the DRL agent.

5. Conclusions

We presented a deep reinforcement learning (DRL)-based dynamic threshold scheme that operates in an unsupervised manner (without fault labels), and a reward function was designed to set appropriate threshold values under various weather conditions. The experimental results showed that the proposed scheme outperformed the existing unsupervised threshold schemes, based on statistical approaches, under varying weather conditions. Our scheme showed the closest performance to the performance of the supervised approach (with fault labels). In addition to PV plants, we expect that our scheme could be utilized in various fields, including monitoring systems and anomaly and fault detection systems, in which fault labels are sparse.

We applied the proposed scheme to a specific PV inverter fault. Our scheme should be considered for different PV faults or fault detection tasks in various fields. A potential limitation of the proposed method was that normal data was used in the training of the DRL agent. If the initial label did not exist and a large number of abnormal data was included for training, performance of the DRL-based approach might be degraded. As a future work, we are considering extending the proposed scheme for fully unlabeled datasets.

Author Contributions

G.S. devised the overall idea and led the research as the first author; S.Y. implemented benchmark models for performance comparison; J.S. conducted data analysis; E.S. reviewed previous studies; E.H. supervised the research as the corresponding author. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Korea Government through the Ministry of Science and Information and Communication Technology (MSIT) (2021R1A2C1009803) and GIST Research Institute (GRI) grant funded by the GIST in 2023.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Acknowledgments

This research was supported, in part, by a GIST Research Institute (GRI) grant funded by the GIST in 2023 and, in part, by a National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (2021R1A2C1009803).

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

PV	Photovoltaic
DNN	Deep neural network
DRL	Deep reinforcement learning
PPO	Proximal policy optimization
CNN	Convolutional neural network
PCA	Principal component analysis
NKDE	Non-parametric kernel density estimation
MDP	Markov decision process
DQN	Deep Q-network
A3C	Asynchronous advantage actor-critic
DT	Dynamic threshold

References

1. Razmjoo, A.; Kaigutha, L.G.; Rad, M.V.; Marzband, M.; Davarpanah, A.; Denai, M. A Technical analysis investigating energy sustainability utilizing reliable renewable energy sources to reduce CO₂ emissions in a high potential area. Renew. Energy 2021, 164, 46–57.
2. Algarín, C.R. An analytic hierarchy process based approach for evaluating renewable energy sources. Int. J. Energy Econ. Policy 2017, 7, 38–47.
3. Park, K.; Yoon, S.; Hwang, E. Hybrid load forecasting for mixed-use complex based on the characteristic load decomposition by pilot signals. IEEE Access 2019, 7, 12297–12306.
4. Park, S.; Yoon, S.; Lee, B.; Ko, S.; Hwang, E. Probabilistic forecasting based joint detection and imputation of clustered bad data in residential electricity loads. Energies 2020, 14, 165.
5. Yoon, S.; Hwang, E. Load guided signal-based two-stage charging coordination of plug-in electric vehicles for smart buildings. IEEE Access 2019, 7, 144548–144560.
6. Seo, G.; Yoon, S.; Kim, M.; Mun, C.; Hwang, E. Deep Reinforcement Learning-Based Smart Joint Control Scheme for On/Off Pumping Systems in Wastewater Treatment Plants. IEEE Access 2021, 9, 95360–95371.
7. Song, J.; Lee, Y.; Hwang, E. Time–frequency mask estimation based on deep neural network for flexible load disaggregation in buildings. IEEE Trans. Smart Grid 2021, 12, 3242–3251.
8. Firth, S.K.; Lomas, K.J.; Rees, S.J. A simple model of PV system performance and its use in fault detection. Sol. Energy 2010, 84, 624–635.
9. Hernández-Callejo, L.; Gallardo-Saavedra, S.; Alonso-Gómez, V. A review of photovoltaic systems: Design, operation and maintenance. Sol. Energy 2019, 188, 426–440.
10. Correa-Betanzo, C.; Calleja, H.; Aguilar, C.; Lopez-Nunez, A.R.; Rodriguez, E. Photovoltaic-based DC microgrid with partial shading and fault tolerance. J. Mod. Power Syst. Clean Energy 2019, 7, 340–349.
11. Spataru, S.; Sera, D.; Kerekes, T.; Teodorescu, R. Photovoltaic array condition monitoring based on online regression of performance model. In Proceedings of the 2013 IEEE 39th Photovoltaic Specialists Conference (PVSC), Tampa, FL, USA, 16–21 June 2013; IEEE: Piscataway, NJ, USA, 2013; pp. 0815–0820.
12. Muñoz, M.; Correcher, A.; Ariza, E.; García, E.; Ibañez, F. Fault detection and isolation in a photovoltaic system. In Proceedings of the International Conference on Renewable Energies and Power Quality, La Coruña, Spain, 25–27 March 2015; Volume 1, pp. 202–207.
13. Stauffer, Y.; Ferrario, D.; Onillon, E.; Hutter, A. Power monitoring based photovoltaic installation fault detection. In Proceedings of the 2015 International Conference on Renewable Energy Research and Applications (ICRERA), Palermo, Italy, 22–25 November 2015; IEEE: Piscataway, NJ, USA, 2015; pp. 199–202.
14. Andò, B.; Baglio, S.; Pistorio, A.; Tina, G.M.; Ventura, C. Sentinella: Smart monitoring of photovoltaic systems at panel level. IEEE Trans. Instrum. Meas. 2015, 64, 2188–2199.
15. Platon, R.; Martel, J.; Woodruff, N.; Chau, T.Y. Online fault detection in PV systems. IEEE Trans. Sustain. Energy 2015, 6, 1200–1207.
16. Chine, W.; Mellit, A.; Pavan, A.M.; Kalogirou, S.A. Fault detection method for grid-connected photovoltaic plants. Renew. Energy 2014, 66, 99–110.
17. Kim, I.S. On-line fault detection algorithm of a photovoltaic system using wavelet transform. Sol. Energy 2016, 126, 137–145.
18. Ali, M.H.; Rabhi, A.; El Hajjaji, A.; Tina, G.M. Real time fault detection in photovoltaic systems. Energy Procedia 2017, 111, 914–923.
19. Spataru, S.; Sera, D.; Kerekes, T.; Teodorescu, R. Monitoring and fault detection in photovoltaic systems based on inverter measured string IV curves. In Proceedings of the 31st European Photovoltaic Solar Energy Conference and Exhibition, Hamburg, Germany, 14–18 September 2015; WIP Wirtschaft und Infrastruktur GmbH & Co Planungs KG: Munich, Germany, 2015; pp. 1667–1674.
20. Guerriero, P.; Piegari, L.; Rizzo, R.; Daliento, S. Mismatch based diagnosis of PV fields relying on monitored string currents. Int. J. Photoenergy 2017, 2017, 2834685.
21. Stellbogen, D. Use of PV circuit simulation for fault detection in PV array fields. In Proceedings of the Conference Record of the Twenty Third IEEE Photovoltaic Specialists Conference-1993 (Cat. No. 93CH3283-9), Louisville, KY, USA, 10–14 May 1993; IEEE: Piscataway, NJ, USA, 1993; pp. 1302–1307.
22. Guasch, D.; Silvestre, S.; Calatayud, R. Automatic failure detection in photovoltaic systems. In Proceedings of the 3rd World Conference on Photovoltaic Energy Conversion, Osaka, Japan, 11–18 May 2003; pp. 2269–2271.
23. Chao, K.H.; Ho, S.H.; Wang, M.H. Modeling and fault diagnosis of a photovoltaic system. Electr. Power Syst. Res. 2008, 78, 97–105.
24. Hamdaoui, M.; Rabhi, A.; El Hajjaji, A.; Rahmoun, M.; Azizi, M. Monitoring and control of the performances for photovoltaic systems. In Proceedings of the International Renewable Energy Congress, Warsaw, Poland, 19–20 May 2009.
25. Jaskie, K.; Martin, J.; Spanias, A. PV fault detection using positive unlabeled learning. Appl. Sci. 2021, 11, 5599.
26. Lu, F.; Niu, R.; Zhang, Z.; Guo, L.; Chen, J. A generative adversarial network-based fault detection approach for photovoltaic panel. Appl. Sci. 2022, 12, 1789.
27. Nordmann, T.; Jahn, U.; Nasse, W. Performance of PV systems under real conditions. In Proceedings of the European Workshop on Life Cycle Analysis and Recycling of Solar Modules, The “Waste” Challenge, Brussels, Belgium, 18–19 March 2004; p. 1.
28. Takashima, T.; Yamaguchi, J.; Otani, K.; Oozeki, T.; Kato, K.; Ishida, M. Experimental studies of fault location in PV module strings. Sol. Energy Mater. Sol. Cells 2009, 93, 1079–1082.
29. Lei, P.; Li, Y.; Chen, Q.; Seem, J.E. Extremum seeking control based integration of MPPT and degradation detection for photovoltaic arrays. In Proceedings of the 2010 American Control Conference, Baltimore, MD, USA, 30 June–2 July 2010; IEEE: Piscataway, NJ, USA, 2010; pp. 3536–3541.
30. Roumpakias, E.; Stamatelos, T. Health Monitoring and Fault Detection in Photovoltaic Systems in Central Greece Using Artificial Neural Networks. Appl. Sci. 2022, 12, 12016.
31. Wang, Y.; Li, X.; Ban, Y.; Ma, X.; Hao, C.; Zhou, J.; Cai, H. A DC Arc Fault Detection Method Based on AR Model for Photovoltaic Systems. Appl. Sci. 2022, 12, 10379.
32. Boyd, M. Performance data from the NIST photovoltaic arrays and weather station. J. Res. Natl. Inst. Stand. Technol. 2017, 122, 1.
33. Park, S.; Park, S.; Kim, M.; Hwang, E. Clustering-based self-imputation of unlabeled fault data in a fleet of photovoltaic generation systems. Energies 2020, 13, 737.
34. Qian, X.; Li, J.; Cao, J.; Wu, Y.; Wang, W. Micro-cracks detection of solar cells surface via combining short-term and long-term deep features. Neural Netw. 2020, 127, 132–140.
35. Li, X.; Yang, Q.; Lou, Z.; Yan, W. Deep learning based module defect analysis for large-scale photovoltaic farms. IEEE Trans. Energy Convers. 2018, 34, 520–529.
36. Deitsch, S.; Christlein, V.; Berger, S.; Buerhop-Lutz, C.; Maier, A.; Gallwitz, F.; Riess, C. Automatic classification of defective photovoltaic module cells in electroluminescence images. Sol. Energy 2019, 185, 455–468.
37. Karimi, A.M.; Fada, J.S.; Hossain, M.A.; Yang, S.; Peshek, T.J.; Braid, J.L.; French, R.H. Automated pipeline for photovoltaic module electroluminescence image processing and degradation feature classification. IEEE J. Photovoltaics 2019, 9, 1324–1335.
38. Du, B.; He, Y.; He, Y.; Duan, J.; Zhang, Y. Intelligent classification of silicon photovoltaic cell defects based on eddy current thermography and convolution neural network. IEEE Trans. Ind. Inform. 2019, 16, 6242–6251.
39. Akram, M.W.; Li, G.; Jin, Y.; Chen, X.; Zhu, C.; Ahmad, A. Automatic detection of photovoltaic module defects in infrared images with isolated and develop-model transfer deep learning. Sol. Energy 2020, 198, 175–186.
40. Herraiz, Á.H.; Marugán, A.P.; Márquez, F.P.G. Photovoltaic plant condition monitoring using thermal images analysis by convolutional neural network-based structure. Renew. Energy 2020, 153, 334–348.
41. AbdulMawjood, K.; Refaat, S.S.; Morsi, W.G. Detection and prediction of faults in photovoltaic arrays: A review. In Proceedings of the 2018 IEEE 12th International Conference on Compatibility, Power Electronics and Power Engineering (CPE-POWERENG 2018), Doha, Qatar, 10–12 April 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 1–8.
42. Pan, J.; He, W.; Shi, Y.; Hou, R.; Zhu, H. Uncertainty analysis based on non-parametric statistical modelling method for photovoltaic array output and its application in fault diagnosis. Sol. Energy 2021, 225, 831–841.
43. Pillai, D.S.; Rajasekar, N. Metaheuristic algorithms for PV parameter identification: A comprehensive review with an application to threshold setting for fault detection in PV systems. Renew. Sustain. Energy Rev. 2018, 82, 3503–3525.
44. Shimakage, T.; Nishioka, K.; Yamane, H.; Nagura, M.; Kudo, M. Development of fault detection system in PV system. In Proceedings of the 2011 IEEE 33rd International Telecommunications Energy Conference (INTELEC), Amsterdam, The Netherlands, 9–13 October 2011; IEEE: Piscataway, NJ, USA, 2011; pp. 1–5.
45. Bressan, M.; El-Basri, Y.; Alonso, C. A new method for fault detection and identification of shadows based on electrical signature of defects. In Proceedings of the 2015 17th European Conference on Power Electronics and Applications (EPE’15 ECCE-Europe), Geneva, Switzerland, 8–10 September 2015; IEEE: Piscataway, NJ, USA, 2015; pp. 1–8.
46. Dhimish, M.; Holmes, V. Fault detection algorithm for grid-connected photovoltaic plants. Sol. Energy 2016, 137, 236–245.
47. Garoudja, E.; Harrou, F.; Sun, Y.; Kara, K.; Chouder, A.; Silvestre, S. Statistical fault detection in photovoltaic systems. Sol. Energy 2017, 150, 485–499.
48. Rouani, L.; Harkat, M.F.; Kouadri, A.; Mekhilef, S. Shading fault detection in a grid-connected PV system using vertices principal component analysis. Renew. Energy 2021, 164, 1527–1539.
49. Wang, H.; Zhao, J.; Sun, Q.; Zhu, H. Probability modeling for PV array output interval and its application in fault diagnosis. Energy 2019, 189, 116248.
50. Sutton, R.S.; Barto, A.G. Reinforcement Learning: An Introduction; MIT Press: Cambridge, MA, USA, 2018.
51. Huang, C.; Wu, Y.; Zuo, Y.; Pei, K.; Min, G. Towards experienced anomaly detector through reinforcement learning. In Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, New Orleans, LA, USA, 2–7 February 2018.
52. Yu, M.; Sun, S. Policy-based reinforcement learning for time series anomaly detection. Eng. Appl. Artif. Intell. 2020, 95, 103919.
53. Khazaeli, S.; Nguyen, L.H.; Goulet, J.A. Anomaly detection using state-space models and reinforcement learning. Struct. Control Health Monit. 2021, 28, e2720.
54. Malik, A.; Haque, A.; Kurukuru, V.B.; Khan, M.A.; Blaabjerg, F. Overview of Fault Detection Approaches for Grid Connected Photovoltaic Inverters. e-Prime-Adv. Electr. Eng. Electron. Energy 2022, 2, 100035.
55. Zaini, N.; Ab Kadir, M.; Izadi, M.; Ahmad, N.; Radzi, M.; Azis, N. The effect of temperature on a mono-crystalline solar PV panel. In Proceedings of the 2015 IEEE Conference on Energy Conversion (CENCON), Johor Bahru, Malaysia, 19–20 October 2015; IEEE: Piscataway, NJ, USA, 2015; pp. 249–253.
56. Dubey, S.; Sarvaiya, J.N.; Seshadri, B. Temperature dependent photovoltaic (PV) efficiency and its effect on PV production in the world–a review. Energy Procedia 2013, 33, 311–321.
57. Koehl, M.; Heck, M.; Wiesmeier, S.; Wirth, J. Modeling of the nominal operating cell temperature based on outdoor weathering. Sol. Energy Mater. Sol. Cells 2011, 95, 1638–1646.
58. Schulman, J.; Wolski, F.; Dhariwal, P.; Radford, A.; Klimov, O. Proximal policy optimization algorithms. arXiv 2017, arXiv:1707.06347.

Figure 1. The proposed DRL-based threshold scheme for detecting the inverter faults of PV systems.

Figure 2. Structure comparison of model-based approaches for detecting inverter faults of PV systems.

Figure 3. Heatmap of a correlation matrix for PV generation and local weather data fields in the NIST dataset.

Figure 4. Scatter plot of power generation and ambient temperature within the irradiance interval (600–700 [W/m²]).

Figure 5. Correlation coefficients between ambient temperature and power generation in different irradiance intervals.

Figure 6. Profiles of power generation and local weather for 21 days in February 2015.

Figure 7. Typical inverter fault patterns observed in the NIST dataset.

Figure 8. DRL structure for setting dynamic thresholds.

Figure 9. Reward function of the proposed DRL-DT with different β and γ.

Figure 10. Loss curve in power generation estimation.

Figure 11. Profile comparison between the estimation and measurement values.

Figure 12. PV inverter fault detection performance of the proposed DRL-DT as a function of β.

Figure 13. $F_{1}$ score comparison between model-based fault detection schemes [15,49].

Figure 14. Recall and precision curves for proposed and conventional fault detection schemes [15,49].

Figure 15. Power residuals and dynamic threshold profiles in a test day recorded at high ambient temperatures [49].

Figure 16. Scatter plots of voltage residuals and associated thresholds by the proposed and conventional schemes at high ambient temperatures [49].

Figure 17. False positive alarm occurrences depending on ambient temperature [15,49].

Table 1. Power generation estimation performance of the estimation model based on DNNs.

	Training MAE	Training SMAPE	Validation MAE	Validation SMAPE	Test MAE	Test SMAPE
AC power [kW]	4.99	4.34	5.48	6.32	8.26	8.30
Current [A]	15.75	5.16	16.02	6.88	25.18	7.87
Voltage [V]	6.61	1.95	6.80	2.33	11.38	3.17
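
The two error metrics reported in Table 1 can be sketched in a few lines of Python. The table does not state which SMAPE variant was used; the version below assumes the common definition with the mean of the absolute actual and predicted values in the denominator, expressed in percent. The function names `mae` and `smape` and the sample numbers are illustrative only.

```python
def mae(actual, predicted):
    """Mean absolute error, in the units of the input signal."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def smape(actual, predicted):
    """Symmetric mean absolute percentage error, in percent.

    Assumed variant: |a - p| / ((|a| + |p|) / 2), averaged over samples.
    Pairs where both values are zero contribute 0 to avoid division by zero.
    """
    total = 0.0
    for a, p in zip(actual, predicted):
        denom = (abs(a) + abs(p)) / 2.0
        total += 0.0 if denom == 0 else abs(a - p) / denom
    return 100.0 * total / len(actual)


if __name__ == "__main__":
    # Illustrative values, not taken from the NIST dataset.
    measured = [100.0, 200.0]
    estimated = [110.0, 180.0]
    print(f"MAE   = {mae(measured, estimated):.2f}")
    print(f"SMAPE = {smape(measured, estimated):.2f} %")
```

Because SMAPE is scale-free while MAE is not, the two columns together allow the AC power, current and voltage rows to be compared even though their units differ.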

Table 2. Performance comparison for inverter fault detection schemes.

	Recall	Precision	$F_{1}$ Score	Label-Free
DNN classifier	1.000	0.991	0.995	×
DRL classifier	1.000	0.991	0.995	×
DRL-DT (proposed)	0.911	0.990	0.949	◯
DT-2 [49]	0.875	0.923	0.898	◯
DT-1 [15]	0.886	0.874	0.880	◯
Static T	0.773	0.783	0.778	◯
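
As a consistency check on Table 2, the $F_{1}$ score can be recomputed from the reported recall and precision as their harmonic mean, $F_{1} = 2PR/(P+R)$. The short Python sketch below does this for each row; the dictionary name `TABLE_2` and the function name `f1_score` are illustrative, and the inputs are the rounded values printed in the table.

```python
def f1_score(recall: float, precision: float) -> float:
    """Harmonic mean of recall and precision; 0.0 if both are zero."""
    if recall + precision == 0:
        return 0.0
    return 2.0 * recall * precision / (recall + precision)


# (recall, precision) pairs copied from Table 2.
TABLE_2 = {
    "DNN classifier": (1.000, 0.991),
    "DRL classifier": (1.000, 0.991),
    "DRL-DT (proposed)": (0.911, 0.990),
    "DT-2 [49]": (0.875, 0.923),
    "DT-1 [15]": (0.886, 0.874),
    "Static T": (0.773, 0.783),
}

if __name__ == "__main__":
    for name, (recall, precision) in TABLE_2.items():
        print(f"{name:<18} F1 = {f1_score(recall, precision):.3f}")
```

Rounded to three decimals, the recomputed values agree with the $F_{1}$ column of Table 2.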

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Seo, G.; Yoon, S.; Song, J.; Srivastava, E.; Hwang, E. Label-Free Fault Detection Scheme for Inverters of PV Systems: Deep Reinforcement Learning-Based Dynamic Threshold. Appl. Sci. 2023, 13, 2470. https://doi.org/10.3390/app13042470

