Bats: An Appliance Safety Hazards Factors Detection Algorithm with an Improved Nonintrusive Load Disaggregation Method

Wang, Wei; Wang, Zilin; Chen, Yanru; Guo, Min; Chen, Zhengyu; Niu, Yi; Liu, Huangeng; Chen, Liangyin

doi:10.3390/en14123547


by Wei Wang ¹,†, Zilin Wang ¹,†, Yanru Chen ¹, Min Guo ¹, Zhengyu Chen ¹, Yi Niu ¹, Huangeng Liu ² and Liangyin Chen ¹,³,*

¹ College of Computer Science, Sichuan University, Chengdu 610065, China
² School of Mechanical Electronic and Information Engineering, China University of Mining and Technology, Beijing 100083, China
³ Institute for Industrial Internet Research, Sichuan University, Chengdu 610065, China
* Author to whom correspondence should be addressed.
† These authors contributed equally to this work.

Energies 2021, 14(12), 3547; https://doi.org/10.3390/en14123547

Submission received: 21 April 2021 / Revised: 10 June 2021 / Accepted: 11 June 2021 / Published: 15 June 2021


Abstract: In an electrically safe microenvironment, electrical appliances of all kinds can be operated without endangering life or property. Safety hazard factor detection aims to identify hazards before an accident occurs, so that administrators can be alerted to eliminate the risk, unnecessary losses can be reduced, and electrical operation remains healthy and orderly. In this paper, batteries are selected as the primary subject of safety detection because they are increasingly used in the Internet of Things (IoT) and often cause fires while charging and discharging. Existing algorithms must be embedded in a specialized sensor for each important electrical appliance; deployment constraints therefore make them extremely difficult to apply widely. To address this, we propose an improved load disaggregation algorithm based on dictionary learning and sparse coding with an optimal dictionary matrix period to detect potential safety hazards of battery loads. For safety-related electrical applications, this approach increases interpretability. We test the algorithm on the REDD dataset and compare it with baseline algorithms (combinatorial optimization, factorial hidden Markov model, and the basic discriminative dictionary sparse coding algorithm) to establish a degree of trust. The Mean Absolute Error (MAE) is 8.26, a drop of 70%. The Root Mean Square Error (RMSE) is 97.75, which is also better than that of the baseline algorithms.

Keywords:

Internet of Things; power supply safety; safety hazard factors detection

1. Introduction

Electrical fire accidents occur frequently and cause substantial property losses and casualties [1]. A statistical report on fire losses and casualties in the United States from 1980 to 2016 shows that the total number of fires has decreased by half, but property losses and casualties have not decreased significantly [2]. A key cause of electrical safety problems is the lack of monitoring datasets for unsafe electrical appliances and of interpretable monitoring algorithms [3,4]. There are two ways to detect safety hazard factors: direct monitoring and nonintrusive load disaggregation [5,6]. Although direct monitoring measures the power consumption of individual appliances accurately, direct monitoring of battery loads is difficult to deploy in practice because of hardware budget and installation space constraints. With the widespread use of smart meters and charging/recharging appliances, detecting safety hazard factors of electrical appliances remains a challenging problem. Existing interpretable nonintrusive load disaggregation algorithms fall into several main types, including combinatorial optimization (CO) [7], factorial hidden Markov model (FHMM) [8], and dictionary learning sparse coding (SC) [4] algorithms.

In load disaggregation for safety detection, combinatorial optimization algorithms resemble the well-known knapsack and subset-sum problems. In essence, these algorithms search a finite set of appliance power values for the optimal combination. Their goal is to assign a power value to every appliance so as to minimize the error between the sum of the estimated values and the measured aggregate data. Combinatorial optimization has served as a benchmark in this field [4,7,9]. It models each electrical appliance as either an on/off switching state or a finite set of states [10]. Although the combinatorial optimization nonintrusive load disaggregation algorithm can explain the electrical load mathematically, it is not suitable for battery safety scenarios because its accuracy is low: it ignores correlation over time.
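To make the subset-sum analogy concrete, the following Python sketch enumerates every combination of appliance states for a single time step and keeps the one whose sum is closest to the aggregate reading. It is only an illustration under stated assumptions: the function name `co_disaggregate`, the appliance names, and the candidate power levels are hypothetical and are not taken from the paper or from NILMTK.

```python
from itertools import product


def co_disaggregate(aggregate, appliance_states):
    """Brute-force combinatorial optimization for one time step.

    aggregate: measured aggregate power in watts.
    appliance_states: dict mapping an appliance name to its candidate
        power levels in watts (hypothetical values for illustration).
    Returns the best assignment and its absolute error.
    """
    names = list(appliance_states)
    best_combo, best_err = None, float("inf")
    # Enumerate every combination of one state per appliance.
    for combo in product(*(appliance_states[n] for n in names)):
        err = abs(aggregate - sum(combo))
        if err < best_err:
            best_combo, best_err = combo, err
    return dict(zip(names, best_combo)), best_err


# Hypothetical two-state appliances and a single aggregate reading.
states = {"fridge": [0, 120], "light": [0, 60], "microwave": [0, 1000]}
assignment, error = co_disaggregate(185, states)
print(assignment, error)
```

Each time step is solved independently here, which is exactly the lack of time correlation that the paragraph above identifies as the weakness of this family of methods.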

As research has progressed, the continuity of adjacent time steps has been taken into account, but state transition probabilities have not [11,12,13]. Factorial hidden Markov algorithms model each electrical appliance as a single Markov chain. The states of multiple appliances form multiple Markov chains that evolve simultaneously, constrained by the aggregate power and influenced by the transition probabilities. The factorial hidden Markov model algorithm uses a clustering algorithm to group the many discrete electrical states into a few basic states, so the state of each appliance is represented coarsely [8,14,15,16]. The models of multiple appliances are combined into a superstate hidden Markov model (SSHMM), which can represent the states of all the appliances. The aggregate power is passed through the SSHMM, which returns the state of every appliance contributing to it. When the number of appliances is small, these factorial hidden Markov models can be solved exactly (FHMMExact) [17]. Factorial hidden Markov algorithms have some explanatory ability, but their precision is low because they represent few states per appliance [18], so they are not suitable for battery safety hazard detection either.

Current load disaggregation algorithms focus only on the consumed load and rarely involve the supplied load [19]. Existing load disaggregation algorithms [20] are therefore not suitable for expressing battery safety hazard factors. Recent studies on interpretable load disaggregation have been dominated by dictionary learning algorithms [21,22], such as energy disaggregation via discriminative sparse coding (DDSC) [4]. However, the dictionary matrix period of these algorithms is not optimal, which lowers accuracy. Moreover, dictionary learning and sparse coding cannot be applied directly to our scenarios because the dictionary learning algorithm requires nonnegative matrices. Inspired by these algorithms, this paper builds, for battery safety hazard scenarios, a new topological representation diagram for battery loads that has not been proposed before, together with an improved detection algorithm with higher precision.

The contributions of this paper are as follows:

For battery load safety hazard factor detection scenarios, we propose, for the first time in the load disaggregation domain, a new topological representation diagram. To handle the nonnegativity requirement on the matrices, we design a shift strategy for the battery load.
An optimal dictionary matrix period algorithm is constructed to improve the accuracy of dictionary learning and sparse coding. Adding an aggregation constraint improves the algorithm further.
Based on the above two points, we propose a new safety hazard factor detection algorithm for battery loads.
Compared with three baseline algorithms (CO, FHMMExact, and DDSC), our algorithm, Bats, is more accurate on the Reference Energy Disaggregation Dataset (REDD) [23].

The remainder of this paper is organized as follows. Section 2 describes the basic model of the nonintrusive load disaggregation algorithm and the basic process of dictionary learning and sparse coding. Building on this model, we show how dictionary learning and sparse coding are formulated for nonintrusive load disaggregation, make improvements tailored to the characteristics of battery safety hazard factors and the accuracy requirements, and describe the modeling process for the problem. Section 3 establishes and analyzes the model and derives the algorithm in detail. Section 4 presents the experiments, with simulation and analysis results. Section 6 concisely summarizes the paper and outlines possible future research directions.

2. Problem Modeling

2.1. Preliminary

In the electrical microenvironment, the topological relationship among the electrical appliances is shown in Figure 1. Power consumption is taken to be mathematically positive, and power production mathematically negative. The positive load includes the TV set, washing machine, microwave oven, refrigerator, and the energy storage battery while it is charging; the negative load is the energy storage battery while it is discharging and supplying power. Over the whole time span, the energy storage battery therefore alternates between operating modes, and its power curve is partly positive and partly negative. The other commonly used electrical appliances remain in their usual power-consuming mode. For ease of reading, the most important mathematical symbols used in this paper are listed with their descriptions in Table 1.

In this way, we can define the aggregate power consumption on the meter or the plug board as $y(t)$, $t \in \{1, 2, \dots, T\}$, where $T$ is the length of the entire time slot. Suppose that there are $N$ electrical appliances, and the power consumption of the $i$th appliance in time slot $t$ is $x_i(t)$, $t \in \{1, 2, \dots, T\}$, $i \in \{1, 2, \dots, N\}$; therefore,

$$y(t) = x_1(t) + \sum_{i=2}^{N} x_i(t) + \epsilon(t) \tag{1}$$

where $x_1(t)$ represents the energy storage battery, whose power consumption may be mathematically positive or negative, $x_i(t)$, $i \in \{2, \dots, N\}$, represents the other commonly used electrical appliances, and $\epsilon(t)$ represents the random noise of system operation and measurement.

Due to practical constraints, we cannot equip every electrical appliance and every charging and discharging appliance with safe, expensive, and bulky power sensors, or configure the corresponding signal processing and data transmission modules. In general, only a single power sensor is deployed at the entrance of the electrical microenvironment, such as a building, a room, or a station. With so few sensor points deployed, we can measure and record only the aggregate power data $y(t)$, $t \in \{1, 2, \dots, T\}$, yet we want the power consumption of the individual appliances, which must be inferred by signal processing and data mining methods. Inspired by the theory of pattern recognition and numerical estimation, let the estimated value for the $i$th electrical appliance be $\hat{x}_i(t)$, $i \in \{1, \dots, N\}$. Adding the estimated power consumption values of the individual appliances gives an estimated aggregate value $\hat{y}(t)$, which is compared with the original aggregate power consumption data $y(t)$. The goal of estimation is to make the error between the estimated and actual aggregate values as small as possible, which can be formally expressed as:

$$\min \left\| y(t) - \sum_{i=1}^{N} \hat{x}_i(t) \right\|^2 \tag{2}$$

There are many specific modeling and solving methods for the general expression (2). By the theory of Fourier series, any waveform can be expressed as a combination of fundamental waves. To obtain interpretable results in load disaggregation modeling for safety hazard factor detection, we use the following form:

$$\min \left\| y(t) - BA \right\|^2 \tag{3}$$

where $y(t)$ is the signal to be decomposed, $B$ contains the Fourier basis functions, and $A$ contains the coefficients of the basis functions. In practice, however, the deployment of electrical IoT safety monitoring systems is limited by factors such as a low hardware budget for sensors, the high computation cost of signal processing and data analysis, the storage and communication consumption of the back-end system, and the large amount of data sent over the network. Because such systems generally do not acquire high-frequency load data in real time, it is unrealistic to apply the Fourier transform directly for high-frequency signal processing. A similar, alternative formulation is therefore needed.

In this paper, the dictionary learning method is adopted to learn the basis matrix of each electrical appliance from time-series data sampled at low frequency; the basis matrix and its corresponding activation matrix play roles similar to the basis functions and coefficients in the Fourier transform method. In Equation (4), $D$ represents the basis matrix and $C$ represents the sparse coefficient matrix. The error between the aggregate signal and its estimate $DC$ is measured by the Euclidean distance. Dictionary learning and sparse coding theory requires $C$ to be as sparse as possible: while keeping the error small, the sparse coefficient matrix should contain as many zero entries as possible,

$$\min_{C} \left\| y(t) - DC \right\|_2^2 + \lambda \left\| C \right\|_1 \tag{4}$$

where $\| y(t) - DC \|_2^2$ is the squared Euclidean distance and $\lambda \| C \|_1$ is the penalty term. Both $\ell_1$ and $\ell_2$ norm regularization help reduce overfitting, and $\ell_1$ norm regularization additionally yields sparse solutions. Problems with $\ell_1$ norm regularization can be solved by proximal gradient descent. Tibshirani et al. [24] described why the $\ell_1$ norm is chosen for the penalty term. In the coordinate system of the solution, the $\ell_1$ norm ball is a square rotated by 45° whose vertices lie on the coordinate axes, so the contours of the quadratic objective function usually touch it at a vertex, that is, on a coordinate axis. The $\ell_2$ norm ball is a circle centered at the origin, and the contours of the quadratic objective function generally do not touch it on a coordinate axis. A solution on a coordinate axis has zero components, so the solution under the $\ell_1$ norm is sparser than that under the $\ell_2$ norm. In this way, we construct the $\ell_1$-norm objective function of sparse coding for dictionary learning, and finally solve the safety hazard factor detection problem to satisfy the requirements of the system.

2.2. Determination of Objective Function

In this section, Equation (4) is further transformed into an objective function with coefficients. The transformation, which follows from Taylor's formula under the L-Lipschitz condition, is derived in [25].

Let $\nabla$ denote the differential operator and let $f(z)$ be the objective function. The problem is posed as follows:

$$\min_{z} f(z) + \lambda \| z \|_1 \tag{5}$$

where $z$ is the independent variable, and the aim is to find the $z$ that minimizes this objective. If $f(z)$ is differentiable and $\nabla f$ satisfies the L-Lipschitz condition, where $L > 0$ is a constant, the following gradient inequality holds:

$$\| \nabla f(z') - \nabla f(z) \|_2^2 \leq L \| z' - z \|_2^2 \quad (\forall z, z') \tag{6}$$

The second-order Taylor expansion of $f(z)$ near $z_k$ is approximately

$$\begin{aligned} \hat{f}(z) &\simeq f(z_k) + \langle \nabla f(z_k), z - z_k \rangle + \frac{L}{2} \| z - z_k \|_2^2 \\ &= \frac{L}{2} \left\| z - \left( z_k - \frac{1}{L} \nabla f(z_k) \right) \right\|_2^2 + CST \end{aligned} \tag{7}$$

where $\hat{f}(z)$ is the approximation of the objective function, $z_k$ is the current value of $z$, $L$ is a constant coefficient, and $CST$ is a constant that does not depend on $z$. The minimum of $\hat{f}(z)$ is attained at $z_{k+1}$:

$$z_{k+1} = z_k - \frac{1}{L} \nabla f(z_k) \tag{8}$$

Through gradient descent, $f(z)$ can be minimized iteratively. When the $\ell_1$ penalty of Equation (5) is included, each iteration minimizes the quadratic function $\hat{f}(z)$ plus the penalty to get

$$z_{k+1} = \underset{z}{\arg\min}\; \frac{L}{2} \left\| z - \left( z_k - \frac{1}{L} \nabla f(z_k) \right) \right\|_2^2 + \lambda \| z \|_1 \tag{9}$$

where $L$ can be simplified as the constant value 1.
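The minimization in Equation (9) has a well-known closed-form solution, the soft-thresholding operator, and repeating it gives the iterative shrinkage-thresholding scheme. The Python sketch below illustrates this for a single signal. It rests on assumptions that are not stated in the paper: the smooth term is taken as half the squared reconstruction error, the constant L is set to the squared largest singular value of the dictionary (the usual Lipschitz constant of that gradient) rather than 1, and the names `soft_threshold` and `ista` are hypothetical.

```python
import numpy as np


def soft_threshold(v, tau):
    """Closed-form minimizer of 0.5*||z - v||^2 + tau*||z||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)


def ista(y, D, lam, n_iter=500):
    """Proximal gradient descent for 0.5*||y - D c||_2^2 + lam*||c||_1.

    Assumption: L is the squared spectral norm of D, which is the
    Lipschitz constant of the gradient of the smooth term.
    """
    L = np.linalg.norm(D, 2) ** 2
    c = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ c - y)            # gradient of the smooth term
        # Gradient step as in Equation (8), then the shrinkage step
        # that solves Equation (9) with threshold lam / L.
        c = soft_threshold(c - grad / L, lam / L)
    return c


# With an identity dictionary the answer is soft_threshold(y, lam).
y = np.array([3.0, 0.5, -2.0])
c = ista(y, np.eye(3), lam=1.0)
print(c)
```

The middle coefficient is driven exactly to zero, which is the sparsity effect that motivates the $\ell_1$ penalty.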

3. Model Solution and Algorithm Analysis

This section derives the optimization objective function in detail. First, the dictionary learning algorithm for a single appliance produces the dictionary belonging to each appliance. Next, the load disaggregation error function defines the iterative optimization. Finally, the optimal dictionary matrix period algorithm selects one of the maximum periods for the improved dictionary learning sparse coding algorithm.

3.1. The Optimization Objective

Based on the input data $X_i$, the basis matrix $D_i$, and the activation matrix $C_i$, we build the minimization objective function as follows:

$$\begin{aligned} P1:\ & \min_{D_i \geq 0,\, C_i \geq 0} \frac{1}{2} \| X_i - D_i C_i \|_2^2 + \lambda \| C_i \|_1, \quad i = 1, \dots, N \\ \text{s.t.}\ & \| c_i^{(j)} \|_2 \leq 1, \quad j = 1, \dots, M \\ & \sum_{t=1}^{T} X_i(t) \leq const_i \end{aligned} \tag{10}$$

where the factor $\frac{1}{2}$ is the coefficient of the second-order term of the Taylor expansion in Equation (7), and $const_i$ is the constraint value of the $i$th appliance, i.e., the maximum power value. During gradient descent on the objective function, adjusting the regularization coefficient $\lambda$ balances the sparse coding error against the sparsity of the matrix $C_i$, so that the final representation satisfies the sparsity condition. The normalization constraint $\| c_i^{(j)} \|_2 \leq 1$, $j = 1, \dots, M$, is added to balance the weights of the atoms in the sub-dictionary. The second constraint applies to the final load disaggregation values: each electrical appliance has a threshold on its cumulative power consumption, which must not be exceeded over the whole operation process.

From Equation (10), we can see that the objective function has two optimization variables: $D_i$, the dictionary basis matrix of the $i$th electrical appliance, and $C_i$, the corresponding sparse representation. Following [26], the natural approach is to fix one variable and solve for the other. Each resulting subproblem can then be solved with convex optimization theory.

The values $D_i$ and $C_i$ are solved by the above alternating optimization method; they are then concatenated to construct the dictionary of appliances $1{:}k$. The dictionary of all electrical appliances is expressed formally as follows:

$$\begin{aligned} \hat{D}_{1:k} &= \arg\min_{D_{1:k} \geq 0} \| Y - D_{1:k} C_{1:k} \|_2^2 + \lambda \| C_{1:k} \|_1 \\ &= \arg\min_{D_{1:k} \geq 0} F(Y, D_{1:k}, C_{1:k}) \end{aligned} \tag{11}$$

With the dictionary $D_{1:k}$ fixed, the estimated sparse code $\hat{C}$ can be calculated as $\arg\min_{C_{1:k} \geq 0} F(Y, D_{1:k}, C_{1:k})$. In Equation (11), $F(Y, D_{1:k}, C_{1:k})$ denotes $\| Y - D_{1:k} C_{1:k} \|_2^2 + \lambda \| C_{1:k} \|_1$. After calculating the estimated sparse code $\hat{C}$, the estimated power value of the $i$th electrical appliance can be obtained:

$$\hat{X}_i = D_i \hat{C}_i \tag{12}$$
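The two steps just described, sparse coding of the aggregate signal against the concatenated dictionary and per-appliance reconstruction as in Equation (12), can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: it uses a projected proximal gradient loop as the solver, scales the data term by one half for convenience, and the names `nn_sparse_code` and `disaggregate` as well as the toy dictionaries are hypothetical.

```python
import numpy as np


def nn_sparse_code(Y, D, lam, n_iter=2000):
    """Solve min_{C >= 0} 0.5*||Y - D C||_F^2 + lam*sum(C).

    For nonnegative C the l1 norm equals the sum of the entries, so the
    proximal step is a shift by lam / L followed by clipping at zero.
    """
    L = np.linalg.norm(D, 2) ** 2           # Lipschitz constant (assumed D != 0)
    C = np.zeros((D.shape[1], Y.shape[1]))
    for _ in range(n_iter):
        grad = D.T @ (D @ C - Y)
        C = np.maximum(C - (grad + lam) / L, 0.0)
    return C


def disaggregate(Y, dictionaries, lam):
    """Code the aggregate Y against the concatenated dictionary, then
    rebuild each appliance as D_i @ C_i, as in Equation (12)."""
    D = np.hstack(dictionaries)
    C = nn_sparse_code(Y, D, lam)
    estimates, start = [], 0
    for D_i in dictionaries:
        stop = start + D_i.shape[1]
        estimates.append(D_i @ C[start:stop])   # rows of C owned by appliance i
        start = stop
    return estimates


# Toy example: two appliances with one orthogonal atom each.
D1 = np.array([[1.0], [0.0]])
D2 = np.array([[0.0], [1.0]])
Y = np.array([[3.0], [2.0]])
X1_hat, X2_hat = disaggregate(Y, [D1, D2], lam=0.5)
print(X1_hat.ravel(), X2_hat.ravel())
```

In the toy example each estimate is shrunk by the penalty (2.5 instead of 3, 1.5 instead of 2), which shows how the regularization coefficient trades reconstruction error for sparsity.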

In this way, the problem P1 is transformed into the problem P2, and the load disaggregation error can be expressed as follows:

$$\begin{aligned} P2:\ & \min E(X_{1:k}, D_{1:k}) = \sum_{i=1}^{k} \frac{1}{2} \| X_i - D_i \hat{C}_i \|_2^2 \\ \text{s.t.}\ & \hat{C}_{1:k} = \arg\min_{C_{1:k} \geq 0} F(Y, D_{1:k}, C_{1:k}) \\ & \| c_i^{(j)} \|_2 \leq 1, \quad j = 1, \dots, M \\ & \sum_{t=1}^{T} X_i(t) \leq const_i \end{aligned} \tag{13}$$

where $E(\cdot)$ is the error function, $j \in \{1, 2, \dots, M\}$ indexes the atoms of the dictionary, and $C_{1:k}$ is the concatenation of the sparse coefficient matrices of appliances $1$ to $k$.

However, because the error function is not easy to minimize directly, a new penalty term is introduced to convert the problem P2 into the problem P3. P3 is not a convex optimization problem, but it can be solved with the gradient descent algorithm:

$$\begin{aligned} P3:\ \min E_{reg}(X_{1:k}, D_{1:k}) &= E(X_{1:k}, D_{1:k}) + \lambda \| \hat{C}_{1:k} \|_1 \\ &= \sum_{i=1}^{k} \frac{1}{2} \| X_i - D_i \hat{C}_i \|_2^2 + \lambda \| \hat{C}_{1:k} \|_1 \\ \text{s.t.}\ \hat{C}_{1:k} &= \arg\min_{C_{1:k} \geq 0} F(Y, D_{1:k}, C_{1:k}) \\ \| c_i^{(j)} \|_2 &\leq 1, \quad j = 1, \dots, M \\ \sum_{t=1}^{T} X_i(t) &\leq const_i \end{aligned} \tag{14}$$

where $E_{reg}(\cdot)$ is the error function with the regularization term.

The power consumption data matrix $X_i$ of the $i$th electrical appliance is used for sparse coding iteration to obtain the optimal activation coefficient matrix $C_i^{\star}$:

$$C_i^{\star} = \arg\min_{C_i \geq 0} \frac{1}{2} \| X_i - D_i C_i \|_2^2 + \lambda \| \hat{C}_{1:k} \|_1 \tag{15}$$

where $C_i^{\star}$ is the optimal activation coefficient matrix of the $i$th appliance. These coefficient matrices are then concatenated into a larger matrix $C^{\star}$.

The dictionary is then updated iteratively by gradient descent according to the following rule:

$$\tilde{D} \leftarrow \tilde{D} - \alpha \left( (Y - \tilde{D} \hat{C}) \hat{C}^{T} - (Y - \tilde{D} C^{\star}) (C^{\star})^{T} \right) \tag{16}$$

where $\alpha$ is the update rate, or learning rate, that is, the step size of each gradient descent step. Controlling the step size helps the iteration reach a solution with smaller error.

For the convenience of further inference, each atom learned by iteratively updating the dictionary $D$ is normalized as follows:

$$d_i^{(j)} \leftarrow d_i^{(j)} / \| d_i^{(j)} \|_2, \quad i \in \{1, 2, \dots, N\},\ j \in \{1, 2, \dots, M\} \tag{17}$$

where $d_i^{(j)}$ is the $j$th vector (atom) of the learned dictionary of the $i$th appliance.
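One iteration of the update in Equation (16) followed by the normalization in Equation (17) might look like the Python sketch below. It is a sketch under assumptions: the clipping of the dictionary at zero is added here to respect the nonnegativity constraint that appears in Equation (11), the small floor on the norms only avoids division by zero, and the function name `update_dictionary` is hypothetical.

```python
import numpy as np


def update_dictionary(D, Y, C_hat, C_star, alpha):
    """One discriminative dictionary update followed by atom normalization.

    D:      concatenated dictionary, one atom per column.
    Y:      aggregate data matrix.
    C_hat:  sparse code estimated from the aggregate signal.
    C_star: concatenated optimal per-appliance codes.
    alpha:  learning rate (step size).
    """
    # Update direction of Equation (16).
    step = (Y - D @ C_hat) @ C_hat.T - (Y - D @ C_star) @ C_star.T
    # Assumption: clip at zero to keep the dictionary nonnegative.
    D = np.maximum(D - alpha * step, 0.0)
    # Equation (17): rescale every atom (column) to unit l2 norm.
    norms = np.linalg.norm(D, axis=0, keepdims=True)
    return D / np.maximum(norms, 1e-12)


# When C_hat already equals C_star the step vanishes and only the
# normalization changes the dictionary.
D0 = np.array([[3.0, 0.0], [4.0, 1.0]])
Y0 = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
C0 = np.array([[0.5, 0.0, 1.0], [0.0, 2.0, 0.0]])
D_new = update_dictionary(D0, Y0, C0, C0, alpha=1e-3)
print(D_new)
```

The update moves the dictionary only while the code obtained from the aggregate signal differs from the per-appliance optimum, so the iteration stops changing once the two codes agree.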

3.2. Dictionary Learning Period Optimal Algorithm

$X_i(t)$, $i = 1, \dots, N$, $t = 1, \dots, T$, is the power value of the $i$th appliance. $N_{period}$ is the total number of periods of each appliance. $T_i$, $i = 1, \dots, N$, is the average period of the $i$th appliance, measured in samples, where the interval between two samples is 60 s. Data exploration shows that different appliances have different operation periods, as Figure 2 shows. Can different period window sizes lead to different precision? The answer is yes.

Therefore, as Algorithm 1 describes, we compute the typical period of every appliance statistically. For every appliance, whenever the power value exceeds the threshold, the timestamp is marked as the start point of a power period, and the interval between consecutive marked start points is calculated. Finally, one of the maximum periods is selected as the optimal dictionary learning period:

Algorithm 1: Dictionary Learning Period Optimal algorithm
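Since only the caption of Algorithm 1 is available here, the following Python sketch reconstructs the procedure from the prose description alone and should be read as one plausible interpretation. In particular, it assumes that a period starts at a rising edge (the power crosses the threshold from below), that the per-appliance period is the mean interval between consecutive starts, and that the optimal period is the largest of these means; the names `average_period` and `optimal_period`, the thresholds, and the toy series are hypothetical.

```python
def average_period(power, threshold):
    """Mean number of samples between consecutive period starts.

    Assumption: a period starts where the power rises above the
    threshold after being at or below it on the previous sample.
    """
    starts = [t for t in range(1, len(power))
              if power[t] > threshold and power[t - 1] <= threshold]
    if len(starts) < 2:
        return None                      # not enough cycles to measure
    intervals = [b - a for a, b in zip(starts, starts[1:])]
    return sum(intervals) / len(intervals)


def optimal_period(appliance_power, thresholds):
    """Pick the largest average period over all appliances."""
    periods = {name: average_period(series, thresholds[name])
               for name, series in appliance_power.items()}
    valid = [p for p in periods.values() if p is not None]
    return (max(valid) if valid else None), periods


# Toy 1-min series in watts (hypothetical values).
data = {
    "fridge": [0, 100, 100, 0, 0, 100, 100, 0, 0, 100],
    "light":  [0, 0, 60, 0, 0, 0, 0, 0, 60, 0],
}
best, per_appliance = optimal_period(data, {"fridge": 10, "light": 10})
print(best, per_appliance)
```

Choosing the largest period ensures that one window of the dictionary covers a full operating cycle of the slowest appliance.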

3.3. Improved Dictionary Learning Sparse Coding Algorithm

In Algorithm 2, the optimal window size $M^{*}$ is first calculated by Algorithm 1 and used to segment the time series data $X_i$, which converts the time series into matrix form. Next, negative power values are moved into the positive range by adding a shift value, and the same shift value is added to the aggregate power consumption data. This produces new battery power values and aggregate power values. Then $D_i$ and $C_i$ are initialized with positive values, and $D$ is normalized. Dictionary learning is run for each appliance to train its dictionary and the corresponding sparse code. The learned dictionaries are concatenated into a new matrix for sparse coding, and the dictionary is updated according to the learning rate $\alpha$. After the aggregate constraints are added, the optimization is iterated until the optimal sparse coding matrix is obtained. Finally, on the dataset, the dictionary is multiplied by the sparse code matrix to obtain the predicted disaggregated power. The flowchart of Algorithms 1 and 2 is shown in Figure 3.

Algorithm 2: Improved dictionary learning sparse coding algorithm
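The two preprocessing steps of Algorithm 2, the shift of the battery load and the segmentation into windows, can be illustrated with the short Python sketch below. It is only a sketch under assumptions: the shift is taken as the magnitude of the most negative battery value, incomplete trailing windows are dropped, and the names `shift_nonnegative` and `to_windows` and the sample values are hypothetical.

```python
import numpy as np


def shift_nonnegative(battery, aggregate):
    """Shift the battery load so that it is nonnegative and add the same
    offset to the aggregate, which keeps the sum relation intact.

    Assumption: the shift equals the magnitude of the most negative
    battery value; it would be subtracted again from the battery
    estimate after disaggregation.
    """
    shift = max(0.0, -float(np.min(battery)))
    return battery + shift, aggregate + shift, shift


def to_windows(series, window):
    """Cut a 1-D series into columns of length `window`.

    Assumption: samples that do not fill a whole window are dropped.
    """
    n = len(series) // window
    return np.asarray(series[: n * window], dtype=float).reshape(n, window).T


# Hypothetical samples: the battery charges (+) and discharges (-).
battery = np.array([20.0, -40.0, 10.0, -5.0])
others = np.array([100.0, 100.0, 50.0, 50.0])
b_shifted, y_shifted, shift = shift_nonnegative(battery, battery + others)
X_battery = to_windows(b_shifted, window=2)
print(shift, b_shifted, y_shifted)
print(X_battery)
```

After the shift every entry is nonnegative, so the nonnegative dictionary learning steps can be applied to the battery load in the same way as to the other appliances.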

4. Experimental Results and Analysis

4.1. Datasets for Experiments

The Reference Energy Disaggregation Dataset (REDD) is a representative, public, and freely available dataset that has frequently been used to study nonintrusive load disaggregation algorithms [23]. To facilitate the experiments, this paper synthesizes battery simulation data and combines them with the REDD dataset. Four electrical appliances (battery, fridge, sockets, and light) were selected from building 1 for the experiments with the improved dictionary learning and sparse coding algorithm. The battery data are synthetic because no real dataset of battery appliances is currently available for our experiments. The data are obtained through simulation and then shifted to be entirely positive for verification.

As Table 2 shows, the REDD dataset is split by time into 80% for training and 20% for testing. The training dataset lasted from 18 April 2011 to 25 February 2012, and the test dataset lasted from 25 February 2012 to 25 May 2012. Resampling with a 1-min period gives a new sample dataset.

As Figure 4 shows, battery charging appears as power drawn from the external supply, with a maximum of 20 W. The minimum value is −40 W, indicating that the average power supplied to external loads during discharging is 40 W. To facilitate the experiments, the first phase of the experimental simulation uses a relatively ideal record of battery charging and discharging. The figure further shows that the energy charged and the energy discharged, measured as areas under the curve, are close to each other. In the future, we will test our algorithm in real battery load scenarios.

4.2. Experimental Setting

These algorithms are implemented in Python on top of NILMTK [17,27]. The experiments run on a desktop computer with a 1080i GPU, an Intel Core i5-10400 CPU (2.9 GHz base frequency), and 16 GB of memory, under the Windows 10 operating system. For convenience and to support the comparison experiments, we build a virtual environment with Anaconda, a platform widely used in data science research. It provides an isolated environment in which Python packages such as numpy, matplotlib, cvxpy, hmmlearn, scikit-learn, TensorFlow, and Keras are installed.

The baseline algorithms CO, FHMMExact, and DDSC are selected for comparison. For convenience, we adopt NILMTK-Contrib, a state-of-the-art framework for load disaggregation research, as the means of evaluation. Within this unified framework, the three algorithms can load data, preprocess data, train models, and test them.

Furthermore, the hyper-parameters selected for the simulation experiments are listed with explanations in Table 3. The regularization coefficient $\lambda$ is 20. The learning rate of dictionary learning $\alpha$ is $10^{-12}$. The maximum number of iterations is 10,000. As the computation converges gradually, it generally does not run to the maximum number of iterations, as Figure 5 shows. In all these experiments, the number of dictionary atoms $n$ is 10.

4.3. Convergence Analysis

Experiments with Algorithm 1 on window segmentation show that different window sizes have different convergence rates, where the convergence rate is measured in number of iterations. As Figure 5 and Table 4 show, under different segmentation windows the optimization objective (the error function) converges as the iterations proceed. In general, the error function decreases as the number of iterations increases. The choice of window size is therefore very important for the convergence rate. As Table 4 shows, convergence experiments were performed with window sizes ranging from 20 to 380. The fastest convergence occurs at a window size of 280 samples, which needs 40 iterations, while the slowest occurs at 20 samples, which needs 1465 iterations.

In this subsection, we investigate the convergence and carry out an error analysis of the dictionary learning method for load disaggregation. Convergence is measured by the difference between the previous and the current value of the objective function in Equation (11). Figure 5 shows how this error decreases with the number of iterations. Figure 6 shows how the error for the microwave varies as the window size increases. Depending on the window size, the MAE value is not always better than that of the other algorithms. Thus, the optimal window size is one of the most important factors.

4.4. Metrics

To compare Bats with the three baseline algorithms CO, FHMMExact, and DDSC, we evaluate all of them using the same metrics.

4.4.1. MAE

In statistics, the Mean Absolute Error (MAE) measures the error between paired observations that express the same phenomenon. In this paper, the mean absolute error is expressed as follows:

$$MAE_i = \frac{1}{T} \sum_{t=1}^{T} \left| \hat{x}_i(t) - x_i(t) \right| \tag{18}$$

where $\hat{x}_i(t)$ is the estimated power consumption of the $i$th electrical appliance in time slot $t$, $x_i(t)$ is the real power consumption of the $i$th electrical appliance in time slot $t$, $T$ is the length of the entire dataset, and $MAE_i$ is the mean power consumption error of the $i$th electrical appliance over the entire dataset. As a traditional indicator in pattern recognition and measurement, MAE appears widely in the literature and is the most important indicator for load disaggregation.

4.4.2. RMSE

The Root Mean Square Error (RMSE) is often used to quantify the measurement error between paired observations. Its mathematical expression is as follows:

{RMSE}_{i} = \sqrt{\frac{1}{T} \sum_{t = 1}^{T} {({\hat{x}}_{i} (t) - x_{i} (t))}^{2}}

(19)

where ${RMSE}_{i}$ is the Root Mean Square Error of the $i$th electrical appliance over the whole time period, ${\hat{x}}_{i} (t)$ is the estimated power consumption of the $i$th electrical appliance in time slot $t$, $x_{i} (t)$ is the real power consumption of the $i$th electrical appliance in time slot $t$, and $T$ is the time length of the whole training and testing set. The RMSE corresponds to the Euclidean distance between the estimated and the real power sequences, scaled by $1 / \sqrt{T}$. The Euclidean norm is also called the $ℓ_{2}$ norm and is denoted ${∥ \cdot ∥}_{2}$ or simply $∥ \cdot ∥$.
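The relation between Equation (19) and the Euclidean norm can be checked numerically. The sketch below is illustrative only: the function names and the sample values are hypothetical, and the identity it demonstrates is that the RMSE equals the $ℓ_{2}$ norm of the error vector divided by $\sqrt{T}$.

```python
import math


def root_mean_square_error(estimated, actual):
    """RMSE for one appliance over T time slots."""
    if len(estimated) != len(actual) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = sum((e - a) ** 2 for e, a in zip(estimated, actual))
    return math.sqrt(squared / len(actual))


def l2_norm(vector):
    """Euclidean (l2) norm of a vector."""
    return math.sqrt(sum(v * v for v in vector))


# Hypothetical power readings for four time slots.
x_hat = [3.0, 0.0, 0.0, 0.0]
x_true = [0.0, 4.0, 0.0, 0.0]

rmse = root_mean_square_error(x_hat, x_true)
error_vector = [e - a for e, a in zip(x_hat, x_true)]
scaled_norm = l2_norm(error_vector) / math.sqrt(len(x_true))
print(rmse, scaled_norm)
```

Squaring the errors before averaging makes the RMSE more sensitive to large deviations than the MAE, which is one reason the two metrics are reported side by side.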

4.5. Experiment Result Analysis

In the experiments, Bats (this paper) is compared with CO, FHMMExact, and DDSC. The composite (synthetic) battery data use a curve similar to the heat pump data, with the negative values offset onto the positive number line. The aggregated data integrate the battery data, the refrigerator data, and the data of the other appliances for safety hazard factors detection.
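One simple way to offset negative values onto the positive number line is to add a constant equal to the magnitude of the most negative sample, and to subtract it again after disaggregation. The sketch below shows that idea only. It is an assumption about what such a shift could look like, not the shift strategy defined by Bats, and the function names and sample values are hypothetical.

```python
def shift_to_nonnegative(power):
    """Return a non-negative copy of `power` and the offset that was added."""
    offset = max(0.0, -min(power))
    return [p + offset for p in power], offset


def undo_shift(shifted, offset):
    """Recover the signed series from a shifted one."""
    return [p - offset for p in shifted]


# Hypothetical battery trace: positive while charging, negative while discharging.
battery = [40.0, 25.0, 0.0, -30.0, -45.0]

shifted, offset = shift_to_nonnegative(battery)
restored = undo_shift(shifted, offset)
print(offset, shifted, restored)
```

Keeping the offset alongside the shifted series means the sign of the recovered power still distinguishes charging from discharging.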

The REDD dataset serves as the basis of the experiments: the battery data are synthetic, while the power data of the other appliances span 18 April 2011 to 25 May 2012. The compared algorithms, CO, FHMMExact, DDSC, and Bats, are all interpretable, and results are reported for a selection of the main electrical appliances.

Table 5 shows that, on the REDD dataset, the Bats algorithm achieves the lowest MAE for every listed appliance. For the battery load, its MAE is 8.26, compared with 52.79 for CO, 27.72 for FHMMExact, and 15.86 for DDSC. The optimal shape window size of dictionary learning contributes positively to this result.
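The abstract states that the battery MAE of 8.26 "drops by 70%" without naming the reference baseline. The short calculation below derives the relative reduction against each baseline from the battery row of Table 5. Reading the 70% figure as a comparison with FHMMExact is an inference from these numbers, not a statement made by the authors.

```python
# Battery row of Table 5 (MAE on the REDD dataset).
mae_battery = {"CO": 52.79, "FHMMExact": 27.72, "DDSC": 15.86, "Bats": 8.26}


def relative_reduction(baseline, proposed):
    """Fraction by which `proposed` is lower than `baseline`."""
    return (baseline - proposed) / baseline


reductions = {
    name: relative_reduction(value, mae_battery["Bats"])
    for name, value in mae_battery.items()
    if name != "Bats"
}

for name, reduction in reductions.items():
    print(f"Bats vs {name}: {reduction:.1%} lower MAE")
```

Under this reading, the reduction is larger than 70% against CO and smaller against DDSC.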

Table 6 shows that the RMSE of the Bats algorithm is lower than that of CO, FHMMExact, and DDSC for every listed appliance, although the margin is small in some cases (for example, 97.75 versus 99.17 for FHMMExact on the battery load). We attribute the advantage of Bats over CO and FHMMExact to capturing complex dictionary atoms from the aggregate power data and learning more sequential relationships in the power trace.

At present, the experimental data indicate that the Bats algorithm performs better than the combinatorial optimization algorithm and the factorial hidden Markov model. In addition, from the perspective of data trends, the behavior of the battery load after the negative-value shift is largely explained, which indicates that the Bats algorithm can be used to detect battery safety trends.

Battery safety detection is a new and important application scenario, which makes this result valuable. With the lasso and LARS algorithms, the Bats algorithm converges well to a relatively small value, for example, $error = 0.1$. In the future, the effect of the sparse coding algorithm for dictionary learning in load disaggregation, especially for battery loads, can be analyzed in depth from the perspectives of initialization value, learning rate, and gradient.

5. Discussion

As mentioned in Section 1, although the power consumption of appliances can be monitored directly and accurately, this paper focuses on safety hazard monitoring in realistic scenarios where only nonintrusive load disaggregation is available. Because the goal is safety monitoring, we select relatively interpretable approaches based on optimization theory, such as CO, FHMMExact, and DDSC, rather than black-box neural network algorithms. As Table 7 shows, CO disregards the temporal correlation of power consumption data, CO and FHMMExact both transform power consumption data into state data, and none of the three baselines was designed for battery scenarios. The dictionary learning approach adopted in this paper avoids both the loss of temporal correlation in CO and the loss of information in the state-based FHMMExact. For battery scenarios, we add several improvements, including the shift strategy for the battery load and the optimal dictionary matrix period algorithm, together with an aggregation constraint.

However, the proposed Bats algorithm has some limitations. First, only simulation tests have been performed; in the future, we will establish a test-bed to verify the algorithm. Second, all the training and testing data used for dictionary learning and sparse coding are low-frequency power consumption data, which do not benefit the algorithm's efficiency; in the future, we will take high-frequency datasets into account.

6. Conclusions

Because unsafe electrical appliances, including batteries, threaten life and property, direct monitoring is the most timely way to ensure safety. In some scenarios, however, there is no choice but to adopt nonintrusive load disaggregation methods. Based on the principle that charging power is mathematically positive and discharging power is negative, we propose a new topological representation diagram. Inspired by Fourier transform and dictionary learning algorithms, we present an electrical safety hazard factors detection algorithm that uses improved dictionary learning and sparse coding, including an optimal dictionary matrix period. We build a minimization objective function and adopt gradient descent to find an approximate solution. On the REDD dataset, our algorithm is more accurate than the three baseline algorithms CO, FHMMExact, and DDSC. The MAE value is 8.26, which drops by 70%, and the RMSE value is 97.75, which is also better than those of the baseline algorithms. In conclusion, our algorithm has achieved some degree of feasibility but still has room to improve. In future work, we will continue to explore higher-precision algorithms while maintaining interpretability for safety-related tasks.

Author Contributions

Conceptualization, W.W., L.C.; Formal analysis, M.G., Z.W. and W.W.; Investigation, Y.C., W.W. and Z.C.; Methodology, M.G., Y.N. and W.W.; Software, Z.W. and W.W.; Supervision, L.C.; Validation, H.L.; Writing—original draft, W.W. and Z.W.; Funding, L.C.; Writing—review and editing, W.W., Y.C. and H.L. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by the National Natural Science Foundation of China under Grant No. 62072319, in part by the Foundation of Science and Technology on Communication Security Laboratory (No. 6142103190415), and in part by the Key Research and Development Program of the Science and Technology Department of Sichuan Province under Grant No. 2020YFG0254.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data sharing is not applicable to this article.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Beláň, A.; Cintula, B.; Cenký, M.; Janiga, P.; Bendík, J.; Eleschová, Ž.; Šimurka, A. Measurement of Static Frequency Characteristics of Home Appliances in Smart Grid Systems. Energies 2021, 14, 1739.
2. Campbell, R. Home Electrical Fires: Supporting Tables; National Fire Protection Association: Quincy, MA, USA, 2019.
3. Wang, B.; Dehghanian, P.; Wang, S.; Mitolo, M. Electrical Safety Considerations in Large-Scale Electric Vehicle Charging Stations. IEEE Trans. Ind. Appl. 2019, 55, 6603–6612.
4. Kolter, J.; Batra, S.; Ng, A. Energy Disaggregation via Discriminative Sparse Coding. Adv. Neural Inf. Process. Syst. 2010, 23, 1153–1161.
5. Schirmer, P.A.; Mporas, I.; Sheikh-akbari, A. Energy Disaggregation Using Two-Stage Fusion of Binary Device Detectors. Energies 2020, 13, 2148.
6. Song, J.; Wang, H.; Du, M.; Peng, L.; Zhang, S.; Xu, G. Non-Intrusive Load Identification Method Based on Improved Long Short Term Memory Network. Energies 2021, 14, 684.
7. Hart, G. Nonintrusive appliance load monitoring. Proc. IEEE 1992, 80, 1870–1891.
8. Henao, N.; Agbossou, K.; Kelouwani, S.; Dubé, Y.; Fournier, M. Approach in Nonintrusive Type I Load Monitoring Using Subtractive Clustering. IEEE Trans. Smart Grid 2017, 8, 812–821.
9. Batra, N.; Singh, A.; Whitehouse, K. If You Measure It, Can You Improve It? The Value of Energy Disaggregation. BuildSys 2015, 191–200.
10. He, K.; Jakovetic, D.; Zhao, B.; Stankovic, V. A Generic Optimisation-Based Approach for Improving Non-Intrusive Load Monitoring. IEEE Trans. Smart Grid 2019, 10, 6472–6480.
11. Machlev, R.; Levron, Y.; Beck, Y. Modified Cross-Entropy Method for Classification of Events in NILM Systems. IEEE Trans. Smart Grid 2019, 10, 4962–4973.
12. Bhotto, M.Z.A.; Makonin, S.; Bajić, I.V. Load Disaggregation Based on Aided Linear Integer Programming. IEEE Trans. Circuits Syst. II Express Briefs 2017, 64, 792–796.
13. Piga, D.; Cominola, A.; Giuliani, M.; Castelletti, A.; Rizzoli, A.E. Sparse Optimization for Automated Energy End Use Disaggregation. IEEE Trans. Control Syst. Technol. 2016, 24, 1044–1051.
14. Anderson, K. Non-Intrusive Load Monitoring: Disaggregation of Energy by Unsupervised Power Consumption Clustering. Ph.D. Thesis, Carnegie Mellon University, Pittsburgh, PA, USA, 2014.
15. Mengistu, M.; Girmay, A.; Camarda, C. A Cloud-Based On-Line Disaggregation Algorithm for Home Appliance Loads. IEEE Trans. Smart Grid 2019, 10, 3430–3439.
16. Liu, Q.; Kamoto, K.; Liu, X.; Sun, M.; Linge, N. Low-Complexity Non-Intrusive Load Monitoring Using Unsupervised Learning and Generalized Appliance Models. IEEE Trans. Consum. Electron. 2019, 65, 28–37.
17. Batra, N.; Kukunuri, R. Towards reproducible state-of-the-art energy disaggregation. In Proceedings of the 5th International Conference on Future Energy Systems, Xi’an, China, 3–6 November 2010; pp. 193–202.
18. Nalmpantis, C.; Vrakas, D. Machine learning approaches for non-intrusive load monitoring: From qualitative to quantitative comparation. Artif. Intell. Rev. 2019, 52, 217–243.
19. Zhao, J.; Jung, T.; Wang, Y.; Li, X. Achieving differential privacy of data disclosure in the smar. In Proceedings of the IEEE INFOCOM 2014 - IEEE Conference on Computer Communications 2014, Toronto, ON, Canada, 27 April–2 May 2014; pp. 504–512.
20. Athanasiadis, C.; Dimitrios, D.; Theofilos, P.; Antonios, C. A Scalable Real-Time Non-Intrusive Load Monitoring System for the Estimation of Household Appliance Power Consumption. Energies 2021, 14, 767.
21. Khodayar, M.; Wang, J.; Wang, Z. Energy Disaggregation via Deep Temporal Dictionary Learning. IEEE Trans. Neural. Netw. Learn Syst. 2020, 31, 1696–1709.
22. Lyu, H.; Strohmeier, C.; Menz, G. COVID-19 time-series prediction by joint dictionary learning and online NMF. arXiv 2020, arXiv:2004.09112.
23. Kolter, J.; Johnson, M. REDD: A Public Data Set for Energy Disaggregation Research; Workshop on Data Mining Applications in Sustainability SIGKDD: San Diego, CA, USA, 2011.
24. Tibshirani, R. Regression Shrinkage and Selection Via the Lasso. J. R. Stat. Soc. Ser. B Stat. Methodol. 1996, 58, 267–288.
25. Boyd, S. Convex Optimization; Cambridge University Press: Cambridge, UK, 2004.
26. Julien, M.; Francis, B.; Jean, P.; Guillermo, S. Online dictionary learning for sparse coding. In Proceedings of the 26th Annual International Conference on Machine Learning, Montreal, QC, Canada, 14–18 June 2009; pp. 689–696.
27. Batra, N.; Kelly, J.; Parson, O. NILMTK: An open source toolkit for non-intrusive load monitoring. In Proceedings of the 5th International Conference on Future Energy Systems, Cambridge, UK, 11–13 June 2014; pp. 265–276.
28. Zhang, C.; Zhong, M.; Wang, Z.; Goddard, N.; Sutton, C. Sequence-to-Point Learning with Neural Networks for Non-Intrusive Load Monitoring. In Proceedings of the AAAI Conference on Artificial Intelligence, New Orleans, LA, USA, 2–7 February 2018; Volume 32, pp. 2604–2611.

Figure 1. New topological logic diagram of electrical microenvironment with potential safety hazards.

Figure 2. Appliance period characteristics in REDD dataset.

Figure 3. Flowchart about Algorithms 1 and 2.

Figure 4. Ideal simulation of battery charging and discharging curves and their data sets.

Figure 5. Disaggregation convergence rate with shape window size in the REDD dataset.

Figure 6. Error change of microwave with shape window size in the REDD dataset.

Table 1. Academic terms covered in this paper.

Symbol	The Meaning of the Symbol
Y	Aggregation power
$X_{i}, i \in (1, 2, \dots, N)$	Power of $i$ th appliance
$y (t), t \in (1, 2, \dots, T)$	Aggregation power in t
$x_{i} (t)$	Actual power of $i$ th appliance in t
$ϵ (t)$	Power noise in t
$\hat{y} (t), t \in (1, 2, \dots, T)$	Estimated aggregation power in t
${\hat{x}}_{i} (t), t \in (1, 2, \dots, T)$	Estimated power of $i$ th appliance in t
$t \in (1, 2, \dots, T)$	Time slot
$i \in (1, 2, \dots, N)$	Index of all appliances
$j \in (1, 2, \dots, M)$	The atom numbers of dictionary
B	Fundamental matrix of Fourier transform
A	Coefficient matrix of Fourier transform
D	Dictionary representation of aggregate
$D_{i}, i \in (1, 2, \dots, N)$	Dictionary representation of $i$ th appliance
$D_{1 : k}$	${1 : k}$ Appliance dict concatenation
$d_{i}^{(j)}$	The subvector of dictionary
C	Aggregation sparse coefficient
$\hat{C}$	Estimated aggregation sparse coefficient
$C^{*}$	Optimal aggregation sparse coefficient
$C_{i}, i \in (1, 2, \dots, N)$	$i$ th appliance sparse coefficient
$C_{1 : k}$	${1 : k}$ Appliance sparse coefficient

Table 2. Experiment case for REDD datasets.

Index	Item	Time	Duration
1	train start time	2011.04.18	313 days
2	train end time	2012.02.25	313 days
3	test start time	2012.02.25	90 days
4	test end time	2012.05.25	90 days
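The durations in Table 2 can be reproduced from the listed dates with standard calendar arithmetic. The snippet below is a quick check; the variable names are hypothetical, and the dates are those given in the table.

```python
from datetime import date

# Split boundaries from Table 2.
train_start = date(2011, 4, 18)
train_end = date(2012, 2, 25)   # also the start of the test period
test_end = date(2012, 5, 25)

train_days = (train_end - train_start).days
test_days = (test_end - train_end).days

print(f"training period: {train_days} days")
print(f"testing period:  {test_days} days")
```

The test period count depends on 2012 being a leap year, which `datetime.date` accounts for automatically.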

Table 3. Experiment parameters.

Index	Parameter	Parameter Name	Parameter Value
1	$λ$	coefficient	20
2	$α$	update step	$10^{- 12}$
3	$const_{i}$	constraint const	0.1
4	T	time length	entire dataset
5	Step	max iterative step	10,000
6	n	atom number	10
7	error	objective function error	0.1
8	m	default matrix shape	120 or variable

Table 4. Statistical measures on minimum, maximum, average, and standard deviation values of errors for different presented cases.

Index	Shape Size	Iteration Numbers	Remarks
1	20	1465	maximum
2	40	1040	-
3	60	906	-
4	80	724	-
5	100	814	-
6	120	556	-
7	140	731	-
8	160	466	-
9	180	464	-
10	200	276	-
11	220	429	-
12	240	372	-
13	260	324	-
14	280	45	minimum
15	320	147	-
16	340	145	-
17	360	190	-
18	380	111	-
minimum	280	40	-
maximum	20	1464	-
average	-	511	-
standard deviation	-	376	-
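The summary rows of Table 4 can be recomputed from the per-row iteration counts. The sketch below uses the per-row values as printed and Python's standard `statistics` module. The table does not say whether the standard deviation is the sample or the population estimator, so both are computed; treating it as the sample estimator is an assumption.

```python
import statistics

# (shape window size, iterations to converge), copied from the rows of Table 4.
table4 = [
    (20, 1465), (40, 1040), (60, 906), (80, 724), (100, 814), (120, 556),
    (140, 731), (160, 466), (180, 464), (200, 276), (220, 429), (240, 372),
    (260, 324), (280, 45), (320, 147), (340, 145), (360, 190), (380, 111),
]

iterations = [count for _, count in table4]

mean_iterations = statistics.mean(iterations)
sample_std = statistics.stdev(iterations)       # divides by n - 1
population_std = statistics.pstdev(iterations)  # divides by n

fastest = min(table4, key=lambda row: row[1])
slowest = max(table4, key=lambda row: row[1])

print(f"mean = {mean_iterations:.1f}")
print(f"sample std = {sample_std:.1f}, population std = {population_std:.1f}")
print(f"fastest window = {fastest[0]}, slowest window = {slowest[0]}")
```

With these inputs, the rounded mean and the rounded sample standard deviation agree with the average and standard deviation rows of the table.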

Table 5. MAE in the REDD dataset.

Appliance Name	CO	FHMMExact	DDSC	Bats
battery	52.79	27.72	15.86	8.26
microwave	86.29	57.92	63.20	28.29
dish washer	113.13	70.58	113.05	38.92
light	69.40	38.05	73.50	33.83
electric oven	149.30	114.04	82.02	71.24
washer dryer	98.04	102.70	92.69	63.97

Table 6. RMSE in the REDD dataset.

Appliance Name	CO	FHMMExact	DDSC	Bats
battery	220.97	99.17	114.14	97.75
microwave	263.99	158.54	268.37	157.13
dish washer	265.07	220.22	389.93	174.31
light	101.14	64.87	132.43	59.35
electric oven	573.47	474.11	457.36	453.57
washer dryer	391.21	393.78	441.01	390.07

Table 7. Comparison of several algorithms.

Algorithms	Temporal Correlation	State or Non-State Based	Specially for Battery	Black-Box
CO [7]	No	State based	No	No
FHMMExact [17]	Yes	State based	No	No
DDSC [4]	Yes	Non-state based	No	No
Seq2point [28]	Yes	Non-state based	No	Yes
DTDL [21]	Yes	Non-state based	No	Yes
Bats	Yes	Non-state based	Yes	No

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

