Article

A Design and Comparative Analysis of a Home Energy Disaggregation System Based on a Multi-Target Learning Framework

by Bundit Buddhahai 1, Suratsavadee Koonlaboon Korkua 2,*, Pattana Rakkwamsuk 3 and Stephen Makonin 4
1 School of Informatics, Walailak University, Nakhon Si Thammarat 80161, Thailand
2 School of Engineering and Technology, Walailak University, Nakhon Si Thammarat 80161, Thailand
3 School of Energy, Environment and Materials, King Mongkut’s University of Technology Thonburi, Bangkok 10140, Thailand
4 School of Engineering Science, Simon Fraser University, Burnaby, BC V5A 1S6, Canada
* Author to whom correspondence should be addressed.
Buildings 2023, 13(4), 911; https://doi.org/10.3390/buildings13040911
Submission received: 2 February 2023 / Revised: 17 March 2023 / Accepted: 21 March 2023 / Published: 30 March 2023
(This article belongs to the Special Issue Building Energy-Saving Technology)

Abstract:
Insightful information on energy use encourages home residents to practice home energy conservation. This paper proposes an experimental design for an energy disaggregation system based on the low-computational-cost approaches of multi-target classification and multi-target regression, both of which fall under the multi-target learning framework. The experiments are set up to determine the optimal learning algorithm and model parameters. In addition, the designed system can provide inference of the appliance power state and the estimated power consumption from both approaches. The kernel density estimation technique is utilized to formulate the appliance power state as a finite-state machine for the multi-target classification approach. Multi-target regression can directly provide the estimated appliance power demand from the aggregate data, and this work unifies its system design together with multi-target classification. The predictive performance, measured by the micro-averaged F-score for power state inference and by the power estimation accuracy index for the estimated power demand, is shown to outperform a deep-learning-based denoising autoencoder network under the same data settings for both approaches. The results lead to a recommendation on applying the approaches in home energy monitoring, based mainly on the characteristics of the appliance power and the information that the residents wish to perceive.

1. Introduction

Energy disaggregation, or non-intrusive load monitoring (NILM), is a data analysis system that aims to determine the operation status or energy consumption of individual appliances from the total energy consumption data of a building [1]. Perceiving energy-use information at the appliance level motivates residents to reduce the use of specific appliances or to prevent energy wastage [2]. A study found that residential users could reduce energy costs by at least 12% when given appliance-level energy-use information in a real-time scenario, compared with 4–8% when given only the traditional information on whole-house energy usage on a weekly or monthly basis [3]. Another benefit, for grid distributors, was that detailed information on energy use could help shape energy policy based on consumption behavior to balance the electricity demand and supply scheme [4].
Data learning and inference are important components for developing an efficient energy disaggregation system. The development of such systems can be classified mainly into two categories: event-based and non-event-based [5]. The former category dealt with detecting power switch events (power ON ←→ OFF) and had two sub-categories: detecting steady-state power consumption changes (∆P and ∆Q) [1,6] and detecting power-ON transient patterns [7,8]. Both features differed for each appliance and could thus be utilized to learn whether the target appliances were being turned on or off. The integration of both features for data learning could enhance the disaggregation performance since the transient patterns were unique to different appliance types [9]. The transient patterns, however, required dedicated and high-cost hardware to capture the high-frequency components and perform complex data processing. These challenges have inhibited the adoption of this approach in current power meters, which usually perform data manipulation at low frequencies [10]. The non-event-based category, on the other hand, involved decomposing the appliance status from the aggregate data without relying on detecting switch events. It thus allowed the utilization of a low-frequency measurement system, which enabled a lower development cost. This category could mainly be classified into three sub-categories: pattern classification, hidden-Markov-model-based, and deep learning [5,11]. The first sub-category referred to mapping each sample of aggregate data to the operating status (ON/OFF) of appliances with the help of traditional classification algorithms, such as neural networks [12] and support vector machines [13]. This sub-category treated each load label independently, which might not comply with a real scenario in which an appliance might be used together with certain other appliances (e.g., a DVD player and a television) [14]. The second sub-category involved applying a factorial hidden Markov model (FHMM) and its variant models, setting up the model parameters using probability density functions. The model factorized the aggregate data into a sequence of operation states (as a finite-state machine model) for appliances that were hidden from the observer [15,16,17]. The major drawback of these approaches was the model complexity, which increased exponentially as the number of appliances increased [5]. The third sub-category employed a deep neural network (DNN) framework, which has recently been applied in many data learning applications due to its capabilities of automatic feature processing and learning complex problems. Various network topologies were investigated to estimate the power demand for each appliance, for example, long short-term memory (LSTM) [18], convolutional neural networks (CNNs) [19,20], and autoencoder networks [21,22]. Another major drawback of these frameworks was the model complexity, which involved a large number of network parameters and configurations for model optimization.
From the previous studies, the major approaches can provide one or two important pieces of information in an NILM application: (1) the identification of the appliance operation state and (2) the estimation of the power demand of the appliances. Some approaches can provide both pieces of information, but they incur a high computational cost in the model configuration. This paper proposes an NILM system design based on a low-computational-cost approach, the multi-target learning framework, which formulates the problem in a multi-target data format well suited to identifying multiple appliances. The framework is divided into two approaches: multi-target classification (MTC) and multi-target regression (MTR). The MTC approach formulates the appliance power state as a finite-state machine using kernel density estimation (KDE), the proposed method for power state modeling. The MTR approach involves less data processing than MTC, and each approach was previously proposed independently [23,24]. This work unifies the system design of both approaches under the multi-target learning framework. The experiments illustrate the process of obtaining the optimal predictive performance for appliance power state inference and estimated power demand. In addition, a comparative analysis of the performance and characteristics of the approaches is delivered to support the decision of which approach to apply in a home energy monitoring system. The key contributions of this work are summarized as follows:
(1)
An NILM system design based on multi-target classification and multi-target regression is proposed as a unified system of a multi-target learning framework. Both approaches can provide the inference of an appliance power state and the estimated power demand, which are the key outputs of the system.
(2)
A power state modeling using KDE for the multi-target classification approach is proposed.
(3)
The comparative analysis of the predictive performance for the multi-target classification, multi-target regression, and a denoising autoencoder (DAE) approach is provided for the consideration of applying the approach in a field application.

2. Materials and Methods

This section describes the general concept of the multi-target learning framework, the experimental data, and the process that demonstrates how a home energy disaggregation using this framework could be implemented. Each topic is described as follows.

2.1. Multi-Target Learning

Multi-target learning is a subfield of supervised learning for multiple-output tasks that learns data by simultaneously mapping a set of input features to a set of output labels [25]. The learning framework has been applied in many application areas, such as image classification, text mining [26], and predicting model parameters in vegetation studies [27]. The general learning framework has the task of determining a function f: X → Y from a training set {(xi, yi) | 1 ≤ i ≤ n}, where n is the number of training samples, xi ∈ X is a vector of input features, and yi ∈ Y is the vector of output labels associated with xi [25]. The function then acts as a learning model that provides predicted outputs for an unknown input sample. The common learning approaches that can be applied to the NILM task are multi-target classification and multi-target regression [28,29], because the purpose of the task is to determine the estimated power demand (numeric values) and/or the appliance operation states (nominal values) from the aggregate data.
Data manipulation for the multi-target learning framework can be classified into two approaches: problem transformation and algorithm adaptation [28,30]. The former approach uses a multi-target learner (classifier or regressor) to transform the multiple output labels into individual labels or groups of labels and then uses a conventional single-output learning algorithm (as the base learner) to tackle the problem. The latter approach adapts the single-output learning algorithm to learn the multi-output data directly. The learning algorithms used in this work follow the problem transformation approach and are available in the experimentation tools, including:
(1)
Class relevance (CR) or single target (ST): treats and learns each output independently [26].
(2)
Classifier chain (CC) or regressor chain (RC): cascades each output label to the input features and builds dependent classifiers/regressors for voting [31].
(3)
Random k-label sets with disjoint subsets (RAkELd): randomly partitions the output labels into small disjoint groups, each treated as a concatenated single-output problem [32].
A major advantage of data learning through the multi-target learning framework over learning each output label independently (as in single-output data learning) is the ability to exploit label correlations among the outputs (as in CC and RAkELd), which has been shown to provide superior predictive performance [31,32].
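To make the problem transformation concrete, the following minimal sketch (not the exact experimental pipeline of this work) shows how a single-target and a classifier-chain transformation can be built with Scikit-learn around a conventional base learner; the toy arrays stand in for the aggregate inputs and per-appliance outputs described in Section 2.2.

```python
# A minimal sketch (not the exact experimental pipeline) of the problem
# transformation approach in Scikit-learn. The toy arrays below are
# placeholders for the aggregate features (I, P, Q, PF) and the six
# appliance output labels described in Section 2.2.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.multioutput import ClassifierChain, MultiOutputClassifier

rng = np.random.default_rng(0)
X = rng.random((1000, 4))               # aggregate input features
Y = rng.integers(0, 2, (1000, 6))       # per-appliance output labels (toy data)

# Single target (ST / CR): one independent classifier per output label.
st_model = MultiOutputClassifier(RandomForestClassifier(n_estimators=50, random_state=0))
st_model.fit(X, Y)

# Classifier chain (CC): each classifier also receives the previously
# predicted labels, so label correlations can be exploited.
cc_model = ClassifierChain(RandomForestClassifier(n_estimators=50, random_state=0),
                           order="random", random_state=0)
cc_model.fit(X, Y)

print(st_model.predict(X[:3]))
print(cc_model.predict(X[:3]))
```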

2.2. Experimental Data

The electricity consumption data used in this experiment were collected from a residence in Bangkok, Thailand and are provided as the Supplementary Material [33]. The data consist of an aggregate circuit and ten sub-circuits of appliance usage, with a one-minute sampling interval and a data length of four months. The electrical parameters for each circuit consist of the current (I), active power (P), reactive power (Q), and power factor (PF). In this work, six appliance labels (measurement circuits) that were frequently used and contributed significantly to the total energy consumption were evaluated. The multi-target dataset was created using the four parameters of the aggregate circuit as the inputs (X) and the appliances’ power states (for the MTC data) or the appliances’ power consumption (for the MTR data) as the outputs (Y). The load description for each appliance label is presented in Table 1.

2.3. Experimental Design

The data processing procedure is organized as an experimental design whose main objective is to determine the optimized learning model and predictive performance for both the MTC and MTR approaches. For the MTC approach, KDE modeling converted the sub-circuit power data into discrete power states of the appliances. The MTC dataset was constructed using the aggregate data as the inputs and the appliances’ power state data as the outputs. Data learning proceeded by selecting the optimal MTC algorithm and model parameters, and the data inference process determined the predicted power states and the corresponding power consumption of the appliances. For the MTR approach, the MTR dataset was constructed directly from the aggregate data and the sub-circuit power data. Data learning optimized the MTR algorithm and model parameters, and data inference provided the estimated power consumption and the corresponding power states of the appliances. The evaluation procedure is summarized in the flowchart in Figure 1.
KDE, used here for MTC power state modeling, is a statistical approach for estimating the probability density function (PDF) of a random variable x from n samples (x1, x2, …, xn) drawn from a distribution function f [34]. The kernel estimator $\hat{f}$ of the function is described by Equation (1), where K is the kernel function, for which the Gaussian kernel is chosen under the assumption of normally distributed power data, and h is the kernel bandwidth, which acts as a smoothing parameter.
$$\hat{f}(x) = \frac{1}{nh} \sum_{i=1}^{n} K\!\left(\frac{x - x_i}{h}\right) \qquad (1)$$
The main idea of using this technique for appliance power state modeling is that the inherent power states each exhibit a certain range of power demand data, so the most frequent values can be readily observed and extracted from a KDE plot. The parameter h was selected empirically to make the peaks obvious, and the positions of the peaks were determined using Python’s argrelextrema function, which calculates the relative extrema within a list of data. The number and positions of the peaks thus represent the number of power states and the power representatives associated with those states, respectively.
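The following sketch illustrates this KDE-based power state extraction using Scikit-learn’s KernelDensity and SciPy’s argrelextrema; the bandwidth, grid step, and toy lighting data are illustrative assumptions, since the bandwidth is selected empirically per appliance in this work.

```python
# A sketch of KDE-based power state extraction: fit a Gaussian KDE to an
# appliance's power readings and take the relative maxima of the density
# curve as the power state representatives. Bandwidth, grid step, and the
# toy lighting data are illustrative assumptions.
import numpy as np
from scipy.signal import argrelextrema
from sklearn.neighbors import KernelDensity

def power_state_representatives(power_kw, bandwidth=0.01, grid_step=0.005):
    power_kw = np.asarray(power_kw).reshape(-1, 1)
    kde = KernelDensity(kernel="gaussian", bandwidth=bandwidth).fit(power_kw)
    # Extend the grid slightly below zero so the OFF peak is an interior maximum.
    grid = np.arange(-3 * bandwidth, power_kw.max() + grid_step, grid_step).reshape(-1, 1)
    density = np.exp(kde.score_samples(grid))
    peak_idx = argrelextrema(density, np.greater)[0]     # relative maxima of the KDE curve
    return np.clip(grid[peak_idx].ravel(), 0.0, None)    # peak positions = state representatives

# Toy lighting circuit: mostly OFF, sometimes dimmed (~0.05 kW) or full (~0.25 kW).
rng = np.random.default_rng(1)
samples = np.concatenate([np.zeros(500),
                          rng.normal(0.05, 0.005, 200),
                          rng.normal(0.25, 0.010, 100)])
print(power_state_representatives(np.clip(samples, 0.0, None)))   # roughly [0, 0.05, 0.25]
```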
The appliance power states of the MTC dataset were created by mapping the power data to the nearest element within the set of power state representatives; the associated power state value could then be determined. For example, a three-state appliance label (with the possible set of states {0, 1, 2}) has the set of power representatives {0, 0.05, 0.25} kW. If a data sample has a power value of 0.1 kW, it is mapped to state ‘1’ since that representative has the least distance among the associated power state representatives. For the MTR dataset, the actual power data from each individual appliance were used as the data for each output label.
To compare the predictive performance between MTC and MTR, both the classification and regression tasks from each approach were evaluated. The regression task for MTC was obtained by mapping the predicted state to its associated power representative value; the estimated appliance power data were therefore bounded by the defined power states. The classification task for MTR was obtained by defining a binary power state (on/off) for both the actual and the estimated power data: a sample was assigned ‘0’ (off) when its power fell below the appliance’s operating range described in Table 1, and ‘1’ (on) otherwise.
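A short sketch of these two conversions is given below; the state representatives and the ON threshold (taken here as the lower bound of the power range in Table 1) are illustrative values for the LTL label.

```python
# A sketch of the two conversions with illustrative values for the LTL label:
# (i) MTC -> regression: replace a predicted state with its power representative;
# (ii) MTR -> classification: threshold the estimated power at the lower bound
#      of the appliance's operating range (an assumption based on Table 1).
import numpy as np

ltl_reps_kw = np.array([0.0, 0.05, 0.25])   # power state representatives (Table 2)
ltl_on_threshold_kw = 0.03                  # lower bound of LTL's power range (Table 1)

def nearest_state(power_kw, reps):
    """Map a power reading to the state whose representative is closest."""
    return int(np.argmin(np.abs(reps - power_kw)))

def state_to_power(state, reps):
    """MTC regression output: the representative power of the predicted state."""
    return float(reps[state])

def power_to_onoff(power_kw, threshold):
    """MTR classification output: binary ON/OFF by thresholding."""
    return int(power_kw >= threshold)

print(nearest_state(0.10, ltl_reps_kw))             # -> 1, as in the worked example
print(state_to_power(1, ltl_reps_kw))               # -> 0.05 kW
print(power_to_onoff(0.01, ltl_on_threshold_kw))    # -> 0 (off)
```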

2.4. Evaluation Tools and Performance Index

This work used Python (ver. 3.7) with the Scikit-learn library (ver. 0.19.2) [35] and Meka (release 1.9.6) [36] for multi-target data learning and inference. The Python library provides the sklearn.multioutput module for both classification and regression tasks with a couple of learning algorithms from the problem transformation approach. The library, however, only partially supports the multi-target classification task and could not perform the required cross-validation; thus, the Meka software, an application for learning multi-target classification algorithms, was used for this task instead. The performance indexes used in this work assess the performance of power state identification as the classification task (Hamming score and F-score) and the performance of power consumption estimation as the regression task (power estimation accuracy). Each index is described as follows.
-
Hamming score: the complement of the Hamming loss (Hamming score = 1 − Hamming loss), where the Hamming loss [30] indicates the fraction of incorrect classifications over the entire set of output labels. Thus, the Hamming score can be expressed as Equation (2).
$$\text{Hamming score} = 1 - \frac{1}{n} \sum_{i=1}^{n} \frac{1}{L} \left| \hat{y}_i \oplus y_i \right| \qquad (2)$$
where n and L are the number of samples and the number of output labels, respectively, and $\oplus$ is the operator that determines the number of label differences between the set of actual output labels ($y$) and the predicted labels ($\hat{y}$) for a sample. This index was used to evaluate performance over all output labels in the algorithm selection process in the Meka software.
-
F-score: the harmonic mean of precision and recall, used for the classification task. This index is commonly employed to determine appliance classification performance rather than accuracy because its result is not distorted by data with highly imbalanced classes [16,37]. The micro-averaged value was used to make a fair comparison between the results of MTC (finite-state classes: {0, 1, …, C}, where C is the number of classes) and MTR (binary classes: {0, 1}). The key concept of micro-averaging is to aggregate the counts of tp, fp, and fn across all samples first, which is suitable for determining and comparing output labels with different numbers of classes. The micro-averaged F-score is defined through the micro-averaged precision and micro-averaged recall, as presented in Equations (3)–(5).
$$\text{Precision (micro)} = \frac{\sum_{i=1}^{L} tp_i}{\sum_{i=1}^{L} tp_i + \sum_{i=1}^{L} fp_i} \qquad (3)$$
$$\text{Recall (micro)} = \frac{\sum_{i=1}^{L} tp_i}{\sum_{i=1}^{L} tp_i + \sum_{i=1}^{L} fn_i} \qquad (4)$$
$$\text{F-score (micro)} = \frac{2 \times \text{Precision (micro)} \times \text{Recall (micro)}}{\text{Precision (micro)} + \text{Recall (micro)}} \qquad (5)$$
where $tp_i$, $fp_i$, and $fn_i$ are the numbers of true positives, false positives, and false negatives, respectively, for output label $i$, and $L$ is the number of output labels [30].
-
The power estimation accuracy: this index determines how well the regression model estimates power [15,38] by calculating the complement of the power estimation error ratio over the test samples. It is defined for an individual appliance label and for the entire dataset by Equations (6) and (7), respectively.
$$\text{Power est. acc. (by label)} = 1 - \frac{\sum_{i=1}^{n} \left| \hat{y}_i - y_i \right|}{2 \sum_{i=1}^{n} y_i} \qquad (6)$$
$$\text{Power est. acc. (by dataset)} = 1 - \frac{\sum_{i=1}^{n} \sum_{l=1}^{L} \left| \hat{y}_i^{\,l} - y_i^{\,l} \right|}{2 \sum_{i=1}^{n} \sum_{l=1}^{L} y_i^{\,l}} \qquad (7)$$
where $y_i^l$ and $\hat{y}_i^l$ are the actual and predicted output of sample $i$ for label $l$, respectively, and $n$ and $L$ are the number of samples and the number of output labels, respectively.
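As a reference, the following sketch computes the three indexes of Equations (2)–(7) on toy arrays with NumPy and Scikit-learn; it is a minimal illustration, not the evaluation code used in the experiments.

```python
# A minimal illustration (not the experimental evaluation code) of the three
# performance indexes in Equations (2)-(7), computed on toy arrays shaped
# (n samples, L labels).
import numpy as np
from sklearn.metrics import f1_score

def hamming_score(y_true, y_pred):
    """Eq. (2): 1 minus the fraction of incorrectly predicted labels over n x L."""
    return 1.0 - np.mean(np.asarray(y_true) != np.asarray(y_pred))

def power_estimation_accuracy(y_true_kw, y_pred_kw):
    """Eq. (7): 1 minus the total absolute error over twice the total true power."""
    y_true_kw, y_pred_kw = np.asarray(y_true_kw), np.asarray(y_pred_kw)
    return 1.0 - np.sum(np.abs(y_pred_kw - y_true_kw)) / (2.0 * np.sum(y_true_kw))

# Toy power states: 5 samples, 2 appliance labels.
y_true = np.array([[0, 1], [1, 1], [0, 0], [2, 1], [1, 0]])
y_pred = np.array([[0, 1], [1, 0], [0, 0], [1, 1], [1, 0]])
print(hamming_score(y_true, y_pred))

# Micro-averaged F-score, Eqs. (3)-(5), for the first appliance label.
print(f1_score(y_true[:, 0], y_pred[:, 0], average="micro"))

# Toy power estimates (kW): 3 samples, 2 appliance labels.
p_true = np.array([[0.00, 1.2], [0.05, 1.2], [0.25, 0.0]])
p_pred = np.array([[0.00, 1.1], [0.05, 1.3], [0.20, 0.0]])
print(power_estimation_accuracy(p_true, p_pred))
```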

3. Results and Discussion

This section provides the experimental results and analysis from the data learning process using the multi-target classification and multi-target regression approaches.

3.1. Multi-Target Classification Approach

3.1.1. Appliance Power States Modeling Using KDE

The number and representative data of power states for each appliance were analyzed using the KDE plot, as presented in Figure 2, which illustrated the distribution or density of appliance power data and inherent power states.
Appliance labels LTL and ANB1 had obvious peaks, which showed that they had distinguishable operation states (e.g., off/dim/full for lightings or off/on for an inverter-type air-conditioner) with low variance in their power data. On the other hand, the appliance labels consisting of multiple appliances (plugging appliances, PGB1) or continuously variable appliances (water heater, WTB1) had a high variance of power data. These loads produced indistinct peaks in the KDE plots, and the power data were therefore averaged within each range to determine the representatives. Each applicable peak and its position represented a power state and its power state representative value, which together determined the number of power states, as summarized in Table 2.

3.1.2. Algorithm Selection and Model Optimization

The experiment consisted of selecting the optimal classification algorithm by comparing the performance of several candidates. The best candidate was then further evaluated to choose the best set of model parameters.
A performance evaluation of different learning algorithms was conducted through cross-validation (K = 10) to obtain a performance value over the whole dataset. Meka’s Experimenter mode was used to perform the task where the experimentation was set for a comparative evaluation of multiple algorithms under the same environment setting. Table 3 presents the result of the comparison, where all the model parameters are set to default.
The MTC algorithm RAkELd with random forest as the base classifier achieved the best result, although the difference was not statistically significant. This classifier was then tuned for the best model parameters using the MultiSearch function in Meka’s Explorer mode over the required parameters and test ranges. The function evaluated each combination of model parameters within the test range and returned the parameter setup that provided the best model performance.
For the RAkELd + random forest classifier, two parameters were put into this test: ‘k’, which controls the number of partitioned labels for the multi-target classifier (‘k’ was bounded by 1 ≤ k < L/2, where L is the number of output labels), and numIterations, which controls the number of trees in the random forest classifier. The results of the parameter selection and the optimal predictive performance are shown in Table 4.

3.1.3. Testing Performance Evaluation

The optimized model setup was deployed to evaluate the predictive performance for the individual labels using the Scikit-learn library. The dataset was split into a training set and a test set in an 80:20 proportion. The model was created using the training set and then applied to the test set to obtain the micro-averaged F-score values as the generalized model performance. The experiments on the power state classification performance and the power estimation accuracy are presented in the following subsections.
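The sketch below illustrates this evaluation flow (80:20 split and per-label micro-averaged F-scores). Scikit-learn has no built-in RAkELd, so a single-target random forest stands in for the optimized Meka model purely to show the mechanics; X_mtc and Y_mtc are placeholders for the actual MTC feature and state arrays.

```python
# A sketch of the per-label test evaluation: an 80:20 split and a micro-averaged
# F-score per appliance. A single-target random forest stands in for the
# optimized RAkELd model only to illustrate the flow; X_mtc and Y_mtc are
# placeholders for the MTC feature and state arrays.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.multioutput import MultiOutputClassifier

labels = ["LTL", "LTB1", "PGB1", "PMP", "WTB1", "ANB1"]
rng = np.random.default_rng(0)
X_mtc = rng.random((2000, 4))                       # placeholder aggregate features
Y_mtc = rng.integers(0, 3, (2000, len(labels)))     # placeholder power states

X_tr, X_te, Y_tr, Y_te = train_test_split(X_mtc, Y_mtc, test_size=0.2, shuffle=False)
model = MultiOutputClassifier(RandomForestClassifier(n_estimators=80, random_state=0))
model.fit(X_tr, Y_tr)
Y_hat = model.predict(X_te)

for j, name in enumerate(labels):
    print(name, round(f1_score(Y_te[:, j], Y_hat[:, j], average="micro"), 3))
```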

Power States Classification Accuracy

This evaluation described how well the model could correctly predict the power state of the appliances. The generalized classification performance of the MTC model is shown in Table 5.
The result showed that the air-conditioner ANB1 obtained the highest F-score value since it was a high-power appliance label with low fluctuation in power demand and a lower number of power states (two-state model: OFF/ON). The model could therefore distinguish the input data associated with this label more easily, with less confusion compared to the other appliance labels with three-state models.

Power Demand Estimation Accuracy

The predicted power states were mapped to the associated power state representative values, resulting in the estimated power demand predicted by the model. The power estimation accuracy, which compares the estimated power to the ground-truth data, is shown in Table 6.
The ANB1 load label still obtained a high power estimation accuracy since its power state prediction was already accurate. The other load labels, however, were lower in accuracy because their power data fluctuated quite dramatically, which created a significant difference between the ground truth and the power representatives used as the estimated data.
Apart from the power estimation accuracy, power plots can also illustrate how well the estimated power data track the ground-truth data. Figure 3 shows one week of power plots from the test data for each appliance label.
The lighting, plugging, and water heater loads, which obtained relatively low power estimation accuracy values, showed consistent results in the power plot profiles. For example, the plots show several false positives for the LTL and LTB1 labels, which made their performance values quite low, while the PGB1 and WTB1 labels had a considerable gap between the ground truth and the estimated power data due to the fluctuation of the actual power data.

3.2. Multi-Target Regression Framework

The Python Scikit-learn library supports data learning and inference for the MTR dataset as in the following experiments.

3.2.1. Algorithm Selection and Model Optimization

The dataset was evaluated to select an MTR algorithm with optimal regression performance, and the best candidate was then further evaluated to obtain the optimal model parameters. The algorithm selection results, based on the power estimation accuracy under cross-validation (K = 10), are presented in Table 7.
The ST multi-target regressor with random forest as the base regressor provided the best result. The optimized regression model parameters were determined using Python’s GridSearchCV function. This function evaluated combinations of two model parameters, max_depth (the maximum depth of the trees) and n_estimators (the number of trees in the forest), and returned the set of parameters that provided the best predictive performance. Table 8 shows the results of this evaluation.
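A sketch of this optimization step is shown below, wrapping a random forest regressor per output label (the ST transformation) and searching the two parameters with GridSearchCV; the scorer is a stand-in implementation of the power estimation accuracy index, and X_mtr and Y_mtr are placeholders for the actual MTR arrays.

```python
# A sketch of the MTR optimization step: a random forest regressor per output
# label searched over max_depth and n_estimators with GridSearchCV. The scorer
# is a stand-in implementation of the power estimation accuracy index;
# X_mtr and Y_mtr are placeholders for the actual MTR arrays.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import make_scorer
from sklearn.model_selection import GridSearchCV
from sklearn.multioutput import MultiOutputRegressor

def power_estimation_accuracy(y_true, y_pred):
    return 1.0 - np.sum(np.abs(y_pred - y_true)) / (2.0 * np.sum(y_true))

rng = np.random.default_rng(0)
X_mtr = rng.random((2000, 4))            # placeholder aggregate features
Y_mtr = rng.random((2000, 6)) * 1.5      # placeholder appliance power data (kW)

search = GridSearchCV(
    MultiOutputRegressor(RandomForestRegressor(random_state=0)),
    param_grid={
        "estimator__max_depth": [10, 20, 30, 40, 50],
        "estimator__n_estimators": [10, 20, 30, 50, 80],
    },
    scoring=make_scorer(power_estimation_accuracy),
    cv=10,
)
search.fit(X_mtr, Y_mtr)
print(search.best_params_, round(search.best_score_, 3))
```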

3.2.2. Testing Performance Evaluation

The optimized model setup was deployed to evaluate the power estimation performance using the same training/test split as in the MTC experiment. The experiments on the power estimation and power state prediction performance are presented in the following subsections.

Power Estimation Accuracy

This evaluation describes how well the model could estimate the power demand for each appliance label. The generalized regression performance of the model on the test set is shown in Table 9.

Power States Classification Accuracy

The estimated power data were mapped to a binary power state model (off/on) using the appliance power-on threshold to discriminate between the two states. A multi-label dataset [14] was created using the same data split as in the MTC experiments. The results of the F-score evaluation on the test set are presented in Table 10.
The ANB1 load obtained a relatively high accuracy in both power estimation and power state identification. This was due to its high-power, low-variance load profile, which made the input data more distinguishable for the regression and classification models.
The power plots illustrate how well the estimated power data from the regression model track the ground-truth data. Figure 4 shows one week of power plots from the test data for each appliance label. Unlike the MTC approach, the estimated power data generated by the regression model were not bound to predefined power states, which benefits the model’s capability to improve the data estimation. From the results, the lightings (LTL, LTB1), which are low-power loads, obtained a noisy data estimation, consistent with their relatively low power estimation accuracy values. This was because low-power loads contribute little change to the aggregate input data, making it hard for the model to estimate the outputs accurately.

3.3. Performance Benchmarking

This experiment compared the predictive performance of the MTC and MTR approaches against a DAE, a state-of-the-art deep-learning-based approach, as the comparator. The DAE network was used since it provided good performance among other network topologies [39]. It aims to recreate the clean signal, or ground-truth power data, from the aggregate power data, which acts as the noisy signal. The network architecture from that study was employed, consisting of two 1D convolutional layers as the first and last layers, with three dense layers in the middle. The network was implemented in Python using the neural-disaggregator library [40]. The results of the comparison are summarized by the F-score classification performance and the power estimation accuracy in Table 11 and Table 12, respectively. Generally, the direct output of each approach (the predicted power state for MTC and the estimated power data for MTR) demonstrated higher performance than its counterpart, which requires an additional data processing step. Thus, less data processing results in less loss of information and better accuracy.
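For reference, the following Keras sketch reproduces the DAE topology described above (1D convolutional layers at the ends and three dense layers in the middle, after Kelly and Knottenbelt [39]); the filter counts, hidden units, and window length are illustrative assumptions rather than the exact benchmark configuration.

```python
# A hedged Keras sketch of the DAE comparator's topology (1D convolutional
# layers at the ends and three dense layers in the middle, after Kelly and
# Knottenbelt [39]). Filter counts, hidden units, and the window length are
# illustrative assumptions, not the exact benchmark configuration.
from tensorflow import keras
from tensorflow.keras import layers

def build_dae(window_len=60, n_filters=8):
    model = keras.Sequential([
        keras.Input(shape=(window_len, 1)),             # window of aggregate power (noisy signal)
        layers.Conv1D(n_filters, 4, padding="same"),
        layers.Flatten(),
        layers.Dense(window_len * n_filters, activation="relu"),
        layers.Dense(128, activation="relu"),
        layers.Dense(window_len * n_filters, activation="relu"),
        layers.Reshape((window_len, n_filters)),
        layers.Conv1D(1, 4, padding="same"),            # reconstructed appliance power (clean signal)
    ])
    model.compile(optimizer="adam", loss="mse")
    return model

build_dae().summary()
```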

3.4. Discussion

The experiments were formulated to design an energy disaggregation system with MTC and MTR frameworks. These are low-computational-cost systems that are exemplified for the application of home energy monitoring.
The MTC framework formulated the appliance power state modeling with KDE as a finite-state machine, and the power state classification result measured by the F-score was over 87% for each appliance. The effectiveness of the modeling technique is also indicated by the corresponding regression performance, which obtained a power estimation accuracy of over 70% for appliances with moderate to high power consumption. The experimental results, however, did not provide a good predictive performance value for appliances with continuous or highly fluctuating power consumption (e.g., water heater, pump). For this case, deploying a data pre-processing stage that filters out the anomalies before data learning would help obtain a better predictive performance.
The MTR approach provided a direct (no extra data pre-processing) power estimation for each appliance, which, on average, presented better regression performance than the MTC approach. The key results included an estimation accuracy of over 93% for high-power loads (water heater and A/C). The power state identification from the MTR approach also presented performance values comparable to those of the MTC, which is the direct classification approach. Using the micro-averaged F-score index enhances the reliability of performance benchmarking between models with different numbers of class outputs (i.e., binary-class and multi-class models).
The benchmark DAE network provided inferior predictive performance compared to the proposed MTC and MTR approaches due to the amount of sample data in the experiment. Generally, the DAE network detects ranges of data that indicate power ON as the clean signal, or features to be learned, for each appliance. This means that deep-learning-based approaches require a large number of training samples to obtain a good level of predictive performance [39,41]. On the other hand, the proposed approaches learn data on an individual-sample basis, which can achieve satisfactory predictive performance with a moderate amount of data.
In the use case of the proposed approaches, if the appliances in the domain are mostly finite-state loads, or if users wish to gather insight into the appliance operation status for predictive maintenance purposes, the MTC framework would be the recommended system. For example, if a certain power state (e.g., “compressor on” for an A/C) has operated for an unusual period compared to another state (e.g., “fan on”), this could signify a malfunction or abnormal usage. If the users wish to focus on monitoring the power demand of high-power appliances for energy management purposes, the MTR framework would be the recommended system because it requires less data processing.
A limitation of the MTC approach is the ability of the KDE method to capture and infer the appliance power states of continuous-power appliances (e.g., water heater, water pump). The power data from those appliances do not exhibit frequently recurring values, so the KDE cannot properly identify the inherent power states and their associated power data. If there are a large number of these appliances in the system, it is recommended to use the MTR approach to obtain a model with a decent predictive performance. For the MTR approach, the binary model of power state identification might not fit well for learning the operational states of multi-state appliances (e.g., fridge, washing machine). Thus, an additional data processing step could be employed, such as mapping the power data after the regression process to prior knowledge of the power state assignment.

4. Conclusions

The proposed design of the home energy disaggregation system employs the multi-target classification (MTC) and multi-target regression (MTR) approaches to provide appliance power state identification and power demand estimation. The design is based on an optimization process of learning algorithm selection and model parameter selection. The power state modeling for MTC was designed using the KDE technique to determine the number of power states and the power representative data for each appliance. The MTR required no special data pre-processing before training the model, making its computational cost lower than that of the MTC approach. The MTC approach delivered appliance power state identification with a micro-averaged F-score of 87–99% and a power estimation accuracy of 52–95% for the appliances under experiment. The MTR approach delivered an appliance power estimation accuracy of 54–95% and power state identification with a micro-averaged F-score of 87–99%. These performance results outperform the benchmark deep neural network with a DAE architecture under the same data settings. The choice of approach for deploying the proposed system in a home monitoring application depends on what data and which types of appliances users wish to focus on, because both approaches have strengths and weaknesses in data manipulation and achieved predictive performance.
Future work could focus on improving the MTR model to obtain finite-state power modeling instead of binary state modeling. This would help the approach identify the power state more accurately when working with multi-state appliances. The result of the KDE method, which learns the inherent power states for the MTC approach, could be applied as a post-processing step. Another finite-state modeling option is clustering methods such as DBSCAN [42] or hierarchical clustering [43], which can identify the number of clusters, or power states, from the power data without prior knowledge of the number of power states. A more refined power state model for the MTR approach would thus make the system applicable to more types of appliances, leading to more efficient energy monitoring and management applications.
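As an illustration of the clustering alternative, the sketch below applies DBSCAN to toy appliance power readings to recover the state representatives without specifying the number of states in advance; the eps and min_samples values and the toy data are illustrative assumptions.

```python
# A sketch of the clustering alternative: DBSCAN groups appliance power
# readings into states without specifying the number of states in advance.
# The eps and min_samples values and the toy data are illustrative assumptions.
import numpy as np
from sklearn.cluster import DBSCAN

rng = np.random.default_rng(2)
power_kw = np.concatenate([np.zeros(300),                 # OFF
                           rng.normal(0.05, 0.005, 150),  # state 1
                           rng.normal(0.25, 0.010, 80)])  # state 2
power_kw = np.clip(power_kw, 0.0, None).reshape(-1, 1)

clusters = DBSCAN(eps=0.01, min_samples=10).fit_predict(power_kw)
reps = sorted(float(power_kw[clusters == c].mean())
              for c in set(clusters) if c != -1)          # one representative per state
print(reps)                                               # approximately [0.0, 0.05, 0.25]
```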

Supplementary Materials

The data used within the experiments can be downloaded from https://data.mendeley.com/datasets/nmnk58bgtb/1 (accessed on 15 January 2023).

Author Contributions

Conceptualization, B.B. and S.K.K.; methodology, B.B. and S.K.K.; software, B.B.; validation, S.M. and P.R.; formal analysis, B.B. and S.K.K.; investigation, B.B., S.K.K., and S.M.; resources, B.B. and P.R.; data curation, B.B.; writing—original draft preparation, B.B. and S.K.K.; writing—review and editing, B.B.; visualization, B.B.; supervision, S.M. and P.R.; project administration, S.K.K.; funding acquisition, B.B. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Research Grants from the Walailak University Research Fund under Contract WU66219.

Data Availability Statement

The data supporting the reported results will be available on request from the corresponding author.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Hart, G.W. Nonintrusive appliance load monitoring. Proc. IEEE 1992, 80, 1870–1891.
2. Meguro, W.; Peppard, E.; Meder, S.; Maskrey, J.; Josephson, R. Going Beyond Code: Monitoring Disaggregated Energy and Modeling Detached Houses in Hawai‘i. Buildings 2020, 10, 120.
3. Armel, K.C.; Gupta, A.; Shrimali, G.; Albert, A. Is disaggregation the holy grail of energy efficiency? The case of electricity. Energy Policy 2013, 52, 213–234.
4. Zeifman, M.; Roth, K. Nonintrusive appliance load monitoring: Review and outlook. IEEE Trans. Consum. Electron. 2011, 57, 76–84.
5. Zoha, A.; Gluhak, A.; Imran, A.; Rajasegarar, S. Non-Intrusive Load Monitoring Approaches for Disaggregated Energy Sensing: A Survey. Sensors 2012, 12, 16838–16866.
6. Laughman, C.; Lee, K.; Cox, R.; Shaw, S.; Leeb, S.; Norford, L.; Armstrong, P. Power signature analysis. IEEE Power Energy Mag. 2003, 99, 56–63.
7. Leeb, S.B.; LeVan, M.S.; Kirtley, J.L.; Sweeney, J.P. Development and Validation of a Transient Event Detector. AMP J. Technol. 1993, 3, 69–74.
8. Chang, H.H. Non-Intrusive Demand Monitoring and Load Identification for Energy Management Systems Based on Transient Feature Analyses. Energies 2012, 5, 4569–4589.
9. Norford, L.K.; Leeb, S.B. Non-intrusive electrical load monitoring in commercial buildings based on steady-state and transient load-detection algorithms. Energy Build. 1996, 24, 51–64.
10. Parson, O.; Ghosh, S.; Weal, M.; Rogers, A. An unsupervised training method for non-intrusive appliance load monitoring. Artif. Intell. 2014, 217, 1–19.
11. Huber, P.; Calatroni, A.; Rumsch, A.; Paice, A. Review on Deep Neural Networks Applied to Low-Frequency NILM. Energies 2021, 14, 2390.
12. Ruzzelli, A.G.; Nicolas, C.; Schoofs, A.; Hare, G.M. Real-Time Recognition and Profiling of Appliances through a Single Electricity Sensor. In Proceedings of the 2010 7th Annual IEEE SECON, Boston, MA, USA, 21–25 June 2010.
13. Onoda, T.; Ratsch, G.; Muller, K.R. Applying Support Vector Machines and Boosting to a Non-Intrusive Monitoring System for Household Electric Appliances with Inverter. In Proceedings of the ICSC Symposium on Neural Computation, Berlin, Germany, 23–26 May 2000; pp. 1–7.
14. Tsoumakas, G.; Katakis, I.; Vlahavas, I. Mining Multi-label Data. In Data Mining and Knowledge Discovery Handbook; Springer: Boston, MA, USA, 2010; pp. 667–685.
15. Kolter, J.Z.; Johnson, M.J. REDD: A public data set for energy disaggregation research. In Proceedings of the Workshop on Data Mining Applications in Sustainability (SIGKDD), San Diego, CA, USA, 21 August 2011; pp. 59–62.
16. Kim, H.; Marwah, M.; Arlitt, M.; Lyon, G.; Zhan, J. Unsupervised Disaggregation of Low Frequency Power Measurements. In Proceedings of the 2011 SIAM International Conference on Data Mining, Mesa, AZ, USA, 28–30 April 2011; pp. 747–758.
17. Wu, Z.; Wang, C.; Peng, W.; Liu, W.; Zhang, H. Non-intrusive load monitoring using factorial hidden Markov model based on adaptive density peak clustering. Energy Build. 2021, 244, 1–12.
18. Rafiq, H.; Zhang, H.; Li, H.; Ochani, M.K. Regularized LSTM Based Deep Learning Model: First Step towards Real-Time Non-Intrusive Load Monitoring. In Proceedings of the 2018 IEEE International Conference on Smart Energy Grid Engineering (SEGE), Oshawa, ON, Canada, 12–15 August 2018; pp. 234–239.
19. Zhou, A.; Li, S.; Liu, C.; Zhu, H.; Dong, N.; Xiao, T. Non-Intrusive Load Monitoring Using a CNN-LSTM-RF Model Considering Label Correlation and Class-Imbalance. IEEE Access 2021, 9, 84306–84315.
20. Jiang, J.; Kong, Q.; Plumbley, M.D.; Gilbert, N.; Hoogendoorn, M.; Roijers, D.M. Deep Learning-Based Energy Disaggregation and On/Off Detection of Household Appliances. ACM Trans. Knowl. Discov. Data 2021, 15, 1–21.
21. Bonfigli, B.; Felicetti, A.; Principi, E.; Fagiani, M.; Squartini, S.; Piazza, F. Denoising autoencoders for Non-Intrusive Load Monitoring: Improvements and comparative evaluation. Energy Build. 2018, 158, 1461–1474.
22. Langevin, A.; Carbonneau, M.A.; Cheriet, M.; Gagnon, G. Energy disaggregation using variational autoencoders. Energy Build. 2022, 254, 1–21.
23. Buddhahai, B.; Makonin, S. A Nonintrusive Load Monitoring Based on Multi-Target Regression Approach. IEEE Access 2021, 9, 163033–163042.
24. Buddhahai, B.; Wongseree, W.; Rakkwamsuk, P. An Energy Prediction Approach for a Nonintrusive Load Monitoring in Home Appliances. IEEE Trans. Consum. Electron. 2020, 66, 96–105.
25. Xu, D.; Shi, Y.; Tsang, I.W.; Ong, Y.S.; Gong, C.; Shen, X. Survey on Multi-Output Learning. IEEE Trans. Neural Netw. Learn. Syst. 2019, 31, 2409–2429.
26. Tsoumakas, G.; Katakis, I. Multi-Label Classification: An Overview. Int. J. Data Warehous. Min. 2007, 3, 1–17.
27. Kocev, D.; Dzeroski, S.; White, M.D.; Newell, G.R.; Griffioen, P. Using single- and multi-target regression trees and ensembles to model a compound index of vegetation condition. Ecol. Model. 2009, 220, 1159–1168.
28. Borchani, H.; Varando, G.; Bielza, C.; Larranaga, P. A survey on multi-output regression. Wiley Interdiscip. Rev. Data Min. Knowl. Discov. 2015, 5, 216–233.
29. Basgalupp, M.; Cerri, R.; Schietgat, L.; Triguero, I.; Vens, A. Beyond global and local multi-target learning. Inf. Sci. 2021, 579, 508–524.
30. Madjarov, G.; Kocev, D.; Gjorgjevikj, D.; Dzeroski, S. An extensive experimental comparison of methods for multi-label learning. Pattern Recognit. 2012, 45, 3084–3104.
31. Read, J.; Pfahringer, B.; Holmes, G.; Frank, E. Classifier Chains for Multi-label Classification. In Proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases: Part II, Bled, Slovenia, 7–11 September 2009; pp. 1–16.
32. Tsoumakas, G.; Katakis, I.; Vlahavas, I. Random k-Labelsets for Multilabel Classification. IEEE Trans. Knowl. Data Eng. 2011, 23, 1079–1089.
33. Buddhahai, B.; Wongseree, W.; Rakkwamsuk, P. Multi-Circuit Electric Consumption Data for Application of Energy Disaggregation. Available online: https://data.mendeley.com/datasets/nmnk58bgtb/1 (accessed on 15 January 2023).
34. Parzen, E. On Estimation of a Probability Density Function and Mode. Ann. Math. Stat. 1962, 33, 1065–1076.
35. Pedregosa, F. Scikit-learn: Machine Learning in Python. J. Mach. Learn. Res. 2011, 12, 2825–2830.
36. Read, J.; Reutemann, P.; Pfahringer, B.; Holmes, G. Meka: A multi-label/multi-target extension to Weka. J. Mach. Learn. Res. 2016, 17, 667–671.
37. Batra, N.; Kelly, J.; Parson, O.; Dutta, H. NILMTK: An open source toolkit for non-intrusive load monitoring. In Proceedings of the 5th International Conference on Future Energy Systems, Cambridge, UK, 11–13 June 2014; pp. 265–276.
38. Makonin, S.; Popowich, F. Nonintrusive load monitoring (NILM) performance evaluation. Energy Effic. 2015, 8, 809–814.
39. Kelly, J.; Knottenbelt, W. Neural NILM: Deep Neural Networks Applied to Energy Disaggregation. In Proceedings of the 2nd ACM International Conference on Embedded Systems for Energy-Efficient Built Environments, Seoul, Republic of Korea, 4–5 November 2015; pp. 55–64.
40. Krystalakos, O.; Nalmpantis, C.; Vrakas, D. Sliding Window Approach for Online Energy Disaggregation Using Artificial Neural Networks. In Proceedings of the 10th Hellenic Conference on Artificial Intelligence, Patras, Greece, 9–12 July 2018; pp. 1–6.
41. Fang, Y.; Jiang, S.; Fang, S.; Gong, Z.; Xia, M.; Zhang, X. Non-Intrusive Load Disaggregation Based on a Feature Reused Long Short-Term Memory Multiple Output Network. Buildings 2022, 12, 1048.
42. Schubert, E.; Sander, J.; Ester, M.; Kriegel, H.P.; Xu, X. DBSCAN Revisited, Revisited: Why and How You Should (Still) Use DBSCAN. ACM Trans. Database Syst. 2017, 42, 1–21.
43. Patel, S.; Sihmar, S.; Jatain, A. A study of hierarchical clustering algorithms. In Proceedings of the 2015 2nd International Conference on Computing for Sustainable Global Development, New Delhi, India, 11–13 March 2015; pp. 537–541.
Figure 1. The procedure of experimental design for (a) the multi-target classification and (b) the multi-target regression approach.
Figure 2. Power distribution through KDE plots for appliance labels (a) LTL; (b) LTB1; (c) PGB1; (d) PMP; (e) WTB1; and (f) ANB1.
Figure 3. Power data plots of the ground truth and the estimation by the MTC model: (a) LTL; (b) LTB1; (c) PGB1; (d) PMP; (e) WTB1; and (f) ANB1.
Figure 4. Power data plots of the ground truth and the estimation by the MTR model: (a) LTL; (b) LTB1; (c) PGB1; (d) PMP; (e) WTB1; and (f) ANB1.
Table 1. Load labels under experiment.

Load Label   Power Range (kW)   Load Description
LTL          0.03–0.30          Living room lightings
LTB1         0.02–0.15          Bedroom 1 lightings
PGB1         0.06–1.60          Bedroom 1 plugging loads
PMP          0.09–1.60          Water pump
WTB1         0.01–3.20          Bedroom 1 water heater
ANB1         0.02–1.50          Bedroom 1 air-conditioner

Table 2. Power state representation of appliance labels using the KDE plot.

Load Label   Power State Representatives (kW)   Number of States
LTL          {0, 0.05, 0.25}                    3
LTB1         {0, 0.04, 0.11}                    3
PGB1         {0.02, 0.40, 1.40}                 3
PMP          {0, 0.33, 1.48}                    3
WTB1         {0, 1.20, 2.35}                    3
ANB1         {0, 1.24}                          2

Table 3. Hamming score by different MTC algorithms.

Algorithm                                Hamming Score
Class relevance (CR) + random forest     0.734 ± 0.002
Classifier chain (CC) + decision tree    0.734 ± 0.002
Classifier chain (CC) + random forest    0.734 ± 0.001
RAkELd + decision tree                   0.735 ± 0.002
RAkELd + random forest                   0.736 ± 0.001

Table 4. Model parameter selection and optimized performance for the MTC model.

Parameter                       Evaluation Set      Optimized Parameter   Optimized Hamming Score
k (RAkELd)                      {1, 2, 3, 4}        4                     0.738
numIterations (random forest)   {30, 50, 80, 100}   80

Table 5. Power state identification performance by F-score based on the MTC model.

Load Label   F-Score
LTL          0.876
LTB1         0.896
PGB1         0.940
PMP          0.950
WTB1         0.982
ANB1         0.992

Table 6. Power demand estimation accuracy associated with the MTC data.

Load Label   Power Estimation Accuracy
LTL          0.730
LTB1         0.520
PGB1         0.755
PMP          0.610
WTB1         0.710
ANB1         0.950

Table 7. Power estimation accuracy by different MTR algorithms.

Algorithm                                           Power Estimation Accuracy
Single target (ST) + ridge regressor                0.590 ± 0.003
Single target (ST) + gradient boosting regressor    0.885 ± 0.001
Single target (ST) + random forest regressor        0.912 ± 0.001
Regressor chain (RC) + ridge regressor              0.590 ± 0.003
Regressor chain (RC) + random forest regressor      0.900 ± 0.001

Table 8. Model parameter selection and optimized performance for the MTR model.

Parameter      Evaluation Set         Optimized Parameter   Optimized Power Estimation Accuracy
max_depth      {10, 20, 30, 40, 50}   20                    0.915
n_estimators   {10, 20, 30, 50, 80}   80

Table 9. Power demand estimation accuracy associated with the MTR model.

Load Label   Power Estimation Accuracy
LTL          0.848
LTB1         0.545
PGB1         0.800
PMP          0.565
WTB1         0.930
ANB1         0.950

Table 10. Power state classification performance by F-score based on the MTR model.

Load Label   F-Score
LTL          0.943
LTB1         0.878
PGB1         0.900
PMP          0.980
WTB1         0.890
ANB1         0.990

Table 11. F-score performance by the MTC and MTR approaches and the DAE network.

Load Label   MTC     MTR     DAE
LTL          0.876   0.943   0.724
LTB1         0.896   0.878   0.695
PGB1         0.940   0.900   0.665
PMP          0.950   0.980   0.780
WTB1         0.982   0.890   0.825
ANB1         0.992   0.990   0.950

Table 12. Power estimation accuracy performance by the MTC, MTR, and DAE network.

Load Label   MTC     MTR     DAE
LTL          0.730   0.848   0.650
LTB1         0.520   0.545   0.505
PGB1         0.755   0.800   0.783
PMP          0.610   0.565   0.550
WTB1         0.710   0.930   0.805
ANB1         0.950   0.950   0.925
