Hyperparameter Sensitivity Analysis of Deep Learning-Based Pipe Burst Detection Model for Multiregional Water Supply Networks

Kim, Hyeong-Suk; Choi, Dooyong; Yoo, Do-Guen; Kim, Kyoung-Pil

doi:10.3390/su142113788

by Hyeong-Suk Kim ¹, Dooyong Choi ², Do-Guen Yoo ¹,* and Kyoung-Pil Kim ²,*

¹ Department of Civil Engineering, The University of Suwon, Hwaseong-si 18323, Korea

² Water & Wastewater Research Center, K-Water Institute, K-Water, Daejeon 34350, Korea

* Authors to whom correspondence should be addressed.

Sustainability 2022, 14(21), 13788; https://doi.org/10.3390/su142113788

Submission received: 22 August 2022 / Revised: 6 October 2022 / Accepted: 21 October 2022 / Published: 24 October 2022

(This article belongs to the Special Issue Optimal Design, Operation, and Management for Sustainable Water Distribution Systems)

Abstract:

In a deep learning model, performance may vary depending on the hyperparameter settings. Despite the importance of hyperparameter determination, most previous studies on burst detection models for water supply pipe networks either adopted hyperparameters from other fields as-is or set them by experience-based trial and error, which is a limitation. This paper studies hyperparameter determination for a deep neural network (DNN)-based real-time detection model of pipe burst accidents. The pipe burst model predicts water pressure from operation data recorded at 1 min intervals, and the training data period was kept under 1 month (1, 2, and 3 weeks) to account for frequent changes in the system. A sensitivity analysis was first performed on the type of activation function and the period of the training data, whose effects may differ depending on the characteristics of the target problem. The number of hidden layers and the number of neurons in each hidden layer, which define the network structure, were then set as hyperparameters for additional sensitivity analysis. The sensitivity analysis results were derived and compared using four quantified prediction error indicators. In addition, the model running time was analyzed to evaluate the practical applicability of the developed model. The results confirmed that good performance can be expected when using a rectifier function as the activation function, 144 nodes in each hidden layer (eight times the number of nodes in the input layer), and four hidden layers. Additionally, an analysis of the training data period in terms of prediction error and running time confirmed that two weeks of data was most appropriate.
The hyperparameter values determined through the detailed sensitivity analysis were then applied to the constructed pressure prediction model, together with one week of data that included an actual burst accident, to verify the accident detection and predictive performance of the model. As in this study, a rational determination of the optimal hyperparameter settings and of the input data period used for model building is essential, because it helps ensure that the deep learning model continues to perform well in operation.

Keywords:

deep learning; pipe burst detection; multiregional water supply networks; hyperparameters; sensitivity analysis

1. Introduction

In line with the fourth industrial revolution, advances in information and communication technology (ICT), and the accelerating adoption of advanced metering infrastructure (AMI), data analysis-based studies using big data and artificial intelligence are being actively conducted [1] in the field of water supply pipe network operation and management. In particular, the development and application of low-cost measuring instruments for data acquisition offers an important opportunity to expand the number of installed flowmeters, water pressure gauges, pipeline detection sensors, water quality gauges, and the like. Water supply pipelines carry considerable risk because they are buried underground together with other facilities, so most sections cannot be inspected visually even as the pipelines continue to age. Damage to a water supply pipeline can cause large-scale water outages and delays in water supply. The multiregional water supply system in Korea is designed to provide water to more than two local governments and industrial complexes. It comprises 5400 km of large-diameter pipelines, 48 water intakes, and 43 water treatment facilities, and it transports more than 50% of the water supply volume in Korea. As construction of these facilities began in the 1960s, the aging of the pipelines increases the potential risk of pipe bursts. A pipe burst in this system can cause damage at the level of a national disaster. The monetary damage from a 24 h disruption of the industrial water service can reach approximately USD 2 billion, depending on the service type, criticality of impact, and size of the affected area [2]. Therefore, real-time pipe burst detection is important, along with preventive management through pipe condition assessment.
When a burst actually occurs, it takes time to detect it and locate the damage, and considerably more time to complete not only the restoration work but also the cleaning and water-filling work. According to an analysis of multiregional water supply pipeline accidents that occurred in Korea from 2006 to 2016, the average time taken to resume the water supply to consumers after a cut-off caused by pipe breakage was approximately 12 h [3].

As this value is an average and many uncertainties affect the time required for actual drainage, filling operations, and leakage detection, the time to resume the water supply may be longer depending on the nature of the pipe breakage accident. Accordingly, many studies in Korea and abroad are developing data analysis-based leak detection methodologies that use artificial intelligence techniques to detect leakage accidents in large-scale water supply systems. Zhou et al. [4] adapted DenseNet, a convolutional neural network (CNN) architecture, to make it suitable for monitoring pipe breakage. They presented a method for detecting pipe breakage accidents through a pressure prediction model trained on pressure patterns at the time of simulated accidents generated by a pipe network analysis model. Wang et al. [5] and Lee and Yoo [6] proposed leak accident detection models using long short-term memory (LSTM), a type of recurrent neural network (RNN), and validated the models using randomly generated accident data. Quiñones-Grueiro et al. [7] and Wu et al. [8] used deep learning techniques to propose leak detection methodologies that combine a data-based method with a model-based method. Quiñones-Grueiro et al. [7] used a deep neural network model to detect leaks and proposed a model that identifies the leak location by solving an inverse problem. Wu et al. [8] developed a leak detection methodology in the form of a system that learns from data generated by a well-calibrated pipe network analysis model using deep learning techniques.

Such deep learning and combined model-based methodologies have the advantage that the relevant measurement data are easy to secure and increasing in volume, but model performance may vary depending on the hyperparameter settings. Many hyperparameters must be considered when applying a deep learning model. They can be classified into those related to the network structure and those related to the learning algorithm. Hyperparameters related to the network structure include the number of hidden layers, the number of neurons, and the type of activation function. Proper hyperparameter settings reduce the risk of underfitting or overfitting and keep the model creation time from becoming unnecessarily long.

Despite the importance of hyperparameter determination, most previous studies on leak detection models for water supply pipe networks either adopted hyperparameters from other fields as-is or set them by experience-based trial and error [5,6], which is a limitation. Recently, Fan et al. [9] developed a leak detection model using an autoencoder neural network (AE model), which is an unsupervised model. In that study, a sensitivity analysis confirmed that the compression ratio, an important hyperparameter of the AE model, had a negative effect on the overall detection accuracy, and the corresponding values were quantified and presented. Capelo et al. [10] carried out a sensitivity analysis of different multilayer perceptron (MLP) configurations with varying numbers of neurons. They concluded that the higher the number of neurons in each layer, the better the expected results. However, a large number of neurons can cause overfitting, so careful consideration is required. In the oil and gas field, Jin et al. [11] performed a parameter tuning study related to oil well production rates, which are essential for oil and gas field development. In that study, the histogram of the measured data for the value to be predicted was reviewed in advance, and the model was applied using various deep learning structures, training functions, and transfer functions. The outputs were then compared in terms of the distributions of the measured and predicted data. Najafabadipour et al. [12] used MLP configurations to predict groundwater levels; based on the scale of the data used, they adopted two hidden layers with only eight neurons in the first hidden layer and one neuron in the second.

This study addresses hyperparameter determination for a deep neural network (DNN)-based real-time detection model of pipe burst accidents. The pipe burst model predicts water pressure from operation data recorded at 1 min intervals, and the training data period was kept under 1 month (1, 2, and 3 weeks) to account for frequent changes in the system. A sensitivity analysis was performed on the number of hidden layers, the number of neurons in each hidden layer, the type of activation function, and the period of the training data, and hyperparameter values that allow efficient prediction were determined. The accident detection performance of the model with these hyperparameter values was then evaluated using an actual pipe breakage accident event. The contribution of this study is a detailed hyperparameter sensitivity analysis for a burst detection model, which has not been the focus of other studies. In addition, the training data period, which must reflect frequent changes in the water supply system and support real-time operation of the model, was treated as one of the hyperparameters, and an optimal model update period was suggested.

2. Methodologies

2.1. Applied Deep Learning Structure for Hyperparameter Sensitivity Analysis

The structure of the deep learning-based leak detection model configured for the hyperparameter sensitivity analysis, which is the main purpose of this study, is shown in Figure 1. The pressure at a gauge was assumed to be determinable from the surrounding hydraulic sensors under steady-state equilibrium conditions, and a feed-forward neural network (FFNN) was used to build the pressure estimation model. Water level, flow rate, ambient pressure, pump operating status, and motorized valve opening data measured at 1 min intervals were used as training data to create a pressure prediction model for each pressure gauge. In other words, the predicted pressure (dependent variable) is determined by the flow rate, the ambient pressure, and the operational state of hydraulic facilities such as pump operating status and motorized valve opening (independent variables). The pressure prediction model is trained on ambient sensor data measured under normal operation, and in the pipe burst monitoring stage it estimates the pressure at a given gauge from the other nearby measurements. If a pipe burst occurs, a sudden change in flow and pressure is observed, and the model estimates a pressure that differs from the normal one. A burst can therefore be detected from the difference between the observed and estimated values.

Five quantitative performance indicators were selected to evaluate overall model performance. R-squared (R²) represents the proportion of the variance in the dependent variable that is explained by the regression model. The mean absolute error (MAE) is the average of the absolute differences between the measured and predicted values. The mean squared error (MSE) is the average of the squared differences between the measured and predicted values. The root-mean-squared error (RMSE) is the square root of the MSE. The mean absolute percentage error (MAPE) is the average of the absolute percentage errors of the predictions. Additionally, the root-mean-squared log error (RMSLE) in Equation (6) was used as a further indicator to evaluate the model sensitivity to the activation function. The indicators are defined as follows:

R^{2} = 1 - \frac{\sum_{i=1}^{N} (y_{i} - \hat{y}_{i})^{2}}{\sum_{i=1}^{N} (y_{i} - \bar{y})^{2}} \qquad (1)

\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} | y_{i} - \hat{y}_{i} | \qquad (2)

\mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} (y_{i} - \hat{y}_{i})^{2} \qquad (3)

\mathrm{RMSE} = \sqrt{\mathrm{MSE}} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (y_{i} - \hat{y}_{i})^{2}} \qquad (4)

\mathrm{MAPE} = 100 \times \frac{1}{N} \sum_{i=1}^{N} \frac{| y_{i} - \hat{y}_{i} |}{| y_{i} |} \qquad (5)

\mathrm{RMSLE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (\log (y_{i} + 1) - \log (\hat{y}_{i} + 1))^{2}} \qquad (6)

where y_{i} is the measured value and \hat{y}_{i} is the predicted value for the ith data point, \bar{y} is the mean value of the dataset, N is the number of data points, and log(x) is the natural logarithm of x, where x stands for any numeric value.
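As an illustration, Equations (1) to (6) can be evaluated with a few lines of standard-library Python. This is a minimal sketch rather than the implementation used in the study; the function name `regression_metrics` is hypothetical, and the code assumes that the measured values are non-zero (for MAPE), not all identical (for R²), and greater than -1 (for RMSLE).

```python
import math

def regression_metrics(measured, predicted):
    """Return R2, MAE, MSE, RMSE, MAPE and RMSLE as in Equations (1)-(6)."""
    if len(measured) != len(predicted) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(measured)
    # Residuals y_i - y_hat_i for every data point.
    errors = [y - y_hat for y, y_hat in zip(measured, predicted)]
    mean_y = sum(measured) / n
    ss_res = sum(e * e for e in errors)                 # numerator of Eq. (1)
    ss_tot = sum((y - mean_y) ** 2 for y in measured)   # denominator of Eq. (1)
    mse = ss_res / n
    # Squared differences of log-transformed values for Eq. (6).
    log_sq = [
        (math.log(y + 1) - math.log(y_hat + 1)) ** 2
        for y, y_hat in zip(measured, predicted)
    ]
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "MAE": sum(abs(e) for e in errors) / n,
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "MAPE": 100.0 * sum(abs(e) / abs(y) for e, y in zip(errors, measured)) / n,
        "RMSLE": math.sqrt(sum(log_sq) / n),
    }
```

For example, `regression_metrics([2.0, 4.0, 6.0], [2.5, 3.5, 6.5])` gives an MAE and RMSE of 0.5 and an R² of 0.90625.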

2.2. Applied Network and Preprocessing of Training Data

Actual data sets, including historical bursts in different metropolitan water systems, were collected. The dataset used here consisted of one week of test data that included the burst event and three weeks of training data preceding the event. The data came from a raw water supply network, and the entire system was analyzed in two sectors, as shown in Figure 2.

The target area is part of a water supply section fed by gravity flow from a water purification plant located on high ground. The pressure at P1 is therefore almost unchanged. P2 and P4 are on the branch pipes that supply domestic water to residential areas, and P3 is the pressure gauge installed directly in the branch pipe that supplies water to customers. As the operation logic of the pressure-reducing valve (PRV) installed in front of P3 is not clearly or officially disclosed, the predictions for P3 are highly uncertain. Compared to P2, P4 is installed close to the reservoir, so the pressure fluctuations caused by reservoir operation are relatively large.

The pump station is not normally operated because water is supplied through the open bypass valve; however, it is operated to maintain the pressure when the water demand in Sector 2 is 1500 m³/h or more. A total of four pumps are controlled by the number of units in operation, and two pumps are run in alternation when the pumping station is operating. To minimize the increase in prediction error caused by the temporary unsteady flow that occurs when a pump starts or stops, data on the operation status of each pump unit were included in the training.

Seven pressure gauges are installed in Sector 1. For each gauge, a DNN model trained on the other pressures, flow rates, and pump on/off data predicts the pressure at that gauge. The model then compares the predicted pressure with the measured pressure to detect whether a pipe breakage accident has occurred. When an accident occurs, the prediction error becomes larger than usual, and such anomalies appear simultaneously at two or more adjacent pressure gauges. In addition, the prediction error grows with proximity to the accident site, which allows the approximate accident location to be estimated.
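The detection logic described here can be sketched as follows. This is an illustrative sketch only: the per-gauge thresholds, the adjacency map, and the function name `detect_burst` are assumptions introduced for the example, since the text does not state how "larger than usual" is quantified.

```python
def detect_burst(measured, predicted, thresholds, adjacency, min_gauges=2):
    """Flag a burst when enough adjacent gauges show abnormal prediction errors.

    measured, predicted, thresholds: dicts mapping gauge name -> pressure value.
    adjacency: dict mapping gauge name -> set of neighbouring gauge names.
    Returns (burst_detected, gauge_with_largest_error_or_None).
    """
    # Absolute prediction error at each gauge.
    errors = {g: abs(measured[g] - predicted[g]) for g in measured}
    # Gauges whose error exceeds their (assumed) normal-operation threshold.
    flagged = {g for g, e in errors.items() if e > thresholds[g]}
    for gauge in sorted(flagged):
        # The gauge itself plus any of its neighbours that are also flagged.
        group = {gauge} | (adjacency.get(gauge, set()) & flagged)
        if len(group) >= min_gauges:
            # The largest error is taken as a rough indication of proximity.
            return True, max(sorted(group), key=errors.get)
    return False, None
```

Requiring at least two adjacent gauges to be flagged at the same time keeps a single noisy sensor from raising an alarm, and returning the gauge with the largest error gives a first estimate of where the accident occurred.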

Prior to running the model for the hyperparameter sensitivity analysis, the measurement data for the applied event were preprocessed. If false readings or missing values occurred in the time series, the measured data at that time step were deleted. In this application, nothing that could be presumed to be an extreme mismeasurement was found, but approximately 1.4% of the data were missing and 11.4% were stagnated in data transmission, as shown in Figure 3. In the figure, the areas shaded in light gray in the vertical direction indicate missing data, the rectangular dotted line indicates stagnated data, and the circled dotted line indicates negative data. Stagnated data are measurements whose value did not change over a long period of time. The 3 weeks of data (30,240 records) collected for model training were checked for mismeasurements and missing items in this way, and about 13% of the dataset was finally deleted from the training data.
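A minimal sketch of this screening for a single sensor series is given below. The function name `clean_series` and the `max_flat_run` parameter are hypothetical; in particular, the text does not specify how long a value must stay constant before it is treated as stagnated, so the default of 30 samples is only a placeholder. Because the whole time step is deleted when any reading is invalid, the indices kept for each sensor would still need to be intersected across sensors.

```python
SAMPLES_3_WEEKS = 3 * 7 * 24 * 60  # 30,240 one-minute records in 3 weeks

def clean_series(values, max_flat_run=30):
    """Return indices kept after dropping missing, negative and stagnated data.

    values: list of readings, with None marking a missing reading.
    max_flat_run: assumed longest run of identical readings still accepted.
    """
    n = len(values)
    # Drop missing (None) and negative readings first.
    keep = [v is not None and v >= 0 for v in values]
    start = 0
    while start < n:
        # Extend the run while the next reading equals the first one in the run.
        end = start
        while end + 1 < n and values[end + 1] == values[start]:
            end += 1
        run_length = end - start + 1
        # A long run of unchanged readings is treated as stagnated data.
        if values[start] is not None and run_length > max_flat_run:
            for i in range(start, end + 1):
                keep[i] = False
        start = end + 1
    return [i for i, kept in enumerate(keep) if kept]
```

With `max_flat_run=3`, the series `[1.0, 1.1, None, -0.2, 2.0, 2.0, 2.0, 2.0, 1.5]` keeps only indices 0, 1, and 8.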

2.3. Hyperparameter Sensitivity Analysis

Many hyperparameters must be considered when applying a deep learning model. They can be classified into those related to the network structure and those related to the learning algorithm. Hyperparameters related to the network structure include the number of hidden layers, the number of neurons in each layer, the type of activation function, and drop-out, while hyperparameters related to the learning algorithm include the learning rate, momentum, epochs, and batch size. Proper hyperparameter settings can prevent the model from being underfitted or overfitted and can keep the model creation time from being longer than necessary. A pipe breakage detection model must ensure reliable anomaly detection and must also reflect the latest pipe network operation status through periodic updates. Therefore, a large amount of training data cannot be used, and the model should be regenerated using the most up-to-date data. Most previous studies selected hyperparameters such as the deep learning structure, training function, and transfer function. This study additionally treated the training data period as a hyperparameter in order to improve the applicability of deep learning to this problem, in which the system changes frequently. Accordingly, it was assumed that a new pressure prediction model is regenerated every week, and a sensitivity analysis was performed on the number of hidden layers, the number of neurons in each hidden layer, the type of activation function, and the duration of the training data.

3. Application Results

3.1. Hyperparameter Sensitivity Analysis Results

About 1600 pressure prediction models are needed to establish a pipe breakage accident monitoring system for the entire metropolitan water supply system. Therefore, the hyperparameter sensitivity analysis focused on whether the model creation time for updates could be minimized while the predictive performance at each pressure gauge remained stable. The sensitivity analysis was first conducted on the activation function, the number of hidden layers, and the number of neurons. Then, using the selected model framework, a sensitivity analysis was performed on the period of training data required for one week of pressure prediction.

3.1.1. Selection of an Activation Function

The activation function was selected first in the design of the model framework. The activation function of a neural network defines how the weighted sum of the inputs is transformed into an output at a node or nodes in a network layer [13]. As the activation function has a great impact on the training process and on the capability and performance of the network, such as its convergence, it should be selected according to the type and characteristics of the problem to be solved.

To select an activation function suitable for the pipe breakage monitoring model, 3 weeks of data, the maximum data period available, were used. The number of hidden layers and the number of hidden nodes were determined in a later step; for the activation function selection step, two hidden layers [14] and a number of hidden nodes equal to 2/3 of the number of input nodes [15] were used. The activation functions analyzed were the commonly used rectifier, tanh, and maxout activation functions.
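For reference, the three candidate activation functions can be written compactly as below. This is a generic sketch of their standard definitions, not the code of the framework used in the study; the maxout unit is shown in its usual form as the maximum over several affine functions of the inputs, and the argument names are illustrative.

```python
import math

def rectifier(x):
    """Rectifier (ReLU): passes positive inputs, clips negatives to zero."""
    return max(0.0, x)

def tanh(x):
    """Hyperbolic tangent: squashes the input into the range (-1, 1)."""
    return math.tanh(x)

def maxout(inputs, weight_sets, biases):
    """Maxout unit: maximum over k affine functions of the same inputs.

    inputs: list of input values.
    weight_sets: list of k weight lists, each as long as `inputs`.
    biases: list of k bias values.
    """
    return max(
        sum(w * x for w, x in zip(weights, inputs)) + b
        for weights, b in zip(weight_sets, biases)
    )
```

The rectifier needs only a comparison per node, whereas tanh requires evaluating exponentials and maxout evaluates several affine functions per unit, which is consistent with the rectifier being the cheapest of the three to compute.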

To select the activation function of the pressure prediction model suitable for monitoring pipe breakage, 80% of the data was used for model training and 20% for hyperparameter validation. A total of 21 models were created, covering three activation functions for each of the seven pressure gauges. The test data set was used to check for underfitting (whereby the model does not work properly for the training data) and overfitting (whereby the model works well for the training data but not for the test data). A test data set should be used only after the model has been fully trained and validated with the training and validation data sets. In the performance comparison using the evaluation scales of each model, the differences were not clear-cut, as shown in Figure 4 and Figure 5, but the rectifier had the best predictive performance on the training data, while the tanh function had the best predictive performance on the validation data. To prevent overfitting, it could be deemed appropriate to select the tanh function, which showed the best predictive performance on the validation data.
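The 80/20 split and the resulting grid of 21 models can be sketched as follows. The text does not say whether the split was random or chronological, so the seeded random split here is an assumption, as are the function name `split_train_validation` and the gauge labels P1 to P7 used to enumerate the grid.

```python
import random

ACTIVATIONS = ["rectifier", "tanh", "maxout"]
GAUGES = [f"P{i}" for i in range(1, 8)]  # seven pressure gauges, P1 to P7

def split_train_validation(samples, train_fraction=0.8, seed=0):
    """Randomly split samples into training and validation subsets.

    A seeded shuffle of the indices is an assumption; the original split
    method is not described. Order within each subset is preserved.
    """
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)
    cut = int(round(train_fraction * len(samples)))
    train = [samples[i] for i in sorted(indices[:cut])]
    valid = [samples[i] for i in sorted(indices[cut:])]
    return train, valid

# One model per (activation function, pressure gauge) pair: 3 x 7 = 21.
model_grid = [(activation, gauge) for activation in ACTIVATIONS for gauge in GAUGES]
```

Each entry of `model_grid` would correspond to one trained model, evaluated on both the training and the validation subset so that the two error levels can be compared.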

Meanwhile, the model creation time was significantly shorter with the rectifier function, as shown in Figure 6. About 1600 models need to be created for real-time operation in this study. Because the model generation time of the rectifier function was shorter than that of tanh while the prediction performance did not differ significantly, the rectifier function was selected as the activation function of the pressure prediction model for pipe breakage monitoring.

For P3, P4, and P5, the prediction error was relatively large compared to the other pressure gauges. P3 and P4 were the pressure gauges installed in the branch pipe supplying water to a single consumer, as shown in Figure 3, and showed a stepwise water demand pattern. P5 was a pressure gauge installed in front of the suction well at the pumping station. The prediction performance at these gauges was judged to be lower than at the gauges on the main pipeline because the temporary hydraulic disturbances caused by the operation of consumer valves could not be learned, whereas the disturbances occurring when a pump switches on or off could be compensated for by learning the pump operation state. In particular, the prediction error was larger at P3 because of a pressure-reducing valve installed by the consumer at the front end, whose setting information was unknown.

3.1.2. Number of Neurons in the Hidden Layers

Underfitting occurs when there are too few neurons in the hidden layer to properly detect signals in complex data sets. Using too many neurons in a hidden layer can cause several problems, including overfitting, which occurs when the neural network has too much information-processing power, and the limited amount of information contained in the training set is insufficient to train all of the neurons in the hidden layer. Even if the training data are sufficient, having too many neurons in the hidden layer can increase the time taken to train the network. The amount of training time exponentially increases, which may make it impossible to properly train the neural network. Therefore, it is necessary to determine the optimal number of neurons in the hidden layer.

To determine the number of neurons in each hidden layer appropriate for the pipe breakage accident monitoring model, the pressure prediction performance and model creation time were evaluated for eight scenarios in which the number of hidden layer nodes was set to 0.5 to 16 times the number of input layer nodes, as shown in Figure 7. The evaluation used the selected rectifier function, 3 weeks of data, and two hidden layers, as in the activation function selection step, and covered all pressure gauges except P3, for which the effect of the pressure-reducing valve could not be considered.

For pressure models with relatively large prediction errors, such as P4 and P5, the prediction error decreased as the number of nodes increased. For P1, P2, P6, and P7, where the prediction error was small, the prediction error increased due to overfitting beyond a certain number of nodes, and the optimal number of nodes differed for each pressure gauge. In summary, the improvement in prediction error with the number of nodes differed by pressure gauge, and beyond 72 nodes (four times the number of input nodes) it was not clear that the prediction error decreased as the number of nodes increased.

On the other hand, the time taken to create the model tended to be proportional to the number of nodes in the remaining models except for the P2 model, as shown in Figure 8. Especially in P4 and P7, the time taken to train the model sharply increased after the number of nodes reached 144. In P2, P5, and P7, the models were shown to be generated in a shorter time with the number of nodes at 144 than with the previous smaller node number of 108. Therefore, the number of nodes in the hidden layer most suitable for pipe breakage monitoring was determined to be 144 (eight times the number of nodes in the input layer), which can be considered the optimum number for stable prediction performance and the time required for model generation.
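The node counts in these scenarios follow directly from the multiplier applied to the input layer size, as the short sketch below shows. The input layer size of 18 is inferred from the statements that 72 and 144 nodes are four and eight times the number of input nodes. Only the range of 0.5 to 16 and the 72, 108, and 144 node cases appear in the text, so the remaining multipliers in the list are assumptions chosen to give eight scenarios.

```python
INPUT_NODES = 18  # inferred: 144 hidden nodes = 8 x input nodes

# Assumed multipliers; only 0.5, 4, 6, 8 and 16 are implied by the text.
MULTIPLIERS = [0.5, 1, 2, 4, 6, 8, 12, 16]

def hidden_node_scenarios(input_nodes, multipliers):
    """Return the hidden-layer node count for each multiplier."""
    return [int(round(m * input_nodes)) for m in multipliers]

scenarios = hidden_node_scenarios(INPUT_NODES, MULTIPLIERS)
```

Under these assumptions the scenarios are 9, 18, 36, 72, 108, 144, 216, and 288 nodes per hidden layer, with the selected value of 144 sitting just before the point at which training time was reported to rise sharply for P4 and P7.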

3.1.3. Number of Hidden Layers

Using the previously determined activation function and number of hidden layer nodes, a sensitivity analysis was performed on the number of hidden layers. The model performance and the required time were evaluated by creating deep neural networks with one to five hidden layers.

The evaluation results are shown in Figure 9, and detailed results are given in Appendix A. On the four evaluation scales, the prediction error clearly tended to decrease as the number of layers increased, as shown in Figure 9. However, the prediction error tended to increase due to overfitting from the fifth layer.

Meanwhile, the model creation time tended to increase with the number of layers, as shown in Figure 10. However, the reduction in prediction error with the number of layers was clearer than in the activation function selection and node number selection stages, so the number of layers suitable for the accident monitoring model was determined to be four, in consideration of the reliability of the pressure prediction.

3.1.4. Size of Training Dataset

As mentioned before, the physical boundaries of the system frequently fluctuate to allow a smooth water supply and the replacement of old facilities in metropolitan water systems. Therefore, past data with different physical boundary conditions cannot be used as training data for accident detection. In this study, the minimum unit of data set for model training was set to one week, considering both weekday and weekend water demand patterns. The accident detection model created for pipe breakage accident monitoring was only used for one week and then regenerated as a new model trained with the latest data.

An analysis was performed to select an appropriate period of training data required for pressure prediction. The period of maximum training data used for the analysis was 3 weeks, and through reducing the amount of training data to 2 weeks and 1 week, the prediction performances of the generated models were compared as follows.
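The weekly regeneration with a sliding training window can be sketched as follows. This is a simplified illustration, assuming one sample per minute with no gaps; the function name `training_window` and its arguments are hypothetical.

```python
MINUTES_PER_WEEK = 7 * 24 * 60  # 10,080 one-minute samples per week

def training_window(series, weeks, now_index):
    """Return the most recent `weeks` of samples ending just before now_index.

    series: list of one-minute samples in chronological order.
    weeks: length of the training period (1, 2 or 3 in this study).
    now_index: index of the first sample of the week to be predicted.
    """
    length = weeks * MINUTES_PER_WEEK
    start = max(0, now_index - length)
    return series[start:now_index]
```

With 1, 2, and 3 weeks of data, the window holds 10,080, 20,160, and 30,240 samples before preprocessing, and it slides forward by one week each time the model is regenerated.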

As mentioned above, the error for P3 was not considered when deriving the optimal parameters because the operation logic of the PRV installed in front of P3 is not clearly or officially disclosed; only the result is presented. As shown in Figure 11, all pressure gauges except P4 showed similar predictive performance in the three models. For P4, the prediction error was large when the model was trained with data from only the most recent week. P4 is a pressure gauge affected by the water-taking pattern of a single consumer, and it was confirmed that additional information, such as valve opening/closing information, is needed to learn such highly volatile water-taking patterns.

The model creation time was greatest when the training was conducted with 3 weeks of data, as shown in Figure 12; detailed results are given in Appendix B. In this study, two weeks of data was selected as the appropriate training period, giving the highest priority to adaptability to changes in operational management status and to model creation time.

The results confirmed that good performance can be expected when using a rectifier function as the activation function, 144 nodes in each hidden layer (eight times the number of nodes in the input layer), and four hidden layers. Lee and Yoo [6] suggested a model consisting of three layers with 128, 64, and 48 nodes, respectively. Capelo et al. [10] concluded that 50, 45, and 50 nodes for three layers gave better results, while also noting that the higher the number of neurons in each layer, the better the expected results. Compared with the parameter values suggested in these two studies, a deeper network (four hidden layers) with a larger number of nodes (144) showed better results in this study. A direct comparison is difficult because the data size and the input and output factors differ. However, this study performed a detailed sensitivity analysis of more hyperparameters, which was not the focus of other studies. In particular, the training data period, which must reflect frequent changes in the water supply system, was treated as one of the hyperparameters, and an optimal model update period was suggested.
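To give a sense of the size of the selected structure, the number of trainable weights and biases can be worked out directly. The sketch below assumes a fully connected feed-forward network with bias terms, a single output node for the predicted pressure, and 18 input nodes; the input count is not stated explicitly and is inferred from 144 being eight times the input layer size. The function name `count_parameters` is hypothetical.

```python
def count_parameters(layer_sizes):
    """Count weights and biases of a fully connected feed-forward network.

    layer_sizes: node counts from the input layer to the output layer.
    Each layer with `b` nodes fed by `a` nodes has a * b weights and b biases.
    """
    return sum((a + 1) * b for a, b in zip(layer_sizes, layer_sizes[1:]))

# Assumed structure: 18 inputs, four hidden layers of 144 nodes, 1 output.
selected_structure = [18, 144, 144, 144, 144, 1]
n_parameters = count_parameters(selected_structure)
```

Under these assumptions the network has 65,521 parameters: 2,736 between the input and the first hidden layer, 20,880 for each of the three hidden-to-hidden connections, and 145 for the output node.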

3.2. Model Application Results for the Optimized Hyperparameter

By applying the hyperparameter values determined through detailed sensitivity analysis and by applying the data from one week, including accidents, to the constructed pressure prediction model, the accident detection and predictive performance of the model were verified. The accident detection was obtained through a sudden increase in the prediction error that differed from usual, and the model prediction performance was analyzed using the prediction results from 1/22~23 days before the accident. As represented by the black dotted line in Figure 13, on January 24 at 15:17 when an accident happened, a sudden increase in the prediction error, differing from usual, was observed in all six pressure gauges being monitored, confirming that an accident had been detected.

Table 1 shows the prediction errors for 3 days up to 1/24, the day before the accident. The R2 of P1 and P7 was about 0.6, which is relatively low. P1 and P7 showed almost no pressure fluctuation with water demand, so the correlation between the predicted and measured pressure was low under normal operating conditions. However, because this study does not consider the sequence of the time series data, the percentage error alone is a reasonable basis for evaluation: the MAPE (mean absolute percentage error) was less than 1% for every pressure prediction model except P4, confirming that the pressure was predicted very accurately. P2 and P4 are pressure gauges installed as accessories to the flowmeters that measure the flow rate supplied to the drainage of the local municipality. For accident detection, a pressure gauge should be installed at a point where the hydraulic effect of controlling facilities such as pumps and electric valves is small. As P4 is installed in an inclined pipe adjacent to the drainage basin, it shows a rather large pressure error due to valve control. On the other hand, P2 is far from the water reservoir and is installed on flat ground.
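The contrast between a low R2 and a small MAPE at gauges with almost no pressure fluctuation can be reproduced with a small synthetic example. The sketch below uses the standard definitions of R2 and MAPE; the pressure values are invented for illustration and are not taken from the study.

```python
def r2_score(measured, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    m = sum(measured) / len(measured)
    ss_res = sum((y - p) ** 2 for y, p in zip(measured, predicted))
    ss_tot = sum((y - m) ** 2 for y in measured)
    return 1.0 - ss_res / ss_tot

def mape(measured, predicted):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs((y - p) / y)
                       for y, p in zip(measured, predicted)) / len(measured)

# Synthetic, nearly constant pressure series (arbitrary units).
flat_measured = [8.00, 8.02, 7.98, 8.01, 7.99]
flat_predicted = [8.01, 8.00, 8.00, 8.00, 8.00]

flat_r2 = r2_score(flat_measured, flat_predicted)
flat_mape = mape(flat_measured, flat_predicted)
print(f"R2 = {flat_r2:.3f}, MAPE = {flat_mape:.3f}%")
```

Because the measured series barely varies, its total variance is tiny, so even errors of a fraction of a percent are comparable to that variance and pull R2 down, while MAPE stays well below 1%.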

The values shown in the heatmap were calculated with the estimation error formula in Equation (7). Figure 14 is a heatmap showing the prediction error per minute (MAPE), arranged by water flow direction and locational proximity. As shown in Figure 14, when a pipe breakage accident occurred, the pressure gauges adjacent to the accident location predicted a pressure higher than the measured pressure because of an unknown flow rate that the model had not been trained on. Therefore, (-) prediction errors were related to the occurrence of accidents but had no direct relationship with the accident location. In addition, the prediction error became larger closer to the accident point, and accordingly, the approximate location of the accident could be estimated.

$$\text{Estimated error}\ (\%) = \frac{P_{e_{i}} - P_{m_{i}}}{P_{m_{i}}} \times 100 \qquad (7)$$

where $P_{e_{i}}$ is the predicted pressure of pressure gauge $i$, and $P_{m_{i}}$ is the measured pressure of pressure gauge $i$.
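As a worked illustration of Equation (7), the sketch below evaluates the signed estimated error for a few gauges at one time step. The pressure readings are hypothetical values chosen for easy arithmetic, and the function name is an assumption for this example.

```python
def estimated_error_percent(predicted, measured):
    """Equation (7): signed error of the predicted pressure, in percent."""
    return (predicted - measured) / measured * 100.0

# Hypothetical (predicted, measured) pressures for one time step.
readings = {
    "P1": (8.00, 8.00),
    "P4": (5.25, 5.00),
    "P6": (3.80, 4.00),
}

errors = {gauge: estimated_error_percent(p_e, p_m)
          for gauge, (p_e, p_m) in readings.items()}

for gauge, err in errors.items():
    print(f"{gauge}: {err:+.2f}%")
```

Evaluating this for every gauge at every minute gives one row of values per gauge, which is the kind of matrix a heatmap such as Figure 14 displays.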

4. Conclusions

To prevent damage due to water outages in a water supply system, it is essential to develop tools such as a real-time burst detection model. In most previous studies, a burst detection model was developed using a well-calibrated pipe network analysis model or a virtual data-based analysis model. These analysis models can perform very well in certain situations, but their prediction errors can grow rapidly when the constantly changing structure of the water supply system and the supply–demand relationship must be reflected. Therefore, it is important to set appropriate parameters, check the performance of the model, and determine a period of analysis data that is practically effective.

Most of the previous literature selected hyperparameters such as the deep learning structure, training function, and transfer function. This study additionally considered the training data period as a hyperparameter, with the aim of increasing the applicability of deep learning to a problem in which the system changes frequently. The deep learning hyperparameters of a deep neural network (DNN)-based real-time detection model for pipe breakage accidents were determined using short-term data. A sensitivity analysis was performed on the number of hidden layers and the number of neurons in each hidden layer, which define the structure of the deep learning network, as well as on the type of activation function and the duration of the training data. Hyperparameter values with which efficient predictions could be made were determined, and the accident detection performance of the model with these values was evaluated using an actual pipe breakage accident. The sensitivity analysis results were derived and compared using four quantified prediction error indicators: the coefficient of determination (R2), mean absolute error (MAE), mean square error (MSE), and root-mean-square error (RMSE). The model running time was also analyzed to evaluate the practical applicability of the developed model. For the burst detection models constructed, it was confirmed that excellent results could be expected when using a rectifier function as the activation function and four hidden layers of 144 nodes each, which is eight times the number of nodes in the input layer. The contribution of this study is that it suggests the optimal training data period of the deep learning burst detection model, reflecting the frequent changes in water supply systems and supporting real-time operation of the model.
By analyzing the appropriate period of training data required for pressure prediction in terms of prediction error and running time, it was confirmed that using two weeks of data was most appropriate. A rational determination of the input data period, alongside optimal hyperparameter setting and model building, was also deemed necessary to ensure that the deep learning model remains effective in continued operation.
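For orientation, the number of training records implied by each candidate data period can be worked out directly from the 1 min data interval. The sketch below assumes a complete record with no missing or stagnated values, which real data would not satisfy exactly.

```python
MINUTES_PER_WEEK = 7 * 24 * 60  # 1 min sampling interval

# Upper bound on records per gauge, assuming no missing or stagnated data.
samples_by_weeks = {weeks: weeks * MINUTES_PER_WEEK for weeks in (1, 2, 3)}

for weeks, n_samples in samples_by_weeks.items():
    print(f"{weeks} week(s): up to {n_samples} records per gauge")
```

The training set grows linearly with the data period, so the choice of two weeks trades a longer running time than one week against the error levels reported in Appendix B.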

Author Contributions

Conceptualization, H.-S.K., K.-P.K. and D.-G.Y.; methodology, H.-S.K. and K.-P.K.; investigation, H.-S.K. and D.C.; formal analysis, H.-S.K. and D.-G.Y.; visualization, K.-P.K.; writing—original draft, H.-S.K.; writing—review and editing, D.-G.Y. and K.-P.K.; and supervision, D.-G.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Korea Environment Industry & Technology Institute (KEITI) through the Intelligent Management Program for Urban Water Resources Project, funded by the Korea Ministry of Environment (MOE) (2019002950002).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Evaluation Metric Value of Validation Data According to the Number of Hidden Layers

Number of layers	P1				P2				P4
Number of layers	R2	MAE	MSE	RMSE	R2	MAE	MSE	RMSE	R2	MAE	MSE	RMSE
[144]	0.851	0.018	0.001	0.031	0.925	0.009	0.000	0.016	0.808	0.121	0.035	0.186
[144,144]	0.883	0.016	0.001	0.028	0.928	0.007	0.000	0.017	0.929	0.077	0.013	0.114
[144,144,144]	0.929	0.015	0.000	0.022	0.906	0.009	0.000	0.019	0.939	0.069	0.011	0.106
[144,144,144,144]	0.967	0.010	0.000	0.015	0.938	0.006	0.000	0.015	0.939	0.065	0.011	0.105
[144,144,144,144,144]	0.967	0.011	0.000	0.015	0.935	0.007	0.000	0.016	0.654	0.144	0.063	0.252
Number of layers	P5				P6				P7
Number of layers	R2	MAE	MSE	RMSE	R2	MAE	MSE	RMSE	R2	MAE	MSE	RMSE
[144]	0.881	0.039	0.010	0.098	0.970	0.012	0.001	0.027	0.900	0.009	0.000	0.012
[144,144]	0.877	0.032	0.010	0.098	0.994	0.005	0.000	0.012	0.940	0.007	0.000	0.010
[144,144,144]	0.880	0.028	0.009	0.095	0.985	0.007	0.000	0.020	0.929	0.008	0.000	0.011
[144,144,144,144]	0.870	0.031	0.010	0.100	0.997	0.005	0.000	0.009	0.950	0.006	0.000	0.009
[144,144,144,144,144]	0.886	0.029	0.009	0.094	0.993	0.005	0.000	0.013	0.930	0.007	0.000	0.010
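One simple way to read the table above is to count, for each network depth, the number of gauges at which that depth reaches the highest validation R2. The sketch below does this using the R2 columns copied from the table; counting ties for every tied depth is an assumption of this illustration, not a procedure stated in the study.

```python
# Validation R2 from Appendix A, keyed by number of hidden layers (144 nodes each).
r2_by_depth = {
    1: {"P1": 0.851, "P2": 0.925, "P4": 0.808, "P5": 0.881, "P6": 0.970, "P7": 0.900},
    2: {"P1": 0.883, "P2": 0.928, "P4": 0.929, "P5": 0.877, "P6": 0.994, "P7": 0.940},
    3: {"P1": 0.929, "P2": 0.906, "P4": 0.939, "P5": 0.880, "P6": 0.985, "P7": 0.929},
    4: {"P1": 0.967, "P2": 0.938, "P4": 0.939, "P5": 0.870, "P6": 0.997, "P7": 0.950},
    5: {"P1": 0.967, "P2": 0.935, "P4": 0.654, "P5": 0.886, "P6": 0.993, "P7": 0.930},
}

gauges = list(r2_by_depth[1])
wins = {depth: 0 for depth in r2_by_depth}
for gauge in gauges:
    best = max(r2_by_depth[depth][gauge] for depth in r2_by_depth)
    for depth in r2_by_depth:
        if r2_by_depth[depth][gauge] == best:  # ties count for every tied depth
            wins[depth] += 1

for depth, count in wins.items():
    print(f"{depth} hidden layer(s): best R2 at {count} of {len(gauges)} gauges")
```

The table also shows that the R2 at P4 drops from 0.939 with four layers to 0.654 with five, so adding depth beyond four layers does not improve every gauge.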

Appendix B. Evaluation Metric Value According to Training Dataset Size

Duration of Training Data	Classification	R2	MAE	MSE	RMSE
the last 1 week	P1	0.9707	0.0104	0.0002	0.0152
	P2	0.9265	0.0060	0.0001	0.0118
	P4	0.5904	0.1927	0.0780	0.2793
	P5	0.8799	0.0379	0.0110	0.1050
	P6	0.9885	0.0085	0.0004	0.0189
	P7	0.9094	0.0084	0.0001	0.0121
the last 2 weeks	P1	0.9798	0.0087	0.0002	0.0128
	P2	0.9098	0.0071	0.0002	0.0132
	P4	0.9464	0.0661	0.0102	0.1008
	P5	0.8885	0.0345	0.0104	0.1019
	P6	0.9961	0.0051	0.0001	0.0111
	P7	0.9444	0.0065	0.0001	0.0092
the last 3 weeks	P1	0.9615	0.0108	0.0003	0.0160
	P2	0.9628	0.0060	0.0001	0.0115
	P4	0.9577	0.0573	0.0077	0.0875
	P5	0.8846	0.0288	0.0090	0.0951
	P6	0.9935	0.0060	0.0002	0.0132
	P7	0.9281	0.0070	0.0001	0.0102
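The table above can be summarized by averaging R2 over the six gauges for each training data period. The sketch below uses the R2 values copied from the table; the unweighted mean is an assumption of this illustration and is not a statistic reported in the study.

```python
# R2 from Appendix B, keyed by training data period in weeks.
r2_by_weeks = {
    1: {"P1": 0.9707, "P2": 0.9265, "P4": 0.5904, "P5": 0.8799, "P6": 0.9885, "P7": 0.9094},
    2: {"P1": 0.9798, "P2": 0.9098, "P4": 0.9464, "P5": 0.8885, "P6": 0.9961, "P7": 0.9444},
    3: {"P1": 0.9615, "P2": 0.9628, "P4": 0.9577, "P5": 0.8846, "P6": 0.9935, "P7": 0.9281},
}

mean_r2 = {weeks: sum(values.values()) / len(values)
           for weeks, values in r2_by_weeks.items()}

for weeks, value in mean_r2.items():
    print(f"last {weeks} week(s): mean R2 = {value:.4f}")
```

The mean R2 rises clearly from one to two weeks, driven mostly by P4, and changes only slightly from two to three weeks. The error metrics alone therefore do not separate two and three weeks; the selection of two weeks in the study also rests on the running time shown in Figure 12, which is not reproduced here.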

References

1. GWP. Brochure “Water 4.0.” 2016. Available online: http://www.germanwaterpartnership.de/fileadmin/pdfs/gwp_materialien/GWP_Brochure_Water_4.0.pdf (accessed on 18 August 2017).
2. Bae, C.H.; Kim, J.H.; Kim, K.P.; Koo, D. Introduction to K-water’s Research and Development Strategy for Advanced Water Pipe Network System Inspection, Monitoring, and Assessment Technology. In International Conference on Advanced Engineering Theory and Applications; Springer: Cham, Switzerland, 2016; pp. 335–343.
3. Shin, Drainage Coverage Model for Efficiently Discharging Water from the Transmission Line in Case of Pipeline Breakage. Ph.D. Thesis, Chungnam National University, Daejeon, Korea, 2018.
4. Zhou, X.; Tang, Z.; Xu, W.; Meng, F.; Chu, X.; Xin, K.; Fu, G. Deep Learning Identifies Accurate Burst Locations in Water Distribution Networks. Water Res. 2019, 166, 115058.
5. Wang, X.; Guo, G.; Liu, S.; Wu, Y.; Xu, X.; Smith, K. Burst Detection in District Metering Areas Using Deep Learning Method. J. Water Resour. Plann. Manag. 2020, 146, 04020031.
6. Lee, C.W.; Yoo, D.-G. Development of Leakage Detection Model and Its Application for Water Distribution Networks Using RNN-LSTM. Sustainability 2021, 13, 9262.
7. Quiñones-Grueiro, M.; Milián, M.A.; Rivero, M.S.; Neto, A.J.S.; Llanes-Santiago, O. Robust leak localization in water distribution networks using computational intelligence. Neurocomputing 2021, 438, 195–208.
8. Wu, Z.Y.; Chew, A.; Meng, X.; Cai, J.; Pok, J.; Kalfarisi, R.; Lai, K.C.; Hew, S.F.; Wong, J.J. Data-driven and model-based framework for smart water grid anomaly detection and localization. AQUA—Water Infrastruct. Ecosyst. Soc. 2022, 71, 31–41.
9. Fan, X.; Zhang, X.; Yu, X. Machine learning model and strategy for fast and accurate detection of leaks in water supply network. J. Infrastruct. Preserv. Resil. 2021, 2, 10.
10. Capelo, M.; Brentan, B.; Monteiro, L.; Covas, D. Near–real time burst location and sizing in water distribution systems using artificial neural networks. Water 2021, 13, 1841.
11. Jin, M.; Liao, Q.; Patil, S.; Abdulraheem, A.; Al-Shehri, D.; Glatz, G. Hyperparameter Tuning of Artificial Neural Networks for Well Production Estimation Considering the Uncertainty in Initialized Parameters. ACS Omega 2022, 7, 24145–24156.
12. Najafabadipour, A.; Kamali, G.; Nezamabadi-Pour, H. Application of Artificial Intelligence Techniques for the Determination of Groundwater Level Using Spatio–Temporal Parameters. ACS Omega 2022, 7, 10751–10764.
13. Brownlee, J. Deep Learning with Python: Develop Deep Learning Models on Theano and TensorFlow using Keras; Machine Learning Mastery: Melbourne, VIC, Australia, 2016.
14. Lippmann, R. An introduction to computing with neural nets. IEEE Assp. Mag. 1987, 4, 4–22.
15. Karsoliya, S. Approximating number of hidden layer neurons in multiple hidden layer BPNN architecture. Int. J. Eng. Trends Technol. 2012, 3, 714–717.

Figure 1. Designed structure of deep neural networks.

Figure 2. The applied water supply system sensor network and information.

Figure 3. Preprocessing results of training data for 3 weeks (number of missing or stagnated data). * In the case of point 5, the flow meter is not installed; only the water pressure sensor is in operation.

Figure 4. Evaluation model metrics of training data.

Figure 5. Evaluation model metrics of validation data.

Figure 6. Comparison of duration (sec) for model training.

Figure 7. Evaluation metrics of validation data according to the number of nodes in the hidden layers. (a) R² (R squared), (b) MAE (mean absolute error), (c) MSE (mean squared error), and (d) RMSE (root-mean-squared error).

Figure 8. Duration according to the number of nodes in the hidden layers.

Figure 9. Evaluation metric graph of validation data according to the number of hidden layers. (a) R² (R squared), (b) MAE (mean absolute error), (c) MSE (mean squared error), and (d) RMSE (root-mean-squared error).

Figure 10. Duration according to the number of hidden layers.

Figure 11. Evaluation metric graph according to training dataset size.

Figure 12. Duration according to training dataset size.

Figure 13. Time series for measured and estimated pressure.

Figure 14. Heatmap of estimated pressure error.

Table 1. Time series for measured and estimated pressure.

Model	Detecting	R2	MAE	MSE	RMSE	Mean	MAPE (%)
P1	O	0.6009	0.0526	0.0042	0.0644	0.0598	0.6591
P2	O	0.9194	0.0159	0.2572	0.0006	0.0159	0.2572
P4	O	0.7033	0.2264	5.8826	0.1370	0.2264	5.8826
P5	O	0.9950	0.0098	0.2055	0.0003	0.0098	0.2055
P6	O	0.9790	0.0161	0.0010	0.0321	0.0179	0.3612
P7	O	0.6746	0.0219	0.4495	0.0008	0.0219	0.0219

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

