Article

Design of a Soft Sensor Based on Long Short-Term Memory Artificial Neural Network (LSTM) for Wastewater Treatment Plants

by
Roxana Recio-Colmenares
1,
Elizabeth León Becerril
1,*,
Kelly Joel Gurubel Tun
2,* and
Robin F. Conchas
3
1
Environmental Technology Department, Centro de Investigación y Asistencia en Tecnología y Diseño del Estado de Jalisco, A.C., Av. Normalistas 800, Colinas de la Normal, Guadalajara 44270, Jalisco, Mexico
2
School of Engineering and Technological Innovation, University of Guadalajara, Campus Tonalá, Tonalá 45425, Jalisco, Mexico
3
Electrical Engineering Department, Research Center and Advanced Studies of Instituto Politécnico Nacional (CINVESTAV), Unidad Guadalajara, Av. del Bosque 1145, El Bajío, Zapopan 45017, Jalisco, Mexico
*
Authors to whom correspondence should be addressed.
Sensors 2023, 23(22), 9236; https://doi.org/10.3390/s23229236
Submission received: 20 October 2023 / Revised: 11 November 2023 / Accepted: 13 November 2023 / Published: 17 November 2023
(This article belongs to the Special Issue Advanced Intelligent Sensor Based on Deep Learning)

Abstract

Assessment of wastewater effluent quality in terms of physicochemical and microbial parameters is a difficult task; therefore, an online method which combines the variables and represents a final value as the quality index could be used as a useful management tool for decision makers. However, conventional measurement methods often have limitations, such as time-consuming processes and high associated costs, which hinder efficient and practical monitoring. Therefore, this study presents an approach that underscores the importance of using long short-term memory (LSTM) networks to enhance monitoring capabilities within wastewater treatment plants (WWTPs). The use of LSTM networks for soft sensor design is presented as a promising solution for accurate variable estimation to quantify effluent quality using the total chemical oxygen demand (TCOD) quality index. For the realization of this work, we first generated a dataset that describes the behavior of the activated sludge system in discrete time. Then, we developed a deep LSTM network structure as a basis for formulating the LSTM-based soft sensor model. The results demonstrate that this structure produces high-precision predictions for the concentrations of the soluble (X_1) and solid (X_2) substrates in the wastewater treatment system. After hyperparameter optimization, the predictive capacity of the proposed model is enhanced, with average values of the performance metrics, mean square error (MSE), coefficient of determination (R²), and mean absolute percentage error (MAPE), of 23.38, 0.97, and 1.31 for X_1, and 9.74, 0.93, and 1.89 for X_2, respectively. According to the results, the proposed LSTM-based soft sensor can be a valuable tool for determining the effluent quality index in wastewater treatment systems.

1. Introduction

In nonlinear systems, such as biological ones, complex variables crucial for determining the quality of wastewater often prove challenging to measure in real time due to the presence of external disturbances and the nonlinear phenomena of these processes. Within this context, the importance lies in the design of digital sensors aimed at identifying variables that are hard to measure in biological processes, with a specific focus on wastewater treatment plants [1]. This approach plays an essential role in decision making for optimal operation of the process, offering practical and cost-effective alternatives to expensive or impractical conventional measurement devices. The implementation of these sensors not only brings economic benefits but also has a positive impact on the environment. In contrast to hardware sensors, digital detection techniques offer notable advantages, including delay-free estimation, low cost, simple maintenance, and high resistance to interferences [2].

Taking into account the modeling methodologies, digital sensor models can be classified into three groups: first principles models, data-based models, and hybrid models. First principles models are complex and require significant computational resources, making data-based models the preferred option. The latter incorporate a variety of techniques such as support vector regression (SVR), artificial neural networks (ANN), Bayesian regression learning (BRL), Gaussian process regression (GPR), kernel ridge regression (KRR), Kalman filters (KF), partial least squares regression (PLS), and ensemble learning [3,4,5,6]. While these have proven effective in modeling complex processes in digital sensors [7,8], the main challenge remains the handling of unlabeled data and model generalization. Conventional modeling methods for digital sensors are not ideal for addressing large datasets, unlabeled data, and extensive industrial samples, making it difficult to obtain stable and reliable results [9]. In recent years, the widespread use of deep learning has been crucial in various fields, such as speech recognition, computer vision, natural language processing, and bioinformatics. Pretrained deep neural networks have proven to be a promising solution for extracting latent variables, significantly improving adaptability compared to traditional methods [10,11]. Furthermore, the computational efficiency of digital sensors is crucial for their successful implementation in industrial environments [12,13].

In this context, LSTM neural network models are a promising approach for time series forecasting and prediction compared to other deep neural network structures, since LSTMs are specifically designed to handle sequences of data, making them suitable for time series prediction. They can capture long-term dependencies in the data, allowing them to model relationships over extended time horizons, which is often a challenge for traditional feed-forward neural networks (FFNNs). LSTMs can process sequences of varying lengths, adapting to the specific context of each sequence, while some other deep neural network models require fixed-length input. In contrast to traditional recurrent neural networks (RNNs), which can face the vanishing gradient problem, making it difficult for them to capture long-term dependencies, LSTMs are designed to mitigate this issue through their gating mechanisms, allowing for more stable training and improved long-term performance [14].
Additionally, LSTMs can effectively handle noisy data and are robust to variations in data quality, making them suitable for real-world scenarios. LSTMs have been successfully applied to a wide range of time series forecasting tasks, including weather forecasting [14], financial forecasting [15], and stock price and energy consumption prediction [16]. Their versatility and performance have made them a popular choice in these domains. The importance of using LSTM neural networks compared to traditional techniques lies in their ability to effectively model and predict relevant states in complex systems. By leveraging their ability to capture long-term dependencies in the data and handle sequences of varying lengths, LSTMs overcome the limitations of traditional techniques, especially in the context of time series prediction in bioprocesses. The use of LSTMs offers a promising and robust solution for online variable prediction, which has significant implications for improving efficiency and performance in a wide range of industrial and wastewater treatment applications.
In this work, an LSTM-based soft sensor approach to predict substrate concentrations for evaluating the effluent quality in wastewater treatment plants is proposed. The selection of the deep LSTM network architecture and the configuration of hyperparameters result from a systematic exploration of parameter values, guided by empirical experimentation and prior research in the field. They represent a trade-off between the prediction quality of the LSTM model and computational efficiency, tailored to our specific problem context.

2. Materials and Methods

2.1. Wastewater Treatment Plant Description

The treatment process is realized in a real small-sized plant consisting of an aeration tank with a working volume of 2000 m³, mechanical aerators which provide oxygen (k_La = 4.5 h⁻¹) and mix the incoming wastewater, and a settler from which solids are either recirculated to the aeration tank (D_r) or extracted from the system (εD). The average influent flow D is about 3000 m³/day, the average chemical oxygen demand input (COD_in) is 320 mg/L, and the total nitrogen input (TN_in) is 30 mg/L after pretreatment. The operational conditions used in this process are based on those given by [17]. The treatment plant is shown schematically in Figure 1. The Activated Sludge Model No. 1 (ASM1) is used to describe the biochemical transformation processes in the suspended-growth treatment reactor for chemical oxygen demand (COD) removal [18]. A reduced model is represented by Equations (1)–(6), composed of ordinary differential equations and nonlinear kinetic functions which bear resemblance to those explored in the studies referenced in [17,18]. The characterization of wastewater and the estimation of parameter values were made according to [18], and the reduced model was validated in a previous work [19]. The fitted model provides a satisfactory understanding of the transformation process leading to COD removal. In this work, the data needed to train and test the LSTM network architectures used in the soft sensor model were generated by simulating the reduced ASM1 model. The main objective of wastewater treatment plants is to improve the effluent quality; therefore, we quantify effluent quality using TCOD as the performance index. For example, for urban wastewater, the maximum specified concentration of COD leaving a small-sized wastewater treatment plant is COD_max = 150 mg/L [19]. The TCOD is given by Equation (1), composed of the easily biodegradable soluble substrate X_{1,k}, the slowly decomposing solid substrate component X_{2,k}, and the inert organic material I_s. The latter reflects the constant value of the inflow.
TCOD = X_{1,k} + X_{2,k} + I_s    (1)
X_{1,k+1} = D (X_{1,in} − X_{1,k}) − (1/Y_H) μ_1 μ_{max,H} (μ_3 + μ_4 μ_6 η_g) X_{3,k} + μ_7 k_h (μ_3 + μ_4 μ_6 η_h) X_{3,k}    (2)
X_{2,k+1} = D (X_{2,in} − X_{2,k}) + D_r (b − 1) X_{2,k} + (1 − f_p)(b_H X_{3,k} + b_A X_{4,k}) − μ_7 k_h (μ_3 + μ_4 μ_6 η_h) X_{3,k}    (3)
X_{3,k+1} = D (X_{3,in} − X_{3,k}) + D_r (b − 1) X_{3,k} + μ_1 μ_3 μ_{max,H} X_{3,k} + μ_1 μ_4 μ_6 μ_{max,H} η_g X_{3,k} − b_H X_{3,k}    (4)
X_{4,k+1} = D (X_{4,in} − X_{4,k}) + D_r (b − 1) X_{4,k} + μ_2 μ_5 μ_{max,A} X_{4,k} − b_A X_{4,k}    (5)
X_{5,k+1} = D (X_{5,in} − X_{5,k}) + K_La (X_{5,max} − X_{5,k}) − ((1 − Y_H)/Y_H) μ_1 μ_3 μ_{max,H} X_{3,k} − ((4.57 − Y_A)/Y_A) μ_2 μ_5 μ_{max,A} X_{4,k}    (6)
where X_{3,k} is the active heterotrophic particulate biomass, X_{4,k} is the active autotrophic particulate biomass, and X_{5,k} is the soluble oxygen. The kinetic and stoichiometric parameters are detailed in Appendix A, Table A1 and Table A2, respectively.
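Although the dataset of this work is generated by simulating this reduced model, the exact implementation is not reproduced here. The following minimal Python sketch only illustrates the structure of such a discrete-time simulation loop: the update rule is a trivial stand-in for Equations (2)–(6), the inflow values are taken from Table A2, and the initial state and function names are purely illustrative assumptions.

```python
import numpy as np

# Minimal sketch of dataset generation by iterating a discrete-time state model.
# The update below is a trivial stand-in; the real step is given by Equations (2)-(6)
# with the parameter values of Tables A1 and A2.

def step(x, d=2.0, x_in=(200.0, 100.0, 0.0, 0.0, 2.0)):
    # Placeholder dynamics: relax each state toward its inflow value (x_in from Table A2).
    # Replace with the reduced ASM1 update to reproduce the paper's dataset.
    return x + 0.01 * d * (np.asarray(x_in) - x)

def simulate(x0, n_samples=5020):
    # Roll the model forward and store one row per sampling instant.
    traj = np.empty((n_samples, len(x0)))
    x = np.asarray(x0, dtype=float)
    for k in range(n_samples):
        traj[k] = x
        x = step(x)
    return traj

# Arbitrary illustrative initial state; columns of `data` correspond to X1..X5.
data = simulate([250.0, 120.0, 30.0, 1.0, 3.0])
```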

2.2. LSTM Network Architecture

The LSTM is a type of recurrent neural network initially introduced in the field of deep learning by Hochreiter and Schmidhuber [20] to address the vanishing gradient problem of RNNs during backpropagation. The LSTM model is widely recognized as an influential architecture for learning from sequential data due to its ability to capture long-term dependencies and effectively learn from sequences of varying lengths. A schematic of the LSTM unit is presented in Figure 2.
The LSTM unit comprises three gates responsible for controlling the flow of information: (i) the input gate, which determines the significance of input information to be remembered; (ii) the forget gate, which decides whether to retain or discard the input value; and (iii) the output gate, which governs the output of the LSTM unit. The LSTM is implemented through Equations (7)–(12).
The input gate i_t:
i_t = σ(W_{xi} x_t + W_{hi} h_{t−1} + W_{ci} C_{t−1} + b_i)    (7)
The forget gate f_t:
f_t = σ(W_{xf} x_t + W_{hf} h_{t−1} + W_{cf} C_{t−1} + b_f)    (8)
The state candidate C̃_t:
C̃_t = tanh(W_{xc} x_t + W_{hc} h_{t−1} + b_c)    (9)
The activation cell C_t:
C_t = f_t C_{t−1} + i_t C̃_t    (10)
The output gate o_t:
o_t = σ(W_{xo} x_t + W_{ho} h_{t−1} + W_{co} C_t + b_o)    (11)
The hidden state h_t:
h_t = o_t tanh(C_t)    (12)
Regarding the components entailed in the mathematical depiction of the LSTM cell, W_{ci}, W_{cf}, and W_{co} represent the weights establishing connections between the activation cell and the input gate, the forget gate, and the output gate, respectively. W_{hi}, W_{hf}, W_{hc}, and W_{ho} denote the weights linking the hidden layer to the input gate, the forget gate, the activation cell, and the output gate [13]. Additionally, W_{xi}, W_{xf}, W_{xc}, and W_{xo} correspond to the weight matrices connecting the input layer to the input gate, the forget gate, the activation cell, and the output gate, whereas b_i, b_f, b_c, and b_o refer to the respective biases. Ultimately, the values are rescaled within the range of −1 to 1 using the tanh activation function.
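For readers who prefer code, the following NumPy sketch evaluates one forward step of a single LSTM cell following Equations (7)–(12), including the peephole weights W_ci, W_cf, and W_co; the dimensions and random weights are purely illustrative and do not correspond to the trained network of this work.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_cell_step(x_t, h_prev, c_prev, W, b):
    """One forward step of an LSTM cell with peephole connections, Eqs. (7)-(12)."""
    i_t = sigmoid(W["xi"] @ x_t + W["hi"] @ h_prev + W["ci"] @ c_prev + b["i"])   # input gate (7)
    f_t = sigmoid(W["xf"] @ x_t + W["hf"] @ h_prev + W["cf"] @ c_prev + b["f"])   # forget gate (8)
    c_tilde = np.tanh(W["xc"] @ x_t + W["hc"] @ h_prev + b["c"])                  # state candidate (9)
    c_t = f_t * c_prev + i_t * c_tilde                                            # activation cell (10)
    o_t = sigmoid(W["xo"] @ x_t + W["ho"] @ h_prev + W["co"] @ c_t + b["o"])      # output gate (11)
    h_t = o_t * np.tanh(c_t)                                                      # hidden state (12)
    return h_t, c_t

# Illustrative dimensions: 3 inputs (X3, X4, X5), 4 hidden units, random weights.
n_in, n_hid = 3, 4
rng = np.random.default_rng(0)
W = {k: rng.normal(size=(n_hid, n_in if k[0] == "x" else n_hid))
     for k in ["xi", "hi", "ci", "xf", "hf", "cf", "xc", "hc", "xo", "ho", "co"]}
b = {k: np.zeros(n_hid) for k in ["i", "f", "c", "o"]}
h, c = lstm_cell_step(rng.normal(size=n_in), np.zeros(n_hid), np.zeros(n_hid), W, b)
```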

2.3. LSTM-Based Soft Sensor Model

In recent years, soft sensors, which estimate process variables using measured data from other sensors, have become increasingly popular due to their ability to provide accurate and reliable predictions. In this context, ANNs have emerged as a prominent approach for developing soft sensors due to their ability to handle complex nonlinear relationships and their capability to learn from data [21,22]. In this work, a deep LSTM network is chosen for modeling the temporal behavior and dependencies between WWTP inputs and outputs due to its capability for time series prediction and handling time-dependent values [23,24]. Thus, the proposed LSTM-based soft sensor model is responsible for predicting the X 1 and X 2 states to quantify effluent quality using the TCOD as the quality index. As shown in Figure 3, the model operates in three stages:
  • Data preprocessing: this step includes data normalization and implementing a sliding window into the dataset.
  • Data processing: this step comprises the selection, training, and testing of the deep LSTM network to predict X 1 and X 2 .
  • Data postprocessing: this step consists of the denormalization of data and the evaluation of the model’s performance, resulting in the predictions of X 1 and X 2 , denoted as X ^ 1 and X ^ 2 , respectively.
The X 3 , X 4 , and X 5 states are the input data measurements because of their role in the biotransformation of organic micropollutants (OMPs). Table 1 presents the input and output measurements of the proposed LSTM soft sensor.

2.4. Dataset and Data Processing

Preparing data before feeding it into a model is a crucial step in machine learning techniques. LSTM networks require sufficient historical information to predict future outcomes and enhance system performance. In this study, the wastewater dynamic states X_3, X_4, and X_5 are considered as input parameters. The input parameters are assumed to be available for data acquisition, and they are directly related to substrate degradation and oxidation, so they are suitable for the identification of organic substrates. Simultaneously, the output parameters X_1 and X_2 are predicted by the LSTM-based soft sensor model to determine the TCOD quality index for wastewater effluent assessment. Based on the simplified WWTP model described by Equations (1)–(6), a dataset comprising 5020 samples corresponding to 120 h of the process (5 days) was generated. The first 4500 rows, corresponding to the first 108 h of the process, were used for training and validating the LSTM networks. After adjusting the hyperparameters and attaining the optimal results, the remaining 520 rows (4501–5020), representing the final 12 h of the process, were used as unseen data to forecast the levels of X_1 and X_2.
During the training phase, the model underwent supervised learning with predefined target outcomes. In the testing phase, the developed model was applied to predict the targeted substances based on the training data. Figure 4 visually presents the 4500 data points generated for X_3, X_4, and X_5. Statistics of the variables in the dataset generated by employing the model described by Equations (1)–(6) are presented in Table 2. It is important to note that for all kinds of data-driven models (e.g., artificial intelligence-based models), a low standard deviation of the data indicates that the data points are closely clustered around the mean, which implies a smaller degree of variability or dispersion in the data; thus, less biased outputs are expected from the models [25].
Studies have suggested that LSTM networks are sensitive to the scale of the dataset, particularly when utilizing nonlinear activation functions. A widely adopted strategy to address this challenge is normalizing the dataset within the 0 to 1 range [24]. Consequently, we normalized both the input and target datasets and used systematic weight initialization to expedite the learning process, leading to quicker convergence. The final normalized input data used for training the LSTM networks are illustrated in Figure 5.
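As an illustration of this preprocessing stage, the sketch below applies min-max normalization to the 0–1 range and builds sliding windows of the seven most recent past measurements (Table 3). The array `data` is assumed to be the simulated trajectory from the earlier sketch (columns X_1 to X_5), and the helper names are illustrative rather than the authors' implementation.

```python
import numpy as np

# `data` is the (5020, 5) array produced by the simulation sketch above (columns X1..X5).

def min_max_scale(x, lo=None, hi=None):
    # Column-wise min-max normalization to [0, 1]; returns the scaling bounds as well.
    lo = x.min(axis=0) if lo is None else lo
    hi = x.max(axis=0) if hi is None else hi
    return (x - lo) / (hi - lo), lo, hi

def make_windows(inputs, targets, window=7):
    # Stack `window` past input rows per sample; the target is the row at the next instant.
    X, y = [], []
    for k in range(window, len(inputs)):
        X.append(inputs[k - window:k])
        y.append(targets[k])
    return np.array(X), np.array(y)

inputs_norm, in_lo, in_hi = min_max_scale(data[:, 2:5])   # inputs: X3, X4, X5
targets_norm, t_lo, t_hi = min_max_scale(data[:, 0:2])    # targets: X1, X2
X_train, y_train = make_windows(inputs_norm[:4500], targets_norm[:4500])
```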

2.5. Hyperparameter Selection for Proposed LSTM Architecture

The adequate selection of the deep LSTM network architecture, which is the core of the LSTM-based soft sensor model presented in Figure 6, involves utilizing various tools and methodologies. The optimal number of LSTM units in the hidden layer is determined through systematic experimentation, ranging from 2 to 200 cell units. Each topology is tested using a loss function as the error metric, with the process repeated thrice to ensure result consistency. After careful experimentation, the deep LSTM network architecture, depicted in Figure 6, exhibited the best training and validation accuracy results.
The training process involved using the seven most recent past measurements to perform the prediction of the substrates. The main hyperparameters of the selected configuration are presented in Table 3. The proposed LSTM network was implemented in Python 3.10.12, utilizing the Keras library with TensorFlow as its backend framework. Table 4 lists the open-source libraries employed in this study.
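A minimal Keras/TensorFlow sketch of a network consistent with Table 3 (two LSTM layers of 128 and 64 cells, ReLU activation, a dropout rate of 0.1, and the Adam optimizer with a learning rate of 0.001) could look as follows; the exact layer arrangement and output layer are assumptions, not the authors' verbatim code.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# Sketch of a deep LSTM consistent with Table 3: 7 past time steps of the 3 inputs
# (X3, X4, X5), two LSTM layers (128 and 64 cells), dropout 0.1, and a dense output
# for the two predicted substrates (X1, X2).
model = models.Sequential([
    layers.LSTM(128, activation="relu", return_sequences=True, input_shape=(7, 3)),
    layers.Dropout(0.1),
    layers.LSTM(64, activation="relu"),
    layers.Dropout(0.1),
    layers.Dense(2),   # outputs: predicted X1 and X2
])
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001), loss="mse")
model.summary()
```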

2.6. Model Performance Evaluation

The objective of model performance evaluation is to validate the accuracy of the proposed model and identify any errors, thus guaranteeing its reliable applicability [15]. In this study, we employ the MSE, R2, and MAPE as three performance metrics to evaluate the predictive capabilities of the proposed LSTM-based soft sensor model. The calculations for MSE, R 2 , and MAPE are as follows [29]:
1. MSE: it measures the average of the squares of the errors and is given by the following equation:
MSE = (1/n) Σ_{i=1}^{n} (y_i − ŷ_i)²    (13)
2. R²: the coefficient of determination measures how much one variable can explain the variation in another variable when predicting the outcome of an event. The formula is as follows:
R² = 1 − [Σ_{i=1}^{n} (y_i − ŷ_i)²] / [Σ_{i=1}^{n} (y_i − ȳ)²]    (14)
3. MAPE: it is the mean or average of the absolute percentage errors of prediction:
MAPE = (1/n) Σ_{i=1}^{n} |(y_i − ŷ_i)/ŷ_i| × 100    (15)
Regarding Equations (13)–(15), n represents the number of samples, y_i corresponds to the i-th sample of the observed output data, ŷ_i is the i-th predicted value, and ȳ is the mean of the observed values.
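For reference, the three metrics of Equations (13)–(15) can be computed directly with NumPy, as in the illustrative helper below; the MAPE denominator follows Equation (15) as written, and the small numeric example is purely hypothetical.

```python
import numpy as np

def evaluate(y_true, y_pred):
    """MSE, R^2, and MAPE as defined in Equations (13)-(15)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    mse = np.mean((y_true - y_pred) ** 2)
    r2 = 1.0 - np.sum((y_true - y_pred) ** 2) / np.sum((y_true - np.mean(y_true)) ** 2)
    mape = 100.0 * np.mean(np.abs((y_true - y_pred) / y_pred))   # denominator as in Eq. (15)
    return mse, r2, mape

# Hypothetical example values (mg COD/L).
mse, r2, mape = evaluate([290.0, 300.0, 310.0], [288.0, 303.0, 309.0])
```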

3. Prediction Results

3.1. Training and Validation Stage

Before running the proposed deep LSTM neural network for X_1 and X_2 prediction in the WWTP model described by Equations (1)–(6), the generated dataset comprising 5020 samples was divided into three groups: data for training, data for validation, and data for testing. As mentioned in the previous section, the group of 4500 rows corresponding to the first 108 h of the process was used for training and validating the LSTM networks, of which 80% was employed for training and the remaining 20% for validation. Figure 7 shows the training loss and validation loss curves in terms of MSE for the developed deep LSTM network using the hyperparameters presented in Table 3. From Figure 7, it is possible to observe that no overfitting occurs during the training and validation stage of the LSTM network.
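A hedged sketch of this training stage with Keras is shown below; `model`, `X_train`, and `y_train` refer to the illustrative objects of the earlier sketches, and the 80/20 split is obtained with the `validation_split` argument.

```python
# Train on the windows built from the first 4500 rows with an 80/20 train/validation
# split and the Table 3 hyperparameters; `model`, `X_train`, `y_train` come from the
# earlier illustrative sketches.
history = model.fit(
    X_train, y_train,
    validation_split=0.2,   # last 20% of the training rows held out for validation
    epochs=700,
    batch_size=128,
    verbose=0,
)

# Training and validation MSE curves (cf. Figure 7).
import matplotlib.pyplot as plt
plt.plot(history.history["loss"], label="training loss")
plt.plot(history.history["val_loss"], label="validation loss")
plt.xlabel("Epoch"); plt.ylabel("MSE"); plt.legend(); plt.show()
```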

3.2. Testing Stage

After training and validating the deep LSTM network, we applied the testing data containing 520 rows to evaluate the model's prediction reliability for data unseen during the training process. Figure 8 shows the prediction results for X_1 with the respective prediction error. The prediction results for X_2 are presented in Figure 9. The states of the system are available via the LSTM-based soft sensor model, and the TCOD quality index is calculated by Equation (1). Figure 10 displays the predicted TCOD versus the real value in the wastewater plant over 240 h. From these results, it can be appreciated that, in general, the predicted values were close to the observed values, indicating the adequate capability of the proposed strategy to predict the behavior of X_1 and X_2 for unseen data.
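To illustrate the testing stage, the sketch below forms the 520 test windows, predicts X_1 and X_2, undoes the normalization, and computes the TCOD index of Equation (1) with I_s = 5 mg COD/L from Table A2; as before, the variable names tie back to the earlier illustrative sketches and are not the authors' code.

```python
import numpy as np

# Build test windows from the last 520 rows (plus the preceding 7 rows of context),
# predict X1 and X2, denormalize, and compute the TCOD quality index of Equation (1).
X_test, y_test = make_windows(inputs_norm[4500 - 7:], targets_norm[4500 - 7:])
y_hat_norm = model.predict(X_test)

y_hat = y_hat_norm * (t_hi - t_lo) + t_lo        # back to mg COD/L
x1_hat, x2_hat = y_hat[:, 0], y_hat[:, 1]

I_s = 5.0                                        # inert organic matter, Table A2
tcod_hat = x1_hat + x2_hat + I_s                 # Equation (1)
print("Predicted TCOD range: %.1f-%.1f mg COD/L" % (tcod_hat.min(), tcod_hat.max()))
```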
The prediction results of the proposed model were evaluated based on the performance metrics MSE, R², and MAPE presented in Equations (13)–(15). Generally speaking, a good fit between the observed and predicted results corresponds to MSE values close to zero and R² values close to 1. Table 5 presents a comparative analysis in terms of the performance metrics for the prediction results achieved by employing the proposed LSTM-based model against those achieved using the FFNN [23] technique. The comparison was conducted using the same test dataset for both techniques. The FFNN was implemented with a single hidden layer comprising 128 neurons and trained using the Levenberg–Marquardt algorithm (LMA). The average MSE, R², and MAPE values obtained with the proposed approach were 23.38, 0.97, and 1.31 for X_1, and 9.74, 0.93, and 1.89 for X_2, respectively. The results indicate a superior performance of the proposed approach over the results obtained using the FFNN. Figure 11 and Figure 12 show the scatter plots of the real versus predicted values of X_1 and X_2 for the LSTM-based model and the FFNN technique, respectively.

4. Discussion

Based on the results presented in Table 5, it can be observed that the MSE for the X_2 prediction was comparatively better than that for X_1, which can be attributed to some large occasional deviations due to outliers in the dataset. Hence, some of the peaks or differences shown in the plots could be attributed to these outliers, leading to substantial deviations in the consecutive results. This complication could be addressed with a more extensive preprocessing of the dataset. On the other hand, the prediction results in terms of the performance metrics R² and MAPE were better for X_1 than for X_2. These results can be attributed to the fact that, according to the basic statistics of the variables in the dataset presented in Table 2, the easily biodegradable soluble substrate X_1 presents a lower value of standard deviation, which implies low data variability and a more stable and predictable pattern, resulting in more accurate and less biased predictions generated by the model. In general, the results demonstrate that the proposed LSTM-based soft sensor model is competent in capturing the nonlinear behavior of the substrates X_1 and X_2 present in the wastewater biological process for effluent quality evaluation.

5. Conclusions

This study proposes an LSTM-based soft sensor model to predict the concentrations of two critical substrates for effluent quality determination in wastewater treatment plants. First, we generated a dataset that describes the behavior of a real small-sized WWTP, modeled by the discrete-time ASM1. Then, we developed a deep LSTM network structure as the foundation for formulating the LSTM-based soft sensor model. The results demonstrate that this structure yields high-accuracy predictions for the organic substrates. After hyperparameter fine-tuning, the predictive capability of the proposed model was optimized, with average values of the performance metrics MSE, R 2 , and MAPE of 23.38, 0.97, and 1.31 for substrate X 1 , and 9.74, 0.93, and 1.89 for substrate X 2 , respectively. According to the results, the proposed LSTM-based soft sensor can be a valuable management tool for decision making, with the aim to satisfy legislative requirements. However, it is important to note that LSTM networks still present several challenging limitations. For instance, LSTMs are prone to overfitting when dealing with small datasets. Additionally, data preparation is critical for LSTM predictions. In most cases, it is necessary to normalize or standardize the data, handle missing values, and select appropriate features to ensure that the LSTM can effectively learn from the input. Therefore, at this time, the authors are actively exploring the development of methodologies for optimally selecting the most suitable LSTM network structure and the respective hyperparameters according to the particular application. Furthermore, as future work, it is intended to investigate the application of the proposed LSTM-based soft sensor to simulate a closed-loop wastewater treatment plant system.

Author Contributions

Methodology, R.R.-C. and R.F.C.; Validation, R.R.-C. and R.F.C.; Formal analysis, E.L.B. and K.J.G.T.; Resources, E.L.B.; Writing—review & editing, K.J.G.T. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Consejo Nacional de Humanidades, Ciencias y Tecnologías (CONAHCYT; Project CF-2023-G-648).

Data Availability Statement

Data are contained within the article.

Acknowledgments

Roxana Recio-Colmenares acknowledges CONAHCYT for the Postdoctoral Research Fellowships 2256881.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

Table A1. Mathematical model parameters.

Parameter | Value | Units | Description
Y_A | 0.24 | mg COD/mg N | Autotrophic yield coefficient
Y_H | 0.67 | mg COD/mg COD | Heterotrophic yield coefficient
μ_max,A | 0.8 | 1/h | Maximum specific growth rate for autotrophs
μ_max,H | 6 | 1/h | Maximum specific growth rate for heterotrophs
μ_1 | - | - | Monod kinetics for easily biodegradable soluble substrate
μ_2 | - | - | Monod kinetics for the component X_{5,k} as a function of X_{4,k}
μ_3 | - | - | Monod kinetics for the component X_{5,k} as a function of X_{3,k}
μ_4 | - | - | Monod kinetics for soluble nitrate and nitrite nitrogen
μ_5 | - | - | Monod kinetics for soluble ammonium nitrogen
μ_6 | - | - | Inhibition kinetics for X_{5,k}
μ_7 | - | - | Saturation kinetics
b | 0.9 | - | Fraction coefficient for D_r
b_A | 0.05 | 1/h | Autotrophic decay coefficient
b_H | 0.22 | 1/h | Heterotrophic decay coefficient
f_P | 0.08 | - | Fraction of biomass yielding particulate products
η_g | 0.8 | - | Correction factor for anoxic growth of heterotrophs
Table A2. Initial conditions and additional parameters.

Parameter | Value | Units | Description
D | 2 | 1/h | Dilution rate
D_r | 1 | 1/h | Dilution recycle rate
K_La | 4.5 | 1/h | Oxygen transfer coefficient
X_{1,in} | 200 | mg COD/L | Initial condition of X_{1,k}
X_{2,in} | 100 | mg COD/L | Initial condition of X_{2,k}
X_{3,in} | 0 | mg COD/L | Initial condition of X_{3,k}
X_{4,in} | 0 | mg COD/L | Initial condition of X_{4,k}
X_{5,in} | 2 | mg COD/L | Initial condition of X_{5,k}
X_{5,max} | 10 | mg/L | Maximum concentration of soluble oxygen
V | 15 | L | Tank volume
I_s | 5 | mg COD/L | Concentration of soluble and particulate inert organic matter
η_h | 0.4 | - | Correction factor for anoxic hydrolysis

References

  1. Kadlec, P.; Gabrys, B.; Strandt, S. Data-driven soft sensors in the process industry. Comput. Chem. Eng. 2009, 33, 795–814. [Google Scholar] [CrossRef]
  2. Alanis, A.Y.; Sanchez, E.N. Full Order Neural Observers. In Discrete-Time Neural Observers; Academic Press: Cambridge, MA, USA, 2017; pp. 23–74. [Google Scholar] [CrossRef]
  3. Hu, X.; Cao, D.; Egardt, B. Condition Monitoring in Advanced Battery Management Systems: Moving Horizon Estimation Using a Reduced Electrochemical Model. IEEE/ASME Trans. Mechatron. 2018, 23, 167–178. [Google Scholar] [CrossRef]
  4. Grbić, R.; Slišković, D.; Kadlec, P. Adaptive soft sensor for online prediction and process monitoring based on a mixture of Gaussian process models. Comput. Chem. Eng. 2013, 58, 84–97. [Google Scholar] [CrossRef]
  5. Lou, H.H.; Mukherjee, R.; Wang, Z.; Olsen, T.; Diwekar, U.; Lin, S. A New Area of Utilizing Industrial Internet of Things in Environmental Monitoring. Front. Chem. Eng. 2022, 4, 842514. [Google Scholar] [CrossRef]
  6. Zhang, Y.; Jin, H.; Liu, H.; Yang, B.; Dong, S. Deep Semi-Supervised Just-in-Time Learning Based Soft Sensor for Mooney Viscosity Estimation in Industrial Rubber Mixing Process. Polymers 2022, 14, 1018. [Google Scholar] [CrossRef]
  7. Wu, H.; Han, Y.; Jin, J.; Geng, Z. Novel Deep Learning Based on Data Fusion Integrating Correlation Analysis for Soft Sensor Modeling. Ind. Eng. Chem. Res. 2021, 60, 10001–10010. [Google Scholar] [CrossRef]
  8. Quan, J. Visualization and Analysis Model of Industrial Economy Status and Development Based on Knowledge Graph and Deep Neural Network. Comput. Intell. Neurosci. 2022, 2022, 7008093. [Google Scholar] [CrossRef]
  9. Yan, W.; Xu, R.; Wang, K.; Di, T.; Jiang, Z. Soft Sensor Modeling Method Based on Semisupervised Deep Learning and Its Application to Wastewater Treatment Plant. Ind. Eng. Chem. Res. 2020, 59, 4589–4601. [Google Scholar] [CrossRef]
  10. Li, Z.; Jin, H.; Dong, S.; Qian, B.; Yang, B.; Chen, X. Semi-supervised ensemble support vector regression based soft sensor for key quality variable estimation of nonlinear industrial processes with limited labeled data. Chem. Eng. Res. Des. 2022, 179, 510–526. [Google Scholar] [CrossRef]
  11. Hu, X.; Li, S.E.; Yang, Y. Advanced Machine Learning Approach for Lithium-Ion Battery State Estimation in Electric Vehicles. IEEE Trans. Transp. Electrif. 2016, 2, 140–149. [Google Scholar] [CrossRef]
  12. Bakirov, R.; Gabrys, B.; Fay, D. Multiple adaptive mechanisms for data-driven soft sensors. Comput. Chem. Eng. 2017, 96, 42–54. [Google Scholar] [CrossRef]
  13. Gopakumar, V.; Tiwari, S.; Rahman, I. A deep learning based data driven soft sensor for bioprocesses. Biochem. Eng. J. 2018, 136, 28–39. [Google Scholar] [CrossRef]
  14. Venkatachalam, K.; Trojovský, P.; Pamucar, D.; Bacanin, N.; Simic, V. DWFH: An improved data-driven deep weather forecasting hybrid model using Transductive Long Short Term Memory (T-LSTM). Expert Syst. Appl. 2023, 213, 119270. [Google Scholar] [CrossRef]
  15. Fang, Z.; Ma, X.; Pan, H.; Yang, G.; Arce, G.R. Movement forecasting of financial time series based on adaptive LSTM-BN network. Expert Syst. Appl. 2023, 213, 119207. [Google Scholar] [CrossRef]
  16. Gülmez, B. Stock price prediction with optimized deep LSTM network with artificial rabbits optimization algorithm. Expert Syst. Appl. 2023, 227, 120346. [Google Scholar] [CrossRef]
  17. Yoon, S.-H.; Lee, S. Critical operational parameters for zero sludge production in biological wastewater treatment processes combined with sludge disintegration. Water Res. 2005, 39, 3738–3754. [Google Scholar] [CrossRef] [PubMed]
  18. Henze, M.; Grady, C.; Gujer, W.; Marais, G.; Matsuo, T. A general model for single-sludge wastewater treatment systems. Water Res. 1987, 21, 505–515. [Google Scholar] [CrossRef]
  19. Recio-Colmenares, R.; Gurubel-Tun, K.J.; Zúñiga-Grajeda, V. Optimal neural tracking control with metaheuristic parameter identification for uncertain nonlinear systems with disturbances. Appl. Sci. 2020, 10, 7073. [Google Scholar] [CrossRef]
  20. Hochreiter, S.; Schmidhuber, J. Long Short-Term Memory. Neural Comput. 1997, 9, 1735–1780. [Google Scholar] [CrossRef]
  21. Pisa, I.; Santín, I.; Vicario, J.L.; Morell, A.; Vilanova, R. ANN-based soft sensor to predict effluent violations in wastewater treatment plants. Sensors 2019, 19, 1280. [Google Scholar] [CrossRef]
  22. Yaqub, M.; Asif, H.; Kim, S.; Lee, W. Modeling of a full-scale sewage treatment plant to predict the nutrient removal efficiency using a long short-term memory (LSTM) neural network. J. Water Process. Eng. 2020, 37, 101388. [Google Scholar] [CrossRef]
  23. Silva, I.N.; Spatti, D.H.; Flauzino, R.A.; Liboni, L.; Alves, S. Artificial Neural Networks a Practical Course; Springer: Cham, Switzerland, 2017. [Google Scholar]
  24. Li, J.; Qi, C.; Li, Y.; Wu, Z. Prediction and compensation of contour error of CNC systems based on LSTM neural network. IEEE/ASME Trans. Mechatron. 2021, 27, 572–581. [Google Scholar] [CrossRef]
  25. Harris, C.R.; Millman, K.J.; van der Walt, S.J.; Gommers, R.; Virtanen, P.; Cournapeau, D.; Wieser, E.; Taylor, J.; Berg, S.; Smith, N.J.; et al. Array programming with NumPy. Nature 2020, 585, 357–362. [Google Scholar] [CrossRef] [PubMed]
  26. McKinney, W. Data Structures for Statistical Computing in Python. In Proceedings of the 9th Python in Science Conference, Austin, TX, USA, 28 June–3 July 2010; pp. 51–56. [Google Scholar] [CrossRef]
  27. Hunter, J.D. Matplotlib: A 2D Graphics Environment. Comput. Sci. Eng. 2007, 9, 90–95. [Google Scholar] [CrossRef]
  28. Abadi, M.; Barham, P.; Chen, J.; Chen, Z.; Davis, A.; Dean, J.; Devin, M.; Ghemawat, S.; Irving, G.; Isard, M.; et al. TensorFlow: A system for large-scale machine learning. In Proceedings of the Symposium on Operating Systems Design and Implementation, Savannah, GA, USA, 2–4 November 2016; pp. 265–283. [Google Scholar]
  29. Moriasi, D.N.; Arnold, J.G.; van Liew, M.W.; Bingner, R.L.; Harmel, R.D.; Veith, T.L. Model Evaluation Guidelines for Systematic Quantification of Accuracy in Watershed Simulations. Trans. ASABE 2007, 50, 885–900. [Google Scholar]
Figure 1. Wastewater plant configuration.
Figure 2. LSTM cell.
Figure 3. LSTM-based soft sensor.
Figure 4. Dataset of input measurements X3, X4, and X5.
Figure 5. Normalized dataset of input measurements X3, X4, and X5.
Figure 6. Proposed deep LSTM network architecture.
Figure 7. Loss curves of training and validation of proposed deep LSTM network.
Figure 8. Observed and predicted results for X1.
Figure 9. Observed and predicted results for X2.
Figure 10. TCOD predicted vs. real values.
Figure 11. Plot of observed versus predicted values and the fitted regression line using the LSTM-based soft sensor model. (a) Red circles indicate X1 positive correlation and (b) green circles indicate X2 positive correlation.
Figure 12. Plot of observed versus predicted values and the fitted regression line using the FFNN. (a) Red circles indicate X1 positive correlation and (b) green circles indicate X2 positive correlation.
Table 1. Input and output measurements of LSTM soft sensor.

Input measurements
Measurement | Description
X_3 (mg COD/L) | Active heterotrophic particulate biomass
X_4 (mg COD/L) | Active autotrophic particulate biomass
X_5 (mg/L) | Soluble oxygen

Output measurements
Measurement | Description
X_1 (mg COD/L) | Easily biodegradable soluble substrate
X_2 (mg COD/L) | Slowly biodegradable particulate substrate
Table 2. Basic statistics of parameters in the generated dataset.

Parameter | Minimum | Maximum | Mean | Std. Deviation
X_1 (mg COD/L) | 189.13 | 393.60 | 290.82 | 31.77
X_2 (mg COD/L) | 66.06 | 145.53 | 107.80 | 11.91
X_3 (mg COD/L) | 23.10 | 57.54 | 40.13 | 5.84
X_4 (mg COD/L) | 0.70 | 2.80 | 1.73 | 0.35
X_5 (mg/L) | 1.24 | 3.93 | 2.49 | 0.43
Table 3. Hyperparameters selected for the deep LSTM architecture.

Hyperparameter | Selected Value
Batch size | 128
Previous time steps | 7
Optimizer | Adam
Epoch size | 700
Dropout rate | 0.1
Optimizer learning rate | 0.001
Number of LSTM layers | 2
Number of LSTM cells per layer | L1: 128 and L2: 64
Activation function | ReLU
Table 4. Libraries of Python 3.10.12 employed in this work.

Library | Purpose | Version
NumPy [25] | Data processing | 1.23.5
Pandas [26] | Data management | 1.5.3
Matplotlib [27] | Graphic generation | 3.7.1
TensorFlow [28] | Neural network implementation | 2.13.0
Table 5. Summary of prediction performance.

Method | Performance Metric | Obtained Values for X_1 | Obtained Values for X_2
Proposed LSTM-based approach | MSE | 23.38 | 9.74
Proposed LSTM-based approach | R² | 0.97 | 0.93
Proposed LSTM-based approach | MAPE (%) | 1.31 | 1.89
FFNN | MSE | 115.52 | 13.70
FFNN | R² | 0.86 | 0.81
FFNN | MAPE (%) | 3.34 | 2.76
