Proceeding Paper

Parameter Identification Concept for Process Models Combining Systems Theory and Deep Learning †

1 Automation & Computer Sciences Department, Harz University of Applied Sciences, Friedrichstrasse 57–59, 38855 Wernigerode, Germany
2 Institute for Chemical and Thermal Process Engineering, Technische Universität Braunschweig, Langer Kamp 7, 38106 Braunschweig, Germany
* Author to whom correspondence should be addressed.
† Presented at the 1st International Electronic Conference on Processes: Processes System Innovation, 17–31 May 2022; Available online: https://sciforum.net/event/ECP2022.
Eng. Proc. 2022, 19(1), 27; https://doi.org/10.3390/ECP2022-12686
Published: 8 June 2022

Abstract

In recent years, dynamic process models have grown even more important in the context of Industry 4.0 and the use of digital twins. However, the accuracy of the corresponding model parameter estimates is determined by the quantity and quality of data and the parameter identification solving methodologies used. Standard methods are based on the ordinary least squares framework. Still, other options are available that might be more sensitive to model parameter variations and ensure more precise parameter estimates. The paper presents a novel technique for parameter identification based on incorporating neural ordinary differential equations for surrogate modeling and differential flatness, i.e., a systems theory concept in control engineering. This approach may lead to improved parameter sensitivities, as demonstrated with a simulation study of a distributed-parameter identification problem assuming a diffusion-type parabolic partial differential equation.

1. Introduction

System identification is the process of building a mathematical model of a real-world system. The model can then be used to predict and analyze the behavior of the system under study. In many engineering fields, the dynamic behavior of complex technical systems can be described by a system of ordinary differential equations (ODEs). However, the parameters in these equations are often unknown and need to be estimated from experimental data. Over the past few decades, parameter estimation methods have been researched intensively. A popular method minimizes the sum of squared errors (SSE) between a model prediction and measurement data, where the prediction is calculated by solving the ODE numerically [1]. The model parameters are adjusted until a given minimization criterion is met. However, other options are available that might be more sensitive to model parameter variations and might therefore yield more precise parameter estimates. Concepts from control and systems theory can improve parameter identification procedures, e.g., online parameter identification concepts [2,3].
Another example is differential flatness, which is used to recalculate control trajectory profiles for desired system dynamics [4,5], i.e., following a system inversion concept. In a differentially flat system, state variables and input variables can be expressed as functions of so-called flat outputs and a finite number of their derivatives, which also leads to a reformulation of the parameter identification problem. Moreover, while optimal experimental design concepts might be needed to improve data quantity and quality [3], the flat output approach may result in improved parameter sensitivities [6] and more precise parameter estimates [7,8] without experimental data enrichment.

2. Methods

2.1. Parameter Identification Problem

Frequently, dynamic process models are given as ordinary differential equation systems:
$$\dot{x}(t) = f(x(t), u(t), p), \quad x(t_0) = x_0, \tag{1}$$
where $t \in [t_0, t_0 + t_{\text{end}}]$ is the time, with $t_0$ as the initial time and $t_{\text{end}}$ as the time duration of the simulation, $u \in \mathbb{R}^{n_u}$ is the vector of the control variables, $p \in \mathbb{R}^{n_p}$ is the vector of the time-invariant parameters, and $x \in \mathbb{R}^{n_x}$ are the differential system states. The initial conditions for the differential states are given by $x_0$. Moreover, $f: \mathbb{R}^{n_x} \times \mathbb{R}^{n_u} \times \mathbb{R}^{n_p} \to \mathbb{R}^{n_x}$ represents the corresponding vector field. For this kind of mathematical representation, the standard approach of parameter identification, i.e., the ordinary least squares (OLS) method, can be defined as:
$$\hat{p}_{\text{OLS}} = \arg\min_{p} \sum_{k=1}^{K} \left\| y^{\text{data}}(t_k) - y(t_k, p) \right\|_2^2, \tag{2}$$
where $\|\cdot\|_2$ denotes the Euclidean norm, $y^{\text{data}}(t_k)$ represents the data vector at the discrete time points $t_k$, $k = 1, \ldots, K$, with $K$ as the number of measurement samples, and the model output function is defined as:
$$y(t_k, p) = h(x(t_k, p)), \tag{3}$$
with $h: \mathbb{R}^{n_x} \to \mathbb{R}^{n_y}$, and $y \in \mathbb{R}^{n_y}$ as the model output vector. Alternatively, when aiming to utilize the inverse model response, i.e., applying a model inversion strategy, an input least squares (ILS)-based parameter identification problem can be used:
$$\hat{p}_{\text{ILS}} = \arg\min_{p} \sum_{k=1}^{K} \left\| u^{\text{data}}(t_k) - u(t_k, p) \right\|_2^2. \tag{4}$$
Here, the control inputs, $u(t_k, p)$, have to be calculated to solve the parameter identification problem, and $u^{\text{data}}(t_k)$ represents the recorded physical input actions. For this purpose, we study the differential flatness concept outlined in Section 2.2. However, it is essential to note that parameter sensitivities are relevant for well-posed parameter identification problems. Given the output function $y(t_k, p)$ and the inputs $u(t_k, p)$ (inverse model response), the sensitivities $S$ with respect to the parameter $p$ are defined as:
$$S_{y}^{p}(t_k) = \frac{\partial y(t_k, p)}{\partial p}, \tag{5}$$
$$S_{u}^{p}(t_k) = \frac{\partial u(t_k, p)}{\partial p}. \tag{6}$$
Here, in general, high absolute parameter sensitivity values favor precise parameter estimates according to the Fisher information matrix and the Cramér–Rao inequality [1,3].
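To make the OLS objective (Equation (2)) and the output sensitivity (Equation (5)) concrete, the following minimal Python sketch evaluates both for a toy scalar model. The model $\dot{x} = -p\,x$ with $y = x$, the function names, the parameter grid, and the noise-free synthetic data are illustrative assumptions and are not taken from the case study.

```python
import numpy as np

def model_output(t, p, x0=1.0):
    """Closed-form output y(t, p) of the assumed toy model dx/dt = -p*x, y = x."""
    return x0 * np.exp(-p * t)

def ols_cost(p, t, y_data):
    """Sum of squared output residuals, i.e., the OLS objective of Equation (2)."""
    return float(np.sum((y_data - model_output(t, p)) ** 2))

def output_sensitivity(t, p, dp=1e-6):
    """Central finite-difference approximation of S_y = dy/dp (Equation (5))."""
    return (model_output(t, p + dp) - model_output(t, p - dp)) / (2.0 * dp)

t = np.linspace(0.0, 2.0, 21)          # sampling times t_k (assumed)
p_true = 0.5                           # parameter used to generate the data (assumed)
y_data = model_output(t, p_true)       # noise-free synthetic measurements

# Coarse grid search over candidate parameters; a real study would use an optimizer.
p_grid = np.linspace(0.1, 1.0, 91)
costs = np.array([ols_cost(p, t, y_data) for p in p_grid])
p_hat = p_grid[np.argmin(costs)]

s_y = output_sensitivity(t, p_true)    # sensitivity along the time grid
```

The ILS objective of Equation (4) has the same structure, with the output residual replaced by the residual between recorded inputs and inputs recomputed from the inverse model.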

2.2. Differential Flatness

In the literature, a process model (Equation (1)) is called differentially flat if there is an output function:
$$y_{\text{flat}} = h_{\text{flat}}(x, u, \dot{u}, \ldots, u^{(s)}, p), \tag{7}$$
with a finite value $s \in \mathbb{N}$ and the smooth mapping function $h_{\text{flat}}: \mathbb{R}^{n_x} \times (\mathbb{R}^{n_u})^{s+1} \times \mathbb{R}^{n_p} \to \mathbb{R}^{n_y}$, where $y_{\text{flat}}$ is called the flat output. With the flat output, the system states and control inputs are expressed as:
$$x = \Psi_x(y_{\text{flat}}, \dot{y}_{\text{flat}}, \ldots, y_{\text{flat}}^{(r)}, p), \tag{8}$$
$$u = \Psi_u(y_{\text{flat}}, \dot{y}_{\text{flat}}, \ldots, y_{\text{flat}}^{(r+1)}, p), \tag{9}$$
with the mapping functions $\Psi_x: (\mathbb{R}^{n_y})^{r+1} \times \mathbb{R}^{n_p} \to \mathbb{R}^{n_x}$ and $\Psi_u: (\mathbb{R}^{n_y})^{r+2} \times \mathbb{R}^{n_p} \to \mathbb{R}^{n_u}$, and assuming a square system, i.e., $\dim y_{\text{flat}} = \dim u$. When applying the flatness concept, it was shown that parameter sensitivities, and hence the reliability of parameter estimates, could be improved in the case of ILS [6] or when combining OLS with ILS [7,8]. However, in process systems engineering, besides lumped-parameter systems (i.e., ordinary differential equations), distributed-parameter systems described by partial differential equations are frequently applied. In this case, the differential flatness approach has to be generalized [5,9,10,11].
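As an illustration of the flat parametrization in Equations (8) and (9), consider the toy system $\dot{x}_1 = x_2$, $\dot{x}_2 = -p\,x_1 + u$ with flat output $y_{\text{flat}} = x_1$, so that $x = (y_{\text{flat}}, \dot{y}_{\text{flat}})$ and $u = \ddot{y}_{\text{flat}} + p\,y_{\text{flat}}$ (here $r = 1$). This system, the desired trajectory $\sin(t)$, and all names in the sketch below are assumptions chosen for illustration only. The code computes the input from the desired flat output and checks by forward simulation that the state follows it.

```python
import numpy as np

P = 2.0  # assumed parameter value of the toy system

def flat_maps(y, dy, ddy, p):
    """Psi_x and Psi_u for dx1/dt = x2, dx2/dt = -p*x1 + u with flat output y = x1."""
    x = np.array([y, dy])   # states from y and its first derivative (r = 1)
    u = ddy + p * y         # input from y and derivatives up to order r + 1 = 2
    return x, u

def rhs(x, u, p):
    """Right-hand side of the toy system."""
    return np.array([x[1], -p * x[0] + u])

def desired(t):
    """Desired flat output y(t) = sin(t) together with its first two derivatives."""
    return np.sin(t), np.cos(t), -np.sin(t)

def flat_input(t, p):
    """Input that realizes the desired flat output at time t."""
    return flat_maps(*desired(t), p)[1]

def simulate(t_end=2.0, n_steps=2000, p=P):
    """Integrate the system with classical RK4 under the flatness-based input."""
    h = t_end / n_steps
    x, _ = flat_maps(*desired(0.0), p)  # consistent initial state
    t = 0.0
    for _ in range(n_steps):
        k1 = rhs(x, flat_input(t, p), p)
        k2 = rhs(x + 0.5 * h * k1, flat_input(t + 0.5 * h, p), p)
        k3 = rhs(x + 0.5 * h * k2, flat_input(t + 0.5 * h, p), p)
        k4 = rhs(x + h * k3, flat_input(t + h, p), p)
        x = x + h / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
        t += h
    return t, x

t_final, x_final = simulate()
```

Because the computed input depends explicitly on $p$, comparing it with a recorded input is what turns the flat parametrization into an ILS-type identification problem.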

2.3. Neural Ordinary Differential Equations

In data science and deep learning, neural networks are frequently used to build empirical models. A neural network is a group of interconnected neurons with one or more hidden layers, depending on the network's specific task. Technically, the $i$th neural network layer, $\mathrm{NNL}_i(x): \mathbb{R}^{d_{i-1}} \to \mathbb{R}^{d_i}$, contains $N_i$ neurons. Here, $\mathrm{NNL}_i(x)$ is specified with the weight matrix, $W_i \in \mathbb{R}^{d_i \times d_{i-1}}$, and the bias vector, $b_i \in \mathbb{R}^{d_i}$. Thus, for instance, a feed-forward neural network reads as:
$$\begin{aligned} \mathrm{NNL}_0(x) &= x \in \mathbb{R}^{d_0}, \\ \mathrm{NNL}_j(x) &= \sigma\left(W_j\,\mathrm{NNL}_{j-1}(x) + b_j\right) \in \mathbb{R}^{d_j}, \quad 1 \le j \le I-1, \\ \mathrm{NNL}_I(x) &= W_I\,\mathrm{NNL}_{I-1}(x) + b_I \in \mathbb{R}^{d_I}, \end{aligned} \tag{10}$$
where $\sigma$ denotes the activation function, applied element-wise. For the so-called neural ordinary differential equations, the governing equations read as:
$$\dot{x}(t) = \mathrm{NN}(x(t), u(t), p), \quad x(t_0) = x_0. \tag{11}$$
Neural ODEs offer a promising approach for hybrid modeling and system identification [12,13,14,15]. Furthermore, the neural network’s architecture can be optimized to represent experimental data better. This could be performed in conjunction with optimal experimental design methods [3] to improve the accuracy of system identification further.
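The structure of Equations (10) and (11) can be sketched in a few lines of NumPy. The snippet below builds a small feed-forward network with tanh hidden layers and a linear output layer and uses it as the right-hand side of an ODE integrated with explicit Euler steps. The layer sizes, the random untrained weights, the concatenation of states and inputs as network input, and the Euler scheme are illustrative assumptions. In this sketch the network weights take the role of the parameters, and no training is performed.

```python
import numpy as np

def init_mlp(layer_sizes, seed=0):
    """Create random weights W_i (d_i x d_{i-1}) and zero biases b_i for each layer."""
    rng = np.random.default_rng(seed)
    params = []
    for d_in, d_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        W = rng.normal(0.0, 1.0 / np.sqrt(d_in), size=(d_out, d_in))
        b = np.zeros(d_out)
        params.append((W, b))
    return params

def mlp(params, z):
    """Feed-forward pass as in Equation (10): tanh hidden layers, linear output layer."""
    for W, b in params[:-1]:
        z = np.tanh(W @ z + b)
    W, b = params[-1]
    return W @ z + b

def neural_ode_rhs(params, x, u):
    """Right-hand side of Equation (11); states and inputs are stacked (assumption)."""
    return mlp(params, np.concatenate([x, u]))

def integrate_euler(params, x0, u_fn, t0, t_end, n_steps):
    """Explicit Euler integration of the neural ODE, returning the state trajectory."""
    h = (t_end - t0) / n_steps
    x = np.array(x0, dtype=float)
    traj = [x.copy()]
    for k in range(n_steps):
        t = t0 + k * h
        x = x + h * neural_ode_rhs(params, x, u_fn(t))
        traj.append(x.copy())
    return np.array(traj)

n_x, n_u = 3, 3                                  # small illustrative dimensions
params = init_mlp([n_x + n_u, 16, 16, n_x])      # two hidden layers with 16 nodes each
traj = integrate_euler(params, np.ones(n_x), lambda t: np.zeros(n_u), 0.0, 1.0, 100)
```

In practice, the weights would be fitted to data by differentiating through the ODE solver, for which dedicated libraries exist. The sketch only shows how the network enters the dynamics.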

3. Case Study

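Determining kinetic parameters of diffusion-type parabolic PDEs (e.g., Equation (12)) has been extensively studied, including for chromatography and adsorption processes [2,16,17,18,19].
$$\frac{\partial \phi}{\partial t} = p\,\frac{\partial^2 \phi}{\partial x^2} + u(x, t) \tag{12}$$
Discretizing the spatial derivative of the diffusion-type PDE with the finite difference method (Equation (13)) yields a set of coupled ODEs, which can be written in state-space form as shown in Equation (14).
$$\frac{\partial^2 \phi}{\partial x^2} \approx \frac{\phi_{i+1} - 2\phi_i + \phi_{i-1}}{\Delta x^2} \tag{13}$$
$$\begin{aligned} \frac{\mathrm{d}\phi_1}{\mathrm{d}t} &= \frac{p}{\Delta x^2}\phi_2 - \frac{2p}{\Delta x^2}\phi_1 + \frac{p}{\Delta x^2}\phi_0 + u_1(t), \\ \frac{\mathrm{d}\phi_2}{\mathrm{d}t} &= \frac{p}{\Delta x^2}\phi_3 - \frac{2p}{\Delta x^2}\phi_2 + \frac{p}{\Delta x^2}\phi_1 + u_2(t), \\ \frac{\mathrm{d}\phi_3}{\mathrm{d}t} &= \frac{p}{\Delta x^2}\phi_4 - \frac{2p}{\Delta x^2}\phi_3 + \frac{p}{\Delta x^2}\phi_2 + u_3(t), \\ &\;\;\vdots \\ \frac{\mathrm{d}\phi_{N-1}}{\mathrm{d}t} &= \frac{p}{\Delta x^2}\phi_N - \frac{2p}{\Delta x^2}\phi_{N-1} + \frac{p}{\Delta x^2}\phi_{N-2} + u_{N-1}(t), \\ \frac{\mathrm{d}\phi_N}{\mathrm{d}t} &= \frac{p}{\Delta x^2}\phi_{N+1} - \frac{2p}{\Delta x^2}\phi_N + \frac{p}{\Delta x^2}\phi_{N-1} + u_N(t). \end{aligned} \tag{14}$$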
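The semi-discretization in Equations (13) and (14) can be sketched as follows. The snippet assembles the tridiagonal second-difference matrix for the interior nodes, assumes zero boundary values ($\phi_0 = \phi_{N+1} = 0$), and integrates the resulting ODE system with explicit Euler steps. The grid, the initial profile $\sin(2\pi x)$, $p = 0.1$, and $t \in [0, 2]$ follow the simulation setting of the case study. The time step, the integrator, and the function names are assumptions made for this sketch.

```python
import numpy as np

def second_difference_matrix(n, dx):
    """Tridiagonal matrix of the finite-difference stencil in Equation (13)
    for n interior nodes with zero boundary values."""
    off = np.ones(n - 1)
    return (np.diag(-2.0 * np.ones(n)) + np.diag(off, 1) + np.diag(off, -1)) / dx**2

def simulate_diffusion(p, phi0, dx, t_end, dt, u=None):
    """Explicit Euler integration of the ODE system in Equation (14)."""
    n = phi0.size
    D2 = second_difference_matrix(n, dx)
    u = np.zeros(n) if u is None else u   # constant distributed input (zero by default)
    phi = phi0.copy()
    for _ in range(int(round(t_end / dt))):
        phi = phi + dt * (p * (D2 @ phi) + u)
    return phi

dx, p = 0.01, 0.1
x = np.linspace(0.0, 1.0, 101)[1:-1]      # 99 interior grid points
phi0 = np.sin(2.0 * np.pi * x)            # initial profile
dt = 2e-4                                 # below the Euler stability limit dx**2 / (2 * p)
phi_end = simulate_diffusion(p, phi0, dx, t_end=2.0, dt=dt)

# Reference: solution of the continuous PDE for this initial profile and zero input.
phi_exact = np.exp(-4.0 * np.pi**2 * p * 2.0) * phi0
```

For stiff or finer grids, an implicit or adaptive integrator would be the more robust choice. Explicit Euler is used here only because it keeps the correspondence with Equation (14) visible.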
In practice, the parameter identification problem for this academic case study is to determine the diffusion parameter $p$ so that the model accurately represents the physical process. This can be challenging, as even minor changes in the coefficient value can result in significant changes in the solution behavior or, conversely, large changes in the coefficient may barely affect the solution. However, it is often possible to obtain reasonable estimates for the diffusion parameter with careful analysis and experimentation [2,18], including the proposed concept of combining systems theory, in the form of differential flatness, with deep learning. In particular, when assuming that all states in Equation (14) are measurable, i.e., $y_i = \phi_i$, $1 \le i \le N$, and that the output derivatives $\dot{y}_i$, $1 \le i \le N$, exist, the related equation system (Equation (15)) can be solved for the input variables $u_i$, $1 \le i \le N$.
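$$\begin{bmatrix} \dot{y}_1(t) \\ \dot{y}_2(t) \\ \vdots \\ \dot{y}_{N-1}(t) \\ \dot{y}_N(t) \end{bmatrix} = \begin{bmatrix} -\frac{2p}{\Delta x^2} & \frac{p}{\Delta x^2} & 0 & \cdots & 0 & 0 \\ \frac{p}{\Delta x^2} & -\frac{2p}{\Delta x^2} & \frac{p}{\Delta x^2} & \cdots & 0 & 0 \\ \vdots & \ddots & \ddots & \ddots & & \vdots \\ 0 & 0 & \cdots & \frac{p}{\Delta x^2} & -\frac{2p}{\Delta x^2} & \frac{p}{\Delta x^2} \\ 0 & 0 & \cdots & 0 & \frac{p}{\Delta x^2} & -\frac{2p}{\Delta x^2} \end{bmatrix} \begin{bmatrix} y_1(t) \\ y_2(t) \\ \vdots \\ y_{N-1}(t) \\ y_N(t) \end{bmatrix} + \begin{bmatrix} 1 & 0 & \cdots & 0 & 0 \\ 0 & 1 & \cdots & 0 & 0 \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \cdots & 1 & 0 \\ 0 & 0 & \cdots & 0 & 1 \end{bmatrix} \begin{bmatrix} u_1(t) \\ u_2(t) \\ \vdots \\ u_{N-1}(t) \\ u_N(t) \end{bmatrix}. \tag{15}$$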
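Since the input matrix in Equation (15) is the identity, the system can be solved for the inputs directly, $u = \dot{y} - p\,D\,y$, where $D$ denotes the tridiagonal second-difference matrix (the symbol $D$ is introduced here for illustration). The sketch below applies this inversion to a desired stationary profile and then recovers $p$ from the resulting input with a closed-form ILS estimate, which is possible because $u$ is linear in $p$. Exact outputs and output derivatives are assumed to be available, whereas the paper obtains them from a neural ODE surrogate. All names and the noise-free data are assumptions.

```python
import numpy as np

dx = 0.01
x = np.linspace(0.0, 1.0, 101)[1:-1]     # interior grid points
n = x.size
off = np.ones(n - 1)
# Second-difference matrix D such that the system matrix in Equation (15) is p * D.
D = (np.diag(-2.0 * np.ones(n)) + np.diag(off, 1) + np.diag(off, -1)) / dx**2

def flat_input(y, y_dot, p):
    """Invert Equation (15) for the inputs: u = y_dot - p * D @ y."""
    return y_dot - p * (D @ y)

def ils_estimate(u_data, y, y_dot):
    """Closed-form minimizer of the ILS objective (Equation (4)) for this linear-in-p case."""
    regressor = -(D @ y)                 # du/dp
    return float(regressor @ (u_data - y_dot) / (regressor @ regressor))

# Desired output: the initial profile is held constant over time (zero time derivative).
y_des = np.sin(2.0 * np.pi * x)
y_dot_des = np.zeros(n)

p_true = 0.1
u_data = flat_input(y_des, y_dot_des, p_true)   # stands in for recorded inputs
p_hat = ils_estimate(u_data, y_des, y_dot_des)

# Check: with this input, the right-hand side of Equation (15) vanishes.
residual = p_true * (D @ y_des) + u_data
```

With several time samples, the same estimate is obtained by stacking the regressors and residuals of all sampling times before taking the inner products.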
Moreover, when dealing with measurement data instead of the full model system (Equation (15)), the neural ODE framework (Equation (11)) is used as a surrogate model to approximate the output functions and their derivatives. In particular, the following multilayer perceptron (MLP) setting is used: two hidden layers with 400 nodes each; input and output layers with 198 nodes each; tanh as the activation function. To train the resulting neural ODE system, simulated data with $t_{i+1} - t_i = 0.1$, $\Delta x = 0.01$, $t \in [0, 2]$, $x \in [0, 1]$, $\phi(t_0, x) = \sin(2\pi x)$, $\phi(t, x = 0) = 0$, $\phi(t, x = 1) = 0$, and $p = 0.1$ are used in dimensionless form. Figure 1a shows the model response without any distributed control action, i.e., $u(t) = 0$. The diffusion effect can be seen very clearly, as the initial differences along the spatial axis at the start time, $y(t_0, x)$, decrease over the simulation time. Accordingly, a different diffusion parameter $p$ would lead to a different decay profile, reflecting the corresponding sensitivity of the model. Note that this parameter sensitivity allows, in principle, a practical identification of the model parameter when applying OLS with experimental data and Equation (2).
Alternatively, the differential flatness approach can be used to impose a desired process behavior. To this end, the required inputs, which depend on the parameter, are calculated and can then be used for parameter estimation following the ILS concept and Equation (4). Figure 1b, for example, shows an output profile that does not change over the course of the simulation. Note that the corresponding input profile to achieve the desired control was determined using the flatness concept combined with the neural ODE system and the specified training setting.
The reconstructed input and the generated output data from Figure 1b were used for the sensitivity analyses. Like the output profile, the calculated input profile depends on the diffusion parameter and is thus sensitive to its variation. The sensitivities of the output data and the generated control input with respect to a variation of the diffusion parameter $p$ are analyzed using Equations (5) and (6). The resulting sensitivity plots are shown in Figure 2. Here, the output parameter sensitivity (Figure 2a) is zero at the starting time, and its absolute values increase over time, most strongly at $x = 0.25$ and $x = 0.75$. In contrast, the input parameter sensitivities (see Figure 2b) are at their peak from the starting time onward, and their absolute values are 3–4 times higher than the output parameter sensitivities. Moreover, as mentioned in Section 2, higher parameter sensitivities, in turn, favor better parameter estimates. Consequently, the results suggest that the parameter estimation can be performed more precisely using the control input (based on ILS) generated by combining the flatness property with the neural ODE concept.
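The comparison of output and input sensitivities (Equations (5) and (6)) can be sketched with the finite-difference model itself. The snippet below is an idealized illustration that uses the model equations directly rather than the trained neural ODE surrogate, so its values need not coincide with Figure 2. It assumes that the input is fixed at the profile that holds $\sin(2\pi x)$ stationary for the nominal $p = 0.1$, computes the output sensitivity at the final time by central finite differences, and evaluates the input sensitivity analytically from $u = \dot{y} - p\,D\,y$. The step sizes and all names are assumptions.

```python
import numpy as np

dx, p0, t_end, dt = 0.01, 0.1, 2.0, 2e-4
x = np.linspace(0.0, 1.0, 101)[1:-1]     # interior grid points
n = x.size
off = np.ones(n - 1)
D = (np.diag(-2.0 * np.ones(n)) + np.diag(off, 1) + np.diag(off, -1)) / dx**2

y0 = np.sin(2.0 * np.pi * x)             # profile to be held constant
u_hold = -p0 * (D @ y0)                  # input computed for the nominal parameter p0

def simulate(p, u):
    """Explicit Euler integration of dy/dt = p * D @ y + u up to t_end."""
    y = y0.copy()
    for _ in range(int(round(t_end / dt))):
        y = y + dt * (p * (D @ y) + u)
    return y

# Output sensitivity dy/dp at the final time (Equation (5)), input held fixed.
dp = 1e-4
s_y = (simulate(p0 + dp, u_hold) - simulate(p0 - dp, u_hold)) / (2.0 * dp)

# Input sensitivity du/dp (Equation (6)) for the stationary desired profile.
s_u = -(D @ y0)

ratio = np.max(np.abs(s_u)) / np.max(np.abs(s_y))
x_peak = x[np.argmax(np.abs(s_y))]       # location of the largest output sensitivity
```

In this idealized setting, the largest output sensitivities occur at $x = 0.25$ and $x = 0.75$, and the input sensitivity exceeds the output sensitivity at the final time by a factor of roughly 4.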

4. Conclusions

Parameter identification is a fundamental problem in systems and control theory. This work demonstrated that a parameter identification problem that evaluates input least squares (ILS) instead of ordinary least squares (OLS) results in different parameter sensitivities and, in this particular case, an improved parameter sensitivity range. Here, our original contribution is the combination of advanced systems theory concepts (i.e., differential flatness) and recent developments in data science, namely neural ordinary differential equations. We applied our method to synthetically generated data and showed that the ILS-related sensitivities of the diffusion parameter cover a significantly higher range than the OLS-related sensitivities. The improved parameter sensitivity range suggests that ILS may result in better parameter estimates than OLS, although this might critically depend on the neural ODE system setting and the data quality; both aspects are addressed in ongoing research. Future work will also focus on advanced model inversion schemes that are not limited to differentially flat systems.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/ECP2022-12686/s1.

Author Contributions

Conceptualization, R.S.; methodology, S.S. (Subiksha Selvarajan); implementation, S.S. (Subiksha Selvarajan), A.A.T. and R.S.; validation, R.S.; writing – original draft preparation, A.A.T.; writing – review and editing, S.S. (Subiksha Selvarajan), A.A.T., C.H., S.S. (Stephan Scholl) and R.S.; visualization, S.S. (Subiksha Selvarajan); supervision, R.S.; funding acquisition, R.S. and S.S. (Stephan Scholl). All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the German Research Foundation, grant number 444703025.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Walter, E.; Pronzato, L. Identification of Parametric Models from Experimental Data; Springer: New York, NY, USA, 1997; ISBN 9783540761198.
  2. Barz, T.; López C., D.C.; Cruz-Bournazou, M.N.; Körkel, S.; Walter, S.F. Real-time adaptive input design for the determination of competitive adsorption isotherms in liquid chromatography. Comput. Chem. Eng. 2016, 94, 104–116.
  3. Abt, V.; Barz, T.; Cruz-Bournazou, M.N.; Herwig, C.; Kroll, P.; Möller, J.; Pörtner, R.; Schenkendorf, R. Model-based tools for optimal experiments in bioprocess engineering. Curr. Opin. Chem. Eng. 2018, 22, 244–252.
  4. Fliess, M.; Levine, J.; Martin, P.; Rouchon, P. Flatness and defect of non-linear systems: Introductory theory and examples. Int. J. Control 1995, 61, 1327–1361.
  5. Rigatos, G.G. Nonlinear Control and Filtering Using Differential Flatness Approaches; Studies in Systems, Decision and Control; Springer International Publishing: Cham, Switzerland, 2015; Volume 25, ISBN 978-3-319-16419-9.
  6. Schenkendorf, R.; Mangold, M. Parameter identification for ordinary and delay differential equations by using flat inputs. Theor. Found. Chem. Eng. 2014, 48, 594–607.
  7. Liu, J.; Mendoza, S.; Li, G.; Fathy, H. Efficient total least squares state and parameter estimation for differentially flat systems. In Proceedings of the 2016 American Control Conference (ACC), Boston, MA, USA, 6–8 July 2016.
  8. Liu, J.; Li, G.; Fathy, H.K. A Computationally Efficient Approach for Optimizing Lithium-Ion Battery Charging. J. Dyn. Syst. Meas. Control 2015, 138, 021009.
  9. Meurer, T. Flatness-based trajectory planning for diffusion–reaction systems in a parallelepipedon: A spectral approach. Automatica 2011, 47, 935–949.
  10. Kater, A.; Meurer, T. Motion planning and tracking control for coupled flexible beam structures. Control Eng. Pract. 2019, 84, 389–398.
  11. Meurer, T. Control of Higher–Dimensional PDEs; Communications and Control Engineering; Springer: Berlin/Heidelberg, Germany, 2013; ISBN 978-3-642-30014-1.
  12. Lee, K.; Parish, E.J. Parameterized neural ordinary differential equations: Applications to computational physics problems. Proc. R. Soc. A Math. Phys. Eng. Sci. 2021, 477, 1–19.
  13. Rackauckas, C.; Ma, Y.; Martensen, J.; Warner, C.; Zubov, K.; Supekar, R.; Skinner, D.; Ramadhan, A.; Edelman, A. Universal Differential Equations for Scientific Machine Learning. arXiv Prepr. 2020, arXiv:2001.04385.
  14. Massaroli, S.; Poli, M.; Park, J.; Yamashita, A.; Asama, H. Dissecting Neural ODEs. Adv. Neural Inf. Process. Syst. 2020, 33, 3952–3963.
  15. Sharma, N.; Liu, Y.A. A hybrid science-guided machine learning approach for modeling chemical processes: A review. AIChE J. 2022, 68, e17609.
  16. Wang, G.; Briskot, T.; Hahn, T.; Baumann, P.; Hubbuch, J. Estimation of adsorption isotherm and mass transfer parameters in protein chromatography using artificial neural networks. J. Chromatogr. A 2017, 1487, 211–217.
  17. Kubrusly, C.S. Distributed parameter system identification: A survey. Int. J. Control 1977, 26, 509–535.
  18. Gehring, N.; Rudolph, J. An algebraic algorithm for parameter identification in a class of systems described by linear partial differential equations. PAMM 2016, 16, 39–42.
  19. Grimard, J.; Dewasme, L.; Vande Wouwer, A. A Review of Dynamic Models of Hot-Melt Extrusion. Processes 2016, 4, 19.
Figure 1. Model response when: (a) zero input profile is applied (i.e., zero control action); (b) dedicated input profile is applied to compensate for the diffusion effect.
Figure 2. Sensitivity of the diffusion parameter p: (a) using OLS to define the parameter identification problem; (b) using ILS to define the parameter identification problem.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Selvarajan, S.; Tappe, A.A.; Heiduk, C.; Scholl, S.; Schenkendorf, R. Parameter Identification Concept for Process Models Combining Systems Theory and Deep Learning. Eng. Proc. 2022, 19, 27. https://doi.org/10.3390/ECP2022-12686

AMA Style

Selvarajan S, Tappe AA, Heiduk C, Scholl S, Schenkendorf R. Parameter Identification Concept for Process Models Combining Systems Theory and Deep Learning. Engineering Proceedings. 2022; 19(1):27. https://doi.org/10.3390/ECP2022-12686

Chicago/Turabian Style

Selvarajan, Subiksha, Aike Aline Tappe, Caroline Heiduk, Stephan Scholl, and René Schenkendorf. 2022. "Parameter Identification Concept for Process Models Combining Systems Theory and Deep Learning" Engineering Proceedings 19, no. 1: 27. https://doi.org/10.3390/ECP2022-12686
