# Towards Reliable Parameter Extraction in MEMS Final Module Testing Using Bayesian Inference


## Abstract


## 1. Introduction

#### 1.1. Parameter Extraction from Dynamic MEMS Tests

#### 1.2. Uncertainty in MEMS Testing

## 2. Related Work

## 3. Methods for Uncertainty Quantification

#### 3.1. Network Architectures of BNN, MDN, PBNN and BayesFlow

#### 3.1.1. Bayesian Neural Networks

#### 3.1.2. Mixture Density Networks

#### 3.1.3. Probabilistic Bayesian Neural Networks

#### 3.1.4. BayesFlow

**ψ** and **ϕ**, respectively. Both parts can be optimized jointly via backpropagation by minimizing the KL divergence between the true and the model-induced posterior of $\mathit{\theta}$. Then, the objective function can be written as follows:
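As a hedged sketch of what such an objective generally looks like, the KL criterion described above can be written in generic notation. The symbols ${h}_{\mathit{\psi}}$ (summary network) and ${p}_{\mathit{\phi}}$ (posterior induced by the cINN) are assumptions introduced here for illustration. Because the true posterior does not depend on the network parameters, minimizing the expected KL divergence is equivalent to maximizing the expected log-density of the true parameters under the model:

$$\widehat{\mathit{\phi}},\widehat{\mathit{\psi}}=\underset{\mathit{\phi},\mathit{\psi}}{\mathrm{arg\,min}}\;{\mathbb{E}}_{p\left(\mathbf{x}\right)}\left[\mathrm{KL}\left(p\left(\mathit{\theta}\mid \mathbf{x}\right)\,\|\,{p}_{\mathit{\phi}}\left(\mathit{\theta}\mid {h}_{\mathit{\psi}}\left(\mathbf{x}\right)\right)\right)\right]=\underset{\mathit{\phi},\mathit{\psi}}{\mathrm{arg\,max}}\;{\mathbb{E}}_{p\left(\mathbf{x},\mathit{\theta}\right)}\left[\mathrm{log}\,{p}_{\mathit{\phi}}\left(\mathit{\theta}\mid {h}_{\mathit{\psi}}\left(\mathbf{x}\right)\right)\right].$$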

#### 3.2. Metrics

1. Regression accuracy between the estimates and the ground truth of the model parameters is measured with two standard metrics, the normalized root mean squared error (NRMSE) and the coefficient of determination (${R}^{2}$). In practice, the maximum absolute error (MAXAE) and the mean absolute error (MAE) are also important. For a group of estimated parameters ${\{{\widehat{\theta}}^{\left(m\right)}\}}_{m=1}^{M}$ and true parameters ${\left\{{\theta}^{\left(m\right)}\right\}}_{m=1}^{M}$, with $\overline{\theta}$ the mean of the true parameters and ${\theta}_{max}-{\theta}_{min}$ the parameter range used for normalization, these metrics are calculated as follows: $$NRMSE=\frac{\sqrt{\frac{1}{M}{\sum}_{m=1}^{M}{({\theta}^{\left(m\right)}-{\widehat{\theta}}^{\left(m\right)})}^{2}}}{{\theta}_{max}-{\theta}_{min}},$$ $${R}^{2}=1-\frac{{\sum}_{m=1}^{M}{({\theta}^{\left(m\right)}-{\widehat{\theta}}^{\left(m\right)})}^{2}}{{\sum}_{m=1}^{M}{({\theta}^{\left(m\right)}-\overline{\theta})}^{2}},$$ $$MAXAE=\underset{m=1,...,M}{max}|{\theta}^{\left(m\right)}-{\widehat{\theta}}^{\left(m\right)}|,$$ $$MAE=\frac{1}{M}\sum _{m=1}^{M}|{\theta}^{\left(m\right)}-{\widehat{\theta}}^{\left(m\right)}|.$$
2. Precision and reliability of the estimation are assessed with the normalized mean confidence interval width (NMCIW) and the confidence interval coverage probability (CICP) at the 95% confidence level. A higher CICP indicates a more reliable estimate, whereas a smaller NMCIW indicates a more precise one. For the m-th parameter, sampled T times with standard deviation ${\sigma}_{{\widehat{\theta}}^{\left(m\right)}}$, the 95% confidence interval (CI) with lower limit ${L}_{m}$ and upper limit ${U}_{m}$ is $$C{I}^{\left(m\right)}(95\%)=[{L}_{m},{U}_{m}]={\widehat{\theta}}^{\left(m\right)}\pm 1.96{\sigma}_{{\widehat{\theta}}^{\left(m\right)}}.$$ Then, the CICP and NMCIW for the entire group of parameters are $$CICP=\frac{1}{M}\sum _{m=1}^{M}{c}_{m},\quad \mathrm{with}\quad {c}_{m}=\begin{cases}0 & \mathrm{if}\ {\theta}^{\left(m\right)}\notin [{L}_{m},{U}_{m}],\\ 1 & \mathrm{otherwise},\end{cases}$$ $$NMCIW=\frac{1}{M}\sum _{m=1}^{M}\frac{{U}_{m}-{L}_{m}}{{\theta}_{max}-{\theta}_{min}}.$$
3. The uncertainty of the posterior distribution is assessed with the negative log-likelihood (NLL), which is calculated by assuming the posterior to be Gaussian distributed, which is guaranteed for MDN and PBNN. Averaging the Gaussian NLL over the M data samples gives $$NLL=\frac{1}{M}\sum _{m=1}^{M}\frac{1}{2}\left(\mathrm{log}{\sigma}_{{\widehat{\theta}}^{\left(m\right)}}^{2}+\frac{{({\widehat{\theta}}^{\left(m\right)}-{\theta}^{\left(m\right)})}^{2}}{{\sigma}_{{\widehat{\theta}}^{\left(m\right)}}^{2}}\right)+\frac{1}{2}\mathrm{log}\left(2\pi \right).$$
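The three groups of metrics can be sketched in a few lines of NumPy. This is a minimal illustration rather than the evaluation code used for the experiments: the function names are hypothetical, ${\theta}_{max}-{\theta}_{min}$ is assumed to be the range of the true values in the evaluated group, and ${R}^{2}$ is computed in its standard form as one minus the ratio of the residual sum of squares to the total sum of squares.

```python
import numpy as np


def accuracy_metrics(theta_true, theta_hat):
    """NRMSE, R^2, MAXAE and MAE for one parameter over M test samples."""
    theta_true = np.asarray(theta_true, dtype=float)
    theta_hat = np.asarray(theta_hat, dtype=float)
    err = theta_true - theta_hat
    # Assumption: theta_max - theta_min is the range of the true values.
    value_range = theta_true.max() - theta_true.min()
    ss_res = np.sum(err ** 2)
    ss_tot = np.sum((theta_true - theta_true.mean()) ** 2)
    return {
        "NRMSE": float(np.sqrt(np.mean(err ** 2)) / value_range),
        "R2": float(1.0 - ss_res / ss_tot),
        "MAXAE": float(np.max(np.abs(err))),
        "MAE": float(np.mean(np.abs(err))),
    }


def uncertainty_metrics(theta_true, theta_hat, sigma):
    """CICP and NMCIW at the 95% level, plus the mean Gaussian NLL."""
    theta_true = np.asarray(theta_true, dtype=float)
    theta_hat = np.asarray(theta_hat, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    # 95% interval: estimate +/- 1.96 standard deviations.
    lower = theta_hat - 1.96 * sigma
    upper = theta_hat + 1.96 * sigma
    covered = (theta_true >= lower) & (theta_true <= upper)
    value_range = theta_true.max() - theta_true.min()
    nll = np.mean(
        0.5 * (np.log(sigma ** 2) + (theta_hat - theta_true) ** 2 / sigma ** 2)
    ) + 0.5 * np.log(2.0 * np.pi)
    return {
        "CICP": float(covered.mean()),
        "NMCIW": float(np.mean((upper - lower) / value_range)),
        "NLL": float(nll),
    }


# Toy data (made up for illustration only).
theta_true = np.array([1.0, 2.0, 3.0, 4.0])
theta_hat = np.array([1.1, 1.9, 3.2, 3.8])
sigma = np.full(4, 0.1)

acc = accuracy_metrics(theta_true, theta_hat)
unc = uncertainty_metrics(theta_true, theta_hat, sigma)
print(acc)
print(unc)
```

In the toy data, two of the four true values fall outside their intervals, so the CICP is 0.5 even though ${R}^{2}$ is high. Accuracy and reliability are therefore reported separately.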

## 4. Experiments

#### 4.1. Data Sets and Preprocessing

#### 4.2. Implementation and Training of ML Models

## 5. Comparison and Evaluation of the UQ Methods

#### 5.1. Evaluation on Simulated MEMS Devices

#### 5.2. Influence of Varied Training Set Size

#### 5.3. Performance on Noisy Data

#### 5.4. Performance on Higher Damping Factors

#### 5.5. Overall Comparison of Predictive Performance and Uncertainty Estimates

## 6. Discussion

## 7. Conclusions

## Author Contributions

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Data Availability Statement

## Acknowledgments

## Conflicts of Interest

## Abbreviations

Abbreviation | Definition |
---|---|
ASIC | application-specific integrated circuit |
BNN | Bayesian neural network |
CI | confidence interval |
CICP | confidence interval coverage probability |
cINN | conditional invertible neural network |
DNN | deep neural network |
DUT | device under test |
ELBO | evidence lower bound |
FT | final test |
GMM | Gaussian mixture model |
GP | Gaussian process |
KL | Kullback–Leibler |
LSTM | long short-term memory network |
MAE | mean absolute error |
MARS | multivariate adaptive regression splines |
MAXAE | maximum absolute error |
MCMC | Markov chain Monte Carlo |
MDN | mixture density network |
MEMS | micro-electro-mechanical system |
ML | machine learning |
MLE | maximum likelihood estimation |
NLL | negative log-likelihood |
NMCIW | normalized mean confidence interval width |
NN | neural network |
NRMSE | normalized root mean squared error |
ODE | ordinary differential equation |
OOD | out-of-distribution |
PBNN | probabilistic Bayesian neural network |
ROM | reduced order model |
SVI | stochastic variational inference |
SVM | support vector machine |
UQ | uncertainty quantification |
VI | variational inference |
WLT | wafer-level test |

## Appendix A

Parameter | ResNet | BNN | MDN | PBNN | BayesFlow |
---|---|---|---|---|---|
${D}_{L}$ | 0.754 | 0.983 | 0.923 | 0.979 | 0.981 |
${f}_{0}$ | 0.866 | 0.987 | 0.973 | 0.993 | 0.998 |
m | 0.694 | 0.976 | 0.931 | 0.977 | 0.997 |
${d}_{off}$ | 0.988 | 0.988 | 0.987 | 0.988 | 1 |
${p}_{1}$ | 0.821 | 0.974 | 0.940 | 0.979 | 0.998 |
${p}_{2}$ | 0.876 | 0.987 | 0.973 | 0.992 | 0.998 |

Parameter | ResNet | BNN | MDN | PBNN | BayesFlow |
---|---|---|---|---|---|
${D}_{L}$ | 0.397 | 0.0802 | 0.147 | 0.0865 | 0.0969 |
${f}_{0}$ | 0.275 | 0.0898 | 0.122 | 0.0573 | 0.0367 |
m | 0.342 | 0.0992 | 0.167 | 0.0870 | 0.0354 |
${d}_{off}$ | 0.156 | 0.0821 | 0.0834 | 0.0777 | 0.00256 |
${p}_{1}$ | 0.279 | 0.115 | 0.173 | 0.0971 | 0.0314 |
${p}_{2}$ | 0.267 | 0.0885 | 0.121 | 0.0600 | 0.0373 |

Parameter | ResNet | BNN | MDN | PBNN | BayesFlow |
---|---|---|---|---|---|
${D}_{L}$ | 1.159 | 0.293 | 0.792 | 0.459 | 0.261 |
${f}_{0}$ | 0.984 | 0.251 | 0.495 | 0.307 | 0.0993 |
m | 1.304 | 0.359 | 0.634 | 0.477 | 0.123 |
${d}_{off}$ | 0.361 | 0.243 | 0.201 | 0.276 | 0.0215 |
${p}_{1}$ | 1.147 | 0.383 | 0.692 | 0.458 | 0.104 |
${p}_{2}$ | 0.968 | 0.255 | 0.491 | 0.320 | 0.101 |


**Figure 1.** BayesFlow architecture. $\mathit{\psi}$ contains the network parameters of the summary network, $\mathit{\varphi}$ the network parameters of the cINN. The blue arrows stand for the forward process; the red arrows stand for the inverse process.

**Figure 3.** Performance of ResNet PBNN on test set for ${D}_{L}$: (**a**) scatter plot with 95% CI as error bars; (**b**) density plot for test observations composed of 100 evaluations of one sample. The highlighted area represents the 95% CI, the dashed line the target, and the solid line the mean estimate.

**Figure 4.** (**a**) NRMSE and (**b**) NMCIW for BNN prediction of ${D}_{L}$ on the test set using different KL weights during training.

**Figure 5.** (**a**) NRMSE and (**b**) NMCIW of PBNN in prediction of ${D}_{L}$ with varied number of forward passes. Error bars show the standard deviation over 10 iterations.

**Figure 6.** (**a**) Aleatoric and (**b**) epistemic uncertainty with number of prediction passes. Error bars show the average over standard deviations.

**Figure 7.** (**a**) Aleatoric and (**b**) epistemic uncertainty components of PBNN under variation of the training set size and the KL weight.

**Figure 8.** (**a**) NRMSE, (**b**) NMCIW, and (**c**) CICP plotted for each magnitude of white noise added to the test set in the prediction of ${D}_{L}$. All noise variants were included in the training set.

**Figure 9.** NMCIW and NRMSE in prediction of ${D}_{L}$ and ${f}_{0}$ with model trained only on samples from condition $\mathcal{A}$. $\mathcal{A}$, $\mathcal{B}$, and $\mathcal{C}$ denote the three distributions over the true ${D}_{L}$ with increasing means. (**a**) NRMSE for ${D}_{L}$, (**b**) NRMSE for ${f}_{0}$, (**c**) NMCIW for ${D}_{L}$, (**d**) NMCIW for ${f}_{0}$.

**Figure 10.** Radar charts summarizing the findings on the test sets with respect to the predictive performance and consistency of uncertainty estimates for the prediction of ${D}_{L}$ of BNN, MDN, PBNN, and BayesFlow. (**a**) 200 training samples, (**b**) 100 training samples, (**c**) 200 noisy training samples, noise amplitude of 0.05 in test set, (**d**) 200 training samples from damping condition $\mathcal{A}$ evaluated on $\mathcal{C}$ with logarithmically scaled axes.

Parameter | ResNet | BNN | MDN | PBNN | BayesFlow |
---|---|---|---|---|---|
${D}_{L}$ | 0.0893 | 0.0235 | 0.0500 | 0.0264 | 0.0251 |
${f}_{0}$ | 0.0883 | 0.0273 | 0.0394 | 0.0207 | 0.0111 |
m | 0.102 | 0.0286 | 0.0486 | 0.0280 | 0.0100 |
${d}_{off}$ | 0.047 | 0.0269 | 0.0269 | 0.0260 | 0.00116 |
${p}_{1}$ | 0.087 | 0.0332 | 0.0501 | 0.0300 | 0.00870 |
${p}_{2}$ | 0.085 | 0.0275 | 0.0394 | 0.0214 | 0.0111 |
${\overline{NRMSE}}_{test}$ | 0.0831 | 0.0278 | 0.0424 | 0.0254 | 0.0112 |

Metric | Parameter | BNN | MDN | PBNN | BayesFlow |
---|---|---|---|---|---|
NMCIW | ${D}_{L}$ | 0.174 | 0.135 | 0.261 | 0.0823 |
NMCIW | ${f}_{0}$ | 0.199 | 0.157 | 0.269 | 0.0442 |
NMCIW | m | 0.182 | 0.140 | 0.285 | 0.0405 |
NMCIW | ${d}_{off}$ | 0.234 | 0.145 | 0.252 | 0.00237 |
NMCIW | ${p}_{1}$ | 0.179 | 0.141 | 0.292 | 0.0363 |
NMCIW | ${p}_{2}$ | 0.198 | 0.156 | 0.270 | 0.0444 |
CICP | ${D}_{L}$ | 1.0 | 0.86 | 1.0 | 0.90 |
CICP | ${f}_{0}$ | 1.0 | 0.94 | 1.0 | 0.90 |
CICP | m | 0.98 | 0.84 | 1.0 | 0.90 |
CICP | ${d}_{off}$ | 1.0 | 1.0 | 1.0 | 0.90 |
CICP | ${p}_{1}$ | 0.98 | 0.86 | 1.0 | 0.96 |
CICP | ${p}_{2}$ | 1.0 | 0.94 | 1.0 | 0.90 |

**Table 3.** NRMSE, NMCIW, and CICP for the prediction of ${D}_{L}$ on the test set with models trained on 100, 200, and 400 samples.

Training Samples | Metric | BNN | MDN | PBNN | BayesFlow |
---|---|---|---|---|---|
100 | NRMSE | 0.0431 | 0.0779 | 0.0511 | 0.0758 |
100 | NMCIW | 0.196 | 0.165 | 0.340 | 0.247 |
100 | CICP | 0.96 | 0.74 | 0.98 | 0.90 |
200 | NRMSE | 0.0235 | 0.0500 | 0.0264 | 0.0251 |
200 | NMCIW | 0.174 | 0.135 | 0.261 | 0.0823 |
200 | CICP | 1.0 | 0.86 | 1.0 | 0.90 |
400 | NRMSE | 0.0242 | 0.0328 | 0.0228 | 0.0114 |
400 | NMCIW | 0.190 | 0.122 | 0.243 | 0.00332 |
400 | CICP | 1.0 | 0.98 | 1.0 | 0.147 |

**Table 4.** Aleatoric and epistemic uncertainty of the PBNN in the prediction of ${D}_{L}$ from perturbed time series with the PBNN trained on samples from all noise domains.

Noise Magnitude | 0.0 | 0.01 | 0.025 | 0.05 |
---|---|---|---|---|
${\sigma}_{a}^{2}$ | 0.479 | 1.028 | 1.032 | 1.069 |
SD | 0.108 | 0.104 | 0.103 | 0.134 |
Percentage of $\mathbb{V}[\mathit{\theta}\mid \mathbf{x}]$ | 82.0% | 91.4% | 94.1% | 93.5% |
${\sigma}_{e}^{2}$ | 0.105 | 0.0968 | 0.0650 | 0.0746 |
SD | 0.0387 | 0.0291 | 0.0261 | 0.0194 |
Percentage of $\mathbb{V}[\mathit{\theta}\mid \mathbf{x}]$ | 18.0% | 8.6% | 5.9% | 6.5% |
$\mathbb{V}[\mathit{\theta}\mid \mathbf{x}]$ | 0.584 | 1.125 | 1.097 | 1.1436 |
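In the table, the total predictive variance equals the sum of the aleatoric and epistemic components (for example, 0.479 + 0.105 = 0.584 at noise magnitude 0.0). A common way to obtain such a split from repeated stochastic forward passes is the law of total variance: the aleatoric part is the mean of the predicted variances, and the epistemic part is the variance of the predicted means. The sketch below assumes that decomposition and uses hypothetical function names; it is not necessarily the exact procedure behind the reported numbers.

```python
import numpy as np


def decompose_uncertainty(pass_means, pass_variances):
    """Split predictive variance over T stochastic forward passes.

    pass_means, pass_variances: arrays of shape (T,) holding the Gaussian
    mean and variance predicted in each pass for one test observation.
    """
    pass_means = np.asarray(pass_means, dtype=float)
    pass_variances = np.asarray(pass_variances, dtype=float)
    aleatoric = float(pass_variances.mean())  # E_t[sigma_t^2]
    epistemic = float(pass_means.var())       # Var_t[mu_t], population variance
    return aleatoric, epistemic, aleatoric + epistemic


def variance_shares(aleatoric, epistemic):
    """Percentage of the total variance attributed to each component."""
    total = aleatoric + epistemic
    return 100.0 * aleatoric / total, 100.0 * epistemic / total


# Toy forward passes (made up for illustration only).
a, e, total = decompose_uncertainty([1.0, 1.2, 0.8], [0.5, 0.4, 0.6])
print(a, e, total)

# Shares for the noise-free column of the table (0.479 and 0.105).
share_a, share_e = variance_shares(0.479, 0.105)
print(round(share_a, 1), round(share_e, 1))
```

Applied to the noise-free column, the share computation gives the 82.0% and 18.0% listed in the table.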

**Table 5.** NRMSE, NMCIW, and CICP on test samples for BNN, MDN, PBNN, and a BayesFlow model trained on samples from all three ${D}_{L}$ modes $\mathcal{A}$, $\mathcal{B}$, and $\mathcal{C}$. The metrics evaluated on the test set are given for ${D}_{L}$ and the average over the parameters ${f}_{0}$, m, ${d}_{off}$, ${p}_{1}$, and ${p}_{2}$, which were not subject to a shift in the distribution.

Metric | Parameter(s) | BNN | MDN | PBNN | BayesFlow |
---|---|---|---|---|---|
NRMSE | ${D}_{L}$ | 0.0391 | 0.0572 | 0.0391 | 0.0134 |
NRMSE | all w/o ${D}_{L}$ | 0.0511 | 0.110 | 0.0817 | 0.0215 |
NMCIW | ${D}_{L}$ | 0.249 | 0.143 | 0.357 | 0.0388 |
NMCIW | all w/o ${D}_{L}$ | 0.193 | 0.299 | 0.346 | 0.0635 |
CICP | ${D}_{L}$ | 1 | 1 | 1 | 0.80 |
CICP | all w/o ${D}_{L}$ | 0.92 | 0.88 | 0.95 | 0.79 |

**Table 6.** Ranking of methods based on the area covered by the triangles built from the NRMSE, NMCIW, and 1 − CICP values in the radar charts shown in Figure 10. Smaller areas are considered superior.

Method | 200 Training Samples | 100 Training Samples | 200 Noisy Training Samples, Noise Amplitude of 0.05 in Test Set | 200 Training Samples from Damping Condition $\mathcal{A}$ Evaluated on $\mathcal{C}$ |
---|---|---|---|---|
BNN | 0.00177 | 0.00780 | 0.297 | 2.69 |
MDN | 0.0141 | 0.0329 | 0.283 | 10.3 |
PBNN | 0.00298 | 0.0109 | 0.0882 | 2.07 |
BayesFlow | 0.00554 | 0.0221 | 0.125 | 1.54 |
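The triangle area behind this ranking can be sketched as follows, assuming a radar chart with three linearly scaled axes spaced 120° apart (an assumption about the chart geometry; the function name is hypothetical). With axis values a, b, and c, the triangle splits into three sub-triangles that share the origin, so its area is (√3/4)(ab + bc + ca).

```python
import math


def radar_triangle_area(nrmse, nmciw, cicp):
    """Area of the triangle spanned on three radar axes 120 degrees apart.

    The axes carry NRMSE, NMCIW and 1 - CICP, so smaller is better on each.
    """
    a, b, c = nrmse, nmciw, 1.0 - cicp
    # Each pair of neighbouring axes encloses a triangle of area
    # 0.5 * x * y * sin(120 deg) = (sqrt(3) / 4) * x * y.
    return math.sqrt(3.0) / 4.0 * (a * b + b * c + c * a)


# Values for D_L with 200 training samples, taken from Table 3.
area_bnn = radar_triangle_area(0.0235, 0.174, 1.0)
area_mdn = radar_triangle_area(0.0500, 0.135, 0.86)
print(f"BNN: {area_bnn:.5f}, MDN: {area_mdn:.4f}")
```

With the 200-sample values from Table 3, this formula gives approximately 0.00177 for BNN and 0.0141 for MDN, which agrees with the corresponding entries in the table. Because a perfect CICP of 1.0 zeroes out two of the three products, methods with wide but fully covering intervals can still rank well under this measure.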


© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Heringhaus, M.E.; Zhang, Y.; Zimmermann, A.; Mikelsons, L.
Towards Reliable Parameter Extraction in MEMS Final Module Testing Using Bayesian Inference. *Sensors* **2022**, *22*, 5408.
https://doi.org/10.3390/s22145408
