Proceeding Paper

ProgMachina: Feature Extraction and Processing Package for Prognostic Studies †

1 Laboratory of Automation and Manufacturing Engineering, University of Batna 2, Batna 05000, Algeria
2 UMR CNRS 6027 IRDL, University of Brest, 29238 Brest, France
3 Logistics Engineering College, Shanghai Maritime University, Shanghai 201306, China
4 Laboratory of Signal Image and Energy Mastery (SIME), National Higher School of Engineers of Tunis, University of Tunis, 5 Av. Taha Hussein, Tunis 1008, Tunisia
* Author to whom correspondence should be addressed.
† Presented at the 10th International Electronic Conference on Sensors and Applications (ECSA-10), 15–30 November 2023; Available online: https://ecsa-10.sciforum.net/.
Eng. Proc. 2023, 58(1), 83; https://doi.org/10.3390/ecsa-10-16222
Published: 15 November 2023

Abstract

Prognostic studies of industrial systems focus on health deterioration analysis, which has recently shifted toward data analytics and learning systems. In practice, real degradation data are complex and subject to drift, with degradation patterns that are hidden and change over time. Revealing such patterns therefore requires a well-structured extraction and processing mechanism, which in turn eases the transition to model reconstruction and investigation tasks. To simplify data processing in this field, a complete software package is designed and grouped into a single function that is fully automated and requires no human intervention. The package, named ProgMachina (i.e., prognostic machine), provides a list of features extracted from a life cycle and passed through denoising, filtering, outlier removal, and scaling to ensure that the data are meaningful in terms of degradation. The package uses a time window with a specified overlap so that all possible degradation patterns are properly scanned. In addition, an exponential function is used to identify a corresponding health index for the degraded signals, and a set of well-known metrics is used to assess the degradation behavior of the extracted features. Data visualization and many previous experiments on machines show the effectiveness of this methodology in terms of prediction accuracy and degradation assessment. The package is written in Matlab and made available online for use in similar fields.

1. Introduction

Nowadays, prognostic studies rely heavily on data analysis and learning systems for condition monitoring rather than on highly complex traditional physics-based modeling. Physics-based modeling is primarily needed when the systems under study are both safety-critical and expensive, rarely fail under working conditions, and cannot be subjected to real deterioration or accelerated aging laboratory experiments. Physics-based modeling is also used for generative modeling and is hybridized with learning systems to ensure efficient predictions. In either case, the acquisition, extraction, and processing of run-to-failure data are crucial steps for data analysis and reconstruction of the learning model [1]. When building a learning model for system prognostics, run-to-failure data usually pose a challenge of complexity and data drift: degradation patterns are hidden and buried under ever-changing noise and various distortion patterns resulting from harsh system operating conditions. Training a learning model with such data is likely to mislead the predictions and overfit the model. A well-structured feature extraction and processing methodology is therefore needed to ensure that the data provide a reliable source of information and improve the performance of the learning model.
In the literature, many approaches have been proposed, most importantly denoising, extraction, and outlier removal. Since these methodologies have proven necessary for progressive degradation analysis in prognostics studies, the main goal of this paper is to combine them into a single, complete package that facilitates this complex process. To this end, this paper introduces ProgMachina, a complete package designed specifically to process run-to-failure data through several important steps. Each of these steps is used to uncover and extract degradation patterns from the raw data of entire life cycles. The package also produces an exponentially deteriorating health index for the intended life cycle.
This paper is organized as follows: Section 2 presents the package description, its main features, and its relationship with run-to-failure data, together with an illustrative example. Section 3 discusses the impact of this package on prognostics studies, while the conclusion addresses limitations and future improvements of the package.

2. ProgMachina Package Description

ProgMachina is a function written in Matlab (https://www.mathworks.com/products/matlab.html, accessed on 14 November 2023) that delivers well-processed run-to-failure data with a corresponding health index (HI), ready to feed a learning system for training and evaluation. Table 1 gives further details about the metadata of the package. ProgMachina takes a run-to-failure dataset for one life cycle (i.e., a single degradation unit from normal operating conditions to complete failure of the system), organized vertically as observations, with one column per channel (i.e., per sensor measurement), and uses it to generate an extracted and well-prepared list of features and the corresponding HI. Following previous literature [2], ProgMachina applies extraction, denoising, and outlier removal as the main steps for uncovering hidden degradation patterns in the provided life cycles, while smoothing, filtering, and scaling further enhance the representations and build strong connections and correlations between data samples. This section explores these steps in detail.

2.1. Features Extraction

A set of features widely used in the literature is included in ProgMachina. These features are the mean, standard deviation (Std), skewness, kurtosis, peak to peak, root mean square (RMS), crest factor, shape factor, impulse factor, margin factor, energy, mean of spectral kurtosis (SKMean), standard deviation of spectral kurtosis (SKStd), skewness of spectral kurtosis (SKSkewness), and kurtosis of spectral kurtosis (SKKurtosis). More details on the background of these features and their mathematical definitions can be found in [3]. The features are extracted for each window of a time window that slides with overlap over the whole signal. They were selected because they are well-known signal descriptors used for the analysis of slowly evolving degradation processes, reducing problem complexity while preventing information loss [4]. It should be mentioned that these features need further analysis to determine whether or not they describe a degradation mechanism. Metrics such as Monotonicity, Trendability, Prognosability, and Robustness (MTPR) are well established for this purpose [4]. Accordingly, ProgMachina also includes these metrics to provide insight into how well the extracted features capture degradation and to support feature selection. The goal of measuring MTPR is to verify that the signal follows a monotonic degradation trend and evolves in a meaningful way along the degradation path, reflecting actual system health, while prognosability mainly indicates the possibility of separating faulty and healthy degradation patterns.
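The windowed extraction described above can be sketched as follows. This is a minimal illustration in plain Python, not the package's Matlab code: the function names `sliding_windows`, `time_features`, and `extract_features`, the use of the population variance, and the restriction to a subset of the time-domain features are all assumptions made for the example.

```python
import math

def sliding_windows(signal, width, overlap):
    """Yield successive windows of `width` samples that share `overlap` samples."""
    if not 0 <= overlap < width:
        raise ValueError("overlap must satisfy 0 <= overlap < width")
    step = width - overlap
    for start in range(0, len(signal) - width + 1, step):
        yield signal[start:start + width]

def time_features(window):
    """Return a few common time-domain descriptors for one window (assumed definitions)."""
    n = len(window)
    mean = sum(window) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in window) / n)  # population form
    rms = math.sqrt(sum(x * x for x in window) / n)
    peak = max(abs(x) for x in window)
    abs_mean = sum(abs(x) for x in window) / n
    feats = {
        "mean": mean,
        "std": std,
        "rms": rms,
        "peak_to_peak": max(window) - min(window),
        "energy": sum(x * x for x in window),
    }
    if rms > 0:  # ratio features are undefined for an all-zero window
        feats["crest_factor"] = peak / rms
        feats["shape_factor"] = rms / abs_mean
        feats["impulse_factor"] = peak / abs_mean
    if std > 0:  # standardized moments are undefined for a constant window
        feats["skewness"] = sum((x - mean) ** 3 for x in window) / (n * std ** 3)
        feats["kurtosis"] = sum((x - mean) ** 4 for x in window) / (n * std ** 4)
    return feats

def extract_features(signal, width, overlap):
    """Return one feature dictionary per overlapping window."""
    return [time_features(w) for w in sliding_windows(signal, width, overlap)]

rows = extract_features([1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0], width=4, overlap=2)
```

Each entry of `rows` corresponds to one window, so a longer overlap yields more rows and a finer scan of the signal at the cost of more redundancy between neighboring rows.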

2.2. Denoising

Slowly evolving degradation processes are well known for their complex dynamics, which result in a very complex feature space with high levels of noise from unknown sources [5]. The extracted features therefore need to undergo a noise reduction procedure. ProgMachina applies empirical Bayesian wavelet denoising to create more reliable representations by reducing the amount of noise in these features. This method reduces the effect of noise in the feature space by combining a Cauchy prior with a posterior median threshold rule [6,7]. The step is implemented with the built-in Matlab function “wdenoise(---)”.
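To make the idea of wavelet-domain noise reduction concrete, the sketch below uses a deliberately simpler technique than the one in the package: a Haar wavelet transform with soft thresholding and the universal threshold of [6], not the empirical Bayes rule with a Cauchy prior. It is written in plain Python for illustration only; the function names and the default of two decomposition levels are assumptions.

```python
import math
import statistics

def haar_forward(x):
    """One Haar level: return (approximation, detail) for an even-length list."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail

def haar_inverse(approx, detail):
    """Invert one Haar level."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out

def soft(value, thr):
    """Soft thresholding: shrink the magnitude by thr, keep the sign."""
    return math.copysign(max(abs(value) - thr, 0.0), value)

def haar_denoise(x, levels=2):
    """Denoise x by soft-thresholding its Haar detail coefficients."""
    n = len(x)
    if n == 0 or n % (2 ** levels) != 0:
        raise ValueError("length must be a positive multiple of 2**levels")
    approx, details = list(x), []
    for _ in range(levels):
        approx, d = haar_forward(approx)
        details.append(d)
    # Noise scale estimated from the median absolute value of the finest details
    sigma = statistics.median(abs(c) for c in details[0]) / 0.6745
    thr = sigma * math.sqrt(2.0 * math.log(n))  # universal threshold
    for d in reversed(details):
        approx = haar_inverse(approx, [soft(c, thr) for c in d])
    return approx

flat = haar_denoise([5.0] * 8)
noisy = haar_denoise([1.0, 1.1, 0.9, 1.0, 1.05, 0.95, 1.0, 1.0])
```

A constant input has zero detail coefficients and is returned unchanged, while small fluctuations in the second input are shrunk toward the local averages.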

2.3. Outlier Removal

Besides the noise present in the recorded signals and in the extracted features, random pulses of high-magnitude disturbances can be found in such a degradation process. An outlier remover is therefore necessary to eliminate or reduce their effects on the recorded data. Accordingly, the denoised features of the entire life cycle are further processed using an outlier removal tool. By default, outliers are removed using the moving median method of the function “rmoutliers(---)” [8].
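The moving median idea can be sketched as follows: a sample is treated as an outlier when it lies too far from the median of its local window, with the distance measured in scaled median absolute deviations (MAD). This is an illustrative plain-Python sketch, not the Matlab function itself; the name `remove_outliers_movmedian`, the window length of 5, the threshold of 3 scaled MADs, and the truncated windows at the edges are assumptions.

```python
import statistics

def remove_outliers_movmedian(values, window=5, n_mads=3.0):
    """Drop samples farther than n_mads scaled MADs from their local median."""
    half = window // 2
    scale = 1.4826  # makes the MAD comparable to a standard deviation for Gaussian data
    kept = []
    for i, v in enumerate(values):
        local = values[max(0, i - half): i + half + 1]  # window truncated at the edges
        med = statistics.median(local)
        mad = scale * statistics.median(abs(u - med) for u in local)
        if abs(v - med) <= n_mads * mad:
            kept.append(v)
    return kept

series = [1.0, 1.1, 0.9, 1.0, 50.0, 1.1, 0.9, 1.0, 1.1]
cleaned = remove_outliers_movmedian(series)
```

Because the median and the MAD are both robust statistics, the single pulse of 50.0 does not inflate the local threshold and is removed, while its neighbors are kept.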

2.4. Smoothing and Filtering

Additional filtering and smoothing are required to further enhance signal quality and make degradation patterns more visible. Smoothing is performed with “smooth(---)”: the data are modeled by a Markov-switching dynamic regression model, and the smoothed state probabilities of the latent states are obtained by a forward recursion followed by a backward recursion [9]. For filtering, an additional median filtering step, “medfilt1(---)”, is applied to further enhance signal quality.
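The median filtering step alone can be illustrated with a short plain-Python sketch (the Markov-switching smoother is not reproduced here). The function name `median_filter` and the default order of 3 are assumptions, and the edges are handled by truncating the window, which may differ from the edge handling of the Matlab function.

```python
import statistics

def median_filter(values, order=3):
    """Replace each sample by the median of the `order` samples centered on it."""
    if order < 1 or order % 2 == 0:
        raise ValueError("order must be a positive odd integer")
    half = order // 2
    return [
        statistics.median(values[max(0, i - half): i + half + 1])  # truncated at edges
        for i in range(len(values))
    ]

filtered = median_filter([1, 1, 9, 1, 1], order=3)
```

An isolated spike shorter than half the window is removed entirely, which is why median filtering preserves step-like changes better than a moving average of the same length.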

2.5. Scaling and Health Index Identification

A min-max normalization to the range [0, 1] is used to unify feature measurement scales after each of the denoising, smoothing, and outlier removal steps. From the resulting data, ProgMachina defines a deteriorating HI according to an exponential function (see Equation (2) in [10]).
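A minimal plain-Python sketch of these two operations is given below. The min-max scaling follows the standard definition. The health index, however, is only a hypothetical exponential form chosen for illustration: it is not Equation (2) of [10], and the function name `exponential_health_index` and the `rate` parameter are assumptions.

```python
import math

def min_max_scale(values):
    """Scale values linearly to [0, 1]; a constant input maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def exponential_health_index(n_samples, rate=5.0):
    """HYPOTHETICAL exponential HI: 1 at the first sample, 0 at the last.

    Assumed form: HI_k = 1 - (exp(rate * k / (N - 1)) - 1) / (exp(rate) - 1).
    """
    if n_samples < 2:
        raise ValueError("need at least two samples")
    denom = math.exp(rate) - 1.0
    return [
        1.0 - (math.exp(rate * k / (n_samples - 1)) - 1.0) / denom
        for k in range(n_samples)
    ]

scaled = min_max_scale([2.0, 4.0, 6.0])
health = exponential_health_index(5)
```

With this assumed form, a larger `rate` keeps the index close to 1 for most of the life cycle and concentrates the drop near the end, which mimics degradation that accelerates before failure.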

2.6. Illustrative Example

As an illustrative example, one life cycle of a bearing dataset generated from a mathematical model is used [11]. The dataset represents a vibration run-to-failure measurement. Figure 1a shows an example of the vibration measurement, in which degradation grows exponentially. ProgMachina is used to obtain both the features in Figure 1b and the HI in Figure 1c. The extracted features and the HI signal are smoother, cleaner, and more representative of degradation than the original raw signal. This is demonstrated by the fact that Figure 1d,e,g show values close to 1 for all features, which means that the features reflect the degradation mechanism. Meanwhile, Figure 1f provides further guidance for feature selection if dimensionality reduction is required.
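To show why a value close to 1 indicates a feature that follows the degradation trend, the sketch below computes one commonly used definition of monotonicity: the absolute difference between the numbers of positive and negative increments, divided by the number of increments. It is a plain-Python illustration under that assumed definition and is not necessarily the exact formula implemented in the package.

```python
def monotonicity(feature):
    """Return |#positive steps - #negative steps| / #steps, a value in [0, 1]."""
    diffs = [b - a for a, b in zip(feature, feature[1:])]
    if not diffs:
        raise ValueError("need at least two samples")
    positive = sum(1 for d in diffs if d > 0)
    negative = sum(1 for d in diffs if d < 0)
    return abs(positive - negative) / len(diffs)

rising = monotonicity([1, 2, 3, 4])
oscillating = monotonicity([1, 2, 1, 2, 1])
```

A strictly increasing or strictly decreasing feature scores 1, while a feature that oscillates with no net trend scores 0, so noisy features score lower before denoising and smoothing than after.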

3. ProgMachina Impact

As an easy-to-use single package, ProgMachina is expected to bring several advantages to prognostics studies, the most important of which are:
  • Simplifying learning model reconstruction;
  • Requiring less human intervention, as everything is done automatically;
  • Making it easy to investigate new features by simply adding them to the open-source code;
  • Providing a reliable source of information, especially through outlier removal and denoising;
  • Simplifying the feature selection process by means of signal degradation metrics;
  • Freeing up time for developing learning systems rather than processing data.

4. Conclusions

This paper has introduced ProgMachina, a complete package for extracting features from, and processing, degradation signals recorded from slowly evolving degradation processes. It follows the steps of extraction, denoising, outlier removal, smoothing, filtering, and scaling to obtain quality signals that can be used to feed learning systems and support further investigations. It should be noted that ProgMachina is built on a limited set of time domain and frequency domain feature extraction and processing tools. Future improvements of the package should therefore consider other features and further signal processing tools to produce cleaner, better-quality data that illustrate degradation more clearly. It is also important to consider the time-frequency domain. Additionally, attribute reduction methods can be used to define health indicators in a new space in which their evolution is expected to be more linear.

Author Contributions

Conceptualization, T.B.; methodology, T.B. and M.B.; validation, T.B., M.B. and J.B.A.; formal analysis, T.B., M.B. and J.B.A.; investigation, T.B.; resources, T.B.; data curation, T.B., M.B. and J.B.A.; writing (original draft preparation), T.B.; writing (review and editing), T.B., M.B. and J.B.A. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

This work used data from https://figshare.com/articles/dataset/Simulated_Bearing_Degradation_Data_mat/12554690 (accessed 14 November 2023). Necessary files to reproduce the findings of this work can be downloaded at: https://doi.org/10.5281/zenodo.8174085.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Berghout, T.; Benbouzid, M. A Systematic Guide for Predicting Remaining Useful Life with Machine Learning. Electronics 2022, 11, 1125.
  2. Ali, J.B.; Saidi, L. A New Suitable Feature Selection and Regression Procedure for Lithium-Ion Battery Prognostics. Int. J. Comput. Appl. Technol. 2018, 58, 102.
  3. Qiu, W.; Zhu, K.; Teng, Z.; Tang, Q.; Yao, W.; Dong, Y.; Liu, Y. Cyber-Attack Identification of Synchrophasor Data Via VMD and Multi-Fusion SVM. In Proceedings of the 2020 IEEE Industry Applications Society Annual Meeting, IEEE, Detroit, MI, USA, 10–16 October 2020; pp. 1–6.
  4. Qiu, G.; Gu, Y.; Chen, J. Selective Health Indicator for Bearings Ensemble Remaining Useful Life Prediction with Genetic Algorithm and Weibull Proportional Hazards Model. Meas. J. Int. Meas. Confed. 2020, 150, 107097.
  5. Saxena, A.; Goebel, K.; Simon, D.; Eklund, N. Damage Propagation Modeling for Aircraft Engine Run-to-Failure Simulation. In Proceedings of the 2008 International Conference on Prognostics and Health Management, IEEE, Denver, CO, USA, 6–9 October 2008; pp. 1–9.
  6. Donoho, D.L. De-Noising by Soft-Thresholding. IEEE Trans. Inf. Theory 1995, 41, 613–627.
  7. Johnstone, I.M.; Silverman, B.W. Needles and Straw in Haystacks: Empirical Bayes Estimates of Possibly Sparse Sequences. Ann. Stat. 2004, 32, 1594–1649.
  8. Blázquez-García, A.; Conde, A.; Mori, U.; Lozano, J.A. A Review on Outlier/Anomaly Detection in Time Series Data. arXiv 2020, arXiv:2002.04236v1.
  9. Kim, C.-J. Dynamic Linear Models with Markov-Switching. J. Econom. 1994, 60, 1–22.
  10. Berghout, T.; Mouss, L.-H.; Bentrcia, T.; Benbouzid, M. A Semi-Supervised Deep Transfer Learning Approach for Rolling-Element Bearing Remaining Useful Life Prediction. IEEE Trans. Energy Convers. 2022, 37, 1200–1210.
  11. Koceila, A.; Mouchaweh, M.S.; Cornez, L.; Chiementin, X. Simulated Bearing Degradation Data. 2020. Available online: https://figshare.com/articles/dataset/Simulated_Bearing_Degradation_Data_mat/12554690 (accessed on 14 November 2023).
Figure 1. ProgMachina package inputs and outputs: (a) raw vibration data; (b) extracted and processed features; (c) identified health index; (d–g) monotonicity, trendability, prognosability, and robustness.
Table 1. Important metadata of ProgMachina.
Package name | ProgMachina
Current code version | v1.0.0
Permanent link to the software | https://doi.org/10.5281/zenodo.8174085
Software code languages, tools, and services used | Matlab
Compilation requirements | Matlab ≥ r2023a
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Berghout, T.; Benbouzid, M.; Ali, J.B. ProgMachina: Feature Extraction and Processing Package for Prognostic Studies. Eng. Proc. 2023, 58, 83. https://doi.org/10.3390/ecsa-10-16222