Steady-State Fault Detection with Full-Flight Data

Weiss, Matthias; Staudacher, Stephan; Becchio, Duilio; Keller, Christian; Mathes, Jürgen

doi:10.3390/machines10020140


by Matthias Weiss ¹,*, Stephan Staudacher ¹, Duilio Becchio ², Christian Keller ³ and Jürgen Mathes ²

¹ Institute of Aircraft Propulsion Systems, University of Stuttgart, 70569 Stuttgart, Germany
² MTU Aero Engines AG, 80995 München, Germany
³ MTU Maintenance Hannover GmbH, 30855 Langenhagen, Germany
* Author to whom correspondence should be addressed.

Machines 2022, 10(2), 140; https://doi.org/10.3390/machines10020140

Submission received: 30 November 2021 / Revised: 8 February 2022 / Accepted: 11 February 2022 / Published: 16 February 2022

(This article belongs to the Special Issue Diagnostics and Optimization of Gas Turbine)


Abstract

Aircraft engine condition monitoring is a key technology for increasing safety and reducing maintenance expenses. Current engine condition monitoring approaches use a minimum of one steady-state snapshot per flight. Whilst being appropriate for trending gradual engine deterioration, snapshots result in a detrimental latency in fault detection. The increased availability of non-mandatory data acquisition hardware in modern airplanes provides so-called full-flight data sampled continuously during flight. These datasets enable the detection of engine faults within one flight by deriving a statistically relevant set of steady-state data points, thus allowing the application of machine-learning approaches. It is shown that low-pass filtering before steady-state detection significantly increases the success rate in detecting steady-state data points. The application of Principal Component Analysis halves the number of relevant dimensions and provides a coordinate system of principal components retaining most of the variance. Consequently, clusters of data points with and without engine fault can be separated visually and numerically using a One-Class Support Vector Machine. High detection rates are demonstrated for various component faults and even for a minimum instrumentation suite using synthesized datasets derived from full-flight data of commercially operated flights. In addition to the tests conducted with synthesized data, the algorithm is verified based on operational in-flight measurements, providing a proof-of-concept. Consequently, the availability of continuously sampled in-flight measurements combined with machine-learning methods allows fault detection within a single flight.

Keywords:

aircraft engine; gas turbine; fault detection; engine health monitoring; engine condition monitoring; full-flight data; one-class support vector machine; principal component analysis

1. Introduction

About one-third of the direct operating cost of an aircraft engine is related to maintenance, repair, and overhaul [1]. Cutting the maintenance expenses is one primary goal of airlines to increase profitability. The scope of on-wing and shop maintenance is carefully considered concerning the cost, downtime, and recovery of fuel efficiency and remaining useful life. Engine condition monitoring tracks measurable engine parameters during flight to derive insights into its current health state and trends. Effective maintenance planning is built on these insights.

As a minimum, only discrete data points gathered during take-off and cruise are provided for condition monitoring. These data points are called snapshots. Engine condition monitoring systems are applied to the long-term trending of gradual performance deterioration as well as to the detection, isolation and identification of faults [2]. Gradual performance deterioration results from minuscule changes caused by erosion, corrosion, or fouling [3] and accumulates over a large number of flights. Due to this nature, the availability of one single steady-state snapshot per flight is good enough to determine the associated long-term trends in performance parameters. On the other hand, faults are discrete events that occur at a defined point in time during a flight and lead to a step-change in performance parameters.

Here, the data sparsity related to snapshot data leads to difficulties in distinguishing between faults and random scatter. Depending on the faulty component and the severity of the fault, it may take multiple data points to detect a fault [4,5,6]. Consequently, several flights might be carried out before the fault is detected, increasing the risk of collateral damage. The increased usage of non-mandatory data acquisition equipment opens up the possibility of avoiding this latency.

Currently, most airlines have adopted Quick Access Recorders for data acquisition, providing flight data continuously sampled at frequencies of 1 Hz and more, which is also referred to as full-flight data. The availability of these datasets enables the introduction of new approaches in engine condition monitoring, which offer the chance to detect engine faults within one flight. In addition to the application of well-known engine fault-detection methods [7,8], a statistically relevant number of steady-state data points allows the application of state-of-the-art machine-learning methods for fault detection. The latter approach allows multi-variate datasets to be analyzed directly based on determined fault-relevant features.

2. Related Research

2.1. Fault Detection

Generally, machine-learning-based fault-detection approaches are categorized as either supervised or unsupervised. Supervised methods usually directly combine fault detection and fault isolation. Different approaches are, e.g., Convolutional Neural Networks [9], Deep Belief Neural Networks [10], and Support Vector Machines [11,12,13,14]. These algorithms require appropriately labeled datasets a priori for training.

As engine faults are rare events, the total number of available training data describing faulty engine performance is limited and might not cover all possible fault cases [13,15]. Unsupervised learning methods mainly define clustering approaches analyzing a given dataset for outliers indicating a fault. These algorithms do not require labeled datasets and are therefore considered best suited for fault detection. The approaches are sometimes also referred to as anomaly detection or outlier detection.

An approach for fault detection based on steady-state flight regimes of full-flight data is demonstrated in [16,17]. At first, a linearized state-space model is used to compute the residuals between the raw measurements and the model predictions. The fault detection is based on a distance measure utilizing recursively calculated means and standard deviations. A similar approach utilizing Bayesian hierarchical models and the same fault-detection scheme is demonstrated in [18] for evaluating gas turbines. A utilization of data-driven models based on LSTMs in combination with a statistical evaluation of the residuals is demonstrated in [19] for a three-shaft marine gas turbine.

The k-means clustering method [20] is mainly utilized for defining cluster centers and allocating data points to these clusters. In general, k-means clustering does not provide intrinsic outlier detection necessary for fault detection. Utilizing k-means clustering for fault detection requires additional algorithms, e.g., artificial neural networks [21], or cluster evaluation [22]. These additional algorithms increase the complexity of the k-means clustering.

The DBSCAN algorithm [23] defines a clustering approach that can directly identify outliers. Possible applications of DBSCAN are anomaly detection for aircraft trajectories [24,25,26] or fault detection of wind turbines [27]. In [28], the DBSCAN algorithm is used for identifying anomalies within engine data in order to exclude the corresponding data points from data-driven model building. One major drawback of DBSCAN is the difficulty of determining appropriate hyperparameters for multi-variate datasets [24], especially if several clusters with varying densities exist.

A One-Class Support Vector Machine [29] identifies outliers by defining a boundary enclosing the data points within a given training dataset. One-Class Support Vector Machines are proven frameworks for fault detection with successful applications in various fields, such as bearing faults [30], process monitoring [31], aircraft trajectory monitoring [32], and gas turbine monitoring [33,34].

2.2. Feature Extraction

Applying feature extraction and dimensionality reduction is often required for machine-learning methods to improve their performance [35]. In general, datasets contain many different variables, of which only a few provide crucial information. Feature extraction derives features holding relevant information, reducing the dimensionality of the input dataset. Dimensionality reduction can also be used for visualizing highly dimensional datasets.

Principal Component Analysis [36] defines linear combinations of the input parameters resulting in a low-dimensional coordinate system retaining most of the variance. Principal Component Analysis is based on an eigendecomposition and allows easy execution. The approach is widely used in monitoring applications, e.g., aircraft engine condition monitoring [14,37], the monitoring of smart grids [38], and flight path monitoring [26,32].

In addition, there are different methods for computing features based on artificial neural networks, such as autoencoders [11,12]. While these approaches allow the derivation of highly nonlinear features, their main drawbacks are the iterative training process required for their utilization and the need to define an appropriate network architecture, which is not known a priori.

2.3. Steady-State Data Filters

Identifying steady-state operating regimes is important for analyzing datasets acquired from test-beds and in-flight. Different data filters for detecting steady-state data points are either regression-based [39] or variance-based [16,17,40,41]. Regression-based approaches [39] approximate the underlying measurements by linear functions obtaining estimations for their slope. Variance-based methods, on the other hand, either compute the variance [16,17,40] or the min-max-range [41] of the measurements within a given moving window.

Both regression-based and variance-based methods utilize predefined thresholds on their parameters to ensure the measurements’ stability. In addition to solely identifying regimes with stable measurements, the data filter proposed by [40] defines multiple modules for additionally ensuring thermal stability, consistent operating conditions, and the reduction of measurement uncertainty.

2.4. Conclusions from the Literature Review

Based on the literature review, the following properties of fault-detection schemes are considered advantageous:

Arbitrary Fault Detection: In the first place, fault-detection schemes are required to identify arbitrary deviations from the nominal engine performance indicating faults to ensure the engine’s safety and operability.
Processing Multi-Variate Datasets: Several gas path measurements are available for fault detection leading to a multi-dimensional dataset to be processed. Some approaches analyze the different sensors independently, making the definition of alarming rules more complex and requiring separate thresholds for each sensor.
Visualizability: The fault-detection schemes should allow visualization of the results so that the maintenance engineer can check and reason about them.
Efficiency: Analyzing full-flight data requires a considerable amount of data to be processed [42]; therefore, fast and efficient approaches are required for analyzing the datasets.

None of the approaches mentioned in the previous sections combine all of the aforementioned properties. Therefore, a novel fault-detection approach is presented in this paper, combining a steady-state data filter with Principal Component Analysis and a One-Class Support Vector Machine. The steady-state data filter reduces the total number of data points to be processed, neglecting non-stationary flight regimes. The Principal Component Analysis was chosen for dimensionality reduction since it does not require a training process to determine fault-relevant features.

Therefore, a coordinate system that is most sensitive to a given fault can be determined, improving visualizability and detectability. In the last step, a One-Class Support Vector Machine is used for cluster analysis, determining outliers that indicate faults. The proposed fault-detection toolchain is tested based on operational datasets of commercial aircraft.

3. Materials and Methods

3.1. Concept

The flowchart of the proposed fault-detection algorithm is shown in Figure 1. First, a data filter processes the continuously sampled flight data, detecting steady-state data points. The residual is the difference between these measured steady-state data points and the predicted values of a reference engine performance model run at the same operating conditions. In this proof-of-concept, a highly accurate industry-standard thermodynamic engine model is utilized. However, other types of engine models, e.g., data-driven models [28] or hybrid models [43], could in principle be used as well. The model accounts for the impact of environmental conditions and engine settings, such as bleed extraction or power offtake [44].

In the next step, a database is built comprising the residuals of the current and preceding flights. The residuals of the current flight are compared to those within the database of preceding flights via cluster analysis. In the case of an engine fault, the corresponding change in performance parameters directly affects the in-flight measurements [45]. In this case, the residuals of the current flight will deviate from the dataset of preceding flights leading to dedicated clusters, which can be detected.

The proposed fault-detection approach allows seamless integration of full-flight data into existing engine condition monitoring systems and has been filed for patent. Representative steady-state data points can be sampled from the results of the Steady-State Data Filter and forwarded to existing engine condition monitoring systems. Here, state-of-the-art algorithms for long-term trending [46,47] can be utilized for estimating and predicting the gradual deterioration of the aircraft engine. Additionally, after fault detection, the isolation and identification of the potential fault can be performed in a postprocessing step with existing methods [48,49].

3.2. Steady-State Data Filter

The generation of residuals requires the in-flight measurements to fulfill the prerequisites for applying steady-state performance models, i.e., constant flight and ambient conditions, constant engine control settings, and engine offtakes. Additionally, the engine must have reached thermal equilibrium. The first four aspects are summarized as a stable operating regime, and the latter represents a thermal stability criterion.

Of the approaches presented in Section 2.3, only the approach described by [40] explicitly covers all of these aspects. In order to meet the prerequisites of stable operating regimes and thermal stability, the Steady-State Data Filter consists of four building blocks: a Low-Pass Filter, a Thermal Transient Filter, a Regime Recognition, and a State Transition Logic. A flowchart of the resulting Steady-State Data Filter is visualized in Figure 2. Since the algorithm of [40] was derived for identifying steady-state data points for turboshaft engines of helicopters, some adjustments are needed to apply the approach to turbofan engines of commercial aircraft.

3.2.1. Low-Pass Filter

Due to measurement noise, the total number of detected steady-state data points might significantly differ from engine to engine, even if they are both mounted on the same aircraft [40]. A low-pass filter is proposed for alleviating the effect of high-frequency noise in the input signal. In general, the response of a low-pass filter can be characterized by its cut-off frequency and order. The cut-off frequency defines the beginning of the signal attenuation, and the magnitude of the attenuation is directly dependent upon the order of the filter [50].

The filter has to be designed such that mostly the high-frequency measurement noise is attenuated, leaving the remaining frequency spectrum unaffected. The measurement noise depends on the type of sensors used, their dynamics, and the data acquisition. Since the measurement systems in turboshaft engines and turbofan engines for commercial aircraft are expected to be similar, the filter defined by [40] is directly adopted. Hence, a second-order low-pass filter is implemented, where $y(s)$ is the raw and $\tilde{y}(s)$ is the corresponding low-pass filtered measurement.

$$\frac{\tilde{y}(s)}{y(s)} = \frac{\omega_n^2}{s^2 + 2 \zeta \omega_n s + \omega_n^2} \tag{1}$$
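As a minimal sketch of how Equation (1) could be applied to sampled data, the transfer function can be discretized with the bilinear (Tustin) transform. The discretization method, the function name `second_order_lowpass`, the start-up assumption, and the example values are illustrative assumptions and are not taken from [40]; $\omega_n$ is read here as the natural frequency and $\zeta$ as the damping ratio of the filter.

```python
def second_order_lowpass(samples, dt, omega_n, zeta):
    """Apply a bilinear-transform discretization of Equation (1) to a signal.

    samples: raw measurements sampled every dt seconds
    omega_n: natural frequency in rad/s (hypothetical value chosen by the user)
    zeta:    damping ratio (hypothetical value chosen by the user)
    """
    if not samples:
        return []
    k = 2.0 / dt  # bilinear transform: s -> k * (1 - z^-1) / (1 + z^-1)
    a0 = k * k + 2.0 * zeta * omega_n * k + omega_n ** 2
    a1 = 2.0 * omega_n ** 2 - 2.0 * k * k
    a2 = k * k - 2.0 * zeta * omega_n * k + omega_n ** 2
    b0 = omega_n ** 2
    b1 = 2.0 * b0
    b2 = b0

    # Assumption: the signal rested at its first value before recording began.
    x1 = x2 = y1 = y2 = samples[0]
    filtered = []
    for x0 in samples:
        y0 = (b0 * x0 + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2) / a0
        filtered.append(y0)
        x2, x1 = x1, x0
        y2, y1 = y1, y0
    return filtered


# Unit step sampled at 1 Hz; the filtered signal settles at the new level.
step = [0.0] * 5 + [1.0] * 60
smoothed = second_order_lowpass(step, dt=1.0, omega_n=0.5, zeta=0.7)
```

The bilinear transform keeps the unit steady-state gain of Equation (1), so a constant input passes through unchanged while fast fluctuations are attenuated.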

An adjustment concerning the position of the low-pass filter is proposed. In the original algorithm, the Low-Pass Filter works in parallel with both the Thermal Transient Filter and the Regime Recognition. A low noise level is expected to be advantageous for determining thermally stable flight segments and the definition of valid flight segments. Hence, utilizing the Low-Pass Filter as a preprocessing step is considered more suitable.

3.2.2. Thermal Transient Filter

The Thermal Transient Filter is used to assure the thermal equilibrium of the engine. Since the time required to heat through an engine generally correlates with its mass [51], the definition of the Thermal Transient Filter has to be adjusted in order to account for the application in commercial turbofan engines.

For approximating the heat fluxes of the engine, a simplified thermal model is evaluated approximating the multi-stage components by representative single-stage modules according to [52]. Such a simplified module is visualized in Figure 3. The thermal model takes the heat fluxes between fluid and casing $\dot{Q}_{FC}$, fluid and blade $\dot{Q}_{FB}$, and disk and blade $\dot{Q}_{DB}$ into account. Single elements approximate all structural elements, and the module is approximated as adiabatic. The heat fluxes between the structural elements and the fluid, $\dot{Q}_{FC}$ and $\dot{Q}_{FB}$, are modeled assuming convective heat transfer. A conductive heat transfer is assumed for the heat flux between disk and blade $\dot{Q}_{DB}$. The heat fluxes are defined based on the heat transfer coefficients $\alpha$, the wetted area $A$, and the temperatures of the fluid and structural elements $T$.

$$\begin{aligned} \dot{Q}_{FC} &= \alpha_{FC} A_{FC} (T_{Fluid} - T_{Casing}) \\ \dot{Q}_{FB} &= \alpha_{FB} A_{FB} (T_{Fluid} - T_{Blade}) \\ \dot{Q}_{DB} &= \alpha_{DB} A_{DB} (T_{Disk} - T_{Blade}) \end{aligned} \tag{2}$$

Conductive heat transfer within the structural elements is neglected, assuming an instantaneously uniform temperature distribution within the material. The corresponding temperature changes of the structures can be evaluated via conservation of energy using the mass of the structure $m$ and the heat capacity of the material $c$.

$$\begin{aligned} c_{Casing} m_{Casing} \dot{T}_{Casing} &= \dot{Q}_{FC} \\ c_{Disk} m_{Disk} \dot{T}_{Disk} &= -\dot{Q}_{DB} \\ c_{Blade} m_{Blade} \dot{T}_{Blade} &= \dot{Q}_{FB} + \dot{Q}_{DB} \end{aligned} \tag{3}$$

Substituting Equation (2) into Equation (3) finally yields a system of first-order differential equations for the temperatures of the casing, disks, and blades of the module.

\begin{align}
\frac{c_{Casing} m_{Casing}}{\alpha_{FC} A_{FC}} \dot{T}_{Casing} + T_{Casing} &= T_{Fluid} \tag{4} \\
\frac{c_{Disk} m_{Disk}}{\alpha_{DB} A_{DB}} \dot{T}_{Disk} + T_{Disk} &= T_{Blade} \tag{5} \\
\frac{c_{Blade} m_{Blade}}{\gamma_{FBD}} \dot{T}_{Blade} + T_{Blade} &= \frac{\alpha_{FB} A_{FB}}{\gamma_{FBD}} T_{Fluid} + \frac{\alpha_{DB} A_{DB}}{\gamma_{FBD}} T_{Disk} \tag{6} \\
\gamma_{FBD} &= \alpha_{FB} A_{FB} + \alpha_{DB} A_{DB} \tag{7}
\end{align}
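To illustrate how Equations (4)–(7) behave, the following sketch integrates them with explicit Euler steps for a constant fluid temperature. The integration scheme, the function name `simulate_module`, and every numerical parameter value are hypothetical placeholders chosen for illustration; they are not the material data of [52].

```python
def simulate_module(t_fluid, t_init, dt, steps, p):
    """Integrate Equations (4)-(7) with explicit Euler steps.

    p holds heat capacities c_*, masses m_*, heat transfer coefficients
    alpha_* and areas a_* (all hypothetical values in SI units).
    Returns the casing, disk, and blade temperatures after `steps` steps.
    """
    tau_casing = p["c_casing"] * p["m_casing"] / (p["alpha_fc"] * p["a_fc"])
    tau_disk = p["c_disk"] * p["m_disk"] / (p["alpha_db"] * p["a_db"])
    gamma = p["alpha_fb"] * p["a_fb"] + p["alpha_db"] * p["a_db"]  # Eq. (7)
    tau_blade = p["c_blade"] * p["m_blade"] / gamma

    t_casing = t_disk = t_blade = t_init
    for _ in range(steps):
        # Evaluate all derivatives first, then update the states together.
        d_casing = (t_fluid - t_casing) / tau_casing                # Eq. (4)
        d_disk = (t_blade - t_disk) / tau_disk                      # Eq. (5)
        blade_target = (p["alpha_fb"] * p["a_fb"] * t_fluid
                        + p["alpha_db"] * p["a_db"] * t_disk) / gamma
        d_blade = (blade_target - t_blade) / tau_blade              # Eq. (6)
        t_casing += dt * d_casing
        t_disk += dt * d_disk
        t_blade += dt * d_blade
    return t_casing, t_disk, t_blade


# Hypothetical module data, for illustration only.
params = {
    "c_casing": 500.0, "m_casing": 50.0, "alpha_fc": 500.0, "a_fc": 1.0,
    "c_disk": 500.0, "m_disk": 100.0, "alpha_db": 1000.0, "a_db": 0.05,
    "c_blade": 500.0, "m_blade": 5.0, "alpha_fb": 800.0, "a_fb": 0.5,
}
final_temperatures = simulate_module(800.0, 300.0, dt=0.5, steps=40000, p=params)
```

With these placeholder values, the casing time constant is 50 s, and all three structural temperatures approach the fluid temperature once the slowest element, the disk, has heated through.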

Applying a Laplace transformation to the first-order differential equation of the casing in Equation (4) results in

$$\frac{T_{Casing}(s)}{T_{Fluid}(s)} = \frac{1}{\frac{c_{Casing} m_{Casing}}{\alpha_{FC} A_{FC}} s + 1} . \tag{8}$$

Equation (8) is of similar form as the first-order filter proposed by [40] for approximating thermal equilibrium within the Thermal Transient Filter. Here, a similar first-order filter is used to describe the correlation between the filtered exhaust gas temperature $\widetilde{EGT}$ and its raw measurement $EGT$ with a given time constant $\tau_{th}$.

$$\frac{\widetilde{EGT}(s)}{EGT(s)} = \frac{1}{\tau_{th} s + 1} \tag{9}$$

Comparing Equation (8) and Equation (9) shows that $\tau_{th}$ characterizes the heat transfer. In general, the lower $\tau_{th}$, the faster the temperature response of the material to a step change of the fluid temperature.

For utilizing the Thermal Transient Filter, a representative time constant $\tau_{th}$ has to be defined approximating the heat exchange of a particular engine model. The time constants are computed based on Equations (4)–(7). The mass $m$, heat capacity $c$, and wetted area $A$ are approximated from drawings and material datasheets. In general, the heat transfer coefficient $\alpha$ depends on the geometry, fluid properties, and flow properties. Consequently, the time constant $\tau_{th}$ is not fixed but varies with power-setting. Here, a Nusselt-Correlation is used for approximating the heat transfer coefficients $\alpha$ [53] with the Reynolds-Number $Re$, the Prandtl-Number $Pr$, the characteristic length $l$, and the thermal conductivity of the fluid $\lambda_F$.

$$\begin{aligned} Nu &= \frac{\alpha l}{\lambda_F} = \frac{(\zeta / 8) (Re - 1000) Pr}{1 + 12.7 \sqrt{\zeta / 8} \, (Pr^{2/3} - 1)} \\ \zeta &= (1.8 \log_{10} Re - 1.5)^{-2} \end{aligned} \tag{10}$$

In order to avoid the necessity to recompute the heat transfer coefficients $\alpha$ constantly for changing power settings, the slowest possible heat transfer encountered during operation is used as a conservative estimate. Simplifying Equation (10) yields approximately $\alpha \sim Re^{0.8}$. Hence, a low-idle power-setting results in the highest time constants $\tau_{th}$ due to the low corresponding Reynolds-Numbers. For evaluating the thermal model, the material properties and dimensions of the components are derived based on an existing and validated model by [52]. The resulting time constants $\tau_{th}$ for the different modules in a low idle-setting are summarized in Table 1. No thermal model is utilized for the fan and LPC since the heat fluxes are considered negligible. Based on Table 1, the largest time constant is selected for the Thermal Transient Filter as $\tau_{th} = 160$ s.
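The scaling $\alpha \sim Re^{0.8}$ can be checked numerically from Equation (10). The sketch below evaluates the correlation and estimates the local exponent from two Reynolds numbers; the function names and the example values ($Re = 10^5$, $Pr = 0.7$) are assumptions made for illustration and are not operating points from the paper.

```python
import math


def friction_factor(re):
    """Friction factor zeta from the second line of Equation (10)."""
    return (1.8 * math.log10(re) - 1.5) ** -2


def nusselt(re, pr):
    """Nusselt number from the first line of Equation (10)."""
    z8 = friction_factor(re) / 8.0
    return z8 * (re - 1000.0) * pr / (
        1.0 + 12.7 * math.sqrt(z8) * (pr ** (2.0 / 3.0) - 1.0)
    )


def heat_transfer_coefficient(re, pr, length, conductivity):
    """alpha = Nu * lambda_F / l, rearranged from Equation (10)."""
    return nusselt(re, pr) * conductivity / length


def local_exponent(re, pr, factor=2.0):
    """Estimate n in Nu ~ Re^n between re and factor * re."""
    return math.log(nusselt(factor * re, pr) / nusselt(re, pr)) / math.log(factor)


exponent = local_exponent(1.0e5, 0.7)  # close to 0.8 for these example values
```

Because $\alpha$ grows with $Re$ and the time constants in Equations (4)–(6) are inversely proportional to $\alpha$, the lowest Reynolds number gives the largest time constant, which is why the low-idle setting is the conservative choice.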

Since Equation (9) defines a first-order differential equation, appropriate initial conditions have to be defined for the filtered EGT. The initial thermal state of the engine is directly dependent upon its flight history [52]. Therefore, it cannot be determined without analyzing the preceding flight. In order to avoid such an extensive analysis, the filtered EGT is initialized with the ambient temperature resulting in a conservative estimate of the material temperature, assuming the engine was completely cooled down.

According to [40], a thermally stabilized state is defined if the difference between the filtered and unfiltered EGT is below a threshold of $\Delta EGT_{max} = 5$ K, selected to be at the same order of magnitude as the EGT measurement uncertainty [16,17].

$$|\widetilde{EGT} - EGT| \leq \Delta EGT_{max} \tag{11}$$
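A minimal sketch of the Thermal Transient Filter, assuming a fixed sampling interval: Equation (9) is stepped with the exact update of a first-order lag for a piecewise-constant input, the filtered EGT is initialized with the ambient temperature, and Equation (11) is evaluated for every sample. The function name `thermally_stable`, the discretization, and the example temperatures are assumptions for illustration.

```python
import math


def thermally_stable(egt, dt, tau_th, t_ambient, delta_egt_max=5.0):
    """Return one flag per sample that is True where Equation (11) holds.

    egt:       raw EGT samples in K, taken every dt seconds
    tau_th:    thermal time constant in s (160 s in the text)
    t_ambient: initial value of the filtered EGT in K
    """
    # Exact step of a first-order lag if the input is held constant over dt.
    gain = 1.0 - math.exp(-dt / tau_th)
    filtered = t_ambient
    flags = []
    for value in egt:
        filtered += gain * (value - filtered)
        flags.append(abs(filtered - value) <= delta_egt_max)
    return flags


# Hypothetical example: constant EGT of 800 K, engine starting at 288.15 K.
flags = thermally_stable([800.0] * 1000, dt=1.0, tau_th=160.0, t_ambient=288.15)
first_stable_index = flags.index(True)
```

For this example the criterion is first met after roughly 740 s, i.e., a little more than four and a half time constants, which shows how conservative the cold-start initialization is.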

3.2.3. Regime Recognition

The Regime Recognition restricts the detection of steady-state data points to certain flight phases. The Regime Recognition can be set up such that steady-state data points are only derived in flight phases where the engine model used for computing the predicted values shows the highest accuracy. Therefore, such a regime filter lowers the uncertainty of the engine model predictions leading to a decrease in variance within the residuals, enhancing the detection rates and sensitivity of the fault detection.

3.2.4. State Transition Logic

The State Transition Logic defines a steady-state regime if the low-pass filtered data fulfill the requirements of the Thermal Transient Filter as well as the Regime Recognition, and the variance of the filtered data is below a threshold value within a given moving window of length $\Delta t_{Windows}$. Its primary purpose is to ensure that the steady-state data points fulfill the requirements of a steady-state performance synthesis calculation, i.e., the stability of the flight conditions, power-setting, thermal state, and mechanical state. The approach defined by [41] is used, which allows the direct definition of limits on the relevant measurements. The maximum variations $D$ of the measurements $y$ are defined as the min-max range

$$D_y = \max(y) - \min(y) . \tag{12}$$

The maximum variations of the parameters summarized in Table 2 are monitored for ensuring stable flight conditions. The values are application-dependent and were chosen to limit the difference of both net thrust and stored energy. The maximum variation chosen for the parameters related to stable flight conditions ensures that the net thrust at altitude varies by less than 0.5%. The maximum variation of parameters related to the mechanical equilibrium ensures that the energy stored in rotating structures varies by less than 5%. The maximum variation of the power setting parameter can be defined based on the measurement uncertainty of aircraft fuel meters [54].

If all stability conditions are met, the corresponding data points within the moving window are arithmetically averaged, resulting in a steady-state data point. The population of the residuals between those steady-state data points and the corresponding predicted values of an engine model forms the input to the following clustering approach.
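The window check of Equation (12) and the subsequent averaging can be sketched as follows. The function name `steady_state_points`, the signal name `n1`, the limit values, and the choice to let accepted windows not overlap are assumptions for illustration; the thermal stability and regime flags are omitted for brevity, and the actual limits are those of Table 2.

```python
def steady_state_points(signals, max_variation, window):
    """Average every window in which all monitored min-max ranges stay in limits.

    signals:       dict mapping a signal name to its low-pass filtered samples
    max_variation: dict mapping a monitored signal name to its limit on D_y
    window:        window length in samples
    """
    n = min(len(values) for values in signals.values())
    points = []
    start = 0
    while start + window <= n:
        chunk = {name: values[start:start + window]
                 for name, values in signals.items()}
        stable = all(
            max(chunk[name]) - min(chunk[name]) <= limit  # Equation (12)
            for name, limit in max_variation.items()
        )
        if stable:
            # Arithmetic mean of each signal over the stable window.
            points.append({name: sum(vals) / window
                           for name, vals in chunk.items()})
            start += window  # assumption: accepted windows do not overlap
        else:
            start += 1  # slide the window by one sample
    return points


# Hypothetical shaft speed trace with a step change after seven samples.
trace = {"n1": [90.0] * 7 + [95.0] * 8}
points = steady_state_points(trace, {"n1": 0.5}, window=5)
```

In this example, windows that straddle the step change are rejected, so one averaged point is obtained before and one after the change.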

3.3. Clustering

3.3.1. Principal Component Analysis

In case of a system fault, the residuals of the current flight dataset will deviate from the residuals of the datasets of preceding flights resulting in separated clusters of data points. Clustering algorithms are applied to detect such separated clusters. The nominal and faulty system performance can best be differentiated by deriving a set of not necessarily physically interpretable input parameters pointing in the directions with the most variance.

The Principal Component Analysis [36] defines an approach for such a transformation. An example for the orientation of the axes in the case of a dataset with two dedicated clusters is visualized in Figure 4. The required reduction of input parameters to the clustering algorithm is achieved by restricting the Principal Component Analysis to a subset of axes explaining most of the variance of the original dataset.

Principal Component Analysis requires the input data to be independent in time [55]. For aircraft engines, there are different root causes for time dependency acting on different time scales. The short-term temporal correlation is related to transient effects and depends upon the operating history of the flight [56]. The long-term temporal correlation is caused by gradual engine deterioration and manifests slowly over multiple flight cycles. Only steady-state data points are utilized in the presented fault-detection approach, diminishing the transient effect and the related short-term temporal correlation. Additionally, the gradual deterioration can be neglected by restricting the flight data analysis to a limited number of preceding flights.

The starting point of the Principal Component Analysis is a given dataset $Y \in \mathbb{R}^{k \times n}$ comprised of $n$ parameters and $k$ independent observations. In this particular case, the dataset $Y$ holds the measurement residuals $\Delta y$ of both the dataset of the current flight and preceding flights.

$$Y = \begin{pmatrix} \Delta y_{1,1} & \dots & \Delta y_{1,n} \\ \vdots & \ddots & \vdots \\ \Delta y_{k,1} & \dots & \Delta y_{k,n} \end{pmatrix} \tag{13}$$

The principal components of the given dataset $Y$ are linear combinations of the given parameters that decouple the covariance matrix, resulting in a system of uncorrelated parameters. Without a loss of generality, the means of the parameters are assumed to be zero. In this case, the covariance matrix can be computed using

$$C = \frac{Y^{T} Y}{k} . \tag{14}$$

The matrix of principal components $P = (p_1, p_2, \dots, p_n)$ and the corresponding diagonal matrix of eigenvalues $\Lambda = \mathrm{diag}(\lambda_1, \lambda_2, \dots, \lambda_n)$ can now be derived by diagonalizing the covariance matrix

$$C = P \Lambda P^{T} . \tag{15}$$

The magnitude of the eigenvalue $\lambda_i$ is a measure for the portion of the total variance described by its principal component $p_i$. It is assumed that principal components with large eigenvalues contain more information. Therefore, dimensionality reduction can be achieved by omitting eigenvectors with small eigenvalues. By choosing a smaller subset of $d_{PCA} < n$ principal components, a transformation matrix $T = (p_1, p_2, \dots, p_{d_{PCA}}) \in \mathbb{R}^{n \times d_{PCA}}$ can be derived for switching between the physical space of $Y$ and the subspace spanned by the principal components, in which the data are denoted $\bar{Y} \in \mathbb{R}^{k \times d_{PCA}}$:

$$\bar{Y} = Y T . \tag{16}$$

Different physical units of the measurements can bias the resulting principal components since their magnitudes are not directly comparable. Therefore, the measurements have to be normalized for direct comparison. The measurements are standardized, scaling them to a zero mean and unit variance. Given the mean $\mu_i$ and standard deviation $\sigma_i$ of the $i$-th parameter, the standardization of its $j$-th observation is defined by

$$\bar{y}_{j,i} = \frac{y_{j,i} - \mu_i}{\sigma_i} . \tag{17}$$
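The chain of Equations (14)–(17) can be sketched in plain Python. Instead of a full eigendecomposition, this sketch uses power iteration to obtain only the leading principal component, which is a swapped-in simplification; the function names and the toy data are assumptions for illustration, and every parameter is assumed to have a non-zero standard deviation.

```python
import math


def standardize(rows):
    """Scale every column to zero mean and unit variance, Equation (17)."""
    k, n = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / k for j in range(n)]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / k)
            for j in range(n)]
    return [[(r[j] - means[j]) / stds[j] for j in range(n)] for r in rows]


def covariance(rows):
    """C = Y^T Y / k for zero-mean data, Equation (14)."""
    k, n = len(rows), len(rows[0])
    return [[sum(r[a] * r[b] for r in rows) / k for b in range(n)]
            for a in range(n)]


def leading_component(cov, iterations=200):
    """Largest eigenvalue and eigenvector of cov by power iteration."""
    n = len(cov)
    vec = [1.0 + 0.1 * j for j in range(n)]  # arbitrary non-zero start vector
    for _ in range(iterations):
        nxt = [sum(cov[a][b] * vec[b] for b in range(n)) for a in range(n)]
        norm = math.sqrt(sum(v * v for v in nxt))
        vec = [v / norm for v in nxt]
    eigenvalue = sum(vec[a] * sum(cov[a][b] * vec[b] for b in range(n))
                     for a in range(n))
    return eigenvalue, vec


def project(rows, vec):
    """Scores along one principal component, Equation (16) with d_PCA = 1."""
    return [sum(r[j] * vec[j] for j in range(len(vec))) for r in rows]


# Toy residuals: the second parameter is a linear function of the first.
data = [[float(x), 2.0 * x + 1.0] for x in range(10)]
scaled = standardize(data)
eigenvalue, component = leading_component(covariance(scaled))
scores = project(scaled, component)
```

For the perfectly correlated toy data, the leading component carries the entire variance of both standardized parameters, so the second dimension could be dropped without loss. Power iteration can fail if the start vector happens to be orthogonal to the leading eigenvector; a library eigensolver avoids this.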

In the previously derived formulation of the Principal Component Analysis, all data points are weighted equally. In the present use case, the test data of the current flight are compared to a historical dataset of preceding flights. Hence, the dataset of the current flight contains fewer data points than the dataset containing several preceding flights. This imbalance in the available data points results in a possibly diminishing change in variance caused by a fault leading to difficulties in fault detection. In order to account for the potentially higher information content of the current flight, its data points are weighted to give both datasets the same magnitude. This approach leads to the Weighted Principal Component Analysis [57].
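One plausible reading of this weighting, given here only as a sketch, is to scale each data point of the current flight so that the current flight and the historical dataset contribute the same total weight to the covariance matrix. The function name `weighted_covariance` and the uniform per-point weights are assumptions and may differ from the scheme in [57]; the data are assumed to be standardized to zero mean beforehand.

```python
def weighted_covariance(history, current):
    """Weighted covariance in which both datasets carry equal total weight.

    history, current: lists of zero-mean residual vectors (rows).
    """
    n = len(history[0])
    w_hist = 1.0
    # Assumption: every current-flight point receives the same weight.
    w_cur = len(history) / len(current)
    total = w_hist * len(history) + w_cur * len(current)
    cov = [[0.0] * n for _ in range(n)]
    for weight, rows in ((w_hist, history), (w_cur, current)):
        for r in rows:
            for a in range(n):
                for b in range(n):
                    cov[a][b] += weight * r[a] * r[b]
    return [[value / total for value in row] for row in cov]


# Four historical points and a single point of the current flight.
hist = [[1.0, 0.0] for _ in range(4)]
cur = [[0.0, 2.0]]
cov_w = weighted_covariance(hist, cur)
```

Here the single point of the current flight is weighted four times as strongly as each historical point, so a deviation seen in only a few current data points still shapes the principal components.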

3.3.2. One-Class Support Vector Machine

The One-Class Support Vector Machine is an algorithm for cluster analysis. It models the nominal system performance by defining a hypersphere with center C and radius R, fully enclosing the provided training data [29]. A visualization of the One-Class Support Vector Machine is displayed in Figure 5.

The radius and correspondingly the volume of the hypersphere are minimized in order to identify outliers best. The given requirements are cast into an optimization problem:

$$\begin{aligned} \text{Objective function:} \quad & \min_{R, \chi, C} \; R^2 + \frac{1}{\nu N} \sum_{i=1}^{N} \chi_i \\ \text{Constraint 1:} \quad & \|x_i - C\|^2 \leq R^2 + \chi_i \\ \text{Constraint 2:} \quad & \chi_i \geq 0 \quad \text{for } i = 1, \dots, N \end{aligned} \tag{18}$$

The second term in the objective function is a penalty term that handles statistical outliers, allowing a certain number of data points to lie outside the boundary of the hypersphere. The distance between the outliers and the hypersphere is defined as $\chi$. The magnitude of the penalty can be tuned using the regularization parameter $\nu$.

In general, the smaller the regularization parameter $\nu$, the larger the volume of the hypersphere since the optimization encloses more outliers due to the heavy penalty. The first constraint requires every data point to lie within the hypersphere up to its slack $\chi_i$, so that only a few points remain outside, and the second constraint keeps the slack variables non-negative. After defining the hypersphere, the set of outliers is defined by the decision function:

f(x) = \operatorname{sgn}(R^{2} - {||x - C||}^{2}) = \begin{cases} +1 & \text{if } x \text{ is nominal} \\ -1 & \text{if } x \text{ is an outlier} \end{cases}

(19)

The hypersphere is defined based on the dataset of preceding flights. In the next step, the data of the current flight are evaluated utilizing the decision function in Equation (19). A fault is detected based on the total number of outliers found. In general, there will always be a certain number of statistical outliers. However, if the total number of outliers exceeds a predefined threshold $n_{limit}$, the outliers are considered systematic, and a fault is detected. The total number of outliers tolerated directly influences the algorithm’s sensitivity and must be defined based on a trade-off between fault-detection sensitivity and false positives.
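The decision rule can be sketched as follows, assuming that the center and radius of the hypersphere are already known from training. The function names, the two-dimensional sample points, and the threshold value are hypothetical, and points exactly on the boundary are treated as nominal here.

```python
def decision(x, center, radius):
    """Equation (19): +1 if x lies inside the hypersphere, -1 otherwise."""
    dist_sq = sum((xi - ci) ** 2 for xi, ci in zip(x, center))
    # Points exactly on the boundary are counted as nominal (assumption).
    return 1 if radius ** 2 - dist_sq >= 0 else -1

def fault_detected(points, center, radius, n_limit):
    """Flag a fault when the number of outliers exceeds the threshold n_limit."""
    n_outliers = sum(1 for x in points if decision(x, center, radius) == -1)
    return n_outliers > n_limit, n_outliers

# Hypothetical steady-state data points of the current flight
points = [(0.1, 0.2), (0.5, -0.5), (1.5, 0.0), (0.0, 2.0), (-1.2, -1.2)]
fault, n_outliers = fault_detected(points, center=(0.0, 0.0), radius=1.0, n_limit=2)
```

Three of the five points lie outside the unit circle, which exceeds the tolerated two outliers, so the flight is flagged.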

Finding an appropriate hypersphere with a minimum volume might be difficult for an arbitrary nonlinear multi-dimensional problem. In such cases, the dataset can be mapped into a feature space of higher dimensionality, where a simple hypersphere enclosing the dataset can be defined [58]. A radial basis function kernel is utilized in the given implementation for such a transformation.
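The effect of the kernel can be illustrated with a minimal NumPy sketch. It is not the optimized One-Class Support Vector Machine: the center of the hypersphere is simply taken as the mean of the training data in feature space (uniform weights instead of optimized ones), and the kernel width `gamma` is a hypothetical value. The sketch only shows that distances to the center can be evaluated through kernel values alone, without constructing the higher-dimensional coordinates.

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    """k(a, b) = exp(-gamma * ||a - b||^2) for all pairs of rows of A and B."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def feature_space_dist_sq(X_train, X_test, gamma):
    """Squared feature-space distance of each test point to the training mean."""
    k_cc = rbf_kernel(X_train, X_train, gamma).mean()        # <c, c>
    k_xc = rbf_kernel(X_test, X_train, gamma).mean(axis=1)   # <phi(x), c>
    return 1.0 - 2.0 * k_xc + k_cc    # k(x, x) = 1 for the RBF kernel

rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 2))             # stand-in for nominal data
X_test = np.array([[0.0, 0.0], [6.0, 6.0]])     # one typical, one distant point
d = feature_space_dist_sq(X_train, X_test, gamma=0.5)
```

The distant point receives the larger distance, so a radius threshold in feature space separates it from the nominal cluster.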

3.4. Data Synthesis

In order to test the detection rates of the developed algorithm on a variety of fault cases, a database of synthetic data was generated using a transient engine model. The engine model was built using NPSS [59] and resembles a two-spool turbofan engine of the 140 kN thrust class. The main components of the turbofan engine are the fan, the low-pressure compressor (LPC), the high-pressure compressor (HPC), the burner, the high-pressure turbine (HPT), and the low-pressure turbine (LPT). A schematic diagram of the engine under consideration, including the available measurements and their position, is visualized in Figure 6.

The measurements are chosen according to [16,17]. In order to model the transient engine performance, shaft inertias, heat fluxes, and tip clearances due to thermal expansion are considered. The heat fluxes are modeled according to Section 3.2.2. The model of the tip clearances is based on existing publications [60]. Volume packing effects were neglected since they cannot be resolved with the sampling rates provided.

The Mach number ($Ma$), the total ambient pressure ($p_{t0}$), the total ambient temperature ($T_{t0}$), and the relative shaft speed of the fan ($N1 / N1_{max}$) are used as input parameters for the engine model.

In order to mimic the variations and the scatter experienced during operation, the input time series were derived from published continuously sampled flight data of a commercially operated regional jet [61]. The raw measurements within the published database were acquired at different sampling rates. In order to utilize them for data synthesis, all data were interpolated to a common sampling rate of 1 Hz. Additionally, only complete flight missions were extracted from the dataset, and the start-up and shut-down of the engine were cut off since the engine model used is not calibrated for these operations.
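Resampling to a common rate can be sketched with linear interpolation. The interpolation scheme used by the authors is not stated, so linear interpolation is an assumption here, and the channel names, sampling rates, and values are hypothetical.

```python
import numpy as np

def resample_to_1hz(t, values):
    """Linearly interpolate a signal onto a 1 Hz time grid (t in seconds)."""
    t_common = np.arange(np.ceil(t[0]), np.floor(t[-1]) + 1.0, 1.0)
    return t_common, np.interp(t_common, t, values)

# Hypothetical channels: fan speed at 4 Hz, ambient temperature at 0.25 Hz
t_fast = np.arange(0.0, 20.25, 0.25)
n1 = 80.0 + 0.1 * t_fast
t_slow = np.arange(0.0, 24.0, 4.0)
tt0 = 250.0 - 0.5 * t_slow

t1, n1_1hz = resample_to_1hz(t_fast, n1)
t2, tt0_1hz = resample_to_1hz(t_slow, tt0)
```

After resampling, both channels share the same time stamps and can be passed to the engine model sample by sample.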

The published dataset contains some short-haul flights with limited cruise segments since it was acquired from regional jets. The proposed fault-detection algorithm requires a statistically relevant number of steady-state data points. It is expected to work well for flights with extended cruise segments experienced in medium- and long-haul flights. In order to exclude extreme short-haul flights, only flights with cruise segments of at least one hour were considered. The time series of input parameters for an example flight is visualized in Figure 7. Altogether, 100 flights were sampled and synthesized from the dataset.

For each of the 100 sampled flights, transient datasets for the nominal and faulty engine performance were generated. For simulating the component faults, the capacities $Q_{map}$ and efficiencies $η_{map}$ derived from the component maps were adjusted by scaling factors defined by Equations (20) and (21).

ΔQ = \frac{Q - Q_{map}}{Q_{map}}

(20)

Δη = η - η_{map}

(21)

The scaling factors $ΔQ$ and $Δη$ are based on the OBIDICOTE test cases [62] summarized in Table 3. Additionally, to take measurement uncertainty into account, Gaussian white noise and a constant sensor bias according to [16,17] were superimposed on the output of the engine model. The corresponding sensor noise and biases are summarized in Table 4.
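The fault injection and the measurement error model can be sketched as follows. Equations (20) and (21) are solved for the faulty capacity and efficiency, and noise and bias are added to a clean signal. All numerical values are hypothetical and are not taken from Table 3 or Table 4; the function names are illustrative only.

```python
import random

def apply_fault(q_map, eta_map, delta_q, delta_eta):
    """Solve Equations (20) and (21) for the faulty capacity and efficiency."""
    q = q_map * (1.0 + delta_q)    # delta_q is a relative change
    eta = eta_map + delta_eta      # delta_eta is an absolute change
    return q, eta

def add_measurement_error(signal, sigma, bias, seed=0):
    """Superimpose Gaussian white noise and a constant bias on a clean signal."""
    rng = random.Random(seed)
    return [s + bias + rng.gauss(0.0, sigma) for s in signal]

# Hypothetical fault: 1 % capacity loss and 0.5 points efficiency loss
q, eta = apply_fault(q_map=100.0, eta_map=0.88, delta_q=-0.01, delta_eta=-0.005)

# Hypothetical temperature signal with 1 K noise and 2 K sensor bias
noisy = add_measurement_error([600.0] * 1000, sigma=1.0, bias=2.0)
```

Note the asymmetry between the two definitions: the capacity change is relative to the map value, whereas the efficiency change is an absolute offset.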

4. Results

4.1. Test and Verification of the Steady-State Data Filter

4.1.1. Test of the Steady-State Data Filter with Synthetic Datasets

In the first step, the Steady-State Data Filter was tested based on the 100 synthesized flights. As described in Section 3.2.1, an adjustment concerning the position of the low-pass filter is proposed. The adjusted design leads to a significant increase in identified steady-state data points.

For the synthetic flights, on average, 36 steady-state data points were obtained by applying the Steady-State Data Filter defined by [40]. Using the low-pass filter as a preprocessing step nearly doubled the average number of detected stable flight segments, resulting in approximately 65 identifications per flight. The steady-state flight conditions were solely detected during cruise and taxi, with on average 60 out of 65 data points sampled during cruise. For most flights, there were no detections during taxi-in and taxi-out. A comparison of both algorithms is visualized in Figure 8 for an example flight. For this particular flight, eight steady-state data points were derived by the original algorithm [40], whilst 37 detections were achieved by the approach proposed in this paper.
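Why filtering first helps can be illustrated with a generic sketch. It does not reproduce the Steady-State Data Filter of [40]: the first-order low-pass filter, the window-based standard-deviation criterion, and all parameter values are assumptions chosen only to show the mechanism.

```python
import random
import statistics

def low_pass(signal, alpha):
    """First-order (exponential) low-pass filter with 0 < alpha <= 1."""
    out = [signal[0]]
    for s in signal[1:]:
        out.append(out[-1] + alpha * (s - out[-1]))
    return out

def count_steady_windows(signal, window, max_std):
    """Count non-overlapping windows whose standard deviation is below max_std."""
    count = 0
    for start in range(0, len(signal) - window + 1, window):
        if statistics.pstdev(signal[start:start + window]) < max_std:
            count += 1
    return count

rng = random.Random(1)
# Hypothetical fan speed in percent: stable operating point with sensor noise
raw = [85.0 + rng.gauss(0.0, 0.3) for _ in range(600)]

n_raw = count_steady_windows(raw, window=30, max_std=0.2)
n_filtered = count_steady_windows(low_pass(raw, alpha=0.1), window=30, max_std=0.2)
```

On the unfiltered signal, sensor noise alone violates the stability criterion in almost every window, although the operating point is constant. After filtering, nearly all windows pass.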

4.1.2. Verification of the Steady-State Data Filter with In-Flight Measurements

The implemented Steady-State Data Filter was additionally verified based on three continuously sampled flights of a commercially operated medium-range aircraft. The datasets were acquired with sampling rates of 1 Hz. The shaft speed of the fan as well as the identified steady-state data points are visualized in Figure 9. The steady-state data points are only detected during cruise. The lack of detection during taxiing is mainly attributed to the short timespans spent while taxiing.

Due to the conservative definition of the Thermal Transient Filter, a thermal equilibrium could not be reached during taxi. In general, the longer the taxi phase, the higher the chance of detecting steady-state data points. Compared to the dataset of the regional jet in the previous section, the cruise segment is more stable, as can be seen by comparing the fan shaft speed $N_{1}$. Combined with the longer time spent in cruise, this significantly increases the number of identified steady-state data points. In total, between 185 and 241 stable data points were detected, leading to a statistically significant number of data points and making fault detection within one flight possible.

4.2. Parameter Study Clustering

The proposed clustering toolchain is controlled via four hyperparameters affecting the algorithm’s ability to distinguish between nominal and faulty engine performance: the number of principal components retained $d_{PCA}$, the regularization parameter of the One-Class Support Vector Machine $ν$, the number of flights comprising the training dataset $n_{Training}$, and the number of outliers tolerated until fault detection $n_{limit}$. The parameters were derived based on analysis of the synthetic database and selected operational in-flight measurements of commercial flights. As described in the previous section, no steady-state data points could be detected during taxi for most flights. Therefore, the corresponding data points during taxi were neglected, ensuring a consistent database of cruise data points for each flight.

4.2.1. Definition of the Principal Components Retained $d_{PCA}$

Dimensionality reduction is always related to a certain degree of information loss. In the case of Principal Component Analysis, this information loss can be quantified by evaluating the Cumulative Percent Variance $CPV$. Since the magnitude of the eigenvalue $λ_{i}$ is related to the variance described by its corresponding eigenvector $p_{i}$, the Cumulative Percent Variance $CPV$ defines the portion of the original variance retained after dimensionality reduction from $n$ dimensions to $d_{PCA}$.

CPV(d_{PCA}) = \frac{\sum_{i = 0}^{d_{PCA}} λ_{i}}{\sum_{i = 0}^{n} λ_{i}}

(22)
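Equation (22) can be evaluated directly from the eigenvalues of the covariance matrix. The sketch below assumes that the numerator sums the $d_{PCA}$ largest eigenvalues and the denominator sums all of them. The eigenvalues are hypothetical and the helper names are illustrative.

```python
import numpy as np

def cumulative_percent_variance(eigenvalues):
    """CPV for 1, 2, ..., n retained components (eigenvalues sorted descending)."""
    lam = np.sort(np.asarray(eigenvalues, dtype=float))[::-1]
    return np.cumsum(lam) / lam.sum()

def smallest_d_for(eigenvalues, target):
    """Smallest number of components whose CPV reaches the target."""
    cpv = cumulative_percent_variance(eigenvalues)
    return int(np.argmax(cpv >= target)) + 1

# Hypothetical eigenvalues for a seven-sensor residual dataset
lam = [5.0, 3.0, 1.5, 0.3, 0.1, 0.05, 0.05]
cpv = cumulative_percent_variance(lam)
d_pca = smallest_d_for(lam, target=0.9)
```

For these values, two components retain 80 % of the variance and three retain 95 %, so three components are the smallest choice that satisfies a 0.9 target.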

The mean Cumulative Percent Variance $CPV$ computed over all available flights and fault cases is visualized in Figure 10 for different numbers of principal components retained $d_{PCA}$. The Cumulative Percent Variance $CPV$ is low for the nominal case since this case is dominated by random scatter without a distinct characteristic. Similarly, the poor observability of fault case b results in a Cumulative Percent Variance $CPV$ that is comparably low, similar to the nominal case. For the remaining fault cases, the Cumulative Percent Variance $CPV$ indicates the formation of clusters influencing the variance and the direction of the principal components.

The observed difference between retaining two or four principal components $d_{PCA}$ is small. Choosing three principal components ensures a Cumulative Percent Variance of $CPV \geq 0.9$ and increases the computational efficiency. Furthermore, retaining three dimensions after dimensionality reduction allows the data to be visualized, providing maintenance engineers with the means to check the fault-detection results.

4.2.2. Definition of the Regularization Parameter $ν$ and Number of Flights $n_{Training}$ Comprising the Training Dataset

The regularization parameter $ν$ controls the extent of the boundary defining the nominal system performance. The volume of the hypersphere enclosing the nominal data points grows when lowering the regularization parameter $ν$. While a small volume of the hypersphere enhances the detectability of faults, it also increases the probability of detecting statistical outliers among the nominal data points. Potential faults can best be identified if the ratio Q between detected outliers for faulty and nominal system performance is highest. The outlier ratio Q, therefore, serves as a quality parameter.

Q_{j} = \frac{\text{Number of outliers for fault case } j}{\text{Number of outliers for nominal system performance}}

(23)
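A sweep over the regularization parameter based on Equation (23) can be sketched as follows. The outlier counts are hypothetical placeholders for the results of evaluating the One-Class Support Vector Machine at each value of $ν$, and the handling of a zero denominator is an assumption.

```python
def outlier_ratio(n_outliers_fault, n_outliers_nominal):
    """Equation (23): outliers for a fault case relative to the nominal case."""
    if n_outliers_nominal == 0:
        # Assumed convention: the ratio is unbounded if no nominal outliers occur.
        return float("inf") if n_outliers_fault > 0 else float("nan")
    return n_outliers_fault / n_outliers_nominal

# Hypothetical (faulty, nominal) outlier counts for three values of nu
counts = {0.5: (60, 30), 0.1: (55, 6), 0.01: (50, 1)}
ratios = {nu: outlier_ratio(*c) for nu, c in counts.items()}
best_nu = max(ratios, key=ratios.get)
```

In this made-up sweep, lowering $ν$ removes nominal outliers much faster than faulty ones, so the ratio grows from 2 to 50.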

The average ratio Q for all flights and all possible fault cases was evaluated for determining the regularization parameter $ν$. The relation between the regularization parameter $ν$, the number of flights comprising the training dataset $n_{Training}$, and the outlier ratio Q is visualized in Figure 11. The outlier ratio Q improves with decreasing regularization parameter $ν$ until a plateau is reached. The results indicate that increasing the volume of the hypersphere improves the representation of the nominal system performance without including regions related to faulty system performance.

The regions of faulty and nominal system performance are therefore well separated. The outlier ratio Q also depends on the number of flights comprising the training dataset $n_{Training}$. In general, the data-driven representation of the nominal system performance improves when increasing the number of data points, as more combinations of environmental conditions and power settings are included. However, the improvement in the outlier ratio Q diminishes with increasing $n_{Training}$, while the computational effort required for evaluating the datasets increases.

4.2.3. Definition of the Threshold $n_{limit}$ for Fault Detection

In the last step, the detection rates of the proposed fault-detection approach are evaluated with respect to the number of outliers tolerated until fault detection $n_{limit}$. The remaining parameters of the cluster analysis are prescribed as derived in the previous sections. For quantifying the detection rates for the different faults, the true positives ($TP$) and false positives ($FP$) are computed, defined as

TP_{j} = \frac{\text{Number of faults detected for fault case } j}{\text{Total number of flights with fault case } j}

(24)

FP = \frac{\text{Number of faults detected for nominal flights}}{\text{Total number of nominal flights}} .

(25)
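Equations (24) and (25) share the same structure and can be sketched with one helper. The outlier counts per flight and the threshold are hypothetical; a flight counts as detected when its outlier count exceeds the threshold.

```python
def detection_rate(outlier_counts, n_limit):
    """Share of flights whose outlier count exceeds the threshold n_limit."""
    detected = sum(1 for n in outlier_counts if n > n_limit)
    return detected / len(outlier_counts)

# Hypothetical outlier counts per flight
faulty_flights = [40, 35, 52, 8, 47]     # flights simulated with fault case j
nominal_flights = [2, 0, 5, 1, 11]       # flights with nominal performance

n_limit = 9
tp = detection_rate(faulty_flights, n_limit)    # Equation (24)
fp = detection_rate(nominal_flights, n_limit)   # Equation (25)
```

Raising the threshold lowers both rates, which is the trade-off between sensitivity and false positives described above.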

In order to analyze the effect of observability on the fault-detection rates, two different measurement suites were analyzed. The first measurement suite is referred to as extensive measurement suite and covers the temperatures and pressures in most sections of the aircraft engine based on the instrumentation defined by [16,17]. The second measurement suite defines the minimum instrumentation provided by most aircraft engines and is therefore referred to as minimum measurement suite.

4.2.4. Detection Rates for the Extensive Measurement Suite

The first measurement suite covers all measurements displayed in Figure 6 as defined in [16,17]. The measurements for $T_{t0}$, $p_{t0}$, $Ma$, and $N_{1}$ are used to define the environmental conditions and power setting for the reference engine model representing nominal engine performance. Therefore, these measurements are not available for fault detection, and only the remaining sensors $N_{2}$, $T_{t25}$, $T_{t3}$, $T_{t5}$, $p_{t25}$, $p_{s3}$, and $p_{t5}$ are analyzed via the proposed clustering approach.

The detection rates for the extensive measurement suite are summarized in Table 5. In general, the total number of statistical outliers rises with the total number of available data points. Hence, the presented results are based on the fault-detection threshold $n_{limit}$ normalized by the average number of steady-state data points detected per flight $⌀ n_{Flight}$. Except for fault case b, all the different faults can be detected with similar confidence.

The decrease in true positives with increasing $n_{limit}$ can largely be attributed to the total number of steady-state data points available for a single flight. Fault detection is not possible if $n_{limit}$ exceeds the number of steady-state data points available for a single flight. Fault case b is in general more difficult to detect, with detection rates decreasing more rapidly with increasing $n_{limit}$. The exchange rates for the different fault cases are displayed in Figure 12. The exchange rates of fault case b show the limited observability of this fault. These comparably low exchange rates cause its low detection rates.

This particular fault mainly affects the bypass and cannot be detected using the given measurements. The overall detection rates lead to the conclusion that fault detection is possible within one flight. For example, choosing $n_{limit} / ⌀ n_{Flight} = 0.15$ results in an overall detection rate of 99% for the fault cases a and c-l while keeping false positives below 2%.

4.2.5. Detection Rates for the Minimum Measurement Suite

Apart from the measurements defining the environmental conditions $T_{t0}$, $p_{t0}$, and $Ma$, the minimum measurement suite provides only the spool speeds $N_{1}$ and $N_{2}$ as well as the exhaust-gas temperature $T_{t5}$. These are the measurements displayed in the cockpit and are covered by all aircraft data acquisition systems. Taking into account the required input variables for the reference engine model, only $N_{2}$ and $T_{t5}$ remain for fault detection. Due to the limited number of measurements available for the clustering, no dimensionality reduction is necessary to analyze the dataset, and Principal Component Analysis is omitted.

The resulting detection rates for the dataset assuming a minimum measurement suite are summarized in Table 6. The true positive detection rates are similar to those presented in the previous section for the extensive measurement suite. The only significant difference is fault case f, for which the detection rate drops significantly with increasing $n_{limit}$. Considering the exchange rates in Figure 12, fault case f mainly affects the pressure $p_{t3}$ and temperature $T_{t3}$ after the HPC. Hence, it is not observable based on the available measurements of the minimum measurement suite. Since fewer measurements are available, the random scatter within the dataset is reduced, resulting in fewer false positives than in the analysis based on the extensive measurement suite. The analysis demonstrates that fault detection is possible even with a minimum measurement suite, which is available on a wide variety of aircraft.

4.3. Verification of the Clustering Toolchain

Some simplifications were made for the synthetic datasets, e.g., no gradual deterioration was considered. In order to verify the chosen parameters for the clustering workflow consisting of Principal Component Analysis and One-Class Support Vector Machine, two datasets containing operational data with an HPC and an LPC fault are provided. The scope of measurements differs from the one in the previous section as the fuel flow is additionally included. Altogether, the datasets contain the measurement residuals of eight sensors: $ΔW_{fuel}$, $ΔN1$, $ΔN2$, $ΔP25$, $ΔP3$, $ΔT25$, $ΔT3$, and $ΔT5$.

Since the data were derived from state-of-the-art engine condition monitoring systems, only a few steady-state cruise snapshots per flight are available. In order to resemble the multitude of data points expected from the proposed approach, several flights were combined to represent the dataset of preceding flights and the dataset of the current flight.

Since there is no information concerning the point in time of fault initiation, the time series were visually separated. Selected measurement residuals are visualized in Figure 13a for the HPC fault and Figure 14a for the LPC fault. The data shown document a sufficiently low level of engine deterioration.

The outlier ratio Q in Figure 13b,c indicates that the chosen regularization parameter $ν$ results in optimum detectability of the faults. Additionally, the visualization of the space of principal components of the HPC fault in Figure 13c and the LPC fault in Figure 14c shows that the clusters defining the nominal and faulty engine performance are well separated and therefore distinguishable. This leads to the conclusion that the fault cases can be detected utilizing the proposed clustering approach.

For the LPC fault, there is some overlap between the two clusters. This overlap is mainly caused by the fault initiation of the LPC fault. Considering the time series in Figure 13a and Figure 14a, the HPC fault occurs nearly instantaneously, whereas the LPC fault manifests over several cycles.

5. Discussion

A novel approach for detecting faults utilizing continuously sampled flight data was developed. The algorithm can detect arbitrary faults and requires only datasets representing the nominal engine performance. The detection scheme was tested based on synthetic datasets utilizing the full-flight data of commercial regional jets, ensuring realistic variations of the environmental conditions and control settings. The detection rates of various component faults showed high overall detection rates indicating that fault detection was achievable after a single flight, removing the existing latency of state-of-the-art methods.

Fault detection could even be achieved utilizing minimal instrumentation of commercial aircraft. In addition to testing the detection scheme with synthetic datasets, the individual components of the algorithm were verified with in-flight measurements of commercial flights. The results of this verification provide a proof-of-concept for the applicability of the proposed detection approach.

Since steady-state data points are mainly derived during cruise, the method is expected to work well, especially for flights that spend a significant time in cruise and allow the generation of a statistically significant number of steady-state data points. For flights with limited cruise segments, the number of identified steady-state data points might not be sufficient. In order to analyze such flights, either the stability criteria defining stable flight segments have to be adjusted or alternative fault-detection approaches taking additional flight segments into account have to be developed.

Author Contributions

Conceptualization, M.W., S.S., D.B., C.K. and J.M.; methodology, M.W. and S.S.; software, M.W.; validation, M.W., S.S., D.B., C.K. and J.M.; writing—original draft preparation, M.W. and S.S.; writing—review and editing, S.S., D.B., C.K. and J.M.; visualization, M.W.; supervision, S.S.; and funding acquisition, S.S. All authors have read and agreed to the published version of the manuscript.

Funding

This project has received funding from the German Federal Ministry of Economic Affairs and Energy (BMWI) within the LUFO-V project OPsTIMAL (grant number 20X1711D).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Publicly available datasets were analyzed in this study. This data can be found here: https://c3.ndc.nasa.gov/dashlink/projects/85/ (accessed on 8 May 2021).

Conflicts of Interest

The authors declare no conflict of interest.

References

IATA. Airline Maintenance Cost Executive Commentary; Technical report; IATA: Montreal, QC, Canada, 2016. [Google Scholar]
Fentaye, A.; Baheta, A.T.; Gilani, S.I.; Kyprianidis, K.G. A Review on Gas Turbine Gas-Path Diagnostics: State-of-the-Art Methods, Challenges and Opportunities. Aerospace 2019, 6, 83. [Google Scholar] [CrossRef] [Green Version]
Kurz, R.; Brun, K. Degradation in Gas Turbine Systems. J. Eng. Gas Turb. Power 2001, 123, 70–77. [Google Scholar] [CrossRef] [Green Version]
Koskoletos, O.A.; Aretakis, N.; Alexiou, A.; Romesis, C.; Mathioudakis, K. Evaluation of Aircraft Engine Diagnostic Methods Through ProDiMES. In Volume 6: Ceramics; Controls, Diagnostics, and Instrumentation, Education; Manufacturing Materials and Metallurgy; ASME: Singapore, 2018; p. V006T05A023. [Google Scholar] [CrossRef]
Pérez-Ruiz, J.L.; Tang, Y.; Loboda, I. Aircraft Engine Gas-Path Monitoring and Diagnostics Framework Based on a Hybrid Fault Recognition Approach. Aerospace 2021, 8, 232. [Google Scholar] [CrossRef]
Loboda, I.; Pérez-Ruiz, J.L.; Yepifanov, S. A Benchmarking Analysis of a Data-Driven Gas Turbine Diagnostic Approach. In Volume 6: Ceramics; Controls, Diagnostics, and Instrumentation; Education; Manufacturing Materials and Metallurgy; ASME: Singapore, 2018; p. V006T05A027. [Google Scholar] [CrossRef]
Lipowsky, H.; Staudacher, S.; Bauer, M.; Schmidt, K.J. Application of Bayesian Forecasting to Change Detection and Prognosis of Gas Turbine Performance. J. Eng. Gas Turb. Power 2010, 132, 1–8. [Google Scholar] [CrossRef]
Simon, D.L.; Bird, J.; Davison, C.; Volponi, A.; Iverson, R.E. Benchmarking Gas Path Diagnostic Methods: A Public Approach. Controls Diagnost. Instrum. Cycle Innovat. Electr. Power 2008, 2008, 325–336. [Google Scholar] [CrossRef] [Green Version]
Fentaye, A.D.; Zaccaria, V.; Kyprianidis, K. Aircraft Engine Performance Monitoring and Diagnostics Based on Deep Convolutional Neural Networks. Machines 2021, 9, 337. [Google Scholar] [CrossRef]
Xu, J.; Liu, X.; Wang, B.; Lin, J. Deep Belief Network-Based Gas Path Fault Diagnosis for Turbofan Engines. IEEE Access 2019, 7, 170333–170342. [Google Scholar] [CrossRef]
Fentaye, A.D.; Ul-Haq Gilani, S.I.; Baheta, A.T.; Li, Y.G. Performance-based fault diagnosis of a gas turbine engine using an integrated support vector machine and artificial neural network method. Proc. Inst. Mech. Eng. Part A J. Power Energy 2019, 233, 786–802. [Google Scholar] [CrossRef]
Fu, X.; Luo, H.; Zhong, S.; Lin, L. Aircraft engine fault detection based on grouped convolutional denoising autoencoders. Chin. J. Aeronaut. 2019, 32, 296–307. [Google Scholar] [CrossRef]
Sheng Zhong, S.; Fu, S.; Lin, L. A novel gas turbine fault diagnosis method based on transfer learning with CNN. Meas. J. Int. Meas. Confeder. 2019, 137, 435–453. [Google Scholar] [CrossRef]
Wang, Z.; Zarader, J.L.; Argentieri, S. A novel aircraft engine fault diagnostic and prognostic system based on SVM. In Proceedings of the 2012 IEEE International Conference on Condition Monitoring and Diagnosis, CMD 2012, Bali, Indonesia, 23–27 September 2012; pp. 723–728. [Google Scholar] [CrossRef]
Wu, Z.; Lin, W.; Ji, Y. An Integrated Ensemble Learning Model for Imbalanced Fault Diagnostics and Prognostics. IEEE Access 2018, 6, 8394–8402. [Google Scholar] [CrossRef]
Tang, L.; Volponi, A.J.; Prihar, E. Extending engine gas path analysis using full flight data. Proc. ASME Turbo Expo 2019, 6, 1–11. [Google Scholar] [CrossRef]
Volponi, A.J.; Tang, L. Improved Engine Health Monitoring Using Full Flight Data and Companion Engine Information. SAE Int. J. Aerospace 2016, 9, 91–102. [Google Scholar] [CrossRef]
Losi, E.; Venturini, M.; Manservigi, L.; Ceschini, G.F.; Bechini, G. Anomaly Detection in Gas Turbine Time Series by Means of Bayesian Hierarchical Models. J. Eng. Gas Turb. Power 2019, 141, 1–9. [Google Scholar] [CrossRef]
Bai, M.; Liu, J.; Ma, Y.; Zhao, X.; Long, Z.; Yu, D. Long Short-Term Memory Network-Based Normal Pattern Group for Fault Detection of Three-Shaft Marine Gas Turbine. Energies 2020, 14, 13. [Google Scholar] [CrossRef]
MacQueen, J. Some methods for classification and analysis of multivariate observations. In Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability; University of California Press: California, UK, 1967; Volume 1, pp. 281–297. [Google Scholar] [CrossRef]
Jung, S.H.; Huh, J.H. A Novel on Transmission Line Tower Big Data Analysis Model Using Altered K-means and ADQL. Sustainability 2019, 11, 3499. [Google Scholar] [CrossRef] [Green Version]
Islam, M.R.; Kim, Y.H.; Kim, J.Y.; Kim, J.M. Detecting and Learning Unknown Fault States by Automatically Finding the Optimal Number of Clusters for Online Bearing Fault Diagnosis. Appl. Sci. 2019, 9, 2326. [Google Scholar] [CrossRef] [Green Version]
Ester, M.; Kriegel, H.P.; Sander, J.; Xu, X. A density-based algorithm for discovering clusters in large spatial databases with noise. In The Second International Conference on Knowledge Discovery and Data Mining; AAAI Press: Montreal, QC, Canada, 1996; pp. 226–231. [Google Scholar]
Dani, M.C.; Freixo, C.; Jollois, F.X.; Nadif, M. Unsupervised anomaly detection for Aircraft Condition Monitoring System. In Proceedings of the IEEE Aerospace, Big Sky, MT, USA, 7–14 March 2015; Volume 2015, pp. 1–7. [Google Scholar] [CrossRef]
Lee, C.H.; Shin, H.S.; Tsourdos, A.; Skaf, Z. Anomaly detection of aircraft engine in FDR (Flight Data Recorder) data. IET Conf. Publ. 2017, 2017, 1–6. [Google Scholar] [CrossRef] [Green Version]
Sheridan, K.; Puranik, T.G.; Mangortey, E.; Pinon-Fischer, O.J.; Kirby, M.; Mavris, D.N. An Application of DBSCAN Clustering for Flight Anomaly Detection During the Approach Phase. In AIAA Scitech 2020 Forum; American Institute of Aeronautics and Astronautics: Reston, VA, USA, 2020; Volume 1. [Google Scholar] [CrossRef]
Hsu, J.Y.; Wang, Y.F.; Lin, K.C.; Chen, M.Y.; Hsu, J.H.Y. Wind Turbine Fault Diagnosis and Predictive Maintenance Through Statistical Process Control and Machine Learning. IEEE Access 2020, 8, 23427–23439. [Google Scholar] [CrossRef]
Nikpey Somehsaraei, H.; Ghosh, S.; Maity, S.; Pramanik, P.; De, S.; Assadi, M. Automated Data Filtering Approach for ANN Modeling of Distributed Energy Systems: Exploring the Application of Machine Learning. Energies 2020, 13, 3750. [Google Scholar] [CrossRef]
Tax, D.M.; Duin, R.P. Support Vector Data Description. Mach. Learn. 2004, 54, 45–66. [Google Scholar] [CrossRef] [Green Version]
Sadooghi, M.S.; Esmaeilzadeh Khadem, S. Improving one class support vector machine novelty detection scheme using nonlinear features. Pattern Recog. 2018, 83, 14–33. [Google Scholar] [CrossRef]
Xiao, Y.; Wang, H.; Xu, W.; Zhou, J. Robust one-class SVM for fault detection. Chemometr. Intell. Lab. Syst. 2016, 151, 15–25. [Google Scholar] [CrossRef]
Puranik, T.G.; Mavris, D.N. Anomaly Detection in General-Aviation Operations Using Energy Metrics and Flight-Data Records. J. Aerospace Inf. Syst. 2018, 15, 22–36. [Google Scholar] [CrossRef]
Hayton, P.; Schölkopf, B.; Tarassenko, L.; Anuzis, P. Support vector novelty detection applied to jet engine vibration spectra. Ad. Neural Inf. Process. Syst. 2001, 13, 946–952. [Google Scholar]
Tan, Y.; Tian, H.; Jiang, R.; Lin, Y.; Zhang, J. A comparative investigation of data-driven approaches based on one-class classifiers for condition monitoring of marine machinery system. Ocean Eng. 2020, 201, 107174. [Google Scholar] [CrossRef]
Aggarwal, C.C. Data Mining; Springer International Publishing: Cham, Switzerland, 2015; p. 734. [Google Scholar] [CrossRef]
Hotelling, H. Analysis of a complex of statistical variables into principal components. J. Educat. Psychol. 1933, 24, 417–441. [Google Scholar] [CrossRef]
Yan, W. Application of Random Forest to Aircraft Engine Fault Diagnosis. In Proceedings of the Multiconference on Computational Engineering in Systems Applications, Beijing, China, 4–6 October 2006; pp. 468–475. [Google Scholar] [CrossRef]
Hosseinzadeh, J.; Masoodzadeh, F.; Roshandel, E. Fault detection and classification in smart grids using augmented K-NN algorithm. SN Appl. Sci. 2019, 1, 1627. [Google Scholar] [CrossRef] [Green Version]
Davison, C.R. Determination of Steady State Gas Turbine Operation. Turbo Expo Power Land Sea Air 2012, 44670, 107–118. [Google Scholar] [CrossRef]
Simon, D.L.; Litt, J.S. A Data Filter for Identifying Steady-State Operating Points in Engine Flight Data for Condition Monitoring Applications. J. Eng. Gas Turb. Power 2011, 133, 071603. [Google Scholar] [CrossRef]
Wang, P.; Liu, K.; Tang, Z. Turbofan Engine Baseline Model Extraction Based on FDR Data. In Proceedings of the 31st Chinese Control and Decision Conference, CCDC 2019, Nanchang, China, 3–5 June 2019; pp. 4044–4049.
Badea, V.E.; Zamfiroiu, A.; Boncea, R. Big Data in the Aerospace Industry. Inf. Econ. 2018, 22, 17–24.
Ren, L.; Qin, H.; Xu, K. A Thermodynamic based and Data Driven Hybrid Network for Gas Turbine Modeling. arXiv 2021, arXiv:2104.14842.
Aretakis, N.; Roumeliotis, I.; Alexiou, A.; Romesis, C.; Mathioudakis, K. Turbofan Engine Health Assessment From Flight Data. J. Eng. Gas Turb. Power 2014, 137, 041203.
Urban, L.A. Parameter Selection for Multiple Fault Diagnostics of Gas Turbine Engines. ASME J. Eng. Gas Turb. Power 1974, 1974, 225–230.
Zheng, S.; Ristovski, K.; Farahat, A.; Gupta, C. Long Short-Term Memory Network for Remaining Useful Life estimation. In Proceedings of the 2017 IEEE International Conference on Prognostics and Health Management (ICPHM), Dallas, TX, USA, 19–21 June 2017; pp. 88–95.
Li, X.; Ding, Q.; Sun, J.Q. Remaining useful life estimation in prognostics using deep convolution neural networks. Reliabil. Eng. Syst. Saf. 2018, 172, 1–11.
Castillo, I.G.; Loboda, I.; Pérez Ruiz, J.L. Data-Driven Models for Gas Turbine Online Diagnosis. Machines 2021, 9, 372.
Tang, L.; Volponi, A.J. Intelligent Reasoning for Gas Turbine Fault Isolation and Ambiguity Resolution. J. Eng. Gas Turb. Power 2019, 141, 1–12.
Oppenheim, A.; Willsky, A.; Nawab, H. Signals and Systems, 2nd ed.; Pearson: London, UK, 1996.
Bauerfeind, K. Die exakte Bestimmung des Übertragungsverhaltens von Turbostrahltriebwerken unter Berücksichtigung des Instationären Verhaltens seiner Komponenten. Ph.D. Thesis, The Technical University of Munich, Munich, Germany, 1968.
Putz, A. Zustandsüberwachung von Turboflugtriebwerken auf der Basis Instationärer Triebwerksmodellierung. Ph.D. Thesis, Universität Stuttgart, Stuttgart, Germany, 2017.
Gnielinski, V. Neue Gleichungen für den Wärme- und den Stoffübergang in turbulent durchströmten Rohren und Kanälen. Forsch. Ing. 1975, 41, 8–16.
Conners, T. Measurement Effects on the Calculation of In-Flight Thrust for an F404 Turbofan Engine. Int. J. Turbo Jet Engines 1993, 10, 107–126.
Vanhatalo, E.; Kulahci, M. Impact of Autocorrelation on Principal Components and Their Use in Statistical Process Control. Qual. Reliabil. Eng. Int. 2016, 32, 1483–1500.
Putz, A.; Staudacher, S.; Koch, C.; Brandes, T. Jet Engine Gas Path Analysis Based on Takeoff Performance Snapshots. J. Eng. Gas Turb. Power 2017, 139, 111201.
Fan, Z.; Liu, E.; Xu, B. Weighted Principal Component Analysis. In Artificial Intelligence and Computational Intelligence: Third International Conference; Springer: Berlin/Heidelberg, Germany, 2011; Volume 7004 LNAI, pp. 569–574.
Schölkopf, B.; Platt, J.C.; Shawe-Taylor, J.; Smola, A.J.; Williamson, R.C. Estimating the Support of a High-Dimensional Distribution. Neural Comput. 2001, 13, 1443–1471.
NPSS User Guide; Technical report; Southwest Research Institute: San Antonio, TX, USA, 2016.
Nielsen, A.E.; Moll, C.W.; Staudacher, S. Modeling and Validation of the Thermal Effects on Gas Turbine Transients. J. Eng. Gas Turb. Power 2005, 127, 564.
Matthews, B.; Oza, N. NASA—Sample Flight Data. 2012. Available online: https://c3.ndc.nasa.gov/dashlink/projects/85/ (accessed on 8 May 2021).
Curnock, B. Obidicote Project, Work Package 4: Steady-State Test Cases; Rolls-Royce PLC: Manchester, UK, 2000.

Figure 1. Flowchart describing the fault-detection algorithm.

Figure 2. Flowchart of the Steady-State Data Filter according to [40].

Figure 3. Heat transfer model according to [52].

Figure 4. Example of the orientation of the Principal Components for a dataset with two distinct clusters. The length of each Principal Component correlates with the variance explained along its direction.

Figure 5. Visualization of the One-Class Support Vector Machine.

Figure 6. Scheme of the modeled engine, including the measurements considered for fault detection.

Figure 7. Example time series for the input parameters used for data synthesis.

Figure 8. Comparison of the original and adjusted Steady-State Data Filter for an example flight.

Figure 9. N1-Setting and identified steady-state data points for continuously sampled flights of a commercial medium-range aircraft. (a) Flight 1, (b) Flight 2, and (c) Flight 3.

Figure 10. Cumulative Percent Variance $CPV$ for fault cases a to l (Table 3).

Figure 11. Outlier ratio Q evaluated for the synthetic datasets.

Figure 12. Exchange rates for fault cases a to l (Table 3).

Figure 13. Results of the cluster analysis applied to operational data of an HPC fault. (a) Examples for time series of measurement residuals for the HPC fault case. (b) Outlier ratio Q. (c) PCA representation with three principal components.

Figure 14. Results of the cluster analysis applied to operational data of an LPC fault. (a) Examples for time series of measurement residuals for the LPC fault case. (b) Outlier ratio Q. (c) PCA representation with three principal components.

Table 1. Time constants $τ_{th}$ of the engine model evaluated at low-idle.

Time Constant	HPC	Burner	HPT	LPT
$τ_{Blade}$ [s]	2	-	15	10
$τ_{Casing}$ [s]	6	7	37	82
$τ_{Disk}$ [s]	96	-	160	82

Table 2. Variables monitored for ensuring stable flight conditions.

Requirement	Parameter
constant flight condition	Altitude
constant flight condition	Total Air Temperature
constant flight condition	Mach Number
thermal equilibrium	Exhaust Gas Temperature
mechanical equilibrium	Shaft Speed Fan
mechanical equilibrium	Shaft Speed Core
constant power setting	Fuel Flow

Table 3. Definition of the OBIDICOTE test cases according to [62].

Label	Fault Description, $ΔQ$	Fault Description, $Δη$	Faulty Component
a	$ΔQ_{Fan} = -1.0\%$	$Δη_{Fan} = -0.5\%$	Fan
a	$ΔQ_{LPC} = -1.0\%$	$Δη_{LPC} = -0.4\%$	LPC
b	−	$Δη_{Fan} = -1.0\%$	Fan
c	$ΔQ_{HPC} = -1.0\%$	$Δη_{HPC} = -0.7\%$	HPC
d	−	$Δη_{HPC} = -1.0\%$	HPC
e	$ΔQ_{HPC} = -1.0\%$	−	HPC
f	$ΔQ_{HPT} = +1.0\%$	−	HPT
g	$ΔQ_{HPT} = -1.0\%$	$Δη_{HPT} = -1.0\%$	HPT
h	−	$Δη_{HPT} = -1.0\%$	HPT
i	−	$Δη_{LPT} = -1.0\%$	LPT
j	$ΔQ_{LPT} = -1.0\%$	$Δη_{LPT} = -0.4\%$	LPT
k	$ΔQ_{LPT} = -1.0\%$	−	LPT
l	$ΔQ_{LPT} = +1.0\%$	$Δη_{LPT} = -0.6\%$	LPT

Table 4. Standard deviation $σ$ and bias of the measurement uncertainty [16,17].

Measurement	$σ$	Bias	Unit
$N_{1}$	$5.0 \times 10^{-3}$	$+1.7 \times 10^{-1}$	1/s
$N_{2}$	$2.5 \times 10^{-2}$	$+3.3 \times 10^{-1}$	1/s
$p_{t0}$	$1.4 \times 10^{+3}$	$+0.0 \times 10^{+0}$	Pa
$p_{t25}$	$3.4 \times 10^{+3}$	$-2.1 \times 10^{+3}$	Pa
$p_{s3}$	$1.9 \times 10^{+4}$	$+1.4 \times 10^{+4}$	Pa
$p_{t5}$	$2.1 \times 10^{+3}$	$+2.8 \times 10^{+3}$	Pa
$T_{t0}$	$5.6 \times 10^{-1}$	$+0.0 \times 10^{+0}$	K
$T_{t25}$	$1.2 \times 10^{+0}$	$-2.8 \times 10^{+0}$	K
$T_{t3}$	$1.7 \times 10^{+0}$	$-1.1 \times 10^{+1}$	K
$T_{t5}$	$1.2 \times 10^{+0}$	$+8.3 \times 10^{+0}$	K
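
One plausible way to apply the values of Table 4 to a synthesized signal is to add the bias and a zero-mean Gaussian noise term with standard deviation σ to each simulated sample. The following minimal sketch assumes this additive model. The function name `add_measurement_uncertainty`, the dictionary layout, and the constant 800 K example signal are illustrative assumptions and are not taken from the article.

```python
import random
import statistics

# (sigma, bias) pairs copied from Table 4.
# Shaft speeds in 1/s, pressures in Pa, temperatures in K.
UNCERTAINTY = {
    "N1": (5.0e-3, 1.7e-1),
    "N2": (2.5e-2, 3.3e-1),
    "ps3": (1.9e4, 1.4e4),
    "Tt5": (1.2e0, 8.3e0),
}


def add_measurement_uncertainty(true_values, name, rng):
    """Assumed additive model: measured = true + bias + N(0, sigma)."""
    sigma, bias = UNCERTAINTY[name]
    return [value + bias + rng.gauss(0.0, sigma) for value in true_values]


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical steady-state signal: constant Tt5 of 800 K over 10,000 samples.
tt5_true = [800.0] * 10_000
tt5_measured = add_measurement_uncertainty(tt5_true, "Tt5", rng)

# The residual statistics should approach the bias and sigma of Table 4.
residuals = [m - t for m, t in zip(tt5_measured, tt5_true)]
mean_residual = statistics.fmean(residuals)
std_residual = statistics.stdev(residuals)
print(f"mean residual = {mean_residual:.2f} K, std = {std_residual:.2f} K")
```

With a sufficiently large number of samples, the mean of the residual approaches the bias and its standard deviation approaches σ, which offers a quick plausibility check for any synthesized dataset built this way.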

Table 5. Detection rates of the proposed fault-detection scheme for the extensive measurement suite comprising $N_{2}$, $T_{t25}$, $T_{t3}$, $T_{t5}$, $p_{t25}$, $p_{s3}$, and $p_{t5}$.

$\frac{n_{Limit}}{⌀ n_{Flight}}$	TP a	TP b	TP c	TP d	TP e	TP f	TP g	TP h	TP i	TP j	TP k	TP l	FP Nominal
0.05	1.00	0.99	1.00	1.00	1.00	1.00	1.00	1.00	1.00	1.00	1.00	1.00	0.41
0.10	1.00	0.78	1.00	1.00	1.00	1.00	1.00	1.00	1.00	1.00	1.00	1.00	0.07
0.15	0.99	0.55	0.99	0.99	0.99	0.99	0.99	0.99	0.99	0.99	0.99	0.99	0.02
0.20	0.95	0.38	0.95	0.95	0.95	0.95	0.95	0.95	0.95	0.95	0.95	0.95	0.01
0.25	0.85	0.27	0.85	0.85	0.85	0.85	0.85	0.85	0.85	0.85	0.85	0.85	0.01
0.30	0.74	0.21	0.74	0.74	0.74	0.74	0.74	0.74	0.74	0.74	0.74	0.74	0.00
0.35	0.67	0.15	0.67	0.67	0.67	0.67	0.67	0.67	0.67	0.67	0.67	0.67	0.00
0.40	0.58	0.11	0.58	0.58	0.58	0.58	0.58	0.58	0.58	0.58	0.58	0.58	0.00
0.45	0.54	0.07	0.54	0.54	0.54	0.54	0.54	0.54	0.54	0.54	0.54	0.54	0.00
0.50	0.48	0.05	0.48	0.48	0.48	0.48	0.48	0.48	0.48	0.48	0.48	0.48	0.00
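
The trade-off between true-positive and false-positive rates in Table 5 can be made explicit with a small selection routine. The sketch below copies the rows of Table 5, using the fact that fault cases a and c to l share the same TP value in every row, and returns the smallest threshold ratio whose FP rate stays within a tolerated limit. Because the TP rates fall as the ratio grows, the smallest admissible ratio also gives the highest TP rates. The function name `pick_threshold` and the 2% FP tolerance are illustrative assumptions, not values prescribed by the article.

```python
# Rows copied from Table 5:
# (threshold ratio, TP for case b, TP shared by cases a and c to l, FP for nominal).
TABLE_5 = [
    (0.05, 0.99, 1.00, 0.41),
    (0.10, 0.78, 1.00, 0.07),
    (0.15, 0.55, 0.99, 0.02),
    (0.20, 0.38, 0.95, 0.01),
    (0.25, 0.27, 0.85, 0.01),
    (0.30, 0.21, 0.74, 0.00),
    (0.35, 0.15, 0.67, 0.00),
    (0.40, 0.11, 0.58, 0.00),
    (0.45, 0.07, 0.54, 0.00),
    (0.50, 0.05, 0.48, 0.00),
]


def pick_threshold(rows, max_fp):
    """Return the row with the smallest ratio whose FP rate is at most max_fp."""
    admissible = [row for row in rows if row[3] <= max_fp]
    if not admissible:
        return None  # no tabulated threshold meets the FP tolerance
    return min(admissible, key=lambda row: row[0])


# Hypothetical tolerance: at most 2% false alarms on nominal flights.
choice = pick_threshold(TABLE_5, max_fp=0.02)
if choice is not None:
    ratio, tp_case_b, tp_other_cases, fp = choice
    print(f"ratio={ratio}, TP(b)={tp_case_b}, TP(others)={tp_other_cases}, FP={fp}")
```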

Table 6. Detection rates of the proposed fault-detection scheme for the minimum measurement suite comprising $N_{2}$ and $T_{t5}$.

$\frac{n_{Limit}}{⌀ n_{Flight}}$	TP a	TP b	TP c	TP d	TP e	TP f	TP g	TP h	TP i	TP j	TP k	TP l	FP Nominal
0.05	1.00	0.75	1.00	1.00	1.00	0.86	1.00	1.00	1.00	1.00	1.00	1.00	0.05
0.10	1.00	0.37	1.00	1.00	1.00	0.58	1.00	1.00	1.00	1.00	1.00	1.00	0.01
0.15	0.99	0.23	0.99	0.99	0.99	0.37	0.99	0.99	0.99	0.99	0.99	0.99	0.01
0.20	0.95	0.12	0.95	0.95	0.95	0.22	0.95	0.95	0.95	0.95	0.95	0.95	0.01
0.25	0.83	0.06	0.85	0.85	0.85	0.17	0.85	0.85	0.85	0.85	0.85	0.85	0.01
0.30	0.74	0.06	0.74	0.74	0.74	0.13	0.74	0.74	0.74	0.74	0.74	0.74	0.01
0.35	0.67	0.05	0.67	0.67	0.67	0.06	0.67	0.67	0.67	0.67	0.67	0.67	0.01
0.40	0.58	0.05	0.58	0.58	0.58	0.04	0.58	0.58	0.58	0.58	0.58	0.58	0.00
0.45	0.54	0.05	0.54	0.54	0.54	0.03	0.54	0.54	0.54	0.54	0.54	0.54	0.00
0.50	0.47	0.03	0.48	0.48	0.48	0.01	0.48	0.48	0.48	0.48	0.48	0.48	0.00

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Weiss, M.; Staudacher, S.; Becchio, D.; Keller, C.; Mathes, J. Steady-State Fault Detection with Full-Flight Data. Machines 2022, 10, 140. https://doi.org/10.3390/machines10020140

