Article

An Efficient Siamese Network and Transfer Learning-Based Predictive Maintenance System for More Sustainable Manufacturing

IMaR Research Centre, Munster Technological University, V92 CX88 Tralee, Ireland
Sustainability 2023, 15(12), 9272; https://doi.org/10.3390/su15129272
Submission received: 24 April 2023 / Revised: 6 June 2023 / Accepted: 6 June 2023 / Published: 8 June 2023

Abstract

Legacy machinery poses a specific challenge when integrated into modern manufacturing lines. While modern machinery provides swift methods of integration and inbuilt predictive maintenance (PdM), older machines, while physically fully functional, are less attractive to reuse, a specific reason being their lack of ready-to-implement PdM hardware and models. More sustainable manufacturing operations can be achieved if the useable lifespan of functional older machinery can be extended through retrofittable PdM and modern industrial communication systems. While PdM models can be developed for a class (make/model) of machine with retrofitted sensing, legacy machines often deviate greatly from their original form through nonstandard maintenance and component replacement actions during their lengthy lifespan. Each legacy machine would therefore require a custom PdM model, a cost that often leads to the removal or nonusage of legacy machines. This paper proposes a framework designed for the generation of an efficient PdM algorithm which would allow for the reuse of legacy machines retrofitted with low-cost sensing in modern manufacturing for increased sustainability. Given a limited number of data samples collected from a machine to be maintained, we aim to predict a failure and/or maintenance time by making use of the difference between the characteristics of the variation of the healthy and unhealthy data collected from the machine. We measure the healthiness of the machine by using a Siamese network trained with a public dataset and fine-tuned with data samples obtained from machines with similar characteristics. Although we use different training and testing datasets coming from completely different sources, we obtain reasonable results thanks to the proposed technique.
The results of simulations and the statistical analysis enable us to devise a transfer learning technique and a Siamese network employed for failure detection in the machine. The proposed system will allow for the continued use of older machines in modern facilities, enabling more sustainable manufacturing models.

1. Introduction

Through the use of the principles of Industry 4.0, manufacturing and production engineers endeavor to increase the efficiency, profitability, and sustainability of industrial production facilities. The adoption of Industry 4.0 techniques as a method to improve manufacturing sustainability has led to the integration of sensing and communication tools into modern process machinery. The design of modern production lines calls for the implementation of these new pieces of machinery to achieve reliability and production-efficiency goals, to the detriment of the reuse of legacy machinery in new production lines. This is in opposition to the principles of the circular economy (CE), which call for the reuse of such machinery in an effort to extend the useful lifespan of machines and reduce waste [1,2,3]. The extended usage of legacy machinery in modern production facilities depends on the ability of that machinery to be modernized through the retrofitting of sensor and communications technology, allowing integration of the machine with advanced manufacturing execution systems (MES). While retrofittable communication systems that handle machine control and report throughput and faults to the MES are largely available, one of the more advanced elements of Industry 4.0 implementation, namely predictive maintenance (PdM), remains a barrier to the continued use of legacy machines.
The performance of all machines continuously decreases at a rate dependent on their usage and on internal and external influences [4,5]. Therefore, machines are periodically maintained using preventive or reactive maintenance techniques. With the PdM technique, by contrast, a machine is maintained early, while it still has useful lifetime, which reduces production downtime and unexpected production interruptions. This differs from the reactive maintenance technique, where maintenance is performed only once the machine breaks. Where legacy machinery that does not incorporate Industry 4.0/PdM sensing and connectivity could be reused, the drawback of increased downtime and reduced profitability counteracts any economic benefit of machine reuse.
The retrofit of PdM solutions for legacy machines has presented some challenges. The creation of PdM models is highly data dependent and time consuming, as detailed below. While research exists on the creation of bespoke PdM systems for legacy machinery, these have proved difficult to roll out to multiple machines due to variations between machines, i.e., machine model types, deviations of operation due to degradation, maintenance variations, and historic machine upgrades. This work aims to address machine-to-machine variation through a machine-learning technique. Such a system will allow fast deployment of PdM on legacy machines of varying parameters.

1.1. Review of the Literature

To address the problems mentioned above, PdM strategies are examined in two main groups: model-based and data-driven techniques. The first group comprises model-based methods [6,7,8,9,10], which create a mathematical or physical model to describe degradation processes. The inner parameters of the model-based methods, including the Markov process-based model [11,12], Wiener process-based model [9,13,14], and Gaussian mixture-based model [15,16], are adjusted using the data measured from the machine. However, establishing a mathematical or physical model of the machine is very difficult due to the random nature of the degradation process and the many subparts in a machine [17]. The second group, the data-driven methods, uses a classifier or regression technique whose internal parameters are adjusted to reduce the error between the output of the model and the data obtained from the machine. After tuning the parameters, the classifier or regression technique can detect a fault or estimate the remaining useful life (RUL). The data-driven model consists of four main parts: data collection, data preprocessing, feature extraction, and classification/regression [18,19]. The degradation detection performance of the model depends on how well these steps work together.
In a typical PdM application, signals or data containing information about the machine’s health are collected through sensors from the machine to be maintained. While vibration sensors [8,20,21,22,23,24,25,26,27] are primarily used in the PdM area, temperature [28,29,30,31,32] and acoustic sensors [33,34,35,36] are also widely used to capture failures in the machine. Due to the nature of the sensor and environmental conditions, the obtained raw signals often contain noise that should be eliminated with various signal-processing techniques. In the next step, features are obtained from the denoised signals by using several feature-extraction techniques, including wavelet [37,38,39], Lyapunov exponents [40], singular spectral analysis [41], fast kurtogram [42], high-order spectral features [43], and empirical mode decomposition [44]. In the last step, a dataset is created using the resulting features and the labels related to the features. The internal parameters of an appropriate classifier are trained to learn the relationship between features and labels.
A variety of domain classifiers, including logistic regression [35,45,46,47,48,49,50,51], support-vector machine [26,29,43,44,52,53,54,55,56,57,58,59,60,61], naive Bayes [62,63,64,65,66], decision tree [33,67,68,69,70,71,72], random forest [73,74,75,76,77,78,79,80], gradient boosting [20,81,82,83], k-nearest neighbor [36,70,84,85,86,87,88], and linear-discriminant analysis [88,89,90,91,92,93] have been proposed to detect failures in a machine. Although these methods produce successful results in small or medium-sized datasets, they cannot produce good results in large-scale, imbalanced, and nonlinear datasets. To eliminate these disadvantages, some models based on neural networks [24,53,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111], and adaptive neuro-fuzzy inference systems [34,112,113,114,115,116,117,118,119] are proposed. However, these techniques also have problems such as overfitting, requiring several user-supplied parameters, the curse of dimensionality, etc.
On the other hand, recent improvements in sensor technologies and deep learning have led to a surge of new techniques at the intersection of these areas, building new strategies in the PdM applications. Deep-learning-based algorithms analyze complex and multichannel time series to predict maintenance time or detect a machine failure. Recurrent neural network (RNN) [120] or its improved versions [121,122,123], restricted Boltzmann machine (RBM) [124,125], autoencoder (AE) [126,127] and convolutional neural network (CNN) [128,129,130]-based deep neural networks (DNN) are the most commonly used techniques to monitor the current condition of a system or machine. Several studies have recently shown that DNN-based structures have several advantages over the above-mentioned conventional methods, such as SVM, DT, etc.

1.2. Proposed Predictive Maintenance Framework for Legacy Machines

The above-mentioned techniques have some disadvantages, including hyperparameter optimization, high complexity, requiring too many samples, etc. Some studies [123,131,132] use DNNs with too many weights, which is even more than the number of samples; these DNNs are in danger of overfitting. High complexity is another problem, which is observed in some studies [133,134,135,136] using particularly hybrid networks. In addition, DNN requires too many samples. Therefore, a few methods attempt to solve this problem using digital twin-based models [137] and generative models [138].
In addition to the problems related to deep neural networks, there are two common problems in a typical PdM application:
  • The insufficient data problem: In many enterprises, it is not easy to obtain enough data from each machine while production continues, and waiting for a machine to malfunction takes a long time; the limited data obtained can therefore cause an imbalanced data problem.
  • Inability to predict failure: Current methods often cannot detect the fault situation effectively. A further important problem is that the data collected from one machine cannot be used effectively on a similar machine due to environmental conditions and slight differences in the equipment of the machines.
This work proposes a framework by utilizing the transfer learning method and the CNN-based Siamese network to overcome such shortcomings. The proposed framework offers the ability of Siamese networks to measure the similarity of samples despite a limited number of samples and the capability of transfer learning using samples obtained from multiple machines in different conditions.
The main steps of the proposed approach are:
  • A Siamese network is trained with the public dataset named Omniglot dataset containing handwriting images [134] to adapt the network, which measures the similarity between two samples given to its input;
  • The test images obtained from the machine to be maintained are structured similarly to the Omniglot dataset with a proposed step named the generalization step;
  • In the generalization step, a test image is constructed using the features extracted from the machine to be maintained. The features are projected onto an image with the shape “L” or “C” according to the clustering results;
  • The reference image “L” is compared with the obtained image from the machine by using the trained Siamese network;
  • The comparison results give a number between zero and one to measure the healthiness of the machine;
  • To increase the efficiency of the network, the CNN layer of the Siamese network is preserved, and the fully connected layer of the Siamese network is retrained using a few samples obtained from the machine for further prediction.
The experimental results and statistical analysis show that the proposed method performs successfully and achieves failure detection on various dataset types. Our main contributions in this paper are the following:
  • We design a framework based on the Siamese network and the transfer learning technique to detect failures in a machine, unlike mainstream methods that need many training-data samples.
  • The algorithm can be adapted to machines of the same type that work in different settings and environments.
  • In addition, as it does not strictly require labels collected from machines, it can be applied to unsupervised failure-detection problems.
The main hypothesis is that a legacy machine’s health can be monitored using an unsupervised learning technique, thereby increasing the efficiency, profitability, and sustainability of industrial production facilities.
The rest of the paper is organized as follows. In Section 2, data collection, preprocessing, feature extraction, the generalization process, and Siamese networks and their training are reviewed, and the proposed framework is explained. Then, the experimental results are presented and discussed in Section 3. Finally, concluding remarks are provided in Section 4.

2. Materials and Methods

In this study, a deep-learning-based method is proposed to solve the two most important problems of PdM algorithms. The first of these is the imbalanced data problem caused by the scarcity of unhealthy data. Because the failure period is short, only a minimal number of samples can be obtained from an unhealthy machine compared to the number of samples obtained from the healthy machine. In the literature, this problem is usually addressed by increasing the amount of unhealthy data by deliberately breaking the machine in a specially prepared experimental environment. However, these artificial environments cannot fully simulate the random occurrence mechanisms of faults. Therefore, the data obtained in this way are insufficient for use in the case of another failure. To solve this problem, we propose a framework based on the Siamese network that can work despite a limited number of samples.
The second problem is that different machines used for the same purpose may produce signals with different characteristics regarding their localization and separability from each other. In general, the data collected from the machine to be maintained are divided into two groups: a training set and a test set. A suitable model is created with the training set, and the failure-detection performance of this model is validated with the test data. Thus, a possible fault condition of the machine can be detected with the trained model. However, a model trained with the data obtained from one machine is not as effective in detecting the fault status of a similar machine as it is on the original machine. To solve the second problem, the features obtained from each machine are generalized using a new method called generalization, supported by a transfer learning technique.
A general framework of the proposed technique can be seen in Figure 1. There are five main steps, including data collection, preprocessing, feature extraction, generalization process, and classification, shown in Figure 1 and summarized in the following sections.

2.1. Data Collection

This initial step pertains to obtaining data points relating to the environmental conditions of the machine being monitored. Within the experiments of this paper, multiple datasets are utilized as test cases. For the publicly available datasets, no data collection was required. However, this study also collected a particular dataset at Munster Technological University. The purpose of collecting this dataset is to observe how the proposed algorithm performs on similar machines. Therefore, audio data were collected from four different lathe machines in five different conditions. By using the data obtained from each machine as training data for another machine, the detection performance of the proposed model is improved as the training data are increased. For the lathe machine collection (please see Supplementary File), the sensors for data collection had to be physically mounted. The machines were monitored over several recording sessions of multiple minutes each, with a different cutting tool being used each time.

2.2. Preprocessing

The preprocessing stage consists of two parts: filtering and windowing. In the first step, the raw audio signal is denoised. While recording audio signals, unwanted noise originating from other machines and from the environment can decrease signal quality. In our case, the recorded signal is denoised using an adaptive filter, specifically a recursive least squares (RLS) algorithm. In the windowing step, only a selection of the denoised signal, which in our case comprises 42 min of data, is retained for further processing and experimentation. These selections are narrowed to 1-min ranges that correspond to a suitable machine working condition (see also the Supplementary File).
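The two preprocessing steps can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a noise-cancellation setup in which a separate noise-only reference channel is available (the source does not say how the RLS filter is configured), and the names `rls_denoise`, `split_windows`, `order`, `lam`, and `delta` are hypothetical.

```python
import numpy as np

def rls_denoise(primary, reference, order=4, lam=0.99, delta=100.0):
    """Adaptive noise cancellation with recursive least squares (RLS).

    primary:   recorded signal (wanted signal plus noise)
    reference: noise-only channel correlated with the noise in `primary`
               (an assumption; the source does not describe the reference)
    Returns the a priori error signal, used here as the denoised estimate.
    """
    primary = np.asarray(primary, dtype=float)
    reference = np.asarray(reference, dtype=float)
    w = np.zeros(order)            # adaptive filter weights
    P = np.eye(order) * delta      # estimate of the inverse correlation matrix
    out = np.zeros_like(primary)
    for n in range(len(primary)):
        # Most recent `order` reference samples, newest first, zero-padded.
        u = np.zeros(order)
        k = min(order, n + 1)
        u[:k] = reference[n::-1][:k]
        Pu = P @ u
        g = Pu / (lam + u @ Pu)    # gain vector
        e = primary[n] - w @ u     # primary minus the predicted noise
        w = w + g * e
        P = (P - np.outer(g, Pu)) / lam
        out[n] = e
    return out

def split_windows(signal, window_len):
    """Cut a 1-D signal into non-overlapping windows, dropping the remainder."""
    signal = np.asarray(signal)
    m = len(signal) // window_len
    return signal[: m * window_len].reshape(m, window_len)
```

With a sampling rate `fs`, a 1-min window would correspond to `split_windows(denoised, 60 * fs)`; whether the windows overlap in the original study is not stated.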

2.3. Feature Extraction

One of the most critical steps of the proposed framework is feature extraction, since the test sample images are constructed from the extracted features. Because the signal types differ (vibration, audio, temperature, etc.), different methods must be applied to extract features, including wavelet [37,38,39], Lyapunov exponents [40], singular spectral analysis [41], fast kurtogram [42], high-order spectral features [43], and empirical mode decomposition [44]. Therefore, in this paper, the features are obtained from the denoised signals using several feature-extraction techniques. However, extracting many features significantly increases the dimensionality; if a standard feature-extraction method exists in the literature for a particular signal type, it can be used instead. For example, the kurtogram and spectral kurtosis are good options for vibration signals [139,140,141], whereas there is no single best choice for audio signals.
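As a rough illustration of how a feature matrix can be built from windowed signals, the sketch below computes four simple per-window descriptors. The feature set (RMS, kurtosis, zero-crossing rate, spectral centroid) is chosen only for illustration and is not the set used in the paper; `window_features` and `fs` are hypothetical names.

```python
import numpy as np

def window_features(S, fs):
    """Return an (m, 4) feature matrix for an (m, w) matrix of signal windows.

    Columns: RMS, kurtosis, zero-crossing rate, spectral centroid in Hz.
    These are illustrative features, not the ones selected in the paper.
    """
    S = np.asarray(S, dtype=float)
    centered = S - S.mean(axis=1, keepdims=True)
    var = centered.var(axis=1)
    rms = np.sqrt(np.mean(S ** 2, axis=1))
    # Kurtosis: fourth central moment divided by the squared variance.
    kurt = np.mean(centered ** 4, axis=1) / np.maximum(var ** 2, 1e-12)
    # Fraction of consecutive sample pairs whose sign differs.
    signs = np.signbit(centered).astype(int)
    zcr = np.mean(np.diff(signs, axis=1) != 0, axis=1)
    # Magnitude-weighted mean frequency of the one-sided spectrum.
    spec = np.abs(np.fft.rfft(centered, axis=1))
    freqs = np.fft.rfftfreq(S.shape[1], d=1.0 / fs)
    centroid = (spec * freqs).sum(axis=1) / np.maximum(spec.sum(axis=1), 1e-12)
    return np.column_stack([rms, kurt, zcr, centroid])
```

Each row of the result plays the role of one feature vector; in practice the columns would be normalized before feature selection and clustering.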

2.4. Generalization Process

The generalization stage is the most important part of the framework, since it transforms the features into a form suitable for one-shot learning. It involves three steps. First, a feature selection technique called the local-learning algorithm [142,143,144,145,146] is applied to select the most suitable features from those extracted from the raw signals. Second, the reduced features are clustered and labeled via the k-means clustering method. Since the goal is to find the point where the machine breaks down, there are two classes: failure and normal. The last step involves the construction of images from the extracted features and labels: the normalized features are projected onto an L- or C-shaped image, shown in Figure 2 and structured similarly to the Omniglot dataset [147], according to their amplitude and the obtained labels. These images are essential, as their structure and attributes define the result of the next step. The fundamental steps of the generalization process are given in Algorithm 1.
Algorithm 1 Generalization algorithm main steps
1. Collect and denoise the signal $s \in \mathbb{R}^{q}$.
2. Divide the signal into $m$ windows of dimension $w$, which can be written as a matrix $S \in \mathbb{R}^{m \times w}$ with $s^{(i)} \in \mathbb{R}^{w}$ for $i = 1, 2, \dots, m$:
$$S = [\, s^{(1)} \; s^{(2)} \; \cdots \; s^{(m)} \,]^{T}$$
3. Extract features from the windowed signal $S$ to obtain a feature matrix $X \in \mathbb{R}^{m \times d}$ with $x^{(i)} \in \mathbb{R}^{d}$ for $i = 1, 2, \dots, m$:
$$X = [\, x^{(1)} \; x^{(2)} \; \cdots \; x^{(m)} \,]^{T}$$
4. Select $f$ features from the feature matrix $X$ based on the unsupervised local-learning algorithm [142,143] to obtain the reduced matrix $X_r \in \mathbb{R}^{m \times f}$:
$$X_r = [\, x_r^{(1)} \; x_r^{(2)} \; \cdots \; x_r^{(m)} \,]^{T}, \quad x_r^{(i)} \in \mathbb{R}^{f} \text{ for } i = 1, 2, \dots, m$$
Apply k-means clustering [148]:
5. Arbitrarily choose $k$ initial centers $c_1, c_2, \dots, c_k \in \mathbb{R}^{f}$.
6. For every $i$, set
$$y^{(i)} = \arg\min_{j} \left\| x_r^{(i)} - c_j \right\|$$
7. For every $j$, set
$$c_j = \frac{\sum_{i=1}^{m} 1\{ y^{(i)} = j \} \, x_r^{(i)}}{\sum_{i=1}^{m} 1\{ y^{(i)} = j \}}$$
where $1\{\cdot\}$ is the indicator function.
8. Repeat steps 6 and 7 until $c_j$ no longer changes.
9. Return the labels $y \in \mathbb{R}^{m}$.
10. Take the features $\{ x^{(1)}, x^{(2)}, \dots, x^{(m)} \}$ found in step 3 and $y \in \mathbb{R}^{m}$ to construct the image samples $\{ I^{(1)}, I^{(2)}, \dots, I^{(m)} \}$ for the Siamese network.
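Steps 5 to 9 of Algorithm 1 (the k-means labeling of the reduced feature matrix) can be sketched in NumPy as below. The function name `kmeans_labels`, the seeded random initialization, the iteration cap, and the handling of empty clusters are assumptions added to make the sketch runnable; they are not specified in the source.

```python
import numpy as np

def kmeans_labels(Xr, k=2, n_iter=100, seed=0):
    """Cluster the rows of the reduced feature matrix Xr (m x f) into k groups."""
    Xr = np.asarray(Xr, dtype=float)
    rng = np.random.default_rng(seed)
    # Step 5: choose k initial centres (here: k distinct rows of Xr).
    centers = Xr[rng.choice(len(Xr), size=k, replace=False)]
    for _ in range(n_iter):
        # Step 6: assign every sample to its nearest centre.
        dists = np.linalg.norm(Xr[:, None, :] - centers[None, :, :], axis=2)
        y = dists.argmin(axis=1)
        # Step 7: move each centre to the mean of its assigned samples.
        new_centers = centers.copy()
        for j in range(k):
            if np.any(y == j):  # keep the old centre if a cluster is empty
                new_centers[j] = Xr[y == j].mean(axis=0)
        # Step 8: stop once the centres no longer change.
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    # Step 9: return the labels (and the centres, for inspection).
    return y, centers
```

With `k=2`, the two label values correspond to the normal and failure classes; which integer maps to which class is arbitrary and has to be resolved separately, for example from the temporal order of the windows.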

2.5. Siamese Network

A Siamese CNN shown in Figure 1 is used for PdM tasks with limited samples in the proposed framework. The Siamese CNN contains two of the same networks that accept different inputs and are combined with a cost function. The Siamese network enables the utilization of concurrent training stages and utilizes a public dataset called Omniglot dataset [147] as a training set. The main aim is to impose an ability on the Siamese network to measure the similarity between two inputs. Once trained, the networks can detect the similarity between samples from different sources, different from the Omniglot dataset. Therefore, one task is failure detection under the restriction that we may only observe an example of possible machine failure before predicting a test sample. This is named one-shot learning [149], which is the focus of our framework presented in this work.
The Siamese networks are based on the CNN deep neural network [145], which has a convolution, pooling, and fully connected layer. The general pattern in the network is that the input image is convolved with several filters that generate several feature maps containing specific information related to the original images. In the pooling layer, the size of the feature maps is reduced to decrease the complexity. Convolution and pooling operations can be repeated as desired, generally concerning dataset size. In our case, this process is repeated four times to build the network for the Omniglot dataset. The network ends with a fully connected network, which generates the final output.
In the convolution operation for the first CNN in the Siamese network with $L$ layers, the value of a unit at position $(a, b)$ in the $j$th feature map in the $i$th layer, denoted as $x_{1,i,j}^{a,b}$, is given by [150]
$$x_{1,i,j}^{a,b} = \mathrm{ReLU}\left( b_{i,j} + \sum_{f} \sum_{p=0}^{P_i - 1} \sum_{q=0}^{Q_i - 1} w_{i,j,f}^{p,q} \, x_{1,i-1,f}^{a+p,\,b+q} \right)$$
The same is valid for the second CNN in the Siamese network:
$$x_{2,i,j}^{a,b} = \mathrm{ReLU}\left( b_{i,j} + \sum_{f} \sum_{p=0}^{P_i - 1} \sum_{q=0}^{Q_i - 1} w_{i,j,f}^{p,q} \, x_{2,i-1,f}^{a+p,\,b+q} \right)$$
Here, $\mathrm{ReLU}$ is the rectified linear activation function, $b_{i,j}$ is the bias for this feature map, $f$ indexes over the set of feature maps in the $(i-1)$th layer connected to the current feature map, and $w_{i,j,f}^{p,q}$ is the value at position $(p, q)$ of the kernel connected to the $f$th feature map. $P_i$ and $Q_i$ are the height and width of the kernel, respectively.
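The convolution formula above can be written out directly as nested loops. The sketch below is an unoptimized illustration for a single layer with no padding and unit stride (both assumptions, since the source does not state them); `conv_relu` and its argument names are hypothetical.

```python
import numpy as np

def conv_relu(prev_maps, kernels, biases):
    """One convolution layer followed by ReLU.

    prev_maps: (F, H, W)    feature maps of layer i-1
    kernels:   (J, F, P, Q) one P x Q kernel per (output map j, input map f)
    biases:    (J,)         one bias per output feature map
    Returns an array of shape (J, H-P+1, W-Q+1).
    """
    F, H, W = prev_maps.shape
    J, _, P, Q = kernels.shape
    out = np.zeros((J, H - P + 1, W - Q + 1))
    for j in range(J):
        for a in range(H - P + 1):
            for b in range(W - Q + 1):
                # Sum over f, p, q of w[j, f, p, q] * x[f, a+p, b+q], plus bias.
                patch = prev_maps[:, a:a + P, b:b + Q]
                out[j, a, b] = biases[j] + np.sum(kernels[j] * patch)
    return np.maximum(out, 0.0)  # ReLU
```

Because both branches of the Siamese network share the same weights, the same `kernels` and `biases` would be applied to the feature maps of either input image.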
The units in the final convolutional layer are flattened into a single vector for each network to obtain $x_{1,L-2} \in \mathbb{R}^{t}$ and $x_{2,L-2} \in \mathbb{R}^{t}$, which result from the first and second networks, respectively; $t$ is the number of units in layer $L-2$. The distance between them is
$$d = \left| x_{1,L-2} - x_{2,L-2} \right|$$
where $d \in \mathbb{R}^{t}$ is the element-wise distance between the first and second flattened vectors from the twin networks. The vector $d$, augmented with a bias term, is fed to a fully connected neural network (NN) to generate a similarity index (SI), which is defined by
$$SI = f\left( d \, \theta^{T} \right)$$
where $f$ is the sigmoid function and $\theta \in \mathbb{R}^{t+1}$ contains the weights of the fully connected neural network.
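A minimal sketch of the similarity index computation, assuming the bias is handled by appending a constant 1 to $d$ so that the last entry of $\theta$ acts as the bias weight (one common convention; the source does not spell this out). The name `similarity_index` is hypothetical.

```python
import numpy as np

def similarity_index(x1, x2, theta):
    """Similarity index from two flattened embeddings of length t.

    theta has length t + 1; its last entry multiplies the appended bias input.
    """
    d = np.abs(np.asarray(x1, dtype=float) - np.asarray(x2, dtype=float))
    d_aug = np.append(d, 1.0)        # append the bias input
    z = d_aug @ np.asarray(theta, dtype=float)
    return 1.0 / (1.0 + np.exp(-z))  # sigmoid
```

For the index to be high for similar inputs, the weights on the distance entries must be predominantly negative and the bias positive, so that a zero distance maps to a value above 0.5.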

2.6. Training and Testing Process

The CNN parameters $w$ and the fully connected neural-network parameters $\theta$ are tuned by using the following cross-entropy loss function [147]:
$$\mathcal{L}(X_1, X_2, y; w, \theta) = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log\left( SI^{(i)} \right) + \left( 1 - y^{(i)} \right) \log\left( 1 - SI^{(i)} \right) \right]$$
Cross-entropy is a measure of the difference between two probability distributions for a given random variable or set of events. It is suitable here because we aim to find the point where the machine breaks down, which gives two distributions to compare: the normal case and the failure case.
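The loss can be evaluated numerically as in the sketch below. The clipping constant `eps` is an implementation detail added here to avoid taking the logarithm of zero; it is not part of the source formulation.

```python
import numpy as np

def cross_entropy(si, y, eps=1e-12):
    """Mean binary cross-entropy between similarity indexes and 0/1 labels."""
    si = np.clip(np.asarray(si, dtype=float), eps, 1.0 - eps)
    y = np.asarray(y, dtype=float)
    return -np.mean(y * np.log(si) + (1.0 - y) * np.log(1.0 - si))
```

A pair labeled similar ($y = 1$) contributes $-\log(SI)$, so the loss is small when the network outputs a value near one for it, and a pair labeled different contributes $-\log(1 - SI)$.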
The training set can be defined as
$$\left\{ \left( X_1^{(1)}, X_2^{(1)}, y^{(1)} \right), \left( X_1^{(2)}, X_2^{(2)}, y^{(2)} \right), \dots, \left( X_1^{(m)}, X_2^{(m)}, y^{(m)} \right) \right\}$$
where $X_1^{(i)}$ and $X_2^{(i)}$ are the inputs of the Siamese network and $y^{(i)}$ is the label. Some examples from the training set can be seen in Figure 3.
The main aim is to decrease the error between the $SI$ and the labels $y^{(i)}$ by tuning the inner parameters $(w, \theta)$ of the Siamese network, shown in Figure 4. After the training process, the Siamese network can measure the similarity between inputs. Image samples obtained from the generalization process are utilized to finalize the proposed framework. At the end of the generalization process for any machine, the testing set is designed as
$$\left\{ \left( D, I^{(1)} \right), \left( D, I^{(2)} \right), \dots, \left( D, I^{(m)} \right) \right\}$$
where $D$ is an ideal sample image and $I^{(i)}$ is an image sample obtained from features related to a PdM dataset.
The similarity between ideal and image samples is measured using the trained Siamese network. The similarity index is calculated to determine a result and error values to further train the model on different machines going forward. In the proposed framework structure, the ideal similarity index is 1, meaning the samples are similar. The alternative is 0, meaning the samples are different in appearance. This result is indicative of how healthy the image data available truly is.
This study aims to solve a novel problem where similar machines need to be maintained. The PdM algorithms based on data-driven techniques need too much data from the machine to be maintained. However, collecting too many data samples from each machine takes too much time. To solve this problem, we use the transfer learning technique shown in Figure 5. The application of transfer learning is shown in Algorithm 2.
Algorithm 2 Application of Transfer Learning
1. Tune all parameters $(w, \theta)$ of the Siamese network with the dataset $\{ X_1^{(i)}, X_2^{(i)}, y^{(i)} \}_{i=1}^{m}$.
Repeat for $k = 1, \dots, N$:
2. Test the Siamese network with the data $\{ D, I_k^{(i)} \}_{i=1}^{n}$ obtained from the $k$th machine.
3. Find the labels $\{ y_k^{(i)} \}_{i=1}^{n}$ related to the $k$th machine.
4. Initialize the weights $\theta$ in the fully connected layer of the network.
5. Tune the weights of the NN with the data $\{ D, I_1^{(i)}, y_1^{(i)} \}_{i=1}^{n} \cup \dots \cup \{ D, I_k^{(i)}, y_k^{(i)} \}_{i=1}^{n}$ obtained from machines $1, \dots, k$.
6. Return the labels $\{ y_k^{(i)} \}_{i=1}^{n}$ for the $k$th machine.
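Steps 4 and 5 of Algorithm 2 amount to keeping the convolutional part fixed and refitting only the fully connected weights on the samples accumulated so far. The sketch below illustrates this under simplifying assumptions: the frozen CNN is represented only by its precomputed embeddings, the fully connected part is a single sigmoid unit, and plain gradient descent is used. The names `retrain_head`, `lr`, and `n_steps` are hypothetical and the optimizer is not taken from the source.

```python
import numpy as np

def retrain_head(ref_emb, sample_embs, labels, lr=0.5, n_steps=500):
    """Refit the fully connected weights theta with the CNN kept frozen.

    ref_emb:     (t,)   embedding of the reference image D from the frozen CNN
    sample_embs: (n, t) embeddings of images I from machines 1..k, stacked
    labels:      (n,)   1 = similar to D (healthy), 0 = different (failure)
    """
    labels = np.asarray(labels, dtype=float)
    d = np.abs(np.asarray(sample_embs, dtype=float) - np.asarray(ref_emb, dtype=float))
    d_aug = np.hstack([d, np.ones((len(d), 1))])  # append the bias input
    theta = np.zeros(d_aug.shape[1])              # step 4: re-initialise theta
    for _ in range(n_steps):                      # step 5: tune theta only
        si = 1.0 / (1.0 + np.exp(-(d_aug @ theta)))
        grad = d_aug.T @ (si - labels) / len(labels)  # cross-entropy gradient
        theta -= lr * grad
    return theta
```

To move from machine $k$ to machine $k+1$, the embeddings and labels of the new machine would be appended to `sample_embs` and `labels` (for example with `np.vstack` and `np.concatenate`) before calling the function again.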

3. Results and Discussion

Experimental results are analyzed in three subsections. The datasets are explained in the first of these sections. The training process of the proposed framework is mentioned in the second part. The simulation results and statistical analysis are interpreted in the third part. The obtained results are also discussed in the last one.

3.1. Test Datasets

The classification performance of the proposed method has been verified on four different datasets. The first of these datasets, the audio dataset, was collected on the MTU campus, and the remaining datasets are public datasets.

3.1.1. Lathe Machine Failure Dataset

This study aims to propose a robust algorithm that works effectively on different machines and different data types with as few samples as possible. The dataset collected for this purpose was obtained from four lathe machines over five days to detect a failure in lathe tool tips based on audio signals. There are 30 samples for each lathe machine dataset, except for one dataset with 25 samples. Due to the nature of the problem, the datasets obtained from the machines generally have similar characteristics, but even a few differences between them reduce the performance of classification algorithms. Likewise, differences between the data collected on different days from the same machine under almost the same conditions have adverse effects on the performance of classification algorithms. For classical algorithms to work effectively, many samples are collected from each machine; some of these are used for training, and the machine’s health is determined with the rest of the samples. With the proposed algorithm, however, it is possible to get an idea about the machine’s health without such a training set, since the Siamese network can be tested directly with new data samples. The proposed algorithm therefore minimizes a severe disadvantage. How this dataset was collected is presented in detail in the Supplementary File.

3.1.2. Bearing Fault Dataset

The dataset [151] contains three classes: baseline, inner fault, and outer fault in a rolling element bearing. It consists of vibration time-series signals. To test the proposed framework, two datasets, BFD-1 and BFD-2, are generated by selecting the last samples from the dataset folder; the task is to discriminate between the fault (inner or outer) and the baseline. Using a window size of 0.5 s, BFD-1 and BFD-2 yield 24 and 36 samples, respectively.

3.1.3. Wind Turbine High-Speed Bearing Prognosis

A vibration signal of 6 s is collected from a 2 MW wind turbine high-speed shaft driven by a 20-tooth pinion gear to generate a dataset for 50 days. The main aim of this dataset [152] is to predict a maintenance time for the wind turbine by analyzing vibration signals whose pattern changes with respect to an inner race-bearing fault. The current algorithm proposed the maintenance time as 40–48 days [153].

3.1.4. AI4I Dataset

This dataset [154,155] is artificially created using a random walk process. The simulated machine failure consists of five modes: tool wear failure, heat dissipation failure, power failure, overstrain failure, and random failures. If at least one of these failure modes is true, the process fails, and the ‘machine failure’ label is set to 1. The number of data samples is 100,000 with 13 features. Because the dataset is too large for our aim, 30 failure cases and 30 healthy cases are used in this study.

3.2. Training Processing

The proposed framework measures the similarity between two images using the Siamese network (SN), whose inner parameters are tuned with the help of the Omniglot dataset used for the training process. Images in the Omniglot dataset consist of handwritten characters shown in Figure 4. If the difference between two handwritten characters is large, the SN produces a value close to zero. Conversely, if two characters are alike, the SN output is close to one.
The test data in this study are likened to the characters found in the training data through the generalization process. Since the test data is in a close structure with the training data, successful results can be obtained in the classification. Classification performance can be improved by transferring trained CNN, and each dataset transfer also enables the next SN to produce better results. In summary, thanks to the Omniglot dataset, successful classification results can be obtained even without using the training set.
SN-1 is the first network trained with the Omniglot dataset in our simulation, and it is important that this network is well trained. Since the CNN in the SN is trained in this part, its parameters are fixed during all the test stages. The hyperparameters of the SN are chosen by trial and error, since there is no systematic method for choosing them. However, the internal parameters of the FCNN can be changed during the testing process because each new incoming data sample changes the behavior of the system. To adapt to the latest data, the network is partially retrained with the new data before the following classification step. The training process of SN-1 with the Omniglot dataset can be seen in Figure 6. Training continued for ten epochs, with 2 K iterations.
The testing process of SN-1 can be seen in the first row of Figure 7 for five datasets collected from LM-1 to LM-5. The images obtained from the generalization phase are used as samples whose similarity indexes are calculated using SN-1. In the next step, the CNN part of SN-1 is transferred to SN-2, and the FCNN part of SN-2 is trained with the images sent to SN-1 for testing. The results of SN-2 are shown in the second row of Figure 7. The next network, SN-3, again reuses the CNN from SN-1, and its FCNN is trained in the same way using the images obtained from the first and second steps. This process is repeated for SN-4 and SN-5. The training process of the FCNN part of SN-2 to SN-5 is given in Figure 7. More samples increase the separation between the unhealthy and healthy regions and decrease the loss.
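The chain from SN-1 to later networks can be sketched as "freeze the shared encoder, retrain only the head on all samples seen so far". The code below is a simplified stand-in, not the paper's implementation: a logistic-regression head replaces the FCNN, `pair_features` replaces the frozen CNN, and the learning rate, epoch count, and toy pairs are assumptions.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_head(features, labels, head, epochs=200, lr=0.5):
    """Fit only the head (weights, bias) by gradient descent on log loss."""
    weights, bias = list(head[0]), head[1]
    n = len(features)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(features, labels):
            err = sigmoid(bias + sum(w * v for w, v in zip(weights, x))) - y
            grad_w = [g + err * v for g, v in zip(grad_w, x)]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def log_loss(features, labels, head):
    """Mean binary cross-entropy of a head on the given features."""
    weights, bias = head
    total = 0.0
    for x, y in zip(features, labels):
        p = sigmoid(bias + sum(w * v for w, v in zip(weights, x)))
        p = min(max(p, 1e-12), 1 - 1e-12)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(features)


def transfer_chain(frozen_encoder, machine_batches, initial_head):
    """Build successive heads while the encoder stays untouched.

    heads[0] plays the role of SN-1; each later head is trained on every
    batch seen so far, starting from the previous head.
    """
    heads = [initial_head]
    seen_x, seen_y = [], []
    for pairs, labels in machine_batches:
        seen_x += [frozen_encoder(pair) for pair in pairs]
        seen_y += labels
        heads.append(train_head(seen_x, seen_y, heads[-1]))
    return heads


def pair_features(pair):
    """Frozen stand-in for the shared CNN: element-wise |a - b| of a pair."""
    a, b = pair
    return [abs(u - v) for u, v in zip(a, b)]


# Toy batches (label 1 = healthy/similar pair, 0 = unhealthy/dissimilar pair).
batches = [
    ([([1.0, 0.0], [1.0, 0.1]), ([1.0, 0.0], [0.0, 1.0])], [1, 0]),
    ([([0.5, 0.5], [0.5, 0.4]), ([0.2, 0.9], [0.9, 0.1])], [1, 0]),
]
heads = transfer_chain(pair_features, batches, ([0.0, 0.0], 0.0))
```

Because only the small head is updated, each step is cheap, and the growing training set reflects the text's observation that more samples lower the loss.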
When the results of LM-1 from SN-1 are examined, a sharp decrease can be observed in the region shown in red. This sharp drop indicates that the cutting tip has lost its function. In particular, a user without sufficient cutting experience is warned that the tip needs to be changed. During the experiment, an unhealthy tip change was made in this interval, and the resulting difference is marked as the red region. There is close agreement between the output of the algorithm and the actual data. The results obtained from SN-1 for all datasets are satisfactory.
SN-2, in turn, is partially trained with the data of LM-1 to make the boundaries even clearer. A significant improvement is seen at the end of the training in all datasets from LM-2 to LM-5. The training progress can be observed in Figure 8, where the loss decreases gradually as the number of samples increases. At the end of the training, the output is approximately 1 for a healthy tip and 0 for an unhealthy tip. Only three samples in LM-2 were incorrectly classified.
The visual results are supported by statistical analysis, which shows the efficiency of transfer learning and the effect of an increasing number of training samples. If both sets of results are Gaussian distributed, the t-test can be applied to detect whether they differ significantly. In this case, however, the results obtained from SN-1 to SN-5 are not Gaussian. Therefore, the nonparametric Wilcoxon test is used to determine whether two sets of results are significantly different. As Table 1 shows, there is a statistically significant difference between SN-1 and SN-2 for all LMs.
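A comparison of this kind can be sketched as below. The text does not say which Wilcoxon variant was used; this sketch assumes the paired signed-rank test (the same images scored by two networks) with a normal approximation and no tie or continuity correction. The score lists are synthetic and are not the paper's results.

```python
import math
from statistics import NormalDist


def wilcoxon_signed_rank(x, y):
    """Two-sided Wilcoxon signed-rank test using the normal approximation.

    Returns (W, p), where W is the smaller of the positive and negative
    rank sums. Zero differences are dropped and ties get average ranks.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    w = min(w_plus, w_minus)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w - mean) / sd  # z <= 0 because w is the smaller rank sum
    p = min(1.0, 2 * NormalDist().cdf(z))
    return w, p


# Synthetic similarity scores for the same 12 images (illustrative only).
sn1_scores = [0.62, 0.55, 0.71, 0.48, 0.66, 0.59, 0.52, 0.69, 0.57, 0.64, 0.50, 0.61]
sn2_scores = [0.91, 0.88, 0.95, 0.80, 0.93, 0.90, 0.84, 0.96, 0.87, 0.92, 0.83, 0.89]

w_stat, p_value = wilcoxon_signed_rank(sn1_scores, sn2_scores)
significant = p_value < 0.05
```

For small samples an exact test is preferable to the normal approximation; a statistics library such as SciPy offers one through `scipy.stats.wilcoxon`.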
There is therefore enough statistical evidence to conclude that transfer learning and increasing the sample size are effective. However, there is no statistically significant difference for the transitions from SN-2 to SN-3, SN-3 to SN-4, and SN-4 to SN-5, except for LM-5, where the transition from SN-2 to SN-3 is significant. When there is no statistically significant difference, further transfer learning steps only increase run time. Therefore, a statistical test can decide where the transfer learning should end.
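One way to turn this observation into a stopping rule is to keep adding transfer steps only while each transition remains significant. The sketch below is an assumption about how such a rule could look; the significance level of 0.05 and the example p-values are hypothetical and are not taken from Table 1.

```python
def last_useful_network(p_values, alpha=0.05):
    """Return the 1-based index of the last SN worth training.

    p_values[i] is the p-value for the transition SN-(i+1) -> SN-(i+2).
    The chain stops at the first transition that is not significant.
    """
    significant_transitions = 0
    for p in p_values:
        if p >= alpha:
            break
        significant_transitions += 1
    return significant_transitions + 1


# Hypothetical p-values for SN-1->SN-2, SN-2->SN-3, SN-3->SN-4, SN-4->SN-5.
example_p_values = [0.001, 0.40, 0.62, 0.55]
stop_at = last_useful_network(example_p_values)  # SN-2 in this example
```

Stopping at the first non-significant transition saves run time, at the cost of missing a later transition that becomes significant again.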

3.3. The Experimental Results for Public Datasets

The main aim of the paper is to detect failure in machines with similar characteristics, such as the lathe machines described in the previous section. However, the method can also be applied to datasets collected from a single machine. The accuracy of the proposed method is verified using public datasets. The results obtained from the three public datasets are shown in Figure 9. In this experiment, each dataset is collected from just one machine. Therefore, the images extracted from the datasets are sent directly to SN-1 through SN-6 without a transfer learning strategy. In this section, the public datasets are used only as test data.
First, the images are applied to the input of SN-1 for all datasets. In the next step, the images obtained from the datasets are sent to SN-2. Since SN-2 has already been trained on images obtained from the LM, there is significant separation for all datasets at the output of SN-2. The change in the classification can be seen when the datasets are sent to SN-3, SN-4, SN-5, and SN-6 trained with the data of LM-5. In general, the output of SN-2 and the results obtained from the other SNs are very close to each other. However, in some cases, the results obtained from the output of SN-1 may fall short of giving precise information about the machine’s state. We can observe this in the BFD-1 and AI4I datasets.
To analyze this, the results obtained from the output of each SN were interpreted statistically, again using the Wilcoxon test. According to the results (see Table 2 and Table 3), there is a significant difference in the transition from SN-1 to SN-2 in all datasets except BFD-2. However, there is no significant difference in any later transition except for AI4I, where a significant difference was observed in the transition from SN-4 to SN-5.
The maintenance of a machine is a dynamic process. In other words, the data from the machine to be maintained change constantly. Although the variation is within certain limits, it strongly affects the performance of classification algorithms. If not enough data are collected for training, the classifier’s performance decreases markedly. On the other hand, it can take years to collect and analyze data separately for each device. This becomes an even bigger problem as technology develops and machines are renewed. Considering all these, the proposed algorithm seen in Figure 10 must produce satisfactory results even without training data.
On the other hand, the increase in the algorithm’s performance as new data are added shows how well it can adapt to changing conditions. Another advantage of the algorithm is that it can work on different data types with only a few changes. For example, in this study the same SN classifies datasets based on vibration data, sound signals, and artificially generated features. In addition to fault classification, the algorithm flexibly gives an indication of the health status of the machine. This can be seen clearly in the WT dataset: in the SN-1 output, the machine’s health is good in the first days, the decrease starts on the 42nd day, and this indicates that the maintenance time of the machine has come. These findings are consistent with other studies using this dataset.
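Reading a maintenance time off a daily health index can be sketched as finding the first sustained drop below a threshold. The paper does not describe such a rule; the threshold, the patience window, and the 50-day series below are all assumptions, and the series is built by hand to drop on day 42 so that it mirrors the description above.

```python
def maintenance_day(health, threshold=0.5, patience=3):
    """Return the 1-based day on which a sustained drop begins, or None.

    A drop is sustained when `patience` consecutive daily scores fall
    below `threshold`; a shorter dip is ignored.
    """
    run = 0
    for day, score in enumerate(health, start=1):
        run = run + 1 if score < threshold else 0
        if run == patience:
            return day - patience + 1
    return None


# Synthetic 50-day health index (illustrative only, not the paper's output).
health = [0.95] * 41 + [0.45, 0.40, 0.30, 0.25, 0.20, 0.15, 0.10, 0.10, 0.05]

day = maintenance_day(health)
```

Requiring several consecutive low scores avoids raising a maintenance alarm on a single noisy measurement.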
The main hypothesis was that a legacy machine’s health can be monitored using an unsupervised learning technique. All results show that the framework based on the Siamese network can monitor a legacy machine and detect its failures. The framework can be adopted for machines of the same type working in different settings and environments. In addition, as it does not strictly require labels collected from the machines, it can be applied to unsupervised failure detection problems. The Siamese network used in this study is trained with the public Omniglot dataset of handwriting images so that the network learns to measure the similarity between two samples given at its input. After this process, we tried to find the point where the machine breaks down; we did not aim to perform classification directly. Therefore, we report no performance metrics, only statistical results, for evaluating the algorithm.

4. Conclusions

To implement the concepts of circular economy and sustainable manufacturing, machinery must be reused up to its maximum lifespan. To conform with modern manufacturing techniques and achieve a higher level of sustainable manufacturing, the machine to be reused must be retrofitted and upgraded with modern connectivity, reporting, and PdM systems. Although multiple machines are often used for the same purpose in a factory, using PdM training data collected from one machine to implement PdM in other machines often fails. Even the behavior of the same device may change over a short time period under different working conditions. With established methods, a vast dataset is needed to solve these problems, and obtaining such a dataset is highly time-consuming and laborious. In this study, a Siamese network has been constructed and trained with a limited number of samples. Including the test data in the training for the next machine and applying transfer learning have made the algorithm robust and stable. The test data are restructured so that the data obtained from the machine resemble the public training dataset. To this end, the proposed generation step designs the projections of the machine features onto ideal sample bases so that they resemble the public dataset. A limitation of the algorithm is that it requires more preprocessing stages than classical deep-learning methods. Like all frameworks with DNNs, the algorithm also needs hyperparameter optimization. The results are expected to be applicable to industrial problems and to find wider use in the future. The structure used can be improved by changing the clustering technique or by using different types of DNNs, such as ResNet and AlexNet, to boost the efficiency, profitability, and sustainability of legacy machines.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/su15129272/s1, Figure S1: Colchester Triumph 2000 Lathe Machine Dimensions; Figure S2: Some Samples from the Experiment; Figure S3: Data Collection; Figure S4: Denoising Process.

Author Contributions

Conceptualization, D.R. and J.W.; methodology, A.C. and D.R.; software, A.C.; validation, A.C., C.O. and K.P.; investigation, A.C., C.O. and K.P.; data curation, A.C. and C.O.; writing—review and editing, A.C., C.O., K.P., J.W. and D.R.; supervision, D.R.; project administration, J.W. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported, in part, by Science Foundation Ireland grant number 16/RC/3918 to the CONFIRM Science Foundation Ireland Research Centre for Smart Manufacturing and cofunded under the European Regional Development Fund (www.confirm.ie, accessed on 19 April 2023).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available at https://drive.google.com/drive/folders/1XYVIDs3tdLgbRZv-_E3ooWHvYTDr_frD?usp=sharing (accessed on 26 April 2022).

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Fontana, A.; Barni, A.; Leone, D.; Spirito, M.; Tringale, A.; Ferraris, M.; Reis, J.; Goncalves, G. Circular Economy Strategies for Equipment Lifetime Extension: A Systematic Review. Sustainability 2021, 13, 1117. [Google Scholar] [CrossRef]
  2. Li, B.; Wang, T.; Li, C.; Dong, Z.; Yang, H.; Sun, Y.; Wang, P. A Strategy for Determining the Decommissioning Life of Energy Equipment Based on Economic Factors and Operational Stability. Sustainability 2022, 14, 16378. [Google Scholar] [CrossRef]
  3. Lolli, F.; Coruzzolo, A.M.; Peron, M.; Sgarbossa, F. Age-Based Preventive Maintenance with Multiple Printing Options. Int. J. Prod. Econ. 2022, 243, 108339. [Google Scholar] [CrossRef]
  4. Sun, Z.; Chen, T.; Meng, X.; Bao, Y.; Hu, L.; Zhao, R. A Critical Review for Trustworthy and Explainable Structural Health Monitoring and Risk Prognosis of Bridges with Human-In-The-Loop. Sustainability 2023, 15, 6389. [Google Scholar] [CrossRef]
  5. Wang, Y.; Zhao, Y. Multi-Scale Remaining Useful Life Prediction Using Long Short-Term Memory. Sustainability 2022, 14, 15667. [Google Scholar] [CrossRef]
  6. van Noortwijk, J.M. A Survey of the Application of Gamma Processes in Maintenance. Reliab. Eng. Syst. Saf. 2009, 94, 2–21. [Google Scholar] [CrossRef]
  7. Alaswad, S.; Xiang, Y. A Review on Condition-Based Maintenance Optimization Models for Stochastically Deteriorating System. Reliab. Eng. Syst. Saf. 2017, 157, 54–63. [Google Scholar] [CrossRef]
  8. Lei, Y.; Li, N.; Gontarz, S.; Lin, J.; Radkowski, S.; Dybala, J. A Model-Based Method for Remaining Useful Life Prediction of Machinery. IEEE Trans. Reliab. 2016, 65, 1314–1326. [Google Scholar] [CrossRef]
  9. Huang, Z.; Xu, Z.; Wang, W.; Sun, Y. Remaining Useful Life Prediction for a Nonlinear Heterogeneous Wiener Process Model With an Adaptive Drift. IEEE Trans. Reliab. 2015, 64, 687–700. [Google Scholar] [CrossRef]
  10. Hanachi, H.; Liu, J.; Banerjee, A.; Chen, Y.; Koul, A. A Physics-Based Modeling Approach for Performance Monitoring in Gas Turbine Engines. IEEE Trans. Reliab. 2015, 64, 197–205. [Google Scholar] [CrossRef]
  11. Dui, H.; Si, S.; Zuo, M.J.; Sun, S. Semi-Markov Process-Based Integrated Importance Measure for Multi-State Systems. IEEE Trans. Reliab. 2015, 64, 754–765. [Google Scholar] [CrossRef]
  12. Cui, L.; Xu, Y.; Zhao, X. Developments and Applications of the Finite Markov Chain Imbedding Approach in Reliability. IEEE Trans. Reliab. 2010, 59, 685–690. [Google Scholar] [CrossRef]
  13. Si, X.; Wang, W.; Hu, C.; Zhou, D.; Pecht, M.G. Remaining Useful Life Estimation Based on a Nonlinear Diffusion Degradation Process. IEEE Trans. Reliab. 2012, 61, 50–67. [Google Scholar] [CrossRef]
  14. Si, X.-S.; Wang, W.; Chen, M.-Y.; Hu, C.-H.; Zhou, D.-H. A Degradation Path-Dependent Approach for Remaining Useful Life Estimation with an Exact and Closed-Form Solution. Eur. J. Oper. Res. 2013, 226, 53–66. [Google Scholar] [CrossRef]
  15. Yu, J. A Nonlinear Probabilistic Method and Contribution Analysis for Machine Condition Monitoring. Mech. Syst. Signal Process. 2013, 37, 293–314. [Google Scholar] [CrossRef]
  16. Yu, J. Health Degradation Detection and Monitoring of Lithium-Ion Battery Based on Adaptive Learning Method. IEEE Trans. Instrum. Meas. 2014, 63, 1709–1721. [Google Scholar] [CrossRef]
  17. Zhang, W.; Yang, D.; Wang, H. Data-Driven Methods for Predictive Maintenance of Industrial Equipment: A Survey. IEEE Syst. J. 2019, 13, 2213–2227. [Google Scholar] [CrossRef]
  18. Carvalho, T.P.; Soares, F.A.A.M.N.; Vita, R.; Francisco, R.D.P.; Basto, J.P.; Alcalá, S.G.S. A Systematic Literature Review of Machine Learning Methods Applied to Predictive Maintenance. Comput. Ind. Eng. 2019, 137, 106024. [Google Scholar] [CrossRef]
  19. Angelopoulos, A.; Michailidis, E.T.; Nomikos, N.; Trakadas, P.; Hatziefremidis, A.; Voliotis, S.; Zahariadis, T. Tackling Faults in the Industry 4.0 Era—A Survey of Machine-Learning Solutions and Key Aspects. Sensors 2020, 20, 109. [Google Scholar] [CrossRef] [Green Version]
  20. Patil, S.; Desai, S.; Patil, A.; Phalle, V.M.; Handikherkar, V.; Kazi, F.S. Remaining Useful Life (RuL) Prediction of Rolling Element Bearing Using Random Forest and Gradient Boosting Technique. In Proceedings of the ASME International Mechanical Engineering Congress and Exposition, Pittsburgh, PA, USA, 9–15 November 2018; Volume 13. [Google Scholar]
  21. Rafiee, J.; Arvani, F.; Harifi, A.; Sadeghi, M.H. Intelligent Condition Monitoring of a Gearbox Using Artificial Neural Network. Mech. Syst. Signal Process. 2007, 21, 1746–1754. [Google Scholar] [CrossRef]
  22. Sakthivel, N.R.; Sugumaran, V.; Babudevasenapati, S. Vibration Based Fault Diagnosis of Monoblock Centrifugal Pump Using Decision Tree. Expert Syst. Appl. 2010, 37, 4040–4049. [Google Scholar] [CrossRef]
  23. Nasir, V.; Cool, J. Intelligent Wood Machining Monitoring Using Vibration Signals Combined with Self-Organizing Maps for Automatic Feature Selection. Int. J. Adv. Manuf. Technol. 2020, 108, 1811–1825. [Google Scholar] [CrossRef]
  24. Wu, J.-D.; Liu, C.-H. Investigation of Engine Fault Diagnosis Using Discrete Wavelet Transform and Neural Network. Expert Syst. Appl. 2008, 35, 1200–1213. [Google Scholar] [CrossRef]
  25. Devasenapati, S.B.; Sugumaran, V.; Ramachandran, K.I. Misfire Identification in a Four-Stroke Four-Cylinder Petrol Engine Using Decision Tree. Expert Syst. Appl. 2010, 37, 2150–2160. [Google Scholar] [CrossRef]
  26. Jack, L.B.; Nandi, A.K. Support Vector Machines for Detection and Characterization of Rolling Element Bearing Faults. Proc. Inst. Mech. Eng. Part C J. Mech. Eng. Sci. 2001, 215, 1065–1074. [Google Scholar] [CrossRef]
  27. Wang, G.F.; Yang, Y.W.; Zhang, Y.C.; Xie, Q.L. Vibration Sensor Based Tool Condition Monitoring Using ν Support Vector Machine and Locality Preserving Projection. Sens. Actuators A Phys. 2014, 209, 24–32. [Google Scholar] [CrossRef]
  28. Lee, S.-B.; Habetler, T.G. An Online Stator Winding Resistance Estimation Technique for Temperature Monitoring of Line-Connected Induction Machines. IEEE Trans. Ind. Appl. 2003, 39, 685–694. [Google Scholar] [CrossRef]
  29. Ni, Y.Q.; Hua, X.G.; Fan, K.Q.; Ko, J.M. Correlating Modal Properties with Temperature Using Long-Term Monitoring Data and Support Vector Machine Technique. Eng. Struct. 2005, 27, 1762–1773. [Google Scholar] [CrossRef]
  30. Xiang, D.; Ran, L.; Tavner, P.; Bryant, A.; Yang, S.; Mawby, P. Monitoring Solder Fatigue in a Power Module Using Case-above-Ambient Temperature Rise. IEEE Trans. Ind. Appl. 2011, 47, 2578–2591. [Google Scholar] [CrossRef]
  31. Chen, H.; Ji, B.; Pickert, V.; Cao, W. Real-Time Temperature Estimation for Power MOSFETs Considering Thermal Aging Effects. IEEE Trans. Device Mater. Reliab. 2014, 14, 220–228. [Google Scholar] [CrossRef] [Green Version]
  32. Guo, P.; Infield, D.; Yang, X. Wind Turbine Generator Condition-Monitoring Using Temperature Trend Analysis. IEEE Trans. Sustain. Energy 2012, 3, 124–133. [Google Scholar] [CrossRef] [Green Version]
  33. Mouli, D.S.B.; Rameshkumar, K. Acoustic Emission-Based Grinding Wheel Condition Monitoring Using Decision Tree Machine Learning Classifiers. Lect. Notes Mech. Eng. 2020, 2020, 353–359. [Google Scholar] [CrossRef]
  34. Motahari-Nezhad, M.; Jafari, S.M. ANFIS System for Prognosis of Dynamometer High-Speed Ball Bearing Based on Frequency Domain Acoustic Emission Signals. Meas. J. Int. Meas. Confed. 2020, 166, 108154. [Google Scholar] [CrossRef]
  35. Li, H.; Wang, Y.; Zhao, P.; Zhang, X.; Zhou, P. Cutting Tool Operational Reliability Prediction Based on Acoustic Emission and Logistic Regression Model. J. Intell. Manuf. 2015, 26, 923–931. [Google Scholar] [CrossRef]
  36. Glowacz, A. Diagnostics of Synchronous Motor Based on Analysis of Acoustic Signals with the Use of Line Spectral Frequencies and K-Nearest Neighbor Classifier. Arch. Acoust. 2014, 39, 189–194. [Google Scholar] [CrossRef] [Green Version]
  37. Wang, J.; Liang, Y.; Zheng, Y.; Gao, R.X.; Zhang, F. An Integrated Fault Diagnosis and Prognosis Approach for Predictive Maintenance of Wind Turbine Bearing with Limited Samples. Renew. Energy 2020, 145, 642–650. [Google Scholar] [CrossRef]
  38. Senguler, T.; Karatoprak, E.; Seker, S.; Caglar, R. ICA and Wavelet Packet Decomposition Approaches for Monitoring the Incipient Bearing Damage in Electrical Motors. In Proceedings of the 2008 4th International IEEE Conference Intelligent Systems, Nice, France, 22–26 September 2008; Volume 2, pp. 2413–2417. [Google Scholar]
  39. Qiu, H.; Lee, J.; Lin, J.; Yu, G. Robust Performance Degradation Assessment Methods for Enhanced Rolling Element Bearing Prognostics. Adv. Eng. Inform. 2003, 17, 127–140. [Google Scholar] [CrossRef]
  40. Bansal, D.; Evans, D.J.; Jones, B. BJEST: A Reverse Algorithm for the Real-Time Predictive Maintenance System. Int. J. Mach. Tools Manuf. 2006, 46, 1068–1078. [Google Scholar] [CrossRef]
  41. Sánchez, L.; Couso, I. Singular Spectral Analysis of Ill-Known Signals and Its Application to Predictive Maintenance of Windmills with SCADA Records. Soft Comput. 2012, 16, 755–768. [Google Scholar] [CrossRef]
  42. Zhang, Y.; Randall, R.B. Rolling Element Bearing Fault Diagnosis Based on the Combination of Genetic Algorithms and Fast Kurtogram. Mech. Syst. Signal Process. 2009, 23, 1509–1517. [Google Scholar] [CrossRef]
  43. Saidi, L.; Ben Ali, J.; Fnaiech, F. Application of Higher Order Spectral Features and Support Vector Machines for Bearing Faults Classification. ISA Trans. 2015, 54, 193–206. [Google Scholar] [CrossRef]
  44. Tabrizi, A.; Garibaldi, L.; Fasana, A.; Marchesiello, S. Early Damage Detection of Roller Bearings Using Wavelet Packet Decomposition, Ensemble Empirical Mode Decomposition and Support Vector Machine. Meccanica 2015, 50, 865–874. [Google Scholar] [CrossRef]
  45. Phillips, J.; Cripps, E.; Lau, J.W.; Hodkiewicz, M.R. Classifying Machinery Condition Using Oil Samples and Binary Logistic Regression. Mech. Syst. Signal Process. 2015, 60, 316–325. [Google Scholar] [CrossRef]
  46. Barbieri, M.; Diversi, R.; Tilli, A. Condition Monitoring of Ball Bearings Using Estimated AR Models as Logistic Regression Features. In Proceedings of the 2019 18th European Control Conference (ECC), Naples, Italy, 25–28 June 2019; pp. 3904–3909. [Google Scholar]
  47. Zhang, J.; Nie, H. Experimental Study and Logistic Regression Modeling for Machine Condition Monitoring through Microcontroller-Based Data Acquisition System. J. Adv. Manuf. Syst. 2009, 8, 177–192. [Google Scholar] [CrossRef]
  48. Knoebel, C.; Strommenger, D.; Reuter, J.; Guehmann, C. Health Index Generation Based on Compressed Sensing and Logistic Regression for Remaining Useful Life Prediction. In Proceedings of the Annual Conference of the PHM Society, Scottsdale, AZ, USA, 21–26 September 2019; Volume 11. [Google Scholar]
  49. Langone, R.; Cuzzocrea, A.; Skantzos, N. Interpretable Anomaly Prediction: Predicting Anomalous Behavior in Industry 4.0 Settings via Regularized Logistic Regression Tools. Data Knowl. Eng. 2020, 130, 101850. [Google Scholar] [CrossRef]
  50. Bodla, M.K.; Malik, S.M.; Rasheed, M.T.; Numan, M.; Ali, M.Z.; Brima, J.B. Logistic Regression and Feature Extraction Based Fault Diagnosis of Main Bearing of Wind Turbines. In Proceedings of the 2016 IEEE 11th Conference on Industrial Electronics and Applications (ICIEA), Hefei, China, 5–7 June 2016; pp. 1628–1633. [Google Scholar]
  51. Yu, J. Tool Condition Prognostics Using Logistic Regression with Penalization and Manifold Regularization. Appl. Soft Comput. J. 2018, 64, 454–467. [Google Scholar] [CrossRef]
  52. Gryllias, K.C.; Antoniadis, I.A. A Support Vector Machine Approach Based on Physical Model Training for Rolling Element Bearing Fault Detection in Industrial Environments. Eng. Appl. Artif. Intell. 2012, 25, 326–344. [Google Scholar] [CrossRef]
  53. Samanta, B.; Al-Balushi, K.R.; Al-Araimi, S.A. Artificial Neural Networks and Support Vector Machines with Genetic Algorithm for Bearing Fault Detection. Eng. Appl. Artif. Intell. 2003, 16, 657–665. [Google Scholar] [CrossRef]
  54. Konar, P.; Chattopadhyay, P. Bearing Fault Detection of Induction Motor Using Wavelet and Support Vector Machines (SVMs). Appl. Soft Comput. J. 2011, 11, 4203–4211. [Google Scholar] [CrossRef]
  55. Pandarakone, S.E.; Mizuno, Y.; Nakamura, H. Distinct Fault Analysis of Induction Motor Bearing Using Frequency Spectrum Determination and Support Vector Machine. IEEE Trans. Ind. Appl. 2017, 53, 3049–3056. [Google Scholar] [CrossRef]
  56. Jack, L.B.; Nandi, A.K. Fault Detection Using Support Vector Machines and Artificial Neural Networks, Augmented by Genetic Algorithms. Mech. Syst. Signal Process. 2002, 16, 373–390. [Google Scholar] [CrossRef]
  57. Lim, G.-M.; Bae, D.-M.; Kim, J.-H. Fault Diagnosis of Rotating Machine by Thermography Method on Support Vector Machine. J. Mech. Sci. Technol. 2014, 28, 2947–2952. [Google Scholar] [CrossRef]
  58. Murthy, V.S.; Tarakanath, K.; Mohanta, D.K.; Gupta, S. Insulator Condition Analysis for Overhead Distribution Lines Using Combined Wavelet Support Vector Machine (SVM). IEEE Trans. Dielectr. Electr. Insul. 2010, 17, 89–99. [Google Scholar] [CrossRef]
  59. Widodo, A.; Yang, B.-S. Machine Health Prognostics Using Survival Probability and Support Vector Machine. Expert Syst. Appl. 2011, 38, 8430–8437. [Google Scholar] [CrossRef]
  60. Tran, V.T.; Thom Pham, H.; Yang, B.-S.; Tien Nguyen, T. Machine Performance Degradation Assessment and Remaining Useful Life Prediction Using Proportional Hazard Model and Support Vector Machine. Mech. Syst. Signal Process. 2012, 32, 320–330. [Google Scholar] [CrossRef] [Green Version]
  61. Shi, D.; Gindy, N.N. Tool Wear Predictive Model Based on Least Squares Support Vector Machines. Mech. Syst. Signal Process. 2007, 21, 1799–1814. [Google Scholar] [CrossRef]
  62. Muralidharan, V.; Sugumaran, V. A Comparative Study of Naïve Bayes Classifier and Bayes Net Classifier for Fault Diagnosis of Monoblock Centrifugal Pump Using Wavelet Analysis. Appl. Soft Comput. J. 2012, 12, 2023–2029. [Google Scholar] [CrossRef]
  63. Pandarakone, S.E.; Gunasekaran, S.; Mizuno, Y.; Nakamura, H. Application of Naive Bayes Classifier Theorem in Detecting Induction Motor Bearing Failure. In Proceedings of the 2018 XIII International Conference on Electrical Machines (ICEM), Alexandroupoli, Greece, 3–6 September 2018; pp. 1761–1767. [Google Scholar]
  64. Mitiche, I.; Nesbitt, A.; Boreham, P.; Stewart, B.G.; Morison, G. Naive Bayes Multi-Label Classification Approach for High-Voltage Condition Monitoring. In Proceedings of the 2018 IEEE International Conference on Internet of Things and Intelligence System (IOTAIS), Bali, Indonesia, 1–3 November 2018; pp. 162–166. [Google Scholar]
  65. Alamelu Manghai, T.M.; Jegadeeshwaran, R.; Sakthivel, G. Real Time Condition Monitoring of Hydraulic Brake System Using Naive Bayes and Bayes Net Algorithms. IOP Conf. Ser. Mater. Sci. Eng. 2019, 624, 012028. [Google Scholar] [CrossRef]
  66. Natarajan, S. Condition Monitoring of Bevel Gear Box Using Morlet Wavelet Coefficients and Naïve Bayes Classifier. Int. J. Syst. Control. Commun. 2019, 10, 18–31. [Google Scholar] [CrossRef]
  67. Sugumaran, V.; Ramachandran, K.I. Automatic Rule Learning Using Decision Tree for Fuzzy Classifier in Fault Diagnosis of Roller Bearing. Mech. Syst. Signal Process. 2007, 21, 2237–2247. [Google Scholar] [CrossRef]
  68. Jegadeeshwaran, R.; Sugumaran, V. Comparative Study of Decision Tree Classifier and Best First Tree Classifier for Fault Diagnosis of Automobile Hydraulic Brake System Using Statistical Features. Meas. J. Int. Meas. Confed. 2013, 46, 3247–3260. [Google Scholar] [CrossRef]
  69. Sakthivel, N.R.; Sugumaran, V.; Nair, B.B. Comparison of Decision Tree-Fuzzy and Rough Set-Fuzzy Methods for Fault Categorization of Mono-Block Centrifugal Pump. Mech. Syst. Signal Process. 2010, 24, 1887–1906. [Google Scholar] [CrossRef]
  70. Sharma, R.K.; Sugumaran, V.; Kumar, H.; Amarnath, M. Condition Monitoring of Roller Bearing by K-Star Classifier and K-Nearest Neighborhood Classifier Using Sound Signal. SDHM Struct. Durab. Health Monit. 2017, 12, 1–16. [Google Scholar]
  71. Muralidharan, V.; Ravikumar, S.; Kangasabapathy, H. Condition Monitoring of Self Aligning Carrying Idler (SAI) in Belt-Conveyor System Using Statistical Features and Decision Tree Algorithm. Meas. J. Int. Meas. Confed. 2014, 58, 274–279. [Google Scholar] [CrossRef]
  72. Devendiran, S.; Manivannan, K. Condition Monitoring on Grinding Wheel Wear Using Wavelet Analysis and Decision Tree C4.5 Algorithm. Int. J. Eng. Technol. 2013, 5, 4010–4024. [Google Scholar]
  73. Qin, S.; Zhang, M.; Ma, X.; Li, M. A New Integrated Analytics Approach for Wind Turbine Fault Detection Using Wavelet, RLS Filter and Random Forest. Energy Sources Part A Recovery Util. Environ. Eff. 2019, 2019, 1–16. [Google Scholar] [CrossRef]
  74. Prabhu, G.R.; Chandrasekar, S.; Ravindran, R.S. A Novel Random Forest Model Approach for Accurate Classification of Single Partial Discharge Sources of HV Transformer Insulation Faults. J. Comput. Theor. Nanosci. 2016, 13, 9040–9050. [Google Scholar] [CrossRef]
  75. Aravinth, S.; Sugumaran, V. Air Compressor Fault Diagnosis through Statistical Feature Extraction and Random Forest Classifier. Prog. Ind. Ecol. 2018, 12, 192–205. [Google Scholar] [CrossRef]
  76. Chen, M.; Pang, X.; Lu, K. Fault Diagnosis of Planetary Gearbox Based on Random Forest and Singular Value Difference Spectrum. Smart Innov. Syst. Technol. 2020, 166, 1529–1540. [Google Scholar] [CrossRef]
  77. Ravikumar, S.; Muralidharan, V.; Ramesh, P.; Pandian, C. Fault Diagnosis of Self-Aligning Conveyor Idler in Coal Handling Belt Conveyor System by Statistical Features Using Random Forest Algorithm. Lect. Notes Electr. Eng. 2021, 688, 207–219. [Google Scholar] [CrossRef]
  78. Kartojo, I.H.; Wang, Y.-B.; Zhang, G.-J.; Suwarno. Partial Discharge Defect Recognition in Power Transformer Using Random Forest. In Proceedings of the 2019 IEEE 20th International Conference on Dielectric Liquids (ICDL), Rome, Italy, 23–27 June 2019; Volume 2019. [Google Scholar]
  79. Yan, W.; Zhou, J.-H. Predictive Modeling of Aircraft Systems Failure Using Term Frequency-Inverse Document Frequency and Random Forest. In Proceedings of the 2017 IEEE International Conference on Industrial Engineering and Engineering Management (IEEM), Singapore, 10–13 December 2017; Volume 2017, pp. 828–831. [Google Scholar]
  80. Vamsi, I.V.; Abhinav, N.; Verma, A.K.; Radhika, S. Random Forest Based Real Time Fault Monitoring System for Industries. In Proceedings of the 2018 4th International Conference on Computing Communication and Automation (ICCCA), Greater Noida, India, 14–15 December 2018. [Google Scholar]
  81. Li, J.; Wu, Y.; Wang, G.; Peng, X.; Liu, T.; Jiao, Y. Gradient Boosting Decision Tree and Random Forest Based Partial Discharge Pattern Recognition of HV Cable. In Proceedings of the 2018 China International Conference on Electricity Distribution (CICED), Tianjin, China, 17–19 September 2018; pp. 327–331. [Google Scholar]
  82. Li, X.; Mba, D.; Lin, T.; Yang, Y.; Loukopoulos, P. Just-in-Time Learning Based Probabilistic Gradient Boosting Tree for Valve Failure Prognostics. Mech. Syst. Signal Process. 2021, 150, 107253. [Google Scholar] [CrossRef]
  83. Wang, Y.; Zhu, Z.; Song, H.; Shi, K. Wind Turbine Gearbox Condition Monitoring Based on Extreme Gradient Boosting. In Proceedings of the IECON 2017-43rd Annual Conference of the IEEE Industrial Electronics Society, Beijing, China, 29 October–1 November 2017; Volume 2017, pp. 6017–6023. [Google Scholar]
  84. Moosavian, A.; Ahmadi, H.; Tabatabaeefar, A.; Sakhaei, B. An Appropriate Procedure for Detection of Journal-Bearing Fault Using Power Spectral Density, K-Nearest Neighbor and Support Vector Machine. Int. J. Smart Sens. Intell. Syst. 2012, 5, 685–700. [Google Scholar] [CrossRef] [Green Version]
  85. Surti, K.V.; Naik, C.A. Bearing Condition Monitoring of Induction Motor Based on Discrete Wavelet Transform K-Nearest Neighbor. In Proceedings of the 2018 3rd International Conference for Convergence in Technology (I2CT), Pune, India, 6–8 April 2018. [Google Scholar]
  86. Junior, P.; D’Addona, D.M.; Aguiar, P.; Teti, R. Dressing Tool Condition Monitoring through Impedance-Based Sensors: Part 2—Neural Networks and K-Nearest Neighbor Classifier Approach. Sensors 2018, 18, 4453. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  87. Hasan, M.J.; Kim, J.-M. Fault Detection of a Spherical Tank Using a Genetic Algorithm-Based Hybrid Feature Pool and k-Nearest Neighbor Algorithm. Energies 2019, 12, 991. [Google Scholar] [CrossRef] [Green Version]
  88. Moosavian, A.; Ahmadi, H.; Tabatabaeefar, A. Fault Diagnosis of Main Engine Journal Bearing Based on Vibration Analysis Using Fisher Linear Discriminant, K-Nearest Neighbor and Support Vector Machine. J. Vibroeng. 2012, 14, 894–906. [Google Scholar]
  89. Huang, Y.; Zha, X.F.; Lee, J.; Liu, C. Discriminant Diffusion Maps Analysis: A Robust Manifold Learner for Dimensionality Reduction and Its Applications in Machine Condition Monitoring and Fault Diagnosis. Mech. Syst. Signal Process. 2013, 34, 277–297. [Google Scholar] [CrossRef]
  90. Khandaker, N.; Zhang, R.; Dong, S.; Xu, B.; Zhang, Z.; Wen, G. Application of Unsupervised Linear Discriminant Analysis for Condition Monitoring of Rotating Machinery. In Proceedings of the 9th International Conference on Modelling, Identification and Control (ICMIC), Kunming, China, 10–12 July 2017; pp. 43–48. [Google Scholar] [CrossRef]
  91. Liu, T.; Chen, J.; Zhou, X.N.; Xiao, W.B. Bearing Performance Degradation Assessment Using Linear Discriminant Analysis and Coupled HMM. In Proceedings of the 25th International Congress on Condition Monitoring and Diagnostic Engineering (COMADEM 2012), Huddersfield, UK, 18–20 June 2012; Volume 364. [Google Scholar]
  92. Fernandez-Temprano, M.; Gardel-Sotomayor, P.E.; Duque-Perez, O.; Morinigo-Sotelo, D. Broken Bar Condition Monitoring of an Induction Motor under Different Supplies Using a Linear Discriminant Analysis. In Proceedings of the 2013 9th IEEE International Symposium on Diagnostics for Electric Machines, Power Electronics and Drives (SDEMPED), Valencia, Spain, 27–30 August 2013; pp. 162–168. [Google Scholar]
  93. Ramirez-Chavez, M.; Saucedo-Dorantes, J.J.; Jaen-Cuellar, A.Y.; Osornio-Rios, R.A.; Romero-Troncoso, R.D.J.; Delgado-Prieto, M. Condition Monitoring Strategy Based on Spectral Energy Estimation and Linear Discriminant Analysis Applied to an Induction Motor. In Proceedings of the 2018 IEEE International Autumn Meeting on Power, Electronics and Computing (ROPEC), Ixtapa, Mexico, 14–16 November 2018. [Google Scholar]
  94. Tian, Z.; Wong, L.; Safaei, N. A Neural Network Approach for Remaining Useful Life Prediction Utilizing Both Failure and Suspension Histories. Mech. Syst. Signal Process. 2010, 24, 1542–1555. [Google Scholar] [CrossRef]
  95. Chow, M.-Y.; Yee, S.O.; Mangum, P.M. A Neural Network Approach to Real-Time Condition Monitoring of Induction Motors. IEEE Trans. Ind. Electron. 1991, 38, 448–453. [Google Scholar] [CrossRef]
  96. Gebraeel, N.Z.; Lawley, M.A. A Neural Network Degradation Model for Computing and Updating Residual Life Distributions. IEEE Trans. Autom. Sci. Eng. 2008, 5, 154–163. [Google Scholar] [CrossRef]
  97. Wu, S.-J.; Gebraeel, N.; Lawley, M.A.; Yih, Y. A Neural Network Integrated Decision Support System for Condition-Based Optimal Predictive Maintenance Policy. IEEE Trans. Syst. Man Cybern. Part A Syst. Hum. 2007, 37, 226–236. [Google Scholar] [CrossRef]
  98. Bangalore, P.; Tjernberg, L.B. An Artificial Neural Network Approach for Early Fault Detection of Gearbox Bearings. IEEE Trans. Smart Grid 2015, 6, 980–987. [Google Scholar] [CrossRef]
  99. Tian, Z. An Artificial Neural Network Method for Remaining Useful Life Prediction of Equipment Subject to Condition Monitoring. J. Intell. Manuf. 2012, 23, 227–237. [Google Scholar] [CrossRef]
  100. Fast, M.; Palmé, T. Application of Artificial Neural Networks to the Condition Monitoring and Diagnosis of a Combined Heat and Power Plant. Energy 2010, 35, 1114–1120. [Google Scholar] [CrossRef]
  101. Liang, B.; Iwnicki, S.D.; Zhao, Y. Application of Power Spectrum, Cepstrum, Higher Order Spectrum and Neural Network Analyses for Induction Motor Fault Diagnosis. Mech. Syst. Signal Process. 2013, 39, 342–360. [Google Scholar] [CrossRef] [Green Version]
  102. Malik, H.; Mishra, S. Artificial Neural Network and Empirical Mode Decomposition Based Imbalance Fault Diagnosis of Wind Turbine Using TurbSim, FAST and Simulink. IET Renew. Power Gener. 2017, 11, 889–902. [Google Scholar] [CrossRef]
  103. Paya, B.A.; Esat, I.I.; Badi, M.N.M. Artificial Neural Network Based Fault Diagnostics of Rotating Machinery Using Wavelet Transforms as a Preprocessor. Mech. Syst. Signal Process. 1997, 11, 751–765. [Google Scholar] [CrossRef]
  104. Samanta, B.; Al-Balushi, K.R.; Al-Araimi, S.A. Artificial Neural Networks and Genetic Algorithm for Bearing Fault Detection. Soft Comput. 2006, 10, 264–271. [Google Scholar] [CrossRef]
  105. Samanta, B. Artificial Neural Networks and Genetic Algorithms for Gear Fault Detection. Mech. Syst. Signal Process. 2004, 18, 1273–1282. [Google Scholar] [CrossRef]
  106. Prieto, M.D.; Cirrincione, G.; Espinosa, A.G.; Ortega, J.A.; Henao, H. Bearing Fault Detection by a Novel Condition-Monitoring Scheme Based on Statistical-Time Features and Neural Networks. IEEE Trans. Ind. Electron. 2013, 60, 3398–3407. [Google Scholar] [CrossRef]
  107. Abu-Mahfouz, I. Drilling Wear Detection and Classification Using Vibration Signals and Artificial Neural Network. Int. J. Mach. Tools Manuf. 2003, 43, 707–720. [Google Scholar] [CrossRef]
  108. Ghosh, N.; Ravi, Y.B.; Patra, A.; Mukhopadhyay, S.; Paul, S.; Mohanty, A.R.; Chattopadhyay, A.B. Estimation of Tool Wear during CNC Milling Using Neural Network-Based Sensor Fusion. Mech. Syst. Signal Process. 2007, 21, 466–479. [Google Scholar] [CrossRef]
  109. Kaya, B.; Oysu, C.; Ertunc, H.M. Force-Torque Based on-Line Tool Wear Estimation System for CNC Milling of Inconel 718 Using Neural Networks. Adv. Eng. Softw. 2011, 42, 76–84. [Google Scholar] [CrossRef]
  110. Dimla, D.E., Jr.; Lister, P.M.; Leighton, N.J. Neural Network Solutions to the Tool Condition Monitoring Problem in Metal Cutting—A Critical Review of Methods. Int. J. Mach. Tools Manuf. 1997, 37, 1219–1241. [Google Scholar] [CrossRef]
  111. Rangwala, S.; Dornfeld, D. Sensor Integration Using Neural Networks for Intelligent Tool Condition Monitoring. J. Eng. Ind. 1990, 112, 219–228. [Google Scholar] [CrossRef]
  112. Muniraj, C.; Chandrasekar, S. Adaptive Neuro-Fuzzy Inference System for Monitoring the Surface Condition of Polymeric Insulators Using Harmonic Content. IET Gener. Transm. Distrib. 2011, 5, 751–759. [Google Scholar] [CrossRef]
  113. Kumbhar, S.G.; Sudhagar, P.E. An Integrated Approach of Adaptive Neuro-Fuzzy Inference System and Dimension Theory for Diagnosis of Rolling Element Bearing. Meas. J. Int. Meas. Confed. 2020, 166, 108266. [Google Scholar] [CrossRef]
  114. Forouhari, S.; Abu-Siada, A. Application of Adaptive Neuro Fuzzy Inference System to Support Power Transformer Life Estimation and Asset Management Decision. IEEE Trans. Dielectr. Electr. Insul. 2018, 25, 845–852. [Google Scholar] [CrossRef] [Green Version]
  115. Wadhwani, S.; Wadhwani, A.K.; Gupta, S.P.; Kumar, V. Detection of Bearing Failure in Rotating Machine Using Adaptive Neuro-Fuzzy Inference System. In Proceedings of the 2006 International Conference on Power Electronic, Drives and Energy Systems, New Delhi, India, 12–15 December 2006. [Google Scholar]
  116. Wu, J.-D.; Kuo, J.-M. Fault Conditions Classification of Automotive Generator Using an Adaptive Neuro-Fuzzy Inference System. Expert Syst. Appl. 2010, 37, 7901–7907. [Google Scholar] [CrossRef]
  117. Salahshoor, K.; Khoshro, M.S.; Kordestani, M. Fault Detection and Diagnosis of an Industrial Steam Turbine Using a Distributed Configuration of Adaptive Neuro-Fuzzy Inference Systems. Simul. Model. Pract. Theory 2011, 19, 1280–1293. [Google Scholar] [CrossRef]
  118. Chen, C.; Zhang, B.; Vachtsevanos, G.; Orchard, M. Machine Condition Prediction Based on Adaptive Neuro-Fuzzy and High-Order Particle Filtering. IEEE Trans. Ind. Electron. 2011, 58, 4353–4364. [Google Scholar] [CrossRef]
  119. Chen, C.; Vachtsevanos, G.; Orchard, M.E. Machine Remaining Useful Life Prediction: An Integrated Adaptive Neuro-Fuzzy and High-Order Particle Filtering Approach. Mech. Syst. Signal Process. 2012, 28, 597–607. [Google Scholar] [CrossRef]
  120. Wang, J.; Zhao, R.; Wang, D.; Yan, R.; Mao, K.; Shen, F. Machine Health Monitoring Using Local Feature-Based Gated Recurrent Unit Networks. IEEE Trans. Ind. Electron. 2017, 65, 1539–1548. [Google Scholar] [CrossRef]
  121. Zhang, J.; Wang, P.; Yan, R.; Gao, R.X. Deep Learning for Improved System Remaining Life Prediction. Procedia Cirp 2018, 72, 1033–1038. [Google Scholar] [CrossRef]
  122. Nguyen, K.T.P.; Medjaher, K. A New Dynamic Predictive Maintenance Framework Using Deep Learning for Failure Prognostics. Reliab. Eng. Syst. Saf. 2019, 188, 251–262. [Google Scholar] [CrossRef] [Green Version]
  123. Chen, B.; Liu, Y.; Zhang, C.; Wang, Z. Time Series Data for Equipment Reliability Analysis with Deep Learning. IEEE Access 2020, 8, 105484–105493. [Google Scholar] [CrossRef]
  124. Liao, L.; Jin, W.; Pavel, R. Enhanced Restricted Boltzmann Machine With Prognosability Regularization for Prognostics and Health Assessment. IEEE Trans. Ind. Electron. 2016, 63, 7076–7083. [Google Scholar] [CrossRef]
  125. Luo, B.; Wang, H.; Liu, H.; Li, B.; Peng, F. Early Fault Detection of Machine Tools Based on Deep Learning and Dynamic Identification. IEEE Trans. Ind. Electron. 2018, 66, 509–518. [Google Scholar] [CrossRef]
  126. Aydemir, G. Deep Learning Based Spectrum Compression Algorithm for Rotating Machinery Condition Monitoring. In Smart Materials, Adaptive Structures and Intelligent Systems; American Society of Mechanical Engineers: New York, NY, USA, 2018; Volume 1. [Google Scholar]
  127. Yuan, J.; Wang, K.; Wang, Y. Deep Learning Approach to Multiple Features Sequence Analysis in Predictive Maintenance. Lect. Notes Electr. Eng. 2018, 451, 581–590. [Google Scholar] [CrossRef]
  128. Udmale, S.S.; Singh, S.K.; Bhirud, S.G. A Bearing Data Analysis Based on Kurtogram and Deep Learning Sequence Models. Meas. J. Int. Meas. Confed. 2019, 145, 665–677. [Google Scholar] [CrossRef]
  129. Ince, T.; Kiranyaz, S.; Eren, L.; Askar, M.; Gabbouj, M. Real-Time Motor Fault Detection by 1-D Convolutional Neural Networks. IEEE Trans. Ind. Electron. 2016, 63, 7067–7075. [Google Scholar] [CrossRef]
  130. Huerta Herraiz, Á.; Pliego Marugán, A.; García Márquez, F.P. Photovoltaic Plant Condition Monitoring Using Thermal Images Analysis by Convolutional Neural Network-Based Structure. Renew. Energy 2020, 153, 334–348. [Google Scholar] [CrossRef] [Green Version]
  131. Zhou, F.; Gao, Y.; Wen, C. A Novel Multimode Fault Classification Method Based on Deep Learning. J. Control. Sci. Eng. 2017, 2017, 3583610. [Google Scholar] [CrossRef] [Green Version]
  132. Utah, M.N.; Jung, J.C. Fault State Detection and Remaining Useful Life Prediction in AC Powered Solenoid Operated Valves Based on Traditional Machine Learning and Deep Neural Networks. Nucl. Eng. Technol. 2020, 52, 1998–2008. [Google Scholar] [CrossRef]
  133. Wu, Z.; Guo, Y.; Lin, W.; Yu, S.; Ji, Y. A Weighted Deep Representation Learning Model for Imbalanced Fault Diagnosis in Cyber-Physical Systems. Sensors 2018, 18, 1096. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  134. Essien, A.; Giannetti, C. A Deep Learning Model for Smart Manufacturing Using Convolutional LSTM Neural Network Autoencoders. IEEE Trans. Ind. Inform. 2020, 16, 6069–6078. [Google Scholar] [CrossRef] [Green Version]
  135. Aydemir, G.; Paynabar, K. Image-Based Prognostics Using Deep Learning Approach. IEEE Trans. Ind. Inform. 2020, 16, 5956–5964. [Google Scholar] [CrossRef]
  136. Su, C.; Li, L.; Wen, Z. Remaining Useful Life Prediction via a Variational Autoencoder and a Time-Window-Based Sequence Neural Network. Qual. Reliab. Eng. Int. 2020, 36, 1639–1656. [Google Scholar] [CrossRef]
  137. Xu, Y.; Sun, Y.; Liu, X.; Zheng, Y. A Digital-Twin-Assisted Fault Diagnosis Using Deep Transfer Learning. IEEE Access 2019, 7, 19990–19999. [Google Scholar] [CrossRef]
  138. Booyse, W.; Wilke, D.N.; Heyns, S. Deep Digital Twins for Detection, Diagnostics and Prognostics. Mech. Syst. Signal Process. 2020, 140, 106612. [Google Scholar] [CrossRef]
  139. Randall, R.B.; Antoni, J. Rolling Element Bearing Diagnostics—A Tutorial. Mech. Syst. Signal Process. 2011, 25, 485–520. [Google Scholar] [CrossRef]
  140. Antoni, J. Fast Computation of the Kurtogram for the Detection of Transient Faults. Mech. Syst. Signal Process. 2007, 21, 108–124. [Google Scholar] [CrossRef]
  141. Antoni, J. The Spectral Kurtosis: A Useful Tool for Characterising Non-Stationary Signals. Mech. Syst. Signal Process. 2006, 20, 282–307. [Google Scholar] [CrossRef]
  142. Zeng, H.; Cheung, Y. Feature Selection and Kernel Learning for Local Learning-Based Clustering. IEEE Trans. Pattern Anal. Mach. Intell. 2011, 33, 1532–1547. [Google Scholar] [CrossRef] [Green Version]
  143. Wu, M.; Schölkopf, B. A Local Learning Approach for Clustering. In Proceedings of the Advances in Neural Information Processing Systems; MIT Press: Cambridge, MA, USA, 2007; Volume 19. [Google Scholar]
  144. Roffo, G.; Melzi, S.; Castellani, U.; Vinciarelli, A. Infinite Latent Feature Selection: A Probabilistic Latent Graph-Based Ranking Approach. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 1398–1406. [Google Scholar]
  145. Roffo, G.; Melzi, S. Ranking to Learn. In Proceedings of the New Frontiers in Mining Complex Patterns; Appice, A., Ceci, M., Loglisci, C., Masciari, E., Raś, Z.W., Eds.; Springer International Publishing: Cham, Switzerland, 2017; pp. 19–35. [Google Scholar]
  146. Roffo, G.; Melzi, S.; Cristani, M. Infinite Feature Selection. In Proceedings of the 2015 IEEE International Conference on Computer Vision (ICCV), Santiago, Chile, 11–18 December 2015; pp. 4202–4210. [Google Scholar]
  147. Lake, B.M.; Salakhutdinov, R.; Tenenbaum, J.B. Human-Level Concept Learning through Probabilistic Program Induction. Science 2015, 350, 1332–1338. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  148. Arthur, D.; Vassilvitskii, S. K-Means++: The Advantages of Careful Seeding. In Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms, New Orleans, LA, USA, 7–9 January 2007; Society for Industrial and Applied Mathematics: Philadelphia, PA, USA; pp. 1027–1035.
  149. Fei-Fei, L.; Fergus, R.; Perona, P. One-Shot Learning of Object Categories. IEEE Trans. Pattern Anal. Mach. Intell. 2006, 28, 594–611. [Google Scholar] [CrossRef] [Green Version]
  150. Ji, S.; Xu, W.; Yang, M.; Yu, K. 3D Convolutional Neural Networks for Human Action Recognition. IEEE Trans. Pattern Anal. Mach. Intell. 2013, 35, 221–231. [Google Scholar] [CrossRef] [Green Version]
  151. Society For Machinery Failure Prevention Technology. Fault Data Sets. Available online: https://www.mfpt.org/fault-data-sets/ (accessed on 23 April 2023).
  152. WindTurbineHighSpeedBearingPrognosis-Data; MathWorks Open Source and Community Projects. 2022. Available online: https://github.com/mathworks/WindTurbineHighSpeedBearingPrognosis-Data (accessed on 23 April 2023).
  153. Ben Ali, J.; Saidi, L.; Harrath, S.; Bechhoefer, E.; Benbouzid, M. Online Automatic Diagnosis of Wind Turbine Bearings Progressive Degradations under Real Experimental Conditions Based on Unsupervised Machine Learning. Appl. Acoust. 2018, 132, 167–181. [Google Scholar] [CrossRef]
  154. Matzka, S. Explainable Artificial Intelligence for Predictive Maintenance Applications. In Proceedings of the 2020 Third International Conference on Artificial Intelligence for Industries (AI4I), Irvine, CA, USA, 21–23 September 2020; pp. 69–74. [Google Scholar]
  155. UCI Machine Learning Repository. Available online: https://archive.ics.uci.edu/ml/index.php (accessed on 19 February 2021).
Figure 1. The main steps of the framework. General framework.
Figure 2. Image samples; the first one is the Ideal image.
Figure 3. Examples from the training set, the handwritten character set.
Figure 4. The training of the Siamese network.
Figure 5. The visual summary of Algorithm 2.
Figure 6. Training Process for the Omniglot Dataset.
Figure 7. Testing Process for Lathe Machines.
Figure 8. Training Loss for SN-2 to SN-5.
Figure 9. Testing SN-1 to SN-6 for Public Datasets.
Figure 10. Summary prediction results.
Table 1. Statistical Results for Lathe Machines.

| Dataset | Comparison | p-Value | Z-Value | Sig. Rank | Sig. |
|---|---|---|---|---|---|
| LM-2 | SN-1 vs. SN-2 | 0 | −4.7719 | 0 | 1 |
| LM-3 | SN-1 vs. SN-2 | 0 | −4.7719 | 0 | 1 |
| LM-3 | SN-2 vs. SN-3 | 0.9984 | 2.9413 | 375 | 0 |
| LM-4 | SN-1 vs. SN-2 | 0 | −4.7719 | 0 | 1 |
| LM-4 | SN-2 vs. SN-3 | 0.9999 | 3.6612 | 410 | 0 |
| LM-4 | SN-3 vs. SN-4 | 1 | 4.7924 | 465 | 0 |
| LM-5 | SN-1 vs. SN-2 | 0.0348 | −1.8143 | 86 | 1 |
| LM-5 | SN-2 vs. SN-3 | 1 × 10⁻⁴ | −3.6714 | 21 | 1 |
| LM-5 | SN-3 vs. SN-4 | 0.9999 | 3.6714 | 278 | 0 |
| LM-5 | SN-4 vs. SN-5 | 0.9999 | 3.7857 | 282 | 0 |
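The relationship between the Sig. Rank, Z-Value, and p-Value columns of Table 1 can be illustrated with a short sketch. It assumes, without confirmation from this section, that the columns come from a one-sided signed-rank test evaluated with the usual normal approximation, that each lathe-machine comparison uses n = 30 paired samples, and that a continuity correction of +0.5 is applied. The function names `signed_rank_z` and `left_tail_p` are hypothetical and chosen only for this illustration.

```python
import math


def signed_rank_z(w, n, continuity=0.5):
    """Normal-approximation Z score for a signed-rank statistic w.

    n is the number of paired samples (assumed, not stated in the table);
    continuity is an assumed continuity correction.
    """
    mean = n * (n + 1) / 4.0
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24.0)
    return (w + continuity - mean) / sd


def left_tail_p(z):
    """Left-tail probability of a standard normal variable at z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


if __name__ == "__main__":
    # Signed-rank values taken from the LM-3 and LM-4 rows of Table 1.
    for w in (0, 375, 410, 465):
        z = signed_rank_z(w, n=30)
        p = left_tail_p(z)
        # Sig. is taken here as 1 when p falls below an assumed 0.05 level.
        print(w, round(z, 4), round(p, 4), int(p < 0.05))
```

Under these assumptions, a signed rank of 375 gives a Z score of about 2.94 and a signed rank of 0 gives about −4.77, close to the tabulated values. The agreement suggests, but does not prove, that this is how the columns were computed.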
Table 2. Statistical Results for BFD-1 and BFD-2.

| Dataset | Comparison | p-Value | Z-Value | Sig. Rank | Sig. |
|---|---|---|---|---|---|
| BFD-1 | SN-1 vs. SN-2 | 0.0205 | −2.0429 | 78 | 1 |
| BFD-1 | SN-2 vs. SN-3 | 0.9554 | 1.7 | 209 | 0 |
| BFD-1 | SN-3 vs. SN-4 | 0.9629 | 1.7859 | 212 | 0 |
| BFD-1 | SN-4 vs. SN-5 | 0.7923 | 0.8144 | 178 | 0 |
| BFD-1 | SN-5 vs. SN-6 | 0.86 | 1.0802 | 173 | 0 |
| BFD-2 | SN-1 vs. SN-2 | 0.9998 | 3.527 | 557 | 0 |
| BFD-2 | SN-2 vs. SN-3 | 0.4843 | −0.0393 | 330 | 0 |
| BFD-2 | SN-3 vs. SN-4 | 0.3215 | −0.4635 | 303 | 0 |
| BFD-2 | SN-4 vs. SN-5 | 0.6143 | 0.2906 | 351 | 0 |
| BFD-2 | SN-5 vs. SN-6 | 1 | 4.1083 | 594 | 0 |
Table 3. Statistical Results for AI4I and WT datasets.

| Dataset | Comparison | p-Value | Z-Value | Sig. Rank | Sig. |
|---|---|---|---|---|---|
| AI4I | SN-1 vs. SN-2 | 0 | −6.1948 | 73 | 1 |
| AI4I | SN-2 vs. SN-3 | 1 | 6.7322 | 1829 | 0 |
| AI4I | SN-3 vs. SN-4 | 1 | 6.7395 | 1830 | 0 |
| AI4I | SN-4 vs. SN-5 | 0 | −4.8255 | 259 | 1 |
| AI4I | SN-5 vs. SN-6 | 1 | 4.9361 | 1585 | 0 |
| WT | SN-1 vs. SN-2 | 0 | −4.5081 | 170 | 1 |
| WT | SN-2 vs. SN-3 | 1 | 5.4541 | 1202 | 0 |
| WT | SN-3 vs. SN-4 | 1 | 6.1491 | 1274 | 0 |
| WT | SN-4 vs. SN-5 | 1 | 4.3247 | 1085 | 0 |
| WT | SN-5 vs. SN-6 | 0.9998 | 3.5524 | 1005 | 0 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Caliskan, A.; O’Brien, C.; Panduru, K.; Walsh, J.; Riordan, D. An Efficient Siamese Network and Transfer Learning-Based Predictive Maintenance System for More Sustainable Manufacturing. Sustainability 2023, 15, 9272. https://doi.org/10.3390/su15129272

AMA Style

Caliskan A, O’Brien C, Panduru K, Walsh J, Riordan D. An Efficient Siamese Network and Transfer Learning-Based Predictive Maintenance System for More Sustainable Manufacturing. Sustainability. 2023; 15(12):9272. https://doi.org/10.3390/su15129272

Chicago/Turabian Style

Caliskan, Abdullah, Conor O’Brien, Krishna Panduru, Joseph Walsh, and Daniel Riordan. 2023. "An Efficient Siamese Network and Transfer Learning-Based Predictive Maintenance System for More Sustainable Manufacturing" Sustainability 15, no. 12: 9272. https://doi.org/10.3390/su15129272
