# Research on Degeneration Model of Neural Network for Deep Groove Ball Bearing Based on Feature Fusion


## Abstract


## 1. Introduction

## 2. Fundamental Theory of Neural Network

## 3. Bearing Experiment and Feature Extraction Method

#### 3.1. Introduction of Experimental Data

#### 3.2. Feature Extraction Method

#### 3.3. The Trend of Feature Change

## 4. Model Training

#### 4.1. Training Sample Division

#### 4.2. Neural Network Degradation Model Modeling

- (a) Compared with the full (batch) gradient descent method, the proposed method randomly shuffles the training samples at each iteration, divides them into small batches of appropriate size, and performs one gradient descent step per batch instead of one step over all samples. This greatly reduces the convergence time and improves the computational speed.
- (b) The general stochastic gradient descent method updates on a single sample drawn at random from the training set at each iteration. Although this increases the computational speed, it easily produces large errors. The small-batch stochastic gradient descent method used in this paper, by contrast, reduces the error while preserving the computational speed.
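The small-batch strategy described above can be sketched as follows. This is an illustrative implementation, not the authors' code; the gradient function and learning rate are supplied by the caller.

```python
import numpy as np

def minibatch_sgd(X, y, weights, grad_fn, lr=0.1, batch_size=5, epochs=1000):
    """Mini-batch stochastic gradient descent: shuffle the training
    samples each epoch, split them into small batches, and take one
    gradient step per batch instead of one step over all samples."""
    n = len(X)
    for _ in range(epochs):
        idx = np.random.permutation(n)              # shuffle the samples
        for start in range(0, n, batch_size):
            batch = idx[start:start + batch_size]   # one small batch
            weights = weights - lr * grad_fn(weights, X[batch], y[batch])
    return weights
```

With `batch_size=1` this reduces to plain stochastic gradient descent, and with `batch_size=n` to full batch gradient descent; the small-batch setting sits between the two, trading a little per-step noise for much faster epochs.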

#### 4.3. Model Setting

#### 4.4. Analysis of Training Results

- (1) Except for the sample size of 75, the learning effect of the neural network improves as the training sample size increases. The average classification accuracy after 1000 iterations, the standard deviation of classification accuracy, the minimum number of iterations, the training time, and other indicators all showed improving trends, indicating that the model's learning ability, learning stability, and convergence rate increase with the number of training samples.
- (2) When the sample size is 75, the model oscillates during training; when the training sample size increases, the phenomenon vanishes. The oscillation is therefore not caused by an improperly set learning rate. Rather, some sample points are hard to distinguish, and the model's learning and generalization abilities should be enhanced by increasing the training samples. The test results after increasing the sample size confirm this.

## 5. Feature Fusion Based on Skewness Factor

#### 5.1. Pearson Correlation Coefficient between Features

#### 5.2. Feature Fusion

#### 5.3. New Feature Verification

- (1) Compared with the previous model, the model with new feature 1 (skewness × shape) and new feature 2 (skewness × crest) is clearly improved in training effect: the average classification accuracy increases, and the standard deviation of classification accuracy, the minimum number of iterations, the training time, and other indicators all improve. This shows that adding new features improves the model's learning ability, learning stability, and convergence rate.
- (2) After adding the new features, at a training sample size of 75, the classification accuracy, the standard deviation of classification accuracy, and the minimum number of iterations all improve, but the oscillation phenomenon still exists. To further improve the model's ability to classify hard-to-distinguish sample points, the training sample size must be increased.

#### 5.4. Comparison with SVM

- (1) In terms of classification ability, the degeneration models built with the neural network and with SVM perform equally well. Even in the case of 75 training samples, the classification accuracy of the SVM model falls within the oscillation range of the neural network.
- (2) The training time of the neural network degeneration model is considerably longer than that of the SVM model. For the 6205 bearing, SVM is therefore the better model.
- (3) Although the neural network performs worse than SVM here, it should be noted that, for the neural network, the training speed improves markedly as the number of training samples increases and fusion features are included. For SVM, in contrast, although the computational speed of each iteration increases with more training samples, the total training time is not significantly reduced. The training time ratio of SVM to neural network rose from 11.57% at 50 training samples to 16.41% at 150 training samples.

## 6. Conclusions

- (1) In this paper, 150 samples were extracted from the experimental data, and five training sets of 50, 75, 100, 125, and 150 samples were drawn from them and input into the neural network for training. The training results show that as the training sample size increases, the learning effect of the neural network improves, and the learning ability, learning stability, and convergence speed of the model gradually increase.
- (2) When the training sample size is 75, the classification accuracy of the model on the verification set fails to stabilize at 100% as the number of iterations increases, resulting in oscillation. When the number of samples increases, the oscillation disappears. The oscillation is therefore caused not by the model itself but by indistinguishable samples. The results show that further enhancing the classification ability of the BP neural network degeneration model for indistinguishable samples requires increasing the training sample size.
- (3) Feature fusion based on the skewness factor, guided by the Pearson correlation coefficient and the polynomial fitting principle, improves the model significantly. Although too many fusion features lead to overfitting, the overfitting is alleviated as the training samples increase, so the model's performance can be further improved by increasing the fusion features.
- (4) The performance of the SVM model and the neural network model on this dataset is compared, and the likely behavior of the two modeling approaches on more complicated problems with more data is discussed. The research shows that neural networks have more potential on complex, high-volume datasets.

## Acknowledgments

## Author Contributions

## Conflicts of Interest

## References


| Feature | Equation |
|---|---|
| Shape Factor | $S_f=\frac{X_{rms}}{\lvert\overline{X}\rvert}$ |
| Crest Factor | $C_f=\frac{X_{max}}{X_{rms}}$ |
| Impulse Factor | $I_f=\frac{X_{max}}{\lvert\overline{X}\rvert}$ |
| Margin Factor | $CL_f=\frac{X_{max}}{X_r}$ |
| Skewness Factor | $\alpha=\frac{m}{\sigma^{3}}$ |
| Kurtosis Factor | $K_v=\frac{\beta}{X_{rms}^{4}}$ |
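The time-domain features in the table above can be computed from a vibration signal segment as follows. This is a sketch under stated assumptions: the root amplitude $X_r$ is taken as the standard $(\operatorname{mean}\sqrt{|x|})^2$ definition, and $m$ and $\beta$ are taken as the third and fourth central moments, since the table does not define them explicitly.

```python
import numpy as np

def time_domain_features(x):
    """Dimensionless time-domain features of a vibration segment x,
    following the definitions in the table above."""
    x = np.asarray(x, dtype=float)
    abs_mean = np.mean(np.abs(x))              # |X̄|
    rms = np.sqrt(np.mean(x ** 2))             # X_rms
    peak = np.max(np.abs(x))                   # X_max
    root = np.mean(np.sqrt(np.abs(x))) ** 2    # X_r (assumed root amplitude)
    sigma = np.std(x)
    m3 = np.mean((x - np.mean(x)) ** 3)        # assumed third central moment m
    beta = np.mean((x - np.mean(x)) ** 4)      # assumed fourth central moment β
    return {
        "shape":    rms / abs_mean,
        "crest":    peak / rms,
        "impulse":  peak / abs_mean,
        "margin":   peak / root,
        "skewness": m3 / sigma ** 3,
        "kurtosis": beta / rms ** 4,
    }
```

For a zero-mean Gaussian signal these formulas give a shape factor near $\sqrt{\pi/2}\approx 1.25$ and a skewness near 0, consistent with the label-0 (normal bearing) rows in the data table below.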

| Label | Bearing Type |
|---|---|
| 0 | Normal bearings |
| 1 | Bearings with 0.007-inch faults |
| 2 | Bearings with 0.014-inch faults |
| 3 | Bearings with 0.021-inch faults |
| 4 | Bearings with 0.028-inch faults |

| Label | Absolute Mean (m/s^2) | Shape Factor | Crest Factor | Impulse Factor | Margin Factor | Skewness Factor | Kurtosis Factor |
|---|---|---|---|---|---|---|---|
| 0 | 0.012566 | 1.23187447 | 3.6337506 | 4.47632462 | 5.28194 | −0.0948106 | 2.9031877 |
| 0 | 0.011475 | 1.23281404 | 3.4248338 | 4.22218312 | 4.9671475 | −0.0823644 | 2.90118091 |
| 0 | 0.011053 | 1.23041602 | 2.9480345 | 3.62730882 | 4.2840401 | −0.0976983 | 2.77140046 |
| 1 | 0.015452 | 1.39794467 | 5.1771788 | 7.23740953 | 9.0825825 | 0.12473383 | 5.4470428 |
| 1 | 0.01528 | 1.39697415 | 5.3681822 | 7.49921182 | 9.3689355 | 0.12262814 | 5.67139855 |
| 1 | 0.014523 | 1.39665376 | 5.5099342 | 7.69547032 | 9.6334086 | 0.14372596 | 5.66742352 |
| 2 | 0.050414 | 1.59591926 | 9.5319981 | 15.2122993 | 19.78025 | −0.144491 | 22.3901311 |
| 2 | 0.048688 | 1.62707031 | 8.5809309 | 13.9617779 | 18.339879 | 0.06575567 | 20.3030647 |
| 2 | 0.046767 | 1.6431311 | 9.1235632 | 14.9912104 | 19.764463 | 0.03716416 | 22.2385791 |
| 3 | 0.019804 | 1.47237616 | 5.5168049 | 8.12281206 | 10.485159 | 0.34665274 | 7.16871024 |
| 3 | 0.018877 | 1.46768005 | 6.8820267 | 10.1006133 | 13.052605 | 0.35161847 | 7.46175939 |
| 3 | 0.018286 | 1.46783372 | 6.158466 | 9.03960406 | 11.635453 | 0.34561281 | 7.55165032 |
| 4 | 0.008011 | 1.2818403 | 4.1804227 | 5.35863428 | 6.4009591 | 0.13830736 | 3.48278044 |
| 4 | 0.007579 | 1.27312284 | 3.6516067 | 4.6489439 | 5.5276294 | 0.10151341 | 3.29118482 |
| 4 | 0.007539 | 1.27710124 | 4.7200015 | 6.02791982 | 7.1994001 | 0.08903434 | 3.41036724 |

| Number of Samples | Training Set | Verification Set |
|---|---|---|
| 50 | 40 | 10 |
| 75 | 60 | 15 |
| 100 | 80 | 20 |
| 125 | 100 | 25 |
| 150 | 120 | 30 |
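Each sample set in the table above is split 80/20 into a training set and a verification set. A minimal sketch of such a split (the function name and seed are illustrative):

```python
import numpy as np

def split_samples(X, y, val_ratio=0.2, seed=0):
    """Shuffle the samples, then hold out val_ratio of them for
    verification (e.g. 150 samples -> 120 training / 30 verification)."""
    idx = np.random.default_rng(seed).permutation(len(X))
    n_val = int(len(X) * val_ratio)
    val, train = idx[:n_val], idx[n_val:]
    return X[train], y[train], X[val], y[val]
```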

| Number of Neurons | Accuracy of Classification after 1000 Iterations |
|---|---|
| 10 | 70% |
| 20 | 90% |
| 30 | 100% |
| 40 | 100% |

| Learning Rate | Minimum Number of Iterations to Reach 100% Accuracy of Classification |
|---|---|
| 1 | 980 |
| 1.1 | 953 |
| 1.2 | 919 |
| 1.3 | 896 |
| 1.4 | 880 |
| 1.5 | Oscillation occurred |
| 1.6 | Oscillation occurred |

| Parameter | Parameter Value |
|---|---|
| Iteration times | 1000 |
| Small batch size of the sample | 5 |
| Size of neural network | 7 × 30 × 5 |
| Learning rate | 1.4 |
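A minimal sketch of the 7 × 30 × 5 network configured in the table above: 7 input features, 30 hidden neurons, and 5 output classes (one per degradation label). The sigmoid hidden layer and softmax output are assumptions, since the table does not restate the activation functions.

```python
import numpy as np

rng = np.random.default_rng(0)

# 7 input features -> 30 hidden neurons -> 5 damage classes
W1, b1 = rng.normal(0, 0.1, (7, 30)), np.zeros(30)
W2, b2 = rng.normal(0, 0.1, (30, 5)), np.zeros(5)

def forward(X):
    """One forward pass through the 7x30x5 network; returns class
    probabilities for each row of X."""
    h = 1.0 / (1.0 + np.exp(-(X @ W1 + b1)))       # sigmoid hidden layer
    z = h @ W2 + b2
    e = np.exp(z - z.max(axis=1, keepdims=True))   # numerically stable softmax
    return e / e.sum(axis=1, keepdims=True)
```

Training would then update `W1, b1, W2, b2` by backpropagation with the small-batch gradient descent of Section 4.2 (batch size 5, learning rate 1.4).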

| Training Sample Size | Average Classification Accuracy | Accuracy Standard Deviation | Minimum Number of Iterations | Training Time (s) |
|---|---|---|---|---|
| 50 | 0.9236 | 0.1168 | 880 | 15.10 |
| 75 | 0.8556 | 0.1466 | Oscillation | 18.35 |
| 100 | 0.9433 | 0.1026 | 630 | 23.71 |
| 125 | 0.9566 | 0.0986 | 492 | 29.36 |
| 150 | 0.9729 | 0.0744 | 365 | 35.03 |

| Training Sample Size | Number of Iterations | Training Time (s) |
|---|---|---|
| 50 | 900 | 13.70 |
| 75 | — | — |
| 100 | 650 | 12.91 |
| 125 | 500 | 12.24 |
| 150 | 380 | 11.43 |

| | Absolute Mean | Shape Factor | Crest Factor | Impulse Factor | Margin Factor | Skewness Factor | Kurtosis Factor |
|---|---|---|---|---|---|---|---|
| Absolute Mean | 1 | 0.8192 | 0.8480 | 0.8633 | 0.8633 | −0.3890 | 0.9206 |
| Shape Factor | 0.8192 | 1 | 0.9755 | 0.9817 | 0.9839 | −0.0498 | 0.9317 |
| Crest Factor | 0.8480 | 0.9755 | 1 | 0.9967 | 0.9954 | −0.0994 | 0.9440 |
| Impulse Factor | 0.8633 | 0.9817 | 0.9967 | 1 | 0.9998 | −0.1470 | 0.9640 |
| Margin Factor | 0.8633 | 0.9839 | 0.9954 | 0.9998 | 1 | −0.1462 | 0.9645 |
| Skewness Factor | −0.3890 | −0.0498 | −0.0994 | −0.1470 | −0.1462 | 1 | −0.3538 |
| Kurtosis Factor | 0.9206 | 0.9317 | 0.9440 | 0.9640 | 0.9645 | −0.3538 | 1 |

| New Feature | Feature Fusion Method | Pearson Correlation Coefficient between Features |
|---|---|---|
| New feature 1 | Skewness × Shape | −0.0498 |
| New feature 2 | Skewness × Crest | −0.0994 |
| New feature 3 | Skewness × Margin | −0.1462 |
| New feature 4 | Skewness × Impulse | −0.1470 |
| New feature 5 | Skewness × Kurtosis | −0.3538 |
| New feature 6 | Skewness × Absolute Mean | −0.3890 |
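The fusion rule behind the table above — pair the skewness factor with the candidate features least correlated with it, multiplying the two together — can be sketched as follows (the function name and column layout are illustrative):

```python
import numpy as np

def fuse_with_skewness(features, names, skew_col):
    """Rank candidate features by |Pearson correlation| with the skewness
    factor and return new fused features skewness * candidate, ordered
    from least correlated to most correlated."""
    corr = np.corrcoef(features, rowvar=False)     # feature correlation matrix
    order = sorted((i for i in range(len(names)) if i != skew_col),
                   key=lambda i: abs(corr[skew_col, i]))
    return [(names[i], features[:, skew_col] * features[:, i]) for i in order]
```

Applied to the correlation matrix above, this ordering reproduces the table: shape factor first (|−0.0498|), then crest, margin, impulse, kurtosis, and absolute mean.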

| Number of New Features | Minimum Number of Iterations for 100% Accuracy |
|---|---|
| 1 | 660 |
| 2 | 550 |
| 3 | Overfitting |
| 4 | Overfitting |

| Training Sample Size | Average Classification Accuracy | Accuracy Standard Deviation | Minimum Number of Iterations | Training Time (s) |
|---|---|---|---|---|
| 50 | 0.9236 | 0.1168 | 880 | 12.61 |
| 50 (new feature) | 0.9471 | 0.1102 | 611 | 12.54 |
| 75 | 0.8556 | 0.1466 | Oscillation | 17.36 |
| 75 (new feature) | 0.8783 | 0.0883 | Oscillation | 17.35 |
| 100 | 0.9433 | 0.1026 | 630 | 22.95 |
| 100 (new feature) | 0.9620 | 0.0916 | 488 | 23.18 |
| 125 | 0.9566 | 0.0986 | 492 | 28.18 |
| 125 (new feature) | 0.9730 | 0.0741 | 342 | 28.25 |
| 150 | 0.9729 | 0.0744 | 365 | 33.41 |
| 150 (new feature) | 0.9809 | 0.0666 | 293 | 33.38 |

| Number of Training Samples | Number of Iterations | Training Time (s) |
|---|---|---|
| 50 | 900 | 13.70 |
| 50 (new feature) | 600 | 7.52 |
| 75 | — | — |
| 75 (new feature) | — | — |
| 100 | 650 | 12.91 |
| 100 (new feature) | 500 | 6.88 |
| 125 | 500 | 12.24 |
| 125 (new feature) | 350 | 6.19 |
| 150 | 380 | 11.43 |
| 150 (new feature) | 300 | 5.36 |

| Number of New Features | Minimum Number of Iterations |
|---|---|
| 2 | 293 |
| 3 | 247 |
| 4 | Overfitting |
| 5 | Overfitting |

| Number of Training Samples | Classification Accuracy | Training Time (s) | Time Ratio (SVM/Neural Network) |
|---|---|---|---|
| 50 (neural network) | 100% | 7.52 | — |
| 50 (SVM) | 100% | 0.87 | 11.57% |
| 75 (neural network) | 86.7~100% | — | — |
| 75 (SVM) | 93.3% | — | — |
| 100 (neural network) | 100% | 6.88 | — |
| 100 (SVM) | 100% | 0.81 | 11.77% |
| 125 (neural network) | 100% | 6.19 | — |
| 125 (SVM) | 100% | 0.85 | 13.73% |
| 150 (neural network) | 100% | 5.36 | — |
| 150 (SVM) | 100% | 0.88 | 16.41% |

© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Zhang, L.; Tao, J.
Research on Degeneration Model of Neural Network for Deep Groove Ball Bearing Based on Feature Fusion. *Algorithms* **2018**, *11*, 21.
https://doi.org/10.3390/a11020021
