A feature fusion method using WPD-SVD and t-SNE for gearbox fault diagnosis

The vibration signals of a gearbox carry information about its dynamic operation, which is important for feature extraction and subsequent analysis. However, the low signal-to-noise ratio and combined multi-mode faults make it difficult to extract discriminable features of gearboxes. In this study, a feature fusion method based on wavelet packet decomposition (WPD), singular value decomposition (SVD) and t-distributed stochastic neighbor embedding (t-SNE) for gearbox fault diagnosis is proposed. First, the time-frequency analysis method WPD-SVD and time-domain analysis methods are used to extract robust feature vectors of gearboxes under different conditions. As an effective method for the visualization of high-dimensional datasets, t-SNE is then introduced to reduce the dimensionality of the feature vectors. Finally, with the fused features, a radial basis function (RBF) neural network is trained to classify the gearbox fault modes. Experiments on gearbox vibration signals validate the effectiveness of the proposed method and its superiority over the compared methods.


Introduction
As one of the most important machine components, gearboxes are extensively used in the transmission design of many rotating machines. However, severe operating conditions such as heavy duty and intensive impact loads may result in gear tooth damage and other fault modes, which heavily influence the working condition of the whole system [1]. In order to reduce the operation and maintenance costs of gearboxes, numerous studies have been conducted on gearbox fault recognition [2,3]. However, the low signal-to-noise ratio and combined multi-mode faults still make it a challenge to extract discriminable features for gearboxes. This study provides a feature fusion method based on wavelet packet decomposition (WPD), singular value decomposition (SVD) and t-distributed stochastic neighbor embedding (t-SNE) for gearbox fault diagnosis.
For fault diagnosis, one of the challenges is to obtain reliable features of the gearbox from the monitored vibration signals in the first step. Generally, the main feature extraction methods include time-domain methods, frequency-domain methods, and time-frequency methods. Time-domain analysis methods such as the root-mean-square (RMS) value, crest factor, form factor, kurtosis and skewness have been successfully used for fault diagnosis of rotating machines [4]. Frequency-domain analysis methods include the Fourier transform, cepstrum analysis and so on. Time-frequency analysis methods such as the short-time Fourier transform (STFT) and empirical mode decomposition (EMD) have been proven effective for extracting features from nonlinear and non-stationary vibration signals [5]. Among these time-frequency techniques, WPD is particularly suitable because it decomposes the original signal into different frequency bands. The SVD method can then be used to form the final feature vectors from the results of WPD. To extract robust representative features for the gearbox, both time-domain analysis methods and the time-frequency analysis method WPD-SVD are applied in this study.
Properly mapping the extracted high-dimensional features into a low-dimensional space is another challenge addressed in this paper. A large number of dimensionality reduction techniques have been proposed, such as PCA, KPCA and manifold learning methods like local tangent space alignment (LTSA) [6]. However, most of these methods are limited in their ability to capture both the local and the global structure of the high-dimensional features. To reveal the presence of clusters at several scales, van der Maaten and Hinton proposed the t-SNE method, which can achieve good visualization of high-dimensional data [7]. In this study, t-SNE is employed as the dimensionality reduction method to obtain the most discriminable features.
Motivated by the aforementioned challenges, a novel feature fusion method for gearbox fault diagnosis is proposed in this study. Our contributions are summarized as follows. Firstly, we propose an effective feature extraction method for gearboxes relying on WPD-SVD and time-domain analysis methods. The extracted robust feature vectors embody the key information about the gearbox operating condition. Secondly, a t-SNE based dimensionality reduction method is employed to obtain discriminable features, on which fault diagnosis can be realized with an RBF neural network model. Moreover, experiments on gearbox operation data compare the proposed approach with existing methods and demonstrate its feasibility and effectiveness.
The rest of this paper is organized as follows. In Section 2, we explain the overall scheme of fault diagnosis and the mathematical principles. The results of the case study are provided and analyzed in Section 3, followed by our conclusions in Section 4.

Procedures of the method
The procedure of our methodology is shown in Fig. 1. The fault diagnosis methodology consists of two main steps:
- In the first step, time-domain analysis methods, including the RMS value, crest factor, form factor, kurtosis and skewness, are applied to extract the time-domain features of gearboxes, while the WPD-SVD method is employed to form the time-frequency feature vectors.
- The second step involves dimensionality reduction of the extracted feature vectors based on t-SNE. Relying on the obtained low-dimensional fused features, the RBF neural network model can be trained to realize the fault mode classification.

Time-domain analysis
The time-domain parameters of the original signals include the mean value, the maximum value, the RMS value, etc. In this study, the RMS value, crest factor, form factor, kurtosis and skewness are chosen to form the time-domain features of gearboxes.
Then the time-domain feature vector of a gearbox can be expressed as $F_T = [X_{\mathrm{rms}}, C_f, S_f, K, S_k]$, whose entries denote the RMS value, crest factor, form factor, kurtosis and skewness, respectively.
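The five time-domain features can be illustrated with a minimal sketch. The source does not give explicit formulas, so the definitions below (population central moments, non-excess kurtosis, form factor as RMS over mean absolute value) are common conventions taken as assumptions, and the function name `time_domain_features` and the sine test signal are hypothetical.

```python
import math

def time_domain_features(signal):
    """Return [RMS, crest factor, form factor, kurtosis, skewness] of a signal."""
    n = len(signal)
    mean = sum(signal) / n
    rms = math.sqrt(sum(v * v for v in signal) / n)
    peak = max(abs(v) for v in signal)
    mean_abs = sum(abs(v) for v in signal) / n
    # Population (biased) central moments of order 2, 3 and 4.
    m2 = sum((v - mean) ** 2 for v in signal) / n
    m3 = sum((v - mean) ** 3 for v in signal) / n
    m4 = sum((v - mean) ** 4 for v in signal) / n
    crest = peak / rms            # assumed definition: peak over RMS
    form = rms / mean_abs         # assumed definition: RMS over mean |x|
    kurtosis = m4 / m2 ** 2       # non-excess kurtosis (3 for a Gaussian)
    skewness = m3 / m2 ** 1.5
    return [rms, crest, form, kurtosis, skewness]

# Synthetic example: five full periods of a unit-amplitude sine wave.
N = 1000
sine = [math.sin(2 * math.pi * 5 * i / N) for i in range(N)]
features = time_domain_features(sine)
```

For a pure sine wave these quantities have known values (RMS of about 0.707, crest factor of about 1.414, kurtosis of 1.5, skewness of 0), which makes the signal a convenient sanity check before applying the function to measured vibration data.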

Wavelet packet decomposition and singular value decomposition
The WPD method follows the multi-resolution analysis framework of wavelet analysis. The wavelet packet function $W_{j,k}^{n}(t)$ can be expressed as:
$$W_{j,k}^{n}(t) = 2^{j/2}\, W^{n}(2^{j} t - k),$$
where $n$ denotes the decomposition level, $j$ denotes the scale factor, and $k$ denotes the translation factor.
Then, the SVD method can be used to extract the prominent features from the wavelet packet coefficients. In this study, the singular values of the wavelet packet coefficients are used to represent the time-frequency features.
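A minimal sketch of one way to realize WPD-SVD is given below. The source does not state the wavelet, the decomposition level, or how the coefficients are arranged before the SVD, so the Haar wavelet, level 3, and the stacking of the leaf-node coefficients into a matrix are all assumptions. The names `haar_wavelet_packet` and `wpd_svd_features` are hypothetical, and the test signal is synthetic. A level of 3 yields 8 singular values, which together with the five time-domain features would be consistent with the 13-dimensional feature vectors mentioned in the case study.

```python
import numpy as np

def haar_wavelet_packet(signal, level):
    """Full Haar wavelet packet tree; returns the 2**level leaf-node coefficient arrays."""
    nodes = [np.asarray(signal, dtype=float)]
    for _ in range(level):
        next_nodes = []
        for node in nodes:
            even, odd = node[0::2], node[1::2]
            next_nodes.append((even + odd) / np.sqrt(2.0))  # low-pass branch
            next_nodes.append((even - odd) / np.sqrt(2.0))  # high-pass branch
        nodes = next_nodes
    return nodes

def wpd_svd_features(signal, level=3):
    """Singular values of the matrix whose rows are the leaf-node coefficients."""
    # The signal length must be divisible by 2**level.
    coeffs = np.vstack(haar_wavelet_packet(signal, level))
    return np.linalg.svd(coeffs, compute_uv=False)

# Synthetic example: a 40 Hz tone with additive noise, 1024 samples.
rng = np.random.default_rng(0)
t = np.arange(1024) / 1024.0
signal = np.sin(2 * np.pi * 40 * t) + 0.1 * rng.standard_normal(1024)
features = wpd_svd_features(signal)  # 8 singular values, largest first
```

Because the singular values do not depend on the order of the rows, the leaf nodes do not need to be sorted by frequency band for this feature. In practice a wavelet library and a smoother wavelet would likely replace the hand-written Haar filter.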

SNE method
SNE is based on the idea of converting the high-dimensional Euclidean distances between data points into conditional probabilities that represent similarities. The similarity of data point $x_j$ to data point $x_i$ is the conditional probability, denoted $p_{j|i}$:
$$p_{j|i} = \frac{\exp\left(-\|x_i - x_j\|^2 / 2\sigma_i^2\right)}{\sum_{k \neq i} \exp\left(-\|x_i - x_k\|^2 / 2\sigma_i^2\right)},$$
where $\sigma_i$ indicates the variance of the Gaussian distribution that is centered on data point $x_i$, and $p_{i|i}$ is set to zero.
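For the low-dimensional mapping values $y_i$ and $y_j$ corresponding to the original $x_i$ and $x_j$, the similarity $q_{j|i}$ between them can be calculated by:
$$q_{j|i} = \frac{\exp\left(-\|y_i - y_j\|^2\right)}{\sum_{k \neq i} \exp\left(-\|y_i - y_k\|^2\right)},$$
where the variance of the Gaussian is set to $\sigma = 1/\sqrt{2}$ and $q_{i|i}$ is set to zero. In order to make $q_{j|i}$ match $p_{j|i}$ as closely as possible, the sum of the Kullback-Leibler divergences over all data points is minimized by a gradient descent method. The cost function is expressed as:
$$C = \sum_i KL(P_i \,\|\, Q_i) = \sum_i \sum_j p_{j|i} \log \frac{p_{j|i}}{q_{j|i}},$$
where $P_i$ represents the distribution of $p_{j|i}$, and $Q_i$ represents the distribution of $q_{j|i}$.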
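The SNE quantities described above can be sketched as follows. This is an illustration under assumptions rather than the implementation used in the study: the function names are hypothetical, the data are random, a single bandwidth is used for all points, and a small constant `eps` is added inside the logarithm only to avoid taking the logarithm of zero.

```python
import numpy as np

def squared_distances(Z):
    """Pairwise squared Euclidean distances between the rows of Z."""
    diff = Z[:, None, :] - Z[None, :, :]
    return np.sum(diff ** 2, axis=-1)

def conditional_probabilities(Z, sigma):
    """Row i holds the Gaussian conditional probabilities of all j given i."""
    sigma = np.broadcast_to(np.asarray(sigma, dtype=float), (Z.shape[0],))
    logits = -squared_distances(Z) / (2.0 * sigma[:, None] ** 2)
    np.fill_diagonal(logits, -np.inf)             # self-similarity is set to zero
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability only
    P = np.exp(logits)
    return P / P.sum(axis=1, keepdims=True)

def sne_cost(X, Y, sigma, eps=1e-12):
    """Sum over all points of KL(P_i || Q_i)."""
    P = conditional_probabilities(X, sigma)
    # With sigma = 1/sqrt(2) the exponent reduces to -||y_i - y_j||^2.
    Q = conditional_probabilities(Y, 1.0 / np.sqrt(2.0))
    return float(np.sum(P * np.log((P + eps) / (Q + eps))))

# Random stand-ins for 13-dimensional features and a 3-dimensional embedding.
rng = np.random.default_rng(0)
X = rng.standard_normal((6, 13))
Y = rng.standard_normal((6, 3))
cost = sne_cost(X, Y, sigma=1.0)
```

The cost is zero only when every low-dimensional conditional distribution matches its high-dimensional counterpart, which is what gradient descent on the embedding tries to approach.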

t-SNE method
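As an extension of stochastic neighbor embedding (SNE), t-SNE was proposed for visualizing high-dimensional data [7]. To optimize the cost function more effectively, t-SNE introduces two improvements. Firstly, a symmetric version of the SNE cost function is used, which minimizes a single Kullback-Leibler divergence between the joint probability distribution $P$ in the high-dimensional space and the joint probability distribution $Q$ in the low-dimensional space:
$$C = KL(P \,\|\, Q) = \sum_i \sum_j p_{ij} \log \frac{p_{ij}}{q_{ij}},$$
where $p_{ij}$ and $q_{ij}$ are expressed as:
$$p_{ij} = \frac{\exp\left(-\|x_i - x_j\|^2 / 2\sigma^2\right)}{\sum_{k \neq l} \exp\left(-\|x_k - x_l\|^2 / 2\sigma^2\right)}, \qquad q_{ij} = \frac{\exp\left(-\|y_i - y_j\|^2\right)}{\sum_{k \neq l} \exp\left(-\|y_k - y_l\|^2\right)}.$$
Secondly, to solve the problem that widely separated data points tend to be crowded together in the low-dimensional space, t-SNE employs a Student t-distribution rather than a Gaussian distribution to convert distances into probabilities in the low-dimensional space. Then $q_{ij}$ can be defined as:
$$q_{ij} = \frac{\left(1 + \|y_i - y_j\|^2\right)^{-1}}{\sum_{k \neq l} \left(1 + \|y_k - y_l\|^2\right)^{-1}}.$$
The gradient is modified accordingly:
$$\frac{\partial C}{\partial y_i} = 4 \sum_j (p_{ij} - q_{ij})(y_i - y_j)\left(1 + \|y_i - y_j\|^2\right)^{-1}.$$
By addressing these problems of the SNE cost function, t-SNE achieves better dimensionality reduction of high-dimensional datasets.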
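The Student-t similarities and the t-SNE gradient can be sketched as below, assuming a symmetric joint distribution `P` with a zero diagonal that sums to one. The function names are hypothetical, `P` and `Y` are random stand-ins rather than gearbox data, and the constant `eps` only guards the logarithm.

```python
import numpy as np

def tsne_q(Y):
    """Joint low-dimensional similarities from a Student-t kernel (one degree of freedom)."""
    diff = Y[:, None, :] - Y[None, :, :]
    w = 1.0 / (1.0 + np.sum(diff ** 2, axis=-1))
    np.fill_diagonal(w, 0.0)          # pairs with k == l are excluded
    return w / w.sum(), w

def tsne_cost(P, Y, eps=1e-12):
    """Single Kullback-Leibler divergence KL(P || Q)."""
    Q, _ = tsne_q(Y)
    return float(np.sum(P * np.log((P + eps) / (Q + eps))))

def tsne_gradient(P, Y):
    """Gradient of the cost with respect to every embedding point y_i."""
    Q, w = tsne_q(Y)
    diff = Y[:, None, :] - Y[None, :, :]          # diff[i, j] = y_i - y_j
    coeff = (P - Q) * w
    return 4.0 * np.sum(coeff[:, :, None] * diff, axis=1)

# Random symmetric joint distribution and a random 3-dimensional embedding.
rng = np.random.default_rng(0)
n = 8
A = rng.random((n, n))
A = A + A.T
np.fill_diagonal(A, 0.0)
P = A / A.sum()
Y = rng.standard_normal((n, 3))
grad = tsne_gradient(P, Y)
```

A gradient descent step would then update the embedding as `Y - learning_rate * grad`, where the learning rate and any momentum term are tuning choices not specified in the source.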

Case study
The dataset from the 2009 PHM Conference Data Analysis Competition is used in this paper. The gearbox dataset consists of two types of gearboxes and fourteen kinds of fault modes. Data were collected at shaft speeds of 30, 35, 40, 45 and 50 Hz under either high or low loading. To demonstrate the feasibility and effectiveness of the proposed method, we choose six typical conditions of spur gearboxes including one normal state and five fault states under 40 Hz

Feature fusion based gearbox fault diagnosis
In this section, t-SNE is used to fuse the high-dimensional feature vectors of gearboxes. The original 13-dimensional features are reduced to 3-dimensional features, which can be seen as the key representatives of the gearbox condition. To evaluate the effectiveness and superiority of the t-SNE method, the traditional methods of PCA and LTSA are also applied to the same dataset. The perplexity of the t-SNE method is set to 25. Comparing the results of the different methods shows that t-SNE gives the best visualization and the best separability in the low-dimensional feature space, as shown in Fig. 3.
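The perplexity setting is related to the Gaussian bandwidth through the standard definition $Perp(P_i) = 2^{H(P_i)}$, where $H(P_i)$ is the Shannon entropy of the conditional distribution in bits. The sketch below searches for the bandwidth that yields a target perplexity for one data point. The function names, the search bounds, and the toy distances are assumptions made for illustration.

```python
import math

def conditional_row(sq_dists, sigma):
    """Conditional probabilities for one point, given squared distances to the others."""
    shift = min(sq_dists)  # stabilizes the exponentials; cancels after normalization
    weights = [math.exp(-(d - shift) / (2.0 * sigma ** 2)) for d in sq_dists]
    total = sum(weights)
    return [w / total for w in weights]

def perplexity(probs):
    """Perplexity 2**H, with the entropy H measured in bits."""
    entropy = -sum(p * math.log2(p) for p in probs if p > 0.0)
    return 2.0 ** entropy

def find_sigma(sq_dists, target, lo=1e-3, hi=1e3, iters=60):
    """Bisection on a log scale; perplexity grows monotonically with sigma."""
    for _ in range(iters):
        mid = math.sqrt(lo * hi)
        if perplexity(conditional_row(sq_dists, mid)) > target:
            hi = mid
        else:
            lo = mid
    return math.sqrt(lo * hi)

# Toy example: squared distances from one point to 100 neighbours.
sq_dists = [float(k) for k in range(1, 101)]
sigma = find_sigma(sq_dists, target=25.0)
achieved = perplexity(conditional_row(sq_dists, sigma))
```

A perplexity of 25 can be read loosely as each point having about 25 effective neighbours, so the search assigns a smaller bandwidth to points in dense regions and a larger one to points in sparse regions.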
Based on the low-dimensional fused features given by t-SNE, fault diagnosis can then be carried out. Here, 150 samples from the normal condition and from each of the five fault modes, respectively, are selected to train the RBF model. With the trained classification model, 50 samples from each state are used as model input for testing. The features obtained by PCA and by LTSA are classified in the same way to compare the classification performance.
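The classification step can be illustrated with a minimal RBF network sketch. The source does not describe the network architecture or its training, so the choices below are assumptions: one Gaussian hidden unit per class centred at the class mean, a fixed width, and output weights fitted by linear least squares. The data are synthetic 3-dimensional clusters, not the fused gearbox features, and the sample counts of 150 training and 50 test samples per class follow one reading of the text.

```python
import numpy as np

def rbf_design(X, centers, width):
    """Gaussian hidden-layer activations plus a constant bias column."""
    sq = np.sum((X[:, None, :] - centers[None, :, :]) ** 2, axis=-1)
    Phi = np.exp(-sq / (2.0 * width ** 2))
    return np.hstack([Phi, np.ones((X.shape[0], 1))])

def fit_rbf(X, labels, n_classes, width):
    """Place one centre at each class mean and fit output weights by least squares."""
    centers = np.vstack([X[labels == c].mean(axis=0) for c in range(n_classes)])
    targets = np.eye(n_classes)[labels]            # one-hot class targets
    weights, *_ = np.linalg.lstsq(rbf_design(X, centers, width), targets, rcond=None)
    return centers, weights

def predict_rbf(X, centers, weights, width):
    """Predicted class is the output unit with the largest response."""
    return np.argmax(rbf_design(X, centers, width) @ weights, axis=1)

# Synthetic stand-in for six well-separated conditions in a 3-dimensional space.
rng = np.random.default_rng(0)
class_means = 5.0 * np.array(
    [[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 0], [1, 0, 1]], dtype=float
)

def sample(n_per_class):
    X = np.vstack([m + 0.3 * rng.standard_normal((n_per_class, 3)) for m in class_means])
    y = np.repeat(np.arange(len(class_means)), n_per_class)
    return X, y

X_train, y_train = sample(150)
X_test, y_test = sample(50)
centers, weights = fit_rbf(X_train, y_train, n_classes=6, width=2.0)
accuracy = float(np.mean(predict_rbf(X_test, centers, weights, 2.0) == y_test))
```

Choosing centres by clustering and tuning the width on held-out data are common alternatives to the fixed choices made here. The sketch also shows why cluster separability matters: when the fused features form compact, distinct clusters, even this simple network separates them easily.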
As shown in Table 2, the accuracy of fault diagnosis based on t-SNE reaches 100 %, while the other feature fusion methods cannot clearly separate all the fault states of gearboxes, which results in lower classification accuracy. The result of fault mode classification verifies the effectiveness and superiority of the proposed feature extraction and feature fusion method.

Conclusions
In this paper, a novel feature fusion method based on WPD-SVD and t-SNE for gearbox fault diagnosis is proposed. In the first step, several time-domain parameters and the singular values given by WPD-SVD are obtained by processing the vibration signals of gearboxes, and together they constitute the robust feature vectors. Then, t-SNE, as an effective method for the visualization of high-dimensional datasets, is introduced to reduce the dimensionality of the extracted feature vectors. Based on the fused features, an RBF based fault diagnosis model is applied to classify the gearbox fault modes. Experiments on gearbox vibration signals demonstrate the effectiveness of the proposed method and its superiority over the compared methods.

Fig. 1. The procedure of the proposed methodology

Fig. 3. The results of feature fusion: a) result of PCA; b) result of LTSA; c) result of t-SNE

Table 2. Accuracy of classification using different methods
Method | Total | Normal | Fault 1 | Fault 2 | Fault 3 | Fault 4 | Fault 5