Gear fault feature extraction and classification of singular value decomposition based on Hilbert empirical wavelet transform

Vibration signals of gearbox systems carry important dynamic information for fault diagnosis. However, these signals are typically non-stationary and overwhelmed by a large amount of noise, which makes the task challenging in many cases. Thus, a new fault diagnosis method combining the Hilbert empirical wavelet transform (HEWT), the singular value decomposition (SVD) and the Elman neural network is proposed in this paper. Vibration signals of a normal gear, a gear with a tooth root crack, a gear with a tooth chipped in width, a gear with a tooth chipped in length, a gear with a missing tooth and a gear with general surface wear are collected under different speed and load conditions. HEWT, a new self-adaptive time-frequency analysis, is applied to the vibration signals to obtain the instantaneous amplitude matrices. Singular value vectors, used as the fault feature vectors, are then acquired by applying the SVD. Finally, the Elman neural network is used for automatic gearbox fault identification and classification. The experimental results show that the proposed method can accurately extract and classify the gear fault features under variable conditions. Moreover, the proposed HEWT-SVD method outperforms Hilbert-Huang transform (HHT)-SVD, local mean decomposition (LMD)-SVD and wavelet packet transform (WPT) combined with principal component analysis (PCA) for feature extraction.


Introduction
Gears are among the most important and most frequently encountered components in rotating machinery, and their operating condition directly affects the performance of the entire system. An unexpected gear fault may cause huge economic losses, or even personal injury, if it is not detected in time [1]. According to statistics, 80 % of transmission machinery failures are caused by gears, and gear failures account for about 10 % of rotating machinery failures. Thus, a robust monitoring system is needed to detect malfunctions as early as possible.
The vibration signals acquired during the operation of a gear carry important dynamic information about the machine; the fault features of a gear can therefore be obtained from the analysis of such vibration signals [2, 3]. Generally, gearbox diagnosis includes two main steps: feature extraction and fault classification. Feature extraction is the most challenging task. If the feature extraction is incorrect or incomplete, it will inevitably lead to erroneous classification and false positives. Furthermore, gear fault signatures are known to be weak, often masked by the natural frequencies of the machine and overwhelmed by noise, and they exhibit obvious non-linear and non-stationary behavior [4]. Thus, the key to solving the problem is to efficiently obtain, from the complex dynamic mechanical signals, the vital features that robustly indicate the presence of faults.
The main feature extraction methods include time-domain, frequency-domain and time-frequency domain methods. Time-domain methods focus on extracting statistical features of the time waveform in order to detect transient phenomena that originate from a faulty gearbox [5, 6]. The time synchronous average (TSA) has been widely used to extract, from the vibration signal, the time waveform that is synchronous with the shaft rotation rate. After the harmonic families of the gear mesh components are removed, the residual signal, which contains the fault signatures, can be used for fault diagnosis. Frequency-domain methods include Fourier spectra, cepstrum analysis and envelope spectrum techniques [7, 8].
Conventional methods show their limits when applied to vibration signals that are non-stationary and in which the components generated by faults are weak and carry little energy. To deal with this problem, time-frequency (TF) analysis techniques have been developed. Among them, the short-time Fourier transform (STFT) has been widely used in rotating machinery fault diagnosis [9]. To overcome the limitation of the STFT related to the Heisenberg uncertainty principle, the Wigner-Ville distribution (WVD) has been widely used. Choy et al. [10] have shown the capacity of the WVD to analyze the effects of surface pitting and wear on the vibrations of a gear transmission system. Thereafter, several studies have exploited the technique, for example [11].
In the last few years, the wavelet transform (WT) has drawn a lot of attention, as it can outperform the classical TF tools [12]. The WT is a multi-resolution analysis that has the advantage of characterizing signals at different localization levels in both the time and frequency domains. The wavelet packet transform (WPT) is an advanced version of the WT which provides a complete level-by-level TF decomposition of the signal. It supplies not only richer information but also more precise frequency localization. However, the WPT is not self-adaptive: it has a prescribed dyadic subdivision in time and frequency, which may severely impair the identification of transient vibration features that lie in the transition areas between dyadic packets [13, 14].
Huang et al. [15] proposed a self-adaptive TF analysis called the Hilbert-Huang transform (HHT), which combines the empirical mode decomposition (EMD) with the Hilbert spectrum and overcomes the problem of the prescribed dyadic subdivision in the WT and WPT. Peng et al. [16] compared the HHT and the WT and verified that the HHT has better resolution and computational efficiency in the time-frequency domain. EMD and the Hilbert spectrum have been used to diagnose faults in rotating machines [17, 18]. Several adaptive methods based on EMD have been developed, such as the LMD [19] and the intrinsic time-scale decomposition (ITD) [20]. However, these methods all share the same drawbacks, such as mode mixing, distorted components and end effects. Recently, Daubechies et al. [21] proposed a wavelet-based time-frequency reassignment method called the synchrosqueezed wavelet transform. This method was successfully applied to gear fault diagnosis [22].
In recent years, Gilles [23] developed the empirical wavelet transform (EWT) [24, 25]. The uniqueness of this method lies in building an adaptive wavelet filter bank capable of extracting the amplitude modulated-frequency modulated (AM-FM) components of a signal. It has been demonstrated to be superior to EMD [23]. Merainani et al. [26] proposed the Hilbert empirical wavelet transform (HEWT) and compared it with the HHT. The technique has shown good results in the diagnosis of gear tooth cracks, tooth pitting and rolling bearing faults [26-28]. Combining the EWT with the Hilbert transform leads to a self-organizing TF plane, which is very beneficial for fault feature extraction. Consequently, the HEWT is used in this research. Nevertheless, the instantaneous amplitude signals produced by the HEWT are complex and too large to be used directly; thus, in order to enhance the robustness of the classification, either principal component analysis (PCA) or the SVD can be employed for dimensionality reduction. Moreover, the singular values are very stable: they change little when the matrix elements change. Hence, the HEWT together with the SVD is proposed for gear fault feature extraction.
Using advanced signal processing techniques alone for fault diagnosis requires a great deal of expertise to apply them successfully. Techniques are therefore required that can automatically make decisions on the health of running machines. After the feature vectors are extracted, intelligent classification techniques can be used to identify the fault modes. These include the Elman neural network [29]. In this study, the Elman neural network is employed for automatic gearbox fault identification and classification. This paper is organized as follows: Section 2 introduces the HEWT-SVD method. Section 3 describes the experimental setup and the experimental results. Conclusions are offered in Section 4.

Time-frequency signal decomposition based on HEWT
The HEWT merges the empirical wavelet transform (EWT) and the Hilbert transform (HT) [26]. The EWT is used to extract adaptive modes from the vibration signal. Then, the instantaneous amplitude and frequency of each mode are computed using the HT.

Empirical wavelet transform
The idea of the EWT is to construct a set of $N$ wavelet filters (one low-pass and $N-1$ band-pass filters) capable of extracting the amplitude modulated-frequency modulated (AM-FM) components (modes) $f_k(t)$ of an input signal $f(t)$ by adapting to the processed signal [23], such that:

$$f(t) = \sum_{k=0}^{N-1} f_k(t) \tag{1}$$

For the adaptation process, the wavelet filter bank is based on the Fourier supports detected from the information contained in the spectrum of the processed signal: the local maxima of the spectrum are found, and the support boundaries $\omega_n$ are taken as the midpoints between successive maxima [23].
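As shown in Fig. 1, the Fourier support $[0, \pi]$ is partitioned into $N$ contiguous segments. Each segment is denoted as $\Lambda_n = [\omega_{n-1}, \omega_n]$, with $\omega_0 = 0$ and $\omega_N = \pi$, thus $\bigcup_{n=1}^{N} \Lambda_n = [0, \pi]$. Centered around each $\omega_n$, there is a transition phase $T_n$ of width $2\tau_n$.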
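The boundary detection described above can be illustrated with a minimal NumPy sketch. It is not the implementation of [23]: the function name `ewt_boundaries`, the choice of keeping the `n_modes` largest spectral maxima, and the use of the magnitude of the real FFT are assumptions made for illustration only.

```python
import numpy as np

def ewt_boundaries(signal, n_modes):
    """Sketch: interior boundaries omega_n in (0, pi) for n_modes spectral segments.

    Assumption: the n_modes largest local maxima of the magnitude spectrum are
    kept, and each boundary is the midpoint between two successive kept maxima.
    """
    spectrum = np.abs(np.fft.rfft(signal))
    omega = np.linspace(0.0, np.pi, spectrum.size)  # normalized frequency axis

    # Interior local maxima of the magnitude spectrum.
    inner = spectrum[1:-1]
    is_peak = (inner > spectrum[:-2]) & (inner >= spectrum[2:])
    peaks = np.flatnonzero(is_peak) + 1

    # Keep the n_modes largest maxima, then sort them by frequency.
    largest = peaks[np.argsort(spectrum[peaks])[::-1][:n_modes]]
    largest = np.sort(largest)

    # Boundaries are the midpoints between successive retained maxima.
    return 0.5 * (omega[largest[:-1]] + omega[largest[1:]])

# Usage: two tones at 50 Hz and 200 Hz sampled at 1 kHz give one boundary
# halfway between the two spectral peaks.
fs = 1000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 50 * t) + 0.5 * np.sin(2 * np.pi * 200 * t)
boundaries = ewt_boundaries(x, n_modes=2)
```

Together with $\omega_0 = 0$ and $\omega_N = \pi$, the returned values delimit the $N$ segments $\Lambda_n$.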
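The empirical wavelets are defined as band-pass filters on each $\Lambda_n$, and a wavelet tight frame can be built on this partition. Hence, inspired by the Meyer and Littlewood-Paley wavelets, a wavelet tight frame $B = \{\varphi_1(t), \{\psi_n(t)\}_{n=1}^{N-1}\}$ is defined. For every $n > 0$, their Fourier transforms, i.e. the empirical scaling function $\hat{\varphi}_n(\omega)$ and the empirical wavelets $\hat{\psi}_n(\omega)$, are given by Eqs. (2) and (3), respectively:

$$\hat{\varphi}_n(\omega) = \begin{cases} 1 & \text{if } |\omega| \le \omega_n - \tau_n \\ \cos\left[\dfrac{\pi}{2}\,\beta\!\left(\dfrac{1}{2\tau_n}\left(|\omega| - \omega_n + \tau_n\right)\right)\right] & \text{if } \omega_n - \tau_n \le |\omega| \le \omega_n + \tau_n \\ 0 & \text{otherwise} \end{cases} \tag{2}$$

$$\hat{\psi}_n(\omega) = \begin{cases} 1 & \text{if } \omega_n + \tau_n \le |\omega| \le \omega_{n+1} - \tau_{n+1} \\ \cos\left[\dfrac{\pi}{2}\,\beta\!\left(\dfrac{1}{2\tau_{n+1}}\left(|\omega| - \omega_{n+1} + \tau_{n+1}\right)\right)\right] & \text{if } \omega_{n+1} - \tau_{n+1} \le |\omega| \le \omega_{n+1} + \tau_{n+1} \\ \sin\left[\dfrac{\pi}{2}\,\beta\!\left(\dfrac{1}{2\tau_n}\left(|\omega| - \omega_n + \tau_n\right)\right)\right] & \text{if } \omega_n - \tau_n \le |\omega| \le \omega_n + \tau_n \\ 0 & \text{otherwise} \end{cases} \tag{3}$$

The function $\beta(x)$ is given as follows:

$$\beta(x) = \begin{cases} 0 & \text{if } x \le 0 \\ 1 & \text{if } x \ge 1 \end{cases}, \qquad \beta(x) + \beta(1 - x) = 1 \quad \forall x \in [0, 1]$$

Note that the most used function that satisfies this property is:

$$\beta(x) = x^4\left(35 - 84x + 70x^2 - 20x^3\right)$$

Once the tight frame of empirical wavelets is built, the EWT, $W_f^{\varepsilon}(n, t)$, can be defined in the same way as the classical wavelet transform. The detail coefficients are obtained by the inner products of the input signal with the empirical wavelets [23]:

$$W_f^{\varepsilon}(n, t) = \langle f, \psi_n \rangle = \int f(\tau)\,\overline{\psi_n(\tau - t)}\,d\tau$$

The approximation coefficients are obtained by the inner product with the scaling function:

$$W_f^{\varepsilon}(0, t) = \langle f, \varphi_1 \rangle = \int f(\tau)\,\overline{\varphi_1(\tau - t)}\,d\tau$$

The signal is thus decomposed into empirical modes $f_k$, which are given by:

$$f_0(t) = W_f^{\varepsilon}(0, t) \star \varphi_1(t) \tag{9}$$

$$f_k(t) = W_f^{\varepsilon}(k, t) \star \psi_k(t), \quad k = 1, \ldots, N-1 \tag{10}$$

where $\star$ denotes convolution. The EWT is invertible, and the signal can be reconstructed as follows:

$$f(t) = W_f^{\varepsilon}(0, t) \star \varphi_1(t) + \sum_{n=1}^{N-1} W_f^{\varepsilon}(n, t) \star \psi_n(t) \tag{11}$$

Since the initial goal of the EWT is to obtain the decomposition of Eq. (1), comparing Eq. (1) with Eq. (11) shows that the modes $f_k$ in Eq. (1) are given by Eqs. (9) and (10).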
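As an illustration of how such a filter bank could be assembled in the Fourier domain, the sketch below builds one low-pass and $N-1$ band-pass filters from given boundaries and transition half-widths, and checks numerically that the squared responses sum to one (the tight-frame property). The names `beta` and `empirical_filter_bank`, the boundary values and the half-widths are hypothetical. Each band-pass filter is written as the product of a rising and a falling ramp, which reproduces the piecewise Meyer-type definition only under the assumption that neighbouring transition zones do not overlap.

```python
import numpy as np

def beta(x):
    """Polynomial transition function: beta(0) = 0, beta(1) = 1."""
    x = np.clip(x, 0.0, 1.0)
    return x**4 * (35 - 84 * x + 70 * x**2 - 20 * x**3)

def empirical_filter_bank(omega, boundaries, tau):
    """Sketch: responses of 1 low-pass + (N-1) band-pass filters on the grid omega.

    boundaries: interior boundaries omega_1 < ... < omega_{N-1} (rad)
    tau:        transition half-widths tau_n, one per boundary (assumed
                small enough that transition zones do not overlap)
    """
    w = np.abs(np.asarray(omega, dtype=float))
    boundaries = np.asarray(boundaries, dtype=float)
    tau = np.asarray(tau, dtype=float)

    def ramp_arg(n):
        return beta((w - boundaries[n] + tau[n]) / (2 * tau[n]))

    def fall(n):  # equals 1 below the transition zone n and 0 above it
        return np.cos(0.5 * np.pi * ramp_arg(n))

    def rise(n):  # equals 0 below the transition zone n and 1 above it
        return np.sin(0.5 * np.pi * ramp_arg(n))

    filters = [fall(0)]  # empirical scaling function (low-pass)
    for n in range(boundaries.size):
        # The last wavelet extends up to pi, so it has no falling edge.
        upper = fall(n + 1) if n + 1 < boundaries.size else np.ones_like(w)
        filters.append(rise(n) * upper)
    return np.array(filters)

omega = np.linspace(0.0, np.pi, 1001)
bank = empirical_filter_bank(omega, boundaries=[0.8, 1.9], tau=[0.1, 0.2])
frame_sum = np.sum(bank**2, axis=0)  # close to 1 everywhere for a tight frame
```

Multiplying the spectrum of a signal by each row of `bank` and inverting the FFT would then give the corresponding empirical modes.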

Hilbert transform
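The Hilbert transform can be seen as a convolution of the signal with $1/(\pi t)$. For every extracted mode $f_k(t)$, the analytic signal $z_k(t)$ associated with $f_k(t)$ is given by:

$$z_k(t) = f_k(t) + j\,H[f_k(t)] = a_k(t)\,e^{j\theta_k(t)} \tag{12}$$

where $H[f_k(t)]$ is the Hilbert transform of $f_k(t)$, defined as:

$$H[f_k(t)] = \frac{1}{\pi}\,\mathrm{p.v.}\!\int_{-\infty}^{+\infty} \frac{f_k(\tau)}{t - \tau}\,d\tau \tag{13}$$

Thus, the instantaneous amplitude and frequency are given by Eqs. (14) and (15), respectively:

$$a_k(t) = \sqrt{f_k^2(t) + H^2[f_k(t)]} \tag{14}$$

$$\omega_k(t) = \frac{d\theta_k(t)}{dt}, \qquad \theta_k(t) = \arctan\frac{H[f_k(t)]}{f_k(t)} \tag{15}$$

The instantaneous amplitude of each mode is considered as its envelope signal. It is a time-varying signal that gives useful information on the intrinsic characteristics of the signal. For this reason, the envelope signal $a_k(t)$ is computed to support our analysis. However, the envelope signals are still too large and complex to be taken directly as the fault features. Thus, we further propose the use of the SVD for dimension reduction to improve the robustness of the fault features.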
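A minimal sketch of this step, assuming a uniformly sampled mode and using the common FFT-based construction of the discrete analytic signal (negative frequencies zeroed, positive frequencies doubled), is given below. The function names and the test signal are illustrative and are not taken from the original work.

```python
import numpy as np

def analytic_signal(x):
    """Discrete analytic signal: zero negative frequencies, double positive ones."""
    n = x.size
    spectrum = np.fft.fft(x)
    gain = np.zeros(n)
    gain[0] = 1.0                      # keep the DC component
    if n % 2 == 0:
        gain[n // 2] = 1.0             # keep the Nyquist component
        gain[1:n // 2] = 2.0
    else:
        gain[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * gain)

def envelope_and_frequency(mode, fs):
    """Instantaneous amplitude and instantaneous frequency (Hz) of one mode."""
    z = analytic_signal(np.asarray(mode, dtype=float))
    amplitude = np.abs(z)                              # envelope a_k(t)
    phase = np.unwrap(np.angle(z))                     # theta_k(t)
    frequency = np.diff(phase) * fs / (2 * np.pi)      # d(theta)/dt in Hz
    return amplitude, frequency

# Usage: a 100 Hz carrier with a 5 Hz amplitude modulation.
fs = 1000
t = np.arange(fs) / fs
mode = (1 + 0.5 * np.cos(2 * np.pi * 5 * t)) * np.cos(2 * np.pi * 100 * t)
amplitude, frequency = envelope_and_frequency(mode, fs)
```

For this synthetic mode the recovered envelope follows the 5 Hz modulation and the instantaneous frequency stays at the carrier frequency; stacking the envelopes of all modes row by row gives the feature matrix used in the next section.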

Singular value decomposition on the HEWT
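The singular value decomposition is a matrix factorization that decomposes any given matrix $A$ ($m \times n$) into three matrices $U$, $\Sigma$ and $V$ as follows:

$$A = U \Sigma V^{T}$$

where $U$ ($m \times m$) and $V$ ($n \times n$) are orthogonal matrices and $\Sigma$ is an ($m \times n$) diagonal matrix of singular values, $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_r$. The matrix $\Sigma$ is represented as:

$$\Sigma = \begin{bmatrix} \mathrm{diag}(\sigma_1, \sigma_2, \ldots, \sigma_r) & 0 \\ 0 & 0 \end{bmatrix}$$

where $r$ is the number of nonzero singular values, i.e. the rank of $A$.

Fig. 2. Feature extraction process using the proposed approach

The columns of the orthogonal matrix $U$ are called the left singular vectors and the columns of the matrix $V$ are called the right singular vectors of the matrix $A$. An important property of $U$ and $V$ is that the columns of each matrix are mutually orthogonal [30].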
The use of the SVD has its own advantages. It expresses the feature matrix $A$ through a small number of values (the singular values), so it acts as a dimension reduction strategy. Furthermore, the singular values are stable: when the elements of the feature matrix change slightly, their singular values do not change much.
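The feature extraction and the stability property can be sketched as follows. The helper name `singular_value_features`, the matrix size and the random data are hypothetical; the stability check relies on Weyl's inequality, which bounds the change of each singular value by the spectral norm of the perturbation.

```python
import numpy as np

def singular_value_features(envelopes, n_features=3):
    """Return the largest singular values of a (modes x samples) feature matrix."""
    matrix = np.asarray(envelopes, dtype=float)
    singular_values = np.linalg.svd(matrix, compute_uv=False)  # descending order
    return singular_values[:n_features]

# Hypothetical feature matrix: 6 envelope signals of 2048 samples each.
rng = np.random.default_rng(0)
feature_matrix = rng.standard_normal((6, 2048))
perturbation = 1e-3 * rng.standard_normal(feature_matrix.shape)

features = singular_value_features(feature_matrix)
perturbed = singular_value_features(feature_matrix + perturbation)

# Weyl's inequality: each singular value moves by at most ||E||_2.
max_shift = np.max(np.abs(features - perturbed))
bound = np.linalg.norm(perturbation, 2)
```

Here `max_shift` never exceeds `bound`, which is one way to make the stability argument concrete.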

Elman neural network
The Elman neural network is generally divided into four layers: the input layer, the hidden layer, the context (association) layer and the output layer. The basic structure of the network is shown in Fig. 3. The connection between the context layer and the hidden layer makes the network sensitive to the history of the input data. The context neurons can be treated as memory units, so this internal feedback significantly improves the capacity of the network to deal with dynamic information, overcoming a drawback of the feed-forward network. For this reason, the Elman neural network is robust in fault classification compared with other neural networks.
The transfer function of the hidden layer is a non-linear function, generally a sigmoid function. The activation function of the output layer neurons is a linear function. The mathematical model of the Elman neural network is as follows:

$$y(k) = g\left(w^{3} x(k)\right)$$

$$x(k) = f\left(w^{1} x_c(k) + w^{2} u(k-1)\right)$$

$$x_c(k) = x(k-1)$$

where $y$, $x$, $u$ and $x_c$ represent the output vector of the network, the output vector of the hidden layer, the input vector of the network and the feedback state vector, respectively. $w^{3}$, $w^{2}$ and $w^{1}$ represent the connection weight matrices from the hidden layer to the output layer, from the input layer to the hidden layer and from the context layer to the hidden layer, respectively. $f(\cdot)$ is the transfer function of the hidden layer and $g(\cdot)$ denotes the activation function of the output neurons.
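The recurrence can be made concrete with a short forward-pass sketch. It assumes a sigmoid hidden layer, a linear output layer and a zero initial context, and it omits biases and training; the layer sizes and the random weights in the usage lines are hypothetical. Each loop iteration consumes the input vector available at that step, which corresponds to $u(k-1)$ in the equations.

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def elman_forward(inputs, w1, w2, w3):
    """Forward pass of an Elman network over a sequence of input vectors.

    inputs: (T, n_in) sequence of input vectors
    w1:     (n_hid, n_hid) context-to-hidden weights
    w2:     (n_hid, n_in)  input-to-hidden weights
    w3:     (n_out, n_hid) hidden-to-output weights
    """
    context = np.zeros(w1.shape[0])        # assumed initial context state
    outputs = []
    for u in inputs:
        hidden = sigmoid(w1 @ context + w2 @ u)   # hidden layer output x(k)
        outputs.append(w3 @ hidden)               # linear output layer y(k)
        context = hidden                          # context stores x(k) for the next step
    return np.array(outputs)

# Usage with hypothetical sizes: 3 singular values in, 8 hidden units, 6 outputs.
rng = np.random.default_rng(0)
w1 = rng.standard_normal((8, 8))
w2 = rng.standard_normal((8, 3))
w3 = rng.standard_normal((6, 8))
u = rng.standard_normal(3)
outputs = elman_forward(np.tile(u, (2, 1)), w1, w2, w3)
```

Feeding the same input vector twice yields two different outputs, because the second step also sees the hidden state stored in the context layer; this is the memory effect that a purely feed-forward network lacks.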

Experimental results and discussion
In order to test the effectiveness of the proposed method, experiments were conducted on the dataset provided by the laboratory of contact and structure mechanics at INSA Lyon, France [25].
Experiments with the proposed method and a detailed comparison with other widely used methods (HHT-SVD, LMD-SVD and WPT together with PCA) are presented below.

Experimental system description
The gear vibration test rig is shown in Fig. 4. The rotational motion is generated by a speed-controlled DC electric motor with a nominal speed of 3600 RPM. Torque is transmitted through a coupling to the gearbox, in which several pinion fault configurations are assembled. At the reducer output, the torque is transferred through a gear coupling to a magnetic powder brake capable of generating different resistive torques [31].
It can be seen that the tooth root crack (TRC) and chipped tooth in length (CTL) faults (Fig. 6(b, d), respectively) do not significantly affect the vibration signal. On the other hand, the chipped tooth in width (CTW), missing tooth (MT) and general surface wear (GSW) faults (Fig. 6(c, e, f), respectively) cause a significant increase in the energy of the time signal and, for the localized defects, repetitive shock waves at every revolution period. Similar observations were made for the vibration signals recorded under the other speed and load conditions.

Comparison of the proposed HEWT-SVD method with HHT-SVD, LMD-SVD and WPT in tandem with PCA for feature extraction
To verify the robustness of the proposed method, three widely used methods, HHT-SVD [32], LMD-SVD and WPT together with PCA, are also implemented, and their performance is compared with that of HEWT-SVD for gear feature extraction under variable conditions.
For each of the four operating modes, the vibration signals of the normal gearbox and of each of the five pinion faults are divided into 15 groups of data. Thus, each operating mode yields 90 samples (6 gear states × 15 groups).
Since the experimental data were collected on a laboratory test bench, the vibration signals have a low noise level, whereas in real-world applications a very high noise level may corrupt the vibration signals. Thus, strong Gaussian white noise (GWN) with an SNR of 0.5 dB was added to the vibration signals, as seen in Fig. 7.
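One common way to add white Gaussian noise at a prescribed SNR is sketched below; it is given for illustration and is not necessarily the exact procedure used in this study. The function name `add_white_noise` is hypothetical, and rescaling the noise realization so that the requested SNR is met exactly (rather than only on average) is an assumption of this sketch.

```python
import numpy as np

def add_white_noise(signal, snr_db, rng=None):
    """Add Gaussian white noise so that 10*log10(P_signal / P_noise) = snr_db."""
    rng = np.random.default_rng() if rng is None else rng
    signal = np.asarray(signal, dtype=float)
    signal_power = np.mean(signal**2)
    noise_power = signal_power / 10 ** (snr_db / 10)
    noise = rng.standard_normal(signal.shape)
    # Rescale the realization so that its power matches the target exactly.
    noise *= np.sqrt(noise_power / np.mean(noise**2))
    return signal + noise

# Usage on a synthetic tone at the SNR quoted in the text.
t = np.arange(4096) / 4096
clean = np.sin(2 * np.pi * 60 * t)
noisy = add_white_noise(clean, snr_db=0.5, rng=np.random.default_rng(1))
```

At 0.5 dB the noise power is about 89 % of the signal power, so the fault signatures are heavily masked in the time waveform.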
To begin with, the signals shown in Fig. 7 are analyzed with the HEWT method. An example of the HEWT applied to one fault and one operating mode (gearbox with a tooth root crack at an operating speed of 1200 RPM and a load of 5 N.m) is shown in Fig. 8, which displays the Fourier spectrum segmentation, the extracted modes and their envelopes.
It can be seen in Fig. 8(a) that the whole spectrum is adaptively divided into six regions. Consequently, six different modes are obtained in total. The analytic signal of each mode is then computed by the HT, and the feature matrix is constructed from the resulting envelopes for fault identification. Finally, the singular value vector, which is the fault feature vector, is obtained by applying the SVD. Table 1 lists part of the fault feature values obtained by HEWT-SVD for an operating speed of 1200 RPM and a load of 5 N.m; each vector contains the first three singular values.
To compare the proposed method with HHT-SVD, LMD-SVD and WPT-PCA, all four methods are applied to the same data sets.
The EMD (resp. LMD) decomposes the vibration signals into a set of intrinsic mode functions (IMFs) (resp. product functions (PFs)). As their number depends on the characteristics and complexity of the analyzed signal, only the first 6 IMFs (resp. PFs) were retained. To perform the HHT, the analytic signal of each IMF is computed using the HT. Next, the feature matrix was constructed and the fault feature vectors were obtained using the SVD.
When using WPT-PCA, the signals are decomposed by a four-level sym1 wavelet packet decomposition, and the signal energies of the 16 frequency bands are obtained. Subsequently, PCA is used to extract the principal components from the wavelet packet energy features for dimensionality reduction.
Fig. 9 shows, in three-dimensional space, the first three singular values obtained by HEWT-SVD for the normal gearbox and the five kinds of pinion faults in the four operating modes; the classification results for the different gearbox faults can be seen from these figures. For a given fault mode, whatever the operating mode, the singular values extracted by HEWT-SVD show a high degree of coincidence. Likewise, comparing the singular value clusters of the normal gearbox with those of the five fault modes for the different operating modes shown in Figs. 9(a-d), one can see that, even though the signals are noisy, the separability of the modes is satisfactory, which leads to a good classification; it is best for the operating speed of 1200 RPM and a load of 5 N.m shown in Fig. 9(c).
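For reference, the WPT-PCA baseline described earlier in this subsection can be sketched with NumPy alone. The sketch assumes that the sym1 wavelet coincides with the Haar wavelet, so that one decomposition level reduces to scaled sums and differences of neighbouring samples; the function names, the signal length (which must be divisible by 16 for four levels) and the random test data are hypothetical. The 16 terminal nodes are returned in natural (tree) order rather than in order of increasing frequency.

```python
import numpy as np

def haar_packet_energies(signal, levels=4):
    """Energies of the 2**levels terminal nodes of a Haar wavelet packet tree."""
    nodes = [np.asarray(signal, dtype=float)]
    for _ in range(levels):
        next_nodes = []
        for node in nodes:
            even, odd = node[0::2], node[1::2]
            next_nodes.append((even + odd) / np.sqrt(2))  # low-pass branch
            next_nodes.append((even - odd) / np.sqrt(2))  # high-pass branch
        nodes = next_nodes
    return np.array([np.sum(node**2) for node in nodes])

def pca_project(features, n_components=3):
    """Project row-wise feature vectors onto their first principal components."""
    centered = features - features.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

# Usage on hypothetical data: 20 signals of 1024 samples each.
rng = np.random.default_rng(0)
energy_features = np.array(
    [haar_packet_energies(rng.standard_normal(1024)) for _ in range(20)]
)
projected = pca_project(energy_features)   # shape (20, 3)
```

Because the Haar packet transform is orthonormal, the 16 band energies of a signal sum to its total energy, which gives a quick sanity check on the decomposition.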
The results obtained by HHT-SVD for the operating speed of 900 RPM under loads of 8 and 11 N.m (Fig. 10(a, b), respectively) and those obtained by LMD-SVD for the operating speed of 1200 RPM under a load of 5 N.m (Fig. 11(c)) are almost comparable to the results obtained by HEWT-SVD shown in Fig. 9: there is a noticeable separability between faults and a good coincidence of the singular value vectors. However, under the remaining operating modes, the classification is poorer between the normal gear and the gear with TRC and between the gears with CTW and CTL, as shown in Figs. 10(c, d) and 11(a, d), and between the normal gear and the gear with CTW, as shown in Fig. 11(b). In these cases the gap between clusters is very small or even nonexistent, which may lead to misclassification.
Fig. 12(a, c, d) shows that, when PCA is used for fault feature extraction from the wavelet packet frequency band energies, the result is significantly better than those obtained by HHT-SVD and LMD-SVD. Nevertheless, there are major fluctuations for all states, and the intervals between the clusters of the fault states are relatively small. Moreover, the classification is poorest between the gears with TRC and CTW, as shown in Fig. 12(b).
The above analysis shows that, for fault feature extraction under different operating modes, HEWT with SVD performs much better than HHT-SVD, LMD-SVD and WPT in tandem with PCA.
Since the instantaneous amplitude matrices adaptively extracted by the HEWT differ from one pinion fault to another, they can be used as fault features. As the goal is to extract more efficient and more accurate features for fault recognition and classification, dimensionality reduction was carried out with the SVD, so that the faults can be accurately identified.
The scatter plots in Figs. 9(a-d) show that the clusters of the various fault signals differ markedly and do not overlap, which suggests that the various faults can be distinguished well.

State classification based on Elman neural network
Here, the fault feature vectors obtained by HEWT-SVD are used to identify and classify the gear states by using the Elman neural network.
The data sets are divided into a training set of 168 groups (7 groups of data for each gear state in each operating mode), while the remaining 192 groups are used to test the effectiveness and the accuracy of the classifier. The state classification results are partly shown in Table 2, from which we can see that, even under variable operating modes and with noisy signals, the actual outputs of the Elman neural network are consistent with the target outputs. Therefore, combining HEWT-SVD with the Elman neural network is effective for gear fault diagnosis under variable operating modes. In order to compare the classification performance, another classification technique, the back propagation (BP) neural network, is also applied. The comparison results are shown in Table 3, in which 5 examples are given to calculate the mean value of the classification accuracy. As shown in the table, the Elman neural network has an advantage over BP in classification accuracy. In fact, both classifiers achieve classification accuracies higher than 0.9, because the obtained feature vectors have good separability. This confirms the effectiveness of the proposed feature extraction method.
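The sample counts quoted above can be cross-checked with a few lines of arithmetic. The check assumes that the 7 training groups are taken per gear state and per operating mode, which is the reading consistent with the totals reported in the text.

```python
# Counts taken from the text; the per-case split is an assumption.
n_states = 6            # normal gear + five pinion faults
n_modes = 4             # operating modes (speed and load combinations)
groups_per_case = 15    # groups of data per state and per operating mode
train_per_case = 7      # training groups per state and per operating mode

n_cases = n_states * n_modes
n_total = n_cases * groups_per_case
n_train = n_cases * train_per_case
n_test = n_total - n_train

print(n_total, n_train, n_test)  # 360 168 192
```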

Conclusions
Gearbox fault feature extraction and classification remain challenging, which is the principal motivation for developing a new approach. The proposed approach combines HEWT, SVD and the Elman neural network. Experimental data sets of a normal gearbox and five pinion faults under different operating modes were processed by the HEWT. The instantaneous amplitude matrices, i.e. the feature matrices, were decomposed by the SVD to obtain the singular value vectors. Their dimensions were thereby reduced and more stable features were obtained. The Elman neural network was then used for fault identification and classification based on the extracted feature vectors.
In this paper, the proposed HEWT-SVD-Elman method is shown to have an advantage over HHT-SVD, LMD-SVD and WPT-PCA for feature extraction, and the Elman neural network is shown to have an advantage over BP in classification accuracy, under different operating modes.
It is worth mentioning that, in this study, the proposed method was applied only to gearbox fault diagnosis; further experiments on similar systems are needed to verify its effectiveness.

Fig. 4. a) Experimental gearbox test rig, b) structure of the single stage gear in the gearbox

In order to verify the effectiveness of the proposed method, six pinions with different fault states were considered. The first one is referred to as Good (G), whereas the others have different types of faults: a Tooth Root Crack (TRC), a Chipped Tooth in Length (CTL), a Chipped Tooth in Width (CTW), a Missing Tooth (MT) and General Surface Wear (GSW), as shown in Fig. 5. Three pinions are simultaneously mounted on the input shaft of the gearbox; each of them is engaged by a simple axial movement of the wheel on the output shaft (Fig. 4(b)). To record the vibration signals, two accelerometers with a sensitivity of 100 mV/g were mounted radially, one vertically and the other horizontally, on the bearing case of the output shaft. The sampling frequency of the accelerometer channels is 125 kHz. The cut-off frequency of the anti-aliasing filter is 27 kHz. The acquisition duration is 30 s.

Table 1. Fault feature values obtained by HEWT-SVD for an operating speed of 1200 RPM and a load of 5 N.m

Table 2. State classification results based on the Elman neural network

Table 3. Classification results of the Elman and back propagation neural networks