Application of empirical mode decomposition and Euclidean distance technique for feature selection and fault diagnosis of planetary gearbox

Abstract. Planetary gearboxes play an important role in large and complex mechanical equipment because they provide a larger transmission ratio in a compact space than fixed-shaft gearboxes. However, their fault diagnosis is difficult because of their special structure and harsh working conditions. This paper applies Empirical Mode Decomposition (EMD) and the Euclidean Distance Technique (EDT) to feature selection and fault diagnosis of planetary gearboxes. EMD is a self-adaptive signal processing method that can be applied to non-linear and non-stationary signals and that also helps with de-noising. EDT gives a quantitative fault diagnosis result, and its theory is easy to understand. An intrinsic mode function (IMF) selection method based on energy ratio is proposed to select the IMFs that contain sensitive fault information. After 36 feature parameters are extracted, a two-stage feature selection and weighting method based on EDT is applied to obtain a new combined feature. Then, the feature vector matrix of each raw signal is computed by extracting the new combined feature from every selected IMF. Finally, the diagnosis result is obtained by calculating the Euclidean distance between two feature vector matrices: the health state of a tested signal is taken to be that of the trained signal at the minimum Euclidean distance from it. The performance of the proposed method is validated with experimental data and industrial data.


Introduction
Because it provides a larger transmission ratio in a compact space than a fixed-shaft gearbox, the planetary gearbox has been increasingly used in many kinds of rotary machinery in recent years, such as wind turbines, cranes and military equipment (e.g. tanks and helicopters). However, its special internal structure and harsh working conditions make fault diagnosis of the planetary gearbox very difficult [1], for the following reasons.
1) The fault feature frequencies of a planetary gearbox are very low and are accompanied by serious noise pollution, so feature extraction is very difficult.
Compared with a fixed-shaft gearbox, the structure of a planetary gearbox is relatively compact. A single-stage planetary gearbox usually consists of one sun gear, one ring gear, one planet carrier and three or four planet gears, so it can achieve a larger transmission ratio. Because of this special structure and the adverse working conditions, the fault feature frequencies (such as those of the planet gear and ring gear) are very low and are buried in serious noise. Therefore, feature extraction for a planetary gearbox is very difficult.
2) The vibration signal transmission path is very complicated, and the vibration signal acquired from a planetary gearbox is usually strongly non-linear and non-stationary.
The special internal structure makes the signal transmission path very complicated. The transmission path depends on both the fault location and the accelerometer location. Additionally, since several gears mesh in a compact space, the vibration signal acquired from a planetary gearbox is usually strongly non-linear and non-stationary.
3) The components of the vibration signal of a planetary gearbox are complicated, so frequency spectrum analysis is very difficult.
Because of the special internal structure, the vibration signal of a planetary gearbox has many components. The fault feature frequencies are related to the number of teeth of each gear, the number of planet gears, the meshing phase and so on. In addition, gear failures, manufacturing errors and the varying position of the planet gears relative to the accelerometer introduce amplitude and frequency modulation into the vibration signal. Frequency spectrum analysis is therefore very difficult.
From the analysis above it can be concluded that fault diagnosis of planetary gearboxes is very difficult, and efficient and accurate methods are needed to prevent failures. Many researchers have made important contributions to planetary gearbox fault diagnosis. Samuel and Pines gave a comprehensive review of vibration-based techniques for helicopter transmission diagnostics [2]. Some researchers focused on model-based methods [3][4][5][6][7]. Some obtained fault diagnosis results using frequency analysis [8][9][10][11][12]. Others addressed the problem with evaluation methods [13] and condition monitoring methods [14]. Nevertheless, an intelligent and quantitative method is still needed.
An intelligent fault diagnosis procedure mainly consists of data acquisition, data processing, feature selection and fault diagnosis. Major efforts have been made in this area by many researchers and organizations. Refs. [15][16][17][18] introduce the common fault features in detail, such as maximum, minimum, kurtosis and energy, and also explain the data processing techniques used to extract them. However, their effectiveness for planetary gearbox fault diagnosis still needs to be validated.
Empirical mode decomposition (EMD) [19][20][21][22][23], a relatively new time-frequency analysis technique, has been widely applied to fault diagnosis of rotating machinery in recent years. EMD decomposes a complicated signal into a set of complete, simple and almost orthogonal components named intrinsic mode functions (IMFs). The IMFs represent the natural oscillatory modes embedded in the signal and work as basis functions that are determined by the signal itself rather than by pre-determined kernels. Thus, EMD is a self-adaptive signal processing method that is well suited to non-linear and non-stationary signals. In addition, the Euclidean Distance Technique (EDT) is a useful method for automatic fault diagnosis. EDT gives a quantitative result and, compared with other similar methods, its theory is easy to understand. Therefore, this paper proposes a feature selection and fault diagnosis method based on EMD and EDT.
The rest of the paper is organized as follows. Section 2 introduces the proposed feature selection and fault diagnosis method. Section 3 illustrates the method with experimental data from a planetary gearbox test rig. Section 4 validates the proposed method using experimental data and industrial data. Finally, conclusions are drawn in Section 5.

EMD theory
The Hilbert-Huang Transform (HHT) is a signal processing method developed by Huang et al. It contains two parts: EMD and Hilbert spectrum analysis. As the kernel of HHT, EMD has been widely used in fault diagnosis of rotating machinery in recent years. Using EMD, a complex signal can be decomposed into a set of complete, simple and almost orthogonal components named intrinsic mode functions (IMFs). The IMFs represent the natural oscillatory modes embedded in the signal and work as basis functions that are determined by the signal itself. The detailed theory of HHT, including EMD, can be found in Refs. [19][20][21][22][23] and is not repeated in this paper.

APPLICATION OF EMPIRICAL MODE DECOMPOSITION AND EUCLIDEAN DISTANCE TECHNIQUE FOR FEATURE SELECTION AND FAULT DIAGNOSIS OF PLANETARY GEARBOX. HAIPING LI, JIANMIN ZHAO, JIAN LIU, XIANGLONG NI

IMF selection method based on energy ratio
After all the IMFs of a signal have been obtained, the sensitive IMFs that contain the main fault information should be selected to speed up the computation. This paper proposes an energy-ratio-based method to select sensitive IMFs, because the energy of a fault signal is expected to be higher than that of a normal signal. The specific steps are as follows.
3) Because the IMFs produced by EMD are naturally arranged from high frequency to low frequency, and the feature frequencies of a planetary gearbox are usually very low, an $(n+1)$th IMF that contains the main low-frequency fault information is obtained by adding the remaining IMFs together to decrease the error, where $n$ is the number of IMFs already selected.
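The selection idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the IMFs are assumed to be already available as lists of samples, the function names `energy` and `select_imfs` are hypothetical, and the `threshold` used to decide which leading IMFs are "sensitive" is an assumed stand-in for the visual comparison of energy ratios.

```python
def energy(component):
    """Energy of a sampled signal: the sum of its squared samples."""
    return sum(v * v for v in component)


def select_imfs(imfs, residual, original, threshold):
    """Keep the leading IMFs whose energy ratio reaches `threshold` (assumed rule),
    then fold every remaining IMF plus the residual into one extra component."""
    total = energy(original)
    ratios = [energy(imf) / total for imf in imfs]

    # IMFs are ordered from high to low frequency, so count the leading ones to keep.
    n_keep = 0
    while n_keep < len(imfs) and ratios[n_keep] >= threshold:
        n_keep += 1

    # The (n+1)th component: sum of the remaining IMFs and the residual.
    combined = list(residual)
    for imf in imfs[n_keep:]:
        combined = [c + v for c, v in zip(combined, imf)]

    components = [list(imf) for imf in imfs[:n_keep]] + [combined]
    return ratios, components


# Toy data (hypothetical): four "IMFs" and a residual, four samples each.
imfs = [
    [3.0, -3.0, 3.0, -3.0],
    [2.0, 2.0, -2.0, -2.0],
    [0.1, 0.1, 0.1, 0.1],
    [0.05, -0.05, 0.05, -0.05],
]
residual = [0.01, 0.01, 0.01, 0.01]
original = [sum(vals) for vals in zip(*imfs, residual)]

ratios, components = select_imfs(imfs, residual, original, threshold=0.05)
```

With this toy input the first two IMFs are kept and the last two are merged with the residual, so `components` has three entries whose sum still reproduces `original`.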

Feature extraction
A good feature parameter is crucial to fault diagnosis of a planetary gearbox. This paper summarizes 36 feature parameters that may be appropriate for planetary gearbox fault diagnosis [2][15][16][17][18]; they are listed in Table 1. The 36 features are extracted in different ways, as Fig. 1 shows. This is a supplement to Ref. [15] and is not described further here.

Mean square value $= \frac{1}{N}\sum_{i=1}^{N} x_i^2$

Root mean square (RMS) $= \sqrt{\frac{1}{N}\sum_{i=1}^{N} x_i^2}$

Here $x$ is the vibration signal acquired by the accelerometer and $N$ is the total number of data points. $X(k)$ is the $k$th measure of the frequency spectrum of $x$ and $f(k)$ is the frequency value of $X(k)$.
$PP_x$ is the maximum peak-to-peak amplitude of the signal $x$; $P_h$ is the amplitude of the $h$th harmonic, and $H$ is the total number of harmonics in the frequency range. $M$ is the current time record number in the run ensemble. $d$ is the difference signal, $d = x - y_d$, where $y_d$ is the signal containing the mesh frequencies, their harmonics and their first-order sidebands. $r$ is the residual signal, $r = x - y_r$, where $y_r$ is the signal containing only the mesh frequencies and their harmonics. $s$ is the envelope of the band-pass filtered signal, computed using the Hilbert transform.
$re$ equals $x_i^2 - x_{i-1} \cdot x_{i+1}$ and $re_i$ is the $i$th measurement of the resulting signal. $\bar{d}$, $\bar{r}$, $\bar{s}$ and $\overline{re}$ are the mean values of $d$, $r$, $s$ and $re$, respectively.
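As an illustration of how a few such features are computed, the sketch below evaluates some common time-domain quantities and an energy-operator signal of the form $x_i^2 - x_{i-1} x_{i+1}$. The definitions are the usual textbook ones (population variance, kurtosis as the fourth central moment over the squared variance) and are assumptions here; the normalizations in Table 1 may differ. The function names are hypothetical.

```python
import math


def time_domain_features(x):
    """A few common time-domain features of a sampled signal (textbook definitions)."""
    n = len(x)
    mean = sum(x) / n
    mean_square = sum(v * v for v in x) / n
    variance = sum((v - mean) ** 2 for v in x) / n      # population variance (assumed)
    fourth_moment = sum((v - mean) ** 4 for v in x) / n
    return {
        "mean_square": mean_square,
        "rms": math.sqrt(mean_square),
        "peak_to_peak": max(x) - min(x),
        "variance": variance,
        "kurtosis": fourth_moment / variance ** 2,
    }


def energy_operator(x):
    """Energy-operator signal x[i]^2 - x[i-1]*x[i+1], defined for interior samples only."""
    return [x[i] ** 2 - x[i - 1] * x[i + 1] for i in range(1, len(x) - 1)]


# Toy signal: a unit sine sampled four times per period.
x = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]
features = time_domain_features(x)
re = energy_operator(x)
```

For this pure tone the energy-operator signal is constant, which is why its statistics respond to the amplitude and frequency modulation that gear faults introduce.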

Feature selection and weighting based on EDT
In fault diagnosis of planetary gearboxes, different features may be sensitive to different kinds of faults. Therefore, feature selection is needed after all 36 features have been extracted [24]. Lei and Zuo proposed a two-stage feature selection and weighting technique via EDT in Ref. [17]. However, they did not consider the influence of changes in working condition, and the method was implemented on a fixed-shaft gearbox rather than a planetary gearbox. This paper improves the method by considering not only the health status but also the working condition, and applies it to a planetary gearbox. The specific steps are as follows. Assume that $C$ is the number of health statuses of the planetary gearbox, $M_c$ is the number of samples of the $c$th health status, and $J$ features are extracted from each sample. Thus, a feature set $\{q_{m,c,j},\ m = 1, 2, \ldots, M_c;\ c = 1, 2, \ldots, C;\ j = 1, 2, \ldots, J\}$ can be obtained, where $q_{m,c,j}$ is the $j$th feature of the $m$th sample of the $c$th health status. Meanwhile, assume that $W$ is the number of working conditions. The improved feature selection and weighting procedure based on EDT is then as follows.
(1) Calculate the average Euclidean distance between the samples of the same health status, and then average it over the health statuses to obtain the within-status distance $d_j^{(w)}$ of the $j$th feature.
(2) Define the variance factor within the same health status, taking the changes of the vibration signal into account.
(3) Calculate the mean value of each feature over all samples of the same health status, and then calculate the average distance between the mean values of different health statuses to obtain the between-status distance $d_j^{(b)}$.
(4) Define the variance factor between different health statuses, taking the changes of the features into account.
(5) Define a variance factor $\lambda_j$ that takes both the vibration signal and the features into account.
(6) Calculate the ratio of $d_j^{(b)}$ to $d_j^{(w)}$, weighted by $\lambda_j$, and then normalize the obtained values.
(7) Appropriate features are sensitive to changes in health status and insensitive to changes in working condition. Let $\bar{\alpha}_j$ be the normalized ratio of Eq. (11) and $\bar{\alpha}_{w,j}$ the mean value of the corresponding ratio obtained when only the working condition changes. The D-value is their difference, $D_j = \bar{\alpha}_j - \bar{\alpha}_{w,j}$.

Because the selection procedure has removed the influence of changes in working condition, the features with $D_j \ge 0$ are appropriate. Although the appropriate features are selected, they have different sensitivities, so feature weighting is essential to achieve a more accurate diagnosis result. Repeating Steps (1)-(7) yields the sensitivities of the selected features, so each selected feature obtains a weighting coefficient $w_k$ ($k = 1, 2, \ldots, K$), where $K$ is the number of selected features and $K \le 36$. Thus, a new combined feature $F$ is obtained as the weighted sum of the selected features, $F = \sum_{k=1}^{K} w_k f_k$, where $f_k$ is the $k$th selected feature. $F$ is the feature used in the remainder of the method.
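The selection and weighting logic can be sketched as follows. This is a simplified illustration under explicit assumptions rather than the procedure of this paper: the variance factors of the intermediate steps are omitted, the sensitivity is taken as the plain ratio of between-group to within-group average distance normalized by its maximum, the "working condition only" sensitivity is approximated by grouping the same samples by working condition, and the weights are the sensitivities recomputed on the selected features. All names and the toy data are hypothetical.

```python
from itertools import combinations
from statistics import mean


def distance_ratio(groups):
    """Between-group over within-group average distance for one feature.
    Each group needs at least two values, and at least two groups are needed."""
    within = mean(mean(abs(a - b) for a, b in combinations(g, 2)) for g in groups)
    centers = [mean(g) for g in groups]
    between = mean(abs(a - b) for a, b in combinations(centers, 2))
    return between / within


def sensitivities(samples, key, feature_ids):
    """Normalized sensitivity of each feature when samples are grouped by `key`."""
    raw = []
    for j in feature_ids:
        groups = {}
        for s in samples:
            groups.setdefault(s[key], []).append(s["x"][j])
        raw.append(distance_ratio(list(groups.values())))
    top = max(raw)
    return [r / top for r in raw]


# Toy data: 2 health statuses x 2 working conditions, 3 features per sample.
samples = [
    {"health": 0, "cond": 0, "x": [1.0, 1.0, 0.0]},
    {"health": 0, "cond": 1, "x": [1.2, 2.0, 0.4]},
    {"health": 1, "cond": 0, "x": [3.0, 1.5, 2.0]},
    {"health": 1, "cond": 1, "x": [3.2, 2.5, 2.4]},
]
all_ids = [0, 1, 2]

by_health = sensitivities(samples, "health", all_ids)
by_cond = sensitivities(samples, "cond", all_ids)
d_values = [h - c for h, c in zip(by_health, by_cond)]

# Keep features with D >= 0, then weight them by their recomputed sensitivity.
selected = [j for j, d in zip(all_ids, d_values) if d >= 0]
weights = sensitivities(samples, "health", selected)


def combined_feature(x):
    """Weighted sum of the selected features of one sample."""
    return sum(w * x[j] for w, j in zip(weights, selected))
```

In the toy data feature 1 follows the working condition rather than the health status, so its D-value is negative and it is discarded; features 0 and 2 are kept with weights 1.0 and 0.5.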

Feature selection and fault diagnosis method
The flow chart of the feature selection and fault diagnosis method for planetary gearboxes based on EMD and EDT proposed in this paper is shown in Fig. 2.

Experimental evaluation
In this section, we examine the performance of the proposed method by using it to analyze the experimental signals of a planetary gearbox in a test rig.

Experimental setup
Fig. 3 shows the planetary gearbox test rig, and the gear parameters of the planetary gearbox are listed in Table 2. Four accelerometers were mounted on the tested planetary gearbox, as depicted in Fig. 4(a). Fig. 4(b) shows the structure of the planetary gearbox.
The sampling frequency of the experimental system is 20 kHz and each record lasts 2 s. The motor drives the input shaft (shaft 1) at three rotary speeds: 400, 800 and 1200 rpm. At each speed, four loads are applied: 0, 0.4, 0.8 and 1.2 Nm.
To demonstrate the effectiveness of the proposed method, wear failure is seeded on one tooth of the sun gear, the planet gear and the ring gear, respectively. The specific failures are shown in Fig. 5.

Feature extraction, selection and weighting
The vibration signals acquired from the planetary gearbox test rig are analyzed with the feature extraction, selection and weighting method proposed in Section 2.2. Fig. 6 shows the sensitivity of the 36 features: Fig. 6(a) is the result when health status and working condition change at the same time, and Fig. 6(b) is the result when only the working condition changes. It can be seen from the two figures that some features are highly sensitive in both cases, for example the mean square value, variance and energy, whereas some features are highly sensitive only when the working condition changes, for example the maximum value, minimum value, peak-to-peak value and peak value. In addition, the average value is highly sensitive only when health status and working condition change at the same time.
The appropriate features, which are sensitive to changes in health status and insensitive to changes in working condition, can be obtained from the D-value of Fig. 6.

EMD decomposition and IMFs selection
The vibration data used for EMD decomposition cover 4 kinds of health status (normal, sun gear fault, planet gear fault and ring gear fault), and each health status has 30 samples. After all samples are processed by EMD, between 15 and 23 IMFs are obtained. Taking the planet gear fault signal acquired at 1200 rpm and 1.2 Nm as an example, the result of EMD decomposition is shown in Fig. 8. According to the IMF selection method proposed in Section 2.1.2, the ratio of the energy of each IMF to that of the original signal is obtained; see Table 3. Table 3 shows that the energy ratios of the first to the sixth IMFs are much larger than the others, so these six IMFs are selected. To decrease the computational error, this paper adds the seventh to the fifteenth IMFs and the residual signal together as a new IMF, IMF 7*, as Fig. 9 shows.

Diagnosis result
After obtaining the new feature and the IMFs that contain the main fault information, the next step is to extract the new feature from each selected IMF to constitute a feature vector for use by EDT. The damage seeded on the sun gear is somewhat more severe than that on the planet gear and ring gear. Similarly, it can be seen from Fig. 10 that test samples 1-30 are in normal status, test samples 31-60 are in planet gear fault status and test samples 61-90 are in ring gear fault status. The test samples marked by arrows are judged incorrectly. For example, the true status of the test sample marked by the first arrow is normal, but it is judged to be in planet gear fault status because its Euclidean distance to the planet gear fault trained sample is smaller than its distance to the others. In total, 7 of the 120 test samples are judged incorrectly, so the accuracy of the proposed method is 94.17 %.

Methodological validation
In order to validate the proposed fault diagnosis method for planetary gearbox, experimental and industrial data are utilized.

Validation by experimental data
To validate the effectiveness of the proposed method, this paper first uses the experimental data. The specific settings are listed in Table 4. The data used before were acquired at the working condition of 1200 rpm and 1.2 Nm; to validate the effectiveness of the method, the data acquired at 800 rpm and 0.4 Nm are also analyzed. The analysis procedure is the same as before, and Fig. 11 shows the diagnosis results. It can be seen that 11 test samples are judged incorrectly, so the accuracy of the proposed method is 90.83 %. Therefore, the effectiveness of the proposed method is validated.

Table 4, rows 3 and 4:
3. Validate the necessity of feature selection: process the data used before with EMD and then select one feature from the 36 features at random.
4. Validate the necessity of feature weighting: process the data used before with EMD and feature selection but without feature weighting.

Validate the necessity of EMD
The data acquired at the working condition of 1200 rpm and 1.2 Nm are again used here, but EMD is not applied. Fig. 12 shows the diagnosis results. Additionally, to validate the necessity of EMD further, this paper imitates the EMD process by dividing each signal used before into 7 segments of equal length. Fig. 13 shows the diagnosis results. It should be emphasized that both approaches still use the new combined feature as the feature. It can be seen from the two figures that 16 and 13 test samples are judged incorrectly, respectively. The accuracies are therefore 86.67 % and 89.17 %, both lower than 94.17 %, so the necessity of EMD is validated.
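The equal-length segmentation used as a stand-in for EMD can be sketched as below. This is an illustrative assumption about the mechanics only: the record length of 40000 points follows from 20 kHz sampling over 2 s, leftover samples that do not fill a whole segment are simply dropped, the synthetic signal is made up, and RMS is used as a placeholder for the new combined feature. The function names are hypothetical.

```python
import math


def split_equal(signal, parts):
    """Split a signal into `parts` consecutive segments of equal length,
    dropping any trailing samples that do not fill a whole segment."""
    seg_len = len(signal) // parts
    return [signal[i * seg_len:(i + 1) * seg_len] for i in range(parts)]


def rms(segment):
    """Root mean square of a segment (placeholder for the combined feature)."""
    return math.sqrt(sum(v * v for v in segment) / len(segment))


# Synthetic record: 20 kHz x 2 s = 40000 samples with a slowly growing amplitude.
signal = [math.sin(0.1 * n) * (1 + n / 40000) for n in range(40000)]

segments = split_equal(signal, 7)
feature_vector = [rms(seg) for seg in segments]   # 7 numbers, one per segment
```

Each segment plays the role that one IMF plays in the proposed method, so the resulting feature vector has the same length of 7 and can be compared with the same distance measure.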
It can also be seen from the two results that dividing a signal into several segments of equal length improves the accuracy when EMD cannot be applied.

Validate the necessity of feature selection
The data acquired at the working condition of 1200 rpm and 1.2 Nm are again used here and EMD is also applied, but the feature is selected at random from the 36 features. Fig. 14 shows the results: 16 test samples are judged incorrectly, so the accuracy is 86.67 %, well below 94.17 %. The necessity of feature selection is therefore also validated.

Validate the necessity of feature weighting
The data acquired at the working condition of 1200 rpm and 1.2 Nm are again used, but the selected features are not weighted; that is, the feature $F^{*}$ is the plain sum of the eight selected features. Fig. 15 shows the results: 20 test samples are judged incorrectly. The accuracy is 83.33 %, which is also well below 94.17 %. Therefore, the necessity of feature weighting is validated.
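The accuracies quoted for the experimental validation follow directly from the number of misjudged samples out of the 120 test samples (4 health statuses with 30 test samples each). The short check below reproduces them; the dictionary keys are descriptive labels chosen here, not names from the paper.

```python
TEST_SAMPLES = 120  # 4 health statuses x 30 test samples each

# Number of misjudged test samples reported for each configuration.
misjudged = {
    "proposed method, 1200 rpm and 1.2 Nm": 7,
    "proposed method, second working condition": 11,
    "without EMD": 16,
    "7 equal-length segments instead of EMD": 13,
    "randomly selected feature": 16,
    "selected features without weighting": 20,
}

# Accuracy in percent, rounded to two decimals as in the text.
accuracy = {
    name: round(100 * (TEST_SAMPLES - wrong) / TEST_SAMPLES, 2)
    for name, wrong in misjudged.items()
}

for name, value in accuracy.items():
    print(f"{name}: {value} %")
```

Every ablation removes between 6 and 13 additional correct decisions relative to the full method, which is the basis for the necessity claims above.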

Industrial data specifications
Industrial data were analyzed to validate the effectiveness of the proposed method further. The industrial data were acquired from a helicopter planetary gearbox, because the planetary gearbox is a main part of the helicopter engine. Table 5 lists the parameters of the planetary gearbox. There are 8 accelerometers on the helicopter engine as a whole, but only accelerometers 1, 6 and 7 are installed on the casing of the planetary gearbox. Only these three locations are therefore useful; their positions are shown in Fig. 16.
Because of the particular nature of the helicopter, the rotary speed of the sun gear shaft is constant at 574.74 rpm. Torques of 30 %, 40 % and 50 % of the rated torque of the helicopter are applied. The data were acquired during flight hours 600 h to 800 h. An open-gearbox examination after testing found the planetary gearbox to be healthy. Further testing was not carried out, in consideration of cost and safety.

Results analysis and discussion
Although the planetary gearbox is healthy, the effectiveness of the proposed method can still be validated if the method performs well on data acquired under different working conditions. Therefore, the signals acquired at the three torque levels are analyzed in this section. Following the procedure of the method proposed in Section 2, the diagnosis result is shown in Fig. 17. It can be seen from the results that 3 of the 30 test samples are judged incorrectly, so the accuracy is 90 %. The effectiveness of the proposed method is thus validated once more.

Conclusions
In this paper, the difficulties of planetary gearbox fault diagnosis are first pointed out. Then a feature selection and fault diagnosis method based on EMD and EDT is proposed to address these problems and to output the diagnosis results intelligently and quantitatively. In the proposed method, an IMF selection method based on energy ratio is proposed first. Then, after 36 feature parameters are extracted, a two-stage feature selection and weighting method based on EDT is applied to planetary gearbox fault diagnosis. Finally, the feature vector matrix of each raw signal is computed and the diagnosis results are obtained by calculating the Euclidean distance between two feature vector matrices. A planetary gearbox test rig is established and four kinds of health condition are simulated. Furthermore, the effectiveness of the proposed method is validated with experimental and industrial data.

The method shown in Fig. 2 consists of the following procedural steps. (1) Obtain a new feature parameter through feature extraction, selection and weighting, as the blue (dotted) flow line shows. (2) Obtain several sensitive IMFs through EMD decomposition and IMF selection, as the red flow line shows. (3) Compute the feature vector matrix of each raw signal by extracting the new feature from every selected IMF. (4) Obtain the diagnosis result by calculating the Euclidean distance between two feature vector matrices: the health state of the tested signal is taken to be that of the trained signal at the minimum Euclidean distance from it.
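The final decision rule of step (4) amounts to a nearest-reference search. The sketch below is a minimal illustration, assuming each health status is represented by a single reference feature vector with one entry per selected IMF; how the trained samples of one status are combined into such a reference is not specified here, and the numeric values and function names are hypothetical.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def diagnose(test_vector, trained):
    """Return the label of the trained reference closest to `test_vector`,
    together with the distance to every reference."""
    distances = {
        label: euclidean_distance(test_vector, reference)
        for label, reference in trained.items()
    }
    return min(distances, key=distances.get), distances


# Hypothetical reference vectors: 7 entries (6 selected IMFs plus the combined IMF).
trained = {
    "normal": [1.0] * 7,
    "sun gear fault": [4.0] * 7,
    "planet gear fault": [2.0] * 7,
    "ring gear fault": [3.0] * 7,
}

label, distances = diagnose([2.2] * 7, trained)
```

Because the distances themselves are returned alongside the label, the margin between the closest and second-closest status can be inspected, which is what makes the result quantitative rather than a bare class label.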

Fig. 2. Flow chart of the method proposed in this paper

Fig. 4. a) Mounted location of each accelerometer; b) structure of the tested planetary gearbox
Fig. 5. Seeded wear failure of every gear: a) sun gear; b) planet gear; c) ring gear

Fig. 6. a) The sensitivity of the 36 features when health status and working condition both change; b) the sensitivity of the 36 features when only the working condition changes

Fig. 8. The result of EMD decomposition of the planet gear fault signal

This paper uses the vibration data of 4 kinds of health status (normal, sun gear fault, planet gear fault and ring gear fault), all acquired at the working condition of 1200 rpm and 1.2 Nm. From each health status 33 samples are selected: the first three are used as trained samples and the other 30 as test samples, giving 120 test samples in total. For each sample, the new feature is extracted from the first to the sixth IMFs and from IMF 7*, which is made up of the other IMFs and the residual signal, so each feature vector consists of 7 numbers. Finally, the Euclidean distances between each test sample and the 4 kinds of trained samples are calculated from the feature vectors; the results are shown in Fig. 10. It can be seen from the figure that the Euclidean distances between test samples 91-120 and the sun gear fault trained samples are clearly smaller than the others. The reason is that the damage of the sun gear is somewhat more severe than that of the other gears.

Validate the effectiveness of the method

Fig. 11. Diagnosis results using data acquired when the working condition is 1200 rpm and 0.4 Nm

Fig. 16.The diagram of accelerometers installed on the casing of the planetary gearbox

Table 1. The 36 feature parameters

(4) Define the variance factor between different health statuses, taking the changes of the features into account.

Table 2. Planetary gearbox configuration parameters

Table 3. Energy of each IMF and its ratio to that of the original signal
Where the energy value of the original signal is 145223.241.

Table 4. Specific settings for validating the effectiveness of the proposed method

Table 5. The configuration parameters of the planetary gearbox of a helicopter