A fault diagnosis method combined with ensemble empirical mode decomposition, base-scale entropy and clustering by fast search algorithm for roller bearings

A method for roller bearing fault diagnosis based on ensemble empirical mode decomposition (EEMD), base-scale entropy (BSE) and the clustering by fast search (CFS) algorithm is presented in this study. Firstly, the vibration signals under different conditions are decomposed into a number of intrinsic mode functions (IMFs) by the EEMD method, and the correlation coefficient is used to measure the degree of correlation between each IMF and the corresponding original signal. Secondly, the first two IMF components are selected according to the value of the correlation coefficient, and the entropy value of each IMF is calculated by the BSE, permutation entropy (PE), fuzzy entropy (FE) and sample entropy (SE) methods. Thirdly, the elapsed times of the BSE/PE/FE/SE models are compared, and the first two IMF-BSE/PE/FE/SE entropy values are used as the input of the CFS clustering algorithm. The CFS clustering algorithm does not require the number of clustering centers to be preset; the cluster centers are characterized by a higher density than their neighbors and by a relatively large distance from points with higher densities. Finally, the experimental results show that the BSE model is computationally faster than the PE/FE/SE models at the same fault recognition accuracy, and that the CFS method recognizes roller bearing faults well.


Introduction
The major electric machine faults include bearing defects, stator faults, broken rotor bars and end rings, and eccentricity-related faults. The failure of rolling element bearings can result in the deterioration of machine operating conditions. Therefore, it is important to detect the existence and severity of a bearing fault quickly, accurately and easily. Because vibration signals carry a great deal of information about the health condition of mechanical equipment, they are commonly used in the condition monitoring and diagnostics of rotating machinery [1][2][3][4]. Roller bearing fault diagnosis is in fact a pattern recognition process, which includes acquiring information, extracting features and recognizing conditions. The latter two are the key links.
On the one hand, the purpose of feature extraction is to obtain parameters that represent the machine operating conditions and can be used for machine condition identification. Since rolling bearing fault signals are nonlinear and non-stationary, feature extraction based on nonlinear dynamic parameters such as fractal dimension, approximate entropy (AE), sample entropy (SE), fuzzy entropy (FE) and permutation entropy (PE) has been applied to mechanical fault diagnosis, and it has become one of the new approaches to nonlinear time series analysis. Pincus M. proposed the AE method, which was applied to fault diagnosis [5,6]. However, the AE method depends heavily on the length of the data. As a result, the value of AE is uniformly lower than the expected one and lacks relative consistency, especially when the data length is short. To overcome this drawback, Richman J. S. and Moorman J. R. proposed the SE method [7], and SE has been applied successfully to the diagnosis of mechanical failures [8]. Nonetheless, the similarity definition of AE and SE is based on the Heaviside step function, which is discontinuous and changes abruptly at the boundary. As an improvement, fuzzy entropy (FE) was proposed by Chen W. T. et al. [9,10], and it has been widely applied in medical physiological signal processing and mechanical fault diagnosis [11].
Permutation entropy (PE) was presented by Bandt C. and Pompe B. [12] for the complexity analysis of time domain data by comparing neighboring values. It is reported that for some well-known chaotic dynamical systems, PE performs similarly to Lyapunov exponents. The advantages of PE are its simplicity, fast calculation, robustness, and invariance with respect to nonlinear monotonic transformations [12], and it has been successfully applied in mechanical fault diagnosis [13][14][15]. However, the SE and FE methods need to increase the embedding dimension from $m$ to $m + 1$ to construct the vector sequences [6][7][8][9][10][11], and to count the matching vectors at both dimensions (within the similarity tolerance $r$ used in the SE and FE methods) from the distance between each pair of vectors. The PE method needs to sort every $m$-dimensional vector when calculating the probability distribution of the symbol sequences. A method named base-scale entropy (BSE) was proposed in reference [16]. Unlike the PE/FE/SE methods, which need more complicated operations such as sorting, BSE only uses the root mean square of the differences between adjacent points in each $m$-dimensional vector to calculate the base-scale (BS) value [16]. This method is simple and extremely fast for short data sets, and it has been applied in physiological signal processing [17].
The commonly used signal decomposition methods include wavelet analysis [18,19] and empirical mode decomposition (EMD) [20]. However, the wavelet transform requires the wavelet basis and the number of decomposition layers to be chosen, which makes it a non-adaptive signal processing method in nature. The EMD method can decompose a complex signal self-adaptively into several intrinsic mode functions (IMFs) and a residual. To overcome the problem of mode mixing in EMD, ensemble empirical mode decomposition (EEMD) [21], an improved version of EMD, was developed; it self-adaptively decomposes a complicated signal into IMFs based on the local characteristic timescale of the signal. Recently, EEMD has been widely applied in fault diagnosis [13,22]. Considering that the IMFs decomposed by EEMD represent the natural oscillatory modes embedded in the signal, the BSE values of each IMF (IMF-BSE) are extracted as the feature vector to reveal the characteristics of the vibration signals.
After the feature parameters have been extracted with EEMD and BSE, a classifier is needed to diagnose rolling bearing faults automatically. Classifiers such as the support vector machine (SVM) [23] with the particle swarm optimization (PSO) algorithm [24] and the neural network (NN) assume that labeled data sets are available. In practical applications, however, the data sets are usually unlabeled. The fuzzy c-means (FCM) algorithm [25] is a common method for rolling bearing fault diagnosis when the data are unlabeled [26]. The FCM algorithm is suitable for data with a homogeneous structure, but it can only handle spherical data with a standard distance norm. The Gustafson-Kessel (GK) clustering algorithm is an improved FCM algorithm in which an adaptive distance norm and a covariance matrix are introduced. As GK can handle subspace dispersion and scattering along any direction of the data [27], GK clustering has been successfully used in fault diagnosis [28]. However, because the Euclidean distance is used to compute the distance between two samples in the FCM and GK algorithms, they can only handle data with a sphere-like structure. Since the distribution patterns of the data are seldom spherical, the Gath-Geva (GG) clustering algorithm was proposed for this purpose [29]; its distance norm, based on fuzzy maximum likelihood estimation, can reflect the different shapes and orientations of the data structure [30]. Most clustering methods, including the FCM, GK and GG algorithms, require the number of clustering centers to be determined in advance, and this number decides the clustering accuracy. To solve this problem, a method called the clustering by fast search (CFS) algorithm, which does not require the number of clustering centers to be preset, was proposed in [31]. The CFS algorithm aims to classify elements into categories on the basis of their similarity; the cluster centers are characterized by a higher density than their neighbors and a relatively large distance from points with higher densities.
Combining EEMD, BSE and CFS as described above, an intelligent fault detection and classification method for the fault diagnosis of roller bearings is presented with experimental validation. Firstly, the EEMD method is used to decompose the vibration signals under different conditions into a number of IMFs. Secondly, the correlation coefficient between each IMF component and the corresponding original signal is calculated. The BSE/PE/FE/SE methods are then used to calculate the IMF entropy values, and the elapsed times of the BSE/PE/FE/SE methods are compared. Finally, the entropy values of the first two IMFs, selected according to the correlation coefficients, are used as the input of the CFS clustering model to carry out the fault recognition. The experimental results show that the BSE model is computationally faster than the PE/FE/SE models at the same classification accuracy.
The rest of this paper is organized as follows. Section 2 reviews the EEMD, BSE and CFS models. Section 3 gives the fault diagnosis methodology. Section 4 describes the experimental data sources and the parameter selection for the EEMD, BSE, PE, FE, SE and CFS models. Experimental validation is given in Section 5, followed by conclusions in Section 6.

Theoretical framework of EEMD
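EEMD [13,21,22] is a substantial improvement on EMD [20], and its procedure is as follows:
Step 1: Given an original signal $x(t)$, add a random white noise signal $n_j(t)$ to $x(t)$:

$$x_j(t) = x(t) + n_j(t), \tag{1}$$

where $x_j(t)$ is the noise-added signal, $j = 1, 2, 3, \ldots, M$, and $M$ is the number of trials.
Step 2: The noise-added signals $x_j(t)$ are decomposed into a number of IMFs by using EMD as follows:

$$x_j(t) = \sum_{i=1}^{K_j} c_{i,j}(t) + r_j(t), \tag{2}$$

where $c_{i,j}(t)$ is the $i$th IMF of the $j$th trial, $r_j(t)$ is the residue of the $j$th trial, and $K_j$ is the number of IMFs of the $j$th trial.
Step 3: If $j < M$, repeat Step 1 and Step 2, adding a different random white noise signal each time.

Theoretical framework of BSE
Step 2: For a time series $u(i)$, $i = 1, 2, \ldots, N$, the base scale (BS) is defined as the root mean square of the differences between all adjacent points in the $m$-dimensional vector $X(i) = [u(i), u(i+1), \ldots, u(i+m-1)]$. The BS value of each $m$-dimensional vector is calculated as follows:

$$BS(i) = \sqrt{\frac{\sum_{j=1}^{m-1} \left[u(i+j) - u(i+j-1)\right]^2}{m-1}}.$$

Step 3: Each $m$-dimensional vector $X(i)$ is transformed into a symbol sequence $S_i(X(i)) = \{s(i), \ldots, s(i+m-1)\}$, $s \in \{1, 2, 3, 4\}$. The symbol dividing standard is chosen as $a \times BS$. Therefore, the symbolization of the $m$-dimensional vector $X(i)$ is given as:

$$s(i+k) = \begin{cases} 1, & \bar{u}_i < u(i+k) \le \bar{u}_i + a \times BS(i) \\ 2, & u(i+k) > \bar{u}_i + a \times BS(i) \\ 3, & \bar{u}_i - a \times BS(i) < u(i+k) \le \bar{u}_i \\ 4, & u(i+k) \le \bar{u}_i - a \times BS(i) \end{cases}$$

where $i = 1, 2, \ldots, N - m + 1$ and $k = 0, 1, \ldots, m - 1$; $\bar{u}_i$ is the average value of the $i$th vector and $a$ is a constant. The symbols $\{1, 2, 3, 4\}$ are only labels used to compute the statistical probability distribution of the vectors $X(i)$; their numerical values carry no meaning.
Step 4: The probability distribution $P(\pi)$ is calculated over the symbol sequences $S_i(X(i))$. There are $4^m$ different composite states $\pi$, and each state represents a different mode. The calculation of $P(\pi)$ is given as follows:

$$P(\pi) = \frac{\#\{t \mid (u(t), \ldots, u(t+m-1)) \text{ has state } \pi\}}{N - m + 1},$$

where $1 \le t \le N - m + 1$.
Step 5: The BSE is defined as:

$$H(m) = -\sum_{\pi} P(\pi) \log_2 P(\pi).$$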
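The BSE steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `base_scale_entropy` is hypothetical, the assignment of the labels 1 to 4 to the four amplitude ranges is assumed (the labels carry no meaning, so any one-to-one assignment gives the same entropy), and the logarithm is taken to base 2.

```python
import math
from collections import Counter


def base_scale_entropy(u, m=4, a=0.2):
    """Base-scale entropy of the sequence u (embedding dimension m, constant a)."""
    n = len(u)
    if m < 2 or n < m:
        raise ValueError("need len(u) >= m >= 2")
    patterns = Counter()
    for i in range(n - m + 1):
        x = u[i:i + m]                       # m-dimensional vector X(i)
        mean = sum(x) / m
        # Base scale: RMS of the differences between adjacent points of X(i).
        bs = math.sqrt(sum((x[j + 1] - x[j]) ** 2 for j in range(m - 1)) / (m - 1))
        upper, lower = mean + a * bs, mean - a * bs
        symbols = []
        for v in x:                          # label assignment is an assumption
            if v > upper:
                symbols.append(2)
            elif v > mean:
                symbols.append(1)
            elif v > lower:
                symbols.append(3)
            else:
                symbols.append(4)
        patterns[tuple(symbols)] += 1        # one of the 4**m composite states
    total = n - m + 1
    # Shannon entropy of the state distribution.
    return -sum((c / total) * math.log2(c / total) for c in patterns.values())
```

A constant sequence produces a single state and therefore zero entropy, while an irregular sequence spreads its vectors over many states and gives a value of at most $\log_2(N - m + 1)$.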

Theoretical framework of CFS clustering
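The CFS clustering algorithm uses the local density of each point and the distance between points to determine the clustering centers. The steps of the CFS method [31] can be described as follows:
Step 1: For a given data set with $N$ points $X = \{x_1, x_2, \ldots, x_N\}$, the distance between two points $x_i$ and $x_j$ is calculated as follows:

$$d_{ij} = \sqrt{\sum_{k=1}^{D} (x_{ik} - x_{jk})^2}, \tag{9}$$

where $D$ is the dimension of each point.
Step 2: Compute the local density $\rho_i$ of each point $x_i$:

$$\rho_i = \sum_{j \ne i} \chi(d_{ij} - d_c), \qquad \chi(x) = \begin{cases} 1, & x < 0 \\ 0, & x \ge 0 \end{cases} \tag{10}$$

where $\rho_i$ is equal to the number of points that are closer than the cutoff distance $d_c$ to point $x_i$.
Step 3: The parameter $\delta_i$ is measured by computing the minimum distance between the point $x_i$ and any other point with higher density:

$$\delta_i = \min_{j:\, \rho_j > \rho_i} d_{ij}, \tag{11}$$

and for the point with the highest density, $\delta_i = \max_j d_{ij}$. In practice the points are indexed in descending order of the local density $\rho$. Note that $\delta_i$ is much larger than the typical nearest neighbor distance only for points that are local or global maxima in the density.
Step 4: Use the value of $\gamma_i$ to determine the clustering centers:

$$\gamma_i = \rho_i \delta_i. \tag{12}$$

The larger the value of $\gamma_i$, the more likely the point $x_i$ is to become a clustering center; here the values of $\gamma$ are sorted in descending order.
Step 5: Compute the number of data points belonging to each clustering center according to the cutoff distance $d_c$.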
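The CFS steps can be illustrated with a short, pure-Python sketch. It is an assumption-laden illustration rather than the reference implementation of [31]: the function name `cfs_cluster` and the threshold parameter `gamma_min` are hypothetical, ties in density are broken by sample order, and centers are taken to be the points whose $\gamma$ value exceeds the threshold (the highest-density point is always kept as a center). Every remaining point joins the cluster of its nearest neighbor of higher density.

```python
import math


def cfs_cluster(points, dc, gamma_min):
    """Density-peak clustering sketch: returns (labels, centers, gamma)."""
    n = len(points)
    d = [[math.dist(p, q) for q in points] for p in points]
    # Local density: number of other points closer than the cutoff distance dc.
    rho = [sum(1 for j in range(n) if j != i and d[i][j] < dc) for i in range(n)]
    order = sorted(range(n), key=lambda i: -rho[i])   # descending density
    delta = [0.0] * n
    nearest_higher = [-1] * n
    for rank, i in enumerate(order):
        if rank == 0:
            delta[i] = max(d[i])                      # highest-density point
            continue
        # Closest point among those ranked higher in density.
        j = min(order[:rank], key=lambda k: d[i][k])
        delta[i], nearest_higher[i] = d[i][j], j
    gamma = [rho[i] * delta[i] for i in range(n)]
    # Centers: outliers in gamma (threshold is a user-chosen assumption).
    centers = [i for i in order if gamma[i] >= gamma_min or i == order[0]]
    labels = [-1] * n
    for cluster_id, i in enumerate(centers):
        labels[i] = cluster_id
    for i in order:                                   # assign in density order
        if labels[i] == -1:
            labels[i] = labels[nearest_higher[i]]
    return labels, centers, gamma
```

Because points are visited in descending order of density, the neighbor each point inherits its label from has always been labeled already, so a single pass is enough.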

Procedures of the proposed method
An intelligent fault diagnosis strategy based on the EEMD, BSE and CFS models is proposed in this study. The procedure of the proposed system can be summarized as follows:
Step 1: The preprocessed vibration signals under different conditions are decomposed into a number of IMFs by using the EEMD model. The first IMF is the highest-frequency portion of the original signal and the other IMFs follow in descending order of frequency; therefore, the first two IMFs contain the main information of the original signals.
Step 2: Calculate the correlation coefficient between each IMF and the corresponding original signal, which reduces information redundancy.
Step 3: Use the BSE/PE/FE/SE models to calculate the IMF-BSE/PE/FE/SE values, and compare the elapsed times of the BSE/PE/FE/SE models.
Step 4: To allow the data to be visualized and to improve computational efficiency, use the first two IMF-BSE/PE/FE/SE entropy values as the input of the CFS method.
Step 5: Unlike the FCM/GK/GG clustering models, the CFS algorithm does not require the number of clustering centers to be preset; the cluster centers are characterized by a higher density and a relatively large distance. Use the classification accuracy to compare the EEMD-BSE/PE/FE/SE-CFS models.
Fig. 1 shows the structure of the proposed fault diagnosis method.
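Step 2 of the procedure can be sketched as follows, assuming the Pearson correlation coefficient is the measure intended. The helper names `pearson` and `rank_imfs` are hypothetical, and the IMFs are assumed to be available already as equal-length sequences produced by some EEMD implementation.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)   # undefined for constant sequences


def rank_imfs(signal, imfs, keep=2):
    """Return the indices of the `keep` IMFs most correlated with the signal."""
    scores = [pearson(imf, signal) for imf in imfs]
    ranked = sorted(range(len(imfs)), key=lambda k: scores[k], reverse=True)
    return ranked[:keep], scores
```

With `keep=2` this selects two components per signal, which then feed the entropy calculation of Step 3 and give the two-dimensional feature vector used in Step 4.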

Rolling bearing data set
In this subsection the proposed approach is applied to experimental data from the Case Western Reserve University Bearing Data Center [32]. The experimental data were collected using accelerometers mounted at the drive end (DE) and fan end (FE) of an induction motor. The motor bearings under consideration were seeded with faults by electro-discharge machining (EDM). Three single point defects (inner race fault (IRF), outer race fault (ORF) and ball fault (BF)) with a fault diameter of 0.1778 mm were introduced separately. The fault seeded at the outer race was placed at a position equivalent to the 6:00 o'clock position. The data collection system consists of a high bandwidth amplifier particularly designed for vibration signals and a data recorder with a sampling frequency of 12,000 Hz per channel. Table 1 shows the working conditions considered in this study. In Table 1, "NR" denotes bearings with no faults, and "BF", "IRF" and "ORF" denote the ball fault, inner race fault and outer race fault, respectively; the fault diameter is 0.1778 mm. The motor speed was chosen as 1,750 rpm, and the data were taken from the drive end of the motor.

Input vibration signals under different conditions
The

Parameter selection for different methods
(1) EEMD: EEMD has two parameters to be set: the ensemble number $M$ and the amplitude of the added white noise in Eq. (1). Generally speaking, an ensemble number of a few hundred leads to an accurate result, and the remaining noise causes an error of less than a fraction of one percent if the added noise has a standard deviation that is a fraction of the standard deviation of the input signal. The standard deviation of the added white noise is suggested to be about 20 % of the standard deviation of the input signal [13]. The parameter $M$ was set as 100.
(2) BSE: The parameter $m$ in Eq. (4) often varies from 3 to 7 [16,17]. However, a too large $m$ is unfavorable because it requires a very long series, $N \ge 4^m$, which is generally hard to meet and leads to loss of information. The parameter $a$ in Eq. (3) is a constant. Typically, the parameter $a$ is fixed in the range 0.1-0.4 [16,17]; a larger value allows more detailed reconstruction of the dynamic process, whereas a smaller value is more affected by noise. In this paper, the parameters $a$ and $m$ are set as 0.2, 0.3 and 4, 5, respectively [16,17].
(3) PE: A few parameters affect the calculation of the PE value, namely the embedding dimension $m$ and the time delay $\tau$. Bandt C. and Pompe B. [12] recommended selecting an embedding dimension $m$ = 3-7 [13][14][15]. In practical calculation, when $m < 3$, PE cannot detect the dynamic changes of the mechanical vibration signals exactly. On the other hand, when $m > 8$, the reconstruction of the phase space homogenizes the vibration signals, and PE is not only computationally expensive but also hard to observe because of its small range of variation. When the time delay $\tau > 5$, the computational results cannot exactly detect small changes in the signals. The time delay has only a small influence on the PE calculation of the time series [13]. Therefore, in this study, the parameters $m$ and $\tau$ are set as 4, 5, 6 and 1, respectively, to calculate the PE values of the vibration signals.
(4) SE and FE: Three parameters must be selected before the calculation of SE and FE. The first parameter, the embedding dimension $m$, is the length of the sequences to be compared. Typically, a larger $m$ allows more detailed reconstruction of the dynamic process [7,13,30]. However, a too large $m$ is unfavorable because it requires a very long series, $N = 10^m - 30^m$, which is generally hard to meet and leads to loss of information. Generally speaking, $m$ is fixed as 2 [7,13,33]. The similarity tolerance $r$ and the parameter $n$ determine the width and the gradient of the boundary of the exponential function, respectively. For the FE similarity boundary determined by $r$ and $n$, a too narrow setting results in a salient influence from noise, while a too broad boundary, as mentioned above, should be avoided for fear of information loss. It is convenient to set the width of the boundary as $r$ multiplied by the standard deviation (SD) of the original data set. Experimentally, $r$ = (0.1-0.25)×SD [7,10,11,33]; the parameter $r$ is set as 0.2 SD for SE and FE in this paper. Finally, the parameter $n$ is often fixed to 2 [7,10,11,33].
(5) CFS: The cutoff distance $d_c$ is the parameter to be set. As suggested in [31], the cutoff distance is chosen so that the average number of neighbors is 1 % to 2 % of the total number of points in the data set. A large $d_c$ makes the local density $\rho$ of every point high, whereas a too small $d_c$ causes a single cluster to be divided into many clusters. The parameter $d_c$ is therefore set as 1.5 % in this paper.
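The percentage rule for the cutoff distance in item (5) can be made concrete with a small sketch. It assumes one common way of applying the rule: sort all pairwise distances and take the value at the requested percentile, so that roughly that percentage of all point pairs lie within the cutoff. The function name `choose_cutoff` and its `percent` argument are hypothetical.

```python
import math


def choose_cutoff(points, percent=1.5):
    """Pick the cutoff distance as the `percent` percentile of pairwise distances."""
    n = len(points)
    dists = sorted(
        math.dist(points[i], points[j])
        for i in range(n)
        for j in range(i + 1, n)
    )
    # Position of the percentile in the sorted list (at least the first entry).
    position = max(1, round(len(dists) * percent / 100.0))
    return dists[position - 1]
```

For 200 samples there are 19,900 point pairs, so a setting of 1.5 % selects the distance at position 298 or 299 of the sorted list (depending on rounding), and each point then has about three neighbors on average.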

Experimental results and analysis
The data are selected from the experiments in which SKF bearings are used. The approximate motor speed is 1,750 rpm. The data set consists of 200 data samples in total, with 50 data samples under each condition, and every data sample has 2048 data points. The samples are numbered as follows: NR: 1-50, IRF: 51-100, BF: 101-150, ORF: 151-200. Owing to limited space, one sample of each state is taken as an example; the time domain waveforms of the vibration signals under different working conditions are shown in Fig. 2.
The vertical axis is the acceleration vibration amplitude. Because of the influence of noise, it is difficult to find significant differences between the states. As shown in Fig. 2, it is hard to distinguish the four signals; in particular, there is no obvious regularity in the NR and BF signals.
After EEMD decomposition, the original signals in Table 1 are decomposed into IMF1-IMF10 and a residue. Fig. 3 shows the EEMD decomposition results of the vibration signals, which contain 10 IMF components and a residue (owing to limited space, one sample of each state is shown as an example).

Table 2. The correlation values of each IMF (value of the correlation coefficient)

| Mode | IMF1 | IMF2 | IMF3 | IMF4 | IMF5 | IMF6 | IMF7 | IMF8 | IMF9 | IMF10 |
|---|---|---|---|---|---|---|---|---|---|---|
| NR | 0.5632 | 0.6771 | 0.4179 | 0.3904 | 0.4336 | 0.2547 | 0.1415 | 0.0436 | 0.0067 | 0.0178 |
| IRF | 0.8824 | 0.4346 | 0.2677 | 0.1168 | 0.0412 | 0.0307 | 0.0032 | 0.0001 | 0.0005 | -0.0001 |
| BF | 0.9585 | 0.2083 | 0.1899 | 0.1225 | 0.0780 | 0.0462 | 0.0059 | 0.0046 | 0.0021 | 0.0010 |
| ORF | 0.9858 | 0.1248 | 0.0675 | 0.0304 | 0.0106 | 0.0044 | 0.0006 | 0.0004 | 0.0005 | 0.00001 |

As shown in Table 2, the correlation values of the first two IMF components are higher than those of the other components, so they contain the main information of the original signals. As shown in Fig. 4, the first two IMF-BSE/PE/FE/SE entropy values are higher than those of the other IMF components, because the EEMD method decomposes the original signals into components ordered from high frequency to low frequency. Compared with the IMF-SE values, the decreasing tendency of the IMF-BSE/PE/FE values is clearer on the whole, because the SE method uses a hard threshold to measure the similarity of two $m$-dimensional vectors. FE introduces a fuzzy exponential function to measure the similarity, which smooths the result, but the IMF-FE entropy values are still less smooth than those of the PE and BSE methods. The BSE/PE methods count the probability $P(\pi)$ of the pattern of each $m$-dimensional vector before computing the entropy values, and the original signals are decomposed by the EEMD model into a series of IMFs in descending order of frequency. Therefore, the probability $P(\pi)$ for each $m$-dimensional vector is close to a fixed value for each IMF, and the BSE and PE values are more continuous and smooth than those of the FE/SE models. The SE and FE methods use the distance between vectors, compared with the similarity tolerance (SE) or passed through the fuzzy exponential function (FE), to compute the corresponding entropy values, so random abrupt changes exist in the IMF-FE/SE entropy values, which is consistent with Fig. 4(h) and Fig. 4(i). This indicates that the BSE/PE methods can detect small changes in the signal.
However, the BSE model is computationally faster than the PE model. The elapsed times over all samples for BSE, PE, FE and SE are given in Table 3. As shown in Table 3, when $m = 5$ and $a = 0.2$, the largest total and average elapsed times of the BSE method are 53.0125515 seconds and 0.26506258 seconds, which are smaller than those of the PE, FE and SE methods. The reasons why the elapsed time of the BSE method is the lowest in Table 3 are as follows:
(1) For a given original signal with $N$ points, reconstruction yields $(N - m + 1)$ $m$-dimensional vectors [7,12,16], where $m$ is the embedding dimension in the BSE/PE/SE/FE methods. The BSE and PE methods take the same time to reconstruct the original signal [12,16], but the SE method requires two reconstruction operations [7]. Therefore, the elapsed time of the SE method is larger than that of the BSE and PE methods.
(4) The steps for SE and FE are described in detail in references [7][8][9][10][11]. They compute the distance $d_{ij}$, the maximum absolute difference between the corresponding elements of two vectors, and test the condition $d_{ij} \le r$ [7], where $r$ is the similarity tolerance. This requires comparison operations ("<" and "=") over $(N - m)(N - m + 1)$ cycles. The following step counts the number of matches [7]; the numbers of addition, multiplication and division operations are each $(N - m + 1)$. Finally, $m$ is increased to $m + 1$ and the previous steps are repeated; the division and logarithm operations needed to calculate the SE value are each performed once [7]. FE, which builds on the SE method, imports the concept of fuzzy sets and employs the exponential function $\exp(-d_{ij}^{\,n} / r)$ as the fuzzy function to obtain a fuzzy measurement of the similarity of two vectors [9,10]. Therefore, the total cycle number of basic operations for FE is close to that of SE. Based on the above, the total numbers of addition, subtraction, multiplication, division and comparison operations of the BSE/PE/SE/FE models are given in Table 4.
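An elapsed-time comparison of this kind can be sketched as follows. The sketch is illustrative only: `permutation_entropy` is a plain implementation of the ordinal-pattern idea of [12] and `elapsed` is a hypothetical timing helper built on the standard-library clock. Timings obtained this way depend on the language and machine, so they are not expected to reproduce the values in Table 3.

```python
import math
import time
from collections import Counter


def permutation_entropy(u, m=4, tau=1):
    """Permutation entropy (in bits) with embedding dimension m and delay tau."""
    n_vec = len(u) - (m - 1) * tau
    if n_vec < 1:
        raise ValueError("sequence too short for the chosen m and tau")
    counts = Counter()
    for i in range(n_vec):
        window = [u[i + k * tau] for k in range(m)]
        # Ordinal pattern: indices of the window sorted by value (the sorting
        # step that the text identifies as the extra cost of PE).
        pattern = tuple(sorted(range(m), key=window.__getitem__))
        counts[pattern] += 1
    return -sum((c / n_vec) * math.log2(c / n_vec) for c in counts.values())


def elapsed(func, *args, repeats=5, **kwargs):
    """Best wall-clock time in seconds of `repeats` calls to func."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args, **kwargs)
        best = min(best, time.perf_counter() - start)
    return best
```

Calling `elapsed(permutation_entropy, sample, m=5)` on a 2048-point sample, and repeating the call with any other entropy function, gives per-sample times that can be compared in the same way as the per-sample averages in Table 3.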
It can be seen that the total number of operations of the BSE method in Table 4, which scales with $(N - m + 1)$, is smaller than that of the PE/SE methods because $m \ge 2$ (here $N = 2048$ in this paper). Therefore, the BSE method is faster than the PE/SE methods.
In Table 5 and Fig. 5, the symbol 'CC' denotes the clustering center of each cluster. The label "NR-13" in Fig. 5(3) means that the clustering center of the NR signal is sample 13. It can be seen from Fig. 5(1), Fig. 5 (26) that the $\gamma$ values of four outlier points are higher than those of the other points, such as the 13th point in the NR signal; the clustering centers are chosen according to the value of $\gamma$ in Eq. (12). In [28], the authors suggest that these outlier clustering centers show a jump relative to all other points, such as the jump from the 13th point to the 128th point in Fig. 5(2) and the jump from the 177th point to the normal points. Unlike the clustering centers, the normal points vary smoothly in Fig. 5(2). The classification accuracy rates under different models are given in Table 6. As shown in Table 6, the highest accuracy reaches 100 %, which indicates that the CFS clustering algorithm performs well in solving the fault recognition problem. The original signals are decomposed into a series of IMFs, of which the first two IMFs are the highest-frequency portion of the original signals [31] and the other IMFs follow in descending order; therefore, the first two IMFs contain the main information of the original signals. The IMF-BSE/PE/FE/SE values are given in Fig. 4.
It can be seen that the IMF-BSE/PE/FE/SE entropy values are in descending order. At the same time, the degree of correlation between all IMFs and the original signals is measured by the correlation coefficient, and Table 2 shows that the correlation values are also in descending order. To allow the data to be visualized and to reduce information redundancy, the first two IMF-BSE/PE/FE/SE values are used as the input of the CFS clustering model. The irregularity of the NR vibration signal is higher than that of the other three kinds of signals; compared with the NR signal, the fault signals vibrate with more regularity, especially the IRF and ORF signals. However, the regularity and self-similarity of the BF signal are weaker than those of the IRF and ORF signals, so the irregularity of the BF signal is the highest among the three fault signals. Therefore, the first two IMF-BSE/PE/FE/SE values are different in Fig. 4.
The CFS clustering algorithm uses the local density and the distance between points to determine the clustering centers. According to [28], the larger the value of $\gamma$ for the $i$th sample point, the more likely the $i$th sample point is to become a cluster center, and the value of $\gamma$ shows an obvious jump at the transition from cluster centers to non-center points, as in Fig. 5(2), Fig. 5(5), Fig. 5 (26). The entropy values of samples of the same signal are similar, and the entropy values of samples of different signals are not the same; therefore, the value of $\gamma$ discriminates each center point well. The CFS algorithm is based on the assumptions that cluster centers are surrounded by neighbors with lower local density and that they are at a relatively large distance $\delta$ from any point with a higher local density $\rho$. After the cluster centers have been found, each remaining point is assigned to the same cluster as its nearest neighbor of higher density. As a result, the fault recognition accuracy of the CFS clustering algorithm is good.
As mentioned above, the fault diameter and motor speed of the roller bearings in Table 1 are 0.1778 mm and 1750 rpm. Experimental data with a fault diameter of 0.5334 mm and a motor speed of 1797 rpm are then used to carry out fault recognition with the EEMD-BSE/PE/FE/SE-CFS models. The classification accuracy rates under different models are given in Table 7, and the 2-dimension clustering results are shown in Fig. 6.
The symbol 'CC' denotes the clustering center points, and the symbol "NR-29" denotes the 29th sample point, which is regarded as the NR cluster center point in the first sub-figure of Fig. 6 (the sample numbers are as follows: NR: 1-50, IRF: 51-100, BF: 101-150, ORF: 151-200). As shown in Table 7, the highest accuracy also reaches 100 %. The classification accuracy rates obtained with the EEMD-BSE-FCM/GK/GG models are given in Table 8, and the 2-dimension clustering results are shown in Fig. 7.
It can be seen that the best total accuracy in Table 8 is 100 %, the same as that obtained with the EEMD-BSE/PE/SE/FE-CFS models. However, the CFS algorithm does not require the number of clustering centers to be preset, since the cluster centers are characterized by a higher density than their neighbors and by a relatively large distance from points with higher densities, whereas the FCM/GK/GG models require the number of clustering centers to be preset.
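The accuracy figures discussed above can be computed along the following lines. This is a sketch under two assumptions that the text implies but does not spell out: each cluster is given the class of its center sample (as in the labels "NR-13" and "NR-29"), and the true class of a sample follows from the sample numbering NR: 1-50, IRF: 51-100, BF: 101-150, ORF: 151-200. The function names are hypothetical.

```python
CLASS_NAMES = ["NR", "IRF", "BF", "ORF"]


def true_class(sample_number):
    """Class of a 1-based sample number, 50 consecutive samples per class."""
    return CLASS_NAMES[(sample_number - 1) // 50]


def clustering_accuracy(labels, centers):
    """Percentage of samples whose cluster center has the sample's own class.

    labels[i]  : cluster id assigned to sample i (0-based index)
    centers[c] : 0-based index of the center sample of cluster c
    """
    correct = 0
    for index, cluster in enumerate(labels):
        predicted = true_class(centers[cluster] + 1)
        if predicted == true_class(index + 1):
            correct += 1
    return 100.0 * correct / len(labels)
```

With 200 samples, a single misassigned sample lowers the total accuracy from 100 % to 99.5 %.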

Fig. 1. Flow chart of the proposed method

Fig. 5. The results of the local density $\rho$, distance $\delta$, and the 2-dimension clustering for all samples

As shown in Fig. 4, the first two IMF-BSE/PE/FE/SE entropy values are higher than those of the other IMF components. The first two IMF-BSE/PE/FE/SE values were selected as the input of the CFS model according to the correlation values in Table 2. The results of the local density $\rho$ and distance $\delta$ of the clustering centers are given in Table 5, and the plots of the local density $\rho$, distance $\delta$ and 2-dimension clustering are shown in Fig. 5.

A FAULT DIAGNOSIS METHOD COMBINED WITH ENSEMBLE EMPIRICAL MODE DECOMPOSITION, BASE-SCALE ENTROPY AND CLUSTERING BY FAST SEARCH ALGORITHM FOR ROLLER BEARINGS. FAN XU, YAN JUN FANG, RONG ZHANG, ZHENG MIN KONG, RUO LI TANG

Fig. 6. The 2-dimension clustering for all samples by using CFS model

Table 1. The rolling bearing experimental data under different conditions

Table 3. The elapsed time of each sample by using BSE, PE, FE and SE methods

Table 4. The total number of basic operations of BSE/PE/SE models

Table 5. The value of local density $\rho$, distance $\delta$ and $\gamma$

Table 6. The classification accuracy rate under different models

Table 7. The classification accuracy rate under different models

Table 8. The classification accuracy rate under different models