Classification of machinery vibration signals based on group sparse representation

The working condition of mechanical equipment can be reflected by the vibration signals collected from it. Accurate classification of these vibration signals is helpful for machinery fault diagnosis. In recent years, the L1-norm regularization based sparse representation for classification (SRC) has achieved great success in image recognition, especially in face recognition. However, an investigation of SRC for machinery vibration signals shows that its accuracy and sparsity concentration index (SCI) are not high enough. In this paper, a new classification method for machinery vibration signals is proposed, in which the L1L2-norm regularization based sparse representation, i.e. group sparse representation, is recommended as the coding strategy. The method achieves its ideal classification performance in three steps. Firstly, time-domain vibration signals, including training and test samples, are transformed to the frequency domain to reduce the influence of corrupting noise. Then, the transform coefficient vectors of the test samples are coded with a combination of L1-norm and L2-norm constraints on a dictionary, which is constructed by merging the transform coefficient vectors of the training samples. Finally, the fault types of the test samples are labeled by identifying their minimal reconstruction errors. The classification results for simulated and experimental vibration signals demonstrate the superiority of the proposed method in comparison with state-of-the-art classifiers.


Introduction
In the field of machinery fault diagnosis, vibration analysis is one of the most common and reliable methods [1]. It uses advanced signal processing methods to extract fault information from raw vibration signals, which are collected by vibration sensors installed on the machinery, and then makes a diagnosis according to the fault information. In general, vibration analysis based diagnosis techniques can be classified into two categories: the feature frequency recognition based method (FFRM) and the classification-based diagnosis method (CDM).
The basic idea of FFRM is to use fault feature frequencies to determine the fault type. Specifically, the fault feature component is extracted from the raw vibration signal by an effective signal processing method, and its dominant frequency is then checked against the known fault feature frequencies. Fourier transform based frequency-domain analysis and envelope spectrum analysis, as fundamental tools for FFRM, are usually applied after the extraction of the fault feature to analyze its frequency components. Although many existing fault diagnosis techniques based on FFRM have achieved decent results, the feature frequency cannot be known in some cases because the rotating frequency or the parameters of the mechanical parts are difficult to obtain, which limits the broader application of FFRM.
The basic idea of CDM is to use training samples to establish a diagnostic decision maker and to determine the fault type of test samples according to its output. Compared with FFRM, CDM makes its diagnosis from the relationship among samples rather than from the feature frequency of a single sample. In general, it includes three steps: preprocessing, feature extraction and classification [2]. In the preprocessing step, vibration signal collection and rejection are conducted. Other measures might also be implemented at this stage, such as synchronization [3] and pre-whitening [4]. In the feature extraction step, a meaningful feature vector is extracted from the vibration signal by using a dimensionality reduction method. The feature vector should be distinguishable for different classes. Another important step is classification. The purpose of classification is to translate the extracted feature of a sample into an output that labels the fault type of the sample. Frequently used classification methods in the field of fault diagnosis are linear discriminant analysis (LDA), artificial neural network (ANN) and support vector machine (SVM). LDA, as a basic Fisher discriminant classifier, pursues a low degree of coupling between classes and a high degree of aggregation within each class. ANN can realize a nonlinear mapping between symptoms and faults. However, the network configuration, which seriously influences classification results, is difficult to determine. SVM, as another linear classifier, is a machine learning method based on statistical learning theory, and produces a favorable generalization performance [5]. However, larger sample sets and multiclass problems lead to a sharp increase in training time and computational complexity [6].
FAJUN YU, FENGXING ZHOU
In recent years, a new classification technique, i.e. SRC, has been proposed in the field of pattern recognition [7][8]. Its basic principle is to sparsely code a test sample over a dictionary and then to perform the classification based on the reconstruction error. Since its appearance, SRC and its variants have been widely applied in face recognition, EEG signal classification, music genre classification and hyper-spectral image classification. In the field of fault diagnosis, SRC has rarely been studied. A typical application with a good result appeared in [5], where compressive sensing theory was applied to reduce the dimension of the original vibration signals and SRC was used to classify the low-dimensional signals. In this paper, we take advantage of group sparse representation (GSR) and explore a new classification method for machinery vibration signals. In the proposed method, time-domain vibration signals are first translated into the frequency domain; then, GSR is employed in the coding stage and the categories of the test samples are labeled by identifying their minimal reconstruction errors. The results of a simulation analysis and two experiments demonstrate that the classification performance of the proposed method is superior to that of SRC and SVM.
The remainder of the paper is organized as follows. In Section 2, the basic idea of SRC is reviewed as background for the proposed method. Section 3 presents the model of GSR and gives an efficient solving algorithm. In Section 4, a classification method based on GSR for machinery vibration signals is proposed. Then, a simulation analysis and two experiments are provided to verify the proposed method in Section 5. Next, some discussions about the proposed method are presented in Section 6. Finally, Section 7 concludes the paper.

Basic idea of sparse representation-based classification
Sparse representation theory states that a signal can be well represented as a linear combination of atoms in a redundant dictionary. Given an $m \times n$ matrix $D$ that contains the training signal samples as its column vectors and a test sample $y \in \mathbb{R}^m$, the problem of sparse representation is to solve the following optimization model:

$$\hat{\alpha} = \arg\min_{\alpha} \|\alpha\|_0 \quad \text{s.t.} \quad \|y - D\alpha\|_2 \le \delta, \tag{1}$$

where $\alpha$ is an $n \times 1$ coefficient vector, $\|\alpha\|_0$ denotes the L0-norm, which counts the number of non-zero entries in $\alpha$, and $\delta > 0$ is the noise level parameter. It is shown in [9] that problem Eq. (1) is NP-hard and difficult to solve exactly. An approximate solution can be obtained by relaxing the L0-norm constraint to the L1-norm constraint as follows:

$$\hat{\alpha} = \arg\min_{\alpha} \|\alpha\|_1 \quad \text{s.t.} \quad \|y - D\alpha\|_2 \le \delta, \tag{2}$$

where $\|\alpha\|_1 = \sum_j |\alpha_j|$. Problem Eq. (2) is a typical convex optimization problem, which can be solved efficiently with existing convex optimization methods.

In recent years, the L1-norm constraint based sparse representation has been widely used in the field of classification. It assigns a test signal sample to a class in two steps, i.e. sparse coding the sample over the training signal set and then performing the classification based on the reconstruction error as follows:

$$\mathrm{identity}(y) = \arg\min_{i} r_i(y), \tag{3}$$

where $r_i(y) = \|y - D_i \hat{\alpha}_i\|_2$, and $D_i$ and $\hat{\alpha}_i$ are the sub-set of the training samples and the coefficient sub-vector associated with class $i$, respectively. Since SRC appeared, it has yielded good results in the field of image recognition, especially in face recognition. However, it has two shortcomings. One is that its L1-norm minimization is slow and computationally expensive to solve [10]. The other is that L1-norm minimization cannot select a sparse group of correlated samples [11] (in the limiting case, it selects only a single sample from all the correlated samples). In classification problems, the training samples from each class are usually highly correlated. Therefore, L1-norm minimization is not an ideal choice for ensuring the selection of all the training samples from a group.
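The two steps of SRC, sparse coding followed by a class-wise residual comparison, can be sketched as follows. This is a minimal illustration and not the solver used in the paper: it assumes the unconstrained (Lagrangian) form of Eq. (2) with a hypothetical weight `lam`, solves it with plain iterative soft-thresholding, and the function names `ista_l1` and `src_classify` are introduced here for illustration only.

```python
import numpy as np

def ista_l1(D, y, lam=0.05, n_iter=500):
    """Approximately solve min_a 0.5*||y - D a||_2^2 + lam*||a||_1 (assumed Lagrangian form)."""
    L = np.linalg.norm(D, 2) ** 2            # Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        v = a - D.T @ (D @ a - y) / L        # gradient step on the quadratic loss
        a = np.sign(v) * np.maximum(np.abs(v) - lam / L, 0.0)  # soft-thresholding
    return a

def src_classify(D, labels, y, lam=0.05):
    """Code y over D, then pick the class with the smallest reconstruction error."""
    a = ista_l1(D, y, lam)
    classes = np.unique(labels)
    residuals = np.array([
        np.linalg.norm(y - D[:, labels == c] @ a[labels == c]) for c in classes
    ])
    return classes[np.argmin(residuals)], residuals

# Toy usage: two classes, five noisy copies of one prototype per class.
rng = np.random.default_rng(0)
proto = rng.standard_normal((2, 20))
D = np.column_stack([proto[c] + 0.05 * rng.standard_normal(20)
                     for c in (0, 1) for _ in range(5)])
D /= np.linalg.norm(D, axis=0)               # unit-norm atoms
labels = np.repeat([0, 1], 5)
y = proto[0] / np.linalg.norm(proto[0])
label, residuals = src_classify(D, labels, y)
```

The residual of the true class is computed only from the atoms and coefficients of that class, which is why a coefficient vector that spreads over several classes weakens the decision.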

Group sparse representation model and its solving algorithm
To overcome the problem of SRC, GSR [12] is advocated in the field of classification for promoting selection of the entire class of training samples.

Coefficients solving algorithm
The regularized least squares model obtained from Eq. (6) is expressed as:

$$\min_{\alpha} \; \|y - D\alpha\|_2^2 + \gamma \sum_{i=1}^{K} \|\alpha_i\|_2, \tag{7}$$

where $\gamma \in [0, 1]$ is the regularization parameter and $\alpha_i$ is the sub-vector of $\alpha$ corresponding to the $i$th group. Problem Eq. (7) is a group lasso model [24] and can be solved by several existing techniques, such as Block-coordinate Descent (BCD) [15], the spectral projected gradient method (SPGL1) [16] and Sparse Learning with Euclidean projection (SLEP) [17]. For its computational speed and global convergence property, we choose SLEP as the basic tool for solving problem Eq. (7). The algorithm contains the following steps [18]. Firstly, $f(\alpha) = \|y - D\alpha\|_2^2$ is denoted as a smooth convex loss function and the following model is constructed for approximating the objective function $F(\cdot)$ of Eq. (7):

$$h_{L,s}(\alpha) = f(s) + \langle f'(s), \alpha - s \rangle + \gamma \sum_{i=1}^{K} \|\alpha_i\|_2 + \frac{L}{2} \|\alpha - s\|_2^2, \tag{8}$$

where the first-order Taylor expansion at the point $s$ is applied to the smooth loss function $f(\cdot)$ and the regularization term $\frac{L}{2}\|\alpha - s\|_2^2$ prevents $\alpha$ from moving far away from $s$.
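Secondly, the Armijo Goldstein line search scheme and accelerated gradient descent are used to search for the approximate solution. Specifically, suppose $\alpha_k$ is the approximate solution of the $k$th iteration; the search point is then obtained as:

$$s_k = \alpha_k + \beta_k (\alpha_k - \alpha_{k-1}), \tag{9}$$

where $\beta_k$ is a tunable coefficient. Then the approximate solution of the $(k+1)$th iteration is computed as the minimizer of $h_{L_k, s_k}(\alpha)$, i.e.:

$$\alpha_{k+1} = \arg\min_{\alpha} h_{L_k, s_k}(\alpha), \tag{10}$$

where $L_k$ is determined by the Armijo Goldstein line search rule and keeps $\alpha_{k+1}$ in the neighborhood of $s_k$. Problem Eq. (10) can be solved as $\alpha_{k+1} = \pi(s_k - f'(s_k)/L_k, \gamma/L_k)$, where $\pi(\cdot)$ is the L1L2-norm Euclidean projection problem expressed as:

$$\pi(v, \lambda) = \arg\min_{x} \frac{1}{2} \|x - v\|_2^2 + \lambda \sum_{i=1}^{K} \|x_i\|_2. \tag{11}$$

Problem Eq. (11) decouples into a set of independent L2-norm problems, i.e.:

$$\min_{x_i} \frac{1}{2} \|x_i - v_i\|_2^2 + \lambda \|x_i\|_2, \tag{12}$$

where $x_i$ and $v_i$ denote the sub-vectors of $x$ and $v$ corresponding to the $i$th group ($i = 1, 2, \ldots, K$), respectively. The solution of problem Eq. (12) is:

$$x_i = \max\left(1 - \frac{\lambda}{\|v_i\|_2}, 0\right) v_i. \tag{13}$$

Thirdly, the parameter $\beta_k$ is updated as:

$$\beta_{k+1} = \frac{t_k - 1}{t_{k+1}}, \tag{14}$$

where $t_{k+1} = (1 + \sqrt{1 + 4 t_k^2})/2$. The three steps are repeated until the objective function $F(\cdot)$ converges to a minimum value. To present the solving algorithm of GSR more clearly, we summarize it in Table 1.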
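The iteration described above can be sketched in a few lines of NumPy. This is a hedged illustration rather than the SLEP implementation: it assumes the loss $\|y - D\alpha\|_2^2$ without a 1/2 factor, replaces the Armijo Goldstein line search by a fixed step $1/L$ with $L$ set to the Lipschitz constant of the gradient, and uses the hypothetical names `group_soft_threshold`, `objective` and `group_lasso_apg`.

```python
import numpy as np

def group_soft_threshold(v, groups, lam):
    """Group-wise shrinkage: x_i = max(1 - lam/||v_i||_2, 0) * v_i."""
    x = np.zeros_like(v, dtype=float)
    for g in np.unique(groups):
        idx = groups == g
        norm = np.linalg.norm(v[idx])
        if norm > lam:                       # otherwise the whole group stays zero
            x[idx] = (1.0 - lam / norm) * v[idx]
    return x

def objective(D, y, a, groups, gamma):
    """Squared residual plus gamma times the sum of group L2-norms."""
    penalty = sum(np.linalg.norm(a[groups == g]) for g in np.unique(groups))
    return np.sum((y - D @ a) ** 2) + gamma * penalty

def group_lasso_apg(D, y, groups, gamma=0.5, tol=1e-3, max_iter=1000):
    """Accelerated proximal gradient with a fixed step (assumption: no line search)."""
    L = 2.0 * np.linalg.norm(D, 2) ** 2      # Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    s, t = a.copy(), 1.0
    obj_prev = objective(D, y, a, groups, gamma)
    for _ in range(max_iter):
        grad = 2.0 * D.T @ (D @ s - y)       # gradient of the loss at the search point
        a_new = group_soft_threshold(s - grad / L, groups, gamma / L)
        t_new = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        s = a_new + ((t - 1.0) / t_new) * (a_new - a)   # next search point
        a, t = a_new, t_new
        obj = objective(D, y, a, groups, gamma)
        if abs(obj_prev - obj) < tol:        # stop when the objective stalls
            break
        obj_prev = obj
    return a

# Toy usage: the test vector is built from the atoms of group 0 only.
rng = np.random.default_rng(0)
D = rng.standard_normal((30, 10))
D /= np.linalg.norm(D, axis=0)
groups = np.repeat([0, 1], 5)
y = D[:, :5] @ np.ones(5)
a_hat = group_lasso_apg(D, y, groups, gamma=0.1, tol=1e-8)
shrunk = group_soft_threshold(np.array([3.0, 4.0, 0.1, 0.1]), groups[[0, 1, 5, 6]], 1.0)
```

Because the shrinkage acts on whole groups, a class whose atoms contribute little is set to zero as a block, which is the behaviour that distinguishes this coding step from element-wise L1 shrinkage.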

The proposed classification method for machinery vibration signals
Generally, vibration signals of a machinery system are collected at a high sampling frequency and are contaminated by noise. In addition, due to changes in speed, load and other factors over time, the signals are non-stationary. This means that many statistics of vibration signals are time-variant, e.g. the instantaneous values and waveform will be significantly different between two samples with different sampling start points, even if they are of the same fault-type. Therefore, in the implementation of classification based on GSR, if a dictionary is directly constructed from the vibration signals in the time domain, the classification performance is susceptible to the start points and noise intensity of the samples.
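In order to overcome the above problem, we first translate the vibration signal samples into the frequency domain, because samples of the same fault-type have similar spectrum distributions. Here, the DFT is adopted as follows:

$$X(k) = \sum_{t=0}^{N-1} x(t)\, e^{-j 2\pi k t / N}, \quad k = 0, 1, \ldots, N-1, \tag{15}$$

where $x \in \mathbb{R}^M$ denotes a vibration signal sample in the time domain, $N$ ($N \le M$) is the total number of DFT points and $X \in \mathbb{C}^N$ is the DFT coefficient sequence of $x$. The modulus of each coefficient is used as an entry to construct the dictionary $D \in \mathbb{R}^{N \times n}$ from the $n$ training samples, i.e.:

$$D = \big[\, |X_1|, |X_2|, \ldots, |X_n| \,\big], \tag{16}$$

where $|X_j|$ is the vector of DFT coefficient moduli of the $j$th training sample. Then, the test sample is translated into the frequency domain in the same way and its frequency-domain vector $y \in \mathbb{R}^N$ is obtained as the modulus of its DFT coefficient sequence. Next, the algorithm in Table 1 is utilized to solve the following problem:

$$\hat{\alpha} = \arg\min_{\alpha} \; \|y - D\alpha\|_2^2 + \gamma \sum_{i=1}^{K} \|\alpha_i\|_2. \tag{17}$$

Finally, the category of $y$ is labeled based on the minimal reconstruction error:

$$\mathrm{identity}(y) = \arg\min_{i} r_i(y), \tag{18}$$

where $r_i(y) = \|y - D_i \hat{\alpha}_i\|_2$, and $D_i$ and $\hat{\alpha}_i$ are the sub-set of the dictionary and the coefficient sub-vector associated with class $i$, respectively.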
The flowchart of the proposed classification method for machinery vibration signals is shown in Fig. 1. It includes the following main steps:
Step 1: Perform the DFT to translate all the vibration signal samples, including the test sample and the training samples, into the frequency domain;
Step 2: Construct a dictionary with the modulus values of the DFT coefficients of the training samples;
Step 3: Project the modulus values of the DFT coefficients of the test sample on the dictionary and utilize the algorithm in Table 1 to get the group sparse coefficients;
Step 4: Categorize the test sample based on the minimal reconstruction error and determine its fault-type.
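Steps 1 and 2 can be illustrated with a short NumPy sketch, which also shows why the spectrum modulus removes the dependence on the sampling start point. The helper names `dft_modulus` and `build_dictionary` are hypothetical, and truncating the sample to its first N points when N is smaller than the sample length is an assumption of this sketch; the coding and labeling steps are left to any group lasso solver.

```python
import numpy as np

def dft_modulus(x, n_points=None):
    """Modulus of the N-point DFT of x (assumption: x is truncated to its first N samples)."""
    x = np.asarray(x, dtype=float)
    n = len(x) if n_points is None else n_points
    return np.abs(np.fft.fft(x[:n], n=n))

def build_dictionary(train_signals, n_points=None):
    """Stack the spectrum-modulus vectors of the training samples as dictionary columns."""
    return np.column_stack([dft_modulus(x, n_points) for x in train_signals])

# A circular shift changes the waveform but not the DFT modulus.
rng = np.random.default_rng(0)
t = np.arange(2048) / 2048.0
x = np.sin(2 * np.pi * 60 * t) + 0.5 * rng.standard_normal(2048)
x_shifted = np.roll(x, 50)                   # same signal, different start point
feature = dft_modulus(x)
feature_shifted = dft_modulus(x_shifted)

train = [x + 0.1 * rng.standard_normal(2048) for _ in range(4)]
D = build_dictionary(train, n_points=1024)   # N = half of the sample length
```

The exact equality of the two feature vectors holds for a circular shift with N equal to the sample length; for real segments cut at different start points the spectra are only approximately equal.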
1989.CLASSIFICATION OF MACHINERY VIBRATION SIGNALS BASED ON GROUP SPARSE REPRESENTATION.

Simulation analysis
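This subsection mainly demonstrates that the proposed method has strong noise immunity and that its classification performance is almost unaffected by the sampling starting point. The classification performance is measured by three factors, i.e. the reconstruction error (RE) (Eq. (19)), the sparsity concentration index (SCI) (Eq. (20)) [8] and the accuracy rate (AR) (Eq. (21)):

$$\mathrm{RE}_i = \|y - D_i \hat{\alpha}_i\|_2, \tag{19}$$

$$\mathrm{SCI}(\hat{\alpha}) = \frac{K \cdot \max_i \|\hat{\alpha}_i\|_1 / \|\hat{\alpha}\|_1 - 1}{K - 1}, \tag{20}$$

$$\mathrm{AR} = \frac{\text{Number of correct classifications}}{\text{Total number of test samples}} \times 100\,\%, \tag{21}$$

where $K$ is the number of classes, and $D_i$ and $\hat{\alpha}_i$ are the sub-dictionary and the coefficient sub-vector associated with class $i$, respectively.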
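The two scalar measures can be computed directly from a coefficient vector and a list of predicted labels. The sketch below follows the SCI definition of [8]; the function names `sci` and `accuracy_rate` and the convention of returning 0 for an all-zero coefficient vector are assumptions made here.

```python
import numpy as np

def sci(alpha, groups):
    """Sparsity concentration index: 1 if all energy sits in one class, 0 if spread evenly."""
    classes = np.unique(groups)
    k = len(classes)
    total = np.sum(np.abs(alpha))
    if total == 0:
        return 0.0                            # convention assumed for an all-zero vector
    best = max(np.sum(np.abs(alpha[groups == c])) for c in classes)
    return (k * best / total - 1.0) / (k - 1.0)

def accuracy_rate(predicted, actual):
    """Percentage of test samples whose predicted label matches the true label."""
    return 100.0 * np.mean(np.asarray(predicted) == np.asarray(actual))

groups = np.array([0, 0, 1, 1])
sci_concentrated = sci(np.array([1.0, 2.0, 0.0, 0.0]), groups)
sci_spread = sci(np.array([1.0, 1.0, 1.0, 1.0]), groups)
ar = accuracy_rate([0, 1, 1, 2], [0, 1, 2, 2])
```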
Seven types of signals, including a normal signal and six fault signals, are simulated as seven conditions of a rolling element bearing, i.e. no fault, two outer-race faults with different defect sizes, two ball faults with different defect sizes and two inner-race faults with different defect sizes, respectively. The normal signal is set as a white Gaussian noise (WGN) signal with unit standard deviation. The model of the six fault signals is governed by the following parameters: the amplitude modulation frequency, the attenuation factor, the impulse number, the natural frequency and the fault feature frequency. These parameters are set as in Table 2. The seven signals, with a sampling frequency of 2048 Hz, are plotted in Fig. 2.
Each of the seven signals is mixed with 50 different WGN realizations, keeping the SNR in the range of -2 dB to 2 dB, such that 350 (50×7) training samples are obtained. The fault 1 signal is chosen as the test object. At first, it is mixed with WGN to get a noisy signal with an SNR of 0 dB. After implementation of the proposed method, the sparse coefficients and RE of the noisy signal are obtained (Fig. 3), both of which confirm that the test sample is of the fault 1 category. Then, the WGN intensity is increased to reduce the SNR to -2 dB and the proposed method is performed as above. Although the sparse coefficients spread to three classes and the minimal RE increases considerably (Fig. 4), the test sample can still be classified correctly according to the minimal RE, which shows the strong noise immunity of the proposed method. Finally, the fault 1 signal with an SNR of 0 dB is circularly shifted right by 50 points and tested as above. From Fig. 5, it can be seen that the classification performance is almost unaffected by the sampling starting point.
More tests are then conducted. 50 signals of each category with SNR in the range of -2 dB to 2 dB are generated as the test samples. The "simulation dataset", containing 350 training samples and 350 test samples, has thus been constructed, and it will be used again in the discussion section. The proposed method is executed to classify the 350 test samples. The results (summarized in Table 5) preliminarily demonstrate the classification performance of the proposed method.
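Mixing a clean signal with WGN at a prescribed SNR, as done when building the simulation dataset, can be sketched as below. The helper names `add_wgn` and `make_noisy_set` are hypothetical, the sinusoid stands in for a fault signal whose model is not reproduced here, and drawing the SNR of each copy uniformly from the stated range is an assumption of this sketch.

```python
import numpy as np

def add_wgn(signal, snr_db, rng):
    """Add white Gaussian noise so that signal power / noise power matches snr_db."""
    signal = np.asarray(signal, dtype=float)
    p_signal = np.mean(signal ** 2)
    p_noise = p_signal / (10.0 ** (snr_db / 10.0))
    return signal + np.sqrt(p_noise) * rng.standard_normal(signal.shape)

def make_noisy_set(signal, n_copies, snr_range, rng):
    """Generate n_copies noisy versions, each with an SNR drawn from snr_range (assumed uniform)."""
    snrs = rng.uniform(snr_range[0], snr_range[1], size=n_copies)
    return np.stack([add_wgn(signal, s, rng) for s in snrs])

rng = np.random.default_rng(0)
t = np.arange(2048) / 2048.0                  # 1 s at a 2048 Hz sampling frequency
clean = np.sin(2 * np.pi * 100 * t)           # placeholder for a simulated fault signal
noisy = add_wgn(clean, 0.0, rng)
measured_snr = 10.0 * np.log10(np.mean(clean ** 2) / np.mean((noisy - clean) ** 2))
training = make_noisy_set(clean, 50, (-2.0, 2.0), rng)
```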

Classification results of bearing vibration data
To investigate the effectiveness of the proposed method for vibration signal classification, bearing vibration signals are considered in this subsection.
The vibration dataset from the Case Western Reserve University Bearing Data Center [19] has been analyzed by many researchers [20][21] and is considered a benchmark. This dataset is adopted in our study as well. The experimental platform consists of a motor, a dynamometer, a torque transducer, and control electronics. Single point faults of size 0.007, 0.014, 0.021 and 0.028 in. were set on the drive-end bearings (Type 6205-2RS JEM SKF) at the outer raceway, inner raceway and rolling element (ball), respectively. The vibration data were measured by an accelerometer attached to the motor housing with a sampling frequency of 12 kHz.
In this experiment, twelve types of vibration data are chosen to construct the training and test datasets, including a normal condition, three types of outer race fault, four types of inner race fault, and four types of ball fault. Each data record is split into segments of 2048 points with an overlap of 512 points. From these segments, we randomly select 50 as training samples and another 25 as test samples. In total, 600 (12×50) samples and 300 (12×25) samples covering the 12 classes of bearing data are used as the training set and the test set, respectively. The descriptions of the "bearing dataset" are shown in Table 3. Without loss of generality, we randomly select a test sample from each class and classify them by the proposed method. Their sparse coefficients and RE are plotted in Fig. 6 and Fig. 7, respectively, from which we can see that almost all sparse coefficients of each test sample appear in its own class and that the minimal RE of each test sample corresponds to its class, with a value significantly smaller than the other RE values. Then, all 300 test samples are classified by the proposed method. The results (summarized in Table 5), including AR, average SCI and average minimal reconstruction error (AMRE), show that our method can correctly categorize all the test samples with a high average SCI and a relatively low AMRE.
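The segmentation and random selection described above can be sketched as follows. The helper names `split_with_overlap` and `random_train_test` are hypothetical, and the synthetic record length is chosen only so that enough segments exist; it is not the length of the actual bearing records.

```python
import numpy as np

def split_with_overlap(x, seg_len=2048, overlap=512):
    """Cut x into segments of seg_len points; consecutive segments share overlap points."""
    x = np.asarray(x)
    step = seg_len - overlap
    n_seg = (len(x) - seg_len) // step + 1
    if n_seg <= 0:
        return np.empty((0, seg_len))
    return np.stack([x[i * step: i * step + seg_len] for i in range(n_seg)])

def random_train_test(segments, n_train, n_test, rng):
    """Pick disjoint random subsets of the segments for training and testing."""
    idx = rng.permutation(len(segments))
    return segments[idx[:n_train]], segments[idx[n_train:n_train + n_test]]

rng = np.random.default_rng(0)
record = rng.standard_normal(200000)          # stand-in for one measured vibration record
segments = split_with_overlap(record, seg_len=2048, overlap=512)
train, test = random_train_test(segments, n_train=50, n_test=25, rng=rng)
```

With an overlap of 512 points the window advances by 1536 points per segment, so neighbouring segments are correlated; the random selection only guarantees that no segment is used for both training and testing.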

Classification results of gearbox vibration data
The proposed method is further applied to gearbox fault diagnosis. The experimental platform consists of a motor, a drive shaft seat, a gearbox and a damper, etc. The vibration signals are acquired by an acceleration sensor placed on the output shaft bearing seat. A normal condition and five fault conditions are considered, including three single faults, i.e. tooth-broken and point-corrosion of the large gear and wear-out of the small gear, and two combination faults, i.e. broken-wear and point-wear faults. The rotating speed is set to 1500 r/min. The horizontal vibration signals are collected with a sampling frequency of 5120 Hz and the sampling time is about 20 s in each condition. As in the bearing data experiment, in the preprocessing stage each of the six gearbox vibration signals is split into segments of 2048 points with an overlap of 128 points. From these segments, we randomly select 50 as training samples and another 20 as test samples. In total, there are 300 (6×50) training samples and 120 (6×20) test samples. The descriptions of the "gearbox dataset" are shown in Table 4.
Without loss of generality, we randomly select a test sample from each class and classify them by the proposed method. Their sparse coefficients and RE are plotted in Fig. 8 and Fig. 9, respectively. From Fig. 8, it can be seen that most of the sparse coefficients of each test sample appear in its own class, except for the "PCL" and "PCL-WOS" samples. The reason for this phenomenon is that the feature frequency of the point-corrosion fault is equal to the rotating frequency, and the instantaneous value of the vibration signal hardly changes compared with normal conditions when the fault size is small. The minimal RE of each test sample in Fig. 9 indicates that the proposed method can still classify it correctly.
The classification results of the 120 test samples are displayed in Table 5. The accuracy rate reaches 97.5 % (117/120), which verifies that our method can be applied to gearbox fault diagnosis. Moreover, the high average SCI (0.9723) demonstrates again that the group sparse representation based classification method can represent a test vibration signal using almost only signals of the same fault-type.

Discussions
In this section, we discuss issues from two aspects, i.e. the influence of parameters, and comparison with some other methods.

The influence of parameters
In our proposed method, two important parameters should be considered. One is the total number of DFT points $N$, and the other is the regularization parameter of GSR $\gamma$. In all of the tests above, including the simulation analysis and the two experiments, $N$ is set equal to the length of the samples (i.e. $N = M = 2048$) and $\gamma$ is set to 0.5. Fig. 10 shows the relationship between the classification accuracy rate and the total number of DFT points. It can be seen that the accuracy rate improves significantly as the number of DFT points increases. Theoretically, the more DFT points are used, the more spectral information is available. Therefore, the accuracy rate can be improved by increasing the number of DFT points. However, group sparse representation needs more computational time when the number of DFT points increases. In machinery fault diagnosis, the vibration signal is analyzed in real time and a diagnostic conclusion should be reached in the crucial early hours. This requires the computational time of a diagnosis method to be as short as possible. For the proposed method, the number of DFT points therefore has to balance the classification accuracy rate against the computational time. In general, the number of DFT points can be set to half of the sample length if the resulting accuracy rate is acceptable. The relationship between the classification accuracy rate and the regularization parameter is visualized in Fig. 11, which demonstrates that the accuracy rate remains almost unchanged at a high value when the parameter varies in [0.0001, 0.8], while it drops sharply if $\gamma$ approaches 0 or 1. In other words, when group sparse representation is utilized in the classification of machinery vibration signals, the influence of the regularization parameter is very limited as long as it is set in a reasonable range.
Besides $N$ and $\gamma$, the termination tolerance $\varepsilon$ and the sample length $M$ have some effect on the classification results as well. In all of the tests above, $\varepsilon$ and $M$ are set to 0.001 and 2048, respectively. From Table 1, it can be seen that $\varepsilon$ directly determines the runtime of the coefficient solving algorithm and the reconstruction error. For a classification-based diagnosis method, the reconstruction precision is not as important as it is for other signal approximation problems. Therefore, to reduce runtime, we set the termination condition of the coefficient solving algorithm as the difference between the objective function values of two adjacent iterations being less than $\varepsilon$. The parameter $M$ has a similar effect to $N$ on the classification accuracy rate and computational time. In practical applications, if enough training samples of each class are available, a time-series vibration signal should no longer be split into a number of segments unless it is too long. In this case, $N$ should be adjusted rather than $M$.

Comparison of the proposed method with some other methods
In this subsection, the proposed method is compared with SRC and SVM to demonstrate its superiority.
The L1-norm based SRC [8] is utilized to classify the test samples of the three datasets for comparison. Many effective methods exist for solving L1-minimization problems. Here, l1_ls [20], SLEP [17] and GPSR [23] are each adopted to solve the problem. From the results in Table 5, it is seen that their classification performance is not as good as that of the proposed method. Specifically, the AR and average SCI obtained by the three solvers are lower than those of the proposed method by roughly 10 and 50 percent, respectively, while the average minimal RE is higher by about 20 percent.
For further comparison, the three datasets are tested by the SVM method. SVM is a pattern recognition algorithm based on statistical learning theory that shows favorable generalization performance [24]. Fault classification is achieved with such methods by extracting characteristic parameters (e.g., root mean square value, kurtosis, mean square frequency, etc.) from the time and frequency domains and adjusting them based on the actual situation [5]. Following reference [25], vibration signals are decomposed into different frequency bands by the wavelet packet transform (WPT); then, the optimal features are selected by the distance evaluation technique from the statistical characteristics of the raw signals and wavelet packet coefficients and from the energy characteristics of the decomposition frequency bands; finally, the optimal features are input to the SVM ensemble, built with the SVM toolbox [26], to identify the fault type. The results recorded in Table 5 show that the AR obtained by the SVM ensemble method is significantly higher than that of SRC, yet it is still lower than that of the proposed method. In machinery fault diagnosis, classification efficiency is another important factor. We investigate the computational time (CT) of these methods on the three datasets and tabulate the results in Table 5. All methods are run in Matlab 2012b on a computer with a 3.6 GHz CPU and 4 GB of RAM. From the observations, the proposed method is as fast as SRC solved by SLEP and GPSR and significantly faster than the SVM ensemble and SRC solved by l1_ls.

Conclusions
This paper presents a new classification method based on group sparse representation for machinery vibration signals. In the method, time-domain vibration signals are first transformed into Fourier coefficients by the DFT; then, the transform coefficient vectors of the training samples are merged into a dictionary and the transform coefficient vectors of the test samples are coded on the dictionary by the group sparse representation algorithm; finally, the class labels of the test samples are identified by the minimal reconstruction error. The simulation analysis has shown that the proposed method has strong noise immunity and that its classification performance is almost unaffected by the sampling starting point. The classification results for bearing and gearbox vibration signals demonstrate that the method can effectively diagnose the fault types of both with high accuracy and efficiency.

Fig. 1. Scheme of the proposed classification method for machinery vibration signals based on group sparse representation

Fig. 5. The sparse coefficients and reconstruction errors (RE) of the fault 1 test sample with SNR of 0 dB and circularly shifted right by 50 points, obtained by the proposed method

Fig. 6. The sparse coefficients of test samples from each class in bearing dataset

Fig. 7. The reconstruction errors (RE) of test samples from each class in bearing dataset

Fig. 8. The sparse coefficients of test samples from each class in gearbox dataset

Fig. 9. The reconstruction errors (RE) of test samples from each class in gearbox dataset

Fig. 10. The accuracy rate of the three datasets with the change of DFT points

Fig. 11. The accuracy rate of the three datasets with the change of the regularization parameter

Table 1. The solving algorithm for group sparse representation

Table 2. Parameters of the seven simulation signals

Fig. 2. The waveform of seven simulation signals, which simulate seven conditions of a rolling element bearing (from above to below: no fault, two outer-race faults with different defect sizes, two ball faults with different defect sizes and two inner-race faults with different defect sizes)

Fig. 3. The sparse coefficients and reconstruction errors (RE) of the fault 1 test sample with SNR of 0 dB, obtained by the proposed method

Fig. 4. The sparse coefficients and reconstruction errors (RE) of the fault 1 test sample with SNR of -2 dB, obtained by the proposed method

Table 3. Description of bearing dataset for classification

Table 4. Description of gearbox dataset for classification

Table 5. Classification results of the three datasets obtained by the proposed method and other methods