Fault diagnosis of rolling bearing with incomplete labels using weakly labeled support vector machine

Zhou Bo1, Lu Chen2, Wang Zhenya3

1, 2, 3School of Reliability and Systems Engineering, Beihang University, Beijing 100191, China

1, 2, 3Science and Technology Laboratory on Reliability and Environmental Engineering, Beijing 100191, China

1Corresponding author

Vibroengineering PROCEDIA, Vol. 5, 2015, p. 187-192.
Accepted 21 August 2015; published 18 September 2015

Copyright © 2015 JVE International Ltd. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Fault diagnosis of rolling bearings has attracted increasing attention in recent years because bearing condition strongly affects the functionality and efficiency of complex primary systems. For bearing samples with incomplete labels, this paper investigates a fault diagnosis method that draws on image cognition theory to classify the fault states of rolling bearings, with the aim of achieving fault classification using only a small amount of labeled bearing data. Empirical mode decomposition (EMD) is first applied to the original signal, and basic time domain features are extracted from the first three intrinsic mode functions (IMFs) and used as the inputs of the subsequent classifier for training and testing. A weakly labeled support vector machine (WELLSVM), which appears to be more efficient than inductive support vector machines, especially for very small training sets and large test sets, is then established via a novel label generation strategy within a semi-supervised learning framework. Validation data sets with different proportions of labeled data are used to compare and evaluate the fault diagnosis results. The results indicate the effectiveness of the proposed method for bearing fault diagnosis with weakly labeled data.

Keywords: rolling bearing, fault diagnosis, incomplete labels, weakly labeled support vector machine.

1. Introduction

Rolling element bearings play an important role in rotating machinery, and their failure may result in serious economic losses and safety incidents [1]. Because unpredictable damage may cause disastrous failure, continuous efforts have been devoted to the early detection of bearing defects. Fault diagnosis of bearings is therefore essential to ensure normal industrial operation. Vibration signature analysis is the most commonly used approach to fault diagnosis of rolling element bearings for preventing machinery breakdowns [2]. The labels of the vibration data are the key to fault classification. D. H. Pandya and S. H. Upadhyay investigated the APF-KNN approach, which is based on an asymmetric proximity function with optimized feature selection, and showed that better classification accuracy can increase the reliability of rolling bearing fault diagnosis [3]. Diego Fernández-Francos et al. proposed an automatic bearing fault diagnosis method based on one-class ν-SVM, which can identify the location of the defect and qualitatively assess its evolution over time [4].

The two methods above rely on fully labeled data; in real working conditions, however, sufficient labels may not be available. If only a small number of labeled samples are used for training, it is often difficult for the trained learning system to generalize well. At the same time, using only the few “expensive” labeled samples while ignoring the large number of “cheap” unlabeled samples wastes data resources [5]. Exploiting weakly labeled training data may therefore help improve performance and discover the underlying structure of the data. Indeed, this has been regarded as one of the most challenging tasks in machine learning research [6].

In this paper, a fault diagnosis method based on the weakly labeled support vector machine (WELLSVM) is proposed. Unlike supervised learning, this method performs fault diagnosis while making full use of a large amount of unlabeled data. Multi-instance learning and clustering are further potential applications of WELLSVM. The goal of semi-supervised learning with WELLSVM is to employ a large collection of unlabeled data jointly with a few labeled examples to improve generalization performance [7].

This paper is organized as follows: Section 2 briefly introduces EMD, time domain feature extraction and WELLSVM; Section 3 presents the case study performed to validate the method; and Section 4 draws conclusions and outlines future work.

2. Methodology

2.1. Empirical mode decomposition (EMD)

The empirical mode decomposition (EMD) method can decompose any complicated signal into a finite number of components called intrinsic mode functions (IMFs) [8]. A component must satisfy two criteria to be an IMF: (1) over the whole data set, the number of extrema and the number of zero crossings must be equal or differ by at most one; and (2) the mean of the upper and lower envelopes is zero at every time instant. The standard EMD process for a signal can be described as follows:

(1) To obtain the upper (or lower) envelope of the signal $x(t)$, a cubic spline is used to connect all of its local maxima (or minima). A local maximum (or minimum) is found by comparing neighboring points: a point whose value is larger (or smaller) than both of its neighbors is taken as a local peak.

(2) The difference signal $x_1(t)$ is obtained by subtracting the mean $m(t)$ of the upper and lower envelopes from the data:

$$x_1(t) = x(t) - m(t). \tag{1}$$

(3) Let $x(t) = x_1(t)$ and repeat steps (1) and (2) until $x_1(t)$ meets the two criteria of an IMF. The resulting $x_1(t)$ is an IMF, denoted $C_j(t)$ below, where $j$ is the scale index.

(4) The residue signal $r_j(t)$ is obtained by separating $C_j(t)$ from the initial signal $x(t)$:

$$r_j(t) = x(t) - C_j(t). \tag{2}$$

With this decomposition process, the original signal $x(t)$ is decomposed into $N$ IMFs, each of which has a different resolution. The original signal $x(t)$ equals the sum of the extracted IMFs of different scales and the residual signal:

$$x(t) = \sum_{j=1}^{N} C_j(t) + r_N(t), \tag{3}$$

where $N$ is the number of extracted IMFs, $j$ is the scale index of an IMF, and $r_N(t)$ is the final residue.
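The sifting steps above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: the function names, the end-point handling of the splines, the energy-based stopping rule (`tol`) and the iteration cap (`max_sift`) are all assumptions, and each new IMF is sifted from the running residue, as is common practice.

```python
import numpy as np
from scipy.interpolate import CubicSpline


def local_extrema(x):
    """Indices of interior local maxima and minima (strict neighbor comparison)."""
    interior = np.arange(1, len(x) - 1)
    maxima = interior[(x[1:-1] > x[:-2]) & (x[1:-1] > x[2:])]
    minima = interior[(x[1:-1] < x[:-2]) & (x[1:-1] < x[2:])]
    return maxima, minima


def envelope_mean(x):
    """Mean m(t) of the cubic-spline envelopes, or None if there are too few extrema."""
    maxima, minima = local_extrema(x)
    if len(maxima) < 2 or len(minima) < 2:
        return None
    t = np.arange(len(x))
    # Assumption: the two end points are added as spline knots to limit edge swings.
    up_idx = np.concatenate(([0], maxima, [len(x) - 1]))
    lo_idx = np.concatenate(([0], minima, [len(x) - 1]))
    upper = CubicSpline(up_idx, x[up_idx])(t)
    lower = CubicSpline(lo_idx, x[lo_idx])(t)
    return 0.5 * (upper + lower)


def emd(x, max_imfs=3, max_sift=50, tol=0.05):
    """Return (imfs, residue) such that x = imfs.sum(axis=0) + residue."""
    residue = np.asarray(x, dtype=float).copy()
    imfs = []
    for _ in range(max_imfs):
        h = residue.copy()
        m = envelope_mean(h)
        if m is None:
            break  # too few extrema left to build envelopes
        for _ in range(max_sift):
            h_next = h - m  # step (2): subtract the envelope mean
            # Assumption: stop once the removed mean carries little energy.
            converged = np.sum(m**2) <= tol * np.sum(h**2)
            h = h_next
            m = envelope_mean(h)
            if converged or m is None:
                break
        imfs.append(h)
        residue = residue - h  # peel the IMF off the running residue
    return np.array(imfs), residue
```

By construction the extracted IMFs and the final residue add back up to the input signal, which is a quick sanity check for any EMD implementation.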

2.2. Time domain feature extraction

Time domain features carry rich information and reflect the basic characteristics of a signal. The features extracted here to diagnose the failure status are the root mean square (RMS), maximum value, standard deviation, kurtosis, root amplitude and peak-to-peak value [9]. The maximum value and RMS are extracted from the first IMF; the standard deviation and kurtosis from the second IMF; and the root amplitude and peak-to-peak value from the third IMF. Table 1 lists the formulas of the extracted time domain features.
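As a rough illustration of how the six-dimensional feature vector could be assembled, the sketch below computes two features from each of the first three IMFs. The function name and the input layout (one IMF per row) are assumptions. The kurtosis is taken as the raw fourth moment and the root amplitude as the squared sum of square-rooted magnitudes, following Table 1 as far as it can be read; a 1/N-normalized root amplitude is also common in the literature.

```python
import numpy as np


def time_domain_features(imfs):
    """Six features from the first three IMFs (rows of `imfs`)."""
    c1, c2, c3 = (np.asarray(c, dtype=float) for c in imfs[:3])
    return np.array([
        np.max(c1),                         # maximum value, IMF 1
        np.sqrt(np.mean(c1**2)),            # root mean square, IMF 1
        np.std(c2),                         # standard deviation (1/N form), IMF 2
        np.mean(c2**4),                     # kurtosis as raw fourth moment, IMF 2
        np.sum(np.sqrt(np.abs(c3)))**2,     # root amplitude (unnormalized), IMF 3
        np.max(c3) - np.min(c3),            # peak-to-peak value, IMF 3
    ])
```

Each signal segment then contributes one row of six numbers to the classifier input.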

2.3. Weakly labeled support vector machine (WELLSVM)

We start the description of the classification method from the SVM. The basic task of an SVM is to estimate a classification function $f: \mathbb{R}^N \to \{\pm 1\}$ using input-output training data from two classes [10]. The margin hyperplanes of the training set are:

$$w^T x_i + b = 1, \qquad w^T x_i + b = -1. \tag{4}$$

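Table 1. Basic formulas of the time domain features

| Time domain feature | Formula |
| --- | --- |
| Maximum value | $X_{max} = \max\{x_i(t)\}$ |
| Root mean square | $X_{rms} = \sqrt{\frac{1}{N}\sum_{i=1}^{N} x_i^2}$ |
| Standard deviation | $\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2}$ |
| Kurtosis | $\beta = \frac{1}{N}\sum_{i=1}^{N} x_i^4$ |
| Root amplitude | $X_r = \left(\sum_{i=1}^{N} \sqrt{\lvert x_i \rvert}\right)^2$ |
| Peak-to-peak value | $\max\{x_i(t)\} - \min\{x_i(t)\}$ |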

The basic idea of the method is to find the maximum-margin separating hyperplane (shown in Fig. 1) while allowing for misclassification, which corresponds to the following optimization problem:

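$$\min \Phi(w) = \min_{w, b, \xi} \frac{1}{2} w^T w + C \sum_{i=1}^{N} \xi_i,$$
$$\text{s.t.} \quad \hat{y}_i (w^T x_i + b) \geq 1 - \xi_i, \quad \xi_i \geq 0, \quad i = 1, 2, \ldots, N, \tag{5}$$

where $\xi = (\xi_1, \xi_2, \ldots, \xi_N)$ and $C > 0$ is a fixed penalty parameter.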
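To make the roles of the slack variables and the penalty parameter concrete, the following sketch evaluates the soft-margin objective for a given hyperplane. It is only an illustration: the function name is hypothetical, and the slack is set to the smallest value that satisfies the constraints, which is the hinge loss of each sample.

```python
import numpy as np


def soft_margin_objective(w, b, X, y, C):
    """Objective value and slacks of the soft-margin SVM for a fixed (w, b)."""
    margins = y * (X @ w + b)               # y_i (w^T x_i + b)
    xi = np.maximum(0.0, 1.0 - margins)     # smallest feasible slack per sample
    return 0.5 * float(w @ w) + C * float(xi.sum()), xi
```

Samples on the correct side of their margin hyperplane get zero slack, while samples inside the margin or misclassified are charged in proportion to C.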

Fig. 1. The maximum-margin separating hyperplane of SVM


Eq. (5) can be written as:

$$\min_{\hat{y} \in \mathcal{B}} \; \max_{\alpha \in \mathcal{A}} \; G(\alpha, \hat{y}). \tag{6}$$

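Interchanging the order of $\max_{\alpha \in \mathcal{A}}$ and $\min_{\hat{y} \in \mathcal{B}}$ in Eq. (6), we obtain the proposed WELLSVM:

$$\max_{\alpha \in \mathcal{A}} \; \min_{\hat{y} \in \mathcal{B}} \; G(\alpha, \hat{y}). \tag{7}$$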
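A brief note on why the interchange is a relaxation, stated here as a general fact rather than as part of the paper's derivation: for any function $G$ and any sets $\mathcal{A}$ and $\mathcal{B}$, the max-min value never exceeds the min-max value, so the interchanged problem gives a lower bound on the original one.

$$\max_{\alpha \in \mathcal{A}} \; \min_{\hat{y} \in \mathcal{B}} \; G(\alpha, \hat{y}) \;\leq\; \min_{\hat{y} \in \mathcal{B}} \; \max_{\alpha \in \mathcal{A}} \; G(\alpha, \hat{y}),$$

which follows because, for any fixed $\alpha_0 \in \mathcal{A}$ and $\hat{y}_0 \in \mathcal{B}$,

$$\min_{\hat{y} \in \mathcal{B}} G(\alpha_0, \hat{y}) \;\leq\; G(\alpha_0, \hat{y}_0) \;\leq\; \max_{\alpha \in \mathcal{A}} G(\alpha, \hat{y}_0).$$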

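We rewrite the objective of WELLSVM as the following optimization problem:

$$\min_{\mu \in \mathcal{M}} \; \max_{\alpha \in \mathcal{A}} \; \sum_{t: \hat{y}_t \in \mathcal{B}} \mu_t \, G(\alpha, \hat{y}_t), \tag{8}$$

where $\mu$ is the vector of the $\mu_t$'s, $\mathcal{M}$ is the simplex $\{\mu \mid \sum_t \mu_t = 1, \; \mu_t \geq 0\}$, and $\hat{y}_t \in \mathcal{B}$.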
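The step from a minimum over candidate labelings to a minimum over the simplex can be checked numerically on a toy case. The sketch below is illustrative only: the helper names are hypothetical, the values standing in for $G(\alpha, \hat{y}_t)$ at a fixed $\alpha$ are random numbers, and enumerating every labeling is feasible only for a handful of unlabeled samples.

```python
import itertools

import numpy as np


def candidate_labelings(y_labeled, n_unlabeled):
    """All label vectors that keep the known labels and assign +1/-1 to the rest."""
    return [np.concatenate((y_labeled, u))
            for u in itertools.product((-1, 1), repeat=n_unlabeled)]


def convex_combination(mu, g):
    """Weighted objective sum_t mu_t * g_t for weights mu on the simplex."""
    return float(np.dot(mu, g))
```

For fixed $\alpha$, a convex combination of the candidate values can never fall below the smallest one, and the smallest one is attained at a vertex of the simplex, so minimizing over $\mu$ gives the same value as minimizing over the labelings.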

In semi-supervised learning, not all the training labels are known. Let $D_L = \{x_i, y_i\}_{i=1}^{l}$ and $D_U = \{x_j\}_{j=l+1}^{N}$ be the sets of labeled and unlabeled examples, respectively, and let $\hat{y} = [\hat{y}_1, \hat{y}_2, \ldots, \hat{y}_N]'$ be the vector of learned labels on both labeled and unlabeled examples, with $y_L = [y_1, y_2, \ldots, y_l]'$ and $\hat{y}_U = [\hat{y}_{l+1}, \hat{y}_{l+2}, \ldots, \hat{y}_N]'$ [11]. Then Eq. (5) leads to:

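$$\min \Phi(w) = \min_{\hat{y} \in \mathcal{B}} \; \min_{w, \xi} \; \frac{1}{2} w^T w + C_1 \sum_{i=1}^{l} \xi_i + C_2 \sum_{i=l+1}^{N} \xi_i,$$
$$\text{s.t.} \quad \hat{y}_i (w^T x_i + b) \geq 1 - \xi_i, \quad \xi_i \geq 0, \quad i = 1, 2, \ldots, N, \tag{9}$$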

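where $\mathcal{B} = \{\hat{y} \mid \hat{y} = [\hat{y}_L; \hat{y}_U], \; \hat{y}_L = y_L, \; \hat{y}_U \in \{\pm 1\}^{N-l}\}$, and $C_1$ and $C_2$ balance the hinge losses on the labeled and unlabeled examples, respectively. With the mixing coefficients $\mu$, the problem can be written as: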

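$$\min \Phi(w) = \min_{\mu \in \mathcal{M}, \, \xi} \; \frac{1}{2} \frac{1}{\mu_t} w^T w + C_1 \sum_{i=1}^{l} \xi_i + C_2 \sum_{i=l+1}^{N} \xi_i,$$
$$\text{s.t.} \quad \hat{y}_i (w^T x_i + b) \geq 1 - \xi_i, \quad \xi_i \geq 0, \quad i = 1, 2, \ldots, N. \tag{10}$$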

To solve Eq. (10), we iterate the following two steps until convergence:

1) Fix the mixing coefficients $\mu$ of the base kernel matrices and solve for the $w_t$'s.

2) Fix the $w_t$'s and update $\mu$ in closed form.
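The paper does not spell out the closed-form update in step 2. As a hedged illustration, the sketch below uses the standard multiple-kernel-learning result for a regularizer of the form $\frac{1}{2}\sum_t \lVert w_t \rVert^2 / \mu_t$ over the simplex, whose minimizer is $\mu_t \propto \lVert w_t \rVert$. Both the form of the regularizer and the function names are assumptions made for this example.

```python
import numpy as np


def update_mu(w_norms, eps=1e-12):
    """Simplex weights minimizing 0.5 * sum_t ||w_t||^2 / mu_t for fixed w_t."""
    w_norms = np.asarray(w_norms, dtype=float)
    total = w_norms.sum()
    if total < eps:
        # Degenerate case: all components vanish, fall back to uniform weights.
        return np.full(len(w_norms), 1.0 / len(w_norms))
    return w_norms / total


def regularizer(mu, w_norms, eps=1e-12):
    """Value of 0.5 * sum_t ||w_t||^2 / mu_t, guarding against division by zero."""
    w_norms = np.asarray(w_norms, dtype=float)
    return 0.5 * float(np.sum(w_norms**2 / np.maximum(mu, eps)))
```

With this update the regularizer equals half the squared sum of the norms, which by the Cauchy-Schwarz inequality is the smallest value any weight vector on the simplex can reach.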

3. Experimental verification

This section demonstrates the reliability of the WELLSVM model for fault diagnosis of rolling bearings. Experimental data from different working conditions are used to validate the effectiveness of the proposed method.

3.1. Experiment setup

Bearing data from the Bearing Data Center of Case Western Reserve University were used for testing and verification. The test rig contained a 2 horsepower motor, which served as the prime mover driving a shaft coupled with a bearing housing, as shown in Fig. 2. The test rig included both drive end (DE) and fan end (FE) bearings of type 6205-2RS JEM SKF, whose vibration data were collected using accelerometers attached to the housing with magnetic bases. Accelerometers were placed at the 3 o'clock position for both the DE and FE bearings. Digital data were collected at 12,000 samples per second (12 kHz) for both DE and FE bearing faults.

3.2. Experiment execution

This section describes how the experiment was carried out. First, EMD was applied to the original signal and the first three IMFs were retained; two different time domain features were then extracted from each IMF of the bearing data, giving a six-dimensional feature vector. Finally, WELLSVM was employed to classify the four bearing conditions (the normal condition and three failure modes).

Fig. 2. Bearing test-rig for the experiment


First, the inner ring, outer ring and rolling element failures were grouped together and separated from the normal condition. Next, the outer ring and rolling element failures were grouped together and separated from the inner ring failure. Finally, WELLSVM was used to separate the outer ring failure from the rolling element failure (as shown in Fig. 3). After data processing, for each data set, 75 % of the examples were randomly chosen for training and the rest for testing. We investigated the performance of the approach with varying amounts of labeled data (namely, 5 %, 10 %, 15 % and 20 % labeled). The whole procedure was repeated 10 times, and the average accuracies on the test set are reported in Table 2.
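One way to set up the data protocol described above is sketched below. It is an assumption-laden illustration: the function name is hypothetical, the labeled fraction is interpreted as a fraction of the training set, and the split is purely random rather than stratified by class.

```python
import numpy as np


def split_and_mask(n_samples, train_fraction=0.75, labeled_fraction=0.05, seed=0):
    """Random train/test split, then keep labels for only part of the training set."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_samples)
    n_train = int(round(train_fraction * n_samples))
    train_idx, test_idx = perm[:n_train], perm[n_train:]
    # Assumption: at least one labeled example is always kept.
    n_labeled = max(1, int(round(labeled_fraction * n_train)))
    labeled_idx = train_idx[:n_labeled]      # labels are visible to the learner
    unlabeled_idx = train_idx[n_labeled:]    # labels are hidden during training
    return labeled_idx, unlabeled_idx, test_idx
```

Calling the function with ten different seeds and averaging the test accuracies would mirror the repeated-run evaluation described in the text.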

Fig. 3. Multi-classification of WELLSVM
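The three binary decisions can be chained into a four-class prediction as sketched below. The function and argument names are hypothetical; each argument stands for one trained binary classifier (for example, one WELLSVM per stage) that returns True or False for a feature vector.

```python
def cascade_predict(x, is_faulty, is_inner, is_outer):
    """Chain three binary decisions into one of four bearing conditions."""
    if not is_faulty(x):          # stage 1: any failure vs. normal
        return "normal"
    if is_inner(x):               # stage 2: inner ring vs. outer ring or rolling element
        return "inner ring"
    # stage 3: outer ring vs. rolling element
    return "outer ring" if is_outer(x) else "rolling element"
```

A practical consequence of this design is that errors made at an early stage cannot be corrected by later stages, so the first classifier carries the most weight.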


Table 2. Average accuracies of fault classification

| Classification pattern | Normal with other failure mode | Outer ring with inner ring and rolling element | Inner ring and rolling element |
| --- | --- | --- | --- |
| Average accuracies with 5 % labeled | | | |
| Average accuracies with 10 % labeled | | | |
| Average accuracies with 15 % labeled | | | |
| Average accuracies with 20 % labeled | | | |

3.3. Result and comparison

Table 2 shows that the classification accuracies with 5 %, 10 %, 15 % and 20 % labeled data all exceed 95 %, which indicates that the proposed WELLSVM method separates the normal condition, inner ring failure, outer ring failure and rolling element failure well. Moreover, the more labeled examples there are in the training set, the better the result.

4. Conclusions

In this study, a fault diagnosis method for rolling bearings is proposed. EMD was used as a powerful method for decomposing complicated signals. Because time domain features can represent the essential characteristics of vibration signals, they were extracted from the IMFs. Finally, WELLSVM was employed as the classifier to classify the feature data of all bearing conditions. The experiment indicates that WELLSVM can effectively classify rolling bearing faults.

Our future work will focus on two aspects. First, WELLSVM will be applied to objects other than rolling bearings. Second, we will try to use WELLSVM in fields other than classification to extend the generality of the proposed method.


Acknowledgements

This study is supported by the National Natural Science Foundation of China (Grant Nos. 61074083, 50705005, and 51105019) and by the Technology Foundation Program of National Defense (Grant No. Z132013B002).


References

  1. Lou Xinsheng, Loparo Kenneth A. Bearing fault diagnosis based on wavelet transform and fuzzy inference. Mechanical Systems and Signal Processing, Vol. 18, Issue 5, 2004, p. 1077-1095.
  2. Kankar P. K., Sharma Satish C., Harsha S. P. Rolling element bearing fault diagnosis using wavelet transform. Neurocomputing, Vol. 74, Issue 10, 2011, p. 1638-1645.
  3. Pandya D. H., Upadhyay S. H., Harsha S. P. Fault diagnosis of rolling element bearing with intrinsic mode function of acoustic emission data using APF-KNN. Expert Systems with Applications, Vol. 40, Issue 10, 2013, p. 4137-4145.
  4. Fernández-Francos Diego, Martínez-Rego David, Fontenla-Romero Oscar, Alonso-Betanzos Amparo. Automatic bearing fault diagnosis based on one-class ν-SVM. Computers and Industrial Engineering, Vol. 64, Issue 1, 2013, p. 357-365.
  5. Ying Zhao. Research on Semi-supervised Support Vector Machine Learning Algorithms. Ph.D. Eng. Thesis.
  6. Li Yu-Feng, Tsang Ivor W., Kwok James T., Zhou Zhi-Hua. Convex and scalable weakly labeled SVMs. Journal of Machine Learning Research, Vol. 14, Issue 1, 2013, p. 2151-2188.
  7. Chapelle Olivier, Sindhwani Vikas, Keerthi Sathiya S. Optimization techniques for semi-supervised support vector machines. Journal of Machine Learning Research, Vol. 9, 2008, p. 203-233.
  8. Zhao ShuanFeng, Liang Lin, Xu GuangHua, Wang Jing, Zhang WenMing. Quantitative diagnosis of a spall-like fault of a rolling element bearing by empirical mode decomposition and the approximate entropy method. Mechanical Systems and Signal Processing, Vol. 40, Issue 1, 2013, p. 154-177.
  9. Lee Hong-Hee, Nguyen Ngoc-Tu, Kwon Jeong-Min. Bearing diagnosis using time-domain features and decision tree. International Conference on Intelligent Computing, Vol. 4682, 2007, p. 952-960.
  10. Bennett Kristin P., Demiriz Ayhan. Semi-supervised support vector machine. Proceedings of the Conference on Advances in Neural Information, 1998, p. 368-374.
  11. Li Yu-Feng, Tsang Ivor W., Kwok James T., Zhou Zhi-Hua. Convex and scalable weakly labeled SVMs. Journal of Machine Learning Research, Vol. 14, Issue 1, 2013, p. 2151-2188.