Gearbox fault diagnosis through quantum particle swarm optimization algorithm and kernel extreme learning machine

The gearbox is a key component of mechanical transmission systems, and accurate gearbox fault diagnosis is of great significance for ensuring the operation of rotating machinery. Based on a comprehensive simulation test-bed in the laboratory, a gearbox fault diagnosis method based on quantum particle swarm optimization and kernel extreme learning machine (QPSO-KELM) is proposed. Firstly, seeded-fault experiments for gear faults, bearing faults and gear-bearing mixed faults are carried out on the test-bed. Then, the collected vibration signals are preprocessed by time synchronous averaging (TSA) to eliminate noise. The time-domain, frequency-domain and NASA feature parameters of the preprocessed signals are taken as the training samples and test samples of QPSO-KELM. The experimental results show that the proposed method can effectively solve the problem of gearbox fault pattern recognition and that its fault diagnosis accuracy is higher than that of the compared methods (ELM and PSO-KELM), so the research has reference significance and engineering application value.


Introduction
Gearboxes are widely used in fans, machine tools, vehicles and other equipment. As a key component of the mechanical transmission system, a gearbox that fails during equipment operation can cause huge economic losses and even casualties. Therefore, the earlier a gearbox fault is found and the earlier maintenance is carried out, the more loss can be avoided. The main components of a gearbox are shafts, bearings and gears, among which gear failure and bearing failure are quite common.
At present, research on fault diagnosis of mechanical transmission systems is extensive, and the methods are diverse. As the mainstream approach, intelligent classification algorithms have achieved good results in the field of equipment fault diagnosis. Common intelligent classification algorithms include neural networks [1,2], support vector machines (SVM) [3][4][5], kernel extreme learning machines (KELM) [6,7], deep learning [8][9][10] and other methods. Wang [2] proposed a fault diagnosis method based on RDGWPR-MSE and PNN to realize automatic fault identification of electric submersible pumps. Wang and Yan [4] used the energy of the IMF components after SVD decomposition as feature parameters to train an SVM model for bearing fault diagnosis. Saari et al. [5] proposed a fault diagnosis method for wind turbine bearings based on one-class SVM. Iosifidis et al. [7] studied the classification method of KELM and achieved good results. Yang Yu et al. [9] proposed a structure-adaptive DBN algorithm to address the difficulty of determining the DBN structure, and successfully applied it to the fault classification of rolling bearings. Shao et al. [10] successfully diagnosed bearing faults by combining the dual-tree complex wavelet packet transform with an adaptive DBN algorithm.
Compared with these methods, KELM has a stronger overall advantage in required sample size, network generalization, calculation speed and accuracy, so KELM is more suitable for gearbox fault diagnosis problems that demand high calculation speed and accuracy. Since the classification capability of the KELM network is greatly affected by the values of the kernel parameter and the penalty coefficient, it is necessary to optimize these parameters. Currently, the genetic algorithm [11], fish swarm algorithm [12], whale algorithm [13], ant colony algorithm [14], wolf algorithm [15], particle swarm algorithm [16] and other intelligent algorithms have been widely applied in the field of parameter optimization. For example, Xu [13] used the whale algorithm to optimize the wavelet kernel extreme learning machine (WKELM), and Wang [17] used WKELM with a modified wolf algorithm to diagnose rolling bearing faults; Wu [18] used the ant colony algorithm for fault diagnosis of rotating mechanical equipment; Pei [19] used the particle swarm optimization algorithm to optimize KELM for transformer fault diagnosis; Liu [20] combined variational mode decomposition (VMD) with an improved KELM for engine fault diagnosis. However, most of the above parameter optimization methods suffer from long calculation times and easily fall into local optima. Therefore, the QPSO-KELM (KELM based on Quantum Particle Swarm Optimization) [21][22][23] was put forward. With its strong global search ability, QPSO can find the optimal parameters of KELM, so that the learning speed and classification accuracy of KELM, and hence the accuracy of gearbox fault diagnosis, can be improved.

Fault diagnosis strategy of gearbox based on QPSO-KELM
The fault diagnosis process of gearbox based on QPSO-KELM is described as follows:
(1) Signal acquisition. In general, the closer the sensor is to the source of the vibration, the better the signal will be.
(2) Extract the fault characteristic parameters. The time-domain characteristic parameters, frequency-domain characteristic parameters and the characteristic parameters in the technical report of NASA were extracted from the collected signals. Then, the corresponding training set and test set were selected from all samples in a ratio of 3:1.
(3) The training of KELM. Aiming at maximizing the training accuracy, the training set is input to KELM, and the QPSO algorithm is used to search the optimal parameters. After that, the trained KELM can be obtained.
(4) Calculate the test accuracy. The test samples are input to the trained KELM, the number of correctly classified test samples is counted, and the test accuracy is calculated as the ratio of this number to the total number of test samples.
The above process is shown in Fig. 1. As shown in Fig. 1, the gearbox fault diagnosis process also involves the time-domain synchronous average algorithm [24], the time-domain synchronous resampling algorithm [25,26], the pre-whitening algorithm [27], the relevant fault characteristic parameters, KELM, QPSO and other relevant contents.
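The 3:1 division of samples in step (2) can be sketched as follows. This is a minimal illustration that assumes a plain random shuffle; the function name `split_three_to_one`, the fixed seed and the dummy data are illustrative choices and are not taken from the paper.

```python
import random

def split_three_to_one(samples, labels, seed=0):
    """Randomly split paired samples/labels into 3/4 training and 1/4 test."""
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)  # reproducible shuffle (seed is an assumption)
    n_train = len(indices) * 3 // 4
    train_idx, test_idx = indices[:n_train], indices[n_train:]
    train = [(samples[i], labels[i]) for i in train_idx]
    test = [(samples[i], labels[i]) for i in test_idx]
    return train, test

# Dummy data: 960 one-element feature vectors in four classes of 240 each.
samples = [[float(i)] for i in range(960)]
labels = [i // 240 + 1 for i in range(960)]
train, test = split_three_to_one(samples, labels)
print(len(train), len(test))  # 720 240
```

A stratified split (the same 3:1 ratio inside every class) would be a reasonable alternative when class sizes differ, as in the mixed-fault data set.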

Signal preprocessing
During the signal acquisition process, the signal is easily interfered with by other vibration signals, such as the vibration of the bearings of the equipment itself, the vibration of the motor, the meshing vibration of gears unrelated to the synchronous shaft, and the vibration from other machines and equipment. As a signal preprocessing technology, time synchronous averaging (TSA) can eliminate the signal components irrelevant to the synchronous shaft and enhance the fault signals of the synchronous shaft gear and its meshing gear, making gear fault diagnosis easier. Here, the reference axis is the synchronous axis and the non-reference axis is the nonsynchronous axis. The theoretical equation of TSA is as follows:

$$y(t) = \frac{1}{N}\sum_{i=0}^{N-1} x(t + iT)$$

where $y(t)$ is the TSA signal, $x(t)$ is the original signal, $N$ is the number of averages, and $T$ is the rotation period of the synchronous axis.
In the actual process, the implementation steps of TSA are as follows:
(1) Determine the zero-crossing points of the synchronous axis according to the speed pulse signal, that is, the starting or ending position of each rotation of the synchronous axis.
(2) According to the positions of the zero crossings, divide the vibration acceleration signal into segments between consecutive zero-crossing points (one fewer than the number of zero crossings), so that each segment represents the signal sampled during one full rotation of the synchronous axis.
(3) Perform cubic curve interpolation for each signal segment, so that the sampling points of each signal segment are the same. The newly interpolated sampling points are = 2 , where is the mean of sampling points of each signal segment.
(4) Stack and average the resampled signals to obtain the time-domain synchronous average signal.
Based on the speed signal sampled at the same angle, TSA resamples the time-domain vibration acceleration signal to ensure that each revolution has the same number of sampling points, that is, equal angle sampling. The speed signal is usually obtained by the photoelectric sensor, and a rotation of the axis produces several pulses. The more pulses generated by a rotation, the more accurate the speed signal will be. The position of zero crossing can be obtained by rotating speed signal, and it is the key of the TSA algorithm. The gear fault data is mainly the fault characteristic parameters extracted from the time-domain synchronous average signal.
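The four TSA steps above can be sketched as follows. This is a minimal illustration under stated assumptions: linear interpolation replaces the cubic interpolation described in step (3) for brevity, the revolution start indices `rev_starts` are assumed to have been obtained already from the speed pulse signal, and all function names are hypothetical.

```python
import math

def resample_revolution(signal, start, stop, n_points):
    """Sample one revolution, signal[start..stop], at n_points equal shaft angles.

    Linear interpolation is an assumption made for brevity; the text
    describes cubic interpolation.
    """
    length = stop - start
    out = []
    for k in range(n_points):
        pos = start + k * length / n_points  # fractional sample index
        lo = int(pos)
        frac = pos - lo
        out.append(signal[lo] * (1.0 - frac) + signal[lo + 1] * frac)
    return out

def time_synchronous_average(signal, rev_starts, n_points):
    """Resample every revolution to n_points samples, then average them."""
    revolutions = [
        resample_revolution(signal, a, b, n_points)
        for a, b in zip(rev_starts[:-1], rev_starts[1:])
    ]
    n_rev = len(revolutions)
    return [sum(rev[k] for rev in revolutions) / n_rev for k in range(n_points)]

# Synthetic check: a shaft-synchronous component (period 10 samples) plus a
# non-synchronous component (period 4 samples) that averages out over 4 turns.
signal = [math.sin(2 * math.pi * i / 10) + 0.8 * math.sin(2 * math.pi * i / 4)
          for i in range(41)]
rev_starts = [0, 10, 20, 30, 40]  # assumed zero-crossing sample indices
tsa = time_synchronous_average(signal, rev_starts, n_points=10)
```

In this synthetic case the averaged signal retains only the component that repeats once per revolution, which is the behavior TSA is used for here.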

Extract feature parameters [28]
After preprocessing, the feature extraction of the preprocessed signal is carried out. The feature parameters are the basis for feature classification and fault recognition. The feature parameters adopted in this paper include time-domain feature parameters, frequency-domain feature parameters and some feature parameters mentioned in the report of NASA.
Time-domain characteristic parameters include: maximum ($X_1$), minimum ($X_2$), peak-to-peak ($X_3$), mean value ($X_4$), mean square value ($X_5$), root mean square ($X_6$), variance ($X_7$), standard deviation ($X_8$), energy ($X_9$), root square amplitude ($X_{10}$), mean square amplitude ($X_{11}$), mean square amplitude ($X_{12}$), kurtosis ($X_{13}$), skewness ($X_{14}$), waveform index ($X_{15}$), peak index ($X_{16}$), pulse index ($X_{17}$), margin index ($X_{18}$), clearance coefficient ($X_{19}$), etc.
Some common statistical features in the frequency domain are also used. One feature is the mean frequency. A second group of features describes the convergence of the spectrum power, reflecting the energy of the frequency spectrum. A third group shows the change of the main frequencies that are dominant in the frequency spectrum. The frequency-domain features are calculated from the spectrum of the original signal, which contains more effective information than the time-domain features.
In addition to the common time-domain and frequency-domain features, NASA also proposed some features, such as FM0, ER, FM4, FM4*, M6A, M6A*, M8A, M8A*, NA4, NA4*, NB4, NB4*, EOP, etc. These features are mainly used to assess gear failure.
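Several of the time-domain parameters listed above can be computed as in the sketch below. The formulas are the commonly used textbook definitions (population statistics, dimensionless indices as ratios of peak, RMS, mean absolute value and root square amplitude); they are assumptions, since the paper's own equations are not reproduced here, and the dictionary keys are illustrative names.

```python
import math

def time_domain_features(x):
    """Common time-domain statistics of a non-constant signal x (list of floats)."""
    n = len(x)
    mean = sum(x) / n
    abs_mean = sum(abs(v) for v in x) / n
    rms = math.sqrt(sum(v * v for v in x) / n)
    variance = sum((v - mean) ** 2 for v in x) / n          # population variance
    peak = max(abs(v) for v in x)
    root_amp = (sum(math.sqrt(abs(v)) for v in x) / n) ** 2  # root square amplitude
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    return {
        "peak_to_peak": max(x) - min(x),
        "mean": mean,
        "rms": rms,
        "variance": variance,
        "skewness": m3 / variance ** 1.5,
        "kurtosis": m4 / variance ** 2,
        "waveform_index": rms / abs_mean,   # shape factor
        "peak_index": peak / rms,           # crest factor
        "pulse_index": peak / abs_mean,     # impulse factor
        "margin_index": peak / root_amp,    # clearance-type factor
    }

features = time_domain_features([0.0, 0.0, 0.0, 4.0])
print(features["peak_index"], features["pulse_index"])  # 2.0 4.0
```

Impulsive signals such as the one in the usage line give large peak, pulse and margin indices, which is why these dimensionless indices are popular indicators of localized gear and bearing damage.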

KELM
In the KELM algorithm, a kernel function is introduced into ELM, and the input weights and offsets in ELM are replaced by the kernel function mapping, which makes the output of the ELM more stable and alleviates the overfitting problem. Therefore, KELM is widely used in classification and recognition tasks.
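ELM is a single-hidden-layer feedforward neural network, and its hidden-layer weights do not need to be adjusted by feedback regulation. The structure of the ELM is shown in Fig. 2, including an input layer, a hidden layer and an output layer. Suppose $\{(x_j, t_j)\}_{j=1}^{N}$ is the training sample set, where $x_j = [x_{j1}, x_{j2}, \cdots, x_{jn}]^T \in R^n$ is the input characteristic parameter and $t_j = [t_{j1}, t_{j2}, \cdots, t_{jm}]^T \in R^m$ is the sample label. Then the output function of the ELM training model with $L$ hidden-layer nodes and activation function $g(\cdot)$ is:

$$o_j = \sum_{i=1}^{L} \beta_i \, g(w_i \cdot x_j + b_i), \quad j = 1, 2, \cdots, N$$

where $w_i = [w_{i1}, w_{i2}, \cdots, w_{in}]^T$ and $\beta_i = [\beta_{i1}, \beta_{i2}, \cdots, \beta_{im}]^T$ represent the input and output weights of the $i$-th hidden-layer node respectively, $b_i$ is the offset of the $i$-th hidden-layer node, and $o_j = [o_{j1}, o_{j2}, \cdots, o_{jm}]^T$ represents the output vector of the ELM. The goal of ELM training is to make the error between the actual output of the training samples and the sample labels close to 0, that is:

$$\sum_{j=1}^{N} \| o_j - t_j \| = 0$$

So there exist $\beta_i$, $w_i$ and $b_i$ that satisfy Eq. (4):

$$\sum_{i=1}^{L} \beta_i \, g(w_i \cdot x_j + b_i) = t_j, \quad j = 1, 2, \cdots, N \tag{4}$$

Eq. (4) can be expressed in matrix form:

$$H\beta = T$$

where $H$ is the output matrix of the hidden layer, which can be expressed as:

$$H = \begin{bmatrix} g(w_1 \cdot x_1 + b_1) & \cdots & g(w_L \cdot x_1 + b_L) \\ \vdots & \ddots & \vdots \\ g(w_1 \cdot x_N + b_1) & \cdots & g(w_L \cdot x_N + b_L) \end{bmatrix}_{N \times L}$$

For $\beta$, the equation is as follows:

$$\hat{\beta} = H^{+} T$$

where $H^{+}$ represents the generalized inverse matrix of $H$, and $\hat{\beta}$ is the least-squares solution of $H\beta = T$. According to the structural optimization and ERM criteria, the output weight of ELM can be determined, that is:

$$\min_{\beta} \; \frac{1}{2}\|\beta\|^2 + \frac{C}{2}\sum_{j=1}^{N}\|\xi_j\|^2, \quad \text{s.t.} \;\; h(x_j)\beta = t_j^T - \xi_j^T, \;\; j = 1, 2, \cdots, N \tag{9}$$

where $\xi_j = [\xi_{j1}, \xi_{j2}, \cdots, \xi_{jm}]^T$ represents the network calculation error, $h(x_j)$ is the $j$-th row of $H$, $\|\beta\|^2$ is the structural error, $\sum_{j=1}^{N}\|\xi_j\|^2$ is the empirical error, and $C$ is the penalty factor. Solving this optimization problem gives the optimal $\beta$:

$$\beta = H^T \left( \frac{I}{C} + HH^T \right)^{-1} T$$

Therefore, the output model function of ELM is as follows:

$$f(x) = h(x)\beta = h(x) H^T \left( \frac{I}{C} + HH^T \right)^{-1} T \tag{11}$$

By using the kernel operation to replace the matrix operation in ELM, the KELM classification model can be established. The kernel matrix is defined as:

$$\Omega = HH^T, \quad \Omega_{i,j} = h(x_i) \cdot h(x_j) = K(x_i, x_j)$$

where $\Omega$ is a symmetric matrix of size $N \times N$ and $K(x_i, x_j)$ is the kernel function.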
Eq. (11) can be expressed by the kernel function as follows:

$$f(x) = \left[ K(x, x_1), \cdots, K(x, x_N) \right] \alpha, \quad \alpha = \left( \frac{I}{C} + \Omega \right)^{-1} T$$

where $\alpha$ represents the output weight matrix of KELM. Different KELM models can be constructed by selecting different kernel functions. Among the commonly used kernel functions, compared with the polynomial kernel function and the Sigmoid kernel function, the radial basis kernel function has the advantages of simple parameters and strong adaptability to randomly distributed samples. Therefore, a KELM based on the radial basis kernel function is established in this paper, and its expression is as follows:

$$K(x_i, x_j) = \exp\left( -\frac{\|x_i - x_j\|^2}{2\sigma^2} \right) \tag{14}$$

According to Eq. (9) and (14), KELM contains the penalty coefficient $C$ and the kernel parameter $\sigma$, and it is necessary to find the optimal values of these parameters to make the classification effect of KELM the best.
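The KELM training and prediction described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes an RBF kernel of the form exp(-||u - v||^2 / (2 sigma^2)), one-hot class targets with labels numbered from 0, and a plain Gaussian elimination solver; the names `rbf_kernel`, `solve_linear`, `kelm_train` and `kelm_predict` are hypothetical.

```python
import math

def rbf_kernel(u, v, sigma):
    """Radial basis kernel (the 2*sigma^2 scaling is an assumed convention)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def solve_linear(a, b):
    """Solve A X = B (B may have several columns) by Gaussian elimination."""
    n = len(a)
    aug = [list(a[i]) + list(b[i]) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, len(aug[row])):
                aug[row][k] -= factor * aug[col][k]
    n_rhs = len(b[0])
    x = [[0.0] * n_rhs for _ in range(n)]
    for row in range(n - 1, -1, -1):
        for j in range(n_rhs):
            acc = aug[row][n + j] - sum(aug[row][k] * x[k][j] for k in range(row + 1, n))
            x[row][j] = acc / aug[row][row]
    return x

def kelm_train(samples, labels, n_classes, c, sigma):
    """Return the output weights: the solution of (I/C + Omega) alpha = T."""
    n = len(samples)
    targets = [[1.0 if labels[i] == k else 0.0 for k in range(n_classes)]
               for i in range(n)]
    system = [[rbf_kernel(samples[i], samples[j], sigma) + (1.0 / c if i == j else 0.0)
               for j in range(n)] for i in range(n)]
    return solve_linear(system, targets)

def kelm_predict(x, samples, alpha, sigma):
    """Class index with the largest output [K(x, x_1), ..., K(x, x_N)] alpha."""
    k_row = [rbf_kernel(x, s, sigma) for s in samples]
    n_classes = len(alpha[0])
    scores = [sum(k * a[cls] for k, a in zip(k_row, alpha)) for cls in range(n_classes)]
    return max(range(n_classes), key=lambda cls: scores[cls])

# Toy data: two well-separated clusters (illustrative values only).
samples = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (3.0, 3.0), (3.2, 2.9), (2.9, 3.1)]
labels = [0, 0, 0, 1, 1, 1]
alpha = kelm_train(samples, labels, n_classes=2, c=100.0, sigma=1.0)
print(kelm_predict((0.1, 0.1), samples, alpha, sigma=1.0))  # 0
print(kelm_predict((3.1, 3.0), samples, alpha, sigma=1.0))  # 1
```

Training reduces to one linear solve, which is why only the two parameters C and sigma remain to be tuned.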

QPSO-KELM
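By quantizing the iterative updating process of the particles of the PSO algorithm, QPSO can reduce the algorithm complexity and improve the convergence speed and global search ability. The basic principles of QPSO are as follows. Assume that $\Omega$ is the $D$-dimensional search space and that the population number of particles in the space is $M$; then the position of the $i$-th particle can be expressed as:

$$X_i = (x_{i1}, x_{i2}, \cdots, x_{iD}), \quad i = 1, 2, \cdots, M$$

Suppose the individual optimal position of the particle is $P_i$ and the global optimal position of the population is $G$; $P_i$ and $G$ are as follows:

$$P_i = (p_{i1}, p_{i2}, \cdots, p_{iD})$$

$$G = (g_1, g_2, \cdots, g_D)$$

The particles find and update their individual optimal positions and the population optimal position through iterative operation, and the average optimal position $m_{best}$ is introduced as the population optimal center. Then the particle optimization process can be expressed as:

$$m_{best} = \frac{1}{M}\sum_{i=1}^{M} P_i \tag{18}$$

$$a_{id} = \varphi \, p_{id} + (1 - \varphi) \, g_d \tag{19}$$

$$x_{id}(t+1) = a_{id} \pm \alpha \left| m_{best,d} - x_{id}(t) \right| \ln\left(\frac{1}{u}\right) \tag{20}$$

where $\varphi$ and $u$ are random numbers uniformly distributed in $(0, 1)$, $a_{id}$ is the local attractor of the particle, and $\alpha$ is the contraction expansion factor, which is dynamically adjusted in the iterative operation according to Eq. (21):

$$\alpha = \alpha_{end} + (\alpha_{start} - \alpha_{end}) \frac{T_{max} - t}{T_{max}} \tag{21}$$

where $T_{max}$ represents the maximum number of iterations, and $\alpha_{start}$ and $\alpha_{end}$ are the initial and final values of $\alpha$ respectively; a common choice is $\alpha_{start} = 1.0$ and $\alpha_{end} = 0.5$. According to the basic principles of KELM and QPSO, the network training process that uses QPSO to optimize the parameters of KELM is as follows:
(1) Initialize the positions of the population particles, and set parameters such as the particle swarm size, the number of iterations, the termination conditions and so on.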
(2) Take the current position of each particle as its individual optimal position $P_i$, define the classification accuracy of KELM as the fitness function, and calculate the fitness value of each particle. The position of the particle with the maximum fitness value is taken as the initial group optimal position $G$.
(3) Update the particle positions according to Eq. (18)(19)(20).
(4) Calculate the fitness value of each particle, and update the individual optimal position $P_i$, the group optimal position $G$ and the group optimal center $m_{best}$ based on the optimal fitness value.
(5) Judge whether the termination conditions are met. If so, stop the calculation and output the result, if not, return to step (3).
According to the above steps, the modeling flow chart of QPSO-KELM can be drawn as shown in Fig. 3.
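Steps (1) to (5) can be sketched as a generic QPSO maximizer. This is a minimal illustration rather than the authors' implementation: the function name `qpso_maximize`, the default swarm size, the iteration count, the fixed seed, the contraction-expansion range 1.0 to 0.5 and the clipping of positions to the search bounds are all assumptions. In the paper the fitness would be the KELM classification accuracy as a function of (C, σ); a simple quadratic stands in for it here so that the block runs on its own.

```python
import math
import random

def qpso_maximize(fitness, bounds, n_particles=20, n_iter=100,
                  alpha_start=1.0, alpha_end=0.5, seed=0):
    """Maximize fitness(position) over box bounds with a basic QPSO loop."""
    rng = random.Random(seed)
    dim = len(bounds)
    positions = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    pbest = [p[:] for p in positions]                 # individual optimal positions
    pbest_fit = [fitness(p) for p in positions]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]      # group optimal position
    for t in range(n_iter):
        # Contraction-expansion factor decreases linearly with the iteration count.
        alpha = alpha_end + (alpha_start - alpha_end) * (n_iter - t) / n_iter
        # Group optimal center: mean of all individual optimal positions.
        mbest = [sum(p[d] for p in pbest) / n_particles for d in range(dim)]
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                phi = rng.random()
                u = 1.0 - rng.random()                # in (0, 1], avoids log(1/0)
                attractor = phi * pbest[i][d] + (1.0 - phi) * gbest[d]
                step = alpha * abs(mbest[d] - positions[i][d]) * math.log(1.0 / u)
                x = attractor + step if rng.random() < 0.5 else attractor - step
                positions[i][d] = min(max(x, lo), hi)  # keep inside the search box
            fit = fitness(positions[i])
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = positions[i][:], fit
                if fit > gbest_fit:
                    gbest, gbest_fit = positions[i][:], fit
    return gbest, gbest_fit

# Stand-in fitness with its maximum at (1, -2); not the KELM accuracy.
best, best_fit = qpso_maximize(lambda p: -((p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2),
                               bounds=[(-5.0, 5.0), (-5.0, 5.0)])
```

To tune KELM, the lambda would be replaced by a function that trains a KELM with the candidate (C, σ) and returns its classification accuracy.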

Experimental verification and discussion
To verify the effectiveness of QPSO-KELM, seeded-fault tests are carried out for gear faults, bearing faults and gear-bearing mixed faults. The fault simulation test rig is shown in Fig. 4. The test rig consists of four parts: the power and control part, the bearing fault simulation part (not used), the gear fault simulation part and the data acquisition part (not shown). This section mainly uses the gearbox fault simulation part, which is mainly composed of a two-stage reduction spur gearbox, a magnetic powder brake (which supplies the load) and a magnetic powder brake controller (which controls the load change). The perspective and internal structure of the two-stage reduction spur gearbox are shown in Fig. 5. The numbers of gear teeth in the gearbox from the high-speed shaft to the low-speed shaft are 41, 79, 36 and 90. The faulty gear is the intermediate-shaft pinion (gear 3 with 36 teeth). The gear fault types include worn, broken and missing teeth. The faulty bearing is the deep groove ball bearing at the end cover on the gear 3 side of the medium-speed shaft (Fig. 5(b)). The bearing type is ER-16K, and its size parameters are shown in Table 1. The bearing fault types include inner race fault, outer race fault and single roller fault.

Fig. 3. Flow chart of QPSO-KELM modeling: initialize the particle swarm size, number of generations and particle positions (C, σ); determine the particle fitness function (the test classification accuracy of KELM); calculate the fitness value of each particle; update the individual optimal position, group optimal position and group optimal center; if the termination conditions are met, output the optimal parameter combination (C, σ) of KELM and end, otherwise repeat the update

Fig. 4. The mechanical fault simulation test bench
In the test, two vibration acceleration sensors are installed in the vertical and horizontal directions respectively. Both sensors are mounted outside the end cover of the faulty bearing (Fig. 5(c)) and are used to collect the vibration acceleration data in the vertical and horizontal directions. The data acquisition parameters are set as follows: sampling frequency = 20.48 kHz, sampling time = 48 s, motor speed 30 r/s (the actual motor speed is about 29.602 r/s). According to the motor speed, the numbers of gear teeth of the gearbox and the main dimension parameters of the bearing ER-16K (Table 1), the main characteristic frequencies of the gearbox can be calculated using the formulas for the bearing fault characteristic frequencies (Table 2). The calculation results are shown in Table 3.
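The calculation of the characteristic frequencies can be sketched as follows. Several assumptions are made: the motor is taken to drive the high-speed shaft directly, gear 1 (41 teeth) meshes with gear 2 (79 teeth) and gear 3 (36 teeth) meshes with gear 4 (90 teeth), the standard kinematic formulas for rolling bearings are used in place of Table 2, and the bearing dimensions in the usage lines are placeholders rather than the values of Table 1. The function names are hypothetical.

```python
import math

def gearbox_frequencies(motor_hz, teeth=(41, 79, 36, 90)):
    """Shaft rotating frequencies and gear meshing frequencies in Hz."""
    z1, z2, z3, z4 = teeth
    mid_hz = motor_hz * z1 / z2   # medium-speed shaft
    low_hz = mid_hz * z3 / z4     # low-speed shaft
    return {
        "shaft_high": motor_hz,
        "shaft_mid": mid_hz,
        "shaft_low": low_hz,
        "mesh_12": motor_hz * z1,  # gear 1 with gear 2
        "mesh_34": mid_hz * z3,    # gear 3 with gear 4
    }

def bearing_fault_frequencies(shaft_hz, n_balls, ball_d, pitch_d, contact_deg=0.0):
    """Standard kinematic bearing fault frequencies (outer race fixed)."""
    ratio = ball_d / pitch_d * math.cos(math.radians(contact_deg))
    return {
        "outer_race": 0.5 * n_balls * shaft_hz * (1.0 - ratio),
        "inner_race": 0.5 * n_balls * shaft_hz * (1.0 + ratio),
        "ball_spin": pitch_d / (2.0 * ball_d) * shaft_hz * (1.0 - ratio ** 2),
    }

gear_freqs = gearbox_frequencies(29.602)
# Placeholder bearing geometry (NOT taken from Table 1): 9 balls,
# 7.94 mm ball diameter, 38.5 mm pitch diameter, zero contact angle.
bearing_freqs = bearing_fault_frequencies(gear_freqs["shaft_mid"], 9, 7.94, 38.5)
```

The faulty bearing sits on the medium-speed shaft, so that shaft's rotating frequency is passed to the bearing formulas.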

Bearing fault diagnosis
Firstly, the single fault of the bearing is analyzed. In the experiment, the motor speed is set to 30 r/s, and the signal sampling frequency is 20.48 kHz. The analysis data is mainly collected by sensor 1 (vertical direction). The test data types include bearing normal data, rolling element fault data, inner ring fault data and outer ring fault data. Each type contains 240 groups of data, giving 960 groups in total. The sampling time of each group of data is 2 seconds.
Table 3 lists the rotating frequencies of the medium-speed axis and the low-speed axis, the meshing frequencies of gear 1 with gear 2 and of gear 3 with gear 4, and the fault characteristic frequencies of the bearing outer race, the bearing inner race and the bearing ball spin.
The bearing fault is diagnosed following the procedure shown in Fig. 1. Firstly, the time-domain, frequency-domain and NASA characteristic parameters are extracted from each group of bearing data, giving 960 samples. Each sample is labeled according to the following rules: normal-1, roller-2, inner-3, outer-4. Then, the training set and test set are selected from all samples in a ratio of 3:1, and the training set is input to KELM for training. In order to maximize the test accuracy, QPSO is used to optimize the parameters of KELM. Finally, in order to verify the superiority and effectiveness of the method, ELM and PSO-KELM are also applied to bearing fault diagnosis, and the experimental results of the different methods are compared with each other.
Since the standard ELM algorithm uses a single-hidden-layer feedforward neural network structure, it is unnecessary to consider the parameter optimization of ELM. Therefore, only the parameter optimization processes of PSO-KELM and QPSO-KELM are analyzed, and the results are shown in Fig. 6. Fig. 6(a) shows the iterative process of the two methods, and Fig. 6(b) is a locally enlarged view of Fig. 6(a). It can be seen from the figure that both the PSO-KELM and QPSO-KELM algorithms reach their highest test accuracy from the beginning, which shows that the fault characteristics of a single bearing fault are obvious and that high classification accuracy can be achieved without iterative parameter optimization. The initial parameter ranges of QPSO-KELM are $C \in [0.01, 1000]$ and $\sigma \in [0.01, 100]$, and the optimized parameters are $C = 1000$ and $\sigma = 18$.

Fig. 6. a) Iterative process of the two methods, b) local enlargement of a)
The final training accuracy and test accuracy of the methods are shown in Table 4. It can be seen from Table 4 that both the training accuracy and the test accuracy of QPSO-KELM are the highest, reaching 99.58 %. The QPSO-KELM classification results of the test dataset are shown in Fig. 7, in which there is only one misclassified point, so the accuracy of the trained QPSO-KELM model is reliable.

Gear fault diagnosis
Secondly, the single fault of the gear is analyzed. The faulty gear is the medium-speed shaft pinion in the gearbox, which is gear 3 in Fig. 5(b). In the experiment, the motor speed and the signal sampling frequency are the same as in Section 4.1. The analysis data is mainly collected by sensor 1 (vertical direction). The test data types include gear normal data, worn tooth data, broken tooth data and missing tooth data. Each type contains 240 groups of data, giving 960 groups in total. The sampling time of each group of data is 2 seconds.
The gear fault is diagnosed following the procedure shown in Fig. 1. Firstly, the time-domain, frequency-domain and NASA characteristic parameters are extracted from each group of gear data, giving 960 samples. Each sample is labeled according to the following rules: normal-1, wear-2, miss-3, break-4. Then, the training set and test set are selected from all samples in a ratio of 3:1, and the training dataset is input to KELM for training. In order to maximize the test accuracy, QPSO is used to optimize the parameters of KELM. Finally, in order to verify the effectiveness of the proposed method and highlight its advantages, ELM and PSO-KELM are also used for gear fault diagnosis and compared with QPSO-KELM.
The parameter optimization processes of PSO-KELM and QPSO-KELM are analyzed, and the iteration process is shown in Fig. 8. It can be seen from the figure that the QPSO-KELM algorithm reaches its highest test accuracy from the beginning, which indicates that QPSO-KELM has stronger learning ability, while PSO-KELM reaches its maximum value at the 11th iteration, which indicates that its learning ability is slightly inferior to that of the method proposed in this paper. The initial setting of parameters is the same as that in Section 4.1, and the parameters after optimization are $C = 1000$ and $\sigma = 19$.
The final training accuracy and test accuracy of the methods are shown in Table 5. It can be seen from Table 5 that both the training accuracy and the test accuracy of QPSO-KELM are the highest, and the test accuracy reaches 98.75 %. The QPSO-KELM classification results of the test dataset are shown in Fig. 9, in which there are only three misclassified points, so the accuracy of the trained QPSO-KELM model is acceptable. From the iteration process and the fault diagnosis accuracy, it can be seen that the fault features of the bearing are more obvious than those of the gear. Therefore, bearing fault diagnosis needs fewer iterations and still achieves a higher test accuracy than gear fault diagnosis.
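The reported single-fault test accuracies can be cross-checked against the misclassification counts, assuming that the test set is exactly one quarter of the 960 samples (the 3:1 split). The helper name `holdout_accuracy` is illustrative.

```python
def holdout_accuracy(n_samples, n_errors, test_fraction=0.25):
    """Return (test set size, test accuracy in percent rounded to 2 decimals)."""
    n_test = round(n_samples * test_fraction)
    return n_test, round(100.0 * (n_test - n_errors) / n_test, 2)

print(holdout_accuracy(960, 1))  # (240, 99.58)  bearing: one misclassified point
print(holdout_accuracy(960, 3))  # (240, 98.75)  gear: three misclassified points
```

Both values agree with the accuracies quoted for the bearing and gear experiments.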

Gear-bearing fault diagnosis
Finally, the composite gear-bearing faults are analyzed. The faulty parts are the gear and bearing mentioned in the previous sections. There are 10 types of gear-bearing data: normal gear-normal bearing data, worn gear-faulty bearing rolling element data, worn gear-faulty bearing inner race data, worn gear-faulty bearing outer race data, missing gear tooth-faulty bearing rolling element data, missing gear tooth-faulty bearing inner race data, missing gear tooth-faulty bearing outer race data, broken gear tooth-faulty bearing rolling element data, broken gear tooth-faulty bearing inner race data, and broken gear tooth-faulty bearing outer race data. The mixed gear-bearing faults are shown in Table 6.
In the experiment, the motor speed is set to 30 r/s, and the signal sampling frequency is 20.48 kHz. The sampling time of each group of data is 2 seconds. The total number of data groups is 2472: the normal state has 288 samples, the broken tooth-faulty inner race state has 264 samples, and each of the other eight states has 240 samples. The gear-bearing fault is diagnosed following the procedure shown in Fig. 1. Firstly, the time-domain, frequency-domain and NASA feature parameters are extracted from each group of gear-bearing data, giving 2472 samples. Each sample is labeled according to the following rules: normal-1, wear-roller-2, wear-inner-3, wear-outer-4, miss-roller-5, miss-inner-6, miss-outer-7, break-roller-8, break-inner-9, break-outer-10. Then, the training set and test set are randomly selected from all samples in a ratio of 3:1, and the training dataset is input to KELM for training. In order to maximize the test accuracy, QPSO is used to optimize the parameters of KELM. Finally, in order to verify the effectiveness and superiority of the method, ELM and PSO-KELM are also used for gear-bearing fault diagnosis and compared with QPSO-KELM.
The optimization processes of PSO-KELM and QPSO-KELM are shown in Fig. 10. It can be seen from the figure that the test accuracy of the QPSO-KELM algorithm reaches its highest value after six iterations, while the accuracy of the PSO-KELM algorithm does not change from the beginning and remains low, indicating that the classification ability of PSO-KELM is weaker than that of QPSO-KELM. The initial setting of parameters is the same as that in Section 4.1, and the parameters after optimization are $C = 1000$ and $\sigma = 20$.
The final classification accuracy of the three methods for mixed gear-bearing faults is shown in Table 7. It can be seen from Table 7 that the training accuracy of QPSO-KELM is not the highest, but its final test accuracy, 90.13 %, is far higher than that of the other two methods, whose test accuracies are no higher than 60 %. The QPSO-KELM classification results of the test dataset are shown in Fig. 11. As can be seen from Fig. 11, the misclassified points are mainly concentrated in three states: the worn-rolling element fault state, the worn-outer race fault state and the broken-outer race fault state. Therefore, more sensitive feature parameters should be extracted for these three states.

Conclusions
A novel QPSO-KELM method is proposed in this paper and applied to gearbox fault diagnosis. The method uses the QPSO algorithm to select proper KELM parameters, namely the kernel parameter $\sigma$ and the penalty coefficient $C$. In order to verify the effectiveness and superiority of the method, the time-domain, frequency-domain and NASA features are extracted, and the results for the bearing single fault, gear single fault and gear-bearing mixed fault are analyzed. After comparing with two other methods (ELM and PSO-KELM) and analyzing the test results, the following conclusions can be given:
(1) Experimental results show that the proposed method is effective and can identify the types of faults mentioned above.
(2) Compared with ELM and PSO-KELM, QPSO-KELM has more advantages. The accuracy of QPSO-KELM is more than 90 % on both the training data and the test data, which is higher than that of the other methods.
The study focuses on optimizing the parameters of KELM and on its application in gearbox fault diagnosis. The analysis of the experimental results shows, however, that there is still room to improve the identification accuracy of gear-bearing hybrid faults. Since the accuracy of fault judgment is closely related to fault feature extraction, the next step of the study is to extract more sensitive and effective fault feature parameters in order to further improve the identification accuracy of composite faults.