Nonlinear factor analysis and its application to acoustical source separation and identification

Wei Cheng¹, Lin Gao², Jie Zhang³, Jiantao Lu⁴
¹, ³, ⁴ State Key Laboratory for Manufacturing Systems Engineering, Xi’an Jiaotong University, Xi’an 710049, Shaanxi, China
² Institute of Biomedical Engineering, Key Laboratory of Biomedical Information Engineering of Education Ministry, Xi’an Jiaotong University, Xi’an 710049, Shaanxi, China
² Corresponding author
E-mail: ¹chengw@xjtu.edu.cn, ²gaolin2013@xjtu.edu.cn, ³epicureans@163.com, ⁴lujiantao1990@stu.xjtu.edu.cn


Introduction
Vibration and noise normally reduce the operational precision and even shorten the service life of machinery. However, vibration and noise caused by the collision and friction of mechanical components also carry important information about operating conditions, so noise monitoring, reduction and control can be carried out on the basis of non-destructive measurement. In essence, measured acoustical signals are mixtures of all sources and noises. It is therefore of great significance to separate and identify source information from the measured signals, and to provide pure source information for each mechanical component, especially the key components, for effective noise monitoring, reduction or control.
Generally, measured acoustical signals carry complicated and rough information about mechanical systems, which results from the complicated mixing of sources and the transmission effects of mechanical structures. In the past decade, many researchers have devoted their efforts to the transmission effects of vibration and acoustical signals. Radzevich S. P. [1] proposed an advanced technology for finishing topologically modified pinion tooth surfaces for low-noise/noiseless vehicle transmissions. Denli H. [2] presented an optimization study of cylindrical sandwich shells to minimize the sound transmitted into the interior by exterior acoustic excitations. Xin F. X. [3] formulated an analytical approach to account for the effects of mean flow on sound transmission across a simply supported rectangular aeroelastic panel. Nennig B. [4] studied the propagation of sound in a lined duct containing sheared mean flow. Bravo T. [5] described theoretical and experimental investigations into the sound absorption and transmission properties of micro-perforated panels backed by an air cavity and a thin plate. Yin J. F. [6] considered an approach using advanced statistical energy analysis that can incorporate tunneling mechanisms within a statistical approach. Fleury R. [7] studied the anomalous sound transmission and uniform energy squeezing through ultranarrow acoustic channels filled with zero-density metamaterials. Kim C. J. [8] developed a model for predicting the vibration transmission from two major excitation sources, ground vibration and fluid bearing force, to the tool and the workpiece position through the mechanical and control system of a precision machine. Yu X. [9] proposed a virtual panel that treats an aperture as an equivalent structural component, which can be integrated with the solid/flexible structure to form a unified compound interface. Liu Y. [10] extended the prediction of sound transmission loss for a double-panel structure lined with poroelastic materials to the problem of a triple-panel structure. All these articles studied sound transmission properties through different structures and can benefit passive vibration and noise control. However, building a precise acoustical transmission model for complicated mechanical systems is still a challenging task that also costs plenty of time and resources.
Signal processing provides another way to interpret the operating conditions of a system through its response signals, and it has benefited system analysis, machinery condition monitoring and fault diagnosis. To effectively separate and identify acoustical sources, source separation was developed to extract source information from measured mixed signals without knowledge of the sources or their mixing mode. Cheng W. studied source number estimation [11, 12], source separation [13] and source contribution evaluation [14, 15] methods for mechanical systems based on effective source separation. Zhang E. L. [16] proposed an efficient solution to the separation of uncorrelated wide-band sound sources. Hioka Y. [17] proposed a method that can separate underdetermined sound sources based on a novel power spectral density estimation. Dong B. [18] offered a method for separating incoherent and compact sound sources based on the least spatial entropy. Han T. J. [19] proposed a new method of estimating the location of a predominant source in an amplitude-panned stereo signal with two sources. To handle the nonlinear mixing mode of sources in real physical systems, nonlinear factor analysis (NLFA) was developed to adaptively extract the basic factors hidden in the observed signals. McDonald R. P. [20] first formed a parametric function for nonlinear factor analysis in 1962, and then constructed a polynomial model with numerical methods [21]. After that, many nonlinear factor analysis methods were developed. Jochum C. [22] described a combined linear and nonlinear factor analysis program package for chemical data evaluation. Etezadi-Amoli J. [23] studied a second-generation nonlinear factor analysis. Zhu H. T. [24] presented a Bayesian analysis of a general nonlinear factor analysis model. Valpola H. [25] developed a computationally efficient algorithm for a nonlinear extension of the linear factor analysis model. Yalcin I. [26] reviewed the identification ambiguity and the heavy reliance on normality of nonlinear factor analysis. Besides theoretical studies, nonlinear factor analysis has also proved a powerful tool in real physical systems, and it has been applied to equity trading [27], structural health monitoring [28], aliasing detection in images [29], and integrative data analysis [30]. In this paper, NLFA is applied to acoustical source separation and identification, and the separation performance of NLFA for acoustical signals is quantitatively evaluated through numerical case studies and an experimental study on a test bed with shell structures.
The remainder of this paper is organized as follows. In Section 2, the basic theory and key principles of NLFA are introduced. In Section 3, the separation performance of NLFA is comparatively studied with different numbers of hidden neurons and mixed signals. In Section 4, NLFA is applied to separate and identify acoustical sources based on source separation and correlation analysis. In Section 5, the conclusions of this study are summarized.

Nonlinear Factor Analysis Model
Consider a physical system with observations $x(t) = [x_1(t), x_2(t), \cdots, x_m(t)]^T$, sources $s(t) = [s_1(t), s_2(t), \cdots, s_k(t)]^T$, and noises $n(t) = [n_1(t), n_2(t), \cdots, n_m(t)]^T$. The mixing mode of all the sources can be expressed as a function $f(\cdot)$, a parametrised mapping from source space to observation space, which is shown in Fig. 1. Nonlinear factor analysis thus models how the observations are generated from the sources [31]:

$$x(t) = f(s(t)) + n(t) \tag{1}$$

If the noises can be ignored or treated as sources, the above model can be further simplified as:

$$x(t) = f(s(t)) \tag{2}$$

In Fig. 1, the nonlinearity of each hidden neuron is the hyperbolic tangent, which is the same as the usual logistic sigmoid except for a scaling. The mapping can be defined as:

$$f(s(t)) = B \tanh(A s(t) + a) + b \tag{3}$$

where the matrices $A$ and $B$ are the weights of the first and second layers, and $a$ and $b$ are the corresponding biases. The sources are assumed to have zero-mean Gaussian distributions whose variances are parametrised by the log-std $v_s$, and the noise is assumed to be independent and Gaussian. Therefore, all the parameters are considered to have the following distributions:

$$\begin{aligned}
x(t) &\sim N\left(f(s(t)),\, e^{2 v_x}\right), \\
s(t) &\sim N\left(0,\, e^{2 v_s}\right), \\
A &\sim N(0,\, 1), \\
B &\sim N\left(0,\, e^{2 v_B}\right), \\
a &\sim N\left(m_a,\, e^{2 v_a}\right), \\
b &\sim N\left(m_b,\, e^{2 v_b}\right).
\end{aligned} \tag{4}$$

The prior distributions of $m_a$, $v_a$, $m_b$, $v_b$ and the six hyper-parameters $m_{v_s}, \ldots, v_{v_B}$ are assumed to be Gaussian with zero mean and a large standard deviation, so that the priors are very flat.
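The mapping from sources to observations described above can be sketched in a few lines. This is only an illustration under stated assumptions: the layer sizes, the random weights and the helper names `mlp_mapping` and `random_matrix` are hypothetical and do not come from the paper, and the noise term is left out.

```python
import math
import random


def mlp_mapping(s, A, a, B, b):
    """Evaluate f(s) = B tanh(A s + a) + b for a single source vector s."""
    # First layer: one tanh hidden neuron per row of A.
    hidden = [math.tanh(sum(w * si for w, si in zip(row, s)) + bias)
              for row, bias in zip(A, a)]
    # Second layer: linear read-out, one observation per row of B.
    return [sum(w * h for w, h in zip(row, hidden)) + bias
            for row, bias in zip(B, b)]


def random_matrix(rows, cols, rng):
    """Matrix with zero-mean, unit-variance Gaussian entries (illustrative only)."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
n_sources, n_hidden, n_obs = 3, 5, 4          # hypothetical sizes
A = random_matrix(n_hidden, n_sources, rng)   # first-layer weights
a = [rng.gauss(0.0, 1.0) for _ in range(n_hidden)]
B = random_matrix(n_obs, n_hidden, rng)       # second-layer weights
b = [rng.gauss(0.0, 1.0) for _ in range(n_obs)]

s_t = [rng.gauss(0.0, 1.0) for _ in range(n_sources)]  # one sample of the sources
x_t = mlp_mapping(s_t, A, a, B, b)                     # noise-free observation
```

In NLFA the weights and biases are unknown and are estimated together with the sources; the sketch only shows the forward direction of the model.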

Objective function
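If only the observations $x$ are available, the posterior pdf of all the hidden sources has to be approximated. The objective function therefore measures the misfit between the actual posterior pdf $p(\theta \mid x)$ and its approximation $q(\theta \mid x)$, where $\theta$ denotes the unknown variables:

$$C = E_q\left\{ \ln \frac{q(\theta \mid x)}{p(\theta \mid x)} \right\}$$

Each individual approximation $q(\theta_i \mid x)$ is parametrised by the posterior mean $\bar{\theta}_i$ and the posterior variance $\tilde{\theta}_i$ of the parameter. The objective function $C(x; \bar{\theta}, \tilde{\theta})$ consists of two parts, $C_q = E_q\{\ln q\}$ and $C_p = -E_q\{\ln p\}$:

$$C(x; \bar{\theta}, \tilde{\theta}) = C_q + C_p$$

Since the variables are independent in $q$, the expectation over $q$ can be evaluated term by term. In the expressions that follow from Eq. (4), $\tilde{f}(t)$ denotes the total posterior variance of the output $f(t)$, and $\tilde{f}^{*}(t)$ denotes the variance originating from the weights and biases.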
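As a worked step, the expectation of one observation term can be written out. The derivation below is a sketch under assumptions stated here rather than the paper's own equations: a single observation $x_k(t)$ is Gaussian with mean $f_k(t)$ and log-std $v_x$, and under $q$ the output $f_k(t)$ and $v_x$ are independent, with $v_x$ Gaussian, with posterior means $\bar{f}_k(t)$, $\bar{v}_x$ and posterior variances $\tilde{f}_k(t)$, $\tilde{v}_x$.

```latex
\begin{aligned}
-\ln p\left(x_k(t) \mid s(t), \theta\right)
  &= \tfrac{1}{2}\left[x_k(t) - f_k(t)\right]^2 e^{-2 v_x} + v_x + \tfrac{1}{2}\ln 2\pi, \\
E_q\left\{\left[x_k(t) - f_k(t)\right]^2\right\}
  &= \left[x_k(t) - \bar{f}_k(t)\right]^2 + \tilde{f}_k(t), \\
E_q\left\{e^{-2 v_x}\right\}
  &= e^{2\tilde{v}_x - 2\bar{v}_x}, \\
E_q\left\{-\ln p\left(x_k(t) \mid s(t), \theta\right)\right\}
  &= \tfrac{1}{2}\left\{\left[x_k(t) - \bar{f}_k(t)\right]^2 + \tilde{f}_k(t)\right\}
     e^{2\tilde{v}_x - 2\bar{v}_x} + \bar{v}_x + \tfrac{1}{2}\ln 2\pi .
\end{aligned}
```

The second line is the usual bias-plus-variance split of a squared error, and the third is the mean of a log-normal variable. The product in the last line relies on the assumed independence of $f_k(t)$ and $v_x$ under $q$.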

Parameter updating rules
After the objective function is constructed, the next step is to use an optimization algorithm to minimize $C(x; \bar{\theta}, \tilde{\theta})$ with respect to the posterior means and variances of the unknown variables. A fixed-point equation is used to update the variances, and a Newton iteration is used to update the means, since the posterior variances contain information about the second-order derivatives of the objective function $C(x; \bar{\theta}, \tilde{\theta})$. A learning parameter is adjusted by a rule that dampens the oscillations of the fixed-point algorithm.
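The two kinds of update can be illustrated on a toy problem. The sketch below is not the paper's algorithm. It assumes a Gaussian approximation $q$ whose entropy term gives the fixed point $\tilde{\theta} = [2\,\partial C_p / \partial \tilde{\theta}]^{-1}$, and a Newton-like mean step $\bar{\theta} \leftarrow \bar{\theta} - \tilde{\theta}\,\partial C_p / \partial \bar{\theta}$ that uses the posterior variance as the inverse curvature. The toy model (unknown mean `mu` of Gaussian data with a Gaussian prior) and the function name `update_posterior` are hypothetical, and the damping rule is omitted.

```python
def update_posterior(xs, sigma2, tau2, mean=0.0, var=1.0, n_iter=3):
    """Toy model: x_i ~ N(mu, sigma2), prior mu ~ N(0, tau2), q(mu) = N(mean, var).

    Up to a constant, C_p = sum_i [(x_i - mean)^2 + var] / (2 sigma2)
                           + (mean^2 + var) / (2 tau2).
    """
    n = len(xs)
    for _ in range(n_iter):
        # Fixed-point variance update: var = 1 / (2 * dC_p/dvar).
        dcp_dvar = 0.5 * (n / sigma2 + 1.0 / tau2)
        var = 1.0 / (2.0 * dcp_dvar)
        # Newton-like mean update: the variance plays the role of 1 / curvature.
        dcp_dmean = -sum(x - mean for x in xs) / sigma2 + mean / tau2
        mean = mean - var * dcp_dmean
    return mean, var


xs = [1.2, 0.8, 1.1, 0.9]   # illustrative data
mean, var = update_posterior(xs, sigma2=0.25, tau2=100.0)
```

Because this toy objective is quadratic in the mean, the iteration reaches the exact posterior mean and variance after the first pass. In the nonlinear model the same updates are only approximate, which is why damping is needed there.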

Source Identification
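Some information about the sources of a mechanical system can normally be obtained from theoretical studies or from the equipment instructions, so a waveform correlation is constructed to identify the source information through correlation analysis between the sources and the separated signals. For discrete signals $x(n)$ and $y(n)$, the waveform correlation coefficient is defined as:

$$\rho_{xy} = \frac{\sum_{n=1}^{N} \left[x(n) - \bar{x}\right]\left[y(n) - \bar{y}\right]}{\sqrt{\sum_{n=1}^{N} \left[x(n) - \bar{x}\right]^2 \sum_{n=1}^{N} \left[y(n) - \bar{y}\right]^2}}$$

where:

$$\bar{x} = \frac{1}{N}\sum_{n=1}^{N} x(n), \qquad \bar{y} = \frac{1}{N}\sum_{n=1}^{N} y(n)$$

and $N$ is the data length.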
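A waveform correlation coefficient of this kind can be computed as sketched below. The exact formula is not legible in this text, so the sketch assumes the usual normalised, mean-removed correlation. Taking the absolute value is a further assumption, made because a separated factor may come out with inverted sign. The function name `waveform_correlation` is hypothetical.

```python
import math


def waveform_correlation(x, y):
    """Absolute normalised correlation between two equal-length discrete signals."""
    if len(x) != len(y) or not x:
        raise ValueError("signals must be non-empty and of equal length")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    num = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    den = math.sqrt(sum((xi - mean_x) ** 2 for xi in x)
                    * sum((yi - mean_y) ** 2 for yi in y))
    if den == 0.0:
        raise ValueError("correlation is undefined for a constant signal")
    return abs(num) / den


# A sine wave and a scaled, inverted, shifted copy of it are fully correlated.
sine = [math.sin(2.0 * math.pi * k / 100.0) for k in range(100)]
copy = [-2.0 * v + 3.0 for v in sine]
rho = waveform_correlation(sine, copy)
```

With this definition the coefficient lies between 0 and 1 and is insensitive to the scale, offset and sign of either signal, which matches the amplitude and sign ambiguity that source separation methods typically leave.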

Introductions
In this section, typical signals of mechanical systems are artificially generated and used to test how the numbers of hidden neurons and mixed signals affect the separation performance of NLFA. Acoustical signals travel from the sources to the measuring points mainly through air, and the generating mechanisms of the sources are statistically different. Therefore, all the sources are considered to be linearly mixed and independent of each other.
The source signals are: signal $s_1(t)$ is a periodic wave of oscillating attenuation that simulates mechanical shocks; signal $s_2(t)$ is a sinusoidal wave that simulates a vibration signal of rotational equipment; signal $s_3(t)$ is a periodic wave that simulates amplitude modulation; signal $s_4(t)$ is a white noise that simulates environmental noises. In the numerical case study, the noise is considered to be one source, and each source is produced by a given generating function. The waveforms of the sources and of the mixed signals are shown in Fig. 2 and Fig. 3, respectively. From Fig. 2 it can be seen that all the source signals have typical wave features. However, from Fig. 3 it is very difficult to identify the features of the sources, as the waveforms of all the sources are coupled together. Therefore, nonlinear factor analysis is applied to separate the mixed signals into uncorrelated factors, and source identification can then be carried out by a correlation analysis between the separated factors and the given sources.
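Test signals of the four kinds described above, and their linear mixtures, can be generated as sketched below. The generating functions of the paper are not available in this text, so every frequency, decay rate, sampling rate and function name here (`make_sources`, `mix`) is a hypothetical choice for illustration only.

```python
import math
import random


def make_sources(n_samples=1000, fs=1000.0, seed=0):
    """Return the time axis and four illustrative source signals."""
    rng = random.Random(seed)
    t = [k / fs for k in range(n_samples)]
    # s1: decaying oscillation restarted every 0.1 s (mechanical shocks).
    s1 = [math.exp(-60.0 * (tk % 0.1)) * math.sin(2.0 * math.pi * 200.0 * (tk % 0.1))
          for tk in t]
    # s2: pure sine wave (rotating equipment).
    s2 = [math.sin(2.0 * math.pi * 30.0 * tk) for tk in t]
    # s3: sine carrier with sinusoidal amplitude modulation.
    s3 = [(1.0 + 0.5 * math.sin(2.0 * math.pi * 5.0 * tk))
          * math.sin(2.0 * math.pi * 60.0 * tk) for tk in t]
    # s4: Gaussian white noise (environmental noise treated as a source).
    s4 = [rng.gauss(0.0, 1.0) for _ in t]
    return t, [s1, s2, s3, s4]


def mix(sources, n_mixed, seed=1):
    """Mix the sources linearly with a random n_mixed x n_sources matrix."""
    rng = random.Random(seed)
    matrix = [[rng.uniform(-1.0, 1.0) for _ in sources] for _ in range(n_mixed)]
    n = len(sources[0])
    mixed = [[sum(row[j] * sources[j][k] for j in range(len(sources)))
              for k in range(n)] for row in matrix]
    return matrix, mixed


t, sources = make_sources()
mixing_matrix, mixed = mix(sources, n_mixed=4)
```

Changing `n_mixed` reproduces the kind of experiment in which the number of mixed signals is varied while the sources stay fixed.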

Effects by numbers of hidden neurons
To test the effect of the number of hidden neurons (HN), four mixed signals are separated into four factors using nonlinear factor analysis with different numbers of hidden neurons, and the waveforms of each separated component under the different separating conditions are shown in Fig. 4-7.
Fig. 4-7 show the waveforms of the separated factors obtained by NLFA with different HN: Fig. 4 shows that the major feature of source 1 is extracted when HN is no less than 4; Fig. 5 shows that the major features of source 2 are extracted when HN is no less than 2; Fig. 6 shows that the major features of source 3 are extracted when HN is no less than 3; and Fig. 7 shows that the major feature of source 4 is extracted when HN is no less than 2. Generally, the separation performance of NLFA improves as HN increases (HN < 4), and then changes very little as HN increases further (HN ≥ 4). The waveform correlation coefficients between the separated components and the related sources are shown in Fig. 8: the separation performance improves as HN increases (HN ≤ 5) and changes very little afterwards (HN > 5), while source 2 and source 4 are already well separated when HN is only 2. Generally, increasing HN can improve the separation performance, but it also costs more calculating time: 15.44 s (HN = 2), 16.09 s (HN = 3), 16.35 s (HN = 4), 17.25 s (HN = 5), 17.70 s (HN = 6), 18.22 s (HN = 7), 19.60 s (HN = 8), 21.90 s (HN = 9), and 22.26 s (HN = 10). Moreover, as HN increases, calculating errors can also arise, as can be seen from the curve of separated factor 1 in Fig. 8. Therefore, HN should normally be kept as small as possible to improve the calculating efficiency and avoid additional calculating errors. From Fig. 8, the number of HN is suggested to be 5-7.

For different numbers of mixed signals (Fig. 9-12), Fig. 10 shows the components separated from 3 mixed signals, of which only the sine wave is well separated. Fig. 11 shows that 4 components are separated from 4 mixed signals, and the major features of source 2 and source 4 are well separated. The features of source 1 and source 3 are also separated, but they are not clear, as they are still coupled with other waveforms. Fig. 12 shows that 4 components are separated from 5 mixed signals, and the major features of all the sources are well separated. Therefore, in this case study, the separation performance of NLFA improves as the number of mixed signals increases.
To quantitatively reveal the effect of the number of mixed signals, more numerical case studies are carried out, and the correlation coefficients between the separated factors and the sources are shown in Fig. 13 for numbers of mixed signals from 2 to 10. From Fig. 13, it can be seen that the separation performance improves quickly when the number of mixed signals is below 4 and changes very little when it is 4 or more. Therefore, in theory, increasing the number of mixed signals can improve the separation performance. However, in engineering applications, more mixed signals mean that more noise may be added, and the effectiveness of the separation cannot be guaranteed. It is therefore suggested that the number of mixed signals should be equal to the number of sources, or one more. In this case study, the number of mixed signals is suggested to be 4 or 5.

Introductions of the test bed
A test bed is constructed to test the separation performance of NLFA. It has four major components: two end covers, a shell structure, two clapboards, and supports. Rubber air springs support the whole test bed, which reduces the effects of ground vibrations and environmental noises. Three acoustical sources are designed in the test bed: two loudspeakers controlled by signal generators, and one motor controlled by a frequency converter. The structure of the test bed is shown in Fig. 14. Six sound pressure sensors are used to collect acoustical signals, and they are placed in six different directions around the test bed at the same distance of 500 mm. An HBM Gen2i data acquisition system is used to collect the acoustical data. The framework of the data acquisition system is shown in Fig. 15, and the test parameters are shown in Table 1.

Acoustical signals of the test bed
With all the sources working at the given parameters, the mixed acoustical signals are measured by the six sound pressure sensors. In the experimental study, the acoustical signals from sensors 2, 3, 4 and 6 are used as mixed signals. The number of mixed signals is thus 1 more than the number of sources, which guarantees an effective source separation by NLFA and also reduces the effects of environmental noise, and the directions of these sensors represent a diversified mixing mode of the sources. The waveforms and spectrums of the mixed signals are shown in Fig. 16 and Fig. 17, respectively. From the waveforms in Fig. 16, it is very difficult to identify source features apart from some periodic waves. From the spectrums in Fig. 17, nearly all the mixed signals contain major components at 20, 910, 1090, 1500, 1800, 2000, and 2200 Hz, which can be used to identify the sources. Generally, independent source information cannot be directly identified from the mixed signals, as the sources are coupled together.

Acoustical source separation
NLFA is applied to separate the mixed signals into uncorrelated factors, and 3 factors are extracted from the given mixed signals; their waveforms and spectrums are shown in Fig. 18 and Fig. 19, respectively. Fig. 18 shows that the waveform of the first separated factor has clear periodic features, and its basic component is a sine wave. The spectrum in Fig. 19 also shows that this factor has a major component at 1500 Hz. The waveform of the second separated factor has the typical feature of a sine wave with amplitude modulation, and its spectrum clearly shows components at 1800, 2000, and 2200 Hz. The waveform of the third separated factor has the less clear feature of a sine wave with random amplitude modulation, and its spectrum has 4 major components at 20, 870, 910, and 1090 Hz. Comparing the spectrums of the separated factors with the parameters of the experimental setup, it can be speculated that the first separated factor represents source 1 from Loudspeaker I, while the second and third separated factors represent source 2 and source 3 from Loudspeaker II and the motor, respectively. However, this inference is based only on the parameters of the experimental setup and is not yet a convincing and reliable source identification method.

Acoustical source identification and validation
To intelligently identify the sources from the separated factors, and to validate the effectiveness of NLFA for acoustical signals of real mechanical systems, each source signal is measured independently by its closest sensor while only that source is working with the given parameters. The waveforms of the independent sources from sensors 1, 4, and 6 for Loudspeaker I, Loudspeaker II, and the motor are shown in Fig. 20, and their corresponding spectrums are shown in Fig. 21.
A comparison of the source signals in Fig. 20 and Fig. 21 with the separated factors in Fig. 18 and Fig. 19 shows that each separated factor closely resembles its related source. The correlation matrix Ω shows that the correlation coefficients between the separated factors and their related sources are 0.92, 0.92 and 0.72, which indicates a high similarity between the separated factors and the related sources (Liu [32] obtained waveform correlation coefficients of 0.77 ± 0.03 for ECG signals with noises, and Farila [33] obtained waveform correlation coefficients of 0.70 ± 0.09 for non-stationary surface myoelectric signals). The correlation coefficients between the separated factors and unrelated sources are all no more than 0.13, which indicates that the 3 sources are well independent of each other. Therefore, a threshold can be set within (0.13, 0.72), normally 0.55-0.65, to intelligently identify the acoustical sources. Furthermore, the high correlation coefficients between the separated factors and the related sources validate the effectiveness of NLFA in acoustical source separation and identification.
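The threshold rule can be made concrete with the correlation values reported in this study. In the sketch below the matrix entries are those of Ω in Eq. (28). Treating rows as separated factors and columns as sources is an assumption, and the function name `identify_sources` is hypothetical.

```python
# Correlation matrix from Eq. (28); rows are assumed to be separated factors
# and columns the independently measured sources.
OMEGA = [
    [0.92, 0.01, 0.03],
    [0.13, 0.92, 0.07],
    [0.02, 0.01, 0.72],
]


def identify_sources(corr, threshold):
    """Map each factor index to the single source whose correlation exceeds the threshold."""
    matches = {}
    for factor, row in enumerate(corr):
        hits = [source for source, value in enumerate(row) if value > threshold]
        if len(hits) == 1:          # accept only unambiguous matches
            matches[factor] = hits[0]
    return matches


matches = identify_sources(OMEGA, threshold=0.6)
```

With a threshold of 0.6 every factor is matched to exactly one source. A threshold above 0.72 would leave the third factor unmatched, and one below 0.13 would make the second factor ambiguous, which is why the usable range is (0.13, 0.72).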

Conclusions
This paper presents the fundamental theory and key principles of nonlinear factor analysis, and validates the effectiveness of NLFA through numerical case studies and an experimental study on a test bed with shell structures. All the case studies indicate that the acoustical sources can be effectively separated and intelligently identified.
In the numerical case studies, the separation performance of NLFA is tested under different conditions. For cases with different numbers of hidden neurons, the separation performance of NLFA improves greatly while the number of hidden neurons is under 4 and changes very little above 4, which means that the optimal number of hidden neurons is 4. For cases with different numbers of mixed signals, the major waveform information of the sources is well separated only when the number of mixed signals is 4 or more, after which the separation performance of NLFA changes very little as the number of mixed signals increases, which means that the optimal number of mixed signals is 4 or 5. Generally, increasing the numbers of hidden neurons and mixed signals can improve the separation performance of NLFA. However, more observations and hidden neurons can not only introduce more noise and calculating errors, but also cost a lot of calculating time. Therefore, the numbers of hidden neurons and mixed signals should be kept as small as possible. In the experimental study on the test bed, all the correlation coefficients between the separated factors and their related sources are at least 0.72, which also indicates an effective acoustical source separation. If a threshold in the range 0.55-0.65 is set for the correlation coefficients, all the acoustical sources can be intelligently identified.
This work can benefit noise monitoring, reduction and control, and can provide pure source information for machinery condition monitoring and fault diagnosis.

Fig. 8. Waveform correlation coefficients

Effects by numbers of mixed signals
To test the effect of the number of mixed signals, the number of mixed signals is set from 2 to 10, and each mixed signal is generated from all the sources with a random mixing matrix. The mixed signals are then separated by NLFA, and the waveforms of the separated components under the different conditions are shown in Fig. 9-12.

Fig. 14. The structure of the test bed: a) End cover; b) Loudspeaker I; c) Left clapboard; d) Loudspeaker II; e) Shell; f) Motor; g) Right clapboard; h) Rubber springs; i) Supports

Fig. 15. The framework of data acquisition system

Fig. 18. Waveforms of separated factors
Fig. 19. Spectrums of separated factors

Comparing the waveforms and spectrums of the source signals with those of the separated factors in Fig. 18 and Fig. 19, the waveforms of the separated factors are very similar to those of the related sources: the first separated factor and source 1 have a clear sine wave of 1500 Hz, which agrees well with the experimental setup (1500 Hz) of Loudspeaker I; the second separated factor and source 2 have 3 major components at 1800, 2000, and 2200 Hz, corresponding to Loudspeaker II; the third separated factor and source 3 have a basic sine wave (20 Hz) with an uncertain amplitude modulation, which can be interpreted as an eccentric vibration of the motor, and the uncertain amplitude modulation (840, 910, 1090 Hz) can be caused by rubbing. Therefore, all the major components of the source information have been effectively separated from the mixed signals by NLFA.

Fig. 20. Waveforms of source signals
Fig. 21. Spectrums of source signals

To quantitatively and intelligently identify the sources from the separated factors, waveform correlation analysis is used to measure the similarity between the separated factors and the sources. Correlation analysis is carried out between all 3 separated factors and all 3 sources, and the correlation coefficients are listed in the correlation matrix $\Omega$:

$$\Omega = \begin{bmatrix} 0.92 & 0.01 & 0.03 \\ 0.13 & 0.92 & 0.07 \\ 0.02 & 0.01 & 0.72 \end{bmatrix} \tag{28}$$

Table 1. The test parameters of the data acquisition system