Multi-source and multi-fault condition monitoring based on parallel factor analysis and sequential probability ratio test

Yang, Liu; Chen, Hanxin; Ke, Yao; Li, Menglong; Huang, Lang; Miao, Yuzhuo

doi:10.1186/s13634-021-00730-w

Research
Open access
Published: 13 July 2021

Liu Yang¹,
Hanxin Chen^1,2,
Yao Ke¹,
Menglong Li¹,
Lang Huang¹ &
…
Yuzhuo Miao¹

EURASIP Journal on Advances in Signal Processing volume 2021, Article number: 37 (2021)

Abstract

Condition monitoring of mechanical equipment involves increasingly complex data, expanding from traditional time and frequency information to three-dimensional data spanning time, space, and frequency, and even to higher-dimensional data that include subjects and experimental conditions. For such high-dimensional data, traditional decomposition methods such as the Hilbert transform, the fast Fourier transform, and the Gabor transform not only lose the integrity of the data but also increase the computational load and introduce a large amount of redundant information. Feature coupling, aliasing, and redundancy among multi-source mechanical signals lead to inaccurate evaluation, diagnosis, and prediction of the operating status of industrial production. The analysis of a three-way tensor composed of channel, frequency, and time is called parallel factor analysis (PARAFAC). The relationship between the PARAFAC results and the input signals is studied through simulation experiments. PARAFAC is used to decompose the third-order channel-time-frequency tensor, obtained by continuous wavelet transformation of the vibration signals, into channel, time, and frequency characteristics. Multi-scale PARAFAC successfully extracts non-linear multi-dimensional dynamic fault characteristics by generating spatial, spectral, and temporal loading values and a three-dimensional expression of the fault characteristics. To verify the effectiveness of the spatial, spectral, and temporal loading values of the fault characteristic factors obtained for the centrifugal pump system, these characteristic factors are used as the test sequence of a sequential probability ratio test (SPRT) for identification and verification. The results indicate that the method proposed in this article improves the measurement accuracy and intelligence of mechanical fault detection.

1 Introduction

In recent years, equipment fault diagnosis, a new technology that spans many disciplines, has developed rapidly and produced large economic benefits [1,2,3,4,5,6]. The centrifugal pump is an important energy conversion and liquid transmission device in the process industry, and its working state directly affects the production of the entire plant. Even slight damage to the impeller shortens the running time of the centrifugal pump and disturbs the operation of the equipment. An impeller failure can damage pump components or cause personal injury, resulting in significant economic losses [7, 8]. A centrifugal pump vibrates both in normal operation and under fault conditions. The vibration signal carries rich information about the running state of the pump body and is easy to collect, so it can be used to monitor and diagnose the running state of the centrifugal pump. Fault diagnosis is generally divided into three steps: first, the relevant vibration signal of the diagnostic object is collected; then, the signal is analyzed and processed to obtain its characteristics; finally, pattern recognition and fault diagnosis are performed using the extracted features [9, 10]. The core task is to obtain effective features of the vibration signal. Because of the complex structure of the centrifugal pump, its many excitation sources, and their mutual interference, the vibration signal of the centrifugal pump is non-linear and non-stationary. Researchers have proposed various diagnostic methods to process the raw vibration signals of the centrifugal pump, extract effective information, and improve diagnostic accuracy. Wenjian Huang et al. [11] extracted characteristic parameters of the vibration signal through time-domain analysis, used PCA to reduce the amount of data, and finally used the principal component with the largest contribution rate as the input signal of the SPRT to evaluate the proposed algorithm. Reference [12] proposed an improved deep convolutional neural network (CNN) to identify defects in centrifugal pumps by using sound and image recognition. Azizi et al. [1] developed a feature extraction method based on empirical mode decomposition (EMD) to detect the severity of cavitation in a centrifugal pump. Liu Yang et al. [13] proposed a new method for big data analysis based on a particle swarm optimization wavelet neural network for gearbox diagnosis. Reference [8] applied variational mode decomposition (VMD) with different input parameters to the fault diagnosis of multi-stage pumps. In reference [14], signal processing combining EMD with fuzzy c-means clustering was used to monitor piston pump defects. Traditional decomposition methods for high-dimensional data not only lose the integrity of the data but also increase the amount of calculation and introduce redundancy [15,16,17,18]. Methods that extract time-frequency characteristics from single-channel signals, such as the Fourier transform, can neither reflect the non-linear relationships among the characteristic signals of multiple source channels nor eliminate information interference.

Identifying multi-source dynamic features of non-linear, multi-fault-mode machinery is a technical bottleneck in applying fault diagnosis to process industry production lines. It requires not only extracting the time-frequency characteristics of multi-source fault signals, but also preserving, after feature extraction, the correspondence between the non-linear variables and the multi-fault modes and multi-source fault features in time, frequency, and space. Parallel factor analysis, proposed by Carroll and Chang [19] and by Harshman [20] in 1970, is a three-dimensional or multi-dimensional signal processing algorithm that uses iterative least squares to decompose and identify multi-dimensional matrices. General time-frequency decomposition ignores the spatial information of the vibration signal and cannot handle multi-channel data [21,22,23]. Arranging the data as a three-way array indexed by channel, frequency, and time allows the application of a unique decomposition called parallel factor analysis (PARAFAC). Because the PARAFAC decomposition is unique, its model parameters can be obtained without ambiguity, which gives the PARAFAC model important application value. As a data processing algorithm, the PARAFAC model has been applied successfully in fluorescence spectroscopy, psychology, signal processing, food science, and other fields. Multi-channel electroencephalogram (EEG) data can usually be expressed as an M×N×P three-way data set whose modes correspond to channel (electrodes at different positions), time (data samples), and frequency components. Schmitz, S. applied PARAFAC to analyze the temporal and spatial patterns of functional connections between neurons, which were revealed in the sequence of peaks recorded in the cat’s main visual cortex (area 18) [24]. Miwakeichi, Fumikazu et al. [25] applied parallel factor analysis to decompose EEG data into space-time-frequency components during the resting state and mental arithmetic. Rost'akova compared non-negative Tucker decomposition with parallel factor analysis for identifying and measuring human brain electrical rhythms [26]. In reference [27], Choi, Ji Yeh proposed a new extension, functional PARAFAC, for processing three-dimensional response data arranged along a two-dimensional domain and one-dimensional parameters. Technically, this method combines PARAFAC with a basis function expansion approximation, and it was applied to EEG data to demonstrate its empirical usefulness. A parallel factor analysis study showed that a frontal lobe area with a higher frequency response is the main feature of the laser evoked potential in rats with chronic inflammatory pain [28].

Parallel factor decomposition has attracted great attention because it can process the constructed high-dimensional data as a whole, which not only retains the overall structural information of the data but also makes the structure more compact and easier to understand. In reference [29], parallel factor analysis was used as a diagnostic tool by decomposing the centrifugal pump diagnostic signal into time-frequency-space modes. Considering the difficulty of extracting fault features from rolling bearings under strong background noise, Yang Cheng [30] proposed a new method based on variational mode decomposition (VMD) and phase space parallel factor analysis to detect weak fault signals of rolling bearings. To overcome the inability to extract sparse and interpretable latent variables from batch data, reference [31] proposed a sparse model for batch three-way data arrays based on sparse parallel factor (SPARAFAC) decomposition. Sparse factor matrices are potentially easier to interpret because they eliminate redundant information and highlight significant correlations between variables. In chemistry, medicine, and food science, commonly used fluorescence excitation and emission data typically contain several chemical components at different concentrations. Fluorescence spectroscopy can generate a three-way data set with the modes “sample × excitation × emission.” The main purpose of analyzing this type of data is to determine which chemicals are present in each sample and their relative concentrations. Reference [32] compared parallel factor analysis (PARAFAC) and the support vector machine (SVM) for identifying and distinguishing the fluorescence spectra of coconut water brands. The results indicate that fluorescence spectroscopy combined with PARAFAC and SVM is a simple and rapid detection method for coconut water and other beverages. Reference [33] aims to determine whether the composition or distribution of humus in lake sediments can be characterized by chemometric analysis of spectral data. The method measures the three-dimensional excitation-emission matrix of the extracted humus and analyzes the spectral data using parallel factor analysis (PARAFAC) together with classification and regression trees (CART).

The sequential probability ratio test (SPRT), a branch of mathematical statistics, was proposed by Abraham Wald in 1947 to solve the problem of sampling inspection and acceptance of valuable military products. Based on the sample values obtained from each successive observation, the method provides approximate formulas for the critical values at which the null hypothesis H₀ or the alternative hypothesis H₁ is accepted, and also provides the average sample number and the power function of the test. In 1948, Abraham Wald and the American statistician Wolfowitz proved that, among all tests whose two error probabilities do not exceed α and β, respectively, the SPRT requires the smallest average number of samples. The SPRT is the most fundamental sequential test in the sequential analysis proposed by Abraham Wald, and it has since been widely developed in various fields. The SPRT can be applied to almost all hypothesis testing problems in mechanical fault diagnosis, such as signal detection, model change-point detection, life data analysis of centrifugal pumps, and crack detection in gearboxes. Reference [34] studied actual faults in a laboratory-scale distillation plant using an artificial neural network multi-layer perceptron (ANN-MLP) and the Wald SPRT. In reference [35], Guo Peng proposed wind turbine power curve modeling and monitoring based on a Gaussian process and the SPRT; the proposed method successfully identified two anemometer failures and pitch system failures. Reference [36] proposed a fault detection algorithm based on the SPRT and the chi-square test for redundant multi-sensor navigation systems of hypersonic cruise vehicles (HCV).
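As an illustration of the decision rule described above, the following minimal sketch runs Wald's SPRT on a sequence of scalar observations. It assumes Gaussian observations with known standard deviation and two simple hypotheses about the mean; the function name `sprt_gaussian` and its parameters are hypothetical and are not taken from the paper. The thresholds use Wald's standard approximations, log((1 − β)/α) and log(β/(1 − α)).

```python
import math

def sprt_gaussian(samples, mu0, mu1, sigma, alpha=0.05, beta=0.05):
    """Wald SPRT for H0: mean = mu0 against H1: mean = mu1 (known sigma).

    Returns the decision ("H0", "H1", or "undecided") and the number of
    samples consumed. Illustrative sketch, not the authors' implementation.
    """
    upper = math.log((1 - beta) / alpha)   # accept H1 at or above this value
    lower = math.log(beta / (1 - alpha))   # accept H0 at or below this value
    llr = 0.0                              # cumulative log-likelihood ratio
    for n, x in enumerate(samples, start=1):
        # Log-likelihood ratio increment of one Gaussian observation
        llr += (mu1 - mu0) * (x - 0.5 * (mu0 + mu1)) / sigma**2
        if llr >= upper:
            return "H1", n
        if llr <= lower:
            return "H0", n
    return "undecided", len(samples)

print(sprt_gaussian([1.0] * 10, mu0=0.0, mu1=1.0, sigma=1.0))  # ('H1', 6)
```

The test stops as soon as one threshold is crossed, which is why the number of samples used is itself an output of the procedure.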

The rest of this paper is organized as follows. Section 2 describes the parallel factorization model and its simulation. A multi-scale parallel factorization optimization algorithm is established for extracting non-linear multi-source fault characteristic signals, the characteristic factor signals are obtained from the matrix factors, and the “loading” factor and “component” factor are defined. Section 3 studies multi-channel data decision theory based on the SPRT and establishes an adaptive optimization diagnosis method for tracking and identifying the non-linear multi-dimensional dynamic optimal characteristic signal. To examine the validity of multi-scale parallel factor analysis and the SPRT for multi-channel signals in actual complex industrial production, the centrifugal pump fault diagnosis experimental system is designed and implemented in Section 4. Multi-source dynamic feature extraction based on parallel factorization and the SPRT for multi-source condition monitoring of the centrifugal pump is then presented in Section 5. Finally, the conclusions are drawn in Section 6.

2 The model and simulation

2.1 Parallel factorization model

In a two-dimensional matrix, x_{i, j} denotes an element (i indexes the row and j indexes the column). Similarly, in a three-dimensional matrix, x_{i, j, k} denotes an element. There is currently no established name for the subscript k, so we call it the “page” [37]. An element of a three-dimensional matrix is therefore addressed by three index values: row, column, and page. The left picture in Fig. 1 shows a three-dimensional matrix and the right picture shows its sub-matrices. When one dimension of a three-dimensional matrix is fixed, the resulting sub-matrix is called a slice of the three-dimensional matrix along that dimension.

Expanding (unfolding) a three-dimensional matrix means rearranging its slices into a new two-dimensional matrix. For example, keep the rows and columns of a three-dimensional matrix and place its pages side by side to form a new two-dimensional matrix. The number of rows then equals the number of rows I of the original matrix, and the number of columns changes from J to J × K; the result is denoted X^{I × JK} and is given by Eq. (1).

$$ {X}^{I\times JK}=\left[{X}_{k=1},{X}_{k=2}\cdots {X}_{k=K}\right] $$

(1)

It can also be expanded by stacking the pages vertically, which gives X^{IK × J}, defined in formula (2).

$$ {X}^{IK\times J}=\left[\begin{array}{c}{X}_{k=1}\\ {}{X}_{k=2}\\ {}\vdots \\ {}{X}_{k=K}\end{array}\right] $$

(2)

After this expansion, the new matrix has I × K rows and J columns.
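The two expansions in Eqs. (1) and (2) can be sketched in NumPy as follows. The array sizes and the helper names `unfold_side_by_side` and `unfold_stacked` are illustrative assumptions; the page index k is taken to be the last array axis.

```python
import numpy as np

def unfold_side_by_side(X):
    """Eq. (1): place the K pages X[:, :, k] next to each other -> I x JK."""
    return np.concatenate([X[:, :, k] for k in range(X.shape[2])], axis=1)

def unfold_stacked(X):
    """Eq. (2): stack the K pages X[:, :, k] vertically -> IK x J."""
    return np.concatenate([X[:, :, k] for k in range(X.shape[2])], axis=0)

I, J, K = 2, 3, 4                       # small example sizes (assumed)
X = np.arange(I * J * K).reshape(I, J, K)

X_I_JK = unfold_side_by_side(X)
X_IK_J = unfold_stacked(X)
print(X_I_JK.shape, X_IK_J.shape)       # (2, 12) (8, 3)
```

Both results contain exactly the same entries as X; only the arrangement of the pages differs.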

The symbol x_{i, j, k} denotes the element in the i-th row, j-th column, and k-th page of a three-dimensional matrix X, where the indices i, j, and k are natural numbers; in the parallel factor model this element is written as $ {x}_{i,j,k}=\sum \limits_{f=1}^F{a}_{i,f}{b}_{j,f}{c}_{k,f} $. In this sense, the low-rank decomposition of a two-dimensional matrix can be generalized to a three-dimensional matrix. Let X ∈ C^{I × J × K} be a three-dimensional matrix with elements x_ijk. The matrix X can then be written as a sum of vector outer products, as shown in formula (3).

$$ X={a}_1\circ {b}_1\circ {c}_1+\cdots +{a}_R\circ {b}_R\circ {c}_R=\sum \limits_{r=1}^R{a}_r\circ {b}_r\circ {c}_r $$

(3)

Formula (3) gives the low-rank decomposition of a three-dimensional matrix, where R is the rank of the three-dimensional matrix X and a_r ∈ C^I, b_r ∈ C^J, c_r ∈ C^K, r = 1, ..., R. Harshman named the low-rank decomposition model of formula (3) the general parallel factor model. The general form of the parallel factorization model is illustrated in Fig. 2, in which X is drawn as a cube.

Collect these vectors into the three two-dimensional matrices A = [a₁, …, a_R], B = [b₁, …, b_R], and C = [c₁, …, c_R], which we define as the three loading matrices of the general parallel factorization model. Equation (4) gives the scalar form of the model, where a_ir = [A]_{i, r}, b_jr = [B]_{j, r}, c_kr = [C]_{k, r}, and $ {\underline{x}}_{ijk}={\left[\underline{X}\right]}_{i,j,k} $.

$$ {\underline{x}}_{ijk}=\sum \limits_{r=1}^R{a}_{ir}{b}_{jr}{c}_{kr} $$

(4)

The general form of the parallel factorization model can be viewed as the extension of the low-rank decomposition of a two-dimensional matrix to a three-dimensional matrix. Formula (4) indicates that each element $ {\underline{x}}_{ijk} $ of the three-dimensional matrix X is the sum of R products of the elements a_ir, b_jr, and c_kr. Compared with the matrix elements x_ij in PCA, $ {\underline{x}}_{ijk} $ has three independently varying dimensions, called “mode A,” “mode B,” and “mode C.”
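The scalar form in Eq. (4) maps directly onto a single `numpy.einsum` call. The sketch below builds a tensor from three loading matrices and checks one element against the explicit sum; the function name `cp_to_tensor` and the random test sizes are assumptions made for illustration.

```python
import numpy as np

def cp_to_tensor(A, B, C):
    """Build X with x_ijk = sum_r a_ir * b_jr * c_kr, as in Eq. (4)."""
    return np.einsum("ir,jr,kr->ijk", A, B, C)

rng = np.random.default_rng(0)
I, J, K, R = 2, 3, 4, 2                 # illustrative sizes (assumed)
A = rng.standard_normal((I, R))
B = rng.standard_normal((J, R))
C = rng.standard_normal((K, R))

X = cp_to_tensor(A, B, C)

# One element computed by hand from the definition
i, j, k = 1, 2, 3
x_manual = sum(A[i, r] * B[j, r] * C[k, r] for r in range(R))
print(np.isclose(X[i, j, k], x_manual))  # True
```

The same tensor is obtained by summing the R outer products a_r ∘ b_r ∘ c_r of formula (3), which is what the `einsum` subscripts express.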

2.2 Matrix essential equalization

Let A ∈ C^{I × J} be a matrix. If a matrix $ \overline{A} $ satisfies $ \overline{A}=A\Pi \Delta $, then A and $ \overline{A} $ are said to be essentially equal, denoted $ A\cong \overline{A} $. Here ∏ is a column exchange matrix and △ is a diagonal scale matrix. Each row and each column of the column exchange matrix ∏ contains exactly one non-zero element, equal to 1. Multiplying by ∏ rearranges the column vectors of A in the order given by ∏ without changing the values of their elements. The diagonal scale matrix △ is a J×J diagonal matrix with non-zero diagonal elements; multiplying by △ scales each column of A by a non-zero factor.

According to the concept of essential equality, take a two-dimensional matrix X = AB^T as an example, where A ∈ C^{M × F} and B ∈ C^{N × F}. For any matrices $ \overline{A}\in {C}^{M\times F},\overline{B}\in {C}^{N\times F} $, if $ X={AB}^T=\overline{A}{\overline{B}}^T $ is satisfied, then formula (5) holds.

$$ \overline{A}=A{\prod}_1{\Delta}_1,\overline{B}=B{\prod}_2{\Delta}_2 $$

(5)

∏₁ and ∏₂ in the formula are column exchange matrices, which rearrange the columns of the matrices A and B. △₁ and △₂ are diagonal scale matrices, which multiply each column of A and B by a non-zero coefficient. When formula (5) holds, the matrix decomposition is said to be unique.

When the matrix factorization is unique, the matrices $ \overline{A},\overline{B} $ obtained by the factorization are not exactly equal to the original matrices A and B; they are only essentially equal to them. The essential equality relations for A and B are shown in formulas (6) and (7).

$$ \overline{A}=A{\prod}_A{\Delta}_A\cong A $$

(6)

$$ \overline{B}=B{\prod}_B{\Delta}_B\cong B $$

(7)

Because of the column exchange matrices ∏_A, ∏_B and the diagonal scale matrices △_A, △_B, the order and magnitude of the column vectors in $ \overline{A},\overline{B} $ can differ from those of A and B. In matrix theory, these effects are called column blur and scale blur, and they are represented by the column exchange matrix ∏ and the diagonal scale matrix △, respectively. If no structural constraint is imposed on A and B during matrix decomposition, column blur and scale blur are unavoidable. This can be explained with the vector form of the matrix decomposition. Write A = [a₁, ⋯, a_F] and B = [b₁, ⋯, b_F], where a_f ∈ C^{I × 1} and b_f ∈ C^{J × 1} (f = 1, ..., F) are the column vectors of A and B, respectively. Then X = AB^T can be expressed in the following vector form.

$$ X={AB}^T={\mathrm{a}}_1{\mathrm{b}}_1^T+{\mathrm{a}}_2{\mathrm{b}}_2^T+\cdots +{\mathrm{a}}_F{\mathrm{b}}_F^T $$

(8)

In formula (8), $ {\mathrm{a}}_1{\mathrm{b}}_1^T,\cdots, {\mathrm{a}}_F{\mathrm{b}}_F^T $ are F matrices of rank 1. If the order of these terms is changed arbitrarily, the value of the matrix X is unchanged. Similarly, if the vector a_f is multiplied by a non-zero coefficient λ_f and the corresponding b_f is multiplied by 1/λ_f, the value of X does not change either. Exchanging the order of $ {\mathrm{a}}_1{\mathrm{b}}_1^T $ and $ {\mathrm{a}}_2{\mathrm{b}}_2^T $ in formula (8) and applying such scaling, it can be rewritten in the following form:

$$ {\displaystyle \begin{array}{c}X={AB}^T={\mathrm{a}}_1{\mathrm{b}}_1^T+{\mathrm{a}}_2{\mathrm{b}}_2^T+\cdots +{\mathrm{a}}_F{\mathrm{b}}_F^T\\ {}={\lambda}_2{a}_2\frac{1}{\lambda_2}{b}_2^T+{\lambda}_1{a}_1\frac{1}{\lambda_1}{b}_1^T+\cdots +{\lambda}_F{a}_F\frac{1}{\lambda_F}{b}_F^T\\ {}=A{\prod}_A{\Delta}_A{\left(B{\prod}_B{\Delta}_B\right)}^T\\ {}=\overline{A}{\overline{B}}^T\end{array}} $$

(9)

Thus $ X={AB}^T=\overline{A}{\overline{B}}^T $ is obtained, where $ \overline{A}=A{\prod}_A{\Delta}_A $ and $ \overline{B}=B{\prod}_B{\Delta}_B $.

It can be seen that there are always matrices $ \overline{A} $ and $ \overline{B} $ that achieve the matrix decomposition, but they are only essentially equal to A and B. Therefore, column blur and scale blur are inherent ambiguities of the matrix decomposition process. Without additional constraints, the order and magnitude of the columns of the loading matrices cannot be determined through matrix decomposition. The uniqueness of matrix factorization given by this definition can therefore also be called “essential uniqueness.” In practical applications, some methods can be used to eliminate the column blur and scale blur caused by matrix decomposition.
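The column and scale blur of Eq. (9) can be checked numerically. The sketch below, with arbitrarily chosen sizes, permutation, and scale factors (all assumptions), builds $ \overline{A} $ and $ \overline{B} $ from a shared column exchange matrix and reciprocal diagonal scale matrices and confirms that the product is unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3))          # I x F loading matrix (example sizes)
B = rng.standard_normal((5, 3))          # J x F loading matrix

# Column exchange matrix: swap the first two columns
perm = np.eye(3)[:, [1, 0, 2]]
# Diagonal scale matrices with reciprocal entries (lambda_f and 1/lambda_f)
scale_a = np.diag([2.0, -0.5, 3.0])
scale_b = np.linalg.inv(scale_a)

A_bar = A @ perm @ scale_a
B_bar = B @ perm @ scale_b

# The factors differ from A and B, yet the product X is identical
print(np.allclose(A @ B.T, A_bar @ B_bar.T))   # True
print(np.allclose(A, A_bar))                   # False
```

This is why a decomposition can at best be recovered up to column order and column scaling unless extra constraints are imposed.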

2.3 Recognizability and uniqueness of parallel factorization

The essential feature of the parallel factorization model is the uniqueness of the model: apart from column blur and scale blur, the matrices A, B, and C can be identified. The following conclusion holds. Given X_i = BD_i(A)C^T, i = 1, 2, ..., I, where D_i(A) is the diagonal matrix formed from the i-th row of A, and A ∈ C^{I × F}, B ∈ C^{J × F}, C ∈ C^{K × F}, if k_A + k_B + k_C ≥ 2F + 2, where k_A, k_B, and k_C are the k-ranks of A, B, and C, then the matrices A, B, and C are unique up to column exchange and (complex) scale transformation.

A matrix whose columns are drawn independently from an absolutely continuous distribution has full k-rank. If all three matrices meet this condition, the sufficient condition for identifiability is given by formula (10).

$$ \min \left(I,F\right)+\min \left(J,F\right)+\min \left(K,F\right)\ge 2F+2 $$

(10)
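The k-rank condition can be checked by brute force for small matrices. The sketch below assumes the usual definition of the k-rank (the largest k such that every set of k columns is linearly independent); the function names `k_rank` and `kruskal_condition` are hypothetical, and the exhaustive search over column subsets is only practical for a small number of factors.

```python
from itertools import combinations
import numpy as np

def k_rank(M, tol=1e-10):
    """Largest k such that every set of k columns of M is linearly independent."""
    n_cols = M.shape[1]
    for k in range(1, n_cols + 1):
        for cols in combinations(range(n_cols), k):
            if np.linalg.matrix_rank(M[:, list(cols)], tol=tol) < k:
                return k - 1
    return n_cols

def kruskal_condition(A, B, C):
    """Check k_A + k_B + k_C >= 2F + 2 for loading matrices with F columns."""
    F = A.shape[1]
    return k_rank(A) + k_rank(B) + k_rank(C) >= 2 * F + 2

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3))
B = rng.standard_normal((5, 3))
C = rng.standard_normal((6, 3))
print(kruskal_condition(A, B, C))
```

For random matrices such as these, each k-rank equals min(rows, F), so the check reduces to formula (10).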

If the matrices A, B, and C satisfy other structural constraints, better identifiability results may be obtained. The PARAFAC uniqueness theorem is stated in terms of the i-th slice of the PARAFAC model along its first dimension:

$$ {X}_i^{J\times K}={BD}_i(A){C}^T,i=1,\dots, I $$

(11)

The matrices in formula (11) satisfy A ∈ R^{I × R}, B ∈ R^{J × R}, C ∈ R^{K × R}. Suppose that the condition in Eq. (12) is met:

$$ {k}_A+{k}_B+{k}_C\ge 2\left(R+1\right) $$

(12)

Then the matrices A, B, and C are unique up to column blur and scale blur. In mathematical terms, when any other matrices $ \hat{A},\hat{B},\hat{C} $ satisfy formula (13),

$$ {X}_i^{J\times K}=\hat{B}{D}_i\left(\hat{A}\right){\hat{C}}^T,i=1,\dots, I $$

(13)

the relation shown in formula (14) is obtained.

$$ \hat{A}=A\prod {\Delta}_1,\hat{B}=B\prod {\Delta}_2,\hat{C}=C\prod {\Delta}_3 $$

(14)

In Eq. (14), ∏ is a column exchange matrix, and Δ₁, Δ₂, and Δ₃ are diagonal scale matrices that must satisfy Δ₁Δ₂Δ₃ = I.

2.4 Trilinear alternating least square for parallel factor analysis

There are many methods for computing the PARAFAC decomposition; the trilinear alternating least squares (TALS) algorithm is the most widely adopted method for fitting parallel factor trilinear models. The basic principle of TALS is to update one loading matrix in each step. First, one matrix is updated by least squares (LS), with the other two held at their previous estimates; then the other matrices are updated in the same way; finally, these steps are repeated until the result converges or the preset number of iterations is reached. The trilinear model of the three-dimensional data set X has the form shown in formula (15).

$$ {x}_{i,j,k}=\sum \limits_{f=1}^F{a}_{i,f}{b}_{j,f}{c}_{k,f}+{e}_{ijk},i=1\dots I;j=1\dots J;k=1\dots K $$

(15)

Here F is the number of factors, a_i,f is the i-th element of vector a_f, b_j,f is the j-th element of vector b_f, and c_k,f is the k-th element of vector c_f. The x_i,j,k are the elements of the third-order data tensor X of size I × J × K, and the e_ijk are the elements of the error tensor E of the same size. A = [a₁, a₂, ⋯, a_F] is an I × F matrix, and B = [b₁, b₂, ⋯, b_F] and C = [c₁, c₂, ⋯, c_F] are J × F and K × F matrices, respectively.

(1) First, matrix A is calculated from formula (16).

$$ \left[\begin{array}{c}{X}_{\dots 1}\\ {}{X}_{\dots 2}\\ {}\vdots \\ {}{X}_{\dots K}\end{array}\right]=\left[\begin{array}{c}B\,\mathrm{diag}\left(C\left(1,:\right)\right)\\ {}B\,\mathrm{diag}\left(C\left(2,:\right)\right)\\ {}\vdots \\ {}B\,\mathrm{diag}\left(C\left(K,:\right)\right)\end{array}\right]{A}^T+{E}_K $$

(16)

In formula (16), X_{…k} = B diag(C(k,:)) A^T + E_{…k}, k = 1, 2, ⋯, K, and the error is denoted E_K. The least squares (LS) estimate of A^T is calculated by Eq. (17).

$$ {A}^T={\left[\begin{array}{c}B\,\mathrm{diag}\left(C\left(1,:\right)\right)\\ {}B\,\mathrm{diag}\left(C\left(2,:\right)\right)\\ {}\vdots \\ {}B\,\mathrm{diag}\left(C\left(K,:\right)\right)\end{array}\right]}^{+}\left[\begin{array}{c}{X}_{\dots 1}\\ {}{X}_{\dots 2}\\ {}\vdots \\ {}{X}_{\dots K}\end{array}\right] $$

(17)

In formula (17), [·]⁺ denotes the generalized inverse.

(2) Secondly, matrix B is calculated from formula (18).

$$ \left[\begin{array}{c}{Y}_{\dots 1}\\ {}{Y}_{\dots 2}\\ {}\vdots \\ {}{Y}_{\dots I}\end{array}\right]=\left[\begin{array}{c}C\,\mathrm{diag}\left(A\left(1,:\right)\right)\\ {}C\,\mathrm{diag}\left(A\left(2,:\right)\right)\\ {}\vdots \\ {}C\,\mathrm{diag}\left(A\left(I,:\right)\right)\end{array}\right]{B}^T+{E}_I $$

(18)

In formula (18), Y_{…i} = C diag(A(i,:)) B^T + E_{…i}, i = 1, 2, ⋯, I, and the error is denoted E_I. The least squares (LS) estimate of B^T is calculated by Eq. (19).

$$ {B}^T={\left[\begin{array}{c}C\,\mathrm{diag}\left(A\left(1,:\right)\right)\\ {}C\,\mathrm{diag}\left(A\left(2,:\right)\right)\\ {}\vdots \\ {}C\,\mathrm{diag}\left(A\left(I,:\right)\right)\end{array}\right]}^{+}\left[\begin{array}{c}{Y}_{\dots 1}\\ {}{Y}_{\dots 2}\\ {}\vdots \\ {}{Y}_{\dots I}\end{array}\right] $$

(19)

(3) Thirdly, matrix C is calculated from formula (20).

$$ \left[\begin{array}{c}{Z}_{\dots 1}\\ {}{Z}_{\dots 2}\\ {}\vdots \\ {}{Z}_{\dots J}\end{array}\right]=\left[\begin{array}{c}A\,\mathrm{diag}\left(B\left(1,:\right)\right)\\ {}A\,\mathrm{diag}\left(B\left(2,:\right)\right)\\ {}\vdots \\ {}A\,\mathrm{diag}\left(B\left(J,:\right)\right)\end{array}\right]{C}^T+{E}_J $$

(20)

In formula (20), Z_{…j} = A diag(B(j,:)) C^T + E_{…j}, j = 1, 2, ⋯, J, and the error is denoted E_J. The least squares estimate of C^T is calculated by Eq. (21).

$$ {C}^T={\left[\begin{array}{c}A\,\mathrm{diag}\left(B\left(1,:\right)\right)\\ {}A\,\mathrm{diag}\left(B\left(2,:\right)\right)\\ {}\vdots \\ {}A\,\mathrm{diag}\left(B\left(J,:\right)\right)\end{array}\right]}^{+}\left[\begin{array}{c}{Z}_{\dots 1}\\ {}{Z}_{\dots 2}\\ {}\vdots \\ {}{Z}_{\dots J}\end{array}\right] $$

(21)

(4) Finally, steps (1)–(3) are repeated until the result converges or the preset number of iterations is reached.
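Steps (1)–(4) can be sketched in NumPy as follows. This is a minimal illustration under several assumptions that are not stated in the paper: random initialization of B and C, `numpy.linalg.pinv` as the generalized inverse [·]⁺, and a stopping rule based on the change of the reconstruction error. The names `khatri_rao` and `parafac_als` are hypothetical.

```python
import numpy as np

def khatri_rao(P, Q):
    """Stack Q @ diag(P[p, :]) over p: row (p, q) holds P[p, r] * Q[q, r]."""
    return np.einsum("pr,qr->pqr", P, Q).reshape(P.shape[0] * Q.shape[0], -1)

def parafac_als(X, F, n_iter=500, tol=1e-10, seed=0):
    """Trilinear alternating least squares for an I x J x K array X (n_iter >= 1)."""
    I, J, K = X.shape
    rng = np.random.default_rng(seed)
    B = rng.standard_normal((J, F))               # random start (assumption)
    C = rng.standard_normal((K, F))
    # Stacked slices on the left-hand sides of Eqs. (16), (18), and (20)
    X_a = X.transpose(2, 1, 0).reshape(K * J, I)  # K slices of size J x I
    X_b = X.transpose(0, 2, 1).reshape(I * K, J)  # I slices of size K x J
    X_c = X.transpose(1, 0, 2).reshape(J * I, K)  # J slices of size I x K
    prev_err = np.inf
    for _ in range(n_iter):
        A = (np.linalg.pinv(khatri_rao(C, B)) @ X_a).T   # Eq. (17)
        B = (np.linalg.pinv(khatri_rao(A, C)) @ X_b).T   # Eq. (19)
        C = (np.linalg.pinv(khatri_rao(B, A)) @ X_c).T   # Eq. (21)
        err = np.linalg.norm(X - np.einsum("ir,jr,kr->ijk", A, B, C))
        if abs(prev_err - err) <= tol * max(err, 1.0):   # assumed stopping rule
            break
        prev_err = err
    return A, B, C, err

# Usage on a synthetic noise-free tensor with two factors
rng = np.random.default_rng(1)
A0, B0, C0 = (rng.standard_normal((n, 2)) for n in (4, 5, 6))
X = np.einsum("ir,jr,kr->ijk", A0, B0, C0)
A, B, C, err = parafac_als(X, F=2)
print(err / np.linalg.norm(X))   # relative reconstruction error
```

The recovered A, B, and C generally differ from A0, B0, and C0 by a column permutation and column scaling, in line with Section 2.3, even when the reconstruction error is negligible.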

Multi-channel vibration signals are collected in this paper to research the fault state of equipment, and a third-order tensor is constructed through continuous wavelet transform. Figure 3 shows the basic structure of the parallel factor analysis decomposition model for fault diagnosis.

N_d, N_f, and N_t of the data tensor $ {\mathrm{S}}_{\left({\mathrm{N}}_{\mathrm{d}}\times {\mathrm{N}}_{\mathrm{f}}\times {\mathrm{N}}_{\mathrm{t}}\right)} $ are the number of channels, the number of frequency steps, and the number of time points, respectively.

$$ {\hat{S}}_{dft}=\sum \limits_{k=1}^{N_k}{a}_{dk}{b}_{fk}{c}_{tk}+{e}_{dft} $$

(22)

The key issue of this parallel factorization model is to obtain the matrices A, B, and C, whose elements are a_dk, b_fk, and c_tk; each component k is called an atom. The spatial, spectral, and temporal signatures of atom k are a_k = {a_dk}, b_k = {b_fk}, and c_k = {c_tk}. The term e_dft is the error, which forms the error tensor E of size N_d × N_f × N_t. The uniqueness of the solution of the parallel factor decomposition for fault diagnosis is guaranteed when rank(A) + rank(B) + rank(C) ≥ 2N_k + 2. The decomposition of formula (22) is obtained by solving $ \underset{a_{dk},{b}_{fk},{c}_{tk}}{\min}\left\Vert {\hat{S}}_{dft}-{\sum}_{k=1}^{N_k}{a}_{dk}{b}_{fk}{c}_{tk}\right\Vert $. The main advantage of this method is that the spectral decomposition of the time-varying vibration signal is unique and that the best model in the least squares sense is obtained.

The basic steps for implementing the multi-scale parallel factorization model of fault diagnosis are as follows.

(1) After the vibration signal is collected, the third-order tensor is constructed by continuous wavelet transform.
(2) The number of factors F is determined by the principle of consistency in MATLAB.
(3) The loading matrices B and C are initialized.
(4) With B and C initialized, the matrix A is estimated by least squares regression: A = XZ′(ZZ′)⁻¹, Z = (b ⊗ c).
(5) Similarly, the matrices B and C are estimated.
(6) Steps (4) and (5) are repeated until the result converges or the preset number of iterations is reached.
(7) The corresponding results are obtained.
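Step (1) of the list above can be illustrated with a small NumPy sketch that turns multi-channel signals into a channel × frequency × time tensor. The paper does not specify the wavelet or its parameters, so the complex Morlet wavelet, the `n_cycles` setting, the sampling rate, and the function names `morlet_cwt` and `build_tensor` are all assumptions made for this example.

```python
import numpy as np

def morlet_cwt(signal, freqs, fs, n_cycles=6.0):
    """Magnitude of a complex Morlet wavelet transform, shape (N_f, N_t)."""
    n = signal.size
    out = np.empty((len(freqs), n))
    for row, f in enumerate(freqs):
        sigma_t = n_cycles / (2 * np.pi * f)          # envelope width in seconds
        half = int(np.ceil(4 * sigma_t * fs))
        if 2 * half + 1 > n:
            raise ValueError("signal too short for the wavelet at %g Hz" % f)
        tau = np.arange(-half, half + 1) / fs
        wavelet = np.exp(2j * np.pi * f * tau) * np.exp(-0.5 * (tau / sigma_t) ** 2)
        wavelet /= np.abs(wavelet).sum()              # comparable gain across rows
        out[row] = np.abs(np.convolve(signal, wavelet, mode="same"))
    return out

def build_tensor(channels, freqs, fs):
    """Stack per-channel scalograms into an N_d x N_f x N_t array."""
    return np.stack([morlet_cwt(ch, freqs, fs) for ch in channels], axis=0)

fs = 5000.0                                           # assumed sampling rate in Hz
t = np.arange(2000) / fs
tone = np.cos(2 * np.pi * 400 * t)                    # 400 Hz test tone
channels = np.vstack([tone, 0.5 * tone])              # two synthetic channels
freqs = np.array([200.0, 300.0, 400.0, 500.0, 600.0])

S = build_tensor(channels, freqs, fs)
print(S.shape)                                        # (2, 5, 2000)
```

The resulting array has the N_d × N_f × N_t layout of the data tensor S and can be passed to a PARAFAC routine.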

2.5 Numerical simulation based on parallel factor analysis

Simulation experiments can reveal how input signals with different parameters affect the results of parallel factor analysis for fault diagnosis. Therefore, simulated signals imitating the running state of the centrifugal pump are used to test the method proposed in this article. The simulation signal is given by formula (23).

$$ y(t)=0.01\cos \left(2\pi \cdot 400t-5\right)\exp \left(-0.5{\left(\frac{t-0.03}{0.003}\right)}^2\right) $$

(23)
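Formula (23) can be evaluated directly. In the sketch below the sampling rate and the signal duration are assumptions chosen for illustration, and t is taken to be in seconds so that the cosine term corresponds to 400 Hz.

```python
import numpy as np

def simulated_impact(t):
    """Formula (23): 400 Hz cosine under a Gaussian envelope centered at t = 0.03."""
    return (0.01 * np.cos(2 * np.pi * 400 * t - 5)
            * np.exp(-0.5 * ((t - 0.03) / 0.003) ** 2))

fs = 10_000                        # assumed sampling rate in Hz
t = np.arange(600) / fs            # 0.06 s of signal (assumed duration)
y = simulated_impact(t)

t_peak = t[np.argmax(np.abs(y))]   # time of the largest excursion
print(t_peak)
```

The Gaussian envelope has a width parameter of 0.003, so almost all of the signal energy lies within a few milliseconds of t = 0.03.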

Figure 4 shows the time-domain diagram of the simulated signal. An impact signal appears at t = 0.03 s, the center of the envelope in formula (23), which simulates the signal generated when the system fails. Figure 5 shows the time-frequency diagram of the simulation signal obtained by the continuous wavelet transform. The time-frequency diagram shows that the dominant frequency of the signal is 400 Hz and that the impact signal appears in the frequency range of 180–400 Hz. It also shows that the impact signal appears at 0.03 s.

The simulation signal is transformed by continuous wavelet transformation and arranged into a third-order tensor, which is then decomposed by parallel factor analysis to obtain the result shown in Fig. 6. The decomposition yields the loading values and the residual error corresponding to frequency, time, and channel. Comparing the frequency and time loading values with the time-frequency diagram reveals their correspondence. The frequency loading curve fluctuates in the range of 180–400 Hz, which corresponds to the fluctuations in the time-frequency diagram. The time loading curve fluctuates at 0.03 s, where the third component reaches its maximum and the second component its minimum. The corresponding simulation signal also has an impact at 0.03 s. This indicates that parallel factor analysis of high-dimensional data with multi-source feature factors can detect the characteristics of the shock signal generated by the simulated fault.

The vibration signals collected in engineering practice are generally mixed with various noise signals. To check the effectiveness of the method under such conditions, noise is added to the original simulation signal and parallel factor analysis is performed on the result. Figures 7 and 8 show, respectively, the time-domain diagram of the simulation signal after noise is added and the time-frequency diagram obtained through continuous wavelet transform. The overall waveform of the noisy simulation signal is similar to that of the original simulation signal in Fig. 4, but the impact signal is almost covered by the noise. In the time-frequency diagram of the noisy signal in Fig. 8, the variations are steeper and more rapid, and a larger blurred region appears at 10–20 Hz.

The simulation signal with noise is transformed into a third-order tensor after continuous wavelet transformation. The result of the parallel factor analysis of the third-order tensor is shown in Fig. 9. The analysis yields the loading values and residual parameters corresponding to frequency, time, and channel. Comparing the frequency and time loading values with the time-frequency diagram reveals the correspondence between them. The frequency loading curve fluctuates in the range of 180–400 Hz, which corresponds to the fluctuations in the time-frequency diagram. The time loading curve fluctuates at 0.03 s, where it reaches its maximum value. The normal probability plot and the residual variance corresponding to the data in mode 1, mode 2, and mode 3 are also obtained. This shows that the parallel factor analysis proposed in this paper can detect the characteristics of the impact signal well even when the collected signal contains noise.

3 Proposed method

The likelihood function is a function of the parameters of a statistical model and plays an important role in statistical inference. The general method of using likelihood ratio test statistics was proposed by Neyman and Pearson in 1928 [38]. Its basic idea is similar to that of the maximum likelihood method in parameter estimation theory, and the resulting procedure is called the likelihood ratio test. Consider the null hypothesis H₀ : θ = θ₀ and the alternative hypothesis H₁ : θ = θ₁, where x = (x₁, …, x_n) is a set of random variables. When H₀ is true, the probability density function of the random variable x is f(x, θ₀); when H₁ is true, it is f(x, θ₁). The likelihood function of the sample is given by formula (24).

$$ L\left(\theta \right)=\prod \limits_{i=1}^nf\left({x}_i,\theta \right) $$

(24)

Therefore, the likelihood ratio test is performed to obtain the statistic L in the following formula (25).

$$ L=\frac{L\left({\theta}_1\right)}{L\left({\theta}_0\right)}=\frac{\prod \limits_{i=1}^nf\left({x}_i,{\theta}_1\right)}{\prod \limits_{i=1}^nf\left({x}_i,{\theta}_0\right)} $$

(25)

The larger the likelihood ratio L, the more likely it is that the parameter θ equals θ₁, so the result tends to reject H₀. Conversely, the smaller the ratio, the more likely it is that θ equals θ₀, so the result tends to accept H₀. For a given threshold k, the test function φ(x) is defined as shown in formula (26).

$$ \varphi (x)=\left\{\begin{array}{cc}1& L>k\\ {}0& L\le k\end{array}\right. $$

(26)

Test φ(x) is called the likelihood ratio test of the above test problem.
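
Formulas (24)–(26) can be illustrated with a small numerical sketch. The Gaussian density with known standard deviation, the function names, and the sample values are assumptions made for this example only; any density f(x, θ) could be substituted.

```python
import math


def gaussian_pdf(x, mu, sigma):
    """Density f(x, theta) with theta = mu; a Gaussian model is assumed here."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def likelihood_ratio(samples, theta0, theta1, sigma=1.0):
    """Formula (25): L = prod f(x_i, theta1) / prod f(x_i, theta0)."""
    # Summing logarithms avoids underflow when the sample is long.
    log_ratio = sum(
        math.log(gaussian_pdf(x, theta1, sigma)) - math.log(gaussian_pdf(x, theta0, sigma))
        for x in samples
    )
    return math.exp(log_ratio)


def lr_test(samples, theta0, theta1, k, sigma=1.0):
    """Formula (26): returns 1 (reject H0) if L > k, otherwise 0 (accept H0)."""
    return 1 if likelihood_ratio(samples, theta0, theta1, sigma) > k else 0
```

For instance, with θ₀ = 0, θ₁ = 1, σ = 1, and k = 1, the sample (0.9, 1.1, 1.0) gives φ = 1, whereas the sample (0.1, −0.1, 0.0) gives φ = 0.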

Neyman and Pearson proposed a principle for determining the optimal test: for a given significance level α, the test must satisfy formula (27).

$$ \beta \left(\theta \right)\le \alpha \kern0.5em \forall \theta \in {\Theta}_0 $$

(27)

In formula (27), β(θ) is the power function of the test, Θ₀ is the parameter space of the null hypothesis H₀, and θ is the test parameter. Among the tests that satisfy formula (27), the one sought makes β(θ) as large as possible for θ in the parameter space Θ₁ of the alternative hypothesis H₁. To ensure that the probabilities of both types of error are very small, the sample size must be increased. For field testing, however, the sample size should be as small as possible while the reliability of the conclusion is still ensured. The sequential method proposed by A. Wald solves the problem of optimally selecting the sample size and is an important milestone in the history of statistics.

The probability function f(x, θ) represents the distribution of the random variable x, and H₀ (θ = θ₀) and H₁ (θ = θ₁) are the null hypothesis and the alternative hypothesis, respectively. For any positive integer m, the probability of the sample x₁, …, x_m is P_1m = f(x₁, θ₁) ⋯ f(x_m, θ₁) when H₁ is true and P_0m = f(x₁, θ₀) ⋯ f(x_m, θ₀) when H₀ is true. The sequential probability ratio test is defined as follows: select two positive constants A and B (B < A) and calculate the probability ratio P_1m/P_0m at each stage of the test.

(a)
If P_1m/P_0m ≥ A, the sequential probability ratio test ends; H₁ is accepted and H₀ is rejected.
(b)
If P_1m/P_0m ≤ B, the sequential probability ratio test ends; H₀ is accepted and H₁ is rejected.
(c)
If B < P_1m/P_0m < A, the test continues with one more observation until condition (a) or (b) is met.
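
Rules (a)–(c) translate into a short loop. The sketch below is illustrative only: the function name `sprt`, the Bernoulli densities, and the thresholds A = 19 and B = 1/19 are assumptions chosen to make the example concrete.

```python
def sprt(samples, f0, f1, A, B):
    """Sequential probability ratio test; returns (decision, number of samples used)."""
    if not 0.0 < B < A:
        raise ValueError("thresholds must satisfy 0 < B < A")
    ratio = 1.0
    m = 0
    for m, x in enumerate(samples, start=1):
        ratio *= f1(x) / f0(x)      # P_1m / P_0m updated with one more observation
        if ratio >= A:              # rule (a)
            return "accept H1", m
        if ratio <= B:              # rule (b)
            return "accept H0", m
    return "undecided", m           # rule (c): the data ran out before a decision


def bernoulli(p):
    """Probability function of a 0/1 observation with success probability p."""
    return lambda x: p if x == 1 else 1.0 - p


f0 = bernoulli(0.5)   # H0: theta = 0.5 (assumed for the example)
f1 = bernoulli(0.8)   # H1: theta = 0.8 (assumed for the example)
print(sprt([1] * 20, f0, f1, A=19.0, B=1.0 / 19.0))   # ('accept H1', 7)
print(sprt([0] * 20, f0, f1, A=19.0, B=1.0 / 19.0))   # ('accept H0', 4)
```

The two runs stop after 7 and 4 observations although 20 are available, which is the saving in sample size that motivates the sequential method.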

When SPRT is applied to target recognition, it is first assumed that one of the M alternative hypotheses is the initial hypothesis. The signal propagation waveform is denoted as s(t). When a signal is transmitted, one of the possible waveforms is received and recorded as follows:

$$ y(t)=s(t)\ast {h}_i(t)+n(t)\kern0.5em i\in \left\{1,2,\dots, M\right\} $$

(28)

where n(t) is additive white Gaussian noise, h_i(t) is the impulse response of the channel under the i-th target hypothesis, and “*” denotes the convolution operator.

The signal channel receiving data is defined in formula (29), where Q_i represents the target convolution matrix defined in the literature.

$$ y={Q}_is+n $$

(29)

The M target hypotheses are denoted as H₁, H₂, …, H_M, respectively. The parameter α_i,j (i ≠ j) is the probability that the true hypothesis H_i is wrongly selected as H_j. After k observations, the likelihood ratio of hypotheses i and j is defined as shown in formula (30).

$$ {\Lambda}_{i,j}^k=\frac{p_{i1}\left({y}_1\right){p}_{i2}\left({y}_2\right)\cdots {p}_{ik}\left({y}_k\right)\kern0.5em {P}_i}{p_{j1}\left({y}_1\right){p}_{j2}\left({y}_2\right)\cdots {p}_{jk}\left({y}_k\right)\kern0.5em {P}_j} $$

(30)

where p_ik(y_k) is the probability density function (PDF) of the k-th observation under the i-th hypothesis and y_k is the k-th observation. When the likelihood ratio satisfies formula (31) for every j ≠ i, the hypothesis H_i is accepted.

$$ {\Lambda}_{i,j}^k>\frac{1-{\alpha}_{i,j}}{\alpha_{i,j}}\kern0.5em \forall j\ne i $$

(31)

When the likelihood ratio satisfies formula (31), the test stops; otherwise, it continues to the next iteration. In fact, the form of the probability density function of the observed data does not change between observations and satisfies p_i1(y) = p_i2(y) = … = p_ik(y). The intensity waveform is updated with the number of iterations, so the probability density function of the observation data under additive white Gaussian noise can be defined as in formula (32).

$$ {p}_{ik}\left({y}_k\right)=\frac{1}{{\left(\sqrt{2{\pi \sigma}_n^2}\right)}^{L_y}}\times \exp \left[-\frac{1}{2{\sigma}_n^2}{\left({y}_k-{Q}_i{s}_k\right)}^T\kern0.5em \left({y}_k-{Q}_i{s}_k\right)\right] $$

(32)
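
Because the product in formula (30) quickly under- or overflows, formulas (30)–(32) are more conveniently evaluated in the log domain. The following sketch shows one way to do so; the function names, the equal default priors, and the name `alpha_ij` for the parameter in the threshold of formula (31) are assumptions made for illustration.

```python
import numpy as np


def log_pdf_awgn(y, Q, s, sigma_n):
    """Natural logarithm of formula (32) for one observation vector y."""
    residual = y - Q @ s
    return (-0.5 * y.size * np.log(2.0 * np.pi * sigma_n**2)
            - (residual @ residual) / (2.0 * sigma_n**2))


def log_likelihood_ratio(ys, ss, Q_i, Q_j, sigma_n, prior_i=0.5, prior_j=0.5):
    """Logarithm of formula (30), accumulated over the observations in ys."""
    total = np.log(prior_i) - np.log(prior_j)
    for y, s in zip(ys, ss):
        total += log_pdf_awgn(y, Q_i, s, sigma_n) - log_pdf_awgn(y, Q_j, s, sigma_n)
    return total


def accept_hypothesis(log_ratio, alpha_ij):
    """Stopping rule of formula (31), written in the log domain."""
    return log_ratio > np.log((1.0 - alpha_ij) / alpha_ij)
```

In a test with M hypotheses, `accept_hypothesis` would be evaluated for every j ≠ i, and H_i would be accepted only if all of these comparisons succeed.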

4 Experiments

4.1 Slurry pump fault test system and experimental design

The experimental system established in this project is required to operate the slurry pump under controlled conditions of speed, flow rate, slurry density, and inlet pressure, and to allow slurry pump impellers of different grades and degrees of wear to be installed and replaced. Common failure parts of centrifugal pumps include the rotor impeller, rolling bearing, seal, and coupling, of which impeller and rolling bearing failures account for a large proportion. The schematic diagram of the slurry pump fault diagnosis test system is shown in Fig. 10, which gives a three-dimensional view of the test circuit and identifies the key components. The system mainly includes motors, centrifugal pumps, data acquisition systems, control instruments, glycol cooling tanks, pressure gauges, flow meters, conveyor belts, sand tanks, pipelines, pressure control tanks, density meters, and sampling ports. First, the normal impeller is installed in the centrifugal pump and the slurry pump fault test system is run to collect the signal data of slurry pump vibration, flow, slurry density, motor speed, and pump inlet and outlet pressure. Then impellers with perforation, edge damage, and blade damage, each with different degrees of damage, are selected to replace the original centrifugal pump impeller. After the slurry pump fault diagnosis and test system is run again, the vibration and speed data of the slurry pump experiment system are collected.

Figure 11 shows the process flow chart of the slurry pump fault diagnosis test system. The arrows in the figure give the flow direction of the slurry when the experiment system is running. The flow chart is the basis for establishing and running the centrifugal pump fault diagnosis experiment system in this article. The numbered items in Fig. 11 are as follows: 1, centrifugal pump; 2, motor; 3, inverter; 4, power meter sensor; 5, accelerometer; 6, pressure sensor; 7, flow meter; 8, hole plate; 9, heat exchanger; 10, cooler; 11, temperature sensor; 12, sand; 13, suction pressure control tank; and 14, suction pressure sensor. The test system contains a Weir/Warman 3/2 CAH slurry pump (40 HP) with impeller C2147 (8.4"). The process flow chart covers the key issues mentioned in this article, but not all aspects of the design of the experimental system. The key points are that the cooler medium in the pipeline is ethylene glycol, the process water is municipal water, and the heat exchanger medium is steam. The microphone in the figure serves as the sound collector.

For the experiment to run successfully, the system must first be designed on the basis of engineering calculations and its components determined. The main equipment required for the experiment includes the centrifugal pump, the data acquisition system for vibration data, sensors, and a laptop computer. Auxiliary equipment, including storage tanks, valves, instruments, and drive motors, is used to control various functions. The data collected by the vibration accelerometers are used to analyze the centrifugal pump system in this experiment. The vibration sensors used for signal acquisition are as follows. Three three-axis vibration accelerometers are used in the experiment. Two of the PCB three-axis ICP (Integrated Circuit Piezoelectric) accelerometers have a sensitivity of 100 mV/g and a frequency range of 2–5 kHz. The third PCB three-axis ICP accelerometer has a frequency range of 0.5–3 kHz and a sensitivity of 1000 mV/g.

4.2 Slurry pump experimental equipment and signal acquisition system

To verify the validity of the multi-scale parallel factor analysis and sequential probability ratio test proposed in this paper for multi-channel signals in actual complex industrial production, a centrifugal pump fault diagnosis experimental system was designed. Figure 12a shows the general view of the centrifugal pump fault diagnosis experimental system. The data acquisition system shown in Fig. 12b is based on a combination of PC measurement hardware and software, which feeds electrical signals from sensors and other instruments into a computer for processing. NI LabView 7.0 was chosen as the measurement application software because its large number of tools and objects makes it easy to build a graphical measurement interface. The selected hardware is provided by NI DAQ and is highly compatible with the software. In order to collect the vibration signals of the centrifugal pump in three directions for each state, it is necessary to install a short-range but high-sensitivity sensor at the key position. Figure 12d is a schematic diagram of the positions of the accelerometers. One standard accelerometer and the high-sensitivity accelerometer are installed on the pump casing near the pump suction port, close to the parts that are prone to failure. The other standard accelerometer is mounted on the shaft bearing because this location is sensitive to vibrations transmitted from the stuffing box. Real-time signals such as flow rate, pressure, speed, and vibration can be collected synchronously by the experimental data acquisition system. By controlling the pressure and the flow in the equipment loop, we simulate the non-linear operating state of the industrial process of the mechanical system to establish the non-linear multi-fault mode, synchronously collect multi-channel signals, and obtain multi-source signals.
The internal interaction mechanism between fluid excitation and vibration response under non-linear operation mechanism can be analyzed.

The liquid transported in this experiment is mud, chosen so that the vibration signals can be collected more effectively. A normal impeller and three types of faulty impellers, with impeller perforation, impeller edge damage, and blade damage, were used to simulate failures in industrial production. These three failure modes differ clearly from one another and represent the typical failures of centrifugal pump impellers well. The impeller in the normal state is denoted as S1, and the impellers with impeller perforation, impeller edge damage, and blade damage are denoted as S2, S3, and S4, respectively. To avoid aliasing in accordance with the Nyquist sampling theorem, the sampling frequency in this experiment is 9009 Hz, and the data acquisition time is 20 s for each group.
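
As a quick check of these acquisition settings, the number of samples per channel in each group and the highest frequency that can be represented without aliasing follow directly from the sampling frequency and the record length. The symbols N, f_s, T, and f_max are introduced here only for this check.

$$ N={f}_sT=9009\ \mathrm{Hz}\times 20\ \mathrm{s}=180180,\kern1em {f}_{\mathrm{max}}=\frac{f_s}{2}=4504.5\ \mathrm{Hz} $$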

In the experiment, different impellers were replaced to collect the vibration signals of the centrifugal pump under different operation conditions. The steps of the whole experiment are summarized as follows:

(1)
Establish the experimental device according to the schematic diagram of the slurry pump test system shown in Fig. 11. The normal impeller shown in Fig. 13 is used as the impeller of the pump and sediment is added as the pumping medium. After starting the motor, adjust the motor speed to 1200 rpm, 1600 rpm, 1800 rpm, 2200 rpm, and 2600 rpm in turn, using the known voltage, motor power, motor efficiency, and other coefficients and instructions. With the sampling time of 20 s and the sampling frequency of 9009 Hz specified above, run the NI LabView 7.0 application software and the NI DAQ signal acquisition system to collect the three sets of three-dimensional vibration signals of the pump.
(2)
Replace the normal impeller in the centrifugal pump with the perforated impeller (fault S2) in Fig. 13, leaving the other parts unchanged. Follow the previous steps to start the centrifugal pump and collect data. When a set of data has been collected, set the speed to 1400 rpm, 1600 rpm, and 2600 rpm and repeat the above steps to collect data.
(3)
Replace the S2 impeller in the centrifugal pump with the impeller with edge damage (S3) in Fig. 13c, leaving the other parts unchanged. Similarly, follow the previous steps to start the centrifugal pump and collect data.
(4)
Replace the S3 impeller in the centrifugal pump with the impeller with blade damage (S4) in Fig. 13d, leaving the other parts unchanged. Similarly, follow the previous steps to start the centrifugal pump and collect data.
(5)
After the experiment, close the outlet valve of the pump. Turn off the motor and then close the inlet valve. Store the experimental data for subsequent vibration signal analysis.

5 Results and discussion

5.1 Multi-source dynamic feature extraction based on parallel factorization

A multi-scale parallel factorization method for extracting characteristic signals in non-linear multi-source and multi-fault modes is proposed in this article. Parallel factorization can not only process high-dimensional data, but its decomposition is also unique. This property makes the results of parallel factorization more realistic and gives them specific physical meanings. The third-order tensor constructed from multi-channel vibration signals through continuous wavelet transform is decomposed by the multi-scale parallel factor analysis algorithm into a series of channel, frequency, and time modes. Spatial information is thus introduced into the time-frequency analysis of the signals to form a three-dimensional space/time/frequency characteristic analysis of each factor. The simulation results show that the parallel factor decomposition of the tensor built from multi-channel signals has compatible decomposition paths and overall consistency. As a result, the topographic map, spectrum, and time contour of the multi-source fault signal in the centrifugal pump experiment are acquired. The multi-scale parallel decomposition method for extracting multi-source feature signals of non-linear failure modes is applied to the fault diagnosis of centrifugal pumps. It analyzes the internal connection between the optimal decomposition paths of the multi-source signal feature factors, and the optimal non-linear correspondence between failure modes and characteristic signals in time, frequency, and space is constructed. Based on the correspondence and overall consistency of the multi-source feature factor decomposition paths, we rebuilt three-dimensional fault feature models, such as the frequency spectrum and time profile of the fault feature factors, and successfully extracted non-linear multi-dimensional dynamic fault feature signals.
Finally, the corresponding fluctuation regularities of the homologous non-linear failure mode in the multi-source signals were displayed.

Figure 14 shows the time-frequency diagrams obtained by continuous wavelet transformation of the vibration signals collected in the X-axis direction at the three vibration signal collection points when the slurry pump is in normal operation. Figure 15 shows the result of the parallel factor analysis of the vibration signals of the slurry pump after continuous wavelet transform in the normal state. In this experiment, three-dimensional vibration sensors are set up at three measuring points. We analyze the vibration signals of these three measuring points to explore the three-dimensional spatial distribution and characteristic propagation path of dynamic characteristics on the mechanical structure of the slurry pump. The three groups of original vibration signals are transformed by continuous wavelet transformation to obtain time-frequency signals, which are stacked to construct a third-order tensor. After multi-scale parallel factor analysis of the third-order tensor, the loading values and residual variance of the channel, time, and frequency factors are obtained.
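
The construction of the channel × frequency × time tensor can be sketched as follows. This is a simplified illustration, not the processing chain used in the experiment: the complex Morlet wavelet, its parameter `w0`, the logarithmically spaced analysis frequencies, and the three pure tones standing in for measured vibration are all assumptions. The tensor size of 3 × 126 × 4096 and the sampling frequency of 9009 Hz are taken from the values reported in this article.

```python
import numpy as np


def morlet_cwt(x, fs, freqs, w0=6.0):
    """Magnitude of a complex-Morlet wavelet transform, one row per frequency."""
    x = np.asarray(x, dtype=float)
    out = np.empty((len(freqs), x.size))
    for row, f in enumerate(freqs):
        scale = w0 / (2.0 * np.pi * f)           # seconds; centre frequency equals f
        half = int(np.ceil(4.0 * scale * fs))    # keep +/- 4 scale widths of the wavelet
        tau = np.arange(-half, half + 1) / fs
        envelope = np.exp(-0.5 * (tau / scale) ** 2)
        psi = envelope * np.exp(1j * w0 * tau / scale)
        kernel = np.conj(psi[::-1]) / envelope.sum()
        full = np.convolve(x, kernel, mode="full")
        out[row] = np.abs(full[half:half + x.size])   # centred, same length as x
    return out


def build_tensor(signals, fs, freqs):
    """Stack per-channel scalograms into a channel x frequency x time array."""
    return np.stack([morlet_cwt(x, fs, freqs) for x in signals], axis=0)


# Hypothetical stand-in data: three pure tones instead of measured vibration.
fs = 9009.0
t = np.arange(4096) / fs
tones = [500.0, 1000.0, 2000.0]
signals = [np.sin(2.0 * np.pi * f0 * t) for f0 in tones]
freqs = np.geomspace(100.0, 4000.0, 126)   # assumed analysis band below fs / 2
tensor = build_tensor(signals, fs, freqs)
print(tensor.shape)   # (3, 126, 4096)
```

An array built in this way can be passed directly to a parallel factor analysis routine, which then returns one set of loading values for each of the three modes.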

Figure 16 shows the time-frequency diagrams obtained by continuous wavelet transformation of the vibration signals collected in the X-axis direction at the three vibration signal collection points when the slurry pump is in state S2 (impeller perforation). In the S2 state, a third-order tensor of size 3 × 126 × 4096 is constructed by continuous wavelet transform. Figure 17 shows the loading values and residual variance of the channel, time, and frequency modes obtained by parallel factor analysis of the third-order tensor of the slurry pump in state S2.

Figure 18 shows the time-frequency diagrams obtained by continuous wavelet transformation of the vibration signals collected in the X-axis direction at the three vibration signal collection points when the slurry pump is in state S3 (impeller edge damage). In the S3 state, a third-order tensor of size 3 × 126 × 4096 is constructed by continuous wavelet transform. Figure 19 shows the loading values and residual variance of the channel, time, and frequency modes obtained by parallel factor analysis of the third-order tensor of the slurry pump in state S3.

The slurry pump system in our experimental device has one normal operating state and three failure states. The failure states are impeller holes, leading edge damage, and propeller blade damage. Similarly, Fig. 20 shows the time-frequency diagram when the slurry pump is in state S4 (propeller blade damage). Figure 21 shows the loading values and residual variance of the channel, time, and frequency modes obtained by parallel factor analysis for state S4.

We analyze the vibration signals of these three measuring points to discuss the three-dimensional spatial distribution and characteristic propagation path of dynamic characteristics on the mechanical structure of the slurry pump. Comparing the decomposition results of parallel factor analysis in the normal and fault states shows obvious differences in the time loading factor and frequency loading factor components. Although mechanical multi-source signals exhibit characteristic coupling and aliasing, parallel factor analysis can isolate the independent characteristics of each channel on the surface of the mechanical structure and eliminate the mutual interference, overlap, and redundancy of the characteristic signals between the channels. Therefore, parallel factor analysis is effective in providing a basis for the subsequent SPRT diagnosis, and the fault identification can be successfully implemented.

5.2 SPRT for the multi-source condition monitoring of centrifugal pump

The standard deviation σ and the mean μ of the test signal sequences have a significant influence on the likelihood ratio in the sequential probability ratio test. Therefore, the mean value and standard deviation of the frequency loading values obtained from the parallel factor decomposition are calculated first for each test signal sequence. Assume that the probability distribution of the frequency loading value sequence of one set of signals under the multi-source condition monitoring of the centrifugal pump satisfies the null hypothesis H_i : μ = μ_i, and that the probability distribution of the frequency loading value sequence of the other set of signals satisfies the alternative hypothesis H_j : μ = μ_j [39]. The standard deviation σ is the same under both hypotheses. When the null hypothesis and the alternative hypothesis are true, respectively, the probability density functions of these two sets of sequences are as shown below.

$$ {p}_{ik}\left({y}_k\right)=\frac{1}{\sigma \sqrt{2\pi }}\exp \left(-\frac{1}{2{\sigma}^2}{\left({y}_k-{\mu}_i\right)}^2\right) $$

(33)

$$ {p}_{jk}\left({y}_k\right)=\frac{1}{\sigma \sqrt{2\pi }}\exp \left(-\frac{1}{2{\sigma}^2}{\left({y}_k-{\mu}_j\right)}^2\right) $$

(34)

In formula (33), p_ik(y_k) is the probability density function under the null hypothesis, and p_jk(y_k) in formula (34) is the probability density function under the alternative hypothesis; both are written with a capital P in the formulas that follow. The SPRT probability ratio is calculated in formula (35).

$$ {\lambda}_{i,j}\left({Y}_{Sm}\right)=\frac{\prod \limits_{k=1}^n{P}_{jk}}{\prod \limits_{k=1}^n{P}_{ik}}=\frac{P_{j1}\left({y}_1\right){P}_{j2}\left({y}_2\right)\cdots {P}_{jn}\left({y}_n\right)}{P_{i1}\left({y}_1\right){P}_{i2}\left({y}_2\right)\cdots {P}_{in}\left({y}_n\right)}\times \frac{P_{j0}}{P_{i0}} $$

(35)

To make the calculation easier in practical applications, the likelihood ratio is further simplified by taking its logarithm, which gives formula (36), where Y_Si and Y_Sj are the sequences to be tested for the vibration signals Si and Sj, respectively, and Δ_{i, j}(Y_Si) and Δ_{i, j}(Y_Sj) are the logarithmic likelihood ratios of Y_Si and Y_Sj, respectively.

$$ {\Delta}_{i,j}\left({Y}_{Sm}\right)=\ln {\lambda}_{i,j}\left({Y}_{Sm}\right)=\ln \frac{\prod \limits_{k=1}^n{P}_{jk}}{\prod \limits_{k=1}^n{P}_{ik}}=\sum \limits_{k=1}^n\ln \frac{P_{jk}}{P_{ik}}\kern0.5em m=i,j $$

(36)
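
For Gaussian densities of the form used in formulas (33) and (34), each term of the sum in formula (36) has a closed form. The short derivation below is offered as a worked step rather than as part of the original derivation; it takes the alternative density to be centred on μ_j, as the hypothesis H_j : μ = μ_j states, and uses the common standard deviation σ.

$$ \ln \frac{P_{jk}\left({y}_k\right)}{P_{ik}\left({y}_k\right)}=\frac{{\left({y}_k-{\mu}_i\right)}^2-{\left({y}_k-{\mu}_j\right)}^2}{2{\sigma}^2}=\frac{\mu_j-{\mu}_i}{\sigma^2}\left({y}_k-\frac{\mu_i+{\mu}_j}{2}\right) $$

$$ {\Delta}_{i,j}\left({Y}_{Sm}\right)=\frac{\mu_j-{\mu}_i}{\sigma^2}\sum \limits_{k=1}^n\left({y}_k-\frac{\mu_i+{\mu}_j}{2}\right) $$

In this form, each new loading value raises the statistic when it lies closer to μ_j than to μ_i and lowers it otherwise.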

Following the sequential probability ratio test algorithm, we compare the likelihood ratio with the thresholds A and B to identify different forms of failure of the centrifugal pump. The values of A and B are closely related to the probability α of a type I error and the probability β of a type II error. The variables α, β, A, and B satisfy the following relationships:

$$ a=\ln A=\ln \frac{1-\beta }{\alpha } $$

(37)

$$ b=\ln B=\ln \frac{\beta }{1-\alpha } $$

(38)

For the four impeller states S1, S2, S3, and S4 of the centrifugal pump experimental system, Figure 22 shows the process of using the sequential probability ratio test algorithm to identify the fault. The vibration signals collected at three different positions under two different impeller conditions (si) and (sj) are decomposed by parallel factor analysis to obtain frequency loading values, from which the likelihood ratio Δ_{i, j} of the sequential probability ratio test is calculated according to formulas (33)–(36). The process of centrifugal pump fault identification is as follows. (1) If Δ_{i, j} = ln (λ_{i, j}) ∈ (−∞, b], accept H_i; the centrifugal pump system is under the condition (si). (2) If Δ_{i, j} = ln (λ_{i, j}) ∈ [a, ∞), accept H_j; the centrifugal pump is under the condition (sj). (3) If Δ_{i, j} ∈ (b, a), the next data point in the test sequence is taken and the likelihood ratio of the sequential probability ratio test is updated according to formulas (33)–(36). The likelihood ratio continues to be compared with the thresholds until condition (1) or (2) is met or the set number of iterations is reached. After the test is stopped and the probability parameters λ_{1, 2}(Y_S1), λ_{1, 2}(Y_S2), λ_{1, 3}(Y_S1), λ_{1, 3}(Y_S3), λ_{1, 4}(Y_S1), and λ_{1, 4}(Y_S4) are obtained, the conditions of the centrifugal pump can be distinguished.
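
The decision procedure can be sketched in the log domain as follows. The sketch assumes Gaussian densities with a common standard deviation, as in formulas (33) and (34) with the alternative centred on μ_j, and uses the thresholds of formulas (37) and (38). Because Δ_{i, j} in formula (36) is the logarithm of P_j over P_i, reaching the upper threshold a favours H_j and reaching the lower threshold b favours H_i in this sketch. The function names and the numerical values are hypothetical.

```python
import math


def sprt_thresholds(alpha, beta):
    """Formulas (37) and (38): a = ln((1 - beta) / alpha), b = ln(beta / (1 - alpha))."""
    return math.log((1.0 - beta) / alpha), math.log(beta / (1.0 - alpha))


def gaussian_sprt(sequence, mu_i, mu_j, sigma, alpha=0.05, beta=0.05):
    """Return (decision, samples used, final statistic) for H_i: mu_i against H_j: mu_j."""
    a, b = sprt_thresholds(alpha, beta)
    delta = 0.0
    n = 0
    for n, y in enumerate(sequence, start=1):
        # One term of formula (36), evaluated in closed form for Gaussian densities.
        delta += (mu_j - mu_i) / sigma**2 * (y - 0.5 * (mu_i + mu_j))
        if delta >= a:
            return "H_j", n, delta
        if delta <= b:
            return "H_i", n, delta
    return "undecided", n, delta


# Hypothetical loading-value sequences, not experimental data.
print(gaussian_sprt([1.0] * 10, mu_i=0.0, mu_j=1.0, sigma=1.0))   # ('H_j', 6, 3.0)
print(gaussian_sprt([0.0] * 10, mu_i=0.0, mu_j=1.0, sigma=1.0))   # ('H_i', 6, -3.0)
```

With α = β = 0.05 the thresholds are a = ln 19 ≈ 2.94 and b ≈ −2.94, so in this example a decision is reached after six observations in either direction.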

The means of the signals S1, S2, S3, and S4 under the four conditions are μ₁, μ₂, μ₃, and μ₄. The likelihood ratio is then calculated and analyzed according to formulas (33)–(36). The SPRT probability ratios λ_{i, j}(Y_Si) and λ_{i, j}(Y_Sj) are calculated by substituting the test data (Y_Si, Y_Sj) of the signal waveforms for the slurry pump conditions Si and Sj into Eq. (35). The likelihood ratios Δ_{i, j}(Y_Si) and Δ_{i, j}(Y_Sj) are then compared with the thresholds to determine the state Si or Sj of the centrifugal pump.

The difference between the variables λ_{i, j}(Y_Si) and λ_{i, j}(Y_Sj) is used to distinguish the conditions (Si, Sj) of the centrifugal pump. For a fixed number of iterations, the condition S1 is compared with S2, S3, and S4 through the SPRT probability ratios. Figure 23a shows the curves of the likelihood ratios Δ_{1, 2}(Y_S1) and Δ_{1, 2}(Y_S2) of the signals S1 and S2 over the set number of iterations. It can be seen from the curves in Fig. 23a that Δ_{1, 2}(Y_S1) < b, whereas Δ_{1, 2}(Y_S2) > a indicates that the centrifugal pump is in the fault state of S2 impeller perforation. From the curves in Fig. 23b, it can be seen that Δ_{1, 3}(Y_S1) < b, whereas Δ_{1, 3}(Y_S3) > a indicates that the centrifugal pump is in the fault state of S3 impeller edge damage. In Fig. 23c, Δ_{1, 4}(Y_S1) < b, whereas the inequality Δ_{1, 4}(Y_S4) > a is satisfied, so the condition of the centrifugal pump is S4 impeller blade damage. The SPRT parameters Δ_{1, m}, m = 2, 3, 4, in Fig. 23a–c are adopted to distinguish the normal condition (S1) of the centrifugal pump from the fault conditions (S2, S3, S4).

We found that the different fault states of the centrifugal pump in the experiment can also be distinguished from one another by the SPRT method. Figure 24a shows the SPRT parameters λ_{2, 3} obtained from Eqs. (33)–(36) for the testing sequences Y_S2 and Y_S3. Here Δ_{2, 3}(Y_S2) < b, whereas the inequality Δ_{2, 3}(Y_S3) > a is satisfied, so the condition of the slurry pump is S3 (impeller edge damage). It can be seen from the curves in Fig. 24b that Δ_{2, 4}(Y_S2) < b, whereas Δ_{2, 4}(Y_S4) > a is satisfied, so the centrifugal pump is in fault S4 (impeller blade damage). Likewise, Δ_{3, 4}(Y_S3) < b, whereas the inequality Δ_{3, 4}(Y_S4) > a is satisfied, so the condition of the centrifugal pump is S4 (impeller blade damage). The parameters (λ_{i, j}(Y_Si), λ_{i, j}(Y_Sj)) are effective indicators for monitoring the different conditions of the multi-fault and multi-source centrifugal pump.

6 Conclusion

Parallel factorization can not only process high-dimensional data, but its decomposition is also unique. This property makes the results of parallel factorization more realistic and gives them specific physical meanings. Through numerical simulation, parallel factorization is used to explore the non-linear correspondence of multi-source signal characteristic factors in time, frequency, and space under different simulation states. By adjusting the frequency, time, phase, and amplitude of the simulated signal, the loading values of the three modes are obtained after parallel factorization to study their correspondence with the simulated signals. Then, parallel factor analysis is applied to the centrifugal pump fault diagnosis experimental system to analyze the state characteristics under multiple fault modes. The non-linear multi-dimensional dynamic fault characteristic parameters of the centrifugal pump impellers with different faults were successfully acquired, and the corresponding fluctuation regularities of the homologous fault mode characteristics in the multi-source signals were displayed.

The analysis of the combined results shows that the centrifugal pump fault diagnosis methodology based on multi-scale parallel factor analysis and the sequential probability ratio test is effective and reliable. The method first analyzes the collected vibration signals through parallel factor analysis and then conducts the sequential probability ratio test. It identifies different failure modes by comparing likelihood ratios with thresholds. Not only can normal conditions be separated from fault conditions, but different fault conditions can also be distinguished from one another. Therefore, the methodology proposed in this article is well suited to non-linear multi-source and multi-fault signal analysis and processing. The PARAFAC approach used in this paper can also be applied to the blind separation of multiple mechanical faults.

Availability of data and materials

All data generated or analyzed during this study are included in this published article.

Abbreviations

PARAFAC: Parallel factor analysis
SPRT: Sequential probability ratio test
CNN: Convolutional neural network
EMD: Empirical mode decomposition
VMD: Variational mode decomposition
EEG: Electroencephalogram
ANN-MLP: Artificial neural network-multi-layer perceptron
TALS: Trilinear alternate least squares
LS: Least squares
PDF: Probability density function

Acknowledgements

The Reliability Research Lab in the Department of Mechanical Engineering at the University of Alberta in Canada provided the original experimental data. The National Natural Science Foundation of China (Grant No. 51775390, 51901164, 51805378), the Natural Science Foundation of Hubei Province (Grant No. 2018CFB394), and the Foundation of Wuhan Science and Technology Bureau (Grant No. 2019010701011417) provided the financial support for this research.

Funding

This research was funded by the National Natural Science Foundation of China (Grant No. 51775390, 51901164, 51805378), the Natural Science Foundation of Hubei Province (Grant No. 2018CFB394), and the Foundation of Wuhan Science and Technology Bureau (Grant No. 2019010701011417).

Author information

Authors and Affiliations

No. 206, Guanggu First Road, Jiangxia District, Wuhan, Hubei, 430205, People’s Republic of China
Liu Yang, Hanxin Chen, Yao Ke, Menglong Li, Lang Huang & Yuzhuo Miao
Hubei Provincial Key Laboratory of Chemical Equipment, Intensification and Intrinsic Safety, Wuhan Institute of Technology, Wuhan, 430073, People’s Republic of China
Hanxin Chen

Contributions

The algorithms proposed in this paper were conceived by all authors. Hanxin Chen designed and performed the experiments and then analyzed the results. Liu Yang performed the experiments and analyzed the simulation results. Liu Yang wrote the paper. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Hanxin Chen.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Yang, L., Chen, H., Ke, Y. et al. Multi-source and multi-fault condition monitoring based on parallel factor analysis and sequential probability ratio test. EURASIP J. Adv. Signal Process. 2021, 37 (2021). https://doi.org/10.1186/s13634-021-00730-w

Received: 27 February 2021
Accepted: 28 April 2021
Published: 13 July 2021
DOI: https://doi.org/10.1186/s13634-021-00730-w
