Fault diagnosis of Tennessee Eastman process using signal geometry matching technique

This article employs adaptive rank-order morphological filter to develop a pattern classification algorithm for fault diagnosis in benchmark chemical process: Tennessee Eastman process. Rank-order filtering possesses desirable properties of dealing with nonlinearities and preserving details in complex processes. Based on these benefits, the proposed algorithm achieves pattern matching through adopting one-dimensional adaptive rank-order morphological filter to process unrecognized signals under supervision of different standard signal patterns. The matching degree is characterized by the evaluation of error between standard signal and filter output signal. Initial parameter settings of the algorithm are subject to random choices and further tuned adaptively to make output approach standard signal as closely as possible. Data fusion technique is also utilized to combine diagnostic results from multiple sources. Different fault types in Tennessee Eastman process are studied to manifest the effectiveness and advantages of the proposed method. The results show that compared with many typical multivariate statistics based methods, the proposed algorithm performs better on the deterministic faults diagnosis.


Introduction
The last decades have been witnessing the modern large-scale processes developing toward high complexity and multiplicity in industries such as chemical, metallurgical, mechanical, logistics, and etc. These processes are generally characterized by a long-process flow with large operation scales and complicated mechanisms. The typical features are highly nonlinear, long-time delay, and heavily correlated among measurements [1]. Process monitoring, aiming to ensure that the operations satisfy the performance specifications and indicating anomalies, becomes a major challenge in practice. First, the requirements of process expertise for model-based methods often pose difficulties for operators not specializing in this realm; secondly, the system identification theory based methods need to postulate specified mathematical models, which are incapable of capturing varied nonlinearities. In addition, due to the growing number of sensors installed in processes, quantity of data constantly generated under different conditions soars by a few orders of magnitude or more compared to small-scale processes [2]. The fundamental dilemma for process monitoring is deficient knowledge to establish relative accurate mathematical process description while incomplete methodology to exploit abundant data to reveal process mechanisms and operational statuses. In large-scale processes, standard PI (proportional-integral) or PID (proportional-integral-derivative) closed-loop control schemes are often adopted to compensate for variable disturbances and outliers. However, excessive compensation may easily cause controllers overburden and a trivial glitch could eventually develop to catastrophic fault(s). Based on the considerations of practical limits, demands of safety operation, cost optimization as well as business opportunities in technical development, the problem of how to more effectively utilize mass amount of process data to meet the increasing demand of system reliability has received intensive attention of academics and practitioners in related areas. Among all the tasks, data-driven fault diagnosis, involving the use of data to detect and identify faults, is one of the most interesting research domains.
In previous extensively cited literature, Venkatasubramanian once proposed classical three subclasses of diagnostic techniques: quantitative model-based methods, qualitative model-based methods, and process history based method [3][4][5]. From a new perspective to further investigate Venkatasubramanian's classification, data-driven based fault diagnosis not only includes a large part of techniques in process history based method, but also some belonging to qualitative modelbased methods. To view data-driven methods as an integrated type, we can re-divide fault diagnosis methods into three subclasses, namely analytical model-based methods, qualitative knowledge-based methods, and data-driven based methods (DDBM), where DDBM can be further divided into data transform based methods (DTBM) and data reasoning based methods (DRBM). Figure 1 illustrates the proposed classification. In general, DDBM are associated with the methods with insufficient information available to form mechanism model. These kinds of methods employ process data in dynamic system to perform fault detection, diagnosis, identification, and location. DTBM, to be more specifically, highlights the adoption of linear or nonlinear mathematical transforms to map original data to data in another form and the transforms are often reversible. The transformed data may be without clear physical meanings, but with more practicality. The key concept of data transform lies in two attributes: deterministic transform paradigm and realization of data compression. With this concept, the scope of DTBM is smaller and more concentrating compared to DDBM; the purpose for data utilization is more specific. DTBM also needs no in-depth knowledge about system structure as well as experience accumulation and reasoning knowledge which are necessary to DRBM. Besides, the implementation of DTBM algorithms are easily understood and realized, but the drawback may be less robust than model based methods. Dimension transformation (often dimension reduction), filtering, decomposition and nonlinear mapping are recognized as common tools for data transform.
In Figure 1, signal processing is categorized as a data transform methodology which covers a wide range of different techniques. Typical ones are primarily filtering and multilayer signal decomposition, both requiring preset models and carefully selected parameters, like Wavelet Analysis, Hilbert-Huang Transform, etc. Morphological signal processing, however, gives a different viewpoint. It derives from rank-order based data sorting technique and modifies signal geometry shape to achieve filtering [6]. This feature may provide more advantages of noise reduction and detail preservation than linear tools when treating measurements in complex processes [7]. Moreover, Salembier [8] analyzed that how the performance of rank-order based filter can be adaptively optimized in terms of the filter mask and rank value. Based on the investigations above, morphological signal processing as a nonlinear data transform tool may be suitable for constructing feature extractor for pattern matching.
In our previous work (unpublished work), we developed Salembier's idea [8] to adaptively adjust flat structuring element and rank parameter for each sample rather than adopting uniform ones for all the samples in a sampled sequence. Based on this idea, we designed a signal geometry matching approach: pattern classification using one-dimensional adaptive rank-order morphological filter for fault diagnosis, named PC1DARMF approach. The proposed method belongs to DTBM with major parameters capable of being randomly chosen, which is superior to those DTBM which need predefined parameters. This article applies PC1DARMF approach to a more complex and challenging application: Tennessee Eastman process (TEP). TEP is a classic model of an industrial chemical process widely studied in literature for validating new developed control or process monitoring strategies. It is a typical large-scale process characterized by features described previously. The fact that many data-driven diagnostic methods have been performed on TEP also provides chances to evaluate their performances in comparisons with method proposed in this article.
The remainder of this article is organized as follows: Section 2 expounds the derivation of pattern classification method using adaptive rank-order morphological filter. Key implementation issues are also discussed. An example is given to build a step-by-step realization of the method, making it easier for readers to understand. Section 3 gives an essential introduction to TEP and reviews the previous TEP fault diagnosis methods. Section 4 shows the diagnosis results for different TEP simulated faults with detailed analysis. Comparisons between the proposed method and typical multivariate statistics based approaches are made to highlight the advantages and features of PC1DARMF. The last part finally presents the conclusion and discussions.
2. Signal geometry matching based on adaptive rank-order morphological filter 2.1. One-dimensional adaptive rank-order morphological filter (1DARMF) Adaptive rank-order morphological filter is derived from a nonlinear signal processing tool referred as the rankorder based filter (ROBF). ROBF firstly reads a certain number of input values, then sorts the values in ascending order and determines the output value according to the predefined rank parameter in the sorted set. The basic definition of one-dimensional (1D) ROBF is firstly given in [9]: let x i be discrete sampled signal defined on a 1D space Z and M be a 1D mask containing N points (|M|= N and | | is the set cardinality). Define j as an index belonging to the mask M and r as the normalized rank parameter of the filter (0 ≤ r ≤1). Given the rankorder operator denoted by f r,M [x i ], the output of ROBF y i can be then formulated as (1): where elements of set X are sorted in ascending order and Rank n {X} denotes the nth ordered value in X (n is the nearest integer value of (N -1)r + 1), x i-j denote all the points which belong to the range of mask M centered on i (e.g., if j = -3, -2, -1,0,1,2,3, ij = i -3,...,i+3). This operation is the essentials of both median filter and morphological filter with flat structuring element [8,9]. However, its drawback is that the selections of filter mask and rank parameter heavily rely on practical experience and intuition. With understanding the feature of ROBF, its adaptive form named adaptive rank-order morphological filter was then proposed [8,9]. It is optimized as adapting filter mask and rank parameter in order to minimize a criterion such as the MAE (mean absolute error) or the MSE (mean squared error). The problem of designing adaptive rank-order morphological filter can be briefly stated as follows: assume that x i and d i are given as noised signal and desired signal, respectively, when ROBF f r,M is adopted, the aim is to find the best rank parameter r and filter mask M which minimizes a cost function C between output y i and d i using iterative learning. In order to expound the procedure of building 1DARMF for better understanding, how to formulate the operation of ROBF is to be introduced at the beginning.
First, in order to overcome the optimization difficulty for dealing with the discrete nature of parameters, the rank parameter r can be optimized in continuous normalized manner and let n in Rank n {X} be the nearest integer value of (N -1)r + 1. Secondly, for filter mask M optimization problem, a search area A which is selected to be larger than the optimum mask is introduced and a continuous value m (j) is assigned for ∀j A. New filter mask in next iterative step is thus determined by comparing the set of continuous values associated with the current filter mask against a preset value (denoted as threshold thm_M). If the assigned value for any j A is greater than the threshold, the location associated to that j belongs to the filter mask. With introduction of search area A and the continuous values assignments, the optimization problem of filter mask M is successfully converted from the binary values modification of the mask (belong or not belong) to continuous values m (j) modification.
On the basis of realizing parameters updating continuously, we proceed to find a way to establish a mathematical relationship involving filter input, output, and the parameters all together. Let us define S the sum of signs of (x i-j -y i ) for all j. It can be expressed by It is easy to find out that if r = 0, y i is the minimum of {x i-j | j M}and S is then equal to N -1; if r = 0.5, y i is the median value of {x i-j | j M} and S = 0; if r = 1, y i is the maximum of {x i-j | j M}, S = -(N -1). Based on the mapping relations between S and r above, if they were assumed to be linearly related, the general expression of S with respect to r is given as In case of thm_M being set 0, we obtain if (sgn(m (j) -thm_M)+1)/2 = 1, then m (j) > thm_M, which means j M and else if (sgn(m (j) -thm_M)+1/2) = 0, then m (j) < thm_M , j M c Notice all j is selected from A and let (sgn(m (j) -thm_M)+1/2) (i.e., (sgn(m (j) )+1)/2) be the weight, combing (2) and (3) gives Thus, the output of ROBF is successfully expressed by the implicit function F(m (j) ,x i-j ,y j ,r). As will be stated later, this implicit function is applied to take derivatives of y i with respect to m and r to develop iterative formulae for parameter updates.
In [8], an iterative algorithm similar to the LMS (least mean squares) algorithm was suggested to update the m (j) and r in the case of MSE optimization: Where a and b are two predefined parameters controlling the convergence rates. The derivatives of y j with respect to m (j) and r are calculated through employing implicit function (5). To obtain the expression of ∂y i ∂m (j) and ∂y i ∂r , the derivative of F with respect to m k is firstly expressed as That is Using (5) to take the derivative of F with respect to m (j) gives ∂F ∂y i is also calculated by using (5): In (11), the term δ(x i-j -y i ) is equal to 1 only if j equals to j 0 , i.e., the time shift whose corresponding x i-j 0 equals to output y i . This indicates j 0 M and sgn(m j 0 ) = 1, (11) is simplified to Combined with (10), (9) is written as If δ(m k ) is replaced by δ'(m k ) = 1 for -1 ≤ m k ≤ 1 for simplification. Based on (13), (6) is converted to Similar with the deduction of (9) and (13), we also have Based on (12), (16) is written as Combined with (17), (7) is converted to where N = |M| is the current length of filter mask in use.
Combining (1), (14), and (18), the parameters updating algorithm for one dimensional adaptive rank order morphological filter are given as (19), where itN denotes the current iteration and itN + 1 for the next. Note that the update processes of filter mask M and rank parameter r are varying according to each sample i rather than remaining the same for each sample.
To illustrate the performance of 1DARMF given by (19), an example is shown in Figure 2. In Figure 2a, it depicts three signals: noised signal x (dash-dot line) as input signal, desired signal d (solid line) as supervisory signal, and output signal y (dotted line) as recovered signal.
where s is the useful signal contaminated by Gaussian noise n and SNR x (signal-to-noise ratio) is set 2. In this example, s = sin(t) and d is selected equal to s in order to recover the useful signal. Initial parameters of 1DARMF in (19) are set as follows: initial 1D filter mask M (0) = [-5,-4,-3,-2,-1,0,1,2,3,4,5], initial assigned value for element in the mask m (0,j) = 0.5 (∀j M), initial rank parameter r (0) = 0, thm_M = 0, max iterations iterationNUM = 300, convergence rate a = 1 × 10 -4 and b = 1.5 × 10 -3 . If we define the sum of squared error between y and d as the evaluation of signal recovering ability, the expression is given as where i means the ith sample of signal and itN denotes current iteration. Figure 2b shows e (itN) converges to steady state and oscillates in a stable manner as itN gets increased.

Pattern classification using 1DARMF (PC1DARMF)
In Section 2.1, the general procedure to implement 1DARMF needs desired signal d as supervisory signal to train the key parameters of filter to obtain desired output. However, for a certain input x, if d is alternatively chosen, the iterative training process would finally lead to different output y. This means under supervision of inappropriate or undesirable d, the output may fail to recover useful signal from original input x. A performance comparison of 1DARMF using different supervisory signals is given to illustrate this phenomenon in Figure 3. With input x and the initial parameters being set the same with Section 2.1, different d results in different y, as shown in Figure 3a, c, e, g, i. Figure 3b, d, f, h, j depict corresponding e (itN) gradually reaches stable oscillation as iterations increase. The most distinct common feature is all e (itN) eventually progress to a steadystate through enough iterations. This phenomenon can be theoretically guaranteed: Feuer and Weinstein [10] concluded that if the convergence rate was restrained within a upper limit, then it was the necessary and sufficient for LMS algorithm to ensure the convergence of the algorithm. Therefore, with the proper selection of ain (6) and b in (7), e (itN) is also expected to stably oscillate eventually. The selection rule will be later summarized in Section 2.3. This condition is the crucial prerequisite to further form our algorithm for pattern classification. In Table 1 min(e (itN) ) are also listed to numerically compare the effect of different d on signal recovering. Figure 3 and Table 1 indicate the most matching supervisory signal in signal geometry shape with original input x (i.e., d = s = sin(t)) yields minimum value of min(e (itN) ), showing the best signal recovering ability. Based on this property, it is expected that given an unrecognized noised signal and a certain number of reference signals (also known as signal templates) as supervisory signals, 1DARMF may be capable of achieving signal recognition and classification through finding out under which reference signal the min(e (itN) ) value reach the minimum among all reference signals provided. We thus propose the basic procedures for pattern classification using 1DARMF in Figure 4. The procedure for pattern classification using 1DARMF can be further developed to an algorithm, named PC1DARMF algorithm. It is a supervised pattern classification approach. The fundamental of this algorithm is to realize signal geometry shape matching using 1DARMF as a tool in an iterative way. If the supervisory signals denote different types of physical meanings, for example representing different operation conditions or fault types in dynamic processes, this algorithm could achieve faults diagnosis through the signal geometry shape matching. In general, PC1DARMF algorithm is meaningful in two levels: first, it serves for the type classification purpose and secondly a feature extractor from nonstationary signals with proper parameter settings.

Issues for implementing PC1DARMF algorithm
In Section 2.2, PC1DARMF algorithm was mainly described in a high-level structure. There are still several significant engineering principles and experience to know which are important to practical implementation. They include initial parameter settings, convergence rates selections, and iteration stopping criteria.

Initial parameter settings
Initial parameter settings for PC1DARMF algorithm involves initial value determination of filter mask M (0) , assigned value m (0,j) for each element in filter mask, rank parameter r (0) and the threshold thm_M. Several reasons are supporting the random initial parameter settings. First, the only variable of filter mask in 1DARMF is its length. Based on analysis of Nikolaou and Antoniadis [11] of empirical rule for the length selection and consideration of keeping computational complexity relatively low, we propose to random chose it between 0.3 and 0.5 times of the total length of input signal. Secondly, there are no guidelines in theory for m i and r i initial values. They get renewal in continuous manner to optimal value during iterations, so their initial values are expected to be different chosen each time within an Step 1: Set values of initial parameters M (0) , m (0,j) , r (0) and thm_M Step2: For a input signal x, select a signal template d n ( n=1,2,3…,Np and Np is the signal templates number) as supervis o r y signal and apply 1DARMF until e n ( i tN ) in (20) oscillates in steady state, then calculate index FI n =min(e n (itN) ).  interval (e.g., [0, 1]). Thirdly, notice the derivations of (6) and (18) in Section 2.1 are all irrelevant to the value of thm_M, thm_M can be also randomly chosen within [0, 1]. Besides, the most important is that it is impossible to find optimal initial parameter settings for signals with varying nonstationary characteristics. The first goal of PC1DARMF is to measure how good two signals match each other rather than achieve optimal signal recovering, so the selection of initial parameter values would not be necessarily restrained as special ones.
Based on the analysis, we use random initial parameter settings for later experiments.

Convergence rates selections
The selection rule of convergence rate a and b in (19) is (21), which is referenced from [10] and early mentioned in Section 2.1. As was indicated before, (21) guarantees the convergence of the LMS algorithm.
where μ denotes convergence rate, R is covariance matrix of input signal, tr[R] is the trace of R. We further find empirically that if a and b is chosen as 1/3tr[R], output y may often cause unstable oscillation. In this article, we adopt that a and b is much smaller than 1/ 3tr[R]: for example, a = 0.0001,b = 0.0015.

Iteration stop criteria
Max iteration number preset is the key factor to greatly influence the algorithm computational cost. Notice the computational complexity of PC1DARMF algorithm is O(| N log N ||SL||dNUM||MaxitN|), where N is the average length of structuring element and O(|N log N |) is the computational complexity of Quicksort algorithm, SL is the processed signal length, and dNUM for the number of signal templates. SL and dNUM are predefined and unchangeable. MaxitN is the max iterations to ensure the convergence. Salembier [8] and Figure 3 in Section 2.2 also pointed out that 1DARMF had an ability to provide fast convergence. If the PC1DARMF algorithm always set a fixed iteration numbers, it would be unnecessary and the computational cost would be tremendous. An alternative way for reducing redundant iterations is to stop the iterations when within a certain number of continuous iterations, average variation of e (itN) falls below a threshold if no specified information about input signal and the noise level is given.
3. Tennessee Eastman process fault diagnosis using PC1DARMF algorithm

Introduction to Tennessee Eastman process (TEP)
Tennessee Eastman process is first proposed by Downs and Vogel [12] to provide a simulated model of real industrial complex process for studying large-scale process control and monitoring methods. As is shown in Figure 5, the process consists of five major units: an exothermic two-phase reactor, a product condenser, a recycle compressor, a flash separator, and a reboiler stripper. Gaseous reactants A, C, D, E, and inert B are fed to the reactor. Component G and H are two products of TEP, while F is undesired byproduct. The reaction stoichiometry is listed as (22). All the reactions are irreversible, exothermic, and approximately first-order with respect to the reactant concentrations. The reaction rates are expressed as Arrhenius function of temperature. The reaction producing G has higher activation energy than that producing H, thus resulting in more sensitivity to temperature.
The reactor product stream is cooled through a condenser and fed to a vapor-liquid separator. The vapor exits the separator and recycles to the reactor feed through a compressor. A portion of the recycle stream is purged to prevent the inert and byproduct from accumulating. The condensed component from the separator is sent to a stripper, which is used to strip the remaining reactants. After G and H exit the base of the stripper, they are sent to a downstream process which is not included in the diagram. The inert and byproducts are finally purged as vapor from vapor-liquid separator.
The process provides 41 measured and 12 manipulated variables, denoted as XMEAS(1) to XMEAS(41) and XMV(1) to XMV(12), respectively. Their brief descriptions and units are listed in Table 2. Twenty preprogrammed faults IDV(1) to IDV(20) plus normal operation IDV(0) of TEP are given to represent different conditions of the process operation, as listed in Table 3. TEP proposed in [12] is open loop unstable and it should be operated under closed loop. Lyman and Georgakis [13] proposed a plant-wide control scheme for the process. In this article, we implement this control structure to evaluate performance of PC1DARMF algorithm on fault diagnosis for it provides the best performance for the process.

Related work for TEP fault diagnosis
Various approaches have been proposed to deal with the fault diagnosis and isolation for TEP since its introduction in 1993. Most of them are dedicated to exploit data-driven techniques because of the process complexity and data abundance. Multivariate statistics based, machine learning based, and pattern matching based methods are the most frequently adopted methodologies summarized in this article. Meanwhile hybrids of the three have been also studied in literature.
Raich and Cinar [14][15][16] are among the earliest researchers to apply multivariate statistics techniques for TEP fault diagnosis. Training data under different operation conditions are firstly utilized to design PCA (principal component analysis) models for fault detection and fault classification. Then, designed PCA models are applied to new data to calculate statistic metrics and different discriminant analysis is conducted to determine whether and which fault occurs. The method is also able to diagnosis multiple simultaneous disturbances by quantitatively measuring the similarities between models for different fault types. Russell et al. [17] and Chiang et al. [18] gives a comprehensive and detailed study of multivariate statistical process monitoring using major dimensionality reduction techniques: PCA, FDA (Fisher discriminant analysis), PLS (partial least squares), and CVA (canonical variate analysis). Additionally, some improved multivariate statistical methods outperform their conventional counterparts for TEP fault diagnosis, like dynamic PCA/FDA (DPCA/DFDA) [19], moving PCA (MPCA) [20], and modified independent component analysis (modified ICA) [21]. Application of the multivariate statistics based methods is under assumption that sample data mean and covariance are equal to their actual values [17]. This would leads to requirement of large quantity of real data for ensuring relative accurate statistic estimations.
Machine learning based methods are also abundant in literature. It requires large amount of historical data under various fault conditions as training data to form a data mapping mechanism. Artificial neural networks (ANN) and support vector machine (SVM) are the most employed techniques applied to TEP fault diagnosis [22][23][24][25] among machine learning based methods. Eslamloueyan [26] further proposed hierarchical artificial neural network (HANN) to diagnosis faults for TEP. Fault pattern space is first divided to subspaces using fuzzy clustering algorithm. For each subspace representing a fault pattern, a special NN is trained for fault diagnosis. Besides, Bayesian networks [27,28] and signed directed graphs (SDG) [29] are also investigated in TEP fault diagnosis problem.
Another important approach is pattern matching. The basic idea is to match the pattern against the templates stored after using feature extracting techniques. Different similarity measures are defined to quantify the matching degree. Qualitative trend analysis (QTA) is a significant pattern-matching based method. It represents signals as a set of basic shapes as major features, which distinguishes different signals in geometry shapes. Maurya et al. [30] used seven primitives to represent signal geometry under different fault conditions. Maurya  [31] also proposed an interval-halving method for trend extraction and a fuzzy matching based method for similarity estimation and inferences. Akbarya and bishnoi [32] used wavelet-based method to extract features and binary decision tree to classify them. All the above, QTA-based methods require training data, while Singhal and Seborg [33] proposed a pattern-matching-strategy requires no training data but a huge amount of historical data. The approach needs specification of snapshot dataset, which serves as a template during the historical database search. Pattern similar to snapshot data in historical database can be located by sliding a window of signals in fixed length. The drawback of this method is that it needs to accumulate historical data and, of course, cannot perform on-line process monitoring tasks. In general, pattern recognition based methods are   Step

IDV(3) D feed temperature (Stream 2)
Step IDV(4) Reactor cooling water inlet temperature Step IDV(5) Condenser cooling water inlet temperature Step IDV(6) A feed loss (Stream 1) Step IDV (7) C header pressure loss-reduced availablity (Stream 4) Step IDV ( [34] combined SDG and PLS to demonstrate better diagnosis resolution, accuracy, and reliability than previous qualitative methods. Lu et al. [35] considered the limitation of PCA for differentiating faults with similar time-varying characteristics and utilized wavelet analysis to extend the feature extracting ability into time-frequency domain.
PC1DARMF algorithm in this article is a novel pattern matching method. The theoretical basis is rank-order based filter theory, which is easy to understand and implement. The computational complexity is controllable and may be relatively lower than traditional QTAbased methods. It employs the complete signal rather than some elements extracted as template. The applied strategy preserves important information, preventing possible information loss, and distortion. It also needs no knowledge about occurrence moments of the fault. Simulation data in Section 4 would verify its effectiveness.

Diagnostic procedure of using PC1DARMF algorithm
The method proposed for TEP fault diagnosis is a supervised signal geometry shape matching approach, so constructing the signal templates as supervisory signals should be considered in the first place. The left part of Figure 6 depicts the procedure for obtaining the templates. The training data used for template construction is a matrix consisting of raw sampled signal intervals of selected measurements within different fault conditions. The normalization of the raw signal to zero mean, unit variance signal eliminates the discrepancy in the different weights given to different variables. After noise reduction by wavelet de-noising method [36], PCA model is then employed to extract PCs (principal components) since variables in TEP are highly correlated.

Data set specification for simulation
This section describes TEP simulation data specification we adopt in this article. It mainly concerns the data constituent for the training and testing sets, data sampling interval, and sample size. The form of training and testing data is a matrix consisting of variables XMEAS (1) to XMEAS(2) and XMV(1) to XMV(11) except constant-valued XMV (12). An observation vector of TEP process is given as (23) and data collected during one simulation run of fault type i consists of k observations assembled as (24). Downs and Vogel [12] recommended the favorable time for one simulation run was between 24 and 48 hrs. In this article, we chose 24 h to keep computational cost relatively low. Russell et al. [17] proposed a sampling interval of 3 min to allow fast fault diagnosis. Thus, one simulation run contains 480 observations in our simulation studies, i.e., k is 480 in (24). If n i simulation runs are implemented for fault type i, the training data matrix including n c fault types is represented as (25).

Deterministic fault diagnosis in TEP
In this section, we first consider diagnosis of deterministic faults, i.e., IDV(1) to IDV(7) in TEP. Their brief introductions are listed in Table 3. Following the procedures described in Section 3.3, we construct the signal templates as standard patterns in the first place. The training data matrix M tr of (25) is assembled. It is designed to contain process data produced in eight different simulation runs and each fault type corresponds to a simulation. In these simulations, the fault is introduced 8 h after the simulation started. After PCA is conducted on normalized and de-noised M tr , parallel analysis [38] is applied to suggest that the PCs, which correspond to the first five largest eigenvalues of training data covariance matrix, capture total variations of the data set optimally. To form standard patterns of each fault type, matrices X 0 to X 7 defined in (24) are first normalized and de-noised, respectively. After that projections of preprocessed data onto the first five PCA loading vectors are performed to obtain five new observations (scores). With the consideration of reducing the computational cost of later PC1DARMF algorithm, a resampling of the new observations reducing data points from 480 to 60 is performed. Figure 7 illustrates trends of five resampled signals for each fault type. With five in a row composing a set, these trends obtained from training data set are signal templates.
After signal templates were prepared, we move to build signal patterns derived from testing data. In Section 4.2, we define testing data set for each fault type, which consists of data collected from 20 simulation runs, in which 10 are generated with fault introduced in the 4th h and the 10 others introduced in the 12th h. As the same with training data set, testing data set is also needed to be normalized and de-noised. PCA model generated from training data and resampling are adopted to acquire signal patterns. The five signal patterns as a set represent current status of sampled raw data collected from the process in signal geometry manner, possessing major features for fault recognition.
After signal templates and signal patterns to be recognized are built, PC1DARMF algorithm can be performed step by step as demonstrated previously in Figure 4. The initial parameters for the algorithm are selected as follows: assigned value m (0,j) , rank parameter r (0) , and the threshold thm_M are subject to random choices while the initial length of filter mask N (0) is an integer chosen arbitrarily from 20 to 30 (0.3 to 0.5 times the trend length) for balancing tradeoff between algorithm effectiveness and low computational cost. The convergence rates a and b are pointed in Section 2.3. The algorithm stops when the variation of e (itN) is less than 1% within continuous 50 iterations and max iterations is set 200 to ensure algorithm convergence.
In this example, an unrecognized signal pattern of a selected PC adopts eight faulty statuses signal templates for that PC as supervisory signals and applies PC1DARMF algorithm to find its best matching signal template. If one signal pattern and its best matching signal template turn out to both derive from the same fault type, it is regarded as correct fault diagnosis, otherwise the incorrect diagnosis. Table 4 lists the correct diagnosis rates of 20 tests for each fault type using five PCs. In order to combine the fault diagnosis results given by five PCs, multisource data fusion employing consensus theory is considered here to give a more reliable final result. The linear opinion pool (LIOP) as one of the most popular approaches of consensus theory achieves results fusion by computing the weighted sum of diagnosis credibility given by each PC.
where P(w j |PC i ) quantifies the credibility of algorithm result of using ith PC for fault diagnosis. In other words, it reflects the frequencies of ascending orders of index FI n (introduced in Figure 4) in overall FI values when choosing the right signal template of PC i . P(w j | PC i ) can be calculated on the basis of statistics of extra training data set, which contains 10 simulation runs for each type with fault introduced time of the 8th h after the simulation started. λ i is the weight of result given by PC i and can be determined according to data variation captured by related PC. Tables 5 and 6 list P(w j |PC i ) and weight λ i for PC i . Based on the knowledge of Tables  5 and 6 and (26), the final diagnosis results using PC1DARMF algorithm and consensus theory for IDV(0) to IDV(7) are tabulated in Table 7.
The fusion results of five PCs in Table 7 are the same with results only using signal templates of PC 1 in Table   4. It requires less computational effort to only use signal templates of PC 1 for deterministic faults diagnosis in TEP. Table 7 suggests normal operation (IDV(0)) is correctly diagnosed with low rate. When the process is under normal operation, observations are in steady state with minor oscillation in noise level. Then, the normalization and PCA dimension reduction may result in signal templates varying randomly rather than retaining regular geometry shapes. Table 8 compares the performances of PC1DARMF algorithm with multivariate statistics based approaches (MSBA) [17]. Both MSBA and the proposed method are on-line monitoring methods. Besides, [17] is among handful investigations which gave the detailed specifications of data in use and studied all fault types of TEP. It helps provide more comprehensive comparisons. Twenty MSBA includes PCA, DPCA, FDA.DFDA, CVA, PLS, MS (multivariate statistics) based statistic measurement (such as Hotelling T 2 or Q statistic) [17] and average values of correct diagnosis rates of 20 MSBA are listed in Table 8. It shows four out of seven faults are more easily detected by PC1DARMF algorithm rather than MSBA. IDV(3) is defined as unobservable from the process data in [17], which implies no observable change in mean or the variance can be detected. All MSBA performs poorly on IDV(3) diagnosis. However, PC1DARMF algorithm manages to capture variations in signal geometries and performs much better than MSBA. IDV(4) only cause the mean and standard deviation of each variable differ less than 2% between the faulty status and normal operation (IDV(0)) [17]. This phenomenon leaves signal shapes of observations almost   invariable from IDV(0), causing both misclassification rates for IDV(4) and IDV(0) higher than other faults. In general, both average values for seven deterministic faults are equally well. Considering the testing data for PC1DARMF algorithm are more diverse than MSBA in [17], the proposed method fares better than the existing ones.

Stochastic fault classification in TEP
IDV (8) to IDV(12) given in Table 3 are featured by random variations in measurements when one of them occurs. In this subsection, PC1DARMF algorithm is employed to classify stochastic fault types. The training data set consists of 10 simulation runs for each fault type and fault is introduced 8 h after simulation started. The testing data set simulates faulty states with different faults occurrence time and 20 simulation runs are provided for each fault type (with fault occurrence time 4th, 2nd, 10th, 6th h per five simulations). Parallel analysis suggests that five PCs capture most variations. The initial parameter settings for algorithm follow the settings in Section 4.2. Five sets of signal templates for characterizing IDV(8) to IDV (12) are depicted in Figure  8. With the same steps in Section 4.2, Tables 9, 10, 11 and 12 list corresponding statistics for stochastic fault classification directly for saving details of derivations.
In Table 9 statistics suggest that supervision of signal template sets of every PC results in at least one 0 correct classification rate, while the fusion gives more comprehensive results. The performances of stochastic faults classification are poorer than performances of deterministic faults. One important reason was analyzed in Section 4.2 for random variations hinder formation of standard patterns. Table 13 tabulates comparisons between PC1DARMF algorithm and MSBA [17] for classifying stochastic faults in TEP. The former performs better on two out of five faults. Note that IDV(9) is also viewed as unobservable like IDV(3) in [17] and very difficult to be identified by MSBA. PC1DARMF algorithm here again fares better for unobservable fault. Since random variations lead to irregular morphology of signal shape but observable quantitative variations of statistic measurements, PC1DARMF algorithm performs poorer than MSBA on average. However, the conclusion that MSBA will always give lower misclassification rates than PC1DARMF algorithm would be incorrect, because 8 out of 20 approaches studied in [17] still give lower average correct diagnosis rates compared to PC1DARMF algorithm. If more diverse training data types are provided, the classification results of the proposed method are expected to be better.

Diagnosis of all fault types in TEP
After investigating diagnostic performances of PC1DARMF algorithm for two major fault classes, respectively, we proceed to study the case when it is applied to all possible faults in TEP. The training data set consists of 10 simulation runs for each fault type and the fault is introduced the 8th h after simulation started. The testing data set simulates faulty states with different faults occurrence time and 20 simulation runs are provided for each type (with fault introduced time 4th, 2nd, 10th, 6th h per five simulations). Parallel analysis is applied to find that six PCs enough capture most variations. The parameter selection rules for PC1DARMF algorithm is the same as previous. Tables 14, 15, 16 and 17 list related statistics. Table 17 suggests that the performances of PC1ADRMF algorithm degrades when more sets of signal templates representing more fault types are provided in comparisons with Tables 7,8,9,10,11 and 12. This phenomenon implies as a supervised pattern matching strategy, PC1DARMF algorithm may require more different training data covering major features of relevant groups as much as possible to assist forming highly representative signal templates.

Conclusion and discussion
In this article, a supervised pattern classification method using one-dimensional adaptive rank-order morphological filter called PC1DARMF is developed to detect and recognize different faults in Tennessee Eastman process. This method generates several signals of featured geometry shapes as standard patterns on the basis of training data. With the same processing procedures as training data, testing data reflecting current operational states of TEP are transformed to signal patterns with defined specification. They are matched against standard signal patterns with employment of 1DARMF. It adaptively adjusts filter mask and rank parameter for each sample of signal rather than adopting uniform ones for all the samples. The major parameters for implementing this algorithm are capable of being randomly chosen.  (e) IDV(12) Figure 8 Signal templates for (a) IDV (8) to (e) IDV (12). In each row, five figures from left to right correspond to signal templates of PC 1 to PC 5 , respectively.  TEP deterministic, stochastic, and all fault classes diagnosis are studied to verify the effectiveness of the proposed method in complex process. Consensus theory is employed for fusion of results provided by different sources The results show with only small quantity of training data provided and the testing data being much more different from training data, the performances of deterministic or stochastic faults diagnosis is better than or equally well as multivariate statistics based approaches studied in [17]. Several faults deemed as unobservable for multivariate statistic based approaches can be also recognized more easily. Deterministic faults diagnosis fares better than the stochastic faults diagnosis, since deterministic ones are apt to form similar or regular trends regardless of specified faulty conditions and noise levels while stochastic ones fail to retain basic morphologies for signal patterns. It is also noted that for some real-time applications, signal template sets provided by only one or two PCs is recommended to reduce computational cost. The future work also lies in diagnosing more diverse or multiple faults in TEP or      other complex processes to improve the proposed methods. Besides, Ku et al. [19] pointed DPCA model was expected to perform better than regular PCA on the TEP problem. DPCA could be introduced to extract more information in data set. Despite some promising results of the proposed method, this article presents only a preliminary implementation in complex process, which sheds light on the novel idea of PC1DARMF. The key concept of this idea is to achieve pattern matching between standard pattern and unrecognized pattern by 1DARMF. The specification of pattern defined is not only limited to timedomain signal geometry shapes or even not signal geometry shapes. Frequency spectrums, power spectrum, vector bases, and feature coefficients, etc. can be assembled to form user-defined pattern as well. The new developed pattern should be capable of not only capturing different characteristics for each group but also maintaining a relatively steady form without too much unexpected random variations. Using the same scheme may address the problems such as unstable patterns introduced by random variations.