EURASIP Journal on Applied Signal Processing 2002:12, 1427–1436 c ○ 2002 Hindawi Publishing Corporation Multiuser Channel Estimation from Higher-Order Statistical Matrix Pencil

This paper presents a new statistical approach to the blind estimation of linear multiple-input multiple-output (MIMO) channels with finite impulse response. A matrix pencil is constructed from a set of fourth-order cumulant matrices of the channel output signals. The MIMO channel impulse responses can then be efficiently estimated from the generalized eigendecomposition of this cumulant matrix pencil. Random weighting is applied in the matrix pencil construction to improve the reliability of the algorithm. The proposed new method requires a relaxed channel identifiability condition and is robust in the sense that it does not require the exact knowledge of the MIMO channel order.


INTRODUCTION
In recent years, blind estimation of multiple-input multipleoutput (MIMO) linear channels has become a well-known research problem in multichannel communications and signal recovery. Satisfactory solutions of this problem can find diverse applications in areas such as multiuser detection, array signal processing, speech processing, and multichannel biomedical signal recovery.
The key objective of blind MIMO channel estimation is to determine the unknown matrix channel impulse response without direct training or knowledge of the channel input signals. The receiver must rely on the statistical information from the channel output signals. When the channel is a memoryless system, the problem is often known as blind source separation (or independent component analysis) with the goal of directly extracting source signals from the instantaneous mixtures without explicitly identifying the mixing matrix [1,2,3]. On the other hand, many multiuser systems must deal with dynamic channels that are characterized by a convolutive model. Once the dynamics of the MIMO system are estimated, techniques previously used in blind source separation of memoryless systems can be employed subsequently for individual signal separation.
Most existing approaches to blind channel estimation rely on the use of either second-order statistics (SOS) or higher-order statistics (HOS) of the channel output signals. As two equally important directions, second-order and higher-order methods have different characteristics suitable for different application scenarios. Compared with HOS methods, SOS methods may provide better performance for shorter data records [4,5,6,7,8,9,10,11,12,13,14,15,16,17,18]. They also require a subsequent source separation step after identifying the channel dynamics. On the other hand, the proper use of HOS information allows blind identification of a wider class of MIMO channels. HOS methods can directly resolve the inherent unitary ambiguity in SOS channel estimation. Considering channels without excessive diversity, in this work, we present an HOS approach to blind channel estimation.
There are many well-known HOS blind identification methods for single-input single-output (SISO) and MIMO systems. An incomplete list of works include [19,20,21,22,23,24,25,26,27,28]. Specifically, Giannakis et al. [22] generalized their original "GM method" to MIMO systems. Swami and his colleagues [23] presented a unified framework to define cumulants of vector processes for arbitrary orders and developed parameter estimation algorithms for causal and noncausal multichannel AR, MA, and ARMA models. In [24], Tong proposed a new eigen-structure based idea under a very general channel identifiability condition. Using an indirect approach of inverse criteria, Tugnait [26,27] proposed two nonlinear algorithms that iteratively recover one user signal at a time before estimating its corresponding channels.
In this paper, our goal is to develop a simple method that can estimate a large class of MIMO channels. A matrix pencil is a powerful tool and has been successfully applied in the fields of blind source separation and array processing [1,29,30,31]. Motivated by the SOS matrix pencil algorithms for the blind separation of nonstationary or colored signals [30,31], we develop an HOS approach to the blind estimation of MIMO channels driven by white stationary input signals. In a previous work [32], we developed a matrix pencil-based algorithm for blind identification of SISO systems. Here, the work in [32] is generalized to MIMO channel estimation. The channel identifiability condition of the new method is weaker than some existing higher-order methods, for example, [22,26,27], but is still stronger than the condition given in [24]. In addition, this matrix pencil method exhibits some robustness to errors in channel order estimation.
Our paper consists of the following parts: Section 2 describes the problem formulation and its basic assumptions. Section 3 outlines the principle of channel estimation from the cumulant matrix pencil, and Section 4 presents the algorithm in details. Section 5 provides simulation examples that illustrate the performance advantages of the proposed algorithm compared with some existing algorithms [26,33].

System model and basic assumptions
We consider the discrete-time model of a linear MIMO system with m input users and p outputs. The ith channel output signals x i (n) is given by (1) where h i j (n) represents the channel impulse response between the jth channel input and the ith channel output and w i (n) denotes additive Gaussian noise at the ith channel output. We assume that the MIMO channel has finite impulse response (FIR) with maximum memory order of q.
For notational convenience, define the channel impulse response matrix as and define the MIMO output signal vector, input signal vector, and noise vector as respectively. The input-output relationship for the linear MIMO system is given by We can define the MIMO channel transfer function as For convenience of algorithm derivation, the convolutional relationship (4) can also be written as where the stacked signal vectors are defined as . . .
and the channel convolution matrix H is an (L + 1)p × (L + q + 1)m block Toeplitz matrix We will present a method to estimate the channel impulse response {H[k]} q k=0 from the fourth-order cumulants of output signal x[n].
The discussion throughout this paper is based on several key assumptions. Here, we provide a list of these assumptions for later reference.
(A1) Independent source signals {s i (n)} are stationary, temporally i.i.d. processes with zero-means and nonzero fourth-order kurtoses {γ i } . (A2) Channel noises {w i (n)} are zero-mean stationary Gaussian processes. They are mutually independent, and are also independent of input signals.
(A3) The number of input signals is no more than the number of channel outputs, that is, p ≥ m. (A4) Matrix H[0] only consists of nonzero columns. (A5) There exists a nonzero z 0 (including ∞), such that H(z 0 ) has full column rank.
The first three assumptions (A1), (A2), and (A3) are commonly used in higher-order methods. Assumption (A4) can be made without loss of generality since all-zero column corresponding to user s i (n) can be removed by renaming s i (n) = s i (n − τ) for some delay τ, as indicated in [24]. Note that assumption (A5) is the channel identifiability condition (see [33]) required in our channel estimation algorithm. This condition, though stronger than that proposed in [24], is less restrictive than the irreducibility condition required by second-order methods and the requirements of some other higher-order methods [22,26,27].

Cumulant matrix construction
As in [33], fourth-order cumulant matrix C l [k], associated with the lth channel output signal x l (n − k), is defined as where (·) * and (·) H represent complex conjugate and conjugate transpose, respectively. Matrix C l [k] has the same dimension as the covariance matrix of x[n]. Based on assumptions (A1), (A2) and the properties of cumulants [34], the noise contribution to this matrix is zero and we have in which We will describe a matrix pencil algorithm for MIMO channel estimation using a collection of these cumulant matrices.

CHANNEL ESTIMATION FROM MATRIX PENCIL
This section outlines the basic principle of the proposed multiuser cumulant matrix pencil (MCMP) channel estimation algorithm. It first describes the matrix pencil formation using a set of cumulant matrices {C l [k]}. Then, it gives key equations for channel identification that involves finding nontrivial generalized eigenvectors of the cumulant matrix pencil.

Cumulant matrix pencil
Our approach is motivated in part by [30,31], where a matrix pencil, formed from output autocorrelation matrices at different time delays, was used to extract source signals that are nonstationary with colored power spectra. Here, we use cumulant matrices {C l [k]} associated with the same time delay k, but different spatial indices l. Following our study on single-user system [32], the matrix pencil is constructed from linear combinations of these cumulant matrices. Equations (10) and (11) describe the general form of cumulant matrix C l [k] for given parameters {k, l, L}. Here, we consider the special case of Under this condition, the cumulant matrix can be specifically represented as (see [33]) with Notice that H s is a convolution matrix with block Toeplitz structure, and can be a tall matrix with dimension (L + 1)p × (L + 1 − q)m, given that p ≥ m. Then, we define a cumulant matrix pencil {S 1 , S 2 } in which Two sets of weighting factors {w il } are i.i.d. random numbers with uniform distribution within interval (0, 1). From (13) and (15), we have has an all-zero column corresponding to input signal s u (n), then both Σ 1 [ j] and Σ 2 [ j] have a zero element at the uth diagonal entry. This property will be used to single out trivial solutions of our channel estimation algorithm.

Channel identification via matrix pencil
We consider the generalized eigenvalue problem or equivalently It has a total of (L + 1)p generalized eigenvectors {v i } with corresponding generalized eigenvalues {λ i }. For MIMO systems, our matrix pencil algorithm requires that H s have full column rank. As shown in [33], this requirement can be satisfied under assumption (A5). Given the full column rank matrix H s , every generalized eigenvector v i should satisfy We will show that the MIMO channel impulse response can be identified by finding the proper solutions {λ i , v i } of (20). For clarity, three different classes of generalized eigenvectors are discussed separately.

Trivial eigenvectors
A trivial solution of the generalized eigenvalue problem (18) is given by Thus, λ i can be of any complex value. Such a solution can occur in two cases. First, the full row rank matrix H H s has a nonempty nullspace with a dimension of (L + 1)p − (L + 1 − q)m. Any vector in this nullspace will give H H s v i = 0. In addition, other trivial solutions may exist depending on the MIMO channel impulse response. Since an all-zero column in H[ j] ( j = 0, . . . , q) will lead to the singularity of Σ 1 [ j] and Σ 2 [ j], there is a one-to-one mapping between an all-zero column in H[ j] and a trivial generalized eigenvector v i such that All these trivial eigenvectors provide no information about the channel and thus will be discarded. They can be singled out based on the following two properties: Here, · represents the Euclidean norm of a vector.

Nontrivial eigenvectors of distinct eigenvalues
Nontrivial eigenvectors of distinct eigenvalues are the most useful ones because they possess the desired properties for channel estimation. When an eigenvalue is said to be distinct, it does not have any multiplicity and has only one corresponding eigenvector. It can be seen from (20) that, if λ i is a distinct eigenvalue, then there is only one j ( j ∈ {0, . . . , q}), such that λ i is also a unique eigenvalue of matrix pencil In other words, there exists a vector a such that Because both Σ 1 [ j] and Σ 2 [ j] are diagonal matrices defined by (17), λ i is given by This implies that the uth element, corresponding to input signal s u (n), is the only zero diagonal element of ( are nonzero except for trivial cases discussed above. Consequently, the generalized eigenvector v i corresponding to λ i must satisfy where α i is an unknown scaling factor and e j u represents the ( jm+u)th canonical vector in the identity matrix (i.e., e j u has 1 in the ( jm + u)th entry and zeros elsewhere).
Therefore, by multiplication, we have where h j u represents the ( jm + u)th column of H s and α i absorbs all scaling factors. Similarly, Apparently, vector S 1 v i (or C l [q]v i ) provides an estimate of the channel impulse response for input s u (n).

Nontrivial eigenvectors of repeated eigenvalues
It should be noted that the proposed MCMP algorithm will fail if the cumulant matrix pencil {S 1 , S 2 } has nontrivial repeated eigenvalues. Let λ i represent a repeated eigenvalue with the corresponding eigenspace denoted by V i . We will have in which Λ is a diagonal matrix, P is a permutation matrix, and U i is a unitary matrix. In this case, channel estimates can not be directly extracted from S 1 V i because of the unknown mixture matrix U i .

Discussions
From the above analysis, MCMP channel estimation algorithm requires that all nontrivial generalized eigenvalues be unique for the cumulant matrix pencil under consideration. When we adopt random weighting, it is with probability one that all nontrivial generalized eigenvalues are distinct, as in (23). Equation (26) indicates that each nontrivial eigenvector can only provide an estimate of one user channel and we do not know which user it belongs to. Thus, the remaining problem is to ensure that each active user channel can be estimated. Another important issue is to note that the MCMP algorithm does not rely on the exact knowledge of channel order q. Instead, it needs an approximate channel orderq such that q ≥ q. Suppose that we do not know the true channel order q and useq in the delay and length parameter setting (12) to compute cumulant matrix C l [q]. Equations (13), (14), (15), (16), and (17) As a result, the number of trivial eigenvectors is increased, but the nontrivial solutions are not affected.

ALGORITHM IMPLEMENTATION AND CHANNEL TAGGING
This section focuses on a detailed procedure to implement the MCMP channel estimation algorithm and describes how to separate the estimates of different user channels.

Generalized eigendecomposition
First, cumulant matrices {C l [q]} are estimated from the channel output signals to form a matrix pencil {S 1 , S 2 }. As shown above, symmetric matrices S 1 and S 2 are positive semidefinite and share a common nullspace. This means that {S 1 , S 2 } is a symmetric, singular matrix pencil. The QZ algorithm, such as the one used in MATLAB function eig(A, B), can be a standard method for solving small-to moderate-dimensional generalized eigenvalue problems. However, as pointed in [35], the QZ algorithm is generally unreliable when handling a singular matrix pencil. In [36], Cao specifically addressed the generalized eigenvalue problem of a symmetric matrix pencil {A, B} (regular or singular), in which one of the matrices is positive (or negative) semidefinite. The Fix-Heiberger reduction was generalized to deflate the infinite and the singular structure from {A, B}. By applying this deflation method, it was shown that the Kronecker canonical form of {A, B} is very special and the generalized eigendecomposition of {A, B} can be considerably simplified.

Eigenvector selection
The matrix pencil {S 1 , S 2 } has (L + 1)p generalized eigenvectors overall. When none of H[n] (n = 0, . . . , q) has an allzero column, the number of nontrivial eigenvectors reaches the maximum of (L + 1 − q)m. Our goal here is to determine these nontrivial eigenvectors. The selecting procedures are proposed as follows.
(i) Normalize all generalized eigenvectors. For the ith normalized eigenvector v i , find C l [q] corresponding to (ii) Compute for i = 1, . . . , (L + 1)p. (iii) Let denote the set of index i for nontrivial eigenvectors. Since f li for trivial eigenvectors are very small, set consists of indices corresponding to (L + 1 − q)m largest f li . Although it is possible that a small number of trivial eigenvectors may be wrongly included, they will be discarded in the channel tagging step (Section 4.3). (iv) Channel estimates can be obtained as shown in (26) and (27). For better estimation performance, we compute where β i is a scaling factor and h j u represents the ( jm + u)th column of H s .
Regarding the last step, the block Toeplitz structure of H s determines that channel parameters are located in the middle of vector y i with zero entries at one end or both ends. In other words, the estimation result y i includes unknown delay ambiguity and scaling ambiguity β i .

Channel tagging
Since different y i carry channel information for different users, we need to classify all { y i } into m groups, each corresponding to one user. This is known as channel tagging.
To construct a proper classification criterion, we first define a distance measure between two vectors y i1 and y i2 . If they correspond to the same user, without loss of generality, they can be written as It is clear that h j1 u and h j2 u belong to different column blocks of matrix H s . Moreover, when y i2 is circularly up shifted by ( j 2 − j 1 )p elements, the resulting vector y is exactly the same as y i1 except for a scalar difference. In this case, the normalized cross-correlation between y i1 and y ( j2− j1) i2 reaches the maximum value. Based on this property, we define a crosscorrelation matrix R y to measure the similarity between every pair of y i1 and y i2 . We normalize each vector y i into y i . The (i 1 , i 2 )th entry of R y is given as where y i2 is circularly up shifted by np elements to form y (n) i2 . Matrix R y provides an effective distance measure for channel tagging.
We then apply the simple hierarchical dimensionality reduction (HDR) approach to classify { y i } into m groups. HDR method was also adopted in [31] for signal grouping. Initially, we have (L + 1 − q)m groups and each of them has only one vector y i . In every iteration, two most correlated groups are merged together. For any two groups, the group correlation is defined as the median of cross-correlation values [R y ] i1,i2 of any two vectors taken from each of these two groups, respectively. The merging procedure is continued until only m groups are left.
After classification, every user group may have several y i providing channel estimates for the same user. The final estimation result is selected as the vector with maximum norm. By doing this, the results obtained from wrongly selected trivial eigenvectors will be discarded.

Summary
To summarize, the MCMP channel estimation algorithm contains the following major steps.

SIMULATIONS
In this section, we present several simulation examples to demonstrate the feasibility and performance of the proposed MCMP channel estimation algorithm. For comparison, the simulation results of two existing MIMO channel estimation algorithms, namely the MIMO Cumulant Subspace (MIMOCS-MP) algorithm [33] and the iterative Constant Modulus Algorithm (IT-CMA) based approach [26], are also included.
In our simulation setup, channel inputs are mutually independent, i.i.d. QPSK signals. Additive channel noise is complex white Gaussian with zero mean, and its variance is determined by the averaged signal-to-noise ratio (SNR) over all channel outputs. We consider 2-input/2-output FIR MIMO channels, in which each subchannel is a random realization of COST207 typical urban (TU) propagation model [37] for GSM system. In addition to the examples where exact channel order q is used, some results illustrating the performance under channel order overestimation are also provided.
For MCMP algorithm, we set {k = q, L = 2q} as in (12) and solve (2q + 1)p generalized eigenvectors of matrix pencil {S 1 , S 2 }. MIMOCS-MP algorithm chooses the delay and length parameters as {k 1 = q, k 2 = 2q − 1, L = 3q − 1}. It needs to perform singular value decompositions of a (3q 2 p × 3qp) rectangular matrix and a (q + 1)p × (q + 1)p square matrix. Obviously, the implementation of MCMP algorithm is less costly than the MIMOCS-MP algorithm. On the other hand, IT-CMA algorithm involves adaptation of MIMO-CMA equalizer parameters and linear filtering operation to recover the input signals. The low-computational complexity of IT-CMA algorithm is proportional to the data length.
As a performance measure, we use the normalized mean square error (NMSE) criterion. For a given subchannel h i j (n), define in whichĥ i j (n) is the estimated channel impulse response and ρ j is used to compensate the scalar ambiguity associated with the estimation results for the jth user. The overall NMSE (ONMSE) is obtained by averaging over all subchannels Assume that channel order q is known. We select the length of MIMO-CMA equalizer to be L e = 4 and the adaptation stepsize µ = 0.0002. Figure 1 illustrates the performance of these three algorithms at different SNR levels. Comparison is carried out for data lengths of 4000 and 8000 samples, respectively. In both cases, MCMP algorithm outperforms MIMOCS-MP and IT-CMA at various SNR levels. Figure 2 shows the comparative results for various data lengths when SNR is fixed at 10 dB and 20 dB, respectively. These results confirm the observation made from Figure 1. It is seen that MCMP algorithm demonstrates its performance advantage over IT-CMA and MIMOCS-MP algorithms, especially with shorter data lengths and at lower SNR levels.
Example 2. Next, we study the performance of the three algorithms when channel order is not exactly known. Recall that both MCMP algorithm and IT-CMA algorithm only need an approximate channel orderq such thatq ≥ q, while the subspace decomposition based MIMOCS-MP algorithm requires the exact knowledge of the channel order to guarantee the uniqueness of estimation results.  We test these algorithms on channel CH1 using 8000 data samples. In Figure 3, the dashed lines represent estimation results when true channel order q is known, and the solid lines represent estimation results when the channel order is overestimated asq = q + 1. Clearly, both MCMP and IT-CMA algorithms introduce mild performance degradation, while MIMOCS-MP algorithm shows much severe performance loss. This example illustrates that MCMP algorithm is less sensitive to channel order overestimation.
Note that IT-CMA algorithm requires H(z) to be of full column rank for |z| = 1 when a doubly infinite length equalizer is used. Clearly, channel CH2 violates the identifiability condition of IT-CMA algorithm, but it is still identifiable by the proposed MCMP algorithm. Figure 4 shows the comparative results of the MCMP algorithm and the IT-CMA algorithm at different SNR levels. Assume that channel order q is known. The length of MIMO-CMA equalizer is selected as L e = 10. We tested different adaptation stepsizes for fast and good convergence of IT-CMA algorithm, and chose µ = 0.0032 for data length of 4000 samples and µ = 0.0016 for data length of 8000 samples. It is observed that the performance of IT-CMA tends to stagnate, while MCMP can evidently improve its performance as SNR increases or data record length becomes longer. For the class of noninvertible channels as CH2, global convergence of MIMO-CMA equalizer is not guaranteed and the equalizer parameters may be trapped  at local minima. When the recovered input signal from equalizer is unreliable, it in turn affects the accuracy of channel estimation in IT-CMA algorithm. This example illustrates the advantage of the MCMP algorithm in the sense that it allows a less restrictive channel identifiability condition. three algorithms to various channel conditions, we employ the following random channel test. We first randomly generate ten FIR MIMO channel models in the same way used to obtain CH1 and CH2. Then, ten Monte Carlo runs are performed for each random channel model. The ONMSE is obtained by averaging over 100 runs corresponding to these ten random channel models. Here, all subchannels have channel order q = 1 which is assumed to be known. The parameters of these three algorithms are chosen as in Example 1.
As illustrated in Figure 5, the proposed MCMP algorithm demonstrates better averaged performance than the other two algorithms in this random channel test, and the averaged performance of IT-CMA and MIMOCS-MP are very close. Next, in Figure 6, the results are shown versus Monte Carlo runs when data length is 8000 samples and SNR is 20 dB. It can be seen that, in most cases, MCMP algorithm generates more reliable estimates than IT-CMA and MIMOCS-MP. On the other hand, there do exist some "bad" TU channels, for example, Monte Carlo runs 70 to 90, for which all three algorithms perform poorly.

CONCLUSIONS
A new matrix pencil algorithm is developed for the blind estimation of FIR MIMO systems from fourth-order cumulants of the multiple channel outputs. It is shown that an unknown MIMO channel impulse responses can be identified by finding nontrivial generalized eigenvectors of a cumulant matrix pencil. The proposed new method requires a weaker channel identifiability condition than some existing methods and does not rely on the exact knowledge of the channel order.
Numerical simulation examples are given to demonstrate its consistent performance.