EURASIP Journal on Applied Signal Processing 2003:8, 757–765
© 2003 Hindawi Publishing Corporation

An Evolutionary Approach for Joint Blind Multichannel Estimation and Order Detection

A joint blind order-detection and parameter-estimation algorithm for a single-input multiple-output (SIMO) channel is presented. Based on the subspace decomposition of the channel output, an objective function including channel order and channel parameters is proposed. The problem is solved by a specifically designed genetic algorithm (GA). In the proposed GA, both the channel order and the channel parameters are encoded into a single chromosome, so that they can be estimated simultaneously. Novel GA operators and convergence criteria are used to ensure correct convergence and high convergence speed. Simulation results show that the proposed GA achieves satisfactory convergence speed and performance.


INTRODUCTION
Many applications in signal processing encounter the problem of blind multichannel identification. Traditional methods for such identification usually apply higher-order statistics techniques. The major problems of these methods are slow convergence and many local optima [1]. Since the original work of Tong et al. [1,2], many lower-order statistics-based methods have been proposed for blind multichannel identification (see [3] and references therein). A common assumption in these methods is that the channel order is known in advance. In practice, however, such information is not available, so the channel order must be estimated beforehand. Although many order-detection algorithms can be applied to this problem (e.g., see [4]), approaches that separate order detection from parameter estimation may not be efficient, especially when the channel-impulse response has small head and tail taps [5].
To tackle this drawback, a class of channel-estimation algorithms performing joint order detection and parameter estimation has been proposed [5,6]. In [5], a cost function including channel order and parameters is proposed. However, the algorithm may not be efficient because the channel order is estimated by evaluating all possible candidates from 1 to a predefined ceiling. The method proposed in [6] is also not a truly joint approach, since the order is estimated separately by detecting the rank of an overmodelled data matrix. In fact, this is very similar to the methods in [4] that apply a rank-detection procedure to an overmodelled data covariance matrix. Order estimation via rank detection may not be efficient because it is sensitive to noise [4] and because the eigenvalue decomposition is computationally costly.
In this paper, we propose a truly joint order-detection and channel-estimation method based on a genetic algorithm (GA). GAs have been widely used in channel-parameter estimation [7,8,9]. However, their application to joint order detection and parameter estimation has not been well explored. Based on the subspace decomposition of the output-autocorrelation matrix, we first develop a new objective function for estimating channel order and parameters. Then, a novel GA-based technique is presented to solve this problem. The key idea of the proposed GA is that the channel order can be encoded as part of the chromosome. Consequently, the channel order and parameters can be estimated simultaneously. Simulation results show that the new GA outperforms existing GAs in convergence speed. We also compare the performance of the proposed GA with the closed-form subspace method, which assumes that the channel order is known [10]. Simulation results show that the proposed GA achieves a similar performance.

PROBLEM FORMULATION
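We consider a multichannel FIR system with $M$ subchannels. The transmitted discrete signal $s(n)$ is modulated, filtered, and transmitted over these Gaussian subchannels. The received signals are filtered and down-converted to baseband. The resulting baseband signal at the $m$th sensor can be expressed as follows [1]:

$$x_m(n) = \sum_{k=0}^{L} h_m(k)\, s(n-k) + b_m(n), \tag{1}$$

where $b_m(n)$ denotes the additive Gaussian noise and is assumed to be uncorrelated with the input signal $s(n)$, $h_m(n)$ is the equivalent discrete channel-impulse response associated with the $m$th sensor, and $L$ is the largest order of these subchannels (note that the subchannels may have different orders). Equation (1) can be represented in vector-matrix formulation as follows:

$$\mathbf{x}_m(n) = \mathbf{H}_m \mathbf{s}(n) + \mathbf{b}_m(n),$$

where $\mathbf{x}_m(n) = [x_m(n) \cdots x_m(n-N)]^T$ is the $(N+1) \times 1$ observed vector at the $m$th sensor, $\mathbf{b}_m(n) = [b_m(n) \cdots b_m(n-N)]^T$ is the $(N+1) \times 1$ additive noise vector, and $\mathbf{s}(n) = [s(n) \cdots s(n-N-L)]^T$ is the $(N+L+1) \times 1$ transmitted vector. The matrix

$$\mathbf{H}_m = \begin{bmatrix} h_m(0) & \cdots & h_m(L) & & \mathbf{0} \\ & \ddots & & \ddots & \\ \mathbf{0} & & h_m(0) & \cdots & h_m(L) \end{bmatrix}$$

is the $(N+1) \times (N+L+1)$ transfer matrix of subchannel $h_m(n)$. We define an $M(N+1) \times 1$ overall observation vector as $\mathbf{x}(n) = [\mathbf{x}_1^T(n) \cdots \mathbf{x}_M^T(n)]^T$; then the multichannel system can be represented in matrix formulation as

$$\mathbf{x}(n) = \mathbf{H}\mathbf{s}(n) + \mathbf{b}(n), \tag{7}$$

where $\mathbf{H} = [\mathbf{H}_1^T \cdots \mathbf{H}_M^T]^T$ and $\mathbf{b}(n) = [\mathbf{b}_1^T(n) \cdots \mathbf{b}_M^T(n)]^T$. If we define the output-autocorrelation matrix as $\mathbf{R}_{xx} = E[\mathbf{x}(n)\mathbf{x}(n)^T]$, then we have

$$\mathbf{R}_{xx} = \mathbf{H}\mathbf{R}_{ss}\mathbf{H}^H + \mathbf{R}_{bb},$$

where $\mathbf{R}_{ss} = E[\mathbf{s}(n)\mathbf{s}(n)^T]$ and $\mathbf{R}_{bb} = E[\mathbf{b}(n)\mathbf{b}(n)^T] = \sigma_n^2 \mathbf{I}$. In the following, we present an objective function based on the subspace decomposition of $\mathbf{R}_{xx}$. To exploit the subspace properties, the following assumptions must be made [10]: (i) the parameter matrix $\mathbf{H}$ has full column rank, which implies $M(N+1) \ge (N+L+1)$ and that the subchannels do not share common zeros; (ii) the autocorrelation matrix $\mathbf{R}_{ss}$ has full rank. The basic idea of subspace decomposition is to decompose $\mathbf{R}_{xx}$ into a signal subspace and a noise subspace. Let $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_{M(N+1)}$ be the eigenvalues of $\mathbf{R}_{xx}$. Since $\mathbf{H}$ has full column rank $N+L+1$ and $\mathbf{R}_{ss}$ has full rank, the signal component of $\mathbf{R}_{xx}$, that is, $\mathbf{H}\mathbf{R}_{ss}\mathbf{H}^H$, has rank $N+L+1$. Therefore,

$$\lambda_i > \sigma_n^2, \quad i = 1, \ldots, N+L+1; \qquad \lambda_i = \sigma_n^2, \quad i = N+L+2, \ldots, M(N+1),$$

where $\sigma_n^2$ denotes the variance of the additive Gaussian noise.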
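The block structure of the transfer matrices can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the function names `filtering_matrix` and `stacked_matrix` and the two-sensor channel taps are illustrative choices, not taken from the paper. It builds each $(N+1) \times (N+L+1)$ subchannel matrix, stacks them, and compares the noise-free matrix model with a direct convolution.

```python
import numpy as np

def filtering_matrix(h_m, N):
    """Return the (N+1) x (N+L+1) transfer matrix of one subchannel."""
    h_m = np.asarray(h_m, dtype=float)
    L = len(h_m) - 1
    H_m = np.zeros((N + 1, N + L + 1))
    for i in range(N + 1):
        H_m[i, i:i + L + 1] = h_m  # row i: taps h_m(0..L) shifted by i columns
    return H_m

def stacked_matrix(subchannels, N):
    """Stack the M subchannel matrices into the M(N+1) x (N+L+1) matrix H."""
    return np.vstack([filtering_matrix(h_m, N) for h_m in subchannels])

# Illustrative (made-up) two-sensor channel of order L = 2, window N = 3.
subchannels = [[1.0, 0.5, -0.2], [0.3, -0.7, 0.4]]
N, L = 3, 2
H = stacked_matrix(subchannels, N)

rng = np.random.default_rng(0)
s = rng.standard_normal(40)
n = 20
s_vec = s[n - np.arange(N + L + 1)]              # [s(n), ..., s(n-N-L)]
x_direct = np.concatenate([
    np.convolve(h_m, s)[n - np.arange(N + 1)]    # [x_m(n), ..., x_m(n-N)]
    for h_m in subchannels
])
assert np.allclose(H @ s_vec, x_direct)          # noise-free check of x = H s
```

Each row of a subchannel block is a shifted copy of the taps, which is why the stacked matrix reproduces the convolution sample by sample.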
If we perform the subspace decomposition of $\mathbf{R}_{xx}$, we get

$$\mathbf{R}_{xx} = \mathbf{U}_s \mathbf{\Lambda}_s \mathbf{U}_s^H + \mathbf{U}_n \mathbf{\Lambda}_n \mathbf{U}_n^H,$$

where $\mathbf{\Lambda}_s = \mathrm{diag}\{\lambda_1, \ldots, \lambda_{N+L+1}\}$ contains the $N+L+1$ largest eigenvalues of $\mathbf{R}_{xx}$ in descending order and the columns of $\mathbf{U}_s$ are the corresponding orthogonal eigenvectors, while $\mathbf{\Lambda}_n = \mathrm{diag}\{\lambda_{N+L+2}, \ldots, \lambda_{M(N+1)}\}$ contains the remaining eigenvalues and the columns of $\mathbf{U}_n$ are the orthogonal eigenvectors corresponding to the eigenvalue $\sigma_n^2$. The spans of $\mathbf{U}_s$ and $\mathbf{U}_n$ denote the signal subspace and the noise subspace, respectively. The key observation is that the columns of $\mathbf{H}$ also span the signal subspace of $\mathbf{R}_{xx}$. The channel parameters can then be uniquely identified by the orthogonality between the signal subspace and the noise subspace [10], that is,

$$\mathbf{H}^H \mathbf{U}_n = \mathbf{0}. \tag{11}$$

Let $\mathbf{h} = [h_{1,0} \cdots h_{1,L} \cdots h_{M,0} \cdots h_{M,L}]^T$, with $h_{m,k} = h_m(k)$, contain all the channel parameters. From (11), we propose an objective function as follows:

$$J(\mathbf{h}) = \|\mathbf{H}^H \mathbf{U}_n\|. \tag{12}$$

In this objective function, the channel order is assumed to be known. However, in practice this is not true, and the channel order must be estimated beforehand. In this paper, we estimate the channel order based on (12). Since the subchannels may have different orders, order estimation refers to the largest one. Note that channel identifiability does not depend on whether the subchannels have the same order but on whether they have common zeros [10]. We now show that order estimation affects the number of global optima of (12). It is shown in [10] that $J(\mathbf{h})$ has only one nonzero optimum when the channel order is correctly estimated. In the following, we study the cases where the channel order is either under- or overestimated based on (12).
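The orthogonality between the channel matrix and the noise subspace can be illustrated with a small numerical experiment. This is a sketch under stated assumptions: the exact autocorrelation matrix is formed with $\mathbf{R}_{ss} = \mathbf{I}$ and white noise, the norm is taken as the Frobenius norm, and the channel taps and helper names (`stacked_matrix`, `noise_subspace`, `objective`) are hypothetical.

```python
import numpy as np

def stacked_matrix(h, M, N):
    """Build H from h = [h_{1,0}..h_{1,l}, ..., h_{M,0}..h_{M,l}]."""
    taps = np.asarray(h, dtype=float).reshape(M, -1)   # row m: subchannel m
    l = taps.shape[1] - 1
    H = np.zeros((M * (N + 1), N + l + 1))
    for m in range(M):
        for i in range(N + 1):
            H[m * (N + 1) + i, i:i + l + 1] = taps[m]
    return H

def noise_subspace(R_xx, signal_rank):
    """Eigenvectors of R_xx for its smallest eigenvalues (columns of U_n)."""
    _, eigvecs = np.linalg.eigh(R_xx)           # eigenvalues in ascending order
    return eigvecs[:, :R_xx.shape[0] - signal_rank]

def objective(h, M, N, U_n):
    """J(h) = ||H^H U_n||, evaluated here with the Frobenius norm."""
    H = stacked_matrix(h, M, N)
    return np.linalg.norm(H.conj().T @ U_n)

M, N, L = 2, 4, 2
h_true = [1.0, 0.5, -0.2, 0.3, -0.7, 0.4]       # made-up channel, no common zeros
H_true = stacked_matrix(h_true, M, N)
# Exact autocorrelation for R_ss = I and noise variance 0.01 (assumed values).
R_xx = H_true @ H_true.T + 0.01 * np.eye(M * (N + 1))
U_n = noise_subspace(R_xx, N + L + 1)

J_true = objective(h_true, M, N, U_n)                               # close to zero
J_other = objective([1.0, 0.5, -0.2, 0.3, -0.7, 0.5], M, N, U_n)    # clearly nonzero
print(J_true, J_other)
```

In practice $\mathbf{R}_{xx}$ is estimated from a finite number of samples, so the objective at the true channel is small rather than exactly zero.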
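If the channel order is overestimated, then $J(\mathbf{h})$ will have more than one nonzero optimum. For instance, let the estimated order be $L+1$; we define

$$\mathbf{h}_m^1 = [h_{m,0} \cdots h_{m,L} \;\; 0]^T, \qquad \mathbf{h}_m^2 = [0 \;\; h_{m,0} \cdots h_{m,L}]^T, \qquad m = 1, \ldots, M.$$

By constructing $\mathbf{H}_1$, $\mathbf{H}_2$ from $\mathbf{h}_m^1$, $\mathbf{h}_m^2$, one can verify that $\mathbf{H}_1$, $\mathbf{H}_2$ satisfy the following condition:

$$\mathbf{H}_1^H \mathbf{U}_n = \mathbf{0}, \qquad \mathbf{H}_2^H \mathbf{U}_n = \mathbf{0}.$$

This means that $J(\mathbf{h})$ will have two linearly independent nonzero optima:

$$\mathbf{h}^1 = [(\mathbf{h}_1^1)^T \cdots (\mathbf{h}_M^1)^T]^T, \qquad \mathbf{h}^2 = [(\mathbf{h}_1^2)^T \cdots (\mathbf{h}_M^2)^T]^T.$$

It is straightforward to show that if the channel order is underestimated, then $J(\mathbf{h})$ has no nonzero optimum. Otherwise, by the above derivation, $J(\mathbf{h})$ with the correctly estimated order would have more than one nonzero solution, which contradicts the conclusion in [10].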
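The effect of a wrong order on the objective can also be seen numerically. The sketch below is an illustration rather than part of the original derivation: it assumes NumPy, a made-up two-sensor channel of order 2, an exact autocorrelation matrix, and the Frobenius norm. It pads the true taps with a zero at the tail or at the head (order overestimated by one) and truncates them (order underestimated by one).

```python
import numpy as np

def stacked_matrix(taps, N):
    """Stack the Toeplitz blocks of all subchannels; taps has one row each."""
    taps = np.asarray(taps, dtype=float)
    M, width = taps.shape
    H = np.zeros((M * (N + 1), N + width))
    for m in range(M):
        for i in range(N + 1):
            H[m * (N + 1) + i, i:i + width] = taps[m]
    return H

N = 4
taps = np.array([[1.0, 0.5, -0.2], [0.3, -0.7, 0.4]])   # made-up, order L = 2
H = stacked_matrix(taps, N)
_, eigvecs = np.linalg.eigh(H @ H.T + 0.01 * np.eye(H.shape[0]))
U_n = eigvecs[:, :H.shape[0] - H.shape[1]]               # noise subspace

zeros = np.zeros((taps.shape[0], 1))
tail_padded = np.hstack([taps, zeros])    # each subchannel becomes [h_m, 0]
head_padded = np.hstack([zeros, taps])    # each subchannel becomes [0, h_m]

J_tail = np.linalg.norm(stacked_matrix(tail_padded, N).T @ U_n)
J_head = np.linalg.norm(stacked_matrix(head_padded, N).T @ U_n)
J_short = np.linalg.norm(stacked_matrix(taps[:, :2], N).T @ U_n)  # order L - 1
print(J_tail, J_head, J_short)
```

Both padded vectors drive the objective to (numerically) zero and are linearly independent, whereas the truncated channel does not; the latter only shows that this particular candidate is not an optimum, not that none exists.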
Therefore, we can conclude that the optima of $J(\mathbf{h})$ satisfy the following conditions:
(i) more than one nonzero optimum if the order is overestimated,
(ii) only one nonzero optimum if the order is correctly estimated,
(iii) no nonzero optimum if the order is underestimated.
Now let $l$ denote the estimated order. Assuming that the channel order is unknown, we propose to include $l$ in the objective function of (12) and propose a new objective function $J(l, \mathbf{h}) = \|\mathbf{H}^H \mathbf{U}_n\|$. In order to let $l$ converge on the correct order, the following conditions must be met: (1) the trivial solution, that is, $\mathbf{h} = \mathbf{0}$, must be avoided, and (2) $l$ must be more likely to converge to a small order.
Note that $\mathbf{h}$ has a free constant scale: if $\mathbf{h}$ is a solution of (11), then $\eta\mathbf{h}$, where $\eta$ is an arbitrary constant, is also a solution of (11). A common technique to avoid the trivial solution is to normalize $\mathbf{h}$ to $\|\mathbf{h}\| = 1$ [5,6,10]. In this paper, we extend this constraint to $\|\mathbf{h}\| \ge 1$ and concentrate on a special case, namely, we fix the first parameter of $\mathbf{h}$ to $h(1) = 1$. This constraint avoids the computation of normalization during iteration. Note that $l$ affects the objective value through the number of elements of $\mathbf{h}$ used to compute it. A smaller $l$ implies that fewer elements are used and, consequently, may result in a smaller objective value. Therefore, this constraint also helps $l$ converge to a smaller value.
To ensure condition (2), we suggest imposing a penalty on $J(l, \mathbf{h})$ when a larger estimate of the channel order is obtained. In practice, the objective value $J(l, \mathbf{h})$ converges to a small value rather than to exactly zero. Therefore, the penalty is applied by multiplication rather than by addition. The resulting objective function is referred to as (16); its penalty is scaled by a constant $K$, which must satisfy $K \ge 0$.
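A multiplicative order penalty can be sketched as follows. The factor $(1 + K l)$ used here is only an assumed, illustrative shape and should not be read as the penalty of (16); the function name `penalized_objective` is likewise hypothetical.

```python
def penalized_objective(J_value, order, K):
    """Multiply the subspace objective by an order-dependent penalty.

    The factor (1 + K * order) is an assumed, illustrative penalty shape.
    """
    if K < 0:
        raise ValueError("K must be non-negative")
    return (1.0 + K * order) * J_value

# Two candidates with the same small converged objective value but different
# orders: the larger order is penalized, so the smaller one is preferred.
J_order5 = penalized_objective(1e-3, 5, K=0.1)
J_order6 = penalized_objective(1e-3, 6, K=0.1)
print(J_order5 < J_order6)
```

With $K = 0$ the penalty disappears and the two candidates are indistinguishable, which is the situation the penalty is meant to avoid.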

GENETIC ALGORITHM
A GA is a "random" search algorithm that mimics the process of biological evolution. The algorithm begins with a collection of parameter estimates (called chromosomes), each of which is evaluated for its fitness in solving a given optimization task. In each generation, the fittest chromosomes are allowed to mate, mutate, and give birth to offspring. These children form the basis of the new generation. Since the child generation always contains the elite of the parent generation, a newborn generation tends to be closer to a solution of the optimization problem. After a number of generations, workable solutions can be achieved if some convergence criteria are satisfied. In fact, a GA is a very flexible tool and is usually adapted to the given optimization problem. The features of the proposed GA are described below.

Encoding
Each chromosome has two parts. One represents the channel order and is encoded in binary; the other represents the channel parameters and is encoded as real values. Let $(\mathbf{c}, \mathbf{h})^i_j$ ($j = 1, \ldots, Q$) denote the $j$th chromosome of the $i$th generation, where $Q$ is the population size. Each chromosome consists of an order chromosome $\mathbf{c}$ with $S$ binary genes followed by a parameter chromosome with $T$ real-valued genes, where the parameter chromosomes have the same structure as $\mathbf{h}$. Note that the length of the order chromosomes determines the length of the parameter chromosomes, and one should ensure that the length of the parameter chromosomes is greater than the largest possible channel order.

Initialization
Normally, the initial values of the chromosomes are randomly assigned. In the proposed GA, in order to prevent the algorithm from converging to the trivial solution, as shown in Section 2, the first parameter of $\mathbf{h}$ (i.e., the first gene of the parameter chromosomes) is fixed to $h_1 = 1$, while the other genes are randomly initialized.
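The encoding and initialization described above can be sketched in a few lines. Several details are assumptions made for illustration only: the binary genes are read most significant bit first, an order chromosome of length $S$ maps to the orders $1, \ldots, 2^S$ (with an optional offset for a remapped search space), the parameter genes are stored subchannel by subchannel, and the initial real genes are drawn uniformly from $(-1, 1)$. The function names are hypothetical.

```python
import random

def decode_order(order_bits, offset=1):
    """Map S binary genes to an order in [offset, offset + 2**S - 1]."""
    value = 0
    for bit in order_bits:          # most significant bit first (assumed)
        value = 2 * value + bit
    return offset + value

def active_parameters(params, order, M, max_order):
    """Return, per subchannel, the first order+1 genes (assumed layout)."""
    width = max_order + 1
    return [params[m * width: m * width + order + 1] for m in range(M)]

def random_chromosome(S, M, rng):
    """Create one chromosome: S binary order genes and M*(2**S + 1) real genes."""
    max_order = 2 ** S
    bits = [rng.randint(0, 1) for _ in range(S)]
    params = [rng.uniform(-1.0, 1.0) for _ in range(M * (max_order + 1))]
    params[0] = 1.0   # first parameter fixed to avoid the trivial solution
    return bits, params

rng = random.Random(0)
bits, params = random_chromosome(S=3, M=2, rng=rng)
order = decode_order(bits)
used = active_parameters(params, order, M=2, max_order=8)
print(order, [len(u) for u in used])
```

Only the genes returned by `active_parameters` would enter the objective value, which is how the order part controls the number of parameter genes in use.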

Fitness function
In the proposed GA, tournament selection is adopted, in which the objective values are obtained by computing the value in (16). Consequently, it is not necessary to map the objective value to a fitness value. Since the order chromosomes have a very simple (binary) coding and a smaller gene pool, they are expected to converge much faster than the parameter chromosomes. Thus, we propose to detect the convergence of the order chromosomes and the parameter chromosomes separately. However, it should be noted that the objective values of (16) cannot directly indicate the fitness of the order chromosomes. A fitness function for the order chromosomes is therefore required and is defined as follows: the fitness of an estimated order $l$ is measured as the number of chromosomes whose order is equal to $l$. The order fitness of $(\mathbf{c}, \mathbf{h})^i_j$ is denoted as $\mathrm{cum}^i_j(l)$. This fitness function is not used in tournament selection but only in the convergence criterion of the order chromosomes.

Parent selection
A good parent selection mechanism gives better parents a better chance to reproduce. In the proposed GA, we employ an "elitist" method [8] together with tournament selection [11]. First, a part of the present population, namely the $\rho \cdot Q$ best chromosomes, is selected directly. Then, the other $(1 - \rho) \cdot Q$ child chromosomes are generated via tournament selection within the whole parent population: in each cycle, two chromosomes are randomly selected from the parent population, and the one with the smaller objective value is selected.
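The combination of elitism and binary tournaments can be sketched as below. The function name `select_parents` is hypothetical, and drawing the two competitors without replacement is an assumption; the text only states that two chromosomes are selected at random.

```python
import random

def select_parents(population, objective_values, rho, rng):
    """Keep the rho*Q best chromosomes, fill the rest by binary tournaments."""
    Q = len(population)
    ranked = sorted(range(Q), key=lambda j: objective_values[j])
    n_elite = int(rho * Q)
    selected = [population[j] for j in ranked[:n_elite]]   # elitist part
    while len(selected) < Q:
        a, b = rng.sample(range(Q), 2)  # two distinct random competitors
        winner = a if objective_values[a] <= objective_values[b] else b
        selected.append(population[winner])                # smaller value wins
    return selected

population = ["a", "b", "c", "d"]          # placeholders for chromosomes
objective_values = [3.0, 1.0, 2.0, 4.0]    # smaller is better
selected = select_parents(population, objective_values, 0.5, random.Random(1))
print(selected)
```

Because the objective value is compared directly, no separate mapping from objective value to fitness value is needed.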

Crossover
Crossover combines the features of two parent chromosomes to form two child chromosomes. Generally, the parent chromosomes are mated randomly [12]. In the proposed GA, each chromosome contains two parts with different coding techniques. The order chromosome decides how many elements of the parameter chromosome are used to calculate the objective value. Therefore, these two parts cannot be decoupled, and conventional methods that perform crossover separately may not be efficient. Normally, the order chromosomes are short. For instance, an order chromosome of length 5 implies a search space from 1 to 32, which covers most practical cases of FIR channels. Therefore, the order chromosomes are expected to converge much faster than the parameter chromosomes. We propose not to perform crossover on the order chromosomes but to use mutation only. For the parameter chromosomes, crossover between chromosomes with different orders is more explorative (i.e., it searches more of the data space). However, it may also damage the building blocks in the parent chromosomes. On the other hand, crossover between chromosomes with the same order is more exploitative (i.e., it speeds up convergence). However, it may cause premature convergence. Since faster convergence is preferable in blind channel identification, we propose to mate chromosomes of the same order. For each estimated order, if the number of corresponding chromosomes is odd, a randomly selected chromosome is added to the mating pool. Assume that the chromosomes have been mated and consider one pair of them. Let $a_1, a_2 \in [1, T]$ be two random integers ($a_1 < a_2$), and let $\alpha_{a_1+1}, \ldots, \alpha_{a_2}$ be $a_2 - a_1$ random real numbers in $(0, 1)$. The parameter parts of the two child chromosomes are then formed from the genes of the parents at positions $a_1+1, \ldots, a_2$ and the weights $\alpha$, so that a two-point crossover is adopted.
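A possible realization of same-order mating and of the two-point crossover on the parameter part is sketched below. The blending rule (a convex combination of the parents' genes with weight $\alpha$) is an assumption suggested by the random weights in $(0, 1)$, not a formula taken from the paper; drawing the extra chromosome for an odd group from the same order group is also assumed, and both function names are hypothetical.

```python
import random

def mating_pairs(orders, rng):
    """Pair chromosome indices so that both mates encode the same order."""
    groups = {}
    for idx, order in enumerate(orders):
        groups.setdefault(order, []).append(idx)
    pairs = []
    for members in groups.values():
        pool = list(members)
        if len(pool) % 2 == 1:
            pool.append(rng.choice(members))   # assumed: drawn from the same group
        rng.shuffle(pool)
        pairs += [(pool[k], pool[k + 1]) for k in range(0, len(pool), 2)]
    return pairs

def two_point_blend_crossover(parent1, parent2, rng):
    """Blend the genes a1+1..a2 (1-based) of two parameter chromosomes."""
    T = len(parent1)
    a1, a2 = sorted(rng.sample(range(1, T + 1), 2))   # 1 <= a1 < a2 <= T
    child1, child2 = list(parent1), list(parent2)
    for t in range(a1, a2):          # 0-based indices of genes a1+1 .. a2
        alpha = rng.random()         # random weight in [0, 1)
        child1[t] = alpha * parent1[t] + (1 - alpha) * parent2[t]
        child2[t] = (1 - alpha) * parent1[t] + alpha * parent2[t]
    return child1, child2

rng = random.Random(0)
pairs = mating_pairs([5, 5, 4, 5, 4, 6], rng)
p1 = [1.0, 2.0, 3.0, 4.0, 5.0]
p2 = [1.0, -2.0, -3.0, -4.0, -5.0]
c1, c2 = two_point_blend_crossover(p1, p2, rng)
print(pairs, c1, c2)
```

Since $a_1 \ge 1$, the first gene is never modified, so the constraint that fixes the first parameter to 1 is preserved by this operator.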

Mutation
A mutation feature is introduced to prevent premature convergence. Originally, mutation was designed only for binary-represented chromosomes. For real-valued chromosomes, the following random mutation is now widely adopted [12]:

$$g' = g + \varphi(\mu, \sigma),$$

where $g$ is the real-valued gene, $\varphi$ is a random function which may be Gaussian or uniform, and $\mu$ and $\sigma$ are the related mean and variance. In this paper, we use conventional bit-flip mutation for the order genes, that is, we randomly alter the genes from 0 to 1 or from 1 to 0 with probability $p_m$. Normally, $p_m$ is a small number. However, in the proposed GA, the value of the order chromosome decides which parameter genes are used for calculating the objective function. A smaller order means fewer parameter genes and consequently a smaller objective value. Therefore, in the start-up period of the iteration, the order chromosomes are likely to converge on a small value, where the order is equal to 1. A large mutation rate is adopted to prevent such premature convergence. For the parameter part, a uniform PDF is employed. Let $a_3, a_4 \in [1, T]$ be two random integers ($a_3 < a_4$), and let $\beta_{a_3+1}, \ldots, \beta_{a_4}$ be $a_4 - a_3$ random real numbers in $(-1, 1)$. The genes at positions $a_3+1, \ldots, a_4$ of the parameter chromosomes of the child generation are then mutated using these random numbers and a predefined number $P$, which can be adjusted during iteration to speed up the convergence.
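The two mutation operators can be sketched as follows. The bit-flip rule for the order genes follows the description above; the way $P$ enters the parameter mutation (here the uniform perturbation $\beta$ is divided by $P$, so that a larger $P$ gives a finer search) is an assumption made only for this illustration, as are the function names.

```python
import random

def mutate_order(bits, p_m, rng):
    """Flip each binary order gene independently with probability p_m."""
    return [1 - b if rng.random() < p_m else b for b in bits]

def mutate_parameters(params, P, rng):
    """Perturb the genes a3+1..a4 (1-based) by uniform noise scaled by 1/P."""
    T = len(params)
    a3, a4 = sorted(rng.sample(range(1, T + 1), 2))   # 1 <= a3 < a4 <= T
    child = list(params)
    for t in range(a3, a4):               # 0-based indices of genes a3+1 .. a4
        beta = rng.uniform(-1.0, 1.0)
        child[t] = params[t] + beta / P   # assumed scaling by P
    return child

rng = random.Random(0)
params = [1.0, 0.2, -0.4, 0.8, 0.1]
child = mutate_parameters(params, 10.0, rng)
flipped = mutate_order([0, 1, 1], 0.5, rng)
print(child, flipped)
```

Setting `p_m` to zero freezes the order genes, which corresponds to the situation after the order chromosomes have converged.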

Convergence criterion
We propose different convergence criteria for the order chromosomes and the parameter chromosomes. The order chromosomes are considered to have converged if the gene pool is dominated by a certain order, that is,

$$\frac{\mathrm{cum}^i_j(l_D)}{Q} \ge \gamma, \tag{23}$$

where $l_D$ is the dominant order, $\mathrm{cum}^i_j(l_D)$ is the number of chromosomes with order $l_D$, and $\gamma$ is a predefined ratio. When the order chromosomes have converged, the mutation rate of the order chromosomes is set to zero ($p_m = 0$). The parameter chromosomes are considered to have converged if the change in the smallest objective value within $X$ generations is small; this is criterion (24), whose threshold $e$ is also a predefined ratio. Theoretically, the objective function in (16) has multiple minima that may have overestimated orders. In order to make the order chromosomes converge on the correct channel order, we impose a penalty on the chromosomes with greater order. Due to the "random" nature of a GA, although in most cases the order chromosomes converge on the real channel order (see the simulation result in Table 1), there is no guarantee that they will do so. Therefore, we propose to examine the converged result to ensure correct convergence. Let $(\mathbf{c}, \mathbf{h})^{s1}$ be the current converged result. The examination is carried out as follows (see the outer loop in Figure 2): reduce the order of $(\mathbf{c}, \mathbf{h})^{s1}$ by 1, fix the order, and run the proposed GA again (note that this time the order chromosomes are fixed, i.e., $p_m = 0$) to obtain a second converged result $(\mathbf{c}, \mathbf{h})^{s2}$. If $J(\mathbf{c}, \mathbf{h})^{s2}$ is close to $J(\mathbf{c}, \mathbf{h})^{s1}$, the order of $(\mathbf{c}, \mathbf{h})^{s1}$ is taken to be overestimated, and we reexamine $J(\mathbf{c}, \mathbf{h})^{s2}$ using the same strategy. Otherwise, if the drop from $J(\mathbf{c}, \mathbf{h})^{s1}$ to $J(\mathbf{c}, \mathbf{h})^{s2}$ is significantly large, the following inequality arises:

$$\frac{\left| J(\mathbf{c}, \mathbf{h})^{s1} - J(\mathbf{c}, \mathbf{h})^{s2} \right|}{J(\mathbf{c}, \mathbf{h})^{s1} + J(\mathbf{c}, \mathbf{h})^{s2}} \ge \frac{1}{\theta}. \tag{25}$$

The drop between $J(\mathbf{c}, \mathbf{h})^{s1}$ and $J(\mathbf{c}, \mathbf{h})^{s2}$ is then considered large enough to conclude that $(\mathbf{c}, \mathbf{h})^{s1}$ has converged on the real channel order. From the inequality in (25), one can draw two lines with slopes of $(\theta + 1)/(\theta - 1)$ and $(\theta - 1)/(\theta + 1)$ (see Figure 1). The shaded region in Figure 1 shows the data space given by (25).
The criterion set in (25) is, in fact, an enumeration search. However, the order estimation in the proposed GA does not rely solely on this enumeration search. In the proposed GA, we have employed certain strategies to give the order chromosomes a better chance of converging on the real channel order. The simulation result also shows that in most cases the order chromosomes converge on (or close to) the real channel order (see Table 2). The enumeration search is thus used to compensate for the drawback of the GA.

Figure 2: Flow diagram of the proposed GA.
The overall flow diagram of the proposed approach is illustrated in Figure 2. It can be seen that the proposed GA has an inner and an outer loop. The criteria in (23) and (24) in the inner loop guarantee that a global optimum is achieved. We have shown that this solution may have an overestimated order. The criterion in (25) in the outer loop is used to reexamine the solution reached and guarantee the correct estimate.
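The interplay between the order-dominance test of the inner loop and the order reexamination of the outer loop can be sketched as follows. This is an illustration under stated assumptions: the drop test is written as $\theta\,|J^{s1} - J^{s2}| \ge J^{s1} + J^{s2}$, which is the form consistent with the two boundary slopes $(\theta+1)/(\theta-1)$ and $(\theta-1)/(\theta+1)$; `run_fixed_order` stands in for a complete GA run with the order genes frozen and is replaced here by a lookup table of made-up objective values. All function names are hypothetical.

```python
from collections import Counter

def dominant_order(orders, gamma):
    """Return the most frequent order and whether its share reaches gamma."""
    l_D, count = Counter(orders).most_common(1)[0]
    return l_D, count / len(orders) >= gamma

def significant_drop(J_s1, J_s2, theta):
    """Test whether two converged objective values differ distinguishably."""
    return theta * abs(J_s1 - J_s2) >= J_s1 + J_s2

def refine_order(first_order, first_J, run_fixed_order, theta):
    """Outer loop: lower the order until the objective changes significantly."""
    order, J = first_order, first_J
    while order > 1:
        J_lower = run_fixed_order(order - 1)   # GA rerun with the order fixed
        if significant_drop(J, J_lower, theta):
            return order                       # current order is accepted
        order, J = order - 1, J_lower          # values too close: keep examining
    return order

# Made-up converged objective values per order (true order assumed to be 5).
converged_J = {6: 1.0e-3, 5: 1.1e-3, 4: 0.5, 3: 0.8}
l_D, converged = dominant_order([6, 6, 6, 5, 6], gamma=0.6)
final_order = refine_order(l_D, converged_J[l_D], converged_J.get, theta=2.0)
print(l_D, converged, final_order)
```

In this toy run the first inner loop settles on order 6, the values for orders 6 and 5 are too close to satisfy the test, and the large change between orders 5 and 4 stops the search at order 5.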
It is important to note that although the order part and the parameter part have a distinct representation, fitness function, and convergence criterion, we encode the two parts into a single chromosome rather than keeping two separate chromosomes. This is because the order part decides how many genes of the parameter chromosome should be used to calculate the objective value and, therefore, these two parts cannot be decoupled.

EXPERIMENTAL RESULT
Computer simulations are carried out to evaluate the performance of the proposed GA. We use the same multichannel FIR system as that in [9], where two sensors are adopted and the channel-impulse responses are given in (26). Table 1 shows the configuration of the proposed GA. A large population size is used in order to explore a larger data space. The search space of the channel order is from 1 to 8 ($S = 3$). In blind channel estimation, an FIR multichannel model is normally obtained by oversampling the output of a real channel. A multichannel model with two subchannels of order 8 represents a real channel of order 16, which covers most normal channels. Note that order chromosomes of length 3 can also map the search space from 9 to 16. So, in case no satisfactory solution is reached, one may remap the order search space (9-16) and rerun the algorithm. A large mutation rate ($p_c = 0.5$) is adopted to prevent premature convergence. To speed up the convergence of the parameter chromosomes, we adjust $P$ every 100 generations (see Table 2), where $\lfloor a \rfloor$ denotes the floor value of $a$.
White Gaussian noise is added to the output at an SNR of 25 dB, and 2,000 output samples are used to estimate the autocorrelation matrix $\mathbf{R}_{xx}$. Figure 3 shows a typical evolution curve. In each generation, the average objective value and the estimated order of the whole population are plotted. From Figure 3, one can see that the order chromosomes converge much faster than the parameter chromosomes. They converge on the true channel order in the first inner loop run (order = 5 in Figure 3). We store this converged result, reduce the order by 1, set $p_m = 0$, and then begin another GA execution. After the convergence (order = 4 in Figure 3), we evaluate these two converged results (order = 5 and order = 4 in Figure 3) by using the outer loop criterion in (25). Since there is an exponential drop between the two results, the condition in (25) is satisfied. Thus, our algorithm stops and concludes that order 5 is the final estimate.
The channel order is estimated by detecting the drop between two converged objective values, which may be similar to the traditional method where the eigenvalues of an overmodeled covariance matrix are calculated and the channel order is determined when there is a significant drop between two adjoining eigenvalues [4]. However, our algorithm is more efficient since the calculation of eigenvalue decomposition can be avoided and it can be seen that the drop is much more significant (an exponential drop). Figure 4 shows an evolution curve where the channel order is overestimated in the first inner loop run (order = 6 in Figure 4). In Figure 4, the objective values of the first two converged results are quite close, which does not satisfy the criterion set in (25). Further examination is thus required. As above, we can get the third converged result (order = 4 in Figure 4). By evaluating it with (25), we can draw the same conclusion as from Figure 3.
When compared with existing work, the convergence speed of the proposed GA is satisfactory: a quite reliable solution can be reached in about 1,000 generations, whereas the algorithm in [9] converges after 2,000 generations (note that in [9] the channel order is assumed to be known). In [8], an identification problem with similar complexity is simulated. The algorithm converges after hundreds of generations, but it is nonblind and, therefore, the objective function is quite simple. It is important to note that the convergence speed is affected by the complexity of the target problem. A more complicated multichannel system will result in a slower convergence speed. We simulate a multichannel system with four subchannels and find that the algorithm converges after 1,000 generations. The effect of problem complexity seems to be a common problem of GAs and needs further study.
Since the proposed GA needs to estimate the second-order statistics of the channel output (the autocorrelation matrix), it cannot be used directly in a rapidly varying channel. However, if a subspace tracking algorithm is employed (e.g., [13]), the noise subspace, that is, $\mathbf{U}_n$ in (16), can be updated when a new sample vector ($\mathbf{x}(n)$ in (7)) is received. The objective function can then be adapted according to the channel variation. In this case, the proposed GA may be applied to a rapidly varying channel. However, this requires further investigation and is beyond the scope of this paper. It is obvious that the computation is costly if the order reached in the first inner loop run is much greater than the real channel order. In the proposed GA, although there is no guarantee that the order chromosomes converge on the real channel order in the first inner loop run, we have proposed several strategies to make them converge close to it. To illustrate this point, 60 independent trials are carried out and the converged order in the first inner loop run is recorded. Table 2 shows the results. The first row gives the converged orders. The second row gives the number of trials in which the order chromosomes converge on each order. The third row shows the proportions. Table 2 illustrates that in most trials the order chromosomes converge on or close to the real channel order (orders 5 and 6 account for about 80% of the trials).
To evaluate the performance of the proposed GA, we compare it with a closed-form approach based on singular value decomposition (SVD) that assumes that the channel order is known [10]. The root mean square error (RMSE) of the estimated channel parameters is employed to measure the estimation performance. It is computed over $N_t$ Monte Carlo trials, where $N_t$ is set to 50 and $\mathbf{h}_t$ denotes the estimated channel parameters in the $t$th trial. The comparison results are given in Figure 5. It can be seen that the proposed GA achieves similar performance at lower signal-to-noise ratio (SNR). At high SNR, the performance of the GA is worse, because the converged result is not close enough to the real optimum. However, the performance of the GA can be improved by running more generations.

CONCLUSIONS
Based on the SIMO model and the subspace criterion, a new GA has been proposed for blind channel estimation. Computer simulations show that its performance is comparable with existing closed form approaches. Moreover, the proposed GA can provide a joint order and channel estimation, whereas most of the existing approaches must assume that the channel order is known or treat the problem of order estimation and parameter estimation separately.