A New Method for Estimating the Number of Harmonic Components in Noise with Application in High Resolution Radar

In order to operate properly, superresolution methods based on orthogonal subspace decomposition, such as multiple signal classification (MUSIC) or estimation of signal parameters via rotational invariance techniques (ESPRIT), need an accurate estimate of the signal subspace dimension, that is, of the number of harmonic components that are superimposed and corrupted by noise. This estimation is particularly difficult when the S/N ratio is low and the statistical properties of the noise are unknown. Moreover, in some applications such as radar imagery, it is very important to avoid underestimating the number of harmonic components, which are associated with the target scattering centers. In this paper, we propose an effective method for estimating the signal subspace dimension which is able to operate in colored noise, with performance superior to that of the classical information-theoretic criteria of Akaike and Rissanen. The capabilities of the new method are demonstrated through computer simulations, which show that, compared to three other methods, it achieves the best trade-off from four points of view: S/N ratio in white noise, frequency band of colored noise, dynamic range of the harmonic component amplitudes, and computing time.


1. INTRODUCTION
For many years there has been increasing interest in superresolution methods, such as multiple signal classification (MUSIC) [1,2] or estimation of signal parameters via rotational invariance techniques (ESPRIT) [3,4]. They have been conceived to overcome the limitations of Fourier-transform-based techniques, which are mainly related to the achievable resolution, especially when the number of available samples is small, and to the choice of the weighting windows, which controls the sidelobe level. Furthermore, there is always a tradeoff between the spatial (spectral, temporal, or angular) resolution and the dynamic resolution.
The most effective classes of superresolution methods divide the observation space into two orthogonal subspaces (the so-called signal subspace and noise subspace) and are based on the autocorrelation matrix eigenanalysis. In conjunction with signal subspace dimension estimation criteria, they are well known to provide performance close to the Cramér-Rao bound [5].
The Akaike information criterion (AIC) [6] is one of the most frequently used techniques for estimating the signal subspace dimension in the case of white Gaussian noise. The number of harmonic components is determined so as to achieve the best concordance between the model and the observation data. Analytically, this condition is expressed in the form

N̂ = arg min_k C(k),

where C(k) is a cost function related to the log-likelihood ratio of the model parameters for N = k. However, Rissanen demonstrated that the AIC yields an inconsistent estimate and proposed the minimum description length (MDL) criterion [7] to overcome this problem. Although the estimate given by the MDL criterion is consistent, the signal subspace dimension is underestimated, especially when the number of samples is small.
In our experiments, we have used both the AIC and the MDL criteria as adapted by Wax and Kailath [8]. If P is the number of independent realizations of length M, then the cost functions in the two cases have the expressions

AIC(k) = −2(M − k)P ln[g(k)/a(k)] + 2k(2M − k),
MDL(k) = −(M − k)P ln[g(k)/a(k)] + (1/2)k(2M − k) ln P,

where g(k) and a(k) denote, respectively, the geometric and arithmetic means of the M − k smallest eigenvalues {λ_i}_{i=k+1,...,M} of the autocorrelation matrix.
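As a concreteness check, the two criteria can be sketched as follows. This uses the standard geometric/arithmetic-mean form of the Wax-Kailath costs and a synthetic eigenvalue profile; it is an illustration, not a transcription of the paper's experiments.

```python
import numpy as np

def subspace_dim_aic_mdl(lam, P):
    # lam: eigenvalues of the autocorrelation matrix, P: number of realizations.
    lam = np.sort(np.asarray(lam, dtype=float))[::-1]
    M = lam.size
    aic = np.empty(M)
    mdl = np.empty(M)
    for k in range(M):
        tail = lam[k:]                        # the M - k smallest eigenvalues
        a = tail.mean()                       # arithmetic mean
        g = np.exp(np.mean(np.log(tail)))     # geometric mean
        aic[k] = 2 * (M - k) * P * np.log(a / g) + 2 * k * (2 * M - k)
        mdl[k] = (M - k) * P * np.log(a / g) + 0.5 * k * (2 * M - k) * np.log(P)
    return int(np.argmin(aic)), int(np.argmin(mdl))

# Synthetic profile: two dominant eigenvalues over a flat noise floor.
n_aic, n_mdl = subspace_dim_aic_mdl([10.0, 8.0] + [1.0] * 6, P=100)
```

With a clear eigenvalue gap both criteria select the same dimension; their behavior differs mainly through the penalty terms when the gap is weak.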
When the noise statistics are unknown, other methods have been proposed, such as the Gerschgörin disk technique [9], also known as the Gerschgörin disk estimator (GDE) criterion. It makes use of a set of disks, whose centers and radii are both calculated from the autocorrelation matrix Σ. Let A be the M × M matrix obtained by the unitary transformation

A = Q^H Σ Q,  Q = [q_1 ... q_M],

where {q_k} are the orthogonal Fourier vectors, normalized so that ||q_k||^2 = 1. The M normalized frequencies are uniformly spaced from 0 to 1 − 1/M. It can be shown that a_kk = q_k^H Σ q_k ≅ λ_k. The centers of the Gerschgörin disks are then given by C_k = a_kk, while their radii are

R_k = ∑_{j=1, j≠k}^{M} |a_kj|.

The cost function is expressed in the form

GDE(k) = R_k − (δ/(M − 1)) ∑_{j=1}^{M−1} R_j,  k = 1, ..., M − 1,

where the radii {R_k} are sorted in decreasing order. The choice of the coefficient δ is somewhat arbitrary. According to [10], where it is set to 1, its value should depend only on the autocorrelation matrix dimension M. However, we found that it also depends on the number of harmonic components to be estimated. Although this dependence is weak, it results in significant differences in detection performance when a random number of sinusoids are superimposed, compared with the case when the signal contains only two harmonic components, as shown in Section 4. The solution is considered to be the argument which yields the last positive value of the cost function defined above. Although the GDE method performs better than AIC and MDL in colored noise, it is less effective in white noise and significantly increases the computing time compared to these two criteria.
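A minimal sketch of the GDE procedure, assuming the standard form of the criterion (unitary Fourier transform, off-diagonal absolute row sums as radii, last positive cost value as the estimate). The matrix B in the example is a synthetic test case chosen so that the transformed matrix is known exactly; it is not data from the paper.

```python
import numpy as np

def gde(Sigma, delta=1.0):
    # Gerschgorin disk estimator: transform Sigma by the unitary DFT,
    # take the disk radii as off-diagonal absolute row sums, and return
    # the last k with a positive cost (radii sorted in decreasing order).
    M = Sigma.shape[0]
    idx = np.arange(M)
    Q = np.exp(-2j * np.pi * np.outer(idx, idx) / M) / np.sqrt(M)  # unitary DFT
    A = Q.conj().T @ Sigma @ Q
    r = np.abs(A).sum(axis=1) - np.abs(np.diag(A))   # Gerschgorin radii
    r = np.sort(r)[::-1]
    cost = r[:M - 1] - delta * r[:M - 1].sum() / (M - 1)
    return int(np.count_nonzero(cost > 0))           # last positive index

# Synthetic check: build Sigma so that Q^H Sigma Q equals a matrix B
# with exactly two disks of nonzero radius.
M = 6
idx = np.arange(M)
Q = np.exp(-2j * np.pi * np.outer(idx, idx) / M) / np.sqrt(M)
B = np.diag([10.0, 8.0, 1.0, 1.0, 1.0, 1.0]).astype(complex)
B[0, 1] = B[1, 0] = 2.0
Sigma = Q @ B @ Q.conj().T
est = gde(Sigma)
```

Because the sorted radii decrease while the threshold is constant, the positive cost values form a prefix, so counting them gives the index of the last positive value directly.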
The method we propose in this paper for the estimation of the signal subspace dimension performs the best tradeoff in terms of robustness to white noise, robustness to colored noise, dynamic range of the spectral components, and computing time.
The rest of the paper is organized as follows. The principle of the new criterion and the associated cost function are described in Section 2. Section 3 gives an analytical demonstration for a simplified, but representative, variation of the autocorrelation matrix eigenvalues. Section 4 provides convincing results which demonstrate the capabilities of the proposed method and validate it on the example of a radar range profile reconstruction using the MUSIC technique. The general conclusion is drawn in Section 5, together with some perspectives on our future research work.

2. NEW CRITERION DERIVATION
The variation of the autocorrelation matrix eigenvalues is directly related to the number of harmonic components (N) present in the analyzed signal. Indeed, there are exactly N nonzero eigenvalues in the noiseless case, while if additive white Gaussian noise (AWGN) is considered, the M − N smallest eigenvalues should all be equal to the noise variance [11].
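This eigenvalue structure is easy to verify numerically. The sketch below uses N = 2 complex exponentials (the paper's example uses two real sinusoids, hence N = 4; the mechanism is the same) and a sample autocorrelation matrix built from overlapping snapshots, which is an assumed estimator:

```python
import numpy as np

def autocorr_matrix(x, M):
    # Sample autocorrelation matrix from overlapping length-M snapshots.
    X = np.array([x[i:i + M] for i in range(len(x) - M + 1)])
    return X.conj().T @ X / X.shape[0]

# Noiseless signal: N = 2 complex exponentials.
n = np.arange(64)
x = np.exp(2j * np.pi * 0.11 * n) + np.exp(2j * np.pi * 0.34 * n)
lam = np.linalg.eigvalsh(autocorr_matrix(x, 8))[::-1]   # decreasing order
# exactly N = 2 eigenvalues are nonzero (up to numerical precision)
```

Adding white noise of variance σ² shifts every eigenvalue up by approximately σ², which is the flat noise floor the criteria exploit.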
An example is provided in Figure 1 for the superposition of two sinusoids (N = 4) corrupted by AWGN. For large S/N ratios the number of significant eigenvalues equals the number of harmonic components, the others taking values close to zero, as can be seen in Figure 1a. When the noise level increases, the N largest eigenvalues are still associated with the eigenvectors which span the signal subspace, but it is much more difficult to make a robust decision using only their simple variation. Moreover, the distribution of the eigenvalues associated with the noise subspace is not uniform, as predicted by theory, because of the small number of data samples considered, while the transition between the two classes of eigenvalues becomes less and less marked (Figure 1b).
Consequently, the distribution of the autocorrelation matrix eigenvalues cannot be considered a reliable criterion for estimating the number of harmonic components when the S/N ratio is low, no matter whether the noise is white or not. However, the AIC and MDL criteria demonstrate that even if simple thresholding is not able to provide this estimate, the eigenvalue variation can still be used, in a different form, for obtaining N.
The main idea behind the new method is that estimating N is equivalent to finding how many eigenvalues are associated with each of the two subspaces, signal and noise. This can be considered a classification problem with two classes, whose separation limit can be found using two discriminant functions to be defined. In the ideal case, for the example given above, these functions should have the shapes shown in Figure 2a. They have been normalized so that they can be considered equivalent probability density functions (pdf) associated with the two classes.
This approach, which makes use of discriminant functions instead of the probabilities, is considered an effective alternative to the Bayes decision approach in pattern classification theory. While suboptimality may still occur because of an improper choice of the discriminant functions, as in the case of an incorrect distribution assumption in the Bayes approach, the discriminant-function-based method usually offers implementational simplicity and may make it possible to circumvent the data consistency issue [12]. If g_1 and g_2 denote the two discriminant functions, then a new cost function, represented in Figure 2b, can be defined as their difference,

g_1(k) − g_2(k),  k = 1, ..., M − 1.

Just like in the case of the GDE criterion, the solution N is obtained as the argument which yields the last positive value of this cost function.
We will present in the following the proposed forms for the two discriminant functions g 1 and g 2 . They have been deduced in an empirical way, using some remarks on the behavior of the autocorrelation matrix eigenvalues (see Section 3).
The values {λ_k}_{k=1,...,M} can be considered as membership measures with respect to the signal subspace. Consequently, in order to approximate the first ideal shape shown in Figure 2a, the function g_1 is chosen as the variation of the last M − 1 eigenvalues, sorted in decreasing order and normalized so as to obtain an equivalent probability density function:

g_1(k) = λ_{k+1} / ∑_{j=2}^{M} λ_j,  k = 1, ..., M − 1.  (8)

The variation of the second discriminant function should capture in a suitable way the jump from the last eigenvalue associated with the signal subspace to the first eigenvalue associated with the noise subspace. As stated above, it is difficult to detect this jump directly in the case of noisy signals. However, it can be noticed that even for these signals there is a slope variation between the two classes of eigenvalues. The main idea for defining the second discriminant function is then to exploit this slope variation to distinguish between the two classes. Thus, the function g_2, corresponding to the noise subspace, is chosen to have an inverse variation with respect to the function g_1 and is defined as an equivalent probability density function too:

g_2(k) = ξ_k^{−1} / ∑_{j=1}^{M−1} ξ_j^{−1},  k = 1, ..., M − 1,  (9)

where

ξ_k = λ_k − (1/(M − k)) ∑_{j=k+1}^{M} λ_j.

Note that {ξ_k} mainly measures the relative slope variation of the eigenvalues {λ_k}_{k=1,...,M}. The difference between the current eigenvalue and the mean of the next ones has been preferred to the simple subtraction of the next eigenvalue in order to integrate the irregular eigenvalue variation. A smoother form of the second discriminant function can thus be obtained.
The shapes of the two discriminant functions calculated with (8) and (9), for the example given above, are represented in Figure 3a. The corresponding cost function is also represented in Figure 3b. Note that even if the real shapes of the discriminant functions approximate the ideal ones rather poorly, the cost function issued from their difference allows N to be estimated quite satisfactorily.
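A compact sketch of the resulting decision rule, under one plausible reading of the definitions in this section (g_1 as the normalized sequence of the last M − 1 eigenvalues, ξ_k as the gap between each eigenvalue and the mean of the following ones, and g_2 as the normalized inverse of ξ_k); the exact published expressions may differ:

```python
import numpy as np

def estimate_N(lam):
    # Discriminant-function criterion sketch: last k with g1(k) - g2(k) > 0.
    lam = np.sort(np.asarray(lam, dtype=float))[::-1]
    M = lam.size
    g1 = lam[1:] / lam[1:].sum()                      # pdf over k = 1..M-1
    xi = np.array([lam[k] - lam[k + 1:].mean() for k in range(M - 1)])
    inv = 1.0 / np.maximum(xi, 1e-12)                 # guard against xi <= 0
    g2 = inv / inv.sum()                              # inverse-variation pdf
    eta = g1 - g2                                     # cost function
    pos = np.nonzero(eta > 0)[0]
    return int(pos[-1]) + 1 if pos.size else 0

# Profile with two dominant eigenvalues and a slowly decaying noise floor.
N_hat = estimate_N([10, 8, 1.05, 1.0, 0.95, 0.9, 0.85, 0.8])
```

On this profile the cost function stays positive through the two signal eigenvalues and turns negative on the noise floor, illustrating the "last positive value" rule.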

3. PARTICULAR CASE OF A LINEAR PIECEWISE VARIATION OF THE AUTOCORRELATION MATRIX EIGENVALUES
The theoretical validity of this criterion will be demonstrated in the following for the simplified case of a piecewise-linear variation of the autocorrelation matrix eigenvalues, as illustrated in Figure 4. It can be expressed in the form

λ(k) = λ(1) − (k − 1)Δλ_1/(N − 1),  k = 1, ..., N,
λ(k) = λ(N + 1) − (k − N − 1)Δλ_2/(M − N − 1),  k = N + 1, ..., M.

There are some important elements concerning this figure to be discussed. When the noise is white, the smallest M − N eigenvalues should all be equal to the noise variance. In practice this is never true, because the noise is never completely white. The more colored the noise, the larger the dynamic range Δλ_2.
The N largest eigenvalues are related to the harmonic signal components. The value of Δλ_1 is mainly determined by the frequency gap between the closest components: the closer they are, the larger the dynamic range Δλ_1.
The eigenvalue variation is not necessarily linear, but the results obtained in this case can be generalized. This type of variation also has the advantage of being the simplest model able to capture both the elements related to the most difficult components to be resolved and the noise characteristics.
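For experimentation, such a piecewise-linear eigenvalue profile can be generated as follows; the `gap` parameter, which enforces λ(N) > λ(N + 1), is an assumption of this sketch rather than a quantity from the paper:

```python
import numpy as np

def linear_eigprofile(M, N, dlam1, dlam2, eps, gap=0.05):
    # Signal eigenvalues fall linearly by dlam1; noise eigenvalues fall
    # linearly by dlam2 down to eps > 0; `gap` (assumed) keeps lam(N)
    # strictly above lam(N + 1).
    noise = eps + np.linspace(dlam2, 0.0, M - N)            # lam(N+1)..lam(M)
    signal = noise[0] + gap + np.linspace(dlam1, 0.0, N)    # lam(1)..lam(N)
    lam = np.concatenate([signal, noise])
    return lam / lam.sum()                                  # normalized: sum = 1

lam = linear_eigprofile(M=16, N=4, dlam1=1.0, dlam2=0.1, eps=0.01)
```

Varying dlam1 and dlam2 reproduces the two effects discussed above: closely spaced components enlarge the signal-side dynamic range, and coloring the noise enlarges the noise-side one.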
The slopes corresponding to the eigenvalue variation in the two domains represented in Figure 4 can be readily calculated as

s_1 = Δλ_1/(N − 1),  s_2 = Δλ_2/(M − N − 1).

The eigenvalues are usually normalized so that ∑_{k=1}^{M} λ(k) = 1. Because even the smallest eigenvalue must be positive, the condition λ(M) = ε has to be met, with ε > 0 but ε ≪ 1. Obviously, it is also necessary to ensure that the smallest eigenvalue corresponding to the signal subspace is larger than the largest eigenvalue corresponding to the noise subspace, that is, λ(N) > λ(N + 1). With these constraints, the eigenvalue variation can be rewritten in terms of ε, Δλ_1, and Δλ_2.

In order to build the first discriminant function as a pdf, the sum S_1 = ∑_{j=2}^{M} λ(j) is calculated, so that the function g_1(k) can be expressed as g_1(k) = λ(k + 1)/S_1. The first step in calculating the second discriminant function g_2(k) consists in expressing the partial eigenvalue average

μ(k) = (1/(M − k)) ∑_{j=k+1}^{M} λ(j).

The expression of μ(k) for k from 1 to N has been obtained by taking into account that ∑_{j=k+1}^{M} λ(j) = S_1 − ∑_{j=2}^{k} λ(j). This result leads to the slope measure η(k) = λ(k) − μ(k).

Note that even for this simplest case of a linear model for the eigenvalue variation, it becomes too complicated to continue using the exact forms of the expressions deduced above. That is why approximate forms are considered hereinafter, from which a much simpler expression for η(k) is obtained. The maximum value of this function is reached for k = N.
η(k) can consequently be normalized and then transformed into an intermediate function h(k). The final form of the second discriminant function is obtained by simply transforming h(k) into a pdf, that is, by normalizing it to its sum over k = 1, ..., M − 1. Using the same approximations as indicated above, the first discriminant function takes a correspondingly simplified form.

The values of the two discriminant functions corresponding to the arguments N and N + 1 are then calculated in order to demonstrate that the solution of the problem is N. It follows directly from these relationships that g_1(N) > g_2(N), so that the cost function is still positive at k = N. On the other hand, if the limit value for Δλ_1 is considered, that is, Δλ_1 = 1, the inequality g_1(N + 1) < g_2(N + 1) is obtained. This means that in the worst case the solution of the problem is still N, provided that the noise power and whiteness are such that the condition above is fulfilled. This corresponds to S/N ratios lower than those within the validity domain of the Akaike and Rissanen criteria.

4. SIMULATION RESULTS
Three types of computer simulations have been conducted in order to demonstrate the capabilities of the new method.
A superposition of two sinusoids (N = 4), corrupted by additive white Gaussian noise, has been considered first. Since the number of samples is 16, two harmonic components cannot be resolved by Fourier analysis if their normalized frequencies are closer than 1/16 = 0.0625. For each S/N ratio between 0 and 20 dB, 10000 independent simulations have been performed to calculate the detection rate. The two normalized frequencies associated with the two sinusoids are chosen randomly for each iteration so that the distance between them lies between 1/32 and 1/16. The results are presented in Figure 5.
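One trial of this setup can be sketched as follows; the random phases and the placement of f1 are assumptions not fixed by the text:

```python
import numpy as np

M = 16   # number of available samples

def two_sine_trial(snr_db, rng):
    # Two real sinusoids whose normalized frequencies are separated by a
    # random gap between 1/32 and 1/16, in white Gaussian noise at snr_db.
    n = np.arange(M)
    f1 = rng.uniform(0.1, 0.3)                 # assumed placement of f1
    f2 = f1 + rng.uniform(1 / 32, 1 / 16)      # spacing per the described setup
    x = (np.sin(2 * np.pi * f1 * n + rng.uniform(0, 2 * np.pi))
         + np.sin(2 * np.pi * f2 * n + rng.uniform(0, 2 * np.pi)))
    sigma = np.sqrt(np.mean(x ** 2) / 10 ** (snr_db / 10))
    return x + sigma * rng.normal(size=M)

x = two_sine_trial(10.0, np.random.default_rng(0))
```

Repeating such trials and applying each criterion to the eigenvalues of the resulting autocorrelation matrix yields detection-rate curves like those in Figure 5.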
Note that the proposed criterion slightly outperforms the AIC and MDL criteria in terms of detection rate (Figure 5a). Moreover, as shown in Figure 5b, it is the only criterion which overestimates the number of harmonic components for low S/N ratios. This is particularly important in superresolution radar imagery applications, where underestimation must always be avoided because it leads to lost scattering centers in the reconstructed image of the radar target. It is also obvious that the new criterion is the most consistent, because its variance, expressed in dB in Figure 5c, decreases the fastest.
The variation of the detection rate corresponding to the four criteria for colored noise is presented in Figure 5d. The colored noise has been obtained by filtering the white noise using an AR filter of order 1, defined by its denominator coefficient a, which has been chosen as 0.7 for the example given here. Note that the new criterion again clearly outperforms both the AIC and MDL criteria, while at the same time being less robust than the GDE criterion. A more complete study has been performed on the behavior of the four criteria with respect to the dynamic range of the amplitudes of the two sinusoids (Figure 6a) and to the whiteness of the noise (Figure 6b). S/N ratios of 10 dB and 15 dB, respectively, have been considered in the two cases. As can be seen, the AIC and MDL criteria perform better when the dynamic range of the amplitudes is larger than 3 dB, but they are much less robust than the other two criteria in colored noise.
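The AR(1) coloring can be reproduced with a standard filtering step; the seed and sample count below are arbitrary:

```python
import numpy as np
from scipy.signal import lfilter

def ar1_noise(a, size, rng):
    # White Gaussian noise shaped by the AR(1) filter 1 / (1 - a z^-1);
    # a = 0.7 reproduces the coloring of the example in Figure 5d.
    return lfilter([1.0], [1.0, -a], rng.standard_normal(size))

v = ar1_noise(0.7, 5000, np.random.default_rng(2))
# theoretical output variance for unit-variance input: 1 / (1 - 0.7**2) ≈ 1.96
```

As a approaches 1 the noise spectrum concentrates at low frequencies, which is what degrades the flat-noise-floor assumption behind AIC and MDL.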
We have also evaluated the performance of the four compared criteria when the signal is corrupted by a second-order AR random process (Figure 7). The two poles of the white-noise-driven AR filter take values between 0 and 0.95, with an increment of 0.05. Just like in the case of the first-order AR random process, the detection rate obtained using the new approach begins to decrease when the two poles simultaneously approach the unit circle, so that the proposed method is clearly outperformed by the GDE criterion in that neighborhood. However, it performs better than the AIC and MDL criteria over a wide range of variation of the two poles.
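The second-order coloring is obtained analogously, building the AR(2) denominator from the two real pole positions (the specific poles and sample count below are arbitrary):

```python
import numpy as np
from scipy.signal import lfilter

def ar2_noise(p1, p2, size, rng):
    # AR(2) coloring 1 / ((1 - p1 z^-1)(1 - p2 z^-1)) with real poles.
    a = np.convolve([1.0, -p1], [1.0, -p2])   # [1, -(p1 + p2), p1 * p2]
    return lfilter([1.0], a, rng.standard_normal(size))

y = ar2_noise(0.5, 0.3, 4000, np.random.default_rng(3))
```

Sweeping both poles toward 0.95, as in the experiment of Figure 7, progressively sharpens the low-frequency concentration of the noise spectrum.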
A random number of harmonic components has been considered in the second phase of computer simulations. In this case, all the superimposed sinusoids have the same magnitude and are uniformly spaced in frequency, the normalized frequencies of two successive components being separated by 0.06. The results are given in Figure 8, for four values of the AR filter coefficient: 0, 0.75, 0.9, and 0.95.
The S/N ratio domain has been extended because the GDE criterion reaches the maximum value of the detection rate around 30 dB, compared to 20 dB in the case of two sinusoids. Hence, it is clear that the detection performance of this method depends on the number of harmonic components to be detected, as already stated in Section 1. It is also important to note that the new criterion again performs better than the AIC and MDL criteria for all S/N ratios, and even better than the GDE criterion for AR coefficients up to 0.9.
Finally, the third type of simulation has been devoted to a high-resolution radar application. The goal is to find the most accurate estimate of the range profile of a radar target using its complex signature in the frequency domain. An illustrative example is shown in Figure 9 for the case of five scattering centers. Their positions along the line of sight are recovered very precisely using the MUSIC technique, while their number is correctly estimated by the new criterion defined above. Note that even for low S/N ratios, the associated cost function gives an appropriate and unambiguous result.
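The range-profile step can be sketched with a standard MUSIC pseudospectrum; the two-scattering-center example below is synthetic, and the complex-exponential signal model is an assumption of this sketch:

```python
import numpy as np

def music_spectrum(R, N, freqs):
    # MUSIC pseudospectrum: project steering vectors onto the noise
    # subspace of R (signal dimension N, e.g. given by the proposed
    # criterion); the components appear as sharp peaks.
    M = R.shape[0]
    _, V = np.linalg.eigh(R)          # eigenvalues in ascending order
    En = V[:, :M - N]                 # noise-subspace eigenvectors
    n = np.arange(M)
    spec = []
    for f in freqs:
        a = np.exp(2j * np.pi * f * n)
        spec.append(1.0 / np.linalg.norm(En.conj().T @ a) ** 2)
    return np.array(spec)

# Synthetic example: two "scattering centers" at normalized positions 0.2, 0.4.
n = np.arange(8)
e = lambda f: np.exp(2j * np.pi * f * n)
R = (np.outer(e(0.2), e(0.2).conj())
     + np.outer(e(0.4), e(0.4).conj())
     + 0.01 * np.eye(8))
freqs = np.linspace(0.0, 0.5, 501)
spec = music_spectrum(R, 2, freqs)
```

An underestimated N would remove a signal eigenvector from the signal subspace and erase the corresponding peak, which is precisely the failure mode the proposed criterion is designed to avoid.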
The last comparison of the four criteria has been performed with respect to the computing time required to estimate the number of harmonic components. It has been measured over 10000 independent simulations and for different numbers of samples, from 16 to 256. The results given in Figure 10 have been obtained on a Pentium IV PC operating at 650 MHz.

5. CONCLUSION
A new method has been proposed in this paper for estimating the number of harmonic components in colored noise. Its principle is based on the idea of reformulating the estimation problem as a classification problem with two classes. An analytical demonstration has been provided for the special case of a piecewise-linear variation of the autocorrelation matrix eigenvalues. Although this model is very simple, it contains all the essential information related to the number of harmonic components, to the power and whiteness of the noise, and to the closest spectral components.
The new method has been compared to the AIC, MDL, and GDE techniques, and its capabilities have been evaluated in terms of the supported dynamic range of the harmonic component magnitudes, its behavior in white and colored noise, and the required computing time. We found that the new criterion realizes the best tradeoff in estimating the signal subspace dimension. Thus, it performs better than the AIC and MDL methods in white and especially in colored noise, and it behaves better than the GDE criterion in white noise and with respect to the amplitude dynamic range. It remains superior to the latter even in colored noise, over a wide range of the associated frequency band. It is also the fastest among the criteria mentioned above. Finally, it is the only method which overestimates the number of harmonic components for low S/N ratios and small numbers of samples.
This last property makes our method particularly useful in radar imagery applications, where it is preferable to overestimate the number of scattering centers rather than to underestimate it. Hence, as future work, we plan to use it in the context of our ongoing research concerning the robust reconstruction and classification of radar target images by superresolution methods [13,14].