Joint fundamental frequency and order estimation using optimal filtering

Recently, two optimal filter designs for fundamental frequency estimation have been proposed with the first being based on a filterbank and the second on a single filter. The two designs are related in a simple manner and are shown to result in the same residual when used for cancelling out the harmonics of periodic signals. We propose to use this residual for estimating the number of harmonics by combining a noise variance estimate with an order dependent penalty term. This leads to a joint estimator of the fundamental frequency and the order based on the same criterion. Via Monte Carlo simulations, the estimator is demonstrated to have good performance in terms of the percentage of correctly estimated orders.


INTRODUCTION
Periodic signals consist of a set of sinusoids whose frequen cies are integer multiples of a fundamental frequency. The task of finding this fundamental frequency from an observed signal is important in applications for many kinds of signals, but especially so for speech and audio signals. Fundamental frequency estimators form the basis of many signal process ing applications with some examples being separation, com pression, analysis, and enhancement. The different meth ods for fundamental frequency estimation are too numerous to mention in any detail here and we will refer to [l] for an overview of classical methods. Some examples of more recent methods based on estimation theoretical approaches are [2][3][4]. One particular method, though, is the main in spiration for this paper, namely the adaptive comb filtering method [5]. We will here pursue the idea of obtaining fun damental frequency estimates using filters further, but unlike the fixed filter design of [5], we will use the signal-adaptive and optimal filters proposed in [6] and [7]. More specifi cally, two optimal filter designs for finding the fundamental frequency from an observed signal were proposed. The first design is based on a filterbank and was first introduced in [6] where it was shown to have excellent performance compared to many other methods, while the second design is new and is based on a single filter [7]. Both are generalizations of Capon's classical filter design [8] that has been used exten sively for spectral estimation and beamforming in array pro cessing. In this paper, we extend these estimators to also ac count for an unknown number of harmonics, something that is critical in avoiding ambiguities in the cost function (see, e.g., [9] for more on this) by deriving a method for estimat ing the fundamental frequency and the number of harmonics and the model order jointly for a particular data segment. In doing so, a noise variance estimate is obtained, and we show that the resulting optimal filterbank and the single filter lead to identical noise variance estimates.
We will make use of the following signal model and notation: a signal consisting of a set of sinusoids having frequencies that are integer multiples of a fundamental fre quency, mo, is corrupted by an additive white complex cir cularly symmetric Gaussian noise, e (n), having variance a2, for n = 0, ... ,N -1, i.e., L x(n) = L azejWo l l1+e (n), where al = Aleill'l, with Al > ° and If/ I being the amplitude and the phase of the lth harmonic, respectively. The problem of interest is to estimate the fundamental frequency mo as well as the order L from a set of N measured samples x( n) .
The remaining part of the present paper is organized as follows: First, we introduce two ditlerent filter designs in Section 2, one based on a filterbank and one based on a sin gle filter. Then, in Section 3, we will derive a noise variance estimator for estimating the order based on these filter de signs. In Section 4, we then proceed to evaluate the proposed joint fundamental frequency and order estimator before con cluding on our work in Section 5.

OPTIMAL DESIGNS 2.1 Filterbank
We begin by introducing some useful notation and def mltlOns. First, we introduce a vector formed from M time-reversed samples of the observed signal, i.e., x(n) = [ x( n) x( n -1) ... x( n -M + 1) ] T with M :S N /2 and with ( . f denoting the transpose. Next, we define the output signal YI (n) of the lth filter having coefficients hi (n) as  itive approach is to seek to find a set of filters that pass power undistorted at specific frequencies, in our case the harmonic frequencies, while minimizing the power at all other frequen cies. This problem can be formulated mathematically as the optimization problem: where I is the L x L identity matrix. Furthermore, the matrix Z E C M x L has a Vandermonde structure and is constructed from L complex sinusoidal vectors as with z (w) = [ 1 e-jOJ ... e-jOJ( M -I ) jT, i.e., the matrix con tains the harmonically related complex sinusoids. We note that the complex conjugation is due to the covariance ma trix being defined from the time-reversed signal vector x(n) .
Using the method of Lagrange multipliers, the unconstrained optimization problem can be written as individual filter is constrained to have unit gain for a certain harmonic frequency and zero gain for the others. It is easy to see that this can be written using a more convenient form as (8) where the matrix A contains all the Lagrange multiplier (col umn) vectors AI associated with the various filters of the fil terbank, i.e., By differentiation, we obtain that the gradient of this com posite cost function is By setting these matrix equations equal to zero, one readily obtains that the Lagrange multipliers that solve the original problem are ( 11) and that the optimal filterbank expressed in terms of the La grange multipliers is (12) By substituting the solution for the Lagrange multipliers, the filter bank matrix H solving (5) can be seen to be given by This data and frequency dependent filter bank can then be used to estimate the fundamental frequencies by treating it as an unknown variable and maximizing the power of the filter's output, which is This expression depends only on the covariance matrix and the Vandermonde matrix constructed for different candidate fundamental frequencies.

Single Filter
There is an alternative formulation of the filter design prob lem that we will now examine further. Suppose that we in stead wish to design a single filter h that passes the signal undistorted at the harmonic frequencies and suppresses ev erything else. This filter design problem can be stated math ematically as It is worth noting that the single filter in (15) is designed sub ject to L constraints, whereas in (5) the filter bank is designed using a number of constraints for each filter. Clearly, these two formulations are related; we will return to this relation later on. First, we will derive the optimal filter. Introduc ing the Lagrange multiplier column vector A, the Lagrangian dual function associated with the problem stated above can be written as By combining the last two expressions, we get the optimal filter expressed in terms of the covariance matrix and the Vandermonde matrix Z, i.e., The output power of this filter can then be expressed as which, as for the first design, depends only on the inverse of R and the Vandermonde matrix Z. By maximizing the out put power, we readily obtain an estimate of the fundamental frequency as

Properties
The question arises as to exactly how the two approaches dif fer. Comparing the optimal filters in (13) and (20), it can be observed that the latter can be written in terms of the former as respectively. In Figure 1, an example of such filters are given with the magnitude response of the optimal filterbank and the single filter being shown for white Gaussian noise with 0J0 = 1.2566 and L = 3. It should be stressed that for a non diagonal R, the resulting filters will look radically different.
Interestingly, the two methods can be shown to be asymptotically equivalent as (ZHR-1 Z) -1 is in fact diagonal asymp totically in M when normalized appropriately in the sense that [7] where <1>( ill ) is the power spectral density of the observed signal x(n), which has been assumed to be non-zero and fi nite1. This means that the two cost functions are generally different for small N and M and may result in different fun damental frequency estimates, but asymptotically they tend to the same cost function.
For the optimal filtering methods, the choise of the filter length M requires some consideration. It can be seen that both methods require that R be invertible, which is always the case for the signal model considered here when R is de fined by the expectation operator. However, when the sample covariance matrix is used in its place, M must be chosen such that the sample covariance matrix has rank M, i.e., M ::; N /2. To obtain a good estimate of the covariance matrix and thus an accurate output power estimate, M should be chosen low so that many sub-vectors are used in the averaging. On the other hand, a high M results in more selective filters.

ORDER ESTIMATION
In order to determine the order, i.e., the number of harmon ics, we employ the maximum a posteriori (MAP) principle [10,11]. The derivations of the MAP criterion are somewhat lengthy and we will therefore only present the results here.   More specifically, the MAP criterion for determining L � 1 in (1) can be shown to be where the first term is a log-likelihood term that comprised a noise variance estimate that depends on the candidate model order L, the second is the penalty associated with the ampli tude and phase of (1) while the third term is due to the funda mental frequency. Note that the linear and nonlinear param eters have different penalties associated with them. To de termine whether any harmonics are present at all, the above cost function should be compared to the log-likelihood of the zero order model, meaning that no harmonics are present if Nlog 6"5 < Nlog 6"t + LlogN + � 10gN. (27) We will now proceed to use the filters presented in Section 2 to estimate the variance of the signal once the harmonics have been filtered out. First, we will do this based on the filterbank design. An estimate of the noise is defined as e( n) = x( n)y(n) which we will refer to as the residual. Additionally, y( n) is the sum of the input signal filtered by the filterbank, i.e.,

M -I
Furthermore, it can easily be verified that bf Z = 1 H, from which it can be concluded that Therefore, the variance estimate can be expressed as which conveniently features the same expression as in the fundamental frequency estimation criterion in (22). This means that the same expression can be used for determin ing the model order and the fundamental frequency, i.e., the approach allows for joint estimation of the model order and the fundamental frequency. It also shows that the same filter that maximizes the output power minimizes the variance of the residual. A more conventional variance estimate could be formed by first finding the frequency using, e.g., (22) and then finding the amplitudes of the signal model using least-squares to obtain a noise variance estimate. Since the proposed procedure uses the same information in finding the fundamental frequency and the noise variance, it is superior to the least-squares approach in terms of computational com plexity. Note that for finite filter lengths, the output of the filters considered here are generally "power levels " and not power spectral densities (see [12]), which is consistent with our use of the filters. Asymptotically, the filters do comprise power spectral density estimates [7].

RESULTS
We will now evaluate the statistical performance of the pro posed scheme. In doing so, we will compare to two other methods based on well-established estimation theoretical ap proaches that are able to jointly estimate the fundamental fre quency and the order, namely a subspace method, the MU SIC method of [9], and the nonlinear least-squares (NLS) method [6]. The NLS method in combination with the cri terion (26) yields both a maximum likelihood fundamental frequency estimate and a MAP order estimate (see [10] for details). The three methods are comparable in terms of com putational efficiency as they all have complexity (j(M 3 ). We will here focus on their application to order estimation, inves tigating the performance of the estimators given the funda mental frequency. The reason for this is simply that the high resolution estimation capabilities of the proposed method, MUSIC and NLS for the fundamental frequency estimation problem are already well-documented in [6,7,9]. Note that the NLS method reduces to a linear least-squares method when the fundamental frequency is given but the joint es timator is still nonlinear. In these experiments the following conditions were used: signals were generated using (I) with a fundamental frequency of 0J0 = 0.8170, L = 5 and Al = 1 \II. For each test condition, 1000 Monte Carlo iterations were run. In the first experiment, we will investigate the perfor mance as a function of the pseudo signal-to-noise (PSNR) as defined in [9]. Note that this PSNR is higher than the usual SNR, meaning that the conditions are more noisy than they may appear at first sight. The performance of the estimators has been evaluated for N = 200 observed samples with a co variance matrix size/filter length of M = 50. The results are shown in Figure 2 in terms of the percentage of correctly es timated orders. Similarly, the performance is investigated as a function of N with M = N / 4 in the second experiment for PSNR = 40 dB, i.e., the filter length is set proportionally to the number of samples. Note that the NLS method operates on the entire length N signal and thus does not depend on M. This experiment thus reveals not only the dependency of the performance on the number of observed samples but also on the filter length. The results are shown in Figure 3. In the final experiment, the N is kept fixed while the filter length M is varied with PSNR = 40 dB. In the process, the covari ance matrix of MUSIC is varied too. The results can be seen in figure Figure 4. From the figures, it can be observed that   the proposed method has good performance for high PSNRs and N with the percentage approaching 100 %. Furthermore, the filter length should not be chosen too low or too close to N /2. It can, however, also be observed that the method ap pears to be more sensitive than MUSIC and NLS to low num ber of samples (and thus filter lengths) and SNRs. It should be stressed, though, that while the method based on optimal filtering appears to generally exhibit slightly worse perfor mance than both MUSIC and NLS in terms of estimating the model order, it generally outperforms both MUSIC and the NLS with respect to fundamental frequency estimation un der adverse conditions, in particular when multiple periodic sources are present at the same time [6], something that hap pens frequently in audio signals. Also, it should be noted that for our intended application, which is audio processing, the segment lengths are generally high. More specifically, seg ments of 30 ms or longer sampled at 44.1 kHz correspond ing to 1323 samples are commonly used in audio estimation problems. That the proposed method appears to require long filters is therefore not necessarily a concern.

CONCLUSION
Two optimal filter designs that can be used for estimating the fundamental frequency of a periodic signal have been con sidered. Both are based on Capon's classical filter design and are related in a simple way. In this paper, we have ex tended the principles of these methods to also account for an unknown model order leading to a joint estimator for the fun damental frequency and the number of harmonics. In Monte Carlo simulations, the proposed scheme is demonstrated to have good performance estimating the correct order with a high probability for a high number of observations. The re sults are promising as methods based on optimal filtering have previously been shown to have excellent performance even under adverse conditions for a known model order. The results are particularly relevant for speech and audio signals where the model order may vary greatly over time.