- Research
- Open Access

# Joint DOA and multi-pitch estimation based on subspace techniques

Johan Xi Zhang^{1}, Mads Græsbøll Christensen^{2}, Søren Holdt Jensen^{1} and Marc Moonen^{3}

**2012**:1

https://doi.org/10.1186/1687-6180-2012-1

© Zhang et al; licensee Springer. 2012

**Received:** 26 March 2011

**Accepted:** 2 January 2012

**Published:** 2 January 2012

## Abstract

In this article, we present a novel method for high-resolution joint direction-of-arrival (DOA) and multi-pitch estimation based on subspaces decomposed from a spatio-temporal data model. The resulting estimator is termed multi-channel harmonic MUSIC (MC-HMUSIC). Unlike traditional methods, it is capable of resolving sources under adverse conditions, for example when multiple sources impinge on the array from approximately the same angle or have similar pitches. The effectiveness of the method is demonstrated on simulated anechoic array recordings with source signals from real recorded speech and clarinet. Furthermore, statistical evaluation with synthetic signals shows increased robustness in DOA and fundamental frequency estimation, as compared with a state-of-the-art reference method.

## 1. Introduction

The problem of estimating the fundamental frequency, or pitch, of a periodic waveform has been of interest to the signal processing community for many years. Fundamental frequency estimators are important for many practical applications such as automatic note transcription in music, audio and speech coding, classification of music, and speech analysis. Numerous algorithms have been proposed for both the single- and multi-pitch scenarios [1–5]. The single-pitch problem is considered well-posed. However, in real-world signals, the multi-pitch scenario occurs quite frequently [2, 6]. Multi-pitch estimation algorithms are often based on, e.g., various modifications of the auto-correlation function [1, 7], maximum likelihood, optimal filtering, and subspace techniques [2, 3, 8]. In real-life recordings, problems such as frequency overlap of sources, reverberation, and colored noise strongly limit the performance of multi-pitch estimators, and estimators designed for single-channel recordings often use simplified signal models. One widely used simplification in multi-pitch estimators, for example, is sparseness of the signal, where the frequency spectra of the sources are assumed not to overlap [2]. This assumption may be appropriate when the sources consist of a mixture of several speech signals having different pitches [9]. However, for audio signals it is less likely to be true. This is especially so in Western music, where instruments are most often played in accord, something that causes the harmonics to overlap or even coincide. With only a single-channel recording it is, therefore, hard, or perhaps even impossible, to estimate pitches with overlapping harmonics, unless additional information, such as a temporal or spectral model, is included.

Recently, multi-channel approaches have attracted considerable attention in both single- and multi-pitch scenarios. By exploiting the spatial information of the sources, more robust pitch estimators have been proposed [10–14]. Most of those multi-channel methods are, however, still mainly based on auto-correlation function-related approaches, although a few exceptions can be found in [15–18]. In direction-of-arrival (DOA) estimators, audio and speech signals are often modeled as broadband signals, and standard subspace methods such as MUSIC and ESPRIT are only defined for narrow-band signal models, so they cannot operate directly on broadband signals [19]. One often-used concept is band-pass filtering of broadband signals into subbands, where narrow-band estimators can be applied to each subband [20]. In the narrow-band case, a delay in the signal is equivalent to a phase shift that depends on the frequency of the complex exponential. An alternative is as follows: since harmonic signals consist of sinusoidal components, we can model each source as multiple narrow-band signals with distinct frequencies arriving from the same DOA.

In this article, we propose a parametric method for solving the problem of joint fundamental frequency and DOA estimation based on subspace techniques, where the quantities of interest are jointly estimated using a MUSIC-like approach. We term the proposed estimator multi-channel multi-pitch harmonic MUSIC (MC-HMUSIC). The spatio-temporal data model used in MC-HMUSIC is based on the JAFE data model [21, 22]. Originally, the JAFE data model was used for jointly estimating the unconstrained frequencies and DOAs of complex exponentials using ESPRIT, which is referred to as the joint angle-frequency estimation (JAFE) algorithm. Other related work on joint frequency-DOA methods includes [23–25]. In this article, we parametrize the harmonic structure of periodic signals in the signal model to model the fundamental frequency and the DOA of individual sources. An estimator is constructed for jointly estimating the parameters of interest. Incorporating the DOA parameter in finding the fundamental frequency may give better robustness against signals with overlapping harmonics. Similarly, it can be expected that the DOA can be found more accurately when the nature of the signal of interest is taken into account.

The remainder of this article comprises four sections: Section 2, in which we introduce some notation and the spatio-temporal signal model, for which we also derive the associated Cramér-Rao lower bound (CRLB), along with the JAFE data model; Section 3, where we present the proposed method; Section 4, in which we present the experimental results obtained using the proposed method; and, finally, Section 5, where we conclude our work.

## 2. Fundamentals

### 2.1. Spatio-temporal signal model

The signal *x*_{i} received by microphone element *i* of a uniform linear array (ULA), *i* = 1,..., *M*, is given by

$$x_i(n) = \sum_{k=1}^{K} \sum_{l=1}^{L_k} A_{l,k}\, e^{j\left(\omega_k l n + \varphi_k l (i-1) + \gamma_{l,k}\right)} + e_i(n)$$

for sample index *n* = 0,..., *N* - 1, where subscript *k* denotes the *k*th source and *l* the *l*th harmonic. Moreover, *A*_{l,k} is the real-valued positive amplitude of the complex exponential, *L*_{k} is the number of harmonics, *K* is the number of sources, *ω*_{k} is the fundamental frequency of the *k*th source, *γ*_{l,k} is the phase of the individual harmonics, *ϕ*_{k} is the phase shift caused by the DOA, and *e*_{i}(*n*) is complex symmetric white Gaussian noise. The phase shift between array elements is given as $\varphi_k = \omega_k f_s \frac{d}{c}\sin(\theta_k)$, where *d* is the spacing between the elements in meters, *c* is the speed of propagation in m/s, *θ*_{k} is the DOA defined for *θ*_{k} ∈ [-90°, 90°], and *f*_{s} is the signal sampling frequency. The problem of interest is to estimate *ω*_{k} and *θ*_{k}. In the following, we assume that the number of sources *K* is known and that the number of harmonics *L*_{k} of the individual sources is known or found in some other, possibly joint, way. We note that a number of ways of doing this have been proposed in the past [2, 26–28].
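The signal model of this section can be simulated directly. The following minimal sketch assumes that harmonic *l* of source *k* contributes a complex exponential with temporal phase *ω*_{k}*ln* and inter-element phase *ϕ*_{k}*l*(*i* - 1), and that *ω*_{k} is given in radians per sample. The function name `simulate_ula`, its arguments, and the default speed of propagation are illustrative choices and not part of the original work.

```python
import numpy as np

def simulate_ula(omegas, thetas_deg, amps, phases, M, N, fs, d, c=340.0,
                 noise_std=0.0, seed=0):
    """Return an M x N complex array x[i, n] of harmonic sources on a ULA.

    omegas     : fundamental frequencies in rad/sample, one per source
    thetas_deg : DOAs in degrees, one per source
    amps       : per-source lists of harmonic amplitudes A_{l,k}
    phases     : per-source lists of harmonic phases gamma_{l,k}
    """
    rng = np.random.default_rng(seed)
    i = np.arange(M)[:, None]   # element index, 0-based (i - 1 in the text)
    n = np.arange(N)[None, :]   # sample index
    x = np.zeros((M, N), dtype=complex)
    for w, th, A_k, g_k in zip(omegas, thetas_deg, amps, phases):
        # Phase shift between adjacent elements for the fundamental.
        phi = w * fs * d / c * np.sin(np.deg2rad(th))
        for l, (A, g) in enumerate(zip(A_k, g_k), start=1):
            x += A * np.exp(1j * (w * l * n + phi * l * i + g))
    # Complex white Gaussian noise with total variance noise_std**2.
    noise = rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))
    return x + noise_std / np.sqrt(2.0) * noise
```

For instance, `simulate_ula([2 * np.pi * 250 / 8000], [30.0], [[1.0, 0.5]], [[0.0, 0.0]], M=8, N=128, fs=8000.0, d=0.0425)` returns an 8 × 128 matrix for one source with two harmonics.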

### 2.2. Cramér-Rao lower bound

Let **s**(*n*, μ) denote the *M* × 1 deterministic signal model vector with elements **s**(*n*, μ) = [*s*_{1}(*n*, μ) ... *s*_{M}(*n*, μ)]^{T}. Furthermore, the parameter vector μ is given by

with **e**(*n*) being the noise column vector. The variance of an unbiased estimate $\hat{\mu}_p$ of the *p*th element of μ is lower bounded by the CRLB as

$$\operatorname{var}\left(\hat{\mu}_p\right) \ge \left[\mathbf{C}^{-1}\right]_{pp},$$

where **C** is the so-called Fisher information matrix given by
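For a deterministic signal observed in complex white Gaussian noise of variance σ², a commonly used expression for the elements of the Fisher information matrix is $[\mathbf{C}]_{pq} = \frac{2}{\sigma^2}\operatorname{Re}\left\{\sum_{n=0}^{N-1} \frac{\partial \mathbf{s}^H(n,\mu)}{\partial \mu_p}\frac{\partial \mathbf{s}(n,\mu)}{\partial \mu_q}\right\}$. The sketch below assumes this form and evaluates the bound numerically with central differences. The helper names `crlb` and `single_sinusoid` are hypothetical, and the single complex sinusoid is only a toy model used to check the routine, not the multi-channel model of the article.

```python
import numpy as np

def crlb(model, mu, n_samples, sigma2, eps=1e-6):
    """Diagonal of the inverse Fisher information matrix for x(n) = s(n, mu) + e(n).

    model(n, mu) must return the noise-free signal for the sample indices n.
    """
    mu = np.asarray(mu, dtype=float)
    n = np.arange(n_samples)
    cols = []
    for p in range(mu.size):
        step = np.zeros_like(mu)
        step[p] = eps
        # Central-difference derivative of the signal with respect to mu_p.
        diff = model(n, mu + step) - model(n, mu - step)
        cols.append(np.ravel(diff) / (2.0 * eps))
    D = np.stack(cols, axis=1)
    fisher = 2.0 / sigma2 * np.real(D.conj().T @ D)
    return np.diag(np.linalg.inv(fisher))

def single_sinusoid(n, mu):
    """Toy model: unit-amplitude complex sinusoid with frequency and phase."""
    omega, gamma = mu
    return np.exp(1j * (omega * n + gamma))
```

For the toy model, the frequency bound returned by `crlb(single_sinusoid, [0.3, 0.5], N, sigma2)` can be compared with the known closed form 6σ²/(*N*(*N*² - 1)) for a unit-amplitude complex sinusoid.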

### 2.3. The JAFE data model

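At time instant *n*, the received signals from the *M* array elements are **x**(*n*) = [*x*_{1}(*n*) *x*_{2}(*n*) ... *x*_{M}(*n*)]^{T}, which can be written as

$$\mathbf{x}(n) = \mathbf{A}\mathbf{\Phi}^{n}\mathbf{b} + \mathbf{e}(n),$$

where $\mathbf{b} = \left[A_{1,1}e^{j\gamma_{1,1}} \;\; \cdots \;\; A_{L_K,K}e^{j\gamma_{L_K,K}}\right]^{T}$ contains the complex amplitudes, **e**(*n*) ∈ ℂ^{M×1} is the noise vector, and **A** = [**A**_{1} ... **A**_{K}] is a Vandermonde matrix containing parameters *ω*_{k} and *θ*_{k} for sources *k* = 1, . . . , *K*, i.e.,

$$\mathbf{A}_k = \left[\mathbf{a}(\theta_k, \omega_k) \;\; \mathbf{a}(\theta_k, 2\omega_k) \;\; \cdots \;\; \mathbf{a}(\theta_k, L_k\omega_k)\right],$$

with **a**(*θ*, *ω*) being the array steering vector given by

$$\mathbf{a}(\theta, \omega) = \left[1 \;\; e^{j\omega f_s \frac{d}{c}\sin(\theta)} \;\; \cdots \;\; e^{j\omega f_s \frac{d}{c}\sin(\theta)(M-1)}\right]^{T}, \tag{11}$$

where (·)^{T} denotes the vector transpose. Unlike the steering vector defined in [21, 22], where only the DOA is parametrized, here, a general definition of the vector (11) is used, in which it depends on both *θ* and *ω* [29]. The frequency components are expressed in $\mathbf{\Phi}^{n} = \operatorname{diag}\left(\left[\mathbf{\Phi}_{1}^{n} \;\; \cdots \;\; \mathbf{\Phi}_{K}^{n}\right]\right)$, where the matrix for each source is given by

$$\mathbf{\Phi}_k^{n} = \operatorname{diag}\left(\left[e^{j\omega_k n} \;\; e^{j2\omega_k n} \;\; \cdots \;\; e^{jL_k\omega_k n}\right]\right).$$

The *N* time-domain data samples of the array output **x**(*n*) are collected to form the *M* × *N* data matrix **X**, which is defined as

$$\mathbf{X} = \left[\mathbf{x}(0) \;\; \mathbf{x}(1) \;\; \cdots \;\; \mathbf{x}(N-1)\right] = \mathbf{A}\left[\mathbf{b} \;\; \mathbf{\Phi}\mathbf{b} \;\; \cdots \;\; \mathbf{\Phi}^{N-1}\mathbf{b}\right] + \mathbf{E},$$

where **E** ∈ ℂ^{M×N} is a matrix containing *N* samples of the noise vector **e**(*n*).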

In speech and audio signal processing, it is common to model each source as a set of multiple harmonics with model order *L*_{k} > 1. Due to the narrow-band approximation of the steering vector, multiple complex components with distinct frequencies impinging on the array from an identical DOA result in non-unique spatial frequencies, which gives a harmonic structure in the spatial frequencies *ϕ*_{k}*l* ∀*l* as well. Multiple sources impinging on the array from different DOAs and consisting of various frequency components may, for certain frequency combinations, give the same array steering vector, which causes the matrix **A** to be rank deficient. Normally, this ambiguous mapping of the steering vector is mitigated by band-pass filtering the signal into subbands, where the DOA of the signal is uniquely modeled by the narrow-band steering vector [20, Chap. 9].
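The rank deficiency described above can be illustrated numerically. The sketch assumes a steering vector whose *i*th element is exp(*jω f*_{s}(*d*/*c*) sin(*θ*)(*i* - 1)), with *ω* in radians per sample, so that two components with equal *ω* sin(*θ*) are indistinguishable in space. The frequencies, angles, and *c* = 340 m/s below are made-up illustration values.

```python
import numpy as np

def steering(theta_deg, omega, M, fs, d, c):
    """Assumed narrow-band steering vector of an M-element ULA (omega in rad/sample)."""
    phi = omega * fs * d / c * np.sin(np.deg2rad(theta_deg))
    return np.exp(1j * phi * np.arange(M))

fs, d, c, M = 8000.0, 0.0425, 340.0, 8

# Component 1: 200 Hz arriving from 30 degrees.
omega_1, theta_1 = 2 * np.pi * 200.0 / fs, 30.0
# Component 2: 150 Hz, with the DOA chosen so that omega * sin(theta) matches.
omega_2 = 2 * np.pi * 150.0 / fs
theta_2 = np.rad2deg(np.arcsin(omega_1 * np.sin(np.deg2rad(theta_1)) / omega_2))

a_1 = steering(theta_1, omega_1, M, fs, d, c)
a_2 = steering(theta_2, omega_2, M, fs, d, c)
A = np.stack([a_1, a_2], axis=1)

same_vector = np.allclose(a_1, a_2)    # the two steering vectors coincide
rank_of_A = np.linalg.matrix_rank(A)   # A has two columns but only rank one
```

Although the two components differ in both frequency and DOA, the spatial information alone cannot separate them, which is why the temporal dimension has to be brought in.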

Temporal smoothing can instead be used to restore the rank of **A**. The temporally smoothed data matrix is obtained by stacking *t* temporally shifted versions of the original data matrix [21, 22, 29], given as

where **X**_{t} ∈ ℂ^{tM×(N-t+1)} is the temporally smoothed data matrix, and **E**_{t} is the noise term constructed from **E** in a similar way as **X**_{t}. Using the signal model, where the amplitudes are assumed stationary for *n* = 0, . . . , *N* - 1, **X**_{t} can be factorized as

$$\mathbf{X}_t = \bar{\mathbf{A}}_t \mathbf{B}_t + \mathbf{E}_t,$$

where **Ā**_{t} = [**A** **AΦ** ... **AΦ**^{t-1}]^{T} and **B**_{t} = [**b** **Φb** ... **Φ**^{N-t} **b**]. The temporally smoothed data matrix **X**_{t} can resolve at most $tM \ge \sum_{k=1}^{K} L_k$ complex exponentials, where **Ā**_{t} is linearly independent for any distinct *θ* and *ω* [30].
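Temporal smoothing as described above amounts to stacking shifted copies of the data matrix. The following sketch assumes that the *τ*th block (*τ* = 0, ..., *t* - 1) contains columns *τ* to *τ* + *N* - *t* of **X**, which gives the stated size *tM* × (*N* - *t* + 1). The function name `temporal_smooth` and the toy single-sensor signal are illustrative only.

```python
import numpy as np

def temporal_smooth(X, t):
    """Stack t time-shifted copies of the M x N matrix X.

    Block tau (tau = 0..t-1) holds columns tau..tau+N-t of X, so the
    result has shape (t*M, N-t+1).
    """
    M, N = X.shape
    width = N - t + 1
    return np.vstack([X[:, tau:tau + width] for tau in range(t)])

# Single sensor (M = 1) observing two complex exponentials.
n = np.arange(10)
X = (np.exp(1j * 0.3 * n) + np.exp(1j * 0.7 * n))[None, :]
X_t = temporal_smooth(X, t=3)
```

In this toy case the 1 × 10 matrix `X` has rank one, whereas the 3 × 8 matrix `X_t` has rank two, one per complex exponential, which illustrates how smoothing restores the rank needed by subspace methods.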

For spatial smoothing, the array of *M* sensors is subdivided into *S* subarrays. In this article, the subarrays are spatially shifted by one element each, the number of elements in each subarray being *M*_{S} = *M* - *S* + 1. For *s* = 1, . . . , *S*, let $\mathbf{J}_s \in \mathbb{C}^{tM_S \times tM}$ be the selection matrix corresponding to the *s*th subarray for the data matrix **X**_{t}. Then, the spatio-temporally smoothed data matrix $\mathbf{X}_{t,S} \in \mathbb{C}^{tM_S \times S(N-t+1)}$ is given by

$$\mathbf{X}_{t,S} = \left[\mathbf{J}_1\mathbf{X}_t \;\; \mathbf{J}_2\mathbf{X}_t \;\; \cdots \;\; \mathbf{J}_S\mathbf{X}_t\right].$$

The matrix **X**_{t,s} can be factorized as

where **E**_{t,s} is the noise term constructed from **E** in a similar way as **X**_{t,s}. Using the shift invariance structure in **A**_{m}, the term **J**_{s}**A**_{m} for *s* = 1, . . . , *S* is given by

Hence, **X**_{t,s} can be written in a compact form as

where **I**_{t} ∈ ℝ^{t×t} and $\mathbf{I}_{M_S} \in \mathbb{R}^{M_S \times M_S}$ are identity matrices, and ⊗ is the Kronecker product as defined in [22].

It is interesting to note that the noise term **E**_{t,s} is no longer white due to the spatio-temporal smoothing procedure, as correlation between the different rows of (23) is introduced. A pre-whitening step can be implemented in (23) to mitigate this. We note, however, that according to results reported in [22], a pre-whitening step is only of interest for signals with low SNR, where a minor estimation improvement can be achieved. In this article, the main interest is to propose a multi-channel joint DOA and multi-pitch estimator, for which reason the whitening process is left without further description, and we refer the interested reader to [22]. We also note that, aside from spatial smoothing, forward-backward averaging could also be implemented to reduce the influence of correlated sources [19, 22, 31].
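The spatial smoothing step of this section can be sketched with explicit selection matrices. The code assumes that **J**_{s} has the Kronecker form **I**_{t} ⊗ [**0** **I**_{M_S} **0**], i.e., that it picks elements *s*, ..., *s* + *M*_{S} - 1 from each of the *t* temporal blocks of **X**_{t}, and that the selected blocks are concatenated horizontally. The function names are hypothetical, and no pre-whitening is applied.

```python
import numpy as np

def selection_matrix(s, S, M, t):
    """J_s = I_t kron [0  I_{M_S}  0], with s counted from 1.

    Picks sensors s..s+M_S-1 out of every temporal block of M rows.
    """
    M_S = M - S + 1
    pick = np.zeros((M_S, M))
    pick[np.arange(M_S), np.arange(M_S) + (s - 1)] = 1.0
    return np.kron(np.eye(t), pick)

def spatio_temporal_smooth(X_t, S, M, t):
    """Concatenate J_s X_t for s = 1..S into a (t*M_S) x (S*(N-t+1)) matrix."""
    return np.hstack([selection_matrix(s, S, M, t) @ X_t
                      for s in range(1, S + 1)])
```

Building **J**_{s} explicitly keeps the correspondence with the notation clear. For large arrays, slicing the rows of **X**_{t} directly would avoid the matrix product.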

## 3. The proposed method

### 3.1. Coarse estimates

The singular value decomposition of the spatio-temporally smoothed data matrix is computed, where the columns of **U** are the singular vectors, i.e.,

The noise subspace **G** is formed from the singular vectors corresponding to the *mM*_{S} - *Q* least significant singular values, i.e.,

and the signal subspace from those corresponding to the *Q* largest singular values, i.e.,

Let **J**_{1}**Ā**_{t} = **A**_{ts}. The matrix **A**_{ts} is composed of Vandermonde matrices for sources *k* = 1, . . . , *K*. The matrix for each individual source is given by

where ‖·‖_{F} is the Frobenius norm. Note that this measure is closely related to the angles between the subspaces as explained in [33] and can hence be used as a measure of the extent to which (29) holds for a candidate fundamental frequency and DOA. The pair of fundamental frequency and DOA can, therefore, be found as the combination that is the closest to being orthogonal to **G**, i.e.,

The multi-channel estimator has a cost function that is better behaved than those of single-channel multi-pitch estimators (see, e.g., [26, 28, 32] for some examples of such).
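The coarse estimation step can be sketched as a grid search over candidate pairs of fundamental frequency and DOA. The code below assumes that the noise subspace **G** is taken from the left singular vectors of the smoothed data matrix, that each candidate column is the Kronecker product of a temporal Vandermonde vector and a subarray steering vector, and that the cost is the squared Frobenius norm of the candidate matrix projected onto **G**. All function names are illustrative. The sketch returns only the global minimum, i.e., a single source, whereas several sources would be read off as several minima.

```python
import numpy as np

def steering(theta_deg, omega, M, fs, d, c):
    """Assumed ULA steering vector (omega in rad/sample)."""
    phi = omega * fs * d / c * np.sin(np.deg2rad(theta_deg))
    return np.exp(1j * phi * np.arange(M))

def smooth(X, t, S):
    """Spatio-temporal smoothing of the M x N data matrix X."""
    M, N = X.shape
    M_S, width = M - S + 1, N - t + 1
    blocks = []
    for s in range(S):  # spatial shifts (subarrays)
        blocks.append(np.vstack([X[s:s + M_S, tau:tau + width]
                                 for tau in range(t)]))
    return np.hstack(blocks)  # shape (t*M_S, S*(N-t+1))

def noise_subspace(X_ts, Q):
    """Left singular vectors beyond the Q largest singular values."""
    U, _, _ = np.linalg.svd(X_ts, full_matrices=True)
    return U[:, Q:]

def candidate_matrix(theta_deg, omega, L, M_S, t, fs, d, c):
    """One column per harmonic: kron(temporal vector, subarray steering vector)."""
    cols = []
    for l in range(1, L + 1):
        temporal = np.exp(1j * omega * l * np.arange(t))
        cols.append(np.kron(temporal,
                            steering(theta_deg, l * omega, M_S, fs, d, c)))
    return np.stack(cols, axis=1)

def coarse_estimate(G, omegas, thetas_deg, L, M_S, t, fs, d, c):
    """Grid point whose candidate matrix is closest to orthogonal to G."""
    cost = np.empty((len(omegas), len(thetas_deg)))
    for a, w in enumerate(omegas):
        for b, th in enumerate(thetas_deg):
            A = candidate_matrix(th, w, L, M_S, t, fs, d, c)
            cost[a, b] = np.linalg.norm(A.conj().T @ G, "fro") ** 2
    a, b = np.unravel_index(np.argmin(cost), cost.shape)
    return omegas[a], thetas_deg[b], cost
```

The returned `cost` array can be inverted and plotted over the grid to obtain a two-dimensional pseudo-spectrum with peaks at the candidate sources.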

### 3.2. Refined estimates

with Re(·) denoting the real part. The gradient can be used for finding refined estimates using standard methods.

The fundamental frequency *ω*_{k} is first estimated with

where *i* is the iteration index and *δ* is a small positive constant that is found using line search. The estimated $\hat{\omega}_k^{i+1}$ is then used to initialize the minimization function for the DOA, which is then found as

The method is initialized for *i* = 0 using the coarse estimates obtained from (32).
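The alternating refinement described above can be sketched for a generic two-parameter cost function. This sketch swaps in a central-difference gradient and a backtracking line search with a sufficient-decrease (Armijo) test; it is not the analytic gradient of the article, and the names `refine` and `backtracking_step` as well as all default constants are assumptions made for illustration.

```python
def backtracking_step(f, x, grad, delta0=1.0, shrink=0.5, armijo=1e-4,
                      max_halvings=40):
    """Step size delta giving a sufficient decrease of f along -grad; 0.0 if none."""
    f_x, delta = f(x), delta0
    for _ in range(max_halvings):
        if f(x - delta * grad) <= f_x - armijo * delta * grad ** 2:
            return delta
        delta *= shrink
    return 0.0

def refine(cost, omega0, theta0, iterations=50, h=1e-6):
    """Alternate gradient steps on omega and theta, starting from coarse estimates."""
    omega, theta = omega0, theta0
    for _ in range(iterations):
        # Step 1: update the fundamental frequency with the DOA held fixed.
        g = (cost(omega + h, theta) - cost(omega - h, theta)) / (2.0 * h)
        omega -= backtracking_step(lambda w: cost(w, theta), omega, g) * g
        # Step 2: update the DOA using the new frequency estimate.
        g = (cost(omega, theta + h) - cost(omega, theta - h)) / (2.0 * h)
        theta -= backtracking_step(lambda th: cost(omega, th), theta, g) * g
    return omega, theta
```

In practice, `cost` would evaluate the subspace cost function for one source at a given frequency and DOA, and `omega0` and `theta0` would be the grid estimates from the coarse search.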

## 4. Experimental results

### 4.1. Signal examples

The proposed method is first applied to a mixture of a speech signal and a clarinet signal sampled at *f*_{s} = 8000 Hz. The single-channel signals are converted into a multi-channel signal by introducing different delays according to two pre-determined DOAs to simulate a microphone array with *M* = 8 channels. The simulated DOAs of the speech and the clarinet signals are, respectively, *θ*_{1} = -45° and *θ*_{2} = 45°. The spectrogram of the mixed signal of the first channel is illustrated in Figure 1. To avoid spatial ambiguities, the distance between two sensors is half the wavelength of the highest frequency in the observed signal, here *d* = 0.0425 m. The mixed signal is segmented into 50% overlapped segments with *N* = 128. The user parameters selected in this experiment are $t = \lfloor \frac{2N}{3} \rfloor$ and $s = \lfloor \frac{M}{2} \rfloor$. The cost function is evaluated with a Vandermonde matrix with *L* = 5 complex exponentials, and the noise subspace is formed from an overestimated signal subspace, assuming that the signal subspace contains *N*/2 = 64 complex exponentials. Signal subspace overestimation is commonly used when the true order of the signal subspace is unknown: the signal subspace is assumed to be larger than the true one, which minimizes the signal subspace components in the noise subspace. An added benefit of posing the problem as a joint estimation problem is that the multi-pitch estimation problem can be seen as several single-pitch problems for a distinct set of DOAs, one per source. Therefore, selecting an exact signal model order is less important than it is for single-channel multi-pitch estimators [28]. The cost function is evaluated for frequencies from 100 to 500 Hz with a granularity of 0.52 Hz. The results are illustrated in Figure 2, where the upper panel contains the fundamental frequency estimates and the lower panel the DOA estimates. It can be seen that the proposed algorithm tracks the fundamental frequency and the DOA of the speech signal well, with only a few observed errors in regions with low signal energy. The DOA and fundamental frequencies of the clarinet signal have also been estimated well for all segments.
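The parameter values quoted above can be reproduced with a few lines of arithmetic. The sketch assumes a propagation speed of *c* = 340 m/s and takes the highest frequency in the signal to be the Nyquist frequency *f*_{s}/2, which together give the stated sensor spacing. The helper names and the framing function are illustrative and not taken from the original experiment.

```python
import numpy as np

def half_wavelength_spacing(fs, c=340.0):
    """Sensor spacing equal to half the wavelength at the Nyquist frequency fs / 2."""
    return c / (2.0 * (fs / 2.0))

def frame_signal(x, N, overlap=0.5):
    """Split a 1-D signal into frames of length N with the given fractional overlap."""
    hop = int(N * (1.0 - overlap))
    n_frames = 1 + (len(x) - N) // hop
    return np.stack([x[k * hop:k * hop + N] for k in range(n_frames)])

fs, M, N = 8000, 8, 128
d = half_wavelength_spacing(fs)          # sensor spacing in meters
t = (2 * N) // 3                         # temporal smoothing factor, floor(2N/3)
s = M // 2                               # spatial smoothing factor, floor(M/2)
f0_grid = np.arange(100.0, 500.0, 0.52)  # candidate fundamental frequencies in Hz
```

With these values, `d` evaluates to 0.0425 m, `t` to 85, and `s` to 4, so each smoothed data matrix for a segment has 85 × 5 = 425 rows.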

The reference method shown in Figure 3 is limited by the spatial resolution available with only *M* = 8 array elements, which further affects the performance of the single-channel pitch estimator, as shown in the upper panel. In this example, the proposed algorithm shown in Figure 2 is superior to the reference method shown in Figure 3. The low resolution of the reference method makes a statistical evaluation of it uninteresting, and we, therefore, do not use it in the experiments that follow.

### 4.2. Statistical evaluation

Next, we use Monte Carlo simulations on synthetic signals embedded in noise to assess the statistical properties of the proposed method and compare it with the exact CRLB. As a reference method for pitch and DOA estimation, we use the JAFE algorithm proposed in [22] for jointly estimating unconstrained frequencies and DOAs. The unconstrained frequencies are then grouped according to their corresponding DOAs, where closely related directions are grouped together. A fundamental frequency is formed from these grouped frequencies in a weighted way as proposed in [35]. We refer to this as the WLS estimator. To remove errors due to erroneous amplitude estimates, we assume that the exact signal amplitudes are given to the WLS estimator. The WLS estimator is a computationally efficient pitch estimation method with good statistical properties. The reference DOA estimate is obtained in a similar way as the mean value of these grouped DOAs according to [22].
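The grouping and weighting steps of the reference method can be sketched as follows. The sketch assumes a commonly cited form of the weighted least-squares fundamental frequency estimate, in which the *l*th unconstrained frequency is weighted by *l* times its squared amplitude, and it assumes that each group contains exactly the harmonics 1, ..., *L* with none missing. The grouping rule with a fixed angular tolerance and both function names are illustrative choices.

```python
import numpy as np

def group_by_doa(freqs, doas_deg, tol_deg=5.0):
    """Split (frequency, DOA) pairs into groups of closely spaced DOAs.

    Returns a list of (frequencies, mean DOA) tuples, one per group.
    """
    freqs = np.asarray(freqs, dtype=float)
    doas_deg = np.asarray(doas_deg, dtype=float)
    order = np.argsort(doas_deg)
    groups, current = [], [order[0]]
    for prev, idx in zip(order[:-1], order[1:]):
        if doas_deg[idx] - doas_deg[prev] > tol_deg:
            groups.append(current)   # gap in DOA: start a new group
            current = []
        current.append(idx)
    groups.append(current)
    return [(freqs[g], float(np.mean(doas_deg[g]))) for g in groups]

def wls_fundamental(freqs, amps):
    """Weighted least-squares fundamental from harmonic frequency estimates."""
    order = np.argsort(freqs)
    f = np.asarray(freqs, dtype=float)[order]
    w = np.asarray(amps, dtype=float)[order] ** 2
    l = np.arange(1, f.size + 1)     # assumed harmonic numbers 1..L
    return float(np.sum(w * l * f) / np.sum(w * l ** 2))
```

A spurious or missing frequency estimate shifts the assumed harmonic numbering, which is one way to see why an estimator of this kind is sensitive to errors in the unconstrained frequencies.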

We consider an *M* = 8 element ULA with sensor distance *d* = 0.0425 m and a sampling frequency of *f*_{s} = 8000 Hz. The estimators are evaluated for two signal setups, first with two sources having *ω*_{1} = 252.123 and *ω*_{2} = 300.321 with *L*_{1,2} = 3, and second with one harmonic source of *ω*_{1} = 252.123 and *L*_{1} = 3. All amplitudes of the individual harmonics are set to unity, *A*_{l,k} = 1, for tractability. The sources are assumed to be far-field sources. In the two-source setup, they impinge on the array with DOAs *θ*_{1} = -43.23° and *θ*_{2} = 70°, respectively, and in the one-source setup the DOA is *θ*_{1} = -43.23°. All simulation results are based on 100 Monte Carlo runs. The performance is measured using the root mean squared estimation error (RMSE) as defined in [26–28, 32]. The user parameters for the JAFE data model are set to the optimal values proposed in [22], with temporal and spatial smoothing parameters $t = \lfloor \frac{2N}{3} \rfloor$ and $s = \lfloor \frac{M}{2} \rfloor$, respectively. We note that in practical applications, the computational complexity also has to be considered in selecting appropriate parameters *t* and *s*. An example of the two-dimensional (2D) cost function of the proposed method evaluated on a mixture of two signals is illustrated in Figure 4, where coarse estimates of the DOAs and fundamental frequencies can be identified from the two peaks in the 2D cost function.
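The RMSE over Monte Carlo runs can be computed as sketched below. The sketch assumes the usual definition, the square root of the mean squared difference between the estimates and the true parameter value. The names `rmse` and `monte_carlo_rmse` and the callable-based interface are hypothetical.

```python
import numpy as np

def rmse(estimates, true_value):
    """Root mean squared error of a set of estimates of one parameter."""
    estimates = np.asarray(estimates, dtype=float)
    return float(np.sqrt(np.mean((estimates - true_value) ** 2)))

def monte_carlo_rmse(estimator, simulate, true_value, runs=100, seed=0):
    """Run the estimator on `runs` independent realizations and return the RMSE.

    simulate(rng) must return one noisy data realization, and
    estimator(data) must return a scalar estimate.
    """
    rng = np.random.default_rng(seed)
    return rmse([estimator(simulate(rng)) for _ in range(runs)], true_value)
```

Repeating this for each sample length *N* or each SNR gives one RMSE curve per estimator, which can then be compared with the square root of the CRLB.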

For the single-source setup, the RMSEs for varying sample length *N* are shown in Figure 5, and for varying SNR in Figure 6. It can be seen from these figures that both estimators perform well for all SNRs above 0 dB, with WLS being slightly better for fundamental frequency estimation while the proposed estimator is better in DOA estimation. Both methods are also able to follow the CRLB closely for sample lengths of around *N* > 60. The better DOA estimation capabilities of the proposed method can be explained by the joint estimation of the fundamental frequency and DOA, which leads to increased robustness under adverse conditions. Both estimators can be considered consistent in the single-pitch scenario.

In the two-source setup, the WLS estimator approaches the CRLB for *N* > 80 samples, while the proposed estimator does so for *N* > 64. The remaining gap between the CRLB and both evaluated estimators for *N* > 80 is due to the mutual interference between the harmonic sources. The slowly converging performance of WLS is mainly due to poor estimates of the unconstrained frequencies from the JAFE method. With our selected simulation setup, the JAFE estimator does not give consistent estimates for all harmonic components, which, in turn, results in poor performance of the WLS estimates. In general, the WLS estimator is sensitive to spurious estimates of the unconstrained frequencies. Moreover, the proposed estimator, which jointly estimates both the DOA and the fundamental frequency, yields better estimates for smaller sample lengths *N*. The results in terms of RMSEs for varying SNRs are shown in Figure 8. This figure shows that the proposed estimator is again more robust than the WLS estimator for both DOA and fundamental frequency estimation.

Next, the RMSE is studied as a function of the difference between the fundamental frequencies of two harmonic sources, Δ*ω* = |*ω*_{1} - *ω*_{2}|, with *θ*_{1} = -43.321° and *θ*_{2} = 70°. Here, we use an SNR of 40 dB and a sample length *N* = 64 with *M* = 8 array elements. The obtained RMSEs are shown in Figure 9. The figure clearly shows that both methods can successfully estimate the fundamental frequencies and DOAs. Once again, the proposed estimator gives more robust estimates, close to the CRLB. Additionally, it should be noted that both methods correctly estimate the DOAs even when the two fundamental frequencies are identical, *ω*_{1} = *ω*_{2}, something that would not be possible with only a single channel. MC-HMUSIC is able to estimate the fundamental frequencies when the harmonics of both sources are identical, provided that the DOAs are distinct, and vice versa. Estimation of the parameters of signals with overlapping harmonics is a crucial limitation in multi-pitch estimation using only single-channel recordings. In the final experiment, the RMSE as a function of the difference between the DOAs of two harmonic sources, Δ*θ* = |*θ*_{1} - *θ*_{2}|, is analyzed for an SNR of 40 dB and a sample length of *N* = 64 with *M* = 8 array elements. The fundamental frequencies are *ω*_{1} = 252.123 and *ω*_{2} = 300.321, respectively. The observations and conclusions are basically the same as before, with the proposed method outperforming the reference method.

## 5. Conclusion

In this article, we have generalized the single-channel multi-pitch problem into a multi-channel multi-pitch estimation problem. To solve this new problem, we propose an estimator for joint estimation of fundamental frequencies and DOAs of multiple sources. The proposed estimator is based on subspace analysis using a time-space data model. The method is shown to have potential in applications to real signals with simulated anechoic array recording, and a statistical evaluation demonstrates its robustness in DOA and fundamental frequency estimation as compared to a state-of-the-art reference method. Furthermore, the proposed method is shown to have good statistical performance under adverse conditions, for example for sources with similar DOA or fundamental frequency.

## Declarations

### Acknowledgements

The study of Zhang was supported by the Marie Curie EST-SIGNAL Fellowship, Contract No. MEST-CT-2005-021175.

## References

1. Klapuri A: Automatic music transcription as we know it today. *J New Music Res* 2004, 33: 269-282.
2. Christensen MG, Jakobsson A: Multi-Pitch Estimation. *Synthesis Lectures on Speech and Audio Processing* 2009.
3. Rabiner L: On the use of autocorrelation analysis for pitch detection. *IEEE Trans Signal Process* 1996, 44: 2229-2244.
4. Zhang JX, Christensen MG, Jensen SH, Moonen M: A robust and computationally efficient subspace-based fundamental frequency estimator. *IEEE Trans Acoust Speech Language Process* 2010, 18(3):487-497.
5. de Cheveigne A, Kawahara H: YIN, a fundamental frequency estimator for speech and music. *J Acoust Soc Am* 2002, 111(4):1917-1930.
6. Wang DL, Brown GJ: *Computational Auditory Scene Analysis: Principle, Algorithm, and Applications.* Wiley, IEEE Press, New York; 2006.
7. Klapuri A: Multiple fundamental frequency estimation based on harmonicity and spectral smoothness. *IEEE Trans Speech Audio Process* 2003, 11: 804-816.
8. Emiya V, Bertrand D, Badeau R: A parametric method for pitch estimation of piano tones. *IEEE International Conference on Acoustics, Speech, and Signal Processing* 2007, 1: 249-252.
9. Rickard S, Yilmaz O: Blind separation of speech mixtures via time-frequency masking. *IEEE Trans Signal Process* 2004, 52: 1830-1847.
10. Wohmayr M, Kepsi M: Joint position-pitch extraction from multichannel audio. *Proceedings of the Interspeech* 2007.
11. Qian X, Kumaresan R: Joint estimation of time delay and pitch of voiced speech signals. *Record of the Asilomar Conference on Signals, Systems, and Computers* 1996, 2.
12. Wrigley SN, Brown GJ: Recurrent timing neural networks for joint F0-localisation based speech separation. *IEEE International Conference on Acoustics, Speech and Signal Processing* 2007.
13. Flego F, Omologo M: Robust F0 estimation based on a multi-microphone periodicity function for distant-talking speech. *EUSIPCO* 2006.
14. Armani L, Omologo M: Weighted auto-correlation-based F0 estimation for distant-talking interaction with a distributed microphone network. *IEEE International Conference on Acoustics, Speech and Signal Processing* 2004, 1: 113-116.
15. Chazan D, Stettiner Y, Malah D: Optimal multi-pitch estimation using the em algorithm for co-channel speech separation. *Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing* 1993.
16. Liao G, So HC, Ching PC: Joint time delay and frequency estimation of multiple sinusoids. *IEEE International Conference on Acoustics, Speech and Signal Processing* 2001, 5: 3121-3124.
17. Wu Y, So HC, Tan Y: Joint time-delay and frequency estimation using parallel factor analysis. *Elsevier Signal Process* 2009, 89: 1667-1670.
18. Ngan LY, Wu Y, So HC, Ching PC, Lee SW: Joint time delay and pitch estimation for speaker localization. *Proceedings of the IEEE International Symposium on Circuits and Systems* 2003, 722-725.
19. Stoica P, Moses R: *Spectral Analysis of Signals*. Prentice-Hall, Upper Saddle River; 2005.
20. Brandstein M, Ward D: *Microphone Arrays*. Springer, Berlin; 2001.
21. van der Veen AJ, Vanderveen M, Paulraj A: Joint angle and delay estimation using shift invariance techniques. *IEEE Trans Signal Process* 1998, 46: 405-418.
22. Lemma AN, van der Veen AJ, Deprettere EF: Analysis of joint angle-frequency estimation using ESPRIT. *IEEE Trans Signal Process* 2003, 51: 1264-1283.
23. Viberg M, Stoica P: A computationally efficient method for joint direction finding and frequency estimation in colored noise. *Record of the Asilomar Conference on Signals, Systems, and Computers* 1998, 2: 1547-1551.
24. Lin JD, Fang WH, Wang YY, Chen JT: FSF MUSIC for joint DOA and frequency estimation and its performance analysis. *IEEE Trans Signal Process* 2006, 54: 4529-4542.
25. Wang S, Caffery J, Zhou X: Analysis of a joint space-time doa/foa estimator using MUSIC. *IEEE International Symposium on Personal, Indoor and Mobile Radio Communications* 2001, B138-B142.
26. Christensen MG, Stoica P, Jakobsson A, Jensen SH: Multi-pitch estimation. *Elsevier Signal Process* 2008, 88(4):972-983.
27. Christensen MG, Jakobsson A, Jensen SH: Joint high-resolution fundamental frequency and order estimation. *IEEE Trans Acoust Speech Signal Process* 2007, 15(5):1635-1644.
28. Zhang JX, Christensen MG, Jensen SH, Moonen M: An iterative subspace-based multi-pitch estimation algorithm. *Elsevier Signal Process* 2011, 91: 150-154.
29. Lemma AN: ESPRIT based joint angle-frequency estimation algorithms and simulations. *PhD Thesis Delft University* 1999.
30. Shu T, Liu XZ: Robust and computationally efficient signal-dependent method for joint DOA and frequency estimation. *EURASIP J Adv Signal Process* 2008, 2008: Article ID 10.1155/2008/134853.
31. Krim H, Viberg M: Two decades of array processing research-the parametric approach. *IEEE SP Mag* 1996.
32. Christensen MG, Jakobsson A, Jensen SH: Multi-pitch estimation using Harmonic MUSIC. *Record of the Asilomar Conference on Signals, Systems, and Computers* 2006, 521-525.
33. Christensen MG, Jakobsson A, Jensen SH: Sinusoidal order estimation using angles between subspaces. *EURASIP J Adv Signal Process* 2009, 1-11. Article ID 948756.
34. Veen BDV, Buckley KM: Beamforming: a versatile approach to spatial filtering. *IEEE ASSP Mag* 1988, 5: 4-24.
35. Li H, Stoica P, Li J: Computationally efficient parameter estimation for harmonic sinusoidal signals. *Elsevier Signal Process* 2000, 1937-1944.

## Copyright

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.