- Research Article
- Open Access
Microphone Array Speaker Localizers Using Spatial-Temporal Information
EURASIP Journal on Advances in Signal Processing volume 2006, Article number: 059625 (2006)
This paper addresses a two-stage approach to speaker localization with a microphone array. In the first stage, which is not the main concern here, the time difference of arrival (TDOA) of the speech signal is estimated for each pair of microphones. In the second stage, these TDOA readings are combined to obtain the source location. We focus on the second stage and propose to exploit the smoothness of the speaker's trajectory to improve the current position estimate. Three localization schemes that use this temporal information are presented. The first is a recursive form of the Gauss method. The other two are extensions of the Kalman filter to the nonlinear problem at hand, namely the extended Kalman filter and the unscented Kalman filter. These methods are compared with other algorithms that do not use temporal information. An extensive experimental study demonstrates the advantage of the spatial-temporal methods. To gain insight into the obtainable performance of the localization algorithm, an approximate analytical evaluation, verified by an experimental study, is conducted. This study shows that in common TDOA-based localization scenarios, where the microphone array has a small interelement spread relative to the source position, the elevation and azimuth angles can be accurately estimated, whereas the Cartesian coordinates and the range are poorly estimated.
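The closing observation of the abstract, that a compact array resolves direction far better than range, can be illustrated with a small numerical sketch. This is not the analytical evaluation from the paper. It only evaluates a noise-free TDOA model for a hypothetical four-microphone array (10 cm from the origin) and compares how much the TDOAs change when the source moves 1 m farther away versus when it rotates by 10 degrees in azimuth. The array geometry, the nominal speed of sound, and all function names are assumptions made for illustration.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed nominal value


def tdoa(source, mic_a, mic_b, c=SPEED_OF_SOUND):
    """Noise-free time difference of arrival (s) between mic_a and mic_b."""
    return (math.dist(source, mic_a) - math.dist(source, mic_b)) / c


def tdoa_vector(source, mic_pairs):
    """TDOA for every microphone pair, in the order given."""
    return [tdoa(source, a, b) for a, b in mic_pairs]


def spherical_to_cartesian(rng, azimuth, elevation):
    """Convert range (m) and angles (rad) to Cartesian coordinates."""
    return (
        rng * math.cos(elevation) * math.cos(azimuth),
        rng * math.cos(elevation) * math.sin(azimuth),
        rng * math.sin(elevation),
    )


def max_tdoa_change(source_p, source_q, mic_pairs):
    """Largest absolute change in any pair's TDOA between two positions."""
    tp = tdoa_vector(source_p, mic_pairs)
    tq = tdoa_vector(source_q, mic_pairs)
    return max(abs(x - y) for x, y in zip(tp, tq))


# Hypothetical compact array: every microphone is 10 cm from the origin.
mics = [(0.1, 0.0, 0.0), (-0.1, 0.0, 0.0), (0.0, 0.1, 0.0), (0.0, 0.0, 0.1)]
# Pair each microphone with the first one, which acts as the reference.
pairs = [(mics[0], m) for m in mics[1:]]

az, el = math.radians(40.0), math.radians(20.0)
base = spherical_to_cartesian(3.0, az, el)
farther = spherical_to_cartesian(4.0, az, el)  # same direction, 1 m farther
rotated = spherical_to_cartesian(3.0, az + math.radians(10.0), el)  # same range

range_effect = max_tdoa_change(base, farther, pairs)
angle_effect = max_tdoa_change(base, rotated, pairs)

print(f"TDOA change for +1 m range:     {range_effect:.3e} s")
print(f"TDOA change for +10 deg azimuth: {angle_effect:.3e} s")
```

Under these assumptions the change caused by the range step is much smaller than the change caused by the azimuth step, so a given level of TDOA measurement noise obscures range far more readily than direction.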
About this article
Cite this article
Gannot, S., Dvorkind, T.G. Microphone Array Speaker Localizers Using Spatial-Temporal Information. EURASIP J. Adv. Signal Process. 2006, 059625 (2006) doi:10.1155/ASP/2006/59625
- Kalman Filter
- Speech Signal
- Temporal Information
- Extended Kalman Filter