Open Access

Microphone Array Speaker Localizers Using Spatial-Temporal Information

EURASIP Journal on Advances in Signal Processing 2006, 2006:059625

https://doi.org/10.1155/ASP/2006/59625

Received: 20 January 2005

Accepted: 22 August 2005

Published: 3 May 2006

Abstract

This paper addresses a two-step approach to speaker localization with a microphone array. In the first stage, which is not the main concern here, the time difference of arrival (TDOA) of the speech signal is estimated for each pair of microphones. In the second stage, these readings are combined to obtain the source location. We focus on the second stage and propose to exploit the smoothness of the speaker's trajectory to improve the current position estimate. Three localization schemes that use this temporal information are presented. The first is a recursive form of the Gauss method. The other two are extensions of the Kalman filter to the nonlinear problem at hand, namely, the extended Kalman filter and the unscented Kalman filter. These methods are compared with other algorithms that do not use the temporal information. An extensive experimental study demonstrates the advantage of the spatial-temporal methods. To gain insight into the obtainable performance of the localization algorithm, an approximate analytical evaluation, verified by an experimental study, is conducted. This study shows that in common TDOA-based localization scenarios, where the microphone array has a small interelement spread relative to the source position, the elevation and azimuth angles can be accurately estimated, whereas the Cartesian coordinates and the range are poorly estimated.

[1–31]
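The closing claim of the abstract, that a compact array resolves direction far better than range, can be illustrated numerically. The following is a minimal sketch and not the paper's analysis: the two-microphone geometry, the 0.2 m spacing, the source position, the speed of sound, and all function names are assumptions chosen for illustration. It compares how much a single TDOA reading changes when the source moves 0.5 m along its bearing versus roughly 0.5 m across it.

```python
import math

C = 343.0  # assumed speed of sound in air, m/s


def tdoa(source, mic_a, mic_b, c=C):
    """TDOA in seconds: arrival time at mic_a minus arrival time at mic_b."""
    return (math.dist(source, mic_a) - math.dist(source, mic_b)) / c


def source_at(rng, azimuth_deg):
    """2-D source position at a given range (m) and azimuth (degrees)."""
    a = math.radians(azimuth_deg)
    return (rng * math.cos(a), rng * math.sin(a))


# Hypothetical microphone pair, 0.2 m apart, centered at the origin.
mic_a = (-0.1, 0.0)
mic_b = (0.1, 0.0)

# Reference source: 3 m away at 60 degrees azimuth.
base = tdoa(source_at(3.0, 60.0), mic_a, mic_b)

# Move the source 0.5 m farther away along the same bearing.
radial_shift = abs(tdoa(source_at(3.5, 60.0), mic_a, mic_b) - base)

# Move the source about 0.5 m sideways (arc length = range * angle).
delta_deg = math.degrees(0.5 / 3.0)
angular_shift = abs(tdoa(source_at(3.0, 60.0 + delta_deg), mic_a, mic_b) - base)

print(f"TDOA change for a radial move:  {radial_shift:.3e} s")
print(f"TDOA change for an angular move: {angular_shift:.3e} s")
```

Under these assumptions the sideways move changes the TDOA by orders of magnitude more than the radial move of the same length, so measurement noise that barely disturbs the angle estimate can still swamp the range information.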

Authors’ Affiliations

(1) School of Engineering, Bar-Ilan University
(2) Department of Electrical Engineering, Technion – Israel Institute of Technology

References

  1. Knapp CH, Carter GC: The generalized correlation method for estimation of time delay. IEEE Transactions on Acoustics, Speech, and Signal Processing 1976, 24(4):320-327. 10.1109/TASSP.1976.1162830
  2. Brandstein M, Silverman H: A robust method for speech signal time-delay estimation in reverberant rooms. Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '97), April 1997, Munich, Germany 1: 375-378.
  3. Stéphenne A, Champagne B: A new cepstral prefiltering technique for estimating time delay under reverberant conditions. Signal Processing 1997, 59(3):253-266. 10.1016/S0165-1684(97)00051-0
  4. Benesty J: Adaptive eigenvalue decomposition algorithm for passive acoustic source localization. The Journal of the Acoustical Society of America 2000, 107(1):384-391. 10.1121/1.428310
  5. Doclo S, Moonen M: Robust adaptive time delay estimation for speaker localization in noisy and reverberant acoustic environments. EURASIP Journal on Applied Signal Processing 2003, 2003(11):1110-1124. 10.1155/S111086570330602X
  6. Dvorkind T, Gannot S: Speaker localization in a reverberant environment. Proceedings of the 22nd IEEE Convention of Electrical and Electronics Engineers in Israel (IEEEI '02), December 2002, Tel-Aviv, Israel 7-9.
  7. Dvorkind TG, Gannot S: Approaches for time difference of arrival estimation in a noisy and reverberant environment. Proceedings of the International Workshop on Acoustic Echo and Noise Control (IWAENC '03), September 2003, Kyoto, Japan 215-218.
  8. Dvorkind TG, Gannot S: Time difference of arrival estimation of speech source in a noisy and reverberant environment. Signal Processing 2004, 85(1):177-204.
  9. Chan YT, Ho KC: A simple and efficient estimator for hyperbolic location. IEEE Transactions on Signal Processing 1994, 42(8):1905-1915. 10.1109/78.301830
  10. Brandstein MS, Adcock JE, Silverman HF: A closed-form location estimator for use with room environment microphone arrays. IEEE Transactions on Speech and Audio Processing 1997, 5(1):45-50. 10.1109/89.554268
  11. Schau HC, Robinson AZ: Passive source localization employing intersecting spherical surfaces from time-of-arrival differences. IEEE Transactions on Acoustics, Speech, and Signal Processing 1987, 35(8):1223-1225.
  12. Smith J, Abel J: Closed-form least-squares source location estimation from range-difference measurements. IEEE Transactions on Acoustics, Speech, and Signal Processing 1987, 35(12):1661-1669. 10.1109/TASSP.1987.1165089
  13. Huang Y, Benesty J, Elko GW: Passive acoustic source localization for video camera steering. Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '00), June 2000, Istanbul, Turkey 909-912.
  14. Huang Y, Benesty J, Elko GW, Mersereau RM: Real-time passive source localization: a practical linear-correction least squares approach. IEEE Transactions on Speech and Audio Processing 2001, 9(8):943-956. 10.1109/89.966097
  15. Press WH, Flannery BP, Teukolsky SA, Vetterling WT: Numerical Recipes in C: The Art of Scientific Computing. Cambridge University Press, Cambridge, UK; 1988.
  16. Chen JC, Hudson RE, Yao K: Maximum-likelihood acoustic source localization: experimental results. Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '02), May 2002, Orlando, Fla, USA 3: 2949-2952.
  17. Chen JC, Hudson RE, Yao K: Maximum-likelihood source localization and unknown sensor location estimation for wideband signals in the near-field. IEEE Transactions on Signal Processing 2002, 50(8):1843-1854. 10.1109/TSP.2002.800420
  18. Segal M, Weinstein E, Musicus BR: Estimate-maximize algorithms for multichannel time delay and signal estimation. IEEE Transactions on Signal Processing 1991, 39(1):1-16. 10.1109/78.80760
  19. Birchfield ST, Gillmor DK: Fast Bayesian acoustic localization. Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '02), May 2002, Orlando, Fla, USA 2: 1793-1796.
  20. Ward DB, Lehmann EA, Williamson RC: Particle filtering algorithms for tracking an acoustic source in a reverberant environment. IEEE Transactions on Speech and Audio Processing 2003, 11(6):826-836. 10.1109/TSA.2003.818112
  21. Vermaak J, Blake A: Nonlinear filtering for speaker tracking in noisy and reverberant environments. Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '01), May 2001, Salt Lake City, Utah, USA 5: 3021-3024.
  22. Lehmann EA, Williamson RC: Importance sampling particle filter for robust acoustic source localization and tracking in reverberant environments. Proceedings of the Joint Workshop on Hands-Free Speech Communication and Microphone Arrays (HSCMA '05), March 2005, Piscataway, NJ, USA C: 17-18.
  23. Bechler D, Grimm M, Kroschel K: Speaker tracking with a microphone array using Kalman filtering. Advances in Radio Science 2003, 1: 113-117.
  24. Klee U, McDonough J: Kalman filtering for acoustic source localization based on time delay of arrival. Proceedings of the Joint Workshop on Hands-Free Speech Communication and Microphone Arrays (HSCMA '05), March 2005, Piscataway, NJ, USA C: 5-6.
  25. Dvorkind TG, Gannot S: Speaker localization exploiting spatial-temporal information. Proceedings of the International Workshop on Acoustic Echo and Noise Control (IWAENC '03), September 2003, Kyoto, Japan 295-298.
  26. Dvorkind TG, Gannot S: Speaker localization using the unscented Kalman filter. Proceedings of the Joint Workshop on Hands-Free Speech Communication and Microphone Arrays (HSCMA '05), March 2005, Piscataway, NJ, USA C: 3-4.
  27. Haykin S: Adaptive Filter Theory, Information and System Sciences. 4th edition. Prentice Hall, Upper Saddle River, NJ, USA; 2002.
  28. Popescu DC, Rose C: Emitter localization in a multipath environment using extended Kalman filter. Proceedings of the 33rd Conference on Information Sciences and Systems (CISS '99), March 1999, Baltimore, Md, USA 1: 147-150.
  29. Julier SJ, Uhlmann JK: Unscented filtering and nonlinear estimation. Proceedings of the IEEE 2004, 92(3):401-422. 10.1109/JPROC.2003.823141
  30. Allen JB, Berkley DA: Image method for efficiently simulating small-room acoustics. The Journal of the Acoustical Society of America 1979, 65(4):943-950. 10.1121/1.382599
  31. Wan EA, van der Merwe R: The unscented Kalman filter for nonlinear estimation. Proceedings of the IEEE Symposium on Adaptive Systems for Signal Processing, Communication and Control (AS-SPCC '00), October 2000, Lake Louise, Alberta, Canada 153-158.

Copyright

© Gannot and Dvorkind 2006