  • Research Article
  • Open access

Microphone Array Speaker Localizers Using Spatial-Temporal Information

Abstract

This paper addresses a two-stage approach to speaker localization with a microphone array. In the first stage, which is not the main concern of this paper, the time difference of arrival (TDOA) of the speech signal is estimated for each pair of microphones. In the second stage, on which we focus, these estimates are combined to obtain the source location. We propose to exploit the smoothness of the speaker's trajectory to improve the current position estimate. Three localization schemes that use this temporal information are presented. The first is a recursive form of the Gauss method. The other two are extensions of the Kalman filter to the nonlinear problem at hand, namely, the extended Kalman filter and the unscented Kalman filter. These methods are compared with other algorithms that do not use the temporal information. An extensive experimental study demonstrates the advantage of the spatial-temporal methods. To gain insight into the obtainable performance of the localization algorithm, an approximate analytical evaluation, verified by an experimental study, is conducted. This study shows that in common TDOA-based localization scenarios, where the microphone array has a small interelement spread relative to the source position, the elevation and azimuth angles can be accurately estimated, whereas the Cartesian coordinates and the range are poorly estimated.
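As a rough illustration of the second stage, the following Python sketch shows how TDOA readings could be fused over time with one extended Kalman filter step. It is not the authors' implementation. The function names (`tdoa_model`, `tdoa_jacobian`, `ekf_step`), the random-walk motion model, the speed of sound of 343 m/s, and all noise levels and geometry in the demo are assumptions made only for this example.

```python
import numpy as np

C_SOUND = 343.0  # assumed speed of sound in m/s


def tdoa_model(s, mics, pairs, c=C_SOUND):
    """Predicted TDOA in seconds for each microphone pair (i, j)."""
    pairs = np.asarray(pairs)
    d = np.linalg.norm(mics - s, axis=1)  # source-to-microphone distances
    return (d[pairs[:, 0]] - d[pairs[:, 1]]) / c


def tdoa_jacobian(s, mics, pairs, c=C_SOUND):
    """Derivative of the TDOA vector with respect to the source position."""
    pairs = np.asarray(pairs)
    diff = s - mics
    u = diff / np.linalg.norm(diff, axis=1, keepdims=True)  # unit vectors
    return (u[pairs[:, 0]] - u[pairs[:, 1]]) / c


def ekf_step(x, P, z, mics, pairs, Q, R, c=C_SOUND):
    """One EKF predict/update cycle for a position-only state.

    Assumption: the speaker follows a random walk, so the prediction keeps
    the position and only inflates the covariance by Q.
    """
    x_pred = x
    P_pred = P + Q
    H = tdoa_jacobian(x_pred, mics, pairs, c)   # linearized measurement
    S = H @ P_pred @ H.T + R                    # innovation covariance
    K = np.linalg.solve(S, H @ P_pred).T        # gain K = P H^T S^-1
    x_new = x_pred + K @ (z - tdoa_model(x_pred, mics, pairs, c))
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical compact array (metres) and a static source.
    mics = np.array([[0.0, 0.0, 0.0], [0.3, 0.0, 0.0], [0.0, 0.3, 0.0],
                     [0.0, 0.0, 0.3], [0.3, 0.3, 0.0]])
    pairs = np.array([[1, 0], [2, 0], [3, 0], [4, 0]])
    source = np.array([1.5, 2.0, 1.0])
    sigma = 1e-5  # assumed TDOA noise standard deviation in seconds

    x, P = np.array([1.0, 1.0, 1.0]), np.eye(3)
    Q = 1e-4 * np.eye(3)
    R = sigma**2 * np.eye(len(pairs))
    for _ in range(50):
        z = tdoa_model(source, mics, pairs) + sigma * rng.standard_normal(len(pairs))
        x, P = ekf_step(x, P, z, mics, pairs, Q, R)
    print("estimate:", x, "true:", source)
```

Inspecting the eigenvectors of the returned covariance `P` is one way to see which directions remain uncertain for a given array geometry, which relates to the angle-versus-range observation made in the abstract.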

Author information

Corresponding author

Correspondence to Sharon Gannot.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License ( https://creativecommons.org/licenses/by/2.0 ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Gannot, S., Dvorkind, T.G. Microphone Array Speaker Localizers Using Spatial-Temporal Information. EURASIP J. Adv. Signal Process. 2006, 059625 (2006). https://doi.org/10.1155/ASP/2006/59625

  • DOI: https://doi.org/10.1155/ASP/2006/59625
