Particle Filter with Integrated Voice Activity Detection for Acoustic Source Tracking

Abstract

In noisy and reverberant environments, acoustic source localisation and tracking (ASLT) using an array of microphones presents a number of difficulties. One of the main issues in real-world situations involving human speakers is the temporally discontinuous nature of speech signals: silence gaps in the speech can easily misguide the tracking algorithm, even in practical environments with low to moderate noise and reverberation levels. A natural extension of currently available sound source tracking algorithms is the integration of a voice activity detection (VAD) scheme. We describe a new ASLT algorithm based on a particle filtering (PF) approach, where VAD measurements are fused within the statistical framework of the PF implementation. Tracking accuracy results for the proposed method are presented on the basis of synthetic audio samples generated with the image method, whereas performance results obtained with a real-time implementation of the algorithm, using real audio data recorded in a reverberant room, are published elsewhere. Compared to a previously proposed PF algorithm, the experimental results demonstrate the improved robustness of the method described in this work when tracking sources emitting real-world speech signals, which typically involve significant silence gaps between utterances.
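
To make the idea of fusing VAD information into a particle filter more concrete, the following is a minimal sketch and not the algorithm of the article. It assumes a one-dimensional source position, a random-walk motion model, a single noisy position measurement per frame, and a VAD output given as a speech probability between 0 and 1. The names `pf_step`, `estimate`, `speech_prob`, `room_len`, `sigma_dyn` and `sigma_obs` are hypothetical and chosen for illustration only. The assumed fusion rule blends a Gaussian measurement likelihood with a uniform one, so that measurements taken during silence carry little or no weight.

```python
import math
import random


def gaussian_pdf(z, mean, sigma):
    """Density of a normal distribution N(mean, sigma^2) evaluated at z."""
    return math.exp(-0.5 * ((z - mean) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def pf_step(particles, z, speech_prob, room_len=5.0,
            sigma_dyn=0.05, sigma_obs=0.2, rng=random):
    """One predict/update/resample step of a 1-D bootstrap particle filter.

    particles   : list of candidate source positions (metres)
    z           : position measurement for this frame (assumed input)
    speech_prob : VAD output in [0, 1]; 0 means silence (assumed input)
    """
    # Predict: random-walk motion model, clipped to the room extent.
    predicted = [min(max(x + rng.gauss(0.0, sigma_dyn), 0.0), room_len)
                 for x in particles]

    # Update: mixture of a Gaussian likelihood (measurement trusted) and a
    # uniform likelihood (measurement ignored), weighted by the VAD output.
    uniform = 1.0 / room_len
    weights = [speech_prob * gaussian_pdf(z, x, sigma_obs)
               + (1.0 - speech_prob) * uniform
               for x in predicted]
    total = sum(weights)
    if total <= 0.0:
        # Numerical underflow guard: fall back to equal weights.
        weights = [1.0 / len(predicted)] * len(predicted)
    else:
        weights = [w / total for w in weights]

    # Resample: multinomial resampling proportional to the weights.
    return rng.choices(predicted, weights=weights, k=len(predicted))


def estimate(particles):
    """Point estimate of the source position: mean of the particle set."""
    return sum(particles) / len(particles)
```

With `speech_prob` set to 0, all weights are equal and the particle set only diffuses under the motion model, so a spurious measurement during a silence gap does not pull the estimate away. With `speech_prob` close to 1, the step behaves like an ordinary bootstrap filter update.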

Author information

Corresponding author

Correspondence to Eric A. Lehmann.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Lehmann, E.A., Johansson, A.M. Particle Filter with Integrated Voice Activity Detection for Acoustic Source Tracking. EURASIP J. Adv. Signal Process. 2007, 050870 (2006). https://doi.org/10.1155/2007/50870

Keywords

  • Speech Signal
  • Particle Filter
  • Tracking Algorithm
  • Acoustic Source
  • Voice Activity Detection