- Research Article
- Open Access
Particle Filter with Integrated Voice Activity Detection for Acoustic Source Tracking
EURASIP Journal on Advances in Signal Processing volume 2007, Article number: 050870 (2006)
In noisy and reverberant environments, the problem of acoustic source localisation and tracking (ASLT) using an array of microphones presents a number of challenges. One of the main issues when considering real-world situations involving human speakers is the temporally discontinuous nature of speech signals: the presence of silence gaps in the speech can easily misguide the tracking algorithm, even in practical environments with low to moderate noise and reverberation levels. A natural extension of currently available sound source tracking algorithms is the integration of a voice activity detection (VAD) scheme. We describe a new ASLT algorithm based on a particle filtering (PF) approach, where VAD measurements are fused within the statistical framework of the PF implementation. Tracking accuracy results for the proposed method are presented on the basis of synthetic audio samples generated with the image method, whereas performance results obtained with a real-time implementation of the algorithm, and using real audio data recorded in a reverberant room, are published elsewhere. Compared to a previously proposed PF algorithm, the experimental results demonstrate the improved robustness of the method described in this work when tracking sources emitting real-world speech signals, which typically involve significant silence gaps between utterances.
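To illustrate the general idea of fusing a VAD output into a particle filter, the following is a minimal sketch and not the formulation used in the article. It assumes a one-dimensional source position, a random-walk motion model, and a likelihood that mixes a Gaussian around the position measurement with a uniform density over the room, weighted by a hypothetical speech probability `speech_prob` supplied by a VAD. The function name `pf_step` and all parameter values are illustrative assumptions.

```python
import math
import random


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution N(mean, std^2) evaluated at x."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))


def pf_step(particles, measurement, speech_prob, rng,
            motion_std=0.05, meas_std=0.1, room_length=5.0):
    """One bootstrap particle filter update for a 1-D source position.

    speech_prob is a hypothetical VAD output in [0, 1]; all default
    parameter values are assumptions chosen for illustration only.
    """
    # Predict: random-walk motion model (assumption), clipped to the room.
    predicted = [min(max(p + rng.gauss(0.0, motion_std), 0.0), room_length)
                 for p in particles]

    # Weight: when speech is likely, trust the measurement (Gaussian term);
    # when silence is likely, fall back to a uniform density over the room
    # so that a spurious measurement carries little information.
    uniform = 1.0 / room_length
    weights = [speech_prob * gaussian_pdf(measurement, p, meas_std)
               + (1.0 - speech_prob) * uniform
               for p in predicted]
    total = sum(weights)
    if total == 0.0:
        # Every weight underflowed; fall back to equal weights.
        weights = [1.0 / len(predicted)] * len(predicted)
    else:
        weights = [w / total for w in weights]

    # Estimate: weighted mean of the predicted particles.
    estimate = sum(w * p for w, p in zip(weights, predicted))

    # Resample: multinomial resampling with replacement.
    resampled = rng.choices(predicted, weights=weights, k=len(predicted))
    return resampled, estimate
```

With `speech_prob` close to zero the weights become nearly uniform, so the particle set simply diffuses under the motion model instead of following measurements taken during a silence gap. This is the behaviour the abstract attributes to integrating VAD, shown here only in simplified form.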
Cite this article
Lehmann, E.A., Johansson, A.M. Particle Filter with Integrated Voice Activity Detection for Acoustic Source Tracking. EURASIP J. Adv. Signal Process. 2007, 050870 (2006). https://doi.org/10.1155/2007/50870
- Speech Signal
- Particle Filter
- Tracking Algorithm
- Acoustic Source
- Voice Activity Detection