A Computationally Efficient Mel-Filter Bank VAD Algorithm for Distributed Speech Recognition Systems

Abstract

This paper presents a novel, computationally efficient voice activity detection (VAD) algorithm and emphasizes the importance of such algorithms in distributed speech recognition (DSR) systems. In telecommunication systems, a VAD algorithm can reduce the required capacity of the speech transmission channel, because only the speech parts of the signal need to be transmitted. A similar objective can be adopted in DSR systems, where the nonspeech parameters are not sent over the transmission channel. The proposed approach bases VAD decisions on mel-filter bank (MFB) outputs combined with a so-called hangover criterion. The proposed MFB VAD algorithm is compared with three VAD algorithms used in the G.729, G.723.1, and DSR (advanced front-end) standards. The tests were carried out on the Aurora 2 database at different signal-to-noise ratios (SNRs). In the speech recognition tests, the proposed MFB VAD outperformed all three VAD algorithms used in the standards by relative (G.723.1 VAD), by relative (G.729 VAD), and by relative (DSR VAD) at all SNRs.
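The general idea of a hangover criterion and of dropping nonspeech frames before transmission can be illustrated with a minimal sketch. This is not the algorithm proposed in the paper: the fixed threshold, the mean over filter-bank outputs, and the function names `raw_vad`, `apply_hangover`, and `select_frames_to_send` are all assumptions made purely for illustration.

```python
from typing import List, Sequence


def raw_vad(mfb_frames: Sequence[Sequence[float]], threshold: float) -> List[bool]:
    """Flag a frame as speech when its mean filter-bank output exceeds a threshold.

    Assumption: a fixed threshold on the mean output stands in for the real
    decision rule, which the abstract does not specify. Frames must be non-empty.
    """
    return [sum(frame) / len(frame) > threshold for frame in mfb_frames]


def apply_hangover(raw: Sequence[bool], hangover_frames: int) -> List[bool]:
    """Hold the decision at 'speech' for hangover_frames frames after speech ends."""
    smoothed = []
    remaining = 0  # hangover frames still available after the last speech frame
    for is_speech in raw:
        if is_speech:
            remaining = hangover_frames  # re-arm the hangover counter
            smoothed.append(True)
        elif remaining > 0:
            remaining -= 1  # spend one hangover frame
            smoothed.append(True)
        else:
            smoothed.append(False)
    return smoothed


def select_frames_to_send(frames: Sequence, decisions: Sequence[bool]) -> list:
    """Keep only the frames marked as speech, as a DSR terminal might before sending."""
    return [frame for frame, keep in zip(frames, decisions) if keep]


# Hypothetical two-band filter-bank outputs for six frames.
frames = [[0.1, 0.1], [2.0, 3.0], [0.2, 0.1], [0.1, 0.1], [0.1, 0.2], [0.1, 0.1]]
raw = raw_vad(frames, threshold=1.0)
decisions = apply_hangover(raw, hangover_frames=2)
to_send = select_frames_to_send(frames, decisions)
```

In this toy input only the second frame exceeds the threshold, and the hangover keeps the two frames that follow it, so low-energy speech tails are less likely to be clipped at the cost of sending a few extra frames.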

Author information

Corresponding author

Correspondence to Damjan Vlaj.

Rights and permissions

Open Access. This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Vlaj, D., Kotnik, B., Horvat, B. et al. A Computationally Efficient Mel-Filter Bank VAD Algorithm for Distributed Speech Recognition Systems. EURASIP J. Adv. Signal Process. 2005, 561951 (2005). https://doi.org/10.1155/ASP.2005.487

  • DOI: https://doi.org/10.1155/ASP.2005.487

Keywords and phrases

  • voice activity detection
  • distributed speech recognition
  • telecommunication systems