Online Speech/Music Segmentation Based on the Variance Mean of Filter Bank Energy

Kos, Marko; Grašič, Matej; Kačič, Zdravko

doi:10.1155/2009/628570

Research Article
Open access
Published: 22 October 2009

EURASIP Journal on Advances in Signal Processing, volume 2009, Article number: 628570 (2009)

Abstract

This paper presents a novel feature for online speech/music segmentation, based on the variance mean of filter bank energy (VMFBE). The feature is motivated by the way energy varies within a narrow frequency sub-band: it varies more rapidly, and to a greater extent, for speech than for music, so the energy variance in such a sub-band is greater for speech than for music. A radio broadcast database and the BNSI broadcast news database were used to evaluate the feature's discrimination and segmentation ability. The VMFBE calculation procedure shares 4 of its 6 steps with the MFCC feature calculation procedure. This makes it a convenient speech/music discriminator for real-time automatic speech recognition systems based on MFCC features, because processing time is saved and the computational load increases only slightly. Analysis of the feature's speech/music discriminative ability shows an average error rate below 10% for radio broadcast material, and it outperforms the other features used for comparison by more than 8%. Used as a stand-alone speech/music discriminator in a segmentation system, the proposed feature achieves an overall accuracy of over 94% on radio broadcast material.
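
The idea in the abstract can be illustrated with a rough sketch: compute filter bank energies per frame, take the variance of each band's energy over time, and average those variances across bands. Everything specific below is an assumption made for illustration and is not taken from the paper: the function names `mel_filter_bank` and `vmfbe`, the mel-spaced triangular filters, the use of log energies, the 25 ms / 10 ms framing at 16 kHz, and the synthetic test signals. The abstract does not say which four steps are shared with MFCC, so the front end here is simply a generic MFCC-style one.

```python
import numpy as np


def mel_filter_bank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale (generic MFCC-style front end)."""
    hz_to_mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    mel_to_hz = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    bank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):      # rising slope of the triangle
            bank[i - 1, k] = (k - left) / (center - left)
        for k in range(center, right):     # falling slope of the triangle
            bank[i - 1, k] = (right - k) / (right - center)
    return bank


def vmfbe(signal, sample_rate=16000, frame_len=400, hop=160, n_fft=512, n_filters=24):
    """Mean over bands of the per-band variance of log filter bank energy (assumed form)."""
    signal = np.asarray(signal, dtype=float)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Index matrix: one row of sample positions per frame.
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)
    power = np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2
    bank = mel_filter_bank(n_filters, n_fft, sample_rate)
    # Small constant avoids log(0); log energies are an assumption of this sketch.
    log_energy = np.log(power @ bank.T + 1e-10)
    band_variance = np.var(log_energy, axis=0)   # variance over time, one value per band
    return float(np.mean(band_variance))         # mean of the variances across bands


# Synthetic stand-ins, not real speech or music: the same noise with and
# without a 4 Hz amplitude envelope (a crude proxy for syllable-rate fluctuation).
rng = np.random.default_rng(0)
t = np.arange(2 * 16000) / 16000.0
noise = rng.standard_normal(t.size)
steady = noise
fluctuating = noise * (0.525 + 0.475 * np.sin(2 * np.pi * 4.0 * t))
print(vmfbe(steady), vmfbe(fluctuating))
```

In this sketch the amplitude-modulated signal yields the larger value, which matches the direction of the effect the abstract describes for speech versus music. A real discriminator would still need a decision threshold or classifier trained on labeled audio, which is outside what the abstract specifies.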

Author information

Authors and Affiliations

Faculty of Electrical Engineering and Computer Science, University of Maribor, Smetanova ul. 17, 2000, Maribor, Slovenia
Marko Kos, Matej Grašič & Zdravko Kačič

Corresponding author

Correspondence to Marko Kos.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Kos, M., Grašič, M. & Kačič, Z. Online Speech/Music Segmentation Based on the Variance Mean of Filter Bank Energy. EURASIP J. Adv. Signal Process. 2009, 628570 (2009). https://doi.org/10.1155/2009/628570

Received: 06 March 2009
Revised: 04 June 2009
Accepted: 02 September 2009
Published: 22 October 2009
DOI: https://doi.org/10.1155/2009/628570
