Stochastic Feature Transformation with Divergence-Based Out-of-Handset Rejection for Robust Speaker Verification

Mak, Man-Wai; Tsang, Chi-Leung; Kung, Sun-Yuan

doi:10.1155/S1110865704308048

Research Article
Published: 21 April 2004

Man-Wai Mak¹, Chi-Leung Tsang¹ & Sun-Yuan Kung²

EURASIP Journal on Advances in Signal Processing volume 2004, Article number: 927921 (2004)

Abstract

The performance of telephone-based speaker verification systems can be severely degraded by linear and nonlinear acoustic distortion caused by telephone handsets. This paper proposes to combine a handset selector with stochastic feature transformation to reduce the distortion. Specifically, a Gaussian mixture model (GMM)-based handset selector is trained to identify the most likely handset used by the claimants, and then handset-specific stochastic feature transformations are applied to the distorted feature vectors. This paper also proposes a divergence-based handset selector with out-of-handset (OOH) rejection capability to identify the "unseen" handsets. This is achieved by measuring the Jensen difference between the selector's output and a constant vector with identical elements. The resulting handset selector is combined with the proposed feature transformation technique for telephone-based speaker verification. Experimental results based on 150 speakers of the HTIMIT corpus show that the handset selector, either with or without OOH rejection capability, is able to identify the "seen" handsets accurately (98.3% in both cases). Results also demonstrate that feature transformation performs significantly better than the classical cepstral mean normalization approach. Finally, by using the transformation parameters of the seen handsets to transform the utterances with correctly identified handsets and processing those utterances with unseen handsets by cepstral mean subtraction (CMS), verification error rates are reduced significantly (from 12.41% to 6.59% on average).
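
The out-of-handset rejection idea described in the abstract can be illustrated with a minimal sketch. It assumes the selector's output is a vector of per-handset log-likelihoods normalized to sum to one (a softmax is used here as an assumption; the abstract does not state the normalization), and it uses the standard definition of the Jensen difference, J(p, q) = H((p + q)/2) - (H(p) + H(q))/2, where H is the Shannon entropy. The function names and the threshold value are hypothetical and chosen only for illustration. A small Jensen difference means the output is close to the constant vector, so no seen handset stands out and the utterance is rejected as coming from an unseen handset.

```python
import math


def entropy(p):
    """Shannon entropy (natural log) of a discrete distribution."""
    return -sum(x * math.log(x) for x in p if x > 0.0)


def jensen_difference(p, q):
    """J(p, q) = H((p + q) / 2) - (H(p) + H(q)) / 2, zero when p == q."""
    mid = [(a + b) / 2.0 for a, b in zip(p, q)]
    return entropy(mid) - 0.5 * (entropy(p) + entropy(q))


def select_handset(log_likelihoods, threshold):
    """Return the index of the most likely handset, or None to reject.

    Assumption: log-likelihoods are turned into a normalized score
    vector with a softmax; the threshold is a hypothetical tuning value.
    """
    m = max(log_likelihoods)
    weights = [math.exp(v - m) for v in log_likelihoods]  # stable softmax
    total = sum(weights)
    scores = [w / total for w in weights]

    # Constant reference vector with identical elements.
    uniform = [1.0 / len(scores)] * len(scores)

    # Near-uniform output: no handset model fits well, so reject (OOH).
    if jensen_difference(scores, uniform) < threshold:
        return None
    return max(range(len(scores)), key=scores.__getitem__)
```

With a hypothetical threshold of 0.1, `select_handset([-10.0, -50.0, -52.0], 0.1)` picks handset 0 because one model clearly dominates, while `select_handset([-20.0, -20.1, -19.9], 0.1)` returns `None` because the scores are nearly uniform. Following the abstract, a rejected utterance would then be processed by cepstral mean subtraction instead of a handset-specific transformation.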

Author information

Authors and Affiliations

Centre for Multimedia Signal Processing, Department of Electronic and Information Engineering, The Hong Kong Polytechnic University, Hung Hom, Hong Kong
Man-Wai Mak & Chi-Leung Tsang
Department of Electrical Engineering, Princeton University, NJ, 08544, USA
Sun-Yuan Kung

Corresponding author

Correspondence to Man-Wai Mak.

About this article

Cite this article

Mak, MW., Tsang, CL. & Kung, SY. Stochastic Feature Transformation with Divergence-Based Out-of-Handset Rejection for Robust Speaker Verification. EURASIP J. Adv. Signal Process. 2004, 927921 (2004). https://doi.org/10.1155/S1110865704308048

Received: 07 October 2002
Revised: 20 June 2003
Published: 21 April 2004
DOI: https://doi.org/10.1155/S1110865704308048