Open Access

A Robust Formant Extraction Algorithm Combining Spectral Peak Picking and Root Polishing

  • Chanwoo Kim (1),
  • Kwang-deok Seo (2) and
  • Wonyong Sung (3)
EURASIP Journal on Advances in Signal Processing 2006, 2006:067960

https://doi.org/10.1155/ASP/2006/67960

Received: 22 September 2004

Accepted: 22 August 2005

Published: 21 February 2006

Abstract

We propose a robust formant extraction algorithm that combines spectral peak picking, examination of formant locations to check for peak mergers, and root extraction. Spectral peak picking locates the formant candidates, and root extraction resolves the peak merger problem. The locations of the extracted formants and the distances between them are used to detect suspected peak mergers efficiently. The proposed algorithm requires little computation and is shown to outperform previous formant extraction algorithms in extensive tests on the TIMIT speech database.
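The two-stage idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the 400 Hz bandwidth cutoff, and the merger test (fewer peaks than expected) are all assumptions made for illustration, whereas the paper's merger check uses the locations of and distances between formants. In addition, `numpy.roots` solves for all roots at once and stands in here for the paper's cheaper root method.

```python
import numpy as np

def lpc_from_formants(formants_hz, bandwidths_hz, fs):
    """Hypothetical helper: build A(z) with one pole pair per formant."""
    a = np.array([1.0])
    for f, bw in zip(formants_hz, bandwidths_hz):
        r = np.exp(-np.pi * bw / fs)          # pole radius from bandwidth
        theta = 2.0 * np.pi * f / fs          # pole angle from frequency
        a = np.convolve(a, [1.0, -2.0 * r * np.cos(theta), r * r])
    return a

def pick_spectral_peaks(a, fs, n_fft=512):
    """Return frequencies (Hz) of local maxima of the envelope 1/|A(e^jw)|."""
    spectrum = 1.0 / np.abs(np.fft.rfft(a, n_fft))
    idx = [k for k in range(1, len(spectrum) - 1)
           if spectrum[k - 1] < spectrum[k] >= spectrum[k + 1]]
    return [k * fs / n_fft for k in idx]

def formants_from_roots(a, fs, max_bw_hz=400.0):
    """Return sorted frequencies (Hz) of the narrow-band roots of A(z)."""
    roots = np.roots(a)
    roots = roots[roots.imag > 0]             # keep one root of each conjugate pair
    freqs = np.angle(roots) * fs / (2.0 * np.pi)
    bws = -np.log(np.abs(roots)) * fs / np.pi
    return sorted(float(f) for f, b in zip(freqs, bws) if b < max_bw_hz)

def extract_formants(a, fs, n_expected=3):
    """Peak picking first; fall back to roots when a merger is suspected."""
    peaks = pick_spectral_peaks(a, fs)
    if len(peaks) >= n_expected:              # assumed merger test, not the paper's
        return peaks[:n_expected]
    return formants_from_roots(a, fs)[:n_expected]

if __name__ == "__main__":
    fs = 8000
    # Two close, wide resonances merge into a single spectral peak.
    a = lpc_from_formants([1000, 1100, 2500], [150, 150, 60], fs)
    print("peaks:", pick_spectral_peaks(a, fs))
    print("formants:", extract_formants(a, fs))
```

In this sketch the expensive root solve runs only on frames where peak picking returns too few candidates, which reflects the low-computation motivation stated in the abstract.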

[1-16]

Authors’ Affiliations

(1)
School of Computer Science, Carnegie Mellon University
(2)
Computer and Telecommunications Engineering Division, Yonsei University
(3)
School of Electrical Engineering and Computer Science, Seoul National University

References

  1. Rabiner LR, Schafer RW: Digital Processing of Speech Signals. Prentice-Hall, Englewood Cliffs, NJ, USA; 1978.
  2. Snell RC, Milinazzo F: Formant location from LPC analysis data. IEEE Transactions on Speech and Audio Processing 1993, 1(2):129-134. 10.1109/89.222882
  3. McCandless SS: An algorithm for automatic formant extraction using linear prediction spectra. IEEE Transactions on Acoustics, Speech, and Signal Processing 1974, 22(2):135-141. 10.1109/TASSP.1974.1162559
  4. Welling L, Ney H: Formant estimation for speech recognition. IEEE Transactions on Speech and Audio Processing 1998, 6(1):36-48. 10.1109/89.650308
  5. Deller JR Jr., Proakis JG, Hansen JHL: Discrete-Time Processing of Speech Signals. Macmillan, New York, NY, USA; 1993.
  6. Garofolo JS, Lamel LF, Fisher WM, Fiscus JG, Pallett DS, Dahlgren NL: DARPA TIMIT acoustic-phonetic continuous speech corpus. In Tech. Rep. NISTIR 4930. U.S. Department of Commerce, National Institute of Standards and Technology, Gaithersburg, Md, USA; 1993.
  7. Peterson GE, Barney HL: Control methods used in a study of the vowels. Journal of the Acoustical Society of America 1952, 24(2):175-194. 10.1121/1.1906875
  8. Kim C, Sung W: Vowel pronunciation accuracy checking system based on phoneme segmentation and formants extraction. Proceedings of International Conference on Speech Processing, August 2001, Daejeon, Korea 447-452.
  9. Markel JD: Digital inverse filtering: a new tool for formant trajectory estimation. IEEE Transactions on Audio and Electroacoustics 1972, 20(2):129-137. 10.1109/TAU.1972.1162367
  10. Atal BS, Hanauer SL: Speech analysis and synthesis by linear prediction of the speech wave. Journal of the Acoustical Society of America 1971, 50(2B):637-655. 10.1121/1.1912679
  11. Kang GS, Coulter DC: 600 bits per second voice digitizer (linear predictive formant vocoder). Naval Research Laboratory Report 8043 November 1976.
  12. Bell CG, Fujisaki H, Heinz JM, Stevens KN, House AS: Reduction of speech spectra by analysis-by-synthesis techniques. Journal of the Acoustical Society of America 1961, 33(12):1725-1736. 10.1121/1.1908556
  13. Press WH, Teukolsky SA, Vetterling WT, Flannery BP: Numerical Recipes in C. Cambridge University Press, Cambridge, UK; 1992. p. 376.
  14. Burden RL, Faires JD: Numerical Analysis. Brooks/Cole, Pacific Grove, Calif, USA; 1997.
  15. Dunn HK: Methods of measuring vowel formant bandwidths. Journal of the Acoustical Society of America 1961, 33(12):1737-1746. 10.1121/1.1908558
  16. WaveSurfer. Center for Speech Technology (CTT) at KTH, Stockholm, Sweden. Available at http://www.speech.kth.se/wavesurfer/

Copyright

© Kim et al. 2006