Open Access

A New Method to Represent Speech Signals Via Predefined Signature and Envelope Sequences

EURASIP Journal on Advances in Signal Processing 2006, 2007:056382

https://doi.org/10.1155/2007/56382

Received: 3 June 2005

Accepted: 30 April 2006

Published: 7 September 2006

Abstract

A novel systematic procedure referred to as "SYMPES" to model speech signals is introduced. The structure of SYMPES is based on the creation of the so-called predefined "signature and envelope" sets. These sets are speaker and language independent. Once the speech signals are divided into frames of selected lengths, each frame sequence $X_i$ is reconstructed by means of the mathematical form $X_i \approx C_i E_K S_R$. In this representation, $C_i$ is called the gain factor, and $S_R$ and $E_K$ are properly assigned from the predefined signature and envelope sets, respectively. Examples are given to exhibit the implementation of SYMPES. It is shown that, for the same or a better compression ratio, SYMPES yields considerably better speech quality than commercially available coders such as G.726 (ADPCM) at 16 kbps and voice-excited LPC-10E (FS1015) at 2.4 kbps.
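
The frame-wise model summarized in the abstract can be illustrated with a minimal sketch. It assumes that each frame is approximated by a scalar gain times the element-wise product of one envelope sequence and one signature sequence, and that the pair is picked by exhaustive least-squares search. The toy sets, the function names, and the search strategy are illustrative assumptions only; they are not the authors' procedure for building or searching the predefined sets.

```python
import math


def split_into_frames(signal, frame_len):
    """Split a list of samples into non-overlapping frames (short tail dropped)."""
    n_frames = len(signal) // frame_len
    return [signal[i * frame_len:(i + 1) * frame_len] for i in range(n_frames)]


def best_fit(frame, signatures, envelopes):
    """Return (gain, k, r) minimizing ||frame - gain * envelopes[k] * signatures[r]||^2.

    Hypothetical exhaustive search: every envelope/signature pair is tried and
    the gain is the closed-form least-squares solution for that pair.
    """
    best = None
    for k, env in enumerate(envelopes):
        for r, sig in enumerate(signatures):
            basis = [e * s for e, s in zip(env, sig)]  # element-wise product
            energy = sum(b * b for b in basis)
            if energy == 0.0:
                continue  # an all-zero basis cannot represent anything
            gain = sum(x * b for x, b in zip(frame, basis)) / energy
            err = sum((x - gain * b) ** 2 for x, b in zip(frame, basis))
            if best is None or err < best[0]:
                best = (err, gain, k, r)
    if best is None:
        raise ValueError("no usable envelope/signature pair")
    return best[1], best[2], best[3]


def reconstruct(gain, k, r, signatures, envelopes):
    """Rebuild one frame from its gain and the two set indices."""
    return [gain * e * s for e, s in zip(envelopes[k], signatures[r])]


# Toy predefined sets (assumed for illustration, frame length 8).
FRAME_LEN = 8
signatures = [
    [math.sin(2 * math.pi * n / FRAME_LEN) for n in range(FRAME_LEN)],
    [math.cos(2 * math.pi * n / FRAME_LEN) for n in range(FRAME_LEN)],
]
envelopes = [
    [1.0] * FRAME_LEN,                                # flat envelope
    [(n + 1) / FRAME_LEN for n in range(FRAME_LEN)],  # rising ramp
]

# A frame built from the ramp envelope and the cosine signature with gain 3.
frame = [3.0 * e * s for e, s in zip(envelopes[1], signatures[1])]
gain, k, r = best_fit(frame, signatures, envelopes)
approx = reconstruct(gain, k, r, signatures, envelopes)
print(gain, k, r)
```

In this sketch only the triple (gain, envelope index, signature index) would need to be stored per frame, which is the source of the compression the abstract refers to; how the real sets are generated and indexed is outside what this section states.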

Keywords

Information Technology; Quantum Information; Mathematical Form; Compression Ratio; Speech Signal

[1–33]

Authors’ Affiliations

(1) Department of Electronics Engineering, Engineering Faculty, Işık University, Şile, Istanbul, Turkey
(2) SRI-International, Speech Technology and Research (STAR) Laboratory, Menlo Park, USA
(3) Department of Electronics Engineering, Engineering Faculty, Işık University, Şile, Istanbul, Turkey
(4) Department of Physical Electronics, Graduate School of Science and Technology, Tokyo Institute of Technology, Meguro-ku, Tokyo, Japan

References

  1. Spanias AS: Speech coding: a tutorial review. Proceedings of the IEEE 1994, 82(10):1541-1582. 10.1109/5.326413
  2. Watanabe S: Karhunen-Loeve expansion and factor analysis; theoretical remarks and applications. In Transactions of the 4th Prague Conference on Information Theory, Statistical Decision Functions and Random Processes, 1965, Prague, Czech Republic. Czechoslovak Academy of Sciences; 635-660.
  3. Varile G, Zampolli A: Survey of the State of the Art in Human Language Technology. Cambridge University Press, Cambridge, UK; 1998. Chapter 10.2: Transmission and Storage (B. S. Atal and N. S. Jayant).
  4. Karaş AM, Yarman BS: A new approach for representing discrete signal waveforms via private signature base sequences. Proceedings of the IEEE European Conference on Circuit Theory and Design, August 1995, Istanbul, Turkey 875-878.
  5. Karaş AM: Characterization of electrical signals by using signature base functions, Ph.D. thesis. Department of Electrical and Computer Engineering, Institute of Science, Istanbul University, Istanbul, Turkey; January 1997. Advisor: Professor B. S. Yarman.
  6. Akdeniz R, Yarman BS: Turkish speech coding by signature base sequences. Proceedings of the International Conference on Signal Processing Applications & Technology (ICSPAT '98), September 1998, Toronto, Canada 1291-1294.
  7. Akdeniz R, Yarman BS: A novel method to represent speech signals. Signal Processing 2005, 85(1):37-50. 10.1016/j.sigpro.2004.08.012
  8. Hotelling H: Analysis of a complex of statistical variables into principal components. Journal of Educational Psychology 1933, 24(6):417-498.
  9. Oja E: A simplified neuron model as a principal component analyzer. Journal of Mathematical Biology 1982, 15(3):267-273. 10.1007/BF00275687
  10. Jolliffe IT: Principal Component Analysis, Springer Series in Statistics. Springer, New York, NY, USA; 1933.
  11. Akansu AN, Haddad RA: Multiresolution Signal Decomposition. Academic Press, San Diego, Calif, USA; 1992.
  12. Fukunaga K: Introduction to Statistical Pattern Recognition. Academic Press, London, UK; 1990.
  13. Newman AJ: Model reduction via the Karhunen Loeve expansion part I: an exposition. In Tech. Rep. ISR T.R.96-32. Institute of Systems Research, College Park, Md, USA; April 1996.
  14. Strang G: Linear Algebra and Its Applications. Academic Press, New York, NY, USA; 1980.
  15. Güz Ü: A new approach in the determination of optimum signature base functions for Turkish speech, Ph.D. thesis. Department of Electrical and Computer Engineering, Institute of Science, Istanbul University, Istanbul, Turkey; 2002. Advisor: Professor B. S. Yarman.
  16. Güz Ü, Yarman BS, Gürkan H: A new method to represent speech signals via predefined functional bases. Proceedings of the IEEE European Conference on Circuit Theory and Design, August 2001, Espoo, Finland 2: 5-8.
  17. Güz Ü, Gürkan H, Yarman BS: A novel method to represent the speech signals by using language and speaker independent predefined functions sets. Proceedings of the IEEE International Symposium on Circuits and Systems, May 2004, Vancouver, BC, Canada 3: 457-460.
  18. IPA: Handbook of the International Phonetic Association: A Guide to the Use of the International Phonetic Alphabet. Cambridge University Press, Cambridge, UK; 1999.
  19. Pearson K: On lines and planes of closest fit to systems of points in space. Philosophical Magazine 1901, 2(11):559-572.
  20. Linde Y, Buzo A, Gray RM: An algorithm for vector quantizer design. IEEE Transactions on Communications 1980, 28(1):84-95. 10.1109/TCOM.1980.1094577
  21. OGI Multi-Language Telephone Speech Corpus, CD-ROM, Linguistic Data Consortium.
  22. Quackenbush SR, Barnwell TP, Clements MA: Objective Measures of Speech Quality. Prentice Hall, Englewood Cliffs, NJ, USA; 1988.
  23. Garofolo JS, Lamel LF, Fisher WM, Fiscus JG, Pallett DS, Dahlgren NL: DARPA TIMIT acoustic phonetic speech corpus. In Tech. Rep. NISTIR 4930. U.S. Department of Commerce, NIST, Computer Systems Laboratory, Washington, DC, USA; 1993.
  24. ITU-T Recommendation G.726; 40, 32, 24, 16 kbit/s ADPCM, Geneva, (12/90).
  25. ITU-T Appendix III to ITU-T Recommendation G.726; General aspects of digital transmission systems-comparison of ADPCM algorithms, Geneva, (05/94).
  26. ITU-T Recommendation P.861; Series P: Telephone transmission quality methods for objective and subjective assessment of quality-objective quality measurement of telephone band (300-3400 Hz) speech codecs, Geneva, (08/96).
  27. ITU-T Recommendation P.830; Telephone transmission quality methods for objective and subjective assessment of quality-subjective performance assessment of telephone-band and wideband digital codecs, Geneva, (02/96).
  28. Voiers WD: Methods of predicting user acceptance of voice communication systems. Final Report DCA100-74-C-0056 July 1976.
  29. ITU-T Recommendation P.800; Series P: Telephone transmission quality methods for objective and subjective assessment of quality-methods for subjective determination of transmission quality, Geneva, (08/96).
  30. ITU-T Recommendation G.729; Coding of speech at 8 kbit/s using CS-ACELP.
  31. Güz Ü, Gürkan H, Yarman BS: A new speech signal modeling and word recognition method by using signature and envelope feature spaces. Proceedings of the IEEE European Conference on Circuit Theory and Design, September 2003, Cracow, Poland 3: 161-164.
  32. Yarman BS, Gürkan H, Güz Ü, Aygün B: A new modeling method of the ECG signals based on the use of an optimized predefined functional database. Acta Cardiologica - An International Journal of Cardiology 2003, 58(3):59-61.
  33. Gürkan H, Güz Ü, Yarman BS: A novel representation method for electromyogram (EMG) signal with predefined signature and envelope functional bank. Proceedings of the IEEE International Symposium on Circuits and Systems, May 2004, Vancouver, BC, Canada 4: 69-72.

Copyright

© Ümit Güz et al. 2007

This article is published under license to BioMed Central Ltd. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.