A New Method to Represent Speech Signals Via Predefined Signature and Envelope Sequences

Abstract

A novel systematic procedure, referred to as "SYMPES", is introduced to model speech signals. SYMPES is built on predefined "signature" and "envelope" sets, which are speaker and language independent. Once a speech signal is divided into frames of a selected length, each frame sequence is reconstructed by means of a mathematical form that combines a gain factor with one sequence assigned from the predefined signature set and one assigned from the predefined envelope set. Examples are given to demonstrate the implementation of SYMPES. It is shown that, for the same or a better compression ratio, SYMPES yields considerably better speech quality than commercially available coders such as G.726 (ADPCM) at 16 kbps and voice-excited LPC-10E (FS1015) at 2.4 kbps.
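
The frame-wise reconstruction described in the abstract can be illustrated with a minimal Python sketch. The exact mathematical form is not reproduced in this text, so the sketch assumes, purely for illustration, that a frame is approximated sample by sample as gain × envelope × signature. The function names, the exhaustive search over the two sets, and the least-squares choice of the gain are likewise assumptions and are not taken from the paper.

```python
from typing import List, Sequence, Tuple

Frame = List[float]


def split_into_frames(signal: Sequence[float], frame_len: int) -> List[Frame]:
    """Cut the signal into consecutive frames; a trailing partial frame is dropped."""
    n_frames = len(signal) // frame_len
    return [list(signal[i * frame_len:(i + 1) * frame_len]) for i in range(n_frames)]


def fit_frame(
    frame: Sequence[float],
    signatures: Sequence[Sequence[float]],
    envelopes: Sequence[Sequence[float]],
) -> Tuple[float, int, int]:
    """Return (gain, signature index, envelope index) with the smallest squared error.

    Assumed model (hypothetical): frame[n] ~ gain * envelope[n] * signature[n].
    """
    best = (0.0, 0, 0)
    best_err = float("inf")
    for r, sig in enumerate(signatures):
        for k, env in enumerate(envelopes):
            basis = [e * s for e, s in zip(env, sig)]
            energy = sum(b * b for b in basis)
            if energy == 0.0:
                continue  # an all-zero basis cannot represent anything
            # Least-squares gain for this (signature, envelope) pair.
            gain = sum(x * b for x, b in zip(frame, basis)) / energy
            err = sum((x - gain * b) ** 2 for x, b in zip(frame, basis))
            if err < best_err:
                best_err = err
                best = (gain, r, k)
    return best


def reconstruct_frame(gain: float, sig: Sequence[float], env: Sequence[float]) -> Frame:
    """Rebuild a frame from its gain and the selected signature and envelope."""
    return [gain * e * s for e, s in zip(env, sig)]


if __name__ == "__main__":
    # Toy predefined sets; real sets would be derived from training speech.
    signatures = [[1.0, -1.0, 1.0, -1.0], [1.0, 1.0, -1.0, -1.0]]
    envelopes = [[1.0, 1.0, 1.0, 1.0], [1.0, 0.5, 0.25, 0.125]]
    signal = [2.0, 1.0, -0.5, -0.25, 3.0, -3.0, 3.0, -3.0]

    for frame in split_into_frames(signal, 4):
        gain, r, k = fit_frame(frame, signatures, envelopes)
        print(gain, r, k, reconstruct_frame(gain, signatures[r], envelopes[k]))
```

Under this assumed model, each frame is described by only three values: one gain and two set indices. This is the kind of compact description that makes a high compression ratio possible once both sets are stored at the encoder and the decoder.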

Author information

Corresponding author

Correspondence to Ümit Güz.

About this article

Cite this article

Güz, Ü., Gürkan, H. & Yarman, B.S. A New Method to Represent Speech Signals Via Predefined Signature and Envelope Sequences. EURASIP J. Adv. Signal Process. 2007, 056382 (2006). https://doi.org/10.1155/2007/56382

Keywords

  • Information Technology
  • Quantum Information
  • Mathematical Form
  • Compression Ratio
  • Speech Signal