- Research Article
- Open access
A New Method to Represent Speech Signals Via Predefined Signature and Envelope Sequences
EURASIP Journal on Advances in Signal Processing volume 2007, Article number: 056382 (2006)
Abstract
A novel systematic procedure, referred to as "SYMPES," is introduced to model speech signals. SYMPES is built on predefined "signature" and "envelope" sets, which are speaker and language independent. Once the speech signal is divided into frames of a selected length, each frame sequence $X_i$ is reconstructed by means of the mathematical form $X_i = C_i E_K S_R$. In this representation, $C_i$ is called the gain factor, and $S_R$ and $E_K$ are assigned from the predefined signature and envelope sets, respectively. Examples are given to demonstrate the implementation of SYMPES. It is shown that, for the same compression ratio or better, SYMPES yields considerably better speech quality than commercially available coders such as G.726 (ADPCM) at 16 kbps and voice-excited LPC-10E (FS1015) at 2.4 kbps.
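The frame model described in the abstract can be illustrated with a minimal sketch. It assumes that each frame is approximated by a scalar gain times the element-wise product of one envelope sequence and one signature sequence. Everything concrete here is a hypothetical stand-in rather than the authors' procedure: the toy `signatures` and `envelopes` sets, the exhaustive search in `best_match`, and the least-squares choice of gain.

```python
import math

def best_match(frame, signatures, envelopes):
    """Exhaustive search (an assumption, not the paper's search strategy):
    return (error, gain, r, k) minimising
    sum_n (frame[n] - gain * envelopes[k][n] * signatures[r][n])**2."""
    best = None
    for r, sig in enumerate(signatures):
        for k, env in enumerate(envelopes):
            # Element-wise product of envelope and signature.
            basis = [e * s for e, s in zip(env, sig)]
            energy = sum(b * b for b in basis)
            if energy == 0.0:
                continue  # degenerate pair, no usable gain
            # Least-squares optimal scalar gain for this pair.
            gain = sum(x * b for x, b in zip(frame, basis)) / energy
            err = sum((x - gain * b) ** 2 for x, b in zip(frame, basis))
            if best is None or err < best[0]:
                best = (err, gain, r, k)
    return best

def reconstruct(gain, sig, env):
    """Rebuild a frame from a gain, a signature, and an envelope."""
    return [gain * e * s for e, s in zip(env, sig)]

# Toy predefined sets (hypothetical; real sets would be derived from speech data).
N = 8
signatures = [
    [math.sin(2 * math.pi * n / N) for n in range(N)],
    [math.cos(2 * math.pi * n / N) for n in range(N)],
]
envelopes = [
    [1.0] * N,                        # flat envelope
    [1.0 - n / N for n in range(N)],  # linearly decaying envelope
]

# A synthetic frame built from signature 1 and envelope 1 with gain 0.5.
frame = reconstruct(0.5, signatures[1], envelopes[1])
err, gain, r, k = best_match(frame, signatures, envelopes)
approx = reconstruct(gain, signatures[r], envelopes[k])
```

In this sketch a frame of `N` samples is reduced to one gain value and two indices. That reduction is where a compression gain would come from under this model, provided both ends hold the same predefined sets.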
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License ( https://creativecommons.org/licenses/by/2.0 ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
About this article
Cite this article
Güz, Ü., Gürkan, H. & Yarman, B.S. A New Method to Represent Speech Signals Via Predefined Signature and Envelope Sequences. EURASIP J. Adv. Signal Process. 2007, 056382 (2006). https://doi.org/10.1155/2007/56382
DOI: https://doi.org/10.1155/2007/56382