  • Research Article
  • Open Access

A Bit Stream Scalable Speech/Audio Coder Combining Enhanced Regular Pulse Excitation and Parametric Coding

EURASIP Journal on Advances in Signal Processing 2007, 2007:074064

https://doi.org/10.1155/2007/74064

  • Received: 2 October 2006
  • Accepted: 29 June 2007
  • Published:

Abstract

This paper introduces a new broadband audio and speech coding technique that combines a pulse excitation coder with a standardized parametric coder, namely the MPEG-4 high-quality parametric coder. After presenting a series of enhancements that make regular pulse excitation (RPE) suitable for modeling broadband signals, the paper shows how pulse coding and parametric coding complement each other and how they can be merged into a layered, bit stream scalable coder able to operate at different points in the quality versus bit rate plane. The performance of the proposed coder is evaluated in a listening test. The major result is that the extra functionality of bit stream scalability does not come at the price of reduced performance: the coder is competitive with standardized coders (MP3, AAC, SSC).

Keywords

  • Information Technology
  • Quantum Information
  • Major Result
  • Pulse Excitation
  • Standardize Coder

[1–44]

Authors’ Affiliations

(1) Philips Research Laboratories, Digital Signal Processing Group, Prof. Holstlaan 4, 5656 AA Eindhoven, The Netherlands
(2) Department of Mathematics and Informatics, University of the Balearic Islands, Carretera de Valldemossa km 7.5, 07122 Palma de Mallorca, Spain

References

  1. Noll P: MPEG digital audio coding. IEEE Signal Processing Magazine 1997, 14(5):59-81. 10.1109/79.618009
  2. Edler B, Purnhagen H, Ferekidis C: ASAC - Analysis/synthesis codec for very low bit rates. The 100th AES Convention, April 1996, Copenhagen, Denmark 4179.
  3. Serra X: Musical sound modeling with sinusoids plus noise. In Musical Signal Processing. Edited by: Roads C, Pope S, Picialli A, Poli GD. Swets & Zeitlinger, Lisse, The Netherlands; 1997:91-122.
  4. Audio Subgroup: Report on the verification test of MPEG-4 parametric coding for high-quality audio. 2004.
  5. van Schijndel NH, van de Par S: Rate-distortion optimized hybrid sound coding. Proceedings of IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA '05), October 2005, New Paltz, NY, USA 235-238.
  6. Myburg FP: Design of a scalable parametric audio coder, Ph.D. dissertation.
  7. Verma TS, Meng THY: 6Kbps to 85Kbps scalable audio coder. Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '00), June 2000, Istanbul, Turkey 2: 877-880.
  8. Kleijn W, Paliwal KK (Eds): Speech Coding and Synthesis. Elsevier, Amsterdam, The Netherlands; 1995.
  9. Ramprashad SA: The multimode transform predictive coding paradigm. IEEE Transactions on Speech and Audio Processing 2003, 11(2):117-129. 10.1109/TSA.2003.809195
  10. Salami R, Lefebvre R, Lakaniemi A, Kontola K, Bruhn S, Taleb A: Extended AMR-WB for high-quality audio on mobile devices. IEEE Communications Magazine 2006, 44(5):90-97.
  11. Makhoul J: Linear prediction: a tutorial review. Proceedings of the IEEE 1975, 63(4):561-580.
  12. Markel JD, Gray AH Jr.: Linear Prediction of Speech. Springer, Berlin, Germany; 1976.
  13. Strube HW: Linear prediction on a warped frequency scale. The Journal of the Acoustical Society of America 1980, 68(4):1071-1076. 10.1121/1.384992
  14. den Brinker AC, Voitishchuk V, van Eijndhoven SJL: IIR-based pure linear prediction. IEEE Transactions on Speech and Audio Processing 2004, 12(1):68-75. 10.1109/TSA.2003.815524
  15. Härmä A, Karjalainen M, Savioja L, Välimäki V, Laine UK, Huopaniemi J: Frequency-warped signal processing for audio applications. Journal of the Audio Engineering Society 2000, 48(11):1011-1031.
  16. Oppenheim AV, Johnson DH, Steiglitz K: Computation of spectra with unequal resolution using the fast Fourier transform. Proceedings of the IEEE 1971, 59(2):299-301.
  17. Kroon P, Deprettere E, Sluyter R: Regular-pulse excitation - a novel approach to effective and efficient multipulse coding of speech. IEEE Transactions on Acoustics, Speech, and Signal Processing 1986, 34(5):1054-1063. 10.1109/TASSP.1986.1164946
  18. Atal BS, Remde JR: A new model of LPC excitation for producing natural-sounding speech at low bit rates. Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '82), May 1982, Paris, France 614-617.
  19. Schroeder MR, Atal BS: Code-excited linear prediction (CELP): high-quality speech at very low bit rates. Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '85), April 1985, Tampa, Fla, USA 937-940.
  20. Adoul J-P, Mabilleau P, Delprat M, Morissette S: Fast CELP coding based on algebraic codes. Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '87), April 1987, Dallas, Tex, USA 1957-1960.
  21. Cox RV, Kroon P: Low bit-rate speech coders for multimedia communication. IEEE Communications Magazine 1996, 34(12):34-41. 10.1109/35.556484
  22. Bessette B, Salami R, Lefebvre R, et al.: The adaptive multirate wideband speech codec (AMR-WB). IEEE Transactions on Speech and Audio Processing 2002, 10(8):620-636. 10.1109/TSA.2002.804299
  23. Salami RA: Binary code excited linear prediction (BCELP): new approach to CELP coding of speech without codebooks. Electronics Letters 1989, 25(6):401-403. 10.1049/el:19890276
  24. Xydeas CS, Ireton MA, Baghbadrani DK: Theory and real time implementation of a CELP coder at 4.8 & 6.0 kbits/sec using ternary code excitation. Proceedings of IERE International Conference on Digital Processing of Signals in Communications, September 1990, Loughborough, UK 167-174.
  25. Singhal S: High quality audio coding using multipulse LPC. Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '90), April 1990, Albuquerque, NM, USA 2: 1101-1104.
  26. Lin X, Salami RA, Steele R: High quality audio coding using analysis-by-synthesis technique. Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '91), May 1991, Toronto, Ont, Canada 5: 3617-3620.
  27. Waters GT (Ed): Sound quality assessment material recordings for subjective tests. In Users' handbook for the EBU-SQAM Compact Disk Tech. 3253-E. Technical Centre of the European Broadcasting Union, Brussels, Belgium; 1988.
  28. Riera-Palou F, den Brinker AC, Gerrits AJ: Modelling long-term correlations in broadband speech and audio pulse coders. Electronics Letters 2005, 41(8):508-509. 10.1049/el:20058338
  29. Riera-Palou F, den Brinker AC, Gerrits AJ, Sluijter RJ: Improved optimisation of excitation sequences in speech and audio coders. Electronics Letters 2004, 40(8):515-517. 10.1049/el:20040321
  30. Ramachandran RP, Kabal P: Pitch prediction filters in speech coding. IEEE Transactions on Acoustics, Speech, and Signal Processing 1989, 37(4):467-478. 10.1109/29.17527
  31. Ramachandran RP, Kabal P: Stability and performance analysis of pitch filters in speech coders. IEEE Transactions on Acoustics, Speech, and Signal Processing 1987, 35(7):937-946. 10.1109/TASSP.1987.1165238
  32. Sreenivas TV: Modelling LPC-residue by components for good quality speech coding. Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '88), April 1988, New York, NY, USA 171-174.
  33. Kroon P: Time-domain coding of (near) toll quality speech at rates below 16 kb/s, Ph.D. dissertation.
  34. Kleijn WB, Krasinski DJ, Ketchum RH: Improved speech quality and efficient vector quantization in SELP. Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '88), April 1988, New York, NY, USA 155-158.
  35. Zhang S, Lockhart G: Embedded RPE based on multistage coding. IEEE Transactions on Speech and Audio Processing 1997, 5(4):367-371. 10.1109/89.593313
  36. Schuijers EGP, Oomen AWJ, den Brinker AC, Gerrits AJ: Advances in parametric coding for high-quality audio. Proceedings of IEEE Benelux Workshop on Model Based Processing and Coding of Audio, November 2002, Leuven, Belgium 73-79.
  37. Johnson JD, Ferreira AJ: Sum-difference stereo transform coding. Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '92), March 1992, San Francisco, Calif, USA 2: 569-572.
  38. Herre J, Brandenburg K, Lederer D: Intensity stereo coding. The 96th AES Convention, March 1994, Amsterdam, The Netherlands 3799.
  39. Breebaart J, van de Par S, Kohlrausch A, Schuijers E: High-quality parametric spatial audio coding at low bitrates. The 116th AES Convention, May 2004, Berlin, Germany 6072.
  40. Breebaart J, van de Par S, Kohlrausch A, Schuijers E: Parametric coding of stereo audio. EURASIP Journal on Applied Signal Processing 2005, 2005(9):1305-1322. 10.1155/ASP.2005.1305
  41. ISO/IEC: Coding of audio-visual objects - part 3: audio, AMENDMENT 2: parametric coding of high quality audio. 2004.
  42. Vaidyanathan PP: Multirate Systems And Filter Banks. Prentice-Hall, Englewood Cliffs, NJ, USA; 1993.
  43. Riera-Palou F, den Brinker AC, Gerrits AJ: A hybrid parametric-waveform approach to bit stream scalable audio coding. Proceedings of the 38th Asilomar Conference on Signals, Systems and Computers, November 2004, Pacific Grove, Calif, USA 2: 2250-2254.
  44. Amorim R: Results of 64kbps Public Listening Test. 2003. http://www.rjamorim.com/test/64test/results.html

