A Bit Stream Scalable Speech/Audio Coder Combining Enhanced Regular Pulse Excitation and Parametric Coding

Riera-Palou, Felip; den Brinker, Albertus C.

doi:10.1155/2007/74064

Research Article
Open access
Published: 01 December 2007

EURASIP Journal on Advances in Signal Processing volume 2007, Article number: 074064 (2007)

Abstract

This paper introduces a new broadband audio and speech coding technique that combines a pulse excitation coder with a standardized parametric coder, namely the MPEG-4 high-quality parametric coder. After presenting a series of enhancements that make regular pulse excitation (RPE) suitable for modeling broadband signals, it is shown how pulse and parametric coding complement each other and how they can be merged to yield a layered, bit stream scalable coder able to operate at different points in the quality versus bit rate plane. The performance of the proposed coder is evaluated in a listening test. The main result is that the added functionality of bit stream scalability does not come at the price of reduced performance, since the coder is competitive with standardized coders (MP3, AAC, SSC).

Author information

Authors and Affiliations

Philips Research Laboratories, Digital Signal Processing Group, Prof. Holstlaan 4, Eindhoven, 5656, AA, The Netherlands
Felip Riera-Palou & Albertus C. den Brinker
Department of Mathematics and Informatics, University of the Balearic Islands, Carretera de Valldemossa km 7.5, Palma de Mallorca, 07122, Spain
Felip Riera-Palou

Corresponding author

Correspondence to Felip Riera-Palou.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Riera-Palou, F., den Brinker, A.C. A Bit Stream Scalable Speech/Audio Coder Combining Enhanced Regular Pulse Excitation and Parametric Coding. EURASIP J. Adv. Signal Process. 2007, 074064 (2007). https://doi.org/10.1155/2007/74064

Received: 02 October 2006
Revised: 16 March 2007
Accepted: 29 June 2007
