- Research Article
- Open access
A Bit Stream Scalable Speech/Audio Coder Combining Enhanced Regular Pulse Excitation and Parametric Coding
EURASIP Journal on Advances in Signal Processing volume 2007, Article number: 074064 (2007)
Abstract
This paper introduces a new broadband audio and speech coding technique based on the combination of a pulse excitation coder and a standardized parametric coder, namely the MPEG-4 high-quality parametric coder. After presenting a series of enhancements to regular pulse excitation (RPE) that make it suitable for modeling broadband signals, the paper shows how pulse and parametric coding complement each other and how they can be merged to yield a layered, bit stream scalable coder able to operate at different points in the quality/bit-rate plane. The performance of the proposed coder is evaluated in a listening test. The major result is that the extra functionality of bit stream scalability does not come at the price of reduced performance, since the coder is competitive with standardized coders (MP3, AAC, SSC).
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
About this article
Cite this article
Riera-Palou, F., den Brinker, A.C. A Bit Stream Scalable Speech/Audio Coder Combining Enhanced Regular Pulse Excitation and Parametric Coding. EURASIP J. Adv. Signal Process. 2007, 074064 (2007). https://doi.org/10.1155/2007/74064
DOI: https://doi.org/10.1155/2007/74064