Bandwidth Extension of Telephone Speech Aided by Data Embedding

Sagi, Ariel; Malah, David

doi:10.1155/2007/64921

Research Article
Open access
Published: 01 December 2006

EURASIP Journal on Advances in Signal Processing, volume 2007, Article number: 064921 (2006)

Abstract

A system for bandwidth extension of telephone speech, aided by data embedding, is presented. The proposed system uses the transmitted analog narrowband speech signal as a carrier of the side information needed to carry out the bandwidth extension. The upper band of the wideband speech is reconstructed at the receiving end from two components: a synthetic wideband excitation signal, generated from the narrowband telephone speech, and a wideband spectral envelope, parametrically represented and transmitted as embedded data in the telephone speech. We propose a novel data embedding scheme in which the scalar Costa scheme is combined with an auditory masking model, allowing high-rate transparent embedding while maintaining a low bit error rate. The signal is transformed to the frequency domain via the discrete Hartley transform (DHT) and is partitioned into subbands. Data is embedded in an adaptively chosen subset of subbands by modifying the DHT coefficients. In our simulations, high-quality wideband speech was obtained from speech transmitted over a telephone line (characterized by spectral magnitude distortion, dispersion, and noise), in which side information data is transparently embedded at the rate of 600 information bits/second and with a bit error rate of approximately. In a listening test, the reconstructed wideband speech was preferred (at different degrees) over conventional telephone speech in of the test utterances.
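The embedding step described in the abstract can be illustrated with a minimal sketch: a frame is transformed with the DHT, bits are written into selected coefficients with a binary scalar Costa scheme (a scaled dithered quantizer), and the receiver recomputes the DHT and decodes. This is not the authors' implementation. The function names, the fixed coefficient indices, and the values of `delta` (quantizer step) and `alpha` (embedding strength) are assumptions chosen for illustration; in the paper the subbands are chosen adaptively and the embedding is governed by an auditory masking model, neither of which is reproduced here.

```python
import numpy as np


def dht(x):
    """Discrete Hartley transform: H[k] = sum_n x[n] * (cos + sin)(2*pi*n*k/N)."""
    X = np.fft.fft(x)
    # FFT gives sum x*(cos - i*sin), so Re(X) - Im(X) = sum x*(cos + sin).
    return X.real - X.imag


def idht(h):
    """The DHT is its own inverse up to a factor of 1/N."""
    return dht(h) / len(h)


def scs_embed(coeffs, bits, delta, alpha):
    """Binary scalar Costa scheme (illustrative parameters).

    Each coefficient is moved a fraction `alpha` of the way toward the
    nearest point of the lattice delta*Z (bit 0) or delta*Z + delta/2 (bit 1).
    """
    shift = delta * np.asarray(bits) / 2.0
    target = delta * np.round((coeffs - shift) / delta) + shift
    return coeffs + alpha * (target - coeffs)


def scs_decode(coeffs, delta):
    """Decide bit 0 if the coefficient is closer to delta*Z, otherwise bit 1."""
    residual = coeffs - delta * np.round(coeffs / delta)
    return (np.abs(residual) > delta / 4).astype(int)


# Round trip on one synthetic frame over an ideal (noise-free) channel.
rng = np.random.default_rng(0)
frame = rng.standard_normal(64)           # stand-in for a speech frame
bits = rng.integers(0, 2, size=16)        # side-information bits
idx = np.arange(8, 24)                    # hypothetical fixed coefficient subset

H = dht(frame)
H[idx] = scs_embed(H[idx], bits, delta=1.0, alpha=0.8)
marked = idht(H)                          # real-valued time-domain frame

decoded = scs_decode(dht(marked)[idx], delta=1.0)
```

One convenient property of the DHT in this setting is that it maps real signals to real coefficients, so any modified coefficient vector inverts to a real time-domain frame without a conjugate-symmetry constraint. With `alpha` below 1, part of the host signal remains in each coefficient, which trades robustness against embedding distortion.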

Author information

Authors and Affiliations

Department of Electrical Engineering, Technion, Israel Institute of Technology, Haifa, 32000, Israel
Ariel Sagi & David Malah

Corresponding author

Correspondence to Ariel Sagi.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Sagi, A., Malah, D. Bandwidth Extension of Telephone Speech Aided by Data Embedding. EURASIP J. Adv. Signal Process. 2007, 064921 (2006). https://doi.org/10.1155/2007/64921

Received: 18 February 2006
Revised: 19 July 2006
Accepted: 10 September 2006
