Bandwidth Extension of Telephone Speech Aided by Data Embedding

Abstract

A system for bandwidth extension of telephone speech, aided by data embedding, is presented. The proposed system uses the transmitted analog narrowband speech signal as a carrier of the side information needed to carry out the bandwidth extension. The upper band of the wideband speech is reconstructed at the receiving end from two components: a synthetic wideband excitation signal, generated from the narrowband telephone speech, and a wideband spectral envelope, parametrically represented and transmitted as embedded data in the telephone speech. We propose a novel data embedding scheme in which the scalar Costa scheme is combined with an auditory masking model, allowing high-rate transparent embedding while maintaining a low bit error rate. The signal is transformed to the frequency domain via the discrete Hartley transform (DHT) and is partitioned into subbands. Data is embedded in an adaptively chosen subset of subbands by modifying the DHT coefficients. In our simulations, high-quality wideband speech was obtained from speech transmitted over a telephone line (characterized by spectral magnitude distortion, dispersion, and noise), in which side information data is transparently embedded at the rate of 600 information bits/second and with a bit error rate of approximately. In a listening test, the reconstructed wideband speech was preferred (to varying degrees) over conventional telephone speech in of the test utterances.
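The embedding step described in the abstract can be illustrated with a minimal sketch, assuming NumPy and a binary scalar Costa scheme in the form given by Eggers et al. It is not the authors' implementation: the function names, the fixed subband `idx`, the step size `delta`, and the scale `alpha` are all hypothetical, and the step size is a constant here rather than being derived from an auditory masking model. The channel is also assumed to be noise-free.

```python
import numpy as np

def dht(x):
    """Discrete Hartley transform via the FFT: H[k] = Re(F[k]) - Im(F[k])."""
    F = np.fft.fft(x)
    return F.real - F.imag

def idht(H):
    """The DHT is its own inverse up to a factor of 1/N."""
    return dht(H) / len(H)

def scs_embed(x, bits, delta, alpha, dither):
    """Binary scalar Costa scheme: move x a fraction alpha toward a dithered lattice."""
    offset = delta * (bits / 2.0 + dither)   # bit 1 shifts the lattice by delta/2
    q = delta * np.round((x - offset) / delta) - (x - offset)
    return x + alpha * q

def scs_decode(r, delta, dither):
    """Decide each bit from the distance of r to the bit-0 lattice."""
    v = r - delta * dither
    y = delta * np.round(v / delta) - v
    return (np.abs(y) > delta / 4).astype(int)

# Hypothetical demo values (assumptions, not taken from the paper).
rng = np.random.default_rng(0)
frame = rng.standard_normal(64)              # stand-in for one speech frame
idx = np.arange(8, 24)                       # fixed subband; the paper picks subbands adaptively
bits = rng.integers(0, 2, size=idx.size)     # side-information bits to embed
dither = rng.random(idx.size)                # key-dependent dither shared by both ends
delta, alpha = 0.5, 0.8

H = dht(frame)
H_marked = H.copy()
H_marked[idx] = scs_embed(H[idx], bits, delta, alpha, dither)
marked = idht(H_marked)                      # real-valued time-domain frame carrying the data

decoded = scs_decode(dht(marked)[idx], delta, dither)
```

Because the DHT maps real sequences to real sequences, any modified coefficient vector still inverts to a real signal, so no conjugate-symmetry bookkeeping is needed in this sketch. With `alpha` above 0.5 and no channel noise, the decoder recovers the embedded bits exactly; a real telephone channel would add distortion and noise that this example does not model.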

Author information

Corresponding author

Correspondence to Ariel Sagi.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Sagi, A., Malah, D. Bandwidth Extension of Telephone Speech Aided by Data Embedding. EURASIP J. Adv. Signal Process. 2007, 064921 (2006). https://doi.org/10.1155/2007/64921

Keywords

  • Side Information
  • Listening Test
  • Spectral Envelope
  • Transmitted Analog
  • Test Utterance