  • Research Article
  • Open Access

Bandwidth Extension of Telephone Speech Aided by Data Embedding

EURASIP Journal on Advances in Signal Processing 2006, 2007:064921

https://doi.org/10.1155/2007/64921

  • Received: 18 February 2006
  • Accepted: 10 September 2006
  • Published:

Abstract

A system for bandwidth extension of telephone speech, aided by data embedding, is presented. The proposed system uses the transmitted analog narrowband speech signal as a carrier of the side information needed to carry out the bandwidth extension. The upper band of the wideband speech is reconstructed at the receiving end from two components: a synthetic wideband excitation signal, generated from the narrowband telephone speech, and a wideband spectral envelope, parametrically represented and transmitted as embedded data in the telephone speech. We propose a novel data embedding scheme in which the scalar Costa scheme is combined with an auditory masking model, allowing high-rate transparent embedding while maintaining a low bit error rate. The signal is transformed to the frequency domain via the discrete Hartley transform (DHT) and is partitioned into subbands. Data is embedded in an adaptively chosen subset of subbands by modifying the DHT coefficients. In our simulations, high-quality wideband speech was obtained from speech transmitted over a telephone line (characterized by spectral magnitude distortion, dispersion, and noise), in which side information data is transparently embedded at the rate of 600 information bits/second and with a bit error rate of approximately . In a listening test, the reconstructed wideband speech was preferred (to different degrees) over conventional telephone speech in of the test utterances.
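The embedding idea summarized in the abstract can be illustrated with a minimal sketch: transform a frame with the DHT, embed one bit per selected coefficient using the binary scalar Costa scheme (dithered quantization with a step size and a scale factor), and invert the transform. Everything specific here is an assumption made for illustration only: the step size `delta`, the scale factor `alpha`, the fixed list of coefficient indices `bins`, the zero dither key, and the toy input frame. The adaptive subband selection, the auditory masking model, and the channel coding described in the article are not reproduced.

```python
import math


def dht(x):
    """Direct O(N^2) discrete Hartley transform: H[k] = sum_n x[n] * cas(2*pi*n*k/N)."""
    n = len(x)
    out = []
    for k in range(n):
        acc = 0.0
        for i in range(n):
            angle = 2.0 * math.pi * i * k / n
            acc += x[i] * (math.cos(angle) + math.sin(angle))  # cas = cos + sin
        out.append(acc)
    return out


def idht(h):
    """The DHT is its own inverse up to a factor of 1/N."""
    n = len(h)
    return [v / n for v in dht(h)]


def quantize(v, delta):
    """Uniform scalar quantizer with step delta (round half up)."""
    return delta * math.floor(v / delta + 0.5)


def scs_embed(x, bit, delta, alpha, key=0.0):
    """Binary scalar Costa scheme: move x a fraction alpha toward the
    shifted lattice that represents `bit`."""
    shifted = x - delta * (bit / 2.0 + key)
    q = quantize(shifted, delta) - shifted
    return x + alpha * q


def scs_decode(r, delta, key=0.0):
    """Hard decision: near 0 means bit 0, near +/- delta/2 means bit 1."""
    v = r - delta * key
    y = quantize(v, delta) - v
    return 0 if abs(y) < delta / 4.0 else 1


def embed_frame(frame, bits, bins, delta, alpha):
    """Embed one bit in each listed DHT coefficient and return the time-domain frame."""
    coeffs = dht(frame)
    for bit, k in zip(bits, bins):
        coeffs[k] = scs_embed(coeffs[k], bit, delta, alpha)
    return idht(coeffs)


def extract_frame(frame, bins, delta):
    """Recompute the DHT at the receiver and decode the listed coefficients."""
    coeffs = dht(frame)
    return [scs_decode(coeffs[k], delta) for k in bins]


# Toy host frame and payload (hypothetical values, not from the article).
frame = [math.sin(0.3 * n) + 0.5 * math.cos(1.1 * n) for n in range(16)]
bits = [1, 0, 1, 1]
bins = [3, 4, 5, 6]
marked = embed_frame(frame, bits, bins, delta=0.5, alpha=0.7)
recovered = extract_frame(marked, bins, delta=0.5)
```

Because the DHT maps real sequences to real sequences, any modification of the coefficients still yields a real time-domain frame, which is one reason it is convenient for this kind of embedding. In this noise-free sketch the residual offset from the chosen lattice is at most (1 - alpha) * delta / 2, which stays below the delta / 4 decision threshold whenever alpha exceeds 0.5, so `recovered` equals `bits`.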

Keywords

  • Side Information
  • Listening Test
  • Spectral Envelope
  • Transmitted Analog
  • Test Utterance

[1-32]

Authors’ Affiliations

(1) Department of Electrical Engineering, Technion, Israel Institute of Technology, Haifa, 32000, Israel

References

  1. Voran S: Listener ratings of speech passbands. Proceedings of the IEEE Workshop on Speech Coding for Telecommunications, 1997, Pocono Manor, Pa, USA 81-82.
  2. Jax P, Vary P: Bandwidth extension of speech signals: a catalyst for the introduction of wideband speech coding? IEEE Communications Magazine 2006, 44(5):106-111.
  3. Eggers JJ, Bäuml R, Tzschoppe R, Girod B: Scalar Costa scheme for information embedding. IEEE Transactions on Signal Processing 2003, 51(4):1003-1019. 10.1109/TSP.2003.809366
  4. Costa MHM: Writing on dirty paper. IEEE Transactions on Information Theory 1983, 29(3):439-441. 10.1109/TIT.1983.1056659
  5. Cheng Q, Sorensen J: Spread spectrum signaling for speech watermarking. Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP '01), May 2001, Salt Lake, Utah, USA 3: 1337-1340.
  6. Swanson MD, Zhu B, Tewfik AH, Boney L: Robust audio watermarking using perceptual masking. Signal Processing 1998, 66(3):337-355. 10.1016/S0165-1684(98)00014-0
  7. Chen B, Wornell GW: Quantization index modulation: a class of provably good methods for digital watermarking and information embedding. IEEE Transactions on Information Theory 2001, 47(4):1423-1443. 10.1109/18.923725
  8. Geiser B, Jax P, Vary P: Artificial bandwidth extension of speech supported by watermark-transmitted side information. Proceedings of the 9th European Conference on Speech Communication and Technology (INTERSPEECH '05), September 2005, Lisbon, Portugal 1497-1500.
  9. Sagi A, Malah D: Data embedding in speech signals using perceptual masking. European Signal Processing Conference, September 2004, Vienna, Austria 1657-1660.
  10. Chen S, Leung H: Concurrent data transmission through analog speech channel using data hiding. IEEE Signal Processing Letters 2005, 12(8):581-584.
  11. Larsen E, Aarts RM: Audio Bandwidth Extension. John Wiley & Sons, New York, NY, USA; 2004.
  12. Fuemmeler JA, Hardie RC, Gardner WR: Techniques for the regeneration of wideband speech from narrowband speech. EURASIP Journal on Applied Signal Processing 2001, 2001(4):266-274. 10.1155/S1110865701000300
  13. Makhoul J: Linear prediction: a tutorial review. Proceedings of the IEEE 1975, 63(4):561-580.
  14. Jax P, Vary P: On artificial bandwidth extension of telephone speech. Signal Processing 2003, 83(8):1707-1719. 10.1016/S0165-1684(03)00082-3
  15. McCree A: 14 kb/s wideband speech coder with a parametric highband model. Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP '00), June 2000, Istanbul, Turkey 2: 1153-1156.
  16. McCree A, Unno T, Anandakumar A, Bernard A, Paksoy E: An embedded adaptive multi-rate wideband speech coder. Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP '01), May 2001, Salt Lake, Utah, USA 2: 761-764.
  17. Valin J-M, Lefebvre R: Bandwidth extension of narrowband speech for low bit-rate wideband coding. Proceedings of the IEEE Speech Coding Workshop (SCW '00), September 2000, Delavan, Wis, USA 130-132.
  18. Epps JR, Holmes WH: A new very low bit rate wideband speech coder with a sinusoidal highband model. Proceedings of the IEEE International Symposium on Circuits and Systems (ISCAS '01), May 2001, Sydney, NSW, Australia 2: 349-352.
  19. Makhoul J: Spectral linear prediction: properties and applications. IEEE Transactions on Acoustics, Speech, and Signal Processing 1975, 23(3):283-297. 10.1109/TASSP.1975.1162685
  20. Kondoz AM: Digital Speech: Coding for Low Bit Rate Communications Systems. John Wiley & Sons, New York, NY, USA; 1994.
  21. Makhoul J, Berouti M: High-frequency regeneration in speech coding systems. Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '79), April 1979, Washington, DC, USA 4: 428-431.
  22. Linde Y, Buzo A, Gray RM: An algorithm for vector quantizer design. IEEE Transactions on Communications Systems 1980, 28(1):84-95. 10.1109/TCOM.1980.1094577
  23. ISO/IEC: Information technology—coding of moving pictures and associated audio for digital storage media at up to about 1.5 Mbit/s—part 3: audio. In Tech. Rep. ISO/IEC 11172-3. International Organization for Standardization, Geneva, Switzerland; 1992.
  24. Bracewell RN: Discrete Hartley transform. Journal of Optical Society of America 1983, 73(12):1832-1835. 10.1364/JOSA.73.001832
  25. Sorensen HV, Jones DL, Burrus CS, Heideman MT: On computing the discrete Hartley transform. IEEE Transactions on Acoustics, Speech, and Signal Processing 1985, 33(5):1231-1238. 10.1109/TASSP.1985.1164687
  26. Haykin S: Adaptive Filter Theory. 3rd edition. Prentice-Hall, New York, NY, USA; 1996.
  27. Haykin S: Communication Systems. 4th edition. John Wiley & Sons, New York, NY, USA; 2001.
  28. Sklar B: Digital Communications, Fundamentals and Applications. Prentice-Hall, Englewood Cliffs, NJ, USA; 1988.
  29. ITU-T: Network transmission model for evaluating modem performance over 2-wire voice grade connections. In Tech. Rep. V.56 bis. International Telecommunication Union, Geneva, Switzerland; August 1995.
  30. ITU-T: Perceptual evaluation of speech quality (PESQ), an objective method for end-to-end speech quality assessment of narrowband telephone networks and speech codecs. In Tech. Rep. P.862. International Telecommunication Union, Geneva, Switzerland; February 2001.
  31. Sagi A: Data embedding in speech signals, M.S. thesis.
  32. Fischer RFH, Tzschoppe R, Bäuml R: Lattice Costa schemes using subspace projection for digital watermarking. Proceedings of the 5th International ITG Conference on Source and Channel Coding (SCC '04), January 2004, Erlangen, Germany 127-134.

Copyright

© A. Sagi and D. Malah 2007

This article is published under license to BioMed Central Ltd. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
