On Building Immersive Audio Applications Using Robust Adaptive Beamforming and Joint Audio-Video Source Localization

Beracoechea, J. A.; Torres-Guijarro, S.; García, L.; Casajús-Quirós, F. J.

doi:10.1155/ASP/2006/40960

Research Article
Open access
Published: 01 December 2006

EURASIP Journal on Advances in Signal Processing, volume 2006, Article number: 040960 (2006)

Abstract

This paper addresses some of the problems, strategies, and solutions involved in building truly immersive audio systems for future communication applications. The aim is a system in which the acoustic field of one chamber is recorded with a microphone array and then reconstructed, or rendered, in a different chamber using loudspeaker-array techniques. Our proposal explores the use of recent robust adaptive beamforming techniques to estimate the original sources in the emitting room. A joint audio-video localization method, needed both in the estimation process and in the rendering engine, is also presented. The estimated source signal and the source localization information drive a wave field synthesis engine that renders the acoustic field again in the receiving chamber. System performance is evaluated using MUSHRA-based subjective tests.
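As a rough illustration of the step in which a source signal is estimated from microphone-array recordings, the following Python sketch implements a plain delay-and-sum beamformer. This is not the robust adaptive beamformer the paper proposes; it is only the simplest fixed beamformer, shown under several assumptions that do not come from the paper: free-field propagation, a known source position, a speed of sound of 343 m/s, and circular (FFT-based) fractional delays. The function name `delay_and_sum` and all of its parameters are hypothetical.

```python
import numpy as np


def delay_and_sum(signals, mic_positions, source_position, fs, c=343.0):
    """Align every microphone channel on a known source position and average.

    signals:         (n_mics, n_samples) array of recordings
    mic_positions:   (n_mics, dim) microphone coordinates in metres
    source_position: (dim,) source coordinates in metres
    fs:              sampling rate in Hz
    c:               assumed speed of sound in m/s
    """
    signals = np.asarray(signals, dtype=float)
    n_mics, n_samples = signals.shape

    # Propagation delay to each microphone, relative to the closest one.
    dists = np.linalg.norm(
        np.asarray(mic_positions) - np.asarray(source_position), axis=1
    )
    delays = (dists - dists.min()) / c

    # Advance each channel by its relative delay using a linear phase shift.
    # The shift is circular because it is applied through the DFT.
    freqs = np.fft.rfftfreq(n_samples, d=1.0 / fs)
    spectra = np.fft.rfft(signals, axis=1)
    steering = np.exp(2j * np.pi * freqs[None, :] * delays[:, None])
    aligned = np.fft.irfft(spectra * steering, n=n_samples, axis=1)

    # Averaging keeps the aligned source and attenuates uncorrelated noise.
    return aligned.mean(axis=0)
```

In a pipeline like the one the abstract describes, the source position passed to such a function would presumably come from the localization stage, and the output would be the signal handed to the rendering engine; an adaptive beamformer would add an interference-cancelling path on top of this fixed alignment.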

Author information

Authors and Affiliations

Departamento de Señales, Sistemas y Radiocomunicaciones, Universidad Politécnica de Madrid, Madrid, 28040, Spain
J. A. Beracoechea, S. Torres-Guijarro, L. García & F. J. Casajús-Quirós

Corresponding author

Correspondence to J. A. Beracoechea.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License ( https://creativecommons.org/licenses/by/2.0 ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Beracoechea, J.A., Torres-Guijarro, S., García, L. et al. On Building Immersive Audio Applications Using Robust Adaptive Beamforming and Joint Audio-Video Source Localization. EURASIP J. Adv. Signal Process. 2006, 040960 (2006). https://doi.org/10.1155/ASP/2006/40960

Received: 20 December 2005
Revised: 26 April 2006
Accepted: 11 June 2006
Published: 01 December 2006
DOI: https://doi.org/10.1155/ASP/2006/40960
