Frequency-Domain Blind Source Separation of Many Speech Signals Using Near-Field and Far-Field Models

Mukai, Ryo; Sawada, Hiroshi; Araki, Shoko; Makino, Shoji

doi:10.1155/ASP/2006/83683

Research Article
Open access
Published: 01 December 2006

EURASIP Journal on Advances in Signal Processing, volume 2006, Article number: 083683 (2006)

Abstract

We discuss frequency-domain blind source separation (BSS) of convolutive mixtures when the number of source signals is large and the sources may be located in any direction. The most critical problem in frequency-domain BSS is the permutation problem, and geometric information helps to solve it. In this paper, we propose a method for obtaining suitable geometric information with which to solve the permutation problem when the number of source signals is large and some of the signals come from the same or a similar direction. First, we describe a method for estimating the absolute direction of arrival (DOA) from relative DOAs obtained with the solution provided by independent component analysis (ICA) and the far-field model. Next, we propose a method for estimating the spheres on which the source signals lie by using the ICA solution and the near-field model. We also address another problem in frequency-domain BSS, which arises from the circularity of the discrete-frequency representation. We discuss the characteristics of this problem and present a solution to it. Experimental results using eight microphones in a room show that the proposed method can separate a mixture of six speech signals arriving from various directions, even when two of them come from the same direction.
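The two geometric cues named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation; it only shows the textbook far-field and near-field relations for one microphone pair. It assumes that `a_j` and `a_k` are the complex responses of one source at two microphones in a single frequency bin (in practice these are often taken from the inverse of the ICA separation matrix, which the abstract does not spell out), that the speed of sound is 340 m/s, and that the function names `farfield_doa` and `nearfield_sphere` are hypothetical.

```python
import numpy as np

C_SOUND = 340.0  # assumed speed of sound in m/s (not given in the abstract)


def farfield_doa(a_j, a_k, p_j, p_k, freq, c=C_SOUND):
    """Angle (rad) between the source direction and the axis p_k -> p_j.

    Far-field (plane-wave) assumption: a_j / a_k = exp(j*2*pi*f/c * d*cos(theta)),
    where d is the microphone spacing.
    """
    p_j, p_k = np.asarray(p_j, float), np.asarray(p_k, float)
    d = np.linalg.norm(p_j - p_k)
    if freq >= c / (2.0 * d):
        raise ValueError("spatial aliasing: the phase difference is ambiguous")
    phase = np.angle(a_j / a_k)
    cos_theta = phase * c / (2.0 * np.pi * freq * d)
    return float(np.arccos(np.clip(cos_theta, -1.0, 1.0)))


def nearfield_sphere(a_j, a_k, p_j, p_k):
    """Center and radius of the sphere on which the source must lie.

    Near-field (spherical-wave) assumption: amplitude decays as 1/distance,
    so |a_k| / |a_j| = |q - p_j| / |q - p_k| = rho. The set of points q with
    a fixed distance ratio rho != 1 is a sphere (sphere of Apollonius).
    """
    p_j, p_k = np.asarray(p_j, float), np.asarray(p_k, float)
    rho = abs(a_k) / abs(a_j)
    if np.isclose(rho, 1.0):
        raise ValueError("distance ratio is 1: the locus is a plane, not a sphere")
    center = (p_j - rho**2 * p_k) / (1.0 - rho**2)
    radius = rho * np.linalg.norm(p_j - p_k) / abs(1.0 - rho**2)
    return center, float(radius)


if __name__ == "__main__":
    # Synthetic check with one microphone pair 4 cm apart at 1 kHz.
    p_j, p_k, f = np.array([0.02, 0.0, 0.0]), np.array([-0.02, 0.0, 0.0]), 1000.0

    # Far field: plane wave arriving at 60 degrees from the pair axis.
    q = np.array([np.cos(np.pi / 3), np.sin(np.pi / 3), 0.0])
    a_j = np.exp(2j * np.pi * f / C_SOUND * (p_j @ q))
    a_k = np.exp(2j * np.pi * f / C_SOUND * (p_k @ q))
    print("estimated DOA (deg):", np.degrees(farfield_doa(a_j, a_k, p_j, p_k, f)))

    # Near field: point source close to the array.
    src = np.array([0.5, 0.3, 0.1])
    d_j, d_k = np.linalg.norm(src - p_j), np.linalg.norm(src - p_k)
    b_j = np.exp(-2j * np.pi * f / C_SOUND * d_j) / d_j
    b_k = np.exp(-2j * np.pi * f / C_SOUND * d_k) / d_k
    center, radius = nearfield_sphere(b_j, b_k, p_j, p_k)
    print("sphere center:", center, "radius:", radius)
```

A single pair only yields a direction relative to its own axis (a cone) or a single sphere. Combining several pairs with different orientations is what would allow an absolute direction or a location estimate, which is the role the abstract assigns to the eight-microphone array.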

Author information

Authors and Affiliations

NTT Communication Science Laboratories, NTT Corporation, 2–4 Hikaridai, Seika-Cho, Soraku-Gun, Kyoto, 619-0237, Japan
Ryo Mukai, Hiroshi Sawada, Shoko Araki & Shoji Makino

Corresponding author

Correspondence to Ryo Mukai.

Rights and permissions

Open Access. This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Mukai, R., Sawada, H., Araki, S. et al. Frequency-Domain Blind Source Separation of Many Speech Signals Using Near-Field and Far-Field Models. EURASIP J. Adv. Signal Process. 2006, 083683 (2006). https://doi.org/10.1155/ASP/2006/83683

Received: 19 December 2005
Revised: 26 April 2006
Accepted: 11 June 2006
Published: 01 December 2006
DOI: https://doi.org/10.1155/ASP/2006/83683
