Open Access

Frequency-Domain Blind Source Separation of Many Speech Signals Using Near-Field and Far-Field Models

EURASIP Journal on Advances in Signal Processing 2006, 2006:083683

https://doi.org/10.1155/ASP/2006/83683

Received: 19 December 2005

Accepted: 11 June 2006

Published: 16 August 2006

Abstract

We discuss the frequency-domain blind source separation (BSS) of convolutive mixtures when the number of source signals is large and the sources may be located in any direction. The most critical problem in frequency-domain BSS is the permutation problem, and geometric information helps to solve it. In this paper, we propose a method for obtaining proper geometric information with which to solve the permutation problem when the number of source signals is large and some of the signals come from the same or a similar direction. First, we describe a method for estimating the absolute direction of arrival (DOA) from relative DOAs obtained using the solution provided by independent component analysis (ICA) and the far-field model. Next, we propose a method for estimating the spheres on which the source signals exist by using the ICA solution and the near-field model. We also address another problem of frequency-domain BSS that arises from the circularity of the discrete-frequency representation. We discuss the characteristics of this problem and present a solution. Experimental results using eight microphones in a room show that the proposed method can separate a mixture of six speech signals arriving from various directions, even when two of them come from the same direction.
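To illustrate how a far-field model can turn an ICA solution into a relative DOA, the following is a minimal sketch for a single microphone pair at one frequency bin. It is not the procedure of the paper: the function names, the sound speed of 340 m/s, the sign convention of the phase, and the test values are all assumptions made for illustration. The sketch assumes that `a_j` and `a_jp` are the two entries, for one source, of the mixing matrix estimated as the inverse of the ICA separation matrix, and that the microphone spacing is small enough to avoid spatial aliasing.

```python
import cmath
import math

SPEED_OF_SOUND = 340.0  # m/s; assumed value, not taken from the paper


def relative_doa(a_j, a_jp, freq_hz, spacing_m, c=SPEED_OF_SOUND):
    """Angle (radians) between the source direction and the pair axis.

    Under the far-field (plane-wave) assumption, the phase of a_j / a_jp
    equals 2*pi*f*spacing*cos(theta)/c. Taking the ratio cancels the
    arbitrary complex scaling of each ICA output.
    """
    phase = cmath.phase(a_j / a_jp)
    cos_theta = phase * c / (2.0 * math.pi * freq_hz * spacing_m)
    # Guard against values slightly outside [-1, 1] caused by estimation error.
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.acos(cos_theta)


def simulate_pair(theta, freq_hz, spacing_m, c=SPEED_OF_SOUND):
    """Hypothetical far-field responses of two microphones on a line."""
    delay = spacing_m * math.cos(theta) / c
    return cmath.exp(1j * 2.0 * math.pi * freq_hz * delay), 1.0 + 0.0j


theta_true = math.radians(60.0)
a1, a2 = simulate_pair(theta_true, freq_hz=1000.0, spacing_m=0.04)
scale = 0.3 - 1.2j  # arbitrary scaling left undetermined by ICA
theta_est = relative_doa(scale * a1, scale * a2, freq_hz=1000.0, spacing_m=0.04)
```

A single pair gives only the angle relative to its own axis, so sources on a cone around that axis are indistinguishable. This is why relative DOAs from several pairs with different orientations are needed before an absolute DOA can be estimated.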

[1–37]

Authors’ Affiliations

(1) NTT Communication Science Laboratories, NTT Corporation

References

  1. Haykin S (Ed): Unsupervised Adaptive Filtering. John Wiley & Sons, New York, NY, USA; 2000.
  2. Cichocki A, Amari S: Adaptive Blind Signal and Image Processing. John Wiley & Sons, New York, NY, USA; 2002.
  3. Benesty J, Makino S, Chen J (Eds): Speech Enhancement. Springer, New York, NY, USA; 2005.
  4. Comon P: Independent component analysis. A new concept? Signal Processing 1994, 36(3):287-314. 10.1016/0165-1684(94)90029-9
  5. Bell AJ, Sejnowski TJ: An information-maximization approach to blind separation and blind deconvolution. Neural Computation 1995, 7(6):1129-1159. 10.1162/neco.1995.7.6.1129
  6. Lee TW: Independent Component Analysis. Kluwer Academic, Boston, Mass, USA; 1998.
  7. Hyvärinen A, Karhunen J, Oja E: Independent Component Analysis. John Wiley & Sons, New York, NY, USA; 2001.
  8. Puntonet CG, Prieto A (Eds): Independent Component Analysis and Blind Signal Separation, Lecture Notes in Computer Science. Volume 3195. Springer, New York, NY, USA; 2004.
  9. Matsuoka K, Nakashima S: Minimal distortion principle for blind source separation. Proceedings of 3rd International Conference on Independent Component Analysis and Blind Source Separation (ICA '01), December 2001, San Diego, Calif, USA 722-727.
  10. Douglas SC, Sun X: Convolutive blind separation of speech mixtures using the natural gradient. Speech Communication 2003, 39(1-2):65-78. 10.1016/S0167-6393(02)00059-6
  11. Takatani T, Nishikawa T, Saruwatari H, Shikano K: High-fidelity blind separation of acoustic signals using SIMO-model-based independent component analysis. IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences 2004, E87-A(8):2063-2072.
  12. Buchner H, Aichner R, Kellermann W: A generalization of blind source separation algorithms for convolutive mixtures based on second-order statistics. IEEE Transactions on Speech and Audio Processing 2005, 13(1):120-134.
  13. Matsuoka K, Ohba Y, Toyota Y, Nakashima S: Blind separation for convolutive mixture of many voices. Proceedings of International Workshop on Acoustic Echo and Noise Control (IWAENC '03), September 2003, Kyoto, Japan 279-282.
  14. Soon VC, Tong L, Huang YF, Liu R: A robust method for wideband signal separation. Proceedings of IEEE International Symposium on Circuits and Systems (ISCAS '93), May 1993, Chicago, Ill, USA 1: 703-706.
  15. Smaragdis P: Blind separation of convolved mixtures in the frequency domain. Neurocomputing 1998, 22(1-3):21-34.
  16. Anemüller J, Kollmeier B: Amplitude modulation decorrelation for convolutive blind source separation. Proceedings of the 2nd International Workshop on Independent Component Analysis and Blind Signal Separation (ICA '00), June 2000, Helsinki, Finland 215-220.
  17. Kurita S, Saruwatari H, Kajita S, Takeda K, Itakura F: Evaluation of blind signal separation method using directivity pattern under reverberant conditions. Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP '00), June 2000, Istanbul, Turkey 5: 3140-3143.
  18. Murata N, Ikeda S, Ziehe A: An approach to blind source separation based on temporal structure of speech signals. Neurocomputing 2001, 41(1-4):1-24.
  19. Ikram MZ, Morgan DR: A beamforming approach to permutation alignment for multichannel frequency-domain blind speech separation. Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP '02), May 2002, Orlando, Fla, USA 1: 881-884.
  20. Parra LC, Alvino CV: Geometric source separation: merging convolutive source separation with geometric beamforming. IEEE Transactions on Speech and Audio Processing 2002, 10(6):352-362. 10.1109/TSA.2002.803443
  21. Schobben DWE, Sommen PCW: A frequency domain blind signal separation method based on decorrelation. IEEE Transactions on Signal Processing 2002, 50(8):1855-1865. 10.1109/TSP.2002.800417
  22. Sawada H, Mukai R, Araki S, Makino S: A robust and precise method for solving the permutation problem of frequency-domain blind source separation. IEEE Transactions on Speech and Audio Processing 2004, 12(5):530-538. 10.1109/TSA.2004.832994
  23. Mukai R, Sawada H, Araki S, Makino S: Near-field frequency domain blind source separation for convolutive mixtures. Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP '04), May 2004, Montreal, Que, Canada 4: 49-52.
  24. Mukai R, Sawada H, Araki S, Makino S: Frequency domain blind source separation using small and large spacing sensor pairs. Proceedings of IEEE International Symposium on Circuits and Systems (ISCAS '04), May 2004, Vancouver, BC, Canada 5: 1-4.
  25. Sawada H, Mukai R, de la Kethulle S, Araki S, Makino S: Spectral smoothing for frequency-domain blind source separation. Proceedings of International Workshop on Acoustic Echo and Noise Control (IWAENC '03), September 2003, Kyoto, Japan 311-314.
  26. Sawada H, Mukai R, Araki S, Makino S: Convolutive blind source separation for more than two sources in the frequency domain. Acoustical Science and Technology 2004, 25(4):296-298. 10.1250/ast.25.296
  27. Sawada H, Mukai R, Araki S, Makino S: Frequency-domain blind source separation. In Speech Enhancement. Edited by: Benesty J, Makino S, Chen J. Springer, New York, NY, USA; 2005:299-327. chapter 13
  28. Makino S, Sawada H, Mukai R, Araki S: Blind source separation of convolutive mixtures of speech in frequency domain. IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences 2005, E88-A(7):1640-1654. (Invited) 10.1093/ietfec/e88-a.7.1640
  29. Sawada H, Winter S, Mukai R, Araki S, Makino S: Estimating the number of sources for frequency-domain blind source separation. In Proceedings of 5th International Conference on Independent Component Analysis (ICA '04), September 2004, Granada, Spain, Lecture Notes in Computer Science. Volume 3195. Springer; 610-617.
  30. Bingham E, Hyvärinen A: A fast fixed-point algorithm for independent component analysis of complex valued signals. International Journal of Neural Systems 2000, 10(1):1-8.
  31. Amari S-I: Natural gradient works efficiently in learning. Neural Computation 1998, 10(2):251-276. 10.1162/089976698300017746
  32. Sawada H, Mukai R, Araki S, Makino S: Polar coordinate based nonlinear function for frequency-domain blind source separation. IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences 2003, E86-A(3):590-596.
  33. Asano F, Ikeda S, Ogawa M, Asoh H, Kitawaki N: Combined approach of array processing and independent component analysis for blind separation of acoustic signals. IEEE Transactions on Speech and Audio Processing 2003, 11(3):204-215. 10.1109/TSA.2003.809191
  34. Iwaki M, Ando A: Selective microphone system using blind separation by block decorrelation of output signals. Proceedings of the 4th International Conference on Independent Component Analysis and Blind Signal Separation (ICA '03), April 2003, Nara, Japan 1023-1028.
  35. Araki S, Makino S, Hinamoto Y, Mukai R, Nishikawa T, Saruwatari H: Equivalence between frequency-domain blind source separation and frequency-domain adaptive beamforming for convolutive mixtures. EURASIP Journal on Applied Signal Processing 2003, 2003(11):1157-1166. 10.1155/S1110865703305074
  36. Duda RO, Hart PE, Stork DG: Pattern Classification. 2nd edition. Wiley Interscience, New York, NY, USA; 2000.
  37. Mukai R, Sawada H, Araki S, Makino S: Blind source separation and DOA estimation using small 3-D microphone array. Proceedings of the Joint Workshop on Hands-Free Speech Communication and Microphone Arrays (HSCMA '05), March 2005, Piscataway, NJ, USA d.9-10.

Copyright

© Ryo Mukai et al. 2006

This article is published under license to BioMed Central Ltd. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.