Blind Separation of Acoustic Signals Combining SIMO-Model-Based Independent Component Analysis and Binary Masking

Abstract

A new two-stage blind source separation (BSS) method for convolutive mixtures of speech is proposed, combining single-input multiple-output (SIMO)-model-based independent component analysis (ICA) with a new SIMO-model-based binary masking. SIMO-model-based ICA separates the mixed signals not into monaural source signals but into SIMO-model-based signals, that is, the signals from each independent source as they are observed at the microphones. The separated signals of SIMO-model-based ICA can therefore maintain the spatial qualities of each sound source. Owing to this property, the SIMO-model-based binary masking can be applied to efficiently remove the residual interference components that remain after SIMO-model-based ICA. Experimental results show that the proposed method considerably improves separation performance over conventional BSS methods. A real-time implementation of the proposed BSS is also illustrated.
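
To make the second stage more concrete, the following is a minimal sketch of generic time-frequency binary masking, assuming the first stage has already produced one complex spectrogram estimate per source. It is not the authors' exact masking rule, which the abstract does not specify: the magnitude comparison, the function name `binary_mask`, and the toy data are all illustrative assumptions.

```python
def binary_mask(target, interferer):
    """Keep each time-frequency bin of `target` only where its magnitude
    exceeds that of `interferer`; zero the bin otherwise.

    Both arguments are lists of frames, each frame a list of complex
    STFT coefficients. This winner-take-all rule is an assumed, generic
    form of binary masking, not the rule from the paper.
    """
    masked = []
    for target_frame, interferer_frame in zip(target, interferer):
        masked.append([
            t_bin if abs(t_bin) > abs(i_bin) else 0j
            for t_bin, i_bin in zip(target_frame, interferer_frame)
        ])
    return masked


# Hypothetical first-stage outputs: 2 frames x 3 frequency bins each.
source1_estimate = [[1 + 1j, 0.1j, 2 + 0j], [0.2 + 0j, 3j, 0.5 + 0j]]
source2_estimate = [[0.5 + 0j, 1 + 0j, 0.1 + 0j], [1 + 0j, 0.3j, 0.5j]]

# Each estimate is masked against the other to suppress residual interference.
cleaned1 = binary_mask(source1_estimate, source2_estimate)
cleaned2 = binary_mask(source2_estimate, source1_estimate)
```

Masking of this kind relies on speech being sparse in the time-frequency domain, so that each bin is usually dominated by a single source. In this sketch a bin where both magnitudes are equal is zeroed in both outputs.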

Author information

Corresponding author

Correspondence to Yoshimitsu Mori.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Mori, Y., Saruwatari, H., Takatani, T. et al. Blind Separation of Acoustic Signals Combining SIMO-Model-Based Independent Component Analysis and Binary Masking. EURASIP J. Adv. Signal Process. 2006, 034970 (2006). https://doi.org/10.1155/ASP/2006/34970

Keywords

  • Independent Component Analysis
  • Acoustic Signal
  • Sound Source
  • Attractive Property