- Research Article
- Open access
Blind Separation of Acoustic Signals Combining SIMO-Model-Based Independent Component Analysis and Binary Masking
EURASIP Journal on Advances in Signal Processing volume 2006, Article number: 034970 (2006)
Abstract
A new two-stage blind source separation (BSS) method for convolutive mixtures of speech is proposed, in which a single-input multiple-output (SIMO)-model-based independent component analysis (ICA) and a new SIMO-model-based binary masking are combined. SIMO-model-based ICA separates the mixed signals not into monaural source signals but into SIMO-model-based signals, that is, the signals from each independent source in their original form at the microphones. Thus, the separated signals of SIMO-model-based ICA can maintain the spatial qualities of each sound source. Owing to this property, our novel SIMO-model-based binary masking can be applied to efficiently remove the residual interference components that remain after SIMO-model-based ICA. The experimental results reveal that the proposed method considerably improves separation performance compared with conventional BSS methods. In addition, a real-time implementation of the proposed BSS is illustrated.
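To make the second stage more concrete, the sketch below applies a binary time-frequency mask to one separated output by comparing its magnitude with that of a competing output, bin by bin. The abstract does not state the masking rule, so the magnitude comparison, the function name `binary_mask`, and the toy spectrogram values are all assumptions for illustration and should not be read as the authors' exact method.

```python
import numpy as np


def binary_mask(target, interferer):
    """Keep each time-frequency bin of `target` only where it dominates.

    Both inputs are complex spectrograms of equal shape (frequency x time).
    Hypothetical helper: the comparison rule is an assumption, not taken
    from the paper.
    """
    target = np.asarray(target, dtype=complex)
    interferer = np.asarray(interferer, dtype=complex)
    if target.shape != interferer.shape:
        raise ValueError("spectrograms must have the same shape")
    # A bin passes (mask value True) when the target magnitude is larger.
    mask = np.abs(target) > np.abs(interferer)
    # Multiplying by the boolean mask zeroes the bins that do not pass.
    return mask * target, mask


# Toy 2x2 spectrograms standing in for two first-stage outputs.
Y1 = np.array([[1.0 + 0.0j, 0.1 + 0.0j],
               [0.0 + 0.2j, 2.0 + 0.0j]])
Y2 = np.array([[0.2 + 0.0j, 0.9 + 0.0j],
               [1.5 + 0.0j, 0.0 + 0.3j]])

masked_Y1, mask_1 = binary_mask(Y1, Y2)
print(mask_1)
print(masked_Y1)
```

A hard 0/1 mask of this kind relies on speech being sparse in the time-frequency domain, so that most bins are dominated by a single source; bins where both sources are active are where such a mask tends to introduce distortion.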
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
About this article
Cite this article
Mori, Y., Saruwatari, H., Takatani, T. et al. Blind Separation of Acoustic Signals Combining SIMO-Model-Based Independent Component Analysis and Binary Masking. EURASIP J. Adv. Signal Process. 2006, 034970 (2006). https://doi.org/10.1155/ASP/2006/34970
DOI: https://doi.org/10.1155/ASP/2006/34970