Open Access

Permutation Correction in the Frequency Domain in Blind Separation of Speech Mixtures

EURASIP Journal on Advances in Signal Processing 2006, 2006:075206

https://doi.org/10.1155/ASP/2006/75206

Received: 31 January 2005

Accepted: 1 September 2005

Published: 9 April 2006

Abstract

This paper presents a method for blind separation of convolutive mixtures of speech signals, based on the joint diagonalization of the time-varying spectral matrices of the observation records. The main and still largely open problem in a frequency-domain approach is the permutation ambiguity. An earlier paper by the authors exploited the continuity of the frequency response of the unmixing filters, but that approach leaves some permutation jumps across frequency. This paper therefore proposes a new method based on two assumptions. First, the frequency continuity of the unmixing filters is still used to initialize the diagonalization algorithm. Second, the time-frequency representations of the sources are assumed to vary smoothly with frequency. This hypothesis of continuity of the time variation of the source energy is exploited over a sliding frequency band, which makes it possible to detect the remaining frequency permutation jumps. The method is compared with other approaches, and results on real-world recordings demonstrate the superior performance of the proposed algorithm.
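
The second assumption can be illustrated with a minimal sketch. This is not the authors' algorithm, whose details are not given in this abstract; it is a generic envelope-correlation alignment written under stated assumptions. The function name `align_permutations`, the array layout `(n_freq, n_src, n_frames)`, and the `band` parameter are hypothetical choices made for illustration. At each frequency bin, the energy profiles of the separated outputs are compared with the average of the already-aligned profiles in a sliding band of lower bins, and the permutation with the highest total correlation is kept.

```python
import itertools
import numpy as np


def _normalise(x):
    # Remove the time mean and scale to unit norm, so that a dot
    # product between two rows is a correlation coefficient.
    x = x - x.mean(axis=-1, keepdims=True)
    norm = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norm, 1e-12)


def align_permutations(energy, band=5):
    """Return, for each bin, the output order that matches the lower bins.

    energy: array of shape (n_freq, n_src, n_frames) holding the
            time-varying energy of each separated output (assumed layout).
    band:   number of previously aligned bins averaged as the reference.
    """
    profiles = _normalise(np.asarray(energy, dtype=float))
    n_freq, n_src, _ = profiles.shape
    aligned = profiles.copy()
    perms = np.tile(np.arange(n_src), (n_freq, 1))
    candidates = [np.array(p) for p in itertools.permutations(range(n_src))]
    rows = np.arange(n_src)

    for f in range(1, n_freq):
        lo = max(0, f - band)
        # Reference profile: mean over the sliding band of aligned bins.
        ref = _normalise(aligned[lo:f].mean(axis=0))
        # score[i, j]: correlation of reference source i with output j.
        score = ref @ profiles[f].T
        best = max(candidates, key=lambda p: score[rows, p].sum())
        perms[f] = best
        aligned[f] = profiles[f][best]
    return perms
```

Here `perms[f][i]` is the index of the output at bin `f` assigned to source `i`, with bin 0 taken as the reference ordering. The exhaustive search over permutations is only practical for a small number of sources.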

Keywords

Frequency Domain; Quantum Information; Speech Signal; Frequency Bandwidth; Domain Approach

[1–31]

Authors’ Affiliations

(1) Laboratoire des Images et des Signaux, St Martin d'Hères Cedex, France
(2) Laboratoire de Modélisation et Calcul, Grenoble Cedex, France

References

  1. Parra LC, Spence C: Convolutive blind separation of non-stationary sources. IEEE Transactions on Speech and Audio Processing 2000, 8(3):320-327. doi:10.1109/89.841214
  2. Smaragdis P: Blind separation of convolved mixtures in the frequency domain. Proceedings of the International ICSC Workshop on Independence & Artificial Neural Networks (I&ANN '98), February 1998, Tenerife, Spain 9-10.
  3. Wu H-C, Principe JC: Simultaneous diagonalization in the frequency domain (SDIF) for source separation. Proceedings of the 1st International Conference on Independent Component Analysis and Signal Separation (ICA '99), January 1999, Aussois, France 245-250.
  4. Mukai R, Araki S, Makino S: Separation and dereverberation performance of frequency domain blind source separation. Proceedings of the 3rd International Conference on Independent Component Analysis and Blind Signal Separation (ICA '01), December 2001, San Diego, Calif, USA 230-235.
  5. Pham D-T, Cardoso J-F: Blind separation of instantaneous mixtures of nonstationary sources. IEEE Transactions on Signal Processing 2001, 49(9):1837-1848. doi:10.1109/78.942614
  6. Pham D-T, Servière Ch, Boumaraf H: Blind separation of convolutive audio mixtures using nonstationarity. Proceedings of the 4th International Conference on Independent Component Analysis and Blind Signal Separation (ICA '03), April 2003, Nara, Japan 981-986.
  7. Pham D-T, Servière Ch, Boumaraf H: Blind separation of speech mixtures based on nonstationarity. Proceedings of 7th International Symposium on Signal Processing and Its Applications (ISSPA '03), July 2003, Paris, France 2: 73-76.
  8. Matsuoka K, Nakashima S: Minimal distortion principle for blind source separation. Proceedings of the 3rd International Conference on Independent Component Analysis and Blind Signal Separation (ICA '01), December 2001, San Diego, Calif, USA 722-727.
  9. Murata N, Ikeda S, Ziehe A: An approach to blind source separation based on temporal structure of speech signals. Neurocomputing 2001, 41(1–4):1-24.
  10. Sawada H, Winter S, Mukai R, Araki S, Makino S: Estimating the number of sources for frequency-domain blind source separation. Proceedings of the 5th International Conference on Independent Component Analysis and Blind Signal Separation (ICA '04), September 2004, Granada, Spain 610-617.
  11. Torkkola K: Blind separation for audio signals - Are we there yet? Proceedings of the 1st International Workshop on Independent Component Analysis and Signal Separation (ICA '99), January 1999, Aussois, France 239-244.
  12. Westner A: Object-based audio capture: separating acoustically mixed sounds, M.S. thesis. Massachusetts Institute of Technology, Cambridge, Mass, USA; 1998.
  13. Parra LC, Spence C: On-line convolutive blind source separation of non-stationary signals. Journal of VLSI Signal Processing Systems for Signal, Image, and Video Technology 2000, 26(1-2):39-46.
  14. Anemüller J, Kollmeier B: Amplitude modulation decorrelation for convolutive blind source separation. Proceedings of the 2nd International Workshop on Independent Component Analysis and Blind Signal Separation (ICA '00), June 2000, Helsinki, Finland 215-220.
  15. Wang W, Chambers JA, Sanei S: A novel hybrid approach to the permutation problem of frequency domain blind source separation. Proceedings of the 5th International Conference on Independent Component Analysis and Blind Signal Separation (ICA '04), September 2004, Granada, Spain 532-539.
  16. Ikeda S, Murata N: A method of blind separation based on temporal structure of signals. Proceedings of the 5th International Conference on Neural Information Processing (ICONIP '98), October 1998, Kitakyushu, Japan 737-742.
  17. Servière Ch, Pham D-T: A novel method for permutation correction in frequency-domain in blind separation of speech mixtures. Proceedings of the 5th International Conference on Independent Component Analysis and Blind Signal Separation (ICA '04), September 2004, Granada, Spain 807-815.
  18. Asano F, Ikeda S, Ogawa M, Asoh H, Kitawaki N: A combined approach of array processing and independent component analysis for blind separation of acoustic signals. Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '01), May 2001, Salt Lake City, Utah, USA 5: 2729-2732.
  19. Kamata K, Hu X, Kobatake H: A new approach to the permutation problem in frequency domain blind source separation. Proceedings of the 5th International Conference on Independent Component Analysis and Blind Signal Separation (ICA '04), September 2004, Granada, Spain 849-856.
  20. Ikram MZ, Morgan DR: A beamforming approach to permutation alignment for multichannel frequency-domain blind speech separation. Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '02), May 2002, Orlando, Fla, USA 1: 881-884.
  21. Knaak M, Araki S, Makino S: Geometrically constrained ICA for robust separation of sound mixtures. Proceedings of the 4th International Conference on Independent Component Analysis and Blind Signal Separation (ICA '03), April 2003, Nara, Japan 951-956.
  22. Kurita S, Saruwatari H, Kajita S, Takeda K, Itakura F: Evaluation of blind signal separation method using directivity pattern under reverberant conditions. Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '00), June 2000, Istanbul, Turkey 5: 3140-3143.
  23. Mitianoudis N, Davies M: Permutation alignment for frequency domain ICA using subspace beamforming methods. Proceedings of the 5th International Conference on Independent Component Analysis and Blind Signal Separation (ICA '04), September 2004, Granada, Spain 669-676.
  24. Mukai R, Sawada H, Araki S, Makino S: Frequency domain blind source separation for many speech signals. Proceedings of the 5th International Conference on Independent Component Analysis and Blind Signal Separation (ICA '04), September 2004, Granada, Spain 461-469.
  25. Parra LC, Alvino CV: Geometric source separation: merging convolutive source separation with geometric beamforming. IEEE Transactions on Speech and Audio Processing 2002, 10(6):352-362. doi:10.1109/TSA.2002.803443
  26. Saruwatari H, Kawamura T, Shikano K: Fast-convergence algorithm for ICA-based blind source separation using array signal processing. Proceedings of the IEEE Workshop on the Applications of Signal Processing to Audio and Acoustics (WASPAA '01), October 2001, New Paltz, NY, USA 91-94.
  27. Soon VC, Tong L, Huang YF, Liu R: A robust method for wideband signal separation. Proceedings of IEEE International Symposium on Circuits and Systems (ISCAS '93), May 1993, Chicago, Ill, USA 1: 703-706.
  28. Matsuoka K, Ohya M, Kawamoto M: A neural net for blind separation of nonstationary signals. Neural Networks 1995, 8(3):411-419. doi:10.1016/0893-6080(94)00083-X
  29. Pham D-T: Joint approximate diagonalization of positive definite Hermitian matrices. SIAM Journal on Matrix Analysis and Applications 2001, 22(4):1136-1152. doi:10.1137/S089547980035689X
  30. Oppenheim AV, Schafer RW: Digital Signal Processing. Prentice Hall, Englewood Cliffs, NJ, USA; 1975.
  31. Sawada H, Mukai R, Araki S, Makino S: A robust and precise method for solving the permutation problem of frequency-domain blind source separation. IEEE Transactions on Speech and Audio Processing 2004, 12(5):530-538. doi:10.1109/TSA.2004.832994

Copyright

© Servière and Pham 2006
