Permutation Correction in the Frequency Domain in Blind Separation of Speech Mixtures

Servière, Ch; Pham, DT

doi:10.1155/ASP/2006/75206

Research Article
Open access
Published: 01 December 2006


EURASIP Journal on Advances in Signal Processing volume 2006, Article number: 075206 (2006)

Abstract

This paper presents a method for blind separation of convolutive mixtures of speech signals, based on the joint diagonalization of the time-varying spectral matrices of the observation records. The main, and still largely open, problem in a frequency-domain approach is the permutation ambiguity. An earlier paper by the authors exploited the continuity of the frequency response of the unmixing filters, but that approach leaves some permutation jumps across frequency. This paper therefore proposes a new method based on two assumptions. First, the frequency continuity of the unmixing filters is still used, to initialize the diagonalization algorithm. Second, the time-frequency representations of the sources are assumed to vary smoothly with frequency. This hypothesis, that the time variation of the source energy is continuous across frequency, is exploited over a sliding frequency band, which allows the remaining permutation jumps to be detected. The method is compared with other approaches, and results on real-world recordings demonstrate the superior performance of the proposed algorithm.
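
The second assumption can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes the separation has already been done in each frequency bin and that a nonnegative energy profile over time is available for every separated output. The function name `align_permutations`, the `band` parameter, and the correlation score are hypothetical choices made for illustration. Each bin is compared with the mean of the already aligned profiles in a sliding band of lower bins, and the output ordering that best matches that reference is kept.

```python
import itertools

import numpy as np


def align_permutations(energy, band=5):
    """Illustrative permutation alignment across frequency bins.

    energy: array of shape (n_freq, n_src, n_frames) holding the
            time-varying energy of each separated output in each bin.
    band:   number of previously aligned bins averaged as the reference
            (hypothetical parameter, not taken from the paper).
    Returns an integer array of shape (n_freq, n_src); energy[f, perms[f]]
    is the reordered set of outputs for bin f.
    """
    n_freq, n_src, _ = energy.shape
    # Work with log-energy profiles, centred over time in each bin.
    prof = np.log(energy + 1e-12)
    prof = prof - prof.mean(axis=2, keepdims=True)

    aligned = prof.copy()
    perms = np.tile(np.arange(n_src), (n_freq, 1))
    for f in range(1, n_freq):
        lo = max(0, f - band)
        # Reference: mean aligned profile over the sliding band below f.
        ref = aligned[lo:f].mean(axis=0)
        # Exhaustive search is adequate for a small number of sources.
        best = max(
            itertools.permutations(range(n_src)),
            key=lambda p: np.sum(ref * prof[f, list(p)]),
        )
        perms[f] = best
        aligned[f] = prof[f, list(best)]
    return perms
```

Bin 0 serves as the anchor here, so the result is defined only up to one global reordering of the sources, which is the usual residual ambiguity in blind separation.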

References

Parra LC, Spence C: Convolutive blind separation of non-stationary sources. IEEE Transactions on Speech and Audio Processing 2000, 8(3):320–327. 10.1109/89.841214
Smaragdis P: Blind separation of convolved mixtures in the frequency domain. Proceedings of the International ICSC Workshop on Independence & Artificial Neural Networks (I&ANN '98), February 1998, Tenerife, Spain 9–10.
Wu H-C, Principe JC: Simultaneous diagonalization in the frequency domain (SDIF) for source separation. Proceedings of the 1st International Conference on Independent Component Analysis and Signal Separation (ICA '99), January 1999, Aussois, France 245–250.
Mukai R, Araki S, Makino S: Separation and dereverberation performance of frequency domain blind source separation. Proceedings of the 3rd International Conference on Independent Component Analysis and Blind Signal Separation (ICA '01), December 2001, San Diego, Calif, USA 230–235.
Pham D-T, Cardoso J-F: Blind separation of instantaneous mixtures of nonstationary sources. IEEE Transactions on Signal Processing 2001, 49(9):1837–1848. 10.1109/78.942614
Pham D-T, Servière Ch, Boumaraf H: Blind separation of convolutive audio mixtures using nonstationarity. Proceedings of the 4th International Conference on Independent Component Analysis and Blind Signal Separation (ICA '03), April 2003, Nara, Japan 981–986.
Pham D-T, Servière Ch, Boumaraf H: Blind separation of speech mixtures based on nonstationarity. Proceedings of 7th International Symposium on Signal Processing and Its Applications (ISSPA '03), July 2003, Paris, France 2: 73–76.
Matsuoka K, Nakashima S: Minimal distortion principle for blind source separation. Proceedings of the 3rd International Conference on Independent Component Analysis and Blind Signal Separation (ICA '01), December 2001, San Diego, Calif, USA 722–727.
Murata N, Ikeda S, Ziehe A: An approach to blind source separation based on temporal structure of speech signals. Neurocomputing 2001, 41(1–4):1–24.
Sawada H, Winter S, Mukai R, Araki S, Makino S: Estimating the number of sources for frequency-domain blind source separation. Proceedings of the 5th International Conference on Independent Component Analysis and Blind Signal Separation (ICA '04), September 2004, Granada, Spain 610–617.
Torkkola K: Blind separation for audio signals—Are we there yet? Proceedings of the 1st International Workshop on Independent Component Analysis and Signal Separation (ICA '99), January 1999, Aussois, France 239–244.
Westner A: Object-based audio capture: separating acoustically mixed sounds, M.S. thesis. Massachusetts Institute of Technology, Cambridge, Mass, USA; 1998.
Parra LC, Spence C: On-line convolutive blind source separation of non-stationary signals. Journal of VLSI Signal Processing Systems for Signal, Image, and Video Technology 2000, 26(1–2):39–46.
Anemüller J, Kollmeier B: Amplitude modulation decorrelation for convolutive blind source separation. Proceedings of the 2nd International Workshop on Independent Component Analysis and Blind Signal Separation (ICA '00), June 2000, Helsinki, Finland 215–220.
Wang W, Chambers JA, Sanei S: A novel hybrid approach to the permutation problem of frequency domain blind source separation. Proceedings of the 5th International Conference on Independent Component Analysis and Blind Signal Separation (ICA '04), September 2004, Granada, Spain 532–539.
Ikeda S, Murata N: A method of blind separation based on temporal structure of signals. Proceedings of the 5th International Conference on Neural Information Processing (ICONIP '98), October 1998, Kitakyushu, Japan 737–742.
Servière Ch, Pham D-T: A novel method for permutation correction in frequency-domain in blind separation of speech mixtures. Proceedings of the 5th International Conference on Independent Component Analysis and Blind Signal Separation (ICA '04), September 2004, Granada, Spain 807–815.
Asano F, Ikeda S, Ogawa M, Asoh H, Kitawaki N: A combined approach of array processing and independent component analysis for blind separation of acoustic signals. Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '01), May 2001, Salt Lake City, Utah, USA 5: 2729–2732.
Kamata K, Hu X, Kobatake H: A new approach to the permutation problem in frequency domain blind source separation. Proceedings of the 5th International Conference on Independent Component Analysis and Blind Signal Separation (ICA '04), September 2004, Granada, Spain 849–856.
Ikram MZ, Morgan DR: A beamforming approach to permutation alignment for multichannel frequency-domain blind speech separation. Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '02), May 2002, Orlando, Fla, USA 1: 881–884.
Knaak M, Araki S, Makino S: Geometrically constrained ICA for robust separation of sound mixtures. Proceedings of the 4th International Conference on Independent Component Analysis and Blind Signal Separation (ICA '03), April 2003, Nara, Japan 951–956.
Kurita S, Saruwatari H, Kajita S, Takeda K, Itakura F: Evaluation of blind signal separation method using directivity pattern under reverberant conditions. Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '00), June 2000, Istanbul, Turkey 5: 3140–3143.
Mitianoudis N, Davies M: Permutation alignment for frequency domain ICA using subspace beamforming methods. Proceedings of the 5th International Conference on Independent Component Analysis and Blind Signal Separation (ICA '04), September 2004, Granada, Spain 669–676.
Mukai R, Sawada H, Araki S, Makino S: Frequency domain blind source separation for many speech signals. Proceedings of the 5th International Conference on Independent Component Analysis and Blind Signal Separation (ICA '04), September 2004, Granada, Spain 461–469.
Parra LC, Alvino CV: Geometric source separation: merging convolutive source separation with geometric beamforming. IEEE Transactions on Speech and Audio Processing 2002, 10(6):352–362. 10.1109/TSA.2002.803443
Saruwatari H, Kawamura T, Shikano K: Fast-convergence algorithm for ICA-based blind source separation using array signal processing. Proceedings of the IEEE Workshop on the Applications of Signal Processing to Audio and Acoustics (WASPAA '01), October 2001, New Paltz, NY, USA 91–94.
Soon VC, Tong L, Huang YF, Liu R: A robust method for wideband signal separation. Proceedings of IEEE International Symposium on Circuits and Systems (ISCAS '93), May 1993, Chicago, Ill, USA 1: 703–706.
Matsuoka K, Ohya M, Kawamoto M: A neural net for blind separation of nonstationary signals. Neural Networks 1995, 8(3):411–419. 10.1016/0893-6080(94)00083-X
Pham D-T: Joint approximate diagonalization of positive definite Hermitian matrices. SIAM Journal on Matrix Analysis and Applications 2001, 22(4):1136–1152. 10.1137/S089547980035689X
Oppenheim AV, Schafer RW: Digital Signal Processing. Prentice Hall, Englewood Cliffs, NJ, USA; 1975.
Sawada H, Mukai R, Araki S, Makino S: A robust and precise method for solving the permutation problem of frequency-domain blind source separation. IEEE Transactions on Speech and Audio Processing 2004, 12(5):530–538. 10.1109/TSA.2004.832994

Author information

Authors and Affiliations

Laboratoire des Images et des Signaux, BP 46, St Martin d'Hères Cedex, 38402, France
Ch Servière
Laboratoire de Modélisation et Calcul, BP 53, Grenoble Cedex, 38041, France
DT Pham

Corresponding author

Correspondence to Ch Servière.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License ( https://creativecommons.org/licenses/by/2.0 ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Servière, C., Pham, D. Permutation Correction in the Frequency Domain in Blind Separation of Speech Mixtures. EURASIP J. Adv. Signal Process. 2006, 075206 (2006). https://doi.org/10.1155/ASP/2006/75206

Received: 31 January 2005
Revised: 26 August 2005
Accepted: 01 September 2005
Published: 01 December 2006
DOI: https://doi.org/10.1155/ASP/2006/75206
