Dependent component analysis
© Kuruoglu and Theis; licensee Springer. 2013
Received: 29 November 2013
Accepted: 2 December 2013
Published: 16 December 2013
Source separation is not a new problem. Humans, like many other species, do it unconsciously at every instant of our lives. Our organisms receive a multitude of signals mixed together from the environment, and we are constantly uncovering the relevant ones in order to derive the information that is vital for our survival. Beyond the separation of biological signals at the cell or organ level, there are also various source separation tasks that we perform consciously every day. In a crowded underground, room or office, we try to pick out what a friend or colleague is telling us from all the other speech and audio signals reaching our ears.
Although the source separation problem is handled rather seamlessly by our brains, its implementation in computers has required the development of mathematical models and algorithms. Starting in the 1980s, the problem was first addressed in the context of the cocktail party problem: how to separate a number of speech signals from multichannel mixtures of them. Originally, a simple linear mixing model was adopted, and the independent component analysis (ICA) algorithm provided a solution to this simplified problem. Later, several other applications were considered, with source separation problems ranging from financial time series analysis to functional magnetic resonance imaging, and from cosmological image separation to music signal separation. The model has been extended to convolutive mixtures and to non-linear mixtures, and variations of the ICA algorithm have been utilised.
The essence of independent component analysis is the assumption that the sources are statistically independent. Since the problem is a blind one, that is, we do not know the channel (mixing) characteristics, the mixture model alone represents an ill-posed problem, and we need additional information to resolve the unmixing. The independence assumption provides this additional information, and one tries to design an inverse (unmixing) transformation by minimising the mutual information between the estimated sources.
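The linear mixing model and the role of the independence assumption can be illustrated with a small numerical sketch. The example below is not taken from any paper in this issue: the mixing matrix `A`, the two synthetic sources and the helper names `rotation` and `mix_and_unmix` are all assumptions made for illustration. Instead of minimising mutual information directly, the sketch whitens the mixtures and then searches for the rotation that maximises the sum of squared kurtoses of the outputs, a simpler contrast that is commonly used as a proxy for independence in the two-source case.

```python
import numpy as np


def rotation(theta):
    """Return the 2x2 rotation matrix for angle theta (radians)."""
    c, sn = np.cos(theta), np.sin(theta)
    return np.array([[c, -sn], [sn, c]])


def mix_and_unmix(n_samples=20000, seed=0):
    """Mix two independent sources linearly, then estimate an unmixing matrix.

    Returns the true sources s, the estimated sources y, and the global
    matrix W @ A, which should be close to a scaled permutation matrix.
    """
    rng = np.random.default_rng(seed)

    # Two independent, non-Gaussian, unit-variance sources (illustrative choice).
    s = np.vstack([
        rng.uniform(-np.sqrt(3.0), np.sqrt(3.0), n_samples),
        rng.laplace(0.0, 1.0 / np.sqrt(2.0), n_samples),
    ])

    # Hypothetical mixing matrix; in the blind setting it is unknown.
    A = np.array([[1.0, 0.6], [0.4, 1.0]])
    x = A @ s  # observed mixtures: x = A s

    # Whitening: decorrelate the mixtures and scale them to unit variance.
    x = x - x.mean(axis=1, keepdims=True)
    eigval, eigvec = np.linalg.eigh(np.cov(x))
    V = eigvec @ np.diag(eigval ** -0.5) @ eigvec.T
    z = V @ x

    # After whitening only a rotation remains unknown. Pick the angle that
    # maximises the sum of squared excess kurtoses of the outputs.
    def contrast(theta):
        y = rotation(theta) @ z
        kurt = np.mean(y ** 4, axis=1) - 3.0
        return float(np.sum(kurt ** 2))

    # Angles in [0, pi/2) suffice: larger rotations only permute or flip outputs.
    thetas = np.linspace(0.0, np.pi / 2, 180, endpoint=False)
    best = max(thetas, key=contrast)

    W = rotation(best) @ V  # estimated unmixing matrix
    return s, W @ x, W @ A
```

Calling `s, y, G = mix_and_unmix()` returns estimated sources `y` that match the true sources only up to ordering and sign, which is the usual ambiguity of blind separation. The sketch relies on the sources being independent and non-Gaussian; when the sources are dependent, as discussed in the rest of this editorial, such a contrast is no longer guaranteed to identify them.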
Despite the success of ICA in certain problems, it has recently been observed that the independence assumption does not hold in many signal mixing scenarios across various applications. For example, in the cosmological image separation problem, it is well known that cosmological sources are statistically dependent. In functional magnetic resonance imaging (fMRI), it is not reasonable to assume that brain processes are independent. These and many other cases make it clear that there is a need to relax the independence assumption and to generalise the methods developed for independent source separation to the case of statistically dependent sources that have some other additional structure in the source space.
To this end, a number of approaches have been proposed in the literature since the early years of the twenty-first century. An initial approach was to group the statistically dependent sources in such a way that the groups are independent of each other. This approach, which started as multidimensional ICA, later evolved into independent subspace analysis with developments in theory and methodology. Other extensions include independent vector analysis, where the mixture model is separated into multiple layers, with dependent source components across the layers being grouped into a multidimensional source vector.
These approaches aim to cluster the sources that are mutually dependent and to separate them from sources that are independent. In contrast, other approaches aim to develop a model for the dependence between sources and to perform separation by making use of this dependence model. Two examples are topographic component analysis, an early approach that defines a topographic rule to model the statistical dependencies between components, and tree-dependent component analysis, which fits a tree structure to the conditional probabilities over the different sources.
Dependent component analysis algorithms have recently found various applications, including speech recognition, music classification, fMRI analysis, face recognition, biomedical image analysis, watermarking and gene expression array analysis.
In this special issue, we aimed to provide a view of the general panorama of the research on dependent component analysis. The issue starts with a paper by Castella et al., who demonstrate that for certain classes of dependent sources, classical ICA methods still apply, and who also give explicit conditions for this situation. Next comes a methodological paper by Caiafa, which provides the criterion for choosing valid objective functions for independent and dependent source separation. The following algorithmic paper, by Shen and Kleinsteuber, is on complex independent vector analysis and presents non-unitary matrix diagonalisation methods. Na and Yu introduce a method utilising subspace and subband non-linearity for independent vector analysis. Four application-oriented papers follow. Quiros and Wilson provide a Bayesian formulation for the separation of dependent sources in astrophysical images using a mixture prior. Tonazzini and Bedini provide results on document image restoration using correlated component analysis. Almeida et al. study the separation of phase-locked sources in MEG data. Liang and Chambers provide a separation method based on independent vector analysis for multimedia data.
We hope that the papers presented in this issue will provoke discussions and further research on different aspects of dependent component analysis, leading to new theoretical and applicative results.
Ercan E. Kuruoglu
Fabian J. Theis
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.