
Dependent Gaussian mixture models for source separation

EURASIP Journal on Advances in Signal Processing 2012, 2012:239

Received: 31 January 2012

Accepted: 30 October 2012

Published: 16 November 2012


Source separation is a common task in signal processing and is often analogous to factor analysis. In this study, we look at a factor analysis model for source separation of multi-spectral image data where prior information about the sources and their dependencies is quantified as a multivariate Gaussian mixture model with an unknown number of factors. Variational Bayes techniques for model parameter estimation are used. The development of this methodology is motivated by the need to bring an efficient solution to the separation of components in the microwave radiation maps that are being obtained by the satellite mission Planck which has the objective of uncovering cosmic microwave background radiation. The proposed algorithm successfully incorporates a rich variety of prior information available to us in this problem in contrast to many previous solutions that assume completely blind separation of the sources. Results on realistic simulations of Planck maps and on Wilkinson microwave anisotropy probe fifth year images are shown. The technique suggested is easily applicable to other source separation applications by modifying some of the priors.


  • CMB
  • Gaussian mixture priors
  • Variational Bayes


The discovery of the cosmic microwave background (CMB) is strong evidence for the Big Bang theory of the formation and development of the universe. According to the theory, the early universe was smaller and hotter but cooled as it expanded. Once the temperature had fallen to about 3000 K, photons were free to propagate without being scattered off ionized matter; the CMB is an image of this event and is visible across the entire sky. Three satellites have been launched to measure the CMB: the Cosmic Background Explorer, the Wilkinson Microwave Anisotropy Probe (WMAP) and, most recently, the Planck surveyor. Planck provides the highest-resolution data to date, of the order of $10^7$ pixels across the sky measured at nine channels.

Unfortunately, the signals measured by these satellites as shown in Figure 1 contain radiation not only from CMB, but also contributions from a number of other sources, namely foreground radiations and extragalactic sources in addition to antenna receiver noise. Foreground sources from our galaxy include synchrotron, dust, and free–free emission. Therefore, the separation of the CMB signal from other sources is an important stage in the production of CMB maps [1].
Figure 1

Observed WMAP 7-year data. The data were taken from the NASA WMAP website [2].

To date, there have been several attempts to achieve this separation in a Bayesian framework using both (a) a Gaussian mixture model (GMM) prior [3], and (b) a Markov random field (MRF) prior [4, 5]. Full-sky maps at low resolution, obtained through MCMC and using masks to reduce the effect of the signal in the galactic plane, were described in [6]. Some of these are fully Bayesian source separation methods developed to separate the underlying CMB from the mixed signals of extraterrestrial microwaves observed at several frequencies.

A common assumption among works in the literature is the independence of the cosmological sources. Although it is well known that CMB is independent from the rest of the sources, the galactic sources demonstrate significant statistical dependence among themselves, as stated in [1]. Recently, a small number of researchers have started addressing this problem [7, 8]. Various dependent component analysis approaches are compared in [9], demonstrating their superior performance with respect to classical ICA.

In this study, we present a dependent components model for source separation of multi-spectral image data, where prior information about the sources and between-source dependencies is quantified as a multivariate GMM, using variational Bayes techniques for model parameter estimation. This article can thus be considered an extension of [3], modeling dependencies between sources by generalizing the prior to a multivariate GMM.

The rest of the article is structured as follows. The next section gives the model for the mixing problem and describes the hierarchical Bayesian model that we use, including the prior we assume for the sources. Section “Implementing the source separation” describes the variational Bayes approach we use for the implementation of the separation. Section “Examples” provides results on both synthetic Planck and real WMAP images. Finally, we provide a discussion of the results in the last section.


Model

The model is described in terms of the microwave source separation problem, where there are $n_f$ maps of the sky at frequencies $(\nu_1, \ldots, \nu_{n_f})$, each map consisting of $J$ pixels. The data are denoted $d_j \in \mathbb{R}^{n_f}$, $j = 1, \ldots, J$. The source model consists of $n_s$ sources and is represented by the vectors $s_j \in \mathbb{R}^{n_s}$, with each component representing the amplitude of a physical source of microwaves. We assume that the $d_j$ can be represented as a linear combination of the $s_j$:
$$d_j = A s_j + e_j, \tag{1}$$
where $A$ is an $n_f \times n_s$ “mixing” matrix and $e_j$ is a vector of $n_f$ independent Gaussian error terms with precisions (inverse variances) $\tau = (\tau_1, \ldots, \tau_{n_f})$. For convenience, define
$$D = \{ d_{ij} \mid i = 1, \ldots, n_f,\ j = 1, \ldots, J \}; \quad S = \{ s_{kj} \mid k = 1, \ldots, n_s,\ j = 1, \ldots, J \}$$

to represent all data and sources.
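As a rough illustration of the linear mixing model above, the following sketch draws synthetic observations for all pixels at once. The function name `simulate_observations` and the toy mixing matrix, sources, and precisions are assumptions made for this example only; they are not taken from the article.

```python
import numpy as np

def simulate_observations(A, S, tau, rng):
    """Draw D = A S + E for all pixels at once.

    A   : (n_f, n_s) mixing matrix
    S   : (n_s, J) source amplitudes, one column per pixel
    tau : (n_f,) noise precisions (inverse variances), one per frequency
    rng : numpy random Generator
    """
    noise_sd = 1.0 / np.sqrt(tau)  # precision -> standard deviation
    E = rng.normal(size=(A.shape[0], S.shape[1])) * noise_sd[:, None]
    return A @ S + E

# Hypothetical toy problem: 3 frequencies, 2 sources, 1000 pixels.
rng = np.random.default_rng(0)
A_toy = np.array([[1.0, 1.0], [0.9, 0.5], [0.8, 0.2]])
S_toy = rng.normal(size=(2, 1000))
tau_toy = np.array([1e4, 1e4, 1e4])
D_toy = simulate_observations(A_toy, S_toy, tau_toy, rng)
```

Storing the data as an $n_f \times J$ array keeps one row per frequency map, which matches the indexing $d_{ij}$ used in the text.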

We assume dependence between the sources, defined by a prior distribution $p(S \mid \psi)$ with parameters $\psi$. The goal is to estimate $S$ and the parameters $\psi$ associated with the model for $S$, given observation of $D$. The noise precisions $\tau$ and the mixing matrix $A$ are assumed known. GMMs are used to represent the non-Gaussian sources, in which case the model is an example of what is known as a mixture of factor analyzers [10]. As in [10], we adopt a Bayesian approach to the data fitting, implemented by a variational Bayes approach.

Bayesian inference will be based on the posterior distribution, which following the above description can be factorized as
$$p(S, \psi \mid A, D, \tau) \propto p(D \mid S, A, \tau)\, p(S \mid \psi)\, p(\psi). \tag{2}$$

Each element of this distribution is defined next in turn.

Noise structure

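Gaussian error, $e_j$, is assumed independent within and between pixels $j$ and frequencies, which gives
$$p(D \mid S, A, \tau) = \prod_{j=1}^{J} \prod_{i=1}^{n_f} \sqrt{\frac{\tau_i}{2\pi}} \exp\left\{ -\frac{\tau_i}{2} \left( d_{ij} - A_{i\cdot}\, s_j \right)^2 \right\},$$

where $A_{i\cdot}$ is the $i$th row of $A$.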
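The Gaussian noise model translates directly into a log-likelihood that is cheap to evaluate over all pixels. The sketch below is one possible way to compute it; the function name `log_likelihood` and the array layout ($D$ as $n_f \times J$, $S$ as $n_s \times J$) are assumptions for illustration.

```python
import numpy as np

def log_likelihood(D, S, A, tau):
    """log p(D | S, A, tau) under independent Gaussian noise.

    D   : (n_f, J) observations
    S   : (n_s, J) sources
    A   : (n_f, n_s) mixing matrix
    tau : (n_f,) noise precisions
    """
    resid = D - A @ S                      # d_ij - A_i. s_j for every i, j
    J = D.shape[1]
    # Normalising constant: each frequency contributes J copies of 0.5*log(tau_i / 2 pi).
    log_norm = 0.5 * J * np.sum(np.log(tau / (2.0 * np.pi)))
    return log_norm - 0.5 * np.sum(tau[:, None] * resid**2)

# One pixel, one frequency, unit precision, residual of 1.
ll_example = log_likelihood(np.array([[1.0]]), np.array([[0.0]]),
                            np.array([[1.0]]), np.array([1.0]))
```

Working on the log scale avoids underflow, which matters here because the product runs over millions of pixels.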

Mixing matrix structure

In this application, $A$ is parameterized and denoted $A(\theta)$. Each column of $A(\theta)$ is the contribution to the observation of a source at different frequencies, which is written as a function of the frequencies and $\theta$. These parameterizations are approximations that come from the current state of knowledge about how the sources are generated. Here, we merely state the parameterization that we are going to use, and refer to [11] for a more detailed exposition of the background to them. Some restrictions are usually placed on $A(\theta)$ in order to force a unique solution; this is achieved here by setting the first row of $A(\theta)$ to be ones.

It is assumed that the CMB is the first source and therefore it corresponds to the first column of A(θ). It is modeled as a black body at a temperature, and its contribution is a known constant at each frequency. The parametrization of the mixing matrix is given as
$$A_{i1}(\theta) = \frac{g(\nu_i)}{g(\nu_1)},$$
$$g(\nu_i) = \left( \frac{h \nu_i}{k_B T_0} \right)^2 \frac{\exp(h \nu_i / k_B T_0)}{\left( \exp(h \nu_i / k_B T_0) - 1 \right)^2},$$
where $T_0 = 2.725$ K is the average CMB temperature, $h$ is the Planck constant, and $k_B$ is Boltzmann’s constant. The ratio $g(\nu_i) / g(\nu_1)$ is designed to ensure that $A_{11}(\theta) = 1$, as we constrain the first row of $A(\theta)$ to be ones. The remaining columns are
$$A_{i2}(\theta) = \left( \frac{\nu_i}{\nu_1} \right)^{\kappa_s}, \quad A_{i3}(\theta) = \frac{\exp(h \nu_1 / k_B T_1) - 1}{\exp(h \nu_i / k_B T_1) - 1} \left( \frac{\nu_i}{\nu_1} \right)^{1 + \kappa_d}, \quad \text{and} \quad A_{i4}(\theta) = \left( \frac{\nu_i}{\nu_1} \right)^{\kappa_f},$$

where $T_1 = 18.1$ K is the assumed thermodynamical temperature of the dust grains, and column 2 corresponds to synchrotron, column 3 to galactic dust, and column 4 to free–free emission. There are three unknown model parameters for $A$: the spectral indices for synchrotron, $\kappa_s \in \{ \kappa_s : -3.0 \le \kappa_s \le -2.3 \}$, for dust, $\kappa_d \in \{ \kappa_d : 1 \le \kappa_d \le 2 \}$, and for free–free emission, $\kappa_f \in \{ \kappa_f : -2.3 \le \kappa_f \le -2.0 \}$.
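The parameterization above can be sketched in a few lines. The function names `cmb_g` and `mixing_matrix`, the SI values of the physical constants, and the use of the nine nominal Planck channel frequencies as input are assumptions made for this illustration; the article's own computations were carried out in Matlab.

```python
import numpy as np

H = 6.62607015e-34    # Planck constant (J s)
K_B = 1.380649e-23    # Boltzmann constant (J / K)
T0 = 2.725            # average CMB temperature (K)
T1 = 18.1             # assumed dust grain temperature (K)

def cmb_g(nu_hz):
    """g(nu) = x^2 e^x / (e^x - 1)^2 with x = h nu / (k_B T0)."""
    x = H * nu_hz / (K_B * T0)
    return x**2 * np.exp(x) / np.expm1(x)**2

def mixing_matrix(nu_ghz, kappa_s, kappa_d, kappa_f):
    """Columns: CMB, synchrotron, galactic dust, free-free; first row is all ones."""
    nu = np.asarray(nu_ghz, dtype=float) * 1e9   # GHz -> Hz
    ratio = nu / nu[0]
    cmb = cmb_g(nu) / cmb_g(nu[0])
    synchrotron = ratio**kappa_s
    x1 = H * nu / (K_B * T1)
    dust = np.expm1(x1[0]) / np.expm1(x1) * ratio**(1.0 + kappa_d)
    free_free = ratio**kappa_f
    return np.column_stack([cmb, synchrotron, dust, free_free])

# Nominal Planck channel frequencies in GHz (assumed here for illustration).
PLANCK_GHZ = [30, 44, 70, 100, 143, 217, 353, 545, 857]
A_planck = mixing_matrix(PLANCK_GHZ, kappa_s=-2.9, kappa_d=2.0, kappa_f=-2.14)
```

Because every column is normalized by its value at $\nu_1$, the first row is identically one, which is the identifiability constraint stated in the text.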

The sources

The distribution of $s_j$ is modeled as a GMM with $m$ factors. The model proposed allows for between-source dependence; the vector of sources at a pixel is a mixture of multivariate Gaussians
$$p(S \mid \psi) = \prod_{j=1}^{J} \sum_{a=1}^{m} w_a\, p(s_j \mid \mu_a, Q_a)$$
$$p(s_j \mid \mu_a, Q_a) = \sqrt{\frac{|Q_a|}{(2\pi)^{n_s}}} \exp\left\{ -\frac{1}{2} (s_j - \mu_a)^T Q_a (s_j - \mu_a) \right\}$$

for mixture component weights $w_a$, mean vectors $\mu_a$, and precision matrices $Q_a$, so that $\psi$ is all the $w_a$, $\mu_a$ and $Q_a$, with $a = 1, \ldots, m$. Note that in the standard inflationary cosmological model the CMB is a single multivariate Gaussian ($m = 1$), while the Galactic foregrounds might require $m > 1$ to be correctly modeled. In order to fulfill this, we set the CMB for components $2, \ldots, m$ of the multivariate mixture to be exactly zero in the implementation.
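To make the mixture prior concrete, the sketch below evaluates the log density of each pixel's source vector under a multivariate GMM parameterized by precision matrices. The function name `gmm_log_density` and the array layout are assumptions for this example; it does not implement the CMB-specific restriction on components $2, \ldots, m$.

```python
import numpy as np

def gmm_log_density(S, w, mu, Q):
    """Per-pixel log of sum_a w_a N(s_j | mu_a, Q_a^{-1}).

    S  : (J, n_s) source vectors, one row per pixel
    w  : (m,) mixture weights summing to one
    mu : (m, n_s) component means
    Q  : (m, n_s, n_s) component precision matrices
    """
    n_s = S.shape[1]
    diff = S[:, None, :] - mu[None, :, :]                    # (J, m, n_s)
    quad = np.einsum("jak,akl,jal->ja", diff, Q, diff)        # (s - mu)^T Q (s - mu)
    _, logdet = np.linalg.slogdet(Q)                          # log |Q_a| for each a
    log_comp = (np.log(w) + 0.5 * logdet
                - 0.5 * n_s * np.log(2.0 * np.pi) - 0.5 * quad)
    # Log-sum-exp over components for numerical stability.
    top = log_comp.max(axis=1, keepdims=True)
    return (top + np.log(np.exp(log_comp - top).sum(axis=1, keepdims=True)))[:, 0]

# A single standard normal component evaluated at the origin.
lp_origin = gmm_log_density(np.zeros((1, 1)), np.array([1.0]),
                            np.zeros((1, 1)), np.ones((1, 1, 1)))
```

Off-diagonal entries in $Q_a$ are what encode the dependence between sources; with diagonal $Q_a$ the prior reduces to independent sources within each component.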


The remaining term in Equation (2) is $p(\psi)$. We use conjugate prior distributions [12], which facilitate the computation of the posterior and yet are flexible enough to incorporate good prior information: Gaussians for the component means, Dirichlet for the component weights, and Wishart for the precision matrices. In the microwave source application, background knowledge about the magnitude of the sources can be incorporated by specifying values of the parameters of these prior distributions. This prior specification follows [13], who discuss how to specify these values in more detail.

Implementing the source separation

The posterior developed in the previous section does not lend itself to an analytical solution. MCMC techniques are one approach that let us evaluate complicated integrals by sampling rather than by analytical or numerical methods. The main criticism of Bayesian source separation with sampling methods, MCMC in particular, is their computational load and slow convergence. Regarding the speed, they cannot compete with methods such as FastICA [14, 15].

There are several approaches to speed up the algorithm, such as the strategies suggested in [16]. In the image source separation problem framework, the Langevin sampling scheme has been implemented [4], as a way to obtain a faster MC algorithm.

In this study, the source separation model presented in Section “Model” is implemented by a variational Bayesian approach [10, 17, 18], which allows for more efficient inference on large data sets than MCMC techniques. In essence, given the data $D$ and a model with parameters $\theta$ and latent variables $Z$, the variational Bayes method is based on approximating the posterior distribution $p(Z, \theta \mid D)$ with a factorial approximation $q(Z, \theta \mid \phi) = q(Z \mid \phi_Z)\, q(\theta \mid \phi_\theta)$, where $\phi$ are the variational parameters. The approximation is fitted by minimizing the Kullback–Leibler divergence between $q$ and $p$, or equivalently maximizing a lower bound on the marginal log-likelihood of the data.
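The equivalence between minimizing the divergence and maximizing the lower bound follows from a standard identity, written here in the notation of the paragraph above (with the integral read as a sum over any discrete latent variables):

$$\log p(D) = \mathcal{L}(q) + \mathrm{KL}\big( q(Z, \theta \mid \phi) \,\|\, p(Z, \theta \mid D) \big), \qquad \mathcal{L}(q) = \int q(Z, \theta \mid \phi) \log \frac{p(D, Z, \theta)}{q(Z, \theta \mid \phi)} \, dZ \, d\theta.$$

Since $\log p(D)$ does not depend on $\phi$ and the Kullback–Leibler term is non-negative, $\mathcal{L}(q) \le \log p(D)$, and any change in $\phi$ that increases $\mathcal{L}(q)$ decreases the divergence by the same amount.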

Attias [19] has developed a fully Bayesian approach to GMM with a variational approximation to the posterior that, when conjugate priors are chosen, leads to the following components: Wishart densities for the precisions, $Q_a$; Normal densities for the means, $\mu_a$; a Dirichlet for the mixing coefficients, $p$; and a discrete distribution for the indicator posteriors, $z_j$, which indicate the component that explains the information in pixel $j$. We further derived the variational approximation to the marginal posterior of the sources, $s_j$, which turns out to be a multivariate Gaussian distribution. In brief
$$q(s_j) \sim \mathrm{MVN}(A_j, B_j), \quad q(p) \sim \mathcal{D}(\lambda), \quad q(\mu_a \mid Q_a) \sim \mathcal{N}(\xi_a, \beta_a Q_a), \quad q(Q_a) \sim \mathcal{W}(\eta_a, V_a)$$
$$q(z_j = a) \propto \exp\left\{ \Psi(\lambda_a) - \Psi\Big( \sum_{a} \lambda_a \Big) \right\} |V_a|^{1/2}\, 2^{n_s/2} \exp\left\{ \frac{1}{2} \sum_{i=1}^{n_s} \Psi\left( \frac{\eta_a + 1 - i}{2} \right) \right\} \exp\left\{ -\frac{n_s}{2 \beta_a} \right\} \exp\left\{ -\frac{\eta_a}{2} \left[ (A_j - \xi_a)^T V_a (A_j - \xi_a) + \mathrm{tr}\left( V_a (B_j)^{-1} \right) \right] \right\}$$

where MVN stands for the multivariate normal distribution and $\Psi$ denotes the digamma function. Here the vector $A_j$ is the variational posterior mean of $s_j$ and should not be confused with the mixing matrix $A$, and $B_j$ is the corresponding precision matrix. Note that $q(z_j = a)$ is the probability that component $a$ is responsible for the information in pixel $j$ of the sources, $s_j$.

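The quantities of interest, i.e., the hyper-parameters to be computed, $A_j$, $B_j$, $\lambda$, $\xi_a$, $\beta_a$, $\eta_a$, and $V_a$ for $j = 1, \ldots, J$ and $a = 1, \ldots, m$, have the following values:
$$(B_j)_{kl} = \sum_{i=1}^{n_f} \tau_i A_{ik} A_{il} + \sum_{a=1}^{m} q(z_j = a)\, \eta_a (V_a)_{kl}$$
$$(A_j)_k = \left[ (B_j)^{-1} v \right]_k, \quad \text{with} \quad v_k = \sum_{i=1}^{n_f} \tau_i d_{ij} A_{ik} + \sum_{a=1}^{m} q(z_j = a)\, \eta_a \sum_{l=1}^{n_s} (V_a)_{kl} (\xi_a)_l$$
$$\lambda_a = \sum_j q(z_j = a) + \lambda_a^{\text{prior}}$$
$$\xi_a = \frac{\sum_j \left[ q(z_j = a) A_j \right] + \beta_a^{\text{prior}} \xi_a^{\text{prior}}}{\sum_j q(z_j = a) + \beta_a^{\text{prior}}}$$
$$\beta_a = \sum_j q(z_j = a) + \beta_a^{\text{prior}}$$
$$\eta_a = \sum_j q(z_j = a) + \eta_a^{\text{prior}}$$
$$V_a = \sum_j q(z_j = a) \left[ \left( 1 - \frac{2\, q(z_j = a)}{\sum_{j'} q(z_{j'} = a)} \right) (B_j)^{-1} + \Phi + (A_j - \bar{\mu}_a)(A_j - \bar{\mu}_a)^T \right] + V_a^{\text{prior}} + \frac{\beta_a^{\text{prior}} \left[ \sum_j q(z_j = a) \right] \left[ \Phi + (\bar{\mu}_a - \xi_a^{\text{prior}})(\bar{\mu}_a - \xi_a^{\text{prior}})^T \right]}{\beta_a}$$
$$\bar{\mu}_a = \frac{\sum_j \left[ q(z_j = a) A_j \right]}{\sum_j q(z_j = a)}, \qquad (\Phi)_{kl} = \frac{\sum_j \left\{ \left[ q(z_j = a) \right]^2 (B_j)^{-1}_{kl} \right\}}{\left[ \sum_j q(z_j = a) \right]^2}.$$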
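As a partial illustration of one sweep of these updates, the sketch below computes the source posterior parameters $A_j$ and $B_j$ for every pixel, given the current responsibilities $q(z_j = a)$ and the current $\eta_a$, $V_a$, and $\xi_a$. It is written in Python with NumPy rather than the Matlab used by the authors, and the function name `update_source_posterior` and the array shapes are assumptions for this example. The remaining updates are not shown.

```python
import numpy as np

def update_source_posterior(d, A, tau, resp, eta, V, xi):
    """Update q(s_j) = MVN(mean_j, precision B_j) for all pixels.

    d    : (J, n_f) observations, one row per pixel
    A    : (n_f, n_s) mixing matrix
    tau  : (n_f,) noise precisions
    resp : (J, m) responsibilities q(z_j = a)
    eta  : (m,) Wishart degrees of freedom
    V    : (m, n_s, n_s) Wishart matrices V_a
    xi   : (m, n_s) component mean parameters
    """
    lik_prec = A.T @ (tau[:, None] * A)                 # sum_i tau_i A_ik A_il
    exp_Q = eta[:, None, None] * V                      # eta_a V_a for each component
    B = lik_prec[None, :, :] + np.einsum("ja,akl->jkl", resp, exp_Q)
    exp_Q_xi = np.einsum("akl,al->ak", exp_Q, xi)       # eta_a V_a xi_a
    v = (d * tau) @ A + resp @ exp_Q_xi                 # (J, n_s)
    mean = np.linalg.solve(B, v[..., None])[..., 0]     # solves B_j mean_j = v_j
    return mean, B

# Hypothetical check: near noise-free data and a very weak prior recover the sources.
rng = np.random.default_rng(1)
A_chk = rng.normal(size=(5, 2))
s_true = rng.normal(size=(3, 2))
mean_chk, B_chk = update_source_posterior(
    s_true @ A_chk.T, A_chk, np.full(5, 1e6), np.ones((3, 1)),
    np.array([1.0]), 1e-6 * np.eye(2)[None], np.zeros((1, 2)))
```

Solving the linear system for each pixel, rather than forming $(B_j)^{-1}$ explicitly, is a design choice made here for numerical stability; the explicit inverse would still be needed for the $V_a$ and $\Phi$ updates.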

Computations were carried out using Matlab.


Examples

Analysis of simulated data

The synthetic Planck data were generated using the Planck Sky Model (PSM) package. The PSM is a set of IDL codes created by the Planck WG2 team in order to provide realistic simulations of the sky at the Planck frequencies [20]. Figure 2 shows data obtained from realistic simulations of CMB, synchrotron, and galactic dust on a 512 × 512 patch. The original sources are shown in Figure 3. The data were generated at the nine frequencies that are observed by Planck, from 30 to 857 GHz. The mixing matrix used was as defined in Section “Mixing matrix structure” with $\kappa_s = -2.9$ and $\kappa_d = 2.0$. Noise precisions were those published by the Planck research team [21]. After exploring several values for $m$, the number of components in the GMM source model was fixed at $m = 1$, as it provided the best fit, taking into account the compromise between fit and the number of parameters in the model.
Figure 2

Observations on three of the nine channels (lowest, middle, and highest frequencies are shown) of the data generated from the source separation model with realistic simulations of CMB, synchrotron, and galactic dust.

Figure 3

The simulated sources used to generate the simulated data in Figure 2.

Figure 4 shows an estimate of CMB, along with a scatter plot of this estimate against the true value, as shown in Figure 3. Such an estimate is the average of the samples obtained for the first column of A, which corresponds to CMB. We see from the scatter plot and from comparison with Figure 3 that the reconstruction of CMB is very accurate here. The same is true for the other two sources, as shown in Figure 5.
Figure 4

The posterior mean of the reconstruction of the CMB with a scatter plot of true versus posterior mean.

Figure 5

The posterior mean of the reconstruction of synchrotron and galactic dust.

Table 1 shows the mean of the parameters of the model. Regarding the between-source dependence structure, posterior estimates of $V^{-1}_{1k}$, $k = 2, 3$, are approximately 0, suggesting independence between CMB and the other sources, as expected. On the other hand, the posterior estimate of $V^{-1}_{23} \neq 0$, indicating dependence between synchrotron and galactic dust.
Table 1

Mean estimate of parameters for simulated data

$$\hat{\mu} = \begin{pmatrix} 0.020 \\ 0.014 \\ 0.008 \end{pmatrix}$$

$$\hat{Q} = \begin{pmatrix} 4.665 & 0 & 0 \\ 0 & 0.006 & 0.003 \\ 0 & 0.003 & 0.010 \end{pmatrix} \times 10^{8}$$
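One way to read the dependence structure off an estimated precision matrix is to invert it and rescale the result to correlations. The sketch below does this for a purely illustrative matrix with the same block pattern as Table 1 (CMB decoupled from two mutually dependent foregrounds); the function name `precision_to_correlation` and the numerical values are assumptions, not the article's estimates.

```python
import numpy as np

def precision_to_correlation(Q):
    """Invert a precision matrix and rescale the covariance to unit diagonal."""
    cov = np.linalg.inv(Q)
    sd = np.sqrt(np.diag(cov))
    return cov / np.outer(sd, sd)

# Hypothetical precision matrix: source 1 is independent of sources 2 and 3,
# while sources 2 and 3 are coupled through the off-diagonal entries.
Q_example = np.array([[4.0, 0.0, 0.0],
                      [0.0, 2.0, 1.0],
                      [0.0, 1.0, 2.0]])
corr_example = precision_to_correlation(Q_example)
```

A zero off-diagonal block in the precision matrix stays zero after inversion, so a source that is decoupled from the others in $Q$ is also uncorrelated with them.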

Analysis of a WMAP year 5 patch

WMAP [22] was launched in 2001 and its data collection activities finished in 2010. It observed at five frequencies from 22 to 90 GHz. Figure 6 shows a patch of 5-year WMAP data.
Figure 6

Temperatures (in mK) in a 20° square patch of the sky from WMAP [2] at 5 microwave frequencies (clockwise from top left: 22, 30, 40, 90, and 60 GHz).

The algorithm was implemented with four sources (CMB, synchrotron, dust, and free–free emission). The noise precisions were assumed to be the published values for the WMAP detectors. The spectral index for free–free emission was fixed at −2.14 (following [11]) and the synchrotron and dust spectral indices were as in the first example. The number of components in the GMM source model was fixed at $m = 2$, following the same reasoning as in the simulation study. Informative priors were placed on the GMM parameters, based on discussions of the expected marginal properties of the sources. Table 2 shows the mean of the mixture parameters of the model. Figure 7 shows the estimated CMB. The result obtained is in agreement with previous work [3], as can be seen in Figure 8, which shows a histogram of the differences between the CMB estimated using the approach presented here and the CMB estimate obtained in [3].
Figure 7

Estimated CMB.

Figure 8

Histogram of the (pixel-by-pixel) differences between the estimated CMB using the approach presented in this article and the estimated CMB obtained in [3]. The result obtained is in agreement with previous work.

Table 2

Parameter estimated mean for WMAP data

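a = 1

$$\hat{p}_1 = 0.12$$

$$\hat{\mu}_1 = \begin{pmatrix} 0.012 \\ 0.051 \\ 0.001 \\ 0.017 \end{pmatrix}$$

$$\hat{Q}_1 = \begin{pmatrix} 5.29 & 0 & 0 & 0 \\ 0 & 1.47 & 0.04 & 1.60 \\ 0 & 0.04 & 0.41 & 0.32 \\ 0 & 1.60 & 0.32 & 3.79 \end{pmatrix} \times 10^{5}$$

a = 2

$$\hat{p}_2 = 0.88$$

$$\hat{\mu}_2 = \begin{pmatrix} 0.012 \\ 0.029 \\ 0.003 \\ 0.017 \end{pmatrix}$$

$$\hat{Q}_2 = \begin{pmatrix} 1.45 & 0 & 0 & 0 \\ 0 & 0.54 & 0.08 & 0.20 \\ 0 & 0.08 & 0.28 & 0.26 \\ 0 & 0.20 & 0.26 & 1.09 \end{pmatrix} \times 10^{4}$$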

Finally, in order to show the fit of the data to the model, Figure 9 is a scatter plot of the observed values $d_{jk}$ against the standardized residuals, with one plot for each frequency $k = 1, \ldots, 5$.
Figure 9

Assessment of model fit. Scatter plot of the posterior predicted values of $d_j$ against the standardized residual over all pixels.
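A simple version of such a residual check can be sketched as follows. This is an assumption about how the residuals might be standardized (by the known noise precisions only, ignoring posterior uncertainty in the sources), not necessarily the exact computation behind Figure 9; the function name `standardized_residuals` is hypothetical.

```python
import numpy as np

def standardized_residuals(D, S_hat, A, tau):
    """Residuals (d_ij - A_i. s_hat_j) scaled by sqrt(tau_i).

    D     : (n_f, J) observations
    S_hat : (n_s, J) estimated sources (e.g. posterior means)
    A     : (n_f, n_s) mixing matrix
    tau   : (n_f,) noise precisions
    """
    return (D - A @ S_hat) * np.sqrt(tau)[:, None]

# Toy example: offsets of 0.1 and 0.2 with noise sd 0.1 and 0.2 give unit residuals.
A_res = np.array([[1.0], [2.0]])
S_res = np.array([[1.0, 2.0]])
D_res = A_res @ S_res + np.array([[0.1], [0.2]])
r_res = standardized_residuals(D_res, S_res, A_res, np.array([100.0, 25.0]))
```

If the model fits well, each row of the result should look roughly like draws from a standard normal distribution with no visible trend against the predicted values.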


A fully Bayesian factor analysis algorithm has been presented and applied to a multi-channel image source separation problem, where dependencies between sources are modeled as a multivariate GMM. The algorithm performs very well on simulated Planck data and has been applied to data from WMAP.

In this study, we extend previous approaches [3] by allowing the source priors to be a mixture of multivariate Gaussian distributions for each pixel.

The development of this methodology is motivated by the need to bring an efficient solution to the separation of components in the microwave radiation maps to be obtained by the satellite mission Planck, which has the objective of uncovering CMB radiation. The proposed algorithm successfully incorporates a rich variety of prior information available to us in this problem, in contrast to most previous work, which assumes completely blind separation of the sources. Further, the variational approach presented here overcomes the convergence problems of MCMC stated in [23] when dealing with large datasets such as those that will be provided by Planck.

In the analysis of simulated data, the number of components in the GMM source model turned out to be $m = 1$. This means that the sources are multivariate Gaussian a priori. On the other hand, for real data, the number of components is $m = 2$. In blind source separation problems, identifiability relies on the independence of the sources. In this study, in spite of modeling the sources as Gaussians when $m = 1$, identifiability is obtained because of the prior information incorporated into the model, which gives structure to the mixing matrix.

Another type of dependence is spatial correlation within a source. Spatial dependence is most conveniently modeled by a Gaussian MRF, and some preliminary work on this idea can be found in [5]. Combining this with cross-source correlations, one might ultimately consider a mixture of multivariate Gaussian MRFs as a prior for the sources. Implementing the analysis with such a prior would be a significant computational challenge; we hypothesize that it will be difficult to derive a well-behaved MCMC approach. Other functional approximations, such as that of [24], offer a feasible alternative for computing the posterior distribution in this case.

Finally, although the technique was developed with the astrophysical source separation problem in mind, it is general and applicable to other source separation problems as well.



Simon Wilson was supported by the STATICA project, funded by the Principal Investigator program of Science Foundation Ireland, contract number 08/IN.1/I1879. The authors acknowledge the use of the PSM, developed by the Component Separation Working Group (WG2) of the Planck Collaboration.

Authors’ Affiliations

Department of Signal Theory and Communications, Universidad Rey Juan Carlos, Fuenlabrada, Spain
Department of Statistics, Trinity College Dublin, Dublin, Ireland


  1. Kuruoglu EE: Bayesian source separation for cosmology. IEEE Signal Process. Mag 2010, 27: 43-54.
  2. []
  3. Wilson SP, Kuruoglu EE, Salerno E: Fully Bayesian source separation of astrophysical images modelled by a mixture of Gaussians. IEEE J. Sel. Topics Signal Process 2008, 2(5):685-696.
  4. Kayabol K, Kuruoglu EE, Sanz JL, Sankur B, Salerno E, Herranz D: Adaptive Langevin sampler for separation of t-distribution modelled astrophysical maps. IEEE Trans. Image Process 2010, 19(9):2357-2368.
  5. Kayabol K, Kuruoglu EE, Sankur B: Bayesian separation of images modeled with MRFs using MCMC. IEEE Trans. Image Process 2009, 18(5):982-994.
  6. Dickinson C, Eriksen HK, Banday AJ, Jewell JB, Gorski KM, Huey G, Lawrence CR, O’Dwyer IJ, Wandelt BD: Bayesian component separation and cosmic microwave background estimation for the five-year WMAP temperature data. Astrophys. J 2010, 705: 1607-1623.
  7. Bedini L, Herranz D, Salerno E, Baccigalupi C, Kuruoglu EE, Tonazzini A: Separation of correlated astrophysical sources using multiple-lag data covariance matrices. EURASIP J. Appl. Signal Process 2005, 2005(15):2400-2412. 10.1155/ASP.2005.2400
  8. Bonaldi A, Bedini L, Salerno E, Baccigalupi C, De Zotti G: Estimating the spectral indices of correlated astrophysical foregrounds by a second-order statistical approach. Monthly Notices R. Astron. Soc 2006, 373: 271-279. 10.1111/j.1365-2966.2006.11025.x
  9. Kuruoglu EE: Dependent component analysis for cosmology. Lecture Notes Comput. Sci 2010, 6365: 538-545. 10.1007/978-3-642-15995-4_67
  10. Ghahramani Z, Beal M: Variational inference for Bayesian mixtures of factor analysers. In Advances in Neural Information Processing Systems. Edited by: Solla SA, Leen TK, Muller KR. MIT Press, Cambridge, MA; 2000.
  11. Eriksen HK, Dickinson C, Lawrence CR, Baccigalupi C, Banday AJ, Gorski KM, Hansen FK, Lilje PB, Pierpaoli E, Smith KM, Vanderlinde K: CMB component separation by parameter estimation. Astrophys. J 2006, 641: 665-682. 10.1086/500499
  12. Lee PM: Bayesian Statistics: An Introduction. Hodder Arnold H&S, London; 2004.
  13. Richardson S, Green P: On Bayesian analysis of mixtures with an unknown number of components (with discussion). J. R. Stat. Soc. Ser. B 1997, 59: 731-792. 10.1111/1467-9868.00095
  14. Hyvärinen A: Fast and robust fixed-point algorithms for independent component analysis. IEEE Trans. Neural Netw 1999, 10(3):626-634. 10.1109/72.761722
  15. Leach S, Cardoso JF, Baccigalupi C, Barreiro R, Betoule M, Bobin J, Bonaldi A, Delabrouille J, De Zotti G, Dickinson C, Eriksen HK, González-Nuevo J, Hansen FK, Herranz D, Le Jeune M, López-Caniego M, Martínez-González E, Massardi M, Melin JB, Miville-Deschênes MA, Patanchon G, Prunet S, Ricciardi S, Salerno E, Sanz JL, Starck JL, Stivoli F, Stolyarov V, Stompor R, Vielva P: Component separation methods for the PLANCK mission. Astron. Astrophys 2008, 491(2):597-615. 10.1051/0004-6361:200810116
  16. Gilks WR, Richardson S, Spiegelhalter DJ: Markov Chain Monte Carlo in Practice. CRC Press, Boca Raton, FL; 1996.
  17. MacKay DJC: Information Theory, Inference and Learning Algorithms. Cambridge University Press, Cambridge; 2003.
  18. Bishop CM: Pattern Recognition and Machine Learning. Springer, New York; 2006.
  19. Attias H: A Variational Bayesian Framework for Graphical Models. MIT Press, Cambridge, MA; 2000.
  20. Collaboration P: The pre-launch Planck sky model: a model of sky emission at submillimetre to centimetre wavelengths (in preparation).
  21. []
  22. []
  23. Wilson SP, Kuruoglu EE, Quirós A: Bayesian factor analysis using Gaussian mixture sources, with application to separation of the cosmic microwave background. 2nd International Workshop on Cognitive Information Processing 2010.
  24. Rue H, Martino S, Chopin N: Approximate Bayesian inference for latent Gaussian models using integrated nested Laplace approximations. J. R. Stat. Soc. Ser. B 2008, 71: 319-392.


© Quirós and Wilson; licensee Springer. 2012

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.