The model description is defined in terms of the microwave source separation problem, where there are *n*_{f} maps of the sky at frequencies (*ν*_{1}, …, *ν*_{n_{f}}), each map consisting of *J* pixels. The data are denoted **d**_{j} ∈ ℝ^{n_{f}}, *j* = 1, …, *J*. The source model consists of *n*_{s} sources and is represented by the vectors **s**_{j} ∈ ℝ^{n_{s}}, with each component representing the amplitude of a physical source of microwaves. We assume that the **d**_{j} can be represented as a linear combination of the **s**_{j}:

{\mathbf{d}}_{j}=\mathbf{A}{\mathbf{s}}_{j}+{\mathbf{e}}_{j},

(1)

where **A** is an *n*_{f} × *n*_{s} “mixing” matrix and **e**_{j} is a vector of *n*_{f} independent Gaussian error terms with precisions (inverse variances) *τ* = (*τ*_{1}, …, *τ*_{n_{f}}). For convenience, define

\begin{array}{lcr}\mathbf{D}& =& \{{d}_{ij}\mid i=1,\dots ,{n}_{f},\; j=1,\dots ,J\};\\ \mathbf{S}& =& \{{s}_{kj}\mid k=1,\dots ,{n}_{s},\; j=1,\dots ,J\}\end{array}

to represent all data and sources.
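As a minimal sketch of the observation model in Equation (1), the snippet below simulates data for a handful of pixels. The matrix `A`, the precisions `tau`, the source draws, and the helper name `simulate_pixel` are hypothetical toy choices for illustration only; they are not values or code from this work. Note that `D` is stored pixel by pixel here, so `D[j][i]` holds *d*_{ij}.

```python
import random

def simulate_pixel(A, s, tau, rng):
    """Return d = A s + e for one pixel, with e_i ~ N(0, 1/tau_i) independently."""
    d = []
    for row, tau_i in zip(A, tau):
        signal = sum(a_ik * s_k for a_ik, s_k in zip(row, s))
        noise = rng.gauss(0.0, tau_i ** -0.5)  # std dev = 1 / sqrt(precision)
        d.append(signal + noise)
    return d

# Hypothetical toy setup: n_f = 3 frequencies, n_s = 2 sources, J = 4 pixels.
A = [[1.0, 1.0],
     [0.9, 0.5],
     [0.7, 0.2]]
tau = [100.0, 100.0, 400.0]
rng = random.Random(0)

# Toy source amplitudes; in the model these would follow the prior p(S | psi).
S = [[rng.gauss(0.0, 1.0) for _ in range(2)] for _ in range(4)]
D = [simulate_pixel(A, s_j, tau, rng) for s_j in S]
```

With very large precisions the simulated data approach the noise-free mixture **A** **s**_{j}, which is a quick sanity check on the implementation.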

We assume dependence between the sources, defined by a prior distribution *p*(**S**|*ψ*) with parameters *ψ*. The goal is to estimate **S** and the parameters *ψ* of the model for **S**, given the observed **D**. The noise precisions *τ* and the mixing matrix **A** are assumed known. A Gaussian mixture model (GMM) is used to represent the non-Gaussian sources, which makes the model an instance of a mixture of factor analyzers [10]. As in [10], we adopt a Bayesian approach to fitting the data, implemented by variational Bayes.

Bayesian inference will be based on the posterior distribution, which following the above description can be factorized as

p(\mathbf{S},\mathit{\psi}\mid\mathbf{A},\mathbf{D},\mathbf{\tau})\propto p(\mathbf{D}\mid\mathbf{S},\mathbf{A},\mathbf{\tau})\,p(\mathbf{S}\mid\mathit{\psi})\,p(\mathit{\psi}).

(2)

Each factor of this distribution is defined in turn below.

### Noise structure

The Gaussian errors **e**_{j} are assumed independent both across pixels *j* and across frequencies, which gives

p(\mathbf{D}\mid\mathbf{S},\mathbf{A},\mathbf{\tau})=\prod _{j=1}^{J}\prod _{i=1}^{{n}_{f}}\sqrt{\frac{{\tau}_{i}}{2\pi}}\,\exp\left(-\frac{{\tau}_{i}}{2}{({d}_{ij}-{\mathbf{A}}_{i\cdot}{\mathbf{s}}_{j})}^{2}\right)

(3)

where **A**_{i·} is the *i*th row of **A**.
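The logarithm of Equation (3) is a sum over pixels and frequencies, which can be sketched as follows. The function name `log_likelihood` and the pixel-major layout (`D[j][i]` for *d*_{ij}, `S[j]` for **s**_{j}, `A[i]` for the *i*th row of **A**) are assumptions made for this illustration.

```python
import math

def log_likelihood(D, S, A, tau):
    """Log of Equation (3): independent Gaussian errors with precisions tau."""
    total = 0.0
    for d_j, s_j in zip(D, S):                      # loop over pixels j
        for d_ij, row, tau_i in zip(d_j, A, tau):   # loop over frequencies i
            resid = d_ij - sum(a * s for a, s in zip(row, s_j))
            total += 0.5 * math.log(tau_i / (2.0 * math.pi))
            total -= 0.5 * tau_i * resid ** 2
    return total
```

Working on the log scale avoids the numerical underflow that the raw product over *J* × *n*_{f} terms would cause for realistic map sizes.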

### Mixing matrix structure

In this application, **A** is parameterized and denoted **A**(*θ*). Each column of **A**(*θ*) gives the contribution of one source to the observations at the different frequencies, written as a function of the frequencies and *θ*. These parameterizations are approximations that reflect the current state of knowledge about how the sources are generated. Here, we merely state the parameterization that we use and refer to [11] for a more detailed exposition of its background. Some restrictions are usually placed on **A**(*θ*) in order to force a unique solution; this is achieved here by setting every entry of the first row of **A**(*θ*) to one.

It is assumed that the CMB is the first source, so it corresponds to the first column of **A**(*θ*). It is modeled as a black body at temperature *T*_{0}, and its contribution is a known constant at each frequency. The first column of the mixing matrix is given by

{\mathbf{A}}_{i1}\left(\theta \right)=\frac{g\left({\nu}_{i}\right)}{g\left({\nu}_{1}\right)},

(4)

where

g\left({\nu}_{i}\right)={\left(\frac{h{\nu}_{i}}{{k}_{B}{T}_{0}}\right)}^{2}\frac{\exp(h{\nu}_{i}/{k}_{B}{T}_{0})}{{\left(\exp(h{\nu}_{i}/{k}_{B}{T}_{0})-1\right)}^{2}},

*T*_{0} = 2.725 K is the average CMB temperature, *h* is the Planck constant, and *k*_{B} is Boltzmann’s constant. The ratio *g*(*ν*_{i}) / *g*(*ν*_{1}) ensures that **A**_{11}(*θ*) = 1, as we constrain the first row of **A**(*θ*) to be ones. The remaining columns are

\begin{array}{ccc}{\mathbf{A}}_{i2}\left(\theta \right)\hfill & =\hfill & {\left(\frac{{\nu}_{i}}{{\nu}_{1}}\right)}^{{\kappa}_{s}},\hfill \\ {\mathbf{A}}_{i3}\left(\theta \right)\hfill & =\hfill & \frac{\text{exp}(h{\nu}_{1}/{k}_{\mathrm{B}}{T}_{1})-1}{\text{exp}(h{\nu}_{i}/{k}_{\mathrm{B}}{T}_{1})-1}{\left(\frac{{\nu}_{i}}{{\nu}_{1}}\right)}^{1+{\kappa}_{d}},\text{and}\hfill \\ {\mathbf{A}}_{i4}\left(\theta \right)\hfill & =\hfill & {\left(\frac{{\nu}_{i}}{{\nu}_{1}}\right)}^{{\kappa}_{f}},\hfill \end{array}

where *T*_{1} = 18.1 K is the assumed thermodynamic temperature of the dust grains; column 2 corresponds to synchrotron emission, column 3 to galactic dust, and column 4 to free–free emission. There are three unknown model parameters for **A**, namely the spectral indices for synchrotron, *κ*_{s} ∈ {*κ*_{s} : −3.0 ≤ *κ*_{s} ≤ −2.3}, for dust, *κ*_{d} ∈ {*κ*_{d} : 1 ≤ *κ*_{d} ≤ 2}, and for free–free emission, *κ*_{f} ∈ {*κ*_{f} : −2.3 ≤ *κ*_{f} ≤ −2.0}, so that *θ* = (*κ*_{s}, *κ*_{d}, *κ*_{f}).
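The four columns of **A**(*θ*) can be assembled directly from the expressions above. The sketch below assumes frequencies are supplied in Hz and uses SI values for *h* and *k*_{B}; the function names `g` and `mixing_matrix` and the example frequencies are hypothetical and chosen only for illustration.

```python
import math

H = 6.62607015e-34   # Planck constant, J s
K_B = 1.380649e-23   # Boltzmann constant, J / K
T0 = 2.725           # average CMB temperature, K
T1 = 18.1            # assumed dust grain temperature, K

def g(nu):
    """Black-body factor used for the CMB column (nu in Hz)."""
    x = H * nu / (K_B * T0)
    return x ** 2 * math.exp(x) / (math.exp(x) - 1.0) ** 2

def mixing_matrix(nus, kappa_s, kappa_d, kappa_f):
    """Rows follow frequencies; columns are CMB, synchrotron, dust, free-free."""
    nu1 = nus[0]
    A = []
    for nu in nus:
        r = nu / nu1
        cmb = g(nu) / g(nu1)
        synch = r ** kappa_s
        # expm1(x) = exp(x) - 1, evaluated accurately for small x
        dust = (math.expm1(H * nu1 / (K_B * T1))
                / math.expm1(H * nu / (K_B * T1))) * r ** (1.0 + kappa_d)
        free = r ** kappa_f
        A.append([cmb, synch, dust, free])
    return A

# Hypothetical example: three channels and spectral indices inside the stated ranges.
A = mixing_matrix([30e9, 44e9, 70e9], kappa_s=-2.7, kappa_d=1.5, kappa_f=-2.1)
```

Because every column is normalized by its value at *ν*_{1}, the first row of the resulting matrix consists of ones, matching the identifiability constraint.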

### The sources

The distribution of **s**_{j} is modeled as a GMM with *m* components. The proposed model allows for between-source dependence: the vector of sources at a pixel follows a mixture of multivariate Gaussians,

p(\mathbf{S}\mid\mathit{\psi})=\prod _{j=1}^{J}\sum _{a=1}^{m}{w}_{a}\,p({\mathbf{s}}_{j}\mid{\mathbf{\mu}}_{a},{Q}_{a})

(5)

where

p({\mathbf{s}}_{j}\mid{\mathbf{\mu}}_{a},{Q}_{a})=\sqrt{\frac{\left|{Q}_{a}\right|}{{\left(2\pi \right)}^{{n}_{s}}}}\,\exp\left(-\frac{1}{2}{({\mathbf{s}}_{j}-{\mathbf{\mu}}_{a})}^{T}{Q}_{a}({\mathbf{s}}_{j}-{\mathbf{\mu}}_{a})\right)

for mixture component weights *w*_{a}, mean vectors *μ*_{a}, and precision matrices *Q*_{a}, so that *ψ* comprises all the *w*_{a}, *μ*_{a}, and *Q*_{a}, with *a* = 1,…,*m*. Note that in the standard inflationary cosmological model the CMB is a single multivariate Gaussian (*m* = 1), while the Galactic foregrounds might require *m* > 1 to be modeled correctly. To accommodate this, we set the CMB for components 2,…,*m* of the multivariate mixture to be exactly zero in the implementation.
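For a single pixel, the log of the mixture density in Equation (5) can be evaluated as sketched below, with the Gaussian components parameterized by precision matrices as in the text. The helper names (`cholesky_logdet`, `log_gaussian`, `log_gmm`) are hypothetical, and the sketch assumes strictly positive weights and symmetric positive-definite *Q*_{a}; it does not implement the zero-CMB constraint for components 2,…,*m*.

```python
import math

def cholesky_logdet(Q):
    """log|Q| for a symmetric positive-definite matrix, via Cholesky Q = L L^T."""
    n = len(Q)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for k in range(i + 1):
            acc = Q[i][k] - sum(L[i][t] * L[k][t] for t in range(k))
            L[i][k] = math.sqrt(acc) if i == k else acc / L[k][k]
    return 2.0 * sum(math.log(L[i][i]) for i in range(n))

def log_gaussian(s, mu, Q):
    """Log density of N(mu, Q^{-1}) at s, with Q the precision matrix."""
    n = len(s)
    diff = [s_k - mu_k for s_k, mu_k in zip(s, mu)]
    quad = sum(diff[r] * Q[r][c] * diff[c] for r in range(n) for c in range(n))
    return 0.5 * cholesky_logdet(Q) - 0.5 * n * math.log(2.0 * math.pi) - 0.5 * quad

def log_gmm(s, weights, mus, Qs):
    """Log of sum_a w_a p(s | mu_a, Q_a), using the log-sum-exp trick."""
    terms = [math.log(w) + log_gaussian(s, mu, Q)
             for w, mu, Q in zip(weights, mus, Qs)]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))
```

Summing `log_gmm` over all pixels gives log *p*(**S**|*ψ*), since Equation (5) is a product over *j*.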

### Priors

The remaining term in Equation (2) is *p*(*ψ*). We use conjugate prior distributions [12], which facilitate the computation of the posterior yet are flexible enough to incorporate good prior information: Gaussian for the component means, Dirichlet for the component weights, and Wishart for the precision matrices. In the microwave source application, background knowledge about the magnitude of the sources can be incorporated by specifying values for the parameters of these prior distributions. This prior specification follows [13], which discusses how to specify these values in more detail.