Contribution of statistical tests to sparseness-based blind source separation
EURASIP Journal on Advances in Signal Processing volume 2012, Article number: 169 (2012)
Abstract
We address the problem of blind source separation in the underdetermined mixture case. Two statistical tests are proposed to reduce the number of empirical parameters involved in standard sparseness-based underdetermined blind source separation (UBSS) methods. The first test performs multi-source selection of the time–frequency points suitable for source recovery and is fully automatic. The second one is dedicated to auto-source selection for mixing matrix estimation and requires fixing only two parameters, regardless of the SNR. We experimentally show that the use of these tests incurs no performance loss and even improves the performance of standard weak-sparseness UBSS approaches.
Introduction
Source separation is aimed at reconstructing multiple sources from multiple observations (mixtures) captured by an array of sensors. In what follows, we assume these sensors to be linear, which is acceptable in many applications. The problem is said to be blind when the observations are linearly mixed by the transfer medium and no prior knowledge on the transfer medium or the sources is available. Blind source separation (BSS) is an important research topic in a variety of fields, including radar processing [1], medical imaging [2], communications [3, 4], and speech and audio processing [5]. BSS problems can be classified according to the nature of the mixing process (instantaneous, convolutive) and the ratio between the number of sources and the number of sensors of the problem (underdetermined, overdetermined).
If the sources are assumed to be statistically independent, solutions to the BSS problem are calculated so as to optimize separation criteria based on higher-order statistics [6, 7]. Otherwise, when the sources have temporal coherency [8], are nonstationary [9], or possibly cyclostationary [10], the separation criteria to optimize are based on second-order statistics.
Although BSS algorithms exist in great profusion, the underdetermined case (UBSS for underdetermined blind source separation), where the number of sensors is smaller than the number of sources, is less addressed than the overdetermined case, where the number of sensors is greater than or equal to the number of sources. Therefore, the UBSS problem is still challenging.
In the UBSS case, one way to deal with the lack of information is to use an expectation-maximization-based method [11] to obtain a maximum-likelihood estimate of the mixing matrix and the sources. However, such an approach requires prior knowledge of the source distributions. In contrast, sparseness-based methods [12–20] solve the UBSS problem without prior knowledge of the source distributions, by exploiting the sparseness of the nonstationary sources in the time–frequency domain. Roughly speaking, sparseness-based approaches [21] involve transforming the mixtures into an appropriate representation domain. The transformed sources are then estimated thanks to their sparseness and, finally, the sources are reconstructed by inverse transform. A source is said to be sparse in a given signal representation domain if most of its coefficients in this domain are (almost) zero and only a few of them are big.
In the instantaneous mixture case, where each observation consists of a sum of sources with different signal intensities in the presence of noise, the sparseness-based methods introduced in [12–17], among others, rely on parameters that are chosen empirically. The general question addressed in this article is to what extent this empirical parameter choice can be bypassed thanks to statistical methods specifically designed to cope with sparse representations. This question is particularly relevant because a whole family of sparseness-based UBSS algorithms relies on assumptions very similar to those employed in theoretical frameworks dedicated to the detection and estimation of sparse signals. Our contribution to this question is the following.
The UBSS algorithms proposed in [12–17] estimate the unknown mixing matrix by assuming the presence of only one single source at each time–frequency point. In practice, a selection of the time–frequency points that probably pertain to one single source is expected to improve the performance of the mixing matrix estimation. The mixing matrix estimate is then used to recover the source signals. Rejecting time–frequency points of noise alone, and thus selecting and processing only the time–frequency points where the possibly multiple sources are present, should also improve the overall performance of the methods. Our contribution is to perform the selection processes mentioned in the foregoing by considering them as statistical decision problems and reducing the number of empirical parameters for better robustness. Sparseness hypotheses are particularly suitable for detecting the time–frequency points needed by the separation procedure, whereas such hypotheses are useless for selecting the time–frequency points used by the mixing matrix estimation.
More specifically, Section “Main steps of standard UBSS methods” recalls the source recovery and mixing matrix estimation steps in classical UBSS methods based on sparseness assumptions. By so proceeding, we highlight the empirical parameters required by these steps. Section “Statistical tests for sparseness-based UBSS” is then the core of the article, since it introduces the statistical tests for the selection of the time–frequency points needed by source recovery and mixing matrix estimation. For source recovery, the selection of the time–frequency points relies on a weak notion of sparseness, exploited through an estimate-and-plug-in detector: we begin by estimating the noise standard deviation via the d-dimensional amplitude trimmed estimator (DATE), recently introduced in [22] and especially designed to cope with noisy representations of weakly-sparse signals; the noise standard deviation estimate is then used instead of the unknown true value in the expression of a statistical test, likewise designed for noisy representations of weakly-sparse signals. For the mixing matrix estimation, the physics of the signals suggests a novel strategy. Indeed, the problem is to select time–frequency points whose energy is big enough in noise to consider that they pertain to one single source. We thus introduce a tolerance above which the energy of these relevant points must lie, regardless of noise. A statistical test involving this tolerance and based on the signal norm testing (SNT) recently introduced in [23] is then used to select these points in the presence of noise.
Summarizing, we thus significantly extend [24] by introducing three new features of importance. First, we replace the modified complex essential supremum estimate (MCESE) of the noise standard deviation by the DATE, which is as accurate, relies on an even stronger theoretical background and has a significantly lower computational cost. Second, the selection of the time–frequency points of interest for source recovery is performed by using a thresholding test, as in [24], but the value of the detection threshold is determined automatically on the basis of the results provided in [25] for the detection of signals satisfying the weak-sparseness model in noise. Third, the mixing matrix estimation is carried out by taking the physical nature of the signals into account.
In Section “Simulation results”, we apply the statistical tests of Section “Statistical tests for sparseness-based UBSS” to several standard UBSS methods [15, 16, 18, 26, 27] in the instantaneous mixture case. We thus show that our statistical algorithms reduce the number of empirical parameters and improve the overall performance of the UBSS methods under consideration. For instance, by using these statistical algorithms, the subspace-based method presented in [15] can be automated to a large extent, so as to involve only two parameters. These two parameters are adjusted once and for all SNRs, in contrast to standard UBSS methods.
In Section “Discussion”, these results are discussed. In particular, the convolutive mixture case is addressed for its importance in practice. Some perspectives of this work are then presented in the concluding Section “Conclusion and perspectives”.
Main steps of standard UBSS methods
Principles
We consider the instantaneous mixing system
$$\mathbf{x}(t)=\mathbf{A}\,\mathbf{s}(t)+\mathbf{n}(t),\qquad(1)$$
where t ranges in some finite set of sampling times such that, for every t in this set, $\mathbf{s}(t)=[s_1(t),s_2(t),\dots,s_N(t)]^T$ is the vector of the N sources, $\mathbf{x}(t)=[x_1(t),x_2(t),\dots,x_M(t)]^T$ is the M-dimensional mixture vector, $\mathbf{A}=[\mathbf{a}_1,\mathbf{a}_2,\dots,\mathbf{a}_N]$ is the complex M × N mixing matrix and $\mathbf{n}(t)=[n_1(t),n_2(t),\dots,n_M(t)]^T$ is additive noise. It is assumed that the $(n_k(t))_{1\leqslant k\leqslant M}$ are Gaussian random processes, mutually decorrelated and independent of the sources. In the sequel, we address the underdetermined case where N > M. Without loss of generality, we assume that the column vectors of A all have unit norm, i.e., $\|\mathbf{a}_i\|=1$ for all i ∈ {1,2,…,N}.
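As a point of reference, the mixing model above is easy to simulate. The sketch below is purely illustrative (the source waveforms, dimensions and noise level are my own assumptions, not taken from the article): it builds an underdetermined mixture with N = 3 sources, M = 2 sensors and a unit-norm complex mixing matrix.

```python
import numpy as np

rng = np.random.default_rng(0)

M, N, T = 2, 3, 1000          # sensors, sources, samples (underdetermined: N > M)

# Hypothetical nonstationary-ish sources (illustrative choices only).
t = np.arange(T) / T
s = np.vstack([np.sin(2 * np.pi * 50 * t),
               np.sign(np.sin(2 * np.pi * 13 * t)),
               rng.laplace(size=T)])

# Random complex mixing matrix with unit-norm columns, as assumed in the text.
A = rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))
A /= np.linalg.norm(A, axis=0, keepdims=True)

sigma = 0.1                    # noise standard deviation
n = sigma * (rng.standard_normal((M, T))
             + 1j * rng.standard_normal((M, T))) / np.sqrt(2)

x = A @ s + n                  # instantaneous mixing model x = A s + n
```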
Time–frequency signal processing provides effective tools for analyzing nonstationary signals, whose frequency contents vary in time. It involves representing signals in a 2D space, namely the joint time–frequency domain, hence providing a distribution of the signal energy versus time and frequency simultaneously. The sparseness of the time–frequency coefficients of the source signals is one of the main keys to solving the UBSS problem.
One well-known time–frequency representation, and the most used in practice, is the short-time Fourier transform (STFT). The mixing process can be modeled in the time–frequency domain via the STFT as
$$\mathcal{S}_{\mathbf{x}}(t,f)=\mathbf{A}\,\mathcal{S}_{\mathbf{s}}(t,f)+\mathcal{S}_{\mathbf{n}}(t,f),\qquad(2)$$
where $\mathcal{S}_{\mathbf{x}}(t,f)$, $\mathcal{S}_{\mathbf{s}}(t,f)$ and $\mathcal{S}_{\mathbf{n}}(t,f)$ are the vectors of the STFT coefficients at time–frequency bin (t,f) of the mixtures, the sources and the noise, respectively.
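In practice, the STFT coefficient vectors can be computed channel-wise with an off-the-shelf routine. A minimal sketch using `scipy.signal.stft` follows; the placeholder data, sampling rate and window length are arbitrary assumptions of mine.

```python
import numpy as np
from scipy.signal import stft

rng = np.random.default_rng(1)
fs = 8000
x = rng.standard_normal((2, fs))   # M = 2 mixture channels (placeholder data)

# STFT of each mixture channel; S_x has shape (M, n_freqs, n_frames),
# so S_x[:, i, j] is the M-dimensional coefficient vector at one (t, f) bin.
f, t, S_x = stft(x, fs=fs, nperseg=256)
```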
Given x(t), our purpose is to recover s(t) or, equivalently, $\mathcal{S}_{\mathbf{s}}(t,f)$. As formalized in [28], the UBSS problem is generally decomposed into two separate subproblems. First, in the so-called mixing matrix estimation, the normalized columns $(\mathbf{a}_i)_{1\leqslant i\leqslant N}$ are estimated so as to obtain an estimate of A. Then, on the basis of this estimate, the second step, called source recovery, provides a solution to Equation (2). Figure 1 presents the flowchart of such a two-step approach.
We now detail the mixing matrix estimation and the source recovery based on sparseness assumptions.
Mixing matrix estimation
The UBSS methods based on sparse signal representations in the time–frequency domain share the following main assumption:
Assumption 1
For each source, there exists a set of time–frequency points where this source exists alone.
The elements of this set can be assumed to be isolated time–frequency points, as in the degenerate unmixing estimation technique (DUET) [15, 26], or to form a time–frequency box, as in the time–frequency ratio of mixtures (TIFROM) [16] and time–frequency CORRelation (TIFCORR) [27] methods. Assumption 1 is often reasonable thanks to the sparseness of the time–frequency representation of the sources, especially when the number of sources is moderate.
As mentioned above, the first step in UBSS methods is to estimate the mixing matrix A in order to achieve source recovery. In most two-step source separation algorithms [12, 13, 15–18], an auto-source selection is performed. By auto-source selection, we mean the detection of the regions where only one source occurs. The methods for estimating A on the basis of Assumption 1 can then be summarized as follows.
Jourjine et al. [26] present the DUET method, which is restricted to two mixtures (M = 2). They address the anechoic case, where the source transmission attenuations and delays between sensors are taken into account. The columns of the mixing matrix are estimated by finding peaks in a 2D histogram of amplitude-delay estimates.
In [16], the mixing matrix estimation of the TIFROM method is based on the complex ratios $\mathcal{S}_{x_j}(t,f)/\mathcal{S}_{x_k}(t,f)$, where, given m ∈ {1,2,…,M}, $\mathcal{S}_{x_m}(t,f)$ stands for the m-th coordinate of $\mathcal{S}_{\mathbf{x}}(t,f)$. These ratios are computed for each time–frequency point and for two arbitrarily chosen indices j and k in {1,2,…,M}. A first limitation of this method is that it assumes non-null matrix coefficients. A second limitation is the use of an empirical threshold to select the smallest empirical variances of these ratios.
In TIFCORR [27], the mixing matrix estimation is similar: the empirical covariance coefficients above a certain manually chosen threshold are selected.
The subspace-based UBSS (SUBSS) method [15] relies on another type of mixing matrix estimation. Let Ω_{k} stand for the set of all the time–frequency points (t,f) where the k-th source is present alone and Ω stand for the union of the sets Ω_{k} for k = 1,2,…,N. According to Assumption 1, the sets Ω_{k} are nonempty and so is Ω. For (t,f) ∈ Ω_{k}, (2) reduces to
$$\mathcal{S}_{\mathbf{x}}(t,f)=\mathbf{a}_k\,\mathcal{S}_{s_k}(t,f)+\mathcal{S}_{\mathbf{n}}(t,f).\qquad(3)$$
According to this result, the mixing matrix can be estimated as follows. First, all the spatial direction vectors $\mathbf{d}(t,f)=\mathcal{S}_{\mathbf{x}}(t,f)/\|\mathcal{S}_{\mathbf{x}}(t,f)\|$, with (t,f) ∈ Ω, are clustered by using an unsupervised clustering algorithm, taking into account that the number of sources is supposed to be known. Since (3) shows that, for all the time–frequency points (t,f) of Ω_{k}, the STFT vectors $\mathcal{S}_{\mathbf{x}}(t,f)$ have the same spatial direction $\mathbf{a}_k$, the column vectors of the mixing matrix A are then estimated as the centroids of the N classes returned by the clustering algorithm. In [15], Aïssa-El-Bey et al. propose the use of the k-means algorithm, but other techniques could be employed. The set Ω required for the clustering procedure is determined by comparing the ratio $\|\mathcal{S}_{\mathbf{x}}(t,f)\|/\max_{\xi}\|\mathcal{S}_{\mathbf{x}}(t,\xi)\|$ to a threshold, whose value is chosen empirically.
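The clustering step just described can be sketched in a few lines. This is a simplified illustration rather than the actual SUBSS implementation: the function name and the real/imaginary stacking are my own choices, and the phase ambiguity of complex direction vectors is deliberately ignored.

```python
import numpy as np
from scipy.cluster.vq import kmeans2

def estimate_mixing_matrix(S_x, N):
    """Cluster normalized STFT direction vectors; centroids estimate columns of A.

    S_x : (M, L) array of STFT coefficient vectors at selected (t, f) points.
    N   : assumed-known number of sources.
    """
    d = S_x / np.linalg.norm(S_x, axis=0, keepdims=True)   # spatial directions
    # k-means works on real vectors: stack real and imaginary parts.
    feats = np.vstack([d.real, d.imag]).T
    centroids, _ = kmeans2(feats, N, minit='++', seed=0)
    M = S_x.shape[0]
    A_hat = (centroids[:, :M] + 1j * centroids[:, M:]).T
    return A_hat / np.linalg.norm(A_hat, axis=0, keepdims=True)
```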
Source recovery
This section presents a number of techniques used in the source recovery stage of two-step UBSS algorithms. In the underdetermined case, the system (2) has fewer equations than unknowns and thus, in general, infinitely many solutions. In order to recover the original sources, additional assumptions are needed.
The DUET method [26] assumes the sources to be (approximately) W-disjoint orthogonal in the time–frequency domain, that is, the supports of the STFTs of any two sources present in the observations are disjoint. The source recovery is performed by partitioning the time–frequency plane using the mixing parameter estimates. This procedure assigns a source to each time–frequency point, even if this point is due to noise alone, which is detrimental to the overall performance of the method.
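To make the partition idea concrete, here is a rough sketch of masking-based recovery. It is my own simplification, not the actual DUET procedure (which works on amplitude-delay histograms): each time–frequency point is assigned to the estimated mixing column closest in direction, and the masked mixture is projected onto that column.

```python
import numpy as np

def masking_recovery(S_x, A_hat):
    """Assign each time–frequency point to the column of A_hat closest in
    direction, then mask and project: a crude partition-based recovery."""
    M, L = S_x.shape
    d = S_x / (np.linalg.norm(S_x, axis=0, keepdims=True) + 1e-12)
    # |<d, a_i>| serves as a phase-invariant direction-similarity score.
    scores = np.abs(A_hat.conj().T @ d)          # shape (N, L)
    labels = np.argmax(scores, axis=0)
    S_s_hat = np.zeros((A_hat.shape[1], L), dtype=complex)
    for i in range(A_hat.shape[1]):
        mask = labels == i
        # Project the masked mixture onto a_i to estimate the source STFT.
        S_s_hat[i, mask] = A_hat[:, i].conj() @ S_x[:, mask]
    return S_s_hat, labels
```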
Although TIFROM and TIFCORR do not require the sources to be W-disjoint orthogonal for source recovery, they suffer from the same limitation as DUET in that they also assign time–frequency points of noise alone to sources.
Bofill and Zibulevsky [18] use ℓ_{1}-norm minimization to recover the sources. In the noiseless case, this can be accomplished by solving the convex optimization problem
$$\min_{\mathcal{S}_{\mathbf{s}}(t,f)}\ \|\mathcal{S}_{\mathbf{s}}(t,f)\|_1\quad\text{subject to}\quad\mathbf{A}\,\mathcal{S}_{\mathbf{s}}(t,f)=\mathcal{S}_{\mathbf{x}}(t,f),\qquad(4)$$
where $\|\cdot\|_1$ is the ℓ_{1} norm. In the presence of noise, the foregoing constraint must be relaxed so as to take the noise standard deviation into account. In practice, this noise standard deviation is unknown and must be estimated.
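In the noiseless, real-valued case (a simplifying assumption on my part; the article's coefficients are complex), the ℓ_{1} program can be solved as a linear program by splitting the unknown into positive and negative parts, a standard reformulation:

```python
import numpy as np
from scipy.optimize import linprog

def l1_recover(A, x):
    """Minimize ||s||_1 subject to A s = x (real-valued sketch)."""
    M, N = A.shape
    # Split s = u - v with u, v >= 0, so that ||s||_1 = 1^T (u + v).
    res = linprog(c=np.ones(2 * N),
                  A_eq=np.hstack([A, -A]),
                  b_eq=x,
                  bounds=[(0, None)] * (2 * N),
                  method='highs')
    return res.x[:N] - res.x[N:]
```

For instance, with two sensors and three unit-norm columns, the sparsest consistent solution is favored over denser ones of larger ℓ_{1} norm.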
For the SUBSS approach in[15], the source recovery is based on the following assumptions:
Assumption 2
The number of active sources at any (t,f) is strictly less than the number M of sensors.
Assumption 3
Any M×M submatrix of the mixing matrix has full rank, that is, for all $J\subset\{1,2,\dots,N\}$ with cardinality at most M, the vectors $(\mathbf{a}_j)_{j\in J}$ are linearly independent.
The subspace approach then performs multi-source selection, that is, the selection of the time–frequency points pertaining to a mixture, and then identifies the sources present at each multi-source time–frequency point. Thanks to Assumption 2, the method then involves solving the resulting locally overdetermined linear problem. By construction, the method requires rejecting time–frequency points of noise alone. In [15], the time–frequency points with energy below some empirically chosen threshold are rejected.
Statistical tests for sparsenessbased UBSS
This section is the core of the article, since it is dedicated to a series of improvements brought to the classical UBSS methods presented in Section “Main steps of standard UBSS methods”. These improvements concern the selection of the time–frequency points of interest for source separation (multi-source selection) and the selection of the time–frequency points suitable for mixing matrix estimation (auto-source selection). The crux of the approach followed below is to consider the aforementioned selections of time–frequency points as statistical testing problems: accepting or rejecting the presence of sources in noise. These two hypothesis testing problems differ in that mixing matrix estimation requires selecting points where only one single source is present, whereas this constraint is useless for denoising and source recovery.
The issue in these binary hypothesis testing problems is twofold. On the one hand, the observation in each problem has an unknown distribution, because the possible source signal distributions are themselves unknown. On the other hand, the noise standard deviation is unknown as well. Because of this lack of prior knowledge, standard likelihood theory and its extensions, such as generalized likelihood ratios or invariance-based approaches, do not apply.
For source recovery, our solution is an estimate-and-plug-in detector. Based on a weak-sparseness model for the signal sources in noise, it begins by estimating the noise standard deviation via the DATE introduced in [22]. Then, the noise standard deviation estimate is used instead of the unknown true value in the expression of a statistical test, also designed for noisy sparse signal representations.
For mixing matrix estimation, we exploit the physical nature of the signals to detect the time–frequency points where one single source is present. For signals with high overlapping rate, SNT is appropriate to select such time–frequency points. When the signals have low overlapping rate, we directly use the time–frequency points provided by the source recovery procedure.
Figure 2 presents the flowchart of the proposed approach based on the DATE and SNT.
Weak-sparseness-based time–frequency detection for source recovery (multi-source selection)
Recovering sources involves detecting the time–frequency points that pertain to signals; time–frequency points due to noise alone are useless for source recovery. Detecting the time–frequency points appropriate for source recovery thus amounts to deciding whether any given time–frequency point (t,f) pertains to some signal of interest or not. It is therefore natural to state this problem as the binary hypothesis test where the null hypothesis $\mathcal{H}_0$ is that $\mathcal{S}_{\mathbf{x}}(t,f)\sim\mathcal{N}_c(0,\sigma^2 I_M)$ is complex Gaussian noise and the alternative hypothesis $\mathcal{H}_1$ is that $\mathcal{S}_{\mathbf{x}}(t,f)=\Theta(t,f)+\mathcal{S}_{\mathbf{n}}(t,f)$ is a source mixture in independent and additive complex Gaussian noise, where $\mathcal{S}_{\mathbf{n}}(t,f)\sim\mathcal{N}_c(0,\sigma^2 I_M)$ and Θ(t,f) stands for the mixture of signals possibly present at time–frequency point (t,f).
The issue is then the following. Although $\mathcal{S}_{\mathbf{x}}(t,f)$ can reasonably be modeled as a random complex variable, its distribution can hardly be known and standard likelihood theory thus becomes useless. This difficulty can, however, be overcome by resorting to a weak-sparseness model, introduced as follows.
Figure 3a displays the spectrogram, obtained by STFT, of a mixture of audio signals. This spectrogram exhibits many time–frequency components with small or even null amplitudes. When this mixture is corrupted by additive and independent noise as in Figure 3b, the small components are masked and only the big ones remain visible. We also note that the proportion of these big components remains seemingly less than or equal to one half. In other words, it is reasonable to assume that (1) the signal components are either present or absent in the time–frequency domain with a probability of presence less than or equal to one half and (2) when present, the signal components are relatively big in that their amplitude is above some minimum value. These two assumptions specify the weak-sparseness model by bounding our lack of prior knowledge on the signal distribution. The weak-sparseness model slightly differs from the “strong” sparsity model encountered in compressive sensing, where it is assumed that the non-null significant signal components are very few. In the weak-sparseness model, we do not restrict our attention to very small proportions of big time–frequency components.
To take the weak-sparseness model into account in our binary hypothesis problem statement, we assume that (1) the probability of occurrence of hypothesis $\mathcal{H}_1$ is less than or equal to one half and (2) there exists some positive real value α such that $\|\Theta(t,f)\|>\alpha$. The value α can be regarded as the minimum signal amplitude. We thus write that
$$\mathcal{H}_0:\ \mathcal{S}_{\mathbf{x}}(t,f)=\mathcal{S}_{\mathbf{n}}(t,f),\qquad \mathcal{H}_1:\ \mathcal{S}_{\mathbf{x}}(t,f)=\Theta(t,f)+\mathcal{S}_{\mathbf{n}}(t,f),\qquad(5)$$
with $\mathcal{S}_{\mathbf{n}}(t,f)\sim\mathcal{N}_c(0,\sigma^2 I_M)$, $\|\Theta(t,f)\|>\alpha$ and $\mathbb{P}(\mathcal{H}_1)\leqslant 1/2$. Furthermore, we do not assume that the probability distribution of Θ(t,f) is known. In what follows, we prefer summarizing this testing problem by introducing a Bernoulli-distributed random variable ϵ(t,f), valued in {0,1}, independent of Θ(t,f) and $\mathcal{S}_{\mathbf{n}}(t,f)$ but defined on the same probability space, so as to write $\mathcal{S}_{\mathbf{x}}(t,f)=\epsilon(t,f)\,\Theta(t,f)+\mathcal{S}_{\mathbf{n}}(t,f)$. We thus have $\mathbb{P}(\mathcal{H}_1)=\mathbb{P}[\epsilon(t,f)=1]$. Given any test $\mathcal{T}$, that is, any measurable map of $\mathbb{C}^M$ into {0,1}, we say that $\mathcal{T}$ accepts (resp. rejects) the null hypothesis $\mathcal{H}_0$ if $\mathcal{T}(\mathcal{S}_{\mathbf{x}}(t,f))=0$ (resp. $\mathcal{T}(\mathcal{S}_{\mathbf{x}}(t,f))=1$); in other words, $\mathcal{T}$ is said to return the index of the true hypothesis. The error probability of $\mathcal{T}$ is then defined as $P_e\{\mathcal{T}\}=\mathbb{P}[\mathcal{T}(\mathcal{S}_{\mathbf{x}}(t,f))\neq\epsilon(t,f)]$.
According to ([25], Theorem VII.1), the decision should then be performed by using the thresholding test with threshold height $\lambda_D(\alpha,\sigma)=(\sigma/\sqrt{2})\,\xi(\alpha\sqrt{2}/\sigma)$ where, for any positive ρ, $\xi(\rho)=I_0^{-1}\left(e^{\rho^2/2}\right)/\rho$ and $I_0$ is the zeroth-order modified Bessel function of the first kind. By thresholding test with threshold height $h\in[0,\infty)$, we mean the test $\mathcal{T}_h$ such that, for any $\mathbf{y}\in\mathbb{C}^M$, $\mathcal{T}_h(\mathbf{y})=1$ if $\|\mathbf{y}\|>h$ and $\mathcal{T}_h(\mathbf{y})=0$ otherwise.
The reasons why this test is recommended are the following. Let $\mathcal{L}_{\text{mpe}}$ be the minimum-probability-of-error (MPE) test, that is, the likelihood ratio test that guarantees the least possible probability of error among all possible tests and that could be computed if the probability distribution of Θ(t,f) and the prior probability of presence $\mathbb{P}(\mathcal{H}_1)$ were known. Two facts follow from ([25], Theorem VII.1). First, the error probability of $\mathcal{T}_{\lambda_D(\alpha,\sigma)}$ is above the error probability of the MPE test and less than or equal to the value of an explicit function $V(\alpha\sqrt{2}/\sigma)$, whose expression is not needed in the sequel. Second, $V(\alpha\sqrt{2}/\sigma)$ is a sharp upper bound, since it is attained by the error probabilities of the tests $\mathcal{L}_{\text{mpe}}$ and $\mathcal{T}_{\lambda_D(\alpha,\sigma)}$ in the least favorable case where $\mathbb{P}[\epsilon=1]=1/2$ and $\Theta(t,f)=\alpha\, e^{i\Phi(t,f)}$, with Φ(t,f) uniformly distributed in [0,2π) and i the imaginary unit (i² = −1). To carry out this test, we must choose an appropriate value for α and estimate σ.
The value of α is fixed by following the same reasoning as in [29] and considering that the minimum amplitude of the signal to detect is the noise maximum value. More specifically, given m independent and identically distributed random variables $X_1, X_2,\dots,X_m$ with $X_k\sim\mathcal{N}(0,\sigma^2)$ for $1\leqslant k\leqslant m$, it is known ([30], Eqs. (9.2.1), (9.2.2), Section 9.2, p. 187; [31], p. 454; [32], Section 2.4.4, p. 91) that
$$\lim_{m\to\infty}\mathbb{P}\Big[\max_{1\leqslant k\leqslant m}|X_k|\leqslant\sigma\sqrt{2\ln m}\Big]=1,$$
where $\lambda_u=\sigma\sqrt{2\ln m}$ is often called the universal threshold [33]. The maximum amplitude of $(X_k)_{1\leqslant k\leqslant m}$ thus has a strong probability of being close to $\lambda_u$ when m is large, and the universal threshold can be regarded as the maximum noise amplitude over m noise samples. In our case, we have M sensors, so that each observation $\mathcal{S}_{\mathbf{x}}(t,f)$ is an M-dimensional complex vector. Let L stand for the number of time–frequency points (t,f) obtained for each sensor. We thus have M×L time–frequency points (t,f) and, therefore, 2ML random variables (the real and imaginary parts of $\mathcal{S}_{\mathbf{n}}(t,f)$), which are $\mathcal{N}(0,\sigma^2/2)$. The maximum amplitude of these 2ML Gaussian independent and identically distributed random variables with standard deviation $\sigma/\sqrt{2}$ is then considered as the minimum signal amplitude, so that we set $\alpha=\sigma\sqrt{\log(2ML)}$. The threshold height used to detect the relevant time–frequency points is then $\lambda_D(\sigma)=\frac{\sigma}{\sqrt{2}}\,\xi\left(\sqrt{2\log(2ML)}\right)$, which is henceforth called the detection threshold.
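The detection threshold can be evaluated numerically. Since $I_0$ has no closed-form inverse, a root-finder can be used to invert it; the sketch below (function names are mine) assumes `scipy` is available.

```python
import numpy as np
from scipy.optimize import brentq
from scipy.special import i0

def xi(rho):
    """xi(rho) = I0^{-1}(exp(rho^2 / 2)) / rho, inverting I0 numerically."""
    target = np.exp(rho ** 2 / 2.0)
    # I0 is increasing on [0, inf); grow the bracket until it contains the root.
    hi = 1.0
    while i0(hi) < target:
        hi *= 2.0
    return brentq(lambda y: i0(y) - target, 0.0, hi) / rho

def detection_threshold(sigma, M, L):
    """lambda_D(sigma) = (sigma / sqrt(2)) * xi(sqrt(2 * log(2 M L)))."""
    return sigma / np.sqrt(2.0) * xi(np.sqrt(2.0 * np.log(2 * M * L)))
```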
As far as the estimation of the noise standard deviation is concerned, the usual solutions based on standard robust estimators, such as the median absolute deviation (MAD) [34] or the trimmed and winsorized estimators [35], do not apply. Indeed, considering the spectrogram of Figure 3b, it can easily be guessed that such standard estimators would fail, because the proportion of significant noisy time–frequency points pertaining to the signals is large. The noisy time–frequency points are therefore not very few and cannot play the role of outliers with respect to the main core of the data distribution. In a recent article [22], a new noise standard deviation estimator called the DATE has been proposed. This estimator relies on the weak-sparseness model presented above. An exhaustive presentation of its theoretical background is beyond the scope of the present article; the reader is referred to [22] for a heuristic presentation and a complete mathematical description of the DATE. In the context addressed here, this algorithm applies as follows.
With the notation used so far, each $\mathcal{S}_{\mathbf{x}}(t,f)$ is an M-dimensional complex vector. Let $\mathcal{S}_{x_j}(t,f)$, j = 1,2,…,M, be the components of $\mathcal{S}_{\mathbf{x}}(t,f)$. For any given j = 1,2,…,M, we assume that the L time–frequency components $\mathcal{S}_{x_j}(t,f)$ for the j-th sensor are independent and that each time–frequency component obeys the binary hypothesis model of (5) with $\alpha=\sigma\sqrt{\log(2ML)}$. According to [22] and setting κ = 2Γ(3/2), where Γ is the standard Gamma function, there exists a specific convergence criterion for which we have
$$\frac{\sum_{(t,f)}\left|\mathcal{S}_{x_j}(t,f)\right|\,\mathbb{1}\!\left(\left|\mathcal{S}_{x_j}(t,f)\right|\leqslant\lambda_D(\sigma)\right)}{\sum_{(t,f)}\mathbb{1}\!\left(\left|\mathcal{S}_{x_j}(t,f)\right|\leqslant\lambda_D(\sigma)\right)}\approx\kappa\,\sigma\qquad(7)$$
when the number L of time–frequency bins (t,f) is large enough. In the previous equation, $\mathbb{1}\!\left(\left|\mathcal{S}_{x_j}(t,f)\right|\leqslant\lambda_D(\sigma)\right)$ stands for the indicator function of the event $\left|\mathcal{S}_{x_j}(t,f)\right|\leqslant\lambda_D(\sigma)$. The specific convergence criterion involved in (7) is specified in [22] and is not given here because of its intricateness. It also turns out that the noise standard deviation σ is the unique solution of (7) with respect to this convergence criterion. Therefore, the DATE basically estimates the noise standard deviation by solving (7) with regard to this convergence criterion. The steps involved in the computation are the following.
The DATE:
Given j ∈ {1,2,…,M}, let $Y^j_{(1)}, Y^j_{(2)},\dots,Y^j_{(L)}$ be the L values $\left|\mathcal{S}_{x_j}(t,f)\right|$ sorted in ascending order.

(1) [Search interval]:

(a) Choose some positive real value Q less than or equal to $1-\frac{L}{4(L/2-1)^{2}}$.

(b) Set $h=1/\sqrt{4L(1-Q)}$.

(c) Compute $k_{\min}=L/2-hL$. According to the Bienaymé–Chebyshev inequality and since the probabilities of presence of the signals are assumed to be less than or equal to one half, the probability that the number of observations due to noise alone is above $k_{\min}$ is larger than or equal to Q. In the experimental results presented below, Q was set to 0.95 for the computation of $k_{\min}$.

(2) [Existence]: IF there exists a smallest integer k in $\{k_{\min},\dots,L\}$ such that
$$Y^j_{(k)}\leqslant\left(\mu_j(k)/\kappa\right)\,\xi\!\left(\sqrt{2\log(2ML)}\right)<Y^j_{(k+1)}\qquad(8)$$
with
$$\mu_j(k)=\begin{cases}\frac{1}{k}\sum_{r=1}^{k}Y^j_{(r)}&\text{if }k\neq 0\\[2pt]0&\text{if }k=0,\end{cases}\qquad(9)$$
set $k^\ast=k$. ELSE, set $k^\ast=k_{\min}$.

(3) [Value]: The estimate $\hat{\sigma}_j$ of the noise standard deviation on the j-th sensor is then
$$\hat{\sigma}_j=\mu_j(k^\ast)/\kappa.\qquad(10)$$
The final estimate $\hat{\sigma}$ of the noise standard deviation is then obtained by averaging the values $\hat{\sigma}_j$, so that $\hat{\sigma}=(1/M)\sum_{j=1}^{M}\hat{\sigma}_j$.
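The per-sensor computation above can be sketched compactly. Variable names are mine, `xi_value` is assumed to be precomputed as $\xi(\sqrt{2\log(2ML)})$, and the test below only checks basic sanity (positivity and scale-equivariance), not statistical accuracy.

```python
import numpy as np
from math import gamma

def date_sigma(Y, M, xi_value):
    """DATE sketch for one sensor.

    Y        : sorted (ascending) array of the L amplitudes |S_xj(t, f)|.
    xi_value : xi(sqrt(2 * log(2 * M * L))), assumed precomputed.
    Returns the noise standard deviation estimate mu_j(k*) / kappa.
    """
    L = len(Y)
    kappa = 2.0 * gamma(1.5)                 # kappa = 2 * Gamma(3/2), as in the text
    Q = 0.95
    h = 1.0 / np.sqrt(4.0 * L * (1.0 - Q))
    k_min = max(1, int(L / 2.0 - h * L))
    mu = np.cumsum(Y) / np.arange(1, L + 1)  # mu_j(k) for k = 1..L
    for k in range(k_min, L):                # search for the smallest k of (8)
        thr = mu[k - 1] / kappa * xi_value
        if Y[k - 1] <= thr < Y[k]:
            return mu[k - 1] / kappa
    return mu[k_min - 1] / kappa             # fallback: k* = k_min
```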
Signal source detection for mixing matrix estimation (auto-source selection)
In this section, we propose a test for selecting the time–frequency points where one signal source is probably present alone. To perform this selection, we make the distinction between signals with either low or high overlapping rate in the time–frequency domain. Chirp signals (resp. audio signals) are typical examples of signals with low (resp. high) overlapping rate. It is worth noticing that the estimation procedures proposed below for each class have reasonable computational costs.
The case of signals with low overlapping rate
Since the sources have a low overlapping rate, we suppose that the observations detected by the thresholding test of Section “Weak-sparseness-based time–frequency detection for source recovery (multi-source selection)” mostly pertain to one single signal source. In other words, we neglect the effect on the matrix estimation performance of the few points where sources may overlap, inasmuch as the impact of such time–frequency points is further reduced by the averaging effect inherent in any mixing matrix estimation method.
The case of signals with high overlapping rate
When signals overlap significantly in the time–frequency domain, the time–frequency detection of Section “Weaksparsenessbased time–frequency detection for source recovery (multisource selection)” is no longer appropriate. Indeed, that statistical procedure is aimed at detecting time–frequency points where signal sources are present, whatever the number of these sources, whereas it is now required to discriminate points where one single source is present from points where multiple sources occur. We assume that, when several sources are present at a time–frequency point (t,f), they are uncorrelated and combine incoherently. The resulting energy at (t,f) is thus supposed to be smaller than the energy attained at the time–frequency points where a single source is present.
Our purpose is thus to detect the time–frequency points where the signal energy is large enough in the presence of noise. Basically, this problem amounts to deciding whether $\|\mathit{A}\mathcal{S}_{\mathit{s}}(t,f)\|$ is above some value $\tau$ or not. The value $\tau^{2}$ thus represents the minimum energy level above which we consider that the signal energy is large enough to assume that one single source is actually present at $(t,f)$. For any $\lambda\in(0,\infty)$, it follows from ([23], Lemma 4, statement (iii)) that
where $F_{\chi_{d}^{2}(\delta)}(\cdot)$ stands for the cumulative distribution function of the noncentral chi-squared distribution with $d$ degrees of freedom and noncentrality parameter $\delta$. The number of degrees of freedom in (11) is $2M$ since each $\mathcal{S}_{\mathit{x}}(t,f)$ is an $M$-dimensional complex random vector and, thus, a $2M$-dimensional real random vector. Given some level $\gamma\in(0,1)$, it then suffices to choose
to guarantee a “false alarm probability” $\mathbb{P}\left[\,\|\mathcal{S}_{\mathit{x}}(t,f)\|>\lambda \;\middle|\; \|\mathit{A}\mathcal{S}_{\mathit{s}}(t,f)\|<\tau\,\right]$ less than or equal to $\gamma$.
Therefore, for a given time–frequency point $(t,f)$, the decision is that $\|\mathit{A}\mathcal{S}_{\mathit{s}}(t,f)\|<\tau$ if $\|\mathcal{S}_{\mathit{x}}(t,f)\|<\lambda(\tau,\gamma)$ and that $\|\mathit{A}\mathcal{S}_{\mathit{s}}(t,f)\|\geqslant\tau$ if $\|\mathcal{S}_{\mathit{x}}(t,f)\|\geqslant\lambda(\tau,\gamma)$. For mixing matrix estimation, we then keep the time–frequency points $(t,f)$ such that $\|\mathcal{S}_{\mathit{x}}(t,f)\|\geqslant\lambda(\tau,\gamma)$, which are regarded as time–frequency points pertaining to one single source. In practice, since the actual value of $\sigma$ is unknown, we replace this true value by its estimate $\hat{\sigma}$ provided by the DATE.
Although the two parameters $\gamma$ and $\tau$ must be fixed, there is no need to choose them for each signal-to-noise ratio. Parameter $\tau$, which is independent of the noise level, can be fixed via a small noiseless database. Similarly, level $\gamma$ can be determined via a few preliminary tests on a small representative database.
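A numerical way to obtain $\lambda(\tau,\gamma)$ consistent with (11)–(12) is to take the $(1-\gamma)$-quantile of $\|\mathcal{S}_{\mathit{x}}(t,f)\|$ in the boundary case $\|\mathit{A}\mathcal{S}_{\mathit{s}}(t,f)\|=\tau$, where the false-alarm probability is largest. The Monte-Carlo sketch below assumes the $2M$ real noise components are independent Gaussian with standard deviation $\sigma$; the function name and sample size are ours.

```python
import numpy as np

def snt_threshold(tau, gamma, sigma, M, n_mc=200_000, seed=0):
    """Monte-Carlo approximation of lambda(tau, gamma) in Eq. (12).

    Worst case over ||A S_s|| < tau: place a deterministic signal of
    norm tau (the noise is isotropic, so the signal direction is
    irrelevant) and take the (1 - gamma)-quantile of ||signal + noise||.
    """
    rng = np.random.default_rng(seed)
    d = 2 * M                     # real dimension of the complex M-vector
    s = np.zeros(d)
    s[0] = tau                    # boundary signal of norm tau
    noise = rng.normal(0.0, sigma, size=(n_mc, d))
    norms = np.linalg.norm(noise + s, axis=1)
    return np.quantile(norms, 1.0 - gamma)
```

With $\tau=4$, $\gamma=10^{-3}$, $\sigma=1$ and $M=3$ (the setting of the simulation section), this yields a threshold above $\tau$, decreasing when $\gamma$ grows, as expected.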
Simulation results
In most of the following simulations, the mixing matrix is chosen according to ([14], Eq. (38)) so as to model $N$ sources arriving at the sensor array at different angles $\theta_{1},\theta_{2},\ldots,\theta_{N}$. The entries of matrix $\mathit{A}$ are therefore $a_{j,k}=e^{\mathrm{i}\pi(j-1)\sin(\theta_{k})}$ for $j\in\{1,\ldots,M\}$ and $k\in\{1,\ldots,N\}$. In the sequel, we proceed by choosing four sources ($N=4$), three sensors ($M=3$), $\theta_{1}=15°$, $\theta_{2}=30°$, $\theta_{3}=45°$ and $\theta_{4}=75°$.
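For reference, the construction of this mixing matrix can be written in a few lines (numpy sketch; the function name is ours):

```python
import numpy as np

def steering_mixing_matrix(angles_deg, M):
    """Entries a_{j,k} = exp(i * pi * (j - 1) * sin(theta_k)),
    i.e., a uniform linear array with half-wavelength spacing."""
    theta = np.deg2rad(np.asarray(angles_deg, dtype=float))
    j = np.arange(M).reshape(-1, 1)            # j - 1 = 0, ..., M - 1
    return np.exp(1j * np.pi * j * np.sin(theta).reshape(1, -1))

# Setting of the simulations: N = 4 sources, M = 3 sensors
A = steering_mixing_matrix([15.0, 30.0, 45.0, 75.0], M=3)
```

Every entry has unit modulus and the first row is all ones, so the matrix indeed involves no null entry, as noted below for the mixture spectrograms.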
Unless specified otherwise, the source signals are speech signals randomly chosen in the TIdigits database [36]. This large speech database, collected in a quiet environment, is commonly used in speech processing. In this article, the chosen speech signals were downsampled to 8 kHz. All signals involve 8,192 samples. Figure 4a–d shows the time-domain representations of the original source signals and Figure 4e–h represents their corresponding spectrograms. Figure 5 displays a spectrogram of a mixture of these speech signals when the mixing matrix A is applied to them at SNR = 10 dB. The spectrograms of the other mixtures are not presented because the differences between any two of them are not visually noticeable, since the mixing matrix A involves no null entry.
The two parameters required for the mixing matrix estimation are then fixed to $\tau = 4$ and $\gamma = 10^{-3}$. The source separation performance is measured by the normalized mean square error (NMSE):
Throughout this section, NMSEs are calculated over 100 Monte Carlo runs.
SUBSS method
The modified SUBSS algorithm is obtained by using the DATE and the SNT for source recovery and mixing matrix estimation, respectively, as explained in Section “Statistical tests for sparsenessbased UBSS”. It is used to separate the four source signals from the noisy mixed signals observed by the three sensors.
The waveforms of the source signals recovered by the modified SUBSS algorithm are represented in Figure 6. Figure 6a–d shows the time-domain representations of the recovered source signals in the noiseless case (input SNR = 45 dB), and Figure 6e–h shows those obtained with input SNR = 10 dB.
In Figure 7, the performance of the modified SUBSS algorithm, with and without denoising, is compared to that obtained by the original SUBSS algorithm of [15]. The denoising mentioned above is described in the Appendix as a standard linear estimation.
The modified SUBSS algorithm outperforms the original SUBSS algorithm [15], which relies on thresholds that are manually chosen for each input SNR. Moreover, the modified SUBSS without denoising yields performance measurements that do not significantly depart from those attained by the original subspace-based UBSS algorithm. In addition, Figure 7 displays the NMSEs obtained by using the MAD estimator instead of the DATE in the modified SUBSS algorithm without denoising. The use of the MAD instead of the DATE induces a significant performance loss, which illustrates the relevance of the DATE and the weak-sparseness model. In Figures 8 and 9, we present the NMSEs obtained by the modified SUBSS and the original SUBSS when the number of sources increases, for SNR = 10 dB and SNR = 20 dB. In both figures, the NMSEs degrade because an increase of the source interference invalidates Assumption 1.
We now consider the case of complex chirp signals. These were generated by slightly modifying the Matlab routine MakeSignal.m of the WaveLab toolbox, so as to obtain complex chirp signals. The four chirp signals we use as sources are $s_{1}(t)=\sqrt{t(1-t)}\,e^{\mathrm{i}\frac{\pi T}{2}t^{2}}$, $s_{2}(t)=\sqrt{t(1-t)}\,e^{\mathrm{i}\frac{\pi T}{4}t^{2}}$, $s_{3}(t)=e^{\mathrm{i}\pi T t^{2}}$ and $s_{4}(t)=e^{\mathrm{i}\frac{2}{3}\pi T t}$, where $t\in[0,1]$ and $T=8192$ is the number of samples for each signal. Two of these chirp signals are LFM ones and one is a pure sine. Figure 10 then displays the spectrograms of the four chirp signals under consideration, whereas Figure 11 presents the spectrogram of a mixture of these sources when matrix $\mathit{A}$ is applied and SNR = 10 dB. The spectrograms of the other mixtures are not displayed for the same reasons as those given previously for the speech signal mixtures.
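These four sources are easy to regenerate directly; the numpy sketch below is written from the formulas above, independently of the modified MakeSignal.m routine:

```python
import numpy as np

T = 8192                          # number of samples per signal
t = np.arange(T) / T              # normalized time, t in [0, 1)

env = np.sqrt(t * (1 - t))        # amplitude envelope of s1 and s2
s1 = env * np.exp(1j * (np.pi * T / 2) * t**2)
s2 = env * np.exp(1j * (np.pi * T / 4) * t**2)
s3 = np.exp(1j * np.pi * T * t**2)              # constant-amplitude LFM chirp
s4 = np.exp(1j * (2.0 / 3.0) * np.pi * T * t)   # pure complex sine
```

The envelope $\sqrt{t(1-t)}$ vanishes at the edges of the interval, while $s_3$ and $s_4$ keep unit modulus throughout.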
The experimental procedure for assessing the modified SUBSS in comparison to the original SUBSS method is then the same as above. As specified in Section “The case of signals with low overlapping rate”, the thresholds used for the mixing matrix estimation are the detection ones. Therefore, no additional parameter is needed. The results obtained in Figure12 show the relevance of this choice for the thresholds, explained by the fact that chirp signals present very few overlapping time–frequency components.
Other methods
As described in Sections “Weaksparsenessbased time–frequency detection for source recovery (multisource selection)” and “Signal source detection for mixing matrix estimation (autosource selection)”, the DATE and the SNT can be used to perform multisource and autosource selections, respectively. In other words, the statistical tests of the aforementioned sections make it possible to obtain the time–frequency points where noisy mixtures are present and the set of time–frequency points where one single source exists. In this section, we comment on the results obtained by so proceeding with the UBSS methods addressed in Section “Main steps of standard UBSS methods” other than SUBSS.
In the underdetermined case, TIFROM achieves partial source separation only. Therefore, to better assess the contribution of our statistical tests to TIFROM, we consider the determined case where four source signals from four speakers are mixed. The mixing matrix is now 4×4 with independent Gaussian entries. In Figure 13, we present the NMSEs obtained by TIFROM, SNT-TIFROM and the modified SNT-TIFROM. Specifically, SNT-TIFROM uses the SNT to select the time–frequency points where a source exists alone. SNT-TIFROM, like TIFROM, performs no multisource selection for source recovery. In contrast, the modified SNT-TIFROM performs multisource selection and forces the unselected time–frequency points to zero. These results show that the SNT makes it possible to actually select the autosource time–frequency points, with no performance loss and without resorting to the empirical threshold required by the original TIFROM. The performance yielded by the modified SNT-TIFROM further emphasizes that the detection threshold adjusted with the DATE selects appropriate multisource time–frequency points for source recovery. The gain at low SNRs is explained by the fact that this selection can be regarded as a nonlinear denoising. The gain brought by this denoising effect decays when the SNR increases.
Another contribution of our statistical approach to sparseness-based methods is the estimation of the noise standard deviation. Indeed, some methods need an estimate or the true value of the noise standard deviation. For instance, Bofill and Zibulevsky [18] use $\ell_{1}$-norm minimization to recover the sources. In the noisy case, they propose to solve the optimization problem:
Because the noisy sources are only weakly sparse, we hereafter prefer following [37], which is dedicated to the stable recovery of signals that are not exactly sparse. We therefore solve the optimization problem
This approach can then be improved in two ways: first, by solving this optimization problem on only the time–frequency points selected by the multisource procedure propounded in Section “Weaksparsenessbased time–frequency detection for source recovery (multisource selection)”; second, by replacing the unknown true value of the noise standard deviation by its estimate provided by the DATE. In this respect, Figure 14 displays the performance measurements obtained by the original method based on the $\ell_{1}$ criterion of Equation (4) (L1 Minimization) in comparison to the modified $\ell_{1}$ criterion of Equation (14) applied to the outcome of the multisource selection when the noise standard deviation is estimated by the DATE (Modified L1 Minimization). As expected, the gain brought by the multisource selection and Equation (14), both adjusted by the noise standard deviation estimate provided by the DATE, is significant. It is also worth noticing that the DATE estimation error does not significantly impact the separation performance in comparison to the case where the noise standard deviation is perfectly known. This can also be seen in Figure 14, where the performance measurements are given when the multisource selection and the $\ell_{1}$ criterion of Equation (14) are both adjusted with the actual value of the noise standard deviation (Oracle Modified L1 Minimization). In contrast, there is a significant performance loss when the multisource selection and Equation (4) are computed by using the MAD instead of the DATE (MAD Modified L1 Minimization). The reason still relates to the fact that the DATE is more robust to weak sparseness than the MAD.
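As an illustration of how such an $\ell_{1}$ criterion can be solved in practice, the sketch below uses iterative soft thresholding (ISTA) on the Lagrangian form $\min_{s} \frac{1}{2}\|\mathit{A}s-x\|_{2}^{2}+\lambda\|s\|_{1}$ of the noise-aware constrained problem, with $\lambda$ chosen from the noise level (it would here be tied to the estimate $\hat{\sigma}$ returned by the DATE). This is a toy solver written for this summary, not the one used in the article.

```python
import numpy as np

def ista_l1(A, x, lam, n_iter=2000):
    """Iterative soft thresholding for
    min_s 0.5 * ||A s - x||_2^2 + lam * ||s||_1,
    the Lagrangian counterpart of the constrained l1 problem."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2          # 1 / Lipschitz constant
    s = np.zeros(A.shape[1], dtype=complex)
    for _ in range(n_iter):
        g = s - step * (A.conj().T @ (A @ s - x))   # gradient step on the data term
        mag = np.abs(g)
        shrink = np.maximum(mag - lam * step, 0.0)  # complex soft thresholding
        s = np.where(mag > 0, g * shrink / np.maximum(mag, 1e-12), 0.0)
    return s
```

On a toy underdetermined system, the sparsest explanation of the mixture is recovered up to the usual small $\ell_{1}$ bias.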
The multisource selection based on the detection threshold adjusted by the estimate provided by the DATE can be further exploited by the DUET reconstruction, as illustrated in Figure 15. In this simulation, the input signals are the chirp signals considered above, so that the W-disjoint orthogonality assumption is satisfied. Moreover, the mixing matrix A is now assumed to be known. On the one hand, we perform the DUET source recovery by considering the whole time–frequency plane. On the other hand, we consider the modified DUET, that is, the DUET source recovery applied to the selected multisource time–frequency points only. The results are similar to those obtained above with TIFROM and its modified versions. Here, the gain brought by the multisource selection, which acts as a denoising, is larger over a wider SNR range because the time–frequency representation of chirp signals is sparser than that of audio signals.
Discussion
Assessment
The algorithms we propose are very general. They are not dedicated to a given sparsenessbased BSS method. They are simple to apply without any adjustment. From the results of Section “Simulation results”, our procedures can therefore be used to improve, simplify or bring robustness to the standard sparsenessbased BSS methods considered in the article.
More specifically, the weak-sparseness-based time–frequency detection procedure of Section “Weaksparsenessbased time–frequency detection for source recovery (multisource selection)” can be used as an automated preprocessing for multisource selection. For example, the time–frequency detection in [15] requires one threshold value for each instrumented SNR. The detection procedure of Section “Weaksparsenessbased time–frequency detection for source recovery (multisource selection)” then makes it possible to avoid this empirical parameter choice, which brings robustness and significant simplification. Used as a preprocessing for TIFROM [16], which basically involves no selection of time–frequency points, the multisource selection we propound can improve the separation performance.
For mixing matrix estimation, our approach described in Section “Signal source detection for mixing matrix estimation (autosource selection)” relies on no weak-sparseness assumption and involves two parameters only, namely the tolerance and the false-alarm probability. These parameters are valid over the whole signal-to-noise ratio (SNR) range, in contrast to [15] for instance. Furthermore, the assumptions made by TIFROM can be relaxed by using the autosource selection of Section “Signal source detection for mixing matrix estimation (autosource selection)”. It is also worth noticing that the two parameters we need for mixing matrix estimation have a physical meaning, which is not the case for some standard sparseness-based BSS methods.
Convolutive mixture case
There exists a great variety of possible strategies for dealing with the convolutive mixture case, which is more realistic than the instantaneous one. Because of this variety, it is hardly feasible to exhibit, for convolutive mixtures, a well-established family of methods such as that considered above for instantaneous mixtures. Nevertheless, the statistical framework proposed in this article can be expected to apply to the convolutive mixture case, at least for methods based on time–frequency representations, for which separating the time–frequency points of noise alone from those of noisy signals can be helpful. For instance, this detection procedure for multisource selection can be used straightforwardly to detect the time–frequency points required by the convolutive SUBSS presented in [38]. The modified convolutive SUBSS thus obtained discards the empirical threshold required in [38] for multisource selection. This entails no significant performance loss, as illustrated by Figure 16. Studying the added value brought by the SNT in the convolutive mixture case requires further analysis that could be achieved in forthcoming work.
Conclusion and perspectives
The algorithms presented in this article contribute to BSS in the underdetermined mixture case by avoiding the empirical parameter choices present in the so-called family of weak-sparseness-based methods. Our first algorithm, aimed at selecting the suitable time–frequency points for source recovery, is fully automatic. The second, dedicated to mixing matrix estimation, requires fixing two parameters only, regardless of the instrumented SNRs.
The question is now to what extent the statistical tests used above in the instantaneous mixture case can possibly be exploited in the convolutive mixture case, especially in complement to the results discussed in Section “Convolutive mixture case”. It can also be wondered whether these tests can be extended so as to deal with colored noise. Work on this topic is under progress.
The theoretical and experimental results of this article pinpoint that the subfunctions of the source separation methods considered above, completed with the statistical tests we have proposed, can be regarded as elementary components that can be interchanged and associated to provide new algorithms for source separation in different applicative contexts. This opens new practical prospects. For instance, it would be desirable to construct a toolbox involving all these elementary components for further developments and studies. Such a toolbox would also make it possible to carry out exhaustive experimental assessments on large databases of signals via the BSSEval toolbox, downloadable from[39].
Appendix
Denoisingbased source recovery
The SUBSS method presented in [15] estimates the index set of the sources present at a given time–frequency point $(t,f)$. Let us denote by $J$ this set of indices. Then, Equation (2) reduces to:
and the STFT coefficients of these active sources can be recovered using:
where $\mathit{A}_{J}^{\#}=\left(\mathit{A}_{J}^{H}\mathit{A}_{J}\right)^{-1}\mathit{A}_{J}^{H}$ is the Moore–Penrose pseudo-inverse of $\mathit{A}_{J}$.
We propose to use the noise standard deviation estimate provided by the DATE to jointly denoise and separate the sources on the basis of the time–frequency points selected by the statistical test of Section “Weaksparsenessbased time–frequency detection for source recovery (multisource selection)”. So, instead of performing the source separation as specified by Equation (16), the source separation is now carried out by computing
where $\hat{\sigma}$ is the noise standard deviation estimate returned by the DATE and $\mathit{R}_{\mathit{s}_{J}}=\mathbb{E}\left[\mathcal{S}_{\mathit{s}_{J}}(t,f)\,\mathcal{S}_{\mathit{s}_{J}}^{H}(t,f)\right]$. The derivation of the optimal linear estimator of (17) is standard. It involves minimizing the risk $\mathbb{E}\left[\|\mathcal{S}_{\mathit{s}_{J}}(t,f)-\mathit{D}\,\mathcal{S}_{\mathit{x}}(t,f)\|^{2}\right]$ when $\mathit{D}$ ranges over the space of the $\mathrm{card}(J)\times M$ matrices and under the assumption that the sources are spatially decorrelated. In practice, matrix $\mathit{R}_{\mathit{s}_{J}}$ is unknown and must be estimated. We then proceed as follows. On the one hand, we have $\mathit{R}_{\mathit{x}}=\mathit{A}\mathit{R}_{\mathit{s}}\mathit{A}^{H}+\sigma^{2}\mathit{I}_{M}$. On the other hand, $\mathit{R}_{\mathit{x}}$ can be estimated by $\hat{\mathit{R}}_{\mathit{x}}=\frac{1}{\#t}\sum_{t}\mathcal{S}_{\mathit{x}}(t,f)\mathcal{S}_{\mathit{x}}(t,f)^{H}$, where $\#t$ stands for the number of time windows on which the STFT is calculated. Since estimates of $\mathit{A}$ and $\sigma$ are known, we derive from the expressions of $\mathit{R}_{\mathit{x}}$ and $\hat{\mathit{R}}_{\mathit{x}}$ an estimate $\hat{\mathit{R}}_{\mathit{s}}$ of $\mathit{R}_{\mathit{s}}$. An estimate of $\mathit{R}_{\mathit{s}_{J}}$ follows by picking the appropriate columns in $\hat{\mathit{R}}_{\mathit{s}}$.
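The resulting estimator can be sketched as follows. This is a minimal sketch of the optimal linear (Wiener) estimator consistent with the minimization described above, assuming the quantities $\mathit{A}_{J}$, $\mathit{R}_{\mathit{s}_{J}}$ and $\hat{\sigma}$ are available:

```python
import numpy as np

def wiener_source_estimate(Sx, A_J, R_sJ, sigma):
    """Linear MMSE estimate of S_{s_J}(t, f) from S_x(t, f):
    D = R_sJ A_J^H (A_J R_sJ A_J^H + sigma^2 I_M)^{-1}, applied to S_x.
    D minimizes E||S_sJ - D S_x||^2 over the card(J) x M matrices."""
    M = A_J.shape[0]
    C = A_J @ R_sJ @ A_J.conj().T + sigma**2 * np.eye(M)  # covariance of S_x
    D = R_sJ @ A_J.conj().T @ np.linalg.inv(C)
    return D @ Sx
```

As a sanity check, when the noise vanishes and $\mathit{A}_{J}$ is the identity, the estimator returns the observation itself, and a nonzero $\sigma$ shrinks the estimate toward zero.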
References
 1.
Varadarajan V, Krolik J: Multichannel system identification methods for sensor array calibration in uncertain multipath environments. IEEE Signal Processing Workshop on Statistical Signal Processing (SSP) (Singapore, Oct 2001), pp. 297–300
 2.
Rouxel A, Guennec DL, Macchi O: Unsupervised adaptive separation of impulse signals applied to EEG analysis. IEEE International Conference on Acoustics, Speech, Signal Processing (ICASSP), vol. 1 (Istanbul, Turkey, June 2000), pp. 420–423
 3.
Abed-Meraim K, Attallah S, Lim T, Damen M: A blind interference canceller in DS-CDMA. IEEE International Symposium on Spread Spectrum Techniques and Applications (Parsippany, Sept 2000), pp. 358–362
 4.
Durán-Díaz I, Cruces-Alvarez SA: A joint optimization criterion for blind DS-CDMA detection. EURASIP J. Adv. Signal Process 2007, 2007(79248):1-11.
 5.
Aïssa-El-Bey A, Abed-Meraim K, Grenier Y: Underdetermined blind audio source separation using modal decomposition. EURASIP J. Audio Speech Music Process 2007, 2007(85438):1-15.
 6.
Comon P, Jutten C (eds.): Handbook of Blind Source Separation: Independent Component Analysis and Blind Deconvolution. (Academic Press, Oxford, 2010)
 7.
Cardoso JF: Blind signal separation: statistical principles. Proc. IEEE 1998, 86(10):2009-2025. 10.1109/5.720250
 8.
Belouchrani A, Abed-Meraim K, Cardoso JF, Moulines E: A blind source separation technique using second-order statistics. IEEE Trans. Signal Process 1997, 45(2):434-444. 10.1109/78.554307
 9.
Belouchrani A, Amin MG: Blind source separation based on time-frequency signal representations. IEEE Trans. Signal Process 1998, 46(11):2888-2897. 10.1109/78.726803
 10.
Abed-Meraim K, Xiang Y, Manton JH, Hua Y: Blind source separation using second order cyclostationary statistics. IEEE Trans. Signal Process 2001, 49(4):694-701. 10.1109/78.912913
 11.
Dempster AP, Laird NM, Rubin DB: Maximum likelihood from incomplete data via the EM algorithm. J. R. Stat. Soc. Ser. B 1977, 39(1):1-38.
 12.
Yilmaz O, Rickard S: Blind separation of speech mixtures via time-frequency masking. IEEE Trans. Signal Process 2004, 52(7):1830-1847. 10.1109/TSP.2004.828896
 13.
Melia T, Rickard S: Underdetermined blind source separation in echoic environments using DESPRIT. EURASIP J. Adv. Signal Process 2007, 2007(86484):1-19.
 14.
Linh-Trung N, Belouchrani A, Abed-Meraim K, Boashash B: Separating more sources than sensors using time-frequency distributions. EURASIP J. Appl. Signal Process 2005, 2005(17):2828-2847. 10.1155/ASP.2005.2828
 15.
Aïssa-El-Bey A, Linh-Trung N, Abed-Meraim K, Belouchrani A, Grenier Y: Underdetermined blind separation of nondisjoint sources in the time-frequency domain. IEEE Trans. Signal Process 2007, 55(3):897-907.
 16.
Abrard F, Deville Y: A time-frequency blind signal separation method applicable to underdetermined mixtures of dependent sources. Signal Process 2005, 85(7):1389-1403. 10.1016/j.sigpro.2005.02.010
 17.
Arberet S, Gribonval R, Bimbot F: A robust method to count and locate audio sources in a multichannel underdetermined mixture. IEEE Trans. Signal Process 2010, 58(1):121-133.
 18.
Bofill P, Zibulevsky M: Underdetermined blind source separation using sparse representations. Signal Process 2001, 81(11):2353-2362. 10.1016/S0165-1684(01)00120-7
 19.
Araki S, Sawada H, Mukai R, Makino S: Underdetermined blind sparse source separation for arbitrarily arranged multiple sensors. Signal Process 2007, 87(8):1833-1847. 10.1016/j.sigpro.2007.02.003
 20.
Araki S, Nakatani T, Sawada H, Makino S: Stereo source separation and source counting with MAP estimation with Dirichlet prior considering spatial aliasing problem. Independent Component Analysis and Signal Separation (ICA), ser. LNCS, vol. 5441 (Springer, Paraty, 2009), pp. 742–750
 21.
O'Grady P, Pearlmutter B, Rickard S: Survey of sparse and non-sparse methods in source separation. Int. J. Imag. Syst. Technol 2005, 15(1):18-33. 10.1002/ima.20035
 22.
Pastor D, Socheleau FX: Robust estimation of noise standard deviation in presence of signals with unknown distributions and occurrences. IEEE Trans. Signal Process 2012, 60(4):1545-1555.
 23.
Pastor D: Signal norm testing in additive and independent standard Gaussian noise. Institut Mines-Télécom; Télécom Bretagne, UEB, Lab-STICC UMR CNRS 3192, Tech. Rep., 2011, available at http://www.telecombretagne.eu/publications/publication.php?idpublication=10706
 24.
Aziz-Sbaï SM, Aïssa-El-Bey A, Pastor D: Robust underdetermined blind audio source separation of sparse signals in the time-frequency domain. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (Prague, Czech Republic, May 2011), pp. 3716–3719
 25.
Pastor D, Gay R, Gronenboom A: A sharp upper bound for the probability of error of the likelihood ratio test for detecting signals in white Gaussian noise. IEEE Trans. Inf. Theory 2002, 48(1):228-238. 10.1109/18.971751
 26.
Jourjine A, Rickard S, Yilmaz O: Blind separation of disjoint orthogonal signals: demixing N sources from 2 mixtures. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), vol. 5 (Istanbul, Turkey, June 2000), pp. 2985–2988
 27.
Deville Y, Puigt M: Temporal and time-frequency correlation-based blind source separation methods. Part I: determined and underdetermined linear instantaneous mixtures. Signal Process 2007, 87(3):374-407. 10.1016/j.sigpro.2006.05.012
 28.
Theis F, Lang E: Formalization of the two-step approach to overcomplete BSS. Signal and Image Processing (SIP) (Kauai, USA, August 2002), pp. 207–212
 29.
Atto AM, Pastor D, Mercier G: Detection thresholds for nonparametric estimation. Signal Image Video Process 2008, 2(3):207-223. 10.1007/s11760-008-0051-x
 30.
Berman SM: Sojourns and Extremes of Stochastic Processes. (Wadsworth, Reading, MA, 1992)
 31.
Mallat S: A Wavelet Tour of Signal Processing. (Academic Press, Cambridge, 1999)
 32.
Serfling RJ: Approximation Theorems of Mathematical Statistics. (Wiley, New York, 1980)
 33.
Donoho DL, Johnstone IM: Ideal spatial adaptation by wavelet shrinkage. Biometrika 1994, 81(3):425-455. 10.1093/biomet/81.3.425
 34.
Hampel F: The influence curve and its role in robust estimation. J. Am. Stat. Assoc 1974, 69(346):383-393. 10.1080/01621459.1974.10482962
 35.
Huber P, Ronchetti E: Robust Statistics. (Wiley, New York, 2009)
 36.
Leonard R: A database for speaker-independent digit recognition. IEEE International Conference on Acoustics, Speech, Signal Processing (ICASSP), vol. 9 (San Diego, California, USA, March 1984), pp. 328–331
 37.
Candès EJ, Romberg JK, Tao T: Stable signal recovery from incomplete and inaccurate measurements. Commun. Pure Appl. Math 2006, 59(8):1207-1223. 10.1002/cpa.20124
 38.
Aïssa-El-Bey A, Abed-Meraim K, Grenier Y: Blind separation of underdetermined convolutive mixtures using their time-frequency representation. IEEE Trans. Audio Speech Lang. Process 2007, 15(5):1540-1550.
 39.
Févotte C, Gribonval R, Vincent E: A toolbox for performance measurement in (blind) source separation. Available at http://bassdb.gforge.inria.fr/bss_eval/
Competing interests
The authors declare that they have no competing interests.
Aziz-Sbaï, S.M., Aïssa-El-Bey, A. & Pastor, D. Contribution of statistical tests to sparseness-based blind source separation. EURASIP J. Adv. Signal Process. 2012, 169 (2012). https://doi.org/10.1186/1687-6180-2012-169
Keywords
 Underdetermined blind source separation
 Sparse signals
 Time–frequency domain
 Noise variance estimation
 Weak sparseness
 Random distortion testing