Contribution of statistical tests to sparseness-based blind source separation

Aziz-Sbaï, Si Mohamed; Aïssa-El-Bey, Abdeldjalil; Pastor, Dominique

doi:10.1186/1687-6180-2012-169

Research
Open access
Published: 16 August 2012

EURASIP Journal on Advances in Signal Processing volume 2012, Article number: 169 (2012)

Abstract

We address the problem of blind source separation in the underdetermined mixture case. Two statistical tests are proposed to reduce the number of empirical parameters involved in standard sparseness-based underdetermined blind source separation (UBSS) methods. The first test performs multisource selection of the suitable time–frequency points for source recovery and is fully automatic. The second one is dedicated to autosource selection for mixing matrix estimation and requires fixing two parameters only, regardless of the instrumented SNRs. We experimentally show that the use of these tests incurs no performance loss and even improves the performance of standard weak-sparseness UBSS approaches.

Introduction

Source separation is aimed at reconstructing multiple sources from multiple observations (mixtures) captured by an array of sensors. In what follows, we assume these sensors to be linear, which is acceptable in many applications. The problem is said to be blind when the observations are linearly mixed by the transfer medium and no prior knowledge on the transfer medium or the sources is available. Blind source separation (BSS) is an important research topic in a variety of fields, including radar processing[1], medical imaging[2], communication[3, 4], speech and audio processing[5]. BSS problems can be classified according to the nature of the mixing process (instantaneous, convolutive) and the ratio between the number of sources and the number of sensors of the problem (underdetermined, overdetermined).

If the sources are assumed to be statistically independent, solutions to the BSS problem are calculated so as to optimize separation criteria based on higher order statistics[6, 7]. Otherwise, when the sources have temporal coherency[8], are nonstationary[9], or possibly cyclostationary[10], the separation criteria to optimize are based on second-order statistics.

Although BSS algorithms exist in great profusion, the underdetermined case (UBSS for underdetermined blind source separation), where the number of sensors is smaller than the number of sources, is less addressed than the overdetermined case, where the number of sensors is greater than or equal to the number of sources. Therefore, the UBSS problem is still challenging.

In the UBSS case, one way to deal with the lack of information is to use an expectation-maximization-based method[11] to obtain a maximum likelihood estimation of the mixing matrix and sources. However, such an approach requires prior knowledge of the source distributions. In contrast, sparseness-based methods solve the UBSS problem[12–20] without prior knowledge on the source distribution, by exploiting the sparseness of the non-stationary sources in the time–frequency domain. Roughly speaking, sparseness-based approaches[21] involve transforming the mixtures into an appropriate representation domain. The transformed sources are then estimated thanks to their sparseness and, finally, the sources are reconstructed by inverse transform. A source is said to be sparse in a given signal representation domain if most of its coefficients, in this domain, are (almost) zero and only a few of them are big.

In the instantaneous mixture case, where each observation consists of a sum of sources with different signal intensity in presence of noise, the sparseness-based methods introduced in[12–17], among others, rely on parameters that are chosen empirically. The general question addressed in this article is then to what extent this empirical parameter choice can be by-passed thanks to statistical methods, specifically designed to cope with sparse representations. This question is particularly relevant because a whole family of sparseness-based UBSS algorithms relies on assumptions very similar to those employed in theoretical frameworks dedicated to the detection and estimation of sparse signals. Our contribution to this question is then the following.

The UBSS algorithms proposed in[12–17] estimate the unknown mixing matrix by assuming the presence of only one single source at each time–frequency point. In practice, a selection of time–frequency points that probably pertain to one single source is expected to improve performance of the mixing matrix estimation. The mixing matrix estimate is then used to recover the source signals. Rejecting time–frequency points of noise alone and, thus, selecting and processing the time–frequency points where the possibly multiple sources are present only, should also improve the overall performance of the methods. Our contribution is then to perform the selection processes mentioned in the foregoing, by considering them as statistical decision problems and reducing the number of empirical parameters for better robustness. Sparseness hypotheses are then particularly suitable for detecting the time–frequency points needed by the separation procedure, whereas such hypotheses are useless for selecting the time–frequency points used by the mixing matrix estimation.

More specifically, Section “Main steps of standard UBSS methods” recalls the source recovery and mixing matrix estimation steps in classical UBSS methods based on sparseness assumptions. By so proceeding, we highlight the empirical parameters required by these steps. Then, Section “Statistical tests for sparseness-based UBSS” is the main core of the article because it introduces the statistical tests for the selection of the time–frequency points needed by source recovery and mixing matrix estimation. For source recovery, the selection of the time–frequency points relies on a weak notion of sparseness, exploited through an estimate-and-plug-in detector: we begin by estimating the noise standard deviation via the d-dimensional amplitude trimmed estimator (DATE), recently introduced in[22] and especially designed for coping with noisy representations of weakly-sparse signals; then, the noise standard deviation estimate is used instead of the unknown true value in the expression of a statistical test, specifically designed for noisy representations of weakly-sparse signals as well. For the mixing matrix estimation, the physics of the signals suggests introducing a novel strategy. Indeed, the problem is to select time–frequency points whose energy is large enough in noise to consider that they pertain to one single source. We thus introduce a tolerance that the energy of these relevant points must exceed, regardless of noise. A statistical test involving this tolerance and based on signal norm testing (SNT), recently introduced in[23], is then used to select these points in presence of noise.

Summarizing, we thus extend significantly[24], by introducing three new features of importance. First, we replace the modified complex essential supremum estimate (MC-ESE) of the noise standard deviation by the DATE, which is as accurate, relies on an even stronger theoretical background and has a computational cost significantly lower. Second, the selection of the time–frequency points of interest for source recovery is performed by using a thresholding test, as in[24], but the value of the detection threshold is determined automatically on the basis of the results provided in[25] for the detection of signals satisfying the weak-sparseness model in noise. Third, the mixing matrix estimation is carried out by taking the physical nature of the signals into account.

In Section “Simulation results”, we apply the statistical tests of Section “Statistical tests for sparseness-based UBSS” to several standard UBSS methods[15, 16, 18, 26, 27] in the instantaneous mixture case. We thus show that our statistical algorithms reduce the number of empirical parameters and improve the overall performance of the UBSS methods under consideration. For instance, by using these statistical algorithms, the subspace-based method presented in[15] can be significantly automatized so as to involve two parameters only. These two parameters are adjusted once for all possible SNRs, in contrast to standard UBSS methods.

In Section “Discussion”, these results are discussed. In particular, the convolutive mixture case is addressed for its importance in practice. Some perspectives of this work are then presented in the concluding Section “Conclusion and perspectives”.

Main steps of standard UBSS methods

Principles

We consider the instantaneous mixing system:

x (t) = A s (t) + n (t),

(1)

where t ranges in some finite set of sampling times such that, for every t in this set of sampling times, $s (t) = {[s_{1} (t), s_{2} (t), \dots, s_{N} (t)]}^{T}$ is the vector of the N sources, $x (t) = {[x_{1} (t), x_{2} (t), \dots, x_{M} (t)]}^{T}$ is the M-dimensional mixture vector, $A = [a_{1}, a_{2}, \dots, a_{N}]$ is the complex M × N mixing matrix and $n (t) = {[n_{1} (t), n_{2} (t), \dots, n_{M} (t)]}^{T}$ is additive noise. It is assumed that ${(n_{k} (t))}_{1 ⩽ k ⩽ M}$ are random Gaussian processes, mutually decorrelated and independent of the sources. In the sequel, we address the underdetermined case where N > M. Without loss of generality, we assume that the column vectors of A have all unit norm, i.e., $∥a_{i}∥ = 1$ for all i ∈ {1,2,…,N}.
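The unit-norm convention costs nothing because any scaling of a column of A can be absorbed into the corresponding source. The following minimal Python sketch illustrates this on a toy 2 × 3 complex mixture; the matrix, the source values and the helper names are assumptions chosen purely for illustration and do not come from the article.

```python
import math

def column_norms(A):
    """Euclidean norm of each column of A, given as a list of M rows."""
    n_rows, n_cols = len(A), len(A[0])
    return [math.sqrt(sum(abs(A[m][k]) ** 2 for m in range(n_rows)))
            for k in range(n_cols)]

def normalize_mixture(A, s):
    """Return (A_unit, s_scaled) with unit-norm columns and A_unit s_scaled = A s."""
    norms = column_norms(A)
    A_unit = [[row[k] / norms[k] for k in range(len(norms))] for row in A]
    s_scaled = [s[k] * norms[k] for k in range(len(norms))]
    return A_unit, s_scaled

def mat_vec(A, s):
    """Noise-free mixture A s."""
    return [sum(a * v for a, v in zip(row, s)) for row in A]

# Toy underdetermined example: M = 2 sensors, N = 3 sources (illustrative values).
A = [[1 + 1j, 2.0, 0.5j],
     [0.5, -1j, 3.0]]
s = [1.0 - 2.0j, 0.3, -1.5j]
A_unit, s_scaled = normalize_mixture(A, s)
```

The rescaled pair produces the same noise-free observation, which is why the sources can only be recovered up to a scale factor.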

Time–frequency signal processing provides effective tools for analyzing nonstationary signals, whose frequency contents vary in time. It involves representing signals in a 2D space, that is, the joint time–frequency domain, hence providing a distribution of the signal energy versus time and frequency simultaneously. The sparseness of the time–frequency coefficients of the source signals is one of the main keys to solve the UBSS problem.

One well-known time–frequency representation, and the most used in practice, is the short-time discrete Fourier transform (STFT). The mixing process can be modeled in the time–frequency domain via the STFT as:

S_{x} (t, f) = A S_{s} (t, f) + S_{n} (t, f),

(2)

where $S_{x} (t, f)$ , $S_{s} (t, f)$ and $S_{n} (t, f)$ are the vectors of the STFT coefficients at time–frequency bin (t,f) of the mixtures, the sources and noise, respectively.

Given x(t), our purpose is to recover s(t) or equivalently $S_{s} (t, f)$ . As formalized in[28], the UBSS problem is generally decomposed in two separate subproblems. First, in the so-called mixing matrix estimation, the normalized columns ${(a_{i})}_{1 ⩽ i ⩽ N}$ are estimated so as to obtain an estimate of A. Then, on the basis of this estimate, the second step, called signal recovery, provides a solution to Equation (2). Figure 1 presents the flowchart of such a two-step approach.

We now detail the mixing matrix estimation and the source recovery based on sparseness assumptions.

Mixing matrix estimation

The UBSS methods based on sparse signal representations in the time–frequency domain share the following main assumption:

Assumption 1

For each source, there exists a set of time–frequency points where this source exists alone.

The elements of this set can be assumed to be isolated time–frequency points as in degenerate unmixing estimation technique (DUET)[15, 26] or to form a time–frequency box as in time–frequency ratio of mixtures (TIFROM)[16] and time–frequency CORRelation (TIFCORR)[27]. Assumption 1 is often reasonable thanks to the sparseness of the time–frequency representation of the sources, especially when the number of sources is moderate.

As mentioned above, the first step in UBSS methods is to estimate the mixing matrix A to achieve source recovery. In most two-step source separation algorithms[12, 13, 15–18] an autosource selection is performed. By autosource selection, it is meant the detection of regions where only one source occurs. The methods for estimating A on the basis of Assumption 1 can then be summarized as follows.

Jourjine et al.[26] present the DUET method, which is restricted to two mixtures (M = 2). They address the anechoic case, where source transmission attenuations and delays between sensors are taken into account. The columns of the mixing matrix are estimated by finding peaks in a 2D histogram of amplitude-delay estimates.

In[16], the mixing matrix estimation of the TIFROM method is based on the complex ratios $\frac{S_{x_{j}} (t, f)}{S_{x_{k}} (t, f)},$ where, given m ∈ {1,2,…,M}, $S_{x_{m}} (t, f)$ stands for the mth coordinate of $S_{x} (t, f)$ . These ratios are computed for each time–frequency point and for two arbitrarily chosen indices j and k in {1,2,…,M}. A first limitation of this method is to assume non-null matrix coefficients. A second limitation is the use of an empirical threshold to select the smallest empirical variances of these ratios.
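To illustrate why small ratio variances point to single-source zones, the sketch below computes the empirical variance of the ratios over one time–frequency box in the noiseless case. It is only a toy illustration under stated assumptions: the mixing coefficients, the STFT values and the function name `ratio_variance` are hypothetical and are not taken from[16].

```python
def ratio_variance(box_j, box_k):
    """Empirical variance of the complex ratios S_xj / S_xk over one time-frequency box."""
    ratios = [u / v for u, v in zip(box_j, box_k)]
    mean = sum(ratios) / len(ratios)
    return sum(abs(r - mean) ** 2 for r in ratios) / len(ratios)

# Illustrative mixing coefficients of two sources on sensors j and k (assumed values).
a1 = (0.6, 0.8)
a2 = (0.8, -0.6)
s1 = [1 + 2j, -0.5j, 2.0, 1 - 1j]   # STFT coefficients of source 1 in the box
s2 = [0.3, 1 + 1j, -2j, 0.7]        # STFT coefficients of source 2 in the box

# Box where source 1 is alone: every ratio equals a1[0] / a1[1].
single_j = [a1[0] * c for c in s1]
single_k = [a1[1] * c for c in s1]
# Box where both sources are active: the ratio changes from point to point.
mixed_j = [a1[0] * c1 + a2[0] * c2 for c1, c2 in zip(s1, s2)]
mixed_k = [a1[1] * c1 + a2[1] * c2 for c1, c2 in zip(s1, s2)]

var_single = ratio_variance(single_j, single_k)
var_mixed = ratio_variance(mixed_j, mixed_k)
```

In the single-source box the variance vanishes up to rounding, whereas it is clearly positive in the mixed box. With noise, the variance is no longer exactly zero, hence the empirical threshold mentioned above.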

In TIFCORR[27], the mixing matrix estimation proceeds similarly, by selecting the empirical covariance coefficients that exceed a manually chosen threshold.

The subspace-based UBSS (SUBSS) method[15] relies on another type of mixing matrix estimation. Let $Ω_{k}$ stand for the set of all the time–frequency points (t,f) where the kth source is present alone and Ω stand for the union of all these sets $Ω_{k}$ for k=1,2,…,N. According to Assumption 1, the sets $Ω_{k}$ are non-empty and so is Ω. For (t,f) ∈ $Ω_{k}$, (2) reduces to

S_{x} (t, f) = S_{s_{k}} (t, f) a_{k} + S_{n} (t, f) .

(3)

According to this result, the mixing matrix can be estimated as follows. First, all the spatial direction vectors $d (t, f) = \frac{S_{x} (t, f)}{∥S_{x} (t, f)∥}$ , with (t,f) ∈ Ω, are clustered by using an unsupervised clustering algorithm and taking into account that the number of sources is supposed to be known. Since (3) shows that for all the time–frequency points (t,f) of $Ω_{k}$, the STFT vectors $S_{x} (t, f)$ have the same spatial direction $a_{k}$, the column vectors of the mixing matrix A are then estimated as the centroids of the N classes returned by the clustering algorithm. In[15], Aïssa-El-Bey et al. propose the use of the k-means algorithm but other techniques could be employed. The set Ω required for the clustering procedure is determined by comparing the ratio $∥S_{x} (t, f)∥ / max_{ξ} ∥S_{x} (t, ξ)∥$ to a threshold height, whose value is chosen empirically.
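The selection of Ω and the computation of the direction vectors can be sketched as follows. This is a minimal illustration, assuming that the STFT is stored as a nested list `stft[t][f]` of M-dimensional complex vectors; the function names and the toy values are hypothetical, and the clustering step (k-means in[15]) is deliberately left out.

```python
import math

def vec_norm(v):
    """Euclidean norm of a complex vector."""
    return math.sqrt(sum(abs(c) ** 2 for c in v))

def select_directions(stft, ratio_threshold):
    """Return {(t, f): d(t, f)} for the points kept by the empirical ratio test.

    A point is kept when ||S_x(t, f)|| / max_xi ||S_x(t, xi)|| exceeds the threshold."""
    selected = {}
    for t, frame in enumerate(stft):
        norms = [vec_norm(v) for v in frame]
        peak = max(norms)
        if peak == 0.0:
            continue  # empty frame: nothing to select
        for f, (v, n) in enumerate(zip(frame, norms)):
            if n / peak > ratio_threshold:
                selected[(t, f)] = [c / n for c in v]  # unit-norm direction vector
    return selected

# Toy STFT with 2 frames, 3 frequency bins and M = 2 sensors (illustrative values).
stft = [
    [[3.0, 4.0j], [0.1, 0.1], [0.0, 0.0]],
    [[0.2j, 0.0], [1.0, 1.0], [0.6, 0.8]],
]
directions = select_directions(stft, ratio_threshold=0.5)
```

The value 0.5 is arbitrary here. It plays the role of the empirically chosen threshold height, which is the kind of parameter the tests introduced later aim to avoid.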

Source recovery

This section presents a number of techniques used in the source recovery stage of two-step UBSS algorithms. In the underdetermined case, the system (2) has fewer equations than unknowns, and thus it has (in general) infinitely many solutions. In order to recover the original sources, additional assumptions are needed.

The DUET method[26] assumes the sources to be (approximately) W-disjoint orthogonal in the time–frequency domain, that is, the supports of the STFTs of any two sources present in the observations are disjoint. The source recovery is performed by partitioning the time–frequency plane using the mixing parameter estimates. This procedure assigns a source to each time–frequency point, even if this point is due to noise alone, which is detrimental to the overall performance of the method.

Although TIFROM and TIFCORR do not require the sources to be W -disjoint orthogonal for source recovery, they however suffer from the same limitation as DUET in that they also assign time–frequency points of noise alone to sources.

Bofill and Zibulevsky[18] use the ℓ₁ -norm minimization to recover the sources. In the noiseless case, this can be accomplished by solving the convex optimization

min_{S_{s} (t, f)} {∥S_{s} (t, f)∥}_{1} subject to S_{x} (t, f) = A S_{s} (t, f),

(4)

where ${∥\cdot∥}_{1}$ is the ℓ₁ norm. In presence of noise, the foregoing constraint must be modified so as to take the noise standard deviation into account. In practice, this noise standard deviation is unknown and must be estimated.
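As an illustration of problem (4), the sketch below solves it by brute force in a simplified setting: real-valued coefficients and M = 2 sensors. This simplification is an assumption made for the example only, since the STFT coefficients of the article are complex. In the real case, (4) is a linear program, so a minimizer can be found among the solutions supported on at most two linearly independent columns, and these can be enumerated. The function name is hypothetical.

```python
import itertools
import math

def l1_min_two_sensors(A, x, eps=1e-12):
    """Minimum l1-norm solution of A s = x for a real 2 x N matrix A.

    Enumerates all pairs of linearly independent columns and keeps the
    exact solution with the smallest l1 norm."""
    n_sources = len(A[0])
    best, best_cost = None, math.inf
    for i, k in itertools.combinations(range(n_sources), 2):
        det = A[0][i] * A[1][k] - A[0][k] * A[1][i]
        if abs(det) < eps:
            continue  # columns i and k are (numerically) collinear
        # Cramer's rule for the 2 x 2 system [a_i a_k] [s_i, s_k]^T = x.
        s_i = (x[0] * A[1][k] - A[0][k] * x[1]) / det
        s_k = (A[0][i] * x[1] - x[0] * A[1][i]) / det
        cost = abs(s_i) + abs(s_k)
        if cost < best_cost:
            best_cost = cost
            best = [0.0] * n_sources
            best[i], best[k] = s_i, s_k
    return best

# Toy example: the observation is aligned with the third column only.
r = 1.0 / math.sqrt(2.0)
A = [[1.0, 0.0, r],
     [0.0, 1.0, r]]
x = [1.0, 1.0]
s_hat = l1_min_two_sensors(A, x)
```

In this toy case the ℓ₁ criterion prefers the single-source explanation (only the third source active) over the two-source one, which is the sparsity-promoting behavior exploited in[18].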

For the SUBSS approach in[15], the source recovery is based on the following assumptions:

Assumption 2

The number of active sources at any (t,f) is strictly less than the number M of sensors.

Assumption 3

Any M×M sub-matrix of the mixing matrix has full rank, that is, for all $J \subset \{1, 2, \dots, N\}$ with cardinality less than M, ${(a_{j})}_{j \in J}$ are linearly independent.

The subspace approach then performs multisource selection, that is, the selection of time–frequency points pertaining to a mixture, and then identifies the sources present at each multisource time–frequency point. Thanks to Assumption 2, the method then involves solving the resulting locally overdetermined linear problem. By construction, the method requires rejecting time–frequency points of noise alone. In[15], the time–frequency points with energy below some empirically chosen threshold are rejected.

Statistical tests for sparseness-based UBSS

This section is the main core of the article since it is dedicated to a series of improvements brought to the classical UBSS methods presented in Section “Main steps of standard UBSS methods”. These improvements concern the selection of the time–frequency points of interest for source separation (multisource selection) and the selection of the time–frequency points suitable for mixing matrix estimation (autosource selection). The crux of the approach followed below is to consider the aforementioned selections of time–frequency points as statistical testing problems of accepting or rejecting the presence of sources in noise. These two hypothesis testing problems are different in that mixing matrix estimation requires selecting points where only one single source is present, whereas this constraint is useless for denoising and source recovery.

The issue in these binary hypothesis testing problems is twofold. On the one hand, the observation in each problem has unknown distribution because basically the possible source signal distributions are themselves unknown. On the other hand, the noise standard deviation is unknown as well. Because of this lack of prior knowledge, standard likelihood theory or extensions such as generalized likelihood ratios or invariance-based approaches do not apply.

For source recovery, our solution is an estimate-and-plug-in detector. Based on a weak-sparseness model for the signal sources in noise, it begins by estimating the noise standard deviation via the DATE introduced in[22]. Then, the noise standard deviation estimate is used instead of the unknown true value in the expression of a statistical test, also designed for noisy sparse signal representations.

For mixing matrix estimation, we exploit the physical nature of the signals to detect the time–frequency points where one single source is present. For signals with high overlapping rate, SNT is appropriate to select such time–frequency points. When the signals have low overlapping rate, we directly use the time–frequency points provided by the source recovery procedure.

Figure 2 presents the flowchart of the proposed approach based on the DATE and SNT.

Weak-sparseness-based time–frequency detection for source recovery (multisource selection)

Recovering sources involves detecting the time–frequency points that pertain to signals. Therefore, time–frequency points due to noise alone are useless to recover sources. Detecting the time–frequency points appropriate for source recovery thus amounts to deciding whether any given time–frequency point (t,f) pertains to some signal of interest or not. It is thus natural to state this problem as a binary hypothesis testing problem, where the null hypothesis $H_{0}$ is that $S_{x} (t, f) ∽ N_{c} (0, σ^{2} I_{M})$ is complex Gaussian noise and the alternative hypothesis $H_{1}$ is that $S_{x} (t, f) = Θ (t, f) + S_{n} (t, f)$ is a source mixture in independent and additive complex Gaussian noise, where $S_{n} (t, f) ∽ N_{c} (0, σ^{2} I_{M})$ and Θ(t,f) stands for the mixture of signals possibly present at time–frequency point (t,f).

The issue is then the following. Although $S_{x} (t, f)$ can reasonably be modeled as a random complex variable, the distribution of $S_{x} (t, f)$ can hardly be known and standard likelihood theory thus becomes useless. This difficulty can however be overcome by resorting to a weak-sparseness model that can be introduced as follows.

Figure 3a displays the spectrogram obtained by STFT of a mixture of audio signals. This spectrogram exhibits many time–frequency components with small or even null amplitudes. When this mixture is corrupted by additive and independent noise as in Figure 3b, small components are masked and only big ones are still visible. We must also note that the proportion of these big components remains seemingly less than or equal to one half. In other words, it is reasonable to assume that (1) the signal components are either present or absent in the time–frequency domain with a probability of presence less than or equal to one half and (2) when present, the signal components are relatively big in that their amplitude is above some minimum value. These two assumptions specify the weak-sparseness model by bounding our lack of prior knowledge on the signal distribution. The weak-sparseness model slightly differs from the “strong” sparsity model encountered in compressive sensing, where it is assumed that the non-null significant signal components are very few. In the weak-sparseness model, we do not restrict our attention to very small proportions of big time–frequency components.

To take the weak-sparseness model into account in our binary hypothesis problem statement, we assume that (1) the probability of occurrence of hypothesis $H_{1}$ is less than or equal to one half and (2) there exists some positive real value α such that |Θ(t,f)|>α. The value α can be regarded as the minimum signal amplitude. We thus write that

\left\{\begin{array}{ll} H_{0} : & S_{x} (t, f) ∽ N_{c} (0, σ^{2} I_{M}) \\ H_{1} : & S_{x} (t, f) = Θ (t, f) + S_{n} (t, f), \end{array}\right.

(5)

with $S_{n} (t, f) ∽ N_{c} (0, σ^{2} I_{M})$ , |Θ(t,f)|>α and $ℙ (H_{1}) ⩽ 1 / 2$ . Furthermore, we do not assume that the probability distribution of Θ(t,f) is known. In what follows, we prefer summarizing this testing problem by introducing a Bernoulli distributed random variable ϵ(t,f), valued in {0,1}, independent of Θ(t,f) and $S_{n} (t, f)$ , but defined on the same probability space, so as to write that $S_{x} (t, f) = ϵ (t, f) Θ (t, f) + S_{n} (t, f)$ . We thus have $ℙ (H_{1}) = ℙ [ϵ (t, f) = 1]$ . Given any test $T$ , that is, any measurable map of $ℂ^{M}$ into {0,1}, we then say that $T$ accepts (resp. rejects) the null hypothesis $H_{0}$ if $T (S_{x} (t, f)) = 0$ (resp. $T (S_{x} (t, f)) = 1$ ). In other words, $T$ is said to return the expected value of the true hypothesis. The error probability of $T$ is then defined as the probability $P_{e} \{T\} = ℙ [T (S_{x} (t, f)) \neq ϵ (t, f)]$ .

According to ([25], Theorem VII.1), the decision should then be performed by using the thresholding test with threshold height $λ_{D} (α, σ) = (σ / \sqrt{2}) ξ (α \sqrt{2} / σ)$ where, for any positive ρ, $ξ (ρ) = I_{0}^{- 1} (e^{ρ^{2} / 2}) / ρ$ and $I_{0}$ is the zeroth order modified Bessel function of the first kind. By thresholding test with threshold height $h \in [0, \infty)$ , we mean the test $T_{h}$ such that

T_{h} (u) = \left\{\begin{array}{ll} 1 & \text{if } | u | ⩾ h \\ 0 & \text{if } | u | < h. \end{array}\right.

(6)
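The threshold height and the test (6) can be evaluated numerically as sketched below. This is only an illustrative sketch for a scalar complex coefficient, using the standard library: $I_{0}$ is computed from its power series and inverted by bisection, which is one possible choice among others, and all function names are hypothetical. The exponential overflows for very large ρ, which this sketch does not handle.

```python
import math

def bessel_i0(x):
    """Zeroth order modified Bessel function of the first kind, via its power series."""
    q = (x / 2.0) ** 2
    term, total, k = 1.0, 1.0, 0
    while term > 1e-16 * total:
        k += 1
        term *= q / (k * k)      # term_k = (x/2)^(2k) / (k!)^2
        total += term
    return total

def bessel_i0_inverse(y):
    """Inverse of I_0 on [0, inf) for y >= 1, computed by bisection."""
    if y < 1.0:
        raise ValueError("I_0 takes values in [1, inf) on the non-negative half-line")
    lo, hi = 0.0, 1.0
    while bessel_i0(hi) < y:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if bessel_i0(mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def xi(rho):
    """xi(rho) = I_0^{-1}(exp(rho^2 / 2)) / rho for rho > 0."""
    return bessel_i0_inverse(math.exp(rho ** 2 / 2.0)) / rho

def lambda_d(alpha, sigma):
    """Threshold height (sigma / sqrt(2)) * xi(alpha * sqrt(2) / sigma)."""
    return sigma / math.sqrt(2.0) * xi(alpha * math.sqrt(2.0) / sigma)

def thresholding_test(u, h):
    """Test T_h of Equation (6): returns 1 if |u| >= h and 0 otherwise."""
    return 1 if abs(u) >= h else 0
```

For large ρ, ξ(ρ) is close to ρ/2, so that the threshold height is roughly half the minimum amplitude α.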

The reasons for which this test is recommended are the following ones. Let $L_{mpe}$ be the minimum-probability-of-error (MPE) test, that is, the likelihood ratio test that guarantees the least possible probability of error among all possible tests and that could be computed if the probability distribution of Θ(t,f) and the prior probability of presence $ℙ (H_{1})$ were known. Two facts follow from ([25], Theorem VII.1). First, the error probability of $T_{λ_{D} (α, σ)}$ is no smaller than the error probability of the MPE test and is less than or equal to an explicit function $V (α \sqrt{2} / σ)$ , whose expression is not needed in the sequel. Second, $V (α \sqrt{2} / σ)$ is a sharp upper-bound since it is attained by the error probabilities of tests $L_{mpe}$ and $T_{λ_{D} (α, σ)}$ in the least favorable case where $ℙ [ϵ = 1] = 1 / 2$ and $Θ (t, f) = α e^{i Φ (t, f)}$ with Φ(t,f) uniformly distributed in [0,2π) and i is the imaginary unit (i² = −1). To carry out this test, we must choose an appropriate value for α and perform an estimate of σ.

The value of α is fixed by following the same reasoning as in[29] and considering that the minimum amplitude of the signal to detect is the noise maximum value. More specifically, given m random variables $X_{1}, X_{2}, \dots, X_{m}$ that are independent and identically distributed with $X_{k} \overset{iid}{∽} N (0, σ^{2})$ for $1 ⩽ k ⩽ m$ , it is known ([30], Eqs. (9.2.1), (9.2.2), Section 9.2, p. 187) ([31], p. 454) ([32], Section 2.4.4, p. 91) that

\lim_{m \to + \infty} ℙ \left[ λ_{u} - \frac{σ \ln \ln m}{\ln m} ⩽ \max \{| X_{k} |, 1 ⩽ k ⩽ m\} ⩽ λ_{u} \right] = 1,

where $λ_{u} = σ \sqrt{2 ln m}$ is often called the universal threshold[33]. The maximum amplitude of ${(X_{k})}_{1 ⩽ k ⩽ m}$ thus has a high probability of being close to $λ_{u}$ when m is large and the universal threshold can be regarded as the noise maximum amplitude of m noise samples. In our case, we have M sensors so that each observation $S_{x} (t, f)$ is an M-dimensional complex vector. Let L stand for the number of time–frequency points (t,f) obtained for each sensor. We thus have M×L time–frequency points (t,f) and, therefore, 2ML random variables—the real and imaginary parts of $S_{n} (t, f)$ —that are $N (0, σ^{2} / 2)$ . The maximum amplitude of these 2ML Gaussian independent and identically distributed random variables with standard deviation $σ / \sqrt{2}$ will then be considered as the minimum signal amplitude so that we set $α = σ \sqrt{log (2 ML)}$ . The threshold height used to detect the relevant time–frequency points is then $λ_{D} (σ) = \frac{σ}{\sqrt{2}} ξ (\sqrt{2 log (2 ML)})$ , which is henceforth called the detection threshold.
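For convenience, the substitutions behind these two expressions can be written out step by step. The derivation below assumes that log denotes the natural logarithm, as in the universal threshold, and uses the definition of ξ given with the thresholding test; the intermediate symbol ρ is introduced here only for readability.

```latex
\begin{align*}
α &= \frac{σ}{\sqrt{2}} \sqrt{2 \ln (2ML)} = σ \sqrt{\ln (2ML)}, \\
ρ &= \frac{α \sqrt{2}}{σ} = \sqrt{2 \ln (2ML)}, \qquad e^{ρ^{2} / 2} = 2ML, \\
λ_{D} (σ) &= \frac{σ}{\sqrt{2}} \, ξ (ρ)
           = \frac{σ}{\sqrt{2}} \cdot \frac{I_{0}^{-1} (2ML)}{\sqrt{2 \ln (2ML)}}
           = \frac{σ \, I_{0}^{-1} (2ML)}{2 \sqrt{\ln (2ML)}} .
\end{align*}
```

The detection threshold is thus proportional to σ, with a factor that depends only on the number 2ML of real noise samples.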

As far as the estimation of the noise standard deviation is concerned, usual solutions based on standard robust estimators such as the median absolute deviation (MAD)[34], the trimmed or the winsorized estimators[35] do not apply. Indeed, by considering the spectrogram of Figure 3b, it can easily be guessed that such standard estimators would fail because the proportion of significant noisy time–frequency points pertaining to the signals is large. Therefore, these signal-bearing time–frequency points are not few enough to play the role of outliers with respect to the main core of the data distribution. In a recent article[22], a new noise standard deviation estimator called the DATE has been proposed. This estimator relies on the weak-sparseness model presented before. An exhaustive presentation of the theoretical background on which this estimator is based is beyond the scope of the present article and the reader is asked to refer to[22] for a heuristic presentation and a complete mathematical description of the DATE. In the context addressed in the present article, this algorithm applies as follows.

With the notation used so far, each $S_{x} (t, f)$ is an M-dimensional complex vector. Let $S_{x_{j}} (t, f)$ , j=1,2,…,M, be the components of $S_{x} (t, f)$ . For any given j=1,2,…,M, we assume that the L time–frequency components $S_{x_{j}} (t, f)$ for the jth sensor are independent and that each time–frequency component obeys the binary hypothesis model of (5) with $α = σ \sqrt{log (2 ML)}$ . According to[22] and setting κ=2Γ(3/2) where Γ is the standard Gamma function, there exists a specific convergence criterion, for which we have:

\frac{\sum_{(t, f)} | S_{x_{j}} (t, f) | 1 (| S_{x_{j}} (t, f) | ⩽ λ_{D} (σ))}{\sum_{(t, f)} 1 (| S_{x_{j}} (t, f) | ⩽ λ_{D} (σ))} \approx κσ

(7)

when the number L of time–frequency bins (t,f) is large enough. In the previous equation, $1 (| S_{x_{j}} (t, f) | ⩽ λ_{D} (σ))$ stands for the indicator function of event $| S_{x_{j}} (t, f) | ⩽ λ_{D} (σ)$ . The specific convergence criterion involved in (7) is specified in[22] and is not given here because of its intricateness. It also turns out that the noise standard deviation σ is the unique solution of (7) with respect to the convergence criterion involved. Therefore, the DATE basically performs an estimate of the noise standard deviation by solving (7) with regard to this convergence criterion. The several steps involved in the computation are then the following ones.

The DATE:

Given j ∈ {1,2,…,M}, let $Y_{(1)}^{j}, Y_{(2)}^{j}, \dots, Y_{(L)}^{j}$ be the L values $| S_{x_{j}} (t, f) |$ sorted in ascending order.

(1) [Search interval]:
  (a) Choose some positive real value Q less than or equal to $1 - \frac{L}{4 {(L / 2 - 1)}^{2}}$ .
  (b) Set $h = 1 / \sqrt{4 L (1 - Q)}$ .
  (c) Compute $k_{min} = L / 2 - hL$ . According to Bienaymé–Chebyshev’s inequality and since the probabilities of presence of the signals are assumed to be less than or equal to one half, the probability that the number of observations due to noise alone is above $k_{min}$ is larger than or equal to Q. In the experimental results presented below, Q was set to 0.95 for the computation of $k_{min}$ .

(2) [Existence]: IF there exists a smallest integer k in $\{k_{min}, \dots, L\}$ such that

$| Y_{(k)}^{j} | ⩽ (μ_{j} (k) / κ) ξ (\sqrt{2 log (2 ML)}) < | Y_{(k + 1)}^{j} |$
(8)

with

$μ_{j} (k) = \left\{\begin{array}{ll} \frac{1}{k} \sum_{r = 1}^{k} | Y_{(r)}^{j} | & \text{if } k \neq 0 \\ 0 & \text{if } k = 0, \end{array}\right.$
(9)

set $k^{*} = k$ . ELSE, set $k^{*} = k_{min}$ .

(3) [Value]: The estimate $\hat{σ_{j}}$ of the noise standard deviation on the jth sensor is then

$\hat{σ_{j}} = μ_{j} (k^{*}) / κ.$
(10)

The final estimate $\hat{σ}$ of the noise standard deviation is then obtained by averaging the values $\hat{σ_{j}}$ so that $\hat{σ} = (1 / M) \sum_{j = 1}^{M} \hat{σ_{j}}$ .
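The steps above can be sketched in Python as follows. This is a hedged sketch rather than the reference implementation of[22]: the function names are hypothetical, the value ξ(√(2 log(2ML))) and the constant κ are passed in as precomputed arguments `xi_rho` and `kappa`, $k_{min}$ is rounded up to an integer (an assumption, since the text does not specify the rounding), and the search stops at k = L − 1 because $Y_{(L + 1)}^{j}$ does not exist.

```python
import math

def date_sensor(magnitudes, xi_rho, kappa, q=0.95):
    """Noise standard deviation estimate for one sensor, following steps (1)-(3).

    magnitudes: the L values |S_xj(t, f)| for sensor j.
    xi_rho:     precomputed value of xi(sqrt(2 log(2 M L))).
    kappa:      normalizing constant of the text."""
    y = sorted(magnitudes)
    n = len(y)
    # Step (1): search interval (rounding k_min up is an assumption of this sketch).
    h = 1.0 / math.sqrt(4.0 * n * (1.0 - q))
    k_min = max(1, math.ceil(n / 2.0 - h * n))
    # Running sums so that mu(k) = prefix[k] / k.
    prefix = [0.0]
    for value in y:
        prefix.append(prefix[-1] + value)
    # Step (2): smallest k with Y_(k) <= (mu(k) / kappa) * xi_rho < Y_(k+1).
    k_star = k_min
    for k in range(k_min, n):
        level = (prefix[k] / k) / kappa * xi_rho
        if y[k - 1] <= level < y[k]:
            k_star = k
            break
    # Step (3): value of the estimate.
    return (prefix[k_star] / k_star) / kappa

def date(per_sensor_magnitudes, xi_rho, kappa, q=0.95):
    """Average of the per-sensor estimates over the M sensors."""
    estimates = [date_sensor(m, xi_rho, kappa, q) for m in per_sensor_magnitudes]
    return sum(estimates) / len(estimates)

# Toy check with arbitrary constants: 30 small "noise" values and 10 large "signal" values.
toy = [1.0] * 30 + [100.0] * 10
sigma_toy = date_sensor(toy, xi_rho=2.0, kappa=1.0)
```

In the toy check, the constants `xi_rho=2.0` and `kappa=1.0` are arbitrary and only serve to exercise the search: the criterion (8) is first met at k = 30, so the estimate is the mean of the 30 small values. In an actual use, `kappa` would be set to the value of κ quoted in the text and `xi_rho` computed from M and L.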

Signal source detection for mixing matrix estimation (autosource selection)

In this section, we propose a test for selecting the time–frequency points where one signal source is probably present alone. To perform this selection, we make the distinction between signals with either low or high overlapping rate in the time–frequency domain. Chirp signals (resp. audio signals) are typical examples of signals with low (resp. high) overlapping rate. It is worth noticing that the estimation procedures proposed below for each class have reasonable computational costs.

The case of signals with low overlapping rate

Since the sources have low overlapping rate, we suppose that the observations detected by the thresholding test of Section “Weak-sparseness-based time–frequency detection for source recovery (multisource selection)” mostly pertain to one signal source. In other words, we neglect the effect on the matrix estimation performance of the few points where sources may overlap, inasmuch as the impact of such time–frequency points is further reduced by the averaging effect inherent to any mixing matrix estimation method.

The case of signals with high overlapping rate

When signals overlap significantly in the time–frequency domain, the time–frequency detection of Section “Weak-sparseness-based time–frequency detection for source recovery (multisource selection)” is no longer appropriate. Indeed, the statistical procedure of that section is aimed at detecting time–frequency points where signal sources are present, whatever the number of these sources, whereas it is now required to discriminate points where one single source is present from points where multiple sources occur. We assume that, when different sources are present at time–frequency point (t,f), they are uncorrelated and incoherently combined. The resulting energy at (t,f) is thus supposed to be smaller than the energy attained at the time–frequency points where one single source is present only.

Our purpose is thus to detect the time–frequency points where the signal energy is large enough in the presence of noise. Basically, this problem amounts to deciding whether $| A S_{s} (t, f) |$ is above some value τ or not. The value τ² thus represents the minimum energy level above which we consider that the signal energy is large enough to assume that one single source is actually present at (t,f). For any $λ \in (0, \infty)$, it follows from ([23], Lemma 4, statement (iii)) that

\begin{matrix} ℙ [| S_{x} (t, f) | > λ | | A S_{s} (t, f) | < τ] ⩽ 1 - F_{χ_{2 M}^{2} (2 τ^{2} / σ^{2})} (2 λ^{2} / σ^{2}), \end{matrix}

(11)

where $F_{χ_{d}^{2} (δ)} (\cdot)$ stands for the cumulative distribution function of the non-central chi-square distribution with d degrees of freedom and non-centrality parameter δ. The number of degrees of freedom in (11) is 2M since each $S_{x} (t, f)$ is an M-dimensional complex random vector and, thus, a 2M-dimensional real random vector. Given some level γ∈(0,1), it then suffices to choose

λ = λ (τ, γ) = σ \sqrt{\frac{1}{2} F_{χ_{2 M}^{2} (2 τ^{2} / σ^{2})}^{- 1} (1 - γ)} .

(12)

to guarantee a “false alarm probability” $ℙ [| S_{x} (t, f) | > λ | | A S_{s} (t, f) | < τ]$ less than or equal to γ.

Therefore, for a given time–frequency point (t,f), the decision is that $| A S_{s} (t, f) | < τ$ if $| S_{x} (t, f) | < λ (τ, γ)$ and that $| A S_{s} (t, f) | ⩾ τ$ if $| S_{x} (t, f) | ⩾ λ (τ, γ)$. For mixing matrix estimation, we then keep the time–frequency points (t,f) such that $| S_{x} (t, f) | ⩾ λ (τ, γ)$, which are regarded as time–frequency points pertaining to one single source. In practice, since the actual value of σ is unknown, we replace this true value by its estimate $\hat{σ}$ provided by the DATE.

Although the two parameters γ and τ must be fixed, there is no need to choose them for each signal-to-noise ratio. Parameter τ, which is independent of the noise level, can be fixed via a small noiseless database. Similarly, level γ can be determined via a few preliminary tests on a small representative database.
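As a rough illustration of Equations (11) and (12), the threshold λ(τ, γ) can be obtained from the quantile function of the non-central chi-square distribution. The sketch below is not the authors' implementation. It assumes SciPy's `scipy.stats.ncx2`, and the function names `snt_threshold` and `autosource_mask`, as well as the array layout (sensors on the first axis), are hypothetical choices made for the example.

```python
import numpy as np
from scipy.stats import ncx2


def snt_threshold(tau, gamma, sigma, n_sensors):
    """Threshold lambda(tau, gamma) of Equation (12).

    tau       : tolerance on |A S_s(t, f)|
    gamma     : upper bound on the false alarm probability, in (0, 1)
    sigma     : noise standard deviation (or its estimate)
    n_sensors : number of sensors M
    """
    df = 2 * n_sensors                  # 2M real degrees of freedom
    nc = 2.0 * tau**2 / sigma**2        # non-centrality parameter
    quantile = ncx2.ppf(1.0 - gamma, df, nc)
    return sigma * np.sqrt(0.5 * quantile)


def autosource_mask(stft_x, tau, gamma, sigma_hat):
    """Boolean mask of the points kept for mixing matrix estimation.

    stft_x is assumed to have shape (M, n_freq, n_time); a point is kept
    when the norm of its M-dimensional observation reaches the threshold.
    """
    lam = snt_threshold(tau, gamma, sigma_hat, stft_x.shape[0])
    norms = np.linalg.norm(stft_x, axis=0)
    return norms >= lam
```

In this sketch, the noise standard deviation is passed in explicitly. In the procedure described above, it would be the estimate returned by the DATE.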

Simulation results

In most of the following simulations, the mixing matrix is chosen according to ([14], Eq. (38)) so as to model N sources arriving at the sensor array at different angles θ₁, θ₂, …, θ_N. The entries of matrix A are therefore $a_{j, k} = e^{i π (j - 1) \sin (θ_{k})}$ for $j \in \{1, \dots, M\}$ and $k \in \{1, \dots, N\}$. In the sequel, we proceed by choosing four sources (N = 4), three sensors (M = 3), θ₁ = 15°, θ₂ = 30°, θ₃ = 45° and θ₄ = 75°.
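The construction of this mixing matrix can be sketched in a few lines. The function name `steering_matrix` is a hypothetical choice for the example, and NumPy is assumed to be available.

```python
import numpy as np


def steering_matrix(thetas_deg, n_sensors):
    """Return the M x N matrix with entries exp(i*pi*(j-1)*sin(theta_k))."""
    thetas = np.deg2rad(np.asarray(thetas_deg, dtype=float))
    sensor_index = np.arange(n_sensors)[:, None]   # j - 1 for j = 1, ..., M
    return np.exp(1j * np.pi * sensor_index * np.sin(thetas)[None, :])


# Configuration used in the simulations: N = 4 sources, M = 3 sensors.
A = steering_matrix([15.0, 30.0, 45.0, 75.0], n_sensors=3)
```

All entries have unit modulus, so no entry of A is null, which is consistent with the remark below that the spectrograms of the different mixtures look alike.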

Unless specified otherwise, the source signals are speech signals randomly chosen in the TI-digits database[36]. This large speech database collected in a quiet environment is commonly used in speech processing. In this article, the chosen speech signals were downsampled to 8 kHz. All signals involve 8,192 samples. Figure 4a–d shows the time-domain representations of the original source signals and Figure 4e–h represents their corresponding spectrograms. Figure 5 displays a spectrogram of a mixture of these speech signals when the mixing matrix A is applied to them at SNR = 10 dB. The spectrograms of the other mixtures are not presented because the differences between any two of them are not visually noticeable since the mixing matrix A involves no null entry.

The two parameters required for the mixing matrix estimation are then fixed to τ = 4 and γ = 10⁻³. The source separation performance is measured by the normalized mean square error (NMSE):

NMSE = \min_{i, j} \left\{ 10 \log_{10} \left( 1 - {\left( \frac{\langle \hat{s}_{i}, s_{j} \rangle}{\| \hat{s}_{i} \| \cdot \| s_{j} \|} \right)}^{2} \right) \right\} .

(13)

Throughout this section, NMSEs are calculated over 100 Monte-Carlo runs.
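For concreteness, Equation (13) can be evaluated as in the following sketch. It is only an illustration under stated assumptions: the function name `nmse_db` is hypothetical, the modulus of the inner product is used so that complex signals are handled, and a small floor `eps` (an addition of this example, not part of Equation (13)) avoids taking the logarithm of zero when an estimate is perfectly correlated with a source.

```python
import numpy as np


def nmse_db(s_hat, s, eps=1e-12):
    """NMSE of Equation (13), in dB.

    s_hat : array of shape (P, T), one estimated source per row
    s     : array of shape (N, T), one reference source per row
    """
    s_hat = np.atleast_2d(s_hat)
    s = np.atleast_2d(s)
    inner = s_hat @ s.conj().T                      # all inner products, (P, N)
    norms = (np.linalg.norm(s_hat, axis=1)[:, None]
             * np.linalg.norm(s, axis=1)[None, :])
    rho2 = np.abs(inner / norms) ** 2               # squared correlation
    distortion = np.clip(1.0 - rho2, eps, None)     # floor avoids log10(0)
    return float(np.min(10.0 * np.log10(distortion)))
```

An estimate orthogonal to every source yields 0 dB, and the value decreases as the estimate becomes more correlated with one of the sources. Averaging over Monte-Carlo runs is left to the caller.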

SUBSS method

The modified SUBSS algorithm is obtained by using the DATE for source recovery and SNT for mixing matrix estimation, as explained in Section “Statistical tests for sparseness-based UBSS”. It is used to separate the four source signals from the noisy mixed signals observed by the three sensors.

The waveforms of the source signals recovered by the modified SUBSS algorithm are represented in Figure 6. Figure 6a–d shows the time-domain representations of the recovered source signals in the noiseless case (input SNR = 45 dB), and Figure 6e–h shows the time-domain representations of the recovered source signals with input SNR = 10 dB.

In Figure 7, the performance of the modified SUBSS algorithm, with and without denoising, is compared to that obtained by the original SUBSS algorithm of[15]. The denoising mentioned above is described in the Appendix as a standard linear estimation.

The modified SUBSS algorithm outperforms the original SUBSS algorithm[15], which relies on thresholds that are manually chosen for each input SNR. Moreover, modified SUBSS without denoising yields performance measurements that do not significantly depart from those attained by the original subspace-based UBSS algorithm. In addition, Figure 7 displays the NMSEs obtained by using the MAD estimator instead of the DATE in the modified SUBSS algorithm without denoising. The use of the MAD instead of the DATE induces a significant performance loss, which illustrates the relevance of the DATE and the weak-sparseness model. In Figures 8 and 9, we present the NMSEs obtained by the modified SUBSS and the original SUBSS when the number of sources increases, for SNR = 10 dB and SNR = 20 dB. In both figures, the NMSEs degrade, because an increase of the source interference invalidates Assumption 1.

We now consider the case of complex chirp signals. These were generated by slightly modifying the MATLAB routine MakeSignal.m of the WaveLab toolbox, so as to obtain complex chirp signals. The four chirp signals we use as sources are $s_{1} (t) = \sqrt{t (1 - t)} e^{i \frac{π T}{2} t^{2}}$, $s_{2} (t) = \sqrt{t (1 - t)} e^{- i \frac{π T}{4} t^{2}}$, $s_{3} (t) = e^{- i π T t^{2}}$ and $s_{4} (t) = e^{i \frac{2}{3} π T t}$, where t∈[0,1] and T = 8192 is the number of samples for each signal. Two of these chirp signals are LFM ones and one is a pure sine. Figure 10 then displays the spectrograms of the four chirp signals under consideration, whereas Figure 11 presents the spectrogram of a mixture of these sources when matrix A is applied and SNR = 10 dB. The spectrograms of the other mixtures are not displayed for the same reasons as those given previously for the speech signal mixtures.
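The four sources can be reproduced, up to the choice of the sampling grid, with the sketch below. It does not rely on the WaveLab routine: the function name `make_chirps` is hypothetical, and the grid t_n = n/T for n = 0, …, T−1 is an assumption of this example, since the text only states that t ranges over [0,1] with T samples.

```python
import numpy as np


def make_chirps(T=8192):
    """Return a (4, T) complex array holding s_1, ..., s_4 as rows."""
    t = np.arange(T) / T                 # assumed uniform grid on [0, 1)
    envelope = np.sqrt(t * (1.0 - t))    # amplitude modulation of s_1 and s_2
    s1 = envelope * np.exp(1j * np.pi * T / 2 * t**2)
    s2 = envelope * np.exp(-1j * np.pi * T / 4 * t**2)
    s3 = np.exp(-1j * np.pi * T * t**2)
    s4 = np.exp(1j * 2 / 3 * np.pi * T * t)   # constant frequency
    return np.vstack([s1, s2, s3, s4])


S = make_chirps()
```

With this grid, the phase of s_4 advances by 2π/3 from one sample to the next, which corresponds to a horizontal line in the spectrogram.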

The experimental procedure for assessing the modified SUBSS in comparison to the original SUBSS method is then the same as above. As specified in Section “The case of signals with low overlapping rate”, the thresholds used for the mixing matrix estimation are the detection ones. Therefore, no additional parameter is needed. The results obtained in Figure 12 show the relevance of this choice for the thresholds, explained by the fact that chirp signals present very few overlapping time–frequency components.

Other methods

As described in Sections “Weak-sparseness-based time–frequency detection for source recovery (multisource selection)” and “Signal source detection for mixing matrix estimation (autosource selection)”, the DATE and SNT can be used to perform multisource and autosource selections, respectively. In other words, the statistical tests of the aforementioned sections make it possible to obtain the time–frequency points where noisy mixtures are present and the set of time–frequency points where only one single source exists. In this section, we comment on the results obtained by proceeding in this way with the UBSS methods, other than SUBSS, addressed in Section “Main steps of standard UBSS methods”.

In the underdetermined case, TIFROM achieves partial source separation only. Therefore, to better assess the contribution of our statistical tests to TIFROM, we consider the determined case where four source signals from four speakers are mixed. The mixing matrix is now 4×4 with independent Gaussian entries. In Figure 13, we present the NMSEs obtained by TIFROM, SNT-TIFROM and modified SNT-TIFROM. Specifically, SNT-TIFROM uses SNT to select time–frequency points where a source exists alone. SNT-TIFROM, like TIFROM, performs no multisource selection for source recovery. In contrast, the modified SNT-TIFROM performs multisource selection and forces to zero the unselected time–frequency points. These results show that SNT makes it possible to actually select the autosource time–frequency points, with no performance loss and without resorting to the empirical threshold required by the original TIFROM. The performance yielded by the modified SNT-TIFROM further emphasizes that the detection threshold adjusted with the DATE selects appropriate multisource time–frequency points for source recovery. The gain for low SNRs is explained by the fact that this selection can be regarded as a non-linear denoising. The gain brought by this denoising effect decays when the SNR increases.

Another contribution of our statistical approach to sparseness-based methods is the estimation of the noise standard deviation. Indeed, some methods need an estimate or the true value of the noise standard deviation. For instance, Bofill and Zibulevsky[18] use the ℓ₁-norm minimization to recover the sources. In the noisy case, they propose to solve the optimization problem:

\min_{S_{s} (t, f)} \frac{1}{2 σ^{2}} {∥S_{x} (t, f) - A S_{s} (t, f)∥}_{2}^{2} + {∥S_{s} (t, f)∥}_{1} .

Because of the weak sparseness of the sources in noise, we hereafter prefer to follow[37], which is dedicated to the stable recovery of signals that are not exactly sparse. We therefore solve the optimization problem

\min_{S_{s} (t, f)} {∥S_{s} (t, f)∥}_{1} \quad \text{subject to} \quad {∥S_{x} (t, f) - A S_{s} (t, f)∥}_{2} \leq σ^{2} (M + 2 \sqrt{2 M}) .

(14)

This approach can then be improved in two ways. First, by solving this optimization problem on only the time–frequency points selected by the multisource procedure propounded in Section “Weak-sparseness-based time–frequency detection for source recovery (multisource selection)”. Second, by replacing the unknown true value of the noise standard deviation by its estimate provided by the DATE. In this respect, Figure 14 displays the performance measurements obtained by the original method based on the ℓ₁-criterion of Equation (4) (L1 Minimization) in comparison to the modified ℓ₁-criterion of Equation (14) applied to the outcome of the multisource selection when the noise standard deviation is estimated by the DATE (Modified L1 Minimization). As expected, the gain brought by multisource selection and Equation (14), both adjusted by the noise standard deviation estimate provided by the DATE, is significant. It is also worth noticing that the DATE estimation error does not significantly impact the separation performance in comparison to the case where the noise standard deviation is perfectly known. This can also be seen in Figure 14, where the performance measurements are given when the multisource selection and ℓ₁-criterion of Equation (14) are both adjusted with the actual value of the noise standard deviation (Oracle Modified L1 Minimization). In contrast, there is significant performance loss when the multisource selection and Equation (4) are calculated by using the MAD instead of the DATE (MAD Modified L1 Minimization). The reason still relates to the fact that the DATE is more robust to weak-sparseness than the MAD.

The multisource selection based on the detection threshold adjusted by the estimate provided by the DATE can be further exploited by the DUET reconstruction, as illustrated in Figure 15. In this simulation, the input signals are the chirp signals considered above, so that the W-disjoint orthogonality assumption is satisfied. Moreover, the mixing matrix A is now assumed to be known. On the one hand, we perform the DUET source recovery by considering the whole time–frequency plane. On the other hand, we consider the modified DUET, that is, the DUET source recovery applied to the selected multisource time–frequency points only. The results are similar to those obtained above by TIFROM and its modified versions. Here, the gain brought by the multisource selection, which acts as a denoising, is larger over a wider SNR range because the time–frequency representation of chirp signals is sparser than that of audio signals.
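The effect of restricting the recovery to the selected points can be illustrated with a simplified binary-mask recovery in the spirit of DUET. This is not the DUET algorithm itself: it assumes a known mixing matrix, as in the simulation above, and assigns each time–frequency point to the column of A that best matches the observation, which is only reasonable when at most one source is active per point. The function name `masked_recovery` and the argument `threshold`, which stands in for the detection threshold adjusted with the DATE, are hypothetical.

```python
import numpy as np


def masked_recovery(stft_x, A, threshold=0.0):
    """Binary-mask source recovery restricted to selected points.

    stft_x    : complex array of shape (M, n_freq, n_time)
    A         : known mixing matrix of shape (M, N)
    threshold : points whose observation norm is below it are set to zero
    Returns an array of shape (N, n_freq, n_time).
    """
    M, n_freq, n_time = stft_x.shape
    N = A.shape[1]
    X = stft_x.reshape(M, -1)                        # one column per point
    col_norms = np.linalg.norm(A, axis=0)
    proj = (A.conj().T @ X) / col_norms[:, None]     # normalized projections
    winner = np.argmax(np.abs(proj), axis=0)         # dominant source per point
    keep = np.linalg.norm(X, axis=0) >= threshold    # multisource selection
    S = np.zeros((N, X.shape[1]), dtype=complex)
    idx = np.flatnonzero(keep)
    # Least-squares amplitude of the winning source at each kept point.
    S[winner[idx], idx] = proj[winner[idx], idx] / col_norms[winner[idx]]
    return S.reshape(N, n_freq, n_time)
```

With `threshold=0.0`, every point of the time–frequency plane is processed. A positive threshold zeroes the low-energy points, which is the denoising effect discussed above.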

Discussion

Assessment

The algorithms we propose are very general. They are not dedicated to a given sparseness-based BSS method. They are simple to apply without any adjustment. From the results of Section “Simulation results”, our procedures can therefore be used to improve, simplify or bring robustness to the standard sparseness-based BSS methods considered in the article.

More specifically, the weak-sparseness-based time–frequency detection procedure of Section “Weak-sparseness-based time–frequency detection for source recovery (multisource selection)” can be used as an automatized pre-processing for multisource selection. For example, the time–frequency detection in[15] requires one threshold value for each instrumented SNR. The detection procedure of Section “Weak-sparseness-based time–frequency detection for source recovery (multisource selection)” then makes it possible to avoid this empirical parameter choice, which brings robustness and significant simplification. Used as a pre-processing for TIFROM[16], which basically involves no selection of time–frequency points, the multisource selection we propound can improve the separation performance.

For mixing matrix estimation, our approach described in Section “Signal source detection for mixing matrix estimation (autosource selection)” relies on no weak-sparseness assumption and involves two parameters only, that is, the tolerance and the false-alarm probability. These parameters are valid over the signal-to-noise ratio (SNR) range, in contrast to[15] for instance. Furthermore, the assumptions made by TIFROM can be relaxed by using the autosource selection of Section “Signal source detection for mixing matrix estimation (autosource selection)”. It is also worth noticing that the two parameters we need for mixing matrix estimation have a physical meaning, which is not the case for some standard sparseness-based BSS methods.

Convolutive mixture case

There exists a great variety of possible strategies for dealing with the convolutive mixture case, which is more realistic than the instantaneous one. In the convolutive mixture case, exhibiting a well-established family of methods such as that considered above in the instantaneous mixture one is hardly feasible. However, despite this variety, the statistical framework proposed in this article can be expected to apply in the convolutive mixture case, at least for methods based on time–frequency representations for which separating time–frequency points of noise alone from those of noisy signals can be helpful. For instance, this detection procedure for multisource selection can be used straightforwardly to detect the time–frequency points required by the convolutive SUBSS presented in[38]. The modified convolutive SUBSS thus obtained discards the empirical threshold required in[38] for multisource selection. This entails no significant performance loss, as illustrated by Figure 16. Studying the added value brought by SNT in the convolutive mixture case requires further analysis that could be achieved in some forthcoming work.

Conclusion and perspectives

The algorithms presented in this article contribute to BSS in the underdetermined mixture case by avoiding the empirical parameter choices required by the family of so-called weak-sparseness-based methods. Our first algorithm, aimed at selecting the suitable time–frequency points for source recovery, is fully automatic. The second, dedicated to mixing matrix estimation, requires fixing two parameters only, regardless of the instrumented SNRs.

The question is now to what extent the statistical tests used above in the instantaneous mixture case can possibly be exploited in the convolutive mixture case, especially in complement to the results discussed in Section “Convolutive mixture case”. One may also wonder whether these tests can be extended so as to deal with colored noise. Work on this topic is in progress.

The theoretical and experimental results of this article pinpoint that the subfunctions of the source separation methods considered above, completed with the statistical tests we have proposed, can be regarded as elementary components that can be interchanged and associated to provide new algorithms for source separation in different applicative contexts. This opens new practical prospects. For instance, it would be desirable to construct a toolbox involving all these elementary components for further developments and studies. Such a toolbox would also make it possible to carry out exhaustive experimental assessments on large databases of signals via the BSSEval toolbox, downloadable from[39].

Appendix

Denoising-based source recovery

The SUBSS method presented in[15] estimates the index set of the sources present at a given time–frequency point (t,f). Let us denote by J this set of indexes. Then, Equation (2) reduces to:

S_{x} (t, f) = A_{J} S_{s_{J}} (t, f) + S_{n} (t, f)

(15)

and the STFT coefficients of these active sources can be recovered using:

S_{s_{J}} (t, f) \approx A_{J}^{\#} S_{x} (t, f),

(16)

where $A_{J}^{\#} = {(A_{J}^{H} A_{J})}^{- 1} A_{J}^{H}$ is the Moore–Penrose pseudoinverse of $A_{J}$.

We propose to use the noise standard deviation estimate provided by the DATE to jointly denoise and separate the sources on the basis of the time–frequency points selected by the statistical test of Section “Weak-sparseness-based time–frequency detection for source recovery (multisource selection)”. So, instead of performing the source separation as specified by Equation (16), the source separation is now carried out by computing

{\hat{S}}_{s_{J}} (t, f) = R_{s_{J}} A_{J}^{H} {(A_{J} R_{s_{J}} A_{J}^{H} + {\hat{σ}}^{2} I_{M})}^{- 1} S_{x} (t, f)

(17)

where $\hat{σ}$ is the noise standard deviation estimate returned by the DATE and $R_{s_{J}} = 𝔼 [S_{s_{J}} (t, f) S_{s_{J}}^{H} (t, f)]$. The derivation of the optimal linear estimator of (17) is standard. It involves minimizing the risk $𝔼 [{∥S_{s_{J}} (t, f) - D S_{x} (t, f)∥}^{2}]$ when D ranges over the space of the card(J)×M matrices and under the assumption that the sources are spatially decorrelated. In practice, matrix $R_{s_{J}}$ is unknown and must be estimated. We then proceeded as follows. On the one hand, we have $R_{x} = A R_{s} A^{H} + σ^{2} I_{M}$. On the other hand, $R_{x}$ can be estimated by $\hat{R}_{x} = \frac{1}{\#t} \sum_{t} S_{x} (t, f) S_{x} {(t, f)}^{H}$, where #t stands for the number of time windows on which the STFT is calculated. Since estimates of A and σ are known, we derive from the expressions of $R_{x}$ and $\hat{R}_{x}$ an estimate $\hat{R}_{s}$ of $R_{s}$. An estimate of $R_{s_{J}}$ follows by picking the appropriate columns in $\hat{R}_{s}$.
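The two recovery rules of Equations (16) and (17) can be compared with the following sketch. It is an illustration only: the function names `recover_pinv` and `recover_lmmse` are hypothetical, and the source covariance matrix is passed in as a known argument rather than estimated as described above.

```python
import numpy as np


def recover_pinv(A_J, x):
    """Equation (16): least-squares recovery with the pseudoinverse of A_J."""
    return np.linalg.pinv(A_J) @ x


def recover_lmmse(A_J, R_sJ, sigma_hat, x):
    """Equation (17): linear estimator that jointly denoises and separates.

    A_J       : (M, card(J)) columns of A indexed by the active sources
    R_sJ      : (card(J), card(J)) covariance matrix of the active sources
    sigma_hat : estimate of the noise standard deviation
    x         : (M,) observation S_x(t, f), or (M, K) for K points at once
    """
    M = A_J.shape[0]
    cov_x = A_J @ R_sJ @ A_J.conj().T + sigma_hat**2 * np.eye(M)
    # Solve a linear system instead of forming the inverse explicitly.
    return R_sJ @ A_J.conj().T @ np.linalg.solve(cov_x, x)
```

When the noise estimate is small compared with the source powers and the observation is noise-free, the two rules give nearly the same result. As the noise estimate grows, the estimator of Equation (17) shrinks the recovered coefficients toward zero.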

References

[1] Varajarajan V, Krolik J: Multichannel system identification methods for sensor array calibration in uncertain multipath environments. IEEE Signal Processing Workshop on Statistical Signal Processing (SSP) (Singapore, Oct 2001), pp. 297–300
[2] Rouxel A, Guennec DL, Macchi O: Unsupervised adaptive separation of impulse signals applied to EEG analysis. IEEE International Conference on Acoustics, Speech, Signal Processing (ICASSP), vol. 1 (Istanbul, Turkey, June 2000), pp. 420–423
[3] Abed-Meraim K, Attallah S, Lim T, Damen M: A blind interference canceller in DS-CDMA. IEEE International Symposium on Spread Spectrum Techniques and Applications (Parsippany, Sept 2000), pp. 358–362
[4] Durán-Díaz I, Cruces-Alvarez SA: A joint optimization criterion for blind DS-CDMA detection. EURASIP J. Adv. Signal Process 2007, 2007(79248):1-11.
[5] Aïssa-El-Bey A, Abed-Meraim K, Grenier Y: Underdetermined blind audio source separation using modal decomposition. EURASIP J. Audio Speech Music Process 2007, 2007(85438):1-15.
[6] Comon P, Jutten C (eds.): Handbook of Blind Source Separation: Independent Component Analysis and Blind Deconvolution. (Academic Press, Oxford, 2010)
[7] Cardoso J-F: Blind signal separation: statistical principles. Proc. IEEE 1998, 86(10):2009-2025. 10.1109/5.720250
[8] Belouchrani A, Abed-Meraim K, Cardoso J-F, Moulines E: A blind source separation technique using second-order statistics. IEEE Trans. Signal Process 1997, 45(2):434-444. 10.1109/78.554307
[9] Belouchrani A, Amin MG: Blind source separation based on time-frequency signal representations. IEEE Trans. Signal Process 1998, 46(11):2888-2897. 10.1109/78.726803
[10] Abed-Meraim K, Xiang Y, Manton JH, Hua Y: Blind source separation using second order cyclostationary statistics. IEEE Trans. Signal Process 2001, 49(4):694-701. 10.1109/78.912913
[11] Dempster AP, Laird NM, Rubin DB: Maximum likelihood from incomplete data via the EM algorithm. J. R. Stat. Soc. Ser. B 1977, 39(1):1-38.
[12] Yilmaz O, Rickard S: Blind separation of speech mixtures via time-frequency masking. IEEE Trans. Signal Process 2004, 52(7):1830-1847. 10.1109/TSP.2004.828896
[13] Melia T, Rickard S: Underdetermined blind source separation in echoic environments using DESPRIT. EURASIP J. Adv. Signal Process 2007, 2007(86484):1-19.
[14] Linh-Trung N, Belouchrani A, Abed-Meraim K, Boashash B: Separating more sources than sensors using time-frequency distributions. EURASIP J. Appl. Signal Process 2005, 2005(17):2828-2847. 10.1155/ASP.2005.2828
[15] Aïssa-El-Bey A, Linh-Trung N, Abed-Meraim K, Belouchrani A, Grenier Y: Underdetermined blind separation of nondisjoint sources in the time-frequency domain. IEEE Trans. Signal Process 2007, 55(3):897-907.
[16] Abrard F, Deville Y: A time-frequency blind signal separation method applicable to underdetermined mixtures of dependent sources. Signal Process 2005, 85(7):1389-1403. 10.1016/j.sigpro.2005.02.010
[17] Arberet S, Gribonval R, Bimbot F: A robust method to count and locate audio sources in a multichannel underdetermined mixture. IEEE Trans. Signal Process 2010, 58(1):121-133.
[18] Bofill P, Zibulevsky M: Underdetermined blind source separation using sparse representations. Signal Process 2001, 81(11):2353-2362. 10.1016/S0165-1684(01)00120-7
[19] Araki S, Sawada H, Mukai R, Makino S: Underdetermined blind sparse source separation for arbitrarily arranged multiple sensors. Signal Process 2007, 87(8):1833-1847. 10.1016/j.sigpro.2007.02.003
[20] Araki S, Nakatani T, Sawada H, Makino S: Stereo source separation and source counting with MAP estimation with dirichlet prior considering spatial aliasing problem. Independent Component Analysis and Signal Separation (ICA), ser. LNCS, vol. 5441 (Springer, Paraty, 2009), pp. 742–750
[21] O’Grady P, Pearlmutter B, Rickard S: Survey of sparse and non-sparse methods in source separation. Int. J. Imag. Syst. Technol 2005, 15(1):18-33. 10.1002/ima.20035
[22] Pastor D, Socheleau F-X: Robust estimation of noise standard deviation in presence of signals with unknown distributions and occurrences. IEEE Trans. Signal Process 2012, 60(4):1545-1555.
[23] Pastor D: Signal norm testing in additive and independant standard Gaussian noise. Institut Mines-Télécom; Télécom Bretagne, UEB, Lab-STICC UMR CNRS 3192, Tech. Rep., 2011, available at http://www.telecom-bretagne.eu/publications/publication.php?idpublication=10706
[24] Aziz-Sbaï SM, Aïssa-El-Bey A, Pastor D: Robust underdetermined blind audio source separation of sparse signals in the time-frequency domain. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (Prague, Czech Republic, May 2011), pp. 3716–3719
[25] Pastor D, Gay R, Gronenboom A: A sharp upper bound for the probability of error of likelihood ratio test for detecting signals in white gaussian noise. IEEE Trans. Inf. Theory 2002, 48(1):228-238. 10.1109/18.971751
[26] Jourjine A, Rickard S, Yilmaz O: Blind separation of disjoint orthogonal signals: demixing N sources from 2 mixtures. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), vol. 5 (Istanbul, Turkey, June 2000), pp. 2985–2988
[27] Deville Y, Puigt M: Temporal and time-frequency correlation-based blind source separation methods. part I: determined and underdetermined linear instantaneous mixtures. Signal Process 2007, 87(3):374-407. 10.1016/j.sigpro.2006.05.012
[28] Theis F, Lang E: Formalization of the two-step approach to overcomplete BSS. Signal and Image Processing (SIP) (Kauai, USA, August 2002), pp. 207–212
[29] Atto AM, Pastor D, Mercier G: Detection thresholds for non-parametric estimation. Signal Image Video Process 2008, 2(3):207-223. 10.1007/s11760-008-0051-x
[30] Berman SM: Sojourns and Extremes of Stochastic Processes. (Wadsworth, Reading, MA, 1992)
[31] Mallat S: A Wavelet Tour of Signal Processing. (Academic Press, Cambridge, 1999)
[32] Serfling RJ: Approximation Theorems of Mathematical Statistics. (Wiley, New York, 1980)
[33] Donoho DL, Johnstone IM: Ideal spatial adaptation by wavelet shrinkage. Biometrika 1994, 81(3):425-455. 10.1093/biomet/81.3.425
[34] Hampel F: The influence curve and its role in robust estimation. J. Am. Stat. Assoc 1974, 69(346):383-393. 10.1080/01621459.1974.10482962
[35] Huber P, Ronchetti E: Robust Statistics. (Wiley, New York, 2009)
[36] Leonard R: A database for speaker-independent digit recognition. IEEE International Conference on Acoustics, Speech, Signal Processing (ICASSP), vol. 9 (San Diego, California, USA, March 1984), pp. 328–331
[37] Candès EJ, Romberg JK, Tao T: Stable signal recovery from incomplete and inaccurate measurements. Commun. Pure Appl. Math 2006, 59(8):1207-1223. 10.1002/cpa.20124
[38] Aïssa-El-Bey A, Abed-Meraim K, Grenier Y: Blind separation of underdetermined convolutive mixtures using their time-frequency representation. IEEE Trans. Audio Speech Lang. Process 2007, 15(5):1540-1550.
[39] Févotte C, Gribonval R, Vincent E: A toolbox for performance measurement in (blind) source separation. Available at http://bass-db.gforge.inria.fr/bss_eval/


Author information

Authors and Affiliations

Institut Télécom; Télécom Bretagne, UMR CNRS 3192 Lab-STICC, Technopôle Brest Iroise CS 83818, 29238, Brest, France
Si Mohamed Aziz-Sbaï, Abdeldjalil Aïssa-El-Bey & Dominique Pastor
Université européenne de Bretagne, Rennes, France
Si Mohamed Aziz-Sbaï, Abdeldjalil Aïssa-El-Bey & Dominique Pastor

Corresponding author

Correspondence to Si Mohamed Aziz-Sbaï.

Additional information

Competing interests

The authors declare that they have no competing interests.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License ( https://creativecommons.org/licenses/by/2.0 ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Aziz-Sbaï, S.M., Aïssa-El-Bey, A. & Pastor, D. Contribution of statistical tests to sparseness-based blind source separation. EURASIP J. Adv. Signal Process. 2012, 169 (2012). https://doi.org/10.1186/1687-6180-2012-169

Received: 14 July 2011
Accepted: 01 July 2012
Published: 16 August 2012
DOI: https://doi.org/10.1186/1687-6180-2012-169
