Removal of muscle artifact from EEG data: comparison between stochastic (ICA and CCA) and deterministic (EMD and wavelet-based) approaches

Electroencephalographic (EEG) recordings are often contaminated with muscle artifacts. This disturbing myogenic activity not only strongly affects the visual analysis of EEG, but also most surely impairs the results of EEG signal processing tools such as source localization. This article focuses on the particular context of the contamination epileptic signals (interictal spikes) by muscle artifact, as EEG is a key diagnosis tool for this pathology. In this context, our aim was to compare the ability of two stochastic approaches of blind source separation, namely independent component analysis (ICA) and canonical correlation analysis (CCA), and of two deterministic approaches namely empirical mode decomposition (EMD) and wavelet transform (WT) to remove muscle artifacts from EEG signals. To quantitatively compare the performance of these four algorithms, epileptic spike-like EEG signals were simulated from two different source configurations and artificially contaminated with different levels of real EEG-recorded myogenic activity. The efficiency of CCA, ICA, EMD, and WT to correct the muscular artifact was evaluated both by calculating the normalized mean-squared error between denoised and original signals and by comparing the results of source localization obtained from artifact-free as well as noisy signals, before and after artifact correction. Tests on real data recorded in an epileptic patient are also presented. The results obtained in the context of simulations and real data show that EMD outperformed the three other algorithms for the denoising of data highly contaminated by muscular activity. For less noisy data, and when spikes arose from a single cortical source, the myogenic artifact was best corrected with CCA and ICA. Otherwise when spikes originated from two distinct sources, either EMD or ICA offered the most reliable denoising result for highly noisy data, while WT offered the better denoising result for less noisy data. These results suggest that the performance of muscle artifact correction methods strongly depend on the level of data contamination, and of the source configuration underlying EEG signals. Eventually, some insights into the numerical complexity of these four algorithms are given.


Introduction
Electroencephalographic (EEG) recordings are mandatory for the diagnosis of epilepsy. As part of the presurgical evaluation of drug-resistant epilepsy, long-term Video-EEG recordings are performed. On these traces, transient events called interictal spikes occur between seizures, and convey essential information both to guide further explorations such as intracerebral implantation and to assist surgery. However, epochs of EEG signals containing spikes have to be free of artifacts when both qualitative and quantitative analyses such as source localization are planned.
EEG is unfortunately often contaminated by various physiological activities of non-interest. Among them, muscular or myogenic activity arising from the contraction of head muscles can strongly obscure EEG signals. As recently reviewed in [1], the perturbation induced by this type of artifact is particularly difficult to correct because myogenic activity is of high amplitude (possibly several times larger than the EEG signal), wide spectral distribution, and variable topographical distribution.
Therefore, denoising of EEG is a challenging preprocessing step prior to qualitative or quantitative EEG analysis. Minimizing the disturbances due to muscular activity in EEG signals can be considered as a blind source separation (BSS) problem, which consists in estimating the original sources underlying the multichannel EEG signals, without a priori (or very little) knowledge about the sources themselves and about the mixing process. BSS techniques have been applied in various domains including the denoising of EEG (see [2,3] for recent reviews). Among the existing BSS algorithms, four main ones have emerged to specifically tackle the difficult problem of EEG data denoising. Independent component analysis (ICA) was successfully applied to EEG denoising for muscular activity [4][5][6][7][8][9][10][11][12][13]. More recently, another BSS approach called Canonical Correlation Analysis (CCA) was also proposed to remove muscle artifacts from EEG and improve the interpretation of ictal epochs [14][15][16]. Regarding the wavelets transform (WT), it has been used in the EEG context for the detection of epileptiform patterns [17], for the elimination of different types of noises [18,19] and for removal of some electrophysiological artifacts, such as ocular movement [20,21], and heart signal [20,22]. Finally, a fourth method called Empirical Mode Decomposition (EMD) has newly appeared as a promising tool in the particular field of EEG data denoising [23].
In this article, we compare the ability of ICA, CCA, EMD, and WT to remove muscle artifacts from multichannel EEG data, and to assess their impact on source localization results we chose the Contrast Maximization 2 (CoM2) algorithm [12] for ICA, the Turning Tangent empirical mode decomposition (2T-EMD) algorithm for EMD [20], and the Discrete Wavelet Transform (DWT) for WT approach. More precisely, the four approaches are first compared on realistic simulated interictal spikelike activity in order to provide a quantitative comparison of their performance. For each method, the numerical complexity is also calculated to evaluate its computational cost. Eventually, each of these approaches is tested on epochs of real EEG signals containing epileptic spikes with the aim of evaluating their effectiveness in a clinical context.

Problem formulation
The EEG electrical activities recorded at the level of surface electrodes can be considered as an instantaneous linear mixture of elementary sources [24] such that A is the (N ×(P e + P f + P b )) global mixing matrix representing the concatenation of A e , A f , and A b . Similarly, s[m] is the global source vector. Finally, v[m] is the Ndimensional vector of instrument Gaussian white noise. This instantaneous linear model comes from the mathematical formulation of the EEG/MEG forward problem that uses the quasi-static formulation of Maxwell's equations. Under the quasi-static assumption, the time-derivatives of the associated electric fields are considered as sufficiently small to be ignored in Maxwell's equations. This means that, for a given position, orientation and spatial extent of the neuronal sources, the resulting electrical activity at the level of surface electrodes is time-independent. Using the Poisson's equations, the electrical potential can then be computed for each scalp electrode. Under the quasi-static assumption, this potential linearly depends on the current amplitude generated at the level of neuron assemblies or muscles [24].
This article deals with a BSS problem that consists in estimating the source vector s[m] from the vector of observations x[m] (1), with minimal a priori knowledge on the sources, the EEG signals, and the noise.
ICA and CCA, as stochastic methods, make the assumption that x m ½ fg , s m ½ fg ,and ν m ½ fg are realizations of three random vector processes referred to as x 0 m ½ fg , s 0 m ½ fg ,a n d ν 0 m ½ fg , respectively. Consequently, for both ICA and CCA, the following hypotheses are made: H1: Spatial statistical independence of sources (i.e., mutual independence of components of s 0 m ½ fg ; H2: Full column rank of the mixing matrix A; H3: Statistical independence between sources and noise (i.e., independence between components of s 0 m ½ and components of ν 0 m ½ at each instant m).
In addition, CCA also assumes that H4: Sources are temporally coloured (i.e., statistical dependence between the pth component of s 0 m ½ and the pth of s 0 m þ 1 ½ at each instant m.
On the other hand, in the case of EMD it can be assumed that H5: Each source is the sum of AM-FM modulations. These modulations are different from one to another source.
Finally, in the case of WT, we assume that H6: Each source can be decomposed on a wavelet basis. Admitting that the informative part of a signal is concentrated in few wavelet coefficients having high absolute value, while the noise part is distributed in wavelet coefficients having low absolute value.
In addition to the above-mentioned hypotheses, ICA and CCA also differ from EMD and WT by the way they process the signals. ICA and CCA jointly exploit all electrodes, i.e., they take advantage of all components of x [m] in order to estimate the source vector s[m]. Contrarily, EMD and WT as implemented in this article process each electrode separately.

Description of denoising methods ICA
The concept of ICA was introduced by Herault and Jutten [25], especially in order to solve the BSS problem. In the mid 1990s, Comon [26] presented a mathematical formulation of ICA. During the past 25 years, a wealth of algorithms have been proposed and ICA-based methods are now extensively and successfully applied to solve many practical real-life problems such as biomedical signal analysis and processing, wireless communications, data mining, speech enhancement, image recognition, etc. (see [3] for details). More precisely, assuming the instantaneous linear observation model of Equation (2) and the hypotheses H1 to H3, the goal of ICA is to find a( N × P), full rank, separator matrix, W, such that the output signal is an estimate of the source vector s[m] up to a multiplicative trivial matrix ΛΠ (where Λ is a diagonal invertible matrix and Π is a permutation matrix). Our previous work showed that, compared to 12 other ICA methods, the CoM2 algorithm [3,27] offers the best performance/complexity compromise for the denoising of Mu-like simulated EEG activity in the context of brain computer interface. This justifies the use of this method in this article. CoM2 starts with a prewhitening of the observations x[m] and relies on a maximization of a contrast function derived from Fourth-Order (FO) cumulants: where z 0 m ½ fg represents the whitened random vector process associated with x 0 m ½ fg and C p;p;p;p;z 0 is the FO cumulant of the pth random process z 0 p m ½ no . In practice, CoM2 method maximizes the contrast function (3) iteratively by applying a planar Givens rotation to every whitened signal pair until convergence as in Jacobi sweeping algorithm for matrix diagonalization. More precisely, in our case where the signals are real, the optimal rotation angle (optimizing (3)) can be found by rooting a polynomial of degree 4.

CCA
Originally proposed by Hotelling [28], CCA is a method that measures the linear relationship between two multi-dimensional random variables. Friman et al. [29] showed that CCA can be used to solve the BSS problem by taking the source vector as the first multi-dimensional random variable and the temporally delayed version of the source vector as the second multi-dimensional random variable. Thus, assuming the instantaneous linear observation model of Equation (2), the spatial statistical decorrelation of sources and the hypotheses H2 to H4,C C Aa i m sa t extracting a P-dimensional vector of sources s[m] by both minimizing the spatial correlation between sources and maximizing their temporal correlation.
To do so, let us consider the two vectors where y[m] is an estimate of the source vector s[m].T h e problem is to find a (N × P), full rank, separator matrix W that maximizes the following function: The maximum of ρ with respect to W is called the maximum canonical correlation, R x 0 x 0 and R z 0 z 0 are the autocovariance matrices of x 0 and z 0 , respectively, and R x 0 z 0 the cross-covariance matrix of x 0 and z 0 . After some manipulations, the demixing matrix W can easily be calculated by solving the eigen-matrix equation where the eigen-vectors w x 0 are the columns of W.

EMD
EMD was originally introduced in the late 1990s to study water surface wave evolution [30]. Since then, it has been used in various fields such as biomedical signal analysis [31], Hurst exponent estimation [32], speech processing [33], or texture analysis [34]. EMD aims at decomposing sequentially a given signal x n m ½ fg into a sum of Amplitude and Frequency Modulated (AM/FM) zero mean oscillatory signal, s nk m ½ fg , referred to as Intrinsic Mode Functions (IMFs), plus a non-zero meanlow-degree polynomial remainder. More precisely, EMD sequentially computes K IMFs s nk m ½ fg and the corresponding trends r n m ½ fg , such that In practice, each IMF, s nk m ½ fg , is calculated using an iterative procedure, called sifting process. At each instant m, the jth sifting iteration consists: (i) in computing the mean envelope, M s nk;j m ½ ÀÁ , of the residual signal, and (ii) in extracting the detail s nk;jþ1 m ½ ¼ s nk;j m ½ À M s nk;j m ½ ÀÁ from the residue. This process is repeated until the stopping criteria is reached (Cauchy-like criterion used in [35]). Several EMD methods have been proposed in the literature [30,32,34,35], depending on the way the mean envelope M s nk;j m ½ ÀÁ ;8m 2 N is calculated. In the following, we will use the 2T-EMD method recently proposed by our team [35]. Briefly, the signal mean trend M : ðÞ is redefined as the signal which interpolates the barycenters of particular oscillations, called elementary oscillations (see [35] for more details). We also show in [35] that, in practice, a robust computation of the mean trend is obtained by averaging two envelopes: a first envelope interpolates the even indexed barycenters, and a second envelope interpolates the odd indexed barycenters. Compared to other approaches the 2T-EMD algorithm is simpler, has a lighter computational cost, and enables both mono and multivariate decompositions without any change in the core of the algorithm.

WT
The concept of WT was formalized in early 1980s and, since then, it has extensively been applied in a large variety of fields, such as biomedical signal processing, image compression, astronomy (see, e.g., [36][37][38][39][40][41][42]). The WT is defined as the inner product or cross correlation of the signal x n m ½ fg with the scaled and time shifted wavelet Ψ a;b m ½ : where Ψ a;b m ½ ¼a jj À 1 2 Ψ m À b ðÞ =a ½ (a and b are the scale and translation parameters, respectively). The WT provides a decomposition of the signal x n m ½ fg in different scales, where the obtained wavelet coefficients represent a measure of similarity between the signal x n m ½ fg and an appropriate wavelet function Ψ a;b m ½ .A m o n gt h ee x i s t i n g WT approaches, we used a most common form of the DWT which employs the dyadic grid (a ¼ 2 j and b ¼ k2 j where j and k are integers) because of its: (i) good temporal localization properties, (ii) fast calculation, and (iii) zero redundancy. In practice, the DWT can be computed by successively passing the signal x n m ½ fg through: (i) a high-pass filter producing the detail coefficients, and (ii) a low-pass filter giving the approximation coefficients [43]. More precisely, the DWT decomposition is recursively (D levels) applied on the output of the low-pass filter and generate D details and one approximation.

Selection of components of interest
For both CoM2 and CCA, the components that represent the sources of interest were selected by calculating the autocorrelation of each component (for a time lag τ = 1) and then by classifying them in a descending order according to their respective autocorrelation values. Consequently, as the autocorrelation of muscle artifacts is relatively low with respect to that of epileptic spikes, the components representing muscle artifacts are expected to be among the last components. In turn, CoM2 or CCA components of interest will be classified among the first components which facilitate their visual selection. Then, the signal is reconstructed by keeping only the components accounting for the sources of interest (epileptic spikes).
Regarding the 2T-EMD method, the visual selection of the relevant IMFs is quasi-impossible and an automatic selection of the IMFs of interest is needed. In the past, some interesting EMD procedures have been proposed to automatically select and classify the IMFs of interest. They can be divided into two categories: (i) methods based on low-pass filtering [44][45][46] and (ii) methods based on thresholding filtering [47,48]. EMD based on low-pass filtering considers that the IMFs are decomposed into two sets: the first set of IMFs represents only the signal of interest and the second set stands for the noise and artifacts. Nevertheless, in some practical cases noise or artifacts can be distributed over all IMFs. Thus, the low-pass filtering methods remove the highfrequency IMFs corresponding to the component of interest and keep the low-frequency IMFs related to noise and/or artifacts. The second family of procedures is inspired from wavelet thresholding methods and evaluates the noise level for each IMF using a suitable threshold as in wavelet analysis. Due to the properties of the muscle activity (wide spectral distribution, and variable topographical distribution), we chose to use a thresholding filter method, as proposed in [47]. Concisely, authors propose in [47] to modify the universal wavelet threshold proposed by Donoho [49] in order to fit the specificities of each IMF. The authors show also that the denoising performances are improved by averaging several denoised versions of the signal (see [50] for details). Note also that this method is advantageous in the sense it does not require any reference signal.
Signal denoising using DWT requires three successive steps. First, we decompose the original signal by choosing the number of levels D. Second, we threshold the obtained D details. Finally, we reconstruct the denoised signal using the D altered detail coefficients and approximation coefficients of level D. Thus, in DWT schemes, the mother wavelet, shrinkage rule, and noise level rescaling approach must be designated. Indeed, there are some general rules about the choice of these parameters: (i) the mother wavelet is selected based on its similarity to the desired signal, (ii) rescaling approach and shrinkage rule are chosen according to the nature of the noise (white or colored) and the variance of the noise. In our case, it is thus reasonable to use the Daubechies family that is interictal spikeslike. Regarding the thresholding method, among the different algorithms, the Stein's Unbiased Risk Estimate (SURE) [51] shrinkage rule and a soft thresholding strategy gave the best results (on simulated signals). This choice can be justified because the signal-to-noise ratio (SNR) is weak and an algorithm which offers a low thresholding as SURE seems to be appropriate for our application. Indeed, contrary to the universal thresholding, in the SURE algorithm the threshold depends on the signal and not only the estimated noise.
It should be noted that, the number of extracted IMFs (K = 10) for 2T-EMD, the levels of decomposition (D =6) for DWT and the mother wavelet (Daubechie order 4) are fixed using the simulated data in order to obtain the desired frequency resolution. These values are also used in the case of real data. We also point out that for DWT the reconstruction of the denoised signal has been achieved without the coefficients of the first level.

Computational complexity
In order to evaluate and compare CoM2, CCA, 2T-EMD, and DWT algorithms from a computational complexity point of view, we have calculated the number of floating point operations (flops). A flop corresponds to a multiplication followed by an addition. Although in the usual practice, only multiplications are counted, the order of magnitude of the computational complexity is unchanged, since in most cases, there are roughly as many multiplications as additions. Let N, P, M, K, J, D, and O be the number of observations, the number of extracted sources, the data length, the number of extracted IMFs, the number of sifting iterations, the number of decomposition levels, and the mother wavelet order, respectively. For the CoM2 algorithm, let f 4 P ðÞ ¼ PPþ 1 ðÞ P þ 2 ðÞ P þ 3 ðÞ =24 be the number of free entries in an FO cumulant tensor of dimension P enjoying all symmetries, I the number of sweeps executed by a jointdiagonalization process, Q the complexity required to compute the roots of a real fourth degree polynomial by Ferrari's techniques (we take Q % 30 flops), and B ¼ min fg the number of flops required to perform spatial whitening. For the 2T-EMD algorithm, we considered s nk;j m ½ ÈÉ as the kth IMF computed at the jth iteration of the sifting process in the case where 2T-EMD was applied on the nth observation x n m ½ fg . In addition, for 2T-EMD, we also considered f 2T nk; j ðÞ as the number of barycentres detected in s nk;j m ½ ÈÉ . Finally, we note that for DWT (dyadic decomposition), the number of input samples decreases by 50% at successive stages of decomposition. For all these parameters, the computational complexity is given in Table 1. Table 1 Computational complexity of CoM2, CCA, 2T-EMD and DWT algorithms

Algorithms
Numerical complexity (Flops) As an example, the numerical complexity of CoM2, CCA, 2T-EMD, and DWT was calculated on the real data presented in Section 2.6. DWT required the smallest amount of calculation followed by CCA. CoM2 needed larger amount of calculation, while 2T-EMD had the highest computation cost among the four algorithms.

Generation of simulated data
To quantitatively evaluate the performance of the four above-mentioned BSS approaches, we simulated 32channels EEG data, with a spatiotemporal model developed by our team [52][53][54]. In this model, EEG sources were represented as a dipole layer distributed over the cortical surface. The geometrical description of the cortical surface was achieved by using a mesh made of 19,626 triangles (mean surface 4.8 mm²) obtained from the segmentation of the gray-white matter interface from a patient 3D T1-weighted MRI. Each triangle of the mesh was associated to an elementary current dipole. The dipole was placed at the barycenter of the triangle and oriented perpendicular to its surface. The moment of each dipole was weighted by a coefficient proportional to the area of the corresponding triangle. In addition, each dipole was assumed to correspond to a distinct cortical neuronal population. Its time course, which represents the time-varying dynamics of the associated population, was provided by the output of a neuronal population model [55], in which parameters can be adjusted to generate either background-like activity or interictal-spikes. In this model, the source (or patch) of these epileptic activities was manually delineated on the mesh as a set of contiguous triangles. Dipoles associated to triangles within the patch were assigned with highly correlated interictal spike activities (i.e., transient interictal spikes) using an appropriate setting of coupling parameters between populations. All other dipoles of the cortical mesh were assigned with null activity. From this setup, we built a spatio-temporal source matrix S containing the time-varying activities of all cortical dipoles of the source space. The pth line of this matrix contains the time-course of the pth dipole.
Simulated EEG were generated using a realistic head model representing the brain, the skull and the scalp with a conductivity of each medium fixed to 0.33, 0.0082, and 0.33 S/m, respectively [56]. The intermedium surfaces were extracted from the segmentation of the same T1-weighted 3D-MRI as for the source space and meshed by 2,440 triangles each (ASA, ANT, Enschede, Netherlands). From this head model, the forward problem was then numerically calculated for each triangle using a boundary element method (ASA, ANT, Enschede, Netherlands) to obtain the leadfield matrix A. This matrix gives the contribution of each dipole of the mesh at the level of 32 scalp electrode positions (19-20  standard 10-20 electrodes plus additional electrodes at  FC1, FC2, FC5, FC6, CP1, CP2, CP5, FT9, FT10, P9, and  P10). The spatio-temporal matrix of simulated EEG data is given by X = AS.
More particularly, for this study two source configurations were considered. In the first configuration, we considered a single patch, made of 100 contiguous triangles (approximately 5 cm²) located in the left superior temporal gyrus. In the second configurations, two patches of 100 triangles each were located in the left superior temporal and left inferior parietal regions, respectively. Activities of dipoles within the patches were highly correlated whereas activities between patches were set uncorrelated. For each source configuration, 50 realizations of spike simulations were generated. These signals corresponded to "clean data". In order to generate noisy EEG simulations, 50 epochs of EEG muscle activity were extracted from real 32-channel EEG data. Each trial of EEG muscle activity was then normalized with respect to the channel showing the maximal power. Then, different levels of amplitude of muscle activity were added to the simulated spike activity in order to get noisy simulated signals with signal-SNR values of -30, -25, -20, -15, -10, and -5 dB. In these simulated data, the number of samples was set to 8,192 which corresponds to 32 s.

Performance criteria Normalized mean-squared error
The performance of CoM2, CCA, 2T-EMD, and DWT was first evaluated by computing the following normalized mean-squared error (NMSE): where x n m ½ ÈÉ is the original EEG observation on the nth electrode (EEG without muscle activity),x ℓ ðÞ n m ½ no is the reconstructed surface EEG after denoising from the ℓth run, L is the number of Monte Carlo runs and M is the data length.
Effect on source localization The performances of CoM2, CCA, 2T-EMD, and DWT were also evaluated by examining their impact on source localization results. For this purpose, source localization was performed on original simulated signals (clean data, considered as our reference), on noisy simulated signals at different SNRs, as well as on CoM2, CCA, 2T-EMD, and DWT. We used the recently published 4-ExSo-MUSIC algorithm [57] specifically dedicated to the localization of distributed sources. We calculated for all 50 trials the Receiver Operating Characteristic (ROC) curves as a performance criterion for source localization results (see Section 2.3 in Birot et al. [57] for details). This criterion represents the mathematical expectation of the true positive fraction (TPF) as a function of the mathematical expectation of the false positive fraction (FPF). The TPF is the fraction between the area of the patch truly retrieved and the total patch area while the FPF is the fraction between the area falsely localized outside the patch and the total cortical area minus the patch area.

Application to real data
In order to test the feasibility of the four denoising algorithms on real data, CCA, CoM2, 2T-EMD, and DWT were applied to the denoising of interictal spikes in a 40year-old patient (referred as to "Patient P" in the following) suffering from drug-resistant partial epilepsy since the age of 26 years. As part of his presurgical evaluation, Patient P underwent two sessions of video-EEG monitoring, Brain MRI, as well as interictal and ictal SPECT acquisition. During video-EEG monitoring, scalp-EEG data were acquired from 32 electrodes (19-20 standard 10-20 electrodes plus additional electrodes at FC1, FC2, FC5, FC6, CP1, CP2, CP5, FT9, FT10, P9, and P10) at a sampling frequency of 256 Hz with a [0.3-100 Hz] band pass filter. These data were reviewed in order to isolate five different epochs of 2,048 samples (8 s) containing a clean spike, as well as epochs of 2,048 samples including spikes (almost) hidden in muscle activity (at two different levels of noise). The exact same methodology as for simulated data was used to denoise these noisy spikes and reconstruct the denoised EEGs. In addition, the sources of spikes in clean, noisy, and denoised data were estimated using the 4-Exso-MUSIC algorithm.

Results on simulated data
In this section, we report the behavior of CoM2, CCA, 2T-EMD, and DWT algorithms as a function of SNR in noisy simulated data obtained either from a single epileptic patch or from two patches. Examples of simulated, noisy, and denoised data are illustrated in Figure 1 for two different levels of added muscle activity. On original simulated data, spikes-like activity is clearly visible at electrode T3 (i.e., scalp electrode facing the cortical epileptic patch), whereas it is entirely hidden in simulated noisy data with very low SNR level (-25 dB). A visual analysis of denoised data shows that from -25 dB noisy data the spike activity at T3 is well retrieved with CCA and CoM2 although the spike activity slightly diffuses on other channels as compared to original data. On the contrary, 2T-EMD does not retrieve the proper spike amplitude at T3 but does not either increase the diffusion of the spike on remote channels. Using the DWT-based method, the proper spike amplitude at T3 is better retrieved than with 2T-TMD but remains inferior to results obtained with CCA and CoM2. Furthermore, other channels are less denoised with the DWT than with 2T-EMD, CCA, or CoM2. Regarding -15 dB noisy data, in which spikes are not entirely hidden by muscle activity, CCA and CoM2 denoised data are very similar to the original simulated data. In that case, the 2T-EMD and DWT algorithms are not denoising data as well as the two other algorithms.

NMSE
For data simulated from a single epileptic patch (Figure 2), the calculation, for a set of 50 trials, of the mean performance criterion (NMSE) calculated for all electrodes shows that 2T-EMD: (i) performs better than CoM2, CCA for very low SNR (-30 dB), (ii) gives comparable results with CoM2 and CCA in the case of SNR -25, and (iii) is less efficient than CCA and CoM2 for SNR ranging from -20 to -5 dB (see Figure 2A). The DWT-based method is less efficient than CCA, CoM2, and 2T-EMD across all SNRs. In addition, the performance of CCA, CoM2, and the DWT algorithms increases as SNR values increase. On the contrary the performance of the 2T-EMD algorithm remains stable across all SNRs, with a low variability within trials compared to the three other methods. As illustrated in Figure 2B, when the performance criterion is specifically calculated for electrode T3 (electrode facing the cortical epileptic patch), CCA and CoM2 clearly outperform 2T-EMD and the DWT for all SNRs. 2T-EMD outperforms the DWT method for SNR of -30 and -25 dB, while the DWT gives comparable results with 2T-EMD in the case of SNR -20 dB. Finally, the DWT-based method outperforms 2T-EMD for SNR ranging from -15 to -5 dB. It is noteworthy that for the CCA and CoM2 methods, the performance remains stable for all considered SNRs while the 2T-EMD and DWT methods become more efficient as the SNR in the simulated signals increases.
The results obtained on signals simulated from two epileptic patches are illustrated in Figure 3. In that case, the mean performance of the four algorithms calculated for 50 trials over all electrodes (Figure 3a) follows the same trend as for data obtained with a single patch (Figure 2a). Nevertheless, for SNR ranging from -20 to -5 dB, the performance of the CCA method is highly variable within trials compared to the other methods. Figure 3b shows that at electrode T3 (facing the location of the first epileptic patch), CCA and CoM2 have similar performance and slightly outperform 2T-EMD for SNRs of -30 and -25 dB, while the DWT shows the worse performance. The performance of CCA, CoM2, and 2T-EMD is quasi-identical for SNR -20 dB, and better than that of the DWT-based method. All methods show almost equivalent performance for SNR of -15 dB. Finally, CoM2 and the DWT algorithms give better results than the two other algorithms for SNRs of -10 and -5 dB. Note also that the performance of the DWT increases as SNR values increase. At CP5 (facing the second epileptic patch), the performance of CCA, CoM2, and 2T-EMD are comparable to that obtained at electrode T3. We also remark that, for SNRs ranging from -20 to -5 dB the DWT algorithm slightly outperform the other methods (Figure 3c). Figures 2a and 3a indicate that the mean performance criterion calculated for the four methods over all electrodes is of the same order of magnitude for data obtained with a single patch or with two patches. On the contrary, Figures 2b, 3b, and 3c show that when data were simulated from two epileptic patches the performance of CCA and CoM2 at electrodes facing the patches (T3 or CP5) is worse than that obtained for data simulated from a single patch.

Source localization
As another performance criterion, source localization was applied on original and noisy simulated data before and after denoising by the CCA, CoM2, 2T-EMD, or the DWT algorithms. An example is given in Figure 4 for two levels of SNR. Figure 5 reports the ROC curves obtained on the set of 50 trials, both for single-patch and for double-patch simulated data. As compared to original data, sources of spikes localized from noisy data are misleading. This is observed both for spikes arising from a single or from two distinct epileptic patches, and for both SNRs (-25 and -15 dB). In the single patch configuration, the source of spikes is still mislocated after denoising of low SNR (-25 dB) data with CCA, CoM2, and DWT method. However, for 2T-EMD denoised data, the source of spikes is localized closer to the actual patch location (inferior temporal region).
ROC curves show that this trend is reversed for higher SNR values (Figure 5a). In particular, as also illustrated in Figure 4, localization results obtained after denoising the -15 dB data with CCA, CoM2, and DWT methods   Figure 3 Performance of the denoising methods in the case of data generated from two distinct sources. The mean performance (NMSE) over the 50 simulation trials of the three denoising methods CCA, CoM2, 2T-EMD, and DWT is plotted for different SNR values. This criteria is calculated either for all electrodes (top) or for the two electrodes (respectively T3 and CP5) facing the two cortical patches.
are very close to those obtained for original clean data, while the epileptic patch is still mislocalized from the 2T-EMD-denoised data. Similarly, in the double-patch configuration, and for low SNR data (-25 dB), spikes arising from two distinct patches are also better localized after 2T-EMD than after CCA, CoM2, and DWT (see Figure 4). In particular, one of the two epileptic sources (inferior parietal) is retrieved from the 2T-EMD-denoised data while for the same SNR, source localization is misleading for CCA, CoM2, and DWT (source localized in-between both patches or in a remote region). As shown in Figure 5b, for SNR ranging from -30 and -25 dB, source localization from CoM2 and CCA and 2T-EMD-denoised data remain comparable, while localization results of DWT-based method are the worse. For an SNR of -20 dB, source localization results are similar, whatever the approach used for denoising. For SNRs of -15 to -5 dB the best results are obtained from data denoised with the DWT or with CoM2. It is noteworthy than for the two-patches source configuration, source localization results get better as SNR in noisy data increases. Nevertheless, for all SNR values, source localization of signals denoised by any of the considered approaches did not retrieve properly the second patch (superior temporal).

Results on real data
In the absence of any perturbation by muscle activity, spikes of Patient P were localized in the lateral anterior temporal region (Figure 6a). This localization was consistent across the five different epochs of clean spikes. This localization was also consistent with the topographical distribution of interictal spikes recorded throughout the whole length of video-EEG monitoring in which frequent interictal spikes, sometimes arising in a continuous manner, and culminating at electrode FT10, could be recorded. In Patient P, the visual interpretation of T1weighted MRI data revealed a global atrophy of the right hemisphere, which was more particularly pronounced in the right anterior temporal region. Finally, these  Figure 4 Estimation of sources of denoised simulated data using 4-ExSo-MUSIC. An example for data simulated both from a single or from two distinct sources is systematically displayed next to each other for a given set of original, noisy, or denoised data. Brown: real patch; purple: truly estimated part of the patch; Orange: falsely estimated part of the patch. electrophysiological and anatomical results are also corroborated by interictal SPECT data, showing a hypoperfusion in the right temporal lobe, including the temporal pole, the mesial temporal region and the lateral temporal neocortex, mostly in the superior and middle temporal gyri.
For both sets of noisy epochs the source of spike was mislocalized in the right inferior frontal region (Figure 6b) or in the left temporal region (Figure 6c). After denoising of the first epoch of noisy data (Figure 6b), spikes are localized in the same region as the source of clean spikes. The results obtained after CCA, CoM2, and 2T-EMD denoising are very similar with spikes being localized in the right lateral anterior temporal gyrus and pole. The source localized after the DWT-based method is slightly more extended and involves the right lateral temporal lobe. For the second epoch of noisy data, localization results after 2T-EMD and DWT denoising are congruent with results obtained in clean data. Nevertheless, spikes of CCA-denoised data are localized in the right mesial temporal lobe, while spikes of CoM2-denoised data suggested the involvement of both the lateral temporal neocortex and the insula.

Discussion
Muscle artifacts are a major source of contamination of scalp EEG data. As a result a rapid and reliable denoising of these data constitutes an essential issue particularly when these signals are used for diagnosis, which is the case in patients with epilepsy. Moreover, artifact correction is crucial when the EEG will be further analyzed with signal processing tools, such as source localization.
In this article, we compared the ability of two stochastic BSS approaches (CoM2 and CCA) and of two deterministic methods (2T-EMD and DWT) to remove muscle artifacts from EEG signals. Our results showed unequal performances of the four algorithms according to the level of SNR and according to the source configuration (single patch versus two patches). At very low SNR and when a single source was used to simulate epileptic spikes, 2T-EMD offered the best results as compared to the other considered approaches. Indeed, for these low SNR, CoM2 and CCA tend to exaggerate the contribution of the source to the scalp EEG activity at the level remote channels. This most likely explains why for these two algorithms, the performance averaged over all channels is low for very low SNR while the performance calculated at T3 is high for all SNR considered. On the contrary, 2T-EMD did not overspread the source activity at distant electrodes but rather reduced the contribution of the source at the electrode facing the source. However, for higher SNR values, this trend was reversed. The performance of 2T-EMD slightly improved while CoM2 and CCA retrieved the signal in a much proper way than for very low SNR, with less spreading at the level of distant channels. Expectedly, for higher levels of SNR, source localization gave more reliable results when applied to CCA-or CoM2-denoised data than when applied to 2T-EMD-denoised data. The worse behavior of the DWT for at very low SNR is explained by the fact that for this cases the wavelet coefficients of epileptic spikes have small values, comparable to noise and artifacts, especially for electrodes located far from the epileptic patch, and even the use of an algorithm with low thresholding as SURE do not remove all the noise and slightly corrupt the signal of interest, namely the epileptic spikes.
Similar conclusions could be derived when two distinct patches are used to simulate epileptic spikes. Very low SNR data were more accurately denoised by 2T-EMD than by CCA, CoM2, or DWT. In particular, both spike activities visible at the electrodes facing the two patches (T3 and CP5, respectively) were retrieved after 2T-EMD denoising, and subsequently, two distinct sources could be estimated from these denoised data. Regarding the  FP1  FP2  F3  F4  C3  C4  P3  P4  O1  O2  F7  F8  T3  T4  T5  T6  FZ  CZ  PZ  FC1  FC2  CP1  CP2  FC5  FC6  CP5  CP6  FT9  FT10  P9  P10  POZ   FP1  FP2  F3  F4  C3  C4  P3  P4  O1  O2  F7  F8  T3  T4  T5  T6  FZ  CZ  PZ  FC1  FC2  CP1  CP2  FC5  FC6  CP5  CP6  FT9  FT10  P9  P10   DWT, because this method is used, in a monovariate way (i.e., channel per channel) the difficulties raised in the case of two distinct patches are exactly the same than those retrieved when single source was used to simulate epileptic spikes (single patch). Regarding, CCA or CoM2 (although to a lesser extend) retrieved the two distinct sources with more difficulty and extracted both sources in one component. As a result, spike activities were mixed at the level of the reconstructed scalp EEG. Source localization in that case provided misleading results with the source being estimated between the two patches. For higher SNR, this difficulty persisted for CCA while the performance of CoM2 slightly improved. This result might be explained by the fact that CoM2 exploits more statistical information from the signal (second and FO cumulants) than CCA (second-order cumulants). It is noteworthy that in another study the CCA approach was shown to provide better results, in terms of NMSE, than an ICA-based method [14]. However, in this study [14], the authors have tested both algorithms on a single EEG epoch (versus 50 simulations here) contaminated by different levels of muscular activity and thus have not taken into account the statistical variability of NMSE results. In addition, they considered EEG data recorded with a smaller number of electrodes than in our simulations (21 versus 32 here) and used a different ICA algorithm (JADE versus CoM2 here) to denoise the data, which prevents a reasonable comparison between their results and ours.
The number of electrodes is a crucial aspect that should be considered to explain the efficiency of BSS methods. In particular, the weak performance of CCA or CoM2 in our study with respect to low SNR data (or high SNR data due to the activation of two cortical regions) is most likely due to an insufficient number of electrodes. In these situations, the number of sources is probably higher than the number of electrodes which leads to an underdetermined mixture and consequently to the well-known ill-posed inverse problem. Consequently, when a small number of electrodes is used, the methods extract a linear combination of sources belonging to the same subspace instead of estimating the sources themselves.
Despite the above-mentioned differences observed between the four algorithms, the denoising process has clearly improved the results of sources localization. With this respect, the results are in harmony with those of a recent and important study showing the usefulness of applying ICA/CCA denoising techniques to ictal EEG signals in order to localize the epileptic zones [58]. Compared to this study, our work uses simulations of realistic epileptic EEG signals to quantitatively compare the different denoising algorithms. The major advantage over real EEG signals is that simulations supply a "ground truth" both of "clean" signals and of source location and geometry that can serve as a reference for denoised data or source estimation, respectively. One potential disadvantage is that simulations can lead to biased results in particular when real or simulated muscular activity is added to clean simulated data [13,59]. In our simulations, we have tried to minimize this bias by adding unfiltered real muscle activity (i.e., real muscle and background activity) in order to keep the spatial and temporal correlation between myogenic and neurogenic activity.
This simulation study globally corroborated the results obtained with real data and aided in their interpretation. As for simulations, the source of spikes is mislocalized when unprocessed and noisy data are used. Moreover, in the case of the first set of noisy data (#1), the source localized from denoised data is consistent with the source localized from clean data, in a comparable way whatever the method used for artifact removal. According to simulations, this behavior suggests that noisy data #1 are moderately contaminated by muscle activity. In the second case, the source estimated from 2T-EMDdenoised data is clearly more consistent with that of clean data than when source localization is performed on CCA or CoM2 denoising. This is consistent with the results obtained in the case of very low SNR simulations, and suggests that data set #2 is strongly affected by muscle activity. Interestingly, for these data, the source localization result obtained from DWT is equivalent to the one provided by 2T-EMD, even if the EEG signals denoising seems visually less effective than for the three other algorithms. Note also that, for these data, the source estimated after CoM2-denoising is partly consistent with the source of clean data, but extends beyond it, whereas CCA-denoised spikes give raise to a source inconsistently localized in the mesio-temporal region. In these two cases, it is difficult to rule out the possibility that spikes may actually arise from the right mesial temporal region or may spread to the right insula. Nevertheless, in the absence of concomitant depth recordings, this question cannot easily be answered.

Conclusion
In general, our results obtained both in the context of simulated and real interictal epileptic spikes suggest that 2T-EMD should be preferred for the denoising of low SNR data but that the reconstructed data would most likely lead too (small) localization errors. This agrees with a recent study showing that EMD outperforms ICA in the context of low SNR simulated data [23] but provides optimal results when combined to ICA. For less noisy data, we show that CCA, CoM2, and DWT can lead to quasi-optimal denoising provided that a single cortical source is expected to underlie the EEG spikes.
In addition, when the numerical complexity is taken into account, DWT and CCA would be the best choices. Otherwise, 2T-EMD or CoM2 should be preferred in the case of higher SNR data when more complex source configurations are suspected.
It is worth mentioning that these conclusions hold in the case of interictal EEG spikes. Since our study demonstrates that the performance of muscle artifact correction methods significantly depend on the level of data contamination, and of the source configuration underlying EEG signals, the four algorithms may also perform differently on other type of data. In particular, further work should consider signals at the onset of epileptic seizures, as these signals directly relate to the epileptogenic zone, and are often obscured by muscle activity.