Extending the scope of empirical mode decomposition by smoothing
EURASIP Journal on Advances in Signal Processing volume 2012, Article number: 168 (2012)
Abstract
This article considers extending the scope of the empirical mode decomposition (EMD) method. The extension is aimed at noisy data and irregularly spaced data, which is necessary for widespread applicability of EMD. The proposed algorithm, called statistical EMD (SEMD), uses a smoothing technique instead of an interpolation when constructing upper and lower envelopes. Using SEMD, we discuss how to identify noninformative fluctuations such as noise, outliers, and ultrahigh frequency components from the signal, and to decompose irregularly spaced data into several components without distortions.
Introduction
When analyzing a complex signal, we frequently decompose it into several components having simple forms and then analyze the information contained in each component to reduce the complexity and to enhance interpretability. Conventionally, decomposition is processed using a basis system. The benefits of decomposition are as follows: (1) a signal is well approximated by a finite number of basis functions, (2) information in the time (physical) domain is transformed into information in the frequency domain without losing any information, and (3) the interpretability of the signal can be enhanced by analyzing each component separately and comparing it with the other components.
Spectral analysis [1] and wavelet analysis [2–4] are popular methods for signal decomposition. However, when a signal has inherent nonstationary and nonlinear features according to the scale and time location, these methods might not be suitable. Empirical mode decomposition (EMD), developed by Huang et al. [5], provides a data-driven approach to decompose a signal into so-called intrinsic mode functions (IMFs) according to the local oscillation magnitude in the physical domain. IMFs can be considered as data-driven empirical basis functions. EMD has been widely used for analyzing nonstationary or nonlinear signals in many disciplines of science and engineering [6].
However, due to the interpolation process in the construction of envelopes, IMFs obtained by the conventional EMD algorithm are sensitive to noninformative fluctuations such as noise, outliers, and ultrahigh frequency components, and hence, the noninformative fluctuation effect distorts the subsequent decomposition results. In addition, this method focuses on a narrow scope that does not cover irregularly sampled data. These constraints strongly diminish the applicability of EMD to various signals. To extend the scope of the conventional EMD to noisy signals and irregularly spaced data, we propose a statistical EMD algorithm called SEMD that is based on a smoothing technique. This method is a fully data-adaptive algorithm, as is the conventional EMD. The proposed SEMD has several advantages over the conventional EMD: (1) It is robust to noise or noninformative random fluctuations such as outliers and ultrahigh frequency components, and hence, SEMD can decompose such signals into appropriate IMFs without distortion caused by the above-mentioned factors. (2) It provides a reasonable boundary condition of an IMF without any boundary treatment, and therefore, SEMD can provide stable decomposition results on the entire domain including boundary regions. Furthermore, we extend EMD to analyze irregularly spaced signals by combining SEMD with a simulation technique.
The remainder of this article is organized as follows. Section Review: empirical mode decomposition presents an overview of the conventional EMD. Section Statistical EMD describes the proposed SEMD method, and several case studies are presented to show its broad applicability. In addition, we investigate the variation diminishing property of SEMD. An extension to an irregularly spaced signal is presented in Section Extension of EMD to irregularly spaced signals. Finally, concluding remarks are presented in Section Conclusion.
Before closing this section, we note that, in the literature, there have been several attempts to enhance the performance of the conventional EMD and to extend its scope. For example, to deal with noise, Boudraa and Cexus [7] removed the high-frequency components using a filtering method, and Wu and Huang [8] used the ensemble mean approach of the simulated signal. Both methods are based on conventional sifting followed by a posterior adjustment. For applying the conventional EMD to signals with a lower sampling rate, Xu et al. [9] proposed a hybrid extrema estimation algorithm based on Fourier interpolation. More recently, Diop et al. [10] suggested a PDE-based approach to compute envelopes, which is another way to avoid interpolation in the construction of envelopes.
Review: empirical mode decomposition
Fourier analysis decomposes a signal into a sum of sinusoids having different frequencies. However, it is well known that for nonstationary signals, Fourier analysis does not effectively provide frequency information of the signals. Although wavelet analysis is a popular method for analyzing nonstationary signals, it suffers from a nonadaptive nature in that it applies the same type of basis functions to the entire range of data. Wavelet analysis also represents a signal by a linear combination of wavelet basis functions. Therefore, its formulation for the energy-frequency representation of nonlinear data can be misleading [5]. Thus, we require a set of flexible basis functions that reflects time-varying properties of a signal.
Huang et al. [5] proposed a data-driven algorithm for extracting an oscillatory wave from a given signal x as follows. First, we identify the local extrema and construct two functions called the upper envelope and lower envelope by interpolating the local maxima and local minima, respectively. Second, we take their average; this produces a signal with a frequency lower than that of the original signal because the main pattern of the signal is confined between the two envelopes. Third, by subtracting the envelope mean from x, the highly oscillatory wave h is separated.
Huang et al. [5] defined an oscillatory wave to be an IMF if it satisfies two conditions: (1) the number of extrema and the number of zero-crossings should be equal or differ by one and (2) the local average should be zero, implying that the mean of the upper envelope and the lower envelope is zero. There might exist overshoots and undershoots in h after one iteration of the aforementioned procedure, in which case the two conditions are not satisfied. In such a case, the procedure is repeated for h until the conditions are satisfied. This iterative process is called sifting. We may consider the IMF to be an empirical basis driven by the data-adapted process, sifting. This IMF is the mode function that has the finest resolution. By sifting, the original signal x is decomposed into the highest-frequency imf_{1} and a residual signal r_{1} = x − imf_{1} that is less oscillatory than the original signal x. If r_{1} has signals having different frequencies, then the next IMF is obtained by considering r_{1} as a new signal. The signal is sequentially decomposed into signals having different frequencies from the highest-frequency component imf_{1} to the lowest-frequency component imf_{n} for some finite n and a residual signal r. Finally, we have n IMFs and a residual signal
$$x(t)=\sum_{i=1}^{n}\mathrm{imf}_{i}(t)+r(t).$$
Here, index i denotes the resolution level and imf_{1} is the IMF at the finest level. We finally remark that Fourier analysis assumes that a signal is stationary and consists of components of a pure tone. In practice, the frequency information can evolve over time and several such frequencies can be compounded. The above EMD procedure is useful for identifying the amount of variation due to oscillation at different scales and time locations and for extracting an oscillatory wave from a nonstationary signal.
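As a small illustration of IMF condition (1), the following Python sketch counts extrema and zero-crossings of a sampled signal and compares the two counts. The function names and the toy signals are hypothetical and chosen only for this example; samples that are exactly zero are ignored when counting crossings, which is a simplification.

```python
import math

def count_extrema(x):
    """Count strict interior local maxima and minima of a sampled signal."""
    count = 0
    for i in range(1, len(x) - 1):
        is_max = x[i - 1] < x[i] > x[i + 1]
        is_min = x[i - 1] > x[i] < x[i + 1]
        if is_max or is_min:
            count += 1
    return count

def count_zero_crossings(x):
    """Count sign changes between consecutive samples (exact zeros are ignored)."""
    return sum(1 for a, b in zip(x, x[1:]) if a * b < 0)

def satisfies_count_condition(x):
    """IMF condition (1): the two counts are equal or differ by one."""
    return abs(count_extrema(x) - count_zero_crossings(x)) <= 1

# Toy signals (assumed for illustration): a pure tone, and the same tone
# riding on a constant offset so that it never crosses zero.
t = [k / 200 for k in range(200)]
tone = [math.sin(6 * math.pi * tk + 0.3) for tk in t]
shifted = [v + 2.0 for v in tone]
```

The pure tone meets the count condition, whereas the shifted tone has extrema but no zero-crossings, which is the situation that sifting is meant to remove by subtracting the envelope mean.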
Statistical EMD
One of the main purposes of EMD is to decompose a signal into several components and to identify its significant frequency components. It is not uncommon for a signal to be corrupted by noninformative random fluctuations such as noise, which might consist of high frequencies and contain no interpretable information.
However, the conventional EMD algorithm cannot effectively separate noise from the signal, and hence, this algorithm does not produce stable decomposition results from noisy signals. To overcome this problem, we propose a modified sifting process based on a smoothing technique. The proposed algorithm can be easily implemented by simply replacing the interpolation with smoothing. That is, the upper and lower envelopes can be constructed by a smoothing technique. The proposed algorithm is designed for considering noisy signals that are used in the field of statistics. Thus, we call the proposed algorithm statistical EMD (SEMD). Formally, the SEMD algorithm can be stated as follows:

A. (Modified sifting) Take a signal x to be decomposed, and extract the first mode h_{1,λ} by using a smoothing technique.

(A1) Identify the local maxima (minima) z of the signal $h_{1,\lambda}^{0}$, where $h_{1,\lambda}^{0}$ is the original signal x.

(A2) Construct an upper envelope $\hat{u}_{\lambda}$ (lower envelope $\hat{\ell}_{\lambda}$) by applying a smoothing technique with a smoothing parameter λ to the maxima (minima) z.

(A3) Compute the local mean $m_{\lambda}=\frac{1}{2}(\hat{u}_{\lambda}+\hat{\ell}_{\lambda})$ as the average of both envelopes, and then obtain a candidate intrinsic mode $h_{1,\lambda}^{1}=h_{1,\lambda}^{0}-m_{\lambda}$.

(A4) Repeat steps (A1)–(A3) for the signal $h_{1,\lambda}^{1}$ until the signal $h_{1,\lambda}^{j}$ at the jth iteration satisfies the IMF conditions.

(A5) Decompose the signal x = h_{1,λ} + r_{λ}, where h_{1,λ} is defined as the limit of $h_{1,\lambda}^{j}$ and r_{λ} is the remaining signal.

B. (Conventional sifting) If the remaining signal r_{λ} = x − h_{1,λ} has an intrinsic oscillation mode, then r_{λ} can be further decomposed by conventional sifting.
The only difference between the SEMD algorithm and the conventional EMD is step A, where the first mode is extracted by smoothing instead of interpolation. In particular, the construction of $\hat{u}_{\lambda}$ and $\hat{\ell}_{\lambda}$ by smoothing in step (A2) plays the most important role in determining the quality of the decomposition when the signal is corrupted by noninformative random fluctuations.
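The modified sifting in step A can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than the reference implementation: the envelopes are fitted with a Gaussian-kernel weighted average of Nadaraya-Watson form (the exact estimator is an assumption here), the bandwidth `lam` must be positive, and a fixed number of sifting passes `n_sift` stands in for the IMF stopping test of step (A4). All function names are hypothetical.

```python
import math

def local_extrema(x):
    """Step (A1): indices of strict interior local maxima and minima."""
    maxima, minima = [], []
    for i in range(1, len(x) - 1):
        if x[i - 1] < x[i] > x[i + 1]:
            maxima.append(i)
        elif x[i - 1] > x[i] < x[i + 1]:
            minima.append(i)
    return maxima, minima

def gaussian_smooth(t_obs, z_obs, t_eval, lam):
    """Step (A2): Gaussian-kernel weighted average of (t_obs, z_obs) at t_eval.

    The Nadaraya-Watson form used here is an assumption of this sketch.
    """
    fitted = []
    for t in t_eval:
        log_w = [-0.5 * ((t - s) / lam) ** 2 for s in t_obs]
        shift = max(log_w)  # keeps at least one weight equal to 1 (no underflow to 0/0)
        w = [math.exp(v - shift) for v in log_w]
        fitted.append(sum(wi * zi for wi, zi in zip(w, z_obs)) / sum(w))
    return fitted

def semd_sift_once(t, h, lam):
    """Steps (A1)-(A3): subtract the mean of the two smoothed envelopes."""
    maxima, minima = local_extrema(h)
    if not maxima or not minima:
        return list(h)  # no envelopes can be built; leave the signal unchanged
    upper = gaussian_smooth([t[i] for i in maxima], [h[i] for i in maxima], t, lam)
    lower = gaussian_smooth([t[i] for i in minima], [h[i] for i in minima], t, lam)
    return [hi - 0.5 * (u + l) for hi, u, l in zip(h, upper, lower)]

def semd_first_mode(t, x, lam, n_sift=10):
    """Steps (A4)-(A5), with a fixed sift count replacing the IMF test."""
    h = list(x)
    for _ in range(n_sift):
        h = semd_sift_once(t, h, lam)
    remaining = [xi - hi for xi, hi in zip(x, h)]
    return h, remaining
```

For a signal that is already symmetric about zero, such as a pure tone, the two envelopes cancel and a sifting pass leaves the signal essentially unchanged; for a noisy signal the returned `remaining` plays the role of r_{λ}, which can then be passed to conventional sifting as in step B.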
A key issue that needs to be considered is how to determine the degree of smoothness (i.e., the smoothing parameter λ) in the smoothing process. We propose an automatic selection method of λ utilizing conventional cross-validation. Cross-validation splits observations into K roughly equal-sized parts (for example, K = 4). For the kth part (say, the test dataset), we fit the model to the other K − 1 parts (say, the training dataset) of the observations, and calculate the prediction error of the kth part by the fitted model. We perform this procedure for k = 1,…,K and combine all K estimates of prediction error.
However, by omitting the test dataset, the remaining training dataset for fitting the model becomes unequally spaced data. Since the model fit is based on the decomposition, it is difficult to obtain stable fitting results with such unequally spaced data, and hence, the conventional cross-validation method may not be directly applicable to this case.
Here, we propose an imputation-based cross-validation method for selecting the smoothing parameter. For the kth test dataset, we impute it by an imputation method, apply SEMD with a given smoothing parameter λ to a new composite dataset which consists of the imputed kth dataset and the remaining training dataset, and calculate the prediction error for the kth test dataset. More specifically, by defining an indexing function κ:{1,2,…,n} → {1,2,…,K} that indicates the partition to which observation x(t) is allocated, we obtain K partitioned datasets, T_{k} = {x(t) : k = κ(t), t = 1,2,…,n} (k = 1,…,K). We then define the prediction error as follows:

(i) Split a signal x into K test datasets T_{1},…,T_{k},…,T_{K}.

(ii) Impute the kth test dataset by the local average of two neighboring points and obtain $\tilde{T}_{k}$.

(iii) With a given smoothing parameter λ, apply the SEMD algorithm to decompose the composite signal $T_{1},\dots,T_{k-1},\tilde{T}_{k},T_{k+1},\dots,T_{K}$ into h_{1,λ} and the remaining signal r_{λ}.

(iv) Obtain the predicted values of the remaining signal evaluated at the kth part, say $r_{\lambda}^{k}(t)$.

(v) Repeat steps (ii)–(iv) for k = 1,…,K, and define the prediction error, with k = κ(t) for each t, as
$$\mathrm{PE}(\lambda)=\frac{1}{n}\sum_{t=1}^{n}\{x(t)-r_{\lambda}^{k}(t)\}^{2}.$$
Finally, by using an optimization algorithm such as the golden section search algorithm, we select the value of the smoothing parameter λ that minimizes the prediction error PE(λ). By considering each test dataset as new observations, it can be shown that the expectation of PE(λ) is close to the true prediction error [11]. Thus, the above procedure is widely used for estimating the true prediction error.
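Steps (i)–(v) and the subsequent one-dimensional search can be sketched as follows. The sketch makes several assumptions that are not stated in the text: the split is interleaved, κ(t) = t mod K with zero-based indices and K ≥ 2, so that both neighbors of a held-out point belong to other parts; a boundary point is imputed from its single neighbor; and `remaining_signal` is a hypothetical user-supplied function that maps a composite signal and λ to the remaining signal r_{λ} (for example, one built from a modified sifting routine).

```python
import math

def impute_part(x, k, K):
    """Step (ii): replace points with index t % K == k by the mean of their neighbors."""
    n = len(x)
    composite = list(x)
    for t in range(k, n, K):
        neighbors = [x[j] for j in (t - 1, t + 1) if 0 <= j < n]
        composite[t] = sum(neighbors) / len(neighbors)
    return composite

def prediction_error(x, lam, remaining_signal, K=4):
    """Steps (i)-(v): PE(lambda) accumulated over the K held-out parts."""
    n = len(x)
    total = 0.0
    for k in range(K):
        r = remaining_signal(impute_part(x, k, K), lam)  # steps (iii)-(iv)
        total += sum((x[t] - r[t]) ** 2 for t in range(k, n, K))
    return total / n

def golden_section_min(f, lo, hi, tol=1e-6):
    """Golden section search for a minimizer of f on [lo, hi]."""
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc < fd:  # a minimizer lies in [a, d]
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:        # a minimizer lies in [c, b]
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
    return (a + b) / 2
```

A typical call would be `golden_section_min(lambda lam: prediction_error(x, lam, remaining_signal), lam_lo, lam_hi)` with user-chosen search bounds. Golden section search assumes a single minimum on the interval, so if PE(λ) has several local minima, the bounds may need to be narrowed or a coarse grid scanned first.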
We have some remarks regarding the SEMD algorithm.

The first mode h_{1,λ}: The h_{1,λ} might not contain a meaningful mode when noninformative fluctuations such as noise are present, and hence, it may not be appropriate to define the extracted mode as IMF. However, the extracted mode can be considered as IMF in the case of noiseless signals.

Modified sifting at the first level: The modified sifting can be employed to extract further IMFs beyond the first extracted mode h_{1,λ}. However, in our experience based on extensive simulation studies, SEMD effectively extracts noise from a noisy signal x at the first level. Furthermore, Rilling and Flandrin [12] and Park et al. [13] showed that when there is a large discrepancy between the frequencies of two components of a signal, the ordinary sifting process cannot correctly estimate the relatively low-frequency component, which results in misidentifying the relatively high-frequency component. Since noise acts as a high-frequency component and the modified sifting utilizing smoothing effectively estimates the low-frequency component, SEMD seems to effectively extract noise at the first level.

Smoothing technique: Several smoothing techniques including kernel smoothing, smoothing splines, and local polynomial method have been well developed. In this study, we use kernel smoothing with Gaussian kernel. In practice, any smoothing method can be adopted for SEMD algorithm.

The role of the smoothing parameter λ: The performance of the modified sifting depends on the choice of λ. We now consider two special cases: (1) λ = 0: both envelopes $\hat{u}_{\lambda}$ and $\hat{\ell}_{\lambda}$ are constructed by interpolation, and hence, the extracted results are identical to those of the conventional EMD, and (2) λ = ∞: both envelopes $\hat{u}_{\lambda}$ and $\hat{\ell}_{\lambda}$ are the weighted averages of local extrema, so that the extracted mode becomes an oversmoothed function which might not be suitable to represent any frequency patterns of the original signal. This implies that no meaningful modes can be extracted further. Therefore, to overcome the above problem, we propose the data-driven cross-validation approach to select an optimal λ. Finally, we remark that since PE(λ) is a reasonable estimate of the true prediction error, the resultant λ should be close to 0 when the signal is noise-free. Thus, the resulting fit is almost identical to the interpolation result in the case of a noise-free signal. In summary, SEMD is applicable to both noisy and noise-free signals.

The number of parts K: Throughout this article, we use K = 4, so that the entire signal is divided into four parts. K can be chosen to be any number no greater than n. The case K = n is known as 'leave-one-out' cross-validation, where κ(t) = t, and the predicted value for the tth observation is evaluated using all the data except the tth observation. Thus, leave-one-out cross-validation is computationally intensive.

Sensitivity of imputation method: An imputation method is required for the derivation of PE(λ). In this study, we use the local average of two neighboring points for imputation, which is simple and fast. It can also be replaced by a more advanced imputation technique such as the EM algorithm. However, for all cases in the article, we observe that the selection results for the smoothing parameter are almost identical.

Computation cost: Compared to the conventional EMD, the SEMD algorithm requires a longer computational time, due mainly to the smoothing parameter selection. However, once the smoothing parameter for the first mode is selected, the computation time of the proposed algorithm is even shorter than that of the conventional method when a signal is contaminated by random fluctuations, because the remaining steps are almost identical to those of the conventional EMD and, in this case, the conventional EMD tends to produce extra artificial modes (this observation will be shown in subsequent sections). In addition, the computational burden of the above K-fold cross-validation procedure is not considerable.
Here, we discuss a theoretical property of SEMD, namely, the variation diminishing property of envelopes. It implies that, as the value of the smoothing parameter increases, the variation of the envelopes decreases monotonically. In other words, structures of envelopes such as peaks and valleys disappear monotonically as the level of smoothing increases. Thus, lower and upper envelopes generated with a sufficient level of smoothing should not contain artifacts due to noise. This fact is known as causality in the scale space literature (see, e.g., [14]).
Proposition 1
(variation diminishing property of envelopes) Let $\{z_{i}\}_{i=1}^{m}$ be a sequence of the centered local maxima (or minima) of the original signal $\{x(t)\}_{t=1}^{n}$ (m < n) and ν(λ) be the number of sign changes in the Gaussian (or one-sided exponential) kernel estimate $\hat{u}_{\lambda}$ for $\{z_{i}\}_{i=1}^{m}$. It follows that
$$\nu(\lambda)\le\nu(\lambda^{\prime})$$
for any positive value λ′ ≤ λ.
Proof: Suppose that we observe the data (t_{1},z_{1}),…,(t_{m},z_{m}). Given a smoothing parameter λ, a kernel smoothing estimate is defined as
where the kernel function is $K(x)=(1/\sqrt{2\pi})\exp(-x^{2}/2)$. Then, using the fact that $K(t/\lambda_{1})\ast K(t/\lambda_{2})=K(t/\sqrt{\lambda_{1}^{2}+\lambda_{2}^{2}})$, we obtain
for all λ_{1},λ_{2} > 0. By Theorem of [15], it follows that the number of sign changes in $\hat{u}_{\lambda}(t)$ is a monotonically decreasing function of λ. □
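The statement of Proposition 1 can be checked numerically on a toy example. The Python sketch below is only an illustration under assumptions: it uses an unnormalized Gaussian kernel sum (dividing by the positive sum of weights would not change any sign), counts sign changes on a finite grid rather than on the whole real line, and uses five hypothetical extrema values with alternating signs that are not centered.

```python
import math

def kernel_sum(t_obs, z_obs, t, lam):
    """Unnormalized Gaussian-kernel estimate at t; its sign matches the normalized one."""
    return sum(z * math.exp(-0.5 * ((t - s) / lam) ** 2)
               for s, z in zip(t_obs, z_obs))

def sign_changes(values):
    """Number of sign changes along a sequence, ignoring exact zeros."""
    signs = [v > 0 for v in values if v != 0.0]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)

# Hypothetical toy extrema with alternating signs (assumed for illustration).
t_obs = [0.0, 1.0, 2.0, 3.0, 4.0]
z_obs = [1.0, -1.0, 1.0, -1.0, 1.0]
grid = [-2.0 + 0.01 * k for k in range(801)]  # evaluation points on [-2, 6]
lams = (0.2, 0.5, 1.0, 5.0)
counts = [sign_changes([kernel_sum(t_obs, z_obs, t, lam) for t in grid])
          for lam in lams]
```

With a small bandwidth the estimate follows the alternating signs of the extrema, and as λ grows the oscillations are smoothed away, so the entries of `counts` do not increase along `lams`.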
To conclude this section, we note that the benefits of adopting a smoothing process for sifting can be summarized as a few items and these will be investigated empirically through simulation studies in the subsequent sections.

(a) Extrema from a signal contaminated by noise are sensitive to noise or outliers. Thus, it is necessary to filter out such insignificant terms when constructing the upper and lower envelopes, and hence, the sifting process using filtered envelopes can produce stable IMFs.

(b) As discussed below, the conventional EMD cannot properly handle a signal containing an ultrahigh frequency component because it is difficult to obtain the desired upper and lower envelopes by using interpolation. This case can be handled by employing a smoothing approach.

(c) Inadequate information is available on the modulation of two boundaries before the first extremum and after the last extremum when constructing envelopes. Thus, using smoothing instead of interpolation for extrema can alleviate the boundary problem.
SEMD for noisy signals
To observe the effect of smoothing in the sifting process, we consider the following synthetic signal x(t), 0 < t < 9 with frequencies f_{1}, f_{2}, and f_{3}:
The panels in Figure 1 show signals obtained from model (1) with f_{1} = 6, f_{2} = 2, and f_{3} = 1, and the decomposition results by the conventional EMD and SEMD, respectively. For the noise-free signal, the performance of SEMD for decomposition is comparable to that of the conventional EMD, which properly produces three IMFs and the residue. We also consider monthly average surface air temperatures of the Northern Hemisphere at surface level 700 hPa for the period January 1961 – August 2003 in Figure 2. The data are available from the website http://www.cdc.noaa.gov/. The temperature data appear to be a noise-free signal. The panels in Figure 2 show that the first two IMFs and the remaining signal obtained by EMD and SEMD are almost identical.
On the other hand, EMD distorts the decomposition when noise is present. The left-hand side panels in Figure 3 show a Gaussian noisy signal with signal-to-noise ratio (SNR) 5, the first IMF, and the remaining signal using interpolation during the sifting process. We note that throughout all the experiments and the simulation study of this article, n = 1024 data points are regularly sampled in the time domain, and SNR is defined as SNR = ∥m∥/σ, where m is the true function and σ is the standard deviation of the noise. It is apparent that the remaining signal exhibits a noisy pattern. In contrast, the application of a smoothing technique ensures that noise is effectively separated from the signal, as shown in the right-hand side panels of Figure 3. This fact can be evaluated by examining the periodogram. As shown in Figure 4, the periodogram of the first component by SEMD is almost flat, which is close to that of pure noise, while that of EMD shows a pattern which is not flat and therefore differs from that of pure noise. In this case, the optimal smoothing parameter of SEMD is selected as λ = 0.0523 by minimizing the prediction error of cross-validation. Figure 5 shows the prediction errors according to the smoothing parameter, which illustrates the sensitivity to the choice of smoothing parameter. The first component from SEMD or EMD affects the subsequent decomposition results significantly. Figure 6 shows the decomposition results for a noisy signal. The conventional EMD does not produce proper IMFs, whereas SEMD extracts the noise and decomposes the signal into three IMFs effectively.
SEMD for signals with outliers
Here, we consider a case in which the signal has some insignificant random fluctuations represented by outliers. Unfortunately, the conventional EMD may not be effective in decomposing signals with outliers because the interpolation is sensitive to outliers. More specifically, interpolation in the conventional sifting is based on local extrema, and therefore, the upper or lower envelope moves toward the extreme values. Thus, the subsequent IMFs and the residual signal are distorted to an extent that depends on the extreme values. The left-hand side panels in Figure 7 show the case with a single outlier. The EMD process cannot cope with even a single outlier and produces incorrect waveforms, whereas the SEMD results shown in the right-hand side panels are robust to the presence of an outlier. In this case, the optimal smoothing parameter λ of SEMD is 0.0663, which is larger than that of the Gaussian noisy signal in Figure 3. In a similar manner, the first component of SEMD effectively treats heavy-tailed noise that can produce outliers, such as noise from the t-distribution with three degrees of freedom shown in Figure 8.
To evaluate the practical performance of SEMD and to see whether the first component effectively captures the noise or outliers from a signal, a simulation study was conducted. In the study, we compared SEMD with the conventional EMD and a wavelet shrinkage method.

1. emd: conventional EMD,

2. semd: proposed SEMD, and

3. sure: SURE wavelet shrinkage method of [16].
We consider the nine test functions displayed in Figure 9: Sines (sines) in model (1), a chirp signal (chirp) of the form m(t) = exp(−0.01t) cos(πt/10) (t ∈ [0,500]), Heavisine (heav) from [17], fg1 (fg1) from [18], Wave (wave), Angles (angle), Parabolas (para), Time Shifted Sine (tsine) from [19], and Corner (corn) from [20].
The test functions are corrupted by Gaussian noise and by t-distribution noise with three degrees of freedom, respectively. We used seven noise levels with SNRs of 1,2,…,7. For each combination of the test functions, noise types, and SNR, 100 datasets were generated with a sample size of 1024. For each generated dataset, the three methods mentioned above are applied to obtain the estimate $\widehat{m}$ of the test function m. In the EMD methods, including SEMD, the estimate of m is obtained by removing the first component from the noisy signal. As a measure of performance, the mean squared error, $\text{MSE}=\frac{1}{n}\sum\{m(t)-\widehat{m}(t)\}^{2}$, is then calculated for each method. Figures 10 and 11 show the average MSE as a function of SNR, averaged over 100 datasets, for the two noise types.
From the simulation results, the following main observations can be made: (i) noise distorts the decomposition results in the case of the conventional EMD, (ii) wavelet shrinkage outperforms the conventional EMD in recovering the true function, (iii) SEMD is the most effective in removing noise from a noisy signal, and (iv) SEMD is robust to the presence of extreme values. In summary, the simulation results illustrate that SEMD is an effective decomposition method for separating noise or outliers from signals.
SEMD for signals with an ultrahigh frequency component
It is known that if the extrema of a signal are located near the extrema of the highest-frequency component, the conventional EMD effectively decomposes a signal [12, 13]. However, when a signal contains an ultrahigh frequency component, the very small gaps between the extrema of a signal and the highest-frequency component can degrade the desired representation of the envelopes. Thus, an ultrahigh frequency component may be incorrectly extracted, as shown in Figures 12, 13 and 14. The signal in Figure 12 is generated from model (1) with frequencies f_{1} = 80, f_{2} = 2, and f_{3} = 1, and for generating the signals in Figures 13 and 14, an ultrahigh frequency component of 0.1 sin(480πt) is added to the two signals, fg1 and Wave, respectively. Since smoothing, unlike the interpolation technique, can help to properly construct the hidden symmetric envelopes of an ultrahigh frequency component, we apply SEMD to the three simulated signals. As shown in the figures, SEMD successfully extracts the ultrahigh frequency IMF and the subsequent IMFs. Note that the conventional EMD produces artificial IMFs.
Boundary treatment by SEMD
When constructing envelopes during the sifting process, inadequate information is available on the modulation of two boundaries before the first extremum and after the last extremum. Unless the boundaries are properly treated, large swings occur on both sides, and these eventually distort the entire decomposition result. This phenomenon is particularly exaggerated in lower-frequency IMFs because there is inadequate information on an intrinsic mode. In addition to traditional boundary treatments such as periodic or symmetric conditions, Huang et al. [5] extended the original signal by adding artificial waves called characteristic waves, and these can be constructed by repeating the intrinsic mode formed by extreme values nearest to the boundary.
To evaluate the performance of the boundary treatment of SEMD, we consider three test signals. For the signal in Figure 15, the frequencies of model (1) are defined as f_{1} = 6, f_{2} = 2, and f_{3} = 1, and we generate the signals in Figures 16 and 17 by adding a term 0.1 sin(60πt) to the two signals, fg1 and Wave, respectively. The left-hand side panels of Figures 15, 16 and 17 show the decomposition results of the conventional EMD without any boundary treatment. The relatively large amplitude at both boundaries in the first IMF eventually has an effect on the subsequent IMFs. This effect subsequently produces artificial IMFs, which are not components of the original signal. On the other hand, SEMD itself provides an alternative boundary treatment: without using any periodic/symmetric condition or characteristic wave, SEMD can alleviate the boundary problem by applying a smoothing procedure to all levels. As shown in the right-hand side panels of Figures 15, 16 and 17, the decomposition results of SEMD provide a better boundary adjustment, and this can produce appropriate components on the entire domain.
Extension of EMD to irregularly spaced signals
We consider an extension of EMD to irregularly spaced signals. The conventional EMD interpolates in between extrema using cubic splines; this might not be appropriate for obtaining the upper and lower envelopes when the observed data are scattered: they are not observed on regular (spatial) grids, and they have spatially inhomogeneous densities including data voids of various sizes.
Here, we propose a new method based on the combination of a simulation technique for generating random fields and the SEMD algorithm, called simulation-based SEMD. This method can be easily adopted for one-dimensional signals. The proposed method comprises two steps: (1) Extrema are generated on a regularly spaced domain by a simulation method. (2) The upper and lower envelopes are constructed using the simulated extrema and the SEMD algorithm.
A key feature of the proposed simulation-based SEMD method is that it can integrate various patterns between the simulated extrema. Furthermore, the uncertainty of the resulting IMFs can be evaluated on the basis of several sets of simulations.
To generate simulated extrema, we can use some well-studied methods for simulating random fields in spatial statistics. In this study, we employ a kriging-based simulation method that is described below. Consider a Gaussian random field with a covariance function Σ and a realization vector x(s) = [x(s_{1}),…,x(s_{m})]^{T} sampled at irregularly spaced locations s = [s_{1},…,s_{m}]^{T}. The aim is to generate a set of extrema on a regular grid with the same mean and covariance structure as x(s) and to ensure that the realization passes through the observed values. Suppose that we have the decomposition
$$x(t)=p(t;x)+e(t),$$
where p(t;x) denotes the kriging predictor at regularly spaced locations t = (t_{1},…,t_{n}) that depends on x. The quantity e(t) = x(t) − p(t;x) denotes the kriging residuals, which are not available in practice. Therefore, we generate e(t) with an estimated covariance. Using the simulated values p(t;x) + e(t), we identify the extrema and use SEMD to obtain the IMFs at t. Irregularly spaced IMFs are then derived at s.
We apply the proposed method to a signal from model (1) with frequencies f_{1} = 6, f_{2} = 2, and f_{3} = 1. The top panel of Figure 18 shows 1,024 irregularly spaced observations contaminated by Gaussian noise with SNR = 7. The mean estimate for the signal by the simulation-based SEMD method with 200 replications and its 99% empirical confidence interval are indicated by a solid line and a gray band in the second panel of Figure 18, respectively. To evaluate the performance of the proposed method, we apply SEMD and the conventional EMD to the irregularly spaced data shown in the top panel. The dashed lines in the third and fourth panels show the reconstructions of SEMD and EMD, respectively. The MSEs of the proposed simulation-based SEMD, SEMD, and EMD are 0.0059, 0.0072, and 511.603, respectively. Furthermore, Figure 19 shows the mean estimates of the IMFs and residue by the simulation-based SEMD with 200 replications. The results in Figure 19 indicate that the simulation-based SEMD method is capable of effectively decomposing irregularly spaced data.
Conclusion
In this article, we have proposed a statistical EMD to deal with a noisy signal by combining smoothing techniques and the conventional EMD. The results obtained from various numerical experiments confirm the effectiveness of the statistical EMD method. Furthermore, we have extended EMD to irregularly spaced signals by utilizing simulated extrema. These extensions of the conventional EMD are expected to increase the applications of EMD.
Further studies of the proposed SEMD are needed. The current SEMD algorithm requires selecting a smoothing parameter, which is computationally expensive and might be an obstacle to handling massive data. Hence, it is necessary to develop a computationally efficient method for smoothing parameter selection. As another possible refinement of SEMD, we would like to investigate the intermittence problem of mode mixing, in which different modes of oscillation coexist in a single IMF. Finally, although SEMD is relatively robust to outliers compared with the conventional EMD, a least-squares-based smoothing method such as kernel smoothing can still be affected by outliers when constructing the envelopes. Therefore, a quantile-based EMD would merit further study.
References
1. Priestley MB: Spectral Analysis and Time Series, vols. 1 and 2. (Academic Press, New York, 1981)
2. Mallat S: A Wavelet Tour of Signal Processing. (Academic Press, New York, 2009)
3. Daubechies I: Ten Lectures on Wavelets. (SIAM, Philadelphia, 1992)
4. Vidakovic B: Statistical Modeling by Wavelets. (John Wiley & Sons, New York, 1999)
5. Huang NE, Shen Z, Long SR, Wu MC, Shih HH, Zheng Q, Yen NC, Tung CC, Liu HH: The empirical mode decomposition and Hilbert spectrum for nonlinear and nonstationary time series analysis. Proc. Roy. Soc. Lond. A 1998, 454: 903–995. 10.1098/rspa.1998.0193
6. Huang NE, Shen SSP: Hilbert-Huang Transform and Its Applications. (World Scientific, Singapore, 2005)
7. Boudraa AO, Cexus JC: EMD-based signal filtering. IEEE Trans. Instrum. Meas 2007, 56: 2196–2202.
8. Wu Z, Huang NE: Ensemble empirical mode decomposition: a noise assisted data analysis method. Adv. Adapt. Data Anal 2009, 1: 1–49. 10.1142/S1793536909000047
9. Xu Z, Huang B, Zhang F: Improvement of empirical mode decomposition under low sampling rate. Signal Process 2009, 89: 2296–2303. 10.1016/j.sigpro.2009.04.038
10. Diop EHS, Alexandre R, Boudraa AO: Analysis of intrinsic mode functions: a PDE approach. IEEE Signal Process. Lett 2010, 17: 398–401.
11. Hastie T, Tibshirani R: Generalized Additive Models. (Chapman and Hall, London, 1990)
12. Rilling G, Flandrin P: One or two frequencies? The empirical mode decomposition answers. IEEE Trans. Signal Process 2008, 56: 85–95.
13. Park M, Kim D, Oh HS: A reinterpretation of EMD by cubic spline interpolation. Adv. Adapt. Data Anal 2011, 3: 527–540. 10.1142/S1793536911000921
14. Lindeberg T: Scale-Space Theory in Computer Vision. (Kluwer, Boston, 1994)
15. Silverman BW: Using kernel density estimates to investigate multimodality. J. Roy. Stat. Soc. B 1981, 43: 97–99.
16. Donoho DL, Johnstone IM: Adapting to unknown smoothing via wavelet shrinkage. J. Am. Stat. Assoc 1995, 90: 1200–1224. 10.1080/01621459.1995.10476626
17. Donoho DL, Johnstone IM: Ideal spatial adaptation by wavelet shrinkage. Biometrika 1994, 81: 425–455. 10.1093/biomet/81.3.425
18. Fan J, Gijbels I: Data-driven bandwidth selection in local polynomial fitting: variable bandwidth and spatial adaptation. J. Roy. Stat. Soc. B 1995, 57: 371–394.
19. Marron JS, Adak S, Johnstone IM, Neumann MH, Patil P: Exact risk analysis of wavelet regression. J. Comput. Graph. Stat 1998, 7: 278–309.
20. Cai TT: Adaptive wavelet estimation: a block thresholding and oracle inequality approach. Ann. Stat 1999, 27: 898–924. 10.1214/aos/1018031262
Acknowledgements
The work of Hee-Seok Oh was supported by the National Research Foundation of Korea (NRF) grant (No. 2012002712) funded by the Korea government (MEST). The research of Donghoh Kim was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science and Technology (20090076223).
Competing interest
The authors declare that they have no competing interests.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License ( https://creativecommons.org/licenses/by/2.0 ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
About this article
Cite this article
Kim, D., Kim, K.O. & Oh, HS. Extending the scope of empirical mode decomposition by smoothing. EURASIP J. Adv. Signal Process. 2012, 168 (2012). https://doi.org/10.1186/1687-6180-2012-168
Keywords
 Empirical mode decomposition
 Intrinsic mode function
 Irregularly spaced data
 Noisy signals
 Sifting
 Smoothing