
Extending the scope of empirical mode decomposition by smoothing

Abstract

This article considers extending the scope of the empirical mode decomposition (EMD) method. The extension is aimed at noisy data and irregularly spaced data, which is necessary for widespread applicability of EMD. The proposed algorithm, called statistical EMD (SEMD), uses a smoothing technique instead of an interpolation when constructing upper and lower envelopes. Using SEMD, we discuss how to identify non-informative fluctuations such as noise, outliers, and ultra-high frequency components from the signal, and to decompose irregularly spaced data into several components without distortions.

Introduction

When analyzing a complex signal, we frequently decompose it into several components having simple forms and then analyze the information contained in each component to reduce the complexity and to enhance interpretability. Conventionally, decomposition is processed using a basis system. The benefits of decomposition are as follows: (1) a signal is well approximated by a finite number of basis functions, (2) information in the time (physical) domain is transformed into information in the frequency domain without losing any information, and (3) the interpretability of the signal can be enhanced by analyzing each component separately and comparing it with the other components.

Spectral analysis [1] and wavelet analysis [2–4] are popular methods for signal decomposition. However, when a signal has inherent nonstationary and nonlinear features that vary with scale and time location, these methods might not be suitable. Empirical mode decomposition (EMD), developed by Huang et al. [5], provides a data-driven approach that decomposes a signal into so-called intrinsic mode functions (IMFs) according to the local oscillation magnitude in the physical domain. IMFs can be considered as data-driven empirical basis functions. EMD has been widely used for analyzing nonstationary or nonlinear signals in many disciplines of science and engineering [6].

However, because the envelopes are constructed by interpolation, IMFs obtained by the conventional EMD algorithm are sensitive to non-informative fluctuations such as noise, outliers, and ultra-high frequency components, and these fluctuations distort the subsequent decomposition results. In addition, the conventional method has a narrow scope that does not cover irregularly sampled data. These constraints strongly diminish the applicability of EMD to various signals. To extend the scope of the conventional EMD to noisy signals and irregularly spaced data, we propose a statistical EMD algorithm, called SEMD, that is based on a smoothing technique. Like the conventional EMD, this method is fully data-adaptive. The proposed SEMD has several advantages over the conventional EMD: (1) It is robust to noise and to non-informative random fluctuations such as outliers and ultra-high frequency components, and hence, SEMD can decompose such signals into appropriate IMFs without the distortion caused by these factors. (2) It provides a reasonable boundary condition for an IMF without any boundary treatment, and therefore, SEMD can provide stable decomposition results on the entire domain, including boundary regions. Furthermore, we extend EMD to analyze irregularly spaced signals by combining SEMD with a simulation technique.

The remainder of this article is organized as follows. Section "Review: empirical mode decomposition" presents an overview of the conventional EMD. Section "Statistical EMD" describes the proposed SEMD method, and several case studies are presented to show its broad applicability. In addition, we investigate the variation diminishing property of SEMD. An extension to an irregularly spaced signal is presented in Section "Extension of EMD to irregularly spaced signals". Finally, concluding remarks are presented in Section "Conclusion".

Before closing this section, we note that, in the literature, there have been several attempts to enhance the performance of the conventional EMD and to extend its scope. For example, to deal with noise, Boudraa and Cexus [7] removed the high-frequency components using a filtering method, and Wu and Huang [8] used the ensemble mean approach of the simulated signal. Both methods are based on conventional sifting followed by a posterior adjustment. For applying the conventional EMD to signals with a lower sampling rate, Xu et al. [9] proposed a hybrid extrema estimation algorithm based on Fourier interpolation. More recently, Diop et al. [10] suggested a PDE-based approach to compute envelopes, which is another way to construct envelopes without interpolation.

Review: empirical mode decomposition

Fourier analysis decomposes a signal into a sum of sinusoids having different frequencies. However, it is well known that for nonstationary signals, Fourier analysis does not effectively provide frequency information of the signals. Although wavelet analysis is a popular method for analyzing nonstationary signals, it suffers from a nonadaptive nature in that it applies the same type of basis functions to the entire range of data. Wavelet analysis also represents a signal by a linear combination of wavelet basis functions. Therefore, its formulation for the energy-frequency representation of nonlinear data can be misleading [5]. Thus, we require a set of flexible basis functions that reflects time-varying properties of a signal.

Huang et al. [5] proposed a data-driven algorithm for extracting an oscillatory wave from a given signal x as follows. First, we identify the local extrema and construct two functions called the upper envelope and lower envelope by interpolating the local maxima and local minima, respectively. Second, we take their average; this produces a signal with a frequency lower than that of the original signal because the main pattern of the signal is confined between the two envelopes. Third, by subtracting the envelope mean from x, the highly oscillatory wave h is separated.

Huang et al. [5] defined an oscillatory wave to be an IMF if it satisfies two conditions: (1) the number of extrema and the number of zero-crossings should be equal or differ by one and (2) the local average should be zero, implying that the mean of the upper envelope and the lower envelope is zero. There might exist overshoots and undershoots in h after one iteration of the aforementioned procedure, in which case the two conditions are not satisfied. In such a case, the procedure is repeated for h until the conditions are satisfied. This iterative process is called sifting. We may consider the IMF to be an empirical basis driven by the data-adapted process, sifting. This IMF is the mode function that has the finest resolution. By sifting, the original signal x is decomposed into the highest-frequency component $\mathrm{imf}_1$ and a residual signal $r_1 = x - \mathrm{imf}_1$ that is less oscillatory than the original signal x. If $r_1$ still contains components with different frequencies, then the next IMF is obtained by treating $r_1$ as a new signal. The signal is thus sequentially decomposed into components having different frequencies, from the highest-frequency component $\mathrm{imf}_1$ to the lowest-frequency component $\mathrm{imf}_n$ for some finite n, and a residual signal r. Finally, we have n IMFs and a residual signal

$$x(t) = \sum_{i=1}^{n} \mathrm{imf}_i(t) + r(t).$$

Here, the index i denotes the resolution level and $\mathrm{imf}_1$ is the IMF at the finest level. We finally remark that Fourier analysis assumes that a signal is stationary and consists of pure-tone components. In practice, the frequency information can evolve over time and several such frequencies can be compounded. The above EMD procedure is useful for identifying the amount of variation due to oscillation at different scales and time locations and for extracting an oscillatory wave from a nonstationary signal.
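The sifting procedure reviewed above can be sketched in a few lines of Python. This is a minimal illustration rather than the reference implementation of [5]: the function names, the stopping rule based on the relative change between iterates, and the iteration limits are all assumptions made for this sketch, and no boundary treatment is applied.

```python
import numpy as np
from scipy.interpolate import CubicSpline


def local_extrema(x):
    """Indices of interior local maxima and minima of a 1-D array."""
    d = np.diff(x)
    maxima = np.where((d[:-1] > 0) & (d[1:] <= 0))[0] + 1
    minima = np.where((d[:-1] < 0) & (d[1:] >= 0))[0] + 1
    return maxima, minima


def sift_once(t, h):
    """Subtract the mean of the cubic-spline envelopes; None if too few extrema."""
    maxima, minima = local_extrema(h)
    if len(maxima) < 2 or len(minima) < 2:
        return None
    upper = CubicSpline(t[maxima], h[maxima])(t)  # interpolates the maxima
    lower = CubicSpline(t[minima], h[minima])(t)  # interpolates the minima
    return h - 0.5 * (upper + lower)


def extract_imf(t, x, max_iter=50, tol=1e-3):
    """Repeat sifting until successive iterates barely change (assumed rule)."""
    h = x.copy()
    for _ in range(max_iter):
        h_new = sift_once(t, h)
        if h_new is None:
            break
        change = np.sum((h - h_new) ** 2) / (np.sum(h ** 2) + 1e-12)
        h = h_new
        if change < tol:
            break
    return h


def emd(t, x, max_imfs=8):
    """Peel off IMFs from finest to coarsest; returns (list of IMFs, residual)."""
    imfs, r = [], x.copy()
    for _ in range(max_imfs):
        maxima, minima = local_extrema(r)
        if len(maxima) < 2 or len(minima) < 2:
            break  # the residual no longer oscillates
        imf = extract_imf(t, r)
        imfs.append(imf)
        r = r - imf
    return imfs, r


t = np.linspace(0, 9, 1024)
x = 0.5 * t + np.sin(6 * np.pi * t) + np.sin(2 * np.pi * t) + np.sin(np.pi * t)
imfs, residual = emd(t, x)
print(len(imfs), np.max(np.abs(sum(imfs) + residual - x)))
```

By construction the extracted components and the residual add back to the input, which mirrors the decomposition formula above. Because the splines are extrapolated beyond the first and last extrema, this sketch may swing near the boundaries.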

Statistical EMD

One of the main purposes of EMD is to decompose a signal into several components and to identify its significant frequency components. It is not uncommon for a signal to be corrupted by non-informative random fluctuations such as noise, which may consist of high frequencies and contain no interpretable information.

However, the conventional EMD algorithm cannot effectively separate noise from the signal, and hence, it does not produce stable decomposition results for noisy signals. To overcome this problem, we propose a modified sifting process based on a smoothing technique. The proposed algorithm can be implemented simply by replacing the interpolation with smoothing; that is, the upper and lower envelopes are constructed by a smoothing technique. The proposed algorithm is designed for the noisy signals commonly considered in the field of statistics. Thus, we call the proposed algorithm statistical EMD (SEMD). Formally, the SEMD algorithm can be stated as follows:

  A. (Modified sifting) Take a signal $x$ to be decomposed, and extract the first mode $h_{1,\lambda}$ by using a smoothing technique.

    • (A-1) Identify the local maxima (minima) $z$ of the signal $h_{1,\lambda}^{0}$, where $h_{1,\lambda}^{0}$ is the original signal $x$.

    • (A-2) Construct an upper envelope $\hat{u}_{\lambda}$ (lower envelope $\hat{\ell}_{\lambda}$) by applying a smoothing technique with a smoothing parameter $\lambda$ to the maxima (minima) $z$.

    • (A-3) Compute the local mean $m_{\lambda} = \frac{1}{2}(\hat{u}_{\lambda} + \hat{\ell}_{\lambda})$ as the average of the two envelopes, and then obtain a candidate intrinsic mode $h_{1,\lambda}^{1} = h_{1,\lambda}^{0} - m_{\lambda}$.

    • (A-4) Repeat steps (A-1)–(A-3) for the signal $h_{1,\lambda}^{1}$ until the signal $h_{1,\lambda}^{j}$ at the $j$th iteration satisfies the IMF conditions.

    • (A-5) Decompose the signal as $x = h_{1,\lambda} + r_{\lambda}$, where $h_{1,\lambda}$ is defined as the limit of $h_{1,\lambda}^{j}$ and $r_{\lambda}$ is the remaining signal.

  B. (Conventional sifting) If the remaining signal $r_{\lambda} = x - h_{1,\lambda}$ has an intrinsic oscillation mode, then $r_{\lambda}$ can be further decomposed by conventional sifting.

The only difference between the SEMD algorithm and the conventional EMD is step A, where the first mode is extracted by smoothing instead of interpolation. In particular, the construction of $\hat{u}_{\lambda}$ and $\hat{\ell}_{\lambda}$ by smoothing in step (A-2) plays the most important role in determining the quality of the decomposition when the signal is corrupted by non-informative random fluctuations.
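As an illustration of step A, the following sketch replaces the interpolated envelopes by Gaussian-kernel smooths of the local extrema. Several choices here are assumptions made for the example and are not taken from the article: the normalized (Nadaraya-Watson) form of the kernel estimate, the stopping rule, the noise level, and the bandwidth `lam=0.05`, which is illustrative rather than selected by cross-validation.

```python
import numpy as np


def local_extrema(x):
    """Indices of interior local maxima and minima of a 1-D array."""
    d = np.diff(x)
    maxima = np.where((d[:-1] > 0) & (d[1:] <= 0))[0] + 1
    minima = np.where((d[:-1] < 0) & (d[1:] >= 0))[0] + 1
    return maxima, minima


def kernel_smooth(t_obs, z, t_eval, lam):
    """Gaussian-kernel weighted average of z (normalized form, an assumption)."""
    log_w = -0.5 * ((t_eval[:, None] - t_obs[None, :]) / lam) ** 2
    log_w -= log_w.max(axis=1, keepdims=True)  # guards against underflow
    w = np.exp(log_w)
    return (w @ z) / w.sum(axis=1)


def semd_first_mode(t, x, lam, max_iter=20, tol=1e-3):
    """Step A: sift with smoothed envelopes; returns (h_1_lambda, r_lambda)."""
    h = x.copy()
    for _ in range(max_iter):
        maxima, minima = local_extrema(h)                     # (A-1)
        if len(maxima) < 2 or len(minima) < 2:
            break
        upper = kernel_smooth(t[maxima], h[maxima], t, lam)   # (A-2)
        lower = kernel_smooth(t[minima], h[minima], t, lam)
        h_new = h - 0.5 * (upper + lower)                     # (A-3)
        change = np.sum((h - h_new) ** 2) / (np.sum(h ** 2) + 1e-12)
        h = h_new                                             # (A-4)
        if change < tol:                                      # assumed stopping rule
            break
    return h, x - h                                           # (A-5)


rng = np.random.default_rng(0)
t = np.linspace(0, 9, 1024)
m = 0.5 * t + np.sin(6 * np.pi * t) + np.sin(2 * np.pi * t) + np.sin(np.pi * t)
x = m + 0.3 * rng.standard_normal(t.size)
h1, r = semd_first_mode(t, x, lam=0.05)
print(np.mean((x - m) ** 2), np.mean((r - m) ** 2))
```

With a very small bandwidth the smoother reproduces the extrema themselves, and with a very large one it returns their average, which corresponds to the two limiting cases of λ discussed in the remarks below.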

A key issue that needs to be considered is how to determine the degree of smoothness (i.e., the smoothing parameter λ) in the smoothing process. We propose an automatic selection method for λ that utilizes conventional cross-validation. Cross-validation splits the observations into K roughly equal-sized parts (for example, K = 4). For the kth part (the test dataset), we fit the model to the other K − 1 parts (the training dataset) and calculate the prediction error of the fitted model on the kth part. We perform this procedure for k = 1,…,K and combine all K estimates of the prediction error.

However, by omitting the test dataset, the remaining training dataset for fitting the model becomes unequally spaced data. Since the model fit is based on the decomposition, it is difficult to obtain the stable fitting results with such unequally spaced data, and hence, the conventional cross-validation method may not be directly applicable to this case.

Here, we propose an imputation-based cross-validation method for selecting the smoothing parameter. For the kth test dataset, we impute it by an imputation method, apply SEMD with a given smoothing parameter λ to the new composite data, which consist of the imputed kth dataset and the remaining training dataset, and calculate the prediction error for the kth test dataset. More specifically, by defining an indexing function κ: {1,2,…,n} → {1,2,…,K} that indicates the partition to which observation x(t) is allocated, we obtain K partitioned datasets, $T_k = \{x(t) : \kappa(t) = k,\ t = 1,2,\ldots,n\}$ ($k = 1,\ldots,K$). We then define the prediction error as follows:

  (i) Split a signal $x$ into $K$ test datasets $T_1,\ldots,T_k,\ldots,T_K$.

  (ii) Impute the $k$th test dataset by the local average of two neighboring points and obtain $\tilde{T}_k$.

  (iii) With a given smoothing parameter $\lambda$, apply the SEMD algorithm to decompose the composite signal $T_1,\ldots,T_{k-1},\tilde{T}_k,T_{k+1},\ldots,T_K$ into $h_{1,\lambda}$ and the remaining signal $r_{\lambda}$.

  (iv) Obtain the predicted values of the remaining signal evaluated at the $k$th part, say $r_{\lambda}^{k}(t)$.

  (v) Repeat steps (ii)–(iv) for $k = 1,\ldots,K$, and define the prediction error as

    $$\mathrm{PE}(\lambda) = \frac{1}{n} \sum_{t=1}^{n} \{ x(t) - r_{\lambda}^{k}(t) \}^{2}.$$

Finally, by using an optimization algorithm such as the golden section search, we select the value of the smoothing parameter λ that minimizes the prediction error PE(λ). By considering each test dataset as new observations, it can be shown that the expectation of PE(λ) is close to the true prediction error [11]. Thus, the above procedure is widely used for estimating the true prediction error.
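Steps (i)–(v) and the golden section search can be sketched as follows. This is a rough illustration under stated assumptions, not the authors' code: the interleaved partition `fold = index mod K`, the fixed number of sifting passes `n_sift`, the normalized Gaussian kernel smoother, the search interval, and all function names are hypothetical choices. The golden section search also presumes that PE(λ) is unimodal on the search interval.

```python
import numpy as np


def _extrema(x):
    d = np.diff(x)
    mx = np.where((d[:-1] > 0) & (d[1:] <= 0))[0] + 1
    mn = np.where((d[:-1] < 0) & (d[1:] >= 0))[0] + 1
    return mx, mn


def _smooth(t_obs, z, t_eval, lam):
    """Gaussian-kernel weighted average (normalized form, an assumption)."""
    log_w = -0.5 * ((t_eval[:, None] - t_obs[None, :]) / lam) ** 2
    log_w -= log_w.max(axis=1, keepdims=True)
    w = np.exp(log_w)
    return (w @ z) / w.sum(axis=1)


def remaining_signal(t, x, lam, n_sift=5):
    """r_lambda = x - h_{1,lambda}, using a fixed number of sifting passes."""
    h = x.copy()
    for _ in range(n_sift):
        mx, mn = _extrema(h)
        if len(mx) < 2 or len(mn) < 2:
            break
        h = h - 0.5 * (_smooth(t[mx], h[mx], t, lam) + _smooth(t[mn], h[mn], t, lam))
    return x - h


def prediction_error(t, x, lam, K=4):
    n = x.size
    fold = np.arange(n) % K                          # kappa(t): interleaved parts (assumed)
    sq_err = 0.0
    for k in range(K):
        test = np.where(fold == k)[0]                # (i) indices of T_k
        left = np.where(test == 0, 1, test - 1)      # neighbors, mirrored at the ends
        right = np.where(test == n - 1, n - 2, test + 1)
        x_tilde = x.copy()
        x_tilde[test] = 0.5 * (x[left] + x[right])   # (ii) impute T_k
        r = remaining_signal(t, x_tilde, lam)        # (iii) SEMD on composite signal
        sq_err += np.sum((x[test] - r[test]) ** 2)   # (iv) error on the k-th part
    return sq_err / n                                # (v) PE(lambda)


def golden_section(f, a, b, tol=1e-3):
    """Minimize a unimodal function f on [a, b] by golden section search."""
    gr = (np.sqrt(5) - 1) / 2
    c, d = b - gr * (b - a), a + gr * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc < fd:
            b, d, fd = d, c, fc
            c = b - gr * (b - a)
            fc = f(c)
        else:
            a, c, fc = c, d, fd
            d = a + gr * (b - a)
            fd = f(d)
    return 0.5 * (a + b)


rng = np.random.default_rng(1)
t = np.linspace(0, 9, 512)
m = 0.5 * t + np.sin(6 * np.pi * t) + np.sin(2 * np.pi * t) + np.sin(np.pi * t)
x = m + 0.3 * rng.standard_normal(t.size)
lam_hat = golden_section(lambda lam: prediction_error(t, x, lam), 0.01, 0.5, tol=5e-3)
print(lam_hat, prediction_error(t, x, lam_hat))
```

With the interleaved partition, the two neighbors of every test point belong to other parts, so the imputation in step (ii) never uses held-out values.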

We have some remarks regarding the SEMD algorithm.

  • The first mode $h_{1,\lambda}$: The mode $h_{1,\lambda}$ might not contain a meaningful mode when non-informative fluctuations such as noise are present, and hence, it may not be appropriate to define the extracted mode as an IMF. However, the extracted mode can be considered as an IMF in the case of noiseless signals.

  • Modified sifting at the first level: The modified sifting can also be employed to extract further IMFs beyond the first extracted mode $h_{1,\lambda}$. However, in our experience based on extensive simulation studies, SEMD effectively extracts noise from a noisy signal x at the first level. Furthermore, Rilling and Flandrin [12] and Park et al. [13] showed that when there is a large discrepancy between the frequencies of two components of a signal, the ordinary sifting process cannot correctly estimate the relatively low frequency component, which results in misidentifying the relatively high frequency component. Since noise acts as a high frequency component and the modified sifting based on smoothing effectively estimates the low frequency component, SEMD seems to extract noise effectively at the first level.

  • Smoothing technique: Several smoothing techniques, including kernel smoothing, smoothing splines, and local polynomial methods, are well developed. In this study, we use kernel smoothing with a Gaussian kernel. In practice, any smoothing method can be adopted for the SEMD algorithm.

  • The role of the smoothing parameter λ: The performance of the modified sifting depends on the choice of λ. We now consider two special cases: (1) λ = 0: both envelopes $\hat{u}_{\lambda}$ and $\hat{\ell}_{\lambda}$ are constructed by interpolation, and hence, the extracted results are identical to those of the conventional EMD; and (2) λ = ∞: both envelopes $\hat{u}_{\lambda}$ and $\hat{\ell}_{\lambda}$ are weighted averages of the local extrema, so that the extracted mode becomes an over-smoothed function that might not represent any frequency pattern of the original signal, which implies that no meaningful modes can be extracted further. To avoid both extremes, we propose the data-driven cross-validation approach to select an optimal λ. Finally, we remark that since PE(λ) is a reasonable estimate of the true prediction error, the selected λ should be close to 0 when the signal is noise-free. Thus, the resulting fit is almost identical to the interpolation result in the case of a noise-free signal. In summary, SEMD is applicable to both noisy and noise-free signals.

  • The number of parts K: Throughout this article, we use K = 4, so that the entire signal is divided into four parts. K can be any integer between 2 and n. The case K = n is known as ‘leave-one-out’ cross-validation, where κ(t) = t, and the predicted value for the tth observation is evaluated using all the data except the tth observation. Leave-one-out cross-validation is therefore computationally intensive, since the decomposition must be recomputed n times.

  • Sensitivity to the imputation method: An imputation method is required for the derivation of PE(λ). In this study, we use the local average of two neighboring points for imputation, which is simple and fast. It can be replaced by a more advanced imputation technique such as the EM algorithm. However, for all cases in the article, we observe that the selection results for the smoothing parameter are almost identical.

  • Computation cost: Compared to the conventional EMD, the SEMD algorithm requires a longer computational time, mainly because of the smoothing parameter selection. However, once the smoothing parameter for the first mode is selected, the proposed algorithm is even faster than the conventional method when a signal is contaminated by random fluctuations, because the remaining steps are almost identical to those of the conventional EMD and, in this case, the conventional EMD tends to produce extra artificial modes (this observation will be shown in subsequent sections). In addition, the computational burden of the above K-fold cross-validation procedure is not substantial.

Here, we discuss a theoretical property of SEMD, namely, the variation diminishing property of envelopes. It implies that, as the value of the smoothing parameter increases, the variation of the envelopes decreases monotonically. In other words, structures of the envelopes such as peaks and valleys disappear monotonically as the level of smoothing increases. Thus, lower and upper envelopes generated with a sufficiently large smoothing parameter should not contain artifacts due to noise. This fact is known as causality in the scale space literature (see, e.g., [14]).

Proposition 1

(variation diminishing property of envelopes) Let $\{z_i\}_{i=1}^{m}$ be a sequence of the centered local maxima (or minima) of the original signal $\{x(t)\}_{t=1}^{n}$ ($m < n$) and let $\nu(\lambda)$ be the number of sign changes in the Gaussian (or one-sided exponential) kernel estimate $\hat{u}_{\lambda}$ for $\{z_i\}_{i=1}^{m}$. It follows that

$$\nu(\lambda) \le \nu(\lambda')$$

for any positive value $\lambda' \le \lambda$.

Proof

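Suppose that we observe the data $(t_1, z_1),\ldots,(t_m, z_m)$. Given a smoothing parameter $\lambda$, a kernel smoothing estimate is defined as

$$\hat{u}_{\lambda}(t) = \frac{1}{m\lambda} \sum_{i=1}^{m} z_i K\{(t - t_i)/\lambda\},$$

where the kernel function is $K(x) = (1/\sqrt{2\pi}) \exp(-x^{2}/2)$. Then, using the fact that $K(t/\lambda_1) * K(t/\lambda_2) \propto K\big(t/\sqrt{\lambda_1^{2} + \lambda_2^{2}}\big)$, where $\propto$ denotes equality up to a positive constant that does not affect sign changes, we obtain

$$\hat{u}_{\lambda_1}(t) * K(t/\lambda_2) \propto \hat{u}_{\sqrt{\lambda_1^{2} + \lambda_2^{2}}}(t)$$

for all $\lambda_1, \lambda_2 > 0$. By a theorem of [15], it follows that the number of sign changes in $\hat{u}_{\lambda}(t)$ is a monotonically non-increasing function of $\lambda$. □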
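As a supplementary check of the convolution step used in the proof, the identity can be worked out with the normalized kernel $K_{\lambda}(t) = \lambda^{-1} K(t/\lambda)$, which is the density of a normal distribution with mean zero and variance $\lambda^{2}$. The notation $K_{\lambda}$, $\Lambda$, and the Fourier-transform argument are introduced here for illustration only and do not appear in the original proof.

```latex
\begin{align*}
K_{\lambda}(t) &= \frac{1}{\lambda\sqrt{2\pi}} \exp\!\left(-\frac{t^{2}}{2\lambda^{2}}\right),
\qquad
\int K_{\lambda}(t)\, e^{-i\omega t}\, dt = \exp\!\left(-\tfrac{1}{2}\lambda^{2}\omega^{2}\right), \\
\int (K_{\lambda_1} * K_{\lambda_2})(t)\, e^{-i\omega t}\, dt
  &= \exp\!\left(-\tfrac{1}{2}\lambda_1^{2}\omega^{2}\right)
     \exp\!\left(-\tfrac{1}{2}\lambda_2^{2}\omega^{2}\right)
   = \exp\!\left(-\tfrac{1}{2}(\lambda_1^{2} + \lambda_2^{2})\omega^{2}\right), \\
\text{so}\quad K_{\lambda_1} * K_{\lambda_2} &= K_{\Lambda},
\qquad \Lambda = \sqrt{\lambda_1^{2} + \lambda_2^{2}}, \\
(\hat{u}_{\lambda_1} * K_{\lambda_2})(t)
  &= \frac{1}{m} \sum_{i=1}^{m} z_i\, (K_{\lambda_1} * K_{\lambda_2})(t - t_i)
   = \frac{1}{m} \sum_{i=1}^{m} z_i\, K_{\Lambda}(t - t_i)
   = \hat{u}_{\Lambda}(t).
\end{align*}
```

Here $\hat{u}_{\lambda}(t) = \frac{1}{m}\sum_{i} z_i K_{\lambda}(t - t_i)$ coincides with the kernel estimate defined in the proof. Smoothing an envelope that was already smoothed with bandwidth $\lambda_1$ by a further Gaussian of bandwidth $\lambda_2$ is therefore the same as smoothing the extrema once with the larger bandwidth $\Lambda$, which is what allows the sign-change result to be applied across bandwidths.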

To conclude this section, we note that the benefits of adopting a smoothing process for sifting can be summarized as a few items and these will be investigated empirically through simulation studies in the subsequent sections.

  (a) Extrema of a signal contaminated by noise are sensitive to the noise and to outliers. Thus, it is necessary to filter out such insignificant terms when constructing the upper and lower envelopes; the sifting process using filtered envelopes can then produce stable IMFs.

  (b) As discussed below, the conventional EMD cannot properly handle a signal containing an ultra-high frequency component because it is difficult to obtain the desired upper and lower envelopes by using interpolation. This case can be handled by employing a smoothing approach.

  (c) Inadequate information is available on the modulation at the two boundaries, before the first extremum and after the last extremum, when constructing envelopes. Thus, using smoothing instead of interpolation for the extrema can alleviate the boundary problem.

SEMD for noisy signals

To observe the effect of smoothing in the sifting process, we consider the following synthetic signal x(t), 0 < t < 9 with frequencies f1, f2, and f3:

$$x(t) = 0.5t + \sin(f_1 \pi t) + \sin(f_2 \pi t) + \sin(f_3 \pi t). \qquad (1)$$

The panels in Figure 1 show signals obtained from model (1) with f1 = 6, f2 = 2, and f3 = 1, and the decomposition results by the conventional EMD and SEMD, respectively. For the noise-free signal, the decomposition performance of SEMD is comparable to that of the conventional EMD, which properly produces three IMFs and the residue. We also consider monthly average surface air temperatures of the Northern Hemisphere at surface level 700 hPa for the period January 1961 – August 2003 in Figure 2. The data are available from the website http://www.cdc.noaa.gov/. The temperature data appear to be a noise-free signal. The panels in Figure 2 show that the first two IMFs and the remaining signal obtained by EMD and SEMD are almost identical.

Figure 1

Decomposition results by EMD and SEMD for a noise-free signal.

Figure 2

Monthly average surface air temperatures of Northern Hemisphere at surface level 700 hPa and its decomposition results by EMD and SEMD.

On the other hand, EMD distorts the decomposition when noise is present. The left-hand side panels in Figure 3 show a Gaussian noisy signal with signal-to-noise ratio (SNR) 5, the first IMF, and the remaining signal obtained using interpolation during the sifting process. We note that throughout the experiments and simulation studies of this article, n = 1024 data points are regularly sampled in the time domain, and SNR is defined as SNR = ∥m∥/σ, where m is the true function and σ is the standard deviation of the noise. It is apparent that the remaining signal exhibits a noisy pattern. In contrast, the application of a smoothing technique ensures that noise is effectively separated from the signal, as shown in the right-hand side panels of Figure 3. This can be assessed by examining the periodogram. As shown in Figure 4, the periodogram of the first component by SEMD is almost flat, which is close to that of pure noise, while that of EMD is not flat and therefore differs from that of pure noise. In this case, the optimal smoothing parameter of SEMD is selected as λ = 0.0523 by minimizing the prediction error of cross-validation. Figure 5 shows the prediction errors as a function of the smoothing parameter, which illustrates the sensitivity to the choice of smoothing parameter. The first component extracted by SEMD or EMD affects the subsequent decomposition results significantly. Figure 6 shows the decomposition results for a noisy signal. The conventional EMD does not produce proper IMFs, whereas SEMD extracts the noise and decomposes the three IMFs effectively.
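For readers who wish to reproduce a comparable test signal, the sketch below generates model (1) on n = 1024 regular points, adds Gaussian noise at SNR 5, and computes a raw periodogram. It rests on one assumption that the text does not settle: ∥m∥ is taken to be the sample standard deviation of the true function. The function names are hypothetical.

```python
import numpy as np


def model_signal(t, f1=6.0, f2=2.0, f3=1.0):
    """Model (1): linear trend plus three sinusoids."""
    return (0.5 * t + np.sin(f1 * np.pi * t)
            + np.sin(f2 * np.pi * t) + np.sin(f3 * np.pi * t))


def add_gaussian_noise(m, snr, rng):
    """Add N(0, sigma^2) noise with sigma = ||m|| / snr.

    Assumption: ||m|| is taken to be the sample standard deviation of m.
    """
    sigma = np.std(m) / snr
    return m + sigma * rng.standard_normal(m.size), sigma


def periodogram(y):
    """Raw periodogram |FFT|^2 / n at the positive Fourier frequencies."""
    y = y - np.mean(y)
    return np.abs(np.fft.rfft(y)[1:]) ** 2 / y.size


n = 1024
t = np.linspace(0, 9, n + 2)[1:-1]   # n regular points with 0 < t < 9
m = model_signal(t)
rng = np.random.default_rng(0)
x, sigma = add_gaussian_noise(m, snr=5, rng=rng)
pg_noise = periodogram(x - m)        # periodogram of the pure noise part
print(sigma, pg_noise.mean() / sigma ** 2)
```

For white noise the periodogram fluctuates around a constant level, which is the flat reference pattern against which the first components of EMD and SEMD are compared.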

Figure 3

The first component and remaining signal by EMD and SEMD for a Gaussian noisy signal with SNR 5.

Figure 4

Periodogram of pure noise, and periodograms of the first components by EMD and SEMD for a Gaussian noisy signal with SNR 5 of Figure 3.

Figure 5

The prediction errors of SEMD according to smoothing parameter λ for a Gaussian noisy signal with SNR 5 of Figure 3.

Figure 6

Decomposition results by EMD and SEMD for a Gaussian noisy signal with SNR 5.

SEMD for signals with outliers

Here, we consider a case in which the signal has some insignificant random fluctuations represented by outliers. Unfortunately, the conventional EMD may not be effective in decomposing signals with outliers because interpolation is sensitive to outliers. More specifically, interpolation in the conventional sifting is based on local extrema, and therefore, the upper or lower envelope moves toward the extreme values. Thus, the subsequent IMFs and the residual signal are distorted to an extent that depends on the extreme values. The left-hand side panels in Figure 7 show the case with a single outlier. The EMD process cannot cope with even a single outlier and produces incorrect waveforms, whereas the SEMD results shown in the right-hand side panels are robust to the presence of an outlier. In this case, the optimal smoothing parameter λ of SEMD is 0.0663, which is larger than that of the Gaussian noisy signal in Figure 3. In a similar manner, the first component of SEMD effectively treats heavy-tailed noise that can produce outliers, such as noise from the t-distribution with three degrees of freedom shown in Figure 8.

Figure 7

Decomposition results by EMD and SEMD when one extreme value is present.

Figure 8

Decomposition results by EMD and SEMD for a heavy-tailed noise with SNR 5.

To evaluate the practical performance of SEMD and to see whether the first component effectively captures the noise or outliers from a signal, a simulation study was conducted. In the study, we compared SEMD with the conventional EMD and a wavelet shrinkage method.

  1. emd: conventional EMD,

  2. semd: proposed SEMD, and

  3. sure: SURE wavelet shrinkage method of [16].

We consider the nine test functions displayed in Figure 9: Sines (sines) in model (1), a chirp signal (chirp) of the form m(t) = exp(−0.01t) cos(πt/10) (t ∈ [0,500]), Heavisine (heav) from [17], fg1 (fg1) from [18], Wave (wave), Angles (angle), Parabolas (para), Time Shifted Sine (tsine) from [19], and Corner (corn) from [20].

Figure 9

Nine test functions used in the simulation study.

The test functions are corrupted by Gaussian noise and by t-distribution noise with three degrees of freedom, respectively. We used seven noise levels with SNRs of 1,2,…,7. For each combination of test function, noise type, and SNR, 100 datasets were generated with a sample size of 1024. For each generated dataset, the three methods mentioned above are applied to obtain the estimate $\hat{m}$ of the test function $m$. In the EMD methods, including SEMD, the estimate of $m$ is obtained by removing the first component from the noisy signal. As a measure of performance, the mean squared error, $\mathrm{MSE} = \frac{1}{n} \sum \{ m(t) - \hat{m}(t) \}^{2}$, is then calculated for each method. Figures 10 and 11 show the average MSE as a function of SNR, averaged over 100 datasets, for the two noise types.
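The simulation design described above can be organized as a small harness. The sketch below is only a scaffold under stated assumptions: `moving_average` is a stand-in estimator that is not one of the three methods compared in the study, the SNR is read as the sample standard deviation of m divided by σ, and the t-distributed noise is rescaled to unit variance before scaling by σ. The demo uses 20 replications instead of 100 to keep it quick.

```python
import numpy as np


def chirp(t):
    """The 'chirp' test function as given in the text, t in [0, 500]."""
    return np.exp(-0.01 * t) * np.cos(np.pi * t / 10)


def moving_average(x, width=9):
    """Stand-in estimator only; replace by emd / semd / sure in a real study."""
    return np.convolve(x, np.ones(width) / width, mode="same")


def average_mse(m, estimator, snr, noise="gaussian", n_rep=100, seed=0):
    """Average of MSE = mean{(m - m_hat)^2} over n_rep simulated datasets."""
    rng = np.random.default_rng(seed)
    sigma = np.std(m) / snr                  # assumed reading of SNR = ||m|| / sigma
    total = 0.0
    for _ in range(n_rep):
        if noise == "gaussian":
            e = rng.standard_normal(m.size)
        else:                                # t with 3 d.f., rescaled to unit variance
            e = rng.standard_t(3, size=m.size) / np.sqrt(3.0)
        m_hat = estimator(m + sigma * e)
        total += np.mean((m - m_hat) ** 2)
    return total / n_rep


t = np.linspace(0, 500, 1024)
m = chirp(t)
results = {snr: average_mse(m, moving_average, snr, n_rep=20) for snr in range(1, 8)}
for snr, mse in results.items():
    print(snr, mse)
```

Passing `noise="t3"` (any value other than `"gaussian"`) switches to the heavy-tailed case, and any function that maps a noisy signal to an estimate of m can be supplied as `estimator`.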

Figure 10

Average MSE as a function of SNR (1–7) for Gaussian noise. dotted line: emd, dashed line: sure, solid line: semd.

Figure 11

Average MSE as a function of SNR (1–7) for t-distribution noise. dotted line: emd, dashed line: sure, solid line: semd.

From the simulation results, the following main observations can be made: (i) noise distorts the decomposition results in the case of the conventional EMD, (ii) wavelet shrinkage outperforms the conventional EMD in recovering the true function, (iii) SEMD is the most effective in removing noise from a noisy signal, and (iv) SEMD is robust to the presence of extreme values. In summary, the simulation results illustrate that SEMD is an effective decomposition method for separating noise or outliers from signals.

SEMD for signals with an ultra-high frequency component

It is known that if the extrema of a signal are located near the extrema of the highest-frequency component, the conventional EMD effectively decomposes the signal [12, 13]. However, when a signal contains an ultra-high frequency component, the very small gaps between the extrema of the signal and those of the highest-frequency component can degrade the desired representation of the envelopes. Thus, an ultra-high frequency component may be incorrectly extracted, as shown in Figures 12, 13 and 14. The signal of Figure 12 is generated from model (1) with frequencies f1 = 80, f2 = 2, and f3 = 1, and the signals in Figures 13 and 14 are generated by adding an ultra-high frequency component 0.1 sin(480πt) to the two signals fg1 and Wave, respectively. Since smoothing, unlike interpolation, can help to properly construct the hidden symmetric envelopes of an ultra-high frequency component, we apply SEMD to the three simulated signals. As shown in the figures, SEMD successfully extracts the ultra-high frequency IMF and the subsequent IMFs, whereas the conventional EMD produces artificial IMFs.

Figure 12

Decomposition results by EMD and SEMD for a signal x(t) = 0.5t + sin(80πt) + sin(2πt) + sin(πt), 0 < t < 9.

Figure 13

Decomposition results by EMD and SEMD for fg1 signal with an ultra-high frequency component 0.1 sin(480πt).

Figure 14

Decomposition results by EMD and SEMD for Wave signal with an ultra-high frequency component 0.1 sin(480πt).

Boundary treatment by SEMD

When constructing envelopes during the sifting process, inadequate information is available on the modulation at the two boundaries, before the first extremum and after the last extremum. Unless the boundaries are properly treated, large swings occur on both sides, and these eventually distort the entire decomposition result. This phenomenon is particularly pronounced in lower-frequency IMFs because there is inadequate information on an intrinsic mode. In addition to traditional boundary treatments such as periodic or symmetric conditions, Huang et al. [5] extended the original signal by adding artificial waves called characteristic waves, which can be constructed by repeating the intrinsic mode formed by the extreme values nearest to the boundary.

To evaluate the performance of the boundary treatment of SEMD, we consider three test signals. For the signal in Figure 15, the frequencies of model (1) are f1 = 6, f2 = 2, and f3 = 1, and we generate the signals in Figures 16 and 17 by adding a term 0.1 sin(60πt) to the two signals fg1 and Wave, respectively. The left-hand side panels of Figures 15, 16 and 17 show the decomposition results of the conventional EMD without any boundary treatment. The relatively large amplitude at both boundaries in the first IMF eventually affects the subsequent IMFs and produces artificial IMFs that are not components of the original signal. On the other hand, SEMD itself provides an alternative to boundary treatment: without using any periodic or symmetric condition or characteristic wave, SEMD can alleviate the boundary problem by applying a smoothing procedure at all levels. As shown in the right-hand side panels of Figures 15, 16 and 17, the decomposition results of SEMD provide a better boundary adjustment, which produces appropriate components on the entire domain.

Figure 15

Boundary treatment by SEMD for a signal x(t) = 0.5t + sin(6πt) + sin(2πt) + sin(πt), 0 < t < 9.

Figure 16

Boundary treatment by SEMD for a signal x(t) = fg1(t) + 0.1 sin(60πt).

Figure 17

Boundary treatment by SEMD for a signal x(t) = Wave(t) + 0.1 sin(60πt).

Extension of EMD to irregularly spaced signals

We consider an extension of EMD to irregularly spaced signals. The conventional EMD interpolates between extrema using cubic splines; this might not be appropriate for obtaining the upper and lower envelopes when the observed data are scattered, that is, when they are not observed on regular (spatial) grids and have spatially inhomogeneous densities, including data voids of various sizes.

Here, we propose a new method based on the combination of a simulation technique for generating random fields and the SEMD algorithm, called simulation-based SEMD. This method can be easily adopted for one-dimensional signals. The proposed method comprises two steps: (1) Extrema are generated on a regularly spaced domain by a simulation method. (2) The upper and lower envelopes are constructed using the simulated extrema and the SEMD algorithm.

A key feature of the proposed simulation-based SEMD method is that it can integrate various patterns between the simulated extrema. Furthermore, the uncertainty of the resulting IMFs can be evaluated on the basis of several sets of simulations.

To generate simulated extrema, we can use some well-studied methods for simulating random fields in spatial statistics. In this study, we employ a kriging-based simulation method that is described below. Consider a Gaussian random field with a covariance function Σ and a realization vector $x(s) = [x(s_1),\ldots,x(s_m)]^{T}$ sampled at irregularly spaced locations $s = [s_1,\ldots,s_m]^{T}$. The aim is to generate a set of extrema on a regular grid with the same mean and covariance structure as x(s) and to ensure that the realization passes through the observed values. Suppose that we have the decomposition

x(t) = p(t; x) + [x(t) − p(t; x)],

where p(t; x) denotes the kriging predictor at the regularly spaced locations t = (t_1, …, t_n), which depends on x. The quantity e(t) = x(t) − p(t; x) denotes the kriging residuals, which are not available in practice. Therefore, we generate e(t) with an estimated covariance. Using the simulated values p(t; x) + e(t), we identify the extrema and use SEMD to obtain the IMFs at t. The IMFs at the irregularly spaced locations s are then derived.
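
One common way to generate such residuals is conditioning by kriging: draw an unconditional field z with the assumed covariance at both t and s, and take e(t) = z(t) − p(t; z(s)). The sketch below follows that construction under several assumptions that are not taken from the article: simple kriging with zero mean, an exponential covariance with fixed (not estimated) parameters, a small diagonal jitter for numerical stability, and hypothetical function names and toy data.

```python
import numpy as np

def exp_cov(a, b, sill=1.0, length=1.0):
    """Exponential covariance sill * exp(-|a - b| / length); an assumed model."""
    return sill * np.exp(-np.abs(a[:, None] - b[None, :]) / length)

def krige(s, values, t, jitter=1e-8):
    """Simple-kriging predictor at t from values observed at s (zero mean assumed)."""
    c_ss = exp_cov(s, s) + jitter * np.eye(len(s))
    return exp_cov(t, s) @ np.linalg.solve(c_ss, values)

def conditional_draw(s, x, t, rng, jitter=1e-8):
    """One realization p(t; x) + e(t), with e(t) = z(t) - p(t; z(s))."""
    u = np.concatenate([t, s])
    chol = np.linalg.cholesky(exp_cov(u, u) + jitter * np.eye(len(u)))
    z = chol @ rng.standard_normal(len(u))    # unconditional field at (t, s)
    z_t, z_s = z[:len(t)], z[len(t):]
    e_t = z_t - krige(s, z_s, t, jitter)      # simulated kriging residual
    return krige(s, x, t, jitter) + e_t

# Toy example (not the article's data): 40 irregular locations on [0, 9].
rng = np.random.default_rng(0)
s = np.sort(rng.uniform(0.0, 9.0, size=40))
x = np.sin(2.0 * np.pi * s / 3.0)
t = np.linspace(0.0, 9.0, 91)                 # regular grid
draws = np.array([conditional_draw(s, x, t, rng) for _ in range(20)])
```

Each row of `draws` is one simulated signal on the regular grid. Extrema would then be identified on each row and passed to the envelope construction, and the spread across rows gives a handle on uncertainty.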

We apply the proposed method to a signal from model (1) with frequencies f1 = 6, f2 = 2, and f3 = 1. The top panel of Figure 18 shows 1,024 irregularly spaced observations contaminated by Gaussian noise with SNR = 7. The mean estimate of the signal by the simulation-based SEMD method with 200 replications and its 99% empirical confidence interval are indicated by a solid line and a gray band, respectively, in the second panel of Figure 18. To evaluate the performance of the proposed method, we also apply SEMD and the conventional EMD to the irregularly spaced data shown in the top panel. The dashed lines in the third and fourth panels show the reconstructions by SEMD and EMD, respectively. The MSEs of the proposed simulation-based SEMD, SEMD, and EMD are 0.0059, 0.0072, and 511.603, respectively. Furthermore, Figure 19 shows the mean estimates of the IMFs and residue by the simulation-based SEMD with 200 replications. The results in Figure 19 indicate that the simulation-based SEMD method is capable of effectively decomposing irregularly spaced data.
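
The summaries reported above (a mean estimate, an empirical band, and an MSE) can be computed from a matrix of replications along the following lines. This is a minimal sketch that assumes the empirical confidence interval is a pointwise percentile band across replications; the article does not spell out its construction, and the function names are hypothetical.

```python
import numpy as np

def summarize_replications(reps, level=0.99):
    """Pointwise mean and percentile band across the rows of `reps`.

    `reps` has shape (n_replications, n_grid_points).
    """
    alpha = 1.0 - level
    mean = reps.mean(axis=0)
    lower = np.quantile(reps, alpha / 2.0, axis=0)
    upper = np.quantile(reps, 1.0 - alpha / 2.0, axis=0)
    return mean, lower, upper

def mse(estimate, truth):
    """Mean squared error between an estimate and the true signal."""
    return float(np.mean((np.asarray(estimate) - np.asarray(truth)) ** 2))
```

For example, with 200 replications stored row-wise, `summarize_replications(reps)` returns the curve for the solid line and the two edges of the gray band, and `mse(mean, truth)` compares the mean estimate with the known signal in a simulation study.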

Figure 18

Extension of EMD to irregularly spaced data. (a) the dotted line denotes the mean function of the signal in model (1); (b) the solid line is the estimate by simulation-based SEMD; (c) the dashed line is the estimate by SEMD; and (d) the dashed line is the estimate by the conventional EMD.

Figure 19

The mean estimates of IMFs and residue by the simulation-based SEMD with 200 replications.

Conclusion

In this article, we have proposed a statistical EMD to deal with a noisy signal by combining smoothing techniques and the conventional EMD. The results obtained from various numerical experiments confirm the effectiveness of the statistical EMD method. Furthermore, we have extended EMD to irregularly spaced signals by utilizing simulated extrema. These extensions of the conventional EMD are expected to increase the applications of EMD.

Further studies of the proposed SEMD are needed. The current SEMD algorithm requires the selection of a smoothing parameter, which is computationally expensive and might be an obstacle to handling massive data. Hence, it is necessary to develop a computationally efficient method of smoothing parameter selection. As another possible refinement of SEMD, we would like to investigate the intermittence problem of mode mixing, in which different modes of oscillation coexist in a single IMF. Finally, although SEMD is relatively robust to outliers compared with the conventional EMD, a least-squares-based smoothing method such as kernel smoothing can be affected by outliers in the construction of envelopes. Therefore, a quantile-based EMD would merit further study.

References

  1. Priestley MB: Spectral Analysis and Time Series, vols. 1 and 2. (Academic Press, New York, 1981)

  2. Mallat S: A Wavelet Tour of Signal Processing. (Academic Press, New York, 2009)

  3. Daubechies I: Ten Lectures on Wavelets. (SIAM, Philadelphia, 1992)

  4. Vidakovic B: Statistical Modeling by Wavelets. (John Wiley & Sons, New York, 1999)

  5. Huang NE, Shen Z, Long SR, Wu MC, Shih HH, Zheng Q, Yen NC, Tung CC, Liu HH: The empirical mode decomposition and Hilbert spectrum for nonlinear and nonstationary time series analysis. Proc. Roy. Soc. Lond. A 1998, 454: 903-995. 10.1098/rspa.1998.0193

  6. Huang NE, Shen SSP: Hilbert-Huang Transform and Its Applications. (World Scientific, Singapore, 2005)

  7. Boudraa AO, Cexus JC: EMD-based signal filtering. IEEE Trans. Instrum. Meas. 2007, 56: 2196-2202.

  8. Wu Z, Huang NE: Ensemble empirical mode decomposition: a noise assisted data analysis method. Adv. Adapt. Data Anal. 2009, 1: 1-49. 10.1142/S1793536909000047

  9. Xu Z, Huang B, Zhang F: Improvement of empirical mode decomposition under low sampling rate. Signal Process. 2009, 89: 2296-2303. 10.1016/j.sigpro.2009.04.038

  10. Diop EHS, Alexandre R, Boudraa AO: Analysis of intrinsic mode functions: a PDE approach. IEEE Signal Process. Lett. 2010, 17: 398-401.

  11. Hastie T, Tibshirani R: Generalized Additive Models. (Chapman and Hall, London, 1990)

  12. Rilling G, Flandrin P: One or two frequencies? The empirical mode decomposition answers. IEEE Trans. Signal Process. 2008, 56: 85-95.

  13. Park M, Kim D, Oh HS: A reinterpretation of EMD by cubic spline interpolation. Adv. Adapt. Data Anal. 2011, 3: 527-540. 10.1142/S1793536911000921

  14. Lindeberg T: Scale-Space Theory in Computer Vision. (Kluwer, Boston, 1994)

  15. Silverman BW: Using kernel density estimates to investigate multimodality. J. Roy. Stat. Soc. B 1981, 43: 97-99.

  16. Donoho DL, Johnstone IM: Adapting to unknown smoothness via wavelet shrinkage. J. Am. Stat. Assoc. 1995, 90: 1200-1224. 10.1080/01621459.1995.10476626

  17. Donoho DL, Johnstone IM: Ideal spatial adaptation by wavelet shrinkage. Biometrika 1994, 81: 425-455. 10.1093/biomet/81.3.425

  18. Fan J, Gijbels I: Data-driven bandwidth selection in local polynomial fitting: variable bandwidth and spatial adaptation. J. Roy. Stat. Soc. B 1995, 57: 371-394.

  19. Marron JS, Adak S, Johnstone IM, Neumann MH, Patil P: Exact risk analysis of wavelet regression. J. Comput. Graph. Stat. 1998, 7: 278-309.

  20. Cai TT: Adaptive wavelet estimation: a block thresholding and oracle inequality approach. Ann. Stat. 1999, 27: 898-924. 10.1214/aos/1018031262

Acknowledgements

The work of Hee-Seok Oh was supported by the National Research Foundation of Korea (NRF) grant (No. 2012002712) funded by the Korea government (MEST). The research of Donghoh Kim was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science and Technology (2009-0076223).

Author information

Corresponding author

Correspondence to Hee-Seok Oh.

Additional information

Competing interest

The authors declare that they have no competing interests.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License ( https://creativecommons.org/licenses/by/2.0 ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Kim, D., Kim, K.O. & Oh, HS. Extending the scope of empirical mode decomposition by smoothing. EURASIP J. Adv. Signal Process. 2012, 168 (2012). https://doi.org/10.1186/1687-6180-2012-168

  • DOI: https://doi.org/10.1186/1687-6180-2012-168

Keywords