Ensemble patch transformation: a flexible framework for decomposition and filtering of signal

Kim, Donghoh; Choi, Guebin; Oh, Hee-Seok

doi:10.1186/s13634-020-00690-7

Published: 26 June 2020


EURASIP Journal on Advances in Signal Processing, volume 2020, Article number: 30 (2020)


Abstract

This paper considers the problem of signal decomposition and filtering, extending its scope to signals that cannot be handled effectively by existing methods. At the core of our methodology, we introduce a new approach, termed “ensemble patch transformation,” that provides a framework for decomposition and filtering of signals. It enhances the identification of local characteristics embedded in a signal, which is crucial for signal decomposition, and it allows the design of flexible filters for various data analyses. In the literature, there are data-adaptive decomposition methods such as empirical mode decomposition (EMD) by Huang (Proc. R. Soc. London A 454:903–995, 1998). Along the same lines as EMD, we propose a new decomposition algorithm that extracts essential components from a signal. Some theoretical properties of the proposed algorithm are investigated. To evaluate the proposed method, we analyze several synthetic examples and real signals.

1 Introduction

In this paper, we propose a new method for decomposition and filtering of signals, termed “ensemble patch transformation,” which adopts the multiscale concept of scale-space theory in computer vision [1]. The proposed ensemble patch transformation consists of two key components. The first is the “patch process,” defined as a data-dependent patch of data at a particular time point t. The patch process is designed to identify local structures of data according to the sizes of patches. The second is the “ensemble,” obtained by shifting the time point t of the patch; it represents the temporal variation of data efficiently by enhancing the temporal resolution. Moreover, various statistics obtained from the proposed ensemble patches might be useful for data analysis.

Successful recognition of the local frequency patterns of a signal is crucial for signal decomposition. Empirical mode decomposition (EMD) by [2] identifies such local patterns through local extrema. When the local extrema reflect the time-varying amplitude and frequency, EMD decomposes a signal effectively according to its frequencies. However, when the frequency ratio of the two components in a signal is small, that is, when their frequencies are close, EMD fails to identify a superimposed component; thus, it produces artificial components during the decomposition process. To clarify this problem of EMD and provide motivation for the proposed method, we consider a synthetic signal that consists of two components, X_t= cos(100πt)+4 cos(60πt), t∈[0,1]. Figure 1 shows the signal X_t and its two components.

The middle panel of Fig. 2 illustrates the decomposed results by EMD, where the dotted lines represent the true components, and the solid lines are the extracted components. As one can see, EMD fails to decompose the two components of the signal, where the frequency ratio of the two components is relatively small. In other words, when the local pattern of the high-frequency component is not distinct, EMD does not work correctly to decompose such a signal; hence, it fails to extract the sinusoid components effectively.

We remark that Rilling and Flandrin [3] discussed the ranges of frequency and amplitude ratios over which EMD performs well. On the other hand, the left panel of Fig. 2 presents the decomposition results by the method proposed in Section 3, which identifies the true components efficiently. The right panel shows the decomposition results by ensemble EMD (EEMD) of [4], which cannot extract the true components properly.

The main contribution of our study is as follows. The two procedures, patch process and ensemble process, provide a framework that generates various filters for decomposition, including some existing filters. Through a filter design that reflects the characteristics of data, the proposed method provides a flexible tool for analyzing signals. Hence, it extends the scope of signal decomposition to a broad class of signals that cannot be handled by some conventional methods. Specifically, as discussed later in Section 4, the proposed method can identify special features embedded in the signal, such as sudden changes, seasonalities, and amplitude modulation terms, that are not readily obtainable in conventional ways.

The rest of the paper is organized as follows. Section 2 introduces ensemble patch transformation and investigates its utility for multiscale analysis. In addition, various statistical measures based on the ensemble patch transformation are discussed for data analysis. In Section 3, a new method for signal decomposition is proposed with a practical algorithm. Furthermore, some theoretical properties of the proposed algorithm are investigated. Section 4 presents simulation studies and real data examples to evaluate the empirical performance of the proposed method. In Section 5, as a practical issue of the proposed method, the selection of the size parameter is discussed. Lastly, conclusions are given in Section 6.

We remark that in the literature, there are numerous studies on signal decomposition. Dragomiretskiy and Zosso [5] developed variational mode decomposition (VMD) for tone detection and separation of a signal. VMD first conducts a discrete Fourier transform to detect the frequency information of each mode and then identifies several meaningful modes using the main detected frequencies. This procedure requires the number of modes to be preset for the decomposition. However, it is difficult to know in advance the number of meaningful modes according to their frequency information. As a data-adaptive procedure, Huang et al. [2] proposed empirical mode decomposition (EMD). Due to its robustness to the presence of nonlinearity and nonstationarity, EMD has been applied to various fields. Since EMD is based on an empirical algorithm, it raises several methodological issues such as identification of local frequency patterns and intermittency. There have been many proposals to enhance the performance of the conventional EMD. Wu and Huang [4] developed the ensemble EMD (EEMD), which takes an average of EMD decompositions of noisy copies of the signal, and several authors have proposed its variants. These include the complementary ensemble EMD of [6], the complete ensemble EMD with adaptive noise (CEEMDAN) of [7], and the improved complete ensemble EMD of [8]. Daubechies et al. [9] proposed an alternative to EMD, termed synchrosqueezed wavelet transforms, which is based on reassignment methods of wavelet coefficients. Thakur et al. [10] discussed a selection method of various parameters in the discrete version, and Thakur and Wu [11] and Meignen et al. [12] proposed some methods that are robust to non-uniform samples and noise via synchrosqueezing techniques.

2 Ensemble patch transform

2.1 Multiscale patch transform

In this section, we introduce a multiscale patch transform of a one-dimensional sequence that is designed for data processing and signal decomposition. We first define a patch process of a real-valued univariate process (X_t)_t. A patch for observation (t,X_t) is a polygon containing neighbors of (t,X_t). A patch is a tool capturing local characteristics of a signal. The size of the patch controls the degree of localization, and various shapes of the patch can be employed according to the purpose of data analysis. The patch is formally defined by its shape and size. Let $\mathcal {T}=\{\tau _{i}\}_{i}$ be a set of size parameters for a patch with a certain shape such as rectangle and oval. For $\tau \in \mathcal {T}$, let $P_{t}^{\tau }\left (X_{t}\right)$ denote the patch process for observation (t,X_t) that is generated by a certain shape with size parameter τ. We further define a multiscale patch transform $MP_{t}^{\mathcal {T}}(X_{t})$ for observation (t,X_t) as a sequence of all patches according to different τ_i’s,

$$MP_{t}^{\mathcal{T}}\left(X_{t}\right):=\left\{P_{t}^{\tau_{i}}(X_{t})\right\}_{i=1,\ldots,|\mathcal{T}|}. $$

As one can see, the precise definition of $MP_{t}^{\mathcal {T}}(X_{t})$ depends on the shape of the patch. As for the typical case, rectangle and oval are considered. Of course, we can take other shapes as well.

Rectangle patch: For a given point (t,X_t) and $\tau \in \mathcal {T}$, this patch is centered at the point (t,X_t) and is a closed rectangle formed by the points $\left (t+k, \min _{k\in [-\tau /2,\tau /2]}\left \{X_{t+k}\right \}-0.5\gamma \tau \right)$ and $\left (t+k, \max _{k\in [-\tau /2,\tau /2]}\left \{X_{t+k}\right \}+0.5\gamma \tau \right)$ for k∈[−τ/2,τ/2]. For the rectangle patch, the width is τ and the height $h_{t}^{\tau }$ is

$$h_{t}^{\tau}=\max_{k\in[-\tau/2,\tau/2]}\left\{X_{t+k}\right\}-\min_{k\in[-\tau/2,\tau/2]}\left\{X_{t+k}\right\}+\gamma\tau, $$

where γ is a scale factor.

Oval patch: For a given point (t,X_t) and $\tau \in \mathcal {T}$, this patch is centered at the point (t,X_t) and is characterized by boundaries $\left (t+k, X_{t+k}\pm \gamma \sqrt {\tau ^{2}/4-k^{2}}\right), k\in [-\tau /2,\tau /2]$ where γ is a scale factor. The width of the oval patch is τ, as for the rectangle patch, and the vertical margin $\gamma \sqrt {\tau ^{2}/4-k^{2}}$ around the signal decreases as k moves away from the given point (t,X_t).

For an illustration of the patch process, we consider a deterministic signal X_t=25 cos(0.1πt) cos(πt), 0≤t≤10. We then obtain a sequence $\left \{X_{t_{i}}\right \}_{i=1}^{100}$ with t_i=iT and sampling interval T=1/10 from the continuous signal X_t. Figure 3 shows rectangle patches $P_{t_{i}}^{\tau }(X_{t})$ of the sequence $\left \{X_{t_{i}}\right \}$ constructed at the time points t_i=4,5,6, marked by red dots. We consider three different size parameters τ=4,8,12 for generating patches, and obtain a multiscale patch $MP_{t}^{\mathcal {T}}(X_{t})$ by combining the three patches in Fig. 3a–c. Figure 4 shows patches over the entire time domain with the parameters τ=2 and 4, respectively.

From Figs. 3 and 4 and the definitions, the patch at a particular time point t is an object that contains multiple observations around t; thus, for further statistical analysis, it is necessary to use statistics that summarize the information in $P_{t}^{\tau }\left (X_{t}\right)$ and $MP_{t}^{\mathcal {T}}\left (X_{t}\right)$. For this purpose, we consider a measure $\mathcal {K}\left (P_{t}^{\tau }\left (X_{t}\right)\right)$ that produces a single statistic at time point t. Possible measures $\mathcal {K}(\cdot)$ are of two types: one for central tendency and the other for dispersion. As measures of central tendency, we present the following two in this study. Suppose that we obtain the patch $P_{t}^{\tau }(X_{t})$ for a fixed τ.

$\text {Ave}_{t}^{\tau }\left (X_{t}\right)= \text {average}\left (\left \{X_{t_{i}}\right \}\right)$, where $\left \{X_{t_{i}}\right \}$ denote observations in the patch $P_{t}^{\tau }\left (X_{t}\right)$.
$M_{t}^{\tau }(X_{t})=\frac {1}{2}(L_{t}^{\tau }(X_{t})+U_{t}^{\tau }(X_{t}))$, where $L_{t}^{\tau }(X_{t})$ and $U_{t}^{\tau }(X_{t})$ denote lower and upper envelopes of the patch $P_{t}^{\tau }(X_{t})$, respectively. $M_{t}^{\tau }(X_{t})$ is mean envelope. The lower envelope $L_{t}^{\tau }(X_{t})$ and upper envelope $U_{t}^{\tau }(X_{t})$ of the rectangle patch are
$$\begin{array}{*{20}l} L_{t}^{\tau}(X_{t}) = \min_{k\in[-\tau/2,\tau/2]}\left\{X_{t+k}\right\}-0.5 \gamma \tau, \\ U_{t}^{\tau}(X_{t}) = \max_{k\in[-\tau/2,\tau/2]}\left\{X_{t+k}\right\}+0.5 \gamma \tau, \end{array} $$
respectively. $L_{t}^{\tau }(X_{t})$ and $U_{t}^{\tau }(X_{t})$ of the oval patch are
$$\begin{array}{*{20}l} L_{t}^{\tau}(X_{t}) = \min_{k\in[-\tau/2,\tau/2]} \left\{X_{t+k} - \gamma\sqrt{\tau^{2}/4-k^{2}} \right\}, \\ U_{t}^{\tau}(X_{t}) = \max_{k\in[-\tau/2,\tau/2]} \left\{X_{t+k} + \gamma\sqrt{\tau^{2}/4-k^{2}} \right\}, \end{array} $$
respectively.
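The envelope formulas above can be sketched for a sampled signal as follows. This is a minimal illustration, not the authors' implementation (the paper's own code is the R package EPT): the function name `patch_envelopes`, the restriction to an odd window length τ in samples, and the truncation of windows at the boundaries are all assumptions made here.

```python
import numpy as np

def patch_envelopes(x, tau, gamma=1.0, shape="rectangle"):
    """Lower and upper envelopes (L, U) of the patch centred at each sample.

    Assumptions (not from the paper): tau is an odd window length in samples,
    offsets k run over -m..m with m = tau // 2, and windows are truncated
    at the two ends of the signal.
    """
    x = np.asarray(x, dtype=float)
    n, m = len(x), tau // 2
    lower, upper = np.empty(n), np.empty(n)
    for t in range(n):
        lo, hi = max(0, t - m), min(n, t + m + 1)
        window = x[lo:hi]
        k = np.arange(lo, hi) - t
        if shape == "rectangle":
            # constant margin 0.5 * gamma * tau below the min and above the max
            margin = np.full(window.shape, 0.5 * gamma * tau)
        elif shape == "oval":
            # margin gamma * sqrt(tau^2 / 4 - k^2), largest at the centre k = 0
            margin = gamma * np.sqrt(np.maximum(tau**2 / 4.0 - k**2, 0.0))
        else:
            raise ValueError("shape must be 'rectangle' or 'oval'")
        lower[t] = np.min(window - margin)
        upper[t] = np.max(window + margin)
    return lower, upper

x = np.array([0.0, 1.0, 0.0, -1.0, 0.0])
L, U = patch_envelopes(x, tau=3, gamma=1.0, shape="rectangle")
M = 0.5 * (L + U)   # mean envelope
R = U - L           # envelope range; equals max - min + gamma * tau for the rectangle
```

For the rectangle patch the range `R` reproduces the height $h_{t}^{\tau }$ given earlier, which is a quick consistency check on the sketch.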

As dispersion measures, we consider the following:

$\text {sd}_{t}^{\tau }(X_{t})= \sqrt {\text {Var}(\{X_{t_{i}}\})}$, where $\{X_{t_{i}}\}$ denote observations in the patch $P_{t}^{\tau }(X_{t})$.
$R_{t}^{\tau }(X_{t})=U_{t}^{\tau }(X_{t})-L_{t}^{\tau }(X_{t})$, which represents the envelope range.
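As a small illustration of the patch statistics $\text {Ave}_{t}^{\tau }$ and $\text {sd}_{t}^{\tau }$, the sketch below computes both from the observations inside each patch. The helper name `patch_ave_sd`, the odd window length in samples, the truncated boundary windows, and the use of the sample standard deviation (`ddof=1`) are assumptions of this sketch; the paper does not state which variance estimator is used.

```python
import numpy as np

def patch_ave_sd(x, tau):
    """Ave and sd of the observations in the patch centred at each sample.

    Assumes tau >= 3 is an odd window length in samples and that windows
    are truncated at the boundaries.
    """
    x = np.asarray(x, dtype=float)
    n, m = len(x), tau // 2
    ave, sd = np.empty(n), np.empty(n)
    for t in range(n):
        window = x[max(0, t - m):min(n, t + m + 1)]
        ave[t] = window.mean()
        sd[t] = window.std(ddof=1)   # sample standard deviation (assumed)
    return ave, sd

ave, sd = patch_ave_sd([1.0, 2.0, 3.0, 4.0, 5.0], tau=3)
```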

Figure 5 shows $\text {Ave}_{t}^{\tau }(X_{t})$ and $\text {sd}_{t}^{\tau }(X_{t})$ with size parameters τ=8,32,64 for a noisy signal X_t=25 cos(0.1πt) cos(πt)+σε_t, where σ=1.8 and ε_t denote i.i.d. standard Gaussian random variables. As the value of the size parameter τ increases, the central measure $\text {Ave}_{t}^{\tau }(X_{t})$ becomes smoother and represents the global trend of the observations. On the other hand, the values of $\text {sd}_{t}^{\tau }(X_{t})$ at both boundaries are large, and $\text {sd}_{t}^{\tau }(X_{t})$ becomes larger as τ increases since a large patch contains more observations. Further, it seems that $\text {sd}_{t}^{\tau }(X_{t})$ with τ=64 identifies the temporal variability of the signal well. The statistics are not limited to the above definitions; other measures, such as the trimmed mean and the median absolute deviation of the patch $P_{t}^{\tau }(X_{t})$, can also be used.

2.2 Ensemble patch transform

To improve the temporal resolution of the patch and its measures, we introduce an ensemble patch process of a real-valued univariate process (X_t)_t.

Definition 1

Let (X_t)_t be a real-valued univariate process. Let $\mathcal {T}$ denote a set of size parameters for the patch. For any $\tau \in \mathcal {T}$, the ℓth shifted patch at time point t is defined as $P_{t+\ell }^{\tau }(X_{t}), \ell \in [-\tau /2,\tau /2]$. Then, for a fixed $\tau \in \mathcal {T}$, the collection of all possible shifted patches at time point t is defined as the ensemble patch,

$$EP_{t}^{\tau}(X_{t}):= \left\{P_{t+\ell}^{\tau}(X_{t}) : \ell\in[-\tau/2,\tau/2] \right\}. $$

Finally, as a dictionary, the multiscale ensemble patch process is defined as the sequence of all sets of $EP_{t}^{\tau }(X_{t})$ over various τ’s,

$$MEP_{t}^{\mathcal{T}}(X_{t}):= \left\{ EP_{t}^{\tau}(X_{t}) : \tau\in\mathcal{T} \right\}. $$

For the sequence $\{X_{t_{i}}\}$ in Fig. 3, we generate ensemble rectangle patches at the time points t_i=4,5,6 according to size parameters τ=4,8,12, which are displayed in Fig. 6. A multiscale ensemble patch $MEP_{t}^{\mathcal {T}}(X_{t})$ is obtained by combining the three ensemble patches in Fig. 6a–c.

Similarly, for further data analysis, we consider some statistics of ensemble patch $EP_{t}^{\tau }(X_{t})$. We first consider a measure of each shifted patch $\mathcal {K}(P_{t+\ell }^{\tau }(X_{t}))$ and then obtain an ensemble measure by averaging $\mathcal {K}(P_{t+\ell }^{\tau }(X_{t}))$’s over ℓ’s in [−τ/2,τ/2]. More specifically, we obtain the following ensemble measures for central tendency and dispersion: for a fixed τ, suppose that we obtain the collection of all shifted patches at time point t, $EP_{t}^{\tau }(X_{t})$ of the patch $P_{t}^{\tau }(X_{t})$.

$\text {EAve}_{t}^{\tau }(X_{t})= \text {average}\left (\text {Ave}_{t+\ell }^{\tau }\left (X_{t}\right)\right)$ over ℓ’s, where $\text {Ave}_{t+\ell }^{\tau }(X_{t})$ denotes the simple average of observations in the shifted patch $P_{t+\ell }^{\tau }(X_{t})$.
$EM_{t}^{\tau }(X_{t})=\text {average}\left (M_{t+\ell }^{\tau }\left (X_{t}\right)\right)$ over ℓ’s, where $M_{t+\ell }^{\tau }(X_{t})$ denotes the average of $L_{t+\ell }^{\tau }(X_{t})$ and $U_{t+\ell }^{\tau }(X_{t})$ that are lower and upper envelopes of the patch $P_{t+\ell }^{\tau }(X_{t})$.

$\text {Esd}_{t}^{\tau }(X_{t})= \text {average}\left (\text {sd}_{t+\ell }^{\tau }\left (X_{t}\right)\right)$ over ℓ’s, where $\text {sd}_{t+\ell }^{\tau }(X_{t})$ denotes the standard deviation of observations in the shifted patch $P_{t+\ell }^{\tau }(X_{t})$.
$ER_{t}^{\tau }(X_{t})= \text {average}\left (R_{t+\ell }^{\tau }\left (X_{t}\right)\right)$ over ℓ’s, where $R_{t+\ell }^{\tau }(X_{t})=U_{t+\ell }^{\tau }(X_{t})-L_{t+\ell }^{\tau }(X_{t})$.
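The ensemble measures above share one pattern: compute a statistic for every patch, then average it over the shifts ℓ. A minimal sketch of that pattern for $\text {EAve}_{t}^{\tau }$ is given below. The function names, the odd window length in samples, and the truncated boundary windows are assumptions of this sketch rather than details from the paper.

```python
import numpy as np

def moving_average(v, tau):
    """Centred moving average with windows truncated at the boundaries."""
    v = np.asarray(v, dtype=float)
    m = tau // 2
    csum = np.concatenate(([0.0], np.cumsum(v)))
    idx = np.arange(len(v))
    lo = np.maximum(idx - m, 0)
    hi = np.minimum(idx + m + 1, len(v))
    return (csum[hi] - csum[lo]) / (hi - lo)

def ensemble_average(stat, tau):
    """Average a per-patch statistic over the shifts l = -m..m."""
    return moving_average(stat, tau)

def eave(x, tau):
    """EAve: patch averages Ave, then averaged over all shifted patches."""
    return ensemble_average(moving_average(x, tau), tau)

x = np.arange(20, dtype=float)
smoothed = eave(x, tau=5)
```

The same `ensemble_average` step could be applied to arrays of per-patch values of $M$, sd, or $R$ to obtain $EM$, Esd, or $ER$. Away from the boundaries a linear trend passes through `eave` unchanged, which the example input makes easy to check.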

We obtain the measures $\text {EAve}_{t}^{\tau }(X_{t})$ and $\text {Esd}_{t}^{\tau }(X_{t})$ based on ensemble patches of the noisy signal in Fig. 5 with size parameters τ=8,32,64; they are shown in Fig. 7. As the value of τ increases, the central measure $\text {EAve}_{t}^{\tau }(X_{t})$ becomes smoother, and the dispersion measure $\text {Esd}_{t}^{\tau }(X_{t})$ becomes larger, with relatively large values at both boundaries. Furthermore, comparing the ensemble results with the single patch results in Fig. 5, we make the following observations: (a) The central measure by ensemble patches represents the temporal trend of the underlying function well, compared to that by single patches. (b) The dispersion measure with large τ by ensemble patches identifies the local variability of the underlying function efficiently. (c) The temporal resolution of both measures by ensemble patches is much finer than that of single patches.

We remark that ensemble patches yield various statistics that can be adapted for data analysis. For example, as an alternative central measure, we can consider the median for each patch $P_{t+\ell }^{\tau }(X_{t})$, say Med$_{t+\ell }^{\tau }(X_{t})$, and the corresponding mean of Med$_{t+\ell }^{\tau }(X_{t})$ over ℓ, EMed$_{t}^{\tau }(X_{t})$. These measures are closely related to the filters for decomposition in Section 3. Moreover, the difference between rectangle and oval patches is not significant. It is only noticeable when considering the envelopes $L_{t}^{\tau }(X_{t})$ and $U_{t}^{\tau }(X_{t})$. The mean envelope $M_{t}^{\tau }(X_{t})$ obtained by the rectangle patch is a stair-shaped curve and is undesirable for smoothing. The oval patch, on the other hand, can be useful in capturing the central tendency of the data because the resulting curve is smoother. However, when it is necessary to capture a sudden change in data, as shown later in Fig. 22, the rectangle patch is more useful than the oval patch.

Finally, the thick-pen transformation by [13] is a special case of the above $MEP_{t}^{\mathcal {T}}(X_{t})$ with $L_{t+\ell }^{\tau }(X_{t})$ and $U_{t+\ell }^{\tau }(X_{t})$ at the shifting index ℓ=0.

3 Methods

3.1 Ensemble patch filtering

When a signal consists of several components with their own frequencies, ensemble patch transformation can be utilized as a low-pass or a high-pass filter. Figure 8 illustrates the filtering process of the ensemble mean envelope. The top panel shows a sinusoidal signal X_t= cos(50πt)+ cos(10πt)+2t (t∈[0.35,0.55]) and depicts three shifted rectangle patches covering a point X_t (open circle) at t=0.45. Each ℓth shifted patch produces upper and lower envelopes $U_{t+\ell }^{\tau }(X_{t})$ and $L_{t+\ell }^{\tau }(X_{t})$ at time point t. The black dots denote the mean envelope values $M_{t+\ell }^{\tau }(X_{t})$ at time point t=0.45 for the shifted patches a, b, and c. Similarly, by shifting the patch over the entire time domain, we construct a mean envelope from each shifted patch process. The bottom panel of Fig. 8 shows the three mean envelopes (dotted lines). We note that, although we use only three shifted mean envelopes for illustration purposes, the possible number of shifted mean envelopes for a given point is, in general, the same as the size parameter τ of the patch. Furthermore, we take an ensemble average of those mean envelopes, which results in the ensemble mean envelope marked by the solid line. It seems that the ensemble mean envelope represents a lower-frequency component of the signal.

For demonstrating the utility of this ensemble approach, we consider a synthetic example. Figure 9 shows a signal X_t= cos(50πt)+ cos(10πt)+2t (t∈[0,1]) in white color and its ensemble patch transformation with size parameters τ= 20, 40, 80, 120, 200, and 240, respectively. The lower and upper envelopes $EL_{t}^{\tau }(X_{t}), EU_{t}^{\tau }(X_{t})$ and the mean envelope $EM_{t}^{\tau }(X_{t})$ are obtained by the ensemble approach. The area covered by two envelopes is colored in gray, and the mean envelope is denoted by the solid line. We observe that with size parameter τ=40, the ensemble mean envelope suppresses a high-frequency component cos(50πt). When the size parameter τ is larger than 200, both the oscillating patterns of components cos(50πt) and cos(10πt) are painted over by patch transformation. As the value of τ increases, the ensemble mean envelope tends to suppress the oscillating local pattern and represents the lower-frequency pattern at the same time. The ensemble mean envelope removes the frequency pattern whose period is less than τ. By controlling the size parameter, the mean envelope $EM_{t}^{\tau }(X_{t})$ of the ensemble patch transformation is used as a low-pass filter or high-pass filter.

We further perform the same experiment with measure $\text {EAve}_{t}^{\tau }(X_{t})$ that is average of $\text {Ave}_{t+\ell }^{\tau }(X_{t})$ obtained from ensemble patches. The results $\text {EAve}_{t}^{\tau }(X_{t})$ (solid line) with different τ= 20, 40, 80, 120, 200, and 240 are displayed in Fig. 10. As one can see, the results are almost identical to those of $EM_{t}^{\tau }(X_{t})$ in Fig. 9.
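The low-pass behaviour of $\text {EAve}_{t}^{\tau }$ can be checked numerically on sampled sinusoids. The sketch below is an illustration under stated assumptions, not a reproduction of Fig. 10: the sampled test signals, the window length of 21 samples, and the helper `eave_interior`, which returns only the samples unaffected by the boundaries, are all chosen here for convenience.

```python
import numpy as np

def eave_interior(x, tau):
    """EAve as two passes of a length-tau moving average, interior samples only."""
    kernel = np.ones(tau) / tau
    once = np.convolve(x, kernel, mode="valid")     # patch averages Ave
    return np.convolve(once, kernel, mode="valid")  # average over the shifts

i = np.arange(420)
tau = 21
high = np.cos(2 * np.pi * i / 21)    # period equal to the window length
low = np.cos(2 * np.pi * i / 210)    # period ten times the window length

high_out = eave_interior(high, tau)  # numerically zero
low_out = eave_interior(low, tau)    # passes with only mild attenuation
```

In this setting a component whose period equals the window length is removed exactly, while the slower component keeps most of its amplitude, which is consistent with the role of τ as a cut-off period described above.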

3.2 Decomposition by ensemble patch filtering

By using the above notion of filters, we would like to decompose a signal into a high-frequency component and a low-frequency residue component. We consider a signal X_t= cos(90πt)+ cos(10πt), t∈[0,1], shown in Fig. 11.

A snapshot of the decomposition procedure by ensemble patch filtering is displayed in Fig. 12. From top to bottom and left to right, the first panel illustrates a low-frequency mode, say, LF₁, that is the ensemble mean envelope of $EP_{t}^{\tau }(X_{t})$ for a given τ, and the next panel shows the corresponding high-frequency mode HF₁=X−LF₁. As one can see, an apparent low-frequency mode still exists in HF₁. The third panel shows the ensemble mean envelope of HF₁, say, LF₂, which seemingly identifies the low-frequency mode of HF₁ in the second panel. A new high-frequency mode HF₂=HF₁−LF₂=X−LF₁−LF₂ is now obtained. In the next iteration, a further ensemble mean envelope of HF₂, say, LF₃, is almost constant; hence, the corresponding high-frequency mode HF₃=HF₂−LF₃=X−LF₁−LF₂−LF₃ represents the true high-frequency component well. Thus, an iterative procedure is required, along the lines of the sifting process of EMD.

From the above discussion, we propose a practical decomposition algorithm based on ensemble patch filtering. Let $\mathcal {G}_{t}^{\tau }(X_{t})$ be a generic central measure of $\{P^{\tau }_{t+\ell }(X_{t})\}_{\ell }$, where $P^{\tau }_{t+\ell }(X_{t})$ is the ℓth shifted patch at time t for a given τ. Suppose that a signal X_t consists of a high-frequency component h_t and a low-frequency component g_t as X_t=h_t+g_t.

Obtain an initial component $\hat {h}_{t}^{(0)}=X_{t}-\mathcal {G}_{t}^{\tau }\left (X_{t}\right)$.
Iterate, until convergence, the following step for k=0,1,…,
$$\hat{h}_{t}^{(k+1)}=\hat{h}_{t}^{(k)}-\mathcal{G}_{t}^{\tau}\left(\hat{h}_{t}^{(k)}\right). $$
Take the converged estimate as the extracted component for h_t.
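The steps above can be sketched in Python with $\mathcal {G}_{t}^{\tau }=\text {EAve}_{t}^{\tau }$. This is a minimal sketch, not the authors' R implementation: the function names, the padding used at the boundaries (`pad_mode`), and the stopping rule (stop when the filtered residual falls below `tol`, or after `max_iter` iterations) are assumptions, since the text only says to iterate until convergence.

```python
import numpy as np

def eave(x, tau, pad_mode="reflect"):
    """EAve with an odd window of tau samples; boundary padding is an assumption."""
    x = np.asarray(x, dtype=float)
    m = tau // 2
    kernel = np.ones(2 * m + 1) / (2 * m + 1)
    padded = np.pad(x, 2 * m, mode=pad_mode)
    once = np.convolve(padded, kernel, mode="valid")   # patch averages
    return np.convolve(once, kernel, mode="valid")     # averaged over shifts

def extract_high_frequency(x, tau, filt=eave, tol=1e-8, max_iter=200, **kwargs):
    """Iterate h <- h - G(h), starting from h = x - G(x)."""
    x = np.asarray(x, dtype=float)
    h = x - filt(x, tau, **kwargs)
    for _ in range(max_iter):
        low = filt(h, tau, **kwargs)
        h = h - low
        if np.max(np.abs(low)) < tol:   # hypothetical stopping rule
            break
    return h

i = np.arange(420)
h_true = np.cos(2 * np.pi * i / 21)    # period of 21 samples
g_true = np.cos(2 * np.pi * i / 420)   # slow component
h_hat = extract_high_frequency(h_true + g_true, tau=21, pad_mode="wrap")
g_hat = (h_true + g_true) - h_hat      # low-frequency residue
```

The example uses periodic padding (`"wrap"`) only because both components are periodic over the 420 samples, so boundary effects vanish; for general signals some other boundary treatment would be needed, and any central measure from Section 2.2 could be passed as `filt`.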

We have some remarks regarding the aforementioned algorithm. (a) Choice of $\mathcal {G}_{t}^{\tau }$: it is feasible to use various choices of $\mathcal {G}_{t}^{\tau }$, including some central measures introduced in Section 2.2, which is the main benefit of utilizing ensemble patch transformation. (b) Choice of τ: the size parameter τ corresponds to a period in the time domain. Thus, the parameter τ plays a crucial role in the quality of the extracted low-frequency component. A selection method of τ will be discussed later.

We now discuss a convergence property of the above algorithm under some conditions.

Theorem 1

Suppose that we observe a real-valued sequence (X_t)_t from a model X_t=h_t+g_t, where $\{h_{t}\}, t \in \mathbb {R}$ is a periodic sequence with $h_{t}=h_{t+\tau _{0}}$ and $\int _{0}^{\tau _{0}}h_{t}\,dt=0$, and g_t is a signal such that $|G(\omega)|=0$ for $\omega \in \Omega _{\tau _{0}}:=\left \{\omega :\omega =\frac {2\pi k}{\tau _{0}}+2n\pi,~\textup {for all } k=1,\dots,\tau _{0}-1~\textup {and } n\in \mathbb {Z}\right \}$, where G(ω) denotes the Fourier transform of g_t. Then, for a given τ₀, we obtain that $\hat {h}_{t}^{(k)} \to h_{t}$ as $k \to \infty $, where $\hat {h}_{t}^{(k+1)}=\hat {h}_{t}^{(k)}-\textup {EAve}_{t}^{\tau _{0}}\left (\hat {h}_{t}^{(k)}\right)$ and $\hat {h}_{t}^{(0)}=X_{t}-\textup {EAve}_{t}^{\tau _{0}}(X_{t})$.

Proof

$\textup {EAve}_{t}^{\tau _{0}}(X_{t})$ can be expressed as $ \textup {EAve}_{t}^{\tau _{0}}(X_{t})=\phi ^{\tau _{0}}_{t} * \phi ^{\tau _{0}}_{t} * X_{t}, $ where $\phi ^{\tau _{0}}_{t}$ is a rectangular (boxcar) function defined as

$$\phi^{\tau_{0}}_{t}=\left\{ \begin{array}{ll} \frac{1}{\tau_{0}}, &~|t|<\frac{\tau_{0}}{2} \\ 0, &~\text{otherwise}. \end{array} \right. $$

Let $\xi ^{\tau _{0}}_{t}=\phi ^{\tau _{0}}_{t} * \phi ^{\tau _{0}}_{t}$. Since $\hat {h}_{t}^{(0)}=(\delta _{t}-\xi ^{\tau _{0}}_{t})*X_{t}$, $\hat {h}_{t}^{(k)}$ can be written as $ \hat {h}_{t}^{(k)}=(\delta _{t}-\xi ^{\tau _{0}}_{t})^{*(k+1)}*X_{t}, $ where δ_t denotes the Kronecker delta function and $u^{*k}={\underbrace {u*u*\dots *u}_k}$ denotes the convolution power. Because h_t is τ₀-periodic and integrates to zero over one period, $\phi ^{\tau _{0}}_{t} * h_{t}=0$, so h_t is left unchanged by every iteration. In addition, $\Xi ^{\tau _{0}}(\omega)={\mathcal {F}}\{\xi ^{\tau _{0}}_{t}\}=\left (\text {sinc}(\tau _{0}\omega /2)\right)^{2}$. Thus, it follows that $0\le 1-\Xi ^{\tau _{0}}(\omega)<1$ for $\omega \notin \Omega _{\tau _{0}}$. Furthermore, from the assumption of |G(ω)|=0 for $\omega \in \Omega _{\tau _{0}}$, we conclude that $ \left |\left (1-\Xi ^{\tau _{0}}(\omega)\right)^{k} G(\omega)\right | \to 0 $ as k→∞, and hence $\hat {h}_{t}^{(k)} \to h_{t}$. □
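The two facts used in the proof, that EAve acts as a convolution with a boxcar applied twice and that its frequency response vanishes at the frequencies $2\pi k/\tau _{0}$, can be checked numerically in discrete time. The sketch below assumes a discrete boxcar of τ₀ = 7 samples; the value of τ₀ and the helper `response` are choices made for illustration only.

```python
import numpy as np

tau0 = 7
box = np.ones(tau0) / tau0
xi = np.convolve(box, box)   # boxcar * boxcar: a triangular kernel

def response(omega):
    """Magnitude of the discrete-time Fourier transform of xi at omega."""
    n = np.arange(len(xi))
    return np.abs(np.sum(xi * np.exp(-1j * omega * n)))

# Frequencies 2*pi*k/tau0, k = 1..tau0-1, where the response should vanish
omega_k = 2 * np.pi * np.arange(1, tau0) / tau0
at_zeros = np.array([response(w) for w in omega_k])

# Response on a grid of frequencies in [0, pi]; it should never exceed 1
grid = np.linspace(0.0, np.pi, 1001)
gain = np.array([response(w) for w in grid])
```

Since the gain lies in [0, 1], the factor one minus the gain also lies in [0, 1] and equals 1 only where the gain vanishes, which is the mechanism behind the convergence argument.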

We now extend the above result with a general filter beyond the double average filter $\textup {EAve}_{t}^{\tau }(X_{t})$.

Definition 2

Let X_t be a continuous-time signal. For any t, an iterative representation of X_t with filter ${\mathcal {M}}$ is defined as $IR_{t}(X_{t},{\mathcal {M}}):={\lim }_{k\rightarrow \infty } IR_{t}^{(k)}(X_{t},{\mathcal {M}})$, where $IR_{t}^{(k)}(X_{t},{\mathcal {M}})=IR_{t}^{(k-1)}(X_{t},{\mathcal {M}})+{\mathcal {M}}(X_{t}-IR_{t}^{(k-1)}(X_{t},{\mathcal {M}}))$ and $IR_{t}^{(1)}(X_{t},{\mathcal {M}})={\mathcal {M}}X_{t}$. Furthermore, X_t is said to be (iteratively) representable with filter ${\mathcal {M}}$ if $IR_{t}(X_{t},{\mathcal {M}})=X_{t}$ for all t.

We note that if $X_{t}={\mathcal {M}}X_{t}$ for all t, then X_t is (iteratively) representable with filter $\mathcal {M}$.
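The link between Definition 2 and the algorithm of Section 3.2 can be made explicit by unrolling the recursion. In the display below, $I$ denotes the identity operator and $(I-{\mathcal {M}})^{k}$ denotes k repeated applications of $X\mapsto X-{\mathcal {M}}X$; both are notation introduced here for illustration and are not used in the original text.

$$\begin{aligned} X_{t}-IR_{t}^{(1)}(X_{t},{\mathcal{M}}) &= X_{t}-{\mathcal{M}}X_{t}=(I-{\mathcal{M}})X_{t}, \\ X_{t}-IR_{t}^{(k)}(X_{t},{\mathcal{M}}) &= \left(X_{t}-IR_{t}^{(k-1)}(X_{t},{\mathcal{M}})\right)-{\mathcal{M}}\left(X_{t}-IR_{t}^{(k-1)}(X_{t},{\mathcal{M}})\right) \\ &= (I-{\mathcal{M}})\left(X_{t}-IR_{t}^{(k-1)}(X_{t},{\mathcal{M}})\right) = (I-{\mathcal{M}})^{k}X_{t}. \end{aligned} $$

Under this reading, with $\mathcal {G}_{t}^{\tau }={\mathcal {M}}$ the iterate $\hat {h}_{t}^{(k)}$ of Section 3.2 coincides with $X_{t}-IR_{t}^{(k+1)}(X_{t},{\mathcal {M}})$, because $\hat {h}_{t}^{(0)}=(I-{\mathcal {M}})X_{t}$ and each step applies $I-{\mathcal {M}}$ once more.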

Definition 3

Let X_t denote a continuous-time signal. Suppose that X_t consists of two components as X_t=h_t+g_t for all t. The component h_t is said to be cancelable from X_t with filter ${\mathcal {M}}$ if ${\mathcal {M}} X_{t}= {\mathcal {M}}g_{t}$ for all t.

Theorem 2

Let X_t denote a continuous-time signal. Suppose that X_t consists of two components as X_t=h_t+g_t for all t. Assume that (a) g_t is (iteratively) representable with filter $\mathcal {M}$, and (b) h_t is cancelable from X_t with filter $\mathcal {M}$. Then, it follows that $X_{t}-IR_{t}(X_{t},{\mathcal {M}})=h_{t}$ for all t.

A proof is directly obtained by the above definitions.

Lemma 1

Let X_t denote a continuous-time signal. Suppose that |G(ω)|=0 for $\omega \in \Omega _{\tau _{0}}$, where G(ω) is the Fourier transform of X_t. Define the filter $\mathcal {M}$ as ${\mathcal {M}} X_{t} = \textup {EAve}_{t}^{\tau _{0}}(X_{t})$. Then, X_t is (iteratively) representable with filter $\mathcal {M}$.

A proof of Lemma 1 is easily obtained from proof of Theorem 1; hence, we omit it.

Suppose that X_t consists of two components as X_t=h_t+g_t. If $\mathcal {M}$ is a linear filter such that ${\mathcal {M}} h_{t} =0$, then h_t is cancelable from X_t with filter $\mathcal {M}$. Thus, we obtain the following result that extends the convergence property of Lemma 1 with ${\mathcal {M}} X_{t} = \textup {EAve}_{t}^{\tau _{0}}(X_{t})$ to a general linear filter ${\mathcal {M}}$ under some conditions.

Corollary 1

Let X_t be a continuous-time signal. Suppose that X_t consists of two components as X_t=h_t+g_t for all t. Define the filter $\mathcal {M}$ as $ {\mathcal {M}} X_{t} = \textup {EAve}_{t}^{\tau _{0}}(X_{t}). $ Assume that

|G(ω)|=0 for $\omega \in \Omega _{\tau _{0}}$, where G(ω) is the Fourier transform of g_t.
h_t satisfies $h_{t}=h_{t+\tau _{0}}$ and $\int _{t=0}^{\tau _{0}}h_{t}dt=0$.

Then, it follows that $X_{t}-IR_{t}(X_{t},{\mathcal {M}})=h_{t}$ for all t.

As a final remark, we consider a simple example of designing an ideal filter that illustrates the strength of our method. To simplify our discussion, suppose that we observe a discrete signal such that $X_{i}=\sum _{i \in \mathbb {Z}} X_{t}\delta (t-i)$. Suppose that X_i consists of two components h_i and g_i, i.e., X_i=h_i+g_i, where component h_i satisfies $h_{i}=h_{i+3}$ and $\sum _{i=1}^{3}h_{i}=0$, and g_i is a component whose value suddenly changes from −1 to 1 at i=0 as in Table 1.

Table 1 Component {g_i} of a signal X_i=h_i+g_i, with $h_{i}=h_{i+3}$ and $\sum _{i=1}^{3}h_{i}=0$


We define the filter ${\mathcal {M}}$ as $ {\mathcal {M}} X_{i} = \textup {EAve}_{i}^{\tau _{0}}(X_{i}), \tau _{0}=3. $ Note that component h_i is cancelable from X_i with filter $\mathcal {M}$, but component g_i is not (iteratively) representable with filter ${\mathcal {M}}$ since |G(ω)|>0 for some $\omega \in \left \{\omega :~\omega =\frac {2\pi k}{3}\pm 2n\pi,\text { for all }~k=1,2 \text { and } n\in \mathbb {N}\right \}$, where G(ω) is the Fourier transform of g_i. Thus, h_i cannot be obtained from the iterative procedure $X_{i}-IR_{i}(X_{i},{\mathcal {M}})$, because filters of the moving average class are not suitable for expressing data with a sharp mean change such as g_i. It is generally known that data with such a sharp mean change can be easily represented by a median filter [14]. In particular, the component g_i used in the example is a root signal since it does not change even if it passes through the median filter repeatedly. Hence, the convergence property is ensured. In summary, the average filter $ \text {Ave}^{\tau }_{t} (X_{t})$ is advantageous for canceling h_i, but it cannot represent g_i properly. On the other hand, the median filter $\text {Med}^{\tau }_{t} (X_{t})$ is not capable of canceling h_i, but is useful for expressing g_i. A combination of both filters might lead to the desired decomposition, which is feasible under the ensemble patch transform framework but not under the patch transform alone. This is a benefit of the proposed transformation, which combines the patch process and the ensemble process.

We now consider a composite filter ${\mathcal {M}}^{*}$ constructed from an average filter of the patch process and a median filter of the ensemble process, that is, $ {\mathcal {M}}^{*} X_{i} = \text {median}\left (\text {Ave}_{i+\ell }^{\tau _{0}}(X_{i})\right), \tau _{0}=3. $ Due to the property of the linear filter and the condition $\sum _{i = 1}^{3} h_{i} = 0$, it follows that $\text {Ave}_{i+\ell }^{\tau _{0}}(X_{i})=\text {Ave}_{i+\ell }^{\tau _{0}}(h_{i})+\text {Ave}_{i+\ell }^{\tau _{0}}(g_{i})=\text {Ave}_{i+\ell }^{\tau _{0}}(g_{i})$. So, this filter separates the two components h_i and g_i, and cancels h_i. In the example, the value of $ \text {Ave}_{i + \ell }^{\tau _{0}}(g_{i})$ for each ℓ is listed in Table 2. Then, by passing the median filter as the second filter, we obtain a signal $\left \{\ \dots, -1, -\tfrac {1}{3}, -\tfrac {1}{3},1, \dots \right \}$, which completely represents component g_i except at i∈{−1,0}. An iterative calculation of $IR_{i}^{(k)}(X_{i},{\mathcal {M}}^{*}) (k \ge 1)$ provides the result in Table 2. In the example, note the difference in convergence between the filters $\mathcal {M}$ and ${\mathcal {M}}^{*}$. As the iteration progresses with k→∞, the iterative procedure $X_{i}-IR_{i}^{(k)}(X_{i}, {\mathcal {M}})$ does not converge to component {h_i} at all, while $X_{i}-IR_{i}^{(k)} (X_{i}, {\mathcal {M}}^{*})$ converges to component h_i on $\mathbb {Z} \setminus \{- 1, 0 \}$.
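The behaviour of the composite filter ${\mathcal {M}}^{*}$ can be explored with a short numerical sketch. Several details below are assumptions made for illustration, because the tables are not reproduced here: the step is taken as g_i=−1 for i<0 and g_i=1 for i≥0, the periodic component is the hypothetical pattern (1, −2, 1), the signal is truncated to i=−30,…,29 with edge padding, and the function names are invented for this sketch.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

TAU0 = 3

def ave_then_median(x):
    """Composite filter: Ave over each length-3 patch, then median over shifts."""
    padded = np.pad(np.asarray(x, dtype=float), 2, mode="edge")  # boundary choice assumed
    ave = sliding_window_view(padded, TAU0).mean(axis=1)
    return np.median(sliding_window_view(ave, TAU0), axis=1)

def iterative_representation(x, filt, n_iter):
    """IR^(k) from Definition 2: IR <- IR + M(X - IR), starting from M X."""
    ir = filt(x)
    for _ in range(n_iter - 1):
        ir = ir + filt(x - ir)
    return ir

i = np.arange(-30, 30)
h = np.array([1.0, -2.0, 1.0])[i % 3]   # period 3, sums to zero over a period
g = np.where(i < 0, -1.0, 1.0)          # assumed location of the jump
x = h + g

residual = x - iterative_representation(x, ave_then_median, n_iter=5)

# Compare with h away from the jump and away from the padded boundaries
keep = (np.abs(i) <= 15) & ~np.isin(i, [-1, 0])
max_error = np.max(np.abs(residual[keep] - h[keep]))
```

Under these assumptions the residual matches h_i everywhere in the compared range except at i∈{−1,0}, in line with the convergence on $\mathbb {Z} \setminus \{- 1, 0 \}$ stated above.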

Table 2 An iterative calculation of $IR_{i}^{(k)}(X_{i},{\mathcal {M}}^{*})$ for a signal X_i of Table 1

4 Results and discussion

In this section, we conduct a numerical study with various examples to assess the practical performance of the proposed method, which is implemented by the algorithm in Section 3.2. We compare the proposed decomposition method with EMD, EEMD, and CEEMDAN, and we design various types of filters that reflect the characteristics of a given signal. The R package EPT, used to implement the methods and to carry out the experiments, is available at https://CRAN.R-project.org/package=EPT so that the results can be reproduced.

4.1 Decomposition of non-stationary piecewise signal

We consider a non-stationary signal that is defined piecewise by a high-frequency component and a low-frequency component, X_t= cos(90πt)I(t≤0.5)+ cos(10πt)I(t>0.5), t∈[0,1]. Figure 13 shows the signal sampled at 1000 equally spaced points on [0,1]. Huang et al. [2, 15] pointed out that EMD fails to decompose a signal with mode mixing, which means that different modes of oscillations coexist in a single intrinsic mode function (IMF).

On the other hand, the proposed method is capable of locally suppressing the high-frequency mode whose period is less than some size parameter. The dotted and solid lines in Fig. 14 represent the true components and the components extracted by each method, respectively. From the results, we observe that the proposed method outperforms EMD, EEMD, and CEEMDAN. Here, we use $\text {median}\left (\text {Ave}^{\tau }_{t+\ell }(X_{t})\right)$ as the central measure with size parameter τ=21 for our method.
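As an illustration of the local suppression described here, the following sketch generates the piecewise signal and applies a single pass of a median-of-averages filter with τ=21. It assumes that the patch average is a centred moving average over τ samples, that the shifts ℓ range over a centred window of the same width, and that edges are padded by repetition. It is not the implementation in the EPT package, and the helper `moving_stat` is hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def moving_stat(x, width, stat):
    """Apply stat (np.mean or np.median) over a centred window of odd width."""
    half = width // 2
    padded = np.pad(x, half, mode="edge")
    return stat(sliding_window_view(padded, width), axis=1)


t = np.linspace(0.0, 1.0, 1000)
x = np.where(t <= 0.5, np.cos(90 * np.pi * t), np.cos(10 * np.pi * t))

tau = 21
ave = moving_stat(x, tau, np.mean)      # assumed form of the patch averages
low = moving_stat(ave, tau, np.median)  # median over the shifts
high = x - low                          # extracted high-frequency part

left = (t > 0.1) & (t < 0.4)    # region where only cos(90*pi*t) is present
right = (t > 0.6) & (t < 0.9)   # region where only cos(10*pi*t) is present
print(np.corrcoef(high[left], x[left])[0, 1])
print(np.abs(high[right]).max())
```

In this sketch the extracted part follows the fast oscillation on the left half and stays close to zero on the right half, away from the change point and the boundaries.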

4.2 Decomposition of noisy signal

We evaluate the robustness of the proposed decomposition to noise. We generate a noisy signal X_t= cos(90πt)+ cos(10πt)+ε_t, where ε_t denotes Gaussian errors with signal-to-noise ratio 7. The decomposition results by the proposed method, EMD, EEMD, and CEEMDAN are shown in Fig. 15. As one can see, EMD is sensitive to noise. The effect of non-informative fluctuation distorts the subsequent decomposition results of EMD, which is due to the interpolation process in the construction of envelopes based on local extrema.

On the other hand, the proposed method is robust to noise since the decomposition proceeds without identifying fluctuations. The decomposition results in Fig. 15 support this fact. If we regard noise as the fluctuation with the highest frequency, the proposed method with a relatively small τ might separate the noise term from the signal. By taking the size parameter τ=10, the noisy signal is decomposed into the highest-frequency component, which is the noise, and a low-frequency residue component, which corresponds to the underlying signal. The low-frequency residue component is then decomposed again with the size parameter τ=21. As the central measure, the ensemble average $\text {EAve}_{t}^{\tau }$ is used for the noisy signal. We notice that EEMD and CEEMDAN also work well for the decomposition of the noisy signal.

4.3 Analysis of beat signal

Suppose that we have a signal X_t=h_t+g_t:= cos(62πt)+ cos(58πt), where the frequencies of h_t and g_t are very close. A signal composed of two components with similar frequencies generates a beat signal, as shown in Fig. 16a. It turns out that h_t and g_t can be separated by $X_{t} - IR_{t}^{(k)}(X_{t},{\mathcal {M}})$ and $IR_{t}^{(k)}(X_{t},{\mathcal {M}})$, where ${\mathcal {M}}X_{t}=\text {EAve}_{t}^{\tau _{0}}(X_{t})$ and τ₀=29 is the period of the component g_t. Clearly, $IR_{t}^{(k)}(X_{t},{\mathcal {M}}) \to g_{t}$ and thus, $X_{t} - IR_{t}^{(k)}(X_{t},{\mathcal {M}}) \to h_{t}$ as k→∞ by Theorem 1. Figure 16b–d show $X_{t}-IR_{t}^{(k)}(X_{t},{\mathcal {M}})$ for k=10,200,500, and Fig. 16e shows $IR_{t}^{(k)}(X_{t},{\mathcal {M}})$ with k=500. We observe that the two components are separated with a sufficiently large k.

The decomposition results may be meaningful, but the procedure is not practical because it requires too many iterations to extract the desired signal.

For a different view of the beat signal, we consider the signal shown in Fig. 17a, which can be interpreted as the signal cos(60πt) multiplied by the amplitude-modulating term 2 cos(2πt), i.e., X_t= cos(62πt)+ cos(58πt)=2 cos(2πt) cos(60πt). Figure 17b and c display the frequency component and the amplitude modulation of X_t, respectively. Figure 17d shows the upper and lower envelopes of the ensemble patch transform and indicates that the ensemble envelope range $ER_{t}^{\tau }$ with size parameter τ=30 carries information about the amplitude modulation. The absolute value of the amplitude modulation 2 cos(2πt), marked by the solid line in Fig. 17e, is well approximated by $ER_{t}^{\tau }/{2}$, marked by the dashed line in Fig. 17e. Note that the approximation is poor where the amplitude modulation 2 cos(2πt) is close to zero. If signal X_t is filtered by the approximated absolute amplitude modulation as $\frac {X_{t}}{ER_{t}^{\tau }/{2}}$, as shown in Fig. 17f, additional information about the amplitude modulation can be obtained.
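The relation between the envelope range and the amplitude modulation can be sketched as follows. The code assumes that the upper and lower envelopes are running maxima and minima over a centred window of about τ samples, averaged over centred shifts, and that the signal is sampled at 1000 equally spaced points on [0,1]. Neither assumption is taken from the definitions in Section 2, and the helper `ensemble_envelope` is hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def ensemble_envelope(x, tau, stat):
    """Running max or min over about tau samples, averaged over centred shifts."""
    half = tau // 2
    width = 2 * half + 1
    env = stat(sliding_window_view(np.pad(x, half, mode="edge"), width), axis=1)
    return sliding_window_view(np.pad(env, half, mode="edge"), width).mean(axis=1)


t = np.linspace(0.0, 1.0, 1000)
x = np.cos(62 * np.pi * t) + np.cos(58 * np.pi * t)
amplitude = 2 * np.cos(2 * np.pi * t)
# Sum-to-product identity behind the beat interpretation.
assert np.allclose(x, amplitude * np.cos(60 * np.pi * t))

tau = 30
upper = ensemble_envelope(x, tau, np.max)
lower = ensemble_envelope(x, tau, np.min)
half_range = (upper - lower) / 2  # plays the role of ER / 2
print(np.corrcoef(half_range, np.abs(amplitude))[0, 1])
```

The printed correlation indicates how closely the half range tracks |2 cos(2πt)| in this sketch; the largest discrepancies are expected near the zeros of the modulation.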

Figure 18a shows the upper and lower envelopes of the ensemble patch transform for the filtered signal. By multiplying the ensemble envelope range for the filtered signal (Fig. 18b) by the ensemble envelope range at the first stage (Fig. 18c), the approximation can be improved; see the dashed line in Fig. 18d. The approximation is significantly improved where the amplitude modulation 2 cos(2πt) is close to zero. Therefore, the ensemble patch transform can be applied to deduce information about the amplitude modulation when the amplitude information is mixed with frequency information. One restriction is that the amplitude-modulating component should be positive. Despite this constraint, the above filtering method is practically applicable since the amplitude-modulated component of a real-world signal can be positive. See the analysis of the solar radiation data below.

4.4 Analysis of solar radiation data

In this example, we analyze solar radiation data that were observed hourly during September 2003 at three cities in South Korea (Seoul, Daegu, and Busan), which are shown in Fig. 19. The data are available from the Korea Meteorological Administration (https://data.kma.go.kr). Daegu and Busan, located in the southeast part of the Korean Peninsula, are geographically close to each other, whereas Seoul is located in the middle of the Peninsula. Besides, Daegu and Busan were severely damaged by a typhoon named “MAEMI” in September 2003, while Seoul was hardly affected by the typhoon.

Table 3(a) lists the correlations among the solar radiation series observed at Seoul, Daegu, and Busan. Contrary to the usual expectation, the correlations are very similar since the daily effect dominates the time series of solar radiation. The solar radiation data can be interpreted as the product of a periodic component and an amplitude-modulating component. It is therefore necessary to separate the periodic pattern and the amplitude modulation for a better understanding of the climatic similarity among the three cities.

Table 3 Correlation of solar radiations between Seoul, Daegu, and Busan

To reveal the effects of interest, we eliminate the daily effect by using the upper envelope $EU_{t}^{\tau }(\cdot)$ with τ=24. The dashed lines in Fig. 19 represent the upper envelopes of the solar radiation data extracted by the proposed method. It seems that the dominant daily effects are successfully removed. For further evaluation, we compute the correlations among the upper envelopes of the solar radiation data, which are listed in Table 3(b). As one can see, the correlations of Seoul–Daegu and Seoul–Busan decrease dramatically compared to the correlations before the transform. Thus, Daegu and Busan have similar climatic characteristics, and Seoul seems to be different from the other two cities, which is consistent with our intuition.
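The effect of removing a dominant daily cycle before computing correlations can be illustrated on synthetic data. The sketch below does not use the KMA data. The series `city_a` and `city_b` are hypothetical products of a common daily pattern and two opposite slow modulations, and the upper envelope is assumed to be a running maximum over about τ=24 samples averaged over centred shifts.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def upper_envelope(x, tau):
    """Running maximum over about tau samples, averaged over centred shifts."""
    half = tau // 2
    width = 2 * half + 1
    running_max = sliding_window_view(np.pad(x, half, mode="edge"), width).max(axis=1)
    return sliding_window_view(np.pad(running_max, half, mode="edge"), width).mean(axis=1)


# Synthetic hourly data for 30 days (assumption, for illustration only).
hours = np.arange(24 * 30)
daily = np.clip(np.sin(2 * np.pi * hours / 24), 0.0, None)  # zero at night
slow = 0.5 * np.sin(2 * np.pi * hours / hours.size)
city_a = (1.0 + slow) * daily
city_b = (1.0 - slow) * daily

tau = 24
inner = slice(tau, -tau)  # ignore samples affected by the edge padding
raw_corr = np.corrcoef(city_a, city_b)[0, 1]
env_corr = np.corrcoef(upper_envelope(city_a, tau)[inner],
                       upper_envelope(city_b, tau)[inner])[0, 1]
print(raw_corr, env_corr)
```

In this construction the raw series are positively correlated because they share the daily cycle, whereas the envelopes expose the opposite modulations.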

We remark that the data in this example can be considered as a typical multiplicative model. However, the log transformation, which is the most popular technique for dealing with a multiplicative model, is not suitable for this example. The main reason is that about half of the observations are zero, as shown in Fig. 19, since there is no solar radiation during the night. On the other hand, the proposed $U_{t}^{\tau }(\cdot)$ filter does not suffer from this problem. Of course, it is feasible to use a log transform after adding a small value ε such as 10⁻¹⁰ to the data; however, this makes the daily effect stronger. The log transformation is a technique that suppresses the amplitude-modulating effects rather than extracting them; hence, it is not appropriate when one is interested in the amplitude-modulation signals themselves.

4.5 Analysis of electricity data

We analyze the US electricity production data recorded monthly from January 1973 to December 2005, which can be obtained from the R package TSA. The electricity data seem to have a global trend and a seasonal component. A typical approach to analyzing such data consists of two steps: stabilize the variance of the time series through a transform such as a log transform, and then extract the seasonality. We now interpret the electricity signal as a product of the periodic component and the amplitude-modulated component and decompose the data in the following order: stabilize the volatility of the data by inferring the amplitude-modulated component, and then remove the seasonal component by using the double average filter ${\mathcal {M}}X_{t}=\text {EAve}_{t}^{\tau }(X_{t})$.

Figure 20a shows the upper envelope $EU_{t}^{\tau }(X_t)$ and the lower envelope $EL_{t}^{\tau }(X_t)$, and Fig. 20b shows $\tilde {X}_{t} := \frac {X_{t}}{ER_{t}^{\tau }(X_{t})}$, which seemingly stabilizes the volatility of X_t, where $ER_{t}^{\tau }(X_t)=EU_{t}^{\tau }(X_t)-EL_{t}^{\tau }(X_t)$. We now separate the trend and the seasonality of X_t by taking a double average filter ${\mathcal {M}}\tilde {X}_{t}=\text {EAve}_{t}^{\tau }(\tilde {X}_{t})$ with τ=12. The trend and the seasonality are effectively separated, as displayed in Fig. 20c and d. For comparison, two IMFs decomposed by EMD are illustrated in Fig. 21, in which the cyclic pattern of seasonality is not clear.
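The second step, separating trend and seasonality with a double average filter, can be sketched on a synthetic monthly series. The code assumes that the double average consists of two passes of a 12-point moving average whose offsets mirror each other so that the combined window is centred. This is an assumption about the discrete form of $\text {EAve}_{t}^{\tau }$ for even τ, the helper `double_average` is hypothetical, and the series `x_tilde` is not the TSA electricity data.

```python
import numpy as np


def double_average(x, tau):
    """Two tau-point moving averages with mirrored offsets (centred overall)."""
    kernel = np.ones(tau) / tau
    left, right = tau // 2, tau - 1 - tau // 2
    first = np.convolve(np.pad(x, (left, right), mode="edge"), kernel, "valid")
    return np.convolve(np.pad(first, (right, left), mode="edge"), kernel, "valid")


# Synthetic variance-stabilised series (assumption): linear trend plus a
# period-12 pattern that sums to zero over one year.
months = np.arange(12 * 20)
trend = 10.0 + 0.05 * months
seasonal = np.tile(np.sin(2 * np.pi * np.arange(12) / 12), 20)
x_tilde = trend + seasonal

trend_hat = double_average(x_tilde, 12)
seasonal_hat = x_tilde - trend_hat

inner = slice(12, -12)  # samples not affected by the edge padding
print(np.abs(trend_hat - trend)[inner].max())
print(np.abs(seasonal_hat - seasonal)[inner].max())
```

Any 12-point average of a zero-sum pattern with period 12 vanishes, which is why the seasonal part is removed and the linear trend is reproduced away from the boundaries in this sketch.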

We remark that the above signal X_t can be considered as a multiplicative model rather than an additive model [16]. The above result shows that the proposed method is not limited to additive models. In other words, the proposed EPT method is applicable to the decomposition of both additive and multiplicative models.

4.6 Analysis of Airmile data

Here, we analyze the monthly airline passenger-mile data in the USA from January 1996 to May 2005, taken from [17] and shown in Fig. 22a. The data show a strong seasonality with holiday effects and increase roughly linearly overall, with an intervention in September 2001 and several months thereafter due to the terrorist acts on September 11, 2001.

To extract the periodic components in the data, we design a new filter ${\mathcal {L}}X_{t}:= \max \left (L_{t+\ell }^{\tau }(X_{t})\right)$, where the maximum is taken over ℓ∈[−τ/2,τ/2]; it can be considered as another lower envelope of X_t. We note that the signal part expressed by the sum of the increasing trend and several abrupt changes with positive jump sizes is representable with filter ${\mathcal {L}}$. The dashed line in Fig. 22a represents the realization of filter ${\mathcal {L}}$ with τ=12. Figure 22b shows the extracted signal $\tilde {X}_{t} = X_{t} - {\mathcal {L}}X_{t}$, from which the trend and the abrupt changes are successfully removed; however, the amplitudes of the extracted periodic signal differ slightly across pulses. To further remove the amplitude-modulated component, it suffices to obtain the upper envelope of $\tilde X_{t}$, i.e., $EU_{t}(\tilde X_t)$, and then divide $\tilde X_{t}$ by it, because $EU_{t}(\tilde X_t)=ER_{t}(\tilde X_t)$ due to $EL_{t}(\tilde X_t)=0$ in this case. The upper envelope of $\tilde X_{t}$ is represented by the dashed line in Fig. 22b. Finally, we obtain the stabilized periodic signal, i.e., $\frac {\tilde X_{t}}{EU_{t}(\tilde X_{t})}$ in Fig. 22c, and, by further decomposition of the signal, the seasonal and annual components in Fig. 22d and e, respectively. Figure 23 shows the two IMFs by EMD for the Airmile data. Since the abrupt changes are scattered across IMFs, the decomposition results are distorted, and the periodic patterns are not separated effectively.
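The filter ${\mathcal {L}}$ can be sketched as a running minimum followed by a running maximum over the shifts. The code below assumes that $L_{t}^{\tau }$ is a running minimum over the centred window [t−τ/2, t+τ/2] with edges padded by repetition. The helper names `running` and `lower_max_filter` and the series `base`, `pulses`, and `x` are hypothetical and are not the Airmile data.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def running(x, half, stat):
    """Running max or min over the centred window [t - half, t + half]."""
    padded = np.pad(x, half, mode="edge")
    return stat(sliding_window_view(padded, 2 * half + 1), axis=1)


def lower_max_filter(x, tau):
    """Maximum over the shifts of the running minimum (sketch of the filter L)."""
    half = tau // 2
    return running(running(x, half, np.min), half, np.max)


# Hypothetical series: increasing trend with one positive jump,
# plus non-negative pulses that repeat every 12 samples.
months = np.arange(120)
base = 0.1 * months + 5.0 * (months >= 60)
pulses = np.tile(np.r_[np.zeros(9), [1.0, 2.0, 1.0]], 10)
x = base + pulses

periodic = x - lower_max_filter(x, 12)
print(periodic.min() >= 0)
# A non-decreasing input is returned unchanged away from the boundaries.
print(np.array_equal(lower_max_filter(base, 12)[6:-6], base[6:-6]))
```

Because the filter never exceeds the input, the extracted part is non-negative, and a non-decreasing trend with positive jumps passes through the filter unchanged in the interior, which is consistent with the representability remark in the text.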

5 Selection of size parameter

Here, we discuss how to select the size parameter τ for the ensemble patch transformation. First of all, in some cases, an appropriate τ can be chosen based on known facts about the data and the purpose of the analysis. For example, the choice of τ=24 in Section 4.4 is simple and natural because the solar radiation data are observed hourly and the aim is to examine the daily effect. Sections 4.5 and 4.6 analyze the seasonal effects of monthly data, so τ=12 is a reasonable choice.

For estimation of the size parameter τ from observations, we propose two selection methods. One is performed in an a priori manner, and the other is based on a posteriori information from the decomposition. The size parameter τ corresponds to a period in the time domain. When a priori information on the periodic pattern of a signal is available, the size parameter can be selected based on the distribution of periodic patterns. Such information can be obtained through the empirical periods, i.e., the distances between local maxima (or local minima). Note that the empirical period is expressed by the number of observations between local maxima, not by elapsed real time. To sum up, the proposed a priori method can be described as follows.

Find local maxima (or local minima) of the signal.
Obtain empirical periods of distances between local maxima.
Estimate the distribution of empirical periods.
Select τ as the dominant period of the estimated distribution.
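The four steps above can be sketched as follows. The code assumes that local maxima are strict, that the empirical period is the index difference between consecutive maxima, and that the dominant period is the most frequent spacing. Whether the paper counts index differences or the number of observations strictly between maxima is not specified in this section, so the convention is an assumption. The function `dominant_period` and the test signal are hypothetical.

```python
import numpy as np


def dominant_period(x):
    """Most frequent spacing (in samples) between strict local maxima."""
    x = np.asarray(x, dtype=float)
    interior = x[1:-1]
    is_peak = (interior > x[:-2]) & (interior > x[2:])
    peaks = np.flatnonzero(is_peak) + 1   # step 1: local maxima
    spacings = np.diff(peaks)             # step 2: empirical periods
    if spacings.size == 0:
        raise ValueError("need at least two local maxima")
    counts = np.bincount(spacings)        # step 3: empirical distribution
    return int(counts.argmax())           # step 4: dominant period


# Hypothetical signal: a fast oscillation with a period of 20 samples
# riding on a slow oscillation with a period of 200 samples.
n = np.arange(400)
x = np.cos(2 * np.pi * n / 20) + 0.5 * np.cos(2 * np.pi * n / 200)
print(dominant_period(x))
```

A histogram or kernel density estimate of `spacings` could replace the simple count when the spacings are more dispersed.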

Figure 24 shows the distribution of empirical periods for a signal X_t= cos(90πt)+ cos(10πt) and its high-frequency component cos(90πt), where the high-frequency pattern is apparent in signal X_t. It seems that the dominant period is 21, which is set to be our estimated parameter, $\hat {\tau }=21$. The decomposition results of Figs. 14 and 15 in Section 4 are based on this selection method.

When the frequency ratio of the components composing a signal falls below a certain range, the local pattern of the high-frequency component may not be distinct; thus, the above selection method based on empirical periods is not appropriate. From the results in Fig. 2, we observe that the proposed method might separate two components according to their frequencies. Even if the extracted components are not exactly orthogonal, they should be only weakly correlated with each other. Hence, for the a posteriori method, we use the correlation between the two components extracted by the ensemble patch transformation. That is, through a grid search over a certain range of the size parameter, we select the τ that yields the minimum correlation between the decomposed components. Figure 25 shows the sample correlations between the extracted components for the signal X_t= cos(100πt)+4 cos(60πt) in Fig. 2 over a range of τ, which produces $\hat {\tau }=19$. The decomposition result of Fig. 2 is obtained through the ensemble average for the oval patch with size parameter 19.
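The grid search can be sketched as follows. The code assumes a rectangular-window double moving average in place of the oval-patch ensemble average, uses the absolute sample correlation as the criterion, and restricts the grid to odd sizes so that the windows stay centred. All three choices are assumptions, so the selected value need not match the $\hat {\tau }$ reported in the paper. The names `ensemble_average` and `select_tau` are hypothetical.

```python
import numpy as np


def ensemble_average(x, tau):
    """Two passes of a tau-point moving average (zero-padded at the edges)."""
    kernel = np.ones(tau) / tau
    return np.convolve(np.convolve(x, kernel, mode="same"), kernel, mode="same")


def select_tau(x, candidates, margin=60):
    """Return the tau whose two extracted components are least correlated."""
    inner = slice(margin, -margin)  # ignore samples affected by the edges
    scores = {}
    for tau in candidates:
        low = ensemble_average(x, tau)
        high = x - low
        scores[tau] = abs(np.corrcoef(low[inner], high[inner])[0, 1])
    best = min(scores, key=scores.get)
    return best, scores


t = np.linspace(0.0, 1.0, 1000)
x = np.cos(100 * np.pi * t) + 4 * np.cos(60 * np.pi * t)
tau_hat, scores = select_tau(x, range(11, 32, 2))
print(tau_hat, scores[tau_hat])
```

Plotting `scores` against τ gives a curve of the kind described for Fig. 25, under the assumptions stated above.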

We remark that, in extensive experiments, our method has been somewhat robust to the selection of the size parameter. Suppose that we decompose the signal X_t= cos(90πt)+ cos(10πt) into two components by the proposed method with τ ranging from 18 to 23. Figure 26 shows the differences between the extracted high-frequency component and the true component cos(90πt) over this range of τ. As one can see, the results are robust to the choice of the size parameter τ.

Finally, the proposed selection methods of the parameter τ lack theoretical justification. An objective way with theoretical backup might improve the performance and the practicality of the proposed method. This topic is left for future study.

6 Conclusion

In this paper, we have introduced a new transformation technique, termed “ensemble patch transformation,” for signal decomposition and data analysis. We have presented a practical algorithm for the implementation of the proposed method along with some theoretical properties. The empirical performance of the proposed method has been evaluated through several numerical experiments and real-world signal analyses. Results from these experiments illustrate that the proposed method possesses promising empirical properties.

We remark that the purpose of signal decomposition is to construct the target signal X_t as a combination of components that can be interpreted or understood. However, even when a component of a given signal can be perceived, its separation from the signal is not trivial, which is the motivation for developing the ensemble patch transform. The tools provided by the ensemble patch transform require filter design, as shown in the signal analyses of Sections 4.3–4.6. Although this requirement is a “discomfort,” once a proper filter is designed, it offers a high degree of freedom for analyzing a signal.

Finally, the proposed transformation inherently possesses multiscale features due to the size parameter τ, which controls the size of patches. That is, the size parameter of the patch acts as the scale parameter of multiscale features. The scale-space concept might provide a viewpoint on the visualization of data that considers a family of representations of data indexed by the scale parameter instead of the conventional dot-connected plot. This topic is reserved for future research.

Abbreviations

CEEMDAN: Complete ensemble empirical mode decomposition with adaptive noise
EMD: Empirical mode decomposition
EEMD: Ensemble empirical mode decomposition
EPT: Ensemble patch transform
VMD: Variational mode decomposition

References

1. T. Lindeberg, Scale-space theory in computer vision (Springer Science & Business Media, New York, 1994).
2. N. E. Huang, Z. Shen, S. R. Long, M. C. Wu, H. H. Shih, Q. Zheng, N. C. Yen, C. C. Tung, H. H. Liu, The empirical mode decomposition and Hilbert spectrum for nonlinear and nonstationary time series analysis. Proc. R. Soc. London A 454, 903–995 (1998).
3. G. Rilling, P. Flandrin, One or two frequencies? The empirical mode decomposition answers. IEEE Trans. Signal Process. 56, 85–95 (2008).
4. Z. Wu, N. E. Huang, Ensemble empirical mode decomposition: a noise assisted data analysis method. Adv. Adapt. Data Anal. 1, 1–41 (2009).
5. K. Dragomiretskiy, D. Zosso, Variational mode decomposition. IEEE Trans. Signal Process. 62, 531–544 (2014).
6. J. R. Yeh, J. S. Shieh, N. E. Huang, Complementary ensemble empirical mode decomposition: a novel noise enhanced data analysis method. Adv. Adapt. Data Anal. 2, 135–156 (2010).
7. M. E. Torres, M. A. Colominas, G. Schlotthauer, P. Flandrin, A complete ensemble empirical mode decomposition with adaptive noise. Proc. IEEE Int. Conf. Acoust. Speech Sig. Process (ICASSP), 4144–4147 (2011).
8. M. A. Colominas, G. Schlotthauer, M. E. Torres, Improved complete ensemble EMD: a suitable tool for biomedical signal processing. Biomed. Signal Proc. Control 14, 19–29 (2014).
9. I. Daubechies, J. Lu, H. T. Wu, Synchrosqueezed wavelet transforms: an empirical mode decomposition-like tool. Appl. Comput. Harmon. Anal. 30, 243–261 (2011).
10. G. Thakur, E. Brevdo, N. S. Fuckar, H. -T. Wu, The synchrosqueezing algorithm for time-varying spectral analysis: robustness properties and new paleoclimate applications. Signal Process. 93, 1079–1094 (2013).
11. G. Thakur, H. -T. Wu, Synchrosqueezing-based recovery of instantaneous frequency from nonuniform samples. SIAM J. Math. Anal. 43, 2078–2095 (2011).
12. S. Meignen, T. Oberlin, S. McLaughlin, A new algorithm for multicomponent signals analysis based on synchrosqueezing: with an application to signal sampling and denoising. IEEE Trans. Signal Process. 60, 5787–5798 (2012).
13. P. Fryzlewicz, H. -S. Oh, Thick pen transformation for time series. J. R. Stat. Soc. B 73, 499–529 (2011).
14. N. Gallagher, G. Wise, A theoretical analysis of the properties of median filters. IEEE Trans. Acoust. Speech Signal Process. 29, 1136–1141 (1981).
15. N. E. Huang, M. C. Wu, S. R. Long, S. Shen, W. Qu, P. Gloerson, K. L. Fan, A confidence limit for the empirical mode decomposition and Hilbert spectral analysis. Proc. R. Soc. London A 459, 2317–2345 (2003).
16. A. V. Metcalfe, P. S. Cowpertwait, Introductory time series with R (Springer-Verlag, New York, 2009).
17. J. D. Cryer, K. -S. Chan, Time series analysis with applications in R, Second Edition (Springer, New York, 2008).

Acknowledgements

Not applicable.

Funding

This research was supported by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (2016R1D1A1B03930463 and 2018R1D1A1B07042933).

Author information

Authors and Affiliations

Department of Mathematics and Statistics, Sejong University, 209 Neungdong-ro, Seoul, 05006, Korea
Donghoh Kim
Artificial Intelligence Lab, LG Electronics Inc, 38 Baumoe-ro, Seoul, 06763, Korea
Guebin Choi
Department of Statistics, Seoul National University, 1 Gwanak-ro, Seoul, 08826, Korea
Hee-Seok Oh

Contributions

DK and GC performed the experiments, analyzed the data, and interpreted the results. HSO contributed in developing the proposed method. All authors wrote, read, and approved the final manuscript.

Corresponding author

Correspondence to Hee-Seok Oh.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Kim, D., Choi, G. & Oh, HS. Ensemble patch transformation: a flexible framework for decomposition and filtering of signal. EURASIP J. Adv. Signal Process. 2020, 30 (2020). https://doi.org/10.1186/s13634-020-00690-7

Received: 22 November 2019
Accepted: 08 June 2020
Published: 26 June 2020
DOI: https://doi.org/10.1186/s13634-020-00690-7
