Ensemble patch transformation: a flexible framework for decomposition and filtering of signal

This paper considers the problem of signal decomposition and filtering by extending its scope to various signals that cannot be effectively dealt with existing methods. For the core of our methodology, we introduce a new approach, termed “ensemble patch transformation” that provides a framework for decomposition and filtering of signals; thus, as a result, it enhances identification of local characteristics embedded in a signal that is crucial for signal decomposition and designs flexible filters that allow various data analyses. In literature, there are some data-adaptive decomposition methods such as empirical mode decomposition (EMD) by Huang (Proc. R. Soc. London A 454:903–995, 1998). Along the same line of EMD, we propose a new decomposition algorithm that extracts essential components from a signal. Some theoretical properties of the proposed algorithm are investigated. To evaluate the proposed method, we analyze several synthetic examples and real signals.


Introduction
In this paper, we propose a new method for decomposition and filtering of signals, termed "ensemble patch transformation, " which adopts a multiscale concept of scale-space theory in computer vision of [1]. The proposed ensemble patch transformation consists of two key components. The first one is "patch process" that is defined as a data-dependent patch of data at a particular time point t. The patch process is designed for identifying local structures of data according to the sizes of patches. The second concept is "ensemble" that is obtained by shifting the time point t of the patch, which is suitable for representing the temporal variation of data efficiently by enhancement of the temporal resolution of them. Moreover, various statistics obtained from the proposed ensemble patches might be useful for data analysis.
Successful recognition of the local frequency patterns of a signal is crucial for signal decomposition. Empirical mode decomposition (EMD) by [2] identifies such local patterns through local extrema. In the case that the local extrema reflect the time-varying can be employed according to the purpose of data analysis. The patch is formally defined by its shape and size. Let T = {τ i } i be a set of size parameters for a patch with a certain shape such as rectangle and oval. For τ ∈ T , let P τ t (X t ) denote the patch process for observation (t, X t ) that is generated by a certain shape with size parameter τ . We further define a multiscale patch transform MP T t (X t ) for observation (t, X t ) as a sequence of all patches according to different τ i 's, As one can see, the precise definition of MP T t (X t ) depends on the shape of the patch. As for the typical case, rectangle and oval are considered. Of course, we can take other shapes as well.
Rectangle patch: For a given point (t, X t ) and τ ∈ T , this patch is centered at the point (t, X t ) and is a closed rectangle formed by the points t + k, min k∈[−τ/2,τ/2] X t+k − 0.5γ τ and t + k, max k∈[−τ/2,τ/2] X t+k + 0.5γ τ for k ∈[ −τ/2, τ/2]. For the rectangle patch, the width is τ and hight h τ t is where γ is a scale factor. Oval patch: For a given point (t, X t ) and τ ∈ T , this patch is centered at the point (t, X t ) and is characterized by boundaries t + k, X t+k ± γ τ 2 /4 − k 2 , k ∈[ −τ/2, τ/2] where γ is a scale factor. The width for the oval patch is τ as for the rectangle patch, and the height is of decreasing pattern as moving away from a given point (t, X t ).
For an illustration of the patch process, we consider a deterministic signal X t = 25 cos(0.1πt) cos(πt), 0 ≤ t ≤ 10. We then obtain a sequence X t i 100 i=1 with t i = iT and sampling rate T = 1/10 from the continuous signal X t . Figure 3 shows rectangle patches P τ t i (X t ) of the sequence X t i that are respectively performed at certain time points t i = 4, 5, 6 marked by red dots. We consider three different size parameters τ = 4, 8, 12 for generating patches, and obtain a multiscale patch MP T t (X t ) by combining the three patches in Fig. 3a-c. Figure 4 shows patches in the entire time domain with the parameters τ = 2 and 4, respectively. From Figs. 3 and 4 and the definitions, the patch at a particular time point t is an object that contains multiple observations around the time point t; thus, for further statistical analysis, it is necessary to use some statistics that summarize informations of P τ t (X t ) and MP T t (X t ). For this purpose, we consider a measure K P τ t (X t ) that produces a single statistic at time point t. Some possible measures K(·) are twofold: one is for central tendency and the other is for dispersion. As measures for central tendency, in this study, we present the following two measures. Suppose that we obtain the patch P τ t (X t ) for a fixed τ .
and U τ t (X t ) denote lower and upper envelopes of the patch P τ t (X t ), respectively. M τ t (X t ) is mean envelope. The lower envelope L τ t (X t ) and upper envelope U τ t (X t ) of the rectangle patch are  3 Patches with rectangle shape P τ ti (X t ) of signal X t = 25 cos(0.1πt) cos(πt) at center points of the patches, t i = 4, 5, 6. a τ = 4, b τ = 8, and c τ = 12 respectively. L τ t (X t ) and U τ t (X t ) of the oval patch are respectively.
For dispersion measure, we consider the followings , which represents the envelope range. Figure 5 shows Ave τ t (X t ) and sd τ t (X t ) with size parameters τ = 8, 32, 64 for a noisy signal X t = 25 cos(0.1πt) cos(πt) + σ t , where σ = 1.8 and t denote i.i.d. standard Gaussian random variables. As the value of size parameter τ increases, a central measure Ave τ t (X t ) is getting smoother with representing the global trend of the observations. On the other hand, the values of sd τ t (X t ) at both boundaries are large, and sd τ t (X t ) becomes larger as τ increases since a large patch contains more observations. Further, it seems that sd τ t (X t ) by τ = 64 is capable of identifying the temporal variability of the signal well. The derivations of statistics are not limited in the above definitions, which can be further defined by other measures such as trimmed mean and median absolute deviation of the patch P τ t (X t ).

Ensemble patch transform
To improve the temporal resolution of the patch and its measures, we introduce an ensemble patch process of a real-valued univariate process (X t ) t . Definition 1 Let (X t ) t be a real-valued univariate process. Let T denote a set of size parameters for the patch. For any τ ∈ T , the th shifted patch at time point t is defined as Then, a fixed τ ∈ T , a collection of all possible shifted patches at time point t is defined as ensemble patch, Finally, as a dictionary, the multiscale ensemble patch process is defined the sequence of all sets of EP τ t (X t ) over various τ 's as For the sequence {X t i } in Fig. 3, we generate ensemble rectangle patches at the time points t i = 4, 5, 6 according to size parameters τ = 4, 8, 12, which are displayed in Fig. 6. A multiscale ensemble patch MEP T t (X t ) is obtained by combining the three ensemble patches in Fig. 6a-c.
Similarly, for further data analysis, we consider some statistics of ensemble patch EP τ t (X t ). We first consider a measure of each shifted patch K(P τ t+ (X t )) and then obtain an ensemble measure by averaging K(P τ t+ (X t ))'s over 's in [ −τ/2, τ/2]. More specifically, we obtain the following ensemble measures for central tendency and dispersion: for a fixed τ , suppose that we obtain the collection of all shifted patches at time point t, that are lower and upper envelopes of the patch P τ We obtain some measures based on ensemble patches of the noisy signal in Fig. 5, EAve τ t (X t ) and Esd τ t (X t ) with size parameters τ = 8, 32, 64, which are shown in Fig. 7. As the value of τ increases, the central measure EAve τ t (X t ) is getting smoother, and the dispersion measure Esd τ t (X t ) becomes more significant with having relatively large values at both boundaries. Furthermore, by comparison of the ensemble results with the single patch results in Fig. 5, we have some observations: (a) the central measure by ensemble patches represents the temporal trend of the underlying function well, compared to that by single patches. (b) The dispersion measure with large τ by ensemble patches identifies a local variability of the underlying function efficiently. (c) The temporal resolution of both measures by ensemble patches is much more delicate than those of single patches.
We remark that ensemble patches are able to obtain various statistics that are adapted for data analysis. For example, as an alternative central measure, we can consider the median for each patch P τ t+ (X t ), say Med τ t+ (X t ) and the corresponding mean of Med τ t+ (X t ) over , EMed τ t (X t ). These measures are closely related to filters for decomposition in Section 3. Moreover, the difference between rectangle and oval patches is not significant. It is only noticeable when considering envelopes L τ t (X t ) and U τ t (X t ). The mean envelope M τ t (X t ) obtained by the rectangle patch represents a stair-shaped curve and is undesirable for smoothing. The oval patch, on the other hand, can be useful in capturing the central tendency of the data because the resulting curve is smoother. However, it is necessary to capture the sudden change in data as shown in Fig. 22 later. In this case, the rectangle patch is more useful than the oval patch.
Finally, the thick-pen transformation by [13] is a special case of the above MEP T t (X t ) with L τ t+ (X t ) and U τ t+ (X t ) at the shifting index = 0.

Ensemble patch filtering
When a signal consists of several components with their own frequencies, ensemble patch transformation can be utilized as a low-pass or a high-pass filter. Figure 8 illustrates the filtering process of the ensemble mean envelope. The top panel shows a sinusoidal signal X t = cos(50πt) + cos(10πt) + 2t (t ∈[ 0.35, 0.55] ) and depicts three shifted rectangle patches covering a point X t (open circle) at t = 0.45. Each th shifted patch produces upper and lower envelopes U τ t+ (X t ) and L τ t+ (X t ) at time point t. The black dots denote the average values of data in M τ t+ (X t ) at time point t = 0.45 for the shifted patches a, b, and c. Similarly, with shifting the patch over the entire time domain, we construct a mean envelope from each shifted patch process. The bottom panel of Fig. 8 shows three mean envelopes (dotted line), respectively. We note that, although we use only three shifted mean envelops for illustration purposes, the possible number of shifted mean envelopes for a given point is the same as the size parameter τ of the patch in general. Furthermore, we take an ensemble average of those mean envelopes, which results in the ensemble mean envelope marked by the solid line. It seems that the ensemble mean envelope represents a lower frequency component of the signal.
For demonstrating the utility of this ensemble approach, we consider a synthetic example. Figure 9 shows a signal X t = cos(50πt) + cos(10πt) + 2t (t ∈[ 0, 1] ) in white color and its ensemble patch transformation with size parameters τ = 20, 40, 80, 120, 200, and 240, respectively. The lower and upper envelopes EL τ t (X t ), EU τ t (X t ) and the mean envelope EM τ t (X t ) are obtained by the ensemble approach. The area covered by two envelopes is colored in gray, and the mean envelope is denoted by the solid line. We observe that with size parameter τ = 40, the ensemble mean envelope suppresses a highfrequency component cos(50πt). When the size parameter τ is larger than 200, both the oscillating patterns of components cos(50πt) and cos(10πt) are painted over by patch transformation. As the value of τ increases, the ensemble mean envelope tends to suppress the oscillating local pattern and represents the lower-frequency pattern at the same time. The ensemble mean envelope removes the frequency pattern whose period is less (2020) 2020:30 Page 10 of 27 By controlling the size parameter, the mean envelope EM τ t (X t ) of the ensemble patch transformation is used as a low-pass filter or high-pass filter.
We further perform the same experiment with measure EAve τ t (X t ) that is average of Ave τ t+ (X t ) obtained from ensemble patches. The results EAve τ t (X t ) (solid line) with different τ = 20, 40, 80, 120, 200, and 240 are displayed in Fig. 10. As one can see, the results are almost identical to those of EM τ t (X t ) in Fig. 9.

Decomposition by ensemble patch filtering
By using the above notion of filters, we would like to decompose a signal into a highfrequency component and a low-frequency residue component. We consider a signal X t = cos(90πt) + cos(10πt), t ∈[ 0, 1] , shown in Fig. 11. A signal X t = cos(90πt) + cos(10πt) and its two components A snapshot of the decomposition procedure by ensemble patch filtering is displayed in Fig. 12. From top to down and left to right panels, the first panel illustrates a lowfrequency mode, say, LF 1 that is an ensemble mean envelope of EP τ t (X t ) for a given τ , and the corresponding high-frequency mode HF 1 = X −LF 1 in the next panel. As one can see, there still exists apparent low-frequency mode in HF 1 . The third panel shows an ensemble mean envelope of HF 1 , say, LF 2 , which seemingly identifies the low-frequency mode of HF 1 in the second panel. A new high-frequency mode HF 2 = HF 1 − LF 2 = X − LF 1 − LF 2 is now obtained. In the next iteration, a further ensemble mean envelope of HF 2 , say, LF 3 is almost constant; hence, the corresponding high-frequency mode HF 3 3 represents the true high-frequency component well. An iterative procedure is required, which is along with the line of the sifting process of EMD.
From the above discussion, we propose a practical decomposition algorithm based on ensemble patch filtering.
is the th shifted patch at time t for a given τ . Suppose that a signal X t consists of a high-frequency component h t and a low-frequency component g t as X t = h t + g t .

Obtain an initial componentĥ
3. Take the converged estimate as the extracted component for h t .
We have some remarks regarding the aforementioned algorithm. (a) Choice of G τ t : it is feasible to use various choices of G τ t , including some central measures introduced in Section 2.2, which is the main benefit of utilizing ensemble patch transformation. (b) Choice of τ : the size parameter τ corresponds to a period in the time domain. Thus, the parameter τ plays a crucial role in the quality of the extracted low-frequency component. A selection method of τ will be discussed later.
We now discuss a convergence property of the above algorithm under some conditions.
t can be written asĥ where δ t denotes Kronecker delta function and u * k = u * u * · · · * u k denotes convolution power. In addi- We now extend the above result with a general filter beyond the double average filter EAve τ t (X t ). M)) and IR (1) t (X t , M) = MX t . Furthermore, X t is said to be (iteratively) representable with filter M if IR t (X t , M) = X t for all t.

Definition 2 Let X t be a continuous-time signal. For any t, an iterative representation of X t with filter M is defined as IR
We note that if X t = MX t for all t, then X t is (iteratively) representable with filter M.

Definition 3 Let X t denotes a continuous-time signal. Suppose that X t consists of two components as X t = h t + g t for all t. The component h t is said to be cancelable from X t with filter M if MX t = Mg t for all t.
Theorem 2 Let X t denotes a continuous-time signal. Suppose that X t consists of two components as X t = h t + g t for all t. Assume that (a) g t is (iteratively) representable with filter M, and (b) h t is cancelable from X t with filter M. Then, it follows that X t − IR t (X t , M) = h t for all t.
A proof is directly obtained by the above definitions.

Then, X t is (iteratively) representable with filter M.
A proof of Lemma 1 is easily obtained from proof of Theorem 1; hence, we omit it. Suppose that X t consists of two components as X t = h t + g t . If M is a linear filter such that Mh t = 0, then h t is cancelable from X t with filter M. Thus, we obtain the following result that extends the convergence property of Lemma 1 with MX t = EAve τ 0 t (X t ) to a general linear filter M under some conditions.

Corollary 1 Let X t be a continuous-time signal. Suppose that X t consists of two components as X t = h t + g t for all t. Define the filter M as
Then, it follows that X t − IR t (X t , M) = h t for all t.
As for a final remark, we consider a simple example related to designing an ideal filter that provides the strength of our method. To simplify our discussion, suppose that we observed a discrete signal such that X i = i∈Z X t δ(t − i). Suppose that X i consists two component h i and g i , i.e., X i = h i + g i , where component h i satisfies h i = h i+3 and 3 i=1 h i = 0, and g i is a component whose value suddenly changes from −1 to 1 at i = 0 as in Table 1.
We define the filter M as MX i = EAve τ 0 i (X i ), τ 0 = 3. Note that component h i is cancelable from X i with filter M, but component g i cannot be (iteratively) representable with filter M since |G(ω)| > 0 for some ω ∈ ω : ω = 2πk 3 ± 2nπ, for all k = 1, 2 andn ∈ N , where G(ω) is the Fourier transform of g i . Thus, it is not able to obtain h i from the iterative procedure X i − IR i (X i , M), because the filters of the moving average class are not suitable for expressing data with a sharp mean change such as g i . It is generally known that data with such a sharp mean change can be easily represented by a median filter (2020) 2020:30 Page 14 of 27 [14]. In particular, the component g i used in the example is a root signal since it does not change even if it passes through the median filter repeatedly. Hence, the convergence property is ensured. In summary, the average filter of Ave τ t (X t ) is advantageous to cancel h i , but it cannot represent g i properly. On the other hand, median filter of Med τ t (X t ) is not capable of canceling h i , but is useful for expressing g i . However, a combination of both filters might lead to desired decomposition results, which is feasible under the ensemble patch transform framework, not just patch transform one. It is a benefit of the proposed transformation combining the patch process and the ensemble process. We now consider a composite filter M * constructed by an average filter of patch process and a median filter of ensemble process, that is, M * X i = median Ave τ 0 i+ (X i ) , τ 0 = 3. Due to the property of the linear filter and the condition 3 i=1 h i = 0, it follows that So, this filter separates two components h i and g i , and cancels h i . In the example, the value of Ave τ 0 i+ (g i ) for each is listed in Table 2. Then, by passing the median filter as the second filter, we obtain a signal . . . , −1, − 1 3 , − 1 3 , 1, . . . , which completely represents component Table 2. In the example, note the difference of convergence by the filter M and M * . As the iteration progresses with k → ∞, the iterative procedure X i − IR

Results and discussion
In this section, we conduct a numerical study with various examples to assess the practical performance of the proposed method, which is implemented by the algorithm in Section 3.2. In this numerical study, we compare the proposed decomposition method with EMD, EEMD, and CEEMDAN and various types of filters are designed for the analysis reflecting the characteristics of a given signal. The R statistical package, EPT, used to implement the methods and to carry out some experiments are available at https://CRAN.R-project.org/package=EPT in order that one can reproduce the same results.  Table 1 i

Kim et al. EURASIP Journal on Advances in Signal Processing
(2020) 2020:30 Page 15 of 27

Decomposition of non-stationary piecewise signal
We consider a non-stationary piecewise signal that consists of a low-frequency component and a high-frequency component piecewisely defined as X t = cos(90πt)I(t ≤ 0.5) + cos(10πt)I(t > 0.5), t ∈[0, 1]. Figure 13 shows 1000 equally spaced signal on [0, 1]. Huang et al. [2,15] pointed out that EMD fails to decompose a signal with mode mixing, which means that different modes of oscillations coexist in a single intrinsic mode function (IMF). On the other hand, the proposed method is capable of locally suppressing the highfrequency mode whose period is less than some size parameter. The dotted line and solid line in Fig. 14 represent the true components and extracted components by each method, respectively. From the results, we observe that the proposed method outperforms EMD, EEMD, and CEEMDAN. Here, we use the median(Ave τ t+ (X t )) as a central measure and the size parameter τ = 21 for our method.

Decomposition of noisy signal
We evaluate the robustness of the proposed decomposition to noise signals. We generate a noisy signal X t = cos(90πt) + cos(10πt) + t , where t denote Gaussian errors with signal-to-noise ratio 7. The decomposition results by the proposed method, EMD, EEMD, and CEEMDAN are shown in Fig. 15. As one can see, EMD is sensitive to noises. The effect of non-informative fluctuation distorts the subsequent decomposition results of EMD, which is due to the interpolation process in the construction of envelopes based on local extrema. On the other hand, the proposed method is robust to the noises since the decomposition is processed without the identification of fluctuations. The decomposition results in Fig. 15 support this fact. If we regard noise as fluctuation with the highest frequency, the proposed method with relatively small τ might separate a noise term from a signal. By taking the size parameter τ = 10, a noisy signal is decomposed as the highest component of noise and the low-frequency residue component, which corresponds to a signal X t . The low-frequency residue component is repeatedly decomposed with the size parameter τ = 21. For central measure, the ensemble average EAve τ t is used for the noisy signal. We notice that EEMD and CEEMDAN work well for the decomposition of the noisy signal.

Analysis of beat signal
Suppose that we have a signal X t = h t + g t := cos(62πt) + cos(58πt), where the frequencies of h t and g t are very close. A signal composed of two components with similar frequencies generates a beat signal, as shown Fig. 16a. It turns out that h t and g t can be separated by X t − IR M) for k = 10, 200, 500 and e represents IR M) with k = 500. We observe that the two components are separated with a sufficiently large k.
The decomposition results may be meaningful, but they are not practical because it requires too many iterations to extract the desired signal.
For a different look of the beat signal, we consider a signal, as shown in Fig. 17a, which can be interpreted as multiplying the signal cos(60πt) by the amplitude modulating term 2 cos(2πt), i.e., X t = cos(62πt)+cos(58πt) = 2 cos(2πt) cos(60πt). Figure 17b and c display the frequency component and amplitude modulation of X t , respectively. Figure 17d shows the upper and lower envelopes of the ensemble patch transform and indicates that the ensemble envelope range ER τ t with size parameter τ = 30 holds information of the amplitude modulation. The absolute value of amplitude modulation 2 cos(2πt) marked by the solid line of Fig. 17e is well approximated by ER τ t /2 marked by the dashed line  Fig. 17e. Note that when the amplitude modulation 2 cos(2πt) is close to zero, the approximation is not appropriate. If signal X t is filtered by the approximated absolute amplitude modulation as X t ER τ t /2 , as shown in Fig. 17f, additional information for amplitude modulation can be obtained. Figure 18a shows the upper and lower envelopes of the ensemble patch transform for the filtered signal. By multiplying the ensemble envelope range (Fig. 18b) for the filtered signal with the ensemble envelope range (Fig. 18c) at the first stage, the approximation can be improved. See the dashed line of Fig. 18d. The approximation is significantly improved, where the amplitude modulation 2 cos(2πt) is close to zero. Therefore, the ensemble patch transform can be applied to deduce the information about the amplitude modulation, where the amplitude information is mixed with frequency information. There is some restriction such that the amplitude modulating component should be positive. However, despite this constraint, the above filtering method is practically applicable since the amplitude-modulated component of a real-world signal can be positive. See the analysis of the solar radiation below.

Analysis of solar radiation data
In this example, we analyze the solar radiation data that were hourly observed at three cities in South Korea, Seoul, Daegu, and Busan during September 2003, which are shown in Fig. 19. The data are available from the Korea Meteorological Administration (https://data.kma.go.kr). Daegu and Busan, located in the southeast part of the Korean Peninsula, are close to each other geographically, whereas Seoul is located in the middle of the Peninsula. Besides, Daegu and Busan were severely damaged by a typhoon named "MAEMI" in September 2003, while Seoul was hardly affected by the typhoon. Table 3(a) lists the results of correlations among solar radiation observed at Seoul, Daegu, and Busan. However, contrary to the usual expectation, the results are very similar since the daily effect dominates the time series of solar radiation. The solar radiation data can be interpreted as a multiplication form of the periodic component and the amplitudemodulating component. It is required to separate the periodic pattern and the amplitude modulation for a better understanding of the climatic similarity among the three cities.
To reveal the effects of interest, we eliminate the daily effect by using the upper envelope EU τ t (·) with τ = 24. The dashed lines of Fig. 19 represent the upper envelopes of (2020) 2020:30 Page 19 of 27

Fig. 18
Iterative extraction of amplitude modulation solar radiation data extracted by the proposed method. It seems that the dominant daily effects are successfully removed. For further evaluation, we compute correlations among the upper envelopes of solar radiation data listed in Table 3(b). As one can see, the correlations of Seoul-Daegu and Seoul-Busan have been decreased dramatically compared to the correlations before the transform. Thus, Daegu and Busan have similar climatic characteristics, and Seoul seems to be different from the other two cities, which is consistent with our intuition. We remark that the data in this example can be considered as a typical multiplicative model. However, the log transformation that is the most popular technique for dealing with the multiplicative model is not suitable for this example. The main reason is that about half of the data has zero value, as shown in Fig. 19, since there is no solar radiation during the night. On the other hand, the proposed U τ t (·) filter does not suffer from this problem. Of course, it is feasible to use a log transform by adding a small value like 10 −10 to the data; however, it makes the daily effect stronger. The log transformation is a technique that suppresses the amplitude modulating effects rather than extracting them; hence, it is not appropriate when one is interested in amplitude-modulation signals themselves.

Analysis of electricity data
We analyze the US electricity production data recorded monthly from January 1973 to December 2005, which can be obtained from R package TSA. The electricity data seem to have a global trend and a seasonal component. A typical approach to analyzing such data consists of two-step: stabilize the variance of the time series through a transform such as a log transform and extract seasonality. We now interpret the electricity signal as a product of the periodic component and the amplitude-modulated component and decompose the data in the following order: stabilizing the volatility of data by inferring the amplitudemodulated component and then removing the seasonal component by using the double average filter MX t = EAve τ i (X t ).  Figure 20a shows the upper envelope EU τ t (X t ) and the lower envelope EL τ t (X t ), and Fig. 20b showsX t := X t ER τ t (X t ) , which seemingly stabilizes the volatility of X t , where . We now decompose the trend and the seasonality of X t by taking a double average filter MX t = EAve τ i (X t ) with τ = 12. The trend and the seasonality are effectively separated, as displayed in Fig. 20c and d. For comparison, two IMFs decomposed by EMD are illustrated in Fig. 21, and the cyclic pattern of seasonality is not clear.
We remark that the above signal X t can be considered as a multiplicative model rather than an additive model [16]. From the above result, our proposed method is not limited to an additive model. In other words, the proposed EPT method can be applicable to the decomposition method for both additive and multiplicative models.

Analysis of Airmile data
Here, we analyze monthly airline passenger-mile data in the USA from January 1996 to May 2005 of [17] in Fig. 22a. The data show a strong seasonality with holiday effects, and these are increasing linearly overall with an intervention in September 2001 and several months thereafter due to the terrorist acts on September 11, 2001. For extracting the periodic components in the data, we design a new filter LX t := max L τ t+ (X t ) over s in [ −τ/2, τ/2] that can be considered as another lower envelope of X t . We note that the signal part expressed by the sum of the increasing trend and several abrupt changes with positive jump sizes can be representable with filter L. The dashed line in Fig. 22a represents the realization of filter L with τ = 12. Figure 22b shows the extracted signalX t = X t − LX t , which removes the trend and several abrupt changes successfully; however, the amplitudes of the extracted periodic signal are slightly different over pulses. To further remove the amplitude modulated component, the only thing to do is getting the upper envelope ofX, i.e., EU t (X t ) and then dividingX by it because EU t (X t ) = ER t (X t ) due to EL t (X t ) = 0 in this case. The upper envelope ofX t is represented by the dashed line in Fig. 22b. Finally, we obtain the stabilized periodic signal, i.e., Fig. 22c, and, by further decomposition of the signal, the seasonal and annual components in Fig. 22d and e, respectively. Figure 23 shows the two IMFs by EMD for Airmile data. Since the abrupt changes are scattered across IMFs, the decomposition results are distorted, and the periodic patterns are not separated effectively.

Selection of size parameter
Here, we discuss the selection method of size parameter τ for ensemble patch transformation. First of all, in some cases, we can choose the appropriate τ based on known facts about data and analysis purposes. For example, the choice of τ = 24 in Section 4.4 to observe solar radiation data per hour and check the daily effect is simple and natural. Sections 4.5 and 4.6 analyze the seasonal effects of monthly data, so τ = 12 can be a reasonable choice.
For estimation of the size parameter τ from observations, we propose two selection methods. One is performed in a priori way, and the other is based on a posterior information of the decomposition. The size parameter τ corresponds to a period in the time domain. When a priori information of the periodic pattern of a signal is available, a selection of the size parameter can be conducted based on the distribution of periodic patterns. Such information can be obtained through the empirical periods of distances between (2020) 2020:30 Page 23 of 27

Fig. 22 Decomposition of Airmile data by EPT
local maxima (or local minima). Note that the empirical period is expressed by the number of observations between local maxima, not by the distance of the real-time. To sum up, the proposed priori method can be described as follows.
(i) Find local maxima (or local minima) of the signal.
(ii) Obtain empirical periods of distances between local maxima.
(iii) Estimate the distribution of empirical periods.
(iv) Select τ as the dominated period of the estimated distribution. Figure 24 shows the distribution of empirical periods for a signal X t = cos(90πt) + cos(10πt) and its high-frequency component cos(90πt), where the high frequency pattern is apparent in signal X t . It seems that the dominated period is 21, which is set to In the case that the frequency ratio of components composing a signal falls below a certain range, the local pattern of the high-frequency component may not be distinct; thus, the above selection method based on empirical periods is not appropriate. From the results in Fig. 2, we observe that the proposed method might separate two components according to the frequencies. Nevertheless, the components should be weakly correlated to each other unless they are orthogonal. Hence, for the posterior method, we use correlation information between two components extracted by ensemble patch transformation. That is, through the grid search in a certain range of the size parameter, the size parameter τ is selected, having the minimum correlation between the decomposed components. Figure 25 shows the sample correlations between the extracted components for the signal X t = cos(100πt) + 4 cos(60πt) in Fig. 2 over a range of τ , which producesτ = 19. The decomposition result of Fig. 2 is obtained through the ensemble average for the oval patch with size parameter 19.
We remark that through extensive experiments, we observe that our method is somewhat robust to the selection of size parameters. Suppose that we decompose the signal Fig. 24 The distribution of the empirical period for a signal X t = cos(90πt) + cos(10πt) and its component cos(90πt) (2020) 2020:30 Page 25 of 27

Fig. 25
Correlation between decomposed components of test signal X t = cos(100πt) + 4 cos(60πt) over a range of τ X t = cos(90πt) + cos(10πt) into two components by the proposed method with a range of τ = 18 to 23. Figure 26 shows the differences between the extracted high-frequency component and the true component cos(90πt) over the range of τ . As one can see, the results are robust to the choice of the size parameter τ value. Finally, the proposed selection methods of the parameter τ lack theoretical justification. An objective way with theoretical backup might improve the performance and the practicality of the proposed method. This topic is left for future study.

Conclusion
In this paper, we have introduced a new transformation technique, termed "ensemble patch transformation" for signal decomposition and data analysis. We have presented a practical algorithm for the implementation of the proposed method with some theoretical properties. The empirical performance of the proposed method has been evaluated throughout several numerical experiments and real-world signal analysis. Results from these experiments illustrate that the proposed method possesses promising empirical properties. We remark that the purpose of signal decomposition is to construct the target signal X t as a combination of components that can be interpreted or understood. However, even when a component of a given signal can be perceived, its separation from the signal is not trivial, which is the motivation for developing the ensemble patch transform. The tools provided by the ensemble patch transform requires the filter design, as shown for analyzing signals in Sections 4.3-4.6. Once a proper filter is designed, this "discomfort" guarantees extreme degrees of freedom for analyzing a signal.
Finally, the proposed transformation holds inherently multiscale features due to the size parameter of τ , which serves to control the size of patches. That is, the size parameter of the patch acts as the scale parameter of multiscale features. The scale-space concept might provide a view-point on visualization of data, which considers a family of representations of data indexed by the scale parameter instead of the conventional dot-connected plot. This topic are reserved for future research.