Pyramid-based image empirical mode decomposition for the fusion of multispectral and panchromatic images

Image fusion is a fundamental technique for integrating high-resolution panchromatic images and low-resolution multispectral (MS) images. Fused images may enhance image interpretation. Empirical mode decomposition (EMD) is an effective method of decomposing non-stationary signals into a set of intrinsic mode functions (IMFs). Hence, the characteristics of EMD may apply to image fusion techniques. This study proposes a novel image fusion method using a pyramid-based EMD. To improve computational time, the pyramid-based EMD extracts the IMF from the reduced layer. Next, EMD-based image fusion decomposes the panchromatic and MS images into IMFs. The high-frequency IMF of the MS image is subsequently replaced by the high-frequency IMF of the panchromatic image. Finally, the fused image is reconstructed from the mixed IMFs. Two experiments with different sensors were conducted to validate the fused results of the proposed method. The experimental results indicate that the proposed method is effective and promising regarding both visual effects and quantitative analysis.


Introduction
The development of earth resources' satellites is mainly focus on improving spatial and spectral resolutions [1]. As the spatial and spectral information are the two critical factors for enriching the capability of image interpretation, fusion of high spatial and high spectral images may increase the usability of satellite images. Most remote sensing applications, such as image interpretation and feature extraction, require both spatial and spectral information; therefore, the demands for fusing high-resolution multispectral (MS) images are increasing.
Currently, most optical sensors are capable of acquiring high spatial resolution panchromatic (Pan) and low spatial resolution MS bands simultaneously; for example, QuickBird, IKONOS, and SPOT series. Due to the technological constraints and costs, the spatial resolution of panchromatic images is better than the spatial resolution of MS images in an optical sensor. To overcome this problem, image fusion techniques (also called color fusion, pan sharpen, or resolution merge) are widely used to obtain a fused image with both high spatial and high spectral information.
The approaches of image fusion may be categorized into three types [2]: projection-substitution, relative spectral contribution, and ARSIS (Amélioration de la Résolution Spatiale par Injection de Structures). Intensity-Hue-Saturation (IHS) [3] transform is one of the famous fusion algorithms using the projection-substitution method. This method interpolates MS image into the spatial resolution of a panchromatic image and converts the MS image according to intensity, hue, and saturation bands. The intensity of the MS image is then replaced with a high-spatial panchromatic image and reversed to red, green, and blue bands. However, this method is limited to three-band images.
The projection-substitution method also includes principle component analysis (PCA) [4], independent component analysis (ICA) [5], as well as other method. The PCA converts an MS image into several components based on eigen vectors and values. A high spatial panchromatic image replaces the first component of MS image with a large variance and performs the inverse PCA. The image fusion process is similar to the IHS method. Though this method is not constrained by the number of bands, significant color distortion may result.
The relative spectral contribution method utilizes the linear combination of bands to fuse panchromatic and MS images. Brovey transformation [6] is one of the well-known approaches in this category. The fused image is based on a linear combination of panchromatic and MS images.
The ARSIS is a multi-scale fusion approach, which improves spatial resolution by structural injection. This approach is widely used in image fusion because the advantage of multi-scale analysis may improve the fusion results. The multi-scale approach includes the Wavelet transform [7], empirical mode decomposition (EMD) [8], parameterized logarithmic image processing [9], as well as other methods. The Wavelet approach transforms the original images into several high and low frequency layers before replacing the high frequency of MS image with those that are from panchromatic image. Then, an inverse Wavelet transform is selected to construct the mixed layers for image fusion. A more detailed comparison among fusion methods is discussed in [10,11].
The main difference between the Wavelet and EMD fusion approaches is depended on decomposition. The EMD method is an empirical method, which decomposes a nonlinear and non-stationary signal into a series of intrinsic mode functions (IMFs) [12]. It is obtained from the signal by an algorithm called the "sifting process" produces a signal that obtains these properties. The EMD method is widely used in one-dimensional signal processing as well as in two-dimensional image processing. Wavelet decomposition is related to the predefine Wavelet basis while the EMD is a non-parametric data-driven process that is not required to predetermine the basis during decomposition. The EMD fusion approach is similar to the Wavelet fusion approach, in that it replaces the high frequency of MS images with those that are from panchromatic image.
The EMD can be applied in many image processing applications such as noise reduction [13,14], texture analysis [15], image compression [16], image zooming [17], and feature extraction [18,19]. Because the algorithm of image fusion via EMD is not yet mature, a small number of studies have reported on image fusion using EMD. Hariharan et al. [20] combined the visual and thermal images using the EMD method. First, the two-dimensional image is vectorized into a one-dimensional vector to fulfill the one-dimensional EMD decomposition. A set of weights are then multiplied by the number of IMFs. Finally, the weighted IMFs are combined to reconstruct the fused image. From the visual aspect, the experimental results show that the EMD method is better than Wavelet and PCA method. Liu et al. [21] used a bidimensional EMD method in image fusion; the results demonstrate that the EMD method may preserve both spatial and spectral information. The authors also indicated that the two-dimensional EMD is a highly time-consuming process.
Wang et al. [8] integrated QuickBird panchromatic and MS images using the EMD method. The row-column decomposition is selected to decompose the image in rows and columns separately using a one-dimensional EMD decomposition. The quantity evaluation demonstrates that the EMD algorithm may provide more favorable results when compared with either the IHS or Brovey method. Chen et al. [22] combined the Wavelet and EMD in the fusion of QuickBird satellite images. A similar row-column decomposition process is applied in the fusion process. The experiment also substantiates the promising result of the EMD fusion method.
EMD was originally developed to decompose onedimensional data. Most EMD-based fusion methods use row-column decomposition schemes rather than twodimensional decomposition. Because the image is twodimensional, a two-dimensional EMD is more appropriate for image data processing. However, two-dimensional EMD decomposition has seldom been discussed in image fusion.
The sifting process of two-dimensional EMD is interactive, and involves three main steps: (1) determining the extreme points; (2) interpolating the extreme points for the mean envelope; and (3) subtracting the signal using the mean envelope. Determining the extreme points and the interpolation in two-dimensional space is considerably time consuming. Therefore, a new method to improve computation performance is necessary.
The objective of this study is to establish an image fusion method using a pyramid-based EMD. The proposed method reduces the spatial resolution of the original image during the sifting process. First, the proposed method determines and interpolates the extreme points of the reduced image. Then the results are expanded to obtain the mean envelope with identical dimensions to the original image.
The proposed method comprises three main steps: (1) the decomposition of panchromatic and MS images using pyramid-based EMD; (2) image fusion using the mixed IMFs of panchromatic and MS images; and (3) quality assessment of the fused image. The test data include SPOT images of a forest area and QuickBird images of a suburban area. The quality assessment considers two distinct aspects: the visual and quantifiable.
Fusion results of the modified IHS, PCA, and wavelet methods are also provided for comparison.
This study establishes a novel image fusion method using a pyramid-based EMD. The proposed method can improve the computational performance of two-dimensional EMD in image fusion, and can also be applied to EMD-based image fusion. The major contribution of this study is the improvement of the computational performance of two-dimensional EMD using image pyramids. The proposed method extracts the mean envelope of the coarse image, and resamples the mean envelope to equal the original size during the sifting process. The benefits of the proposed method are reduced computation time for extreme point extraction and interpolation.
This article is organized as follows. Section 2 presents the proposed pyramid-based EMD fusion method. Section 3 shows the experimental results from using different image fusion methods. This study also compares and discusses one-and two-dimensional EMD in image fusion. Finally, a conclusion is presented in Section 4.

The proposed scheme
This section introduces the basic ideas and procedures of one-dimensional EMD and row-column EMD. Onedimensional EMD can be extended to two-dimensional EMD before determining the technical details of pyramid-based two-dimensional EMD. This section describes EMD-based image fusion in the final part.

One-dimensional EMD
EMD is used to decompose signals into limited IMFs. An IMF is defined as a function in which the number of extreme points and the number of zero crossings are the same or differ by one [7]. The IMFs are obtained through an iterative process called the sifting process. A brief description of the sifting process is shown below.
Step 1. Determine the local maxima and minima of the current input signal h (i, j) (t), where i is the number of the IMF and j is the number of iteration. In the first iteration, h (1,1) (t) is the original time series signal X(t).
Step 2. Compute the upper and lower envelopes u (i, j) (t) and 1 (i, j) (t) by interpolating the local minimum and maximum using the cubic splines interpolation.
Step 3. Compute the mean envelope m (i, j) (t) from the upper and lower envelopes, as shown as (1).
Step 4. Subtract the h (i, j) (t) by the mean envelope to obtain the sifting result, h (i, j+1) (t), as shown in (2). If h (i, j+1) (t) satisfies the requirement of the IMF, then h (i, j+1) (t) is IMF i (t) and subtract the original X(t) by this IMF i (t) to obtain residual r i (t). The r i (t) is treated as the input data and Step 1 is then repeated. If h (i, j+1) (t) does not satisfy the requirement of the IMF, h (i, j+1) (t) is treated as the input data and Step 1 is then repeated. (2) The stopping criterion of generating an IMF depends on whether or not the numbers of the zero-crossing and extreme are the same during the iteration. The procedure is repeated to obtain all the IMFs until the residual r(t) is smaller than a predefined value. At the end, we can decompose the signal X(t) into several IMFs and a residual r n (t). The decomposition of a signal X(t) can be written as (3). Equation 3 shows that X(t) can be reconstructed from the IMFs and residual without information loss. More details of the basis theory of EMD are discussed in [7]. (3)

Row-column EMD
EMD was originally developed to manage one-dimensional data. To apply this method to two-dimensional data, a row-column EMD [22] is proposed based on one-dimensional EMD. The purpose of row-column EMD is to perform EMD on the rows and columns. This method determines and interpolates the extreme points of the one-dimensional space. The row-column EMD process is briefly described below.
Step 1. Determine the local maxima and minima of the current input image h (i, j) (p, q) and perform the cubic spline interpolation for upper and lower envelopes ur (i, j) (p, q) and lr (i, j) (p, q) systematically by row. The upper and lower envelopes uc (i, j) (p, q) and lc (i, j) (p, q) along the columns are also generated, where i is the number of IMFs and j is the number of the iteration. In the first iteration, h (1,1) (p, q) is the original image X(p, q). Figure 1 illustrates the extreme point extraction using the row-column method.
Step 2. Compute the mean envelope m (i, j) (p, q) from the upper and lower envelopes along rows and columns, as shown in (4).
Step 3. Subtract the h (i, j) (p, q) by the mean envelope to obtain the sifting result h (i, j+1) (p, q), as shown in (5). If m (i, j) (p, q) satisfies the requirement of the IMF, then h (i, j+1) (p, q) is IMF i (p, q) and subtract the original signal by this IMF i (p, q) to obtain residual r i (p, q). r i (p, q) is treated as the next input data and Step 1 is repeated. If m (i, j) (p, q) does not satisfy the requirement of the IMF, then h (i, j+1) (p, q) is treated as the input data and Step 1 is repeated.
The stopping criterion of generating an IMF is when the envelope mean signal is close to zero. The sifting procedure is repeated to obtain all the IMFs until the residual r(p, q) is smaller than a predefined value. At the end, we can decompose the image X(p, q) into several high to low frequency IMFs and a residual r n (p, q). The decomposition of an image X(p, q) can be written as (6). Equation 6 also demonstrates that the original image can be reconstructed using IMFs and residuals without losing information. Figure 2 shows the results of row-column EMD. The EMD decomposed the original image, Figure 2a, into four IMFs from high to low frequency. Each IMF represents different scales. The advantage of this method is easy to implement; however, this method cannot avoid the striping effect, as shown in Figure 2d, e.

Pyramid-based EMD
This study proposed pyramid-based EMD to avoid the striping effect of row-column EMD. Two-dimensional EMD determines and interpolates the extreme points of a two-dimensional space rather than one-dimensional space. The main difference between pyramid-based and row-column EMD is the generation of a mean envelope. The additional image pyramid improves the computation performance of two-dimensional EMD. The process of pyramid-based two-dimensional EMD is described below.
Step 1. Reduce the input image from h (i, j) (p, q) to h (i, j) (p g ,q g ) using Gaussian image pyramid [23], where i is the number of the IMF; j is the number of the iteration and g is the number of pyramid layer. In the first iteration, h (1,1) (p g ,q g ) is the original reduced image X(p g ,q g ). The reduced scale is related to the smoothness of the input image and EMD computation time.
Step 2. Determine the local maxima and minima of the reduced image h (i, j) (p g ,q g ) using openness strategies [24]. Morphological filters [16,25] are frequently used to determine the local maxima and minima for two-dimensional EMD; however, extracting the extreme points in the low-frequency image is difficult. To overcome this problem, this study proposes a surface operator called "openness." Openness is defined as a measure of the surface reliefs of zenith and nadir angles, as shown in Figure 3. Openness is an angular measure of the relationship between surface relief and horizontal distance.
Therefore, the local maxima and minima points are determined by the slope of the center and the surrounding points, as shown in Figure 4. The openness is then defined by the direction of azimuth D and length of distance L. The slope D θ L in azimuth D is calculated from ΔH and distance L, as shown in (7). Openness   Figure 4 shows the positive and negative openness of scale L. In the high-frequency layer, L should be smaller to extract the local extreme points. By contrast, L should be larger during low-frequency iteration. Openness is more suitable for locating the local extreme points of different scales. In addition, extreme point selection relates to the surrounding points of different scales rather than to the neighboring points. Step 3. Perform the spline interpolation for upper and lower envelopes u (i, j) (p g ,q g ) and 1 (i, j) (p g ,q g ). Compute the mean envelope m (i, j) (p g ,q g ) from the upper and lower envelopes, as shown in (9).
Step 4. Expand the mean envelope to the original image size m (i, j) (p, q).
Step 5. Subtract the h (i, j) (p, q) by the mean envelope to obtain the sifting result h (i, j+1) (p, q), as shown in (5). If m (i, j) (p, q) < ε, then h (i, j+1) (p, q) is IMF i (p, q). Subtract the original image by this IMF i (p, q) to obtain residual r i (p, q). If m (i, j) (p, q) > ε, then h (i, j+1) (p, q) is treated as the input data and Step 1 is repeated. The procedure will be terminated when r i (p, q) < ε.
The IMF is obtained when the mean envelope is close to zero in two-dimensional EMD. The sifting procedure is repeated to obtain all the IMFs until the residual is smaller than a predefined value. At the end, we can decompose the image X(p, q) into several high to low frequency IMFs and a residual r n (p, q), as shown in (5). Figure 5 is an example of two-dimensional EMD. The original image is decomposed into two IMFs and a residual. The decomposed results are more favorable than the row-column EMD, as shown in Figure 2.

EMD-based image fusion
EMD-based image fusion is similar to the traditional wavelet approach. The process uses a high-frequency panchromatic IMF to replace the high-frequency IMF of MS images. Then the IMFs are combined to form a fused image. The IMFs for EMD-based image fusion are generated by row-column EMD or pyramid-based EDM. Figure 6 is a schematic representation of the proposed method. In Figure 6, the high-frequency component is the first IMF of EMD. The remaining IMFs are combined as low-frequency components. The proposed method uses an image pyramid to reduce the image during decomposition, which can also reduce the computation time of IMFs extraction. In addition, the advantage of an openness operator is the ability to accurately extract the extreme points of scales with varying levels of detail.
Because only the high-frequency IMF was changed from a panchromatic to a MS image, the remaining IMFs will not affect the image fusion results. Thus, the decomposition process can be simplified. This study only decomposes the image into two IMFs, high-and low-frequency, for image fusion. The EMD image-fusion process is described as follows: For data preprocessing, the panchromatic and MS images are registered into the same system. Next, the MS image is resampled to match the size of the panchromatic image. Then, the method proposed by this study uses EMD to decompose the two images into several IMFs and a residual. The first IMF of the panchromatic image replaces the first IMF of the MS image. Finally, the fused image is obtained by reconstructing the mixed IMFs of the MS image. The reconstruction process combines the mixed IMFs and residuals, as shown in Equation 6.

Experimental results
To evaluate the performance and efficiency of the proposed method, the experiments are performed on both SPOT and QuickBird satellite images. The SPOT satellite images include a SPOT-5 panchromatic image and SPOT-4 MS image, taken on different dates, of a forest area with high textures. These two images are corrected to orthoimages using ground control points and a digital terrain model. Since both orthoimages are in the same coordinate system, data registration can be performed using the standard orthoimage coordinates. The resolutions of the SPOT images are 2.5 m and 20 m, respectively. The land cover of the QuickBird satellite image is a suburban area. The nominal spatial resolution of the QuickBird panchromatic and MS images are 0.7 m and 2.8 m, respectively. The two images were of the same path, and the panchromatic and MS QuickBird images were taken simultaneously. The standard product of the two images is already registered. Related information of the test data is shown in Table 1.
The quality assessment includes the visual and quality aspects. Regarding the visual aspect, the fused and the original MS images are visually compared. Both row-column and pyramid-based EMD are applied during image fusion  to enable a comparison. In addition, this study employed the commercial software ERDAS Imagine 2010 to fuse the images using different methods, including modified IHS [26], PCA, and Wavelet. These images were then compared with the image fused using the EMD method.
The experiment required establishing a number of parameters. Because the purpose of EMD is image fusion, the image was only decomposed into two components: high-frequency and a remainder layer. The stopping criterion is 99% or a mean envelope less than 2 pixels. The image pyramid scales are reduced layers 1 and 2. The experiment results are discussed in the following section. The window of openness is 5-15 pixels in different iterations. Both the threshold of the minimal points for positive openness and threshold of the maximum points for negative openness were less than 75 degrees.

Quality evaluation of the fused image
The quality assessment considers both the visual and quantifiable aspects, and refers to both spatial and spectral qualities. In other words, the fusion method should improve the spatial resolution and preserve spectral content. Several indices are selected to evaluate the quality of a fused image. The experiment compares the fused image with the original MS image to ensure spectral fidelity. The three spectral indices are RMSE [27], ERGAS [27], and the correlation coefficient. Spatial index is the entropy of an image.

Root mean square error (RMSE)
RMSE compares the difference between original MS and fused images. The RMSE equation is shown in (10). This index is used to evaluate the distribution of bias. The ideal value is zero.
where Bias is the difference between mean value of MS and fused images, SDD is the standard deviation of difference between MS and fused images.

Erreur relative globale adimensionnelle de synthèse (ERGAS)
The ERGAS present the relative dimensionless global error in fusion, the difference between original MS and fused images. The ERGAS equation is shown as (11). The lower the ERGAS value, the higher the spectral quality of the merged images.
where h and l are the resolution of PAN and MSI, respectively. N is the number of spectral band (B i ). M is the mean value of each spectral band.

Correlation coefficient
This index measures the correlation between the fused image and the original MS image. The higher the correlation between the fused and original image, the more accurate the estimation of the spectral values is. The correlation equation is shown in (12). The ideal value is 1.
where C is the coefficient of correlation, F(i, j) and M (i, j) are the gray value of the fused and MS images, respectively. μ F is the mean of fused image, μ M is the mean of MS image, and m and n are the image sizes.

Entropy
Entropy represents the information in an image. This index shows the overall detailed information of the image. The entropy equation is shown in (13). The greater the entropy of a fused image, the more information that is included in the image.
where E is the Entropy, P k is the probability of gray value k in the image.

Case I
In the qualitative evaluation, the fused images were evaluated visually. Figure 7 shows both the original and the fused images. That most of the image fusion methods may improve the spatial and spectral resolutions of the images is evident. Even these two data are acquired by two different sensors. Image fusion is able to improve spectral information of panchromatic image. The results of row-column and pyramid-based EMD are similar in appearance. Besides, the results of pyramid-based EMD using different scale also show high correlation. Among these methods, the largest color distortion effect appears in the PCA-fused image. The sharpest fused image is the results of modified IHS. The enchantment of spatial resolution of the Wavelet is of lower quality than the others.
In the quantitative evaluation, the aforementioned indices are selected to evaluate fusion performance. Table 2 presents the comparison of the experimental results of the fused images. First, we compare the EMD fusion approach between row-column and pyramid methods. The pyramid method is slightly more favorable than the row-column method. The pyramid method shows the lowest ERGAS. The effect of the image reduction layer at the fused image is not so sensitive when comparing the results of reduced scales 1 and 2. The correlation of modified IHS is relatively low when compared to other methods. The PCA method has the largest color distortion. This statistical assessment result is identical to that of the visual inspection. The wavelet method produces higher correlation, but its entropy is lower than that of the EMD-fused image, indicating a limited improvement of the spatial resolution. Figure 8 displays the results of different fusion methods for qualitative evaluation. Visual inspection provides a comprehensive comparison between the fused images.

Case II
The PCA method has the largest color distortion when compare to the original MS image. All of these methods may improve the spatial and spectral resolutions of the images. The main difference between these methods is shown in Figure 9, depicting the zoomed-in images. Referring to Figure 9a, the striping effect appears in the row-column method, caused by the discontinuity during the row-column process. Pyramid EMD may overcome this problem, as shown in Figure 9b. Figure 9f shows the fused image with an edge effect using the Wavelet approach. Among these multi-scale fusion approaches, pyramid EMD yields promising results. The visual analysis shows that the spatial resolution of the proposed method is much higher than the others. The quantitative indices' value is calculated and given in Table 3. The QuickBird test image is an 11-bit datum; hence, the value of the statistical results is larger than the SPOT image. This table shows that the pyramid method is superior to the row-column method. The color distortion of the modified IHS and Wavelet is of higher quality than the EMD method, as caused by the replacement IMFs within different ranges. The pyramid method shows the lowest ERGAS than the others. The correlation of the EMD method is slightly more favorable than the others.

Conclusions
This article proposes an EMD-based image fusion method using image pyramids. The proposed method  uses image pyramids to improve the computation performance of two-dimensional EMD. An openness strategy during the extraction of extreme points of twodimensional EMD is also proposed. This experimental study uses SPOT and QuickBird images in distinct areas to evaluate the proposed method, and compare the results with other fusion approaches. The experimental results demonstrate an improvement of pyramid-based two-dimensional EMD. Used during image decomposition, this method may overcome the linear effect of row-column EMD. In addition, the proposed method is sensor-independent but can be applied to the integration of heterogeneous sensors, such as optical and radar images.