# Multifocus image fusion scheme based on feature contrast in the lifting stationary wavelet domain

Huafeng Li^{1}, Shanbi Wei^{1} (corresponding author) and Yi Chai^{1, 2}

**2012**:39

https://doi.org/10.1186/1687-6180-2012-39

© Li et al; licensee Springer. 2012

**Received:** 2 April 2011

**Accepted:** 21 February 2012

**Published:** 21 February 2012

## Abstract

For fusion of multifocus images, a novel image fusion method based on multiscale products in the lifting stationary wavelet transform (LSWT) domain is proposed in this article. In order to avoid the influence of noise and select the coefficients of the fused image properly, different subband coefficients employ different selection principles. For choosing the low frequency subband coefficients, a new modified energy of Laplacian (EOL) is proposed and used as the focus measure to select the coefficients from the clear parts of the low frequency subband images. For choosing the high frequency subband coefficients, a novel feature contrast measurement of the multiscale products is proposed; it is shown to be more suitable for fusion of multifocus images than the traditional contrast measurement and is used to select coefficients from the sharp parts of the high frequency subbands. Experimental results demonstrate that the proposed fusion approach outperforms the traditional discrete wavelet transform (DWT)-based, LSWT-based and LSWT-traditional-contrast-(LSWT-Tra-Con)-based image fusion methods in terms of both visual quality and objective evaluation, even when the source images are corrupted by Gaussian noise.

### Keywords

image fusion; lifting stationary wavelet transform (LSWT); multiscale products; feature contrast; modified energy of Laplacian

## 1. Introduction

In applications of digital cameras, when a lens focuses on a subject at a certain distance, all subjects at that distance are sharply focused. Subjects not at the same distance are out of focus and theoretically are not sharp. It is often not possible to get an image that contains all relevant objects in focus. One way to overcome this problem is image fusion, in which one can acquire a series of pictures with different focus settings and fuse them to produce an image with extended depth of field [1–3]. During the fusion process, all the important visual information found in the input images must be transferred into the fused image without introduction of artifacts. In addition, the fusion algorithm should be reliable and robust to imperfections such as noise or mis-registration [4–6].

During the last decade, a number of techniques for image fusion have been proposed. A simple image fusion method consists of taking the average of the source images pixel by pixel. However, along with simplicity come several undesired side effects, including reduced contrast. In recent years, many researchers have recognized that multiscale transforms (MST) are very useful for image fusion, and various MST-based fusion methods have been proposed [7–11]. Among the MSTs, the discrete wavelet transform (DWT) has become the most popular and important multiscale decomposition method in image fusion. Compared with the Laplacian pyramid transform, the DWT has been found to have several advantages: (1) The DWT not only possesses localization but also provides directional information, while the pyramid representation fails to introduce any spatial orientation selectivity into the decomposition process [9]. The DWT can therefore represent the underlying information of the source images more efficiently, which makes the fused image more accurate. (2) No blocking artifacts, which often occur in Laplacian pyramid-fused images, can be observed in DWT-based fused images. (3) DWT-based fusion has better signal-to-noise ratios than Laplacian-based fusion [12]. (4) DWT-based fused images improve perception over pyramid-based fused images. More advantages of the DWT over the Laplacian pyramid scheme can be found in [9, 12].

However, the DWT has its own disadvantages. It requires a great deal of convolution, which consumes much time or memory and impedes its real-time application. The lifting wavelet transform (LWT) [13] overcomes these shortcomings. Unfortunately, both the original LWT and the DWT lack shift-invariance and cause pseudo-Gibbs phenomena around singularities [14], which reduces the quality of the resultant image. Thus, the lifting stationary wavelet transform (LSWT) [15], a fully shift-invariant form of the LWT, is used as the MST method in this article.

Besides the LSWT discussed above, the nonsubsampled contourlet transform (NSCT) [16], which is also shift-invariant, is another important MST method in the image fusion field. Unlike the LSWT, the NSCT is built upon nonsubsampled pyramids and nonsubsampled directional filter banks [16]. In the NSCT, the nonsubsampled pyramids are first used to achieve the multiscale decomposition, and then the nonsubsampled directional filter banks are employed to achieve the directional decomposition. The number of directions at each level can differ, which is much more flexible than the three directions of the wavelet transform, so the NSCT can give better fusion results than the LSWT. However, the NSCT is more time consuming than the LSWT because of its multiple directions and complexity, which greatly impedes its real-time application. Considering both the fusion results and the computational complexity, the LSWT is used as the MST method in our proposed method.

In MST-domain image fusion algorithms, one of the most important factors for improving fusion quality is the selection of fusion rules, which remarkably influences the performance of the fusion algorithm. According to physiological and psychological research, the human vision system (HVS) is highly sensitive to the local image contrast level. To meet this requirement, Toet and Ruyven [17] developed the local luminance contrast in their research on the contrast pyramid (CP). In the local luminance contrast, the contrast level is measured as the ratio of the high frequency component of the image to the local luminance of the background.

Based on the idea of [17], many different forms of contrast measurement have been proposed and successfully used in image fusion [18, 19]. However, in these contrast measurements, the value (or absolute value) of a single pixel of the high frequency subband in the MST domain is directly used as the strength of the high frequency component. In fact, the value (or absolute value) of a single pixel of the high frequency subband is of very limited use in determining which pixel comes from the clear part of the sub-images. So, using a single pixel alone as the high frequency component in the local contrast measurement is not ideal. In addition, almost all MST-based image fusion algorithms do not consider the influence of noise. In many practical applications, additive Gaussian noise, which adds to each image pixel a value drawn from a zero-mean Gaussian distribution, can be systematically introduced into the image during acquisition. This noise may cause miscalculation of sharpness values, which in turn degrades the performance of image fusion. To be useful in real applications, the fusion algorithm should provide pleasing fusion performance for clean images while remaining reliable and robust to imperfections such as noise.

It is well known that there exist dependencies between wavelet coefficients. If a wavelet coefficient produced by a true signal is of large magnitude at a finer scale, its parents at coarser scales are likely to be large as well. However, for those coefficients caused by noise, the magnitudes will decay rapidly along the scales. So, multiplying the adjacent wavelet scales, namely multiscale products (MSP), can sharpen the important structures while weakening noise [20, 21]. Therefore, multiscale products can distinguish edge structures from noise more effectively.

To make up for the aforementioned deficiencies of the traditional MST-based image fusion methods, we present a new multifocus image fusion scheme which incorporates the merits of interscale dependencies into the image fusion field. In this method, after decomposing the original images using the LSWT, we use a new modified energy of Laplacian, which can reflect features of the edges of the low frequency subimage in the LSWT domain, as the focus measure to select the low frequency coefficients of the fused image. For the high frequency subband coefficients, a novel local neighborhood feature contrast of the multiscale products, which can effectively represent the salient features and sharp boundaries of an image, is developed and used as the measurement to select coefficients from the clear parts of the source images. The experimental results show that the proposed method performs well in the fusion of multifocus images whether they are clean or noisy, and outperforms typical wavelet-based, LSWT-based, NSCT-based and LSWT typical contrast-based fusion algorithms in terms of objective criteria and visual appearance.

The article is organized as follows. In Sections 2 and 3, the theory of LSWT and multiscale products are introduced respectively in detail; Section 4 describes the image fusion algorithm using LSWT and multiscale products. Section 5 compares the performance of the new algorithm with the performance of other conventional fusion techniques applied to sequences of multifocus test images. Finally, in Section 6 we conclude the article with a short summary.

## 2. Lifting stationary wavelet transform

### 2.1. Lifting wavelet transform

Lifting wavelet transform (LWT), proposed by Sweldens [22], is a new wavelet construction method using the lifting scheme in time domain. The main feature of the LWT is that it provides an entirely spatial domain interpretation of the transform, as opposed to the traditional frequency domain based constructions. It abandons the Fourier transform as design tool for wavelets, and wavelets are no longer defined as translates and dilates of one fixed function. Compared with the classical wavelet transform, the LWT requires less computation and memory, and can produce integer-to-integer wavelet transform. It is always perfectly reconstructed no matter how the prediction operator and update operator are designed. Moreover, it possesses several advantages, including possibility of adaptive and nonlinear design, in place calculations, and so on [13, 22, 23]. The decomposition stage of LWT consists of three steps: split, prediction and update.

*Split*: The signal $a_l$ at level $l$ is split into even samples and odd samples, which serve as the initial $a_{l+1}$ and $d_{l+1}$, respectively.

*Prediction*: A prediction operator $P$ is applied on $a_{l+1}$ to predict $d_{l+1}$. The resultant prediction error $d_{l+1}$ is regarded as the detail signal of $a_l$. Here $p_r$ is one of the coefficients of $P$ and $M$ is the length of the prediction coefficients.

*Update*: The update of $a_{l+1}$ is accomplished by applying an update operator $U$ to the detail signal $d_{l+1}$ and adding the result to $a_{l+1}$; the resultant $a_{l+1}$ can be regarded as the approximation signal of $a_l$. Here $u_j$ is a coefficient of $U$ and $N$ is the length of the update coefficients.

Let $a_l$ be the input signal for the lifting scheme; the detail and approximation signals at the lower resolution levels can be obtained by iterating the above three steps on the output $a$.

The prediction operator $P$ and update operator $U$ can be designed by the interpolation subdivision method introduced in [23]. Choosing different $P$ and $U$ is equivalent to choosing different biorthogonal wavelet filters [24].
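The three lifting steps can be sketched in a few lines. The snippet below is only an illustration under stated assumptions, not the authors' implementation: it assumes the two-tap linear predictor and matching update weights of a 5/3-style lifting wavelet, periodic boundary handling and an even-length input, and the names `lwt_forward` and `lwt_inverse` are hypothetical.

```python
def lwt_forward(a):
    """One level of a lifting wavelet transform: split, predict, update.

    Assumptions (not from the article): P = (1/2, 1/2), U = (1/4, 1/4),
    periodic extension at the borders, even-length input.
    """
    n = len(a) // 2
    even = [a[2 * i] for i in range(n)]        # split
    odd = [a[2 * i + 1] for i in range(n)]
    # predict: estimate each odd sample from its two even neighbours
    # and keep only the prediction error as the detail signal
    d = [odd[i] - 0.5 * (even[i] + even[(i + 1) % n]) for i in range(n)]
    # update: add the filtered detail to the even samples
    s = [even[i] + 0.25 * (d[(i - 1) % n] + d[i]) for i in range(n)]
    return s, d


def lwt_inverse(s, d):
    """Undo the update, undo the prediction, then merge even and odd."""
    n = len(s)
    even = [s[i] - 0.25 * (d[(i - 1) % n] + d[i]) for i in range(n)]
    odd = [d[i] + 0.5 * (even[i] + even[(i + 1) % n]) for i in range(n)]
    merged = []
    for e, o in zip(even, odd):
        merged.extend([e, o])
    return merged


signal = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 3.0, 1.0]
approx, detail = lwt_forward(signal)
restored = lwt_inverse(approx, detail)
```

Because the inverse simply runs the same operators backwards, the reconstruction is exact for any choice of the two operators, which is the perfect reconstruction property mentioned above. Note also that the outputs are half as long as the input, which is where shift-invariance is lost.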

### 2.2. Lifting stationary wavelet transform

In the LWT, the shift-invariance is not ensured because there exists the split step and the length of approximation signal and detail signal decreases. However, the shift-invariance is desirable in many image applications such as image enhancement, image denoising and image fusion. In order to obtain the LSWT which possesses the shift-invariance, the method of literature [15] is adopted in this article.

Let $P^l$ and $U^l$ represent the prediction and update operators of the lifting stationary wavelet at level $l$, respectively. The initial prediction operator $P^0$ and initial update operator $U^0$ can be obtained once $M$ and $N$ are determined, where $P^0 = \{p_m\}$, $m = 0, 1, \ldots, M - 1$, and $U^0 = \{u_n\}$, $n = 0, 1, \ldots, N - 1$. The coefficients of $P^l$ and $U^l$, that is, the prediction and update coefficients at level $l$ in the LSWT, are designed by padding $P^0$ and $U^0$ with zeros [15].

Decomposing $a_l$ at level $l$ via the lifting stationary wavelet then yields $d_{l+1}$ and $a_{l+1}$, the detail signal and approximation signal of $a_l$ at level $l + 1$.

Compared with the DWT, the LSWT does not downsample or upsample the highpass and lowpass coefficients during the decomposition and reconstruction of the image. So, the LSWT not only retains the properties of the LWT, but also possesses shift-invariance. When the LSWT is introduced into image fusion, more information for fusion can be obtained. In addition, all sub-images have the same size, so it is easy to find the relationship among different subbands, which is beneficial for designing fusion rules [25]. Therefore, the LSWT is more suitable for image fusion.
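A minimal sketch of one undecimated lifting level may make the zero-padding idea concrete. It is written under assumptions that are not taken from the article: the same 5/3-style operators as a toy choice, zero padding realised as a sample spacing of `2**level`, periodic borders, and reconstruction by undoing the update step. The function names are hypothetical.

```python
def lswt_level(a, level):
    """One undecimated lifting level: no split, operators dilated by 2**level.

    Assumptions (not from the article): P = (1/2, 1/2), U = (1/4, 1/4),
    periodic borders. Both outputs have the same length as the input.
    """
    n, step = len(a), 2 ** level
    d = [a[i] - 0.5 * (a[(i - step) % n] + a[(i + step) % n]) for i in range(n)]
    s = [a[i] + 0.25 * (d[(i - step) % n] + d[(i + step) % n]) for i in range(n)]
    return s, d


def lswt_level_inverse(s, d, level):
    """Recover the input by subtracting the update term again."""
    n, step = len(s), 2 ** level
    return [s[i] - 0.25 * (d[(i - step) % n] + d[(i + step) % n]) for i in range(n)]


def shift(x, k):
    """Circularly shift a list to the right by k samples."""
    return x[-k:] + x[:-k]


signal = [0.0, 1.0, 4.0, 9.0, 16.0, 9.0, 4.0, 1.0]
approx, detail = lswt_level(signal, 0)
approx_shifted, detail_shifted = lswt_level(shift(signal, 3), 0)
restored = lswt_level_inverse(approx, detail, 0)
```

In this sketch, shifting the input by three samples shifts both subbands by exactly three samples, and every subband keeps the size of the input, which are the two properties the fusion rules rely on.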

## 3. Multiscale products of LSWT

In MST-based image fusion algorithms, almost all schemes design the fusion rule, namely, the selection principle for high frequency subband coefficients (abbreviated as 'fre-coefs' in the figures of this article), directly on the wavelet coefficients. It is worth noting that much of the noise is also related to high frequencies. As a result, the fused images obtained by these methods are noisier than the source images. It is well known that there exist dependencies between wavelet coefficients: if a coefficient at a coarser scale has small magnitude, its descendant coefficients at finer scales are likely to be small, and vice versa. Multiplying two adjacent wavelet subbands can therefore amplify the significant features and dilute noise [21, 26].

Suppose $f(x)$ is a one-dimensional (1-D) discrete signal. We define the multiscale products of $W_l f$ as

$$P_l f(x) = \prod_{i=-k_1}^{k_2} W_{l+i} f(x)$$

where $k_1 < l$ and $k_2 \le L - l$ are nonnegative integers, $L$ denotes the maximum level, and $W_l f(x)$ denotes the LSWT of the signal $f(x)$ at scale $l$ and position $x$.

If $k_1 = 0$ and $k_2 = 1$, then we calculate the LSWT scale products as

$$P_l f(x) = W_l f(x) \cdot W_{l+1} f(x)$$

For a two-dimensional image $f$, the multiscale products at the $l$th scale, $d$th direction and location $(x, y)$ can be defined as

$$P^{d,l} f(x, y) = W^{d,l} f(x, y) \cdot W^{d,l+1} f(x, y) \qquad (13)$$

where $d = 1, 2, 3$ denote the horizontal, vertical and diagonal directions.

The LSWT and the multiscale products of a noisy test signal ($f_1 = g_1 + \delta$) are illustrated, respectively. Though the LSWT coefficients of the original signal $g_1$ are immersed in noise at fine scales, they are enhanced in the scale products $P_l f$. The significant features of $g_1$ are more distinguishable in $P_l f$ than in $W_l f$. So we can conclude that the multiscale products of the LSWT can amplify the significant features and dilute noise.
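The effect can be illustrated with a tiny numerical sketch of the scale products for $k_1 = 0$ and $k_2 = 1$. The coefficient values below are made up for illustration and the function name `scale_products` is hypothetical; the only thing taken from the text is the pointwise multiplication of two adjacent, equally sized subbands.

```python
def scale_products(w_fine, w_coarse):
    """Pointwise product of two adjacent undecimated subbands (k1 = 0, k2 = 1)."""
    if len(w_fine) != len(w_coarse):
        raise ValueError("undecimated subbands must have the same length")
    return [f * c for f, c in zip(w_fine, w_coarse)]


# Made-up detail coefficients around an edge at index 4: the edge response
# persists across scales, while the noise decays at the coarser scale.
w1 = [0.9, -1.1, 0.8, -0.7, 6.0, 1.0, -0.9, 0.6]   # fine scale, noisy
w2 = [0.1, -0.1, 0.2, 0.1, 5.0, 0.2, 0.1, -0.1]    # next coarser scale
p1 = scale_products(w1, w2)


def edge_to_noise(values, edge_index=4):
    """Ratio of the edge magnitude to the largest non-edge magnitude."""
    noise = max(abs(v) for i, v in enumerate(values) if i != edge_index)
    return abs(values[edge_index]) / noise


ratio_before = edge_to_noise(w1)
ratio_after = edge_to_noise(p1)
```

With these illustrative numbers the edge stands out far more clearly in the products than in the fine-scale coefficients alone, which is the behaviour the section describes.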

## 4. The proposed fusion algorithm

A good image fusion algorithm should preserve all the salient features of the source images and introduce as few artifacts or inconsistencies as possible. In addition, the fusion algorithm should be reliable and robust to imperfections such as noise. In this article, we develop a novel multifocus image fusion scheme that incorporates the merits of the interscale dependencies of the LSWT into the image fusion technique. Two adjacent wavelet subbands are multiplied to amplify the significant features and dilute noise. In contrast to the conventional MST-based image fusion schemes, we design the fusion rule of the high frequency subbands based on the multiscale products instead of the wavelet coefficients. Our proposed image fusion method can therefore be fairly resistant to noise, because the multiscale products can distinguish edge structures from noise effectively.

The goal is to produce a fused image $F$ from a pair of source images denoted by $A$ and $B$. The general procedure of the proposed LSWT-MSP-based fusion algorithm is illustrated in Figure 4 and implemented as follows:

- (1) Decompose the registered source images $A$ and $B$, respectively, into one low frequency subimage and a series of high frequency subimages via the LSWT.
- (2) Select fusion coefficients for the low frequency subimage and each high frequency subimage from $A$ and $B$ according to the fusion rules.
- (3) Reconstruct the image from the new fused coefficients of the subimages by taking an inverse LSWT; the fused image $F$ is then obtained.
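The three steps above can be sketched as a generic transform-domain pipeline. This is a hedged skeleton rather than the proposed algorithm: the transform is passed in as a pair of callables, and the toy 1-D decomposition used to make the sketch runnable is an assumption that merely stands in for the LSWT. The 'averaging' and 'Coef-abs-max' rules are the simple baseline rules named later in the article; all function names are hypothetical.

```python
def fuse(image_a, image_b, decompose, reconstruct, low_rule, high_rule):
    """Generic fusion: (1) decompose, (2) select coefficients, (3) reconstruct.

    decompose(img) must return (low, [high_1, ..., high_k]) and
    reconstruct(low, highs) must invert it.
    """
    low_a, highs_a = decompose(image_a)
    low_b, highs_b = decompose(image_b)
    low_f = low_rule(low_a, low_b)
    highs_f = [high_rule(ha, hb) for ha, hb in zip(highs_a, highs_b)]
    return reconstruct(low_f, highs_f)


# Toy 1-D stand-ins so the sketch runs; they are NOT the LSWT.
def toy_decompose(x):
    n = len(x)
    low = [(x[i - 1] + x[i] + x[(i + 1) % n]) / 3.0 for i in range(n)]
    return low, [[x[i] - low[i] for i in range(n)]]


def toy_reconstruct(low, highs):
    return [low[i] + sum(h[i] for h in highs) for i in range(len(low))]


def averaging(a, b):
    """Baseline low frequency rule: mean of the two coefficients."""
    return [(u + v) / 2.0 for u, v in zip(a, b)]


def coef_abs_max(a, b):
    """Baseline high frequency rule: keep the larger magnitude."""
    return [u if abs(u) >= abs(v) else v for u, v in zip(a, b)]


in_focus = [0.0, 0.0, 9.0, 0.0, 0.0, 0.0]
defocused = [0.0, 3.0, 3.0, 3.0, 0.0, 0.0]
same = fuse(in_focus, in_focus, toy_decompose, toy_reconstruct,
            averaging, coef_abs_max)
mixed = fuse(in_focus, defocused, toy_decompose, toy_reconstruct,
             averaging, coef_abs_max)
```

Keeping the transform and the two rules as parameters mirrors how the comparisons in the experiments swap the MST and the selection rules independently.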

### 4.1. Selection of lowpass subband coefficients

As the coefficients in the coarsest scale subband represent the approximation component of the source image, the simplest way is to use the conventional averaging method to produce the composite coefficients. However, this will reduce the fused image contrast. To improve the fused image quality, a clarity measure should be defined to determine whether a coefficient of the low frequency subband is in focus or out of focus.

For multifocus image fusion, many typical focus measurements, e.g. variance, energy of image gradient (EOG), spatial frequency (SF), and energy of Laplacian (EOL) of the image, are compared in [28]. They all measure the variation of pixels. When the source images are compared with each other, pixels with greater values of these measurements are considered to come from the focused parts. According to [28, 29], EOL provides better performance than SF and EOG for fusing multifocus images. In this article, we use a new improved energy of image Laplacian (IEOL) as the focus measure to select coefficients from the clear parts of the source images.

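The EOL of an image $f$ is shown in Equation (14):

$$\mathrm{EOL} = \sum_x \sum_y \left(f_{xx} + f_{yy}\right)^2 \qquad (14)$$

where

$$f_{xx} + f_{yy} = -f(x-1, y-1) - 4f(x-1, y) - f(x-1, y+1) - 4f(x, y-1) + 20f(x, y) - 4f(x, y+1) - f(x+1, y-1) - 4f(x+1, y) - f(x+1, y+1) \qquad (15)$$

In Equation (15), $f(x, y)$ is the gray value of the pixel at position $(x, y)$ of image $f$, and $f_{xx} + f_{yy}$ represents the image gradient obtained by the Laplacian operator [-1, -4, -1; -4, 20, -4; -1, -4, -1].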

However, the second derivatives in different directions may have different signs, so that one cancels the other. This phenomenon may occur frequently in textured images. In order to avoid this problem and maintain the robustness of the algorithm against adverse effects that may occur in image fusion, we use an improved EOL (IEOL) as the clarity measure to select coefficients from the clear parts of the source images.

The measures of $f$ are computed with the help of a template $W_l$, whose size is relatively small and which must satisfy the normalization rule $\sum_a \sum_b W_l(a, b) = 1$. The low frequency subband contains low frequency information. In order to match the information of the LSWT neighborhood of the low frequency subband, the values of the center and its neighbors in the template should differ little from each other [30]. In this article, the template size is 3 × 3. In order to highlight the center pixel of the window, a weighted template is used.

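Suppose the source images $A$ and $B$ are decomposed using the LSWT. ${f}_{K}^{L}\left(x,y\right)$ and ${f}_{F}^{L}\left(x,y\right)$ denote the low frequency coefficients of the source image $K$ ($K = A, B$) and the fused image $F$ which are located at $(x, y)$ in the $L$th decomposition level, respectively. ${\mathsf{\text{IEOL}}}_{K}^{L}\left(x,y\right)$ denotes the IEOL measurement of ${f}_{K}^{L}\left(x,y\right)$. The proposed IEOL-based fusion rule can be described as follows:

$${f}_{F}^{L}\left(x,y\right) = \begin{cases} {f}_{A}^{L}\left(x,y\right), & {\mathsf{\text{IEOL}}}_{A}^{L}\left(x,y\right) \ge {\mathsf{\text{IEOL}}}_{B}^{L}\left(x,y\right) \\ {f}_{B}^{L}\left(x,y\right), & \text{otherwise} \end{cases}$$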

It means that coefficients with maximum IEOL measurement are selected as the coefficient of the fused image when subbands are compared in the LSWT domain. For simplicity, we name this fusion rule as 'IEOL-max' rule in this article.
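A hedged sketch of an IEOL-style measure and the 'IEOL-max' selection follows. The exact definitions are not reproduced here, so several parts are assumptions: the second differences are taken in absolute value per direction (so that opposite signs cannot cancel, as motivated above), the template weights are an illustrative normalized 3 × 3 choice that emphasises the center, borders are replicated, and all function names are hypothetical.

```python
# Assumed 3 x 3 weights: normalized to sum to 1, center weighted most.
TEMPLATE = [[1 / 16, 2 / 16, 1 / 16],
            [2 / 16, 4 / 16, 2 / 16],
            [1 / 16, 2 / 16, 1 / 16]]


def pixel(img, x, y):
    """Read img[x][y] with replicated borders."""
    x = min(max(x, 0), len(img) - 1)
    y = min(max(y, 0), len(img[0]) - 1)
    return img[x][y]


def modified_laplacian(img, x, y):
    """Sum of absolute second differences along the two axes."""
    c = 2 * pixel(img, x, y)
    return (abs(c - pixel(img, x - 1, y) - pixel(img, x + 1, y))
            + abs(c - pixel(img, x, y - 1) - pixel(img, x, y + 1)))


def ieol(img, x, y):
    """Template-weighted energy of the modified Laplacian around (x, y)."""
    return sum(TEMPLATE[a + 1][b + 1] * modified_laplacian(img, x + a, y + b) ** 2
               for a in (-1, 0, 1) for b in (-1, 0, 1))


def ieol_max(low_a, low_b):
    """Per position, keep the coefficient whose focus measure is larger."""
    rows, cols = len(low_a), len(low_a[0])
    return [[low_a[x][y] if ieol(low_a, x, y) >= ieol(low_b, x, y) else low_b[x][y]
             for y in range(cols)] for x in range(rows)]


sharp = [[0, 0, 9, 9], [0, 0, 9, 9], [0, 0, 9, 9], [0, 0, 9, 9]]
blurred = [[2, 4, 5, 7], [2, 4, 5, 7], [2, 4, 5, 7], [2, 4, 5, 7]]
fused = ieol_max(sharp, blurred)
```

In this toy case the step edge gives the sharp subimage the larger measure at every position, so the rule copies all coefficients from it.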

### 4.2. Selection of bandpass subband coefficients

The coefficients in the high frequency subbands represent the detailed component of the source image. In traditional multiresolution fusion algorithms, such as [9, 31, 32], the multiresolution coefficients with larger absolute value are considered as sharp brightness changes or salient features in the corresponding source image, such as the edges, contours, and region boundaries, and so on. Thus, for the high frequency subbands coefficients, the most commonly used selection principle is the 'absolute-maximum-choosing' scheme (simplified and named 'Coef-abs-max') without taking any consideration of lowpass subband coefficients, that is, all the information in the lowpass subband is neglected.

Furthermore, in many practical applications, images are distorted by noise during the acquisition or transmission process. Yet almost all the traditional MST-based image fusion algorithms are designed to transfer the high frequency information from the input images to the fused image. It is worth noting that much of the image noise is also related to the high frequencies and may cause miscalculation of the sharpness value. As a result, the fused images obtained by these methods are noisier than the source images, and the performance is degraded. To make up for the deficiencies of traditional MST-based image fusion algorithms, in our proposed method, after decomposing the original images using the LSWT, we design a new image fusion rule based on multiscale products.

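In the local luminance contrast of [17], the contrast is the ratio of the high frequency component to the local luminance of the background:

$$R = \frac{L' - L_{B'}}{L_{B'}} = \frac{\Delta L}{L_{B'}} \qquad (19)$$

where $L'$ denotes the local gray level, and $L_{B'}$ is the local brightness of the background and corresponds to the low frequency component. Therefore, $\Delta L$ can be taken as the high frequency component.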

Based on the above idea, many different forms of contrast measurement have been proposed in the MST domain and provide better performance than the 'Coef-abs-max' scheme [18, 19, 25]. However, in those contrast measurements, the value (or absolute value) of a single pixel of the high frequency sub-image, namely the coefficient of the high frequency subband when the source image is decomposed by the MST, is used as Δ*L*. In fact, the value (or absolute value) of a single pixel is of very limited use in determining which pixel comes from the clear part of the sub-image. So, using only the value (or absolute value) of a single pixel as the high frequency component is not effective enough. We believe it is more reasonable to employ a feature of the high frequency subband, rather than the value (or absolute value) of a pixel, as Δ*L* in the contrast measurement in Equation (19).

Like the sharpness measure, the ISL, shown in Equation (16), can effectively represent the salient features and sharp boundaries of an image. When the source images are compared with each other, pixels with larger values of ISL are more likely to be in focus. That means the ISL can successfully determine which pixel is in focus. Therefore, it is reasonable to utilize the ISL as one type of feature of the high frequency subband to represent Δ*L* in the contrast measurement.

We use $\mathrm{ISL}^{d,l}(x, y)$ ($l = 1, 2, \ldots, L$) to denote the ISL located at $(x, y)$ in the $d$th direction ($d = 1, 2, 3$) and $l$th scale. The feature contrast $R^{d,l}(x, y)$ is defined following Equation (19), with the ISL taking the place of Δ*L* and the low frequency coefficients $f^l(x, y)$, located at $(x, y)$ in the $l$th scale, taking the place of the background brightness. In order to improve the robustness of the contrast to the noise of the low frequency subband, the feature contrast can be modified so that the low frequency term is evaluated over a local area (Equation (22)).

In Equation (22) the local area size $m \times n$ may be 3 × 3 or 5 × 5. In practice, to reduce the computational complexity and the influence of low frequency subband noise, $f^l(x, y)$ can be substituted with the coarsest lowpass subband image $f^L(x, y)$.

$W_h$ is a template of size 3 × 3. The high frequency subband contains high frequency information. In order to match the information of the LSWT neighborhood of the high frequency subband, the values of the center and its neighbors in the template should differ relatively strongly from each other [30]. In this article, a weighted template based on city-block distance is used. $\mathrm{PSL}^{d,l}(x, y)$ denotes the ISL of the multiscale products located at $(x, y)$ in the $l$th scale and $d$th direction; $P^{d,l} f(x, y)$ and $\mathrm{MPS}^{d,l}(x, y)$ are the corresponding multiscale products and the feature contrast.

The local feature contrast of multiscale products can not only effectively represent the salient features and sharp boundaries of an image, but also effectively avoid the influence of noise. A large value of the feature contrast means more high frequency information. So the proposed fusion scheme can extract more useful detail information from the source images and inject it into the fused image. For simplicity, we name this fusion rule 'MSP-con-max' in this article.
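The following sketch illustrates the idea of selecting high frequency coefficients by a contrast of the multiscale products. It is a simplified stand-in, not the 'MSP-con-max' rule itself: a plain local energy over a uniform 3 × 3 window replaces the ISL-type feature and the city-block template, the background is the local mean magnitude of the lowpass subband, `eps` is an added guard against division by zero, and the data and function names are invented for illustration.

```python
def window(img, x, y):
    """Values in the 3 x 3 neighbourhood of (x, y), with replicated borders."""
    rows, cols = len(img), len(img[0])
    return [img[min(max(x + a, 0), rows - 1)][min(max(y + b, 0), cols - 1)]
            for a in (-1, 0, 1) for b in (-1, 0, 1)]


def feature_contrast(products, low, x, y, eps=1e-6):
    """Local energy of the multiscale products over the local lowpass mean."""
    feature = sum(v * v for v in window(products, x, y))
    background = sum(abs(v) for v in window(low, x, y)) / 9.0
    return feature / (background + eps)


def contrast_max(high_a, high_b, low_a, low_b):
    """Pick fine-scale coefficients from the source with the larger contrast.

    high_a and high_b are (scale l, scale l + 1) subband pairs of equal size.
    """
    prod_a = [[u * v for u, v in zip(r0, r1)] for r0, r1 in zip(*high_a)]
    prod_b = [[u * v for u, v in zip(r0, r1)] for r0, r1 in zip(*high_b)]
    fused = []
    for x in range(len(low_a)):
        row = []
        for y in range(len(low_a[0])):
            ca = feature_contrast(prod_a, low_a, x, y)
            cb = feature_contrast(prod_b, low_b, x, y)
            row.append(high_a[0][x][y] if ca >= cb else high_b[0][x][y])
        fused.append(row)
    return fused


# Source A: an edge that persists across scales. Source B: strong noise at
# the fine scale that almost vanishes at the next scale. Values are made up.
edge = ([[0.0, 4.0, 0.0], [0.0, 4.0, 0.0], [0.0, 4.0, 0.0]],
        [[0.0, 3.0, 0.0], [0.0, 3.0, 0.0], [0.0, 3.0, 0.0]])
noise = ([[5.0, -5.0, 5.0], [-5.0, 5.0, -5.0], [5.0, -5.0, 5.0]],
         [[0.1, 0.0, 0.0], [0.0, 0.1, 0.0], [0.0, 0.0, 0.1]])
low = [[10.0, 10.0, 10.0] for _ in range(3)]

fused = contrast_max(edge, noise, low, low)
abs_max = [[u if abs(u) >= abs(v) else v for u, v in zip(ra, rb)]
           for ra, rb in zip(edge[0], noise[0])]
```

On this toy input the product-based contrast keeps the edge coefficients everywhere, whereas a plain absolute-maximum rule on the fine scale picks up the noise.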

## 5. Experimental results and analysis

In order to show the advantages of the new image fusion method, we use three steps to demonstrate that it outperforms other methods. First, 'MSP-con-max' is compared with 'Coef-abs-max', 'Traditional-contrast-max' ('Tra-con-max'), and the proposed 'Feature-contrast-max' ('Fea-con-max'), which is designed according to Equation (23), to demonstrate the performance of the 'MSP-con-max' rule. For 'Tra-con-max', the absolute value of a single pixel of the high frequency subband is used as Δ*L* in the contrast measurement. Second, the proposed image fusion algorithm is compared with the DWT-simple-based method (Method 1), the LSWT-simple-based method (Method 2), and the NSCT-simple-based method (Method 3), in all of which the low frequency subband coefficients and the high frequency subband coefficients are simply merged by the 'averaging' scheme and the 'Coef-abs-max' scheme, respectively. For comparison purposes, the proposed algorithm is also compared with four other fusion algorithms (Methods 4-7). In Methods 4 and 5, the LSWT is used as the MST method, and the 'IEOL-max' fusion rule is employed to merge the low frequency subband coefficients; the high frequency subband coefficients are fused with the 'Coef-abs-max' rule in Method 4 and the 'Tra-con-max' rule in Method 5. In Method 6, the fusion rules of [7], which were designed based on the features of the multiscale products and a pulse coupled neural network (PCNN), are used to merge the low and high frequency subband LSWT coefficients (we name this method 'LSWT-PCNN'). The PCNN is a model based on the cat's primary visual cortex. It is characterized by the global coupling and pulse synchronization of neurons and has been proven suitable for image processing [33]. In Method 7, the NSCT is used as the MST method, and our proposed 'IEOL-max' and 'MSP-con-max' rules are employed to fuse the low and high frequency subband coefficients, respectively (we name it 'NSCT-MSP-Con'). The multiscale products of the NSCT can be defined just like Equation (13).

In all of these methods, the 'db5' and 'db53' wavelets, together with a decomposition level of 3, are used in the DWT-based and LSWT-based methods (including Methods 2, 4, 5, 6 and our proposed method), respectively. Three decomposition levels are also used in the NSCT-based methods (NSCT-simple and NSCT-MSP-Con). All of these methods are used to fuse the clean multifocus images. Third, the noisy multifocus images shown in Figure 5e,f are fused by the above methods.

### 5.1. Contrast-based fusion rule in LSWT domain

In this section, we show the performance of the 'Fea-con-max' and 'MSP-con-max' fusion rules. In order to demonstrate the advantages of the new fusion rules, 'MSP-con-max' and 'Fea-con-max' are compared with 'Tra-con-max' and 'Coef-abs-max' on high frequency subbands in the LSWT domain.

Figure 6e-h shows the multiscale products of Figure 6a-d, respectively. From Figure 6g,h, we can see that the multiscale products of the LSWT can distinguish edge structures from noise effectively. Figure 6i-l are the decision maps, in which the coefficients selected from the image in Figure 6b are shown in white, whereas the coefficients from Figure 6a are shown in black. Since the labeled part of Figure 6b is clearer than that of Figure 6a, the optimal decision map should be entirely white, which means all coefficients should be selected from Figure 6b. However, the decision maps of the 'Coef-abs-max' and 'Tra-con-max' rules, shown in Figure 6i,j, indicate that these rules do not select the coefficients from the clear part completely, even though 'Tra-con-max' performs better than 'Coef-abs-max'. Figure 6k,l indicates that the proposed feature contrast is more reasonable than the traditional contrast. It also shows that using a feature such as the ISL in the contrast is more reasonable than using the absolute value of a single pixel.

Figure 6m-p shows the decision maps, in which white indicates that coefficients are selected from Figure 6d and black that they are selected from Figure 6c. From these figures we can see that the proposed 'MSP-con-max' rule does well in the fusion of noisy multifocus images. All of these results demonstrate that the proposed fusion rule can not only select the coefficients of the fused image properly but also restrain the influence of noise effectively.

### 5.2. Fusion of clean multifocus images

In this section, the experiments are performed on three sets of clean multifocus images, which are shown in Figure 5a-d,h-j. All the experiments are implemented in Matlab 7.01 on an AMD Athlon(tm) 2.4 GHz machine with 2 GB of RAM. For further comparison, besides visual observation, two objective criteria are used to compare the fusion results. The first criterion is the mutual information (MI) [34]. It is a metric defined as the sum of the mutual information between each input image and the fused image. The second criterion is the $Q^{AB/F}$ metric [35], proposed by Xydeas and Petrovic, which considers the amount of edge information transferred from the input images to the fused image. This method uses a Sobel edge detector to calculate strength and orientation information at each pixel in both the source images and the fused image. For both criteria, the larger the value, the better the fusion result.
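The MI criterion can be sketched directly from its description as the sum of the mutual information between each input image and the fused image. The snippet is an illustration under assumptions: gray levels are used as histogram bins without further quantization, the logarithm is base 2, and any normalization used in [34] is not reproduced; the function names are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(img_x, img_y):
    """Mutual information (in bits) of two equally sized gray-level images."""
    xs = [v for row in img_x for v in row]
    ys = [v for row in img_y for v in row]
    n = len(xs)
    joint = Counter(zip(xs, ys))          # joint histogram
    px, py = Counter(xs), Counter(ys)     # marginal histograms
    # p(x, y) * log2(p(x, y) / (p(x) * p(y))), written with raw counts
    return sum(c / n * log2(c * n / (px[x] * py[y]))
               for (x, y), c in joint.items())


def fusion_mi(src_a, src_b, fused):
    """Sum of the mutual information between each source and the fused image."""
    return mutual_information(src_a, fused) + mutual_information(src_b, fused)


a = [[0, 0], [1, 1]]
b = [[0, 1], [0, 1]]
score = fusion_mi(a, b, a)   # fused image taken to be identical to a
```

In this toy case the fused image shares one bit of information with `a` and none with `b`, so the score is 1 bit; a fused image that carries information from both sources would score higher.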

The values of MI, $Q^{AB/F}$ and the execution time of Figures 8a-h, 9a-h, and 10a-h are listed in Tables 1, 2, and 3, respectively. We observe that the fused images produced by the NSCT-simple-based method are slightly better than the LSWT-simple fusion results, and both outperform the DWT approach in terms of MI and $Q^{AB/F}$. However, the NSCT is time consuming, which impedes its real-time application. As a modified version of the LWT, the LSWT consumes more time than the DWT because it is shift-invariant and therefore needs to process more image data during the fusion process.

Table 1. Performance of different fusion methods on processing Figure 5a,b

Methods | MI | $Q^{AB/F}$ | Time (s) |
---|---|---|---|
DWT-simple | 6.4711 | 0.7282 | 2.0910 |
LSWT-simple | 6.7608 | 0.7544 | 5.30230 |
NSCT-simple | 6.7356 | 0.7582 | 300.0400 |
LSWT-IEOL | 7.1506 | 0.7643 | 7.5430 |
LSWT-Tra-Con | 7.2098 | 0.7671 | 10.5260 |
LSWT-PCNN | 7.4010 | 0.7735 | 180.8200 |
NSCT-MSP-Con | 7.4040 | 0.7753 | 363.0860 |
Proposed method | 7.4034 | 0.7746 | 28.4390 |

Table 2. Performance of different fusion methods on processing Figure 5c,d

Methods | MI | $Q^{AB/F}$ | Time (s) |
---|---|---|---|
DWT-simple | 5.0938 | 0.6643 | 1.9350 |
LSWT-simple | 5.4225 | 0.6919 | 4.3480 |
NSCT-simple | 5.5337 | 0.6953 | 210.1170 |
LSWT-IEOL | 5.4837 | 0.6902 | 6.0200 |
LSWT-Tra-Con | 5.9649 | 0.7008 | 8.4400 |
LSWT-PCNN | 6.1962 | 0.7049 | 133.6140 |
NSCT-MSP-Con | 6.1101 | 0.7171 | 265.8900 |
Proposed method | 6.8804 | 0.7169 | 19.4900 |

Table 3. Performance of different fusion methods on processing Figure 5h-j

Methods | MI | $Q^{AB/F}$ | RMSE | Time (s) |
---|---|---|---|---|
DWT-simple | 7.0350 | 0.8217 | 0.0075 | 5.0590 |
LSWT-simple | 7.6896 | 0.8471 | 0.0062 | 9.0900 |
NSCT-simple | 7.9574 | 0.8512 | 0.0058 | 452.2690 |
LSWT-IEOL | 8.2163 | 0.8501 | 0.0053 | 11.4980 |
LSWT-Tra-Con | 8.2687 | 0.8585 | 0.0050 | 14.4680 |
LSWT-PCNN | 8.6596 | 0.8606 | 0.0046 | 277.4470 |
NSCT-MSP-Con | 9.1801 | 0.8673 | 0.0044 | 520.1610 |
Proposed method | 9.1845 | 0.8681 | 0.0040 | 31.4180 |

From Tables 1, 2, and 3, we find that the NSCT-MSP-Con-based method performs similarly to our proposed method. However, NSCT-MSP-Con is more time consuming because of its multi-directional decomposition and higher complexity. Considering both fusion quality and computational complexity, we use the LSWT as the MST in the proposed algorithm. The LSWT-PCNN-based method is also more time consuming than the proposed method, because the PCNN neuron model is complex and requires iterative operation to obtain pleasing fusion results. Moreover, each neuron has a large number of parameters to be adjusted, and these parameters strongly affect one another. In image processing with PCNN, the same values are often assigned to the corresponding parameters of every neuron, chosen by experiment or experience. In the human visual system, however, it is implausible that all neurons share the same parameter values; the parameters should depend on the state of each neuron cell. All of these disadvantages significantly compromise the performance of LSWT-PCNN. By contrast, our proposed method not only accounts for the HVS property of being highly sensitive to local image contrast, but is also simple to compute and efficient. Therefore, our proposed method can provide better performance than LSWT-PCNN.
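To make the time comparison concrete, the short sketch below computes how many times longer NSCT-MSP-Con and LSWT-PCNN run than the proposed method, using only the execution times reported in Tables 1-3. The dictionary layout and the helper name `slowdown` are illustrative choices, not part of the article.

```python
# Execution times in seconds, copied from Tables 1-3.
times = {
    "Figure 5a,b": {"NSCT-MSP-Con": 363.0860, "LSWT-PCNN": 180.8200, "Proposed": 28.4390},
    "Figure 5c,d": {"NSCT-MSP-Con": 265.8900, "LSWT-PCNN": 133.6140, "Proposed": 19.4900},
    "Figure 5h-j": {"NSCT-MSP-Con": 520.1610, "LSWT-PCNN": 277.4470, "Proposed": 31.4180},
}

def slowdown(row, method):
    """How many times longer `method` runs than the proposed method."""
    return row[method] / row["Proposed"]

ratios = {
    images: {m: round(slowdown(row, m), 1) for m in ("NSCT-MSP-Con", "LSWT-PCNN")}
    for images, row in times.items()
}
for images, r in ratios.items():
    print(images, r)
```

With the tabulated values, NSCT-MSP-Con takes roughly 13 to 17 times as long as the proposed method, and LSWT-PCNN roughly 6 to 9 times as long.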

*Q*^{AB/F}. Moreover, since a reference (everywhere-in-focus) image for Figure 5h-j is available, as shown in Figure 5g, the different methods can also be compared using the root mean square error (RMSE). The values of RMSE between Figure 11a-h and Figure 5g are given in Table 3. Table 3 shows that the RMSE results agree very well with the MI and *Q*^{AB/F} results.
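As a minimal sketch of the RMSE comparison, assuming the standard definition (square root of the mean squared pixel difference) and intensities scaled to [0, 1], which the magnitude of the values in Table 3 suggests but the text does not state:

```python
import numpy as np

def rmse(reference, fused):
    """Root mean square error between a reference image and a fused image."""
    reference = np.asarray(reference, dtype=float)
    fused = np.asarray(fused, dtype=float)
    if reference.shape != fused.shape:
        raise ValueError("images must have the same shape")
    return float(np.sqrt(np.mean((reference - fused) ** 2)))

# Toy example: a flat 4x4 reference and a fused image with one wrong pixel.
reference = np.full((4, 4), 0.5)
fused = reference.copy()
fused[0, 0] += 0.04
value = rmse(reference, fused)  # sqrt(0.04**2 / 16) = 0.01
```

A smaller RMSE means the fused image is closer to the everywhere-in-focus reference, so, unlike MI and *Q*^{AB/F}, lower is better.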

### 5.3. Fusion of noisy multifocus images

In order to evaluate the performance of the proposed method in a noisy environment, the input multifocus images 'pepsi', as shown in Figure 5e,f, have been additionally corrupted by Gaussian noise with deviation *δ* = 0.01.

MI and *Q*^{AB/F} are used to evaluate how much information from the clean multifocus images, shown in Figure 5a,b, is contained in the fused images. However, MI and *Q*^{AB/F} cannot evaluate these fusion methods in terms of input/output noise transmission. For further comparison, the improvement in peak signal-to-noise ratio (PSNR), proposed by Loza et al. [6], is adopted to measure the change in noise between the fused image and the noisy source image. Let ${\sigma}_{n,f}^{2}$ denote the noise variance in the fused output; the improvement in terms of PSNR is formulated as:
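One form consistent with this description, taking ${\sigma}_{n}^{2}$ as an assumed symbol for the noise variance of the noisy source images (it is not defined in the surrounding text), is:

$$\Delta \mathrm{PSNR} = 10\,{\log}_{10}\frac{{\sigma}_{n}^{2}}{{\sigma}_{n,f}^{2}}$$

Under this sign convention, a negative value means the fused image contains more noise than the source images, and a value closer to zero or above means less noise has been carried into the fused result.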

For ΔPSNR, the larger the value, the less noise is carried over from the noisy source images into the fused image, and the better the fusion result.
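The behaviour of ΔPSNR can be illustrated with a small numerical sketch. It assumes the form 10·log10(σ_n² / σ_{n,f}²), with σ_n² the input noise variance, and it treats the stated deviation of 0.01 as a standard deviation; both are assumptions. Plain averaging of two noisy copies is used as a stand-in fusion rule purely for illustration and is not the method proposed in the article. The names `delta_psnr`, `noisy_a` and `noisy_b` are hypothetical.

```python
import numpy as np

def delta_psnr(noise_var_in, noise_var_fused):
    """Assumed form: 10*log10(input noise variance / fused noise variance)."""
    return 10.0 * np.log10(noise_var_in / noise_var_fused)

rng = np.random.default_rng(0)
sigma = 0.01  # noise level, interpreted here as a standard deviation

# Synthetic clean image (horizontal ramp) and two noisy observations of it.
clean = np.tile(np.linspace(0.0, 1.0, 256), (256, 1))
noisy_a = clean + rng.normal(0.0, sigma, clean.shape)
noisy_b = clean + rng.normal(0.0, sigma, clean.shape)

# Stand-in fusion: plain averaging, which halves the variance of independent noise.
fused = 0.5 * (noisy_a + noisy_b)

# Estimate the output noise variance from the residual against the clean image.
noise_var_fused = np.var(fused - clean)
gain = delta_psnr(sigma ** 2, noise_var_fused)
```

Because averaging two independent noisy copies halves the noise variance, `gain` comes out close to 10·log10(2), about 3 dB. A fusion rule that amplifies noise instead would give a negative value, as in the results reported for the noisy 'pepsi' images.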

Figure 11a-h illustrates the fusion results obtained by the different methods above. For a clearer comparison, Figure 11i-p shows parts of the fusion results. In Figure 11i-l, the edge information of the images fused by Methods 1-4 is immersed in noise. This is because all of these fusion methods are designed to transfer high frequency information from the input images into the fused image, and much of the image noise is also concentrated in the high frequencies. As a result, the fused images obtained by these methods are noisier than the source images. The edges in Figure 11m are less clear than those in Figure 11n-p, because the noise in the source images causes miscalculation of the contrast values. Therefore, in the presence of noise, Methods 1-5 may not perform as well as they do in noiseless environments. Figure 11n indicates that Method 6 can reduce the noise level to some extent, but the edges of its fused image are less clear than those in Figure 11o,p, which are fused by Method 7 and our proposed algorithm.

*Q*^{AB/F} quality. The values of ΔPSNR indicate that the proposed fusion rule for the high frequency subbands is more reliable, robust, and stable than the other fusion rules.

Table 4. Performance of different fusion methods on processing Figure 5e,f

| Methods | MI | *Q*^{AB/F} | ΔPSNR |
|---|---|---|---|
| DWT-simple | 1.6133 | 0.1533 | -4.2930 |
| LSWT-simple | 1.7213 | 0.1581 | -2.3292 |
| NSCT-simple | 1.7450 | 0.1673 | -1.5841 |
| LSWT-IEOL | 1.7107 | 0.1640 | -2.2970 |
| LSWT-Tra-Con | 1.7984 | 0.1728 | -1.8413 |
| LSWT-PCNN | 1.9254 | 0.1810 | -0.6025 |
| NSCT-MSP-Con | 1.9041 | 0.1781 | -0.2334 |
| Proposed method | 1.9298 | 0.1845 | -0.4136 |

## 6. Conclusion

In this article, a new multifocus image fusion algorithm based on the feature contrast of multiscale products in the LSWT domain is proposed. In the proposed algorithm, a novel feature contrast of multiscale products, which represents edge features in the high frequency sub-images in the LSWT domain, is developed and used as the fusion rule for the high frequency subbands. Three pairs of clean multifocus images and one pair of noisy multifocus images are used to test the performance of the proposed image fusion method. The experimental results demonstrate that the proposed method outperforms the DWT-simple-based, LSWT-simple-based, LSWT-Traditional-Contrast-based, LSWT-PCNN-based, and NSCT-simple-based methods in terms of both visual quality and objective evaluation, even when the source images are corrupted by noise. In future work, we will study the fusion of noisy images further, with the aim of carrying out denoising and fusion of noisy source images simultaneously, which we expect to become a new trend in the image fusion field.

## Declarations

### Acknowledgements

The authors would like to thank the associate editor and the anonymous reviewers for their careful study and valuable suggestions for an earlier version of this article. The article was jointly supported by the National Natural Science Foundation of China (No. 60974090), the Ph.D. Programs Foundation of Ministry of Education of China (No. 200806110016), and the Fundamental Research Funds for the Central Universities (No. CDJXS10172205).

## References

1. Seales WB, Dutta S: Everywhere-in-focus image fusion using controllable cameras. *Proc SPIE* 1996, 2905:227-234.
2. Li ST, Yang B: Multifocus image fusion using region segmentation and spatial frequency. *Image Vision Comput* 2008, 26(7):971-979. doi:10.1016/j.imavis.2007.10.012
3. Wang ZB, Ma YD, Gu J: Multi-focus image fusion using PCNN. *Pattern Recogn* 2010, 43(6):2003-2016. doi:10.1016/j.patcog.2010.01.011
4. Li ST, Kwok JT, Wang YN: Multifocus image fusion using artificial neural networks. *Pattern Recogn Lett* 2002, 23(8):985-997. doi:10.1016/S0167-8655(02)00029-6
5. Pajares G, de la Cruz JM: A wavelet-based image fusion tutorial. *Pattern Recogn* 2004, 37(9):1855-1872. doi:10.1016/j.patcog.2004.03.010
6. Loza A, Bull D, Canagarajah N, Achim A: Non-gaussian model-based fusion of noisy images in the wavelet domain. *Comput Vis Image Understand* 2010, 114(1):54-65. doi:10.1016/j.cviu.2009.09.002
7. Chai Y, Li HF, Guo MY: Multifocus image fusion scheme based on features of multiscale products and PCNN in lifting stationary wavelet domain. *Opt Commun* 2011, 248(5):1146-1158.
8. Petrovic VS, Xydeas CS: Gradient-based multiresolution image fusion. *IEEE Trans Image Process* 2004, 13(2):228-237. doi:10.1109/TIP.2004.823821
9. Li H, Manjunath BS, Mitra SK: Multisensor image fusion using the wavelet transform. *Graph Models Image Process* 1995, 57(3):235-245. doi:10.1006/gmip.1995.1022
10. Qu XB, Yan JW, Xiao HZ: Image fusion algorithm based on spatial frequency-motivated pulse coupled neural networks in nonsubsampled contourlet transform domain. *Acta Automatica Sinica* 2008, 34(12):1508-1514.
11. Chai Y, Li HF, Qu JF: Multifocus image fusion using a novel dual-channel PCNN in lifting stationary wavelet transform. *Opt Commun* 2010, 283(19):3591-3602. doi:10.1016/j.optcom.2010.04.100
12. Wilson T, Rogers S, Kabrisky M: Perceptual based hyperspectral image fusion using multi-spectral analysis. *Opt Eng* 1995, 34(11):3154-3164. doi:10.1117/12.213617
13. Sweldens W: The lifting scheme: a construction of second generation wavelets. *SIAM J Math Anal* 1998, 29(2):511-546. doi:10.1137/S0036141095289051
14. Coifman RR, Donoho DL: *Translation Invariant De-Noising, Wavelet and Statistics*. Edited by: A Antoniadis, G Oppenheim. Springer-Verlag, New York; 1995:125-150.
15. Lee CS, Lee CK, Yoo KY: New lifting based structure for undecimated wavelet transform. *Electron Lett* 2000, 36(22):1894-1895. doi:10.1049/el:20001294
16. da Cunha AL, Zhou JP, Do MN: The nonsubsampled contourlet transform: theory, design and application. *IEEE Trans Image Process* 2006, 15(10):3089-3101.
17. Toet A, Van ruyven LJ, Valeton JM: Merging thermal and visual images by a contrast pyramid. *Opt Eng* 1989, 28(7):789-792.
18. Yang L, Guo BL, Ni W: Multimodality medical image fusion based on multiscale geometric analysis of contourlet transform. *Neurocomputing* 2008, 72(1-3):203-211. doi:10.1016/j.neucom.2008.02.025
19. Zhang Q, Guo BL: Fusion of multi-sensor images based on the nonsubsampled contourlet transform. *Acta Automatica Sinica* 2008, 34(2):135-141.
20. Bao P, Zhang L: Noise reduction for magnetic resonance images via adaptive multiscale products thresholding. *IEEE Trans Med Imag* 2003, 22(9):1089-1099. doi:10.1109/TMI.2003.816958
21. Xu Y, Weaver JB, Healy DM Jr, Lu J: Wavelet transform domain filters: a spatially selective noise filtration technique. *IEEE Trans Imag Process* 1994, 3(6):747-758. doi:10.1109/83.336245
22. Sweldens W: The lifting scheme: a custom-design construction of biorthogonal wavelets. *Appl Comput Harmonic Anal* 1996, 3(2):186-200. doi:10.1006/acha.1996.0015
23. Claypoole RL, Davis GM, Sweldens W, Baraniuk R: Nonlinear wavelet transforms for image coding via lifting. *IEEE Trans Image Process* 2003, 12(12):1449-1459. doi:10.1109/TIP.2003.817237
24. Stepien J, Zienlinski T, Rumian R: Image denoising using scale-adaptive lifting schemes. In *Proceedings of the International Conference on Image*. *Volume 3*. Vancouver, BC, Canada; 2000:288-290.
25. Zhang Q, Guo BL: Multifocus image fusion using the nonsubsampled contourlet transform. *Signal Process* 2009, 89(7):1334-1346. doi:10.1016/j.sigpro.2009.01.012
26. Sadler BM, Swami A: Analysis of multiscale products for step detection and estimation. *IEEE Trans Inf Theory* 1999, 45(3):1041-1051.
27. Maintz JB, Viergever MA: A survey of medical image registration. *Med Image Anal* 1998, 2(1):1-36.
28. Wei H, Jing ZL: Evaluation of focus measures in multi-focus image fusion. *Pattern Recogn Lett* 2007, 28(4):493-500. doi:10.1016/j.patrec.2006.09.005
29. Wei H, Jing ZL: Multi-focus image fusion using pulse coupled neural network. *Pattern Recogn Lett* 2007, 28(9):1123-1132. doi:10.1016/j.patrec.2007.01.013
30. Song YJ, Ni GQ, Gao K: Regional energy weighting image fusion algorithm by wavelet based contourlet transform. *Trans Beijing Inst Technol* 2008, 28(2):168-172.
31. Li ST, Yang B: Multifocus image fusion by combining curvelet and wavelet transform. *Pattern Recogn Lett* 2008, 29(9):1295-1301. doi:10.1016/j.patrec.2008.02.002
32. Li ST, Yang B, Hu JW: Performance comparison of different multi-resolution transforms for image fusion. *Inf Fusion* 2011, 12(2):74-84. doi:10.1016/j.inffus.2010.03.002
33. Johnson JL, Padgett ML: PCNN models and applications. *IEEE Trans Neural Netw* 1999, 10(3):480-498. doi:10.1109/72.761706
34. Qu G, Zhang D, Yan P: Information measure for performance of image fusion. *Electron Lett* 2001, 38(7):313-315.
35. Petrovic V, Xydeas C: On the effects of sensor noise in pixel-level image fusion performance. In *IEEE Proceedings of the Third International Conference on Image Fusion*. *Volume 2*. Paris, France; 2000:14-19.

## Copyright

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.