Multifocus image fusion scheme based on feature contrast in the lifting stationary wavelet domain
EURASIP Journal on Advances in Signal Processing volume 2012, Article number: 39 (2012)
Abstract
For fusion of multifocus images, a novel image fusion method based on multiscale products in the lifting stationary wavelet transform (LSWT) domain is proposed in this article. In order to avoid the influence of noise and select the coefficients of the fused image properly, different subbands employ different selection principles. For the low frequency subband coefficients, a new modified energy of Laplacian (EOL) is proposed and used as the focus measure to select coefficients from the clear parts of the low frequency subband images. For the high frequency subband coefficients, a novel feature contrast measurement of the multiscale products is proposed, which proves more suitable for fusion of multifocus images than the traditional contrast measurement, and is used to select coefficients from the sharp parts of the high frequency subbands. Experimental results demonstrate that the proposed fusion approach outperforms the traditional discrete wavelet transform (DWT)-based, LSWT-based and LSWT-traditional-contrast (LSWT-TraCon)-based image fusion methods, even when the source images are corrupted by Gaussian noise, in terms of both visual quality and objective evaluation.
1. Introduction
In applications of digital cameras, when a lens focuses on a subject at a certain distance, all subjects at that distance are sharply focused. Subjects not at the same distance are out of focus and theoretically are not sharp. It is often not possible to get an image that contains all relevant objects in focus. One way to overcome this problem is image fusion, in which one can acquire a series of pictures with different focus settings and fuse them to produce an image with extended depth of field [1–3]. During the fusion process, all the important visual information found in the input images must be transferred into the fused image without introduction of artifacts. In addition, the fusion algorithm should be reliable and robust to imperfections such as noise or misregistration [4–6].
During the last decade, a number of techniques for image fusion have been proposed. A simple image fusion method consists of taking the average of the source images pixel by pixel. However, along with simplicity come several undesired side effects, including reduced contrast. In recent years, many researchers have recognized that multiscale transforms (MST) are very useful for image fusion, and various MST-based fusion methods have been proposed [7–11]. In the MST domain, the discrete wavelet transform (DWT) has become the most popular and important multiscale decomposition method in image fusion. Compared with the Laplacian pyramid transform, the DWT has been found to have some advantages: (1) The DWT not only possesses localization but also provides directional information, while the pyramid representation fails to introduce any spatial orientation selectivity into the decomposition process [9]. So the DWT can represent the underlying information of the source images more efficiently, which makes the fused image more accurate. (2) No blocking artifacts, which often occur in Laplacian-pyramid-fused images, can be observed in DWT-based fused images. (3) DWT-based fusion has better signal-to-noise ratios than Laplacian-based fusion [12]. (4) DWT-based fused images improve perception over pyramid-based fused images. More advantages of the DWT over the Laplacian pyramid scheme can be found in [9, 12].
However, the DWT has its own disadvantages. It needs a great deal of convolution calculation, and it consumes much time and occupies memory resources, which impedes its real-time application. Relative to the DWT, the lifting wavelet transform (LWT) [13] can overcome these shortcomings. Unfortunately, both the original LWT and the DWT lack shift-invariance and cause pseudo-Gibbs phenomena around singularities [14], which reduce the resultant image quality. Thus, the lifting stationary wavelet transform (LSWT) [15], a fully shift-invariant form of the LWT, is introduced and used as the MST method in this article.
Besides the LSWT discussed in the above paragraph, the nonsubsampled contourlet transform (NSCT) [16], which is also shift-invariant, is another important MST method in the image fusion field. In contrast to the LSWT, the NSCT is built upon nonsubsampled pyramids and nonsubsampled directional filter banks [16]. In the NSCT, the nonsubsampled pyramids are first used to achieve the multiscale decomposition, and then the nonsubsampled directional filter banks are employed to achieve the directional decomposition. The number of directions at each decomposition level can differ, which is much more flexible than the three directions of the wavelet, so the NSCT can obtain better fusion results than the LSWT. However, the NSCT is more time-consuming than the LSWT because of its multidirectionality and complexity, which greatly impedes its real-time application. Considering both fusion quality and computational complexity, the LSWT is used as the MST method in our proposed method.
In MST-domain image fusion algorithms, one of the most important factors for improving fusion quality is the selection of fusion rules, which strongly influences the performance of the fusion algorithm. According to physiological and psychological research, the human vision system (HVS) is highly sensitive to the local image contrast level. To meet this requirement, Toet and Ruyven [17] developed the local luminance contrast in their work on the contrast pyramid (CP). In the local luminance contrast, the contrast level is defined as the ratio of the high frequency component of the image to the local luminance of the background.
Based on the idea of [17], many different forms of contrast measurement have been proposed and successfully used in image fusion [18, 19]. However, in these contrast measurements, the value (or absolute value) of a single pixel of the high frequency subband in the MST domain is directly used as the strength of the high frequency component. In fact, the value (or absolute value) of a single pixel of the high frequency subband is of very limited use in determining which pixel comes from the clear part of the subimages. So, using a single pixel alone as the high frequency component in the local contrast measurement is not ideal. In addition, almost all MST-based image fusion algorithms do not consider the influence of noise. In many practical applications, additive Gaussian noise, which adds to each image pixel a value drawn from a zero-mean Gaussian distribution, can be systematically introduced into an image during acquisition. This noise may cause miscalculation of sharpness values, which, in turn, degrades the performance of image fusion. To be useful in practice, the fusion algorithm should provide pleasing fusion performance for clean images; meanwhile, it should be reliable and robust to imperfections such as noise.
It is well known that there exist dependencies between wavelet coefficients. If a wavelet coefficient produced by a true signal is of large magnitude at a finer scale, its parents at coarser scales are likely to be large as well. However, for those coefficients caused by noise, the magnitudes will decay rapidly along the scales. So, multiplying the adjacent wavelet scales, namely multiscale products (MSP), can sharpen the important structures while weakening noise [20, 21]. Therefore, multiscale products can distinguish edge structures from noise more effectively.
To make up for the aforementioned deficiencies of the traditional MST-based image fusion methods, we present a new multifocus image fusion scheme which incorporates the merits of interscale dependencies into the image fusion field. In this method, after decomposing the original images using the LSWT, we use a new modified energy of Laplacian, which can reflect the edge features of the low frequency subimage in the LSWT domain, as the focus measure to select the coefficients of the fused image. When choosing the high frequency subband coefficients, a novel local neighborhood feature contrast of the multiscale products, which can effectively represent the salient features and sharp boundaries of an image, is developed and used as the measurement to select coefficients from the clear parts of the source images. The experimental results show that the proposed method does well in fusion of multifocus images whether they are clean or noisy, and outperforms typical wavelet-based, LSWT-based, NSCT-based and LSWT-traditional-contrast-based fusion algorithms in terms of objective criteria and visual appearance.
The article is organized as follows. In Sections 2 and 3, the theories of the LSWT and of multiscale products are introduced in detail, respectively. Section 4 describes the image fusion algorithm using the LSWT and multiscale products. Section 5 compares the performance of the new algorithm with that of other conventional fusion techniques applied to sequences of multifocus test images. Finally, in Section 6 we conclude the article with a short summary.
2. Lifting stationary wavelet transform
2.1. Lifting wavelet transform
The lifting wavelet transform (LWT), proposed by Sweldens [22], is a wavelet construction method using the lifting scheme in the time domain. The main feature of the LWT is that it provides an entirely spatial-domain interpretation of the transform, as opposed to the traditional frequency-domain-based constructions. It abandons the Fourier transform as a design tool for wavelets, and wavelets are no longer defined as translates and dilates of one fixed function. Compared with the classical wavelet transform, the LWT requires less computation and memory, and can produce an integer-to-integer wavelet transform. It is always perfectly reconstructed no matter how the prediction operator and update operator are designed. Moreover, it possesses several further advantages, including the possibility of adaptive and nonlinear design, in-place calculation, and so on [13, 22, 23]. The decomposition stage of the LWT consists of three steps: split, prediction and update.
In the split step, the original signal (or approximation signal) a_l at level l is split into even samples and odd samples:

a_{l+1}(k) = a_l(2k),  d_{l+1}(k) = a_l(2k + 1).  (1)
In the prediction step, we apply a prediction operator P to a_{l+1} to predict d_{l+1}. The resultant prediction error d_{l+1} is regarded as the detail signal of a_l:

d_{l+1}(k) = d_{l+1}(k) − Σ_{r=0}^{M−1} p_r a_{l+1}(k + r),  (2)

where p_r is one of the coefficients of P and M is the length of the prediction coefficients.
In the update step, an update of the even samples a_{l+1} is accomplished by applying an update operator U to the detail signal d_{l+1} and adding the result to a_{l+1}; the resultant a_{l+1} is regarded as the approximation signal of a_l:

a_{l+1}(k) = a_{l+1}(k) + Σ_{j=0}^{N−1} u_j d_{l+1}(k + j),  (3)

where u_j is a coefficient of U and N is the length of the update coefficients.
Let a_l be the input signal of the lifting scheme; the detail and approximation signals at lower resolution levels can be obtained by iterating the above three steps on the output approximation signal a_{l+1}.
The inverse LWT can be performed by reversing the prediction and update operators and changing each '+' into '−' and vice versa. The complete expression of the reconstruction of the LWT is shown in Equations (4)–(6). Figure 1 depicts the structure of the LWT. The computational costs of the forward and inverse transforms are exactly the same. The prediction operator P and update operator U can be designed by the interpolation subdivision method introduced in [23]. Choosing different P and U is equivalent to choosing different biorthogonal wavelet filters [24].
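The three lifting steps and their inverse can be sketched in a few lines of code. This is a minimal illustration, assuming a simple two-tap predictor and updater with periodic boundary handling; the article's P and U are designed by interpolation subdivision and may use different weights.

```python
import numpy as np

def lwt_step(a, p=0.5, u=0.25):
    """One level of a simple lifting wavelet transform: split, predict,
    update. The two-tap weights (p, u) are an assumed illustrative
    choice, not the article's operators."""
    even, odd = a[0::2].astype(float), a[1::2].astype(float)
    # Predict: estimate each odd sample from its even neighbours;
    # the prediction error is the detail signal d.
    d = odd - p * (even + np.roll(even, -1))
    # Update: lift the even samples with the details to obtain the
    # approximation signal at the next level.
    a_next = even + u * (d + np.roll(d, 1))
    return a_next, d

def ilwt_step(a_next, d, p=0.5, u=0.25):
    """Inverse transform: reverse the update and predict steps,
    changing '+' into '-' and vice versa, then merge the samples."""
    even = a_next - u * (d + np.roll(d, 1))
    odd = d + p * (even + np.roll(even, -1))
    out = np.empty(even.size + odd.size)
    out[0::2], out[1::2] = even, odd
    return out
```

Because the inverse exactly undoes each lifting step, perfect reconstruction holds regardless of how p and u are chosen, as the text states.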
2.2. Lifting stationary wavelet transform
In the LWT, shift-invariance is not ensured, because the split step is present and the lengths of the approximation and detail signals decrease. However, shift-invariance is desirable in many image applications such as image enhancement, image denoising and image fusion. In order to obtain the LSWT, which possesses shift-invariance, the method of [15] is adopted in this article.
In the LSWT, the split step is discarded. Let P^l and U^l represent the prediction and update operators of the lifting stationary wavelet at level l, respectively. The initial prediction operator P^0 and initial update operator U^0 can be obtained once M and N are determined, where P^0 = {p_m}, m = 0, 1,..., M − 1; U^0 = {u_n}, n = 0, 1,..., N − 1. The coefficients of P^l and U^l are designed by padding P^0 and U^0 with zeros [15]. The prediction coefficients and update coefficients at level l in the LSWT are expressed as follows:
The decomposition results of an approximation signal a_l at level l via the lifting stationary wavelet are expressed by the following equations,
where d_{l+1} and a_{l+1} are the detail signal and approximation signal of a_l at level l + 1.
The reconstruction procedure of LSWT is directly achieved from its forward transform, which is expressed as below.
The forward and inverse transform of LSWT is shown in Figure 2.
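The construction can be sketched as follows. The two-tap predict/update weights and the periodic boundary handling are assumptions for illustration; the zero padding of P^0 and U^0 shows up as neighbours taken at distance 2^l.

```python
import numpy as np

def lswt_step(a, level, p=0.5, u=0.25):
    """One undecimated lifting step (LSWT sketch). The split step is
    discarded, so the detail and approximation signals keep the full
    length of the input; padding P^0 and U^0 with zeros amounts to
    taking neighbours at distance 2**level."""
    shift = 2 ** level
    # Predict each sample from its dilated neighbours; the residual
    # serves as the detail signal d_{l+1}.
    d = a - p * (np.roll(a, shift) + np.roll(a, -shift))
    # Update the approximation signal with the dilated details.
    a_next = a + u * (np.roll(d, shift) + np.roll(d, -shift))
    return a_next, d
```

Because no subsampling occurs, shifting the input simply shifts both output signals, which is the shift-invariance the fusion rules rely on.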
Compared with the DWT, the LSWT does not downsample or upsample the highpass and lowpass coefficients during the decomposition and reconstruction of the image. So, the LSWT not only retains the desirable properties of the LWT, but also possesses shift-invariance. When the LSWT is introduced into image fusion, more information for fusion can be obtained. In addition, the sizes of the different subimages are identical, so it is easy to find the relationship among different subbands, which is beneficial for designing fusion rules [25]. Therefore, the LSWT is more suitable for image fusion.
3. Multiscale products of LSWT
In MST-based image fusion algorithms, almost all schemes design the fusion rule, namely, the selection principle for high frequency subband coefficients (abbreviated as 'fre-coefs' in the figures of this article), based on the wavelet coefficients directly. It is worth noting that much of the noise is also related to high frequencies. As a result, the fused images obtained by these methods are noisier than the source images. It is well known that there exist dependencies between wavelet coefficients: if a coefficient at a coarser scale has small magnitude, its descendant coefficients at finer scales are likely to be small, and vice versa. Multiplying two adjacent wavelet subbands can amplify the significant features and dilute noise [21, 26].
Suppose f(x) is a one-dimensional (1D) discrete signal. We define the multiscale products of W_l f as

P_l f(x) = Π_{j = l − k_1}^{l + k_2} W_j f(x),  (11)

where k_1 < l and k_2 ≤ L − l are nonnegative integers, L denotes the maximum level, and W_l f(x) denotes the LSWT of the signal f(x) at scale l and position x.
The support of an isolated edge will increase by a factor of two across scales, and neighboring edges will interfere with each other at coarse scales. So in practice it is sufficient to implement the multiplication at two adjacent scales [20]. If we let k_1 = 0 and k_2 = 1, then we calculate the LSWT scale products as

P_l f(x) = W_l f(x) · W_{l+1} f(x).  (12)
According to [7] and Equation (12), for a 2D image f, the multiscale products at the l-th scale, d-th direction and location (x, y) can be defined as

P^{d,l} f(x, y) = W^{d,l} f(x, y) · W^{d,l+1} f(x, y),  (13)
where d = 1, 2, 3 denote the horizontal, vertical and diagonal directions.
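Computationally, the product of two adjacent scales is a pointwise multiplication. The sketch below assumes the detail subbands come from an undecimated transform such as the LSWT, so that all scales have the same size:

```python
import numpy as np

def multiscale_products(details):
    """Pointwise product of detail coefficients at adjacent scales
    (k1 = 0, k2 = 1): P_l f = W_l f * W_{l+1} f.
    `details` is a list of same-sized detail arrays, one per scale."""
    return [details[l] * details[l + 1] for l in range(len(details) - 1)]
```

For a 2D image the same product is taken per direction d, matching Equation (13).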
To demonstrate the merits of the multiscale products of the LSWT, the LSWT and the multiscale products of a noisy test image (f_1 = g_1 + δ) are illustrated in Figure 3. Though the LSWT coefficients of the original signal g_1 are immersed in noise at fine scales, they are enhanced in the scale products P_l f. The significant features of g_1 are more distinguishable in P_l f than in W_l f. So we can conclude that the multiscale products of the LSWT can amplify the significant features and dilute noise.
4. The proposed fusion algorithm
A good image fusion algorithm should preserve all the salient features of the source images and introduce as few artifacts or inconsistencies as possible. In addition, the fusion algorithm should be reliable and robust to imperfections such as noise. In this article, we develop a novel multifocus image fusion scheme that incorporates the merits of the interscale dependencies of the LSWT into the image fusion technique. Two adjacent wavelet subbands are multiplied to amplify the significant features and dilute noise. In contrast to conventional MST-based image fusion schemes, we design the fusion rule of the high frequency subbands based on the multiscale products instead of the wavelet coefficients. Our proposed image fusion method is therefore fairly resistant to noise, because the multiscale products can distinguish edge structures from noise effectively.
Apart from the LSWT and multiscale products discussed in the above sections, the fusion rules, namely, the selection principles for the different subband coefficients, are another important component of our proposed fusion method. The remainder of this section is concerned with the design of novel fusion rules for the low frequency subband coefficients and the high frequency subband coefficients. Throughout this study, it is assumed that the images studied have been appropriately preregistered, so that corresponding features coincide pixel to pixel [27]. To simplify the discussion, we assume the fusion process generates a composite image F from a pair of source images denoted by A and B. The general procedure of the proposed LSWT-MSP-based fusion algorithm is illustrated in Figure 4 and implemented as follows:

(1) Decompose the registered source images A and B, respectively, into one low frequency subimage and a series of high frequency subimages via the LSWT.

(2) Select fusion coefficients for the low frequency subimage and each high frequency subimage from A and B according to the fusion rules.

(3) Reconstruct the image from the new fused subimage coefficients by taking the inverse LSWT; the fused image F is thus obtained.
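The three steps above can be summarized in a transform-agnostic skeleton. The interface is hypothetical: `decompose`/`reconstruct` stand in for the forward and inverse LSWT, and `fuse_low`/`fuse_high` stand in for the rules of Sections 4.1 and 4.2.

```python
import numpy as np

def fuse_images(A, B, decompose, fuse_low, fuse_high, reconstruct):
    """Skeleton of the three-step fusion procedure. `decompose`
    returns (low, [highs]) for an image, `reconstruct` inverts it;
    the fusion callables select coefficients band by band."""
    lowA, highsA = decompose(A)                      # step (1)
    lowB, highsB = decompose(B)
    low_f = fuse_low(lowA, lowB)                     # step (2), low band
    highs_f = [fuse_high(ha, hb) for ha, hb in zip(highsA, highsB)]
    return reconstruct(low_f, highs_f)               # step (3)
```

Any per-coefficient selection rule can be plugged in, which is how the comparisons of Section 5 swap fusion rules while keeping the same transform.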
4.1. Selection of lowpass subband coefficients
As the coefficients in the coarsest scale subband represent the approximation component of the source image, the simplest way is to use the conventional averaging method to produce the composite coefficients. However, this will reduce the fused image contrast. To improve the fused image quality, a clarity measure should be defined to determine whether a coefficient of the low frequency subband is in focus or out of focus.
For multifocus image fusion, many typical focus measurements, e.g. variance, energy of image gradient (EOG), spatial frequency (SF), and energy of Laplacian (EOL) of the image, are compared in [28]. They all measure the variation of pixels. Pixels with greater values of these measurements, when the source images are compared with each other, are considered to come from the focused parts. According to [28, 29], the EOL can provide better performance than the SF and EOG for fusion of multifocus images. In this article, we use a new improved energy of image Laplacian (IEOL) as the focus measure to select coefficients from the clear parts of the source images.
The complete original expression of the energy of Laplacian (EOL) of the image f is shown in Equation (14):
where
In Equation (15), f(x, y) is the gray value of the pixel at position (x, y) of image f, and f_xx + f_yy represents the image Laplacian obtained by the operator [−1, −4, −1; −4, 20, −4; −1, −4, −1].
However, the second derivatives in different directions may have different signs, which causes one sign to cancel the other. This phenomenon may occur frequently in textured images. In order to avoid this problem, and to maintain the robustness of the algorithm in the face of adverse effects that may occur in image fusion, we use an improved EOL (IEOL) as the clarity measure to select coefficients from the clear parts of the source images.
The improved sum of Laplacian (ISL) and the improved energy of Laplacian (IEOL) of image f are computed as:
where W_l is a small template that must satisfy the normalization rule Σ_a Σ_b W_l(a, b) = 1. The low frequency subband contains low frequency information; in order to match the information in the LSWT neighborhood of the low frequency subband, the center value of the template and its neighboring values should differ little from each other [30]. In this article, the template size is 3 × 3. In order to highlight the center pixel of the window, a weighted template is used, which is given as:
The IEOL can be used as the clarity measure to determine which coefficient is in focus. Suppose the source images A and B are decomposed using the LSWT. Let f_K^L(x, y) and f_F^L(x, y) denote the low frequency coefficients of the source image K (K = A, B) and of the fused image F, located at (x, y) in the L-th decomposition level, respectively, and let IEOL_K^L(x, y) denote the IEOL measurement of f_K^L(x, y). The proposed IEOL-based fusion rule can be described as follows: f_F^L(x, y) = f_A^L(x, y) if IEOL_A^L(x, y) ≥ IEOL_B^L(x, y), and f_F^L(x, y) = f_B^L(x, y) otherwise.
This means that the coefficient with the maximum IEOL measurement is selected as the coefficient of the fused image when the subbands are compared in the LSWT domain. For simplicity, we name this fusion rule the 'IEOL-max' rule in this article.
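A runnable sketch of the 'IEOL-max' idea follows. Since the exact ISL formula and window weights are not reproduced above, the per-direction absolute second differences and the normalized 3 × 3 window are assumptions chosen to match the stated motivation (no sign cancellation; the center pixel weighted most):

```python
import numpy as np

def conv2same(img, k):
    """Tiny 'same'-size 2-D correlation with edge replication."""
    kh, kw = k.shape
    p = np.pad(img, ((kh // 2, kh // 2), (kw // 2, kw // 2)), mode='edge')
    out = np.zeros(img.shape, dtype=float)
    for i in range(kh):
        for j in range(kw):
            out += k[i, j] * p[i:i + img.shape[0], j:j + img.shape[1]]
    return out

def isl(sub):
    """Improved sum of Laplacian (sketch): absolute second differences
    taken per direction, so that opposite signs cannot cancel."""
    dxx = np.abs(conv2same(sub, np.array([[-1., 2., -1.]])))
    dyy = np.abs(conv2same(sub, np.array([[-1.], [2.], [-1.]])))
    return dxx + dyy

def ieol(sub):
    """Improved energy of Laplacian: ISL squared, averaged over a
    small normalized window W_l (the weights are an assumption)."""
    w = np.array([[1., 2., 1.], [2., 4., 2.], [1., 2., 1.]]) / 16.0
    return conv2same(isl(sub) ** 2, w)

def fuse_low_ieol_max(lowA, lowB):
    """'IEOL-max' rule: each low frequency coefficient comes from the
    subband with the larger IEOL measurement at that position."""
    return np.where(ieol(lowA) >= ieol(lowB), lowA, lowB)
```

As expected of a focus measure, a sharp step edge yields a much larger IEOL response than a blurred copy of the same edge.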
4.2. Selection of bandpass subband coefficients
The coefficients in the high frequency subbands represent the detail components of the source image. In traditional multiresolution fusion algorithms, such as [9, 31, 32], multiresolution coefficients with larger absolute values are considered to correspond to sharp brightness changes or salient features in the source image, such as edges, contours, and region boundaries. Thus, for the high frequency subband coefficients, the most commonly used selection principle is the 'absolute-maximum-choosing' scheme (abbreviated as 'Coef-abs-max'), which takes no account of the lowpass subband coefficients; that is, all the information in the lowpass subband is neglected.
Furthermore, in many practical applications, images are distorted by noise during the acquisition or transmission process, but almost all traditional MST-based image fusion algorithms are designed to transfer the high frequency information from the input images to the fused image. It is worth noting that much of the image noise is also related to the high frequencies and may cause miscalculation of sharpness values. As a result, the fused images obtained by these methods are noisier than the source images, and the performance is degraded. To make up for these deficiencies of traditional MST-based image fusion algorithms, in our proposed method, after decomposing the original images using the LSWT, we design a new image fusion rule based on the multiscale products.
As we know, the HVS is highly sensitive to the local image contrast level. To meet this requirement, Toet and Ruyven developed the local luminance contrast in their research on the CP [17]. It is defined as

C = ΔL / L_B' = (L' − L_B') / L_B',  (19)

where L' denotes the local gray level and L_B' is the local brightness of the background, which corresponds to the low frequency component. Therefore, ΔL can be taken as the high frequency component.
Based on the above idea, many different forms of contrast measurement have been proposed in the MST domain and provide better performance than the 'Coef-abs-max' scheme [18, 19, 25]. However, in those contrast measurements, the value (or absolute value) of a single pixel of the high frequency subimage, namely the coefficient of the high frequency subband when the source image is decomposed by the MST, is used as ΔL. In fact, the value (or absolute value) of a single pixel is of very limited use in determining which pixel is from the clear part of the subimage. So, purely using the value (or absolute value) of a single pixel as the high frequency component is not effective enough. We believe it is more reasonable to employ a feature of the high frequency subband, rather than the value (or absolute value) of a pixel, as ΔL in the contrast measurement in Equation (19).
As a sharpness measure, the ISL, shown in Equation (16), can effectively represent the salient features and sharp boundaries of an image. Pixels with larger values of ISL, when the source images are compared with each other, are more likely to be in focus. That means the ISL can successfully determine which pixel is in focus. Therefore, it is reasonable to utilize the ISL as a feature of the high frequency subband to represent ΔL in the contrast measurement.
Let ISL^{d,l}(x, y) (l = 1, 2,..., L) denote the ISL located at (x, y) in the d-th direction (d = 1, 2, 3) and l-th scale. The feature contrast R^{d,l}(x, y) is defined as

R^{d,l}(x, y) = ISL^{d,l}(x, y) / f^l(x, y),  (20)
where f^l(x, y) denotes the low frequency coefficient located at (x, y) in the l-th scale. In order to improve the robustness of the contrast to the noise of the low frequency subband, the feature contrast can be modified as

R^{d,l}(x, y) = ISL^{d,l}(x, y) / f̄^l(x, y),  (21)

where

f̄^l(x, y) = (1/(m × n)) Σ_a Σ_b f^l(x + a, y + b)  (22)

is the local mean of the low frequency subband over an m × n area centered at (x, y). In Equation (22), the local area size m × n may be 3 × 3 or 5 × 5. In practice, to reduce the computational complexity and the influence of low frequency subband noise, f^l(x, y) can be substituted with the coarsest lowpass subband image f^L(x, y).
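A sketch of this feature contrast: an ISL feature map of a high frequency subband divided by the local mean of the coarsest low frequency subband. The small guard against division by zero is an assumption (the article does not discuss a zero background):

```python
import numpy as np

def local_mean(img, size=3):
    """Mean over a size x size neighbourhood with edge replication,
    playing the role of the local background of Equation (22)."""
    pad = size // 2
    p = np.pad(img, pad, mode='edge')
    out = np.zeros(img.shape, dtype=float)
    for i in range(size):
        for j in range(size):
            out += p[i:i + img.shape[0], j:j + img.shape[1]]
    return out / (size * size)

def feature_contrast(isl_high, low_coarse):
    """Feature contrast sketch: an ISL feature map of a high frequency
    subband over the local mean of the coarsest lowpass subband.
    The 1e-12 guard against division by zero is an assumption."""
    return isl_high / (np.abs(local_mean(low_coarse)) + 1e-12)
```

Over a flat background the contrast reduces to the ISL feature itself, so sharper structures always win the comparison.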
To further conform to the characteristics of the HVS, the feature contrast is improved using the 'local-based' idea; thus, a local neighborhood-based feature contrast is proposed in this article. It can be represented as in Equation (23), where W_h is a template of size 3 × 3. The high frequency subband contains high frequency information; in order to match the information in the LSWT neighborhood of the high frequency subband, the center value of the template and its neighboring values should differ relatively largely from each other [30]. In this article, a weighted template based on the city-block distance is used, which is
In order to make up for the deficiency of the traditional MST-based image fusion algorithms, which cannot restrain the influence of noise, a new image fusion scheme is proposed in this article. In this fusion method we incorporate the merits of interscale dependencies, which can amplify the significant features, dilute noise and distinguish edge structures from noise more effectively, into the multifocus image fusion technique. In contrast to the traditional MST-based fusion methods, we design the fusion rule of the high frequency subbands based on the multiscale products instead of the wavelet coefficients. According to Equation (23), the local feature contrast of the multiscale products can be defined as
where
where PSL^{d,l}(x, y) denotes the ISL of the multiscale products located at (x, y) in the l-th scale and d-th direction, and P^{d,l}f(x, y) and MPS^{d,l}(x, y) are the corresponding multiscale products and feature contrast, respectively.
Therefore, the proposed selection principle for the high frequency subband coefficients can be described as follows: the coefficient of the fused image is taken from source image A if MPS_A^{d,l}(x, y) ≥ MPS_B^{d,l}(x, y), and from source image B otherwise.
The local feature contrast of the multiscale products can not only effectively represent the salient features and sharp boundaries of an image, but also effectively avoid the influence of noise. A large value of the feature contrast means more high frequency information. So the proposed fusion scheme can extract more useful detail information from the source images and inject it into the fused image. For simplicity, we name this fusion rule 'MSP-con-max' in this article.
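The resulting selection is a per-coefficient comparison of the two contrast maps; a minimal sketch:

```python
import numpy as np

def fuse_high_mspcon_max(highA, highB, mspconA, mspconB):
    """'MSP-con-max' rule (sketch): each high frequency coefficient is
    taken from the source image whose local feature contrast of
    multiscale products is larger at that location."""
    return np.where(mspconA >= mspconB, highA, highB)
```

The boolean map `mspconA >= mspconB` corresponds to the binary decision maps visualized in Figure 6.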
5. Experimental results and analysis
To evaluate the performance of the proposed fusion method, several experimental results are presented in this section. Experiments are performed on four sets of 256-level images: clean 'pepsi' (of size 512 × 512), clean 'flower' (of size 384 × 512), clean 'barb' (of size 512 × 512) and noisy 'pepsi' (of size 512 × 512). All of them are registered perfectly and shown in Figure 5a–f,h–j, respectively.
In order to show the advantages of the new image fusion method, we establish three steps to demonstrate that the proposed image fusion method outperforms the others. First, 'MSP-con-max' is compared with 'Coef-abs-max', the 'Traditional-contrast-max' ('Tra-con-max'), and the proposed 'Feature-contrast-max' ('Fea-con-max'), which is designed according to Equation (23), to demonstrate the performance of the 'MSP-con-max' rule. For 'Tra-con-max', the absolute value of a single pixel of the high frequency subband is used as ΔL in the contrast measurement. Second, the proposed image fusion algorithm is compared with a DWT-simple-based method (Method 1), an LSWT-simple-based method (Method 2), and an NSCT-simple-based method (Method 3), in all of which the low frequency subband coefficients and the high frequency subband coefficients are simply merged by the 'averaging' scheme and the 'Coef-abs-max' scheme, respectively. For comparison purposes, the proposed algorithm is also compared with four other fusion algorithms (namely, Methods 4–7). In Methods 4 and 5, the LSWT is used as the MST method, and the 'IEOL-max' fusion rule is employed to merge the low frequency subband coefficients; for fusion of the high frequency subband coefficients, the 'Coef-abs-max' and 'Tra-con-max' fusion rules are used in Methods 4 and 5, respectively. For Method 6, the fusion rules of [7], which are designed based on the features of the multiscale products and a pulse coupled neural network (PCNN) [7], are respectively used to merge the low and high frequency subband LSWT coefficients (we name this method 'LSWT-PCNN'). In this method, the PCNN is a model based on the cat's primary visual cortex; it is characterized by the global coupling and pulse synchronization of neurons and has been proven suitable for image processing [33].
In Method 7, the NSCT is used as the MST method, and our proposed 'IEOL-max' and 'MSP-con-max' rules are, respectively, employed to fuse the low and high frequency subband coefficients (we name it 'NSCT-MSPCon'). The multiscale products of the NSCT can be defined just like Equation (13).
In all of these methods, the 'db5' and 'db53' wavelets, together with a decomposition level of 3, are used in the DWT-based and LSWT-based methods (including Methods 2, 4, 5, 6 and our proposed method), respectively. Three decomposition levels are also used in the NSCT-based methods (including NSCT-simple and NSCT-MSPCon). All of these methods are used to fuse the multifocus clean images. Third, multifocus noisy images, as shown in Figure 5e,f, are fused by the different methods above.
5.1. Contrastbased fusion rule in LSWT domain
In this section, we show the performance of the 'Fea-con-max' and 'MSP-con-max' fusion rules. In order to demonstrate the advantages of the new fusion rule, 'MSP-con-max' and 'Fea-con-max' are compared with 'Tra-con-max' and 'Coef-abs-max' on high frequency subbands in the LSWT domain.
Figure 6a–d shows the high frequency subimages of the labeled regions of Figure 5a,b,e,f in the LSWT domain. One can see that the values of the coefficients in the clear part are greater than those of the blurry part, even when the source image is in a noisy environment. That is why the typical 'Coef-abs-max' rule is used in MST-based fusion algorithms.
Figure 6e–h shows the multiscale products of Figure 6a–d, respectively. From Figure 6g,h, we can see that the multiscale products of the LSWT distinguish edge structures from noise effectively. Figure 6i–l shows the decision maps, in which the coefficients selected from the image in Figure 6b are represented in white, whereas the coefficients from Figure 6a are represented in black. Since the labeled part of Figure 6b is clearer than that of Figure 6a, the optimal decision map should be entirely white, which means all coefficients should be selected from Figure 6b. However, the decision maps of the 'Coef-abs-max' rule and the 'Tra-con-max' rule, shown in Figure 6i,j, indicate that these rules do not select the coefficients from the clear part completely, even though 'Tra-con-max' shows better performance than 'Coef-abs-max'. Figure 6k,l indicates that the proposed feature contrast is more reasonable than the traditional contrast. This also shows that applying a feature such as the ISL in the contrast is more reasonable than using the absolute value of a single pixel.
Figure 6m–p shows the decision maps in which white indicates that coefficients are selected from Figure 6d and black that they are selected from Figure 6c. From these figures we can see that the proposed 'MSP-con-max' rule does well in fusing the multifocus noisy images. All of this demonstrates that the proposed fusion rule can not only select the coefficients of the fused image properly but also restrain the influence of noise effectively.
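The idea behind the multiscale product can be sketched as follows. This is a simplified illustration with hypothetical helper names, using only the magnitude of the product rather than the full 'MSP-con-max' feature contrast of the paper: edges persist across scales, so their product stays large, whereas noise is largely uncorrelated across scales and is suppressed.

```python
import numpy as np

def multiscale_product(details):
    """Point-wise product of high frequency coefficients across adjacent
    scales. Edge responses persist across scales, so their product stays
    large; noise is largely uncorrelated across scales and is suppressed."""
    prod = np.ones_like(details[0], dtype=float)
    for d in details:
        prod = prod * d
    return prod

def decision_map(details_a, details_b):
    """Simplified selection map: True (white) where source B's
    multiscale-product magnitude exceeds source A's."""
    return np.abs(multiscale_product(details_b)) > np.abs(multiscale_product(details_a))
```

For example, an edge that appears at the same position in two adjacent scales of source B yields a large product there, so the decision map selects B's coefficient at that position even when source A contains isolated noise spikes.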
The results of the objective assessment are shown in Figure 7a,b. Figure 7a shows the performance of the different fusion rules in fusing the multifocus clean images, and Figure 7b their performance in fusing the multifocus noisy images. In Figure 7a, 'From a' and 'From b' denote the numbers of pixels that come from Figure 6a,b, respectively. The proposed method is clearly superior to the others because the number of pixels that come from Figure 6b is the largest. As a result, when the source images are noise-free, the fused image is closer to the well-focused source image under our proposed fusion rule than under the 'Coef-abs-max' and 'Tra-con-max' rules. From Figure 7b, the same conclusion can be drawn that the proposed 'MSP-con-max' fusion rule outperforms the traditional fusion rules when the source images are in a noisy environment.
5.2. Fusion of clean multifocus images
In this section, experiments are performed on three pairs of multifocus clean images, shown in Figure 5a–d,h–j. All experiments are implemented in Matlab 7.01 on an AMD Athlon 2.4 GHz machine with 2 GB of RAM. For further comparison, besides visual observation, two objective criteria are used to compare the fusion results. The first is mutual information (MI) [34], defined as the sum of the mutual information between each input image and the fused image. The second is the Q^{AB/F} metric [35], proposed by Xydeas and Petrović, which considers the amount of edge information transferred from the input images to the fused image; it uses a Sobel edge detector to calculate strength and orientation information at each pixel in both the source and the fused images. For both criteria, the larger the value, the better the fusion result.
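The MI criterion can be sketched from the joint gray-level histogram as below. This is a minimal illustration; the bin count and normalization are our own choices and not necessarily those of [34].

```python
import numpy as np

def mutual_information(img1, img2, bins=256):
    """Mutual information (in bits) between two images, estimated from
    their normalized joint gray-level histogram."""
    joint, _, _ = np.histogram2d(img1.ravel(), img2.ravel(), bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of img1
    py = pxy.sum(axis=0, keepdims=True)   # marginal of img2
    nz = pxy > 0                          # avoid log(0)
    return float(np.sum(pxy[nz] * np.log2(pxy[nz] / (px * py)[nz])))

def fusion_mi(src_a, src_b, fused, bins=256):
    """Fusion metric in the spirit of [34]: MI(A, F) + MI(B, F);
    larger means more source information is retained in the fused image."""
    return (mutual_information(src_a, fused, bins)
            + mutual_information(src_b, fused, bins))
```

As a sanity check, the mutual information of an image with itself equals its entropy, so a fused image identical to both sources would maximize the metric.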
The first experiment is performed on the 'pepsi' multifocus clean images, which have been registered perfectly. Figure 8 illustrates the fusion results obtained by the eight methods above (including the proposed method). For a clearer comparison, the difference images between the fused images, obtained with Methods 1–7 and our proposed method, and the source image in Figure 5b are given in Figure 8i–p, and parts of the labeled regions of Figure 8i–p are shown in Figure 8q–x. For the focused regions, the difference between the source image and the fused image should be zero, so fewer residual features in the difference image mean that the fusion method has transferred more information from the source images to the fused image. Focusing on Figure 8q–s, one can clearly see that the fused images obtained by the LSWT method (Method 2) and the NSCT method (Method 3) are clearer than the DWT result. This shows that shift-invariant methods such as LSWT and NSCT can overcome the pseudo-Gibbs phenomena and improve the quality of the fused image around edges. Figure 8t indicates that the proposed fusion rule for the low frequency subband is more reasonable and useful for fusing multifocus clean images than the 'averaging' fusion scheme. From Figure 8u, we find that the 'Tra-con-max' fusion scheme fails to extract all the useful information of the source images and transfer it to the fused image. In contrast, Figure 8v–x shows that the fused images obtained by Methods 6 and 7 and our proposed method have better visual quality: almost all of the useful information of the source images has been transferred to the fused images, and meanwhile fewer artifacts are introduced during the fusion process. All of this demonstrates that the proposed feature contrast is more reasonable and useful than the traditional contrast.
To further evaluate the fusion performance, the second experiment is performed on another set of multifocus clean images, also registered perfectly and shown in Figure 5c,d. The resultant fused images are shown in Figure 9a–h. Again, for clearer comparison, the difference images between the fused images, obtained with Methods 1–7 and our proposed method, and the source image shown in Figure 5d are given in Figure 9i–p, and parts of the labeled regions of Figure 9i–p are extracted into Figure 9q–x. Figure 9q–x indicates that the proposed method and the NSCT-MSPCon-based method extract almost all the well-focused parts of the source images and preserve the detailed information better than the other methods. Moreover, the proposed method provides performance similar to the NSCT-MSPCon-based method, even though the NSCT is more suitable for image fusion. This is because the fusion rules designed in this article are very effective: they extract almost all the useful information of the source images and transfer it to the fused image, whether the fusion is performed in the LSWT domain or in the NSCT domain.
Three source images with different blur regions, shown in Figure 5h–j, are used to evaluate the fusion performance in the third experiment. For better comparison, the difference images between the fused images and the reference image shown in Figure 5g are given in Figure 10i–p, and the labeled parts of Figure 10i–p are extracted and shown in Figure 10q–x. From Figure 10q–x, the same conclusion can be drawn that the proposed method outperforms the other methods.
Furthermore, the values of the objective criteria MI and Q^{AB/F}, together with the execution times, for Figures 8a–h, 9a–h, and 10a–h are listed in Tables 1, 2, and 3, respectively. We observe that the fused images produced by the NSCT-simple-based method are slightly better than the LSWT-simple results, and both outperform the DWT approach in terms of MI and Q^{AB/F}. However, the NSCT is time consuming, which impedes its real-time application. As a modified version of the LWT, the LSWT consumes more time than the DWT because it possesses shift-invariance and must process more image data during the fusion process.
From Tables 1, 2, and 3, we find that the NSCT-MSPCon-based method provides performance similar to our proposed method. However, NSCT-MSPCon is more time consuming than our proposed method because of its multidirectional decomposition and higher complexity. Considering both the fusion results and the computational complexity, we utilize the LSWT as the MST method in our proposed algorithm. The LSWT-PCNN-based method is also more time consuming than the proposed method, because the PCNN neuron is very complex and requires iterative operation to obtain pleasing fusion results. Moreover, each neuron has a large number of parameters that need to be adjusted, and they strongly affect one another. In image processing with PCNN, the same values are usually assigned to the corresponding parameters of every neuron, chosen by experiment or experience; in the human visual system, however, it is implausible that all neurons share the same parameter values, which should instead depend on the situation of each neuron cell. All of these disadvantages significantly compromise the performance of LSWT-PCNN. Compared with LSWT-PCNN, our proposed method not only exploits the property of the HVS of being highly sensitive to the local image contrast level, but also offers advantages such as simple calculation and high efficiency. Hence our proposed method provides better performance than LSWT-PCNN.
The values in Tables 1, 2, and 3 demonstrate that the proposed image fusion algorithm significantly outperforms the other approaches (except the NSCT-MSPCon-based method) in terms of MI and Q^{AB/F}. Moreover, since the reference (everywhere-in-focus) image for Figure 5h–j, shown in Figure 5g, is available, the different methods can also be compared using the root mean square error (RMSE). The values of RMSE between Figure 10a–h and Figure 5g are given in Table 3. From Table 3, we find that the RMSE results coincide very well with the MI and Q^{AB/F} results.
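When an everywhere-in-focus reference is available, the RMSE criterion is straightforward; a minimal sketch:

```python
import numpy as np

def rmse(reference, fused):
    """Root mean square error between the all-in-focus reference image
    and a fused result; lower means the fused image is closer to the
    reference."""
    diff = reference.astype(float) - fused.astype(float)
    return float(np.sqrt(np.mean(diff ** 2)))
```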
5.3. Fusion of noisy multifocus images
To evaluate the performance of the proposed method in a noisy environment, the input multifocus images 'pepsi', shown in Figure 5e,f, have been additionally corrupted by Gaussian noise with deviation δ = 0.01.
In the following experiments, since reference (everywhere-in-focus) images of the scenes under analysis are not available, the performance of the proposed method cannot be compared using RMSE-based metrics for this kind of image. Therefore, fusion performance measures that do not require an ideal image have to be employed. For comparison, besides visual observation, the objective criteria MI and Q^{AB/F} are used to evaluate how much information of the multifocus clean images, shown in Figure 5a,b, is contained in the fused images. However, MI and Q^{AB/F} cannot evaluate these fusion methods in terms of input/output noise transmission. For further comparison, the improvement in terms of peak signal to noise ratio (PSNR), proposed by Loza et al. [6], is adopted to measure the noise change between the fused image and the noisy source image. Let {\sigma}_{n}^{2} denote the noise variance in the source images and {\sigma}_{n,f}^{2} the noise variance in the fused output; the improvement in terms of PSNR is then formulated as \Delta \mathrm{PSNR} = 10{\log}_{10}\left({\sigma}_{n}^{2}/{\sigma}_{n,f}^{2}\right).
For the ΔPSNR criterion, the larger the value, the less noise the fused image inherits from the original noisy images, and the better the fusion result.
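An improvement-in-PSNR measure of this kind reduces to a noise-variance ratio in decibels. The sketch below is our simplified reading of [6]: the wavelet-domain noise-variance estimation step is omitted, and the variances are assumed to be given.

```python
import math

def delta_psnr(sigma2_src, sigma2_fused):
    """Improvement in PSNR (dB): ratio of the noise variance in the
    noisy source images to the noise variance remaining in the fused
    output. Larger values mean less source noise reaches the fused image."""
    return 10.0 * math.log10(sigma2_src / sigma2_fused)
```

For instance, a fusion method that reduces the noise variance by a factor of ten achieves a ΔPSNR of 10 dB, while a method that passes the noise through unchanged scores 0 dB.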
Figure 11a–h illustrates the fusion results obtained by the different methods above, and Figure 11i–p shows parts of these results. Looking at Figure 11i–l, one can see that the edge information of the fused images obtained with Methods 1–4 is immersed in noise. This is because all these fusion methods are designed to transfer high frequency information from the input images into the fused image, and it is worth noting that much of the image noise also lies in the high frequencies. As a result, the fused images obtained by these methods are noisier than the source images. From Figure 11m, we can see that the edges in Figure 11m are less clear than those in Figure 11n–p, because the noise in the source images causes miscalculation of the contrast values. Therefore, in the presence of noise, the performance of Methods 1–5 may not be as good as in noiseless environments. Figure 11n indicates that Method 6 can reduce the noise level to some extent, but the edge information of its fused image is less clear than in Figure 11o,p, which are fused by Method 7 and our proposed algorithm.
Furthermore, Table 4 gives the quantitative results for Figure 11. From Table 4, we observe that the different fusion methods provide different performance and that the proposed scheme outperforms the other seven image fusion algorithms in terms of larger MI and Q^{AB/F} values. The values of ΔPSNR indicate that the proposed fusion rule for the high frequency subbands is more reliable, robust, and stable than the other fusion rules.
6. Conclusion
In this article, a new multifocus image fusion algorithm based on the feature contrast of multiscale products in the LSWT domain is proposed. In the proposed algorithm, a novel feature contrast of multiscale products, which represents edge features in the high frequency subimages in the LSWT domain, is developed and used as the fusion scheme for the high frequency subbands. Three pairs of clean multifocus images and one pair of noisy multifocus images are used to test the performance of the proposed method. The experimental results demonstrate that the proposed method outperforms the DWT-simple-based, LSWT-simple-based, LSWT-TraditionalContrast-based, LSWT-PCNN-based, and NSCT-simple-based methods in terms of both visual quality and objective evaluation, even when the source images are in a noisy environment. In the future, we will further study the fusion of noisy images, aiming to carry out denoising and fusion of noisy source images simultaneously, which is an emerging direction in the image fusion field.
References
Seales WB, Dutta S: Everywhere-in-focus image fusion using controllable cameras. Proc SPIE 1996, 2905:227–234.
Li ST, Yang B: Multifocus image fusion using region segmentation and spatial frequency. Image Vision Comput 2008, 26(7):971–979. 10.1016/j.imavis.2007.10.012
Wang ZB, Ma YD, Gu J: Multi-focus image fusion using PCNN. Pattern Recogn 2010, 43(6):2003–2016. 10.1016/j.patcog.2010.01.011
Li ST, Kwok JT, Wang YN: Multifocus image fusion using artificial neural networks. Pattern Recogn Lett 2002, 23(8):985–997. 10.1016/S0167-8655(02)00029-6
Pajares G, de la Cruz JM: A wavelet-based image fusion tutorial. Pattern Recogn 2004, 37(9):1855–1872. 10.1016/j.patcog.2004.03.010
Loza A, Bull D, Canagarajah N, Achim A: Non-Gaussian model-based fusion of noisy images in the wavelet domain. Comput Vis Image Understand 2010, 114(1):54–65. 10.1016/j.cviu.2009.09.002
Chai Y, Li HF, Guo MY: Multifocus image fusion scheme based on features of multiscale products and PCNN in lifting stationary wavelet domain. Opt Commun 2011, 284(5):1146–1158.
Petrovic VS, Xydeas CS: Gradient-based multiresolution image fusion. IEEE Trans Image Process 2004, 13(2):228–237. 10.1109/TIP.2004.823821
Li H, Manjunath BS, Mitra SK: Multisensor image fusion using the wavelet transform. Graph Models Image Process 1995, 57(3):235–245. 10.1006/gmip.1995.1022
Qu XB, Yan JW, Xiao HZ: Image fusion algorithm based on spatial frequency-motivated pulse coupled neural networks in nonsubsampled contourlet transform domain. Acta Automatica Sinica 2008, 34(12):1508–1514.
Chai Y, Li HF, Qu JF: Multifocus image fusion using a novel dual-channel PCNN in lifting stationary wavelet transform. Opt Commun 2010, 283(19):3591–3602. 10.1016/j.optcom.2010.04.100
Wilson T, Rogers S, Kabrisky M: Perceptual-based hyperspectral image fusion using multispectral analysis. Opt Eng 1995, 34(11):3154–3164. 10.1117/12.213617
Sweldens W: The lifting scheme: a construction of second generation wavelets. SIAM J Math Anal 1998, 29(2):511–546. 10.1137/S0036141095289051
Coifman RR, Donoho DL: Translation-invariant de-noising. In Wavelets and Statistics. Edited by: Antoniadis A, Oppenheim G. Springer-Verlag, New York; 1995:125–150.
Lee CS, Lee CK, Yoo KY: New lifting based structure for undecimated wavelet transform. Electron Lett 2000, 36(22):1894–1895. 10.1049/el:20001294
da Cunha AL, Zhou JP, Do MN: The nonsubsampled contourlet transform: theory, design and application. IEEE Trans Image Process 2006, 15(10):3089–3101.
Toet A, van Ruyven LJ, Valeton JM: Merging thermal and visual images by a contrast pyramid. Opt Eng 1989, 28(7):789–792.
Yang L, Guo BL, Ni W: Multimodality medical image fusion based on multiscale geometric analysis of contourlet transform. Neurocomputing 2008, 72(1–3):203–211. 10.1016/j.neucom.2008.02.025
Zhang Q, Guo BL: Fusion of multisensor images based on the nonsubsampled contourlet transform. Acta Automatica Sinica 2008, 34(2):135–141.
Bao P, Zhang L: Noise reduction for magnetic resonance images via adaptive multiscale products thresholding. IEEE Trans Med Imag 2003, 22(9):1089–1099. 10.1109/TMI.2003.816958
Xu Y, Weaver JB, Healy DM Jr, Lu J: Wavelet transform domain filters: a spatially selective noise filtration technique. IEEE Trans Image Process 1994, 3(6):747–758. 10.1109/83.336245
Sweldens W: The lifting scheme: a custom-design construction of biorthogonal wavelets. Appl Comput Harmonic Anal 1996, 3(2):186–200. 10.1006/acha.1996.0015
Claypoole RL, Davis GM, Sweldens W, Baraniuk R: Nonlinear wavelet transforms for image coding via lifting. IEEE Trans Image Process 2003, 12(12):1449–1459. 10.1109/TIP.2003.817237
Stepien J, Zielinski T, Rumian R: Image denoising using scale-adaptive lifting schemes. In Proceedings of the International Conference on Image Processing. Volume 3. Vancouver, BC, Canada; 2000:288–290.
Zhang Q, Guo BL: Multifocus image fusion using the nonsubsampled contourlet transform. Signal Process 2009, 89(7):1334–1346. 10.1016/j.sigpro.2009.01.012
Sadler BM, Swami A: Analysis of multiscale products for step detection and estimation. IEEE Trans Inf Theory 1999, 45(3):1043–1051.
Maintz JB, Viergever MA: A survey of medical image registration. Med Image Anal 1998, 2(1):1–36.
Wei H, Jing ZL: Evaluation of focus measures in multi-focus image fusion. Pattern Recogn Lett 2007, 28(4):493–500. 10.1016/j.patrec.2006.09.005
Wei H, Jing ZL: Multi-focus image fusion using pulse coupled neural network. Pattern Recogn Lett 2007, 28(9):1123–1132. 10.1016/j.patrec.2007.01.013
Song YJ, Ni GQ, Gao K: Regional energy weighting image fusion algorithm by wavelet based contourlet transform. Trans Beijing Inst Technol 2008, 28(2):168–172.
Li ST, Yang B: Multifocus image fusion by combining curvelet and wavelet transform. Pattern Recogn Lett 2008, 29(9):1295–1301. 10.1016/j.patrec.2008.02.002
Li ST, Yang B, Hu JW: Performance comparison of different multi-resolution transforms for image fusion. Inf Fusion 2011, 12(2):74–84. 10.1016/j.inffus.2010.03.002
Johnson JL, Padgett ML: PCNN models and applications. IEEE Trans Neural Netw 1999, 10(3):480–498. 10.1109/72.761706
Qu G, Zhang D, Yan P: Information measure for performance of image fusion. Electron Lett 2001, 38(7):313–315.
Petrovic V, Xydeas C: On the effects of sensor noise in pixel-level image fusion performance. In Proceedings of the Third International Conference on Information Fusion. Volume 2. Paris, France; 2000:14–19.
Acknowledgements
The authors would like to thank the associate editor and the anonymous reviewers for their careful study and valuable suggestions for an earlier version of this article. The article was jointly supported by the National Natural Science Foundation of China (No. 60974090), the Ph.D. Programs Foundation of Ministry of Education of China (No. 200806110016), and the Fundamental Research Funds for the Central Universities (No. CDJXS10172205).
Competing interests
The authors declare that they have no competing interests.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License ( https://creativecommons.org/licenses/by/2.0 ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Cite this article
Li, H., Wei, S. & Chai, Y. Multifocus image fusion scheme based on feature contrast in the lifting stationary wavelet domain. EURASIP J. Adv. Signal Process. 2012, 39 (2012). https://doi.org/10.1186/16876180201239