Video resampling algorithm for simultaneous deinterlacing and image upscaling with reduced jagged edge artifacts

Yoo, Du Sic; Chang, Joonyoung; Park, Chul Hee; Kang, Moon Gi

doi:10.1186/1687-6180-2013-188

Research
Open access
Published: 20 December 2013

Du Sic Yoo¹,
Joonyoung Chang¹,
Chul Hee Park¹ &
…
Moon Gi Kang¹

EURASIP Journal on Advances in Signal Processing volume 2013, Article number: 188 (2013)

Abstract

In this paper, we propose a video resampling method for simultaneous deinterlacing and image upscaling. The proposed method is composed of two steps: the initial image magnification step and the edge enhancement step. In order to convert an interlaced image into a display format image, a filtering strategy, which resizes images with arbitrary ratios and reduces the overall computational load, is performed region adaptively using local characteristics such as motion or motionless regions. After the initial step, the proposed jagged edge correction (JEC) method is applied to the initially upscaled images to correct the stair-like artifacts (jagged edges) which are caused by ignoring any edge information in diagonal edge regions during the linear filtering process. Moreover, this method can be very useful for various upscaling applications to improve edge quality since it can be used in combination with other common interpolation techniques, such as cubic spline techniques. Experimental results show that the proposed method substantially reduces the jagged edges of the converted images and provides steep and natural-looking edge transitions.

1 Introduction

Over the last several decades, image resolution has rapidly increased, and advanced display devices such as high-definition television (HDTV) have been developed to keep pace with these changes. Image resolution is still increasing, and some companies have already introduced next-generation broadcasting systems such as Super Hi-Vision and Ultra High Definition TV systems. However, not all commercial media content can be provided in high resolution. There is a large amount of low-resolution video content, and many consumers want to view this content in full-screen mode on their high-resolution display devices. Therefore, deinterlacing (or interlaced-to-progressive conversion (IPC)) and image upscaling are required to convert incoming low-resolution interlaced images into high-resolution progressive ones. Deinterlacing solves the format conversion problem, and image upscaling solves the spatial conversion problem. Together, these procedures are called video-to-display format conversion (VDFC) [1] in this paper.

Deinterlacing and image upscaling have been studied for decades. These methods can be roughly classified into linear filtering interpolation (LFI) and edge directional interpolation (EDI). Among the deinterlacing methods, LFI approaches [2–7] can be categorized into spatial (intra-field), temporal (inter-field), and spatio-temporal methods, according to the field information used. In order to obtain progressive images, missing pixels are reconstructed using linear filters that exploit the spatial correlations, the temporal correlations, or both in interlaced video sequences. In particular, some algorithms [5–7] estimate missing pixel values by interpolating the pixels along motion trajectories, because temporal correlation depends on motion information. Among the image upscaling methods, LFI approaches [8–12] design a particular interpolation kernel, which can be applied to the entire image. Notably, these methods can resize images with arbitrary ratios, which is one of the preferred features for image upscaling applications. LFI methods for both deinterlacing and upscaling are as old as image processing, and they are still popular because of their simple implementation. However, LFI approaches usually produce jagged edge artifacts (stair-like artifacts) in diagonal edge regions because they do not consider any edge information during the resampling process.

On the other hand, EDI techniques calculate the directional correlation between neighboring pixels to perform interpolation along estimated edge directions. These EDI approaches have been widely used for deinterlacing [13, 14] and upscaling [15–17]. In particular, edge-dependent deinterlacing algorithms such as edge-based line average are the most popular among the intra-field deinterlacing methods. Many sophisticated techniques for both deinterlacing and upscaling have been proposed to increase the accuracy of the estimated edge directions because EDI approaches can improve visual quality around dominant edges when the estimated edge direction is correct. However, high computational loads are required to estimate the directions of the various diagonal edges, and thus, upscaling applications become more complicated than deinterlacing applications. Furthermore, for arbitrary ratio enlargement, the EDI process generally becomes more complicated since edge direction estimation and interpolation are performed within an asymmetric interpolation lattice. Therefore, upscaling EDI methods usually restrict the scaling ratio in order to simplify the interpolation process, e.g., images are enlarged by a factor of two in both the horizontal and vertical directions. With fixed scaling ratios, some researchers [15–17] estimate the covariance of high-resolution images by exploiting the covariance of low-resolution images. These methods can substantially improve the image quality by preserving the spatial coherence of the upscaled images. They also provide natural-looking images by efficiently connecting the disconnected edges. However, these algorithms may introduce some false edges or textures due to over-fitting. They can also connect edges erroneously when the covariance is estimated incorrectly, because spatial correlations generally change after downsampling.

In consumer electronic devices such as HDTV systems, the VDFC technique is used to convert video sources into the display resolution format, as shown in Figure 1. As shown in Figure 1b, deinterlacing doubles the vertical resolution of an interlaced image, while image upscaling resizes an image with arbitrary ratios. Therefore, image upscaling can be considered a generalized version of deinterlacing. Moreover, the VDFC technique is generally based on either LFI or EDI approaches. However, LFI approaches tend to suffer from jagged artifacts along the diagonal edges despite offering the competitive advantages of low complexity and arbitrary ratio interpolation. As large display devices with high resolution become more popular, these visible artifacts become more prominent. Although EDI approaches perform well in edge regions, they require high complexity in order to preserve the spatial coherence of the upscaled images. Also, these methods may require additional LFI methods for achieving arbitrary ratio enlargement after performing the EDI process. Therefore, an efficient VDFC technique, which offers a low computational load and provides high edge quality, is required.

In this paper, we propose a video resampling method that takes advantage of both the LFI and EDI methods: the capabilities of converting the image format, resizing the images with arbitrary ratios, and improving edge quality. The proposed method consists of two steps: initial image magnification and edge enhancement. In the first step, linear filtering strategies are applied adaptively according to local region characteristics, such as motion or motionless regions, in order to convert interlaced images into upscaled progressive ones. The performance of LFI depends on the interpolation kernel used, and an interpolation kernel closer to the ideal sinc function achieves better reconstruction performance. For this reason, conventional kernels are obtained by truncating the ideal sinc function. However, the side lobes (or ripples) of the truncated kernel produce severe ringing artifacts within the interpolated images, and thus, the kernel size is usually kept small in order to limit these artifacts. To cope with this problem, we use a ringing reduction technique that suppresses ringing artifacts while reconstructing the high-frequency components. The linear filter used is based on the Lanczos function, and a wider window is applied to the Lanczos function in order to achieve better reconstruction performance. In the second step, the proposed jagged edge correction (JEC) method is performed to improve the edge quality of the initially upscaled images. More specifically, the JEC method uses an adaptive smoothing kernel obtained from the gradient covariance in order to correct the jagged edge artifacts within the initially upscaled images. This kernel, called an ellipsoidal kernel, assigns large weights along the local edge direction in order to smooth the jagged edge artifacts along the edge direction. In conventional EDI methods, the asymmetric interpolation lattice makes the interpolation procedure complex. However, the complexity of the JEC method is lower than that of conventional EDI because the JEC procedure is applied to the symmetric lattice of the upscaled images. Moreover, enlarging the images results in blurring effects within the initially upscaled images, and these effects are perceived more conspicuously in regions with jagged edge artifacts. In order to solve this problem, a transient improvement (TI) technique is performed during the filtering process along the edge direction to improve the sharpness of the initially upscaled images. The JEC method is the core technique of the proposed video resampling system, and it can be used as a postprocessor for many LFI methods, such as cubic spline interpolation. As an independent module, it can be applied flexibly to various applications.

The organization of the rest of this paper is as follows: In Section 2, previous works on Lanczos interpolation and related issues are discussed. In Section 3, the proposed video resampling method is described in detail. First, the overall structure of the proposed method is explained. In Section 3.1, an image magnification step based on the Lanczos function is presented, and the proposed ringing artifact reduction technique is described to improve interpolation performance. Subsequently, the proposed JEC method is explained to improve the edge quality of the initially interpolated images. In Section 4, the experimental results of various test images are presented, and comparisons with other algorithms are made. Finally, conclusions are presented in Section 5.

2 Previous works on Lanczos interpolation

The normalized sinc function has been accepted as an ideal interpolation function that perfectly passes low frequencies and perfectly cuts high frequencies. However, this ideal interpolator has an infinite impulse response in the time and spatial domains, so the sinc function has to be truncated or windowed in order to fulfill the requirements of a hardware-implementable interpolator. A simply truncated interpolation kernel, however, produces severe ringing effects within interpolated images. Therefore, various windowing kernels such as Hann, Hamming, Cosine, Lanczos, Blackman, and Kaiser have been proposed to reduce these ringing effects. According to spectral analysis [12], these windows display different spectral characteristics, and some tradeoffs occur when the window function is chosen. Thus, the choice of the windowing function is crucial, and it depends strongly on the target application. In Figure 2, we compare the sinc kernels truncated by various 6-tap windowing kernels. As shown in Figure 2a, each of the truncated sinc kernels has negative coefficients, which come from the side lobes of the sinc function. These negative coefficients are used to produce steep and sharp edge transitions at step edges and to recover the image details in texture regions. Therefore, larger negative coefficients achieve better reconstruction performance in the edge regions. According to Figure 2b, the Lanczos windowing kernel has the largest-magnitude negative coefficients among the various windowing kernels. Thus, the Lanczos windowing kernel is preferred in order to improve the reconstruction performance of the high-frequency components.

The impulse response of the Lanczos interpolator is the normalized sinc function windowed by the Lanczos window, and the Lanczos window is the central lobe of a sinc function scaled to a certain extent. In one dimension, the Lanczos function can be obtained as

h_{Lz} (x, s) = \begin{cases} \frac{\sin (π x)}{π x} \cdot \frac{\sin (π x / s)}{π x / s}, & 0 < | x | < s \\ 1, & x = 0 \\ 0, & \text{otherwise}, \end{cases}

(1)

where h_Lz(x,s) represents the Lanczos kernel and s is a positive integer (typically 2 or 3) that controls the size of the Lanczos kernel. According to Figure 3, the Lanczos interpolator can achieve better reconstruction performance using wider windows (when s is set to 4, 8, or 12) since the interpolation kernels become more similar to the ideal sinc function. However, as the window size increases, more side lobes of the ideal sinc function influence the interpolation kernels, and they cause unwanted artifacts such as ringing (or shooting) artifacts in flat regions, especially when high-frequency edges exist within the range of the wide window. From the above discussion, it is clear that the side lobes of an interpolation filter can improve reconstruction performance in edge regions, but they can also degrade the image quality because of the ringing artifacts.
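
As an illustration only, the kernel in (1) can be sketched in a few lines of Python. The function name `lanczos_kernel` and the example offset of 0.5 are hypothetical choices made for this sketch and do not come from the original implementation.

```python
import math


def lanczos_kernel(x: float, s: int) -> float:
    """Evaluate the Lanczos kernel h_Lz(x, s) of Eq. (1)."""
    if x == 0.0:
        return 1.0            # central tap
    if abs(x) >= s:
        return 0.0            # outside the window
    px = math.pi * x
    # sinc(x) multiplied by the central lobe of the stretched sinc(x / s)
    return (math.sin(px) / px) * (math.sin(px / s) / (px / s))


# Six taps (s = 3) seen by a sample located halfway between two input pixels.
taps = [lanczos_kernel(0.5 - l, 3) for l in range(-2, 4)]
```

The taps at distances of 1.5 from the interpolated position are negative, which corresponds to the side-lobe coefficients discussed above.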

3 Proposed video resampling method with jagged edge correction

Figure 4 illustrates the overall structure of the proposed resampling method. As illustrated in Figure 4, the proposed method requires two steps to perform video resampling: the initial image magnification step and the edge enhancement step. In the first step, the image magnification process for deinterlacing and upscaling is performed separably, by applying one-dimensional interpolation in the horizontal and vertical directions separately. The interpolation kernels for both directions are based on the Lanczos kernel with ringing artifact reduction. In the vertical interpolation process in particular, temporal information is used, and the values of the modified spatio-temporal interpolation and the modified Lanczos interpolation are mixed according to the motion detection process. In the second step, the proposed JEC method is applied to the initially upscaled images to improve the edge quality of the images. During the JEC process, a TI technique is also used to improve the sharpness of the upscaled images.

3.1 Initial image magnification step

3.1.1 Image magnification process with Lanczos function

The goal of the proposed resampling method is to convert an interlaced field to a scaled progressive frame. In order to achieve this goal, the image magnification process performs one-dimensional interpolation first in the horizontal direction and then again in the vertical direction. Based on the Lanczos function, the missing pixel value at an arbitrary position (Δ_i,Δ_j) is reconstructed by

\begin{align} {\hat{f}}_{Lz}^{if} (i + k, j + Δ_{j}, n) = \sum_{l = - a + 1}^{a} h_{Lz} (Δ_{j} - l, a) \cdot f^{if} (i + k, j + l, n), \end{align}

(2)

\begin{align} {\hat{f}}^{pf} (i + Δ_{i}, j + Δ_{j}, n) = α \cdot {\hat{f}}_{Lz}^{pf} (i + Δ_{i}, j + Δ_{j}, n) + (1 - α) \cdot {\hat{f}}_{ST}^{pf} (i + Δ_{i}, j + Δ_{j}, n), \end{align}

(3)

where the superscripts if and pf denote the interlaced image and the progressive image, respectively, and the subscripts Lz and ST represent the results of Lanczos interpolation and spatio-temporal interpolation, respectively. f^if and ${\hat{f}}^{pf}$ represent the input interlaced image and the output upscaled progressive image, respectively. ${\hat{f}}_{Lz}^{if}$ denotes the horizontally upscaled interlaced image, and ${\hat{f}}_{Lz}^{pf}$ and ${\hat{f}}_{ST}^{pf}$ represent the upscaled progressive images obtained from Lanczos and spatio-temporal interpolations, respectively. (i,j) and n represent the spatial indices and the temporal index of the input interlaced image, respectively. a represents the horizontal size of the Lanczos kernel. Δ_i and Δ_j (0 ≤ Δ_i,Δ_j < 1) represent the arbitrary positions to be interpolated in the vertical and horizontal directions, respectively. Since the pixels of the top and bottom fields are positioned alternately in the vertical direction, the relative vertical positions of the current, previous, and next fields are determined by the temporal index. Given the input interlaced image f^if, Δ_i and Δ_j are represented as

\begin{align} Δ_{i} &= \begin{cases} i \cdot {sr}_{v} - ⌊ i \cdot {sr}_{v} ⌋ - (n \bmod 2) / 2, & \text{for fields } n \text{ and } n \pm 2 \\ i \cdot {sr}_{v} - ⌊ i \cdot {sr}_{v} ⌋, & \text{for fields } n \pm 1 \end{cases} \\ Δ_{j} &= j \cdot {sr}_{h} - ⌊ j \cdot {sr}_{h} ⌋, \end{align}

(4)

where sr_h and sr_v denote the horizontal and vertical scaling ratios, respectively, and ⌊·⌋ represents the floor operator, which returns the largest integer less than or equal to the given real number. α in (3) is a weight that controls the contributions of the two values, ${\hat{f}}_{Lz}^{pf}$ and ${\hat{f}}_{ST}^{pf}$. In the image magnification process, the horizontal scaling process is first performed using (2), and then the final results of the vertical interpolation are obtained by fusing Lanczos interpolation and spatio-temporal interpolation according to the motion detection process.
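
The horizontal pass in (2) can be sketched as follows. This is a minimal illustration under stated assumptions rather than the implementation used in the paper: the names `interpolate_row` and `resample_row` are hypothetical, borders are handled by replicating the edge pixel, and the position of each output sample is taken as the output index times the number of input samples per output sample, which is one possible reading of the scaling ratio in (4).

```python
import math


def lanczos_kernel(x, s):
    """Lanczos kernel h_Lz(x, s) of Eq. (1)."""
    if x == 0.0:
        return 1.0
    if abs(x) >= s:
        return 0.0
    px = math.pi * x
    return (math.sin(px) / px) * (math.sin(px / s) / (px / s))


def interpolate_row(row, pos, a=3):
    """Evaluate Eq. (2) on one row at the real-valued position pos = j + delta_j."""
    j = math.floor(pos)          # integer pixel index
    delta = pos - j              # sub-pixel offset, 0 <= delta < 1
    acc = 0.0
    for l in range(-a + 1, a + 1):
        idx = min(max(j + l, 0), len(row) - 1)   # border replication (assumption)
        acc += lanczos_kernel(delta - l, a) * row[idx]
    return acc


def resample_row(row, out_len, a=3):
    """Resize one row to out_len samples with an arbitrary ratio."""
    step = len(row) / out_len    # input samples per output sample (assumed mapping)
    return [interpolate_row(row, m * step, a) for m in range(out_len)]


row = [10.0, 20.0, 30.0, 40.0, 50.0]
doubled = resample_row(row, 10)
```

Because the kernel equals 1 at the origin and vanishes at the other integers, output samples that coincide with input positions reproduce the input values up to rounding error.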

In the vertical interpolation process of (3), ${\hat{f}}_{Lz}^{pf}$ and ${\hat{f}}_{ST}^{pf}$ mainly contribute to ${\hat{f}}^{pf}$ in the motionless and motion areas, respectively. These values are calculated using previous and next field information because the original pixel information at the missing position is obtained from the previous and next fields due to the inherent nature of the interlaced format. First, the ${\hat{f}}_{Lz}^{pf}$ value for the motionless areas is computed using the field average and a Lanczos kernel. That is,

\begin{align} {\hat{f}}_{Lz}^{pf} (i + Δ_{i}, j + Δ_{j}, n) = \sum_{k = - b + 1}^{b} h_{Lz} (Δ_{i}^{'} - k, b) \cdot \bigl( {\hat{f}}_{Lz}^{if} (i + k, j + Δ_{j}, n) \cdot δ (⌊ 2 \cdot Δ_{i} ⌋) + {\hat{f}}_{FA}^{if} (i + k, j + Δ_{j}, n) \cdot δ (1 - ⌊ 2 \cdot Δ_{i} ⌋) \bigr), \end{align}

(5)

where

\begin{align} {\hat{f}}_{FA}^{if} (i + k, j + Δ_{j}, n) = \bigl( {\hat{f}}_{Lz}^{if} (i + k, j + Δ_{j}, n - 1) + {\hat{f}}_{Lz}^{if} (i + k, j + Δ_{j}, n + 1) \bigr) / 2, \end{align}

(6)

where ${\hat{f}}_{FA}^{if}$ represents the result of the field average. b represents the vertical size of the Lanczos kernel, and δ denotes the delta function, used here as a discrete unit impulse that equals 1 when its argument is 0 and 0 otherwise, so that it selects one of the two terms in (5). Since the ${\hat{f}}_{FA}^{if}$ value is at the 0.5 position in the vertical direction, Δ_i for the sub-pixel position of the Lanczos kernel is modified:

\begin{align} Δ_{i}^{'} = 2 \cdot Δ_{i} - ⌊ 2 \cdot Δ_{i} ⌋ . \end{align}

(7)

The value of spatio-temporal interpolation, ${\hat{f}}_{ST}^{pf}$ , is reconstructed with spatial and temporal filters. In order to improve the performance and to obtain the interpolated value at an arbitrary position, the spatial filter is defined as the Lanczos kernel. The temporal filter is determined to estimate the high-frequency components of the previous and next fields. Therefore, the ${\hat{f}}_{ST}^{pf}$ value provides enhanced results using additional high-frequency components of temporal information. The ${\hat{f}}_{ST}^{pf}$ is calculated by

\begin{align} {\hat{f}}_{ST}^{pf} (i + Δ_{i}, j + Δ_{j}, n) = \sum_{k = - b + 1}^{b} h_{Lz} (Δ_{i} - k, b) \cdot {\hat{f}}_{Lz}^{if} (i + k, j + Δ_{j}, n) + THF, \end{align}

(8)

where

\begin{align} THF = \begin{cases} {THF}_{i - 1} \cdot (0.5 - Δ_{i}) + {THF}_{i} \cdot (0.5 + Δ_{i}), & \text{top fields and } Δ_{i} \leq 0.5 \\ {THF}_{i} \cdot (1.5 - Δ_{i}) + {THF}_{i + 1} \cdot (Δ_{i} - 0.5), & \text{top fields and } Δ_{i} > 0.5 \\ {THF}_{i} \cdot (0.5 + Δ_{i}) + {THF}_{i + 1} \cdot (0.5 - Δ_{i}), & \text{bottom fields and } Δ_{i} \leq 0.5 \\ {THF}_{i + 1} \cdot (Δ_{i} - 0.5) + {THF}_{i} \cdot (1.5 - Δ_{i}), & \text{bottom fields and } Δ_{i} > 0.5 \end{cases}, \end{align}

(9)

where THF represents the high-frequency components of temporal information. Since the THF value is estimated only at the existing pixel positions in the previous and next fields, a linear combination of the THFs at the neighboring pixels is used to obtain THF at an arbitrary position. Depending on whether the current field is a top or a bottom field, the THF used is determined as shown in (9). THF_i is defined as

\begin{align} {THF}_{i} = \sum_{k = - c}^{c} h_{TF} (c + k) \cdot \bigl( {\hat{f}}_{Lz}^{if} (i + k, j + Δ_{j}, n - 1) + {\hat{f}}_{Lz}^{if} (i + k, j + Δ_{j}, n + 1) \bigr) / 2, \end{align}

(10)

where

\begin{align} h_{TF} = [- 0.25, \; 0.5, \; - 0.25] . \end{align}

(11)

Figure 5 shows an example of obtaining the high-frequency component of temporal information.
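
A minimal sketch of (10) and (11) is given below for one image column. The function name `temporal_high_freq`, the list-based column representation, and the border replication are assumptions made for illustration; with the three-tap filter in (11), c equals 1.

```python
H_TF = (-0.25, 0.5, -0.25)   # vertical high-pass filter h_TF of Eq. (11)


def temporal_high_freq(prev_col, next_col, i):
    """Compute THF_i of Eq. (10) from one column of the previous and next fields.

    prev_col and next_col hold the horizontally interpolated values of
    fields n - 1 and n + 1 at the same horizontal position.
    """
    c = len(H_TF) // 2
    last = len(prev_col) - 1
    total = 0.0
    for k in range(-c, c + 1):
        r = min(max(i + k, 0), last)                 # border replication (assumption)
        field_avg = (prev_col[r] + next_col[r]) / 2.0
        total += H_TF[c + k] * field_avg
    return total


flat = [7.0] * 5
bump = [0.0, 0.0, 8.0, 0.0, 0.0]
thf_flat = temporal_high_freq(flat, flat, 2)
thf_peak = temporal_high_freq(bump, bump, 2)
```

Since the coefficients of h_TF sum to zero, flat areas contribute no temporal high-frequency component, whereas a thin horizontal detail produces a positive response at its center and negative responses next to it.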

To combine the results of Lanczos interpolation and spatio-temporal interpolation, reliability terms of the temporal information, based on both motion detection and feathering artifact detection, are used as weighting factors. The motion detection process and the feathering artifact detection process are performed at each pixel position. For motion detection, the temporal difference D_T is computed over five fields in order to detect both normal and fast motions. D_T is given by

\begin{align} D_{T} &= | {\hat{f}}_{Lz}^{if} (i + k, j + Δ_{j}, n - 1) - {\hat{f}}_{Lz}^{if} (i + k, j + Δ_{j}, n + 1) | \\ &\quad + \sum_{k = 0}^{1} \sum_{l \in \{0, 2\}} | {\hat{f}}_{Lz}^{if} (i + k, j + Δ_{j}, n - 2 + l) - {\hat{f}}_{Lz}^{if} (i + k, j + Δ_{j}, n + l) | / 2 . \end{align}

(12)

Here, |x| returns the absolute value of x. To detect the feathering artifacts, the vertical difference D_V is computed using ${\hat{f}}_{FA}^{if}$ as

\begin{align} D_{V} = \min \{D_{V1}, D_{V2}, D_{V3}\}, \end{align}

(13)

where

\begin{align} D_{V1} &= | {\hat{f}}_{Lz}^{if} (i, j + Δ_{j}, n) - {\hat{f}}_{FA}^{if} (i, j + Δ_{j}, n) |, \\ D_{V2} &= | {\hat{f}}_{Lz}^{if} (i, j + Δ_{j}, n) - {\hat{f}}_{FA}^{if} (i - 1, j + Δ_{j}, n) |, \\ D_{V3} &= | {\hat{f}}_{Lz}^{if} (i + 1, j + Δ_{j}, n) - {\hat{f}}_{FA}^{if} (i + 1, j + Δ_{j}, n) | . \end{align}

(14)

The arbitration rules for α in (3) can be summarized as

α = \frac{\min \{D_{T} + D_{V}, τ_{1}\}}{τ_{1}},

(15)

where τ₁ represents a predetermined constant for normalization. According to (12), if D_T has a large value, the current pixel can be considered a motion pixel. In these motion areas, D_V generally has a large value as well because of the feathering artifacts. Thus, the final result is close to the result of spatio-temporal interpolation.
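
The weight computation can be sketched as below. The sketch transcribes (6), (13), (14), (15), and (3) literally as printed; the function names, the list-based column representation, and the value used for τ₁ are hypothetical, and the temporal difference D_T of (12) is passed in as an already computed number rather than derived from the fields.

```python
def field_average(prev_col, next_col):
    """Field average of Eq. (6) for one column of fields n - 1 and n + 1."""
    return [(p + q) / 2.0 for p, q in zip(prev_col, next_col)]


def vertical_difference(cur_col, fa_col, i):
    """Vertical difference D_V of Eqs. (13) and (14); requires 1 <= i <= len - 2."""
    d_v1 = abs(cur_col[i] - fa_col[i])
    d_v2 = abs(cur_col[i] - fa_col[i - 1])
    d_v3 = abs(cur_col[i + 1] - fa_col[i + 1])
    return min(d_v1, d_v2, d_v3)


def blend_weight(d_t, d_v, tau1):
    """Weight alpha of Eq. (15), clipped to the range [0, 1]."""
    return min(d_t + d_v, tau1) / tau1


def blend(f_lz, f_st, alpha):
    """Convex combination of Eq. (3)."""
    return alpha * f_lz + (1.0 - alpha) * f_st


TAU1 = 32.0   # hypothetical normalization constant, not taken from the paper
cur = [5.0, 5.0, 5.0]
fa = field_average([5.0, 5.0, 5.0], [5.0, 5.0, 5.0])
alpha_static = blend_weight(0.0, vertical_difference(cur, fa, 1), TAU1)
```

Clipping the sum D_T + D_V at τ₁ before normalizing keeps α within [0, 1], so the output of (3) always lies between the two candidate values.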

3.1.2 Ringing artifact reduction technique to improve interpolation performance

As discussed in Section 2, the side lobes of a Lanczos kernel can improve reconstruction performance along edge regions, but they often degrade image quality due to ringing artifacts. Moreover, although the temporal high-pass filter of spatio-temporal interpolation can be useful for enhancing edge information, the resulting images generally suffer from shooting artifacts caused by overshooting or undershooting. In our previous work [18], we introduced a kernel-based image upscaling method that handled the ringing artifact problem caused by using a wider window for the Lanczos kernel. An extension of these concepts is now presented in order to avoid ringing and shooting artifacts. The proposed method is performed one-dimensionally after each one-dimensional interpolation process, including Lanczos interpolation and spatio-temporal interpolation. Since the ringing artifact reduction process for Lanczos interpolation is similar to that discussed in [18], we only describe the ringing artifact reduction process for spatio-temporal interpolation in this section.

The proposed ringing artifact reduction method is composed of two steps: median filtering and arbitration. First, a median filter is applied to the result of spatio-temporal interpolation with the two nearest input data values. The result of median filtering is called the median spatio-temporal value in this paper. The median spatio-temporal value of the vertical direction is obtained as

\begin{align} {\hat{f}}_{medST}^{pf} (i + Δ_{i}, j + Δ_{j}, n) = \operatorname{med} \{{\hat{f}}_{arbiLz}^{if} (i, j + Δ_{j}, n), \; {\hat{f}}_{ST}^{pf} (i + Δ_{i}, j + Δ_{j}, n), \; {\hat{f}}_{arbiLz}^{if} (i + 1, j + Δ_{j}, n)\}, \end{align}

(16)

where ${\hat{f}}_{medST}^{pf}$ and ${\hat{f}}_{arbiLz}^{if}$ represent the median spatio-temporal value and the ringing-artifact-reduced result of the horizontal Lanczos interpolator, respectively, and med{x,y,z} represents a three-input median filter that returns the median value of x, y, and z. This median process is very efficient at removing ringing artifacts since median filtering is particularly effective against shot noise. However, this process often degrades image details by restricting the interpolated value to the range spanned by its neighboring pixels. This restriction prevents the reconstruction of the high-frequency components of an image.

In the second arbitration step, the two highly complementary results are combined effectively in order to avoid ringing artifacts while reconstructing the high-frequency components. The arbitration process is performed using the results of (8) and (16) as follows:

\begin{align} {\hat{f}}_{arbiST}^{pf} (i + Δ_{i}, j + Δ_{j}, n) = β \cdot {\hat{f}}_{ST}^{pf} (i + Δ_{i}, j + Δ_{j}, n) + (1 - β) \cdot {\hat{f}}_{medST}^{pf} (i + Δ_{i}, j + Δ_{j}, n), \end{align}

(17)

where β (0 ≤ β ≤ 1) represents the weighting coefficient that controls the contributions of the two results. In the edge regions, β is set close to 1 to reconstruct the high-frequency components by adopting the spatio-temporal result. However, in regions with ringing artifacts, β decreases so that the artifacts are removed by adopting the median spatio-temporal result. The arbitration weight β is obtained from the difference values between the neighboring pixels as

\begin{align} β = \frac{\min \{D_{U}, D_{D}, τ_{2}\}}{τ_{2}}, \end{align}

(18)

where τ₂ represents a predetermined constant for normalization, and the differences (D_U and D_D) are obtained as

\begin{align} D_{U} &= \max \{| {\hat{f}}_{arbiLz}^{if} (i - 1, j + Δ_{j}, n) - {\hat{f}}_{arbiLz}^{if} (i, j + Δ_{j}, n) |, \; | {\hat{f}}_{arbiLz}^{if} (i, j + Δ_{j}, n) - {\hat{f}}_{arbiLz}^{if} (i + 1, j + Δ_{j}, n) |\}, \\ D_{D} &= \max \{| {\hat{f}}_{arbiLz}^{if} (i, j + Δ_{j}, n) - {\hat{f}}_{arbiLz}^{if} (i + 1, j + Δ_{j}, n) |, \; | {\hat{f}}_{arbiLz}^{if} (i + 1, j + Δ_{j}, n) - {\hat{f}}_{arbiLz}^{if} (i + 2, j + Δ_{j}, n) |\}, \end{align}

(19)

where D_U and D_D represent the difference values of the upper and lower sides, respectively. According to (18) and (19), if both D_U and D_D have large values, high-frequency edges exist around the current pixel. Thus, the final result is generally close to the spatio-temporal value in order to preserve the high-frequency edges. However, if either D_U or D_D is small, the region is classified as either a flat or a step edge region. In this case, the median spatio-temporal value dominates the final result, so that the ringing artifacts are removed.
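
The two steps of (16) to (19) can be sketched for a single interpolated sample as follows. The function names, the default value of τ₂, and the representation of the ringing-reduced column as a Python list are assumptions made for this illustration; `col[i]` and `col[i + 1]` stand for the two nearest input values above and below the interpolated position.

```python
def median3(x, y, z):
    """Three-input median filter med{x, y, z}."""
    return sorted((x, y, z))[1]


def arbitration_weight(col, i, tau2):
    """Weight beta of Eqs. (18) and (19); requires 1 <= i <= len(col) - 3."""
    d_u = max(abs(col[i - 1] - col[i]), abs(col[i] - col[i + 1]))
    d_d = max(abs(col[i] - col[i + 1]), abs(col[i + 1] - col[i + 2]))
    return min(d_u, d_d, tau2) / tau2


def reduce_ringing(col, i, f_st, tau2=16.0):
    """Blend the spatio-temporal value with its median-limited version, Eq. (17)."""
    f_med = median3(col[i], f_st, col[i + 1])      # Eq. (16)
    beta = arbitration_weight(col, i, tau2)
    return beta * f_st + (1.0 - beta) * f_med


# Overshoot in a flat area is pulled back to the neighboring value.
flat_result = reduce_ringing([10.0, 10.0, 10.0, 10.0], 1, 14.0)
# In a highly textured area, the spatio-temporal value is kept.
texture_result = reduce_ringing([0.0, 40.0, 0.0, 40.0], 1, 50.0)
```

In the flat example, both D_U and D_D are zero, so β is 0 and the median value is used; in the textured example, both differences exceed τ₂, so β is 1 and the spatio-temporal value is preserved.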

In summary, the ringing artifact reduction process is performed in the horizontal direction with ${\hat{f}}_{Lz}^{if} (i + k, j + Δ_{j}, n)$ in (2) and then the vertical ringing artifact reduction process with ${\hat{f}}_{arbiLz}^{if} (i + k, j + Δ_{j}, n)$ is applied to the results of both Lanczos interpolation and spatio-temporal interpolation. Thus, the final result of the initial resampling step is obtained by

\begin{align} {\hat{f}}_{Init}^{pf} (i + Δ_{i}, j + Δ_{j}, n) = α \cdot {\hat{f}}_{arbiLz}^{pf} (i + Δ_{i}, j + Δ_{j}, n) + (1 - α) \cdot {\hat{f}}_{arbiST}^{pf} (i + Δ_{i}, j + Δ_{j}, n) . \end{align}

(20)

3.2 Edge enhancement step

Based on the LFI approach, the proposed method in Section 3.1 is used to resize images to fit the display format. However, LFI approaches are likely to smooth image details during the resampling process, and they usually produce jagged edge artifacts in the diagonal edge regions. In this section, we propose an edge enhancement algorithm that corrects the jagged edge artifacts and improves the sharpness of the initially interpolated images. For convenience of notation, the initially upscaled image is represented as F(i,j) instead of ${\hat{f}}_{Init}^{pf} (i, j, n)$ as is used in (20).

3.2.1 Jagged edge correction with an ellipsoidal kernel

In order to remove the jagged edge artifacts, an accurate estimate of the edge direction is important. In [18–20], it is assumed that the edge direction around the pixel of interest F(i,j), where F represents the initially upscaled image, is piecewise constant and that the gradient vectors within a small mask are, on average, orthogonal to the edge direction. Therefore, the estimation of the edge direction can be formulated as the task of finding a unit vector d that minimizes the following cost function:

\begin{align} cost (d) &= \sum_{k = - k_{0}}^{k_{0}} \sum_{l = - l_{0}}^{l_{0}} {\{d^{T} \cdot g (k, l)\}}^{2} \\ &= d^{T} \cdot \Bigl[ \sum_{k = - k_{0}}^{k_{0}} \sum_{l = - l_{0}}^{l_{0}} g (k, l) \cdot g^{T} (k, l) \Bigr] \cdot d = d^{T} C d, \end{align}

(21)

where g(k,l) = [g_v(k,l) g_h(k,l)]^T represents the gradient vector of F(i + k,j + l), and the covariance matrix C is determined as

\begin{align} C = \begin{bmatrix} \sum_{k} \sum_{l} g_{v} (k, l) g_{v} (k, l) & \sum_{k} \sum_{l} g_{v} (k, l) g_{h} (k, l) \\ \sum_{k} \sum_{l} g_{v} (k, l) g_{h} (k, l) & \sum_{k} \sum_{l} g_{h} (k, l) g_{h} (k, l) \end{bmatrix} = \begin{bmatrix} c_{00} & c_{01} \\ c_{10} & c_{11} \end{bmatrix}, \end{align}

(22)

where g_v and g_h represent derivatives in the vertical and horizontal directions, respectively.

A unique vector d that minimizes the cost function in (21) is a good estimate of the edge direction, so the JEC can be carried out in the direction of d. However, finding this unique optimal solution requires an additional technique such as singular value decomposition (SVD) [19]. Therefore, instead of finding the optimal solution, the proposed JEC method uses the cost function in (21) to estimate which vectors are closer to the edge direction. Let d = [k l]^T denote a vector pointing from the current pixel at (i,j) to the neighboring pixel at (i + k,j + l). Then, the similarity between d and the edge direction is obtained from the cost function in (21) as follows:

\begin{align} cost (d) = \begin{bmatrix} k & l \end{bmatrix} \begin{bmatrix} c_{00} & c_{01} \\ c_{10} & c_{11} \end{bmatrix} \begin{bmatrix} k \\ l \end{bmatrix} = c_{00} k^{2} + (c_{01} + c_{10}) kl + c_{11} l^{2} . \end{align}

(23)

As mentioned above, cost(d) decreases as the orientation of d approaches the edge direction. However, if d is parallel to the gradient vector, cost(d) returns a large value. Using this characteristic, an adaptive smoothing kernel is obtained by adopting a Gaussian kernel as

ε (k, l) = \exp \left\{- \frac{cost (d)}{σ}\right\} = \exp \left\{- \frac{cost ({[k \; l]}^{T})}{σ}\right\},

(24)

where ε(k,l) represents the coefficients of the adaptive smoothing kernel. The cost function in (23) has the form of an elliptic equation, and it spreads the Gaussian kernel along the local edge direction. Therefore, in this paper, this adaptive kernel is called an ellipsoidal kernel, which is similar to the steering kernel used in [21]. σ in (24) controls the scale of the kernel, and it is determined as

σ = ν \cdot max (c_{00}, c_{11}),

(25)

where ν represents a predetermined smoothing parameter. In Figure 6, an example of an ellipsoidal kernel is illustrated with the gradient vector, the edge direction, and an arbitrary vector d. As shown in (24) and Figure 6, the ellipsoidal kernel assigns large weights along the edge direction. Therefore, pixels lying along similar edge directions are smoothed by the ellipsoidal kernel, and as a result, jagged edge artifacts are corrected.
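The kernel coefficients in (24) and the scale in (25) can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code: the function name ellipsoidal_weight is hypothetical, the default ν = 0.5 is an arbitrary placeholder (the text only says that ν is predetermined), and returning a weight of 1 in perfectly flat regions, where σ would be zero, is an assumption added to avoid a division by zero.

```python
import math


def ellipsoidal_weight(k, l, c, nu=0.5):
    """Weight eps(k, l) of Eq. (24) with sigma from Eq. (25).

    c  : tuple (c00, c01, c10, c11) from Eq. (22)
    nu : smoothing parameter; 0.5 is a placeholder, not a value from the paper
    """
    c00, c01, c10, c11 = c
    sigma = nu * max(c00, c11)                                # Eq. (25)
    if sigma == 0.0:
        # Assumption: no gradient energy at all, so every direction is equally good.
        return 1.0
    cost = c00 * k * k + (c01 + c10) * k * l + c11 * l * l    # Eq. (23)
    return math.exp(-cost / sigma)                            # Eq. (24)


if __name__ == "__main__":
    c = (9.0, 9.0, 9.0, 9.0)  # covariance of a diagonal ramp with gradient along [1, 1]
    for d in [(1, -1), (1, 1), (-1, 2), (1, -2)]:
        print(d, ellipsoidal_weight(d[0], d[1], c))
```

Because the cost is a quadratic form, ε(k,l) = ε(-k,-l), which is the point symmetry used later to merge the two halves of each sum.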

Figure 7 illustrates the proposed method performed with an ellipsoidal kernel and shows the positions of the pixels used in the proposed method. In our experiments, a sufficiently large window was necessary to ensure performance, but a larger window also increases the computational cost. Therefore, we divide the proposed method into vertical and horizontal processes and use only the pixels within the cross-shaped region for the filtering process, as shown in Figure 7. Also, some pixels, such as the four nearest neighbors of the current pixel, degrade the correction performance since large weights are generally assigned to them because of their short distances from the current pixel. Therefore, the vertical and horizontal lines that pass through the current pixel are excluded from the filtering process, as shown in Figure 7. With the remaining pixels, the proposed method is performed in the horizontal and vertical directions. First, the horizontal process is performed as

\begin{align} Z_{hor} & = \sum_{l \neq 0, l = - L}^{L} \left[ ε(-1, l) + ε(1, -l) \right] = 2 \cdot \sum_{l \neq 0, l = - L}^{L} ε(-1, l), \\ F_{Jc-hor}(i, j) & = \sum_{l \neq 0, l = - L}^{L} \left[ ε(-1, l) \cdot F(i - 1, j + l) + ε(1, -l) \cdot F(i + 1, j - l) \right] \\ & = 2 \cdot \sum_{l \neq 0, l = - L}^{L} ε(-1, l) \cdot F_{hor}(i, j + l), \end{align}

(26)

where F and F_Jc-hor represent the initially interpolated image and the weighted sum of the horizontal process, respectively. Z_hor is used for normalization, and F_hor(i,j + l) is defined as

F_{hor} (i, j + l) = \frac{F (i - 1, j + l) + F (i + 1, j - l)}{2} .

(27)

In (26), ε(1,-l) is the same as ε(-1,l) since an ellipsoidal kernel is point symmetric with respect to the middle point. In the same way, the vertical process is performed as

\begin{align} Z_{ver} & = 2 \cdot \sum_{| k | > 1, k = - K}^{K} ε (- k, - 1), \\ F_{Jc-ver} (i, j) & = 2 \cdot \sum_{| k | > 1, k = - K}^{K} ε (- k, - 1) \cdot F_{ver} (i - k, j), \end{align}

(28)

where F_Jc-ver represents the weighted sum in the vertical direction. Z_ver is used for normalization, and F_ver(i-k,j) is defined as

F_{ver} (i - k, j) = \frac{F (i - k, j - 1) + F (i + k, j + 1)}{2} .

(29)

From (26) and (28), the final JEC result is obtained as

F_{Jc} (i, j) = \frac{F (i, j) + F_{Jc-hor} (i, j) + F_{Jc-ver} (i, j)}{1 + Z_{hor} + Z_{ver}} .

(30)
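Equations (26) to (30) can be read as one per-pixel routine, sketched below in Python. The sketch makes several assumptions that should be checked against an actual implementation: the function name jec_pixel is hypothetical, the kernel is passed in as a callable eps(k, l) (for example, one built from (24)), and pixels too close to the border are rejected instead of being padded. The defaults L = 7 and K = 3 follow the values reported in Section 4.

```python
def jec_pixel(F, i, j, eps, L=7, K=3):
    """Jagged edge correction of one pixel following Eqs. (26)-(30).

    F   : 2-D list holding the initially interpolated image
    eps : callable eps(k, l) returning the ellipsoidal kernel weight
    """
    rows, cols = len(F), len(F[0])
    if not (K <= i < rows - K and L <= j < cols - L):
        raise ValueError("pixel too close to the border for this window")

    z_hor = 0.0
    f_jc_hor = 0.0
    for l in range(-L, L + 1):
        if l == 0:                      # the column through (i, j) is excluded
            continue
        w = eps(-1, l)
        f_hor = (F[i - 1][j + l] + F[i + 1][j - l]) / 2.0    # Eq. (27)
        z_hor += 2.0 * w                                     # Eq. (26), normalizer
        f_jc_hor += 2.0 * w * f_hor                          # Eq. (26), weighted sum

    z_ver = 0.0
    f_jc_ver = 0.0
    for k in range(-K, K + 1):
        if abs(k) <= 1:                 # rows i-1, i, i+1 are not revisited
            continue
        w = eps(-k, -1)
        f_ver = (F[i - k][j - 1] + F[i + k][j + 1]) / 2.0    # Eq. (29)
        z_ver += 2.0 * w                                     # Eq. (28), normalizer
        f_jc_ver += 2.0 * w * f_ver                          # Eq. (28), weighted sum

    return (F[i][j] + f_jc_hor + f_jc_ver) / (1.0 + z_hor + z_ver)  # Eq. (30)


if __name__ == "__main__":
    ramp = [[float(r + c) for c in range(7)] for r in range(7)]
    print(jec_pixel(ramp, 3, 3, lambda k, l: 1.0, L=2, K=3))
```

Each F_hor and F_ver averages two pixels that are point symmetric about (i,j), so a constant image or a linear ramp passes through unchanged for any choice of weights. This is a convenient sanity check for the normalization in (30).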

3.2.2 Jagged edge correction with transient improvement

In general, enlarging an image introduces blurring into the initially interpolated image, and this blurring is perceived more conspicuously in regions with jagged edge artifacts. In these regions, the interpolation process was performed across the contrasting edges, so it produced blurred edges that look like a staircase. In the proposed JEC method, an ellipsoidal kernel was used to reduce these stair-like artifacts by smoothing them along the local edge direction. However, since the ellipsoidal kernel is a kind of low pass filter, some image details are smoothed by the kernel during the filtering process. Therefore, in this paper, a transient improvement (TI) technique is performed simultaneously with the filtering process in order to improve the sharpness of the initially interpolated images.

From the viewpoint of enhancing the sharpness of an image, the basic idea of TI methods may seem similar to that of general sharpening methods. However, TI methods are more specialized for improving the slow transitions of blurred edges since they fundamentally prevent overshooting and undershooting. To improve a slow transition with general sharpening methods, the amount of sharpening has to be increased. However, an excessive amount of sharpening tends to degrade image quality because it produces severe overshooting and undershooting, which often appear as unpleasant white and black bands along contrasting edges. In contrast, TI algorithms can produce steep and natural edge transitions without undershooting and overshooting. In image upscaling applications, many edges show poor transitions, so the TI technique can be used efficiently to improve the sharpness of the upscaled images. In Figure 8, the behavior of the TI algorithm is illustrated briefly.

TI methods generally consist of two steps [22–24]. In the first step, a high-frequency boost filter is adopted as a pre-filter to enhance the slow transition of blurred edges. That is, a correction signal is added to the blurred signal to reconstruct the original high-frequency component. The above description can be written as the following equation:

X_{HB} = X_{in} + h_{HF} * X_{in} = h_{HB} * X_{in},

(31)

where X_in and X_HB represent the input signal and the filtered result, respectively. h_HF and h_HB represent a high pass filter and a high-frequency boost filter, respectively. For example, the second-order derivative operator has been used as h_HF in several conventional methods. In the second step, the processed signal is limited to the proper range to prevent overshooting and undershooting:

\begin{align} X_{TI}(i, j) & = TI\{X_{HB}(i, j), τ_{max}, τ_{min}\} \\ & = \begin{cases} τ_{max}, & \text{if } X_{HB}(i, j) > τ_{max} \\ τ_{min}, & \text{if } X_{HB}(i, j) < τ_{min} \\ X_{HB}(i, j), & \text{otherwise} \end{cases} . \end{align}

(32)

In most TI algorithms, τ _max and τ _min are usually set as the local maximum and minimum values found within a predefined window. In this section, the TI method is combined with the filtering process, described in Section 3.2.1. The TI process is performed in the vertical and horizontal directions. Since the vertical and horizontal filtering processes are exactly the same, we will only describe the horizontal process in this paper.
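The two generic TI steps in (31) and (32) can be illustrated on a one-dimensional signal. The Python sketch below is an assumption-laden example rather than any of the cited algorithms: h_HF is taken as the negated second difference [-1, 2, -1] scaled by a hypothetical gain, border samples are replicated, and τ_max and τ_min are the extrema inside a window of hypothetical half-width radius.

```python
def transient_improve_1d(x, gain=1.0, radius=2):
    """Apply a high-frequency boost (Eq. 31) and range limiting (Eq. 32).

    gain   : strength of the correction signal (illustrative value)
    radius : half-width of the window used to find tau_max and tau_min
    """
    n = len(x)
    out = []
    for i in range(n):
        left = x[max(i - 1, 0)]                 # replicate the first sample
        right = x[min(i + 1, n - 1)]            # replicate the last sample
        boosted = x[i] + gain * (2 * x[i] - left - right)    # Eq. (31)
        window = x[max(i - radius, 0): i + radius + 1]
        tau_min, tau_max = min(window), max(window)
        out.append(min(max(boosted, tau_min), tau_max))      # Eq. (32)
    return out


if __name__ == "__main__":
    blurred_edge = [0, 0, 10, 50, 90, 100, 100]
    print(transient_improve_1d(blurred_edge))
```

For the blurred edge in the example, the boost alone would produce values of -20 and 120 around the transition; the limiting step clamps them to the local range, so the edge becomes steeper without undershooting or overshooting.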

In (26) and (27), F_hor(i,j + l) is used for the filtering process, and it is obtained by averaging F(i-1,j + l) and F(i + 1,j-l). However, this direct average causes blurring during the filtering process when the two values are quite different. Therefore, the TI technique is applied to F_hor(i,j + l) to reduce the blurring. Let F_TI-hor(i,j + l) be the result of the TI process, which is used instead of F_hor(i,j + l). Then, (26) is changed to the following equation:

\begin{align} Z_{hor} & = 2 \cdot \sum_{l \neq 0, l = - L}^{L} ε (- 1, l) \\ F_{Jc-hor} (i, j) & = 2 \cdot \sum_{l \neq 0, l = - L}^{L} ε (- 1, l) \cdot F_{TI-hor} (i, j + l), \end{align}

(33)

where F_TI-hor(i,j + l) is obtained as

F_{TI-hor} (i, j + l) = TI \{F_{HB-hor} (i, j + l), τ_{max}, τ_{min}\},

(34)

where F_HB-hor(i,j + l) represents the result of high-frequency boost filtering. It is obtained by adding the high-frequency components to F_hor(i,j + l) as

\begin{align} F_{HB-hor} (i, j + l) = F_{hor} (i, j + l) + {[\begin{array}{c} 0.5 \\ - 0.25 \\ - 0.25 \end{array}]}^{T} \cdot [\begin{array}{c} F (i, j) \\ F (i - 1, j - l) \\ F (i + 1, j + l) \end{array}] \\ = {[\begin{array}{c} 0.5 \\ 0.5 \\ 0.5 \\ - 0.25 \\ - 0.25 \end{array}]}^{T} \cdot [\begin{array}{c} F (i - 1, j + l) \\ F (i + 1, j - l) \\ F (i, j) \\ F (i - 1, j - l) \\ F (i + 1, j + l) \end{array}] = {[\begin{array}{c} 0.5 \\ 0.5 \\ 0.5 \\ - 0.25 \\ - 0.25 \end{array}]}^{T} \cdot [\begin{array}{c} b_{1} \\ b_{2} \\ a \\ c_{1} \\ c_{2} \end{array}], \end{align}

(35)

where l falls within the range of [-L,L]. The positions of the pixels [b₁, b₂, a, c₁, c₂] are illustrated in Figure 9 as an example.

As shown in Figure 9, F_hor(i,j + l) is obtained by averaging b₁ and b₂ in the direction of the green arrow. High pass filtering is applied to the pixels [a, c₁, c₂] in the opposite direction to F_hor(i,j + l) (along the dotted purple arrow). We assume that the direction of the green arrow is similar to the real edge direction. Then, a, c₁, and c₂, which lie in the opposite direction to the green arrow, form a contrasting edge, and the high-frequency components are extracted from this edge. That is, the high-frequency components obtained from pixels across an edge (a, c₁, and c₂) are used to restore the blurred edges of an initially interpolated image. After the filtering process, the filtered value is limited to the proper range between τ_min and τ_max to prevent artifacts caused by overshooting and undershooting:

\begin{align} \begin{bmatrix} τ_{max} \\ τ_{min} \end{bmatrix} = \begin{bmatrix} F_{max}(l) \\ F_{min}(l) \end{bmatrix} = \begin{bmatrix} \max_{n \in \{n \mid -|l| \leq n \leq |l|\}} F(i, j + n) \\ \min_{n \in \{n \mid -|l| \leq n \leq |l|\}} F(i, j + n) \end{bmatrix}, \end{align}

(36)

where |l| returns the absolute value of l. F_max(l) and F_min(l) represent the local maximum and local minimum values within the range of [-|l|,|l|]. The local extrema are searched on the middle line among the three lines illustrated in Figure 9. In general, the middle line crosses the edge at (i,j), so it contains both the maximum and minimum pixel values within a local window. From (34) to (36), we obtain F_TI-hor(i,j + l), which provides sharper images than F_hor(i,j + l).
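The horizontal TI term defined by (34) to (36) can be sketched for a single offset l as follows. The function names hf_boost_hor and ti_hor are hypothetical, the image is a plain 2-D list, and the caller is assumed to keep (i,j) at least one row and |l| columns away from the border; none of these details come from the paper.

```python
def hf_boost_hor(F, i, j, l):
    """High-frequency boosted value F_HB-hor(i, j + l) of Eq. (35)."""
    b1 = F[i - 1][j + l]   # averaged along the assumed edge direction
    b2 = F[i + 1][j - l]
    a = F[i][j]            # current pixel
    c1 = F[i - 1][j - l]   # pixels across the assumed edge
    c2 = F[i + 1][j + l]
    return 0.5 * b1 + 0.5 * b2 + 0.5 * a - 0.25 * c1 - 0.25 * c2


def ti_hor(F, i, j, l):
    """Range-limited value F_TI-hor(i, j + l) of Eq. (34)."""
    middle = F[i][j - abs(l): j + abs(l) + 1]   # middle line, n in [-|l|, |l|]
    tau_min, tau_max = min(middle), max(middle)  # Eq. (36)
    return min(max(hf_boost_hor(F, i, j, l), tau_min), tau_max)


if __name__ == "__main__":
    F = [[0, 20, 0, 200, 0],
         [0, 100, 180, 200, 0],
         [0, 200, 0, 40, 0]]
    print(hf_boost_hor(F, 1, 2, 1), ti_hor(F, 1, 2, 1))
    print(hf_boost_hor(F, 1, 2, -1), ti_hor(F, 1, 2, -1))
```

In this toy image, the boosted values for l = 1 and l = -1 fall outside the range spanned by the middle line, and the limiter pulls them back to the local maximum and minimum, respectively.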

In Figure 10, we present two examples of the proposed TI process when l = 2 and l = -2. As shown in Figure 10b, jagged edge artifacts are reduced along the edges at 0° to 45° angles (blue circled) when l = 2. When l = -2, on the other hand, the jagged edge artifacts are reduced along the edges at 135° to 180° angles (red circled) in Figure 10c. Even though the edges at 135° to 180° angles in Figure 10b and the edges at 0° to 45° angles in Figure 10c are degraded by the TI process, the degraded results are effectively excluded by the ellipsoidal kernel, which assigns large weights along the estimated edge direction. Therefore, the l = 2 results in Figure 10b dominate the final results along the edges at 0° to 45° angles, and the l = -2 results in Figure 10c dominate the final results along the edges at 135° to 180° angles.

For the vertical process, F_TI-ver(i-k,j) is used instead of F_ver(i-k,j), and it is obtained by

\begin{align} F_{TI-ver} (i - k, j) = TI \{F_{HB-ver} (i - k, j), τ_{max}, τ_{min}\}, \end{align}

(37)

and the vertical JEC process is performed as

\begin{align} Z_{ver} & = 2 \cdot \sum_{| k | > 1, k = - K}^{K} ε (- k, - 1), \\ F_{Jc-ver} (i, j) & = 2 \cdot \sum_{| k | > 1, k = - K}^{K} ε (- k, - 1) \cdot F_{TI-ver} (i - k, j) . \end{align}

(38)

The final JEC result is obtained by (30).

4 Experimental results

The performance of the proposed method was tested with well-known common intermediate format (CIF), SD, and HD video sequences. We converted the progressive sequences into an interlaced format or downsampled them, depending on the purpose of each experiment. For performance comparisons, three groups of conventional deinterlacing methods were implemented: LFI approaches including cubic spline (CS) [11] and Lanczos; EDI approaches including new edge dependent deinterlacing (NEDD) [13] and local surface model-based deinterlacing (LSMD) [14]; and combinations of LFI and EDI approaches including spatial-temporal content-adaptive deinterlacing (STCAD) [3] and motion adaptive vertical temporal filtering (MAVTF) [4]. Two groups of image upscaling methods were implemented: LFI approaches including CS [11] and EDI approaches including new edge-directed interpolation (NEDI) [15] and soft-decision adaptive interpolation (SAI) [17]. In the experiments, the interpolated positions of deinterlacing and upscaling were adaptively adjusted for the CS and Lanczos, respectively. Here, Lanczos denotes Lanczos interpolation without the proposed ringing reduction method.

The peak signal-to-noise ratio (PSNR) and the structural similarity (SSIM) [25] of the luminance channel were used for quantitative measurement. The PSNR is defined as PSNR = 10 log₁₀(255²/MSE), where MSE represents the mean squared error between the original and the reconstructed images. The SSIM compares local patterns of pixel intensities that are normalized for luminance and contrast. Thus, the SSIM was used to gauge visual quality in a way that is closer to the human visual system. The SSIM index is represented by

\begin{align} SSIM(o_{m}, r_{m}) = \frac{(2 μ_{o} μ_{r} + C_{1}) (2 σ_{or} + C_{2})}{(μ_{o}^{2} + μ_{r}^{2} + C_{1}) (σ_{o}^{2} + σ_{r}^{2} + C_{2})}, \end{align}

(39)

where o and r represent the original and reconstructed images, respectively. o_m and r_m are the image contents at the mth local window of the image. μ and σ represent the mean and standard deviation of the pixels within an 11 × 11 pixel Gaussian window, respectively. σ_or represents the covariance between the original and reconstructed images over the same 11 × 11 pixel Gaussian window. C₁ and C₂ are constants that increase stability: C₁ = 6.5025 and C₂ = 58.5225 for 8-bit images. In order to evaluate the overall image quality, the mean structural similarity (MSSIM) was employed:

\begin{align} MSSIM (o, r) = \frac{1}{M} \sum_{m = 1}^{M} SSIM (o_{m}, r_{m}), \end{align}

(40)

where M represents the number of local windows of the image. The MSSIM ranged from 0 to 1. Therefore, a higher MSSIM value close to 1 meant that a given image was reconstructed with reduced degradation of structural information.
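The two quality measures can be sketched in plain Python as follows. This is a simplified illustration, not the evaluation code used for the tables: the function names are hypothetical, and the windows are non-overlapping blocks with uniform weights instead of the 11 × 11 Gaussian windows described above, so the resulting numbers would differ from those of a reference SSIM implementation.

```python
import math

C1, C2 = 6.5025, 58.5225  # stability constants quoted above for 8-bit images


def psnr(orig, recon):
    """PSNR in dB between two equally sized 2-D lists of 8-bit intensities."""
    n = len(orig) * len(orig[0])
    mse = sum((o - r) ** 2
              for row_o, row_r in zip(orig, recon)
              for o, r in zip(row_o, row_r)) / n
    return float("inf") if mse == 0 else 10.0 * math.log10(255.0 ** 2 / mse)


def ssim_window(o, r):
    """Eq. (39) for two flat lists of pixels (uniform weights: a simplification)."""
    n = len(o)
    mu_o, mu_r = sum(o) / n, sum(r) / n
    var_o = sum((x - mu_o) ** 2 for x in o) / n
    var_r = sum((y - mu_r) ** 2 for y in r) / n
    cov = sum((x - mu_o) * (y - mu_r) for x, y in zip(o, r)) / n
    return ((2 * mu_o * mu_r + C1) * (2 * cov + C2)) / (
        (mu_o ** 2 + mu_r ** 2 + C1) * (var_o + var_r + C2))


def mssim(orig, recon, win=8):
    """Eq. (40) over non-overlapping win x win blocks (an assumption)."""
    scores = []
    for i in range(0, len(orig) - win + 1, win):
        for j in range(0, len(orig[0]) - win + 1, win):
            o = [orig[a][b] for a in range(i, i + win) for b in range(j, j + win)]
            r = [recon[a][b] for a in range(i, i + win) for b in range(j, j + win)]
            scores.append(ssim_window(o, r))
    if not scores:
        raise ValueError("image is smaller than one window")
    return sum(scores) / len(scores)


if __name__ == "__main__":
    a = [[100.0] * 8 for _ in range(8)]
    b = [[110.0] * 8 for _ in range(8)]
    print(psnr(a, b), mssim(a, b))
```

Identical images give an infinite PSNR and an MSSIM of 1, while a uniform brightness offset lowers both measures.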

There are several parameters in the proposed method. Most of these parameters were set empirically and tested with various images to obtain the best results. a in (2) and b in (5) represent the sizes of the interpolation filters in the horizontal and vertical directions, respectively. Since the side lobes of a Lanczos kernel improve reconstruction performance along edge regions, the kernel size strongly affected the quality of the reconstructed image. In Figure 11, we present the upscaled images obtained by varying the size of the Lanczos interpolation kernel to analyze the effect of the kernel size on the image quality. Figure 11a presents the input image with vertically patterned edges. In Figure 11b,c,d, it was upscaled by a factor of 2 with horizontal kernel sizes of 4, 8, and 12, respectively. As shown in the figures, the high-frequency components were reconstructed better as the kernel size increased. In our observations, there was little improvement when the size was larger than 8. Therefore, we set the size as 8, as a compromise between performance and hardware complexity. In the experiments, we used a smaller interpolation filter in the vertical direction because of the line memory restriction of the hardware structure. Thus, a and b were set as a = 8 and b = 6, respectively. Also, τ₁ in (15) and τ₂ in (18) represent the predetermined constants for normalization. For 8-bit images, τ₁ was set as 96 for the mixing process of vertical interpolation, and τ₂ was set as 64 for the ringing artifact reduction process.

Generally, the window size used for the JEC process is the most important factor in determining performance, and a sufficient window size was required to reduce the jagged edge artifacts along nearly horizontal or vertical edges. In Figure 12, we present the jagged edge corrected images obtained by varying the window size of the JEC process to analyze the effect of the window size on the image quality. Figure 12a presents the input image with a nearly horizontal edge, which was degraded by jagged edge artifacts. In Figure 12b,d,f,h, it was corrected by the JEC process with horizontal window sizes of 5, 7, 9, and 11, respectively. As shown in the figures, the jagged edge artifacts were reduced substantially by the JEC process. In order to compare the performance of the JEC process with various window sizes, the difference images between the input image and the corrected images are presented in Figure 12c,e,g,i; each difference image shows the jagged edge artifacts corrected by the JEC process. As shown in the figures, more of the jagged edge artifacts were removed as the window size increased. In our observation, there was little improvement when the size was larger than 7. Therefore, L in (26) and (33) was set as 7 for the horizontal process. However, the vertical window size K in (28) and (38) was set as 3 to reduce the line memory required for hardware implementation. We used the same parameter values for all the test images.

In Figure 13, the performance of the proposed method was evaluated for the deinterlacing process. For this experiment, test video sequences were converted into an interlaced format, and the interlaced sequences were then deinterlaced with each method. Figure 13a,b,c,d,e,f,g shows the deinterlaced results of CS, Lanczos, NEDD, LSMD, STCAD, MAVTF, and the proposed method, respectively. As shown in Figure 13, the existing methods produced jagged edge artifacts along the diagonal edges in the red circled regions, but the proposed method provided satisfactory outputs without jagged edge artifacts. Tables 1 and 2 show the PSNRs and the MSSIMs of various deinterlacing methods, respectively. As described in Table 1, the proposed method recorded higher PSNR values than conventional deinterlacing methods in the majority of the test sequences. According to Table 2, the proposed method recorded higher MSSIM values than conventional deinterlacing methods in the majority of the test sequences. These experimental results demonstrate that the proposed method can be used for the deinterlacing process to improve image quality.

Table 1 PSNRs (dB) of various deinterlacing methods including the proposed method

Table 2 SSIMs of various deinterlacing methods including the proposed method

In Figure 14, the performance of the proposed method was evaluated for the VDFC process. For this experiment, the test video sequences were downsampled by a factor of 2 and then converted into an interlaced format. The downsampled and interlaced sequences were deinterlaced with a conventional deinterlacing method, and the resulting progressive sequences were then upsampled with conventional image upscaling methods. The STCAD was used to convert the interlaced format into a progressive format because it achieved high PSNR and SSIM values among the conventional deinterlacing methods (as shown in Tables 1 and 2). Figure 14a,b,c,d shows the results obtained from CS, Lanczos, NEDI, and SAI after performing STCAD. As shown in the red circled regions of Figure 14, the LFI-based image upsampling methods in Figure 14a,b suffered from jagged edge artifacts along the diagonal edges, whereas the EDI-based image upsampling methods in Figure 14c,d provided fine results along the diagonal edges. However, NEDI and SAI introduced some artifacts in the neighborhood of the character regions due to an incorrect estimation of covariance. In contrast, the proposed method produced high-quality images by connecting discontinuous edges and reduced the jagged edge artifacts substantially. Tables 3 and 4 present the PSNRs and the MSSIMs of various VDFC methods, respectively. As described in Table 3, the proposed method recorded higher PSNR values than conventional VDFC methods in the majority of the test sequences. According to Table 4, the proposed method recorded higher MSSIM values than conventional VDFC methods. From Tables 3 and 4, it can be verified that the proposed method outperformed the conventional approaches in terms of numerical values.

Table 3 PSNRs (dB) of various VDFC methods including the proposed method

Table 4 SSIMs of various VDFC methods including the proposed method

The performance of the proposed JEC method was also evaluated. We applied the proposed method to the deinterlaced images and the upscaled images. Figure 15a,c,e,g shows the deinterlaced results of CS, Lanczos, STCAD, and MAVTF, respectively. Using each of these as an input image, the proposed JEC method was performed to obtain the results in Figure 15b,d,f,h. Figure 16a,c shows the upscaled results of CS and Lanczos, respectively. Using each of these as an input image, the proposed method obtained the results in Figure 16b,d. As shown in Figures 15 and 16, the jagged edge artifacts were reduced substantially by the proposed method. Tables 5 and 6 present the PSNRs of various deinterlacing methods and image upscaling methods, respectively. As described in Table 5, the proposed method improved the PSNR of CS, Lanczos, STCAD, and MAVTF by 0.15, 0.26, 0.21, and 0.3 dB, respectively. As described in Table 6, the proposed method improved the PSNR of CS and Lanczos by 0.01 and 0.29 dB, respectively. In Tables 7 and 8, the MSSIMs of various deinterlacing methods and image upscaling methods are compared. As described in Table 7, the proposed method improved the MSSIM of CS, Lanczos, STCAD, and MAVTF by 0.002, 0.001, 0.001, and 0.002, respectively. According to Table 8, the proposed method improved the MSSIM of CS and Lanczos by 0.001 and 0.003, respectively. These experimental results demonstrate that the proposed JEC method can be used as a postprocessor for various deinterlacing methods and image upscaling methods in order to improve the image quality.

Table 5 PSNRs (dB) of various deinterlacing methods with and without the proposed JEC method

Table 6 PSNRs (dB) of various image upscaling methods with and without the proposed JEC method

Table 7 SSIMs of various deinterlacing methods with and without the proposed JEC method

Table 8 SSIMs of various image upscaling methods with and without the proposed JEC method

To show the computational requirements, the average run times for various image formats were measured (as shown in Table 9). For this experiment, we used a PC equipped with an Intel Core2 Quad Q8200 CPU. Note that the resampled images of CS, Lanczos, NEDI, and SAI were obtained after performing the SAVTF. Thus, the processing time of SAVTF was added to the total processing times of the conventional methods. As described in Table 9, the processing times increased with the resolution of the image format. EDI-based methods needed more time than LFI-based methods because they require many operations to estimate the edge direction; in return, the EDI-based methods provided high-quality results, whereas the LFI-based methods suffered from jagged edge artifacts in the diagonal edge regions. According to Table 9, the processing time of the proposed method was similar to that of NEDI. Thus, the proposed method and the NEDI method have similar complexity levels. However, the proposed method provided better objective and subjective performance than the other methods. Furthermore, the JEC process of the proposed method can be used as a postprocessor to improve the performance of many linear filtering interpolation methods. Thus, either the complete proposed method or the JEC method alone can be used, depending on the application.

Table 9 CPU times of various VDFC methods including the proposed method for various image formats

5 Conclusions

In this paper, we have proposed a flexible video resampling method with the advantages of both the LFI and EDI methods: the capabilities of converting image formats, resizing images by arbitrary ratios, and improving edge quality. The proposed method converts input interlaced sequences into upscaled progressive sequences in a single process and improves the image quality of the resampled images by correcting various interpolation artifacts, such as ringing, blurring, and jagged edge artifacts. In order to reduce the ringing artifacts, the proposed ringing reduction method was combined with Lanczos interpolation and spatio-temporal interpolation. Also, the proposed JEC method was applied to initially upscaled images to correct jagged edge artifacts and to improve the sharpness of the images. In particular, this JEC postprocessor can be very useful for various image resampling applications since it can be used in combination with other common LFI techniques. The proposed algorithm was applied to various test images to verify the performance. Simulation results show that the proposed method outperformed conventional methods both visually and numerically.

References

1. de Haan G: Large-video-display-format conversion. J. Soc. Inf. Display 2000, 8(1):79-87. 10.1889/1.1828706
2. de Haan G, Bellers EB: Deinterlacing - an overview. Proc. IEEE 1998, 86(9):1839-1857. 10.1109/5.705528
3. Lee GG, Lin H-Y, Wang M-J, Lai R-L, Jhuo CW, Chen B-H: Spatial-temporal content-adaptive deinterlacing algorithm. IET Image Process 2008, 2(6):323-336. 10.1049/iet-ipr:20070219
4. Lee K, Lee J, Lee C: Deinterlacing with motion adaptive vertical temporal filtering. IEEE Trans. Consum. Electron 2009, 55(2):636-643.
5. Chang Y-L, Lin S-F, Chen C-Y, Chen L-G: Video deinterlacing by adaptive 4-field global/local motion compensated approach. IEEE Trans. Circuits Syst. Video Technol 2005, 15(12):1569-1582.
6. Chen Y-R, Tai S-C: True motion compensated deinterlacing algorithm. IEEE Trans. Circuits Syst. Video Technol 2009, 19(10):1489-1498.
7. Huang Q, Zhao D, Ma S, Gao W, Sun H: Deinterlacing using hierarchical motion analysis. IEEE Trans. Circuits Syst. Video Technol 2010, 20(5):673-686.
8. Shannon CE: Communication in the presence of noise. Proc. I.R.E. 1983, MI-2: 31-39.
9. Lehmann TM, Gönner C: Survey: interpolation methods in medical image processing. IEEE Trans. Med. Imaging 1999, 18(11):1049-1075. 10.1109/42.816070
10. Keys RG: Cubic convolution interpolation for digital image processing. IEEE Trans. Acoustics, Speech Signal Process 1981, 29(6):1153-1160. 10.1109/TASSP.1981.1163711
11. Hou HS, Andrews HC: Cubic splines for image interpolation and digital filtering. IEEE Trans. Acoustics, Speech Signal Process 1978, 26(6):508-517.
12. Theußl T, Hauser H, Gröller ME: Mastering windows: improving reconstruction. In Proceedings of the 2000 IEEE Symposium on Volume Visualization. Salt Lake City; 9–10 Oct 2000:101-108.
13. Park MK, Kang MG, Nam K, Oh SG: New edge dependent deinterlacing algorithm based on horizontal edge pattern. IEEE Trans. Consum. Electron 2003, 49(4):1508-1512. 10.1109/TCE.2003.1261260
14. Park S-J, Jeong J: Local surface model-based deinterlacing algorithm. Opt. Eng 2011, 50(1):017004–1–017004-10.
15. Li X: New edge-directed interpolation. IEEE Trans. Image Process 2001, 10(10):1521-1527. 10.1109/83.951537
16. Tam WS, Kok CW, Siu WC: Modified edge-directed interpolation for images. J. Electron. Imaging 2010, 19(1):013011. 10.1117/1.3358372
17. Zhang X, Wu X: Image interpolation by adaptive 2-D autoregressive modeling and soft-decision estimation. IEEE Trans. Image Process 2008, 17(6):887-896.
18. Park CH, Chang J, Kang MG: Kernel-based image upscaling method with shooting artifact reduction. Proc. SPIE 8655 Image Processing: Algorithms and Systems XI 2013. doi:10.1117/12.2003326
19. Feng X, Milanfar P: Multiscale principal components analysis for image local orientation estimation. In Conference record of the 36th Asilomar Conference on Signals, Systems and Computers. Pacific Grove; 3–6 Nov 2002:478-482.
20. Chang J, Yoo DS, Park JH, Park SH, Kang MG: Edge directional interpolation for image upscaling with temporarily interpolated pixels. IET Electron. Lett 2011, 47(21):1176-1178. 10.1049/el.2011.2496
21. Takeda H, Farsiu S, Milanfar P: Kernel regression for image processing and reconstruction. IEEE Trans. Image Process 2007, 16(2):349-366.
22. Lin P, Kim YT: An adaptive color transient improvement algorithm. IEEE Trans. Consum. Electron 2003, 49(4):1326-1329. 10.1109/TCE.2003.1261236
23. Ohara K: Digital color transient improvement. U.S. Patent 5,920,357. 6 July 1999
24. Chang J, Shin GS, Park JH, Kang MG: Color transient improvement for signals with a bandlimited chrominance component. Opt. Eng 2007, 46(2):027002.1-027002.11.
25. Wang Z, Bovik AC, Sheikh HR, Simoncelli EP: Image quality assessment: from error visibility to structural similarity. IEEE Trans. Image Process 2004, 13(4):600-612. 10.1109/TIP.2003.819861

Acknowledgements

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIP) (No. 2012R1A2A4A01003732).

Author information

Authors and Affiliations

Department of Electrical and Electronic Engineering, Yonsei University, Seoul, 120-749, South Korea
Du Sic Yoo, Joonyoung Chang, Chul Hee Park & Moon Gi Kang

Corresponding author

Correspondence to Moon Gi Kang.

Additional information

Competing interests

The authors declare that they have no competing interests.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License ( https://creativecommons.org/licenses/by/2.0 ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Yoo, D.S., Chang, J., Park, C.H. et al. Video resampling algorithm for simultaneous deinterlacing and image upscaling with reduced jagged edge artifacts. EURASIP J. Adv. Signal Process. 2013, 188 (2013). https://doi.org/10.1186/1687-6180-2013-188

Received: 31 July 2013
Accepted: 02 December 2013
Published: 20 December 2013
DOI: https://doi.org/10.1186/1687-6180-2013-188
