No-reference image quality metric based on image classification
EURASIP Journal on Advances in Signal Processing, volume 2011, Article number: 65 (2011)
In this article, we present a new no-reference (NR) objective image quality metric based on image classification. We also propose a new blocking metric and a new blur metric. Both are NR metrics since they need no information from the original image. The blocking metric was computed by taking into account that the visibility of horizontal and vertical blocking artifacts changes with the background luminance level. When computing the blur metric, we took into account the fact that the human visual system is generally more sensitive to blurring in edge regions. Since different compression standards usually produce different compression artifacts, we classified images into two classes using the proposed blocking metric: one class that contained blocking artifacts and another class that did not. Then, we used a different quality metric for each class. Experimental results show that each metric correlated well with subjective ratings, and the proposed NR image quality metric consistently provided good performance with various types of content and distortions.
I. Introduction
Recently, there has been considerable interest in developing image quality metrics that predict perceptual image quality. These metrics have been useful in various applications, such as image compression, restoration, and enhancement. The most reliable way of evaluating the perceptual quality of pictures is by using subjective scores given by evaluators. In order to obtain a subjective quality metric, a number of evaluators and controlled test conditions are required. However, these subjective tests are expensive and time-consuming. Consequently, subjective metrics may not always apply. As a result, many efforts have been made to develop objective quality metrics that can be used for real-world applications.
The most commonly used objective image quality metric is the peak signal to noise ratio (PSNR). However, PSNR does not correlate well with human perception in some cases. Recently, a number of other objective quality metrics have been developed that consider the human visual system (HVS). The Sarnoff model computed errors when distortions exceeded a visibility threshold. The structural similarity index (SSIM) compares local patterns of pixel intensities normalized for luminance and contrast. One drawback of these metrics is that they require the original image as a reference.
Since human observers do not require original images to assess the quality of degraded images, efforts have been made to develop no-reference (NR) metrics that also do not require original images. Several NR methods have been proposed [3–15]. These NR methods mainly measure blocking and blurring artifacts. Blocking artifacts have been observed in block-based DCT compressed images (e.g., JPEG- and MPEG-coded images). Wu et al. proposed a blocking metric, the generalized block impairment metric (GBIM), which employed a texture and luminance masking method to weight a blocking feature. In [7, 8], blocking metrics were developed to measure the blockiness between adjacent block edge boundaries. However, these methods do not consider that the visibility of blocking artifacts changes with the background luminance level. In another method, the blocking artifacts were detected and evaluated using blocky signal power and activities in the DCT domain. In a further method, the blocking metric was modeled by three features: average differences around the block boundary, signal activities, and zero-crossing rates. In general, this metric requires a training process to integrate the three features.
The blur metric is useful for blurred images. For example, JPEG2000 based on a wavelet transform may produce blurring artifacts. Several NR blur metrics have been proposed to measure smoothing or smearing effects on sharp edges [9–13]. Also, a blur radius estimated using a Gaussian blur kernel has been proposed to measure blurring artifacts [14, 15].
However, most NR image quality metrics were designed to measure one specific distortion. As a result, they may produce unsatisfactory performance in certain cases. In other words, NR blocking metrics cannot guarantee satisfactory performance for JPEG2000-compressed images and Gaussian-blurred images, while NR blur metrics cannot guarantee good performance for JPEG-compressed images. Since the HVS can assess image quality regardless of the distortion type, ideal NR quality metrics should also be able to measure all such distortions. However, this is a difficult task since NR quality metrics have no access to original images, and we have a limited understanding of the HVS.
Recently, researchers have tried to combine blur and blocking metrics to compute NR image quality metrics [16, 17]. Horita et al. introduced an integrated NR image quality metric that they used for JPEG- and JPEG2000-compressed images. The researchers used an automatic discrimination method for compressed images, which produced good results for JPEG- and JPEG2000-compressed images. However, the HVS characteristics were not considered in the decision process. Jeong et al. proposed an NR image quality metric that first computed the blur and blocking metrics and then combined them by global optimization.
In this article, we propose a new NR blocking metric and a new NR blur metric based on human visual sensitivity, and we also propose an NR metric based on image classification. The proposed blocking metric was obtained by computing the pixel differences across the block boundaries. These differences were counted only when they exceeded the visibility threshold, which was based on the background luminance level. The proposed blur metric was computed by estimating the blur radius in the edge regions. Images were classified based on the proposed blocking metric. Then, the blocking metric or the blur metric was used for each class. In the experiments, the proposed NR blocking metric, NR blur metric, and NR image quality metric based on image classification were evaluated using three image sets (i.e., JPEG-compressed, JPEG2000-compressed, and Gaussian-blurred images). In Sect. II, the proposed blocking and blur metrics are explained, and then the image quality metric based on image classification is presented. Experimental results are presented in Sect. III. Conclusions are given in Sect. IV.
II. The proposed no-reference image quality metric
A. NR blocking metric calculation
Safranek showed that the visibility threshold needs to be changed based on the background luminance. In other words, the visibility threshold may differ depending on the background luminance level. For example, if the background luminance level is low, the visibility threshold generally has a relatively large value. For medium luminance levels, the visibility threshold is generally small. This property was used when computing the proposed blocking metric. The proposed blocking metric was computed using the following two steps:
Step 1. We computed a horizontal blocking feature (BLKH) and a vertical blocking feature (BLKV) using a visibility threshold of block boundaries.
Step 2. We combined BLKH and BLKV.
In order to measure the horizontal blockiness (vertical edge artifacts), we defined the absolute horizontal difference as follows (Figure 1):
Chou et al. defined the visibility threshold value, Φ(⋅), as follows:
where s represents the background luminance intensity, T0 = 17, γ = 3/128, and $L = 2^{bit-1} - 1$.
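The threshold equation itself did not survive in this copy of the article. The sketch below assumes the piecewise form usually attributed to Chou and Li: Φ(s) = T0(1 − sqrt(s/L)) + 3 for s ≤ L and Φ(s) = γ(s − L) + 3 otherwise, with L = 127 for 8-bit images. The function name and the `bit_depth` parameter are illustrative, not taken from the article.

```python
import math

T0 = 17.0          # threshold offset at zero background luminance (from the text)
GAMMA = 3.0 / 128  # slope for bright backgrounds (from the text)


def visibility_threshold(s, bit_depth=8):
    """Visibility threshold for background luminance s (assumed piecewise form)."""
    L = 2 ** (bit_depth - 1) - 1  # 127 for 8-bit images (assumed reading of L)
    if s <= L:
        # Dark to medium backgrounds: threshold falls from T0 + 3 down to 3.
        return T0 * (1.0 - math.sqrt(s / L)) + 3.0
    # Bright backgrounds: threshold rises linearly from 3.
    return GAMMA * (s - L) + 3.0
```

Under this assumed form the threshold is largest for dark backgrounds and smallest at medium luminance, which agrees with the qualitative behaviour described in the text.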
In this article, min(AvgL, AvgR) was used as the background luminance value around the block boundary, and the horizontal blockiness was only measured when the absolute horizontal difference exceeded the visibility threshold as follows:
where NDh(x) represents the sum of noticeable horizontal blockiness at x and u(⋅) represents the unit step function. By repeating the procedure for an entire frame, the frame horizontal blockiness was computed as follows:
Although we assumed that the distance between adjacent blocking boundaries was a multiple of 8, one can use other values if the transform block size is different. Also, if the video is spatially shifted, one can determine the blocking boundaries by searching for the locations that give local maxima of NDh(x).
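The procedure above can be sketched as follows. This is an illustration under stated assumptions rather than the article's exact equations, which are missing from this copy: AvgL and AvgR are taken as the means of `w` pixels on each side of a candidate boundary (the width `w` is hypothetical), the threshold uses the assumed Chou and Li form for 8-bit images, and the boundary offset is chosen as the one that gathers the largest total NDh(x).

```python
import math


def visibility_threshold(s, t0=17.0, gamma=3.0 / 128, L=127):
    # Assumed piecewise threshold for 8-bit luminance.
    if s <= L:
        return t0 * (1.0 - math.sqrt(s / L)) + 3.0
    return gamma * (s - L) + 3.0


def noticeable_diff_profile(img, w=2):
    """Return ND_h(x) for every column x that has w pixels on both sides.

    img is a list of rows of grey levels; the side width w is hypothetical.
    """
    width = len(img[0])
    nd = [0.0] * width
    for row in img:
        for x in range(w, width - w + 1):
            avg_l = sum(row[x - w:x]) / w      # mean just left of the boundary
            avg_r = sum(row[x:x + w]) / w      # mean just right of the boundary
            d = abs(avg_l - avg_r)
            # Unit step: count the difference only when it is visible.
            if d > visibility_threshold(min(avg_l, avg_r)):
                nd[x] += d
    return nd


def best_block_offset(nd, block=8):
    """Pick the column offset (0..block-1) whose boundaries gather most ND_h."""
    totals = [sum(nd[k::block]) for k in range(block)]
    return max(range(block), key=totals.__getitem__)
```

For an unshifted image with 8 × 8 blocks the selected offset would be 0; a non-zero result suggests that the block grid has been shifted by that many columns.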
One problem with the frame horizontal blockiness value (BNDh) is that it may be large even though there is no blocking artifact if the video has many vertical patterns. To address this problem, we also computed the column differences (EBDh) of pixels between the blocking boundaries and used them to normalize the BNDh value. We computed the average column difference value EBDh as follows:
The horizontal blocking feature, BLKH, was computed as follows:
The vertical blocking feature BLKV was computed similarly. The final blocking metric F_BLK was computed as a linear summation of the horizontal blocking feature and the vertical blocking feature, with weights α and β:

$$F_{BLK} = \alpha \cdot BLKH + \beta \cdot BLKV$$
It has been reported that the visual sensitivities to horizontal and vertical blocking artifacts are similar. Therefore, α and β were set to 0.5 in this article.
B. NR blur metric calculation
The proposed NR blur metric was motivated by a previously proposed Gaussian blur radius estimator, which estimates an unknown Gaussian blur radius using two re-blurred versions of the entire image. However, blurring artifacts are not always visible in flat (homogeneous) regions. They are mostly recognizable in edge areas. Based on this observation, we divided the images into a number of blocks and classified each block as a flat or edge block. Then, we computed the blur radius only for the edge blocks. In this article, we used a block size of 8 × 8. The variance was computed at each pixel position (x, y) as follows:
where v(x, y) represents the variance value at (x, y), M represents the width of the window, N represents the height of the window, and E represents the mean of the window. In this article, M and N were set to 3. In other words, the window size was 3 × 3. Then, we classified each pixel using the following equations:
In this article, the th1 value was empirically set to 400. Then, we classified the 8 × 8 blocks based on the pixel classification results. If there was at least one edge pixel in a block, the block was classified as an edge block. Otherwise, the block was classified as a flat block. Figure 2 shows the classification results of the Lena image. In Figure 2, the black blocks represent flat blocks and the white blocks represent edge blocks.
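The pixel and block classification can be sketched as below. The variance and classification equations are missing from this copy, so the code assumes a population variance over the 3 × 3 window and labels a pixel as an edge pixel when its variance exceeds th1. The border clamping and the function names are illustrative choices, not taken from the article.

```python
def local_variance(img, x, y, m=3, n=3):
    """Variance of the m x n window centred at (x, y); borders are clamped (assumption)."""
    h, w = len(img), len(img[0])
    vals = []
    for j in range(-(n // 2), n // 2 + 1):
        for i in range(-(m // 2), m // 2 + 1):
            yy = min(max(y + j, 0), h - 1)
            xx = min(max(x + i, 0), w - 1)
            vals.append(img[yy][xx])
    mean = sum(vals) / len(vals)
    return sum((v - mean) ** 2 for v in vals) / len(vals)


def classify_blocks(img, th1=400.0, block=8):
    """Return {(block_col, block_row): 'edge' or 'flat'} for every full block."""
    h, w = len(img), len(img[0])
    labels = {}
    for by in range(0, h - block + 1, block):
        for bx in range(0, w - block + 1, block):
            # A block is an edge block if at least one of its pixels is an edge pixel.
            is_edge = any(
                local_variance(img, x, y) > th1
                for y in range(by, by + block)
                for x in range(bx, bx + block)
            )
            labels[(bx // block, by // block)] = "edge" if is_edge else "flat"
    return labels
```

Only the blocks labelled "edge" would then be passed on to the blur radius estimation.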
The proposed blur metric was obtained by estimating the blur radii for the edge blocks (Be). The blur radius was estimated using the procedure of the re-blurring estimator, where an edge e(x) was modeled as a step function:
where A and B are the constant values, and they do not influence the blur radius estimation.
When the edge was blurred with an unknown Gaussian blur radius σ, the blurred edge was modeled as follows:
where g(n, σ) represents a normalized Gaussian kernel.
To estimate the unknown blur radius σ, two re-blurred edges, b_a(x) and b_b(x), were obtained with the blur radii σ_a and σ_b (σ_a < σ_b). Then, the difference r(x) was calculated as follows:
As in the re-blurring estimator, the blur radius σ was then estimated from r(x). In this article, σ_a was empirically set to 1, and σ_b was set to 4. The blur radius σ was calculated only for the edge blocks. Finally, the proposed blur metric F_BLR was obtained as follows:
where σ_i represents the blur radius of the ith block and N_B represents the total number of edge blocks.
When there were no edge blocks, N_B was zero. This means that the entire image was highly blurred. Therefore, in this case, F_BLR was set to 1.
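The re-blurring idea can be illustrated on a one-dimensional edge profile. The expressions for r(x) and for σ are missing from this copy, so the sketch assumes the form commonly attributed to Hu and de Haan's estimator (listed in the references): r(x) = (b(x) − b_a(x)) / (b_a(x) − b_b(x)) and σ ≈ σ_a σ_b / ((σ_b − σ_a) r_max + σ_b). The kernel truncation, border handling, and all function names are illustrative assumptions.

```python
import math


def gaussian_kernel(sigma):
    """Normalised, truncated Gaussian kernel g(n, sigma)."""
    radius = max(1, int(4 * sigma + 0.5))
    weights = [math.exp(-(n * n) / (2.0 * sigma * sigma))
               for n in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur(signal, sigma):
    """Convolve a 1-D signal with g(n, sigma), replicating the border samples."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    last = len(signal) - 1
    out = []
    for x in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = min(max(x + k - radius, 0), last)
            acc += w * signal[idx]
        out.append(acc)
    return out


def sigma_from_ratio(r_max, sigma_a=1.0, sigma_b=4.0):
    """Closed-form radius estimate (assumed approximation, see lead-in)."""
    return (sigma_a * sigma_b) / ((sigma_b - sigma_a) * r_max + sigma_b)


def estimate_blur_radius(b, sigma_a=1.0, sigma_b=4.0, eps=1e-9):
    """Estimate the blur radius of a blurred 1-D edge profile b(x)."""
    b_a = blur(b, sigma_a)  # first re-blurred edge
    b_b = blur(b, sigma_b)  # second, more strongly re-blurred edge
    ratios = [
        (b[x] - b_a[x]) / (b_a[x] - b_b[x])
        for x in range(len(b))
        if abs(b_a[x] - b_b[x]) > eps  # skip flat samples (0/0)
    ]
    if not ratios:
        return None  # flat profile: no edge to measure
    return sigma_from_ratio(max(ratios), sigma_a, sigma_b)
```

With σ_a = 1 this assumed closed form never exceeds 1, because a non-negative r_max keeps the denominator at or above σ_b. That would be consistent with setting F_BLR to 1 for an image without edge blocks, although the article's exact definition of F_BLR is not recoverable here.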
C. NR quality metric based on image classification
Jeong et al. proposed an NR image quality metric that can be used for images with both blocking and blurring artifacts. They optimized the weights for the blocking and blur metrics to compute the NR image quality metric as follows:
where QNR represents the NR image quality metric, v1 and v2 represent the weights, BlockingM represents the blocking metric, and BlurM represents the blur metric.
JPEG and JPEG2000 images, however, show different compression characteristics. JPEG images may exhibit both blocking and blurring artifacts, while JPEG2000 images mainly exhibit blurring artifacts. Since compressed images show different artifacts depending on the employed compression standard, global optimization may not produce the best performance. To address this problem, we first classified the images into two classes: one with blocking artifacts (JPEG images) and the other without blocking artifacts (e.g., high-quality JPEG images and JPEG2000 images). Then, to compute the proposed NR quality metric, the blocking metric was used for images containing blocking artifacts, and the blur metric was used for those containing no blocking artifacts. The proposed blocking metric was used as the decision criterion, and the proposed NR image quality metric was computed as follows:
The weights in Equation 15 were determined by minimizing the squared errors between the subjective scores and the NR metrics. To compute them, images were first classified into two groups by the blocking score. The weights applied to the blocking metric were computed from the sample images that contained blocking artifacts, and the weights applied to the blur metric were computed from the sample images that had no blocking artifacts. After the weights were determined, the image quality metric was computed for each case. A block diagram of the proposed NR image quality metric is illustrated in Figure 3. Although one may use the blocking metric along with the blur metric for images classified as having no blocking artifacts, we found that doing so did not improve the performance. Similarly, using the blur metric along with the blocking metric for images classified as having blocking artifacts did not improve the performance.
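The classification-based metric can be sketched as two separate least-squares fits. Equation 15 is missing from this copy, so the sketch assumes a linear model with an intercept in each class: an image with F_BLK at or above the threshold th is scored from F_BLK, and any other image is scored from F_BLR. The function names, the intercept terms, and the use of "at or above" as the decision rule are assumptions.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for ys ~ a * xs + b.

    Needs at least two samples with distinct x values.
    """
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx


def fit_classified_metric(f_blk, f_blr, mos, th):
    """Fit one line per class, splitting the samples by the blocking score."""
    blocky = [i for i, v in enumerate(f_blk) if v >= th]
    smooth = [i for i, v in enumerate(f_blk) if v < th]
    w_blk = fit_line([f_blk[i] for i in blocky], [mos[i] for i in blocky])
    w_blr = fit_line([f_blr[i] for i in smooth], [mos[i] for i in smooth])
    return w_blk, w_blr


def predict_quality(f_blk, f_blr, weights, th):
    """Score one image with the metric that matches its class."""
    (a1, b1), (a2, b2) = weights
    if f_blk >= th:
        return a1 * f_blk + b1   # image with blocking artifacts
    return a2 * f_blr + b2       # image without blocking artifacts
```

A call such as `fit_classified_metric(f_blk, f_blr, mos, th=0.5)` would use training lists of equal length; the threshold value 0.5 is only a placeholder.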
III. Experimental results
A. Image Quality Databases and Performance evaluation criteria
Several image quality databases (LIVE, IVC, and TID2008) are publicly available. In the LIVE database, 29 source images were used to create 779 impaired images using JPEG compression, JPEG2000 compression, Gaussian blur, white noise, and fast fading. The LIVE database provides subjective quality scores in terms of the difference mean opinion score (DMOS). The IVC database contains JPEG- and JPEG2000-compressed images and also provides images with artifacts caused by blurring and locally adaptive resolution (LAR) coding. The subjective quality scores are given in terms of the mean opinion score (MOS). The TID2008 database has 25 source images and 1700 impaired images (25 source images × 17 types of distortions × 4 levels of distortions). The TID2008 database also gives the subjective scores in terms of MOS.
In general, an NR quality metric is evaluated by comparing the subjective MOS and the objective values. Since the IVC database contained a small number of JPEG2000 images, we used the TID2008 database as a test database. To evaluate the proposed NR image quality metric, three image sets (JPEG-compressed, JPEG2000-compressed, and Gaussian-blurred images) were selected from the TID2008 database.
Pearson correlation coefficients were used for performance evaluation. These correlation coefficients were computed after a third-order polynomial mapping was applied to account for the nonlinear relationship between the objective quality metrics and the MOS scores.
where β1, β2, β3, and β4 represent the mapping parameters, Metric represents the objective quality metric, and MOSp represents the predicted MOS.
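The evaluation step can be sketched as a cubic least-squares fit followed by a Pearson correlation. The mapping equation is missing from this copy, so the sketch assumes MOSp = β1 + β2·Metric + β3·Metric² + β4·Metric³; the order in which the β parameters are attached to the powers is an assumption, and the helper names are illustrative.

```python
import math


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_cubic(metric, mos):
    """Least-squares beta1..beta4 via the normal equations."""
    powers = [[v ** k for k in range(4)] for v in metric]
    ata = [[sum(p[i] * p[j] for p in powers) for j in range(4)] for i in range(4)]
    atb = [sum(p[i] * y for p, y in zip(powers, mos)) for i in range(4)]
    return solve(ata, atb)


def predict_mos(betas, value):
    """Predicted MOS for one objective metric value."""
    return sum(b * value ** k for k, b in enumerate(betas))


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)
```

The reported figure would then correspond to `pearson([predict_mos(betas, m) for m in metric], mos)`, with `betas` obtained from `fit_cubic(metric, mos)`.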
B. Performance of the proposed NR blocking metric
To evaluate the proposed NR blocking metric, we used the JPEG images of the TID2008 database and compared the proposed metric with some existing blocking metrics in the literature [3, 6, 17]. Table 1 shows the Pearson correlation coefficients between the subjective scores (MOS) and the objective scores. All the metrics showed good performance except for Jeong's method. The proposed metric showed performance statistically equivalent to that of Wang's blocking metric and was found to be better than GBIM. As seen in Figure 4, the predicted MOSs (MOSp) of the proposed NR blocking metric correlated well with the subjective scores (MOS).
C. Performance of the proposed NR blur metric
The proposed NR blur metric was compared with some existing NR blur metrics [12, 13, 17] using the JPEG2000 images and the Gaussian-blurred images of the TID2008 database. The performance of each metric is shown in Tables 2 and 3. It has been reported that Ferzli's method produced good prediction performance with both image sets (JPEG2000 and Gaussian-blurred images) of the LIVE database. However, Ferzli's method did not show satisfactory performance for the Gaussian-blurred images of the TID2008 database. This result may have been caused by the fact that the test design of the TID2008 database is different from that of the LIVE database. The proposed blur metric showed the best performance for the JPEG2000 images and slightly lower performance than Marziliano's algorithm for the Gaussian-blurred images. This result shows that the proposed NR blur metric accurately estimated the blurring artifacts for both the JPEG2000 images and the Gaussian-blurred images. Figures 5 and 6 show the scatter plots for the JPEG2000 and Gaussian-blurred images. The proposed blur metric correlated well with the subjective scores for both image sets.
D. Performance of the proposed NR image quality metric based on image classification
To evaluate the performance of the proposed NR image quality metric based on image classification, three image sets (JPEG, JPEG2000, and Gaussian-blurred images of the TID2008 database) were combined into one set. We first combined a blocking metric and a blur metric by global optimization as shown in Equation 14. The blocking metric was either one of the existing blocking metrics or the proposed blocking metric. The blur metric was either one of the existing blur metrics or the proposed blur metric. Table 4 shows the NR image quality metrics obtained as a linear combination of some blocking and blur metrics (global optimization). Clearly, the linear combination of the proposed blocking and blur metrics showed the best performance.
Next, we computed the NR image quality metric based on image classification. There was one parameter, the threshold value (th) in Equation 15. Table 5 shows the Pearson correlation coefficients of the NR image quality metrics based on image classification as a function of the threshold value (th). As seen in Table 5, when a blocking metric and a blur metric were combined, noticeably improved performance was achieved. On the other hand, different combinations reached their optimal performance at different threshold values. Although these combinations of blocking and blur metrics show good results, the NR image quality metric using the proposed NR blur and blocking metrics showed the best performance. Furthermore, as seen in Tables 4 and 6, employing image classification significantly improved performance. Figure 7 shows some sample images that were degraded by JPEG compression, JPEG2000 compression, and Gaussian blur. The MOSs predicted by the proposed NR image quality metric correlate well with the subjective scores.
Table 7 shows how the three image sets (JPEG, JPEG2000, and Gaussian-blurred images of the TID2008 database) were classified. For the JPEG database, 14% of the images were classified as images without blocking artifacts. In the JPEG2000 database, 4% of the images were classified as images with blocking artifacts, and in the Gaussian-blurred database, 2% were classified as images with blocking artifacts. Table 8 shows the performance of the proposed NR metric based on image classification for each of the three image sets. The proposed NR metric based on image classification showed consistently good performance for different impairment types. For the JPEG and JPEG2000 databases, the performance of the proposed NR metric based on image classification was identical to that of the proposed NR blocking metric or the proposed NR blur metric. For the Gaussian-blurred database, the proposed NR metric based on image classification performed better than the other NR blur metrics.
IV. Conclusions
In this article, we proposed a new NR image quality metric based on image classification. The NR blocking metric was obtained by computing noticeable horizontal and vertical distortions across block boundaries. The NR blur metric was computed by estimating the blur radii in the edge regions. To develop the new NR image quality metric, images were first classified into two classes: one that contained blocking artifacts, and the other that contained no blocking artifacts. Then, the different quality metrics were used to measure the image quality. The experimental results show that the proposed NR blocking and blur metrics correlated highly with the subjective scores and the proposed NR metric based on image classification showed consistently good performance.
Abbreviations
- BLKH: horizontal blocking feature
- BLKV: vertical blocking feature
- DMOS: difference mean opinion score
- GBIM: generalized block impairment metric
- HVS: human visual system
- LAR: locally adaptive resolution
- MOS: mean opinion score
- PSNR: peak signal to noise ratio
- SSIM: structural similarity index
References
[1] Lubin J, Fibush D: Sarnoff JND Vision Model. T1A1.5 Working Group Document No. 97-612, ANSI T1 Standards Committee, 1997.
[2] Wang Z, Bovik AC, Sheikh HR, Simoncelli EP: Image quality assessment: from error visibility to structural similarity. IEEE Trans Image Process 2004, 13(4):600-612. 10.1109/TIP.2003.819861
[3] Wu HR, Yuen M: A generalized block-edge impairment metric for video coding. IEEE Signal Process Lett 1997, 4(11):317-320. 10.1109/97.641398
[4] Liu S, Bovik AC: Efficient DCT-domain blind measurement and reduction of blocking artifacts. IEEE Trans Circuits Syst Video Technol 2002, 12(12):1139-1149. 10.1109/TCSVT.2002.806819
[5] Wang Z, Bovik AC, Evans BL: Blind Measurement of Blocking Artifacts in Images. Proc IEEE Int Conf Image Processing 2000, 3:981-984.
[6] Wang Z, Sheikh HR, Bovik AC: No-Reference Perceptual Quality Assessment of JPEG Compressed Images. Proc IEEE Int Conf Image Processing 2002, 477-480.
[7] Venkatesh Babu R, Bopardikar AS, Perkis A, Hillestad OI: No-reference Metrics for Video Streaming Applications. Proc of International Packet Video Workshop, December 2004.
[8] Suthaharan S: No-reference visually significant blocking artifact metric for natural scene images. Signal Process 2009, 89(8):1647-1652. 10.1016/j.sigpro.2009.02.007
[9] Li X: Blind image quality assessment. Proc IEEE Int Conf Image Processing 2002.
[10] Marziliano P, Dufaux F, Winkler S, Ebrahimi T: A No-Reference Perceptual Blur Metric. Proc IEEE Int Conf Image Processing 2002, 3:57-60.
[11] Ong E, Lin W, Lu Z, Yang X, Yao S, Pan F, Jiang L, Moschetti F: A No-Reference Quality Metric for Measuring Image Blur. Proc IEEE Int Conf Image Processing 2003, 469-472.
[12] Marziliano P, Dufaux F, Winkler S, Ebrahimi T: Perceptual blur and ringing metrics: application to JPEG2000. Signal Process Image Commun 2004, 19(2):163-172.
[13] Ferzli R, Karam LJ: A no-reference objective image sharpness metric based on the notion of just noticeable blur (JNB). IEEE Trans Image Process 2009, 18(4):717-728.
[14] Elder JH, Zucker SW: Local scale control for edge detection and blur estimation. IEEE Trans Pattern Anal Mach Intell 1998, 20(7):699-716. 10.1109/34.689301
[15] Hu H, de Haan G: Low Cost Robust Blur Estimator. Proc IEEE Int Conf Image Processing 2006, 617-620.
[16] Horita Y, Arata S, Murai T: No-reference image quality assessment for JPEG/JPEG2000 coding. Proc EUSIPCO 2004, 1301-1304.
[17] Jeong T, Kim Y, Lee C: No-reference image quality metric based on blur radius and visual blockiness. Opt Eng 2010, 49(4):045001. 10.1117/1.3366671
[18] Safranek RJ, Johnston JD: A perceptually tuned subband image coder with image dependent quantization and post-quantization data compression. Proc IEEE Int Conf Acoust, Speech, Signal Processing 1989, 3:1945-1948.
[19] Chou CH, Li YC: A perceptually tuned subband image coder based on the measure of just-noticeable-distortion profile. IEEE Trans Circuits Syst Video Technol 1995, 5(6):467-476. 10.1109/76.475889
[20] Karunasekera SA, Kingsbury NG: A distortion measure for blocking artifacts in images based on human visual sensitivity. IEEE Trans Image Process 1995, 4(6):713-724. 10.1109/83.388074
[21] Ebrahimi F, Chamik M, Winkler S: JPEG vs. JPEG2000: an objective comparison of image encoding quality. App Dig Image Proc XXVII 2004, 5558:300-308.
[22] Sheikh HR, Wang Z, Cormack L, Bovik AC: LIVE image quality assessment database. 2003. [http://live.ece.utexas.edu/research/quality]
[23] Le Callet P, Autrusseau F: Subjective quality assessment irccyn/ivc database. [http://www.irccyn.ec-nantes.fr/ivcdb/2005]
[24] Ponomarenko N, Carli M, Lukin V, Egiazarian K, Astola J, Battisti F: Color Image Database for Evaluation of Image Quality Metrics. Proc Int Workshop on Multi-media Signal Processing 2008, 403-408.
[25] VQEG: Final Report from the Video Quality Experts Group on the Validation of Objective Models of Video Quality Assessment. 2003.
This study was supported by the IT R&D program of MKE/KCC/KEIT.