Real-time video quality monitoring
© Liu et al; licensee Springer. 2011
Received: 6 June 2011
Accepted: 6 December 2011
Published: 6 December 2011
The ITU-T Recommendation G.1070 is a standardized opinion model for video telephony applications that uses video bitrate, frame rate, and packet-loss rate to measure video quality. However, this model was originally designed as an offline quality planning tool. It cannot be directly used for quality monitoring, since the three input parameters above are not readily available within a network or at the decoder. Moreover, there is considerable room for improving the performance of this quality metric. In this article, we present a real-time video quality monitoring solution based on this Recommendation. We first propose a scheme to efficiently estimate the three parameters from video bitstreams, so that the model can be used as a real-time video quality monitoring tool. Furthermore, we propose an enhanced algorithm based on the G.1070 model that provides more accurate quality prediction. Finally, to demonstrate this metric in real-world applications, we present an emerging application of real-time quality measurement to the management of transmitted videos, especially those delivered to mobile devices.
With the increase in the volume of video content processed and transmitted over communication networks, the variety of video applications and services has also been steadily growing. These include more mature services such as broadcast television, pay-per-view, and video on demand, as well as newer models for delivery of video over the internet to computers and over telephone systems to mobile devices such as smart phones. Niche markets for very high quality video for telepresence are emerging as are more moderate quality channels for video conferencing. Hence, an accurate, and in many cases real-time, assessment of the video quality is becoming increasingly important.
The most commonly used methods for assessing visual quality are designed to predict subjective quality ratings on a set of training data. Many of these methods rely on access to an original, undistorted version of the video under test. There has been significant progress in the development of such tools. However, they are not directly useful for many of the new video applications and services in which the quality of a target video must be assessed without access to a reference. For these cases, no-reference (NR) models are more appropriate. Development of NR visual quality metrics is a challenging research problem, partially because the artifacts introduced by different transmission components can have dramatically different visual impacts, and the perceived quality can depend largely on the underlying video content. Therefore, a "divide-and-conquer" approach is often adopted: different models are designed to detect and measure specific artifacts or impairments. Among the various forms of artifacts, the most commonly studied are spatial coding artifacts, e.g. blurriness [3–5] and blockiness [6–9], temporally induced artifacts [10–12], and packet-loss-related artifacts [13–18]. In addition to the models developed for specific distortions, there are investigations into generic quality measurement that can predict the quality of video affected by multiple distortions. Recently, there have been numerous efforts to develop QoS-based video quality metrics, which can be easily deployed in network environments. The International Telecommunication Union (ITU) and the Video Quality Experts Group (VQEG) proposed the concepts of non-intrusive parametric and bitstream quality modeling, P.NAMS and P.NBAMS. Based on an investigation of the relationship between video quality and bitrate and quantization parameter (QP), Yang et al.
proposed a quality metric that considers various bitstream-domain features, such as bit rate, QP, packet loss and error propagation, temporal effects, and picture type. Among others, the multimedia quality model standardized by ITU-T in its Recommendation G.1070 in 2007 is a widely used NR quality measure.
In ITU-T Recommendation G.1070, a framework for assessing multimedia quality is proposed. It consists of three models: a video quality estimation model, a speech quality estimation model, and a multimedia quality integration model. The video quality estimation model (which we will loosely refer to as the G.1070 model in this article) uses the bit rate (bits per second) and frame rate (frames per second) of the compressed video, along with the expected packet-loss rate (PLR) of the channel, to predict the perceived video quality subject to compression artifacts and transmission error artifacts. Details of the G.1070 models, including equations, can be found in the Recommendation. Since its standardization, the G.1070 model has been widely used, studied, extended, and enhanced. Yamagishi and Hayashi proposed to use G.1070 in the context of IPTV quality. Since the G.1070 model is codec dependent, Belmudez and Moller extended the model, originally trained for H.264 and MPEG-4 video, to MPEG-2 content. Joskowicz and Ardao enhanced G.1070 with both resolution- and content-adaptive parameters.
In this article, we showcase how this technology can be used in a real-world video quality monitoring application. To accomplish this, there are several technical challenges to overcome. First of all, G.1070 was originally designed for network planning purposes, and it cannot be readily used within a network or at a video player for the purpose of real-time video quality monitoring. This is because the three inputs to the G.1070 model, i.e. bitrate, frame rate, and PLR of the encoded video bitstream, are not immediately available, and hence they need to be estimated from the bitstream. However, the estimation of these parameters is not straightforward. In this article, we propose efficient estimation methods that allow G.1070 to be extended from a planning tool to a real-time video quality monitoring tool. Specifically, we describe methods for real-time estimation of these three quality-related parameters in a typical video streaming environment.
Second, although the G.1070 model is generally suitable for estimating the quality of video conferencing content, where head-and-shoulder videos dominate, it is observed that its ability to account for the impact of content characteristics on video quality is limited. This is because the video compression performance is largely content dependent. For example, a video scene with a complex background and a high level of motion, and another scene with relatively less activity or texture, may have dramatically different perceived qualities even if they are encoded at the same bitrate and frame rate. To address this issue, we propose an enhancement to the G.1070 model wherein the encoding bitrate is normalized by a video complexity factor to compensate for the impact of content complexity on video encoding. The resulting normalized bitrate better reflects the perceptual quality of the video.
Based on the above contributions, this article also proposes a design for a real-time video quality monitoring system that can be used to solve real-world quality management problems. The ability to remotely monitor, in real time, the quality of transmitted content (particularly content delivered to mobile devices) enables the right decisions to be made at the transmission end (e.g. increasing the encoding bitrate or frame rate) in order to improve the quality of the subsequently transmitted content.
This article is organized as follows. In Section 2, the G.1070 video quality model is first introduced as a video quality planning tool, and then a scheme is proposed to extend it for video quality monitoring by estimating the three parameters, i.e. bitrate, frame rate, and PLR, from video bitstreams. In Section 3, we further propose an improved version of the G.1070 model to more accurately predict the quality of videos with different content characteristics. Experimental results demonstrating the proposed improvements are shown in Section 4. Using the proposed video quality monitoring tools, we present an emerging video application to measure and manage the quality of videos delivered to mobile phones in Section 5. Finally, Section 6 concludes this article.
2 Extension of G.1070 to video quality monitoring
In this section, G.1070 is first introduced as a planning tool. Then, we propose estimation methods for bitrate, frame rate, and PLR, which allow G.1070 to be extended from a planning tool to a real-time video quality monitoring tool. Specifically, we describe methods for real-time estimation of the bitrate, frame rate, and PLR of an encoded video bitstream in a typical video streaming environment, and discuss some of the practical issues therein. Based on simulation results, we also analyze the performance of the proposed parameter estimation methods.
2.1 Introduction of G.1070 as a planning tool
where Vq is the video quality score, in the range from 1 to 5 (5 represents the highest quality). Brv, Frv, and Pplv represent the bit rate, frame rate, and PLR, respectively. Icoding represents the basic quality of video compression, which is then degraded by packet losses through a function of the PLR and the packet-loss robustness factor DPplV. The model assumes that, for a given bitrate, there is an optimal quality IOfr that can be achieved; the frame rate at which this optimal quality is attained is denoted Ofr. DFrV expresses the robustness of quality to changes in frame rate.
v1, v2, ..., and v12 are the 12 constants to be determined. These parameters are codec-, implementation-, and resolution-dependent. Although the G.1070 Recommendation provides parameter sets for H.264 and MPEG-4 video at a few resolutions, the values of these parameters for other codecs and resolutions need to be determined. Refer to the Recommendation for a more detailed interpretation of this model.
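For concreteness, the functional form of the model described above can be sketched in code. The equations below are our paraphrase of the Recommendation and should be checked against the standard before use; the coefficient values shown are dummy placeholders for illustration, not the trained values the Recommendation provides.

```python
import math

def g1070_video_quality(br_v, fr_v, ppl_v, v):
    """Sketch of the G.1070 video quality model (paraphrased, not normative).

    br_v: bit rate (kbps), fr_v: frame rate (fps), ppl_v: packet-loss rate (%).
    v: dict of the 12 codec- and resolution-dependent coefficients v1..v12.
    """
    # Frame rate at which quality is maximal for this bit rate (clipped to [1, 30]).
    o_fr = min(max(v["v1"] + v["v2"] * br_v, 1.0), 30.0)
    # Best achievable coding quality I_Ofr at this bit rate.
    i_ofr = v["v3"] - v["v3"] / (1.0 + (br_v / v["v4"]) ** v["v5"])
    # Robustness of quality to deviations from the optimal frame rate.
    d_frv = v["v6"] + v["v7"] * br_v
    # Coding quality: Gaussian fall-off (in log frame rate) around O_fr.
    i_coding = i_ofr * math.exp(-((math.log(fr_v) - math.log(o_fr)) ** 2)
                                / (2.0 * d_frv ** 2))
    # Packet-loss robustness factor D_PplV.
    d_pplv = (v["v10"] + v["v11"] * math.exp(-fr_v / v["v8"])
              + v["v12"] * math.exp(-br_v / v["v9"]))
    # Final score on the 1..5 MOS scale.
    return 1.0 + i_coding * math.exp(-ppl_v / d_pplv)

# Dummy coefficients for illustration only (real values must be trained
# per codec and resolution as described in the Recommendation).
V = dict(v1=2.0, v2=0.01, v3=4.0, v4=200.0, v5=1.5, v6=0.01,
         v7=0.001, v8=10.0, v9=300.0, v10=1.0, v11=1.0, v12=1.0)
```

With these placeholder coefficients, the score stays on the 1 to 5 scale and decreases monotonically with increasing PLR, matching the model's intended behavior.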
The intended application of G.1070 is QoE/QoS planning: different quality scores can be predicted by inputting different ranges of the three video parameters. Based on this, QoE/QoS planners can choose proper sets of video parameters to deliver a satisfactory service. G.1070 has the advantage of being simple and lightweight, in addition to being an NR quality model. These features make it well suited to being extended into a video quality monitoring tool. However, in a monitoring application, the bit rate, frame rate, and PLR are usually not available to the network provider or end user; these input parameters to G.1070 need to be estimated from the received video bitstreams.
2.2 G.1070 extension to quality monitoring
2.2.1 Feature extractor
Outputs of the feature extractor (one set per packet):
- Time scale: the reference clock frequency of the transport format. For example, for the transport of video over RTP, the standard clock frequency is 90 kHz.
- Timestamp: the display time of the frame to which the packet belongs.
- Bit count: the number of bits in the packet.
- Coded unit type (codedUnitType): the type of data in the packet. For example, in the case of H.264, the coded unit type corresponds to the NAL-unit type.
- Sequence number: the sequence number of the input packet.
2.2.2 Feature integrator
Outputs of the feature integrator (one set per window):
- Time scale: same as described in Table 1.
- Time increment (timeIncrement): the time interval between two adjacent video frames in display order.
- Bits received (bitsReceivedCount): the number of video coding layer bits received over the N-frame window. Whether bits belong to the video coding layer is determined from the input codedUnitType; for example, in H.264, the SPS and PPS NAL-units do not belong to the video coding layer and hence are not included in the calculation.
- Packets received: the number of packets received over the N-frame window.
- Packets lost: the number of packets lost over the N-frame window. This can be determined by counting the discontinuities in the sequence number information.
- Packets per picture (packetsPerPicture): the number of video coding layer packets per picture.
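The packets-lost count described above can be derived from sequence-number discontinuities. A minimal sketch, assuming 16-bit RTP-style sequence numbers that wrap around (a detail not spelled out in the text) and ignoring packet reordering for simplicity:

```python
def count_lost_packets(seq_numbers, modulus=1 << 16):
    """Count lost packets from gaps in received sequence numbers.

    seq_numbers: sequence numbers in arrival order. A gap of k between
    consecutive received numbers means k - 1 packets were lost in between;
    wrap-around is handled modulo `modulus` (65536 for RTP).
    """
    lost = 0
    for prev, cur in zip(seq_numbers, seq_numbers[1:]):
        gap = (cur - prev) % modulus
        if gap > 1:
            lost += gap - 1
    return lost
```

For example, receiving sequence numbers 1, 2, 3, 5, 6, 9 implies that packets 4, 7, and 8 were lost.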
The estimates of timeIncrement, bitsReceivedCount, and packetsPerPicture are prone to error due to packet loss. Therefore, extra care is taken when calculating these estimates, including compensating for errors. The bitsReceivedCount is the basis for the bit rate calculation and may be underestimated when packets are lost; thus, some compensation is necessary during the bit rate calculation, which will be explained later. The estimation of timeIncrement and packetsPerPicture, on the other hand, is performed in a way that is robust to packet loss, as explained below.
The estimation of the timeIncrement between the frames in display order is complicated by the fact that almost all state-of-the-art encoding standards use a highly predictive structure. Because of this, the coding order is not the same as the display order and hence the received timestamps are not monotonically increasing. Also, packet losses can lead to frame losses which can cause missing timestamps. In order to overcome these issues, the timeIncrement estimator buffers timestamps over N frames and sorts them in ascending order. The timeIncrement is then estimated as the minimum difference between consecutive timestamps in the buffer. The sorting makes sure that the timestamps are monotonically increasing and calculating the minimum timestamp difference makes the estimation more robust to frame loss. The effectiveness of this method is clear from experimental results on frame rate estimation in the presence of packet loss (Section 4.1.2), since timeIncrement is used to estimate the frame rate.
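The buffering-and-sorting procedure described above can be sketched as follows; the 90 kHz clock and 30 fps values in the example are illustrative.

```python
def estimate_time_increment(timestamps):
    """Estimate the inter-frame time increment from a window of N timestamps.

    Timestamps arrive in coding order (not display order), and some may be
    missing due to frame loss. Sorting restores display order, and taking
    the minimum consecutive difference is robust to gaps left by lost frames.
    """
    ordered = sorted(set(timestamps))
    return min(b - a for a, b in zip(ordered, ordered[1:]))

# Coding order with B-frames at a 90 kHz clock and 30 fps (increment 3000),
# with the frame at timestamp 15000 lost in transmission:
coding_order = [0, 9000, 3000, 6000, 18000, 12000]
```

Even with the out-of-order arrival and the missing frame, the minimum consecutive difference recovers the true increment of 3000 clock ticks.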
A packetsPerPicture estimate is calculated for each picture. For those frames that are affected by packet loss, the corresponding packetsPerPicture estimates are discarded since these may be erroneous.
2.2.3 Parameter estimator
Finally, the BR, FR, and PLR estimates are provided to a standard G.1070 video quality estimator which calculates the corresponding video quality. Note that the parameters are estimated over a window of N frames. This means that the quality estimate at a frame is obtained from the statistics of the N preceding frames. The proposed system generates a video quality estimate for each frame, except during the initial buffering of N frames. No quality measurement is generated for lost frames.
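Putting the window statistics together, the three G.1070 inputs might be derived roughly as follows. This is a sketch of one plausible realization: the proportional scaling of the bit rate by 1/(1 − PLR) is our reading of the loss compensation mentioned in Section 2.2.2, and is an assumption rather than the paper's exact formula.

```python
def estimate_parameters(time_scale, time_increment, bits_received,
                        packets_received, packets_lost, n_frames):
    """Estimate bit rate (bps), frame rate (fps), and PLR (fraction)
    over a window of n_frames frames."""
    # Frame rate from the transport clock, e.g. 90000 / 3000 = 30 fps.
    frame_rate = time_scale / time_increment
    # PLR from received and lost packet counts over the window.
    total = packets_received + packets_lost
    plr = packets_lost / total if total else 0.0
    # Bits observed over the window, converted to bits per second...
    raw_bit_rate = bits_received * frame_rate / n_frames
    # ...then scaled up to compensate for bits missing due to lost packets
    # (assumed proportional compensation).
    bit_rate = raw_bit_rate / (1.0 - plr) if plr < 1.0 else raw_bit_rate
    return bit_rate, frame_rate, plr
```

For example, a 30-frame window at a 90 kHz clock with a 3000-tick increment, 120000 video coding layer bits received, and 60 of 600 packets lost yields 30 fps, a PLR of 10%, and a loss-compensated bit rate of about 133 kbps.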
2.3 Experimental results
The performance of the proposed video parameter estimation methods is validated by the experimental results in Section 4. The proposed methods were implemented in a prototype system as a proof of concept, and several experiments were performed on the estimation accuracy of bit rate, frame rate, and PLR using a variety of bitstreams with different coding configurations. The experimental results in Section 4 show not only high estimation accuracy but also high robustness of the bit rate and frame rate estimation in the presence of packet loss.
3 Enhanced content-adaptive G.1070
3.1 Generalized frame complexity estimation
The complexity of a frame is a combination of the spatial complexity of the picture and the temporal complexity of the scene in which it is found. Pictures with more detail have higher spatial complexity than those with little detail. Scenes with high motion have higher temporal complexity than those with little or no motion. Compared to previous works, which investigate frame complexity in the pixel domain [30, 31], we propose a novel frame complexity algorithm in the bitstream domain, which does not need to fully decode and reconstruct the video and has much lower computational complexity. In a general video compression process, for a fixed level of quantization, frames with higher complexity yield more bits. Similarly, for a fixed target number of bits, frames with higher complexity result in larger quantization step sizes. Therefore, the coding complexity can be estimated from the number of coded bits and the level of quantization. These two parameters are used to estimate the number of bits that would have been used at a particular quantization level (denoted the reference quantization level), which is then used to predict complexity. The following derivation applies to many video compression standards, including MPEG-2, MPEG-4, and H.264/AVC.
The reference quantization step size matrix M Q is arranged in zigzag order and m Q is an entry in the matrix. To evaluate the effects of the quantization step size matrix, we consider a weighted sum of all the elements m Q where the averaging factor, a, for each element depends on the corresponding frequency. In natural imagery, the energy tends to be concentrated in the lower frequencies. Thus, quantization step sizes in the lower frequencies have more impact on the resulting number of bits. The weighted sums in Equation 11 allow the lower frequencies to be weighted more heavily than the higher frequencies.
In many cases, different macroblocks can have different quantization step size matrices. Thus, the matrices specified in Equation 11 are averaged over all the macroblocks in the frame. Some compression standards allow macroblocks to be skipped. This usually occurs when the macroblock data can be well predicted from previously coded data. Hence, to be more precise, the quantization step size matrices specified in Equation 11 are averaged over all the coded (not skipped) macroblocks in the frame. Extracting the QP and MB mode for each MB requires variable length decoding, which accounts for about 40% of the cycle complexity of full decoding. Compared to header-only decoding, which accounts for about 2-4% of the cycle complexity of the decoding process, the proposed algorithm pays a higher computational cost to obtain a more accurate quality estimate. However, compared with video quality assessment in the pixel domain, our model has much lower complexity.
The frame complexity estimation is designed for all video compression standards. Different video standards use different quantization step size matrices and, in the following text, we derive the frame complexity functions for H.264/AVC and MPEG-2. Note that these derivations may also be used for MPEG-4, which uses two quantization modes wherein mode 0 is similar to MPEG-2 and mode 1 is similar to H.264.
3.2 H.264 frame complexity estimation
H.264 (also known as MPEG-4 Advanced Video Coding, or AVC) uses a QP to determine the quantization level. The QP can take one of 52 values. The QP is used to derive the quantization step size, which in turn is combined with a scaling matrix to derive the quantization step size matrix. An increase of 1 in QP results in an increase in quantization step size of approximately 12%. As shown in Equation 13, this change in QP results in an increase in the quantization complexity factor by a factor of approximately 1.1, and a decrease in the number of frame bits by a factor of approximately 1.1. Similarly, a decrease of 1 in QP results in an increase by a factor of 1.1 in the number of frame bits.
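The relationship above suggests a simple bitstream-domain estimate: scale the observed frame bits to a common reference QP using the approximately 1.1× change in bits per QP step. This is a sketch of the idea, not the paper's exact Equation 13; the reference QP of 28 is an arbitrary illustrative choice.

```python
def h264_frame_complexity(frame_bits, avg_qp, ref_qp=28):
    """Estimate H.264 frame complexity as the number of bits the frame
    would have needed at a reference QP: each +1 in QP reduces bits by
    roughly a factor of 1.1, so observed bits are scaled back up by 1.1
    per QP step above the reference.
    """
    return frame_bits * (1.1 ** (avg_qp - ref_qp))
```

Under this model, a frame coded with 10000 bits at QP 29 is judged about 1.1× more complex than a frame coded with the same bits at QP 28, since it achieved the same size despite coarser quantization.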
3.3 MPEG-2 frame complexity estimation
In MPEG-2, the quant_scale_code has one value (between 1 and 31) for each macroblock. The resulting quant_scale is the same at each coefficient position in the 8 × 8 matrix. Thus, quant_scale_input and quant_scale_ref in Equation 18 are independent of i and can be factored out of the summation. For the reference, we choose 16 as the reference quant_scale_code to represent average quantization. We use the notation quant_scale_ref to indicate the value of quant_scale when quant_scale_code = 16. For the input bitstream, we calculate the average quant_scale_code for each frame over the coded macroblocks, denoted quant_scale_input_avg.
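With quant_scale constant across the 8 × 8 matrix, the same idea reduces to scaling the frame bits by the ratio of the average quantizer to the reference quantizer. The sketch below assumes the linear q_scale_type mapping (quant_scale = 2 × quant_scale_code) and a bits-inversely-proportional-to-quantizer rate model; both are simplifications of Equation 18 rather than the paper's exact derivation.

```python
def mpeg2_frame_complexity(frame_bits, avg_quant_scale_code, ref_code=16):
    """Estimate MPEG-2 frame complexity by normalizing the frame bits to
    the reference quant_scale_code of 16. Assumes the linear q_scale_type
    mapping (quant_scale = 2 * quant_scale_code) and bits inversely
    proportional to the quantizer step size.
    """
    quant_scale_input = 2.0 * avg_quant_scale_code
    quant_scale_ref = 2.0 * ref_code
    return frame_bits * (quant_scale_input / quant_scale_ref)
```

A frame coded at the reference quantizer keeps its bit count as its complexity; a frame coded at a coarser average quantizer (e.g. quant_scale_code 24) is scaled up proportionally.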
3.4 Bitrate normalization using frame complexity
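The core of the normalization, as summarized in Section 1, is to divide the measured bit rate by a content complexity factor before feeding it to the G.1070 model, so that complex scenes are not judged as if their extra bits bought extra quality. A minimal sketch, under the assumption that the complexity factor is the mean of the per-frame complexity estimates relative to a nominal complexity (the exact form used in the paper's equations may differ):

```python
def normalized_bit_rate(bit_rate, frame_complexities, nominal_complexity):
    """Normalize the measured bit rate by content complexity (assumed
    form: divide by the mean per-frame complexity relative to a nominal
    value). Complex content yields a lower normalized bit rate, better
    reflecting the perceptual quality actually achieved.
    """
    mean_c = sum(frame_complexities) / len(frame_complexities)
    return bit_rate * (nominal_complexity / mean_c)
```

For example, a 1000 kbps stream whose frames average twice the nominal complexity is treated as a 500 kbps stream of nominal content when passed to the quality model.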
4 Experimental results
In this section, experimental results are provided to demonstrate the effectiveness of the parameter estimation methods proposed in Section 2 as well as the quality prediction accuracy of the enhanced G.1070E model proposed in Section 3.
4.1 Parameter estimation accuracy evaluation
Summary of test content and test conditions used for parameter estimation accuracy testing:
- Test sequences: akiyo, bridge-close, bridge-far, bus, coastguard, container, flower-garden, football, foreman, hall, highway, mobile-and-calendar, mother-daughter, news, paris, silent, Stefan, table-tennis, tempete, waterfall
- Bit rates: 32 kbps, 64 kbps, 128 kbps, 256 kbps
- Frame rates: 6 fps, 10 fps, 15 fps, 30 fps
- Packet-loss rates: 0%, 1%, 2%, 5%, 10%
- Loss patterns: 2 random patterns
4.1.1 Bit rate estimation
In order to evaluate the accuracy of bit rate estimation with increasing PLR, the estimates of bit rate at non-zero PLRs were compared with the 0% packet-loss case which is considered as the ground truth.
4.1.2 Frame rate estimation
Additionally, the frame rate estimation was subjected to stress testing in order to test its robustness to high PLR. To do so, each original test bitstream is degraded with different PLRs, starting from 0% and going up to 95% in steps of 5%. The frame rate estimates are compared with the ground truth frame rates for every packet-loss-impaired bitstream. From the results, it is observed that the frame rate estimates are accurate for all the test cases as long as the bitstreams are decodable. If the bitstream is not decodable (generally for PLR greater than 75%), no frame rate estimate can be produced.
Note that the proposed frame rate estimation algorithm will fail in the rare event wherein packets belonging to every alternate frame get dropped before reaching the decoder, in which case no two consecutive timestamps can be received during the buffer window (here, set to 30 frames). However, this is only a failure insofar as the goal is to obtain the actual encoded frame rate and not the frame rate observed at the decoder (which in this case is exactly half the encoded frame rate).
4.1.3 PLR estimation
Accurate estimation of PLR is crucial because it is used as a correction factor for the bit rate estimate when packet loss is present. In order to analyze the accuracy of PLR estimation, we use the EPFL PoliMi database, which consists of CIF and 4CIF resolution videos that have 18 and 32 slices per frame, respectively, where each slice is encapsulated in one packet. This database was chosen for two reasons: (a) it provides tools to extract the locations of lost packets, and (b) it enables a good visual representation of PLR estimation since it has a finer granularity of packet loss (i.e. a sufficiently high number of packets per frame).
Note that the impact of actual packets lost on the PLR can also be clearly seen. For example, for a short duration after 1000 packets, the number of packets lost increases causing a corresponding increase in the instantaneous PLR. Similarly, the number of packets lost between 2500 and 3500 is lower and this causes a drop in instantaneous PLR.
4.2 G.1070E quality prediction accuracy evaluation
In this section, we present experimental results comparing the performance of G.1070 (using the parameter estimation methods proposed in Section 2) and the proposed G.1070E method (Section 3) on three different testing datasets. Following the methods described in the G.1070 Recommendation, the 12 coefficients of G.1070 and G.1070E are trained on the same video dataset. In our experiments, the performance of the proposed methods is similar for H.264 and MPEG-2 bitstreams.
The comparison between G.1070E and G.1070 for the IT-IST H.264 encoded sequences (Spearman rank correlation)
The comparison between G.1070E and G.1070 for the IT-IST MPEG-2 encoded sequences (Spearman rank correlation)
The comparison between G.1070E and G.1070 for the EPFL PoliMI Video Quality Assessment Database (Spearman rank correlation)
Like G.1070, G.1070E is an NR, bitstream-domain, objective video quality measurement model. Experimental results show that G.1070E has a significantly higher correlation with subjective MOS scores and can reflect the quality of the video experience better than G.1070. The price paid for this improvement in quality prediction accuracy is the complexity involved in extracting additional parameters, e.g. the QP and the numbers of coded and total macroblocks, and in computing the frame complexity.
5 Quality monitoring system and applications
The quality measurement tools described above have been incorporated into a real-time video quality monitoring system. We introduce the notion of a video quality agent. This is a software process that can analyze a bitstream and output a quality measurement. In order to calculate the G.1070 measurement, the agent must first estimate the bit rate, frame rate, and PLR as described in Section 2. Thus, it must partially decode the input bitstream to extract the main features: bit counts, time scales, time stamps, coded unit types, and sequence numbers. For calculation of the enhancements described in Section 3, the agent must also extract the quantization step size matrix for each macroblock. Thus, the agent does the decoding necessary to extract these features. Alternatively, the feature extraction can be built into an existing decoder. For example, a video player or transcoder can be modified to extract the features needed by the quality agent during decoding for playback. We use the term 'video quality agent' to refer to a software process, integrated with an existing decoder or with its own decoding ability, that can analyze a bitstream, extract the necessary features, estimate the necessary parameters, calculate the quality estimates, and finally, communicate those measurements to another software process running in the network.
A video quality monitoring system is a collection of video quality agents all reporting their measurements back to a central network collection point where the measurements are aggregated for further analysis. As mentioned above, video quality agents can be embedded into video players on mobile handsets, in set-top boxes, on computers, etc. In addition, agents with their own decoding capabilities can be deployed at a streaming server, transcoder, or router.
In the small system of Figure 15, the aggregator is receiving quality measurements about the same video stream from four different agents. By synchronizing these four streams of data, the aggregator can monitor the degradation in quality as the video passes through the transcoder, packager, server, and transmission network. The transcoder is expected to degrade the video quality. The goal of transcoding in this system is to modify the source content to match the bit rate, frame rate, and codec type supported by the target network and media player. By comparing the quality measurements from before and after transcoding, this damage can be quantified and compared to pre-established thresholds. Alerts can be issued when the drop in quality exceeds these thresholds. The packaging and serving processes are not expected to degrade the video quality. Differences in quality measurements between these two points can indicate problems in the video data paths. Finally, measurements from the handset represent the user experience. Differences in quality between the video served and that received can be attributed to the communication network. In considering the changes in quality, the aggregator is constructing a measure of the fidelity of the channel between measurement points. This allows the aggregator to identify the source of quality degradations and fits nicely into the standard network management paradigm.
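The aggregator logic described here amounts to differencing synchronized quality streams between adjacent measurement points and flagging drops that exceed pre-established thresholds. A toy illustration; the point names and threshold values are hypothetical:

```python
def quality_alerts(scores, thresholds):
    """Compare quality at consecutive measurement points along the path
    and flag any hop whose quality drop exceeds its threshold.

    scores: quality score per measurement point, in path order, e.g.
            {"transcoder": 4.1, "server": 4.0, "handset": 3.2}
    thresholds: maximum allowed drop per hop, keyed by (from, to);
            hops with no configured threshold are never flagged.
    """
    alerts = []
    points = list(scores)
    for src, dst in zip(points, points[1:]):
        drop = scores[src] - scores[dst]
        if drop > thresholds.get((src, dst), float("inf")):
            alerts.append((src, dst, round(drop, 3)))
    return alerts
```

In this example, a large drop between the server and the handset would be attributed to the communication network, matching the diagnostic reasoning described above.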
A number of video service applications can be modeled with a generalized version of Figure 15. Consider the case in which the devices are operated by different companies. At each hand-off point, there are service level agreements (SLAs) specifying a minimum quality of service. These SLAs could also specify a maximum amount of degradation to the video quality. With the ability to measure quality, systems could manage their bandwidth usage, ensuring that the amount of bandwidth used is just enough to meet the quality targets. Similarly, network operators can establish tiered services in which the video quality delivered to the viewer depends on the price paid: more expensive plans deliver higher quality video. To do this, the quality of the video must be measured and controlled. A final example is quality assurance of end-user video. Most video network operators today are not aware of any video quality problems in their network until they receive a complaint from a customer. A network instrumented to measure video quality will give operators the ability to identify and troubleshoot problems more quickly.
In many cases, it seems that the quality measurements shown in Figure 15 could be made with a reference. For example, if the video gateway is modifying the stream, it can measure the quality of the output relative to the input and thus report the level of degradation for which it is responsible. It is not clear, however, how a number of these relative quality measurements could be combined to provide insight into the overall impact on quality (a simple linear summation or average would likely be insufficient). Further, in many applications, the various components in the network are controlled by different parties, each of whom has an incentive to report very slight, if any, degradation in quality, whether true or not. For these reasons, we propose this general agent-aggregator system structure with NR video quality models that measure the relevant aspects of the video.
As we seek to use the proposed quality models in the context of a system like Figure 15, a number of practical challenges need to be properly addressed. Two synchronization issues arise in the implementation of such a system. First, consider multiple network devices (many instances of server, network, and end-point all running in parallel), all reporting quality measurements to a single aggregator. The system must be able to establish which measurements can serve as references for which other target measurements. Once that first synchronization issue has been addressed, the two streams of measurement data, target and reference, must be temporally aligned. Tight computational and memory constraints at some measurement points are another concern. Mobile devices usually have limited resources, including battery power, memory, and compute cycles. Fortunately, since most mobile devices will decode the received bitstreams and display the video anyway, the extra computation of applying the proposed quality metric on these devices is minor (some experimental statistics on the overhead of the quality calculation are presented in Section 3). However, computational challenges exist in less likely spots. A video server or switch may have very powerful processors, large memory footprints, and plenty of electrical power, but these devices are also tasked with serving a large number of streams simultaneously. Adding a partial decoding/extraction process to each stream may place a considerable burden on some network nodes.
The ITU-T standardized G.1070 video quality model is widely used as a video quality planning tool for video conferencing applications. It takes as inputs the target bitrate and frame rate as well as the expected PLR of the channel. However, there are two technical challenges to extend this model for real-time quality monitoring for general video applications.
First, in the quality monitoring scenario, the bit rate and frame rate of the bitstreams and the actual PLR of the network are not known and need to be estimated. Second, the video content characteristics significantly impact the encoded bitrate of different video scenes at similar quality levels. This content-sensitivity issue may not be obvious in the context of video conferencing where the content is homogeneous, but its impact is felt when measuring the quality of general videos with varying characteristics.
To address the above problems, we first enable quality monitoring with G.1070 by presenting methods to continuously estimate the bit rate, frame rate, and PLR from received bitstreams. Then, we propose a novel enhanced G.1070 (G.1070E) system, which compensates for the impact of varying video content characteristics on the encoding bit rate by normalizing the bit rate with an estimated video complexity. The improved quality prediction accuracy of the proposed G.1070E model is validated by experimental results comparing the predicted quality with MOS data collected from subjective tests.
Finally, we presented an emerging application that can efficiently use the proposed real-time video quality monitoring method to diagnose network problems and ensure end-user video quality.
- Seshadrinathan K, Soundararajan R, Bovik A, Cormack L: Study of subjective and objective quality assessment of video. IEEE Trans Image Process 2010, 19(6):1427-1441.
- Winkler S: Digital Video Quality: Vision Models and Metrics. Wiley, New York; 2005.
- Marziliano P, Dufaux F, Winkler S, Ebrahimi T: Perceptual blur and ringing metrics: applications to JPEG2000. Signal Process Image Commun 2004, 19:163-172.
- Ferzli R, Karam L: A human visual system-based model for blur/sharpness perception. International Workshop on Video Processing and Quality Metrics (VPQM) 2006.
- Liu D, Chen Z, Xu F, Gu X: No reference block based blur detection. International Workshop on Quality of Multimedia Experience (QoMEX) 2009.
- Wang Z, Sheikh H, Bovik A: No reference perceptual quality assessment of JPEG compressed images. IEEE International Conference on Image Processing (ICIP) 2002.
- Babu R, Perkis A: An HVS-based no-reference perceptual quality assessment of JPEG coded images using neural networks. IEEE International Conference on Image Processing (ICIP) 2005.
- Wang Z, Bovik A, Evans B: Blind measurement of blocking artifacts in images. IEEE International Conference on Image Processing (ICIP) 2000.
- Muijs R, Kirenko I: A no-reference blocking artifact measure for adaptive video processing. European Signal Processing Conference 2005.
- Lu Z, Lin W, Seng BC, Kato S, Yao S, Ong E, Yang XK: Measuring the negative impact of frame dropping on perceptual visual quality. SPIE Human Vision and Electronic Imaging 2005, 5666:554-562.
- Yang KC, Guest CC, El-Maleh K, Das PK: Perceptual temporal quality metric for compressed video. IEEE Trans Multimedia 2007, 9:1528-1535.
- Ou YF, Ma Z, Liu T, Wang Y: Perceptual quality assessment of video considering both frame rate and quantization artifacts. IEEE Trans Circuits Syst Video Technol 2011, 21(3):286-298.
- Pastrana-Vidal RR, Gicquel JC: Automatic quality assessment of video fluidity impairments using a no-reference metric. International Workshop on Video Processing and Quality Metrics (VPQM) 2006.
- Babu R, Bopardikar A, Perkis A, Hillestad OI: No-reference metrics for video streaming applications. International Workshop on Packet Video 2004.
- Rui H, Li C, Qiu S: Evaluation of packet loss impairment on streaming video. J Zhejiang Univ Sci 2006, 7(Suppl I):131-136.
- Reibman A, Poole D: Predicting packet-loss visibility using scene characteristics. International Workshop on Packet Video 2007.
- Lin TL, Kanumuri S, Zhi Y, Poole D, Cosman P, Reibman A: A versatile model for packet loss visibility and its application to packet prioritization. IEEE Trans Image Process 2010, 19(3):722-735.
- Liu T: Perceptual quality assessment of videos affected by packet-losses. PhD thesis, Polytechnic Institute of New York University; 2010.
- Mohamed S, Rubino G: A study of real-time packet video quality using random neural networks. IEEE Trans Circuits Syst Video Technol 2002, 12(12):1071-1083.
- Takahashi A, Yamagishi K, Kawaguti G: Recent activities of QoS/QoE standardization in ITU-T SG12. NTT Technical Review 2008.
- Verscheure O, Frossard P, Hamdi M: User-oriented QoS analysis in MPEG-2 video delivery. Real-Time Imaging 1999, 5(5):305-314.
- Yang F, Wan S, Xie Q, Wu H: No-reference quality assessment for networked video via primary analysis of bit stream. IEEE Trans Circuits Syst Video Technol 2010, 20(11):1544-1554.
- Recommendation ITU-T G.1070: Opinion Model for Video-telephony Applications 2007.
- Yamagishi K, Hayashi T: Parametric packet-layer model for monitoring video quality of IPTV services. IEEE International Conference on Communications 2008.
- Belmudez B, Moller S: Extension of the G.1070 video quality function for the MPEG2 video codec. International Workshop on Quality of Multimedia Experience (QoMEX) 2010.
- Joskowicz J, Ardao J: Enhancements to the opinion model for video-telephony applications. Fifth International Latin American Networking Conference 2009.
- Narvekar N, Liu T, Zou D, Bloom J: Extending G.1070 for video quality monitoring. IEEE International Conference on Multimedia and Expo (ICME) 2011.
- Wolf S, Pinson M: Video quality measurement techniques. National Telecommunications and Information Administration (NTIA) Report 2002.
- Wang B, Zou D, Ding R, Liu T, Bhagavathy S, Narvekar N, Bloom J: Efficient frame complexity estimation and application to G.1070 video quality monitoring. International Workshop on Quality of Multimedia Experience (QoMEX) 2011.
- Yang J, Zhao Q, Zhang L: The study of frame complexity prediction and rate control in H.264 encoder. International Conference on Image Analysis and Signal Processing (IASP) 2009.
- Tian L, Sun Y, Sun S: Frame complexity prediction for H.264/AVC rate control. IEEE International Conference on Multimedia and Expo (ICME) 2009.
- Wiegand T, Bjontegaard G, Sullivan G, Luthra A: Overview of the H.264/AVC video coding standard. IEEE Trans Circuits Syst Video Technol 2003, 13:560-576.
- ISO/IEC 13818-2: MPEG-2 Video 1995.
- MPEG-2 video decoder version 12. http://www.mpeg.org/MPEG/MSSG
- EPFL-PoliMI Video Quality Assessment Database (version 2.0). http://mmspl.epfl.ch/vqa
- Instituto Superior Tecnico of Instituto de Telecomunicacoes dataset. http://amalia.img.lx.it.pt
- Simone FD, Naccari M, Tagliasacchi M, Dufaux F, Tubaro S, Ebrahimi T: Subjective assessment of H.264/AVC video sequences transmitted over a noisy channel. International Workshop on Quality of Multimedia Experience (QoMEX) 2009.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.