Detection of shifted double JPEG compression by an adaptive DCT coefficient model

Wang, Shi-Lin; Liew, Alan Wee-Chung; Li, Sheng-Hong; Zhang, Yu-Jin; Li, Jian-Hua

doi:10.1186/1687-6180-2014-101

Research
Open access
Published: 05 July 2014

EURASIP Journal on Advances in Signal Processing volume 2014, Article number: 101 (2014)

Abstract

In many JPEG image splicing forgeries, the tampered image patch has been JPEG-compressed twice with different block alignments. This phenomenon is called the shifted double JPEG (SDJPEG) compression effect. Detection of SDJPEG-compressed patches can help in detecting and locating the tampered region. However, current SDJPEG detection methods do not provide satisfactory results, especially when the tampered region is small. In this paper, we propose a new SDJPEG detection method based on an adaptive discrete cosine transform (DCT) coefficient model. The DCT coefficient distributions of SDJPEG and non-SDJPEG patches are analyzed, and a discriminative feature is proposed to perform the two-class classification. An adaptive approach is employed to select the most discriminative DCT modes for SDJPEG detection. The experimental results show that the proposed approach achieves much better results than some existing approaches in SDJPEG patch detection, especially when the patch size is small.

1 Introduction

With the rapid development of image processing tools, manipulating a digital image without leaving obvious visual traces is becoming easier and easier. The detection of malicious tampering and the verification of the credibility of the original digital image have become important research topics.

Some researchers have proposed digital watermarking as an image/video content authentication technique [1–3]. However, these kinds of ‘active authentication’ methods have not been widely used because most images on the Internet are not required to be watermarked. Moreover, there is also the challenge of how to guard against hostile attacks in watermark-based approaches. Recently, a variety of ‘passive authentication’ methods [4–7] have been proposed which perform image content authentication by detecting certain cues produced during creation and modification of the image, such as double compression, light abnormality, re-sampling, and photo response non-uniformity noise (PRNU). Compared with the active approach, the passive or blind authentication approach does not require additional watermarks or signatures and could have broader applications in image forensics.

JPEG is the most widely used image format. Authentication or detection of forgeries on JPEG images plays an important role in image forensics. Since most tampered JPEG images undergo at least two JPEG compressions, many JPEG image authentication methods are based on detection of double JPEG compression. As the block discrete cosine transform (BDCT) is the key operation in JPEG compression, the distribution of the BDCT coefficients usually contains important information which could indicate the compression history. Hence, most passive image tampering detection approaches for JPEG images are based on BDCT coefficient analysis. Lukas and Fridrich [8] tried to identify double JPEG compression by detecting the double-peak effect in the DCT coefficient histogram. Fu et al. [9] observed that the distribution of the first digit of DCT coefficients after JPEG compression followed the generalized Benford's law and stated that double JPEG compression could be detected because it would cause violations of the first digit law. In [10], Pevny and Fridrich introduced a machine learning approach for detection of double JPEG compression. A set of features from the histograms of several low-frequency DCT coefficients was extracted, and a support vector machine (SVM) was adopted as the classifier. Recently, Farid [11] observed that re-compression would introduce additional local minima in the difference between the image and its JPEG-compressed counterpart, and these local minima, referred to as the JPEG ghosts, could be used to detect double compression. However, these methods are only effective when the block structures of the first and second JPEG compressions are aligned with each other.

In many JPEG image tampering situations, a foreign JPEG-compressed patch is inserted into an authentic image and the resultant image is re-compressed to form the new image. The tampered region has double JPEG compressions, but the block structures of the two compressions in the tampered region are usually not aligned with each other. Such a case is referred to as shifted double JPEG compression (SDJPEG) [12] or non-aligned double JPEG compression (NA-JPEG) [13, 14]. For SDJPEG, the double JPEG compression detection methods discussed above cannot achieve satisfactory results. Luo et al. [15] tried to detect SDJPEG by analyzing the blocking artifact characteristics matrix (BACM). They observed that BACM is symmetric for single JPEG compression but the symmetry is destroyed after SDJPEG. However, BACM is highly related to the image content and the detection performance would decrease if the testing images are very different from those in the training set. In order to solve the above problem, Chen and Hsu [16] extended the idea of BACM and proposed a feature which is less related to the image content by introducing the inter-block correlation. However, since the statistical features in both [15] and [16] require a large amount of data to obtain high discriminative power, their methods do not work well for small SDJPEG patches.

Recently, Bianchi and Piva [13] tried to detect SDJPEG effects by examining the integer periodicity in the DCT coefficient histogram when the BDCT is computed according to the first JPEG compression. In addition, they also proposed a statistical model that characterizes the artifacts due to SDJPEG [14]. They observed that the shifted JPEG compression introduces Gaussian noise to each DCT coefficient and approximated the variance of the noise by the quantization step of the shifted compression. Inspired by [14], we propose a highly effective SDJPEG image tampering detection method in this paper. The major contributions of our work lie in the following: (i) we perform a rigorous theoretical analysis of the DCT coefficient variations caused by SDJPEG and derive from it a statistical model for the DCT coefficients of SDJPEG patches, which provides a more accurate estimation of the quantization noise introduced by the shifted JPEG quantization compared with that in [13, 14]; (ii) based on the analysis, we propose an effective discriminative feature to detect SDJPEG patches; and (iii) we propose an adaptive DCT component selection method to select the most discriminative DCT components. Our algorithm not only detects image forgeries but also locates the tampered regions accurately.

This paper is organized as follows. Section 2 gives a description of the ‘crop-and-paste’ image tampering problem and introduces current approaches. Section 3 gives an in-depth analysis on the DCT coefficient variations caused by shifted JPEG compression and describes the DCT coefficient histogram model for SDJPEG patches. In Section 4, a new discriminative feature is introduced to detect SDJPEG patches. An adaptive DCT mode selection method and a tampering detection algorithm are also given. Section 5 presents the experimental results comparing our approach with several state-of-the-art techniques. Finally, Section 6 draws the conclusion.

2 Crop-and-paste image tampering detection

Figure 1 illustrates a typical crop-and-paste image tampering scenario. Given an original image shown in Figure 1a, an image region of arbitrary shape highlighted in Figure 1b was cropped from a foreign image and was pasted onto the original image to construct the tampered image of Figure 1c (with tampered region highlighted). The tampering detection problem can be stated as follows: given an image, detect whether it has been tampered with through crop-and-paste and, if so, locate the region where the crop-and-paste occurred. Since a region of arbitrary shape can be divided into a concatenation of small square patches, we consider a square image patch composed of a number of 8 × 8 blocks as the fundamental unit in this work.

To facilitate subsequent discussions, we first give some definitions. An image patch with square shape and containing a number of 8 × 8 blocks can be classified into one of five categories as shown in Figure 2 (from left to right, top to bottom):

(i) Uncompressed: These patches are raw images and have never been JPEG compressed.
(ii) Aligned single JPEG compressed (ASJPEG in short): When the uncompressed image patches undergo a JPEG compression with the block structure aligned to the image patch, the output image patches are called ASJPEG patches. They are usually referred to as single JPEG image patches in the literature.
(iii) Aligned double JPEG compressed (ADJPEG in short): When the ASJPEG patches undergo another JPEG compression and the block structures of the two JPEG compressions are aligned with each other, the output image patches are called ADJPEG patches. They are usually referred to as double JPEG image patches in the literature.
(iv) Shifted double JPEG compressed (SDJPEG in short): When the ASJPEG patches undergo another JPEG compression and the block structures of the two JPEG compressions are different, the output image patches are called SDJPEG patches.
(v) Shifted single JPEG compressed (SSJPEG in short): Different from SDJPEG patches, when the image patches before the shifted JPEG compression (with the block structure in dashed line) have never been compressed with the block structure starting from the top-left corner, the output image patches are called SSJPEG patches.

There are two general approaches for crop-and-paste tampering detection in JPEG images: aligned double JPEG detection and shifted double JPEG detection. Both approaches analyze an image patch within a window aligned with the JPEG block structure of the final image (hereafter called an aligned patch) to detect any sign of tampering. Their differences and requirements are briefly summarized in Table 1. In most crop-and-paste tampering detection applications, both kinds of approaches are applied together to provide a more robust detection result.

Table 1 Requirements of the crop-and-paste tampering detection methods

The underlying rationale of the aligned double JPEG detection approach is as follows. If the original image has been JPEG compressed before and undergoes another JPEG compression after crop-and-paste tampering, image patches in the unmodified region in the final JPEG image (such as aligned patch I in Figure 3a) would have been compressed twice with the same block structure and exhibit the aligned double JPEG effect, while image patches in the tampered region (such as aligned patch II in Figure 3a) will not show such an effect. Algorithms in this approach look for untampered patches, i.e., they detect the presence of the aligned double JPEG effect to establish the authenticity of the image patch. If the effect is absent, the patch is assumed to be tampered. It has been shown [8–11, 17] that if the quality factors of the original and final images are the same, the detection performance is not satisfactory. Recently, Huang et al. [18] proposed an aligned double JPEG detection algorithm that applies to the case where both the original and final images have the same quantization matrix. However, their approach cannot provide accurate detection results for small image patches. Finally, the aligned double JPEG detection approach cannot detect the tampered region if the original image is uncompressed.

In contrast, the shifted double JPEG detection-based approach [13–16] looks for tampered patches by detecting cues of tampering in an image patch. The basic idea is that if the inserted image region has been JPEG compressed earlier, the second JPEG compression will leave some cues of shifted compression in the tampered region in the final image (such as the additional blocking artifacts [15, 16]). For the aligned patches fully or partially located in the tampered region in the final image (such as aligned patch II in Figure 3a), such cues would exist, while for the aligned patches located in the unmodified region in the final image (such as aligned patch I in Figure 3a), such cues would be absent. Hence, the tampered region can be located by searching for all the aligned patches where the cues left by shifted compression exist. However, such cues vary greatly with image content and the quality factor of the inserted patch, and it is difficult to find a robust and discriminative feature for all situations. Moreover, this approach needs a large image patch to obtain robust detection results and is less effective for small patches.

In this paper, we propose a new shifted double JPEG detection method which also examines the characteristics of the inserted patch. Similar to [13, 14], our algorithm considers image patches that are not aligned with the block structure of the final image (such as non-aligned patches I and II in Figure 3b). When analyzing the non-aligned patches in the final image, SDJPEG patches (such as non-aligned patch II in Figure 3b where the JPEG-compressed inserted patch undergoes a shifted JPEG compression, i.e., during JPEG compression of the final image) are located in the tampered region, while SSJPEG patches (such as non-aligned patch I in Figure 3b) are located in the untampered region. Note that the JPEG compression of the final image is aligned for the aligned patches in Figure 3a and shifted for the non-aligned patches in Figure 3b. In our algorithm, an exhaustive search over the 63 possible locations of the non-aligned image patches is performed to detect tampering. We will show in the experiment section that the increase in computational cost is worthwhile and our algorithm can achieve much better detection performance compared with the existing methods.

3 DCT coefficient analysis for SDJPEG patches

Since SDJPEG patches are generated by shifted JPEG compression on ASJPEG patches, we examine how the shifted JPEG compression affects the DCT coefficients. We first describe the effect of shifted JPEG compression on the DCT coefficients in Section 3.1. Then we derive a specific DCT coefficient model for SDJPEG patches in Section 3.2. The notation used hereafter is summarized in Table 2.

Table 2 Notation

3.1 DCT coefficient variations caused by shifted JPEG compression

Before analyzing the effects on DCT coefficients caused by shifted JPEG compression, a simpler case, i.e., the effects caused by aligned JPEG compression, is discussed. Given an 8 × 8 image block A, its DCT coefficients can be represented by $D^{O} (A) = [D_{m, n}^{O} (A)]_{1 \leq m, n \leq 8}$. If block A undergoes an aligned JPEG compression (as shown in Figure 4a) with the quality factor QF, each resulting DCT coefficient of block A, denoted by $D^{AJ} (A) = [D_{m, n}^{AJ} (A)]_{1 \leq m, n \leq 8}$, will be a multiple of the corresponding quantization step. Hence, aligned JPEG compression will induce a zero-mean quantization error (denoted by $E_{m, n}^{AJ} (A)$) for all the DCT coefficients, i.e.,

D_{m, n}^{AJ} (A) = D_{m, n}^{O} (A) + E_{m, n}^{AJ} (A)

(1)

The discussions above can be extended to the case of shifted JPEG compression. If block A undergoes a shifted JPEG compression with a coordinate shift of (x_S, y_S) and quality factor QF (as shown in Figure 4b), the DCT coefficients of block A become $D^{SJ} (A) = [D_{m, n}^{SJ} (A)]_{1 \leq m, n \leq 8}$. To derive the expression for D^SJ (A), we consider the DCT coefficients of the neighboring 8 × 8 aligned blocks whose block structure coincides with the block structure of the shifted JPEG compression on A. As shown in Figure 4c,d, block A is surrounded by four aligned blocks B₁, B₂, B₃, and B₄. According to [19], the DCT coefficients of block A and B_i (i = 1,2,3,4) are related by (details are given in the Appendix)

A = A_{1} + A_{2} + A_{3} + A_{4} \Rightarrow D (A) = D (A_{1}) + D (A_{2}) + D (A_{3}) + D (A_{4})

(2)

A_{i} = H_{i 1} B_{i} H_{i 2} \Rightarrow D (A) = \sum_{i = 1}^{4} D (H_{i 1}) D (B_{i}) D (H_{i 2})

(3)

where H_{i 1} and H_{i 2} (i = 1,2,3,4) are the row and column translation matrices to translate a specific block from B_i to A_i, as shown in Figure 4. Note that D(H_{i 1}) and D(H_{i 2}) (i = 1,2,3,4) are only related to the coordinate shifts (x_S, y_S).
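The relation in (2) and (3) can be checked numerically. The following is a minimal sketch, assuming an orthonormal 8 × 8 DCT and a hypothetical shift convention in which block A starts `row_shift` rows and `col_shift` columns inside the top-left aligned block B₁. The function names are illustrative and do not come from the paper.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II matrix C, so that D(X) = C @ X @ C.T."""
    k = np.arange(n).reshape(-1, 1)
    x = np.arange(n).reshape(1, -1)
    c = np.sqrt(2.0 / n) * np.cos((2 * x + 1) * k * np.pi / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c

C = dct_matrix()

def dct2(block):
    """2-D DCT of an 8 x 8 block (or of an 8 x 8 translation matrix)."""
    return C @ block @ C.T

def translation_matrices(row_shift, col_shift):
    """(H_i1, H_i2) pairs for B1..B4 (top-left, top-right, bottom-left, bottom-right)."""
    up = np.eye(8, k=row_shift)          # moves the lower rows of B up into A
    down = np.eye(8, k=row_shift - 8)    # moves the upper rows of B down into A
    left = np.eye(8, k=-col_shift)       # moves the right columns of B left into A
    right = np.eye(8, k=8 - col_shift)   # moves the left columns of B right into A
    return [(up, left), (up, right), (down, left), (down, right)]

def shifted_block_dct(blocks, row_shift, col_shift):
    """D(A) = sum_i D(H_i1) D(B_i) D(H_i2), computed entirely in the DCT domain."""
    pairs = translation_matrices(row_shift, col_shift)
    return sum(dct2(h1) @ dct2(b) @ dct2(h2) for (h1, h2), b in zip(pairs, blocks))

# Usage: compare against the DCT of the block cut directly from the pixels.
rng = np.random.default_rng(0)
region = rng.integers(0, 256, size=(16, 16)).astype(float)
blocks = [region[:8, :8], region[:8, 8:], region[8:, :8], region[8:, 8:]]
direct = dct2(region[3:11, 5:13])
via_neighbors = shifted_block_dct(blocks, row_shift=3, col_shift=5)
```

The identity holds because the DCT matrix is orthonormal, so the factor C.T @ C between consecutive terms collapses to the identity.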

Let $D^{O} (B_{i}) = [D_{m, n}^{O} (B_{i})]_{1 \leq m, n \leq 8}$ be the original DCT coefficients of block B_i (i = 1,2,3,4); thus, before compression, $D^{O} (A) = \sum_{i = 1}^{4} D (H_{i 1}) D^{O} (B_{i}) D (H_{i 2})$ according to (3). D^O(B_i) becomes D^AJ(B_i), i = 1,2,3,4, after a JPEG compression is performed on these aligned blocks. Based on the analysis of aligned JPEG compression in (1), D^AJ(B_i) = D^O(B_i) + E^AJ(B_i), i = 1,2,3,4. The aligned JPEG compression on B_i is equivalent to a shifted JPEG compression on A. Hence, after compression, the DCT coefficients of block A would change to

\begin{array}{l} D^{SJ} (A) = \sum_{i = 1}^{4} D (H_{i 1}) D^{AJ} (B_{i}) D (H_{i 2}) \\ = \sum_{i = 1}^{4} D (H_{i 1}) (D^{O} (B_{i}) + E^{AJ} (B_{i})) D (H_{i 2}) \\ = D^{O} (A) + \sum_{i = 1}^{4} D (H_{i 1}) E^{AJ} (B_{i}) D (H_{i 2}) \\ = D^{O} (A) + E^{SJ} (A) \end{array}

(4)

where $E^{SJ} (A) = [E_{m, n}^{SJ} (A)] = \sum_{i = 1}^{4} D (H_{i 1}) E^{AJ} (B_{i}) D (H_{i 2})$ denotes the shifted quantization error caused by shifted JPEG compression. For any DCT mode (m,n), the shifted quantization error $E_{m, n}^{SJ} (A)$ can be expressed as a linear combination of 4 × 64 = 256 zero-mean random variables, i.e., $E_{m, n}^{SJ} (A) = \sum_{i = 1}^{4} \sum_{u = 1}^{8} \sum_{v = 1}^{8} c_{m, n}^{i} (u, v) E_{u, v}^{AJ} (B_{i})$, where $c_{m, n}^{i} (u, v)$ is the weighting parameter determined by D(H_{i 1}) and D(H_{i 2}) (i = 1,2,3,4). According to the Central Limit Theorem (CLT), $E_{m, n}^{SJ} (A)$ approximately follows a zero-mean Gaussian distribution denoted by $G (0, σ_{m, n}^{SJ})$ with standard deviation $σ_{m, n}^{SJ}$. $σ_{m, n}^{SJ}$ can be calculated from the standard deviations of these 256 random variables ($E_{u, v}^{AJ} (B_{i})$, $1 \leq i \leq 4$, $1 \leq u, v \leq 8$) and the weighting coefficients $c_{m, n}^{i} (u, v)$, $1 \leq i \leq 4$, $1 \leq u, v \leq 8$.

In order to estimate the quantization error caused by aligned JPEG compression, we divide all the DCT coefficients into two parts, i.e., the DC and AC components. For the DC component, since the quantization step is relatively small and the DC coefficients have a large dynamic range, the quantization error caused by aligned JPEG compression approximately follows the uniform distribution of $[- \frac{q_{1, 1} (QF)}{2}, \frac{q_{1, 1} (QF)}{2})$ , where q_1,1(QF) is the quantization step for the DC coefficients with the quality factor QF. For the AC components, according to [20], the AC components for a natural image approximately follow a Laplacian distribution. If we fit the coefficients of AC components in the image patch to a Laplacian model, the standard deviations of $E_{u, v}^{AJ} (B_{i})$ can be directly calculated with the quality factor QF. In addition, from (3), the weighting coefficients, $c_{m, n}^{i} (u, v)$ , are determined by the coordinate shift (x_S, y_S). Then the theoretical value of $σ_{m, n}^{SJ}$ can be calculated when QF and (x_S, y_S) are known.
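For the DC component, the uniform model gives the standard deviation of the aligned quantization error in closed form. The short derivation below is the standard variance of a uniform distribution of width q_1,1(QF); the symbol $\sigma_{1,1}^{AJ}$ is introduced here only for illustration and is not notation from the paper.

```latex
\mathrm{Var}\left[E_{1,1}^{AJ}\right]
  = \frac{1}{q_{1,1}(QF)} \int_{-q_{1,1}(QF)/2}^{q_{1,1}(QF)/2} e^{2} \, de
  = \frac{q_{1,1}(QF)^{2}}{12},
\qquad
\sigma_{1,1}^{AJ} = \frac{q_{1,1}(QF)}{\sqrt{12}}
```

For example, a DC quantization step of 16 would give a standard deviation of about 4.62 under this model.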

To summarize, the shifted JPEG compression will induce a zero-mean Gaussian-distributed quantization error for all the DCT coefficients, and for different DCT coefficients, the standard deviations of the shifted quantization error are different and can be calculated when the quality factor and the coordinate shift of the compression are known.
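The calculation of $σ_{m, n}^{SJ}$ summarized above can be sketched as follows. This is a simplified illustration, not the paper's exact procedure: it assumes that the 256 aligned quantization errors are independent and that every mode, AC as well as DC, has the uniform-error variance q²/12, whereas the paper fits a Laplacian model for the AC modes. The function names and the shift convention (`row_shift`, `col_shift`) are assumptions.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II matrix C, so that D(X) = C @ X @ C.T."""
    k = np.arange(n).reshape(-1, 1)
    x = np.arange(n).reshape(1, -1)
    c = np.sqrt(2.0 / n) * np.cos((2 * x + 1) * k * np.pi / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c

def shifted_error_std(q_table, row_shift, col_shift):
    """Standard deviation of the shifted quantization error for each DCT mode.

    q_table[u, v] is the quantization step of mode (u+1, v+1).
    ASSUMPTION: Var(E^AJ_{u,v}) = q^2 / 12 for every mode (uniform error).
    """
    c = dct_matrix()
    var_aj = np.asarray(q_table, dtype=float) ** 2 / 12.0
    up, down = np.eye(8, k=row_shift), np.eye(8, k=row_shift - 8)
    left, right = np.eye(8, k=-col_shift), np.eye(8, k=8 - col_shift)
    var_sj = np.zeros((8, 8))
    for h1, h2 in [(up, left), (up, right), (down, left), (down, right)]:
        p = c @ h1 @ c.T   # D(H_i1)
        r = c @ h2 @ c.T   # D(H_i2)
        # c^i_{m,n}(u,v) = p[m,u] * r[v,n]; variances add with squared weights.
        var_sj += (p ** 2) @ var_aj @ (r ** 2)
    return np.sqrt(var_sj)

# Usage: a flat table (hypothetical) yields the same spread for every mode.
flat_table = np.full((8, 8), 12.0)
sigma_flat = shifted_error_std(flat_table, row_shift=3, col_shift=5)
```

With a non-flat table the result depends on both the mode and the shift, which is the behavior the text attributes to $σ_{m, n}^{SJ}$.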

3.2 DCT coefficient analysis on SDJPEG patches

As discussed earlier, SDJPEG patches are constructed by performing shifted JPEG compression on ASJPEG patches. Similar to the ASJPEG patches, the DCT coefficients of SDJPEG patches also have certain specific distributions.

Given a grayscale image patch IMG consisting of a number of 8 × 8 blocks aligned JPEG compressed with the quality factor QF₁, let $D_{m, n}^{AJ} (k)$ denote the (m,n)th DCT coefficient of the kth block in IMG. Since it has been JPEG compressed with quality factor QF₁, $D_{m, n}^{AJ} (k)$ would be a multiple of the quantization step of the (m,n)th DCT mode (denoted by q_m,n(QF₁)). Hence, considering all the (m,n)th DCT coefficients in IMG, the normalized histogram $h_{m, n}^{AJ} (f)$ of the (m,n)th DCT coefficients is given by

h_{m, n}^{AJ} (f) = \sum_{i = - N}^{N} w_{i} δ (f - i \times q_{m, n} ({QF}_{1}))

(5)

where w_i is the normalized frequency of the (m,n)th DCT coefficients having a value of i × q_m,n(QF₁) and Nq_m,n(QF₁) is the maximum absolute value of the (m,n)th DCT coefficient in IMG.

When the ASJPEG-compressed image patch IMG is transformed back to the spatial domain, two kinds of errors are introduced. One is the truncation error. Since the luminance value of the grayscale image ranges from 0 to 255, any gray level greater than 255 or less than 0 will be truncated to 255 or 0, respectively. However, as discussed in [21], this kind of error seldom appears in natural images (about 1% of the image blocks have truncation errors), and any block with pixels having a luminance value of 0 or 255 is excluded from further analysis. The other is the rounding error, i.e., a rounding process is carried out after the IDCT to ensure that the luminance value in the spatial domain is an integer. Such rounding error in the spatial domain leads to an error term (denoted by $E_{m, n}^{rounding}$) in all the DCT coefficients. Assuming that the rounding error in the spatial domain follows a zero-mean uniform distribution with range [-0.5, 0.5) and considering that the DCT is unitary, $E_{m, n}^{rounding}$ is approximately Gaussian distributed with zero mean and variance 1/12 for all 1 ≤ m, n ≤ 8 according to the CLT [14].
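The variance of 1/12 can be illustrated with a small Monte Carlo experiment. This is a sketch under the stated modeling assumption only: the spatial rounding errors are drawn directly as independent uniform values on [-0.5, 0.5), rather than produced by rounding real decompressed pixels. The function name and sample count are illustrative.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II matrix C, so that D(X) = C @ X @ C.T."""
    k = np.arange(n).reshape(-1, 1)
    x = np.arange(n).reshape(1, -1)
    c = np.sqrt(2.0 / n) * np.cos((2 * x + 1) * k * np.pi / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c

def rounding_error_variance(n_blocks=20000, seed=0):
    """Empirical variance of the DCT-domain rounding error for each of the 64 modes."""
    rng = np.random.default_rng(seed)
    c = dct_matrix()
    # ASSUMPTION: i.i.d. uniform spatial rounding errors on [-0.5, 0.5).
    spatial_err = rng.uniform(-0.5, 0.5, size=(n_blocks, 8, 8))
    dct_err = c @ spatial_err @ c.T   # matmul broadcasts over the block axis
    return dct_err.var(axis=0)

variance_per_mode = rounding_error_variance()
```

Because the transform is orthonormal, each DCT coefficient is a unit-norm combination of 64 independent errors, so its variance matches the spatial variance of 1/12 (about 0.083).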

After DCT-to-spatial transformation, the image patch undergoes a shifted JPEG compression with quality factor QF₂ and the coordinate shifts (x_S, y_S). According to the analysis in Section 3.1, the shifted JPEG compression will introduce a zero-mean Gaussian-distributed error term $E_{m, n}^{SJ} (k)$. The (m,n)th DCT coefficient of the kth block in the final SDJPEG patch, denoted by $D_{m, n}^{SD} (k)$, is given by

\begin{array}{l} D_{m, n}^{SD} (k) = D_{m, n}^{AJ} (k) + E_{m, n}^{rounding} + E_{m, n}^{SJ} (k) \\ = D_{m, n}^{AJ} (k) + E_{m, n}^{SJR} (k) \end{array}

(6)

where $E_{m, n}^{SJR} (k)$ is the overall error of the (m,n)th DCT coefficient during the construction of the SDJPEG patch from the ASJPEG patch. Since $E_{m, n}^{rounding}$ and $E_{m, n}^{SJ} (k)$ are both Gaussian distributed and independent of each other, $E_{m, n}^{SJR} (k)$ also follows the Gaussian distribution with zero-mean and standard deviation of $σ_{m, n}^{SJR} = \sqrt{{(σ_{m, n}^{SJ})}^{2} + 1 / 12}$ . The histogram of the (m,n)th DCT coefficients after shifted double JPEG compression, denoted by $h_{m, n}^{SD} (f)$ , is then given by

\begin{array}{l} h_{m, n}^{SD} (f) = h_{m, n}^{AJ} (f) \otimes G (0, σ_{m, n}^{SJR}) \\ = \sum_{i = - N}^{N} w_{i} G (i \times q_{m, n} ({QF}_{1}), σ_{m, n}^{SJR}), \quad 1 \leq m, n \leq 8 \end{array}

(7)

Figure 5 illustrates the DCT coefficient histograms for the SDJPEG and SSJPEG patches from the uncompressed 512 × 512 image ‘Lena.bmp’ (Figure 5 (c1)). From the figure, it can be observed that (i) the coefficient variations caused by rounding errors and shifted JPEG compression follow Gaussian distributions and thus $h_{m, n}^{SD} (f)$ can be approximated by (7); (ii) the AC DCT coefficient distribution of the SSJPEG patch (shown in Figure 5 (c2 and c3)) follows the Laplacian distribution, which is similar to that of uncompressed images [22], and is very different from that of the SDJPEG patches shown in Figure 5 (a2 and b2). Based on the above analysis, it can be concluded that the SDJPEG patches can be differentiated from SSJPEG patches by analyzing the DCT coefficient distributions. In the following section, details of the SDJPEG detection method based on the proposed DCT coefficient model in (7) will be elaborated.

4 Detection of image patch with SDJPEG compression

The analysis in Section 3 shows that the DCT coefficient distribution of SDJPEG patches follows a weighted summation of Gaussian components with the same standard deviation (as shown in (7) and Figure 5). However, in order to detect SDJPEG patches based on (7), two questions have to be addressed: (i) How can we obtain discriminative features that capture the differences in the (m,n)th DCT coefficient distributions between SDJPEG and SSJPEG patches? (ii) How can we select DCT modes that provide high discriminative power, given that the differences in the DCT coefficient distributions of SDJPEG and SSJPEG patches vary across DCT modes? In the following, we address these questions.

4.1 Discriminative feature extraction based on the (m,n)th DCT coefficients

For a specific DCT mode, say the (m,n)th, the detection algorithm determines whether the histogram h_m,n(f) is similar to $h_{m, n}^{SD} (f)$.

Given a specific quantization step q_m,n, we project the histogram h_m,n(f) onto the interval $(- \frac{q_{m, n}}{2}, \frac{q_{m, n}}{2})$ , and the sum of the histogram function within the interval is defined as

s h_{m, n} (f) = \sum_{i = - N}^{N} h_{m, n} (f + i \times q_{m, n}), - \frac{q_{m, n}}{2} \leq f < \frac{q_{m, n}}{2}

(8)

where Nq_m,n is the maximum absolute value of the (m,n)th DCT coefficient.
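Summing the histogram at all offsets f + i × q_m,n, as in (8), is equivalent to folding each coefficient into one quantization interval before building the histogram. The following is a minimal sketch of that folding, working on raw coefficient samples instead of a precomputed histogram. The function names and the number of bins are assumptions.

```python
import numpy as np

def fold_coefficients(coeffs, q):
    """Map each coefficient to its residual in [-q/2, q/2) modulo the step q."""
    coeffs = np.asarray(coeffs, dtype=float)
    return (coeffs + q / 2.0) % q - q / 2.0

def folded_histogram(coeffs, q, n_bins=30):
    """Normalized histogram of the folded coefficients, i.e. an estimate of sh(f)."""
    residuals = fold_coefficients(coeffs, q)
    hist, edges = np.histogram(residuals, bins=n_bins, range=(-q / 2.0, q / 2.0))
    return hist / max(hist.sum(), 1), edges

# Usage: exact multiples of q fold to 0; other values keep their offset.
folded = fold_coefficients([0.0, 10.0, -20.0, 13.0, -13.0], q=10.0)
sh, bin_edges = folded_histogram([0.0, 10.0, -20.0, 13.0, -13.0], q=10.0)
```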

According to (7), for SDJPEG-compressed image patches, $h_{m, n}^{SD} (f)$ follows a weighted summation of Gaussian components with a standard deviation of $σ_{m, n}^{SJR}$ . Hence, $s h_{m, n}^{SD} (f)$ would follow a specific distribution determined by $σ_{m, n}^{SJR}$ and q_m,n. Based on the different ratios of $q_{m, n} / σ_{m, n}^{SJR}$ , $s h_{m, n}^{SD} (f)$ can be obtained as follows. It should be noted that in our discussions, the Gaussian distribution, G(0, σ), is assumed to be bounded in [-3σ, 3σ], and any outliers are omitted.

If $\frac{q_{m, n}}{2} \geq 3 σ_{m, n}^{SJR}$ , i.e., all the Gaussian components in $h_{m, n}^{SD} (f)$ are isolated, $s h_{m, n}^{SD} (f)$ would follow a Gaussian distribution of $G (0, σ_{m, n}^{SJR})$ (as shown in Figure 6a), i.e.,

s h_{m, n}^{SD} (f) = \sum_{i = - N}^{N} w_{i} G (0, σ_{m, n}^{SJR}) = G (0, σ_{m, n}^{SJR}), \quad - \frac{q_{m, n}}{2} \leq f < \frac{q_{m, n}}{2}

(9)

If $p \times \frac{q_{m, n}}{2} < 3 σ_{m, n}^{SJR} \leq (p + 1) \times \frac{q_{m, n}}{2}$ for some p = 1, 2, …, some Gaussian components would overlap with each other and $s h_{m, n}^{SD} (f)$ would follow a mixture of (p + 1) Gaussian distributions (as shown in Figure 6b,c), i.e.,

s h_{m, n}^{SD} (f) = \begin{cases} \sum_{i = - N}^{N} w_{i} \left[\sum_{j = 0}^{p} G (f + j \times q_{m, n}, σ_{m, n}^{SJR})\right] = \sum_{j = 0}^{p} G (f + j \times q_{m, n}, σ_{m, n}^{SJR}), & - q_{m, n} / 2 \leq f \leq 0 \\ \sum_{i = - N}^{N} w_{i} \left[\sum_{j = 0}^{p} G (f - j \times q_{m, n}, σ_{m, n}^{SJR})\right] = \sum_{j = 0}^{p} G (f - j \times q_{m, n}, σ_{m, n}^{SJR}), & 0 \leq f \leq q_{m, n} / 2 \end{cases}

(10)

where $p + 1 = ceiling (\frac{3 σ_{m, n}^{SJR}}{q_{m, n} / 2})$ .

From (9), (10), and Figure 6, it is observed that for the isolated or slightly overlapping cases (p = 0 or p = 1), $s h_{m, n}^{SD} (f)$ has a distinctive peak at small |f|. The peak at small |f| becomes more prominent with larger $q_{m, n} / σ_{m, n}^{SJR}$. This phenomenon does not occur for SSJPEG patches. Figure 7 illustrates the sh_m,n(f) distributions of the image Lena.bmp of Figure 5 (c1) after SDJPEG and SSJPEG with various parameter settings.

From Figure 7, it is observed that the energy of sh_m,n(f) for SDJPEG image patches is concentrated in the center region whereas the energy of sh_m,n(f) for SSJPEG patches is almost evenly distributed. It should be noted that when the size of the image patch is small, e.g., 64 × 64, the number of coefficients available for each DCT mode is small (64 in total, one per 8 × 8 block), and it is difficult to estimate the actual distribution of the patch accurately and robustly using such limited data. In order to solve this problem of inadequate data, a 1-D feature similar to that in [23] is adopted to differentiate between SDJPEG and SSJPEG image patches as follows:

s_{m, n} = \int_{R_{2}} s h_{m, n} (f) df / \int_{R_{1}} s h_{m, n} (f) df

(11)

where $R_{1} = (- \frac{q_{m, n}}{6}, \frac{q_{m, n}}{6})$ representing the central region and $R_{2} = (- \frac{q_{m, n}}{2}, - \frac{q_{m, n}}{3}) \cup (\frac{q_{m, n}}{3}, \frac{q_{m, n}}{2})$ representing the peripheral region.
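The feature in (11) can be computed directly from the folded coefficients by counting how many fall in the peripheral and central regions. The sketch below also simulates the two cases with synthetic data: a Laplacian source with scale 20, a first quantization step of 10, and Gaussian noise with standard deviation 1 are all illustrative assumptions, not values from the paper.

```python
import numpy as np

def feature_s(coeffs, q):
    """Ratio of peripheral mass (R2) to central mass (R1) of the folded histogram."""
    r = (np.asarray(coeffs, dtype=float) + q / 2.0) % q - q / 2.0
    central = np.count_nonzero(np.abs(r) < q / 6.0)      # R1 = (-q/6, q/6)
    peripheral = np.count_nonzero(np.abs(r) > q / 3.0)   # R2 = q/3 < |f| <= q/2
    return peripheral / central if central else float("inf")

# Synthetic illustration (all parameters are assumptions).
rng = np.random.default_rng(1)
original = rng.laplace(scale=20.0, size=5000)   # stand-in for AC coefficients
q1 = 10.0                                       # hypothetical first quantization step

# SDJPEG-like: quantized once, then perturbed by Gaussian noise as in (6).
sdjpeg_like = q1 * np.round(original / q1) + rng.normal(0.0, 1.0, size=original.size)
# SSJPEG-like: never quantized on this block grid.
ssjpeg_like = original

s_sd = feature_s(sdjpeg_like, q1)
s_ss = feature_s(ssjpeg_like, q1)
```

In this synthetic setting the SDJPEG-like feature is close to 0 while the SSJPEG-like feature is close to 1, since R₁ and R₂ have the same total width q/3.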

For SDJPEG patches, the reference value of the feature, denoted by $s_{m, n}^{SD}$, can be derived from $s h_{m, n}^{SD} (f)$ and is determined by $σ_{m, n}^{SJR}$ and q_m,n. The reference value of the SSJPEG patch, denoted by $s_{m, n}^{SS}$, is obtained as follows. Since the SSJPEG patch has never been compressed with the block structure starting from the top-left corner of the image patch, the DCT coefficients of SSJPEG patches can be assumed to be distributed approximately as those of the original uncompressed image patch [8, 14]. Moreover, $h_{m, n}^{SS} (f)$ is equivalent to the distribution of the quantization error of the (m,n)th DCT component with the quantization step q_m,n. Hence, $h_{m, n}^{SS} (f)$ can be approximated using the aligned JPEG quantization error estimation approach introduced in Section 3.1. Then $s_{m, n}^{SS}$ can be derived from (8) and (11). Note that for the low-frequency DCT modes, the quantization step q_m,n is relatively small compared with the dynamic range of the DCT coefficients, and thus $s h_{m, n}^{SS} (f)$ is almost evenly distributed and $s_{m, n}^{SS} \approx 1 > s_{m, n}^{SD}$.

The final discriminative feature indicating the likelihood of an image patch having been SDJPEG compressed is derived by normalizing the extracted feature with the reference values $s_{m, n}^{SD}$ and $s_{m, n}^{SS}$ , i.e., $s n_{m, n} = (s_{m, n} - s_{m, n}^{SD}) / (s_{m, n}^{SS} - s_{m, n}^{SD})$ . Note that sn_m,n is small for SDJPEG patches and large for SSJPEG patches.
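The reference value $s_{m, n}^{SD}$ and the normalization can be sketched as follows. Two simplifications are assumed here and are not the paper's exact procedure: the folded distribution is modeled as an untruncated wrapped Gaussian (the paper bounds each Gaussian to [-3σ, 3σ]), and $s_{m, n}^{SS}$ defaults to 1, the approximation the text gives for low-frequency modes. All function names are hypothetical.

```python
import math

def _norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def _wrapped_mass(lo, hi, q, sigma):
    """Mass of a zero-mean Gaussian folded modulo q that lands in [lo, hi]."""
    n_wraps = int(math.ceil(6.0 * sigma / q)) + 1
    return sum(
        _norm_cdf((hi + j * q) / sigma) - _norm_cdf((lo + j * q) / sigma)
        for j in range(-n_wraps, n_wraps + 1)
    )

def reference_s_sd(q, sigma):
    """Reference feature value for an SDJPEG patch (R2 mass divided by R1 mass)."""
    central = _wrapped_mass(-q / 6.0, q / 6.0, q, sigma)
    peripheral = (_wrapped_mass(-q / 2.0, -q / 3.0, q, sigma)
                  + _wrapped_mass(q / 3.0, q / 2.0, q, sigma))
    return peripheral / central

def normalized_feature(s, s_sd, s_ss=1.0):
    """sn = (s - s_SD) / (s_SS - s_SD); ASSUMPTION: s_SS defaults to 1."""
    return (s - s_sd) / (s_ss - s_sd)

# Usage with illustrative numbers: well-separated peaks versus heavy overlap.
ref_isolated = reference_s_sd(q=10.0, sigma=1.0)
ref_overlapped = reference_s_sd(q=10.0, sigma=100.0)
```

When the Gaussian spread is much larger than the step, the reference value approaches 1 and the denominator of the normalization becomes small, so such modes carry little discriminative information.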

4.2 Discriminative feature extraction based on all the DCT coefficients

The analysis in Section 4.1 shows that sn_m,n for the SDJPEG image patches is smaller than that for the SSJPEG patches. This difference becomes more prominent as $q_{m, n} / σ_{m, n}^{SJR}$ increases, and thus the DCT components with larger $q_{m, n} / σ_{m, n}^{SJR}$ would be more discriminative in SDJPEG detection. Our analysis in Section 3 shows that for a specific image, $σ_{m, n}^{SJR}$ can be estimated theoretically when QF₂ and the shift coordinates (x_S, y_S) are known. Figure 8 gives some examples of $σ_{m, n}^{SJR}$, $1 \leq m, n \leq 8$, with different QF₂ and (x_S, y_S).

Figure 8 shows the following: (i) Shifted JPEG compression with lower quality (QF₂) introduces a larger spread to the DCT coefficients. (ii) The values of $σ_{m, n}^{SJR}$ differ across DCT modes and are not proportional to their corresponding quantization steps. For instance, in Figure 8 (b2), the quantization steps for (2,1) and (2,2) are the same while $σ_{2, 1}^{SJR}$ and $σ_{2, 2}^{SJR}$ are quite different. (iii) Even for the same DCT mode, the variations caused by shifted JPEG compression do not remain constant with different coordinate shifts, e.g., for the (2,3)th DCT coefficient, $σ_{2, 3}^{SJR}$ is quite different when (x_S, y_S) changes between (4,4) and (1,6). Hence, $σ_{m, n}^{SJR}$ can be obtained by a table lookup with the knowledge of QF₂ and (x_S, y_S). In order to have a large value of $q_{m, n} / σ_{m, n}^{SJR}$, smaller values of $σ_{m, n}^{SJR}$ or larger values of q_m,n are preferred. For any value of the quality factor, Q = [q_m,n] = λQ_default, where Q_default is the default quantization table defined by the Independent JPEG Group (IJG) and λ is a constant determined by the quality factor. Hence, we have $\frac{q_{m, n}}{σ_{m, n}^{SJR}} = λ \times \frac{q_{default} (m, n)}{σ_{m, n}^{SJR}} = λ \times {dis}_{m, n}$, $1 \leq m, n \leq 8$. It should be noted that in practice, only QF₂ is known. We will show how QF₁ can be estimated later.
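As an illustration of the relation Q = λQ_default, the sketch below scales the standard JPEG luminance table with the quality rule used in the IJG reference software. It is an assumption that this is the table and rule the paper has in mind. Note that the integer rounding and clipping make the scaling only approximately proportional. The helper `discriminative_table` is a hypothetical name for the ratio dis_m,n.

```python
import numpy as np

# Standard JPEG luminance quantization table (corresponds to quality 50).
Q_DEFAULT = np.array([
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
])

def ijg_quant_table(quality):
    """Luminance table for a quality factor in 1..100 (IJG-style scaling)."""
    quality = min(max(int(quality), 1), 100)
    scale = 5000 // quality if quality < 50 else 200 - 2 * quality
    table = (Q_DEFAULT * scale + 50) // 100
    return np.clip(table, 1, 255)   # baseline JPEG keeps steps within 1..255

def discriminative_table(sigma_sjr):
    """dis(m, n) = q_default(m, n) / sigma_SJR(m, n), indexed from (0, 0)."""
    return Q_DEFAULT / np.asarray(sigma_sjr, dtype=float)

# Usage: entry [1, 0] is mode (2,1) and entry [1, 1] is mode (2,2).
table_75 = ijg_quant_table(75)
```

In this table the steps for modes (2,1) and (2,2) are both 12, which is consistent with the remark about Figure 8 (b2).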

The discriminative feature extraction considering all the DCT coefficients runs as follows:

1. Input the prior information: the image patch, the quality factor of the final compression QF₂, and the coordinate shift (x_S, y_S) (which is given by an exhaustive enumeration of the 63 possible coordinate shifts, see Section 4.3).
2. From the image patch, QF₂, and (x_S, y_S), estimate $σ^{SJR} = [σ_{m, n}^{SJR}]$ , which is introduced by the shifted double JPEG compression. Calculate the discriminative table DIS = [dis(m,n)] for the low frequencies (where m + n < 8).
3. Sort all the DCT modes by their discriminative value in descending order and select the first N_c components (N_c = 3 in our experiments) to construct the candidate set, i.e., { $(m_{1}, n_{1}), (m_{2}, n_{2}), \dots, (m_{N_{c}}, n_{N_{c}})$ }. Estimate the quantization step $q_{m_{i}, n_{i}} ({QF}_{1})$ (for the unknown QF₁) of these DCT modes by analyzing their histograms. It should be noted that if the coefficients of a DCT mode in the candidate set contain too many zero values (>80% of the total number of coefficients), the quantization step cannot be estimated accurately. Hence, considering the quantization noise caused by shifted compression, any mode (m_i, n_i) whose coefficients are concentrated in the range $[- 3 \times σ_{m_{i}, n_{i}}^{SJR}, 3 \times σ_{m_{i}, n_{i}}^{SJR}]$ , i.e., $\int_{f = - 3 \times σ_{m_{i}, n_{i}}^{SJR}}^{3 \times σ_{m_{i}, n_{i}}^{SJR}} h_{m_{i}, n_{i}}^{SD} (f) \, df \geq 80\%$ , is discarded and replaced with the mode with the next largest discriminative value outside the candidate set.
4. For the i-th mode (m_i, n_i) in the candidate set, extract the normalized discriminative feature $sn_{m_{i}, n_{i}}$ using the approach described in Section 4.1 with the estimated $σ_{m_{i}, n_{i}}^{SJR}$ and $q_{m_{i}, n_{i}} ({QF}_{1})$ . The final discriminative feature for the image patch, denoted by sn_all, is obtained by averaging $sn_{m_{i}, n_{i}}$ over $1 \leq i \leq N_{c}$ .

In step 3, the quantization step q_m,n(QF₁) can be obtained by an exhaustive search for the reference SDJPEG histogram $s h_{m, n}^{SD} (f)$ that is most similar to sh_m,n(f). However, this approach is computationally expensive. Since $h_{m, n}^{SD} (f)$ exhibits a periodic-like pattern with a period of q_m,n(QF₁), an approach similar to that in [17] is adopted to reduce the complexity: the fast Fourier transform is applied to the histogram $h_{m, n}^{SD} (f)$ , and the peak of the spectrum with the DC component removed is extracted to estimate the quantization step q_m,n(QF₁) of the (m,n)th DCT mode.

With q_m,n(QF₁) estimated, the quality factor QF₁ can be estimated by comparing q_m,n(QF₁) with the default quantization table Q_default. To improve the robustness, all q_m,n(QF₁) values in the candidate set are estimated. The median of the predicted QF₁ values is taken as the quality factor of the first compression, and all q_m,n(QF₁) values in the candidate set are refined using the estimated quality factor.
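The mode selection of step 3 and the FFT-based period estimate can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the histogram range `max_abs`, the zero-padded FFT length `n_fft`, and the search band `[q_min, q_max]` are all hypothetical choices. In particular, restricting the search to plausible periods is used here as a simple stand-in for removing the DC component and the low-frequency envelope of the histogram.

```python
import numpy as np


def is_concentrated(coeffs, sigma, frac=0.8):
    """True if at least `frac` of the coefficients lie within [-3*sigma, 3*sigma]."""
    coeffs = np.asarray(coeffs, dtype=float)
    return np.mean(np.abs(coeffs) <= 3.0 * sigma) >= frac


def select_candidate_modes(dis, concentrated, n_c=3):
    """Pick the n_c most discriminative low-frequency modes (1-based, m + n < 8),
    skipping modes flagged as concentrated around zero."""
    modes = [(m, n) for m in range(1, 9) for n in range(1, 9) if m + n < 8]
    modes.sort(key=lambda mn: dis[mn[0] - 1][mn[1] - 1], reverse=True)
    kept = [mn for mn in modes if not concentrated[mn[0] - 1][mn[1] - 1]]
    return kept[:n_c]


def estimate_quant_step(coeffs, max_abs=128, q_min=2, q_max=50, n_fft=4096):
    """Estimate the histogram period (primary quantization step) from the FFT peak."""
    values = np.rint(np.asarray(coeffs, dtype=float)).astype(int)
    values = values[np.abs(values) <= max_abs]
    # Integer-bin histogram over [-max_abs, max_abs].
    hist = np.bincount(values + max_abs, minlength=2 * max_abs + 1).astype(float)
    spectrum = np.abs(np.fft.rfft(hist, n=n_fft))
    # A period of q bins puts the fundamental near k = n_fft / q. Restricting k
    # to periods in [q_min, q_max] also excludes DC and the lowest frequencies.
    k_lo = int(np.ceil(n_fft / q_max))
    k_hi = int(np.floor(n_fft / q_min))
    k_peak = k_lo + int(np.argmax(spectrum[k_lo:k_hi + 1]))
    return int(round(n_fft / k_peak))
```

The estimated step of each candidate mode could then be compared with the entries of IJG tables for different quality factors, and the median of the resulting QF₁ candidates taken, as described above.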

4.3 Crop-and-paste image tampering detection by detecting the SDJPEG effects

In order to detect crop-and-paste image tampering, the JPEG image is divided into a series of B × B subimages. Each B × B subimage is examined to detect whether it contains any image patch having SDJPEG effects, which runs as follows:

(i) For a specific coordinate shift (x_S, y_S) (0 ≤ x_S, y_S ≤ 7 and (x_S, y_S) ≠ (0, 0)), crop an image patch ${IMG}_{x_{S}, y_{S}}$ of size (B - 8) × (B - 8) from the subimage, starting from (x_S, y_S).
(ii) With (x_S, y_S) and QF₂, extract the discriminative feature sn_all for ${IMG}_{x_{S}, y_{S}}$ . Then the SDJPEG effect map (SEM) for (x_S, y_S) is set to sn_all, i.e., SEM(x_S, y_S) = sn_all.
(iii) Repeat steps (i) and (ii) for all the 63 possible coordinate shifts to obtain the SEM for the B × B subimage.
(iv) Compare the SEM with those of the positive (containing SDJPEG patches) and negative (not containing any SDJPEG patch) samples in the training database to detect whether the subimage has been tampered with. Fisher's linear discriminant analysis (LDA) [24] is adopted as the classifier in our approach.
(v) Loop through steps (i) to (iv) for all B × B subimages in the image to detect the suspicious regions that might have been tampered with.
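Steps (i) to (iii) can be sketched as a loop over the 63 non-zero shifts. In this minimal sketch, `feature_fn` is a hypothetical callback that stands in for the extraction of sn_all from Section 4.2, the subimage is assumed to be a square grayscale array, and x_S is assumed to index columns while y_S indexes rows.

```python
import numpy as np


def sdjpeg_effect_map(subimage, feature_fn):
    """Return an 8x8 map with sem[y_s, x_s] = feature of the patch cropped at
    shift (x_s, y_s). The entry for the zero shift (0, 0) is left as NaN."""
    subimage = np.asarray(subimage)
    b = subimage.shape[0]
    sem = np.full((8, 8), np.nan)
    for y_s in range(8):
        for x_s in range(8):
            if (x_s, y_s) == (0, 0):
                continue  # only the 63 shifted alignments are examined
            patch = subimage[y_s:y_s + b - 8, x_s:x_s + b - 8]
            sem[y_s, x_s] = feature_fn(patch, x_s, y_s)
    return sem


def sem_feature_vector(sem):
    """Flatten the map into the 63 values passed to the classifier."""
    return sem.ravel()[1:]  # drop the unused (0, 0) entry
```

Calling `sem_feature_vector(sdjpeg_effect_map(subimage, feature_fn))` yields one 63-dimensional vector per B × B subimage, which is the input to the classifier of step (iv).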

5 Experiments and discussion

In this section, we first investigate the effectiveness of several key components of the proposed approach, i.e., the quantization error estimation for shifted JPEG compression, the primary quality factor estimation, and the proposed discriminative feature for SDJPEG detection. Then, we analyze the SDJPEG detection performance for image patches of various sizes and compare the detection performance with four state-of-the-art SDJPEG detection algorithms, i.e., Luo et al.'s [15] (Luo's in short), Chen and Hsu's [16] (Chen's in short), Bianchi and Piva's [13] (Bianchi I's in short), and Bianchi and Piva's [14] (Bianchi II's in short). Finally, we compare the image tampering detection performance of all five algorithms for two example images. The images in our experiments come from two widely used image databases, i.e., the UCID [25] and NRCS [26] image datasets, and all the original images are uncompressed.

5.1 Effectiveness evaluation of the proposed approach

5.1.1 Quantization error estimation for shifted JPEG compression

In order to evaluate the effectiveness of the proposed approach to estimate $σ^{SJR} = [σ_{m, n}^{SJR}]$ , 10,000 image patches with the size 256 × 256 are randomly collected from the uncompressed images in the databases [25, 26]. Then half of the image patches are used to generate the SSJPEG patches by shifted JPEG compression with the quality factor QF₂ randomly picked from 50 to 90 and random selection of (x_S, y_S) ((x_S, y_S) ≠ (0, 0)). The other half of the image patches are used to generate the SDJPEG by aligned JPEG compression with quality factor QF₁ randomly picked from 50 to 90 and shifted JPEG compression with quality factor QF₂ randomly picked from 50 to 90 and random selection of (x_S, y_S) ((x_S, y_S) ≠ (0, 0)). For each image patch, the actual standard deviation of the quantization error caused by shifted JPEG compression (denoted by $σ^{ACT - SJR} = [σ_{m, n}^{ACT - SJR}]$ ) is recorded as the ground truth, and the average relative estimation error is adopted to evaluate the estimation performance, i.e.,

$$η_{SJR} = \frac{1}{64} \sum_{m = 1}^{8} \sum_{n = 1}^{8} \frac{|σ_{m, n}^{SJR} - σ_{m, n}^{ACT - SJR}|}{|σ_{m, n}^{ACT - SJR}|} \times 100\% \qquad (12)$$

The average relative estimation error η_SJR for the above image patches using the proposed approach is 10.61%, which is much less than that using the rough approximation in [13, 14] (214.94%). Hence, the proposed DCT coefficient model can better describe the effects caused by shifted JPEG compression.
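For reference, the error measure of (12) amounts to a mean of element-wise relative errors over the 64 DCT modes. A minimal sketch, with a hypothetical function name and both 8 × 8 tables assumed to be given:

```python
import numpy as np


def average_relative_error(sigma_est, sigma_act):
    """Average relative estimation error over all DCT modes, in percent."""
    sigma_est = np.asarray(sigma_est, dtype=float)
    sigma_act = np.asarray(sigma_act, dtype=float)
    rel = np.abs(sigma_est - sigma_act) / np.abs(sigma_act)
    return float(np.mean(rel) * 100.0)
```

For example, an estimate that is 10% too large in every mode gives an error of about 10.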

5.1.2 Primary quality factor estimation

To show the effectiveness of the proposed quantization step estimation approach, the following experiments have been carried out. For a specific patch size, QF₁ and QF₂, 500 SDJPEG patches are generated with random selection of (x_S, y_S) ((x_S, y_S) ≠ (0, 0)). The average estimation accuracy is given in Table 3.

Table 3 Average estimation accuracy a1/a2/a3 of QF₁ with various patch sizes ((a1) 64 × 64; (a2) 128 × 128; (a3) 256 × 256), QF₁, and QF₂

From the table, it is observed that the estimation performance improves as the patch size increases. This is because a larger image patch provides more data to construct the histogram of the DCT coefficients, which makes the period estimation more robust. The estimation performance also improves as QF₂ - QF₁ increases. This can be explained as follows. From (7) and Figure 5 (a2 and b2), for SDJPEG image patches, the period in the histogram (i.e., the primary quantization step q_m,n(QF₁)) is large for a low quality factor QF₁, and the standard deviation of the quantization noise with rounding error, i.e., $σ_{m, n}^{SJR}$ , is small for a high quality factor QF₂. Hence, when QF₂ - QF₁ is large, e.g., QF₂ - QF₁ ≥ 10, the Gaussian impulses in the histogram do not overlap or overlap only slightly, and thus the period can be estimated accurately. However, when QF₂ - QF₁ is small, e.g., QF₂ - QF₁ ≤ -20, the Gaussian impulses in the histogram overlap heavily and the periodic pattern almost disappears from the histogram.
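The effect of the overlap on the period estimate can be quantified with a standard Fourier property, offered here as an idealized illustration rather than a result from the paper. If the histogram is modeled as impulses spaced q apart, each blurred by a Gaussian of standard deviation σ, the spectral peak at the fundamental frequency 1/q is attenuated by the factor exp(-2π²σ²/q²) relative to an unblurred impulse train. The function name below is hypothetical.

```python
import math


def fundamental_attenuation(q, sigma):
    """Relative strength of the period-q spectral peak after Gaussian blurring
    with standard deviation sigma (idealized impulse-train model)."""
    return math.exp(-2.0 * (math.pi * sigma / q) ** 2)
```

Under this model, q/σ = 8 keeps about 73% of the peak, whereas q/σ = 2 leaves less than 1%, which is consistent with the periodic pattern vanishing when QF₂ is much lower than QF₁.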

5.1.3 Effectiveness of the proposed discriminative feature

In order to investigate the effectiveness of the proposed discriminative feature sn_all, the distributions of sn_all for SSJPEG and SDJPEG patches with the size 256 × 256 are analyzed. For SSJPEG patches, the quality factor QF₂ is set to 70. For SDJPEG patches, the second compression quality factor QF₂ is also set to 70 and the primary compression quality factor QF₁ is randomly picked from 50 to 70. The coordinate shift (x_S, y_S) is also selected randomly in the range of 0 ≤ x_S, y_S ≤ 7, (x_S, y_S) ≠ (0, 0). The histograms of sn_all in SSJPEG and SDJPEG patches are given in Figure 9. It is observed that sn_all is usually small for SDJPEG patches while it is large for SSJPEG ones, which demonstrates that sn_all is effective in differentiating SDJPEG and SSJPEG patches.

5.2 SDJPEG patch detection

To assess the performance of SDJPEG effects detection, we prepare the dataset as follows:

i. Image patches of three sizes, i.e., 64 × 64, 128 × 128, and 256 × 256, are used.
ii. A total of 25 {QF₁,QF₂} pairs are investigated with QF₁,QF₂ = {50,60,70,80,90}.
iii. For a specific {QF₁,QF₂} pair, 10,000 image patches are randomly extracted from the uncompressed image database. The ‘positive’ samples are constructed by performing shifted JPEG compression with quality factor QF₁ and random coordinate shifts (x_S, y_S) (0 ≤ x_S, y_S ≤ 7 and (x_S, y_S) ≠ (0, 0)) and then saving the image patches in JPEG format with QF₂. The ‘negative’ samples are constructed by directly saving the same uncompressed image patches in JPEG format with QF₂.
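The construction in item iii can be simulated for grayscale data without a JPEG codec. The sketch below is a simplified stand-in under stated assumptions: it models only the lossy part of JPEG (blockwise DCT, quantization, rounding and clipping of pixels), ignores chroma and entropy coding, and takes the 8 × 8 quantization tables `q1` and `q2` as inputs. All function names are hypothetical.

```python
import numpy as np


def dct_matrix(n=8):
    """Orthonormal DCT-II matrix C, so that the 2-D DCT of X is C @ X @ C.T."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    c = np.sqrt(2.0 / n) * np.cos((2 * i + 1) * k * np.pi / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c


C = dct_matrix()


def jpeg_roundtrip(img, qtable):
    """Blockwise DCT, quantize, dequantize, inverse DCT, round and clip to 8 bits.
    Image height and width must be multiples of 8."""
    h, w = img.shape
    blocks = img.astype(float).reshape(h // 8, 8, w // 8, 8).transpose(0, 2, 1, 3) - 128.0
    coeffs = C @ blocks @ C.T
    coeffs = np.rint(coeffs / qtable) * qtable
    pixels = np.clip(np.rint(C.T @ coeffs @ C + 128.0), 0, 255)
    return pixels.transpose(0, 2, 1, 3).reshape(h, w).astype(np.uint8)


def make_samples(raw, size, x_s, y_s, q1, q2):
    """Return (positive, negative) size x size patches from an uncompressed image."""
    region = raw[:size + 8, :size + 8]
    first = jpeg_roundtrip(region, q1)  # first compression on the original grid
    # Cropping at (x_s, y_s) misaligns the second 8x8 grid from the first one.
    positive = jpeg_roundtrip(first[y_s:y_s + size, x_s:x_s + size], q2)
    negative = jpeg_roundtrip(region[y_s:y_s + size, x_s:x_s + size], q2)
    return positive, negative
```

Here `raw` must be at least (size + 8) pixels in each dimension and `size` a multiple of 8.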

Since in practice only QF₂ is available and QF₁ is unknown to an algorithm, the following are the settings for the different algorithms evaluated:

1) For Luo's [15] and Chen's [16] approaches, the SVM is adopted as the classifier. In the training procedure, since QF₁ is unknown, a specific SVM is trained for each QF₂ with the features extracted from half of the positive samples obtained from all possible selections of QF₁ = {50,60,70,80,90} and the corresponding negative samples (there are 25,000 positive and 25,000 negative samples overall). In the testing procedure, the test samples are classified by their corresponding classifier with the knowledge of QF₂.
2) For Bianchi I's [14] approach, with the knowledge of QF₂, QF₁ is estimated by an EM algorithm. Then the tampered region likelihood is calculated for each 8 × 8 block in the image patch, and the investigated image patch is classified as an SDJPEG patch if half of the 8 × 8 blocks in the patch have a likelihood greater than 1. For Bianchi II's [13] approach, with the knowledge of QF₂, QF₁ is obtained by an exhaustive search.
3) For the proposed approach, QF₁ is estimated by the method introduced in Section 4.2. For each QF₂, SEM features for 630 positive samples (ten samples for each possible (x_S, y_S) with QF₁ randomly selected) and 630 negative samples are adopted to train the classifier, and the rest (about 48,740 samples) are adopted for testing.
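The classifier training in item 3) can be sketched as a two-class Fisher LDA on the SEM feature vectors. This is a textbook formulation rather than the authors' code: the function names are hypothetical, the small ridge term is an added assumption for numerical stability, and the midpoint threshold assumes equal priors for the two classes.

```python
import numpy as np


def fit_fisher_lda(x_pos, x_neg, ridge=1e-6):
    """Fit a two-class Fisher discriminant; rows of x_pos / x_neg are samples."""
    x_pos = np.asarray(x_pos, dtype=float)
    x_neg = np.asarray(x_neg, dtype=float)
    mu_pos, mu_neg = x_pos.mean(axis=0), x_neg.mean(axis=0)
    # Within-class scatter; the ridge term is an assumption, not from the paper.
    s_w = (x_pos - mu_pos).T @ (x_pos - mu_pos) + (x_neg - mu_neg).T @ (x_neg - mu_neg)
    s_w += ridge * np.eye(s_w.shape[0])
    w = np.linalg.solve(s_w, mu_pos - mu_neg)
    threshold = 0.5 * w @ (mu_pos + mu_neg)  # midpoint of the projected class means
    return w, threshold


def predict_sdjpeg(x, w, threshold):
    """True where a sample is classified as containing an SDJPEG patch."""
    return np.asarray(x, dtype=float) @ w > threshold
```

Since the projection direction points from the negative to the positive class mean, the rule works regardless of whether sn_all is smaller or larger for the positive class.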

The detection performance obtained by our algorithm compared with the four existing methods is given in Tables 4, 5, and 6. For all the methods investigated, the classification performance is affected by two key parameters: the patch size and the difference in quality factor between the first and second JPEG compressions, i.e., QF₂ - QF₁. From the tables, the following observations can be made.

Table 4 Average classification accuracy, i.e., 1 - (FAR + FRR)/2, in percent with various values of QF₁ and QF₂ for patch size of 64 × 64

Table 5 Average classification accuracy for patch size of 128 × 128

Table 6 Average classification accuracy for patch size of 256 × 256

First, the recognition accuracy increases substantially with the increase of patch size. As the features used in all five methods are based on statistical models, more data leads to better estimation of the models. Among all five approaches, our algorithm always outperforms the others especially when the patch size is small (64 × 64 for instance) since it requires less data to estimate a discriminative model. Since only one or several small image patches are modified in most image tampering detection tasks, our algorithm is more appropriate in such applications.

Second, the detection performance improves as QF₂ - QF₁ increases. This is because a lower quality factor in the first JPEG compression leaves more traces of the compression history, and a higher quality factor in the second JPEG compression introduces less distortion to the image, which makes the traces left by the first compression easier to detect. The proposed approach achieves reasonable accuracies (above 80%) when QF₂ - QF₁ ≥ 10 and high accuracies (above 90%) when QF₂ - QF₁ ≥ 20 for 128 × 128 and 256 × 256 patches. However, when QF₂ - QF₁ ≤ -20, the detection performance is very poor for all approaches, i.e., the classification accuracies are close to that of a random guess (50%). The poor performance of the proposed approach is to be expected because when QF₂ < QF₁, $q_{m, n} / σ_{m, n}^{SJR}$ will be very small for all the DCT modes and $s_{m, n}^{SD}$ will be close to $s_{m, n}^{SS}$ . In addition, as shown in Section 5.1.2, with small values of $q_{m, n} / σ_{m, n}^{SJR}$ , all the Gaussian components of $h_{m, n}^{SD} (f)$ in (10) overlap heavily with each other and the estimation of QF₁ will not be accurate, especially when the patch size is small.
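The accuracy measure reported in Tables 4 to 6, 1 - (FAR + FRR)/2, can be computed from the decisions on the positive and negative test samples as in the following minimal sketch. The function name is hypothetical, and FRR is assumed here to denote the rate of missed SDJPEG patches.

```python
def average_accuracy(pred_pos, pred_neg):
    """Average classification accuracy in percent.

    pred_pos / pred_neg hold the boolean decisions ("is SDJPEG") for the
    positive / negative test samples."""
    frr = sum(1 for p in pred_pos if not p) / len(pred_pos)  # missed SDJPEG patches
    far = sum(1 for p in pred_neg if p) / len(pred_neg)      # false alarms
    return 100.0 * (1.0 - (far + frr) / 2.0)
```

Because the two error rates are averaged, a classifier that labels every patch the same way scores 50, matching the random-guess level mentioned above.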

5.3 Crop-and-paste image tampering detection

Figure 10 illustrates the tampering detection results using our approach compared with two recent state-of-the-art approaches [13, 14]. The forged image is made by cropping the image patch (enclosed by the highlighted boundary as shown in Figure 10 (b)) from another JPEG image with quality factor QF₁, pasting it into the original image, and saving the result as a JPEG file with quality factor QF₂. Three different settings of {QF₁,QF₂}, i.e., {50,50}, {50,70}, and {50,90}, are investigated, and the detected tampered region is highlighted in the figures. Note that for all three approaches, the corresponding parameters are adjusted to ensure that the false alarm rate (FAR) is less than 10%. Moreover, the detection results undergo a 3 × 3 median filtering process to remove isolated detection errors. It is observed from the figures that the proposed algorithm can accurately detect and locate the tampered region when {QF₁,QF₂} = {50,70} and {50,90}. It also accurately points out some suspicious regions when {QF₁,QF₂} = {50,50}. Compared with Bianchi and Piva's approaches [13, 14], the proposed approach provides more accurate detection results due to (i) better estimation of the quantization error caused by shifted JPEG compression and (ii) the more discriminative information provided by the adaptive DCT mode selection approach in Section 4.2, whereas only the DC mode is investigated in [13].

6 Conclusions

In this paper, a new JPEG image tampering detection algorithm based on SDJPEG detection is presented. The DCT coefficient distribution for SDJPEG patches has been expressed by a statistical model, and a discriminative feature has been proposed which can effectively differentiate between SDJPEG and SSJPEG patches. By an adaptive DCT mode selection scheme, several highly discriminative DCT modes are selected. We used several thousand SDJPEG and SSJPEG patches extracted from the UCID and NRCS image databases to evaluate the performance of the proposed algorithm along with several existing algorithms. The experimental results show that the proposed algorithm achieves much better results than the existing algorithms, especially when the patch size is small. We also performed experiments to detect and locate the tampered region of two test images, where the proposed algorithm always outperforms the other algorithms investigated. As a result, the proposed algorithm provides a new and effective solution for SDJPEG detection and JPEG image forensics.

Appendix

Derivation of DCT coefficient from adjacent blocks

From Figure 4d, it is observed that block A can be decomposed into four non-overlapping regions, i.e., A₁, A₂, A₃, and A₄, with A = A₁ + A₂ + A₃ + A₄. To derive $A_{i} = H_{i 1} B_{i} H_{i 2}$ , we demonstrate only the case i = 4; the same analysis can be applied to i = 1,2,3 (Figure 11).

For the bottom right corner, A₄ has the following relationship to B₄: A₄ = H₄₁B₄H₄₂, where $H_{41} = \begin{bmatrix} 0 & 0 \\ I_{h} & 0 \end{bmatrix}$ and $H_{42} = \begin{bmatrix} 0 & I_{w} \\ 0 & 0 \end{bmatrix}$ . I_h and I_w are identity matrices of size h × h and w × w, respectively; h and w are the number of rows and columns extracted. Since all unitary orthogonal transforms such as the DCT are distributive over matrix multiplication [19], we have $D (A_{4}) = D (H_{41}) D (B_{4}) D (H_{42}) \Rightarrow D (A) = \sum_{i = 1}^{4} D (A_{i}) = \sum_{i = 1}^{4} D (H_{i 1}) D (B_{i}) D (H_{i 2})$ .
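The relation for i = 4 can be checked numerically. The sketch below is an illustrative check with hypothetical names, using the orthonormal 8 × 8 DCT so that D(X) = C X Cᵀ; the values h = 3 and w = 5 are arbitrary.

```python
import numpy as np


def dct_matrix(n=8):
    """Orthonormal DCT-II matrix C, so that D(X) = C @ X @ C.T."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    c = np.sqrt(2.0 / n) * np.cos((2 * i + 1) * k * np.pi / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c


C = dct_matrix()


def D(x):
    """2-D DCT of an 8x8 block."""
    return C @ x @ C.T


def shift_matrices(h, w, n=8):
    """H41 moves the top h rows to the bottom; H42 moves the left w columns to the right."""
    h41 = np.zeros((n, n))
    h41[n - h:, :h] = np.eye(h)
    h42 = np.zeros((n, n))
    h42[:w, n - w:] = np.eye(w)
    return h41, h42


rng = np.random.default_rng(0)
b4 = rng.integers(0, 256, size=(8, 8)).astype(float)
h41, h42 = shift_matrices(h=3, w=5)
a4 = h41 @ b4 @ h42

# The bottom-right 3x5 corner of A4 equals the top-left 3x5 corner of B4.
assert np.array_equal(a4[5:, 3:], b4[:3, :5])
# Distributivity of the orthonormal DCT over matrix multiplication.
assert np.allclose(D(a4), D(h41) @ D(b4) @ D(h42))
```

The second assertion holds because CᵀC = I, so the inner factors cancel when the three transformed matrices are multiplied.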

References

1. Wang X, Xue J, Zheng Z, Liu Z, Li N: Image forensic signature for content authenticity analysis. J. Vis. Commun. Image. Represent. 2012, 23(5):782-797. 10.1016/j.jvcir.2012.03.005
2. Swaminathan A, Mao Y, Wu M: Robust and secure image hashing. IEEE. Trans. Info. Forensics. Sec. 2006, 1(2):215-230. 10.1109/TIFS.2006.873601
3. Phadikar A, Maity SP, Mandal M: Novel wavelet-based QIM data hiding technique for tamper detection and correction of digital images. J. Vis. Commun. Image. Represent. 2012, 23(3):454-466. 10.1016/j.jvcir.2012.01.005
4. Farid H: A survey of image forgery detection. IEEE. Signal. Process. Mag. 2009, 2(26):16-25.
5. Popescu AC: Statistical tools for digital image forensics. Ph.D. Thesis. Department of Computer Science, Dartmouth College, Hanover, NH; 2005.
6. Popescu AC, Farid H: Exposing digital forgeries in color filter array interpolated images. IEEE. Trans. Signal. Proc. 2005, 53(10):3948-3959.
7. Chen L, Lu W, Ni J, Sun W, Huang J: Region duplication detection based on Harris corner points and step sector statistics. J. Vis. Commun. Image. Represent. In Press
8. Lukas J, Fridrich J: Estimation of primary quantization matrix in double compressed JPEG images. In Proceedings of Digital Forensic Research Workshop. Cleveland; 2003:5-8.
9. Fu D, Shi YQ, Su Q: A Generalized Benford's law for JPEG coefficients and its applications in image forensics. In Proceedings of the SPIE Electronic Imaging, Security and Watermarking of Multimedia Contents IX, vol. 650. 5th edition. Springer, San Jose, CA; 2007:1L1-1L11.
10. Pevny T, Fridrich J: Detection of double-compression in JPEG images for applications in steganography. IEEE. Trans. Info. Forensics. Sec. 2008, 3(2):247-258.
11. Farid H: Exposing digital forgeries from JPEG ghosts. IEEE. Trans. Info. Forensics. Sec. 2009, 4(1):154-160.
12. Ye SM, Sun QB, Chang EC: Detecting digital image forgeries by measuring inconsistencies of blocking artifact. In Proceedings of IEEE International Conference on Multimedia and Expo. Beijing, China; 2007:12-15.
13. Bianchi T, Piva A: Detection of nonaligned double JPEG compression based on integer periodicity maps. IEEE. Trans. Info. Forensics. Sec. 2012, 7(2).
14. Bianchi T, Piva A: Image forgery localization via block-grained analysis of JPEG artifacts. IEEE. Trans. Info. Forensics. Sec. 2012, 7(3).
15. Luo W, Qu Z, Huang J, Qiu G: A novel method for detecting cropped and recompressed image block. In Proceedings of the IEEE International Conference on Acoustic, Speech and Signal Processing (ICASSP’07), vol. 6. 2nd edition. IEEE, Piscataway, Honolulu, Hawaii; 217-220.
16. Chen YL, Hsu CT: Detecting recompression of JPEG images via periodicity analysis of compression artifacts for tampering detection. IEEE. Trans. Info. Forensics. Sec. 2011, 6(2):396-406.
17. Lin Z, He J, Tang X, Tang CK: Fast, automatic and fine-grained tampered JPEG image detection via DCT coefficient analysis. Pattern. Recogn. 2009, 42(11):2492-2501. 10.1016/j.patcog.2009.03.019
18. Huang FJ, Huang JW, Shi YQ: Detecting double JPEG compression with the same quantization matrix. IEEE. Trans. Info. Forensics. Sec. 2010, 5(4):848-856.
19. Chang SF, Messerschmitt DG: Manipulation and compositing of MC-DCT compressed video. IEEE. J. Select. Areas. Commun. 1995, 13(1):1-11. 10.1109/49.363151
20. Reininger R, Gibson J: Distributions of the two-dimensional DCT coefficients for images. IEEE. Trans. Commun. 1983, 31(6):835-839. 10.1109/TCOM.1983.1095893
21. Fan Z, de Queiroz RL: Identification of bitmap compression history: JPEG detection and quantizer estimation. IEEE. Trans. Image. Process. 2003, 12(2):230-235. 10.1109/TIP.2002.807361
22. Popescu AC, Farid H: Statistical tools for digital forensics. Lect. Notes. Comput. Sci. 2005, 3200: 395-407.
23. Luo W, Huang J, Qiu G: JPEG error analysis and its application to digital image forensics. IEEE. Trans. Info. Forensics. Sec. 2010, 5(3):480-491.
24. Duda RO, Hart PE, Stork DG: Pattern Classification. 2nd edition. Wiley-Interscience, USA; 2000.
25. Schaefer G, Stich M: UCID: an uncompressed color image database. In Proceedings of the SPIE Storage and Retrieval Methods and Applications for Multimedia, vol. 530. 7th edition. Springer, San Jose, CA; 2003:472-480.
26. NRCS Photo Gallery 2005. http://photogallery.nrcs.usda.gov/res/sites/photogallery/

Acknowledgements

The work described in this paper is supported by National Key Basic Research Program of China No. 2013CB329603 and NSFC Fund No. 61271319 and No. 61071152.

Author information

Authors and Affiliations

School of Information Security Engineering, Shanghai Jiaotong University, No. 800, Dong Chuan Rd, Shanghai, 200240, China
Shi-Lin Wang & Jian-Hua Li
School of Information and Communication Technology, Gold Coast Campus, Griffith University, Queensland, QLD4222, Australia
Alan Wee-Chung Liew
Department of Electric and Electronic Engineering, Shanghai Jiaotong University, Shanghai, 200240, China
Sheng-Hong Li & Yu-Jin Zhang

Corresponding author

Correspondence to Shi-Lin Wang.

Additional information

Competing interests

The authors declare that they have no competing interests.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Wang, SL., Liew, A.WC., Li, SH. et al. Detection of shifted double JPEG compression by an adaptive DCT coefficient model. EURASIP J. Adv. Signal Process. 2014, 101 (2014). https://doi.org/10.1186/1687-6180-2014-101

Received: 09 January 2014
Accepted: 13 June 2014
Published: 05 July 2014
DOI: https://doi.org/10.1186/1687-6180-2014-101
