 Research
 Open Access
Detection of shifted double JPEG compression by an adaptive DCT coefficient model
EURASIP Journal on Advances in Signal Processing volume 2014, Article number: 101 (2014)
Abstract
In many JPEG image splicing forgeries, the tampered image patch has been JPEG compressed twice with different block alignments. This phenomenon in JPEG image forgeries is called the shifted double JPEG (SDJPEG) compression effect. Detection of SDJPEG-compressed patches could help in detecting and locating the tampered region. However, current SDJPEG detection methods do not provide satisfactory results, especially when the tampered region is small. In this paper, we propose a new SDJPEG detection method based on an adaptive discrete cosine transform (DCT) coefficient model. DCT coefficient distributions for SDJPEG and non-SDJPEG patches are analyzed, and a discriminative feature is proposed to perform the two-class classification. An adaptive approach is employed to select the most discriminative DCT modes for SDJPEG detection. The experimental results show that the proposed approach achieves much better results than some existing approaches in SDJPEG patch detection, especially when the patch size is small.
1 Introduction
With the rapid development of image processing tools, manipulating a digital image without leaving obvious visual traces is becoming easier and easier. The detection of malicious tampering and the verification of the credibility of the original digital image have become important research topics.
Some researchers have proposed digital watermarking as an image/video content authentication technique [1–3]. However, these kinds of ‘active authentication’ methods have not been widely used because most images on the Internet are not required to be watermarked. Moreover, there is also the challenge of how to guard against hostile attacks in watermark-based approaches. Recently, a variety of ‘passive authentication’ methods [4–7] have been proposed which perform image content authentication by detecting certain cues produced during creation and modification of the image, such as double compression, light abnormality, resampling, and photo response non-uniformity noise (PRNU). Compared with the active approach, the passive or blind authentication approach does not require additional watermarks or signatures and could have broader applications in image forensics.
JPEG is the most widely used image format, and authentication or detection of forgeries in JPEG images plays an important role in image forensics. Since most tampered JPEG images undergo at least two JPEG compressions, many JPEG image authentication methods are based on the detection of double JPEG compression. As the block discrete cosine transform (BDCT) is the key operation in JPEG compression, the distribution of the BDCT coefficients usually contains important information that can indicate the compression history. Hence, most passive image tampering detection approaches for JPEG images are based on BDCT coefficient analysis. Lukas and Fridrich [8] tried to identify double JPEG compression by detecting the double-peak effect in the DCT coefficient histogram. Fu et al. [9] observed that the distribution of the first digits of DCT coefficients after JPEG compression follows the generalized Benford's law and stated that double JPEG compression could be detected because it causes violations of the first-digit law. In [10], Pevny and Fridrich introduced a machine learning approach for the detection of double JPEG compression: a set of features from the histograms of several low-frequency DCT coefficients was extracted, and a support vector machine (SVM) was adopted as the classifier. Recently, Farid [11] observed that recompression introduces additional local minima in the difference between an image and its JPEG-compressed counterpart; these local minima, referred to as JPEG ghosts, can be used to detect double compression. However, these methods are only effective when the block structures of the first and second JPEG compressions are aligned with each other.
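As an illustration of the first-digit idea, the sketch below compares the empirical first-digit distribution of nonzero quantized DCT coefficients against the standard Benford distribution. Note that [9] uses a generalized (parametrized) Benford model rather than the standard one, and the function names here are ours, not the paper's:

```python
import math

def benford_pmf():
    """Standard Benford distribution for first digits 1..9."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}

def first_digit(x):
    """Most significant decimal digit of a nonzero number."""
    x = abs(x)
    while x >= 10:
        x /= 10
    while x < 1:
        x *= 10
    return int(x)

def first_digit_divergence(coeffs):
    """Chi-square-style distance between the empirical first-digit
    frequencies of the nonzero coefficients and Benford's law; larger
    values indicate a stronger violation of the first-digit law."""
    digits = [first_digit(c) for c in coeffs if c != 0]
    if not digits:
        return 0.0
    pmf = benford_pmf()
    n = len(digits)
    return sum((digits.count(d) / n - pmf[d]) ** 2 / pmf[d]
               for d in range(1, 10))
```

In a detector, a threshold on such a divergence (tuned on training data) would flag coefficient sets whose first-digit statistics deviate from the single-compression model.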
In many JPEG image tampering situations, a foreign JPEG-compressed patch is inserted into an authentic image and the resultant image is recompressed to form the new image. The tampered region has undergone double JPEG compression, but the block structures of the two compressions in the tampered region are usually not aligned with each other. This case is referred to as shifted double JPEG compression (SDJPEG) [12] or non-aligned double JPEG compression (NA-JPEG) [13, 14]. For SDJPEG, the double JPEG compression detection methods discussed above cannot achieve satisfactory results. Luo et al. [15] tried to detect SDJPEG by analyzing the blocking artifact characteristics matrix (BACM). They observed that the BACM is symmetric for single JPEG compression but that this symmetry is destroyed after SDJPEG. However, the BACM is highly related to the image content, and the detection performance decreases if the testing images are very different from those in the training set. To address this problem, Chen and Hsu [16] extended the idea of the BACM and proposed a feature that is less related to the image content by introducing inter-block correlation. However, since the statistical features in both [15] and [16] require a large amount of data to obtain high discriminative power, these methods do not work well for small SDJPEG patches.
Recently, Bianchi and Piva [13] tried to detect SDJPEG effects by examining the integer periodicity in the DCT coefficient histogram when the BDCT is computed according to the grid of the first JPEG compression. In addition, they proposed a statistical model that characterizes the artifacts due to SDJPEG [14]. They observed that the shifted JPEG compression introduces Gaussian noise into each DCT coefficient and approximated the variance of the noise by the quantization step of the shifted compression. Inspired by [14], we propose a highly effective SDJPEG image tampering detection method in this paper. The major contributions of our work are the following: (i) we perform a rigorous theoretical analysis of the DCT coefficient variations caused by SDJPEG and derive from it a rigorous statistical model for the DCT coefficients of SDJPEG patches, which provides a more accurate estimate of the quantization noise introduced by the shifted JPEG quantization than that in [13, 14]; (ii) based on this analysis, we propose an effective discriminative feature to detect SDJPEG patches; and (iii) we propose an adaptive DCT component selection method to select the most discriminative DCT components. Our algorithm not only detects image forgeries but also locates the tampered regions accurately.
This paper is organized as follows. Section 2 describes the ‘crop-and-paste’ image tampering problem and introduces current approaches. Section 3 gives an in-depth analysis of the DCT coefficient variations caused by shifted JPEG compression and describes the DCT coefficient histogram model for SDJPEG patches. In Section 4, a new discriminative feature is introduced to detect SDJPEG patches; an adaptive DCT mode selection method and a tampering detection algorithm are also given. Section 5 presents the experimental results comparing our approach with several state-of-the-art techniques. Finally, Section 6 draws the conclusion.
2 Crop-and-paste image tampering detection
Figure 1 illustrates a typical crop-and-paste image tampering scenario. Given the original image shown in Figure 1a, an image region of arbitrary shape, highlighted in Figure 1b, was cropped from a foreign image and pasted onto the original image to construct the tampered image of Figure 1c (with the tampered region highlighted). The tampering detection problem can be stated as follows: given an image, detect whether it has been tampered with through crop-and-paste and, if so, locate the region where the crop-and-paste occurred. Since any arbitrarily shaped region can be divided into a concatenation of small square patches, we consider a square image patch composed of a number of 8 × 8 blocks as the fundamental unit in this work. To facilitate subsequent discussions, we first give some definitions. An image patch of square shape containing a number of 8 × 8 blocks can be classified into one of five categories, as shown in Figure 2 (from left to right, top to bottom):

(i)
Uncompressed: These patches are raw images and have never been JPEG compressed.

(ii)
Aligned single JPEG compressed (ASJPEG in short): When the uncompressed image patches undergo a JPEG compression with the block structure aligned to the image patch, the output image patches are called ASJPEG patches. They are usually referred to as single JPEG image patches in the literature.

(iii)
Aligned double JPEG compressed (ADJPEG in short): When the ASJPEG patches undergo another JPEG compression and the block structures of the two JPEG compressions are aligned with each other, the output image patches are called ADJPEG patches. They are usually referred to as double JPEG image patches in the literature.

(iv)
Shifted double JPEG compressed (SDJPEG in short): When the ASJPEG patches undergo another JPEG compression and the block structures of the two JPEG compressions are different, the output image patches are called SDJPEG patches.

(v)
Shifted single JPEG compressed (SSJPEG in short): Different from SDJPEG patches, when the image patches before the shifted JPEG compression (with the block structure shown in dashed lines) have never been compressed with the block structure starting from the top-left corner, the output image patches are called SSJPEG patches.
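The five categories can be reproduced with a toy compression pipeline. The sketch below is a deliberate simplification of real JPEG (grayscale only, one flat quantization step q instead of a quality-factor table, no entropy coding, border pixels outside full blocks left untouched); it only illustrates how the categories differ in compression history and grid alignment:

```python
import math
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II basis matrix (C @ C.T == identity)."""
    C = np.zeros((n, n))
    for k in range(n):
        for i in range(n):
            C[k, i] = math.cos(math.pi * (2 * i + 1) * k / (2 * n))
    C[0, :] *= math.sqrt(1.0 / n)
    C[1:, :] *= math.sqrt(2.0 / n)
    return C

def jpeg_like(img, q, shift=(0, 0)):
    """Toy grayscale 'JPEG': blockwise DCT plus uniform quantization
    with a single step q, applied on an 8x8 grid offset by
    `shift` = (rows, cols). Partial border blocks are skipped."""
    C = dct_matrix()
    out = img.astype(float).copy()
    sy, sx = shift
    for y in range(sy, img.shape[0] - 7, 8):
        for x in range(sx, img.shape[1] - 7, 8):
            D = C @ out[y:y + 8, x:x + 8] @ C.T
            out[y:y + 8, x:x + 8] = C.T @ (np.round(D / q) * q) @ C
    return np.clip(np.round(out), 0, 255)

rng = np.random.default_rng(0)
raw = rng.integers(0, 256, (64, 64)).astype(float)  # (i)   uncompressed
asjpeg = jpeg_like(raw, q=8)                        # (ii)  aligned single
adjpeg = jpeg_like(asjpeg, q=6)                     # (iii) aligned double
sdjpeg = jpeg_like(asjpeg, q=6, shift=(4, 4))       # (iv)  shifted double
ssjpeg = jpeg_like(raw, q=6, shift=(4, 4))          # (v)   shifted single
```

The only difference between the SDJPEG and SSJPEG constructions is whether the input to the shifted compression had already been compressed on the aligned grid.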
There are two general approaches to crop-and-paste tampering detection in JPEG images: aligned double JPEG detection and shifted double JPEG detection. Both approaches analyze an image patch within a window aligned with the JPEG block structure of the final image, hereafter denoted as an aligned patch, to detect any sign of tampering. Their differences and requirements are briefly summarized in Table 1. In most crop-and-paste tampering detection applications, the two kinds of approaches are applied together to provide a more robust detection result.
The underlying rationale of the aligned double JPEG detection approach is as follows. If the original image has been JPEG compressed before and undergoes another JPEG compression after crop-and-paste tampering, image patches in the unmodified region of the final JPEG image (such as aligned patch I in Figure 3a) will have been compressed twice with the same block structure and exhibit the aligned double JPEG effect, while image patches in the tampered region (such as aligned patch II in Figure 3a) will not show this effect. Algorithms in this approach look for untampered patches, i.e., they detect the presence of the aligned double JPEG effect to establish the authenticity of an image patch. If such an effect is absent, the patch is assumed to be tampered. It has been shown [8–11, 17] that if the quality factors of the original and final images are the same, the detection performance is not satisfactory. Recently, Huang et al. [18] proposed an aligned double JPEG detection algorithm that applies to the case where both the original and final images have the same quantization matrix. However, their approach cannot provide accurate detection results for small image patches. Finally, the aligned double JPEG detection approach cannot detect the tampered region if the original image is uncompressed.
In contrast, the shifted double JPEG detection-based approach [13–16] looks for tampered patches by detecting cues of tampering in an image patch. The basic idea is that if the inserted image region has been JPEG compressed earlier, the second JPEG compression will leave cues of shifted compression in the tampered region of the final image (such as the additional blocking artifacts [15, 16]). For aligned patches fully or partially located in the tampered region of the final image (such as aligned patch II in Figure 3a), such cues exist, while for aligned patches located in the unmodified region (such as aligned patch I in Figure 3a), such cues are absent. Hence, the tampered region can be located by searching for all the aligned patches where the cues left by shifted compression exist. However, such cues vary greatly with the image content and the quality factor of the inserted patch, and it is difficult to find a robust and discriminative feature for all situations. Moreover, this approach needs a large image patch to obtain robust detection results and is less effective for small patches.
In this paper, we propose a new shifted double JPEG detection method which also examines the characteristics of the inserted patch. Similar to [13, 14], our algorithm considers image patches that are not aligned with the block structure of the final image (such as non-aligned patches I and II in Figure 3b). When analyzing the non-aligned patches in the final image, SDJPEG patches (such as non-aligned patch II in Figure 3b, where the JPEG-compressed inserted patch undergoes a shifted JPEG compression during JPEG compression of the final image) are located in the tampered region, while SSJPEG patches (such as non-aligned patch I in Figure 3b) are located in the untampered region. Note that the JPEG compression of the final image is aligned for the aligned patches in Figure 3a and shifted for the non-aligned patches in Figure 3b. In our algorithm, an exhaustive search over the 63 possible locations of the non-aligned image patches is performed to detect tampering. We will show in the experimental section that the increase in computational cost is worthwhile and that our algorithm achieves much better detection performance than the existing methods.
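Since the offset of the second compression grid relative to the patch grid can be any of the 8 × 8 positions except the aligned one, the search space has 63 candidates. A minimal sketch of this enumeration follows; the `score_patch` hook is a hypothetical placeholder for a per-shift tampering score, not a function from the paper:

```python
# The second-compression grid can be offset from the patch grid by any
# (x_s, y_s) in {0,...,7} x {0,...,7} except (0, 0), the aligned case.
candidate_shifts = [(x, y) for x in range(8) for y in range(8)
                    if (x, y) != (0, 0)]

def scan_shifts(score_patch):
    """Return the candidate shift minimizing a tampering score.
    `score_patch` is a caller-supplied scoring function (placeholder)."""
    return min(candidate_shifts, key=score_patch)
```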
3 DCT coefficient analysis for SDJPEG patches
Since SDJPEG patches are generated by shifted JPEG compression on ASJPEG patches, we examine how the shifted JPEG compression affects the DCT coefficients. We first describe the effect of shifted JPEG compression on the DCT coefficients in Section 3.1. Then we derive a specific DCT coefficient model for SDJPEG patches in Section 3.2. The notation used hereafter is summarized in Table 2.
3.1 DCT coefficient variations caused by shifted JPEG compression
Before analyzing the effects on DCT coefficients caused by shifted JPEG compression, a simpler case, i.e., the effect caused by aligned JPEG compression, is discussed. Given an 8 × 8 image block A, its DCT coefficients can be represented by D^{O}(A) = [D^{O}_{m,n}(A)], 1 ≤ m,n ≤ 8. If block A undergoes an aligned JPEG compression (as shown in Figure 4a) with the quality factor QF, the resulting DCT coefficients of block A, denoted by D^{AJ}(A) = [D^{AJ}_{m,n}(A)], 1 ≤ m,n ≤ 8, will be multiples of the corresponding quantization steps. Hence, aligned JPEG compression induces a zero-mean quantization error (denoted by E^{AJ}_{m,n}(A)) for each DCT coefficient, i.e.,

D^{AJ}(A) = D^{O}(A) + E^{AJ}(A). (1)
The discussion above can be extended to the case of shifted JPEG compression. If block A undergoes a shifted JPEG compression with a coordinate shift of (x_S, y_S) and quality factor QF (as shown in Figure 4b), the DCT coefficients of block A become D^{SJ}(A) = [D^{SJ}_{m,n}(A)], 1 ≤ m,n ≤ 8. To derive the expression for D^{SJ}(A), we consider the DCT coefficients of the neighboring 8 × 8 aligned blocks whose block structure coincides with the block structure of the shifted JPEG compression on A. As shown in Figure 4c,d, block A is surrounded by four aligned blocks B_1, B_2, B_3, and B_4. According to [19], the DCT coefficients of block A and B_i (i = 1,2,3,4) are related by (details are given in the Appendix)

D(A) = Σ_{i=1}^{4} D(H_{i1}) D(B_i) D(H_{i2}), (3)
where H_{i1} and H_{i2} (i = 1,2,3,4) are the row and column translation matrices that translate a specific block from B_i to A_i, as shown in Figure 4. Note that D(H_{i1}) and D(H_{i2}) (i = 1,2,3,4) are determined only by the coordinate shift (x_S, y_S).
Let D^{O}(B_i) = [D^{O}_{m,n}(B_i)], 1 ≤ m,n ≤ 8, be the original DCT coefficients of block B_i (i = 1,2,3,4); thus, before compression, D^{O}(A) = Σ_{i=1}^{4} D(H_{i1}) D^{O}(B_i) D(H_{i2}) according to (3). After a JPEG compression is performed on these aligned blocks, D^{O}(B_i) becomes D^{AJ}(B_i), i = 1,2,3,4. Based on the analysis of aligned JPEG compression in (1), D^{AJ}(B_i) = D^{O}(B_i) + E^{AJ}(B_i), i = 1,2,3,4. The aligned JPEG compression on B_i is equivalent to a shifted JPEG compression on A. Hence, after compression, the DCT coefficients of block A change to

D^{SJ}(A) = Σ_{i=1}^{4} D(H_{i1}) D^{AJ}(B_i) D(H_{i2}) = D^{O}(A) + E^{SJ}(A), (4)
where E^{SJ}(A) = [E^{SJ}_{m,n}(A)] = Σ_{i=1}^{4} D(H_{i1}) E^{AJ}(B_i) D(H_{i2}) denotes the shifted quantization error caused by shifted JPEG compression. For any DCT mode (m,n), the shifted quantization error E^{SJ}_{m,n}(A) can be expressed as a linear combination of 4 × 64 = 256 zero-mean random variables, i.e., E^{SJ}_{m,n}(A) = Σ_{i=1}^{4} Σ_{u=1}^{8} Σ_{v=1}^{8} c^{i}_{m,n}(u,v) E^{AJ}_{u,v}(B_i), where c^{i}_{m,n}(u,v) is a weighting parameter determined by D(H_{i1}) and D(H_{i2}) (i = 1,2,3,4). According to the Central Limit Theorem (CLT), E^{SJ}_{m,n}(A) follows a zero-mean Gaussian distribution, denoted by G(0, σ^{SJ}_{m,n}), with standard deviation σ^{SJ}_{m,n}.
σ^{SJ}_{m,n} can be calculated with knowledge of the standard deviations of these 256 random variables (E^{AJ}_{u,v}(B_i), 1 ≤ i ≤ 4, 1 ≤ u,v ≤ 8) and the weighting coefficients c^{i}_{m,n}(u,v), 1 ≤ i ≤ 4, 1 ≤ u,v ≤ 8.
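The CLT argument can be checked numerically. The sketch below draws the 256 aligned quantization errors as independent zero-mean uniform variables (a simplification: the paper models the AC errors through a Laplacian fit of the coefficients) with purely illustrative weights and quantization steps, and compares the empirical standard deviation of the weighted sum against the closed-form value:

```python
import math
import random

def shifted_error_std(weights, steps):
    """Closed-form std of a weighted sum of independent zero-mean
    uniform quantization errors U[-q/2, q/2), each of variance q^2/12."""
    return math.sqrt(sum((w * q) ** 2 / 12.0 for w, q in zip(weights, steps)))

def simulate(weights, steps, n=20000, seed=1):
    """Monte-Carlo samples of E^SJ = sum_j c_j * E^AJ_j."""
    rnd = random.Random(seed)
    return [sum(w * rnd.uniform(-q / 2, q / 2) for w, q in zip(weights, steps))
            for _ in range(n)]

# Illustrative weights/steps (NOT derived from real translation
# matrices D(H_i)): 256 weights of alternating sign and decaying
# magnitude, quantization steps between 2 and 24.
weights = [((-1) ** j) / (1 + j % 16) for j in range(256)]
steps = [2 + (j % 12) * 2 for j in range(256)]

samples = simulate(weights, steps)
emp_std = math.sqrt(sum(s * s for s in samples) / len(samples))
```

With this many independent terms, the sample mean is close to zero and the empirical spread matches the closed-form prediction, consistent with the Gaussian model for E^{SJ}_{m,n}.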
To estimate the quantization error caused by aligned JPEG compression, we divide the DCT coefficients into two parts, i.e., the DC and AC components. For the DC component, since the quantization step is relatively small and the DC coefficients have a large dynamic range, the quantization error caused by aligned JPEG compression approximately follows the uniform distribution on [−q_{1,1}(QF)/2, q_{1,1}(QF)/2), where q_{1,1}(QF) is the quantization step for the DC coefficients at quality factor QF. For the AC components, according to [20], the AC coefficients of a natural image approximately follow a Laplacian distribution. If we fit the AC coefficients in the image patch to a Laplacian model, the standard deviations of E^{AJ}_{u,v}(B_i) can be calculated directly from the quality factor QF. In addition, from (3), the weighting coefficients c^{i}_{m,n}(u,v) are determined by the coordinate shift (x_S, y_S). The theoretical value of σ^{SJ}_{m,n} can then be calculated when QF and (x_S, y_S) are known.
To summarize, shifted JPEG compression induces a zero-mean Gaussian-distributed quantization error in all the DCT coefficients; the standard deviations of this shifted quantization error differ across DCT coefficients and can be calculated when the quality factor and the coordinate shift of the compression are known.
3.2 DCT coefficient analysis on SDJPEG patches
As discussed earlier, SDJPEG patches are constructed by performing shifted JPEG compression on ASJPEG patches. Similar to ASJPEG patches, the DCT coefficients of SDJPEG patches also follow certain specific distributions.
Given a grayscale image patch IMG consisting of a number of 8 × 8 blocks that has been aligned JPEG compressed with quality factor QF_1, let D^{AJ}_{m,n}(k) denote the (m,n)th DCT coefficient of the k-th block in IMG. Since IMG has been JPEG compressed with quality factor QF_1, D^{AJ}_{m,n}(k) is a multiple of the quantization step of the (m,n)th DCT mode (denoted by q_{m,n}(QF_1)). Hence, considering all the (m,n)th DCT coefficients in IMG, the normalized histogram h^{AJ}_{m,n}(f) of the (m,n)th DCT coefficients is the following impulse train (δ denotes the Dirac delta):

h^{AJ}_{m,n}(f) = Σ_{i=−N}^{N} w_i δ(f − i q_{m,n}(QF_1)), (5)
where w_i is the normalized frequency of the (m,n)th DCT coefficients having the value i × q_{m,n}(QF_1), and N q_{m,n}(QF_1) is the maximum absolute value of the (m,n)th DCT coefficient in IMG.
When the ASJPEG-compressed image patch IMG is transformed back to the spatial domain, two kinds of errors are introduced. One is the truncation error: since the luminance value of a grayscale image ranges from 0 to 255, any gray level greater than 255 or less than 0 is truncated to 255 or 0, respectively. However, as discussed in [21], this kind of error seldom appears in natural images (about 1% of image blocks have truncation errors), and any block containing pixels with luminance value 0 or 255 is discarded from further analysis. The other is the rounding error, i.e., a rounding process is carried out after the IDCT to ensure that the luminance values in the spatial domain are integers. This rounding error in the spatial domain leads to a bias (denoted by E^{rounding}_{m,n}) in all the DCT coefficients. Assuming that the rounding error in the spatial domain follows a zero-mean uniform distribution on [−0.5, 0.5) and considering that the DCT is unitary, E^{rounding}_{m,n} is Gaussian distributed with zero mean and variance 1/12 for all 1 ≤ m,n ≤ 8 according to the CLT [14].
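The variance-1/12 claim can be verified with a short simulation (variable names and sample size are ours):

```python
import random

# Spatial-domain rounding errors are modeled as i.i.d. U[-0.5, 0.5),
# whose variance is 1/12. Because the 2D DCT is unitary, each DCT
# coefficient of the 8x8 rounding-error block is a unit-norm weighted
# sum of 64 such errors, so it keeps variance 1/12 and, by the CLT,
# is approximately Gaussian.
rnd = random.Random(0)
n = 200000
samples = [rnd.uniform(-0.5, 0.5) for _ in range(n)]
var = sum(e * e for e in samples) / n  # empirical variance (mean is 0)
```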
After the DCT-to-spatial transformation, the image patch undergoes a shifted JPEG compression with quality factor QF_2 and coordinate shift (x_S, y_S). According to the analysis in Section 3.1, the shifted JPEG compression introduces a zero-mean Gaussian-distributed error term E^{SJ}_{m,n}(k). The (m,n)th DCT coefficient of the k-th block in the final SDJPEG patch, denoted by D^{SD}_{m,n}(k), is given by

D^{SD}_{m,n}(k) = D^{AJ}_{m,n}(k) + E^{rounding}_{m,n} + E^{SJ}_{m,n}(k) = D^{AJ}_{m,n}(k) + E^{SJR}_{m,n}(k), (6)
where E^{SJR}_{m,n}(k) is the overall error of the (m,n)th DCT coefficient incurred during the construction of the SDJPEG patch from the ASJPEG patch. Since E^{rounding}_{m,n} and E^{SJ}_{m,n}(k) are both Gaussian distributed and independent of each other, E^{SJR}_{m,n}(k) also follows a Gaussian distribution with zero mean and standard deviation σ^{SJR}_{m,n} = sqrt((σ^{SJ}_{m,n})^2 + 1/12). The histogram of the (m,n)th DCT coefficients after shifted double JPEG compression, denoted by h^{SD}_{m,n}(f), is then given by

h^{SD}_{m,n}(f) = Σ_{i=−N}^{N} w_i G(f − i q_{m,n}(QF_1), σ^{SJR}_{m,n}). (7)
Figure 5 illustrates the DCT coefficient histograms for SDJPEG and SSJPEG patches derived from the uncompressed 512 × 512 image ‘Lena.bmp’ (Figure 5 (c1)). From the figure, it can be observed that (i) the coefficient variations caused by rounding errors and shifted JPEG compression follow Gaussian distributions, so h^{SD}_{m,n}(f) can be approximated by (7); and (ii) the AC DCT coefficient distribution of the SSJPEG patch (shown in Figure 5 (c2 and c3)) follows the Laplacian distribution, which is similar to that of uncompressed images [22] and very different from that of the SDJPEG patches shown in Figure 5 (a2 and b2). Based on the above analysis, it can be concluded that SDJPEG patches can be differentiated from SSJPEG patches by analyzing the DCT coefficient distributions. In the following section, the SDJPEG detection method based on the proposed DCT coefficient model in (7) is elaborated.
4 Detection of image patch with SDJPEG compression
The analysis in Section 3 shows that the DCT coefficient distribution of SDJPEG patches follows a weighted summation of Gaussian components with the same standard deviation (as shown in (7) and Figure 5). However, to detect SDJPEG patches based on (7), two questions have to be addressed: (i) how to obtain discriminative features that capture the differences between the (m,n)th DCT coefficient distributions of SDJPEG and SSJPEG patches, and (ii) how to select the DCT modes that provide high discriminative power, since the differences between the DCT coefficient distributions of SDJPEG and SSJPEG patches vary across DCT modes. These questions are addressed in the following subsections.
4.1 Discriminative feature extraction based on the (m,n)th DCT coefficients
For a specific DCT mode, say the (m,n)th, detection is carried out by determining whether the observed histogram h_{m,n}(f) is similar to h^{SD}_{m,n}(f).
Given a specific quantization step q_{m,n}, we project the histogram h_{m,n}(f) onto the interval [−q_{m,n}/2, q_{m,n}/2), and the sum of the histogram function within the interval is defined as

sh_{m,n}(f) = Σ_{i=−N}^{N} h_{m,n}(f + i q_{m,n}), f ∈ [−q_{m,n}/2, q_{m,n}/2), (8)
where N q_{m,n} is the maximum absolute value of the (m,n)th DCT coefficient.
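The projection of the histogram onto one quantization period can be sketched empirically as follows (the function name and bin count are our choices):

```python
def folded_histogram(coeffs, q, nbins=33):
    """Fold the (m,n)-mode DCT coefficients into one quantization
    period [-q/2, q/2) and histogram them: an empirical analogue of
    sh_{m,n}(f). Coefficients of an SDJPEG patch cluster near 0
    (a Gaussian spread around multiples of q), while SSJPEG
    coefficients spread roughly evenly over the period."""
    folded = [((c + q / 2) % q) - q / 2 for c in coeffs]
    hist = [0] * nbins
    for f in folded:
        b = int((f + q / 2) / q * nbins)
        hist[min(b, nbins - 1)] += 1
    total = len(folded)
    return [h / total for h in hist]
```

For example, coefficients that are exact multiples of q all fold to 0 and land in the central bin, which is exactly the concentration that (8) makes visible for SDJPEG patches.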
According to (7), for SDJPEG-compressed image patches, h^{SD}_{m,n}(f) follows a weighted summation of Gaussian components with standard deviation σ^{SJR}_{m,n}. Hence, sh^{SD}_{m,n}(f) follows a specific distribution determined by σ^{SJR}_{m,n} and q_{m,n}. Based on the ratio q_{m,n}/σ^{SJR}_{m,n}, sh^{SD}_{m,n}(f) can be obtained as follows. It should be noted that in our discussion, the Gaussian distribution G(0, σ) is assumed to be bounded in [−3σ, 3σ], and any outliers are omitted.
If q_{m,n}/2 ≥ 3σ^{SJR}_{m,n}, i.e., all the Gaussian components in h^{SD}_{m,n}(f) are isolated, sh^{SD}_{m,n}(f) follows a Gaussian distribution G(0, σ^{SJR}_{m,n}) (as shown in Figure 6a), i.e.,

sh^{SD}_{m,n}(f) = G(f, σ^{SJR}_{m,n}), f ∈ [−q_{m,n}/2, q_{m,n}/2). (9)
If p × q_{m,n}/2 < 3σ^{SJR}_{m,n} ≤ (p + 1) × q_{m,n}/2, p = 1, 2, …, some Gaussian components overlap with each other, and sh^{SD}_{m,n}(f) follows the distribution of a mixture of (p + 1) Gaussian distributions (as shown in Figure 6b,c), where p + 1 = ceiling(3σ^{SJR}_{m,n} / (q_{m,n}/2)).
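The ceiling expression can be wrapped as a one-line helper (function name ours):

```python
import math

def num_gaussian_components(sigma_sjr, q):
    """p + 1 = ceiling(3*sigma^SJR / (q/2)): the number of Gaussian
    components mixed together in the folded SDJPEG histogram. A value
    of 1 means the components are isolated (p = 0)."""
    return math.ceil(3 * sigma_sjr / (q / 2))
```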
From (9), (10), and Figure 6, it is observed that for the isolated or slightly overlapping cases (p = 0 or p = 1), sh^{SD}_{m,n}(f) has a distinctive peak at small f. The peak becomes more prominent as q_{m,n}/σ^{SJR}_{m,n} increases. This phenomenon does not occur for SSJPEG patches. Figure 7 illustrates the sh_{m,n}(f) distributions of the image Lena.bmp of Figure 5 (c1) after SDJPEG and SSJPEG compression with various parameter settings.
From Figure 7, it is observed that the energy of sh_{m,n}(f) for SDJPEG image patches is concentrated in the central region, whereas the energy of sh_{m,n}(f) for SSJPEG patches is almost evenly distributed. It should be noted that when the image patch is small, e.g., 64 × 64, the total number of (m,n)th DCT coefficients is small (64 coefficients in total), and it is difficult to estimate the actual distribution of the patch accurately and robustly from such limited data. To mitigate this problem of inadequate data, a 1D feature similar to that in [23] is adopted to differentiate between SDJPEG and SSJPEG image patches:

s_{m,n} = Σ_{f ∈ R_2} sh_{m,n}(f) / Σ_{f ∈ R_1} sh_{m,n}(f), (11)
where R_1 = (−q_{m,n}/6, q_{m,n}/6) represents the central region and R_2 = (−q_{m,n}/2, −q_{m,n}/3) ∪ (q_{m,n}/3, q_{m,n}/2) represents the peripheral region.
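A count-based sketch of this central-versus-peripheral feature follows (the paper computes it from sh_{m,n}(f); the handling of interval endpoints here is our choice):

```python
def sdjpeg_feature(coeffs, q):
    """Peripheral-to-central mass ratio of DCT coefficients folded into
    one quantization period [-q/2, q/2). R1 = (-q/6, q/6) is the central
    region and R2 = (-q/2, -q/3) U (q/3, q/2) the peripheral one; both
    have total width q/3, so an evenly spread (SSJPEG-like) histogram
    yields a value near 1, while SDJPEG-like concentration near 0
    yields a small value."""
    folded = [((c + q / 2) % q) - q / 2 for c in coeffs]
    central = sum(1 for f in folded if -q / 6 < f < q / 6)
    peripheral = sum(1 for f in folded if q / 3 < abs(f) <= q / 2)
    return peripheral / central if central else float("inf")
```

Because the two regions have equal width, the feature needs no extra normalization for the evenly distributed case.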
For SDJPEG patches, the reference value of the feature, denoted by s^{SD}_{m,n}, can be derived from sh^{SD}_{m,n}(f) and is determined by σ^{SJR}_{m,n} and q_{m,n}. The reference value for an SSJPEG patch, denoted by s^{SS}_{m,n}, is obtained as follows. Since an SSJPEG patch has never been compressed with the block structure starting from the top-left corner of the image patch, its DCT coefficients can be assumed to be distributed approximately as those of the original uncompressed image patch [8, 14]. Moreover, h^{SS}_{m,n}(f) is equivalent to the distribution of the quantization error of the (m,n)th DCT component with quantization step q_{m,n}. Hence, h^{SS}_{m,n}(f) can be approximated using the aligned JPEG quantization error estimation approach introduced in Section 3.1, and s^{SS}_{m,n} can then be derived from (8) and (11). Note that for the low-frequency DCT modes, the quantization step q_{m,n} is relatively small compared with the dynamic range of the DCT coefficients, so sh^{SS}_{m,n}(f) is almost evenly distributed and s^{SS}_{m,n} ≈ 1 > s^{SD}_{m,n}.
The final discriminative feature indicating the likelihood that an image patch has been SDJPEG compressed is derived by normalizing the extracted feature with the reference values s_{m,n}^{SD} and s_{m,n}^{SS}, i.e., sn_{m,n} = (s_{m,n} − s_{m,n}^{SD}) / (s_{m,n}^{SS} − s_{m,n}^{SD}). Note that sn_{m,n} is small for SDJPEG patches and large for SSJPEG patches.
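To make the construction concrete, the following sketch computes a region-based feature from the coefficient residuals and normalizes it. The exact form of s_{m,n} in (8) and (11) is not reproduced in this excerpt, so the peripheral-to-central mass ratio used here is an illustrative assumption; only the normalization sn_{m,n} = (s_{m,n} − s^{SD}_{m,n})/(s^{SS}_{m,n} − s^{SD}_{m,n}) is taken directly from the text.

```python
def region_masses(residuals, q):
    """Fraction of DCT residuals (values re-centred to [-q/2, q/2)) falling
    in the central region R1 = (-q/6, q/6) and the peripheral region
    R2 = (-q/2, -q/3) U (q/3, q/2)."""
    n = len(residuals)
    central = sum(1 for f in residuals if -q / 6 < f < q / 6) / n
    peripheral = sum(1 for f in residuals
                     if -q / 2 < f < -q / 3 or q / 3 < f < q / 2) / n
    return central, peripheral

def s_feature(residuals, q):
    """Hypothetical concrete form of s_{m,n}: peripheral-to-central mass
    ratio.  Near 0 when the residuals cluster around 0 (SDJPEG-like),
    near 1 when the residuals are almost uniform (SSJPEG-like)."""
    central, peripheral = region_masses(residuals, q)
    return peripheral / central if central > 0 else float('inf')

def sn_normalised(s, s_sd, s_ss):
    """Normalized discriminative feature from Section 4.1:
    sn_{m,n} = (s_{m,n} - s^SD_{m,n}) / (s^SS_{m,n} - s^SD_{m,n})."""
    return (s - s_sd) / (s_ss - s_sd)
```

Residuals clustered near zero give a small s (SDJPEG-like), while nearly uniform residuals give s ≈ 1 (SSJPEG-like), matching the ordering s_{m,n}^{SS} ≈ 1 > s_{m,n}^{SD} noted above.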
4.2 Discriminative feature extraction based on all the DCT coefficients
The analysis in Section 4.1 shows that sn_{m,n} for SDJPEG image patches is smaller than that for SSJPEG patches. This phenomenon becomes more prominent as q_{m,n}/σ_{m,n}^{SJR} increases, and thus the DCT components with larger q_{m,n}/σ_{m,n}^{SJR} are more discriminative for SDJPEG detection. Our analysis in Section 3 shows that for a specific image, σ_{m,n}^{SJR} can be estimated theoretically when QF_{2} and the shift coordinates (x_{S}, y_{S}) are known. Figure 8 gives some examples of σ_{m,n}^{SJR} (1 ≤ m, n ≤ 8) for different QF_{2} and (x_{S}, y_{S}).
Figure 8 shows the following: (i) Shifted JPEG compression with lower quality (QF_{2}) introduces a larger spread to the DCT coefficients. (ii) σ_{m,n}^{SJR} varies from one DCT mode to another and is not proportional to the corresponding quantization step. For instance, in Figure 8 (b2), the quantization steps for modes (2,1) and (2,2) are the same while σ_{2,1}^{SJR} and σ_{2,2}^{SJR} are quite different. (iii) Even for the same DCT mode, the variation caused by shifted JPEG compression does not remain constant across coordinate shifts; e.g., for the (2,3)th DCT coefficient, σ_{2,3}^{SJR} is quite different when (x_{S}, y_{S}) changes between (4,4) and (1,6). Hence, σ_{m,n}^{SJR} can be obtained by a table lookup given QF_{2} and (x_{S}, y_{S}). To obtain a large value of q_{m,n}/σ_{m,n}^{SJR}, smaller values of σ_{m,n}^{SJR} or larger values of q_{m,n} are preferred. For any value of the quality factor, Q = [q_{m,n}] = λQ_{default}, where Q_{default} is the default quantization table defined by the Independent JPEG Group (IJG) and λ is a constant determined by the quality factor. Hence, we have q_{m,n}/σ_{m,n}^{SJR} = λ × q_{default}(m,n)/σ_{m,n}^{SJR} = λ × dis_{m,n} (1 ≤ m, n ≤ 8). It should be noted that in practice only QF_{2} is known; we will show how QF_{1} can be estimated later.
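Under this table-lookup view, selecting the candidate DCT modes reduces to ranking dis_{m,n} = q_{default}(m,n)/σ_{m,n}^{SJR} over the low-frequency modes. A minimal sketch, assuming σ^{SJR} and Q_{default} are available as 8 × 8 tables (the synthetic tables in the usage below are illustrative, not real JPEG data):

```python
def top_discriminative_modes(sigma_sjr, q_default, n_c=3):
    """Rank the low-frequency DCT modes (1-based indices with m + n < 8)
    by dis_{m,n} = q_default(m,n) / sigma^SJR_{m,n} and return the n_c
    most discriminative modes.  Both inputs are 8x8 tables stored as
    lists of lists with 0-based indexing."""
    dis = [((m + 1, n + 1), q_default[m][n] / sigma_sjr[m][n])
           for m in range(8) for n in range(8)
           if (m + 1) + (n + 1) < 8 and sigma_sjr[m][n] > 0]
    dis.sort(key=lambda item: item[1], reverse=True)   # descending order
    return [mode for mode, _ in dis[:n_c]]
```

For example, with a uniform σ^{SJR} table the ranking is driven entirely by the quantization steps, so the modes with the largest steps among the low frequencies are selected first.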
The discriminative feature extraction considering all the DCT coefficients runs as follows:

1.
Input the prior information: the image patch, the quality factor of the final compression QF_{2}, and the coordinate shift (x_{S}, y_{S}) (given by an exhaustive enumeration over the 63 possible coordinate shifts; see Section 4.3).

2.
According to the image patch, QF_{2}, and (x_{S}, y_{S}), estimate σ^{SJR} = [σ_{m,n}^{SJR}], the spread introduced by the shifted double JPEG compression. Calculate the discriminative table DIS = [dis(m,n)] for the low-frequency modes (where m + n < 8).

3.
Sort all the DCT modes by their discriminative value in descending order and select the first N_{c} (N_{c} = 3 in our experiment) components to construct the candidate set, i.e., {(m_{1},n_{1}), (m_{2},n_{2}), …, (m_{N_c},n_{N_c})}. Estimate the quantization step q_{m_i,n_i}(QF_{1}) (for the unknown QF_{1}) of these DCT modes by analyzing their histograms. It should be noted that if the coefficients of a DCT mode in the candidate set have too many zero values (>80% of the total number of coefficients), the quantization step cannot be estimated accurately. Hence, considering the quantization noise caused by shifted compression, any mode (m_{i}, n_{i}) whose coefficients are concentrated in the range [−3σ_{m_i,n_i}^{SJR}, 3σ_{m_i,n_i}^{SJR}], i.e., ∫_{−3σ_{m_i,n_i}^{SJR}}^{3σ_{m_i,n_i}^{SJR}} h_{m_i,n_i}^{SD}(f) df ≥ 80%, is discarded and replaced with the mode with the next largest discriminative value outside the candidate set.

4.
For the ith mode (m_{i},n_{i}) in the candidate set, extract the normalized discriminative feature sn_{m_i,n_i} using the approach described in Section 4.1 with the estimated σ_{m_i,n_i}^{SJR} and q_{m_i,n_i}(QF_{1}). The final discriminative feature for the image patch, denoted by sn_{all}, is obtained by averaging sn_{m_i,n_i} over 1 ≤ i ≤ N_{c}.

In step 3, the quantization step q_{m,n}(QF_{1}) could be obtained by an exhaustive search over the reference SDJPEG histograms sh_{m,n}^{SD}(f) for the one most similar to sh_{m,n}(f). However, this approach is computationally expensive. Since h_{m,n}^{SD}(f) exhibits a periodic-like pattern with a period of q_{m,n}(QF_{1}), to reduce the complexity, an approach similar to that in [17] is adopted: the fast Fourier transform is applied to the histogram h_{m,n}^{SD}(f), and the peak of the spectrum with the DC component removed is used to estimate the quantization step q_{m,n}(QF_{1}) of the (m,n)th DCT mode. With q_{m,n}(QF_{1}) estimated, the quality factor QF_{1} can be estimated by comparing q_{m,n}(QF_{1}) with the default quantization table Q_{default}. To improve the robustness, all q_{m,n}(QF_{1}) values in the candidate set are estimated, the median of the predicted QF_{1} values is taken as the quality factor of the first compression, and all q_{m,n}(QF_{1}) values in the candidate set are refined using the estimated quality factor.
4.3 Crop-and-paste image tampering detection by detecting the SDJPEG effects
In order to detect crop-and-paste image tampering, the JPEG image is divided into a series of B × B subimages. Each B × B subimage is examined to detect whether it contains any image patch exhibiting SDJPEG effects, as follows:

(i)
For a specific coordinate shift (x_{S}, y_{S}) (0 ≤ x_{S}, y_{S} ≤ 7 and (x_{S}, y_{S}) ≠ (0, 0)), crop an image patch IMG_{x_S,y_S} of size (B − 8) × (B − 8) from the subimage, starting from (x_{S}, y_{S}).

(ii)
With (x _{S}, y _{S}) and QF_{2}, extract the discriminative feature sn _{all} for {\mathrm{IMG}}_{{\mathit{x}}_{\mathrm{S}},{\mathit{y}}_{\mathrm{S}}}. Then the SDJPEG effect map (SEM) for (x _{S}, y _{S}) is set to sn _{all}, i.e., SEM (x _{S}, y _{S}) = sn _{all}.

(iii)
Repeat steps (i) and (ii) for all the 63 possible coordinate shifts to obtain the SEM for the B × B subimage.

(iv)
Compare the SEM with those of the positive (containing SDJPEG patches) and negative (not containing any SDJPEG patch) samples in the training database to detect whether the subimage has been tampered with. Fisher's linear discriminant analysis (LDA) [24] is adopted as the classifier in our approach.

(v)
Loop through steps (i) to (iv) for all B × B subimages in the image to detect the suspicious regions that might have been tampered with.
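Steps (i) to (iii) above can be sketched as a loop over the 63 coordinate shifts. Here `extract_sn_all` stands in for the feature extraction of Section 4.2 and is an assumed interface, not code from the paper:

```python
def sem_for_subimage(subimage, B, qf2, extract_sn_all):
    """Build the 8x8 SDJPEG effect map (SEM) for one B x B subimage.
    `extract_sn_all(patch, shift, qf2)` is assumed to return sn_all for
    a cropped patch; `subimage` is a 2-D list of pixel values."""
    sem = [[None] * 8 for _ in range(8)]
    for xs in range(8):
        for ys in range(8):
            if (xs, ys) == (0, 0):
                continue                      # 63 shifts; (0,0) is excluded
            # crop a (B-8) x (B-8) patch starting at (xs, ys)
            patch = [row[ys:ys + B - 8] for row in subimage[xs:xs + B - 8]]
            sem[xs][ys] = extract_sn_all(patch, (xs, ys), qf2)
    return sem
```

The resulting SEM is then compared against the SEMs of the positive and negative training samples, as in step (iv).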
5 Experiments and discussion
In this section, we first investigate several key components of the proposed approach, i.e., the quantization error estimation for shifted JPEG compression, the primary quality factor estimation, and the proposed discriminative feature for SDJPEG detection. Then, we analyze the SDJPEG detection performance for image patches of various sizes and compare the detection performance with four state-of-the-art SDJPEG detection algorithms, i.e., Luo et al.'s [15] (Luo's for short), Chen and Hsu's [16] (Chen's for short), Bianchi and Piva's [13] (Bianchi I's for short), and Bianchi and Piva's [14] (Bianchi II's for short). Finally, we compare the image tampering detection performance of all five algorithms on two example images. The images in our experiments come from two widely used image databases, i.e., the UCID [25] and NRCS [26] datasets, and all the original images are uncompressed.
5.1 Effectiveness evaluation of the proposed approach
5.1.1 Quantization error estimation for shifted JPEG compression
In order to evaluate the effectiveness of the proposed approach for estimating σ^{SJR} = [σ_{m,n}^{SJR}], 10,000 image patches of size 256 × 256 are randomly collected from the uncompressed images in the databases [25, 26]. Half of the image patches are used to generate SSJPEG patches by shifted JPEG compression with the quality factor QF_{2} randomly picked from 50 to 90 and (x_{S}, y_{S}) selected at random ((x_{S}, y_{S}) ≠ (0, 0)). The other half are used to generate SDJPEG patches by aligned JPEG compression with quality factor QF_{1} randomly picked from 50 to 90, followed by shifted JPEG compression with quality factor QF_{2} randomly picked from 50 to 90 and (x_{S}, y_{S}) selected at random ((x_{S}, y_{S}) ≠ (0, 0)). For each image patch, the actual standard deviation of the quantization error caused by shifted JPEG compression (denoted by σ^{ACT-SJR} = [σ_{m,n}^{ACT-SJR}]) is recorded as the ground truth, and the average relative estimation error is adopted to evaluate the estimation performance.
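The displayed equation defining the average relative estimation error did not survive extraction. A plausible reconstruction, consistent with the surrounding definitions (the relative error of σ_{m,n}^{SJR} against the ground truth σ_{m,n}^{ACT-SJR}, averaged over all 64 DCT modes and all N_p patches; the averaging convention here is our assumption), is:

```latex
\eta_{\mathrm{SJR}}
  = \frac{1}{64\,N_{\mathrm{p}}}
    \sum_{p=1}^{N_{\mathrm{p}}} \sum_{m=1}^{8} \sum_{n=1}^{8}
    \frac{\left|\sigma_{m,n}^{\mathrm{ACT\text{-}SJR}}(p)
              - \sigma_{m,n}^{\mathrm{SJR}}(p)\right|}
         {\sigma_{m,n}^{\mathrm{ACT\text{-}SJR}}(p)}
    \times 100\%
```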
The average relative estimation error η_{SJR} for the above image patches using the proposed approach is 10.61%, which is much less than that using the rough approximation in [13, 14] (214.94%). Hence, the proposed DCT coefficient model can better describe the effects caused by shifted JPEG compression.
5.1.2 Primary quality factor estimation
To show the effectiveness of the proposed quantization step estimation approach, the following experiments have been carried out. For a specific patch size, QF_{1} and QF_{2}, 500 SDJPEG patches are generated with random selection of (x_{S}, y_{S}) ((x_{S}, y_{S}) ≠ (0, 0)). The average estimation accuracy is given in Table 3.
From the table, it is observed that the estimation performance improves as the patch size increases. This is because, for a larger image patch, more data are collected to construct the histogram of the DCT coefficients and the period estimation approach is more robust. On the other hand, the estimation performance also improves as QF_{2} − QF_{1} increases. This phenomenon can be explained as follows. From (7) and Figure 5 (a2 and b2), for SDJPEG image patches, the period in the histogram (i.e., the primary quantization step q_{m,n}(QF_{1})) is large for a low quality factor QF_{1}, and the standard deviation of the quantization noise with rounding error, i.e., σ_{m,n}^{SJR}, is small for a high quality factor QF_{2}. Hence, when QF_{2} − QF_{1} is large, i.e., QF_{2} − QF_{1} ≥ 10, the Gaussian impulses in the histogram are non-overlapping or only slightly overlapping and the period can be estimated accurately. However, when QF_{2} − QF_{1} is small, i.e., QF_{2} − QF_{1} ≤ −20, the Gaussian impulses in the histogram overlap heavily and the periodic pattern almost disappears.
5.1.3 Effectiveness of the proposed discriminative feature
In order to investigate the effectiveness of the proposed discriminative feature sn_{all}, the distributions of sn_{all} for SSJPEG and SDJPEG patches with the size 256 × 256 are analyzed. For SSJPEG patches, the quality factor QF_{2} is set to 70. For SDJPEG patches, the second compression quality factor QF_{2} is also set to 70 and the primary compression quality factor QF_{1} is randomly picked from 50 to 70. The coordinate shift (x_{S}, y_{S}) is also selected randomly in the range of 0 ≤ x_{S}, y_{S} ≤ 7, (x_{S}, y_{S}) ≠ (0, 0). The histograms of sn_{all} in SSJPEG and SDJPEG patches are given in Figure 9. It is observed that sn_{all} is usually small for SDJPEG patches while it is large for SSJPEG ones, which demonstrates that sn_{all} is effective in differentiating SDJPEG and SSJPEG patches.
5.2 SDJPEG patch detection
To assess the performance of SDJPEG effects detection, we prepare the dataset as follows:

i.
Image patches of three sizes, i.e., 64 × 64, 128 × 128, and 256 × 256, are used.

ii.
A total of 25 {QF_{1},QF_{2}} pairs are investigated, with QF_{1},QF_{2} ∈ {50,60,70,80,90}.

iii.
For a specific {QF_{1},QF_{2}} pair, 10,000 image patches are randomly extracted from the uncompressed image database. The ‘positive’ samples are constructed by performing shifted JPEG compression with quality factor of QF_{1} and random coordinate shifts (x _{S}, y _{S}) (0 ≤ x _{S}, y _{S} ≤ 7 and (x _{S}, y _{S}) ≠ (0, 0)) and then saving the image patches in JPEG format with QF_{2}. The ‘negative’ samples are constructed by directly saving the same uncompressed image patches in JPEG with QF_{2}.
Since in practice only QF_{2} is available and QF_{1} is unknown to an algorithm, the following are the settings for the different algorithms evaluated:

1)
For Luo's [15] and Chen's [16] approaches, the SVM is adopted as the classifier. In the training procedure, since QF_{1} is unknown, a specific SVM is trained for each QF_{2} with the features extracted from half of the positive samples obtained from all possible selections of QF_{1} ∈ {50,60,70,80,90} and the corresponding negative samples (25,000 positive and 25,000 negative samples in total). In the testing procedure, the test samples are classified by their corresponding classifier with the knowledge of QF_{2}.

2)
For Bianchi II's [14] approach, with the knowledge of QF_{2}, QF_{1} is estimated by an EM algorithm. Then the tampered-region likelihood is calculated for each 8 × 8 block in the image patch, and the investigated image patch is classified as an SDJPEG patch if half of the 8 × 8 blocks in the patch have a likelihood greater than 1. For Bianchi I's [13] approach, with the knowledge of QF_{2}, QF_{1} is obtained by an exhaustive search.

3)
For the proposed approach, QF_{1} is estimated by the method introduced in Section 4.2. For each QF_{2}, SEM features for 630 positive samples (ten samples for each possible (x _{S}, y _{S}) with QF_{1} randomly selected) and 630 negative samples are adopted to train the classifier, and the rest (about 48,740 samples) are adopted for testing.
The detection performance obtained by our algorithm compared with the four existing methods is given in Tables 4, 5, and 6. For all the methods investigated, the classification performance is affected by two key parameters: the patch size and the difference in quality factor between the second and first JPEG compressions, i.e., QF_{2} − QF_{1}. From the tables, the following observations can be made.
First, the recognition accuracy increases substantially with the increase of patch size. As the features used in all five methods are based on statistical models, more data leads to better estimation of the models. Among all five approaches, our algorithm always outperforms the others especially when the patch size is small (64 × 64 for instance) since it requires less data to estimate a discriminative model. Since only one or several small image patches are modified in most image tampering detection tasks, our algorithm is more appropriate in such applications.
Second, the detection performance improves with the increase of QF_{2} − QF_{1}. This is because a lower quality factor of the first JPEG compression leaves more traces of the compression history, and a higher quality factor of the second JPEG compression introduces less distortion to the image, which makes the traces left by the first compression easier to detect. The proposed approach achieves reasonable accuracies (above 80%) when QF_{2} − QF_{1} ≥ 10 and high accuracies (above 90%) when QF_{2} − QF_{1} ≥ 20 for 128 × 128 and 256 × 256 patches. However, when QF_{2} − QF_{1} ≤ −20, the detection performance is very poor for all approaches, i.e., the classification accuracies are close to that of a random guess (50%). The poor performance of the proposed approach is to be expected because when QF_{2} < QF_{1}, q_{m,n}/σ_{m,n}^{SJR} is very small for all the DCT modes and s_{m,n}^{SD} is close to s_{m,n}^{SS}. In addition, as shown in Section 5.1.2, with small values of q_{m,n}/σ_{m,n}^{SJR}, all the Gaussian components of h_{m,n}^{SD}(f) in (10) overlap heavily with each other, and the estimation of QF_{1} is not accurate, especially when the patch size is small.
5.3 Cropandpaste image tampering detection
Figure 10 illustrates the tampering detection results using our approach compared with two recent state-of-the-art approaches [13, 14]. The forgery image is made by cropping an image patch (enclosed by the highlighted boundary in Figure 10 (b)) from another JPEG image with quality factor QF_{1}, pasting it into the original image, and saving the result as a JPEG file with quality factor QF_{2}. Three settings of {QF_{1},QF_{2}}, i.e., {50,50}, {50,70}, and {50,90}, are investigated, and the detected tampered region is highlighted in the figures. Note that for all three approaches, the corresponding parameters are adjusted to ensure that the false alarm rate (FAR) is less than 10%. Moreover, the detection results undergo 3 × 3 median filtering to remove isolated detection errors. It is observed that the proposed algorithm can accurately detect and locate the tampered region when {QF_{1},QF_{2}} = {50,70} and {50,90}, and it also points out some suspicious regions when {QF_{1},QF_{2}} = {50,50}. Compared with Bianchi and Piva's approaches [13, 14], the proposed approach provides more accurate detection results due to (i) better estimation of the quantization error caused by shifted JPEG compression and (ii) the adaptive DCT mode selection of Section 4.1, which exploits more discriminative information than the DC-mode-only analysis in [13].
6 Conclusions
In this paper, a new JPEG image tampering detection algorithm based on SDJPEG detection has been presented. The DCT coefficient distribution of SDJPEG patches has been expressed by a statistical model, and a discriminative feature has been proposed which can effectively differentiate between SDJPEG and SSJPEG patches. An adaptive DCT mode selection scheme selects several highly discriminative DCT modes. We used several thousand SDJPEG and SSJPEG patches extracted from the UCID and NRCS image databases to evaluate the performance of the proposed algorithm along with several existing algorithms. The experimental results show that the proposed algorithm achieves much better results than the existing algorithms, especially when the patch size is small. We also performed experiments to detect and locate the tampered regions of two test images, where the proposed algorithm consistently outperformed the other algorithms investigated. The proposed algorithm thus provides a new and effective solution for SDJPEG detection and JPEG image forensics.
Appendix
Derivation of DCT coefficient from adjacent blocks
From Figure 4d, it is observed that block A can be decomposed into four non-overlapping regions, i.e., A_{1}, A_{2}, A_{3}, and A_{4}, with A = A_{1} + A_{2} + A_{3} + A_{4}. In order to derive A_{i} = H_{i1}B_{i}H_{i2}, we demonstrate only the case i = 4; the same analysis applies to i = 1, 2, 3 (Figure 11).
For the bottom-right corner, A_{4} is related to B_{4} by A_{4} = H_{41}B_{4}H_{42}, where

\[ H_{41} = \begin{bmatrix} 0 & 0 \\ I_h & 0 \end{bmatrix}, \qquad H_{42} = \begin{bmatrix} 0 & I_w \\ 0 & 0 \end{bmatrix}. \]

Here I_{h} and I_{w} are identity matrices of size h × h and w × w, respectively, where h and w are the numbers of rows and columns extracted. Since all unitary orthogonal transforms such as the DCT are distributive over matrix multiplication [19], we have

\[ D(A_4) = D(H_{41})\, D(B_4)\, D(H_{42}) \;\Rightarrow\; D(A) = \sum_{i=1}^{4} D(A_i) = \sum_{i=1}^{4} D(H_{i1})\, D(B_i)\, D(H_{i2}). \]
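The distributivity claim can be checked numerically. The sketch below builds an orthonormal DCT-II matrix T (so that D(M) = T M Tᵀ) and constructs H_{41} and H_{42}; it is an illustrative verification under these definitions, not the paper's code:

```python
import math

N = 8  # JPEG block size

def dct_matrix(n):
    """Orthonormal DCT-II matrix T, so that D(M) = T @ M @ T^T."""
    return [[math.sqrt((1 if k == 0 else 2) / n)
             * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
             for i in range(n)] for k in range(n)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

T = dct_matrix(N)

def dct2(m):
    """2-D DCT of an N x N matrix."""
    return matmul(matmul(T, m), transpose(T))

def shift_matrices(h, w):
    """H41 moves the top h rows to the bottom; H42 moves the left w
    columns to the right, so H41 @ B4 @ H42 places the top-left h x w
    corner of B4 into the bottom-right corner of the block."""
    h41 = [[1 if (i >= N - h and j == i - (N - h)) else 0
            for j in range(N)] for i in range(N)]
    h42 = [[1 if (j >= N - w and i == j - (N - w)) else 0
            for j in range(N)] for i in range(N)]
    return h41, h42
```

Because T is orthonormal (Tᵀ T = I), D(H_{41}) D(B_4) D(H_{42}) = T H_{41} B_4 H_{42} Tᵀ = D(H_{41} B_4 H_{42}), which is exactly the identity used in the derivation above.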
References
Wang X, Xue J, Zheng Z, Liu Z, Li N: Image forensic signature for content authenticity analysis. J. Vis. Commun. Image Represent. 2012, 23(5):782–797. 10.1016/j.jvcir.2012.03.005
Swaminathan A, Mao Y, Wu M: Robust and secure image hashing. IEEE Trans. Inf. Forensics Sec. 2006, 1(2):215–230. 10.1109/TIFS.2006.873601
Phadikar A, Maity SP, Mandal M: Novel wavelet-based QIM data hiding technique for tamper detection and correction of digital images. J. Vis. Commun. Image Represent. 2012, 23(3):454–466. 10.1016/j.jvcir.2012.01.005
Farid H: A survey of image forgery detection. IEEE Signal Process. Mag. 2009, 26(2):16–25.
Popescu AC: Statistical tools for digital image forensics. Ph.D. thesis, Department of Computer Science, Dartmouth College, Hanover, NH; 2005.
Popescu AC, Farid H: Exposing digital forgeries in color filter array interpolated images. IEEE Trans. Signal Process. 2005, 53(10):3948–3959.
Chen L, Lu W, Ni J, Sun W, Huang J: Region duplication detection based on Harris corner points and step sector statistics. J. Vis. Commun. Image Represent., in press.
Lukas J, Fridrich J: Estimation of primary quantization matrix in double compressed JPEG images. In Proceedings of the Digital Forensic Research Workshop, Cleveland; 2003:5–8.
Fu D, Shi YQ, Su Q: A generalized Benford's law for JPEG coefficients and its applications in image forensics. In Proceedings of SPIE Electronic Imaging, Security and Watermarking of Multimedia Contents IX, vol. 6505. San Jose, CA; 2007.
Pevny T, Fridrich J: Detection of double-compression in JPEG images for applications in steganography. IEEE Trans. Inf. Forensics Sec. 2008, 3(2):247–258.
Farid H: Exposing digital forgeries from JPEG ghosts. IEEE Trans. Inf. Forensics Sec. 2009, 4(1):154–160.
Ye S, Sun Q, Chang EC: Detecting digital image forgeries by measuring inconsistencies of blocking artifact. In Proceedings of the IEEE International Conference on Multimedia and Expo, Beijing, China; 2007:12–15.
Bianchi T, Piva A: Detection of nonaligned double JPEG compression based on integer periodicity maps. IEEE Trans. Inf. Forensics Sec. 2012, 7(2).
Bianchi T, Piva A: Image forgery localization via block-grained analysis of JPEG artifacts. IEEE Trans. Inf. Forensics Sec. 2012, 7(3).
Luo W, Qu Z, Huang J, Qiu G: A novel method for detecting cropped and recompressed image block. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP'07), Honolulu, Hawaii; 2007:217–220.
Chen YL, Hsu CT: Detecting recompression of JPEG images via periodicity analysis of compression artifacts for tampering detection. IEEE Trans. Inf. Forensics Sec. 2011, 6(2):396–406.
Lin Z, He J, Tang X, Tang CK: Fast, automatic and fine-grained tampered JPEG image detection via DCT coefficient analysis. Pattern Recogn. 2009, 42(11):2492–2501. 10.1016/j.patcog.2009.03.019
Huang F, Huang J, Shi YQ: Detecting double JPEG compression with the same quantization matrix. IEEE Trans. Inf. Forensics Sec. 2010, 5(4):848–856.
Chang SF, Messerschmitt DG: Manipulation and compositing of MC-DCT compressed video. IEEE J. Sel. Areas Commun. 1995, 13(1):1–11. 10.1109/49.363151
Reininger R, Gibson J: Distributions of the two-dimensional DCT coefficients for images. IEEE Trans. Commun. 1983, 31(6):835–839. 10.1109/TCOM.1983.1095893
Fan Z, de Queiroz RL: Identification of bitmap compression history: JPEG detection and quantizer estimation. IEEE Trans. Image Process. 2003, 12(2):230–235. 10.1109/TIP.2002.807361
Popescu AC, Farid H: Statistical tools for digital forensics. Lect. Notes Comput. Sci. 2005, 3200:395–407.
Luo W, Huang J, Qiu G: JPEG error analysis and its application to digital image forensics. IEEE Trans. Inf. Forensics Sec. 2010, 5(3):480–491.
Duda RO, Hart PE, Stork DG: Pattern Classification. 2nd edition. Wiley-Interscience, USA; 2000.
Schaefer G, Stich M: UCID: an uncompressed colour image database. In Proceedings of SPIE Storage and Retrieval Methods and Applications for Multimedia, vol. 5307. San Jose, CA; 2003:472–480.
NRCS Photo Gallery, 2005. http://photogallery.nrcs.usda.gov/res/sites/photogallery/
Acknowledgements
The work described in this paper is supported by National Key Basic Research Program of China No. 2013CB329603 and NSFC Fund No. 61271319 and No. 61071152.
Author information
Additional information
Competing interests
The authors declare that they have no competing interests.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
About this article
Cite this article
Wang, SL., Liew, A.WC., Li, SH. et al. Detection of shifted double JPEG compression by an adaptive DCT coefficient model. EURASIP J. Adv. Signal Process. 2014, 101 (2014). https://doi.org/10.1186/168761802014101
DOI: https://doi.org/10.1186/168761802014101
Keywords
 Digital image forensics
 Shifted double JPEG compression
 JPEG coefficient analysis
 Image splicing detection