# Detection of shifted double JPEG compression by an adaptive DCT coefficient model

Shi-Lin Wang^{1} (email author), Alan Wee-Chung Liew^{2}, Sheng-Hong Li^{3}, Yu-Jin Zhang^{3} and Jian-Hua Li^{1}

**2014**:101

https://doi.org/10.1186/1687-6180-2014-101

© Wang et al.; licensee Springer. 2014

**Received: **9 January 2014

**Accepted: **13 June 2014

**Published: **5 July 2014

## Abstract

In many JPEG image splicing forgeries, the tampered image patch has been JPEG-compressed twice with different block alignments. Such phenomenon in JPEG image forgeries is called the shifted double JPEG (SDJPEG) compression effect. Detection of SDJPEG-compressed patches could help in detecting and locating the tampered region. However, the current SDJPEG detection methods do not provide satisfactory results especially when the tampered region is small. In this paper, we propose a new SDJPEG detection method based on an adaptive discrete cosine transform (DCT) coefficient model. DCT coefficient distributions for SDJPEG and non-SDJPEG patches have been analyzed and a discriminative feature has been proposed to perform the two-class classification. An adaptive approach is employed to select the most discriminative DCT modes for SDJPEG detection. The experimental results show that the proposed approach can achieve much better results compared with some existing approaches in SDJPEG patch detection especially when the patch size is small.

## 1 Introduction

With the rapid development of image processing tools, manipulating a digital image without leaving obvious visual traces is becoming easier and easier. The detection of malicious tampering and the verification of the credibility of the original digital image have become important research topics.

Some researchers have proposed digital watermarking as an image/video content authentication technique [1–3]. However, these ‘active authentication’ methods have not been widely used because most images on the Internet are not watermarked. Moreover, there is also the challenge of how to guard watermark-based approaches against hostile attacks. Recently, a variety of ‘passive authentication’ methods [4–7] have been proposed which perform image content authentication by detecting certain cues produced during creation and modification of the image, such as double compression, light abnormality, re-sampling, and photo response non-uniformity noise (PRNU). Compared with the active approach, the passive or blind authentication approach does not require additional watermarks or signatures and could have broader applications in image forensics.

JPEG is the most widely used image format, so authentication and forgery detection on JPEG images play an important role in image forensics. Since most tampered JPEG images undergo at least two JPEG compressions, many JPEG image authentication methods are based on detection of double JPEG compression. As the block discrete cosine transform (BDCT) is the key operation in JPEG compression, the distribution of the BDCT coefficients usually contains important information about the compression history. Hence, most passive tampering detection approaches for JPEG images are based on BDCT coefficient analysis. Lukas and Fridrich [8] tried to identify double JPEG compression by detecting the double-peak effect in the DCT coefficient histogram. Fu et al. [9] observed that the distribution of the first digit of DCT coefficients after JPEG compression follows the generalized Benford's law and stated that double JPEG compression could be detected because it causes violations of the first-digit law. In [10], Pevny and Fridrich introduced a machine learning approach for detection of double JPEG compression: a set of features from the histograms of several low-frequency DCT coefficients was extracted, and a support vector machine (SVM) was adopted as the classifier. Recently, Farid [11] observed that re-compression introduces additional local minima in the difference between the image and its JPEG-compressed counterpart; these local minima, referred to as JPEG ghosts, can be used to detect double compression. However, these methods are only effective when the block structures of the first and second JPEG compressions are aligned with each other.

In many JPEG image tampering scenarios, a foreign JPEG-compressed patch is inserted into an authentic image and the resultant image is re-compressed to form the new image. The tampered region has thus undergone two JPEG compressions, but the block structures of the two compressions are usually not aligned with each other. Such a case is referred to as shifted double JPEG compression (SDJPEG) [12] or non-aligned double JPEG compression (NA-JPEG) [13, 14]. For SDJPEG, the double JPEG compression detection methods discussed above cannot achieve satisfactory results. Luo et al. [15] tried to detect SDJPEG by analyzing the blocking artifact characteristics matrix (BACM). They observed that the BACM is symmetric for single JPEG compression but that this symmetry is destroyed after SDJPEG. However, the BACM is highly related to the image content, and the detection performance decreases if the testing images are very different from those in the training set. To address this problem, Chen and Hsu [16] extended the idea of BACM and proposed a feature that is less dependent on image content by introducing inter-block correlation. However, since the statistical features in both [15] and [16] require a large amount of data to obtain high discriminative power, these methods do not work well for small SDJPEG patches.

Recently, Bianchi and Piva [13] tried to detect SDJPEG effects by examining the integer periodicity in the DCT coefficient histogram when the BDCT is computed according to the first JPEG compression. In addition, they also proposed a statistical model that characterizes the artifacts due to SDJPEG [14]. They observed that the shifted JPEG compression introduces Gaussian noise into each DCT coefficient and approximated the variance of the noise by the quantization step of the shifted compression. Inspired by [14], we propose a highly effective SDJPEG image tampering detection method in this paper. The major contributions of our work are the following: (i) we perform a rigorous theoretical analysis of the DCT coefficient variations caused by SDJPEG and derive from it a statistical model for the DCT coefficients of SDJPEG patches, which provides a more accurate estimate of the quantization noise introduced by the shifted JPEG quantization than that in [13, 14]; (ii) based on this analysis, we propose an effective discriminative feature to detect SDJPEG patches; and (iii) we propose an adaptive DCT component selection method to select the most discriminative DCT components. Our algorithm not only detects image forgeries but also locates the tampered regions accurately.

This paper is organized as follows. Section 2 gives a description of the ‘crop-and-paste’ image tampering problem and introduces current approaches. Section 3 gives an in-depth analysis on the DCT coefficient variations caused by shifted JPEG compression and describes the DCT coefficient histogram model for SDJPEG patches. In Section 4, a new discriminative feature is introduced to detect SDJPEG patches. An adaptive DCT mode selection method and a tampering detection algorithm are also given. Section 5 presents the experimental results comparing our approach with several state-of-the-art techniques. Finally, Section 6 draws the conclusion.

## 2 Crop-and-paste image tampering detection

In this problem, an image patch can be classified into one of five categories according to its compression history:

- (i)
Uncompressed: These patches are raw images and have never been JPEG compressed.

- (ii)
Aligned single JPEG compressed (ASJPEG in short): When the uncompressed image patches undergo a JPEG compression with the block structure aligned to the image patch, the output image patches are called ASJPEG patches. They are usually referred to as single JPEG image patches in the literature.

- (iii)
Aligned double JPEG compressed (ADJPEG in short): When the ASJPEG patches undergo another JPEG compression and the block structures of the two JPEG compressions are aligned with each other, the output image patches are called ADJPEG patches. They are usually referred to as double JPEG image patches in the literature.

- (iv)
Shifted double JPEG compressed (SDJPEG in short): When the ASJPEG patches undergo another JPEG compression and the block structures of the two JPEG compressions are different, the output image patches are called SDJPEG patches.

- (v)
Shifted single JPEG compressed (SSJPEG in short): Different from SDJPEG patches, when the image patches before the shifted JPEG compression (with the block structure in dashed line) have never been compressed with the block structure starting from the top-left corner, the output image patches are called SSJPEG patches.

**Requirements of the crop-and-paste tampering detection methods**

| | Aligned double JPEG detection approach | Shifted double JPEG detection approach |
|---|---|---|
| Original image | JPEG compressed | JPEG compressed or raw |
| Foreign image being inserted | JPEG compressed or raw | JPEG compressed |
| Final image | JPEG compressed | JPEG compressed |
| Decision basis | Presence of ADJPEG effect indicates un-tampered patch | Presence of SDJPEG effect indicates tampered patch |
| Additional limitation | The quantization matrices of original and final image cannot be the same | The block structure of the inserted patch cannot align to that of the final image |

In contrast, the shifted double JPEG detection-based approach [13–16] looks for tampered patches by detecting cues of tampering in an image patch. The basic idea is that if the inserted image region was JPEG compressed earlier, the second JPEG compression will leave cues of shifted compression in the tampered region of the final image (such as additional blocking artifacts [15, 16]). For aligned patches fully or partially located in the tampered region of the final image (such as aligned patch II in Figure 3a), such cues exist, while for aligned patches located in the unmodified region (such as aligned patch I in Figure 3a), such cues are absent. Hence, the tampered region can be located by searching for all the aligned patches where the cues left by shifted compression exist. However, such cues vary greatly with the image content and the quality factor of the inserted patch, and it is difficult to find a robust, discriminative feature for all situations. Moreover, this approach needs a large image patch to obtain robust detection results and is less effective for small patches.

In this paper, we propose a new shifted double JPEG detection method which also examines the characteristics of the inserted patch. Similar to [13, 14], our algorithm considers image patches that are not aligned with the block structure of the final image (such as non-aligned patches I and II in Figure 3b). When analyzing the non-aligned patches in the final image, SDJPEG patches (such as non-aligned patch II in Figure 3b, where the JPEG-compressed inserted patch undergoes a shifted JPEG compression during JPEG compression of the final image) are located in the tampered region, while SSJPEG patches (such as non-aligned patch I in Figure 3b) are located in the untampered region. Note that the JPEG compression of the final image is aligned for the aligned patches in Figure 3a and shifted for the non-aligned patches in Figure 3b. In our algorithm, an exhaustive search over the 63 possible locations of the non-aligned image patches is performed to detect tampering. We will show in the experiment section that the increase in computational cost is worthwhile and that our algorithm achieves much better detection performance than the existing methods.

## 3 DCT coefficient analysis for SDJPEG patches

**Notation**

| Symbol | Description |
|---|---|
| $D^{\mathrm{O}}$, $D^{\mathrm{AJ}}$, $D^{\mathrm{SJ}}$, $D^{\mathrm{SD}}$ | DCT coefficients of an 8 × 8 image block, where the superscripts O, AJ, SJ, and SD stand for the original block before JPEG compression, the block after aligned JPEG compression, the block after shifted JPEG compression, and the block after shifted double JPEG compression, respectively. No superscript stands for the general case |
| $D_{m,n}^{\mathrm{O}}$, $D_{m,n}^{\mathrm{AJ}}$, $D_{m,n}^{\mathrm{SJ}}$, $D_{m,n}^{\mathrm{SD}}$ | The $(m,n)$th DCT coefficient of the corresponding block |
| $E_{m,n}^{\mathrm{AJ}}$, $E_{m,n}^{\mathrm{SJ}}$, $E_{m,n}^{\mathrm{SJR}}$ | Quantization error of the $(m,n)$th DCT coefficient for the corresponding compression |
| $\sigma_{m,n}^{\mathrm{SJ}}$, $\sigma_{m,n}^{\mathrm{SJR}}$ | Standard deviation of the quantization error of the $(m,n)$th DCT mode |
| $h_{m,n}(f)$, $h_{m,n}^{\mathrm{AJ}}(f)$, $h_{m,n}^{\mathrm{SD}}(f)$, $h_{m,n}^{\mathrm{SS}}(f)$ | Histograms of the $(m,n)$th DCT coefficients |
| $(x_{\mathrm{S}}, y_{\mathrm{S}})$ | Coordinate shift of the shifted JPEG compression |
| QF$_1$, QF$_2$ | Quality factor of the primary JPEG compression and the final JPEG compression, respectively |
| $q_{m,n}(\mathrm{QF})$ | Quantization step of the $(m,n)$th DCT mode at quality factor QF |
| $G(\mu, \sigma)$ | Gaussian distribution with mean $\mu$ and standard deviation $\sigma$ |

### 3.1 DCT coefficient variations caused by shifted JPEG compression

Consider an 8 × 8 image block $A$; its DCT coefficients can be represented by $D^{\mathrm{O}}(A) = [D_{m,n}^{\mathrm{O}}(A)]$, $1 \le m, n \le 8$. If block $A$ undergoes an aligned JPEG compression with the quality factor QF (as shown in Figure 4a), the resulting DCT coefficients of block $A$, denoted by $D^{\mathrm{AJ}}(A) = [D_{m,n}^{\mathrm{AJ}}(A)]$, $1 \le m, n \le 8$, will be multiples of the corresponding quantization steps. Hence, aligned JPEG compression induces a zero-mean quantization error (denoted by $E_{m,n}^{\mathrm{AJ}}(A)$) in each DCT coefficient, i.e.,

$$D^{\mathrm{AJ}}(A) = D^{\mathrm{O}}(A) + E^{\mathrm{AJ}}(A) \qquad (1)$$

If block $A$ undergoes a shifted JPEG compression with a coordinate shift of $(x_{\mathrm{S}}, y_{\mathrm{S}})$ and quality factor QF (as shown in Figure 4b), the DCT coefficients of block $A$ become $D^{\mathrm{SJ}}(A) = [D_{m,n}^{\mathrm{SJ}}(A)]$, $1 \le m, n \le 8$. To derive the expression for $D^{\mathrm{SJ}}(A)$, we consider the DCT coefficients of the neighboring 8 × 8 aligned blocks whose block structure coincides with the block structure of the shifted JPEG compression on $A$. As shown in Figure 4c,d, block $A$ is surrounded by four aligned blocks $B_1$, $B_2$, $B_3$, and $B_4$. According to [19], the DCT coefficients of blocks $A$ and $B_i$ ($i = 1,2,3,4$) are related by (details are given in the Appendix)

$$D(A) = \sum_{i=1}^{4} D(H_{i1})\, D(B_i)\, D(H_{i2}) \qquad (3)$$

where $H_{i1}$ and $H_{i2}$ ($i = 1,2,3,4$) are the row and column translation matrices that translate a specific block from $B_i$ to $A$, as shown in Figure 4. Note that $D(H_{i1})$ and $D(H_{i2})$ ($i = 1,2,3,4$) depend only on the coordinate shift $(x_{\mathrm{S}}, y_{\mathrm{S}})$.
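This DCT-domain translation relation can be checked numerically. The sketch below is our own illustration: the translation matrices are built from identity sub-blocks following the construction in [19], and all variable names are ours.

```python
import numpy as np

def dct_matrix(n=8):
    # Orthonormal DCT-II matrix T, so that D(M) = T @ M @ T.T
    k, i = np.meshgrid(np.arange(n), np.arange(n), indexing='ij')
    T = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    T[0, :] = np.sqrt(1.0 / n)
    return T

def translation_matrices(s, n=8):
    # Identity-sub-block translation matrices:
    # Va picks rows s..n-1 into rows 0..n-1-s, Vb picks rows 0..s-1 into rows n-s..n-1
    Va, Vb = np.zeros((n, n)), np.zeros((n, n))
    Va[np.arange(n - s), np.arange(s, n)] = 1.0
    Vb[np.arange(n - s, n), np.arange(s)] = 1.0
    return Va, Vb

rng = np.random.default_rng(0)
T = dct_matrix()
D = lambda M: T @ M @ T.T

xs, ys = 3, 5                                         # hypothetical coordinate shift (xS, yS)
B = [rng.uniform(0, 255, (8, 8)) for _ in range(4)]   # B1..B4: top-left, top-right, bottom-left, bottom-right

# Spatial ground truth: A is the 8x8 window at offset (xs, ys) inside the 2x2 block group
A = np.block([[B[0], B[1]], [B[2], B[3]]])[xs:xs + 8, ys:ys + 8]

# DCT-domain composition: D(A) = sum_i D(H_i1) D(B_i) D(H_i2)
Vx1, Vx2 = translation_matrices(xs)
Vy1, Vy2 = translation_matrices(ys)
pairs = [(Vx1, Vy1.T), (Vx1, Vy2.T), (Vx2, Vy1.T), (Vx2, Vy2.T)]
DA = sum(D(H1) @ D(Bi) @ D(H2) for (H1, H2), Bi in zip(pairs, B))

assert np.allclose(DA, D(A))   # matches the direct DCT of the shifted block
```

Because the DCT matrix is orthonormal, the spatial identity $A = \sum_i H_{i1} B_i H_{i2}$ carries over to the DCT domain unchanged, which is what the assertion verifies.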

Note that the original content of block $A$ is fully covered by the four aligned blocks $B_i$, $i = 1,2,3,4$, and thus, before compression, $D^{\mathrm{O}}(A) = \sum_{i=1}^{4} D(H_{i1})\, D^{\mathrm{O}}(B_i)\, D(H_{i2})$ according to (3).

After a JPEG compression is performed on these aligned blocks, $D^{\mathrm{O}}(B_i)$ becomes $D^{\mathrm{AJ}}(B_i)$, $i = 1,2,3,4$. Based on the analysis of aligned JPEG compression in (1), $D^{\mathrm{AJ}}(B_i) = D^{\mathrm{O}}(B_i) + E^{\mathrm{AJ}}(B_i)$, $i = 1,2,3,4$. The aligned JPEG compression on the $B_i$ is equivalent to a shifted JPEG compression on $A$. Hence, after compression, the DCT coefficients of block $A$ change to

$$D^{\mathrm{SJ}}(A) = D^{\mathrm{O}}(A) + E^{\mathrm{SJ}}(A)$$

where ${\mathit{E}}^{\mathrm{SJ}}\left(\mathit{A}\right)=\left[{\mathit{E}}_{\mathit{m},\mathit{n}}^{\mathrm{SJ}}\left(\mathit{A}\right)\right]={\displaystyle \sum _{\mathit{i}=1}^{4}\mathit{D}\left({\mathit{H}}_{\mathit{i}1}\right){\mathit{E}}^{\mathrm{AJ}}\left({\mathit{B}}_{\mathit{i}}\right)\mathit{D}\left({\mathit{H}}_{\mathit{i}2}\right)}$ denotes the shifted quantization error caused by shifted JPEG compression. For any DCT mode (*m*,*n*), the shifted quantization error ${\mathit{E}}_{\mathit{m},\mathit{n}}^{\mathrm{SJ}}\left(\mathit{A}\right)$ can be expressed by a linear combination of 4 × 64 = 256 zero-mean random variables, i.e., ${\mathit{E}}_{\mathit{m},\mathit{n}}^{\mathrm{SJ}}\left(\mathit{A}\right)={\displaystyle \sum _{\mathit{i}=1}^{4}{\displaystyle \sum _{\mathit{u}=1}^{8}{\displaystyle \sum _{\mathit{v}=1}^{8}{\mathit{c}}_{\mathit{m},\mathit{n}}^{\mathit{i}}\left(\mathit{u},\mathit{v}\right){\mathit{E}}_{\mathit{u},\mathit{v}}^{\mathrm{AJ}}\left({\mathit{B}}_{\mathit{i}}\right)}}}$ and ${\mathit{c}}_{\mathit{m},\mathit{n}}^{\mathit{i}}\left(\mathit{u},\mathit{v}\right)$ is the weighting parameter determined by *D*(*H*_{i 1}) and *D*(*H*_{i 2}) (*i* = 1,2,3,4). According to the Central Limit Theorem (CLT), ${\mathit{E}}_{\mathit{m},\mathit{n}}^{\mathrm{SJ}}\left(\mathit{A}\right)$ follows a zero-mean Gaussian distribution denoted by $\mathit{G}\left(0,{\mathit{\sigma}}_{\mathit{m},\mathit{n}}^{\mathrm{SJ}}\right)$ with standard deviation ${\mathit{\sigma}}_{\mathit{m},\mathit{n}}^{\mathrm{SJ}}$. 
${\mathit{\sigma}}_{\mathit{m},\mathit{n}}^{\mathrm{SJ}}$ can be calculated with knowledge of the standard deviations of these 256 random variables (${\mathit{E}}_{\mathit{u},\mathit{v}}^{\mathrm{AJ}}\left({\mathit{B}}_{\mathit{i}}\right)\phantom{\rule{0.48em}{0ex}}1\le \mathit{i}\le 4,1\le \mathit{u},\mathit{v}\le 8$) and the weighting coefficients ${\mathit{c}}_{\mathit{m},\mathit{n}}^{\mathit{i}}\left(\mathit{u},\mathit{v}\right)\phantom{\rule{0.48em}{0ex}}1\le \mathit{i}\le 4,1\le \mathit{u},\mathit{v}\le 8$.

In order to estimate the quantization error caused by aligned JPEG compression, we divide all the DCT coefficients into two parts, i.e., the DC and AC components. For the DC component, since the quantization step is relatively small and the DC coefficients have a large dynamic range, the quantization error caused by aligned JPEG compression approximately follows the uniform distribution of $\left[-\frac{{\mathit{q}}_{1,1}\left(\mathrm{QF}\right)}{2},\frac{{\mathit{q}}_{1,1}\left(\mathrm{QF}\right)}{2}\right)$, where *q*_{1,1}(QF) is the quantization step for the DC coefficients with the quality factor QF. For the AC components, according to [20], the AC components for a natural image approximately follow a Laplacian distribution. If we fit the coefficients of AC components in the image patch to a Laplacian model, the standard deviations of ${\mathit{E}}_{\mathit{u},\mathit{v}}^{\mathrm{AJ}}\left({\mathit{B}}_{\mathit{i}}\right)$ can be directly calculated with the quality factor QF. In addition, from (3), the weighting coefficients, ${\mathit{c}}_{\mathit{m},\mathit{n}}^{\mathit{i}}\left(\mathit{u},\mathit{v}\right)$, are determined by the coordinate shift (*x*_{S}, *y*_{S}). Then the theoretical value of ${\mathit{\sigma}}_{\mathit{m},\mathit{n}}^{\mathrm{SJ}}$ can be calculated when QF and (*x*_{S}, *y*_{S}) are known.

To summarize, the shifted JPEG compression will induce a zero-mean Gaussian-distributed quantization error for all the DCT coefficients, and for different DCT coefficients, the standard deviations of the shifted quantization error are different and can be calculated when the quality factor and the coordinate shift of the compression are known.
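The variance propagation described above can be checked numerically. The sketch below is ours: the shift $(3,5)$ and the per-mode error standard deviations $\sigma_{u,v}$ are illustrative placeholders (in the paper they come from the quantization table for the DC mode and a Laplacian fit for the AC modes), and Gaussian surrogates stand in for the aligned quantization errors, which is sufficient for a second-moment check.

```python
import numpy as np

def dct_matrix(n=8):
    # Orthonormal DCT-II matrix T, so D(M) = T @ M @ T.T
    k, i = np.meshgrid(np.arange(n), np.arange(n), indexing='ij')
    T = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    T[0, :] = np.sqrt(1.0 / n)
    return T

def translation_matrices(s, n=8):
    # Va picks rows s..n-1, Vb picks rows 0..s-1 (identity sub-blocks)
    Va, Vb = np.zeros((n, n)), np.zeros((n, n))
    Va[np.arange(n - s), np.arange(s, n)] = 1.0
    Vb[np.arange(n - s, n), np.arange(s)] = 1.0
    return Va, Vb

T = dct_matrix()
D = lambda M: T @ M @ T.T
xs, ys = 3, 5                                  # hypothetical coordinate shift
Vx1, Vx2 = translation_matrices(xs)
Vy1, Vy2 = translation_matrices(ys)
H = [(D(Vx1), D(Vy1.T)), (D(Vx1), D(Vy2.T)),
     (D(Vx2), D(Vy1.T)), (D(Vx2), D(Vy2.T))]

rng = np.random.default_rng(1)
sigma_aj = rng.uniform(0.5, 3.0, (8, 8))       # hypothetical stds of E^AJ_{u,v}(B_i)

# Theory: Var[E^SJ_{m,n}] = sum_i sum_{u,v} (D(H_i1)[m,u] * D(H_i2)[v,n])^2 * sigma_aj[u,v]^2
sigma_sj = np.sqrt(sum((DH1 ** 2) @ (sigma_aj ** 2) @ (DH2 ** 2) for DH1, DH2 in H))

# Monte Carlo check: draw independent zero-mean errors for each neighbor and propagate
N = 20000
acc = np.zeros((N, 8, 8))
for DH1, DH2 in H:
    acc += DH1 @ (rng.normal(0.0, 1.0, (N, 8, 8)) * sigma_aj) @ DH2
emp = acc.std(axis=0)
assert np.max(np.abs(emp / sigma_sj - 1)) < 0.05
```

The closed-form `sigma_sj` agrees with the empirical spread across all 64 modes, mirroring the claim that $\sigma_{m,n}^{\mathrm{SJ}}$ is computable from QF and $(x_{\mathrm{S}}, y_{\mathrm{S}})$ alone once the per-mode error variances are known.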

### 3.2 DCT coefficient analysis on SDJPEG patches

As discussed earlier, SDJPEG patches are constructed by performing shifted JPEG compression on ASJPEG patches. Similar to ASJPEG patches, the DCT coefficients of SDJPEG patches also follow specific distributions.

Consider an ASJPEG image patch IMG consisting of a number of 8 × 8 blocks that have been aligned JPEG compressed with the quality factor QF$_1$, and let $D_{m,n}^{\mathrm{AJ}}(k)$ denote the $(m,n)$th DCT coefficient of the $k$th block in IMG. Since the patch has been JPEG compressed with quality factor QF$_1$, $D_{m,n}^{\mathrm{AJ}}(k)$ is a multiple of the quantization step of the $(m,n)$th DCT mode (denoted by $q_{m,n}(\mathrm{QF}_1)$). Hence, considering all the $(m,n)$th DCT coefficients in IMG, the normalized histogram $h_{m,n}^{\mathrm{AJ}}(f)$ of the $(m,n)$th DCT coefficients is given by

$$h_{m,n}^{\mathrm{AJ}}(f) = \sum_{i=-N}^{N} w_i\, \delta\big(f - i \cdot q_{m,n}(\mathrm{QF}_1)\big)$$

where $w_i$ is the normalized frequency of the $(m,n)$th DCT coefficients having the value $i \times q_{m,n}(\mathrm{QF}_1)$ and $N q_{m,n}(\mathrm{QF}_1)$ is the maximum absolute value of the $(m,n)$th DCT coefficient in IMG.

When the ASJPEG-compressed image patch IMG is transformed back to the spatial domain, two kinds of errors are introduced. One is the truncation error: since the luminance value of a grayscale image ranges from 0 to 255, any gray level greater than 255 or less than 0 is truncated to 255 or 0, respectively. However, as discussed in [21], this kind of error seldom appears in natural images (about 1% of the image blocks have truncation errors), and any block with pixels having luminance value 0 or 255 is discarded from further analysis. The other is the rounding error: a rounding step is carried out after the IDCT to ensure that the luminance values in the spatial domain are integers. This spatial-domain rounding error leads to a bias (denoted by ${\mathit{E}}_{\mathit{m},\mathit{n}}^{\mathrm{rounding}}$) in all the DCT coefficients. Assuming that the rounding error in the spatial domain follows a zero-mean uniform distribution on [-0.5, 0.5) and considering that the DCT transform is unitary, ${\mathit{E}}_{\mathit{m},\mathit{n}}^{\mathrm{rounding}}$ is Gaussian distributed with zero mean and variance 1/12 for all 1 ≤ *m*, *n* ≤ 8 according to the CLT [14].
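The 1/12 variance claim is easy to verify empirically. The sketch below (ours) rounds random non-integer spatial blocks, transforms the rounding error to the DCT domain with an orthonormal DCT matrix, and checks the pooled per-mode variance.

```python
import numpy as np

def dct_matrix(n=8):
    # Orthonormal DCT-II matrix T; total energy is preserved by T @ X @ T.T
    k, i = np.meshgrid(np.arange(n), np.arange(n), indexing='ij')
    T = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    T[0, :] = np.sqrt(1.0 / n)
    return T

rng = np.random.default_rng(0)
T = dct_matrix()

# Non-integer spatial blocks (as produced by an IDCT), then rounded to integers
blocks = rng.uniform(0.0, 255.0, (4000, 8, 8))
err_dct = T @ (np.round(blocks) - blocks) @ T.T   # DCT of the spatial rounding error

# Since the DCT is orthonormal, each DCT-domain error is a weighted sum of iid
# uniform(-0.5, 0.5) errors with squared weights summing to 1 -> variance 1/12
var = err_dct.reshape(-1).var()
assert abs(var - 1.0 / 12.0) < 0.005
```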

Suppose the ASJPEG patch IMG then undergoes a shifted JPEG compression with quality factor QF$_2$ and coordinate shift $(x_{\mathrm{S}}, y_{\mathrm{S}})$. According to the analysis in Section 3.1, the shifted JPEG compression introduces a zero-mean Gaussian-distributed error term $E_{m,n}^{\mathrm{SJ}}(k)$. The $(m,n)$th DCT coefficient of the $k$th block in the final SDJPEG patch, denoted by $D_{m,n}^{\mathrm{SD}}(k)$, is given by

$$D_{m,n}^{\mathrm{SD}}(k) = D_{m,n}^{\mathrm{AJ}}(k) + E_{m,n}^{\mathrm{SJR}}(k), \qquad E_{m,n}^{\mathrm{SJR}}(k) = E_{m,n}^{\mathrm{rounding}} + E_{m,n}^{\mathrm{SJ}}(k)$$

where $E_{m,n}^{\mathrm{SJR}}(k)$ denotes the overall error introduced to the $(m,n)$th DCT coefficient during the construction of the SDJPEG patch from the ASJPEG patch. Since $E_{m,n}^{\mathrm{rounding}}$ and $E_{m,n}^{\mathrm{SJ}}(k)$ are both Gaussian distributed and independent of each other, $E_{m,n}^{\mathrm{SJR}}(k)$ also follows a Gaussian distribution with zero mean and standard deviation $\sigma_{m,n}^{\mathrm{SJR}} = \sqrt{(\sigma_{m,n}^{\mathrm{SJ}})^2 + 1/12}$. The histogram of the $(m,n)$th DCT coefficients after shifted double JPEG compression, denoted by $h_{m,n}^{\mathrm{SD}}(f)$, is then given by

$$h_{m,n}^{\mathrm{SD}}(f) = \sum_{i=-N}^{N} w_i\, G\big(f - i \cdot q_{m,n}(\mathrm{QF}_1),\, \sigma_{m,n}^{\mathrm{SJR}}\big) \qquad (7)$$
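As a sanity check of this model, the following sketch (ours; the quantization step, noise spread, and Laplacian scale are illustrative assumptions) simulates SDJPEG-like coefficients and confirms that their mass clusters within a few $\sigma^{\mathrm{SJR}}$ of the multiples of $q_{m,n}(\mathrm{QF}_1)$, while never-quantized coefficients show no such clustering.

```python
import numpy as np

rng = np.random.default_rng(0)
q1, sigma_sjr = 10.0, 1.2          # hypothetical first-step quantization and noise spread

# ASJPEG coefficients: a Laplacian AC mode quantized to multiples of q1
coeffs = rng.laplace(0.0, 15.0, 50000)
d_aj = q1 * np.round(coeffs / q1)

# SDJPEG model above: add the zero-mean Gaussian error E^SJR
d_sd = d_aj + rng.normal(0.0, sigma_sjr, d_aj.shape)
# SSJPEG-like mode: the raw Laplacian coefficients, never quantized on this grid
d_ss = coeffs

def near_multiple(d, q, tol):
    r = np.abs(((d + q / 2) % q) - q / 2)   # distance to the nearest multiple of q
    return np.mean(r < tol)

# SDJPEG mass clusters around multiples of q1; SSJPEG mass does not
assert near_multiple(d_sd, q1, 3 * sigma_sjr) > 0.95
assert near_multiple(d_ss, q1, 3 * sigma_sjr) < 0.85
```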

## 4 Detection of image patch with SDJPEG compression

The analysis in Section 3 shows that the DCT coefficient distribution of SDJPEG patches follows a weighted summation of Gaussian components with the same standard deviation (as shown in (7) and Figure 5). However, in order to detect SDJPEG patches based on (7), two questions have to be addressed: (i) how to obtain discriminative features that capture the differences between the $(m,n)$th DCT coefficient distributions of SDJPEG and SSJPEG patches and (ii) how to select the DCT modes that provide high discriminative power, since the differences in the DCT coefficient distributions of SDJPEG and SSJPEG patches vary across DCT modes. We address these questions in the following subsections.

### 4.1 Discriminative feature extraction based on the (*m,n*)th DCT coefficients

For a specific DCT mode, say the $(m,n)$th, the detection algorithm determines whether the observed histogram $h_{m,n}(f)$ is similar to $h_{m,n}^{\mathrm{SD}}(f)$.

Given the quantization step $q_{m,n}$, we project the histogram $h_{m,n}(f)$ onto the interval $\left[-\frac{q_{m,n}}{2}, \frac{q_{m,n}}{2}\right)$, and the sum of the histogram function within the interval is defined as

$$sh_{m,n}(f) = \sum_{i=-N}^{N} h_{m,n}\big(f + i \cdot q_{m,n}\big), \qquad f \in \left[-\frac{q_{m,n}}{2}, \frac{q_{m,n}}{2}\right) \qquad (8)$$

where *Nq*_{m,n} is the maximum absolute value of the (*m*,*n*)th DCT coefficient.

According to (7), for SDJPEG-compressed image patches, ${\mathit{h}}_{\mathit{m},\mathit{n}}^{\mathrm{SD}}\left(\mathit{f}\right)$ follows a weighted summation of Gaussian components with a standard deviation of ${\mathit{\sigma}}_{\mathit{m},\mathit{n}}^{\mathrm{SJR}}$. Hence, $\mathit{s}{\mathit{h}}_{\mathit{m},\mathit{n}}^{\mathrm{SD}}\left(\mathit{f}\right)$ would follow a specific distribution determined by ${\mathit{\sigma}}_{\mathit{m},\mathit{n}}^{\mathit{SJR}}$ and *q*_{m,n}. Based on the different ratios of ${\mathit{q}}_{\mathit{m},\mathit{n}}/{\mathit{\sigma}}_{\mathit{m},\mathit{n}}^{\mathrm{SJR}}$, $\mathit{s}{\mathit{h}}_{\mathit{m},\mathit{n}}^{\mathrm{SD}}\left(\mathit{f}\right)$ can be obtained as follows. It should be noted that in our discussions, the Gaussian distribution, *G*(0, *σ*), is assumed to be bounded in [-3*σ*, 3*σ*], and any outliers are omitted.

Depending on the ratio ${\mathit{q}}_{\mathit{m},\mathit{n}}/{\mathit{\sigma}}_{\mathit{m},\mathit{n}}^{\mathrm{SJR}}$, the folded Gaussian components may overlap, and $\mathit{s}{\mathit{h}}_{\mathit{m},\mathit{n}}^{\mathrm{SD}}\left(\mathit{f}\right)$ can be expressed as a sum of $(p + 1)$ overlapping Gaussian distributions (as shown in Figure 6b,c), where $p + 1 = \mathrm{ceiling}\!\left(\frac{3\sigma_{m,n}^{\mathrm{SJR}}}{q_{m,n}/2}\right)$.

When the ratio $q_{m,n}/\sigma_{m,n}^{\mathrm{SJR}}$ is large (i.e., $p = 0$ or $p = 1$), $sh_{m,n}^{\mathrm{SD}}(f)$ has a distinctive peak at small $|f|$, and the peak becomes more prominent as $q_{m,n}/\sigma_{m,n}^{\mathrm{SJR}}$ increases. No such peak occurs for SSJPEG patches. Figure 7 illustrates the $sh_{m,n}(f)$ distributions of the image Lena.bmp of Figure 5 (c1) after SDJPEG and SSJPEG with various parameter settings.

As shown in Figure 7, the energy of $sh_{m,n}(f)$ for SDJPEG image patches is concentrated in the center region, whereas the energy of $sh_{m,n}(f)$ for SSJPEG patches is almost evenly distributed. It should be noted that when the image patch is small, e.g., 64 × 64, the number of DCT coefficients per mode is small (64 coefficients in total), and it is difficult to estimate the actual distribution of the patch accurately and robustly from such limited data. To overcome this problem of inadequate data, a 1-D feature similar to that in [23] is adopted to differentiate between SDJPEG and SSJPEG image patches:

$$s_{m,n} = \frac{\sum_{f \in R_2} sh_{m,n}(f)}{\sum_{f \in R_1} sh_{m,n}(f)} \qquad (11)$$

where $R_1 = \left(-\frac{q_{m,n}}{6}, \frac{q_{m,n}}{6}\right)$ denotes the central region and $R_2 = \left(-\frac{q_{m,n}}{2}, -\frac{q_{m,n}}{3}\right) \cup \left(\frac{q_{m,n}}{3}, \frac{q_{m,n}}{2}\right)$ denotes the peripheral region.

For SDJPEG patches, the reference value of the feature, denoted by $s_{m,n}^{\mathrm{SD}}$, can be derived from $sh_{m,n}^{\mathrm{SD}}(f)$ and is determined by $\sigma_{m,n}^{\mathrm{SJR}}$ and $q_{m,n}$. The reference value for SSJPEG patches, denoted by $s_{m,n}^{\mathrm{SS}}$, is obtained as follows. Since an SSJPEG patch has never been compressed with the block structure starting from the top-left corner of the patch, its DCT coefficients can be assumed to be distributed approximately like those of the original uncompressed image patch [8, 14]. Moreover, $h_{m,n}^{\mathrm{SS}}(f)$ is equivalent to the distribution of the quantization error of the $(m,n)$th DCT component with quantization step $q_{m,n}$. Hence, $h_{m,n}^{\mathrm{SS}}(f)$ can be approximated using the aligned JPEG quantization error estimation approach introduced in Section 3.1, and $s_{m,n}^{\mathrm{SS}}$ can then be derived from (8) and (11). Note that for the low-frequency DCT modes, the quantization step $q_{m,n}$ is relatively small compared with the dynamic range of the DCT coefficients; thus $sh_{m,n}^{\mathrm{SS}}(f)$ is almost evenly distributed and $s_{m,n}^{\mathrm{SS}} \approx 1 > s_{m,n}^{\mathrm{SD}}$.

The final discriminative feature indicating the likelihood that an image patch has been SDJPEG compressed is obtained by normalizing the extracted feature with the reference values $s_{m,n}^{\mathrm{SD}}$ and $s_{m,n}^{\mathrm{SS}}$, i.e., $sn_{m,n} = \left(s_{m,n} - s_{m,n}^{\mathrm{SD}}\right)/\left(s_{m,n}^{\mathrm{SS}} - s_{m,n}^{\mathrm{SD}}\right)$. Note that $sn_{m,n}$ is small for SDJPEG patches and large for SSJPEG patches.
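The behavior of the feature can be sketched as follows. This is our own illustration: we implement $s_{m,n}$ as the peripheral-to-central mass ratio of the folded residues, which is consistent with $s_{m,n}^{\mathrm{SS}} \approx 1 > s_{m,n}^{\mathrm{SD}}$, though the exact form in [23] may differ; all names and parameter values are ours.

```python
import numpy as np

def fold(coeffs, q):
    # Residue of each coefficient modulo q, mapped into [-q/2, q/2)
    return ((np.asarray(coeffs) + q / 2) % q) - q / 2

def s_feature(coeffs, q):
    # Peripheral mass R2 = (-q/2,-q/3) U (q/3,q/2) over central mass R1 = (-q/6,q/6):
    # ~1 for evenly spread residues (SSJPEG-like), small for residues piled near 0 (SDJPEG-like)
    r = fold(coeffs, q)
    central = np.mean(np.abs(r) < q / 6)
    peripheral = np.mean(np.abs(r) > q / 3)
    return peripheral / central

rng = np.random.default_rng(0)
q = 10.0
d_sd = q * rng.integers(-20, 21, 20000) + rng.normal(0.0, 0.8, 20000)  # SDJPEG-like
d_ss = rng.laplace(0.0, 20.0, 20000)                                   # SSJPEG-like

s_sd, s_ss = s_feature(d_sd, q), s_feature(d_ss, q)
assert s_sd < 0.3 and abs(s_ss - 1.0) < 0.15
```

With the reference values $s^{\mathrm{SD}}$ and $s^{\mathrm{SS}}$ from the text, the normalized feature $sn = (s - s^{\mathrm{SD}})/(s^{\mathrm{SS}} - s^{\mathrm{SD}})$ would map the SDJPEG-like case near 0 and the SSJPEG-like case near 1.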

### 4.2 Discriminative feature extraction based on all the DCT coefficients

As discussed in Section 4.1, $sn_{m,n}$ for SDJPEG image patches is smaller than that for SSJPEG patches. This difference becomes more prominent as $q_{m,n}/\sigma_{m,n}^{\mathrm{SJR}}$ increases, so the DCT components with larger $q_{m,n}/\sigma_{m,n}^{\mathrm{SJR}}$ are more discriminative for SDJPEG detection. Our analysis in Section 3 shows that for a specific image, $\sigma_{m,n}^{\mathrm{SJR}}$ can be estimated theoretically when QF$_2$ and the coordinate shift $(x_{\mathrm{S}}, y_{\mathrm{S}})$ are known. Figure 8 gives some examples of $\sigma_{m,n}^{\mathrm{SJR}}$, $1 \le m, n \le 8$, for different QF$_2$ and $(x_{\mathrm{S}}, y_{\mathrm{S}})$.

Figure 8 shows the following: (i) Shifted JPEG compression with a lower quality factor (QF_{2}) introduces a larger spread in the DCT coefficients. (ii) ${\mathit{\sigma}}_{\mathit{m},\mathit{n}}^{\mathrm{SJR}}$ varies from one DCT mode to another and is not proportional to the corresponding quantization step. For instance, in Figure 8 (b2), the quantization steps for (2,1) and (2,2) are the same while ${\mathit{\sigma}}_{2,1}^{\mathrm{SJR}}$ and ${\mathit{\sigma}}_{2,2}^{\mathrm{SJR}}$ are quite different. (iii) Even for the same DCT mode, the variation caused by shifted JPEG compression does not remain constant across coordinate shifts; e.g., for the (2,3)th DCT coefficient, ${\mathit{\sigma}}_{2,3}^{\mathrm{SJR}}$ is quite different when (*x*_{S}, *y*_{S}) changes between (4,4) and (1,6). Hence, ${\mathit{\sigma}}_{\mathit{m},\mathit{n}}^{\mathrm{SJR}}$ can be obtained by a table lookup given QF_{2} and (*x*_{S}, *y*_{S}). To obtain a large value of ${\mathit{q}}_{\mathit{m},\mathit{n}}/{\mathit{\sigma}}_{\mathit{m},\mathit{n}}^{\mathrm{SJR}}$, smaller values of ${\mathit{\sigma}}_{\mathit{m},\mathit{n}}^{\mathrm{SJR}}$ or larger values of *q*_{m,n} are preferred. For any value of the quality factor, *Q* = [*q*_{m,n}] = *λQ*_{default}, where *Q*_{default} is the default quantization table defined by the Independent JPEG Group (IJG) and *λ* is a constant determined by the quality factor. Hence, we have $\frac{{\mathit{q}}_{\mathit{m},\mathit{n}}}{{\mathit{\sigma}}_{\mathit{m},\mathit{n}}^{\mathrm{SJR}}}=\mathit{\lambda}\times \frac{{\mathit{q}}_{\mathrm{default}}\left(\mathit{m},\mathit{n}\right)}{{\mathit{\sigma}}_{\mathit{m},\mathit{n}}^{\mathrm{SJR}}}=\mathit{\lambda}\times {\mathrm{dis}}_{\mathit{m},\mathit{n}},\phantom{\rule{1em}{0ex}}1\le \mathit{m},\mathit{n}\le 8$. It should be noted that in practice only QF_{2} is known; we will show how QF_{1} can be estimated later.
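The scaling relation *Q* = *λQ*_{default} above can be made concrete. The sketch below computes a luminance quantization table from a quality factor using the standard IJG scaling rule; note that the actual rule adds integer rounding and clamping on top of a pure scalar *λ*, and the helper name `ijg_quant_table` is ours, not from the paper.

```python
import numpy as np

# Default IJG luminance quantization table (Annex K of the JPEG standard).
Q_DEFAULT = np.array([
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
])

def ijg_quant_table(qf):
    """Quantization table for quality factor qf (1..100), IJG scaling rule.

    The integer scale s plays the role of 100 * lambda in the text."""
    s = 5000 // qf if qf < 50 else 200 - 2 * qf
    return np.clip((Q_DEFAULT * s + 50) // 100, 1, 255)

print(ijg_quant_table(50)[0, 0], ijg_quant_table(90)[0, 0])  # -> 16 3
```

At QF = 50 the scale is 100 (*λ* = 1), so the default table is returned unchanged.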

1. Input the prior information: the image patch, the quality factor of the final compression QF_{2}, and the coordinate shift (*x*_{S}, *y*_{S}) (obtained by an exhaustive enumeration over the 63 possible coordinate shifts; see Section 4.3).
2. From the image patch, QF_{2}, and (*x*_{S}, *y*_{S}), estimate ${\mathit{\sigma}}^{\mathrm{SJR}}=\left[{\mathit{\sigma}}_{\mathit{m},\mathit{n}}^{\mathrm{SJR}}\right]$, the spread introduced by the shifted double JPEG compression. Calculate the discriminative table DIS = [dis(*m*,*n*)] for the low frequencies (where *m* + *n* < 8).
3. Sort all the DCT modes by their discriminative value in descending order and select the first *N*_{c} (*N*_{c} = 3 in our experiment) components to construct the candidate set {$\left({\mathit{m}}_{1},{\mathit{n}}_{1}\right),\left({\mathit{m}}_{2},{\mathit{n}}_{2}\right),\dots ,\left({\mathit{m}}_{{\mathit{N}}_{\mathrm{c}}},{\mathit{n}}_{{\mathit{N}}_{\mathrm{c}}}\right)$}. Estimate the quantization step ${\mathit{q}}_{{\mathit{m}}_{\mathit{i}},{\mathit{n}}_{\mathit{i}}}\left({\mathrm{QF}}_{1}\right)$ (for the unknown QF_{1}) of these DCT modes by analyzing their histograms. Note that if the coefficients of a DCT mode in the candidate set contain too many zero values (>80% of the total number of coefficients), the quantization step cannot be estimated accurately. Hence, considering the quantization noise caused by shifted compression, any (*m*_{i}, *n*_{i}) whose coefficients are concentrated in the range $\left[-3{\mathit{\sigma}}_{{\mathit{m}}_{\mathit{i}},{\mathit{n}}_{\mathit{i}}}^{\mathrm{SJR}},3{\mathit{\sigma}}_{{\mathit{m}}_{\mathit{i}},{\mathit{n}}_{\mathit{i}}}^{\mathrm{SJR}}\right]$, i.e., $\underset{\mathit{f}=-3{\mathit{\sigma}}_{{\mathit{m}}_{\mathit{i}},{\mathit{n}}_{\mathit{i}}}^{\mathrm{SJR}}}{\overset{3{\mathit{\sigma}}_{{\mathit{m}}_{\mathit{i}},{\mathit{n}}_{\mathit{i}}}^{\mathrm{SJR}}}{\int}}{\mathit{h}}_{{\mathit{m}}_{\mathit{i}},{\mathit{n}}_{\mathit{i}}}^{\mathrm{SD}}\left(\mathit{f}\right)\ge 80\%$, is discarded and replaced with the mode with the next largest discriminative value outside the candidate set.
4. For the *i*th mode (*m*_{i}, *n*_{i}) in the candidate set, extract the normalized discriminative feature for the (*m*_{i}, *n*_{i})th mode, i.e., $\mathit{s}{\mathit{n}}_{{\mathit{m}}_{\mathit{i}},{\mathit{n}}_{\mathit{i}}}$, using the approach described in Section 4.1 with the estimated ${\mathit{\sigma}}_{{\mathit{m}}_{\mathit{i}},{\mathit{n}}_{\mathit{i}}}^{\mathrm{SJR}}$ and ${\mathit{q}}_{{\mathit{m}}_{\mathit{i}},{\mathit{n}}_{\mathit{i}}}\left({\mathrm{QF}}_{1}\right)$. The final discriminative feature for the image patch, denoted by *sn*_{all}, is obtained by averaging $\mathit{s}{\mathit{n}}_{{\mathit{m}}_{\mathit{i}},{\mathit{n}}_{\mathit{i}}}$ over 1 ≤ *i* ≤ *N*_{c}.

In step 3, the quantization step *q*_{m,n}(QF_{1}) could be obtained by an exhaustive search for the reference SDJPEG histogram $\mathit{s}{\mathit{h}}_{\mathit{m},\mathit{n}}^{\mathrm{SD}}\left(\mathit{f}\right)$ that is most similar to the observed histogram *sh*_{m,n}(*f*). However, this approach is computationally expensive. Since ${\mathit{h}}_{\mathit{m},\mathit{n}}^{\mathrm{SD}}\left(\mathit{f}\right)$ exhibits a periodic-like pattern with a period of *q*_{m,n}(QF_{1}), to reduce the complexity an approach similar to that in [17] is adopted: the fast Fourier transform is applied to the histogram ${\mathit{h}}_{\mathit{m},\mathit{n}}^{\mathrm{SD}}\left(\mathit{f}\right)$, and the peak of the spectrum with the DC component removed is extracted to estimate the quantization step *q*_{m,n}(QF_{1}) of the (*m*,*n*)th DCT mode. With *q*_{m,n}(QF_{1}) estimated, the quality factor QF_{1} can be estimated by comparing *q*_{m,n}(QF_{1}) with the default quantization table *Q*_{default}. To improve robustness, all *q*_{m,n}(QF_{1}) values in the candidate set are estimated; the median of the predicted QF_{1} values is taken as the quality factor of the first compression, and all *q*_{m,n}(QF_{1}) values in the candidate set are then refined using the estimated quality factor.
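The FFT-based period estimation described above can be sketched on synthetic data. Everything below (first quantization step *q*_{1} = 8, Gaussian model of the shifted-recompression noise, histogram range) is an illustrative assumption, not the paper's exact procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate one DCT mode of an SDJPEG patch: coefficients quantized with
# step q1 = 8 by the first compression, then perturbed by roughly
# Gaussian quantization/rounding noise from the shifted recompression.
q1 = 8
coeffs = q1 * np.round(rng.normal(0, 30, 20000) / q1)
coeffs = np.round(coeffs + rng.normal(0, 1.0, coeffs.size))

# Histogram over integer bins: a comb with spikes every q1 bins.
hist, _ = np.histogram(coeffs, bins=np.arange(-128.5, 129.5))
L = hist.size  # 257 bins

# The comb of period q1 yields a spectral peak near k = L / q1.
# Skip the DC term and the smooth low-frequency envelope of the histogram.
spec = np.abs(np.fft.rfft(hist))
k = 5 + np.argmax(spec[5:L // 2])
q_est = int(round(L / k))
print(q_est)  # -> 8
```

In the paper's pipeline, this estimate is computed for each candidate DCT mode, mapped back to a QF_{1} hypothesis via the default quantization table, and the median over the candidate set is kept.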

### 4.3 Crop-and-paste image tampering detection by detecting the SDJPEG effects

The image under investigation is first divided into *B* × *B* subimages. Each *B* × *B* subimage is examined to detect whether it contains any image patch exhibiting SDJPEG effects, as follows:

- (i) For a specific coordinate shift (*x*_{S}, *y*_{S}) (0 ≤ *x*_{S}, *y*_{S} ≤ 7 and (*x*_{S}, *y*_{S}) ≠ (0, 0)), crop an image patch ${\mathrm{IMG}}_{{\mathit{x}}_{\mathrm{S}},{\mathit{y}}_{\mathrm{S}}}$ of size (*B* - 8) × (*B* - 8) from the subimage, starting at (*x*_{S}, *y*_{S}).
- (ii) With (*x*_{S}, *y*_{S}) and QF_{2}, extract the discriminative feature *sn*_{all} for ${\mathrm{IMG}}_{{\mathit{x}}_{\mathrm{S}},{\mathit{y}}_{\mathrm{S}}}$. The SDJPEG effect map (SEM) entry for (*x*_{S}, *y*_{S}) is then set to *sn*_{all}, i.e., SEM(*x*_{S}, *y*_{S}) = *sn*_{all}.
- (iii) Repeat steps (i) and (ii) for all 63 possible coordinate shifts to obtain the SEM for the *B* × *B* subimage.
- (iv) Compare the SEM with those of the positive (containing SDJPEG patches) and negative (not containing any SDJPEG patch) samples in the training database to detect whether the subimage has been tampered with. Fisher's linear discriminant analysis (LDA) [24] is adopted as the classifier in our approach.
- (v) Loop through steps (i) to (iv) for all *B* × *B* subimages in the image to detect the suspicious regions that might have been tampered with.
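Steps (i) to (iii) above amount to a small loop over the 63 candidate grid shifts. A minimal sketch, in which `extract_sn_all` stands in for the feature extraction of Sections 4.1 and 4.2 (its signature is our assumption):

```python
import numpy as np

def sdjpeg_effect_map(subimage, qf2, extract_sn_all):
    """Build the 8 x 8 SDJPEG effect map (SEM) for one B x B subimage.

    extract_sn_all(patch, (xs, ys), qf2) is assumed to return the scalar
    discriminative feature sn_all for one coordinate shift."""
    B = subimage.shape[0]
    sem = np.zeros((8, 8))
    for xs in range(8):
        for ys in range(8):
            if (xs, ys) == (0, 0):
                continue  # (0, 0) is the aligned grid, not a shifted one
            # (B - 8) x (B - 8) patch starting at the candidate shift
            patch = subimage[xs:xs + B - 8, ys:ys + B - 8]
            sem[xs, ys] = extract_sn_all(patch, (xs, ys), qf2)
    return sem
```

The resulting map is what step (iv) compares against the trained samples with the LDA classifier.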

## 5 Experiments and discussion

In this section, we first evaluate several key components of the proposed approach, i.e., the quantization error estimation for shifted JPEG compression, the primary quality factor estimation, and the proposed discriminative feature for SDJPEG detection. Then, we analyze the SDJPEG detection performance for image patches of various sizes and compare it with four state-of-the-art SDJPEG detection algorithms: Luo et al.'s [15] (Luo's for short), Chen and Hsu's [16] (Chen's for short), Bianchi and Piva's [13] (Bianchi I's for short), and Bianchi and Piva's [14] (Bianchi II's for short). Finally, we compare the image tampering detection performance of all five algorithms on two example images. The images in our experiments come from two widely used image databases, the UCID [25] and NRCS [26] datasets, and all the original images are uncompressed.

### 5.1 Effectiveness evaluation of the proposed approach

#### 5.1.1 Quantization error estimation for shifted JPEG compression

Half of the image patches are used to generate SSJPEG patches by shifted JPEG compression with quality factor QF_{2} randomly picked from 50 to 90 and a randomly selected (*x*_{S}, *y*_{S}) ((*x*_{S}, *y*_{S}) ≠ (0, 0)). The other half of the image patches are used to generate SDJPEG patches by aligned JPEG compression with quality factor QF_{1} randomly picked from 50 to 90, followed by shifted JPEG compression with quality factor QF_{2} randomly picked from 50 to 90 and a randomly selected (*x*_{S}, *y*_{S}) ((*x*_{S}, *y*_{S}) ≠ (0, 0)). For each image patch, the actual standard deviation of the quantization error caused by shifted JPEG compression (denoted by ${\mathit{\sigma}}^{\mathrm{ACT}-\mathrm{SJR}}=\left[{\mathit{\sigma}}_{\mathit{m},\mathit{n}}^{\mathrm{ACT}-\mathrm{SJR}}\right]$) is recorded as the ground truth, and the average relative estimation error is adopted to evaluate the estimation performance.

The average relative estimation error *η*_{SJR} for the above image patches using the proposed approach is 10.61%, which is much less than that using the rough approximation in [13, 14] (214.94%). Hence, the proposed DCT coefficient model can better describe the effects caused by shifted JPEG compression.

#### 5.1.2 Primary quality factor estimation

For each pair of QF_{1} and QF_{2}, 500 SDJPEG patches are generated with a randomly selected (*x*_{S}, *y*_{S}) ((*x*_{S}, *y*_{S}) ≠ (0, 0)). The average estimation accuracy is given in Table 3.

**Table 3: Average estimation accuracy (a1/a2/a3) of QF**_{1} **for patch sizes (a1) 64 × 64, (a2) 128 × 128, and (a3) 256 × 256, with various QF**_{1} **(rows) and QF**_{2} **(columns)**

| QF_{1} \ QF_{2} | 50 | 60 | 70 | 80 | 90 |
|---|---|---|---|---|---|
| 50 | 0.62/0.95/0.99 | 0.78/0.97/0.97 | 0.90/1.00/1.00 | 0.95/1.00/1.00 | 0.96/1.00/1.00 |
| 60 | 0.48/0.54/0.59 | 0.62/0.82/0.95 | 0.75/0.97/0.98 | 0.92/1.00/1.00 | 0.93/1.00/1.00 |
| 70 | 0.13/0.14/0.13 | 0.36/0.52/0.61 | 0.65/0.85/0.92 | 0.79/0.99/0.98 | 0.92/1.00/1.00 |
| 80 | 0.13/0.12/0.10 | 0.14/0.13/0.12 | 0.44/0.53/0.62 | 0.68/0.85/0.94 | 0.82/0.98/0.96 |
| 90 | 0.08/0.08/0.09 | 0.12/0.14/0.13 | 0.14/0.13/0.14 | 0.44/0.55/0.59 | 0.69/0.82/0.89 |

From the table, it is observed that the estimation performance improves with the increase of the patch size. This is because, for larger image patches, more data are collected to construct the histogram of the DCT coefficients and the period estimation is more robust. The estimation performance also improves with the increase of QF_{2} - QF_{1}. This can be explained as follows. From (7) and Figure 5 (a2 and b2), for SDJPEG image patches, the period in the histogram (i.e., the primary quantization step *q*_{m,n}(QF_{1})) is large for a low quality factor QF_{1}, and the standard deviation of the quantization noise with rounding error, i.e., ${\mathit{\sigma}}_{\mathit{m},\mathit{n}}^{\mathrm{SJR}}$, is small for a high quality factor QF_{2}. Hence, when QF_{2} - QF_{1} is large, i.e., QF_{2} - QF_{1} ≥ 10, the Gaussian impulses in the histogram are non-overlapping or only slightly overlapping, and the period can be estimated accurately. However, when QF_{2} - QF_{1} is small, i.e., QF_{2} - QF_{1} ≤ -20, the Gaussian impulses in the histogram overlap heavily and the periodic pattern almost disappears.

#### 5.1.3 Effectiveness of the proposed discriminative feature

To evaluate the effectiveness of the proposed discriminative feature *sn*_{all}, the distributions of *sn*_{all} for SSJPEG and SDJPEG patches of size 256 × 256 are analyzed. For SSJPEG patches, the quality factor QF_{2} is set to 70. For SDJPEG patches, the second compression quality factor QF_{2} is also set to 70 and the primary compression quality factor QF_{1} is randomly picked from 50 to 70. The coordinate shift (*x*_{S}, *y*_{S}) is also selected randomly in the range 0 ≤ *x*_{S}, *y*_{S} ≤ 7, (*x*_{S}, *y*_{S}) ≠ (0, 0). The histograms of *sn*_{all} for SSJPEG and SDJPEG patches are given in Figure 9. It is observed that *sn*_{all} is usually small for SDJPEG patches and large for SSJPEG ones, which demonstrates that *sn*_{all} is effective in differentiating SDJPEG and SSJPEG patches.

### 5.2 SDJPEG patch detection

- i. Image patches of three sizes, i.e., 64 × 64, 128 × 128, and 256 × 256, are used.
- ii. A total of 25 {QF_{1}, QF_{2}} pairs are investigated, with QF_{1}, QF_{2} ∈ {50, 60, 70, 80, 90}.
- iii. For a specific {QF_{1}, QF_{2}} pair, 10,000 image patches are randomly extracted from the uncompressed image database. The ‘positive’ samples are constructed by performing shifted JPEG compression with quality factor QF_{1} and random coordinate shifts (*x*_{S}, *y*_{S}) (0 ≤ *x*_{S}, *y*_{S} ≤ 7 and (*x*_{S}, *y*_{S}) ≠ (0, 0)) and then saving the image patches in JPEG format with QF_{2}. The ‘negative’ samples are constructed by directly saving the same uncompressed image patches in JPEG format with QF_{2}.
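The sample construction in item iii can be emulated with a toy blockwise-DCT codec. This is only a stand-in for real JPEG: a single uniform quantization step replaces the full quality-factor table, there is no pixel-domain rounding or chroma handling, and the function names are ours.

```python
import numpy as np

N = 8
k = np.arange(N)
# Orthonormal 8x8 DCT-II matrix: the 2D DCT of a block X is C @ X @ C.T
C = np.sqrt(2.0 / N) * np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * N))
C[0, :] = np.sqrt(1.0 / N)

def jpeg_like(img, q):
    """Toy JPEG round trip: blockwise DCT, uniform quantization step q,
    inverse DCT.  Trailing rows/columns not filling a block are dropped."""
    h, w = (s - s % N for s in img.shape)
    out = img[:h, :w].astype(float).copy()
    for i in range(0, h, N):
        for j in range(0, w, N):
            blk = C @ out[i:i + N, j:j + N] @ C.T
            out[i:i + N, j:j + N] = C.T @ (q * np.round(blk / q)) @ C
    return out

def sdjpeg_patch(img, q1, q2, xs, ys):
    """'Positive' sample: compress at step q1, crop by (xs, ys) so the
    block grids misalign, then compress again at step q2."""
    return jpeg_like(jpeg_like(img, q1)[xs:, ys:], q2)

img = np.random.default_rng(2).uniform(0, 255, (72, 72))
print(sdjpeg_patch(img, 8.0, 4.0, 4, 4).shape)  # -> (64, 64)
```

A ‘negative’ sample is simply `jpeg_like(img, q2)` applied to the uncompressed patch.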

Since only QF_{2} is available and QF_{1} is unknown to an algorithm, the settings for the different algorithms evaluated are as follows:

- 1) For Luo's [15] and Chen's [16] approaches, the SVM is adopted as the classifier. In the training procedure, since QF_{1} is unknown, for each QF_{2} a specific SVM is trained with the features extracted from half of the positive samples obtained from all possible selections of QF_{1} ∈ {50, 60, 70, 80, 90} and the corresponding negative samples (25,000 positive and 25,000 negative samples in total). In the testing procedure, the test samples are classified by their corresponding classifier with the knowledge of QF_{2}.
- 2) For Bianchi II's [14] approach, with the knowledge of QF_{2}, QF_{1} is estimated by an EM algorithm. Then the tampered-region likelihood is calculated for each 8 × 8 block in the image patch, and the investigated image patch is classified as an SDJPEG patch if half of the 8 × 8 blocks in the patch have a likelihood greater than 1. For Bianchi I's [13] approach, with the knowledge of QF_{2}, QF_{1} is obtained by an exhaustive search.
- 3) For the proposed approach, QF_{1} is estimated by the method introduced in Section 4.2. For each QF_{2}, SEM features for 630 positive samples (ten samples for each possible (*x*_{S}, *y*_{S}), with QF_{1} randomly selected) and 630 negative samples are used to train the classifier, and the rest (about 48,740 samples) are used for testing.

The classification results for the three patch sizes are reported below for various values of QF_{2} - QF_{1}. From the tables, the following observations can be made.

**Average classification accuracy, i.e., 1 - (FAR + FRR)/2, in percent, with various values of QF**_{1} **and QF**_{2} **for patch size of 64 × 64**

| QF_{1} \ QF_{2} | 50 | 60 | 70 | 80 | 90 |
|---|---|---|---|---|---|
| 50 | 51.86/50.40/50.96/50.90/ | 51.55/55.15/53.51/51.56/ | 54.89/70.08/62.07/54.29/ | 56.32/78.95/70.02/56.58/ | 62.29/82.12/76.71/63.01/ |
| 60 | 51.31/50.26/49.34/51.95/ | 50.57/50.87/52.10/50.34/ | 51.40/55.24/56.21/51.75/ | 53.92/70.50/62.51/54.31/ | 59.02/78.39/71.51/60.89/ |
| 70 | 50.62/50.44/50.70/50.80/ | 50.33/50.55/51.09/49.60/ | 51.30/51.13/51.07/50.45/ | 52.71/59.49/60.02/52.63/ | 56.33/70.25/65.25/57.22/ |
| 80 | 50.42/49.79/50.83 | 49.62/50.63/49.77/50.44 | 51.62/50.42/50.09/49.66/ | 50.49/51.91/51.77/50.33/ | 54.34/58.41/59.39/53.90/ |
| 90 | 50.44/50.27/49.26/49.13/ | 49.86/49.41/49.20 | 49.89/49.91/49.45/49.87/ | 49.39/50.37/49.62/50.48/ | 50.02/50.14/50.48/50.26/ |

**Average classification accuracy for patch size of 128 × 128**

| QF_{1} \ QF_{2} | 50 | 60 | 70 | 80 | 90 |
|---|---|---|---|---|---|
| 50 | 51.81/52.95/53.82/51.88/ | 52.62/56.83/55.13/52.30/ | 58.25/72.74/67.85/59.59/ | 64.26/81.69/77.62/65.94/ | 74.29/89.93/86.55/75.15/ |
| 60 | 50.91/51.96/51.45/51.29/ | 51.33/52.02/51.37/51.48/ | 54.12/61.65/59.72/55.08/ | 58.48/74.19/70.39/58.44/ | 69.45/81.78/79.76/70.78/ |
| 70 | 50.00/50.58/50.12/50.23 | 51.26/50.05/50.53/50.99/ | 52.64/53.55/53.97/51.57/ | 53.76/64.37/64.07/54.28/ | 62.79/76.49/78.42/62.53/ |
| 80 | 50.10/50.24/50.00 | 50.67/49.96 | 50.11/50.30/50.44/50.05/ | 50.99/52.79/52.42/51.00/ | 57.24/65.44/68.36/58.44/ |
| 90 | 50.25/49.36/49.58 | 49.53/49.55 | 49.47/49.19/50.25/50.20/ | 49.86/52.09/52.67/49.39/ | 52.46/53.31/53.25/52.33/ |

**Average classification accuracy for patch size 256 × 256**

| QF_{1} \ QF_{2} | 50 | 60 | 70 | 80 | 90 |
|---|---|---|---|---|---|
| 50 | 53.24/58.51/59.75/52.84/ | 56.39/66.82/64.12/57.54/ | 65.63/78.59/71.01/67.88/ | 75.82/89.20/86.18/77.05/ | 84.39/94.19/93.59/86.72/ |
| 60 | 50.91/53.52/52.40/50.08/ | 55.02/60.36/57.57/54.81/ | 58.23/65.21/63.58/60.39/ | 67.78/78.18/71.37/69.72/ | 80.62/89.98/88.14/81.64/ |
| 70 | 50.81/49.97/50.13 | 50.68/53.94/52.15/50.00/ | 52.14/57.31/57.99/53.63/ | 59.04/68.23/69.21/61.30/ | 74.18/84.12/82.61/75.50/ |
| 80 | 49.62/50.53/50.47 | 49.39/49.38/50.17 | 50.04/50.48/50.31/50.71/ | 53.55/58.32/59.16/52.89/ | 64.82/72.41/71.58/64.57/ |
| 90 | 49.99/50.38/50.26/49.34 | 50.72/50.56/49.39 | 50.02/50.24/49.78/49.67/ | 50.03/52.93/51.13/49.66/ | 54.39/54.27/53.67/54.85/ |

First, the recognition accuracy increases substantially with the patch size. As the features used in all five methods are based on statistical models, more data leads to better estimation of the models. Among the five approaches, our algorithm consistently outperforms the others, especially when the patch size is small (e.g., 64 × 64), since it requires less data to estimate a discriminative model. Since only one or a few small image patches are modified in most image tampering detection tasks, our algorithm is more appropriate for such applications.

Second, the detection performance improves with the increase of QF_{2} - QF_{1}. This is because a lower quality factor in the first JPEG compression leaves more traces of the compression history, while a higher quality factor in the second JPEG compression introduces less distortion to the image, making the traces left by the first compression easier to detect. The proposed approach achieves reasonable accuracies (above 80%) when QF_{2} - QF_{1} ≥ 10 and high accuracies (above 90%) when QF_{2} - QF_{1} ≥ 20 for 128 × 128 and 256 × 256 patches. However, when QF_{2} - QF_{1} ≤ -20, the detection performance is very poor for all approaches, i.e., the classification accuracies are close to random guessing (50%). The poor performance of the proposed approach is to be expected because when QF_{2} < QF_{1}, ${\mathit{q}}_{\mathit{m},\mathit{n}}/{\mathit{\sigma}}_{\mathit{m},\mathit{n}}^{\mathrm{SJR}}$ will be very small for all the DCT modes and ${\mathit{s}}_{\mathit{m},\mathit{n}}^{\mathrm{SD}}$ will be close to ${\mathit{s}}_{\mathit{m},\mathit{n}}^{\mathrm{SS}}$. In addition, as shown in subsection 5.1.2, with small values of ${\mathit{q}}_{\mathit{m},\mathit{n}}/{\mathit{\sigma}}_{\mathit{m},\mathit{n}}^{\mathrm{SJR}}$, the Gaussian components of ${\mathit{h}}_{\mathit{m},\mathit{n}}^{\mathrm{SD}}\left(\mathit{f}\right)$ in (10) overlap heavily and the estimation of QF_{1} is not accurate, especially when the patch size is small.

### 5.3 Crop-and-paste image tampering detection

The forged images are generated by cropping a region from a JPEG image compressed with quality factor QF_{1}, pasting it onto the original image, and saving the result as a JPEG file with quality factor QF_{2}. Three settings of {QF_{1}, QF_{2}}, i.e., {50,50}, {50,70}, and {50,90}, are investigated, and the detected tampered region is highlighted in the figures. Note that for all three approaches, the corresponding parameters are adjusted to ensure that the false alarm rate (FAR) is less than 10%. Moreover, the detection results undergo a 3 × 3 median filtering process to remove isolated detection errors. It is observed from the figures that the proposed algorithm can accurately detect and locate the tampered region when {QF_{1}, QF_{2}} = {50,70} and {50,90}, and it also points out some suspicious regions when {QF_{1}, QF_{2}} = {50,50}. Compared with Bianchi and Piva's approaches [13, 14], the proposed approach provides more accurate detection results due to (i) better estimation of the quantization error caused by shifted JPEG compression and (ii) the more discriminative information exploited by the adaptive DCT mode selection approach in Section 4.1, compared with [13] where only the DC mode is investigated.

## 6 Conclusions

In this paper, a new JPEG image tampering detection algorithm based on SDJPEG detection is presented. The DCT coefficient distribution for SDJPEG patches is described by a statistical model, and a discriminative feature is proposed which can effectively differentiate between SDJPEG and SSJPEG patches. An adaptive DCT mode selection scheme selects several highly discriminative DCT modes. We used several thousand SDJPEG and SSJPEG patches extracted from the UCID and NRCS image databases to evaluate the performance of the proposed algorithm along with several existing algorithms. The experimental results show that the proposed algorithm achieves much better results than the existing algorithms, especially when the patch size is small. We also performed experiments to detect and locate the tampered regions of two test images, where the proposed algorithm consistently outperforms the other algorithms investigated. The proposed algorithm thus provides a new and effective solution for SDJPEG detection and JPEG image forensics.

## Appendix

### Derivation of DCT coefficient from adjacent blocks

*A* can be decomposed into four non-overlapping regions, i.e., *A*_{1}, *A*_{2}, *A*_{3}, and *A*_{4}, with *A* = *A*_{1} + *A*_{2} + *A*_{3} + *A*_{4}. To derive *A*_{i} = *H*_{i1}*B*_{i}*H*_{i2}, we demonstrate only the case *i* = 4; the same analysis can be applied to *i* = 1, 2, 3 (Figure 11).

For the bottom right corner, *A*_{4} has the following relationship to *B*_{4}: *A*_{4} = *H*_{41}*B*_{4}*H*_{42}, where ${\mathit{H}}_{41}=\left[\begin{array}{cc}\hfill 0\hfill & \hfill 0\hfill \\ \hfill {\mathit{I}}_{\mathit{h}}\hfill & \hfill 0\hfill \end{array}\right]$ and ${\mathit{H}}_{42}=\left[\begin{array}{cc}\hfill 0\hfill & \hfill {\mathit{I}}_{\mathit{w}}\hfill \\ \hfill 0\hfill & \hfill 0\hfill \end{array}\right]$. *I*_{h} and *I*_{w} are identity matrices of size *h* × *h* and *w* × *w*, respectively, where *h* and *w* are the numbers of rows and columns extracted. Since all unitary orthogonal transforms such as the DCT are distributive over matrix multiplication [19], we have $\mathit{D}\left({\mathit{A}}_{4}\right)=\mathit{D}\left({\mathit{H}}_{41}\right)\mathit{D}\left({\mathit{B}}_{4}\right)\mathit{D}\left({\mathit{H}}_{42}\right)\Rightarrow \mathit{D}\left(\mathit{A}\right)={\displaystyle \sum _{\mathit{i}=1}^{4}\mathit{D}\left({\mathit{A}}_{\mathit{i}}\right)}={\displaystyle \sum _{\mathit{i}=1}^{4}\mathit{D}\left({\mathit{H}}_{\mathit{i}1}\right)\mathit{D}\left({\mathit{B}}_{\mathit{i}}\right)\mathit{D}\left({\mathit{H}}_{\mathit{i}2}\right)}$.
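The distributive step can be checked numerically: writing the orthonormal DCT as *D*(*X*) = *CXC*^{T}, the factors *C*^{T}*C* = *I* cancel between matrix products. A small NumPy verification (the choice *h* = 3, *w* = 5 is arbitrary):

```python
import numpy as np

N = 8
k = np.arange(N)
# Orthonormal 8x8 DCT-II matrix; D(X) = C @ X @ C.T
C = np.sqrt(2.0 / N) * np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * N))
C[0, :] = np.sqrt(1.0 / N)

def D(X):
    return C @ X @ C.T

h, w = 3, 5  # rows and columns extracted from B4
# H41 holds I_h in its bottom-left block, H42 holds I_w in its top-right block
H41 = np.zeros((N, N)); H41[N - h:, :h] = np.eye(h)
H42 = np.zeros((N, N)); H42[:w, N - w:] = np.eye(w)

B4 = np.random.default_rng(1).normal(size=(N, N))
A4 = H41 @ B4 @ H42  # top-left h x w of B4 moved to the bottom-right of A4

# D(A4) = D(H41) D(B4) D(H42), since C.T @ C = I cancels in the product
print(np.allclose(D(A4), D(H41) @ D(B4) @ D(H42)))  # -> True
```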

## Declarations

### Acknowledgements

The work described in this paper is supported by National Key Basic Research Program of China No. 2013CB329603 and NSFC Fund No. 61271319 and No. 61071152.

## Authors’ Affiliations

## References

- Wang X, Xue J, Zheng Z, Liu Z, Li N: Image forensic signature for content authenticity analysis. *J. Vis. Commun. Image Represent.* 2012, 23(5):782-797. 10.1016/j.jvcir.2012.03.005
- Swaminathan A, Mao Y, Wu M: Robust and secure image hashing. *IEEE Trans. Inf. Forensics Sec.* 2006, 1(2):215-230. 10.1109/TIFS.2006.873601
- Phadikar A, Maity SP, Mandal M: Novel wavelet-based QIM data hiding technique for tamper detection and correction of digital images. *J. Vis. Commun. Image Represent.* 2012, 23(3):454-466. 10.1016/j.jvcir.2012.01.005
- Farid H: A survey of image forgery detection. *IEEE Signal Process. Mag.* 2009, 2(26):16-25.
- Popescu AC: *Statistical tools for digital image forensics*. Ph.D. Thesis, Department of Computer Science, Dartmouth College, Hanover, NH; 2005.
- Popescu AC, Farid H: Exposing digital forgeries in color filter array interpolated images. *IEEE Trans. Signal Process.* 2005, 53(10):3948-3959.
- Chen L, Lu W, Ni J, Sun W, Huang J: Region duplication detection based on Harris corner points and step sector statistics. *J. Vis. Commun. Image Represent.*, in press.
- Lukas J, Fridrich J: Estimation of primary quantization matrix in double compressed JPEG images. In *Proceedings of Digital Forensic Research Workshop*. Cleveland; 2003:5-8.
- Fu D, Shi YQ, Su Q: A generalized Benford's law for JPEG coefficients and its applications in image forensics. In *Proceedings of the SPIE Electronic Imaging, Security and Watermarking of Multimedia Contents IX, vol. 6505*. San Jose, CA; 2007.
- Pevny T, Fridrich J: Detection of double-compression in JPEG images for applications in steganography. *IEEE Trans. Inf. Forensics Sec.* 2008, 3(2):247-258.
- Farid H: Exposing digital forgeries from JPEG ghosts. *IEEE Trans. Inf. Forensics Sec.* 2009, 4(1):154-160.
- Ye SM, Sun QB, Chang EC: Detecting digital image forgeries by measuring inconsistencies of blocking artifact. In *Proceedings of the IEEE International Conference on Multimedia and Expo*. Beijing, China; 2007.
- Bianchi T, Piva A: Detection of nonaligned double JPEG compression based on integer periodicity maps. *IEEE Trans. Inf. Forensics Sec.* 2012, 7(2).
- Bianchi T, Piva A: Image forgery localization via block-grained analysis of JPEG artifacts. *IEEE Trans. Inf. Forensics Sec.* 2012, 7(3).
- Luo W, Qu Z, Huang J, Qiu G: A novel method for detecting cropped and recompressed image block. In *Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP'07)*. Honolulu, Hawaii; 2007:217-220.
- Chen YL, Hsu CT: Detecting recompression of JPEG images via periodicity analysis of compression artifacts for tampering detection. *IEEE Trans. Inf. Forensics Sec.* 2011, 6(2):396-406.
- Lin Z, He J, Tang X, Tang CK: Fast, automatic and fine-grained tampered JPEG image detection via DCT coefficient analysis. *Pattern Recogn.* 2009, 42(11):2492-2501. 10.1016/j.patcog.2009.03.019
- Huang FJ, Huang JW, Shi YQ: Detecting double JPEG compression with the same quantization matrix. *IEEE Trans. Inf. Forensics Sec.* 2010, 5(4):848-856.
- Chang SF, Messerschmitt DG: Manipulation and compositing of MC-DCT compressed video. *IEEE J. Sel. Areas Commun.* 1995, 13(1):1-11. 10.1109/49.363151
- Reininger R, Gibson J: Distributions of the two-dimensional DCT coefficients for images. *IEEE Trans. Commun.* 1983, 31(6):835-839. 10.1109/TCOM.1983.1095893
- Fan Z, de Queiroz RL: Identification of bitmap compression history: JPEG detection and quantizer estimation. *IEEE Trans. Image Process.* 2003, 12(2):230-235. 10.1109/TIP.2002.807361
- Popescu AC, Farid H: Statistical tools for digital forensics. *Lect. Notes Comput. Sci.* 2005, 3200:395-407.
- Luo W, Huang J, Qiu G: JPEG error analysis and its application to digital image forensics. *IEEE Trans. Inf. Forensics Sec.* 2010, 5(3):480-491.
- Duda RO, Hart PE, Stork DG: *Pattern Classification*. 2nd edition. Wiley-Interscience, USA; 2000.
- Schaefer G, Stich M: UCID: an uncompressed color image database. In *Proceedings of the SPIE Storage and Retrieval Methods and Applications for Multimedia, vol. 5307*. San Jose, CA; 2003:472-480.
- NRCS Photo Gallery 2005. http://photogallery.nrcs.usda.gov/res/sites/photogallery/

## Copyright

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited.