Modified BCH data hiding scheme for JPEG steganography
EURASIP Journal on Advances in Signal Processing volume 2012, Article number: 89 (2012)
Abstract
In this article, a new Bose-Chaudhuri-Hocquenghem (BCH)-based data hiding scheme for JPEG steganography is presented. Traditional data hiding approaches hide data in each block, and the blocks do not overlap. In the proposed method, however, two consecutive blocks overlap to form a combined block that is larger than a single block but smaller than two consecutive non-overlapping blocks. To embed more data into the combined block than into a single block, the BCH-based data hiding scheme has to be redesigned. We propose a way to obtain a joint solution for hiding data in two blocks with intersected coefficients such that any modification of the intersected area does not disturb data hiding in either block. Because data are hidden twice in the intersected area, embedding capacity is increased. In addition, the nonzero DCT coefficient stream is modified to improve resistance to steganalysis and to reduce the distortion impact after data hiding. This approach carefully inserts or removes coefficients 1 or −1 into or from the DCT coefficient stream according to the rule proposed in this article. Experimental results show that the proposed algorithms work well and that their performance gain is significant.
1. Introduction
One of the first steganography methods for JPEG images embeds data by changing the least-significant bit values of the quantized discrete cosine transform (DCT) coefficients. However, this method can easily be detected by statistical analysis. Thus, for a good while, evading statistical analysis has been a major concern. Provos [1] divides the DCT coefficients into two disjoint subsets, hides data in the first subset, and compensates for the distorted histogram by modifying the second subset. Other methods in [2, 3] use a similar approach. On the other hand, Solanki et al. [4] utilize a robust watermarking scheme for steganography purposes. They embed data into the image in the spatial domain by using a technique robust against JPEG compression. Their scheme degrades the features of the DCT coefficients less, and, as a result, its detectability was low against older versions of statistical steganalysis.
Another way to survive steganalysis is to reduce the number of modified coefficients. Traditionally, each nonzero DCT coefficient has been modified. As a result, embedding capacity is as large as the number of nonzero DCT coefficients. However, the maximum possible embedding capacity comes at the cost of higher detectability. Westfeld [5] has used a matrix encoding (ME) technique to lower detectability by sacrificing embedding capacity. The ME technique exploits the Hamming code, which was designed for error correction. His scheme hides many bits by flipping at most one coefficient in each block. This approach was the first instance of using an error-correcting code for data hiding.
Fridrich et al. [6–13] use the concept of the "minimal distortion" to enhance the security (i.e., by reducing distortion). The perturbed quantization steganography utilizes the wet paper coding.
Later, Kim et al. [14] improved the performance of ME by reducing the distortion impact. In fact, their modified matrix encoding (MME) method changes more coefficients than ME. However, they show that the distortion impact of modifying one coefficient may be larger than that of modifying two coefficients. Thus, choosing between modifying one or two coefficients per block can yield less distortion and lower detectability against steganalysis. Note that MME requires the original uncompressed image for data hiding, but not for decoding.
Schönfeld and Winkler [15] have proposed a new way to hide data using a more powerful error correction code. They use a structured Bose-Chaudhuri-Hocquenghem (BCH) code [2]. Zhang et al. [16] have significantly improved the original BCH-based data hiding scheme. Their improved method can easily find the flip positions and resists steganalysis better than the existing methods. Later, Sachnev et al. [17] applied a heuristic optimization technique to the data hiding scheme over the BCH coding and modified the stream of the input DCT coefficients to reduce the distortion. Their method considerably outperforms the steganography method proposed by Zhang et al. [16].
Recently, Filler and Fridrich [18] have proposed a remarkable framework which minimizes a distortion measure defined as a weighted norm of the difference between cover and stego feature vectors. In their approach, the distortion is not necessarily an additive function over the pixels because the features may contain higher-order statistics such as sample transition probability matrices of pixels or DCT coefficients modeled as Markov chains [19–21]. When the distortion measure is defined as a sum of local potentials, practical near-optimal embedding methods can be implemented with syndrome-trellis codes [22].
Most of the above-mentioned steganographic methods use non-overlapping blocks of the DCT coefficients for hiding the secret message. Such a block-wise embedding scheme divides both the stream of the DCT coefficients and the hidden message into separate blocks and solves the equations for hiding data for each block individually. Recent methods such as MME [14] and the BCH-based steganography methods [15–17] may produce several alternative solutions. Thus, such a data hiding method can choose the solution with the lowest distortion impact. Past investigation of the BCH data hiding scheme finds that BCH usually yields a redundant number of possible solutions. This means that a solution with acceptable distortion impact can be obtained from a reduced set of possible solutions. Hence, the embedding efficiency of the BCH steganographic methods can be increased by reducing the number of possible solutions while keeping a distortion impact similar to that of the original approach.
In the proposed method, two blocks of the DCT coefficients form a combined block sharing common coefficients in the intersected part between two consecutive blocks. Such a design achieves high embedding efficiency by hiding data twice in the intersected area. The number of possible joint solutions for both blocks (i.e., solutions which are valid for both blocks) is always smaller than the number of all possible solutions for two independent blocks. The reduced number of possible solutions can increase distortion, but not significantly. Besides, the number of possible solutions can easily be controlled by changing the size of the intersected area. The smaller the intersected area, the larger the number of possible joint solutions. A similar approach has been tested for the Hamming code in [23].
However, the larger the intersected area, the higher the embedding efficiency of the proposed method. In the proposed method, the block of the DCT coefficients can be modified by inserting new nonzero coefficients 1 or −1, or by removing coefficients 1 or −1. Such modification is carried out carefully in order to reduce the distortion caused by excessive hiding.
The rest of the article is organized as follows. Section 2 explains the details of the BCH coding. Section 3 presents the BCH-based modified data hiding scheme. In Section 4, we propose the inserting-removing strategy. The encoder and decoder are presented in Section 5. Section 6 provides the experimental results. Finally, Section 7 concludes the article.
2. BCH syndrome coding
BCH codes are a well-known and widely used family of error correction codes. A BCH code (n, k, t) can correct t bit errors by appending n − k additional bits to the original k-bit message such that the syndrome of the resulting n bits is equal to 0. BCH codes were invented for error correction and cannot directly be used for data hiding. An efficient method of using powerful BCH codes for data hiding has been presented in [15–17].
2.1. BCH syndrome coding
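The generalized parity-check matrix H for BCH coding is presented as follows:
$H=\left[\begin{array}{ccccc}1& \alpha & {\alpha }^{2}& \cdots & {\alpha }^{n-1}\\ 1& {\alpha }^{3}& {\left({\alpha }^{3}\right)}^{2}& \cdots & {\left({\alpha }^{3}\right)}^{n-1}\\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1& {\alpha }^{2t-1}& {\left({\alpha }^{2t-1}\right)}^{2}& \cdots & {\left({\alpha }^{2t-1}\right)}^{n-1}\end{array}\right]$ (1)
where α is a primitive element of GF(2^{m}). Let t be 2. Then, the parity-check matrix is expressed as follows:
$H=\left[\begin{array}{ccccc}1& \alpha & {\alpha }^{2}& \cdots & {\alpha }^{n-1}\\ 1& {\alpha }^{3}& {\alpha }^{6}& \cdots & {\alpha }^{3\left(n-1\right)}\end{array}\right]$ (2)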
Assume that the original stream of binary data is V = {v_{0}, v_{1}, v_{2}, ..., v_{n−1}}, and the modified stream of binary data after data hiding is R = {r_{0}, r_{1}, r_{2}, ..., r_{n−1}}. The streams V and R over GF(2^{m}) can be represented as V(x) = v_{0} + v_{1}·x + v_{2}·x^{2} + v_{3}·x^{3} + ⋯ + v_{n−1}·x^{n−1}, and R(x) = r_{0} + r_{1}·x + r_{2}·x^{2} + r_{3}·x^{3} + ⋯ + r_{n−1}·x^{n−1}, respectively.
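The embedded message m can be computed from the modified stream as follows:
$m=R\cdot {H}^{T}$ (3)
Thus, hiding the message m in V requires finding an R that satisfies Equation (3).
The difference between V and R shows the number and location of the elements in V to be flipped:
$E=R-V$ (4)
or
$E\left(x\right)=R\left(x\right)-V\left(x\right)={x}^{{u}_{0}}+{x}^{{u}_{1}}+\cdots +{x}^{{u}_{l}}$ (5)
where u = {u_{0}, u_{1}, u_{2}, ..., u_{l}} are the positions of the elements in V to be flipped in order to get R.
Using Equations (3) and (4), the syndrome S can be computed as follows:
$S=m-V\cdot {H}^{T}=E\cdot {H}^{T}$ (6)
If t is 2, then
$S=\left\{{S}_{1}\;{S}_{2}\right\},\quad {S}_{1}=\sum _{i}{\alpha }^{{u}_{i}},\quad {S}_{2}=\sum _{i}{\alpha }^{3{u}_{i}}$ (7)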
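For t = 2 the syndrome has two components: S_{1} sums α^{i} and S_{2} sums α^{3i} over the positions i that hold a 1. The following Python sketch illustrates that computation for a 15-bit stream over GF(2^{4}). The primitive polynomial x^{4} + x + 1, the 0-based position indexing, and the function names are assumptions made for this illustration; they are not taken from the article.

```python
def gf_exp_table(m=4, prim_poly=0b10011):
    """Return table[i] = alpha^i for a primitive element alpha of GF(2^m).

    prim_poly = x^4 + x + 1 is an assumed choice for m = 4.
    """
    table, x = [], 1
    for _ in range((1 << m) - 1):
        table.append(x)
        x <<= 1                 # multiply by alpha
        if x >> m:              # reduce modulo the primitive polynomial
            x ^= prim_poly
    return table


def syndrome(bits, exp):
    """Two-part syndrome (S1, S2) of a binary stream for t = 2.

    Addition in GF(2^m) is XOR; positions are 0-based in this sketch.
    """
    n = len(exp)
    s1 = s2 = 0
    for i, b in enumerate(bits):
        if b:
            s1 ^= exp[i % n]
            s2 ^= exp[(3 * i) % n]
    return s1, s2


exp = gf_exp_table()
# g(x) = x^8 + x^7 + x^6 + x^4 + 1 generates BCH(15, 7, 2), so as a codeword
# its syndrome is zero.
codeword = [1, 0, 0, 0, 1, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0]
print(syndrome(codeword, exp))      # (0, 0)

flipped = codeword.copy()
flipped[3] ^= 1                     # flip position 3
print(syndrome(flipped, exp))       # (8, 10), i.e. (alpha^3, alpha^9)
```

Flipping a single position u changes the syndrome by (α^{u}, α^{3u}), which is why the flip positions can be recovered from the syndrome alone.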
2.2. Lookup tables
In this article, we use the method of Zhao et al. [24], which is based on fast lookup tables, to find the roots of the quadratic and cubic polynomials σ(x). A similar approach has been used in [16, 17].
2.3. Solutions
Hiding the message m in the binary stream V requires finding the positions of the coefficients to be flipped. In this article, we use the method presented in [16, 17] to get solutions with one, two, three, or four flips. The sets of all possible solutions with one, two, three, or four flips are stored in the lookup tables J_{1}, J_{2}, J_{3}, and J_{4}, respectively. The notation J_{3}(S) returns all three-flip solutions for the syndrome S = {S_{1} S_{2}}. Similarly, we can get all possible solutions for block n_{1} with syndrome S^{I} as J^{I} = {J_{1}(S^{I}) J_{2}(S^{I}) J_{3}(S^{I}) J_{4}(S^{I})} and for block n_{2} with syndrome S^{II} as J^{II} = {J_{1}(S^{II}) J_{2}(S^{II}) J_{3}(S^{II}) J_{4}(S^{II})}, respectively. The size of the lookup tables is (2^{2·m} − 1) × nS, where nS is the number of stored solutions.
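For a small block the content of the tables J_{1} to J_{4} can be illustrated by exhaustive search. The sketch below is not the lookup-table root-finding method of [24]; it simply enumerates every set of at most four positions and keeps those whose flips produce the requested syndrome. The field GF(2^{4}) with the primitive polynomial x^{4} + x + 1, the 0-based positions, and the function names are assumptions of this example.

```python
from itertools import combinations


def gf_exp_table(m=4, prim_poly=0b10011):
    """table[i] = alpha^i in GF(2^m); x^4 + x + 1 is an assumed primitive polynomial."""
    table, x = [], 1
    for _ in range((1 << m) - 1):
        table.append(x)
        x <<= 1
        if x >> m:
            x ^= prim_poly
    return table


def flip_solutions(target, exp, max_flips=4):
    """Return {k: list of k-position tuples} whose flips yield the syndrome `target`."""
    n = len(exp)
    found = {k: [] for k in range(1, max_flips + 1)}
    for k in range(1, max_flips + 1):
        for positions in combinations(range(n), k):
            s1 = s2 = 0
            for u in positions:
                s1 ^= exp[u]                 # contribution alpha^u to S1
                s2 ^= exp[(3 * u) % n]       # contribution alpha^(3u) to S2
            if (s1, s2) == target:
                found[k].append(positions)
    return found


exp = gf_exp_table()
# Syndrome produced by flipping positions 3 and 10.
S = (exp[3] ^ exp[10], exp[9] ^ exp[0])
J = flip_solutions(S, exp)
print(J[1], J[2])       # [] [(3, 10)]
```

Because the code has minimum distance 5, a syndrome has at most one solution with one or two flips; the three-flip and four-flip entries in J[3] and J[4] are the alternative solutions that a distortion-aware embedder can choose from.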
3. Proposed data hiding scheme
In the proposed BCH data hiding scheme, we combine two BCH blocks of 2^{m} − 1 DCT coefficients into one, such that the BCH blocks intersect each other. Figure 1 shows the block diagram of coefficients for the proposed scheme. In the presented example, (a_{1}, a_{2}, a_{3}, ..., a_{25}) is the combined block of the DCT coefficients; $\left({v}_{1}^{\prime},{v}_{2}^{\prime},{v}_{3}^{\prime},\dots ,{v}_{15}^{\prime}\right)$ and $\left({v}_{1}^{\prime\prime},{v}_{2}^{\prime\prime},{v}_{3}^{\prime\prime},\dots ,{v}_{15}^{\prime\prime}\right)$ are the corresponding binary coefficients for the BCH blocks n_{1} and n_{2}, respectively. The intersected area I covers five coefficients a_{11}, a_{12}, a_{13}, a_{14}, and a_{15} in this example. Such a scheme can hide more data by exploiting the intersected area, and it can be used with any kind of coding scheme.
One of the two main contributions of this article is a systematic algorithm for finding joint solutions. The proposed BCH-based data hiding scheme requires finding a joint solution for both blocks n_{1} and n_{2} using the guidelines from Section 2.1 such that the intersected area does not affect the result. For example, let 8 bits be hidden in 15 coefficients from a_{1} to a_{15} using BCH-based steganography. In the traditional approach, another 8 bits are hidden in the next block of 15 coefficients from a_{16} to a_{30}. As a result, 16 bits are hidden in 30 coefficients. Our new approach, however, hides the same amount of data in 25 coefficients a_{1} to a_{25}. Eight bits are hidden in the coefficients from a_{1} to a_{15}, and another eight bits in the coefficients from a_{11} to a_{25}. The data hiding algorithm requires finding the syndromes S^{I} and S^{II} (Equation 6) for blocks n_{1} and n_{2}, respectively.
There are two possible ways to hide data in the combined block: hiding data in the block n_{1} first, or in the block n_{2} first. The proposed algorithm for getting a joint solution is designed as follows:

1. Hiding data into the block n_{1} first.
(a) Some solutions for hiding data into the block n_{1} do not modify the coefficients in the intersected area. Thus, solutions for the block n_{2} have to be obtained using the original syndrome S^{II}. Some of these solutions are valid since they do not modify the coefficients in the intersected area. These solutions are called specified solutions.
(b) Some solutions for the block n_{1} modify the coefficients in the intersected area. These modifications affect the syndrome for the block n_{2}. Thus, the new syndrome for the block n_{2} is obtained as ${S}_{new}^{\mathsf{\text{II}}}$. Some of the new solutions are valid since they do not modify the coefficients already modified by n_{1} in the intersected area.
Among all possible solutions for the block n_{2} and the new syndrome ${S}_{new}^{\mathsf{\text{II}}}$ (in case of 1(a), ${S}_{new}^{\mathsf{\text{II}}}={S}^{\mathsf{\text{II}}}$), choose the solutions which do not have flipping positions in the intersected area (i.e., valid or specified solutions). Thus, the joint solutions for a combined block unify the solutions for the block n_{1} and its syndrome S^{I} with the specified solutions for the block n_{2} and its syndrome ${S}_{new}^{\mathsf{\text{II}}}$.
2. Hiding data into the block n_{2} first.
(a) Some solutions for hiding data into the block n_{2} do not modify the coefficients in the intersected area. Thus, solutions for the block n_{1} have to be obtained using the original syndrome S^{I}. Some of these solutions are valid since they do not modify the coefficients in the intersected area.
(b) Some solutions for the block n_{2} modify the coefficients in the intersected area. These modifications affect the syndrome for the block n_{1}. Thus, the new syndrome for the block n_{1} is obtained as ${S}_{new}^{\mathsf{\text{I}}}$. Some of the new solutions are valid since they do not modify the coefficients already modified by n_{2} in the intersected area.
The joint solutions for a combined block unify the solutions for the block n_{2} and its syndrome S^{II} with the specified solutions for the block n_{1} and its syndrome ${S}_{new}^{\mathsf{\text{I}}}$ (in case of 2(a), ${S}_{new}^{\mathsf{\text{I}}}={S}^{\mathsf{\text{I}}}$).
In general, the proposed modified BCH data hiding scheme hides 4·m bits of data in a combined block of 2·(2^{m} − 1) − I coefficients by using the BCH scheme (2^{m} − 1, k, 2) for blocks n_{1} and n_{2}.
The BCH-based data hiding scheme needs a suitable parameter m for hiding the message M in the stream of N nonzero DCT coefficients. The parameter m can be obtained as follows:
where m defines the proper BCH-based scheme for the proposed method, N is the number of nonzero DCT coefficients, M is the hidden message, n^{p} = 2·(2^{m} − 1) − I is the size of the combined block, and 4·m is the capacity of the combined block.
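The arithmetic behind the combined block can be sketched in a few lines of Python. The block size n^{p} = 2·(2^{m} − 1) − I and the capacity 4·m come from the text. The selection rule in `choose_m` is an assumption made for this illustration, namely the largest m for which the available combined blocks can still carry the whole message; the function names, the search range for m, and the fixed overlap are likewise hypothetical.

```python
def combined_block_size(m, overlap):
    """Size n^p of a combined block: two blocks of 2^m - 1 coefficients sharing `overlap`."""
    return 2 * (2 ** m - 1) - overlap


def choose_m(num_coeffs, message_bits, overlap, m_min=3, m_max=10):
    """Largest m in [m_min, m_max] whose combined blocks can carry the message.

    Assumed rule: floor(N / n^p) * 4m >= |M|. Returns None if no m fits.
    """
    best = None
    for m in range(m_min, m_max + 1):
        n_p = combined_block_size(m, overlap)
        if n_p <= 0:
            continue
        capacity = (num_coeffs // n_p) * 4 * m   # blocks available times bits per block
        if capacity >= message_bits:
            best = m
    return best


print(combined_block_size(4, 5))                                  # 25
print(choose_m(num_coeffs=40000, message_bits=8000, overlap=5))   # 5
```

With m = 4 and I = 5 the block size is 25, matching the a_{1} to a_{25} example: 16 bits are carried by 25 coefficients instead of 30.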
3.1. Data hiding algorithm
The proposed method requires finding a solution for the two blocks n_{1} and n_{2} that hides the two messages m_{1} and m_{2} together such that
${m}_{1}={R}_{1}\cdot {H}^{T},\quad {m}_{2}={R}_{2}\cdot {H}^{T}$ (9)
where R_{1} and R_{2} are the modified streams of the binary coefficients obtained from n_{1} and n_{2} (see Figure 1), and H is the parity-check matrix from Equation (1).
Note that hiding the message m_{1} in block n_{1} modifies the block n_{2} and vice versa, due to the intersected part. Hence, we need to find proper flip positions by solving Equation (9) so that both messages decode correctly.
Among all possible solutions, the proposed method unifies the solutions for blocks n_{1} and n_{2} whose flip positions cover only the non-intersected area of both blocks (i.e., ${J}_{s}^{\mathsf{\text{I}}}={J}^{\mathsf{\text{I}}}\notin I$ and ${J}_{s}^{\mathsf{\text{II}}}={J}^{\mathsf{\text{II}}}\notin I$, for blocks n_{1} and n_{2}). In other words, it is desirable to hide data in the block n_{1} using the solutions from ${J}_{s}^{\mathsf{\text{I}}}$ that do not affect the block n_{2}, and vice versa. According to the above explanation, ${J}_{s}^{\mathsf{\text{I}}}$ and ${J}_{s}^{\mathsf{\text{II}}}$ unify the specified solutions for the blocks n_{1} and n_{2}, respectively. Here, note that the superscripts in X^{I} and X^{II} denote the corresponding quantities for blocks n_{1} and n_{2}, respectively.
However, some flip positions j from the block n_{1} may belong to the intersected area I. In that case, we can take the effect of those positions j into account to get new solutions for the block n_{2}, and vice versa.
For this purpose, Equation (9) can be rewritten as follows:
where ${S}^{\mathsf{\text{II}}}=\left\{{S}_{1}^{\mathsf{\text{II}}}\;{S}_{2}^{\mathsf{\text{II}}}\right\}$ is the syndrome for the block n_{2}; ${S}_{new}^{\mathsf{\text{II}}}=\left\{{P}_{1}^{\mathsf{\text{II}}}\;{P}_{2}^{\mathsf{\text{II}}}\right\}$ is the new syndrome for the block n_{2} after hiding data in the block n_{1}; l is the number of flip positions (j_{1}, ..., j_{l}) from the block n_{1} that belong to the intersected area I (i.e., j_{1}, ..., j_{l} = J^{I}(S^{I}) ∈ I); and the values β_{1}, ..., β_{l} are computed using Equation (13) for the flipping positions $\left({j}_{1}^{\prime},\dots ,{j}_{l}^{\prime}\right)=F\left({j}_{1},\dots ,{j}_{l}\right)$ from the intersected area I for the block n_{2}. The function F converts the indexes (j_{1}, ..., j_{l}) of the intersected area in the block n_{1} to the corresponding indexes $\left({j}_{1}^{\prime},\dots ,{j}_{l}^{\prime}\right)$ in the block n_{2}. For example, the solution for the block n_{1} illustrated in Figure 1 is ${J}^{\mathsf{\text{I}}}\left({S}^{\mathsf{\text{I}}}\right)=\left[\begin{array}{cc}\hfill 3\hfill & \hfill 11\hfill \end{array}\right]$. Here j_{1} = 11 ∈ I, where index 1 means the first coefficient from the intersected area I. The coefficient j_{1} = 11 is located in the 11th position of the combined block. However, the 11th coefficient in the combined block is the 15th coefficient in the block n_{2} (i.e., $F\left({j}_{1}\right)=F\left(11\right)={j}_{1}^{\prime}=15$; see Figure 1). Thus, even though the flip positions for blocks n_{1} and n_{2} are different (i.e., j_{1} = 11 and ${j}_{1}^{\prime}=15$), those coefficients have the same location in the combined block.
Finally, the solution for the block n_{2} can be obtained as $\left\{{j}_{1}^{\prime},\dots ,{j}_{l}^{\prime},\;{J}_{s}^{\mathsf{\text{II}}}\left({S}_{new}^{\mathsf{\text{II}}}\right)\right\}$. This solution suffices to hide the message m_{2} in the block n_{2}.
The joint solution hides both messages m_{1} and m_{2} in the combined block. The joint solution $\left\{{J}^{\mathsf{\text{I}}}\left({S}^{\mathsf{\text{I}}}\right),\;{J}_{s}^{\mathsf{\text{II}}}\left({S}_{new}^{\mathsf{\text{II}}}\right)\right\}$ unifies the solutions for the blocks n_{1} and n_{2}. In this example, the flipping positions from the intersected area are part of J^{I}(S^{I}).
Similarly, we can get a joint solution by using the current solution for the block n_{2} (i.e., J^{II}(S^{II})). For this purpose, Equation (9) can be rewritten again as follows:
where ${S}^{\mathsf{\text{I}}}=\left\{\begin{array}{cc}\hfill {S}_{1}^{\mathsf{\text{I}}}\hfill & \hfill {S}_{2}^{\mathsf{\text{I}}}\hfill \end{array}\right\}$ is the syndrome for the block n_{1}; ${S}_{new}^{\mathsf{\text{I}}}=\left\{\begin{array}{cc}\hfill {P}_{1}^{\mathsf{\text{I}}}\hfill & \hfill {P}_{2}^{\mathsf{\text{I}}}\hfill \end{array}\right\}$ is the new syndrome of the block n_{1} after hiding data in the block n_{2}; l is the number of flip positions $\left({j}_{1}^{\prime},\dots ,{j}_{l}^{\prime}\right)$ for the block n_{2} that belong to the intersected area I (i.e., $\left({j}_{1}^{\prime},\dots ,{j}_{l}^{\prime}\right)={J}^{\mathsf{\text{II}}}\left({S}^{\mathsf{\text{II}}}\right)\in I$); β_{1}, ..., β_{l} are computed using Equation (13) for the flipping positions $\left({j}_{1},...,{j}_{l}\right)={F}^{-1}\left({j}_{1}^{\prime},\dots ,{j}_{l}^{\prime}\right)$ from the intersected area I for the block n_{1}; the function F^{−1} (i.e., the inverse of F) converts the indexes of the coefficients of the intersected area $\left({j}_{1}^{\prime},\dots ,{j}_{l}^{\prime}\right)$ in the block n_{2} to the corresponding indexes (j_{1}, ..., j_{l}) in the block n_{1}. For example, if ${J}^{\mathsf{\text{II}}}\left({S}^{\mathsf{\text{II}}}\right)=\left[\begin{array}{cc}\hfill 1\hfill & \hfill 15\hfill \end{array}\right]$, then ${j}_{1}^{\prime}=15\in I$ and ${j}_{1}={F}^{-1}\left({j}_{1}^{\prime}\right)={F}^{-1}\left(15\right)=11$ (see Figure 1).
The solution for the block n_{1} can be obtained as $\left\{\left({j}_{1},...,{j}_{l}\right),\;{J}_{s}^{\mathsf{\text{I}}}\left({S}_{new}^{\mathsf{\text{I}}}\right)\right\}$. This solution suffices to hide the message m_{1} in the block n_{1}.
The joint solution for hiding both messages m_{1} and m_{2} is $\left\{{J}_{s}^{\mathsf{\text{I}}}\left({S}_{new}^{\mathsf{\text{I}}}\right),\;{J}^{\mathsf{\text{II}}}\left({S}^{\mathsf{\text{II}}}\right)\right\}$. Here, the flipping positions from the intersected area $\left({j}_{1}^{\prime},\dots ,{j}_{l}^{\prime}\right)$ are part of J^{II}(S^{II}). The corresponding flipping positions $\left({j}_{1},...,{j}_{l}\right)={F}^{-1}\left({j}_{1}^{\prime},\dots ,{j}_{l}^{\prime}\right)$ are part of the solution for the block n_{1}.
Note that there are several solutions in J^{I} and J^{II} for the syndromes S^{I} and S^{II}, respectively. The presented method may generate one joint solution for each solution from J^{I}(S^{I}) and J^{II}(S^{II}).
The proposed method requires to find values β from the flip positions (j_{1}, ..., j_{ l }) or $\left({j}_{1}^{\prime},\dots ,{j}_{l}^{\prime}\right)$. The relationship between β and flip position j is presented as follows:
or
The complete procedure for getting all possible joint solutions for any syndromes is presented as follows:
For a given combined block of binary coefficients a and two messages m_{1} and m_{2}, the process is as follows:
(a) Define two blocks of the DCT coefficients n_{1} and n_{2} (see Figure 1). Compute the syndromes S^{I} and S^{II} using the corresponding binary streams v' and v''.
(b) Find all possible solutions j^{I} = J^{I}(S^{I}) and j^{II} = J^{II}(S^{II}) for blocks n_{1} and n_{2} by using the syndromes S^{I} and S^{II}.
(c) For each solution j^{I}(p) (p = 1, 2, 3, ..., k, where k is the number of solutions), the process is as follows:
i. Define the flip positions j_{1}, ..., j_{l} from the intersected area I.
ii. Convert j_{1}, ..., j_{l} to ${j}_{1}^{\prime},\dots ,{j}_{l}^{\prime}$ (the corresponding flip positions in the block n_{2}). Compute the corresponding β using Equation (13). Compute the new syndrome ${S}_{new}^{\mathsf{\text{II}}}$ using Equation (10).
iii. Using the new syndrome ${S}_{new}^{\mathsf{\text{II}}}$, get the new flip solutions as ${j}_{new}^{\mathsf{\text{II}}}={J}_{s}^{\mathsf{\text{II}}}\left({S}_{new}^{\mathsf{\text{II}}}\right)$.
iv. For each solution ${j}_{new}^{\mathsf{\text{II}}}\left(q\right)$ (q = 1, 2, 3, ..., z, where z is the number of solutions), store the joint solution: $\left\{{j}^{\mathsf{\text{I}}}\left(p\right),{j}_{new}^{\mathsf{\text{II}}}\left(q\right)\right\}$.
(d) For each solution j^{II}(p) (p = 1, 2, 3, ..., k), the process is as follows:
i. Define the flip positions ${j}_{1}^{\prime},\dots ,{j}_{l}^{\prime}$ from the intersected area I for the block n_{2}.
ii. Convert ${j}_{1}^{\prime},\dots ,{j}_{l}^{\prime}$ to j_{1}, ..., j_{l}. Compute the corresponding β. Compute the new syndrome ${S}_{new}^{\mathsf{\text{I}}}$ using Equation (11).
iii. Using the new syndrome ${S}_{new}^{\mathsf{\text{I}}}$, get the new flip solutions as ${j}_{new}^{\mathsf{\text{I}}}={J}_{s}^{\mathsf{\text{I}}}\left({S}_{new}^{\mathsf{\text{I}}}\right)$.
iv. For each solution ${j}_{new}^{\mathsf{\text{I}}}\left(q\right)$ (q = 1, 2, 3, ..., z, where z is the number of solutions), store the joint solution: $\left\{{j}_{new}^{\mathsf{\text{I}}}\left(q\right),{j}^{\mathsf{\text{II}}}\left(p\right)\right\}$.
The stored joint solutions are then used to hide data with better performance. Note that the proposed method needs to search for the best solution among k·z possible candidates for each block (see steps (c) and (d)). Thus, the computational complexity of the proposed search algorithm is O(n^{2}).
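The procedure in steps (a) to (d) can be sketched in Python for the example with m = 4 and an intersected area of five coefficients. This is a minimal illustration under several assumptions that do not come from the article: solutions are found by exhaustive search rather than lookup tables, only up to two flips per block are considered, GF(2^{4}) uses the primitive polynomial x^{4} + x + 1, positions are 0-based, and block n_{2} is indexed from the far end of the combined block (which reproduces F(11) = 15 in 1-based numbering, while the actual mapping is defined by Figure 1). All names are hypothetical.

```python
from itertools import combinations

N, OVERLAP = 15, 5                  # BCH block length 2^4 - 1, size of the intersected area
TOTAL = 2 * N - OVERLAP             # 25 coefficients in the combined block

# Powers of alpha in GF(2^4); the primitive polynomial x^4 + x + 1 is an assumption.
EXP, x = [], 1
for _ in range(N):
    EXP.append(x)
    x = (x << 1) ^ (0b10011 if x & 0b1000 else 0)

C1 = list(range(N))                       # local -> combined index for block n1
C2 = [TOTAL - 1 - u for u in range(N)]    # block n2, assumed reversed so that F(11) = 15
SHARED = set(C1) & set(C2)                # intersected area I (combined indices 10..14)


def synd(positions):
    """Syndrome (S1, S2) produced by flipping the given 0-based local positions."""
    s1 = s2 = 0
    for u in positions:
        s1 ^= EXP[u]
        s2 ^= EXP[(3 * u) % N]
    return s1, s2


def xor(a, b):
    return a[0] ^ b[0], a[1] ^ b[1]


def extract(bits, comb):
    """Message carried by one block: the syndrome of its binary stream."""
    return synd([u for u in range(N) if bits[comb[u]]])


def solutions(target, allowed, max_flips):
    """All flip sets with at most max_flips positions from `allowed` that give `target`."""
    return [p for k in range(max_flips + 1)
            for p in combinations(allowed, k) if synd(p) == target]


def one_direction(bits, first, second, msg_first, msg_second, max_flips):
    """Hide in block `first` freely, then in `second` using specified solutions only."""
    s_first = xor(extract(bits, first), msg_first)
    s_second = xor(extract(bits, second), msg_second)
    free = [u for u in range(N) if second[u] not in SHARED]
    joint = set()
    for sol in solutions(s_first, range(N), max_flips):
        # Flips inside the intersected area, converted to indexes of the other block.
        shared = [second.index(first[u]) for u in sol if first[u] in SHARED]
        s_new = xor(s_second, synd(shared))          # new syndrome of the other block
        for spec in solutions(s_new, free, max_flips):
            joint.add(frozenset(first[u] for u in sol) | frozenset(second[u] for u in spec))
    return joint


def joint_solutions(bits, msg1, msg2, max_flips=2):
    return (one_direction(bits, C1, C2, msg1, msg2, max_flips)
            | one_direction(bits, C2, C1, msg2, msg1, max_flips))


def apply_flips(bits, flips):
    return [b ^ (i in flips) for i, b in enumerate(bits)]


bits = [int((7 * i + 3) % 5 < 2) for i in range(TOTAL)]
target = apply_flips(bits, {2, 12, 20})        # a stego block known to be reachable
msg1, msg2 = extract(target, C1), extract(target, C2)

joint = joint_solutions(bits, msg1, msg2)
print([sorted(s) for s in joint])              # [[2, 12, 20]]
for s in joint:                                # every joint solution decodes both messages
    stego = apply_flips(bits, s)
    assert extract(stego, C1) == msg1 and extract(stego, C2) == msg2
```

The flip at combined index 12 lies in the intersected area and serves both blocks, so three flips carry 16 bits. Allowing three or four flips per block, as the article does, would enlarge the set of joint solutions from which the lowest-distortion one is chosen.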
3.2. Two-stage embedding technique
In order to enhance the performance of the block-wise methods (i.e., ME, MME, BCH-based data hiding, etc.), we utilize almost all the DCT coefficients for data hiding. The proposed method uses two different embedding schemes together. The two schemes use different block sizes ${n}_{1}^{p}$ and ${n}_{2}^{p}$, and have different payloads ${m}_{1}^{p}$ and ${m}_{2}^{p}$.
This method divides the stream of the DCT coefficients (c_{1}, c_{2}, ..., c_{ N }) and the message M into two parts and hides data into each part separately. The optimal number of the blocks (k_{1} and k_{2}) for both schemes can be computed as follows:
The relation between the numbers of blocks for the schemes 1 and 2 is presented as follows:
where N is the number of DCT coefficients.
The computed ${k}_{1}^{\prime}$ and ${k}_{2}^{\prime}$ are non-integer numbers. Thus, we have to choose the nearest integers ${k}_{1}=\lceil {k}_{1}^{\prime}\rceil \pm 1$ and ${k}_{2}=\lceil {k}_{2}^{\prime}\rceil \pm 1$ such that:
The presented two-scheme embedding method improves the performance of data hiding by properly distributing the available DCT coefficients between two different modified BCH schemes. The first scheme uses ${m}_{1}^{p}=4\cdot m$, with m obtained from inequality (8); the second scheme uses ${m}_{2}^{p}=4\cdot \left(m+1\right)$. Note that the second scheme has higher embedding efficiency. The efficiency of the two-scheme embedding depends on the ratio between the numbers of blocks k_{1} and k_{2} for the schemes 1 and 2, respectively. The larger the value k_{2} (i.e., the smaller the ratio k_{1}/k_{2}), the higher the efficiency of the proposed two-scheme embedding for the same m.
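One way to picture the split between the two schemes is as a pair of linear constraints on the block counts. The sketch below assumes, since the constraints themselves are not reproduced here, that the blocks must fit into the available coefficients (k_{1}·n_{1}^{p} + k_{2}·n_{2}^{p} ≤ N) and must carry the whole message (k_{1}·m_{1}^{p} + k_{2}·m_{2}^{p} ≥ |M|). It solves the two equalities for real-valued counts and then tries the neighbouring integers, preferring more blocks of the second scheme. The function name and the tie-breaking rule are hypothetical.

```python
import math


def split_blocks(num_coeffs, message_bits, n1, c1, n2, c2):
    """Return ((k1', k2'), (k1, k2)) for two schemes with block sizes n1, n2
    and per-block capacities c1, c2 (in bits).

    Assumes the two schemes differ, so the 2x2 system is not singular.
    """
    det = n1 * c2 - n2 * c1
    k1_real = (num_coeffs * c2 - n2 * message_bits) / det   # Cramer's rule
    k2_real = (n1 * message_bits - num_coeffs * c1) / det
    best = None
    for k1 in range(max(0, math.floor(k1_real) - 1), math.ceil(k1_real) + 2):
        for k2 in range(max(0, math.floor(k2_real) - 1), math.ceil(k2_real) + 2):
            fits = k1 * n1 + k2 * n2 <= num_coeffs           # enough coefficients
            carries = k1 * c1 + k2 * c2 >= message_bits      # enough capacity
            if fits and carries and (best is None or k2 > best[1]):
                best = (k1, k2)
    return (k1_real, k2_real), best


# Scheme 1: m = 5, overlap 5 -> 57 coefficients, 20 bits per combined block.
# Scheme 2: m = 6, overlap 5 -> 121 coefficients, 24 bits per combined block.
real, integer = split_blocks(40000, 8000, n1=57, c1=20, n2=121, c2=24)
print(integer)      # (9, 326)
```

In this example 9 blocks of the first scheme and 326 blocks of the second use 39,959 of the 40,000 coefficients and carry 8,004 bits.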
The two-scheme embedding method makes it possible to use different sizes of the intersected area, I_{sh 1} and I_{sh 2}, for the two schemes (see Tables 1 and 2). We test several sizes of the intersected areas and several payloads. In the experiments, we hide data in a set of 4,000 natural images and measure the performance against the steganalysis [20, 25] for different sizes of the intersected areas and payloads. Results are presented in Tables 1, 2, and 3.
Boldface numbers in Tables 1 and 2 correspond to the lowest steganalysis accuracy and show the most appropriate intersected area size for each tested payload. Data hiding with the most appropriate intersected area always shows better results. Tables 1 and 2 also indicate the difference between the proposed method and the original BCH-based steganography method [16] in terms of the performance of the steganalysis [20, 25]. The most appropriate intersected area sizes presented in Table 3 were used later for the other experiments.
4. Inserting-removing strategy
The performance of the proposed method can be significantly increased by using the inserting-removing strategy. The proposed strategy is based on the fact that the block of 2^{m} − 1 DCT coefficients can be modified before data hiding by inserting or removing coefficients 1 and −1. Hiding data in the modified stream of DCT coefficients may result in lower distortion and, as a result, lower detectability by steganalysis. Such a modification has to be carried out carefully in order to reduce distortion.
The proposed inserting-removing strategy uses the stream of non-rounded quantized DCT coefficients a_{q} computed as follows:
${a}^{\prime }=\mathrm{DCT}\left(B\right),\quad {a}_{q}={a}^{\prime }/Q\left({Q}_{f}\right),\quad {a}_{r}=\mathrm{round}\left({a}_{q}\right)$ (16)
where B is the 8 × 8 block of the image pixels; a' is the block of original DCT coefficients; a_{q} is the block of DCT coefficients divided element-wise by the corresponding coefficients of the quantization matrix Q; a_{r} is the block of quantized DCT coefficients; and Q_{f} is the quality factor.
Each nonzero integer DCT coefficient has a corresponding informative bit computed as follows:
According to the proposed inserting-removing strategy, the stream a of non-rounded DCT coefficients obtained from the blocks a_{q} is divided into three sets: modifiable c_{m} = a ∈ (−∞; −1.5) ∪ (1.5; ∞), removable c_{R} = a ∈ [−1.5; −0.5) ∪ (0.5; 1.5], and insertable c_{Ins} = a ∈ [−0.5; −0.25) ∪ (0.25; 0.5]. The set c unifies the modifiable, insertable, and removable sets (i.e., c = c_{m} ∪ c_{R} ∪ c_{Ins}). The set C = c_{m} ∪ c_{R} contains all nonzero rounded DCT coefficients. According to Equation (17), only the nonzero DCT coefficients (i.e., the set C) have corresponding informative bits and can be used for hiding data.
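The three sets are defined purely by the magnitude of the non-rounded coefficient, so the partition can be written as a short Python function. The interval bounds follow the text; the function name, the label strings, and the extra label for coefficients with magnitude of at most 0.25 (which belong to none of the three sets) are choices made for this illustration.

```python
def classify(a):
    """Assign a non-rounded quantized DCT coefficient to one of the sets in the text."""
    mag = abs(a)
    if mag > 1.5:
        return "modifiable"     # rounds to a magnitude of at least 2
    if mag > 0.5:
        return "removable"      # rounds to +1 or -1 and could be moved to 0
    if mag > 0.25:
        return "insertable"     # rounds to 0 and could be moved to +1 or -1
    return "unused"             # too close to 0 to be considered


stream = [3.2, -1.4, 0.3, -0.6, 0.1, 1.5, -0.5, 0.25]
print([classify(a) for a in stream])
# ['modifiable', 'removable', 'insertable', 'removable', 'unused',
#  'removable', 'insertable', 'unused']

# The set C of nonzero rounded coefficients is the modifiable plus the removable set.
C = [a for a in stream if classify(a) in ("modifiable", "removable")]
print(C)    # [3.2, -1.4, -0.6, 1.5]
```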
The proposed steganographic method uses the stream of n_{p} nonzero DCT coefficients from the set C for data hiding. In general, the set C is a subset of the unified set c. Thus, each block unifies the n_{p} coefficients from the set C and some insertable coefficients from the set c (i.e., ${c}_{b}={c}_{m}^{\prime}\cup {c}_{R}^{\prime}\cup {c}_{Ins}^{\prime}$, where ${C}^{\prime}={c}_{m}^{\prime}\cup {c}_{R}^{\prime}$ is the block of n_{p} nonzero DCT coefficients from the set C). Inserting or removing any coefficient from ${c}_{Ins}^{\prime}$ or ${c}_{R}^{\prime}$ produces a new block C' with a new solution for data hiding. As a result, the inserting-removing strategy significantly increases the number of possible solutions and helps to find the most appropriate solution with the lowest distortion.
In the proposed method, we use the same distortion measure as MME [14]. The distortion for each DCT coefficient is computed as follows:
The distortion due to inserting or removing D_{ IR } is computed as follows:
where Q is the corresponding quantization coefficient of the quantization table.
The resulting distortion for the combined block of DCT coefficients is computed as follows:
where l is the number of flipped coefficients.
Flipped coefficients are computed as follows:
5. Encoder and decoder
The encoder of the proposed steganographic method based on modified BCH data hiding scheme and insertingremoving strategy is organized as follows:
For a given bitmap image I_{ m }, payload P, quality factor Q_{ f }, and secret key K process follows:

1.
Divide image I_{ m } into nonoverlapped 8 × 8 blocks of pixels and process DCT, quantization and rounding as presented in (16). Remove DC coefficients. Obtain a', a_{ q }, a_{ r } , and streams of DCT coefficients a. Permute stream a using K and any pseudorandom generator. Obtain stream c = a ∈ (∞; 0.25) ∪ (0.25;∞) from the permuted stream a.

2.
Define sets: modifiable c_{ m } , insertable c_{ Ins } , and removable c_{ R } .

3.
Define parameters for schemes 1 and 2, and number of the blocks k _{1} and k _{2} using (14) and (15). Divide message M into two parts: ${M}_{1}={m}_{1}^{p}\cdot {k}_{1}$ and ${M}_{2}={m}_{2}^{p}\cdot {k}_{2}$.

4.
Start from the first block i = 1. Define the i th block of the DCT coefficients ${c}_{{b}_{i}}={c}_{{m}_{i}}^{\prime}\cup {c}_{{R}_{i}}^{\prime}\cup {c}_{{Ins}_{i}}^{\prime}$, where ${c}_{{m}_{i}}^{\prime}$, ${c}_{{R}_{i}}^{\prime}$, and ${c}_{{Ins}_{i}}^{\prime}$ are the modifiable, removable, and insertable subsets for the current block. If i = k _{1} +1 switch to the scheme 2.

5.
Define the block of nonzero rounded DCT coefficients ${\mathsf{\text{C}}}_{\mathsf{\text{i}}}^{\prime}={c}_{{m}_{i}}^{\prime}\cup {c}_{Ri}^{\prime}$.

6.
Get the solutions for the block ${C}_{i}^{\prime}$ using the modified BCH data hiding scheme (see the algorithm in Section 3). Compute the distortion D for each solution using Equation (20). Choose solution J_{ m } with the lowest distortion D_{ m } and store it.

7.
Modify the block ${C}_{i}^{\prime}$ by removing a coefficient from the subset ${c}_{{R}_{i}}^{\prime}$ or inserting a coefficient from the subset ${c}_{{Ins}_{i}}^{\prime}$. Obtain a new block: (i) after removing, ${C}_{i}^{\prime}={c}_{{m}_{i}}^{\prime}\cup {c}_{{R}_{i}}^{\prime\prime}$, where ${c}_{{R}_{i}}^{\prime\prime}={c}_{{R}_{i}}^{\prime}\setminus {c}_{{R}_{i}}^{\prime}\left(p\right)$ is the modified removable set and ${c}_{{R}_{i}}^{\prime}\left(p\right)$ is the removed coefficient; (ii) after inserting, ${C}_{i}^{\prime}={c}_{{m}_{i}}^{\prime}\cup {c}_{{R}_{i}}^{\prime}\cup {c}_{{Ins}_{i}}^{\prime}\left(q\right)$, where ${c}_{{Ins}_{i}}^{\prime}\left(q\right)=\pm 1$ is the inserted coefficient. Here, p and q are the current positions for removal and insertion, respectively.

8.
Repeat steps 5-6 for all insertable and removable coefficients from ${c}_{{R}_{i}}^{\prime}$ and ${c}_{{Ins}_{i}}^{\prime}$.

9.
Among all stored solutions J_{ m }, choose the solution with the lowest distortion D_{ m }. Modify one, two, or three coefficients according to the best solution (see explanation in Section 2) and, if necessary, insert or remove a coefficient in the block ${c}_{{b}_{i}}$.

10.
Process all k _{1} + k _{2} blocks using steps 4-9. Obtain the modified stream ${c}^{\prime}=\left\{{c}_{{b}_{1}},{c}_{{b}_{2}},\dots ,{c}_{{b}_{{k}_{1}+{k}_{2}}}\right\}$.

11.
Recover the original order of the DCT coefficients a from the modified stream c' using the secret key K and the same pseudorandom generator. Add the DC coefficients, round the coefficients a', and obtain the modified JPEG image ${I}_{m}^{\prime}$.
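The search over candidate blocks in steps 5-9 can be sketched as follows. This is only an illustration under stated assumptions: the dictionary representation of a block (stream position mapped to coefficient), the function name `best_candidate`, and the `evaluate` callable are hypothetical. In particular, `evaluate` stands in for the modified BCH solver of Section 3 and is assumed to return the lowest-distortion solution for a block with any insertion or removal cost already included.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical representation: stream position -> rounded nonzero coefficient.
Block = Dict[int, int]
# Hypothetical solver result: (solution, distortion), or None if unsolvable.
Result = Optional[Tuple[object, float]]


def best_candidate(
    modifiable: Block,
    removable: Block,
    insertable: Block,
    evaluate: Callable[[Block], Result],
) -> Optional[Tuple[float, Block, object]]:
    """Return (distortion, block, solution) with the lowest distortion."""
    base: Block = {**modifiable, **removable}
    candidates: List[Block] = [base]  # the unchanged block (steps 5-6)

    # (i) Remove one removable coefficient at position p.
    for p in removable:
        candidates.append({pos: v for pos, v in base.items() if pos != p})

    # (ii) Insert one coefficient (+1 or -1) at position q.
    for q, value in insertable.items():
        candidates.append({**base, q: value})

    best: Optional[Tuple[float, Block, object]] = None
    for block in candidates:
        result = evaluate(block)
        if result is None:
            continue  # no solution exists for this candidate
        solution, distortion = result
        if best is None or distortion < best[0]:
            best = (distortion, block, solution)
    return best


# Toy stand-in for the solver (NOT the BCH scheme): the distortion is
# simply the absolute value of the coefficient sum.
def toy_evaluate(block: Block) -> Result:
    return sorted(block), float(abs(sum(block.values())))


best = best_candidate({0: 2, 1: -1}, {2: 1}, {3: -1}, toy_evaluate)
```

Keeping the solver behind a callable separates the candidate enumeration, which follows directly from steps 7 and 8, from the BCH-specific solution search, which is not reproduced here.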
The decoder of the proposed steganographic method is organized as follows.
For the given modified JPEG image ${I}_{m}^{\prime}$, quality factor Q_{ f }, secret key K, and payload size p = |P|, the process is as follows:

1.
Read the DCT coefficients from the JPEG file. Permute them using the secret key K and the same pseudorandom generator as in the encoder. Remove the DC coefficients. Obtain the stream of nonzero DCT coefficients C.

2.
Using Equations (15) and (16), define the parameters of schemes 1 and 2 and the numbers of blocks k _{1} and k _{2}. Here, N = |C|.

3.
Divide C into blocks according to k _{1} and k _{2}.

4.
Decode data from each block using (9).
The steganographic method based only on the modified BCH data hiding scheme skips steps 7 and 8 of the encoder.
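Both the encoder (steps 1 and 11) and the decoder (step 1) rely on a permutation driven by the secret key K. A minimal sketch of such a keyed permutation and its inverse is given below. The function names are hypothetical, and Python's `random.Random` is used only as a stand-in for "any pseudorandom generator"; it is not cryptographically secure, so a real system would likely substitute a keyed cryptographic generator.

```python
import random
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def permutation_from_key(n: int, key: int) -> List[int]:
    """Derive a reproducible permutation of range(n) from the key."""
    order = list(range(n))
    random.Random(key).shuffle(order)  # same key -> same order
    return order


def permute(stream: Sequence[T], key: int) -> List[T]:
    """Reorder the coefficient stream according to the keyed permutation."""
    order = permutation_from_key(len(stream), key)
    return [stream[i] for i in order]


def unpermute(permuted: Sequence[T], key: int) -> List[T]:
    """Restore the original order of a stream produced by permute()."""
    order = permutation_from_key(len(permuted), key)
    original: List[T] = [None] * len(permuted)  # type: ignore[list-item]
    for dst, src in enumerate(order):
        original[src] = permuted[dst]
    return original


# Example with made-up coefficient values and a made-up key.
coeffs = [3, -1, 0, 2, -4, 1, 0, -2]
shuffled = permute(coeffs, key=12345)
restored = unpermute(shuffled, key=12345)
```

Because the permutation depends only on the key and the stream length, the decoder can rebuild the same ordering without any side information beyond K.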
6. Experimental results
In these experiments, we hide different amounts of data in a set of uncompressed images using the proposed BCH-based data hiding scheme with and without the inserting-removing strategy. The sets of modified and original compressed images are analyzed by two powerful steganalysis algorithms proposed by Pevny and Fridrich [20] and Kodovsky and Fridrich [25]. These methods use 274 and 548 different features of the DCT coefficients, respectively. The 274 or 548 features extracted from the unmodified and modified images are used to build the models for the support vector machine (SVM) with parameter C = 10^{4} and kernel width γ = 10^{4}. A set of 4,000 natural uncompressed images (768 × 512), downloaded from Corel Draw and obtained from several digital cameras, is used in our experiments. The proposed method needs 15 min to hide data in each image. Experiments are carried out for seven different payloads (0.05, 0.1, 0.15, 0.17, 0.20, 0.22, and 0.25 bits per nonzero coefficient, bpc) and quality factor 75. For each of the seven payload sizes, the SVM is trained on a set of 3,000 images (1,500 original and 1,500 stego images). Each of the seven resulting models is then tested on a set of 1,000 images (500 original and 500 stego) for the corresponding payload size. The results show the error probabilities of the steganalysis for each tested payload (see Figures 2 and 3).
The error probability is computed as follows:
where P_{ a } is the probability of misdetection (i.e., the unmodified image is classified as modified) and P_{ b } is the probability of misclassification (i.e., the modified image is classified as unmodified).
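As an illustration, the error probability can be computed from classifier counts as sketched below. The sketch assumes the widely used equal-prior average of the two error rates, (P_a + P_b)/2; this form is an assumption and the authors' exact expression may differ. The function names and the example counts are hypothetical.

```python
def error_probability(p_a: float, p_b: float) -> float:
    """Average of the two error rates (ASSUMED equal-prior form)."""
    return 0.5 * (p_a + p_b)


def error_probability_from_counts(
    cover_as_stego: int, n_cover: int, stego_as_cover: int, n_stego: int
) -> float:
    """Estimate the error probability from test-set counts.

    cover_as_stego: unmodified images classified as modified (gives P_a)
    stego_as_cover: modified images classified as unmodified (gives P_b)
    """
    p_a = cover_as_stego / n_cover
    p_b = stego_as_cover / n_stego
    return error_probability(p_a, p_b)


# Hypothetical counts on a test set of 500 original and 500 stego images.
chance_level = error_probability_from_counts(250, 500, 250, 500)
```

Under this assumed definition, a value near 0.5 corresponds to a classifier that does no better than guessing, while a value near 0 corresponds to reliable detection.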
In our experiments, we test two methods: (1) the modified BCH-based data hiding scheme alone; and (2) the modified BCH-based data hiding scheme with the proposed inserting-removing strategy. The proposed methods achieve high error probabilities for all the tested payloads. For payloads up to 0.1 bpc, both methods have a detectability close to 50%, meaning that the steganalysis cannot distinguish the unmodified images from the modified ones; this is almost equivalent to a coin toss. For higher payloads around 0.15 and 0.2 bpc, the proposed methods perform much better than the MME. The significant improvement over the MME is explained by the use of schemes with larger embedding efficiency (i.e., the BCH-based schemes with large m). The proposed method also shows better results than the methods based on the original BCH-based schemes. Moreover, the proposed method with the inserting-removing strategy improves on the method using only the modified BCH-based data hiding scheme by 0.0363, 0.0414, and 0.0392 points in terms of error probability for payloads of 0.15, 0.2, and 0.25 bpc, respectively. For a payload of 0.25 bpc, the two methods show error probabilities of 0.2961 and 0.3353, respectively. These error probabilities are better than those of the MME [14], the original BCH-based scheme [16], the heuristic BCH-based scheme [17], and the syndrome trellis code (STC) [22] proposed by Filler et al. This improvement was achieved by using the modified BCH-based data hiding scheme and the unique inserting-removing strategy.
7. Conclusion
In this article, an efficient data hiding technique for steganography is presented. The proposed BCH-based data hiding scheme uses two blocks to form a single combined block. A new data hiding strategy makes it possible to obtain a joint solution for two blocks with intersected coefficients. Owing to the intersection, the proposed method requires a smaller number of coefficients to hide the same amount of data than the original non-overlapping block-wise approaches. As a result, the proposed method can use the BCH-based schemes with large m (i.e., larger capacity). Even when the proposed method has to use the same BCH-based scheme (for 0.17 and 0.2 bpc), the efficiency of data hiding is still high because the proposed two-scheme embedding has a lower ratio k_{1}/k_{2} than the original BCH-based scheme. The proposed BCH-based data hiding scheme significantly outperforms the MME and the original BCH-based steganography in terms of error probability and accuracy against the steganalysis. The proposed two-scheme embedding technique (see Equations 14 and 15) makes it possible to use almost all the available DCT coefficients. The proposed strategy based on inserting and removing coefficients 1 or -1 increases the number of possible solutions and significantly decreases the total distortion. The experimental results show that the inserting-removing strategy significantly improves the performance of the proposed method. The combination of the modified BCH-based scheme and the inserting-removing strategy achieves higher error probabilities and lower accuracy against the powerful steganalysis.
References
 1.
Provos N: Defending against statistical steganalysis. In Proc of 10th USENIX Security Symposium. Washington, DC; 2001:24-24.
 2.
Eggers J, Bauml R, Girod B: A communications approach to steganography. In Proc of EI SPIE. Volume 4675. San Jose, CA; 2002:26-37.
 3.
Noda H, Niimi M, Kawaguchi E: Application of QIM with dead zone for histogram preserving JPEG steganography. In Proc of ICIP. Genoa, Italy; 2005.
 4.
Solanki K, Sarkar A, Manjunath BS: YASS: Yet another steganographic scheme that resists blind steganalysis. Lect Notes Comput Sci 2007, 2939: 154-167.
 5.
Westfeld A: High capacity despite better steganalysis (F5-a steganographic algorithm). Lect Notes Comput Sci 2001, 2137: 289-302.
 6.
Fridrich J: Minimizing the embedding impact in steganography. In Proc of ACM Multimedia and Security Workshop. Geneva, Switzerland; 2006:2-10.
 7.
Fridrich J: Feature-based steganalysis for JPEG images and its implications for future design of steganographic schemes. Lect Notes Comput Sci 2005, 3200: 67-81.
 8.
Fridrich J, Filler T: Practical methods for minimizing embedding impact in steganography. In Proc EI SPIE. Volume 6505. San Jose, CA; 2007:2-3.
 9.
Fridrich J, Goljan M, Soukal D: Perturbed quantization steganography using wet paper codes. In Proc of ACM Workshop on Multimedia and Security. Magdeburg, Germany; 2004:4-15.
 10.
Fridrich J, Goljan M, Soukal D: Perturbed quantization steganography. ACM Multimedia Secur J 2005, 11(2):98-107.
 11.
Fridrich J, Pevny T, Kodovsky J: Statistically undetectable JPEG steganography: dead ends, challenges, and opportunities. In Proc of ACM Workshop on Multimedia and Security. Dallas, TX; 2007:3-15.
 12.
Fridrich J, Goljan M, Soukal D: Perturbed quantization steganography. ACM Multimedia Secur J 2005, 11(2):98-107.
 13.
Fridrich J, Goljan M, Soukal D: Wet paper coding with improved embedding efficiency. IEEE Trans Inf Secur Forensics 2005, 1(1):102-110.
 14.
Kim YH, Duric Z, Richards D: Modified matrix encoding technique for minimal distortion steganography. Lect Notes Comput Sci 2006, 4437: 314-327.
 15.
Schönfeld D, Winkler A: Reducing the complexity of syndrome coding for embedding. Lect Notes Comput Sci 2008, 4567: 145-158.
 16.
Zhang R, Sachnev V, Kim HJ: Fast BCH syndrome coding for steganography. Lect Notes Comput Sci 2009, 5806: 48-58.
 17.
Sachnev V, Kim HJ, Zhang R: Less detectable JPEG steganography method based on heuristic optimization and BCH syndrome coding. In Proc of ACM Workshop on Multimedia and Security. Princeton, NJ; 2009:131-139.
 18.
Filler T, Fridrich J: Steganography using Gibbs random fields. In Proceedings of ACM Multimedia and Security Workshop. Rome, Italy; 2010:199-212.
 19.
Upham D [http://www.funet.fi/pub/crypt/stegangraphy/jpegjstegv4.diff.gz]
 20.
Pevny T, Fridrich J: Merging Markov and DCT features for multiclass JPEG steganalysis. In Proc of SPIE. Volume 6505. San Jose, CA; 2007:3-4.
 21.
Shi YQ, Chen C, Chen W: Markov process based approach to effective attacking JPEG steganography. Lect Notes Comput Sci 2006, 4437: 249-264.
 22.
Filler T, Judas J, Fridrich J: Minimizing embedding impact in steganography using trellis-coded quantization. IEEE Trans Inf Secur Forensics 2011, 6(3):920-935.
 23.
Rifa-Pous H, Rifa J: Product perfect codes and steganography. Digital Signal Process 2009, 19: 764-769.
 24.
Zhao Z, Wu F, Yu S, Zhou J: A lookup table based fast algorithm for finding roots of quadratic or cubic polynomials in the GF(2^{m}). J Huazhong Univ Sci Technol (Nat Sci Ed.) 2005, 33(1):70-73.
 25.
Kodovsky J, Fridrich J: Calibration revisited. In Proceedings of the 11th ACM Multimedia & Security Workshop. Edited by: Dittmann J, Craver S, Fridrich J. Princeton, NJ; 2009.
Acknowledgements
This study was supported by the Catholic University of Korea, National Research Foundation of Korea (grant 20110013695), ITRC and BK21 Project, Korea University and IT R&D program (Development of anonymity-based u-knowledge security technology, 2007S00101).
Additional information
Competing interests
The authors declare that they have no competing interests.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
About this article
Cite this article
Sachnev, V., Kim, H.J. Modified BCH data hiding scheme for JPEG steganography. EURASIP J. Adv. Signal Process. 2012, 89 (2012). https://doi.org/10.1186/16876180201289
Keywords
 BCH
 steganography
 less detectable data hiding