Local distortion resistant image watermarking relying on salient feature extraction
EURASIP Journal on Advances in Signal Processing volume 2012, Article number: 97 (2012)
Abstract
The purpose of this article is to present a novel method for region-based image watermarking that can tolerate local image distortions to a substantially greater extent than existing methods. The first stage of the method relies on computing a normalized version of the original image using image moments. The next step is to extract a set of feature points that will act as centers of the watermark embedding areas. Four different existing feature extraction techniques are tested: radial symmetry transform (RST), scale-invariant feature transform (SIFT), speeded up robust features (SURF) and features from accelerated segment test (FAST). Instead of embedding the watermark in the DCT domain of the normalized image, we follow the equivalent procedure of first performing the inverse DCT of the original watermark, inversely normalizing it and finally embedding it in the original image. This is done in order to minimize image distortion imposed by inversely normalizing the normalized image to obtain the original. The detection process consists of normalizing the input image and extracting the feature points of the normalized image, after which a correlation detector is employed to detect the possibly inserted watermark in the normalized image. Experimental results demonstrate the relative performance of the four different feature extraction techniques under both geometrical and signal processing operations, as well as the overall superiority of the method against two state-of-the-art techniques that are quite robust as far as local image distortions are concerned.
1 Introduction
During the last two decades there has been a great increase in the amount of multimedia information exchanged through the Internet. This has created the need for an efficient way to protect copyright on this information. The most sophisticated method to accomplish this in recent years is digital watermarking [1–3]. It is interesting to note that it has since also been used in the context of other applications such as integrity checking [4, 5], broadcast monitoring [6, 7] and fingerprinting [8, 9]. When designing a watermarking algorithm for copyright protection of digital images, there are certain requirements that we would like it to meet [10]:

Robustness: The watermark should be resistant against intentional or unintentional attacks. That means, it should not be easy to render it undetectable or to remove it.

Imperceptibility: The watermark should be invisible. Specifically, it should not affect the overall quality of the original image.

Security: There should exist a large set of different possible keys producing independent watermarks. One should not be able to decide which the embedding key was.

Capacity: It should be possible to embed and, subsequently, detect multiple watermarks in the same image.

Payload: The number of watermark bits that could be embedded should be high.
As one can imagine, it is difficult to fulfill all requirements to the greatest extent simultaneously; a trade-off must be established instead. In this article, we choose to focus on the robustness requirement, bearing in mind that it is difficult to ensure a high degree of robustness without increasing watermark energy to a level that renders the watermark visible. On the other hand, if watermark energy remains low to ensure invisibility, it is unlikely that the watermark will survive any possible attack. The proposed technique, as will be shown, manages to balance these two requirements. Payload is kept at a moderate level, although rather small embedding areas are used for our multi-bit method and the adapted watermark pattern is duplicated across all of them. Finally, security and capacity remain high.
Possible watermark attacks can be categorized as follows:

Geometrical attacks: these include scaling, shearing, rotation, combinations of them and local distortions such as Stirmark attack or line removal.

Signal processing attacks: examples are lowpass filtering, lossy compression and noise addition.
Most of the methods proposed to date focus on only one of these attack categories. The choice of embedding domain and the watermark's shape are two factors that determine which attack category the watermark is more resistant to. In general, watermarks embedded in the spatial domain can be designed in such a way that synchronization after geometric attacks can be achieved, whereas embedding in a transform domain usually provides greater robustness against filtering and compression. Additionally, watermarks having a certain symmetry (usually circular, as in [11, 12]) are employed to cope with geometrical attacks. Certain methods proposed in recent years tend to be robust against both attack categories. In [13], a scheme is described that involves image segmentation, Gaussian scale model and moment normalization of selected circular regions. The problem encountered in this method is that the inverse normalization of the embedding regions may result in boundary artifacts. Apart from that, the homogeneity criterion of the employed segmentation method cannot provide a stable representation of the image after watermark embedding and/or some attack. In [14], a drawback is the fact that the strongest corner points detected are not necessarily the most repeatable ones, i.e., corner strength does not change proportionally for all points after some attack. Another problem is the increased complexity due to both the circular convolution needed to ensure rotational invariance and the local search needed to overcome instability of feature point position and scale. The methods proposed in [15, 16] also suffer from quantization error due to inverse normalization of the embedding disks, although some remedies are proposed in [15] to overcome this. These remedies, however, may affect detector performance. Besides, in [15] the number of correctly detected feature points after watermarking and possible attacks affects the detection threshold used to decide on the existence of the watermark.
The watermark embedded using the technique described in [17] cannot withstand shearing attacks and, consequently, any affine geometrical attack involving shearing. This is because the watermark is only rotationally invariant, due to its structure of concentric rings, and scaling invariant, due to prior scale normalization of the whole image. Finally, in [18], a method is proposed that utilizes the scale-invariant feature transform (SIFT) to extract circular patches that are scale and translation invariant, and the prototype rectangular watermark is subsequently inversely polar-mapped prior to embedding. However, a computational overhead is introduced, again, due to the circular convolution needed during detection to compensate for image rotation and, eventually, decide on the existence of the watermark.
In the following sections we describe a watermarking technique that deals successfully with all of the problems stated above and, additionally, provides substantially greater robustness than existing methods against local distortions, while keeping robustness against other usual attacks at an acceptable level. In Section 2, the initial stage of preprocessing which precedes both watermark embedding and detection is first described. In Section 3, the main watermarking procedure is explained and Section 4 presents examples of experimental results that prove the efficiency of the technique. Finally, conclusions about this study are drawn in Section 5.
2 Image preprocessing
Both watermark embedding and detection procedures require that a proper preprocessing of the original image has taken place, so that the watermark embedding or detection areas can be located. Section 2.1 describes the first preprocessing step where the original image is transformed geometrically to a standard form. Section 2.2 briefly overviews the four different feature extraction methods that will alternatively act upon the normalized image to produce the reference points both for watermark embedding and detection.
2.1 Image normalization
The first step prior to watermark embedding and detection is image normalization. This serves to provide the next step of feature extraction with a standard form of the original image, in which to search for strong feature points. The difference from other methods in the literature is that they employ image normalization in circular patches that have already been extracted from the original image. The problem, as stated in Section 1, is that the normalized and afterwards watermarked patches have to be inversely normalized and overlaid on the original image, leading to interpolation errors and, thus, visible artifacts. In the current article, we implement the image normalization method proposed in [19]. Here we should point out that the method described in [19] is the first step of a watermarking technique which, however, affects the whole of the image. Our aim in the present article is to provide a technique that only affects the image regionally, since we wish to cope with local image distortions. If we let I(x, y) be the original image, then the normalized image is g(x, y) = I (x_{ α } , y_{ α } ), where
and $T=\begin{pmatrix}1 & 0 & d_1\\ 0 & 1 & d_2\\ 0 & 0 & 1\end{pmatrix}$ is a translation matrix, $X=\begin{pmatrix}1 & \beta & 0\\ 0 & 1 & 0\\ 0 & 0 & 1\end{pmatrix}$ is an x-shearing matrix, $Y=\begin{pmatrix}1 & 0 & 0\\ \gamma & 1 & 0\\ 0 & 0 & 1\end{pmatrix}$ is a y-shearing matrix, and $S=\begin{pmatrix}\alpha & 0 & 0\\ 0 & \delta & 0\\ 0 & 0 & 1\end{pmatrix}$ is a scaling matrix.
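As a rough illustration of how the four matrices combine, the sketch below builds them in plain Python and applies the product to a point in homogeneous coordinates. The composition order (translation, then x-shearing, then y-shearing, then scaling) is an assumption inferred from the I_{ SYXT } notation used later in this section, and all helper names and parameter values are hypothetical.

```python
def matmul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def translation(d1, d2):
    return [[1, 0, d1], [0, 1, d2], [0, 0, 1]]

def x_shear(beta):
    return [[1, beta, 0], [0, 1, 0], [0, 0, 1]]

def y_shear(gamma):
    return [[1, 0, 0], [gamma, 1, 0], [0, 0, 1]]

def scaling(alpha, delta):
    return [[alpha, 0, 0], [0, delta, 0], [0, 0, 1]]

def normalization_matrix(d1, d2, beta, gamma, alpha, delta):
    # Assumed order: translate first, then x-shear, then y-shear, then scale,
    # i.e. the product S * Y * X * T acting on a column vector (x, y, 1).
    m = translation(d1, d2)
    for step in (x_shear(beta), y_shear(gamma), scaling(alpha, delta)):
        m = matmul(step, m)
    return m

def apply(m, x, y):
    """Map the point (x, y) through the 3x3 homogeneous matrix m."""
    return (m[0][0] * x + m[0][1] * y + m[0][2],
            m[1][0] * x + m[1][1] * y + m[1][2])
```

Keeping each step as its own matrix makes it easy to invert the chain in reverse order, which is what the inverse normalization of the watermark pattern requires.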
The values of the parameters d_{1}, d_{2} are calculated as
where m_{10}, m_{01}, m_{00} are geometric moments of the original image I(x, y)
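For illustration, the geometric moments can be computed directly from the pixel values. The sketch below assumes the usual definition m_pq = Σ x^p y^q I(x, y) and returns the image centroid (m_10/m_00, m_01/m_00), which is the quantity that translation normalization is typically built on. Whether d_1 and d_2 carry a plus or minus sign depends on the convention adopted, so the sketch deliberately stops at the centroid. The function names are hypothetical.

```python
def geometric_moment(image, p, q):
    """m_pq = sum of x^p * y^q * I(x, y); image[y][x] is a row-major nested list."""
    return sum((x ** p) * (y ** q) * value
               for y, row in enumerate(image)
               for x, value in enumerate(row))

def centroid(image):
    """Return (m10 / m00, m01 / m00), the intensity-weighted center of the image."""
    m00 = geometric_moment(image, 0, 0)
    if m00 == 0:
        raise ValueError("image has zero total intensity")
    return (geometric_moment(image, 1, 0) / m00,
            geometric_moment(image, 0, 1) / m00)
```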
If we let I_{ T } (x, y) be the image after translation normalization, the value of the parameter β is calculated as a root of
where $\mu_{pq}^{(T)}$ are the central moments of I_{ T } (x, y)
In case of a single real root and two complex conjugate roots, the value of β is chosen as the real one. In case of three real roots, the value is chosen as the median. The value of γ is calculated as
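The root selection rule for β can be sketched as follows. The function assumes that the three roots of the cubic have already been obtained from some solver (not shown here) and are passed in as real or complex numbers. The name `select_beta` and the tolerance used to decide whether a root is real are hypothetical choices.

```python
def select_beta(roots, tol=1e-9):
    """Pick the shearing parameter from the three roots of the cubic.

    One real root plus a complex conjugate pair: return the real root.
    Three real roots: return the median.
    """
    real_roots = sorted(complex(r).real for r in roots
                        if abs(complex(r).imag) <= tol)
    if len(real_roots) == 1:
        return real_roots[0]
    if len(real_roots) == 3:
        return real_roots[1]  # middle element of the sorted list is the median
    raise ValueError("expected one or three real roots")
```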
where $\mu_{pq}^{(XT)}$ are the central moments of I_{ XT } (x, y), which is the image I_{ T } (x, y) after x-shearing normalization. Finally, the values of α and δ are derived given that I_{ YXT } (x, y) (the image I_{ XT } (x, y) after y-shearing normalization) is resized to a specific size (e.g., 512 × 512 in our experiments) to provide the final normalized image I_{ SYXT } (x, y). The signs of these parameter values are determined by the constraint that both $\mu_{50}^{(SYXT)}$ and $\mu_{05}^{(SYXT)}$ are positive. Examples of the original "Lena" and "Lake" images and respective normalized images using the above described method are shown in Figure 1.
This normalized representation of the original image is the input for the next step of preprocessing that is necessary for both watermark embedding and detection.
2.2 Feature extraction
The second step of the preprocessing stage is feature extraction. A great variety of feature extraction methods has been proposed in the literature. Lately, there has been a tendency to use the so-called scale-space methods such as SIFT [20] for watermarking purposes [18, 21–23]. In our study, we employed SIFT as well as other feature detectors that have been proposed in the literature during the past few years, but not in the context of image watermarking. These detectors are, more specifically, the radial symmetry transform (RST) introduced in [24], the speeded up robust features (SURF) [25, 26] and the features from accelerated segment test (FAST) [27, 28]. As we will show in the experimental results section, all of them perform adequately for our application, although their relative performance varies.
2.2.1 Radial symmetry transform
To compute the RST, we first have to construct two images, the magnitude projection image M_{ n } and the orientation projection image O_{ n } of the normalized image at every radius n that we have selected. These images are initialized to zero and are subsequently updated at each point depending on how the point is affected by the gradient vector at a point a distance n away. Let p = (x, y) be a point and g(p) the gradient vector at that point, determined by applying the 3 × 3 Sobel operator at the respective point of the normalized image. The coordinates of the so-called positively-affected pixel are
and those of the negatively-affected pixel are
The pixel values of the magnitude projection and orientation projection images are updated as follows
Next, we have to define
where k_{ n } is a scaling factor to normalize M_{ n } and O_{ n } across different radii. Once Õ_{ n } is defined, we compute
where α is the radial strictness parameter. The larger the value of α, the stricter the required radial symmetry. Finally, F_{ n } is convolved with a 2D Gaussian filter A_{ n } to produce the radial symmetry contribution at radius n
The overall RST (symmetry map) is calculated by simply averaging radial symmetry contributions for all of the radii considered
where N is the set of radii. A non-maximum suppression and thresholding algorithm [29] is applied to the symmetry map S to localize the strongly symmetric points of the normalized image. An example for the images of Figure 1 is depicted in Figure 2 for N = {1, 3, 5} and α = 1. The value of the radius for non-maximum suppression was chosen to be 3 and that of the threshold to be 5.
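The voting procedure described in this subsection can be sketched in plain Python. The code assumes the commonly stated form of the transform from [24]: each pixel casts a vote of +1 (and +‖g(p)‖) at its positively-affected pixel and −1 (and −‖g(p)‖) at its negatively-affected pixel, the orientation counts are clamped at k_n, and F_n = (M_n/k_n)(|Õ_n|/k_n)^α. The default value of `k_n` is only illustrative, the final Gaussian smoothing and the averaging over radii are omitted, and all function names are hypothetical.

```python
import math

def sobel_gradient(img, x, y):
    """3x3 Sobel gradient (gx, gy) at an interior pixel of img[y][x]."""
    gx = (img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1]
          - img[y - 1][x - 1] - 2 * img[y][x - 1] - img[y + 1][x - 1])
    gy = (img[y + 1][x - 1] + 2 * img[y + 1][x] + img[y + 1][x + 1]
          - img[y - 1][x - 1] - 2 * img[y - 1][x] - img[y - 1][x + 1])
    return gx, gy

def radial_symmetry_map(img, n, alpha=1.0, k_n=9.9):
    """Unsmoothed contribution F_n at radius n (Gaussian smoothing omitted)."""
    h, w = len(img), len(img[0])
    M = [[0.0] * w for _ in range(h)]  # magnitude projection image
    O = [[0.0] * w for _ in range(h)]  # orientation projection image
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx, gy = sobel_gradient(img, x, y)
            mag = math.hypot(gx, gy)
            if mag == 0:
                continue  # flat neighbourhood: no vote is cast
            dx, dy = round(gx / mag * n), round(gy / mag * n)
            # sign = +1 is the positively-affected pixel, -1 the negative one.
            for sign in (+1, -1):
                px, py = x + sign * dx, y + sign * dy
                if 0 <= px < w and 0 <= py < h:
                    O[py][px] += sign
                    M[py][px] += sign * mag
    F = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            o_clamped = min(abs(O[y][x]), k_n)
            F[y][x] = (M[y][x] / k_n) * (o_clamped / k_n) ** alpha
    return F
```

With a single bright pixel on a dark background and n = 1, all eight neighbours vote for the bright pixel, so the map peaks there.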
2.2.2 Scale-invariant feature transform
The main idea of this detector is to search for candidate stable feature points across a series of image scales. First, the so-called scale space of the normalized image is constructed by convolving the image I(x, y) with a variable-scale Gaussian $G(x, y, \sigma) = \frac{1}{2\pi\sigma^{2}} e^{-(x^{2}+y^{2})/2\sigma^{2}}$.
The potentially stable feature points are detected as local extrema of the function D(x, y, σ) constructed as follows
that is, a convolution of the image with a difference of Gaussians. k is a factor that determines the difference between consecutive scales. An octave of scale space is a series of D(x, y, σ) functions spanning a doubling of σ. Each octave is divided in s intervals and, thus, k = 2^{1/s}. For each new octave, the Gaussian image produced with the doubled value of σ at the previous octave is first downsampled by a factor of 2 at each dimension. The local minima and maxima are found by 3D search in the 8 neighbors of the current scale and the respective 9 neighbors in each of the previous and the next scale.
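The scale sampling and the neighbourhood used in the extremum search can be illustrated with a short sketch. It only enumerates the σ values of one octave for k = 2^{1/s} and the 26 scale-space neighbours of a sample (8 in the current scale and 9 in each adjacent scale). The base scale passed in the usage below is an arbitrary illustrative value, and the function names are hypothetical.

```python
def octave_sigmas(sigma0, s):
    """Scales sigma0 * k^i, i = 0..s, with k = 2^(1/s), spanning one doubling of sigma."""
    k = 2 ** (1.0 / s)
    return [sigma0 * k ** i for i in range(s + 1)]

def scale_space_neighbours(x, y, level):
    """The 26 neighbours of (x, y, level): 8 in the same scale, 9 in each adjacent one."""
    return [(x + dx, y + dy, level + dl)
            for dl in (-1, 0, 1)
            for dy in (-1, 0, 1)
            for dx in (-1, 0, 1)
            if (dx, dy, dl) != (0, 0, 0)]
```

For example, `octave_sigmas(1.6, 3)` yields four scales whose last value is twice the first, which is the Gaussian image that seeds the next octave after downsampling.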
To correctly localize feature points, candidate points are fitted to the nearby data by interpolation. The Taylor expansion of the function D(x, y, σ) is given by
where D and its derivatives are calculated at the candidate feature point and x = (x, y, σ)^{T} is the offset from this point. The location of the extremum $\hat{\mathbf{x}}$ is found by taking the derivative of this expansion and setting it to zero, giving
If the offset $\hat{\mathbf{x}}$ is larger than 0.5 in any dimension, then the extremum should be closer to another candidate feature point. If so, the interpolation is again performed around a different point. Otherwise the offset is added to the candidate point to produce the interpolated estimate of the extremum.
To discard feature points of low contrast, the value of the second-order Taylor expansion is computed at the offset $\hat{\mathbf{x}}$. If this value is less than 0.03 then the candidate point is discarded. Otherwise it is kept, and its final location and scale are, respectively, y + $\hat{\mathbf{x}}$ and σ, where y is the original location of the candidate point at scale σ.
Another action that should be taken is to eliminate feature points with strong edge response. To do so, we first have to compute the second-order Hessian matrix H
whose eigenvalues are proportional to the principal curvatures of D. If we let α be the larger eigenvalue and β the smaller one, then it can be shown that
where r = α/β, Tr(H) = D_{ xx } + D_{ yy } = α + β is the trace of H and Det(H) = D_{ xx }D_{ yy } − (D_{ xy } )^{2} = αβ is the determinant of H. If the ratio R for a certain candidate feature point is larger than (r_{ th } + 1)^{2}/r_{ th } , then the feature point is rejected. The method sets the threshold eigenvalue ratio to r_{ th } = 10.
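The edge-response test can be written compactly as below, taking R to be Tr(H)^2/Det(H) as in [20]. The function takes the three second derivatives at the candidate point and applies the threshold (r_th + 1)^2 / r_th. Rejecting points with a non-positive determinant is an additional assumption made here (the principal curvatures then have opposite signs), and the function name is hypothetical.

```python
def passes_edge_test(dxx, dyy, dxy, r_th=10.0):
    """Return True if the candidate point is kept, False if it is rejected."""
    trace = dxx + dyy
    det = dxx * dyy - dxy * dxy
    if det <= 0:
        # Assumption: curvatures of opposite sign (or a degenerate Hessian)
        # are treated as unstable and rejected.
        return False
    ratio = trace * trace / det
    return ratio <= (r_th + 1) ** 2 / r_th
```

With r_th = 10 the threshold is 12.1, so an isotropic blob (ratio 4) is kept while a point whose curvatures differ by a factor of 100 is rejected.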
In our experiments the values of the various parameters involved in this method were chosen in accordance with [20]. Only the strength threshold for local maxima of the scale space was chosen to be equal to 0.05 to reduce the number of produced feature points. Examples of feature points extracted from the normalized versions of "Lena" and "Lake" are shown in Figure 3.
2.2.3 Speeded up robust features
This method was introduced as an alternative to SIFT focusing on computational cost reduction. A fast way of computing the Hessian matrix using integral images is proposed. This approach approximates the second order Gaussian derivatives by box filters. These, in turn, are used to compute the approximate determinant of the Hessian matrix. Instead of subsampling the filtered image of a previous layer, the scale space is constructed by increasing the filter size. For each new octave, the filter size increase per layer is doubled, and so is the sampling interval for the extracted feature points.
In the experiments that we conducted, the number of octaves that were analyzed was 5, the initial sampling interval was 2 and the Hessian response threshold was chosen to be 0.004. The feature points extracted from the normalized versions of "Lena" and "Lake" are presented in Figure 4.
2.2.4 Features from accelerated segment test
This feature detector should be more precisely called a corner detector. To test if a certain pixel p is a corner, 16 pixels lying on a circle centered at this pixel (specifically, a Bresenham circle of radius 3) are tested for similarity of intensity to the center pixel. If N contiguous pixels lying on this circle are all brighter than the center pixel by a quantity T (that is, I_{p→x} ≥ I_{ p } + T, x ∈ {1 . . . 16}) or darker than it by the same quantity (that is, I_{p→x} ≤ I_{ p } − T, x ∈ {1 . . . 16}), then the center pixel is considered a corner. A non-maximum suppression step follows to reduce the number of corner points. Since there is no score function on which to apply the suppression, we define one as [28]
where
After suppression only the candidates having a score value greater than those of all their 8 neighbors are preserved. The parameter values used in our experiments were N = 12 and T = 60. For the "Lena" and "Lake" images, the feature points extracted from their normalized versions are shown in Figure 5.
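The segment test itself is easy to sketch. The code below takes the intensity of the center pixel and the 16 ring intensities in circular order and checks for N contiguous pixels that are all brighter or all darker by at least T, wrapping around the end of the ring. The score function shown is an assumed form, max(Σ_bright (|I_{p→x} − I_p| − T), Σ_dark (|I_p − I_{p→x}| − T)), which is how the score in [28] is usually stated. The function names are hypothetical.

```python
def longest_circular_run(flags):
    """Length of the longest run of True values on a circular list."""
    if all(flags):
        return len(flags)
    best = run = 0
    # Walking the list twice lets a run wrap around the end of the ring.
    for f in list(flags) + list(flags):
        run = run + 1 if f else 0
        best = max(best, run)
    return best

def is_fast_corner(center, ring, n=12, t=60):
    """Segment test on the 16 ring intensities around a pixel of value center."""
    brighter = [v >= center + t for v in ring]
    darker = [v <= center - t for v in ring]
    return (longest_circular_run(brighter) >= n
            or longest_circular_run(darker) >= n)

def corner_score(center, ring, t=60):
    """Assumed score: larger of the summed bright and dark excesses over t."""
    bright = sum(abs(v - center) - t for v in ring if v >= center + t)
    dark = sum(abs(center - v) - t for v in ring if v <= center - t)
    return max(bright, dark)
```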
3 Watermarking scheme
The preprocessing stage described in the previous section is, as already stated, common for both watermark embedding and detection procedures. The extracted feature points are to be used as centers of the areas where the watermark is to be embedded.
The watermark pattern is initially constructed in the DCT domain as a rectangular patch whose size is related to the size of the normalized image (e.g., 64 × 64 for a normalized image of size 512 × 512, as in our examples). Other methods employing DCT in the field of image watermarking have been proposed in the past as well [30]. If we let b_{ i }, i = 1, . . . , N be binary sequences of length K (which is the number of DCT coefficients that are going to be modulated) created by thresholding pseudorandom values taken from the standard normal distribution (i.e., $\mathcal{N}(0,1)$), where N is the length of the multi-bit watermark message, and m_{ i } is the i-th bit of the message, then the middle zone of K DCT coefficients is modulated as follows:
The position of the middle zone of DCT coefficients is chosen so as to render the watermark both robust to attacks that affect high frequencies (such as JPEG compression or lowpass filtering) and invisible (by preserving low frequency content). The rest of the DCT coefficients are set to zero. The final watermark pattern is produced by inverse zigzag scanning of the zero-padded C sequence. An example of such a watermark (of size 64 × 64) and its spatial counterpart (its inverse DCT) is depicted in Figure 6. The range of non-zero coefficients is chosen to be [407, 3316] in the zigzag order, which means that K = 2910. We can notice the non-white properties of the watermark pattern in the spatial domain representation.
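The placement of the modulated sequence in the 64 × 64 block can be sketched as follows. The code generates a JPEG-style zigzag order and writes the K values to positions 407 through 3316, leaving everything else at zero. Treating the range as 1-based and inclusive is an assumption, chosen because it gives 3316 − 407 + 1 = 2910 = K, and the function names are hypothetical.

```python
def zigzag_order(n):
    """(row, col) pairs of an n x n block in JPEG-style zigzag order."""
    order = []
    for s in range(2 * n - 1):  # s = row + col indexes the anti-diagonals
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        if s % 2 == 0:
            rows = reversed(rows)  # even diagonals run from bottom-left to top-right
        order.extend((r, s - r) for r in rows)
    return order

def place_band(values, n=64, first=407, last=3316):
    """Inverse zigzag scan: put values at zigzag positions first..last (1-based)."""
    if len(values) != last - first + 1:
        raise ValueError("number of values must match the size of the band")
    block = [[0.0] * n for _ in range(n)]
    for value, (r, c) in zip(values, zigzag_order(n)[first - 1:last]):
        block[r][c] = value
    return block
```

The resulting block is the DCT-domain pattern; applying a 2D inverse DCT to it would give the spatial-domain pattern that is actually embedded.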
3.1 Watermark embedding
The original aim is to insert the watermark in the DCT domain of the normalized image (other domains such as the space/spatial-frequency domain [31] could alternatively be employed) or, equivalently, to insert the inverse DCT of the watermark in the spatial domain of the normalized image. However, by doing so, we would afterwards have to inversely normalize the watermarked normalized image to obtain the watermarked original image so that the watermark embedding process would be complete. This, as pointed out in Section 1, would impose interpolation errors, resulting in a version of the image that would be visibly corrupt compared to the original, even in areas that would not be normally affected by watermark embedding. To avoid this image degradation we choose to embed the inversely normalized version of the inverse DCT of the original watermark in the original image. Additionally, the watermark is to be embedded in all areas corresponding to the extracted feature points of the normalized image in a similar fashion as in [32]. This is done to increase watermark robustness as it is possible that not all originally detected feature points will also be detected after some attack. The overall embedding procedure is depicted in Figure 7.
More formally, for each embedding area g_{ i } (x, y), i = 1 . . . M (where M is the number of feature points) of the normalized image we additively embed the DCT-domain watermark as follows
where W (u, v) is the original DCT-domain watermark and α is the embedding strength. Given that the DCT is an orthogonal transform, Equation (26) can be rewritten as
where w(x, y) is the inverse DCT of W (u, v). If we followed directly this procedure for watermark embedding we would, eventually, have to inversely normalize the watermarked normalized image g^{w} (x, y) to produce the watermarked version f^{w} (x, y) of the original image:
where f^{w} (x, y) = g^{w} (x_{ b } , y_{ b } ). However, as aforementioned, the image would thus be visibly damaged. Instead of performing embedding according to Equation (27), we choose to embed the watermark directly in the original image. To do so, we have to inversely normalize the upright rectangular watermark pattern and embed it in the original image, centered at the points that correspond to the feature points extracted from the normalized image:
where w_{ o } (x, y) = w(x_{ b } , y_{ b } ) according to Equation (28), f_{ i } (x, y) with i = 1 . . . M are the areas of the original image where the watermark is to be embedded and $f_i^w(x, y)$ are the respective watermarked areas. An example of a watermarked version of the image "Lake" with PSNR = 24.69 dB using RST, along with its amplified difference from the original, is given in Figure 8. We can notice that some embedding areas may overlap because of the proximity of the corresponding feature points. We prefer to use all feature points as embedding area centers instead of applying some kind of criterion to select some of them. This is because we cannot be certain about the repeatability of feature points (that is, the probability that a specific point will be extracted in any altered version of the image). Since the watermark is embedded around all extracted points, it is also going to be detected around all feature points extracted during the detection stage, as will be described in the following section. Thus, to cover the case of overlapping areas, it would be more appropriate to describe embedding in an iterative manner
where i = 1, . . . , M, w_{ i } (x, y) is an image of the same size as f(x, y) that is non-zero only in the i-th embedding area (where w_{ o } is located), and $f_0^w(x, y) = f(x, y)$.
An evident problem that may arise because of watermark area overlapping is that the watermark might become visible, as one can see in Figure 8. To overcome this, we modify Equation (30) in the following way
where r(x, y) is the number of watermarked areas overlapping at point (x, y). If no watermarking has occurred at that point, then r(x, y) = 1. A non-iterative version of Equation (31) is
An example of applying this rule is given in Figure 9. The watermarked image now has PSNR = 40 dB and, in contrast to Figure 8, the watermark is hardly visible.
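The effect of dividing by r(x, y) can be illustrated with a small sketch. It assumes the non-iterative rule has the form f^w(x, y) = f(x, y) + α Σ_i w_i(x, y) / r(x, y), and for simplicity it stamps the same upright rectangular pattern at every center, whereas the method described above embeds the inversely normalized pattern. The function name and the list-of-lists image layout are hypothetical.

```python
def embed_with_overlap_control(image, pattern, centers, alpha):
    """Add alpha * pattern around each center, averaging where areas overlap."""
    h, w = len(image), len(image[0])
    ph, pw = len(pattern), len(pattern[0])
    total = [[0.0] * w for _ in range(h)]  # sum of all patterns covering a pixel
    count = [[0] * w for _ in range(h)]    # number of areas covering a pixel
    for cx, cy in centers:
        top, left = cy - ph // 2, cx - pw // 2
        for j in range(ph):
            for i in range(pw):
                y, x = top + j, left + i
                if 0 <= y < h and 0 <= x < w:
                    total[y][x] += pattern[j][i]
                    count[y][x] += 1
    # r(x, y) = 1 where no watermarking has occurred, hence max(count, 1).
    return [[image[y][x] + alpha * total[y][x] / max(count[y][x], 1)
             for x in range(w)]
            for y in range(h)]
```

Without the division, a pixel covered by two areas would receive twice the watermark energy, which is the visibility problem noted for Figure 8.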
3.2 Watermark detection
To perform watermark detection, the preprocessing step is needed as for watermark embedding. This means that the watermarked and, possibly, attacked image is first geometrically normalized and feature extraction is performed in the normalized image in the same manner as in the embedding stage (using one of the methods described in Section 2.2). Figure 10 shows the result for the watermarked image of Figure 9. As one can see, a great percentage of the originally extracted feature points used for watermark embedding (see Figure 2) are still present in the normalized watermarked image. Therefore, the watermark will be detected accurately in all respective areas. Since, as pointed out in Section 3.1, no algorithm for selection of certain feature points has been established, watermark detection is going to be performed in all corresponding areas. An outline of the detection procedure is shown in Figure 11.
Detection is performed blindly, meaning that no knowledge about the original image is required. Although embedding has been performed in the original image, detection is carried out in the normalized image. This is done to avoid the overhead of inversely normalizing the watermark, since the normalized image is already available. To decide about the value of each message bit that was originally embedded in the image, we first have to extract the sequence of DCT coefficients of each region where the watermark is supposedly embedded. If $f^{w'}(x, y)$ is the image in which the watermark is to be detected, we have to obtain its normalized version $g^{w'}(x, y)$. If we let M' be the number of extracted feature points in image $g^{w'}(x, y)$, the detector output D_{ j } for each message bit $\hat{m}_j$ is computed by linear correlation between the respective DCT band $\mathbf{G}_i^{w'}$, i = 1, . . . , M', and the binary sequence b_{ j } created by the same key as the one used for embedding, for all M' regions. This can be formulated as
The value of each extracted message bit $\hat{m}_j$ can be determined by comparing the detection value D_{ j } with zero.
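A correlation detector of this kind can be sketched as below. Several details are assumptions: the key-dependent sequences are taken to be ±1 valued (the sign of a standard normal sample), each message bit is mapped to ±1 before modulating its sequence, and the detector sums the correlations over all regions and decides each bit from the sign of the sum. The function names are hypothetical and the example works on coefficient lists rather than on real DCT blocks.

```python
import random

def make_sequences(key, n_bits, k):
    """One pseudorandom +/-1 sequence of length k per message bit, seeded by key."""
    rng = random.Random(key)
    return [[1 if rng.gauss(0.0, 1.0) >= 0 else -1 for _ in range(k)]
            for _ in range(n_bits)]

def modulate(bits, sequences):
    """Assumed modulation: sum over bits of (+1 or -1) times the bit's sequence."""
    k = len(sequences[0])
    return [sum((1 if m else -1) * b[i] for m, b in zip(bits, sequences))
            for i in range(k)]

def detect(bands, sequences):
    """Correlate every region's coefficient band with each sequence; threshold at zero."""
    decoded = []
    for b in sequences:
        d = sum(sum(g * s for g, s in zip(band, b)) for band in bands)
        decoded.append(1 if d > 0 else 0)
    return decoded
```

Summing over all regions means that a region where the watermark was lost contributes only noise, while every region that survived the attack adds to the correct sign.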
4 Experimental results
To test the efficiency of the proposed watermarking technique against local distortions as well as other image processing attacks, we have conducted extensive watermarking experiments on ten well-known images of different content, specifically "Airplane", "Boat", "House", "Peppers", "Splash", "Baboon", "Couple", "Lena", "Elaine", and "Lake". Each experiment consisted of embedding a 50-bit watermark message in each of the images and subsequently trying to extract it from the watermarked and attacked version of the image. For all techniques compared and for all images, PSNR is tuned to 40 dB. The bit error rate (BER), that is, the percentage of message bits that have not been detected correctly, is finally calculated. The proposed technique was tested for all four feature detectors under consideration and compared to the state-of-the-art techniques described in [19, 33]. These methods were selected from the recent literature because they are multi-bit, permit fine-tuning of PSNR and are built to resist geometric attacks. It is worth mentioning that these methods act globally, thus distorting the whole of the image. In contrast, our method affects only local regions, thus producing zero distortion in part of the image. This, in turn, results in improved imperceptibility. The parameter values for the feature detectors were those used in the examples of Section 2.2. The range of DCT coefficients used for watermarking with the technique by Dong et al. [19] was chosen to be [28681, 215478], that is, 186798 coefficients. The respective range of DCT coefficients for the technique by Tian et al. [33] was [7170, 53870], that is, 46701 coefficients. These ranges were chosen as equivalent to the one used in our method. In the following sections, we present results for local geometric attacks, global geometric attacks and signal processing attacks. Some of the attacks were implemented using the Checkmark benchmarking software [34].
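The two figures of merit used throughout the experiments can be computed as in the sketch below. PSNR is taken in its usual form, 10 log10(peak^2 / MSE), with an assumed peak value of 255 for 8-bit images, and BER is expressed as a percentage of wrongly decoded bits. The function names are hypothetical.

```python
import math

def psnr(original, modified, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    diffs = [(a - b) ** 2
             for row_o, row_m in zip(original, modified)
             for a, b in zip(row_o, row_m)]
    mse = sum(diffs) / len(diffs)
    return math.inf if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)

def bit_error_rate(sent, received):
    """Percentage of message bits that were not detected correctly."""
    if len(sent) != len(received):
        raise ValueError("messages must have the same length")
    errors = sum(1 for s, r in zip(sent, received) if s != r)
    return 100.0 * errors / len(sent)
```

For a 50-bit message, each wrongly decoded bit therefore adds 2 percentage points to the BER.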
4.1 Local geometric attacks
One classic non-geometric attack is column and line removal. In Figure 12 we can see results for this attack where the pair of values inside the parentheses denotes the number of columns and lines of the image that have been removed, and which are equidistant. We can notice that our technique performs better for all employed feature detectors. This was expected since the state-of-the-art techniques affect the image globally and cannot withstand attacks that modify image contents. The SIFT-based version of our technique demonstrated the best performance, followed by the RST-based and SURF-based versions, which have similar performance, and finally the FAST-based version, which is still better than the older techniques.
The next local distortion considered was the Stirmark attack. The experiment involved varying the jitter strength parameter from 1 to 7. As one can see in Figure 13, the proposed technique is superior to the technique by Dong et al. for all versions and especially for the SIFT-based one, but the technique by Tian et al. provides better performance for all cases but one.
Another attack considered in this category was image band cropping. The idea is to crop a band of certain width around the boundaries of the image. The band width in our experiments varied from 3 to 11 pixels as one can see in Figure 14. We can notice that the state-of-the-art techniques are seriously affected even by a small amount of cropping, whereas the various versions of our technique are always more robust and slowly degrade as the band gets wider. Although all versions provide similar performance, the SIFT-based version appears to prevail. This is, again, an expected behavior since the state-of-the-art techniques are not designed to withstand attacks that severely modify the global spectral representation of the image.
4.2 Global geometric attacks
Another category of possible distortions is that of global geometric attacks. These include rotation, scaling, shearing and combinations of them (i.e., general affine transforms). The first of these attacks presented here is the shearing attack. In this experiment, the varying parameters were the shearing percentages in both x and y axes. The results shown in Figure 15 prove that the technique by Tian et al. is not resistant against such an attack, which is expected since the technique does not apply affine normalization on the original image prior to watermark embedding. Performance, however, is excellent for the rest of the methods, with the SIFTbased version providing slightly better robustness than the technique by Dong et al. which, in turn, is a little more robust than the rest of our versions.
In the case of scaling, we conducted experiments with the scaling factor taking the values shown in Figure 16. The various methods do not present great differences in performance. However, the technique by Dong et al. is the best in all cases but one. The SIFT-based version of our technique is next in order of performance, followed by the SURF-based and RST-based versions and the technique by Tian et al., which alternate in terms of performance for the various parameter values, and finally the FAST-based version, which exhibits the lowest robustness.
Next, we tested rotation followed by cropping out the central region that does not contain black border pixels and finally scaling to the original size, with the rotation angle as the varying parameter, as presented in Figure 17. The technique by Tian et al. cannot withstand this attack. In contrast, the technique by Dong et al. is superior, although the SIFT-based version of our technique is very close to it in terms of robustness, followed by the SURF-based, RST-based, and FAST-based versions.
Another example of an attack comprising several stages is shown in Figure 18, where successive downsampling and upsampling has been performed on the watermarked images. The pairs of values in parentheses correspond to the downsampling and upsampling factors, respectively. We can notice that all methods display similar performance, with the technique by Tian et al. presenting the least varying robustness. The SIFT-based version appears to be, again, the best among all versions of our technique.
Finally, an experiment involving a general affine transform was conducted, which showed that the performance of the proposed technique is comparable to that of the technique by Dong et al., as one can see in Figure 19. All techniques but the one by Tian et al. survive this type of attack. The varying parameters, in this case, were the affine transform matrix coefficients, considering the form \left(\begin{array}{cc} a_{1} & a_{2} \\ a_{3} & a_{4} \end{array}\right). As in the aforementioned experiments, the SIFT-based version demonstrated the best results among the four versions of our method, followed by the SURF-based, RST-based, and FAST-based versions.
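To make the role of the four coefficients concrete, the sketch below applies a 2 × 2 matrix with rows (a1, a2) and (a3, a4) to a pixel coordinate and builds the shearing, scaling and rotation attacks of this section as special cases. It is an illustration under assumptions: the function names are hypothetical, the translation part of the transform is omitted, and a shearing percentage is assumed to map to an off-diagonal entry (for example, 5% to 0.05), a convention the text does not spell out.

```python
import math


def apply_affine(matrix, point):
    """Map (x, y) through the matrix ((a1, a2), (a3, a4))."""
    (a1, a2), (a3, a4) = matrix
    x, y = point
    return (a1 * x + a2 * y, a3 * x + a4 * y)


def shear_matrix(shear_x, shear_y):
    """Shearing along x and y (assumed convention: fraction, not percent)."""
    return ((1.0, shear_x), (shear_y, 1.0))


def scale_matrix(factor):
    """Uniform scaling by `factor`."""
    return ((factor, 0.0), (0.0, factor))


def rotation_matrix(angle_degrees):
    """Counter-clockwise rotation about the origin."""
    t = math.radians(angle_degrees)
    return ((math.cos(t), -math.sin(t)), (math.sin(t), math.cos(t)))
```

A full image attack would additionally resample pixel values at the transformed coordinates; only the coordinate mapping is shown here.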
4.3 Signal processing attacks
The third and last attack category considered in our experiments was that of signal processing manipulations. A very common attack is JPEG compression. Figure 20 presents results for quality factors ranging from 10% to 50%. We can notice that the state-of-the-art techniques are superior, with the technique by Tian et al. having the least variation in robustness. However, the performance of our method in all its versions is quite close to that of Dong et al., especially for high compression ratios (low quality factor values). Of course, for higher quality factor values, the performance of all versions improves since the distortion is smaller. The SIFT-based version of our method is the best, followed by the SURF-based, RST-based, and FAST-based versions.
A more modern compression technique, specifically H.264 intraframe compression, has also been considered. As we can see in Figure 21, all methods have similar performance, which improves with reduced quantization parameter value, as expected. The two state-of-the-art techniques perform slightly better, whereas the various versions of our method follow closely, with the SIFT-based version being the best, followed by the SURF-based, RST-based, and FAST-based versions.
Another common distortion is noise addition. For the purpose of our experiments, we added zero-mean Gaussian white noise with variance ranging from 0.001 to 0.006 to the watermarked images, whose pixel values had previously been scaled to the range [0, 1]. As we can see in Figure 22, our technique is not as robust against Gaussian noise as the technique by Dong et al., but is better than the technique by Tian et al. in its SIFT-based and SURF-based versions. The RST-based version follows and, finally, the FAST-based version exhibits the lowest robustness.
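The noise attack can be sketched as below for an image whose pixel values lie in [0, 1]. This is an illustrative sketch, not the experimental code: the function name `add_gaussian_noise`, the `seed` parameter, and the clipping of noisy values back to [0, 1] are assumptions.

```python
import random


def add_gaussian_noise(image, variance, seed=None):
    """Add zero-mean Gaussian white noise of the given variance.

    `image` is a list of rows with pixel values in [0, 1].
    Noisy values are clipped to [0, 1] (an assumption of this sketch).
    """
    rng = random.Random(seed)
    sigma = variance ** 0.5  # standard deviation from variance
    return [
        [min(1.0, max(0.0, pixel + rng.gauss(0.0, sigma))) for pixel in row]
        for row in image
    ]
```

Note that the variance is relative to the [0, 1] range: a variance of 0.006 corresponds to a standard deviation of about 0.077, or roughly 20 gray levels on a 0 to 255 scale.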
Finally, we performed lowpass filtering using a rotationally symmetric Gaussian filter of size 3 × 3 with standard deviation varying from 0.1 to 0.6. As one can see in Figure 23, the technique by Dong et al. is flawless for all values of standard deviation, followed in order of performance by the SIFT-based and SURF-based versions of our method, with the RST-based and FAST-based versions following. The technique by Tian et al. is better than the two latter versions only for small values of standard deviation. However, the variation in performance is quite small for all methods.
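A 3 × 3 rotationally symmetric Gaussian kernel of the kind described above can be built by sampling the Gaussian on the pixel grid and normalizing the weights. The sketch below assumes this common construction; the text does not state how the kernel was generated, and the function name `gaussian_kernel` is hypothetical.

```python
import math


def gaussian_kernel(size, sigma):
    """Return a size x size Gaussian kernel whose weights sum to 1."""
    half = (size - 1) / 2.0  # kernel centre in grid coordinates
    weights = [
        [
            math.exp(-((x - half) ** 2 + (y - half) ** 2) / (2.0 * sigma ** 2))
            for x in range(size)
        ]
        for y in range(size)
    ]
    total = sum(sum(row) for row in weights)
    return [[w / total for w in row] for row in weights]
```

Under this construction, a standard deviation of 0.1 leaves practically all of the weight on the central pixel, so the filter is close to the identity, whereas at 0.6 the central weight drops to roughly 0.45. This may help explain why the distortion, and hence the variation in performance, remains small over the tested range.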
In summary, the proposed technique, as expected due to its design, is more robust than the state-of-the-art techniques in terms of local geometric distortions. It is also better in terms of shearing attacks and downsampling followed by upsampling. Under rotation, scaling, general affine transform and signal processing attacks, such as JPEG compression, H.264 intraframe compression, lowpass filtering and noise addition, it is inferior only to the method by Dong et al., while still performing well. In its SIFT-based and SURF-based versions, it is even better than the method by Tian et al. for all these attacks except the compression attacks. The most competitive version of our method appears to be the SIFT-based one, followed by the SURF-based, the RST-based, and the FAST-based.
5 Conclusions
In the current article, a new image watermarking technique is proposed, which is robust against the usual local distortion attacks that the state-of-the-art techniques do not cope with efficiently. According to our technique, a multibit watermark is formed in the DCT domain, inversely transformed and, eventually, geometrically normalized to the spatial domain of the original image. This prevents image interpolation errors, in contrast to other techniques in the literature which embed the watermark in a normalized version of the image and afterwards apply inverse normalization. Furthermore, no local search is needed to achieve synchronization during detection. The use of a visibility rule during embedding prevents image deterioration due to overlapping of watermarked areas. Four different feature detection techniques are alternatively used in our study, namely SIFT, SURF, RST, and FAST, in order to produce the regions in which to embed the watermark. Our technique, especially in its SIFT-based version, proves to be more robust against local geometric attacks than certain state-of-the-art techniques and performs well under global geometric distortions and signal processing attacks.
References
O'Ruanaidh JJK, Dowling WJ, Boland FM: Watermarking digital images for copyright protection. IEE Proc Vision Image Signal Process 1996, 143(4):250–256. 10.1049/ip-vis:19960711
Berghel H, O'Gorman L: Protecting ownership rights through digital watermarking. Computer 1996, 29(7):101–103. 10.1109/2.511977
Cox IJ, Kilian J, Leighton FT, Shamoon T: Secure spread spectrum watermarking for multimedia. IEEE Trans Image Process 1996, 6(12):1673–1687.
Lie WN, Hsu TL, Lin GS: Verification of image content integrity by using dual watermarking on wavelets domain. In Proc of the IEEE International Conference on Image Processing (ICIP 2003). Volume 3. Barcelona, Spain; 2003:487–490.
Wang DS, Li JP, Wen XY: Biometric image integrity authentication based on SVD and fragile watermarking. In Proc of the 2008 Congress on Image and Signal Processing (CISP 2008). Volume 5. Sanya, China; 2008:679–682.
Depovere G, Kalker T, Haitsma J, Maes M, de Strycker L, Termont P, Vandewege J, Langell A, Alm C, Norman P, O'Reilly G, Howes B, Vaanholt H, Hintzen R, Donnelly P, Hudson A: The VIVA project: digital watermarking for broadcast monitoring. In Proc of the IEEE International Conference on Image Processing (ICIP 1999). Volume 2. Kobe, Japan; 1999:202–205.
Li L, Daiyuan P, Xiaoju L: A Security Video Watermarking Scheme for Broadcast Monitoring. In Proc of the 3rd International Workshop on Signal Design and Its Applications in Communications (IWSDA 2007). Volume 1. Chengdu, China; 2007:109–113.
Kirovski D, Malvar H, Yacobi Y: A dual watermark-fingerprint system. IEEE Multimedia 2004, 11(3):59–73. 10.1109/MMUL.2004.1
Shahid Z, Chaumont M, Puech W: Spread spectrum-based watermarking for Tardos code-based fingerprinting for H.264/AVC video. In Proc of the IEEE International Conference on Image Processing (ICIP 2010). Hong Kong, China; 2010:2105–2108.
Cox I, Miller M, Bloom J, Fridrich J, Kalker T: Digital Watermarking and Steganography. 2nd edition. Morgan Kaufmann, Burlington, MA; 2008.
Solachidis V, Pitas I: Circularly symmetric watermark embedding in 2D DFT domain. IEEE Trans Image Process 2001, 10(11):1741–1753. 10.1109/83.967401
Verstrepen L, Meesters T, Dams T, Dooms A, Bardyn D: Circular Spatial improved watermark embedding using a new Global SIFT synchronization scheme. In Proc of the 16th International Conference on Digital Signal Processing (DSP 2009). Volume 1. Santorini, Greece; 2009:1–8.
Zheng D, Wang S, Zhao J: RST invariant image watermarking algorithm with mathematical modeling and analysis of the watermarking processes. IEEE Trans Image Process 2009, 18(5):1055–1068.
Seo JS, Chang CD, Yoo D: Localized image watermarking based on feature points of scale-space representation. Pattern Recogn 2004, 37(7):1365–1375. 10.1016/j.patcog.2003.12.013
Lu W, Lu H, Chung FL: Feature based robust watermarking using image normalization. Comput Electric Eng 2010, 36(1):2–18. 10.1016/j.compeleceng.2009.04.002
Wang XY, Yang YP, Yang HY: Invariant image watermarking using multiscale Harris detector and wavelet moments. Comput Electric Eng 2010, 36(1):31–44. 10.1016/j.compeleceng.2009.04.005
Li LD, Guo BL: Localized image watermarking in spatial domain resistant to geometric attacks. AEU - Int J Electron Commun 2009, 63(2):123–131. 10.1016/j.aeue.2007.11.007
Lee HY, Kim H, Lee HK: Robust image watermarking using local invariant features. Opt Eng 2006, 45(3):037002. 10.1117/1.2181887
Dong P, Brankov JG, Galatsanos NP, Yang Y, Davoine F: Digital Watermarking Robust to Geometric Distortions. IEEE Trans Image Process 2005, 14(12):2140–2150.
Lowe DG: Distinctive Image Features from Scale-Invariant Keypoints. Int J Comput Vision 2004, 60(2):91–110.
Pham VQ, Miyaki T, Yamasaki T, Aizawa K: Geometrically Invariant Object-Based Watermarking using SIFT Feature. In Proc of the IEEE International Conference on Image Processing (ICIP 2007). Volume 5. San Antonio, Texas; 2007:473–476.
Jing L, Gang L, Jiulong Z: Robust image watermarking based on SIFT feature and optimal triangulation. In Proc of the 2009 International Forum on Information Technology and Applications (IFITA 2009). Volume 3. Chengdu, China; 2009:337–340.
Sun J, Lan S: Geometrical attack robust spatial digital watermarking based on improved SIFT. In Proc of the 2010 International Conference on Innovative Computing and Communication and 2010 Asia-Pacific Conference on Information Technology and Ocean Engineering (CICC-ITOE 2010). Volume 1. Macao, Macao; 2010:98–101.
Loy G, Zelinsky A: Fast radial symmetry for detecting points of interest. IEEE Trans Pattern Anal Mach Intell 2003, 25(8):959–973. 10.1109/TPAMI.2003.1217601
Bay H, Tuytelaars T, Van Gool L: SURF: Speeded Up Robust Features. In Proc of the European Conference on Computer Vision (ECCV 2006). Volume 1. Graz, Austria; 2006:404–417.
Bay H, Ess A, Tuytelaars T, Van Gool L: SURF: Speeded Up Robust Features. Comput Vision Image Understand 2008, 110(3):346–359. 10.1016/j.cviu.2007.09.014
Rosten E, Drummond T: Fusing points and lines for high performance tracking. In Proc of the 10th IEEE International Conference on Computer Vision (ICCV 2005). Volume 2. Beijing, China; 2005:1508–1511.
Rosten E, Drummond T: Machine learning for high-speed corner detection. In Proc of the European Conference on Computer Vision (ECCV 2006). Volume 1. Graz, Austria; 2006:430–443.
Canny JF: A computational approach to edge detection. IEEE Trans Pattern Anal Mach Intell 1986, 8(6):679–698.
Bors AG, Pitas I: Image watermarking using block site selection and DCT domain constraints. Optics Express 1998, 3(12):512–522. 10.1364/OE.3.000512
Stankovic S, Orovic I, Zaric N: An application of multidimensional time-frequency analysis as a base for the unified watermarking approach. IEEE Trans Image Process 2010, 19(3):736–745.
Nikolaidis A, Pitas I: Region-based image watermarking. IEEE Trans Image Process 2001, 10(11):1726–1740. 10.1109/83.967400
Tian H, Zhao Y, Ni R, Pan JS: Spread spectrum-based image watermarking resistant to rotation and scaling using radon transform. In Proc of the Sixth International Conference on Intelligent Information Hiding and Multimedia Signal Processing (IIH-MSP 2010). Volume 1. Darmstadt, Germany; 2010:442–445.
Pereira S, Voloshynovskiy S, Madueno M, Marchand-Maillet S, Pun T: Second generation benchmarking and application oriented evaluation. In International Workshop on Information Hiding (IHW 2001). Volume 1. Pittsburgh, PA, USA; 2001:340–353.
Acknowledgements
A. Nikolaidis wishes to acknowledge financial support provided by the Research Committee of the Technological Educational Institute of Serres, Greece, under grant SAT/IC/2331125/1.
Additional information
Competing interests
The authors declare that they have no competing interests.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
About this article
Cite this article
Nikolaidis, A. Local distortion resistant image watermarking relying on salient feature extraction. EURASIP J. Adv. Signal Process. 2012, 97 (2012). https://doi.org/10.1186/1687-6180-2012-97
DOI: https://doi.org/10.1186/1687-6180-2012-97
Keywords
 digital image watermarking
 local image distortions
 image moments
 radial symmetry transform
 discrete cosine transform
 feature extraction
 SIFT
 SURF
 FAST