Optimal detector for multiplicative watermarks embedded in the DFT domain of non-white signals

This paper deals with the statistical analysis of the behavior of a blind robust watermarking system based on pseudorandom signals embedded in the magnitude of the Fourier transform of the host data. The host data that the watermark is embedded into is one-dimensional and non-white, following a speciﬁc probability model. The analysis performed involves theoretical evaluation of the statistics of the Fourier coeﬃcients and the design of an optimal detector for multiplicative watermark embedding. Finally, experimental results are presented in order to show the performance of the proposed detector versus that of the correlator detector.


I. Introduction
The risk of illegal copying, reproduction and distribution of copyrighted multimedia material is becoming more threatening with the all-digital evolving solutions adopted by content providers, system designers and users.Thus, copyright watermark protection of digital data is an essential requirement for multimedia distribution.Robust watermarks can offer a copyright protection mechanism for digital media.The watermark is a signal that contains information about the copyright owner and it is embedded permanently in the multimedia data.It introduces imperceptible content changes that can be detected by a detection program.
Robustness is a very important property of the watermarking scheme.The watermarks must be robust to distortions, such as those caused by image processing algorithms (in the case of image watermarks).Image processing does not modify only the image but may also modify the watermark as well.Thus, the watermark may become undetectable after intentional or unintentional image processing attacks.The watermark must also be imperceptible.The watermark alterations should not decrease the perceptual media quality.A general watermarking framework for copyright protection has been presented in [1], [2] and describes all these issues in detail.

DRAFT
Correlation detection of watermarked signals is involved in the majority of the watermarking techniques in the literature.However, correlator detector is optimal and minimizes the error probability only in cases when the signal follows a Gaussian distribution.There are papers in the literature that propose detectors, different than the correlator, in the cases when the host data do not follow a Gaussian distribution [22]- [24].In [22], the embedding domain is DCT.The DCT coefficient distribution is modelled as a generalized Gaussian one.Then, the maximum likelihood (ML) criterion is used in order to derive the optimal detector structure.In [24], [25], the watermark is embedded in the magnitude of the DFT domain.In this case, the authors assume that the Fourier magnitude does not follow the generalized Gaussian distribution.They propose the Weibull one, due to the facts that its support domain is the set of the positive real numbers and that it represents a big probability distribution family.In the present paper, the watermark is also assumed to be embedded in the magnitude of the DFT domain.Moreover, we assume that the signal is not white and that it follows a specific probability model.The novelty of the present paper, that is also the main difference from the papers reported above, is that the DFT magnitude distribution is analytically calculated and it is proven to be different than the Weibull distribution [24].Finally, we construct the optimal detector according to the Neyman-Pearson criterion.
The paper is organized as follows.The watermarking system model is presented in section II.In the next section, the signal model is presented and the distribution of DFT magnitude coefficients is shown.Then, in section IV, the construction of the optimal detector is depicted.
In sections V and VI, the experimental results and the conclusions are presented.

II. Watermarking system model
Let s(i), i = 1, 2, ..., N be the samples of a host signal s with length N .Let also S(k), k = 1, 2, ...N be the Discrete Fourier transform coefficients of s(i) and M (k), P (k) the magnitude of the Fourier transform (M (k) = |S(k)|) and its phase, P (k) = arg(S(k)), respectively.Suppose that S R (k) and S I (k) denote the real and the imaginary part of S(k) respectively.As mentioned in the introduction, the watermark embedding is performed in the Fourier domain and more specifically in its magnitude.Thus, starting from the magnitude of the Fourier transform M , we produce the watermarked transform magnitude.Let us assume that M is the watermarked magnitude generated by the watermark embedding function f : In the previous formula, vector W contains the samples of the watermark sequence.This sequence is produced by a random generator.We assume that W (k), k = 1, 2, ..., N is a random signal that consists solely of 1 s and −1 s and that it is uniformly distributed in its domain {1, −1}.
Thus, the mean of the watermark sequence samples W (k) is equal to zero.In the case that f is of a linear form, it can be easily proven that the mean of the watermarked magnitude remains unaltered.This property increases both the watermarked signal imperceptibility as well as its robustness.The parameter p that is employed in (1) is a real number that determines the watermark strength.An increase in the value of p results in a more robust (and more easily perceptible) watermark.
If the embedding function is multiplicative, the watermarked magnitude is given by: In order to compute the final watermarked signal s (in the spatial domain), the inverse discrete DRAFT Fourier transform is applied on the watermarked magnitude M and the initial DFT coefficient phase P .
Given a possibly watermarked signal y, the watermark detector aims at deciding whether y hosts a certain watermark W . Watermark detection can be expressed as a hypothesis test where two hypotheses are possible: • H 0 : signal y does not host watermark W • H 1 : signal y hosts watermark W .
It should be noted that hypothesis H 0 can occur either in the case that the signal y is not watermarked (hypothesis H 0a ) or in the case that the signal y is watermarked by another watermark W , where W = W (hypothesis H 0b ).The events H 0a , H 0b are mutually exclusive and their union produces the hypothesis H 0 .
The performance of a watermarking method depends mainly on the selection of the watermark detector d.The correlator detector is the most commonly used watermark detector.It has been employed in many watermarking methods which perform not only spatial domain watermarking but also watermarking in transform domains.Its test statistic is the correlation between the watermark and the possibly watermarked signal y.
In order to decide on the valid hypothesis, the detector output d is compared against a suitably selected threshold T .The evaluation of the watermarking method can be measured by the false alarm P fa and the false rejection P fr probabilities.False alarm probability is the type I error which is the probability of rejecting hypothesis H 0 , even though it is true.In our case it is the probability of detecting a watermark W in a signal that is not watermarked by the watermark DRAFT W . Correspondingly, false rejection is the type II error, whose probability is that of not detecting a watermark W in a signal that is actually watermarked by the watermark W (accept H 0 even it is false).
In most of the watermarking methods, hypothesis H 0 is accepted, when the detector output is greater than a threshold T .Thus, false alarm and false rejection probabilities can be expressed as: The calculation of the above probabilities can be performed, if the detector distribution for both hypotheses is known.Thus, assuming that the f 0 (x), f 1 (x) are the probability density functions for the hypotheses H 0 and H 1 respectively, the error probabilities are given by: According to the above equations, P fa and P fr depend on the threshold T .A possible change of T increases one probability and decreases the other.Thus, apart from the detector, an appropriate threshold should be selected.In many cases the detector is expressed as a sum or a product of almost independent terms that obey the same distribution.According to the central limit theorem, the detector (or the detector logarithm in case of multiplicative embedding) obey a Gaussian distribution.Thus, in this case, the error probabilities can be written as ), where µ 0 , µ 1 are the mean values and σ 0 , σ 1 the standard deviations of the distributions f 0 , f 1 respectively.

III. Signal model and distribution of DFT magnitude coefficients
A basic step for the optimal detector construction is the computation of the transform coefficient distribution.Thus, in this section, the distribution of the DFT magnitude coefficients of a signal will be computed, whose model is ergodic and wide-sense stationary stochastic process.
The signal statistics are modeled as: where E(•) denotes the expected value.
A first order separable autocorrelation function model will be assumed [26]: where a is a real-valued constant.Typically a is in the range [a = 0.9, ...0, 99] for several class of 1D signals (e.g.audio).It should be noted that if a tends to zero, the autocorrelation approaches a Dirac distribution.
It is obvious from equations ( 5) and ( 8) that the signal correlation F s,s (D) depends only on the absolute difference D of the signal indices.The DFT transform of signal s(i), i = 1, ..., N is given by the following equation:

DRAFT
We can assume that the DFT transform (9) of the signal follow a Gaussian distribution due the Central Limit Theorem for random variables with small dependency [27].This assumption is valid at least for small values of parameter a.In order to show this experimentally we have performed the Kolmogorov-Smirnov test for all the coefficients.In Figure 2 the p-values for each coefficient for the case of a = 0 (Figure 2a) and a = 0.995 (Figure 2b) are illustrated.The statistic parameters used in the Kolmogorov-Smirnov test (expected value and variance) were theoretically derived from equations 12, 13, and 30.It is shown that the p-values are very low which means that all the coefficients follow the Gaussian distribution.
Thus, it is proved that the mean of S(k) is given by: The proof of µ S(k) is given in the Appendix.The variance of S(k) will be computed separately for its real S R (k) and imaginary S I (k) part according to the following formula: By substituting equation ( 5) in the equation (10) we result in The final results for the variances of S R (k) and S I (k) are given below: The proof of the above equations is given in the Appendix.
In Figure 1  were equal, then we could conclude that the distribution of Rayleigh one [28]: However, the variances of the real and the imaginary part of S(k) are equal only in the case of signals whose samples can be modeled as independent identically distributed (i.i.d) random variables (a = 0).Thus, for any other case we have to use the probability density function of a where x ∼ N (0, σ 2 1 ), y ∼ N (0, σ 2 2 ) and σ 1 = σ 2 .It is proved in the Appendix that the pdf of such a random variable z is given by: where I 0 denotes the modified Bessel function and σ 1 , σ 2 are the standard deviations of the real DRAFT and imaginary part of S(k).Thus, the Discrete Fourier magnitude distribution is given by: IV. Optimal watermark detector In the next section the optimal watermark detector for multiplicative watermarks will be evaluated by using the likelihood ratio test (LTR).According to the Neyman-Pearson theorem, in order to maximize the probability of detection P D for a given P fa = e, we decide for H 1 if: where the threshold T can be found from The test of ( 16) is called Likelihood Ratio Test (LTR).In the sequel the probability density functions of the watermarked signal P (M ; H 0 ), P (M ; H 1 ) will be computed for watermarked signals with a known and an unknown (random) watermark.For P (M ; H 0 ) we assume that the watermark is a random one whose pdf is modeled by: According to the embedding formula (2), it can be easily proved that the pdf of the watermarked signal is equal to: DRAFT By substituting f M with the probability density function of the distribution in equation ( 14) we find: In the case of hypothesis H 1 , the signal is watermarked by the known watermark W . Thus, the probability is given by ( 14): Assuming independence between the transform coefficients of S, we conclude that: By combining equations ( 14), ( 21) and ( 16) we get the optimal detector scheme:

A. Threshold estimation
The threshold is selected in such a way so that a predefined false alarm error probability can be achieved.In order to calculate the false alarm error probability, we firstly have to know the detector distribution in the case of erroneous watermark detection.We assume that the DRAFT distribution is Gaussian.Then, we estimate the distribution parameters from the statistics of the empirical distribution.The latter is calculated by detecting erroneous watermarks from the (possibly) watermarked signal.
From the empirical distribution statistics and the desired false alarm error probability, we calculate the threshold according the equation below: where µ and σ are the expected value and the standard deviation of the detector output set.
Thus, according to the equation above, the threshold T is given by: The total number of such detections needed is not predefined but, should be sufficiently large if we want to accurately approximate this distribution.The minimal number of experiments required in order to sufficiently approximate the distribution is found through the following procedure.We estimate the distribution parameters, µ, σ using the empirical distribution produced from L detector outputs, for an increasing L in a certain range of L, [L min , L max ].Then, according to these statistics, we calculate the threshold in order to achieve a false alarm probability e.g.equal to 10 −10 .We stop for an L * that leads a rather stable estimation of T .
This procedure is illustrated in Figure 4 for L min = 5, L max = 1000.According to this Figure , the threshold value is stabilized when the number of experiments becomes greater than L * = 100.
Of course, L * depends on the watermark embedding power, the signal length, and the signal characteristics.For this reason, we propose to execute the above procedure for representative signal sets and for the chosen embedding power in a particular application.

V. Experimental results
In this section, experiments are performed in order to verify the superiority of the proposed detector against the classical correlator one.The experiments are performed on one dimensional digital signals.
In order to construct signals with the desired autocorrelation properties (8) we filter a random white normally distributed signal S of zero mean value with an IIR filter: This filtering creates a signal having an autocorrelation function of the form: that is identical to (8) for µ 2 s = 0.The variance of the filtered signal equals to Watermark embedding is performed according to (2).Then, the watermarked signal is fed to both the correlator (4) and the proposed detectors (23).In order to estimate false alarm and false rejection probabilities, both correct and erroneous keys have been used during detection.
The above procedure is executed for a large number of different keys.Due to the Central Limit Theorem for products [29], the distribution of L(x) is log-normal.Consequently, the distribution of ln(L(x)) is normal, where ln(x) is the natural logarithm of x.In order to show the very good approximation of the detector output by the Gaussian distribution, we depict its empirical distribution in Figure 3.In Figures 3a and 3b the detector distribution for detection using an erroneous and correct key respectively is shown.The fitting is very good since the Kolmogorov-Smirnov null hypothesis has not been rejected for level of significance equal to 0.01 In the following the proposed detector will be the ln(L(x)) instead of L(x).Let dr(x) and de(x) be the distributions of the detector outputs for detecting correct and erroneous watermarks respectively.

DRAFT
The calculation of the empirical mean and standard deviation by approximating the empirical pdf with a normal one, can be used to produce Receiver Operator Characteristic (ROC) curves for both detector outputs.ROC curves will be used for comparing detector performance.
The above procedure is performed for several values of parameter a.The detection was performed using: • the correlator detector • the proposed detector considering the parameter a known In Figure 5, the value of the parameter a is zero.This is a special case for white signals, no filtering is performed by equation( 26).In the subsequent Figures, the parameter a increases, reaching the value a = 0.995 in the last Figure (Figure 8).By observing the Figures 5-8, we can conclude that: • the proposed detector performance is by far better that the correlator detector one.
• The performance of the proposed detector using the estimated parameter a, is almost the same with that using the known parameter a since that their ROC curves are very close to each other • The ROC curves that correspond to the proposed detector are not affected significantly by the value parameter a contrary to the correlator detect or ROC curves that show very decreased detection performance for highly correlated signals, i.e. as parameter a tends to one.

VI. Conclusions and future work
This paper deals with the statistical analysis of the behavior of a blind robust watermarking system based on 1-D pseudorandom signals embedded in the magnitude of the Fourier transform of the data and the design of an optimum detector.A multiplicative embedding method is examined and experiments are performed in order to show the proposed detector's improved efficiency against correlator one.
DRAFT Thus, the mean is equal to: II. Calculation of Discrete Fourier coefficient variance S(k) is a complex signal thus the variances of real and imaginary part will be calculated separately.

A. Variance of real part
The variance of the real part of S(k) is given by: The second sum has been calculated in 30.The first sum equals to: Using the equation 3 of 1.353 of [31] n−1 k=0 and splitting the sum we derive equation 12.

B. Variance of imaginary part
The variance of the imaginary part of S(k) is given by: By splitting the above equation as in 34 and using the equation 1 of 1.353 of [31] that has the form: we conclude in equation 13.

III. Calculation of the f z (z) distribution
In this section the distribution of f z (z) = x 2 + y 2 , where x ∼ N (0, σ 2 1 ), y ∼ N (0, σ 2 2 ) and σ 1 = σ 2 , will the calculated.By substituting x by z cos(t) and y by z sin(t) the above distribution equals with: by the parameter q equation 37 has the form: After taking into account the periodicity of the sin function and its symmetry in the integral the integral in equation 38 can be written as Using the formula 3.339 of [31] where I 0 (z) is the modified Bessel function of z, the integral in equation 39 equals: Finally, substituting q and using equation 41, equation 38 has the form: In the special case that σ 1 = σ 2 , the probability density function f (z) is the Rayleigh function.

Correlator
Proposed detector using a =0.9 Proposed detector using estimated a =0.90919Proposed detector using normalized correlation Fig. 6.ROC curves of correlator, the normalized correlator, the proposed detector by using the known parameter a and the proposed detector after estimating the parameter a. a = 0.9 DRAFT the theoretical variances and experimental of real and imaginary part of the Discrete Fourier transform coefficients are shown.In this example 100 signals of length 1000 obeying the model (8) were used for a = 0, 99.The next step is to calculate the distribution of the Fourier magnitude |S(k)|.By observing (10), we conclude that all but the DC term, have zero mean.If the variances of S R (k) and S I (k)

•
the proposed detector by estimating the (unknown) parameter a from the watermark sequence • the normalized correlator In Figures 5-8 the performance of the proposed detector against the correlator one is shown, for several values of parameter a in the range [0, 1].

Fig. 5 .
Fig. 5. ROC curves of correlator, the normalized correlator, the proposed detector by using the known parameter a and the proposed detector after estimating the parameter a. a = 0