The optimal decoder attempts to obtain an estimate \widehat{b} of *b* such that the error probability {P}_{e}=Pr\left\{\widehat{b}\ne b\right\} is minimized. This is achieved by the maximum a posteriori (MAP) decoder, which simplifies to the maximum likelihood (ML) decoder under the assumption of equally likely information bits. The ML estimate \widehat{b} can be expressed as

\widehat{b}=\underset{b\in \left\{\pm 1\right\}}{\mathsf{\text{arg}}\mathsf{\text{max}}}{f}_{\mathbf{R}}\left(\mathbf{r}|b,A,\mathbf{s}\right),

(9)

where *f*_{**R**}(**r**|*b*, *A*, **s**) denotes the conditional PDF of **r** given *b*, *A*, and **s**. Clearly, the distribution of the host signal plays an important role in the ML decoder structure; for this reason, the distributions of the DCT coefficients and of the DFT magnitudes were introduced in Section 3. The ML decoder for binary information hiding can be expressed through the likelihood ratio rule: it decides \widehat{b}=+1 if

\frac{{f}_{\mathbf{R}}\left(\mathbf{r}|b=+1\right)}{{f}_{\mathbf{R}}\left(\mathbf{r}|b=-1\right)}>1.

(10)

As discussed in Section 3, the PDF of the host signal differs depending on the transform domain, and in practice different transform domains may be used for data hiding depending on the desired properties. The derivation and performance analysis of the ML decoder for SS embedding therefore require the distribution of the host signal in the specific domain. In the following subsections, the ML decoders for the SS and ISS embedding schemes in the DCT and DFT domains are derived. It is worth mentioning that the ML decoder for the SS scheme in the DCT domain has already been proposed in [26].

### 4.1. ML decoders in the DFT domain

One possible host signal for information hiding is the DFT magnitude domain. However, it is important to note that the SS (1) and ISS (5) embedding schemes cannot be applied directly in this domain because of a special property of the DFT magnitude coefficients: they must always be positive. To guarantee that the watermarked signal remains positive, we propose a modified SS embedding scheme in the DFT magnitude domain as follows

\mathbf{r}=\mathbf{x}+\mathbf{s}Ab+\mathbf{e},

(11)

where the insurance vector **e** = [*e*_{1}, *e*_{2}, ..., *e*_{l}]^{T} is designed to keep the *r*_{i}'s positive. More specifically, if *x*_{i} + *s*_{i}*Ab* is positive, the corresponding element *e*_{i} is set to zero; if *x*_{i} + *s*_{i}*Ab* is negative, *e*_{i} = -*s*_{i}*Ab* is set so that *r*_{i} equals *x*_{i}, which is consequently positive. In summary, *e*_{i} in (11) can be formulated as

{e}_{i}=\left(-{s}_{i}Ab\right)u\left(-\left({x}_{i}+{s}_{i}Ab\right)\right).

(12)
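As a concrete illustration, the embedding rule (11)-(12) can be sketched in a few lines of NumPy; the helper name and variable names below are our own, not from the paper:

```python
import numpy as np

def modified_ss_embed(x, s, A, b):
    """Sketch of the modified SS embedding (11)-(12) in the DFT magnitude domain.

    x: non-negative host coefficients (DFT magnitudes); s: signature code in {+1, -1};
    A: watermark amplitude; b: information bit in {+1, -1}.
    """
    w = s * A * b                       # watermark term s_i * A * b
    e = np.where(x + w < 0, -w, 0.0)    # insurance vector (12): cancel w_i where x_i + w_i < 0
    return x + w + e                    # Eq. (11); every r_i stays non-negative
```

Coefficients where the insurance vector fires simply carry *r*_{i} = *x*_{i}, matching the discussion below.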

The modified SS embedding scheme (11) and the vector **e** defined in (12) reveal that, for those coefficients where *e*_{i} > 0, the watermarked signal becomes *r*_{i} = *x*_{i}, meaning that the coefficient *r*_{i} does not convey information directly. However, because of the structure of the optimal decoder, derived shortly, such *r*_{i}'s can still aid decoding. We also note that, for the modified SS scheme (11), increasing the watermark amplitude *A* increases the number of coefficients with *e*_{i} > 0 and consequently decreases the number of watermarked coefficients. Overall, the goal of this modified SS embedding scheme is to make all the watermarked coefficients positive.

Having proposed the modified SS embedding scheme for information hiding in the DFT magnitude domain, we can derive the optimal decoder using the distribution of the host signal. Referring to expression (8), and assuming independent and identically distributed coefficients, the joint PDF of the host data is

{f}_{\mathbf{X}}\left(\mathbf{x}\right)=\mathsf{\text{exp}}\left\{-\sum _{i=1}^{l}{\left(\frac{\left|{x}_{i}\right|}{{\eta}_{i}}\right)}^{{\gamma}_{i}}\right\}\prod _{i=1}^{l}\left[\left(\frac{{\gamma}_{i}}{{\eta}_{i}^{{\gamma}_{i}}}\right){\left(\left|{x}_{i}\right|\right)}^{{\gamma}_{i}-1}u\left({x}_{i}\right)\right].

(13)

Based on the ML decoder structure (10), the decoder decides \widehat{b}=+1 if

\frac{{f}_{\mathbf{R}}\left(\mathbf{r}|b=+1\right)}{{f}_{\mathbf{R}}\left(\mathbf{r}|b=-1\right)}=\frac{\mathsf{\text{exp}}\left\{-{\sum}_{i=1}^{l}{\left(\frac{\left|{r}_{i}-{s}_{i}A\right|}{{\eta}_{i}}\right)}^{{\gamma}_{i}}\right\}{\prod}_{i=1}^{l}\left(\frac{{\gamma}_{i}}{{\eta}_{i}^{{\gamma}_{i}}}\right){\left(\left|{r}_{i}-{s}_{i}A\right|\right)}^{{\gamma}_{i}-1}u\left({r}_{i}-{s}_{i}A\right)}{\mathsf{\text{exp}}\left\{-{\sum}_{i=1}^{l}{\left(\frac{\left|{r}_{i}+{s}_{i}A\right|}{{\eta}_{i}}\right)}^{{\gamma}_{i}}\right\}{\prod}_{i=1}^{l}\left(\frac{{\gamma}_{i}}{{\eta}_{i}^{{\gamma}_{i}}}\right){\left(\left|{r}_{i}+{s}_{i}A\right|\right)}^{{\gamma}_{i}-1}u\left({r}_{i}+{s}_{i}A\right)}>1.

(14)

Generally, the ML decoder could be expressed as

\widehat{b}=\mathsf{\text{sign}}\left\{z\right\},

(15)

where, in this case, after some manipulations, the test statistic corresponding to (14) is obtained as

z=\sum _{i=1}^{l}\left[{\left(\frac{\left|{r}_{i}+{s}_{i}A\right|}{{\eta}_{i}}\right)}^{{\gamma}_{i}}-{\left(\frac{\left|{r}_{i}-{s}_{i}A\right|}{{\eta}_{i}}\right)}^{{\gamma}_{i}}+\left({\gamma}_{i}-1\right)\mathsf{\text{ln}}\left(\left|\frac{{r}_{i}-{s}_{i}A}{{r}_{i}+{s}_{i}A}\right|\right)+\mathsf{\text{ln}}\left(\frac{u\left({r}_{i}-{s}_{i}A\right)}{u\left({r}_{i}+{s}_{i}A\right)}\right)\right].

(16)
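A minimal NumPy sketch of the decoder (15) with the test statistic (16) may help; the function name is ours, and the step-function terms are handled through logarithms of indicator values (assumption: the PDF parameters η_i, γ_i are known at the receiver, as the text states):

```python
import numpy as np

def ml_decode_ss_dft(r, s, A, eta, gamma):
    """Sketch of the ML decoder (15) with test statistic (16) for the
    modified SS scheme in the DFT magnitude domain."""
    rp, rm = r + s * A, r - s * A
    with np.errstate(divide="ignore", invalid="ignore"):
        z = ((np.abs(rp) / eta) ** gamma
             - (np.abs(rm) / eta) ** gamma
             + (gamma - 1) * np.log(np.abs(rm / rp))
             # ln u(.) is 0 for a positive argument and -inf otherwise
             + np.log((rm > 0).astype(float))
             - np.log((rp > 0).astype(float)))
    return 1 if np.sum(z) > 0 else -1
```

A coefficient with *r*_{i} + *s*_{i}*A* < 0 drives the sum to +∞ and forces the decision \widehat{b}=+1, mirroring the error-free cases discussed next.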

Investigating the test statistic of the ML decoder in the DFT magnitude domain reveals that the watermark amplitude as well as the PDF parameters must be available at the receiver side. We now show that the decoding procedure is error free in two cases. In the first case, where *b* = +1 and there is one coefficient with *r*_{i} + *s*_{i}*A* < 0 at the decoder side, the test statistic in (16) goes to infinity and thus the decoder (15) definitely decides \widehat{b}=+1. More precisely, in this case, ln(*u*(*r*_{i} - *s*_{i}*A*)) is zero while ln(*u*(*r*_{i} + *s*_{i}*A*)) goes to minus infinity, and thus the test statistic in (16) goes to infinity. Similarly, in the second case, where *b* = -1 and there is one coefficient with *r*_{i} - *s*_{i}*A* < 0 at the decoder side, the test statistic in (16) goes to minus infinity and thus the decoder (15) definitely decides \widehat{b}=-1.

As mentioned earlier, those coefficients which do not convey information directly can still aid decoding indirectly. To see this, assume that *b* = +1 and *x*_{i} + *s*_{i}*A* < 0, so that the corresponding coefficient becomes *r*_{i} = *x*_{i} at the embedding side. At the decoder side we then have *r*_{i} + *s*_{i}*A* < 0, which, based on the above discussion, leads to the decision \widehat{b}=+1. Similarly, assume that *b* = -1 and *x*_{i} - *s*_{i}*A* < 0, so that the corresponding coefficient again becomes *r*_{i} = *x*_{i} at the embedding side. At the decoder side we then have *r*_{i} - *s*_{i}*A* < 0, which leads to the decision \widehat{b}=-1. In both cases the decoding is error free; therefore, even though some coefficients do not convey hidden information directly, they can still contribute to accurate decoding indirectly.

Deriving an analytic expression for the probability of error is always desirable, since it allows the error behavior to be analyzed. We first show that the test statistic used for decoding can be modeled as a Gaussian random variable. The test statistic (16) is the sum of *l* random variables; assuming independent host signal samples and knowledge of the signature code, the central limit theorem implies that the test statistic can be approximated as a normal random variable when *l* is large. Assuming that the signature code takes the values +1 and -1 with equal probability, one can show that the conditional PDFs of the test statistic are

{f}_{Z}\left(z|b=+1\right)=\mathcal{N}\left({m}_{z},{\sigma}_{z}^{2}\right),

(17)

{f}_{Z}\left(z|b=-1\right)=\mathcal{N}\left(-{m}_{z},{\sigma}_{z}^{2}\right),

(18)

where *m*_{z} and {\sigma}_{z}^{2} represent the mean and the variance of the test statistic. Assuming equally likely information bits, i.e., *Pr*{*b* = +1} = *Pr*{*b* = -1} = 1/2, the probability of error can be expressed as

{P}_{e}=\frac{1}{2}\mathsf{\text{erfc}}\left(\sqrt{\frac{\mathsf{\text{WIR}}}{2}}\right)=\frac{1}{2}\mathsf{\text{erfc}}\left(\sqrt{\frac{{m}_{z}^{2}}{2{\sigma}_{z}^{2}}}\right),

(19)

where \mathsf{\text{erfc}}\left(x\right)=\frac{2}{\sqrt{\pi}}{\int}_{x}^{\infty}{e}^{-{t}^{2}}dt is the complementary error function and WIR is referred to as the watermark-to-interference ratio. It can be shown that the mean and variance of the test statistic (16), when *b* = +1, for all host signal coefficients with *x*_{i} > 2*A*, and assuming equally probable signature code values, *Pr*{*s*_{i} = +1} = *Pr*{*s*_{i} = -1} = 1/2, are given by

{m}_{z}=\sum _{i}\left\{\frac{1}{2}{\left(\frac{\left|{x}_{i}+2A\right|}{{\eta}_{i}}\right)}^{{\gamma}_{i}}+\frac{1}{2}{\left(\frac{\left|{x}_{i}-2A\right|}{{\eta}_{i}}\right)}^{{\gamma}_{i}}+\frac{1}{2}\left({\gamma}_{i}-1\right)\mathsf{\text{ln}}\left(\frac{{x}_{i}^{2}}{\left|{x}_{i}^{2}-4{A}^{2}\right|}\right)-{\left(\frac{{x}_{i}}{{\eta}_{i}}\right)}^{{\gamma}_{i}}\right\},

(20)

\begin{array}{ll}\hfill {\sigma}_{z}^{2}& =\sum _{i}\left\{\frac{1}{2}{\left(\frac{\left|{x}_{i}+2A\right|}{{\eta}_{i}}\right)}^{2{\gamma}_{i}}+\frac{1}{2}{\left(\frac{\left|{x}_{i}-2A\right|}{{\eta}_{i}}\right)}^{2{\gamma}_{i}}+{\left(\frac{{x}_{i}}{{\eta}_{i}}\right)}^{2{\gamma}_{i}}+\frac{1}{2}{\left({\gamma}_{i}-1\right)}^{2}{\left[\mathsf{\text{ln}}\left(\frac{{x}_{i}}{\left|{x}_{i}+2A\right|}\right)\right]}^{2}\right.\\ & +\frac{1}{2}{\left({\gamma}_{i}-1\right)}^{2}{\left[\mathsf{\text{ln}}\left(\frac{{x}_{i}}{\left|{x}_{i}-2A\right|}\right)\right]}^{2}-{\left(\frac{\left|{x}_{i}+2A\right|}{{\eta}_{i}}\right)}^{{\gamma}_{i}}{\left(\frac{{x}_{i}}{{\eta}_{i}}\right)}^{{\gamma}_{i}}-{\left(\frac{\left|{x}_{i}-2A\right|}{{\eta}_{i}}\right)}^{{\gamma}_{i}}{\left(\frac{{x}_{i}}{{\eta}_{i}}\right)}^{{\gamma}_{i}}\\ & +\left({\gamma}_{i}-1\right){\left(\frac{\left|{x}_{i}+2A\right|}{{\eta}_{i}}\right)}^{{\gamma}_{i}}\mathsf{\text{ln}}\left(\frac{{x}_{i}}{\left|{x}_{i}+2A\right|}\right)+\left({\gamma}_{i}-1\right){\left(\frac{\left|{x}_{i}-2A\right|}{{\eta}_{i}}\right)}^{{\gamma}_{i}}\mathsf{\text{ln}}\left(\frac{{x}_{i}}{\left|{x}_{i}-2A\right|}\right)\\ & -\left({\gamma}_{i}-1\right){\left(\frac{{x}_{i}}{{\eta}_{i}}\right)}^{{\gamma}_{i}}\mathsf{\text{ln}}\left(\frac{{x}_{i}^{2}}{\left|{x}_{i}^{2}-4{A}^{2}\right|}\right)\\ & \left.-{\left[\frac{1}{2}{\left(\frac{\left|{x}_{i}+2A\right|}{{\eta}_{i}}\right)}^{{\gamma}_{i}}+\frac{1}{2}{\left(\frac{\left|{x}_{i}-2A\right|}{{\eta}_{i}}\right)}^{{\gamma}_{i}}-{\left(\frac{{x}_{i}}{{\eta}_{i}}\right)}^{{\gamma}_{i}}+\frac{1}{2}\left({\gamma}_{i}-1\right)\mathsf{\text{ln}}\left(\frac{{x}_{i}^{2}}{\left|{x}_{i}^{2}-4{A}^{2}\right|}\right)\right]}^{2}\right\}.\end{array}

(21)

If there is a host signal coefficient with *x*_{i} < 2*A*, then according to the earlier discussion the probability of error equals zero. Therefore, the theoretical error probability of the modified SS scheme is given by (19), with the mean and variance obtained from (20) and (21).
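Once *m*_{z} and {\sigma}_{z}^{2} have been evaluated from (20) and (21), the closed-form error probability (19) reduces to a one-line computation; a minimal helper (our own naming) is:

```python
import math

def error_probability(m_z, var_z):
    """Theoretical error probability (19): P_e = (1/2) erfc(sqrt(WIR / 2)),
    where WIR = m_z^2 / sigma_z^2 is the watermark-to-interference ratio."""
    wir = m_z ** 2 / var_z
    return 0.5 * math.erfc(math.sqrt(wir / 2.0))
```

As expected, P_e decreases monotonically as the WIR grows, and equals 1/2 when the mean of the test statistic vanishes.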

Having introduced information embedding using the modified SS scheme in the DFT magnitude domain, we now present the modified ISS scheme. As with the modified SS scheme, to avoid negative watermarked coefficients we propose a modified ISS embedding scheme as

\mathbf{r}=\mathbf{s}Ab+\mathbf{u}+\mathbf{e},

(22)

where the insurance vector **e** is determined as follows. If *u*_{i} + *s*_{i}*Ab* is positive, the corresponding *e*_{i} is set to zero; if *u*_{i} + *s*_{i}*Ab* is negative, then *e*_{i} = -*s*_{i}*Ab* + *ks*_{i}**s**^{T}**x** to ensure that *r*_{i} is positive. Therefore, *e*_{i} in the modified ISS scheme can be expressed as

{e}_{i}=\left(-{s}_{i}Ab+k{s}_{i}{\mathbf{s}}^{T}\mathbf{x}\right)u\left(-\left({u}_{i}+{s}_{i}Ab\right)\right).

(23)
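The rule (22)-(23) can be sketched as follows; here we assume, consistently with the matrix **M** in (25) and the ISS model (3)-(5) that this section builds on, that the host-rejection vector is **u** = (**I** - *k***ss**^{T})**x** (the helper name is ours):

```python
import numpy as np

def modified_iss_embed(x, s, A, b, k):
    """Sketch of the modified ISS embedding (22)-(23) in the DFT magnitude
    domain, assuming u = (I - k s s^T) x as in (25)."""
    u = x - k * s * (s @ x)                    # u = M x with M = I - k s s^T
    w = s * A * b                              # watermark term s_i A b
    # insurance vector (23): where u_i + s_i A b < 0, e_i = -s_i A b + k s_i s^T x,
    # which restores r_i = x_i (> 0); elsewhere e_i = 0
    e = np.where(u + w < 0, -w + k * s * (s @ x), 0.0)
    return w + u + e                           # Eq. (22)
```

Note that wherever the insurance vector fires, the watermarked coefficient falls back to the host value *x*_{i}, exactly as in the modified SS scheme.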

In order to obtain the ML decoder, the conditional distribution of the received signal **r** must be exploited. To do so, it is straightforward to show that the distribution of the vector **u** is given by

{f}_{\mathbf{U}}\left(\mathbf{u}\right)=\frac{1}{\left|\mathbf{M}\right|}\mathsf{\text{exp}}\left\{-\sum _{i=1}^{l}{\left(\frac{\left|{\mathbf{m}}_{i}\mathbf{u}\right|}{{\eta}_{i}}\right)}^{{\gamma}_{i}}\right\}\prod _{i=1}^{l}\left[\left(\frac{{\gamma}_{i}}{{\eta}_{i}^{{\gamma}_{i}}}\right){\left(\left|{\mathbf{m}}_{i}\mathbf{u}\right|\right)}^{{\gamma}_{i}-1}u\left({\mathbf{m}}_{i}\mathbf{u}\right)\right],

(24)

where |**M**| is the determinant of **M** and **m**_{i} is the *i*th row of **M**^{-1}, with **M** defined as

\mathbf{M}=\mathbf{I}-k\mathbf{s}{\mathbf{s}}^{T}.

(25)

The ML rule then decides \widehat{b}=+1 when the following inequality holds

\frac{{f}_{\mathbf{R}}\left(\mathbf{r}|b=+1\right)}{{f}_{\mathbf{R}}\left(\mathbf{r}|b=-1\right)}=\frac{\mathsf{\text{exp}}\left\{-{\sum}_{i=1}^{l}{\left(\frac{\left|{\mathbf{m}}_{i}\left(\mathbf{r}-\mathbf{s}A\right)\right|}{{\eta}_{i}}\right)}^{{\gamma}_{i}}\right\}{\prod}_{i=1}^{l}\left[\left(\frac{{\gamma}_{i}}{{\eta}_{i}^{{\gamma}_{i}}}\right){\left(\left|{\mathbf{m}}_{i}\left(\mathbf{r}-\mathbf{s}A\right)\right|\right)}^{{\gamma}_{i}-1}u\left({\mathbf{m}}_{i}\left(\mathbf{r}-\mathbf{s}A\right)\right)\right]}{\mathsf{\text{exp}}\left\{-{\sum}_{i=1}^{l}{\left(\frac{\left|{\mathbf{m}}_{i}\left(\mathbf{r}+\mathbf{s}A\right)\right|}{{\eta}_{i}}\right)}^{{\gamma}_{i}}\right\}{\prod}_{i=1}^{l}\left[\left(\frac{{\gamma}_{i}}{{\eta}_{i}^{{\gamma}_{i}}}\right){\left(\left|{\mathbf{m}}_{i}\left(\mathbf{r}+\mathbf{s}A\right)\right|\right)}^{{\gamma}_{i}-1}u\left({\mathbf{m}}_{i}\left(\mathbf{r}+\mathbf{s}A\right)\right)\right]}>1.

(26)

With some manipulations on (26), we can have the following test statistic

z=\sum _{i=1}^{l}\left[{\left(\frac{\left|{\mathbf{m}}_{i}\left(\mathbf{r}+\mathbf{s}A\right)\right|}{{\eta}_{i}}\right)}^{{\gamma}_{i}}-{\left(\frac{\left|{\mathbf{m}}_{i}\left(\mathbf{r}-\mathbf{s}A\right)\right|}{{\eta}_{i}}\right)}^{{\gamma}_{i}}+\left({\gamma}_{i}-1\right)\mathsf{\text{ln}}\left(\left|\frac{{\mathbf{m}}_{i}\left(\mathbf{r}-\mathbf{s}A\right)}{{\mathbf{m}}_{i}\left(\mathbf{r}+\mathbf{s}A\right)}\right|\right)+\mathsf{\text{ln}}\left(\frac{u\left({\mathbf{m}}_{i}\left(\mathbf{r}-\mathbf{s}A\right)\right)}{u\left({\mathbf{m}}_{i}\left(\mathbf{r}+\mathbf{s}A\right)\right)}\right)\right].

(27)

We now investigate the error probability behavior of this scheme. It is observed from (26) that at least one of the terms **m**_{i}(**r** + **s***A*) and **m**_{i}(**r** - **s***A*) should be positive for every watermarked coefficient, but the scheme in (22) may not fulfill this requirement. For instance, assume that *b* = +1 is hidden into the host signal; with the embedding scheme (22) and the ML decoder (26), the two terms *u*(**m**_{i}(**r** - **s***A*)) and *u*(**m**_{i}(**r** + **s***A*)) become *u*(*x*_{i} + **m**_{i}**e**) and *u*(*x*_{i} + **m**_{i}**e** + 2*A***m**_{i}**s**), respectively. Although the host signal vector and the insurance vector have positive coefficients, since the elements of **m**_{i} can be negative, it cannot be guaranteed that *x*_{i} + **m**_{i}**e** and *x*_{i} + **m**_{i}**e** + 2*A***m**_{i}**s** are both positive. A similar observation holds when *b* = -1 is hidden into the host signal. Referring to (26), one concludes that whenever both terms **m**_{i}(**r** + **s***A*) and **m**_{i}(**r** - **s***A*) are negative, the decoder makes random decisions. In order to avoid this undesirable behavior, we improve the modified ISS embedding scheme by proposing

\mathbf{r}=\mathbf{s}Ab+\mathbf{u}+\mathbf{e}+\mathbf{q},

(28)

where the vector **q** = [*q*_{1}, *q*_{2}, ..., *q*_{l}]^{T} is designed to make **m**_{i}(**r** + **s***A*) positive when *b* = -1 (or **m**_{i}(**r** - **s***A*) positive when *b* = +1). With (28), when *b* = -1, **y** = **M**^{-1}(**r** + **s***A*) becomes **y** = **x** + **M**^{-1}**e** + **M**^{-1}**q**, where **y** = [*y*_{1}, *y*_{2}, ..., *y*_{l}]^{T}. Let us denote **x** + **M**^{-1}**e** = **v**_{p} + **v**_{n}, where {\mathbf{v}}_{p}={\left[{v}_{p1},{v}_{p2},...,{v}_{pl}\right]}^{T} is the vector whose elements are non-negative and {\mathbf{v}}_{n}={\left[{v}_{n1},{v}_{n2},...,{v}_{nl}\right]}^{T} is the vector whose elements are less than zero. Therefore, the vector **y** can be written as

\mathbf{y}={\mathbf{v}}_{p}+{\mathbf{v}}_{n}+{\mathbf{M}}^{-1}\mathbf{q}.

(29)

In order to make all the elements of **y** positive, it is sufficient to set **v**_{n} + **M**^{-1}**q** = **0**, which leads to the vector **q** as

\mathbf{q}=-\mathbf{M}{\mathbf{v}}_{n}.

(30)

We apply the modified ISS scheme (28) for embedding and use the decoder (26) to extract the hidden information. In summary, the modified ISS embedding scheme (28) in the DFT magnitude domain has been proposed in order to make all the watermarked coefficients positive and thereby make the decoder (26) meaningful.
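The construction of **q** in (29)-(30) can be sketched directly; the helper name is ours, and the example builds **M** from (25):

```python
import numpy as np

def insurance_q(M, x, e):
    """Sketch of the correction vector q of (30): q = -M v_n, where v_n
    collects the negative elements of x + M^{-1} e (cf. (29))."""
    v = x + np.linalg.solve(M, e)    # x + M^{-1} e
    v_n = np.where(v < 0, v, 0.0)    # negative part v_n (zeros elsewhere)
    return -M @ v_n                  # Eq. (30)
```

By construction, **x** + **M**^{-1}**e** + **M**^{-1}**q** = **v**_{p}, so every element of **y** in (29) becomes non-negative.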

The remaining point is determining the parameter *k* in ISS embedding, which can be done using the probability of error: *k* should take the value that minimizes the theoretical error probability. It can be shown that the probability of error for the modified ISS scheme is given by (19) when *x*_{i} > 2*A*(*l*(*k*^{-1} - *l*)^{-1} + 1), with the following mean and variance

{m}_{z}=\sum _{i}\left\{\frac{1}{2}{\left(\frac{\left|{a}_{i}\right|}{{\eta}_{i}}\right)}^{{\gamma}_{i}}+\frac{1}{2}{\left(\frac{\left|{d}_{i}\right|}{{\eta}_{i}}\right)}^{{\gamma}_{i}}+\frac{1}{2}\left({\gamma}_{i}-1\right)\mathsf{\text{ln}}\left(\frac{{x}_{i}^{2}}{\left|{a}_{i}{d}_{i}\right|}\right)-{\left(\frac{{x}_{i}}{{\eta}_{i}}\right)}^{{\gamma}_{i}}\right\},

(31)

\begin{array}{ll}\hfill {\sigma}_{z}^{2}& =\sum _{i}\left\{\frac{1}{2}{\left(\frac{\left|{a}_{i}\right|}{{\eta}_{i}}\right)}^{2{\gamma}_{i}}+\frac{1}{2}{\left(\frac{\left|{d}_{i}\right|}{{\eta}_{i}}\right)}^{2{\gamma}_{i}}+{\left(\frac{{x}_{i}}{{\eta}_{i}}\right)}^{2{\gamma}_{i}}+\frac{1}{2}{\left({\gamma}_{i}-1\right)}^{2}{\left[\mathsf{\text{ln}}\left(\frac{{x}_{i}}{\left|{a}_{i}\right|}\right)\right]}^{2}+\frac{1}{2}{\left({\gamma}_{i}-1\right)}^{2}{\left[\mathsf{\text{ln}}\left(\frac{{x}_{i}}{\left|{d}_{i}\right|}\right)\right]}^{2}\right.\\ & -{\left(\frac{\left|{a}_{i}\right|}{{\eta}_{i}}\right)}^{{\gamma}_{i}}{\left(\frac{{x}_{i}}{{\eta}_{i}}\right)}^{{\gamma}_{i}}-{\left(\frac{\left|{d}_{i}\right|}{{\eta}_{i}}\right)}^{{\gamma}_{i}}{\left(\frac{{x}_{i}}{{\eta}_{i}}\right)}^{{\gamma}_{i}}+\left({\gamma}_{i}-1\right){\left(\frac{\left|{a}_{i}\right|}{{\eta}_{i}}\right)}^{{\gamma}_{i}}\mathsf{\text{ln}}\left(\frac{{x}_{i}}{\left|{a}_{i}\right|}\right)+\left({\gamma}_{i}-1\right){\left(\frac{\left|{d}_{i}\right|}{{\eta}_{i}}\right)}^{{\gamma}_{i}}\mathsf{\text{ln}}\left(\frac{{x}_{i}}{\left|{d}_{i}\right|}\right)\\ & -\left({\gamma}_{i}-1\right){\left(\frac{{x}_{i}}{{\eta}_{i}}\right)}^{{\gamma}_{i}}\mathsf{\text{ln}}\left(\frac{{x}_{i}^{2}}{\left|{a}_{i}{d}_{i}\right|}\right)\\ & \left.-{\left[\frac{1}{2}{\left(\frac{\left|{a}_{i}\right|}{{\eta}_{i}}\right)}^{{\gamma}_{i}}+\frac{1}{2}{\left(\frac{\left|{d}_{i}\right|}{{\eta}_{i}}\right)}^{{\gamma}_{i}}-{\left(\frac{{x}_{i}}{{\eta}_{i}}\right)}^{{\gamma}_{i}}+\frac{1}{2}\left({\gamma}_{i}-1\right)\mathsf{\text{ln}}\left(\frac{{x}_{i}^{2}}{\left|{a}_{i}{d}_{i}\right|}\right)\right]}^{2}\right\},\end{array}

(32)

where

{a}_{i}={x}_{i}+2A\left(l\mu +1\right),\phantom{\rule{1em}{0ex}}{d}_{i}={x}_{i}-2A\left(l\mu +1\right),

(33)

\mu ={\left({k}^{-1}-l\right)}^{-1}.

(34)

It should be noted that, regarding (5) and the fact that the parameter *A* is always positive, one can conclude that the parameter *k* in the ISS embedding scheme should satisfy 0\le k\le \sqrt{D/\left({\mathbf{s}}^{T}{\mathbf{R}}_{\mathbf{x}}\mathbf{s}\right)}. This parameter can be obtained by optimizing the WIR through the following constrained maximization

{\lambda}_{k}=\underset{0\le k\le \sqrt{\frac{D}{{\mathbf{s}}^{T}{\mathbf{R}}_{\mathbf{x}}\mathbf{s}}}}{\mathsf{\text{arg}}\mathsf{\text{max}}}\frac{{m}_{z}^{2}}{{\sigma}_{z}^{2}}.

(35)
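Since no closed-form maximizer is given, the constrained maximization (35) can be approximated numerically, e.g., by a simple grid search over the feasible interval; this numerical approach and the helper names are our own choices:

```python
import numpy as np

def optimize_k(wir_of_k, D, s, R_x, num=200):
    """Grid-search sketch of (35): maximize WIR(k) = m_z^2 / sigma_z^2 over
    0 <= k <= sqrt(D / (s^T R_x s)). `wir_of_k` is a caller-supplied function
    evaluating the WIR via the mean and variance expressions (31)-(32)."""
    k_max = np.sqrt(D / (s @ R_x @ s))          # feasibility bound on k
    grid = np.linspace(0.0, k_max, num)
    return grid[np.argmax([wir_of_k(k) for k in grid])]
```

Any one-dimensional optimizer would serve equally well; the grid search merely keeps the sketch dependency-free.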

### 4.2. ML decoders in the DCT domain

As opposed to the SS and ISS schemes in the DFT magnitude domain, which lacked an optimal decoder, the optimal decoder for the SS scheme in the DCT domain has been derived in [26]. It has been shown that the ML decoder satisfies (15) with

z=\sum _{i=1}^{l}\left[{\left|\frac{{r}_{i}+{s}_{i}A}{{\sigma}_{{x}_{i}}}\right|}^{c}-{\left|\frac{{r}_{i}-{s}_{i}A}{{\sigma}_{{x}_{i}}}\right|}^{c}\right].

(36)

We can see that the ML decoder requires knowledge of the watermark amplitude and the shape parameter as well as the signature code. For practical implementation, the receiver should either have this prior information or estimate it. One way to avoid estimating the shape parameter is to use a common value for all images, hoping that it describes the distribution of the DCT coefficients reasonably well [35].
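The decoder (15) with the test statistic (36) is compact enough to sketch directly (helper name ours; `sigma_x` and `c` are the per-coefficient standard deviations and the generalized Gaussian shape parameter assumed known at the receiver):

```python
import numpy as np

def ml_decode_ss_dct(r, s, A, sigma_x, c):
    """Sketch of the ML decoder (15) with the test statistic (36) for SS
    embedding in the DCT domain (generalized Gaussian host model)."""
    z = np.sum(np.abs((r + s * A) / sigma_x) ** c
               - np.abs((r - s * A) / sigma_x) ** c)
    return 1 if z > 0 else -1
```

For c = 2 this reduces to the familiar correlation detector, which is one way to sanity-check the statistic.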

We now extend this work to obtain the ML decoder for the ISS embedding scheme in the DCT domain, aiming for better decoding performance. To do so, we first derive the PDF of the vector **u** defined in (4). To this end, the PDF of the generalized Gaussian distribution is rewritten in the following vector form

{f}_{\mathbf{x}}\left(\mathbf{x}\right)=\frac{{\lambda}^{l}}{{\prod}_{i=1}^{l}{\sigma}_{xi}}\mathsf{\text{exp}}\left\{-{\left[\sqrt{\frac{\mathrm{\Gamma}\left(\frac{3}{c}\right)}{\mathrm{\Gamma}\left(\frac{1}{c}\right)}}\right]}^{c}{\left\Vert {\mathbf{x}}^{T}{\mathbf{R}}_{\mathbf{x}}^{-\frac{1}{2}}\right\Vert}_{\frac{c}{2}}{\left\Vert {\mathbf{R}}_{\mathbf{x}}^{-\frac{1}{2}}\mathbf{x}\right\Vert}_{\frac{c}{2}}\right\},

(37)

where

\lambda =\frac{c}{2\mathrm{\Gamma}\left(\frac{1}{c}\right)}\sqrt{\frac{\mathrm{\Gamma}\left(\frac{3}{c}\right)}{\mathrm{\Gamma}\left(\frac{1}{c}\right)}},

(38)

and {\mathbf{R}}_{\mathbf{x}}=\mathsf{\text{diag}}\left\{{\sigma}_{x1}^{2},{\sigma}_{x2}^{2},...,{\sigma}_{xl}^{2}\right\}. In addition, we define {\left\Vert \left[{a}_{1},{a}_{2},...,{a}_{l}\right]\right\Vert}_{c}=\left[{\left|{a}_{1}\right|}^{c},{\left|{a}_{2}\right|}^{c},...,{\left|{a}_{l}\right|}^{c}\right]. Regarding the ISS signal model (3), it is shown in [36] that the PDF of **u** can be expressed as

{f}_{\mathbf{U}}\left(\mathbf{u}\right)=\frac{{\lambda}^{l}}{\left|\mathbf{M}\right|{\prod}_{i=1}^{l}{\sigma}_{xi}}\mathsf{\text{exp}}\left\{-{\left[\sqrt{\frac{\mathrm{\Gamma}\left(\frac{3}{c}\right)}{\mathrm{\Gamma}\left(\frac{1}{c}\right)}}\right]}^{c}{\left\Vert {\mathbf{u}}^{T}{\mathbf{M}}^{-1}{\mathbf{R}}_{\mathbf{x}}^{-\frac{1}{2}}\right\Vert}_{\frac{c}{2}}{\left\Vert {\mathbf{R}}_{\mathbf{x}}^{-\frac{1}{2}}{\mathbf{M}}^{-1}\mathbf{u}\right\Vert}_{\frac{c}{2}}\right\}.

(39)

Then, the likelihood ratio for the ISS model (3) leads the decoder to decide \widehat{b}=+1 when

\frac{{f}_{\mathbf{R}}\left(\mathbf{r}|b=+1\right)}{{f}_{\mathbf{R}}\left(\mathbf{r}|b=-1\right)}=\frac{\mathsf{\text{exp}}\left\{-{\left\Vert {\left(\mathbf{r}-\mathbf{s}A\right)}^{T}{\mathbf{M}}^{-1}{\mathbf{R}}_{\mathbf{x}}^{-\frac{1}{2}}\right\Vert}_{\frac{c}{2}}{\left\Vert {\mathbf{R}}_{\mathbf{x}}^{-\frac{1}{2}}{\mathbf{M}}^{-1}\left(\mathbf{r}-\mathbf{s}A\right)\right\Vert}_{\frac{c}{2}}\right\}}{\mathsf{\text{exp}}\left\{-{\left\Vert {\left(\mathbf{r}+\mathbf{s}A\right)}^{T}{\mathbf{M}}^{-1}{\mathbf{R}}_{\mathbf{x}}^{-\frac{1}{2}}\right\Vert}_{\frac{c}{2}}{\left\Vert {\mathbf{R}}_{\mathbf{x}}^{-\frac{1}{2}}{\mathbf{M}}^{-1}\left(\mathbf{r}+\mathbf{s}A\right)\right\Vert}_{\frac{c}{2}}\right\}}>1.

(40)

After some algebraic simplifications, the ML decoder for the ISS embedding scheme takes the form of (15) with

z=\sum _{i=1}^{l}\left[{\left|\frac{{\mathbf{m}}_{i}\left(\mathbf{r}+\mathbf{s}A\right)}{{\sigma}_{{x}_{i}}}\right|}^{c}-{\left|\frac{{\mathbf{m}}_{i}\left(\mathbf{r}-\mathbf{s}A\right)}{{\sigma}_{{x}_{i}}}\right|}^{c}\right].

(41)
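A sketch of the decoder (15) with the test statistic (41); the helper name is ours, and **M**^{-1} is formed explicitly from (25) for clarity:

```python
import numpy as np

def ml_decode_iss_dct(r, s, A, k, sigma_x, c):
    """Sketch of the ML decoder (15) with the test statistic (41) for ISS
    embedding in the DCT domain; the rows m_i of M^{-1}, with
    M = I - k s s^T from (25), are applied to r +/- s A."""
    l = len(s)
    M_inv = np.linalg.inv(np.eye(l) - k * np.outer(s, s))
    zp = M_inv @ (r + s * A)         # stacks m_i (r + s A) over i
    zm = M_inv @ (r - s * A)         # stacks m_i (r - s A) over i
    z = np.sum(np.abs(zp / sigma_x) ** c - np.abs(zm / sigma_x) ** c)
    return 1 if z > 0 else -1
```

For a noiseless ISS observation r = **s***Ab* + **Mx**, the term **M**^{-1}(**r** - **s***Ab*) recovers the host **x** exactly, which is what makes this statistic effective.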

Having proposed the optimal decoder of the ISS scheme in the DCT domain, the error probability is obtained from (19) with the mean and variance determined as follows:

{m}_{z}=\sum _{i=1}^{l}\left[{\left(\frac{1}{{\sigma}_{{x}_{i}}}\right)}^{c}\left(\frac{1}{2}{\left|{x}_{i}+2A\left(l\mu +1\right)\right|}^{c}+\frac{1}{2}{\left|{x}_{i}-2A\left(l\mu +1\right)\right|}^{c}-{\left|{x}_{i}\right|}^{c}\right)\right],

(42)

{\sigma}_{z}^{2}=\sum _{i=1}^{l}\left[{\left(\frac{1}{{\sigma}_{{x}_{i}}}\right)}^{2c}\left(\frac{1}{4}{\left[{\left|{x}_{i}+2A\left(l\mu +1\right)\right|}^{c}-{\left|{x}_{i}-2A\left(l\mu +1\right)\right|}^{c}\right]}^{2}\right)\right].

(43)

Similar to ISS embedding in the DFT magnitude domain, the parameter *k* can be determined using the constrained maximization (35), with the mean and variance defined in (42) and (43).