
On the security of compressed encryption with partial unitary sensing matrices embedding a secret keystream

Abstract

The principle of compressed sensing (CS) can be applied in a cryptosystem by providing the notion of security. In this paper, we study the computational security of a CS-based cryptosystem that encrypts a plaintext with a partial unitary sensing matrix embedding a secret keystream. The keystream is obtained by a keystream generator of stream ciphers, where the initial seed becomes the secret key of the CS-based cryptosystem. For security analysis, the total variation distance, bounded by the relative entropy and the Hellinger distance, is examined as a security measure for the indistinguishability. By developing upper bounds on the distance measures, we show that the CS-based cryptosystem can be computationally secure in terms of the indistinguishability, as long as the keystream length for each encryption is sufficiently large with low compression and sparsity ratios. In addition, we consider a potential chosen plaintext attack (CPA) from an adversary, which attempts to recover the key of the CS-based cryptosystem. Associated with the key recovery attack, we show that the computational security of our CS-based cryptosystem is brought by the mathematical intractability of a constrained integer least-squares (ILS) problem. For a sub-optimal, but feasible key recovery attack, we consider a successive approximate maximum-likelihood detection (SAMD) and investigate the performance by developing an upper bound on the success probability. Through theoretical and numerical analyses, we demonstrate that our CS-based cryptosystem can be secure against the key recovery attack through the SAMD.

1 Introduction

Compressed sensing (CS) [1–4] is a novel data acquisition scheme that samples a signal at a sub-Nyquist rate, which allows simultaneous data acquisition and compression. The original signal can be faithfully recovered from the measurement samples, if it is sparse with respect to a particular basis and sampled via a random projection. With efficient measurement and stable reconstruction, the CS technique has been of interest in a variety of research fields, e.g., communications [5–7], sensor networks [8–10], image processing [11–13], and radar [14].

Recently, a great deal of attention has been paid to the CS technique for data confidentiality in the information security field. A CS-based cryptosystem encrypts a plaintext through a CS measurement process by keeping the sensing matrix secret. Then, the ciphertext can be decrypted by a CS reconstruction process. Thus, the CS-based cryptosystem performs simultaneous data acquisition and encryption at the physical layer. Such a lightweight cryptosystem is particularly attractive for secure communications in wireless sensor networks, where the resources are not sufficient for providing data confidentiality by conventional encryption.

The security potential of compressed sensing was hinted at by Candes and Tao [3], where the measurement samples were referred to as a weakly encrypted ciphertext. In [15], Rachlin and Baron proved that the CS-based cryptosystem cannot be perfectly secure but might be computationally secure. Orsdemir et al. [16] showed that it is computationally secure against a key search technique via an algebraic approach. Subsequently, many researchers have studied the security of CS-based cryptosystems for practical applications, which will be discussed in more detail in Section 2.3. For a comprehensive review of CS techniques in information security, readers are referred to [17].

In this paper, we study the computational security of a CS-based cryptosystem that encrypts a plaintext with a partial unitary sensing matrix embedding a secret keystream. The keystream to be embedded is obtained by a keystream generator of stream ciphers, which ensures fast and efficient generation of the keystream. Assuming that the keystream is part of the original one with an extremely long period, we renew it at each encryption, which leads to a one-time sensing (OTS) cryptosystem. Then, the initial seed (or state) of the original keystream generator is essentially the secret key of the CS-based cryptosystem. With the sensing matrix, we demonstrate that the CS-based cryptosystem theoretically guarantees a stable and robust CS decryption for a legitimate recipient.

For security analysis, we first use probability metrics to investigate the security in a statistical manner. The total variation (TV) distance [18] between probability distributions of ciphertexts conditioned on a pair of plaintexts is examined as a security measure for the indistinguishability [19] of our CS-based cryptosystem. We investigate the TV distance by developing upper bounds on the relative entropy [20] and the Hellinger distance [21], which demonstrates that our CS-based cryptosystem can be computationally secure in terms of the indistinguishability, as long as the keystream length for each encryption is sufficiently large with low compression \(\left (\frac {M}{N} \right)\) and sparsity \(\left (\frac {K}{N} \right)\) ratios.

Next, we analyze the security of our CS-based cryptosystem by examining the resistance against a cryptanalytic attack. We consider a potential chosen plaintext attack (CPA) from an adversary to recover the key of our CS-based cryptosystem. In the CPA, the adversary needs to restore a keystream embedded in CS encryption, which is nontrivial unlike in stream ciphers, since the keystream is not outstanding from a known plaintext-ciphertext pair. Associated with the key recovery attack, we show that the security of our CS-based cryptosystem is based on the mathematical intractability of a constrained integer least-squares (ILS) problem. For a sub-optimal, but feasible key recovery attack, we consider a successive approximate maximum-likelihood (ML) detection (SAMD) for the adversary’s CPA and investigate the performance by developing an upper bound on the success probability. Finally, theoretical analysis and numerical results reveal that our CS-based cryptosystem can be secure against the key recovery attack through the SAMD.

This paper is organized as follows. Section 2 reviews the CS principle, discusses some known CS-based cryptosystems, and summarizes the contributions of this paper. In Section 3, we describe a mathematical model of the CS-based cryptosystem proposed by this paper. We discuss a theoretical guarantee of CS decryption for a legitimate recipient by the cryptosystem. In Section 4, we analyze the indistinguishability of our CS-based cryptosystem, to demonstrate the computational security. Section 5 introduces an adversary’s potential CPA strategy for key recovery, where we describe the details and examine the performance of SAMD. Section 6 presents numerical results to demonstrate the reliability and the security of our CS-based cryptosystem. Finally, concluding remarks will be given in Section 7.

2 Background

2.1 Notations

A matrix (or a vector) is represented by a boldface upper (or lower) case letter. \(\mathbf{U}^{T}\) and \(|\mathbf{U}|\) denote the transpose and the determinant of a matrix \(\mathbf{U}\), respectively. \(\text{tr}(\mathbf{U})\) denotes the trace of a matrix \(\mathbf{U}\), or the sum of all diagonal entries of \(\mathbf{U}\). \(\mathbf{U}(k,t)\) is an entry of an \(M \times N\) matrix \(\mathbf{U}\) in the kth row and the tth column, where \(0 \leq k \leq M-1\) and \(0 \leq t \leq N-1\). \(\mu(\mathbf{U})\) denotes the maximum magnitude of the entries of \(\mathbf{U}\), i.e., \(\mu (\mathbf {U}) = \underset {k, t}{\max } |\mathbf {U}(k,t)|\). \(\text{diag}(\mathbf{s})\) is a diagonal matrix whose diagonal entries are from a vector \(\mathbf{s}\). An identity matrix is denoted by \(\mathbf{I}\), where the dimension is determined in the context. \(\mathbf{W}\) is a conventional \(N \times N\) Walsh-Hadamard matrix, where \(\mathbf{W}\mathbf{W}^{T} = \mathbf{W}^{T}\mathbf{W} = N\mathbf{I}\). Also, \(\mathbf{D}\) denotes a discrete-cosine transform (DCT) matrix, where \(\mathbf{D}\mathbf{D}^{T} = \mathbf{D}^{T}\mathbf{D} = N\mathbf{I}\). For a vector \(\mathbf {x} = (x_{0},\cdots, x_{N-1})^{T} \in \mathbb {R}^{N}\), the \(l_{p}\)-norm of \(\mathbf{x}\) is denoted by \( || \mathbf {x} ||_{p} = \left (\sum _{k=0}^{N-1} |x_{k}|^{p} \right)^{\frac {1}{p}} \), where \(1 \leq p < \infty\). If the context is clear, \(||\mathbf{x}||\) denotes the \(l_{2}\)-norm of \(\mathbf{x}\). A vector \(\mathbf {n} \sim {\mathcal {N}} \left (\mathbf {0}, \sigma ^{2} \mathbf {I}\right)\) is a Gaussian random vector with mean \(\mathbf{0} = (0, \cdots, 0)^{T}\) and covariance \(\sigma^{2}\mathbf{I}\). Finally, \(\mathbb {E}[\!\cdot ]\) denotes the average of a random vector or a random matrix.

Table 1 summarizes the abbreviations of this paper.

Table 1 Abbreviations

2.2 Compressed sensing

Compressed sensing (CS) [1–3] aims to recover a sparse signal from measurements that are believed to be incomplete. A signal \(\mathbf {x} \in \mathbb {R}^{N}\) is called K-sparse with respect to a sparsifying (orthonormal) basis \(\boldsymbol{\Psi}\) if \(\boldsymbol{\theta} = \boldsymbol{\Psi}\mathbf{x}\) has at most K nonzero entries, where \(K \ll N\). The sparse signal \(\mathbf{x}\) is linearly measured by \(\mathbf {r} = \boldsymbol {\Phi } \mathbf {x} + \mathbf {n} = \boldsymbol {\Phi } \boldsymbol {\Psi }^{T} \boldsymbol {\theta } + \mathbf {n} \in \mathbb {R}^{M} \), where \(\boldsymbol{\Phi}\) is an \(M \times N\) measurement matrix with \(M \ll N\) and \(\mathbf {n} \in \mathbb {R}^{M}\) is a measurement noise. The CS theory states that if the sensing matrix \(\mathbf{A} = \boldsymbol{\Phi}\boldsymbol{\Psi}^{T}\) obeys the restricted isometry property (RIP) [2], a stable and robust reconstruction of \(\boldsymbol{\theta}\) can be guaranteed from the incomplete measurement \(\mathbf{r}\). The CS reconstruction is accomplished by solving the \(l_{1}\)-minimization problem of

$$ \boldsymbol{\widehat{\theta}} = \underset{\boldsymbol{\theta}}{\text{argmin}} \ || {\boldsymbol{\theta}} ||_{1} \quad \text{subject to } ||\mathbf{A} {\boldsymbol{\theta}} - \mathbf{r}||_{2} \leq \epsilon $$

with convex optimization or greedy algorithms [4]. For simplicity, this paper assumes \(\boldsymbol{\Psi} = \mathbf{I}\), i.e., that \(\mathbf{x}\) is sparse in the canonical basis, which yields the sensing matrix \(\mathbf{A} = \boldsymbol{\Phi}\).
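As a rough illustration of the greedy route mentioned above, the following sketch recovers a K-sparse vector with orthogonal matching pursuit (OMP). OMP is swapped in here as one standard greedy algorithm; it is not the \(l_{1}\)-minimization itself and not necessarily the solver used in this paper. The Gaussian matrix, the dimensions, and the function name `omp` are assumptions made only for this example.

```python
import numpy as np

def omp(A, r, K):
    """Orthogonal matching pursuit: estimate a K-sparse theta from r ~ A @ theta."""
    N = A.shape[1]
    support = []
    residual = r.copy()
    coef = np.zeros(0)
    for _ in range(K):
        # Pick the column most correlated with the current residual.
        scores = np.abs(A.T @ residual)
        if support:
            scores[support] = 0.0  # never select the same column twice
        support.append(int(np.argmax(scores)))
        # Least-squares fit on the selected columns, then refresh the residual.
        coef, *_ = np.linalg.lstsq(A[:, support], r, rcond=None)
        residual = r - A[:, support] @ coef
    theta_hat = np.zeros(N)
    theta_hat[support] = coef
    return theta_hat

# Toy noiseless experiment (illustrative sizes, not taken from the paper).
rng = np.random.default_rng(0)
M, N, K = 128, 256, 5
A = rng.standard_normal((M, N)) / np.sqrt(M)
theta = np.zeros(N)
theta[rng.choice(N, size=K, replace=False)] = rng.choice([-1.0, 1.0], size=K)
r = A @ theta
theta_hat = omp(A, r, K)
print("max reconstruction error:", np.max(np.abs(theta_hat - theta)))
```

In the noisy setting, one would typically stop when the residual norm falls below a tolerance playing the role of \(\epsilon\), rather than after exactly K iterations.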

2.3 Prior works on CS-based cryptosystems

Since the foundational works of [15] and [16], there have been many research efforts on CS-based cryptosystems. Bianchi, Bioglio, and Magli [22, 23] analyzed the security of a noiseless CS-based cryptosystem utilizing random Gaussian sensing matrices in an OTS manner. In [24], a similar analysis has been made for a noiseless CS-based cryptosystem having a circulant sensing matrix for efficient CS processes. Cambareri et al. [25] proposed a CS-based cryptosystem that supports multiclass encryption using a random Bernoulli matrix and its class-dependent variations. In spite of exploiting different security measures, i.e., indistinguishability [23] and asymptotic spherical security [25], the security analyses of [23] and [25] showed that the statistical properties of ciphertexts reveal only the information about the energy of the plaintexts. The security of the multiclass encryption scheme has been further investigated in [26] against a known plaintext attack (KPA), by examining the average number of candidate solutions matching a plaintext-ciphertext pair.

In addition to the secret sensing matrix, a CS-based cryptosystem may employ an extra cryptographic primitive, which can be considered as a product cipher. For instance, scrambling or random permutation has been additionally accomplished, before [27] or after [28] CS encryption. In [29], nonlinear diffusion has been added to quantized ciphertexts. Zhang et al. [30] proposed a bi-level protected CS (BLP-CS), where the sparsifying basis and the sensing matrix are generated with different secret keys. In the BLP-CS, the knowledge of both the sparsifying basis and the sensing matrix is required for CS decryption.

To gain resistance against KPA and CPA, a CS-based cryptosystem normally operates in an OTS manner, by renewing the sensing matrix at each encryption. As the renewal requires additional complexity and can quickly use up the cryptographic resources for generating each sensing matrix, a CS-based cryptosystem reusing the sensing matrix during multiple encryptions has also been of interest. However, it is insecure against KPA and CPA, since an adversary can easily recover the sensing matrix with N linearly independent plaintexts by solving the system of linear equations [15]. While reusing the same sensing matrix, the BLP-CS [30] attempted to overcome the weakness and to achieve CPA resistance by ensuring a RIPless reconstruction for an adversary.

CS-based cryptosystems can work in a framework of physical layer security [31]. The emerging technology of physical layer security is a promising paradigm for enhancing wireless security [32], by exploiting the randomness of wireless channel characteristics. In [33], Agrawal and Vishwanath derived sufficient conditions for secret communications via CS in a wiretap channel. Reeves et al. [34] investigated the secrecy capacity of a wiretap channel employing CS. Dautov and Tsouri [35] used the received signal strength indicator (RSSI) from wireless channels for secure key establishment in a CS-based cryptosystem, where the shared key can be used to form a common sensing matrix in a sender and a recipient. In practice, a variety of CS-based cryptosystems concerning the security and privacy of multimedia, imaging, and smart grid data have been suggested and studied in [36–39].

2.4 Summary of contributions

The main results of this paper are summarized in comparison with prior works. Our CS-based cryptosystem encrypts a plaintext with a partial unitary sensing matrix embedding a secret keystream, which is used only once for each encryption. Thus, it operates in an OTS manner, similar to those of [22–25], but different from the BLP-CS [30]. It can further reduce the consumption of the cryptographic resource by renewing only the keystream of length N, not replacing the entire M×N sensing matrix, at each encryption. Unlike the BLP-CS, our CS-based cryptosystem uses only a single cryptographic primitive, or the secret keystream, while keeping the sparsifying basis public. Furthermore, the secret keystream can be efficiently generated by a keystream generator of stream ciphers. Based on the RIP analysis, the knowledge of the sensing matrix, or equivalently the keystream, theoretically guarantees a reliable CS decryption.

In security analysis, we obtain the result by two different approaches. On the one hand, we demonstrate the indistinguishability of our CS-based cryptosystem, by investigating the TV distance between probability distributions of a pair of ciphertexts. This statistical approach seems like the analysis of [23], but we use a new probability metric of the Hellinger distance [21] to characterize the TV distance. On the other hand, we consider a potential CPA from an adversary for key recovery of our CS-based cryptosystem. By formulating the CPA as an NP-hard problem, we show that the success of the CPA is computationally infeasible for a sufficiently large keystream length. In addition, we introduce a sub-optimal but feasible CPA strategy and investigate the performance with the highest possible success probability. Finally, the CPA performance turns out to be quite poor even under an optimistic scenario, which guarantees the security against the CPA for our CS-based cryptosystem. The second type of security analysis is new in this paper.

3 Mathematical model

3.1 CS encryption with a partial unitary sensing matrix

A CS-based cryptosystem encrypts a sparse plaintext \(\mathbf {x} \in \mathbb {R}^{N}\) through the CS measurement process with a sensing matrix \(\boldsymbol {\Phi } \in \mathbb {R}^{M \times N}\), which produces the ciphertext \(\mathbf {r} = \boldsymbol {\Phi } \mathbf {x} + \mathbf {n} \in \mathbb {R}^{M}\), where \(\mathbf {n} \sim \mathcal {N}\left (\mathbf {0}, \sigma ^{2} \mathbf {I}\right)\) is a measurement noise. This paper proposes a CS-based cryptosystem that employs a partial unitary sensing matrix Φ embedding a secret keystream, as defined in Definition 1.

Definition 1

The sensing matrix of our CS-based cryptosystem is defined by

$$ \boldsymbol{\Phi} = \frac{1}{\sqrt{M}} \mathbf{R}_{\Omega} \mathbf{U} = \frac{1}{\sqrt{MN}} \mathbf{R}_{\Omega} \mathbf{U}_{1} \text{diag}(\mathbf{s}) \mathbf{U}_{2}. $$
(1)

In (1), \(\mathbf{R}_{\Omega}\) is a public random subsampling operator that selects M rows out of N ones uniformly at random, where the selected indices are specified by \(\Omega = \{\omega_{0}, \cdots, \omega_{M-1}\}\). Also, \(\mathbf {U}_{i} \in {\mathbb {R}}^{N \times N} \) is a unitary matrix, i.e., \(\mathbf {U}_{i}^{T} \mathbf {U}_{i} = \mathbf {U}_{i} \mathbf {U}_{i}^{T} = N \mathbf {I}\) for i=1 and 2, respectively. In particular, each entry of \(\mathbf{U}_{1}\) has unit magnitude, i.e., \(|\mathbf{U}_{1}(k,t)| = 1\) for all \(0 \leq k, t \leq N-1\). Finally, \(\mathbf {U} = \frac {1}{\sqrt {N}} \mathbf {U}_{1} \text{diag}(\mathbf {s}) \mathbf {U}_{2} \) is also unitary for \(\mathbf{s} \in \{-1,+1\}^{N}\), where \(\mathbf{s}\) is a secret keystream to be embedded in \(\boldsymbol{\Phi}\) for each CS encryption.
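The construction in (1) can be sketched numerically as follows. This is only an illustration under stated assumptions: both \(\mathbf{U}_{1}\) and \(\mathbf{U}_{2}\) are taken to be Sylvester-type Walsh-Hadamard matrices, the keystream is drawn from NumPy's generator as a stand-in for a stream cipher, and the helper names `sylvester_hadamard` and `sensing_matrix` are hypothetical.

```python
import numpy as np

def sylvester_hadamard(N):
    """N x N Walsh-Hadamard matrix by the Sylvester construction (N a power of 2)."""
    W = np.array([[1.0]])
    while W.shape[0] < N:
        W = np.block([[W, W], [W, -W]])
    return W

def sensing_matrix(U1, U2, s, omega):
    """Return (Phi, U) with Phi = R_Omega U / sqrt(M) and U = U1 diag(s) U2 / sqrt(N)."""
    N = U1.shape[0]
    M = len(omega)
    U = (U1 * s) @ U2 / np.sqrt(N)   # U1 * s scales column t of U1 by s[t], i.e. U1 @ diag(s)
    Phi = U[omega, :] / np.sqrt(M)   # R_Omega keeps only the rows indexed by omega
    return Phi, U

rng = np.random.default_rng(1)
N, M = 64, 16
W = sylvester_hadamard(N)
s = rng.choice([-1.0, 1.0], size=N)                      # stand-in for the secret keystream
omega = np.sort(rng.choice(N, size=M, replace=False))    # public row selection
Phi, U = sensing_matrix(W, W, s, omega)

# With the scaling used in Definition 1, U U^T = N I and Phi Phi^T = (N/M) I.
print(np.allclose(U @ U.T, N * np.eye(N)), np.allclose(Phi @ Phi.T, (N / M) * np.eye(M)))
```

Only the length-N vector `s` changes between encryptions in this sketch; the matrices `U1`, `U2` and the index set `omega` stay public and fixed.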

In this paper, we use \(\mathbf{U}_{1} = \mathbf{H}\), or an \(N \times N\) Hadamard matrix that employs a binary m-sequence [40] of period \(N-1 = 2^{n}-1\) for a positive integer n, i.e., \(\mathbf {d} = \left (d_{0}, \cdots, d_{2^{n}-2} \right)\), where \(d_{k} \in \{0, 1\}\). For \(0 \leq k, t \leq N-1\), each entry of \(\mathbf{H}\) is given by

$$\mathbf{H}(k, t) = \left\{ \begin{array}{ll} 1, & \quad \text{if}\ k=0\ \text{or}\ t=0, \\ (-1)^{d_{k+t-2}}, & \quad \text{otherwise,} \end{array} \right. $$

where the index \(k+t-2\) is computed modulo \(2^{n}-1\). From the structure, \(\mathbf{H}\) is symmetric, or \(\mathbf{H}^{T} = \mathbf{H}\). As \(\mathbf{d}\) has the ideal two-level autocorrelation [40], i.e.,

$$\sum\limits_{k = 0}^{2^{n}-2} (-1)^{d_{k} + d_{k + \tau}} = \left\{ \begin{array}{ll} 2^{n}-1, & \quad \text{if } \tau = 0, \\ -1, & \quad \text{if } 1 \leq \tau \leq 2^{n}-2, \end{array} \right. $$

where \(k+\tau\) is computed modulo \(2^{n}-1\), it is obvious that \(\mathbf{H}\mathbf{H}^{T} = \mathbf{H}^{T}\mathbf{H} = N\mathbf{I}\). Since \(\mathbf{H}\) is public, the structure and the initial state of an n-stage linear feedback shift register (LFSR) generating the binary m-sequence \(\mathbf{d}\) are publicly known.
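To make the construction of \(\mathbf{H}\) concrete, the sketch below generates one period of an m-sequence for n=4 and fills the matrix entry by entry. The recurrence \(d_{k+4} = d_{k} + d_{k+1} \pmod 2\) (characteristic polynomial \(x^{4}+x+1\)) and the initial state are choices made for this example only; the paper does not specify which LFSR it uses.

```python
import numpy as np

def m_sequence(n=4, taps=(0, 1), init=(1, 0, 0, 0)):
    """One period (2^n - 1 bits) of d_{k+n} = sum_{j in taps} d_{k+j} mod 2."""
    d = list(init)
    for k in range(2**n - 1 - n):
        d.append(sum(d[k + j] for j in taps) % 2)
    return np.array(d)

def hadamard_from_m_sequence(d):
    """H(k,t) = 1 if k = 0 or t = 0, else (-1)^{d_{k+t-2}}, index taken mod 2^n - 1."""
    L = len(d)            # L = 2^n - 1
    N = L + 1
    H = np.ones((N, N))
    for k in range(1, N):
        for t in range(1, N):
            H[k, t] = (-1.0) ** d[(k + t - 2) % L]
    return H

d = m_sequence()
H = hadamard_from_m_sequence(d)
N = H.shape[0]
print("symmetric:", np.array_equal(H, H.T))
print("H H^T = N I:", np.allclose(H @ H.T, N * np.eye(N)))
```

The orthogonality check mirrors the argument in the text: the inner product of two distinct nonzero rows is 1 plus an out-of-phase autocorrelation value of −1, which is zero.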

3.2 Keystream generation for CS encryption

In the sensing matrix \(\boldsymbol{\Phi}\) of (1), we assume that \(\mathbf{s}\) is a segment of length N from the original keystream of an extremely long period, which makes it possible to renew the keystream \(\mathbf{s}\) at each CS encryption. For fast and efficient keystream generation, one may employ an LFSR-based nonlinear keystream generator of stream ciphers. For example, we may consider the combinatorial sequence generator [41], the filtering sequence generator [42], the clock-controlled generator [43, 44], the shrinking generator [45], and the self-shrinking generator (SSG) [46], each of which presents a simple structure but a remarkable resistance against various attacks. For more details on keystream generators and stream ciphers, see [47] and [48]. Regarding the keystream of our CS-based cryptosystem, we make the following assumption.

Assumption 1

An original keystream from a stream cipher is designed to have nice pseudorandomness properties [40] such as balance, large period, low autocorrelation, and large linear complexity. With the properties, we assume that each element of the keystream s takes +1 or −1 independently and uniformly at random, which facilitates the security analysis of our CS-based cryptosystem.

When we employ a keystream generator to produce the keystream s, the initial seed (or state) of the generator is essentially the key of our CS-based cryptosystem. The key should be kept secret between a sender and a legitimate recipient, whereas the structure of the keystream generator can be publicly known. For secure key exchange, we may establish a separate secure channel, or use the key establishment via the RSSI from wireless channels as in [35].
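As a toy illustration of how a seed could be expanded into a ±1 keystream, the sketch below follows the usual textbook description of the self-shrinking generator: LFSR output bits are read in pairs \((a, b)\), and b is emitted only when a=1. The 16-stage register, its tap positions, the seed, and the bit-to-sign mapping are all assumptions of this example, not parameters from the paper, and a register this short would not be adequate in practice.

```python
import numpy as np

def lfsr_bits(state, taps):
    """Endless bit stream d_0, d_1, ... with d_{k+n} = XOR of d_{k+j} for j in taps."""
    state = list(state)
    while True:
        yield state[0]
        new = 0
        for j in taps:
            new ^= state[j]
        state = state[1:] + [new]

def self_shrinking_keystream(seed_bits, taps, N):
    """Return N keystream symbols in {-1, +1} produced by the pair-decimation rule."""
    if not any(seed_bits):
        raise ValueError("the all-zero seed produces no output")
    bits = lfsr_bits(seed_bits, taps)
    out = []
    for _ in range(64 * N):             # safety cap so the sketch cannot loop forever
        a, b = next(bits), next(bits)
        if a == 1:                      # keep b only when the first bit of the pair is 1
            out.append(1 - 2 * b)       # bit 0 -> +1, bit 1 -> -1
            if len(out) == N:
                return np.array(out)
    raise RuntimeError("not enough output bits; check the register and taps")

# Hypothetical 16-bit secret seed and illustrative tap positions.
seed = (1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 0, 0, 1)
taps = (0, 2, 3, 5)
s = self_shrinking_keystream(seed, taps, N=128)
print(s[:16])
```

In this picture, `seed` plays the role of the secret key, while `taps` and the decimation rule are public, matching the split described in the text.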

3.3 CS decryption

For CS decryption, a noisy ciphertext \(\mathbf {r} = \boldsymbol {\Phi } \mathbf {x} + \mathbf {n} \in {\mathbb {R}}^{M} \) is available for an adversary as well as a legitimate recipient, where \(\mathbf {n} \sim {\mathcal {N}}\left (\mathbf {0}, \sigma ^{2} \mathbf {I}\right)\) is a measurement noise. A legitimate recipient of the ciphertext r, who knows Φ, attempts to recover the plaintext x by conducting a CS reconstruction. Meanwhile, an adversary will make various attempts to recover the plaintext x or the keystream s, with no knowledge of Φ.

Proposition 1 presents the reliability and the stability of our CS-based cryptosystem for a legitimate recipient, which is from the RIP result [49, 50] of a partial unitary sensing matrix.

Proposition 1

[49, 50] For a legitimate recipient, our CS-based cryptosystem theoretically guarantees a stable decryption of a K-sparse plaintext with bounded errors, as long as \(M = \mathcal {O}\left (\mu ^{2}(\mathbf {U}) \cdot K\log ^{4}N \right)\).

When \(\mathbf{U}_{1} = \mathbf{H}\), numerical experiments revealed that \(\mu (\mathbf {U}) = {\mathcal {O}}\left (\sqrt {\log N}\right) \) for i) \(\mathbf{U}_{2} = \mathbf{W}\) or ii) \(\mathbf{U}_{2} = \mathbf{D}\), if each entry of the keystream \(\mathbf{s}\) takes +1 or −1 uniformly at random. In this case, if \(M = \mathcal {O}\left (K\log ^{5}N\right)\), Proposition 1 guarantees a stable decryption.

Table 2 summarizes a symmetric-key CS-based cryptosystem proposed in this paper.

Table 2 Symmetric-key CS-based cryptosystem

4 Security analysis

A CS-based cryptosystem cannot be perfectly secure [15] but is believed to be computationally secure [15, 16]. In this section, we analyze the computational security of our CS-based cryptosystem by studying the notion of indistinguishability [19].

Assume that a cryptosystem produces a ciphertext by encrypting one of two possible plaintexts. The cryptosystem is said to have the indistinguishability, if no adversary can determine in polynomial time which of the two plaintexts corresponds to the ciphertext, with probability significantly better than that of a random guess [51]. In short, if a cryptosystem has the indistinguishability, an adversary is unable to learn any partial information of the plaintext in polynomial time from a given ciphertext.

Specifically, let us consider an indistinguishability experiment [51] with a constraint of K-sparse plaintexts. First, an adversary creates a pair of plaintexts \(\mathbf{x}_{1}\) and \(\mathbf{x}_{2}\), each with at most K nonzero entries. Then, our CS-based cryptosystem produces a ciphertext \(\mathbf{r} = \boldsymbol{\Phi}\mathbf{x}_{h} + \mathbf{n}\) by randomly selecting h, where h=1 or 2. Given \(\mathbf{r}\), the adversary attempts to figure out which plaintext, \(\mathbf{x}_{1}\) or \(\mathbf{x}_{2}\), was encrypted for the ciphertext, by carrying out a polynomial time test \({\mathcal {D}}: \mathbf {r} \rightarrow h \in \{1, 2\}\).

In this paper, we make use of the total variation (TV) distance [18] to evaluate the performance of the indistinguishability experiment. Let \(d_{\text{TV}}(p_{1}, p_{2})\) be the TV distance between the probability distributions \(p_{1} = \Pr(\mathbf{r} | \mathbf{x}_{1})\) and \(p_{2} = \Pr(\mathbf{r} | \mathbf{x}_{2})\). Then, it is readily checked from [52] that the probability \(p_{d}\) that an adversary can successfully distinguish the plaintexts by any binary hypothesis test \({\mathcal {D}}\) is bounded by

$$ p_{d} \leq \frac{1}{2} + \frac{d_{\text{TV}} (p_{1}, p_{2}) }{2}. $$
(2)

Therefore, if \(d_{\text{TV}}(p_{1}, p_{2})\) approaches zero, the probability of success will be at most that of a random guess, which leads to the indistinguishability of a cryptosystem. Consequently, one can argue that a cryptosystem with \(d_{\text{TV}}(p_{1}, p_{2})\) closer to zero would be more secure in terms of the indistinguishability. Since computing \(d_{\text{TV}}(p_{1}, p_{2})\) directly is difficult [53], we instead compute two probability metrics that bound the TV distance, which ultimately allows us to examine the indistinguishability of our CS-based cryptosystem.

4.1 Relative entropy

In [23] and [24], the relative entropy (or the Kullback-Leibler divergence [20]) has been used to quantify the indistinguishability. Precisely, the relative entropy of two probability distributions gives an upper bound on the TV distance by Pinsker’s inequality [54] or the refinements [55], which ultimately bounds the success probability of the indistinguishability experiment by (2).

In (1), one may assume that the entries of \(\boldsymbol{\Phi}\) are asymptotically Gaussian for a sufficiently large N, since each one can be seen as the sum of independent random variables weighted by each entry of \(\mathbf{s}\). Along with the Gaussian noise \(\mathbf{n}\), we assume that \(\mathbf{r}\), conditioned on \(\mathbf{x}_{1}\) (or \(\mathbf{x}_{2}\)), is a jointly Gaussian random vector. Also, \( {\mathbb {E}}[\boldsymbol {\Phi }] = \frac {1}{\sqrt {MN}} \mathbf {R}_{\Omega } \mathbf {U}_{1} \cdot {\mathbb {E}} [\!\text {diag}(\mathbf {s})] \cdot \mathbf {U}_{2} = \mathbf {0}\) for a given \(\mathbf{R}_{\Omega}\), as each entry of \(\mathbf{s}\) takes ±1 with probability 1/2 under Assumption 1. Thus, \( {\mathbb {E}}\left [\!\mathbf {r} | \mathbf {x}_{h}\right ] = {\mathbb {E}}[\boldsymbol {\Phi }] \cdot \mathbf {x}_{h} + {\mathbb {E}}[\!\mathbf {n}] = \mathbf {0}\). With the Gaussian random vector \(\mathbf{r}\), the relative entropy between \(p_{1} = \Pr(\mathbf{r} | \mathbf{x}_{1})\) and \(p_{2} = \Pr(\mathbf{r} | \mathbf{x}_{2})\) has the following closed-form expression [56]

$$ D\left(p_{1} || p_{2}\right) = \frac{1}{2} \left[ \log \frac{|\mathbf{C}_{2}|}{|\mathbf{C}_{1}|} + \text{tr} \left(\mathbf{C}_{2}^{-1} \mathbf{C}_{1}\right) - M \right], $$
(3)

where \(\mathbf{C}_{1}\) and \(\mathbf{C}_{2}\) are the covariance matrices of \(\mathbf{r}\) conditioned on \(\mathbf{x}_{1}\) and \(\mathbf{x}_{2}\), respectively. By measuring the relative entropy by (3), we obtain an upper bound on the TV distance, i.e.,

$$ d_{\text{TV}} (p_{1}, p_{2}) \leq \min \left(\sqrt{\frac{D(p_{1} || p_{2})}{2}}, \ 1 \right) $$
(4)

by Pinsker’s inequality. In (4), the upper bound is set to be at most 1, since \(d_{\text{TV}}(p_{1}, p_{2}) \in [0, 1]\).
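The chain from (3) to (4) to (2) can be evaluated numerically once the two covariance matrices are available. The sketch below is a minimal illustration; the function names and the diagonal toy covariances are assumptions made for this example and do not come from the paper.

```python
import numpy as np

def gaussian_kl(C1, C2):
    """Relative entropy (3), in nats, between zero-mean Gaussians N(0, C1) and N(0, C2)."""
    M = C1.shape[0]
    _, logdet1 = np.linalg.slogdet(C1)
    _, logdet2 = np.linalg.slogdet(C2)
    trace_term = np.trace(np.linalg.solve(C2, C1))   # tr(C2^{-1} C1) without an explicit inverse
    return 0.5 * (logdet2 - logdet1 + trace_term - M)

def tv_bound_pinsker(kl):
    """Upper bound (4) on the TV distance, capped at 1."""
    return min(float(np.sqrt(kl / 2.0)), 1.0)

def success_probability_bound(d_tv):
    """Upper bound (2) on the adversary's success probability."""
    return 0.5 + d_tv / 2.0

# Toy covariances for illustration only.
C1 = np.diag([1.0, 1.0, 1.0])
C2 = np.diag([1.2, 0.9, 1.1])
kl = gaussian_kl(C1, C2)
tv = tv_bound_pinsker(kl)
print(kl, tv, success_probability_bound(tv))
```

Using `slogdet` and `solve` rather than `det` and `inv` is a numerical choice that tends to behave better when M is large or the covariances are poorly conditioned.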

In what follows, we present an upper bound on the relative entropy with some constraints on plaintexts, which subsequently yields an analytic upper bound on the maximum TV distance by (4).

Theorem 1

In our CS-based cryptosystem, assume that each plaintext x has at most K nonzero entries with the constant energy \(\mathcal {E}_{x} = || \mathbf {x} ||^{2}\). Then, the relative entropy of (3) is bounded by

$$ {}D(p_{1} || p_{2})\! \leq\! \frac{M}{2} \left(\!K \mu^{2} (\mathbf{U}_{2})\! \cdot \text{PNR}\,-\,\log\! \left(K \mu^{2} (\mathbf{U}_{2})\! \cdot\! \text{PNR}\! +\! 1 \right) \right)\!, $$
(5)

where \(\text{PNR} = \frac {\mathcal {E}_{x}}{M \sigma ^{2}}\) is the plaintext-to-noise power ratio (PNR).

Proof

See the Appendices. □

In Theorem 1, \(\mu(\mathbf{U}_{2}) = 1\) if \(\mathbf{U}_{2} = \mathbf{W}\), while \(\mu (\mathbf {U}_{2}) = \sqrt {2}\) if \(\mathbf{U}_{2} = \mathbf{D}\). However, if \(\mathbf {U}_{2} = \sqrt {N} \mathbf {I}\), then \(\mu (\mathbf {U}_{2}) = \sqrt {N}\) and the upper bound grows with N. Thus, Theorem 1 implies that one must not use \(\mathbf {U}_{2} = \sqrt {N} \mathbf {I}\), to achieve the indistinguishability of our CS-based cryptosystem.
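A short numerical sketch of the bound (5), combined with Pinsker's inequality (4), shows how the choice of \(\mathbf{U}_{2}\) enters only through \(\mu^{2}(\mathbf{U}_{2})\). The parameter values and function names below are illustrative assumptions, not values used in the paper.

```python
import math

def kl_upper_bound(M, K, mu2_sq, pnr):
    """Right-hand side of (5), with a = K * mu^2(U2) * PNR."""
    a = K * mu2_sq * pnr
    return 0.5 * M * (a - math.log1p(a))   # log1p(a) = log(1 + a), accurate for small a

def tv_upper_bound_from_kl(M, K, mu2_sq, pnr):
    """Bound (4) applied to the bound (5)."""
    return min(math.sqrt(kl_upper_bound(M, K, mu2_sq, pnr) / 2.0), 1.0)

# Illustrative parameters: compare U2 = W (mu^2 = 1), U2 = D (mu^2 = 2), U2 = sqrt(N) I (mu^2 = N).
M, K, N, pnr = 64, 4, 1024, 0.01
for label, mu2_sq in (("W", 1.0), ("D", 2.0), ("sqrt(N) I", float(N))):
    print(label, kl_upper_bound(M, K, mu2_sq, pnr), tv_upper_bound_from_kl(M, K, mu2_sq, pnr))
```

Because \(a - \log(1+a)\) is nonnegative and increasing in \(a \geq 0\), the bound grows with each of K, \(\mu^{2}(\mathbf{U}_{2})\), and PNR, which is the monotonicity the surrounding discussion relies on.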

To ensure a reliable CS decryption for a legitimate recipient, our CS-based cryptosystem can set \(K = {\mathcal {O}} \left (\frac {M}{ \mu ^{2} (\mathbf {U}) \log N} \right)\) for nonuniform CS recovery [57], which yields the following corollary.

Corollary 1

In our CS-based cryptosystem with \(\mathbf{U}_{1} = \mathbf{H}\) and \(N = 2^{n}\), assume \(\mathbf{U}_{2} = \mathbf{W}\) or \(\mathbf{D}\), where \(\mu (\mathbf {U})={\mathcal {O}}(\sqrt {\log N})\). In Theorem 1, if \(K \leq \frac {c M}{n^{2}} \) with a constant c, then

$$ \begin{aligned} D(p_{1} || p_{2}) & \leq \frac{M}{2} \left(\frac{c M \mu^{2} (\mathbf{U}_{2})}{n^{2}} \cdot \text{PNR}\right. \\ &\quad - \log \left.\left(\frac{c M \mu^{2} (\mathbf{U}_{2})}{n^{2}} \cdot \text{PNR} + 1 \right) \right). \end{aligned} $$

Thus, if the keystream length N is sufficiently large with given M and PNR, our CS-based cryptosystem will have low relative entropy, which contributes to the indistinguishability against an adversary, while guaranteeing the reliability for a legitimate recipient.

4.2 Hellinger distance

To bound the TV distance, we may use another probability metric, the Hellinger distance [21]. In our CS-based cryptosystem, recall that the ciphertext \(\mathbf{r}\), conditioned on \(\mathbf{x}_{h}\), is assumed to be a jointly Gaussian random vector with zero mean and the covariance matrix \(\mathbf{C}_{h}\), where h=1 or 2. Then, the Hellinger distance for the multivariate Gaussian distributions \(p_{1}\) and \(p_{2}\) is given by [58, 59]

$$ d_{\mathrm{H}} (p_{1}, p_{2}) = \sqrt{1- \frac{|\mathbf{C}_{1}|^{\frac{1}{4}} |\mathbf{C}_{2}|^{\frac{1}{4}}}{|\mathbf{C}_{3}|^{\frac{1}{2}}} }, $$
(6)

where \(\mathbf {C}_{3} = \frac {\mathbf {C}_{1} + \mathbf {C}_{2}}{2}\). The Hellinger distance is particularly useful by giving both upper and lower bounds on the TV distance [60], i.e.,

$$ {}d_{\mathrm{H}}^{2} (p_{1}, p_{2}) \leq d_{\text{TV}} (p_{1}, p_{2}) \leq d_{\mathrm{H}}(p_{1}, p_{2}) \sqrt{2 - d_{\mathrm{H}}^{2} (p_{1}, p_{2})}. $$
(7)
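The closed form (6) and the two-sided bound (7) are straightforward to evaluate for given covariance matrices. The following sketch uses hypothetical function names and toy diagonal covariances chosen only for illustration.

```python
import numpy as np

def gaussian_hellinger(C1, C2):
    """Hellinger distance (6) between zero-mean Gaussians N(0, C1) and N(0, C2)."""
    C3 = 0.5 * (C1 + C2)
    _, ld1 = np.linalg.slogdet(C1)
    _, ld2 = np.linalg.slogdet(C2)
    _, ld3 = np.linalg.slogdet(C3)
    # log of |C1|^{1/4} |C2|^{1/4} / |C3|^{1/2}, computed in the log domain for stability
    log_affinity = 0.25 * ld1 + 0.25 * ld2 - 0.5 * ld3
    return float(np.sqrt(max(0.0, 1.0 - np.exp(log_affinity))))

def tv_bounds_hellinger(d_h):
    """Lower and upper bounds (7) on the TV distance."""
    return d_h**2, d_h * np.sqrt(2.0 - d_h**2)

# Toy covariances for illustration only.
C1 = np.diag([1.0, 1.0, 1.0])
C2 = np.diag([1.2, 0.9, 1.1])
d_h = gaussian_hellinger(C1, C2)
lower, upper = tv_bounds_hellinger(d_h)
print(d_h, lower, upper)
```

Unlike Pinsker's inequality, (7) also gives a lower bound, so the pair `(lower, upper)` brackets the TV distance from both sides.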

In what follows, we present an upper bound on the Hellinger distance of (6), which leads to an analytic upper bound on the maximum TV distance by (7).

Theorem 2

Recall the assumptions and definitions of Theorem 1. In our CS-based cryptosystem, the Hellinger distance of (6) is bounded by

$$ d_{\mathrm{H}} (p_{1}, p_{2}) \leq \sqrt{1 - \left(\frac{2\sqrt{K \mu^{2} (\mathbf{U}_{2}) \cdot \text{PNR} + 1}}{K \mu^{2} (\mathbf{U}_{2})\cdot \text{PNR} + 2} \right)^{\frac{M}{4} }}, $$
(8)

where \(\text {PNR} = \frac {\mathcal {E}_{x}}{M \sigma ^{2}}\).

Proof

See the Appendices. □
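The bound (8) can be evaluated in the same way as the relative-entropy bound, and then passed through the right-hand inequality of (7). The parameter values below are illustrative assumptions and the function names are hypothetical.

```python
import math

def hellinger_upper_bound(M, K, mu2_sq, pnr):
    """Right-hand side of (8), with a = K * mu^2(U2) * PNR."""
    a = K * mu2_sq * pnr
    ratio = 2.0 * math.sqrt(a + 1.0) / (a + 2.0)      # at most 1 by the AM-GM inequality
    return math.sqrt(max(0.0, 1.0 - ratio ** (M / 4.0)))

def tv_upper_bound_from_hellinger(M, K, mu2_sq, pnr):
    """Upper bound of (7) applied to the bound (8)."""
    d_h = hellinger_upper_bound(M, K, mu2_sq, pnr)
    return d_h * math.sqrt(2.0 - d_h**2)

# Illustrative sweep over PNR for U2 = W (mu^2 = 1).
M, K = 64, 4
for pnr in (0.001, 0.01, 0.1):
    print(pnr, hellinger_upper_bound(M, K, 1.0, pnr), tv_upper_bound_from_hellinger(M, K, 1.0, pnr))
```

Since \(d \sqrt{2 - d^{2}}\) is increasing on \([0, 1]\), substituting an upper bound on the Hellinger distance into (7) still yields a valid upper bound on the TV distance.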

Corollary 2

In our CS-based cryptosystem with \(\mathbf{U}_{1} = \mathbf{H}\) and \(N = 2^{n}\), assume \(\mathbf{U}_{2} = \mathbf{W}\) or \(\mathbf{D}\), where \(\mu (\mathbf {U}) = {\mathcal {O}}\left (\sqrt {\log N}\right)\). In Theorem 2, if \(K \leq \frac {c M}{n^{2}}\) with a constant c, then

$$ d_{\mathrm{H}} (p_{1}, p_{2}) \leq \sqrt{1 - \left(\frac{2n \sqrt{c M \mu^{2} (\mathbf{U}_{2}) \cdot \text{PNR} + n^{2}}} {c M \mu^{2} (\mathbf{U}_{2}) \cdot \text{PNR} + 2n^{2}} \right)^{\frac{M}{4} }}. $$

Thus, if the keystream length N is sufficiently large with given M and PNR, our CS-based cryptosystem will have low Hellinger distance, which contributes to the indistinguishability against an adversary, while guaranteeing the reliability for a legitimate recipient.

Remark 1

Theorems 1 and 2 suggest that the relative entropy and the Hellinger distance will approach zero as PNR decreases. Accordingly, our CS-based cryptosystem will have low TV distance by (4) and (7) at low PNR. Similarly, the TV distance will be low when M and K are small. Consequently, our CS-based cryptosystem can be indistinguishable at low PNR for small M and K.

Remark 2

When \(N = 2^{n}\) increases, Corollaries 1 and 2 suggest that if M is fixed, the relative entropy and the Hellinger distance will decrease at a given PNR by reducing \(K = {\mathcal {O}} \left (\frac {M}{n^{2}} \right)\), which will be confirmed by numerical results of Section 6. On the other hand, if M increases with \(M = {\mathcal {O}} \left (K n^{2}\right) \) for a given K, numerical results reveal that they also decrease over N at a given PNR, a trend that the bounds of Theorems 1 and 2 do not capture. This observation implies that there is room to improve the bounds of the theorems. Combined with Remark 1, the TV distance will be low if the keystream length N is sufficiently large with low compression \(\left (\frac {M}{N} \right)\) and sparsity \(\left (\frac {K}{N} \right)\) ratios, which leads to the asymptotic indistinguishability of our CS-based cryptosystem.

5 Potential key recovery attack

In this section, we consider a potential key recovery attack in which an adversary attempts to recover the key of our CS-based cryptosystem. In the CPA, the adversary tries to restore a keystream from a ciphertext (stage 1) and then to recover the original key from the restored keystream via algebraic cryptanalysis (stage 2). With a sufficiently long key, we assume that the number of keystream bits required for the algebraic cryptanalysis, denoted by D, is much larger than the ciphertext length M. For convenience of analysis, we assume D=N, which means that the adversary needs to restore a keystream of full length N from stage 1. Figure 1 illustrates the potential CPA from an adversary for key recovery. This section discusses the adversary’s strategy for keystream recovery in stage 1. Once a keystream is successfully restored through stage 1, a known cryptanalysis [47, 48] can be carried out in stage 2 for key recovery, which will not be discussed in this paper.

Fig. 1 An adversary’s chosen plaintext attack for key recovery against our CS-based cryptosystem

5.1 Mathematical intractability of keystream recovery

In stage 1 of the CPA, an adversary needs to obtain a correct N-bit keystream from a ciphertext that has been produced by encrypting a chosen plaintext. We assume that the adversary will choose a plaintext \(\mathbf{x}\) such that each entry of \(\widehat {\mathbf {x}} = \mathbf {U}_{2} \mathbf {x} \) is nonzero for a unitary matrix \(\mathbf{U}_{2}\). Then, the corresponding ciphertext is given by

$$ \begin{aligned} \mathbf{r} = \boldsymbol{\Phi} \mathbf{x} + \mathbf{n} & = \frac{1}{\sqrt{MN}} \mathbf{R}_{\Omega} \mathbf{U}_{1} \text{diag}(\mathbf{s}) \mathbf{U}_{2} \mathbf{x} + \mathbf{n}\\ & = \frac{1}{\sqrt{MN}} \mathbf{R}_{\Omega} \mathbf{U}_{1} \text{diag}(\widehat{\mathbf{x}}) \mathbf{s} + \mathbf{n}\\ & = \mathbf{A} \mathbf{s} +\mathbf{n}, \end{aligned} $$
(9)

where \(\mathbf {A} = \frac {1}{\sqrt {MN}} \mathbf {R}_{\Omega } \mathbf {U}_{1} \text {diag}(\widehat {\mathbf {x}})\). Unlike in stream ciphers, restoring the keystream s from the known plaintext-ciphertext pair is not a trivial task, since s is hidden under compression in r.
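The step from the first to the second line of (9) rests on the identity \(\text{diag}(\mathbf{s}) \widehat{\mathbf{x}} = \text{diag}(\widehat{\mathbf{x}}) \mathbf{s}\). The following minimal sketch checks numerically, for a toy size and without noise, that both forms give the same ciphertext. Here \(\mathbf{U}_{1}\) is a Sylvester-type Hadamard matrix; taking \(\mathbf{U}_{2}\) to be the same matrix, and the particular values of x, s, and Ω, are assumptions made only for illustration.

```python
import math

def hadamard(n):
    """Sylvester construction of the 2^n x 2^n Hadamard matrix (entries +/-1)."""
    H = [[1]]
    for _ in range(n):
        H = [row + row for row in H] + [row + [-v for v in row] for row in H]
    return H

def matvec(A, v):
    """Plain matrix-vector product for lists of lists."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def encrypt_two_ways(x, s, omega, U1, U2):
    """Return (Phi x, A s) without noise, following the two forms in (9)."""
    M, N = len(omega), len(s)
    scale = 1.0 / math.sqrt(M * N)
    x_hat = matvec(U2, x)
    # Form 1: R_Omega U1 diag(s) U2 x
    full = matvec(U1, [si * xi for si, xi in zip(s, x_hat)])
    phi_x = [scale * full[k] for k in omega]
    # Form 2: A s with A = R_Omega U1 diag(x_hat) / sqrt(MN)
    A = [[scale * U1[k][j] * x_hat[j] for j in range(N)] for k in omega]
    return phi_x, matvec(A, s)

# Toy instance (all values are illustrative assumptions).
H = hadamard(3)                                  # N = 8
x = [0.0, 1.5, 0.0, 0.0, -2.0, 0.0, 0.0, 0.0]    # sparse chosen plaintext
s = [1, -1, -1, 1, 1, 1, -1, 1]                  # bipolar keystream
omega = [0, 3, 5]                                # rows kept by R_Omega (M = 3)
phi_x, A_s = encrypt_two_ways(x, s, omega, H, H)
```

In this toy instance every entry of \(\widehat{\mathbf{x}}\) is nonzero, which matches the adversary's choice of plaintext described above.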

From the ciphertext r of (9), an adversary needs to find a most likely keystream, which is equivalent to a maximum-likelihood (ML) estimate of

$$ \widehat{\mathbf{s}} = \underset{\mathbf{s} \in \{-1, +1\}^{N}}{\text{argmin}} || \mathbf{r} - \mathbf{A} \mathbf{s} ||^{2}. $$
(10)

Finding the ML solution of (10) is known as a constrained integer least-squares (ILS) problem, which is also called a closest vector problem (CVP) [61] in lattices. For a general A, the constrained ILS problem is proven to be NP hard [62].

To find a most likely keystream of (10), an exhaustive ML search requires a complexity of \({\mathcal {O}}\left (2^{N}\right)\), which would be computationally infeasible if the keystream length N is sufficiently large. Alternatively, the generalized sphere decoding (GSD) algorithms [63–65] can find an ML solution to the ILS problem of the underdetermined system with M<N. However, since their complexity is exponential in N−M [63–65], the GSD algorithms are not practical for the ILS problem with M≪N. To the best of our knowledge, there is no polynomial-time algorithm to find an ML solution of (10) with M≪N for a sufficiently large N.
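To make the \({\mathcal {O}}\left (2^{N}\right)\) cost concrete, the sketch below implements the exhaustive search for (10) on a toy underdetermined system. The matrix A and the keystream are hypothetical values, not taken from the paper; A is chosen so that the minimizer happens to be unique even though M<N.

```python
from itertools import product

def ml_exhaustive(A, r):
    """Exhaustive search for argmin ||r - A s||^2 over s in {-1,+1}^N."""
    N = len(A[0])
    best_s, best_cost = None, float("inf")
    for cand in product((-1, 1), repeat=N):      # visits all 2^N candidates
        cost = sum((ri - sum(a * c for a, c in zip(row, cand))) ** 2
                   for row, ri in zip(A, r))
        if cost < best_cost:
            best_s, best_cost = cand, cost
    return list(best_s), best_cost

# Toy underdetermined system with M = 2 and N = 4 (illustrative values).
A = [[1.0, 2.0, 4.0, 8.0],
     [1.0, -1.0, 1.0, -1.0]]
s_true = [1, -1, -1, 1]
r = [sum(a * s for a, s in zip(row, s_true)) for row in A]   # noiseless ciphertext
s_hat, cost = ml_exhaustive(A, r)
```

The loop body runs \(2^{N}\) times, so each additional keystream bit doubles the work; for a general A with M<N, several candidates may also attain the same cost.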

In summary, the computational security of our CS-based cryptosystem against the key recovery attack is brought by the mathematical hardness that no polynomial-time algorithm is known to find an ML solution to the underdetermined ILS problem. In fact, the mathematical intractability of the ILS problem has been exploited by public-key cryptosystems [66–68]. In our symmetric-key CS-based cryptosystem, it also ensures that if the keystream length N is sufficiently large with M≪N, no adversary will be able to find a most likely keystream of length N in polynomial time, which demonstrates the computational security of our CS-based cryptosystem against the key recovery attack.

5.2 Successive approximate maximum-likelihood detection (SAMD)

In Section 5.1, we demonstrated that the ML detection would be infeasible for keystream recovery, as long as the keystream length is sufficiently large. As an alternative, we consider a sub-optimal, but feasible keystream recovery process for the CPA. Instead of restoring an N-bit keystream at once, we assume that an adversary attempts to restore a disjoint J-bit segment (see Endnote 2) of the keystream from each detection, where J≪N, and repeats the detection \(\lceil \frac {N}{J} \rceil \) times successively to restore the keystream of full length N. In this subsection, we describe the details of the successive detection process for keystream recovery.

For convenience of analysis, we assume a chosen plaintext such that \(\widehat {\mathbf {x}} = \left (\sqrt {MN}, \cdots, \sqrt {MN}\right)^{T}\) in (9), which yields \(\mathbf {A} = \mathbf {R}_{\Omega } \mathbf {U}_{1}\) for our analysis (see Endnote 3). In the keystream recovery, an adversary has the freedom to choose the value of J and the J-bit positions of the keystream to be restored at the ith detection. Let \(\Theta _{i} \subset \{0, \cdots, N-1\}\) be a set of indices, where \(|\Theta _{i}| = J\) if \(1 \leq i \leq n_{s}-1\) and \(|\Theta _{i}| = N - (n_{s}-1)J\) if \(i = n_{s}\), respectively, for \(n_{s} = \lceil \frac {N}{J} \rceil \). Also, \(\Theta _{a} \cap \Theta _{b} = \phi \) for \(a \neq b\), where \(\phi \) is an empty set, and \(\Theta _{1} + \cdots + \Theta _{n_{s}} = \{0, \cdots, N-1 \}\).

Let \(\mathbf {s}_{\Theta _{i}} \in \{ -1, +1 \}^{|\Theta _{i} |}\) be a |Θ i |-bit vector, where the entries are taken from the indices of Θ i in the keystream s. At the ith detection, an adversary attempts to find \(\mathbf {s}_{\Theta _{i}}\) from the ciphertext r of (9). With \(\mathbf {s}_{\Theta _{1}}, \cdots, \mathbf {s}_{\Theta _{i-1}}\) that have been detected from the previous detections, the ith detection should use a new ciphertext r i by subtracting their contribution from r, i.e.,

$$ \mathbf{r}_{i} = \mathbf{r} - \sum\limits_{h=1}^{i-1} \mathbf{R}_{\Omega} \mathbf{U}_{1} \mathbf{R}_{\Theta_{h}}^{T} \widehat{\mathbf{s}}_{\Theta_{h}}, $$
(11)

where \(\widehat {\mathbf {s}}_{\Theta _{h}}\) is an estimate from the hth detection. In (11), \(\mathbf {R}_{\Theta _{h}}^{T}\) is an N×J column selection operator that selects J columns of U 1 whose indices are specified by Θ h . Let Δ i ={0,⋯,N−1}∖(Θ 1+⋯+Θ i ), where \(\Delta _{n_{s}} = \phi \), and \(\mathbf {R}_{\Delta _{i}}^{T}\) be an N×(N−i J) column selection operator whose indices are specified by Δ i . By assuming \(\widehat {\mathbf {s}}_{\Theta _{h}} = \mathbf {s}_{\Theta _{h}}\) for 1≤h≤i−1, we have from (11)

$$ \begin{aligned} \mathbf{r}_{i} & = \mathbf{R}_{\Omega} \mathbf{U}_{1} \mathbf{R}_{\Theta_{i}}^{T} \mathbf{s}_{\Theta_{i}} + \mathbf{R}_{\Omega} \mathbf{U}_{1} \mathbf{R}_{\Delta_{i}}^{T} \mathbf{s}_{\Delta_{i}} + \mathbf{n} \\ & = \mathbf{m}_{i} + \mathbf{w}_{i} + \mathbf{n}, \end{aligned} $$
(12)

where \(\mathbf {m}_{i} = \mathbf {R}_{\Omega } \mathbf {U}_{1} \mathbf {R}_{\Theta _{i}}^{T} \mathbf {s}_{\Theta _{i}}\) corresponds to a desired component to be detected at the ith detection, \(\mathbf {w}_{i} = \mathbf {R}_{\Omega } \mathbf {U}_{1} \mathbf {R}_{\Delta _{i}}^{T} \mathbf {s}_{\Delta _{i}} \) is an interfering component from the keystream segments that have not been detected yet, and \(\mathbf {n} \sim {\mathcal N} (\mathbf {0}, \sigma ^{2} \mathbf {I})\) is a Gaussian random noise.

In (12), \(\mathbf {w}_{n_{s}} = \mathbf {0}\) since \(\Delta _{n_{s}} = \phi \). On the other hand, if \(1 \leq i \leq n_{s}-1\), \(\mathbf {w}_{i}\) is the sum of \(N - iJ\) column vectors of \(\mathbf {R}_{\Omega } \mathbf {U}_{1}\), each of which is weighted by the corresponding entry of \(\mathbf {s}_{\Delta _{i}}\). Since each entry of \(\mathbf {s}_{\Delta _{i}}\) takes +1 or −1 randomly and independently under Assumption 1, \(\mathbf {w}_{i}\) can be approximated as jointly Gaussian by the central limit theorem [69]. By noting that \(\mathbf {w}_{i} + \mathbf {n}\) can be modeled as a Gaussian random vector for \(1 \leq i \leq n_{s}\), \(\mathbf {r}_{i}\) is also Gaussian for a given \(\mathbf {s}_{\Theta _{i}}\). Then,

$$ \begin{aligned} {\mathbb{E}}\left[\mathbf{r}_{i} | \mathbf{s}_{\Theta_{i}} \right] &= {\mathbb{E}}\left[\mathbf{m}_{i} | \mathbf{s}_{\Theta_{i}} \right] + {\mathbb{E}}\left[\mathbf{w}_{i} | \mathbf{s}_{\Theta_{i}} \right] + {\mathbb{E}}\left[\mathbf{n} | \mathbf{s}_{\Theta_{i}} \right]\\ &= \mathbf{R}_{\Omega} \mathbf{U}_{1} \mathbf{R}_{\Theta_{i}}^{T} \mathbf{s}_{\Theta_{i}} = \mathbf{m}_{i}, \end{aligned} $$
(13)

where \({\mathbb {E}}\left [\mathbf {w}_{i} | \mathbf {s}_{\Theta _{i}} \right ] = {\mathbb {E}}\left [\mathbf {n} | \mathbf {s}_{\Theta _{i}} \right ] = \mathbf {0}\), since \(\mathbf {s}_{\Theta _{i}}\) is independent of w i and n, respectively. Also, the covariance of r i is given by

$$ {}{\mathbb{E}}\!\left[\!\left(\mathbf{r}_{i} \,-\, \mathbf{m}_{i}\right)\!\left(\mathbf{r}_{i} \,-\, \mathbf{m}_{i}\right)^{T}\! | \mathbf{s}_{\Theta_{i}}\!\right] \!\!= \!{\mathbb{E}}\!\left[\!\left(\mathbf{w}_{i} \,+\, \mathbf{n}\right) \!\left(\mathbf{w}_{i} \,+\, \mathbf{n}\right)^{T}\!\right]\! =\! \mathbf{K}_{i} + \sigma^{2} \mathbf{I}, $$
(14)

where w i and n are independent. In (14),

$$ \begin{aligned} \mathbf{K}_{i} = {\mathbb{E}}\left[\mathbf{w}_{i} \mathbf{w}_{i}^{T}\right] & = \mathbf{R}_{\Omega} \mathbf{U}_{1} \mathbf{R}_{\Delta_{i}}^{T} \cdot {\mathbb{E}}\left[\mathbf{s}_{\Delta_{i}}\mathbf{s}_{\Delta_{i}}^{T}\right] \cdot \mathbf{R}_{\Delta_{i}} \mathbf{U}_{1}^{T} \mathbf{R}_{\Omega}^{T} \\ & = \mathbf{R}_{\Omega} \mathbf{U}_{1} \mathbf{R}_{\Delta_{i}}^{T} \cdot \mathbf{R}_{\Delta_{i}} \mathbf{U}_{1}^{T} \mathbf{R}_{\Omega}^{T}, \end{aligned} $$
(15)

where \({\mathbb {E}}\left [\mathbf {s}_{\Delta _{i}} \mathbf {s}_{\Delta _{i}}^{T}\right ] = \mathbf {I}\). Since K i does not depend on \(\mathbf {s}_{\Theta _{i}}\), the covariance of r i in (14) is equal for all possible \(\mathbf {s}_{\Theta _{i}} \in \{-1, +1\}^{|\Theta _{i}|}\) at each ith detection. Under the Gaussian model of r i with equal covariance, we can apply the ML decision rule [70] at the ith detection, which yields

$$ \widehat{\mathbf{s}}_{\Theta_{i}} = \underset{\mathbf{s}_{\Theta_{i}} \in \{-1, +1\}^{|\Theta_{i}|}}{\text{argmin}} \left(\mathbf{r}_{i} - \mathbf{m}_{i} \right)^{T} \left(\mathbf{K}_{i} + \sigma^{2} \mathbf{I} \right)^{-1} \left(\mathbf{r}_{i} - \mathbf{m}_{i} \right). $$
(16)

In (11) and (12), we assumed that all the estimates \(\widehat {\mathbf {s}}_{\Theta _{h}}\), 1≤h≤i−1, from the previous detections are correct, and thus ignored the estimation errors \(\mathbf {s}_{\Theta _{h}} - \widehat {\mathbf {s}}_{\Theta _{h}}\) while subtracting their contribution from r. Therefore, (16) is not a true ML detection, but an approximation that is optimistic for the adversary.

Finally, the adversary carries out the approximate ML detection of (16) n s times successively for 1≤i≤n s and restores the full N-bit keystream by combining the disjoint |Θ i |-bit estimates of \(\widehat {\mathbf {s}}_{\Theta _{i}} \). Throughout this paper, the detection process is called a successive approximate ML detection (SAMD). In what follows, we present an upper bound on the success probability of the SAMD.

Theorem 3

In the SAMD, recall the approximate ML decision rule of (16) applied at each ith detection for 1≤i≤n s , where \(n_{s} = \lceil \frac {N}{J} \rceil \). Let λ min (K i ) be the minimum eigenvalue of the covariance matrix K i in (15). Let P succ be the probability that an N-bit keystream can be successfully restored by the SAMD. Then,

$$ P_{succ} \leq \prod_{i=1}^{n_{s}} \left(1 - Q \left(\sqrt{\frac{M \mu^{2} (\mathbf{U}_{1})}{ \lambda_{min} (\mathbf{K}_{i}) + \sigma^{2}}} \right) \right) \triangleq P_{succ, UB}, $$
(17)

where \(Q(x) = \frac {1}{\sqrt {2 \pi }} \int _{x}^{\infty } e^{-\frac {t^{2}}{2}} dt\).

Proof

See the Appendices. □

Theorem 3 shows the result for a general unitary matrix U 1, which suggests that our CS-based cryptosystem should choose an N×N unitary matrix U 1 such that μ(U 1) is as small as possible, regardless of N, in order to degrade the performance of the SAMD. In this paper, μ(U 1)=1 from U 1=H.
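The bound (17) is straightforward to evaluate once the minimum eigenvalues are known. The sketch below is one way to do so, assuming that \(\lambda_{\min}(\mathbf{K}_{i})\) is supplied as a list; the function names and the eigenvalue figures in the example are hypothetical and serve only to show the mechanics.

```python
import math

def q_function(x):
    """Gaussian tail probability Q(x) = 0.5 * erfc(x / sqrt(2))."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def samd_success_upper_bound(lambda_mins, M, sigma2, mu=1.0):
    """Evaluate the product in (17) for given lambda_min(K_i), i = 1..n_s."""
    bound = 1.0
    for lam in lambda_mins:
        arg = math.sqrt(M * mu ** 2 / (lam + sigma2))
        bound *= 1.0 - q_function(arg)
    return bound

# Illustrative numbers only: four detections with decreasing lambda_min.
example = samd_success_upper_bound([900.0, 500.0, 100.0, 0.0], M=48, sigma2=1.0)
```

Each factor grows as \(\lambda_{\min}(\mathbf{K}_{i})\) shrinks, which is why a smaller minimum eigenvalue favors the adversary.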

The upper bound on the success probability of Theorem 3 represents the highest possible performance that the SAMD can achieve with no estimation errors at each detection, which is an optimistic scenario for an adversary. In reality, the actual probability of success will be much lower than the upper bound, due to estimation errors and error propagation through detections. If an adversary finds a solution of (16) via an exhaustive search, the complexity of each detection of the SAMD will be \({\mathcal O} \left (2^{J}\right)\) with J≪N.
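The whole procedure of (11) and (16) can be sketched in a few lines for toy sizes. The code below is a minimal illustration, not the implementation used for the paper's experiments: it assumes NumPy, uses consecutive index sets for \(\Theta_{i}\), takes \(\mathbf{U}_{1}\) as a Sylvester-type Hadamard matrix, and solves each detection by an exhaustive search over \(2^{J}\) candidates. The function names, sizes, seed, and noise level are illustrative assumptions.

```python
import itertools
import numpy as np

def hadamard(n):
    """Sylvester construction of the 2^n x 2^n Hadamard matrix (entries +/-1)."""
    H = np.array([[1.0]])
    for _ in range(n):
        H = np.block([[H, H], [H, -H]])
    return H

def samd(r, B, J, sigma2):
    """Successive approximate ML detection with consecutive index sets.

    B plays the role of R_Omega U_1 (shape M x N); r is the ciphertext.
    """
    M, N = B.shape
    s_hat = np.zeros(N)
    r_i = np.array(r, dtype=float)
    for start in range(0, N, J):
        stop = min(start + J, N)
        B_theta = B[:, start:stop]                 # columns detected at this stage
        B_delta = B[:, stop:]                      # columns not detected yet
        cov = B_delta @ B_delta.T + sigma2 * np.eye(M)   # K_i + sigma^2 I
        cov_inv = np.linalg.inv(cov)
        best, best_cost = None, np.inf
        for cand in itertools.product((-1.0, 1.0), repeat=stop - start):
            e = r_i - B_theta @ np.array(cand)
            cost = e @ cov_inv @ e                 # metric of (16)
            if cost < best_cost:
                best, best_cost = np.array(cand), cost
        s_hat[start:stop] = best
        r_i = r_i - B_theta @ best                 # cancel the detected segment, as in (11)
    return s_hat

# Toy run (sizes, seed, and noise level are illustrative assumptions).
rng = np.random.default_rng(0)
N, M, J, sigma2 = 16, 6, 4, 0.01
H = hadamard(4)
omega = np.sort(rng.choice(N, size=M, replace=False))
s = rng.choice([-1.0, 1.0], size=N)
r = H[omega] @ s + np.sqrt(sigma2) * rng.standard_normal(M)
s_hat = samd(r, H[omega], J, sigma2)
bit_errors = int(np.sum(s_hat != s))
```

Because later detections reuse earlier estimates, any wrong segment corrupts every subsequent residual, which is the error propagation mentioned above.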

5.3 Minimum eigenvalues of K i

Theorem 3 implies that minimizing λ min(K i ) can improve the performance of the SAMD. At the ith detection of the SAMD, it is the adversary who determines the selection operator \(\mathbf {R}_{\Theta _{i}}\). Therefore, if the adversary appropriately chooses Θ i (or equivalently Δ i ) to minimize λ min(K i ), the success probability of the SAMD can be improved. In this paper, we consider three possible selections for Θ i that the adversary may reasonably choose.

  1) Uniform selection: \(\Theta _{i} = \left \{i-1, \ \lfloor \frac {N}{J} \rfloor +i-1, \ \cdots, \ (J-1)\lfloor \frac {N}{J} \rfloor +i-1 \right \}\).

  2) Consecutive selection: \(\Theta _{i} = \{(i-1)J, \ (i-1)J+1, \ \cdots, \ iJ-1\}\).

  3) Random selection: \(\Theta _{i}\) selects the J indices from \(\{0, \cdots, N-1\} \setminus (\Theta _{1} + \cdots + \Theta _{i-1})\) uniformly at random.

Each selection is valid for 1≤i≤n s −1, and \(\Theta _{n_{s}} = \{0, \cdots, N-1\} \setminus (\Theta _{1} + \cdots + \Theta _{n_{s}-1})\), where \(n_{s} = \lceil \frac {N}{J} \rceil \). To further minimize λ min(K i ), the adversary might be able to develop a more sophisticated selection of Θ i by exploiting the structure of R Ω and U 1. However, we leave this issue open for future research. Regarding the selection operator, we have the following assumption.

Assumption 2

Once an adversary chooses a value of J and a type of selection, we assume that they will be fixed through the entire detections of the SAMD.

Intuitively, a larger J will ensure better detection performance for the SAMD, since a longer keystream segment that can be subtracted from each detection may contribute less interference. The intuition will be justified by the numerical results of Section 6. In this regard, Assumption 2 is valid, since the adversary’s reasonable option is to fix the value of J to the largest possible one allowed by the computing power. In addition, the numerical results of Section 6 show that λ min(K i ) is hardly affected by the type of selection, which also supports Assumption 2.
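The three selection rules can be generated as follows. This is a minimal sketch in which the function name, the `kind` argument, and the seed are hypothetical; the last set \(\Theta_{n_{s}}\) simply collects whatever indices remain.

```python
import math
import random

def selections(N, J, kind, seed=0):
    """Return the index sets Theta_1, ..., Theta_{n_s} as lists of indices."""
    n_s = math.ceil(N / J)
    stride = N // J
    rng = random.Random(seed)
    remaining = list(range(N))
    thetas = []
    for i in range(1, n_s):                       # detections 1 .. n_s - 1
        if kind == "uniform":
            theta = [k * stride + i - 1 for k in range(J)]
        elif kind == "consecutive":
            theta = list(range((i - 1) * J, i * J))
        elif kind == "random":
            theta = sorted(rng.sample(remaining, J))
        else:
            raise ValueError("unknown selection type: " + kind)
        chosen = set(theta)
        remaining = [k for k in remaining if k not in chosen]
        thetas.append(theta)
    thetas.append(remaining)                      # Theta_{n_s}: all leftover indices
    return thetas

# Example with N = 20 and J = 6, so that J does not divide N.
uniform_sets = selections(20, 6, "uniform")
```

For N=20 and J=6 the first three sets have six indices each and the last one holds the remaining two, in line with \(|\Theta_{n_{s}}| = N - (n_{s}-1)J\).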

In what follows, we present a theoretical lower bound on λ min(K i ) for 1≤i≤n s , if Θ i is a random selection.

Theorem 4

In our CS-based cryptosystem with U 1=H, assume that an adversary chooses a random selection for Θ i in the ith detection of the SAMD, where \(1 \leq i \leq n_{s} = \lceil \frac {N}{J} \rceil \). Let \(I_{T} = \lceil \frac {N - c_{2} M\log M}{J} \rceil \) for a constant c 2>0. Then,

$$ \lambda_{\min}(\mathbf{K}_{i}) \geq \left\{ \begin{array}{ll} \left(\sqrt{N - iJ} - \sqrt{c_{1} M \log M} \right)^{2}, & \quad \text{if } i < I_{T}, \\ 0, & \quad \text{if } i \geq I_{T} \end{array} \right. $$
(18)

with high probability, where c 1 is a constant with 0<c 1<c 2.

Proof

See the Appendices. □
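For a feel of the quantities in Theorem 4, the sketch below computes \(\lambda_{\min}(\mathbf{K}_{i})\) from (15) for one random draw of Ω and a random selection, next to the bound (18). It assumes NumPy, a Sylvester-type Hadamard matrix for \(\mathbf{U}_{1}=\mathbf{H}\), the natural logarithm in (18), and the constants \(c_{1}=0.5\), \(c_{2}=1\) used later in Section 6.3. The toy sizes, the seed, and the function names are illustrative, and a single draw does not verify the theorem.

```python
import math
import numpy as np

def hadamard(n):
    """Sylvester construction of the 2^n x 2^n Hadamard matrix (entries +/-1)."""
    H = np.array([[1.0]])
    for _ in range(n):
        H = np.block([[H, H], [H, -H]])
    return H

def lambda_min_K(U1, omega, delta):
    """Minimum eigenvalue of K_i = B B^T with B = R_Omega U1 R_Delta^T, as in (15)."""
    B = U1[omega][:, delta]
    return float(np.linalg.eigvalsh(B @ B.T)[0])   # eigvalsh sorts ascending

def theorem4_bound(N, M, J, i, c1=0.5, c2=1.0):
    """Lower bound (18); the natural logarithm is an assumption here."""
    m_log_m = M * math.log(M)
    I_T = math.ceil((N - c2 * m_log_m) / J)
    if i >= I_T:
        return 0.0
    return (math.sqrt(N - i * J) - math.sqrt(c1 * m_log_m)) ** 2

# One random draw with toy sizes (illustrative assumptions).
rng = np.random.default_rng(1)
n, M, J = 8, 16, 32
N = 2 ** n
H = hadamard(n)
omega = np.sort(rng.choice(N, size=M, replace=False))
order = rng.permutation(N)            # random selection: Theta_i = order[(i-1)J : iJ]
n_s = math.ceil(N / J)
table = [(i, lambda_min_K(H, omega, order[i * J:]), theorem4_bound(N, M, J, i))
         for i in range(1, n_s)]      # rows of (i, lambda_min(K_i), bound)
```

Two sanity checks follow directly from the structure of \(\mathbf{K}_{i}\): with no index removed, \(\mathbf{K} = N \mathbf{I}\), and once fewer than M indices remain in \(\Delta_{i}\), the matrix is rank deficient, so its minimum eigenvalue is zero.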

The numerical results of Section 6 show that the lower bound also holds for uniform and consecutive selections. Using the bound, Corollary 3 presents a further upper bound on the success probability of the SAMD, which is straightforward from Theorems 3 and 4 with μ(H)=1.

Corollary 3

In our CS-based cryptosystem with U 1=H, if an adversary chooses a random selection for Θ i , 1≤i≤n s during the SAMD, P succ, UB in Theorem 3 is bounded by

$$\begin{aligned} {}P_{\mathrm{succ, UB}} \!\leq\! & \left(\!1\! - Q \left(\sqrt{\frac{M }{\sigma^{2}}} \right) \right)^{n_{s} - I_{T} + 1} \\ & \cdot \prod_{i=1}^{I_{T} -1}\!\! \left(\! 1\! -\! Q\! \left(\! \sqrt{\frac{M }{(\sqrt{N - iJ} \,-\,\! \sqrt{c_{1} M \log M})^{2}\,+\,\sigma^{2}}}\! \right) \right) \\ & \triangleq P_{\text{succ}, \mathrm{U}^{2}\mathrm{B}}, \end{aligned} $$

where \(I_{T} = \lceil \frac {N - c_{2} M\log M}{J} \rceil \) for constants c 1 and c 2 with 0<c 1<c 2.
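Corollary 3 can be evaluated directly from N, M, J, and \(\sigma^{2}\). The sketch below does so under the assumptions that the logarithm is natural and that \(c_{1}=0.5\), \(c_{2}=1\) (the values used in Section 6.3); the function names and the noise variance in the example are hypothetical.

```python
import math

def q_function(x):
    """Gaussian tail probability Q(x) = 0.5 * erfc(x / sqrt(2))."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def corollary3_bound(N, M, J, sigma2, c1=0.5, c2=1.0):
    """Evaluate the bound of Corollary 3 (natural logarithm assumed)."""
    n_s = math.ceil(N / J)
    m_log_m = M * math.log(M)
    I_T = math.ceil((N - c2 * m_log_m) / J)
    # Detections i >= I_T: the eigenvalue bound is 0, so only the noise remains.
    bound = (1.0 - q_function(math.sqrt(M / sigma2))) ** (n_s - I_T + 1)
    # Detections i < I_T: use the first branch of (18).
    for i in range(1, I_T):
        lam = (math.sqrt(N - i * J) - math.sqrt(c1 * m_log_m)) ** 2
        bound *= 1.0 - q_function(math.sqrt(M / (lam + sigma2)))
    return bound

# Illustrative evaluation for N = 1024, M = 48 and a hypothetical noise variance.
bounds = {J: corollary3_bound(1024, 48, J, sigma2=1.0) for J in (32, 64, 128)}
```

A larger J leaves fewer factors in the product, so the bound grows with J.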

6 Numerical results

This section presents numerical results to demonstrate the reliability and the security of our CS-based cryptosystem. In numerical experiments, each plaintext x has at most K nonzero entries, where the positions are chosen uniformly at random and the coefficients are taken from the Gaussian distribution. In CS encryption, \(\boldsymbol {\Phi }= \frac {1}{\sqrt {MN}} \mathbf {R}_{\Omega } \mathbf {U}_{1} \text {diag}(\mathbf {s}) \mathbf {U}_{2}\), where U 1=H, and U 2=W or D. Also, the secret keystream s is generated by the self-shrinking generator [46] of a 128-stage LFSR. For CS decryption, the CoSaMP recovery algorithm [71] has been employed for a legitimate recipient to decrypt each ciphertext with the knowledge of Φ.

6.1 CS decryption of a legitimate recipient

Figure 2 demonstrates the performance of CS decryption of a legitimate recipient, where the plaintext length is N=1024 and the ciphertext length is M=48. The figure sketches the normalized mean squared error (NMSE), defined by \(\text {NMSE} = {\mathbb {E}} \left [ \frac {||\mathbf {x} - \widehat {\mathbf {x}} ||^{2}}{||\mathbf {x}||^{2}} \right ]\), where x and \(\widehat {\mathbf {x}}\) are the original and decrypted plaintexts, respectively. We examine the performance with a total of 10,000 plaintexts at a given PNR, where each one has at most K=4 nonzero entries. For comparison, we sketch the performance of CS reconstruction with a random Gaussian sensing matrix for Φ. The figure shows that the performance of our CS decryption is as good as that of CS recovery with a random Gaussian sensing matrix. As a consequence, it demonstrates that our CS-based cryptosystem guarantees a reliable CS decryption for a legitimate recipient.
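In a simulation, the expectation in the NMSE is replaced by an average over trials. A minimal sketch of that empirical estimate, with a hypothetical function name and toy inputs, is:

```python
def nmse(originals, estimates):
    """Empirical NMSE: average of ||x - x_hat||^2 / ||x||^2 over all trials."""
    total = 0.0
    for x, x_hat in zip(originals, estimates):
        err = sum((a - b) ** 2 for a, b in zip(x, x_hat))
        energy = sum(a * a for a in x)
        total += err / energy
    return total / len(originals)

# Two toy trials: one perfect recovery and one with a single wrong entry.
value = nmse([[1.0, 0.0, 2.0], [3.0, 4.0]],
             [[1.0, 0.0, 2.0], [3.0, 3.0]])
```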

Fig. 2

The normalized mean squared error (NMSE) of CS decryption for a legitimate recipient

6.2 Indistinguishability

Figure 3 displays the upper and lower bounds of the TV distance over PNR with \(\mathbf {U}_{2} = \mathbf {W}\), where N=1024, M=48, and K=4. In the figure, the relative entropy of (3) and the Hellinger distance of (6) were computed using the covariance matrix of (19). Averaged over 10,000 pairs of randomly generated plaintexts \((\mathbf {x}_{1}, \mathbf {x}_{2})\) with at most K nonzero entries each, the relative entropy and the Hellinger distance yield the bounds of (4) and (7) on the TV distance, respectively. For comparison, we also sketch the theoretical upper bounds on the TV distance, which are obtained by the maximum relative entropy of (5) and the maximum Hellinger distance of (8), respectively. The figure shows that the TV distance approaches zero as the noise level grows, which implies that our CS-based cryptosystem can be indistinguishable at low PNR. As PNR increases, however, we observe that the upper and lower bounds increase and finally converge to certain levels, respectively. More extensive simulations agreed with the implication of Remark 1 that the CS-based cryptosystem will have lower TV distances with lower PNR, M, and K. We made similar observations of the TV distance when \(\mathbf {U}_{2} = \mathbf {D}\) and/or each plaintext has bipolar nonzero entries.

Fig. 3

The upper and lower bounds of total variation distance over PNR

Figure 4 depicts the upper bounds on the success probability of an adversary in the indistinguishability experiment, where the best- and worst-case upper bounds of (2) are from the minimum and maximum achievable TV distances of (7), respectively, obtained by the Hellinger distance (6). In the figure, \(\mathbf {U}_{2} = \mathbf {W}\) and PNR=25 dB. With a given ciphertext length M=48, the maximum sparsity is set as \(K = \left \lfloor c M / \log _{2}^{2} N \right \rfloor \) for each \(N = 2^{n}\), to ensure a reliable nonuniform CS decryption for a legitimate recipient, where c=8.5. For comparison, we sketch the empirical success probability of CS decryption by a legitimate recipient, where a decrypted plaintext has been declared a success if \(\frac {||\mathbf {x} - \widehat {\mathbf {x}} ||^{2}}{||\mathbf {x}||^{2}} < 10^{-2}\). The figure reveals that the adversary’s success probability approaches that of a random guess as the keystream length N increases, while a legitimate recipient maintains its reliability.

Fig. 4

The success probability of legitimate recipient and adversary for a given M

Figure 5 also displays the upper bounds on the success probability of an adversary in the indistinguishability experiment. This time, the ciphertext length is kept as \(M = \left \lceil c K \log _{2}^{2} N \right \rceil \) for each \(N = 2^{n}\) with a given K=4, where c=0.12. As in Fig. 4, it also reveals that the adversary’s success probability approaches 0.5 as the keystream length N increases, while a legitimate recipient maintains its reliability. In conclusion, the empirical results of Figs. 4 and 5 show that if the keystream length N is sufficiently large with low compression \(\left (\frac {M}{N} \right)\) and sparsity \(\left (\frac {K}{N} \right)\) ratios, our CS-based cryptosystem can be computationally secure in terms of the indistinguishability, while guaranteeing a reliable CS decryption for a legitimate recipient.
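The two parameter rules used for Figs. 4 and 5 can be tabulated as below. The sketch uses \(\log_{2} N = n\) for \(N = 2^{n}\) and exact rational arithmetic, so that the floor and ceiling are not disturbed by floating-point rounding; the function names are hypothetical.

```python
import math
from fractions import Fraction

def max_sparsity(M, n, c):
    """K = floor(c * M / (log2 N)^2) for N = 2^n, as used for Fig. 4."""
    return math.floor(Fraction(c) * M / (n * n))

def ciphertext_length(K, n, c):
    """M = ceil(c * K * (log2 N)^2) for N = 2^n, as used for Fig. 5."""
    return math.ceil(Fraction(c) * K * n * n)

# Constants from the text: c = 8.5 with M = 48 (Fig. 4), c = 0.12 with K = 4 (Fig. 5).
sparsity_by_n = {n: max_sparsity(48, n, Fraction("8.5")) for n in range(8, 13)}
length_by_n = {n: ciphertext_length(4, n, Fraction("0.12")) for n in range(8, 13)}
```

At n=10, both rules reproduce the operating point N=1024, M=48, K=4 used elsewhere in this section.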

Fig. 5

The success probability of legitimate recipient and adversary for a given K

6.3 Performance of SAMD

Figure 6 sketches the minimum eigenvalues of the covariance matrix K i of (15) at the ith detection for various J∈{32,48,64,80}, where N=1024 and M=48. For comparison, it also sketches the lower bound of Theorem 4, where c 1=0.5 and c 2=1. For each i, we tested with 100,000 pairs of (Ω,Θ i ) for random subsampling and selection operators R Ω and \(\mathbf {R}_{\Theta _{i}}\), where Θ i had been fixed through the tested pairs in case of uniform and consecutive selections. In each subfigure, λ min(K i ) is sketched over 1≤i≤n s −1, where \(n_{s} = \lceil \frac {N}{J} \rceil \). Figure 6 shows that if J increases, λ min(K i ) decreases faster over i, which suggests that the detection performance will be improved as J increases. It is plausible because if more keystream bits are detected from the ith detection with no estimation errors, more interfering components can be subtracted from the (i+1)th detection. In addition, it appears that the minimum eigenvalues are irrelevant to the types of Θ i , which means that an adversary may expect no benefits from a particular selection of Θ i . Finally, Fig. 6 demonstrates that the lower bound of Theorem 4 is valid, not only for random selection but also for uniform and consecutive selections.

Fig. 6

The minimum eigenvalues of K i at the ith detection of the SAMD

Figure 7 displays the upper bounds on the success probability of the SAMD for keystream recovery. For comparison, it also sketches the theoretical upper bound of Corollary 3 for random selection Θ i . In view of the adversary’s bounded computing power, we set J≤128, where the complexity of each detection in the SAMD will be \({\mathcal {O}}\left (2^{J}\right)\) by an exhaustive search. Since λ min(K i ) has similar values for different types of Θ i in Fig. 6, the upper bounds of Fig. 7 are also similar for every selection type. Moreover, the upper bounds increase over J, which follows from the sharp decline of λ min(K i ) over J observed in Fig. 6. However, even if an adversary chooses a large value of J, the upper bounds on the success probability are still significantly low, which implies that the SAMD has little prospect of restoring a correct N-bit keystream. Note that this is the result of an optimistic scenario, and in reality, the actual probability of success of the SAMD will be much lower than the upper bounds, due to estimation errors and their propagation through the SAMD.

Fig. 7

The upper bounds on the success probability of the SAMD

7 Conclusions

This paper has proposed a CS-based cryptosystem that encrypts a plaintext with a partial unitary sensing matrix embedding a secret keystream. We demonstrated that our CS-based cryptosystem can offer a theoretically and empirically reliable decryption performance for a legitimate recipient, which is the first contribution of this paper. Then, we examined the indistinguishability of our CS-based cryptosystem by studying the TV distance as a security measure. To investigate the TV distance, we developed upper bounds on the relative entropy and the Hellinger distance, respectively. From the second contribution, we showed that our CS-based cryptosystem can be computationally secure in terms of the indistinguishability, as long as the keystream length for each encryption is sufficiently large with low compression and sparsity ratios.

In addition, we considered a potential CPA from an adversary to recover the key of our CS-based cryptosystem. The computational security of our CS-based cryptosystem against the CPA is based on the mathematical hardness that no polynomial-time algorithm is known to find an ML solution to the underdetermined ILS problem for keystream recovery. As a sub-optimal approach, we introduced the SAMD for an adversary to restore a secret keystream in polynomial time. In the third contribution, we developed an upper bound on the success probability of the SAMD and demonstrated that the performance of the keystream recovery through the SAMD is very pessimistic. In conclusion, our CS-based cryptosystem with a partial unitary sensing matrix embedding a secret keystream can be secure against the CPA, while guaranteeing a stable and robust decryption for a legitimate recipient.

8 Endnotes

1 This paper assumes that a plaintext x is sparse in canonical basis, or Ψ=I. In general, if a plaintext x is sparse with respect to an arbitrary orthonormal basis Ψ, i.e., x=Ψ T θ, the sensing matrix A=Φ Ψ T maintains the form of (1) by considering U 2 Ψ T as a new unitary matrix U 2.

2 In the last detection, a \((N - (\lceil \frac {N}{J} \rceil - 1) J)\)-bit segment will be restored, where \(\lceil \frac {N}{J} \rceil \) denotes the smallest integer greater than or equal to \(\frac {N}{J}\).

3 Under this assumption, numerical results showed that the upper bound on the success probability of the successive detection is more favorable for an adversary than that of \(\widehat {\mathbf {x}}\) with arbitrary nonzero entries.

9 Appendices

9.1 Proof of Theorem 1

We give a brief sketch for the proof of Theorem 1, as the underlying technique is similar to that of Theorem 1 in [72]. Similar to Lemma 1 of [72], the covariance matrix of r is given by

$$ \mathbf{C}_{h} = {\mathbb{E}}\left[\mathbf{r} \mathbf{r}^{T} | \mathbf{x}_{h}\right] = \mathbf{R}_{\Omega} \widetilde{\mathbf{C}}_{h} \mathbf{R}_{\Omega}^{T} + \sigma^{2} \mathbf{I}, $$
(19)

where \( \widetilde {\mathbf {C}}_{h} = \frac {1}{N} \mathbf {U}_{1}^{T} \text {diag}\left (\frac {|\widehat {\mathbf {x}}_{h}|^{2} }{M} \right) \mathbf {U}_{1} \) for \(\widehat {\mathbf {x}}_{h} = \mathbf {U}_{2} \mathbf {x}_{h}\). Let \(\lambda _{1}(\mathbf {C}_{h}) \geq \cdots \geq \lambda _{M}(\mathbf {C}_{h})\) be the eigenvalues of \(\mathbf {C}_{h}\), and let \(\lambda _{1}(\widetilde {\mathbf {C}}_{h}) \geq \cdots \geq \lambda _{N} \left (\widetilde {\mathbf {C}}_{h}\right)\) be the eigenvalues of \(\widetilde {\mathbf {C}}_{h}\). With \(\widehat {\mathbf {x}}_{h} = \mathbf {U}_{2} \mathbf {x}_{h} = \left (\widehat {x}_{h,0}, \cdots, \widehat {x}_{h,N-1}\right)^{T}\), let \(\mathbf {v}_{h} = \left (v_{h,0}, \cdots, v_{h,N-1}\right)^{T}\), where \(v_{h,k} = |\widehat {x}_{h, \pi (k)}|^{2}\) for \(k = 0, \cdots, N-1\), and \(\pi (k)\) is a permutation such that \(v_{h,0} \geq \cdots \geq v_{h,N-1}\). From the definition of \(\widetilde {\mathbf {C}}_{h}\), it is clear that \(\lambda _{t}\left (\widetilde {\mathbf {C}}_{h}\right) = \frac {v_{h, t-1}}{M} \ge 0\) for \(t = 1, \cdots, N\).

In (19), \(\widehat {\mathbf {C}}_{h} = \mathbf {R}_{\Omega } \widetilde {\mathbf {C}}_{h} \mathbf {R}_{\Omega }^{T} \) is an M×M principal submatrix of \(\widetilde {\mathbf {C}}_{h} \), where successive application of the interlacing inequality [73] leads to \( \lambda _{t+N-M} \left (\widetilde {\mathbf {C}}_{h}\right) \leq \lambda _{t} \left (\widehat {\mathbf {C}}_{h}\right) \leq \lambda _{t} \left (\widetilde {\mathbf {C}}_{h}\right)\) for 1≤t≤M. Thus, \( \underset {h}{\min } \ \underset {\mathbf {x}_{h}}{\min } \ \lambda _{M} \left (\widehat {\mathbf {C}}_{h}\right) = \underset {h}{\min } \ \underset {\mathbf {x}_{h}}{\min } \ \lambda _{N} \left (\widetilde {\mathbf {C}}_{h}\right) = 0\) from v h,N−1≥0. On the other hand, \( \underset {h}{\max } \ \underset {\mathbf {x}_{h}}{ \max } \ \lambda _{1} \left (\widehat {\mathbf {C}}_{h}\right) = \underset {h}{\max } \ \underset {\mathbf {x}_{h}}{ \max } \ \lambda _{1} \left (\widetilde {\mathbf {C}}_{h}\right) = \underset {h}{\max } \ \underset {\mathbf {x}_{h}}{ \max } \ \frac {v_{h, 0}}{M}\). By the Cauchy-Schwarz inequality, we obtain \( \frac {v_{h, 0}}{M} = \frac {|\widehat {x}_{h, \pi (0)}|^{2}}{M} = \frac {1}{M} \left | \sum _{k \in \mathcal {S}} x_{h, k} \mathbf {U}_{2}(\pi (0), k) \right |^{2} \leq \frac {K \mu ^{2} (\mathbf {U}_{2}) \cdot \mathcal {E}_{x}}{M}\), where \(\mathcal {S}\) is the set of nonzero entries of x h with \(|\mathcal {S}| \leq K\). As \(\lambda _{t} (\mathbf {C}_{h}) = \lambda _{t} \left (\widehat {\mathbf {C}}_{h}\right) + \sigma ^{2}\) from \(\mathbf {C}_{h} = \widehat {\mathbf {C}}_{h} + \sigma ^{2} \mathbf {I}\), we have

$$ \begin{aligned} \lambda_{\min} & = \underset{h}{\min} \ \underset{\mathbf{x}_{h}}{\min} \ \lambda_{M} (\mathbf{C}_{h}) = \sigma^{2}, \\ \lambda_{\max} & = \underset{h}{\max} \ \underset{\mathbf{x}_{h}}{\max} \ \lambda_{1} (\mathbf{C}_{h}) = \frac{K \mu^{2} (\mathbf{U}_{2}) \cdot \mathcal{E}_{x}}{M} + \sigma^{2}, \end{aligned} $$
(20)

where h=1 or 2.

Meanwhile, the upper bound on \(\text {tr} \left (\mathbf {C}_{2}^{-1} \mathbf {C}_{1} \right)\) in Lemma 3 of [72] yields

$$ {}\begin{aligned} D(p_{1} || p_{2}) & \leq \frac{1}{2} \sum\limits_{t=1}^{M} \left(\log \frac{\lambda_{M+1-t}(\mathbf{C}_{2})}{\lambda_{t}(\mathbf{C}_{1})} + \frac{\lambda_{t}(\mathbf{C}_{1})}{\lambda_{M+1-t}(\mathbf{C}_{2})} - 1 \right) \\ & = \frac{1}{2} \sum\limits_{t=1}^{M} \, f(z_{t}), \end{aligned} $$

where f(z)=z− logz−1 and \(z_{t} = \frac {\lambda _{t} (\mathbf {C}_{1})}{ \lambda _{M+1-t} (\mathbf {C}_{2})} > 0\). With λ min and λ max in (20), define \( \tau = \frac {\lambda _{\max }}{\lambda _{\min }} = \frac {K \mu ^{2} (\mathbf {U}_{2}) \mathcal {E}_{x}}{M \sigma ^{2}} + 1 > 1\). Similar to the proof of Theorem 1 in [72], \(D(p_{1} || p_{2}) \leq \frac {M}{2} f(\tau)\), which yields (5).
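The final bound \(D(p_{1} || p_{2}) \leq \frac {M}{2} f(\tau)\) is easy to evaluate numerically. The sketch below does so from the definition of τ given above, with the natural logarithm in f; the function names and the parameter values in the example, including the value of \(\mu(\mathbf{U}_{2})\), are hypothetical.

```python
import math

def f(z):
    """f(z) = z - log(z) - 1, which is nonnegative and zero only at z = 1."""
    return z - math.log(z) - 1.0

def relative_entropy_bound(M, K, mu, energy, sigma2):
    """Evaluate (M / 2) * f(tau) with tau = K mu^2 E_x / (M sigma^2) + 1."""
    tau = K * mu ** 2 * energy / (M * sigma2) + 1.0
    return 0.5 * M * f(tau)

# Illustrative values: M = 48, K = 4, unit plaintext energy, two noise levels.
quiet = relative_entropy_bound(48, 4, 1.0, 1.0, sigma2=0.01)
noisy = relative_entropy_bound(48, 4, 1.0, 1.0, sigma2=10.0)
```

Raising the noise variance pushes τ toward 1 and the bound toward zero, which matches the behavior at low PNR described in Section 6.2.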

9.2 Proof of Theorem 2

We use definitions and notations in the proof of Theorem 1. Let λ 1(C 3)≥⋯≥λ M (C 3) be the eigenvalues of \(\mathbf {C}_{3} = \frac {\mathbf {C}_{1} + \mathbf {C}_{2}}{2}\). Clearly, the eigenvalues of C 1, C 2, and C 3 are positive by (20) and the Weyl inequality [73]. In (6), let \( \Gamma = \frac {|\mathbf {C}_{1}|^{\frac {1}{2}} |\mathbf {C}_{2}|^{\frac {1}{2}}}{|\mathbf {C}_{3}|} \triangleq \frac {\Gamma _{n}}{\Gamma _{d}}\). Then,

$$ \begin{aligned} \Gamma_{d} = \prod_{t=1}^{M} \lambda_{t} (\mathbf{C}_{3}) &\leq \left(\frac{\sum_{t=1}^{M} \lambda_{t} (\mathbf{C}_{3})}{M} \right)^{M} = \left(\frac{\text{tr} (\mathbf{C}_{3})}{M} \right)^{M}\\& = \left(\frac{\text{tr} (\mathbf{C}_{1}) + \text{tr} (\mathbf{C}_{2})}{2M} \right)^{M}, \end{aligned} $$
(21)

where the inequality is from the arithmetic mean-geometric mean inequality. For h=1 or 2, the tth diagonal entry of \(\widetilde {\mathbf {C}}_{h} = \frac {1}{N} \mathbf {U}_{1}^{T} \text {diag}\left (\frac {|\widehat {\mathbf {x}}_{h}|^{2} }{M} \right) \mathbf {U}_{1} \) is given by \(\frac {1}{MN} \sum _{k=0}^{N-1} |\widehat {x}_{h,k}|^{2} \mathbf {U}_{1}^{2} (k, t) = \frac {1}{MN} || \widehat {\mathbf {x}}_{h} ||^{2} = \frac {1}{M} ||\mathbf {x}_{h}||^{2} = \frac {\mathcal {E}_{x}}{M},\) where \(\mathbf {U}_{1}^{2} (k, t) = 1\) for 0≤t≤N−1. Note that \(\widehat {\mathbf {C}}_{h} = \mathbf {R}_{\Omega } \widetilde {\mathbf {C}}_{h} \mathbf {R}_{\Omega }^{T}\) has the same diagonal entry of \(\widetilde {\mathbf {C}}_{h}\). Thus, from \(\mathbf {C}_{h} = \widehat {\mathbf {C}}_{h} + \sigma ^{2} \mathbf {I}\), we have

$$ \text{tr}(\mathbf{C}_{h}) = \text{tr}(\widehat{\mathbf{C}}_{h}) + M \sigma^{2} = \mathcal{E}_{x} + M \sigma^{2}, $$
(22)

where (21) becomes

$$ \Gamma_{d} \leq \left(\frac{\mathcal{E}_{x}}{M} + \sigma^{2} \right)^{M}. $$
(23)

In Γ n , the geometric mean-harmonic mean inequality yields

$$ |\mathbf{C}_{h}|^{\frac{1}{2}} = \left(\prod_{t=1}^{M} \lambda_{t} (\mathbf{C}_{h}) \right)^{\frac{1}{2}} \ge \left(\frac{1}{\frac{1}{M} \sum_{t=1}^{M} \lambda_{t}^{-1} (\mathbf{C}_{h})} \right)^{\frac{M}{2}}, $$
(24)

where h=1 or 2. By the Kantorovich inequality [74],

$$ \begin{aligned} \frac{1}{M} \sum\limits_{t=1}^{M} \lambda_{t}^{-1} (\mathbf{C}_{h}) & \leq \frac{M}{4 \ \text{tr}(\mathbf{C}_{h})} \left(\frac{\lambda_{1} (\mathbf{C}_{h})}{\lambda_{M} (\mathbf{C}_{h})} + \frac{\lambda_{M} (\mathbf{C}_{h})}{\lambda_{1} (\mathbf{C}_{h})} \!+ 2 \right) \\ & = \frac{M}{4 \ \text{tr}(\mathbf{C}_{h})} \left(\frac{\lambda_{\max}}{\lambda_{\min}} + \frac{\lambda_{\min}}{\lambda_{\max}} + 2 \right) \\ & = \frac{M}{4 \ \text{tr}(\mathbf{C}_{h})} \left(\tau + \frac{1}{\tau} + 2 \right), \end{aligned} $$
(25)

where \(\lambda_{1}(\mathbf{C}_{h})\) and \(\lambda_{M}(\mathbf{C}_{h})\) have been replaced by \(\lambda_{\max}\) and \(\lambda_{\min}\) of (20), respectively. In (25), \(\tau = \frac{\lambda_{\max}}{\lambda_{\min}} = \frac{K \mu^{2}(\mathbf{U}_{2}) \cdot \mathcal{E}_{x}}{M \sigma^{2}} + 1 = K \mu^{2}(\mathbf{U}_{2}) \cdot \text{PNR} + 1\). By (22), (24), and (25),

$$ \Gamma_{n} \geq \left(\frac{4 \sqrt{\text{tr}(\mathbf{C}_{1}) \cdot \text{tr}(\mathbf{C}_{2})} }{M(\tau + \frac{1}{\tau} + 2)} \right)^{M} = \left(\frac{4 \left(\frac{\mathcal{E}_{x}}{M} + \sigma^{2} \right) }{\tau + \frac{1}{\tau} + 2} \right)^{M}. $$
(26)
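As a sanity check on (25), the sketch below compares both sides of the Kantorovich bound for an arbitrary positive spectrum. It is an illustration under assumptions: the eigenvalues are random placeholders, the condition number is taken from the exact extreme eigenvalues rather than from the bounds of (20), and the function names are hypothetical.

```python
import math
import random


def kantorovich_rhs(eigs):
    """Right-hand side of (25) for a list of positive eigenvalues."""
    m = len(eigs)
    tau = max(eigs) / min(eigs)  # exact condition number of the spectrum
    return m / (4.0 * sum(eigs)) * (tau + 1.0 / tau + 2.0)


def mean_inverse(eigs):
    """Left-hand side of (25): (1/M) * sum of 1/lambda_t."""
    return sum(1.0 / lam for lam in eigs) / len(eigs)


def tau_from_pnr(k, mu2, pnr):
    """tau = K * mu^2(U_2) * PNR + 1, as stated after (25)."""
    return k * mu2 * pnr + 1.0


# Illustrative spectrum; any list of positive numbers can be used here.
random.seed(2)
eigs = [0.1 + 3.0 * random.random() for _ in range(16)]
print(mean_inverse(eigs) <= kantorovich_rhs(eigs))
```

Because \(\tau + \frac{1}{\tau}\) is increasing for \(\tau \ge 1\), replacing the exact condition number by an upper bound, as done with (20), can only enlarge the right-hand side.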

Combining the bound (23) on \(\Gamma_{d}\) with the bound (26) on \(\Gamma_{n}\) yields

$$ \begin{aligned} \Gamma = \frac{\Gamma_{n}}{\Gamma_{d}} &\geq \frac{\left(\frac{4 \left(\frac{\mathcal{E}_{x}}{M} + \sigma^{2} \right) }{\tau + \frac{1}{\tau} + 2} \right)^{M}} {\left(\frac{\mathcal{E}_{x}}{M} + \sigma^{2} \right)^{M}} = \left(\frac{2\sqrt{\tau} }{\tau + 1} \right)^{\frac{M}{2}} \\&= \left(\frac{2\sqrt{K \mu^{2} (\mathbf{U}_{2})\cdot \text{PNR}+1} }{ K \mu^{2} (\mathbf{U}_{2}) \cdot \text{PNR}+ 2} \right)^{\frac{M}{2}}. \end{aligned} $$

Finally, the proof is completed by \(d_{\mathrm {H}} (p_{1}, p_{2}) = \sqrt {1 - \Gamma ^{\frac {1}{2}}}\).

9.3 Proof of Theorem 3

In (15), \(\mathbf{K}_{i}\) is the Gram matrix \(\mathbf{K}_{i} = \mathbf{A}_{i}^{T} \mathbf{A}_{i}\) with \(\mathbf{A}_{i} = \mathbf{R}_{\Delta_{i}} \mathbf{U}_{1}^{T} \mathbf{R}_{\Omega}^{T}\) for \(1 \le i \le n_{s} - 1\), where \(\lambda_{\min}(\mathbf{K}_{i}) \ge 0\), since \(\mathbf{K}_{i}\) is positive semi-definite [73]. Let \(\mathbf{s}_{\Theta_{i}}\) and \(\mathbf{s}_{\Theta_{i}}'\) be a pair of correct and wrong \(J\)-bit segments from a keystream \(\mathbf{s}\) at the index set \(\Theta_{i}\), respectively. From (13), \({\mathbb{E}}\left[\mathbf{r}_{i} | \mathbf{s}_{\Theta_{i}}\right] = \mathbf{m}_{i} = \mathbf{R}_{\Omega} \mathbf{U}_{1} \mathbf{R}_{\Theta_{i}}^{T} \mathbf{s}_{\Theta_{i}}\) and \({\mathbb{E}}\left[\mathbf{r}_{i} | \mathbf{s}_{\Theta_{i}}'\right] = \mathbf{m}_{i}' = \mathbf{R}_{\Omega} \mathbf{U}_{1} \mathbf{R}_{\Theta_{i}}^{T} \mathbf{s}_{\Theta_{i}}'\), respectively. Also, (14) yields \({\mathbb{E}}\left[\left(\mathbf{r}_{i} - \mathbf{m}_{i}\right)\left(\mathbf{r}_{i} - \mathbf{m}_{i}\right)^{T} | \mathbf{s}_{\Theta_{i}}\right] = {\mathbb{E}}\left[\left(\mathbf{r}_{i} - \mathbf{m}_{i}'\right)\left(\mathbf{r}_{i} - \mathbf{m}_{i}'\right)^{T} | \mathbf{s}_{\Theta_{i}}'\right] = \mathbf{K}_{i} + \sigma^{2} \mathbf{I}\). Assuming that \(\mathbf{r}_{i}\) is a Gaussian random vector, the binary hypothesis detection of Section 3.2 in [70] reveals that the pairwise error probability that \(\mathbf{s}_{\Theta_{i}}'\) is detected in place of \(\mathbf{s}_{\Theta_{i}}\) by the \(i\)th detection satisfies

$$ \begin{aligned} {}\text{Pr}\left[\!\mathbf{s}_{\Theta_{i}} \!\rightarrow\! \left. \mathbf{s}_{\Theta_{i}} ' \right| \mathbf{s}_{\Theta_{i}}, \mathbf{s}_{\Theta_{i}} ' \right] & \geq Q \left(\frac{|| \mathbf{m}_{i} - \mathbf{m}_{i} '||}{2 \sqrt{\lambda_{\min}(\mathbf{K}_{i})+\sigma^{2}}} \right) \\ & = Q \left(\frac{|| \mathbf{R}_{\Omega} \mathbf{U}_{1} \mathbf{R}_{\Theta_{i}}^{T} \left(\mathbf{s}_{\Theta_{i}} - \mathbf{s}_{\Theta_{i}}'\right)||} {2 \sqrt{\lambda_{\min}(\mathbf{K}_{i})+\sigma^{2}}} \right)\!. \end{aligned} $$
(27)

We assume that the pairwise error event occurs only for a specific \(\mathbf{s}_{\Theta_{i}}'\), which is closest to \(\mathbf{s}_{\Theta_{i}}\), and ignore all the other \(\mathbf{s}_{\Theta_{i}}'\). In other words, we take into account only a single \(\mathbf{s}_{\Theta_{i}}'\) for which \(\mathbf{s}_{\Theta_{i}} - \mathbf{s}_{\Theta_{i}}'\) has exactly one nonzero entry (+2 or −2), or equivalently \(||\mathbf{s}_{\Theta_{i}} - \mathbf{s}_{\Theta_{i}}'|| = 2\) for a given \(\mathbf{s}_{\Theta_{i}}\). This assumption, similar to the one in [75], is favorable for an adversary, since every other error event is ignored. From (27), the error probability under the assumption is given by

$$ {}\begin{aligned} P_{e}^{(i)} & \,=\, \sum_{\mathbf{s}_{\Theta_{i}}}\! \text{Pr} \left[\! \mathbf{s}_{\Theta_{i}}\right] \!\cdot\! \sum_{\mathbf{s}_{\Theta_{i}} ' } \text{Pr}\left[\! \mathbf{s}_{\Theta_{i}} \rightarrow\! \mathbf{s}_{\Theta_{i}} '\! \mid\! \mathbf{s}_{\Theta_{i}}, \mathbf{s}_{\Theta_{i}} ' \right] \!\cdot\! \text{Pr} \left[ \mathbf{s}_{\Theta_{i}}' \mid \mathbf{s}_{\Theta_{i}} \right] \\ & =\! \sum_{\mathbf{s}_{\Theta_{i}}}\! \text{Pr} \left[ \mathbf{s}_{\Theta_{i}}\right] \!\cdot\! \text{Pr}\left[ \mathbf{s}_{\Theta_{i}} \!\rightarrow\! \mathbf{s}_{\Theta_{i}} ' \mid \mathbf{s}_{\Theta_{i}}, \mathbf{s}_{\Theta_{i}} ', ||\mathbf{s}_{\Theta_{i}} - \mathbf{s}_{\Theta_{i}} ' || \,=\, 2\! \right] \\ & =\! \text{Pr}\left[ \mathbf{s}_{\Theta_{i}} \rightarrow \mathbf{s}_{\Theta_{i}} ' \mid \mathbf{s}_{\Theta_{i}}, \mathbf{s}_{\Theta_{i}} ', ||\mathbf{s}_{\Theta_{i}} - \mathbf{s}_{\Theta_{i}} ' || = 2 \right] \\ & \geq Q \left(\frac{\sqrt{\sum_{k=0}^{M-1} 4 \left| \mathbf{U}_{1} \left(\omega_{k}, \theta_{i, \tau}\right) \right|^{2} }}{2 \sqrt{\lambda_{\min}(\mathbf{K}_{i})+\sigma^{2}}} \right) \\ & = Q \left(\sqrt{\frac{M \mu^{2}(\mathbf{U}_{1})}{\lambda_{\min}(\mathbf{K}_{i})+\sigma^{2}}} \right), \end{aligned} $$
(28)

where \(\omega_{k} \in \Omega\) and \(\theta_{i,\tau} \in \Theta_{i}\). In (28), we assumed that \(\mathbf{s}_{\Theta_{i}}\) and \(\mathbf{s}_{\Theta_{i}}'\) differ only at the position corresponding to the column index \(\theta_{i,\tau}\) of \(\mathbf{U}_{1}\). Note that \(P_{e}^{(i)}\) is derived under the assumption that all the estimates from the previous \(i-1\) detections have been subtracted with no errors to yield \(\mathbf{r}_{i}\) of (12). Then, the success probability of the \(i\)th detection is

$$ \begin{aligned} P_{s}^{(i)} & = \text{Pr} \left[ \widehat{\mathbf{s}}_{\Theta_{i}} = \mathbf{s}_{\Theta_{i}} \mid \widehat{\mathbf{s}}_{\Theta_{1}} = \mathbf{s}_{\Theta_{1}}, \cdots, \widehat{\mathbf{s}}_{\Theta_{i-1}} = \mathbf{s}_{\Theta_{i-1}} \right] \\ & = 1 - P_{e}^{(i)} \leq 1 - Q \left(\sqrt{\frac{M \mu^{2}(\mathbf{U}_{1})}{\lambda_{\min}(\mathbf{K}_{i})+\sigma^{2}}} \right), \end{aligned} $$
(29)

where \(1 \le i \le n_{s}\). To restore a correct \(N\)-bit keystream, all the component detections must be successful. Thus, the success probability of the SAMD is

$$ {}\begin{aligned} P_{\text{succ}} &= \text{Pr} \left[\widehat{\mathbf{s}}_{\Theta_{1}} = \mathbf{s}_{\Theta_{1}}, \cdots, \widehat{\mathbf{s}}_{\Theta_{n_{s}}} = \mathbf{s}_{\Theta_{n_{s}}} \right] \\ & = \prod_{i=1}^{n_{s}} \text{Pr} \left[ \widehat{\mathbf{s}}_{\Theta_{i}} = \mathbf{s}_{\Theta_{i}} \mid \widehat{\mathbf{s}}_{\Theta_{1}} = \mathbf{s}_{\Theta_{1}}, \cdots, \widehat{\mathbf{s}}_{\Theta_{i-1}} = \mathbf{s}_{\Theta_{i-1}} \right] \\ & = \prod_{i=1}^{n_{s}} P_{s}^{(i)}. \end{aligned} $$
(30)

Finally, we obtain the upper bound of (17) by combining (29) and (30), which completes the proof.
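The chain (28) to (30) can be evaluated numerically once values of \(\lambda_{\min}(\mathbf{K}_{i})\) are available. The sketch below is a hedged illustration: the list `lam_mins` and the parameter values are hypothetical placeholders rather than values from the paper, and the function names are assumptions introduced here.

```python
import math


def q_function(x):
    """Gaussian tail probability Q(x) = P[N(0, 1) > x]."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def error_lower_bound(m, mu2, lam_min, sigma2):
    """Lower bound on P_e^(i) from (28)."""
    return q_function(math.sqrt(m * mu2 / (lam_min + sigma2)))


def success_upper_bound(m, mu2, lam_mins, sigma2):
    """Product over i of (1 - lower bound on P_e^(i)), from (29) and (30)."""
    bound = 1.0
    for lam_min in lam_mins:
        bound *= 1.0 - error_lower_bound(m, mu2, lam_min, sigma2)
    return bound


# Hypothetical setting: M = 16, mu^2(U_1) = 1, sigma^2 = 1, and placeholder
# values of lambda_min(K_i) that shrink to 0 over eight detections.
lam_mins = [400.0, 300.0, 200.0, 100.0, 50.0, 10.0, 0.0, 0.0]
print(success_upper_bound(16, 1.0, lam_mins, 1.0))
```

Every factor is strictly less than 1, so the upper bound on the success probability can only decrease as more detections are chained.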

9.4 Proof of Theorem 4

In (15), let \(\mathbf{A}_{i} = \mathbf{R}_{\Delta_{i}} \mathbf{H}^{T} \mathbf{R}_{\Omega}^{T}\) with \(\mathbf{U}_{1} = \mathbf{H}\). Then, the singular values of \(\mathbf{A}_{i}\) are equal to the square roots of the eigenvalues of \(\mathbf{K}_{i} = \mathbf{A}_{i}^{T} \mathbf{A}_{i}\), where \(\lambda_{\min}(\mathbf{K}_{i}) \ge 0\) for all \(i\). In other words, if \(\sigma_{\min}(\mathbf{A}_{i})\) denotes the minimum singular value of \(\mathbf{A}_{i}\), then \(\lambda_{\min}(\mathbf{K}_{i}) = \sigma_{\min}^{2}(\mathbf{A}_{i})\).

To examine \(\sigma_{\min}(\mathbf{A}_{i})\) for \(1 \le i \le n_{s} - 1\), we first define \(\mathbf{B}_{i} = \mathbf{H}^{T} \mathbf{R}_{\Omega}^{T}\). Then, \(\mathbf{B}_{i}\) is an \(N \times M\) matrix satisfying \(\mathbf{B}_{i}^{T} \mathbf{B}_{i} = \mathbf{R}_{\Omega} \mathbf{H} \cdot \mathbf{H}^{T} \mathbf{R}_{\Omega}^{T} = N \mathbf{I}\), which means that the columns of \(\mathbf{B}_{i}\) are mutually orthogonal. Also, it is clear that the \(l_{2}\)-norm of each row of \(\mathbf{B}_{i}\) is \(\sqrt{M}\), since each entry of \(\mathbf{B}_{i}\) is ±1. If \(\Theta_{i}\) is a random selection, so is \(\Delta_{i}\), and \(\mathbf{A}_{i} = \mathbf{R}_{\Delta_{i}} \mathbf{B}_{i}\) is an \((N - iJ) \times M\) matrix obtained by randomly subsampling \(N - iJ\) rows of \(\mathbf{B}_{i}\), with the selected row indices specified by \(\Delta_{i}\). For such a matrix \(\mathbf{A}_{i}\), Corollary 5.55 of [4] shows that for every \(t \ge 0\),

$$ \sigma_{\min} (\mathbf{A}_{i}) \geq \sqrt{N-iJ} - t \sqrt{M} $$
(31)

with probability at least \(1-2M e^{-ct^{2}}\) for a constant \(c > 0\). The corollary assumes that \(t \geq \sqrt{c_{1} \log M}\) and \(N - iJ > c_{2} M \log M\) for the bound to be nontrivial and nonnegative, where \(0 < c_{1} < c_{2}\). Thus, the bound of (31) is valid only for \(i < \lceil \frac{N - c_{2} M \log M}{J} \rceil = I_{T}\), and we use the trivial bound \(\sigma_{\min}(\mathbf{A}_{i}) \ge 0\) if \(i \ge I_{T}\), which gives the bound of (18) from \(\lambda_{\min}(\mathbf{K}_{i}) = \sigma_{\min}^{2}(\mathbf{A}_{i})\).
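The structural facts used above, \(\mathbf{B}_{i}^{T} \mathbf{B}_{i} = N \mathbf{I}\) and rows of norm \(\sqrt{M}\), can be verified on a small example. The sketch below is illustrative only and rests on several assumptions: it takes \(\mathbf{H}\) to be a Sylvester-type Hadamard matrix, uses small hypothetical sizes, and clamps the right-hand side of (31) at zero instead of using the threshold \(I_{T}\), whose constant \(c_{2}\) is not specified. Note that (31) holds only with the stated probability, so `lambda_min_lower_bound` is not a deterministic guarantee.

```python
import math
import random


def sylvester_hadamard(n):
    """Return the n x n Sylvester Hadamard matrix (n must be a power of 2)."""
    h = [[1]]
    while len(h) < n:
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    return h


def build_b(h, omega):
    """B = H^T R_Omega^T: column j of B is row omega[j] of H."""
    return [[h[w][k] for w in omega] for k in range(len(h))]


def gram(b):
    """Return B^T B for a matrix stored as a list of rows."""
    m = len(b[0])
    return [[sum(r[p] * r[q] for r in b) for q in range(m)] for p in range(m)]


def lambda_min_lower_bound(n, m, j, i, t):
    """Square of the right-hand side of (31), clamped at zero when negative."""
    sigma = math.sqrt(n - i * j) - t * math.sqrt(m)
    return max(sigma, 0.0) ** 2


# Small hypothetical sizes: N = 16 and M = 4 rows of H chosen at random.
random.seed(3)
N, M = 16, 4
omega = sorted(random.sample(range(N), M))
B = build_b(sylvester_hadamard(N), omega)
print(gram(B) == [[N if p == q else 0 for q in range(M)] for p in range(M)])
print(all(sum(v * v for v in row) == M for row in B))
```

For instance, `lambda_min_lower_bound(1024, 16, 64, 0, 2.0)` evaluates \((\sqrt{1024} - 2\sqrt{16})^{2}\), and the bound collapses to zero once \(N - iJ\) falls to \(t^{2} M\) or below.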

References

  1. DL Donoho, Compressed sensing. IEEE Trans. Inf. Theory. 52(4), 1289–1306 (2006).

  2. EJ Candes, J Romberg, T Tao, Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information. IEEE Trans. Inf. Theory. 52(2), 489–509 (2006).

  3. EJ Candes, T Tao, Near-optimal signal recovery from random projections: universal encoding strategies. IEEE Trans. Inf. Theory. 52(12), 5406–5425 (2006).

  4. YC Eldar, G Kutyniok, Compressed sensing—theory and applications (Cambridge University Press, Cambridge, 2012).

  5. J Tropp, JN Laska, M Duarte, J Romberg, RG Baraniuk, Beyond Nyquist: efficient sampling of sparse bandlimited signals. IEEE Trans. Inf. Theory. 56(1), 520–544 (2010).

  6. M Mishali, YC Eldar, From theory to practice: sub-Nyquist sampling of sparse wideband analog signals. IEEE J. Select Top. Sig. Process. 4(2), 375–391 (2010).

  7. J Haupt, W Bajwa, G Raz, R Nowak, Toeplitz compressed sensing matrices with applications to sparse channel estimation. IEEE Trans. Inf. Theory. 56(11), 5862–5875 (2010).

  8. MF Duarte, S Sarvotham, D Baron, MB Wakin, RG Baraniuk, in Asilomar Conf. on Signals, Systems and Computers. Distributed compressed sensing of jointly sparse signals (Pacific Grove, 2005), pp. 1537–1541.

  9. J Haupt, W Bajwa, M Rabbat, R Nowak, Compressed sensing for networked data. IEEE Sig. Process. Mag. 25(2), 92–101 (2008).

  10. C Caione, D Brunelli, L Benini, Compressive sensing optimization for signal ensembles in WSNs. IEEE Trans. Ind. Inform. 10(1), 382–392 (2014).

  11. M Duarte, M Davenport, D Takhar, JN Laska, T Sun, KF Kelly, RG Baraniuk, Single-pixel imaging via compressive sampling. IEEE Sig. Process. Mag. 25(2), 83–91 (2008).

  12. R Marcia, Z Harmany, R Willett, in Proc. IS&T/SPIE Symp. Elec. Imag.: Comp. Imag. Compressive coded aperture imaging (San Jose, 2009).

  13. M Lustig, D Donoho, J Pauly, in Proc. Ann. Meeting of ISMRM. Rapid MR imaging with compressed sensing and randomly under-sampled 3DFT trajectories (Seattle, 2006).

  14. S Gogineni, A Nehorai, Target estimation using sparse modeling for distributed MIMO radar. IEEE Trans. Signal Process. 59(11), 5315–5325 (2011).

  15. Y Rachlin, D Baron, in Proc. 46th Annu. Allerton Conf. Commun., Control, Comput. The secrecy of compressed sensing measurements, (2008), pp. 813–817.

  16. A Orsdemir, HO Altun, G Sharma, MF Bocko, in Proc. IEEE Military Commun. Conf. (MILCOM). On the security and robustness of encryption via compressed sensing, (2008), pp. 1–7.

  17. Y Zhang, LY Zhang, J Zhou, L Liu, F Chen, X He, A review of compressive sensing in information security field. IEEE Access Spec. Sect. Green Commun. Netw. 5G Wirel. 4, 2507–2519 (2016).

  18. AL Gibbs, FE Su, On choosing and bounding probability metrics. Int Stat Rev. 70(3), 419–435 (2002).

  19. S Goldwasser, S Micali, Probabilistic encryption. J. Comput. Syst. Sci. 28, 270–299 (1984).

  20. TM Cover, JA Thomas, Elements of information theory (Wiley & Sons, Inc., Hoboken, 2006).

  21. L Le Cam, Asymptotic methods in statistical decision theory (Springer-Verlag, New York, 1986).

  22. T Bianchi, V Bioglio, E Magli, in Proc. IEEE Int. Conf. Acoust. Speech Signal Process (ICASSP). On the security of random linear measurements, (2014), pp. 3992–3996.

  23. T Bianchi, V Bioglio, E Magli, Analysis of one-time random projections for privacy preserving compressed sensing. IEEE Trans. Inf. Forens. Sec. 11(2), 313–327 (2016).

  24. T Bianchi, E Magli, in IEEE Workshop on Information Forensics and Security (WIFS). Analysis of the security of compressed sensing with circulant matrices, (2014), pp. 1–6.

  25. V Cambareri, M Mangia, F Pareschi, R Rovatti, G Setti, Low complexity multiclass encryption by compressed sensing. IEEE Trans. Signal Process. 63(9), 2183–2195 (2015).

  26. V Cambareri, M Mangia, F Pareschi, R Rovatti, G Setti, On known-plaintext attacks to a compressed sensing-based encryption: a quantitative analysis. IEEE Trans. Inf. Forensics Secur. 10(10), 2182–2195 (2015).

  27. Y Zhang, J Zhou, F Chen, LY Zhang, K-W Wong, X He, Embedding cryptographic features in compressive sensing. Neurocomputing. 205, 472–480 (2016).

  28. L Zeng, X Zhang, L Chen, Z Fan, Y Wang, Scrambling-based speech encryption via compressed sensing. EURASIP J. Adv. Signal Process. 2012, 257 (2012).

  29. LY Zhang, K-W Wong, Y Zhang, Q Lin, Joint quantization and diffusion for compressed sensing measurements of natural images, (2015).

  30. LY Zhang, K-W Wong, Y Zhang, J Zhou, Bi-level protected compressive sampling. IEEE Trans. Multimed. 18(9), 1720–1732 (2016).

  31. M Bloch, J Barros, Physical-layer security—from information theory to security engineering (Cambridge University Press, 2011).

  32. Y Zou, J Zhu, X Wang, L Hanzo, A survey on wireless security: technical challenges, recent advances, and future trends. Proc. IEEE. 104(9), 1727–1765 (2016).

  33. S Agrawal, S Vishwanath, in Proc. IEEE Inf. Theory Workshop (ITW). Secrecy using compressive sensing, (2011), pp. 563–567.

  34. G Reeves, N Goela, N Milosavljevic, M Gastpar, in Proc. IEEE Inf. Theory Workshop (ITW). A compressed sensing wire-tap channel, (2011), pp. 548–552.

  35. R Dautov, GR Tsouri, in Proc. Int. Conf. Comput. Netw. Commun. Establishing secure measurement matrix for compressed sensing using wireless physical layer security, (2013), pp. 354–358.

  36. SN George, DP Pattathil, A secure LFSR based random measurement matrix for compressive sensing. Sens. Imag. 15(1), 1–29 (2014).

  37. YD Li, Z Zhang, M Winslett, Y Yang, in Proc. 10th Annu. ACM Workshop Privacy Electron. Soc. (WPES). Compressive mechanism: utilizing sparse representation in differential privacy, (2011), pp. 177–182.

  38. H Li, R Mao, L Lai, R Qiu, in Proc. IEEE SmartGridComm. Compressed meter reading for delay-sensitive and secure load report in smart grid, (2010), pp. 114–119.

  39. J Gao, X Zhang, H Liang, X Shen, in Proc. IEEE GLOBECOM, Commun. Inf. Syst. Security Symp. Joint encryption and compressed sensing in smart grid data transmission, (2014), pp. 662–667.

  40. SW Golomb, G Gong, Signal design for good correlation—for wireless communication, cryptography and radar (Cambridge University Press, New York, 2005).

  41. R Rueppel, O Staffelbach, Products of linear recurring sequences with maximum complexity. IEEE Trans. Inf. Theory. 33(1), 124–131 (1987).

  42. T Herlestam, in Advances in Cryptology-Eurocrypt’85. On functions of linear shift register sequences. Lecture Notes in Computer Science (LNCS), vol. 219 (Springer-Verlag, 1986), pp. 119–129.

  43. T Beth, F Piper, in Advances in Cryptology-Eurocrypt’84. The stop-and-go generator. Lecture Notes in Computer Science (LNCS), vol. 209 (Springer-Verlag, 1985), pp. 88–92.

  44. D Gollmann, WG Chambers, Clock-controlled shift registers: a review. IEEE J. Sel. Areas Commun. 7(4), 525–533 (1989).

  45. D Coppersmith, H Krawczyk, Y Mansour, in Advances in Cryptology - Eurocrypt’93. The shrinking generator. Lecture Notes in Computer Science (LNCS), vol. 773 (Springer-Verlag, 1993), pp. 22–39.

  46. W Meier, O Staffelbach, in Advances in Cryptology-Eurocrypt’94. The self-shrinking generator. Lecture Notes in Computer Science (LNCS), vol. 950 (Springer-Verlag, 1995), pp. 205–214.

  47. L Chen, G Gong, Communication system security (Chapman & Hall/CRC, Boca Raton, 2012).

  48. A Klein, Stream ciphers (Springer-Verlag, London, 2013).

  49. M Rudelson, R Vershynin, On sparse reconstruction from Fourier and Gaussian measurements. Comm. Pure Appl. Math. 61(8), 1025–1045 (2008).

  50. MF Duarte, YC Eldar, Structured compressed sensing: from theory to applications. IEEE Trans. Signal Process. 59(9), 4053–4085 (2011).

  51. J Katz, Y Lindell, Introduction to modern cryptography, 2nd Ed., (Boca Raton, 2015).

  52. L Le Cam, Convergence of estimates under dimensionality restrictions. Ann. Stat. 1(1), 38–53 (1973).

  53. A DasGupta, Asymptotic theory of statistics and probability (Springer Science+Business Media, LLC, New York, 2008).

  54. MS Pinsker, Information and information stability of random variables and processes (in Russian) (U.S.S.R., Izv. Akad. Nauk, Moscow, 1960).

  55. AA Fedotov, P Harremoës, F Topsoe, Refinements of Pinsker’s inequality. IEEE Trans. Inf. Theory. 49(6), 1491–1498 (2003).

  56. Y Singer, MK Warmuth, in Proc. Advances in Neural Information Processing Systems 11 (NIPS’98). Batch and on-line parameter estimation of Gaussian mixtures based on the joint entropy, (1998), pp. 578–584.

  57. S Foucart, H Rauhut, A mathematical introduction to compressive sensing (Springer Science+Business Media, New York, 2013).

  58. T Kailath, The divergence and Bhattacharyya distance measures in signal selection. IEEE Trans. Commun. Technol. COM-15(1), 52–60 (1967).

  59. KT Abou-Moustafa, FP Ferrie, in JMLR: Asian Conference on Machine Learning. A note on metric properties for some divergence measures: The Gaussian case, vol. 25, (2012), pp. 1–15.

  60. A Guntuboyina, S Saha, G Schiebinger, Sharp inequalities for f-divergences. IEEE Trans. Inf. Theory. 60(1), 104–121 (2014).

  61. S Arora, L Babai, J Stern, Z Sweedyk, The hardness of approximate optima in lattices, codes, and systems of linear equations. J. Comput. Syst. Sci. 54(2), 317–331 (1997).

  62. M Ajtai, in Proc. of 30th Ann. ACM Symp. Theory Comput. The shortest vector problem in L2 is NP-hard for randomized reductions, (1998), pp. 10–19.

  63. M Damen, K Abed-Meraim, J-C Belfiore, Generalized sphere decoder for asymmetrical space-time communication architecture. IET Electron. Lett. 36(2), 166–167 (2000).

  64. P Dayal, MK Varanasi, in Proc. of 41st Annual Allerton Conf. on Comm. Control, and Comput. A fast generalized sphere decoder for optimum decoding for under-determined MIMO systems, (2003).

  65. T Cui, C Tellambura, An efficient generalized sphere decoder for rank-deficient MIMO systems. IEEE Commun. Lett. 9(5), 423–425 (2005).

  66. M Ajtai, in Proc. 28th Annu. ACM Symp. Theory Comput. Generating hard instances of lattice problems, (1996), pp. 99–108.

  67. O Goldreich, S Goldwasser, S Halevi, in Proc. 17th Annu. Int. Cryptography Conf. Public-key cryptosystems from lattice reduction problems, (1997), pp. 112–131.

  68. R Fischlin, J Seifert, in Proc. 7th IMA Int. Conf. Tensor-based trapdoors for CVP and their application to public key cryptography, (1999), pp. 244–257.

  69. RG Gallager, Stochastic processes: theory for applications (Cambridge University Press, 2013).

  70. HL van Trees, KL Bell, Z Tian, Detection, estimation, and modulation theory: part I—detection, estimation, and filtering theory, Second Ed. (Wiley & Sons, Inc., Hoboken, 2013).

  71. D Needell, JA Tropp, CoSaMP: iterative signal recovery from incomplete and inaccurate samples. Appl. Comput. Harmon. Anal. 26, 301–321 (2009).

  72. NY Yu, Indistinguishability of compressed encryption with circulant matrices for wireless security. IEEE Signal Process. Lett. 24(2), 181–185 (2017).

  73. RA Horn, CR Johnson, Matrix Analysis, 2nd Ed. (Cambridge University Press, Cambridge, 2013).

  74. G Strang, On the Kantorovich inequality. Proc. Amer. Math. Soc. 11, 468 (1960).

  75. J Choi, Secure transmission via compressive sensing in multicarrier systems. IEEE Signal Process. Lett. 23(10), 1315–1319 (2016).

Acknowledgements

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIP) (no. NRF-2017R1A2B4004405).

Author information

Corresponding author

Correspondence to Nam Yul Yu.

Ethics declarations

Competing interests

The author declares that he has no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Cite this article

Yu, N. On the security of compressed encryption with partial unitary sensing matrices embedding a secret keystream. EURASIP J. Adv. Signal Process. 2017, 73 (2017). https://doi.org/10.1186/s13634-017-0508-6

  • DOI: https://doi.org/10.1186/s13634-017-0508-6

Keywords