Rate distortion performance analysis of nested sampling and coprime sampling

In this paper, the rate distortion performance of nested sampling and coprime sampling is studied. It is shown that the data rate decreases as the distortion increases. With these two sparse sampling algorithms, the data rate is proved to be much lower than that without sparse sampling. As the sampling spacings increase, the data rate at a given distortion decreases, because sparser sampling requires fewer bits to represent the information. We also prove that, with the same sampling pairs, the rate of nested sampling is lower than that of coprime sampling at the same distortion. The reason is that nested sampling collects slightly fewer samples than coprime sampling from data of the same length, i.e., it is slightly sparser than coprime sampling.


Introduction
The twenty-first century is awash with data. Data are flooding in at rates never seen before, doubling almost every 18 months [1], as a result of new information gathered from radar, Web communities, newly deployed smart assets, and customer data from public, proprietary, and purchased sources. For example, oil companies, telecommunication companies, and other data-centric industries have handled huge amounts of data for a long time. Data are now being collected and transmitted at unprecedented scale [2,3] in a wide range of application areas. The phrase 'Big Data', as defined by the US National Science Foundation in its recent solicitation, refers to large, diverse, complex, distributed data sets generated from instruments, sensors, Internet transactions, email, video, click streams, and all other digital sources available today and in the future.
Unstructured data is data that does not follow a specified format, and it makes up most of Big Data. Radar or sonar data is one typical example, which includes meteorological, vehicular, and oceanographic seismic information; for instance, Big Data from O'Reilly Radar was described in [4].
Many efforts have been made to develop suitable compression techniques for Big Data. However, traditional compression methods [5,6] are all based on the Nyquist rate, which leads to poor efficiency in terms of both sampling rate and computational complexity. Unlike traditional compression techniques, some sparse sampling algorithms, such as compressive sensing, nested sampling, and coprime sampling, have been proposed to overcome the Nyquist sampling requirement.
Nested sampling [7] is a non-uniform sampling scheme that uses two different samplers in each period. Although the signal is sampled sparsely and non-uniformly, its autocorrelation can be estimated at all lags. Therefore, although the samples can be arbitrarily sparse, the signal's statistical information is kept [8]. Coprime sampling, in contrast, uses two uniform samplers whose sample spacings P and Q are coprime integers. The authors in [8] have already proved that these two sets of samples can fully estimate all lags of the autocorrelation of the original signal. As both nested sampling and coprime sampling keep the statistical properties of the original signal, they can be applied to Big Data to greatly reduce its transmission or storage cost.
The information rate distortion function characterizes the trade-off between the data rate and the distortion between the original source and its representation. In this paper, we provide the theoretical rate distortion performance of two sparse sampling algorithms, nested sampling (NS) and coprime sampling (CS). We show that, with these two sparse sampling algorithms, the data rate is much lower than that without sparse sampling for a given distortion. As the sampling spacings increase, the data rate at a given distortion decreases, because sparser sampling requires fewer bits to represent the information. We also prove that, with the same sampling pairs, the rate of nested sampling is lower than that of coprime sampling at the same distortion.
http://asp.eurasipjournals.com/content/2014/1/18
The rest of this paper is organized as follows. In Section 2, we give a brief introduction to nested sampling and coprime sampling. The theoretical derivation of the rate distortion performance of nested sampling and coprime sampling is detailed in Sections 3.1 and 3.2. The theoretical analysis and comparison of these two sparse sampling methods are given in Section 3.3. In Section 4, numerical results are provided to verify the theoretical rate distortion results derived in Section 3. Conclusions are given in Section 5.

Nested sampling
The nested array was first introduced in [7] as an effective approach to array processing with enhanced degrees of freedom [9,10]. In the time domain, the signal's autocorrelation can also be obtained from the nested sampling structure [11]. Although the samples from nested sparse sampling are sparsely and non-uniformly located, the samples of the autocorrelation can be computed at any specified rate. Applications that depend on the difference co-array, or autocorrelation, such as direction-of-arrival (DOA) estimation and beamforming, can therefore be based on nested sampling.
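In the simplest form, there are two levels of sampling density in nested sampling [11-14], with the level 1 samples at $N_1$ locations and the level 2 samples at $N_2$ locations:

$$l, \quad 1 \le l \le N_1, \quad \text{for level 1}; \qquad (N_1 + 1)m, \quad 1 \le m \le N_2, \quad \text{for level 2}. \qquad (1)$$

An example of periodic sparse sampling using the nested sampling structure is shown in Figure 1, with $N_1 = 3$ and $N_2 = 5$. The cross differences are given by

$$k = (N_1 + 1)m - l, \quad 1 \le m \le N_2, \quad 1 \le l \le N_1. \qquad (2)$$

The cross differences (2), together with their negated versions, lie in the range

$$-[(N_1 + 1)N_2 - 1] \le k \le (N_1 + 1)N_2 - 1,$$

with the maximum value $(N_1 + 1)N_2 - 1$ [9,11], except for the integers

$$(N_1 + 1), \ 2(N_1 + 1), \ \cdots, \ (N_2 - 1)(N_1 + 1) \qquad (3)$$

and the corresponding negated versions.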
For example, consider the case in Figure 1, where $N_1 = 3$, $N_2 = 5$, $1 \le m \le 5$, and $1 \le l \le 3$. The cross differences $k = (N_1 + 1)m - l$ take the values

$$1, 2, 3, 5, 6, 7, 9, 10, 11, 13, 14, 15, 17, 18, 19,$$

so the multiples of $N_1 + 1 = 4$, namely 4, 8, 12, and 16, are missing. The difference 0 is also missing, because m and l are nonzero. Meanwhile, all of the missing differences are covered by the self differences among the second array,

$$(N_1 + 1)(m_1 - m_2), \quad 1 \le m_1, m_2 \le N_2.$$

Combining the cross differences and the self differences gives the difference co-array, which is filled over the whole range $-[(N_1 + 1)N_2 - 1] \le k \le (N_1 + 1)N_2 - 1$. This indicates that, with the nested array structure, the degrees of freedom are enhanced although only sparse samples are obtained. Based on the principle above, sparse non-uniform sampling using the nested sampling structure can be performed as in Figure 1. There are two levels of nesting, with $N_1$ level-1 samples and $N_2$ level-2 samples in each period, and the period is $(N_1 + 1)N_2$. Nested sampling is therefore non-uniform, and the samples obtained are very sparse.
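The example above can be checked numerically. The following is a minimal sketch, assuming the level-1 positions are $l = 1, \cdots, N_1$ and the level-2 positions are $(N_1 + 1)m$, $m = 1, \cdots, N_2$, within one period; the function names are hypothetical and chosen only for illustration.

```python
def nested_positions(n1, n2):
    """Sample positions in one period of a two-level nested sampler."""
    level1 = list(range(1, n1 + 1))                    # l, 1 <= l <= N1
    level2 = [(n1 + 1) * m for m in range(1, n2 + 1)]  # (N1 + 1) m, 1 <= m <= N2
    return level1, level2


def difference_sets(n1, n2):
    """Cross differences (with negated versions) and level-2 self differences."""
    level1, level2 = nested_positions(n1, n2)
    cross = {b - a for b in level2 for a in level1}    # k = (N1 + 1) m - l
    cross |= {-k for k in cross}                       # add the negated versions
    self2 = {a - b for a in level2 for b in level2}    # (N1 + 1)(m1 - m2)
    return cross, self2


n1, n2 = 3, 5
cross, self2 = difference_sets(n1, n2)
k_max = (n1 + 1) * n2 - 1
full_range = set(range(-k_max, k_max + 1))

# Lags that the cross differences alone do not reach.
missing = sorted(full_range - cross)
# The co-array is filled once the level-2 self differences are added.
filled = (cross | self2) == full_range

print("missing from cross differences:", missing)
print("difference co-array filled:", filled)
```

For $N_1 = 3$ and $N_2 = 5$, the lags missing from the cross differences are 0 and the multiples of 4 up to 16 in magnitude, and adding the self differences of the second level fills the whole range.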
In $(N_1 + 1)N_2 T$ seconds, there are $N_1 + N_2$ samples in total. Therefore, the average sampling rate is

$$f_s = \frac{N_1 + N_2}{(N_1 + 1)N_2 T}.$$

Here, $T = 1/f_n$, where $f_n \ge 2 f_{max}$ is the Nyquist sampling frequency. As the Nyquist sampling rate is $1/T$, the average sampling rate of nested sampling is smaller than the conventional Nyquist sampling rate.

Co-prime sampling
Different from nested sampling, coprime sampling involves two sets of uniformly spaced samplers [8,12-14], as shown in Figure 2, each of which samples sparsely.
Coprime sampling uniformly samples the source signal using two sub-Nyquist samplers with sample spacings $PT$ and $QT$, respectively, where P and Q are coprime integers with $P < Q$. Here, $1/T$ Hz is the Nyquist rate for a bandlimited process, i.e., $1/T = 2 f_{max}$, where $f_{max}$ is the highest frequency.
Consider the product of the samples $x(Pn_1)$ and $x(Qn_2)$, which come from the first and the second sampler, respectively. Let the difference be

$$k = Pn_1 - Qn_2.$$

It has been shown in [11] that k can achieve any integer value in the range $0 \le k \le PQ - 1$ if $n_1$ and $n_2$ are in the ranges $0 \le n_1 \le 2Q - 1$ and $0 \le n_2 \le P - 1$.
For coprime sampling, the two samplers collect $P + Q$ samples in total in $PQT$ seconds, so the average sampling rate is

$$f_s = \frac{P + Q}{PQT} = \frac{1}{PT} + \frac{1}{QT}.$$

As in nested sampling, $T = 1/f_n$, where $f_n \ge 2 f_{max}$ is the Nyquist sampling frequency. The average sampling rate of coprime sampling is therefore also much smaller than the conventional Nyquist sampling rate. However, the signal's second-order statistics, such as the autocorrelation, are kept, which allows us to sample a signal sparsely and estimate some aspects of the signal (spectra, DOA, and so on) at a significantly higher resolution.
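The coverage of the differences and the two average sampling rates can be illustrated with a short numerical sketch. It assumes the difference is formed as $k = Pn_1 - Qn_2$ with the index ranges quoted above, and it expresses rates in units of the Nyquist rate $1/T$; the function names are hypothetical.

```python
from fractions import Fraction
from math import gcd


def coprime_differences(p, q):
    """All k = P*n1 - Q*n2 for 0 <= n1 <= 2Q - 1 and 0 <= n2 <= P - 1."""
    if gcd(p, q) != 1:
        raise ValueError("P and Q must be coprime")
    return {p * n1 - q * n2 for n1 in range(2 * q) for n2 in range(p)}


def average_rates(p, q):
    """Average sampling rates, in units of 1/T, for N1 = P and N2 = Q."""
    nested = Fraction(p + q, (p + 1) * q)   # (N1 + N2) / ((N1 + 1) N2)
    coprime = Fraction(p + q, p * q)        # (P + Q) / (P Q)
    return nested, coprime


p, q = 3, 5
diffs = coprime_differences(p, q)
covers_all_lags = set(range(p * q)) <= diffs   # every k in 0 .. PQ - 1 is reached

nested_rate, coprime_rate = average_rates(p, q)
print("all lags 0..PQ-1 covered:", covers_all_lags)
print("nested rate / Nyquist rate:", nested_rate)
print("coprime rate / Nyquist rate:", coprime_rate)
```

With $P = 3$ and $Q = 5$, both schemes sample below the Nyquist rate, and the nested scheme with the same pair has the lower average rate because its period $(N_1 + 1)N_2$ is longer than $PQ$.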

Rate distortion performance
The information rate distortion function characterizes the trade-off between the data rate and the distortion between the original source and its representation. Our purpose is to construct a distortion function that measures the distortion caused by the two sparse sampling algorithms, nested sampling (NS) and coprime sampling (CS). Sparse sampling can cause distortion because fewer samples are used. A wide variety of distortion functions, such as the Euclidean distance, Hamming distance, Mahalanobis distance, and Itakura-Saito distance, have been used. In this paper, squared error distortion is used. The original samples are denoted as $x_i$, $i = 1, \cdots, L$, where L is the total number of samples. Assume that all original information from the L samples is $X_L = [x_1, x_2, \cdots, x_L]$; the selected information after sparse sampling can be represented as [15]

$$\hat{X}_{L'} = S(X_L),$$

where $S(\cdot)$ denotes sparse sampling, either nested sampling or coprime sampling, $\hat{X}_{L'} = [\hat{x}_1, \hat{x}_2, \cdots, \hat{x}_{L'}]$, and $L' < L$. The distortion associated with the sparse sampling between all original samples and the selected samples is

$$D = E[d(X_L, \hat{X}_{L'})], \qquad (12)$$

where $d(\cdot)$ is the distortion function. The expectation in (12) is with respect to the probability distribution on $X_L$. The rate distortion function $R(D)$ is the minimum of the data rates R such that $(R, D)$ is in the rate distortion region for a given distortion. From [16,17], the information rate distortion function is defined as

$$R(D) = \min_{p(\hat{X}_{L'} | X_L): \, E[d(X_L, \hat{X}_{L'})] \le D} I(X_L; \hat{X}_{L'}), \qquad (13)$$

where $I(X_L; \hat{X}_{L'})$ is the mutual information between $X_L$ and $\hat{X}_{L'}$.
where inequality (a) follows from the fact that conditioning reduces entropy. From formula (13), we know that [...] For squared error distortion, [...] where $i = 1, \cdots, L$ and $j = 1, \cdots, L'$, and (b) follows from the definition that E( [...]

The Gaussian assumption is a classical modeling assumption heavily used in areas such as signal processing and communication systems [18]. From [16], the rate distortion function for a single Gaussian source $N(0, \sigma^2)$ with squared error distortion is

$$R(D) = \begin{cases} \frac{1}{2} \log \frac{\sigma^2}{D}, & 0 \le D \le \sigma^2, \\ 0, & D > \sigma^2. \end{cases}$$

For L independent zero-mean Gaussian sources $x_1, \cdots, x_L$ with variances $\sigma_1^2, \sigma_2^2, \cdots, \sigma_L^2$, the rate distortion function with squared error distortion is given by [16,17,19,20]

$$R(D) = \sum_{i=1}^{L} \frac{1}{2} \log \frac{\sigma_i^2}{D_i},$$

where

$$D_i = \begin{cases} \lambda, & \lambda < \sigma_i^2, \\ \sigma_i^2, & \lambda \ge \sigma_i^2, \end{cases}$$

$\lambda$ is chosen so that $\sum_{i=1}^{L} D_i = D$, and $D_i = E(x_i - \hat{x}_i)^2$. This gives rise to a kind of reverse water-filling. We choose a constant $\lambda$ and only describe those random variables with variance greater than $\lambda$; no bits are used to describe random variables with variance less than $\lambda$.

Theorem 1. (Rate distortion for nested sampling of Gaussian source) Let $x_i \sim N(0, \sigma_i^2)$, $i = 1, 2, \cdots, L$, be independent Gaussian random variables under squared error distortion. The rate distortion function between the original Gaussian source and its nested-sampled version is given by

$$R_{NS}(D) = \sum_{k=1}^{K_{NS}} \frac{1}{2} \log \frac{\sigma_k^2}{D_k},$$

where $K_{NS}$ is given in (24),

$$D_k = \begin{cases} \lambda, & \lambda < \sigma_k^2, \\ \sigma_k^2, & \lambda \ge \sigma_k^2, \end{cases}$$

and $\lambda$ is chosen so that $\sum_{k=1}^{K_{NS}} D_k = D$.

Proof 1. For nested sampling (NS), all L original samples are $X_L = [x_1, x_2, \cdots, x_L]$. A smaller number of samples $L'$ is selected by nested sampling as described, [...] Therefore, (16) becomes [...] where the length $K_{NS}$ is determined by formula (24), [...] in which $Y = L \bmod (N_1 + 1)N_2$ and $U = (Y - (N_1 + 1)) \bmod (N_1 + 1)$. If all samples are assumed to be independent Gaussian $N(0, \sigma_i^2)$, the corresponding rate distortion function for nested sampling is [...] where inequality (c) follows from the fact that the normal distribution maximizes the entropy for a given second moment, and $\sum_{k=1}^{K_{NS}} D_k = D$. To find the minimum value, we can use Lagrange multipliers, [...] which results in an equal distortion for each random variable if the constant $\lambda$ is less than $\sigma_i^2$ for all i. As the total allowable distortion D increases, the constant $\lambda$ increases until it exceeds $\sigma_i^2$ for some i. The Kuhn-Tucker conditions can then be used to find the minimum in (26). In this case, the Kuhn-Tucker conditions yield [...] Therefore,

$$R_{NS}(D) = \sum_{k=1}^{K_{NS}} \frac{1}{2} \log \frac{\sigma_k^2}{D_k},$$

where $\lambda$ is chosen so that $\sum_{k=1}^{K_{NS}} D_k = D$.

Theorem 2. (Rate distortion for coprime sampling of Gaussian source) Let $x_i \sim N(0, \sigma_i^2)$, $i = 1, 2, \cdots, L$, be independent Gaussian random variables under squared error distortion. The rate distortion function between the original Gaussian source and its coprime-sampled version is given by

$$R_{CS}(D) = \sum_{k=1}^{K_{CS}} \frac{1}{2} \log \frac{\sigma_k^2}{D_k},$$

where $K_{CS}$ is given in (36),

$$D_k = \begin{cases} \lambda, & \lambda < \sigma_k^2, \\ \sigma_k^2, & \lambda \ge \sigma_k^2, \end{cases}$$

and $\lambda$ is chosen so that $\sum_{k=1}^{K_{CS}} D_k = D$.

Proof 2. For coprime sampling (CS), we still assume that the original information has length L, i.e., $X_L = [x_1, x_2, \cdots, x_L]$. Based on coprime sampling, a smaller number of samples $L'$ is selected, [...] Similarly, (16) becomes [...] where the length $K_{CS}$ is determined by formula (36). [...] Therefore, the corresponding rate distortion function for coprime sampling of an independent Gaussian source is [...] The minimum value can be obtained by the same procedure as described for nested sampling.
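The reverse water-filling allocation that appears in both theorems can be sketched numerically. The following is a minimal illustration rather than the procedure used to produce the paper's figures. It assumes that D is the total distortion summed over the retained components, that the logarithm is base 2 (rate in bits), and that the water level $\lambda$ is found by bisection; the function name and the example variances are hypothetical.

```python
import math


def reverse_waterfill(variances, total_distortion, iterations=200):
    """Reverse water-filling for independent zero-mean Gaussian components.

    Returns (lam, distortions, rate), where distortions[i] = min(lam, variance)
    and rate is the sum of 0.5 * log2(variance / distortion) in bits.
    """
    if not 0 < total_distortion <= sum(variances):
        raise ValueError("total distortion must lie in (0, sum of variances]")

    # sum(min(lam, v)) is nondecreasing in lam, so bisection finds the level.
    lo, hi = 0.0, max(variances)
    for _ in range(iterations):
        lam = 0.5 * (lo + hi)
        if sum(min(lam, v) for v in variances) < total_distortion:
            lo = lam
        else:
            hi = lam
    lam = 0.5 * (lo + hi)

    distortions = [min(lam, v) for v in variances]
    # Components with variance at or below the water level receive no bits.
    rate = sum(0.5 * math.log2(v / d)
               for v, d in zip(variances, distortions) if d < v)
    return lam, distortions, rate


# Hypothetical example: three retained samples, one with a small variance.
lam, distortions, rate = reverse_waterfill([1.0, 1.0, 0.1], total_distortion=0.5)
print("water level:", round(lam, 6))
print("per-component distortions:", [round(d, 6) for d in distortions])
print("rate in bits:", round(rate, 6))
```

In this example the water level settles at 0.2: the two unit-variance components are each described with distortion 0.2, while the component with variance 0.1 lies below the water level and is not described at all. Applying the same routine to the $K_{NS}$ or $K_{CS}$ retained variances would give a numerical counterpart of the expressions in Theorems 1 and 2 under the stated assumptions.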

Theoretical analysis
Without sparse sampling, the rate distortion function would be

$$R_{WS}(D) = \sum_{i=1}^{L} \frac{1}{2} \log \frac{\sigma_i^2}{D_i},$$

which is much greater than that with sparse sampling. From the above derivation of the rate distortion functions of nested sampling and coprime sampling, if the sampling spacings are assumed to be the same, i.e., $N_1 = P$ and $N_2 = Q$ for these two sparse sampling methods, then the minimum value $K_{NS\,min}$ is achieved when $Y = L \bmod (N_1 + 1)N_2 = 0$; therefore

$$K_{NS\,min} = \frac{L (N_1 + N_2)}{(N_1 + 1) N_2}. \qquad (39)$$

For coprime sampling, the minimum value $K_{CS\,min}$ is achieved when $L \bmod P = 0$, $L \bmod Q = 0$, and $L \bmod PQ = 0$; therefore

$$K_{CS\,min} = \frac{L}{P} + \frac{L}{Q} - \frac{L}{PQ}.$$

For these two sparse sampling algorithms, the sampling interval is greater than the Nyquist sampling spacing, which indicates that $Q > 1$; therefore, with $N_1 = P$ and $N_2 = Q$,

$$K_{CS\,min} - K_{NS\,min} = \frac{L (Q - 1)}{P (P + 1) Q} > 0,$$

which indicates that in most cases $K_{NS} < K_{CS}$. Table 1 shows some examples of $K_{NS}$ and $K_{CS}$ with respect to the sampling intervals when $N_1 = P$, $N_2 = Q$, and $L = 1,000$. As the sampling spacings increase, samples are selected more sparsely by both nested sampling and coprime sampling, which results in a decrease of $K_{NS}$ and $K_{CS}$. In addition, $K_{NS} < K_{CS}$, as proved.
With our assumption that all samples are independent Gaussian $N(0, \sigma_i^2)$, we can conclude that

$$R_{NS}(D) < R_{CS}(D) < R_{WS}(D),$$

which indicates that both nested sampling and coprime sampling use fewer bits to describe the information than is needed without sparse sampling (WS). As noted in the introduction to the two schemes, nested sampling collects $N_1 + N_2$ samples in $(N_1 + 1)N_2 T$ seconds, while coprime sampling collects $P + Q$ samples in $PQT$ seconds. If the sampling intervals are the same, i.e., $N_1 = P$ and $N_2 = Q$, nested sampling is slightly sparser than coprime sampling.
This is why $R_{NS}(D) < R_{CS}(D)$: nested sampling collects slightly fewer samples than coprime sampling from data of the same length L. The rate $R(D)$ at a given distortion for both sparse sampling algorithms is lower than that without sparse sampling, because with sparse sampling fewer bits are used to describe the original information.
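The number of samples that each scheme retains from a record of length L can also be obtained by direct enumeration. The sketch below assumes that the samples are indexed $1, \cdots, L$, that nested sampling keeps positions $1, \cdots, N_1$ and the multiples of $N_1 + 1$ within each period of length $(N_1 + 1)N_2$, and that coprime sampling keeps every index that is a multiple of P or of Q, with coincident samples counted once. These indexing conventions and the function names are assumptions, so the counts need not match Table 1 exactly.

```python
def nested_count(length, n1, n2):
    """Number of indices in 1..length retained by two-level nested sampling."""
    period = (n1 + 1) * n2
    count = 0
    for i in range(1, length + 1):
        r = i % period or period        # position within the period, 1..period
        if r <= n1 or r % (n1 + 1) == 0:
            count += 1                  # level-1 sample or level-2 sample
    return count


def coprime_count(length, p, q):
    """Number of indices in 1..length that are multiples of P or of Q."""
    return sum(1 for i in range(1, length + 1) if i % p == 0 or i % q == 0)


L = 1000
pairs = [(3, 5), (3, 11), (3, 17)]
counts = {(a, b): (nested_count(L, a, b), coprime_count(L, a, b))
          for a, b in pairs}

for (a, b), (k_nested, k_coprime) in counts.items():
    print(f"spacings ({a}, {b}): nested keeps {k_nested}, coprime keeps {k_coprime}")
```

Under these conventions, both counts shrink as the second spacing grows, and nested sampling retains fewer samples than coprime sampling for each of the three pairs, in line with the statement that nested sampling is slightly sparser.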

Numerical results
The total length of the information is set to $L = 1,000$. Each sample is assumed to follow a Gaussian distribution $N(0, 1)$ with zero mean and unit variance. We also assume $D_k = \lambda < \sigma^2 = 1$, i.e., equal distortion for each random variable. Figure 3 shows the rate distortion performance of nested sampling with different sampling spacings. As the distortion increases, the rate decreases. When the sampling intervals $N_1$ and $N_2$ become larger, i.e., fewer samples are acquired, the rate becomes smaller. For example, when $D = 0.3$, $N_1 = 3$, and $N_2 = 5$, the data rate is $R(D) \approx 1,350$, while with the sampling pair increased to $N_1 = 3$, $N_2 = 11$, the rate is $R(D) \approx 1,220$, which is smaller. This is because sparser sampling requires fewer bits to represent the information.
The rate distortion performance of coprime sampling with different sampling spacings is shown in Figure 4. As with nested sampling, the rate $R(D)$ decreases as the distortion increases. When the sampling intervals P and Q become larger, the rate becomes smaller. Figure 5 compares the rate distortion performance of nested sampling and coprime sampling, where D is the distortion between the original source and its sparse-sampled representation, and $R(D)$ is the corresponding rate at a particular distortion D. With the same sampling spacings, $N_1 = P$ and $N_2 = Q$, and at the same distortion, the rate of nested sampling is lower than that of coprime sampling. For example, when $N_1 = P = 3$, $N_2 = Q = 17$, and $D = 0.3$, the rate for nested sampling is $R_{NS}(D) \approx 1,200$, while the rate for coprime sampling is $R_{CS}(D) \approx 1,300$. This verifies the result that $R_{NS}(D) < R_{CS}(D)$, because nested sampling collects slightly fewer samples than coprime sampling from data of the same length L, i.e., it is slightly sparser than coprime sampling.

Conclusions
The information rate distortion function characterizes the trade-off between the data rate and the distortion between the original source and its representation. Our purpose in this paper is to construct a distortion function that measures the distortion caused by two sparse sampling algorithms. The information theoretical rate distortion performance of these two sparse sampling methods, nested sampling and coprime sampling, is studied. It is shown that, with these two sparse sampling algorithms, the data rate is much lower than that without sparse sampling at a given distortion. As the sampling spacings increase, i.e., as data are acquired more sparsely, the data rate at a given distortion decreases, because sparser sampling requires fewer bits to represent the information. We also show that, with the same sampling pairs, the rate of nested sampling is lower than that of coprime sampling at the same distortion.