# Rate distortion performance analysis of nested sampling and coprime sampling

*EURASIP Journal on Advances in Signal Processing*
**volume 2014**, Article number: 18 (2014)

## Abstract

In this paper, the rate distortion performance of nested sampling and coprime sampling is studied. It is shown that the data rate decreases as the distortion increases. With these two sparse sampling algorithms, the data rate is proved to be much lower than that without sparse sampling. At a given distortion, the data rate decreases as the sampling spacings increase, because sparser sampling requires fewer bits to represent the information. We also prove that, with the same sampling pairs, the rate of nested sampling is lower than that of coprime sampling at the same distortion. The reason is that nested sampling collects slightly fewer samples than coprime sampling for the same length of data, i.e., it is slightly sparser than coprime sampling.

## 1 Introduction

The twenty-first century is awash with data. Data are flooding in at rates never seen before, doubling almost every 18 months [1], as a result of new information gathered from radar, Web communities, newly deployed smart assets, and customer data from public, proprietary, and purchased sources. For example, oil companies, telecommunication companies, and other data-centric industries have held huge amounts of data for a long time. Data are now being collected and transmitted at unprecedented scale [2, 3] in a wide range of application areas. The phrase ‘Big Data’, as defined by the US National Science Foundation in its recent solicitation, refers to large, diverse, complex, distributed data sets generated from instruments, sensors, Internet transactions, email, video, click streams, and all other digital sources available today and in the future.

Unstructured data, i.e., data that does not follow a specified format, makes up most of Big Data. Radar or sonar data is one typical example, which includes meteorological, vehicular, and oceanographic seismic information; Big Data from O’Reilly Radar was described in [4].

Many efforts have been made to develop suitable compression techniques for Big Data. However, traditional compression methods [5, 6] are all based on the Nyquist rate, which makes them inefficient in terms of both sampling rate and computational complexity. Unlike traditional compression techniques, sparse sampling algorithms such as compressive sensing, nested sampling, and coprime sampling have been proposed to overcome the Nyquist sampling requirement.

Nested sampling [7] is a non-uniform sampling scheme that uses two different samplers in each period. Although the signal is sampled sparsely and non-uniformly, its autocorrelation can be estimated at all lags. Therefore, although the samples can be arbitrarily sparse, the signal’s statistical information is kept [8]. Coprime sampling, in contrast, uses two uniform samplers whose sample spacings *P* and *Q* are coprime integers. The authors in [8] have proved that these two sets of samples can fully estimate all lags of the autocorrelation of the original signal. As both nested sampling and coprime sampling keep the statistical properties of the original signal, they can be applied to Big Data to greatly reduce its transmission or storage cost.

The information rate distortion function is a measure of distortion between the original source and its representation. In this paper, we provide the theoretical rate distortion performance resulting from two sparse sampling algorithms, nested sampling (NS) and coprime sampling (CS). We show that with these two sparse sampling algorithms, the data rate is much lower than that without sparse sampling for a given distortion. At a given distortion, the data rate decreases as the sampling spacings increase, because sparser sampling requires fewer bits to represent the information. We also prove that, with the same sampling pairs, the rate of nested sampling is lower than that of coprime sampling at the same distortion.

The rest of this paper is organized as follows. In Section 2, we give a brief introduction to nested sampling and coprime sampling. The theoretical derivation of the rate distortion performance of nested sampling and coprime sampling is detailed in Sections 3.1 and 3.2. The theoretical analysis and comparison of these two sparse sampling methods are given in Section 3.3. In Section 4, numerical results are provided to verify the theoretical rate distortion results derived in Section 3. Conclusions are given in Section 5.

## 2 Preliminaries

### 2.1 Nested sampling

The nested array was first introduced in [7] as an effective approach to array processing with enhanced degrees of freedom [9, 10]. In the time domain, the signal’s autocorrelation can also be obtained from the nested sampling structure [11]. Although the samples from nested sparse sampling are sparsely and non-uniformly located, the samples of the autocorrelation can be computed at any specified rate. Applications that depend on the difference co-array, or autocorrelation, such as direction-of-arrival (DOA) estimation and beamforming, can therefore be based on nested sampling.

In the simplest form, there are two levels of sampling density in nested sampling [11–14], with the level 1 samples at the *N*_{1} locations and the level 2 samples at the *N*_{2} locations.

An example of periodic sparse sampling using the nested sampling structure is shown in Figure 1, with *N*_{1}=3 and *N*_{2}=5. The cross differences are given by *k*=(*N*_{1}+1)*m*−*l*, where 1≤*m*≤*N*_{2} and 1≤*l*≤*N*_{1}.

The cross differences (2) are in the following range with the maximum value (*N*_{1}+1)*N*_{2}−1 [9, 11], except the integers and the corresponding negated versions shown in (3).

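For instance, in the example of Figure 1, where *N*_{1}=3, *N*_{2}=5, 1≤*m*≤5, and 1≤*l*≤3, the cross differences *k*=(*N*_{1}+1)*m*−*l* take the values 1, 2, 3, 5, 6, 7, 9, 10, 11, 13, 14, 15, 17, 18, and 19; the multiples of *N*_{1}+1=4 (that is, 4, 8, 12, and 16) are missing.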

The difference 0 is also missing, because *m* and *l* are nonzero. Meanwhile, we notice that all of the missing differences are covered by the self differences among the second array, (*N*_{1}+1)(*m*_{1}−*m*_{2}) with 1≤*m*_{1},*m*_{2}≤*N*_{2}.

Combining the cross differences and the self differences gives the difference co-array, which is a filled difference co-array as shown in (2). This indicates that with the nested array structure, although only sparse samples are obtained, the degrees of freedom are enhanced.

Based on the principle above, a sparse non-uniform sampling using nested sampling structure could be performed as in Figure 1. There are two levels of nesting, with *N*_{1} level-1 samples and *N*_{2} level-2 samples in each period, with period (*N*_{1}+1)*N*_{2}. It is obvious that nested sampling is non-uniform and the samples obtained are very sparse.
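The filled co-array property can be checked by brute force. The sketch below is illustrative rather than taken from the paper: it assumes level-1 samples at positions 1, …, *N*_{1} and level-2 samples at positions (*N*_{1}+1)*m*, *m*=1, …, *N*_{2}, which is consistent with the cross differences *k*=(*N*_{1}+1)*m*−*l* quoted above. The function and variable names are hypothetical.

```python
def nested_difference_coarray(n1, n2):
    """Return (cross, self2) difference sets for a two-level nested sampler.

    Assumed layout (illustrative): level-1 samples at l = 1..N1,
    level-2 samples at (N1+1)*m for m = 1..N2.
    """
    level1 = range(1, n1 + 1)
    level2 = [(n1 + 1) * m for m in range(1, n2 + 1)]
    # Cross differences k = (N1+1)*m - l between the two levels.
    cross = {p2 - p1 for p2 in level2 for p1 in level1}
    # Self differences among the level-2 samples (includes 0 and negatives).
    self2 = {a - b for a in level2 for b in level2}
    return cross, self2


n1, n2 = 3, 5
cross, self2 = nested_difference_coarray(n1, n2)
max_lag = (n1 + 1) * n2 - 1

# Lags in 0..max_lag that the cross differences alone do not reach.
missing = sorted(set(range(0, max_lag + 1)) - cross)

# Cross differences, their negated versions, and level-2 self differences.
covered = cross | {-k for k in cross} | self2
filled = all(k in covered for k in range(-max_lag, max_lag + 1))

print("missing from cross differences:", missing)
print("co-array filled:", filled)
```

For *N*_{1}=3 and *N*_{2}=5, the lags missing from the cross differences are 0 and the multiples of 4, and all of them are supplied by the level-2 self differences.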

Note that in (*N*_{1}+1)*N*_{2}*T* seconds there are *N*_{1}+*N*_{2} samples in total. Therefore, the average sampling rate is (*N*_{1}+*N*_{2})/((*N*_{1}+1)*N*_{2}*T*).

Here, *T*=1/*f*_{n}, where *f*_{n}≥2 *f*_{max} is the Nyquist sampling frequency. As the Nyquist sampling rate is 1/*T*, the average sampling rate of nested sampling is smaller than the conventional Nyquist sampling rate.

### 2.2 Co-prime sampling

Unlike nested sampling, coprime sampling involves two sets of uniformly spaced samplers [8, 12–14], as shown in Figure 2, each of which samples the signal sparsely.

Coprime sampling uniformly samples the source signal using two sub-Nyquist samplers with sample spacings *PT* and *QT*, respectively, where *P* and *Q* are coprime integers with *P*<*Q*. Here 1/*T* Hz is the Nyquist rate for a bandlimited process, i.e., 1/*T*=2 *f*_{max}, with *f*_{max} being the highest frequency.

Consider the product of a sample *x*(*Pn*_{1}) from the first sampler and a sample *x*(*Qn*_{2}) from the second sampler, and define the difference of their indices as *k*=*Pn*_{1}−*Qn*_{2}.

It has been shown in [11] that *k* can take any integer value in the range 0≤*k*≤*P* *Q*−1 if *n*_{1} and *n*_{2} are in the ranges 0≤*n*_{1}≤2*Q*−1 and 0≤*n*_{2}≤*P*−1.
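The coverage claim can be verified numerically for small spacings. The sketch below assumes the index difference is *k*=*Pn*_{1}−*Qn*_{2} (the sign convention that is consistent with the stated ranges of *n*_{1} and *n*_{2}); the function names are hypothetical and the check is a brute-force illustration, not the proof from [11].

```python
from math import gcd


def coprime_differences(p, q):
    """All differences k = P*n1 - Q*n2 for 0 <= n1 <= 2Q-1 and 0 <= n2 <= P-1."""
    return {p * n1 - q * n2 for n1 in range(2 * q) for n2 in range(p)}


def covers_all_lags(p, q):
    """True if every lag 0..PQ-1 appears among the differences."""
    diffs = coprime_differences(p, q)
    return all(k in diffs for k in range(p * q))


for p, q in [(3, 5), (3, 11), (4, 7), (2, 4)]:
    print(f"P={p}, Q={q}, coprime={gcd(p, q) == 1}, "
          f"all lags covered={covers_all_lags(p, q)}")
```

The pair (2, 4) is included only as a counterexample: when *P* and *Q* share a factor, every difference is a multiple of that factor and odd lags can never be reached.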

For coprime sampling, the two samplers collect *P*+*Q* samples in total in *PQT* seconds, so the average sampling rate is (*P*+*Q*)/(*PQT*).

As in nested sampling, *T*=1/*f*_{n}, where *f*_{n}≥2 *f*_{max} is the Nyquist sampling frequency. The average sampling rate of coprime sampling is therefore also much smaller than the conventional Nyquist sampling rate. However, the signal’s second-order statistics, such as the autocorrelation, are kept, which allows us to sample a signal sparsely and estimate some aspects of the signal (spectra, DOA, and so on) at a significantly higher resolution.
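As a quick numerical illustration of the two average sampling rates, the sketch below expresses each as a fraction of the Nyquist rate 1/*T*, using the sample counts stated in the text (*N*_{1}+*N*_{2} samples per (*N*_{1}+1)*N*_{2}*T* seconds for nested sampling, *P*+*Q* samples per *PQT* seconds for coprime sampling). The function names are hypothetical.

```python
from fractions import Fraction


def nested_rate_fraction(n1, n2):
    """Average nested sampling rate divided by the Nyquist rate 1/T."""
    return Fraction(n1 + n2, (n1 + 1) * n2)


def coprime_rate_fraction(p, q):
    """Average coprime sampling rate divided by the Nyquist rate 1/T."""
    return Fraction(p + q, p * q)


for a, b in [(3, 5), (3, 11), (3, 17)]:
    ns = nested_rate_fraction(a, b)
    cs = coprime_rate_fraction(a, b)
    print(f"spacings ({a}, {b}): nested {float(ns):.3f}, coprime {float(cs):.3f}")
```

With the same spacings, the nested fraction is the smaller of the two, since its denominator is (*N*_{1}+1)*N*_{2} rather than *N*_{1}*N*_{2}.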

## 3 Rate distortion performance

The information rate distortion function is a measure of distortion between the original source and its representation. Our purpose is to construct a distortion function that can measure the distortion caused by two sparse sampling algorithms, nested sampling (NS) and coprime sampling (CS). Sparse sampling can cause distortion because fewer samples are used. A wide variety of distortion functions, such as the Euclidean distance, Hamming distance, Mahalanobis distance, and Itakura-Saito distance, have been used. In this paper, squared error distortion is used. The original samples are denoted as *x*_{i}, *i*=1,⋯,*L*, where *L* is the total number of samples. Assuming that all original information from the *L* samples is *X*^{L}=[*x*_{1},*x*_{2},⋯,*x*_{L}], the selected information after sparse sampling can be represented as [15] $\widehat{X}^{L'}=S(X^{L})$,

where *S*(·) denotes sparse sampling, either nested sampling or coprime sampling, $\widehat{X}^{L'}=[\widehat{x}_{1},\widehat{x}_{2},\cdots,\widehat{x}_{L'}]$, and *L*^{′}<*L*. The distortion associated with the sparse sampling between all original samples and the selected samples is

where *d*(·) is the distortion function.

The expectation in (12) is with respect to the probability distribution on *X*^{L}. The rate distortion function *R*(*D*) is the minimum of data rates *R* such that (*R*,*D*) is in the rate distortion region for a given distortion. From [16, 17], we know that information rate distortion function is defined as

where $I(X^{L};\widehat{X}^{L'})$ is the mutual information between *X*^{L} and $\widehat{X}^{L'}$.

where inequality (*a*) follows from the fact that conditioning reduces entropy.

From formula (13), we know that

For squared error distortion,

where *i*=1,⋯,*L* and *j*=1,⋯,*L*^{′}, and (*b*) follows from the definition $E(x_{i}-\widehat{x}_{j})^{2}=D_{k}$.

The Gaussian assumption is a classical modeling assumption heavily used in areas such as signal processing and communication systems [18]. From [16], the rate distortion function for a single Gaussian source *N*(0,*σ*^{2}) with squared error distortion is $R(D)=\frac{1}{2}\log\frac{\sigma^{2}}{D}$ for 0≤*D*≤*σ*^{2}, and *R*(*D*)=0 for *D*>*σ*^{2}.

For *L* independent zero-mean Gaussian sources *x*_{1},⋯,*x*_{L} with variances $\sigma_{1}^{2},\sigma_{2}^{2},\cdots,\sigma_{L}^{2}$, the rate distortion performance with squared error distortion is given by [16, 17, 19, 20]

$$R(D)=\sum_{i=1}^{L}\frac{1}{2}\log\frac{\sigma_{i}^{2}}{D_{i}},$$

where

$$D_{i}=\begin{cases}\lambda, & \text{if } \lambda<\sigma_{i}^{2},\\ \sigma_{i}^{2}, & \text{if } \lambda\ge\sigma_{i}^{2},\end{cases}$$

where *λ* is chosen so that $\sum_{i=1}^{L}D_{i}=D$, and $D_{i}=E(x_{i}-\widehat{x}_{i})^{2}$. This gives rise to a kind of reverse water-filling: we choose a constant *λ* and only describe those random variables with variance greater than *λ*; no bits are used to describe random variables with variance less than *λ*.
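Reverse water-filling is easy to illustrate numerically. The following is a minimal sketch, not the authors' code: it assumes independent Gaussian components, a base-2 logarithm (rate in bits), and finds the water level *λ* by bisection so that the per-component distortions min(*λ*, *σ*_{i}^{2}) sum to the target *D*. The function name `reverse_waterfill` and its parameters are hypothetical.

```python
import math


def reverse_waterfill(variances, total_distortion, iters=200):
    """Return (rate_bits, per_component_distortions, lam).

    Each component gets distortion min(lam, variance); lam is found by
    bisection so that the distortions sum to total_distortion.
    """
    if total_distortion <= 0:
        raise ValueError("total_distortion must be positive")
    if total_distortion >= sum(variances):
        # Every component can be dropped entirely: zero rate.
        return 0.0, list(variances), max(variances)

    lo, hi = 0.0, max(variances)
    for _ in range(iters):
        lam = 0.5 * (lo + hi)
        if sum(min(lam, v) for v in variances) < total_distortion:
            lo = lam
        else:
            hi = lam
    lam = 0.5 * (lo + hi)

    dists = [min(lam, v) for v in variances]
    # Components with variance <= lam are not described (zero bits).
    rate = sum(0.5 * math.log2(v / d) for v, d in zip(variances, dists) if d < v)
    return rate, dists, lam


rate, dists, lam = reverse_waterfill([4.0, 1.0, 0.25], 1.0)
print(f"lambda = {lam:.4f}, distortions = {dists}, rate = {rate:.4f} bits")
```

In this example the component with variance 0.25 lies below the water level and receives no bits, while the other two share the remaining distortion equally.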

### 3.1 For nested sampling

#### Theorem 1

(Rate distortion for nested sampling of a Gaussian source) Let $x_{i}\sim N(0,\sigma_{i}^{2})$, *i*=1,2,⋯,*L*, be independent Gaussian random variables under squared error distortion. The rate distortion function between the original Gaussian source and its nested-sampled representation is given by

where *K*_{NS} is given in (24) and

where *λ* is chosen so that $\sum_{k=1}^{K_{\text{NS}}}D_{k}=D$.

#### Proof 1

For nested sampling (NS), the original information consists of all *L* samples, *X*^{L}=[*x*_{1},*x*_{2},⋯,*x*_{L}].

A smaller number of samples, *L*^{′}, is then selected by nested sampling as described,

Therefore, (16) becomes

where *K*_{NS} can be determined from the following formula, in which we assume *Y*=*L* (mod (*N*_{1}+1)*N*_{2}),

where

in which *U*=(*Y*−(*N*_{1}+1))(mod (*N*_{1}+1)).

If all samples are assumed to be independent Gaussian $N(0,\sigma_{i}^{2})$, the corresponding rate distortion function for nested sampling is

where inequality (*c*) follows from the fact that the normal distribution maximizes the entropy for a given second moment, and $\sum_{k=1}^{K_{\text{NS}}}D_{k}=D$.

To find the minimum value, we could use Lagrange multipliers

and differentiating with respect to *D*_{k} and setting equal to 0, we have

or

which results in an equal distortion for each random variable if the constant *λ*^{′} is less than $\sigma_{i}^{2}$ for all *i*. As the total allowable distortion *D* increases, the constant *λ*^{′} increases until it exceeds $\sigma_{i}^{2}$ for some *i*. If we increase the total distortion *D* further, the Kuhn-Tucker conditions can be used to find the minimum in (26). In this case, the Kuhn-Tucker conditions yield

Therefore,

where *λ* is chosen so that $\sum_{k=1}^{K_{\text{NS}}}D_{k}=D$.
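For readers who want the optimization step spelled out, the following LaTeX sketch follows the standard textbook argument for parallel Gaussian sources in [16]. It is an illustration under stated assumptions, not a reproduction of the paper's own equations: the sum is taken over *k*=1,…,*K*_{NS}, natural logarithms are used, and the variance paired with *D*_{k} is written as $\sigma_k^2$.

```latex
% Lagrangian for minimizing the rate subject to \sum_k D_k = D
\begin{align}
J(D_1,\dots,D_{K_{\text{NS}}})
  &= \sum_{k=1}^{K_{\text{NS}}} \frac{1}{2}\ln\frac{\sigma_k^{2}}{D_k}
     + \lambda \sum_{k=1}^{K_{\text{NS}}} D_k , \\
% Stationarity: the same distortion for every component
\frac{\partial J}{\partial D_k}
  &= -\frac{1}{2}\,\frac{1}{D_k} + \lambda = 0
  \quad\Longrightarrow\quad D_k = \frac{1}{2\lambda} = \lambda' , \\
% Kuhn-Tucker conditions once \lambda' exceeds some variance
\frac{\partial J}{\partial D_k}
  &= -\frac{1}{2}\,\frac{1}{D_k} + \lambda
  \;\begin{cases}
      = 0,   & \text{if } D_k < \sigma_k^{2},\\
      \le 0, & \text{if } D_k \ge \sigma_k^{2}.
    \end{cases}
\end{align}
```

The first two lines give the equal-distortion solution *D*_{k}=*λ*^{′}; the third caps each *D*_{k} at its variance, which is the reverse water-filling behaviour described in Section 3.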

### 3.2 For coprime sampling

#### Theorem 2

(Rate distortion for coprime sampling of a Gaussian source) Let $x_{i}\sim N(0,\sigma_{i}^{2})$, *i*=1,2,⋯,*L*, be independent Gaussian random variables under squared error distortion. The rate distortion function between the original Gaussian source and its coprime-sampled representation is given by

where *K*_{CS} is given in (36) and

where *λ* is chosen so that $\sum_{k=1}^{K_{\text{CS}}}D_{k}=D$.

#### Proof 2

For coprime sampling (CS), we again assume that the original information has length *L*, i.e., *X*^{L}=[*x*_{1},*x*_{2},⋯,*x*_{L}].

A smaller number of samples, *L*^{′′}, is then selected by coprime sampling,

Similarly, (16) becomes

where *K*_{CS} can be determined from the following formula

Therefore, the corresponding rate distortion function for coprime sampling of independent Gaussian sources $N(0,\sigma_{i}^{2})$ is

The minimum value can be obtained using a procedure similar to that described for nested sampling.

### 3.3 Theoretical analysis

Without sparse sampling, the rate distortion function would be

which is much greater than that with sparse sampling.

From the above derivation of the rate distortion functions of nested sampling and coprime sampling, we notice that if the sampling spacings are assumed to be the same for the two methods, i.e., *N*_{1}=*P* and *N*_{2}=*Q*, then the minimum value $K_{\text{NS}_{\min}}$ is achieved when *Y*=*L* (mod (*N*_{1}+1)*N*_{2})=0; therefore

For coprime sampling, the minimum value $K_{\text{CS}_{\min}}$ is achieved when *L* (mod *P*)=0, *L* (mod *Q*)=0, and *L* (mod *P* *Q*)=0; therefore

For these two sparse sampling algorithms, the sampling interval is necessarily greater than the Nyquist sampling spacing, which indicates that *Q*>1; therefore,

which indicates that in most cases, *K*_{NS}>*K*_{CS}. Table 1 shows some examples of *K*_{NS} and *K*_{CS} with respect to the sampling intervals when *N*_{1}=*P*, *N*_{2}=*Q*, and *L*=1,000. It is clear that as the sampling spacings increase, samples are selected more sparsely by both nested sampling and coprime sampling, which results in an increase of *K*_{NS} and *K*_{CS}. In addition, we notice that *K*_{NS}>*K*_{CS}, as proved.

With our assumption that all samples are independent Gaussian $N(0,\sigma_{i}^{2})$, we can conclude that

which indicates that both nested sampling and coprime sampling use fewer bits to describe the information than is needed without sparse sampling (WS).

As noted in Section 2, nested sampling collects *N*_{1}+*N*_{2} samples in total in (*N*_{1}+1)*N*_{2}*T* seconds, while coprime sampling collects *P*+*Q* samples in total in *PQT* seconds. If the sampling intervals are the same, i.e., *N*_{1}=*P* and *N*_{2}=*Q*, nested sampling is slightly sparser than coprime sampling. *R*_{NS}(*D*)<*R*_{CS}(*D*) holds because nested sampling collects slightly fewer samples than coprime sampling for the same data length *L*. The rate *R*(*D*) at a given distortion for both sparse sampling algorithms is lower than that without sparse sampling, because with sparse sampling fewer bits are used to describe the original information.
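The claim that nested sampling retains slightly fewer samples than coprime sampling for the same data length can be illustrated by direct counting. The sketch below is an assumption-laden illustration and does not implement the paper's formulas for *K*_{NS} or *K*_{CS}: it assumes nested positions 1, …, *N*_{1} and multiples of *N*_{1}+1 within each period of (*N*_{1}+1)*N*_{2}, assumes coprime sampling instants at multiples of *P* or *Q* starting from index 0, and counts each distinct instant once (so an instant hit by both coprime samplers is not counted twice). All names are hypothetical.

```python
def nested_sample_count(length, n1, n2):
    """Count indices 1..length that a two-level nested sampler retains."""
    period = (n1 + 1) * n2
    count = 0
    for n in range(1, length + 1):
        pos = (n - 1) % period + 1          # position 1..period within the period
        if pos <= n1 or pos % (n1 + 1) == 0:  # level-1 or level-2 location
            count += 1
    return count


def coprime_sample_count(length, p, q):
    """Count distinct indices 0..length-1 hit by either uniform sampler."""
    return sum(1 for n in range(length) if n % p == 0 or n % q == 0)


L = 1000
for a, b in [(3, 5), (3, 11), (3, 17)]:
    ns = nested_sample_count(L, a, b)
    cs = coprime_sample_count(L, a, b)
    print(f"spacings ({a}, {b}): nested keeps {ns}, coprime keeps {cs}")
```

Under these assumptions the nested count is below the coprime count for each spacing pair, in line with the sparsity argument above.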

## 4 Numerical results

The total length of the information is set to *L*=1,000. Each sample is assumed to follow a Gaussian distribution *N*(0,1) with zero mean and unit variance. We also assume *D*_{k}=*λ*<*σ*^{2}=1, i.e., equal distortion for each random variable.

Figure 3 shows the rate distortion performance of nested sampling with different sampling spacings. The rate clearly decreases as the distortion increases. When the sampling intervals *N*_{1} and *N*_{2} become larger, i.e., fewer samples are acquired, the rate becomes smaller. For example, when *D*=0.3, *N*_{1}=3, and *N*_{2}=5, the data rate is *R*(*D*)≈1,350, while with the sampling pair increased to *N*_{1}=3, *N*_{2}=11, *R*(*D*)≈1,220, which is much smaller. This is because sparser sampling requires fewer bits to represent the information.

The rate distortion performance of coprime sampling with different sampling spacings is shown in Figure 4. As with nested sampling, the rate *R*(*D*) decreases as the distortion increases. When the sampling intervals *P* and *Q* become larger, the rate becomes smaller.

Figure 5 compares the rate distortion performance of nested sampling and coprime sampling, where *D* is the distortion between the original source and its sparse-sampled representation, and *R*(*D*) is the corresponding rate at a particular distortion *D*. With the same sampling spacings, *N*_{1}=*P* and *N*_{2}=*Q*, and at the same distortion, the rate of nested sampling is lower than that of coprime sampling. For example, when *N*_{1}=*P*=3, *N*_{2}=*Q*=17, and *D*=0.3, the rate for nested sampling is *R*_{NS}(*D*)≈1,200, while the rate for coprime sampling is *R*_{CS}(*D*)≈1,300. This verifies the result that *R*_{NS}(*D*)<*R*_{CS}(*D*), because nested sampling collects slightly fewer samples than coprime sampling for the same data length *L*, i.e., it is slightly sparser than coprime sampling.

## 5 Conclusions

The information rate distortion function is a measure of distortion between the original source and its representation. Our purpose in this paper is to construct a distortion function that can measure the distortion caused by two sparse sampling algorithms. The information-theoretic rate distortion performance of these two sparse sampling methods, nested sampling and coprime sampling, is studied. It is shown that with these two sparse sampling algorithms, the data rate is much lower than that without sparse sampling at a given distortion. As the sampling spacings increase, i.e., as data are acquired more sparsely, the data rate decreases at a given distortion. This is because sparser sampling requires fewer bits to represent the information. We also show that, with the same sampling pairs, the rate of nested sampling is lower than that of coprime sampling at the same distortion.

## References

1. Bughin J, Chui M, Manyika J: *Clouds, Big Data, and Smart Assets: Ten Tech-enabled Business Trends to Watch*. New York: McKinsey Quarterly; 2010.
2. Labrinidis A, Jagadish HV: Challenges and opportunities with big data. *Proc. VLDB Endowment* 2012, 5(12):2032-2033.
3. Bollier D, Firestone CM: *The Promise and Peril of Big Data*. Washington, DC: Aspen Institute, Communications and Society Program; 2010.
4. O’Reilly Radar Team: *Big Data Now: Current Perspectives from O’Reilly Radar*. Sebastopol: O’Reilly Media; 2011.
5. Chen J, Liang Q, Paden J, Gogineni P: Compressive sensing analysis of synthetic aperture radar raw data. In *Proceedings of the IEEE International Conference on Communications (ICC’12)*. Ottawa, ON; 10–15 June 2012:6362-6366.
6. Chen J, Liang Q: Efficient sampling for radar sensor networks. *Int. J. Sensor Netw.*, in press.
7. Pal P, Vaidyanathan PP: Nested Arrays: a novel approach to array processing with enhanced degrees of freedom. *IEEE Trans. Signal Process* 2010, 58(8):4167-4181.
8. Pal P, Vaidyanathan PP: Coprime sampling and the MUSIC algorithm. In *Proceedings of the Digital Signal Process. Workshop, IEEE Signal Process. Educ. Workshop*. Sedona, AZ; 4–7 January 2011:289-294.
9. Pal P, Vaidyanathan PP: A novel array structure for directions-of-arrival estimation with increased degrees of freedom. In *Proceedings of the Acoustics Speech Signal Process*. Dallas, TX; 14–19 March 2010:2606-2609.
10. Pal P, Piya, Vaidyanathan PP: Two dimensional nested arrays on lattices. In *Proceedings of the IEEE Int. Conf. Acoustics, Speech Signal Process. (ICASSP)*. Prague, 22–27 May 2011; 2011:548-2551.
11. Vaidyanathan PP, Pal P: Sparse sensing with co-prime samplers and arrays. *IEEE Trans. Signal Process* 2011, 59(2):573-586.
12. Chen J, Liang Q, Zhang B, Wu X: Spectrum efficiency of nested sparse sampling and co-prime sampling. *EURASIP J. Wireless Commun. Netw* 2013, 2013: 47. 10.1186/1687-1499-2013-47
13. Chen J, Liang Q, Wang J, Choi H-A: Spectrum efficiency of nested sparse sampling. In *Wireless Algorithms, Systems, and Applications*. Berlin Heidelberg: Springer; 2012:574-583.
14. Chen J, Liang Q, Wang J: Secure transmission for big data based on nested sampling and coprime sampling with spectrum efficiency. *Secur. Commun. Netw. Wiley Security Comm. Networks* 2013. doi:10.1002/sec.785
15. Liang Q, Cheng X, Huang SC, Chen D: Opportunistic sensing in wireless sensor networks: theory and application. *IEEE Trans. Comput.* 2013. doi:10.1109/TC.2013.85
16. Cover TM, Thomas JA: *Elements of Information Theory, Second Edition*. Hoboken: Wiley; 2006.
17. Chen J, Liang Q: Rate distortion performance analysis of compressive sensing. In *Proceedings of the IEEE Global Telecommun. Conf. (GLOBECOM 2011)*. Houston, TX; 5–9 December 2011:1-5.
18. Capdevila M, Florez OWM: A communication perspective on automatic text categorization. *IEEE Trans. Knowl. Data Eng* 2009, 21: 1027-1041.
19. Chen J, Liang Q, Zhang B, Wu X: Information theoretic performance bounds for noisy compressive sensing. Paper presented at the ICC’2013. Budapest, 09–13 June 2013.
20. Chen J, Liang Q: Theoretical performance limits for compressive sensing with random noise. Paper presented at the IEEE Global Communications Conference 2013 (GLOBECOM’13). Atlanta, 09–13 December 2013.

## Acknowledgements

This work was supported in part by US Office of Naval Research under Grants N00014-13-1-0043, N00014-11-1-0865, US National Science Foundation under grants CNS-1247848, CNS-1116749, CNS-0964713, and National Science Foundation of China (NSFC) under grant 61372097.


## Additional information

### Competing interests

The authors declare that they have no competing interests.


## Rights and permissions

**Open Access** This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

## About this article

### Cite this article

Chen, J., Liang, Q. Rate distortion performance analysis of nested sampling and coprime sampling. *EURASIP J. Adv. Signal Process.* **2014**, 18 (2014). https://doi.org/10.1186/1687-6180-2014-18


DOI: https://doi.org/10.1186/1687-6180-2014-18

### Keywords

- Sparse sampling
- Rate distortion
- Information theory
- Nested sampling
- Coprime sampling