Rate distortion performance analysis of nested sampling and coprime sampling
EURASIP Journal on Advances in Signal Processing volume 2014, Article number: 18 (2014)
Abstract
This paper studies the rate distortion performance of nested sampling and coprime sampling. We show that the data rate decreases as the distortion increases, and that with either of these sparse sampling algorithms the data rate is much lower than without sparse sampling. At a given distortion, the data rate decreases as the sampling spacings increase, because sparser sampling requires fewer bits to represent the information. We also prove that, with the same sampling pairs, the rate of nested sampling is lower than that of coprime sampling at the same distortion. The reason is that nested sampling collects slightly fewer samples than coprime sampling from data of the same length, i.e., it is slightly sparser than coprime sampling.
1 Introduction
The twenty-first century is awash with data. Data are flooding in at rates never seen before, doubling almost every 18 months [1], as a result of new information gathered from radar, Web communities, newly deployed smart assets, customer data from public, proprietary, and purchased sources, and so forth. For example, oil companies, telecommunication companies, and other data-centric industries have held huge volumes of data for a long time. Data are now being collected and transmitted at unprecedented scale [2, 3] in a wide range of application areas. The phrase ‘Big Data’, as defined by the US National Science Foundation in its recent solicitation, refers to large, diverse, complex, distributed data sets generated from instruments, sensors, Internet transactions, email, video, click streams, and all other digital sources available today and in the future.
Unstructured data are data that do not follow a specified format, and they make up most of Big Data. Radar or sonar data, which include meteorological, vehicular, and oceanographic seismic information, are one typical example; Big Data from the perspective of O’Reilly Radar is described in [4].
Many efforts have been made to develop suitable compression techniques for Big Data. However, traditional compression methods [5, 6] are all based on the Nyquist rate, which leads to poor efficiency in terms of both sampling rate and computational complexity. Unlike traditional compression techniques, several sparse sampling algorithms have been proposed to overcome the Nyquist sampling requirement, such as compressive sensing, nested sampling, and coprime sampling.
Nested sampling [7] is a non-uniform sampling scheme that uses two different samplers in each period. Although the signal is sampled sparsely and non-uniformly, its autocorrelation can be estimated at all lags. Therefore, although the samples can be arbitrarily sparse, the statistical information of the signal is preserved [8]. Coprime sampling, in contrast, uses two uniform samplers whose sample spacings P and Q are coprime integers. The authors in [8] proved that these two sets of samples suffice to estimate the autocorrelation of the original signal at all lags. Because both nested sampling and coprime sampling preserve the statistical properties of the original signal, they can be applied to Big Data to greatly reduce its transmission or storage cost.
The information rate distortion function gives the minimum rate needed to represent a source at a given distortion between the original source and its representation. In this paper, we derive the theoretical rate distortion performance that results from the two sparse sampling algorithms, nested sampling (NS) and coprime sampling (CS). We show that with these two sparse sampling algorithms, the data rate is much lower than without sparse sampling for a given distortion. At a given distortion, the data rate decreases as the sampling spacings increase, because sparser sampling requires fewer bits to represent the information. We also prove that, with the same sampling pairs, the rate of nested sampling is lower than that of coprime sampling at the same distortion.
The rest of this paper is organized as follows. Section 2 gives a brief introduction to nested sampling and coprime sampling. The theoretical derivation of the rate distortion performance of nested sampling and coprime sampling is detailed in Sections 3.1 and 3.2, and the theoretical analysis and comparison of the two sparse sampling methods is given in Section 3.3. In Section 4, numerical results are provided to verify the theoretical rate distortion results derived in Section 3. Conclusions are given in Section 5.
2 Preliminaries
2.1 Nested sampling
The nested array was first introduced in [7] as an effective approach to array processing with enhanced degrees of freedom [9, 10]. In the time domain, the signal’s autocorrelation can also be obtained from a nested sampling structure [11]. Although the samples produced by nested sparse sampling are sparsely and non-uniformly located, the samples of the autocorrelation can be computed at any specified rate. Applications that depend on the difference co-array, or autocorrelation, such as direction-of-arrival (DOA) estimation and beamforming, can therefore be based on nested sampling.
In its simplest form, nested sampling has two levels of sampling density [11–14], with the level 1 samples at the N1 locations l, 1≤l≤N1, and the level 2 samples at the N2 locations (N1+1)m, 1≤m≤N2.
An example of periodic sparse sampling using the nested sampling structure is shown in Figure 1, with N1=3 and N2=5. The cross differences are given by
$$k = (N_1+1)m - l, \qquad 1 \le m \le N_2,\ 1 \le l \le N_1. \tag{2}$$
The cross differences (2) and their negated versions lie in the range
$$-[(N_1+1)N_2-1] \le k \le (N_1+1)N_2-1,$$
with the maximum value (N1+1)N2−1 [9, 11], and cover this range except for the integers shown in (3) and the corresponding negated versions:
$$(N_1+1),\ 2(N_1+1),\ \ldots,\ (N_2-1)(N_1+1). \tag{3}$$
For the example in Figure 1, where N1=3, N2=5, 1≤m≤5 and 1≤l≤3, the cross differences k=(N1+1)m−l take the values
$$1, 2, 3,\ 5, 6, 7,\ 9, 10, 11,\ 13, 14, 15,\ 17, 18, 19,$$
so the differences 4, 8, 12, and 16 are missing.
The difference 0 is also missing, because m and l are nonzero. All of the missing differences are, however, covered by the self differences among the level 2 samples,
$$(N_1+1)(m_1-m_2), \qquad 1 \le m_1, m_2 \le N_2.$$
Combining the cross differences and the self differences gives the difference co-array, which is filled: every integer lag in the range −[(N1+1)N2−1]≤k≤(N1+1)N2−1 is covered. This indicates that with the nested array structure, although the samples are sparse, the degrees of freedom are enhanced.
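The co-array argument can be checked numerically for the example N1=3, N2=5. The following is a minimal Python sketch, assuming level 1 samples at l=1,…,N1 and level 2 samples at (N1+1)m, m=1,…,N2; the function name `nested_differences` is hypothetical and is introduced here only for illustration.

```python
def nested_differences(n1, n2):
    """Return (cross, self_level2) non-negative difference sets of a two-level nested sampler."""
    level1 = range(1, n1 + 1)                          # l = 1, ..., N1
    level2 = [(n1 + 1) * m for m in range(1, n2 + 1)]  # (N1+1)m, m = 1, ..., N2
    # Cross differences k = (N1+1)m - l (always positive for these index ranges).
    cross = {b - a for b in level2 for a in level1}
    # Non-negative self differences among the level 2 samples.
    self_level2 = {b - a for b in level2 for a in level2 if b >= a}
    return cross, self_level2


n1, n2 = 3, 5
cross, self2 = nested_differences(n1, n2)
max_lag = (n1 + 1) * n2 - 1                # largest cross difference
all_lags = set(range(max_lag + 1))

missing = all_lags - cross                 # lags the cross differences do not reach
filled = (cross | self2) == all_lags       # True if the co-array has no holes

print("missing from cross differences:", sorted(missing))
print("co-array filled:", filled)
```

For this example the lags missing from the cross differences are exactly the multiples of N1+1, which the level 2 self differences supply.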
Based on this principle, sparse non-uniform sampling with the nested sampling structure can be performed as in Figure 1. There are two levels of nesting, with N1 level 1 samples and N2 level 2 samples in each period of length (N1+1)N2. Nested sampling is therefore non-uniform, and the samples obtained are very sparse.
In (N1+1)N2T seconds, there are N1+N2 samples in total. Therefore, the average sampling rate is
$$\frac{N_1+N_2}{(N_1+1)N_2T}.$$
Here, $T=1/f_n$, where $f_n \ge 2f_{\max}$ is the Nyquist sampling frequency. Since the Nyquist sampling rate is 1/T, the average sampling rate of nested sampling is smaller than the conventional Nyquist sampling rate.
2.2 Co-prime sampling
Unlike nested sampling, coprime sampling involves two sets of uniformly spaced samplers [8, 12–14], as shown in Figure 2, each of which samples sparsely.
Coprime sampling uniformly samples the source signal using two sub-Nyquist samplers with sample spacings PT and QT, respectively, where P and Q are coprime integers with P<Q. Here, 1/T Hz is the Nyquist rate for a bandlimited process, i.e., $1/T=2f_{\max}$, with $f_{\max}$ being the highest frequency.
Consider the product of a sample x(Pn1) from the first sampler and a sample x(Qn2) from the second sampler, and define the difference of their sample indices as
$$k = Pn_1 - Qn_2.$$
It has been shown in [11] that k can take any integer value in the range 0≤k≤PQ−1 if n1 and n2 are in the ranges 0≤n1≤2Q−1 and 0≤n2≤P−1.
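The coverage property of the coprime differences can be checked by brute force. The sketch below assumes the difference is taken as k=Pn1−Qn2 with 0≤n1≤2Q−1 and 0≤n2≤P−1; the function names are hypothetical and only illustrate the statement from [11].

```python
from math import gcd


def coprime_differences(p, q):
    """Set of differences k = P*n1 - Q*n2 for 0 <= n1 <= 2Q-1 and 0 <= n2 <= P-1."""
    if gcd(p, q) != 1:
        raise ValueError("P and Q must be coprime")
    return {p * n1 - q * n2 for n1 in range(2 * q) for n2 in range(p)}


def covers_all_lags(p, q):
    """True if every integer lag 0, ..., PQ-1 appears among the differences."""
    return set(range(p * q)) <= coprime_differences(p, q)


for p, q in [(3, 5), (3, 11), (4, 7)]:
    print(f"P={p}, Q={q}: lags 0..{p * q - 1} covered -> {covers_all_lags(p, q)}")
```

Restricting n1 to 0≤n1≤Q−1 would leave some lags uncovered (for P=3, Q=5, the lag k=5 is one of them), which is why the first sampler index runs up to 2Q−1.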
For coprime sampling, the two samplers collect P+Q samples in total in PQT seconds, so the average sampling rate is
$$\frac{P+Q}{PQT}.$$
As in nested sampling, $T=1/f_n$, where $f_n \ge 2f_{\max}$ is the Nyquist sampling frequency. The average sampling rate of coprime sampling is therefore also much smaller than the conventional Nyquist sampling rate. However, the signal’s second-order statistics, such as the autocorrelation, are preserved, which allows us to sample a signal sparsely and still estimate some aspects of the signal (spectra, DOA, and so on) at a significantly higher resolution.
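To make the two average sampling rates concrete, the following Python sketch expresses each as a fraction of the Nyquist rate 1/T, using the sample counts stated above (N1+N2 samples per (N1+1)N2T seconds and P+Q samples per PQT seconds). The function names are hypothetical, and the spacing pairs are the ones used later in the numerical results.

```python
from fractions import Fraction


def nested_rate_fraction(n1, n2):
    """Average nested sampling rate divided by the Nyquist rate 1/T."""
    return Fraction(n1 + n2, (n1 + 1) * n2)


def coprime_rate_fraction(p, q):
    """Average coprime sampling rate divided by the Nyquist rate 1/T."""
    return Fraction(p + q, p * q)


for a, b in [(3, 5), (3, 11), (3, 17)]:
    ns = nested_rate_fraction(a, b)
    cs = coprime_rate_fraction(a, b)
    print(f"N1=P={a}, N2=Q={b}: nested {ns} ({float(ns):.3f}), "
          f"coprime {cs} ({float(cs):.3f})")
```

With N1=P and N2=Q the numerators are equal and the nested denominator (N1+1)N2 exceeds PQ, so the nested fraction is always the smaller of the two.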
3 Rate distortion performance
The information rate distortion function gives the minimum rate needed to represent a source at a given distortion between the original source and its representation. Our purpose is to construct a distortion function that measures the distortion caused by either of the two sparse sampling algorithms, nested sampling (NS) or coprime sampling (CS). Sparse sampling can introduce distortion because fewer samples are used. A wide variety of distortion functions have been used, such as Euclidean distance, Hamming distance, Mahalanobis distance, and Itakura-Saito distance. In this paper, squared error distortion is used. The original samples are denoted as $x_i$, $i=1,\dots,L$, where L is the total number of samples. If all original information from the L samples is $X^L=[x_1,x_2,\dots,x_L]$, the selected information after sparse sampling can be represented as [15]
$$\hat{X}^{L'} = S\left(X^L\right),$$
where S(·) denotes sparse sampling, either nested sampling or coprime sampling, and L′<L. The distortion associated with the sparse sampling, between all original samples and the selected samples, is
$$D = E\left[d\left(X^L,\hat{X}^{L'}\right)\right], \tag{12}$$
where d(·) is the distortion function.
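The expectation in (12) is with respect to the probability distribution on $X^L$. The rate distortion function R(D) is the minimum of the data rates R such that (R,D) is in the rate distortion region for a given distortion D. From [16, 17], the information rate distortion function is defined as
$$R(D) = \min_{p(\hat{x}^{L'} \mid x^L):\ E[d(X^L,\hat{X}^{L'})] \le D} I\left(X^L;\hat{X}^{L'}\right), \tag{13}$$
where $I(X^L;\hat{X}^{L'})$ is the mutual information between $X^L$ and its sparse-sampled representation $\hat{X}^{L'}$.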
where inequality (a) follows from the fact that conditioning reduces entropy.
From formula (13), we know that
For squared error distortion,
where i=1,⋯,L and j=1,⋯,L′, and (b) follows from the definition that .
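The Gaussian assumption is a classical modeling assumption heavily used in areas such as signal processing and communication systems [18]. From [16], the rate distortion function for a single Gaussian source $N(0,\sigma^2)$ with squared error distortion is
$$R(D) = \begin{cases} \dfrac{1}{2}\log\dfrac{\sigma^2}{D}, & 0 \le D \le \sigma^2, \\ 0, & D > \sigma^2. \end{cases}$$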
For L independent zero-mean Gaussian sources $x_1,\dots,x_L$ with variances $\sigma_i^2$, the rate distortion function with squared error distortion is given by [16, 17, 19, 20]
$$R(D) = \sum_{i=1}^{L} \frac{1}{2}\log\frac{\sigma_i^2}{D_i},$$
where
$$D_i = \begin{cases} \lambda, & \lambda < \sigma_i^2, \\ \sigma_i^2, & \lambda \ge \sigma_i^2, \end{cases}$$
and λ is chosen so that $\sum_{i=1}^{L} D_i = D$. This gives rise to a kind of reverse water-filling: we choose a constant λ and describe only those random variables with variance greater than λ, and no bits are used to describe random variables with variance less than λ.
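Reverse water-filling is easy to evaluate numerically. The following Python sketch is one possible implementation, not the authors' code: it assumes independent zero-mean Gaussian components, a total distortion budget equal to the sum of the per-component distortions, and rates measured in bits (logarithm base 2). The function name `reverse_waterfilling` and the bisection search for λ are choices made here for illustration.

```python
import math


def reverse_waterfilling(variances, total_distortion, iters=100):
    """Return (rate_in_bits, per_component_distortions) for independent Gaussians.

    The water level lam is found by bisection so that
    sum(min(lam, var_i)) equals total_distortion.
    """
    if not 0 < total_distortion <= sum(variances):
        raise ValueError("total distortion must lie in (0, sum of variances]")
    lo, hi = 0.0, max(variances)
    for _ in range(iters):
        lam = (lo + hi) / 2
        if sum(min(lam, v) for v in variances) < total_distortion:
            lo = lam      # water level too low: distortion budget not used up
        else:
            hi = lam      # water level high enough: try a lower one
    lam = (lo + hi) / 2
    distortions = [min(lam, v) for v in variances]
    # Components with variance at or below the water level get zero bits.
    rate = sum(0.5 * math.log2(v / d)
               for v, d in zip(variances, distortions) if d < v)
    return rate, distortions


# Equal unit variances: every component receives the same distortion.
rate, dist = reverse_waterfilling([1.0] * 4, 1.2)
print(round(rate, 4), [round(d, 4) for d in dist])

# Unequal variances: the weakest component is not described at all.
rate, dist = reverse_waterfilling([4.0, 1.0, 0.25], 1.0)
print(round(rate, 4), [round(d, 4) for d in dist])
```

In the second call the water level settles above the smallest variance, so that component is assigned a distortion equal to its variance and contributes no bits to the rate.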
3.1 For nested sampling
Theorem 1
(Rate distortion for nested sampling of Gaussian source) Let $x_i \sim N(0,\sigma_i^2)$, $i=1,2,\dots,L$, be independent Gaussian random variables under squared error distortion. The rate distortion function between the original Gaussian source and its nested-sampled version is given by
where KNS is given in (24) and
where λ is chosen so that .
Proof 1
For nested sampling (NS), the original information consists of all L samples, $X^L=[x_1,x_2,\dots,x_L]$.
A smaller number of samples, L′, is then selected by nested sampling as described in Section 2.1,
Therefore, (16) becomes
where the length of KNS can be determined from the following formula, in which we let Y=L (mod (N1+1)N2),
where
in which U=(Y−(N1+1))(mod (N1+1)).
If all samples are assumed to be independent Gaussian, $x_i \sim N(0,\sigma_i^2)$, the corresponding rate distortion function for nested sampling is
where inequality (c) follows from the fact that the normal distribution maximizes the entropy for a given second moment, and .
To find the minimum value, we could use Lagrange multipliers
and, differentiating with respect to $D_k$ and setting the result equal to 0, we have
or
which results in an equal distortion for each random variable if the constant λ′ is less than $\sigma_i^2$ for all i. As the total allowable distortion D increases, the constant λ′ increases until it exceeds $\sigma_i^2$ for some i. The Kuhn-Tucker conditions can then be used to find the minimum in (26). In this case, the Kuhn-Tucker conditions yield
Therefore,
where λ is chosen so that .
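For reference, the Lagrangian step in this proof follows the standard reverse water-filling argument in [16]. The following is a sketch for a generic set of m independent Gaussian components with variances $\sigma_i^2$ and natural logarithms; the index range m is an assumption made here for illustration and stands in for whichever samples enter the sum in the nested sampling rate expression.

```latex
\begin{align*}
J(D_1,\dots,D_m) &= \sum_{i=1}^{m} \frac{1}{2}\ln\frac{\sigma_i^2}{D_i}
                    + \lambda \sum_{i=1}^{m} D_i, \\
\frac{\partial J}{\partial D_i} &= -\frac{1}{2}\,\frac{1}{D_i} + \lambda = 0
  \quad\Longrightarrow\quad D_i = \lambda' = \frac{1}{2\lambda}, \\
\text{Kuhn-Tucker:}\quad
\frac{\partial J}{\partial D_i} &
  \begin{cases} = 0, & D_i < \sigma_i^2, \\ \le 0, & D_i \ge \sigma_i^2, \end{cases}
  \quad\Longrightarrow\quad
  D_i = \begin{cases} \lambda', & \lambda' < \sigma_i^2, \\
                      \sigma_i^2, & \lambda' \ge \sigma_i^2. \end{cases}
\end{align*}
```

The stationarity condition alone gives the equal-distortion solution; the Kuhn-Tucker form handles the components whose variance falls below the water level λ′, which are assigned a distortion equal to their variance.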
3.2 For coprime sampling
Theorem 2.
(Rate distortion for coprime sampling of Gaussian source) Let $x_i \sim N(0,\sigma_i^2)$, $i=1,2,\dots,L$, be independent Gaussian random variables under squared error distortion. The rate distortion function between the original Gaussian source and its coprime-sampled version is given by
where KCS is given in (36) and
where λ is chosen so that .
Proof 2.
For coprime sampling (CS), we again assume that the original information has length L, i.e., $X^L=[x_1,x_2,\dots,x_L]$.
A smaller number of samples, L′′, is then selected by coprime sampling,
Similarly, (16) becomes
where the length of KCS can be determined from the following formula
Therefore, the corresponding rate distortion function for coprime sampling of independent Gaussian source is
The minimum value can be obtained by the same procedure as described for nested sampling.
3.3 Theoretical analysis
Without sparse sampling, the rate distortion function would be
which is much greater than that with sparse sampling.
From the above derivation of the rate distortion functions of nested sampling and coprime sampling, we notice that if the sampling spacings are assumed to be the same, i.e., N1=P and N2=Q for the two sparse sampling methods, then the minimum value of KNS is achieved when Y=L (mod (N1+1)N2)=0; therefore
For coprime sampling, the minimum value of KCS is achieved when L (mod P)=0, L (mod Q)=0, and L (mod PQ)=0; therefore
For these two sparse sampling algorithms, the sampling interval is necessarily greater than the Nyquist sampling spacing, which indicates that Q>1; therefore,
which indicates that in most cases, KNS>KCS. Table 1 shows some examples of KNS and KCS with respect to the sampling intervals when N1=P, N2=Q, and L=1,000. With the increase of the sampling spacings, samples are selected more sparsely by both nested sampling and coprime sampling, which results in an increase of KNS and KCS. In addition, we notice that KNS>KCS, as proved.
With our assumption that all samples are independent Gaussian, we conclude that
$$R_{NS}(D) < R_{CS}(D) < R_{WS}(D),$$
which indicates that both nested sampling and coprime sampling use fewer bits to describe the information than is needed without sparse sampling (WS).
As noted in Section 2, nested sampling collects N1+N2 samples in total in (N1+1)N2T seconds, while coprime sampling collects P+Q samples in total in PQT seconds. If the sampling intervals are the same, i.e., N1=P and N2=Q, nested sampling is therefore slightly sparser than coprime sampling. RNS(D)<RCS(D) holds because nested sampling collects slightly fewer samples than coprime sampling from data of the same length L. The rate R(D) at a given distortion for both sparse sampling algorithms is lower than that without sparse sampling, because with sparse sampling fewer bits are used to describe the original information.
4 Numerical results
The total length of the information is set to L=1,000. Each sample is assumed to follow a Gaussian distribution N(0,1) with zero mean and unit variance. We also assume $D_k=\lambda<\sigma^2=1$, i.e., equal distortion for each random variable.
Figure 3 shows the rate distortion performance of nested sampling with different sampling spacings. The rate decreases as the distortion increases. When the sampling intervals N1 and N2 become larger, i.e., fewer samples are acquired, the rate becomes smaller. For example, when D=0.3, N1=3, and N2=5, the data rate is R(D)≈1,350, while for the larger sampling pair N1=3, N2=11, R(D)≈1,220, which is much smaller. This is because with sparser sampling, fewer bits are required to represent the information.
The rate distortion performance of coprime sampling with different sampling spacings is shown in Figure 4. As with nested sampling, the rate R(D) decreases as the distortion increases. When the sampling intervals P and Q become larger, the rate becomes smaller.
Figure 5 compares the rate distortion performance of nested sampling and coprime sampling, where D is the distortion between the original source and its sparse-sampled representation, and R(D) is the corresponding rate at a particular distortion D. With the same sampling spacings, N1=P and N2=Q, and at the same distortion, the rate of nested sampling is lower than that of coprime sampling. For example, when N1=P=3, N2=Q=17, and D=0.3, the rate for nested sampling is RNS(D)≈1,200, while the rate for coprime sampling is RCS(D)≈1,300. This verifies the result that RNS(D)<RCS(D), because nested sampling collects slightly fewer samples than coprime sampling from data of the same length L, i.e., it is slightly sparser than coprime sampling.
5 Conclusions
The information rate distortion function gives the minimum rate needed to represent a source at a given distortion between the original source and its representation. Our purpose in this paper was to construct a distortion function that measures the distortion caused by two sparse sampling algorithms, and the information theoretical rate distortion performance of these two methods, nested sampling and coprime sampling, was studied. We showed that with these two sparse sampling algorithms, the data rate is much lower than without sparse sampling at a given distortion. As the sampling spacings increase, i.e., as data are acquired more sparsely, the data rate decreases at a given distortion, because with sparser sampling fewer bits are required to represent the information. We also showed that, with the same sampling pairs, the rate of nested sampling is lower than that of coprime sampling at the same distortion.
References
Bughin J, Chui M, Manyika J: Clouds, Big Data, and Smart Assets: Ten Tech-enabled Business Trends to Watch. New York: McKinsey Quarterly; 2010.
Labrinidis A, Jagadish HV: Challenges and opportunities with big data. Proc. VLDB Endowment 2012, 5(12):2032-2033.
Bollier D, Firestone CM: The Promise and Peril of Big Data. Washington, DC: Aspen Institute, Communications and Society Program; 2010.
O’Reilly Radar Team: Big Data Now: Current Perspectives from O’Reilly Radar. Sebastopol: O’Reilly Media; 2011.
Chen J, Liang Q, Paden J, Gogineni P: Compressive sensing analysis of synthetic aperture radar raw data. In Proceedings of the IEEE International Conference on Communications (ICC’12). Ottawa, ON; 10–15 June 2012:6362-6366.
Chen J, Liang Q: Efficient sampling for radar sensor networks. Int. J. Sensor Netw., in press
Pal P, Vaidyanathan PP: Nested Arrays: a novel approach to array processing with enhanced degrees of freedom. IEEE Trans. Signal Process 2010, 58(8):4167-4181.
Pal P, Vaidyanathan PP: Coprime sampling and the MUSIC algorithm. In Proceedings of the Digital Signal Process. Workshop, IEEE Signal Process. Educ. Workshop. Sedona, AZ; 4–7 January 2011:289-294.
Pal P, Vaidyanathan PP: A novel array structure for directions-of-arrival estimation with increased degrees of freedom. In Proceedings of the Acoustics Speech Signal Process. Dallas, TX; 14–19 March 2010:2606-2609.
Pal P, Vaidyanathan PP: Two dimensional nested arrays on lattices. In Proceedings of the IEEE Int. Conf. Acoustics, Speech Signal Process. (ICASSP). Prague; 22–27 May 2011:2548-2551.
Vaidyanathan PP, Pal P: Sparse sensing with co-prime samplers and arrays. IEEE Trans. Signal Process 2011, 59(2):573-586.
Chen J, Liang Q, Zhang B, Wu X: Spectrum efficiency of nested sparse sampling and co-prime sampling. EURASIP J. Wireless Commun. Netw 2013, 2013: 47. 10.1186/1687-1499-2013-47
Chen J, Liang Q, Wang J, Choi H-A: Spectrum efficiency of nested sparse sampling. In Wireless Algorithms, Systems, and Applications. Berlin Heidelberg: Springer; 2012:574-583.
Chen J, Liang Q, Wang J: Secure transmission for big data based on nested sampling and coprime sampling with spectrum efficiency. Secur. Commun. Netw. 2013. doi:10.1002/sec.785
Liang Q, Cheng X, Huang SC, Chen D: Opportunistic sensing in wireless sensor networks: theory and application. IEEE Trans. Comput. 2013. doi:10.1109/TC.2013.85
Cover TM, Thomas JA: Elements of Information Theory, Second Edition. Hoboken: Wiley; 2006.
Chen J, Liang Q: Rate distortion performance analysis of compressive sensing. In Proceedings of the IEEE Global Telecommun. Conf. (GLOBECOM 2011). Houston, TX; 5–9 December 2011:1-5.
Capdevila M, Florez OWM: A communication perspective on automatic text categorization. IEEE Trans. Knowl. Data Eng 2009, 21: 1027-1041.
Chen J, Liang Q, Zhang B, Wu X: Information theoretic performance bounds for noisy compressive sensing. Paper presented at ICC’2013. Budapest; 9–13 June 2013.
Chen J, Liang Q: Theoretical performance limits for compressive sensing with random noise. Paper presented at the IEEE Global Communications Conference 2013 (GLOBECOM’13). Atlanta; 9–13 December 2013.
Acknowledgements
This work was supported in part by US Office of Naval Research under Grants N00014-13-1-0043, N00014-11-1-0865, US National Science Foundation under grants CNS-1247848, CNS-1116749, CNS-0964713, and National Science Foundation of China (NSFC) under grant 61372097.
Competing interests
The authors declare that they have no competing interests.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
About this article
Cite this article
Chen, J., Liang, Q. Rate distortion performance analysis of nested sampling and coprime sampling. EURASIP J. Adv. Signal Process. 2014, 18 (2014). https://doi.org/10.1186/1687-6180-2014-18
DOI: https://doi.org/10.1186/1687-6180-2014-18