Decentralized Turbo Bayesian Compressed Sensing with Application to UWB Systems

In many situations, original signals contain substantial spatial and temporal redundancy. Based on this observation, a novel Turbo Bayesian Compressed Sensing (TBCS) algorithm is proposed to provide an efficient approach for transferring and incorporating this redundant information in joint sparse signal reconstruction. As a case study, the TBCS algorithm is applied to Ultra-Wideband (UWB) systems. A space-time TBCS structure is developed for exploiting and incorporating spatial and temporal a priori information in space-time signal reconstruction. Simulation results demonstrate that the proposed TBCS algorithm achieves much better performance with only a few measurements in the presence of noise, compared with the traditional Bayesian Compressed Sensing (BCS) and multitask BCS algorithms.


Introduction
Compressed sensing (CS) theory [1,2] has flourished in recent years. In CS theory, the original signal is not acquired directly but reconstructed from measurements obtained by projecting the signal with a random sensing matrix. It is well known that most natural signals are sparse, that is, in a certain transform domain most of their coefficients are zero or have very small amplitudes. Taking advantage of such sparsity, various CS signal reconstruction algorithms have been developed to recover the original signal from a few observations and measurements [3-5].
In many situations, there are multiple copies of signals that are correlated in space and time, thus providing spatial and temporal redundancies. Take the CS-based Ultra-Wideband (UWB) system [8,9] as an example. (A UWB system utilizes short-range, high-bandwidth pulses without a carrier frequency for communication, positioning, and radar imaging. One challenge is the acquisition of the high-resolution, ultrashort-duration pulses. The emergence of CS theory provides an approach to acquiring UWB pulses, possibly below the Nyquist sampling rate [6,7].) In a typical UWB system, as shown in Figure 1, one transmitter periodically sends out ultrashort pulses (typically nano- or subnanosecond Gaussian pulses). Several UWB receivers surrounding the transmitter receive the pulses. The echo signals received at one receiver are similar to those received at other receivers in both space and time for the following reasons: (1) in the same time slot, the received UWB signals are similar to each other because they share the same source, which leads to spatial redundancy; (2) at the same receiver, the received signals are also similar in consecutive time slots because the pulses are periodically transmitted and the propagation channels are assumed to change very slowly. Hence, the UWB echo signals are correlated both in space and time, which provides spatial and temporal redundancies and helpful information. Such a priori information can be exploited in the joint CS signal reconstruction to improve performance. Our work is also motivated by the need to reduce the number of necessary measurements and improve the capability of combating noise. For successful CS signal reconstruction, a certain number of measurements are needed, and in the presence of noise this number may increase greatly. However, more measurements lead to more expensive and complex hardware and software in the system [6].
In such a situation, a question arises: can we develop a joint CS signal reconstruction algorithm that exploits temporal and spatial a priori information to improve performance in terms of fewer measurements, more noise tolerance, and better quality of the reconstructed signal? Related research on joint CS signal reconstruction has appeared in the literature recently. Distributed compressed sensing (DCS) [10,11] studies joint sparsity and joint signal reconstruction. Simultaneous Orthogonal Matching Pursuit (SOMP) [12,13] for simultaneous signal reconstruction extends the traditional Orthogonal Matching Pursuit (OMP) algorithm. Serial OMP [14] studies time-sequence signal reconstruction. The joint sparse recovery algorithm [15] is developed in association with the basis pursuit (BP) algorithm. These algorithms focus on either temporal or spatial joint signal reconstruction. They are developed by extending convex optimization and linear programming algorithms but ignore the impact of possible noise in the measurements.
Other work on sparse signal reconstruction is based on a statistical Bayesian framework. In [16,17], the authors developed a sparse signal reconstruction algorithm based on the belief propagation framework, in which information is exchanged among different elements of the signal vector in a way similar to the decoding of low-density parity-check (LDPC) codes. In [18], the LDPC coding/decoding algorithm is extended to real-valued CS signal reconstruction. Other Bayesian CS algorithms have been developed in [3,4,19,20]. In [3], a pursuit method in the Bernoulli-Gaussian model is proposed to search for the nonzero signal elements. A Bayesian approach to sparse component analysis in the noisy case is presented in [4]. In [19], a Gaussian mixture is adopted as the prior distribution in the Bayesian model, which has performance similar to that of the algorithm in [21]. In [20], using a Laplace prior distribution in the hierarchical Bayesian model yields smaller reconstruction errors than using a Gaussian prior distribution [21]. However, all these algorithms are designed for single-signal reconstruction and do not apply to multiple simultaneous signal reconstruction. We are looking for a prior distribution suitable for mutual information transfer; the prior distributions proposed in [3,19,20] are too complex for exploiting redundancy information in joint signal reconstruction. In [22], the redundancies of UWB signals are incorporated into the framework of Bayesian Compressed Sensing (BCS) [5,21] with good performance; however, only a heuristic approach to utilizing the redundancy is proposed there.
More related work on joint sparse signal reconstruction includes [23], in which the authors proposed multitask Bayesian compressive sensing (MBCS) for simultaneous joint signal reconstruction by sharing the same set of hyperparameters among the signals. The mutual information is directly transferred across multiple simultaneous signal reconstruction tasks. The mechanism for sharing mutual information in [24] is similar to that of MBCS [23]. This sharing scheme is effective and straightforward: for signals with high similarity, it achieves much better performance than the original BCS algorithm. However, for a low level of similarity, the a priori information may adversely affect the signal reconstruction, resulting in much worse performance than the original BCS. In situations where many low-similarity signals exist, this disadvantage can be unacceptable.
Our work and MBCS [23] are both focused on reconstructing multiple signal frames. However, MBCS cannot perform simultaneous multitask signal reconstruction until all measurements have been collected, which is purely in a batch mode and cannot be performed in an online manner. Moreover, MBCS is centralized and is hard to decentralize. Our proposed incremental and decentralized TBCS has a more flexible structure, which can reconstruct multiple signal frames sequentially in time and/or in parallel in space through transferring mutual a priori information.
In this paper, we propose a novel and flexible Turbo Bayesian Compressed Sensing (TBCS) algorithm for sparse signal reconstruction that exploits and integrates spatial and temporal redundancies across multiple signal reconstruction procedures performed in parallel, in serial, or both. Note that the BCS algorithm has an excellent capability of combating noise by employing a statistically hierarchical structure, which is very suitable for transferring a priori information. Based on the BCS algorithm, we propose an a priori information-based iterative mechanism for information exchange among different reconstruction processes, motivated by the Turbo decoding structure, which we denote Turbo BCS. To the authors' best knowledge, there has not been any work applying the Turbo scheme in the BCS framework. Moreover, in the case study, we apply our TBCS algorithm to UWB systems to develop a Space-Time Turbo Bayesian Compressed Sensing (STTBCS) algorithm for space-time joint signal reconstruction. A key contribution is the space-time structure that exploits and utilizes the temporal and spatial redundancies.
A primary challenge in the proposed framework is how to yield and fuse a priori information in the signal reconstruction procedure in order to utilize spatial and temporal redundancies. A mathematically elegant framework is proposed to impose an exponentially distributed hyperparameter on the existing hyperparameter α of the signal elements. This exponential distribution for the hyperparameter provides an approach to generate and fuse a priori information with measurements in the signal reconstruction procedure. An incremental method [25] is developed to find the limited nonzero signal elements, which reduces the computational complexity compared with the expectation maximization (EM) method. A detailed STTBCS algorithm procedure in the case study of UWB systems is also provided to illustrate that our algorithm is universal and robust: when the signals have low similarities, the performance of STTBCS will automatically equal that of the original BCS; on the other hand, when the similarity is high, the performance of STTBCS is much better than the original BCS.
Simulation results have demonstrated that our TBCS significantly improves performance. We first use spike signals to illustrate the performance which can be achieved at each iteration employing the original BCS, MBCS, and our TBCS algorithms. It shows that our TBCS outperforms the original BCS and MBCS algorithms at each iteration for different similarity levels. We also choose IEEE802.15a [26] UWB echo signals for performance simulation. For the same number of measurements, the reconstructed signal using TBCS is much better compared with the original BCS and MBCS. To achieve the same reconstruction percentage, our proposed scheme needs significantly fewer measurements and is able to tolerate more noise, compared with the original BCS and MBCS algorithms. A distinctive advantage of TBCS is that when the similarity is low, MBCS performance is worse than the original BCS while our TBCS is close to the original BCS and much better than MBCS.
The remainder of this paper is organized as follows. The problem formulation is introduced in Section 2. Based on the BCS framework, a priori information is integrated into the signal reconstruction in Section 3. A fast incremental optimization method for the posterior function is detailed in Section 4. Taking UWB systems as a case study, Section 5 develops and summarizes a space-time TBCS algorithm by applying our TBCS to the UWB system. Numerical simulation results are provided in Section 6. The conclusions are in Section 7. Figure 2 shows a typical decentralized CS signal reconstruction model. We assume that the signals received at the receivers are sparse, and we ignore any other effects, such as the propagation channel, on the original signal. Taking the UWB system as an example, all the original UWB echo signals, s 11 , s 12 , s 21 , . . ., are naturally sparse in the time domain. These signals can be reconstructed in high resolution from a limited number of measurements acquired using low-sampling-rate ADCs by taking advantage of CS theory. We define a procedure as a signal reconstruction process that recovers the signal vector from measurements. Signal reconstruction procedures are performed distributively. We will develop a decentralized TBCS reconstruction algorithm to exploit and transfer mutual a priori information among multiple signal reconstruction procedures in time sequence and/or in parallel.

Problem Formulation
We assume that time is divided into K frames. Temporally, the series of K original signal vectors at the first procedure is denoted by s 11 , s 12 , . . . , s 1K (s 1k ∈ R N ), which can be correspondingly recovered from the measurements y 11 , y 12 , . . . , y 1K (y 1k ∈ R M ) by using the projection matrix Φ 1 . All these measurement vectors are collected in time sequence. Spatially, at the same time slot, for example, the kth frame, a set of I original signal vectors, denoted by s 1k , s 2k , . . . , s Ik (s ik ∈ R N ), needs to be reconstructed from the corresponding M-dimensional measurement vectors y 1k , y 2k , . . . , y Ik (y ik ∈ R M ) by using the different projection matrices Φ 1 , Φ 2 , . . . , Φ I . All the spatial measurement vectors are collected at the same time.
The measurements are linear transforms of the original signals, contaminated by noise:

$$ y_{ik} = \Phi_i s_{ik} + n_{ik}, \qquad i = 1, \ldots, I,\; k = 1, \ldots, K, \qquad (1) $$

where the n ik are additive white Gaussian noise vectors with unknown but stationary power β ik . The noise level for different i and k may differ; however, the stationary noise variance can be integrated out in BCS and does not affect the signal reconstruction [5,21,25]. For mathematical convenience, we assume that the β ik are identical for all i and k and denote the common value by β. Without loss of generality, we assume that s ik is sparse, that is, most elements in s ik are zero.
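As a concrete illustration of this measurement model, the following sketch generates one sparse signal and its noisy compressed measurements; the sizes N, M and the sparsity level are illustrative choices, not values from the paper.

```python
import numpy as np

# Sketch of the measurement model y_ik = Phi_i s_ik + n_ik for one (i, k) pair.
rng = np.random.default_rng(0)
N, M, K_spikes = 256, 64, 8          # signal length, measurements, nonzeros

s = np.zeros(N)                       # sparse original signal
support = rng.choice(N, size=K_spikes, replace=False)
s[support] = rng.standard_normal(K_spikes)

Phi = rng.standard_normal((M, N)) / np.sqrt(M)   # random projection matrix

beta = 1e-4                           # noise variance (the paper's beta)
y = Phi @ s + np.sqrt(beta) * rng.standard_normal(M)
```

Note that M < N: the measurement vector is much shorter than the signal, which is exactly the regime where sparsity must be exploited for reconstruction.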
Signal reconstruction is performed among different BCS procedures in parallel and in time sequence. Information is transferred in parallel and serially. Note that the original signals, s 11 , s 12 , s 22 , . . ., may be correlated with each other because of the spatial and temporal redundancies. However, without loss of generality, we do not specify the correlation model among the signals at different BCS procedures. This similarity leads to a priori information which can be introduced into decentralized TBCS signal reconstruction for improving performance in terms of reducing the number of measurements and improving the capability of combating noise.
For notational simplicity, we abbreviate s ik into s i to utilize one superscript representing either the temporal or spatial index, or both. We use the subscript to represent the element index in the vector. The main notation used throughout this paper is stated in Table 1.

Turbo Bayesian Compressed Sensing
In this section, we propose the Turbo BCS algorithm, a general framework for yielding and fusing a priori information from other parallel or serially reconstructed signals. We first introduce the standard BCS framework, in which selecting the hyperparameter α i imposed on each signal element is the key issue. Then we impose an exponential prior distribution with parameter λ i on the hyperparameter α i . A previously reconstructed signal element impacts the parameter λ i and thereby the distribution of α i , yielding a priori information. Finally, this a priori information is integrated into the current signal estimation.

Bayesian Compressed Sensing.
The BCS framework reconstructs the original signal with additive noise from the compressed measurements. In the BCS framework, a Gaussian prior distribution is imposed over each signal element:

$$ p\bigl(s^i_j \mid \alpha^i_j\bigr) = \mathcal{N}\bigl(s^i_j \mid 0, (\alpha^i_j)^{-1}\bigr), \qquad (2) $$

where α i j is the hyperparameter (inverse variance) of the signal element s i j . The zero-mean Gaussian prior is independent for each signal element. By applying Bayes' rule, the a posteriori probability of the original signal is given by

$$ p\bigl(s^i \mid y^i, \alpha^i, \beta\bigr) = \mathcal{N}\bigl(s^i \mid \mu^i, \Sigma^i\bigr), \qquad (3) $$

where A = diag(α i ). The covariance and the mean of the signal are given by

$$ \Sigma^i = \bigl(\beta^{-1} \Phi^T \Phi + A\bigr)^{-1}, \qquad \mu^i = \beta^{-1} \Sigma^i \Phi^T y^i, \qquad (4) $$

where β is the noise variance. Then, we obtain the estimation of the signal,

$$ \hat{s}^i = \mu^i. \qquad (5) $$

In order to estimate the hyperparameters α i and hence A, the likelihood of the observations is maximized,

$$ p\bigl(y^i \mid \alpha^i, \beta\bigr) = \int p\bigl(y^i \mid s^i, \beta\bigr)\, p\bigl(s^i \mid \alpha^i\bigr)\, ds^i = \mathcal{N}\bigl(y^i \mid 0, \beta I + \Phi A^{-1} \Phi^T\bigr), \qquad (6) $$

where, by integrating out s i and maximizing (6) with respect to α i , the hyperparameter diagonal matrix A is estimated. Then, the signal can be reconstructed using (5).
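The posterior moments used here are the standard BCS/RVM expressions; a minimal sketch, assuming β denotes the noise variance:

```python
import numpy as np

def bcs_posterior(Phi, y, alpha, beta):
    """Posterior moments of the signal given the hyperparameters.

    Standard BCS/RVM expressions (beta is the noise variance):
        Sigma = (Phi^T Phi / beta + diag(alpha))^(-1)
        mu    = Sigma Phi^T y / beta
    """
    Sigma = np.linalg.inv(Phi.T @ Phi / beta + np.diag(alpha))
    mu = Sigma @ Phi.T @ y / beta
    return mu, Sigma
```

As a sanity check on the formulas: with a nearly flat prior (tiny α) and Φ = I, the posterior mean μ falls back to the raw observations themselves.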
The matrix A plays a key role in the signal reconstruction. The hyperparameter diagonal matrix A can be used to transfer the mutual a priori information by sharing the same A among all signals [23]. In such a way, if signals have many common nonzero elements, the signal reconstruction will benefit from such a similarity. However, when the similarity level is low, the transferred "wrong" information may impair the signal reconstruction [23].
Alternatively, we find a soft approach to integrating a priori information in a robust way. An exponential prior distribution controlled by the parameter λ i is imposed on the hyperparameter α i . The previously reconstructed signal elements impact λ i and change the distribution of α i , thereby yielding a priori information. Then, the hyperparameter α i conditioned on λ i enters the current signal estimation through the maximum a posteriori (MAP) criterion, which fuses the a priori information.

Yielding A Priori Information.
The key idea of our TBCS algorithm is to impose an exponential distribution on the hyperparameter α i j and to exchange information among different BCS signal reconstruction procedures through this exponential distribution in a turbo iterative way. In each iteration, the information from other BCS procedures is incorporated into the exponential prior and then used for the signal reconstruction in the BCS procedure currently being considered. Note that, in the standard BCS [21], a Gamma distribution with two parameters is used for α i j . We adopt an exponential distribution here because it has only one parameter to handle, which is much simpler than the Gamma distribution, while both belong to the same family of distributions.
We assume that the hyperparameter α i j follows the exponential prior distribution

$$ p\bigl(\alpha^i_j \mid \lambda^i_j\bigr) = \lambda^i_j \exp\bigl(-\lambda^i_j \alpha^i_j\bigr), \qquad \alpha^i_j \geq 0, \qquad (7) $$

where λ i j (λ i j > 0) is the hyperparameter of the hyperparameter α i j . By assuming mutual independence, we have

$$ p\bigl(\alpha^i \mid \lambda^i\bigr) = \prod_{j=1}^{N} \lambda^i_j \exp\bigl(-\lambda^i_j \alpha^i_j\bigr). \qquad (8) $$

By choosing the above exponential prior, we can obtain the marginal probability distribution of the signal element depending on the parameter λ i j by integrating α i j out:

$$ p\bigl(s^i_j \mid \lambda^i_j\bigr) = \frac{\lambda^i_j\, \Gamma(3/2)}{\sqrt{2\pi}\,\bigl(\lambda^i_j + (s^i_j)^2/2\bigr)^{3/2}}, \qquad (9) $$

where Γ(·) is the gamma function, defined as Γ(x) = ∫ 0 ∞ t x−1 e −t dt. The detailed derivation is shown in Appendix A. Figure 3 shows the signal element distribution conditioned on the hyperparameter λ i j . Clearly, the larger the parameter λ i j is, the more likely the corresponding signal element takes a large value. Intuitively, this density looks very much like a Laplace prior, sharply peaked at zero [20]. Here, λ i j is the key to introducing a priori information based on reconstructed signal elements.
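The marginal of a zero-mean Gaussian whose precision has an exponential prior can be checked numerically; the closed form below (a Student-t-like density involving Γ(3/2)) is our working assumption for the result of this integration.

```python
import numpy as np
from math import gamma, pi, sqrt

def marginal_closed_form(s, lam):
    # Assumed closed form:
    #   p(s | lam) = lam * Gamma(3/2) / (sqrt(2*pi) * (lam + s^2/2)^(3/2))
    return lam * gamma(1.5) / (sqrt(2 * pi) * (lam + s**2 / 2) ** 1.5)

def marginal_by_integration(s, lam, a_max=400.0, n=400_000):
    # Midpoint Riemann sum of  N(s | 0, 1/a) * lam * exp(-lam * a)  over a > 0.
    da = a_max / n
    a = (np.arange(n) + 0.5) * da
    integrand = (np.sqrt(a / (2 * np.pi)) * np.exp(-a * s**2 / 2)
                 * lam * np.exp(-lam * a))
    return float(np.sum(integrand) * da)
```

The two agree to several digits, and evaluating the closed form at increasing λ confirms the qualitative behavior described above: larger λ flattens the density, allowing larger signal values.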
Compared with the Gamma prior distribution imposed on the hyperparameter α i j [21,25], the exponential distribution has only one parameter while the Gamma distribution has two degrees of freedom. In many applications (e.g., communication networks), transferring one parameter of the exponential distribution is much easier and cheaper than handling two parameters. The exponential prior distribution does not degrade performance, since it still encourages sparsity (see Appendix A). The exponential distribution is also computationally tractable and can produce a priori information for mutual information transfer. Now the challenge is, given the jth reconstructed signal element s b j from the bth BCS procedure, how to yield a priori information to impact the hyperparameters in the ith BCS procedure for reconstructing the jth signal element s i j . When multiple BCS procedures are performed to reconstruct the original signals (whether in time sequence or in parallel), the parameters of the exponential distribution, λ i j , can be used to convey and incorporate a priori information from other BCS procedures. To this end, we consider the conditional probability P(α i j | s b j , λ i j ) for α i j , given an observed element s b j from another BCS procedure. Since the proposed algorithm does not use a specific model for the correlation of signals at different BCS procedures, we make the following simple assumption when incorporating the information from other BCS procedures into λ i j , for facilitating the TBCS algorithm.
Assumption. For different i and b, we assume that α i j = α b j . Essentially, this assumption implies the same locations of nonzero elements for different BCS procedures; in other words, the hyperparameter α i j for the jth signal element is the same over different signal reconstruction procedures. Then, mutual information can be transferred through the shared hyperparameter α i j , as proposed in [23]. However, the algorithm in [23] is a centralized MBCS algorithm, so the signal reconstructions for different tasks cannot be performed until all measurements are collected. Note that this technical assumption is only for deriving the algorithm for information exchange. It does not mean that the proposed algorithm works only in situations where all signals share the same locations of nonzero elements. The proposed algorithm based on this assumption provides a flexible and decentralized way to transfer mutual a priori information.
Based on the assumption, we obtain the posterior distribution of the hyperparameter,

$$ P\bigl(\alpha^i_j \mid s^b_j, \lambda^i_j\bigr) = \frac{\bigl(\lambda^i_j + (s^b_j)^2/2\bigr)^{3/2}}{\Gamma(3/2)}\, \bigl(\alpha^i_j\bigr)^{1/2} \exp\!\Bigl(-\alpha^i_j \bigl(\lambda^i_j + (s^b_j)^2/2\bigr)\Bigr), \qquad (10) $$

where Γ(·) is the gamma function, defined as Γ(x) = ∫ 0 ∞ t x−1 e −t dt. The detailed derivation is given in Appendix A. The posterior P(α i j | s b j , λ i j ) belongs to the same exponential family as the prior [27]. Compared with the original prior distribution in (7), given the jth reconstructed signal element s b j from the bth BCS procedure, the hyperparameter λ i j controlling the prior distribution in the ith BCS procedure is effectively updated to

$$ \tilde{\lambda}^i_j = \lambda^i_j + \tfrac{1}{2}\bigl(s^b_j\bigr)^2. \qquad (11) $$

If the information from n BCS procedures b 1 , . . . , b n is introduced, the parameter λ i j is then updated to

$$ \tilde{\lambda}^i_j = \lambda^i_j + \Lambda^i_j, \qquad (12) $$

where

$$ \Lambda^i_j = \frac{1}{2} \sum_{m=1}^{n} \bigl(s^{b_m}_j\bigr)^2. \qquad (13) $$

The derivation details are given in Appendix A.
Equations (11) and (13) show how the single or multiple signal elements s bn j , j = 1, 2, . . . , N, n = 1, 2, . . ., from other BCS procedures impact the hyperparameter of the signal element s i j , j = 1, 2, . . . , N at the same location in the ith BCS signal reconstruction. Note that the bth BCS signal reconstruction may be previously performed or is ongoing with respect to the ith BCS procedure. This provides significant flexibility to apply our TBCS in different situations.
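Combining the zero-mean Gaussian likelihood of an external element s b j with the exponential prior on α i j yields a Gamma posterior whose rate grows by (s b j )²/2 per contributed element; the sketch below implements this accumulation (one plausible reading of the updates referenced as (11) and (13), stated here as an assumption).

```python
import numpy as np

def update_lambda(lam_j, external_elements):
    """Accumulate external estimates s_j^{b_1}, ..., s_j^{b_n} of the jth
    signal element into the exponential-prior rate lambda_j.  This is the
    Gamma rate update implied by combining a zero-mean Gaussian likelihood
    with an exponential prior on alpha_j (a sketch, not the paper's exact
    update)."""
    s_b = np.asarray(external_elements, dtype=float)
    return lam_j + 0.5 * np.sum(s_b ** 2)
```

A single large external element raises λ j sharply, which (per the marginal above) makes a large value of the local element more plausible; near-zero external elements leave λ j essentially unchanged, so low-similarity signals do little harm.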

Incorporating A Priori Information into BCS.

Now, we study how to incorporate the a priori information obtained in the previous subsection into the signal reconstruction procedure. To incorporate the externally provided a priori information, we maximize the log posterior based on (6):

$$ L(\alpha^i) = \log P\bigl(\alpha^i \mid y^i, \{s^b\}, \lambda^i\bigr) \propto \log P\bigl(y^i \mid \alpha^i, \beta\bigr) + \log P\bigl(\alpha^i \mid \{s^b\}, \lambda^i\bigr). \qquad (14) $$

Therefore, the estimation of α i not only depends on the local measurements, through the first term log P(y i | α i , β), but also relies on the external signal elements {s b } through the parameter λ i , in the second term log P(α i | {s b }, λ i ).
An expectation maximization (EM) method can be utilized for the signal estimation. Recall that the signal vector s i is Gaussian distributed conditioned on α i , while α i conditionally depends on the parameters λ i . Equation (3) shows that the conditional distribution of s i is N (μ, Σ). Then, applying a similar argument to that in [21], we treat s i as hidden data and maximize the posterior expectation

$$ E_{s^i \mid y^i, \alpha^i, \beta}\Bigl[\log P\bigl(s^i \mid \alpha^i\bigr) + \log P\bigl(\alpha^i \mid \{s^b\}, \lambda^i\bigr)\Bigr]. \qquad (15) $$

By differentiating (15) with respect to α i j and setting the derivative to zero, we obtain the update

$$ \alpha^i_j = \frac{1}{\bigl(\mu^i_j\bigr)^2 + \Sigma^i_{jj} + 2\lambda^i_j}, \qquad (16) $$

where Σ i jj is the jth diagonal element of the matrix Σ i and μ i j is the jth element of μ i . The details of the derivation are given in Appendix B. Basically, the hyperparameters α i are iteratively estimated and most of them tend to infinity, which means that most corresponding signal elements are zero. Only the nonzero signal elements are estimated.
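One EM sweep under this model can be sketched as follows; the M-step update α j = 1/(μ j ² + Σ jj + 2λ j ) is what differentiating the expected log posterior yields under the exponential hyperprior (with λ j = 0 it reduces to the standard RVM update 1/(μ j ² + Σ jj )). This is a sketch under those assumptions, not the paper's exact implementation.

```python
import numpy as np

def em_iteration(Phi, y, alpha, lam, beta):
    """One EM-style sweep (sketch).  E-step: posterior moments of the
    signal; M-step: alpha_j = 1 / (mu_j^2 + Sigma_jj + 2*lam_j)."""
    Sigma = np.linalg.inv(Phi.T @ Phi / beta + np.diag(alpha))
    mu = Sigma @ Phi.T @ y / beta
    alpha_new = 1.0 / (mu**2 + np.diag(Sigma) + 2.0 * lam)
    return alpha_new, mu
```

On a toy problem with one active element, a single sweep already drives the precision of the inactive elements far above that of the active one, illustrating how repeated sweeps prune most elements to zero.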
Considering the matrix inverse (with complexity O(N 3 )) required in each iteration, the EM algorithm has a large computational cost. Even though a Cholesky decomposition can be applied to alleviate the calculation [28,29], the EM method still incurs a significant computational cost. We therefore provide an incremental optimization method to reduce the computational cost.

Incremental Optimization
In this section, we utilize an incremental optimization to incorporate the transferred a priori information and optimize the posterior function. Due to the inherent sparsity of the signal, the incremental method finds the limited number of nonzero elements by separating and testing a single index at a time, which alleviates the computational cost compared with the EM algorithm. Note that the key principle is similar to that of the fast relevance vector machine algorithm in [21]. However, the incorporation of the hyperparameter λ i brings significant difficulty to deriving the algorithm. For convenience, we abbreviate α i as α and y i as y because we focus on the current signal estimation.
In order to introduce the a priori knowledge, the target log posterior function can be written as

$$ L(\alpha) = L_1(\alpha) + L_2(\alpha), \qquad (17) $$

where L 1 (α) = log P(y | α, β) is the term for signal estimation from the local observation and L 2 (α) = log P(α | {s b }, λ) introduces the a priori information from external BCS procedures.
In contrast to the complex EM optimization, the incremental algorithm starts by searching for a nonzero signal element and iteratively adds it to the candidate index set for the signal reconstruction, similarly to greedy pursuit algorithms. Hence, we isolate one index, say the jth element:

$$ L_1(\alpha) = L_1(\alpha_{-j}) + l_1(\alpha_j), \qquad (18) $$

where l 1 (α j ) is the separated term associated with the jth element of the posterior function and L 1 (α − j ) is the remaining term resulting from removing the jth index.
Suppose that external information from other BCS procedures is incorporated, that is, s b j ≠ 0, λ i j ≠ 0, and L 2 (α) ≠ 0. We maximize the separated term while treating the remaining term L 1 (α − j ) as fixed. The posterior function for the single isolated index is then

$$ l(\alpha_j) = \frac{1}{2}\Bigl[\log \alpha_j - \log(\alpha_j + h_j) + \frac{g_j^2}{\alpha_j + h_j}\Bigr] - \lambda_j \alpha_j, \qquad (19) $$

where

$$ h_j = \phi_j^T C_{-j}^{-1} \phi_j, \qquad g_j = \phi_j^T C_{-j}^{-1} y, \qquad (20),(21) $$

with C − j = βI + Σ m≠ j α m −1 φ m φ m T , and φ j is the jth column vector of the matrix Φ. The detailed derivation is provided in Appendix C. Then, we seek a maximum of the posterior function,

$$ \alpha_j^* = \arg\max_{\alpha_j} l(\alpha_j). \qquad (22) $$

When there is no external information incorporated, the optimal hyperparameter is given by [25]

$$ \alpha_j = \begin{cases} \dfrac{h_j^2}{g_j^2 - h_j}, & g_j^2 > h_j, \\ \infty, & \text{otherwise}. \end{cases} \qquad (23) $$
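The no-external-information rule mirrors the fast marginal-likelihood analysis of [25]; a sketch, with g j and h j treated as the usual quality and sparsity factors (an assumption about the notation, since the paper's definitions appear in Appendix C):

```python
import numpy as np

def alpha_no_external(g, h):
    """Optimal alpha_j with no external information: finite when
    g^2 > h, otherwise infinite, i.e., the jth element is pruned
    to zero (a sketch of the fast-RVM rule [25])."""
    return h**2 / (g**2 - h) if g**2 > h else np.inf
```

The test g² > h is exactly the criterion the STTBCS algorithm later uses to decide whether an index enters the candidate set.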
When external information is incorporated, to maximize the target function (19), we compute the first-order derivative of l(α j ):

$$ \frac{\partial l(\alpha_j)}{\partial \alpha_j} = \frac{1}{2}\Bigl[\frac{1}{\alpha_j} - \frac{1}{\alpha_j + h_j} - \frac{g_j^2}{(\alpha_j + h_j)^2}\Bigr] - \lambda_j = \frac{f(\alpha_j, g_j, h_j, \lambda_j)}{2\alpha_j(\alpha_j + h_j)^2}, \qquad (24) $$

where f (α j , g j , h j , λ j ) is a cubic function of α j . By setting (24) to zero, we obtain the optimum α ∗ j that maximizes the posterior likelihood function l(α j ). The details are given in Appendix D.
Therefore, in each iteration only one signal element is isolated and the corresponding parameters are evaluated. After several iterations, most of the nonzero signal elements are selected into the candidate index set. Due to the sparsity of the signal, after a limited number of iterations, only a few signal elements are selected and calculated, which greatly increases the computational efficiency.

Case Study: Space-Time Turbo Bayesian Compressed Sensing for UWB Systems
The TBCS algorithm can be applied in various applications. A typical application is the UWB communication/positioning system. Our proposed TBCS algorithm will be applied to the UWB system to fully exploit the redundancies in both space and time, which is called Space-Time Turbo BCS (STTBCS). In this section, we first introduce the UWB signal model. Then, the structure to transfer spatial and temporal a priori information in the CS-based UWB system is explained in detail. Finally, we summarize the STTBCS algorithm.

UWB System Model.
In a typical UWB communication/positioning system, suppose that there is only one transmitter, which transmits UWB pulses on the order of nano- or sub-nanoseconds. As shown in Figure 1, several receivers, or base stations, are responsible for receiving the UWB echo signals. Time is divided into frames. The received signal at the ith base station in the kth frame, in the continuous time domain, is given by

$$ r_{ik}(t) = \sum_{l=1}^{L} a_l\, p'(t - t_l), \qquad (26) $$

where L is the number of resolvable propagation paths, a l is the attenuation coefficient of the lth path, and t l is the time delay of the lth propagation path. We denote by p(t) the transmitted Gaussian pulse and by p'(t) the corresponding received pulse, which is close to the original pulse waveform but more or less distorted by the frequency-dependent propagation channels. In the same frame or time slot, there is only one transmitter but multiple receivers, which are closely placed in the same environment. Therefore, the received echo UWB signals at different receivers are similar at the same time, which incurs spatial redundancy. In other words, the received signals share many common nonzero element locations. Typically, around 30-70% of the nonzero element indices are the same in one frame, according to our experimental observations [30]. In particular, no matter what kind of signal modulation is used for UWB communication, such as pulse amplitude modulation (PAM), on-off keying (OOK), or pulse position modulation (PPM), the UWB echo signals among receivers are always similar, and thus the spatial redundancy always exists. In this case, the spatial redundancy can be exploited for good performance using the proposed space TBCS algorithm.
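A toy multipath echo illustrates the time-domain sparsity of such signals; the pulse shape (a Gaussian doublet) and all parameters below are illustrative assumptions, not the paper's measured waveforms.

```python
import numpy as np

def gaussian_doublet(t, tau=0.5e-9):
    # Second-derivative Gaussian ("doublet"), one common UWB pulse shape;
    # the exact received pulse p'(t) in the paper is channel-dependent.
    x = t / tau
    return (1.0 - x**2) * np.exp(-x**2 / 2)

def uwb_echo(fs, duration, delays, gains):
    """Multipath echo r(t) = sum_l a_l p'(t - t_l), sampled at rate fs."""
    t = np.arange(0.0, duration, 1.0 / fs)
    r = np.zeros_like(t)
    for a_l, t_l in zip(gains, delays):
        r += a_l * gaussian_doublet(t - t_l)
    return t, r
```

With two paths spread over a 50 ns window, only a small fraction of the samples are significantly nonzero, which is the sparsity that the CS reconstruction exploits.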
In one base station, the consecutively received signals can also be similar. Suppose that, in a UWB positioning system, the pulse repetition frequency is fixed. When the transmitter moves, the signal received at the ith base station in the (k + 1)th frame can be written as

$$ r_{i(k+1)}(t) = \sum_{l=1}^{L'} a'_l\, p'(t - t_l - \tau). \qquad (27) $$

Compared with (26), τ stands for the time delay caused by the position change of the transmitter. In high-precision positioning/tracking systems, this τ is usually relatively small, which makes the consecutively received signals similar. Due to the similar propagation channels, the numbers L and L', as well as the coefficients a l and a' l , are similar in consecutive frames. This leads to the temporal redundancy. In our experiments, about 10-60% of the nonzero element locations in two consecutive frames are the same [30]. This temporal redundancy can then be exploited for good performance by using the Time TBCS algorithm. In fact, both spatial and temporal redundancies exist in the UWB communication/positioning system, so we can utilize the STTBCS algorithm for good performance.
To achieve high positioning precision and a high communication rate, we have to acquire ultrahigh-resolution UWB pulses, which demands ultrahigh-sampling-rate ADCs. For instance, achieving millimeter (mm) positioning accuracy in UWB positioning systems requires picosecond-level time information and ADCs with sampling rates of 10 Gsamples/s or even higher [28], which is prohibitively difficult. UWB echo signals are inherently sparse in the time domain. This property can be utilized to alleviate the problem of an ultrahigh sampling rate: the high-resolution UWB pulses can be indirectly obtained and reconstructed from measurements acquired using lower-sampling-rate ADCs.
The system model of the CS-based UWB receiver can use the same model as that in Figure 2. The received UWB signal at the ith base station is first "compressed" using an analog projection matrix [6]. The hardware projection matrix consists of a bank of Distributed Amplifiers (DAs). Each DA functions like a wideband FIR filter with different configurable coefficients [6]. The output of the hardware projection matrix can be obtained and digitized by the following ADCs to yield measurements. For mathematical convenience, the noise generated from the hardware and ADCs is modeled as Gaussian noise added to the measurements. When several sets of measurements are collected at different base stations, a joint UWB signal reconstruction can be performed. This process is modeled in (1).

STTBCS: Structure and Algorithm.
We apply the proposed TBCS to UWB systems to develop the STTBCS algorithm. Figure 4 illustrates the structure of our STTBCS algorithm and explains how mutual information is exchanged. For simplicity, only two base stations (BS1 and BS2) and two consecutive frames of UWB signals (the kth and (k + 1)th) in each base station are illustrated. For each BCS procedure, Figure 4 also depicts the dependence among measurements, noise, signal elements, and hyperparameters.
In the STTBCS, multiple BCS procedures in multiple time slots are performed. Between BS1 and BS2, the signal reconstruction for s_{1(k+1)} and s_{2(k+1)} is carried out simultaneously, while the information in s_{1k} and s_{2k}, the previous frame, is also used.
Algorithm 1 shows the details of the STTBCS algorithm. We start with the initialization of the noise level, the hyperparameters α, and the candidate index set Ω (an index set containing all possibly nonzero element indices). Its key steps include the following:
(2) Update λ using (11) and (13) from the previously reconstructed nonzero signal elements; this introduces temporal a priori information.
(3) repeat
(4) Check and receive the ongoing reconstructed signal elements from the other simultaneous BCS reconstruction procedures to update the parameter λ; this fuses spatial a priori information.
In each iteration, the information from previously reconstructed signals and from other base stations is utilized to update the hyperparameter λ. The terms g_j and h_j are also computed, and the condition g_j^2 > h_j is used to add the jth element to the candidate index set. A convergence criterion tests whether the differences between successive values of every α_j, j = 1, 2, ..., N, are sufficiently small compared to a certain threshold. When the iterations are completed, the noise level β is re-estimated by setting ∂L/∂β = 0 using the same method as in [21]; the resulting expression involves Σ_ii, the diagonal elements of the matrix Σ. Note that only a nonzero signal element that is supported by the local measurements can introduce a priori information and thus update the hyperparameter λ_j. In other words, λ_j is updated only if g_j^2 > h_j is satisfied. This prevents wrong a priori information from adding a zero signal element to the candidate index set.
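The control flow of Algorithm 1 can be sketched as follows. This is a skeleton only: the statistics g_j, h_j and the λ and α updates follow the paper's equations (11), (13), (C.4), and (C.5), which are replaced here by clearly labeled toy stubs, so the numerical behavior is illustrative rather than the STTBCS algorithm itself:

```python
import numpy as np

def sttbcs_skeleton(y, Phi, compute_g_h, update_lambda, peers=(),
                    prev_signal=None, max_iter=50, tol=1e-4):
    # Control-flow sketch of one STTBCS reconstruction procedure.
    # `peers` carries spatial a priori information from simultaneous
    # procedures; `prev_signal` carries temporal a priori information.
    N = Phi.shape[1]
    alpha = np.full(N, np.inf)   # hyperparameters (inf = element pruned)
    lam = np.ones(N)             # exponential-prior parameters lambda_j
    omega = set()                # candidate index set Omega

    if prev_signal is not None:  # temporal a priori information
        lam = update_lambda(lam, np.flatnonzero(prev_signal))

    for _ in range(max_iter):
        for peer in peers:       # spatial a priori information
            lam = update_lambda(lam, np.flatnonzero(peer))
        alpha_old = alpha.copy()
        for j in range(N):
            g, h = compute_g_h(j, y, Phi, alpha, lam)
            if g ** 2 > h:       # evidence supports a nonzero element
                omega.add(j)
                alpha[j] = 1.0 / max(g ** 2 - h, 1e-12)  # placeholder update
            else:                # otherwise keep the element pruned
                omega.discard(j)
                alpha[j] = np.inf
        # Convergence: support unchanged and alpha_j changes below threshold
        same_support = np.array_equal(np.isfinite(alpha), np.isfinite(alpha_old))
        both = np.isfinite(alpha) & np.isfinite(alpha_old)
        if same_support and (not both.any()
                             or np.max(np.abs(alpha[both] - alpha_old[both])) < tol):
            break
    return sorted(omega), alpha

# Toy stubs -- purely illustrative, NOT the formulas from the paper
def toy_g_h(j, y, Phi, alpha, lam):
    g = Phi[:, j] @ y
    h = Phi[:, j] @ Phi[:, j] / lam[j]   # smaller lambda -> easier to add
    return g, h

def toy_update_lambda(lam, idx):
    out = lam.copy()
    out[idx] *= 0.5                      # a priori support shrinks lambda_j
    return out

rng = np.random.default_rng(3)
Phi = rng.standard_normal((30, 64))
s_true = np.zeros(64)
s_true[[5, 17]] = [3.0, -2.5]
y = Phi @ s_true
support, alpha = sttbcs_skeleton(y, Phi, toy_g_h, toy_update_lambda,
                                 prev_signal=s_true)
```

With the toy stubs the true support indices are picked up quickly; in the actual algorithm the same loop structure is driven by the evidence-based g_j, h_j statistics.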

Simulation Results
Numerical simulations are conducted to evaluate the performance of the proposed TBCS algorithm, compared with the MBCS [23] and original BCS algorithms [5]. We use spike signals and experimental UWB echo signals [26] for the performance test. The quality of the reconstructed signal is measured in terms of the reconstruction percentage, defined in terms of the true signal s and the reconstructed signal ŝ. The TBCS performance is largely determined by how similar the introduced signals are to the objective signal; in other words, we consider how many common nonzero element locations are shared between the objective signal and the introduced signals. We therefore define the similarity as P_s = 100% × K_com/K_obj, where K_obj is the number of nonzero signal elements in the objective (unrecovered) signal, K_com is the number of nonzero element locations shared by the transferred reconstructed signals and the objective signal, and P_s represents the similarity level as a percentage. Note that, without loss of generality, we only count the common nonzero element locations to measure the similarity, ignoring any amplitude correlation. Hence, P_s = 100% does not mean that the signals are identical, only that they have the same nonzero element locations; the amplitudes may differ. The TBCS performance is compared with MBCS and BCS using different types of signals, similarity levels, noise powers, and numbers of measurements.
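The two metrics can be sketched as follows. The similarity P_s = 100% · K_com/K_obj follows the definition above; the reconstruction-percentage formula is not reproduced in the text, so the form 100 · (1 − relative ℓ2 error) used here is an assumption (it is at least consistent with the negative percentages reported later):

```python
import numpy as np

def reconstruction_percentage(s_true, s_hat):
    # Assumed definition: 100 * (1 - relative l2 error).  It goes negative
    # when the reconstruction error exceeds the signal energy.
    return 100.0 * (1.0 - np.linalg.norm(s_true - s_hat) / np.linalg.norm(s_true))

def similarity(s_obj, s_intro):
    # P_s = 100 * K_com / K_obj: fraction of the objective signal's
    # nonzero locations that also appear in the introduced signal.
    supp_obj = set(np.flatnonzero(s_obj))
    supp_intro = set(np.flatnonzero(s_intro))
    if not supp_obj:
        return 0.0
    return 100.0 * len(supp_obj & supp_intro) / len(supp_obj)

s = np.array([0.0, 1.0, 0.0, -2.0])
print(reconstruction_percentage(s, s))                 # 100.0
print(similarity(s, np.array([0.0, 3.0, 0.0, 0.0])))   # 50.0
```

Note that similarity compares supports only, so two signals with identical nonzero locations but different amplitudes still score 100%.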

Spike Signal.
We first generate four scenarios of spike signals with the same length N = 512, which have the same number of 20 nonzero signal elements with random locations and Gaussian distributed (mean = 0, variance = 1) amplitudes. One spike signal is selected as the objective signal, as shown in Figure 5. With respect to the objective signal, the other three signals have a similarity of 25%, 50%, and 75%, which will be introduced as a priori information. The objective signal is then reconstructed using the original BCS, MBCS, and TBCS algorithms, respectively, with the same number of measurements (M = 62) and the same noise variance 0.15 (SNR 6 dB). We also investigate the performance gain (in terms of reconstruction percentage) at each iteration.  Figures 6 and 7 show the reconstructed spike signal using MBCS and TBCS, respectively, by introducing the spike signal with a similarity of 75%. The reconstruction percentage using TBCS is 92.7% while it is 57.5% using MBCS. The comparison of the two figures shows that TBCS can recover most of the original signal while MBCS fails to reconstruct the signal with so few measurements (M = 62) in spite of using a high-similarity signal as a priori information. Figures 8, 9, and 10 show, when transferred signals have a similarity of 25%, 50%, and 75%, respectively, how much signal reconstruction percentage can be achieved at each iteration using the BCS, MBCS, and TBCS algorithms. The simulations are run 100 times, over which the results are averaged. It is clear that our proposed TBCS is much better than the BCS at each iteration. Particularly, when the similarity is 25%, MBCS is worse than BCS while our TBCS achieves higher performance at each iteration than BCS. For instance, at iteration 25 in Figure 8, TBCS can achieve a reconstruction percentage of 61.7%, while BCS can reach 42.2% and MBCS only recovers 35.6%. 
It shows that, at a low similarity, our TBCS can still achieve good performance at every iteration, compared with MBCS and BCS. Moreover, with a high similarity, the performance gap between TBCS and MBCS is enlarged at each step. For example, at iteration 21 with a similarity of 25% in Figure 8, TBCS can achieve a reconstruction percentage of 59.7%, while MBCS can reach 28.2%. Hence, the performance gap is 31.5%. When the similarity is 75% in Figure 10, the performance gap is increased to 50.9% because TBCS can reach 80.5%, while MBCS achieves 29.6% at the 21st iteration.
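The spike-signal scenarios described above can be generated as follows; a minimal sketch in which the similarity is controlled by the number of shared nonzero locations, with all amplitudes drawn independently:

```python
import numpy as np

def make_spike_set(N=512, K=20, similarities=(0.25, 0.5, 0.75), seed=1):
    # Generate an objective spike signal plus signals sharing a given
    # fraction of its nonzero locations (Gaussian amplitudes, mean 0, var 1).
    rng = np.random.default_rng(seed)
    obj_supp = rng.choice(N, size=K, replace=False)
    objective = np.zeros(N)
    objective[obj_supp] = rng.standard_normal(K)

    others = []
    for p in similarities:
        n_common = int(round(p * K))
        common = rng.choice(obj_supp, size=n_common, replace=False)
        rest_pool = np.setdiff1d(np.arange(N), obj_supp)
        rest = rng.choice(rest_pool, size=K - n_common, replace=False)
        sig = np.zeros(N)
        sig[np.concatenate([common, rest])] = rng.standard_normal(K)
        others.append(sig)
    return objective, others

objective, others = make_spike_set()
```

Each generated signal then serves as transferred a priori information when reconstructing the objective signal.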

UWB Signal.
The tested scenarios are experimental UWB echo pulses from various UWB propagation channels in practical indoor residential, office, and clean environments, under both line-of-sight (LOS) and non-line-of-sight (NLOS) conditions, drawn from the experimental IEEE 802.15.4a UWB propagation models [26]. In a typical UWB communication/positioning system where receivers are distributed in the same environment, the received UWB echo signals are more or less similar. We test the performance of the original BCS, TBCS, and MBCS algorithms at different similarity levels. Figure 11 shows the reconstructed UWB echo signals using the original BCS and our TBCS algorithms: the UWB echo signals S1, S2, S3, and S4 with length N = 512 are reconstructed using the BCS and TBCS algorithms (only a section of length 150 is shown); for Figures 11(a) and 11(b), the number of measurements is M = 60 and SNR = 9.2 dB. In the TBCS algorithm, the reconstructed signal S0 (not shown in Figure 11) is transferred to the other signal reconstructions as a priori information. The results show that our TBCS is much better than the original BCS at different similarity levels: with the same number of measurements, the reconstruction percentages using TBCS are much higher than those using the original BCS. Moreover, the performance gap grows with the similarity level. For instance, with a similarity of 11.5% with respect to S0 for reconstructing the signal S1 in Figures 11(a) and 11(b), the difference in reconstruction percentages between TBCS and BCS is only 3.2% (84.4-81.2%). When the similarity level is 98.1% for reconstructing the signal S4 in Figures 11(g) and 11(h), the difference increases to 170.2% (93.2-(-77)%).
Therefore, with a higher similarity level, higher performance gain can be achieved.
The performance of the original BCS, MBCS, and TBCS at different similarity levels is then compared. We select three UWB echo signals S5, S6, and S7 with the same dimension N = 512. The additive noise variance is only 0.01, implying a very high SNR. The reconstructed signals S6 and S7 are transferred as a priori information to the signal reconstruction for S5. With respect to S6 and S7, the similarities of S5 are 16.3% and 64.4%, respectively. The signal S5 is recovered with different numbers of measurements using the original BCS, TBCS, and MBCS algorithms. Figure 12 shows the reconstruction percentage versus the number of measurements for the signal S5.
Obviously, at a low similarity level, the MBCS performance is substantially worse than that of the original BCS, whereas our TBCS matches the original BCS performance. At a high similarity level, both MBCS and TBCS are much better than the original BCS, owing to the high-similarity information transferred from the signal S7. This demonstrates that our TBCS achieves a good balance between local observations and a priori information, leading to more robust performance than the MBCS.
In the presence of more noise, our TBCS still outperforms MBCS and BCS, as shown in Figure 13. We use the same signals S5, S6, and S7, but the noise variance is increased to 0.4. We observe that our TBCS maintains performance comparable to that in Figure 12, particularly when the number of measurements is large enough (M > 150). At a low similarity level, the MBCS can achieve a maximum reconstruction percentage of 74.5%, while our TBCS algorithm accomplishes a maximum of 86.9%. At a high similarity level, MBCS reaches a maximum of 80.1%, while our TBCS algorithm still accomplishes a maximum of 86.9%. Therefore, by introducing a priori information, the proposed TBCS algorithm can significantly reduce the number of measurements and improve the capability of combating noise. Figure 14 shows the Bit Error Rate (BER) for an example UWB communication system using the different algorithms. We utilize Binary Phase Shift Keying (BPSK) modulation to transfer the data, since biphase modulation is one of the easiest methods to implement. The BER of the TBCS, MBCS, and original BCS algorithms is tested at different noise levels with the same number of measurements (M = 112). With so few measurements, the BCS algorithm yields a high BER at all SNRs. At a low similarity level, the TBCS performance is much better than that of the MBCS algorithm. At a high similarity level, the BER performance using the TBCS and MBCS algorithms is much better than that using the original BCS algorithm, with TBCS being the best. Therefore, applying our TBCS algorithm in the UWB communication system reduces the BER, provides more tolerance to noise, and thus achieves the best performance compared with the MBCS and BCS algorithms.
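For context when reading Figure 14, the closed-form BER of ideal (non-CS) BPSK over an AWGN channel provides a lower-bound reference curve; this is a textbook result, not the paper's simulated CS-based receiver:

```python
import math

def bpsk_ber_awgn(snr_db):
    # Theoretical BPSK bit error rate over AWGN: P_b = Q(sqrt(2*Eb/N0)),
    # with Q(x) = 0.5 * erfc(x / sqrt(2)), so P_b = 0.5 * erfc(sqrt(snr)).
    snr = 10.0 ** (snr_db / 10.0)
    return 0.5 * math.erfc(math.sqrt(snr))

for snr_db in (0, 4, 8):
    print(f"{snr_db} dB -> BER {bpsk_ber_awgn(snr_db):.2e}")
```

Any CS-based receiver with imperfect reconstruction sits above this curve, and the gap narrows as the reconstruction percentage improves.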

Conclusion
This paper has proposed an efficient approach to exploiting and integrating the spatial and temporal a priori information existing in sparse signals, for example, UWB pulses. The turbo BCS algorithm has been designed to fully exploit a priori information from both space and time. Numerical simulation results have shown that the proposed TBCS outperforms the MBCS and traditional BCS algorithms in terms of robustness to noise and reduction of the required number of samples.

Appendices
A. Proof of (9) and (10)

We first show the derivation of (9), where Γ(·) is the gamma function, defined as Γ(x) = ∫_0^∞ t^{x−1} e^{−t} dt; in particular, Γ(3/2) = ∫_0^∞ t^{1/2} e^{−t} dt. Because both distributions belong to the exponential distribution family, the marginal distribution is still in the same family. It is also observed that the marginal distribution P(s_j^i | λ_j^i) is sharply peaked at zero, which encourages sparsity. Therefore, the chosen exponential prior distribution in the hierarchical Bayesian framework encourages the sparsity of the reconstructed signal.
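For completeness, the gamma-function value quoted above evaluates to a standard closed form:

```latex
\Gamma\!\left(\tfrac{3}{2}\right)
  = \int_0^{\infty} t^{1/2} e^{-t}\,dt
  = \tfrac{1}{2}\,\Gamma\!\left(\tfrac{1}{2}\right)
  = \frac{\sqrt{\pi}}{2}
  \approx 0.8862.
```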
Based on the assumption α_j^b = α_j^i, the same derivation applies. In order to obtain (10), we utilize the above equations; the derivation of the posterior then yields the updated parameter λ_j^i. For multiple transferred reconstructed signal elements s_j^{b1}, s_j^{b2}, ..., s_j^{bn}, the posterior function also belongs to the exponential distribution family; as shown in (12), the parameter λ_j^i is updated accordingly. The distributions P(s_j^{b1} | α_j^i), P(s_j^{b2} | α_j^i), ..., and P(s_j^{bn} | α_j^i) are conditionally independent of each other; in this case, the update depends on n, the total number of a priori signal elements s_j^{b1}, s_j^{b2}, ..., s_j^{bn}. Therefore, the above derivations show how single or multiple signal elements s_j^{bn}, j = 1, 2, ..., N, n = 1, 2, ..., from the other BCS procedures update the hyperparameters in the ith BCS signal reconstruction procedure.

B. Derivation of (15)

One strategy to maximize (17) is to exploit an EM method, treating s^i as hidden data, and to maximize the expectation

E_{s^i | y^i, α^i} [ log P(s^i | α^i) P(α^i | λ^i) ].    (B.1)

The operator E_{s^i | y^i, α^i} denotes an expectation with respect to the posterior distribution P(s^i | y^i, α^i, λ^i, β) over s^i given the data and the hidden variables. The desired result then follows by differentiating with respect to α^i.

In order to find the critical point, we take the derivative of l_1(α_j). It is easy to maximize l_1(α_j) with respect to α_j by taking the first and second derivatives; the maximum point α_j is given by the piecewise expression in (C.8). The second derivative is given in (C.9). Substituting the critical point α_j into the second-derivative expression yields (C.10), which is always negative; therefore the function l_1(α_j) achieves its maximum at α_j, and this maximum is unique.

D. Derivations about (24) and (25)

The first derivative of l_2(α_j) is l_2'(α_j) = −λ_j. Altogether, the first derivative of the posterior l(α_j) is given by (D.1). By setting (D.1) to zero, we can find the optimum α_j for (25). The quantities g_j and h_j^2 are nonnegative by (C.4) and (C.5). We have α_j ≥ 0 and λ_j > 0 according to the exponential distribution in (7), and l'(α_j) → −2λ_j < 0 as α_j → +∞, whereas l'(α_j) > 0 as α_j → 0. Therefore, the equation l'(α_j) = 0 has at least one positive root α_j > 0.
We rearrange (D.1) into the form (D.2). Setting (D.2) to zero amounts to setting its numerator to zero, that is, f(α_j, g_j, h_j, λ_j) = 0. To find the solution, we normalize the equation to eliminate one parameter for convenience; we then need to solve the cubic equation (D.3), whose coefficients are given in [31] (e.g., B_0 = 2g_j). To solve the cubic, we define intermediate quantities and obtain its three solutions in closed form. All three roots x_1, x_2, and x_3 are critical points of the optimization function in (19). We choose the positive root that maximizes the optimization function in (19) as the optimum solution α_j* for (25).
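The final root-selection step can be sketched numerically. Since the full coefficient expressions of (D.3) are not reproduced here, the helper below is hypothetical: it extracts the positive real roots of a given cubic (coefficients ordered highest degree first), among which the one maximizing l(α_j) in (19) would then be chosen:

```python
import numpy as np

def positive_real_roots(coeffs):
    # Return the positive real roots of a polynomial, sorted ascending.
    # Companion-matrix root finding replaces the closed-form cubic solution.
    roots = np.roots(coeffs)
    real = roots[np.abs(roots.imag) < 1e-9].real
    return np.sort(real[real > 0])

# Example cubic with one positive root: (x - 2)(x + 1)(x + 3) = x^3 + 2x^2 - 5x - 6
print(positive_real_roots([1.0, 2.0, -5.0, -6.0]))   # [2.]
```

In the algorithm, each candidate root would be substituted into l(α_j) and the maximizer kept as α_j*.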