Censored Distributed Space-Time Coding for Wireless Sensor Networks

We consider the application of distributed space-time coding in wireless sensor networks (WSNs). In particular, sensors use a common noncoherent distributed space-time block code (DSTBC) to forward their local decisions to the fusion center (FC), which makes the final decision. We show that the performance of distributed space-time coding is negatively affected by erroneous sensor decisions caused by observation noise. To overcome this problem of error propagation, we introduce censored distributed space-time coding, where only reliable decisions are forwarded to the FC. The optimum noncoherent maximum-likelihood and a low-complexity, suboptimum generalized likelihood ratio test (GLRT) FC decision rules are derived and the performance of the GLRT decision rule is analyzed. Based on this performance analysis we derive a gradient algorithm for optimization of the local decision/censoring threshold. Numerical and simulation results show the effectiveness of the proposed censoring scheme, making distributed space-time coding a prime candidate for signaling in WSNs.


INTRODUCTION
In recent years, wireless sensor networks (WSNs) have been gaining popularity in a wide range of military and civilian applications such as environmental monitoring, health care, and control. A typical WSN consists of a number of geographically distributed sensors and a fusion center (FC). The low-cost and low-power sensors make local observations of the hypotheses under test and communicate with the FC. Centralized detection schemes require the sensors to transmit their real-valued observations to the FC. However, this automatically translates into the unrealistic assumption of an infinite-bandwidth communication channel. In reality, the WSN has to work in a bandlimited environment. Moreover, as communication is a key energy consumer in a WSN, it is desirable to process the observation data as much as possible at the local sensors to reduce the number of bits that have to be transmitted over the communication channel. Therefore, the sensors typically make local decisions which are then transmitted to the FC where the final decision is made [1][2][3][4][5].
The resulting decentralized detection problem has a long and rich history. The decentralized optimum hypothesis testing problem was first formulated in [1] to provide a theoretical framework for detection with distributed sensors. Traditionally, the local decisions are assumed to be transmitted to the FC through perfect, error-free channels [1][2][3][4][5][6]. Realistically, the sensors typically work in harsh environments and therefore, fading and noise should be taken into account.
The problem of fusing sensor decisions over noisy and fading channels was considered in [7,8]. The fusion rules developed in [7] require instantaneous channel-state information (CSI). While the fusion rules in [8] do not require amplitude CSI, they still assume perfect phase estimation/synchronization. However, obtaining any form of CSI may not be feasible in large-scale WSNs and cheap sensors make phase synchronization challenging. To avoid these problems, simple ON/OFF keying and corresponding fusion rules were considered in [9]. Furthermore, power efficiency is improved in [9] by employing a simple form of censoring [10], where the sensors transmit only reliable decisions to the FC. The schemes in [7][8][9] assume orthogonal channels between the sensors and the FC, which entail a large required bandwidth especially in dense WSNs with a large number of sensors.
To overcome the bandwidth limitations of orthogonal transmission in WSNs, the application of coherent distributed space-time coding was proposed in [11]. In particular, in [11] each sensor is randomly assigned a column of Alamouti's space-time block code (STBC) [12], and it is assumed that only two randomly chosen sensors are active at any time. The quantized observations are encoded by the sensors using the respective preassigned columns of the STBC and transmitted to the FC via a common, nonorthogonal channel. Since there are typically more sensors than STBC columns, the same column has to be assigned to more than one sensor, resulting in a diversity order of 1. The performance degradation due to the diversity loss and the observation noise is analyzed in [11].
We point out that distributed space-time coding is usually employed in relay networks where a cyclic redundancy check (CRC) code can be used to avoid the retransmission of incorrect decisions by the relays [13][14][15]. In this context, selection relaying first introduced in [16] has some similarities to censoring in sensor networks [9,10]. However, while in selection relaying the decision whether a relay retransmits a packet or not depends on the instantaneous CSI of the source-relay channel, the censoring decision depends on the observation noise at the sensor. Furthermore, relaying decisions in selection relaying are made on a packet-by-packet basis enabling coherent detection at the destination node but censor decisions are performed on a symbol-by-symbol basis making coherent data fusion at the FC practically impossible.
In this paper, we consider noncoherent distributed space-time block coding for transmission of censored sensor decisions in WSNs. In particular, we make the following contributions.
(i) We show that the noncoherent distributed STBCs (DSTBCs) introduced in [14] eliminate the various restrictions and drawbacks of the coherent scheme in [11].
(ii) Moreover, it is shown that censoring of local decisions is essential for the efficient application of distributed space-time coding in WSNs.
(iii) We derive the optimum maximum-likelihood (ML) and a suboptimum generalized likelihood ratio test (GLRT) noncoherent FC decision rules for the proposed signaling scheme.
(iv) The bit-error rate (BER) at the FC for the GLRT decision rule is characterized analytically.
(v) Based on the analytical expression for the BER, we devise a gradient algorithm for calculation of the optimum local decision/censoring threshold.
(vi) Our numerical and simulation results show the effectiveness of the proposed transmission scheme and the ability of the noncoherent DSTBC to achieve a diversity gain in WSNs.
This paper is organized as follows. In Section 2, we present the system model and introduce the proposed transmission scheme for WSNs. In Section 3, we derive the ML and GLRT noncoherent FC decision rules and analyze the performance of the GLRT decision rule. A gradient algorithm for optimization of the local decision/censoring threshold is provided in Section 4. Simulation and numerical results are given in Section 5, while some conclusions are drawn in Section 6.
Notation. In this paper, bold upper case and lower case letters denote matrices and vectors, respectively.

SYSTEM MODEL
The binary hypothesis testing problem under consideration is illustrated in Figure 1, where a set $\mathcal{K} \triangleq \{1, 2, \ldots, K\}$ of $K$ distributed sensors tries to determine the true state of nature $H$ as being $H_0$ (the null hypothesis) or $H_1$ (the target-present hypothesis). Typical applications for binary hypothesis testing include seismic detection, forest fire detection, and environmental monitoring. The a priori probabilities of the two hypotheses $H_0$ and $H_1$ are denoted as $P(H_0)$ and $P(H_1)$, respectively. We assume that $P(H_0) = P(H_1) = 0.5$ throughout this paper. The details of the system model are discussed in the following subsections.

Local sensor decisions
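We assume that the sensor observations are described by

$$x_k = \begin{cases} -1 + n_k, & \text{if } H_0, \\ +1 + n_k, & \text{if } H_1, \end{cases} \qquad k \in \mathcal{K}, \tag{1}$$

where the local observation noise samples $n_k$, $k \in \mathcal{K}$, are independent and identically distributed (i.i.d.). For convenience and similar to [8,9,11], we assume identical sensors in this paper and model $n_k$ as real-valued additive white Gaussian noise (AWGN) with zero mean and variance $\sigma^2 \triangleq \mathcal{E}\{n_k^2\}$, $k \in \mathcal{K}$. We note, however, that the generalization of our results to nonidentical sensors (e.g., sensors with different noise variances) is also possible.
Upon receiving its own observation, each sensor makes a ternary local decision:

$$u_k = \begin{cases} 1, & x_k > d, \\ 0, & -d \le x_k \le d, \\ -1, & x_k < -d, \end{cases} \tag{2}$$

where $d$ is the nonnegative decision/censoring threshold. While $u_k = -1$ and $u_k = 1$ correspond to hypotheses $H_0$ and $H_1$, respectively, $u_k = 0$ corresponds to a decision that is deemed unreliable by the sensor and thus censored. For future reference, we denote the sets of sensors with $u_k = 0$, $u_k = -1$, and $u_k = 1$ by $\mathcal{S}$, $\mathcal{H}_0$, and $\mathcal{H}_1$, respectively. Note that $\mathcal{K} = \mathcal{S} \cup \mathcal{H}_0 \cup \mathcal{H}_1$. It is not difficult to show that the probabilities of correct and wrong sensor decisions are given by

$$P_c = \frac{1}{\sqrt{2\pi}\,\sigma} \int_{d}^{\infty} e^{-(x-1)^2/(2\sigma^2)}\, dx = Q\!\left(\frac{d-1}{\sigma}\right), \qquad P_w = \frac{1}{\sqrt{2\pi}\,\sigma} \int_{d}^{\infty} e^{-(x+1)^2/(2\sigma^2)}\, dx = Q\!\left(\frac{d+1}{\sigma}\right), \tag{3}$$

respectively, where $Q(\cdot)$ denotes the Gaussian tail probability. The probability that a decision is censored is given by

$$P_s = 1 - P_c - P_w. \tag{4}$$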
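The ternary decision rule and the associated probabilities can be sketched in a few lines of Python. The sketch assumes unit-amplitude observations (mean -1 under H0 and +1 under H1) in Gaussian noise with standard deviation `sigma`; the function names are illustrative and not taken from the paper.

```python
import math
import numpy as np

def q_func(x):
    """Gaussian tail probability Q(x) = P(Z > x) for Z ~ N(0, 1)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def local_decisions(x, d):
    """Ternary decisions: +1 if x > d, -1 if x < -d, 0 (censored) otherwise."""
    x = np.asarray(x, dtype=float)
    return np.where(x > d, 1, np.where(x < -d, -1, 0))

def decision_probabilities(d, sigma):
    """Return (P_c, P_w, P_s) for threshold d and noise standard deviation sigma."""
    p_c = q_func((d - 1.0) / sigma)   # correct decision
    p_w = q_func((d + 1.0) / sigma)   # wrong decision
    return p_c, p_w, 1.0 - p_c - p_w  # remainder is censored

if __name__ == "__main__":
    print(local_decisions([-1.2, -0.1, 0.3, 0.9], d=0.5))
    print(decision_probabilities(d=0.5, sigma=0.5))
```

For `sigma = 0.5` and `d = 0.5` this gives roughly P_c = 0.841, P_w = 0.0013, and P_s = 0.157, compared with P_w of about 0.023 for d = 0. Censoring therefore removes most wrong decisions at the cost of silencing about 16% of the sensors.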

Noncoherent distributed space-time coding
The general concept of DSTBC was originally proposed in [13] to achieve a diversity gain in cooperative networks with decode-and-forward relaying. The DSTBC scheme in [14] is particularly attractive for application in networks with a large number of nodes since its decoding complexity is independent of the total number of nodes. This scheme consists of a code $\mathcal{C}$ and a set of signature vectors $\mathcal{G}$. The active relay nodes encode the (correctly decoded) source information using a $T \times N$ code matrix $\boldsymbol{\Phi} \in \mathcal{C}$. Each active relay transmits a linear combination of the columns of the information-carrying matrix $\boldsymbol{\Phi}$. The linear combination coefficients for each node are unique and are collected in a signature vector $\mathbf{g}_k \in \mathcal{G}$, $\|\mathbf{g}_k\|_2^2 = 1$, $k \in \mathcal{K}$, of length $N$.
In this work, we consider the application of the DSTBC scheme in [14] in WSNs. In particular, sensors encode their local decisions using a noncoherent DSTBC. Since we consider here a binary hypothesis testing problem, $\mathcal{C} = \{\boldsymbol{\Phi}_0, \boldsymbol{\Phi}_1\}$ has only two elements. To optimize performance under noncoherent detection, we choose $\boldsymbol{\Phi}_0$ and $\boldsymbol{\Phi}_1$ to be orthogonal, that is, $\boldsymbol{\Phi}_0^H \boldsymbol{\Phi}_1 = \mathbf{0}_{N \times N}$ (cf. [17]). Each sensor is assigned a unique signature vector $\mathbf{g}_k \in \mathcal{G}$, $\|\mathbf{g}_k\|_2^2 = 1$, $k \in \mathcal{K}$, of length $N$. For the design of deterministic and random signature vector sets $\mathcal{G}$, we refer to [14] and [15], respectively. The transmitted signal of sensor $k$ is given by

$$\mathbf{s}_k = \begin{cases} \sqrt{E}\, \boldsymbol{\Phi}_0 \mathbf{g}_k, & u_k = -1, \\ \mathbf{0}, & u_k = 0, \\ \sqrt{E}\, \boldsymbol{\Phi}_1 \mathbf{g}_k, & u_k = 1, \end{cases} \tag{5}$$

where $E$ denotes the transmitted energy of sensor $k$ per codeword. We note that sensor $k$ transmits the $T$ elements of $\mathbf{s}_k$ in $T$ consecutive symbol intervals. The total average transmitted energy per information bit is given by $E_b = EK(P_w + P_c)$.

Channel model
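We assume that the sensors transmit time synchronously and that the sensor-FC channels are frequency-nonselective and time-invariant for at least $T$ symbol intervals (see footnote 2). Therefore, using the equivalent complex baseband representation of bandpass signals, the signal samples received at the FC in $T$ consecutive symbol intervals can be expressed as

$$\mathbf{r} = \sum_{k \in \mathcal{K}} h_k \mathbf{s}_k + \mathbf{n} = \sqrt{E}\, \boldsymbol{\Phi}_0 \mathbf{G}_{\mathcal{H}_0} \mathbf{h}_{\mathcal{H}_0} + \sqrt{E}\, \boldsymbol{\Phi}_1 \mathbf{G}_{\mathcal{H}_1} \mathbf{h}_{\mathcal{H}_1} + \mathbf{n}, \tag{6}$$

where $h_k$ and $\mathbf{n}$ denote the fading gain of sensor $k$ and a complex AWGN vector, respectively. The columns of the $N \times |\mathcal{H}_0|$ matrix $\mathbf{G}_{\mathcal{H}_0}$ and the $N \times |\mathcal{H}_1|$ matrix $\mathbf{G}_{\mathcal{H}_1}$ contain the signature vectors of the sensors in $\mathcal{H}_0$ and $\mathcal{H}_1$, respectively. The corresponding fading gains are collected in column vectors $\mathbf{h}_{\mathcal{H}_0}$ and $\mathbf{h}_{\mathcal{H}_1}$, which have lengths $|\mathcal{H}_0|$ and $|\mathcal{H}_1|$, respectively. We model the channel gains $h_k$, $k \in \mathcal{K}$, as i.i.d. zero-mean complex Gaussian random variables (Rayleigh fading) with variance $\sigma_h^2 = \mathcal{E}\{|h_k|^2\} = 1$. The elements of the noise vector $\mathbf{n}$ have variance $\sigma_n^2 = N_0$, where $N_0$ denotes the power spectral density of the underlying continuous-time passband noise process.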
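The signal model can be illustrated with a small NumPy sketch that draws one received vector for a given pattern of local decisions. Several choices here are assumptions made for illustration only: the code matrices are taken from columns of a normalized Hadamard matrix (Sylvester construction), the signature vectors are random unit-norm vectors rather than the designed sets of [14,15], and all function names are hypothetical.

```python
import numpy as np

def hadamard(n):
    """n x n Hadamard matrix via the Sylvester construction (n a power of two)."""
    h = np.array([[1.0]])
    while h.shape[0] < n:
        h = np.block([[h, h], [h, -h]])
    if h.shape[0] != n:
        raise ValueError("n must be a power of two")
    return h

def make_code(T, N):
    """Two T x N matrices with orthonormal columns and Phi0^H Phi1 = 0."""
    if 2 * N > T:
        raise ValueError("need 2N <= T for two mutually orthogonal code matrices")
    h = hadamard(T) / np.sqrt(T)
    return h[:, :N], h[:, N:2 * N]

def random_signatures(N, K, rng):
    """N x K matrix whose columns are random unit-norm signature vectors."""
    g = rng.standard_normal((N, K)) + 1j * rng.standard_normal((N, K))
    return g / np.linalg.norm(g, axis=0)

def received_vector(u, phi0, phi1, G, energy, noise_var, rng):
    """One realization of r for local decisions u in {-1, 0, +1}^K."""
    u = np.asarray(u)
    K, T = G.shape[1], phi0.shape[0]
    # i.i.d. Rayleigh fading with unit variance
    h = (rng.standard_normal(K) + 1j * rng.standard_normal(K)) / np.sqrt(2.0)
    # complex AWGN with variance noise_var per element
    n = np.sqrt(noise_var / 2.0) * (rng.standard_normal(T) + 1j * rng.standard_normal(T))
    m0, m1 = (u == -1), (u == 1)      # censored sensors (u == 0) stay silent
    signal = phi0 @ (G[:, m0] @ h[m0]) + phi1 @ (G[:, m1] @ h[m1])
    return np.sqrt(energy) * signal + n

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    phi0, phi1 = make_code(T=8, N=2)
    G = random_signatures(N=2, K=5, rng=rng)
    r = received_vector([-1, -1, 0, 1, -1], phi0, phi1, G, 1.0, 0.1, rng)
    print(r.shape)
```

In this sketch a single wrong decision (the `+1` entry) places energy in the column space of the second code matrix, which is exactly the interference term that censoring is meant to suppress.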
Equation (6) clearly shows the importance of censoring when applying DSTBCs in WSNs, since incorrect sensor decisions lead to interference. For example, for $H = H_0$, ideally the term involving $\boldsymbol{\Phi}_1$ in (6) would be absent. However, incorrect decisions may cause some sensors to transmit $\sqrt{E}\,\boldsymbol{\Phi}_1 \mathbf{g}_k$ instead of $\sqrt{E}\,\boldsymbol{\Phi}_0 \mathbf{g}_k$. The considered censoring scheme reduces the number of incorrect decisions (by choosing $d > 0$) at the expense of reducing the number of sensors that make a correct decision. However, this disadvantage is outweighed by the reduction of interference as long as $d$ is not too large (cf. Section 5). We note that censoring was not considered in any of the related publications, for example, [11], [13][14][15]. In [13][14][15], DSTBCs were mainly applied for relay purposes, where a CRC code can be used to avoid the retransmission of incorrect decisions.

Footnote 2: Time synchronous transmission can be accomplished if the relative delays between the relay nodes are much smaller than the symbol duration. This is usually a reasonable assumption for low-rate WSN applications. We refer the interested reader to [18] for a more detailed discussion on time synchronism in the context of WSNs.

EURASIP Journal on Advances in Signal Processing

Processing at fusion center (FC)
The FC makes a decision based on the received vector r and outputs u 0 = 1 if it decides in favor of H 1 , and u 0 = −1 otherwise. Different decision rules may be applied at the FC differing in performance and complexity. In this context, we note that coherent detection is not feasible in large-scale WSNs since the FC would have to estimate and track the channel gains of all sensors. While (6) suggests that only the effective channels √ EG H0 h H0 and √ EG H1 h H1 have to be estimated if distributed space-time coding is applied, this is also not feasible since the sets H 0 and H 1 typically change after T symbol intervals (i.e., for every new sensor decision). Therefore, only noncoherent decision rules will be considered in the next section.

FC DECISION RULES AND PERFORMANCE ANALYSIS
In this section, we present the optimum ML and the generalized-likelihood ratio test (GLRT) noncoherent decision rules. In addition, we provide a performance analysis for the GLRT decision rule.

Optimum maximum-likelihood (ML) decision rule
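We first provide the optimum ML decision rule. For this purpose, we introduce the likelihood ratio (LR):

$$\Lambda(\mathbf{r}) \triangleq \frac{\sum_{\mathcal{H}_0, \mathcal{H}_1} f(\mathbf{r} \mid \mathcal{H}_0, \mathcal{H}_1)\, P(\mathcal{H}_0, \mathcal{H}_1 \mid H_1)}{\sum_{\mathcal{H}_0, \mathcal{H}_1} f(\mathbf{r} \mid \mathcal{H}_0, \mathcal{H}_1)\, P(\mathcal{H}_0, \mathcal{H}_1 \mid H_0)}, \tag{7}$$

where $P(\mathcal{H}_0, \mathcal{H}_1 \mid H_0) = P_c^{|\mathcal{H}_0|} P_w^{|\mathcal{H}_1|} P_s^{|\mathcal{S}|}$ and $P(\mathcal{H}_0, \mathcal{H}_1 \mid H_1) = P_w^{|\mathcal{H}_0|} P_c^{|\mathcal{H}_1|} P_s^{|\mathcal{S}|}$ denote the probabilities that the sets $\mathcal{H}_0$, $\mathcal{H}_1$ occur for $H_0$ and $H_1$, respectively. Since $\mathbf{r}$ conditioned on $\mathcal{H}_0$, $\mathcal{H}_1$ is a Gaussian vector, the conditional probability density function (pdf) $f(\mathbf{r} \mid \mathcal{H}_0, \mathcal{H}_1)$ is given by

$$f(\mathbf{r} \mid \mathcal{H}_0, \mathcal{H}_1) = \frac{1}{\pi^T \det(\mathbf{B})} \exp\!\left(-\mathbf{r}^H \mathbf{B}^{-1} \mathbf{r}\right), \tag{8}$$

where the $T \times T$ correlation matrix $\mathbf{B}$ is defined as $\mathbf{B} \triangleq \mathcal{E}\{\mathbf{r}\mathbf{r}^H \mid \mathcal{H}_0, \mathcal{H}_1\} = E\,\boldsymbol{\Phi}_0 \mathbf{G}_{\mathcal{H}_0} \mathbf{G}_{\mathcal{H}_0}^H \boldsymbol{\Phi}_0^H + E\,\boldsymbol{\Phi}_1 \mathbf{G}_{\mathcal{H}_1} \mathbf{G}_{\mathcal{H}_1}^H \boldsymbol{\Phi}_1^H + \sigma_n^2 \mathbf{I}_T$. Now we can express the ML decision rule at the FC as

$$u_0 = \begin{cases} 1, & \Lambda(\mathbf{r}) \ge 1, \\ -1, & \Lambda(\mathbf{r}) < 1. \end{cases} \tag{9}$$

We note that the sums in the numerator and denominator of (7) both have $3^K$ terms, that is, the complexity of the ML decision rule is of order $O(3^K)$ and grows exponentially with $K$. In addition, (8) reveals that for the ML decision rule the FC requires knowledge of the signature vectors of all sensors. These two requirements make the implementation of the ML decision rule difficult, if not impossible, in practice. Therefore, we provide a low-complexity suboptimum FC decision rule in the next subsection.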
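To make the exponential cost concrete, the following brute-force sketch evaluates the likelihood ratio by enumerating all 3^K decision patterns. It assumes unit-variance Rayleigh fading and equal priors as in the text; the function name `ml_decide` and its argument layout are hypothetical, and the approach is only practical for very small K.

```python
import itertools
import numpy as np

def ml_decide(r, phi0, phi1, G, energy, noise_var, p_c, p_w):
    """Brute-force ML fusion rule: returns +1 (decide H1) or -1 (decide H0)."""
    K, T = G.shape[1], phi0.shape[0]
    p_s = 1.0 - p_c - p_w
    num = 0.0  # weighted likelihood under H1
    den = 0.0  # weighted likelihood under H0
    for pattern in itertools.product((-1, 0, 1), repeat=K):  # 3^K patterns
        u = np.array(pattern)
        g0, g1 = G[:, u == -1], G[:, u == 1]
        # conditional correlation matrix of r for this pattern
        B = (energy * phi0 @ g0 @ g0.conj().T @ phi0.conj().T
             + energy * phi1 @ g1 @ g1.conj().T @ phi1.conj().T
             + noise_var * np.eye(T))
        _, logdet = np.linalg.slogdet(B)
        quad = np.real(r.conj() @ np.linalg.solve(B, r))
        f = np.exp(-quad - logdet) / np.pi ** T  # complex Gaussian pdf
        n0, n1 = g0.shape[1], g1.shape[1]
        ns = K - n0 - n1
        den += f * p_c ** n0 * p_w ** n1 * p_s ** ns
        num += f * p_w ** n0 * p_c ** n1 * p_s ** ns
    return 1 if num >= den else -1

if __name__ == "__main__":
    phi0 = np.array([[1.0], [1.0]]) / np.sqrt(2.0)
    phi1 = np.array([[1.0], [-1.0]]) / np.sqrt(2.0)
    G = np.ones((1, 3), dtype=complex)  # N = 1, K = 3, unit-modulus signatures
    print(ml_decide(2.0 * phi1[:, 0], phi0, phi1, G, 4.0, 1.0, 0.9, 0.05))
```

Already for K = 10 the loop runs over 59049 patterns per received vector, each requiring a T x T solve, which illustrates why a simpler fusion rule is attractive.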

GLRT decision rule
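The received vector can be expressed as

$$\mathbf{r} = \boldsymbol{\Phi}\, \mathbf{h}_{\mathrm{eff}} + \mathbf{n}_{\mathrm{eff}}, \tag{10}$$

where $\boldsymbol{\Phi} = \boldsymbol{\Phi}_0$, $\mathbf{h}_{\mathrm{eff}} \triangleq \sqrt{E}\, \mathbf{G}_{\mathcal{H}_0} \mathbf{h}_{\mathcal{H}_0}$, and $\mathbf{n}_{\mathrm{eff}} \triangleq \sqrt{E}\, \boldsymbol{\Phi}_1 \mathbf{G}_{\mathcal{H}_1} \mathbf{h}_{\mathcal{H}_1} + \mathbf{n}$ if $H_0$ is the true hypothesis, and $\boldsymbol{\Phi} = \boldsymbol{\Phi}_1$, $\mathbf{h}_{\mathrm{eff}} \triangleq \sqrt{E}\, \mathbf{G}_{\mathcal{H}_1} \mathbf{h}_{\mathcal{H}_1}$, and $\mathbf{n}_{\mathrm{eff}} \triangleq \sqrt{E}\, \boldsymbol{\Phi}_0 \mathbf{G}_{\mathcal{H}_0} \mathbf{h}_{\mathcal{H}_0} + \mathbf{n}$ if $H_1$ is the true hypothesis. Equation (10) suggests a two-step GLRT approach for the estimation of the transmitted codeword $\boldsymbol{\Phi}$. In the first step, $\mathbf{h}_{\mathrm{eff}}$ is estimated assuming $\boldsymbol{\Phi}$ is known, and in the second step the channel estimate $\hat{\mathbf{h}}_{\mathrm{eff}}$ is used to detect $\boldsymbol{\Phi}$. Since the correlation matrix of the effective noise $\mathbf{n}_{\mathrm{eff}}$ depends on $\mathbf{G}_{\mathcal{H}_1}$ or $\mathbf{G}_{\mathcal{H}_0}$, the ML estimate for $\mathbf{h}_{\mathrm{eff}}$ and thus the resulting GLRT decision rule depend on the signature vectors. Therefore, the complexity of this GLRT decision rule is still exponential in $K$.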
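To avoid this problem we resort to the simpler least-squares (LS) approach to channel estimation. The LS channel estimate is given by

$$\hat{\mathbf{h}}_{\mathrm{eff}} = \left(\boldsymbol{\Phi}^H \boldsymbol{\Phi}\right)^{-1} \boldsymbol{\Phi}^H \mathbf{r}. \tag{11}$$

Now, the GLRT decision rule can be expressed as

$$\hat{\boldsymbol{\Phi}} = \arg\min_{\boldsymbol{\Phi} \in \mathcal{C}} \left\|\mathbf{r} - \boldsymbol{\Phi}\hat{\mathbf{h}}_{\mathrm{eff}}\right\|_2^2 = \arg\max_{\boldsymbol{\Phi} \in \mathcal{C}}\; \mathbf{r}^H \boldsymbol{\Phi} \left(\boldsymbol{\Phi}^H \boldsymbol{\Phi}\right)^{-1} \boldsymbol{\Phi}^H \mathbf{r}, \tag{12}$$

where all irrelevant terms have been dropped. For code matrices with orthonormal columns, the metric in (12) reduces to $\|\boldsymbol{\Phi}^H \mathbf{r}\|_2^2$. The FC output is $u_0 = 1$ if $\hat{\boldsymbol{\Phi}} = \boldsymbol{\Phi}_1$ and $u_0 = -1$ otherwise. Clearly, the GLRT decision rule does not require CSI and the FC does not have to know the signature vectors of the sensors.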
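A minimal NumPy sketch of the LS-based GLRT rule is given below. It forms the LS channel estimate for each candidate code matrix and picks the candidate that explains more of the received energy; the function name `glrt_decide` is illustrative. Note that neither the signature vectors nor any channel state enters the computation.

```python
import numpy as np

def glrt_decide(r, phi0, phi1):
    """LS-based GLRT fusion rule: returns +1 if Phi1 fits r better, else -1."""
    r = np.asarray(r, dtype=complex)

    def fitted_energy(phi):
        phi = np.asarray(phi, dtype=complex)
        corr = phi.conj().T @ r                            # Phi^H r
        h_hat = np.linalg.solve(phi.conj().T @ phi, corr)  # LS channel estimate
        return np.vdot(corr, h_hat).real                   # r^H Phi (Phi^H Phi)^{-1} Phi^H r

    return 1 if fitted_energy(phi1) > fitted_energy(phi0) else -1

if __name__ == "__main__":
    phi0 = np.array([[1.0], [1.0]]) / np.sqrt(2.0)
    phi1 = np.array([[1.0], [-1.0]]) / np.sqrt(2.0)
    print(glrt_decide((0.3 + 0.4j) * phi1[:, 0], phi0, phi1))
```

Minimizing the residual of the LS fit is equivalent to maximizing the fitted energy, so the comparison above needs only two small N x N solves per decision regardless of the number of sensors.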

Performance analysis for GLRT decision rule
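For the optimum ML decision rule, a closed-form performance analysis does not seem to be feasible. Fortunately, such an analysis is possible for the more practical GLRT decision rule. In particular, the BER can be expressed as

$$P_e = P(u_0 = 1 \mid H_0)\, P(H_0) + P(u_0 = -1 \mid H_1)\, P(H_1). \tag{13}$$

Since the considered signaling scheme is symmetric in $H_0$ and $H_1$, (13) can be simplified to $P_e = P(u_0 = 1 \mid H_0)$. Expanding now $P(u_0 = 1 \mid H_0)$ leads to

$$P_e = \sum_{\mathcal{H}_0, \mathcal{H}_1} P(u_0 = 1 \mid \mathcal{H}_0, \mathcal{H}_1)\, P(\mathcal{H}_0, \mathcal{H}_1 \mid H_0), \tag{14}$$

where $P(u_0 = 1 \mid \mathcal{H}_0, \mathcal{H}_1)$ denotes the probability that $u_0 = 1$ is detected assuming that $u_k = -1$ for $k \in \mathcal{H}_0$ and $u_k = 1$ for $k \in \mathcal{H}_1$, and $P(\mathcal{H}_0, \mathcal{H}_1 \mid H_0)$ is given in Section 3.1.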
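Exploiting the orthogonality of $\boldsymbol{\Phi}_0$ and $\boldsymbol{\Phi}_1$ and using (6) and (12), $P(u_0 = 1 \mid \mathcal{H}_0, \mathcal{H}_1)$ can be expressed as

$$P(u_0 = 1 \mid \mathcal{H}_0, \mathcal{H}_1) = P(\Delta < 0), \tag{15}$$

where, for code matrices with orthonormal columns,

$$\Delta \triangleq \left\|\boldsymbol{\Phi}_0^H \mathbf{r}\right\|_2^2 - \left\|\boldsymbol{\Phi}_1^H \mathbf{r}\right\|_2^2. \tag{16}$$

Since $\Delta$ is a quadratic form of Gaussian random variables, the Laplace transform $\Phi_\Delta(s) \triangleq \mathcal{E}\{e^{-s\Delta}\}$ of the pdf of $\Delta$ can be obtained as

$$\Phi_\Delta(s) = \prod_{i=1}^{N} \frac{1}{(1 + s\lambda_{x_i})(1 - s\lambda_{y_i})}, \tag{17}$$

where $\lambda_{x_i}$ and $\lambda_{y_i}$ denote the eigenvalues of the $N \times N$ matrices

$$E\, \mathbf{G}_{\mathcal{H}_0} \mathbf{G}_{\mathcal{H}_0}^H + \sigma_n^2 \mathbf{I}_N \quad \text{and} \quad E\, \mathbf{G}_{\mathcal{H}_1} \mathbf{G}_{\mathcal{H}_1}^H + \sigma_n^2 \mathbf{I}_N, \tag{18}$$

respectively. Thus, $P(u_0 = 1 \mid \mathcal{H}_0, \mathcal{H}_1)$ can be calculated from [19]

$$P(\Delta < 0) = \frac{1}{2\pi j} \int_{c - j\infty}^{c + j\infty} \Phi_\Delta(s)\, \frac{ds}{s}, \tag{19}$$

where $c$ is a small positive constant in the region of convergence of the integral. The integral in (19) can be either computed numerically using Gauss-Chebyshev quadrature rules [19] or exactly using [20,21]

$$P(\Delta < 0) = -\sum_{\text{RHS poles}} \operatorname{Res}\!\left[\frac{\Phi_\Delta(s)}{s}\right], \tag{20}$$

where RHS stands for the right-hand side of the complex plane. The BER at the FC for the GLRT decision rule can be readily obtained by combining (14) and (19).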
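As a sanity check on the residue approach, the single-column case $N = 1$ can be worked by hand. The sketch below assumes that the Laplace transform is defined as $\Phi_\Delta(s) = \mathcal{E}\{e^{-s\Delta}\}$ with $\Delta = |\boldsymbol{\Phi}_0^H \mathbf{r}|^2 - |\boldsymbol{\Phi}_1^H \mathbf{r}|^2$, and writes $\lambda_x$ and $\lambda_y$ for the variances of $\boldsymbol{\Phi}_0^H \mathbf{r}$ and $\boldsymbol{\Phi}_1^H \mathbf{r}$; these conventions are assumptions made for illustration.

```latex
\begin{align*}
\Phi_\Delta(s) &= \frac{1}{(1 + s\lambda_x)(1 - s\lambda_y)},
  \qquad -\frac{1}{\lambda_x} < \operatorname{Re}\{s\} < \frac{1}{\lambda_y}, \\
\frac{\Phi_\Delta(s)}{s} &= \frac{-1}{\lambda_y\, s\, (1 + s\lambda_x)\left(s - \frac{1}{\lambda_y}\right)}
  \quad \text{(single RHS pole at } s = 1/\lambda_y\text{)}, \\
P(\Delta < 0) &= -\operatorname{Res}_{s = 1/\lambda_y}\!\left[\frac{\Phi_\Delta(s)}{s}\right]
  = \frac{1}{\lambda_y \cdot \frac{1}{\lambda_y} \left(1 + \frac{\lambda_x}{\lambda_y}\right)}
  = \frac{\lambda_y}{\lambda_x + \lambda_y}.
\end{align*}
```

This agrees with the direct result for two independent exponentially distributed variables with means $\lambda_x$ and $\lambda_y$. With unit-modulus scalar signatures, $\lambda_x = E|\mathcal{H}_0| + \sigma_n^2$ and $\lambda_y = E|\mathcal{H}_1| + \sigma_n^2$, so every additional wrong sensor raises the conditional error probability, in line with the interference argument in Section 2.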

OPTIMIZATION OF CENSORING THRESHOLD d
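Since a closed-form calculation of the optimum decision/censoring threshold $d$ which minimizes $P_e$ does not seem to be possible, we derive here a gradient algorithm for recursive optimization of $d$. This algorithm is given by [22]

$$d[i+1] = d[i] - \delta \left.\frac{\partial P_e}{\partial d}\right|_{d = d[i]}, \tag{21}$$

where $i$ is the discrete iteration index and $\delta$ is the adaptation step size. Using (14), the gradient in (21) can be expressed as

$$\frac{\partial P_e}{\partial d} = \sum_{\mathcal{H}_0, \mathcal{H}_1} P(u_0 = 1 \mid \mathcal{H}_0, \mathcal{H}_1)\, \frac{\partial P(\mathcal{H}_0, \mathcal{H}_1 \mid H_0)}{\partial d}, \tag{22}$$

where we have used the fact that $P(u_0 = 1 \mid \mathcal{H}_0, \mathcal{H}_1)$ is independent of $d$, and the remaining partial derivative is given by

$$\frac{\partial P(\mathcal{H}_0, \mathcal{H}_1 \mid H_0)}{\partial d} = |\mathcal{H}_0|\, P_c^{|\mathcal{H}_0| - 1} P_w^{|\mathcal{H}_1|} P_s^{|\mathcal{S}|}\, \frac{\partial P_c}{\partial d} + |\mathcal{H}_1|\, P_c^{|\mathcal{H}_0|} P_w^{|\mathcal{H}_1| - 1} P_s^{|\mathcal{S}|}\, \frac{\partial P_w}{\partial d} + |\mathcal{S}|\, P_c^{|\mathcal{H}_0|} P_w^{|\mathcal{H}_1|} P_s^{|\mathcal{S}| - 1}\, \frac{\partial P_s}{\partial d}. \tag{23}$$

Using (3), (4) and the fundamental theorem of calculus [23], the derivatives in (23) can be expressed as

$$\frac{\partial P_c}{\partial d} = -\frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-(d-1)^2/(2\sigma^2)}, \qquad \frac{\partial P_w}{\partial d} = -\frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-(d+1)^2/(2\sigma^2)}, \qquad \frac{\partial P_s}{\partial d} = -\frac{\partial P_c}{\partial d} - \frac{\partial P_w}{\partial d}. \tag{24}$$

For $d = 0$, we have $|\mathcal{S}| = 0$, and since $\partial P_w/\partial d < 0$ and $\partial P_c/\partial d < 0$ we obtain $\partial P_e/\partial d < 0$. On the other hand, for $d \to \infty$, we get $|\mathcal{H}_0| \to 0$ and $|\mathcal{H}_1| \to 0$, which results in $\partial P_e/\partial d > 0$. Therefore, since $\partial P_e/\partial d$ is continuous in $d$, the intermediate value theorem implies that $\partial P_e/\partial d = 0$ holds for at least one value of $0 \le d < \infty$, corresponding to at least one local minimum of $P_e$ [23]. Although numerical evidence shows that there is exactly one local minimum (which therefore is also the global minimum), we cannot formally prove this due to the complexity of the involved expressions. Nevertheless, the above considerations suggest that we initialize the gradient algorithm with $d[0] = 0$, corresponding to the case of no censoring. The solution found by the algorithm is then guaranteed to yield a performance not worse than that of the no-censoring case. Numerical examples are given in the next section. We note that $d$ will typically be calculated at the FC and the value of $d$ has to be conveyed to the sensors over a feedback channel. However, this feedback channel can be very low rate assuming that the statistical properties of the forward channel and the sensors vary only slowly with time.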
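The recursion can be tried out on a simplified special case. The sketch below is not the paper's general procedure: it assumes N = 1 with unit-modulus scalar signatures, for which the conditional error probability of the GLRT rule is taken to be lam_y / (lam_x + lam_y) with lam_x = E|H0| + N0 and lam_y = E|H1| + N0, it holds the per-sensor energy E fixed, and it replaces the analytic gradient by a forward finite difference. All function names, the step size, and the iteration count are illustrative choices.

```python
import math

def q_func(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def ber_n1(d, K, sigma, energy, noise_var):
    """BER at the FC for N = 1 under the assumptions stated above."""
    p_c = q_func((d - 1.0) / sigma)
    p_w = q_func((d + 1.0) / sigma)
    p_s = max(1.0 - p_c - p_w, 0.0)      # guard against rounding at d = 0
    total = 0.0
    for a in range(K + 1):               # a = number of correct sensors
        for b in range(K - a + 1):       # b = number of wrong sensors
            s = K - a - b                # s = number of censored sensors
            weight = (math.comb(K, a) * math.comb(K - a, b)
                      * p_c ** a * p_w ** b * p_s ** s)
            lam_x = energy * a + noise_var
            lam_y = energy * b + noise_var
            total += weight * lam_y / (lam_x + lam_y)
    return total

def optimize_threshold(K, sigma, energy, noise_var, step=1.0, iters=200, h=1e-5):
    """Gradient descent on d, started at d = 0 (no censoring)."""
    d = 0.0
    for _ in range(iters):
        # forward-difference approximation of dP_e/dd (assumption, not analytic)
        grad = (ber_n1(d + h, K, sigma, energy, noise_var)
                - ber_n1(d, K, sigma, energy, noise_var)) / h
        d = max(d - step * grad, 0.0)    # keep the threshold nonnegative
    return d

if __name__ == "__main__":
    d_opt = optimize_threshold(K=30, sigma=0.5, energy=1.0, noise_var=1.0)
    print(d_opt, ber_n1(d_opt, 30, 0.5, 1.0, 1.0), ber_n1(0.0, 30, 0.5, 1.0, 1.0))
```

Grouping the 3^K decision patterns by their counts (a, b, s) reduces the sum to O(K^2) terms, which is possible here only because for N = 1 the conditional error probability depends on the set sizes alone.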

SIMULATION RESULTS
In this section, we provide some numerical and simulation results for the proposed censored DSTBCs and the system model introduced in Section 2. We assume that $T = 8$ symbol intervals are available for transmission of one information bit, that is, orthogonal matrices $\boldsymbol{\Phi}_0$ and $\boldsymbol{\Phi}_1$ can be found for $N \le 4$. Here, we consider $N = 1$, $N = 2$, and $N = 4$, and generate $\boldsymbol{\Phi}_0$ and $\boldsymbol{\Phi}_1$ from columns of $\mathbf{H}_8$. For the set of signature vectors $\mathcal{G}$, we adopted the gradient sets described in [14]. Unless stated otherwise, the sensors have a local noise variance of $\sigma^2 = 1/4$, corresponding to a signal-to-noise ratio (SNR) of 6 dB, we assume the suboptimum GLRT decision rule at the FC, and $P_e$ is obtained using the analytical results presented in Section 3.3.

$d$ and $P_e$ versus $i$. First, we investigate the behavior of the adaptive algorithm described in Section 4 for optimization of $d$. Figure 2 shows $d$ and the corresponding BER $P_e$ at the FC as a function of the iteration number $i$ for $N = 1$, 2, and 4, respectively. The considered WSN had $K = 30$ sensors and the channel SNR was $10\log_{10}(E_b/N_0) = 15$ dB. $d[i]$ was initialized with 0 and the step size parameter was chosen to achieve fast convergence while avoiding instabilities. As can be observed from Figure 2, the adaptive algorithm significantly improves the BER over the iterations. While $d$ itself requires more than 600 iterations to converge to the final optimum value, $P_e$ practically does not change after about 180 iterations for all considered cases. It is interesting to note that the optimum value for $d$ decreases with increasing $N$, that is, for larger $N$ less censoring should be applied. The reason for this behavior is that the maximum achievable diversity order of a DSTBC is $N$ (cf. [14]) and therefore, the performance of the DSTBC improves notably with increasing number of transmitting sensors only until $N$ sensors transmit.
If more than $N$ sensors transmit, the diversity order does not further improve and only a small additional coding gain can be realized. On the other hand, less censoring means that more erroneous decisions are forwarded to the FC, which may negate the additional coding gain.

$P_e$ versus $10\log_{10}(E_b/N_0)$. In Figure 3, we consider the BER achievable with the proposed censored DSTBCs at the FC of a WSN with $K = 30$ sensors as a function of the channel SNR $10\log_{10}(E_b/N_0)$. For each considered $N$, we compare the BER for error-free local sensor decisions ($\sigma^2 = 0$, $d = 0$), noisy sensor decisions without censoring ($\sigma^2 = 1/4$, $d = 0$), and noisy sensor decisions with censoring ($\sigma^2 = 1/4$, $d = d_{\mathrm{opt}}$), where $d_{\mathrm{opt}}$ denotes the optimum decision/censoring threshold found with the gradient algorithm. Figure 3 clearly shows that DSTBCs suffer from a significant performance degradation due to erroneous decisions if censoring is not applied. Fortunately, with censoring this performance degradation can be avoided and a performance close to that of error-free local decisions can be achieved. Figure 3 also nicely illustrates the diversity gain that can be realized with censored DSTBCs.

$P_e$ versus $K$. In Figure 4, we investigate the dependence of the BER on the total number of sensors in the network for $10\log_{10}(E_b/N_0) = 15$ dB. In particular, we show in Figure 4 the BER for error-free local sensor decisions and the GLRT decision rule at the FC ($\sigma^2 = 0$, $d = 0$), noisy sensor decisions with censoring and the GLRT decision rule at the FC ($\sigma^2 = 1/4$, $d = d_{\mathrm{opt}}$), and noisy sensor decisions with censoring and the ML decision rule at the FC ($\sigma^2 = 1/4$, $d = d_{\mathrm{opt}}$). The results for the GLRT decision rule were obtained numerically based on the analytical results in Section 3.3, whereas Monte Carlo simulation was used to obtain the results for the ML decision rule. For complexity reasons, for the latter case, we only show the results for $K \le 5$.
For error-free local sensor decisions, the BER is constant for $K > N$ since the diversity order is limited to $N$ and the DSTBC achieves the same performance as the related STBC $\mathcal{C}$ for colocated antennas if all $K > N$ sensors transmit. The censored DSTBC with noisy sensor decisions approaches the performance of the DSTBC with error-free sensor decisions as the number of sensors increases. This is due to the fact that as $K$ increases, the decision/censoring threshold $d_{\mathrm{opt}}$ increases, making the transmission of erroneous sensor decisions less likely. Figure 4 also shows that the GLRT decision rule is almost optimum and only small additional gains are possible if the significantly more complex ML decision rule is used.

$P_e$ and $d$ versus $N$. Assuming the GLRT decision rule and $10\log_{10}(E_b/N_0) = 15$ dB at the FC, Figure 5 shows $P_e$ and the corresponding optimum decision threshold $d$ as a function of $N$ for $K = 1$, 2, 4, 10, and 30. Similar to the observation we made in Figure 2, $d$ decreases for increasing signature vector length $N$ for all $K$. As we have mentioned before, the maximum achievable diversity order for DSTBC is $N$. For a given $K$, a smaller $d$ allows more sensors to be active and thus exploits the extra diversity benefit provided by the longer signature vectors. This figure also shows that $d$ increases for increasing $K$. This can also be explained easily. For a given $d$ and $N$, increasing $K$ allows more sensors to transmit. However, our scheme only requires a certain number of sensors to be active to exploit the full diversity benefit and achieve a certain target BER. On the other hand, increasing $d$ decreases the chance of having erroneous decisions being transmitted to the FC. This suggests that our scheme tries to maximize the performance by only allowing the minimum number of sensors (with quality decisions) to transmit. Finally, it is interesting to see that the $P_e$ performance actually deteriorates for $N > K$ for the GLRT fusion rule.
This is because for $N > K$ the GLRT fusion rule implicitly estimates the $N \times 1$ effective channel vector $\mathbf{h}_{\mathrm{eff}}$ in a noisy environment (cf. (11)), whereas the underlying channel vectors, $\mathbf{h}_{\mathcal{H}_0}$ and $\mathbf{h}_{\mathcal{H}_1}$, have a smaller dimensionality $K$. The increased dimensionality causes a larger channel estimation error while no diversity benefit is achieved because the maximum diversity order is limited to $K$ [14]. In light of this degradation for the GLRT fusion rule, we also simulated the ML fusion rule for $K = 1$ and $K = 2$ (dashed curves) and clearly, as expected, the ML decision rule does not suffer from the same degradation. We note that in the practically more relevant case of $N < K$, the ML and GLRT decision rules have similar performances (cf. Figure 4).

$P_e$ and $d$ versus SNR of local sensors. We investigate the effect of local sensor observation noise on the $P_e$ performance in Figure 6. In particular, we plot $P_e$ versus the SNR of local sensors $10\log_{10}(1/\sigma^2)$ for different $K$ and $N$. We assume the GLRT fusion rule at the FC and the corresponding optimum decision threshold $d$ is also depicted. Furthermore, the channel SNR is fixed to $10\log_{10}(E_b/N_0) = 15$ dB for all cases. As expected, the network with $K = 30$ sensors performs better than the network with $K = 10$ sensors for any $N$ regardless of the sensor observation noise. However, this gain is minimal for large sensor SNR. This is because as the sensor SNR increases, most of the sensor decisions will be correct and less censoring is required. This phenomenon is clearly supported by the corresponding $d$ versus $10\log_{10}(1/\sigma^2)$ figure, where the optimum decision threshold $d$ approaches zero for increasing sensor SNR.
In addition, as more sensors transmit, the maximum achievable diversity order $N$ and the channel SNR will be the ultimate factors which determine $P_e$ and therefore, for a given $N$, the BER curves for $K = 10$ and $K = 30$ converge to the same value for large local sensor SNR.

I.n.d. Rayleigh fading. Until now, we have been considering i.i.d. Rayleigh fading channels. In our last example, we consider independent and nonidentically distributed (i.n.d.) fading channels. In particular, we consider a network with $K = 30$ sensors, where the sensor nodes are uniformly distributed in a circle with radius $r$ and the distance from the center of the circle to the FC is $d$. We assume i.n.d. Rayleigh fading between the sensors and the FC, and the received power decreases as $d_k^{-\alpha}$, where $d_k$ is the distance measured from sensor $k$ to the FC and $\alpha = 3$ is the path loss exponent. Figure 7 depicts the simulated $P_e$ versus $10\log_{10}(E_b/N_0)$ for different $r/d$ ratios. For a given $N$, the decision threshold $d$ was optimized for $r/d = 0$ (corresponding to i.i.d. fading) and it was then used also for $r/d > 0$. It can be seen from the figure that, as expected, $P_e$ increases with increasing $r/d$. It is also interesting to note that the performance degradation is larger for larger $N$. This can be explained as follows. For a given network size $K$, as we have seen in Figures 4 and 5, $d$ decreases for increasing $N$. Since a smaller censoring threshold $d$ corresponds to a larger number of active sensors, more sensors are negatively affected by the i.n.d. channels, resulting in the greater performance degradation for larger $N$.

CONCLUSION
In this paper, we have considered the application of noncoherent DSTBCs in WSNs. We have introduced censoring as an efficient method to overcome the negative effects of erroneous local sensor decisions on the performance of the noncoherent DSTBC. Furthermore, we have derived optimum ML and suboptimum GLRT FC decision rules, and we have analyzed the performance of the latter decision rule. Based on this analysis, we have devised a gradient algorithm for recursive optimization of the decision/censoring threshold. Numerical and simulation results have shown the effectiveness of censoring which eliminates the effect of local decision errors for practically relevant BERs if the number of sensors in the network K is greater than the length of the signature vectors N or in other words, if there are enough sensors to exploit the diversity benefit provided by the DSTBC. Finally, our results have shown that the suboptimum GLRT fusion rule performs very close to the optimum ML fusion rule while having a very low complexity and allowing noncoherent detection at the FC.