Decentralized estimation over orthogonal multiple-access fading channels in wireless sensor networks--optimal and suboptimal estimators
© Wang and Yang; licensee Springer. 2011
Received: 26 November 2010
Accepted: 12 December 2011
Published: 12 December 2011
We study optimal and suboptimal decentralized estimators in wireless sensor networks over orthogonal multiple-access fading channels. Considering multiple-bit quantization for digital transmission, we develop maximum likelihood estimators (MLEs) with both known and unknown channel state information (CSI). When training symbols are available, we derive an MLE that is a special case of the MLE with unknown CSI. It implicitly uses the training symbols to estimate the CSI, exploits the channel estimate in an optimal way, and performs best in realistic scenarios where CSI needs to be estimated and transmission energy is constrained. To reduce the computational complexity of the MLE with unknown CSI, we propose a suboptimal estimator. These optimal and suboptimal estimators exploit both signal- and data-level redundant information to combat the observation noise and the communication errors. Simulation results show that the proposed estimators are superior to the existing approaches and that the suboptimal estimator performs close to the optimal MLE.
Keywords: decentralized estimation; maximum likelihood estimation; fading channels; wireless sensor network
Wireless sensor networks (WSNs) consist of a number of sensors deployed in a field to collect information, for example, measuring physical parameters such as temperature and humidity. Since the sensors are usually powered by batteries and have very limited processing and communication abilities , the parameters are often estimated in a decentralized way. In typical WSNs for decentralized estimation, there exists a fusion center (FC). The sensors transmit their locally processed observations to the FC, and the FC generates the final estimation based on the received signals .
Both observation noise and communication errors deteriorate the performance of decentralized estimation. Traditional fusion-based estimators are able to minimize the mean square error (MSE) of the parameter estimation by assuming perfect communication links (see  and references therein). They reduce the observation noise by exploiting the redundant observations provided by multiple sensors. However, their performance degrades dramatically when communication errors cannot be ignored or corrected. On the other hand, various wireless communication technologies that aim at achieving transmission capacity or improving reliability do not minimize the MSE of the parameter estimation. For example, although diversity combining reduces the bit error rate (BER), it requires that the signals transmitted from multiple sensors be identical, which is not true in the context of WSNs because of the observation noise at the sensors. This motivates optimizing the estimator at the FC under realistic observation and channel models so as to minimize the MSE of the parameter estimation.
The bandwidth and energy constraints are two critical issues in the design of WSNs. Under a strict bandwidth constraint, decentralized estimation in which the sensors transmit only one bit per observation, that is, use binary quantization, is studied in [4–9]. When communication channels are noiseless, a maximum likelihood estimator (MLE) is introduced and optimal quantization is discussed in . A universal and isotropic quantization rule is proposed in , and adaptive binary quantization methods are studied in [7, 8]. When channels are noisy, the MLE in additive white Gaussian noise (AWGN) channels is studied and several low complexity suboptimal estimators are derived in . It has been found that the binary quantization is sufficient for decentralized estimation at low observation signal-to-noise ratio (SNR), but more bits are required for each observation at high observation SNR .
When the energy constraint and general multi-level quantizers are considered, various issues of the decentralized estimation are studied under different channels. When communications are error free, the quantization at the sensors is designed in [10–12]. The optimal trade-off between the number of active sensors and the quantization bit rate of each sensor is investigated under total energy constraint in . In binary symmetrical channels (BSCs), the power scheduling is proposed to reduce the estimation MSE when the best linear unbiased estimator (BLUE) and a quasi-BLUE, where quantization noise is taken into account, are used at the FC . Nonetheless, to the best of the authors' knowledge, the optimal decentralized estimator using multiple-bit quantization in fading channels is still unavailable. Although the MLE proposed in AWGN channels  can be applied for fading channels if the channel state information (CSI) is known at the FC, it only considers binary quantization.
Besides decentralized estimation based on digital communications, estimation based on analog communications has received considerable attention owing to the important conclusions drawn from studies of the multi-terminal coding problem [15, 16]. The most popular scheme is amplify-and-forward (AF) transmission, which is proved to be optimal in quadratic Gaussian sensor networks under multiple-access channels (MACs) with AWGN . The power scheduling and energy efficiency of AF transmission are studied under AWGN channels in , where AF transmission is shown to be more energy efficient than digital communications. However, in fading channels, AF transmission is no longer optimal in orthogonal MACs [19–21]. The outage laws of the estimation diversity with AF transmission in fading channels are studied in  and  in different asymptotic regimes. These studies, especially the results in , indicate that the separate source-channel coding scheme is optimal in fading channels with orthogonal multiple-access protocols and outperforms AF transmission, a simple joint source-channel coding scheme.
In this paper, we develop optimal and suboptimal decentralized estimators for a deterministic parameter considering digital communication. The observations of the sensors are quantized, coded and modulated, and then transmitted to the FC over Rayleigh fading orthogonal MACs. Because the binary quantization is only applicable at low observation SNR levels [4, 13], a general multi-bit quantizer is considered.
We aim to derive MLEs and a feasible suboptimal estimator for different local processing and communication strategies. To this end, we first present a general message function to represent various quantization and transmission schemes. We then derive the MLE for an unknown parameter with known CSI at the FC.
In typical WSNs, the sensors usually cannot transmit many training symbols for the receiver to estimate channel coefficients because of both energy and bandwidth constraints. Therefore, we consider realistic scenarios in which the CSI is unknown at the FC and no or only a few training symbols are available. It is known that channel information has a large impact on the structure and the performance of decentralized estimation. In orthogonal MACs, most of the existing works assume that perfect CSI is available at the FC. Recently, the impact of channel estimation errors on the decentralized detection in WSNs is studied in , and its impact on the decentralized estimation when using AF transmission is investigated in . However, decentralized estimation with unknown CSI for digital communications is still not well understood.
Our contributions are summarized as follows. We develop the decentralized MLEs with known and unknown CSI at the FC over orthogonal MACs with Rayleigh fading. The performance of the MLE with known CSI can serve as a practical performance lower bound of the decentralized estimation, whereas the MLE with unknown CSI is more realistic. For the special cases of error-free communications or noiseless observations, we show that the MLEs degenerate into the well-known centralized fusion estimator (BLUE), or into a maximal ratio combiner (MRC)-based estimator when CSI is known and a subspace-based estimator when CSI is unknown. This indicates that our estimators exploit both data-level redundancy and signal-level redundancy provided by multiple sensors. To provide a feasible estimator with affordable complexity, we propose a suboptimal algorithm, which can be viewed as a modified expectation-maximization (EM) algorithm .
The rest of the paper is organized as follows. Section 2 describes the system models. Section 3 presents the MLEs with known and unknown CSI and their special cases, and Section 4 introduces the suboptimal estimator. In Section 5, we analyze the asymptotic performance and complexity of the presented MLEs and discuss the codebook issue. Simulation results are provided in Section 6, and the conclusions are given in Section 7.
2 System model
We consider a typical WSN that consists of N sensors and an FC and measures an unknown deterministic parameter θ, with no communication among the sensors. The sensors process their observations of the parameter θ before transmission. For digital communications, the processing includes quantization, channel coding and modulation. For analog communications, the processing may simply be amplifying the observations before transmission. A messaging function c(x) is used to describe the local processing. Though we can use c(x) for both digital and analog communication systems, we focus on digital transmission since the popular analog transmission scheme, AF, has been shown not to be optimal in fading channels [19–21].
2.1 Observation model
where n_{s,i} is the independent and identically distributed (i.i.d.) Gaussian observation noise with zero mean and variance , and θ is bounded within a dynamic range [-V, +V].
2.2 Quantization, coding, and modulation
where Δ = 2W/(M - 1) is the quantization interval.
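As an illustration of the quantization step, the following minimal Python sketch maps observations to the nearest of M uniformly spaced levels with interval Δ = 2W/(M - 1). The level placement S_m = -W + mΔ, the clipping of out-of-range observations and the function name `uniform_quantize` are assumptions made for this sketch, since the exact quantization rule is not recoverable from the surrounding text.

```python
import numpy as np

def uniform_quantize(x, W, M):
    """Map observations to the nearest of M levels S_m = -W + m * delta (assumed rule)."""
    delta = 2.0 * W / (M - 1)                        # quantization interval from the text
    x = np.clip(np.asarray(x, dtype=float), -W, W)   # assumption: clip to [-W, +W]
    m = np.rint((x + W) / delta).astype(int)         # level index in 0..M-1
    return m, -W + m * delta

# Toy usage with M = 5 levels on [-1, 1], so delta = 0.5.
idx, vals = uniform_quantize([-1.2, 0.0, 0.26, 1.0], W=1.0, M=5)
```

Nearest-level rounding is only one possible choice; a quantizer with different decision thresholds would change the probabilities p(S_m|θ) used later, but not the overall structure of the estimators.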
where c_m = [c_{m,1}, ..., c_{m,L}]^T is the vector of L symbols corresponding to the quantized observation S_m, m = 0, ..., M - 1.
which can be used to describe any coding and modulation scheme following the M-level quantization.
The sensors can use various codes such as natural binary codes to represent the quantized observations. In this paper, our focus is to design decentralized estimators; therefore, we will not address the transmission codebook optimization for parameter estimation.
2.3 Received signals
where y_i = [y_{i,1}, ..., y_{i,L}]^T, h_i is the channel coefficient, which is i.i.d. and follows a complex Gaussian distribution with zero mean and unit variance, n_{c,i} is a vector of thermal noise at the receiver that follows a complex Gaussian distribution with zero mean and covariance matrix , and is the transmission energy for each observation.
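The received-signal model can be simulated with a short sketch. It assumes the form y_i = sqrt(E) h_i c(x_i) + n_{c,i}, with noise covariance equal to a scalar variance times the identity; the energy symbol `energy`, the scalar `noise_var` and both function names are placeholders chosen here, because the corresponding symbols did not survive in the text.

```python
import numpy as np

def complex_gaussian(shape, variance, rng):
    """Draw circularly symmetric complex Gaussian samples with the given variance."""
    scale = np.sqrt(variance / 2.0)
    return scale * (rng.standard_normal(shape) + 1j * rng.standard_normal(shape))

def receive(codewords, energy, noise_var, rng):
    """Assumed model: y_i = sqrt(energy) * h_i * c(x_i) + n_i, one row per sensor."""
    codewords = np.asarray(codewords, dtype=complex)
    n_sensors, length = codewords.shape
    h = complex_gaussian(n_sensors, 1.0, rng)                # Rayleigh fading, unit variance
    noise = complex_gaussian((n_sensors, length), noise_var, rng)
    y = np.sqrt(energy) * h[:, None] * codewords + noise
    return y, h

rng = np.random.default_rng(0)
sent = np.array([[1, -1, 1, 1], [1, 1, -1, 1], [-1, -1, 1, 1]])
y, h = receive(sent, energy=2.0, noise_var=0.0, rng=rng)     # noise-free check
```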
3 Optimal estimators with or without CSI
In this section, we derive MLEs when CSI is known or unknown at the receiver of the FC, respectively. To understand how they deal with both the communication errors and the observation noises, we study two special cases. The MLE using training symbols in the transmission codebook is also studied as a special form of the MLE with unknown CSI.
3.1 MLE with known CSI
where ||z||_2 = (z^H z)^{1/2} is the l2 norm of the vector z.
Substituting (9) and (10) into (8), we obtain the log-likelihood function for estimating θ, which can be used for any messaging function c(x), regardless of whether it describes analog or digital communications.
The MLE is obtained by maximizing the log-likelihood function shown in (11).
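A hedged sketch of how such a log-likelihood could be evaluated for digital transmission is given here. It assumes the mixture form log p(Y|h, θ) = Σ_i log Σ_m p(y_i|h_i, c_m) p(S_m|θ), a nearest-level quantizer, and white receiver noise with variance `sigma_c2`; these forms and all function and parameter names are assumptions of the sketch and are not copied from equations (11) to (13), which are not available in this text.

```python
import math
import numpy as np

def level_probs(theta, levels, sigma_s):
    """p(S_m | theta) for a nearest-level quantizer and Gaussian observation noise."""
    delta = levels[1] - levels[0]
    edges = np.concatenate(([-np.inf], levels[:-1] + delta / 2.0, [np.inf]))
    cdf = np.array([0.5 * (1.0 + math.erf((e - theta) / (sigma_s * math.sqrt(2.0))))
                    for e in edges])
    return np.diff(cdf)

def log_likelihood_known_csi(theta, y, h, codebook, levels, energy, sigma_c2, sigma_s):
    """Sum over sensors of log sum_m p(y_i | h_i, c_m) p(S_m | theta), constants dropped."""
    diff = y[:, None, :] - np.sqrt(energy) * h[:, None, None] * codebook[None, :, :]
    log_pc = -np.sum(np.abs(diff) ** 2, axis=2) / sigma_c2        # shape (N, M)
    log_ps = np.log(level_probs(theta, levels, sigma_s) + 1e-300)
    a = log_pc + log_ps[None, :]
    a_max = a.max(axis=1, keepdims=True)                          # log-sum-exp for stability
    return float(np.sum(a_max[:, 0] + np.log(np.sum(np.exp(a - a_max), axis=1))))

# Toy check: four sensors, one-symbol codebook, all sensors send level S_3 = 0.5.
levels = np.linspace(-1.0, 1.0, 5)
codebook = levels[:, None].astype(complex)
h = np.ones(4, dtype=complex)
y = h[:, None] * codebook[3][None, :]
ll_true = log_likelihood_known_csi(0.5, y, h, codebook, levels, 1.0, 0.01, 0.1)
ll_far = log_likelihood_known_csi(-0.5, y, h, codebook, levels, 1.0, 0.01, 0.1)
```

In this toy setting the value at θ = 0.5 exceeds the value at θ = -0.5, which is the behavior a maximizer of the log-likelihood relies on.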
3.1.1 Special case when
where δ(x) is the Dirac-delta function.
where c(θ) is the transmitted symbols when the observations of the sensors are θ.
For digital communications, c(θ) is a code word of C_t and is a piecewise constant function of θ. Therefore, we cannot obtain θ by taking the partial derivative of (15). Instead, we first regard c(θ) as the parameter to be estimated and obtain the MLE for estimating c(θ). Then, we use it as a decision variable to detect the transmitted symbols and reconstruct θ according to the quantization rule with the decision results.
It follows that when the observations are perfect, the structure of the MLE is the MRC concatenated with data demodulation and parameter reconstruction. This is no surprise since in this case, all the signals transmitted by different sensors are identical; thus, the receiver at the FC is able to apply the conventional diversity technology to reduce the communication errors.
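The MRC-then-detect structure described above can be sketched as follows. The sketch assumes the received model y_i = sqrt(E) h_i c(θ) + n_{c,i}, minimum-distance detection on the combined signal, and reconstruction of θ as the detected quantization level; the function name and the toy codebook are illustrative only.

```python
import numpy as np

def mrc_estimate(y, h, codebook, levels, energy):
    """Combine with MRC, detect the nearest codeword, and map it back to a level."""
    # MRC: weight each sensor's signal by the conjugate channel and normalize.
    z = (np.conj(h)[:, None] * y).sum(axis=0) / (np.sqrt(energy) * np.sum(np.abs(h) ** 2))
    # Minimum-distance detection of the transmitted codeword.
    m_hat = int(np.argmin(np.sum(np.abs(codebook - z[None, :]) ** 2, axis=1)))
    return float(levels[m_hat]), m_hat

# Toy usage: three sensors all transmit codeword 2 over known channels, no noise.
levels = np.linspace(-1.0, 1.0, 4)
codebook = np.array([[-1, -1], [-1, 1], [1, -1], [1, 1]], dtype=complex)
h = np.array([0.8 + 0.3j, -0.2 + 1.1j, 0.5 - 0.7j])
energy = 2.0
y = np.sqrt(energy) * h[:, None] * codebook[2][None, :]
theta_hat, m_hat = mrc_estimate(y, h, codebook, levels, energy)
```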
3.1.2 Special case when
When the communications are perfect, . It means that y i merely depends on or equivalently depends on . Then, the log-likelihood function becomes a function of the quantized observation .
It is also no surprise to see that the MLE reduces to BLUE, which is often applied in centralized estimation , where the FC can obtain all raw observations of the sensors.
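For reference, the BLUE for combining unbiased, uncorrelated estimates of a common scalar is the inverse-variance weighted mean. The short sketch below shows this standard form; the function name `blue` and the example numbers are illustrative, and the sketch is not meant to reproduce equation (20) exactly.

```python
import numpy as np

def blue(estimates, variances):
    """Inverse-variance weighted mean and its variance for uncorrelated estimates."""
    w = 1.0 / np.asarray(variances, dtype=float)
    theta_hat = np.sum(w * np.asarray(estimates, dtype=float)) / np.sum(w)
    return float(theta_hat), float(1.0 / np.sum(w))

# Equal variances reduce to the sample mean; unequal ones favor the better sensor.
mean_equal, var_equal = blue([1.0, 3.0], [1.0, 1.0])
mean_unequal, var_unequal = blue([0.0, 4.0], [1.0, 3.0])
```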
3.2 MLE with unknown CSI
In practical WSNs, the FC usually has no CSI, and the sensors can transmit training symbols to facilitate channel estimation. The training symbols can be incorporated into the message function c(x). Then, the MLE with training symbols available is a special form of the MLE with unknown CSI. We derive the MLE with unknown CSI for a general c(x) in the following and derive the MLE with training symbols in c(x) in the next subsection.
which has a similar form to the likelihood function with known CSI shown in (8).
Therefore, c(x) is an eigenvector of R y , and the corresponding eigenvalue is .
where α is a constant.
3.2.1 Special case when
Similarly to the log-likelihood function with known CSI, the log-likelihood function with unknown CSI for perfect observations has the same form for both analog and digital communications.
where vmax(M) is the eigenvector corresponding to the maximal eigenvalue of the matrix M.
This shows that when CSI is unknown at the FC in the case of noise-free observations, the MLE becomes a subspace-based estimator.
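A subspace-based detector of this kind can be sketched with a sample covariance matrix and its principal eigenvector. The sketch assumes R_y is estimated as (1/N) Σ_i y_i y_i^H and that the codeword is chosen by its normalized correlation with the principal eigenvector; the selection rule, the function name and the toy codebook (whose first symbol acts as a training symbol) are assumptions.

```python
import numpy as np

def subspace_detect(y, codebook):
    """Pick the codeword best aligned with the principal eigenvector of R_y."""
    n_sensors = y.shape[0]
    r_y = y.T @ y.conj() / n_sensors                 # sample estimate of sum_i y_i y_i^H / N
    eigvals, eigvecs = np.linalg.eigh(r_y)           # eigenvalues in ascending order
    v_max = eigvecs[:, -1]                           # eigenvector of the largest eigenvalue
    score = np.abs(codebook.conj() @ v_max) ** 2 / np.sum(np.abs(codebook) ** 2, axis=1)
    return int(np.argmax(score)), v_max

# Toy usage: six sensors send codeword 2 through random unknown channels, no noise.
rng = np.random.default_rng(1)
codebook = np.array([[1, 1, 1], [1, 1, -1], [1, -1, 1], [1, -1, -1]], dtype=float)
h = rng.standard_normal(6) + 1j * rng.standard_normal(6)
y = h[:, None] * codebook[2][None, :]
m_hat, v_max = subspace_detect(y, codebook)
```

Because the eigenvector is defined only up to a phase factor, the score uses the magnitude of the correlation, which is why codebooks containing both c and -c cannot be resolved this way.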
3.2.2 Special case when
When the communication SNR tends to infinity, the receiver of the FC can recover the quantized observations of the sensors without error if a proper codebook, which will be discussed in Section 5.3, is applied. Then, the MLE with unknown CSI also degenerates into the BLUE shown in (20).
3.3 MLE with unknown CSI using training symbols
where both n_{cp,i} and n_{cd,i} are vectors of thermal noise at the receiver. Note that y_{i,p} is independent of the observation x_i.
where y_{i,p} and y_{i,d} are, respectively, the received signals corresponding to the training symbols and the data symbols, and β is a constant.
where can be obtained following Matrix Inversion Lemma .
where n ci, d is the receiver thermal noise.
By deriving the conditional PDF from (44), we can obtain a log-likelihood function that is exactly the same as that shown in (42). This implies that the MLE with unknown CSI exploits the available training symbols implicitly to provide an optimal channel estimate and then uses it to provide the optimal estimation of θ.
By maximizing (45), we obtain a coherent estimator since there only exists the coherent term in this log-likelihood function. By contrast, there exists a coherent term as well as a non-coherent term in the log-likelihood function in (42). This means that the MLE obtained from (42) uses the channel estimate as a "partial" CSI that accounts for the channel estimation errors. The true value of the channel coefficients contained in the channel estimate corresponds to the coherent term in the log-likelihood function, whereas the uncertainty in the channel estimate, that is, the estimation errors, leads to the non-coherent term. We will compare the performance of the two estimators through simulations in Section 6.
4 Suboptimal estimator
In the previous section, we developed the MLE with known CSI, which is not feasible in real-world systems since perfect CSI cannot be provided, especially in WSNs with a strict energy constraint. Nevertheless, its performance can serve as a practical lower bound when both the observation noise and the communication errors are present.
The MLE with unknown CSI is more practical, but is too complex for application. Nonetheless, its structure provides some useful hints for deriving a low-complexity estimator. In the following, we derive a suboptimal algorithm for the case with unknown CSI.
which is the necessary condition for the MLE.
The term inside the sum of the right-hand side of the likelihood equation shown in (51) is actually the MMSE estimator of for a given θ. This indicates that we can regard the MLE as a two-stage estimator.
During the first stage, it estimates with the received signals from each sensor. During the second stage, it combines by a sample mean estimator.
We present a suboptimal estimator with a similar two-stage structure. This estimator can be viewed as a modified EM algorithm  since its two-stage structure is similar to the EM algorithm. Because the likelihood function shown in (31) has multiple extrema and the equation shown in (50) is only a necessary condition, the initial value of the iterative computation is critical to the convergence of the iterative algorithm. To obtain a good initial value, the suboptimal estimator estimates by assuming it to be uniformly distributed. Furthermore, since the estimation quality of the first stage is available, we use BLUE to obtain for exploiting the quality information instead of using the MLE in the M-step as in the standard EM algorithm.
Now we derive the mean and variance of , which will be used in the BLUE of θ.
However, in our algorithm is not the true value since we use instead of θ to get it. Therefore, the MMSE estimate may be biased. Because it is hard to obtain this bias in practical systems, we regard the MMSE estimator as an unbiased estimate in our suboptimal algorithm and evaluate the resulting performance loss via simulations later.
Let k denote the iteration index. The iterative algorithm performed at the FC can be summarized as follows:
(S1) When k = 1, set as the initial value.
(S2) Compute , i = 1, ..., N, and its variance with (54) and (56).
(S3) Substitute and its variance into (57) to get .
(S4) Update using , i.e., .
(S5) Repeat steps (S2) to (S4) to obtain until the algorithm converges or a predetermined number of iterations is reached.
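The steps above can be sketched as the following two-stage iteration. This is a minimal sketch under stated assumptions, not the authors' exact algorithm: it assumes p(y_i|c_m) is the zero-mean complex Gaussian likelihood obtained by averaging over h_i ~ CN(0, 1), uses the posterior mean of the quantized observation as the soft estimate, and takes its error variance as the posterior variance plus the observation noise variance. Equations (54), (56) and (57) are not available in this text, so these forms, and all names, are assumptions.

```python
import math
import numpy as np

def level_probs(theta, levels, sigma_s):
    """p(S_m | theta) for a nearest-level quantizer and Gaussian observation noise."""
    delta = levels[1] - levels[0]
    edges = np.concatenate(([-np.inf], levels[:-1] + delta / 2.0, [np.inf]))
    cdf = np.array([0.5 * (1.0 + math.erf((e - theta) / (sigma_s * math.sqrt(2.0))))
                    for e in edges])
    return np.diff(cdf)

def log_p_y_given_c(y, codebook, energy, sigma_c2):
    """log p(y_i | c_m) up to a constant, with the Rayleigh channel averaged out."""
    corr = np.abs(y @ codebook.conj().T) ** 2            # |c_m^H y_i|^2, shape (N, M)
    c_norm = np.sum(np.abs(codebook) ** 2, axis=1)       # ||c_m||^2
    quad = energy * corr / (sigma_c2 * (sigma_c2 + energy * c_norm))
    return quad - np.log1p(energy * c_norm / sigma_c2)

def iterative_estimate(y, codebook, levels, energy, sigma_c2, sigma_s, n_iter=20, tol=1e-8):
    log_pc = log_p_y_given_c(y, codebook, energy, sigma_c2)
    prior = np.full(len(levels), 1.0 / len(levels))      # (S1) uniform initial assumption
    theta = None
    for _ in range(n_iter):
        a = log_pc + np.log(prior + 1e-300)[None, :]
        w = np.exp(a - a.max(axis=1, keepdims=True))
        w /= w.sum(axis=1, keepdims=True)                # posterior over levels per sensor
        x_hat = w @ levels                               # (S2) soft estimates
        var = np.maximum(w @ levels ** 2 - x_hat ** 2, 0.0) + sigma_s ** 2
        theta_new = np.sum(x_hat / var) / np.sum(1.0 / var)   # (S3) BLUE combining
        converged = theta is not None and abs(theta_new - theta) < tol
        theta = theta_new
        if converged:                                    # (S5) stopping rule
            break
        prior = level_probs(theta, levels, sigma_s)      # (S4) update with the new theta
    return float(theta)

# Toy usage: first symbol is a training symbol; all sensors send codeword 2, no noise.
levels = np.linspace(-1.0, 1.0, 4)
codebook = np.array([[1, -1, -1], [1, -1, 1], [1, 1, -1], [1, 1, 1]], dtype=complex)
h = np.array([1.0 + 1.0j, 0.5 - 2.0j, -1.5 + 0.3j])
y = h[:, None] * codebook[2][None, :]
theta_hat = iterative_estimate(y, codebook, levels, energy=1.0, sigma_c2=0.01, sigma_s=0.1)
```

Using BLUE in the second stage lets sensors with an ambiguous soft estimate (large posterior variance) contribute less, which mirrors the motivation given for replacing the M-step of a standard EM algorithm.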
Note that this suboptimal algorithm differs from the one proposed in , which applies the maximum a posteriori (MAP) criterion to detect binary observations of sensors and then uses the results as the true values of the observations in an MLE derived in noise-free channels. Our suboptimal algorithm inherits the structure of the MLE developed in fading channels, which first gives "soft" estimates of the quantized observations and then combines them with a linear optimal estimator. By conducting these two stages iteratively, the estimation accuracy is improved rapidly. Although the suboptimal algorithm may converge to a locally optimal solution because of the non-convexity of the original optimization problem, it still performs fairly well, as will be shown in the simulation results. The convergence behavior of the algorithm will be studied in Section 5.4.
5 Performance analysis and discussion
5.1 Asymptotic performance w.r.t. number of the sensors
Now we discuss the asymptotic performance of the MLEs w.r.t. the number of sensors N by studying the Fisher information as well as the Cramér-Rao lower bound (CRLB) of the estimators.
We first consider the MLE with unknown CSI, where the channel coefficients are i.i.d. random variables. In this case, given θ, the received signals from different sensors are i.i.d.; thus, the Fisher information, defined as , increases linearly with the number of sensors. Therefore, the CRLB, which is the reciprocal of the Fisher information, decreases at a rate of 1/N, which is the same as the BLUE lower bound of centralized estimation .
When CSI is available at the FC, the received signals are no longer identically distributed. In this case, the Fisher information depends on the channel realizations. In the sequel, we will show that the mathematical expectation of the Fisher information over h is never lower than that with unknown CSI, which means that the knowledge about the channels provides more information to improve the estimation quality.
Therefore, the asymptotic performance of the MLE with known CSI is superior to that of the MLE with unknown CSI, where the CRLB of the latter decreases at a rate of 1/N.
5.2 Computational complexity
5.2.1 MLE with known CSI
Since the parameter being estimated is a scalar, one-dimensional searching algorithms can be used to obtain the maximum of the log-likelihood function. However, because the log-likelihood function shown in (11) is non-concave and has multiple extrema, we need to find all its local maxima to get the global maximum.
An exhaustive search can be used to find the global maximum. To make the MSE introduced by the discrete search negligible, we let the search step size be less than Δ/N; thus, we need to compute the value of the likelihood function at least M × N times to obtain the MLE.
The FC applies (11), (12) and (13) to compute the values of the likelihood function for different θ. The exponential term in (12) is independent of θ; thus, it can be computed before the search and stored for later use.
Given θ, we still need to compute p(S_m|θ), m = 0, ..., M - 1, whose complexity is O(M), and then obtain each value of the likelihood function with M additions and M multiplications. Therefore, the computational complexity of obtaining one value of log p(Y|h, θ) is O(MN).
After considering the operations required by the exhaustive search, the overall complexity of the MLE is O(M^2 N^2).
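The exhaustive search can be illustrated with a generic one-dimensional grid search whose step is strictly below Δ/N. The helper name `grid_search_mle`, the choice of step Δ/(N + 1) and the two-hump toy objective are assumptions made for illustration; in practice the objective would be the log-likelihood function of the estimator.

```python
import numpy as np

def grid_search_mle(log_likelihood, V, delta, n_sensors):
    """Evaluate the objective on a grid over [-V, V] with spacing below delta / N."""
    step = delta / (n_sensors + 1)                  # assumed choice, strictly below delta / N
    n_points = int(np.ceil(2.0 * V / step)) + 1
    grid = np.linspace(-V, V, n_points)
    values = np.array([log_likelihood(t) for t in grid])
    return float(grid[np.argmax(values)]), n_points

def toy_objective(t):
    """Non-concave toy function with a local maximum near -0.5 and a global one near 0.3."""
    return np.exp(-(t + 0.5) ** 2 / 0.01) + 2.0 * np.exp(-(t - 0.3) ** 2 / 0.01)

theta_hat, n_points = grid_search_mle(toy_objective, V=1.0, delta=0.5, n_sensors=4)
```

A local ascent method started near -0.5 would stop at the smaller hump, which is why every grid point is evaluated when the objective has multiple extrema.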
5.2.2 MLE with unknown CSI
The difference between the MLEs with known and unknown CSI is that p(y_i|c_m) is used in the MLE with unknown CSI instead of p(y_i|h_i, c_m). Since p(y_i|c_m) can also be computed before the search, this difference has no impact on the complexity of the MLE with unknown CSI. The computational complexity of the MLE with unknown CSI is also O(M^2 N^2).
5.2.3 Suboptimal estimator
For each iteration of the suboptimal estimator, we need to get and its variance with (54) and (56) and then obtain the estimate of θ with (57). The complexity is similar to that of computing the log-likelihood function, which is O(MN). If the algorithm converges after I_t iterations, the complexity of the suboptimal estimator will be O(I_t MN).
5.3 Discussion about transmission codebook issues
As we have discussed, the transmission codebooks can represent various quantization, coding and modulation schemes as well as the training symbols. Here, we discuss the impact of the codebooks on the decentralized MLEs.
Comparing the conditional PDF with unknown CSI p(y i |x) shown in (28) with p(y i |h i , x) shown in (61), we see that both PDFs depend on the correlation between the received signals y i and the transmitted symbols c(x). With known CSI, the optimal estimator is a coherent algorithm, since (61) relies on the real part of the correlation, . With unknown CSI, the optimal estimator is a non-coherent algorithm, since (28) depends on the square norm of . Because , both MLEs depend on the cross-correlation of the transmit symbols cH(x i )c(x).
then p(y_i|x) will have two identical extrema since the MLE with unknown CSI only depends on . Such a phase ambiguity leads to severe performance degradation of the decentralized estimator. Therefore, the autocorrelation matrix of the codebook plays a critical role in the performance of the MLE, especially when CSI is unknown.
Many transmission schemes have this phase ambiguity problem, for example, when the natural binary code and BPSK are applied to represent each quantized observation and to modulate. For any c_m in such a transmission codebook, defined as C_tn, there exists c_{m′} in C_tn that satisfies c_{m′} = -c_m. Therefore, C_tn is not a proper codebook. Another example is AF, the messaging function of which is c(x) = Gx, where G is the amplification gain. The MLE with unknown CSI is unable to distinguish x from -x when using this messaging function.
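The sign ambiguity of a natural-binary BPSK codebook is easy to verify numerically. The sketch below builds such a codebook, checks that the negation of every codeword is again a codeword, and shows that prepending one fixed training symbol removes the ambiguity. The bit-to-symbol mapping (0 to -1, 1 to +1) and the function names are assumptions; for real-valued codebooks, checking the sign flip covers every possible phase rotation.

```python
import numpy as np

def natural_binary_bpsk(M):
    """Codebook of M codewords: natural binary index mapped to BPSK symbols."""
    K = M.bit_length() - 1                           # bits per level, M assumed a power of 2
    bits = (np.arange(M)[:, None] >> np.arange(K - 1, -1, -1)) & 1
    return 2.0 * bits - 1.0                          # assumed mapping: 0 -> -1, 1 -> +1

def has_phase_ambiguity(codebook):
    """True if some codeword's negation is also in the (real-valued) codebook."""
    rows = {tuple(r) for r in codebook}
    return any(tuple(-r) in rows for r in codebook)

c_tn = natural_binary_bpsk(16)                                  # M = 16, K = 4
c_with_training = np.hstack([np.ones((16, 1)), c_tn])           # one training symbol
ambiguous_plain = has_phase_ambiguity(c_tn)
ambiguous_training = has_phase_ambiguity(c_with_training)
```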
In order to handle the phase ambiguity problem inherent in the codebook C tn , we can simply insert training symbols into the transmit symbols. Though heuristic, this approach provides fairly good performance because the MLE exploits the training symbols to estimate the channel coefficients implicitly as we have shown. Moreover, since from the later simulations we see that the MLE without CSI and without training symbols does not perform well, we need to insert training symbols when we apply the decentralized estimator.
Since the MLEs are associated with the autocorrelation matrix of the transmission codebook, the performance of the estimators can be enhanced by systematically designing the codebook. Nonetheless, this is beyond the scope of this paper. Some preliminary results for optimizing the transmission codebooks are shown in .
5.4 Convergence of the suboptimal estimator
For an iterative algorithm θ^{(k+1)} = T(θ^{(k)}), we say that the algorithm is convergent if the distance between θ^{(k+1)} and a fixed point of T(θ) is smaller than the distance between θ^{(k)} and this fixed point, where the fixed points of T(θ) are the points that satisfy θ = T(θ). This means that after each iteration, the output of the algorithm is closer to a fixed point.
Define Φ as a fixed point of T(θ) in (ϕ_1, ϕ_2). The algorithm is convergent if |θ^{(k+1)} - Φ| < |θ^{(k)} - Φ| for all θ^{(k)} ∈ (ϕ_1, ϕ_2).
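The convergence criterion can be checked numerically for a given iteration map. The sketch below uses two hypothetical affine maps that share the fixed point 0.2, one contracting and one expanding; they stand in for T(θ) purely for illustration and are unrelated to the likelihood equation of the estimator.

```python
def iterate_and_check(T, theta0, fixed_point, n_iter=20):
    """Run theta <- T(theta) and report whether every step moved closer to the fixed point."""
    theta = theta0
    always_closer = True
    for _ in range(n_iter):
        nxt = T(theta)
        dist_now = abs(theta - fixed_point)
        if dist_now > 0 and not abs(nxt - fixed_point) < dist_now:
            always_closer = False
        theta = nxt
    return theta, always_closer

# Hypothetical maps with fixed point 0.2: slope 0.5 contracts, slope 2 expands.
theta_c, ok_c = iterate_and_check(lambda t: 0.5 * t + 0.1, theta0=1.0, fixed_point=0.2)
theta_d, ok_d = iterate_and_check(lambda t: 2.0 * t - 0.2, theta0=0.3, fixed_point=0.2)
```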
In the following, for mathematical tractability, we first study the convergence behavior of an iterative algorithm obtained directly from the likelihood equation (50), where T(θ) is defined as the right-hand side of equation (50). The iterative algorithm of the suboptimal estimator can be regarded as a modified version of this algorithm, which will be discussed afterward.
Since the iterative function shown in (63) is derived from the likelihood equation, all stationary points of the log-likelihood function are fixed points of T(θ). Denote Φ_n, n = 1, 2, ..., as the local maxima of the log-likelihood function, sorted in ascending order. Since the log-likelihood function is a continuous function of θ, there exists a minimum between two adjacent maxima. The minimum between Φ_n and Φ_{n+1} is defined as ϕ_n. We will show in the following that in each interval (ϕ_{n-1}, ϕ_n), the algorithm converges to Φ_n after ignoring the effect of the non-extremal stationary points of the log-likelihood function.
which satisfies (64). Therefore, the iterative algorithm is convergent.
Now we discuss the non-maximum stationary points of the log-likelihood function. Considering a minimum ϕ_n, for any θ ∈ (Φ_n, Φ_{n+1}), the sign of is the same as that of (θ - ϕ_n) on both sides of ϕ_n, which does not satisfy the sufficient and necessary condition shown in the Appendix. Therefore, the algorithm does not converge to ϕ_n unless θ^{(k)} exactly equals ϕ_n. Any disturbance will move θ^{(k+1)} away from this minimum point. As to any non-extremal stationary point , the sign of is the same as that of at one side of this point. A disturbance in the proper direction will also move θ^{(k+1)} away from this point.
When the communication SNR tends to infinity, that is, σ c → 0, there is only one p(y i |c m ), m = 0, ..., M - 1, that can be positive. All other p(y i |c m ) tend to 0. By substituting this into (65), we have . It is not hard to verify that in this case, |θ(k+1)- Φ m | = 0 for any θ(k). It means that the iterative algorithm converges to a local maximum of the log-likelihood function exactly after one iteration.
At practical communication SNR levels, , which affects the convergence speed of the algorithm.
which satisfies the condition (71).
When the communication SNR tends to infinity, all tend to -1 as discussed. The estimator shown in (57) degenerates into the algorithm shown in (63). It is also convergent to a local maximum of the log-likelihood function exactly after one iteration.
At practical communication SNR levels, we can see from (72) that is weighted by itself since w i (θ) depends on . A larger will make the weight w i (θ) smaller. Therefore, the value of the partial derivative in (73) is closer to -1 compared with the iterative algorithm defined with (63) given y i and , which increases the speed of convergence.
6 Simulation results
An M = 16 level uniform quantizer is considered, where each quantized value can be represented by a K = 4 bit binary sequence. We do not consider the binary quantizer, which only performs well at low observation SNR.
[Table: The summary of the codebooks considered. Recoverable entries: error control coding; number of training symbols, 2 or 5.]
When CSI is known at the FC, we evaluate the performance of the MLE with codebook C_tn. The simulation results are marked as "MLE CSI" in the legend. When CSI is unknown and the codebook is still C_tn, the legends for the MLE and the suboptimal estimator are "MLE NoCSI" and "Subopt NoCSI," respectively. When CSI is unknown and the codebook is C_tp, where 2 or 5 training symbols are inserted, the simulation results are marked as "MLE NoCSI TS2/5" and "Subopt NoCSI TS2/5." We also evaluate the performance of the MLE with a near-optimal codebook obtained in , which is marked as "MLE NoCSI OPT." As discussed in Section 3.3, the FC can use the training symbols to estimate the CSI and use the estimated CSI as the known CSI to estimate θ. We evaluate this estimator with the codebook C_tp, which is marked as "MLE EstCH TS2/5."
To demonstrate the performance gain of the proposed estimators, two traditional fusion-based estimators and AF transmission are simulated. In the fusion-based estimators, the FC first demodulates the transmitted data from each sensor, then reconstructs the observation of each sensor from the demodulated symbols following the quantization rule, and finally combines these estimated observations with the BLUE fusion rule to produce the estimate of θ. When ECCs are applied at the sensors, the receiver at the FC exploits their error detection ability to discard the data that cannot pass the error check. The fusion-based estimators using codebooks C_tn and C_tc are denoted as "Fusion-NoECC" and "Fusion-CRC" in the legends of the figures, respectively. For AF, the amplification gain G is designed to make the average transmission power of the sensors equal to that of the digital communication schemes. We also use the MLE at the FC to estimate θ, which is marked as "AF" in the legend.
The MSE of the Quasi-BLUE  is shown as the performance lower bound with legend "Q-BLUE Bound." This MSE is obtained in perfect communication scenarios with the same M-level quantizer as other estimators.
6.1 Convergence of the suboptimal estimator
6.2 MSE versus the communication SNR
When CSI is known at the FC, Figure 2a shows that the MLE outperforms the fusion-based estimators. The MSE of the MLE approaches the Quasi-BLUE lower bound rapidly as the communication SNR increases. As expected, the MLE with AF transmission, marked as AF, is inferior to the MLE with digital communication using 4-bit quantization, marked as MLE CSI. This supports the conclusions in [19–21], which show that AF is not optimal in fading channels.
According to the performance analysis for BPSK modulation in Rayleigh fading channels , the BER of the transmission scheme with codebook C_tn exceeds 0.15 when γ s < 3 dB. ECC can improve the transmission performance at high communication SNR, but it causes more errors at low SNR. For the transmission schemes using CRC, the BER is even worse because long codes reduce the transmission energy per symbol. At such a high BER, the fusion-based estimators do not perform well. Most of the demodulated data are dropped by the error check; thus, the fusion-based estimators do not have enough information to exploit, which leads to poor MSE performance.
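The quoted BER level can be checked against the standard closed-form average BER of coherent BPSK over flat Rayleigh fading, 0.5(1 - sqrt(γ/(1 + γ))), where γ is the average SNR per bit. The sketch below assumes that the 3 dB figure refers to the SNR per observation and that the energy is split equally over the K = 4 bits of a codeword; that interpretation is an assumption of this sketch, not a statement from the text.

```python
import math

def bpsk_rayleigh_ber(snr_db):
    """Average BER of coherent BPSK over flat Rayleigh fading at the given per-bit SNR."""
    g = 10.0 ** (snr_db / 10.0)
    return 0.5 * (1.0 - math.sqrt(g / (1.0 + g)))

# Assumption: 3 dB per observation shared by K = 4 bits gives about -3 dB per bit.
K = 4
ber_per_bit = bpsk_rayleigh_ber(3.0 - 10.0 * math.log10(K))
ber_0db = bpsk_rayleigh_ber(0.0)
```

Under this assumption the per-bit BER is above 0.15, which is consistent with the statement that hard-decision fusion has little reliable data to work with at low communication SNR.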
When CSI is unknown at the FC, the MSEs of the MLE with unknown CSI and with two different ways of using training symbols for channel estimation are shown in Figure 2b. One is the MLE obtained from the log-likelihood function in (42), and the other is the estimator obtained from (45), which uses the estimated channel coefficients as their true values. As expected, our MLE shown in (42) performs better, because it takes into account the uncertainty of the channel estimation.
Because of the phase ambiguity in the schemes with C tn and AF transmission, the simulated MSEs of the MLE and the suboptimal estimator using these two transmission schemes are very large and do not decrease when γ c increases. Therefore, they are not shown in the figures.
When we insert training symbols, the performance of the MLE with unknown CSI improves significantly, but it is still much worse than that of the MLE with known CSI at low communication SNR levels. Interestingly, using more training symbols does not improve the performance of the MLE as one might expect, because inserting training symbols reduces the energy available for the data symbols when the energy for transmitting an observation is fixed. Our simulations show that the best performance is obtained when L p = 2. This is consistent with the observation of , where the optimal L p equals to .
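The energy trade-off behind this result can be made concrete with a small calculation. The sketch below assumes that the fixed energy per observation is split equally over all transmitted symbols, and it uses four data symbols as an example (4-bit quantization with one bit per BPSK symbol and no coding). Both choices are assumptions made for illustration, and the function name is hypothetical.

```python
import math


def data_symbol_energy_loss_db(n_data, n_train):
    """Per-symbol energy loss (in dB) caused by adding training symbols.

    Assumes a fixed total energy per observation that is divided
    equally among n_data data symbols and n_train training symbols.
    """
    if n_data <= 0 or n_train < 0:
        raise ValueError("need n_data > 0 and n_train >= 0")
    # Without training each symbol gets E / n_data; with training it
    # gets E / (n_data + n_train). The ratio is independent of E.
    return 10.0 * math.log10((n_data + n_train) / n_data)


if __name__ == "__main__":
    for n_train in (0, 1, 2, 4, 8):
        loss = data_symbol_energy_loss_db(4, n_train)
        print(f"L_p = {n_train}: per-symbol energy loss = {loss:.2f} dB")
```

Under these assumptions, two training symbols cost about 1.76 dB of per-symbol energy, and the cost keeps growing with every additional training symbol while the benefit of a better channel estimate saturates.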
As discussed, inserting training symbols is a heuristic way to improve the performance. The figure shows that a codebook designed using an optimization method outperforms all the codebooks with training symbols.
6.3 MSE versus the number of sensors
In this paper, we studied decentralized estimation of a deterministic parameter using digital communications over orthogonal multiple-access fading channels with a multiple-bit quantizer. By introducing a general messaging function, the proposed estimators can be applied to various quantization, coding and modulation schemes, including AF transmission and binary quantization, with or without training symbols.
We derived the MLEs with both known and unknown CSI. The MLE with known CSI can serve as a practical performance lower bound of existing decentralized estimators. It is shown that the MLE with multi-level quantization outperforms the MLE with AF as well as the fusion-based estimators.
The MLE with unknown CSI is more realistic. Without training symbols, it does not perform well due to the phase ambiguity. When training symbols are inserted before the data symbols, it estimates the channel coefficients implicitly and exploits the channel estimates in an optimal way. Under the energy constraint, only a few training symbols are beneficial for channel estimation, while more training symbols lead to performance degradation. To design an estimator with affordable complexity, we developed a suboptimal estimator that converges rapidly. Its performance is similar to that of the MLE at high SNRs, with only a minor loss at low SNRs.
which shows that the algorithm is convergent. When θ(k) - Φ < 0, substituting (77) into (78), we also obtain the inequality shown in (79). Therefore, (76) and (77) are sufficient conditions for convergence.
After the simplifications, we can obtain (76) from (81).
Similarly, when θ(k) - Φ < 0, (77) can be obtained following the same procedure. Therefore, (76) and (77) are necessary conditions. □
Corollary: A sufficient condition for the algorithm to converge to Φ is f(θ)(θ - Φ) < 0, ∀θ ≠ Φ, and f′(θ) > -2.
Therefore, the first inequality in (76) and the second inequality in (77) are satisfied. From the condition f(θ)(θ - Φ) < 0, it is not hard to see that the second inequality in (76) and the first inequality in (77) are also satisfied. Thus, the iterative algorithm is convergent by the Proposition. □
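The Corollary can be illustrated numerically with a toy example. The update rule is not restated in this appendix, so the sketch below assumes an iteration of the form θ(k+1) = θ(k) + f(θ(k)). The functions `f_ok` and `f_bad` and the value of Φ are hypothetical choices made only for this illustration and are unrelated to the estimator's actual f(θ).

```python
import math

PHI = 1.5  # hypothetical fixed point used only in this example


def iterate(f, theta0, max_iter=200, tol=1e-10):
    """Run theta <- theta + f(theta) (assumed update form).

    Stops when the step size drops below tol or after max_iter steps.
    Returns the final iterate and the number of steps taken.
    """
    theta = theta0
    for k in range(max_iter):
        step = f(theta)
        theta += step
        if abs(step) < tol:
            return theta, k + 1
    return theta, max_iter


def f_ok(theta):
    # Satisfies both conditions of the Corollary:
    # f(theta) * (theta - PHI) < 0 for theta != PHI, and f'(theta) >= -1 > -2.
    return -math.tanh(theta - PHI)


def f_bad(theta):
    # Keeps the sign condition but has f'(theta) = -2.5 < -2, so each
    # step overshoots and the error grows by a factor of 1.5.
    return -2.5 * (theta - PHI)


if __name__ == "__main__":
    print(iterate(f_ok, theta0=5.0))
    print(iterate(f_bad, theta0=5.0, max_iter=20))
```

In the second case the sign condition alone is not enough: the iterate oscillates around Φ with increasing amplitude, which is the behavior that the bound f′(θ) > -2 rules out.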
This work was supported by the National Natural Science Foundation of China under Grant 60672103. Parts of this work were presented at IEEE Globecom'07, Washington, DC, United States, Nov. 2007.
- Akyildiz IF, Su W, Sankarasubramaniam Y, Cayirci E: Wireless sensor networks: a survey. Comput Netw 2002,38(4):393-422. 10.1016/S1389-1286(01)00302-4
- Xiao J-J, Ribeiro A, Luo Z-Q, Giannakis GB: Distributed compression-estimation using wireless sensor networks. IEEE Signal Process Mag 2006,23(7):27-41.
- Li XR, Zhu Y, Wang J, Han C: Optimal linear estimation fusion--part I: unified fusion rules. IEEE Trans Inf Theory 2003,49(9):2192-2208. 10.1109/TIT.2003.815774
- Ribeiro A, Giannakis GB: Bandwidth-constrained distributed estimation for wireless sensor networks--part I: Gaussian case. IEEE Trans Signal Process 2006,54(3):1131-1143.
- Ribeiro A, Giannakis GB: Bandwidth-constrained distributed estimation for wireless sensor networks--part II: unknown probability density function. IEEE Trans Signal Process 2006,54(7):2784-2796.
- Luo Z-Q: An isotropic universal decentralized estimation scheme for a bandwidth constrained ad hoc sensor network. IEEE J Sel Areas Commun 2005,23(4):735-744.
- Li H, Fang J: Distributed adaptive quantization and estimation for wireless sensor networks. IEEE Signal Process Lett 2007,14(10):669-672.
- Fang J, Li H: Distributed adaptive quantization for wireless sensor networks: from delta modulation to maximum likelihood. IEEE Trans Signal Process 2008,56(10):5246-5257.
- Aysal T, Barner K: Constrained decentralized estimation over noisy channels for sensor networks. IEEE Trans Signal Process 2008,56(4):1398-1410.
- Lam WM, Reibman AR: Design of quantizers for decentralized estimation systems. IEEE Trans Commun 1993,41(11):1602-1605. 10.1109/26.241739
- Papadopoulos HC, Wornell GW, Oppenheim AV: Sequential signal encoding from noisy measurements using quantizers with dynamic bias control. IEEE Trans Inf Theory 2001,47(3):978-1002. 10.1109/18.915654
- Xiao J-J, Luo Z-Q: Decentralized estimation in an inhomogeneous sensing environment. IEEE Trans Inf Theory 2005,51(10):3564-3575. 10.1109/TIT.2005.855580
- Li J, AlRegib G: Distributed estimation in energy-constrained wireless sensor networks. IEEE Trans Signal Process 2009,57(10):3746-3758.
- Xiao J-J, Cui S, Luo Z-Q, Goldsmith AJ: Power scheduling of universal decentralized estimation in sensor networks. IEEE Trans Signal Process 2006,54(2):413-422.
- Gastpar M: To code or not to code, PhD Dissertation, Ecole Polytechnique Fédérale de Lausanne, EPFL. 2002.
- Gastpar M, Vetterli M: Source-Channel Communication in Sensor Networks. Lecture Notes in Computer Science. 2003, 2634: 162-177.
- Gastpar M: Uncoded transmission is exactly optimal for a simple Gaussian "sensor" network. 2007 Information Theory and Applications Workshop 2007, 5247-5251.
- Cui S, Xiao J-J, Goldsmith AJ, Luo Z-Q, Poor HV: Energy-efficient joint estimation in sensor networks: Analog versus digital. IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP' 05 2005, IV: 745-748.
- Xiao J-J, Luo Z-Q: Multiterminal source-channel communication over an orthogonal multiple access channel. IEEE Trans Inf Theory 2007,53(9):3255-3264.
- Cui S, Xiao J-J, Goldsmith AJ, Luo Z-Q, Poor HV: Estimation diversity and energy efficiency in distributed sensing. IEEE Trans Signal Process 2007,55(9):4683-4695.
- Bai K, Senol H, Tepedelenlioğlu C: Outage scaling laws and diversity for distributed estimation over parallel fading channels. IEEE Trans Signal Process 2009,57(8):3182-3192.
- Ahmadi HR, Vosoughi A: Impact of channel estimation error on decentralized detection in bandwidth constrained wireless sensor networks. IEEE Military Communications Conference, MILCOM' 08 2008, 1-7.
- Senol H, Tepedelenlioğlu C: Performance of distributed estimation over unknown parallel fading channels. IEEE Trans Signal Process 2008,56(12):6057-6068.
- Dempster AP, Laird NM, Rubin DB: Maximum likelihood from incomplete data via the EM algorithm. J R Stat Soc Ser B (Methodological) 1977,39(1):1-38.
- Max J: Quantizing for minimum distortion. IRE Trans Inf Theory 1960,6(1):7-12. 10.1109/TIT.1960.1057548
- Lloyd SP: Least squares quantization in PCM. IEEE Trans Inf Theory 1982,28(2):129-137. 10.1109/TIT.1982.1056489
- [Online]. Available: http://en.wikipedia.org/wiki/Woodbury_matrix_identity
- Boyd S, Vandenberghe L: Convex Optimization. Cambridge University Press, Cambridge; 2004.
- Kay SM: Fundamentals of Statistical Signal Processing, vol. I: Estimation Theory. Prentice Hall PTR, New Jersey; 1993.
- [Online]. Available: http://en.wikipedia.org/wiki/Mean_value_theorem
- Wang X, Yang C: Optimal transmission codebook design in fading channels for decentralized estimation in wireless sensor networks. IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP' 09 2009, 2293-2296.
- Proakis JG: Digital Communications. 4th edition. The McGraw-Hill Companies, Inc., New York; 2001.
- Wang M, Yang C: Distributed estimation in wireless sensor networks with imperfect channel estimation. 9th International Conference on Signal Processing, ICSP' 08 2008, 3: 2649-2652.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.