A practical approach for outdoors distributed target localization in wireless sensor networks

Wireless sensor networks are posed as the new communication paradigm where the use of small, low-complexity, and low-power devices is preferred over costly centralized systems. The spectra of potential applications of sensor networks is very wide, ranging from monitoring, surveillance, and localization, among others. Localization is a key application in sensor networks and the use of simple, efficient, and distributed algorithms is of paramount practical importance. Combining convex optimization tools with consensus algorithms we propose a distributed localization algorithm for scenarios where received signal strength indicator readings are used. We approach the localization problem by formulating an alternative problem that uses distance estimates locally computed at each node. The formulated problem is solved by a relaxed version using semidefinite relaxation technique. Conditions under which the relaxed problem yields to the same solution as the original problem are given and a distributed consensus-based implementation of the algorithm is proposed based on an augmented Lagrangian approach and primal-dual decomposition methods. Although suboptimal, the proposed approach is very suitable for its implementation in real sensor networks, i.e., it is scalable, robust against node failures and requires only local communication among neighboring nodes. Simulation results show that running an additional local search around the found solution can yield performance close to the maximum likelihood estimate.


Introduction
The deployment of a large number of scattered sensors in a certain area constitutes a very powerful tool for sensing and retrieving information from the environment (e.g., temperature, humidity, motion). The main features of wireless sensor networks (WSN) are that of a large number of low-cost nodes with limited computational and power resources. WSNs must also be scalable and robust against changes in topology (i.e., node failure or addition of new nodes), as well as energy efficient. These are the major design issues in WSNs that make the development of simple and efficient algorithms a challenging problem. These limitations also make centralized approaches not very suitable for being used in WSNs. Localization is a key task (often mandatory) in many applications [1] and therefore, distributed localization algorithms are of high practical importance.
There exist different measurement sources that can be fused in order to get an estimate of the target's position [1,2], like time-of-arrival (TOA), time-difference of arrival (TDOA), angle of arrival (AOA) or received signal strength indicator (RSSI). In this article, we focus on single antenna nodes without tight synchronization abilities, which leads us to the use of RSSI measurements for the localization task. One of the main challenges when using RSSI measurements is that the mapping between the measurement and target's position is nonlinear and hence, finding a suitable solution becomes more challenging. Some approaches to deal with nonlinearities are based on particle filtering principles [3]. In the context of WSN particle filtering approaches have also been proposed for localization and tracking using RSSI measurements [4][5][6][7][8]. In general, particle filtering approaches have shown very good performance when dealing with RSSI measurements but they are centralized and suffer from a high computational cost and hence, their applicability in a real scenario is questionable.
A recent approach based on convex optimization concepts has been proposed in [9,10] for the node localization problem. In [9] a semidefinite relaxation approach is used to cast the localization problem into a semidefinite program (SDP) that can be solved efficiently via interior point methods, see [11] and references therein. The obtained position estimate through SDP is then further refined via an iterative algorithm. Although the proposed method provide near optimal results (i.e., close to the Cramer-Rao bound) they are centralized so their application to WSN may be limited. The problem of source localization using energy measurements has also been treated in [12], where a distributed algorithm based on projections onto convex sets is presented. The algorithm is shown to asymptotically approach the maximum likelihood (ML) estimate as the number of nodes increases when the target lies in the convex hull defined by the node's coordinates. In [13], an alternative approach is presented that can handle variations in the path-loss exponent. However, in both approaches no restrictions are imposed in the communication among nodes. In real applications this will cause rapid battery depletion if far away nodes are to communicate. Further, in both approaches the estimation is performed only by a subset of nodes that are selected according to their received signal to noise ratio. The main drawback is that such subset must be known to every node in the network. In a real scenario, the required signaling and routing overhead necessary for node coordination may limit their application.
In this contribution, we propose a distributed algorithm for localization in WSN's by fusing RSSI measurements. We approach the ML estimation problem by solving a simplified and more tractable problem which allows the use of convex optimization tools for its distributed solution. More precisely, we use an augmented Lagrangian approach with a primal-dual decomposition [14,15]. The developed approach offers an advantage over centralized approaches as it is scalable, robust against changes in network's topology and requires only local communication among neighboring nodes. These are key properties very desirable in the context of WSN's.
The article is organized as follows: Section 2 introduces the localization problem and the underlying propagation model. In Section 3, we present the localization approach based on RSSI readings and its distributed implementation is presented in Section 4. Simulations are provided in Section 6, while some comments and concluding remarks are given in Section 7.

Notation
Bold lower-and upper-case letters denote vectors and matrices, respectively. For vector quantities the operator || · || denotes Euclidean norm while for matrices it refers to the Frobenius norm. The symbol 0 denotes a matrix of appropriate dimensions whose entries are all zeros. The symbol I is used to denote the identity matrix of appropriate dimensions. The optimal value of a variable x in an optimization problem is denoted by x*. The symbol ℝ n is used to denote the set of real ndimensional vectors while S n + is used to denote the set of symmetric, n × n positive semi-definite matrices.

Problem formulation and definitions
Consider a wireless sensor network, as the one depicted in Figure 1, consisting of M nodes randomly deployed on a certain area (in the same x-y plane). Nodes are static and able to communicate with adjacent nodes that lie within a given range for communications. Nodes are aware of their own location but not aware of the location of any other element in the network. Assume the presence of a target node that emits beacon frames that can be heard by all nodes in the network. The goal is to determine the location of the target node in the x-y plane.
For getting estimates of the target position, nodes employ RSSI measurements. The use of RSSI readings is of practical convenience when working with real hardware as they do not need tight synchronization requirements. We assume that the RSSI follows a linear relationship with the received power (we assume they are equal). Let denote r m as the received power at node m. A common assumption, see [2] and references therein, is that the received power follows a lognormal distribution with a distance-dependent mean as where p m is the received power (in dB) at reference distance d 0 , a m is the path-loss exponent, d m is the true distance between the target and the mth node and n m ∼ N (0, σ 2 m ) is a Gaussian random variable of zero mean and variance σ 2 m . The received power r m will be used to get an estimate of the true target position.

Localization strategies
In this section, we present different localization strategies using the received power (in dB) at the nodes. We first consider the (centralized) ML estimate and then we propose to use a suboptimal strategy based on local distance estimate at each node. We show that the proposed localization strategy can be implemented in a fully distributed way by only local communication among neighboring nodes.

ML estimation
Consider the presence of a centralized unit that gathers all measurements coming from the nodes. Let r = [r 1 , . . . , r M ] T be a vector whose components are the different measurements taken by each sensor and denote x = [x t y t ] T ℝ 2 as the target's position. The true distance between the target and the mth sensor can be then expressed as where c m = [x m y m ] ⊤ ℝ 2 are the coordinates of node m with m = 1, . . . , M. The vector of measurements r can now be written as where the vector n ∼ N (0, ) is jointly Gaussian with zero mean and covariance ∑. It is easy to see that r will follow a Gaussian distribution with mean μ(x) and covariance ∑, that is where p(r ; x) is the probability density function of r with parameter x. The ML estimate of the target position is then which is equivalent to maximizing the log of p(r ; x). Neglecting all terms that do not depend on x it is easy to see that The objective to be minimized in (6) is not convex and therefore, finding the global optimum is not an easy task. In Figure 2 (left), we have an illustration of how the objective in (6) looks like for a network of 20 nodes over a normalized square area. It is clear that the function is not convex and that several local minima and saddle point may exist. Instead of dealing directly with the ML estimate we propose to use a suboptimal approach that offers a reasonable good performance and that allows for its distributed computation.

Practical approach
In this section, we propose to estimate the target's position based on local distance estimates computed at each node. The use of local distance estimates allows the derivation of simple and distributed estimators of the target's position. We have that, for the propagation model (1), the ML estimate of the distance between the mth node and the target is given bŷ Taking the square at both sides of (2) and further developing, it is easy to see that the following equation must be satisfied x coordinate y coordinate Rearranging terms we can express (8) in a more compact form as ⎡ where 1 is a M × 1 vector of all ones. However, we do not have the actual distances to the target but a noisy version of them as per (7). Define the vector We can then get an estimate of the target's position by minimizing the norm of (10). In order to incorporate robustness and make the localization task more applicable to realistic scenarios we propose to use a weighted version of the cost function (10). In a WSN it may happen that some of the nodes exhibit a bias in their measurements due to the presence of obstacles. Additionally, nodes do not have precise information about their own locations instead, some errors may be present. The incorporation of weights will mitigate the effects of biased nodes and uncertainties in nodes' positions. So we compute an estimatex of the true target's position x as the solution to the following non-linear (weighted) least-squares problem where D = diag (g 1 , . . . , g M ) is a diagonal weighting matrix with g m ≥ 0 for all m = 1, . . . , M. A proper choice for the weights would be inversely proportional to the variance of the measurements. As we are assuming the log-normal model for the measurements it is well known that the variance of the ML estimate (7) is proportional to the square of the true distance to be estimated [2,16]. With this consideration in mind we may choose to weight our measurements inversely proportional to the measured distance, that is γ m = 1/d m .
This problem has been studied in [17,18] where a distributed version of the Gauss-Newton method can be used for its solution. In this study, we present a more flexible approach that uses concepts from convex optimization theory. The proposed approach has better convergence properties and also makes it straightforward to include additional constraints to the problem that may prevent it from instabilities. In order to proceed let's write problem (11) as the following equivalent problemx Note that, although (11) and (12) are equivalent problems (i.e., with the same solution), they are different because in the latter case we are minimizing the squared norm of Df(x). The minimization of the squared norm is motivated by the fact that it allows a simple distributed implementation as it can be guessed from the structure of (12). The use of the objective in (12) is well motivated by the fact that we get a smoother surface at the cost of introducing some bias with respect to the ML solution (see Figure 2 (right)). However, if the bias is small, we may still get to the ML estimate by performing a local search around the solution of (12). However, in order to use convex optimization methods we need problem (12) to be convex. Unfortunately, the objective function is not convex because we are adding the squares of quadratic convex but not necessarily positive functions [11]. It would be interesting to exploit some hidden convexity of the problem so that convex optimization methods can be applied.
A possible approach to make the problem convex is to use semidefinite relaxation technique. Let X = xx T and note that Tr (X) = ||x|| 2 , where Tr(·) is the trace operator. We can rewrite the problem as We now have that the objective is convex as it is the composition of an affine function of X and x with a convex function [11]. However, the above problem is still non-convex due to the non-linear constraint X = xx T . We can then relax the equality constraint by replacing it with a semidefinite constraint. As a result we end up with the following (convex) SDP where S 2 + is the set of 2 × 2 symmetric positive semidefinite matrices. As we are allowing for a larger feasible set, the optimal value of problem (14) would provide a lower bound on the optimal value of the original problem (12). However, if the optimal solution X* of (14) is of rank-one, we have that the semidefinite relaxation approach is not a relaxation at all and the found solution x* of (14) is also optimal for (12).
It would be interesting to give conditions under which (14) provides a rank-one solution so that the obtained solution is optimal for the original problem, too. To that end let define the matrix A as and let the vector δ be given by We then define the following feasibility problem with variables z ℝ 2 and t ℝ + . The above problem is convex since it belongs to the class of second-order cone program (SOCP) [11]. Based on the feasibility problem (17) we can state the following result: Proposition 1. Assume problem (12) has at least one strictly feasible point. If problem (17) is not feasible, then the optimal solution x* of the semidefinite relaxed problem (14) is also optimal for the original problem (12).
Proof. See the Appendix. □ Corollary 2. If matrix A is singular then, the solution X* of (14) is of rank one with X* = x*x* T and x* is also optimal for (12).
Proof. It follows directly from Proposition 1. If A is singular then, problem (17) is infeasible (because matrix A is not invertible) so that the relaxed problem is not a relaxation at all. □ It is worth to mention that the feasibility problem (17) can be easily checked without the requirement of an optimization solver. If matrix A is singular then, the problem is infeasible and we are done. If however, matrix A is full-rank, we compute A -1 δ (which is unique) and check whether it satisfies the second order cone (SOC) constraint ||z|| 2 ≤ t. If the constraint is not met then we conclude that the problem is infeasible.

Distributed algorithm
One of the main advantages that offers the considered approach is that it allows for a distributed implementation. We assume that nodes communicate with their one-hop neighbors as dictated by the communication graph G(V, E), where V is the set of vertices and E is the set of edges of the graph. By only local communication exchange, nodes can agree to compute some desired (global) quantity using consensus algorithms [19].
Let's proceed by reformulating problem (14) into a SDP with a single matrix variable Z ∈ S 3 + . For that purpose, let us formulate the following problem where M m = I − 2 0 0 c T m 0 and I is the identity matrix.
With the above problem definition we have the following equivalence: Lemma 1. The two problems (14) and (18) are equivalent. Further, if we denote Z* as the optimal solution to (18), then the optimal solution x* of (14) is given by x* = [Z*(3, 1), Z*(3, 2)] T .
Proof. See the Appendix. □ Now that we have established the equivalence between problems (14) and (18) through Lemma 1 we show how to solve it in a distributed way. For that purpose we use the optimization framework for consensus-networked systems as proposed in [15] that uses an augmented Lagrangian approach [14]. The framework in [15] generalizes the previous work of [20] as it can also handle convex but not necessarily strictly convex objective functions. The augmented Lagrangian method adds a quadratic penalty term to the objective function that is zero at the optimal solution. The resulting problem is then equivalent to the original problem as both of them end up with the same solution. Augmented Lagrangian methods are also attractive because they offer better convergence properties than standard primal-dual decomposition methods. A detailed treatment about augmented Lagrangian methods and their properties can be found in [14].
In order to derive a distributed solution consider first the introduction of M new variables and a global consensus constraint into the problem as The problem is now separable in the objective function (as it is the sum of M terms, each one dependent of one node) but we still have the coupling "consensus" constraint Z m = Z. However, we do not need to impose that all nodes agree on the same quantity instead, we can only force nodes to agree with their one-hop neighbors. Let N m be the set of neighbors of node m, we can then reformulate the problem as The two problems (19) and (20) are equivalent provided that the underlying graph is strongly connected [20]. We can now use the developed framework in [15] to derive an augmented Lagrangian method for the distributed solution of (20). Consider then, the introduction of the additional variables W m,j ∈ S 3 + and formulate the equivalent problem The penalized problem [14] can be written as where c >0 is a constant that controls the penalization of the disagreement among neighbors. In general, c could be a sequence as long as it is non-decreasing. The choice of c has a direct impact on the rate of convergence of the distributed algorithm [14]. There is no general rule to choose c and its value will vary depending on the problem at hand. It becomes clear from the formulation of problem (22) that the penalty term is zero at the optimum and, therefore the optimal solution to (22) is also optimal for (20). We can now find a solution of (22) by solving its dual problem. By relaxing the consensus constraint we form the partial augmented Lagrangian L c as where Γ m, j and F m, j are the Lagrange multipliers. We then have that the dual problem is given by The problem is now separable and strictly convex which allows its solution by alternating the minimization over Z m and W m, j as and then performing an update of the Lagrange multipliers using a subgradient step where the superscript (k) denotes the kth iteration. Following the same steps as in [15], it can be shown that Let Δ m, j = Γ m, j = -F m, j and define Ψ m = Δ m, j -Δ j, m . Based on the above results it can be easily shown that the solution to problem (24) is obtained in a distributed way by alternating between the following two updates with (0) m = 0 and m = 1, . . . , M. The network will then operate as follows: At the beginning of the kth iteration, each node locally solves (31). Then, nodes broadcast the computed estimates Z (k+1) m to their neighbors. With the local estimates of the corresponding neighbors at hand, each node updates its multipliers as in (32). The process is repeated until all nodes converge to the same solution which, in turn would be the same as in the centralized case.

Approaching the ML estimate
So far, we have shown how to solve (11) by formulating the relaxed problem (14). We have also provided conditions under which the solutions to (11) and (14) coincide. Further, the solution can be computed in a distributed fashion using convex optimization tools. However, the performance of the followed approach in (11) is below that of the ML estimate (6). In order to come closer to the ML solution we could perform an additional local search that improves the obtained estimate through the solution of (14). The idea is to run a distributed optimization routine, taking the solution of (14) as the starting point, to solve for (6). If the previously computed estimate by solving (14) is close to the ML estimate we may converge to it by optimizing in the neighborhood of the solution of (14), otherwise we will converge to a local optima but still improving performance.
Observe that the ML estimate (6) can be cast into a non-linear least-squares problem of the form of (11). Assuming that ∑ is positive definite, we can write the ML estimation problem (6) as the following unconstrained optimization problem where f ML (x) = S (r -μ(x)) and S is the Cholesky factorization of the inverse covariance matrix, i.e., S T S = ∑ -1 . A local minimum of the above non-linear leastsquares problem (33) can be found using an iterative descent algorithm like the Gauss-Newton method [11,21]. The standard (centralized) Gauss-Newton procedure is given in Algorithm 1, where h gn represents the descent direction (i.e., direction that reduces the value of the cost function) and J (k) = J(x (k) ) with J(x) ℝ M×2 representing the Jacobian matrix of f ML (x) = [f ML 1 (x), . . . , f ML M (x)] T whose entries are given by Algorithm 1 Gauss-Newton method 1: if ∄ h gn then 5: found = true 6: end if 7: x (k+1) ¬ x (k) + h gn {Update} 8: k ¬ k + 1 9: end while If the covariance matrix ∑ has no special structure, then the problem (33) requires a central entity that gathers all the information coming from the nodes in order to solve it. However, it is reasonable to assume independence of the noise processes among the nodes so that ∑ has a diagonal structure, say = diag(σ 2 1 , ..., σ 2 M ) . In that case, matrix S = diag(1/s 1 , . . . , 1/s M ) is also diagonal and we can exploit the problem structure in order to find a distributed implementation of the Gauss-Newton procedure given in Algorithm 1. Note that for finding a distributed implementation of Algorithm 1, it suffices to find a way to compute the descent search direction h gn is a distributed fashion. For such purpose, let first note that J(x) has a block-wise structure given by Based on the block-wise structure of matrix J(x) it is easy to note that and therefore, the above quantities can be computed in a distributed fashion by means of average consensus [19]. Once we have computed the products (36) and (37), it is straightforward to compute the descent search direction h gn .
Based on these observations we propose here a fully distributed algorithm, shown as Algorithm 2, which asymptotically approaches the same result as in the centralized case using only local information and the exchange of low-volume intermediate results within each node's 1-hop neighborhood. We immediately note that the Steps 3-6 and 11-12 can all be performed locally by each node. The only communication occurs in the Steps 8 and 9 via standard average consensus Algorithms [19]. Seeing how (k) n ∈ R 2×2 and is symmetric, and γ (k) n ∈ R 2 , we conclude that each consensus round requires a broadcast of only five real values.
Algorithm 2 Distributed Gauss-Newton localization 1:x (0) ¬ same initial value ∀m M 2: for k = 0 to K -1 do In this section, we provide several numerical examples in order to evaluate the performance of the proposed approach. For the simulations we consider a network of randomly deployed nodes over an area of 100 × 100 squared meters. We have used the same propagation model for all the nodes with reference power p m = -40 dB at reference distance d 0 = 1 m and path-loss exponent a m = 2 for m = 1, . . . , M. We further assume that the noise processes are independent and identically distributed with n m ∼ N (0, σ 2 dB ) for all m. In Figure 3, we have simulated the distributed localization task using a network of 50 nodes (see Figure 1) with a randomly located target. We have performed distributed estimation of the target's position by the alternating between (31) and (32) with penalty parameter c = 0.05. We have plotted in Figure 3 the error between each node's local estimate and the centralized solution of the problem. As it can be appreciated, the distributed algorithm converges to the optimal centralized solution as the number of iterations increases.
We have also simulated the localization task over the same network of Figure 1. For the propagation model the measurement noise variance has been set to σ 2 dB = 9. We have evaluated the performance of the proposed approach with and without weights (labeled SDP and wSDP, respectively) over 1000 random target locations (test points). We compare our approach with Multilateration localization approach [2]. As shown in [2], if one node is chosen as a reference, then the localization problem can be linearized. Therefore, it allows its distributed implementation by means of consensus since it can be cast into a least-squares problem where each node locally contributes to the global cost function. Note, however that nodes must first agree on a common reference. For the simulations, we have chosen the node with the closest distance estimate to be the reference. We also provide the performance of the ML estimate for comparison purposes. The empirical cumulative distribution function (CDF) of the localization error is represented in Figure 4. As it can be observed in Figure  4, the performance of the proposed scheme outperforms that of Multilateration. Further, we observe that the use of weights improves the localization accuracy of the algorithm so that we come closer to the ML estimate.
The performance of the algorithm has also been tested for different values of the measurement noise variance. The results are displayed in Figure 5 where the average error over 1000 locations is depicted as a function of the measurement noise standard deviation. Again, we can appreciate the performance improvement of the proposed approach compared to multilateration as can be seen in Figure 5. We have also displayed the results  when combining the proposed distributed localization approach with a local search (wSDP+local). The local search is performed in a distributed fashion following the steps in Algorithm 2. As it can be observed the results of such combination provide close to ML performance. This implies that our method is capable of providing good estimates that could be used to run a local solver in order to come close to the ML estimate.
Although not guaranteed to converge to the ML solution, the local search can only provide better estimates (in the ML sense).

Conclusions
We have presented a distributed localization approach over sensor networks using consensus and convex optimization. An alternative problem to the ML position estimation problem has been proposed based on local ML distance estimates at each node. In order to circumvent the non-convexity of the problem, semidefinite relaxation technique has been employed and conditions that guarantee zero gap between the relaxed and the original problem have been given. A distributed algorithm based on an augmented Lagrangian approach using primal-dual decompositions have been proposed and it has been shown to converge to the centralized solution. The approach is suitable for its real implementation in WSN as it is scalable, robust against changes in topology and energy efficient by the use of only local broadcast-type communication among nodes. Another interesting property of the proposed algorithm is that it allows the introduction of additional convex constraints to the localization problem in a straightforward manner.
The proposed algorithm is intended to be usable in real networks and its suitability in terms of accuracy would be determined by the application at hand. However, if higher accuracy is required, we could run an additional optimization step around the found solution. We have verified by means of simulations that the combination of our suboptimal method with a local search provides a localization error close to the ML estimate.
It is worth to mention that the proposed approach has a direct application to distributed tracking in WSN's as well. The tracking procedure would be based on the jointly estimated target's position. As all nodes share the same estimate, they could use that estimate to locally run a tracking filter in order to follow the movement of the target.

Appendix 1: Proof of Proposition 1
To show the validity of Proposition 1, consider first the Lagrangian of (14) which is given by with Ψ ≽ 0 being the Lagrange multipliers. Since problem (14) is convex and there exists, by assumption, at least a strictly feasible point, Slater's constraint qualifications are satisfied and therefore, strong duality holds. Moreover, from duality theory we have that, at the optimum, the derivative of the Lagrangian with respect to X and x must be zero, that is implies that the off-diagonal elements of X must equal those of xx T . However, this does not necessarily mean that X is of rank one. From the complementary slackness condition (41) we have that X will equal xx T whenever the constraint is active (i.e., Ψ ≠ 0). So by finding under which conditions Ψ ≠ 0 we will find the conditions that guarantee that the solution of (14) coincides with the solution of the original problem (12). For that purpose, if we set Ψ = 0 we have from (39) and (40)