 Research
 Open Access
Nonparametric generalized belief propagation based on pseudo-junction tree for cooperative localization in wireless networks
EURASIP Journal on Advances in Signal Processing volume 2013, Article number: 16 (2013)
Abstract
Nonparametric belief propagation (NBP) is a well-known message passing method for cooperative localization in wireless networks. However, due to the overcounting problem in networks with loops, NBP’s convergence is not guaranteed, and its estimates are typically less accurate. One solution to this problem is nonparametric generalized belief propagation based on the junction tree. However, this method is intractable in large-scale networks due to the high complexity of the junction tree formation and the high dimensionality of the particles. Therefore, in this article, we propose nonparametric generalized belief propagation based on a pseudo-junction tree (NGBP-PJT). The main difference compared with the standard method is the formation of the pseudo-junction tree, which represents an approximated junction tree based on a thin graph. In addition, in order to decrease the number of high-dimensional particles, we use a more informative importance density function, and reduce the dimensionality of the messages. As a by-product, we also propose NBP based on a thin graph (NBP-TG), a cheaper variant of NBP, which runs on the same graph as NGBP-PJT. According to our simulation and experimental results, the NGBP-PJT method outperforms NBP and NBP-TG in terms of accuracy, computational, and communication cost in reasonably sized networks.
Introduction
Obtaining a location estimate of each node’s position in a wireless network, as well as accurately representing the uncertainty of that estimate, is a critical step for a number of applications, including sensor networks [1], cellular networks [2], and robotics [3]. We consider the case in which a small number of anchor nodes obtain their coordinates via the Global Positioning System or by being installed at points with known coordinates, and the rest, the unknown nodes, must determine their own coordinates. Since we want to use energy-conserving devices, which lack the energy necessary for long-range communication, we suppose that all unknown nodes obtain noisy distance measurements of a nearby subset of the other nodes (not necessarily anchors) in the network. Typical measurement techniques [1, 4, 5] are time of arrival (TOA), time difference of arrival, received signal strength (RSS), and angle of arrival. This localization technique is well known as cooperative (or multi-hop) localization.
Most of the state-of-the-art methods for cooperative localization compute a point estimate of the sensor positions by applying classical least squares, multidimensional scaling, multilateration, or other optimization methods. These methods, also known as deterministic methods [1, 6–8], lack a statistical interpretation, and as one consequence do not provide an estimate of the remaining uncertainty in each sensor location. On the other hand, Gaussian probabilistic methods (such as multilateration by Savvides et al. [9], or the variational method by Pedersen et al. [10]) assume a Gaussian model for all uncertainties, which may be questionable in practice. Non-Gaussian uncertainty is a common occurrence in real-world sensor localization problems, where typically there is a fraction of highly erroneous (outlier) measurements. This problem can be solved using nonparametric probabilistic (or Bayesian) methods [11–14], which take into account the uncertainty of the measurements. They estimate a particle-based approximation of the posterior probability density function (PDF) of the positions of all unknown nodes, given the likelihood and a prior PDF of those positions. However, the main drawback of these methods is the high complexity of the marginalization of the joint posterior PDF, especially in large-scale networks. Nevertheless, an appropriate factorization of the joint PDF using a message-passing technique makes these methods tractable. Nonparametric belief propagation (NBP), proposed by Ihler et al. [11, 12], is a well-known particle-based message passing method for cooperative localization in wireless networks. It is capable of providing location estimates with appropriate uncertainty, and of accommodating nonlinear models and non-Gaussian measurement errors.
However, due to the overcounting problem in networks with loops, NBP’s convergence is not guaranteed, and its estimates are typically less accurate [15]. Our previous proposals, NBP based on spanning trees [16] and uniformly reweighted NBP [17], can mitigate this problem in highly connected networks, but with very small benefit compared with NBP. Another solution is the generalized belief propagation based on junction tree (GBP-JT) method [18], which is a standard method for exact inference in graphical models. In [19], nonparametric generalized belief propagation based on junction tree (NGBP-JT) was applied for localization in a small-scale network, where it was shown that it can outperform NBP in terms of accuracy, but at an additional cost. However, two main problems remained: (i) how to efficiently form the junction tree in an arbitrary network, and (ii) how to decrease the number of particles. Therefore, in this article, we propose nonparametric generalized belief propagation based on a pseudo-junction tree (NGBP-PJT). The main difference compared with the standard method is the formation of the pseudo-junction tree (PJT), which represents an approximated junction tree based on a thin graph. In addition, in order to decrease the number of high-dimensional particles, we use a more informative importance density function, and reduce the dimensionality of the messages. As a by-product, we also propose NBP based on a thin graph (NBP-TG), a cheaper variant of NBP, which runs on the same graph as NGBP-PJT. According to our simulation and experimental results (using measurements from an indoor office environment), the NGBP-PJT method outperforms NBP and NBP-TG in terms of accuracy, computational, and communication cost in reasonably sized networks. On the other hand, the main drawback of this method is its high cost in large-scale networks.
The remainder of this article is organized as follows. In Section 2, we provide the background on graphical models, the correctness of belief propagation, and junction tree formation. In Section 3, we propose an algorithm for PJT formation. Cooperative localization using the NGBP-PJT method for an arbitrary graph is proposed in Section 4. Simulation results are presented in Section 5. Finally, Section 6 provides some conclusions and proposals for future work. A summary of the notation is provided in Table 1.
Background and related work
Basics of graphical models
A graphical model is a probabilistic model for which a graph denotes the conditional independence structure between random variables. There are two main types: directed graphical models (or Bayesian networks) and undirected graphical models (or Markov networks). For the cooperative localization problem, we use Markov networks (also known as Markov random fields).
An undirected graph G = (V, E) consists of a set of nodes V joined by a set of edges E. A loop is a sequence of distinct edges forming a path from a node back to itself. A clique is a subset of nodes such that for every two nodes in the clique, there exists a link connecting the two. A tree is a connected graph without any loops, and a spanning tree is an acyclic subgraph that connects all the nodes of the original graph. Regarding directed graphs, we define a root node, which is a node without a parent, and a leaf node, which is a node without children. In order to define a graphical model, we place at each node a random variable taking values in some space. Each edge in the graph represents information about the conditional dependency between the two connected nodes. In the case of cooperative localization, each random variable represents a 2D (or 3D) position, and each edge, which indicates that a measurement is available, represents the likelihood function of that measurement. If we exclude the anchor nodes, the graph can be considered undirected.
Correctness of belief propagation
In the standard belief propagation (BP) algorithm (also known as sum-product), proposed by Pearl [20], the belief at node t, which represents the estimate of the posterior marginal PDF, is proportional to the product of the local evidence at that node, ψ _{ t }(x _{ t }), and all the messages coming into node t:

$${M}_{t}\left({x}_{t}\right)\propto {\psi}_{t}\left({x}_{t}\right)\prod_{u\in {G}_{t}}{m}_{\mathit{\text{ut}}}\left({x}_{t}\right) \qquad (1)$$
where x _{ t } is a random variable for the state of node t (e.g., the 2D position), and G _{ t } denotes the neighbors of node t. The messages are determined by the message update rule:

$${m}_{\mathit{\text{ut}}}\left({x}_{t}\right)=\sum_{{x}_{u}}{\psi}_{\mathit{\text{tu}}}\left({x}_{t},{x}_{u}\right){\psi}_{u}\left({x}_{u}\right)\prod_{v\in {G}_{u}\setminus t}{m}_{\mathit{\text{vu}}}\left({x}_{u}\right) \qquad (2)$$

where ψ _{ t u }(x _{ t },x _{ u }) is the pairwise potential between nodes t and u. On the right-hand side, there is a product over all messages going into node u except for the one coming from node t. This product is marginalized in order to form the particular information that we want to send to the destination node. In the case of continuous variables, the sum over x _{ u } has to be replaced with an integral.
In practical computation, one starts with nodes at the edge of the graph, and only computes a message when all the required messages are available. It is easy to see [15] that each message needs to be computed only once for tree-like graphs, meaning that the whole computation takes a time proportional to the number of links in the graph, which is significantly less than the exponentially large time that would be required to compute the marginal PDFs naively. In other words, BP is a way of organizing the global computation of marginal beliefs in terms of smaller local computations. For the localization problem, this is not sufficient, so we need to represent the messages and beliefs in nonparametric (particle-based) form, as done in [11]. The resulting method, NBP, is capable of approximating the posterior marginal PDFs in non-Gaussian form.
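As an illustration of the two rules above, the following sketch (our illustration, not part of the paper; it assumes discrete variables on a tree, with node and pairwise potentials supplied as NumPy arrays) computes each message exactly once and then the beliefs:

```python
import numpy as np

def sum_product_tree(nodes, edges, psi_node, psi_pair):
    """Sum-product BP on a tree.

    nodes: list of node ids; edges: (t, u) pairs forming a tree;
    psi_node[t]: local evidence vector over the states of t;
    psi_pair[(t, u)]: pairwise potential matrix indexed [x_t, x_u].
    Returns the normalized belief vector for every node.
    """
    neighbors = {t: set() for t in nodes}
    for t, u in edges:
        neighbors[t].add(u)
        neighbors[u].add(t)

    messages = {}  # (sender, receiver) -> vector over the receiver's states

    def message(u, t):
        # m_ut(x_t) = sum_{x_u} psi_tu(x_t,x_u) psi_u(x_u) prod_{v in G_u \ t} m_vu(x_u)
        if (u, t) not in messages:
            prod = psi_node[u].copy()
            for v in neighbors[u] - {t}:
                prod = prod * message(v, u)
            pair = psi_pair[(t, u)] if (t, u) in psi_pair else psi_pair[(u, t)].T
            m = pair @ prod  # marginalize out x_u
            messages[(u, t)] = m / m.sum()
        return messages[(u, t)]

    beliefs = {}
    for t in nodes:
        b = psi_node[t].copy()
        for u in neighbors[t]:
            b = b * message(u, t)
        beliefs[t] = b / b.sum()
    return beliefs
```

On a tree, the beliefs returned by this schedule coincide with the exact marginals, since each message is computed only once.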
The BP/NBP algorithm makes no reference to the topology of the graph that it is running on. However, if the graph contains loops and we ignore them, messages may circulate indefinitely around these loops, and the process may not converge to a stable equilibrium [20]. One can find examples of loopy graphs where, for certain parameter values, the BP/NBP algorithm fails to converge or predicts beliefs that are inaccurate. On the other hand, the BP/NBP algorithm can be successful in graphs with loops, e.g., error-correcting codes defined on Tanner graphs that have loops [21].
In order for BP/NBP to be successful, it needs to avoid overcounting [20, 22], a situation in which the same evidence is passed around the network multiple times and mistaken for new evidence. Of course, this is not the case in tree-like graphs because a node can receive a piece of evidence only through one path. In a loopy network, overcounting cannot be avoided. However, BP/NBP can still lead to nearly exact inference if all evidence is overcounted in equal amounts. This can be formalized by the unwrapped network [22] corresponding to a loopy network. The unwrapped network is a tree-like network constructed such that performing BP/NBP in the unwrapped network is equivalent to performing BP/NBP in the loopy network. The importance of the unwrapped network is that, since it is tree-like, BP/NBP on it is guaranteed to give the correct beliefs. However, the usefulness of these beliefs depends on the similarity between the probability distributions induced by the unwrapped network and the original loopy network. If the distributions are not similar, then the unwrapped network is not useful, and the results will be as erroneous as in the original loopy network.
For an extensive analysis of this problem, we refer the readers to [15, 22, 23].
Junction tree formation
The junction tree (JT) algorithm is a method for exact inference in arbitrary graphs, which can be proved by the elimination procedure [18]. It is based on a triangulated graph, i.e., a graph with additional “virtual” edges which ensure that every loop of length more than three has a chord. In a triangulated graph, each 3-node loop (which is not part of any larger clique) represents a 3-node clique, and each edge (which is not part of any 3-node clique) represents a 2-node clique. Larger cliques (> 3 nodes) should be avoided, but this is usually not possible even with an optimal triangulation procedure. Using these cliques as hypernodes, we can define a cluster graph [24] by connecting each pair of cliques with at least one common node (i.e., a nonempty intersection). Using the cluster graph, we can create many clique trees, but just one (or very few) of them represents the JT. The JT is a maximum spanning tree of the cluster graph, with weights given by the cardinality of the intersections between the cliques. It has already been proved [24] that this is a way to satisfy the main property of the JT, the running intersection property (RIP). The RIP is satisfied if and only if each node which is in two cliques C _{ i } and C _{ j } is also in all cliques on the unique path between C _{ i } and C _{ j }. If the RIP is not satisfied for a node, there is no theoretical guarantee that its belief in one clique is the same as its belief in another clique.
We illustrate the whole procedure in Figure 1. We first triangulate the graph by adding the edge between nodes 2 and 5 (Figure 1a). Then we form the cluster graph (Figure 1b) with cliques C _{ i }(t, u, v) and the separator sets S _{ i j }(q, r) (S _{ i j } = C _{ i }∩C _{ j }), where t, u, v are the nodes in the clique, and q, r are the separator nodes. Finally, any spanning tree represents a clique tree, such as the ones in Figure 1c,d. The tree in Figure 1d is the maximum spanning tree (|S _{12}| > |S _{13}|), so it represents the JT of the initial graph. Note that the tree in Figure 1c does not satisfy the RIP since node 6, which is in C _{1} and C _{2}, is not in C _{3}.
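The RIP condition can be checked programmatically. The sketch below (our illustration; clique ids and node ids are arbitrary labels) returns the set of nodes that violate the RIP for a given clique tree:

```python
def rip_violators(cliques, tree_edges):
    """Return the set of nodes violating the running intersection property.

    cliques: dict clique_id -> set of node ids;
    tree_edges: (clique_id, clique_id) pairs forming a clique tree.
    A node shared by two cliques must lie in every clique on the unique
    tree path between them; an empty result means the tree is a valid JT.
    """
    adj = {c: set() for c in cliques}
    for a, b in tree_edges:
        adj[a].add(b)
        adj[b].add(a)

    def path(a, b):  # the unique a-b path in the tree (depth-first search)
        stack, seen = [(a, [a])], {a}
        while stack:
            c, p = stack.pop()
            if c == b:
                return p
            for n in adj[c] - seen:
                seen.add(n)
                stack.append((n, p + [n]))
        return []

    violators = set()
    ids = list(cliques)
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            for node in cliques[a] & cliques[b]:
                if any(node not in cliques[c] for c in path(a, b)):
                    violators.add(node)
    return violators
```

For example, a chain that routes two cliques sharing a node through a third clique lacking that node is reported as a violation, mirroring the Figure 1c case.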
The described procedure represents the exact formation of the JT, also called the chordal graph method. The main problem of this approach is the triangulation phase. Finding a minimum triangulation, i.e., one where the largest clique has minimum size, is an NP-hard problem due to the number of permutations that must be checked. Of course, there exist approximate methods (e.g., [25]) which are less expensive, but they are still too costly according to the authors. For more details, see Chapter 10 in [24].
PJT
Due to the high complexity of the optimal JT formation, it is necessary to find an approximation that is suitable for the cooperative localization problem. Therefore, our goal is to achieve the following.

(a)
The number of cliques should be reasonable (i.e., in the order of number of nodes).

(b)
In order to reduce the dimensionality of the problem, each clique should include no more than three nodes.

(c)
Since the triangulation is an expensive procedure, we are going to avoid it, even if this breaks the RIP for some small percentage of the nodes.
After these approximations, the final result is, strictly speaking, a clique tree. However, since it is very close to the junction tree (measured by the percentage of nodes that satisfy the RIP), we name it the pseudo-junction tree (PJT).
Thin graph formation
In order to satisfy conditions (a) and (b), we need to decrease the number of edges in the graph by forming a thin graph. Assuming that each edge provides the same (or a sufficiently similar) amount of information, this can be done using a modified version of the breadth-first search (BFS) method. The standard BFS method [26] begins at a randomly chosen root node and explores all the neighboring nodes. Then each of those neighbors explores its unexplored neighbors, and so on, until all the nodes are explored. In this way, there will not be a loop in the graph because every node is explored just once. Thus, the final result of BFS is a spanning tree. The worst-case complexity is O(v + e), where v is the number of nodes and e is the number of edges in the graph, since every node and every edge will be explored in the worst case.
Nevertheless, the spanning tree is a very coarse approximation of the original graph since it excludes a lot of edges. For example, in any spanning tree, one communication failure breaks the graph into two parts. As a consequence, we would need more spanning trees in order to have reasonably accurate inference in graphical models. Therefore, we modify the standard BFS method by permitting each root node to make an additional visit to a node that was already visited by some of the previous roots. All edges found by the first and second visits, along with all the nodes from the original graph, represent the thin graph. In addition, a second visit automatically forms a loop, so we use it to form a 3-node clique. The 2-node cliques can be found easily by taking all the edges that appear in the thin graph, but not in any 3-node clique. The worst-case complexity is O(v + e + v · (v − 1)) ≈ O(v ^{2}), since for each additional visit, we need to check all previous roots (all the nodes minus one, in the worst case). The detailed pseudocode is shown in Algorithm 1, and an example of the original graph and the corresponding thin graph are shown in Figure 2a,b, respectively.
Algorithm 1. Searching for thin graph and cliques using modified BFS method
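Since the pseudocode of Algorithm 1 is not reproduced here, the following Python sketch conveys the idea under simplifying assumptions: a single BFS root, and a second visit accepted only when it closes a 3-node loop with the node’s BFS parent (the actual algorithm also admits second visits that close larger loops):

```python
from collections import deque

def thin_graph(adjacency, root):
    """Modified BFS: build a spanning tree, then let each explored node make
    one additional (second) visit that closes a 3-node loop, which becomes
    a 3-node clique.  adjacency: dict node -> set of neighboring nodes.
    Returns (thin_edges, cliques3)."""
    visited = {root}
    parent = {root: None}
    thin_edges = set()
    cliques3 = []
    queue = deque([root])
    while queue:
        t = queue.popleft()
        # first visits: standard BFS exploration
        for u in sorted(adjacency[t] - visited):
            visited.add(u)
            parent[u] = t
            thin_edges.add(frozenset((t, u)))
            queue.append(u)
        # one second visit to an already-visited node, accepted here only
        # if it closes the 3-node loop t - u - parent(t)
        for u in sorted((adjacency[t] & visited) - {t, parent[t]}):
            if frozenset((t, u)) in thin_edges:
                continue
            if parent[t] is not None and u in adjacency[parent[t]]:
                thin_edges.add(frozenset((t, u)))
                cliques3.append({t, u, parent[t]})
                break
    return thin_edges, cliques3
```

The 2-node cliques are then the thin-graph edges not covered by any 3-node clique, as described in the text.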
The main benefit of the thin graph is that it mainly includes 3-node loops. The number of these loops, which is obviously always less than the total number of nodes, is nearly constant with respect to connectivity, so the number of cliques will be nearly constant as well. On the other hand, the main drawback is that there exist loops which include more than three nodes.^{a} These loops should be triangulated, but we prefer to avoid this in order to keep the complexity reasonable. Thus, for n-node loops (n > 3), we form at most n 2-node cliques, using each edge of the loop (which is not already a subset of any 3-node clique) as a clique. Another problem can be caused by the nodes which cannot determine their positions due to the possible non-rigidity of the thin graph (e.g., the nodes with fewer than three neighbors). However, these nodes can still be located since we bound each estimate within its bounding box (see the section on nonparametric generalized belief propagation), created using the original (not thin) graph. Therefore, the estimates will never be outside these boxes, which means that a coarse estimate is ensured in the worst-case scenario. Finally, we note that anchor-unknown links are not excluded, so it would be useful if the anchors are placed as close as possible to the edges of the deployment area, where the leaf nodes are expected.
PJT formation
Having defined the cliques, we can form the cluster graph by connecting all pairs of cliques with a nonempty intersection (see Figure 3a). As we already mentioned, the JT, as well as the PJT, is the maximum spanning tree of the cluster graph. It can be found using, e.g., Prim’s algorithm [27], as shown in Algorithm 2. Prim’s algorithm is a method that finds a maximum (or minimum) spanning tree for a connected weighted undirected graph, meaning that the total weight of all the edges in the final tree is maximized (or minimized). In our case, the algorithm starts with a list (i.e., CurrentList in Algorithm 2) which initially includes only a randomly chosen root clique. At each step, among all the edges between the cliques in the list and those not yet in the list, it chooses the one with the maximum weight and grows the list by adding the explored clique. Finally, it stops when all the cliques are spanned. An example of a PJT is shown in Figure 3b. The worst-case complexity is O(e · log(v)) [27], but in our case the weights are binary (|S _{ i j }| = 1 or |S _{ i j }| = 2), so the execution will be very fast.
Algorithm 2. PJT formation using Prim’s algorithm
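A minimal sketch of this step (our illustration, assuming the cluster graph is connected and representing each clique as a set of node ids; the `in_tree` set below plays the role of CurrentList in the text):

```python
import heapq

def pjt_prim(cliques, root):
    """Maximum spanning tree over the cluster graph (Prim's algorithm).

    cliques: dict clique_id -> set of node ids.  Cliques with a non-empty
    intersection are connected; the edge weight is the separator size.
    Returns a list of (clique_i, clique_j, separator) tree edges."""
    in_tree = {root}
    tree = []
    heap = []

    def push_edges(c):
        # enqueue all cluster-graph edges from clique c to unexplored cliques
        for d, nodes in cliques.items():
            if d not in in_tree:
                sep = cliques[c] & nodes
                if sep:
                    heapq.heappush(heap, (-len(sep), c, d))  # min-heap: negate

    push_edges(root)
    while len(in_tree) < len(cliques) and heap:
        _, c, d = heapq.heappop(heap)
        if d in in_tree:
            continue
        in_tree.add(d)
        tree.append((c, d, cliques[c] & cliques[d]))
        push_edges(d)
    return tree
```

Since the separator sizes are binary (1 or 2), the heap operations are cheap and the edge with a 2-node separator is always preferred.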
The BP/GBP methods are naturally distributed through the network, which means that there is no central unit (fusion center) that handles all computations. Thus, the proposed PJT formation should be done in a distributed way. It is already well known that there is a straightforward distributed way to form any spanning tree, so we refer the readers to [28, 29].
Having defined the PJT, it remains to define the communication between neighboring cliques. Since the separator sets between each pair of neighboring cliques are always nonempty, the separator nodes are responsible for performing the communication. In practice, these nodes represent the cluster heads. For example, in Figure 3b, node 3 will request all the data from node 9, and upon receiving it, it will send the data to node 10, and vice versa.
Finally, the previous approximations will likely break the RIP for some small number of nodes. For instance, in the PJT in Figure 3b, node 10 (due to the non-triangulated 4-node loop: 3–9–5–10) and node 7 (due to the appearance of the 4-node clique: 2–6–5–7) do not satisfy the RIP. Therefore, we have no guarantee that the belief of such a node in one clique is the same as its belief in another clique [24]. Nevertheless, for cooperative localization, this is not a problem since we use the bounding boxes (see the section on nonparametric generalized belief propagation) for the initial set of particles. Regarding other applications, this method might be useful if all edges provide the same (or a sufficiently similar) amount of information.
Possible alternatives
Although we provide a tractable solution for the formation of an approximated junction tree, we cannot claim that it is optimal. In the literature, there are alternatives that could (with some adaptation) be applied to this problem. For example, Dechter et al. [30] propose iterative join-graph propagation, which runs on a cluster graph with bounded cluster size, created without discarding any edges. A similar solution, the thin junction tree with bounded cluster size, is available in [31]. However, for both approaches, a distributed implementation is not provided, so they cannot be applied directly to our problem. Finally, it is worth mentioning a distributed method [32] which creates a rigid subgraph from a fully connected (complete) graph in a tractable way. However, this method can be applied to cooperative localization only if adapted for non-complete graphs.
Nonparametric generalized belief propagation
GBP-JT is a standard message passing method for exact inference in graphical models, which can be proved using the elimination procedure [18]. Given cliques C _{ i } and their potentials ${\psi}_{{C}_{i}}\left({x}_{{C}_{i}}\right)$, and given the corresponding junction tree which defines the links between the cliques, we send the following message from clique C _{ i } to clique C _{ j } by the message update rule:

$${m}_{\mathit{\text{ij}}}\left({x}_{{S}_{\mathit{\text{ij}}}}\right)=\sum_{{x}_{{C}_{i}\setminus {S}_{\mathit{\text{ij}}}}}{\psi}_{{C}_{i}}\left({x}_{{C}_{i}}\right)\prod_{k\in {G}_{{C}_{i}}\setminus j}{m}_{\mathit{\text{ki}}}\left({x}_{{S}_{\mathit{\text{ki}}}}\right) \qquad (3)$$
where S _{ i j } = S _{ j i } = C _{ i }∩C _{ j }, and where ${G}_{{C}_{i}}$ are the neighbors of clique C _{ i } (including anchor nodes, which are not part of the PJT). The belief at clique C _{ i } is proportional to the product of the local evidence at that clique and all the messages coming into clique C _{ i }:

$${M}_{i}\left({x}_{{C}_{i}}\right)\propto {\psi}_{{C}_{i}}\left({x}_{{C}_{i}}\right)\prod_{j\in {G}_{{C}_{i}}}{m}_{\mathit{\text{ji}}}\left({x}_{{S}_{\mathit{\text{ji}}}}\right) \qquad (4)$$
Finally, the single-node beliefs can be obtained via further marginalization:

$${M}_{t}\left({x}_{t}\right)=\sum_{{x}_{{C}_{i}\setminus t}}{M}_{i}\left({x}_{{C}_{i}}\right) \qquad (5)$$
Equations (3), (4), and (5) represent the GBP-JT algorithm, which is valid for arbitrary graphs. The standard BP algorithm [11] is a special case of GBP-JT, obtained by noting that the original tree is already triangulated, and has only pairs of nodes as cliques. In that case, the sets S _{ i j } are single nodes, and the marginalization is unnecessary.
In order to adapt GBP-JT to the iterative scenario for cooperative localization, Equations (3) and (4), at iteration m + 1, can be written as

$${m}_{\mathit{\text{ij}}}^{m+1}\left({x}_{{S}_{\mathit{\text{ij}}}}\right)=\frac{\sum_{{x}_{{C}_{i}\setminus {S}_{\mathit{\text{ij}}}}}{M}_{i}^{m}\left({x}_{{C}_{i}}\right)}{{m}_{\mathit{\text{ji}}}^{m}\left({x}_{{S}_{\mathit{\text{ij}}}}\right)} \qquad (6)$$

$${M}_{i}^{m+1}\left({x}_{{C}_{i}}\right)\propto {\psi}_{{C}_{i}}\left({x}_{{C}_{i}}\right)\prod_{j\in {G}_{{C}_{i}}}{m}_{\mathit{\text{ji}}}^{m+1}\left({x}_{{S}_{\mathit{\text{ji}}}}\right) \qquad (7)$$
At the beginning, it is necessary to initialize ${m}_{\mathit{\text{ij}}}^{1}=1$, and ${M}_{i}^{1}={\psi}_{{C}_{i}}$. The clique potential ${\psi}_{{C}_{i}}$ is given as a product of all single-node and pairwise potentials. The potentials of a 2-node clique C _{ i }(t, u) and a 3-node clique C _{ j }(t, u, v) are, respectively, given by

$${\psi}_{{C}_{i}}\left({x}_{t},{x}_{u}\right)={\psi}_{t}\left({x}_{t}\right){\psi}_{u}\left({x}_{u}\right){\psi}_{\mathit{\text{tu}}}\left({x}_{t},{x}_{u}\right) \qquad (8)$$

$${\psi}_{{C}_{j}}\left({x}_{t},{x}_{u},{x}_{v}\right)={\psi}_{t}\left({x}_{t}\right){\psi}_{u}\left({x}_{u}\right){\psi}_{v}\left({x}_{v}\right){\psi}_{\mathit{\text{tu}}}{\psi}_{\mathit{\text{tv}}}{\psi}_{\mathit{\text{uv}}} \qquad (9)$$
The single-node potential (the prior) of node t is given by

$${\psi}_{t}\left({x}_{t}\right)\propto \left\{\begin{array}{ll}1,& {x}_{t}\in \text{b. box of }t\\ 0,& \text{otherwise}\end{array}\right. \qquad (10)$$
The bounding box (b. box) of node t, created using approximated distances to anchors (only 1-hop and 2-hop) as constraints [33], represents the region of the deployment area where node t is located. The pairwise potential ψ _{ t u }, which represents the likelihood function of the distance between nodes t and u (ψ _{ t u }(x _{ t }, x _{ u }) ∝ p(d _{ t u } | x _{ t },x _{ u })), is given by

$${\psi}_{\mathit{\text{tu}}}\left({x}_{t},{x}_{u}\right)\propto {p}_{v}\left({d}_{\mathit{\text{tu}}}-\parallel {x}_{t}-{x}_{u}\parallel \right),\phantom{\rule{1em}{0ex}}\parallel {x}_{t}-{x}_{u}\parallel \le R \qquad (11)$$

where d _{ t u } represents the measured distance between nodes t and u, p _{ v }(·) the noise distribution of the measured distance, and R the transmission radius. A more general model, which incorporates the probability of detection, can be found in [11, 33].
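For concreteness, a sketch of how such a pairwise potential could be evaluated, assuming zero-mean Gaussian noise p _{ v } with standard deviation sigma_d and a hard transmission-radius cutoff (a simplification of the more general model in [11, 33]):

```python
import math

def pairwise_potential(x_t, x_u, d_tu, sigma_d, R):
    """Unnormalized likelihood of the measured distance d_tu given two 2D
    positions, under zero-mean Gaussian noise with standard deviation
    sigma_d, and a hard cutoff at the transmission radius R."""
    dist = math.hypot(x_t[0] - x_u[0], x_t[1] - x_u[1])
    if dist > R:
        return 0.0  # no measurement can be observed beyond the radius
    err = d_tu - dist
    return math.exp(-err * err / (2.0 * sigma_d ** 2))
```

The potential is maximal when the inter-particle distance equals the measured distance, and decays with the squared distance error.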
Regarding the model for the mobile scenario, assuming that information moves only forward in time, there are no loops between different time frames. Therefore, a dynamic model should be defined on the original graph. One option is to include this information using the single-node potential, i.e.,

$${\psi}_{t,\tau}\left({x}_{t,\tau}\right)\propto {\psi}_{t}\left({x}_{t,\tau}\right)p\left({x}_{t,\tau}\mid {x}_{t,\tau-1}\right) \qquad (12)$$
where the dynamic model p(x _{ t,τ } | x _{ t,τ−1}) defines the possible positions of the unknown node x _{ t,τ } at the current time instant τ, given the estimated position from the previous time instant. It is also necessary to create the PJT at each time instant, unless the structure of the graph remains the same. All other computations (within the same time instant) are the same as in the static scenario. Thus, for clarity, we discard the subscript τ in all equations, and focus on the static scenario. More details on mobile positioning can be found in [13, 14, 34].
Due to the high complexity, the presence of nonlinear relationships, and potentially highly non-Gaussian uncertainties, the GBP-JT method is not appropriate for cooperative localization [11]. Thus, we need to use NGBP-JT. Moreover, due to the problems explained in the previous sections, we are going to use the PJT instead of the JT. Therefore, in the following subsections, we propose NGBP based on the PJT (NGBP-PJT) for arbitrary networks. Note that an analysis of NGBP-JT for a small-scale network has already been provided in [19].
Drawing particles from the cliques
Let us draw N _{ C } weighted particles, $\left\{{W}_{{C}_{i}}^{k,m},\phantom{\rule{0.3em}{0ex}}{X}_{{C}_{i}}^{k,m}\right\}$ (k = 1, …, N _{ C }, m = 1), from clique C _{ i }. Since it is computationally very expensive to draw particles from ${M}_{i}^{1}={\psi}_{{C}_{i}}$, we need to find an appropriate importance density function. Thus, for the initial particles, we are going to use two constraints: (i) each particle of a node must be inside its bounding box, and (ii) the distance between each pair of nodes in the clique should be close to the mean value of the measured distance. Taking this into account, our importance density function ${q}_{{C}_{i}}^{m}$ (m = 1) for clique C _{ i }(t, u) is given by^{b}:
where μ _{ t u } is the mean value of the measured distance (we assume that multiple measurements per link are obtained). The parameter δ should be chosen so as to encompass nearly the whole PDF. Otherwise, if we cut out a significant part of the PDF, the final beliefs will be overconfident. For instance, if p _{ v } is a Gaussian with standard deviation σ _{ d }, δ = 3σ _{ d } could be a good choice since it will encompass about 99% of the PDF. If the constraint is not satisfied, there is a very small probability ϵ for the particle in that area.^{c} Finally, it is straightforward to show, using (13), that the importance density function for the 3-node clique C _{ j }(t, u, v) can be found as
To draw a clique particle, we draw node particles within their bounding boxes and accept the particle if the constraint is satisfied. If not, we reject the sample and try again. The weights of the particles can easily be computed by
Then these weights (as well as all the weights in the following subsections) are normalized
In this way, we have created two types of particles: the edges (for 2-node cliques) and the triangles (for 3-node cliques). We illustrate an initial set of particles in Figure 4.
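The rejection sampling of the initial 2-node clique particles can be sketched as follows (our illustration; bounding boxes are assumed axis-aligned rectangles, and the constraint is the hard distance window of width 2δ around μ _{ t u }, ignoring the small residual probability ϵ):

```python
import math
import random

def draw_clique_particles(box_t, box_u, mu_tu, delta, n_particles):
    """Draw 2-node clique particles by rejection sampling: node particles
    are uniform inside their bounding boxes (x_min, y_min, x_max, y_max),
    and a pair is accepted only if its distance is within delta of the
    mean measured distance mu_tu."""
    particles = []
    while len(particles) < n_particles:
        xt = (random.uniform(box_t[0], box_t[2]),
              random.uniform(box_t[1], box_t[3]))
        xu = (random.uniform(box_u[0], box_u[2]),
              random.uniform(box_u[1], box_u[3]))
        if abs(math.hypot(xt[0] - xu[0], xt[1] - xu[1]) - mu_tu) <= delta:
            particles.append((xt, xu))
    return particles
```

In practice one would also cap the number of rejections (or fall back to the ϵ-acceptance) in case the boxes and the distance window are nearly incompatible.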
Computing messages
Having computed the initial set of particles from the beliefs, we can compute the particles from the messages. According to Equation (6), we first need to marginalize the belief from the previous iteration, and then divide it by the incoming message from the previous iteration. Since all node particles within a clique have one common weight (e.g., $\{{W}_{{C}_{i}}^{k,m},\phantom{\rule{0.3em}{0ex}}{X}_{{C}_{i}}^{k,m}\}=\{{W}_{{C}_{i}}^{k,m},\phantom{\rule{0.3em}{0ex}}\{{X}_{t}^{k,m},\phantom{\rule{0.3em}{0ex}}{X}_{u}^{k,m}\left\}\right\})$, we can simply pick the particles of the separator nodes (from the clique that sends the message), and compute the weight as the remainder of (6). The separator sets can include one or two nodes, so there exist 1-node and 2-node messages. Therefore, the weighted particles of the 2-node message from C _{ i }(t, u, v) to C _{ j }(t, u, r), at iteration m + 1, are given by
The 1-node messages can be found in an analogous way. As we can see, we need to approximate the parametric form of the message ${m}_{\mathit{\text{ji}}}^{m}$, so we estimate it using a spherically symmetric Gaussian kernel [12, 35]. The bandwidth, the parameter which controls the smoothness of this kernel density estimate (KDE), can be found using the “rule of thumb” [12], or some more advanced method [36]. In the case of a 2-node message, it is too complex to estimate the parametric form directly from the high-dimensional (4D) particles. However, thanks to the dependency between the nodes within the message (the noisy distance), we can reduce the dimension of the message by
Note that in the PJT (in contrast to the JT), there is always an observed distance between each pair of nodes within a clique (i.e., no additional edges are added by triangulation). Thus, it is sufficient to transmit the particles of one node, and upon receiving them, shift them in a random direction by the observed distance. Finally, the messages from any anchor a to any neighboring unknown node t are simply given in the parametric form
where we assumed that the position of the anchor node is perfectly known (i.e., defined by a Dirac delta function). However, if the anchors’ positions are uncertain (as in [37]), the message can be computed in the same way as the messages from the unknown nodes.
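The dimensionality reduction described above, i.e., reconstructing a neighbor’s particles by shifting the transmitted 1-node particles by the observed distance in a uniformly random direction, can be sketched as:

```python
import math
import random

def shift_particles(particles, d_tu):
    """Reconstruct a neighbor's particles from transmitted 1-node particles:
    shift each received 2D particle by the observed distance d_tu in a
    uniformly random direction."""
    shifted = []
    for x, y in particles:
        theta = random.uniform(0.0, 2.0 * math.pi)
        shifted.append((x + d_tu * math.cos(theta),
                        y + d_tu * math.sin(theta)))
    return shifted
```

Every reconstructed particle lies exactly at distance d _{ t u } from its source particle, which is what makes transmitting only one node’s particles sufficient.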
Computing beliefs
According to (7), the belief of clique i is a product of its clique potential and all the messages coming into the clique. Before drawing the particles, we need to solve two problems: (i) the messages include information about different nodes within the clique, and (ii) it is intractable to draw the particles from the product.
The first problem can be solved by filling the message with information about the nodes which appear in the destination clique, but not in the message. For example, for the message ${m}_{\mathit{\text{ij}}}^{m+1}({x}_{t},{x}_{u})$, from C _{ i }(t, u, v) to C _{ j }(t, u, r), we can form the joint message:
Taking Equations (19), (8), and (9) into account, the joint message can be written as
where node t must be in the appropriate separator set (t ∈ S _{ i j }), and if |S _{ i j }| > 1, we can pick one node randomly. Thanks to the particles from the standard messages, we already have a few (one or two) node particles from each joint message. The remaining node particles can be drawn by shifting the given node particles in a random direction by an amount which represents the observed distance, and by checking (only in the case of a 3-node clique) another distance constraint. The weights of the particles from the joint messages are equal to the weights of the particles from the standard messages. However, due to sample depletion, we resample with replacement [38, 39] so as to produce particles with equal weights: $\{1/{N}_{C},{X}_{\mathit{\text{ij}}}^{k,m+1}\}$. Most of the particles, especially in the case of small noise, will be the same. This can cause a very poor representation of the beliefs. Therefore, to each of these particles, we add a small jitter ω drawn from p _{ v }:
where θ represents a random direction (θ ∼ Unif[0, 2π)). Finally, we solve problem (ii) by taking the sum (instead of the product) of the joint messages (i.e., using mixture importance sampling (MIS) [11]). Therefore, the final importance density for the belief of clique j, and the corresponding particles, are respectively given by
We now find the set of particles from the beliefs $\{{W}_{{C}_{j}}^{k,m+1},{X}_{{C}_{j}}^{k,m+1}\}(k=1,\dots ,{N}_{C})$:
where ${W}_{{C}_{j},\text{corr}}^{k,m+1}$ is the correction of the weights due to MIS, ${X}_{t}^{k,m+1}$ is the particle from node t, m _{ a t } is the message from the anchor node a to unknown node t, and the function choose selects one particle at random from ${G}_{{C}_{j}}$.
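The resampling-with-jitter and mixture-sampling steps above can be sketched as follows (toy 2-D data; the particle sets, the jitter scale, and all names are illustrative assumptions, not the paper’s implementation):

```python
import numpy as np

rng = np.random.default_rng(1)
N_C = 300

# Two incoming joint messages, each a weighted 2-D particle set (toy data).
msgs = [
    (rng.normal([5.0, 5.0], 1.0, (N_C, 2)), np.full(N_C, 1.0 / N_C)),
    (rng.normal([6.0, 4.0], 1.0, (N_C, 2)), np.full(N_C, 1.0 / N_C)),
]

def sample_mixture(msgs, n, rng):
    """Mixture importance sampling: draw from the *sum* of the messages by
    picking a message uniformly, then a particle according to its weights."""
    out = np.empty((n, 2))
    for k in range(n):
        X, W = msgs[rng.integers(len(msgs))]
        out[k] = X[rng.choice(len(X), p=W)]
    return out

samples = sample_mixture(msgs, N_C, rng)

# Resample with replacement to equal weights, then add a small jitter:
# a random direction theta and a radial perturbation omega (scale assumed).
idx = rng.choice(N_C, size=N_C, replace=True)
theta = rng.uniform(0.0, 2.0 * np.pi, N_C)
omega = rng.normal(0.0, 0.05, N_C)
jitter = omega[:, None] * np.column_stack((np.cos(theta), np.sin(theta)))
resampled = samples[idx] + jitter
```

The jitter prevents many identical copies of the same particle from collapsing the kernel density estimate of the belief.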
As a convergence measure, we use the approximated Kullback-Leibler (KL) divergence between the beliefs in two consecutive iterations, which is given by:
where we used the approximation ${M}_{j}^{m+1}\left({X}_{{C}_{j}}^{k,m+1}\right)\approx {W}_{{C}_{j}}^{k,m+1}$. The algorithm stops when $K{L}_{j}^{m+1}$ (for all j) drops below a predefined threshold. However, it is also possible to predefine the number of iterations, given the diameter of the graph and the transmission radius. We choose the latter approach in our simulations.
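This stopping rule can be sketched as follows, assuming (as per the approximation above) that the previous belief evaluated at the new particles is available as a weight vector; names and the threshold are illustrative:

```python
import numpy as np

def approx_kl(W_new, M_old_at_new, eps=1e-12):
    """Approximated KL divergence between the beliefs in two consecutive
    iterations: sum_k W_new[k] * log(W_new[k] / M_old(X_new[k])).
    A sketch, not the paper's code; eps guards against log(0)."""
    W = np.asarray(W_new) + eps
    M = np.asarray(M_old_at_new) + eps
    return float(np.sum(W * np.log(W / M)))

# If the belief is unchanged between iterations, the divergence is ~0
# and the algorithm would stop.
W_prev = np.full(100, 1.0 / 100)
W_curr = np.full(100, 1.0 / 100)
threshold = 1e-3
print(approx_kl(W_curr, W_prev) < threshold)  # True
```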
The final estimate of each node within the cliques is given by the mean of the particles from the belief in the last iteration. Since most of the nodes appear in more than one clique, we simply average the multiple estimates. Other options are also possible, such as choosing the belief with the smallest entropy. We summarize the NGBP-PJT algorithm in Algorithm 3.
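The final averaging step can be sketched as follows (toy per-clique estimates; node labels and values are hypothetical):

```python
import numpy as np

# Per-clique position estimates from the last iteration: a node appearing
# in several cliques gets one estimate (mean of its belief particles) per clique.
estimates_per_node = {
    "t": [np.array([3.1, 4.0]), np.array([2.9, 4.2])],  # node t in two cliques
    "u": [np.array([7.0, 1.5])],                        # node u in one clique
}

# Final position of each node: average its per-clique estimates.
final = {n: np.mean(np.stack(e), axis=0) for n, e in estimates_per_node.items()}
print(final["t"])  # [3.  4.1]
```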
Algorithm 3. NGBP-PJT method for cooperative localization
Finally, it is worth noting that a special case of the NGBP-PJT method is the NBP method based on the thin graph (NBP-TG), obtained by assuming that the thin graph has only pairs of nodes as cliques. NBP-TG is an important by-product since it runs on the same graph as NGBP-PJT, which makes it cheaper than NBP. It also helps us understand (as shown in the following section) how much the edges removed from the original graph affect the performance of the method.
Performance evaluation
Scenario
We assume that there are N _{ a } + N _{ u } = 60 nodes in a 20 × 20 m^{2} area. Four anchors are placed near the edges. This usually realistic constraint helps the unknown nodes near the edges, which suffer from low connectivity. The rest of the anchors and the unknown nodes are randomly deployed within the area. The number of iterations is set to N _{iter} = 3, which means that each node/clique will have available all the information up to 3 hops away from itself. The transmission radius is set to R = 8 m. Simulations are performed using N _{ a } = 6 and N _{ a } = 12 anchor nodes. We assume that the distance is obtained from RSS measurements using the log-normal model, since this is usually the worst-case scenario [1]. Thus, we choose σ _{dB} = 5 dB as the standard deviation of the RSS (i.e., the parameters^{d} of the log-normal distribution are μ = log(d) and σ = σ _{dB}/(10 n _{ p }) = 0.25, where n _{ p } = 2 is the path-loss exponent^{e}). These parameters are the same for NBP, NBP-TG, and NGBP-PJT. However, the number of particles is set to 100 (for NBP), 290 (for NBP-TG), and 210 (for NGBP-PJT), in order to make the computational time nearly the same for all three methods (see Table 2). For the KDE of the messages, the bandwidth is found using the “rule of thumb”, which is the simplest option. Finally, the following simulation results represent the average over 20 Monte Carlo runs (in each of which N _{ u } estimates are available). Note that all the defined parameters hold unless otherwise stated in the following text.
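As a sketch, the noisy RSS-based ranges of this scenario can be generated from the log-normal model as follows; only the parameters σ_dB = 5 dB and n_p = 2 come from the text, while the function names and the true distance are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)

sigma_dB, n_p = 5.0, 2.0            # RSS std [dB] and path-loss exponent (from the text)
sigma = sigma_dB / (10.0 * n_p)     # log-normal sigma = 0.25, as in the scenario

def measure_distance(d_true, rng):
    """Draw a noisy RSS-based range: log-normal with mu = log(d_true)."""
    return np.exp(rng.normal(np.log(d_true), sigma))

d_true = 6.0
samples = measure_distance(np.full(20000, d_true), rng)

# Per endnote d, the mean of the measured distance is biased upward:
# mu_d = e^{mu + sigma^2/2} = d_true * e^{sigma^2/2} > d_true.
mu_d = d_true * np.exp(sigma**2 / 2)
print(abs(samples.mean() - mu_d) < 0.1)  # True: sample mean matches the biased analytical mean
```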
Comparison of accuracy and convergence
Using the defined scenario, we compare the accuracy and the convergence of the NBP, NBP-TG, and NGBP-PJT algorithms. The error is defined as the Euclidean distance between the true and the estimated locations. First, we illustrate the results of these methods in Figure 5. We can see that NGBP-PJT significantly outperforms both NBP and NBP-TG, and also that NBP slightly outperforms NBP-TG.
Then, for a randomly chosen node, we illustrate its initial and final beliefs in Figure 6. First, the initial beliefs of NBP and NBP-TG are nearly uniform within the bounding box, but the initial belief of NGBP-PJT (which also lies within its bounding box) is not uniform due to the distance constraints within the appropriate cliques (see (13)). Thus, the initial belief of NGBP-PJT is more informative than that of NBP. Second, we can see that the final NBP belief is the tightest, but this information is overconfident compared with NBP-TG and NGBP-PJT. Since NBP-TG and NGBP-PJT run on the same graph, we can say that NBP-TG is overconfident, compared with NGBP-PJT, due to the loops. Regarding the comparison between NBP and NBP-TG, we note that the latter has fewer edges and fewer loops. This increases the uncertainty, so NBP-TG provides a more uncertain belief. The level of overconfidence can also be analyzed using the true position of the node: in the case of an overconfident belief, the true position can be located in an area with probability close to zero (as shown in Figure 6). Therefore, we conclude that NGBP-PJT is less informative, but more trustworthy. Moreover, in order to draw a more precise conclusion about accuracy, we also consider the cumulative distribution function (CDF) of the position error. We can see in Figure 7 that NGBP-PJT outperforms all the other methods at every percentile.
Furthermore, we analyze the RMS error with respect to the transmission radius. According to Figure 8, NGBP-PJT significantly (5–10%) outperforms NBP and NBP-TG, for all R and both values of N _{ a }. It is also worth noting that the number of anchors significantly affects the accuracy. For instance, NGBP-PJT with 6 anchors performs similarly to NBP with 12 anchors. Therefore, given nearly the same accuracy, one can decrease the equipment cost by removing 6 anchors (which are usually very expensive). It is also interesting to note the performance difference between NGBP-PJT and NBP-TG (since they use different message passing, but the same graph), and between NBP and NBP-TG (since they use the same message passing, but different graphs).
Regarding convergence, we can see in Figure 9 that all the algorithms converge sufficiently after the second iteration. This is expected since we set R = 8 m, so almost all information is at most 2 hops away from each clique. Finally, we can see that none of the algorithms, especially NBP-TG, converges perfectly (i.e., KL → 0) after a reasonable number of iterations. This is, of course, caused by the existence of loops (for NBP and NBP-TG), and by the missing edges in the thin graph (for NBP-TG and NGBP-PJT).
Comparison of computational and communication cost
As already mentioned, we set the same computational cost for R = 5 m by choosing an appropriate number of particles for each of the three methods. It was not possible to set the same cost for all values of R since the cost is more sensitive to R in the case of NBP. On the other hand, the costs of NGBP-PJT and NBP-TG are less sensitive to R due to the nearly constant number of edges with respect to (w.r.t.) R in the formed thin graph. We provide the average cost per node for different values of R in Table 2. We can see that the cost of NGBP-PJT is the same or lower for all considered values of R. We can also see that the complexity of the PJT formation is negligible compared with the full algorithms.
Regarding the communication cost, which is directly related to the battery life of the wireless devices, we provide a simplified^{f} analysis by counting elementary messages, where one elementary message is defined as a scalar value (e.g., one coordinate of one particle). We consider the effect of the transmission radius and the number of unknowns, since their variation obviously affects the cost. First, we analyze the cost of the PJT formation (Algorithms 1 and 2). As we can see in Figure 10, it is a linear function of the transmission radius, and a quadratic function of the number of unknowns. Second, we analyze the cost of all the considered algorithms w.r.t. R, for two different numbers of unknowns. According to Figure 11, we can conclude the following:
- NGBP-PJT significantly outperforms the NBP and NBP-TG methods for a reasonable number of unknowns.
- Compared with NBP, the improvement of NGBP-PJT increases with the transmission radius. This is achieved thanks to the thin graph.
- The cost of NBP-TG is slightly lower than that of NGBP-PJT due to the redundancy in the PJT graph (i.e., when the same node appears in more than one clique).
- Increasing the number of unknowns decreases the benefit of NGBP-PJT. This is caused by the quadratic dependence of the PJT formation on the number of unknowns. Using the results from Figures 10b and 11, we estimate that NGBP-PJT will reach the same cost as NBP for 140 unknown nodes.
Finally, we conclude that the proposed NGBP-PJT method is cheaper for reasonably sized networks. However, it can also be cheap for very large-scale networks if the network is divided into regions, with one PJT created for each of them.
Experimental results
We now test NBP, NBP-TG, and NGBP-PJT using real RSS data obtained in an indoor environment. The experiments were performed by Patwari et al. [40], and the data are available online [41]. They marked 44 points in a 14 × 13 m^{2} office area (see Figure 12), with many obstacles typical of such an environment (cubicle walls, desks, computers, etc.). The measurement system includes one transmitter and one receiver using wideband direct-sequence spread-spectrum (DSSS) signals. The transmit power was 10 mW, and the center frequency 2443 MHz. The measurements were conducted by placing the transmitter and the receiver at each pair of points, and taking 5 measurements for each combination (9460 measurements in total). Then, for each measurement, the TOA and RSS were estimated, and for each link, the average was computed. More details about the experiments can be found in [40].
In this article, we use the averaged RSS samples, shown in Figure 13. Taking P _{0}(d _{0} = 1 m) = −37 dBm, the parameters of the log-normal model can be found: n _{ p } = 2.3 and σ _{ d B } = 3.92 dB. Then, we compare the accuracy of NGBP-PJT, NBP-TG, and NBP as a function of the transmission radius (which varies from 5 to 10 m). The number of particles is the same as for the test in Section 2, and the number of anchors is set to N _{ a } = 5 (marked as red squares in Figure 12). The results are averaged over 20 Monte Carlo runs (in each run, we use a different seed for the particles, but keep the same network and the corresponding measurements). As we can see in Figure 14, the conclusion is the same as for the previous test (Section 2) based on synthetic data: NGBP-PJT outperforms NBP and NBP-TG for all considered R. However, comparing Figures 8 and 14, we note that the values of the errors have changed.
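For illustration, the model parameters can be recovered by a least-squares fit of the log-distance path-loss model. The sketch below uses synthetic data as a stand-in for the measurements of [40]; the sign of P_0, the distance range, and all names are our assumptions:

```python
import numpy as np

rng = np.random.default_rng(3)

# Synthetic RSS samples from the log-distance model, with parameters set
# to the values reported in the text (P0 in dBm at d0 = 1 m).
P0, n_p_true, sigma_dB = -37.0, 2.3, 3.92
d = rng.uniform(1.0, 15.0, 500)
rss = P0 - 10.0 * n_p_true * np.log10(d) + rng.normal(0.0, sigma_dB, 500)

# Least-squares fit of P0 and the path-loss exponent n_p; the residual
# standard deviation estimates the shadowing deviation sigma_dB.
A = np.column_stack((np.ones_like(d), -10.0 * np.log10(d)))
(P0_hat, n_p_hat), *_ = np.linalg.lstsq(A, rss, rcond=None)
sigma_hat = np.std(rss - A @ np.array([P0_hat, n_p_hat]))

print(round(n_p_hat, 1))  # close to 2.3
```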
Conclusion and future work
In this article, we presented NGBP-PJT, a novel message passing approach for cooperative localization in wireless networks. Since the exact formation of the junction tree is intractable, we proposed the formation of the PJT, which represents the approximated JT based on the thin graph. In addition, in order to decrease the number of particles for the NGBP-PJT method, we proposed a more informative importance density function and reduced the dimensionality of the messages. As a by-product, we also proposed NBP-TG, a cheaper variant of NBP, which runs on the same graph as NGBP-PJT. According to our simulation and experimental results, NGBP-PJT outperforms NBP and NBP-TG in terms of accuracy, computational, and communication cost in reasonably sized networks. In addition, the NGBP-PJT beliefs are not overconfident like the NBP beliefs, so we can obtain online more trustworthy information about the position uncertainty. Finally, all the algorithms converge sufficiently after a very small number of iterations, but the convergence is not perfect.
There remain a number of open directions for future work. One direction could be to find an alternative method (e.g., modified variants of the methods described in Section 2) that would be tractable in large-scale networks. It would also be useful to determine in which graphs, and under which conditions, the benefit gained by using NGBP-PJT instead of NBP outweighs the penalty caused by discarding edges. Moreover, an important research line is to investigate whether there is a cheaper (non-particle-based) message representation capable of handling all realistic uncertainties. Finally, distributed target tracking in sensor networks [42, 43] could be an interesting direction since this application requires a number of sensor nodes with known (or estimated) positions.
Endnotes
^{a} According to our empirical analysis, the number of these loops is relatively small (e.g., just one 4-node loop in Figure 2b).
^{b} We implicitly assumed that ${q}_{{C}_{i}}^{1}\left({x}_{{C}_{i}}\right)=0$ if the state of one of the clique nodes is out of the deployment area.
^{c} In practical circumstances, we can set ϵ = 0.
^{d} Note that these values do not represent the mean value and the standard deviation of the distance, which are respectively given by ${\mu}_{d}={e}^{\mu +{\sigma}^{2}/2}$ and ${\sigma}_{d}={\mu}_{d}\sqrt{{e}^{{\sigma}^{2}}-1}$. Consequently, these parameters are distance dependent.
^{e} Typical values of n _{ p } are between 2 and 6 [44]. For distance estimation, the minimum value is the worst case.
^{f} The exact communication cost can only be measured given the hardware specifications, especially the number of bytes per packet, the number of reserved bytes per packet, the energy required to transmit a packet, etc.
References
1. Patwari N, Ash JN, Kyperountas S, Hero III AO, Moses RL, Correal NS: Locating the nodes: cooperative localization in wireless sensor networks. IEEE Signal Process. Mag. 2005, 22(4):54–69.
2. Sayed AH, Tarighat A, Khajehnouri N: Network-based wireless location: challenges faced in developing techniques for accurate wireless location information. IEEE Signal Process. Mag. 2005, 22(4):24–40.
3. Fox D, Burgard W, Dellaert F, Thrun S: Monte Carlo localization: efficient position estimation for mobile robots, in Proceedings of the National Conference on Artificial Intelligence. July 1999.
4. Gustafsson F, Gunnarsson F: Mobile positioning using wireless networks: possibilities and fundamental limitations based on available wireless network measurements. IEEE Signal Process. Mag. 2005, 22(4):41–53.
5. Gezici S, Tian Z, Giannakis GB, Kobayashi H, Molisch AF, Poor HV, Sahinoglu Z: Localization via ultra-wideband radios: a look at positioning aspects for future sensor networks. IEEE Signal Process. Mag. 2005, 22(4):70–84.
6. Niculescu D, Nath B: Ad hoc positioning system (APS), in IEEE Proceedings of GLOBECOM. November 2001, 2926–2931.
7. Shang Y, Ruml W, Zhang Y, Fromherz M: Localization from connectivity in sensor networks. IEEE Trans. Parallel Distrib. Syst. 2004, 15(11):961–974. 10.1109/TPDS.2004.67
8. Chan F, So HC, Ma WK: A novel subspace approach for cooperative localization in wireless sensor networks using range measurements. IEEE Trans. Signal Process. 2009, 57(1):260–269.
9. Savvides A, Park H, Srivastava MB: The bits and flops of the n-hop multilateration primitive for node localization problems, in Proceedings of the 1st ACM International Workshop on Wireless Sensor Networks and Applications. September 2002, 112–121.
10. Pedersen C, Pedersen T, Fleury BH: A variational message passing algorithm for sensor self-localization in wireless networks, in Proceedings of the International Symposium on Information Theory (ISIT). July 2011, 2158–2162.
11. Ihler AT, Fisher III JW, Moses RL, Willsky AS: Nonparametric belief propagation for self-localization of sensor networks. IEEE J. Sel. Areas Commun. 2005, 23(4):809–819.
12. Ihler AT: Inference in sensor networks: graphical models and particle methods. PhD Thesis, Massachusetts Institute of Technology, June 2005.
13. Wymeersch H, Lien J, Win MZ: Cooperative localization in wireless networks. Proc. IEEE 2009, 97(2):427–450.
14. Schiff J, Sudderth EB, Goldberg K: Nonparametric belief propagation for distributed tracking of robot networks with noisy inter-distance measurements, in IEEE Proceedings of the International Conference on Intelligent Robots and Systems. October 2009, 1369–1376.
15. Yedidia JS, Freeman WT, Weiss Y: Understanding belief propagation and its generalizations. Explor. Artif. Intell. New Millennium 2003, 8:239–269.
16. Savic V, Poblacion A, Zazo S, Garcia M: Indoor positioning using nonparametric belief propagation based on spanning trees. EURASIP J. Wirel. Commun. Netw. 2010, 2010:1–12.
17. Savic V, Wymeersch H, Penna F, Zazo S: Optimized edge appearance probability for cooperative localization based on tree-reweighted nonparametric belief propagation, in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). May 2011, 3028–3031.
18. Jordan MI, Weiss Y: Graphical models: probabilistic inference, in The Handbook of Brain Theory and Neural Networks. Cambridge, MA: MIT Press; 2002.
19. Savic V, Zazo S: Sensor localization using nonparametric generalized belief propagation in network with loops, in IEEE Proceedings of Information Fusion. July 2009, 1966–1973.
20. Pearl J: Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Burlington, MA: Morgan Kaufmann; 1988.
21. Frey BJ: A revolution: belief propagation in graphs with cycles, in Proceedings of Advances in Neural Information Processing Systems. 1997, 479–485.
22. Weiss Y: Correctness of local probability propagation in graphical models with loops. Neural Comput. 2000, 12(1):1–41. 10.1162/089976600300015880
23. Mooij JM, Kappen HJ: Sufficient conditions for convergence of the sum-product algorithm. IEEE Trans. Inf. Theory 2007, 53(12):4422–4437.
24. Koller D, Friedman N: Probabilistic Graphical Models: Principles and Techniques. Cambridge, MA: MIT Press; 2009.
25. Hutter F, Ng B, Dearden R: Incremental thin junction trees for dynamic Bayesian networks. Technical Report, Darmstadt University of Technology; 2004. http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.9.8661
26. Bader DA, Madduri K: Designing multithreaded algorithms for breadth-first search and st-connectivity on the Cray MTA-2, in IEEE Proceedings of Parallel Processing. August 2006, 523–530.
27. Wang X, Wang X, Wilkes DM: A divide-and-conquer approach for minimum spanning tree-based clustering. IEEE Trans. Knowl. Data Eng. 2009, 21(7):945–958.
28. Yoo A, Chow E, Henderson K, McLendon W, Hendrickson B, Catalyurek UV: A scalable distributed parallel breadth-first search algorithm on BlueGene/L, in Proceedings of Supercomputing. November 2005.
29. Mooij AJ, Goga N: A distributed spanning tree algorithm for topology-aware networks, in Proceedings of the Conference on Design, Analysis, and Simulation of Distributed Systems. 2004.
30. Dechter R, Kask K, Mateescu R: Iterative join-graph propagation, in Proceedings of the Conference on Uncertainty in Artificial Intelligence (UAI). 2002.
31. Chechetka A, Guestrin C: Efficient principled learning of thin junction trees, in Advances in Neural Information Processing Systems (NIPS). 2007.
32. Caetano TS, Caelli T, Schuurmans D, Barone DAC: Graphical models and point pattern matching. IEEE Trans. Pattern Anal. Mach. Intell. 2006, 28(10):1646–1663.
33. Savic V, Zazo S: Nonparametric boxed belief propagation for localization in wireless sensor networks, in IEEE Proceedings of SENSORCOMM. June 2009, 520–525.
34. Mihaylova L, Angelova D, Bull DR, Canagarajah NC: Localization of mobile nodes in wireless networks with correlated in time measurement noise. IEEE Trans. Mob. Comput. 2011, 10(1):44–53.
35. Silverman BW: Density Estimation for Statistics and Data Analysis. New York: Chapman and Hall; 1986.
36. Botev ZI: A novel nonparametric density estimator. Technical Report, The University of Queensland, Australia; 2006.
37. Vemula M, Bugallo MF, Djuric PM: Sensor self-localization with beacon position uncertainty. Signal Process. (Elsevier) 2009, 89(6):1144–1154. 10.1016/j.sigpro.2008.12.019
38. Djuric PM, Kotecha JH, Zhang J, Huang Y, Ghirmai T, Bugallo MF, Miguez J: Particle filtering. IEEE Signal Process. Mag. 2003, 20(5):19–38. 10.1109/MSP.2003.1236770
39. Arulampalam MS, Maskell S, Gordon N, Clapp T: A tutorial on particle filters for online nonlinear/non-Gaussian Bayesian tracking. IEEE Trans. Signal Process. 2002, 50(2):174–188. 10.1109/78.978374
40. Patwari N, Hero AO, Perkins M, Correal NS, O’Dea RJ: Relative location estimation in wireless sensor networks. IEEE Trans. Signal Process. 2003, 51(8):2137–2148. 10.1109/TSP.2003.814469
41. Wireless Sensor Network Localization Measurement Repository. http://web.eecs.umich.edu/~hero/localize/
42. Garcia-Fernandez AF, Grajal J: Asynchronous particle filter for tracking using non-synchronous sensor networks. Signal Process. (Elsevier) 2011, 91(10):2304–2313. 10.1016/j.sigpro.2011.04.013
43. Djuric PM, Beaudeau J, Bugallo MF: Non-centralized target tracking with mobile agents, in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). May 2011, 5928–5931.
44. Rappaport TS: Wireless Communications: Principles and Practice. Upper Saddle River, NJ: Prentice Hall PTR; 2001.
Acknowledgements
This study was supported in part by the FPU fellowship from the Spanish Ministry of Science and Innovation; the program CONSOLIDER-INGENIO 2010 under grant CSD2008-00010 COMONSENS; the European Commission under grant FP7-ICT-2009-4-248894 (WHERE-2); and the Swedish Foundation for Strategic Research (SSF) and ELLIIT.
Competing interests
The authors declare that they have no competing interests.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
About this article
Cite this article
Savic, V., Zazo, S. Nonparametric generalized belief propagation based on pseudo-junction tree for cooperative localization in wireless networks. EURASIP J. Adv. Signal Process. 2013, 16 (2013). https://doi.org/10.1186/1687-6180-2013-16
Keywords
 Probability Density Function
 Spanning Tree
 Received Signal Strength
 Anchor Node
 Unknown Node