A sensor selection approach to maneuvering target tracking based on trajectory function of time

In this paper, we propose a computationally efficient sensor selection approach for maneuvering target tracking using a sensor network with communication bandwidth constraints, given limited prior information on the target maneuvering models. We formulate the stochastic sensor selection problem as a linear programming problem which consists of two easily implementable steps. First, the Cramér–Rao lower bound corresponding to the sensor subset is derived as the objective function of the proposed sensor selection method based on a partially observable Markov decision process. Second, the target trajectory is modeled by a function of time to enable online target tracking which is free of the conventional, a priori Markov modeling of the target dynamics. We demonstrate the effectiveness of our method through several numerical examples.


Related works
Sensor selection can be framed as a linear programming problem based on the partially observable Markov decision process (POMDP) framework [2,3,7,11]. A POMDP allows an optimal policy for choosing actions to be computed even though some important information may not be observed. In this way, a decision-theoretic approach can be taken that uses the sensor nodes' observations and the reward obtained from the management actions to tailor the choices. Within the POMDP framework, the action space is in general infinite and continuous, but in practice it is often assumed to be a finite set of actions.
Different methods have been proposed to solve the optimal policy problem. [9] formulated the sensor selection in a Bayesian framework and estimated the information provided by a multi-sensor system for a given scene via Bayesian reasoning. [40] introduced several practically feasible measures of information utility. The main idea in these approaches was to select the sensors with the most useful information gain. In addition to information metrics, sensor selection has also been based on the optimization of other performance indicators or functions [5,21]. Notably, the posterior CRLB (PCRLB) was derived for the nonlinear filter in [30], which provides a theoretical performance limit for a Bayesian estimator. It has attracted the interest of many researchers in sensor management, e.g., [12,13,22,28,29]. In particular, [12,13,22] focused on measurement origin uncertainty and proposed the concept of the information reduction factor to calculate the PCRLB in the presence of false alarms, while [8] investigated the case of a detection probability less than unity. [20] applied the conditional PCRLB, which depends on the actual observation data and adapts to the particular realization of the system state.
Furthermore, different forms of the optimization problem have been proposed to solve the sensor selection problem. In [14], the sensor selection was formulated as a linear programming problem under linear measurement models and solved via convex optimization. [4] extended [14] to nonlinear measurement models. Sensor selection under resource constraints has also received attention. For instance, [10] defined the selection problem as a knapsack problem, with the goal of guaranteeing good performance at low cost, and proposed a heuristic algorithm based on a greedy strategy. [24,25] decomposed the joint resource allocation problem into subproblems and solved them using the Karush-Kuhn-Tucker optimality conditions. A modified particle swarm optimization was used to solve the sensor scheduling problem in [38]. [27] relaxed the constrained resource allocation to an unconstrained Markov decision process via Lagrangian relaxation.
However, all these bounds and approaches rely heavily on correct Markov-jump modeling of the target dynamics, a requirement that can hardly be met for a maneuvering target when little prior information is available about the target dynamics and the sensor statistics.

Our contribution and paper organization
In this paper, we consider a sensor network consisting of bearing-only sensors whose bearing measurements are given by the direction of arrival (DOA) [15]. The proposed method uses an efficient two-step process to obtain the optimal subset of sensor nodes to be activated. In the first step, all sensors whose detection radius/field of view (FoV) covers the target are selected as the candidate set. Then, a specified number of nodes is extracted from that candidate set such that the communication constraints are satisfied and the best performance is achieved. Furthermore, the target tracking is decomposed into two modules. In the first module, the selected sensor nodes transfer their current measurements to the information fusion center, where the target location is estimated via the least squares (LS) method [31]. In the second module, we use the trajectory function of time (T-FoT) approach [17][18][19] to describe the movement of the target for tracking. Compared with most model-based filters, the data-driven T-FoT approach has the advantage of requiring little prior information about the target maneuvering and about the process and measurement noises.
The main contributions of this work can be summarized as follows:
• We consider the challenging scenario in which the target is non-cooperative and moves with completely unknown maneuvering.
• We extend the POMDP framework to the T-FoT tracking approach, where the target location is determined by an LS estimator. The T-FoT approach accommodates missing knowledge about the target dynamics and the background noises.
• The CRLB of the target localization mean error with regard to DOA sensors is used as the objective function for sensor selection. We propose two CRLB-based strategies: one selects a fixed number of sensor nodes to fulfill the bandwidth constraint, and the other activates as few sensor nodes as possible while meeting a CRLB constraint.
The rest of this paper is organized as follows. The system model is introduced in Sect. 2. The two proposed sensor selection approaches and the simulation study are given in Sects. 3 and 4, respectively. The paper is concluded in Sect. 5.

Measurement model
The measurement model of the passive DOA sensor $i$ can be written as

$$z_k^i = \arctan\frac{y_k - y_k^i}{x_k - x_k^i} + v_k^i, \tag{1}$$

where $(x_k, y_k)$ is the position of the target at time $k$, $(x_k^i, y_k^i)$ is the position of sensor $i$, and $v_k^i$ is zero-mean Gaussian noise, $v_k^i \sim \mathcal{N}(0, R_k^i)$. Hereafter, the measurement noise of each sensor is assumed to be independent of that of the other sensors. The measurements from all $n$ activated/selected sensors at time $k$ are collected as $Z_k = \{z_k^i\}_{i=1}^{n}$.
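To make the measurement model concrete, the following is a minimal sketch of simulating noisy bearings. The function name `doa_measurement`, the sensor layout, and the use of `atan2` (to resolve the quadrant) are illustrative assumptions rather than part of the original formulation.

```python
import math
import random

def doa_measurement(target, sensor, sigma, rng):
    """Noisy bearing (rad) from `sensor` to `target`; both are (x, y) in metres."""
    dx = target[0] - sensor[0]
    dy = target[1] - sensor[1]
    # atan2 is used instead of arctan(dy / dx) so that all quadrants are handled.
    return math.atan2(dy, dx) + rng.gauss(0.0, sigma)

rng = random.Random(0)                      # fixed seed for repeatability
target = (500.0, 500.0)
sensors = [(0.0, 0.0), (1000.0, 0.0), (500.0, 1500.0)]  # hypothetical layout
sigma = math.pi / 180                       # std of 1 degree, i.e. R = sigma**2
Z = [doa_measurement(target, s, sigma, rng) for s in sensors]
```

Here `Z` plays the role of the collected measurement set at one time step.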

Target localization using DOA
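A typical scenario of DOA target localization is shown in Fig. 1. In this section, we drop the time subscript $k$ for simplicity. Denote the bearing of the target relative to sensor $i$ by $\theta^i$, for which we have

$$\tan\theta^i = \frac{y - y^i}{x - x^i}. \tag{2}$$

Eq. (2) can be rewritten compactly as a relation that is linear in the target position $X = [x, y]^T$, namely $B^i X = M^i$, where $B^i$ and $M^i$ are determined by $\theta^i$ and the position of sensor $i$. Stacking $B^i$ and $M^i$ of the $n$ selected sensors into $B$ and $M$, the LS estimate of the target location [32] is

$$\hat{X} = (B^T B)^{-1} B^T M.$$

Here, $n \ge 2$ is needed for the matrix $B^T B$ to be positive definite.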
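As an illustration of bearing-based LS localization, the sketch below assumes one common linearization of the bearing relation, $B^i = [\sin\theta^i, -\cos\theta^i]$ and $M^i = x^i \sin\theta^i - y^i \cos\theta^i$; this particular choice, the function name `ls_localize`, and the sensor layout are assumptions and may differ from the exact form used in the paper.

```python
import math

def ls_localize(sensors, bearings):
    """LS position from bearings, solving (B^T B) X = B^T M for a 2-D target."""
    a11 = a12 = a22 = b1 = b2 = 0.0
    for (xi, yi), th in zip(sensors, bearings):
        s, c = math.sin(th), math.cos(th)
        m = xi * s - yi * c            # M^i for the row B^i = [s, -c]
        a11 += s * s                   # accumulate B^T B
        a12 += -s * c
        a22 += c * c
        b1 += s * m                    # accumulate B^T M
        b2 += -c * m
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        raise ValueError("B^T B is singular: bearings are (nearly) parallel")
    x = (a22 * b1 - a12 * b2) / det
    y = (a11 * b2 - a12 * b1) / det
    return x, y

target = (500.0, 500.0)
sensors = [(0.0, 0.0), (1000.0, 0.0), (500.0, 1500.0)]   # hypothetical layout
bearings = [math.atan2(target[1] - sy, target[0] - sx) for sx, sy in sensors]
x_hat, y_hat = ls_localize(sensors, bearings)            # noise-free check
```

With noise-free bearings the estimate coincides with the true position up to rounding; the singularity check reflects the requirement of at least two sensors with non-parallel bearings.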

Target movement modeling by T-FoT
The performance of the standard state-space model depends on how well the Markov model matches the true target dynamics. For a non-cooperative maneuvering target, it is practically impossible to precisely capture the time-varying motion by a Markov-jump model. To address this challenge, we apply the T-FoT approach [17][18][19] for target tracking, which is free of Markov-jump modeling. Decomposing the real target trajectory $f(t)$ into its coordinates (e.g., x-position, y-position), the polynomial T-FoT method fits the motion model as

$$f(t) = F_k(t; C_k) + e_k(t), \tag{5}$$

where $t \in \mathbb{R}^+$ indicates the continuous time, $k = 1, 2, \ldots$ denotes the discrete time instant, $f(t)$ denotes the target trajectory in the considered dimension, $F_k(t; C_k)$ is the corresponding T-FoT with parameter set $C_k$, and $e_k(t)$ denotes the fitting error with regard to $f(t)$.
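The polynomial T-FoT of order $\gamma$ can be written as

$$F_k(t; C_k) = c_{k,0} + c_{k,1} t + \cdots + c_{k,\gamma} t^{\gamma},$$

for which the parameter set is $C_k \triangleq [c_{k,0}, c_{k,1}, \ldots, c_{k,\gamma}]$. The order of the polynomial determines the complexity of the model. The typical motion models, such as the constant velocity (CV) and constant acceleration (CA) models, correspond to $\gamma = 1$ and $\gamma = 2$, respectively, i.e., $F_k(t; C_k) = c_{k,0} + c_{k,1} t$ for CV and $F_k(t; C_k) = c_{k,0} + c_{k,1} t + c_{k,2} t^2$ for CA. In practical applications, sliding time-window fitting with a second-order polynomial is applicable to most smooth trajectories. The 2-D T-FoT can be described as

$$F_k(t; C_k) = \begin{bmatrix} F_k^x(t; C_k^x) \\ F_k^y(t; C_k^y) \end{bmatrix}.$$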
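The polynomial fitting can be sketched as an ordinary (unweighted) least-squares fit over a sliding time window followed by extrapolation. The names `fit_tfot` and `eval_tfot`, the window contents, and the unweighted criterion are assumptions made for illustration; the actual T-FoT estimator may weight the fitting errors.

```python
def fit_tfot(times, values, order):
    """Least-squares coefficients [c0, ..., c_order] of a polynomial in t."""
    n = order + 1
    # Normal equations A c = b with A[r][q] = sum t^(r+q), b[r] = sum v * t^r.
    A = [[sum(t ** (r + q) for t in times) for q in range(n)] for r in range(n)]
    b = [sum(v * t ** r for t, v in zip(times, values)) for r in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for q in range(col, n):
                A[r][q] -= f * A[col][q]
            b[r] -= f * b[col]
    # Back substitution.
    c = [0.0] * n
    for r in range(n - 1, -1, -1):
        c[r] = (b[r] - sum(A[r][q] * c[q] for q in range(r + 1, n))) / A[r][r]
    return c

def eval_tfot(c, t):
    """Evaluate F(t; C) = c0 + c1*t + ... with Horner's rule."""
    result = 0.0
    for coef in reversed(c):
        result = result * t + coef
    return result

window = [0.0, 1.0, 2.0, 3.0, 4.0]                        # sliding time window
y_est = [500.0 + 20.0 * t + 5.0 * t * t for t in window]   # CA-like y positions
C_y = fit_tfot(window, y_est, order=2)
y_pred = eval_tfot(C_y, 5.0)                               # one-step prediction
```

The synthetic data corresponds to an acceleration of 10 m/s² (coefficient 5 on $t^2$); the initial velocity of 20 m/s is a made-up value. Using time relative to the start of the window keeps the normal equations well conditioned.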

Methods
In this section, we describe the proposed sensor selection methods. We formulate the sensor selection problem within a POMDP framework, in conjunction with the CRLB of the target localization mean error, to tackle the problem that the observers (e.g., sensor nodes) cannot reliably identify the underlying actual target states. Our method extends the POMDP framework by integrating the T-FoT approach to address the unknown target dynamic model.

POMDP framework based on T-FoT
The core idea of the POMDP is to choose the optimal selection command by minimizing a cost function or maximizing a reward function. At time step $k$, the POMDP can be defined as the tuple

$$\langle S,\ Z_s,\ g(\cdot \mid X_k, s),\ F(\cdot; C_k),\ \mu(s; \cdot) \rangle,$$

where $S$ is a finite set of sensor selection commands, $Z_s$ is a finite set of observations under the command set $S$, $g(\cdot \mid X_k, s)$ is the measurement model conditioned on the command $s \in S$ and the target state, $F(\cdot; C_k)$ is the estimated T-FoT at time $k$, and $\mu(s; \cdot)$ is the objective function obtained by executing an action command $s \in S$.
At the core of our POMDP framework, the objective function $\mu(s; \cdot)$ is defined as the CRLB $u_{lb}(s_k; \hat{X}_{k+1})$ of the pseudo-localization error of the target conditioned on the measurements from the activated sensors, which in turn depends on the selection command $s$ (see Sect. 3.2). Here, the estimated/predicted state $\hat{X}_{k+1} = F(k + 1; C_k)$ is obtained from the estimated T-FoT [19] (see Sect. 3.3) rather than from a Markov-jump model, which is indispensable prior information in traditional methods. This is the key difference between our approach and existing POMDP approaches [6,16].
Typically, the sensor selection needs to meet a specific constraint. In this paper, we consider two practical constraints: either the number of sensors to be selected is fixed, or the minimum number of sensors is selected such that a prescribed CRLB is met. For these two cases, the optimal selection command is given by (9) and (10), respectively:

$$s_k^* = \arg\min_{s_k \subseteq S_k} u_{lb}(s_k; \hat{X}_{k+1}) \quad \text{s.t.} \quad |s_k| = n_s, \tag{9}$$

where $S_k \subseteq S$ denotes the candidate sensor set at time $k$, $|s_k^*|$ denotes the number of selected sensors, and $n_s$ is the specified number of sensors to be selected, and

$$s_k^* = \arg\min_{s_k \subseteq S_k} |s_k| \quad \text{s.t.} \quad u_{lb}(s_k; \hat{X}_{k+1}) \le T_{lb}, \tag{10}$$

where $T_{lb}$ is the required CRLB that the selected sensors must meet.
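The fixed-number selection in (9) can be illustrated by an exhaustive search over all subsets of the required size. Note that this brute-force search is a stand-in chosen for clarity, not the linear programming formulation of the paper; `select_fixed`, `toy_cost`, and the `info` values are hypothetical, with `toy_cost` replacing the CRLB objective.

```python
from itertools import combinations

def select_fixed(candidates, n_s, cost):
    """Return the n_s-subset of `candidates` with the smallest cost (brute force)."""
    if n_s > len(candidates):
        raise ValueError("not enough candidate sensors")
    return min(combinations(candidates, n_s), key=cost)

# Toy stand-in for the CRLB: every sensor contributes some "information",
# and the cost of a subset is the inverse of its total information.
info = {"a": 1.0, "b": 4.0, "c": 2.0, "d": 0.5}

def toy_cost(subset):
    return 1.0 / sum(info[i] for i in subset)

best = select_fixed(sorted(info), 2, toy_cost)
```

The number of subsets grows combinatorially with the candidate set, which is one reason to restrict the search to sensors near the predicted target position.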

CRLB with regard to DOA
The CRLB provides the lower bound of the variance of unbiased estimators of a deterministic parameter under specific measurement conditions, which can be used to evaluate the detection capability of different sensor node subsets.
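For an unbiased estimator $\hat{X}_k(Z_k)$ of a parameter vector $X_k$ based on the measurement vector $Z_k$, the CRLB for the error covariance matrix is defined to be the inverse of the Fisher information matrix (FIM), denoted by $J$, as follows:

$$E\left\{[\hat{X}_k(Z_k) - X_k][\hat{X}_k(Z_k) - X_k]^T\right\} \ge J_k^{-1}, \tag{11}$$

where $E$ denotes the expectation and the inequality (11) means that the difference between the error covariance matrix on the left-hand side and $J_k^{-1}$ is positive semi-definite.

Now, consider the predicted target state $\hat{X}_k = F(k; C_{k-1})$ obtained from the estimated T-FoT. The $n$-sensor extension of the DOA measurement function in Eq. (1) is

$$Z_k = H(X_k) + V_k,$$

where $H(X_k) = [h^1(X_k), \ldots, h^n(X_k)]^T$ with $h^i(X_k) = \arctan\frac{y_k - y_k^i}{x_k - x_k^i}$, and $V_k = [v_k^1, \ldots, v_k^n]^T$. Under the premise that $z_k^1, \ldots, z_k^n$ are conditionally independent of each other, the PDF of the collected measurements $Z_k = \{z_k^i\}_{i=1}^{n} \sim \mathcal{N}(\theta, R_k)$ can be expressed as

$$p(Z_k \mid X_k) = \frac{1}{\sqrt{(2\pi)^n |R_k|}} \exp\left(-\frac{1}{2}(Z_k - \theta)^T R_k^{-1} (Z_k - \theta)\right), \tag{14}$$

where $\theta = H(X_k)$ is the mean of the measurements and $R_k = \mathrm{diag}(R_k^1, R_k^2, \ldots, R_k^n)$. The FIM is given by the second-order derivatives of the logarithm of the measurement PDF with respect to $X_k$,

$$J(X_k) = -E\left\{\nabla_{X_k} \nabla_{X_k}^T \ln p(Z_k \mid X_k)\right\}. \tag{15}$$

Substituting Eq. (14) into Eq. (15), the FIM based on DOA measurements is

$$J(X_k) = \left(\frac{\partial H(X_k)}{\partial X_k}\right)^T R_k^{-1}\, \frac{\partial H(X_k)}{\partial X_k}.$$

Expanding $H(X_k)$ and taking the first-order partial derivatives with respect to $X_k$ gives

$$\frac{\partial h^i(X_k)}{\partial X_k} = \left[-\frac{y_k - y_k^i}{(d_k^i)^2},\ \frac{x_k - x_k^i}{(d_k^i)^2}\right],$$

where $(d_k^i)^2 = (x_k - x_k^i)^2 + (y_k - y_k^i)^2$. Thus, $J(X_k)$ can be computed as

$$J(X_k) = \sum_{i=1}^{n} \frac{1}{R_k^i (d_k^i)^4} \begin{bmatrix} (y_k - y_k^i)^2 & -(x_k - x_k^i)(y_k - y_k^i) \\ -(x_k - x_k^i)(y_k - y_k^i) & (x_k - x_k^i)^2 \end{bmatrix}.$$

Finally, the CRLB $u_{lb}(s_k; X_k)$ is obtained from the inverse $J^{-1}(X_k)$, evaluated for the sensors activated by the command $s_k$.

In fitting the T-FoT over the sliding time window, $\hat{X}_t$ denotes the estimate of the target state at time $t$, $e_t$ is the fitting error between $\hat{X}_t$ and the state given by the T-FoT, and $\Sigma_{e_t}$ is the covariance of the fitting error, cf. (5).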
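The FIM for independent bearing-only sensors can be evaluated directly as a sum of per-sensor 2×2 terms, as in the sketch below. Using the trace of the inverse FIM as the scalar bound is an assumption made here (it has units of m², matching a threshold given in m²); the function names and geometries are likewise illustrative.

```python
import math

def doa_fim(target, sensors, variances):
    """Entries (J11, J12, J22) of the 2x2 position FIM for bearing-only sensors."""
    j11 = j12 = j22 = 0.0
    for (xi, yi), r in zip(sensors, variances):
        dx, dy = target[0] - xi, target[1] - yi
        d2 = dx * dx + dy * dy              # squared sensor-target distance
        w = 1.0 / (r * d2 * d2)             # 1 / (R^i * d^4)
        j11 += w * dy * dy
        j12 += -w * dx * dy
        j22 += w * dx * dx
    return j11, j12, j22

def crlb_trace(target, sensors, variances):
    """trace(J^-1); returns inf when the geometry makes J singular."""
    j11, j12, j22 = doa_fim(target, sensors, variances)
    det = j11 * j22 - j12 * j12
    if det <= 1e-12 * (j11 + j22) ** 2:
        return math.inf
    return (j11 + j22) / det                # trace of the 2x2 inverse

r = (math.pi / 180) ** 2                    # bearing variance in rad^2
target = (500.0, 500.0)
good = crlb_trace(target, [(0.0, 500.0), (500.0, 0.0)], [r, r])     # orthogonal
bad = crlb_trace(target, [(0.0, 500.0), (1000.0, 500.0)], [r, r])   # collinear
```

For the orthogonal pair at distance $d = 500$ m, the FIM is diagonal and the bound reduces to $2 R d^2$; for the collinear pair the FIM is singular, so the position is unobservable from those two bearings alone.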

Algorithm summary
The proposed sensor selection algorithm is summarized in Algorithm 1. Based on the POMDP framework, the interaction of the tracked target with the sensor selection strategy can be described by the following three steps (see also Fig. 2):
1. At any time step $k$, the estimated T-FoT has parameters $C_k$, which can be used to predict the target state $\hat{X}_{k+1} = F(k + 1; C_k)$ for time $k + 1$.
2. The sensor network perceives pseudo-observations $\hat{Z}_{k+1}$ through the known stochastic observation model $g(\cdot \mid X_k, s_k)$ and the predicted target state $\hat{X}_{k+1}$. This yields the expression of the objective function $u_{lb}(s_k; \hat{X}_{k+1})$.
3. Find the optimal selection command $s_k^*$ from the candidate set $S_k \subseteq S$ by optimizing the objective function $u_{lb}(s_k; \hat{X}_{k+1})$ subject to the applicable constraints. Here, the candidate set can be defined as the subset of all sensors that lie within a limited distance of the target.
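The prediction in step 1 and the construction of the candidate set in step 3 can be sketched as follows. The coefficient values, the sensor positions, the 300 m radius, and the function names are hypothetical and serve only to show the data flow.

```python
import math

def eval_poly(coeffs, t):
    """Evaluate c0 + c1*t + c2*t^2 + ... at time t."""
    return sum(c * t ** p for p, c in enumerate(coeffs))

def predict_position(C_x, C_y, t):
    """Predicted 2-D position from per-coordinate polynomial T-FoT parameters."""
    return eval_poly(C_x, t), eval_poly(C_y, t)

def candidate_set(sensors, predicted, radius):
    """Indices of the sensors lying within `radius` of the predicted position."""
    px, py = predicted
    return [i for i, (sx, sy) in enumerate(sensors)
            if math.hypot(sx - px, sy - py) <= radius]

C_x = [500.0, 30.0]          # first-order (CV-like) fit in x, made-up values
C_y = [500.0, 0.0, 5.0]      # second-order (CA-like) fit in y, made-up values
x_pred = predict_position(C_x, C_y, 2.0)
sensors = [(550.0, 500.0), (1500.0, 1500.0), (600.0, 600.0), (560.0, 900.0)]
S_k = candidate_set(sensors, x_pred, radius=300.0)
```

The resulting index list `S_k` is what a selection routine would then search over.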

Results and discussion
In this section, we present simulations to validate the effectiveness of the proposed methods. We first describe the simulation setup and then discuss the results.

Simulation setup
We consider 100 DOA sensors, of which 50 use $R_k^i = (\pi/180)^2\ \mathrm{rad}^2$ and the other 50 use $R_k^i = (\pi/360)^2\ \mathrm{rad}^2$. They are uniformly distributed over the ROI, whose size is 3500 m × 2500 m, and are marked in different colors in Fig. 3. The yellow circle indicates the detection range of a sensor node and the pink circle indicates its communication range. The measurements of these 100 sensors are mutually independent. The starting position of the target is [500 m, 500 m]. In the x coordinate, the state of the target evolves according to a CV model. In the y coordinate, the target dynamics follow a maneuvering model with accelerations of 10 m/s² and −10 m/s² in the first and second stages, respectively. Our approach uses a first-order polynomial T-FoT in the x-dimension and a second-order polynomial T-FoT in the y-dimension. The parameters C x k , C

Fixed number of sensor selection
In this simulation, the optimization goal for the sensor selection is given as in Eq. (9) with n s = 3.
As shown in Fig. 3, the optimal subset of sensors selected online by the proposed algorithm is distributed reasonably evenly around the target. The tracking accuracy depends not only on the measurement precision but also on the positions of the sensor nodes relative to that of the target. In particular, the accuracy is poorest when two sensor nodes and the target lie on a straight line. In the case shown in Fig. 3b, where two of the nearest sensor nodes and the target lie on the same line, the optimal CRLB-based selection is given by three sensor nodes that are distributed around the target and have low measurement noise. The root mean square errors (RMSEs) of both the T-FoT and IMM-EKF trackers against time, using either the CRLB-based or the random sensor selection algorithm, are given in Fig. 4. The average RMSEs and computing times are given in Table 1. The results clearly show that the CRLB-based sensor selection algorithm performs better than the random sensor selection algorithm in terms of tracking accuracy, at the price of a higher computational burden. Meanwhile, the T-FoT performs better than the IMM-EKF when the CRLB-based selection algorithm is used, but the two perform similarly under random sensor selection.
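For reference, the position RMSE at one time step over several Monte Carlo runs can be computed as in the minimal sketch below; the function name and the two example runs are made up for illustration and are not results from the paper.

```python
import math

def position_rmse(estimates, truth):
    """RMSE of (x, y) estimates from several runs against one true position."""
    squared = [(ex - truth[0]) ** 2 + (ey - truth[1]) ** 2 for ex, ey in estimates]
    return math.sqrt(sum(squared) / len(squared))

runs = [(503.0, 504.0), (497.0, 496.0)]      # hypothetical estimates from two runs
rmse = position_rmse(runs, (500.0, 500.0))   # each run is 5 m off, so RMSE is 5 m
```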

Adaptive number of sensor selection
We now consider the case of activating as few sensor nodes as possible while still meeting the CRLB threshold requirement, corresponding to (10), using the same group of sensors as in the previous simulation. To this end, a greedy algorithm is used to find the optimal sensor subset. First, an optimal subset of $n_s$ sensors is selected as in solving (9). If the corresponding CRLB exceeds $T_{lb}$, we gradually increase the number of sensors to be selected and re-solve (9) until the CRLB falls below $T_{lb}$, at which point the minimum optimal sensor subset is obtained. We refer to this as the adaptive number of sensor selection because the number of selected sensors differs from one time step to another.
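The adaptive procedure described above can be sketched as a loop that grows the subset size until the threshold is met. As before, the exhaustive inner search and the toy cost standing in for the CRLB are assumptions for illustration; `select_adaptive`, `toy_cost`, and the `info` values are hypothetical.

```python
from itertools import combinations

def select_adaptive(candidates, cost, threshold, n_start=2):
    """Smallest best-cost subset with cost <= threshold, growing from n_start."""
    for n in range(n_start, len(candidates) + 1):
        best = min(combinations(candidates, n), key=cost)   # solve (9) for size n
        if cost(best) <= threshold:
            return best
    return None   # the threshold cannot be met even with all candidates

# Toy stand-in for the CRLB: cost is the inverse of the total "information".
info = {"a": 1.0, "b": 4.0, "c": 2.0, "d": 0.5}

def toy_cost(subset):
    return 1.0 / sum(info[i] for i in subset)

chosen = select_adaptive(sorted(info), toy_cost, threshold=0.15)
unmet = select_adaptive(sorted(info), toy_cost, threshold=0.01)
```

In this toy case the best pair has cost 1/6, which exceeds 0.15, so the loop moves on to three sensors; a threshold that no subset can meet yields `None`, a case a real implementation would have to handle explicitly.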
Here, we use the threshold $T_{lb} = 5\ \mathrm{m}^2$. The RMSEs of both the T-FoT and IMM-EKF trackers against time, using the CRLB-based (fixed or adaptive number of sensors) or the random sensor selection algorithm, are given in Fig. 5. The number of sensors against time in the adaptive number of sensor selection using the T-FoT approach, in one Monte Carlo run and averaged over 100 Monte Carlo runs, is given in Fig. 6. The average RMSEs and computing times for all methods are given in Table 2. These results show that:
1. The adaptive number of sensor selection outperforms the fixed number of sensor selection, whether the selection is CRLB-based or random. In addition, the tracking accuracy of the IMM-EKF improves more significantly than that of the T-FoT when the adaptive number of sensor selection is used, but the IMM-EKF still performs worse than the T-FoT.
2. The computing time of the adaptive number of sensor selection does not rise significantly compared with that of the fixed number of sensor selection.
3. The average MSE of the T-FoT estimator is smaller than the CRLB threshold because the latter is based on the current information only, while the T-FoT estimator uses all the information in the time window.
In summary, the proposed sensor selection approaches, whether using a fixed or an adaptive number of sensors, perform well with the T-FoT approach despite the target maneuvering. Both approaches improve the tracking performance at an acceptable computational cost.

Conclusion
In this paper, we consider the scenario of tracking a non-cooperative maneuvering target using a network of bearing-only passive sensors with limited power and wireless bandwidth. Our approach integrates the T-FoT method into the POMDP framework and minimizes the CRLB of the target localization mean error. We design two sensor selection strategies: one selects a fixed number of sensors that minimizes the CRLB to achieve satisfactory target tracking under the bandwidth constraint, and the other selects as few sensors as possible under a CRLB constraint. The simulation results confirm the effectiveness of the approach. A potential direction for future work is to address the multi-target tracking problem.