2.1 Overview
The TALA is a batch estimation algorithm that utilises all measurements generated within a time window by an array of sensors, in order to detect and localise an unknown number of target events (i.e. intermittent signals, such as acoustic impulses or radio frequency transmissions). Initially, the TALA generates “candidate” target locations, and then performs “soft” nearest neighbour data association (e.g. [21]), allowing each measurement to be associated with more than one candidate location. This approach removes the need to perform global multi-sensor/multi-target data association, e.g. as necessary in [20], thereby maintaining computational feasibility, even for large scale problems.
Using the measurements associated with each candidate location, ML estimation is then performed in order to localise each potential target. The ML estimation problem cannot be solved analytically, and an iterative G-N approach (e.g. [24]) is used to solve an equivalent non-linear least squares problem. The G-N approach performs iterative gradient descent, and in order to combat potential divergence, line search and randomisation are used to ensure that each iteration increases the value of the likelihood. It is noted that alternative techniques, such as the Newton-Raphson (N-R) approach (e.g. [25]) or the Levenberg-Marquardt algorithm [26, 27], could also be used to perform the gradient descent and may offer similar performance.
2.2 Summary of the main steps
The main steps in the TALA are as follows:
1. Step 1: Determine initial candidate locations
   - If possible (generally only in two-dimensional emitter geo-location), determine the intersection between measurements generated by each pair of sensors.
   - More generally, determine a candidate location that minimises a Mahalanobis-based distance metric using measurements generated by each pair of sensors.
   - These points form the initial candidate (target) location set.
   - In cases for which performing measurement intersection or Mahalanobis distance minimisation is problematic/impossible, initial candidate locations should be randomly sampled within the surveillance region.
2. Step 2: Associate measurements and determine likelihood for each candidate location
   - Determine the measurement from each sensor that has the greatest individual likelihood (or equivalently the smallest Mahalanobis distance) for each candidate location.
   - This measurement is associated with the location provided that the individual likelihood is greater than a pre-specified threshold value.
   - Each measurement is allowed to be associated with more than one candidate location.
   - The overall likelihood of each candidate location is calculated using all of the associated measurements.
3. Step 3: Candidate location deletion
   - The number of candidate locations can be large. To reduce the computational expense of the algorithm, some of the candidate locations are deleted at this stage.
   - A candidate location is deleted if it either has too few measurements associated with it or if it shares identical associations with another candidate target location that has a greater overall likelihood.
   - Optionally, the candidate location is deleted if it shares any associations with another candidate target location that has a greater overall likelihood.
4. Step 4: Maximum likelihood estimation
   - Using the candidate locations retained from Step 3, plus the measurements associated with each location, determine ML estimates via an iterative G-N approach.
   - Optionally, measurement reassociation may be performed on each iteration of the G-N algorithm.
5. Step 5: Final downselection/outputs
An illustrative example of the TALA is shown in Figs. 1 and 2.
2.3 Step 1: Determine initial candidate locations
An N sensor system is considered. Let n(i) denote the number of measurements, each of dimensionality \(d_{i}\), generated by sensor i. Let z(i,j) denote the j-th measurement generated by sensor i. It is assumed that target-generated measurements are corrupted by additive Gaussian noise. Hence, for a target located at coordinates \(\boldsymbol {X}\in \mathbb {R}^{3}\), each target-generated measurement at sensor i is given as follows:
$$\begin{array}{@{}rcl@{}} \boldsymbol{z}(i,.) &=& \boldsymbol{f}(\boldsymbol{X};i) + \boldsymbol{e}(i) \end{array} $$
(1)
where \(\boldsymbol {f}(\boldsymbol {X};i)\triangleq \left (f_{1}(\boldsymbol {X};i) \ \ldots \ f_{d_{i}}(\boldsymbol {X};i)\right)'\). Each measurement error \(\boldsymbol {e}(i)\sim {\mathcal N}(0, \boldsymbol {\Sigma }_{i})\), with \(\boldsymbol{\Sigma}_{i}\) denoting the error covariance of each target-generated measurement at sensor i.
The first step in the TALA is to generate a set of initial candidate location hypotheses, with these hypotheses then manipulated in order to determine ML estimates of the locations of an unknown number of targets. Therefore, it would seem prudent to choose candidate locations that are consistent with the measurements. To this end, the following methodology is used to generate initial candidate locations:
1. If the focal problem is concerned with the geo-location of targets within a two-dimensional region (e.g. the geo-location of ground-based targets within a geographically flat region), initial candidate locations can be determined as the intersection of each pair of measurements (if such an intersecting point exists). In later simulations, the intersections between pairs of AOA measurements and pairs of DDOA measurements are used to generate initial candidate locations.
2. For more complex applications in which measurement intersection cannot be performed (e.g. three-dimensional target geo-location, in which case the measurements are extremely unlikely to intersect because of the presence of measurement errors), for each pair of measurements \(\boldsymbol {\hat z}\triangleq \left (\boldsymbol {z}(i_{1},.)' \ \boldsymbol {z}(i_{2},.)'\right)'\), with \(i_{1}\neq i_{2}\), a candidate location \(\boldsymbol{X}_{c}\) can be determined by minimising the Mahalanobis distance between \(\boldsymbol {\hat z}\) and \(\boldsymbol {\hat f}(\boldsymbol {X})\triangleq \left (\boldsymbol {f}(\boldsymbol {X};i_{1})' \ \boldsymbol {f}(\boldsymbol {X};i_{2})'\right)'\), i.e.
$$\begin{array}{@{}rcl@{}} \boldsymbol{X}_{c} & = & \mathop{\text{arg min}}\limits_{\boldsymbol{X}\in \mathbb{R}^{3}} \left[ \boldsymbol{\epsilon(\boldsymbol{X})}' \boldsymbol{\hat \Sigma}^{-1} \boldsymbol{\epsilon(\boldsymbol{X})} \right] \end{array} $$
(2)
where \(\boldsymbol {\epsilon }(\boldsymbol {X})\triangleq (\boldsymbol {\hat z}-\boldsymbol {\hat f}(\boldsymbol {X}))\); and \(\boldsymbol {\hat \Sigma }\) is the error covariance of the measurement \(\boldsymbol {\hat z}\).
It may be necessary to limit the number of candidate locations by not considering all combinations of sensor measurements in determining the intersections (in two-dimensional applications) or minimising (2) (in three-dimensional applications). Moreover, in three-dimensional geo-location applications, the optimisation in Eq. (2) may not be straightforward, and it may be more efficient to randomly select candidate locations within the surveillance region.
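As an illustration of the minimisation in Eq. (2), the sketch below recovers a candidate location by a coarse grid search over a toy two-dimensional range-only geometry. The sensor positions, the stacked measurement function, and the unit error covariance are assumptions made for illustration only; they are not the AOA/DDOA models used later in the paper, and a grid search is just one simple way to perform the minimisation.

```python
import numpy as np

def candidate_location(z_hat, f_stacked, sigma_inv, grid_pts):
    """Return the grid point minimising the Mahalanobis distance of Eq. (2)."""
    best_x, best_d = None, np.inf
    for x in grid_pts:
        eps = z_hat - f_stacked(x)             # epsilon(X) = z_hat - f_hat(X)
        d = eps @ sigma_inv @ eps              # Mahalanobis-based metric
        if d < best_d:
            best_x, best_d = x, d
    return best_x, best_d

# Toy 2-D range-only geometry (sensor positions are an assumption):
sensors = np.array([[0.0, 0.0], [10.0, 0.0]])

def f_stacked(x):
    return np.linalg.norm(sensors - x, axis=1)  # stacked ranges to both sensors

z_hat = f_stacked(np.array([4.0, 3.0]))         # noise-free pair of measurements
sigma_inv = np.eye(2)                           # unit error covariance
grid = [np.array([i, j]) for i in np.arange(0.0, 10.5, 0.5)
        for j in np.arange(0.0, 10.5, 0.5)]
x_c, d_min = candidate_location(z_hat, f_stacked, sigma_inv, grid)  # -> [4.0, 3.0]
```

In practice a gradient-based minimiser would replace the grid search; the grid form simply makes the role of the metric in Eq. (2) explicit.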
2.4 Step 2: Associate measurements and determine likelihood for each candidate location
It is assumed that for each sensor, a maximum of one measurement is generated by each target in the time window under consideration. Furthermore, for each candidate target location X, and each sensor i:
1. The index a(X;i)∈{1,…,n(i)} of the associated measurement is the one with the largest individual likelihood, i.e. nearest neighbour data association is performed (e.g. see [21]).
2. If every measurement generated by a sensor has an individual likelihood that is less than 100ξ% of the maximum value \(l_{i}(\max)\), then no measurement from that sensor is associated with the location. The threshold ξ∈[0,1] is pre-specified. It is noted that this approach is equivalent to gating the measurement Mahalanobis distance with a threshold \(g= \sqrt {-2\ln \xi }\).
Therefore, a(X;i) is given as follows:
$$\begin{array}{@{}rcl@{}}a(\boldsymbol{X};i) & \triangleq & \left\{ {{\begin{array}{ll} \underset{j=1,\ldots,n(i)}{\text{arg max}} \ {l_{i}(\boldsymbol{X};j)} &\quad \text{if} \ \underset{j=1,\ldots,n(i)}{\max} l_{i}(\boldsymbol{X};j) \geq \xi \times l_{i}(\max)\qquad \\ \\ -1 &\quad \text{otherwise} \end{array}}} \right.\\ \end{array} $$
(3)
where \(l_{i}(\boldsymbol{X};j)\) denotes the individual likelihood at sensor i as a result of associating the j-th measurement z(i,j) with candidate target location X. For the measurement model (1), this individual likelihood is given as follows:
$$ \begin{aligned} l_{i}(\boldsymbol{X};j) &= \frac{1}{(2\pi)^{d_{i}/2}\det(\boldsymbol{\Sigma}_{i})^{1/2}}\exp\left(-\frac{1}{2} \left[\boldsymbol{z}(i,j)\right.\right.\\ & \quad\left.\left.-\boldsymbol{f}(\boldsymbol{X};i)\right]' \boldsymbol{\Sigma}_{i}^{-1}\left[\boldsymbol{z}(i,j)-\boldsymbol{f}(\boldsymbol{X};i)\right]{\vphantom{-\frac{1}{2}}}\right) \end{aligned} $$
(4)
It is noted that a(X;i)=−1 denotes that no measurement from sensor i is associated with the location X.
The overall measurement likelihood is then given as follows:
$$ \begin{aligned} L(\boldsymbol{X}) &= \frac{1}{(2\pi)^{D/2}\det(\boldsymbol{\Sigma})^{1/2}}\exp\left\{-\frac{1}{2}[\boldsymbol{Z}\right.\\ &\quad\left.-\boldsymbol{f}(\boldsymbol{X})]' \boldsymbol{\Sigma}^{-1}[\boldsymbol{Z}-\boldsymbol{f}(\boldsymbol{X})]{\vphantom{-\frac{1}{2}}}\right\} \end{aligned} $$
(5)
where:
$$\begin{array}{@{}rcl@{}} N_{a} &\triangleq& \text{total number of measurements associated with }\\ &&\text{the location \(\boldsymbol{X}\)} \end{array} $$
(6)
$$\begin{array}{@{}rcl@{}} &=&\sum_{i=1}^{N} \sum_{a(\boldsymbol{X};i)>-1}1 \end{array} $$
(7)
$$\begin{array}{@{}rcl@{}} \boldsymbol{Z} &\triangleq& \text{\small{concatenated vector of associated measurements}} \end{array} $$
(8)
$$\begin{array}{@{}rcl@{}} & = &(\boldsymbol{z}(1,a(\boldsymbol{X};1))' \ \ldots \ \boldsymbol{z}(N_{a},a(\boldsymbol{X};N_{a}))')' \end{array} $$
(9)
$$\begin{array}{@{}rcl@{}} && \quad\!\text{(with the sensor indices reordered to 1}, \ldots, N_{a})\\ \,\boldsymbol{f}(\boldsymbol{X}) \!&=&\!\! \left(\boldsymbol{f}(\boldsymbol{X};1)' \ \ldots \ \boldsymbol{f}(\boldsymbol{X};N_{a})'\right)' \end{array} $$
(10)
$$\begin{array}{@{}rcl@{}} D &=& \text{dimensionality of the concatenated vector of}\\ &&\text{all associated measurements} \end{array} $$
(11)
$$\begin{array}{@{}rcl@{}} &=& \sum_{i=1}^{N} \sum_{a(\boldsymbol{X};i)>-1} d_{i} \end{array} $$
(12)
$$\begin{array}{@{}rcl@{}} \boldsymbol{\Sigma} &=& \left(\begin{array}{cccc} \boldsymbol{\Sigma}_{1} & \boldsymbol{\Sigma}_{2,1} & \ldots & \boldsymbol{\Sigma}_{N_{a}, 1} \\ \boldsymbol{\Sigma}_{1,2} & \boldsymbol{\Sigma}_{2} & \ddots & \vdots \\ \vdots & \ddots & \ddots & \boldsymbol{\Sigma}_{N_{a},N_{a}-1} \\ \boldsymbol{\Sigma}_{1,N_{a}} & \ldots & \boldsymbol{\Sigma}_{N_{a}-1,N_{a}} & \boldsymbol{\Sigma}_{N_{a}} \\ \end{array} \right) \\ \end{array} $$
(13)
$$\begin{array}{@{}rcl@{}} \boldsymbol{\Sigma}_{i,j} & \triangleq & \text{correlation between the measurements at}\\ &&\text{sensors {i} and {j}} \end{array} $$
(14)
It is noted that if the measurements from all sensors are uncorrelated (i.e. \(\boldsymbol{\Sigma}_{i,j}=0\) for all i, j), the overall measurement likelihood at each candidate location X is given by the product of the individual likelihood values of the associated measurements, i.e.
$$\begin{array}{@{}rcl@{}} L(\boldsymbol{X}) & = & \prod_{i=1}^{N} \prod_{a(\boldsymbol{X};i)>-1} l_{i}(\boldsymbol{X};a(\boldsymbol{X};i)) \end{array} $$
(15)
More importantly, in this case, the ensemble of measurements that satisfy Eq. (3), for i=1,…, N, also maximises the overall measurement likelihood.
There is no practical reason why the nearest neighbour data association approach cannot be used if the measurements from different sensors are correlated. However, it should be noted that the resulting measurement set is not guaranteed to be close to optimal in maximising the overall measurement likelihood. In such cases, performing measurement reassociation during gradient descent (see Section 2.6.3) may be helpful in correctly resolving the complex multi-sensor/multi-target data association problem.
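The gated nearest-neighbour rule of Eqs. (3) and (4) can be sketched as follows. The scalar range-measurement model, sensor placement, and threshold value are assumptions made purely for illustration; the gate comparison uses the fact that the maximum attainable likelihood \(l_{i}(\max)\) is the Gaussian normalising constant, attained at zero residual.

```python
import numpy as np

def associate(x, measurements, f, sigma, xi):
    """Gated nearest-neighbour association (Eqs. (3)-(4)): return the index of
    the highest-likelihood measurement, or -1 if even the best one falls below
    xi times l_max (equivalently, outside the gate g = sqrt(-2 ln xi))."""
    d = sigma.shape[0]
    sigma_inv = np.linalg.inv(sigma)
    l_max = 1.0 / ((2 * np.pi) ** (d / 2) * np.sqrt(np.linalg.det(sigma)))
    liks = []
    for z in measurements:
        eps = z - f(x)                                  # residual z(i,j) - f(X;i)
        liks.append(l_max * np.exp(-0.5 * eps @ sigma_inv @ eps))
    j = int(np.argmax(liks))
    return j if liks[j] >= xi * l_max else -1

# Toy scalar range sensor at the origin (model and threshold are assumptions):
f = lambda x: np.array([np.linalg.norm(x)])
meas = [np.array([5.0]), np.array([9.0])]
j_near = associate(np.array([3.0, 4.0]), meas, f, np.eye(1), xi=0.5)    # -> 0
j_none = associate(np.array([30.0, 40.0]), meas, f, np.eye(1), xi=0.5)  # -> -1
```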
An exemplar likelihood map is shown in Fig. 1. It is noted that this map is shown for illustration only. The reader is reminded that the TALA calculates the likelihood only at the initial candidate locations and at the locations determined on subsequent iterations of the gradient descent algorithm.
2.5 Step 3: Candidate location deletion
Clearly, the number of candidate locations can be large. To reduce the computational expense of the algorithm, at this stage, some of the candidate locations are deleted. A candidate location is deleted if any of the following are true.
1. Deletion criterion 1: The candidate location does not have at least \(\mu P_{d} N\) measurements associated with it (i.e. it is not consistent enough with the data). This value is set by noting that the average number of measurements generated by each target is \(P_{d} N\) for a system with N sensors and a probability \(P_{d}\) that each target is detected by each sensor. In simulations, a value of μ=0.5 was shown to produce excellent results.
2. Deletion criterion 2: The candidate location has exactly the same measurements associated with it as another candidate location that has a greater overall likelihood.
3. Deletion criterion 3 (optional): The candidate location has one or more measurements associated with it that are also associated with another candidate location that has a greater overall likelihood. The procedure for implementing this deletion criterion is as follows:
   (a) The overall likelihood is calculated for each candidate location, using the procedure described in Step 2 of the algorithm.
   (b) The candidate location with the greatest overall likelihood is accepted as a potential target location.
   (c) Recursively, consider the candidate location with the next greatest likelihood. If this candidate location does not share any associations with any of the previously accepted candidate locations, it is also accepted as a potential target location; otherwise, it is deleted.
Deletion criterion 3 has the advantage of significantly reducing the number of candidate locations that need to be manipulated, and this can significantly reduce the computational expense of the algorithm. The disadvantage is that by deleting candidate locations at this early stage, the TALA has a reduced probability of detecting all target events. This criterion therefore compromises estimator performance for increased computational speed.
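A minimal sketch of deletion criterion 3, assuming each candidate is summarised by its overall likelihood and the set of indices of its associated measurements (the candidate values below are hypothetical):

```python
def greedy_downselect(candidates):
    """Deletion criterion 3: visit candidates in decreasing overall likelihood,
    accepting each one only if it shares no measurement association with a
    previously accepted candidate.  Each candidate is (likelihood, assoc_set)."""
    accepted, used = [], set()
    for lik, assoc in sorted(candidates, key=lambda c: -c[0]):
        if used.isdisjoint(assoc):
            accepted.append((lik, assoc))
            used |= assoc                    # reserve these measurements
    return accepted

# Hypothetical candidates; the two sharing measurement 2 conflict, so the
# weaker one (likelihood 4.0) is deleted:
kept = greedy_downselect([(5.0, {1, 2}), (4.0, {2, 3}), (3.0, {4, 5})])
```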
In Fig. 2b, the results of the intersection deletion step are shown for the exemplar scenario. It is noted that deletion criterion 3 is not used in this example.
2.6 Step 4: Maximum likelihood estimation
2.6.1 Background — standard Gauss-Newton approach
Consider the set of \(N_{a}\) measurements associated with a candidate location, calculated via Eq. (3). The ML estimate \(\hat {\boldsymbol {X}}_{MLE}\) of the target location, based on these measurements, is given as follows:
$$\begin{array}{@{}rcl@{}} \hat{\boldsymbol{X}}_{MLE} & = & \mathop{\text{arg max}}\limits_{\boldsymbol{X}} L(\boldsymbol{X}) \end{array} $$
(16)
$$\begin{array}{@{}rcl@{}} & = & \mathop{\text{arg min}}\limits_{\boldsymbol{X}} \left[ [\boldsymbol{Z}-\boldsymbol{f}(\boldsymbol{X})]' \boldsymbol{\Sigma}^{-1}[\boldsymbol{Z}-\boldsymbol{f}(\boldsymbol{X})]\right] \end{array} $$
(17)
with Z, f(X), and Σ given in Eqs. (9), (10), and (13) respectively.
The non-linear least squares problem (17) can be solved using the G-N approach (e.g. [24]). The G-N approach performs iterative gradient descent, starting with an initial estimate \(\boldsymbol{X}_{0}\), and generates a sequence of estimates as follows:
$$\begin{array}{@{}rcl@{}} \boldsymbol{X}_{k+1} &=& \boldsymbol{X}_{k} + \boldsymbol{\delta}_{k} \end{array} $$
(18)
where the full “Newton step” \(\boldsymbol{\delta}_{k}\) is given as follows:
$$ \boldsymbol{\delta}_{k} =\left[ \boldsymbol{F}(\boldsymbol{X}_{k})'\boldsymbol{\Sigma}^{-1}\boldsymbol{F}(\boldsymbol{X}_{k}) \right]^{-1} \boldsymbol{F}(\boldsymbol{X}_{k})' \boldsymbol{\Sigma}^{-1}[\boldsymbol{Z}-\boldsymbol{f}(\boldsymbol{X}_{k})] $$
(19)
The Jacobian matrix \(\boldsymbol{F}(\boldsymbol{X}_{k})\) is given as follows:
$$ \begin{aligned} \boldsymbol{F}(\boldsymbol{X}_{k}) \! =\! \left(\nabla_{\boldsymbol{X}_{k}}\,\boldsymbol{f}(\boldsymbol{X}_{k})' \right)'\,=\, \left(\nabla_{\boldsymbol{X}_{k}}\,\boldsymbol{f}(\boldsymbol{X}_{k};1)' \ldots \nabla_{\boldsymbol{X}_{k}}\,\boldsymbol{f}(\boldsymbol{X}_{k};N_{a})'\right)' \end{aligned} $$
(20)
where \(\nabla _{\boldsymbol {X}_{k}}\phantom {\dot {i}\!}\) is the first-order partial derivative operator with respect to \(\boldsymbol {X}_{k}\in \mathbb {R}^{3}\).
If the iterative scheme given in Eq. (18) converges, it will do so to a stationary point, thereby providing a ML estimate. However, convergence is not guaranteed and is highly dependent on the proximity of the initial estimate \(\boldsymbol{X}_{0}\) to the stationary value.
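The basic G-N iteration can be sketched for a toy two-dimensional range-only trilateration problem. The sensor layout, the noise-free measurements, and the unit measurement covariance are assumptions for illustration; the update applied is the standard Gauss-Newton step \(\boldsymbol{X}_{k+1}=\boldsymbol{X}_{k}+[\boldsymbol{F}'\boldsymbol{\Sigma}^{-1}\boldsymbol{F}]^{-1}\boldsymbol{F}'\boldsymbol{\Sigma}^{-1}[\boldsymbol{Z}-\boldsymbol{f}(\boldsymbol{X}_{k})]\).

```python
import numpy as np

def gauss_newton(z, f, jac, sigma_inv, x0, n_iter=50):
    """Plain G-N iteration: x <- x + [F' S F]^-1 F' S (z - f(x)), cf. Eq. (19)."""
    x = np.array(x0, dtype=float)
    for _ in range(n_iter):
        r = z - f(x)                                   # residual Z - f(X_k)
        F = jac(x)                                     # stacked Jacobian, Eq. (20)
        x = x + np.linalg.solve(F.T @ sigma_inv @ F, F.T @ sigma_inv @ r)
    return x

# Toy 2-D range-only trilateration (geometry and noise-free data are assumptions):
sensors = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
f = lambda x: np.linalg.norm(sensors - x, axis=1)

def jac(x):
    diffs = x - sensors            # d||x - s_i||/dx = (x - s_i)/||x - s_i||
    return diffs / np.linalg.norm(diffs, axis=1, keepdims=True)

z = f(np.array([4.0, 3.0]))        # measurements from the true location
x_hat = gauss_newton(z, f, jac, np.eye(3), x0=[5.0, 5.0])  # -> close to [4.0, 3.0]
```

With noise-free data and a reasonable initial estimate the iteration converges rapidly; the divergence risk discussed in the text arises with noisy data and poor initialisation.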
2.6.2 Implementation — Gauss-Newton approach with an adaptive step size
In light of the potential for the G-N approach to diverge, the implementation herein allows steps smaller than, and in the opposite direction to, the full “Newton step”, whilst attempting to maximise the increase in the overall measurement likelihood on each iteration. Specifically, the G-N approach is initialised with each initial candidate location. On each subsequent iteration, the location is modified as follows:
$$\begin{array}{@{}rcl@{}} \boldsymbol{X}_{k+1} &=& \boldsymbol{X}_{k} + \boldsymbol{\Lambda}_{k} \end{array} $$
(21)
where either:

- \(\boldsymbol{\Lambda}_{k}\) is the increment from the set \(\{\alpha\boldsymbol{\delta}_{k}/m: \alpha=-m,\ldots,-1,1,\ldots,m\}\) that results in the greatest increase in the overall measurement likelihood, where \(\boldsymbol{\delta}_{k}\) is the full Newton step (19) and m is a pre-specified positive integer;

or, if no step from the above set increases the overall measurement likelihood:

- \(\boldsymbol{\Lambda}_{k}\) is a step in a randomly generated direction (i.e. drawn from a Uniform distribution on [−π,π]) of magnitude \(\delta_{M}\) (nominally, \(\delta_{M}=200\) metres). This random step is accepted if it increases the overall measurement likelihood.
The G-N approach is terminated if either:
1. A total of 20 random steps have been attempted.
2. The magnitude of each component of the gradient of the normalised sum-of-squared errors (GNSSE) is smaller than a pre-specified value (nominally \(10^{-3}\)). Only in this case is successful convergence to a ML estimate deemed to have been achieved.
This “line search” adaptation of the G-N approach is similar to the line search approach detailed in Section 9.7 of [25]. In Fig. 2c, ML estimates calculated using the G-N approach are shown for the exemplar scenario.
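One possible sketch of the adaptive step selection is given below, assuming a two-dimensional state and a generic log-likelihood callable (increasing the log-likelihood is equivalent to increasing the likelihood). The quadratic example objective, the value m=5, and the single random-step attempt are assumptions for illustration; the full algorithm allows up to 20 random-step attempts.

```python
import numpy as np

def line_search_step(x, delta, loglik, m=5, delta_m=200.0, rng=None):
    """Try the candidate steps {alpha*delta/m : alpha = -m..-1, 1..m} and keep
    the one that most increases the log-likelihood; if none helps, attempt a
    single random step of magnitude delta_m, accepted only if it improves."""
    rng = rng if rng is not None else np.random.default_rng(0)
    base = loglik(x)
    steps = [a * delta / m for a in range(-m, m + 1) if a != 0]
    gains = [loglik(x + s) - base for s in steps]
    best = int(np.argmax(gains))
    if gains[best] > 0:
        return x + steps[best]
    theta = rng.uniform(-np.pi, np.pi)          # random direction (2-D case)
    step = delta_m * np.array([np.cos(theta), np.sin(theta)])
    return x + step if loglik(x + step) > base else x

# Quadratic toy log-likelihood peaked at [1, 0] (an assumption for illustration);
# here the full Newton step (alpha = m) is the best candidate:
loglik = lambda x: -np.sum((x - np.array([1.0, 0.0])) ** 2)
x_new = line_search_step(np.array([0.0, 0.0]), np.array([1.0, 0.0]), loglik)
```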
2.6.3 Reassociation during gradient descent
In scenarios in which the measurement errors are large, each initial candidate location (e.g. generated from the intersection of a pair of measurements) may be distant from the ML estimate. In such cases, the measurements associated with the initial candidate location may not be the nearest to each of the subsequent iterates \(\boldsymbol{X}_{k}\), k=1,2,…, of the G-N algorithm.
Motivated by this, in cases in which the measurements are inaccurate, reassociation can be performed after each iteration of the G-N approach. That is, having determined iterate \(\boldsymbol{X}_{k}\), reassociation is performed, and the measurements associated with location \(\boldsymbol{X}_{k}\) are used to determine the next increment \(\boldsymbol{\delta}_{k}\) and next iterate \(\boldsymbol{X}_{k+1}\).
Performing reassociation can significantly improve performance when measurement errors are large. However, this is at the cost of (i) increasing the computational expense of the algorithm and (ii) making the algorithm less likely to converge to a ML estimate, hence reducing the number of target events located.
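A sketch of reassociation during the descent, again for a toy two-dimensional range-only geometry with scalar measurements, unit noise covariance, and one clutter measurement per sensor (all illustrative assumptions). Before each Newton step, every sensor's associated measurement is re-selected as the one nearest to the current prediction:

```python
import numpy as np

def gn_with_reassociation(meas_lists, f_i, jac_i, x0, n_iter=30):
    """G-N where, before each Newton step, every sensor's associated measurement
    is re-selected as the one nearest to the predicted value at the current
    iterate (scalar measurements, unit noise covariance assumed)."""
    x = np.array(x0, dtype=float)
    n = len(meas_lists)
    for _ in range(n):
        pass
    for _ in range(n_iter):
        pred = np.array([f_i(x, i) for i in range(n)])
        z = np.array([min(ms, key=lambda m: abs(m - pred[i]))
                      for i, ms in enumerate(meas_lists)])       # reassociation
        F = np.array([jac_i(x, i) for i in range(n)])
        x = x + np.linalg.solve(F.T @ F, F.T @ (z - pred))       # Newton step
    return x

# Toy 2-D range sensors; each sensor also sees one clutter measurement
# (geometry and clutter values are assumptions for illustration):
sensors = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
f_i = lambda x, i: np.linalg.norm(x - sensors[i])
jac_i = lambda x, i: (x - sensors[i]) / np.linalg.norm(x - sensors[i])

true_x = np.array([4.0, 3.0])
meas_lists = [[f_i(true_x, 0), 12.0], [f_i(true_x, 1), 1.0], [f_i(true_x, 2), 15.0]]
x_hat = gn_with_reassociation(meas_lists, f_i, jac_i, x0=[5.0, 4.0])
```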
2.7 Step 5: Final downselection/outputs
Having determined the ML estimates in Step 4, downselection is performed in order to ensure that each measurement is associated with no more than one ML estimate. The procedure for performing this downselection is exactly the same as that given in the optional deletion criterion 3 in Step 3. It is noted that if the optional criterion is performed in Step 3, and provided that reassociation is not performed during the gradient descent in Step 4, then this downselection has already been performed.
A final downselection step also deletes estimates that lie within the sensor perimeter. Such estimates are rare, but can occur because of incorrect associations, or convergence to the wrong point of intersection of the associated measurements.
The remaining ML estimates provide estimates of the target event locations. The approximate error covariance (denoted \(\boldsymbol {\mathcal C}(\boldsymbol {X}^{\star })\)) of each estimate \(\boldsymbol{X}^{\star}\) is given by the inverse of the observed Fisher information matrix [28]. This covariance is as follows:
$$\begin{array}{@{}rcl@{}} \boldsymbol{\mathcal C}(\boldsymbol{X}^{\star}) &\approx & \left[\boldsymbol{F}(\boldsymbol{X}^{\star})'\boldsymbol{\Sigma}^{-1}\boldsymbol{F}(\boldsymbol{X}^{\star})\right]^{-1} \end{array} $$
(22)
The matrix Σ is again given by Eq. (13), and the matrix F(.) is given by Eq. (20). In Fig. 2d, the final outputs of the target localisation algorithm are shown for the exemplar scenario.
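Eq. (22) can be evaluated directly once the matrix F(·) of Eq. (20) is available at the estimate. The toy two-dimensional range-only geometry and unit noise covariance below are assumptions for illustration:

```python
import numpy as np

# Evaluate the observed-Fisher-information covariance (Eq. 22) at an estimate
# x_star for a toy 2-D range-only geometry (layout and noise are assumptions):
sensors = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
x_star = np.array([4.0, 3.0])

diffs = x_star - sensors
F = diffs / np.linalg.norm(diffs, axis=1, keepdims=True)   # F(x_star), Eq. (20)
sigma_inv = np.eye(3)                                      # Sigma^-1, cf. Eq. (13)
C = np.linalg.inv(F.T @ sigma_inv @ F)                     # approximate covariance
```

The resulting matrix is symmetric positive definite, and its diagonal gives per-coordinate error variances for the reported estimate.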