Joint target tracking and identification in complex traffic scene
EURASIP Journal on Advances in Signal Processing volume 2022, Article number: 119 (2022)
Abstract
In practical road traffic scenes, targets are usually subject to high ground clutter, high and variable mobility, and high nonlinearity, which make target tracking and identification challenging. Moreover, tracking and identification are usually interdependent in reality, so it is promising to solve them jointly. In this paper, we propose a novel joint tracking and identification (JTI) scheme to handle such problems involving coupled tracking and identification, i.e., JTI problems. Specifically, we formulate the JTI problem in complex traffic scenes using a hybrid system. Then, by exploiting the generalized Bayes risk for JTI, we derive an analytical estimator and decider that account for the coupling of tracking and identification under complex road target motion. Furthermore, an estimation strategy based on the unscented Kalman filter and expected-mode augmentation (UKF-EMA) is developed to improve both estimation and decision performance. In addition, a joint performance evaluation metric is presented to assess the joint performance of the proposed JTI scheme. Finally, two simulation examples under different traffic scenarios demonstrate that the proposed JTI approach outperforms the traditional tracking-then-identification and identification-then-tracking methods in joint performance.
1 Introduction
With the rapid development of intelligent transportation, vehicle tracking and identification, two fundamental tasks of traffic monitoring, have recently attracted great interest from both industry and academia [1,2,3,4,5]. Specifically, vehicle tracking aims to estimate the target state (e.g., position, velocity, and acceleration), while identification aims to determine which class the target belongs to [6, 7], such as car, bus, tanker, or ambulance. In practical intelligent transportation systems, however, tracking and identification are inherently coupled and affect each other. Furthermore, in practical complex traffic scenes, tracking and identification become more challenging due to complicated and changeable mobility, high nonlinearity, heavy ground clutter interference, and so on.
To address the above challenges, various schemes have been designed to improve tracking and identification performance in complex traffic scenes. So far, there are four kinds of methods for problems involving both tracking and identification [8]: (a) separate tracking and identification [6, 9], where tracking and identification are handled independently without considering their coupling at all; (b) identification-then-tracking, in which identification is made first without considering tracking, and tracking is then based on this identification without accounting for possible identification errors; (c) tracking-then-identification [10, 11], in which tracking is done first and identification is then based on it; (d) density-based methods, which are beyond the scope of this paper (this paper addresses point inference).
However, the above existing methods cannot work well [8] because the internal relationship between tracking and identification is not fully explored. Specifically, tracking can provide state information for different target classes, while identification helps tracking by selecting appropriate identity-dependent kinematic models. Therefore, a joint tracking and identification (JTI) approach is promising for improving tracking and identification performance jointly. Essentially, JTI is a joint decision and estimation (JDE) problem [12] with two coupled goals: decision and estimation. For such problems, an integrated JDE paradigm was proposed in [12], which fully utilizes the coupling between decision and estimation and achieves superior joint performance [13,14,15,16,17]. Within this JDE framework, a conditional JDE (CJDE) approach was proposed in [8], which is computationally simple and has superior joint performance.
Although CJDE is superior for solving joint problems, it cannot be applied directly to JTI in complex traffic scenes due to the particularities and complexities of the problem. Among the many difficulties of JTI in complex traffic scenes, this paper considers two typical and common ones: complicated mobility and high nonlinearity. To capture the complicated motion and obtain satisfactory estimation performance, the multiple-model (MM) approach is usually used for tracking [18]; it includes fixed-structure MM (FSMM) and variable-structure MM (VSMM) [1, 19, 20]. In the former, a large number of models are needed to improve the estimation performance. However, using many models increases the computational burden considerably; furthermore, the performance may deteriorate because too many models cause excessive “competition” from the “unnecessary” models. Therefore, we adopt the VSMM estimation method, since it can utilize the online mode information and is superior to an FSMM algorithm in both performance and computation for complicated real-world problems [19, 21].
For nonlinear target tracking, there are generally density-based and point-estimation-based methods. The former approximate the posterior distribution and thus have high computational complexity [22, 23], while the latter are simpler and thus adequate for practical applications [24, 25]. Nonlinear point estimators include the extended Kalman filter (EKF) [26], the unscented Kalman filter (UKF) [27], which uses deterministic sampling to compute the moments, the quadrature KF (QKF) [28], the cubature KF [29], and so on. For identification, studies mainly focus on video-based methods, which have been proven to be efficient in road target information processing [30, 31]. However, under complex environmental and illumination conditions, identification using these methods is difficult and may even fail.
In view of the above, solving JTI in complex traffic scenes within the JDE framework faces the following difficulties. First, appropriate models are required for problem formulation. The models are expected to describe the practical target state evolution faithfully, to incorporate the target identity and the coupling between state and identity effectively, and to remain mathematically tractable. Second, a tracking and identification solution that accounts for their coupling and the practical complexities is required. Specifically, on the one hand, the complex mobility and high nonlinearity should be handled appropriately so that the JTI solution fits reality; on the other hand, the coupling between tracking and identification should be fully utilized.
Motivated by the above, this paper proposes a novel scheme for the JTI problem in complex traffic scene. First, we present a hybrid system containing a dynamic model and a measurement model, which can describe the complex mobility and also the coupling between tracking and identification. Based on this model, we focus on solving the JTI problem within the JDE framework.
We present a generalized Bayes risk for JTI, which unifies tracking and identification. Then, we derive a joint solution by minimizing this JTI risk. Specifically, for estimation, we propose a new estimation strategy based on expected-mode augmentation (EMA) [20], named the UKF-EMA strategy. Here, UKF is adopted due to its superior performance and its adaptability to the JDE framework; EMA is utilized due to its superiority in handling complex motion. For decision, a decider is provided by incorporating the effect of estimation on decision. This JTI solution has an explicit form, fully exploits the coupling between target state and identity, and utilizes the characteristics of complex traffic scenes. Furthermore, a joint performance metric is provided, which considers both tracking and identification errors. Simulation results verify that the proposed JTI approach outperforms traditional two-step methods in joint performance.
More specifically, the contributions of this work are summarized as follows:

We propose a hybrid system for practical JTI problems in complex traffic scene. In this system, both the complicated target state evolution and the coupling between tracking and identification are incorporated.

We propose a novel and tractable JTI approach for JTI problems in complex traffic scene. A joint risk is first presented, which unifies estimation and decision errors. Then, we derive an analytical JTI solution containing an estimator and a decider, with their coupling accounted for. Specifically, a UKF-EMA estimation strategy is proposed, chosen for its estimation accuracy under nonlinearity and complex motion. Finally, we present an efficient JTI algorithm.

We examine the performance of the proposed JTI approach in practical complex traffic scene, where motion at a road corner and a crossroads is representatively considered. The results verify that the proposed JTI approach can utilize the coupling between tracking and identification and finally outperforms the traditional methods in joint performance.
This paper is organized as follows. Section 2 formulates the JTI problem in complex traffic scene. Section 3 proposes an applicable JTI approach by considering the characteristics of JTI problems and also the coupling between tracking and identification. Also presented is a joint performance metric. Section 4 presents simulation results and analyses. Section 5 concludes the paper.
2 Problem formulation
2.1 Problem description
Figure 1 illustrates two typical JTI problems in complex traffic scene. A target with multiple possible classes moves on the road, where different classes have different dynamics. Specifically, in Fig. 1a, a car and a bus are moving at the corner of the road with different mobility, i.e., different turn rates. In Fig. 1b, a car and a bus are moving at the crossroads with different motion modes. Here, we aim to jointly identify and track the target using multiple sensor data under complex traffic scenes (intersection, high and variable motions, large ground clutter interference, etc.).
Tracking and identification are highly coupled in this problem. Accurate tracking provides the target’s location and motion information, which promotes identification. Correct identification reveals more about the target’s behavior, which helps tracking. Therefore, this is essentially a JTI problem, and good solutions require solving the tracking and identification problems jointly.
2.2 Modeling
Let \(x_{k}\) denote the target state (position, velocity, acceleration, etc.) at time k, and let \(c_{i}\) denote target class i, which belongs to the possible class set \(\{1,\ldots ,N\}\). In the JTI problem, tracking is to obtain the state estimate \({\hat{x}}_{k}\), while identification is to determine the target identity \(c_{i}\). Therefore, our goal is to obtain \(\{ {\hat{x}}_{k},c_{i}\}\) jointly.
As is analyzed above, a hybrid system is expected to take both the complex target motion and the coupling between tracking and identification into consideration. Therefore, we propose the following hybrid system for JTI in complex traffic scene. For target class \(c_{i}\), the state evolution and measurement models are given by
where k is the time index; \(f_{k}^{i}(\cdot )\) and \(h_{k}^{i}(\cdot )\) are the state transition function and measurement function, respectively, which can be either linear or nonlinear; and \(w_{k}^{i}\) and \(v_{k}^{i}\) are zero-mean Gaussian white process and measurement noises with covariance matrices \(Q_{k}\) and \(R_{k}\), respectively. Note that the superscript i denotes target class i. Therefore, different target classes have different system models (1).
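A form of model (1) consistent with the definitions above is sketched here. The additive-noise structure and the indexing \(x_{k+1}=f_{k}^{i}(x_{k})+\cdots\) are assumptions made for illustration.

```latex
\begin{aligned}
x_{k+1} &= f_{k}^{i}(x_{k}) + w_{k}^{i}, \\
z_{k}   &= h_{k}^{i}(x_{k}) + v_{k}^{i},
\end{aligned}
\qquad
w_{k}^{i}\sim \mathcal{N}(0,Q_{k}),\quad v_{k}^{i}\sim \mathcal{N}(0,R_{k}).
```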
Remark 1
Equation (1) describes the motion model of class \(c_{i}\), i.e., target class is related to the motion model. Meanwhile, the motion model basically describes the evolution of target state. Therefore, the motion model relates the target state and class. Based on these, the relationship between the target state and class is as follows: Target classes differ from each other in motion models.
The state transition function \(f_{k}^{i}(x_{k})\) and measurement function \(h_{k}^{i}(x_{k})\) not only describe the transition of state and measurement, but also provide sufficient flexibility. For example, \(f_{k}^{i}(x_{k})\) and \(h_{k}^{i}(x_{k})\) can be either linear or nonlinear, time-invariant or time-varying, etc. This fits the practical traffic scene in which the target motion may be complicated and changeable over time.
Remark 2
As road target JTI in complex traffic scene is typically a JDE problem, good solutions require solving tracking and identification jointly. In the following, we first review the existing JDE approach. Then, as the main part of this paper, we propose an applicable JTI approach to solve the JTI problem in complex traffic scene.
3 Methods
3.1 Motivation
For JDE problems, [12] proposed an integrated JDE framework based on a new generalized Bayes risk, as follows:
where \(D^{i}\) and \(H^{j}\) are the ith decision and the jth hypothesis, respectively; x is the true target state with \({\hat{x}}\) being its estimate; \({\bar{C}}(x,{\hat{x}})\) is the cost of estimating x by \({\hat{x}}\); \(c_{ij}\) is the cost of deciding on \(D^{i}\) when \(H^{j}\) is true, and \(E[\bar{C}(x,{\hat{x}})|D^{i},H^{j}]\) is the corresponding expected estimation cost; and \(\alpha _{ij}\) and \(\beta _{ij}\) are weight factors. This joint framework is optimal in the joint performance by accounting for the coupling between decision and estimation. Within this framework, we develop a conditional JDE (CJDE) approach by introducing the online data [8]. CJDE inherits the theoretical advantages of JDE but has much simpler calculation.
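Written out with the symbols just defined, a generalized Bayes risk of this kind takes the following form. The joint probability \(P\{D^{i},H^{j}\}\) used as the weighting term is an assumption here, since it is not named in the surrounding text.

```latex
\bar{R} = \sum_{i}\sum_{j}\left(\alpha_{ij}c_{ij}
        + \beta_{ij}\,E[\bar{C}(x,\hat{x})\,|\,D^{i},H^{j}]\right)P\{D^{i},H^{j}\}
```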
Although CJDE has many advantages for problems involving coupled decision and estimation like JTI, it cannot be directly applied to JTI in complex traffic scene. That is because in such problems, targets usually exhibit high maneuverability and high nonlinearity, which are not considered in the original CJDE approach. Due to these complexities, the JTI solution in complex traffic scene is difficult to obtain.
Therefore, considerable effort is needed to overcome these difficulties and achieve a joint solution. Specifically, an appropriate estimation strategy suited to practical traffic scenes is required, one that not only delivers superior estimation performance but can also be easily integrated into the JDE framework. Besides, the coupling between estimation and decision needs further exploration so as to improve the joint performance.
In the following, we propose an applicable JTI approach by accounting for the characteristics of JTI in complex traffic scene and also the adaptability to the JDE framework.
3.2 JTI solution in complex traffic scene
We propose the following CJDE risk for the JTI problem:
where \(H^{j},D^{i},\alpha _{ij}\) and \(\beta _{ij}\) are the same as in the JDE risk (2), z is the online data, and
is the expected estimation cost when \(H^{j}\) is true but \(D^{i}\) is decided, in which \(C(x,{\hat{x}})\) is the cost of estimating x by \({\hat{x}}\).
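For orientation, a conditional risk consistent with these definitions can be written as follows. This is a sketch that assumes the CJDE risk mirrors the JDE risk with every term conditioned on the online data z; the exact expression in [8] may be arranged differently.

```latex
R^{c}(z) = \sum_{i}\sum_{j}\left(\alpha_{ij}c_{ij}
         + \beta_{ij}\,\xi_{ij}(z)\right)P\{D^{i},H^{j}\,|\,z\},
\qquad
\xi_{ij}(z) = E[C(x,\hat{x})\,|\,D^{i},H^{j},z].
```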
To obtain the JTI estimation and decision results, we need to minimize the above CJDE risk.
3.2.1 Estimator
Suppose the decision \(D^{i}\) is given and the estimation cost \(C(x,{\hat{x}})\) has the quadratic form, i.e., \(C(x,{\hat{x}})={\tilde{x}}^{\prime }{\tilde{x}}\) with \({\tilde{x}}=x-{\hat{x}}\). Then, the optimal JTI estimation which minimizes the JTI risk \(R^{c}(z)\) is the following generalized posterior mean:
where \({\hat{x}}^{(j)}\) is the state estimate under hypothesis \(H^{j}\). \(\bar{P}_{i}\{H^{j}|z\}\) is the generalized posterior probability, given by
where \(P\{H^{j}|z\}\) is the posterior hypothesis probability of \(H^{j}\).
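The generalized posterior mean can be illustrated with a minimal sketch. It assumes that the generalized posterior probability reweights \(P\{H^{j}|z\}\) by \(\beta _{ij}\) and renormalizes over j; the function names and all numbers are hypothetical.

```python
def generalized_posterior(beta_i, post):
    """Reweight P{H^j|z} by beta_ij and renormalize (assumed form)."""
    weighted = [b * p for b, p in zip(beta_i, post)]
    total = sum(weighted)
    return [w / total for w in weighted]


def jti_estimate(beta_i, post, x_hats):
    """Generalized posterior mean: sum_j x_hat^(j) * P_bar_i{H^j|z}."""
    p_bar = generalized_posterior(beta_i, post)
    dim = len(x_hats[0])
    return [sum(p_bar[j] * x_hats[j][d] for j in range(len(x_hats)))
            for d in range(dim)]


# Hypothetical numbers: two hypotheses, 2-D state, decision D^1.
beta_1 = [1.5, 1.0]                    # beta_11, beta_12
post = [0.4, 0.6]                      # P{H^1|z}, P{H^2|z}
x_hats = [[10.0, 1.0], [12.0, 3.0]]    # hypothesis-conditioned estimates
x_check_1 = jti_estimate(beta_1, post, x_hats)
```

With these numbers the two reweighted probabilities are equal, so the estimate under \(D^{1}\) lies midway between the two hypothesis-conditioned estimates even though \(H^{2}\) has the larger posterior probability.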
3.2.2 Decider
According to the Bayes decision rule, the optimal decision is the candidate with the smallest Bayes risk. Thus, given the expected estimation cost \(\xi _{ij}(z)\), the optimal JTI decider D chooses the decision whose posterior cost is the smallest:
in which the posterior cost
It can be seen that in order to obtain the JTI decider D, the key is to determine the posterior cost \({\mathbf {C}}^{i}(z)\). Specifically, as \(\alpha _{ij},c_{ij},\beta _{ij}\) are design parameters which are already given, it is the expected estimation cost \(\xi _{ij}(z)\) and the posterior hypothesis probability \(P\{H^{j}|z\}\) that affect \({\mathbf {C}}^{i}(z)\). Thus, in the following, we focus on determining \(\xi _{ij}(z)\) and \(P\{H^{j}|z\}\).
For \(\xi _{ij}(z)\), with the linear Gaussian assumption and estimation cost \(C(x,{\hat{x}})={\tilde{x}}^{\prime }{\tilde{x}}\), we can get that
where \({\hat{x}}^{(j)}\) is the state estimate under hypothesis \(H^{j}\), \({\check{x}}^{(i)}\) is the estimate under decision \(D^{i}\), and \(\mathrm {mse}({\hat{x}}^{(j)}|H^{j},z)\) is the estimation mean square error (mse) under \(H^{j}\).
For the posterior hypothesis probability \(P\{H^{j}|z\}\), following the Bayes rule, we can get:
where \(P\{H^{j}\}\) is the prior probability and \(f(z|H^{j})\) is the measurement likelihood of \(H^{j}\).
So far, the JTI solution \(\{{\check{x}},D\}\) containing an estimator (5) and a decider (7) is presented. This joint solution has an analytical form, which makes it more practicable. More importantly, the coupling between decision and estimation is fully taken into account.
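The decider can be sketched in a few lines. The sketch assumes the posterior cost has the form \({\mathbf {C}}^{i}(z)=\sum _{j}(\alpha _{ij}c_{ij}+\beta _{ij}\xi _{ij}(z))P\{H^{j}|z\}\) and uses the expected estimation cost from (8); the function names and all numbers are hypothetical.

```python
def expected_estimation_cost(mse_j, x_hat_j, x_check_i):
    """xi_ij = mse(x_hat^(j)) + (x_hat^(j) - x_check^(i))'(x_hat^(j) - x_check^(i))."""
    gap = sum((a - b) ** 2 for a, b in zip(x_hat_j, x_check_i))
    return mse_j + gap


def posterior_costs(alpha, c, beta, xi, post):
    """C^i(z) = sum_j (alpha_ij*c_ij + beta_ij*xi_ij) * P{H^j|z} (assumed form)."""
    n = len(post)
    return [sum((alpha[i][j] * c[i][j] + beta[i][j] * xi[i][j]) * post[j]
                for j in range(n))
            for i in range(n)]


def decide(costs):
    """Return the index of the decision with the smallest posterior cost."""
    return min(range(len(costs)), key=costs.__getitem__)


# Hypothetical two-class example with a scalar state.
alpha = [[1.0, 1.0], [1.0, 1.0]]
c = [[0.0, 1.0], [1.0, 0.0]]           # c_ii = 0, c_ij = 1
beta = [[1.0, 1.0], [1.0, 1.0]]
post = [0.9, 0.1]                      # P{H^1|z}, P{H^2|z}
x_hats = [[0.0], [2.0]]                # hypothesis-conditioned estimates
mses = [1.0, 1.0]                      # mse under each hypothesis
x_checks = [[0.2], [0.2]]              # decision-conditioned estimates
xi = [[expected_estimation_cost(mses[j], x_hats[j], x_checks[i])
       for j in range(2)] for i in range(2)]
costs = posterior_costs(alpha, c, beta, xi, post)
decision = decide(costs)               # 0 corresponds to D^1
```

The estimation cost enters the decision through \(\xi _{ij}(z)\), which is how the estimator influences the decider in this scheme.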
However, for practical JTI in complex traffic scene, the concrete decider and estimator \(\{{\check{x}},D\}\) are difficult to determine, mainly because of the complicated motion patterns, e.g., high mobility, variability, and nonlinearity. In the following part, we determine the concrete joint solution by considering the peculiarities of JTI in complex traffic scene.
3.3 Determination of JTI tracker and identifier in complex traffic scene
To gain full insight into the JTI solution, we conduct a detailed analysis. The JTI estimator is a weighted sum of the hypothesis-conditioned estimates \({\hat{x}}^{(j)}\), where the weight factor is related to the hypothesis probability \(P\{H^{j}|z\}(j=1,2,\ldots ,N)\). The JTI decider chooses the decision candidate with the smallest posterior cost, which is mainly determined by the hypothesis-conditioned estimate \({\hat{x}}^{(j)}\) and the hypothesis probability \(P\{H^{j}|z\}\).
In view of the above, the core of obtaining the JTI solution is to determine the hypothesis-conditioned estimate \({\hat{x}}^{(j)}\) and the corresponding hypothesis probability \(P\{H^{j}|z\}\). Therefore, an appropriate estimation strategy is needed, which should satisfy two basic requirements:
(1) It has accurate estimation performance for both linear and nonlinear systems;
(2) Through this estimation, the hypothesis probability can be easily obtained.
With these requirements, in order to derive the JTI solution, we focus on determining \({\hat{x}}^{(j)}\) and \(P\{H^{j}|z\}(j=1,2,\ldots ,N)\) in the following parts.
3.3.1 Determination of \({\hat{x}}^{(j)}\)
Determination of each hypothesis-conditioned estimate
For estimation, it has been demonstrated that the variable-structure multiple-model (VSMM) approach has superior performance and low computational complexity. Note that tracking and identification in complex scenes is a difficult problem due to the complicated and changeable target motion. Therefore, VSMM is very suitable for this problem.
The essential issue of the VSMM approach is model set adaptation (MSA). Many MSA methods have been proposed, among which expected-mode augmentation (EMA) is widely used and extensively researched. In the EMA approach, the original set of models is augmented by a variable set of models intended to match the expected value of the unknown true mode. Specifically, the newly activated models are generated adaptively in real time as probabilistically weighted sums of mode estimates over the model set.
By combining the variable-structure interacting multiple-model (VSIMM) algorithm with the EMA approach, we propose to use the EMA-VSIMM algorithm in this paper. Specifically, the EMA-VSIMM algorithm consists of six steps: (1) probability prediction; (2) MSA using the EMA approach; (3) interaction/mixing of the estimates; (4) filtering in each filter; (5) probability update; and (6) estimate fusion. Compared to the fixed-structure MM method, the adaptive model set in this algorithm is obtained using the EMA algorithm. More details about EMA-VSIMM can be found in [32].
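The model set adaptation step can be illustrated with a minimal sketch. It assumes that each model is characterized by a single scalar mode parameter (a turn rate here) and that one expected-mode model is added per cycle; the full EMA-VSIMM algorithm in [32] contains more steps, and the names and numbers below are hypothetical.

```python
def expected_mode(params, probs):
    """Probability-weighted sum of the mode parameters over the model set."""
    total = sum(probs)
    return sum(p * w for p, w in zip(params, probs)) / total


def augment_model_set(base_params, probs):
    """Return the base model set plus one model placed at the expected mode."""
    return list(base_params) + [expected_mode(base_params, probs)]


# Hypothetical base set of turn rates (rad/s) and current model probabilities.
base_turn_rates = [-0.1, 0.0, 0.1]
model_probs = [0.1, 0.3, 0.6]
active_set = augment_model_set(base_turn_rates, model_probs)
```

The augmenting model moves with the model probabilities from cycle to cycle, which lets a small base set cover a continuous range of modes.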
Based on the above, we propose the following estimation strategy for JTI, as illustrated in Fig. 2. Assume that there are two possible target classes (\(H^{1}\) and \(H^{2}\)), and under each class, there are M models composing the EMA filter, i.e., M is the total number of models in the EMA filter (for any target class).
In the lower layer, each basic filter (e.g., KF, UKF, etc.) runs and outputs the model-based state estimate \({\hat{x}}_{jq}\) and the corresponding model probability \(\mu _{jq}\), where \(j=1,2\) and \(q=1,\ldots ,M\). Here, the subscript j denotes the jth hypothesis (i.e., \(H^{j}\)), while q denotes the qth model. Then, under each hypothesis \(H^{j}\), through the EMA estimation process, the state estimate \({\hat{x}}^{(j)}\) and the hypothesis probability \(P\{H^{j}\}\) (\(j=1,2\)) can be obtained.
In the upper layer, the JTI approach runs based on \(({\hat{x}}^{(1)},P\{H^{1}|z\})\) and \(({\hat{x}}^{(2)},P\{H^{2}|z\})\), which are output by the EMA estimator under hypotheses \(H^{1}\) and \(H^{2}\), respectively. Finally, the JTI solution (\({\hat{x}},D\)) can be obtained by Eqs. (5) and (7).
Determination of each model-based estimator
Considering the above two requirements for estimation, a new estimation strategy is required. For the linear case, the Kalman filter can be applied easily, since it is optimal in the minimum mean square error (MMSE) sense and it outputs the analytical estimation result and the corresponding model probability. However, this paper considers the nonlinear case, which is much more common in practical traffic scenes.
For the nonlinear case, we propose to use the UKF, as it satisfies the requirements mentioned earlier in Sect. 3.3:
(a) It has satisfactory estimation performance and low computational cost for the nonlinear estimation problem;
(b) It can output the required posterior hypothesis probability.
Suppose that under hypothesis \(H^{j}(j=1,2,\ldots ,N)\), there are M models in total in the model set for EMA, and \(m_{jq}\) denotes the qth (\(q=1,2,\ldots ,M\)) model in the model set given \(H^{j}\). Then, as shown in Fig. 2, every basic filter (based on model \(m_{jq}\)) is a UKF. For each \(m_{jq}\) (\(q=1,2,\ldots ,M\)), the estimation process based on the UKF is as follows.
The UKF is based on the unscented transformation (UT), whose basic idea is as follows. For the nonlinear transformation \(y=f(x)\), x is the n-dimensional state vector with mean \({\bar{x}}\) and covariance P. We generate \(2n+1\) sigma points X with corresponding weights \(\omega\) to compute the statistics of y. Specifically, one cycle of the unscented Kalman filter is as follows:
(1) Given \({\hat{x}}_{k-1|k-1},P_{k-1|k-1}\), compute the one-step predicted state \({\hat{x}}_{k|k-1}\) and the prediction error covariance matrix \(P_{k|k-1}\).
(a) Compute the \(\sigma\) points \(\xi _{k-1|k-1}^{(i)},i=1,2,\ldots ,2n\), that is,
(b) Calculate the \(\sigma\) points \(\xi _{k|k-1}^{(i)},i=1,2,\ldots ,2n\), propagated through the state evolution function, that is,
(2) Propagate \({\hat{x}}_{k|k-1},P_{k|k-1}\) through the measurement equation using the UT.
(a) Calculate the propagation of the \(\sigma\) points of \({\hat{x}}_{k|k-1},P_{k|k-1}\) through the measurement equation to \(x_{k}\), i.e.,
(b) Compute the one-step prediction of the output, i.e.,
(3) After obtaining the new measurement \(z_{k}\), update the following quantities:
where \(K_{k}\) is the filter gain.
Based on the above steps, the state estimate based on each model \(m_{jq}\) (\(q=1,2,\ldots ,M\)) and the corresponding estimation MSE can be obtained by (10) and (11), respectively.
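The unscented transformation underlying each filter cycle can be sketched as follows. This is a minimal version that assumes the basic symmetric sigma-point set with a single scaling parameter kappa; the weights used in the paper may be parameterized differently, and all function names are hypothetical.

```python
import math


def cholesky(a):
    """Lower-triangular L with L L' = a, for a symmetric positive-definite a."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low


def sigma_points(mean, cov, kappa=0.0):
    """Return 2n+1 sigma points and weights matching the given mean and covariance."""
    n = len(mean)
    scale = n + kappa
    root = cholesky([[scale * cov[i][j] for j in range(n)] for i in range(n)])
    points = [list(mean)]
    weights = [kappa / scale]
    for col in range(n):
        offset = [root[row][col] for row in range(n)]
        points.append([m + o for m, o in zip(mean, offset)])
        points.append([m - o for m, o in zip(mean, offset)])
        weights += [0.5 / scale, 0.5 / scale]
    return points, weights


def unscented_transform(f, mean, cov, kappa=0.0):
    """Propagate sigma points through f; return the mean and covariance of y = f(x)."""
    points, weights = sigma_points(mean, cov, kappa)
    ys = [f(p) for p in points]
    m = len(ys[0])
    y_mean = [sum(w * y[d] for w, y in zip(weights, ys)) for d in range(m)]
    y_cov = [[sum(w * (y[r] - y_mean[r]) * (y[c] - y_mean[c])
                  for w, y in zip(weights, ys))
              for c in range(m)] for r in range(m)]
    return y_mean, y_cov


# Sanity check on a linear map, for which the transformation is exact.
def linear_map(x):
    return [2.0 * x[0], x[0] + x[1]]


y_mean, y_cov = unscented_transform(linear_map, [1.0, 2.0],
                                    [[4.0, 0.0], [0.0, 1.0]], kappa=1.0)
```

In a filter cycle the same routine would be applied twice, once with the state transition function and once with the measurement function, with the noise covariances added to the resulting covariances.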
Besides, the probability of each model \(m_{jq}(q=1,2,\ldots ,M)\) can be determined as follows. Under the Gaussian assumption, the probability of each model is calculated by
where \(L_{jq}(k)\) is the likelihood of the model \(m_{jq}\); \(r_{jq}(k)\) and \(S_{jq}(k)\) are its residual and covariance, respectively, which can be given by UKF as follows:
The probability prediction is given by
where \(\mu _{jq}(k|k-1)\) is the predicted probability from time \(k-1\) to k, \(\mu _{jp}(k-1)\) is the probability of the pth (\(p=1,2,\ldots ,M\)) model at time \(k-1\), and \(\pi _{pq}\) is the (p, q)th element of the transition probability matrix (TPM) for EMA. Note that the likelihood \(L_{jq}(k)\) is the basis for computing the hypothesis probability, which in turn plays an important role in obtaining the JTI solution.
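The probability prediction and update for the models under one hypothesis can be sketched as follows. The sketch assumes the usual multiple-model recursion, in which the updated probability is proportional to the predicted probability times the model likelihood, and it uses a scalar residual for the Gaussian likelihood; names and numbers are hypothetical.

```python
import math


def gaussian_likelihood(residual, s):
    """Likelihood of a scalar residual r ~ N(0, s).

    The vector case uses r' S^{-1} r in the exponent and |2*pi*S| in the
    normalizing term; the scalar form is a simplification for illustration.
    """
    return math.exp(-0.5 * residual * residual / s) / math.sqrt(2.0 * math.pi * s)


def predict_model_probs(prev_probs, tpm):
    """mu_q(k|k-1) = sum_p pi_pq * mu_p(k-1)."""
    m = len(prev_probs)
    return [sum(tpm[p][q] * prev_probs[p] for p in range(m)) for q in range(m)]


def update_model_probs(pred_probs, likelihoods):
    """mu_q(k) proportional to mu_q(k|k-1) * L_q(k), normalized over the model set."""
    joint = [mu * lik for mu, lik in zip(pred_probs, likelihoods)]
    total = sum(joint)
    return [j / total for j in joint]


# Hypothetical two-model example.
tpm = [[0.9, 0.1], [0.2, 0.8]]
pred = predict_model_probs([0.5, 0.5], tpm)
liks = [gaussian_likelihood(1.0, 4.0), gaussian_likelihood(3.0, 4.0)]
mu = update_model_probs(pred, liks)
```

The model with the smaller residual gains probability after the update, as expected.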
3.3.2 Determination of \(P\{H^{j}|z\}\)
In the following, we focus on determining the posterior probability of hypothesis \(H^{j}\), i.e., \(P\{H^{j}|z\}\), where \(j=1,2,\ldots ,N\). According to the Bayes rule, the probability of \(H^{j}\) is given by:
in which \(P\{H^{j}\}\) is the prior probability and \(f\{z|H^{j}\}\) is the likelihood of \(H^{j}\). Therefore, the key is to obtain \(f\{z|H^{j}\}\).
Since the EMA estimation method is adopted, the likelihood \(f\{z|H^{j}\}\) is the total likelihood of all models given hypothesis \(H^{j}\), i.e.,
in which \(f\{z|m_{jq},H^{j}\}\) denotes the model likelihood of \(m_{jq}\) given hypothesis \(H^{j}\), and \(P\{m_{jq}|H^{j}\}\) is the model probability of \(m_{jq}\) under hypothesis \(H^{j}\).
Specifically, under hypothesis \(H^{j}\), the model likelihood of \(m_{jq}\) is
where \(L_{jq}(k)\) is given in (13).
The model probability of \(m_{jq}\) is
where \(\mu _{jq}(k)\) is given in (12).
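The two steps of this subsection, the total likelihood over the model set and the Bayes rule over hypotheses, can be sketched as follows. The model likelihoods and model probabilities are taken as inputs, so the sketch makes no assumption about which filter produced them; the function names and numbers are hypothetical.

```python
def hypothesis_likelihood(model_likelihoods, model_probs):
    """f(z|H^j) = sum_q f(z|m_jq, H^j) * P{m_jq|H^j}."""
    return sum(l * p for l, p in zip(model_likelihoods, model_probs))


def hypothesis_posteriors(priors, likelihoods):
    """Bayes rule: P{H^j|z} = f(z|H^j) P{H^j} / sum_l f(z|H^l) P{H^l}."""
    joint = [p * l for p, l in zip(priors, likelihoods)]
    total = sum(joint)
    return [j / total for j in joint]


# Hypothetical values: two hypotheses, two models under each.
model_liks = [[0.02, 0.01], [0.004, 0.006]]
model_probs = [[0.5, 0.5], [0.5, 0.5]]
hyp_liks = [hypothesis_likelihood(l, p) for l, p in zip(model_liks, model_probs)]
posteriors = hypothesis_posteriors([0.5, 0.5], hyp_liks)
```

With equal priors, the posterior hypothesis probabilities are simply the normalized total likelihoods.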
Remark 3
Based on the above, both the hypothesis-conditioned estimate \({\hat{x}}^{(j)}\) and the posterior hypothesis probability \(P\{H^{j}|z\}(j=1,2,\ldots ,N)\) can be obtained, which are critical to the JTI tracker and identifier.
Remark 4
For clarity, we restate the JTI tracking and identification results.
First, for tracking, the JTI tracking solution is given in (5), which is a weighted sum of the hypothesis-conditioned estimates \({\hat{x}}^{(j)}(j=1,2,\ldots ,N)\), with the weights closely related to the hypothesis probability \(P\{H^{j}|z\}\). Based on these, we can obtain the final JTI tracking result \({\check{x}}^{(i)}\) (supposing the decision is \(D^{i}\)).
Second, for identification, the JTI identification solution is given in (7), where the key is the expected estimation cost \(\xi _{ij}(z)\) and the posterior hypothesis probability \(P\{H^{j}|z\}\). Specifically, for the former, \(\xi _{ij}(z)=\mathrm {mse}({\hat{x}}^{(j)}|H^{j},z)+({\hat{x}}^{(j)}-{\check{x}}^{(i)})^{\prime }(\cdot )\) [given in (8)], the key is to obtain \(\mathrm {mse}({\hat{x}}^{(j)}|H^{j},z)\), \({\hat{x}}^{(j)}\), and \({\check{x}}^{(i)}\). Among these, \(\mathrm {mse}({\hat{x}}^{(j)}|H^{j},z)\) and \({\hat{x}}^{(j)}\) can be determined by EMA under hypothesis \(H^{j}\), and \({\check{x}}^{(i)}\) can be determined by (5). For the latter, \(P\{H^{j}|z\}\), the detailed calculation is given in (14).
3.4 A JTI algorithm in complex traffic scene
Based on the above JTI tracking and identification results, we propose the following JTI algorithm at time k.
1. Initialization. Under each hypothesis \(H^{j}\) (\(j=1,2,\ldots ,N\)), calculate the hypothesis-conditioned estimate \({\hat{x}}_{k-1}^{(j)}\), the corresponding MSE \(P_{k-1}^{(j)}\), and the hypothesis probability \(P\{H^{j}|z^{k-1}\}\) at time \(k-1\).
2. One-step prediction. Following the UKF-EMA-based estimation method, calculate the one-step predicted state estimate \({\hat{x}}_{k|k-1}^{(j)}\) and MSE \(P_{k|k-1}^{(j)}\).
3. Update. When data \(z_{k}\) arrive, update \({\hat{x}}_{k|k}^{(j)}\) and \({\hat{P}}_{k|k}^{(j)}\). Based on these, calculate \({\check{x}}_{k|k}^{(i)}\) (\(i=1,2,\ldots ,N\)) according to (5).
4. Further calculation. Calculate the expected estimation cost \(\xi _{ij}(z^{k})\) by (8) and the posterior cost \({\mathbf {C}}^{i}(z^{k})\). Then, the JTI decision is \(D_{k}^{i}\) if \({\mathbf {C}}^{i}(z^{k})\le {\mathbf {C}}^{l}(z^{k}),\forall l\).
5. Output. Output the constrained JTI solution for time k: \(D_{k}=D_{k}^{i}\) in step 4 and \({\hat{x}}_{k}={\check{x}}_{k}^{(i)}\) in step 3.
Remark 5
The complexity of the proposed JTI algorithm is analyzed as follows.
(a) The tracking and identification results can be obtained jointly without iteration. The above algorithm steps show that to achieve the dual goals (tracking and identification), no iteration is required. Once new data come, after simple implementation of steps 1, 2, and 3, we can achieve the dual goals simultaneously.
(b) All elements are obtained by point estimation without any density estimation, which makes the algorithm easy to implement. Specifically, the hypothesis-conditioned estimate \({\hat{x}}_{k}^{(j)}\), the estimation MSE \(P_{k}^{(j)}\), the hypothesis probability \(P\{H^{j}|z^{k}\}\), the expected estimation cost \(\xi _{ij}(z^{k})\), the posterior cost \({\mathbf {C}}^{i}(z^{k})\), the final JTI tracking result \({\check{x}}_{k}^{(i)}\), and the identification result \(D_{k}^{i}\) are all obtained by point estimation. In other words, the proposed UKF-EMA strategy does not involve any density estimation.
In summary, the proposed JTI algorithm has low implementation complexity due to its point estimation basis. Note that this paper considers tracking and identification with high maneuverability and high nonlinearity, while traditional methods for such problems usually adopt density-estimation-based methods, e.g., particle filters and random finite set methods. From this point of view, the proposed approach has an advantage in computational complexity.
3.5 Joint performance evaluation metric
Traditionally, the decision and estimation performance of JDE problems are evaluated separately using their own metrics: the correct-decision rate is usually used for decision performance, while the mean square error is used for estimation performance [33, 34]. For JDE problems, however, such separate metrics are not comprehensive and may even fail to compare different algorithms. Considering this, references [8, 35] point out that decision and estimation performance should be evaluated jointly rather than separately.
To evaluate the joint performance of JTI in complex traffic scene, we adopt the following joint performance measure (JPM), as proposed in [17]:
in which \(d_{k}^{i}(H^{i},{\hat{D}}_{k})\) and \(d_{k}^{t}(x_{k},{\hat{x}}_{k})\) are the costs for identification and tracking, respectively. Specifically, if the decision is correct (\(c_{i}={\hat{D}}_{k}\)), \(d_{k}^{i}(H^{i},{\hat{D}}_{k})=0\); otherwise, \(d_{k}^{i}(H^{i},{\hat{D}}_{k})=1\). \(d_{k}^{t}(x_{k},{\hat{x}}_{k})\) is the normalized estimation cost, which is defined in detail in [15]. \(\gamma\) is the weight factor, which adjusts the relative weight of the tracking and identification costs.
4 Simulation and discussion
This section presents two typical JTI problems in complex traffic scene. Performance evaluation metrics are rootmeansquare error (RMSE), probability of correct classification (PC), and JPM. The compared methods are the traditional identificationthentracking (IthenT), trackingthenidentification (TthenI), and our proposed JTI method.
Specifically, in IthenT, the optimal Bayes decision is made first based on the posterior hypothesis probability, and then, estimation is obtained based on this decision. In TthenI, the minimum mean square error (MMSE) optimal estimation is obtained first, and then, decision is made based on the ratio of current measurement likelihoods conditioned on \({\hat{x}}_{kk1}\) and \(H^{j}\) [13].
Suppose a vehicle moves in a complex traffic scene, and its class may be \(c_{1}\) or \(c_{2}\). We want to identify the target class and track its state jointly using all available data. Here, the classes differ in dynamic behavior, which is reasonable since targets in different classes usually behave differently in reality. For example, a car is usually more maneuverable than a truck. For model-based tracking or classification, such dynamic behaviors are described by motion models.
The two examples simulate two different scenarios. Example 1 considers a constant turning motion at a corner of a road, while Example 2 considers a complicated motion at a crossroads. These are very common and representative motions in practical complex traffic scenes.
4.1 Example 1: Simulation scenario at a corner
In this example, we consider a complex turning motion at the corner of a road. The classes differ from each other in turn rate (e.g., a car moving on the inner lane has a larger turn rate than a bus moving on the outer lane); therefore, identification is based on this difference. However, the turn rate is unknown in advance, changes over time, and is not easy to determine. Our goal is to track the vehicle’s state and identify its class jointly.
We propose to use the constant turn (CT) model to describe the target motion [36]. Suppose the target state at time k is \(x_{k}=[p_{k}^{x},v_{k}^{x},p_{k}^{y},v_{k}^{y}]^{\prime }\), in which \(p_{k}^{x},v_{k}^{x},p_{k}^{y},v_{k}^{y}\) denote the position along the x-axis, the velocity along the x-axis, the position along the y-axis, and the velocity along the y-axis, respectively. The system model is given by:
where the transition function \(F^{CT}(\omega )\) is given by
and the covariance of process noise is
The measurement model is
The initial target state is \(x_{0}=[500,10,500,10]^{\prime }\), with \(P_{0}=[10^{4},1,10^{4},1]^{\prime }\). The measurement noise \(v_{k}\) in each dimension follows \({\mathcal {N}}(0,50^{2}\,\mathrm{m}^{2})\). The JPM (15) with \(\gamma =1\) is used, with \(c_{ij}=1,c_{ii}=0,\alpha _{ij}=1,\sum _{i}\beta _{ij}=10^{4},\beta _{ii}/\beta _{ij}=1.5\). All results were obtained from 10,000 Monte Carlo (MC) runs. The true target class is randomly generated with equal probabilities in each MC run, i.e., \(P(c_{1})=P(c_{2})=0.5\).
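For readers who want to experiment with the CT dynamics, the sketch below builds the standard textbook constant-turn transition matrix for the state ordering \([p^{x},v^{x},p^{y},v^{y}]\), as surveyed in [36]. It is an illustration under stated assumptions, not necessarily the exact \(F^{CT}(\omega )\) of this paper: the sampling period `T = 1.0` and the function name `ct_transition` are hypothetical, and process noise is omitted.

```python
import numpy as np

def ct_transition(omega, T=1.0):
    """Standard constant-turn transition matrix for [px, vx, py, vy].

    omega is the turn rate in rad/s; T is an assumed sampling period.
    """
    if abs(omega) < 1e-12:
        # Limit omega -> 0: the CT model degenerates to constant velocity.
        return np.array([[1.0, T, 0.0, 0.0],
                         [0.0, 1.0, 0.0, 0.0],
                         [0.0, 0.0, 1.0, T],
                         [0.0, 0.0, 0.0, 1.0]])
    s, c = np.sin(omega * T), np.cos(omega * T)
    return np.array([
        [1.0, s / omega, 0.0, -(1.0 - c) / omega],
        [0.0, c, 0.0, -s],
        [0.0, (1.0 - c) / omega, 1.0, s / omega],
        [0.0, s, 0.0, c],
    ])

# Noise-free propagation of the initial state used in Example 1.
x0 = np.array([500.0, 10.0, 500.0, 10.0])
x1 = ct_transition(np.deg2rad(6.0)) @ x0
```

Because the velocity is simply rotated by \(\omega T\) at each step, the speed is preserved, which is a quick sanity check for any CT implementation.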
The parameters in EMA estimation are as follows. For class 1, the fixed model set is \(\{3\pi /180,4\pi /180\}\); for class 2, the fixed model set is \(\{6\pi /180,10\pi /180,12\pi /180,8\pi /180\}\). For each class at each time step, we use one expected model, i.e., \(EMA\{2+1\}\) and \(EMA\{4+1\}\) for classes 1 and 2, respectively. The transition probability matrix (TPM) for the total model set (containing the expected one) is as follows:
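Separately from the TPM, the augmentation step in \(EMA\{2+1\}\) and \(EMA\{4+1\}\) can be sketched as follows. In expected-mode augmentation [20], the added model is the probability-weighted average of the fixed models. The snippet assumes that the model probabilities from the previous filtering cycle are available as input; the function names and the uniform probabilities in the usage lines are hypothetical.

```python
import numpy as np

# Fixed turn-rate sets (rad/s) for the two classes in Example 1.
CLASS1_RATES = np.deg2rad([3.0, 4.0])
CLASS2_RATES = np.deg2rad([6.0, 10.0, 12.0, 8.0])

def expected_turn_rate(model_probs, turn_rates):
    """Probability-weighted average of the fixed turn rates."""
    p = np.asarray(model_probs, dtype=float)
    p = p / p.sum()  # guard against unnormalized input
    return float(p @ np.asarray(turn_rates, dtype=float))

def augmented_model_set(model_probs, turn_rates):
    """Fixed turn rates plus one expected turn rate, i.e. EMA{n+1}."""
    w_exp = expected_turn_rate(model_probs, turn_rates)
    return np.append(np.asarray(turn_rates, dtype=float), w_exp)

# Illustrative usage with uniform (hypothetical) model probabilities.
set1 = augmented_model_set([0.5, 0.5], CLASS1_RATES)
set2 = augmented_model_set([0.25, 0.25, 0.25, 0.25], CLASS2_RATES)
```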
Simulation results are presented in Fig. 3. For tracking, TthenI performs best, JTI is in the middle, and IthenT is worst. TthenI is best, as expected, since its tracking is MMSE estimation, which is optimal in the MSE sense. IthenT is worst since its tracking relies entirely on the decided class without accounting for possible decision errors. For identification, IthenT performs best since, with \(c_{ii}=0,c_{ij}=1(i\ne j)\), identification in IthenT is the minimum-error-rate decision, which has the highest correct identification rate. TthenI has the worst decision performance since its decision is based on the one-step predicted estimate.
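The claim that the 0-1 cost yields the minimum-error-rate decision can be checked with a short sketch of the Bayes decision rule. The index convention `cost[i, j]` (cost of deciding class i when class j is true) and the function name are assumptions made for this illustration.

```python
import numpy as np

def bayes_decision(posteriors, cost):
    """Return the decision i minimizing sum_j cost[i, j] * P(H_j | data)."""
    risk = np.asarray(cost, dtype=float) @ np.asarray(posteriors, dtype=float)
    return int(np.argmin(risk))

# 0-1 cost (c_ii = 0, c_ij = 1): the risk of deciding i is 1 - P(H_i | data),
# so minimizing the risk is the same as picking the maximum-posterior class.
ZERO_ONE = np.array([[0.0, 1.0],
                     [1.0, 0.0]])
```

With an asymmetric cost matrix the decision can deviate from the maximum-posterior class, which is why the 0-1 case is singled out as the one with the highest correct identification rate.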
For the joint performance, JTI outperforms IthenT and TthenI. This verifies that JTI achieves a good trade-off between optimal decision and optimal estimation and thus performs best in joint performance, which matters most in a joint problem. Specifically, during roughly the first 18 steps, JTI is significantly superior to IthenT and very close to TthenI. After 18 steps, JTI is close to IthenT but consistently superior to TthenI. Quantitatively, the joint performance of JTI is improved by 30% compared with TthenI at steady state. In general, the proposed JTI is significantly superior to the traditional two-step methods.
Remark 6
To show this more clearly, we also provide the lower bound of the joint performance in Fig. 4, where “ideal” denotes the ideal case in which the true target class is known. Only the joint performance is presented since it is the performance of most interest. Figure 4 verifies that, as data accumulate, the proposed JTI approach is robust and approaches the lower bound given by the ideal case.
4.2 Example 2: Simulation scenario at a crossroads
In this simulation, we consider a typical motion at a crossroads, as illustrated in Fig. 1b. The vehicle passing the crossroads goes straight first, then turns, and then goes straight again. This follows the “straight-turn-straight” mode and can be represented by linear motion and turn motion. We propose to use the constant acceleration (CA) and CT models to describe this motion [36]. With the target state \(x_{k}=[p_{k}^{x},v_{k}^{x},p_{k}^{y},v_{k}^{y}]^{\prime }\), the system model is given by:
When the target moves in a CA model,
while when the target moves in a CT model, the dynamic model is the same as in Example 1.
and
Here, \(w_{k}^{x}\) and \(w_{k}^{y}\), which are modeled as process noises, are actually the accelerations along the x and yaxes, respectively.
When the vehicle moves in a CA model, \(\alpha _{CA}^{1}=1g,\alpha _{CA}^{2}=2g\), where \(g=9.8\,\mathrm{m/s^{2}}\); when the vehicle moves in a CT model, \(\omega _{1}=3\pi /180\) rad/s and \(\omega _{2}=18\pi /180\) rad/s. The initialization parameters are the same as in Example 1. The total simulation time is 30 s, and the target motion is as follows: 0–10 s, CA model; 10–20 s, CT model; and 20–30 s, CA model. In this example, since the target motion pattern changes over time, a single model is adopted to eliminate the interference caused by model switching, so that only the proposed JTI algorithm is verified.
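The scenario above can be encoded as sketched below. This is a minimal illustration under several assumptions that are not confirmed by the text: the acceleration enters the four-dimensional state through the standard discrete-time gain \([T^{2}/2,\,T]^{\prime }\) per axis, the sampling period is `T = 1.0`, the segment boundaries are half-open intervals, and all names (`CLASS_PARAMS`, `mode_at`, `acceleration_step`) are hypothetical.

```python
import numpy as np

G_ACCEL = 9.8  # m/s^2

# Per-class motion parameters from Example 2.
CLASS_PARAMS = {
    1: {"accel": 1.0 * G_ACCEL, "omega": np.deg2rad(3.0)},
    2: {"accel": 2.0 * G_ACCEL, "omega": np.deg2rad(18.0)},
}

def mode_at(t):
    """Motion mode at time t (s): CA on [0, 10), CT on [10, 20), CA afterwards."""
    return "CT" if 10.0 <= t < 20.0 else "CA"

def acceleration_step(x, accel_xy, T=1.0):
    """One noise-free step of an acceleration-driven model for [px, vx, py, vy]."""
    F1 = np.array([[1.0, T], [0.0, 1.0]])   # per-axis position/velocity block
    G1 = np.array([[0.5 * T ** 2], [T]])    # per-axis acceleration gain
    F = np.kron(np.eye(2), F1)              # block-diagonal over the x and y axes
    G = np.kron(np.eye(2), G1)              # 4 x 2 input matrix
    return F @ np.asarray(x, dtype=float) + G @ np.asarray(accel_xy, dtype=float)

# Class 1 accelerating along x only, starting from the Example 1 initial state.
x_next = acceleration_step([500.0, 10.0, 500.0, 10.0],
                           [CLASS_PARAMS[1]["accel"], 0.0])
```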
The simulation results are presented in Fig. 5. For tracking, TthenI performs best, JTI is in the middle, and IthenT performs worst. This is consistent with our expectations, and the reason is the same as in Example 1. For identification, IthenT performs best since it is the minimum-error-rate decision. For joint performance, JTI is the best, which verifies that JTI is robust to complicated mobility and consistently better than the traditional methods. Overall, this demonstrates the superiority of the proposed JTI approach in complex motion scenarios.
Remark 7
In summary, the JTI problem in complex traffic scenes is formulated (by illustration and models) in Sect. 2, the JTI solution with theoretical analyses is provided in Sect. 3, and the simulation verification is presented in Sect. 4.
For problem formulation, complex traffic scenes take many forms, among which complicated mobility and high nonlinearity are common and typical. Therefore, we formulate the JTI problem based on these two characteristics. More importantly, as a joint problem, JTI involves highly coupled tracking and identification, which is the central concern of this paper.
For the solution, we explore a JTI solution that accounts for the practical complex traffic scene and exploits the coupling between tracking and identification. Specifically, a JTI solution incorporating the UKF-EMA estimation strategy is proposed, which achieves superior joint performance by utilizing the coupling information and also handles the complicated mobilities encountered in practice.
For simulation, two typical examples representing different complex scenes demonstrate the superiority of the proposed JTI approach. The results show that the proposed JTI approach with UKF-EMA estimation outperforms the traditional two-step strategies in joint performance.
5 Conclusions
This paper proposes a new joint tracking and identification (JTI) approach for practical JTI problems in complex traffic scenes. JTI is essentially a joint decision and estimation (JDE) problem, and a better solution requires solving tracking and identification jointly. The recently proposed JDE framework is well suited to such problems involving coupled decision and estimation.
First, we formulate the JTI problem in complex traffic scenes using a hybrid system model. Then, we propose an applicable JTI approach that considers both the complexities of practical traffic scenes and the interdependence between tracking and identification. Specifically, we propose a CJDE-based JTI risk and then derive a JTI solution by minimizing this risk. A new UKF-EMA-based estimation strategy is proposed. On the one hand, it ensures good estimation performance, since UKF handles nonlinear estimation and EMA handles complicated motions. On the other hand, it facilitates the decision by providing the quantities required by the JTI decider. We also present a joint performance evaluation metric that evaluates tracking and identification performance comprehensively.
Simulation results demonstrate the superiority of the proposed JTI approach in complex traffic scenes. By considering the characteristics of JTI in practical complex traffic scenes and the strong coupling between tracking and identification, the proposed JTI approach outperforms the traditional TthenI and IthenT methods in joint performance. Note that this paper focuses on the complexity and nonlinearity of the motion of a single target; multiple-target scenarios will be investigated in the future.
Availability of data and materials
Data sharing is not applicable to this article.
Abbreviations
JTI: Joint tracking and identification
JDE: Joint decision and estimation
CJDE: Conditional joint decision and estimation
VSMM: Variable-structure multiple model
EMA: Expected-mode augmentation
UKF: Unscented Kalman filter
MMSE: Minimum mean square error
CT: Constant turn
CA: Constant acceleration
JPM: Joint performance metric
IthenT: Identification-then-tracking
TthenI: Tracking-then-identification
References
1. Y. Bar-Shalom, X.R. Li, T. Kirubarajan, Estimation with Applications to Tracking and Navigation: Theory, Algorithms, and Software (Wiley, New York, 2001)
2. R.O. Chavez-Garcia, O. Aycard, Multiple sensor fusion and classification for moving object detection and tracking. IEEE Trans. Intell. Transp. Syst. 17(2), 525–534 (2016)
3. W. Yi, Z. Fang, W. Li, R. Hoseinnezhad, L. Kong, Multiframe track-before-detect algorithm for maneuvering target tracking. IEEE Trans. Veh. Technol. 69(4), 4104–4118 (2020)
4. T. Li, M. Mallick, Q. Pan, A parallel filtering-communication-based cardinality consensus approach for real-time distributed PHD filtering. IEEE Sens. J. 20(22), 13824–13832 (2020)
5. M. Mallick, V. Krishnamurthy, B.N. Vo, Integrated Tracking, Classification, and Sensor Management: Theory and Applications (Wiley-IEEE Press, New York, 2012)
6. K.C. Chang, R. Fung, Target identification with Bayesian networks in multiple hypothesis tracking system. Opt. Eng. 36, 684–691 (1997)
7. B. Ristic, N. Gordon, A. Bessell, On target classification using kinematic data. Inf. Fusion 5, 15–21 (2004)
8. W. Cao, J. Lan, X.R. Li, Conditional joint decision and estimation with applications to joint tracking and classification. IEEE Trans. Syst. Man Cybern. Syst. 46(4), 459–471 (2016)
9. T. Kurien, Framework for integrated tracking and identification of multiple targets, in Proceedings of Digital Avionics System Conference (Burlington, 1991), pp. 362–366
10. Y. Bar-Shalom, T. Kirubarajan, C. Gokberk, Tracking with classification-aided multiframe data association. IEEE Trans. Aerosp. Electron. Syst. 41(3), 868–878 (2005)
11. H. Lang, C. Shan, M.T. Pronobis, S. Scott, Wavelets feature aided tracking (WFAT) using GMTI/HRR data. Signal Process. 83(12), 2683–2690 (2003)
12. X.R. Li, Optimal Bayes joint decision and estimation, in International Conference on Information Fusion (Quebec City, 2007), pp. 1316–1323
13. Y. Liu, X.R. Li, Recursive joint decision and estimation based on generalized Bayes risk, in 14th International Conference on Information Fusion (Chicago, 2011), pp. 2066–2073
14. W. Cao, J. Lan, X.R. Li, Extended object tracking and classification based on recursive joint decision and estimation, in 16th International Conference on Information Fusion (Istanbul, 2013), pp. 1670–1677
15. W. Cao, J. Lan, X.R. Li, Joint tracking and classification based on recursive joint decision and estimation using multisensor data, in 17th International Conference on Information Fusion (Salamanca, 2014)
16. W. Cao, J. Lan, X.R. Li, Extended object tracking and classification using radar and ESM sensor data. IEEE Signal Process. Lett. 25(1), 90–94 (2018)
17. W. Cao, J. Lan, Q.S. Wu, Joint tracking and identification based on constrained joint decision and estimation. IEEE Trans. Intell. Transp. Syst. 22(10), 6489–6502 (2021)
18. X.R. Li, V.P. Jilkov, Survey of maneuvering target tracking. Part V. Multiple-model methods. IEEE Trans. Aerosp. Electron. Syst. 41(4), 1255–1321 (2005)
19. X.R. Li, Y. Bar-Shalom, Multiple-model estimation with variable structure. IEEE Trans. Autom. Control 41(4), 478–493 (1996)
20. X.R. Li, V.P. Jilkov, J. Ru, Multiple-model estimation with variable structure. Part VI: expected-mode augmentation. IEEE Trans. Aerosp. Electron. Syst. 41(3), 853–867 (2005)
21. T. Kirubarajan, Y. Bar-Shalom, Tracking evasive move-stop-move targets with a GMTI radar using a VS-IMM estimator. IEEE Trans. Aerosp. Electron. Syst. 39(3), 1098–1103 (2003)
22. M. Ekman, E. Sviestins, Multiple model algorithm based on particle filters for ground target tracking, in Proceedings of International Conference on Information Fusion (Quebec City, 2007)
23. Y. Cheng, T. Singh, Efficient particle filtering for road-constrained target tracking. IEEE Trans. Aerosp. Electron. Syst. 43(4), 1454–1469 (2007)
24. S.J. Julier, J.J. LaViola, On Kalman filtering with nonlinear equality constraints. IEEE Trans. Signal Process. 55(6), 2774–2784 (2007)
25. L. Xu, X.R. Li, Z. Duan, J. Lan, Modeling and state estimation for dynamic systems with linear equality constraints. IEEE Trans. Signal Process. 61(11), 2927–2939 (2013)
26. A. Jazwinski, Stochastic Processes and Filtering Theory (Academic, New York, 1970)
27. S.J. Julier, J.K. Uhlmann, H.F. Durrant-Whyte, A new method for the nonlinear transformation of means and covariances in filters and estimators. IEEE Trans. Autom. Control 45(3), 472–482 (2000)
28. K. Ito, K. Xiong, Gaussian filters for nonlinear filtering problems. IEEE Trans. Autom. Control 45(5), 910–927 (2000)
29. I. Arasaratnam, S. Haykin, Cubature Kalman filters. IEEE Trans. Autom. Control 54(6), 1254–1269 (2009)
30. V. Gaikwad, S. Lokhande, Lane departure identification for advanced driver assistance. IEEE Trans. Intell. Transp. Syst. 16(2), 910–918 (2015)
31. L. Martinez, M. Paulik, M. Krishnan, E. Zeino, Map-based lane identification and prediction for autonomous vehicles, in 2014 IEEE International Conference on Electro/Information Technology (EIT) (2014), pp. 448–453
32. Z.J. Liu, Q. Li, X.H. Liu, J. Lan, C.D. Mu, An expected-mode augmentation-based approach for multiple-fault detection and diagnosis in flight control systems. Proc. Inst. Mech. Eng. Part G: J. Aerosp. Eng. 226(G10), 1202–1213 (2012)
33. X.R. Li, Z. Zhao, Measures of performance for evaluation of estimators and filters, in Proceedings of SPIE Conference on Signal and Data Processing of Small Targets, vol. 4473 (San Diego, 2001)
34. X.R. Li, Z. Duan, Comprehensive evaluation of decision performance, in Proceedings of International Conference on Information Fusion (2008), pp. 1–8
35. X.R. Li, M. Yang, J. Ru, Joint tracking and classification based on Bayes joint decision and estimation, in International Conference on Information Fusion (Quebec City, 2007), pp. 1421–1428
36. X.R. Li, V.P. Jilkov, Survey of maneuvering target tracking. Part I: dynamic models. IEEE Trans. Aerosp. Electron. Syst. 39(4), 1333–1364 (2003)
Acknowledgements
The authors would like to express their sincere thanks to the editors and anonymous reviewers.
Funding
Research was supported in part by the Fundamental Research Funds for the Central Universities, CHD (300102322103); the National Natural Science Foundation of China (61803042); Shaanxi Provincial Natural Science Foundation of China (2021JM186)
Author information
Contributions
WC conceived the idea and proposed the JTI approach. QL and YH provided guidance on the analysis and simulations. WC wrote the majority of the manuscript. SM revised the manuscript and provided constructive suggestions. All authors read and approved the final manuscript.
Ethics declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Approved.
Competing interests
The authors declare that they have no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Cao, W., Li, Q., Hu, Y. et al. Joint target tracking and identification in complex traffic scene. EURASIP J. Adv. Signal Process. 2022, 119 (2022). https://doi.org/10.1186/s13634-022-00955-3
DOI: https://doi.org/10.1186/s13634-022-00955-3
Keywords
 Joint tracking and identification
 Complex traffic scene
 Joint decision and estimation
 Joint performance metric