4.1 MTL model construction
According to the scattering center model [4], for a high-resolution radar system a target no longer appears as a "point target" but consists of many scatterers distributed over a number of range cells along the radar LOS. For a given target, the scattering center model varies over the whole target-aspect. Therefore, preprocessing techniques should be applied to the raw HRRP data. In our previous work [11–13], we divided the HRRP data into frames according to aspect-sectors within which most scatterers do not move through resolution cells (MTRC), and used a distinct parametric model for the statistical characterization of each HRRP frame; these are referred to as aspect-frames and the corresponding single-task learning (STL) models in our articles.
For the motivating HRRP recognition problems of interest here, we utilize the TSB-HMM for analyzing spectrogram features extracted from HRRP data. For a multi-aspect HRRP sequence of target c (c ∈ {1, ..., C}, with C denoting the number of targets), we divide the data into M_{c} aspect-frames, e.g., the m th set (m ∈ {1, ..., M_{c}}) is {\left\{{\mathbf{x}}^{\left(c,m,n\right)}\right\}}_{n=1}^{N}, where N denotes the number of samples in the frame and x^{(c, m, n)} = [x^{(c, m, n)}(1), ..., x^{(c, m, n)}(L_{x})]^{T} represents the n th HRRP sample in the m th frame, with L_{x} denoting the number of range cells in an HRRP sample. Each aspect-frame corresponds to a small aspect-sector avoiding scatterers' MTRC [13], and the HRRP samples inside each target-aspect frame can be assumed to be i.i.d. We extract the spectrogram feature of each HRRP sample; Y^{(c, m, n)} = [y^{(c, m, n)}(1), ..., y^{(c, m, n)}(L_{y})] denotes the spectrogram feature of x^{(c, m, n)}, as defined in (2), with L_{y} denoting the number of time bins in the spectrogram feature.
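The spectrogram extraction step can be sketched with a plain short-time Fourier transform; since (2) is not reproduced in this section, the window length, hop size, and Hamming window below are illustrative assumptions rather than the article's actual settings.

```python
import numpy as np

def hrrp_spectrogram(x, win_len=32, hop=8):
    """Magnitude spectrogram of one HRRP sample x (length L_x).

    Returns an array with one column per time bin, so
    Y = [y(1), ..., y(L_y)] as in the text. Window and hop
    sizes are illustrative assumptions, not the article's values.
    """
    window = np.hamming(win_len)
    starts = range(0, len(x) - win_len + 1, hop)
    cols = [np.abs(np.fft.fft(window * x[s:s + win_len])) for s in starts]
    return np.stack(cols, axis=1)

# One synthetic HRRP sample with L_x = 256 range cells
x = np.abs(np.random.default_rng(0).standard_normal(256))
Y = hrrp_spectrogram(x)
print(Y.shape)  # → (32, 29): frequency bins × L_y time bins
```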
Learning a separate TSB-HMM for each frame of the target, i.e., for {\left\{{\mathbf{Y}}^{\left(c,m,n\right)}\right\}}_{n=1}^{N}, is termed the single-task TSB-HMM (STL TSB-HMM). Here, we wish instead to learn a TSB-HMM for all the aspect-frames (tasks) of one target jointly, which is referred to as the multi-task TSB-HMM (MTL TSB-HMM). MTL is an approach to inductive transfer that improves generalization by using the domain information contained in the training samples of related tasks as an inductive bias [20]. In our learning problems, the aspect-frames of one target may be viewed as a set of related learning tasks. Rather than building a model for each aspect-frame individually (due to target-aspect sensitivity), it is desirable to share information appropriately among these related data. The training data for each task are thereby strengthened, and overall recognition performance is potentially improved.
The MTL TSB-HMM for target c is constructed as
\begin{array}{l}{\mathbf{y}}^{\left(c,m,n\right)}\left(l\right)\sim f\left({\mathbf{\theta}}_{{s}_{l}^{\left(c,m,n\right)}}\right);\phantom{\rule{1em}{0ex}}l=1,\dots ,{L}_{y};\phantom{\rule{1em}{0ex}}n=1,\dots ,N;\phantom{\rule{1em}{0ex}}m=1,\dots ,{M}_{c};\\ {s}_{l}^{\left(c,m,n\right)}\sim \left\{\begin{array}{cc}{\mathbf{w}}_{0}^{\left(c,m\right)} & \mathsf{\text{if}}\phantom{\rule{1em}{0ex}}l=1\\ {\mathbf{w}}_{{s}_{l-1}^{\left(c,m,n\right)}}^{\left(c,m\right)} & \mathsf{\text{if}}\phantom{\rule{1em}{0ex}}l\ge 2\end{array}\right.\\ {\mathbf{w}}_{i}^{\left(c,m\right)}\mid {\beta}_{i}\sim \mathrm{GDD}\left({\mathbf{1}}_{\left(I-1\right)\times 1},{\left[{\beta}_{i}\right]}_{\left(I-1\right)\times 1}\right),\phantom{\rule{1em}{0ex}}{\beta}_{i}\mid {a}_{\alpha},{b}_{\alpha}\sim \mathrm{Ga}\left({a}_{\alpha},{b}_{\alpha}\right);\phantom{\rule{1em}{0ex}}i=0,\dots ,I\\ {\mathbf{\theta}}_{i}\sim H;\phantom{\rule{1em}{0ex}}i=1,\dots ,I\end{array}
(7)
where y^{(c, m, n)}(l) is the l th time chunk of the n th sample's spectrogram in the m th aspect-frame of the c th target, {s}_{l}^{\left(c,m,n\right)} denotes the corresponding state indicator, and (a_{α}, b_{α}) are preset hyperparameters. Here, the observation model f(·) is defined as an independent normal distribution, and each corresponding element in H(·) is a normal-Gamma distribution to preserve conjugacy. Since each time bin of the spectrogram feature of a plane corresponds to a fragment of the plane, the HMM states can characterize the frequency-domain properties of different fragments of the plane target, i.e., the scattering properties of different physical structures. A graphical representation of this model is shown in Figure 4a, and Figure 4b depicts how the sequential dependence across time chunks within a given aspect-frame is characterized by an HMM structure.
The main difference between the MTL TSB-HMM and the STL TSB-HMM is that in the proposed MTL TSB-HMM, all the multi-aspect frames of one target are learned jointly: each of the M_{c} tasks of target c is assumed to have its own independent state-transition statistics, but the state-dependent observation statistics are shared across these tasks, i.e., the observation parameters are learned from all aspect-frames. In the STL TSB-HMM, by contrast, each multi-aspect frame of target c is learned separately; each target-aspect frame therefore builds its own model, and the corresponding parameters are learned only from that aspect-frame.
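The sharing pattern just described, per-frame transition statistics with emission parameters pooled across frames, can be sketched as a data-structure skeleton; all sizes and names below are hypothetical illustrations, not the article's values.

```python
import numpy as np

I = 5     # number of HMM states (truncation level)
M_c = 3   # number of aspect-frames (tasks) for target c
D = 32    # dimension of one spectrogram time bin

rng = np.random.default_rng(1)

# MTL TSB-HMM: each task m keeps its OWN initial/transition statistics ...
init_probs = [rng.dirichlet(np.ones(I)) for _ in range(M_c)]           # w_0^{(c,m)}
trans_probs = [rng.dirichlet(np.ones(I), size=I) for _ in range(M_c)]  # w_i^{(c,m)}

# ... while the state-dependent observation parameters theta_i are SHARED,
# i.e., learned from the data of all M_c aspect-frames jointly.
shared_emissions = {"mean": rng.standard_normal((I, D)),
                    "precision": np.ones((I, D))}

# An STL TSB-HMM would instead keep a separate copy of everything per frame,
# each copy learned only from that frame's data.
assert len(trans_probs) == M_c and shared_emissions["mean"].shape == (I, D)
```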
4.2 Model learning
The parameters of the proposed MTL TSB-HMM are treated as random variables, and the model can readily be implemented with Markov chain Monte Carlo (MCMC) [28] methods. However, to approximate the posterior distribution over the parameters, MCMC requires large computational resources to assess the convergence and reliability of the estimates. In this article, we employ VB inference [19, 29, 30], which does not produce a single point estimate of the parameters but instead treats all model parameters as random variables, with the goal of estimating the posterior density over them, as a compromise between accuracy and computational cost for large-scale problems.
The goal of Bayesian inference is to estimate the posterior distribution of the model parameters Φ. Given the observed data X and hyperparameters γ, by Bayes' rule the posterior density of the model parameters may be expressed as
p\left(\mathbf{\Phi}\mid \mathbf{X},\mathbf{\gamma}\right)=\frac{p\left(\mathbf{X}\mid \mathbf{\Phi},\mathbf{\gamma}\right)p\left(\mathbf{\Phi}\mid \mathbf{\gamma}\right)}{\int p\left(\mathbf{X}\mid \mathbf{\Phi},\mathbf{\gamma}\right)p\left(\mathbf{\Phi}\mid \mathbf{\gamma}\right)d\mathbf{\Phi}}
(8)
where the denominator \int p\left(\mathbf{X}\mid \mathbf{\Phi},\mathbf{\gamma}\right)p\left(\mathbf{\Phi}\mid \mathbf{\gamma}\right)d\mathbf{\Phi}=p\left(\mathbf{X}\mid \mathbf{\gamma}\right) is the model evidence (marginal likelihood).
VB inference provides a computationally tractable approach that seeks a variational distribution q(Φ) to approximate the true posterior distribution of the latent variables p(Φ ∣ X, γ). We obtain the decomposition
\mathrm{log}\phantom{\rule{0.3em}{0ex}}p\left(\mathbf{X}\mid \mathbf{\gamma}\right)=L\left(q\left(\mathbf{\Phi}\right)\right)+\mathrm{KL}\left(q\left(\mathbf{\Phi}\right)\parallel p\left(\mathbf{\Phi}\mid \mathbf{X},\mathbf{\gamma}\right)\right)
(9)
where L\left(q\left(\mathbf{\Phi}\right)\right)=\int q\left(\mathbf{\Phi}\right)\mathrm{log}\frac{p\left(\mathbf{X}\mid \mathbf{\Phi},\mathbf{\gamma}\right)p\left(\mathbf{\Phi}\mid \mathbf{\gamma}\right)}{q\left(\mathbf{\Phi}\right)}d\mathbf{\Phi}, and \mathrm{KL}\left(q\left(\mathbf{\Phi}\right)\parallel p\left(\mathbf{\Phi}\mid \mathbf{X},\mathbf{\gamma}\right)\right) is the Kullback-Leibler (KL) divergence between the variational distribution q(Φ) and the true posterior p(Φ ∣ X, γ). Since the KL divergence is nonnegative and reaches zero only when q(Φ) = p(Φ ∣ X, γ), L(q(Φ)) forms a lower bound on log p(X ∣ γ), so we have log p(X ∣ γ) ≥ L(q(Φ)). Minimizing the KL divergence between the variational distribution and the true posterior is therefore equivalent to maximizing this lower bound, which is known as the negative free energy in statistical physics.
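The decomposition in (9) can be verified numerically on a toy discrete model with a single latent variable; the probabilities below are arbitrary illustrative numbers, not related to the HRRP data.

```python
import numpy as np

# Toy joint p(X, Phi) over a discrete latent Phi in {0, 1, 2},
# with the observed data X fixed (arbitrary illustrative numbers).
joint = np.array([0.10, 0.25, 0.15])   # p(X, Phi = k)
evidence = joint.sum()                  # p(X)
posterior = joint / evidence            # p(Phi | X)

q = np.array([0.3, 0.4, 0.3])           # any variational distribution

L_q = np.sum(q * np.log(joint / q))     # lower bound L(q)
KL = np.sum(q * np.log(q / posterior))  # KL(q || p(Phi | X))

# Eq. (9): log p(X) = L(q) + KL, and since KL >= 0, L(q) is a lower bound
assert np.isclose(np.log(evidence), L_q + KL)
assert KL >= 0 and L_q <= np.log(evidence)
```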
Since directly maximizing the negative free energy is intractable, for computational convenience we assume a factorized q(Φ), i.e., q\left(\mathbf{\Phi}\right)=\prod _{k}{q}_{k}\left({\phi}_{k}\right), with factors of the same form as the corresponding terms in p(Φ ∣ X, γ). With this assumption, the mean-field approximation of the variational distribution for the proposed MTL TSB-HMM of target c may be expressed as
q\left(\mathbf{\Phi}\right)=\prod _{m=1}^{{M}_{c}}\prod _{i=0}^{I}q\left({\mathbf{w}}_{i}^{\left(c,m\right)}\right)\left[\prod _{m=1}^{{M}_{c}}\prod _{n=1}^{N}\prod _{l=1}^{{L}_{y}}q\left({s}_{l}^{\left(c,m,n\right)}\right)\right]q\left(\mathbf{\theta}\right)q\left(\mathbf{\beta}\right)
(10)
where \left\{{\left\{{\mathbf{w}}_{i}^{\left(c,m\right)}\right\}}_{m=1,i=0}^{{M}_{c},I},{\left\{{s}_{l}^{\left(c,m,n\right)}\right\}}_{m=1,n=1,l=1}^{{M}_{c},N,{L}_{y}},\mathbf{\theta},\mathbf{\beta}\right\} are the latent variables in this MTL model.
A general method for performing variational inference in conjugate-exponential Bayesian networks is outlined in [17]: for a given node in the graphical model, write out the posterior as though everything were known, take the logarithm, take the expectation with respect to all other variables, and exponentiate the result. Variational inference can thus be implemented as an expectation-maximization (EM)-style algorithm, in which the lower bound increases at each iteration until the algorithm converges. In the following experiments, we terminate the algorithm when the change in the lower bound is negligible (the threshold is 10^{-6}). Since it requires computational resources comparable to the EM algorithm, variational inference is faster than MCMC methods. The detailed update equations for the latent variables and hyperparameters of the MTL TSB-HMM with the HRRP spectrogram feature are summarized in the Appendix.
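The stopping rule above (iterate until the lower-bound increase falls below 10^{-6}) can be sketched as a generic loop; the `update_q` and `lower_bound` callables below are hypothetical stand-ins for the Appendix's update equations, exercised here on a toy discrete problem rather than on the TSB-HMM itself.

```python
import numpy as np

def vb_em(update_q, lower_bound, q0, tol=1e-6, max_iter=500):
    """Generic variational-EM loop: iterate the updates until the
    increase of the lower bound L(q) is below the threshold tol."""
    q, prev = q0, -np.inf
    for it in range(max_iter):
        q = update_q(q)
        L = lower_bound(q)
        assert L >= prev - 1e-12   # the bound must never decrease
        if L - prev < tol:
            return q, L, it + 1
        prev = L
    return q, prev, max_iter

# Toy problem: move q halfway toward the exact posterior each iteration
# (a stand-in update that provably increases the bound monotonically).
joint = np.array([0.10, 0.25, 0.15])            # p(X, Phi = k)
posterior = joint / joint.sum()
update_q = lambda q: 0.5 * (q + posterior)
lower_bound = lambda q: np.sum(q * np.log(joint / q))

q, L, iters = vb_em(update_q, lower_bound, np.array([0.6, 0.2, 0.2]))
assert np.allclose(q, posterior, atol=1e-2)      # converged near the posterior
assert np.isclose(L, np.log(joint.sum()), atol=1e-5)  # bound near log-evidence
```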
4.3 Main procedure of radar HRRP target recognition based on the proposed MTL TSB-HMM algorithm
The main procedure of radar HRRP target recognition based on the proposed MTL TSB-HMM algorithm is as follows.
4.3.1. Training phase

(1)
Divide the training samples of target c (c = 1, 2, ..., C) into HRRP frames {\left\{{\mathbf{x}}^{\left(c,m\right)}\right\}}_{m=1}^{{M}_{c}}, where M_{c} is the number of tasks of target c, {\mathbf{x}}^{\left(c,m\right)}={\left\{{\mathbf{x}}^{\left(c,m,n\right)}\right\}}_{n=1}^{N} denotes the m th range-aligned and amplitude-normalized HRRP frame, and N is the number of echoes a frame contains.

(2)
Extract the spectrogram feature {\left\{{\mathbf{Y}}^{\left(c,m,n\right)}\right\}}_{m=1,n=1}^{{M}_{c},N} of each HRRP sample, with Y^{(c, m, n)} = [y^{(c, m, n)}(1), y^{(c, m, n)}(2), ..., y^{(c, m, n)}(L_{y})] denoting the spectrogram feature of x^{(c, m, n)} as defined in (2).

(3)
For each target, we construct an MTL TSB-HMM model and learn the parameters {w}_{0,{s}_{1}^{\left(c,m\right)}}^{\left(c,m\right)}, {w}_{i,j}^{\left(c,m\right)}, and θ_{i} for all aspect-frames of the target using the spectrogram features, where {w}_{0,{s}_{1}^{\left(c,m\right)}}^{\left(c,m\right)} is the initial state probability for frame m of target c, {w}_{i,j}^{\left(c,m\right)} is the state-transition probability from state i to state j for frame m of target c, and θ_{i} are the parameters of the observation model associated with state i (c ∈ {1, ..., C}, m ∈ {1, ..., M_{c}}, i, j ∈ {1, ..., I}). The detailed learning procedure for the parameters of the MTL TSB-HMM with the HRRP spectrogram feature is discussed in Section 4.2 and the Appendix.

(4)
Store the initial state probabilities {\left\{{w}_{0,{s}_{1}^{\left(c,m\right)}}^{\left(c,m\right)}\right\}}_{m=1}^{{M}_{c}}, the state-transition probabilities {\left\{{w}_{i,j}^{\left(c,m\right)}\right\}}_{m=1}^{{M}_{c}}, and the observation-model parameters {\left\{{\mathbf{\theta}}_{i}\right\}}_{i=1}^{I} for each target c, with c = 1, 2, ..., C.
4.3.2. Classification phase

(1)
The amplitude-normalized HRRP testing sample is time-shift compensated with respect to the averaged HRRP of each frame model via slide-correlation processing [23].

(2)
Extract the spectrogram features {\left\{{\mathbf{Y}}_{\mathsf{\text{test}}}^{\left(c,m\right)}\right\}}_{c=1,m=1}^{C,{M}_{c}} of the slide-correlated HRRP testing sample x_{test}, where {\mathbf{Y}}_{\mathsf{\text{test}}}^{\left(c,m\right)}=\left[{\mathbf{y}}_{\mathsf{\text{test}}}^{\left(c,m\right)}\left(1\right),{\mathbf{y}}_{\mathsf{\text{test}}}^{\left(c,m\right)}\left(2\right),\dots ,{\mathbf{y}}_{\mathsf{\text{test}}}^{\left(c,m\right)}\left({L}_{y}\right)\right] denotes the spectrogram feature of the HRRP testing sample correlated with the m th frame of target c, as defined in (2).

(3)
The frame-conditional likelihood of each target can be calculated as
p\left({\mathbf{Y}}_{\mathsf{\text{test}}}^{\left(c,m\right)}\mid c,m\right)=\sum _{\mathbf{s}}\left\langle {w}_{0,{s}_{1}^{\left(c,m\right)}}^{\left(c,m\right)}\right\rangle \prod _{l=2}^{{L}_{y}}\left\langle {w}_{{s}_{l-1}^{\left(c,m\right)},{s}_{l}^{\left(c,m\right)}}^{\left(c,m\right)}\right\rangle \prod _{l=1}^{{L}_{y}}{f}^{\left(c,m\right)}\left({\mathbf{y}}_{\mathsf{\text{test}}}^{\left(c,m\right)}\left(l\right)\mid \left\langle {\mathbf{\theta}}_{{s}_{l}^{\left(c,m\right)}}\right\rangle \right)
(11)
where 〈·〉 denotes the posterior expectation of a latent variable under the corresponding distribution; e.g., \left\langle {w}_{0,{s}_{1}^{\left(c,m\right)}}^{\left(c,m\right)}\right\rangle denotes the posterior expectation of the initial state probability, \left\langle {w}_{{s}_{l-1}^{\left(c,m\right)},{s}_{l}^{\left(c,m\right)}}^{\left(c,m\right)}\right\rangle denotes the posterior expectation of the state-transition probability from state {s}_{l-1}^{\left(c,m\right)} to state {s}_{l}^{\left(c,m\right)} for frame m of target c, and \left\langle {\mathbf{\theta}}_{{s}_{l}^{\left(c,m\right)}}\right\rangle denotes the posterior expectation of the observation-model parameters associated with state {s}_{l}^{\left(c,m\right)}, with the corresponding state indicator for the l th time chunk {s}_{l}^{\left(c,m\right)}\in \left\{1,\dots ,I\right\}. Then p\left({\mathbf{Y}}_{\mathsf{\text{test}}}^{\left(c,m\right)}\mid c,m\right) can be calculated by the forward-backward procedure [24] for each m (m ∈ {1, ..., M_{c}}) and c (c ∈ {1, ..., C}).
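The sum over state sequences in (11) is computed in practice by the forward recursion; a minimal log-domain sketch, with arbitrary illustrative parameters in place of the learned posterior expectations, can be checked against brute-force enumeration.

```python
import itertools
import numpy as np

def hmm_log_likelihood(log_emis, init, trans):
    """Forward algorithm in the log domain.

    log_emis: (L_y, I) array, log f(y(l) | <theta_i>) per time chunk l
    init:     (I,) posterior-mean initial state probabilities  <w_0>
    trans:    (I, I) posterior-mean transition probabilities   <w_ij>
    Returns the frame-conditional log-likelihood as in (11).
    """
    log_alpha = np.log(init) + log_emis[0]
    for l in range(1, len(log_emis)):
        # log-sum-exp over the previous state, stabilized by the max
        m = log_alpha.max()
        log_alpha = m + np.log(np.exp(log_alpha - m) @ trans) + log_emis[l]
    m = log_alpha.max()
    return m + np.log(np.exp(log_alpha - m).sum())

# Tiny sanity check: I = 2 states, L_y = 3 time chunks, arbitrary numbers.
rng = np.random.default_rng(2)
init = np.array([0.6, 0.4])
trans = np.array([[0.7, 0.3], [0.2, 0.8]])
log_emis = np.log(rng.uniform(0.1, 1.0, size=(3, 2)))

ll = hmm_log_likelihood(log_emis, init, trans)

# Brute force over all 2**3 state sequences agrees with the recursion
brute = sum(init[s[0]] * np.exp(log_emis[0, s[0]])
            * np.prod([trans[s[l - 1], s[l]] * np.exp(log_emis[l, s[l]])
                       for l in range(1, 3)])
            for s in itertools.product(range(2), repeat=3))
assert np.isclose(ll, np.log(brute))
```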

(4)
We calculate the class-conditional likelihood p\left({\mathbf{Y}}_{\mathsf{\text{test}}}^{\left(c\right)}\mid c\right) for each target c as
p\left({\mathbf{Y}}_{\text{test}}^{\left(c\right)}\mid c\right)=\underset{m}{\mathrm{max}}\phantom{\rule{0.3em}{0ex}}p\left({\mathbf{Y}}_{\text{test}}^{\left(c,m\right)}\mid c,m\right);\phantom{\rule{1em}{0ex}}m=1,\dots ,{M}_{c}
(12)

(5)
As discussed in Section 1, the testing HRRP sample is assigned to the class with the maximum class-conditional likelihood, under the assumption that the prior class probabilities are the same for all targets of interest:
k=\mathrm{arg}\phantom{\rule{0.3em}{0ex}}\underset{c}{\mathrm{max}}\phantom{\rule{0.3em}{0ex}}p\left({\mathbf{Y}}_{\mathsf{\text{test}}}^{\left(c\right)}\mid c\right);\phantom{\rule{1em}{0ex}}c=1,\dots ,C
(13)
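Steps (4) and (5) reduce to a max over each target's frames followed by an argmax over targets; a minimal sketch with hypothetical precomputed frame log-likelihoods:

```python
import numpy as np

# Hypothetical frame-conditional log-likelihoods log p(Y_test^{(c,m)} | c, m)
# for C = 3 targets; target c has M_c aspect-frames (unequal counts allowed).
frame_loglik = {
    0: np.array([-120.3, -118.7, -125.0]),            # target 0, M_0 = 3
    1: np.array([-117.2, -119.5]),                    # target 1, M_1 = 2
    2: np.array([-122.8, -121.1, -119.9, -124.4]),    # target 2, M_2 = 4
}

# Eq. (12): class-conditional likelihood = max over the target's frames
class_loglik = {c: ll.max() for c, ll in frame_loglik.items()}

# Eq. (13): assign the test HRRP sample to the maximum-likelihood class
k = max(class_loglik, key=class_loglik.get)
print(k)  # → 1 (target 1's best frame, -117.2, is the largest value)
```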