### 2.1 STFT model of LFM signal

In this paper, we consider a uniform linear array (ULA) with *M* sensors and inter-element spacing *d*. Furthermore, *N* LFM signals are assumed to impinge on the ULA from the directions *θ*_{1}, *θ*_{2}, ⋯, *θ*_{N}. The array output at time *t* can be modeled as

$$ \mathbf{X}(t)=\mathbf{A}(t)\mathbf{S}(t)+\mathbf{N}(t), $$

(1)

where **X**(*t*) = [*x*_{1}(*t*), *x*_{2}(*t*), ⋯, *x*_{M}(*t*)]^{T} is the array observation vector, **S**(*t*) = [*s*_{1}(*t*), *s*_{2}(*t*), ⋯, *s*_{N}(*t*)]^{T} is the LFM signal vector, and **N**(*t*) = [*n*_{1}(*t*), *n*_{2}(*t*), ⋯, *n*_{M}(*t*)]^{T} is the additive white Gaussian noise vector. **A**(*t*) = [**a**_{1}(*t*), **a**_{2}(*t*), ⋯, **a**_{N}(*t*)] is the *M* × *N* array manifold matrix, and **a**_{n}(*t*) = [exp(−*jf*_{n}(*t*)*τ*_{n1}), ⋯, exp(−*jf*_{n}(*t*)*τ*_{nM})]^{T} is the steering vector of the *nth* signal at time *t*, where *j* is the imaginary unit. *τ*_{nm} = − 2*π*(*m* − 1)*d* cos(*θ*_{n})/*c* represents the time delay of the *nth* signal at the *mth* element with respect to the reference element, where *c* is the speed of light. The instantaneous frequency *f*_{n}(*t*) is given by

$$ {f}_n(t)={f}_{n0}+{\gamma}_nt, $$

(2)

where *f*_{n0} and *γ*_{n} are the initial frequency and the frequency modulation rate of the *nth* signal, respectively. When *γ*_{n} = 0, the LFM signal reduces to a narrowband single-frequency signal.
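The array model in (1) and (2) can be sketched numerically as follows. This is a minimal illustration, not the authors' simulation code: the complex LFM waveform exp(j2π(*f*_{n0}*t* + *γ*_{n}*t*²/2)) is an assumption (the text only specifies the instantaneous frequency), and the function names `steering_matrix` and `simulate_array_output` are hypothetical.

```python
import numpy as np

C = 3e8  # propagation speed in m/s (the text takes c as the speed of light)

def steering_matrix(theta_deg, f_inst, M, d):
    """Steering vectors a_n(t) for all time samples, shape (M, len(f_inst))."""
    m = np.arange(M)[:, None]                                     # (m - 1) for m = 1..M
    tau = -2 * np.pi * m * d * np.cos(np.deg2rad(theta_deg)) / C  # tau_nm as defined in the text
    return np.exp(-1j * f_inst[None, :] * tau)

def simulate_array_output(thetas_deg, f0s, gammas, t, M, d, noise_std=0.0, seed=0):
    """X(t) = A(t) S(t) + N(t) for a ULA, following (1) and (2)."""
    rng = np.random.default_rng(seed)
    X = np.zeros((M, t.size), dtype=complex)
    for theta, f0, gamma in zip(thetas_deg, f0s, gammas):
        f_inst = f0 + gamma * t                                 # Eq. (2)
        s = np.exp(2j * np.pi * (f0 * t + 0.5 * gamma * t**2))  # assumed LFM waveform
        X += steering_matrix(theta, f_inst, M, d) * s[None, :]
    noise = rng.standard_normal(X.shape) + 1j * rng.standard_normal(X.shape)
    return X + noise_std / np.sqrt(2) * noise                   # complex white Gaussian noise
```

Setting `gammas` to zero reproduces the narrowband single-frequency case mentioned above, since the steering vectors then no longer vary with time.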

In 1946, Gabor proposed the concept of the STFT, which is an effective TF analysis tool for LFM signals. In the absence of noise, the STFT distribution of **X**(*t*) can be expressed as

$$ {\mathbf{S}}_X\left(t,f\right)=\mathbf{A}(t){\mathbf{S}}_S\left(t,f\right)=\mathbf{A}(t)\left[\begin{array}{c}{S}_{S_1}\left(t,f\right)\\ {}\vdots \\ {}{S}_{S_N}\left(t,f\right)\end{array}\right], $$

(3)

where \( {S}_{S_n}\left(t,f\right) \) is the STFT value of the *nth* LFM signal.

Then, the STFD matrix can be calculated through the correlation operation of **S**_{X}, namely,

$$ {\mathbf{D}}_{XX}\left(t,f\right)={\mathbf{S}}_X{\mathbf{S}}_X^H=\mathbf{A}(t){\mathbf{D}}_{SS}\left(t,f\right){\mathbf{A}}^H(t), $$

(4)

where **D**_{XX} and **D**_{SS} are the array output STFD matrix and the signal STFD matrix, respectively.
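A minimal sketch of (3) and (4) is given here: each sensor output is transformed by a sliding-window FFT, and the STFD matrix at every TF point is the outer product of the STFT vector with its conjugate. The Hann window, the window length, and the hop size are illustrative assumptions that the text does not specify, and the helper names `stft` and `stfd` are hypothetical.

```python
import numpy as np

def stft(x, win_len=64, hop=8):
    """Short-time Fourier transform of a 1-D signal, shape (win_len, n_frames)."""
    window = np.hanning(win_len)                       # assumed analysis window
    starts = range(0, len(x) - win_len + 1, hop)
    frames = np.stack([x[s:s + win_len] * window for s in starts], axis=1)
    return np.fft.fft(frames, axis=0)

def stfd(X, win_len=64, hop=8):
    """STFD matrices D_XX(t, f) = S_X S_X^H, shape (n_freq, n_frames, M, M)."""
    S = np.stack([stft(x, win_len, hop) for x in X], axis=-1)   # (n_freq, n_frames, M)
    return S[..., :, None] * np.conj(S[..., None, :])           # outer product per TF point
```

Each `D[f, t]` produced this way is Hermitian and has rank one, which is consistent with (4) being formed from a single STFT vector per TF point.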

The STFT distribution of LFM signals in the TF domain is illustrated in Fig. 1. Figure 1a shows the case in which the LFM signals are TF-disjoint, which means that the signals are independent of each other; for this case, reference [15] proposed an efficient single-source TF point selection method.

This paper mainly studies the case in which the LFM signals overlap slightly in the TF domain, that is, multiple signals share only a small number of TF points, as shown in Fig. 1b. The TF domain contains noise-term TF points and auto-term TF points, and the auto-term points comprise single-source TF points and multiple-source TF points. With the proposed algorithm, we remove the noise-term TF points and the multiple-source TF points in turn and obtain the single-source TF point set of each signal. Finally, the STFD matrix is constructed, and the subsequent DOA estimation is carried out.

In the TF domain, LFM signals exhibit a pronounced energy concentration. According to [15], **D**_{XX} is very large along the TF ridgeline and close to zero at other TF points. Letting *Ω*_{X} be the TF support domain, we obtain

$$ {\displaystyle \begin{array}{l}{\mathbf{D}}_{XX}\left(t,f\right)\gg 0\kern1.2em \left(t,f\right)\in {\varOmega}_X\\ {}{\mathbf{D}}_{XX}\left(t,f\right)\approx 0\kern1.2em \left(t,f\right)\notin {\varOmega}_X.\end{array}} $$

(5)

In order to filter out the noise and retain the TF points that have enough energy in the TF domain, we set an empirical threshold and apply the following criterion at each TF point (*t*, *f*) of **D**_{XX}

$$ \frac{{\left\Vert {\mathbf{D}}_{XX}\left(t,f\right)\right\Vert}_F}{\underset{f}{\mathit{\max}}{\left\Vert {\mathbf{D}}_{XX}\left(t,f\right)\right\Vert}_F}>{\varepsilon}_1\kern0.8000001em \left(t,f\right)\in {\varOmega}_X, $$

(6)

where ‖·‖_{F} is the Frobenius norm and *ε*_{1} is a small empirical threshold that depends on the noise level. Typically, *ε*_{1} = 0.05 when SNR = 10 dB [18].

Then, the new TF point set *Ω*_{A}, which includes the single-source TF points *Ω*_{S} and the multiple-source TF points *Ω*_{M}, can be obtained.
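Criterion (6) can be sketched as a mask over the TF grid. The array layout `(n_freq, n_frames, M, M)` and the function name `select_energetic_points` are assumptions made for illustration only.

```python
import numpy as np

def select_energetic_points(D, eps1=0.05):
    """Boolean mask (n_freq, n_frames) of the TF points that pass criterion (6)."""
    norms = np.linalg.norm(D, axis=(-2, -1))   # Frobenius norm of D_XX at each TF point
    peak = norms.max(axis=0, keepdims=True)    # maximum over f at each time instant
    return norms > eps1 * peak                 # written without a division to avoid 0/0
```

The `True` entries of the mask play the role of *Ω*_{A}; a time frame that contains no energy at all yields no selected points.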

### 2.2 Extracting single-source TF points

At the signal intersections, the LFM signals are coherent because they share the same frequency, which makes the signal subspace matrix rank deficient. If these intersecting TF points are selected to construct the STFD matrix, the subsequent DOA estimation will fail. Therefore, our goal is to remove the multiple-source TF points and obtain the single-source TF point set belonging to each signal.

In theory, two LFM signals have only one intersection in the TF domain. However, since the sampled data are discrete and finite, the TF ridgeline of an LFM signal under the STFT has a certain width. Therefore, there are multiple TF points around each intersection when the LFM signals are non-disjoint in the TF domain. By converting the TF distribution of the LFM signals into an ordinary grayscale image, we can use the Hough transform to detect straight lines in the image [26]. The Hough transform is a parameter estimation technique based on the voting principle, applied here in the TF domain. Its basic principle is to map the straight-line detection problem in the plane coordinate domain into the parameter coordinate domain by using the point-line duality, so that the mapped result is easier to detect. It is assumed that the curve satisfies the following equation

$$ F\left(\left(t,f\right),\left({a}_1,{a}_2,\cdots, {a}_m\right)\right)=0. $$

(7)

In plane Cartesian coordinates, the curve is determined by the parameters (*a*_{1}, *a*_{2}, ⋯, *a*_{m}), and (*t*, *f*) is a point on the curve. For a straight line, *ρ* denotes the distance between the line and the coordinate origin, and *θ* is the angle between the normal along which *ρ* is measured and the sampling time axis (this *θ* is a Hough parameter, not the DOA *θ*_{n}). After mapping into the parameter domain, we use polar coordinates to describe the parameters, and the following relationship holds

$$ \rho =t\cos \left(\theta \right)+f\sin \left(\theta \right). $$

(8)

Equation (8) shows that a point in the plane coordinates corresponds to a curve in the parameter coordinates. The curves corresponding to different points on the same line in the plane coordinates intersect at a single point in the parameter coordinates, as shown in Fig. 2.

Figure 2 shows that the three points *A*, *B*, *C* on the same line in the plane coordinate domain are mapped into three curves that intersect at the point *D*. By integrating along different *θ*, we obtain the Hough transform. Consequently, the Hough transform is essentially a projection integral.
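The voting interpretation of (8) can be sketched with a small accumulator: every TF point adds one vote along its curve in the (*θ*, *ρ*) plane, and collinear points pile their votes into a common cell. The grid resolution, the binary (unweighted) votes, and the function name `hough_accumulator` are assumptions for illustration; a grayscale-weighted vote or a library routine could be used instead.

```python
import numpy as np

def hough_accumulator(points, n_theta=180, rho_res=1.0):
    """Vote in (theta, rho) space using rho = t*cos(theta) + f*sin(theta)."""
    pts = np.asarray(points, dtype=float)                       # rows are (t, f)
    thetas = np.deg2rad(np.arange(n_theta) * 180.0 / n_theta)   # theta in [0, pi)
    rho_max = np.hypot(np.abs(pts[:, 0]).max(), np.abs(pts[:, 1]).max())
    n_rho = int(np.ceil(2 * rho_max / rho_res)) + 1
    # One curve per point: rho as a function of theta, shape (n_points, n_theta)
    rhos = pts[:, :1] * np.cos(thetas) + pts[:, 1:] * np.sin(thetas)
    rho_idx = np.round((rhos + rho_max) / rho_res).astype(int)
    theta_idx = np.broadcast_to(np.arange(n_theta), rho_idx.shape)
    acc = np.zeros((n_theta, n_rho), dtype=int)
    np.add.at(acc, (theta_idx, rho_idx), 1)                     # accumulate the votes
    rho_axis = np.arange(n_rho) * rho_res - rho_max             # rho value of each column
    return acc, thetas, rho_axis
```

With this coarse grid, neighbouring cells can tie with the true peak for short lines, so a practical detector would typically apply a threshold followed by some form of peak merging.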

First, we convert the TF distribution into a grayscale image. Then, by performing the Hough transform on the grayscale image, we obtain the curve in the parameter domain that corresponds to each TF point. These curves have multiple intersections, and each intersection corresponds to a detected line segment in the plane coordinate domain. For *N* LFM signals in the TF domain, 2*N* peak points (*θ*_{1}, *ρ*_{1}), (*θ*_{2}, *ρ*_{2}), ⋯, (*θ*_{2N}, *ρ*_{2N}) can be found by setting an appropriate threshold and accumulating the votes; there are 2*N* peaks rather than *N* because the TF distribution of an LFM signal under the STFT has a certain width, so that each signal contributes two boundary lines. The 2*N* peak points correspond to 2*N* line segments in the plane coordinate domain, and the equations of these lines can be calculated from the endpoint coordinates, namely,

$$ \left\{\begin{array}{l}{f}_1={k}_1t+{b}_1\\ {}{f}_2={k}_2t+{b}_2\\ {}\vdots \\ {}{f}_{2N}={k}_{2N}t+{b}_{2N}\end{array}\right., $$

(9)

where *k*_{1}, *k*_{2}, ⋯, *k*_{2N} and *b*_{1}, *b*_{2}, ⋯, *b*_{2N} are the slopes and intercepts of the lines, respectively.

Since the noise in the TF domain has been removed, the energy along the TF ridgeline is much stronger than at other TF points. Therefore, based on the Hough transform, we can precisely extract the line segments on which the boundaries of the TF distribution of the LFM signals are located. Assuming that there are *Q* (*Q* ≤ *N*(*N* − 1)/2) intersections among the *N* signals, the detected 2*N* line segments, which correspond to the *N* signals, will have *P* (*P* ≥ 4*Q*) intersections. If the *N* LFM signals intersect only pairwise, the detected line segments have 4*Q* intersections. However, when multiple signals intersect at one point, the number of intersections among the 2*N* line segments will exceed 4*Q*.

For any intersection in the TF domain, we can calculate all intersections (*t*_{1}, *f*_{1}), (*t*_{2}, *f*_{2}), ⋯, (*t*_{P}, *f*_{P}) from the line equations in the grayscale image. The minimum min(*t*_{1}, *t*_{2}, ⋯*t*_{P}) and the maximum max(*t*_{1}, *t*_{2}, ⋯*t*_{P}) along the sampling time axis are then obtained over all intersections. In addition, the corresponding min(*f*_{1}, *f*_{2}, ⋯*f*_{U}) and max(*f*_{1}, *f*_{2}, ⋯*f*_{U}) can be computed from (9), where *f*_{1}, *f*_{2}, ⋯*f*_{U} are the frequency values of the detected line segments at min(*t*_{1}, *t*_{2}, ⋯*t*_{P}) and max(*t*_{1}, *t*_{2}, ⋯*t*_{P}). If the multiple-source TF points are not completely removed and are used to estimate the DOAs, the DOA estimation will fail. We therefore remove the multiple-source TF points lying in the region bounded by (min(*t*_{1}⋯*t*_{P}), max(*t*_{1}⋯*t*_{P})) and (min(*f*_{1}⋯*f*_{U}), max(*f*_{1}⋯*f*_{U})) and then obtain the single-source TF point set *Ω*_{S} = ∪ *Ω*_{n} for *n* = 1, 2, ⋯, *N*, where *Ω*_{n} denotes the single-source TF point set of the *nth* signal.
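The removal step can be sketched for a single crossing region as follows. The sketch intersects every non-parallel pair of detected lines from (9), takes the time extent of those intersections, evaluates all lines at the two extreme times to get the frequency extent, and discards the TF points inside the resulting rectangle. Treating all intersections as one region is a simplifying assumption (several well-separated crossings would need to be handled region by region), and the function names are hypothetical.

```python
import numpy as np
from itertools import combinations

def line_intersections(slopes, intercepts, tol=1e-12):
    """Intersections (t, f) of all non-parallel pairs of lines f = k*t + b."""
    pts = []
    for (k1, b1), (k2, b2) in combinations(zip(slopes, intercepts), 2):
        if abs(k1 - k2) > tol:               # parallel boundary lines never meet
            t = (b2 - b1) / (k1 - k2)
            pts.append((t, k1 * t + b1))
    return np.array(pts)

def overlap_region(slopes, intercepts):
    """Rectangle (t_min, t_max, f_min, f_max) enclosing the crossing region."""
    pts = line_intersections(slopes, intercepts)
    t_min, t_max = pts[:, 0].min(), pts[:, 0].max()
    k = np.asarray(slopes, dtype=float)
    b = np.asarray(intercepts, dtype=float)
    f_vals = np.concatenate([k * t_min + b, k * t_max + b])  # lines evaluated at t_min, t_max
    return t_min, t_max, f_vals.min(), f_vals.max()

def remove_multi_source(points, region):
    """Drop the TF points (t, f) that fall inside the rectangle."""
    t_min, t_max, f_min, f_max = region
    pts = np.asarray(points, dtype=float)
    inside = ((pts[:, 0] >= t_min) & (pts[:, 0] <= t_max) &
              (pts[:, 1] >= f_min) & (pts[:, 1] <= f_max))
    return pts[~inside]
```

For two signals with two boundary lines each, `line_intersections` returns four points, which matches the count *P* = 4*Q* with *Q* = 1 stated in the text.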

### 2.3 Constructing STFD matrix

In this subsection, we first determine the single-source TF point set belonging to each signal and then construct the STFD matrix for the subsequent DOA estimation. According to reference [15], at the single-source TF points the diagonal elements of the signal STFD matrix **D**_{SS}(*t*, *f*) are large, and the remaining elements are close to zero. As a result, based on (4), the single-source TF points of the *nth* signal in *Ω*_{n} satisfy

$$ {\mathbf{D}}_{XX}\left(t,f\right)={\mathbf{D}}_{SS}\left(t,f\right){\mathbf{a}}_n(t){\mathbf{a}}_n^H(t)\kern1.1em \forall \left(t,f\right)\in {\varOmega}_n. $$

(10)

For any two single-source TF points (*t*_{i}, *f*_{i}) and (*t*_{j}, *f*_{j}) in *Ω*_{n}, (10) can be rewritten as

$$ {\mathbf{D}}_{XX}\left({t}_i,{f}_i\right)={\mathbf{D}}_{SS}\left({t}_i,{f}_i\right){\mathbf{a}}_n(t){\mathbf{a}}_n^H(t). $$

(11)

$$ {\mathbf{D}}_{XX}\left({t}_j,{f}_j\right)={\mathbf{D}}_{SS}\left({t}_j,{f}_j\right){\mathbf{a}}_n(t){\mathbf{a}}_n^H(t). $$

(12)

It can be seen from (10)–(12) that the STFD matrices **D**_{XX} at different TF points in *Ω*_{n} share the same eigenvector **a**_{n}(*t*).

For any TF point (*t*_{p}, *f*_{p}) in *Ω*_{S}, we can compute the principal eigenvalue *λ*(*t*_{p}, *f*_{p}) and the corresponding principal eigenvector **e**(*t*_{p}, *f*_{p}) of the STFD matrix. Without loss of generality, we normalize **e**(*t*_{p}, *f*_{p}) to unit norm and make its first element real and positive, namely,

$$ \tilde{\mathbf{e}}\left({t}_p,{f}_p\right)=\left[\begin{array}{c}\frac{{\mathbf{e}}_1\left({t}_p,{f}_p\right)}{\left\Vert \mathbf{e}\left({t}_p,{f}_p\right)\right\Vert}\\ {}\vdots \\ {}\frac{{\mathbf{e}}_M\left({t}_p,{f}_p\right)}{\left\Vert \mathbf{e}\left({t}_p,{f}_p\right)\right\Vert}\end{array}\right]\cdot \frac{\left|{\mathbf{e}}_1\left({t}_p,{f}_p\right)\right|}{{\mathbf{e}}_1\left({t}_p,{f}_p\right)}, $$

(13)

where **e**_{m}(*t*_{p}, *f*_{p}) denotes the *mth* element of the eigenvector **e**(*t*_{p}, *f*_{p}).
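The normalization in (13) can be sketched as follows. The sketch assumes that the STFD matrix is Hermitian and that the first element of the principal eigenvector is nonzero (which holds for a steering vector whose reference element equals one); the function name `principal_eigvec` is hypothetical.

```python
import numpy as np

def principal_eigvec(D):
    """Principal eigenvalue and eigenvector of a Hermitian matrix, normalized as in (13)."""
    w, V = np.linalg.eigh(D)           # eigenvalues in ascending order
    e = V[:, -1]                       # eigenvector of the largest eigenvalue
    e = e / np.linalg.norm(e)          # unit Euclidean norm
    e = e * (np.abs(e[0]) / e[0])      # phase rotation: first element real and positive
    return w[-1], e
```

The phase rotation removes the arbitrary complex scale factor of the eigenvector, so that eigenvectors computed at different TF points become directly comparable.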

For the single-source TF point set *Ω*_{S}, two points (*t*_{i}, *f*_{i}) and (*t*_{j}, *f*_{j}) belong to the same signal if they satisfy

$$ d\left(\tilde{\mathbf{e}}\left({t}_i,{f}_i\right),\tilde{\mathbf{e}}\left({t}_j,{f}_j\right)\right)<{\varepsilon}_2, $$

(14)

where *d*(·) is the Euclidean distance between \( \tilde{\mathbf{e}}\left({t}_i,{f}_i\right) \) and \( \tilde{\mathbf{e}}\left({t}_j,{f}_j\right) \), and *ε*_{2} is a positive empirical threshold. Typically, *ε*_{2} = 0.05 when SNR = 10 dB [15]. By traversing the entire set *Ω*_{S}, we can obtain the single-source TF point set *Ω*_{n} of the *nth* signal.

In order to reduce the calculation error and make full use of the single-source TF points, we compute the averaged STFD matrix over all TF points in *Ω*_{n}. For the *nth* signal, the averaged STFD matrix is expressed as
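One simple way to traverse *Ω*_{S} with criterion (14) is a greedy grouping, sketched here: each normalized eigenvector is compared with one reference vector per group and either joins the first group within *ε*_{2} or starts a new group. The greedy strategy and the name `group_by_eigvec` are assumptions for illustration; the text does not prescribe a particular traversal.

```python
import numpy as np

def group_by_eigvec(eigvecs, eps2=0.05):
    """Assign a group label to each normalized eigenvector using criterion (14)."""
    refs, labels = [], []
    for e in eigvecs:
        for n, ref in enumerate(refs):
            if np.linalg.norm(e - ref) < eps2:   # Euclidean distance d(., .)
                labels.append(n)
                break
        else:                                    # no existing group is close enough
            refs.append(e)
            labels.append(len(refs) - 1)
    return np.array(labels)
```

The points that share label *n* then form a candidate set *Ω*_{n}; the result depends on the order of traversal when groups lie closer together than *ε*_{2}.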

$$ {\overline{\mathbf{D}}}_n\left(t,f\right)=\frac{1}{\varSigma {\varOmega}_n}\sum \limits_{\left(t,f\right)\in {\varOmega}_n}{\mathbf{D}}_{XX}\left(t,f\right), $$

(15)

where \( {\overline{\mathbf{D}}}_n\left(t,f\right) \) is similar to the covariance matrix in subspace-based algorithms and *ΣΩ*_{n} denotes the number of single-source TF points in *Ω*_{n}.

Therefore, following the MUSIC algorithm, we perform an eigendecomposition of the matrix \( {\overline{\mathbf{D}}}_n\left(t,f\right) \) to obtain the signal subspace and the noise subspace. Finally, we construct the spatial spectrum function and estimate the DOA by searching for its peak

$$ \mathbf{P}\left(\theta \right)=\frac{1}{{\mathbf{b}}^H\left(\theta \right){\mathbf{U}}_n{{\mathbf{U}}_n}^H\mathbf{b}\left(\theta \right)}, $$

(16)

where **b**(*θ*) is the steering vector and **U**_{n} is the noise subspace.
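Equations (15) and (16) can be sketched as follows. The steering vector **b**(*θ*) is supplied by the caller; the half-wavelength narrowband form in `half_wavelength_steering` is only an assumed stand-in for illustration, and all function names are hypothetical. The small floor on the denominator is a numerical safeguard that is not part of (16).

```python
import numpy as np

def averaged_stfd(D_list):
    """Eq. (15): average of D_XX over the single-source TF points of one signal."""
    return np.mean(D_list, axis=0)

def half_wavelength_steering(theta_deg, M):
    """Assumed narrowband ULA steering vector with spacing of half a wavelength."""
    return np.exp(1j * np.pi * np.arange(M) * np.cos(np.deg2rad(theta_deg)))

def music_spectrum(D_avg, n_sources, steering, theta_grid_deg):
    """Eq. (16): P(theta) = 1 / (b^H U_n U_n^H b) on a grid of candidate angles."""
    M = D_avg.shape[0]
    w, V = np.linalg.eigh(D_avg)                 # eigenvalues in ascending order
    Un = V[:, : M - n_sources]                   # noise subspace: smallest eigenvalues
    P = np.empty(len(theta_grid_deg))
    for i, theta in enumerate(theta_grid_deg):
        proj = Un.conj().T @ steering(theta)     # U_n^H b(theta)
        denom = np.real(np.vdot(proj, proj))     # b^H U_n U_n^H b
        P[i] = 1.0 / max(denom, 1e-12)           # floor avoids division by zero
    return P
```

Because each averaged matrix is built from the TF points of a single signal, `n_sources=1` is the natural setting here, and the DOA estimate is the grid angle at which `P` peaks.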

For the readers’ convenience, the procedure of the single-source TF point selection algorithm based on the Hough transform and the STFT is summarized as follows:

- (1) Filter out the noise in the TF domain according to the empirical threshold *ε*_{1}.

- (2) Based on the Hough transform, remove the multiple-source TF points, and obtain the set *Ω*_{S} of all single-source TF points.

- (3) Classify the TF points by signal according to the Euclidean distance criterion, and obtain the single-source TF point set *Ω*_{n} of each signal.

- (4) For every signal, calculate the averaged STFD matrix \( {\overline{\mathbf{D}}}_n\left(t,f\right) \) over the set *Ω*_{n}, and then construct the spectrum peak search function **P**(*θ*) = 1/(**b**^{H}(*θ*)**U**_{n}**U**_{n}^{H}**b**(*θ*)) based on the MUSIC algorithm.