### 2.1 CD model

From the theory of CS [4, 5], a discrete signal *x* of length \( \mathtt{L} \) can be represented as a linear combination of a set of basis vectors \( \Psi =\left[{\psi}_1,{\psi}_2,\cdots, {\psi}_{\mathtt{N}}\right] \):

$$ x=\sum \limits_{i=1}^{N}{\theta}_i{\psi}_i=\Psi \theta $$

(1)

in which **θ** ∈ *R*^{N × 1} is the sparse vector, *θ*_{i} = 〈*x*, *ψ*_{i}〉, \( x\in {R}^{\mathtt{L}\times 1} \) is the signal to be detected, and \( \Psi \in {R}^{\mathtt{L}\times \mathtt{N}} \) is the sparse basis or sparse dictionary. If *θ* has only \( \mathtt{K}\left(\mathtt{K}\ll \mathtt{N}\right) \) nonzero elements, then *θ* is termed the \( \mathtt{K} \)-sparse representation of the signal *x*, and \( \mathtt{K} \) is the sparsity of the signal.
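As a minimal sketch of Eq. (1), the snippet below builds a signal from two atoms of an orthonormal basis and recovers the coefficients with the inner products *θ*_{i} = 〈*x*, *ψ*_{i}〉. The DCT-II basis, the size N = 8, and the helper names `dct_basis`, `inner`, and `synthesize` are assumptions chosen for illustration; the source does not prescribe a basis at this point.

```python
import math

def dct_basis(n):
    """Return n orthonormal DCT-II basis vectors, each a list of length n."""
    basis = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        basis.append([scale * math.cos(math.pi * (t + 0.5) * k / n) for t in range(n)])
    return basis

def inner(u, v):
    """Real inner product <u, v>."""
    return sum(a * b for a, b in zip(u, v))

def synthesize(basis, theta):
    """x = Psi * theta, with the basis vectors as the columns of Psi."""
    n = len(basis[0])
    return [sum(theta[i] * basis[i][t] for i in range(len(basis))) for t in range(n)]

N = 8                      # illustrative size (assumption)
psi = dct_basis(N)

theta_true = [0.0] * N     # a K-sparse coefficient vector with K = 2
theta_true[1] = 3.0
theta_true[5] = -1.5

x = synthesize(psi, theta_true)            # Eq. (1): x = Psi * theta
theta = [inner(x, p) for p in psi]         # theta_i = <x, psi_i>
K = sum(1 for c in theta if abs(c) > 1e-9) # number of nonzero coefficients
```

The inner-product recovery works here because the chosen basis is orthonormal; for a general dictionary the coefficients would have to be found by a sparse recovery method instead.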

When the signal *x* can be sparsely represented [22], the problem of CD can be expressed by the following mathematical model of hypothesis testing

$$ \left\{\begin{array}{l}{\mathtt{H}}_0:y=\Phi {s}_n\\ {}{\mathtt{H}}_1:y=\Phi \left(x+{s}_n\right)\end{array}\right. $$

(2)

in which \( \Phi \in {R}^{\mathtt{M}\times \mathtt{N}}\left(\mathtt{M}\ll \mathtt{N}\right) \) is the measurement matrix and *s*_{n} is Gaussian white noise. Hypothesis \( {\mathtt{H}}_0 \) states that the signal is absent, and hypothesis \( {\mathtt{H}}_1 \) states that the signal is present. Using Eq. (1), Eq. (2) can be rewritten as

$$ \left\{\begin{array}{l}{\mathtt{H}}_0:y=\Phi {s}_n\\ {}{\mathtt{H}}_1:y=\Phi \left(\Psi \theta +{s}_n\right)\end{array}\right. $$

(3)
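The two hypotheses can be simulated in a few lines. The following is only a sketch under stated assumptions: a random Gaussian measurement matrix, Ψ taken as the identity so that *x* is itself sparse, and arbitrary sizes and noise level. None of these choices come from the source, and `matvec` is a hypothetical helper.

```python
import random

def matvec(matrix, vec):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * b for a, b in zip(row, vec)) for row in matrix]

random.seed(0)
N, M = 32, 8    # illustrative sizes with M < N (assumption)

# Random Gaussian measurement matrix (assumption, used only for this sketch)
Phi = [[random.gauss(0.0, 1.0 / M ** 0.5) for _ in range(N)] for _ in range(M)]

# A 2-sparse signal; Psi is taken as the identity here for simplicity
x = [0.0] * N
x[3], x[17] = 2.0, -1.0

s_n = [random.gauss(0.0, 0.1) for _ in range(N)]    # white Gaussian noise

y_h0 = matvec(Phi, s_n)                                  # H0: y = Phi * s_n
y_h1 = matvec(Phi, [a + b for a, b in zip(x, s_n)])      # H1: y = Phi * (x + s_n)
signal_part = matvec(Phi, x)                             # Phi * x, the term that separates H1 from H0
```

Because the measurement is linear, the two measurement vectors differ exactly by Φ*x*, which is the quantity a detector operating on *y* has to sense.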

### 2.2 Sparse representation model for radar signals

Assuming that the radar target contains \( \mathtt{K} \) strong scattering centers located in different distance cells, the pulse signal s_{T}(*t*) ∈ *R*^{L × 1} emitted by the radar is modeled as

$$ {\boldsymbol{s}}_T(t)=\mathrm{rect}\left(t/{T}_P\right)\exp \left(j2\pi {f}_0t+ j\pi {K}_r{t}^2\right) $$

(4)
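A sampled version of the pulse in Eq. (4) might look as follows. The numerical values of `T_p`, `f0`, `K_r`, and the sampling rate `fs` are illustrative assumptions, not parameters from the source, and the rectangular window is taken as 1 for |*u*| ≤ 1/2.

```python
import cmath
import math

def rect(u):
    """Rectangular window: 1 for |u| <= 1/2, else 0."""
    return 1.0 if abs(u) <= 0.5 else 0.0

def lfm_pulse(t, T_p, f0, K_r):
    """Eq. (4): rect(t/T_p) * exp(j*2*pi*f0*t + j*pi*K_r*t^2)."""
    phase = 2.0 * math.pi * f0 * t + math.pi * K_r * t * t
    return rect(t / T_p) * cmath.exp(1j * phase)

# Illustrative parameters (assumptions)
T_p = 10e-6     # pulse width, s
f0 = 1e6        # center frequency, Hz
K_r = 2e11      # frequency modulation coefficient, Hz/s (bandwidth K_r*T_p = 2 MHz)
fs = 20e6       # sampling rate, Hz
n = 400         # number of samples, centered on t = 0

samples = [lfm_pulse((i - n // 2) / fs, T_p, f0, K_r) for i in range(n)]
```

Inside the pulse the samples have unit magnitude and a quadratic phase; outside the pulse they are zero.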

The complex base-band echo signal received by the radar system *x*_{R}(*τ*, *t*) can be described as

$$ {\displaystyle \begin{array}{c}{x}_{\mathtt{R}}\left(\tau, \mathtt{t}\right)=\sum \limits_{\mathtt{k}=1}^{\mathtt{K}}{\sigma}_{\mathtt{k}}\exp \left[-\mathtt{j}4\pi {\mathtt{f}}_0{\mathtt{r}}_{\mathtt{k}}\left(\tau \right)/\mathtt{c}\right]\\ {}\cdot \mathtt{rect}\left[\left(\mathtt{t}-2{\mathtt{r}}_{\mathtt{k}}\left(\tau \right)/\mathtt{c}\right)/{\mathtt{T}}_{\mathtt{P}}\right]\\ {}\cdot \exp \left\{\mathtt{j}\pi {\mathtt{K}}_{\mathtt{r}}{\left[\mathtt{t}-2{\mathtt{r}}_{\mathtt{k}}\left(\tau \right)/\mathtt{c}\right]}^2\right\}\\ {}+{\boldsymbol{s}}_{\mathtt{c}}\left(\mathtt{t}\right)+{\boldsymbol{s}}_{\mathtt{n}}\left(\mathtt{t}\right)\end{array}} $$

(5)

in which *σ*_{k} is the backscattering coefficient of the scattering point *k*; *T*_{P} is the time width of the pulse signal; *f*_{0} is the center frequency of the signal; *r*_{k}(*τ*) is the distance between the scattering point *k* and the radar platform at the moment of pulse emission; *c* is the speed of light; *K*_{r} is the frequency modulation coefficient; *s*_{c}(*t*) is the clutter, which generally follows the Weibull distribution; and *s*_{n}(*t*) is the Gaussian white noise [37].

With *s*_{0}(*t*) = [*rect*(*t*/*T*_{P})/(*T*_{P}|*K*_{r}|)] ⋅ exp(*jπK*_{r}*t*^{2}) substituted into Eq. (5), the following is obtained

$$ {\displaystyle \begin{array}{c}{\boldsymbol{x}}_R\left(\tau, t\right)=\sum \limits_{k=1}^K\left\{{\alpha}_k\left(\tau \right){\boldsymbol{s}}_0\left[t-2{r}_k\left(\tau \right)/c\right]\right\}\\ {}+{\boldsymbol{s}}_c(t)+{\boldsymbol{s}}_n(t)\\ {}={\boldsymbol{s}}_r(t)+{\boldsymbol{s}}_c(t)+{\boldsymbol{s}}_n(t)\end{array}} $$

(6)

in which \( {\alpha}_k\left(\tau \right)={T}_P\left|{K}_r\right|{\sigma}_{\mathtt{k}}\exp \left[-\mathtt{j}4\pi {\mathtt{f}}_0{\mathtt{r}}_{\mathtt{k}}\left(\tau \right)/\mathtt{c}\right] \).

In the entire radar scene, the target occupies only a small part of the total scanning area, so the scene is sparse. The distance resolution of the radar in the observation interval [*r*_{1}, *r*_{2}] is denoted *Δr*; thus, the target scattering centers in the distance cells can be represented by a one-dimensional vector *θ*

$$ {\theta}^{\mathrm{T}}=\left[{\mathtt{\theta}}_1,{\mathtt{\theta}}_2,\cdots, {\mathtt{\theta}}_{\mathtt{l}},\cdots, {\mathtt{\theta}}_{\mathtt{L}-1},{\mathtt{\theta}}_{\mathtt{L}}\right] $$

(7)

in which *θ*_{l} = *σ*_{l} exp[−*j*4*πf*_{0}*r*_{l}(*τ*)/*c*], *σ*_{l} is the backscattering coefficient of the scattering center of distance cell *r*_{l}, *r*_{l} = *r*_{1} + *lΔr*, *l* ∈ [1 : *L*], and *L* = 1 + (*r*_{2} − *r*_{1})/*Δr*. When there is no target in a certain distance cell *k*, *θ*_{k} = 0. Through the delay of *s*_{0}(*t*), the sparse dictionary basis Ψ ∈ *R*^{L × N} of the radar signal is constructed:

$$ {\displaystyle \begin{array}{c}{\psi}_i=\left[{s}_0\left({t}_i-2\frac{r_1}{c}\right)\right.\kern0.5em {s}_0\left({t}_i-2\frac{r_1+\varDelta r}{c}\right)\\ {}\cdots \kern0.5em {s}_0\left({t}_i-2\frac{r_1+ l\varDelta r}{c}\right)\\ {}\ {\left.\cdots \kern0.5em {s}_0\left({t}_i-2\frac{r_2-\varDelta r}{c}\right)\kern0.5em {s}_0\left({t}_i-2\frac{r_2}{c}\right)\right]}^H\\ {}={\left[{s}_0\left({t}_i-{\tau}_1\right)\kern0.5em {s}_0\left({t}_i-{\tau}_2\right)\kern0.5em \cdots \kern0.5em {s}_0\left({t}_i-{\tau}_L\right)\right]}^H\end{array}} $$

(8)

Ψ = [*ψ*_{1}, *ψ*_{2}, ⋯, *ψ*_{N}], in which *i* = 1, 2, ⋯, *N*, and *τ*_{l} = 2*r*_{l}/*c* (*l* = 1, 2, ⋯, *L*) is the round-trip time delay corresponding to the *l*-th distance cell. When the clutter and noise are neglected, according to Eqs. (1) and (8), the target scattering centers can be sparsely expressed as

$$ {\theta}_r={\Psi}^{\mathrm{H}}{\boldsymbol{s}}_r(t) $$

(9)
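The construction in Eqs. (6) to (9) can be sketched numerically. The example below works in normalized units (time measured in samples, one distance cell per sample of delay, and *T*_{P}|*K*_{r}| = 1), and it places two hypothetical scatterers in cells 5 and 20. All of these values and the helper names are assumptions for illustration only. The delayed chirps are stored one per distance cell, and Ψ^{H}*s*_{r} is computed as the inner product of the echo with each conjugated atom.

```python
import cmath
import math

def s0(t, T_p, K_r):
    """Baseband chirp s_0(t) = rect(t/T_p) / (T_p*|K_r|) * exp(j*pi*K_r*t^2)."""
    if abs(t) > T_p / 2.0:
        return 0.0 + 0.0j
    return cmath.exp(1j * math.pi * K_r * t * t) / (T_p * abs(K_r))

def build_dictionary(times, delays, T_p, K_r):
    """One atom per distance cell: the chirp delayed by that cell's round-trip time."""
    return [[s0(t - tau, T_p, K_r) for t in times] for tau in delays]

def correlate(atoms, signal):
    """theta_r = Psi^H s_r: inner product of the signal with each conjugated atom."""
    return [sum(a.conjugate() * s for a, s in zip(atom, signal)) for atom in atoms]

# Normalized, illustrative parameters (assumptions)
T_p = 64.0             # pulse width in samples
K_r = 1.0 / 64.0       # chirp rate, so that T_p * |K_r| = 1
L = 32                 # number of distance cells
times = [float(n) for n in range(-40, 40 + L)]   # long enough to hold every delayed pulse
delays = [float(l) for l in range(L)]            # tau_l, one sample per cell

atoms = build_dictionary(times, delays, T_p, K_r)

# Hypothetical scatterers: cell index -> complex coefficient alpha_k
alpha = {5: 1.0 + 0.0j, 20: 0.6j}
echo = [sum(a * atoms[l][n] for l, a in alpha.items()) for n in range(len(times))]

theta_r = correlate(atoms, echo)                 # Eq. (9), clutter and noise neglected
ranked = sorted(range(L), key=lambda l: abs(theta_r[l]), reverse=True)
```

In this sketch the two largest entries of `theta_r` fall on the occupied cells, while the remaining cells hold only small correlation sidelobes, because delayed chirps are nearly but not exactly orthogonal.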

### 2.3 Measurement matrix model

According to the theory of CS, by suitably designing the measurement matrix, the complex base-band echo signal *s*_{r}(*t*) can be compressed and then detected directly from the compressed measurements. Let the measurement matrix be

$$ \Phi ={\Phi}_1{\Psi}^{\mathrm{H}} $$

(10)

in which Φ_{1} ∈ *R*^{M × N}, *N*/*M* = *l*, and *l* is an integer. Then, Φ_{1} is defined as

$$ {\displaystyle \begin{array}{c}{\Phi}_1=\left[\begin{array}{cccc}1& & & \\ {}& 1& & \\ {}& & \ddots & \\ {}& & & 1\end{array}\cdots \begin{array}{cccc}1& & & \\ {}& 1& & \\ {}& & \ddots & \\ {}& & & 1\end{array}\right]\\ {}=\left[\mathrm{eye}\left(M,M\right)\kern0.5em \cdots \kern0.5em \mathrm{eye}\left(M,M\right)\right]\end{array}} $$

(11)
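The block structure of Eq. (11) means that Φ_{1} folds a length-*N* vector into *M* entries by summing all elements whose indices are equal modulo *M*. A small sketch, with illustrative sizes and hypothetical helper names, makes this explicit:

```python
def folding_matrix(M, l):
    """Phi_1 = [eye(M) ... eye(M)] with l identity blocks, so N = M * l columns."""
    N = M * l
    return [[1 if col % M == row else 0 for col in range(N)] for row in range(M)]

def apply(matrix, vec):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * b for a, b in zip(row, vec)) for row in matrix]

def fold(vec, M):
    """Equivalent shortcut: add up entries whose indices agree modulo M."""
    out = [0] * M
    for idx, value in enumerate(vec):
        out[idx % M] += value
    return out

M, l = 4, 3                      # illustrative sizes (assumption), N = 12
Phi_1 = folding_matrix(M, l)
v = list(range(1, M * l + 1))    # any length-N vector
y = apply(Phi_1, v)              # same result as fold(v, M)
```

In practice the `fold` form avoids storing Φ_{1} at all, since the matrix contains only zeros and ones in a fixed pattern.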

According to Eq. (3), the following is obtained

$$ \left\{\begin{array}{l}{H}_0:\boldsymbol{y}={\Phi}_1{\Psi}^{\mathrm{H}}\left({\boldsymbol{s}}_c+{\boldsymbol{s}}_n\right)\\ {}{H}_1:\boldsymbol{y}={\Phi}_1{\Psi}^{\mathrm{H}}\left(\boldsymbol{x}+{\boldsymbol{s}}_c+{\boldsymbol{s}}_n\right)\end{array}\right. $$

(12)

Substituting Eq. (11) into Eq. (12), the signal expression of hypothesis *H*_{1} is obtained:

$$ {\displaystyle \begin{array}{c}\boldsymbol{y}={\Phi}_1{\Psi}^{\mathtt{H}}\left(\boldsymbol{x}+{\boldsymbol{s}}_c+{\boldsymbol{s}}_n\right)\\ {}={\Phi}_1\left({\theta}_r+\boldsymbol{D}\right)\\ {}=\left[\mathtt{eye}\left(M,M\right)\kern0.5em \cdots \kern0.5em \mathtt{eye}\left(M,M\right)\right]\\ {}\cdot \left\{\left[\begin{array}{c}0\\ {}\vdots \\ {}{\upsigma}_1^2\\ {}\vdots \\ {}0\\ {}{\upsigma}_L^2\\ {}\vdots \\ {}0\end{array}\right]+\left[\begin{array}{c}\boldsymbol{D}(1)\\ {}\boldsymbol{D}(2)\\ {}\\ {}\vdots \\ {}\\ {}\boldsymbol{D}\left(N-1\right)\\ {}\boldsymbol{D}(N)\end{array}\right]\right\}\end{array}} $$

(13)

Here, *D* = Ψ^{H}(*s*_{c} + *s*_{n}) is the projection of the clutter and noise onto the dictionary. Since the radar target signal *x* is sparse, Eq. (13) shows that \( {\sigma}_1^2,\cdots, {\sigma}_L^2 \) are the projections of the *L* targets onto the dictionary basis. When the element 1 in the *i*-th row (1 ≤ *i* ≤ *L*) of Φ_{1} is multiplied by \( {\sigma}_i^2 \) and *D*, the results are all 0 except for *D* multiplied by the *i*-th element 1 of Φ_{1}. Good noise reduction performance is thus achieved, and it improves as *M* increases. The CD of the echo signal reduces the sampling rate and cuts the required storage space to 1/*l* of the original, and the compressed signal is detected directly without reconstruction.

### 2.4 SVDD model

Classifiers are commonly divided into one-class and multiclass classifiers according to the number of categories in the training samples [38]. Multiclass classifiers are used mainly for problems in which the training samples are sufficient and the categories are reasonably balanced. Training samples from every category are needed to construct the classification function, and classification is achieved by determining the optimal boundaries between categories. A one-class classifier is used mainly when training samples of multiple categories cannot be obtained or are too expensive to obtain. It needs training samples from only one target category, which are used to construct a closed boundary around that category; an unknown test sample is then judged to be a target or a non-target.

The SVDD is a typical one-class classifier that is often used for abnormal data detection. In the training phase, samples from only one type of target are needed to obtain the optimal classification surface [36]. Let the radius of the hypersphere found in the high-dimensional feature space be *R*_{ϕ} and its center vector be *a*_{ϕ}; any point *ϕ*(*x*_{i}) on the hypersphere then satisfies \( {\left\Vert \phi \left({x}_i\right)-{a}_{\phi}\right\Vert}^2={R}_{\phi}^2 \). Therefore, the solution process for the optimal classification surface of the SVDD can be expressed as

$$ \left\{\begin{array}{l}\min\;{\mathtt{R}}_{\phi}^2+\mathtt{C}\sum \limits_{\mathtt{i}=1}^l{\xi}_i\\ {}s.t.{\left\Vert \phi \left({x}_i\right)-{a}_{\phi}\right\Vert}^2\le {\mathtt{R}}_{\phi}^2+{\xi}_i,{\xi}_i\ge 0,i=1,2,\cdots, l\end{array}\right. $$

(14)

in which *l* is the number of sample points, *C* is the penalty coefficient, and *ξ*_{i} is the slack variable corresponding to the *i*-th sample. With the Lagrange multiplier method, we can obtain the dual optimization problem

$$ \left\{\begin{array}{l}\max \sum \limits_{i=1}^l{\beta}_iK\left({x}_i,{x}_i\right)-\sum \limits_{i=1}^l\sum \limits_{j=1}^l{\beta}_i{\beta}_jK\left({x}_i,{x}_j\right)\\ {}s.t.\sum \limits_{i=1}^l{\beta}_i=1,{\beta}_i\in \left[0,C\right],i=1,2,\cdots, l\end{array}\right. $$

(15)

in which the kernel function *K*(*x*_{i}, *x*_{j}) = *ϕ*^{T}(*x*_{i})*ϕ*(*x*_{j}) and *β*_{i} is the Lagrange multiplier. In the process of solving the dual optimization problem, the expression of the sphere center vector of the hypersphere is obtained:

$$ {a}_{\phi }=\sum \limits_{i=1}^l{\beta}_i\phi \left({x}_i\right) $$

(16)

Using the Lagrange multipliers and any support vector *x* whose image *ϕ*(*x*) lies on the hypersphere, the radius of the hypersphere can be obtained:

$$ {\displaystyle \begin{array}{c}{R}_{\phi}^2=K\left(x,x\right)-2\sum \limits_{i=1}^l{\beta}_iK\left({x}_i,x\right)\\ {}+\sum \limits_{i=1}^l\sum \limits_{j=1}^l{\beta}_i{\beta}_jK\left({x}_i,{x}_j\right)\end{array}} $$

(17)

Any sample point *x* becomes *ϕ*(*x*) after being mapped to the high-dimensional feature space, and the decision-making method is

$$ {f}_{\phi }(x)={\left\Vert \phi (x)-{a}_{\phi}\right\Vert}^2-{R}_{\phi}^2 $$

(18)

When *f*_{ϕ}(*x*) ≤ 0, *x* is the target sample. Otherwise, *x* is an abnormal sample.
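The decision rule can be evaluated entirely through kernel values, because the squared distance to the center in Eq. (18) expands into the same kernel sums as the right-hand side of Eq. (17). The sketch below assumes that the multipliers *β*_{i} are already available, for example from a quadratic programming solver. To stay self-contained it uses a two-sample training set, for which the dual optimum is *β* = (1/2, 1/2) by symmetry. The Gaussian kernel, its width `gamma`, the sample coordinates, and the function names are all assumptions for illustration.

```python
import math

def rbf(u, v, gamma=0.5):
    """Gaussian kernel K(u, v) = exp(-gamma * ||u - v||^2) (assumed kernel choice)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))

def dist2_to_center(x, train, beta, kernel):
    """||phi(x) - a_phi||^2 written with kernels only, using a_phi from Eq. (16)."""
    cross = sum(b * kernel(xi, x) for b, xi in zip(beta, train))
    gram = sum(bi * bj * kernel(xi, xj)
               for bi, xi in zip(beta, train)
               for bj, xj in zip(beta, train))
    return kernel(x, x) - 2.0 * cross + gram

def decision(x, train, beta, support_vector, kernel):
    """f_phi(x) of Eq. (18); R_phi^2 is the squared distance of a support vector."""
    r2 = dist2_to_center(support_vector, train, beta, kernel)
    return dist2_to_center(x, train, beta, kernel) - r2

# Two hypothetical training samples; both lie on the sphere and are support vectors
train = [(0.0, 0.0), (2.0, 0.0)]
beta = [0.5, 0.5]      # dual optimum for this symmetric two-sample case

f_mid = decision((1.0, 0.0), train, beta, train[0], rbf)   # between the samples
f_far = decision((6.0, 0.0), train, beta, train[0], rbf)   # far from both samples
```

Here `f_mid` is non-positive, so the midpoint is accepted as a target sample, whereas `f_far` is positive and the distant point is flagged as abnormal. For a realistic training set the multipliers would come from solving the dual problem in Eq. (15) rather than from symmetry.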