# Adaptive matching pursuit with constrained total least squares

## Abstract

Compressive sensing (CS) can effectively recover a signal that is sparse over a set of discrete atoms. In some applications, however, signals are sparse in a continuous parameter space, e.g., frequency space, rather than over discrete atoms. The usual approach is to divide the continuous parameter into a finite set of grid points and build a dictionary from them. However, the actual targets may not lie exactly on the grid points no matter how densely the parameter is gridded, which introduces a mismatch between the predefined dictionary and the actual one. In this article, a novel method, namely adaptive matching pursuit with constrained total least squares (AMP-CTLS), is proposed to find the actual atoms even if they are not included in the initial dictionary. In AMP-CTLS, the grid and the dictionary are adaptively updated to better agree with the measurements. The convergence of the algorithm is discussed, and numerical experiments demonstrate the advantages of AMP-CTLS.

## 1 Introduction

Compressed sampling, or compressive sensing (CS), is a class of techniques that has recently been widely used because of its good performance in areas such as signal processing, communication, and statistics. Generally, CS finds the sparsest vector x from measurements y = Φx, where Φ is referred to as the dictionary and has more columns than rows; each column of the dictionary is called an atom or a basis.

Matching pursuit (MP) is a family of popular greedy approaches to compressive sensing. The basic idea is to sequentially find the support set of x and then project onto the selected atoms. The atoms selected into the support set are mainly determined by correlations between the atoms and the regularized measurements. MP methods include standard MP and several variants, such as orthogonal matching pursuit (OMP), regularized OMP (ROMP), stage-wise OMP (StOMP), compressive sampling matching pursuit (CoSaMP), and subspace pursuit (SP).

These MP methods do not consider the off-grid problem in grid-based CS approaches. In some applications of CS, such as harmonic retrieval and radar signal processing (e.g., range profiling [8, 9] and direction-of-arrival estimation), a continuous parameter space is usually divided into discrete grid points to generate the dictionary. For example, in harmonic retrieval, the frequency space is divided and the dictionary is a discrete Fourier transform (DFT) matrix. The off-grid problem emerges when the actual frequencies lie off the predefined grid. The mismatch between the predefined and actual atoms can degrade the performance of sparse recovery.

The grid misalignment problem in CS has recently received growing interest. The sensitivity of CS to the mismatch between the predefined and actual atoms has been studied; however, that work focuses mainly on mismatch analysis rather than on developing an algorithm. Cabrera et al. and Zhu et al. provided, respectively, an iterative re-weighted (IRW)-based and a Lasso-based method to recover an unknown vector in the presence of atom misalignment, whereas we focus on MP methods in this article. Compared with IRW or Lasso, MP methods greedily find the support set and greatly reduce the dimension of the CS problem; thus, they have an advantage in computational cost. Gabriel proposed best basis compressive sensing in a tree-structured dictionary, but some dictionaries (e.g., the DFT matrix) do not possess a tree structure.

To alleviate the off-grid problem in matching pursuit, we developed adaptive matching pursuit with constrained total least squares (AMP-CTLS). In AMP-CTLS, we model the grid as an unknown parameter and adaptively search for the best one. We choose harmonic retrieval to demonstrate the performance of AMP-CTLS. The algorithm can also be applied to jointly estimate range and velocity in randomized step frequency (RSF) radar. Note that in the RSF scenario, range-velocity estimation is difficult to solve directly with subspace-based methods, e.g., Capon's method, MUSIC, and ESPRIT. Since only a single snapshot of data is available in RSF radar, these subspace-based methods need a smoothing step to obtain the covariance matrix, and smoothing requires uniform and linear sampling. This condition is not satisfied under the random frequency model.

This article is structured as follows. Section 2 introduces grid-based CS and outlines the procedures of AMP-CTLS. In Sections 3 and 4, we discuss the implementation of AMP-CTLS in harmonic retrieval and RSF radar, respectively. In Section 5, numerical examples are presented to illustrate the merits of AMP-CTLS. Section 6 is dedicated to a brief conclusion. Notations: $(\cdot)^H$ denotes the conjugate transpose of a matrix; $(\cdot)^T$ the transpose; $(\cdot)^*$ the conjugate; $(\cdot)^\dagger$ the pseudo-inverse; $I_L$ / $0_L$ the $L \times L$ identity/zero matrix; $\|\cdot\|_2$ the $\ell_2$ norm; $\{\cdot\}$ a set; $|\cdot|$ the absolute value of a complex number or the cardinality of a set; $(\cdot)_\Lambda$ the elements/columns of a vector/matrix indexed by the set $\Lambda$; $\mathrm{supp}(\cdot)$ the support set of a vector, that is, the indices of its nonzero elements; $\mathrm{Re}(\cdot)$ the real part of a complex number; $\otimes$ the right Kronecker product; and $E[\cdot]$ the expectation of a random variable.

## 2 Grid-based CS and the AMP-CTLS algorithm

The signal model of grid-based CS is introduced in Section 2.1. We combine the greedy idea of MP methods and the constrained total least squares (CTLS) technique , and thus produce AMP-CTLS to alleviate the off-grid problem. In AMP-CTLS, the grid is cast as an unknown parameter, and is jointly estimated together with x. In Section 2.2, the framework of AMP-CTLS is given. Section 2.3 is dedicated to the iterative joint estimator (IJE) algorithm, which is implemented in AMP-CTLS. In the IJE algorithm, the CTLS technique is used, which is presented in Section 2.4. Section 2.5 summarizes the entire procedure of AMP-CTLS. In Section 2.6, the convergence of IJE is analyzed.

### 2.1 Grid-based CS

CS promises efficient recovery of sparse signals. In many applications, signals are sparse in a continuous parameter space rather than finite discrete atoms. Usually, we divide the continuous parameter into discrete grid points and cast the problem as a grid-based CS model:

$y = Φ ( g ) x + w ,$
(1)

where $y \in \mathbb{C}^{M \times 1}$ and $w \in \mathbb{C}^{M \times 1}$ are the measurement vector and the white Gaussian noise (WGN) vector, respectively, and $x \in \mathbb{C}^{N \times 1}$ is to be learned. The $N \times 1$ vector $g = [g_1, g_2, \ldots, g_N]^T$ contains the discrete grid points. The dictionary $\Phi(g) \in \mathbb{C}^{M \times N}$ is built from $g$ as $\Phi(g) = [\phi(g_1), \phi(g_2), \ldots, \phi(g_N)]$, and the mapping $g \to \Phi$ is known. For example, to recover a frequency-sparse signal, we grid the frequency space into discrete frequency points $g = [0, \frac{1}{N}, \frac{2}{N}, \ldots, \frac{N-1}{N}]^T$. Then $\Phi$ is a DFT matrix whose $m$th-row, $n$th-column element is $\exp(j 2 \pi \frac{n}{N} m)$. However, the signal is sparse in the DFT atoms only if all of the sinusoids are exactly at the predefined grid points. In some cases, no matter how densely we grid the frequency space, the sinusoids can be off-grid, which degrades the performance of CS methods.
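As a small illustration of the off-grid effect just described, the following sketch builds the DFT dictionary of the example and counts the nonzero coefficients needed to represent an on-grid and an off-grid sinusoid. The helper name `fourier_dictionary` and the chosen frequencies (9/N and 9.5/N) are assumptions made for this illustration only.

```python
import numpy as np

def fourier_dictionary(g, M):
    """Return the M x N matrix whose (m, n) entry is exp(j*2*pi*g[n]*m)."""
    m = np.arange(M)[:, None]
    return np.exp(2j * np.pi * m * np.asarray(g)[None, :])

M = N = 32
g = np.arange(N) / N                 # uniform grid 0, 1/N, ..., (N-1)/N
Phi = fourier_dictionary(g, M)
m = np.arange(M)

y_on = np.exp(2j * np.pi * (9.0 / N) * m)    # sinusoid exactly on grid point 9
y_off = np.exp(2j * np.pi * (9.5 / N) * m)   # sinusoid halfway between grid points

# For M = N the columns of Phi are orthogonal, so Phi^H y / M gives the
# exact expansion coefficients of y over the grid atoms.
x_on = Phi.conj().T @ y_on / M
x_off = Phi.conj().T @ y_off / M

n_on = int(np.sum(np.abs(x_on) > 1e-8))    # atoms needed for the on-grid signal
n_off = int(np.sum(np.abs(x_off) > 1e-8))  # atoms needed for the off-grid signal
print(n_on, n_off)
```

The on-grid sinusoid is represented by a single atom, whereas the off-grid one spreads over every atom of the dictionary, so it is no longer sparse in this basis.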

### 2.2 Main idea of AMP-CTLS

The off-grid problem usually emerges because we rarely have enough a priori knowledge to generate a perfect grid that guarantees that all of the signals lie exactly on grid points. Thus, we cast the grid as an unknown parameter, and search for the best grid g as well as the sparsest x by solving the optimization problem:

$\{\hat{x}, \hat{g}\} = \arg\min_{x, g} \|x\|_0, \quad \text{s.t. } \|y - \Phi(g) x\|_2^2 \le \eta,$
(2)

where η is the noise power. Equation (2) is similar to the problem solved by traditional MP methods, except that we recover x and estimate the grid simultaneously. In most cases, (2) is a complicated non-linear optimization problem. In this article, an iterative method is introduced to solve it.

AMP-CTLS inherits the greedy idea from MP methods, which use correlations to iteratively find the support set. In each iteration, one or more atoms are added to the support set. Suppose the support set obtained after the $k$th iteration is $\Lambda^{(k)}$, and denote the corresponding grid points by $\hat{g}_{\Lambda}^{(k)}$. In traditional MP methods, $x_\Lambda$ is estimated by solving a least squares problem. In AMP-CTLS, to account for the off-grid problem, we jointly search for $x_\Lambda$ and the best grid points in the continuous neighborhood of $\hat{g}_{\Lambda}^{(k)}$ via (3), which minimizes the norm of the residual error $r = y - \Phi(g_\Lambda) x_\Lambda$:

$\{\hat{x}_\Lambda, \hat{g}_\Lambda\} = \arg\min_{x_\Lambda, g_\Lambda} \|y - \Phi(g_\Lambda) x_\Lambda\|_2^2 .$
(3)

We develop the iterative joint estimator (IJE) algorithm to solve (3), which is detailed in the next section.

### 2.3 IJE algorithm

It is difficult to find an analytical solution to (3), so the IJE algorithm is devised to seek a numerical one. Given initial grid points $\hat{g}_\Lambda^{(0)}$, IJE searches for the best grid points $g_\Lambda$ in the neighborhood of $\hat{g}_\Lambda^{(0)}$. The grid mismatch is denoted by $\Delta g_\Lambda = g_\Lambda - \hat{g}_\Lambda^{(0)} = [\Delta g_1, \ldots, \Delta g_{|\Lambda|}]^T$. IJE consists of three steps: estimate the mismatch, $\Delta \hat{g}_\Lambda$; update the grid with $\Delta \hat{g}_\Lambda$; and estimate $x_\Lambda$ by projection onto the new grid points. These three steps are executed iteratively to pursue more accurate results. To distinguish these loops from the iterations that search for the support set, we denote by $l$ the counter of loops in IJE; thus, IJE is expressed as follows:

$\{\Delta \hat{g}_\Lambda^{(l)}, x_{\mathrm{CTLS}}\} = \arg\min_{\Delta g_\Lambda, x_\Lambda} C_{\mathrm{CTLS}}(\Delta g_\Lambda, x_\Lambda),$
(4)
$\hat{g}_\Lambda^{(l+1)} = \hat{g}_\Lambda^{(l)} + \Delta \hat{g}_\Lambda^{(l)},$
(5)
$\hat{x}_\Lambda^{(l+1)} = \arg\min_{x_\Lambda} \|y - \Phi(\hat{g}_\Lambda^{(l+1)}) x_\Lambda\|_2^2 .$
(6)

In (4), the CTLS technique is applied to simultaneously search for the mismatch $\Delta g_\Lambda$ and $x_\Lambda$, and $\Delta \hat{g}_\Lambda^{(l)}$ and $x_{\mathrm{CTLS}}$ are the results. $C_{\mathrm{CTLS}}$ denotes the penalty function of CTLS, which is detailed in Section 2.4. Since (6) is a linear least squares problem, the closed-form solution is

$\hat{x}_\Lambda^{(l+1)} = \Phi(\hat{g}_\Lambda^{(l+1)})^\dagger y = (\Phi^H \Phi)^{-1} \Phi^H y, \quad \text{with } \Phi = \Phi(\hat{g}_\Lambda^{(l+1)}) .$
(7)

The loops are terminated when the norm of residual error is scarcely reduced.

### 2.4 CTLS technique

Traditional MP methods apply least squares to calculate the amplitudes $x_\Lambda$ after finding the support set. When there are off-grid signals, mismatches occur in the dictionary; thus, we replace the least squares model with the total least squares (TLS) criterion, which is appropriate for fitting problems in which perturbations exist in both the measurement vector and the dictionary. Since the dictionary mismatches are constrained by the errors of the grid points, we introduce the constrained total least squares (CTLS) technique in AMP-CTLS to jointly estimate the grid point errors and $x_\Lambda$, i.e., to solve (4). It has been proved that CTLS is a constrained space state maximum likelihood estimator.

Suppose that we obtain the estimate of the grid points $\hat{g}_\Lambda^{(l)}$ after the $l$th IJE iteration. Assume that the mismatch $\Delta g_\Lambda$ is sufficiently small; thus, we can approximate the perfect dictionary $\Phi(g_\Lambda)$ as a linear function of the mismatch $\Delta g_\Lambda$ via a Taylor expansion:

$\Phi(g_\Lambda) = \Phi(\hat{g}_\Lambda^{(l)}) + \sum_{i=1}^{|\Lambda|} R_i(\hat{g}_\Lambda^{(l)}) \Delta g_i + \sum_{i=1}^{|\Lambda|} o(\Delta g_i^2),$
(8)

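where $R_i \in \mathbb{C}^{M \times |\Lambda|}$ is

$R_i(\hat{g}_\Lambda^{(l)}) = \left. \frac{\partial \Phi(g_\Lambda)}{\partial g_i} \right|_{g_\Lambda = \hat{g}_\Lambda^{(l)}},$
(9)

and $o(\cdot)$ denotes higher-order terms. For simplicity, in this section we drop the iteration counter from the notation, and $R_i(\hat{g}_\Lambda^{(l)})$ and $\Phi(\hat{g}_\Lambda^{(l)})$ are simplified to $R_i$ and $\Phi_\Lambda$, respectively. Neglecting $o(\Delta g_i^2)$, the signal model in (1) is replaced by:

$y = \left( \Phi_\Lambda + \sum_{i=1}^{|\Lambda|} R_i \Delta g_i \right) x_\Lambda + w .$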
(10)
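To give a feeling for when the higher-order terms in (8) can be neglected, the following sketch compares a Fourier atom with its first-order expansion for two mismatch sizes. It assumes the atom $\phi(g)_m = \exp(j 2\pi g m)$ of the example in Section 2.1; the grid point 9/M and the two mismatch values are arbitrary choices for illustration.

```python
import numpy as np

M = 32
m = np.arange(M)
g_hat = 9.0 / M                      # current grid point (illustrative value)

def phi(g):
    """Fourier atom with entries exp(j*2*pi*g*m), m = 0..M-1."""
    return np.exp(2j * np.pi * g * m)

# Derivative of the atom with respect to g, evaluated at g_hat.
R = 2j * np.pi * m * phi(g_hat)

def linearization_error(dg):
    """Norm of the terms neglected by the first-order expansion."""
    return np.linalg.norm(phi(g_hat + dg) - (phi(g_hat) + R * dg))

err_small = linearization_error(1e-4)
err_large = linearization_error(1e-3)
ratio = err_large / err_small        # close to 100 if the error is second order
print(err_small, err_large, ratio)
```

A tenfold larger mismatch produces a roughly hundredfold larger approximation error, which is why the linearized model (10) is only trusted in a small neighborhood of the current grid.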

CTLS models $\Delta g_\Lambda$ as an unknown random perturbation vector. The grid misalignment and the noise vector are combined into an $(M + |\Lambda|)$-dimensional vector $v = [(\Delta g_\Lambda)^T, w^T]^T$, and CTLS aims at minimizing $\|v\|_2^2$. It has been proved that CTLS is a constrained space state maximum likelihood estimator if $v$ is a WGN vector. Thus, we first whiten $v$. Assume that $\Delta g_\Lambda$ is independent of $w$. The covariance matrix of $\Delta g_\Lambda$ is $C_g = E[\Delta g_\Lambda (\Delta g_\Lambda)^H] \in \mathbb{C}^{|\Lambda| \times |\Lambda|}$, and $D \in \mathbb{C}^{|\Lambda| \times |\Lambda|}$ obeys $C_g^{-1} = D^H D$. The variance of the white noise $w$ is $\sigma_w^2$. We define the unknown normalized vector $u \in \mathbb{C}^{(M + |\Lambda|) \times 1}$ as in (11); thus, $u$ is a WGN vector.

$u = \begin{bmatrix} D \Delta g_\Lambda \\ \frac{1}{\sigma_w} w \end{bmatrix}$
(11)
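One way to obtain a matrix $D$ with $C_g^{-1} = D^H D$ is a Cholesky factorization of $C_g^{-1}$. The sketch below is only an illustration under that assumption (the text does not prescribe how $D$ is computed), and the covariance matrix used is random test data.

```python
import numpy as np

rng = np.random.default_rng(0)
K = 3                                          # plays the role of |Lambda|

# Random Hermitian positive definite covariance, for illustration only.
A = rng.standard_normal((K, K)) + 1j * rng.standard_normal((K, K))
C_g = A @ A.conj().T + K * np.eye(K)

# numpy returns lower-triangular L with inv(C_g) = L L^H, so D = L^H
# satisfies D^H D = inv(C_g).
L = np.linalg.cholesky(np.linalg.inv(C_g))
D = L.conj().T

# Whitening check: the covariance of D * dg is D C_g D^H, which should be I.
whitened_cov = D @ C_g @ D.conj().T
print(np.round(np.abs(whitened_cov), 6))
```

In the simulations of Section 5 the mismatches are treated as uncorrelated with equal variance, in which case $D$ reduces to a scaled identity matrix.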

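Minimizing the penalty function $C_{\mathrm{CTLS}} = \|u\|_2^2$, (4) is detailed as follows:

$\{x_{\mathrm{CTLS}}, \hat{u}\} = \arg\min_{x_\Lambda, u} \|u\|_2^2,$
(12)
$\text{s.t.} \quad -y + \left( \Phi_\Lambda + \sum_{i=1}^{|\Lambda|} R_i \Delta g_i \right) x_\Lambda + w = 0 .$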
(13)

The constraint condition (13) can be rewritten as:

$\text{s.t.} \quad -y + \Phi_\Lambda x_\Lambda + W_x u = 0,$
(14)

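where $W_x = [H, \; \sigma_w I_M] \in \mathbb{C}^{M \times (|\Lambda| + M)}$, and $H \in \mathbb{C}^{M \times |\Lambda|}$ is defined as

$H = G (D^{-1} \otimes I_{|\Lambda|}) (I_{|\Lambda|} \otimes x_\Lambda),$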
(15)

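where $G = [R_1, \ldots, R_{|\Lambda|}] \in \mathbb{C}^{M \times |\Lambda|^2}$. The equivalence between (13) and (14) is proved as follows:

$\sum_{i=1}^{|\Lambda|} R_i \Delta g_i x_\Lambda = G (\Delta g_\Lambda \otimes I_{|\Lambda|}) x_\Lambda = G (D^{-1} \otimes I_{|\Lambda|}) (D \Delta g_\Lambda \otimes I_{|\Lambda|}) x_\Lambda = G (D^{-1} \otimes I_{|\Lambda|}) (I_{|\Lambda|} \otimes x_\Lambda) D \Delta g_\Lambda = H D \Delta g_\Lambda .$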
(16)
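The chain of equalities in (16) implies $W_x u = H D \Delta g_\Lambda + w$, which is what turns (13) into (14). The Kronecker identities can be checked numerically on random data. In the sketch below, all matrices are random placeholders and `K` stands for $|\Lambda|$; it only verifies the algebra and is not part of the algorithm.

```python
import numpy as np

rng = np.random.default_rng(1)
M, K = 8, 3                                   # K plays the role of |Lambda|

def crandn(*shape):
    """Complex standard normal array (illustrative helper)."""
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

R = [crandn(M, K) for _ in range(K)]          # placeholder R_1, ..., R_K
G = np.hstack(R)                              # G = [R_1, ..., R_K], M x K^2
D = crandn(K, K) + 3 * np.eye(K)              # some invertible D
x = crandn(K)
dg = crandn(K)
I = np.eye(K)

# Left-hand side of (16): sum_i R_i * dg_i * x
lhs = sum(R[i] * dg[i] for i in range(K)) @ x

# First equality: G (dg kron I) x
step1 = G @ np.kron(dg[:, None], I) @ x

# Definition (15) and the last expression of (16): H D dg
H = G @ np.kron(np.linalg.inv(D), I) @ np.kron(I, x[:, None])
rhs = H @ (D @ dg)

print(np.linalg.norm(lhs - step1), np.linalg.norm(lhs - rhs))
```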

When $W_x$ has full row rank, the optimization problem (12), (14) is equivalent to (17)-(19), as has been proved in the CTLS literature.

$x_{\mathrm{CTLS}} = \arg\min_{x_\Lambda} \| W_x^\dagger (y - \Phi_\Lambda x_\Lambda) \|_2^2$
(17)
$\hat{u} = \left. W_x^\dagger (y - \Phi_\Lambda x_\Lambda) \right|_{x_\Lambda = x_{\mathrm{CTLS}}}$
(18)
$W_x^\dagger = W_x^H (W_x W_x^H)^{-1}$
(19)
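The objective in (17) and the recovery of $\hat{u}$ in (18) can be evaluated directly for any candidate $x_\Lambda$. The following sketch does this on random placeholder data; the function name `ctls_cost` and the test values are assumptions for illustration. It does not perform the minimization over $x_\Lambda$, only the evaluation. Note that $W_x = [H, \sigma_w I_M]$ has full row rank whenever $\sigma_w \neq 0$.

```python
import numpy as np

def ctls_cost(x, y, Phi, G, D, sigma_w):
    """Evaluate ||W_x^+ (y - Phi x)||^2 from (17) and the vector u from (18)."""
    M, K = Phi.shape
    I = np.eye(K)
    H = G @ np.kron(np.linalg.inv(D), I) @ np.kron(I, x[:, None])   # eq. (15)
    Wx = np.hstack([H, sigma_w * np.eye(M)])                        # M x (K + M)
    Wx_pinv = Wx.conj().T @ np.linalg.inv(Wx @ Wx.conj().T)         # eq. (19)
    u = Wx_pinv @ (y - Phi @ x)                                     # eq. (18)
    return float(np.vdot(u, u).real), u, Wx

rng = np.random.default_rng(2)
M, K = 8, 2

def crandn(*shape):
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

Phi, G = crandn(M, K), crandn(M, K * K)       # random placeholders
D = np.eye(K) / 0.005                         # D = I / sigma_df, as in Section 5.1
x, y = crandn(K), crandn(M)

cost, u, Wx = ctls_cost(x, y, Phi, G, D, sigma_w=1.0)
constraint_residual = np.linalg.norm(-y + Phi @ x + Wx @ u)   # constraint (14)
print(cost, constraint_residual)
```

Because $W_x W_x^\dagger = I_M$ under the full-row-rank condition, the recovered $u$ satisfies the constraint (14) exactly, and it is the minimum-norm vector that does so.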

It is quite difficult to obtain an analytical solution to (17). A complex version of Newton's method has been developed for this purpose and is presented in Appendix 1. The initial value of $x_\Lambda$ required by Newton's method for (17) can be given as:

$x_{\mathrm{ini}} = \Phi_\Lambda^\dagger y = (\Phi_\Lambda^H \Phi_\Lambda)^{-1} \Phi_\Lambda^H y .$
(20)

$\Delta \hat{g}_\Lambda$ is extracted from $\hat{u}$ via $\Delta \hat{g}_\Lambda = [D^{-1}, \; 0] \, \hat{u}$, where the zero block discards the noise part of $\hat{u}$; thus (4) is solved. A sketch of CTLS is given in Algorithm 1. To the best of the authors' knowledge, convergence guarantees for this Newton method remain an open question.

### 2.5 Sketch of AMP-CTLS

Similarly to traditional MP methods, AMP-CTLS first greedily finds the support set. Then AMP-CTLS adaptively optimizes the grid points indexed by the support set. In this article, we imitate the greedy approach of OMP, in which only one atom is added to the support set in each iteration. If the number of atoms is known, the iterations are terminated when the cardinality of the support set reaches the pre-specified number. If it is not known, other widely used stopping criteria can be applied, e.g., the norm of the residual falling below a threshold. A sketch of AMP-CTLS is presented in Algorithm 2.
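The OMP-style greedy frame with the two stopping rules mentioned above can be sketched as follows. This is plain OMP on a fixed dictionary, written for illustration: AMP-CTLS would additionally refine the selected grid points with IJE after each selection, which is omitted here. The function name `omp` and its arguments are assumptions of this sketch.

```python
import numpy as np

def omp(y, Phi, K=None, tol=None):
    """Plain OMP: add one atom per iteration, then project onto the support.

    Stops when the support holds K atoms (if K is given) or when the
    residual norm drops to tol or below (if tol is given).
    """
    M, N = Phi.shape
    support = []
    x_s = np.zeros(0, dtype=complex)
    r = y.astype(complex)
    max_atoms = K if K is not None else M
    while len(support) < max_atoms:
        if tol is not None and np.linalg.norm(r) <= tol:
            break
        corr = np.abs(Phi.conj().T @ r)      # correlation with the residual
        if support:
            corr[support] = 0.0              # never pick an atom twice
        support.append(int(np.argmax(corr)))
        x_s = np.linalg.lstsq(Phi[:, support], y, rcond=None)[0]
        r = y - Phi[:, support] @ x_s        # residual after projection
    x = np.zeros(N, dtype=complex)
    x[support] = x_s
    return x, support

# On-grid toy example: two sinusoids on a 32-point Fourier grid.
M = N = 32
m = np.arange(M)[:, None]
Phi = np.exp(2j * np.pi * m * (np.arange(N) / N)[None, :])
x_true = np.zeros(N, dtype=complex)
x_true[5], x_true[20] = 2.0, 1.0
y = Phi @ x_true

x_known_K, supp_known_K = omp(y, Phi, K=2)        # sparsity known
x_by_tol, supp_by_tol = omp(y, Phi, tol=1e-8)     # residual threshold
print(sorted(supp_known_K), sorted(supp_by_tol))
```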

### 2.6 Convergence of the IJE algorithm

Here, we analyze the convergence of the IJE algorithm. Assume that the mapping $g_\Lambda \to \Phi(g_\Lambda)$ is linear, which means

$\Phi(g_\Lambda + \Delta g_\Lambda) = \Phi(g_\Lambda) + G (\Delta g_\Lambda \otimes I_{|\Lambda|}),$
(21)

where $G$ is a constant matrix.

Proposition. If the measurement $y$ is perturbed by WGN and (21) holds, IJE monotonically reduces the value of the penalty function in (3). The estimates of $x_\Lambda$ and $g_\Lambda$ satisfy:

$\| y - \Phi(\hat{g}_\Lambda^{(l)}) \hat{x}_\Lambda^{(l)} \|_2^2 \ge \| y - \Phi(\hat{g}_\Lambda^{(l+1)}) \hat{x}_\Lambda^{(l+1)} \|_2^2 .$
(22)

Proof. Define a penalty function as follows:

$f_p(\Delta g_\Lambda, x_\Lambda) = \sigma_w^2 \|u\|_2^2 = \sigma_w^2 (\Delta g_\Lambda)^H C_g^{-1} \Delta g_\Lambda + \| y - \Phi(\hat{g}_\Lambda^{(l)}) x_\Lambda - G (\Delta g_\Lambda \otimes I_{|\Lambda|}) x_\Lambda \|_2^2 ;$
(23)

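thus, $\Delta \hat{g}_\Lambda^{(l)}$ and $x_{\mathrm{CTLS}}$ are obtained by solving

$\{\Delta \hat{g}_\Lambda^{(l)}, x_{\mathrm{CTLS}}\} = \arg\min_{\Delta g_\Lambda, x_\Lambda} f_p(\Delta g_\Lambda, x_\Lambda),$
(24)

which is the same as (4), since $\sigma_w^2$ is a constant. Because $(0, \hat{x}_\Lambda^{(l)})$ is a feasible point of (24), it follows that

$f_p(\Delta \hat{g}_\Lambda^{(l)}, x_{\mathrm{CTLS}}) \le f_p(0, \hat{x}_\Lambda^{(l)}) = \| y - \Phi(\hat{g}_\Lambda^{(l)}) \hat{x}_\Lambda^{(l)} \|_2^2 .$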
(25)

Substitute (5) and (21) into $f_p(\Delta \hat{g}_\Lambda^{(l)}, x_{\mathrm{CTLS}})$, and note that $C_g^{-1}$ is a positive definite matrix; thus,

$f_p(\Delta \hat{g}_\Lambda^{(l)}, x_{\mathrm{CTLS}}) = \| y - \Phi(\hat{g}_\Lambda^{(l+1)}) x_{\mathrm{CTLS}} \|_2^2 + \sigma_w^2 (\Delta \hat{g}_\Lambda^{(l)})^H C_g^{-1} \Delta \hat{g}_\Lambda^{(l)} \ge \| y - \Phi(\hat{g}_\Lambda^{(l+1)}) x_{\mathrm{CTLS}} \|_2^2 \ge \| y - \Phi(\hat{g}_\Lambda^{(l+1)}) \hat{x}_\Lambda^{(l+1)} \|_2^2 ,$
(26)

where the last inequality follows from (6). The inequalities in (25) and (26) become equalities if and only if $\Delta \hat{g}_\Lambda^{(l)} = 0$. □

For simplicity, we have assumed that the mapping $\Phi(g_\Lambda)$ is linear. In some practical applications such as harmonic retrieval, linearity is not strictly guaranteed. However, when the atom mismatch $\Delta g$ is sufficiently small, the higher-order errors of the Taylor expansion (8) are negligible, and (21) is approximately satisfied. Numerical examples in Section 5 demonstrate the convergence of the proposed algorithm in the case of harmonic retrieval.

## 3 Application in the harmonic retrieval

In this section, we apply AMP-CTLS to harmonic retrieval. In Section 3.1, the signal model of harmonic retrieval is presented and the adverse effects of the off-grid problem on MP approaches in harmonic retrieval are discussed. In Section 3.2, we detail the implementation of AMP-CTLS in harmonic retrieval.

### 3.1 Signal model of harmonic retrieval

Consider a complex sinusoidal signal

$y_m = \sum_{k=1}^{K} \alpha_k \exp(j 2 \pi f_k m) + w_m,$
(27)

where $y_m$ is the $m$th measurement and $w_m$ is the $m$th noise sample, $m = 0, 1, \ldots, M-1$. There are $K$ sinusoids, and the amplitude $\alpha_k$ and frequency $f_k$ of the $k$th sinusoid are unknown parameters. When the sinusoids are sparse, i.e., $K \ll M$, the harmonic retrieval problem can be solved by grid-based CS approaches. Divide the digital frequency range $f \in [0, 1)$ into $N$ grid points $g = [g_1, g_2, \ldots, g_N]^T$. When all frequencies are exactly at grid points, rewrite (27) as

$y_m = \sum_{n=1}^{N} x_n \exp(j 2 \pi g_n m) + w_m,$
(28)

where g n is the frequency of the n th grid point and

$x_n = \begin{cases} \alpha_k, & \text{if the } k\text{th sinusoid is present at the } n\text{th grid point}, \\ 0, & \text{if no sinusoid is present at the } n\text{th grid point}. \end{cases}$
(29)

Rewrite (28) in matrix form as

$y = Φ x + w ,$
(30)

where the $m$th-row, $n$th-column element of $\Phi$ is $\phi(m, n) = \exp(j 2 \pi g_n m)$. CS methods are then applied to seek the sparsest solution of (30), and estimates of the frequencies and amplitudes are obtained from the indices and magnitudes of the nonzero coefficients in $x$, respectively. The sparsest solution can be obtained with MP methods, which greedily minimize the $\ell_0$ norm. It can also be obtained by minimizing the $\ell_1$ norm, the quasi-norm [24, 25], or the $\ell_{p \le 1}$ $p$-norm-like diversity measure.

We focus on MP methods in this article because of their high computational efficiency. However, conventional MP methods suffer from performance degradation if the frequency space is not perfectly gridded. When the frequency space is sparsely divided, sinusoids may lie off the grid points, and the accuracy of the frequency estimates is limited by the gap between neighboring grid points. MP methods search for the sinusoids iteratively. If an off-grid sinusoid emerges, its energy cannot be totally canceled and acts as interference in the following iterations. The leaked energy may mask weak sinusoids. On the other hand, if the frequency space is densely divided, correlations between atoms are enhanced, which also reduces the performance of MP methods. Especially in those MP methods that select multiple atoms into the support set in a single iteration, e.g., CoSaMP, SP, ROMP, and StOMP, highly correlated atoms could be chosen in the same iteration, which impairs the numerical stability of the projection onto the selected atoms.
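The growth of atom correlation with grid density can be made concrete with a short computation. The sketch below evaluates the normalized inner product between two neighboring Fourier atoms for a grid of $M$ points and for a ten times denser grid; the function name `neighbor_coherence` is an assumption of this illustration.

```python
import numpy as np

def neighbor_coherence(M, N):
    """|<phi(0), phi(1/N)>| / M for Fourier atoms of length M on an N-point grid."""
    m = np.arange(M)
    a = np.exp(2j * np.pi * 0.0 * m)
    b = np.exp(2j * np.pi * (1.0 / N) * m)
    return abs(np.vdot(a, b)) / M

M = 32
coh_coarse = neighbor_coherence(M, M)        # grid spacing 1/M
coh_dense = neighbor_coherence(M, 10 * M)    # grid spacing 1/(10 M)
print(coh_coarse, coh_dense)
```

With spacing 1/M the neighboring atoms are orthogonal, while with spacing 1/(10M) they are almost collinear, which is the source of the ill-conditioned projections mentioned above.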

### 3.2 Harmonic retrieval with AMP-CTLS

The AMP-CTLS algorithm can be applied for harmonic retrieval. AMP-CTLS adaptively finds the atoms and recovers the sinusoids. In those MP approaches with constant predefined atoms, frequency estimates are discrete values, depending on grid points. In AMP-CTLS frequency estimates are continuous, since estimates of the grid misalignments are continuous. In this section, we adjust two steps of AMP-CTLS presented in Section 2 to better fit the harmonic retrieval problem.

Calculate the $R$ matrix in (9). According to (9), the $m$th-row, $i$th-column element of $R_i$ is expressed as follows:

$R_i(m, i) = \exp(j 2 \pi g_i m) \cdot j 2 \pi m .$
(31)

Elements in other columns are all zeros.
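The structure of $R_i$ in (31) can be checked against a finite-difference derivative of the dictionary. The sketch below assumes the Fourier atoms of (30) with row index $m = 0, \ldots, M-1$; the grid values and helper names are illustrative.

```python
import numpy as np

def dictionary(g, M):
    """M x |Lambda| matrix with entries exp(j*2*pi*g[n]*m), m = 0..M-1."""
    m = np.arange(M)[:, None]
    return np.exp(2j * np.pi * m * g[None, :])

def R_matrix(g, i, M):
    """R_i of (31): only column i is nonzero."""
    m = np.arange(M)
    R = np.zeros((M, g.size), dtype=complex)
    R[:, i] = np.exp(2j * np.pi * g[i] * m) * 2j * np.pi * m
    return R

M = 32
g = np.array([3.0, 9.0, 20.0]) / M      # illustrative grid points in the support
i, eps = 1, 1e-6

# Central finite difference of the dictionary with respect to g[i].
g_plus, g_minus = g.copy(), g.copy()
g_plus[i] += eps
g_minus[i] -= eps
fd = (dictionary(g_plus, M) - dictionary(g_minus, M)) / (2 * eps)

R_i = R_matrix(g, i, M)
max_dev = np.max(np.abs(fd - R_i))
print(max_dev)
```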

Adjust the grid-updating formula in (5). In CTLS as presented in Section 2.4, the grid misalignment $\Delta g_\Lambda$ is assumed to be a complex vector; therefore, the estimate $\Delta \hat{g}_\Lambda$ is complex. However, frequency grid points are restricted to be real, so the constraint $\Delta g_\Lambda = (\Delta g_\Lambda)^*$ should be added to (12) in the case of harmonic retrieval. Unfortunately, the solver then becomes more complicated; it is derived in Appendix 2. For simplicity, (5) is replaced with (32) to update the grid points approximately:

$\hat{g}_\Lambda^{(l+1)} = \hat{g}_\Lambda^{(l)} + \mathrm{Re}\left( \Delta \hat{g}_\Lambda^{(l)} \right) .$
(32)
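The IJE loop of Section 2.3 with the harmonic-retrieval choices (31) and (32) can be sketched as follows. This is a simplified variant, not the full method: instead of the Newton search over $x_\Lambda$ in (17), it evaluates (18) at the current least-squares estimate from (7), which amounts to a damped Gauss-Newton step on the grid. It assumes $D = I/\sigma_{\Delta f}$ as in Section 5.1. The function names and the loop count are assumptions of this sketch, and this simplified scheme generally needs more loops than the full CTLS step.

```python
import numpy as np

def atoms(g, M):
    """Fourier atoms exp(j*2*pi*g[n]*m) stacked as columns."""
    m = np.arange(M)[:, None]
    return np.exp(2j * np.pi * m * g[None, :])

def ije_sketch(y, g0, sigma_df=0.005, sigma_w=1.0, n_loops=80):
    """Simplified IJE: alternate a grid update and a least-squares projection."""
    M = y.size
    m = np.arange(M)[:, None]
    g = np.array(g0, dtype=float)
    for _ in range(n_loops):
        Phi = atoms(g, M)
        x = np.linalg.lstsq(Phi, y, rcond=None)[0]           # eq. (7)
        # With D = I / sigma_df, column i of H in (15) is sigma_df * R_i x,
        # and R_i x = (j*2*pi*m * phi(g_i)) * x_i by (31).
        H = (2j * np.pi * m * Phi) * x[None, :] * sigma_df
        Wx = np.hstack([H, sigma_w * np.eye(M)])
        u = np.linalg.pinv(Wx) @ (y - Phi @ x)               # eq. (18) at fixed x
        dg = sigma_df * u[: g.size]                          # dg = D^{-1} u_1
        g = g + dg.real                                      # eq. (32)
    Phi = atoms(g, M)
    x = np.linalg.lstsq(Phi, y, rcond=None)[0]
    return g, x

# Noise-free single sinusoid halfway between two grid points of an M-point grid.
M = 32
f_true = 9.5 / M
y = np.exp(2j * np.pi * f_true * np.arange(M))
g_hat, x_hat = ije_sketch(y, [9.0 / M])
print(abs(g_hat[0] - f_true), abs(x_hat[0]))
```

In this noise-free example the grid point moves from 9/M toward the true frequency 9.5/M, so the frequency estimate is no longer restricted to the initial grid.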

## 4 Application in RSF radar

AMP-CTLS can also be applied in randomized step frequency (RSF) radar. RSF radar can improve the range-velocity resolution and avoid range-velocity coupling problems [28, 29]. However, RSF radar suffers from the sidelobe pedestal problem, in which small targets are masked by noise-like components caused by dominant targets. Our problem of interest is to recover small targets. When the observed scene is sparse, i.e., only a few targets exist, sparse recovery can be used to exploit the sparseness. AMP-CTLS relieves the sidelobe pedestal problem in RSF radar and recovers small targets well.

Correlation-matrix-based spectral analysis methods, e.g., MUSIC and ESPRIT, are difficult to apply directly to range-Doppler estimation in the RSF scheme. Since only one snapshot of radar data is available and the radar echoes from different scatterers are coherent, a smoothing technique is needed to obtain a full-rank correlation matrix. The smoothing method requires the array to be uniform and linear. However, in RSF radar, the echoes are determined by a random permutation of integers, see (34); thus, the uniform and linear condition is not satisfied, which restricts the application of correlation-matrix-based methods.

We discuss a specific example of RSF radar, in which the waveform is a monotone pulse signal and the frequency of the $m$th pulse is $f_0 + C_m \delta f$, $m = 0, 1, \ldots, M-1$, where $f_0$ is the carrier frequency and $\delta f$ is the frequency step size. $C_m$ is a random permutation of the integers from 0 to $M-1$. The $m$th radar echo can be expressed as (see [8, 28, 29]):

$y_m = \sum_{k=1}^{K} \alpha_k s_m(p_k, q_k) + w_m,$
(33)
$s_m(p, q) = \exp\left( -j 2 \pi C_m p - j 2 \pi m (1 + C_m \delta f / f_0) q \right),$
(34)

where $w_m$ is the noise in the $m$th echo, $K$ denotes the number of targets, and $k$ indexes the $k$th target. $\alpha_k$, $p_k$, and $q_k$ are to be learned. $\alpha_k$ represents the scattering intensity, and $p_k \in [0, 1)$ and $q_k \in [0, 1)$ are determined by the range and the radial velocity of the $k$th target, respectively. Note that in (34) the echo depends on both the sequence index $m$ and the random integer $C_m$.

Divide the $p$ space into $C$ grid points $p_c = c/C$, $c = 0, 1, \ldots, C-1$, and the $q$ space into $D$ grid points $q_d = d/D$, $d = 0, 1, \ldots, D-1$. Rewrite (33) as:

$y = Φ ( p , q ) x + w ,$
(35)

where the $m$th-row, $(c + dC)$th-column element of $\Phi(p, q) \in \mathbb{C}^{M \times CD}$ is $s_m(p_c, q_d)$.
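The dictionary of (35) can be assembled directly from (34) and the column ordering $c + dC$. In the sketch below, the function name `rsf_dictionary`, the grid sizes, and the value of $\delta f / f_0$ are hypothetical choices for illustration; `n_p` and `n_q` play the roles of $C$ and $D$.

```python
import numpy as np

def rsf_dictionary(Cm, n_p, n_q, df_over_f0):
    """M x (n_p * n_q) matrix whose (m, c + d*n_p) entry is s_m(c/n_p, d/n_q)."""
    M = Cm.size
    m = np.arange(M)
    p = np.arange(n_p) / n_p
    q = np.arange(n_q) / n_q
    P = np.tile(p, n_q)        # p index varies fastest along the columns
    Q = np.repeat(q, n_p)      # q index changes every n_p columns
    phase = (Cm[:, None] * P[None, :]
             + (m * (1.0 + Cm * df_over_f0))[:, None] * Q[None, :])
    return np.exp(-2j * np.pi * phase)

rng = np.random.default_rng(0)
M = 8
Cm = rng.permutation(M)                     # random permutation of 0..M-1
df_over_f0 = 1e-3                           # hypothetical step-to-carrier ratio
Phi = rsf_dictionary(Cm, n_p=4, n_q=3, df_over_f0=df_over_f0)

# Spot check of one entry against (34).
mm, c, d = 5, 2, 1
expected = np.exp(-2j * np.pi * Cm[mm] * (c / 4)
                  - 2j * np.pi * mm * (1.0 + Cm[mm] * df_over_f0) * (d / 3))
print(Phi.shape, abs(Phi[mm, c + d * 4] - expected))
```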

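AMP-CTLS is implemented to solve (35). First, we find the support set $\Lambda$, and then use IJE and CTLS to adjust the grid points, although the CTLS procedure described in Section 2.4 requires modification. The grid misalignment vector consists of two parts, the $p$ mismatch $\Delta p_\Lambda \in \mathbb{R}^{|\Lambda| \times 1}$ and the $q$ mismatch $\Delta q_\Lambda \in \mathbb{R}^{|\Lambda| \times 1}$, so that $\Delta g_\Lambda = [(\Delta p_\Lambda)^T, (\Delta q_\Lambda)^T]^T \in \mathbb{R}^{2|\Lambda| \times 1}$. The corresponding $R$ matrices are

$R_{p_i} = \frac{\partial \Phi(p_\Lambda, q_\Lambda)}{\partial p_i}, \quad R_{q_i} = \frac{\partial \Phi(p_\Lambda, q_\Lambda)}{\partial q_i} .$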
(36)

Assume that ΔpΛ and ΔqΛ are independent of each other and of the noise. The covariance matrix of ΔgΛ is

$C_g = \begin{bmatrix} C_p & 0_{|\Lambda|} \\ 0_{|\Lambda|} & C_q \end{bmatrix} \in \mathbb{C}^{2|\Lambda| \times 2|\Lambda|},$
(37)

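where $C_p = E[\Delta p_\Lambda (\Delta p_\Lambda)^H] \in \mathbb{C}^{|\Lambda| \times |\Lambda|}$, $C_q = E[\Delta q_\Lambda (\Delta q_\Lambda)^H] \in \mathbb{C}^{|\Lambda| \times |\Lambda|}$, $C_p^{-1} = D_p^H D_p$, and $C_q^{-1} = D_q^H D_q$. In the case of RSF radar, $u = [(D_p \Delta p_\Lambda)^T, (D_q \Delta q_\Lambda)^T, \frac{1}{\sigma_w} w^T]^T \in \mathbb{C}^{(2|\Lambda| + M) \times 1}$ and $W_x = [H_p, H_q, \sigma_w I_M] \in \mathbb{C}^{M \times (2|\Lambda| + M)}$, where $G_p = [R_{p_1}, \ldots, R_{p_{|\Lambda|}}] \in \mathbb{C}^{M \times |\Lambda|^2}$, $H_p = G_p (D_p^{-1} \otimes I_{|\Lambda|}) (I_{|\Lambda|} \otimes x_\Lambda) \in \mathbb{C}^{M \times |\Lambda|}$, $G_q = [R_{q_1}, \ldots, R_{q_{|\Lambda|}}] \in \mathbb{C}^{M \times |\Lambda|^2}$, and $H_q = G_q (D_q^{-1} \otimes I_{|\Lambda|}) (I_{|\Lambda|} \otimes x_\Lambda) \in \mathbb{C}^{M \times |\Lambda|}$. Since $p$ and $q$ are both real, formula (32) is used to update the grid points, i.e., the imaginary parts of the $\Delta p$ and $\Delta q$ estimates are discarded.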

## 5 Simulations

Numerical results are provided to illustrate the performance of the new algorithm. In all examples, the noise is additive Gaussian white noise.

### 5.1 Accuracy of AMP-CTLS

We compare the accuracy of AMP-CTLS with that of standard OMP. We assume that there is a single sinusoid in measurements of the form (27), where $\alpha = 1$ and the signal-to-noise ratio is $\mathrm{SNR} = \alpha^2 / \sigma^2 = 5$ dB, with $\sigma^2$ the noise variance. The number of measurements is $M = 32$. The frequency of the sinusoid is varied between two adjoining frequency grid points. The mean square errors (MSEs) of the frequency estimates are calculated and compared with the corresponding Cramer-Rao lower bound (CRB). The frequency space is uniformly divided into $m$ grid points in xMP$_m$ (OMP$_M$, OMP$_{10M}$, CoSaMP$_M$, etc.) and into $M$ points in AMP-CTLS. AMP-CTLS is configured as follows: IJE loops no more than 14 times; the normalization factors in (11) are $D = I / \sigma_{\Delta f}$ with $\sigma_{\Delta f} = 0.005$, and $\sigma_w = 1$. As shown in Figure 1, the MSEs of AMP-CTLS are close to the CRB and lower than those of OMP, except when the sinusoid is in the vicinity of a grid point.
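For readers who want to reproduce a test signal of this kind, the following sketch generates a single sinusoid of the form (27) at a prescribed SNR. It assumes circularly symmetric complex Gaussian noise, which is one common reading of "white Gaussian noise" for complex data and is not stated explicitly in the text; the function name `noisy_sinusoid` is hypothetical.

```python
import numpy as np

def noisy_sinusoid(f, M, snr_db, alpha=1.0, rng=None):
    """Return (y, sigma2): alpha*exp(j*2*pi*f*m) + w with SNR = |alpha|^2 / sigma2."""
    rng = np.random.default_rng() if rng is None else rng
    sigma2 = abs(alpha) ** 2 / 10.0 ** (snr_db / 10.0)   # noise variance from SNR in dB
    m = np.arange(M)
    # Complex noise: real and imaginary parts each carry half of the variance.
    w = np.sqrt(sigma2 / 2.0) * (rng.standard_normal(M) + 1j * rng.standard_normal(M))
    return alpha * np.exp(2j * np.pi * f * m) + w, sigma2

rng = np.random.default_rng(0)
M = 32
y, sigma2 = noisy_sinusoid(9.5 / M, M, snr_db=5.0, rng=rng)
print(sigma2)
```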

### 5.2 Convergence of AMP-CTLS

We first discuss the convergence speed of the proposed IJE algorithm in the noise-free case. Suppose that the sinusoid is located at $f = 9.5/M$, $M = 32$. Other conditions are the same as in Section 5.1. In the $l$th iteration of IJE, we obtain a grid point $\hat{g}^{(l)}$ with (5) and the residual error $r^{(l)} = y - \Phi(\hat{g}^{(l)}) \hat{x}^{(l)}$ after (6). We calculate the norm of the residual $\|r^{(l)}\|_2$ and the grid error $|\hat{g}^{(l)} - f|$, and normalize the results by $\|r^{(0)}\|_2$ and $|\hat{g}^{(0)} - f|$, respectively. As shown in Figure 2, both the residual error and the grid error converge quickly (in about five steps) to 0 in the noiseless case.

The purpose of what follows is to discuss the feasible zone of the initial grid points in noisy circumstance. In Section 2.6, the convergence analysis of IJE is based on the assumption that the transform Φ is linear. This is only approximately satisfied in harmonic retrieval when the higher order terms of Taylor expansion (8) are ignorable, which means that the grid points indexed in the support set are required to be close to the actual frequencies. We assign SNR = 5 dB and the initial frequency grid point as g(0) = 9/M. The true frequency of the sinusoid varies from 9/M to 11/M.

As shown in Figure 3, when the distance (normalized by 1/M) between the true frequency and the initial grid point is less than 0.7, the initial grid is adjusted to be close to the actual value, and MSEs of the frequency estimates converge to CRB. When the distance is greater than 1, the AMP-CTLS curve is close to the initial distance, which means that AMP-CTLS fails to improve the initial grid, because errors of Taylor expansion cannot be ignored and affect convergence of the algorithm.

### 5.3 Input of sparsity

In Sections 5.1 and 5.2, we assume that the sparsity K, i.e., the number of modes, is known and we use K to terminate AMP-CTLS, while a priori sparsity is not obligatory. When K is unknown, we can use norm of residual error r = y - Φ (g Λ ) x Λ as termination criterion.

Furthermore, AMP-CTLS does not rely heavily on the given sparsity $K'$, and the performance is only slightly affected when $K' > K$. Suppose there are three sinusoids denoted as Si1, Si2, and Si3, where $\alpha_1 = 20$, $\alpha_2 = 15$, $\alpha_3 = 1$, $f_1 = 3.15/M$, $f_2 = 4.2/M$, $f_3 = 7.25/M$, and $M = 32$, with $\mathrm{SNR}_3 = \alpha_3^2 / \sigma^2 = 10$ dB. In AMP-CTLS, the frequency space is uniformly gridded into $2M$ points, and the other configurations are the same as in Section 5.1.

We calculate the mean of the final residual norm $\|r^{(K')}\|_2$ versus $K'$ and present the results in Figure 4. Once all of the sinusoids have been chosen into the support set, i.e., $K' \ge K$, the energy of the sinusoids is canceled thoroughly and only noise remains in the residual. The norm of the residual error becomes small and decreases slowly with $K'$. The results illustrate that a threshold on the value of the residual norm, or on its rate of decrease, can be used to terminate the AMP-CTLS loops.

Spurious sinusoids emerge when $K' > K$. Denote the amplitude estimates by $\hat{\alpha}_1, \hat{\alpha}_2, \ldots, \hat{\alpha}_{K'}$ in descending order of magnitude and the corresponding frequency estimates by $\hat{f}_1, \hat{f}_2, \ldots, \hat{f}_{K'}$. The MSEs of the $f_3$ estimates versus $K'$ are presented in Figure 5, which indicates that the accuracy of the frequency estimates of Si3 is only slightly affected (< 2 dB) by $K'$.

We also calculate $|\hat{\alpha}_{K+1} / \alpha_K|$ as a measure of the level of spurious sinusoids. Figure 6 presents the results and shows that the ratios $|\hat{\alpha}_{K+1} / \alpha_K|$ stay at a low level (< 0.2) and are not sensitive to $K'$.

Figure 7 presents the MSEs of the $f_3$ estimates versus SNR at different $K'$. The noise variance $\sigma^2$ is altered such that $\mathrm{SNR}_3$ varies. The MSEs converge to the CRB at high SNR ($\mathrm{SNR}_3 > 2$ dB) when $K' = K = 3$. The results for $K' = 6$ are close to those for $K' = 3$.

### 5.4 Recovering small sinusoids

We compare the performance of AMP-CTLS in recovering weak sinusoids with that of CS methods, e.g., OMP and CoSaMP, and of conventional spectral analysis methods, e.g., ESPRIT and root MUSIC. Suppose there are three sinusoids denoted as Si1, Si2, and Si3, where $\alpha_1 = 20$, $\alpha_2 = 15$, $\alpha_3 = 1$, $f_1 = 3.15/M$, $f_2 = 5.2/M$, $f_3 = 3.95/M$, and $M = 32$, with $\mathrm{SNR}_3 = \alpha_3^2 / \sigma^2 = 5$ dB. The number of sinusoids $K = 3$ is assumed to be known. CoSaMP iterates 50 times. In both ESPRIT and root MUSIC, the model orders are set to $K$, and the covariance matrix orders are $M/2$. ESPRIT and root MUSIC output frequency estimates, and the corresponding magnitudes are obtained by projection onto these frequencies. AMP-CTLS is configured as in Section 5.3.

Some intuitive results are presented in Figure 8. Si3 is recovered by the AMP-CTLS algorithm and is masked in the other tested algorithms. CoSaMP$_{100M}$ is also tested, but the results are not displayed because the amplitude estimates are too large (> 1,000), which is caused by projection onto an ill-conditioned matrix consisting of highly correlated atoms. In OMP$_{2M}$, the sinusoids are not exactly at the grid points, so the energies of Si1 and Si2 cannot be totally canceled in the first two iterations, and the leaked energy masks the smallest signal Si3. In OMP$_{100M}$, all sinusoids are placed at grid points, and Si1 and Si2 are better recovered than in OMP$_{2M}$, but energy leakage of the dominant sinusoids still exists. In AMP-CTLS, the grid points are adaptively adjusted to match the sinusoids, so the algorithm is less sensitive to grid mismatch and can achieve better performance than OMP and CoSaMP even if the frequency space is sparsely divided. ESPRIT and root MUSIC do not recover Si3 correctly, whereas AMP-CTLS does. Since only a single snapshot of data is available, a smoothing method is used in these two methods to estimate the covariance matrix, which results in aperture loss.

### 5.5 Range-velocity joint estimate in RSF radar

In this section, we discuss the merits of AMP-CTLS in recovering small targets with RSF radar. Suppose there are three targets: two large targets T1 and T2 and a small target T3. The number of measurements is $M = 32$. The scattering intensities are $\alpha_1 = \alpha_2 = 10$ and $\alpha_3 = 1$, and the ranges and velocities are set such that the $p$, $q$ parameters are $p_1 = 10.1/M$, $p_2 = 10.7/M$, $p_3 = 20/M$, $q_1 = 19.4/M$, $q_2 = 10.2/M$, and $q_3 = 15.2/M$. AMP-CTLS is configured as follows: the $p$ and $q$ spaces are both uniformly divided into $M$ grid points; the normalization factors are $D_p = D_q = I / \sigma_\Delta$ with $\sigma_\Delta = 0.025$, and $\sigma_w = 1$; and the IJE algorithm iterates fewer than 14 times. In OMP$_m$, both the $p$ and $q$ spaces are uniformly divided into $m$ grid points. Note that all of the targets lie on grid points in OMP$_{10M}$. We focus on the results of recovering the weakest target T3. The noise variance $\sigma^2$ is changed so that the signal-to-noise ratio $\mathrm{SNR}_3 = \alpha_3^2 / \sigma^2$ varies, and the MSEs of the $p_3$ and $q_3$ estimates are calculated. As shown in Figure 9, the MSEs with AMP-CTLS are lower than those with OMP and converge to the CRB when $\mathrm{SNR}_3$ is no less than 2 dB. The difference between the MSEs of AMP-CTLS at high SNR and the CRB is less than 0.5 dB.

### 5.6 DoA estimation

In this section, AMP-CTLS is compared with the Lasso-based TLS method WSS-TLS on direction of arrival (DoA) estimation. The goal is to estimate the DoA of plane waves from far-field, narrowband sources with a uniform linear array of antennas. We focus on the single-snapshot case. Suppose the antenna array contains $M = 8$ elements and the spacing between neighboring elements is $d = 1/2$ wavelength. There are two sources ($K = 2$) at angles $\theta_1 = -29°$ and $\theta_2 = 13°$. The amplitudes are $\alpha_1 = \alpha_2 = 1$ and $\mathrm{SNR}_1 = \mathrm{SNR}_2 = \alpha_1^2 / \sigma^2$, where $\sigma^2$ is the noise variance. The angle space from -90° to 90° is uniformly divided into $N = 90$ grid points; thus, both sources are 1° off the nearest grid points. The WSS-TLS algorithm is configured as in its original publication. Since WSS-TLS returns multiple nonzero DoA estimates, we choose the two peaks with the largest magnitudes as the estimates of $\theta_1$ and $\theta_2$. AMP-CTLS models DoA estimation as a harmonic retrieval problem and outputs frequency estimates $\hat{f} \in [0, 1)$. Denote $\tilde{f} = \hat{f} - 0.5 \left[ \mathrm{sgn}(\hat{f} - 0.5) + 1 \right]$, where $\mathrm{sgn}(\cdot)$ is the signum function; the DoA estimate with AMP-CTLS is then obtained as $\hat{\theta} = \arcsin(\tilde{f} / d)$. In AMP-CTLS, IJE loops no more than 50 times; the normalization factors in (11) are $D = I / \sigma_{\Delta f}$ with $\sigma_{\Delta f} = 0.005$, and $\sigma_w = 1$. The MSEs of the $\theta_1$ and $\theta_2$ estimates versus SNR are shown in Figure 10a, b, respectively. The results indicate that the MSEs of AMP-CTLS are closer to the CRB than those of WSS-TLS.
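The mapping from a frequency estimate in $[0, 1)$ to an angle can be sketched as below. The forward map `doa_deg_to_freq` assumes the convention $f = d \sin\theta \bmod 1$ with $d$ in wavelengths, which is an assumption of this illustration since the steering-vector model is not written out in the text; both function names are hypothetical.

```python
import numpy as np

def freq_to_doa_deg(f_hat, d=0.5):
    """Wrap f_hat from [0, 1) to [-0.5, 0.5) and convert to an angle in degrees."""
    f_hat = np.asarray(f_hat, dtype=float)
    f_tilde = f_hat - 0.5 * (np.sign(f_hat - 0.5) + 1.0)   # subtract 1 when f_hat > 0.5
    return np.degrees(np.arcsin(f_tilde / d))

def doa_deg_to_freq(theta_deg, d=0.5):
    """Assumed forward model: digital frequency of a source at angle theta."""
    return np.mod(d * np.sin(np.radians(theta_deg)), 1.0)

theta_true = np.array([-29.0, 13.0])        # source angles of this section
f = doa_deg_to_freq(theta_true)
theta_back = freq_to_doa_deg(f)
print(f, theta_back)
```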

## 6 Conclusion

To alleviate the off-grid problem in grid-based MP methods, we incorporate CTLS into the OMP framework and propose a new algorithm, namely AMP-CTLS. Unlike traditional MP methods, AMP-CTLS adaptively adjusts the grid and the dictionary. The convergence of AMP-CTLS is analyzed, and the initial conditions of the algorithm are discussed. Numerical examples indicate that the advantages of AMP-CTLS over OMP and CoSaMP are twofold: (1) it remains efficient even when the continuous parameter space is sparsely divided, whereas OMP and CoSaMP suffer from performance degradation when the space is not divided reasonably; (2) it achieves higher accuracy, and its MSEs converge to the CRB.

## Appendix 1

We recall the complex version of Newton's method in . Minor changes are made because of differences in notation. The recursion formula of the complex Newton method for (17) is:

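$x_{CTLS}^{(t+1)} = x_{CTLS}^{(t)} + \left(A^{*} B^{-1} A - B^{*}\right)^{-1} \left(a^{*} - A^{*} B^{-1} a\right),$
(38)

where

$h = \left(W_x W_x^H\right)^{-1} \left(\Phi_\Lambda x_{CTLS}^{(t)} - y\right),$
(39)
$\hat{u} = -W_x^H h,$
(40)
$\tilde{B} = \Phi_\Lambda + G \left(D^{-1} \otimes I_{|\Lambda|}\right) \left(\hat{u} \otimes I_{|\Lambda|}\right),$
(41)
$Q_j = \left[\left[(R_1)_{\{j\}}, \ldots, (R_{|\Lambda|})_{\{j\}}\right] D^{-1},\; 0_N\right], \quad j \le |\Lambda|,$
(42)
$\tilde{Q} = \left[Q_1^H h, Q_2^H h, \ldots, Q_{|\Lambda|}^H h\right],$
(43)
$a = \left(h^H \tilde{B}\right)^T,$
(44)
$A = -\tilde{Q}^H W_x^{\dagger} \tilde{B} - \left(\tilde{Q}^H W_x^{\dagger} \tilde{B}\right)^T,$
(45)
$B = \tilde{Q}^H \left(W_x^{\dagger} W_x - I\right) \tilde{Q} + \left(\tilde{B}^H \left(W_x W_x^H\right)^{-1} \tilde{B}\right)^T.$
(46)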
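A single update of the recursion (38) can be sketched in Python once a, A and B are available. The sketch reads the superscript * as element-wise complex conjugation; the function name `complex_newton_step` and the toy cost used to exercise it are assumptions for illustration and do not reproduce the CTLS quantities (39) to (46).

```python
import numpy as np

def complex_newton_step(x, a, A, B):
    """One update of (38), with '*' read as element-wise complex conjugation."""
    B_inv_A = np.linalg.solve(B, A)
    B_inv_a = np.linalg.solve(B, a)
    lhs = A.conj() @ B_inv_A - B.conj()
    rhs = a.conj() - A.conj() @ B_inv_a
    return x + np.linalg.solve(lhs, rhs)

# Toy cost F(x) = ||x - c||^2 (an assumption for this check): its derivative with
# respect to x is conj(x - c), the x-x second derivative is 0, and the mixed
# x*-x second derivative is the identity.
c = np.array([0.5 - 1.0j, 2.0 + 0.25j])
x0 = np.array([1.0 + 2.0j, -3.0j])
a = np.conj(x0 - c)
A = np.zeros((2, 2), dtype=complex)
B = np.eye(2, dtype=complex)
x1 = complex_newton_step(x0, a, A, B)
```

For this quadratic toy cost the step lands on the minimizer c in one iteration, which is the behavior expected of a Newton update.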

## Appendix 2

When the frequencies are constrained to be real, the CTLS solver becomes more complicated. Some notations are introduced for simplicity as follows: $W_1 = H$, $W_2 = \sigma_w I_N$, $u = [u_1^T, u_2^T]^T$, $z = y - \Phi_\Lambda x_\Lambda$. Notice that the matrix H depends on $x_\Lambda$. Replace the optimization problem in (12), (14) with:

$\min_{x_\Lambda,\, u} \; u_1^T u_1 + u_2^H u_2$
(47)
$\text{s.t.} \quad z - [W_1, W_2] \begin{bmatrix} u_1 \\ u_2 \end{bmatrix} = 0, \quad u_1^{*} = u_1.$
(48)

First, suppose xΛ is known, and seek the solution of u. If both W1 and W2 are of full-row rank, we have

$u_1 = 2\,\mathrm{Re}\left(W_1^H v\right),$
(49)
$u_2 = 2 W_2^H v,$
(50)

where

$v = -\left(C_2^{-1} C_1 - \left(C_1^{-1} C_2\right)^{*}\right)^{-1} \left(C_2^{-1} z - \left(C_1^{-1} z\right)^{*}\right),$
(51)
$C_1 = W_1 W_1^H + 2 W_2 W_2^H,$
(52)
$C_2 = W_1 W_1^T.$
(53)
(53)

The solution u1, u2 depends on xΛ. Then, calculate xΛ as

$x_\Lambda = \arg\min_{x_\Lambda} \; u_1^T u_1 + u_2^H u_2.$
(54)

However, it is difficult to solve (54), because the Jacobian and Hessian matrices of $u_1^T u_1 + u_2^H u_2$ with respect to $x_\Lambda$ are complicated. In this article, we simply consider the frequencies to be complex and ignore the imaginary parts.

Proof: We prove that when xΛ is known, the solution of u is given as (49) to (53). The Lagrangian

$L(u, v) = \frac{1}{2} u_1^T u_1 - v^H \left(z + W_1 u_1 + W_2 u_2\right) + \frac{1}{2} u_2^H u_2 - v^T \left(z^{*} + W_1^{*} u_1 + W_2^{*} u_2^{*}\right)$
(55)

can be expressed as follows:

$L(u, v) = \frac{1}{2} \left(u_1 - 2\,\mathrm{Re}(W_1^H v)\right)^T \left(u_1 - 2\,\mathrm{Re}(W_1^H v)\right) - 2 \left(\mathrm{Re}(W_1^H v)\right)^T \mathrm{Re}(W_1^H v) - 2 v^H W_2 W_2^H v + \frac{1}{2} \left(u_2 - 2 W_2^H v\right)^H \left(u_2 - 2 W_2^H v\right) - 2\,\mathrm{Re}\left[v^H z\right].$
(56)

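When $u_1 = 2\,\mathrm{Re}(W_1^H v)$ and $u_2 = 2 W_2^H v$, the Lagrangian reaches its infimum over u; thus, the Lagrange dual function is obtained as

$\gamma(v) = -2 \left(\mathrm{Re}(W_1^H v)\right)^T \mathrm{Re}(W_1^H v) - 2 v^H W_2 W_2^H v - 2\,\mathrm{Re}\left[v^H z\right].$
(57)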

Calculate the Jacobian and Hessian matrices of γ(v):

$\frac{\partial \gamma}{\partial v^{*}} = -\frac{1}{2} W_1 W_1^H v - \frac{1}{2} W_1 W_1^T v^{*} - W_2 W_2^H v - \frac{1}{2} z,$
(58)
$\frac{\partial^2 \gamma}{\partial v^T \partial v^{*}} = -\frac{1}{2} W_1 W_1^H - W_2 W_2^H.$
(59)

Because $W_1$, $W_2$ are of full-row rank, $\frac{\partial^2 \gamma}{\partial v^T \partial v^{*}}$ is a negative definite matrix. We solve for v with the optimality condition $\frac{\partial \gamma}{\partial v^{*}} = 0$, and obtain (51) to (53). The proof is complete.
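The closed form (49) to (53) can be evaluated numerically as a sanity check. The sketch below reads * as complex conjugation and uses the sign convention of the Lagrangian (55), so the quantity checked is the residual $z + W_1 u_1 + W_2 u_2$. The function name `solve_u_given_x` and the test matrices are hypothetical; $W_1$ is taken with at least as many columns as rows so that $C_2$ is invertible.

```python
import numpy as np

def solve_u_given_x(W1, W2, z):
    """Evaluate (49)-(53) for fixed x_Lambda; '*' is read as complex conjugation."""
    C1 = W1 @ W1.conj().T + 2.0 * (W2 @ W2.conj().T)   # (52)
    C2 = W1 @ W1.T                                      # (53)
    lhs = np.linalg.solve(C2, C1) - np.linalg.solve(C1, C2).conj()
    rhs = np.linalg.solve(C2, z) - np.linalg.solve(C1, z).conj()
    v = -np.linalg.solve(lhs, rhs)                      # (51)
    u1 = 2.0 * np.real(W1.conj().T @ v)                 # (49), real by construction
    u2 = 2.0 * (W2.conj().T @ v)                        # (50)
    return u1, u2, v

# Scalar example that can be followed by hand: W1 = 1, W2 = 1, z = 1.
one = np.ones((1, 1), dtype=complex)
u1_s, u2_s, v_s = solve_u_given_x(one, one, np.ones(1, dtype=complex))

# Random example with hypothetical sizes (3 rows, 4 columns in W1).
rng = np.random.default_rng(0)
W1 = rng.standard_normal((3, 4)) + 1j * rng.standard_normal((3, 4))
W2 = 0.7 * np.eye(3, dtype=complex)
z = rng.standard_normal(3) + 1j * rng.standard_normal(3)
u1, u2, v = solve_u_given_x(W1, W2, z)
residual = z + W1 @ u1 + W2 @ u2
```

In the scalar example $C_1 = 3$ and $C_2 = 1$, which gives v = -1/4 and $u_1 = u_2 = -1/2$, so the residual 1 - 1/2 - 1/2 vanishes; the random example is expected to give a residual at rounding level as well.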

## Algorithm 1. The CTLS technique

1) Input the dictionary $\Phi_\Lambda$ and all the coefficient matrices $R_i$.

2) Compose $W_x$ and solve (17) with the initial value given in (20).

3) Calculate $\hat{u}$ via (18) and (19).

4) Extract $\Delta \hat{g}_\Lambda$ from $\hat{u}$.

## Algorithm 2. The AMP-CTLS algorithm

1) Divide the continuous parameter f into grid points $\hat{g}^{(0)}$; input the sparsity level K or the residual threshold δ. Set the support set $\Lambda^{(0)} = \emptyset$ and the residual error $r^{(0)} = y$.

2) Calculate the correlations $p_i = \langle r^{(k)}, \Phi(\hat{g}_i^{(k)}) \rangle$.

3) Find the index $n = \arg\max_i |p_i|$.

4) Merge the support set $\Lambda^{(k+1)} = \Lambda^{(k)} \cup \{n\}$.

5) Solve (3) with the IJE algorithm to obtain $\hat{g}_\Lambda^{(k+1)}$ and $\hat{x}_\Lambda^{(k+1)}$.

6) Update the residual error $r^{(k+1)} = y - \Phi(\hat{g}_\Lambda^{(k+1)}) \hat{x}_\Lambda^{(k+1)}$.

7) Increase k. Return to Step 2 until a stopping criterion, e.g., $k \ge K$, $\|r^{(k)}\|_2 < \delta$ or $\|r^{(k)}\|_2 < \delta \|r^{(k-1)}\|_2$, is satisfied.

8) Simultaneously output $\hat{x}_\Lambda^{(k)}$ and $\hat{g}_\Lambda^{(k)}$, and set the elements of x not indexed in Λ to 0.
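The control flow of Algorithm 2 can be sketched in Python under simplifying assumptions. The atoms are taken to be complex sinusoids (a harmonic retrieval setting), and the IJE step is replaced by a plain least-squares fit that leaves the selected grid points unchanged, so with the default `refine` the loop reduces to ordinary OMP on a fixed grid. A CTLS-based refinement would be passed in through the hypothetical `refine` argument; all function names here are illustrative.

```python
import numpy as np

def atom(g, n_samples):
    """Assumed atom model: complex sinusoid at normalized frequency g."""
    return np.exp(2j * np.pi * g * np.arange(n_samples))

def dictionary(grid, n_samples):
    return np.column_stack([atom(g, n_samples) for g in grid])

def least_squares_refine(y, g_selected):
    """Stand-in for the IJE step: fits amplitudes only and keeps the grid points fixed."""
    x, *_ = np.linalg.lstsq(dictionary(g_selected, len(y)), y, rcond=None)
    return g_selected, x

def adaptive_pursuit(y, grid, K, refine=least_squares_refine, delta=1e-9):
    grid = np.array(grid, dtype=float)       # Step 1: initial grid, empty support
    support = []
    r = y.astype(complex)
    g_sel, x = np.empty(0), np.empty(0, dtype=complex)
    for _ in range(K):                       # Step 7: at most K iterations
        p = dictionary(grid, len(y)).conj().T @ r   # Step 2: correlations
        if support:
            p[support] = 0.0                 # do not reselect chosen atoms
        support.append(int(np.argmax(np.abs(p))))   # Steps 3 and 4
        g_sel, x = refine(y, grid[support])  # Step 5: joint estimate (stand-in)
        grid[support] = g_sel                # write refined grid points back
        r = y - dictionary(g_sel, len(y)) @ x       # Step 6: residual update
        if np.linalg.norm(r) < delta:
            break
    return g_sel, x, support                 # Step 8

# On-grid example with two tones of amplitudes 10 and 1.
M = 32
y = 10.0 * atom(5 / M, M) + 1.0 * atom(12 / M, M)
g_est, x_est, support = adaptive_pursuit(y, np.arange(M) / M, K=2)
```

Because both tones lie on the grid in this example, the fixed-grid stand-in already recovers them; for off-grid tones the refinement step is where the grid adaptation described in the article would act.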

## References

1. Baraniuk RG: Compressive sensing [lecture notes]. IEEE Signal Process Mag 2007, 24(4):118-121.
2. Dai W, Milenkovic O: Subspace pursuit for compressive sensing signal reconstruction. IEEE Trans Inf Theory 2009, 55(5):2230-2249.
3. Mallat SG, Zhang Z: Matching pursuits with time-frequency dictionaries. IEEE Trans Signal Process 1993, 41(12):3397-3415. 10.1109/78.258082
4. Davenport MA, Wakin MB: Analysis of orthogonal matching pursuit using the restricted isometry property. IEEE Trans Inf Theory 2010, 56(9):4395-4401.
5. Needell D, Vershynin R: Uniform uncertainty principle and signal recovery via regularized orthogonal matching pursuit. Found Comput Math 2009, 9(3):317-334. 10.1007/s10208-008-9031-3
6. Donoho D, Drori I, Tsaig Y, Starck J: Sparse solution of underdetermined linear equations by stagewise orthogonal matching pursuit. Department of Statistics, Stanford University, California; 2006.
7. Needell D, Tropp J: CoSaMP: Iterative signal recovery from incomplete and inaccurate samples. Appl Comput Harmonic Anal 2009, 26(3):301-321. 10.1016/j.acha.2008.07.002
8. Huang T, Liu Y, Meng H, Wang X: Randomized step frequency radar with adaptive compressed sensing. Proc IEEE Radar Conf (RADAR), Kansas City, Missouri, USA 2011, 411-414.
9. Shah S, Yu Y, Petropulu A: Step-frequency radar with compressive sampling (SFR-CS). Proc IEEE Int Acoustics Speech and Signal Processing (ICASSP) Conf, Dallas, Texas, USA 2010, 1686-1689.
10. Hyder MM, Mahata K: Direction-of-arrival estimation using a mixed ℓ2,0 norm approximation. IEEE Trans Signal Process 2010, 58(9):4646-4655.
11. Zheng C, Li G, Zhang H, Wang X: An approach of regularization parameter estimation for sparse signal recovery. Proc IEEE 10th Int Signal Processing (ICSP) Conf, Beijing, China 2010, 385-388.
12. Zheng C, Li G, Zhang H, Wang X: An approach of DOA estimation using noise subspace weighted ℓ1 minimization. Proc IEEE Int Acoustics, Speech and Signal Processing (ICASSP) Conf, Prague, Czech 2011, 2856-2859.
13. Chi Y, Scharf LL, Pezeshki A, Calderbank AR: Sensitivity to basis mismatch in compressed sensing. IEEE Trans Signal Process 2011, 59(5):2182-2195.
14. Zhu H, Leus G, Giannakis GB: Sparsity-cognizant total least-squares for perturbed compressive sampling. IEEE Trans Signal Process 2011, 59(5):2002-2016.
15. Chae DH, Sadeghi P, Kennedy RA: Effects of basis-mismatch in compressive sampling of continuous sinusoidal signals. Proc 2nd Int Future Computer and Communication (ICFCC) Conf, Wuhan, China 2010, 2: V2.739-V2.743.
16. Cabrera SD, Malladi S, Mulpuri R, Brito AE: Adaptive refinement in maximally sparse harmonic signal retrieval. Proc IEEE 11th Digital Signal Processing Workshop and the 3rd IEEE Signal Processing Education Workshop, Taos Ski Valley, New Mexico, USA 2004, 231-235.
17. Peyre G: Best basis compressed sensing. IEEE Trans Signal Process 2010, 58(5):2613-2622.
18. Stoica P, Moses R: Spectral Analysis of Signals. Pearson/Prentice Hall, Upper Saddle River; 2005.
19. Bellman R: Introduction to Matrix Analysis. Society for Industrial Mathematics, Philadelphia; 1997.
20. Abatzoglou TJ, Mendel JM, Harada GA: The constrained total least squares technique and its applications to harmonic superresolution. IEEE Trans Signal Process 1991, 39(5):1070-1087. 10.1109/78.80955
21. Golub G, Van Loan C: An analysis of the total least squares problem. SIAM J Numer Anal 1980, 17(6):883-893. 10.1137/0717073
22. Boufounos P, Duarte MF, Baraniuk RG: Sparse signal reconstruction from noisy compressive measurements using cross validation. Proc IEEE/SP 14th Workshop Statistical Signal Processing SSP'07, Madison, Wisconsin, USA 2007, 299-303.
23. Brito AE, Cabrera SD, Villalobos C: Optimal sparse representation algorithms for harmonic retrieval. Conf Record of the Thirty-Fifth Asilomar Conf on Signals, Systems and Computers, Pacific Grove, California, USA 2001, 2: 1407-1411.
24. Rao BD, Kreutz-Delgado K: An affine scaling methodology for best basis selection. IEEE Trans Signal Process 1999, 47: 187-200. 10.1109/78.738251
25. Gorodnitsky IF, Rao BD: Sparse signal reconstruction from limited data using FOCUSS: a re-weighted minimum norm algorithm. IEEE Trans Signal Process 1997, 45(3):600-616. 10.1109/78.558475
26. Cabrera S, Rosiles J, Brito A: Affine scaling transformation algorithms for harmonic retrieval in a compressive sampling framework. Proc Wavelets XII, SPIE, San Diego, California, USA, 6701 2007, 67012D.1-67012D.12.
27. Tropp JA: Greed is good: algorithmic results for sparse approximation. IEEE Trans Inf Theory 2004, 50(10):2231-2242. 10.1109/TIT.2004.834793
28. Axelsson SRJ: Analysis of random step frequency radar and comparison with experiments. IEEE Trans Geosci Remote Sens 2007, 45(4):890-904.
29. Liu Y, Meng H, Li G, Wang X: Range-velocity estimation of multiple targets in randomised stepped-frequency radar. Electron Lett 2008, 44(17):1032-1034. 10.1049/el:20081608
30. Odendaal JW, Barnard E, Pistorius CWI: Two-dimensional superresolution radar imaging using the MUSIC algorithm. IEEE Trans Antennas Propag 1994, 42(10):1386-1391. 10.1109/8.320744
31. Yau SF, Bresler Y: A compact Cramer-Rao bound expression for parametric estimation of superimposed signals. IEEE Trans Signal Process 1992, 40(5):1226-1230. 10.1109/78.134484
32. Mahata K, Soderstrom T: ESPRIT-like estimation of real-valued sinusoidal frequencies. IEEE Trans Signal Process 2004, 52(5):1161-1170. 10.1109/TSP.2004.826169

## Acknowledgements

This study was supported in part by the National Natural Science Foundation of China (No. 40901157) and in part by the National Basic Research Program of China (973 Program, No. 2010CB731901). Thanks to the anonymous reviewers for many valuable comments and to Hao Zhu for helpful discussions and her Matlab® programs of WSS-TLS.

## Author information

### Corresponding author

Correspondence to Yimin Liu.

### Competing interests

The authors declare that they have no competing interests.

## Rights and permissions

Huang, T., Liu, Y., Meng, H. et al. Adaptive matching pursuit with constrained total least squares. EURASIP J. Adv. Signal Process. 2012, 76 (2012). https://doi.org/10.1186/1687-6180-2012-76 