Sparse multidimensional modal analysis using a multigrid dictionary refinement

We address the problem of multidimensional modal estimation using sparse estimation techniques coupled with an efficient multigrid approach. Modal dictionaries are obtained by discretizing modal functions (damped complex exponentials). To get a good resolution, it is necessary to choose a fine discretization grid resulting in intractable computational problems due to the huge size of the dictionaries. The idea behind the multigrid approach amounts to refine the dictionary over several levels of resolution. The algorithm starts from a coarse grid and adaptively improves the resolution in dependence of the active set provided by sparse approximation methods. The proposed method is quite general in the sense that it allows one to process in the same way mono-and multidimensional signals. We show through simulations that, as compared to high-resolution modal estimation methods, the proposed sparse modal method can greatly enhance the estimation accuracy for noisy signals and shows good robustness with respect to the choice of the number of components.


Introduction
The topic of sparse signal representation has received considerable attention in the last decades since it can find application in a variety of problems, including mono-and multidimensional deconvolution [1], statistical regression [2], and radar imaging [3]. Sparse approximation consists of finding a decomposition of a signal y as a linear combination of a limited number of elements from a dictionary F ℂ M × N , i.e., finding a coefficient vector x that satisfies y ≈ Fx, where F is overcomplete (M < N). The sparsity condition on x ensures that the underdetermined problem does not have an infinite number of solutions. The dictionary F can be chosen according to its ability to represent the signal with a limited number of coefficients or it can be imposed by the inverse problem at hand. In the latter case, we consider dictionaries whose atoms are function of some parameters. The different atoms of the dictionary are then formed by evaluating this function over a grid which has to be very fine to achieve a certain degree of resolution. This is the case for the modal estimation problem in which the atoms are formed by discretizing the frequency and damping factor axes. In this situation, the challenge is to get a good approximation without a prohibitive computational cost due to the huge size of the dictionary.
This study addresses the modal retrieval problem. This is an important topic in various applications including nuclear magnetic resonance (NMR) spectroscopy [4], wireless communications, radar, and sonar [5]. A modal signal is modeled as a sum of damped complex sinusoids. Several methods have been developed to address the modal estimation problem such as maximum likelihood [6,7] and subspace-based methods [5,[8][9][10][11][12]. A special case of modal estimation is the harmonic retrieval problem (null damping factor) which has been formulated as a sparse approximation in a number of contributions. In the case of 1-D harmonic retrieval problem, we can cite FOCUSS [13], the method of Moal and Fuchs [14], basis pursuit [15], adaptive weighted norm extrapolation [16]. Some other contributions may be found in [17,18]. Nevertheless, only a few methods have been applied to the damped case. For instance, [19] presents a sparse estimation example of 1-D NMR (modal) data by using Lasso [20], LARS [21] and OMP [22]. Goodwin et al. [23] proposed a damped sinusoidal signal decomposition for 1-D signals using Matching Pursuit [24]. Similarly, regarding multigrid approaches associated with sparse approximation methods, only some studies are considering the 1-D harmonic signals [25,26]. In the case of 2-D signals, an approach combining adaptive multigrid decomposition and TLS-Prony estimation was proposed in [27]. However, to authors knowledge, there is no study that deals with the problem of estimating parameters of multidimensional (R-D) damped sinusoidal signals by sparse approximation methods. This article provides a multidimensional generalization of the study presented in [28,29].
The goal of this article is to present an efficient approach that reduces the computational cost of sparse algorithms for R-D modal estimation problems. The main contributions of the article are as follows. (i) We propose a procedure which iteratively improves the set of atoms in the dictionary. The goal of this procedure is to improve resolution by avoiding computationally expensive operations due to the processing of large matrices; we refer to this procedure as the multigrid approach. (ii) We show how the 1-D modal retrieval problem can be addressed using sparse estimation approach by building a dictionary whose atoms are calculated by sampling the modal function over a 2-D grid (frequency and damping factor) in order to obtain all possible modes combinations. (iii) We show how to extend the sparse 1-D modal estimation problem to R-D modal problems.
The article is organized as follows. In Section 2, we provide background material and definitions for sparse signal representation. We present some known sparse methods and we recall the single best replacement (SBR) [30] algorithm and its advantages as compared to other algorithms such as OMP, OLS, and CoSaMP, to name a few. In Section 3, we present the multigrid dictionary refinement approach and we discuss its usefulness to accelerate computation and to improve resolution. In Section 4, we see how the 1-D modal retrieval problem may be addressed using sparse approximations and how the multigrid approach can be applied. In Section 5, we extend the sparse multigrid approach to the R-D modal estimation problem. In Section 6, experimental results are presented first to compare SBR to a greedy algorithm (OMP) and a solver to the basis pursuit problem. Then, the effectiveness of the multigrid approach will be illustrated on simulated 1-D and 2-D modal signals. Conclusions are drawn in section 7.
Notations: Upper and lower bold face letters will be used for matrices and column vectors, respectively. A T denotes the transpose of A. "⊙" will denote the Khatri-Rao product (column-wise Kronecker) and "⊗" will denote the Kronecker product.

Key ideas of sparse approximations
Consider an observation vector y ℂ M which has to be approximated by a sum of vectors from a matrix F such that y ≈ Fx, where F = [j 1 ..., j N ] ℂ M × N and x ℂ N contains coefficients that select and weight columns j n . We refer to F as a dictionary and to x as a representation of the signal y with respect to the dictionary. To find an accurate approximation for any arbitrary signal y, the dictionary has to be overcomplete, i.e., has to contain a large number of atoms. Therefore, we have to solve an underdetermined system when M < N. Clearly, there is an infinite number of solutions that can be used to represent y. This is why additional conditions have to be imposed. Let us introduce the pseudo norm ℓ 0 , || ⋅ || 0 : ℂ N N, which counts the number of non-zero components in its arguments. We say that a vector x is ssparse, when ||x|| 0 = s. In the case for an observed signal corrupted with noise, the problem of estimating the sparsest vector x such as Fx approximates y at best can be stated as an ℓ 2 -ℓ 0 minimization problem admitting two formulations: -the constrained ℓ 2 -ℓ 0 problem whose goal is to seek the minimal error possible at a given level of sparsity s ≥ 1: arg min -the penalized ℓ 2 -ℓ 0 problem: The goal is to balance between the two objectives (fitting error and sparsity). Here, the solution sparsity level is controlled by the l parameter.
The ℓ 2 -ℓ 0 problem is known to yield an NP complete combinatorial problem which is usually handled by using suboptimal search algorithm. Restricting our attention to greedy algorithms, the main advantage of the ℓ 2 -ℓ 0 penalized form is to allow both insertion and removal of elements in x, while the constrained form only allows the insertion when optimization is carried through a descent approach [30,31].
A well known greedy method for sparse approximation is orthogonal matching pursuit (OMP) [22]. It minimizes iteratively the error ℰ(x) until a stoping criterion is met. At each iteration the current estimate of the coefficient vector x is refined by selecting one more atom to yield a substantial improvement of the signal approximation.
There are other paradigms for solving sparse approximation problems by using ℓ 2 -ℓ p minimization for p ≤ 1. One of these is basis pursuit (BP) [32], which is a principle for decomposing a signal into an "optimal" superposition of dictionary elements, where optimal means having the smallest ℓ 1 norm of coefficients among all such decompositions: This principle leads to approximation that can be sparse and this minimization problem can be solved via linear programming [32]. Instead of ℓ 2 -ℓ 1 penalized problem, FOCUSS algorithm [13] uses a ℓ 2 -ℓ p penalized criterion. For p <1, the cost function is nonconvex, and the convergence to global minima is not guaranteed. It is indicated in [33], that the best results are obtained for p close to 1, whereas the convergence is also slowest for p = 1.
In this article, we will use the SBR algorithm together with the multigrid approach. This algorithm has very interesting performance particularly in the case where the dictionary elements are strongly correlated [30], this is precisely the case with modal atoms. The algorithm is briefly recalled in the following paragraph.

SBR algorithm for penalized ℓ 2 -ℓ 0 problem
The heuristic SBR algorithm (see, Table 1) was proposed in [30] to minimize the mixed ℓ 2 -ℓ 0 cost function J (x, λ) defined in (2) for a fixed parameter value λ. It is a forward-backward algorithm inspired by the SMLR method [34]. It is an iterative search algorithm that addresses the penalized ℓ 2 -ℓ 0 problem for fixed λ. We denote by Ω • n the insertion or removal of an index n into/from the active set Ω At each iteration, the N possible single replacements Ω • n (n = 1,..., N) are tested (i.e., N least square problems are solved to compute the minimum squared error ℰ Ω•n related to each support Ω•n), then the replacement yielding the minimal value of the cost function J (x, λ), i.e., J •n (λ) := ε •n + λCard( • n), is selected. In Table 1, the replacement rule is formulated by "n k Î..." in case several replacements yield the same value of J (x, λ). However, this special case is not likely to occur when dealing with real data. A detailed analysis and performance evaluation can be found in [30] where it is shown that SBR performs very well in the case of highly correlated dictionary atoms (which is the case here). We note that unlike many algorithms which require to fix either a maximum number of iterations to be performed or a threshold on the squared error variation (OMP and OLS for instance), the SBR algorithm does not need any stopping condition since it stops when the cost function J (x, λ) does not decrease anymore. However it requires to tune the parameter λ which is done empirically.

Multigrid dictionary refinement
As mentioned before, we restrict our attention to the case of modal dictionaries whose atoms are calculated by evaluating a function over a multidimensional grid, the grid dimension being equal to the number of unknown modal parameters. To achieve a high-resolution modal estimation, a possible way is to define a high resolution dictionary often resulting in a prohibitive computational burden. Rather than defining a highly resoluted dictionary, we propose to adaptively refine a coarse one through a multigrid scheme. This results in the algorithm sketched on Table 2, where the key step is the adaptation of the dictionary as a function of the previous dictionary and the estimated vector x. The algorithm amounts to insert (resp, remove) atoms in (resp, from) F and to re-run the sparse approximation algorithm. We propose two procedures to refine the dictionary. The first one consists in inserting new atoms in the F matrix in the neighborhood of active ones.
In other words, we first restore the signal x (l) related to the dictionary F (l) by applying a sparse approximation method (SAM) at level l. Then we refine the dictionary by inserting atoms in between pairs of Φ (l) , in the neighborhood of each activated atom and we apply again the SAM at level l + 1 to restore x (l+1) with respect to the refined dictionary Φ (l+1) . Thus we refine iteratively the dictionary until the maximum level l = L -1 is reached. This procedure is illustrated on Figure 1a where the dictionary atoms depend on two parameters, f and a. The disadvantage of this procedure is that the size of the dictionary is increasing as new atoms are constantly added between two resolution levels. Hence, the computational cost will be increasing.
To cope with this limitation, we propose a second procedure consisting in adding new atoms as in the first procedure and deleting remote non-active ones (Figure 1b). The later multigrid approach may suffer from one main shortcoming. Indeed, removing non-active atoms excludes the possibility of further having active components in the neighborhood of already suppressed atoms. A possible way to overcome this problem consists in maintaining all the atoms from the initial dictionary in all the Φ (l) 's. The multigrid dictionary refinement is proposed in the context of modal analysis. However, it is worth noticing that this idea can be straightforwardly extended to any dictionary obtained by sampling a continuous function over a grid.
4 Monodimensional modal estimation using sparse approximation and multigrid

1-D data model
A 1-D complex modal signal containing F modes can be written as: where A is an M × F Vandermonde matrix:

1-D sparse modal estimation
The problem of modal estimation is an inverse problem since y is given and A,c, and F are unknown. It can be formulated as a sparse signal estimation problem by defining the dictionary F gathering all the possible modes obtained by sampling a (P samples) and f (K samples) on a 2-D grid. F is expressed in (7) with φ p,k (m) = e (−α p +j2π f k )m and N = PK Provided that a and f are finely sampled, we can assume that A is a submatrix of F so that c correspond to the nonzero elements in x. Then the modal estimation problem can be formulated as a penalized ℓ 2 -ℓ 0 sparse signal estimation problem (2). The multigrid approach presented before can be used to that end.
5 Multidimensional modal estimation using sparse approximation and multigrid

R-D data model
A multidimensional complex modal signal containing F modes can be written as: where m r = 0,...,M r -1 for r = 1,...,R. M r denotes the sample support of the rth dimension, a i,r = e (−α i,r +j2π f i,r ) is the ith mode in the rth dimension, with {α i,r } F,R i=1,r=1 the damping factors and {f i,r } F,R i=1,r=1 the frequencies, {c i } F i=1 the complex amplitudes, and e(m 1 m 2 ...,m R ) stands for an additive observation noise. The problem is to estimate the set of parameters {α i,r } F,R i=1,r=1 and {c i } F i=1 from the samples y(m 1 ,...,m R ).
In order to facilitate the presentation, we rewrite the data model using the Khatri-Rao product. Given (8), we define the vector y as: 0, 0, . . . , 1) . . .   Then, we define R Vandermonde matrices A r ∈ C M r ×F with generators {a i,r } F i=1 such that where c = [c 1 ,c 2 ,..., c F ] T gathers the complex amplitudes and e is the noise vector.

E-D sparse modal estimation
Similar to the 1-D case, the R-D modal retrieval problem can be formulated as a sparse signal estimation problem by defining a dictionary that gathers all possible combinations of 1-D modes obtained by sampling damping factors and frequencies for each dimension on 2-D grids. Let P r be the number of damping factors a 1,r , α2,r,...,αP r,r and K r the number of frequencies f 1,r, f 2,r ,..., f Kr,r resulting from the sampling of the r th dimension, then the corresponding dictionary is given by p,k (m r ) = e (−α p,r +j2π f k,r )m r for p = 1,...,P r and k = 1,...,K r . Finally, the dictionary involved in the R-D sparse modal approximation is defined by: where the number of atoms is N = R r=1 N r , with N r = P r K r . Note that the dictionary F can be seen as a 2R-dimensional sampling of the R-dimensional modal function. Then the R-D modal retrieval problem can be formulated as a penalized ℓ 2 -ℓ 0 sparse signal estimation problem (2).

Multigrid approach for R-D modal estimation
According to (10), the dictionary is obtained by doing the Kronecker product of R 1-D modal dictionaries. Thus, we can still use the multigrid approach presented in section 3 to adapt each 1-D dictionary to form the R-D dictionary. This results in the algorithm sketched in Table 3.

Experimental results
In this section, we present some experimental results for the multigrid sparse modal estimation. First, we present two examples on 1-D simulated modal signals. Next, we present and discuss results on a 2-D simulated signal and we compare them with those obtained by the subspace method "2-D ESPRIT" [5]. We chose the 2-D ESPRIT method because a comparative performance study [35] has shown that among different subspacebased high resolution modal estimation techniques, it was the one which was giving the best results.

1-D modal estimation results
First, we compare the results achieved by SBR, OMP, and the primal-dual logarithmic barrier (log-barrier) algorithm for solving the BP problem [15]. Here we used the SparseLab a implementations of OMP and BP (SolveOMP and SolveBP). Then, we present the results achieved using the multigrid SBR approach.
The first dataset is a noise-free 1-D modal signal y composed of M = 30 samples and made up of three 1-D superimposed damped complex sinusoids having the same amplitude. The 1-D modes are: The dictionary is constructed using 20 equally spaced frequency points in the interval [0 1], where each frequency point is coupled with 20 points of damping factors in [0 0.5] and each atom represents a 1-D complex sinusoid of M samples. Thus, the dictionary F is of size 30 × 400. We notice that the simulated 1-D modes belong to the dictionary. Thus, in the noise free case, it is possible to have an exact representation of the signal.
We estimate the parameters of y using SBR, OMP, and log-barrier; the results are shown in Figure 2. The representation given at the bottom of Figure 2 plots the active modes in the frequency-magnitude plane: the vertical lines are located at the frequencies of the active set Ω and their heights represent the corresponding estimated magnitudes |x Ω |. The horizontal segments represent the damping factors. Clearly, the results obtained by SBR and OMP are more sparse than those achieved by the BP solver because BP detects much more than three modes. This is due to the fact that BP is an l 2 -l 1 solver and thus tends to detect many atoms having low amplitudes, while OMP and SBR do not impose any l 1 penalty on the amplitudes allowing for the detection of a small number of atoms possibly having large amplitudes. SBR exactly yields the three modes (exact recovery) whereas OMP gives the true frequencies but leads to a wrong α 2 . The Fourier transform of signal y and its estimates obtained by SBR, OMP and log-barrier algorithms are given on Figure 2 (top). We observe that unlike OMP and log-barrier, SBR correctly estimates the modal parameters of y. Although log-barrier algorithm estimates correctly the frequencies for harmonic signals, it does not estimate correctly the parameters of 1-D modal signals and the solution is less sparse than the solutions provided by SBR and OMP.
In the second example, SBR algorithm coupled to multigrid approach is applied to estimate the 1-D modes from a simulated 1-D modal signal expressed in (5) with 30 samples embedded in additive Gaussian white noise such that the SNR is 23 dB. We start restoration using the same dictionary described in the first example, then we refined it with the multigrid approach. The simulated modes are: These modes are chosen in such a way that they cannot be separated by the Fourier transform ( Figure 3)    and they are not initially in the dictionary F. Figure 3a shows the spectrum of each sinusoid activated in the first level. Using the second multigrid procedure presented before, we see on Figure 3b that the two 1-D modes have been well separated in level 7, which proves the effectiveness of the approach. To give some figures about the efficiency of the multigrid approach, it is interesting to compare the size of the dictionary at the 7th degree of resolution to the uniform dictionary allowing the same resolution. For our example, F (7) is of dimension 30 × 520 while the uniform one achieving the same resolution would require a dictionary of size 30 × 6553600. This dramatic increase in the number of atoms is due to the bidimensional nature of the dictionary. Obviously, this complexity becomes huge for bi-and multidimensional modal signals. Note that the first two modes are not separated by 2-D Fourier transform. Amplitudes are (c 1 , c 2 , c 3 ) = (1,1,3) and the additive white noise variance is such that the SNR of the first mode is 7 dB (SNR 1 = 7). In the following simulations we use this simulated 2-D signal (y sim ) with the same modes and amplitudes, we only change the SNR value. The spectrum of the simulated signal is represented by contour lines in Figure 4a where it is verified that the first two peaks are not separable by Fourier transform. The SBR method coupled with the proposed multigrid approach detects well the three components at the third resolution level. Their respective spectras are shown in Figure 4b. To give an idea about the gain in computational cost, the size of the dictionary at the third level is equal to    400 × 3136. The size of the uniform dictionary achieving the same resolution is 400 × 409600; the gain in term of size is a multiplicative factor 130.

2-D modal estimation results
In Figure 5, we compare the estimated modes obtained by the 2-D ESPRIT [5] and our proposed technique. We use the 2-D simulated signal (y sim ) with the SNR set to 20 dB. Both our technique and 2-D ESPRIT are able to separate the three modes, whereas there is a slight error made by 2-D ESPRIT on the first and second modes. In Figure 6, we decrease the SNR to 7 dB, and only the proposed algorithm is still able to estimate the three modes with an accuracy similar to what was obtained when the SNR equals 20 dB. In that case, the 2-D ESPRIT performance decreases and the modal parameters are biased.
In Figure 7, we test the sensitivity of our technique to the correct determination of the number of modes in the signal. In the previous examples, the parameter λ of the penalized cost function in SBR algorithm was fixed to 0.01 and we did not give any constraint on the number of modes to be estimated. However, in the example presented in Figure 7, we use the 2-D simulated signal with SNR equal to 20 dB, and we force 2-D ESPRIT and the proposed algorithm to estimate 5 modes (while the actual number of modes is 3). We observe that the proposed algorithm is not very sensitive to the correct determination of the number of existing modes in the sense that the true modes are activated and the other active atoms lies in the neighborhood of the true modes. On the contrary, the 2-D ESPRIT yields spurious modes located very far from the true ones.
In Figure 8, we analyze the sensitivity of the multigrid algorithm to noise power. We use the same signal y sim with different noise levels SNR 1 . For each noise level we do 20 Monte Carlo trials and then we calculate the percentage of successful estimations obtained after three multigrid levels. We can see that the proposed algorithm reconstruct exactly the signal with a rate upper than    80% for an SNR 1 more than 6 dB; and the rate of success is 100% with an SNR 1 upper than 15 dB.

Conclusion
We presented a multigrid technique that adaptively refines ordered dictionaries for sparse approximation. The algorithm may be associated with any sparse method, but clearly the accuracy of the final results will depend on the accuracy of the sparse approximation. Then sparse approximation associated to multigrid are used to tackle mono-and multidimensional modal (damped sinusoids) estimation problem. Thus, we applied the SBR algorithm which is shown, using simulation results, to perform better than OMP and Basis Pursuit for modal approximation. Finally, we examined performances of our proposed algorithm over existing R-modal estimation algorithms. It allows one to separate modes that the Fourier transform cannot resolve without a huge increase in the computational cost, improves robustness to noise and does not require initialization. As perspectives, we will study possible improvements for the sparse multigrid approach in the case of multidimensional modal signals. In particular, we can envisage to used multiple 1-D modal estimation to get a low dimension initial dictionary for R-D modal estimation. We also are planning to study the convergence properties of the multigrid approach and we will apply the method to the modal estimation of real NMR signals.