Near optimal bound of orthogonal matching pursuit using restricted isometric constant

As a paradigm for reconstructing sparse signals from a set of undersampled measurements, compressed sensing has received much attention in recent years. In identifying the sufficient condition under which the perfect recovery of sparse signals is ensured, a property of the sensing matrix referred to as the restricted isometry property (RIP) is popularly employed. In this article, we propose an RIP-based bound of the orthogonal matching pursuit (OMP) algorithm guaranteeing the exact reconstruction of sparse signals. Our proof is built on an observation that the general step of the OMP process is in essence the same as the initial step, in the sense that the residual is considered as a new measurement preserving the sparsity level of the input vector. Our main conclusion is that if the restricted isometry constant δ_K of the sensing matrix satisfies

δ_K < √(K − 1) / (√(K − 1) + K),

then the OMP algorithm can perfectly recover K(> 1)-sparse signals from measurements. We show that our bound is sharp and indeed close to the limit conjectured by Dai and Milenkovic.

In the sensing operation, a K-sparse signal vector x, i.e., an n-dimensional vector having at most K non-zero elements, is transformed into an m-dimensional signal (measurements) y via a matrix multiplication with F. This process is expressed as

y = Fx.   (1)

Since n > m for most compressive sensing scenarios, the system in (1) can be classified as an underdetermined system having more unknowns than observations, and hence one cannot accurately solve this inverse problem in general. However, due to the prior knowledge of the sparsity information, one can reconstruct x perfectly via properly designed reconstruction algorithms. Commonly used reconstruction strategies in the literature can be classified into two categories. The first class is linear programming (LP) techniques including ℓ1-minimization and its variants. Donoho [10] and Candes [13] showed that accurate recovery of the sparse vector x from the measurements y is possible using the ℓ1-minimization technique if the sensing matrix F satisfies the restricted isometry property (RIP) with a constant parameter called the restricted isometry constant. For each positive integer K, the restricted isometric constant δ_K of a matrix F is defined as the smallest number satisfying

(1 − δ_K)‖x‖_2² ≤ ‖Fx‖_2² ≤ (1 + δ_K)‖x‖_2²

for all K-sparse vectors x. It has been shown that if δ_2K < √2 − 1 [13], the ℓ1-minimization is guaranteed to recover K-sparse signals exactly.
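To make the sensing model in (1) concrete, the following NumPy sketch generates a Gaussian sensing matrix and a K-sparse signal and forms the measurements; the dimensions, seed, and variable names are illustrative assumptions, not values from this article.

    import numpy as np

    # Sensing operation y = Fx of (1): a minimal sketch with illustrative sizes.
    n, m, K = 256, 64, 8                   # ambient dimension, measurements, sparsity
    rng = np.random.default_rng(0)

    F = rng.normal(0.0, 1.0 / np.sqrt(m), size=(m, n))  # i.i.d. N(0, 1/m) entries
    x = np.zeros(n)
    T = rng.choice(n, size=K, replace=False)            # support T with |T| = K
    x[T] = rng.normal(size=K)                           # nonzero amplitudes

    y = F @ x                                           # m < n: underdetermined system

Since m < n, infinitely many vectors are consistent with y; the sparsity prior is what makes exact recovery possible.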
The second class is greedy search algorithms identifying the support (positions of nonzero elements) of the sparse signal sequentially. In each iteration of these algorithms, the correlations between each column of F and the modified measurement (residual) are compared, and the index (indices) of the one or multiple columns most strongly correlated with the residual is identified as part of the support. In general, the computational complexity of greedy algorithms is much smaller than that of the LP-based techniques, in particular for highly sparse signals (signals with small K). Algorithms in this category include orthogonal matching pursuit (OMP) [1], regularized OMP (ROMP) [18], stagewise OMP (DL Donoho, I Drori, Y Tsaig, JL Starck: Sparse solution of underdetermined linear equations by stagewise orthogonal matching pursuit, submitted), and compressive sampling matching pursuit (CoSaMP) [16].
As a canonical method in this family, the OMP algorithm has received special attention due to its simplicity and competitive reconstruction performance. As shown in Table 1, the OMP algorithm performs the support identification followed by the residual update in each iteration, and these operations are usually repeated K times. It has been shown that the OMP algorithm is robust in recovering both sparse and near-sparse signals [13] with O(nmK) complexity [1]. Over the years, many efforts have been made to find the condition (an upper bound on the restricted isometric constant) guaranteeing the exact recovery of sparse signals. For example, δ_3K < 0.165 for the subspace pursuit [19], δ_4K < 0.1 for the CoSaMP [16], and δ_4K < 0.01/log K for the ROMP [18]. The condition for the OMP is given by [20]

δ_{K+1} < 1/(3√K).   (3)

Recently, improved conditions for the OMP have been reported, including δ_{K+1} < 1/√(2K) [21] and δ_{K+1} < 1/(√K + 1) (J Wang, B Shim: On the recovery limit of orthogonal matching pursuit using restricted isometric property, submitted).
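For concreteness, a straightforward NumPy sketch of the OMP iteration (identify, augment, estimate, update) described above is given below. It follows the structure of Table 1, but it is our own illustrative implementation, not code from the article.

    import numpy as np

    def omp(F, y, K):
        """Sketch of OMP: identify/augment/estimate/update, repeated K times."""
        n = F.shape[1]
        r = y.copy()                                   # r^0 = y
        support = []                                   # T^0 = empty set
        for _ in range(K):
            # Identify: column of F most correlated with the current residual.
            t = int(np.argmax(np.abs(F.T @ r)))
            support.append(t)                          # Augment: T^k = T^{k-1} U {t}
            # Estimate: least-squares fit of y over the columns in T^k.
            x_S, *_ = np.linalg.lstsq(F[:, support], y, rcond=None)
            # Update: residual is orthogonal to span(F_{T^k}) (see Lemma 2.1).
            r = y - F[:, support] @ x_S
        x_hat = np.zeros(n)
        x_hat[support] = x_S
        return x_hat, support

The least-squares step is what distinguishes OMP from plain matching pursuit: it forces the residual to be orthogonal to all previously chosen columns, so no index is picked twice.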
The primary goal of this article is to provide an improved condition ensuring the exact recovery of K-sparse signals via the OMP algorithm. While previously proposed recovery conditions are expressed in terms of δ_{K+1} [20,21], our result, formally described in Theorem 1.1, is expressed in terms of the restricted isometric constant δ_K of order K, so that it is perhaps the most natural and simple to interpret. For instance, our result together with the Johnson-Lindenstrauss lemma [22] can be used to estimate the compression ratio (i.e., the minimal number of measurements m ensuring perfect recovery) when the elements of F are chosen randomly [17]. Besides, we show that our result is sharp in the sense that the condition is close to the limit of the OMP algorithm conjectured by Dai and Milenkovic [19], in particular when K is large. Our result is formally described in the following theorem.

Theorem 1.1. If the restricted isometry constant δ_K of the sensing matrix F satisfies

δ_K < √(K − 1) / (√(K − 1) + K),   (4)

then the OMP algorithm perfectly recovers any K(> 1)-sparse signal x from the measurements y = Fx.
Note that the proposed bound approximately behaves as 1/√K for a large K. In order to get an idea of how close the proposed bound is to the limit conjectured by Dai and Milenkovic, δ_{K+1} = 1/√K, we plot the bound as a function of the sparsity level K in Figure 1. Although the proposed bound is expressed in terms of δ_K while (3) and the limit of Dai and Milenkovic are expressed in terms of δ_{K+1}, so that the comparison is slightly unfavorable for the proposed bound, we still see that the proposed bound is fairly close to the limit for large K.

As mentioned, another interesting result we can deduce from Theorem 1.1 is an estimate of the maximal compression ratio when a Gaussian random matrix is employed in the sensing process. Note that directly checking the condition δ_K < √(K − 1)/(√(K − 1) + K) for a given sensing matrix F is undesirable, especially when n is large and K is nontrivial, since the extremal singular values of all (n choose K) sub-matrices need to be tested. As an alternative, a condition derived from the Johnson-Lindenstrauss lemma has been popularly considered. In fact, it is now well known that m × n random matrices with i.i.d. entries from the Gaussian distribution N(0, 1/m) obey the RIP with δ_K ≤ ε with overwhelming probability if the dimension of the measurements satisfies [17]

m ≥ C K log(n/K) / ε²,

where C is a constant. By applying the result in Theorem 1.1, we can obtain the minimum dimension m ensuring exact reconstruction of K-sparse signals using the OMP algorithm. Specifically, plugging ε = √(K − 1)/(√(K − 1) + K) into the above condition, we have

m ≥ C K (1 + K/√(K − 1))² log(n/K),

so that m is expressed in the order of O(K² log(n/K)). This result is desirable, since the size of measurements m grows moderately with the sparsity level K.
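The following sketch evaluates this measurement estimate numerically; the constant C = 1 is an arbitrary assumption made only to exhibit the O(K² log(n/K)) growth, since the true constant depends on the concentration inequality behind the Johnson-Lindenstrauss argument.

    import numpy as np

    def min_measurements(n, K, C=1.0):
        # eps set to the bound of Theorem 1.1 (requires K > 1); C is assumed.
        eps = np.sqrt(K - 1) / (np.sqrt(K - 1) + K)
        return C * K * np.log(n / K) / eps ** 2        # = O(K^2 log(n/K))

    for K in (4, 8, 16, 32):
        print(K, int(min_measurements(n=1024, K=K)))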
2 Proof of Theorem 1.1

Notations
We now provide a brief summary of the notations used throughout the article.
• |D| is the cardinality of D.
• T\D is the set of all elements contained in T but not in D.
• F_D ∈ ℝ^{m×|D|} is the submatrix of F that only contains columns indexed by D; φ_i denotes the i-th column of F.
• x_D ∈ ℝ^{|D|} is the restriction of the vector x to the elements with indices in D.
• span(F D ) is the span (range) of columns in F D .
• F′_D denotes the transpose of the matrix F_D.
• P_D = F_D F†_D denotes the orthogonal projection onto span(F_D), where F†_D = (F′_D F_D)^{−1} F′_D is the pseudoinverse of F_D.
• P⊥_D = I − P_D is the orthogonal projection onto the orthogonal complement of span(F_D) (see the sketch below).
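The following short sketch numerically illustrates the last two items, using small illustrative dimensions of our own choosing: P_D built from the pseudoinverse is idempotent, and P⊥_D maps any vector into the orthogonal complement of span(F_D).

    import numpy as np

    rng = np.random.default_rng(1)
    m, n = 20, 50
    F = rng.normal(size=(m, n)) / np.sqrt(m)
    D = [3, 17, 42]

    F_D = F[:, D]                          # columns of F indexed by D
    P_D = F_D @ np.linalg.pinv(F_D)        # orthogonal projection onto span(F_D)
    P_perp = np.eye(m) - P_D               # projection onto the complement

    v = rng.normal(size=m)
    print(np.allclose(P_D @ P_D, P_D))             # P_D is idempotent
    print(np.allclose(F_D.T @ (P_perp @ v), 0))    # P_perp v is orthogonal to F_D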

Preliminaries: definitions and lemmas
In this subsection, we provide a definition and lemmas that will be used in the proof of Theorem 1.1.
Definition 1 (Restricted orthogonality constant [23]). For two positive integers K and K′ with K + K′ ≤ n, the (K, K′)-restricted orthogonality constant θ_{K,K′} is the smallest number that satisfies

|⟨Fx, Fx′⟩| ≤ θ_{K,K′} ‖x‖_2 ‖x′‖_2

for all x and x′ such that x and x′ are K-sparse and K′-sparse, respectively, and have disjoint supports.
Lemma 2.1. In the OMP algorithm, the residual r^k is orthogonal to the columns selected in previous iterations. That is, ⟨φ_i, r^k⟩ = 0 for i ∈ T^k.

Lemma 2.2 (Monotonicity of δ_K [19]). If the sensing matrix satisfies the RIP of orders K_1 and K_2 with K_1 ≤ K_2, then δ_{K_1} ≤ δ_{K_2}. This property is referred to as the monotonicity of the restricted isometric constant.

Lemma 2.3 (A direct consequence of the RIP [19]). Let I ⊂ {1, 2, ..., n} and let F_I be the sub-matrix of F that contains the columns of F indexed by I. If δ_{|I|} < 1, then for any u ∈ ℝ^{|I|},

(1 − δ_{|I|})‖u‖_2 ≤ ‖F′_I F_I u‖_2 ≤ (1 + δ_{|I|})‖u‖_2.

Lemma 2.4 (Square root lifting inequality [23]). For a ≥ 1 and positive integers K, K′ such that aK′ is also an integer, we have

θ_{K, aK′} ≤ √a θ_{K, K′}.

Lemma 2.5 (Lemma 2.1 in [13]). For all x, x′ ∈ ℝ^n supported on disjoint subsets I_1, I_2 ⊂ {1, 2, ..., n}, we have

|⟨Fx, Fx′⟩| ≤ δ_{|I_1|+|I_2|} ‖x‖_2 ‖x′‖_2.

Lemma 2.6. For two disjoint sets I_1, I_2 ⊂ {1, 2, ..., n}, let θ_{|I_1|,|I_2|} be the (|I_1|, |I_2|)-restricted orthogonality constant of F. If |I_1| + |I_2| ≤ n, then

‖F′_{I_1} F_{I_2} x_{I_2}‖_2 ≤ θ_{|I_1|,|I_2|} ‖x_{I_2}‖_2.

Proof. Let u ∈ ℝ^{|I_1|} be a unit vector. Then we have

max_{‖u‖_2 = 1} ⟨F_{I_1} u, F_{I_2} x_{I_2}⟩ = max_{‖u‖_2 = 1} ⟨u, F′_{I_1} F_{I_2} x_{I_2}⟩ = ‖F′_{I_1} F_{I_2} x_{I_2}‖_2,

where the maximum of the inner product is achieved when u is in the same direction as F′_{I_1} F_{I_2} x_{I_2}, i.e., u = F′_{I_1} F_{I_2} x_{I_2} / ‖F′_{I_1} F_{I_2} x_{I_2}‖_2. Moreover, from Definition 1, we have

⟨F_{I_1} u, F_{I_2} x_{I_2}⟩ ≤ θ_{|I_1|,|I_2|} ‖u‖_2 ‖x_{I_2}‖_2 = θ_{|I_1|,|I_2|} ‖x_{I_2}‖_2,

and thus ‖F′_{I_1} F_{I_2} x_{I_2}‖_2 ≤ θ_{|I_1|,|I_2|} ‖x_{I_2}‖_2.

Lemma 2.7. For two disjoint sets I_1, I_2 with |I_1| + |I_2| ≤ n, we have

θ_{|I_1|,|I_2|} ≤ δ_{|I_1|+|I_2|}.

Proof. From Lemma 2.5 we directly have |⟨Fx, Fx′⟩| ≤ δ_{|I_1|+|I_2|} ‖x‖_2 ‖x′‖_2 for all x and x′ supported on I_1 and I_2, respectively. By Definition 1, θ_{|I_1|,|I_2|} is the minimal value satisfying |⟨Fx, Fx′⟩| ≤ θ_{|I_1|,|I_2|} ‖x‖_2 ‖x′‖_2, and this completes the proof of the lemma.
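Since δ_K and θ_{K,K′} are defined via worst cases over supports, they can be computed exactly for toy sizes by enumeration. The brute-force sketch below (illustrative dimensions, our own code) checks Lemma 2.7, that is, θ_{|I_1|,|I_2|} ≤ δ_{|I_1|+|I_2|}, on a small random matrix.

    import numpy as np
    from itertools import combinations

    rng = np.random.default_rng(2)
    m, n = 8, 12
    F = rng.normal(size=(m, n)) / np.sqrt(m)

    def delta(F, s):
        # delta_s: worst deviation of the spectrum of F_I' F_I from 1, |I| = s.
        worst = 0.0
        for I in combinations(range(F.shape[1]), s):
            eig = np.linalg.eigvalsh(F[:, list(I)].T @ F[:, list(I)])
            worst = max(worst, eig[-1] - 1.0, 1.0 - eig[0])
        return worst

    def theta(F, s1, s2):
        # theta_{s1,s2}: largest singular value of F_I1' F_I2 over disjoint supports.
        worst = 0.0
        for I1 in combinations(range(F.shape[1]), s1):
            rest = [j for j in range(F.shape[1]) if j not in I1]
            for I2 in combinations(rest, s2):
                sv = np.linalg.svd(F[:, list(I1)].T @ F[:, list(I2)],
                                   compute_uv=False)
                worst = max(worst, sv[0])
        return worst

    print(theta(F, 1, 2), delta(F, 3))     # Lemma 2.7: the first is <= the second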

Proof of Theorem 1.1
Now we turn to the proof of our main theorem. Our proof is in essence based on mathematical induction: first, we show that the index t_1 found in the first iteration is correct (t_1 ∈ T) under (4), and then we show that if the previous indices are correct (more accurately, if T^k = {t_1, t_2, ..., t_k} ⊂ T), then t_{k+1} is also correct (t_{k+1} ∈ T\T^k) under (4).
Proof. Let t_k be the index of the column maximally correlated with the residual r^{k−1} in the k-th iteration of the OMP algorithm. Since r^0 = y for k = 1, t_1 can be expressed as

t_1 = arg max_i |⟨φ_i, y⟩|,

and also

|⟨φ_{t_1}, y⟩| = max_i |⟨φ_i, y⟩| ≥ (1/√K) ‖F′_T y‖_2,   (19)

where (19) uses the fact that |T| = K (x is K-sparse and supported on T). Since y = F_T x_T, we have

‖F′_T y‖_2 = ‖F′_T F_T x_T‖_2 ≥ (1 − δ_K) ‖x_T‖_2,   (21)

where (21) follows from Lemma 2.3, and hence |⟨φ_{t_1}, y⟩| ≥ (1 − δ_K)‖x_T‖_2 / √K. Now, suppose that t_1 does not belong to the support of x (i.e., t_1 ∉ T). Then

|⟨φ_{t_1}, y⟩| = ‖F′_{t_1} F_T x_T‖_2 ≤ θ_{1,K} ‖x_T‖_2,   (23)

where (23) is from Lemma 2.6. This case, however, will never occur if

θ_{1,K} < (1 − δ_K)/√K.   (25)

Let a = K/(K − 1); then a(K − 1) = K is an integer and

θ_{1,K} = θ_{1, a(K−1)} ≤ √(K/(K − 1)) θ_{1,K−1}   (27)
        ≤ √(K/(K − 1)) δ_K,   (28)

where (27) and (28) follow from Lemmas 2.4 and 2.7, respectively. Thus, (25) holds true when

√(K/(K − 1)) δ_K < (1 − δ_K)/√K,

which is equivalent to

δ_K < √(K − 1)/(√(K − 1) + K).   (29)

In summary, if (29) holds, then t_1 ∈ T in the first iteration of the OMP algorithm. Now we assume that the former k iterations are successful (T^k = {t_1, t_2, ..., t_k} ⊂ T) for 1 ≤ k ≤ K − 1. Then it suffices to show that t_{k+1} is in T but not in T^k (i.e., t_{k+1} ∈ T\T^k). Recall from Table 1 that the residual at the k-th iteration of the OMP is expressed as

r^k = y − F_{T^k} x̂^k_{T^k}.

Since y = F_T x_T and F_{T^k} is a submatrix of F_T, r^k ∈ span(F_T), and hence r^k can be expressed as a linear combination of the |T| (= K) columns of F_T. Accordingly, we can express r^k as r^k = Fx^k, where the support (set of indices of nonzero elements) of x^k is contained in the support of x. That is, r^k is a measurement of the K-sparse signal x^k using the sensing matrix F. Applying the argument for the first iteration to this new measurement, it is therefore clear that if T^k ⊂ T, then t_{k+1} ∈ T under (29). Recalling that the residual r^k is orthogonal to the columns already selected (⟨φ_i, r^k⟩ = 0 for i ∈ T^k) from Lemma 2.1, the indices of these columns are not selected again (see the Identify step in Table 1), and hence t_{k+1} ∈ T\T^k. This indicates that under the condition in (4) all the indices in the support T will be identified within K iterations (i.e., T^K = T), and therefore

x̂ = arg min_{u: supp(u) ⊂ T^K} ‖y − Fu‖_2 = x,

which completes the proof.
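The induction argument can also be observed empirically: running the omp() sketch from Section 1 on a well-conditioned random problem, every selected index lies in T and the final residual vanishes. The dimensions below are illustrative assumptions; exact success is of course only guaranteed when the condition (4) holds.

    import numpy as np

    rng = np.random.default_rng(3)
    n, m, K = 256, 100, 5
    F = rng.normal(0.0, 1.0 / np.sqrt(m), size=(m, n))
    x = np.zeros(n)
    T = rng.choice(n, size=K, replace=False)
    x[T] = rng.normal(size=K)

    x_hat, support = omp(F, F @ x, K)              # omp() as sketched in Section 1
    print(sorted(support) == sorted(T.tolist()))   # support identified exactly
    print(np.allclose(x_hat, x))                   # hence x itself is recovered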

3 Discussions
In [19], Dai and Milenkovic conjectured that the sufficient condition of the OMP algorithm guaranteeing exact recovery of K-sparse vectors cannot be relaxed to δ_{K+1} = 1/√K. This conjecture says that if the RIP-based condition is given by δ_{K+1} < ε, then ε should be strictly smaller than 1/√K. In [20], this conjecture has been confirmed via experiments for K = 2.
We now show that our result in Theorem 1.1 agrees with the conjecture, leaving only a marginal gap from the limit. Note that since we cannot directly compare Dai and Milenkovic's conjecture (expressed in terms of δ_{K+1}) with our condition (expressed in terms of δ_K), we need to modify our result. The following proposition provides a slightly loose bound (sufficient condition) of our result expressed in terms of δ_{K+1}.

Proposition 3.1. The OMP algorithm perfectly recovers K-sparse signals if

δ_{K+1} < 1/(√K + 3 − √2).

Proof. The inequality

1/(√K + 3 − √2) ≤ √(K − 1)/(√(K − 1) + K)

holds true for any integer K > 1 (see Appendix), so the hypothesis implies δ_{K+1} < √(K − 1)/(√(K − 1) + K). Also, from the monotonicity of the RIP constant (Lemma 2.2), we have δ_K ≤ δ_{K+1}. Syllogism of the above two conditions yields δ_K < √(K − 1)/(√(K − 1) + K), which is the desired result by Theorem 1.1.

Table 1 The OMP algorithm
Input: measurements y, sensing matrix F, sparsity level K.
Initialize: iteration count k = 0, residual vector r^0 = y, support set estimate T^0 = ∅.
While k < K do
  k = k + 1.
  (Identify) t_k = arg max_i |⟨φ_i, r^{k−1}⟩|.
  (Augment) T^k = T^{k−1} ∪ {t_k}.
  (Estimate) x̂^k = arg min_{u: supp(u) ⊂ T^k} ‖y − Fu‖_2.
  (Update) r^k = y − F x̂^k.
End
Output: x̂ = arg min_{u: supp(u) ⊂ T^K} ‖y − Fu‖_2.
One can clearly observe that the condition δ_{K+1} < 1/(√K + 3 − √2) is better than the condition δ_{K+1} < 1/(3√K) [20], similar to the result of Wang and Shim, and also close to the achievable limit δ_{K+1} < 1/√K, in particular for large K.
Considering that the derived condition δ_{K+1} < 1/(√K + 3 − √2) is only slightly worse than our original condition δ_K < √(K − 1)/(√(K − 1) + K), we may conclude that our result is fairly close to the optimal.
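A quick numerical tabulation of the bounds discussed above (our own computation, mirroring what Figure 1 plots) makes the gap explicit:

    import numpy as np

    for K in (2, 4, 16, 64, 256):
        ours  = np.sqrt(K - 1) / (np.sqrt(K - 1) + K)   # Theorem 1.1 (delta_K)
        prop  = 1.0 / (np.sqrt(K) + 3 - np.sqrt(2))     # Proposition 3.1 (delta_{K+1})
        dw    = 1.0 / (3.0 * np.sqrt(K))                # condition of [20]
        limit = 1.0 / np.sqrt(K)                        # Dai-Milenkovic limit
        print(K, round(ours, 4), round(prop, 4), round(dw, 4), round(limit, 4))

For K = 2 the proposition bound coincides with 1/3, and as K grows both bounds approach the conjectured limit 1/√K while remaining well above 1/(3√K).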

4 Conclusion
In this article, we have investigated the sufficient condition ensuring the exact reconstruction of sparse signals via the OMP algorithm. We showed that if the restricted isometry constant δ_K of the sensing matrix satisfies

δ_K < √(K − 1)/(√(K − 1) + K),

then the OMP algorithm can perfectly recover K(> 1)-sparse signals from measurements. Our result directly indicates that the set of sensing matrices for which exact recovery of sparse signals is possible using the OMP algorithm is wider than what had been proved thus far. Another interesting point that we can draw from our result is that the number of measurements (the dimension of the compressed signal) required for the reconstruction of sparse signals grows moderately with the sparsity level.