On recovery of block-sparse signals via mixed l2/lq(0 < q ≤ 1) norm minimization

Compressed sensing (CS) shows that a sparse signal can be exactly recovered from very few linear measurements. However, in many applications, real-world signals also exhibit additional structure beyond standard sparsity. A typical example is the class of block-sparse signals, whose non-zero coefficients occur in a few blocks. In this article, we investigate the mixed l2/lq (0 < q ≤ 1) norm minimization method for the exact and robust recovery of such block-sparse signals. We mainly show that the non-convex l2/lq (0 < q < 1) minimization method has stronger sparsity-promoting ability than the commonly used l2/l1 minimization method, both practically and theoretically. In terms of a block variant of the restricted isometry property of the measurement matrix, we present weaker sufficient conditions for exact and robust block-sparse signal recovery than those known for l2/l1 minimization. We also propose an efficient Iteratively Reweighted Least-Squares (IRLS) algorithm for the induced non-convex optimization problem. The obtained weaker conditions and the proposed IRLS algorithm are tested and compared with the mixed l2/l1 minimization method and the standard lq minimization method on a series of noiseless and noisy block-sparse signals. All the comparisons demonstrate the superior performance of the mixed l2/lq (0 < q < 1) method for block-sparse signal recovery applications and its value in the development of new CS techniques.


Introduction
According to the Shannon/Nyquist sampling theorem [1,2], if we want to avoid loss of information when capturing a signal, we must sample the signal at the so-called Nyquist rate, i.e., twice the highest frequency of the signal. Since the theorem only exploits the bandlimitedness of a signal, and most real-world signals are sparse or compressible, massive data acquisition based on the Shannon/Nyquist sampling theorem acquires a large amount of redundant information, and we eventually have to compress the samples, storing or encoding only the small amount of essential information in the signal. Obviously, this process is extremely wasteful, and a more effective sampling scheme that directly acquires the essential information of a signal is therefore desirable.
Compressed sensing (CS) [3][4][5] was motivated by this purpose: it can completely acquire the essential information of a signal by exploiting its compressibility. In a word, the main contribution of CS is that it presents a new efficient scheme to capture and recover compressible or sparse signals at a sampling rate far below the Nyquist rate. The basic principle of CS is that it first employs non-adaptive linear projections to preserve the structure of the signal; one can then exactly recover the signal from a surprisingly small number of random linear measurements through a nonlinear optimization procedure (such as l1-minimization), provided the measurement matrix satisfies suitable sufficient conditions in terms of the restricted isometry property (RIP, [6]). Consequently, CS implies that it is indeed possible to acquire data in already compressed form. Nowadays, driven by a wide range of applications, CS and related problems have attracted much interest in various communities, such as signal processing, machine learning, and statistics.
(*Correspondence: wjj@swu.edu.cn. School of Mathematics and Statistics, Southwest University, Chongqing 400715, China. Full list of author information is available at the end of the article. Article URL: http://asp.eurasipjournals.com/content/2013/1/76.)
Unlike general sparse signals in the conventional sense, some real-world signals may exhibit additional structure, i.e., the non-zero coefficients appear in a few fixed blocks; we refer to these signals as block-sparse signals in this article. Such block-sparse signals arise in various application problems, e.g., DNA microarrays [7,8], equalization of sparse communication channels [9], source localization [10], wideband spectrum sensing [11], and color imaging [12].
Using the standard convex relaxation (l1-minimization) of the conventional CS framework to recover block-sparse signals does not exploit the fact that the non-zero elements of the signal appear in consecutive positions. Therefore, one natural idea is to consider the block version of l1-minimization, i.e., mixed l2/l1-minimization, to exploit the block-sparsity. Many previous works have shown that mixed l2/l1-minimization is superior to standard l1-minimization when dealing with such block-sparse signals [13][14][15]. Huang and Zhang [13] developed a theory for mixed l2/l1-minimization using a concept called strong group sparsity, and demonstrated that the mixed norm minimization is very efficient for recovering strongly group-sparse signals. Stojnic et al. [15] obtained an optimal number of Gaussian measurements for uniquely recovering a block-sparse signal through mixed l2/l1 norm minimization. By generalizing the conventional RIP notion to the block-sparse case, Eldar and Mishali [14] showed that if the measurement matrix D satisfies the same restricted isometry constant bound as in the l1 case, then the mixed norm method is guaranteed to exactly recover any block-sparse signal in the noiseless case. Furthermore, they also showed that block-sparse signal recovery is robust in the noisy case under the same recovery condition. Another common approach to the block-sparsity problem is to suitably extend standard greedy methods, such as orthogonal matching pursuit, iterative hard thresholding (IHT), and compressive sampling matching pursuit (CoSaMP), to the block-sparse case. In [16], the CoSaMP and IHT algorithms were extended to the model-based setting, treating block-sparsity as a special case; the new recovery algorithms were shown to have provable performance guarantees and robustness properties. Eldar et al. [17] generalized the notion of coherence to the block-sparse setting and proved that a block version of the orthogonal matching pursuit (BOMP) algorithm can exactly recover any block-sparse signal if the block-coherence is sufficiently small. In addition, the mixed l2/l1-minimization approach was shown to guarantee successful recovery under the same condition on the block-coherence. Ben-Haim and Eldar [18] examined the ability of greedy algorithms to estimate a block-sparse signal from noisy measurements, deriving near-oracle results for block-sparse versions of greedy pursuit algorithms in both the adversarial noise and Gaussian noise cases. Majumdar and Ward [19] used BOMP for the block-sparse representation-based classification problem. The validity and robustness of these new methods were theoretically proved.
In recent years, several studies [20][21][22][23][24][25][26][27] have shown that non-convex lq (0 < q < 1) minimization allows the exact recovery of sparse signals from fewer linear measurements than l1-minimization. Chartrand and Staneva [21] provided a weaker condition guaranteeing perfect recovery for the non-convex lq (0 < q < 1) minimization method using an lq variant of the RIP, and obtained the number of random Gaussian measurements necessary for the successful recovery of sparse signals via lq (0 < q < 1) minimization with high probability. Sun [27] used the conventional RIP, as in the l1 case, to prove that whenever q is chosen to be about 0.6796 × (1 − δ_2k), every k-sparse signal can be exactly recovered via lq minimization, where δ_2k is the restricted isometry constant of the measurement matrix. Xu et al. [24][25][26] considered the especially important case q = 1/2 of lq minimization: they developed a thresholding representation theory for l1/2 minimization and conducted a phase diagram study to demonstrate its merits. This article presents an ongoing effort to extend the non-convex lq (0 < q < 1) minimization methodology to the setting of block-sparsity. Specifically, we study the performance of block-sparse signal recovery via mixed l2/lq (0 < q < 1) norm minimization by means of the block RIP (block-RIP). We first show that, under RIP conditions similar to those in the standard lq case, the mixed l2/lq recovery method provably recovers any block-sparse signal, irrespective of the locations of the non-zero blocks; in addition, the method is robust in the presence of noise. Our recovery conditions show that non-convex l2/lq (0 < q < 1) minimization is superior to convex l2/l1 minimization within the block-RIP framework.
Furthermore, we compare the sparse signal recovery ability of the non-convex l2/lq (0 < q < 1) method with the convex l2/l1 method and the standard lq method through a series of simulation studies. To the best of the authors' knowledge, although Majumdar and Ward [12] first proposed the non-convex l2/lq (0 < q < 1) method in the CS literature for color imaging, and showed that l2/l0.4 minimization has the best performance in some imaging experiments, their work was experimental in nature and lacked a convincing theoretical assessment. In comparison, our work not only highlights the theoretical merits of the non-convex block optimization method, but also studies more intensively the block-sparse signal recovery capabilities for several different values of q via numerical experiments.
We begin our study in Sections 2, 3, and 4 by presenting the problem setting. In Section 5, we establish sufficient conditions, in terms of the block-RIP, for the mixed l2/lq (0 < q < 1) optimization approach to guarantee exact and robust recovery of block-sparse signals. In Section 6, we develop an efficient Iteratively Reweighted Least-Squares (IRLS) algorithm to recover block-sparse signals from few given measurements, which generalizes the algorithm of [28] to the unconstrained l2/lq (0 < q ≤ 1) norm minimization case. In Section 7, we show through a series of simulations that the non-convex l2/lq (0 < q < 1) method has stronger block-sparsity-promoting ability than the convex l2/l1 method and the standard lq method. Finally, we conclude the article in Section 8 with some useful remarks.

Block-sparsity
Conventional CS only considers the sparsity assumption that the signal x has at most k non-zero elements, and does not take into account any further structure. However, in many practical scenarios, the non-zero elements are aligned in blocks, meaning that they appear in regions; such signals are referred to as block-sparse signals. Mathematically speaking, a block-sparse signal x ∈ R^N over the block index set I = {d_1, ..., d_m} can be modeled as

x = (x[1]^T, x[2]^T, ..., x[m]^T)^T. (1)

Here, x[i] denotes the ith block of x and d_i is the block size of the ith block. The block-sparsity we consider in this article means that there are at most k < m non-zero blocks. Obviously, if d_1 = ... = d_m = 1, block-sparse signals degenerate to the conventional sparse signals well studied in CS. The main focus of this study is to recover a block-sparse signal x from random linear measurements y = Φx (noiseless case) or y = Φx + z (noisy case). Here, y ∈ R^M is a vector, Φ ∈ R^{M×N} is a measurement matrix, whose entries are usually drawn at random from a Gaussian or a Bernoulli distribution, and z is an unknown bounded noise. We represent Φ as a concatenation of column-blocks Φ[i] of size M × d_i, that is,

Φ = [Φ[1], Φ[2], ..., Φ[m]]. (2)

We are then interested in formulating sufficient conditions on the measurement matrix Φ under which a block-sparse signal x can assuredly and stably be recovered from its few noiseless measurements. Defining the mixed l2/l0 norm

||x||_{2,0} = Σ_{i=1}^{m} I(||x[i]||_2 > 0), (3)

where I(||x[i]||_2 > 0) is an indicator function, we notice that a block k-sparse signal x can be defined as a vector satisfying ||x||_{2,0} ≤ k. In the remainder of the article, we restrict our attention to how, and under what conditions, these block-sparse signals can be recovered exactly and stably in the noiseless and noisy scenarios, respectively.
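As a concrete illustration of these block norms, the following numpy sketch evaluates ||x||_{2,0} and the mixed l2/lq quasi-norm for a block-partitioned vector (the helper names are ours, not from the paper):

```python
import numpy as np

def block_norms(x, blocks):
    """l2 norm of each block; `blocks` lists the block sizes d_1, ..., d_m."""
    idx = np.cumsum([0] + list(blocks))
    return np.array([np.linalg.norm(x[idx[i]:idx[i + 1]]) for i in range(len(blocks))])

def mixed_l2l0(x, blocks, tol=1e-12):
    """Block-sparsity ||x||_{2,0}: number of blocks with non-zero l2 norm."""
    return int(np.sum(block_norms(x, blocks) > tol))

def mixed_l2lq(x, blocks, q):
    """Mixed l2/lq quasi-norm (sum_i ||x[i]||_2^q)^(1/q), for 0 < q <= 1."""
    return np.sum(block_norms(x, blocks) ** q) ** (1.0 / q)

# A length-8 signal with m = 4 blocks of size 2; only blocks 1 and 3 are active.
x = np.array([1.0, -2.0, 0.0, 0.0, 3.0, 0.0, 0.0, 0.0])
blocks = [2, 2, 2, 2]
print(mixed_l2l0(x, blocks))  # prints 2 (two active blocks)
```

For q = 1 this reduces to the mixed l2/l1 norm used by the convex relaxation; as q decreases toward 0, the quasi-norm increasingly resembles a count of active blocks.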

Block-RIP
Candès and Tao [6] first introduced the notion of the RIP of a matrix to characterize the conditions under which the sparsest solution of an underdetermined linear system exists and can be found. The RIP was then used as a powerful tool to study CS in several previous works [4,5,21,29]. Let Φ be a matrix of size M × N, where M < N. We say that Φ satisfies the RIP of order k if there exists a constant δ_k ∈ [0, 1) such that for every x ∈ R^N with ||x||_0 ≤ k,

(1 − δ_k) ||x||_2^2 ≤ ||Φx||_2^2 ≤ (1 + δ_k) ||x||_2^2. (4)

Obviously, δ_k quantifies how close to isometries all M × k submatrices of Φ are. Since block-sparse signals exhibit additional structure, Eldar and Mishali [14] extended the standard RIP to the block-sparse setting and showed that the new block-RIP constant is typically smaller than the standard RIP constant. We say that Φ satisfies the block-RIP over the block index set I with constant δ_{k|I} ∈ [0, 1) if for every block k-sparse vector x ∈ R^N,

(1 − δ_{k|I}) ||x||_2^2 ≤ ||Φx||_2^2 ≤ (1 + δ_{k|I}) ||x||_2^2. (5)
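For intuition, the block-RIP constant of a small matrix can be computed by brute force over all k-block supports. The sketch below (our own helper, feasible only for tiny problems, assuming an even block size) returns the smallest δ satisfying the two-sided bound above for all block k-sparse x:

```python
import numpy as np
from itertools import combinations

def block_rip_constant(Phi, block_size, k):
    """Exhaustively compute the block-RIP constant of order k (even block size).

    For each support of k blocks, the extreme squared singular values of the
    corresponding column submatrix bound ||Phi x||^2 / ||x||^2 above and below.
    """
    M, N = Phi.shape
    m = N // block_size
    delta = 0.0
    for T in combinations(range(m), k):
        cols = np.concatenate([np.arange(b * block_size, (b + 1) * block_size) for b in T])
        sv = np.linalg.svd(Phi[:, cols], compute_uv=False)
        delta = max(delta, sv[0] ** 2 - 1.0, 1.0 - sv[-1] ** 2)
    return delta

rng = np.random.default_rng(0)
M, N, d, k = 12, 16, 2, 2
Phi = rng.standard_normal((M, N)) / np.sqrt(M)  # variance 1/M, so columns have unit norm in expectation
print(round(block_rip_constant(Phi, d, k), 3))
```

A matrix with orthonormal columns has δ = 0, and δ_k is non-decreasing in k; for Gaussian matrices, δ concentrates around small values as M grows.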
For convenience, in the remainder of the article we still use δ_k, instead of δ_{k|I}, to denote the block-RIP constant whenever no confusion arises.
With the new notion, Eldar and Mishali [29] generalized the sufficient recovery conditions to block-sparse signals in both the noiseless and noisy settings. They showed that if Φ is drawn at random as in conventional CS, it satisfies the block-RIP with overwhelming probability. All these results illustrate that one can recover a block-sparse signal exactly and stably via the convex mixed l2/l1 minimization method whenever the measurement matrix is constructed from a random ensemble (e.g., the Gaussian ensemble).

Non-convex recovery method
It is known from [14] that whenever Φ satisfies the block-RIP with δ_2k < 1, there is a unique block-sparse signal x which can be recovered by solving the following problem:

min_x ||x||_{2,0}  subject to  y = Φx. (6)

Unfortunately, problem (6) is NP-hard, and finding its optimal solution has exponential complexity. In principle, one can only solve the problem exactly by searching over all possible sets of k blocks to check whether there exists a vector consistent with the measurements. Obviously, this approach cannot deal with high-dimensional signals.
One natural idea to find x more efficiently is to employ a convex relaxation technique, namely, to replace the l2/l0 norm by its closest convex surrogate, the l2/l1 norm, thus resulting in the following model:

min_x ||x||_{2,1}  subject to  y = Φx, (7)

where ||x||_{2,1} = Σ_{i=1}^{m} ||x[i]||_2. This model can be treated as a second-order cone program (SOCP) problem, and many standard software packages can solve it very efficiently. In many practical cases, the measurements y are corrupted by bounded noise; then we can apply the modified SOCP or the group version of the basis pursuit denoising [30] program as follows:

min_x (1/2) ||y − Φx||_2^2 + λ ||x||_{2,1}, (8)

where λ is a tuning parameter which controls the tolerance of the noise term. There are also many methods to solve this optimization problem efficiently, such as the block-coordinate descent technique [31] and the Landweber iteration technique [32]. As mentioned before, recent studies on non-convex CS have indicated that one can reduce the number of linear measurements required for successful recovery of a general sparse signal by replacing the l0 norm by a non-convex surrogate lq (0 < q < 1) quasi-norm, which motivates us to generalize the better recovery capability of non-convex CS to the block-sparse setting. Therefore, we suggest the following non-convex optimization model for the recovery of block-sparse signals:

min_x ||x||_{2,q}^q  subject to  ||y − Φx||_2 ≤ ε, (9)

where ε ≥ 0 controls the noise error term (ε = 0 means the noiseless case) and ||x||_{2,q} = (Σ_{i=1}^{m} ||x[i]||_2^q)^{1/q} is a generalization of the standard lq quasi-norm for 0 < q < 1. We will show that this new non-convex recovery approach achieves better block-sparse recovery performance, both practically and theoretically, than the commonly used convex l2/l1 minimization approach. In the following section, we provide sufficient conditions for exact and stable recovery of block-sparse signals through mixed l2/lq (0 < q < 1) norm minimization, and further develop an IRLS algorithm similar to those in [28,33] for the solution of this non-convex optimization problem.

Sufficient block-sparse recovery conditions
In this section, we first consider the recovery problem of a high-dimensional signal x ∈ R^N in the noiseless setting. Thus, we propose the constrained mixed l2/lq norm minimization with 0 < q < 1:

min_x ||x||_{2,q}^q  subject to  y = Φx, (10)

where y ∈ R^M are the available measurements and Φ is a known M × N measurement matrix.
To state our main results, we need more notation. We first consider the case where x is exactly block k-sparse. We use Null(Φ) to denote the null space of Φ and T to denote the block index set of the non-zero blocks of the signal x. Let x* be a solution of the minimization problem (10). From [15], it is known that x* is the unique sparse solution of (10), equal to x, if and only if ||h_T||_{2,q}^q < ||h_{T^c}||_{2,q}^q for every non-zero vector h in the null space of Φ. This is called the null space property (NSP). In order to characterize the NSP more accurately, one can consider the following equivalent form: there exists a smallest constant ρ with 0 < ρ < 1 such that

||h_T||_{2,q}^q ≤ ρ ||h_{T^c}||_{2,q}^q  for all non-zero h ∈ Null(Φ). (11)

When x is not exactly block sparse, Aldroubi et al. [34] also showed that the NSP actually guarantees stability. Precisely, if we use T to denote the block index set of the k blocks of x with largest l2 norm, then the NSP (11) gives

||x − x*||_{2,q}^q ≤ C ||x_{T^c}||_{2,q}^q, (12)

where C is a constant. Indeed, writing

γ(h, q) = ||h_T||_{2,q}^q / ||h_{T^c}||_{2,q}^q,

the NSP (11) amounts to requiring γ(h, q) ≤ ρ < 1. The main point of our study is therefore to show how to make γ(h, q) < 1 for every non-zero vector h in the null space of Φ. Our first conclusion is the following theorem.

Theorem 1 (Noiseless recovery). Let y = Φx be measurements of a signal x. If the matrix Φ satisfies the block-RIP (5) with δ_2k < 1/2, then there exists a number q_0(δ_2k) ∈ (0, 1] such that for any q < q_0, the solution x* to the mixed l2/lq problem (10) obeys the error bounds (13), in which the recovery error of x* is controlled, through positive constants C_0(q, δ_2k) and C_1(q, δ_2k) depending on q and δ_2k, by the best block k-term approximation error ||x_{T_0^c}||_{2,q}; here T_0 is the block index set of the k blocks of the original signal x with largest l2 norm. In particular, if x is block k-sparse, the recovery is exact.
Remark 1. Theorem 1 provides a sufficient condition for the recovery of a signal x via l2/lq minimization with 0 < q < 1 in the noiseless setting. Focusing on the case where x is block k-sparse, it is known [14] that when the l2/l1 minimization scheme (7) is employed to recover x, the sufficient condition on the block-RIP constant is δ_2k < 0.414. In comparison, Theorem 1 says that whenever the non-convex l2/lq minimization scheme (10) is used, this constant can be relaxed to δ_2k < 0.5 for some q < 1. This shows that, just as for standard sparse signal recovery, the non-convex minimization method can enhance the performance of block-sparse signal recovery compared with the convex minimization method.

Lemma 2 ([35]). For any fixed q ∈ (0, 1) and x ∈ R^N,

Proof of Theorem 1. Let x* = x + h be a solution of (10), where x is the original signal we wish to reconstruct. Throughout the article, x_T denotes the vector equal to x on an index set T and zero elsewhere. Let T_0 be the block index set of the k blocks of x with largest l2 norm, and decompose h into a series of vectors h_{T_0}, h_{T_1}, ..., h_{T_J}. Here h_{T_i} is the restriction of h to the set T_i, and each T_i consists of k blocks (except possibly T_J), the block indices being rearranged so that the block norms of h appear in decreasing order. Note that

From Lemma 2, it follows that

From Equation (18), we obtain

where we have used the fact that (a + b)^q ≤ a^q + b^q for non-negative a and b. Therefore, we have

On the other hand, let

Since

By the definition of the block-RIP, Lemma 1, (20), and (22), it follows that

By the definition of δ_2k and using Hölder's inequality, we further have (24) and (25).
From the definition of γ(h, q), we have

As ||x*||_{2,q}^q = ||x + h||_{2,q}^q is the minimum, using the equation

Since

That is,

Thus, we have

This proves the first inequality of (13). In the following, we further prove the second inequality of (13). In effect, from (24) and (25), we have

which, together with (22) and (30), implies

Thus, it follows from (31) and (32) that

That is, the second inequality in (13) holds. With this, the proof of Theorem 1 is completed.
We further consider the recovery problem for arbitrary high-dimensional signals in the presence of noise. In this situation, the measurements can be expressed as

y = Φx + z, (33)

where z ∈ R^M is an unknown noise term. In order to reconstruct x, we adopt the constrained mixed l2/lq minimization scheme with 0 < q < 1:

min_x ||x||_{2,q}^q  subject to  ||y − Φx||_2 ≤ ε, (34)

where ε > 0 is a bound on the noise level. We show below that one can also recover x stably and robustly under the same assumptions as those in Theorem 1.

Theorem 2.
(Noisy recovery). Let y = Φx + z with ||z||_2 ≤ ε be noisy measurements of a signal x. If the matrix Φ satisfies the block-RIP (5) with δ_2k < 1/2, then there exists a number q_0(δ_2k) ∈ (0, 1] such that for any q < q_0, the mixed l2/lq method (34) stably and robustly recovers the original signal x. More precisely, the solution x* of (34) obeys the estimate (35), in which the recovery error ||x* − x||_2 is bounded, through positive constants C_1(q, δ_2k) and C_2(q, δ_2k) depending on q and δ_2k, by the best block k-term approximation error ||x_{T_0^c}||_{2,q} and the noise level ε; here T_0 is the block index set of the k blocks of the original signal x with largest l2 norm.

Remark 2.
The inequality (35) in Theorem 2 offers an upper bound on the recovery error of the mixed l2/lq minimization (q ∈ (0, q_0)). In particular, this estimate shows that the recovery accuracy of mixed l2/lq minimization is controlled by the degree of sparsity of the signal and by the exponent q. It thus reveals the close connection between the recovery precision the mixed l2/lq minimization method may achieve, the sparsity of the signal, and the index q used in the recovery procedure. In addition, the estimate (35) shows that the recovery precision of the method (34) can be bounded by the noise level. In this sense, Theorem 2 shows that, under certain conditions, a block k-sparse signal can be robustly recovered by the method (34).
Proof of Theorem 2. The proof of Theorem 2 is similar to that of Theorem 1, with minor differences.
In more detail, we again set x* = x + h. Due to the presence of noise, h no longer necessarily lies in the null space of Φ, but we can still prove Theorem 2 under the same assumptions.
We still denote by T_0 the block index set of the k blocks of x with largest l2 norm, and by h_{T_0} the restriction of h to these blocks. We also define h_{T_j} (j ≥ 1) as in the proof of Theorem 1. By ||z||_2 ≤ ε and the triangle inequality, we first have

By the block-RIP, we get

Hence, (38)

On the other hand,

From (38) and (39), we thus have

and, using (23), an easy calculation shows that the maximum of g(t) occurs at t_1 = (2 − q)/2 and

By (23) and the condition on f(t) in the proof of Theorem 1, we also have

Plugging (42) and (43) into (40), it is easy to see that

Consequently, we obtain

In the following, we further show that from f(t_0) < 1 one can obtain the conclusion of Theorem 2; precisely, under the same condition on δ_2k, we can prove Theorem 2. Note that from (27), we also have

Plugging (44) into the above inequality and using f(t_0) < 1, one can show that

Since ||v||_q ≤ 2^{1/q−1} ||v||_1 for v ∈ R^2, we further have

and by (37), (22), and (23), we also have

where we have used the fact that

Thus, it then follows from (45) and (47) that

This establishes the conclusion of Theorem 2.

An IRLS algorithm
Inspired by the ideas of [28,33], in this section we present an efficient IRLS algorithm for the solution of the mixed l2/lq norm minimization problem (34). We first rewrite the problem as the following regularized version of the unconstrained smoothed l2/lq minimization:

min_x (1/(2τ)) ||y − Φx||_2^2 + Σ_{i=1}^{m} (ε^2 + ||x[i]||_2^2)^{q/2}, (49)

where τ is a regularization parameter and ε is a smoothing parameter. Let J_{2,q}(x, ε, τ) denote the objective function associated with (49), that is,

J_{2,q}(x, ε, τ) = (1/(2τ)) ||y − Φx||_2^2 + Σ_{i=1}^{m} (ε^2 + ||x[i]||_2^2)^{q/2}. (50)

Then the first-order necessary optimality condition for the minimizer x is that the gradient of J_{2,q} vanishes. Hence, we define the diagonal weighting matrix W block-wise by W_i = diag(q^{1/2} (ε^2 + ||x[i]||_2^2)^{q/4−1/2}) for the ith block; after simple calculations, we obtain the following necessary optimality condition:

Φ^T(Φx − y) + τ W^T W x = 0. (51)

Due to the nonlinearity, there is no straightforward method to solve the above system of equations. But if we fix W = W^(t) to be the matrix already determined in the tth iteration step, the solution of (51), taken as the (t + 1)th iterate, can be found to be

x^(t+1) = (Φ^T Φ + τ (W^(t))^T W^(t))^{−1} Φ^T y. (52)

This defines a natural iterative method for the solution of (49). We formalize this reweighted algorithm as follows:

Algorithm 1. An IRLS algorithm for the unconstrained smoothed l2/lq (0 < q ≤ 1) minimization problem
Step 1. Initialize x^(0) = arg min_x ||y − Φx||_2^2 and let k̂ be the estimated block-sparsity. Set ε_0 = 1, t = 0.
Step 2. For t = 0, 1, 2, ..., compute x^(t+1) by (52) with W = W^(t), and update ε_{t+1} = min(ε_t, α r(x^(t+1))_{k̂+1}); repeat until ε_{t+1} = 0, or otherwise until a prescribed maximum number of iterations is reached.
Step 3. Output x^(t+1) as an approximate solution.
In Algorithm 1, r(x)_{k̂+1} is the (k̂ + 1)th largest l2 norm of the blocks of x in decreasing order, α ∈ (0, 1) is a number such that α r(x^(1))/N < 1, and τ is an appropriately chosen parameter which controls the tolerance of the noise term. Although the best τ may change continuously with the noise level, we use a fixed value of τ in our numerical implementations.
Obviously, from the update equation (52), we can see that x^(t+1) can be understood as the minimizer of (1/(2τ)) ||y − Φx||_2^2 + ||W^(t) x||_2^2, a weighted least-squares problem; the iterative refitting of W^(t) gives the method the name IRLS. Majumdar and Ward [12] first adopted the IRLS methodology to solve a non-convex l2/lq block-sparse optimization problem, but their algorithm is unsuitable for noisy block-sparse recovery. Lai et al. [28] proposed an IRLS algorithm for the unconstrained lq (0 < q ≤ 1) minimization problem and gave a detailed analysis covering convergence, the local convergence rate, and error bounds of the algorithm. To some extent, our proposed algorithm can be considered a generalization of their algorithm to the setting of block-sparse recovery.
Note that {ε_t} in Algorithm 1 is a bounded non-increasing sequence, and thus converges to some ε* ≥ 0. Using an argument similar to that in [28], one can prove that {x^(t)} must have a convergent subsequence, and that the limit of the subsequence is a critical point of (50) whenever ε* > 0. In addition, when ε* = 0, there exists a convergent subsequence whose limit x* is a sparse vector with block-sparsity ||x*||_{2,0} ≤ k̂. Furthermore, one can also verify a super-linear local convergence rate for the proposed Algorithm 1. Due to space limitations, we leave the detailed analysis to the interested reader.
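A compact numpy rendering of Algorithm 1 might look as follows. This is a sketch under our own reading of the update rules (even block sizes, a fixed τ, the ε-update ε_{t+1} = min(ε_t, α·r(x^(t+1))_{k̂+1}), and hypothetical parameter names), not the authors' reference implementation:

```python
import numpy as np

def irls_block_l2lq(Phi, y, block_size, q=0.5, tau=1e-5, k_est=4,
                    alpha=0.7, max_iter=2000, tol=1e-8):
    """IRLS sketch for min (1/(2*tau))||y - Phi x||^2 + sum_i (eps^2 + ||x[i]||^2)^(q/2)."""
    M, N = Phi.shape
    m = N // block_size
    # Step 1: least-squares initialization x^(0) = argmin ||y - Phi x||_2^2.
    x = np.linalg.lstsq(Phi, y, rcond=None)[0]
    eps = 1.0
    for _ in range(max_iter):
        bn = np.linalg.norm(x.reshape(m, block_size), axis=1)  # block l2 norms
        r = np.sort(bn)[::-1]                                  # decreasing order
        if k_est < m:
            eps = min(eps, alpha * r[k_est])  # driven by the (k_est+1)-th largest block norm
        if eps == 0.0:
            break                             # Algorithm 1 terminates when eps hits zero
        # Per-block weights q*(eps^2 + ||x[i]||^2)^(q/2 - 1): the diagonal of W^T W.
        w = q * (eps ** 2 + bn ** 2) ** (q / 2.0 - 1.0)
        D = np.repeat(w, block_size)
        # Reweighted least-squares update x^(t+1) = (Phi^T Phi + tau*diag(D))^{-1} Phi^T y.
        x_new = np.linalg.solve(Phi.T @ Phi + tau * np.diag(D), Phi.T @ y)
        if np.linalg.norm(x_new - x) < tol:
            x = x_new
            break
        x = x_new
    return x
```

On a well-conditioned noiseless instance (e.g., two active size-4 blocks out of 16, with 48 Gaussian measurements), this iteration typically recovers the signal to high accuracy; as ε shrinks, the weights of inactive blocks grow and drive them to zero.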

Numerical experiments
In this section, we conduct two numerical experiments to compare the non-convex l2/lq (0 < q < 1) minimization method with the l2/l1 minimization method and the standard lq (0 < q ≤ 1) method in the context of block-sparse signal recovery. Note that this is possible because Algorithm 1 also applies to the standard lq (0 < q ≤ 1) minimization method. For all compared methods, we use the same starting point x^(0) = arg min_x ||y − Φx||_2^2. In our experiments, the measurement matrix Φ was generated as an M × N matrix with i.i.d. draws from a standard Gaussian distribution N(0, 1). We considered four different values q = 0.1, 0.5, 0.7, 1 for both the l2/lq minimization method and the lq minimization method. The purpose of the experiments is to compare the recovery performance of the mixed l2/lq method and the lq method for block-sparse signals without and with noise, respectively.

Noiseless recovery
In this set of experiments, we considered the case where the signals are measured perfectly, without noise. We first randomly generated the block-sparse signal x with values chosen from a Gaussian distribution with mean 0 and standard deviation 1, and then randomly drew a measurement matrix Φ from the Gaussian ensemble. We then observed the measurements y from the model y = Φx. In all experiment cases, if ε_{t+1} < 10^{−7} or ||x^(t+1) − x^(t)||_2 < 10^{−8}, the iteration terminates and outputs x^(t+1) as an approximate solution of the original signal x; otherwise, we let the algorithms run to the maximum number of 2000 iterations. We set the parameters τ and α to 10^{−5} and 0.7, respectively. We tested Algorithm 1 for different initial block-sparsity estimates and found that any overestimate k̂ of k yields similar results; a typical simulation result is shown in Figure 1. Therefore, for simplicity, we set k̂ = k + 1 in our implementation. Figure 2a depicts an instance of the generated block-sparse signal with signal length N = 512. There are 128 blocks with uneven block sizes, of which 16 are active, giving the sparsity k_0 = ||x||_0 = 101. Figure 2b-d shows the recovery results of the standard lq method with q = 1, the standard lq method with q = 0.5, and the mixed l2/l1 method, respectively, when M = 225. Since the sample size M is only around 2.2 times the signal sparsity k_0, the standard l1 method does not yield good recovery results, whereas the mixed l2/l1 method and the non-convex lq (q = 0.5) method achieve near-perfect recovery of the original signal. The results illustrate that if one incorporates the block-sparsity structure into the recovery procedure, the block version of convex l1 minimization also reduces the number of measurements required, much as the standard non-convex lq minimization with some q < 1 does.
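The data-generation protocol of this experiment can be sketched as follows (our own stand-in for the authors' setup; function and variable names are hypothetical, and no recovery step is included):

```python
import numpy as np

def make_block_sparse_signal(rng, m=128, k=16, N=512):
    """Random block-sparse signal: m blocks of uneven size summing to N,
    k of them active and filled with N(0, 1) entries."""
    # Random uneven partition: m-1 distinct cut points inside [1, N).
    cuts = np.sort(rng.choice(np.arange(1, N), size=m - 1, replace=False))
    sizes = np.diff(np.concatenate(([0], cuts, [N])))   # m positive block sizes
    x = np.zeros(N)
    active = rng.choice(m, size=k, replace=False)       # k randomly chosen active blocks
    idx = np.concatenate(([0], np.cumsum(sizes)))
    for b in active:
        x[idx[b]:idx[b + 1]] = rng.standard_normal(sizes[b])
    return x, sizes, active

rng = np.random.default_rng(0)
x, sizes, active = make_block_sparse_signal(rng)
M = 225
Phi = rng.standard_normal((M, x.size))                  # Gaussian measurement matrix
y = Phi @ x                                             # noiseless measurements
```

Each trial of the experiment then feeds y and Phi to one of the compared solvers and records the recovery error against x.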
We further compared the recovery performance of the standard lq method and the mixed l2/lq method for different values of q. Figure 3 shows an instance similar to Figure 2. We generated a block-sparse signal with the same non-zero block locations as in Figure 2a, and observed M = 144 measurements, which is only around 1.4 times the signal sparsity k_0. From Figure 3, we can see that only the non-convex l2/lq method with q = 0.5 or 0.1 achieves near-optimal recovery, while the other methods fail. The results illustrate that for any q ≤ 0.5, the mixed l2/lq method can exactly recover the original signal. In addition, the results demonstrate the superior performance of the non-convex mixed l2/lq (0 < q < 1) method over the standard non-convex lq (0 < q < 1) method. Figure 4a shows the effect of sample size, where we report the average root mean square error (RMSE) over 100 independent random trials on a logarithmic scale for each sample size. In this case, we set the signal length to N = 256; there are 64 blocks with uneven block sizes, and the k = 4 active blocks were randomly selected from the 64 blocks. The figure indicates the decay in recovery error as a function of sample size for all the algorithms. We observe that both the lq and the mixed l2/lq methods improve in recovery performance as q decreases, and that for a fixed q, the mixed l2/lq method is clearly superior to the standard lq method in this block-sparse setting. To further study the effect of the active block number k (with k_0 fixed), we drew a matrix Φ of size 128 × 256 from the Gaussian ensemble. We set the signal x to have even block sizes and total sparsity k_0 = 64, and varied the block size while keeping the other parameters unchanged. Figure 4b shows the average RMSE over 100 independent random runs on a logarithmic scale.
One can easily see that the recovery performance of the standard lq method is independent of the active block number, while the recovery errors of the mixed l2/lq method are significantly better when the active block number k is far smaller than the total signal sparsity k_0. As expected, the performance of the mixed l2/lq method becomes identical to that of the standard lq method when k = k_0. This illustrates that the mixed method favors large block sizes when the total sparsity k_0 is fixed. Moreover, as with the standard lq method, the mixed l2/lq method performs better and better as q decreases.

Noisy recovery
In this experiment, we considered recovering block-sparse signals in the presence of noise. We observed the measurements y from the model y = Φx + z, where Φ and x were generated as in the last subsection and z was zero-mean Gaussian noise with standard deviation σ. In our implementation of this experiment, we set τ = 10^{-1} max |Φ^T y| and kept the other parameters unchanged. Table 1 lists the relative errors between the true solutions and the approximate solutions yielded, respectively, by the mixed l2/lq method and the lq method, with the active block number k varying in {4, 12}, the sample size ratio r = M/k0 in {3, 4}, and the noise level σ in {0.02, 0.05, 0.10}. Here, the relative error is defined as ‖x − x*‖2/‖x‖2. Table 1 reports the average relative errors and the standard deviations over 100 random trials. From the table, it is seen that, for a fixed q, the mixed l2/lq method always obtains better results than the standard lq method. In the low-noise cases (say, σ = 0.02, 0.05), the mixed l2/lq method improves its recovery performance as q decreases. However, when σ = 0.10, the mixed l2/lq method is not always able to improve the recovery results when q ≤ 0.7. Thus, we may reasonably infer that there exists a q0 ≤ 0.1 such that for any q < q0, all the l2/lq minimizations obtain similar recovery results when the noise level is low (σ = 0.02, 0.05), while there exists a q0 ≤ 0.7 such that, as q < q0 decreases, the mixed l2/lq method is unable to improve the recovery results when σ = 0.10.
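The noisy measurement model and the error metric reported in Table 1 can be sketched as follows. The concrete sizes, the single active block, and the 1/√M column scaling are illustrative assumptions chosen only to make the snippet self-contained.

```python
import numpy as np

# Toy instance of the noisy model y = Phi x + z described above
rng = np.random.default_rng(1)
N, M, sigma = 256, 128, 0.02
Phi = rng.standard_normal((M, N)) / np.sqrt(M)      # Gaussian ensemble
x = np.zeros(N)
x[:16] = rng.standard_normal(16)                    # one toy active block
z = sigma * rng.standard_normal(M)                  # zero-mean Gaussian noise
y = Phi @ x + z

# threshold used in the experiments: tau = 10^{-1} max |Phi^T y|
tau = 0.1 * np.max(np.abs(Phi.T @ y))

def relative_error(x_true, x_hat):
    # ||x - x*||_2 / ||x||_2, the error metric reported in Table 1
    return np.linalg.norm(x_true - x_hat) / np.linalg.norm(x_true)
```

A perfect reconstruction gives a relative error of 0, while the all-zero estimate gives exactly 1, which makes the metric easy to read across noise levels.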

Conclusion
In this article, we have investigated the block-sparse recovery performance of the mixed l2/lq minimization approach, especially for the non-convex case 0 < q < 1. Under the assumption that the measurement matrix satisfies the block RIP with δ2k < 1/2, we have proved that the non-convex l2/lq (0 < q < 1) method can exactly and stably recover the original block-sparse signals in the noiseless and noisy cases, respectively. The sufficient recovery condition we obtained is weaker than that of the l2/l1 method (δ2k < 0.414), which implies the better block-sparse recovery ability of the mixed l2/lq (0 < q < 1) method. We have conducted a series of numerical experiments that support the correctness of the theory and the superior performance of the mixed method.

Our study so far is only concerned with block-sparse signal recovery without overlapping blocks. In many real applications, however, such as gene expression data in bioinformatics, the blocks of elements could potentially overlap. Rao et al. [36] derived tight bounds on the number of measurements required for exact and stable recovery of block-sparse signals with overlapping blocks by the mixed l2/l1 minimization method. Their analysis can naturally be extended to the non-convex l2/lq (0 < q < 1) method. All these extensions will be part of our future research.
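The IRLS algorithm for the noiseless, equality-constrained mixed l2/lq problem can be sketched as below. This is a schematic reimplementation under the standard IRLS block-weighting rule, not the paper's exact algorithm; the smoothing schedule for `eps` and the iteration count are illustrative assumptions.

```python
import numpy as np

def irls_block(Phi, y, blocks, q=0.5, iters=60, eps=1.0):
    """Sketch of IRLS for  min sum_i ||x[i]||_2^q  s.t.  Phi x = y."""
    M, N = Phi.shape
    # least-norm initialization: x = Phi^T (Phi Phi^T)^{-1} y
    x = Phi.T @ np.linalg.solve(Phi @ Phi.T, y)
    for _ in range(iters):
        # inverse block weights d_i = (||x[i]||_2^2 + eps^2)^{(2-q)/2},
        # repeated over every entry of block i
        d = np.empty(N)
        for b in blocks:
            d[b] = (x[b] @ x[b] + eps ** 2) ** ((2 - q) / 2)
        # weighted least squares under the equality constraint Phi x = y:
        # x = D Phi^T (Phi D Phi^T)^{-1} y  with  D = diag(d)
        x = d * (Phi.T @ np.linalg.solve((Phi * d) @ Phi.T, y))
        eps = max(0.7 * eps, 1e-4)  # gradually tighten the smoothing
    return x
```

Every iterate satisfies the measurement constraint exactly, while the shrinking weights on small blocks progressively concentrate the energy onto a few blocks, mimicking the l2/lq penalty as eps decreases.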
Although our simulation studies in this article clearly demonstrate that the recovery ability of the mixed l2/lq method improves as q decreases, there is still a lack of theoretical analysis to support this observation. The works of [37,38] addressed the block-sparse recovery ability through an accurate analysis of the breakdown behavior of the mixed l2/l1 method. One could then be interested in extending these results to the mixed l2/lq (0 < q < 1) method as well. It is noted that since the resultant mixed minimization problem is