- Research
- Open Access
A fast and accurate algorithm for ℓ _{1} minimization problems in compressive sampling
- Feishe Chen^{1},
- Lixin Shen^{1, 2},
- Bruce W. Suter^{2} and
- Yuesheng Xu^{1}
https://doi.org/10.1186/s13634-015-0247-5
© Chen et al. 2015
- Received: 9 April 2015
- Accepted: 13 July 2015
- Published: 1 August 2015
Abstract
An accurate and efficient algorithm for solving the constrained ℓ _{1}-norm minimization problem is much needed and is crucial for the success of sparse signal recovery in compressive sampling. We tackle the constrained ℓ _{1}-norm minimization problem by reformulating it via an indicator function that describes the constraints. The resulting model is solved efficiently and accurately by a proximity operator-based algorithm. Numerical experiments show that the proposed algorithm performs well for sparse signals with magnitudes over a high dynamic range. Furthermore, it performs significantly better than the well-known algorithms NESTA (a shorthand for Nesterov’s algorithm) and DADM (dual alternating direction method) in terms of the quality of restored signals and the computational cost measured in CPU time.
Keywords
- Compressive sensing
- ℓ _{1} minimization
- Proximity operator
1 Introduction
A large amount of research has been done on solving problems (BP), (BP_{ ε }), and (QP_{ λ }); here, we give only a brief and non-exhaustive review of results for these problems. In [3], problems (BP) and (QP_{ λ }) are solved by first reformulating them as perturbed linear programs and then applying a primal-dual interior-point approach [4]. Recently, many iterative shrinkage/thresholding algorithms have been proposed to handle problem (QP_{ λ }). These include the proximal forward-backward splitting [5], the gradient projection for sparse reconstruction [6], the fast iterative shrinkage-thresholding algorithm (FISTA) [7], the fixed-point continuation algorithm [8], the Bregman iterative regularization [9, 10], and the references therein. Problem (BP_{ ε }) also frequently appears in wavelet-based signal/image restoration [11, 12], with the matrix A associated with some inverse transforms.
Problem (BP_{ ε }) can be formulated as a second-order cone program and solved by interior-point algorithms. Many algorithms suggested for (BP_{ ε }) are based on repeatedly solving (QP_{ λ }) for various values of λ. Such algorithms are referred to as homotopy methods, originally proposed in [13, 14]. The homotopy method is also successfully applied to (BP) in [15]. A common approach for obtaining approximate solutions to (BP_{ ε }) is to solve (QP_{ λ }) for a decreasing sequence of values of λ [16]. Optimization theory asserts that problems (BP_{ ε }) and (QP_{ λ }) are equivalent provided that the parameters ε and λ satisfy a certain relationship [17]. Since this relationship is hard to compute in general, solving problem (BP_{ ε }) via repeatedly solving (QP_{ λ }) for various values of λ is problematic. Recently, NESTA [18], which employs Nesterov’s optimal gradient method, was proposed for solving relaxed versions of (BP) and (BP_{ ε }) obtained via Nesterov’s smoothing technique [19]. Clearly, how close the solution of the relaxed version of (BP) (or of (BP_{ ε })) is to the solution of (BP) (or (BP_{ ε })) is determined by how close the smoothed ℓ _{1}-norm is to the ℓ _{1}-norm itself. Hence, the performance of these approaches depends on fine tuning of the parameter λ in (QP_{ λ }) or of a parameter that controls the degree of closeness of the ℓ _{1}-norm and its smoothed version.
In this paper, we consider solving problems (BP) and (BP_{ ε }) by a different approach. We convert the constrained optimization problems into a unified unconstrained one via an indicator function. The objective function of the unconstrained optimization problem is the sum of the ℓ _{1}-norm of the underlying signal u and the indicator function of a set in \(\mathbb {R}^{m}\) (the set {0} for (BP) or the ε-ball for (BP_{ ε })) composed with the affine transformation A u−b. The non-differentiability of both the ℓ _{1}-norm and the indicator function imposes challenges for solving the associated optimization problem. Fortunately, their proximity operators have explicit expressions. Solutions of the problem can be viewed as fixed points of a coupled equation formed in terms of these proximity operators, and an iterative algorithm for finding the fixed points is then developed. The main advantage of this approach is that neither solving (QP_{ λ }) nor smoothing the ℓ _{1}-norm is necessary. This makes the proposed algorithm attractive for solving (BP) and (BP_{ ε }). The efficiency of fixed-point proximity algorithms has been demonstrated in [5] and [20–22] for various image processing models.
The rest of the paper is organized as follows: in Section 2, we reformulate the ℓ _{1}-norm minimization problems (BP) and (BP_{ ε }) via an indicator function and characterize solutions of the proposed model in terms of fixed-point equations. We also point out the connection between the proposed model and (QP_{ λ }) through the Moreau envelope. In Section 3, we develop an algorithm for the resulting minimization problem based on the fixed-point equations arising from the characterization of the proposed model. Numerical experiments are presented in Section 4. We draw our conclusions in Section 5.
2 An ℓ _{1}-norm optimization model via an indicator function
In this section, we consider a general optimization model that includes models (BP) and (BP_{ ε }) as its special cases and characterize solutions to the proposed model.
Clearly, the indicator function ι _{ C } is in \(\Gamma _{0}(\mathbb {R}^{d})\) for any closed nonempty convex set C. In particular, we define a ball in \(\mathbb {R}^{m}\) centered at the origin with radius ε as \( B_{\epsilon }:=\{v: v \in \mathbb {R}^{m} \; \text {and} \; \|v\|_{2} \le \epsilon \}. \)
We can easily see that if ε=0, then model (2) reduces to (BP), and if ε>0, then model (2) reduces to (BP_{ ε }). In other words, both constrained optimization problems (BP) and (BP_{ ε }) can be unified as the unconstrained optimization problem (2) via the indicator function \(\iota _{B_{\epsilon }}\).
Now, with the help of the subdifferential and the proximity operator, we can characterize the solutions of the indicator-function-based model (2) via fixed-point equations.
Proposition 2.1.

Let \(u \in \mathbb {R}^{n}\) be a solution of model (2). Then for any α>0 and β>0, there exists \(v \in \mathbb {R}^{m}\) such that \(u=\text {prox}_{\frac {1}{\alpha }\|\cdot \|_{1}}\left (u-\frac {\beta }{\alpha } A^{\top } v\right)\) (4) and \(Au=\text {prox}_{\iota _{B_{\epsilon }}(\cdot -b)}\left (v+Au\right)\) (5).
Conversely, if there exist α>0, β>0, \(u \in \mathbb {R}^{n}\), and \(v \in \mathbb {R}^{m}\) satisfying (4) and (5), then u is a solution of model (2).
Proof.
We first assume that \(u \in \mathbb {R}^{n}\) is a solution of model (2). Set \(\varphi :=\iota _{B_{\epsilon }}(\cdot -b)\). Hence, A u−b must be in the ball B _{ ε }. Therefore, both sets ∂∥·∥_{1}(u) and ∂ φ(A u) are nonempty. By Fermat’s rule, we have that 0∈∂∥·∥_{1}(u)+A ^{⊤} ∂ φ(A u). Therefore, for any α>0 and β>0, there exist \(w \in \frac {1}{\alpha } \partial \|\cdot \|_{1} (u)\) and \(v \in \frac {1}{\beta } \partial \varphi (Au)\) such that 0=α w+β A ^{⊤} v, i.e., \(w=-\frac {\beta }{\alpha }A^{\top } v\). By using (3), inclusion \(w \in \frac {1}{\alpha } \partial \|\cdot \|_{1} (u)\) implies \(u=\text {prox}_{\frac {1}{\alpha }\|\cdot \|_{1}}\left (u+w\right)\), which is (4). Since \(\frac {1}{\beta } \varphi = \varphi \) for any β>0, inclusion \(v \in \frac {1}{\beta } \partial \varphi (Au)\) leads to A u=prox_{ φ }(v+A u), which is equivalent to (5).
Conversely, if (4) and (5) are satisfied for some α>0, β>0, \(u \in \mathbb {R}^{n}\), and \(v \in \mathbb {R}^{m}\), using (3) again, we have that \(-\frac {\beta }{\alpha }A^{\top } v \in \partial \left (\frac {1}{\alpha }\|\cdot \|_{1}\right) (u)\) and v∈∂ φ(A u). Since \(\partial \left (\frac {1}{\alpha }\|\cdot \|_{1}\right) (u)=\frac {1}{\alpha } \partial \|\cdot \|_{1} (u)\) and ∂ φ(A u)=β ∂ φ(A u), we know from the above that 0∈∂∥·∥_{1}(u)+A ^{⊤} ∂ φ(A u). This indicates that u is a solution of model (2). The proof is complete.
We remark that the above fixed-point characterization can be identified as a special case of Proposition 1 in [22]. We include the proof of Proposition 2.1 here to make the paper self-contained.
The proximity operator of \(\frac {1}{\alpha }\|\cdot \|_{1}\) is the component-wise soft-thresholding operator: for \(x \in \mathbb {R}^{n}\),
\(\left (\text {prox}_{\frac {1}{\alpha }\|\cdot \|_{1}}(x)\right)_{i}=\text {sign}(x_{i})\max \left \{|x_{i}|-\tfrac {1}{\alpha }, 0\right \} \qquad (6)\)
for i=1,2,…,n.
The proximity operator \(\text {prox}_{\iota _{B_{\epsilon }}(\cdot -b)}\) is given by the following lemma.
Lemma 2.2.

For \(x \in \mathbb {R}^{m}\), \(\text {prox}_{\iota _{B_{\epsilon }}(\cdot -b)}(x)= \begin {cases} x, & \|x-b\|_{2} \le \epsilon, \\ b+\frac {\epsilon }{\|x-b\|_{2}}(x-b), & \text {otherwise}; \end {cases}\) that is, it shifts x by −b, projects onto the ball B _{ ε }, and shifts back by b.
Proof.
By the definition of the proximity operator, we can verify directly that \(\text {prox}_{\iota _{B_{\epsilon }}(\cdot -b)} = b + \text {prox}_{\iota _{B_{\epsilon }}}(\cdot -b)\) and \(\text {prox}_{\iota _{B_{\epsilon }}}\) is the projection operator onto the ball B _{ ε }. The result of this lemma follows immediately.
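Both proximity operators have closed forms and cost only O(n) or O(m) operations to evaluate. The following is a minimal NumPy sketch of the two operators (the helper names `prox_l1` and `prox_ball` are ours, not from the paper):

```python
import numpy as np

def prox_l1(x, t):
    """Proximity operator of t*||.||_1: component-wise soft-thresholding, cf. (6)."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def prox_ball(x, b, eps):
    """Proximity operator of iota_{B_eps}(. - b), cf. Lemma 2.2:
    shift by b, project onto the eps-ball centered at the origin, shift back."""
    r = x - b
    nr = np.linalg.norm(r)
    if nr <= eps:                  # x - b already lies in the ball
        return x.copy()
    return b + (eps / nr) * r      # radial projection; reduces to b when eps == 0
```

For ε=0 the second operator collapses to the constant map x ↦ b, consistent with the observation in Section 3 that w ^{ k+1}=b for (BP).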
3 An algorithm and its convergence
In this section, we develop an algorithm for finding a solution of model (2) and provide a convergence analysis for the developed algorithm.
To show convergence of the iterative scheme (8), we recall a result from [20].
Lemma 3.1 (Theorem 3.5 in [20]).
With the help of Lemma 3.1, the following result shows that under appropriate conditions on parameters α and β, the sequence \(\{u^{k}: k \in \mathbb {N}_{0}\}\) converges to a solution of model (2).
Theorem 3.2.
then for arbitrary initial vectors \(u^{0} \in \mathbb {R}^{n}\), \(w^{0}, v^{0} \in \mathbb {R}^{m}\), the sequence \(\{u^{k}: k \in \mathbb {N}_{0}\}\) generated by the iterative scheme (8) converges to a solution of model (2).
Proof.
By setting x=0 and λ=1 and identifying \(\varphi = \iota _{B_{\epsilon }}(\cdot -b)\) in model (10), the iterative scheme (8) can be viewed as a special case of the one given in (9). The desired result follows immediately from Lemma 3.1.
The convergence result given by Theorem 3.2 offers a practical way to find a solution of model (2). Since the explicit forms of the proximity operators \(\text {prox}_{\frac {1}{\alpha }\|\cdot \|_{1}}\) and \(\text {prox}_{\iota _{B_{\epsilon }}(\cdot -b)}\) are given by (6) and Lemma 2.2, respectively, Theorem 3.2 yields a unified approach for solving both (BP) and (BP_{ ε }), depicted in Algorithm 1.
where the matrix P=α I−β A ^{⊤} A is positive definite. The condition \(\frac {\beta }{\alpha }<\frac {1}{\|A\|^{2}}\) ensures the positive definiteness of P. The technique of introducing the proximal term (u−u ^{ k })^{⊤} P(u−u ^{ k }) was used earlier in [25]. It can easily be seen that the iterative scheme (13) is equivalent to (8) with λ ^{ k }=β v ^{ k }. It is worth pointing out that if P is the zero matrix, then the iterative scheme (13) reduces to the conventional alternating direction method of multipliers (ADMM) for the constrained optimization problem (12) (see, e.g., [26]); in this case, the u-subproblem in (13) has no explicit solution and must be solved by an appropriate iterative algorithm, for example, FISTA [7].
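The positive-definiteness condition on P is easy to verify numerically. A short sketch (the matrix sizes here are arbitrary, chosen only for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((20, 50))
norm_A = np.linalg.norm(A, 2)            # spectral norm ||A||

beta = 1.0
alpha = 1.01 * beta * norm_A**2          # enforces beta/alpha < 1/||A||^2
P = alpha * np.eye(50) - beta * A.T @ A  # the proximal term alpha*I - beta*A^T A

# all eigenvalues of the symmetric matrix P must be strictly positive
assert np.linalg.eigvalsh(P).min() > 0
```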
under the assumption v ^{−1}=v ^{0}−(A u ^{0}−w ^{0}) for the given u ^{0}, w ^{0}, and v ^{0}. We can further substitute w ^{ k+1} computed in step 2 into step 3; in this way, the intermediate variable w ^{ k } is no longer needed. These simplifications yield Algorithm 2, a variant of Algorithm 1. When ε=0, the vectors w ^{ k+1} in Algorithm 1 are equal to the constant vector b for all k≥0; because of this, we set w ^{0}=b in both Algorithms 1 and 2. Finally, it is more efficient to update u ^{ k+1} with step 1 of Algorithm 2 than with step 1 of Algorithm 1 in each iteration, since the matrix-vector multiplication involving A is not required in (14). However, updating u ^{ k+1} via the formulation of step 2 in Algorithm 1 can be implemented through a component-wise Gauss-Seidel iteration, which may accelerate the convergence of the algorithm and therefore reduce the total CPU time consumed. The efficiency of the component-wise Gauss-Seidel iteration has been verified in [20, 21].
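Since the displayed scheme (8) does not survive in this extract, the following is only an illustrative sketch of an iteration of this type: a standard linearized ADMM for min ‖u‖₁ + ι_{B_ε}(w−b) subject to Au=w, built from the same two closed-form proximity operators and the step-size condition β/α < 1/‖A‖². It is not claimed to be the paper's exact scheme, and all names are ours:

```python
import numpy as np

def prox_l1(x, t):
    # soft-thresholding: proximity operator of t * ||.||_1, cf. (6)
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def prox_ball(x, b, eps):
    # proximity operator of iota_{B_eps}(. - b), cf. Lemma 2.2
    r = x - b
    nr = np.linalg.norm(r)
    return x.copy() if nr <= eps else b + (eps / nr) * r

def bp_solve(A, b, eps=0.0, beta=1.0, alpha=None, iters=20000):
    """Illustrative linearized-ADMM-type iteration for (BP) / (BP_eps).

    NOT the paper's exact scheme (8): a standard variant using the same two
    proximity operators, with the proximal term P = alpha*I - beta*A^T A kept
    positive definite via beta/alpha < 1/||A||^2.
    """
    m, n = A.shape
    if alpha is None:
        alpha = 1.05 * beta * np.linalg.norm(A, 2) ** 2
    u, w, v = np.zeros(n), b.copy(), np.zeros(m)
    for _ in range(iters):
        # u-step: linearized proximal update, evaluated by soft-thresholding
        u = prox_l1(u - (beta / alpha) * (A.T @ (A @ u - w + v)), 1.0 / alpha)
        # w-step: projection onto the eps-ball around b (constant b when eps = 0)
        w = prox_ball(A @ u + v, b, eps)
        # multiplier update driven by the residual A u - w
        v = v + A @ u - w
    return u
```

On a small (BP) instance with orthonormal rows (A A^⊤=I), the iterates become feasible and reach an ℓ₁-norm no larger than that of the true sparse signal, which is itself feasible.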
In terms of proximity operators, the updates u ^{ k+1} and v ^{ k+1} in (16) are identical to the update u ^{ k+1} in step 1 and the update v ^{ k+1} in step 2 of Algorithm 2, respectively.
4 Numerical simulations
This section is devoted to the numerical performance of the proposed algorithms for compressive sampling. We use NESTA [18] and the dual alternating direction method (DADM) [23] for comparison. In the comparisons, NESTA with continuation from the available code NESTA_v1.1 is applied, and DADM for model (BP_{ ε }) is chosen. We focus on sparse signals with various dynamic ranges and various measurement matrices, including random partial discrete cosine transforms (DCTs), random partial discrete Walsh-Hadamard transforms (DWHTs), and random Gaussian matrices, and evaluate the performance of the algorithms in terms of various error metrics, speed, and robustness to noise. All experiments are performed in MATLAB 7.11 on a DELL XPS 14 with an Intel Core i5 and 4 GB RAM, running the Windows 8 operating system.
where η _{1}=±1 with probability 1/2 and η _{2} is uniformly distributed in [0,1]. The locations of the nonzero components are randomly permuted. Clearly, the magnitudes of the nonzero components of an s-sparse signal lie in the range [1,10^{ θ }], with the parameter θ controlling this dynamic range. An observed signal (data) is collected by b=A u+z, where z represents Gaussian noise.
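This signal model is easy to reproduce; the sketch below follows the stated distributions (the helper name `sparse_signal` is ours):

```python
import numpy as np

def sparse_signal(n, s, theta, rng):
    """s-sparse test signal: nonzeros are eta1 * 10**(theta * eta2) with
    eta1 = +/-1 (probability 1/2 each) and eta2 ~ Uniform[0, 1], placed at
    randomly permuted locations."""
    u = np.zeros(n)
    support = rng.permutation(n)[:s]          # randomly chosen distinct locations
    eta1 = rng.choice([-1.0, 1.0], size=s)
    eta2 = rng.uniform(0.0, 1.0, size=s)
    u[support] = eta1 * 10.0 ** (theta * eta2)
    return u
```

The relative ℓ _{2} error used below is then simply `np.linalg.norm(u - u_rec) / np.linalg.norm(u)`.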
where u is the true data and u _{◇} is the restored data. All results reported in this section are the means of these relative errors and of the CPU time consumed over 50 trials.
in our numerical experiments. In this way, α is essentially the only parameter that needs to be determined. We now investigate the impact of the parameter α on the performance of Algorithm 2.
Three new parameters are introduced in Algorithm 3: the integers p>0, τ>1, and T>0. The parameter T is the maximum allowable number of updates of the parameters α and β. At each update, the pair (α,β) changes to (τ α,τ β), which keeps the ratio β/α unchanged. The parameter p indicates that the algorithm iterates p times with a pair (α,β) before running another p times with the pair (τ α,τ β). We now demonstrate the efficiency of varying the parameters α and β by applying Algorithm 3 to the same data used in Fig. 2 a. We set T=6, τ=4, and p=20 and initialize \(\alpha =\frac {m}{n}\frac {20 \|A\|^{2}}{\|A^{\top } b\|_{\infty }}\). Again, we choose β by using (19). The corresponding result is shown in Fig. 2 b. It is clear that it takes about 200 iterations to drop the relative ℓ _{1} error below 10^{−14}. Hence, the strategy of updating the parameters α and β described in Algorithm 3 is effective.
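The continuation strategy only rescales the pair (α, β), so the convergence condition β/α < 1/‖A‖² is never violated. A sketch of the schedule, with T and τ as in the text (the function name is ours; the inner solver that runs p iterations per stage is omitted):

```python
def continuation_schedule(alpha0, beta0, tau, T):
    """Parameter schedule of the continuation strategy: after every p inner
    iterations, (alpha, beta) is replaced by (tau*alpha, tau*beta), at most
    T times, so the ratio beta/alpha is preserved throughout."""
    pairs = [(alpha0, beta0)]
    for _ in range(T):
        a, b = pairs[-1]
        pairs.append((tau * a, tau * b))
    return pairs
```

Each stage would run p iterations of the underlying scheme with the current pair before moving to the next one.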
The rest of this section consists of two subsections. The first focuses on comparisons of the proposed algorithm with NESTA and DADM for sensing matrices A with A A ^{⊤}=I, while the second focuses on the numerical performance of the proposed algorithms for random Gaussian sensing matrices.
4.1 Numerical comparisons
This subsection consists of three parts. Part one contains the comparisons of Algorithm 3, DADM, and NESTA for data generated with partial DCT measurement matrices, part two contains those for data generated with partial DWHT measurement matrices, and part three contains results on random matrices with orthonormalized rows.
4.1.1 Numerical comparison with partial DCT sensing matrices
First of all, we compare the performance of Algorithm 3 with that of NESTA and DADM [23] for noise-free data. The algorithm NESTA was developed by applying a smoothing technique to the nonsmooth ℓ _{1}-norm together with an accelerated first-order scheme, both from Nesterov’s work [19]. A parameter denoted by μ controls how close the smoothed ℓ _{1}-norm is to the ℓ _{1}-norm. To obtain a highly accurate restored signal with NESTA, μ=10^{−10} is used for partial DCT sensing matrices and all dynamic range parameters. The tolerance parameter Tol of NESTA varies with the smoothing parameter μ and the setting of the generated data and needs to be determined; we choose it to obtain reasonable results, finally setting Tol=10^{−12}, 10^{−14}, and 10^{−15} for data generated with dynamic range parameters θ=1, 3, and 5, respectively. For DADM, two parameters γ and β have to be predetermined: γ=1.618 is chosen in all settings, while β varies across settings to obtain reasonable results; we choose \(\beta =\frac {\|b\|_{1}}{2m},~\frac {\|b\|_{1}}{2^{3}m},~\frac {\|b\|_{1}}{2^{6}m}\) for dynamic range parameters θ=1, 3, 5, respectively. For Algorithm 3, we set p=20 and T to be the smallest integer greater than \(\log _{10}(\frac {n}{m}\|A^{\top } b\|_{\infty })\); in our experiments, we notice that T is θ or θ+1. The stopping criterion of Algorithm 3 and DADM is that the relative error between successive iterates of the reconstructed signal satisfies ∥u ^{ k+1}−u ^{ k }∥/∥u ^{ k }∥≤10^{−15} for data generated by partial DCT.
Numerical results with partial DCT sensing matrices for noise-free data. The number of measurements m is m=n/4, and the test signals are s-sparse with s=0.02n. Each value in a cell represents the mean over 50 trials
Method | ℓ _{2}-error | ℓ _{1}-error | ℓ _{ ∞ }-error | CPU time(s) | Iterations |
---|---|---|---|---|---|
θ=1, n=2^{13} | |||||
Algorithm 3 | 4.99e −15 | 6.77e −16 | 6.55e −14 | 1.0153 | 387 |
DADM | 4.48e −15 | 6.45e −16 | 5.24e −14 | 1.1525 | 391 |
NESTA | 8.29e −11 | 1.75e −10 | 6.10e −10 | 1.8168 | 469 |
θ=1, n=2^{15} | |||||
Algorithm 3 | 3.04e −15 | 3.81e −16 | 4.71e −14 | 5.1618 | 394 |
DADM | 2.10e −15 | 2.96e −16 | 2.96e −14 | 5.6775 | 398 |
NESTA | 8.48e −11 | 1.77e −10 | 6.80e −10 | 7.7550 | 477 |
θ=3, n=2^{13} | |||||
Algorithm 3 | 6.20e −15 | 1.07e −15 | 4.65e −12 | 1.0640 | 394 |
DADM | 3.96e −15 | 1.20e −15 | 2.72e −12 | 1.1331 | 388 |
NESTA | 1.41e −12 | 4.70e −12 | 6.34e −10 | 2.8384 | 742 |
θ=3, n=2^{15} | |||||
Algorithm 3 | 3.18e −15 | 5.39e −16 | 2.81e −12 | 5.3503 | 403 |
DADM | 2.35e −15 | 3.96e −16 | 1.84e −12 | 5.7291 | 395 |
NESTA | 1.48e −12 | 4.78e −12 | 6.95e −10 | 12.1293 | 748 |
θ=5, n=2^{13} | |||||
Algorithm 3 | 4.69e −15 | 9.25e −16 | 2.94e −10 | 1.0637 | 397 |
DADM | 3.01e −15 | 1.46e −15 | 1.74e −10 | 1.9593 | 665 |
NESTA | 2.05e −14 | 1.96e −14 | 8.12e −10 | 4.7221 | 1236 |
θ=5, n=2^{15} | |||||
Algorithm 3 | 3.04e −15 | 4.74e −16 | 2.08e −10 | 5.4653 | 404 |
DADM | 2.05e −15 | 4.85e −16 | 1.28e −10 | 10.2053 | 691 |
NESTA | 2.09e −14 | 3.04e −14 | 7.93e −10 | 19.5025 | 1209 |
Numerical results with partial DCT sensing matrices for noise-free data. The number of measurements m is m=n/8, and the test signals are s-sparse with s=0.01n. Each value in a cell represents the mean over 50 trials
Method | ℓ _{2}-error | ℓ _{1}-error | ℓ _{ ∞ }-error | CPU time(s) | Iterations |
---|---|---|---|---|---|
θ=1, n=2^{13} | |||||
Algorithm 3 | 1.24e −14 | 1.65e −15 | 1.47e −13 | 2.0684 | 776 |
DADM | 1.11e −14 | 4.34e −15 | 1.33e −13 | 2.3184 | 803 |
NESTA | 1.96e −10 | 3.89e −10 | 1.61e −09 | 2.8518 | 764 |
θ=1, n=2^{15} | |||||
Algorithm 3 | 5.67e −15 | 5.81e −16 | 8.02e −14 | 10.2525 | 799 |
DADM | 4.58e −15 | 6.12e −16 | 6.15e −14 | 12.0084 | 832 |
NESTA | 2.00e −10 | 3.89e −10 | 1.79e −09 | 11.9268 | 761 |
θ=3, n=2^{13} | |||||
Algorithm 3 | 9.65e −15 | 1.99e −15 | 6.91e −12 | 1.9584 | 758 |
DADM | 1.04e −14 | 6.61e −15 | 7.02e −12 | 2.2513 | 791 |
NESTA | 3.14e −12 | 9.96e −12 | 1.49e −09 | 4.5468 | 1216 |
θ=3, n=2^{15} | |||||
Algorithm 3 | 5.09e −15 | 6.58e −16 | 4.29e −12 | 9.7934 | 762 |
DADM | 5.21e −15 | 6.98e −16 | 4.21e −12 | 11.5441 | 817 |
NESTA | 3.18e −12 | 9.97e −12 | 1.64e −09 | 18.3234 | 1187 |
θ=5, n=2^{13} | |||||
Algorithm 3 | 1.36e −14 | 2.22e −15 | 5.62e −10 | 1.8825 | 727 |
DADM | 7.87e −15 | 7.69e −15 | 4.09e −10 | 3.2409 | 1129 |
NESTA | 4.68e −14 | 1.04e −13 | 2.09e −09 | 7.3318 | 1950 |
θ=5, n=2^{15} | |||||
Algorithm 3 | 5.26e −15 | 7.99e −16 | 3.39e −10 | 9.4868 | 739 |
DADM | 4.32e −15 | 7.42e −16 | 2.59e −10 | 17.2200 | 1202 |
NESTA | 5.19e −14 | 3.54e −14 | 2.35e −09 | 24.6468 | 1793 |
Numerical results with partial DCT sensing matrices for noisy data. The number of measurements m is m=n/4, and the test signals are s-sparse with s=0.02n. Each value in a cell represents the mean over 50 trials
Method | ℓ _{2}-error | ℓ _{1}-error | ℓ _{ ∞ }-error | CPU time(s) | Iterations |
---|---|---|---|---|---|
θ=1, n=2^{13} | |||||
Algorithm 3 | 6.06e −2 | 6.28e −3 | 5.49e −1 | 0.2309 | 82 |
DADM | 6.06e −2 | 6.23e −3 | 5.49e −1 | 0.2268 | 75 |
NESTA | 7.25e −2 | 2.47e −2 | 6.68e −1 | 0.4006 | 123 |
θ=1, n=2^{15} | |||||
Algorithm 3 | 6.10e −2 | 6.28e −3 | 6.15e −1 | 1.0700 | 80 |
DADM | 6.10e −2 | 6.23e −3 | 6.15e −1 | 1.0925 | 76 |
NESTA | 7.23e −2 | 2.29e −2 | 7.14e −1 | 1.7906 | 123 |
θ=3, n=2^{13} | |||||
Algorithm 3 | 1.90e −2 | 1.76e −3 | 10.0453 | 0.2684 | 99 |
DADM | 1.89e −2 | 1.72e −3 | 10.0370 | 0.2181 | 71 |
NESTA | 2.05e −2 | 1.61e −2 | 12.0646 | 0.4353 | 132 |
θ=3, n=2^{15} | |||||
Algorithm 3 | 1.88e −2 | 1.60e −3 | 11.3331 | 1.5018 | 111 |
DADM | 1.88e −2 | 1.54e −3 | 11.3232 | 1.0662 | 71 |
NESTA | 2.09e −2 | 1.54e −2 | 13.0586 | 1.8931 | 132 |
θ=5, n=2^{13} | |||||
Algorithm 3 | 1.13e −3 | 1.03e −4 | 49.7915 | 0.2740 | 101 |
DADM | 1.13e −3 | 5.85e −4 | 50.3671 | 0.5953 | 199 |
NESTA | 1.28e −3 | 1.13e −3 | 61.2107 | 0.4243 | 125 |
θ=5, n=2^{15} | |||||
Algorithm 3 | 1.18e −3 | 5.73e −5 | 56.1854 | 1.3543 | 102 |
DADM | 1.18e −3 | 5.49e −4 | 56.8402 | 2.9696 | 200 |
NESTA | 1.34e −3 | 1.11e −3 | 66.0787 | 1.7721 | 126 |
Numerical results with partial DCT sensing matrices for noisy data. The number of measurements m is m=n/8, and the test signals are s-sparse with s=0.01n. Each value in a cell represents the mean over 50 trials
Method | ℓ _{2}-error | ℓ _{1}-error | ℓ _{ ∞ }-error | CPU time(s) | Iterations | |
---|---|---|---|---|---|---|
θ=1, n=2^{13} | ||||||
Algorithm 3 | 1.02e −1 | 1.94e −2 | 8.48e −1 | 0.3296 | 122 | |
DADM | 1.02e −1 | 1.94e −2 | 8.48e −1 | 0.2790 | 94 | |
NESTA | 1.20e −1 | 3.07e −2 | 1.0099 | 0.4606 | 145 | |
θ=1, n=2^{15} | ||||||
Algorithm 3 | 1.02e −1 | 1.83e −2 | 9.37e −1 | 1.5691 | 121 | |
DADM | 1.02e −1 | 1.82e −2 | 9.37e −1 | 1.4009 | 95 | |
NESTA | 1.22e −1 | 2.74e −2 | 1.1065 | 2.0506 | 149 | |
θ=3, n=2^{13} | ||||||
Algorithm 3 | 2.97e −2 | 5.30e −3 | 15.1517 | 0.2853 | 102 | |
DADM | 2.97e −2 | 5.21e −3 | 15.1429 | 0.3028 | 99 | |
NESTA | 3.10e −2 | 2.08e −2 | 17.2106 | 0.5012 | 160 | |
θ=3, n=2^{15} | ||||||
Algorithm 3 | 2.92e −2 | 5.89e −3 | 16.8347 | 1.5609 | 120 | |
DADM | 2.92e −2 | 5.79e −3 | 16.8203 | 1.4675 | 99 | |
NESTA | 3.16e −2 | 1.93e −2 | 19.4426 | 2.2300 | 160 | |
θ=5, n=2^{13} | ||||||
Algorithm 3 | 1.94e −3 | 2.87e −4 | 75.5390 | 0.3231 | 115 | |
DADM | 1.92e −3 | 3.49e −4 | 75.4992 | 0.6975 | 230 | |
NESTA | 1.93e −3 | 1.50e −3 | 90.2023 | 0.4981 | 157 | |
θ=5, n=2^{15} | ||||||
Algorithm 3 | 1.89e −3 | 2.00e −4 | 86.3350 | 1.5025 | 114 | |
DADM | 1.88e −3 | 2.06e −4 | 86.2110 | 3.3468 | 233 | |
NESTA | 2.03e −3 | 1.41e −3 | 99.4225 | 2.2662 | 158 |
4.1.2 Numerical comparison with partial DWHT sensing matrices
The performance of the three algorithms is discussed in this part, presented in a different manner from the previous part with partial DCT sensing matrices. In all three algorithms, the computational cost is mainly attributed to the matrix-vector multiplications involving A or A ^{⊤}. Under the assumption that A A ^{⊤}=I, each algorithm requires only two such multiplications per iteration, one involving A and the other involving A ^{⊤}. Hence, we use only the number of iterations to represent the computational cost, and only the relative ℓ _{2} error to represent the accuracy. The setting of the parameters of the three algorithms remains almost the same, except that μ=10^{−8} and Tol=10^{−13} are used in NESTA for noise-free data with dynamic range parameter θ=5.
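For such structured sensing matrices, the two matrix-vector products per iteration never need an explicit matrix: A u selects m entries of the orthonormal transform of u, and A ^{⊤} y scatters y back and applies the inverse transform. A sketch for the partial DCT case with SciPy (the index set and the function names are ours); since the kept rows are orthonormal, A A ^{⊤}=I holds exactly:

```python
import numpy as np
from scipy.fft import dct, idct

rng = np.random.default_rng(0)
n, m = 1024, 256
idx = np.sort(rng.permutation(n)[:m])   # rows of the orthonormal DCT kept by A

def A_mul(u):
    # A u: orthonormal DCT of u followed by row selection
    return dct(u, norm='ortho')[idx]

def At_mul(y):
    # A^T y: scatter y into a length-n vector, then apply the inverse transform
    z = np.zeros(n)
    z[idx] = y
    return idct(z, norm='ortho')
```

Each of the two products costs O(n log n) via the fast transform, instead of the O(mn) cost of an explicit matrix.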
4.1.3 Numerical comparison with orthonormal Gaussian sensing matrices
4.2 Simulation with Gaussian sensing matrices
Numerical results with Gaussian measurement matrices for noise-free data. The test signals have size n=4096. Each value in a cell represents the mean over 50 trials
m | s | θ | ℓ _{2}-error | ℓ _{1}-error | ℓ _{ ∞ }-error | CPU time(s) | Iterations |
---|---|---|---|---|---|---|---|
n/4 | 0.02n | 1 | 1.64e −13 | 2.31e −14 | 2.07e −12 | 6.0768 | 844 |
n/8 | 0.01n | 1 | 3.28e −13 | 4.93e −14 | 3.70e −12 | 5.1040 | 1305 |
n/4 | 0.02n | 3 | 1.32e −13 | 3.13e −14 | 9.93e −11 | 5.7287 | 799 |
n/8 | 0.01n | 3 | 3.93e −13 | 1.06e −13 | 2.61e −10 | 4.9531 | 1255 |
n/4 | 0.02n | 5 | 1.08e −13 | 2.55e −14 | 5.76e −09 | 5.3643 | 752 |
n/8 | 0.01n | 5 | 4.01e −13 | 1.25e −13 | 2.10e −08 | 4.5700 | 1157 |
Numerical results with Gaussian measurement matrices for noisy data. The test signals have size n=4096. Each value in a cell represents the mean over 50 trials
m | s | θ | ℓ _{2}-error | ℓ _{1}-error | ℓ _{ ∞ }-error | CPU time(s) | Iterations |
---|---|---|---|---|---|---|---|
n/4 | 0.02n | 1 | 1.01e −3 | 4.08e −4 | 8.99e −3 | 2.3762 | 315 |
n/8 | 0.01n | 1 | 1.65e −3 | 5.13e −4 | 1.29e −2 | 1.8412 | 455 |
n/4 | 0.02n | 3 | 3.57e −4 | 1.68e −4 | 0.1729 | 1.2184 | 160 |
n/8 | 0.01n | 3 | 6.10e −4 | 2.46e −4 | 0.2602 | 0.8990 | 211 |
n/4 | 0.02n | 5 | 4.51e −05 | 5.52e −05 | 2.1526 | 1.0484 | 139 |
n/8 | 0.01n | 5 | 6.22e −05 | 4.56e −05 | 3.1652 | 0.7387 | 165 |
5 Conclusions
We reformulated the ℓ _{1}-norm minimization problems (BP) and (BP_{ ε }) via indicator functions as unconstrained minimization problems. The objective function of each unconstrained problem is the sum of the ℓ _{1}-norm of the underlying signal u and the indicator function of a set in \(\mathbb {R}^{m}\) (the set {0} for (BP) or the ε-ball for (BP_{ ε })) composed with the affine transformation A u−b. Due to the structure of this objective function and the availability of explicit forms of the proximity operators for both the ℓ _{1}-norm and the indicator function, an accurate and efficient algorithm was developed for recovering sparse signals based on fixed-point equations. The algorithm outperforms NESTA in terms of the relative ℓ _{2}, relative ℓ _{1}, and absolute ℓ _{ ∞ } error measures as well as the computational cost for tested signals ranging from a low to a high dynamic range with different sizes. For signals with a high dynamic range, the proposed algorithm also outperforms DADM in terms of computational cost while yielding comparable accuracy. Furthermore, the proposed algorithms also efficiently and accurately solve general problems that do not satisfy the condition A A ^{⊤}=I.
Declarations
Acknowledgements
Lixin Shen was partially supported by the US National Science Foundation under grant DMS-1115523 and by an award from National Research Council via the Air Force Office of Scientific Research. Yuesheng Xu was partially supported by the US National Science Foundation under grant DMS-1115523.
References
- E Candes, J Romberg, T Tao, Stable signal recovery from incomplete and inaccurate measurements. Commun. Pur. Appl. Math. 59(8), 1207–1223 (2006).
- E Candes, T Tao, Near optimal signal recovery from random projections: universal encoding strategies? IEEE Trans. Inf. Theory. 52(12), 5406–5425 (2006).
- SS Chen, DL Donoho, MA Saunders, Atomic decomposition by basis pursuit. SIAM J. Sci. Comput. 20, 33–61 (1998).
- SJ Wright, Primal-Dual Interior-Point Methods (Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 1997).
- P Combettes, V Wajs, Signal recovery by proximal forward-backward splitting. Multiscale Model. Simul.: A SIAM Interdiscip. J. 4, 1168–1200 (2005).
- MAT Figueiredo, SJ Wright, RD Nowak, Gradient projection for sparse reconstruction: applications to compressed sensing and other inverse problems. IEEE J. Sel. Topics Signal Process. 1, 586–597 (2007).
- A Beck, M Teboulle, A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imaging Sci. 2, 183–202 (2009).
- ET Hale, W Yin, Y Zhang, Fixed-point continuation for ℓ _{1} minimization: methodology and convergence. SIAM J. Optim. 19, 1107–1130 (2008).
- J-F Cai, S Osher, Z Shen, Split Bregman methods and frame based image restoration. Multiscale Model. Simul.: A SIAM Interdiscip. J. 8(2), 337–369 (2009).
- W Yin, S Osher, D Goldfarb, J Darbon, Bregman iterative algorithms for ℓ _{1} minimization with applications to compressed sensing. SIAM J. Imaging Sci. 1, 143–168 (2008).
- A Chambolle, RA DeVore, N-Y Lee, BJ Lucier, Nonlinear wavelet image processing: variational problems, compression, and noise removal through wavelet shrinkage. IEEE Trans. Image Process. 7, 319–335 (1998).
- R Chan, T Chan, L Shen, Z Shen, Wavelet algorithms for high-resolution image reconstruction. SIAM J. Sci. Comput. 24(4), 1408–1432 (2003).
- B Efron, T Hastie, I Johnstone, R Tibshirani, Least angle regression. Ann. Stat. 32, 407–451 (2004).
- MR Osborne, B Presnell, BA Turlach, A new approach to variable selection in least squares problems. IMA J. Numer. Anal. 20, 389–403 (2000).
- D Donoho, Y Tsaig, Fast solution of ℓ _{1}-norm minimization problems when the solution may be sparse. IEEE Trans. Inf. Theory. 54(11), 4789–4812 (2008).
- E van den Berg, MP Friedlander, Probing the Pareto frontier for basis pursuit solutions. SIAM J. Sci. Comput. 31, 890–912 (2008).
- RT Rockafellar, Convex Analysis (Princeton University Press, Princeton, NJ, 1970).
- S Becker, J Bobin, E Candes, NESTA: a fast and accurate first-order method for sparse recovery. SIAM J. Imaging Sci. 4(1), 1–39 (2011).
- Y Nesterov, Smooth minimization of non-smooth functions. Math. Program., Ser. A. 103, 127–152 (2005).
- Q Li, CA Micchelli, L Shen, Y Xu, A proximity algorithm accelerated by Gauss-Seidel iterations for L1/TV denoising models. Inverse Probl. 28, 095003 (2012).
- CA Micchelli, L Shen, Y Xu, Proximity algorithms for image models: denoising. Inverse Probl. 27, 045009 (2011).
- CA Micchelli, L Shen, Y Xu, X Zeng, Proximity algorithms for the L1/TV image denoising model. Adv. Comput. Math. 38, 401–426 (2013).
- J Yang, Y Zhang, Alternating direction algorithms for ℓ _{1}-problems in compressive sensing. SIAM J. Sci. Comput. 33, 250–278 (2011).
- W Deng, W Yin, On the global and linear convergence of the generalized alternating direction method of multipliers (ADMM). Technical report, UCLA, Center for Applied Mathematics (2012).
- X Zhang, M Burger, S Osher, A unified primal-dual algorithm framework based on Bregman iteration. J. Sci. Comput. 46, 20–46 (2011).
- S Boyd, N Parikh, E Chu, B Peleato, J Eckstein, Distributed optimization and statistical learning via the alternating direction method of multipliers. Found. Trends Mach. Learn. 3, 1–122 (2010).
- A Chambolle, T Pock, A first-order primal-dual algorithm for convex problems with applications to imaging. J. Math. Imaging Vis. 40, 120–145 (2011).
Copyright
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.