Compressive sensing framework emerged from the efforts to decrease the amount of data sensed in the modern applications [5], and consequently to decrease the needs for resources such as storage capacity, number of sensors, and consumption. The set of measured data in traditional approach is usually referred to as full dataset, while the data in CS scenarios are called incomplete dataset. The data samples in the traditional sampling approach are equidistant and uniformly sampled, while in the CS, the data should be sampled randomly to achieve a high incoherence of the dataset. If the measurement process is modeled by a certain measurement matrix Φ, then the measured dataset y of length M can be described as [2]:
$$ \mathbf{y}=\boldsymbol{\Phi} x $$
(1)
where x represents the full dataset of length N, and the measurement matrix is of size M × N, with M < < N. In practical scenarios, the CS dataset y and the measurement matrix Φ are known, and we desire to reconstruct the full dataset x of length N. However, the system of equations in (1) is underdetermined since M < < N. Thus, in order to tackle the problem, one resorts to the assumption that the signal x is sparse when represented in certain transform basis:
$$ \mathrm{X}=\mathbf{W}\mathrm{x} $$
(2)
where X is a transform domain representation of full signal dataset and W is the transform matrix, i.e.,
$$ \mathbf{W}=\left[\begin{array}{cccc}{W}_1(1)& {W}_2(1)& ...& {W}_N(1)\\ {}{W}_1(2)& {W}_2(2)& ...& {W}_N(2)\\ {}...& ...& ...& ...\\ {}{W}_1(N)& {W}_2(N)& ...& {W}_N(N)\end{array}\right] $$
(3)
where Wi are the basis functions. Depending on the signal, the basic functions could belong to discrete Fourier transform (DFT), discrete Hermite transform (DHT), discrete wavelet transform (DWT), discrete cosine transform (DCT), etc. For the sake of simplicity, we can write W as the coefficients’ matrix:
$$ \mathbf{W}=\left[\begin{array}{cccc}{w}_{11}& {w}_{21}& ...& {w}_{N1}\\ {}{w}_{12}& {w}_{22}& ...& {w}_{N2}\\ {}...& ...& ...& ...\\ {}{w}_{1N}& {w}_{2N}& ...& {w}_{NN}\end{array}\right] $$
(4)
Sparsity means that the vector X has only K out of N nonzero (or significant) elements, while K < < N and K < M. The system of linear equations becomes:
$$ y=\boldsymbol{\Phi} {W}^{-1}X $$
(5)
The reconstructed signal X can be obtained as a sparsest solution of the following optimization problem [7]:
$$ \operatorname{minimize}{\left\Vert X\right\Vert}_1\kern0.72em \mathrm{subject}\kern0.34em \mathrm{to}\kern0.48em y=\boldsymbol{\Phi} {W}^{-1}X $$
(6)
where ||X||1 denotes l1-norm. Generally, the l0-norm is used to measure the signal sparsity, but in practical implementations, using the l0-norm can lead to an NP problem; hence, the l1-norm is used instead.
There are few types of algorithms that can be applied for solving this problem and the most commonly used are the convex optimization algorithms and the greedy algorithms [9, 10]. In general, the greedy solutions are simpler, but less reliable. Convex optimization algorithms are more reliable and accurate in general and guaranteed convergence, but their solutions comes at the cost of computational complexity and large number of nested iterations [4, 9]. In this paper, we consider the gradient-based convex optimization algorithm as a representative of the convex optimization group allowing simpler implementation compared with other common solutions.
2.1
Gradient-based signal reconstruction algorithm
Assume that the missing samples positions are defined by the set Nm = {n1, n2, ...nN − M}. Now the measurement vector is extended to the length of full dataset such that we embed zeros on the positions of missing samples, i.e., y(n) = 0, for n ∈ Nm. The gradient-based approach starts from some initial values of unavailable samples (initial state) which are changed through the iterations in a way to constantly improve the concentration in sparsity domain. In general, it does not require the signal to be strictly sparse in certain transform domain, which is an advantage over other methods. In particular, the missing samples in the signal domain can be considered as zero values. In each iteration, the missing samples are changed for +∆ and for −∆. Then, for both changes, the concentration is measured as l1-norm of the transform domain vectors X+ (for +∆ change) and X− (for −∆ change), while the gradient is determined using their difference. Finally, the gradient is used to update the values of missing samples. Here, it is important to note that each sample is observed separately and a single iteration is terminated when all samples are processed.
In the sequel, we provide a modified version of the algorithm adapted for an efficient hardware implementation avoiding complex functions and calculations.
The algorithm steps are combined within the blocks according to the parallelization principle. The steps within the same block can be implemented in parallel. The steps from the succeeding block are performed after the current block is processed. When the algorithm comes close to the minimum of the sparsity measure, the gradient changes direction for almost 180 degrees, meaning that the step ∆ needs to be decreased (line 18). The precision of the result in this iterative algorithm is estimated based on the change of the result in the last iteration.