DCT domain

AbstractThe discrete cosine transform (DCT) offers superior energy compaction properties for a large class of functions and has been employed as a standard tool in many signal and image processing applications. However, it suffers from spurious behavior in the vicinity of edge discontinuities in piecewise smooth signals. To leverage the sparse representation provided by the DCT, in this article, we derive a framework for the inverse polynomial reconstruction in the DCT expansion. It yields the expansion of a piecewise smooth signal in terms of polynomial coefficients, obtained from the DCT representation of the same signal. Taking advantage of this framework, we show that it is feasible to recover piecewise smooth signals from a relatively small number of DCT coefficients with high accuracy. Furthermore, automatic methods based on minimum description length principle and cross-validation are devised to select the polynomial orders, as a requirement of the inverse polynomial reconstruction method in practical applications. The developed framework can considerably enhance the performance of the DCT in sparse representation of piecewise smooth signals. Numerical results show that denoising and image approximation algorithms based on the proposed framework indicate significant improvements over wavelet counterparts for this class of signals.

The problem of mitigating the Gibbs phenomenon has been thoroughly studied for the case of Fourier representation [10,11]. To overcome the Gibbs phenomenon in Fourier series expansion, in [12], a polynomial reconstruction method is proposed that successfully eliminates the oscillations with high accuracy. This is feasible through re-expanding the signal in a suitably chosen polynomial basis. In other words, the Fourier representation of the signal, which includes the Gibbs oscillations, is re-projected onto a polynomial basis, which is Gegenbauer polynomials in the case of Fourier series. Thus, this procedure is referred to as reprojection.
Roughly speaking, the re-projection procedure is a basis set transformation. The idea of a re-projection method is to determine the transformation between the two representations. In the direct method (originally termed as Gegenbauer reconstruction method in the literature) this transformation is realized as consecutive projections of the signal first onto the Fourier series basis and then onto the space of Gegenbauer polynomials [12,9]. Alternatively, the inverse method, also known as inverse polynomial reconstruction method (IPRM), reverses the order of projections, i.e. projects the signal first onto the polynomial basis and then onto the Fourier space [11,13,14]. It turns out that these two methods are not equivalent and the reconstruction procedure using the inverse method is not only more accurate, but also independent of the polynomial basis [11]. The latter is an important advantage of the inverse method, since there is no optimal way to select the parameter of the Gegenbauer polynomials in the direct method, automatically.
The formulation of IPRM for the case of Fourier series expansion has been studied in [11,13]. In [14] the extension of this framework to partial Fourier series expansion has been considered. Here we follow a similar formulation as the one given by [13] and take advantage of the results of [14] to derive the IPRM for the partial DCT expansion, since DCT has been adopted as a standard tool for signal and image compression and denoising, due to its excellent decorrelation properties. We specifically concentrate on the class of piecewise smooth signals where the exact locations of the discontinuity points are known a priori or can be estimated with suitable methods. For this class, we derive a framework for the IPRM in the DCT expansion.
Limiting the order of polynomials and assuming the discontinuity locations are given, it leads to solving a linear approximation problem. Applying this framework to both piecewise polynomial and piecewise smooth signals, we show that it is feasible to reconstruct the signals from a limited number of DCT coefficients, with high accuracy. This result can significantly improve the performance of the DCT in sparse representation of piecewise smooth signals.
In addition, we consider the problem of selecting the polynomial orders in IPRM, especially when the signal is given in the presence of noise. To this end, we use two model selection methods, namely minimum description length (MDL) [15] and cross-validation (CV) [16,17]. Finally, having been equipped with the DCT-based IPRM and polynomial order selection methods in IPRM, we will show that the use of this framework leads to efficient algorithms for denoising and linear approximation.
The article is organized as follows. In the next section, the derivation of the inverse method in DCT expansion is presented. In Section 3, we adopt order selection techniques to determine the polynomial orders in IPRM. Section 4 focuses on applications, namely, denoising and approximation. In Section 5, numerical experiments are presented, where the performance of the DCT-based IPRM is compared with other methods, especially the wavelet transform. Conclusions are given in Section 6.

DCT-based IPRM
is a partition of Ω. a Furthermore, assume that n i represents the cardinality of Ω i , which is equal to b i − a i + 1. Designating f in each sub-interval Ω i by f i , f (x n ) can be written as Note that The Gegenbauer polynomial expansion of f i at the grid points is given by where C λ ℓ (x) is the Gegenbauer polynomial of order ℓ with parameter λ (λ > 0), g ℓ are the Gegenbauer polynomial coefficients. Note that since IPRM is independent of the polynomial basis [13], one may use other polynomial bases arbitrarily. Here we use Gegenbauer polynomials for consistency with the literature on IPRM. In addition, X i is a linear map defined from {x n } bi n=ai to a uniform grid {y n } n i −1 n=0 over the interval [−1, 1] such that y 0 = x 0 and y ni−1 = x N −1 : Note that {x n } bi n=ai and {y n } ni−1 n=0 are of the same cardinality. Approximating f i with a polynomial of degree m i yields: Thus, polynomial approximation of f yields On the other hand, the DCT expansion of f (x n ) is given bŷ wheref k are the DCT coefficients. Here we use DCT type 2, which is the most common form of DCT [4].
The normalization factors β[k] are given by , the signal f (x n ) is obtained through the inverse DCT expansion as follows: Reconstructing the signal using only the first N d DCT coefficients results in the signal f N d (x n ): which suffers from the spurious oscillations. Instead, by substituting f (x n ) in Equation (6) with f m (x n ) from Equation (5), we havê By defining the matrix W i for the ith sub-interval as with the matrix elements W i kℓ given by This equation can also be written in matrix form as where the vectors G i andf are The constraint is necessary, in order for the system of equations in (13) not to be under-determined [14]. To further simplify the representation of Equation (13), we define the matrix W and vector G as Thus Equation (13) can be written asf which can be solved for the vector of Gegenbauer coefficients G, through pseudo-inversion of W Equation (18) defines the IPRM for the discrete cosine transform. In addition to the condition in (15), since the signal is provided at a finite number of grid points, an extra set of conditions must be met in order for the matrix W not to be rank deficient: This is due to the fact that if a sub-interval Ω i is chosen too small and thus the number of grid points in Ω i is less than the polynomial order, i.e. 1 + m i > n i , the polynomial coefficients cannot be determined uniquely.

Polynomial order selection
When applying IPRM, assuming that f is a smooth signal, the higher the order of the polynomial, the more accurate the reconstruction is. Specifically, if f is a polynomial of degree M , with m ≥ M the reconstruction is exact. As a consequence, choosing the polynomial order sufficiently large will result in an accurate reconstruction. However, one usually seeks the reconstruction of the signal using as few polynomial coefficients as possible within an acceptable accuracy (e.g. in approximation problem). Furthermore, when the signal samples are given in the presence of noise (e.g. in denoising problem), increasing the polynomial order causes over-fitting in the recovered signal. In other words, the reconstructed signal follows the random fluctuations, which are due to the noise present in the signal. Therefore, practical implementation of IPRM in many applications requires the choice of the polynomial order. We suggest employing two commonly used model selection methods, namely MDL and CV, for making such choices. Both model selection methods attempt to choose the best estimate, according to a certain criterion, among a given list of models. In general, this criterion is devised so as to make a compromise between a measure of goodness of fit and that of parametric complexity. The two model selection methods have been chosen as they are known to provide nearly-unbiased estimates of model parameters and are effective in avoiding model over-fitting, which is essential for the proper selection of polynomial orders in the presence of noise.
In this section, we assume that each model M j corresponds to a combination of the polynomial orders for different intervals. We further assume that the polynomial order in the ith interval is constrained . . , N s }. As such, the number of polynomial order combinations (or models M j ) to be chosen from is

Minimum description length
According to MDL principle the best model for a given data set is the one that compresses the data the most and produces the shortest code length of the data. Suppose the noisy values of a piecewise smooth signal f (x n ) are given: where the errors ϵ n are i.i.d. Gaussian random variables with zero mean and unknown variance σ 2 . The aim is to select the polynomial orders {m i } Ns i=1 with which IPRM results in the best estimate (in MDL sense) y = fm N of f . By the Gaussian distribution with unknown variance assumption, the MDL score of each model is given by [18] MDL(M j ) = N log where residual sum of squares (RSS) is calculated as follows: and K represents the number of free parameters in the model. Since in each interval there are m i + 1, i ∈ {1, . . . , N s } polynomial coefficients, so is the number of free parameters. Therefore, the total number of free parameters in the IPRM problem is given by ∑ Ns i=1 (m i + 1). Subtracting the constant term N s from the Hence, the MDL score for IPRM can be written as Given a data set, competing models M j , j ∈ {1, . . . , ∏ Ns i=1 M i } can be ranked according to their MDL scores, with the one having the lowest MDL being the best in MDL sense [15].
In addition, if the data is available in terms of DCT coefficients, instead of time domain data points, we can apply the MDL directly to the DCT coefficients to determine the polynomial orders. This will lead to the same results as in the time domain case, when all the DCT coefficients are employed. This is due to the fact that the DCT matrix is unitary and preserves L 2 -norm of the data. In this case, the value of RSS in terms of DCT coefficients is where a i andâ i designate the DCT coefficients of the noisy and reconstructed signals, respectively. If only the first N d terms of the DCT representation are employed in the reconstruction procedure, the MDL score can be obtained by In this case, the RSS value is written as

Cross-validation
Similar to the previous section, suppose y is the vector of data andŷ designates the reconstructed signal using IPRM with a combination of the polynomial orders {m i } Ns i=1 . If the estimateŷ can be written in the form ofŷ where H is a matrix that does not depend on the data y, the fitting method is called linear [16], and the matrix H is referred to as hat matrix. In leave-one-out cross-validation [17], for a linear fitting problem, the CV score can be obtained using the following equation: where H nn is the nth diagonal element of H. This score is calculated for all the competing models M j , Since IPRM can be expressed in the form of a linear fitting model, we can take advantage of Equations (27) and (28) to calculate CV and GCV scores, respectively. The hat matrix of IPRM is given by where W is the transformation matrix of IPRM from Equation (16); D and P are the DCT and polynomial basis matrices, respectively. The polynomial basis matrix P is the direct sum of the polynomial basis matrices of sub-intervals P i , i.e.
The rationale behind derivation of the hat matrix of IPRM in the form given in Equation (29) is as follows.
From Equation (18)  Similar to MDL, if the signal is provided in terms of its DCT coefficients, by making use of the unitary property of the DCT matrix, it is feasible to find the GCV score c as follows: Moreover, if only a part of DCT coefficients (first N d -terms) are used for the reconstruction, D is the first N d rows of the DCT matrix. In this case, the GCV score is given by

Numerical experiments
In this section, we assess and compare the performance of the order selection methods introduced above.
Root mean square error (RMSE) is used as a quantitative measure for comparison purposes: Here we use two test signals; a piecewise polynomial and a piecewise smooth signal. The first test signal is given by The sample size is set at N = 256. The signal-to-noise ratio (SNR) is changed from 10 to 20 dB (which in linear scale corresponds to SNR = 3.16 to 10) with step size of 1 dB. For each SNR, 100 sets of noisy data were simulated and for each SNR, RMSE is averaged over all the data sets. The polynomial order in each interval is iterated from m i = 1 to 10. For each combination of orders, we perform IPRM to reconstruct the signal. For each reconstructed signal, we calculate the RSS given in Equation (21). The values of RSS and polynomial orders are used to compute the MDL and CV scores given in Equations (22) and (28), respectively. In each case, the minimizer of the resulting score is chosen as the combination of estimated orders, which is used to compute RMSE of the reconstructed signal. The plot of variations of the RMSE values versus SNR is depicted in Figure 1a, in which we compare the RMSE performance of MDL and CV with that of the "oracle", which is basically the IPRM reconstruction with the optimal orders so as to minimize the RMSE criterion. The orders employed in the oracle method are obtained using the original signal f .
We repeat the same experiment for the heavisine signal, using the same set of parameters. As demonstrated in Figure 1b, at low SNR values MDL outperforms CV, whereas at higher SNR values (larger than around 12 dB) CV exhibits a better performance compared to MDL.
The simulation results of this section suggest that the performance of the two model selection methods differs slightly depending on both the signal and SNR level.

Applications
In this section, we concentrate on denoising and linear approximation as two main applications for which DCT has been applied successfully.

Denoising
Applying the framework derived in Section 2, in this section we show that inverse reconstruction method offers excellent de-noising properties. As a result, a new de-noising algorithm for piecewise smooth signals using inverse method in the DCT domain is presented. This de-noising algorithm is particularly applicable when the signal is provided in terms of its first N d DCT coefficients, where recovering the signal using ordinary inverse DCT would result in spurious oscillations in the reconstructed signal.
Suppose f is a piecewise smooth discrete signal. The problem is formulated as follows: one observes y n = f (x n ) + e n where the errors e i are independent and identically distributed zero-mean Gaussian random variables. Alternatively, the observation may be available in terms of the first N d DCT coefficients of a noisy signal. The original signal is supposed to be deterministic and independent of the noise. Our goal is to find an estimate of the original signal using the DCT IPRM framework derived in Section 2.
The proposed de-noising algorithm works in the following way: (1) Compute the DCT coefficients of the noisy signal y. (This step is omitted in the case the noisy signal is given in terms of its DCT coefficients.) (2) Find the regions of smoothness in the signal. Here we employ edge detection techniques based on concentration factors for noisy data [19], which works on Fourier coefficients of the signal. The Fourier coefficients are found from either the noisy signal or through a transformation on the DCT coefficients of the signal.
(3) Find the best polynomial orders for each interval using the order selection techniques introduced in Section 3.
(4) Reconstruct the signal using DCT-based IPRM with the polynomial orders calculated in step 3.
The performance and accuracy of this algorithm depends on the following factors: the number of DCT coefficients utilized in IPRM procedure, precise knowledge of the intervals of smoothness in the signal (i.e. edge detection), and the polynomial orders determined as a result of order selection techniques. In the sequel, we investigate the effect of the number of DCT coefficients used for reconstruction.
Let f be a piecewise smooth discrete signal of length N . Taking DCT of f results in N DCT coefficients.
We consider applying IPRM on f when the first N d (N d ≤ N ) DCT terms of f are used for reconstruction. When the signal is noise free, the number of DCT coefficients N d is required to satisfy the minimum requirement given by inequality (15).
In order to examine the accuracy of the reconstruction, as the number of DCT coefficients varies, we consider two examples. The first example is a piecewise polynomial signal of length N = 256 given by which is illustrated in Figure 2a. The DCT spectrum of f 1 is plotted in Figure 2b. Figure 3a depicts the RMSE of the reconstructed signal as a function of the number of DCT coefficients N d . As can be observed in this figure, although the signal f 1 has more significant DCT coefficients than the lower bound specified by inequality (15), the reconstruction is exact (within machine accuracy) and RMSE is of order 10 −14 , when the above constraint is met.
Next, we consider Heavisine signal f 2 of the same length, which is piecewise smooth and is depicted in Figure 4a. From the RMSE graph of this piecewise smooth signal, it is clear that the higher the number of DCT coefficients used in IPRM, the more accurate the reconstructed signal is. Note that since f 2 is not a piecewise polynomial signal, the error does not converge to zero.
Subsequently, we examine the RMSE performance of the above signals in the presence of noise at different number of DCT coefficients. We assume the signal-to-noise-ratio of the signals to be SNR = 10 dB. For each signal, we conduct a Monte Carlo experiment of 100 sets of noisy data, and average the RMSE result over all data sets. The resulting RMSE for signals f 1 (x) and f 2 (x) are demonstrated in Figure 5.
For the piecewise polynomial signal f 1 (x), when N d is equal or close to the minimum requirement, the RMSE of the reconstruction is high compared to the case where higher number of DCT coefficients is used.
Above this range, including more DCT coefficients in IPRM, does not improve the reconstructed signal.
For the piecewise smooth signal f 2 (x), we virtually encounter the same scenario as in the noise free case.
In other words, as we increase the number of DCT coefficients used for IPRM, we achieve a more accurate reconstruction.

Approximation
In [13], the extension of the one-dimensional (1D) inverse method to images has been studied, where they have shown that when using the tensor product of the 1D case, the transformation matrix in the twodimensional (2D) case, even for sub-domains with a simple geometry, easily becomes singular. Therefore, they have suggested to instead apply IPRM on images in a slice-by-slice d manner [13], which is not efficient for approximation purposes.
One way to overcome the restriction of IPRM in 2D scenarios, is to convert the image into a 1D signal.
The aim is to find a path through all data points of the image such that the resulting signal is as smooth as possible, while considering the costs required for storing the path data. In this way we can apply IPRM on the resulting 1D signal efficiently and achieve a sparse DCT representation of the image.
Recently, an adaptive wavelet transform, referred to as Easy Path Wavelet Transform (EPWT), e was proposed in [20] for a sparse representation of images. The EPWT works along pathways of image data points and exploits the correlations among neighboring pixels. In a nutshell, it consists of two parts: first, finding the pathways according to the difference in value among the neighboring pixels and second, applying a 1D wavelet transform along the resulting pathways. Here, we refer to the first part as easy path algorithm.
The EPWT algorithm is shown to be highly efficient for representation of real-world images, outperforming the tensor-product wavelet transform. However, in this article we show that applying IPRM on the result of the easy path algorithm can improve the approximation performance for piecewise smooth images, significantly.
The easy path algorithm works as follows. Let f (i, j) be a discrete image of size The neighborhood of an index (i, j) ∈ I is defined by According to this definition, the indices located at a vertex and at a boundary but not a vertex, have three and five neighbors, respectively. Otherwise, N (i, j) comprises eight elements. However, considering that the path passes through every point exactly once, it is necessary to exclude the indices that already have been used in the path.
We carry on this procedure as long as the set of admissible neighbors is not empty. This sequence of neighboring points is referred to as a pathway, and the union of pathways forms a path through the index set. Moreover, the transition from one pathway to the next in the path is called an interruption.
In the case of an interruption, we need to pick the next index p(ℓ + 1) from the remaining free indices in the index set. Among different possibilities to start the next pathway [20], we select the index minimizing the absolute difference of the function values.
In addition, it can happen that we encounter multiple choices for the next index in the sense that more than one neighbor attain the minimum cost given by the absolute difference criterion. In [20], it is suggested to fix a favorite direction in advance to find a unique pathway. However, we propose a modified procedure to not only avoid the occurrences of interruptions the most, but also to reduce the number of edges in the path substantially. The latter consequence is particularly important in IPRM, since we shall rely on the edge locations in the reconstruction process.
The proposed modification in the easy path algorithm is as follows. Suppose the recent element of the path is p(ℓ) and the easy path algorithm is to choose the next element of the path p(ℓ+ 1) out of K potential Considering that the storage cost of the path vector is not negligible in comparison with that of the approximation coefficients, efficient coding schemes are required to reduce the cost of storing this vector.
Here, we employ the coding strategy introduced in [20]. Since each pathway connects neighboring indices, one can store the direction of the next element rather than the index itself. Obviously, having eight neighbors at most, the direction of the next index can be represented by three bits in the worst case. The coding algorithm that maps the vector of path indices p into the vectorp works in the following steps: (1) Letp(0) = 0 since p(0) = 0 is the starting point.
(2) For findingp(1), look clockwise through the neighbors of p(0) and insert (3) Having found the path codep(ℓ), the subsequent path codep(ℓ + 1) is zero if it follows the previous direction of the path. More precisely, if Due to the fact that each index appears only once in the path, as we proceed with the algorithm, the number of admissible neighbors decreases and the coding algorithm tends to assign smaller codes top [21]. This in turn results in a lower entropy of the path information.
From the coding strategy described above, it is clear that for a better storage of the path vector, it is preferable to continue the pathway in the direction which leads to smaller values of the path codep.
However, this is in contrast with the procedure explained for the multiple options scenario. In other words, there is a trade-off between the entropy of the pathways and the number of edges and interruptions in the path. This can be tackled by making a compromise through setting a predetermined upper bound θ on the number of degrees of freedom. More precisely, if the minimum number of degrees of freedom is lower than or equal to θ, the neighbor with the minimum degrees of freedom is selected as p(ℓ + 1). Otherwise, the next index p(ℓ + 1) is the one closest to the favorite direction.
The proposed approximation scheme using DCT-based IPRM and the easy path algorithm works in the following way. First, we apply the easy path algorithm with the settings described above, in order to find the vector path p through the image. Having determined the 1D signal f (p), we compute the DCT expansion of f (p). Next, we find the discontinuity locations, using the edge detection from spectral data algorithm suggested in [22]. Specifically, we employ trigonometric concentration factors accompanied with the nonlinear enhancement algorithm proposed in [23]. The Fourier coefficients are found either from the time domain signal itself or through a transformation from the DCT coefficients of the signal [24]. The K-term approximation is obtained by storing the first K DCT coefficients, the path vector p and the edge locations.
The reconstruction of f from {f k } K k=1 is given in the following steps: (1) Find the polynomial orders using the model selection methods explained in Section 3 (employing either MDL or CV through Equations (24) and (33), respectively).
(2) Apply DCT-based IPRM, with the polynomial orders determined in step 1, to obtain the approximate signal f p .
(3) Apply the permutation f (p) = f p in order to recover the function f (i.e. reciprocate the operation of the easy path algorithm).
In order to be able to compare the storage costs of our scheme with that of EPWT, we need to review how its algorithm works. In EPWT, after finding the complete path vector p, a 1D wavelet transform is applied to the vector of function values along the path. As a result of a one-level decomposition, low-pass and high-pass wavelet coefficients are derived. The down-sampled low-pass part is used to form a low-pass filtered image with the same size as the original image and consisting of pairs of identical data points. The easy path algorithm is applied to the resulting image to find a path through the pairs of data points. The wavelet transform is again applied to the resulting path vector. The same strategy is repeated in further levels up to L decomposition levels, the extent of which depends on the data. After a sufficient number of iterations, a shrinkage procedure is applied to the wavelet coefficients at different decomposition levels.
In order to reconstruct the image, one needs the wavelet coefficients, location of the wavelet coefficients (since it is a nonlinear approximation scheme), and the path vectors in each iteration step. According to this algorithm, the length of the path vector at decomposition level ℓ is N1N2 2 ℓ−1 . As a result, the cost of storing the path indices for the further levels usually doubles that of the first decomposition level [20].

Denoising
In this section, we analyze and assess the performance of the de-noising algorithm, introduced in Section 4.
RMSE is used as a quantitative measure for comparison purposes: Here, we use all the DCT coefficients of the signal for reconstruction in IPRM, following the discussion on the effect of the number of coefficients N d , presented in the previous section. Recall that for piecewise polynomial signals a wide range of N d values produce identical results, whereas for piecewise smooth signals, as we increase the number of coefficients N d , the reconstructed signal improves gradually.
We compare the result of our algorithm with the following de-noising methods: (1) Translation-invariant (TI) wavelet de-noising (cycle-spinning) [25] (2) De-noising based on wavelet transform footprints [26,27] (3) Direct method [28,29] The first test signal is a piecewise constant signal given in Figure 7, which is known as Blocks in de-noising literature. We use the same specifications (length of the signal N = 2, 048, SNR = 7) as those provided in Wavelab [30]. For this signal, cycle-spinning gives the best result in terms of RMSE, when Haar wavelet is employed. When applying the wavelet footprints algorithm, it is assumed that the signal is piecewise constant. We also consider the case where polynomial orders are not known. Considering the de-noising procedure for the direct method, in [28] it is suggested to use the following formula: In case the result of this formula is not an integer, ⌈m i ⌉ is considered as the polynomial order [9]. However, the polynomial orders obtained through this formula do not result in an acceptable reconstruction of the signal. Here, we assume that the signal is piecewise constant (thus setting the polynomial orders to zero) and find the parameters λ i from Equation (38). In the case of our algorithm, we consider two scenarios for the polynomial orders: first, we use an oracle method, where an oracle tells us which are the best polynomial orders (zero order polynomials for Blocks signal). Second, we use one of the model selection methods introduced in Section 3, to find the polynomial orders automatically. Here, we use MDL for polynomial order selection.
The simulation results in terms of RMSE for one sample noisy signal are given in Table 1. According to this table, de-noising using IPRM significantly outperforms other de-noising algorithms. Note that Table 1 indicates the same RMSE values for IPRM with and without oracle. This is attributed to the fact that the MDL has estimated the correct polynomial orders. Figure 7 shows the reconstructed signals using different de-noising algorithms.
The second test signal is a piecewise polynomial signal given by The simulation results in terms of RMSE for one sample noisy signal are given in Table 2. Figure 8 illustrates the reconstructed signals using different de-noising algorithms.
The next test signal is a piecewise smooth signal, referred to as Heavisine in de-noising literature. The RMSE results of different de-noising algorithms are given in Table 3. We use MDL for selecting the polynomial orders, which are m 1 = 6, m 2 = 6, and m 3 = 5. In case of the direct method, we set the parameter λ = 1, which was found by trial and error, since the values found through the above formula do not lead to an acceptable reconstruction of the signal. Figure 9 shows the reconstructed signals using different de-noising algorithms at SNR = 16.91 dB.
The last experimental signal is Doppler signal, as illustrated in Figure 10a. The simulation results in terms of RMSE for a noisy signal with SNR = 16.91 dB are given in Table 4. Figure 10 illustrates the reconstructed signals using different de-noising algorithms. Here we use CV to find the polynomial order.
The resulting polynomial order is m = 207 (oracle gives m = 203) which is relatively high, and causes over-fitting at the low frequency section of the signal. However, this is inevitable due to the nature of Doppler signal, which requires a high polynomial order for the high frequency part, and a low polynomial order for the low frequency one. One way to further improve the performance of IPRM algorithm for this signal is to divide it into two parts successively and find the polynomial orders for the resulting sub-intervals, separately.
After one step of such division, it is feasible to reduce the RMSE of the recovered signal from 13.6757 to 11.1579. In this case, the polynomial orders used in the IPRM procedure are m 1 = 140, and m 2 = 15, respectively. As depicted in Figure 10f, the accuracy of the recovered signal is significantly improved in the right sub-interval of the signal.
In order to investigate the performance of the denoising algorithms over a wider range of SNR values, here we present Monte Carlo experiments for the blocks and f (x) signals at different values of SNR. In the case of the blocks signal, 100 sets of noisy signals are simulated, where we vary SNR from 10 to 20 dB by a step-size of 2 dB. The graph in Figure 11a illustrates the average RMSE results for different algorithms.
For the de-noising algorithm based on wavelet footprints, two scenarios are considered: first, the signal is assumed to be piecewise constant, i.e. polynomial orders are known in advance or given by an oracle, and second, polynomial orders are not known in advance. For de-noising based on the direct method, we assume that the polynomial orders are known. In the case of IPRM, we find the polynomial orders using the MDL algorithm. In addition, Figure 11b demonstrates the RMSE performance of the denoising algorithms with the same set of parameters for the piecewise smooth signal f (x).

Approximation
In this section we present some numerical examples of the proposed approximation scheme applied to grayscale piecewise-smooth images. We compare the performance of the DCT-based IPRM with the approximation results of both EPWT and the tensor product wavelet transform using Haar, Daubechies db4 and 9/7 bi-orthogonal filter banks. This comparison is carried out in terms of the number of the coefficients used for the sparse representation, storage costs and the SNR given by where ∥.∥ 2 denotes the L 2 -norm, and f andf are the original and reconstructed sparse images, respectively.
The storage cost of our algorithm consists of the costs of storing the coefficients, edge locations and the path information. The entropy of the path is composed of the entropy of the pathways and the cost of storing the location of interruptions. The entropy of the pathways is given by where n = 8 is the number of possible options in the coded path vectorp and h 0 , . . . , h n−1 designate the frequencies of their occurrence. N = N 1 × N 2 stands for the total number of coefficients. Moreover, the cost of storing the location of either an edge or an interruption is considered as log 2 (N ).
In order to roughly compare the storage costs of the proposed approximation algorithm with that of EPWT and the tensor product wavelet transform, the following simplified scheme is assumed [20,31]. the storage cost of the path is also required to be taken into account.
In the first example we consider the Shepp-Logan phantom image of size 256 × 256, as depicted in Figure   12a. To gain an insight into the effect of the parameter θ, in Table 5 In Table 6, we present some approximation results of our algorithm, EPWT and tensor product wavelet transform, for the phantom image. In this table, we have obtained the storage costs for both b = 8 and b = 16, and the number of iterations for EPWT and tensor product wavelet transform are considered L = 9 and L = 5, respectively, which usually results in the best approximation scenario for the phantom image.
Please note that as long as the number of both iterations and coefficients are constant, cost of the path in EPWT is independent of the choice of the wavelet transform.
According to Table 6, as expected, the performance of both DCT-based IPRM and EPWT is especially superior to tensor product wavelet transform for higher values of the parameter b. In addition, the DCTbased IPRM algorithm outperforms the wavelet-based algorithms, in terms of the storage costs as well as the SNR of the approximation result. Figures 12b and 13a illustrate the approximation results of tensor product wavelet transform and EPWT, respectively, using the Haar transform with the specifications provided in Table 6. In addition, Figure 13b shows the reconstruction result of the proposed DCT-based IPRM algorithm with 175 approximation coefficients.
For the second toy example, we consider the piecewise linear image, shown in Figure 14a. Akin to the previous example, we compare the approximation results of the proposed algorithm with EPWT and tensor 18a and 17 product wavelet transform, as depicted in Table 7. As can be observed from this table, the DCT-based IPRM algorithm significantly reduces the number of coefficients as well as the storage costs of the approximation, at similar SNR values. The reconstruction results of this image using tensor product wavelet transform and EPWT, using Haar filters, are illustrated in Figures 14b and 15a, respectively. The 40-term approximation of the piecewise linear image using the proposed algorithm, is shown in Figure 15b.
Finally, we consider the approximation of disparity map images. Such images encode the shifts between stereoscopic views of the same scene and can serve for determining the depth map associated with a certain view [32]. The color view (Figure 16a) is augmented by disparity map (Figure 16b) encoded in gray-scale levels. The comparison of the approximation performance of the proposed algorithm with that of EPWT and tensor product wavelet transform is given in Table 8. In addition, the 130-term approximation of the DCTbased IPRM (as depicted in Figure ) offers a similar SNR to 500-term and 3, 000-term approximations using EPWT and tensor product wavelet transform (using Haar), with less storage cost. The approximation results of tensor product wavelet transform and EPWT using Haar are depicted in Figures b, respectively.

Conclusions
In this article, a mathematical formulation for the IPRM in the DCT expansion has been derived. The proposed framework aims at overcoming the oscillatory behavior of a signal with discontinuities reconstructed from a limited number of its DCT coefficients. It also leads to a sparser representation if the underlying signal is piecewise smooth. Assuming the discontinuity locations are known and limiting the polynomial order for each smooth piece, the framework essentially leads to the solution of a linear approximation problem. Furthermore in the article, denoising and linear approximation algorithms for piecewise-smooth signals based on their DCT-based IPRMs have been developed. Simulation results have demonstrated significant improvements over wavelet-based methods, namely, TI wavelet denoising, denoising based on wavelet transform footprints, as well as recently introduced EPWT based method.
The linear approximation problem of re-projection of a signal represented in one basis onto another basis as addressed in this article, correlates with the problem considered in the compressive sensing (CS) literature [33]. As the CS theory postulates, if a signal is known to be sparse with respect to a certain dictionary of atoms, it can be reconstructed exactly from measurements in another dictionary, incoherent with the first one. In our case, the sparse representation is sought based on the (Gegenbauer) polynomials while the measurements are taken with respect to the dictionary of DCT atoms. The CS formalism may lead to more general results, i.e. when the discontinuity locations are not given or when the DCT coefficients used for reconstruction are selected arbitrary rather than taking the first K terms. Such results might be quite beneficial for e.g. compression and are object of our future research.