PSD-Constrained PAR Reduction for DMT/OFDM

Common to all DMT/OFDM systems is a large peak-to-average ratio (PAR), which can lead to low power efficiency and nonlinear distortion. Tone reservation uses unused or reserved tones to design a peak-canceling signal that lowers the PAR of a transmit block. In DMT ADSL systems, the power allocated to these tones may be limited due to crosstalk issues with many users in one twisted pair bundle. This PSD limitation not only limits PAR reduction ability, but also makes the optimization problem more challenging to solve. Extending the recently proposed active set tone reservation method, we develop an efficient algorithm with performance close to the optimal solution.


INTRODUCTION
Communication systems using multicarrier modulation have recently become widely used both in wireless (DVB-T, DAB, IEEE 802.11a) and wireline (ADSL, VDSL) environments [1,2,3]. Multicarrier systems have distinct advantages over single-carrier systems, but suffer from a serious drawback: the approximately Gaussian-distributed output samples cause a high peak-to-average ratio (PAR) that results in low power efficiency and possible nonlinear distortion.
In order to alleviate this PAR problem, many researchers have made efforts to reduce large signal peaks through a variety of PAR reduction methods [4,5,6,7,8,9,10]. A technique known as tone reservation was initially developed in [4,5] and is well suited for discrete multitone modulation (DMT) ADSL systems over twisted pair copper wiring. A common phenomenon of this environment is a distance-dependent rolloff of the channel transfer function power with increasing frequency, resulting in upper frequency subchannels having very low SNRs and being incapable of reliably transmitting data. An additive peak-canceling signal can be constructed from these dataless tones, as in [4,5], to help reduce the PAR problem. Further developed tone reservation algorithms have been presented in [11,12,13,14,15,16,17,18].
In ADSL and other practical systems, the peak-reduction signal may be power limited on each of the reserved tones due to crosstalk constraints with many users being serviced in one twisted pair bundle. This is, for instance, manifested in the recent ADSL2 standard [19] as a −10 dB PSD limit on the reserved tones compared to the data-carrying tones. This PSD constraint on the tones can change the theoretical ability of tone reservation to reduce the PAR [20,21] as well as the complexity versus performance tradeoff for practical algorithms.
In this paper, we analyze the PSD-constrained tone reservation problem and its complexity versus performance tradeoff. We extend the recently proposed active set tone reservation approach [16] to handle PSD constraints. Results are analyzed and compared to performance bounds, and computational complexity and algorithm alteration are detailed. In Section 2, we define the system and data model, give a description of the active set PAR reduction algorithm, and introduce PSD-constrained tone reservation. Extension of the active set approach to the PSD-constrained case is presented and analyzed in Section 3, followed by simulation results presented in Section 4.

DMT AND TONE RESERVATION
A DMT system uses a symbol length of N samples, which is typically 512 samples in the ADSL downstream direction. Although these samples uniquely define a signal block, when considering the PAR of the analog signal, peak regrowth [16,17,18] between the sampling points upon digital-to-analog (D/A) conversion has to be considered. Oversampling of the digital signal is a viable approach. Figure 1 schematically describes the reduction approach. A reduction signal c[n] is added to the original data signal x[n], and is constructed of dataless tones that either cannot transmit data reliably (due to low SNRs) or are explicitly reserved by the system for PAR reduction. For example, in the ADSL2 standard, the mechanism for this is to exclude the reserved tones from the supported set of data tones during startup. The goal for the PAR reduction algorithm is to make the resulting signal, x[n] + c[n], have a smaller amplitude span than x[n]. If the reduction signal is constructed of tones with low SNRs, the reduction signal c[n] may be attenuated before arriving at the receiver. This makes tone reservation using low SNR tones mainly applicable to reducing the transmitter side PAR.
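The PAR is defined as
$$\mathrm{PAR}\{\bar{x}\} = \frac{\max_n |\bar{x}[n]|^2}{E\{|x[n]|^2\}}, \tag{1}$$
where the average power in the denominator is that of the data-bearing signal before PAR reduction is applied. We define
$$\bar{x}[n] = x[n] + c[n] = \mathrm{IFFT}(X_k + C_k),$$
where $X_k$ represents the data symbols and $C_k$ the FFT domain PAR reduction signal. On a given DMT tone, one of them has to be zero to maintain distortionless data transmission:
$$X_k + C_k = \begin{cases} X_k, & k \in U^c, \\ C_k, & k \in U, \end{cases}$$
where $U^c$ represents the set of data-bearing subchannels and $U$ represents the set of available subchannels for PAR reduction. Let $x_L$ denote the data signal of one symbol block and let $c_L$ denote the additive peak-reduction signal generated from the tone set $U$, both oversampled to $L$ times the nominal sample rate. We focus on the specific case of a real baseband DMT system, where the data and reduction signals can be expressed as weighted sums of real-valued sinusoids and cosinusoids. In matrix form, we can write $c_L = \check{Q}_L \check{C}$, where $\check{Q}_L$ is an $NL \times 2U$ matrix of sinusoidal and cosinusoidal column vectors with frequencies specified by the $U$ reserved tones and $\check{C}$ is a length-$2U$ vector with the weights of these (co)sinusoids,
$$\check{C} = \left[ \check{C}_{1,\cos}, \check{C}_{1,\sin}, \ldots, \check{C}_{U,\cos}, \check{C}_{U,\sin} \right]^T.$$
For this real-valued case, minimizing the peak magnitude of the resulting signal, equivalent to minimizing its peak power, can be formulated as the linear program [5]
$$\min_{\check{C}} \; \gamma \quad \text{subject to} \quad x_L + \check{Q}_L \check{C} \le \gamma \mathbf{1}, \quad -x_L - \check{Q}_L \check{C} \le \gamma \mathbf{1}, \tag{6}$$
where $\mathbf{1}$ is the all-ones vector of length $NL$.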
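As a small illustration of the PAR definition, the following Python sketch evaluates the PAR of one block in dB. The function name `par_db` and the toy sample values are assumptions made for this example only; the average power is passed in separately because the definition uses the power of the data-bearing signal before reduction.

```python
import math

def par_db(x_bar, avg_data_power):
    """PAR in dB: peak power of the (possibly reduced) block divided by
    the average power of the data-bearing signal before reduction."""
    peak_power = max(v * v for v in x_bar)
    return 10.0 * math.log10(peak_power / avg_data_power)

# Toy block (hypothetical values): one large peak among small samples.
x = [0.5, -0.5, 0.5, -2.0, 0.5, -0.5, 0.5, -0.5]
avg_power = sum(v * v for v in x) / len(x)
print(par_db(x, avg_power))
```

A block whose samples all have the same magnitude gives 0 dB, which is the unrealistic ideal mentioned in the tone selection discussion.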

Tone selection
It is desired that the reduction signal c L cancels out the peaks in the data signal x L as well as possible. Total cancellation, c L = −x L , is naturally impossible, and an alternative, yet still unrealistic, goal is to drive the signal towards a PAR of 0 dB (i.e., the peak power and average power are equal). This tight control of the signal requires a large portion of the frequency band. In general, more reserved tones lead to a lower PAR, and therefore, a tradeoff exists between data throughput and PAR [22]. A choice must be made as to which tones will be used for PAR reduction rather than data transmission. If the system is able to choose freely, the distribution of these tones over the system bandwidth has a significant impact on PAR reduction ability. In general, with no power constraints on the reduction tones, an uneven, spread-out placement (e.g., generated by a random selection of tones) allows for very good PAR reduction [5,23]. A significant performance loss, however, results from placing the reduction tones as a contiguous block or uniformly distributed over the entire bandwidth.
In wireline DMT systems, it is preferred to use those tones which cannot send data reliably due to insufficient SNRs, thereby maintaining the same throughput level. Generally, these tones are in the uppermost frequencies, and tend to resemble a contiguous block of tones, which is not a good tone set in terms of performance. An alternative is to reduce the system throughput by sacrificing some tones for peak reduction and achieving an uneven, spread-out placement. We will consider these two extreme cases of tone placement. In practice, a combination of these may turn out to be the most attractive choice.
After determining the set of reserved tones, the reduction signal c L is created from a nominal peak-reduction kernel p [5], formed by projecting an impulse at n = 0 onto the set of tones U. This corresponds to the least squares approximation of the impulse with equal weight on each reduction tone. Other forms of p generated by different criteria, such as minimizing the size of their sidelobes, have been suggested in [5].
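A minimal sketch of the kernel construction, assuming a real baseband signal in which reserved tone t contributes a cosine of frequency 2πt/(NL) to the oversampled block. The function names, the scaling by 1/U (chosen so that p[0] = 1), and the example tone indices are assumptions for illustration.

```python
import math

def peak_reduction_kernel(tones, n_fft, oversample):
    """Project a unit impulse at n = 0 onto the reserved tones.
    Every tone gets equal weight; the result is scaled so p[0] = 1."""
    length = n_fft * oversample
    scale = 1.0 / len(tones)
    return [scale * sum(math.cos(2.0 * math.pi * t * n / length) for t in tones)
            for n in range(length)]

def shifted(p, n0):
    """Circularly shift the kernel so that its peak sits at sample n0."""
    return [p[(n - n0) % len(p)] for n in range(len(p))]

# Hypothetical small example: 3 reserved tones, N = 32, L = 2.
p = peak_reduction_kernel([5, 9, 14], 32, 2)
p7 = shifted(p, 7)
```

Because the kernel is a sum of cosines, it is symmetric around n = 0, and every circular shift stays inside the signal space spanned by the reserved tones.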

Active set tone reservation
The linear program in (6) can be solved with a simplex method, but this is expensive, with a complexity of $O(N^2L^2)$ operations. Computationally efficient $O(NL)$ approaches based upon projection onto convex sets (POCS) and gradient projection were developed in [4,5], respectively, but suffer from slow convergence. A recent $O(NL)$ approach [13,16] was developed based on active set methods [24] and exhibits rapid convergence towards a minimax PAR solution. Whereas a finite number of iterations will achieve the optimal PAR level $\gamma^*$ for the given tone set, a very good suboptimal solution can be achieved in two or three iterations, making this an attractive practical solution.
As in the gradient projection and POCS approaches, the active set approach reduces the PAR through the use of the kernel p. Circularly shifted versions of this kernel, $p_n$, also lie in the signal space generated from $U$, allowing easy reduction of a peak at an arbitrary sample location $n$.
Beginning with the sample of largest magnitude $\gamma_0$ at location $n_0$, the peak is reduced by subtracting a scaled version of $p_{n_0}$ until a second peak at some location $n_1$ is balanced with it at some magnitude $\gamma_1 < \gamma_0$. These two peaks are then reduced equally through a linear combination of $p_{n_0}$ and $p_{n_1}$ until a third peak is balanced. These three peaks are reduced equally until a fourth is balanced, and so forth. When a sample is at the peak magnitude, it signifies an active inequality constraint (i.e., one that holds with equality) in (6), and the active set approach is therefore building a set of active constraints. Mathematically, the iteration updates can be written as
$$\bar{x}^{(i)} = \bar{x}^{(i-1)} - \mu^{(i)} \hat{p}^{(i)}, \tag{7}$$
where $\bar{x}^{(i)}$ represents the signal after the $i$th iteration, $\hat{p}^{(i)}$ is the descent direction in the $i$th iteration, and $\mu^{(i)}$ represents the descent step size. At the start of the $i$th iteration, there will be $i$ peaks which are balanced at locations $n_0, n_1, \ldots, n_{i-1}$. To keep these peaks balanced, the next iteration descent must satisfy
$$\hat{p}^{(i)}[n_k] = \operatorname{sign}\left(\bar{x}^{(i-1)}[n_k]\right), \quad k = 0, \ldots, i-1,$$
with the assumption that we scale $\hat{p}^{(i)}$ to have unit magnitude in locations corresponding to the active set of peaks. No matter what value of $\mu^{(i)}$ is chosen, the magnitudes of the peaks at $n_0, n_1, \ldots, n_{i-1}$ will remain equal. The $\hat{p}^{(i)}$ values can be calculated as
$$\hat{p}^{(i)} = \sum_{k=0}^{i-1} \alpha_k^{(i)} p_{n_k}, \tag{9}$$
where the $\alpha_k^{(i)}$ are computed by solving the $i \times i$ system of equations
$$\sum_{k=0}^{i-1} \alpha_k^{(i)} \, p[n_j - n_k] = \operatorname{sign}\left(\bar{x}^{(i-1)}[n_j]\right), \quad j = 0, \ldots, i-1.$$
This requires an $i \times i$ matrix inverse, but in practical implementations, $i$ will typically be at most 3, and the inverse cost is then insignificant relative to the total iteration complexity. Furthermore, efficient inverse techniques [25] can be applied since, in addition to being symmetric (due to the symmetry of p), the matrix in a given iteration is contained in the matrix for the next iteration. The step size $\mu^{(i)}$ required to balance the next active peak is determined by testing samples as follows (see [15,16] for more details):
$$\mu^{(i)} = \min_{n \notin A} \left\{ \frac{\gamma_{i-1} - \bar{x}^{(i-1)}[n]}{1 - \hat{p}^{(i)}[n]}, \; \frac{\gamma_{i-1} + \bar{x}^{(i-1)}[n]}{1 + \hat{p}^{(i)}[n]} \right\},$$
where $A$ represents the set of samples in the active set, $\gamma_{i-1}$ is the peak magnitude at the start of the iteration, and only nonnegative candidates with a nonzero denominator are considered.
Strategies exist [15,16] to reduce the sample testing complexity, as the structure of $\bar{x}^{(i)}$ and $\hat{p}^{(i)}$ can be exploited to eliminate many potential samples from consideration. For practical implementation, the division operation can be replaced by a multiplication with the output of a prestored table of inverses. Exact values are not needed for comparison purposes, and therefore, a dense lookup table is not required.
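The first iteration described in this subsection (one active peak, with the step size chosen so that a second sample becomes balanced) can be sketched as follows. This is an illustrative sketch rather than the implementation of [15,16]: it tests every sample exhaustively, and the function name and the tolerance `1e-12` are assumptions. The kernel `p` is assumed to be normalized so that p[0] = 1 and |p[n]| ≤ 1.

```python
import math

def first_active_set_step(x, p):
    """Reduce the largest peak of x with the shifted kernel until a
    second sample reaches the same magnitude. Requires p[0] = 1."""
    n_len = len(x)
    n0 = max(range(n_len), key=lambda n: abs(x[n]))
    peak = abs(x[n0])
    sign = 1.0 if x[n0] >= 0 else -1.0
    # Descent direction with unit magnitude (and matching sign) at the peak.
    p_hat = [sign * p[(n - n0) % n_len] for n in range(n_len)]
    mu = peak  # upper limit: the peak cannot be pushed below zero
    for n in range(n_len):
        if n == n0:
            continue
        # Candidate steps at which sample n meets the shrinking peak level,
        # from above (first pair) or from below (second pair).
        for num, den in ((peak - x[n], 1.0 - p_hat[n]),
                         (peak + x[n], 1.0 + p_hat[n])):
            if den > 1e-12 and 0.0 <= num / den < mu:
                mu = num / den
    reduced = [x[n] - mu * p_hat[n] for n in range(n_len)]
    return reduced, mu, n0
```

After the step, the sample at `n0` has magnitude `peak - mu`, and at least one other sample has been brought up to the same level, which is the balancing condition the text describes.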

PSD-constrained tone reservation
Solving (6) for the optimum PAR value will in many cases cause the power on the reduction tones to grow immensely as the very last bits of reduction performance require large reduction signals. A standardized system generally has to follow certain PSD constraints on data tones. Similar rules are applicable for reduction tones as well, especially in wireline systems where crosstalk exists and the effect on other users should be kept to a minimum. Thus, a system may have to abide by instantaneous and/or average power constraints on the reserved tones.
What the PSD constraint should be is a system design issue based upon factors such as crosstalk and power consumption or, in practice, often determined by a standard. In the new ADSL2 ITU-T Recommendation [19, Figure 8-19/G.992.3], passband tones are under strict control and can be grouped into different categories: one group of tones is for data transmission and another group consists of monitored tones for receiver functions (e.g., channel estimation). Both of these groups belong to the medley set. Tones that are not in the medley set have a PSD restriction 10 dB below the nominal PSD level and these are the tones that can be used for PAR reduction.
Since the PSD is a measurement averaged over time, the power on the tones may be allowed to vary from symbol to symbol, and the instantaneous power of a symbol may therefore exceed the PSD constraint. As an example, consider a target PAR value of 12 dB and the uppermost probability curves for unreduced signals shown in Figures 4 to 9. It follows that approximately 8% of the symbols require PAR reduction, and due to averaging, a revised PSD constraint on the reserved tones can be determined. If PAR reduction is employed for only 8% of the symbols, we can allow an average reserved tone power 10 log(1/0.08) ≈ 11 dB above its overall −10 dB PSD constraint. This results in a revised PSD constraint on the reserved tones −10 dB + 11 dB = +1 dB above the nominal PSD mask for the ADSL2 system. When processing one symbol at a time, however, a peak power constraint per tone for each symbol is much easier to deal with than an averaged PSD constraint. Using this power constraint can cause the averaged PSD figure to be somewhat less than this peak constraint. Nevertheless, for a given peak power constraint per tone, a corresponding averaged PSD level can be determined experimentally for a specific system, and the constraints can then be interchanged. In the rest of this paper, we consider the peak power limitation, or instantaneous PSD constraint, on each tone rather than a PSD as a result of averaging.
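The averaging argument above is simple arithmetic, reproduced here as a short Python check. The function name is an assumption for this example; the inputs are the −10 dB limit and the 8% fraction quoted in the text.

```python
import math

def revised_psd_offset_db(psd_limit_db, active_fraction):
    """Per-symbol power allowance (dB relative to the nominal mask) when
    only a fraction of the symbols carry a reduction signal."""
    return psd_limit_db + 10.0 * math.log10(1.0 / active_fraction)

print(round(revised_psd_offset_db(-10.0, 0.08), 2))  # 0.97
```

The unrounded averaging gain is about 10.97 dB, which the text rounds to 11 dB, giving the quoted figure of roughly +1 dB above the nominal mask.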
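Incorporating the power constraint on each tone, the PSD restriction becomes part of (6) in the form of a quadratic constraint:
$$\check{C}_{l,\cos}^2 + \check{C}_{l,\sin}^2 \le A_{l,\max}^2, \quad l = 1, \ldots, U, \tag{12}$$
where $\check{C}_{l,\cos}$ and $\check{C}_{l,\sin}$ are the cosine and sine weights on tone $t_l$ and $A_{l,\max}$ is the limitation in amplitude on that tone. Due to the introduction of quadratic constraints, the problem is no longer a linear program, but instead a quadratically constrained quadratic program (QCQP).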

Modifications for PSD constraints
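If the active set algorithm is to be used in the PSD-constrained case, it must be modified. Letting $\check{C}_l$ denote the $l$th element of $\check{C}$ (including both cosine and sine parts), the total weight on tone $t_l$ after iteration $i$ can be described as
$$\check{C}_l^{(i)} = \sum_{j=1}^{i} \Delta\check{C}_l^{(j)},$$
where the increments $\Delta\check{C}_l^{(i)}$ in each iteration include the effect from reducing one additional peak. Using the step size $\mu^{(i)}$ and weighting $\alpha_k^{(i)}$ from (9), the increments $\Delta\check{C}_l^{(i)}$ can be expressed in cosine and sine components,
$$\Delta\check{C}_{l,\cos}^{(i)} = -\mu^{(i)} K \sum_{k=0}^{i-1} \alpha_k^{(i)} \cos\left(\frac{2\pi t_l n_k}{NL}\right), \quad \Delta\check{C}_{l,\sin}^{(i)} = -\mu^{(i)} K \sum_{k=0}^{i-1} \alpha_k^{(i)} \sin\left(\frac{2\pi t_l n_k}{NL}\right), \tag{14}$$
where $K$ is a known constant that results from normalizing p so that $p_0 = 1$. We can think of three main outcomes when performing an active set iteration at an instance where none of the PSD constraints have been met or exceeded.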
(1) A new peak is balanced and no PSD constraints are met. This is the same case as with no PSD constraint. The algorithm can continue with its next step.
(2) All tones meet/exceed the PSD constraints at the same time. This happens when reducing one peak and reaching the PSD constraint before a second active peak is encountered.
(3) Some tones meet/exceed the PSD constraint. This can happen when two or more peaks are already balanced. Then different tones will likely have different magnitudes, see Figure 2.
For case (1), the algorithm will be identical to what is described in Section 2.2. For case (2), the algorithm merely takes the step µ max that fills all subchannels to the PSD constraint, and the optimal solution has been reached. The interesting question is what to do in case (3), as some of the tones have filled up or gone past their PSD constraints, while others are still available for further reduction. The µ descent can easily be scaled back to where the first tone reaches the PSD constraint, that tone can be frozen, and the remaining tones can be used for PAR reduction in subsequent iterations. This process can be repeated until all tones reach the PSD constraint. We note that an iteration now refers to the operations performed to reach either a new active peak or a new tone that meets the PSD constraint.
Figure 2: Addition of the tone weights for reduction of two different peaks can cause the PSD constraint to be reached on certain tones before others.

Cost-versus-performance issues
It can be expected that once any tone reaches the PSD constraint, many or all of the remaining tones are not far from reaching it as well. At this point, the problem is that convergence speed (i.e., additional PAR reduction per iteration) is severely reduced as a new iteration must be performed to the point where either a new tone reaches the PSD constraint or a new active peak is encountered.
After each new tone reaches the PSD constraint and is shut off, the set U changes and a new nominal peak-reduction kernel p needs to be computed. Rather than compute the projection of an impulse onto the remaining tones, the contribution of the removed tone can just be subtracted (using NL operations) from the latest p.
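A minimal sketch of this kernel update, assuming the kernel is an equal-weight sum of cosines over the reserved tones, normalized so that p[0] = 1. The function name and the renormalization by the number of remaining tones are assumptions made for this illustration.

```python
import math

def remove_tone(p, num_tones, tone):
    """Update a kernel normalized to p[0] = 1 after one tone is frozen.
    One pass over the NL samples replaces a full projection.
    Requires num_tones >= 2 so that at least one tone remains."""
    length = len(p)
    remaining = num_tones - 1
    return [(num_tones * p[n] - math.cos(2.0 * math.pi * tone * n / length)) / remaining
            for n in range(length)]
```

Multiplying by `num_tones` undoes the old normalization, the cosine of the frozen tone is subtracted, and dividing by the number of remaining tones restores p[0] = 1.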

Low complexity algorithm
The cost-versus-performance tradeoff dictates that it may not be worth iterating beyond the point where the first tone reaches the PSD constraint, even though this leaves the remaining power in the other tones unused. This low complexity approach saves a lot of computation and results in only a small performance loss from the optimal solution, as simulations show in the next section. The complexity of this extended algorithm is the same as that of the unconstrained active set approach, with the additional cost of keeping track of the signal power in each tone. This cost is insignificant compared to the rest of the algorithm since $U \ll NL$. During each iteration, a new $\hat{p}^{(i)}$ is created according to (9), and in parallel to that, the new signal in each tone is calculated, based on the additional contributions according to (14). Before applying (7) and potentially wasting operations, 2U multiplies and U adds are used to check the tone powers against the PSD constraint. If any of the tones exceeds the PSD constraint, $\mu^{(i)}$ must be scaled back to find the point where the PSD constraint is met with one or more tones. The quadratic equation
$$\left(\check{C}_{l,\cos}^{(i-1)} + \beta_l \, \Delta\check{C}_{l,\cos}^{(i)}\right)^2 + \left(\check{C}_{l,\sin}^{(i-1)} + \beta_l \, \Delta\check{C}_{l,\sin}^{(i)}\right)^2 = A_{l,\max}^2$$
is solved for $\beta_l$ for the tone(s) exceeding the PSD constraint, and the minimum $\beta_l$ value is chosen to scale $\mu^{(i)}$. This modified step size is then used in (7) to compute the final PAR-reduced signal.
Figure 3: Linear approximation of the quadratic magnitude constraints. An octagon is shown here, but a polygon with a larger number of sides can be used for a better approximation.
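The scale-back of the step size can be illustrated per tone with the sketch below. It solves the quadratic for the nonnegative root and caps the result at 1 (no scale-back needed). The function name and argument names are assumptions; the inputs are the cosine and sine weights already on the tone, the proposed increments for the current iteration, and the amplitude limit.

```python
import math

def scale_back(c_cos, c_sin, d_cos, d_sin, a_max):
    """Largest beta in [0, 1] such that the tone weight c + beta * d
    stays within magnitude a_max. Assumes |c| <= a_max on entry."""
    q = d_cos ** 2 + d_sin ** 2
    if q == 0.0:
        return 1.0  # no increment on this tone
    half_b = c_cos * d_cos + c_sin * d_sin
    c0 = c_cos ** 2 + c_sin ** 2 - a_max ** 2  # <= 0 while inside the limit
    disc = max(0.0, half_b ** 2 - q * c0)
    beta = (-half_b + math.sqrt(disc)) / q
    return min(1.0, beta)

# Hypothetical tone: current weight (0.3, 0.4), increment (0.5, 0.5), limit 1.
beta = scale_back(0.3, 0.4, 0.5, 0.5, 1.0)
```

In the low complexity algorithm, the minimum of these values over all tones would multiply the step size before the signal update is applied.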

Performance bounds
It is important to gauge how much performance is lost when using this low complexity algorithm that halts PAR reduction once any tone reaches the PSD constraint. Three lower bounds on achievable PAR level are now presented.

Bound on minimum PAR
The resulting PAR level after the low complexity algorithm can be compared to the optimal solution of (12). This equation represents a QCQP, and is still a convex problem. Linear approximations of the quadratic constraint (see Figure 3) can be employed to transform the problem back to linear programming form [21], so that it can be solved with linear programming algorithms. Thereby, a performance bound can be computed through simulations. It should be noted that this bound on the optimal solution is extremely tight when used with polygons of 16 sides and larger.
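One way to picture the linearization is to replace the disc of allowed (cosine, sine) weights on each tone with a regular polygon described by linear inequalities. The sketch below assumes a polygon that circumscribes the disc, so the linear program is a relaxation and its optimum cannot exceed the true minimum PAR; whether [21] uses a circumscribed or inscribed polygon is not stated here, so this choice and the function names are assumptions.

```python
import math

def polygon_constraints(a_max, sides):
    """Half-planes (u, v, rhs) meaning u*c_cos + v*c_sin <= rhs, whose
    intersection is a regular polygon circumscribing the disc of
    radius a_max."""
    return [(math.cos(2.0 * math.pi * j / sides),
             math.sin(2.0 * math.pi * j / sides),
             a_max) for j in range(sides)]

def worst_case_excess_db(sides):
    """How far a polygon vertex lies outside the disc, in dB of magnitude."""
    return 20.0 * math.log10(1.0 / math.cos(math.pi / sides))

for k in (8, 16, 32):
    print(k, round(worst_case_excess_db(k), 2))
```

Under this assumption, the worst-case magnitude excess at a polygon vertex is about 0.69 dB for an octagon, 0.17 dB for 16 sides, and 0.04 dB for 32 sides, which is consistent with the observation that 16 or more sides give a very tight bound.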

A max bound
The constraint on maximum power per tone (equivalent to a constraint on the maximum magnitude) results in limiting the magnitude of the peak-reduction signal to $A_{\max}$, where
$$A_{\max} = \sum_{l=1}^{U} A_{l,\max}.$$
We assume that an arbitrary peak-reduction signal can be created, with the only limitation being that its amplitude is between $-A_{\max}$ and $A_{\max}$. As a result, starting with a symbol with peak level $\max_n |x_L[n]|$, the peak level can at best be reduced down to $\max_n |x_L[n]| - A_{\max}$. Since this model admits additional degrees of freedom compared to the true reduction signal, it serves as a lower bound on the achievable PAR level. Given a peak value for a symbol block, this can be expressed as
$$\mathrm{PAR} \ge \frac{\left(\max_n |x_L[n]| - A_{\max}\right)^2}{E\{|x[n]|^2\}}. \tag{17}$$
This $A_{\max}$ bound shows that when the PSD constraint is quite restrictive and only a small number of tones are reserved, PAR reduction performance is severely limited, even with an arbitrary choice of reduction tones [20,21]. In this case, a choice of tones discarding the minimum amount of data capacity may be the most favorable.
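The bound can be evaluated per symbol with a few lines of Python. This sketch assumes the per-tone amplitude limits add directly in the time domain (i.e., unit-amplitude sinusoids as basis functions); the function name and the example numbers are hypothetical.

```python
import math

def a_max_bound_db(peak, tone_limits, avg_data_power):
    """Lower bound on the PAR (dB) of one symbol when the reduction signal
    is only required to stay within +/- A_max."""
    a_max = sum(tone_limits)
    residual = peak - a_max
    if residual <= 0.0:
        # The relaxed model could cancel the peak entirely: no useful bound.
        return float("-inf")
    return 10.0 * math.log10(residual ** 2 / avg_data_power)

# Hypothetical symbol: peak 4.0, four tones limited to 0.5 each, unit power.
print(a_max_bound_db(4.0, [0.5] * 4, 1.0))
```

With few tones and tight limits the residual stays close to the original peak, which is the regime in which the text says the bound dominates performance.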

2-Bound
The A max bound from (17) corresponds to the achieved peak level when all tones are filled in order to reduce the largest peak in x L . A similar bound can be computed after the active set approach has already performed its first iteration. The two balanced peaks can be reduced (without any regard for the other samples, and thus making a bound) until all tones meet the PSD constraint. This bound, which we refer to as the 2-bound, is simple to simulate because α 0 and α 1 must be of equal magnitude due to the symmetry of p.

SIMULATIONS
A DMT system with symbol length N = 512 is simulated with tones 33-255 used for either data transmission or PAR reduction (these system parameters are the same as for downlink ADSL transmission). Each of the data-carrying tones uses a 1024-point QAM constellation. Before active set processing, the signals have been oversampled by the factor L = 4 to limit analog peak-regrowth effects upon digital-to-analog conversion. It has been observed that operating on the digital L = 1 signal does not provide any worthwhile PAR reduction performance for the analog signal [15]. Oversampling to L = 4 makes the computational cost increase by a factor of 4, although L = 2 could be employed for a performance decrease which varies based upon the number of tones, their locations, and PSD constraints. As described in Section 2.3, the averaged PSD constraint for the reduction tones could be set to about 1 dB above the nominal PSD mask for the given example. We now use this figure as a guideline for the instantaneous PSD mask in the following simulations. To illustrate the effects when varying the maximum reduction power per tone, the simulations will first use a restrictive constraint set at the nominal PSD mask, and then use a looser mask, where the magnitude is increased by 50% (+3.5 dB).
We view the forthcoming PAR results on a per-symbol basis using the simulated probability that at least one sample in a symbol block exceeds a certain PAR level. This corresponds to taking the maximum value over one symbol in (1), thereby reflecting the probability that a symbol is transmitted with distortion. This clip probability is also commonly used in the literature. A viable alternative would be to evaluate the clip probability of each individual sample, which reflects the percentage of time the transmitted signal is clipped. Figure 4 shows simulations with the upper block of 12 tones (numbers 244-255) used for PAR reduction and subjected to an instantaneous PSD constraint equal to the nominal PSD level for the data tones. The curves show the reduction performance using the extended active set algorithm, stopping as soon as any PSD constraint is reached. Shown on the vertical axis is the probability that the time domain symbol block $x_L$ would be clipped if subjected to a clip level $\gamma_c$ on the x-axis, that is,
$$\Pr\left\{ \frac{\max_n |x_L[n]|^2}{E\{|x[n]|^2\}} > \gamma_c \right\}.$$
Figure 4: Symbol clip probability for 12 PAR reduction tones, chosen as a contiguous block of the highest tones. Up to four active set iterations are applied, but the algorithm stops once any tone hits the PSD constraint. The three leftmost curves represent optimal solution bounds.
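The symbol clip probability plotted in the figures can be estimated from simulated per-symbol PAR values as in the following sketch. The function name and the sample values are assumptions for illustration, not data from the simulations.

```python
def clip_probability(par_values_db, clip_level_db):
    """Fraction of symbols whose per-symbol PAR exceeds the clip level."""
    exceed = sum(1 for v in par_values_db if v > clip_level_db)
    return exceed / len(par_values_db)

# Hypothetical per-symbol PAR values in dB.
pars = [10.0, 11.0, 12.0, 13.0]
print(clip_probability(pars, 11.5))  # 0.5
```

Sweeping `clip_level_db` over a range of values traces out one curve of the kind shown in the clip probability figures.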

Restrictive PSD constraint
Starting at the rightmost line, corresponding to the clip probability of an unreduced symbol, curves representing iterations one through four are shown. The two leftmost curves show the lower bounds from Section 3.4 (A max bound and 2-bound), which the simulations cannot cross. The third lowest curve, dashed and ending at a clip probability of $3 \cdot 10^{-4}$, is the PAR achieved by finding the minimum value of (6) with linearized quadratic constraints (a 32-sided polygon, cf. Figure 3) and using the same upper block of 12 tones. This curve will also serve as a bound for the suboptimal algorithm, but due to its much larger complexity, this curve has not been simulated for the lower clip probabilities.
Figure 5: Symbol clip probability for PAR reduction with the 12 highest tones. The PSD constraint allows 50% higher magnitude per tone than in Figure 4. The reduction performance shows only a small gain compared to Figure 4, showing that this placement cannot take much advantage of the loosened PSD constraint.
Looking at the performance of the low complexity algorithm, we see that for the higher clip probabilities, there is a performance gain of about 0.15 dB going beyond two iterations, and an additional 0.1 dB compared to the minimum PAR bound (dashed line). At the lower clip probabilities, we see that the curves converge towards the A max bound from (17).
Here we see a situation where a restrictive PSD constraint and a small number of reduction tones set a limit on the achievable PAR level. The reduction performance is limited by the A max bound, and not necessarily by the block placement reduction performance. The low complexity algorithm provides near-optimal performance at a very low cost for this system.

Loosening the PSD constraint
In Figure 5, the PSD constraint is increased by 50% in magnitude for each tone. Comparing the figures, we see that the lower bound decreases due to an increase of the maximum reduction signal. However, the simulated reduction performance, including the optimal solution, increases by only about 0.3 dB. The block placement simply cannot take advantage of the increased reduction power, and is the real limiting factor in this case. Looking at the performance of the low complexity algorithm, we see that its loss compared to the minimum PAR bound is about 0.2 dB. Figure 6 shows results for when the upper block of 24 tones is used for PAR reduction along with the same PSD constraint as in Figure 4. Looking at the figure, we see that the gain from 12 to 24 tones is only about 0.4 dB, which is small considering that the maximum reduction magnitude has been doubled (the A max bound is significantly lower). In this situation, however, we see that after 4 active set iterations, we are about 0.2 dB from the minimum PAR bound at higher probabilities, thus telling us that further iterations are likely not worth the significant cost to achieve it.

Randomly chosen tones
We have seen that even when constraints (PSD limit or number of tones) are loosened, a bad tone set selection can still be a limiting factor. Now a more "spread-out" tone set is evaluated, where the reserved tones are randomly selected in the interval from 33 to 255 inclusive. Figure 7 shows similar simulations as Figure 4, using the restrictive instantaneous PSD constraint, equal to the average power mask for the data tones. Looking at the figure, the iterations quickly converge to within 0.1 dB of the A max bound, and the performance is only slightly better than for block placed tones. Here the A max bound effectively sets the limitation on system performance [20,21]. Figure 8 shows the performance when the PSD constraint is set to allow for a tone magnitude 50% higher than before. The reduction performance has increased thanks to more allowed power. At the lower clip probabilities, the gains are close to 1 dB compared to Figure 7, and the active set results are very close to the performance bounds. At higher clip probabilities, the gains are close to 0.5 dB, but are a noticeable distance from the very tight minimum-PAR bound. This is only a minor issue, since in these regions, the PAR level after 3 or 4 iterations is already rather low.

Increasing the number of tones
Finally, Figure 9 shows simulations using 24 randomly chosen tones, with the restrictive PSD constraint. Due to the superior reduction ability for this placement type, the resulting PAR level is clearly lower than in the previous simulation. The allowed A max is 100% higher here than with half the number of tones, and we see that a larger number of active set iterations may be needed to achieve PAR levels very close to the optimal solution. However, when considering lower clip probabilities, the 4th active set iteration is not very far from the 2-bound.

CONCLUSIONS
Introducing PSD constraints into tone reservation affects the achievable PAR reduction and significantly alters the complexity-versus-performance tradeoff for practical algorithms. The results in this paper show the impact that PSD constraints have on tone reservation performance, and it is clear that the effect when using randomly chosen tone sets is more severe than for contiguous tone sets.
A low complexity suboptimal solution has been presented, and results show that its performance is close to optimal solution bounds. Since small performance increases incur a major computation cost (greater than the low complexity algorithm itself), we assert that our proposed approach gives a very good tradeoff of complexity and PAR reduction.
To evaluate whether the oversampling of L = 4 is sufficient, the signals were oversampled by an additional factor of 4 after reduction. The peak regrowth has been observed to be less than 0.2 dB. Further studies could also include the effect on peak regrowth after the filter chain present in the transmitter [16,17,18].
An important special case results when a nonuniform PSD constraint is given, that is, more power is allowed on some reserved tones than others. In this case, certain tones may reach their PSD constraint much sooner than the rest, and sizeable performance gains beyond this stoppage point may still exist. An intelligent approach may be to modify the formation of p by weighting the impulse projection onto the tones according to the nonuniformity of the PSD mask. In this way, the more restricted tones do not reach their PSD constraint with greater ease than the others.
Although the real baseband DMT case is the main focus of this paper, the principles can also be applied to the complex baseband case (for wireless OFDM systems), as an active set approach for this case has already been developed in [14,16]. The problem with tone reservation in wireless systems is that it may not be desirable to sacrifice data tones in a fading channel. However, it is possible that in a fixed wireless scenario (with a slowly varying channel), channel state feedback could be employed and certain subchannels with low SNRs could be used for tone reservation.

ACKNOWLEDGMENT
This work was supported by Ericsson AB and by the Australian Research Council.