Hindawi Publishing Corporation
EURASIP Journal on Advances in Signal Processing
Volume 2007, Article ID 12140, 7 pages
doi:10.1155/2007/12140

## Research Article

# A Heuristic Optimal Discrete Bit Allocation Algorithm for Margin Maximization in DMT Systems

### Li-Ping Zhu,<sup>1</sup> Yan Yao,<sup>1</sup> Shi-Dong Zhou,<sup>1</sup> and Shi-Wei Dong<sup>2</sup>

- <sup>1</sup> Department of Electronic Engineering, School of Information Science and Technology, Tsinghua University, Beijing 100084, China
- <sup>2</sup> National Key Laboratory of Space Microwave Technology, Xi'an Institute of Space Radio Technology, Xi'an 710100, China

Received 14 July 2006; Revised 24 December 2006; Accepted 25 December 2006

Recommended by Erchin Serpedin

A heuristic optimal discrete bit allocation algorithm is proposed for solving the margin maximization problem in discrete multitone (DMT) systems. Starting from an initial equal power assignment bit distribution, the proposed algorithm employs a multistaged bit rate allocation scheme to meet the target rate. If the total bit rate is far from the target rate, a multiple-bits loading procedure is used to obtain a bit allocation close to the target rate. When the total bit rate is close to the target rate, a parallel bit-loading procedure is used to achieve the target rate; this procedure is computationally more efficient than the conventional greedy bit-loading algorithm. Finally, the target bit rate distribution is checked: if it is efficient, it is also the optimal solution; otherwise, the optimal bit distribution can be obtained with only a few bit swaps. Simulation results using the standard asymmetric digital subscriber line (ADSL) test loops show that the proposed algorithm is efficient for practical DMT transmissions.

Copyright © 2007 Li-Ping Zhu et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### 1. INTRODUCTION

Discrete multitone (DMT) is a modulation technique that has been widely used in various digital subscriber lines (xDSL), such as asymmetric digital subscriber line (ADSL) and very-high-speed digital subscriber line (VDSL), permitting reliable high rate data transmission over hostile frequency-selective channels [1, 2]. It has recently been proposed for broadband downstream power-line communications due to its high flexibility in resource management [3]. A crucial aspect in the design of a DMT system is to allocate bits and power to the subchannels in an optimal way under various constraints. One of the problems of practical interest is margin maximization or transmission power minimization, also known as the margin adaptive (MA) problem [4].

Many optimal or suboptimal discrete bit-loading algorithms have been proposed for solving this problem. Among the algorithms in which the constraint of a target bit rate is considered, the computational complexity of the Hughes-Hartogs algorithm [5] and Chow's algorithm [6] is relatively high. There are also many computationally efficient algorithms, including the algorithms proposed by Piazzo [7, 8], the algorithm of Krongold et al. [9], and the Levin-Campello (LC) algorithms [4, 10, 11]. Later work takes into account more constraints, including the transmission power spectral density (PSD) mask and the maximum allowable size of the QAM constellations [12, 13]. A common feature of these algorithms is that they all use greedy bit-loading, either during the whole allocation process or after the initial allocation. To achieve the target rate, greedy bit-filling adds one bit at a time to the subchannel that requires the smallest additional power, while greedy bit-removal removes one bit at a time from the subchannel whose last bit requires the largest additional power. If the initial bit rate is far from the target rate, the computation load of these algorithms is heavy.
In [14], a multiple-bits loading procedure is introduced that converges faster to the optimal solution. Initially, the algorithm calculates two bit allocations, that is, the loop-representative bit allocation and the maximum bit rate allocation, to obtain the initial bit distribution, and then it performs multiple-bits loading to achieve the target rate. However, the extra cost paid in calculating the loop-representative bit allocation is not always helpful. When the target rate is high enough, the performance of the algorithm degrades compared to the greedy bit-removal algorithm [14].

In this paper, a heuristic optimal discrete bit allocation algorithm is proposed. The new algorithm starts from an initial equal power assignment bit distribution determined by the system PSD mask, and then employs a multistaged bit rate allocation scheme to meet the target rate. Specifically, if the total bit rate is far from the target rate, a multiple-bits loading procedure is used to obtain a bit allocation close to the target rate. When the total bit rate is close to the target rate, a parallel bit-loading procedure is used to achieve the target rate. This parallel bit-loading step is computationally more efficient than the conventional greedy bit-loading algorithm. The resulting bit distribution is not guaranteed to be optimal, so it is necessary to perform a clean-up operation using the LC efficientizing (EF) algorithm [4] to obtain the optimal solution. The algorithm achieves exactly the same optimal solutions as the algorithm in [14], but its computation load is on average much lower, which can be attributed to the speedup from the parallel bit-loading step.

The new bit-loading algorithm is explained in detail in Section 2. Simulation results and analysis are given in Section 3. Finally, conclusions are drawn in Section 4.

#### 2. THE NEW BIT-LOADING ALGORITHM

Assume a DMT system consisting of $M$ subcarriers.
The transmission power and bit rate (in bits/symbol) of subchannel $n$ ($n = 1, 2, \dots, M$) are $P_n$ and $b_n$, respectively. Assume that each subchannel $n$ has the pulse-response gain $H_n$ and that the noise, consisting of crosstalk and thermal noise, is modeled as additive white Gaussian noise (AWGN) with power $\sigma_n^2$. Then $P_n$ is related to $b_n$ by

$$P_n = P_n(b_n) = (2^{b_n} - 1) \frac{\Gamma}{\text{CNR}_n}, \tag{1}$$

where $\text{CNR}_n = |H_n|^2/\sigma_n^2$ is the subchannel gain-to-noise ratio (CNR) of subchannel $n$, and $\Gamma$ is the signal-to-noise ratio (SNR) gap (in dB) [4], which is given by

$$\Gamma = 10 \log_{10} \left( \frac{\left[ Q^{-1} (P_e/2) \right]^2}{3} \right) + \gamma_m - \gamma_c, \tag{2}$$

where $P_e$ is the given target probability of symbol error (PSE), $\gamma_m$ and $\gamma_c$ are the SNR margin and the coding gain, respectively, and $Q^{-1}(x)$ represents the inverse function of $Q(x)$, which is given by

$$Q(x) = \frac{1}{\sqrt{2\pi}} \int_{x}^{\infty} e^{-t^2/2} \, dt. \tag{3}$$

The MA problem considered can be stated as follows:

$$\min \sum_{n=1}^{M} P_n \quad \text{subject to } \sum_{n=1}^{M} b_n = B_T, \quad \sum_{n=1}^{M} P_n \le P_T, \quad 0 \le b_n \le \hat{b}_n, \quad b_n \in \mathbb{Z}_+, \quad n = 1, 2, \dots, M, \tag{4}$$

where $B_T$ and $P_T$ are the target bit rate and the total power budget,<sup>1</sup> respectively, $\hat{b}_n$ is the maximum bit rate of subchannel $n$, and $\mathbb{Z}_+$ represents the set of nonnegative integers. The maximum bit rate $\hat{b}_n$ is given by

$$\hat{b}_n = \min\{b_{\max}, \overline{b}_n\}, \quad n = 1, 2, \dots, M, \tag{5}$$

where $b_{\max}$ is the maximum allowable size of the QAM constellations and $\overline{b}_n$ is the bit rate determined by the maximum allowable power $\overline{P}_n$ imposed by the system PSD.
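
As a minimal sketch of (1) to (3), the following Python fragment evaluates the SNR gap and the power needed to carry a given number of bits on one subchannel. The function names are illustrative and do not come from the paper. The sketch also assumes that the gap of (2), which is expressed in dB, is converted to linear scale before it enters (1).

```python
import math
from statistics import NormalDist


def q_inverse(p: float) -> float:
    """Inverse of the Gaussian tail function Q(x) of Eq. (3)."""
    # Q(x) = Phi(-x), so Q^{-1}(p) = -Phi^{-1}(p).
    return -NormalDist().inv_cdf(p)


def snr_gap_db(p_e: float, margin_db: float, coding_gain_db: float) -> float:
    """SNR gap in dB following Eq. (2)."""
    uncoded = q_inverse(p_e / 2.0) ** 2 / 3.0
    return 10.0 * math.log10(uncoded) + margin_db - coding_gain_db


def power_for_bits(bits: int, cnr: float, gap_db: float) -> float:
    """Power P_n(b_n) of Eq. (1); the dB-to-linear conversion is an assumption of this sketch."""
    gap_linear = 10.0 ** (gap_db / 10.0)
    return (2 ** bits - 1) * gap_linear / cnr


if __name__ == "__main__":
    # Parameters quoted in Section 3: P_e = 1e-7, 4 dB margin, 4 dB coding gain.
    gap = snr_gap_db(1e-7, 4.0, 4.0)
    print(f"gap = {gap:.2f} dB, P(4 bits, CNR = 1e4) = {power_for_bits(4, 1e4, gap):.4g}")
```

With the Section 3 parameters the margin and the coding gain cancel, so the gap reduces to the uncoded term of (2), which is a little under 10 dB for $P_e = 10^{-7}$.
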
In practical systems, the maximum PSD of the system is typically flat over the transmission bandwidth, so $\overline{P}_n$ is a constant given by

$$\overline{P}_n = \Phi \cdot F, \quad n = 1, 2, \dots, M, \tag{6}$$

where $\Phi$ is the maximum PSD of the system and $F$ is the subchannel bandwidth. The bit rate $\overline{b}_n$ is given by

$$\overline{b}_n = \left\lfloor \log_2 \left( 1 + \frac{\overline{P}_n \cdot \text{CNR}_n}{\Gamma} \right) \right\rfloor, \tag{7}$$

where $\lfloor x \rfloor$ denotes the greatest integer that does not exceed $x$.

The new bit-loading algorithm consists of four steps. Initially, the algorithm calculates the maximum rate bit-loading distribution. Then, based on this bit distribution, the difference between the total bit rate $B$ and the target bit rate $B_T$ is used to calculate a loading parameter $a$. If the difference $|B - B_T|$ is large, the loading parameter is used in a multiple-bits loading procedure that adds or removes the same number of bits to or from all the subcarriers in a designated set to accelerate allocation. Next, when the bit difference $|B - B_T|$ is small and nonzero, parallel bit-filling or bit-removal is used to meet the target rate. Specifically, parallel bit-filling compares the transmission power increment $\Delta P_n(b_n + 1)$ ($0 \le b_n < b_{\max}$) of all the subcarriers in a designated set, and adds one bit to each of the $|B - B_T|$ least power-consumptive subcarriers, while parallel bit-removal compares the transmission power increment $\Delta P_n(b_n)$ ($0 < b_n \le b_{\max}$) of all the subcarriers in a designated set, and removes one bit from each of the $|B - B_T|$ largest power-consumptive subcarriers.
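
The maximum rate initialization of (5) to (7) and the parallel bit-removal described above can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, `gap` is the SNR gap in linear scale, the removal cost is obtained as $P_n(b_n) - P_n(b_n - 1)$ from (1), and `heapq.nlargest` is used as a convenient selection routine rather than as the procedure of the paper.

```python
import heapq
import math


def max_rate_bits(cnr, p_max, gap, b_max):
    """Eqs. (5)-(7): floor of the equal-power rate, capped at b_max."""
    return [min(b_max, math.floor(math.log2(1.0 + p_max * c / gap))) for c in cnr]


def removal_cost(bits, cnr, gap):
    """Power released by removing the last bit: P_n(b) - P_n(b - 1) from Eq. (1)."""
    return 2 ** (bits - 1) * gap / cnr


def parallel_bit_removal(bits, cnr, gap, count):
    """Remove one bit from each of the `count` most power-expensive loaded subcarriers."""
    loaded = [n for n, b in enumerate(bits) if b > 0]
    chosen = heapq.nlargest(
        count, loaded, key=lambda n: removal_cost(bits[n], cnr[n], gap)
    )
    result = list(bits)
    for n in chosen:
        result[n] -= 1
    return result


if __name__ == "__main__":
    cnr = [1000.0, 200.0, 40.0, 5.0]  # hypothetical CNR values, linear scale
    bits = max_rate_bits(cnr, p_max=1.0, gap=10.0, b_max=15)
    print(bits, parallel_bit_removal(bits, cnr, gap=10.0, count=2))
```

All `count` subcarriers are selected in a single pass over the designated set, which is the difference from greedy bit-removal, where the most expensive subcarrier is searched for again after every removed bit.
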
The transmission power increment $\Delta P_n(b_n)$ of subcarrier $n$ is

$$\Delta P_n(b_n) = P_n(b_n) - P_n(b_n - 1) = 2^{(b_n - 1)} \frac{\Gamma}{\text{CNR}_n}. \tag{8}$$

Finally, since the resulting distribution is not guaranteed to be optimal, the last step is to use the EF algorithm to check whether the target rate bit distribution is efficient. If there is no movement of a bit from one subchannel to another that reduces the total transmission power, then the resulting bit distribution is efficient. If the target rate bit distribution is efficient, it is also the optimal bit distribution; otherwise, the optimal bit distribution can be obtained by several bit swaps.

<sup>1</sup> If the power used for maximum bit rate allocation exceeds $P_T$, then the most power-expensive bits have to be removed to meet the power budget constraint, and the new bit distribution $\hat{b}_n$ determines the maximum bit rate allocation, as has been indicated in [14]. In practical situations, the power used for maximum bit rate allocation is usually less than $P_T$.

The following is the detailed algorithm.

(A) Initial maximum bit rate allocation.

- (1) Compute the equal power assignment discrete bit distribution $\overline{\mathbf{b}} = [\overline{b}_1 \ \overline{b}_2 \cdots \overline{b}_M]$, in which $\overline{b}_n$ ($n = 1, 2, \dots, M$) is calculated by (6) and (7).
- (2) Let the bit rate $b_n$ be the maximum bit rate calculated by (5). The total number of bits loaded in maximum bit rate allocation is $B = \sum_{n=1}^{M} b_n$. Generally, $B \ge B_T$. If $B = B_T$, go to step (D). If $B > B_T$, then the number of bits to be removed is $B_{\text{diff}} = B - B_T$, and the algorithm enters the target bit rate allocation.

(B) Multibit loading allocation.
Let

$$\widetilde{N} = \{ n : \overline{b}_n > b_{\max}, \ n = 1, 2, \dots, M \}, \qquad N = \{ n : 0 < \overline{b}_n \le b_{\max}, \ n = 1, 2, \dots, M \} \tag{9}$$

represent the index sets of the subcarriers that carry more than $b_{\max}$ bits and no more than $b_{\max}$ bits, respectively, during initialization. The cardinalities of $\widetilde{N}$ and $N$ are $\widetilde{L} = |\widetilde{N}|$ and $L = |N|$, respectively. Generally $L \neq 0$, as $L = 0$ holds only when $b_{\max} < \overline{b}_n$ or $\overline{b}_n = 0$ for all $n$, which is unrealistic for xDSL applications. Consider the complex case of $\widetilde{L} \neq 0$.<sup>2</sup> The maximum and the minimum of the difference between $\overline{b}_n$ ($n \in \widetilde{N}$) and $b_{\max}$ are

$$\overline{v} = \max_{n \in \widetilde{N}} (\overline{b}_n - b_{\max}), \qquad \underline{v} = \min_{n \in \widetilde{N}} (\overline{b}_n - b_{\max}), \tag{10}$$

respectively. Define the loading parameter $a = \lfloor B_{\text{diff}}/L \rfloor$. Multibit loading allocation, which is upper-bounded by $b_{\max}$ and lower-bounded by zero, is performed in such a way that the resulting bit distribution is a shifted version of the initial bit distribution $\overline{\mathbf{b}}$. Therefore, if $a$ ($a > 1$) bits were to be removed from subcarrier $n$ ($n \in N$), then $a - (\overline{b}_n - b_{\max})$ bits must be removed from subcarrier $n$ ($n \in \widetilde{N}_s$), where $\widetilde{N}_s = \{n : b_{\max} < \overline{b}_n < b_{\max} + a, \ n \in \widetilde{N} \}$; in other words, the number of bits carried by subcarrier $n$ ($n \in \widetilde{N}_s$) should be reduced to $\overline{b}_n - a$.
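
The shift-and-clip rule just described can be illustrated with a short sketch. The `min`/`max` formulation below is one reading of "a shifted version of $\overline{\mathbf{b}}$ that is upper-bounded by $b_{\max}$ and lower-bounded by zero"; the function name and the sample values are hypothetical.

```python
def shift_and_clip(b_bar, a, b_max):
    """Shift the equal-power distribution b_bar down by a bits, clipped to [0, b_max]."""
    return [min(b_max, max(0, b - a)) for b in b_bar]


if __name__ == "__main__":
    b_max = 15
    b_bar = [18, 16, 15, 9, 2, 0]           # hypothetical equal-power bit rates
    initial = shift_and_clip(b_bar, 0, b_max)  # maximum rate allocation of Eq. (5)
    shifted = shift_and_clip(b_bar, 2, b_max)  # multibit loading with a = 2
    removed = [i - s for i, s in zip(initial, shifted)]
    print(initial, shifted, removed)
```

In this example the subcarrier with $\overline{b}_n = 16$ lies in $\widetilde{N}_s$ and loses $a - (\overline{b}_n - b_{\max}) = 1$ bit, the subcarrier with $\overline{b}_n = 18$ stays at $b_{\max}$, and the subcarriers in $N$ lose $a = 2$ bits unless they reach zero first.
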
The following notations of subsets and their cardinalities will be used below:

$$\begin{aligned}
\widetilde{N}_{s1} &= \{n : \overline{b}_n = b_{\max} + \underline{v}, \ n \in \widetilde{N}\}, & \widetilde{L}_{s1} &= |\widetilde{N}_{s1}|; \\
\widetilde{N}_{s2} &= \{n : b_{\max} + \underline{v} < \overline{b}_n < b_{\max} + \underline{v} + a, \ n \in \widetilde{N}\}, & \widetilde{L}_{s2} &= |\widetilde{N}_{s2}|; \\
\widetilde{N}_{s3} &= \{n : \overline{b}_n = b_{\max} + \underline{v} + a, \ n \in \widetilde{N}\}, & \widetilde{L}_{s3} &= |\widetilde{N}_{s3}|; \\
\widetilde{N}_{s4} &= \{n : b_{\max} < \overline{b}_n < b_{\max} + \overline{v}, \ n \in \widetilde{N}\}, & \widetilde{L}_{s4} &= |\widetilde{N}_{s4}|; \\
N &= \{n : b_n > 0, \ n = 1, 2, \dots, M\}, & L &= |N|.
\end{aligned} \tag{11}$$

According to the value of $a$ and the relation among $a$, $\underline{v}$, and $\overline{v}$, several different bit allocation schemes can be determined.

- (1) $a = 0$.
  - Go to (1) of step (C).
- (2) $a = \underline{v}$.
  - (i) Remove $a$ bits from all the subcarriers in $N$, and update $B_{\text{diff}}$.
  - (ii) Go to (2) of step (C).
- (3) $\underline{v} < a < \overline{v}$.
  - (i) Remove $\underline{v}$ bits from all the subcarriers in $N$ and update $B_{\text{diff}}$.
  - (ii) Calculate the new loading parameter $a = \lfloor B_{\text{diff}}/(L + \widetilde{L}_{s1}) \rfloor$, remove $a$ bits from all the subcarriers in $N \cup \widetilde{N}_{s1}$, reduce the number of bits carried by the subcarriers in $\widetilde{N}_{s2}$ to $\overline{b}_n - \underline{v} - a$, and update $B_{\text{diff}}$.
  - (iii) Go to (3) of step (C).
- (4) $a = \overline{v}$.
  - (i) Remove $a$ bits from all the subcarriers in $N$, reduce the number of bits carried by the subcarriers in $\widetilde{N}_{s4}$ to $\overline{b}_n - a$, and update $B_{\text{diff}}$.
  - (ii) Calculate the new loading parameter $a = \lfloor |B_{\text{diff}}| / (L + \widetilde{L}_{s4}) \rfloor$, add $a$ bits to all the subcarriers in $N \cup \widetilde{N}_{s4}$, and update $B_{\text{diff}}$.
  - (iii) Go to (4) of step (C).
- (5) $\overline{v} < a$.
  - (i) Remove $\overline{v}$ bits from all the subcarriers in $N$, reduce the number of bits carried by the subcarriers in $\widetilde{N}_{s4}$ to $\overline{b}_n - \overline{v}$, and update $B_{\text{diff}}$.
  - (ii) Do the following loop. Calculate the new loading parameter $a = \lfloor B_{\text{diff}}/L \rfloor$. If $a < 0$, add $|a|$ bits to all the subcarriers in $N$, upper-bound $b_n$ with $b_{\max}$, and update $B_{\text{diff}}$; else if $a > 0$, remove $a$ bits from all the subcarriers in $N$, lower-bound $b_n$ with zero, and update $B_{\text{diff}}$; else if $a = 0$, break the loop and go to (5) of step (C).

(C) Parallel bit-loading allocation.

- (1) $a = 0$.
  Remove one bit from each of the $B_{\text{diff}}$ largest power-consumptive subcarriers in $N$.
- (2) $a = \underline{v}$.
  If $B_{\text{diff}} = 0$, go to step (D); else, remove one bit from each of the $B_{\text{diff}}$ largest power-consumptive subcarriers in $N \cup \widetilde{N}_{s1}$.
- (3) $\underline{v} < a < \overline{v}$.
  If $B_{\text{diff}} = 0$, go to step (D); else if $B_{\text{diff}} < 0$, add one bit to each of the $|B_{\text{diff}}|$ least power-consumptive subcarriers in $N \cup \widetilde{N}_{s1} \cup \widetilde{N}_{s2}$; else, remove one bit from each of the $B_{\text{diff}}$ largest power-consumptive subcarriers in $N \cup \widetilde{N}_{s1} \cup \widetilde{N}_{s2} \cup \widetilde{N}_{s3}$.
- (4) $a = \overline{v}$.
  If $B_{\text{diff}} < 0$, add one bit to each of the $|B_{\text{diff}}|$ least power-consumptive subcarriers in $N \cup \widetilde{N}_{s4}$; else, remove one bit from each of the $B_{\text{diff}}$ largest power-consumptive subcarriers in $N = \{n : b_n > 0, \ n = 1, 2, \dots, M\}$.
- (5) $\overline{v} < a$.
  If $B_{\text{diff}} = 0$, go to step (D); else if $B_{\text{diff}} < 0$, add one bit to each of the $|B_{\text{diff}}|$ least power-consumptive subcarriers in $N$; else, remove one bit from each of the $B_{\text{diff}}$ largest power-consumptive subcarriers in $N$.

<sup>2</sup> For the case of $\widetilde{L} = 0$, target bit rate allocation is performed by repeated multiple-bits loading until the value of the loading parameter $a$, where $a = \lfloor B_{\text{diff}}/L \rfloor$, is zero, and then parallel bit-loading is executed to achieve the target bit rate.

Table 1: Simulation results for ADSL loop T1.601#9 showing different allocation phases of the proposed algorithm.

| Target rate | Loading parameter | $B_{\text{diff}}$ after maximum rate allocation | Number of subtractions in multiple-bits loading | $B_{\text{diff}}$ for parallel bit-filling/bit-removal | $L$ for parallel bit-filling/bit-removal | Number of bit swaps in final allocation adjustment |
|------|------------------------------------|------|-----|-----|-----|---|
| 2864 | $a = 0$ | 151 | 0 | 151 | 216 | 0 |
| 2714 | $a = \underline{v}$ | 301 | 216 | 85 | 224 | 0 |
| 2563 | $\underline{v} < a < \overline{v}$ | 452 | 224 | 12 | 236 | 0 |
| 2111 | $a = \overline{v}$ | 904 | 242 | 14 | 242 | 0 |
| 1809 | $\overline{v} < a$ | 1206 | 491 | 39 | 249 | 0 |

(D) Final efficient adjustment of bit allocation.

As the initial bit distribution is not guaranteed to be optimal without incorporating the minimum power constraint, the target rate bit distribution is not guaranteed to be efficient, so the EF algorithm is employed and the following steps are executed.
- (1) Find the least power-consumptive subcarrier $n^+$ in $\widetilde{N}_n = \{n : 0 \le b_n < b_{\max}, \ n = 1, 2, \dots, M\}$.
- (2) Find the largest power-consumptive subcarrier $n^-$ in $\widetilde{N}_p = \{n : 0 < b_n \le b_{\max}, \ n = 1, 2, \dots, M\}$.
- (3) If $\Delta P_{n^+}(b_{n^+}+1) < \Delta P_{n^-}(b_{n^-})$, let $b_{n^+} = b_{n^+}+1$ and $b_{n^-} = b_{n^-}-1$, update $\Delta P_{n^+}(b_{n^+}+1)$ and $\Delta P_{n^-}(b_{n^-})$, and go back to (1); else, the algorithm ends.

In this way, the optimal bit distribution can be obtained after very few bit swaps. In many practical situations where the PSD is flat, the optimal bit distribution is already obtained after parallel bit-loading due to the discrete nature of the task. Hence, in most cases, this procedure only plays the role of checking whether the target rate bit distribution is optimal, and the bit swap procedure can be omitted.

#### 3. SIMULATION RESULTS AND ANALYSIS

Using the new bit-loading algorithm given in the previous section, we present extensive simulation results for various standard ADSL test loops and target rates. The ADSL loops employ a duplex transmission strategy with echo canceling, and the ADSL downlinks with subcarriers 7 through 255 loaded are tested. An AWGN floor of $-135$ dBm/Hz is assumed. For ADSL test loops T1.601#7, T1.601#9, and T1.601#13, an operating environment with 50 high bit rate DSL (HDSL) and 50 integrated services digital network (ISDN) crosstalkers is assumed. For the other ADSL test loops, an environment with 1 ADSL crosstalker is assumed. The total power budget is 100 mW, the PSD mask is $-40$ dBm/Hz, the SNR margin is 4 dB, the coding gain is 4 dB, and the target PSE is $P_e = 10^{-7}$. The maximum size of the QAM constellations is set at $b_{\max} = 15$.
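
Returning to step (D) of Section 2, the efficientizing check can be sketched as the loop below. It is a simplified illustration and not the authors' implementation: the names are hypothetical, `gap` is the SNR gap in linear scale, the per-bit cost follows (8), and every subcarrier is assumed to be limited by $b_{\max}$ only.

```python
def increment(bits, cnr, gap):
    """Eq. (8): power needed for the bits-th bit on a subcarrier."""
    return 2 ** (bits - 1) * gap / cnr


def efficientize(bits, cnr, gap, b_max):
    """Swap single bits while a swap lowers the total power; return (bits, swaps)."""
    bits = list(bits)
    swaps = 0
    while True:
        can_add = [n for n, b in enumerate(bits) if b < b_max]
        can_remove = [n for n, b in enumerate(bits) if b > 0]
        if not can_add or not can_remove:
            return bits, swaps
        # Cheapest bit to add and most expensive bit to remove.
        n_plus = min(can_add, key=lambda n: increment(bits[n] + 1, cnr[n], gap))
        n_minus = max(can_remove, key=lambda n: increment(bits[n], cnr[n], gap))
        cost_add = increment(bits[n_plus] + 1, cnr[n_plus], gap)
        cost_remove = increment(bits[n_minus], cnr[n_minus], gap)
        if cost_add < cost_remove:
            bits[n_plus] += 1
            bits[n_minus] -= 1
            swaps += 1
        else:
            return bits, swaps


if __name__ == "__main__":
    # Deliberately inefficient start on two hypothetical subcarriers.
    print(efficientize([1, 3], [100.0, 10.0], gap=1.0, b_max=15))
```

Each swap keeps the total number of bits unchanged and strictly lowers the total power, so the loop terminates. When the input distribution is already efficient, the first comparison fails and the function returns with zero swaps, which is the situation reported for most simulated cases.
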
Table 1 gives the numerical results of the corresponding parameters in the different allocation phases for ADSL test loop T1.601#9 [15]. The target rates 2864, 2714, 2563, 2111, and 1809 correspond to allocation schemes $a = 0$, $a = \underline{v}$, $\underline{v} < a < \overline{v}$, $a = \overline{v}$, and $\overline{v} < a$, respectively. Parameters given in Table 1 include the bit difference $B_{\text{diff}}$ after maximum bit rate allocation, the number of subtractions in performing the multiple-bits loading, the number of bits $B_{\text{diff}}$ allocated by parallel bit-filling or bit-removal, the cardinality $L$ of the designated subchannel set in which parallel bit-filling or bit-removal is performed, and the number of bit swaps in the final bit allocation adjustment. As shown in Table 1, the number of bit swaps in each case is zero. Simulation on other ADSL test loops under various target rates also shows that the number of bit swaps is at most 3, and in most cases the number of bit swaps is zero, meaning that the bit distribution is optimal after parallel bit-loading.

FIGURE 1: Bar chart of seven different bit distributions for ADSL loop T1.601#9.

Table 2: Simulation results showing the computation load of the proposed algorithm and that of existing algorithms.

| Test loop | Target rate | [14]: multiple-bits loading, subtraction/addition | [14]: greedy bit-loading, $B_{\text{diff}}$ | [14]: greedy bit-loading, $L$ | Proposed: multiple-bits loading, subtraction/addition | Proposed: parallel bit-loading, $B_{\text{diff}}$ | Proposed: parallel bit-loading, $L$ | [14]: A | [14]: C | Proposed: A | Proposed: C | Ratio A | Ratio C |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| T1.601#7 | 2262 | 248 | 70 | 238 | 232 | 19 | 244 | 387 | 16590 | 251 | 4446 | 1.54 | 3.73 |
| | 1759 | 0 | 185 | 244 | 493 | 29 | 237 | 369 | 44955 | 522 | 6438 | 0.71 | 6.98 |
| | 1257 | 244 | 212 | 218 | 493 | 71 | 210 | 667 | 46004 | 564 | 12354 | 1.18 | 3.72 |
| | 754 | 437 | 99 | 182 | 493 | 166 | 186 | 634 | 17919 | 659 | 17015 | 0.96 | 1.05 |
| | 251 | 413 | 93 | 118 | 784 | 18 | 98 | 598 | 10881 | 802 | 1593 | 0.75 | 6.83 |
| T1.601#13 | 2628 | 249 | 196 | 231 | 216 | 76 | 228 | 640 | 45080 | 292 | 14402 | 2.19 | 3.13 |
| | 2044 | 249 | 99 | 248 | 246 | 52 | 246 | 446 | 24453 | 298 | 11414 | 4.55 | 2.14 |
| | 1460 | 0 | 13 | 249 | 495 | 34 | 243 | 25 | 3224 | 529 | 7667 | 0.05 | 0.42 |
| | 876 | 240 | 115 | 204 | 495 | 154 | 206 | 469 | 23345 | 649 | 19789 | 0.72 | 1.18 |
| | 292 | 413 | 127 | 132 | 818 | 12 | 102 | 866 | 16637 | 830 | 1146 | 1.04 | 14.5 |
| CSA#4 | 2620 | 249 | 55 | 248 | 243 | 48 | 249 | 358 | 13585 | 291 | 10776 | 1.23 | 1.26 |
| | 2038 | 249 | 220 | 249 | 492 | 132 | 249 | 688 | 54560 | 624 | 24090 | 1.10 | 2.26 |
| | 1456 | 0 | 136 | 249 | 492 | 216 | 249 | 271 | 33728 | 708 | 30348 | 0.38 | 1.11 |
| | 873 | 245 | 202 | 241 | 492 | 60 | 239 | 648 | 48480 | 552 | 12510 | 1.17 | 3.88 |
| | 291 | 245 | 73 | 156 | 492 | 168 | 188 | 390 | 11315 | 660 | 17388 | 0.59 | 0.65 |
| CSA#6 | 2606 | 249 | 50 | 248 | 244 | 45 | 249 | 348 | 12350 | 289 | 10170 | 1.20 | 1.21 |
| | 2027 | 249 | 196 | 249 | 493 | 145 | 249 | 640 | 48608 | 638 | 25520 | 1.00 | 1.90 |
| | 1448 | 0 | 137 | 249 | 493 | 207 | 249 | 273 | 33976 | 700 | 30015 | 0.39 | 1.13 |
| | 869 | 246 | 196 | 240 | 493 | 45 | 237 | 637 | 46844 | 538 | 9630 | 1.18 | 4.86 |
| | 290 | 246 | 70 | 154 | 493 | 154 | 183 | 385 | 10710 | 647 | 16247 | 0.60 | 0.66 |
| CSA#7 | 2567 | 249 | 228 | 249 | 248 | 37 | 249 | 704 | 56544 | 285 | 8510 | 2.47 | 6.64 |
| | 1996 | 249 | 155 | 249 | 497 | 110 | 249 | 558 | 38440 | 607 | 21285 | 0.92 | 1.81 |
| | 1426 | 249 | 83 | 249 | 497 | 182 | 249 | 414 | 20584 | 679 | 28665 | 0.61 | 0.72 |
| | 856 | 0 | 238 | 248 | 497 | 5 | 243 | 475 | 58786 | 502 | 1200 | 0.95 | 48.99 |
| | 285 | 248 | 84 | 158 | 497 | 92 | 162 | 415 | 13188 | 589 | 10626 | 0.70 | 1.24 |
| CSA#8 | 2546 | 249 | 217 | 249 | 248 | 35 | 249 | 682 | 53816 | 283 | 8085 | 2.41 | 6.66 |
| | 1980 | 249 | 149 | 249 | 497 | 103 | 249 | 546 | 36952 | 600 | 20291 | 0.91 | 1.82 |
| | 1415 | 249 | 82 | 249 | 497 | 170 | 249 | 412 | 20336 | 667 | 27795 | 0.62 | 0.73 |
| | 849 | 0 | 235 | 246 | 497 | 238 | 246 | 469 | 57575 | 735 | 30107 | 0.64 | 1.91 |
| | 283 | 246 | 82 | 157 | 497 | 84 | 157 | 409 | 12792 | 581 | 9618 | 0.70 | 1.33 |
| Mid-CSA | 2795 | 249 | 174 | 248 | 235 | 75 | 248 | 596 | 42978 | 310 | 15750 | 1.92 | 2.73 |
| | 2174 | 249 | 51 | 249 | 497 | 199 | 249 | 350 | 12648 | 696 | 29651 | 0.50 | 0.43 |
| | 1553 | 249 | 177 | 249 | 497 | 73 | 249 | 602 | 43896 | 570 | 15476 | 1.06 | 2.84 |
| | 932 | 249 | 54 | 249 | 497 | 196 | 249 | 356 | 13392 | 693 | 29498 | 0.51 | 0.45 |
| | 311 | 245 | 73 | 164 | 497 | 74 | 164 | 390 | 11899 | 571 | 9361 | 0.68 | 1.27 |

Figure 1 shows the bar chart of seven different bit distributions for loop T1.601#9. Bit distributions number 5 to number 1 are the optimal bit distributions corresponding to allocation schemes $a = 0$, $a = \underline{v}$, $\underline{v} < a < \overline{v}$, $a = \overline{v}$, and $\overline{v} < a$, respectively.
Bit distributions number 7 and number 6 correspond to the initial equal power assignment bit distribution $\overline{\mathbf{b}}$ and the maximum bit rate distribution, respectively.

To evaluate the computational efficiency of the proposed algorithm, we compare the main computation load of the proposed algorithm with that of the algorithm in [14] for ADSL test loops T1.601#7, T1.601#13, CSA#4, CSA#6, CSA#7, CSA#8, and Mid-CSA [15], with target bit rates corresponding to 90%, 70%, 50%, 30%, and 10% of the loop's maximum bit rate. The computation load of the proposed algorithm is mainly determined by the operations in performing multiple-bits loading and parallel bit-loading, while that of the algorithm in [14] is mainly determined by the operations in performing multiple-bits loading and greedy bit-loading. For the same number of bits $B_{\text{diff}}$ to be allocated in a subchannel set with the same number of subchannels $L$, parallel bit-loading performs the $B_{\text{diff}}$ adjustments in *one step*, compared to the $B_{\text{diff}}$ steps of greedy bit-loading, and is thus computationally more efficient. Assume that the transmission power increment of each subchannel is obtained beforehand. Parallel bit-loading requires $(L-1)+(L-2)+\cdots+(L-B_{\text{diff}})$ comparisons and $B_{\text{diff}}$ additions or subtractions, while greedy bit-loading requires $(L-1)\cdot B_{\text{diff}}$ comparisons, $B_{\text{diff}}$ additions or subtractions, and an extra $B_{\text{diff}}-1$ multiplications or divisions for updating the transmission power increment. The number of comparisons, the basic operation, of parallel bit-loading is $(B_{\text{diff}}-1)\cdot B_{\text{diff}}/2$ less than that of greedy bit-loading.

Table 2 shows the experimental results for the number of subtractions and/or additions in performing the multiple-bits loading, the number of bits $B_{\text{diff}}$ allocated by parallel bit-loading or greedy bit-loading, and the cardinality $L$ of the designated subchannel set in which parallel bit-loading or greedy bit-loading is performed.
The main computation load of the two algorithms, which is calculated based on these results, depends on two kinds of operations, that is, arithmetic operations and comparisons, which are represented by the symbols "A" and "C" in Table 2, respectively. The computation load of the minor adjustment using the EF algorithm is low, as it obtains the optimal solution with the minimum number of bit swaps. Specifically, the number of bit swaps for each scenario of Table 2 is zero.

The number of "A" operations for the proposed algorithm is the sum of two parts: the number of subtractions or additions for multiple-bits loading and the number of subtractions or additions $B_{\text{diff}}$ for parallel bit-loading. The number of "A" operations for the algorithm in [14] is the sum of three parts: the number of subtractions or additions for multiple-bits loading, the number of subtractions or additions $B_{\text{diff}}$ for greedy bit-loading, and the number of multiplications or divisions $B_{\text{diff}}-1$ for updating the transmission power increment. The number of "C" operations for the proposed algorithm is $(L-1)+(L-2)+\cdots+(L-B_{\text{diff}})$, while that for the algorithm in [14] is $(L-1) \cdot B_{\text{diff}}$. To facilitate comparison of the computation load of the two algorithms, the ratios of the number of operations for the algorithm in [14] to the number of corresponding operations for the proposed algorithm are also provided.

As can be seen from Table 2, the number of "C" operations is much larger than that of "A" operations, meaning that parallel bit-loading and greedy bit-loading play the most important part in determining the computation load of the proposed algorithm and the algorithm in [14], respectively, and that comparison is the basic operation of the two algorithms. The smaller the values of $B_{\text{diff}}$ and $L$ are, the lighter the computation load is.
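
The operation counts defined above are easy to check numerically. The sketch below, with hypothetical function names, evaluates the "A" and "C" formulas and reproduces the first row of Table 2 (loop T1.601#7, target rate 2262).

```python
def parallel_comparisons(L, b_diff):
    """(L - 1) + (L - 2) + ... + (L - b_diff) comparisons for parallel bit-loading."""
    return sum(L - k for k in range(1, b_diff + 1))


def greedy_comparisons(L, b_diff):
    """(L - 1) * b_diff comparisons for greedy bit-loading."""
    return (L - 1) * b_diff


def proposed_arithmetic(multi_ops, b_diff):
    """'A' count of the proposed algorithm: multiple-bits loading plus parallel bit-loading."""
    return multi_ops + b_diff


def reference_arithmetic(multi_ops, b_diff):
    """'A' count of the algorithm in [14], including b_diff - 1 increment updates."""
    return multi_ops + b_diff + (b_diff - 1)


if __name__ == "__main__":
    # First row of Table 2: [14] uses 248 / 70 / 238, the proposed algorithm 232 / 19 / 244.
    a_ref, c_ref = reference_arithmetic(248, 70), greedy_comparisons(238, 70)
    a_new, c_new = proposed_arithmetic(232, 19), parallel_comparisons(244, 19)
    print(a_ref, c_ref, a_new, c_new)
    print(round(a_ref / a_new, 2), round(c_ref / c_new, 2))
```

For equal $L$ and $B_{\text{diff}}$, the difference `greedy_comparisons(L, b) - parallel_comparisons(L, b)` equals $(B_{\text{diff}}-1)\cdot B_{\text{diff}}/2$, in agreement with the saving stated in the text.
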
Obviously, the main computation load of the proposed algorithm, that is, the number of "C" operations, is much lower than that of the algorithm in [14] in most cases. So it can be expected that the proposed algorithm is faster than the algorithm in [14] except when the algorithm in [14] ends up with a low value of B diff. Using order-statistic selection algorithm [16], parallel bit-loading can be performed in O(L) time. As $L \leq M$ , the proposed algorithm is as efficient as the LC algorithms which has the computational complexity of O(M), and more efficient than the algorithms of Piazzo [8] and Krongold et al. [9], both of which have the computational complexity of $O(M \cdot \log M)$ . #### 4. CONCLUSION In this paper, a heuristic optimal discrete bit allocation algorithm for margin maximization in DMT systems is presented. Compared to existing multiple-bits-loading-based algorithm which calculates an initial efficient bit calculation whatever the target bit rate is, the proposed algorithm is more flexible in that it performs bit swaps only when the target bit allocation is not efficient. Compared to conventional greedy bit-loading algorithm, the introduced parallel bit-loading algorithm is computationally more efficient. Numerical results on the standard ADSL test loops show the reduced computational load of our algorithm in comparison with existing multiple-bits-loading-based algorithm. The idea of our algorithm can also be applied to bit allocation in other DMT transmission systems. #### **ACKNOWLEDGMENTS** The authors wish to thank the anonymous reviewers for their constructive and detailed comments and suggestions which help to improve the quality of the paper. This work was supported by the China Postdoctoral Science Foundation under Grant no. 2006039083. #### **REFERENCES** - [1] J. M. Cioffi, V. Oksman, J.-J. Werner, et al., "Very-high-speed digital subscriber lines," *IEEE Communications Magazine*, vol. 37, no. 4, pp. 72–79, 1999. - [2] J. A. C. 
Bingham, ADSL, VDSL, and Multicarrier Modulation, John Wiley & Sons, New York, NY, USA, 2000. - [3] E. Del Re, R. Fantacci, S. Morosi, and R. Seravalle, "Comparison of CDMA and OFDM techniques for downstream power-line communications on low voltage grid," *IEEE Transactions on Power Delivery*, vol. 18, no. 4, pp. 1104–1109, 2003. - [4] J. M. Cioffi, "Advanced Digital Communication," EE379C Course Textbook, Stanford University, 2002. - [5] D. Hughes-Hartogs, "Ensemble modem structure for imperfect transmission media," U.S. Patents, 4,679,227 (July 1987), 4,731,816 (March 1988), and 4,833,706 (May 1989). - [6] P. S. Chow, J. M. Cioffi, and J. A. C. Bingham, "A practical discrete multitone transceiver loading algorithm for data transmission over spectrally shaped channels," *IEEE Transactions on Communications*, vol. 43, no. 2–4, pp. 773–775, 1995. - [7] L. Piazzo, "Fast algorithm for power and bit allocation in OFDM systems," *Electronics Letters*, vol. 35, no. 25, pp. 2173–2174, 1999. - [8] L. Piazzo, "Fast optimal bit-loading algorithm for adaptive OFDM systems," Internal Report 002-04-03, INFOCOM Department, University of Rome, Rome, Italy, 2003. - [9] B. S. Krongold, K. Ramchandran, and D. L. Jones, "Computationally efficient optimal power allocation algorithms for multicarrier communication systems," *IEEE Transactions on Communications*, vol. 48, no. 1, pp. 23–27, 2000. - [10] J. Campello, "Optimal discrete bit loading for multicarrier modulation systems," in *Proceedings of IEEE International Symposium on Information Theory*, p. 193, Cambridge, Mass, USA, August 1998. - [11] H. E. Levin, "A complete and optimal data allocation method for practical discrete multitone systems," in *Proceedings of IEEE Global Telecommunications Conference (GLOBECOM '01)*, vol. 1, pp. 369–374, San Antonio, Tex, USA, November 2001. - [12] R. V. Sonalkar and R. R. Shively, "An efficient bit-loading algorithm for DMT applications," *IEEE Communications Letters*, vol. 4, no. 3, pp. 
80–82, 2000.
- [13] A. Fasano, "On the optimal discrete bit loading for multicarrier systems with constraints," in *Proceedings of the 57th IEEE Semiannual Vehicular Technology Conference (VTC '03)*, vol. 2, pp. 915–919, Jeju, South Korea, April 2003.
- [14] N. Papandreou and T. Antonakopoulos, "A new computationally efficient discrete bit-loading algorithm for DMT applications," *IEEE Transactions on Communications*, vol. 53, no. 5, pp. 785–789, 2005.
- [15] T. Long, J. M. Cioffi, and F. Liu, *XDSL Technology and Applications*, Publishing House of Electronics Industry, Beijing, China, 2002.
- [16] U. Manber, *Introduction to Algorithms: A Creative Approach*, Pearson Education Asia Limited and Publishing House of Electronics Industry, Beijing, China, 2005.

Li-Ping Zhu received the B.S. degree in communications engineering in 1992 and the M.S. degree in communications and electronics system in 1995, both from Dalian Maritime University, Dalian, China, and the Ph.D. degree in circuits and systems in 2004 from Shanghai Jiao Tong University, Shanghai, China. She has been with the State Key Laboratory on Microwave & Digital Communications in the Department of Electronic Engineering at Tsinghua University since April 2005. Her main research interests lie in the area of signal processing for communications, with particular emphasis on antijam spread-spectrum communications, wavelet theory and applications, performance analysis, and resource allocation for communication systems.

Yan Yao graduated from Tsinghua University, Beijing, China, in 1962, and joined the Department of Electronic Engineering, serving as Assistant Professor, Associate Professor, and Professor. He was Director of the State Key Laboratory on Microwave & Digital Communications and Vice Chairman of the Radio-Electronic Research Institute at Tsinghua University. He has been teaching and researching in the field of wireless and digital communications for more than 40 years.
His current academic field is communication and electronic systems; his research directions include broadband transmission, personal communication systems and networks, software radio technology, and antifading and antijamming techniques in wireless communications. He is also a Fellow of CIC, a Senior Member of CIE, and a Senior Member of IEEE.

Shi-Dong Zhou is a Professor at Tsinghua University, China. He received the Ph.D. degree in communication and information systems from Tsinghua University in 1998. He received the B.S. and M.S. degrees in wireless communication from Southeast University, Nanjing, China, in 1991 and 1994, respectively. From 1999 to 2001, he was in charge of several projects in the China 3G Mobile Communication R and D Program. He is now a Member of China's FuTURE Project. His research interests are in the area of wireless and mobile communications.

Shi-Wei Dong received the Ph.D. degree in circuits and systems from Northwestern Polytechnical University in 2003. He is now with the National Key Laboratory of Space Microwave Technology, Xi'an Institute of Space Radio Technology. His research interests include space microwave technology, satellite communications, and electromagnetic compatibility of information technology systems.