PAPR reduction in SFBC MIMO MC-CDMA systems via user reservation

The combination of multicarrier code-division multiple access (MC-CDMA) with multiple-input multiple-output technology is attractive for broadband wireless communications. However, the large values of the peak-to-average power ratio (PAPR) of the signals transmitted on different antennas can lead to nonlinear distortion and a subsequent degradation of the system performance. In this article, we propose a PAPR reduction scheme for space-frequency block coding MC-CDMA downlink transmissions that does not require any processing at the receiver side because it is based on the addition of signals employing the spreading codes of inactive users. As the minimization of the PAPR leads to a second-order cone programming problem that can be too cumbersome for a practical implementation, some strategies to mitigate the complexity of the proposed method are also explored.


Introduction
Several approaches to combine multicarrier modulation with code-division multiple access (CDMA) techniques have been proposed with the aim of bringing the best of both worlds to wireless communications [1]. Among these, multi-carrier CDMA (MC-CDMA), also known as orthogonal frequency division multiplexing CDMA (OFDM-CDMA), offers several key advantages such as immunity against narrowband interference and robustness in frequency-selective fading channels [2]. Such desirable properties make MC-CDMA an attractive choice for the present and future radio-communication systems; among these, we have satellite communications [3], high-frequency band modems [4], and systems based on the concept of cognitive radio [5].
In spite of its advantages, MC-CDMA shares with other multicarrier modulations a common problem: the usually high values of the peak-to-average power ratio (PAPR) of the transmitted signals. As multicarrier modulations are more sensitive than single carrier systems to nonlinearities in the RF high-power amplifier (HPA) [6], this latter component would be required to operate with a high output back-off value to reduce the risk of entering into the nonlinear part of its input-output characteristics. However, raising the back-off dramatically decreases the power efficiency of the HPA, a fact that seriously limits the applicability of multicarrier modulations in battery-operated portable devices and on-board satellite transmitters.
The need to reduce the PAPR in multicarrier systems has spurred the publication of a number of PAPR mitigation schemes for OFDM, such as clipping and filtering [7], block coding [8], partial transmit sequences [9,10], selected mapping [11,12], and tone reservation (TR) [13]; most of these methods are also applicable, with minor modifications, to MC-CDMA systems [14,15]. Other PAPR reduction algorithms have been developed specifically for MC-CDMA signals, such as spreading code selection [16-18] and subcarrier scrambling [19]. In general, the PAPR is reduced either at the expense of distorting the transmitted signals, thus increasing the bit error rate (BER) at the receiver, or by reducing the information data rate, usually because high-PAPR signals are discarded and replaced by others with lower PAPR before being transmitted [20].
On the other hand, multiple-input multiple-output (MIMO) techniques based on either space-time block coding or space-frequency block coding (SFBC) can be combined with multicarrier modulations to provide spatial diversity without requiring multiple antennas at the receiver. However, SFBC is preferable in the presence of fast fading conditions because all the redundant information is sent simultaneously through different antennas and subcarriers [21,22]. The problem of PAPR reduction in SFBC MIMO-OFDM has also been addressed by different authors, using extensions of techniques developed for the single-input single-output case [23-25].
In this article, we further explore a PAPR reduction technique previously proposed by the authors, namely the user reservation (UR) approach [26]. The UR technique is based on the addition of peak-reducing signals to the signal to be transmitted; these new signals are selected so that they are orthogonal to the original signal and, therefore, can be removed at the receiver without the need of transmitting any side information, and, ideally, without penalizing the BER. In the UR method, these peak-reducing signals are built by using spreading codes that are either dynamically selected from those users that are known to be idle, or deliberately reserved a priori for PAPR reduction purposes. The concept of adding orthogonal signals for peak power mitigation has been previously proposed to reduce PAPR in Discrete MultiTone and OFDM transmissions [13,27], and also in CDMA downlink systems [28]. However, to the authors' knowledge, the implementation of this idea in the context of MIMO MC-CDMA communications has never been addressed. In this study, our aim is also to develop strategies to alleviate the inherent complexity of the underlying minimization problem.
The rest of the article is structured as follows: Section "MC-CDMA with SFBC" defines basic concepts related to SFBC MC-CDMA. Section "PAPR reduction via UR" describes the UR technique. Section "Dimension reduction" explores the possibility of reducing the complexity of the optimization problem. Section "Iterative clipping" develops an iterative UR method. Section "Experimental results" presents some simulations that show the potential of the UR approach. Finally, the article ends with some conclusions.

MC-CDMA with SFBC
In the next subsections, we describe the architecture of an SFBC MIMO MC-CDMA transmitter, and define the basic terms related to the PAPR of the involved signals.

System model
In an MC-CDMA system, a block of M information symbols from each active user is spread in the frequency domain onto N = LM subcarriers, where L represents the spreading factor. This is accomplished by multiplying every symbol of the block for user k, where $k \in \{0, 1, \dots, L-1\}$, by a spreading code $\{c_l^{(k)},\ l = 0, 1, \dots, L-1\}$ selected from a set of L orthogonal sequences, thus allowing a maximum of L simultaneous users to share the same radio channel. The spreading codes are the usual Walsh-Hadamard (WH) sequences, which are the columns of the Hadamard matrix of order L, $\mathbf{C}_L$. If L is a power of 2, the Hadamard matrix is constructed recursively as
$$\mathbf{C}_1 = 1, \qquad \mathbf{C}_L = \mathbf{C}_2 \otimes \mathbf{C}_{L/2} = \begin{bmatrix} \mathbf{C}_{L/2} & \mathbf{C}_{L/2} \\ \mathbf{C}_{L/2} & -\mathbf{C}_{L/2} \end{bmatrix} \qquad (1)$$
where the symbol "⊗" denotes the Kronecker tensor product.
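The recursive construction of the WH codes can be illustrated with a short sketch. It assumes the usual Sylvester form of the recursion, in which each step stacks the previous matrix as [[C, C], [C, -C]]; the function names `hadamard`, `column`, and `dot` are illustrative only and do not come from the article.

```python
def hadamard(order):
    """Return the Sylvester-type Hadamard matrix of the given order (a power of 2)."""
    if order < 1 or order & (order - 1):
        raise ValueError("order must be a power of 2")
    matrix = [[1]]
    while len(matrix) < order:
        # Assumed recursion: C_2n = C_2 (x) C_n = [[C_n, C_n], [C_n, -C_n]]
        top = [row + row for row in matrix]
        bottom = [row + [-v for v in row] for row in matrix]
        matrix = top + bottom
    return matrix


def column(matrix, k):
    """Spreading code of user k: the k-th column (0-based) of the matrix."""
    return [row[k] for row in matrix]


def dot(u, v):
    """Real inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


C4 = hadamard(4)
codes = [column(C4, k) for k in range(4)]
# Distinct codes are orthogonal; each code has squared norm equal to its length.
gram = [[dot(codes[i], codes[j]) for j in range(4)] for i in range(4)]
```

Here `gram` is 4 times the identity matrix, which is the orthogonality property that lets up to L users share the channel.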
We will assume in the sequel that, of the maximum of L users of the MC-CDMA system, only $K_a < L$ are "active," i.e., they are transmitting information symbols, while the other $K_b = L - K_a$ remain "inactive" or "idle." We will further assume that there is a "natural" indexing for all the users based on their WH codes, where the index associated with a given user is the number of the column that its code sequence occupies in the order-L Hadamard matrix. For notational convenience, we will assume throughout the article that column numbering begins at 0, so that
In the downlink transmitter, each spread symbol of every active user is added to the spread symbols of the remaining active users, and the resulting sums are interleaved to form a set of N = LM complex amplitudes as follows:
The space-frequency encoder then maps the complex amplitudes to two different antennas according to an Alamouti [29] scheme, resulting in the following vectors:
where $(\cdot)^*$ denotes complex conjugate.
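As an illustration of the space-frequency mapping, the following sketch applies one common Alamouti pairing to adjacent subcarriers. The exact sign and ordering convention used in the article's equations is not recoverable from the text, so the convention below is an assumption, and the function name `sfbc_alamouti` is hypothetical.

```python
def sfbc_alamouti(x):
    """Map an even-length list of complex amplitudes onto two antennas.

    ASSUMED convention (not taken from the article): for each subcarrier
    pair (x[2m], x[2m+1]),
      antenna 1 carries ( x[2m],   -conj(x[2m+1]) )
      antenna 2 carries ( x[2m+1],  conj(x[2m])   )
    """
    if len(x) % 2:
        raise ValueError("need an even number of subcarriers")
    x1, x2 = [], []
    for m in range(0, len(x), 2):
        a, b = complex(x[m]), complex(x[m + 1])
        x1 += [a, -b.conjugate()]
        x2 += [b, a.conjugate()]
    return x1, x2


x = [1 + 2j, 3 - 1j, -2j, 0.5]
x1, x2 = sfbc_alamouti(x)
# Within every subcarrier pair, the two antenna streams are orthogonal.
pair_products = [
    x1[m] * x2[m].conjugate() + x1[m + 1] * x2[m + 1].conjugate()
    for m in range(0, len(x), 2)
]
```

Whatever the exact convention, the defining property is that the two antenna streams are orthogonal over each subcarrier pair, which is what `pair_products` checks.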
Finally, the components of both vectors, $\mathbf{x}^{(1)}$ and $\mathbf{x}^{(2)}$, are employed to modulate a set of N subcarriers with a frequency spacing of 1/T, where T is the duration of a block, so that the complex baseband signals to be transmitted by each antenna are
In practice, the OFDM modulation of Equation 5 is implemented in discrete time via an inverse discrete Fourier transform (IDFT). The whole processing in the transmitter is depicted in Figure 1.
If we sample $s_1(t)$ and $s_2(t)$ at multiples of $T_s = T/(NQ)$, where Q is the oversampling factor, then we obtain the discrete-time version of Equation 5 which, taking into account Equation 4, can be rewritten in vector notation as
where the components of vectors $\mathbf{s}_1$ and $\mathbf{s}_2$ are, respectively, the NQ samples of the baseband signals $s_1(t)$ and $s_2(t)$ in the block, $\mathbf{W}$ is an NQ × N matrix formed by the first N columns of the IDFT matrix of order NQ, $\mathbf{I}_N^e$ and $\mathbf{I}_N^o$ are diagonal matrices of order N with alternating patterns of 1s and 0s along their main diagonals (with the 1s occupying either even or odd positions, respectively), $\mathbf{Z}$ is the lower shift matrix of order N, and $\mathbf{x}$ is the vector of N complex amplitudes obtained after spreading and interleaving the data symbols as defined in Equation 3:
where $\mathbf{a}$ is the vector of $K_a M$ symbols of the $K_a$ active users to be transmitted, $\mathbf{C}_L^a$ is an L × $K_a$ matrix whose columns are the WH codes of the active users, and $\mathbf{I}_M$ is the identity matrix of order M.
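The role of the matrix formed by the first N columns of the order-NQ IDFT matrix can be sketched as follows. This is a minimal pure-Python illustration for a single antenna; the unnormalised kernel exp(+j2πmn/(NQ)) is an assumption (the article's scaling is not recoverable here), and `oversampled_idft` is a hypothetical helper name.

```python
import cmath


def oversampled_idft(x, Q):
    """Return the N*Q samples s = W x, where W holds the first N columns of
    the order-N*Q IDFT matrix.

    ASSUMPTION: unnormalised kernel exp(+j*2*pi*m*n/(N*Q)); any constant
    scale factor does not affect the PAPR.
    """
    N = len(x)
    NQ = N * Q
    return [
        sum(x[m] * cmath.exp(2j * cmath.pi * m * n / NQ) for m in range(N))
        for n in range(NQ)
    ]


x = [1, -1, 1j, 0.5]
critical = oversampled_idft(x, 1)      # N samples, no oversampling
oversampled = oversampled_idft(x, 4)   # 4N samples of the same waveform
```

Every Q-th sample of the oversampled output coincides with the critically sampled IDFT, so oversampling only adds intermediate points that help locate the true peaks of the continuous-time signal.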
It is straightforward to check that matrices I e N and I o N , as defined in Equation 9, verify the conditions:

PAPR properties
The PAPR of a complex signal s(t) can be defined as the ratio of the peak envelope power to the average envelope power:
$$\mathrm{PAPR} = \frac{\max_{0 \le t < T} |s(t)|^2}{E\left(|s(t)|^2\right)}$$
where E(·) represents the expectation operation.
In the MIMO case, we will correspondingly extend the definition of PAPR as
where $N_T$ is the number of transmitter antennas. In our case, the computation of the peak is performed on the discrete-time version of $s_p(t)$ given by Equation 6; this approximation is justified if the oversampling factor Q is sufficiently high.
As the PAPR is a random variable, an adequate statistic is needed to characterize it. A common choice is to use the complementary cumulative distribution function (CCDF), which is defined as the probability of the PAPR exceeding a given threshold $\gamma$:
$$\mathrm{CCDF}(\gamma) = \Pr\left(\mathrm{PAPR} > \gamma\right) \qquad (14)$$
It should be noted that the distribution of the PAPR of MC-CDMA signals differs substantially from that of other multicarrier modulations. For instance, in OFDM, the subcarrier complex amplitudes can be assumed to be independent random variables, so that by applying the Central Limit Theorem, the baseband signal is usually assumed to be a complex Gaussian process. However, in MC-CDMA the subcarrier amplitudes generally exhibit strong dependencies because of the poor autocorrelation properties of WH codes; this fact, in turn, translates into a baseband signal that is no longer Gaussian-like, but instead has mostly low values with sharp peaks at regular intervals. This effect is particularly evident when the number of active users is low. Thus, we should expect higher PAPR values as the load of the system decreases.
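The sketch below shows how the PAPR and an empirical CCDF might be estimated for one antenna. It makes several simplifying assumptions that are not part of the article: the average power is replaced by the sample mean over one block, the IDFT is unnormalised, and a single user with the all-ones WH code and M = 1 stands in for a lightly loaded system. The helper names `idft_samples`, `papr`, and `ccdf` are hypothetical.

```python
import cmath


def idft_samples(x, Q):
    """Oversampled, unnormalised IDFT of the subcarrier amplitudes x."""
    N = len(x)
    NQ = N * Q
    return [
        sum(x[m] * cmath.exp(2j * cmath.pi * m * n / NQ) for m in range(N))
        for n in range(NQ)
    ]


def papr(samples):
    """Peak power divided by the mean power of one block of samples
    (the block mean is used here in place of the expectation)."""
    powers = [abs(v) ** 2 for v in samples]
    return max(powers) / (sum(powers) / len(powers))


def ccdf(papr_values, threshold):
    """Fraction of the observed PAPR values that exceed the threshold."""
    return sum(p > threshold for p in papr_values) / len(papr_values)


# One active user with the all-ones WH code of length 8 (M = 1 assumed):
# the time-domain signal is a single sharp peak, so the PAPR is large.
single_user = [1.0] * 8
papr_single = papr(idft_samples(single_user, Q=4))
```

For this toy case the PAPR equals the number of subcarriers (a linear ratio of 8), which illustrates why a lightly loaded MC-CDMA signal can be much "peakier" than a Gaussian-like OFDM signal.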

PAPR Reduction via UR
In this article, our approach to PAPR reduction is based on "borrowing" some of the spreading codes of the set of inactive users, so that an adequate linear combination of these codes is added to the active users before the SFBC operation. The coefficients of such linear combination ("pseudo-symbols") should be chosen with the intention that the peaks of the signal in the time domain are reduced. As the added signals are orthogonal to the original ones, the whole process is transparent at the receiver side.
The addition of inactive users is simply performed by replacing the complex amplitudes of Equation 3 with
which can also be expressed in vector notation as
where $\mathbf{b}$ is the vector of $K_b M$ pseudo-symbols of the $K_b$ inactive users to be determined, and $\mathbf{C}_L^b$ is an L × $K_b$ matrix whose columns are the WH codes of the idle users. Substituting Equation 16 in 6, we can decompose the signal vectors in two components:
$$\mathbf{s}_1 = \mathbf{s}_1^a + \mathbf{s}_1^b, \qquad \mathbf{s}_2 = \mathbf{s}_2^a + \mathbf{s}_2^b \qquad (17)$$
with $\mathbf{s}_1^a$ and $\mathbf{s}_2^a$ depending only on the symbols of the active users, and $\mathbf{s}_1^b$ and $\mathbf{s}_2^b$ obtained using the pseudo-symbols of the inactive users:
where the matrices involved in Equation 18 are, according to Equations 6 and 16:
If we concatenate the signal vectors of the two antennas, we can express Equation 17 more compactly using Equation 18:
and
Thus, our objective to minimize the PAPR is to find the values of the pseudo-symbols $\mathbf{b}$ that minimize the peak value of the amplitudes of the components of vector $\mathbf{s}$ in Equation 20:
The minimization involved in Equation 23 may be formulated as a second-order cone programming (SOCP) convex optimization problem [30]:
Solving Equation 24 in real time can be a daunting task, and we are, thus, interested in reducing the complexity of the optimization problem. Two approaches will be explored in the sequel:
(a) Reducing the dimension of the optimization variable $\mathbf{b}$.
(b) Using suboptimal iterative algorithms to approximately solve Equation 24.
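For orientation, a peak-minimization problem of this kind is commonly written in epigraph form, as sketched below. This is not a transcription of the article's Equation 24. The notation is assumed: $s_n^a$ is the nth sample of the concatenated active-user signal, and $\mathbf{f}_n^T$ and $\mathbf{g}_n^T$ are the nth rows of the matrices F and G that map the pseudo-symbols and their conjugates to the time domain.

```latex
\begin{aligned}
\min_{t,\;\mathbf{b}} \quad & t \\
\text{subject to} \quad &
\left| s^{a}_{n} + \mathbf{f}_{n}^{T}\mathbf{b} + \mathbf{g}_{n}^{T}\mathbf{b}^{*} \right| \le t,
\qquad n = 0, 1, \dots, 2NQ - 1
\end{aligned}
```

Each constraint bounds the magnitude of an expression that is linear in the real and imaginary parts of b, which is a second-order cone constraint. The number of constraints grows with 2NQ and the number of unknowns with $K_b M$, which is why both the dimension of b and the solver itself are targets for simplification.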

Dimension reduction
We will see in the next subsections that not all the inactive users need to be included in Equation 16 to reduce the PAPR, i.e., the number $K_b$ can be considerably less than the "default" $L - K_a$ while obtaining exactly the same reduction in the peak value of the signal vector. This fact is a direct consequence of the specific structure of the Hadamard matrices.

Periodic properties of WH sequences
The particular construction of Hadamard matrices forces their columns to follow highly structured patterns, thus making WH codes depart substantially from ideal pseudo-noise sequences. The most important characteristic of WH sequences that affects their Fourier properties is the existence of inner periodicities, i.e., groups of binary symbols (1 or -1) that are replicated along the whole length of the code. This periodic behavior of WH codes in the frequency domain leads to the appearance of characteristic patterns in the time domain, with many zero values that give the amplitude of the resulting signal a "peaky" aspect. This somewhat "sparse" nature of the IDFT of WH codes is, in turn, responsible for the high PAPR values we usually find in MC-CDMA signals.
For the applicability of our UR technique, it is important to characterize the periodic properties of WH codes. This is because PAPR reduction is possible only if we add in Equation 15 those inactive users whose WH codes have time-domain peaks occupying exactly the same positions as those of the active users, so that, with a suitable choice of the pseudo-symbols, a reduction of the amplitudes of the peaks is possible. As we will see, this characterization of WH sequences will lead us to group them in sets of codes, where the elements of a given set share the property that any idle user with a code belonging to the set can be employed to reduce the peaks produced by other active users with codes of the same set.
A careful inspection of the recursive algorithm described in Equation 1 for generating the Hadamard matrix of order n, $\mathbf{C}_n$ (with n a power of two), shows that two columns of this matrix are generated using a single column of the matrix of order n/2, $\mathbf{C}_{n/2}$. If we denote as $\mathbf{c}_{n/2}^{(k)}$ the kth column of $\mathbf{C}_{n/2}$ (k = 0, 1, ..., n/2 - 1), then it can be seen that the two columns of the matrix $\mathbf{C}_n$ generated by $\mathbf{c}_{n/2}^{(k)}$ are
$$\mathbf{c}_n^{(k)} = \begin{bmatrix} \mathbf{c}_{n/2}^{(k)} \\ \mathbf{c}_{n/2}^{(k)} \end{bmatrix} \qquad (25a)$$
$$\mathbf{c}_n^{(k+n/2)} = \begin{bmatrix} \mathbf{c}_{n/2}^{(k)} \\ -\mathbf{c}_{n/2}^{(k)} \end{bmatrix} \qquad (25b)$$
Equation 25a implies that the first n/2 columns of the order-n Hadamard matrix are formed by concatenating the columns of the order-n/2 matrix with an identical copy of themselves, so that the periodicities of the original columns are preserved. On the other hand, Equation 25b implies that the last n/2 columns of the order-n Hadamard matrix are formed by concatenating the columns of the order-n/2 matrix with a copy of themselves, but with the sign of their elements changed; therefore, the periodicities in the columns of the original matrix are now destroyed by the copy-and-negate operation in the last n/2 columns:
If we denote as P the minimum length of a pattern of binary symbols that is repeated an integer number of times along any given column of the Hadamard matrix of order n (period length), then we can see by inspection that the first column (formed by n 1s) has P = 1, and the second column (formed by a repeated alternating pattern of 1s and -1s) has P = 2; then, by recursively applying properties 1, 2, and 3, we can build Table 1.
Note from Table 1 that, for an Lth-order Hadamard matrix, there are $\log_2 L + 1$ different periods in its columns. Note also that, for P > 1, the number of WH sequences with the same period is half the length of the period.
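These two observations about the periods can be checked numerically with a brute-force sketch. It assumes the Sylvester construction of the Hadamard matrix and 0-based column numbering; the helper names `hadamard`, `period`, and `periods_by_column` are illustrative.

```python
from collections import Counter


def hadamard(order):
    """Sylvester-type Hadamard matrix (order must be a power of 2)."""
    matrix = [[1]]
    while len(matrix) < order:
        matrix = ([row + row for row in matrix]
                  + [row + [-v for v in row] for row in matrix])
    return matrix


def period(seq):
    """Smallest P dividing len(seq) such that seq is a repetition of its first P entries."""
    n = len(seq)
    for p in range(1, n + 1):
        if n % p == 0 and all(seq[i] == seq[i % p] for i in range(n)):
            return p


def periods_by_column(L):
    """Period of every WH code (column) of the order-L Hadamard matrix."""
    C = hadamard(L)
    return [period([row[k] for row in C]) for k in range(L)]


periods = periods_by_column(16)
counts = Counter(periods)  # how many codes share each period
```

For L = 16 the periods found are 1, 2, 4, 8, and 16 (five values, i.e., $\log_2 16 + 1$), and each period P > 1 is shared by P/2 codes.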

Selection of inactive users
The periodic structure of the WH codes determines their behavior in the time domain because the number and
Table 1 Periods of the WH codes of length L
As a result of this fact, idle users can only mitigate the PAPR of signals generated by active users with the same periodic patterns in their codes. This is because only those users will be able to generate signals with their peaks located in the same time instants (and with opposite signs) as the peaks of the active users, so that these latter peaks can be reduced. Therefore, we conclude that we need to include in Equation 24 only those idle users whose WH codes have the same period as any of the active users currently in the system. The choice of inactive users can be easily obtained with the help of Table 1 and the selection rule can be summarized as follows: For every active user $k_a \in A$ (with $k_a > 1$), select for the optimization of Equation 24 only the inactive users $k_b \in B$ such that $\lfloor \log_2 k_b \rfloor = \lfloor \log_2 k_a \rfloor$, where $\lfloor \cdot \rfloor$ denotes the "integer part."
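The selection rule can be sketched in a few lines. The sketch assumes 0-based user indices and uses Python's `int.bit_length()`, which equals $\lfloor \log_2 k \rfloor + 1$ for k ≥ 1 and 0 for k = 0, so that users 0 and 1 each form a group of their own (consistent with the rule applying only to $k_a > 1$). The names `period_group` and `select_inactive` are hypothetical.

```python
def period_group(k):
    """Group label of WH code k: 0 for k = 0, floor(log2 k) + 1 otherwise."""
    return k.bit_length()


def select_inactive(active, L):
    """Inactive users whose WH codes share a period with some active user."""
    active = set(active)
    groups = {period_group(k) for k in active}
    return [k for k in range(L)
            if k not in active and period_group(k) in groups]


# Example with L = 32 and three active users.
helpers = select_inactive([0, 5, 9], 32)
```

In this example, user 0 has no peers, user 5 shares its period with users 4 to 7, and user 9 with users 8 to 15, so only 10 of the 29 idle users would enter the optimization.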

Iterative clipping
The SOCP optimization of Equation 24, solved with interior-point methods, requires $O((NQ)^{3/2})$ operations [30]. Although the structure of the matrices involved could be exploited to reduce the complexity, it is desirable to devise simpler suboptimal algorithms, whose complexity only grows linearly with the number of subcarriers. This can be accomplished if we adopt a strategy of iterative clipping of the time-domain signal.

Design of clipping signals
Iterative clipping is based on the addition of peak-reducing signals so that, at the ith iteration, the signal vector of Equation 20 is updated as
$$\mathbf{s}^{(i+1)} = \mathbf{s}^{(i)} + \mathbf{r}^{(i)} \qquad (26)$$
where $\mathbf{r}^{(i)}$ is a "clipping vector" that is designed to reduce the magnitude of one or more of the samples of the signal vector. Note that, as the clipping vector should cause no interference to the active users, it must be generated as
$$\mathbf{r}^{(i)} = \mathbf{F}\mathbf{b}^{(i)} + \mathbf{G}\mathbf{b}^{(i)*} \qquad (27)$$
where the matrices $\mathbf{F}$ and $\mathbf{G}$ were defined in Equations 22 and 19, and $\mathbf{b}^{(i)} \in \mathbb{C}^{K_b M}$.
We now suppose that, at the ith iteration, we want to clip the set of samples of vector $\mathbf{s}^{(i)}$, $\{s_u^{(i)},\ u \in U^{(i)}\}$,
where $\boldsymbol{\delta}_u$ is the length-2NQ discrete-time impulse delayed by u samples (Equation 29) and $\{\alpha_u^{(i)},\ u \in U^{(i)}\}$ is a set of suitably selected complex coefficients.
Note, however, that, as we require vector $\mathbf{r}^{(i)}$ to be of the form given by Equation 27, it is not possible, in general, to synthesize the set of required time-domain impulses using only symbols from the inactive users, and so every term $\alpha$
so that the actual clipping vector would result in
which, using Equation 30, can be easily shown to be in agreement with the restriction of Equation 27. A straightforward way to approximate a scaled impulse vector $\alpha\boldsymbol{\delta}_u$ using only inactive users is obtained by minimizing a distance between vectors $\alpha\boldsymbol{\delta}_u$ and $\mathbf{d}_u(\alpha)$:
$$\mathbf{b}_u(\alpha) = \arg\min_{\mathbf{b}} \left\| \alpha\boldsymbol{\delta}_u - \mathbf{F}\mathbf{b} - \mathbf{G}\mathbf{b}^* \right\|_p \qquad (32)$$
where $\|\cdot\|_p$ denotes the p-norm. When p = 2, we have the least-squares (LS) solution:
$$\mathbf{b}_u(\alpha) = \arg\min_{\mathbf{b}} \boldsymbol{\varepsilon}^H \boldsymbol{\varepsilon} \qquad (33)$$
with $(\cdot)^H$ denoting conjugate transpose and the error vector $\boldsymbol{\varepsilon}$ is
$$\boldsymbol{\varepsilon} = \alpha\boldsymbol{\delta}_u - \mathbf{F}\mathbf{b} - \mathbf{G}\mathbf{b}^* \qquad (34)$$
Then, to perform the optimization defined by Equation 33, we need to solve the equation:
where ∇ is the complex gradient operator [31]. The computation of the gradient can be simplified if we take into account the following properties of the matrices $\mathbf{F}$ and $\mathbf{G}$, which can be deduced from their definitions in Equations 22 and 19:
so that we obtain the following optimal vector of pseudo-symbols $\mathbf{b}$ as the solution of Equation 35:
Now, replacing Equation 37 in 30, we get the LS approximation to $\alpha\boldsymbol{\delta}_u$ as
In the case under study, as $\boldsymbol{\delta}_u$ is a real vector, $\boldsymbol{\delta}_u = \boldsymbol{\delta}_u^*$, and Equation 38 reduces to
$$\mathbf{d}_u(\alpha) = \left(\alpha\mathbf{P} + \alpha^*\mathbf{Q}\right)\boldsymbol{\delta}_u \qquad (39)$$
where the square matrices $\mathbf{P}$ and $\mathbf{Q}$ (of order 2NQ) are defined as
Taking into account from Equation 29 that $\boldsymbol{\delta}_u$ is just the uth column of $\mathbf{I}_{2NQ}$, we conclude that the LS approximation to a scaled unit impulse vector centered at position u, $\alpha\boldsymbol{\delta}_u$, is the uth column of matrix $\mathbf{P}\alpha + \mathbf{Q}\alpha^*$, with $\mathbf{P}$ and $\mathbf{Q}$ as defined in Equation 40. Note that, in general, $\mathbf{P}$ and $\mathbf{Q}$ are not circulant matrices; this is in contrast with the projection onto convex sets approach for PAPR mitigation in OFDM [27] and related methods, where the functions utilized for peak reduction are obtained by circularly shifting and scaling a single basic clipping vector.
Several approaches can be found in the literature for the iterative minimization of the PAPR in OFDM based on TR. Among these, one that exhibits fast convergence is the active-set approach [32]. As we will see in the sequel, it can be readily adapted to simplify the UR method for PAPR reduction in SFBC MIMO MC-CDMA.

Active-set method
Iterative clipping procedures based on gradient methods tend to have slow convergence due to the use of the non-ideal impulses $\mathbf{d}_u$ of Equation 38 in the clipping process because they must satisfy the restriction given by Equation 30. As they have non-zero values outside the position of their maximum, any attempt to clip a peak of the signal at a given discrete time u using $\mathbf{d}_u$ can potentially give rise to unexpected new peaks at other positions of the signal vector.
On the contrary, the active-set approach [32] keeps the maximum value of the signal amplitude controlled, so that it always gets reduced at every iteration of the algorithm. An outline of the procedure of the active-set method follows [33]:
(1) Find the component of s with the highest magnitude (peak value).
(2) Clip the signal by adding inactive users so that the peak value is balanced with another secondary peak. Now, we have two peaks with the same magnitude, which is lower than the original maximum.
(3) Add again inactive users to simultaneously reduce the magnitudes of the two balanced peaks until we get three balanced peaks.
(4) Repeat this process with more peaks until either the magnitudes of the peaks cannot be further reduced significantly or a maximum number of iterations is reached.
Note that, at the ith stage of the algorithm, we have an active set $\{s_u^{(i)},\ u \in U^{(i)}\}$ of signal peaks that have the same maximum magnitude:
$$\left|s_u^{(i)}\right| = R^{(i)},\ u \in U^{(i)}; \qquad \left|s_n^{(i)}\right| < R^{(i)},\ n \in \bar{U}^{(i)} \qquad (41)$$
where $R^{(i)}$ is the peak magnitude, and $\bar{U}^{(i)}$ is the complement of the set $U^{(i)}$. The problem at this point is, thus, to find a clipping vector $\mathbf{r}^{(i)}$ generated as in Equation 31 that, when added to the signal $\mathbf{s}^{(i)}$ as in Equation 26, will satisfy two conditions:
(a) The addition of the clipping vector must keep the magnitudes of the components of the current active set balanced.
(b) The addition of the clipping vector should reduce the value of the peak magnitude until it reaches the magnitude of a signal sample that was previously outside the active set.
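Both conditions can easily be met if we design the vector $\mathbf{r}^{(i)}$ in two stages: first, we obtain a vector $\mathbf{z}^{(i)}$ as a suitable combination of non-ideal scaled impulses of the form given by Equation 30 that satisfies condition (a):
$$\mathbf{z}^{(i)} = \sum_{u \in U^{(i)}} \mathbf{d}_u\!\left(\beta_u^{(i)}\right) \qquad (42)$$
and then, we compute a real number to scale vector $\mathbf{z}^{(i)}$ until condition (b) is met. Therefore, the final update equation for the signal vector is
$$\mathbf{s}^{(i+1)} = \mathbf{s}^{(i)} + \mu^{(i)}\mathbf{z}^{(i)} \qquad (43)$$
where $\mu^{(i)}$ is a convenient step-size.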
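A simple way to ensure that $\mathbf{z}^{(i)}$ satisfies condition (a) is to force its components at the locations of the peaks to be of unit magnitude and to have the opposite signs to the signal peaks in the current active set:
$$z_u^{(i)} = -\frac{s_u^{(i)}}{R^{(i)}}, \quad u \in U^{(i)} \qquad (44)$$
because then, according to Equations 41, 43, and 44,
$$\left|s_u^{(i+1)}\right| = \left|s_u^{(i)} + \mu^{(i)} z_u^{(i)}\right| = R^{(i)} - \mu^{(i)}, \quad u \in U^{(i)} \qquad (45)$$
Therefore, if we use the minimum $\ell_2$-norm approximation to the scaled impulses, and taking into account Equations 42, 44, and 39, the set of coefficients $\{\beta_u^{(i)},\ u \in U^{(i)}\}$ is obtained as the solution of the system of equations:
$$\sum_{v \in U^{(i)}} \left( p_{u,v}\,\beta_v^{(i)} + q_{u,v}\,\beta_v^{(i)*} \right) = -\frac{s_u^{(i)}}{R^{(i)}}, \quad u \in U^{(i)} \qquad (46)$$
where $p_{u,v}$ and $q_{u,v}$ are, respectively, the elements of the uth row and vth column of matrices $\mathbf{P}$ and $\mathbf{Q}$ defined in Equation 40.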
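Once the vector $\mathbf{z}^{(i)}$ is computed, the step-size $\mu^{(i)}$ is determined by forcing the new peak magnitude $R^{(i+1)}$ to be equal to the highest magnitude of the components of $\mathbf{s}^{(i+1)}$ not in the current active set:
$$R^{(i+1)} = R^{(i)} - \mu^{(i)} = \max_{n \in \bar{U}^{(i)}} \left| s_n^{(i)} + \mu^{(i)} z_n^{(i)} \right| \qquad (47)$$
Therefore, we can consider the possible samples to be included in the next active set, $\{s_n^{(i)},\ n \in \bar{U}^{(i)}\}$, and associate with each of them a candidate positive step-size $\mu_n^{(i)}$ that satisfies
$$\left| s_n^{(i)} + \mu_n^{(i)} z_n^{(i)} \right| = R^{(i)} - \mu_n^{(i)}, \quad n \in \bar{U}^{(i)} \qquad (48)$$
so that we select as step-size the minimum of all the candidates:
$$\mu^{(i)} = \min_{n \in \bar{U}^{(i)}} \mu_n^{(i)} \qquad (49)$$
and its associated signal sample enters the new active set. This choice ensures that no other sample exceeds the magnitude of the samples in the current active set because we have the smallest possible reduction in the peak magnitude.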
Squaring both sides of Equation 48 and rearranging terms, we find that $\mu_n^{(i)}$ satisfies a quadratic equation with two real roots, and so we choose for $\mu_n^{(i)}$ the smallest positive root, given by [34]:
where $\Re(\cdot)$ denotes the real part. The overall complexity of the active-set method can be alleviated if we reduce the number of possible samples to enter the active set, so that we need to compute only a small number of candidate step-sizes. For instance, the authors of [33] propose a technique based on the prediction at the ith stage of a tentative step-size μ (i) , and so the candidate samples are only those that verify the condition
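The candidate step-size computation can be sketched as follows. The sketch assumes that the condition being squared is |s_n + μ z_n| = R − μ, with R the current peak magnitude, s_n a sample outside the active set, and z_n the corresponding component of the clipping direction. Squaring and rearranging then gives (1 − |z_n|²)μ² − 2(R + Re(s_n z_n*))μ + (R² − |s_n|²) = 0. The closed form used below, c / (b + sqrt(b² − ac)), is an algebraically equivalent and numerically stable way of writing the smallest positive root; it is not a transcription of the article's expression, and the function names are hypothetical.

```python
import math


def candidate_step(s_n, z_n, R):
    """Smallest positive mu with |s_n + mu*z_n| = R - mu, assuming |s_n| < R."""
    s_n, z_n = complex(s_n), complex(z_n)
    a = 1.0 - abs(z_n) ** 2                    # quadratic coefficient
    b = R + (s_n * z_n.conjugate()).real       # half of minus the linear coefficient
    c = R ** 2 - abs(s_n) ** 2                 # constant term (positive if |s_n| < R)
    disc = b * b - a * c
    if disc < 0:
        return math.inf                        # no real crossing
    denom = b + math.sqrt(disc)
    return c / denom if denom > 0 else math.inf


def step_size(s, z, active, R):
    """Minimum candidate step over the samples outside the active set,
    together with the index of the sample that joins the active set."""
    candidates = {n: candidate_step(s[n], z[n], R)
                  for n in range(len(s)) if n not in active}
    n_new = min(candidates, key=candidates.get)
    return candidates[n_new], n_new


# Toy example: sample 0 is the current peak (R = 1) and z[0] = -s[0]/R.
s = [1.0, 0.5 + 0.2j, -0.3j]
z = [-1.0, 0.3 - 0.1j, 0.2]
mu, n_new = step_size(s, z, {0}, 1.0)
```

After the update s + mu*z, the old peak and the newly selected sample have the same magnitude R − mu, which is the balancing behavior that the active-set method relies on.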

Experimental results
The performance of the UR algorithm was tested by simulating an Alamouti SFBC MC-CDMA system under the conditions listed in Table 2.
For comparison purposes, Figure 2 represents the estimated CCDF of the PAPR, as defined in Equations 13 and 14, obtained under three different conditions for the system load: 8, 16, and 24 active users, respectively. The $K_a = 8$ case represents a "low load" situation (since only 25% of the maximum number of users are active), $K_a = 16$ is an intermediate condition, with half of the potential users active, and a system with $K_a = 24$ (75% of the maximum) can be considered as highly loaded. In all the cases, we have compared the PAPR of the transmitted signal in the original SFBC MC-CDMA system with the one obtained when the UR method is applied, using either the exact optimization given by Equation 24 or the suboptimal active-set approach. For the latter algorithm, we have employed in the clipping procedure, represented by Equations 42 and 43, the approximate scaled impulses given by Equation 39.
It is evident from Figure 2 that, as expected, for an unmodified SFBC MC-CDMA system described by Equations 3-5, the PAPR can become very high, especially if the number of active users is small. Note also that it is precisely in cases of low and moderate load ($K_a = 8$ and $K_a = 16$ in our example) that the PAPR reduction provided by the UR method is most significant. This is because, as $K_a$ decreases, more inactive users are available and the dimensionality of vector $\mathbf{b}$ in Equation 16 increases, providing additional degrees of freedom to the optimization procedure described in Equation 24. We can also see from Figure 2 that the active-set approach gets close to the optimal if a sufficient number of iterations are allowed. Note that there is an upper bound for this parameter: the number of iterations cannot exceed the size of vector $\mathbf{b}$ because the matrix that is involved in the linear system given by Equation 46 then becomes singular.
Table 2 (only the following rows are recoverable)
Data symbols per user in a frame (M): 4
Number of subcarriers (N): 128
Oversampling factor (Q): [value missing]

Conclusions
The UR scheme for the reduction of the PAPR of the signal transmitted in an SFBC MIMO MC-CDMA downlink was explored in this article. This approach does not require any modification at the receiver side because it is based on the addition of the spreading codes of users that are inactive. The optimization procedure provides significant improvements in PAPR, especially when the number of active users is relatively low.
The inherent complexity of the SOCP optimization involved in the method can be alleviated if we select only inactive users with WH codes that share the same periods as those of the active users in the system. For further computational savings, suboptimal procedures can be applied to reduce the PAPR; these are based on the idea of iteratively clipping the original signal in the time domain via the addition of impulse-like signals that are synthesized using the WH codes of inactive users.