EURASIP Journal on Applied Signal Processing 2005:11, 1645–1655
© 2005 Hindawi Publishing Corporation

Design of a Baseband Transceiver for Multicarrier CDMA Communications

Multicarrier systems have become popular for their spectral efficiency and robustness against frequency-selective fading. Multicarrier code-division multiple access (MC-CDMA) is a technique that combines the advantages of multicarrier modulation with those of code-division multiple access (CDMA) to offer reliable high-data-rate downlink cellular communication services. In this paper, we present the architecture of a downlink baseband transceiver using the MC-CDMA technology under the same bandwidth requirement and channel condition as the third-generation wideband CDMA system. In the transmitter, a scrambling code is applied in order to reduce the peak-to-average power ratio (PAPR) of the transmitter output. In the receiver, we use a joint weighted least-squares (WLS) synchronization error estimation algorithm and a novel channel estimator. Both algorithms greatly enhance the system error-rate performance, as indicated by functional simulation. Simulation results also verify maximum aggregate coded data rates of 5.4/10.8 Mbps from 32/64 users in mobile/stationary multipath fading channels with a rate-3/4 convolutional code, respectively.


INTRODUCTION
Direct-sequence spread-spectrum (DSSS) CDMA has been adopted in the third-generation (3G) mobile communication standard to provide high capacity and high transmission rate over conventional schemes such as frequency-division multiple access (FDMA) and time-division multiple access (TDMA). However, it is also well known that, due to the inherently wide bandwidth of spread-spectrum systems, severe frequency-selective fading degrades system performance. Moreover, multiple-access interference (MAI) can limit its application if there is no diversity or forward-error-correction coding. Recently, one type of multicarrier modulation, known as orthogonal frequency-division multiplexing (OFDM), has drawn much attention due to its ability to combat frequency-selective multipath fading and to utilize spectrum resources efficiently. In an OFDM system, a frequency-selective-faded wideband channel is partitioned into a large number of flat-faded narrowband subchannels, each of which allows simple yet effective equalization. In addition, the signals in these subchannels (subcarriers) overlap with one another, but are kept mutually orthogonal so that reliable high-data-rate transmission is possible.
By combining the above two popular wireless communication techniques, communication system designers have three different approaches that offer the advantages of both CDMA and OFDM. The three approaches are multicarrier CDMA (MC-CDMA), multicarrier direct-sequence CDMA (MC-DS-CDMA), and multitone CDMA (MT-CDMA). The MC-CDMA transmitter spreads input data in the frequency domain over several subcarriers using a given spreading code. The orthogonality among subcarriers is maintained by setting the frequency spacing to the inverse of the symbol time, as in the case of OFDM [1,2]. The received signals are transformed into the frequency domain and despread. In this way, signals scattered among many subcarriers can be gathered to decide the transmitted data, and frequency diversity is achieved. In the MC-DS-CDMA system, the available spectrum is divided into several subcarriers and each subcarrier carries a DSSS signal with direct-sequence spreading in the time domain. All the subcarriers are kept mutually orthogonal by setting the frequency spacing to the inverse of the chip time [3]. Compared to a wideband DSSS transmission, this scheme lowers the chip rate in each subcarrier so that it is easier to achieve chip synchronization as well as code acquisition. The MT-CDMA transmitter spreads several data streams independently in the time domain and positions the subcarriers that carry those spread signals with a frequency spacing equal to the inverse of the symbol time prior to spreading [4]. Although the MT-CDMA scheme has the advantage of better spectral efficiency, it suffers from strong intercarrier interference (ICI) even with perfect synchronization. It has been shown in [5] that MC-DS-CDMA and MT-CDMA are favorable for systems using noncoherent modulation. Nevertheless, when all the subcarriers are coherently demodulated, such as in the case of downlink transmission, MC-CDMA outperforms the other two approaches [6].
Although the MC-CDMA technology has great potential as a candidate for future downlink wireless communication systems, it also has some drawbacks. The high peak-to-average power ratio (PAPR) inherited from the OFDM technology demands a stringent linearity specification on the power amplifier and a wide dynamic range on the analog-to-digital converter (ADC) and digital-to-analog converter (DAC). Saturation of the power amplifier output introduces clipping noise in all subcarriers. Reducing the high PAPR in MC-CDMA systems remains a challenge [7]. MC-CDMA reception is also susceptible to synchronization errors, which are known to corrupt the orthogonality among subcarriers. In this case, both intercarrier interference (ICI) and MAI appear because signals from other subcarriers and other users corrupt the target subcarrier of the desired user [8]. Hence MC-CDMA receivers must adopt accurate synchronization error estimation algorithms to attain acceptable performance. Additionally, in multipath fading channels, channel estimation must be able to track channel variation and provide accurate channel information for the equalizer, because more precise channel estimation enhances the detected signal quality [9]. MAI may also arise because the orthogonality among spreading codes may be destroyed in multipath fading channels.
Numerous researchers have worked on the issues mentioned above. But to the best of our knowledge, there exists no report on a complete MC-CDMA transceiver that integrates all necessary functional blocks. This paper presents a downlink MC-CDMA system and evaluation of its performance in a setting that has channel impairments and less-than-ideal analog front end. In the proposed transmitter, a scrambling code is adopted to scramble the subcarrier signals so as to reduce PAPR. In the receiver, two main tasks, synchronization and detection, are performed. The synchronization block includes a symbol boundary detector, a carrier recovery loop, and a timing recovery loop. We implement coarse frequency acquisition that has fast settling speed and frequency tracking that further enhances the synchronization performance. A novel joint carrier frequency offset and timing frequency offset estimator based on a weighted least-squares (WLS) technique is proposed for tracking residual synchronization errors. In the detection block, the channel state is continuously estimated by a new low-complexity pilot-aided frequency-domain interpolation algorithm. Thereafter a threshold orthogonality restoring combining (TORC) technique is adopted to suppress MAI and limit noise enhancement.
In the following, system specification is presented in Section 2. Section 3 then delineates the transmitter architecture of the MC-CDMA system. Algorithms and architecture of the functional blocks in the proposed receiver are described in Section 4. Simulation results that demonstrate attractive system performance are presented in Section 5. Finally, Section 6 concludes this paper.

SYSTEM SPECIFICATION
The proposed MC-CDMA system aims at increasing the downlink transmission data rate of the current 3G wideband-CDMA (WCDMA) cellular communication system in urban areas. To this end, the proposed system uses the same RF frequency around 2 GHz and a 5 MHz signal bandwidth. To facilitate front-end channel-selection filter design, 5% bandwidth on both edges of the signal band is reserved as guard bands. The ADC sampling rate is 5.76 MHz, which is chosen as 1.5 times the chip rate in the WCDMA system, 3.84 MHz, to allow for reconfigurable dual-mode front-end design.
The length of the guard interval and the symbol duration are set according to the system coverage. The channel models provided by 3GPP specify that the maximum excess delay is 2.14 microseconds in typical urban areas, and the excess delay of the first path in the second cluster can be as long as 10 microseconds in bad urban areas. The highest mobility supported is 120 km/h [10]. In light of all the above specifications, the guard interval is set to be longer than 10 microseconds. The maximum Doppler frequency caused by the highest mobility, 222 Hz, limits the symbol duration to be shorter than 200 microseconds. Given the sampling rate and the symbol duration, the fast Fourier transform (FFT) size is set to 1024, of which 768 subcarriers are used to transmit data while an additional 33 pilot subcarriers are uniformly inserted for synchronization and channel estimation. The MC-CDMA system uses three signal constellations, namely, QPSK, 16-QAM, and 64-QAM. The orthogonal variable-spreading-factor (OVSF) codes [11] spread user signals over a number of subcarriers. The choices in OVSF code length as well as signal constellation make multiuser and multirate transmission possible, as in the 3G WCDMA system. The system parameters are summarized in Table 1.
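The design constraints in this section can be checked with a few lines of arithmetic. The sketch below is only an illustration: it assumes a speed of light of 3 × 10^8 m/s, takes the 64-sample cyclic prefix from the receiver description later in the paper, and assumes that the symbol duration includes the guard interval.

```python
# Numerology check using the figures quoted in the text.
C_LIGHT = 3.0e8             # speed of light in m/s (assumed constant)
F_CARRIER = 2.0e9           # RF carrier frequency in Hz
F_SAMPLE = 5.76e6           # ADC sampling rate in Hz
N_FFT = 1024                # FFT size
N_GUARD = 64                # cyclic prefix samples (value given in the receiver section)
N_DATA, N_PILOT = 768, 33   # data and pilot subcarriers

speed = 120 / 3.6                                    # 120 km/h expressed in m/s
doppler_hz = speed * F_CARRIER / C_LIGHT             # maximum Doppler frequency
guard_us = N_GUARD / F_SAMPLE * 1e6                  # guard interval in microseconds
symbol_us = (N_FFT + N_GUARD) / F_SAMPLE * 1e6       # symbol duration including guard (assumption)
occupied_mhz = (N_DATA + N_PILOT) * F_SAMPLE / N_FFT / 1e6  # bandwidth of used subcarriers

print(f"Doppler: {doppler_hz:.1f} Hz, guard: {guard_us:.2f} us, "
      f"symbol: {symbol_us:.2f} us, occupied: {occupied_mhz:.3f} MHz")
```

Under these assumptions the guard interval is slightly longer than 10 microseconds, the symbol is shorter than 200 microseconds, and the used subcarriers occupy roughly the 4.5 MHz left after reserving 5% guard bands on both edges of the 5 MHz band.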

TRANSMITTER ARCHITECTURE
The transmitter block diagram is shown in Figure 1. Data from each user first pass through an interleaver and a mapper before they are spread by an OVSF code and then combined with other users' signals. Note that an FEC code should be applied before the interleaver, but since we focus on the inner transceiver, the FEC encoder is omitted in the diagram. The combined frequency-domain signals are multiplexed with the pilot signals and then scrambled to achieve a certain degree of decorrelation that lowers the PAPR in the transmitter output. A generic OFDM modulator then transforms the scrambled frequency-domain signals to a time-domain MC-CDMA signal, $s_n$. Note that training symbols, which consist entirely of known subcarrier data and carry no user data, are transmitted periodically so as to enable continuous and reliable transmission.
Since the length of a user's OVSF code may be much smaller than 768, the number of data subcarriers, the user may transmit several data symbols in one MC-CDMA symbol. In a particular MC-CDMA symbol, denote the $l$th data symbol of the $u$th user by $d_{u,l}$. To achieve maximum frequency diversity, we spread the data $d_{u,l}$ on subcarriers that are as far apart as possible. Moreover, adjacent subcarriers are assigned to different data that have been spread by the same chip [2,12].
Comb-type evenly distributed pilot subcarriers facilitate synchronization and channel estimation tasks in the receiver. The number of pilot subcarriers is a tradeoff between transmission efficiency and accuracy in the estimation of channel states and synchronization errors. First of all, proper pilot subcarrier allocation reduces the probability of all or most of the pilot subcarriers being deeply faded. What is more, in order to eliminate the aliasing effect in channel estimation, the number of pilot subcarriers, $M$, must be larger than the normalized channel maximum excess delay, $M > \tau_{\max}/T_s$ [13], where $\tau_{\max}$ and $T_s$ are the maximum excess delay and the sample time, respectively. From the above inequality, the minimum number of pilot subcarriers can be deduced. If the pilot subcarrier spacing is denoted by $D$ and the signal at the $k$th subcarrier is denoted by $A_k$, then $A_{mD} = \Gamma_m$, $m = -M/2 + 1, \ldots, M/2$, where $\Gamma_m$ is the value of the $m$th pilot subcarrier. In this paper, we set $D$ to 25 to strike a balance between the estimation accuracy and the transmission efficiency.
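As an illustration of the comb-type layout, the following sketch places 33 pilots with spacing 25 and checks the anti-aliasing condition. Two assumptions are made here and are not taken from the text: the pilots are indexed symmetrically around subcarrier 0 (m = -16, ..., 16), and the data subcarriers fill every position between the outermost pilots. The delay used is the 2.14-microsecond typical-urban figure quoted earlier.

```python
# Comb-type pilot placement (hypothetical symmetric indexing) and aliasing check.
M, D = 33, 25            # number of pilots and pilot spacing, from the text
F_SAMPLE = 5.76e6        # sampling rate in Hz, so T_s = 1 / F_SAMPLE
TAU_MAX = 2.14e-6        # typical-urban maximum excess delay in seconds

half_span = (M - 1) // 2                                   # assumed symmetric range of m
pilots = [m * D for m in range(-half_span, half_span + 1)]
pilot_set = set(pilots)
# Assumption: data subcarriers occupy every non-pilot index between the outer pilots.
data = [k for k in range(pilots[0], pilots[-1] + 1) if k not in pilot_set]

tau_norm = TAU_MAX * F_SAMPLE                              # tau_max / T_s in samples
print(len(pilots), "pilots,", len(data), "data subcarriers")
print("M > tau_max/T_s:", M > tau_norm, f"(tau_max/T_s = {tau_norm:.2f})")
```

With this indexing the pilots span subcarriers -400 to 400, which leaves exactly 768 positions for data, matching the subcarrier counts given in the system specification.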
Multicarrier systems suffer a severe PAPR problem when the frequency-domain subcarrier data have a certain degree of regularity. In MC-CDMA systems, due to the existence of spreading codes, PAPR depends on both the out-of-phase autocorrelation and the cross-correlation of these spreading codes [7]. The autocorrelation property dominates the PAPR level when the system is lightly loaded, while the cross-correlation characteristic has more influence when the system is heavily loaded. In order to destroy the regularity of the spreading codes, we scramble the frequency-domain signals with an orthogonal Gold sequence. The selected Gold code combines two maximal-length sequences of length 1023. We append an additional "0" at the end of the Gold sequence to generate a length-1024 orthogonal sequence. Then, we multiply the frequency-domain data on each subcarrier by a chip in the code. At the receiver, we simply descramble the FFT output signals using the same scrambling code, and no code acquisition or tracking is necessary.
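The effect of scrambling on a highly regular frequency-domain signal can be illustrated with a small experiment. This is a sketch only: a random ±1 sequence stands in for the Gold-based scrambling code (the generator polynomials are not given here), all 1024 FFT bins are loaded for simplicity, and the function name `papr_db` is hypothetical.

```python
import numpy as np

def papr_db(freq_symbols):
    """PAPR in dB of the time-domain signal obtained by an IFFT (Nyquist-rate samples)."""
    power = np.abs(np.fft.ifft(freq_symbols)) ** 2
    return 10.0 * np.log10(power.max() / power.mean())

rng = np.random.default_rng(0)
N = 1024

# Extreme regularity: one user, all-ones spreading chips, identical data everywhere.
regular = np.ones(N, dtype=complex)

# Stand-in scrambling code (assumption): random +/-1 chips instead of the Gold sequence.
scramble = rng.choice([-1.0, 1.0], size=N)

papr_plain = papr_db(regular)
papr_scrambled = papr_db(regular * scramble)

# Descrambling with the same code restores the original subcarrier data.
descrambled = (regular * scramble) * scramble

print(f"PAPR without scrambling: {papr_plain:.1f} dB")
print(f"PAPR with scrambling:    {papr_scrambled:.1f} dB")
```

The unscrambled case collapses into a single time-domain impulse, the worst possible PAPR for this FFT size, while the scrambled signal behaves more like random data. Because the chips are ±1, descrambling is the same multiplication applied again.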
The training symbols play an important role when the receiver starts up. They also provide frequency-domain channel complex gains in stationary and quasistationary channels. The training symbol spacing is set to 9 after taking into account the coherence time and the system efficiency. The time-domain training symbol is made up of two identical halves [14], which is done by putting zeros in all odd subcarriers. Taking advantage of this longer interval of repeated waveform, the receiver attains more robust symbol boundary detection than by using only the cyclic prefix. In the frequency domain, data in subcarriers of even indices are differentially encoded with a pseudonoise (PN) sequence [15], which expedites integer carrier frequency offset acquisition in the receiver. Figure 2 shows how the pilot subcarriers and the training symbols are allocated. The frequency-domain subcarriers are plotted from subcarrier index 0 to $N - 1$, and the guard band is also indicated.

RECEIVER ARCHITECTURE
Figure 3 depicts the baseband receiver architecture for the proposed downlink MC-CDMA communication system. According to their functionality and signal domain, we partition the whole receiver into four parts: time-domain synchronization, frequency-domain synchronization, channel estimation, and final signal detection. The acquisition of synchronization parameters, such as coarse symbol boundary detection and fine frequency offset estimation in the time domain as well as coarse frequency offset acquisition and fine symbol boundary detection in the frequency domain, is activated as soon as the receiver starts up. Afterwards, only the tracking mechanism using the joint WLS estimation continues to work and compensate for the residual synchronization errors. In the following, we describe these parts in detail.

Time-domain synchronization
Multicarrier signals are well known to be very sensitive to synchronization errors such as carrier frequency offset, timing frequency/phase offset, and symbol boundary slipping. Therefore, we implement all synchronization compensation tasks in the time domain, that is, before the signal enters the FFT module. Otherwise, ICI and MAI will contaminate the signal and the receiver will have a hard time eliminating them in the frequency domain. As a result, the receiver uses a time-domain interpolator for timing frequency offset compensation and a phase derotator for carrier frequency offset cancellation.
When the receiver starts up, it first searches for the coarse symbol boundary by looking for the two identical halves in the training symbols. Denote the received signal in the time domain as $r_n$. The receiver adopts a delay correlator with a delay that is matched to the repetition interval of the training symbols, $N/2$, where $N$ is the FFT size (1024). The delay correlator output $\Lambda_n$ is formed by accumulating the products of each received sample and the complex conjugate (denoted by $*$) of the sample received $N/2$ samples earlier. In a multipath fading channel environment, the main difference between the first half and the second half of a training symbol is a phase rotation due to carrier frequency offset. We therefore compute a moving average of $|\Lambda_n|$ and locate the time instant that maximizes the smoothed signal, which indicates the end of a training symbol. Assume that the normalized carrier frequency offset with respect to the subcarrier spacing is given by $\varepsilon$. Then the phase shift between the two halves of a training symbol is $\pi\varepsilon$. Since $|\varepsilon|$ can be larger than 1, we divide $\varepsilon$ into two components, $\varepsilon_I$ and $\varepsilon_f$, where $\varepsilon_I$ is the nearest even integer to $\varepsilon$ and $\varepsilon = \varepsilon_I + \varepsilon_f$, $-1 \le \varepsilon_f < 1$. The maximum likelihood estimate (MLE) of the fractional carrier frequency offset then follows from the phase of the delay correlator output at the detected symbol boundary, which equals $\pi\varepsilon_f$ in the absence of noise [14].
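The coarse boundary search and fractional offset estimation can be sketched as follows. The correlation window of N/2 products is an assumption in the style of [14], not necessarily the window used in this design. The toy setup (a 64-point FFT, no cyclic prefix, no noise, zero samples around the training symbol) is chosen only to keep the example short, so the moving average of the correlator magnitude is omitted.

```python
import numpy as np

def delay_correlator(r, half):
    """Sliding sum of `half` products r[t + half] * conj(r[t]) (assumed window length).

    Entry j covers t = j .. j + half - 1, so its last sample is r[j + 2*half - 1].
    """
    prod = r[half:] * np.conj(r[:-half])
    csum = np.concatenate(([0.0], np.cumsum(prod)))
    return csum[half:] - csum[:-half]

rng = np.random.default_rng(1)
N, half = 64, 32          # toy FFT size; the paper uses N = 1024
eps_f = 0.3               # fractional CFO, normalized to the subcarrier spacing

# Training symbol: QPSK on even subcarriers, zeros on odd ones -> two identical halves.
A = np.zeros(N, dtype=complex)
A[::2] = rng.choice([-1.0, 1.0], N // 2) + 1j * rng.choice([-1.0, 1.0], N // 2)
train = np.fft.ifft(A)

start = 40                                        # training symbol begins at this sample
r = np.zeros(start + N + 40, dtype=complex)
r[start:start + N] = train
r *= np.exp(2j * np.pi * eps_f * np.arange(r.size) / N)   # apply carrier frequency offset

lam = delay_correlator(r, half)
j_peak = int(np.argmax(np.abs(lam)))
boundary = j_peak + 2 * half - 1                  # estimated last sample of the training symbol
eps_hat = np.angle(lam[j_peak]) / np.pi           # phase between halves is pi * eps_f

print("estimated end of training symbol:", boundary, "(true:", start + N - 1, ")")
print(f"estimated fractional CFO: {eps_hat:.4f}")
```

In a noisy multipath channel with a cyclic prefix, the magnitude of the correlator output has a plateau rather than a sharp peak, which is why the receiver smooths it with a moving average before locating the maximum.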

Frequency-domain synchronization
After the receiver has determined the symbol boundary, the cyclic prefix is stripped off and the received signal is transformed to the frequency domain by the FFT module. Denote the frequency-domain signal by $Z_k$, where $k$ is the subcarrier index. Since the fractional frequency offset has been properly compensated by the phase derotator in the first part, ICI is kept to an acceptable level. The remaining integer frequency offset $\varepsilon_I$ causes the frequency-domain signal to have shifted subcarrier indices. Note that the even-numbered frequency-domain subcarriers of a training symbol are differentially encoded by a PN sequence of length $L$ [15], that is, $A_{2k} A^*_{2k+2} = C_{\langle k \rangle}$, where $C_k$ is the PN code and $\langle \cdot \rangle$ denotes modulo $L$. So, the index shift caused by $\varepsilon_I$ can be easily detected by exploiting the autocorrelation property of a PN code. For each candidate shift $i$, the receiver computes an autocorrelation output $\Psi(i)$ by correlating the differentially decoded received subcarriers, displaced by $i$, with the PN code over $\Omega$, a set of subcarriers that have been differentially encoded by one PN code. The autocorrelation output $\Psi(i)$ is maximized at $i = \varepsilon_I$. Since there are only a finite number of possible values of $i$ if we assume a maximum carrier frequency offset of 20 ppm, a bank of correlators with different $i$ is implemented to determine $\varepsilon_I$. Thereafter, fine symbol boundary detection is activated and it adjusts the symbol boundary to a safe and ISI-free position. This can be done by computing the phase difference between the signals of a pair of adjacent subcarriers and averaging over the results of all pairs in the training symbol. Note that in both integer frequency offset detection and fine symbol boundary detection, the receiver assumes that two adjacent subcarriers undergo similar fading, the effect of which is eliminated by computing the phase difference between their signals. The tracking mechanism for residual carrier frequency error and timing frequency error starts right after the fine symbol boundary detection is completed.
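The bank of correlators for integer frequency offset detection might look like the sketch below. The metric is one plausible form built from the differential-encoding relation above, not necessarily the exact expression used in this design. A random ±1 sequence stands in for the PN code, the channel is a single flat complex gain, and the function name `integer_cfo_metric` is hypothetical.

```python
import numpy as np

def integer_cfo_metric(Z, pn, shift):
    """Assumed metric: |sum_k Z[2k+shift] * conj(Z[2k+2+shift]) * conj(C_k)|."""
    k = np.arange(pn.size)
    n = Z.size
    prod = Z[(2 * k + shift) % n] * np.conj(Z[(2 * k + 2 + shift) % n])
    return abs(np.sum(prod * np.conj(pn)))

rng = np.random.default_rng(3)
N = 64                                        # toy FFT size
pn = rng.choice([-1.0, 1.0], N // 2 - 1)      # stand-in for the PN code (assumption)

# Differential encoding on even subcarriers: A[2k] * conj(A[2k+2]) = pn[k].
A = np.zeros(N, dtype=complex)
A[0] = 1.0
for k in range(pn.size):
    A[2 * k + 2] = A[2 * k] * pn[k]

eps_I = 4                                     # true integer offset (even)
Z = 0.8 * np.exp(0.4j) * np.roll(A, eps_I)    # flat channel gain plus index shift

candidates = list(range(-8, 10, 2))           # even candidate shifts
psi = [integer_cfo_metric(Z, pn, i) for i in candidates]
eps_I_hat = candidates[int(np.argmax(psi))]
print("detected integer offset:", eps_I_hat)
```

Because each term is a product of a subcarrier with the conjugate of its neighbor, a common channel phase cancels, which reflects the assumption in the text that adjacent subcarriers undergo similar fading.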
The receiver uses the signals on the pilot subcarriers to continuously estimate the residual synchronization errors. The carrier frequency error causes identical phase shift in every subcarrier, while the timing frequency error, δ, results in phase shift that is proportional to the subcarrier index. In the multipath fading channel case, it is essential that the channel fading effect be removed before any synchronization error estimation can be carried out. Usually, this is implemented by taking the phase difference of signals at the same subcarrier across two MC-CDMA symbols. In addition, linear regression provides the best estimation in terms of least squared error since it can find simultaneously the best slope and intercept in the plot of pilot subcarrier phase differences versus subcarrier indices.
The slope provides information about the timing frequency error and the intercept contains information about the carrier frequency error. Weighting the data in each subcarrier is also helpful because data of deeply faded subcarriers should be assigned smaller weights to minimize their adverse effect on estimation accuracy.
We have proposed to estimate both the carrier frequency error and the timing frequency error by a joint weighted least-squares (WLS) algorithm [16]. Let $y_m$ represent the phase difference of the signals at the $m$th pilot subcarrier between two consecutive symbols, and let $x_m$ be the subcarrier index of the $m$th pilot subcarrier. The joint WLS estimates of the fractional carrier frequency error $\varepsilon_f$ and the timing frequency error $\delta$ are obtained from the weighted linear regression of $y_m$ on $x_m$, where $w_m$ is the weight applied to the data at the $m$th pilot subcarrier; the expressions also involve $N_g$, the number of cyclic prefix samples, which is 64. We use weights $w_m$ that are the squared pilot-subcarrier channel gains, which can be shown to approach the Cramér-Rao bound when the residual synchronization error is small [16].
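A weighted linear regression of this kind can be sketched in a few lines. The phase model used below, $y_m = 2\pi (1 + N_g/N)(\varepsilon_f + \delta x_m)$, is a standard assumption for the phase accumulated over one symbol including its cyclic prefix; it is adopted here for illustration and may differ in detail from the expressions in [16]. The function name `joint_wls` and the random weights are hypothetical.

```python
import numpy as np

def joint_wls(x, y, w, n_fft=1024, n_guard=64):
    """Weighted least-squares line fit; returns (carrier error, timing error).

    Assumed model: y = 2*pi*(1 + n_guard/n_fft) * (eps_f + delta * x).
    """
    sw, sx, sy = w.sum(), (w * x).sum(), (w * y).sum()
    sxx, sxy = (w * x * x).sum(), (w * x * y).sum()
    den = sw * sxx - sx ** 2
    slope = (sw * sxy - sx * sy) / den        # carries the timing frequency error
    intercept = (sxx * sy - sx * sxy) / den   # carries the carrier frequency error
    scale = 2.0 * np.pi * (n_fft + n_guard) / n_fft
    return intercept / scale, slope / scale

rng = np.random.default_rng(2)
x = 25.0 * np.arange(-16, 17)                 # pilot subcarrier indices (assumed symmetric)
eps_true, delta_true = 0.05, 20e-6            # residual errors used in the simulations
y = 2.0 * np.pi * (1088 / 1024) * (eps_true + delta_true * x)   # noise-free phase differences
w = rng.uniform(0.1, 2.0, x.size)             # stand-in for squared channel gains

eps_hat, delta_hat = joint_wls(x, y, w)
print(f"carrier error: {eps_hat:.4f}, timing error: {delta_hat * 1e6:.2f} ppm")
```

With noise-free data any positive weights recover the true values. The weights matter once noise is added, since deeply faded pilots then contribute unreliable phase differences.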

Channel estimation
As in traditional multicarrier systems, every subcarrier in the MC-CDMA system suffers flat fading that can be rectified by a simple one-tap equalizer. In the proposed MC-CDMA system, both the training symbols and the pilot subcarriers can be used to estimate the frequency-domain channel gain. However, the comb-type pilot scheme has been shown to be more suitable for fast fading channels since channel estimation can then be performed continuously [9,17]. For the receiver to operate at a mobility of 120 km/h, we use a frequency-domain interpolation-based channel estimator that uses information from pilot subcarriers more efficiently. Two main classes of comb-type pilot-aided channel estimation are (a) time-domain windowing and (b) frequency-domain interpolation. In the time-domain windowing algorithms, the time-domain channel impulse response (CIR) is obtained by inverse Fourier transforming the frequency-domain channel response at the pilot subcarriers. Thereafter, different windowing techniques are applied to the reconstructed time-domain CIR in order to reduce noise and aliasing effects. These include rectangular windowing [18,19,20,21] and MMSE weighting [13]. The windowed CIR is then transformed back to the frequency domain and is used by frequency-domain equalizers. The second class, frequency-domain interpolation-based channel estimation algorithms, includes linear interpolation [22], second-order interpolation [23], and spline-cubic interpolation [17]. Basically, the channel gains at data subcarriers are interpolated from the received pilot subcarrier channel gains using the respective interpolation methods. In fact, we have shown that a correspondence exists between these two classes of channel estimators [24].
In terms of computational complexity, the frequency-domain interpolation channel estimation is a better choice since no inverse/forward Fourier transforms are involved. However, in the conventional frequency-domain interpolation, pilot subcarriers need to sample the channel frequency response densely enough, that is, $M \ge 2\tau_{\max}/T_s$ [25]. Basically, the performance of these two types of algorithms depends on how they shape the time-domain CIR. Due to the sampling of the channel response at pilot subcarriers with spacing $D/(NT_s)$, the inverse Fourier transform of the $M$ pilot-subcarrier channel responses has periodic repetition in the time domain with a spacing of $NT_s/D$, as shown in Figure 4. A time-domain window is required to preserve the major portion of the desired CIR, which is located near the origin, while the remaining replicas are regarded as unwanted and must be suppressed in order to reconstruct the frequency-domain channel response samples with a narrower spacing of $1/(NT_s)$. Any replica energy that is not rejected causes aliasing. If the symbol boundary detection is correct, then the first pulse of the CIR must be aligned with the origin and the remaining impulse response appears in the duration of the guard interval $[0, N_g T_s]$. Due to energy leakage, precursors as well as postcursors show up in the $T_s$-spaced CIR [26]. Consequently, the window must be shifted to the right instead of being centered at the origin. In light of the above considerations, we have developed a shifted raised-cosine interpolator-based channel estimator [24]. Its frequency-domain interpolation coefficients $W_k$ are characterized by two parameters: the roll-off factor $\beta$ and $q$, which corresponds to the time shift in the time domain. Since $|W_k|$ falls off rapidly as $|k|$ becomes large, we use only six taps in the interpolation.
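To make the idea of a shifted raised-cosine interpolator concrete, the sketch below uses an assumed kernel: a standard raised-cosine pulse sampled at $k/D$ and multiplied by a linear phase $e^{-j2\pi kq/N}$, which shifts the equivalent time-domain window by $q$ samples. This is an illustrative form, not necessarily the coefficients of [24]. The values of $\beta$ and $q$, the symmetric pilot indexing, and the function names are hypothetical.

```python
import numpy as np

def raised_cosine(t, beta):
    """Raised-cosine pulse; the removable singularity at |t| = 1/(2*beta) is handled."""
    t = np.asarray(t, dtype=float)
    denom = 1.0 - (2.0 * beta * t) ** 2
    singular = np.isclose(denom, 0.0)
    regular = np.sinc(t) * np.cos(np.pi * beta * t) / np.where(singular, 1.0, denom)
    return np.where(singular, (np.pi / 4.0) * np.sinc(t), regular)

def rc_interpolate(pilot_idx, pilot_gain, k, D, n_fft, beta, q, taps=6):
    """Estimate the channel gain at subcarrier k from the nearest `taps` pilots."""
    m0 = int(np.searchsorted(pilot_idx, k, side="right")) - 1   # last pilot at or below k
    lo = max(m0 - taps // 2 + 1, 0)                             # fewer taps at band edges
    hi = min(m0 + taps // 2 + 1, len(pilot_idx))
    offset = k - pilot_idx[lo:hi]
    # Assumed kernel: raised cosine in k/D times a phase ramp that shifts the window by q.
    W = raised_cosine(offset / D, beta) * np.exp(-2j * np.pi * offset * q / n_fft)
    return np.sum(W * pilot_gain[lo:hi])

N, D, q, beta = 1024, 25, 3, 0.5              # beta and q are illustrative choices
pilot_idx = D * np.arange(-16, 17)            # assumed symmetric pilot positions

def channel(k):
    """Single-path test channel with a delay of q samples (noise-free)."""
    return np.exp(-2j * np.pi * np.asarray(k) * q / N)

pilot_gain = channel(pilot_idx)
est_pilot = rc_interpolate(pilot_idx, pilot_gain, 50, D, N, beta, q)
est_data = rc_interpolate(pilot_idx, pilot_gain, 12, D, N, beta, q)
print("error at a pilot:", abs(est_pilot - channel(50)))
print("error at a data subcarrier:", abs(est_data - channel(12)))
```

At pilot positions the kernel reduces to a single nonzero tap, so the pilot gain is returned unchanged. Between pilots the six-tap truncation leaves a small residual error, which is the price paid for avoiding the inverse and forward Fourier transforms.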

Detection
In an MC-CDMA system, the signal corresponding to a user's data spreads over several subcarriers and must be equalized and combined. Among single-user detection techniques, the optimal maximum likelihood detector has been shown to have a complexity that grows exponentially with the number of users [2]. This disadvantage leads to the consideration of suboptimal techniques, such as orthogonality restoring combining (ORC), equal-gain combining (EGC), maximum ratio combining (MRC), threshold orthogonality restoring combining (TORC), and minimum mean squared error combining (MMSEC). They all attempt to increase the signal-to-interference-plus-noise ratio (SINR) by suppressing MAI and/or noise. The ORC technique, as its name implies, can recover the orthogonality among user signals and thus can reduce MAI. Unfortunately, noise enhancement can occur when a subcarrier is in deep fade and is corrupted by noise. The MRC method maximizes the SNR and pays no attention to MAI. MMSEC minimizes the mean squared error between the transmitted and the received data, so it reduces MAI and noise simultaneously. Nonetheless, it requires the SNR per subcarrier, which is usually expensive to obtain. TORC, which is a combination of ORC and EGC, is therefore chosen in the receiver. TORC can reduce MAI in high-SNR cases and can avoid noise enhancement in low-SNR cases. Additionally, it has been shown to approach MMSEC performance with less complexity [27]. The TORC coefficient $G_k$ is determined by $H_k$, the estimated channel gain at subcarrier $k$, and $h_{\mathrm{THR}}$, a threshold below which the subcarrier data is deemed to be in deep fade. The optimal threshold depends on the number of active users and the received signal-to-noise ratio. Finally, those subcarriers are also multiplied by the OVSF code of the desired user for despreading. The combined and despread signal is then sent to the decision device for final signal detection.
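A threshold-based combiner can be sketched as follows. The exact form of the coefficient is an assumption: subcarriers at or above the threshold receive the ORC coefficient $1/H_k$, while those below it are either discarded (the form commonly given in the literature) or given a unit-magnitude phase correction as in EGC. The function name `torc_coefficients`, its `below` option, and the example gains are hypothetical.

```python
import numpy as np

def torc_coefficients(H, h_thr, below="zero"):
    """Threshold-based combining coefficients (assumed form, h_thr > 0).

    |H_k| >= h_thr : ORC coefficient 1 / H_k (restores orthogonality).
    |H_k| <  h_thr : 0 if below == "zero", or conj(H_k) / |H_k| if below == "egc".
    """
    H = np.asarray(H, dtype=complex)
    mag = np.abs(H)
    strong = mag >= h_thr
    G = np.zeros_like(H)
    G[strong] = 1.0 / H[strong]
    if below == "egc":
        weak = ~strong & (mag > 0)
        G[weak] = np.conj(H[weak]) / mag[weak]   # phase correction only, no gain boost
    return G

# Example: four subcarriers, the second one in deep fade.
H = np.array([1.0 + 0.5j, 0.05j, -0.8 + 0.1j, 0.6])
G_zero = torc_coefficients(H, h_thr=0.2, below="zero")
G_egc = torc_coefficients(H, h_thr=0.2, below="egc")
print("equalized (zeroing):", np.round(G_zero * H, 3))
print("equalized (EGC below threshold):", np.round(G_egc * H, 3))
```

In both variants the deeply faded subcarrier is never multiplied by a large coefficient, which is how thresholding limits the noise enhancement of plain ORC.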

SIMULATION RESULTS
In this section, we will describe the simulation results based on the proposed transmitter/receiver algorithms. The channel model that we use is the typical urban power-delay profile specified in a 3GPP technical report, where a large number of paths ensure that correlation properties in the frequency domain are realistic [10]. The profile is listed in Table 2.
In addition, we have included the impairments arising from the RF/analog front end, such as carrier frequency offset and sampling clock offset. Unless otherwise specified, we use the 16-QAM constellation and a 64-chip spreading code. The number of pilot subcarriers is 33 with a spacing of 25 subcarriers. Training symbols are transmitted periodically every 9 symbols. We use a rate-3/4 convolutional code with an interleaver across the bits contained in 16 blocks.

Figure 5 depicts the complementary cumulative distribution functions of the squared outputs of the proposed MC-CDMA baseband transmitter, $P(|s_n|^2 \ge E_p)$, in different conditions. The average output energy is set to unity, so the PAPR in each case can be read directly from the figure. We can see that the transmitter output has a severe PAPR problem when there is only one user and a smaller PAPR when the system is heavily loaded with 64 users. This is because with more users, the combined frequency-domain subcarrier data become more random. Also, the upper bound of PAPR derived in [7] shows that in the single-user case, the PAPR depends on the aperiodic autocorrelation of the specific spreading code instead of the user data. However, as the number of active users increases, the PAPR depends on the cross-correlation property. Note that the solid lines represent the cases without the scrambling code while the dashed lines show the cases with the scrambling code. In the 32-user case, at a peak energy probability of around $10^{-5}$, the peak energy is reduced by about 4.3 dB and the PAPR decreases from 15.3 dB to 11 dB.

Tracking of residual synchronization error
Figures 6 and 7 illustrate how the proposed joint WLS algorithm improves the estimation accuracy in carrier frequency error and timing frequency error when compared to other joint estimation algorithms. In the simulation, we assume that after initial acquisition of the fractional frequency offset, the residual carrier frequency error is 0.05 times the subcarrier spacing and the timing frequency error is 20 ppm. The Sliskovic algorithm [28] estimates the timing frequency error by first computing the phase differences between pairs of adjacent pilot subcarriers, then computing the differences of those differences over two symbols, and finally taking a weighted average. The carrier frequency error is then estimated by first calculating the pilot-subcarrier phase differences between two symbols, removing the contribution from the previously estimated timing frequency error, and finally taking a weighted average. The LLS algorithm is the linear least-squares algorithm, which assigns equal weights to the data at every pilot subcarrier [29]. The root-mean-squared (RMS) errors of the estimates for carrier frequency offset and timing frequency offset are shown for the three estimators under 80 km/h mobility. The joint WLS algorithm clearly performs best, producing the smallest RMS errors in both carrier frequency error and timing frequency error.

Channel estimation
To verify the effectiveness of the channel estimator used in the proposed receiver, we first simulate the performance of various estimators in a quasistatic channel, which means the channel impulse response is invariant within one symbol period. In this circumstance, the ICI caused by mobile channels can be ignored and we can determine how precise the estimators are. Figure 8 illustrates the channel estimation errors attained by several channel estimators mentioned in the previous section. Here, channels with long delay spread refer to the cases in which the channel excess delay is close to the product of the number of pilot subcarriers and the sample time $T_s$. We see that in the case of quasistatic channels with long delay spread, the six-tap raised-cosine interpolation (RC6) channel estimation algorithm is almost as good as the MMSE method, which is a time-domain windowing method that requires extra inverse Fourier transform operations. Moreover, it is significantly better than other frequency-domain interpolation channel estimation algorithms of approximately the same complexity.
Next we examine how effective the raised-cosine channel estimation is in a fast-fading channel at a speed of 120 km/h. Figure 9 depicts the BER obtained using channel estimates from the raised-cosine interpolation and the recursive least-squares (RLS) adaptive algorithm [9]. The SNR per user is set to 12 dB. We see that even though the RLS algorithm converges very fast, it still cannot cope with this fast fading channel, so the BER grows as time elapses from the training symbol. On the other hand, the pilot-aided frequency-domain interpolation algorithm can track the channel variation very well and the BER is kept under control.

System performance
Next, we examine the overall performance of the proposed MC-CDMA system in multiuser multipath fading environments. Figure 10 shows the system performance degradation as the number of users gets larger. In the simulation, we further assume that the carrier frequency and the timing frequency are derived from the same oscillator source and thus have the same percentage offset. The carrier frequency offset and the timing frequency offset are set to 2.2 subcarrier spacings and −6.2 ppm, respectively. The BER performance is evaluated under stationary multipath channels with AWGN. Because we use a spreading factor of 64, the system allows a maximum of 64 active users. In addition, system performances with 32 users and 16 users are also plotted. In stationary channels, with the system fully loaded, the coded BER is satisfactory and is on the order of $10^{-5}$ at a per-user SNR of about 12 dB.
Finally, the simulated system BER performances using different signal constellations and under different mobility conditions are illustrated in Figure 11. As before, we inject a carrier frequency offset of 2.2 subcarrier spacings and a timing frequency offset of −6.2 ppm, but now the multipath fading channel is time varying according to the mobility settings. Note that with 32 active users and 169 Kbps per user (16-QAM constellation), the proposed receiver provides a coded BER of around $10^{-3}$ at a per-user SNR of 15 dB under the highest supported mobility. Using the QPSK constellation and an 85 Kbps per-user data rate, the BER performance can be further enhanced, reaching a coded BER below $10^{-5}$.

[Figure: comparison of channel estimators versus SNR for the second-order [23], linear [22], direct [20], MST [21], RC6, and MMSE [13] methods: (a) estimation MSE; (b) uncoded BER.]
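The per-user and aggregate data rates quoted above can be reproduced from the system parameters. The sketch assumes that one symbol in every nine is a training symbol carrying no user data, that the symbol duration includes the 64-sample cyclic prefix, and that the quoted figures are information rates after the rate-3/4 code. These are interpretations of the text rather than statements from it.

```python
# Data-rate arithmetic from the system parameters (interpretive assumptions noted inline).
F_SAMPLE = 5.76e6          # sampling rate in Hz
N_FFT, N_GUARD = 1024, 64  # FFT size and cyclic prefix length
N_DATA = 768               # data subcarriers per MC-CDMA symbol
SPREADING_FACTOR = 64      # OVSF code length used in the simulations
CODE_RATE = 3 / 4          # convolutional code rate
TRAINING_SPACING = 9       # assumed: 1 training symbol in every 9 symbols

def per_user_rate_bps(bits_per_symbol):
    """Information rate of one user for a given constellation size."""
    symbol_time = (N_FFT + N_GUARD) / F_SAMPLE            # seconds, including guard
    data_per_symbol = N_DATA // SPREADING_FACTOR          # data symbols per user per MC-CDMA symbol
    payload_fraction = (TRAINING_SPACING - 1) / TRAINING_SPACING
    return data_per_symbol * bits_per_symbol * CODE_RATE * payload_fraction / symbol_time

rate_16qam = per_user_rate_bps(4)
rate_qpsk = per_user_rate_bps(2)
print(f"16-QAM: {rate_16qam / 1e3:.0f} Kbps per user, QPSK: {rate_qpsk / 1e3:.0f} Kbps per user")
print(f"aggregate: {32 * rate_16qam / 1e6:.1f} Mbps (32 users), "
      f"{64 * rate_16qam / 1e6:.1f} Mbps (64 users)")
```

Under these assumptions the arithmetic gives about 169 Kbps and 85 Kbps per user for 16-QAM and QPSK, and about 5.4 Mbps and 10.8 Mbps in aggregate for 32 and 64 users.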

CONCLUSION
In this paper, we proposed the baseband transceiver architecture of a downlink MC-CDMA communication system. First, a set of MC-CDMA system specifications suited for 3G WCDMA environments was designed. Based on the specifications, a transmitter with subcarrier data scrambling was introduced. Then we designed and integrated all necessary modules in a baseband MC-CDMA receiver, including a symbol boundary detector, a carrier frequency recovery loop, a timing frequency recovery loop, a channel estimator, and a combining and despreading block. In the receiver, two novel techniques that enhanced the system performance, joint carrier/timing frequency error estimation and frequency-domain interpolation-based channel estimation, were implemented. Simulation results showed that the scrambling in the transmitter and the two new receiver algorithms greatly improved the MC-CDMA system performance. The simulation results also showed that the maximum aggregate uncoded data rates that can be transmitted reliably in mobile/stationary multipath fading channels were 5.4/10.8 Mbps, respectively. As a result, the proposed MC-CDMA baseband transceiver algorithm/architecture can play an important role in future high-data-rate downlink communications over mobile multipath fading channels.