Coherent Code Tracking for Spatial Transmit Diversity DS-CDMA Systems

Spatial transmit diversity schemes are now well integrated into third-generation cellular mobile communication system specifications. When DS-CDMA-based technology is deployed in typical macro- and microcell environments, multipath diversity and spatial diversity may be exploited simultaneously by a 2D RAKE receiver. The work presented in this paper focuses on taking advantage of spatial transmit diversity in synchronising the 2D RAKE structure. We investigate the use of coherent and noncoherent techniques for tracking the timing parameters of each multipath component. It is shown that both noncoherent and coherent techniques benefit from transmit diversity. Additionally, the performance gap between these two techniques increases with the number of antennas.


INTRODUCTION
For direct-sequence code-division multiple-access (DS-CDMA) communications, antenna diversity techniques have been proposed and implemented to mitigate the effects of multipath fading [1,2]. One economical way of deploying such diversity is to use multiple antennas at the base station and a single antenna at the mobile station [3]. With such an architecture, uplink reception can be carried out using adaptive beamforming solutions. Downlink reception relies on the transmission of distinct waveforms at the transmit antennas, which can then be separated at the single receive antenna.
It is well known that CDMA-based systems are able to exploit and recombine signal components of different delays using RAKE-style receiver structures. Recently, two-dimensional (2D) RAKE receivers have been proposed which are designed to exploit both delayed signal components (time diversity) and spatial diversity simultaneously. Although time diversity is exploited by the overall operation of a 2D RAKE receiver, it is of no benefit to the delay tracking functions which underpin its operation.
In this paper, we consider techniques for tracking the delay of a received signal component when spatial transmit diversity is employed. Closed-loop synchronisation techniques are generally used for this purpose. The most popular is the noncoherent delay-locked loop (DLL) [4]. However, the WCDMA (wideband CDMA) specifications provide known pilot symbols in both link directions, making the coherent DLL a viable alternative.
The structures described in this paper are well suited to digital implementation. All of the operations that they require are either described in terms of discrete-time signals or have a digital equivalent which is realisable with today's processing capabilities.

SYSTEM MODEL
The system under consideration uses antenna diversity at the base station (BS) and a single antenna at the mobile station. There are good reasons for this choice. Firstly, it is more economical [5]. Secondly, it allows significant separation of the antennas, thereby achieving good decorrelation of the channels. Finally, provision has been made in the third-generation partnership project (3GPP) specifications for this configuration. 3GPP is the body responsible for the standardisation of WCDMA, the emerging high data rate mobile telephony service.

Transmitter
3GPP currently specifies antenna diversity with two antennas and utilises the Alamouti space-time block code [5,6]. The potential capacity increase obtained by applying this technique to WCDMA in a 2D RAKE is demonstrated in [3]. Furthermore, there are draft proposals for expansion to four-antenna transmitters [7].
In WCDMA, different users (see footnote 1) are separated by different spreading codes. Let us define $d_k[n]$ as the $n$th data symbol of the $k$th user. The data symbols of each user are applied to a block coding scheme to yield $M$ distinct sequences, one for each transmit antenna (the $m$th sequence is transmitted on antenna $m$). For analysis purposes, it is convenient to provide a representation of the baseband signal which corresponds to user $k$ and is transmitted on the $m$th antenna. This representation is referred to as (1); in it, $E_{c,k}$ is the energy per chip of the $k$th user, $a_k[i]$ is the user-specific spreading code, $T_c$ is the chip period, $G_k$ is the user's spreading factor, and $g(t)$ is the chip shaping waveform. Note that in WCDMA $a_k[i]$ represents the product of the real-valued user-specific channelisation code and a base-station-specific complex-valued scrambling code [8].
The channelisation codes in WCDMA are orthogonal codes. It is assumed that
$$a_k[i] \in \left\{ e^{j\pi/4},\; e^{j3\pi/4},\; e^{j5\pi/4},\; e^{j7\pi/4} \right\} \qquad (2)$$
and also that the energy in $g(t)$ is unity.
In order to exploit transmit diversity, a receiver must be capable of estimating the channel from each BS antenna. This is facilitated by the common pilot channel (CPICH). To introduce the CPICH into our model, we designate user index 1 as the pilot channel. Therefore, the $M$ block-coded sequences of user 1 represent the pilot symbol sequences which are transmitted on antennas $1, \ldots, M$, respectively. In order to facilitate independent channel estimation from each antenna, the pilot symbols are orthogonal across antennas.
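As a rough illustration of why orthogonal pilot patterns allow a single receive antenna to separate the per-antenna channels, the sketch below correlates noise-free received pilot symbols against each antenna's pattern. The Walsh-Hadamard sign patterns, the channel gains in `alpha`, and the helper names are assumptions made for this example; they are not the CPICH patterns defined by 3GPP, and the channel is taken to be constant over one pattern length.

```python
def hadamard(m):
    """Sylvester-construction sign patterns; m is assumed to be a power of two."""
    rows = [[1]]
    while len(rows) < m:
        rows = [r + r for r in rows] + [r + [-x for x in r] for r in rows]
    return rows


def estimate_channels(rx, patterns):
    """Correlate the received pilot symbols with each antenna's pattern."""
    n = len(rx)
    return [sum(r * p for r, p in zip(rx, pat)) / n for pat in patterns]


M = 4
patterns = hadamard(M)  # hypothetical orthogonal pilot sign patterns
alpha = [0.8 + 0.1j, -0.3 + 0.5j, 0.2 - 0.7j, -0.6 - 0.4j]  # made-up channel gains

# Noise-free symbol-rate samples seen at the single receive antenna:
# the superposition of all M pilot streams, each scaled by its channel gain.
rx = [sum(alpha[m] * patterns[m][n] for m in range(M)) for n in range(M)]

alpha_hat = estimate_channels(rx, patterns)
```

Because the patterns are mutually orthogonal, each correlation removes the contribution of the other antennas, so `alpha_hat` matches `alpha` up to rounding in this noise-free setting.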
The overall transmitter is shown in Figure 1. The blocks marked $g(t)$ are chip shaping filters. "Other users" refers to other data and control signals which have been chipped by their respective channelisation codes. The pilot channel does not require channelisation and is spread directly by the base-station-specific scrambling code.
Footnote 1: Strictly, it is different UMTS (Universal Mobile Telecommunications System) physical channels that are separated by different spreading codes; however, we will denote them as users and reserve the word channel to refer to the wireless propagation environment.
The baseband signal which is transmitted on the $m$th antenna consists of all user signals summed together, as given by (3). The figure of merit commonly used in a downlink scenario is $E_c/I_{or}$, where $I_{or}$ is the one-sided power spectral density of the entire base station transmission.

Channel model
The radio propagation conditions in this work are modelled based on the following assumptions. Firstly, there is no line-of-sight component between transmitter and receiver. Secondly, the mobile device is in motion through a complex scattering environment which causes fading of the received signal level. Thirdly, strong reflections from more distant objects are also received by the mobile station. This results in a frequency-selective (Rayleigh) fading channel model. The received signal can be described in terms of $P$ resolvable multipath components. Each resolvable path has a fading waveform due to local scatterers. The baseband equivalent received signal, $r(t)$, is expressed in (5), where $\alpha_{p,m}(t)$ is the complex fading envelope of the channel between transmit antenna $m$ and the receive antenna via path $p$. For the purpose of chip timing recovery, it is possible to assume that the signals from all antennas through multipath component $p$ have the same time-varying propagation delay of $\tau_p(t)$ seconds.

Receiver
The well-known RAKE receiver demodulates a number of resolvable multipath components simultaneously and combines the results coherently to improve the SNR [1,2]. The 2D RAKE structure described in [9] employs multiple receive antennas and a beamformer on each branch of the RAKE structure to improve performance. In the downlink scenario currently under consideration, only one antenna is available at the receiver; therefore, this technique cannot be applied. However, a 2D RAKE can still be implemented by exploiting transmit diversity on each branch of the RAKE [3,10]. Synchronising a RAKE receiver to a multipath channel consists of two stages. Firstly, the receiver must identify and select a set of multipath components to recombine in the RAKE structure. This typically involves searching for paths, or using cross-correlation techniques such as those described in [11]. The second stage, which we address in the remainder of this paper, is to track the delays of the multipath components through time. These delays can vary due to movement of the mobile device and changes in the environment; therefore, they must be tracked independently. This is typically achieved using delay-locked loop structures.

Single-antenna delay-locked loops
Figure 2 shows a generic chip timing recovery loop consisting of a timing error detector (TED), a loop filter to smooth the TED output, and a numerically controlled clock (NCC) to control the timing of the sampling devices [12]. The order of the loop is determined by the transfer function of the loop filter [13]. Here we use a first-order loop by setting the discrete-time transfer function of the loop filter to $F(z) = 1$. The loop bandwidth is then controlled by setting the gain term $\kappa$. The performance of a timing recovery loop is ultimately limited by the properties of its TED. Figure 3 shows an example of a coherent TED for DS-CDMA [14]. A noncoherent version is shown in Figure 4. Both TEDs operate by despreading early and late versions of the received signal to obtain $Y^+[n]$ and $Y^-[n]$, respectively. Let us assume that the DLL is tracking a path with a delay of $\tau_1$ seconds, and the current estimate of this delay is $\hat{\tau}_1$. The normalised timing error, $\varepsilon$, is then defined as the difference between $\tau_1$ and $\hat{\tau}_1$ expressed in chip periods.
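The behaviour of a first-order loop with $F(z) = 1$ can be sketched with a toy simulation. The example below is only a linearised illustration and not the structure evaluated in this paper: it assumes a TED with unit slope around lock, white Gaussian measurement noise, a static path delay, and a timing error defined as the true delay minus the estimate. The function name and all parameter values are hypothetical.

```python
import random


def mean_square_jitter(kappa, noise_std, n_steps=100_000, eps0=0.3, seed=7):
    """Simulate a first-order timing loop and return its steady-state
    mean square timing error (in chips squared)."""
    rng = random.Random(seed)
    eps = eps0                      # timing error: true delay minus estimate
    settle = n_steps // 10          # discard the acquisition transient
    acc = 0.0
    for n in range(n_steps):
        # Linearised TED (assumption): output equals the error plus noise.
        e = eps + rng.gauss(0.0, noise_std)
        # F(z) = 1, so the NCC moves the delay estimate by kappa * e,
        # which reduces the timing error by the same amount.
        eps -= kappa * e
        if n >= settle:
            acc += eps * eps
    return acc / (n_steps - settle)


jitter = mean_square_jitter(kappa=0.05, noise_std=1.0)
# For this linear model the steady-state variance is kappa * sigma^2 / (2 - kappa).
theory = 0.05 * 1.0 ** 2 / (2.0 - 0.05)
```

In this model a larger $\kappa$ widens the loop bandwidth, which speeds up acquisition but passes more TED noise into the delay estimate.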

Coherent DLL
As shown in Figure 3, the coherent TED derives a measurement of the timing error (denoted as $e[n]$) by computing the difference between $Y^+[n]$ and $Y^-[n]$, and then explicitly removing the carrier phase offset and data modulation using knowledge of their values obtained from other parts of the receiver. The error signal developed by this TED involves $\hat{\phi}$, an estimate of the carrier phase, and its expectation is taken conditioned on the timing error $\varepsilon$. The carrier phase estimate $\hat{\phi}$ is required in any case for coherent detection of the data. The term $d[n]$ represents either a data decision (i.e., in a decision-directed loop [15]) or a known pilot symbol, depending on which physical channel is being used to derive the timing error.
Let us now examine the statistics of the signals at the output of the early and late arms of the detector. Appendix A derives the statistics of a despreader output with a timing error of $\varepsilon$ chips. The early and late despreader outputs $Y^{\pm}[n]$ are generated by intentionally introducing a timing offset of $\pm\Delta$ chips. Therefore, based on the analysis in Appendix A, the statistics of $Y^{\pm}[n]$ are given by (9), where $V_0^{\pm}$ is the effective one-sided spectral density of all the combined interference terms and $R_0(\delta)$ is the time-normalised chip shape autocorrelation function defined in (10). These equations are valid under the assumption that other multipath components are significantly spaced from the path being tracked by the loop.
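To give a feel for how the chip shape autocorrelation sets the early and late means, the sketch below evaluates an early-minus-late characteristic (S-curve). Several assumptions are made: the chip pulse is taken to be a root-raised-cosine with the 0.22 roll-off specified for WCDMA, so that $R_0(\delta)$ is a raised-cosine function; the mean of $Y^{\pm}[n]$ is taken to be proportional to $R_0(\varepsilon \mp \Delta)$; and amplitudes are normalised to one. The sign convention and the function names are choices made here and may differ from those used in (9).

```python
import math


def raised_cosine(delta, beta=0.22):
    """Raised-cosine autocorrelation R0(delta), with delta in chips."""
    if abs(delta) < 1e-12:
        return 1.0
    sinc = math.sin(math.pi * delta) / (math.pi * delta)
    denom = 1.0 - (2.0 * beta * delta) ** 2
    if abs(denom) < 1e-12:
        # Limit of the expression at delta = +/- 1 / (2 * beta).
        return (math.pi / 4.0) * sinc
    return sinc * math.cos(math.pi * beta * delta) / denom


def s_curve(eps, offset=0.5, beta=0.22):
    """Noise-free early-minus-late mean for a timing error of eps chips."""
    return raised_cosine(eps - offset, beta) - raised_cosine(eps + offset, beta)


# Characteristic sampled from -1 to +1 chip in steps of 0.1 chip.
curve = [(k / 10.0, s_curve(k / 10.0)) for k in range(-10, 11)]
```

The characteristic passes through zero at $\varepsilon = 0$ and is odd-symmetric, which is what allows the loop to steer the delay estimate in the correct direction for small errors.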

Noncoherent DLL
The noncoherent loop [13], on the other hand, assumes no knowledge of the data or carrier phase; these are removed by squaring the early and late despreader outputs, as shown in Figure 4. This causes a reduction in the overall signal-to-noise ratio (SNR) of the timing error measurement, an effect known as squaring loss. The discrete-time error signal for a noncoherent loop is given by (11).

TED comparison
Tables 1 and 2 show the final equations for the coherent and noncoherent TED output statistics, respectively. Analysis of the square law devices is performed in Appendix B. The noncoherent equations are then obtained by combining (11) and (9). For the coherent analysis, perfect carrier estimation is assumed, as well as knowledge of the data symbols. This is acceptable given that tracking uses the CPICH. Both analyses also assume that $\Delta = 0.5$ chips and that square-root Nyquist pulse shaping is used, as this guarantees statistical independence of the noise terms on the post-despreader early and late branches.
Figure 5 shows the SNRs of coherent and noncoherent TED output signals for different timing errors. Simulation results are plotted along with the theoretical curves obtained from the equations in Tables 1 and 2. As can be seen, the coherent detector outperforms the noncoherent detector, particularly at low SNRs. The slight discrepancies between theoretical and empirical results are attributed to self-noise (interchip interference due to intentionally sampling too early and too late) and imperfect channel estimation in the simulations.
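The qualitative difference between the two detectors can be reproduced with a small Monte Carlo experiment. The sketch below is not the simulation behind Figure 5. It assumes conventional early-minus-late forms for both detectors, a perfectly known carrier phase for the coherent case, independent complex Gaussian noise on the two arms, and an output SNR defined as the squared mean divided by the variance. The early and late mean amplitudes (0.75 and 0.5) and the noise level are made-up values.

```python
import cmath
import math
import random


def ted_snrs(m_early, m_late, sigma, n_trials=200_000, seed=1):
    """Return (coherent, noncoherent) TED output SNRs, estimated empirically."""
    rng = random.Random(seed)
    coh, non = [], []
    for _ in range(n_trials):
        rot = cmath.exp(1j * rng.uniform(0.0, 2.0 * math.pi))  # carrier phase
        y_e = m_early * rot + complex(rng.gauss(0, sigma), rng.gauss(0, sigma))
        y_l = m_late * rot + complex(rng.gauss(0, sigma), rng.gauss(0, sigma))
        # Coherent: difference first, then derotate with the known phase.
        coh.append(((y_e - y_l) * rot.conjugate()).real)
        # Noncoherent: square each arm, then take the difference.
        non.append(abs(y_e) ** 2 - abs(y_l) ** 2)

    def snr(x):
        mean = sum(x) / len(x)
        var = sum((v - mean) ** 2 for v in x) / len(x)
        return mean * mean / var

    return snr(coh), snr(non)


snr_coh, snr_non = ted_snrs(m_early=0.75, m_late=0.5, sigma=1.0)
```

Under these assumptions the noncoherent output carries noise-times-noise terms from the squaring operation, so its SNR falls below that of the coherent detector, and the gap widens as the input SNR decreases.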

Exploiting spatial transmit diversity
A suggestion for noncoherent timing error detection which combines the signals from multiple receive antennas is given in [10]. In the present scenario, we apply this by taking early and late measurements of multiple transmit antenna CPICHs as they arrive at a single receive antenna. Consider the despreader output of the $m$th CPICH, that is, the CPICH transmitted by the $m$th antenna. Let us define a pair of vectors which contain the early and late correlation measurements from all antennas, where $M$ is the number of transmit antennas. In [10], a sample autocorrelation matrix of these measurements and its principal eigenvalues are then used to compute a measurement of the timing error.
In the present downlink scenario, the mobile station must make regular timing error measurements based on the pilot symbols for each path being tracked. This results in significant computation to maintain an estimate of the autocorrelation matrix and evaluate the principal eigenvalues. To reduce complexity and still benefit from the transmit diversity, the following scheme is suggested.
For the noncoherent TED, the timing error is simply calculated directly from the early and late measurements of all antennas. A coherent version of this structure is realised as before by explicitly removing the carrier phase and data modulation. In the present scenario, data modulation is taken into account during despreading in order to exploit the orthogonality of different CPICHs. In addition, as we are summing the outputs of multiple coherent TEDs, it is appropriate to apply maximal ratio combining at this stage. This leads to the timing error computation for a multiantenna coherent TED. Let us now consider the statistics of the multiantenna coherent TED. Taking into account the reduction in power of each antenna by a factor of $M$ and employing Table 1, the statistics of the $m$th antenna TED are obtained. We may assume that the noise terms are independent since the pilot sequences are orthogonal. Summing across all TEDs yields the statistics of $e[n]$ and hence the SNR at the output of the detector conditioned on the timing error. From this, we can see that the SNR of the multiantenna timing error measurement increases with the number of antennas.
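One plausible reading of the reduced-complexity scheme is sketched below. It is an assumption about the form of the combining rather than a transcription of the expressions used in this paper: the noncoherent detector sums the early-minus-late power differences over antennas, and the coherent detector weights each antenna's early-minus-late difference by the conjugate of its channel estimate (maximal ratio combining) before taking the real part. The function names and the numerical values are hypothetical.

```python
def noncoherent_ted(y_early, y_late):
    """Sum of per-antenna early-minus-late power differences."""
    return sum(abs(e) ** 2 - abs(l) ** 2 for e, l in zip(y_early, y_late))


def coherent_mrc_ted(y_early, y_late, alpha_hat):
    """Per-antenna early-minus-late differences, weighted by the conjugate
    channel estimates and summed (maximal ratio combining)."""
    return sum((a.conjugate() * (e - l)).real
               for a, e, l in zip(alpha_hat, y_early, y_late))


# Noise-free example with two transmit antennas and perfect channel estimates.
alpha = [0.6 + 0.2j, -0.1 + 0.9j]          # made-up per-antenna channel gains
r_early, r_late = 0.75, 0.5                # made-up early/late correlation values
y_early = [a * r_early for a in alpha]
y_late = [a * r_late for a in alpha]

e_non = noncoherent_ted(y_early, y_late)
e_coh = coherent_mrc_ted(y_early, y_late, alpha)
```

In both cases the useful part of the output scales with the total channel power summed over antennas, so a deep fade on one antenna does not by itself remove the timing information.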

Setup
This section describes a set of computer simulations which were used to evaluate the tracking jitter performance of the delay-locked loops described in Section 3.2. These simulations were configured to mimic a realistic set of downlink scenarios compliant with the UMTS specifications.
Figure 6 shows the end-to-end simulation configuration, which is in agreement with the 3GPP open-loop diversity testing procedures set out in [16]. The base station simulation was configured to yield a total output power spectral density of $I_{or} = 1$ W/Hz. The primary CPICH (P-CPICH) power level was set 10 dB below the total base station output. The remainder of the BS output was made up of data, control, and synchronisation channels according to [16].
Three distinct configurations were employed: 1 transmit antenna, 2 transmit antennas, and 4 transmit antennas, all with a single receive antenna. The total BS output power was constant across all three configurations. The signals were passed through the multiantenna multipath channel model with the power-delay profile shown in Figure 7.
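Because the total BS output power is held constant, the pilot power available from each antenna falls as antennas are added. The short calculation below illustrates this under the assumption that the P-CPICH power (10 dB below the total output) is split equally across the $M$ antennas; the function name is hypothetical.

```python
import math


def per_antenna_cpich_db(total_cpich_db, num_antennas):
    """Pilot level per antenna, in dB relative to the total BS output,
    assuming the pilot power is shared equally between antennas."""
    return total_cpich_db - 10.0 * math.log10(num_antennas)


levels = {m: per_antenna_cpich_db(-10.0, m) for m in (1, 2, 4)}
```

Under this assumption, each doubling of the number of antennas lowers the per-antenna pilot level by about 3 dB, giving roughly −13 dB for two antennas and −16 dB for four.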
As can be seen, the channel consisted of two multipath components, path 1 and path 2, with propagation delays of $\tau_1$ and $\tau_2$, respectively. We define the average relative power of a path as the average power of its complex signal envelope. The average relative power of path $p$ is defined under the assumption that the average power received through path $p$ from all transmit antennas is the same.
Path 1 was tracked by the coherent and noncoherent chip timing recovery loops described in Section 3.2. The average relative power of this path, denoted as $A_{\mathrm{dB}}$, was varied over different simulation runs in order to control the SNR. The purpose of path 2 was only to introduce multipath interference, and it was not exploited by the receivers. Note that in a full receiver path 2 would be tracked independently by a separate DLL. Its average power relative to the BS output was fixed to 0 dB.
The motivation for this setup is that a receiver must be able to track individual paths in the presence of other paths. Multipaths are the dominant source of interference in the downlink as they destroy orthogonality between users. The effect of other multipaths on a (despread) path of interest is described by (A.9) in Appendix A.

Results
Figures 8, 9, and 10 show the measured normalised mean square tracking jitter results of the simulations for 3 kph, 50 kph, and 120 kph fading channels, respectively. Uncorrelated fading waveforms were generated using the technique described in [17].
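To put the three mobile speeds into perspective, the maximum Doppler shift can be computed from the standard relation $f_d = v f_c / c$. The sketch below assumes the 2.1 GHz carrier frequency used in these simulations; the function name is hypothetical.

```python
SPEED_OF_LIGHT = 299_792_458.0  # metres per second


def max_doppler_hz(speed_kph, carrier_hz=2.1e9):
    """Maximum Doppler shift for a mobile moving at speed_kph."""
    return (speed_kph / 3.6) * carrier_hz / SPEED_OF_LIGHT


doppler = {v: max_doppler_hz(v) for v in (3, 50, 120)}
```

This gives maximum Doppler shifts of roughly 6 Hz, 97 Hz, and 233 Hz, so the fading in the slowest channel evolves about forty times more slowly than in the fastest one.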
A chip rate of 3.84 Mcps (mega chips per second) and a carrier frequency of 2.1 GHz were employed. The most obvious feature of these results is the substantial performance increase obtained in all cases by exploiting transmit antenna diversity. This result is as expected from [18]. The performance advantage due to an increased number of antennas is more pronounced in slow fading. This can be explained by the drift effect, which occurs when a channel enters a deep fade and the tracking loop is temporarily guided by the noise. In a slow channel, the deep fades are long in duration, potentially permitting the tracking device to drift further from the maximum effect point. Increasing the number of transmit antennas reduces the probability of deep fades, thereby mitigating the drift effect.
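The reduction in deep-fade probability can be quantified with a simple model. The sketch below assumes $M$ independent, unit-power Rayleigh-fading branches whose powers are summed with the total transmit power shared equally between antennas; under these assumptions the combined power follows an Erlang distribution, whose cumulative distribution function is evaluated directly. The −10 dB fade threshold and the function name are illustrative choices.

```python
import math


def deep_fade_probability(threshold, num_antennas):
    """P[(1/M) * sum_m |alpha_m|^2 < threshold] for M independent
    unit-power Rayleigh branches (Erlang cumulative distribution)."""
    x = num_antennas * threshold
    tail = sum(x ** k / math.factorial(k) for k in range(num_antennas))
    return 1.0 - math.exp(-x) * tail


threshold = 10.0 ** (-10.0 / 10.0)  # fade 10 dB below the average power
p_fade = {m: deep_fade_probability(threshold, m) for m in (1, 2, 4)}
```

With this threshold the probability of a deep fade drops from about 9.5% for one antenna to about 1.8% for two antennas and below 0.1% for four antennas, which is consistent with the reduced drift observed with more antennas.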
Another point worth noting is that the performance difference between coherent and noncoherent tracking loops tends to increase with the number of antennas. This is because of the decreased SNR at the input to each square law device in the noncoherent detector. From Figure 5, it is clear that squaring loss worsens at lower input SNRs, thereby reducing the performance of the overall detector.
Finally we note that the 4-antenna configuration is particularly poor for low SNRs.This can also be explained by squaring loss in the case of the noncoherent tracking loop and by the reduced ability to estimate the magnitude and phase of the channel accurately in the case of the coherent loop.

CONCLUSION
In this paper, we have presented coherent and noncoherent chip timing recovery loops which use simple schemes to exploit spatial transmit diversity. The motivation is to track the timing parameters of delayed signal components in a (single-antenna) 2D RAKE receiver.
It has been shown that the tracking jitter of delay-locked loops can be reduced with an increase in the number of transmit antennas. The performance gap between the coherent and noncoherent loops is also shown to increase with the number of antennas due to increased squaring loss.

APPENDICES

A. DESPREADER OUTPUT ANALYSIS
The purpose of this appendix is to derive (9), which describes the statistics of the despreader output in the presence of synchronisation errors. The received baseband signal is given by (5) and, for the case of a single transmit antenna, becomes (A.1). The component of $z[i]$ corresponding to the path of interest is given by (A.5). This is derived by substituting (1) and (3) into (A.4) and then applying the combined filtering and sampling operation defined in (A.2) to only the path of interest. $R_0(\delta)$ is defined in (10). The desired signal component, which we will denote as $y_d[i]$, is found by considering only the user of interest ($k = 1$) in (A.5) multiplied by $a_k^*[i]$.

A.2. Interference due to synchronisation errors
Although orthogonal codes are used to separate users, in the presence of synchronisation errors the user of interest effectively experiences interchip interference (ICI) from all users. This component is denoted as $y_{\mathrm{ICI}}[i]$. Using the properties of the spreading codes described in [13], the variance of the real and imaginary parts of $y_{\mathrm{ICI}}[i]$ is given by (A.8).

A.3. Multipath interference
The variance of y[i] due to other multipath components is bounded by (A.9)

A.4. Background noise
Under the assumption that the background noise is additive white Gaussian with a two-sided noise spectral density of $N_0/2$, it is easily shown that the contribution of background noise to the variance of $y[i]$ is (A.10)

A.5. Despreader output
Finally, the despreader output is found by summing $y[i]$ over the $G_k$ chips corresponding to each symbol. The mean is found from (A.6). Combining (A.8), (A.9), and (A.10), we obtain the variance and hence the effective one-sided spectral density of the combined interference terms.

B. COMPLEX SQUARING LOSS
Consider the scenario of Figure 11, where $s$ is a complex random variable with constant amplitude and a random phase. The observation of $s$ (denoted as $z$) is perturbed by a complex-valued Gaussian noise term $n$. The power of the observation is then measured by taking the square of its magnitude to obtain $y = |z|^2$, and filtering the result. We will now evaluate the first- and second-order statistics of $y$.
The observation and its magnitude squared are defined, respectively, as $z = s + n$ and $y = |z|^2 = |s|^2 + 2\,\mathrm{Re}\{s n^*\} + |n|^2$.
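The first- and second-order statistics of $y$ can be checked numerically. The sketch below assumes that $n$ is zero-mean and circularly symmetric with total power $N = E[|n|^2]$, which may be parameterised differently from the noise definition used in this appendix. Under that assumption the standard results are $E[y] = |s|^2 + N$ and $\mathrm{var}[y] = 2|s|^2 N + N^2$. The function name and parameter values are hypothetical.

```python
import cmath
import math
import random


def squaring_stats(amplitude, noise_power, n_trials=200_000, seed=3):
    """Empirical mean and variance of y = |s + n|^2."""
    rng = random.Random(seed)
    sigma = math.sqrt(noise_power / 2.0)   # standard deviation per real dimension
    ys = []
    for _ in range(n_trials):
        s = amplitude * cmath.exp(1j * rng.uniform(0.0, 2.0 * math.pi))
        z = s + complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
        ys.append(abs(z) ** 2)
    mean = sum(ys) / len(ys)
    var = sum((v - mean) ** 2 for v in ys) / len(ys)
    return mean, var


A, N = 1.0, 2.0
mean_y, var_y = squaring_stats(A, N)
mean_theory = A ** 2 + N                   # signal power plus noise power
var_theory = 2.0 * A ** 2 * N + N ** 2     # signal-noise and noise-noise terms
```

The $N^2$ term in the variance is the noise-times-noise contribution; it dominates when the input SNR is low, which is the origin of the squaring loss discussed in the main text.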

Figure 1: Baseband model of transmitter with M CPICHs transmitted on M antennas.

Figure 5: TED output SNRs for different timing errors.
Downlink channel configuration: $I_{or}$ = 1 W/Hz; P-CPICH: −10 dB (relative to total BS output); other base stations: $I_{or}$ − 9 dB.