EURASIP Journal on Applied Signal Processing 2002:3, 211–220
© 2002 Hindawi Publishing Corporation

Space-Time Codes for Wireless Optical Communications

A space-time channel coding technique is presented for overcoming turbulence-induced fading in an atmospheric optical heterodyne communication system that uses multiple transmit and receive apertures. In particular, a design criterion for minimizing the pairwise probability of codeword error in a space-time code (STC) is developed from a central limit theorem approximation. This design criterion maximizes the mean-to-standard-deviation ratio of the received energy difference between codewords. It leads to STCs that are a subset of the previously reported STCs for Rayleigh channels, namely those created from orthogonal designs. This approach also extends to other fading channels with independent, zero-mean path gains. Consequently, for large numbers of transmit and receive antennas, STCs created from orthogonal designs minimize the pairwise codeword error probability for this larger class of fading channels.


INTRODUCTION
In atmospheric optical communication, lognormal fading arising from refractive-index turbulence can make the recovery of a transmitted signal extremely difficult at the receiver. As a result, the receiver must have a redundant replica of the transmitted signal for reliable communication. Space-time codes (STCs) provide both spatial and temporal redundancy, or diversity, by using multiple apertures (antennas) over several time-slots.
Tarokh et al., in [1], established space-time code design criteria for Rayleigh and Ricean fading channels. These design criteria specify the pairwise properties of codewords from the STC. In this paper, we derive a similar design criterion for the lognormal fading channel based on a central limit theorem approximation. Our criterion leads to STCs created from orthogonal designs, a subset of the previously reported STCs for Rayleigh channels. Tarokh et al., in [2], showed that such codes have a decoding algorithm requiring only linear processing at the receiver. We show that these STCs also maximize the mean-to-standard-deviation ratio of the received energy difference between codewords, a result analogous to maximal ratio combining.
Our derivation extends to other fading channels with independent, zero-mean path gains. In other words, we show that for large numbers of transmit and receive antennas, STCs created from orthogonal designs minimize the pairwise codeword error probability regardless of the individual path-gain fading distributions.
This paper is structured as follows. Section 2 describes our channel model and the objective of space-time coding. Section 3 derives the STC design criterion for lognormal fading channels based on a central limit theorem approximation. Section 4 discusses the performance and presents an example of these STCs.

PROBLEM FORMULATION
Consider a line-of-sight atmospheric optical heterodyne communication system that uses multiple transmit and receive apertures, as shown in Figure 1. The space-time encoder maps a segment of bits from the information ... ... A laser beam propagating over a clear-weather, line-of-sight atmospheric path from transmitter exit optics to receiver entrance optics experiences amplitude and phase fluctuations due to refractive-index turbulence [3]. A propagation model, established in [4], based on the extended Huygens-Fresnel principle [5] characterizes this fading as a complex lognormal process with correlation times on the order of $10^{-3}$ to $10^{-2}$ seconds. At high data rates, a single fade can obliterate several message packets.
Because the duration of a fade is usually much longer than the length of a message packet, we will assume that the fades are constant during a codeword transmission. We will model the path gain from transmit aperture $n$ to receive aperture $m$ as $\alpha_{nm} = \exp(\chi_{nm} + j\phi_{nm})$. Here $\chi_{nm}$, $\phi_{nm}$ are independent Gaussian random variables with moments $\mathrm{var}(\chi_{nm}) = \sigma_\chi^2$, $E(\chi_{nm}) = -\sigma_\chi^2$, $\mathrm{var}(\phi_{nm}) = \sigma_\phi^2 \gg 2\pi$, and $E(\phi_{nm}) = 0$. With these moments the path gains have zero mean and unit variance. The log-amplitude variance, $\sigma_\chi^2$, typically lies within the range 0.01 (mild fading) to 0.35 (severe fading). We also assume that the spacing between elements of the receiver aperture array is large enough to ensure that the path gains for different $(n, m)$ values are approximately independent.
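The path-gain model above can be checked numerically. The following is a minimal sketch, not part of the original paper: it draws lognormal path gains and estimates their first, second, and fourth moments. The function names and the phase standard deviation `sigma_phi = 20.0` are assumptions made for illustration; the only requirement used here is that the phase spread is large enough for the phase to be effectively uniform.

```python
import cmath
import math
import random


def sample_path_gain(sigma_chi2, rng, sigma_phi=20.0):
    """Draw one path gain alpha = exp(chi + j*phi).

    chi ~ N(-sigma_chi2, sigma_chi2), so that E|alpha|^2 = 1.
    sigma_phi is an assumed (hypothetical) phase standard deviation, chosen
    large so that the phase is effectively uniform and E(alpha) is near 0.
    """
    chi = rng.gauss(-sigma_chi2, math.sqrt(sigma_chi2))
    phi = rng.gauss(0.0, sigma_phi)
    return cmath.exp(complex(chi, phi))


def empirical_moments(sigma_chi2, trials=100_000, seed=1):
    """Return sample estimates of E(alpha), E|alpha|^2 and E|alpha|^4."""
    rng = random.Random(seed)
    mean = 0j
    power = 0.0
    fourth = 0.0
    for _ in range(trials):
        a = sample_path_gain(sigma_chi2, rng)
        p = abs(a) ** 2
        mean += a
        power += p
        fourth += p * p
    return mean / trials, power / trials, fourth / trials
```

For a Gaussian log-amplitude with these moments, the fourth moment should come out close to `math.exp(4 * sigma_chi2)`, the mean close to zero, and the mean power close to one.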
We will assume optical heterodyne reception, for which the detector output is known to consist of a frequency downshifted version of the incident optical field plus an additive white Gaussian noise [6]. We will use $w_m(t)$ to denote this additive Gaussian noise for receive aperture $m$ during time slot $t$; it is a complex-valued, zero-mean, white Gaussian random process with variance $N_0/2$ per real dimension.
Combining the fading and additive noise fluctuations, the signal at receive aperture $m \in \{1, \ldots, M\}$ during time slot $t \in \{1, \ldots, T\}$ is

$$r_m(t) = \sum_{n=1}^{N} \alpha_{nm}\, c_n(t) + w_m(t). \tag{1}$$

Given a received sequence $\{r_m(t) : 1 \le m \le M,\ 1 \le t \le T\}$ and knowledge of the path gains $\alpha = \{\alpha_{nm} : 1 \le n \le N,\ 1 \le m \le M\}$, the minimum probability of error receiver chooses the codeword $c$ that minimizes

$$\sum_{t=1}^{T} \sum_{m=1}^{M} \Bigl| r_m(t) - \sum_{n=1}^{N} \alpha_{nm}\, c_n(t) \Bigr|^2. \tag{2}$$

The exact probability of error is difficult to calculate for an STC with more than two codewords. An upper bound on this probability of error comes from the union bound

$$P_e \le \sum_{c} \Pr(c) \sum_{e \ne c} \Pr(c \to e), \tag{3}$$

where $\Pr(c \to e)$ is the probability of decoding codeword $c$ as codeword $e$ in the absence of all other codewords. This sum is usually dominated by the terms of the closest, or minimum distance, codeword pairs. The union bound estimate of the codeword error probability is the sum of pairwise error probabilities of the minimum distance codeword pairs

$$P_e \approx K_{\min} \Pr(c \to e)_{\min}, \tag{4}$$

where $K_{\min}$ is the average number of minimum-distance codeword neighbors and $\Pr(c \to e)_{\min}$ is the pairwise probability of erroneously decoding a pair of minimum distance codewords.
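Given knowledge of the path gains and assuming equally likely codewords, the pairwise probability of incorrectly decoding transmitted codeword $c$ as codeword $e$ is

$$\Pr(c \to e \mid \alpha) = Q\!\left( \sqrt{\frac{d^2(c, e)}{2 N_0}} \right), \tag{5}$$

where

$$d^2(c, e) = \sum_{m=1}^{M} \sum_{t=1}^{T} \Bigl| \sum_{n=1}^{N} \alpha_{nm} \bigl( c_n(t) - e_n(t) \bigr) \Bigr|^2 \tag{6}$$

is the squared distance between codewords at the receiver, and $Q(x)$ is the area under the tail of the standard normal density function. Averaging over $\alpha$, the unconditional probability of incorrectly decoding $c$ as $e$ is therefore

$$\Pr(c \to e) = \int Q\!\left( \sqrt{\frac{d^2(c, e)}{2 N_0}} \right) p_\alpha(\alpha)\, d\alpha, \tag{7}$$

where $p_\alpha(\alpha)$ is the joint probability density function of the lognormal path gains.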
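As an illustration of the conditional pairwise error probability, the sketch below evaluates it for a given set of path gains. It assumes the standard form $Q(\sqrt{d^2(c,e)/(2N_0)})$, which is what noise of variance $N_0/2$ per real dimension implies for a binary decision between two known signals. The array layout (`alpha[n][m]`, `c[n][t]`) and function names are choices made here, not notation from the paper.

```python
import math


def q_function(x):
    """Area under the tail of the standard normal density, Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def distance_sq(alpha, c, e):
    """Squared received distance between codewords c and e.

    alpha[n][m]      : path gain from transmit aperture n to receive aperture m
    c[n][t], e[n][t] : symbols sent on aperture n during time slot t
    """
    n_tx, n_rx, n_slots = len(alpha), len(alpha[0]), len(c[0])
    total = 0.0
    for m in range(n_rx):
        for t in range(n_slots):
            s = sum(alpha[n][m] * (c[n][t] - e[n][t]) for n in range(n_tx))
            total += abs(s) ** 2
    return total


def pairwise_error_given_gains(alpha, c, e, n0):
    """Conditional pairwise error probability for known path gains."""
    return q_function(math.sqrt(distance_sq(alpha, c, e) / (2.0 * n0)))
```

With a single unit-gain path and antipodal symbols `c = [[1.0]]`, `e = [[-1.0]]`, the squared distance is 4, so for `n0 = 1.0` the result is $Q(\sqrt{2})$.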
Our ultimate objective is to construct a space-time code that minimizes the exact probability of error, $P_e$. In this paper, however, we will focus on minimizing $\Pr(c \to e)_{\min}$ in the union bound estimate of this probability.

DESIGN CRITERION
The integral in (7) is very difficult to evaluate analytically because of the lognormal density function. We will attempt to simplify its evaluation using a central limit theorem (CLT) approximation.
Rewriting (6) as

$$d^2(c, e) = \sum_{m=1}^{M} \sum_{n=1}^{N} \sum_{k=1}^{N} \alpha_{nm}\, \alpha_{km}^{*}\, A_{nk}, \tag{8}$$

where

$$A_{nk} = \sum_{t=1}^{T} \bigl( c_n(t) - e_n(t) \bigr) \bigl( c_k(t) - e_k(t) \bigr)^{*}, \tag{9}$$

shows that $d^2(c, e)$ is the sum of $MN^2$ complex lognormal random variables. Because the coefficients $\{A_{nk} : 1 \le n, k \le N\}$ and the central moments are bounded, no single term dominates the sum. Thus, we will use the central limit theorem to approximate its distribution as a truncated Gaussian with mean $\mu$ and variance $\sigma^2$ on the interval $d^2(c, e) \ge 0$. Using this approximation, we can rewrite (7) as

$$\Pr(c \to e) \approx \int_0^\infty Q\!\left( \sqrt{\frac{x}{2 N_0}} \right) p(x)\, dx, \tag{10}$$

where

$$p(x) = \frac{\exp\!\bigl( -(x - \mu)^2 / (2\sigma^2) \bigr)}{\sqrt{2\pi\sigma^2}\; Q(-\mu/\sigma)}, \qquad x \ge 0. \tag{11}$$

Define $A$ as the matrix with $A_{nk}$ as its $nk$th element. This matrix characterizes the relationship between codeword pairs of the space-time code. Our goal is to derive properties of $A$ that minimize the CLT approximation to $\Pr(c \to e)$. We will do so by expressing (10) as a function of two normalized parameters that measure the fading strength and the signal-to-noise ratio. We then find bounds on the normalized fading strength based on the design matrix $A$. We conjecture that (10) is unimodal as a function of this fluctuation strength. We then show that for large numbers of transmit and receive apertures, minimizing the normalized fading strength, or equivalently choosing $A$ to be a scaled identity matrix, minimizes the CLT approximation to the pairwise probability of error.
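To make the role of the design matrix concrete, the following sketch builds $A = BB^\dagger$ from a codeword pair, where $B$ has $nt$th element $c_n(t) - e_n(t)$, and evaluates the squared distance both directly and as a quadratic form in the path gains. The helper names and the example values in the usage note are hypothetical.

```python
def design_matrix(c, e):
    """A[n][k] = sum_t (c_n(t) - e_n(t)) * conj(c_k(t) - e_k(t))."""
    n_tx, n_slots = len(c), len(c[0])
    b = [[complex(c[n][t] - e[n][t]) for t in range(n_slots)]
         for n in range(n_tx)]
    return [[sum(b[n][t] * b[k][t].conjugate() for t in range(n_slots))
             for k in range(n_tx)]
            for n in range(n_tx)]


def distance_sq_direct(alpha, c, e):
    """Sum over m and t of |sum_n alpha[n][m] * (c[n][t] - e[n][t])|^2."""
    n_tx, n_rx, n_slots = len(alpha), len(alpha[0]), len(c[0])
    total = 0.0
    for m in range(n_rx):
        for t in range(n_slots):
            s = sum(alpha[n][m] * (c[n][t] - e[n][t]) for n in range(n_tx))
            total += abs(s) ** 2
    return total


def distance_sq_quadratic(alpha, a):
    """Sum over m, n, k of alpha[n][m] * conj(alpha[k][m]) * A[n][k]."""
    n_tx, n_rx = len(alpha), len(alpha[0])
    total = 0j
    for m in range(n_rx):
        for n in range(n_tx):
            for k in range(n_tx):
                total += alpha[n][m] * complex(alpha[k][m]).conjugate() * a[n][k]
    return total.real  # A is Hermitian, so the imaginary part cancels
```

For any gains and any codeword pair the two distance functions should agree up to rounding, which is a quick way to confirm that an implementation of $A$ uses the right conjugation convention.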

Normalized parameters
Our first step in minimizing (10) is to rewrite $\Pr(c \to e)$ in terms of normalized parameters. The first normalized parameter measures the strength of the fading. Define the normalized fading strength, $\eta$, to be the standard-deviation-to-mean ratio of the energy difference between the codewords at the receiver, that is, $\eta = \sigma/\mu$. The second normalized parameter measures the total received signal-to-noise ratio and is defined as $\rho = \mu/N_0$.
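With the change of variables $z = x/\mu$, (10) becomes

$$\Pr(c \to e;\, \eta^2, \rho) = \int_0^\infty Q\!\left( \sqrt{\tfrac{1}{2} \rho z} \right) p_Z(z)\, dz, \tag{12}$$

where

$$p_Z(z) = \frac{\exp\!\bigl( -(z - 1)^2 / (2\eta^2) \bigr)}{\sqrt{2\pi\eta^2}\; Q(-1/\eta)}, \qquad z \ge 0, \tag{13}$$

and $Z$ is a Gaussian random variable with unit mean and variance $\eta^2$, truncated to $z \ge 0$.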
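The normalized integral has no closed form, but it is easy to evaluate numerically. The sketch below uses a plain midpoint rule over $0 \le z \le 1 + 10\eta$; it assumes the integrand is $Q(\sqrt{\rho z/2})$ weighted by a unit-mean Gaussian density of variance $\eta^2$ truncated to $z \ge 0$. The truncation point, the step count, and the function names are choices made here for illustration.

```python
import math


def q_function(x):
    """Area under the tail of the standard normal density, Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def truncated_gaussian_pdf(z, eta2):
    """Unit-mean Gaussian density with variance eta2, truncated to z >= 0."""
    eta = math.sqrt(eta2)
    norm = math.sqrt(2.0 * math.pi * eta2) * q_function(-1.0 / eta)
    return math.exp(-(z - 1.0) ** 2 / (2.0 * eta2)) / norm


def pairwise_error_clt(eta2, rho, steps=20000):
    """Midpoint-rule estimate of the CLT pairwise error probability."""
    upper = 1.0 + 10.0 * math.sqrt(eta2)  # density is negligible beyond this
    h = upper / steps
    total = 0.0
    for i in range(steps):
        z = (i + 0.5) * h
        total += q_function(math.sqrt(0.5 * rho * z)) * truncated_gaussian_pdf(z, eta2)
    return total * h
```

Two sanity checks follow directly from the integrand: at $\rho = 0$ the result should be one-half for any $\eta$, and for very small $\eta$ the density collapses onto $z = 1$, so the result should approach $Q(\sqrt{\rho/2})$.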

Mean and variance calculations
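To approximate $d^2(c, e)$ as Gaussian, we must first determine its mean $\mu$ and variance $\sigma^2$. Notice that because the path gains are independent with zero mean and unit variance, $E(\alpha_{nm} \alpha_{km}^{*}) = \delta_{nk}$, so that

$$\mu = E\bigl[ d^2(c, e) \bigr] = M\, \mathrm{tr}(A), \tag{14}$$

where $\mathrm{tr}(A) = \sum_{n=1}^{N} A_{nn}$. We define the energy difference between transmitted codewords as

$$E_d = \mathrm{tr}(A) = \sum_{n=1}^{N} \sum_{t=1}^{T} \bigl| c_n(t) - e_n(t) \bigr|^2. \tag{15}$$

We can then express the total signal-to-noise ratio, $\rho$, as the sum of signal-to-noise ratios at each receive aperture, that is, $\rho = M E_d / N_0 = M\,\mathrm{SNR}$, where $\mathrm{SNR} = E_d / N_0$ is the signal-to-noise ratio at each receive aperture.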
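The second moment of $d^2(c, e)$ is

$$E\bigl[ d^4(c, e) \bigr] = \sum_{m_1=1}^{M} \sum_{m_2=1}^{M} \sum_{n_1, k_1} \sum_{n_2, k_2} E\bigl( \alpha_{n_1 m_1} \alpha_{k_1 m_1}^{*} \alpha_{n_2 m_2} \alpha_{k_2 m_2}^{*} \bigr) A_{n_1 k_1} A_{n_2 k_2}. \tag{16}$$

To evaluate this summation, we can split it into two cases.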
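For $m_1 \ne m_2$, we have

$$E\bigl( \alpha_{n_1 m_1} \alpha_{k_1 m_1}^{*} \alpha_{n_2 m_2} \alpha_{k_2 m_2}^{*} \bigr) = \delta_{n_1 k_1} \delta_{n_2 k_2}. \tag{17}$$

When $m_1 = m_2 = m$, we find that

$$E\bigl( \alpha_{n_1 m} \alpha_{k_1 m}^{*} \alpha_{n_2 m} \alpha_{k_2 m}^{*} \bigr) = \begin{cases} e^{4\sigma_\chi^2}, & n_1 = k_1 = n_2 = k_2, \\ 1, & n_1 = k_1 \ne n_2 = k_2 \ \text{or}\ n_1 = k_2 \ne n_2 = k_1, \\ 0, & \text{otherwise}. \end{cases} \tag{18}$$

From these results it follows that the second moment of $d^2(c, e)$ is

$$E\bigl[ d^4(c, e) \bigr] = M^2\, \mathrm{tr}(A)^2 + M \Bigl[ \bigl( e^{4\sigma_\chi^2} - 2 \bigr) \sum_{n=1}^{N} A_{nn}^2 + \sum_{n=1}^{N} \sum_{k=1}^{N} |A_{nk}|^2 \Bigr], \tag{19}$$

whence

$$\sigma^2 = M \Bigl[ \bigl( e^{4\sigma_\chi^2} - 2 \bigr) \sum_{n=1}^{N} A_{nn}^2 + \sum_{n=1}^{N} \sum_{k=1}^{N} |A_{nk}|^2 \Bigr]. \tag{20}$$

Notice that we have assumed that the path gains are lognormally distributed, but we have only used the fact that they are independent and identically distributed with zero mean, unit variance, and finite fourth moment. Therefore, our method and results extend to all fading distributions that satisfy these weaker conditions.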
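The mean and variance of $d^2(c, e)$ can be cross-checked by simulation. The sketch below assumes the closed forms $\mu = M\,\mathrm{tr}(A)$ and $\sigma^2 = M[(e^{4\sigma_\chi^2} - 2)\sum_n A_{nn}^2 + \sum_{n,k} |A_{nk}|^2]$, which follow from independent, zero-mean, unit-variance path gains with fourth moment $e^{4\sigma_\chi^2}$ and effectively uniform phase. The phase standard deviation `sigma_phi`, the trial count, and the function names are assumptions made for this example.

```python
import cmath
import math
import random


def clt_mean_variance(a, n_rx, sigma_chi2):
    """Closed-form mean and variance of d^2 for a Hermitian design matrix a."""
    n_tx = len(a)
    trace = sum(complex(a[n][n]).real for n in range(n_tx))
    diag_sq = sum(complex(a[n][n]).real ** 2 for n in range(n_tx))
    frob_sq = sum(abs(a[n][k]) ** 2 for n in range(n_tx) for k in range(n_tx))
    mu = n_rx * trace
    var = n_rx * ((math.exp(4.0 * sigma_chi2) - 2.0) * diag_sq + frob_sq)
    return mu, var


def simulate_mean_variance(a, n_rx, sigma_chi2, trials=40_000, seed=7,
                           sigma_phi=20.0):
    """Monte Carlo estimate of the mean and variance of d^2.

    sigma_phi is a hypothetical phase spread, large enough that the phase
    of each path gain is effectively uniform.
    """
    rng = random.Random(seed)
    n_tx = len(a)
    sd = math.sqrt(sigma_chi2)
    s1 = s2 = 0.0
    for _ in range(trials):
        d2 = 0.0
        for _m in range(n_rx):
            alpha = [cmath.exp(complex(rng.gauss(-sigma_chi2, sd),
                                       rng.gauss(0.0, sigma_phi)))
                     for _ in range(n_tx)]
            d2 += sum((alpha[n] * alpha[k].conjugate() * a[n][k]).real
                      for n in range(n_tx) for k in range(n_tx))
        s1 += d2
        s2 += d2 * d2
    mean = s1 / trials
    return mean, s2 / trials - mean * mean
```

For example, with the Hermitian, positive-definite matrix `[[2+0j, 0.5+0.5j], [0.5-0.5j, 1+0j]]`, two receive apertures, and `sigma_chi2 = 0.05`, the closed form gives a mean of 6 and the simulated values should land within a few percent of both moments.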

Bounds on the normalized fading fluctuation
The $\mu$ and $\sigma$ that we have found are tied to the design matrix $A$ by (14) and (20), respectively. We will now derive bounds on their ratio $\eta = \sigma/\mu$ expressed in terms of $A$.
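A lower bound, obtained via the Cauchy-Schwarz inequality, is

$$\eta^2 = \frac{\bigl( e^{4\sigma_\chi^2} - 2 \bigr) \sum_{n} A_{nn}^2 + \sum_{n} \sum_{k} |A_{nk}|^2}{M \bigl( \sum_{n} A_{nn} \bigr)^2} \ge \frac{\bigl( e^{4\sigma_\chi^2} - 2 \bigr) \sum_{n} A_{nn}^2 + \sum_{n} \sum_{k} |A_{nk}|^2}{M N \sum_{n} A_{nn}^2}. \tag{21}$$

Equality holds in (21) when $A_{nn} = \beta$, $n = 1, \ldots, N$, for some positive real number $\beta$. Furthermore, setting $A_{nk} = 0$ for $n \ne k$ minimizes the numerator in (21). Thus we get the bound

$$\eta^2 \ge \frac{e^{4\sigma_\chi^2} - 1}{M N}, \tag{22}$$

with equality when $A = \beta I$, where $I$ is the $N \times N$ identity matrix. Also, orthogonal designs [2] provide a method to construct STCs that satisfy the design criterion $A = (E_d/N) I$ and provide easy decoding at the receiver. Therefore, STCs created from orthogonal designs maximize the mean-to-standard-deviation ratio of the received energy difference between codewords.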
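We start the upper bound derivation by noticing that $A$ is positive semi-definite [7] because it has an $N \times T$ square-root matrix $B$ with $nt$th element $c_n(t) - e_n(t)$ such that $A = BB^\dagger$ [1]. Let $\lambda_1, \ldots, \lambda_N$ denote the nonnegative eigenvalues of $A$. For $e^{4\sigma_\chi^2} - 2 \ge 0$, an upper bound on $\eta$ is found as follows:

$$\eta^2 = \frac{\bigl( e^{4\sigma_\chi^2} - 2 \bigr) \sum_{n} A_{nn}^2 + \mathrm{tr}(A^2)}{M\, \mathrm{tr}(A)^2} \le \frac{\bigl( e^{4\sigma_\chi^2} - 1 \bigr)\, \mathrm{tr}(A^2)}{M\, \mathrm{tr}(A)^2}, \tag{23}$$

with equality when $A$ is a diagonal matrix. Using $\mathrm{tr}(A^2) = \sum_{n} \lambda_n^2$ and $\mathrm{tr}(A) = \sum_{n} \lambda_n$, we obtain

$$\eta^2 \le \frac{\bigl( e^{4\sigma_\chi^2} - 1 \bigr) \sum_{n} \lambda_n^2}{M \bigl( \sum_{n} \lambda_n \bigr)^2} \le \frac{e^{4\sigma_\chi^2} - 1}{M}, \tag{24}$$

with equality when $A$ is a diagonal matrix of rank one. The last inequality follows from

$$\sum_{n=1}^{N} \lambda_n^2 \le \Bigl( \sum_{n=1}^{N} \lambda_n \Bigr)^2, \tag{25}$$

which is met with equality when exactly one of the eigenvalues is nonzero. For $e^{4\sigma_\chi^2} - 2 < 0$, an upper bound on $\eta$ is found by suppressing the first term in $\sigma^2$:

$$\eta^2 \le \frac{\mathrm{tr}(A^2)}{M\, \mathrm{tr}(A)^2} \le \frac{1}{M}. \tag{26}$$

Equality in (26) requires $A$ to be rank one with all its diagonal terms equal to zero, an impossibility if $A$ is positive semidefinite.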
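The bounds on $\eta$ are then

$$\frac{e^{4\sigma_\chi^2} - 1}{M N} \le \eta^2 \le \frac{\max\bigl( e^{4\sigma_\chi^2} - 1,\ 1 \bigr)}{M}. \tag{27}$$

The lower bound is achieved when $A = \beta I$. When $e^{4\sigma_\chi^2} - 1 \ge 1$, the upper bound is achieved when $A$ has only one nonzero diagonal element. The upper bound is unachieved when $e^{4\sigma_\chi^2} - 1 < 1$.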
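The bounds on the normalized fading strength can be explored numerically. The sketch below computes $\eta^2 = \sigma^2/\mu^2$ for a given Hermitian design matrix, assuming $\mu = M\,\mathrm{tr}(A)$ and $\sigma^2 = M[(e^{4\sigma_\chi^2} - 2)\sum_n A_{nn}^2 + \sum_{n,k}|A_{nk}|^2]$, and compares it with the interval $[(e^{4\sigma_\chi^2} - 1)/(MN),\ \max(e^{4\sigma_\chi^2} - 1, 1)/M]$. The function names are hypothetical.

```python
import math


def eta2_from_design(a, n_rx, sigma_chi2):
    """Normalized fading strength eta^2 = sigma^2 / mu^2 for design matrix a."""
    n_tx = len(a)
    trace = sum(complex(a[n][n]).real for n in range(n_tx))
    diag_sq = sum(complex(a[n][n]).real ** 2 for n in range(n_tx))
    frob_sq = sum(abs(a[n][k]) ** 2 for n in range(n_tx) for k in range(n_tx))
    var = n_rx * ((math.exp(4.0 * sigma_chi2) - 2.0) * diag_sq + frob_sq)
    return var / (n_rx * trace) ** 2


def eta2_bounds(n_tx, n_rx, sigma_chi2):
    """Lower and upper bounds on eta^2 over all admissible design matrices."""
    g = math.exp(4.0 * sigma_chi2) - 1.0
    return g / (n_rx * n_tx), max(g, 1.0) / n_rx
```

A scaled identity matrix should sit exactly on the lower bound, while in severe fading (for instance `sigma_chi2 = 0.35`, where $e^{4\sigma_\chi^2} - 1 \approx 3.06$) a matrix with a single nonzero diagonal entry should sit on the upper bound.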

Minimizing the probability of codeword error
To our knowledge, the codeword probability of error in (12) does not have a closed-form solution. In this section, we will analyze its asymptotic behavior, and conjecture that it is unimodal as a function of $\eta^2$, that is, it has only one extremum, a maximum, for a fixed $\rho$. First, we will fix a value for $\eta$ and examine the behavior of $\Pr(c \to e;\, \eta^2, \rho)$ as we vary $\rho$. We saw in Section 3.3 that $\eta$ is closely tied to the STC design matrix; therefore, fixing a value of $\eta$ is in essence fixing a design matrix.
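For small values of $\rho$, the probability of codeword error approaches one-half, that is,

$$\lim_{\rho \to 0} \Pr(c \to e;\, \eta^2, \rho) = \frac{1}{2}. \tag{28}$$

As $\rho$ increases without bound, $\Pr(c \to e;\, \eta^2, \rho)$ decays as $1/\rho$, namely,

$$\Pr(c \to e;\, \eta^2, \rho) \approx p_Z(0) \int_0^\infty Q\!\left( \sqrt{\tfrac{1}{2} \rho z} \right) dz = \frac{p_Z(0)}{\rho}, \tag{29}$$

where $p_Z(0)$ is the density of $Z$ evaluated at $z = 0$, and $\int_0^\infty Q\bigl( \sqrt{(1/2) \rho z} \bigr)\, dz = 1/\rho$ using integration by parts.

We will now fix the total receiver signal-to-noise ratio, $\rho$, and determine the probability of codeword error for different values of normalized fading fluctuation, $\eta$, or equivalently, for different design matrices. As $\eta$ approaches zero, the Gaussian probability density function in (12) becomes sharply peaked around the value $z = 1$. This sampling-like behavior results in

$$\lim_{\eta \to 0} \Pr(c \to e;\, \eta^2, \rho) = Q\!\left( \sqrt{\tfrac{1}{2} \rho} \right). \tag{30}$$

As $\eta$ increases without bound,

$$\lim_{\eta \to \infty} \Pr(c \to e;\, \eta^2, \rho) = 0, \tag{31}$$

because the Gaussian density approaches zero for large values of $\eta$.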
The behavior of (12) for intermediate values of $\eta$ is more difficult to evaluate analytically. We will, therefore, make the following conjecture as supported by numerical evaluations of $\Pr(c \to e;\, \eta^2, \rho)$.

Conjecture 1. For $0 < \eta < \infty$, $\Pr(c \to e;\, \eta^2, \rho)$ has only one extremum, a maximum, for a given value of $\rho$.

Plots of (12) for different values of $\rho$ are shown in Figure 2 to support this conjecture.
Assuming that $\Pr(c \to e;\, \eta^2, \rho)$ is unimodal in $\eta^2$, its minimum must occur on the boundary of the allowable range for $\eta$ in (27). In other words, if

$$\Pr\!\left( c \to e;\ \frac{e^{4\sigma_\chi^2} - 1}{M N},\ \rho \right) \le \Pr\!\left( c \to e;\ \frac{\max\bigl( e^{4\sigma_\chi^2} - 1,\ 1 \bigr)}{M},\ \rho \right), \tag{32}$$

then the optimal design criterion, in terms of minimizing the pairwise probability of codeword error, is $A = (E_d/N) I$, because this design matrix meets the lower bound of $\eta$ with equality. When (32) does not hold, and $e^{4\sigma_\chi^2} - 1 \ge 1$, then the optimal design criterion is to choose $A$ to be all zero except for a single nonzero diagonal element. This design matrix, however, violates the CLT assumption that no single term dominates the summation in (8). Figure 3 shows the bounds on $\eta$ and the probability of codeword error curve.
In the central limit theorem regime, the values of $M$ and $N$ must be large in order for $d^2(c, e)$ to be approximately Gaussian. We have also observed through numerical [...] plots, we conclude that in the central limit theorem regime (large values of $M$ and $N$), $A = (E_d/N) I$ is the optimal design matrix.

PERFORMANCE
In this section, we address the validity of the central limit theorem approximation and the performance of STCs on lognormal channels.

Performance bounds for orthogonal design STCs
We will now derive the pairwise probability of decoding codeword $c$ as codeword $e$ assuming that the space-time code satisfies the design criterion $A = (E_d/N) I$, but without using the central limit theorem approximation. Under this design criterion, $d^2(c, e)$ becomes

$$d^2(c, e) = \frac{E_d}{N} \sum_{m=1}^{M} \sum_{n=1}^{N} |\alpha_{nm}|^2 = \frac{E_d}{N} \sum_{k=1}^{MN} e^{2\chi_k}, \tag{33}$$

where $\chi_k$, $k = 1, \ldots, MN$, are independent, identically distributed Gaussian random variables with $\mathrm{var}(\chi_k) = \sigma_\chi^2$ and $E(\chi_k) = -\sigma_\chi^2$. Define $\chi = (\chi_1, \ldots, \chi_{MN})$. The probability of decoding $c$ as $e$ is then

$$\Pr(c \to e) = \int Q\!\left( \sqrt{\frac{E_d}{2 N N_0} \sum_{k=1}^{MN} e^{2\chi_k}} \right) p_\chi(\chi)\, d\chi, \tag{34}$$

where $p_\chi(\chi)$ is the multivariate Gaussian probability density function for $\chi$. Using the bound $Q(x) \le \exp(-x^2/2)/2$ gives the upper bound

$$\Pr(c \to e) \le \frac{1}{2} \left[ \mathrm{Fr}\!\left( \frac{E_d}{4 N N_0},\ 0;\ \sigma_\chi \right) \right]^{MN}, \tag{35}$$

where $\mathrm{Fr}(a, 0; b)$ is the lognormal density frustration function given by

$$\mathrm{Fr}(a, 0; b) = \int_{-\infty}^{\infty} \exp\!\bigl( -a e^{2u} \bigr)\, \frac{1}{\sqrt{2\pi b^2}} \exp\!\left( -\frac{(u + b^2)^2}{2 b^2} \right) du. \tag{36}$$

Using the bound $Q(x) \ge \exp(-x^2)/4$ gives a similar lower bound

$$\Pr(c \to e) \ge \frac{1}{4} \left[ \mathrm{Fr}\!\left( \frac{E_d}{2 N N_0},\ 0;\ \sigma_\chi \right) \right]^{MN}. \tag{37}$$

A closed form evaluation of the frustration function does not exist; therefore, we use a saddle-point integration method developed by Halme in [8] to numerically evaluate it.

For the design criterion $A = (E_d/N) I$, Figure 7 compares the probability of codeword error in (34), the central limit theorem approximation probability of codeword error in (12), its asymptotic behavior in (29), and the frustration function bounds in (35) and (37). This figure shows that for small values of SNR, or typical values of error probability, the CLT approximation seems valid for $MN = 16$ in moderate fading. Asymptotically, however, the CLT probability of codeword error decays slower than the actual error probability. From (29), we know that the CLT probability of codeword error decays as 1/SNR, whereas the frustration function bounds suggest the actual curve decays faster. This discrepancy arises from the dissimilarities in the tails of the Gaussian distribution and the actual distribution as emphasized by large values of SNR.
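The exact error probability for the orthogonal-design criterion can also be estimated by straightforward Monte Carlo averaging, which gives a feel for how tight the exponential bounds on $Q(x)$ are. The sketch below is not the saddle-point method of [8]; it assumes $d^2(c, e) = (E_d/N)\sum_k e^{2\chi_k}$ with $\chi_k \sim N(-\sigma_\chi^2, \sigma_\chi^2)$, and it averages $Q(x)$ together with the two bounds $\exp(-x^2/2)/2$ and $\exp(-x^2)/4$ over the same samples. The function name, trial count, and seed are choices made here.

```python
import math
import random


def q_function(x):
    """Area under the tail of the standard normal density, Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def orthogonal_design_error(snr, n_tx, n_rx, sigma_chi2, trials=20_000, seed=3):
    """Monte Carlo estimates of (lower bound, pairwise error, upper bound).

    snr is E_d / N_0 as a linear ratio (not in dB).
    """
    rng = random.Random(seed)
    sd = math.sqrt(sigma_chi2)
    lower = exact = upper = 0.0
    for _ in range(trials):
        # Sum of MN independent lognormal power gains exp(2 * chi_k).
        gain = sum(math.exp(2.0 * rng.gauss(-sigma_chi2, sd))
                   for _ in range(n_tx * n_rx))
        x2 = snr * gain / (2.0 * n_tx)  # squared argument of Q
        lower += 0.25 * math.exp(-x2)
        exact += q_function(math.sqrt(x2))
        upper += 0.5 * math.exp(-0.5 * x2)
    return lower / trials, exact / trials, upper / trials
```

Because both bounds hold for every sample, the three returned averages are always ordered. A plain Monte Carlo average of this kind becomes unreliable at very low error probabilities, where few samples fall in the deep-fade region that dominates the average.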
To measure the validity of the central limit theorem approximation, we examined the difference in SNR between the error probability expression in (34) and its approximation in (12) at a given error probability. For example, in Figure 7 for an error probability of $\Pr(c \to e) = 10^{-6}$, the CLT approximation requires 0.5 dB more SNR than the actual lognormal curve. Figure 8 shows this spurious SNR for different aperture products ($MN$) in different fading environments ($\sigma_\chi^2 = 0.01, 0.1, 0.35$). From Figure 8, we see that the CLT approximation is accurate to fractions of a dB in mild fading environments ($\sigma_\chi^2 = 0.01$) for all values of $MN \ge 2$. A larger number of apertures is required for more severe fading (roughly, $MN > 16$ for $\sigma_\chi^2 = 0.1$ and $MN > 64$ for $\sigma_\chi^2 = 0.35$).

A lower bound on the probability of codeword error
In Section 4.1, we derived lower and upper bounds on the probability of incorrectly decoding codeword $c$ as codeword $e$ under the design criterion $A = (E_d/N) I$ without using the central limit theorem approximation for $d^2(c, e)$. In this section, we derive a lower bound on this probability of error without using the central limit theorem approximation that is valid for an arbitrary design matrix $A$. Using the Cauchy-Schwarz inequality on (6) gives

$$d^2(c, e) \le E_d \sum_{m=1}^{M} \sum_{n=1}^{N} |\alpha_{nm}|^2 = E_d \sum_{k=1}^{MN} e^{2\chi_k}, \tag{38}$$

where we have renumbered the sum of the $MN$ independent lognormal random variables as in Section 4.1. Following a similar derivation to that in Section 4.1, a lower bound on the probability of error for any design matrix is

$$\Pr(c \to e) \ge \frac{1}{4} \left[ \mathrm{Fr}\!\left( \frac{E_d}{2 N_0},\ 0;\ \sigma_\chi \right) \right]^{MN}. \tag{39}$$

For a large number of transmit apertures, $N$, this bound can be quite loose; compare the orthogonal design bound in (37).

Figure 8: Difference in SNR required to achieve a $10^{-6}$ pairwise codeword error probability between the actual lognormal error expression in (34) and its central limit theorem approximation in (12). (Axes: spurious SNR [dB] versus aperture product.)

Infinite transmit diversity performance limit
If we fix the energy difference between codewords, $E_d$, and have enough receive apertures, $M$, such that (32) holds, then $A = (E_d/N) I$ minimizes the pairwise error probability, and this design matrix gives $\eta^2 = (\exp(4\sigma_\chi^2) - 1)/(MN)$. As we increase the number of transmit apertures, $N$, we see that $\eta$ approaches zero, and hence (30) provides a performance limit for infinite transmit diversity, that is,

$$\lim_{N \to \infty} \Pr(c \to e) = Q\!\left( \sqrt{\tfrac{1}{2} M\,\mathrm{SNR}} \right).$$

These limits appear as circles in Figure 2 for $M\,\mathrm{SNR}$ = 8, 13, 15, 18 dB. One can view this limit as the error probability of a one transmit, $M$ receive aperture system with no fading. In other words, the large number of transmit apertures mitigates the fading, and the only uncertainty in the decision process arises from the additive white Gaussian noise.
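As a small numerical illustration of this limit, the sketch below evaluates $Q(\sqrt{M\,\mathrm{SNR}/2})$ at the four values of $M\,\mathrm{SNR}$ quoted above and shows how $\eta^2 = (e^{4\sigma_\chi^2} - 1)/(MN)$ shrinks as transmit apertures are added. The limiting expression is the one obtained by setting $z = 1$ in the normalized integrand with $\rho = M\,\mathrm{SNR}$; the function names are hypothetical.

```python
import math


def q_function(x):
    """Area under the tail of the standard normal density, Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def infinite_diversity_limit(m_snr_db):
    """No-fading pairwise error Q(sqrt(M*SNR/2)) for M*SNR given in dB."""
    m_snr = 10.0 ** (m_snr_db / 10.0)
    return q_function(math.sqrt(0.5 * m_snr))


def eta2_orthogonal(n_tx, n_rx, sigma_chi2):
    """Normalized fading strength for the design matrix A = (E_d/N) I."""
    return (math.exp(4.0 * sigma_chi2) - 1.0) / (n_rx * n_tx)


if __name__ == "__main__":
    for db in (8, 13, 15, 18):
        print(db, "dB ->", infinite_diversity_limit(db))
    for n_tx in (1, 4, 16, 64):
        print("N =", n_tx, "eta^2 =", eta2_orthogonal(n_tx, 2, 0.1))
```

The first loop reproduces the no-fading floor for each operating point; the second shows $\eta^2$ falling as $1/N$ for fixed $M$ and $\sigma_\chi^2$.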

An orthogonal design example: the Alamouti scheme
Alamouti in [9] proposed a simple transmit diversity technique using two transmit apertures ($N = 2$), two time-slots ($T = 2$), $M$ receive apertures, and a complex QAM signal constellation of size $2^b$. During the first time-slot, $2b$ bits arrive, determining two signal constellation points, $s_1$ and $s_2$, that are transmitted simultaneously on the first and second apertures, respectively. During the second time-slot, the first aperture transmits $-s_2^*$, while the second sends $s_1^*$. In other words, this STC consists of all the codewords of the form

$$c = \begin{pmatrix} s_1 & -s_2^* \\ s_2 & s_1^* \end{pmatrix},$$

where rows index the transmit apertures, columns index the time-slots, and $s_1$ and $s_2$ range over all possible signal constellation points. Tarokh et al. in [2] showed that the Alamouti scheme is an example of an STC created from a complex orthogonal design.
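The orthogonality of the Alamouti code is easy to verify numerically. The sketch below builds two codewords from the transmission rule described above (rows are apertures, columns are time-slots) and forms the design matrix $A_{nk} = \sum_t (c_n(t) - e_n(t))(c_k(t) - e_k(t))^*$. The function names and the QPSK-like symbol values in the usage note are hypothetical.

```python
def alamouti_codeword(s1, s2):
    """Rows are transmit apertures, columns are time-slots."""
    s1, s2 = complex(s1), complex(s2)
    return [[s1, -s2.conjugate()],
            [s2, s1.conjugate()]]


def design_matrix(c, e):
    """A[n][k] = sum_t (c_n(t) - e_n(t)) * conj(c_k(t) - e_k(t))."""
    n_tx, n_slots = len(c), len(c[0])
    b = [[c[n][t] - e[n][t] for t in range(n_slots)] for n in range(n_tx)]
    return [[sum(b[n][t] * b[k][t].conjugate() for t in range(n_slots))
             for k in range(n_tx)]
            for n in range(n_tx)]


if __name__ == "__main__":
    c = alamouti_codeword(1 + 1j, 1 - 1j)
    e = alamouti_codeword(-1 + 1j, 1 + 1j)
    for row in design_matrix(c, e):
        print(row)
```

For this pair the symbol differences are $2$ and $-2j$, so the design matrix should be $8I$: the off-diagonal entries cancel and each diagonal entry equals $|c_1 - e_1|^2 + |c_2 - e_2|^2$, which is the scaled-identity form $A = (E_d/N) I$ with $E_d = 16$ and $N = 2$.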
The design matrix of this STC for two codewords c = (c 1 , −c *  (42)

CONCLUSIONS
In this paper, we presented a framework for developing space-time codes for an atmospheric optical heterodyne communication system. Through a central limit theorem approximation, we found that the design criterion $A = (E_d/N) I$ minimized the pairwise probability of codeword error for large numbers of apertures. Although developed for lognormal fading, this method generalizes to other fading distributions in which the fades are zero-mean and independent.
Our design criterion also satisfies the rank and determinant criteria presented in [1] for Rayleigh channels. Furthermore, orthogonal designs provide a method of constructing STCs that satisfy our criterion, and require only linear processing at the receiver [2].

Figure 1: Block diagram of the transmitter, channel, and receiver.

Figure 7: Comparison of pairwise codeword probability of error for $A = (E_d/N) I$ STCs with and without the central limit theorem approximation; $MN = 16$, $\sigma_\chi^2 = 0.1$.

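... $\sum_{t=1}^{T} c_n(t)\, c_k^*(t) = 0$ for $n \ne k$. As a result, space-time codes created from orthogonal designs, such as the Alamouti scheme, have a simple decoding algorithm. Rewriting the decision metric in (2) as

$$\sum_{t=1}^{T} \sum_{m=1}^{M} |r_m(t)|^2 - 2\, \mathrm{Re}\Bigl\{ \sum_{t=1}^{T} \sum_{m=1}^{M} \sum_{n=1}^{N} r_m^*(t)\, \alpha_{nm}\, c_n(t) \Bigr\} + \sum_{m=1}^{M} \sum_{n=1}^{N} |\alpha_{nm}|^2 \sum_{t=1}^{T} |c_n(t)|^2 \tag{41}$$

shows that joint detection of $(c_1(1), \ldots, c_N(T))$ is equivalent to decoding each individual symbol, $c_n(t)$, separately. The structure of the Alamouti STC allows for further simplification, and the decision rules become $\hat{s}_1 = \operatorname{argmin}$ ...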