Spatial-Mode Selection for the Joint Transmit and Receive MMSE Design

To approach the potential MIMO capacity while optimizing the system bit error rate (BER) performance, the joint transmit and receive minimum mean squared error (MMSE) design has been proposed. It is the optimal linear scheme for spatial multiplexing MIMO systems, assuming a fixed number of spatial streams p as well as a fixed modulation and coding across these spatial streams. However, state-of-the-art designs arbitrarily choose and fix the value of the number of spatial streams p, which may lead to an inefficient power allocation strategy and a poor BER performance. We have previously proposed to relax the constraint of fixed number of streams p and to optimize this value under the constraints of fixed average total transmit power and fixed spectral efficiency, which we referred to as spatial-mode selection. Our previous selection criterion was the minimization of the system sum MMSE. In the present contribution, we introduce a new and better spatial-mode selection criterion that targets the minimization of the system BER. We also provide a detailed performance analysis, over flat-fading channels, that confirms that our proposed spatial-mode selection significantly outperforms state-of-the-art joint Tx/Rx MMSE designs for both uncoded and coded systems, thanks to its better exploitation of


INTRODUCTION
Over the past few years, multiple-input multiple-output (MIMO) communication systems have prevailed as the key enabling technology for future-generation broadband wireless networks, thanks to their huge potential spectral efficiencies [1]. Such spectral efficiencies are related to the multiple parallel spatial subchannels that are opened through the use of multiple-element antennas at both the transmitter and receiver. These available spatial subchannels can be used to transmit parallel independent data streams, which is referred to as spatial multiplexing (SM) [2,3]. To enable SM, joint transmit and receive space-time processing has emerged as a powerful and promising design approach for applications where the channel is slowly varying, such that the channel state information (CSI) can be made available at both sides of the transmission link. In fact, the latter design approach exploits this CSI to optimally allocate resources such as power and bits over the available spatial subchannels so as to either maximize the system's information rate [4] or alternatively reduce the system's bit error rate (BER) [5,6,7,8].
In this contribution, we adopt the second design alternative, namely, optimizing the system BER under the constraints of fixed rate and fixed transmit power. Moreover, among the possible design criteria, we retain the joint transmit and receive minimum mean squared error (joint Tx/Rx MMSE), initially proposed in [5] and further discussed in [7,8], for it is the optimal linear solution for fixed coding and symbol constellation across spatial subchannels or modes. The latter constraint is set to reduce the system's complexity and adaptation requirements, in comparison with the optimal yet complex bit loading [9]. Nevertheless, state-of-the-art contributions initially and arbitrarily fix the number of used SM data streams p [5,6,7,8]. We have previously argued that, compared to their channel-aware power allocation policies, the initial, arbitrary, and static choice of the number of transmit data streams p is suboptimal [10]. More specifically, we have highlighted the highly inefficient transmit power allocation and poor BER performance this approach may lead to. Consequently, we have proposed to include the number of streams p as an additional design parameter, rather than a mere arbitrary fixed scalar as in state-of-the-art contributions, to be optimized in order to minimize the joint Tx/Rx MMSE design's BER [10,11]. A remark in [7] previously raised this issue without pursuing it. The optimization criterion proposed therein was the minimization of the sum MMSE, which has also been investigated in [10,11] for flat-fading and frequency-selective fading channels, respectively. The sum MMSE minimization criterion, however, is suboptimal, as it effectively treats the p parallel modes of the joint Tx/Rx MMSE design as a single mode whose BER is minimized. Consequently, it fails to identify the optimal MSEs and BERs on the individual spatial streams that would actually minimize the system average BER. In the present contribution, a better spatial-mode selection criterion is proposed which, on the contrary, examines the BERs on the individual spatial modes in order to identify the optimal number of spatial streams to be used for a minimum system average BER. Finally, spatial-mode selection has also been investigated in the context of space-time coded MIMO systems in the presence of imperfect CSI at the transmitter [12,13]. The solutions developed therein, however, do not apply to spatial multiplexing scenarios, which are the focus of the present contribution.
The rest of the paper is organized as follows. Section 2 provides the system model and describes state-of-the-art joint Tx/Rx MMSE designs. Based on that, Section 3 derives the proposed spatial-mode selection. In Section 4, the BER performance improvements enabled by the proposed spatial-mode selection are assessed for both uncoded and coded systems. Finally, we draw conclusions in Section 5.

Notations
In all the following, normal letters designate scalar quantities, boldface lower case letters indicate vectors, and boldface capitals represent matrices; for instance, $I_p$ is the $p \times p$ identity matrix. Moreover, trace(M), $[M]_{i,j}$, $[M]_{\cdot,j}$, and $[M]_{\cdot,1:j}$ stand, respectively, for the trace, the (i, j)th entry, the jth column, and the first j columns of matrix M. $[x]^+$ refers to max(x, 0) and $(\cdot)^H$ denotes the conjugate transpose of a vector or a matrix. Finally, $\|m\|_2$ indicates the 2-norm of vector m.

System model
The SM MIMO wireless communication system under consideration is depicted in Figure 1. It consists of a transmitter and a receiver, both equipped with multiple-element antennas and assumed to have perfect knowledge about the current channel realization. At the transmitter, the input bit stream b is coded, interleaved, and modulated according to a predetermined symbol constellation of size $M_p$. The resulting symbol stream s is then demultiplexed into $p \le \min(M_R, M_T)$ independent streams. The latter SM operation actually converts the serial symbol stream s into a higher-dimensional symbol stream where every symbol is a p-dimensional spatial symbol, for instance, s(k) at discrete-time index k. These spatial symbols are then passed through the linear precoder T in order to optimally adapt them to the current channel realization prior to transmission through the $M_T$-element transmit antenna. At the receiver, the $M_R$ symbol-sampled complex baseband outputs from the $M_R$-element receive antenna are passed through the linear decoder R matched to the precoder T. The resulting p output streams conveying the detected spatial symbols $\hat{s}(k)$ are then multiplexed, demodulated, deinterleaved, and decoded to recover the initially transmitted bit stream. For a flat-fading MIMO channel, the global system equation is given by

$$\hat{s}(k) = R\,H\,T\,s(k) + R\,n(k), \tag{1}$$

where n(k) is the $M_R$-dimensional receiver noise vector at discrete-time index k. H is the $M_R \times M_T$ channel matrix whose (i, j)th entry $[H]_{i,j}$ represents the complex channel gain between the jth transmit antenna element and the ith receive antenna element. In all the following, the discrete-time index k is dropped for clarity.

Generic joint Tx/Rx MMSE design
The linear precoder and decoder T and R, represented by an $M_T \times p$ and a $p \times M_R$ matrix, respectively, are jointly designed to minimize the sum mean squared error (MSE) on the spatial symbols s subject to a fixed average total transmit power $P_T$ constraint [6], as stated in the following:

$$\min_{R,T}\; E_{s,n}\left\{ \| s - (RHTs + Rn) \|_2^2 \right\} \quad \text{subject to: } E_s \cdot \mathrm{trace}(TT^H) = P_T. \tag{2}$$

The statistical expectation $E_{s,n}\{\cdot\}$ is carried out over the data symbols s and the noise samples n. We assume uncorrelated data symbols of average symbol energy $E_s$ and zero-mean temporally and spatially white complex Gaussian noise samples with covariance matrix $\sigma_n^2 I_{M_R}$. We introduce the thin [14, page 72] singular value decomposition (SVD) of the MIMO channel matrix H:

$$H = \begin{bmatrix} U_p & \bar{U}_p \end{bmatrix} \begin{bmatrix} \Sigma_p & 0 \\ 0 & \bar{\Sigma}_p \end{bmatrix} \begin{bmatrix} V_p & \bar{V}_p \end{bmatrix}^H, \tag{3}$$

where $U_p$ and $V_p$ are, respectively, the $M_R \times p$ and $M_T \times p$ matrices of left and right singular vectors associated to the p strongest singular values, or spatial subchannels or modes, of H, stacked in decreasing order in the $p \times p$ diagonal matrix $\Sigma_p$. (We use the terms spatial subchannels and spatial modes interchangeably to refer to the singular values of H, as these singular values represent the parallel independent spatial subchannels or modes underlying the flat-fading MIMO channel modeled by H.) $\bar{U}_p$ and $\bar{V}_p$ contain the left and right singular vectors associated to the remaining $(\min(M_R, M_T) - p)$ spatial modes of H, similarly stacked in decreasing order in $\bar{\Sigma}_p$.

The optimization problem stated in (2) is solved using the Lagrange multiplier technique, which formulates the constrained cost function as follows:

$$C = E_{s,n}\left\{ \| s - (RHTs + Rn) \|_2^2 \right\} + \lambda \left[ E_s \cdot \mathrm{trace}(TT^H) - P_T \right], \tag{4}$$

where λ is the Lagrange multiplier to be calculated to satisfy the transmit power constraint. The optimal linear precoder and decoder pair {T, R}, solution to (4), was shown to be [6]

$$T = V_p \Sigma_T Z, \qquad R = Z^H \Sigma_R U_p^H, \tag{5}$$

where Z is an optional $p \times p$ unitary matrix, and $\Sigma_T$ is the $p \times p$ diagonal power allocation matrix that determines the transmit power distribution among the available p spatial modes and is given by

$$\Sigma_T^2 = \left[ \frac{\sigma_n}{\sqrt{\lambda E_s}} \Sigma_p^{-1} - \frac{\sigma_n^2}{E_s} \Sigma_p^{-2} \right]^+, \tag{6}$$

and $\Sigma_R$ is the $p \times p$ diagonal complementary equalization matrix given by

$$\Sigma_R = \frac{\sqrt{\lambda E_s}}{\sigma_n}\, \Sigma_T. \tag{7}$$

The joint Tx/Rx MMSE design of (5) essentially decouples the MIMO channel matrix H into its underlying spatial modes and selects the p strongest ones, represented by $\Sigma_p$, to transmit the p data streams. Among the latter p spatial modes, only those above a minimum signal-to-noise ratio (SNR) threshold, determined by the transmit power constraint, are actually allocated power, as indicated by $[\cdot]^+$ in (6). Furthermore, among the modes that are allocated power, more power is given to the weaker ones in an attempt to balance the SNR levels across spatial modes.
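The thresholding behaviour of the power allocation can be illustrated numerically. The following is a minimal Python sketch, not the authors' implementation: it assumes the per-mode closed form $\sigma_{T_i}^2 = [\sigma_n \mu/\sigma_i - \sigma_n^2/(E_s \sigma_i^2)]^+$ with $\mu = 1/\sqrt{\lambda E_s}$, and it solves the power constraint $E_s \sum_i \sigma_{T_i}^2 = P_T$ by dropping the weakest mode until all remaining powers are positive. The function name and the example singular values are hypothetical.

```python
import math

def mmse_power_allocation(sigmas, Es, noise_var, PT=1.0):
    """Per-mode transmit powers sigma_T_i^2 for the p given modes.

    sigmas: singular values of the p selected modes, strongest first (assumed input).
    Returns a list of p powers satisfying Es * sum(powers) == PT.
    """
    sn = math.sqrt(noise_var)
    active = len(sigmas)
    while active > 0:
        s = sigmas[:active]
        # Solve the power constraint for mu = 1/sqrt(lambda * Es),
        # assuming exactly the `active` strongest modes receive power.
        mu = (PT / Es + (noise_var / Es) * sum(1 / x**2 for x in s)) / (
            sn * sum(1 / x for x in s))
        powers = [sn * mu / x - noise_var / (Es * x**2) for x in s]
        if powers[-1] > 0:  # weakest active mode is above the threshold
            return powers + [0.0] * (len(sigmas) - active)
        active -= 1  # the [.]^+ operation: drop the weakest mode and re-solve
    return [0.0] * len(sigmas)

if __name__ == "__main__":
    # High SNR: both modes are used and the weaker one gets more power.
    print(mmse_power_allocation([2.0, 1.0], Es=2.0, noise_var=0.01))
    # Low SNR: the very weak third mode receives no power.
    print(mmse_power_allocation([2.0, 1.0, 0.1], Es=2.0, noise_var=1.0))
```

Checking only the weakest active mode is sufficient here because, for singular values sorted in decreasing order, the allocated power becomes non-positive for the weakest mode first.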

Problem statement
The discussed generic joint Tx/Rx MMSE design has been derived for a given number of spatial streams p, which is arbitrarily chosen and fixed [5,6,7,8,15]. These p streams will always be transmitted regardless of the power allocation policy that may, as previously highlighted, allocate no power to certain weak spatial subchannels. The data streams assigned to the latter subchannels are then lost, leading to a poor overall BER performance. Furthermore, as the SNR increases, these initially disregarded modes will eventually be given power and will monopolize most of the available transmit power, leading to an inefficient power allocation strategy that detrimentally impacts the strong modes. Finally, it has been shown [16] that the spatial subchannel gains exhibit decreasing diversity orders. This means that the weakest used subchannel sets the spatial diversity order exploited by the joint Tx/Rx MMSE design. The previous remarks highlight the influence of the choice of p on the transmit power allocation efficiency, the exhibited spatial diversity order, and thus on the joint Tx/Rx MMSE designs' BER performance.
Hence, we alternatively propose to include p as a design parameter to be optimized according to the available channel knowledge for an improved system BER performance, which we subsequently refer to as spatial-mode selection.

State-of-the-art joint Tx/Rx MMSE designs
Before proceeding to derive our spatial-mode selection, we first introduce two state-of-the-art designs that instantiate the aforementioned generic joint Tx/Rx MMSE solution and that are the baseline for our subsequent optimization proposal. While preserving the joint Tx/Rx MMSE design's core transmission structure $\{\Sigma_T, \Sigma_p, \Sigma_R\}$, these two instantiations implement different unitary matrices Z. As will be subsequently shown, the latter unitary matrix can be used to enforce an additional constraint without altering the resulting system's sum $\mathrm{MMSE}_p$, formally defined in (2). To make this explicit, we introduce the MSE covariance matrix $\mathrm{MSE}_p$, associated with the considered fixed p data streams and fixed symbol constellation across these streams, defined as follows:

$$\mathrm{MSE}_p = E_{s,n}\left\{ (s - \hat{s})(s - \hat{s})^H \right\}. \tag{8}$$

Clearly, the diagonal elements of $\mathrm{MSE}_p$ represent the MSEs induced on the individual spatial streams. Consequently, their sum results in the aforementioned sum $\mathrm{MMSE}_p$ when the optimal linear precoder and decoder pair {T, R} of (5) is used. In the latter case, $\mathrm{MSE}_p$ can be straightforwardly expressed as follows:

$$\mathrm{MSE}_p = E_s\, Z^H \left( I_p - \Sigma_R \Sigma_p \Sigma_T \right) Z. \tag{9}$$

$\mathrm{MMSE}_p$ is then simply given by [6]

$$\mathrm{MMSE}_p = \mathrm{trace}(\mathrm{MSE}_p) = E_s \cdot \mathrm{trace}\left( Z^H \left( I_p - \Sigma_R \Sigma_p \Sigma_T \right) Z \right). \tag{10}$$

Since the trace of a matrix depends only on its eigenvalues, which a unitary similarity transform leaves unchanged, the unitary matrix Z indeed does not alter the $\mathrm{MMSE}_p$, which can be reduced to

$$\mathrm{MMSE}_p = E_s \cdot \mathrm{trace}\left( I_p - \Sigma_R \Sigma_p \Sigma_T \right). \tag{11}$$

Conventional joint Tx/Rx MMSE design
The conventional joint Tx/Rx MMSE design only aims at minimizing the system's sum MSE. Since, as aforementioned, the unitary matrix Z does not alter the system's $\mathrm{MMSE}_p$, this design simply sets it to identity, $Z = I_p$ [6,7,8]. Nevertheless, this design exhibits unequal MSEs across the data streams, as pointed out in [7,15]. Thus, its BER performance will be dominated by the weak modes that induce the largest MSEs.
To overcome this drawback, the following design has been proposed.

Even-MSE joint Tx/Rx MMSE design
The even-MSE joint Tx/Rx MMSE design enforces equal MSEs on all data streams while maintaining the same overall sum $\mathrm{MMSE}_p$. This can be achieved by choosing Z as the $p \times p$ IFFT matrix [15] with

$$[Z]_{n,k} = \frac{1}{\sqrt{p}} \exp\left( j \frac{2\pi (n-1)(k-1)}{p} \right), \quad 1 \le n, k \le p.$$

In fact, taking advantage of the diagonal structure of the inner matrix in (9), the pair {IFFT, FFT} enforces equal diagonal elements for $\mathrm{MSE}_p$, which amounts to equal MSEs on all data streams. Through balancing the MSEs across the data streams, this design guarantees equal minimum BER on all streams for the given fixed number of spatial streams p and fixed constellation across these streams. Nevertheless, the use of the {IFFT, FFT} pair induces additional interstream interference in the case of the even-MSE design.
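The equalizing effect of the {IFFT, FFT} pair can be checked numerically. The following is a minimal Python sketch, assuming the unitary IFFT entries $\exp(j2\pi nk/p)/\sqrt{p}$ with zero-based indices and an arbitrary illustrative diagonal `d` standing in for the per-mode MSEs of the conventional design; the function names are hypothetical.

```python
import cmath

def idft_matrix(p):
    """Unitary p x p IFFT matrix, entry (n, k) = exp(+j*2*pi*n*k/p) / sqrt(p)."""
    return [[cmath.exp(2j * cmath.pi * n * k / p) / p ** 0.5 for k in range(p)]
            for n in range(p)]

def mix_diagonal(d):
    """Return the full matrix Z^H diag(d) Z for Z the unitary IFFT matrix."""
    p = len(d)
    Z = idft_matrix(p)
    return [[sum(Z[n][i].conjugate() * d[n] * Z[n][j] for n in range(p))
             for j in range(p)] for i in range(p)]

if __name__ == "__main__":
    d = [0.1, 0.3, 0.8]          # illustrative unequal per-mode MSEs
    M = mix_diagonal(d)
    print([round(M[i][i].real, 6) for i in range(len(d))])  # equal diagonal
    print(round(sum(M[i][i].real for i in range(len(d))), 6), sum(d))  # same trace
    print(abs(M[0][1]))          # nonzero off-diagonal term
```

Because every entry of Z has squared magnitude 1/p, each diagonal element of the mixed matrix equals the mean of `d`, so the trace is preserved while the individual MSEs are evened out; the off-diagonal terms are no longer zero.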

SPATIAL-MODE SELECTION
As previously announced, we aim at a spatial-mode selection criterion that minimizes the system's BER. To identify such a criterion, we subsequently derive the expression of the conventional joint Tx/Rx MMSE design's average BER and analyze the respective contributions of the individual used spatial modes. To do so, we rewrite the input-output system equation (1) for this design, using the optimal linear precoder and decoder solution of (5) and setting Z to identity:

$$\hat{s} = \Sigma_R \Sigma_p \Sigma_T\, s + \Sigma_R U_p^H n. \tag{12}$$

Remarkably, the conventional joint Tx/Rx MMSE design transmits the p available data streams on p parallel independent channel spatial modes. Each of these spatial modes is simply Gaussian with a fixed gain, given by its corresponding entry in $\Sigma_p \Sigma_T$, and an additive noise of variance $\sigma_n^2$. Consequently, for the used Gray-encoded square QAM constellation of size $M_p$ and average transmit symbol energy $E_s$, the average BER on the ith spatial mode, denoted by $\mathrm{BER}_i$, is approximated at high SNRs (see [17, page 280] and [18, page 409]) by

$$\mathrm{BER}_i \approx \frac{2}{\log_2(M_p)} \left( 1 - \frac{1}{\sqrt{M_p}} \right) \mathrm{erfc}\left( \sqrt{\frac{\sigma_i^2 \sigma_{T_i}^2}{\sigma_n^2}} \right), \tag{13}$$

where $\sigma_i$ denotes the ith diagonal element of $\Sigma_p$, which represents the ith spatial mode gain. Similarly, $\sigma_{T_i}$ is the ith diagonal element of $\Sigma_T$, whose square designates the transmit power allocated to the ith spatial mode. Since the used square QAM constellation of size $M_p$ and minimum Euclidean distance $d_{min} = 2$ has an average symbol energy

$$E_s = \frac{2(M_p - 1)}{3}, \tag{14}$$

the argument $\sigma_i^2 \sigma_{T_i}^2 / \sigma_n^2$ is easily identified as the average symbol SNR normalized to the symbol energy $E_s$ on the ith spatial mode. For a given constellation $M_p$, the latter average SNR clearly determines the BER on its corresponding spatial mode. The conventional design's average BER performance, however, depends on the SNRs on all p spatial modes as follows:

$$\mathrm{BER}_{conv} = \frac{1}{p} \sum_{i=1}^{p} \mathrm{BER}_i. \tag{15}$$

Consequently, to better characterize the conventional design's BER, we define the $p \times p$ diagonal SNR matrix $\mathrm{SNR}_p$ whose diagonal consists of the average SNRs on the p spatial modes:

$$\mathrm{SNR}_p = \frac{\Sigma_p^2 \Sigma_T^2}{\sigma_n^2}. \tag{16}$$

Using the expression of the optimal transmit power allocation matrix $\Sigma_T^2$ formulated in (6), the previous $\mathrm{SNR}_p$ expression can be further developed into

$$\mathrm{SNR}_p = \left[ \frac{1}{\sigma_n \sqrt{\lambda E_s}} \Sigma_p - \frac{1}{E_s} I_p \right]^+. \tag{17}$$

The latter expression illustrates that the conventional joint Tx/Rx MMSE design induces uneven SNRs on the p different spatial streams. More importantly, (17) shows that the weaker the spatial mode is, the lower its experienced SNR is. The conventional joint Tx/Rx MMSE BER, $\mathrm{BER}_{conv}$, of (15) can be rewritten as follows:

$$\mathrm{BER}_{conv} \approx \frac{2}{p \log_2(M_p)} \left( 1 - \frac{1}{\sqrt{M_p}} \right) \sum_{i=1}^{p} \mathrm{erfc}\left( \sqrt{[\mathrm{SNR}_p]_{i,i}} \right). \tag{18}$$

The previous SNR analysis further indicates that the p spatial modes exhibit uneven BER contributions and that the contribution of the weakest pth mode, corresponding to the lowest SNR $[\mathrm{SNR}_p]_{p,p}$, dominates $\mathrm{BER}_{conv}$. Consequently, in order to minimize $\mathrm{BER}_{conv}$, we propose as the optimal number of streams to be used, $p_{opt}$, the one that maximizes the SNR on the weakest used mode under a fixed rate R constraint. The latter proposed spatial-mode selection criterion can be expressed as follows:

$$p_{opt} = \arg\max_{p}\; [\mathrm{SNR}_p]_{p,p} \quad \text{subject to: } p \cdot \log_2(M_p) = R. \tag{19}$$

The rate constraint shows that, though the same symbol constellation is used across spatial streams, the selection/adaptation of the optimal number of streams $p_{opt}$ requires the joint selection/adaptation of the used constellation size such that $M_{opt} = 2^{R/p_{opt}}$. Adapting (17) for the considered square QAM constellations (i.e., $E_s = 2(M_p - 1)/3$), the spatial-mode selection criterion stated in (19) can be further refined into (20). The latter spatial-mode selection problem has to be solved for the current channel realization to identify the optimal pair $\{p_{opt}, M_{opt}\}$ that minimizes the system's average BER, $\mathrm{BER}_{conv}$.
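The dominance of the weakest mode in the average BER can be made concrete with a short numerical sketch. The Python code below assumes the standard high-SNR approximation for Gray-coded square M-QAM, $(2/\log_2 M)(1 - 1/\sqrt{M})\,\mathrm{erfc}(\sqrt{\gamma})$, where $\gamma = \sigma_i^2\sigma_{T_i}^2/\sigma_n^2$ is the SNR normalized to $E_s = 2(M-1)/3$; the function names and the example SNR values are illustrative only.

```python
import math

def mode_ber(M, snr_norm):
    """High-SNR BER approximation on one spatial mode (Gray-coded square M-QAM).

    snr_norm is the normalized SNR sigma_i^2 * sigma_Ti^2 / sigma_n^2.
    """
    prefactor = (2.0 / math.log2(M)) * (1.0 - 1.0 / math.sqrt(M))
    return prefactor * math.erfc(math.sqrt(snr_norm))

def conventional_ber(M, snrs_norm):
    """Average BER over the p used modes of the conventional design."""
    return sum(mode_ber(M, g) for g in snrs_norm) / len(snrs_norm)

if __name__ == "__main__":
    snrs = [9.0, 4.0, 0.5]  # illustrative normalized SNRs, strongest mode first
    per_mode = [mode_ber(16, g) for g in snrs]
    print(per_mode)
    # Fraction of the summed BER contributed by the weakest mode.
    print(per_mode[-1] / sum(per_mode))
```

For uneven SNRs such as these, almost all of the summed BER comes from the last (weakest) mode, which is the observation motivating a criterion based on the weakest used mode.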
We have derived our spatial-mode selection based on the conventional joint Tx/Rx MMSE design because this design represents the core transmission structure on which the even-MSE design is based. Our strategy is to first use our spatial-mode selection to optimize the core transmission structure $\{\Sigma_T, \Sigma_{p_{opt}}, \Sigma_R\}$; the even-MSE design then additionally applies the unitary matrix Z, which is now the $p_{opt} \times p_{opt}$ IFFT matrix, to further balance the MSEs and the SNRs across the $p_{opt}$ used spatial streams.
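The selection procedure can be sketched as an exhaustive search over the feasible numbers of streams. The Python code below is a minimal illustration under stated assumptions, not the authors' implementation: only square QAM constellations are allowed (an even number of bits per stream), $E_s = 2(M_p - 1)/3$, all p candidate modes are assumed active when solving the power constraint, and the weakest-mode normalized SNR is taken as $\sigma_p \mu/\sigma_n - 1/E_s$ with $\mu = 1/\sqrt{\lambda E_s}$. Function names and singular values are hypothetical.

```python
import math

def weakest_mode_snr(sigmas, p, M, noise_var, PT=1.0):
    """Normalized SNR on the weakest of the p strongest modes (clipped at 0)."""
    Es = 2.0 * (M - 1) / 3.0
    s = sigmas[:p]
    sn = math.sqrt(noise_var)
    # mu = 1/sqrt(lambda*Es) from the power constraint, all p modes active.
    mu = (PT / Es + (noise_var / Es) * sum(1 / x**2 for x in s)) / (
        sn * sum(1 / x for x in s))
    return max(s[-1] * mu / sn - 1.0 / Es, 0.0)

def select_mode(sigmas, R, noise_var, PT=1.0):
    """Return (p_opt, M_opt, weakest-mode SNR) for spectral efficiency R bits."""
    best = None
    for p in range(1, len(sigmas) + 1):
        if R % p:            # rate constraint p * log2(M_p) = R
            continue
        bits = R // p
        if bits % 2:         # assumption: square QAM only
            continue
        M = 2 ** bits
        snr = weakest_mode_snr(sigmas, p, M, noise_var, PT)
        if best is None or snr > best[2]:
            best = (p, M, snr)
    return best

if __name__ == "__main__":
    sigmas = [2.0, 1.5, 0.2]   # illustrative singular values, strongest first
    print(select_mode(sigmas, R=12, noise_var=0.01))
    print(select_mode(sigmas, R=6, noise_var=0.01))
```

With these illustrative values, the search keeps the two strong modes at R = 12 and falls back to single-mode transmission at R = 6, since the only other square-QAM option at that rate would use the weak third mode.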

PERFORMANCE ANALYSIS
In this section, we investigate the uncoded and coded BER performance of both conventional and even-MSE joint Tx/Rx MMSE designs when our spatial-mode selection is applied. The goal is threefold. We first assess the BER performance improvement offered by our spatial-mode selection over state-of-the-art full SM conventional and even-MSE joint Tx/Rx MMSE designs. Then, we compare our spatial-mode selection performance and complexity to those of a practical spatial adaptive loading strategy. Last but not least, we evaluate the impact of channel coding on the relative BER performances of all the above-mentioned designs. In all the following, the MIMO channel is stationary Rayleigh flat-fading, modeled by an $M_R \times M_T$ matrix with i.i.d. unit-variance zero-mean complex Gaussian entries. The BER figures are averaged over 1000 channel realizations for the uncoded performance and over 100 channel realizations for the coded performance. For each channel, at least 10 bit errors were counted for each $E_b/N_0$ value, where $E_b/N_0$ stands for the average receive energy per bit over noise power. A unit average total transmit power was considered, $P_T = 1$.

Uncoded performance
Considering the uncoded system, we first compare the relative BER performance of the conventional and even-MSE joint Tx/Rx MMSE designs when full SM is used. We later apply our spatial-mode selection for improved BER performance, which we further contrast with that of a practical spatial adaptive loading scheme inspired by [19].

Conventional versus even-MSE joint Tx/Rx MMSE
For a fixed number of spatial streams p and fixed symbol constellation $M_p$, $\mathrm{BER}_{conv}$ given by (15) is to be compared with the even-MSE design's BER, $\mathrm{BER}_{even-MSE}$, given by (21). Recalling Jensen's inequality [20, page 25] and comparing (18) and (21), where the MSEs $([\mathrm{MSE}_p]_{i,i} = 1/[\mathrm{SNR}_p]_{i,i})_i$ are denoted as variables $(x_i)_i$, we can state that

$$\mathrm{BER}_{even-MSE} \le \mathrm{BER}_{conv} \tag{22}$$

when $f_p(x) = \mathrm{erfc}(1/\sqrt{x})$ is convex. The analysis of the function $\{f_p(x), x \ge 0\}$, provided in Appendix A, shows that it is convex for values of x smaller than a certain $x_{inf}$; for x larger than $x_{inf}$, the function turns out to be concave. Since x stands for the MSEs on the spatial modes, which decrease when the average receive energy per bit over noise power ($E_b/N_0$) increases, we can relate the convexity of $f_p(x)$ to the relative BER performance of the conventional and the even-MSE joint Tx/Rx MMSE designs as follows:

$$\begin{cases} \mathrm{BER}_{even-MSE} \le \mathrm{BER}_{conv} & \text{for } E_b/N_0 \ge (E_b/N_0)_{inf}, \\ \mathrm{BER}_{even-MSE} \ge \mathrm{BER}_{conv} & \text{for } E_b/N_0 < (E_b/N_0)_{inf}, \end{cases} \tag{23}$$

where $(E_b/N_0)_{inf}$ is the $E_b/N_0$ value needed to reach $f_p(x)$'s inflection point $x_{inf} = \mathrm{MSE}_{inf}$. This BER analysis is further confirmed by the simulated results plotted in Figures 2, 3, and 4. More specifically, the latter figures illustrate that the full SM even-MSE design outperforms the full SM conventional design after a certain $E_b/N_0$ value, previously referred to as $(E_b/N_0)_{inf}$.
As it turns out, the latter value occurs before 0 dB for both the (2, 2) MIMO setup at R = 4 bps/Hz and the (3, 3) MIMO setup at R = 6 bps/Hz, respectively plotted in Figures 2 and 3. For the case of the (3, 3) MIMO setup at R = 12 bps/Hz of Figure 4, however, the even-MSE design surpasses the conventional design only for SNRs larger than $(E_b/N_0)_{inf} = 10$ dB. This is due to the fact that, for a given $(M_T, M_R)$ MIMO system with fixed average total transmit power $P_T$, the larger the constellation used and the larger the rate supported, the larger the induced MSEs at a given $E_b/N_0$ value or, alternatively, the larger the $(E_b/N_0)_{inf}$ needed to fall below $\mathrm{MSE}_{inf}$ on the used spatial streams, which is required for the even-MSE design to outperform the conventional one.
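The convexity argument can be explored numerically for the function as written here, $f(x) = \mathrm{erfc}(1/\sqrt{x})$. The Python sketch below is an illustration and does not reproduce Appendix A: differentiating this particular expression twice gives $f''(x) = e^{-1/x} x^{-7/2}(1 - 3x/2)/\sqrt{\pi}$, so under that assumption the sign change occurs at x = 2/3. The helper `jensen_gap` (a hypothetical name) compares the mean of f over uneven MSEs with f of the mean MSE, which is the comparison underlying Jensen's inequality.

```python
import math

def f(x):
    """f(x) = erfc(1/sqrt(x)) for x > 0; x plays the role of a per-stream MSE."""
    return math.erfc(1.0 / math.sqrt(x))

def f_second_derivative(x):
    """Analytic second derivative of erfc(1/sqrt(x))."""
    return math.exp(-1.0 / x) * x ** -3.5 * (1.0 - 1.5 * x) / math.sqrt(math.pi)

def jensen_gap(mses):
    """mean(f(x_i)) - f(mean(x_i)); positive when evening out the MSEs helps."""
    mean_x = sum(mses) / len(mses)
    return sum(f(x) for x in mses) / len(mses) - f(mean_x)

if __name__ == "__main__":
    print(f_second_derivative(0.5), f_second_derivative(1.0))  # sign change
    print(jensen_gap([0.05, 0.4]))  # small MSEs (high Eb/N0): convex region
    print(jensen_gap([1.0, 3.0]))   # large MSEs (low Eb/N0): concave region
```

A positive gap corresponds to the regime where equal MSEs yield the lower average of f, and a negative gap to the regime where uneven MSEs do.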

Spatial-mode selection versus full spatial multiplexing
Applying our spatial-mode selection to both joint Tx/Rx MMSE designs leads to a substantial BER performance improvement for various MIMO system dimensions and parameters. Figure 2 illustrates such BER improvement for the case of a (2, 2) MIMO setup supporting a spectral efficiency R = 4 bps/Hz. Our proposed spatial-mode selection is shown to provide 12.6 dB and 10.5 dB SNR gain over full SM conventional and even-MSE designs, respectively, at BER = $10^{-3}$. Figures 3 and 4 confirm similar gains for a (3, 3) MIMO setup at spectral efficiency R = 6 bps/Hz and R = 12 bps/Hz, respectively. These significant performance improvements are due to the fact that our spatial-mode selection, depending on the spectral efficiency R, wisely discards a number of weak spatial modes that exhibit the lowest spatial diversity orders, as argued in [16]. These are the same weak modes that dominate the performance of both full SM joint Tx/Rx MMSE designs. According to (20), our spatial-mode selection restricts transmission to the $p_{opt}$ strongest modes only. The latter $p_{opt}$ modes exhibit significantly higher spatial diversity orders and form a more balanced subset (the difference between the $p_{opt}$ spatial mode gains is reduced) over which a more efficient power allocation is possible, leading to higher transmission SNR levels and consequently lower BER figures. Furthermore, it is because the subset of $p_{opt}$ selected modes is balanced that the additional effort of the even-MSE joint Tx/Rx MMSE design to further average it brings only marginal BER improvement over the conventional joint Tx/Rx MMSE design when spatial-mode selection is applied. Clearly, the proposed spatial-mode selection enables a more efficient transmit power allocation and a better exploitation of the available spatial diversity.

Spatial-mode selection versus spatial adaptive loading
The spatial adaptive loading considered herein is simply the practical Fischer's adaptive loading algorithm [19]. The latter algorithm was initially proposed for multicarrier systems. Nevertheless, it directly applies to a MIMO system where an SVD is used to decouple the MIMO channel into parallel independent spatial modes, which are completely analogous to the orthogonal carriers of a multicarrier system. Hence, the considered spatial adaptive loading setup first performs an SVD that decouples the MIMO channel into parallel independent spatial modes. Fischer's adaptive loading algorithm [19] is then used to determine, using the knowledge of the current channel realization, the optimal assignment for the R bits on the decoupled spatial modes such that equal minimum symbol-error rate (SER) is achieved on the used modes. Consequently, strong spatial modes are loaded with large constellation sizes, whereas weak modes carry small constellation sizes or are dropped if their gains are below a given threshold. This scheme, indeed, exhibits excellent performance, as shown in Figures 2, 3, and 4, mostly outperforming both joint Tx/Rx MMSE designs even when spatial-mode selection is used. This is due to spatial adaptive loading's additional flexibility of assigning different constellation sizes to different spatial modes. This higher flexibility, however, entails a higher complexity and signaling overhead, as highlighted later on.
When the spectral efficiency is low and there is a major discrepancy between the available spatial modes, as occurs between the two spatial modes of a (2, 2) MIMO system [16], both spatial adaptive loading and spatial-mode selection in conjunction with joint Tx/Rx MMSE designs converge to the same solution, basically single-mode transmission or the max-SNR solution [21], as illustrated in Figure 2. Figure 3 illustrates the case of a (3, 3) MIMO system when the spectral efficiency is low, R = 6 bps/Hz. In this case, the first two channel singular values, corresponding to the two strongest spatial modes out of the three available spatial modes, have relatively close diversity orders and close gains [16]. Consequently, spatial adaptive loading can optimally distribute the available R = 6 bits between these two strongest modes while using a lower constellation on the second mode to reduce its impact on the BER, whereas spatial-mode selection has to stick to single-mode transmission with 64 QAM to avoid the weak third mode that would be used by the next possible constellation (4 QAM over all three spatial streams). In this case, spatial-mode selection suffers an SNR penalty of 2 dB compared to spatial adaptive loading at BER = $10^{-3}$. When the spectral efficiency is further increased to R = 12 bps/Hz, spatial adaptive loading's flexibility margin is reduced and so is its SNR gain over spatial-mode selection, which is now only 0.7 dB at BER = $10^{-3}$ for the conventional joint Tx/Rx MMSE design, as shown in Figure 4.
Furthermore, the even-MSE design, when spatial-mode selection is applied, even outperforms spatial adaptive loading for high SNRs. The latter result is related to these two designs' BER minimization strategies. On the one hand, the even-MSE joint Tx/Rx MMSE design guarantees equal minimum MSEs on each stream and hence equal minimum SER and BER, since the same constellation is used across streams. On the other hand, spatial adaptive loading enforces equal minimum SER across streams; the BERs on the latter streams, however, are not equal since they bear different constellations. Thus, the weak modes, carrying small constellations, exhibit higher BERs. The latter imbalance explains the fact that the even-MSE design surpasses spatial adaptive loading when spatial-mode selection is applied. For target high data-rate SM systems, the latter regime is particularly relevant, and our spatial-mode selection was shown to tightly approach spatial adaptive loading's optimal BER performance while exhibiting lower complexity and adaptation requirements. The comparison of the complexity required by our spatial-mode selection to that of spatial adaptive loading, assessed in [22, page 67], shows that both techniques exhibit similar complexities when the available number of modes or subchannels is small. When the number of modes increases, however, spatial adaptive loading requires an increased number of iterations to reach the final bit assignment, and consequently, its complexity significantly outgrows that of our spatial-mode selection. More importantly, adaptive loading requires the additional flexibility of assigning different constellation sizes to different modes, whereas our spatial-mode selection assumes a single constellation across modes. This higher flexibility comes at the cost of a higher signaling overhead between the transmitter and receiver.

Coded performance
In Section 4.1, we established our spatial-mode selection as a diversity technique that successfully exploits the spatial diversity available in MIMO channels to improve the performance of state-of-the-art joint Tx/Rx MMSE designs. In a practical wireless communication system, however, it will not be the only such diversity technique to be present. Indeed, channel coding will also be used, together with the latter state-of-the-art designs, to exploit the same spatial diversity. Therefore, in this section, we undertake a coded system performance analysis to confirm that our spatial-mode selection remains advantageous over the state-of-the-art full SM approach when channel coding is present. We further verify whether our conclusions concerning the relative performance of all previously discussed schemes are still valid. We consider a bit-interleaved coded modulation (BICM) system, as shown in Figure 1, with a rate-1/2 convolutional encoder with constraint length K = 7, generator polynomials $[133_8, 171_8]$, and optimum maximum likelihood sequence estimation (MLSE) decoding using the Viterbi decoder [23].
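For readers who wish to reproduce the coded setup, the encoder side can be sketched in a few lines of Python. This is a minimal illustration only: it assumes a shift register in which the most significant bit of each octal generator taps the current input bit, and it omits tail bits, interleaving, and the Viterbi decoder. The function name is hypothetical.

```python
def conv_encode(bits, generators=(0o133, 0o171), constraint_length=7):
    """Rate-1/2 convolutional encoder; returns two coded bits per input bit.

    Assumed convention: the newest input bit enters at the most significant
    position of the K-bit register, so the MSB of each generator taps it.
    """
    reg = 0
    out = []
    for b in bits:
        # Shift the register right and insert the new bit at the top.
        reg = ((b & 1) << (constraint_length - 1)) | (reg >> 1)
        for g in generators:
            # Output bit = parity of the tapped register positions.
            out.append(bin(reg & g).count("1") % 2)
    return out

if __name__ == "__main__":
    # Impulse response: a single 1 followed by K-1 zeros reads out the taps.
    print(conv_encode([1, 0, 0, 0, 0, 0, 0]))
```

Under this convention, the impulse response interleaves the binary expansions of the two generators, 1011011 and 1111001, which gives a quick sanity check on the tap ordering.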

Conventional versus even-MSE joint Tx/Rx MMSE
To gain some insight into both designs' coded performances, we derive the equivalent additive white Gaussian noise (AWGN) channel model describing the output of the linear equalizer R for each of the two designs. Such a model highlights the diversity branches available at the input of the Viterbi decoder and hence the achievable spatial diversity for the corresponding joint Tx/Rx MMSE design. Furthermore, it was used to calculate the bit log-likelihood ratios (LLR), which form the soft inputs for soft-decision Viterbi decoding as in [24]. The output of the linear equalizer R for the conventional joint Tx/Rx MMSE design is described in (12). Accordingly, the detected symbol $\hat{s}_i$ on the $i$th spatial mode can be expressed as the output of an equivalent AWGN channel having $s_i$ as its input:

$$\hat{s}_i = \mu_i^{\mathrm{conv}} s_i + \sigma_{R_i} n_i. \tag{24}$$

The latter equivalent AWGN channel is described by a gain $\mu_i^{\mathrm{conv}} = \sigma_{R_i}\sigma_i\sigma_{T_i}$ and a zero-mean white complex Gaussian noise of variance $\sigma_{R_i}^2\sigma_n^2$. Similarly, the AWGN channel equivalent model for the even-MSE design can be shown to be (see Appendix B)

$$\hat{s}_i = \left(\frac{1}{p}\sum_{j=1}^{p}\sigma_{R_j}\sigma_j\sigma_{T_j}\right) s_i + \eta_i, \tag{25}$$

where $\eta_i$ stands for the equivalent zero-mean white complex Gaussian noise of variance $\sigma_\eta^2$. In this case, however, the latter equivalent noise contains, in addition to scaled receiver noise, interstream interference induced by the use of the {IFFT, FFT} pair. The equivalent noise variance $\sigma_\eta^2$ was found to be (see Appendix B)

$$\sigma_\eta^2 = \frac{\sigma_n^2}{p}\sum_{i=1}^{p}\sigma_{R_i}^2 + \frac{E_s}{p^2}\left[p\sum_{i=1}^{p}\left(\mu_i^{\mathrm{conv}}\right)^2 - \left(\sum_{i=1}^{p}\mu_i^{\mathrm{conv}}\right)^2\right]. \tag{26}$$

Clearly, the conventional joint Tx/Rx MMSE design provides symbol estimates $(\hat{s}_i)_{1\le i\le p}$, and consequently coded bits, that have experienced independently fading channels with different diversity orders, which enables the channel coding to exploit the system's spatial diversity. The even-MSE design, in contrast, through the use of the {IFFT, FFT} pair, creates an equivalent average channel for all $p$ spatial streams, as shown in (25) and (26). Consequently, the even-MSE design prevents the channel coding from performing any diversity combining and only allows for coding gain. In other words, the coded even-MSE design exhibits the same diversity order as the uncoded one. The latter diversity order is the one exhibited, at high $E_b/N_0$, by the average¹⁰ received bit SNR on the $p$ spatial streams. At high $E_b/N_0$, the MMSE receiver $\Sigma_R$ reduces to a zero-forcing receiver equal to $\Sigma_T^{-1}\Sigma_p^{-1}$. In that case, the average received bit SNR on the $p$ spatial streams, denoted as $\mathrm{SNR}_{\text{even-MSE}}$, can be defined as follows: where $\sigma_\eta^2$ is the asymptotic equivalent noise variance equal to $(\sigma_n^2/p)\sum_{i=1}^{p} 1/(\sigma_i^2\sigma_{T_i}^2)$, corresponding to the evaluation of (26) at high $E_b/N_0$. Consequently, $\mathrm{SNR}_{\text{even-MSE}}$ can be developed into The previous $\mathrm{SNR}_{\text{even-MSE}}$ statistics should be contrasted with those of the average received SNRs on the $p$ parallel modes of the conventional joint Tx/Rx MMSE design, denoted as $(\mathrm{SNR}_{\mathrm{conv}_i})_i$. Based on (24), the latter received SNRs are simply given by Furthermore, the spatial diversity exhibited by $\mathrm{SNR}_{\text{even-MSE}}$ should also be compared to the maximum spatial diversity order achievable by channel coding,¹¹ given by maximum-ratio combining (MRC) across the conventional design's $p$ spatial modes. Since the latter $p$ spatial modes can be considered independent diversity paths of SNRs $(\mathrm{SNR}_{\mathrm{conv}_i})_i$, the aforementioned maximum achievable spatial diversity order is described by the statistics of $\mathrm{SNR}_{\mathrm{MRC}}$ [17, page 780]:

$$\mathrm{SNR}_{\mathrm{MRC}} = \sum_{i=1}^{p}\mathrm{SNR}_{\mathrm{conv}_i}. \tag{30}$$

Figure 5 provides such a spatial diversity comparison, as it plots the cumulative distribution functions (cdfs) of (28), (29), and (30) for a full SM (3, 3) MIMO setup at spectral efficiency R = 12 bps/Hz and average receive $E_b/N_0$ = 20 dB. The steeper the SNR's cdf is, the higher the diversity order of the corresponding spatial mode or design is. Consequently, Figure 5 confirms the decreasing diversity orders of the conventional design's $p$ spatial modes. More importantly, it shows that the diversity order exhibited by the even-MSE design is closer to that of the weakest spatial mode, which obviously dominates the even-MSE design's equivalent channel of (25). The even-MSE design's diversity order is also lower than the diversity order achievable by the conventional design when channel coding is applied. The latter observation explains the coded BER results of Figures 6 and 7 where, contrary to the uncoded system, the full SM conventional design now significantly outperforms the SM even-MSE design. Furthermore, comparing Figures 3, 6, and 7 confirms that channel coding, as previously argued, does not improve on the spatial diversity exploited by the even-MSE design, whereas it does significantly improve the performance of the conventional design by exploiting the different diversity branches this design provides.

¹⁰ Carried out over data symbols and noise samples.
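The contrast between the even-MSE design's averaged channel and MRC across the conventional design's modes can be illustrated numerically. The sketch below is not the simulation behind Figure 5. It assumes, purely for illustration, independent exponentially distributed per-mode SNRs with hypothetical mean values, identical per-mode SNRs for both designs, and no per-bit normalization. It uses the high-$E_b/N_0$ relations stated above: the even-MSE noise variance averages the inverse mode gains, so its stream SNR behaves like the harmonic mean of the per-mode SNRs, whereas MRC adds them. The function names are hypothetical.

```python
import random


def harmonic_mean_snr(mode_snrs):
    """Even-MSE stream SNR under the zero-forcing (high-SNR) assumption:
    the equivalent noise variance averages the inverse per-mode SNRs."""
    return len(mode_snrs) / sum(1.0 / s for s in mode_snrs)


def mrc_snr(mode_snrs):
    """Maximum-ratio combining over independent modes: the SNRs add up."""
    return sum(mode_snrs)


def outage_probabilities(mean_snrs, threshold, trials=20000, seed=1):
    """Estimate P(SNR < threshold) for both combining rules.

    Assumption: per-mode SNRs are independent and exponentially distributed
    with the given (hypothetical) means, on a linear scale.
    """
    rng = random.Random(seed)
    out_hm = 0
    out_mrc = 0
    for _ in range(trials):
        # Guard against an exact zero draw before taking reciprocals.
        snrs = [max(rng.expovariate(1.0 / m), 1e-12) for m in mean_snrs]
        out_hm += harmonic_mean_snr(snrs) < threshold
        out_mrc += mrc_snr(snrs) < threshold
    return out_hm / trials, out_mrc / trials


if __name__ == "__main__":
    mean_snrs = [100.0, 30.0, 3.0]  # hypothetical strong, medium, weak modes
    for threshold in (0.3, 1.0, 3.0):
        p_hm, p_mrc = outage_probabilities(mean_snrs, threshold)
        print(f"threshold={threshold}: harmonic-mean outage={p_hm:.4f}, "
              f"MRC outage={p_mrc:.4f}")
```

Because the harmonic mean of positive numbers never exceeds their sum, the averaged channel is in outage whenever MRC is, and its low-SNR tail is governed by the weakest mode, which is the behaviour the cdf comparison describes.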

Spatial-mode selection versus full spatial multiplexing
Figure 5 further depicts the evolution of the previous spatial diversity comparison when our spatial-mode selection is applied. Clearly, only the two highest-diversity spatial modes are selected for transmission. As previously explained, these two strong modes form a more balanced subset on which a more efficient power allocation is possible, and consequently larger SNR values are experienced on the spatial modes. Moreover, since the weakest mode has been discarded, the even-MSE design now averages the two strongest spatial modes and obviously exhibits a higher equivalent diversity order. However, the latter diversity order is still lower than that achievable through channel coding across the conventional design's two parallel spatial modes. Hence, the coded conventional design still outperforms the coded even-MSE design when our spatial-mode selection is applied, as illustrated in Figures 6 and 7. More importantly, our spatial-mode selection still significantly improves the performance of both joint Tx/Rx MMSE designs in the presence of channel coding. Figures 6 and 7 report 6 dB and 3.5 dB SNR gains at BER = $10^{-3}$, respectively, for hard- and soft-decision decoding, provided by our spatial-mode selection over full SM for the conventional design. The gains are more dramatic for the even-MSE design, as channel coding is prevented from accessing the spatial diversity in the full SM case.

Spatial-mode selection versus spatial adaptive loading
Although our spatial-mode selection significantly improves the BER performance of the uncoded conventional joint Tx/Rx MMSE design, the latter design's performance will always be dominated by the weakest mode among the $p_{\mathrm{opt}}$ selected ones. The latter remark explains the better BER performances of both the even-MSE design and spatial adaptive loading in Figure 3. Channel coding and interleaving mitigate this problem, as they spread each information bit over several coded bits that are transmitted on all $p_{\mathrm{opt}}$ spatial modes and eventually optimally combined before detection. Consequently, channel coding suppresses the SNR gap previously observed between the conventional design and spatial adaptive loading, as illustrated in Figure 6. Soft-decision decoding is shown in Figure 7 to further favor the conventional joint Tx/Rx MMSE design, as it is the design that provides the most diversity branches at the output of the equalizer R. This is because spatial adaptive loading, in order to achieve equal SER across the used spatial modes, enforces equal SNR across the latter modes, which reduces the equivalent spatial diversity branches it provides to the Viterbi decoder.

CONCLUSIONS
In this paper, we proposed a novel selection-diversity technique, referred to as spatial-mode selection, that optimally selects the number of spatial streams used by the spatial multiplexing joint Tx/Rx MMSE design in order to minimize the system's BER. We assessed the significant improvement in BER performance that our spatial-mode selection provides over the two state-of-the-art full SM joint Tx/Rx MMSE designs, namely, the conventional and even-MSE designs. Such significant improvements were shown to be due to the more efficient transmit power allocation and the better exploitation of the available spatial diversity achieved by our spatial-mode selection. Furthermore, when our spatial-mode selection is applied, both conventional and even-MSE designs were shown to closely approach the optimal performance of spatial adaptive loading while exhibiting lower complexity and signaling overhead requirements. Finally, we confirmed that our spatial-mode selection is still advantageous when channel coding is present in the system.

APPENDICES
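The function $f_p(x) = \operatorname{erfc}(1/\sqrt{x})$ for $x \ge 0$ is explicitly defined as follows:

$$f_p(x) = \frac{2}{\sqrt{\pi}}\int_{1/\sqrt{x}}^{\infty} e^{-t^2}\,dt.$$

To determine the convexity of the latter function, we need to evaluate the sign of its second derivative $f_p''(x)$ for $x \ge 0$. To do so, we first calculate the first derivative $f_p'(x) = d/dx[f_p(x)]$. For that, we use the identity provided in [25, page 275], which differentiates an integral of the form $\int_{u(x)}^{v(x)} f(x,t)\,dt$ with respect to $x$ as follows:

$$\frac{d}{dx}\int_{u(x)}^{v(x)} f(x,t)\,dt = v'(x)\,f\big(x,v(x)\big) - u'(x)\,f\big(x,u(x)\big) + \int_{u(x)}^{v(x)}\frac{\partial f(x,t)}{\partial x}\,dt.$$

Accordingly, with $u(x) = 1/\sqrt{x}$ and an integrand that does not depend on $x$, the first derivative $f_p'(x)$ can be easily shown to be

$$f_p'(x) = \frac{1}{\sqrt{\pi}}\,x^{-3/2}\,e^{-1/x}.$$

The second derivative $f_p''(x) = d/dx[f_p'(x)]$ can then be straightforwardly expressed as follows:

$$f_p''(x) = \frac{1}{\sqrt{\pi}}\,x^{-5/2}\,e^{-1/x}\left(-\frac{3}{2} + \frac{1}{x}\right).$$

Consequently, the sign of $f_p''(x)$ for $x \ge 0$ is solely determined by the sign of $(-3/2 + 1/x)$ for $x \ge 0$. Since $-3/2 + 1/x \ge 0$ if and only if $x \le 2/3$, $f_p(x)$ is convex ($f_p''(x) \ge 0$) when $x \le 2/3$, whereas it is concave ($f_p''(x) \le 0$) for $x \ge 2/3$.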
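The sign change of the second derivative can be checked numerically. The following is a minimal sketch, assuming only the definition $f_p(x) = \operatorname{erfc}(1/\sqrt{x})$ given above; the function names and the finite-difference step are hypothetical choices made for illustration. It compares the closed-form second derivative with a central finite difference and evaluates both on either side of the point where $-3/2 + 1/x$ vanishes.

```python
import math


def f_p(x):
    """f_p(x) = erfc(1 / sqrt(x)), defined here for x > 0."""
    return math.erfc(1.0 / math.sqrt(x))


def f_p_second_closed_form(x):
    """Closed-form second derivative:
    x^(-5/2) * exp(-1/x) * (-3/2 + 1/x) / sqrt(pi)."""
    return x ** -2.5 * math.exp(-1.0 / x) * (-1.5 + 1.0 / x) / math.sqrt(math.pi)


def f_p_second_numeric(x, h=1e-4):
    """Central finite-difference estimate of the second derivative."""
    return (f_p(x + h) - 2.0 * f_p(x) + f_p(x - h)) / (h * h)


if __name__ == "__main__":
    for x in (0.3, 0.5, 2.0 / 3.0, 1.0, 2.0):
        print(f"x={x:.4f}  closed form={f_p_second_closed_form(x):+.6f}  "
              f"finite difference={f_p_second_numeric(x):+.6f}")
```

Evaluating at a point such as x = 1 is informative because it lies between 2/3 and 3/2, so the sign there identifies where the inflection point actually sits.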

B. DERIVATION OF (25) AND (26)
First, we instantiate the input-output system (1) for the even-MSE design using the optimal linear precoder and decoder solution of (5), where Z is the $p \times p$ IFFT matrix with $\{[Z]_{n,k} = (1/\sqrt{p})\exp(j2\pi nk/p);\ 0 \le k, n \le (p-1)\}$, as follows:

$$\hat{s} = Z^H \cdot \Sigma_R\Sigma_p\Sigma_T \cdot Z \cdot s + Z^H \cdot \Sigma_R\, n. \tag{B.1}$$

As earlier mentioned, taking advantage of the diagonal structure of the inner matrix $\Sigma_R\Sigma_p\Sigma_T$, the {IFFT, FFT} pair enforces equal diagonal elements for $Z^H \cdot \Sigma_R\Sigma_p\Sigma_T \cdot Z$. Since the {IFFT, FFT} pair is unitary, the trace of $Z^H \cdot \Sigma_R\Sigma_p\Sigma_T \cdot Z$ is the trace of the inner diagonal matrix. Consequently, the diagonal elements of $Z^H \cdot \Sigma_R\Sigma_p\Sigma_T \cdot Z$ are equal to $\sum_{i=1}^{p}\sigma_{R_i}\sigma_i\sigma_{T_i}/p$. Hence, the input-output equation (B.1) can be simply developed into

$$\hat{s} = \frac{\sum_{i=1}^{p}\sigma_{R_i}\sigma_i\sigma_{T_i}}{p}\, s + \begin{bmatrix} [Z]_{\bullet,0}^H \cdot \Sigma_R\Sigma_p\Sigma_T \cdot [Z]_{\bullet,1:(p-1)} \cdot s_{1:(p-1)} \\ \vdots \\ [Z]_{\bullet,p-1}^H \cdot \Sigma_R\Sigma_p\Sigma_T \cdot [Z]_{\bullet,0:(p-2)} \cdot s_{0:(p-2)} \end{bmatrix} + Z^H \cdot \Sigma_R\, n. \tag{B.2}$$

The last two terms, respectively, represent the interstream interference caused by the {IFFT, FFT} pair and the AWGN resulting from the unitary filtering of the receiver noise. To draw the equivalent AWGN channel model of the even-MSE design, these two terms are merged into a single term, denoted $\eta$, approximated [24] as a zero-mean white Gaussian noise vector of variance $\sigma_\eta^2$. Accordingly, the even-MSE design's AWGN-channel equivalent model can be drawn as follows:

$$\hat{s} = \frac{\sum_{i=1}^{p}\sigma_{R_i}\sigma_i\sigma_{T_i}}{p}\, s + \eta. \tag{B.3}$$

The evaluation of the previous model for the $i$th spatial stream leads to (25). We now calculate the equivalent noise variance $\sigma_\eta^2$. First, using the statistical independence of the elements of $n$ and the effect of the {IFFT, FFT} pair on inner diagonal matrices, it can be easily shown that the filtered noise term of (B.2) has a covariance matrix $\sigma_n^2\left(\sum_{i=1}^{p}\sigma_{R_i}^2/p\right) \cdot I_p$. Second, recalling the Vandermonde structure of Z and the fact that, up to the $1/\sqrt{p}$ normalization, $[Z]_{n,k}^{\,p} = 1$ for all $\{k, n\}$, we can show that

$$[Z]_{\bullet,0}^H \cdot \Sigma_R\Sigma_p\Sigma_T \cdot [Z]_{\bullet,1:(p-1)} = [Z]_{\bullet,j}^H \cdot \Sigma_R\Sigma_p\Sigma_T \cdot \big[\,[Z]_{\bullet,(j+1):(p-1)}\ \ [Z]_{\bullet,0:(j-1)}\,\big];\quad 1 \le j \le (p-1). \tag{B.4}$$

Analyzing the term of interstream interference in (B.2), in light of the latter equality, allows us to see that the variance of the interstream interference on the $p$ streams is the same. Straightforward calculations on the first stream show that the latter common variance is equal to $E_s\left[p\sum_{i=1}^{p}\left(\mu_i^{\mathrm{conv}}\right)^2 - \left(\sum_{i=1}^{p}\mu_i^{\mathrm{conv}}\right)^2\right]/p^2$, where $\mu_i^{\mathrm{conv}}$ stands for $\sigma_{R_i}\sigma_i\sigma_{T_i}$. Finally, since the filtered receive noise and the interstream interference are statistically independent, the sum of their above calculated variances coincides with the variance of their sum $\eta$ as stated in (26).

Figure 1: The considered $(M_T, M_R)$ spatial multiplexing MIMO system using linear joint transmit and receive optimization.

Figure 5: Comparison of the diversity orders exhibited by the spatial modes for (a) full SM and (b) spatial-mode selection for a (3, 3) MIMO setup at R = 12 bps/Hz and average receive $E_b/N_0$ = 20 dB. Conventional mode 3 ($\mathrm{SNR}_{\mathrm{conv}_3}$) does not appear in (b).

The latter noise variance also represents the equivalent MSE at the output of the $i$th spatial mode, which can be denoted by $[\mathrm{MSE}_p]_{i,i} = 1/[\mathrm{SNR}_p]_{i,i}$. Hence, using the same zero-forcing assumption, the even-MSE design enforces an equal MSE or noise variance across the $p$ streams equal to $\sum_{i=1}^{p}(1/[\mathrm{SNR}_p]_{i,i})/p$; thus its average BER, $\mathrm{BER}_{\text{even-MSE}}$, is approximately given by
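The two structural facts used in Appendix B, namely the equal diagonal elements of $Z^H \cdot \Sigma_R\Sigma_p\Sigma_T \cdot Z$ and the common interstream interference variance, can be checked numerically. The sketch below is only an illustration: it assumes unit-variance uncorrelated symbols scaled by $E_s$, takes the diagonal of $\Sigma_R\Sigma_p\Sigma_T$ as a list of hypothetical gains `mu`, and uses function names that do not appear in the paper.

```python
import cmath


def ifft_matrix(p):
    """p x p unitary IFFT matrix with [Z]_{n,k} = exp(j*2*pi*n*k/p) / sqrt(p)."""
    return [[cmath.exp(2j * cmath.pi * n * k / p) / p ** 0.5 for k in range(p)]
            for n in range(p)]


def equivalent_channel(mu):
    """Return A = Z^H diag(mu) Z, the even-MSE end-to-end symbol matrix."""
    p = len(mu)
    z = ifft_matrix(p)
    return [[sum(z[n][r].conjugate() * mu[n] * z[n][c] for n in range(p))
             for c in range(p)]
            for r in range(p)]


def interference_variance(mu, es=1.0):
    """Closed form from Appendix B: Es * [p * sum(mu^2) - (sum(mu))^2] / p^2."""
    p = len(mu)
    return es * (p * sum(m * m for m in mu) - sum(mu) ** 2) / p ** 2


def measured_interference_variance(mu, stream, es=1.0):
    """Es times the off-diagonal energy of row `stream` of A."""
    a = equivalent_channel(mu)
    return es * sum(abs(a[stream][c]) ** 2
                    for c in range(len(mu)) if c != stream)


if __name__ == "__main__":
    mu = [0.9, 0.6, 0.2]  # hypothetical per-mode gains sigma_R * sigma * sigma_T
    a = equivalent_channel(mu)
    print("diagonal:", [round(a[i][i].real, 6) for i in range(len(mu))])
    print("closed-form interference variance:", interference_variance(mu))
    print("per-stream measured variance:",
          [measured_interference_variance(mu, i) for i in range(len(mu))])
```

With all gains equal to one, which corresponds to the zero-forcing limit discussed in the text, the closed form returns zero, consistent with the asymptotic noise variance containing only the filtered receiver noise.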