A robust LU polynomial matrix decomposition for spatial multiplexing

This paper considers time-domain spatial multiplexing in MIMO wideband systems, using an LU-based polynomial matrix decomposition. Because the corresponding pre- and post-filters are not paraunitary, the output noise power is amplified and the performance of the system is degraded compared to the QR-based spatial multiplexing approach. The degradation is significant when the post-filter polynomial matrix is ill-conditioned. In this paper, we introduce simple transformations of the decomposition that solve the ill-conditioning problem. We show that this results in a MIMO spatial multiplexing scheme that is robust to noise and channel estimation errors. In the latter context, the proposed LU-based beamforming compares favorably to its QR-based counterpart in terms of complexity and bit error rate.

The received signal on each antenna is a superposition of the signals from the different transmit antennas, which results in co-channel interference (CCI). A conventional method for recovering the transmitted data sequence corrupted by channel interference is spatio-temporal vector coding (STVC) [4]. The STVC structure is suggested as a theoretical means of achieving capacity, and a reduced-complexity discrete matrix multitone (DMMT) technique is implemented by the authors to exploit the frequency-selective MIMO channel. It is based on discrete multitone, a technique that uses the discrete Fourier transform (DFT) to implement frequency-division multiplexing (FDM). DMMT is essentially analogous to the OFDM approach [5]: the wideband problem is reduced to a narrowband form by using a DFT or FFT to split the data into narrower frequency bands and applying an SVD at each frequency to decorrelate the signals [6]. This approach ignores correlations between frequency bands, and the SVD orders the output channels according to the power in each individual band, leading to a lack of phase coherence [7]. An alternative is to consider a time-domain scheme in which the diagonalization of the temporal MIMO channel is performed once for the entire system [8]. This design, based on polynomial matrix decomposition, transforms the MIMO channel into a number of independent single-input single-output (SISO) subchannels. It is one of the most efficient techniques and relies on a factorization of the MIMO channel polynomial matrix as:
$$H(z) = U(z)\,D(z)\,V(z), \tag{1}$$
where $U(z)$ and $V(z)$ are square matrices of sizes $N_r$ and $N_t$, respectively. If the inverses of $V(z)$ and $U(z)$, assuming they are stable and causal, are inserted into the transmission chain as pre- and post-filters, respectively, then the original MIMO channel becomes equivalent to $D(z)$. Diagonalization of $H(z)$, viz. 
the factorization in (1) with $D(z)$ diagonal, therefore reduces the MIMO wideband channel to $N = \min(N_t, N_r)$ independent SISO subchannels, thereby canceling the CCI. Such a decomposition is most commonly achieved using the popular polynomial singular value decomposition (PSVD) method, leading to paraunitary factors $U(z)$ and $V(z)$. This paraunitarity ensures that the power distributions of the signal and noise remain unaltered after post-filtering. However, given a polynomial matrix, a PSVD factorization as described above does not exist in general [9]. By contrast, the MIMO spatial multiplexing scheme presented in [10,11] completely eliminates the CCI. This beamforming method is inspired by a blind equalization method exploiting the Bezout identity [12,13]. It is based on a combination of the classical Smith canonical form and LU (Gauss elimination). The decomposition method in [11], called LU-PMD (LU-polynomial matrix decomposition), is effective and does not require any iteration: the algorithm terminates after a finite and prescribed number of steps, with a matrix $D(z)$ which is exactly diagonal. Moreover, it was shown in [11] that, except for some improbable original MIMO channels, all but the last of the resulting independent SISO subchannels reduce to simple additive noise channels. Therefore, in addition to completely canceling the CCI, this decomposition also inherently avoids the ISI problem. However, the corresponding factors $U(z)$ and $V(z)$ are unimodular rather than paraunitary, unlike in the QR-based methods. The loss of the latter property induces a serious limitation, namely an enhancement of the output noise. The role of the post-filter in this performance degradation was clarified: the degradation becomes severe as the norm and the condition number of the post-filter matrix-valued transfer function increase. Improving the post-filter matrix conditioning by a simple row balancing was proposed in [14]. A significant improvement of the performance, in terms of bit error rate, has been observed.
In this paper, we revisit the LU-based factorization in [11], in combination with the row balancing trick in [14]. We show that the resulting transformations solve the ill-conditioning problem and lead to a MIMO spatial multiplexing scheme that is robust to noise and channel estimation errors (see also [15] for a combination of spatial beamforming and channel estimation). In the latter context, the proposed LU-based beamforming compares favorably to its QR-based counterpart in terms of both complexity and bit error rate.
The structure of this paper is as follows. Section 2 is devoted to the LU-based decomposition method for the MIMO spatial multiplexing scheme. The noise enhancement problem is also explained. Two solutions, and a combination of both, are presented in Section 3. Simulation results showing that the proposed LU-based decomposition significantly reduces the noise enhancement are given in Section 3.2, together with a comparison with the QR-based scheme. The robustness of the proposed scheme to channel estimation errors is discussed in Section 4, again in comparison with the QR-based scheme. Finally, concluding remarks are given in Section 5.

MIMO spatial multiplexing scheme
Let us consider a MIMO communication system with $N_t$ transmitting antennas and $N_r$ receiving antennas, communicating through a channel represented by its matrix-valued transfer function $H(z) \in \mathbb{C}^{N_r \times N_t}$. Let $\{x_{i,k}\}_{k \in \mathbb{N}}$ denote the equivalent discrete-time causal signal on transmit antenna $i \in \{1, \dots, N_t\}$ and define its associated z-transform by:
$$x_i(z) = \sum_{k \ge 0} x_{i,k}\, z^{-k}. \tag{2}$$
We use the boldface notation $\mathbf{x}(z)$ for the column vector of size $N_t$ given by $\mathbf{x}(z) = [x_1(z) \cdots x_{N_t}(z)]^T$, where the superscript $T$ stands for the transpose operator. Likewise, we denote by $\mathbf{y}(z)$ the vector collecting the z-transforms of the discrete-time signals recorded on the $N_r$ receiving antennas. Then, the MIMO channel input-output relation reads in the z-transform domain as:
$$\mathbf{y}(z) = H(z)\,\mathbf{x}(z) + \mathbf{n}(z), \tag{3}$$
where $\mathbf{n}(z)$ stands for the z-transform of a sample realization of the noise corruption $\mathbf{n} \in \mathbb{C}^{N_r \times 1}$. Assume that the channel's transfer matrix admits a factorization $H(z) = U(z)D(z)V(z)$ as in (1). Then, using the inverses of $U(z)$ and $V(z)$, denoted
$$U_{po}(z) = U(z)^{-1} \quad \text{and} \quad V_{pr}(z) = V(z)^{-1},$$
respectively, as post- and pre-filters allows one to reduce the original MIMO channel to the simpler form $D(z)$. Indeed, if the original signal is pre-filtered before transmission as $\tilde{\mathbf{x}}(z) = V_{pr}(z)\mathbf{x}(z) = [\tilde{x}_1(z) \cdots \tilde{x}_{N_t}(z)]^T$, then the corresponding channel output becomes $\tilde{\mathbf{y}}(z) = H(z)\tilde{\mathbf{x}}(z) + \mathbf{n}(z)$. Thus, the post-filtering step $\hat{\mathbf{y}}(z) = U_{po}(z)\tilde{\mathbf{y}}(z)$ yields the final equivalent system:
$$\hat{\mathbf{y}}(z) = D(z)\,\mathbf{x}(z) + \tilde{\mathbf{n}}(z), \tag{4}$$
where we have set $\tilde{\mathbf{n}}(z) = U_{po}(z)\mathbf{n}(z)$ for the noise after post-filtering. The decomposition in Eq. (1) is mostly performed by polynomial matrix SVD. The corresponding factors $V(z)$ and $U(z)$ are then expected to be paraunitary, which means that they satisfy:
$$U(z)\,U(z)^{*} = I \quad \text{and} \quad V(z)\,V(z)^{*} = I,$$
where the notation $*$ stands for the para-Hermitian conjugation, that is, $[F(z)]^{*} = \overline{F(1/\bar{z})}^{T}$, and $I$ is the identity matrix of appropriate size. 
Thereby, the pre- and post-filters $V_{pr}(z) = V(z)^{*}$ and $U_{po}(z) = U(z)^{*}$ are also paraunitary, and setting $\mathrm{E}(\cdot)$ for the mathematical expectation, we have:
$$\mathrm{E}\big(\|\tilde{\mathbf{n}}(z)\|_2^2\big) = \mathrm{E}\big(\|\mathbf{n}(z)\|_2^2\big).$$
Likewise, we obtain $\|\tilde{\mathbf{x}}(z)\|_2^2 = \|\mathbf{x}(z)\|_2^2$, showing that, in this case, the pre- and post-filtering do not modify the mean power of the original signal and noise stochastic processes. Unfortunately, a polynomial matrix SVD does not exist in general. Of course, an SVD is clearly feasible if one relaxes the constraint that the factors be polynomial. But then, the presence of poles can lead to instability. Instead, a common solution is to consider a Laurent polynomial matrix decomposition. Several iterative algorithms have been proposed to obtain an approximate Laurent polynomial matrix SVD [16][17][18][19]. These methods can only generate approximately diagonal matrices $D(z)$, leading to inevitable residual CCI. The residual CCI may be drastically reduced by increasing the number of iterations in the algorithms, but at the expense of a large order of the polynomial matrix $D(z)$, which translates into increased complexity and more intersymbol interference (ISI) on each resulting SISO channel. Polynomial order truncation is introduced to limit the degrees of the polynomials, but this can affect the paraunitary property of the pre- and post-filters (see also [20,21], where the order growth problem is mitigated). In this regard, a MIMO beamforming scheme based on a combination of the classical Smith canonical form and LU (Gauss elimination) was presented in [11] as an alternative solution.
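
As an illustration of the power-preservation property discussed above, the following minimal Python sketch (not taken from the paper) checks paraunitarity on a frequency grid and compares the white-noise power gain of a paraunitary filter with that of a unimodular one. The storage convention (arrays of shape rows × cols × taps holding the coefficients of z^{-k}), the helper names, and both 2 × 2 example matrices are assumptions made for this illustration only.

```python
import numpy as np

def freq_response(F, n_fft=64):
    """Evaluate F(z) = sum_k F[:, :, k] z^{-k} at n_fft points of the unit circle."""
    return np.moveaxis(np.fft.fft(F, n=n_fft, axis=2), 2, 0)

def is_paraunitary(F, tol=1e-10):
    """Check F(e^{jw}) F(e^{jw})^H = I on the frequency grid (F square)."""
    eye = np.eye(F.shape[0])
    return all(np.allclose(M @ M.conj().T, eye, atol=tol) for M in freq_response(F))

def white_noise_gain(F):
    """Sum of squared coefficient magnitudes: output power for unit-variance white noise."""
    return float(np.sum(np.abs(F) ** 2))

def rotation(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

# Assumed paraunitary example: Q0 * diag(1, z^{-1}) * Q1 with Q0, Q1 orthogonal.
delay = np.zeros((2, 2, 2))
delay[0, 0, 0] = 1.0  # entry (0, 0) equals 1
delay[1, 1, 1] = 1.0  # entry (1, 1) equals z^{-1}
P = np.einsum("ij,jkt,kl->ilt", rotation(0.3), delay, rotation(1.1))

# Assumed unimodular (determinant 1) but non-paraunitary example: [[1, 0], [3 z^{-1}, 1]].
U = np.zeros((2, 2, 2))
U[0, 0, 0] = U[1, 1, 0] = 1.0
U[1, 0, 1] = 3.0

print(is_paraunitary(P), white_noise_gain(P))  # gain equals the number of outputs
print(is_paraunitary(U), white_noise_gain(U))  # gain exceeds the number of outputs
```

In this toy setting, the paraunitary filter leaves the noise power at the number of outputs, whereas the unimodular filter inflates it, which is the effect the LU-based scheme has to cope with.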

LU-based polynomial matrix decomposition (LU-PMD)
The decomposition algorithm is recalled in Section 3.1.1, with a reformulation in two nested recursions. Meanwhile, let us give an overview. Basically, the approach follows the same steps as the classical LU factorization. However, in each step, a preprocessing by the first step of the decomposition in Smith canonical form is applied. This preprocessing solves a Bezout equation in order to reduce the pivot element to a constant. We first obtain:
$$H(z) = U(z)\,R(z), \tag{5}$$
where $U(z)$ and $R(z)$ are, respectively, an $N_r \times N_r$ unimodular and an $N_r \times N_t$ upper triangular polynomial matrix. Next, the same decomposition is applied to $R(z)^T$ to complete the factorization. Then, for $N_r \ge N_t$, a common setting in MIMO systems, the factorization (1) follows with the middle factor
$$\begin{bmatrix} D(z) \\ O_{N_r - N_t,\, N_t} \end{bmatrix},$$
where $D(z)$ is an $N_t \times N_t$ diagonal matrix and $O_{i,j}$ is the zero matrix of size $i \times j$.

Noise amplification problem
First, observe as in [14] that if the channel's output noise $\mathbf{n}$ is spatially and temporally white, i.e., with power spectral density matrix $\mathrm{E}[\mathbf{n}(z)\mathbf{n}(z)^{*}] = \sigma^2 I_{N_r}$, then the post-filtered noise power in (8) is governed by $\sigma^2$ and by the norm of the post-filter $U_{po}(z)$. The noise component in the equivalent reduced system (i.e., after pre- and post-filtering) is thus amplified with respect to the original system whenever the norm of the post-filter is high. This is illustrated in Fig. 1.
In this experiment, a complete OFDM communication system is simulated with a 4-QAM modulation. The sequence $\{x_{i,k}\}_{k \ge 0}$ in (2) then represents the $i$th OFDM signal, including a cyclic prefix. The performance of the LU-based spatial multiplexing is measured by the corresponding bit error rate vs the SNR. Four different 3 × 3 MIMO channels, each corrupted by a unit-variance spatially and temporally white noise, are considered. The performance significantly degrades as the norm of the post-filter increases.
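
The dependence of the output noise power on the post-filter can be made explicit with a short computation. The following is a sketch under the white-noise assumption stated above; the coefficient matrices $U_k$ of the post-filter and the time index $m$ are notation introduced here, and the normalization may differ from the one adopted in (8).

```latex
\begin{align*}
U_{po}(z) &= \sum_{k} U_k z^{-k}, \qquad
\tilde{\mathbf n}_m = \sum_{k} U_k\, \mathbf n_{m-k}, \qquad
\mathrm{E}\big[\mathbf n_m \mathbf n_{m'}^{H}\big] = \sigma^2 \delta_{m,m'} I_{N_r}, \\
\mathrm{E}\,\|\tilde{\mathbf n}_m\|_2^2
 &= \sum_{k}\sum_{l} \mathrm{E}\big[\mathbf n_{m-l}^{H} U_l^{H} U_k\, \mathbf n_{m-k}\big]
  = \sigma^2 \sum_{k} \operatorname{Tr}\big(U_k^{H} U_k\big)
  = \sigma^2 \sum_{k} \|U_k\|_F^2 .
\end{align*}
```

For a paraunitary post-filter, the constant term of $U(z)U(z)^{*} = I$ gives $\sum_k U_k U_k^{H} = I_{N_r}$, so the sum equals $N_r$ and the input noise power $\sigma^2 N_r$ is preserved. For a unimodular post-filter, no such constraint holds and the sum can be much larger.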
Clearly, this performance loss cannot be explained only by the noise power enhancement since the output signal y also undergoes the same post-filtering. Therefore, an analysis based on signal-to-noise ratios is more relevant.
To proceed, note that the post-filtering operation amounts to solving the perturbed linear system $U(z)\hat{\mathbf{y}}(z) = H(z)\tilde{\mathbf{x}}(z) + \mathbf{n}(z)$, with the error term $\mathbf{n}(z)$. Let us denote by $\kappa(U)$ the condition number of $U(z)$, defined with respect to a matrix norm expressed through the trace operator $\mathrm{Tr}(\cdot)$. Now, compare the communication systems in (3) and (4) in the light of classical perturbation analysis [22]. The corresponding noise-to-signal ratios are then related through $\kappa(U)$ (see [14]). When the post-filter $U(z)$ is ill-conditioned, i.e., $\kappa(U) \gg 1$, the noise-to-signal ratio can be significantly higher for the reduced system than for the original one. This explains the performance drop observed in the experiment reported in Fig. 2.
In this experiment, we have considered again the previous setting, with 3 × 3 randomly selected MIMO channels with Rayleigh distribution. The system's performance is measured by the corresponding bit error rate vs the SNR. The same experiment is then repeated with 4 × 4 and 5 × 5 MIMO channels in Figs. 3 and 4, respectively. All these experiments confirm a drop in performance as the condition number of the post-filter increases.
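
The role of the condition number can be traced back to the classical perturbation bound for linear systems. As a sketch, consider a fixed frequency, write $A$ for the factor $U(z)$ evaluated there, and use any vector norm together with its induced matrix norm; these conventions are assumptions of this note and need not coincide with the norm used in [14].

```latex
\begin{align*}
A\,y = b, \qquad A\,(y + \delta y) = b + \delta b
\;\;&\Longrightarrow\;\; \delta y = A^{-1}\,\delta b, \\
\|\delta y\| \le \|A^{-1}\|\,\|\delta b\|, \qquad \|b\| \le \|A\|\,\|y\|
\;\;&\Longrightarrow\;\;
\frac{\|\delta y\|}{\|y\|} \le \kappa(A)\,\frac{\|\delta b\|}{\|b\|},
\qquad \kappa(A) = \|A\|\,\|A^{-1}\| .
\end{align*}
```

With $b = H\tilde{\mathbf{x}}$, $\delta b = \mathbf{n}$, $y = D\mathbf{x}$, and $\delta y = \tilde{\mathbf{n}}$, the left-hand side is the noise-to-signal ratio of the reduced system and the right-hand side is $\kappa(A)$ times that of the original system, so a large condition number leaves room for a large degradation.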
A row balancing of the post-filter was proposed in [14] as a solution to keep both the norm and the condition number of the post-filter low (see [23]). This method consists in replacing the preceding post-filter $U_{po}(z)$ by $S(z)$ of the form:
$$S(z) = W\,U_{po}(z), \tag{10}$$
where $W$ is a constant diagonal matrix selected such that each row of $S(z)$ has unit norm. The diagonal elements $W_{i,i}$ of $W$ then read as:
$$W_{i,i} = \frac{1}{\big\|[U_{po}(z)]_i\big\|},$$
where $[A]_i$ denotes the $i$th row of matrix $A$. Accordingly, the channel's output signal after this modified post-filtering would read as:
$$S(z)\,\tilde{\mathbf{y}}(z) = W D(z)\,\mathbf{x}(z) + W\,\tilde{\mathbf{n}}(z).$$
Good performance in terms of bit error rate was observed. Despite this improvement, the LU-based polynomial matrix decomposition for MIMO beamforming remains less competitive than the state-of-the-art methods because of the post-filter noise amplification.
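
The row balancing step can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the implementation of [14]: polynomial matrices are stored as arrays of shape rows × cols × taps, the row norm is taken as the Euclidean norm of all coefficients in the row, the condition number is the worst 2-norm condition number over a frequency grid, and the 2 × 2 post-filter is a made-up example.

```python
import numpy as np

def row_norms(F):
    """Norm of each row of a polynomial matrix stored as (rows, cols, taps)."""
    return np.sqrt(np.sum(np.abs(F) ** 2, axis=(1, 2)))

def row_balance(F):
    """Return S = W F and the constant diagonal W that gives S unit-norm rows."""
    w = 1.0 / row_norms(F)
    return w[:, None, None] * F, np.diag(w)

def worst_condition_number(F, n_fft=64):
    """Largest 2-norm condition number of F(e^{jw}) over a frequency grid."""
    Fw = np.moveaxis(np.fft.fft(F, n=n_fft, axis=2), 2, 0)
    return max(float(np.linalg.cond(M)) for M in Fw)

# Assumed toy post-filter [[1, 0], [-3 z^{-1}, 1]] (unimodular, badly scaled rows).
U_po = np.zeros((2, 2, 2))
U_po[0, 0, 0] = U_po[1, 1, 0] = 1.0
U_po[1, 0, 1] = -3.0

S, W = row_balance(U_po)
print(row_norms(S))                                  # every row has unit norm
print(np.linalg.norm(S))                             # sqrt(number of rows)
print(worst_condition_number(U_po), worst_condition_number(S))
```

With unit-norm rows, the overall norm of the balanced filter is the square root of the number of rows. In this toy example the worst-case condition number also decreases, from roughly 10.9 to roughly 6.2, although row scaling alone does not guarantee a well-conditioned matrix.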

Source of ill-conditioning
To identify where the abovementioned ill-conditioning stems from, let us recall one iteration of the LU-based factorization (see [11]). The decomposition of $H(z)$ in (5) can be rephrased as a recursion of the form:
$$H_k(z) = \Gamma_k(z)\,H_{k-1}(z), \qquad k = 1, \dots, N, \tag{13}$$
initialized to $H_0(z) = H(z)$ and ending at $H_N(z) = R(z)$, with $N = \min(N_r - 1, N_t)$. The form of the polynomial transition matrix, denoted here by $\Gamma_k(z)$, is given later. Given the $(k-1)$th iterate, we describe how to get to the next step. First, the $k$th diagonal entry $h^{(k)}_{k,k}(z)$ is reduced to the greatest common divisor (gcd) of the polynomials $h^{(k)}_{k+\ell,k}(z)$, $\ell = 0, \dots, N_r - k$, through the recursion:
$$d_{k,\ell}(z) = \gcd\big(d_{k,\ell-1}(z),\, h^{(k)}_{k+\ell,k}(z)\big), \qquad d_{k,0}(z) = h^{(k)}_{k,k}(z), \tag{16}$$
which runs until $\ell = \ell_k$ such that either $1 \le \ell_k < N_r - k$ and $d_{k,\ell_k}(z) = 1$, or $\ell_k = N_r - k$. Each iteration of this recursion is implemented in matrix form by a left multiplication by a polynomial matrix $A_{k,\ell}(z)$. Its nontrivial entries are $\bar{d}_{k,\ell-1}(z)$ and $\bar{h}^{(k)}_{k+\ell,k}(z)$, the respective quotients of $d_{k,\ell-1}(z)$ and $h^{(k)}_{k+\ell,k}(z)$ by their gcd $d_{k,\ell}(z)$, together with the coefficients obtained from the associated Bezout equation. Next, the $k$th iteration of the recursion (13) is completed by a Gaussian elimination step. This is achieved by left multiplying $A_k(z) = A_{k,\ell_k}(z)A_{k,\ell_k-1}(z) \cdots A_{k,1}(z)$, obtained at the end of the recursion (16), by a polynomial matrix $L_k(z)$. The polynomial transition matrix in the recursion (13) then reads $\Gamma_k(z) = L_k(z)A_k(z)$ and has the block diagonal form:
$$\Gamma_k(z) = \begin{bmatrix} I_{k-1} & O \\ O & \Omega_k(z) \end{bmatrix},$$
for some polynomial matrix $\Omega_k(z)$. The first $k-1$ rows of $\Gamma^{(k)}(z) = \Gamma_k(z)\Gamma^{(k-1)}(z)$ are therefore identical to those of $\Gamma^{(k-1)}(z)$. Meanwhile, the degrees of the remaining rows are increased, compared to $\Gamma^{(k-1)}(z)$, because of the left multiplication by the polynomial matrix $\Omega_k(z)$. As a consequence, the final matrix
$$U_{po}(z) = U(z)^{-1} = \Gamma_N(z) \cdots \Gamma_1(z)$$
is badly scaled. This explains why the post-filter is ill-conditioned [23].
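
The pivot reduction step relies on polynomial gcds and Bezout coefficients. The sketch below illustrates that ingredient with the extended Euclidean algorithm in exact rational arithmetic. It is not the algorithm of [11]: the function names, the coefficient ordering (lowest power first, with x standing for z^{-1}), and the 2 × 2 elementary matrix [[u, v], [-b_bar, a_bar]] are assumptions chosen to show a standard unimodular step that maps a pair (a, b) to (gcd, 0).

```python
from fractions import Fraction

def trim(p):
    """Drop trailing zero coefficients; the zero polynomial is the empty list."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def poly(*coeffs):
    """Polynomial as a list of exact coefficients, lowest power first."""
    return trim(Fraction(c) for c in coeffs)

def add(p, q):
    n = max(len(p), len(q))
    p = list(p) + [Fraction(0)] * (n - len(p))
    q = list(q) + [Fraction(0)] * (n - len(q))
    return trim(a + b for a, b in zip(p, q))

def scale(p, c):
    return trim(c * a for a in p)

def mul(p, q):
    out = [Fraction(0)] * max(len(p) + len(q) - 1, 0)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def divmod_poly(a, b):
    """Quotient and remainder of a by b (b must be nonzero)."""
    r, b = trim(a), trim(b)
    q = [Fraction(0)] * max(len(r) - len(b) + 1, 1)
    while len(r) >= len(b):
        k = len(r) - len(b)
        q[k] = r[-1] / b[-1]
        r = add(r, scale([Fraction(0)] * k + b, -q[k]))
    return trim(q), r

def bezout(a, b):
    """Return (d, u, v) with u*a + v*b = d, where d is the monic gcd of a and b."""
    r0, r1 = trim(a), trim(b)
    u0, u1, v0, v1 = poly(1), poly(), poly(), poly(1)
    while r1:
        q, r = divmod_poly(r0, r1)
        r0, r1 = r1, r
        u0, u1 = u1, add(u0, scale(mul(q, u1), -1))
        v0, v1 = v1, add(v0, scale(mul(q, v1), -1))
    c = 1 / r0[-1]
    return scale(r0, c), scale(u0, c), scale(v0, c)

a = mul(poly(1, 1), poly(1, 2))   # (1 + x)(1 + 2x)
b = mul(poly(1, 1), poly(1, -1))  # (1 + x)(1 - x)
d, u, v = bezout(a, b)
a_bar, _ = divmod_poly(a, d)      # quotient of a by the gcd
b_bar, _ = divmod_poly(b, d)      # quotient of b by the gcd

# The elementary matrix [[u, v], [-b_bar, a_bar]] maps the column (a, b) to (d, 0).
top = add(mul(u, a), mul(v, b))
bottom = add(mul(scale(b_bar, -1), a), mul(a_bar, b))
det = add(mul(u, a_bar), mul(v, b_bar))
print(d == poly(1, 1), top == d, bottom == poly(), det == poly(1))
```

The determinant of the elementary matrix is the constant 1, so the step is unimodular. Its entries are polynomials of nonzero degree whenever a reduction is actually needed, which is how such steps raise the degrees of the rows they touch.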

A robust post-filter
As explained above, the row imbalance induced by the iterations of the decomposition leads to an ill-conditioned post-filter. Observe that the reduction steps of the decomposition, implemented by the multiplications by the polynomial matrices $A_k(z)$, are one of the main sources of this row imbalance. Recall that these steps are applied at each iteration $k$ to reduce the pivot (the diagonal element of column $k$) to the greatest common divisor of the pivot and the polynomials in column $k$ beneath the diagonal. As already mentioned, the iterations described in the preceding subsection are applied to $R(z)^T$ to complete the decomposition (1). Consider iteration $k$ in this context and call $d(z)$ the gcd of the pivot and the polynomials in column $k$ of $R(z)^T$ beneath the diagonal. As a result of the factorization (5) described above, the corresponding pivot is already the greatest common divisor of all the subchannels from the original $k$th transmit antenna to the receive antennas $k, \dots, N_r$. Now, the reduction step for this iteration seeks $d(z)$ as the gcd of (1) all subchannels from the original $k$th transmit antenna to the receive antennas $k, \dots, N_r$ and (2) all the subchannels linking the transmit antennas $k, \dots, N_t$ with the $k$th receive antenna. Most likely, $d(z)$ will be equal to 1, leading to $A_k(z) \equiv I_N$. A direct consequence is that the pre-filter $V_{pr}(z)$ is better conditioned than the post-filter $U_{po}(z)$.
We thus come to the conclusion that the noise amplification can be avoided by a simple modification of the decomposition, namely swapping the order in which the pre- and post-filters are computed. To see this, let us consider the decomposition in (1) applied to $G(z) = H(z)^T$ instead of $H(z)$, i.e.,

$$G(z) = H(z)^T = U(z)\,D(z)\,V(z). \tag{20}$$
Then, transposing back again, we obtain:
$$H(z) = V(z)^T\,D(z)^T\,U(z)^T.$$
The left factor is now $V(z)^T$, so the post-filter becomes its inverse, $V(z)^{-T}$. Now, since the design of $V(z)$ is most likely free from the reduction step, the output noise enhancement is avoided. This allows the post-filter to have improved properties. Since the pre-filter has no effect on the noise component, its conditioning properties will not affect the system's performance.
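
The swapping argument can be checked numerically on a small example. The sketch below is only an illustration: the helper names, the storage of polynomial matrices as arrays of shape rows × cols × taps, and the three 2 × 2 factors are assumptions. It verifies that transposing a factorization of $G(z) = H(z)^T$ yields a factorization of $H(z)$ whose left factor is $V(z)^T$, and that inverting the new left and right factors diagonalizes the channel.

```python
import numpy as np

def polymat(entries):
    """Build a (rows, cols, taps) array from nested lists of coefficients of z^{-k}."""
    taps = max(len(c) for row in entries for c in row)
    F = np.zeros((len(entries), len(entries[0]), taps))
    for i, row in enumerate(entries):
        for j, c in enumerate(row):
            F[i, j, :len(c)] = c
    return F

def polymat_mul(A, B):
    """Polynomial matrix product: entries are multiplied by convolution."""
    C = np.zeros((A.shape[0], B.shape[1], A.shape[2] + B.shape[2] - 1))
    for i in range(A.shape[0]):
        for j in range(B.shape[1]):
            for m in range(A.shape[1]):
                C[i, j] += np.convolve(A[i, m], B[m, j])
    return C

def transpose(F):
    return F.transpose(1, 0, 2)

def same(A, B):
    """Compare two polynomial matrices, ignoring trailing zero taps."""
    taps = max(A.shape[2], B.shape[2])
    pad = lambda F: np.pad(F, ((0, 0), (0, 0), (0, taps - F.shape[2])))
    return np.allclose(pad(A), pad(B))

# Assumed factors of G(z) = H(z)^T = U(z) D(z) V(z).
U = polymat([[[1], [0]], [[0, 2], [1]]])      # [[1, 0], [2 z^{-1}, 1]]
D = polymat([[[1], [0]], [[0], [1, 1]]])      # diag(1, 1 + z^{-1})
V = polymat([[[1], [0, 3]], [[0], [1]]])      # [[1, 3 z^{-1}], [0, 1]]

G = polymat_mul(polymat_mul(U, D), V)
H = transpose(G)                              # the actual channel

# After swapping: the post-filter inverts V(z)^T, the pre-filter inverts U(z)^T.
post = polymat([[[1], [0]], [[0, -3], [1]]])  # inverse of V(z)^T
pre = polymat([[[1], [0, -2]], [[0], [1]]])   # inverse of U(z)^T
equivalent = polymat_mul(polymat_mul(post, H), pre)

print(same(H, polymat_mul(polymat_mul(transpose(V), transpose(D)), transpose(U))))
print(same(equivalent, transpose(D)))
```

Both checks print True for this example: the equivalent channel is the diagonal matrix $D(z)^T$, and the post-filter is built from $V(z)$ only.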

Performance analysis
The proposed "left-right swapping" scheme is compared to the "row balancing" solution described in [14]. The condition number of the new post-filter $V(z)^{-T}$ and that of the post-filter $S(z)$ in (10), obtained with the "row balancing" technique, are computed. The output noise power after post-filtering is also computed via (8) with $\sigma^2 = 1$. To this end, we consider several $(p \times p)$ MIMO systems, for $p = 3, 4, \dots, 15$. For each system, we calculate the average power and the average 1 condition number on 100 randomly simulated Rayleigh fading channels $H(z)$. With the row balancing, the output noise power is readily given by $\sqrt{p}$. Table 1 displays the obtained results.
The results show that the proposed "left-right swapping" scheme provides a better conditioned post-filter matrix, with a reasonable norm (output noise power). It is therefore expected that this translates into enhanced MIMO-OFDM performance. The effect in terms of bit error rate is now studied in a MIMO-OFDM system. For the simulation, we consider a spatial multiplexing scheme using the V-BLAST algorithm, with the ITU Pedestrian-A channel model and the following parameters: 20 MHz of bandwidth, $N_s = 512$ subcarriers, a cyclic prefix of length $CP = N_s/8 = 64$, and 4-QAM modulation. Figures 5 and 6 show the BER comparison, in MIMO-OFDM time-domain spatial multiplexing, between the classical LU-PMD post-filtering, the modified post-filter based on "row balancing," and the "left-right swapping" scheme. A significant improvement is obtained with the proposed method in both 3 × 3 MIMO contexts: indoor (Fig. 5) and outdoor (Fig. 6). The performance gain is particularly large in the more severe outdoor context. For example, the same BER level of $10^{-3}$ is reached with the proposed solution at an SNR about 5 dB lower than with the "row balancing" trick. This is due to the fact that the post-filter matrix is now better conditioned, even though the output filtering power remains reasonably high.

Comparison with QR-based spatial multiplexing
In this subsection, we compare the performance of the improved scheme in a MIMO-OFDM system with that of the QR-based spatial multiplexing [19]. For the QR decomposition, we have set the tolerance parameter $\varepsilon = 10^{-3}$ for the off-diagonal elements. With this value, the residual CCI is insignificant. The truncation parameter is selected as $\mu = 10^{-3}$ to limit the growth of the degrees of the Laurent polynomials in the final reduced equivalent channel $D(z)$. We refer to [18] for more details on the meaning and roles of these parameters. For the purpose of the comparison, we have simulated a complete transmission chain from the encoding/interleaving block of the original binary source to the final demodulation block, through an outdoor pedestrian ITU 3 × 3 MIMO channel. The different BERs are displayed in Fig. 7. Figure 7 shows that, in terms of BER in MIMO wideband spatial multiplexing, the LU-PMD using "left-right swapping" compares favorably to the QR approach, even at low SNR. The interesting properties of the LU-PMD decomposition (low complexity, CCI cancelation, and ISI mitigation) now become apparent.

A robust and unitary post-filter
As mentioned above and observed in [14], the "row balancing" trick improves the conditioning of the post-filter matrix. Swapping the pre- and post-filter matrices also results in an improved beamforming system, as argued above. We therefore propose in this section a combination of both improvements, that is, (1) to swap the left and right factors of the decomposition to obtain a better conditioned post-filter at the receiver and (2) to apply a row balancing to further improve its conditioning. The final resulting post-filter matrix is subsequently denoted by $Q(z)$. Table 2 shows how this combination further improves the conditioning of the post-filter matrix. These results are obtained with the same simulation setting as in Table 1. The performance in terms of BER in 3 × 3, 4 × 4, and 5 × 5 MIMO systems is studied with an ITU Pedestrian-A channel model. The results, presented in Figs. 8, 9, and 10, respectively, confirm the expectation that the combination of the two methods improves the performance compared to each one taken separately.

Discussion
In a spatial multiplexing problem, a common underlying assumption is that the coefficients of the polynomial matrix representing the MIMO channel are available. Accordingly, in all the preceding experiments, the pre- and post-filters correspond exactly to the right and left factors of the decomposition of the channel that is actually used to simulate the transmission system. In practice, however, the channel coefficients result from an estimation procedure, so the pre- and post-filters do not stem from the decomposition of the exact transmission MIMO channel. In this section, we thus study the impact of channel estimation errors. In the sequel, we compute the pre- and post-filters from the decomposition of the estimated channel $\hat{H}(z)$, but the MIMO transmission system is still simulated using the exact channel matrix $H(z)$. The QR-based decomposition is also implemented in this channel/filter mismatch setting for comparison. We evaluate the BER performance for different values of the relative error $E_r = E/\|H(z)\|_2$, where $E$ measures the magnitude of the estimation error. The results are presented in Figs. 11 and 12. For $E_r = 0.01$, we observe that, for both the LU-based and QR-based methods, the BER curves obtained with the exact channel coincide with those corresponding to the estimated channel: very small channel estimation errors do not affect the BER of either method. However, the BER performance drops significantly as the estimation errors increase, and this is particularly visible at high SNR, when the noise effect is no longer dominant. The proposed LU spatial multiplexing scheme appears to be more robust to channel estimation imperfections than the QR-based method. Therefore, the proposed LU-PMD with the "left-right swapping" scheme is better suited to realistic conditions than the QR-based approach, because it provides better BER performance in the presence of channel estimation errors.
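
A channel/filter mismatch experiment of this kind needs an estimated channel with a prescribed relative error. The following Python sketch shows one possible way to generate it. The perturbation model (an additive complex Gaussian error rescaled to the target level), the norm (Euclidean norm over all coefficients, which may differ from the norm used in the definition of $E_r$), the function name, and the channel dimensions are all assumptions of this illustration, not the procedure used for Figs. 11 and 12.

```python
import numpy as np

def perturb_channel(H, rel_err, rng):
    """Return H_hat = H + dH with ||dH|| = rel_err * ||H|| (norm over all coefficients)."""
    dH = rng.standard_normal(H.shape) + 1j * rng.standard_normal(H.shape)
    dH *= rel_err * np.linalg.norm(H) / np.linalg.norm(dH)
    return H + dH

rng = np.random.default_rng(0)
# Assumed 3 x 3 channel with 4 taps and i.i.d. complex Gaussian (Rayleigh-fading) coefficients.
H = (rng.standard_normal((3, 3, 4)) + 1j * rng.standard_normal((3, 3, 4))) / np.sqrt(2)

for rel_err in (0.01, 0.05, 0.1):
    H_hat = perturb_channel(H, rel_err, rng)
    # The filters would be designed from H_hat while the link is simulated with H.
    print(rel_err, np.linalg.norm(H_hat - H) / np.linalg.norm(H))
```

Designing the pre- and post-filters from `H_hat` while simulating the transmission with `H` reproduces the mismatch setting described above for any decomposition method under test.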

Conclusion
Unlike the QR-based decompositions of polynomial matrices, the LU-based decomposition is simple and exact. Nonetheless, this approach was hitherto discarded in MIMO wideband spatial multiplexing applications because of an amplification of the output noise. We have presented in this paper a simple but effective solution to this problem of output noise enhancement. Previous studies had established that the performance limitation of the LU-based spatial multiplexing was essentially due to an ill-conditioning of the corresponding post-filter polynomial matrix. Matrix row balancing was then proposed, and a significant reduction of the noise amplification was observed. Here, we have shown that the ill-conditioning of the post-filter matrix is caused by the pivot reduction step during the polynomial matrix factorization. A simple permutation of the left and right factors of the decomposition was sufficient to significantly improve the BER performance compared to the previous row balancing solution. A combination of both solutions then results in an LU-based polynomial matrix decomposition approach for MIMO spatial multiplexing in which the noise amplification is avoided. Finally, we have shown that the proposed LU-based multiplexing scheme compares favorably to the state-of-the-art QR-based methods in the realistic setting where knowledge of the channel's coefficient matrices is corrupted by estimation errors.