
Using tensor contractions to derive the structure of slice-wise multiplications of tensors with applications to space–time Khatri–Rao coding for MIMO-OFDM systems

This article has been updated

Abstract

The slice-wise multiplication of two tensors is required in a variety of tensor decompositions (including PARAFAC2 and PARATUCK2) and is encountered in many applications, including the analysis of multidimensional biomedical data (EEG, MEG, etc.) or multi-carrier multiple-input multiple-output (MIMO) systems. In this paper, we propose a new tensor representation that is not based on a slice-wise (matrix) description, but can be represented by a double contraction of two tensors. Such a double contraction of two tensors can be efficiently calculated via generalized unfoldings. It leads to new tensor models of the investigated system that do not depend on the chosen unfolding (in contrast to matrix models) and reveal the tensor structure of the data model, such that all possible unfoldings can be seen at the same time. As an example, we apply this new concept to the design of new receivers for multi-carrier MIMO systems in wireless communications. In particular, we consider MIMO-orthogonal frequency division multiplexing (OFDM) systems with and without Khatri–Rao coding. The proposed receivers exploit the channel correlation between adjacent subcarriers, require the same amount of training symbols as traditional OFDM techniques, but have an improved performance in terms of the symbol error rate. Furthermore, we show that the spectral efficiency of the Khatri–Rao-coded MIMO-OFDM can be increased by introducing cross-coding such that the “coding matrix” also contains useful information symbols. Considering this transmission technique, we derive a tensor model and two types of receivers for cross-coded MIMO-OFDM systems using the double contraction of two tensors.

1 Introduction

In many tensor applications, we only have an element-wise or a slice-wise description of our data/signal model. For instance, there exists only a slice-wise description of the PARATUCK2 decomposition and the PARAFAC2 decomposition corresponding to a certain unfolding of the overall tensor [1,2,3]. In the same way, some proposed tensor-based models for MIMO-OFDM communication systems have only an element-wise or a slice-wise representation [4]. Further examples include the slice-wise description of MIMO communication systems using two-way relaying [5, 6]. This description of the signal models does not reveal the tensor structure explicitly. Hence, the derivation of all tensor unfoldings is not always obvious. Therefore, we propose to express the slice-wise multiplication of two tensors in terms of the double contraction operator and use it to derive an explicit tensor structure of the received data tensor in the form of a CP-like, or Tucker-like, model in a systematic way. These explicit tensor models reveal all the possible generalized unfoldings at the same time and can subsequently be exploited to estimate the model parameters. One of our main contributions is to provide a systematic way to derive such an explicit tensor representation.

OFDM is the most widely used multi-carrier technique in current wireless communication systems. It is robust in multipath propagation environments and has a simple and efficient implementation [7, 8]. Using the fast Fourier transform (FFT), the complete frequency band is divided into smaller frequency subcarriers. Moreover, the use of the cyclic prefix mitigates the inter-symbol interference (ISI) and the inter-carrier interference (ICI). Typically, the OFDM receiver is implemented in the frequency domain based on a zero forcing (ZF) filter. More advanced solutions are proposed in [9], and optimal training and channel estimation schemes for OFDM systems are proposed in [10, 11].

Tensor-based signal processing offers improved identifiability, uniqueness, and more efficient denoising compared to matrix-based techniques. In [4], a MIMO multi-carrier system is modeled using tensor algebra and the PARATUCK2 tensor decomposition, resulting in a novel space, time, and frequency coding structure. Similarly, in [12], trilinear coding in space, time, and frequency is proposed for MIMO-OFDM systems based on the CP tensor decomposition. By exploiting tensor models, semi-blind receivers are introduced for multi-carrier communication systems in [13, 14]. All these works create the tensor structure through additional spreading, which significantly reduces the spectral efficiency. Moreover, previous publications on tensor models for multi-carrier communication systems [4, 12, 13, 14] do not exploit the channel correlation between the adjacent subcarriers. The previously mentioned publications rely on the subcarrier-wise description of the MIMO-OFDM system. This description of the signal models does not reveal the tensor structure explicitly. Hence, the derivation of all tensor unfoldings is not always obvious. In [15], a PARAFAC model and a Tucker model are proposed for joint channel, data, and phase noise estimation in MIMO-OFDM systems, taking into account the phase noise due to inter-carrier interference. The author of [16] also proposes a tensor model for filter bank-based multi-carrier (FBMC) communication systems. However, this model is derived from a PARATUCK2 decomposition, which is not based on tensor contractions. This derivation, although popular, is not as general as the derivation proposed in this paper. In other words, the tensor modeling approach of [16] is restricted to FBMC systems, while our approach is valid for any MIMO-OFDM system with orthogonal or nonorthogonal subcarriers.

In this paper, we propose a new approach to model MIMO-OFDM communication systems and to design semi-blind receivers. The idea is built upon a double contraction model that allows us to replace the slice-wise multiplication of two tensors so that the explicit tensor structure of the data model can be derived. We provide the mathematical tools to derive such an explicit tensor structure in general. The received data in a MIMO-OFDM system are derived from such an explicit tensor structure, which is efficiently exploited at the receiver for a joint channel and symbol estimation. More specifically, we first present the double contraction between an uncoded signal tensor and a channel tensor for OFDM systems, yielding the same spectral efficiency as matrix-based approaches (since no additional spreading is used) [17]. We then propose an application of the double contraction operator to Khatri–Rao-coded MIMO-OFDM systems [18]. Due to the Khatri–Rao coding, the signal tensor has a richer structure and can be recast as a constrained CP-like model. The Khatri–Rao space–time coding concept was introduced in [19]. Later, it was extended in [20] to Khatri–Rao space–time–frequency coding. In contrast to the state of the art [4, 13, 14, 20], in this work we exploit the structure of the channel and the contraction properties using the transmit signal tensor and the known coding matrix to propose a receiver based on the LS-KRF. In addition, we reduce the number of required pilot symbols by exploiting the correlation of the channel in the frequency domain, which has not been exploited in these previous works. Finally, we propose a more spectrally efficient cross-coding model for MIMO-OFDM systems. In this case, the known and fixed Khatri–Rao coding matrix is eliminated, and two useful symbol matrices are cross-coded by means of the Khatri–Rao product. By exploiting the CP-like tensor structure of the received signal, we also design two types of receivers for the cross-coded MIMO-OFDM systems.

This paper is organized as follows. In Sect. 2, we introduce the tensor algebra notation and provide the mathematical tools to derive an explicit tensor structure from the slice-wise multiplication of two tensors. Section 3 describes the system model using the double contraction formalism for the traditional MIMO-OFDM transmission. In Sect. 4, we recast the tensor signal model for the Khatri–Rao-coded MIMO-OFDM case and present the two closed-form receiver designs for this system, which are based on the Khatri–Rao factorization. In Sect. 5, we consider a cross-coded MIMO-OFDM system with enhanced spectral efficiency and derive the corresponding semi-blind receivers. A discussion on the computational complexity of the different receivers is also carried out. In Sect. 6, numerical results are presented, and the paper is concluded in Sect. 7.

2 Tensor algebra and notation

2.1 Notation

We use the following notation. Scalars are denoted either as capital or lower-case italic letters, \(A, a\). Vectors and matrices are denoted as bold-faced lower-case and capital letters, \({\varvec{{a}}}, {\varvec{{A}}}\), respectively. Tensors are represented by bold-faced calligraphic letters \({\varvec{{{\mathcal {A}}}}}\). The superscripts \(^{\mathrm{T}}\), \(^{\mathrm{H}}\), \(^{{-1}}\), and \(^+\) denote transposition, Hermitian transposition, matrix inversion, and Moore–Penrose pseudo-inversion, respectively. The outer product, Kronecker product, and Khatri–Rao product are denoted as \(\circ\), \(\otimes\), and \(\diamond\), respectively. Moreover, we denote the Hadamard product (element-wise multiplication) and the inverse Hadamard product (element-wise division) between two arrays of equal dimensions as \(\odot\) and \(\oslash\), respectively. The operators \(\left| \left| .\right| \right| _{\text {F}}\) and \(\left| \left| .\right| \right| _{\text {H}}\) denote the Frobenius norm and the higher order norm of a tensor that is defined as the square root of the sum of the squared absolute values of its elements, respectively. Moreover, the n-mode product between a tensor \({\varvec{{{\mathcal {A}}}}} \in {\mathbb {C}}^{I_1\times I_2 \ldots \times I_N}\) and a matrix \({\varvec{{B}}} \in {\mathbb {C}}^{J \times I_n}\) is denoted as \({\varvec{{{\mathcal {A}}}}}\times _n{\varvec{{B}}}\), for \(n=1, 2, \ldots N\) [21]. The identity N-way tensor of dimension \(R\times R\cdots \times R\) is denoted as \({\varvec{{{\mathcal {I}}}}}_{N,R}\). Similarly, an identity matrix of dimension \({R\times R}\) is denoted as \({\varvec{{I}}}_R\) and we denote a vector of ones of length R as \({\varvec{{1}}}_R\).
The nth three-mode slice of a tensor \({\varvec{{{\mathcal {A}}}}} \in {\mathbb {C}}^{I\times J\times N}\) is denoted as \({\varvec{{{\mathcal {A}}}}}_{(.,.,n)}\) and accordingly one element of this tensor is denoted as \({\varvec{{{\mathcal {A}}}}}_{(i,j,n)}\). The operator \(\mathrm{{diag}}(.)\) transforms a vector into a diagonal matrix and the operator \(\mathrm{{vec}}(.)\) transforms a matrix into a vector. Note that we distinguish between a super-diagonal or an identity tensor and a diagonal tensor. A diagonal tensor is a tensor that consists of diagonal slices along one dimension. For instance, a diagonal tensor \({{\varvec{{{\mathcal {D}}}}}_A \in {\mathbb {C}}^{M\times N\times N}}\) that is diagonal along the first dimension has diagonal one-mode slices, i.e., \({{\varvec{{{\mathcal {D}}}}}_A}_{(m,.,.)} = \mathrm{{diag}}({\varvec{{a}}}_m)\), for \(m = 1,\ldots ,M\), where \({\varvec{{a}}}_m\) is an N-dimensional vector. The concatenation of two tensors along their mth dimension is denoted as \(\sqcup _m\) [22]. For two tensors \({\varvec{{{\mathcal {A}}}}} \in {\mathbb {C}}^{I\times I_2 \times I_3}\) and \({\varvec{{{\mathcal {B}}}}} \in {\mathbb {C}}^{J\times I_2 \times I_3}\), after the concatenation along the first dimension, we get \({\varvec{{{\mathcal {A}}}}} \sqcup _1 {\varvec{{{\mathcal {B}}}}} \in {\mathbb {C}}^{(I + J)\times I_2 \times I_3}\).

2.2 The CP decomposition and generalized tensor unfoldings

The CP tensor decomposition decomposes a given tensor into the minimum number of rank one components. The CP decomposition of a four-way, rank R tensor \({\varvec{{{\mathcal {A}}}}} \in {\mathbb {C}}^{I \times J \times M \times N}\) can be written as

$$\begin{aligned} {\varvec{{{\mathcal {A}}}}} = {\varvec{{{\mathcal {I}}}}}_{4,R}\times _1{\varvec{{F}}}_1\times _2{\varvec{{F}}}_2\times _3{\varvec{{F}}}_3\times _4{\varvec{{F}}}_4, \end{aligned}$$
(1)

where \({\varvec{{F}}}_1 \in {\mathbb {C}}^{I \times R}, {\varvec{{F}}}_2 \in {\mathbb {C}}^{J \times R}\), \({\varvec{{F}}}_3 \in {\mathbb {C}}^{M \times R}\), and \({\varvec{{F}}}_4 \in {\mathbb {C}}^{N \times R}\) are the factor matrices [21, 23]. In addition to the n-mode unfoldings, generalized matrix unfoldings can be defined by using two subsets of any of the N dimensions [24, 25]. For instance, the set of modes \((1, 2,\ldots ,N)\) of an N-way tensor \({\varvec{{{\mathcal {A}}}}}\) can be divided into two non-overlapping subsets with cardinality P and \(N-P\), \(\alpha ^{(1)}=[\alpha _1 \ldots \alpha _P]\) and \(\alpha ^{(2)}=[\alpha _{P+1} \ldots \alpha _N]\), respectively. This leads to the generalized unfolding \(\left[ {\varvec{{{\mathcal {A}}}}} \right] _{(\alpha ^{(1)},\alpha ^{(2)})}\), where the indices contained in \(\alpha ^{(1)}\) vary along the rows and the indices contained in \(\alpha ^{(2)}\) vary along the columns. Here, the index \(\alpha _1\) varies the fastest between the rows, the index \(\alpha _{P+1}\) varies the fastest between the columns, P is any number between one and N, and \(\alpha _{n}\) is any of the tensor dimensions. For instance, let us assume the four-way tensor \({\varvec{{{\mathcal {A}}}}} \in {\mathbb {C}}^{I \times J \times M \times N}\) defined in Eq. (1). In the generalized unfolding \([{\varvec{{{\mathcal {A}}}}}]_{([1,2],[3,4])}\) the first mode varies faster than the second mode along the rows and the third mode varies faster than the fourth mode along the columns. Moreover, for a tensor with a CP structure, its unfoldings and generalized unfoldings can be expressed in terms of the factor matrices. For instance, the generalized unfolding \([{{\varvec{{{\mathcal {A}}}}}}]_{([1, 2],[3, 4])}\) of the tensor \({\varvec{{{\mathcal {A}}}}}\) satisfies [18, 25]

$$\begin{aligned}{}[{{\varvec{{{\mathcal {A}}}}}}]_{([1, 2],[3, 4])} =\left( {\varvec{{F}}}_2\diamond {\varvec{{F}}}_1\right) \cdot \left( {\varvec{{F}}}_4\diamond {\varvec{{F}}}_3\right) ^{\mathrm{T}}. \end{aligned}$$

In a similar way, the rest of the tensor unfoldings and generalized unfoldings can be defined.
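The unfolding identity above can be checked numerically. The following is a minimal NumPy sketch, not part of the original derivation: the helper names `khatri_rao` and `unfold` are hypothetical, modes are 0-indexed in the code (so the paper's modes 1 to 4 become 0 to 3), and the dimensions are arbitrary small values chosen for illustration.

```python
import numpy as np

def khatri_rao(A, B):
    """Column-wise Kronecker product of A (I x R) and B (J x R), size (I*J x R)."""
    I, R = A.shape
    J, R2 = B.shape
    assert R == R2, "both factors need the same number of columns"
    # Row index is i*J + j, i.e., the row index of B varies fastest.
    return (A[:, None, :] * B[None, :, :]).reshape(I * J, R)

def unfold(T, rows, cols):
    """Generalized unfolding: the first listed mode varies fastest (0-indexed modes)."""
    T = np.transpose(T, list(rows) + list(cols))
    n_rows = int(np.prod(T.shape[:len(rows)]))
    # Fortran order makes the first index of each group vary fastest.
    return T.reshape(n_rows, -1, order="F")

rng = np.random.default_rng(0)
I, J, M, N, R = 2, 3, 4, 5, 3  # illustrative sizes (assumption)
F1, F2, F3, F4 = (rng.standard_normal((d, R)) for d in (I, J, M, N))

# Four-way CP tensor: sum of R rank-one terms, as in Eq. (1).
A = np.einsum("ir,jr,mr,nr->ijmn", F1, F2, F3, F4)

lhs = unfold(A, (0, 1), (2, 3))                    # [A]_([1,2],[3,4])
rhs = khatri_rao(F2, F1) @ khatri_rao(F4, F3).T    # (F2 ⋄ F1)(F4 ⋄ F3)^T
print(np.allclose(lhs, rhs))
```

The Fortran-order reshape is the design choice that encodes the convention "the first index varies the fastest"; with the default C order, the Khatri–Rao factors would appear in the opposite order.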

2.3 Tensor contraction

The contraction \({\varvec{{{\mathcal {A}}}}}\bullet _n^m{\varvec{{{\mathcal {C}}}}}\) between two tensors \({\varvec{{{\mathcal {A}}}}} \in {\mathbb {C}}^{I_1\times I_2 \ldots \times I_N}\) and \({\varvec{{{\mathcal {C}}}}} \in {\mathbb {C}}^{J_1\times J_2 \ldots \times J_N}\) represents an inner product of the nth mode of \({\varvec{{{\mathcal {A}}}}}\) with the mth mode of \({\varvec{{{\mathcal {C}}}}}\), provided that \(I_n = J_m\) [26]. Contraction along several modes of compatible dimensions is also possible and accordingly the contraction along two modes is denoted as \({\varvec{{{\mathcal {A}}}}}\bullet _{n,k}^{m,l}{\varvec{{{\mathcal {C}}}}}\). More specifically, the double contraction between the tensors \({\varvec{{{\mathcal {A}}}}} \in {\mathbb {C}}^{I \times J \times M \times N}\) and \({\varvec{{{\mathcal {C}}}}} \in {\mathbb {C}}^{M \times N \times K}\) is defined as [26],

$$\begin{aligned} ({\varvec{{{\mathcal {A}}}}}\bullet _{3,4}^{1,2}{\varvec{{{\mathcal {C}}}}})_{(i,j,k)} \triangleq \sum _{n=1}^{N}{\sum _{m=1}^{M}{\varvec{{{\mathcal {A}}}}}_{(i,j,m,n)}\cdot {\varvec{{{\mathcal {C}}}}}_{(m,n,k)}}={\varvec{{{\mathcal {T}}}}}_{(i,j,k)}. \end{aligned}$$

This example represents a contraction of the third and fourth mode of \({\varvec{{{\mathcal {A}}}}}\) with the first and second mode of \({\varvec{{{\mathcal {C}}}}}\), respectively.

Using the concept of the generalized unfoldings, it can be shown that the tensor contraction satisfies

$$\begin{aligned}{}[{\varvec{{{\mathcal {A}}}}}\bullet _{3,4}^{1,2}{\varvec{{{\mathcal {C}}}}}]_{([1,2],3)}&= [{\varvec{{{\mathcal {A}}}}}]_{([1,2],[3,4])}\cdot [{\varvec{{{\mathcal {C}}}}}]_{([1,2],3)}{,} \end{aligned}$$
(2)
$$\begin{aligned}{}[{\varvec{{{\mathcal {A}}}}}\bullet _{3,4}^{1,2}{\varvec{{{\mathcal {C}}}}}]_{([1,2],3)}&= [{\varvec{{{\mathcal {A}}}}}]_{([1,2],[4,3])}\cdot [{\varvec{{{\mathcal {C}}}}}]_{([2,1],3)}. \end{aligned}$$
(3)

In the generalized unfolding \([{\varvec{{{\mathcal {A}}}}}]_{([1,2],[3,4])}\) the first mode varies faster than the second mode between the rows and the third mode varies faster than the fourth mode between the columns.
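As an illustration of Eqs. (2) and (3), the sketch below computes the double contraction once with `np.einsum` and once as a product of generalized unfoldings. It is a sanity check under assumed small dimensions; `unfold` is a hypothetical helper with 0-indexed modes, not a routine from the paper.

```python
import numpy as np

def unfold(T, rows, cols):
    """Generalized unfolding: the first listed mode varies fastest (0-indexed modes)."""
    T = np.transpose(T, list(rows) + list(cols))
    n_rows = int(np.prod(T.shape[:len(rows)]))
    return T.reshape(n_rows, -1, order="F")

rng = np.random.default_rng(1)
I, J, M, N, K = 2, 3, 4, 5, 6  # illustrative sizes (assumption)
A = rng.standard_normal((I, J, M, N))
C = rng.standard_normal((M, N, K))

# Double contraction: modes 3,4 of A with modes 1,2 of C (element-wise definition).
T = np.einsum("ijmn,mnk->ijk", A, C)
T_unf = unfold(T, (0, 1), (2,))

# Eq. (2): both unfoldings order the contracted modes as (3,4) and (1,2).
T_eq2 = unfold(A, (0, 1), (2, 3)) @ unfold(C, (0, 1), (2,))
# Eq. (3): the contracted modes are swapped consistently in both factors.
T_eq3 = unfold(A, (0, 1), (3, 2)) @ unfold(C, (1, 0), (2,))

print(np.allclose(T_unf, T_eq2), np.allclose(T_unf, T_eq3))

# Folding the matrix product back recovers the three-way tensor.
T_back = T_eq2.reshape(I, J, K, order="F")
print(np.allclose(T_back, T))
```

The only requirement is that the contracted modes are ordered identically in the columns of the first unfolding and in the rows of the second one.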

2.4 Contraction properties for element-wise and slice-wise multiplications

2.4.1 Hadamard product via tensor contraction

First, let us consider a Hadamard product (element-wise multiplication) between two vectors \({\varvec{{a}}} \in {\mathbb {C}}^{M \times 1}\) and \({\varvec{{b}}} \in {\mathbb {C}}^{M \times 1}\), \(c_{(m)}=a_{(m)}b_{(m)}\), \(\forall m=1,\ldots , M\) (\({\varvec{{c}}} \in {\mathbb {C}}^{M \times 1}\)). The Hadamard product can be expressed via the multiplication of a diagonal matrix and a vector, i.e., \({\varvec{{a}}} \odot {\varvec{{b}}} = \mathop {\mathrm{{diag}}}\left( {\varvec{{a}}}\right) {\varvec{{b}}} = \mathop {\mathrm{{diag}}}\left( {\varvec{{b}}}\right) {\varvec{{a}}}\). Using the fact that a matrix multiplication is equivalent to the contraction \({\bullet }_{2}^{1}\), we get \({\varvec{{a}}} \odot {\varvec{{b}}} = \mathop {\mathrm{{diag}}}\left( {\varvec{{a}}}\right) {\bullet }_{2}^{1}{\varvec{{b}}} = \mathop {\mathrm{{diag}}}\left( {\varvec{{b}}}\right) {\bullet }_{2}^{1}{\varvec{{a}}}.\)

Next, for the Hadamard product between two matrices \({\varvec{{A}}} \in {\mathbb {C}}^{M \times N}\) and \({\varvec{{B}}} \in {\mathbb {C}}^{M \times N}\), \({\varvec{{C}}}_{(m,n)}={\varvec{{A}}}_{(m,n)}{\varvec{{B}}}_{(m,n)}\), \(\forall m=1,\ldots , M\) and \(n=1,\ldots , N\), we can show that \({\varvec{{C}}} = {\varvec{{A}}} \odot {\varvec{{B}}} = {\varvec{{{\mathcal {D}}}}}_A{\bullet }_{2,4}^{1,2}{\varvec{{B}}} = {\varvec{{{\mathcal {D}}}}}_B{\bullet }_{2,4}^{1,2}{\varvec{{A}}}\). Here \({\varvec{{{\mathcal {D}}}}}_A \in {\mathbb {C}}^{M \times M \times N \times N}\) and \({\varvec{{{\mathcal {D}}}}}_B \in {\mathbb {C}}^{M \times M \times N \times N}\) are diagonal four-way tensors with nonzero elements \({{\varvec{{{\mathcal {D}}}}}_A}_{(m,m,n,n)} = {\varvec{{A}}}_{(m,n)}\) and \({{\varvec{{{\mathcal {D}}}}}_B}_{(m,m,n,n)} = {\varvec{{B}}}_{(m,n)}\), respectively. As an alternative, we also have

$$\begin{aligned} {\varvec{{C}}} = {\varvec{{A}}} \odot {\varvec{{B}}} = {\varvec{{{\mathcal {D}}}}}^{(A)}{\bullet }_{2,3}^{1,3}{\varvec{{{\mathcal {D}}}}}^{(B)}, \end{aligned}$$

where the diagonal three-way tensors have the following nonzero elements \({{\varvec{{{\mathcal {D}}}}}^{(A)}}_{(m,m,n)}={\varvec{{A}}}_{(m,n)}\) and \({{\varvec{{{\mathcal {D}}}}}^{(B)}}_{(m,n,n)}={\varvec{{B}}}_{(m,n)}\). Moreover, these diagonal three-way tensors can be defined either in terms of slices, \({{\varvec{{{\mathcal {D}}}}}^{(A)}}_{(.,.,n)}=\mathop {\mathrm{{diag}}}\left( {{\varvec{{A}}}_{(.,n)}}\right) , \forall n = 1,\ldots , N\), \({{\varvec{{{\mathcal {D}}}}}^{(B)}}_{(m,.,.)}=\mathop {\mathrm{{diag}}}\left( {{\varvec{{B}}}_{(m,.)}}\right) , \forall m = 1,\ldots , M\), or using tensor notation, \({{\varvec{{{\mathcal {D}}}}}^{(A)}} = {\varvec{{{\mathcal {I}}}}}_{3,M} \times _3 {\varvec{{A}}}^{\mathrm{T}}\) and \({{\varvec{{{\mathcal {D}}}}}^{(B)}} = {\varvec{{{\mathcal {I}}}}}_{3,N} \times _1 {\varvec{{B}}}\).
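Both contraction forms of the Hadamard product can be verified with a few lines of NumPy. The sketch below is illustrative only: the helper `identity_tensor` and the variable names are assumptions, the sizes are arbitrary, and the n-mode products are written as `einsum` calls.

```python
import numpy as np

def identity_tensor(order, R):
    """Super-diagonal identity tensor I_{order,R} with ones on the main diagonal."""
    T = np.zeros((R,) * order)
    idx = np.arange(R)
    T[(idx,) * order] = 1.0
    return T

rng = np.random.default_rng(2)
M, N = 3, 4  # illustrative sizes (assumption)
A = rng.standard_normal((M, N))
B = rng.standard_normal((M, N))

# Four-way diagonal tensor with D_A[m, m, n, n] = A[m, n].
D_A = np.zeros((M, M, N, N))
for m in range(M):
    for n in range(N):
        D_A[m, m, n, n] = A[m, n]
# Contract modes 2,4 of D_A with modes 1,2 of B.
C_four = np.einsum("mpnq,pq->mn", D_A, B)

# Three-way diagonal tensors built with n-mode products.
D3_A = np.einsum("abc,nc->abn", identity_tensor(3, M), A.T)  # I_{3,M} x_3 A^T
D3_B = np.einsum("abc,ma->mbc", identity_tensor(3, N), B)    # I_{3,N} x_1 B
# Contract modes 2,3 of D3_A with modes 1,3 of D3_B.
C_three = np.einsum("mpq,pnq->mn", D3_A, D3_B)

print(np.allclose(C_four, A * B), np.allclose(C_three, A * B))
```

The three-way variant stores far fewer zeros than the four-way one, which is one practical reason to prefer it when the tensors are formed explicitly.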

2.4.2 Slice-wise multiplication via tensor contraction

A slice-wise multiplication between two tensors \({\varvec{{{\mathcal {A}}}}} \in {\mathbb {C}}^{M \times N \times K}\) and \({\varvec{{{\mathcal {B}}}}} \in {\mathbb {C}}^{N \times J \times K}\) is defined as \({{\varvec{{{\mathcal {T}}}}}_1}_{(.,.,k)} = {\varvec{{{\mathcal {A}}}}}_{(.,.,k)}{\varvec{{{\mathcal {B}}}}}_{(.,.,k)}\), \(\forall k = 1,\ldots , K\). We depict this slice-wise multiplication in Fig. 1. To express this slice-wise multiplication we can diagonalize \({\varvec{{{\mathcal {B}}}}}\) to obtain

$$\begin{aligned} {\varvec{{{\mathcal {T}}}}}_1 = {\varvec{{{\mathcal {A}}}}}{\bullet }_{2,3}^{1,4}{\varvec{{{\mathcal {D}}}}}_B \in {\mathbb {C}}^{M\times J \times K}, \end{aligned}$$

where \({\varvec{{{\mathcal {D}}}}}_B\in {\mathbb {C}}^{N\times J \times K \times K}\) has nonzero elements \({{\varvec{{{\mathcal {D}}}}}_B}_{(n,j,k,k)} = {{\varvec{{{\mathcal {B}}}}}}_{(n,j,k)}\) or \({{\varvec{{{\mathcal {D}}}}}_{B}}_{(n,j,.,.)}=\mathop {\mathrm{{diag}}}\left( {{\varvec{{{\mathcal {B}}}}}}_{(n,j,.)}\right)\), for \(n =1,\ldots N\) and \(j =1,\ldots J\). Further combinations are also possible that lead to the same result, for instance, \({\varvec{{{\mathcal {T}}}}}_2 = {\varvec{{{\mathcal {D}}}}}_B{\bullet }_{1,4}^{2,3}{\varvec{{{\mathcal {A}}}}} \in {\mathbb {C}}^{J\times K \times M}\) or \({\varvec{{{\mathcal {T}}}}}_3 = {\varvec{{{\mathcal {D}}}}}_A{\bullet }_{2,4}^{1,3}{\varvec{{{\mathcal {B}}}}} \in {\mathbb {C}}^{M\times K \times J}\) with \({{\varvec{{{\mathcal {D}}}}}_A}_{(m,n,k,k)} = {{\varvec{{{\mathcal {A}}}}}}_{(m,n,k)}\) as diagonal elements (nonzero elements of \({\varvec{{{\mathcal {D}}}}}_A\)). Note that the tensors \({\varvec{{{\mathcal {T}}}}}_1\), \({\varvec{{{\mathcal {T}}}}}_2\), and \({\varvec{{{\mathcal {T}}}}}_3\) contain the same elements, but have permuted dimensions. However, the permuted order of the dimensions is not relevant, because we always explicitly declare which dimension is multiplied or unfolded.
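The equivalence between the slice-wise definition and the double contraction with a diagonalized tensor can be sketched as follows. This is a minimal check with assumed small dimensions and hypothetical variable names; modes are 0-indexed in the code.

```python
import numpy as np

rng = np.random.default_rng(3)
M, N, J, K = 2, 3, 4, 5  # illustrative sizes (assumption)
A = rng.standard_normal((M, N, K))
B = rng.standard_normal((N, J, K))

# Reference: slice-wise matrix products along the third mode.
T_slices = np.stack([A[:, :, k] @ B[:, :, k] for k in range(K)], axis=2)

# Diagonalize B: D_B[n, j, k, k] = B[n, j, k], zero elsewhere.
idx = np.arange(K)
D_B = np.zeros((N, J, K, K))
D_B[:, :, idx, idx] = B

# T_1: contract modes 2,3 of A with modes 1,4 of D_B, result is M x J x K.
T1 = np.einsum("mnq,njkq->mjk", A, D_B)
# T_2: contract modes 1,4 of D_B with modes 2,3 of A, result is J x K x M.
T2 = np.einsum("njkq,mnq->jkm", D_B, A)

# Diagonalize A instead: D_A[m, n, k, k] = A[m, n, k].
D_A = np.zeros((M, N, K, K))
D_A[:, :, idx, idx] = A
# T_3: contract modes 2,4 of D_A with modes 1,3 of B, result is M x K x J.
T3 = np.einsum("mnkq,njq->mkj", D_A, B)

print(np.allclose(T1, T_slices))
print(np.allclose(T2.transpose(2, 0, 1), T1), np.allclose(T3.transpose(0, 2, 1), T1))
```

The last line confirms that the three results hold the same elements and differ only by a permutation of the modes.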

Fig. 1 A slice-wise multiplication between two tensors \({\varvec{{{\mathcal {A}}}}} \in {\mathbb {C}}^{M \times N \times K}\) and \({\varvec{{{\mathcal {B}}}}} \in {\mathbb {C}}^{N \times J \times K}\)

2.4.3 Representation of diagonal matrices and diagonal tensors in terms of Khatri–Rao products

An explicit expression of the diagonalized tensor can be obtained by expressing its generalized unfolding in terms of a Khatri–Rao product with an identity matrix. First, let us consider the column vector \({\varvec{{a}}} \in {\mathbb {C}}^{M}\). It can be easily shown that

$$\begin{aligned} \mathop {\mathrm{{diag}}}\left( {\varvec{{a}}}\right) = {\varvec{{I}}}_M\diamond {\varvec{{a}}}^{\mathrm{T}}. \end{aligned}$$

Next, let us consider the reshaping of the matrix \({\varvec{{A}}}\in {\mathbb {C}}^{M\times N}\) into a diagonal tensor \({{\varvec{{{\mathcal {D}}}}}^{(A)}} = {\varvec{{{\mathcal {I}}}}}_{3,M} \times _3 {\varvec{{A}}}^{\mathrm{T}}\). By studying the resulting tensor structure, the tensor unfoldings, and the properties of the Khatri–Rao product, we get

$$\begin{aligned} {\left[ {\varvec{{{\mathcal {D}}}}}^{(A)} \right] _{([3,2],[1])}}= {\varvec{{I}}}_M \diamond {{{\varvec{{A}}}}}^{\mathrm{T}}. \end{aligned}$$

Likewise, for the tensor \({{\varvec{{{\mathcal {D}}}}}^{(B)}} = {\varvec{{{\mathcal {I}}}}}_{3,N} \times _1 {\varvec{{B}}}\in {\mathbb {C}}^{M\times N\times N}\) and the matrix \({\varvec{{B}}}\in {\mathbb {C}}^{ M \times N}\), we have \({\left[ {\varvec{{{\mathcal {D}}}}}^{(B)} \right] _{([1,3],[2])}}= {\varvec{{I}}}_N \diamond {{{\varvec{{B}}}}}\).
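The three Khatri–Rao identities of this subsection can be confirmed numerically. The sketch below assumes the column-wise Kronecker definition of the Khatri–Rao product and the "first index varies fastest" unfolding convention of Sect. 2.2; `khatri_rao` and `unfold` are hypothetical helpers and the sizes are arbitrary.

```python
import numpy as np

def khatri_rao(A, B):
    """Column-wise Kronecker product of A (I x R) and B (J x R), size (I*J x R)."""
    I, R = A.shape
    J, _ = B.shape
    return (A[:, None, :] * B[None, :, :]).reshape(I * J, R)

def unfold(T, rows, cols):
    """Generalized unfolding: the first listed mode varies fastest (0-indexed modes)."""
    T = np.transpose(T, list(rows) + list(cols))
    n_rows = int(np.prod(T.shape[:len(rows)]))
    return T.reshape(n_rows, -1, order="F")

rng = np.random.default_rng(4)
M, N = 3, 4  # illustrative sizes (assumption)
a = rng.standard_normal(M)
A = rng.standard_normal((M, N))
B = rng.standard_normal((M, N))

# diag(a) = I_M ⋄ a^T, with a^T treated as a 1 x M matrix.
ok_vec = np.allclose(np.diag(a), khatri_rao(np.eye(M), a[None, :]))

# D^(A)[m, m, n] = A[m, n] and D^(B)[m, n, n] = B[m, n].
D3_A = np.zeros((M, M, N))
D3_B = np.zeros((M, N, N))
for m in range(M):
    for n in range(N):
        D3_A[m, m, n] = A[m, n]
        D3_B[m, n, n] = B[m, n]

# [D^(A)]_([3,2],[1]) = I_M ⋄ A^T and [D^(B)]_([1,3],[2]) = I_N ⋄ B.
ok_A = np.allclose(unfold(D3_A, (2, 1), (0,)), khatri_rao(np.eye(M), A.T))
ok_B = np.allclose(unfold(D3_B, (0, 2), (1,)), khatri_rao(np.eye(N), B))
print(ok_vec, ok_A, ok_B)
```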

Table 1 Link between the diagonalized tensor structures and their generalized unfoldings

The expression of the diagonalized tensor in terms of its generalized unfoldings and the Khatri–Rao product with an identity matrix can also be obtained for N-way tensors. It is useful to note that there exists a link between the diagonalized tensor structures and their corresponding generalized unfoldings. The latter can always be expressed as a Khatri–Rao product between an identity matrix and a generalized unfolding of the tensor to be diagonalized, where the dimensions that are diagonalized are in the columns of the second matrix. This notation will be used later in this paper, and it is given in Table 1.

The element-wise or slice-wise multiplication between two arrays of the same order can be written in terms of a contraction if the unaffected mode vectors are transformed into a diagonal matrix (by adding an additional array dimension). This diagonalization can be performed using the Khatri–Rao product as shown in Table 1. As an example, please refer to the transformation of Eq. (4) to the equations at the beginning of Sect. 3.3 in this paper.

3 MIMO-OFDM

We assume a MIMO-OFDM system with \(M_{{\mathrm{T}}}\) transmit and \(M_{\mathrm{R}}\) receive antennas. One OFDM block consists of N samples, which equals the discrete Fourier transform (DFT) length, using the assumption that all N subcarriers are used for data transmission. If guard subcarriers are used, i.e., not all subcarriers are used for data transmission, the number of OFDM samples is smaller than the DFT length. All signals and equations used for the following derivation are in the frequency domain. Moreover, N is the number of subcarriers and K denotes the number of transmitted frames, where each frame consists of N symbol periods. The received signal in the frequency domain \(\tilde{{\varvec{{{\mathcal {Y}}}}}} \in {\mathbb {C}}^{N\times M_{\mathrm{R}} \times K}\) after the removal of the cyclic prefix is defined by means of the contraction operator

$$\begin{aligned} \tilde{{\varvec{{{\mathcal {Y}}}}}} = \tilde{{\varvec{{{\mathcal {H}}}}}}\bullet _{2,4}^{1,2}\tilde{{\varvec{{{\mathcal {S}}}}}} + \tilde{{\varvec{{{\mathcal {N}}}}}} = \tilde{{\varvec{{{\mathcal {Y}}}}}}_0 + \tilde{{\varvec{{{\mathcal {N}}}}}}. \end{aligned}$$
(4)

We use \(\sim\) to distinguish the frequency domain from the time domain, i.e., \(\tilde{{\varvec{{{\mathcal {Y}}}}}}={{\varvec{{{\mathcal {Y}}}}}}\times _1{\varvec{{F}}}_N\), where \({\varvec{{F}}}_N\in {\mathbb {C}}^{N\times N}\) is the DFT matrix and \({\varvec{{{\mathcal {Y}}}}}\) is the received signal in the time domain. The transmit signal tensor is denoted as \(\tilde{{\varvec{{{\mathcal {S}}}}}} \in {\mathbb {C}}^{N \times M_{{\mathrm{T}}} \times K}\) and \(\tilde{{\varvec{{{\mathcal {N}}}}}} \in {\mathbb {C}}^{N \times M_{\mathrm{R}} \times K}\) represents the additive white Gaussian noise in the frequency domain. The tensor \(\tilde{{\varvec{{{\mathcal {Y}}}}}}_0 \in {\mathbb {C}}^{N\times M_{\mathrm{R}} \times K}\) represents the noiseless received signal in the frequency domain after the removal of the cyclic prefix. The frequency-selective propagation channel is represented by a channel tensor \(\tilde{{\varvec{{{\mathcal {H}}}}}} \in {\mathbb {C}}^{N\times N \times M_{\mathrm{R}} \times M_{{\mathrm{T}}}}\), as we proposed in [18]. Its structure is detailed in the following.

3.1 Channel tensor

We assume that the frequency-selective channel has an impulse response \({\varvec{{h}}}_L^{(m_{\mathrm{R}},m_{{\mathrm{T}}})} \in {\mathbb {C}}^{L \times 1}\), for each receive–transmit antenna pair, \((m_{\mathrm{R}},m_{{\mathrm{T}}})\), for \(m_{\mathrm{R}} = 1\ldots M_{\mathrm{R}}\) and \(m_{{\mathrm{T}}} = 1\ldots M_{{\mathrm{T}}}\), and a maximum of L taps. After the removal of the cyclic prefix, the channel matrix in the frequency domain is a diagonal matrix for each receive–transmit antenna pair, \(\tilde{{\varvec{{H}}}}^{(m_{\mathrm{R}},m_{{\mathrm{T}}})}=\mathrm{{ diag}}\left( {\varvec{{F}}}_{N\times L}\cdot {\varvec{{h}}}_L^{(m_{\mathrm{R}},m_{{\mathrm{T}}})}\right) \in {\mathbb {C}}^{N \times N}\) [10, 11]. Here, the matrix \({\varvec{{F}}}_{N\times L} \in {\mathbb {C}}^{N \times L}\) contains the first L columns of the DFT matrix of size \(N \times N\). Collecting all the channel matrices in a four-way channel tensor \(\tilde{{\varvec{{{\mathcal {H}}}}}}\), we get

$$\begin{aligned}&\tilde{{\varvec{{{\mathcal {H}}}}}}_{(.,.,m_{\mathrm{R}},m_{{\mathrm{T}}})} = \mathrm{{ diag}}\left( {\varvec{{F}}}_{N\times L}\cdot {\varvec{{h}}}_L^{(m_{\mathrm{R}},m_{{\mathrm{T}}})}\right) =\mathrm{{ diag}}\left( \tilde{{\varvec{{h}}}}^{(m_{\mathrm{R}},m_{{\mathrm{T}}})}\right) . \end{aligned}$$
(5)

For each receive–transmit antenna pair, the channel transfer matrix is a diagonal matrix that is represented by the corresponding slice of the tensor \(\tilde{{\varvec{{{\mathcal {H}}}}}}\) as shown in (5). The vector \(\tilde{{\varvec{{h}}}}^{(m_{\mathrm{R}},m_{{\mathrm{T}}})} \in {\mathbb {C}}^{N\times 1}\) contains the frequency domain channel coefficients. An example of a MIMO system with \(M_{{\mathrm{T}}}=2\) transmit antennas and \(M_{\mathrm{R}}=3\) receive antennas and the corresponding channel vectors is depicted in Fig. 2. We assume that the channel stays constant during the K frames. Note that only in case of cyclic prefix OFDM the channel tensor in the frequency domain contains diagonal matrices for each receive–transmit antenna pair. In a general multi-carrier system, the frequency domain channel matrix is not necessarily diagonal. However, Eq. (4) is still satisfied which means that our general model is valid for any multi-carrier MIMO system (not only OFDM-based), including systems without orthogonality in the frequency domain and systems with different types of coding.
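To make Eqs. (4) and (5) concrete, the sketch below builds the channel tensor from random impulse responses and checks that the double contraction reproduces the familiar per-subcarrier model, in which each subcarrier sees an \(M_{\mathrm{R}} \times M_{{\mathrm{T}}}\) channel matrix. It is a noiseless toy example: the dimensions, the random symbols, and the unnormalized DFT matrix are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(5)
N, L, MR, MT, K = 8, 3, 3, 2, 4  # illustrative sizes (assumption)

def crandn(*shape):
    """Complex Gaussian samples (illustrative stand-in for channel taps and symbols)."""
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

h = crandn(L, MR, MT)            # impulse responses, one per antenna pair
F = np.fft.fft(np.eye(N))        # unnormalized N x N DFT matrix (assumption)

# Frequency-domain coefficients: first L columns of the DFT matrix times the taps.
h_freq = np.einsum("nl,lrt->nrt", F[:, :L], h)       # N x MR x MT

# Channel tensor of Eq. (5): diagonal N x N slice for each antenna pair.
idx = np.arange(N)
H = np.zeros((N, N, MR, MT), dtype=complex)
H[idx, idx, :, :] = h_freq

S = crandn(N, MT, K)             # transmit signal tensor, N x MT x K

# Noiseless part of Eq. (4): contract modes 2,4 of H with modes 1,2 of S.
Y0 = np.einsum("nprt,ptk->nrk", H, S)                # N x MR x K

# Reference: independent MR x MT channel matrix on every subcarrier.
Y_ref = np.einsum("nrt,ntk->nrk", h_freq, S)
print(np.allclose(Y0, Y_ref))
```

In practice one would never form the mostly zero four-way tensor explicitly; the point of the sketch is only that the contraction model and the subcarrier-wise model agree.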

Fig. 2 A MIMO system with \(M_{{\mathrm{T}}}=2\) transmit antennas and \(M_{\mathrm{R}}=3\) receive antennas (left-hand side). Visualization of the generalized unfolding \([\tilde{{\varvec{{{\mathcal {H}}}}}}]_{([1,3],[2,4])}\) for the MIMO-OFDM (right-hand side)

In (5), we have defined the channel tensor. However, up to this point, we have not revealed the explicit tensor structure. In order to do so, let us first assume that all channel transfer matrices for the \(m_{{\mathrm{T}}}\)th transmit and all receive antennas are collected in a diagonal tensor \(\tilde{{\varvec{{{\mathcal {H}}}}}}_{\mathrm{R}}^{(m_{{\mathrm{T}}})} \in {\mathbb {C}}^{N\times N \times M_{\mathrm{R}}}\), i.e.,

$$\begin{aligned}&\tilde{{\varvec{{{\mathcal {H}}}}}}_{\mathrm{R}(.,.,m_{\mathrm{R}})}^{(m_{{\mathrm{T}}})} = \mathrm{{diag}}\left( \tilde{{\varvec{{h}}}}^{(m_{\mathrm{R}},m_{{\mathrm{T}}})}\right) \end{aligned}$$
(6)

Based on this diagonal structure, the tensor \(\tilde{{\varvec{{{\mathcal {H}}}}}}_{\mathrm{R}}^{(m_{{\mathrm{T}}})}\) can be written as the following CP decomposition:

$$\begin{aligned} \tilde{{\varvec{{{\mathcal {H}}}}}}_{\mathrm{R}}^{(m_{{\mathrm{T}}})} = {\varvec{{{\mathcal {I}}}}}_{3,N}\times _1{\varvec{{I}}}_N\times _2{\varvec{{I}}}_N\times _3\tilde{{\varvec{{H}}}}_{\mathrm{R}}^{(m_{{\mathrm{T}}})}, \end{aligned}$$
(7)

where \(\tilde{{\varvec{{H}}}}_{\mathrm{R}}^{(m_{{\mathrm{T}}})} =\begin{bmatrix} \tilde{{\varvec{{h}}}}^{(1,m_{{\mathrm{T}}})}&\tilde{{\varvec{{h}}}}^{(2,m_{{\mathrm{T}}})}&\ldots&\tilde{{\varvec{{h}}}}^{(M_{\mathrm{R}},m_{{\mathrm{T}}})} \end{bmatrix}^{\mathrm{T}}\in {\mathbb {C}}^{M_{\mathrm{R}} \times N}\).

The complete four-way channel tensor, defined in Eq. (5), can be obtained by concatenating the \(\tilde{{\varvec{{{\mathcal {H}}}}}}_{\mathrm{R}}^{(m_{{\mathrm{T}}})}\) tensors along the fourth dimension. Hence, the four-way channel tensor \(\tilde{{\varvec{{{\mathcal {H}}}}}}\) can be expressed as

$$\begin{aligned} \tilde{{\varvec{{{\mathcal {H}}}}} }&=\begin{bmatrix} \tilde{{\varvec{{{\mathcal {H}}}}}}_{\mathrm{R}}^{(1)}&\sqcup _4&\tilde{{\varvec{{{\mathcal {H}}}}}}_{\mathrm{R}}^{(2)}&\sqcup _4&\ldots&\tilde{{\varvec{{{\mathcal {H}}}}}}_{\mathrm{R}}^{(M_{{\mathrm{T}}})} \end{bmatrix} = \sum _{m_{{\mathrm{T}}}=1}^{M_{{\mathrm{T}}}} \tilde{{\varvec{{{\mathcal {H}}}}}}_{\mathrm{R}}^{(m_{{\mathrm{T}}})} \circ {\varvec{{e}}}_{m_{{\mathrm{T}}}} \\&= \sum _{m_{{\mathrm{T}}}=1}^{M_{{\mathrm{T}}}} {\varvec{{{\mathcal {D}}}}}\times _1{\varvec{{I}}}_N\times _2{\varvec{{I}}}_N\times _3\tilde{{\varvec{{H}}}}_{\mathrm{R}}^{(m_{{\mathrm{T}}})}\times _4 {\varvec{{e}}}_{m_{{\mathrm{T}}}}. \end{aligned}$$
(8)

Note that \(\tilde{{\varvec{{{\mathcal {H}}}}}}\) satisfies a very special block term decomposition (BTD), where \({\varvec{{{\mathcal {D}}}}}_{(.,.,.,1)}={\varvec{{{\mathcal {I}}}}}_{3,N} \in {\mathbb {R}}^{N\times N\times N\times 1}\) (\({\varvec{{{\mathcal {D}}}}}={\varvec{{{\mathcal {I}}}}}_{4,1}\otimes {\varvec{{{\mathcal {I}}}}}_{3,N}\)) and \({\varvec{{e}}}_{m_{{\mathrm{T}}}}\in {\mathbb {R}}^{M_{{\mathrm{T}}}\times 1}\) is a pinning vector. The BTD decomposes a tensor into block terms of smaller n-mode ranks [27]. We prove the BTD structure of the channel tensor \(\tilde{{\varvec{{{\mathcal {H}}}}}}\) in the Appendix. There, we also show that the ([1, 3], [2, 4]) generalized unfolding of the channel tensor can be expressed as

$$\begin{aligned}{}[\tilde{{\varvec{{{\mathcal {H}}}}}}]_{([1,3],[2,4])} = \tilde{{\varvec{{H}}}}\diamond ({\varvec{{1}}}_{M_{{\mathrm{T}}}}^{\mathrm{T}}\otimes {\varvec{{I}}}_{N}) \in {\mathbb {C}}^{NM_{\mathrm{R}} \times NM_{{\mathrm{T}}}}, \end{aligned}$$
(9)

where \(\tilde{{\varvec{{H}}}} \in {\mathbb {C}}^{ M_{\mathrm{R}} \times NM_{{\mathrm{T}}}}\) is a matrix containing all nonzero elements of the tensor \(\tilde{{\varvec{{{\mathcal {H}}}}}}\) and it is defined as

$$\begin{aligned} \tilde{{\varvec{{H}}}}&= \left[ \begin{array}{c|c|c|c} \tilde{{\varvec{{h}}}}^{(1,1){\mathrm{T}}} &{} \tilde{{\varvec{{h}}}}^{(1,2){\mathrm{T}}} &{} \ldots &{} \tilde{{\varvec{{h}}}}^{(1,M_{{\mathrm{T}}}){\mathrm{T}}}\\ \vdots &{} \vdots &{}\vdots &{} \vdots \\ \tilde{{\varvec{{h}}}}^{(M_{\mathrm{R}},1){\mathrm{T}}} &{} \tilde{{\varvec{{h}}}}^{(M_{\mathrm{R}},2){\mathrm{T}}} &{} \ldots &{} \tilde{{\varvec{{h}}}}^{(M_{\mathrm{R}},M_{{\mathrm{T}}}){\mathrm{T}}} \end{array} \right] \\&= \left[ \begin{array}{cccc} \tilde{{\varvec{{H}}}}^{(1)}_{\mathrm{R}}&\tilde{{\varvec{{H}}}}^{(2)}_{\mathrm{R}}&\ldots&\tilde{{\varvec{{H}}}}^{(M_{{\mathrm{T}}})}_{\mathrm{R}} \end{array} \right] \in {\mathbb {C}}^{ M_{\mathrm{R}} \times NM_{{\mathrm{T}}}}. \end{aligned}$$
(10)

Figure 2 (right-hand side) depicts the structure of the generalized unfolding \([\tilde{{\varvec{{{\mathcal {H}}}}}}]_{([1,3],[2,4])}\) for a MIMO-OFDM system with parameters \(M_{{\mathrm{T}}} = 2\), \(M_{\mathrm{R}} = 3\), and \(N = 3\).
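The structure in Eq. (9) can be checked numerically. The following NumPy sketch is only an illustration under stated assumptions: the helper names (`khatri_rao`, `channel_tensor`, `unfold_13_24`) are hypothetical, the channel coefficients are random, and the unfolding is assumed to let the first index of each group vary fastest, which is the ordering implied by Eqs. (9) and (10).

```python
import numpy as np

def khatri_rao(A, B):
    """Column-wise Kronecker product; the row index of B varies fastest."""
    return np.einsum('ic,jc->ijc', A, B).reshape(A.shape[0] * B.shape[0], -1)

def channel_tensor(H_R):
    """Build the N x N x M_R x M_T tensor from H_R[m_T], each of size M_R x N."""
    M_T, M_R, N = H_R.shape
    H = np.zeros((N, N, M_R, M_T), dtype=complex)
    for n in range(N):
        # Only the entries (n, n, :, :) are nonzero.
        H[n, n, :, :] = H_R[:, :, n].T
    return H

def unfold_13_24(H):
    """([1,3],[2,4]) unfolding, first index of each group varying fastest."""
    N, _, M_R, M_T = H.shape
    return H.transpose(2, 0, 3, 1).reshape(M_R * N, M_T * N)

# Same dimensions as in Figure 2; the coefficients themselves are random.
M_T, M_R, N = 2, 3, 3
rng = np.random.default_rng(0)
H_R = rng.standard_normal((M_T, M_R, N)) + 1j * rng.standard_normal((M_T, M_R, N))

H_tilde = np.hstack([H_R[m] for m in range(M_T)])      # Eq. (10)
constraint = np.kron(np.ones((1, M_T)), np.eye(N))     # 1^T kron I_N
lhs = unfold_13_24(channel_tensor(H_R))
rhs = khatri_rao(H_tilde, constraint)                  # right-hand side of Eq. (9)
print(lhs.shape, np.allclose(lhs, rhs))
```

Each column of the unfolding contains a single copy of one column of \(\tilde{{\varvec{{H}}}}\), placed in the rows that belong to the corresponding subcarrier.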

3.2 Data transmission

The signal tensor \(\tilde{{\varvec{{{\mathcal {S}}}}}}\) in Eq. (4) contains all data symbols in the frequency domain that are transmitted on N subcarriers, \(M_{{\mathrm{T}}}\) transmit antennas, and K frames. For notational simplicity, we define the following block matrix \(\tilde{{\varvec{{S}}}}\) as the transpose of the three-mode unfolding of \(\tilde{{\varvec{{{\mathcal {S}}}}}}\)

$$\begin{aligned} \tilde{{\varvec{{S}}}} = \tilde{{\varvec{{{\mathcal {S}}}}}}_{([1,2],[3])}^{\mathrm{T}}= \left[ \begin{array}{cccc} \tilde{{\varvec{{S}}}}^{(1)}&\tilde{{\varvec{{S}}}}^{(2)}&\ldots&\tilde{{\varvec{{S}}}}^{(M_{{\mathrm{T}}})} \end{array} \right] \in {\mathbb {C}}^{K \times NM_{{\mathrm{T}}}}, \end{aligned}$$
(11)

where \(\tilde{{\varvec{{S}}}}^{(m_{{\mathrm{T}}})} \in {\mathbb {C}}^{K \times N}\) contains the symbols transmitted via the \(m_{{\mathrm{T}}}\)th antenna.

Moreover, we assume that the symbol matrix consists of data and pilot symbols, \(\tilde{{\varvec{{S}}}}= \tilde{{\varvec{{S}}}}_{\mathrm{{d}}}+\tilde{{\varvec{{S}}}}_{\mathrm{{p}}}\). The matrices \(\tilde{{\varvec{{S}}}}_{\mathrm{{d}}}\) and \(\tilde{{\varvec{{S}}}}_{\mathrm{{p}}}\) represent the data symbols and the pilot symbols, respectively. The matrix \(\tilde{{\varvec{{S}}}}_{\mathrm{{d}}}\) contains zeros at the positions of the pilot symbols. Accordingly, the matrix \(\tilde{{\varvec{{S}}}}_{\mathrm{{p}}}\) contains nonzero elements only at the pilot positions. Typically, there are three ways of arranging the pilot symbols within the OFDM blocks (block, comb, and lattice-type) [7]. We assume a comb-type arrangement, where the pilot symbols are placed at non-consecutive positions with equidistant spacing in the time and the frequency domains, for each antenna. The spacing in the time domain is denoted by \(\Delta K\). Moreover, we assume a spacing in the frequency domain of \(\Delta F\) between two pilot symbols. Furthermore, there are positions where neither pilot symbols nor data symbols are allowed to be transmitted. These positions are reserved for the pilot symbols corresponding to the remaining antennas. This results in \(M_{{\mathrm{T}}}\lfloor \frac{N}{\Delta F}\rfloor\) pilot symbols per frame. In comparison, other publications such as [4, 12,13,14] use \(N{M_{{\mathrm{T}}}}\) pilot symbols per frame. By exploiting the channel correlation among adjacent subcarriers, a reduced number of pilot symbols can be used for channel estimation.
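To make the pilot overhead concrete, the two counts can be compared for a given configuration. The short Python sketch below is illustrative only: the function names and the values \(N = 128\), \(M_{{\mathrm{T}}} = 2\), and \(\Delta F = 4\) are hypothetical and are not taken from the simulations in this paper.

```python
def pilots_per_frame(num_subcarriers, num_tx, delta_f):
    """Comb-type arrangement: M_T * floor(N / delta_F) pilots per frame."""
    return num_tx * (num_subcarriers // delta_f)

def full_pilots_per_frame(num_subcarriers, num_tx):
    """One pilot per subcarrier and transmit antenna: N * M_T per frame."""
    return num_subcarriers * num_tx

# Hypothetical configuration chosen only for illustration.
N, M_T, delta_F = 128, 2, 4
comb = pilots_per_frame(N, M_T, delta_F)
full = full_pilots_per_frame(N, M_T)
print(comb, full)
```

For this configuration the comb-type arrangement needs a quarter of the pilot symbols, which is the factor \(1/\Delta F\) whenever \(\Delta F\) divides N.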

3.3 Receiver design

Using the property of the generalized unfoldings in Eq. (2), the received signal in Eq. (4) becomes

$$\begin{aligned}{}[\tilde{{\varvec{{{\mathcal {Y}}}}}}]_{([1, 2],[3])} = [\tilde{{\varvec{{{\mathcal {H}}}}}}]_{([1,3],[2,4])}\tilde{{\varvec{{{\mathcal {S}}}}}}_{([1,2],[3])} + [\tilde{{\varvec{{{\mathcal {N}}}}}}]_{([1, 2],[3])}. \end{aligned}$$

Next, by substituting the corresponding tensor unfoldings in the above equation, we get

$$\begin{aligned}{}[\tilde{{\varvec{{{\mathcal {Y}}}}}}]_{([1, 2],[3])} = \left( \tilde{{\varvec{{H}}}}\diamond ({\varvec{{1}}}_{M_{{\mathrm{T}}}}^{\mathrm{T}}\otimes {\varvec{{I}}}_{N})\right) \cdot \tilde{{\varvec{{S}}}}^{\mathrm{T}}+ [\tilde{{\varvec{{{\mathcal {N}}}}}}]_{([1, 2],[3])}. \end{aligned}$$

The above equation is an unfolding of a noisy observation of a low-rank tensor with a CP structure. Applying the inverse unfolding to the received signal in the frequency domain (after the removal of the cyclic prefix) yields the desired tensor description of the received data tensor

$$\begin{aligned} \tilde{{\varvec{{{\mathcal {Y}}}}}} = {\varvec{{{\mathcal {I}}}}}_{3,NM_{{\mathrm{T}}}} \times _1 ({\varvec{{1}}}_{M_{{\mathrm{T}}}}^{\mathrm{T}}\otimes {\varvec{{I}}}_{N}) \times _2 \tilde{{\varvec{{H}}}} \times _3 \tilde{{\varvec{{S}}}} + \tilde{{\varvec{{{\mathcal {N}}}}}} {\in {\mathbb {C}}^{N\times M_{\mathrm{R}} \times K}}. \end{aligned}$$
(12)

Note that this model is a constrained CP-like model where the one-mode factor is a known constraint matrix. Our goal is to exploit (12) to jointly estimate the channel and the symbols, i.e., \(\tilde{{\varvec{{H}}}}\) and \(\tilde{{\varvec{{S}}}}\). The author of [16] proposes a similar model for the received signal of FBMC systems. In contrast to the model derived in this paper from contractions, the model in [16] is derived from the PARATUCK2 model. This means that the received signal should fit the PARATUCK2 decomposition in order to satisfy the received signal structure. On the other hand, the proposed derivation based on contractions in (4) is more general and it holds without such an assumption. More specifically, the proposed tensor contraction formalism that defines the signal model in Eq. (5) does not require the matrix slices \(\tilde{{\varvec{{H}}}}\) defined in Eq. (6) to be diagonal. Therefore, the proposed model and the derived algorithms remain valid for nonorthogonal multi-carrier systems with an arbitrary structure of the equivalent channel tensor in Eq. (4). This aspect is not captured by the tensor modeling approach of [16].

Using the prior knowledge of the pilot symbols and their positions, the channel in the frequency domain can be estimated. Naturally, the channel is estimated only at those subcarrier positions where the pilot symbols are located. Afterwards, an interpolation is applied to get the complete channel estimate. Alternatively, as shown in [10, 11], the channel can be first estimated in the time domain and then transformed into the frequency domain. Either way, this leads to a pilot-based channel estimate that we denote as \({\tilde{{\varvec{{H}}}}}_{\mathrm{{p}}}\), or \(\tilde{{\varvec{{{\mathcal {H}}}}}}_{\mathrm{{p}}}\) (see Footnote 1). The pilot-based channel estimate is then used to estimate the data symbols. In the remainder of this section, we discuss different ways to estimate the symbols. We use the pilot-based channel estimate to initialize the proposed algorithms.

Traditionally, the estimate of the symbols is obtained in the frequency domain with a ZF receiver. In this case, the symbols are calculated by inverting the channel matrix for each subcarrier individually.

Alternatively, if we compute the one-mode unfolding of the tensor \({\tilde{{\varvec{{{\mathcal {Y}}}}}}}\) in Eq. (12), we get

$$\begin{aligned}{}[\tilde{{\varvec{{{\mathcal {Y}}}}}}]_{([1],[2, 3])} = ({\varvec{{1}}}_{M_{{\mathrm{T}}}}^{\mathrm{T}}\otimes {\varvec{{I}}}_{N})\cdot \left( \tilde{{\varvec{{S}}}}\diamond \tilde{{\varvec{{H}}}}\right) ^{\mathrm{T}}+[\tilde{{\varvec{{{\mathcal {N}}}}}}]_{([1],[2, 3])}. \end{aligned}$$

Taking into account the structure of the matrices \(({\varvec{{1}}}_{M_{{\mathrm{T}}}}^{\mathrm{T}}\otimes {\varvec{{I}}}_{N}) \in {\mathbb {R}}^{N \times NM_{{\mathrm{T}}}}\), \(\tilde{{\varvec{{H}}}}\) in (10), and \(\tilde{{\varvec{{S}}}}\) in (11), the one-mode unfolding becomes

$$\begin{aligned}{}[\tilde{{\varvec{{{\mathcal {Y}}}}}}]_{([1],[2, 3])} = \sum _{m_{{\mathrm{T}}}=1}^{M_{{\mathrm{T}}}}\left( \tilde{{\varvec{{S}}}}^{(m_{{\mathrm{T}}})}\diamond \tilde{{\varvec{{H}}}}_{\mathrm{R}}^{(m_{{\mathrm{T}}})}\right) ^{\mathrm{T}}+[\tilde{{\varvec{{{\mathcal {N}}}}}}]_{([1],[2, 3])}. \end{aligned}$$

After transposition and omitting the noise term, we get

$$\begin{aligned}{}[\tilde{{\varvec{{{\mathcal {Y}}}}}}]_{([2, 3],[1])} \approx \sum _{m_{{\mathrm{T}}}=1}^{M_{{\mathrm{T}}}}\left( \tilde{{\varvec{{S}}}}^{(m_{{\mathrm{T}}})}\diamond \tilde{{\varvec{{H}}}}_{\mathrm{R}}^{(m_{{\mathrm{T}}})}\right) \in {\mathbb {C}}^{M_{\mathrm{R}}K \times N}. \end{aligned}$$

This sum of Khatri–Rao products can be resolved in a column-wise fashion. Let \(\tilde{{\varvec{{y}}}}_n\in {\mathbb {C}}^{M_{\mathrm{R}}K \times 1}\) denote the nth column of \([\tilde{{\varvec{{{\mathcal {Y}}}}}}]_{([2, 3],[1])} \in {\mathbb {C}}^{M_{\mathrm{R}}K \times N}\). After reshaping this vector into the matrix \(\tilde{{\varvec{{Y}}}}_n \in {\mathbb {C}}^{M_{\mathrm{R}}\times K}\), such that \(\tilde{{\varvec{{y}}}}_n=\mathrm{{vec}}(\tilde{{\varvec{{Y}}}}_n)\), it is easy to see that this matrix satisfies

$$\begin{aligned} \tilde{{\varvec{{Y}}}}_n \approx \tilde{{\varvec{{H}}}}_n\cdot \tilde{{\varvec{{S}}}}_n, \end{aligned}$$
(13)

where \(\tilde{{\varvec{{H}}}}_n = \tilde{{\varvec{{{\mathcal {H}}}}}}_{(n,n,.,.)}\in {\mathbb {C}}^{M_{\mathrm{R}} \times M_{{\mathrm{T}}}}\) and \(\tilde{{\varvec{{S}}}}_n = \tilde{{\varvec{{{\mathcal {S}}}}}}_{(n,.,.)}\in {\mathbb {C}}^{M_{{\mathrm{T}}} \times K}\) are the nth slices of the channel tensor and the signal tensor, respectively. Likewise, \(\tilde{{\varvec{{Y}}}}_n = {\tilde{{\varvec{{{\mathcal {Y}}}}}}}_{(n,.,.)}\) is the nth slice of the received data tensor. Multiplying Eq. (13) from the left by the pseudo-inverse of the channel matrix \(\tilde{{\varvec{{H}}}}_n\) yields the traditional ZF receiver.
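The step from the constrained CP model in Eq. (12) to the per-subcarrier model in Eq. (13) can be verified with a few lines of NumPy. The sketch below is a noiseless illustration under assumptions of our own: the dimensions are arbitrary, the factor matrices are random complex matrices rather than modulated symbols, and the variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
M_T, M_R, N, K = 2, 4, 8, 5

def crandn(*shape):
    """Circularly symmetric complex Gaussian samples."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

H_tilde = crandn(M_R, N * M_T)               # Eq. (10), column index n + N * m_T
S_tilde = crandn(K, N * M_T)                 # Eq. (11), same column ordering
A = np.kron(np.ones((1, M_T)), np.eye(N))    # known one-mode factor 1^T kron I_N

# Noiseless Eq. (12): Y[n, m_R, k] = sum_c A[n, c] * H[m_R, c] * S[k, c]
Y = np.einsum('nc,rc,kc->nrk', A, H_tilde, S_tilde)

S_zf = np.empty((N, M_T, K), dtype=complex)
slice_model_holds = True
for n in range(N):
    cols = n + N * np.arange(M_T)            # columns that belong to subcarrier n
    H_n = H_tilde[:, cols]                   # M_R x M_T channel slice
    S_n = S_tilde[:, cols].T                 # M_T x K symbol slice
    slice_model_holds &= np.allclose(Y[n], H_n @ S_n)   # Eq. (13)
    S_zf[n] = np.linalg.pinv(H_n) @ Y[n]     # zero-forcing estimate per subcarrier
print(slice_model_holds)
```

In this noiseless setting with \(M_{\mathrm{R}}\ge M_{{\mathrm{T}}}\), the ZF estimate reproduces the transmitted slices exactly; with noise it would only approximate them.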

Alternatively, the channel and the symbols on each subcarrier can be estimated by means of iterative or recursive LS algorithms. Similar algorithms were proposed in [28] and [29] for blind source separation on a single subcarrier. We extend two of the algorithms presented in [29] that are based on projection to our application. We have proposed an extension of these algorithms using enumeration in [17], namely iterative least squares with projection (ILSP) and recursive least squares with projection (RLSP). In this paper, our focus is on finite alphabet projection-based algorithms since they are computationally less expensive than the algorithms based on enumeration.
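The alternating structure of a projection-based iterative LS scheme on a single subcarrier can be sketched as follows. This is a minimal illustration of the general idea (LS symbol estimate, projection onto the finite alphabet, LS channel update), not the exact algorithm of [29] or its extension in this paper. The QPSK alphabet, the function names, the number of iterations, and the small perturbation of the initial channel are assumptions.

```python
import numpy as np

def project_qpsk(Z):
    """Project each entry onto the unit-energy QPSK alphabet (assumed alphabet)."""
    return (np.sign(Z.real) + 1j * np.sign(Z.imag)) / np.sqrt(2)

def ilsp(Y, H_init, n_iter=5):
    """Alternate between projected LS symbols and an LS channel update."""
    H = H_init.copy()
    for _ in range(n_iter):
        S = project_qpsk(np.linalg.pinv(H) @ Y)   # LS symbol estimate, then projection
        H = Y @ np.linalg.pinv(S)                 # LS channel estimate given the symbols
    return H, S

rng = np.random.default_rng(2)
M_R, M_T, K = 4, 2, 50

def crandn(*shape):
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

H_true = crandn(M_R, M_T)
S_true = project_qpsk(crandn(M_T, K))        # random QPSK symbols
Y = H_true @ S_true                          # noiseless Eq. (13) for one subcarrier
H_init = H_true + 1e-3 * crandn(M_R, M_T)    # stands in for a pilot-based estimate
H_est, S_est = ilsp(Y, H_init)
print(np.allclose(S_est, S_true))
```

The initialization plays the role of the pilot-based channel estimate; with a poor initialization or strong noise, such an iteration may converge to a different solution.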

The identifiability properties of the problem in Eq. (13) have already been studied in [29], where the authors present sufficient conditions for identifiability.

4 Khatri–Rao-coded MIMO-OFDM

In this section, we model a Khatri–Rao-coded MIMO-OFDM communication system as a double tensor contraction between a channel and a signal tensor that contains coded symbols. This double tensor contraction is essentially equivalent to the model in (4). However, we assume that the signal tensor contains Khatri–Rao-coded symbols.

As in Sect. 3, we assume a MIMO-OFDM communication system with \(M_{{\mathrm{T}}}\) transmit and \(M_{\mathrm{R}}\) receive antennas. One OFDM block consists of N samples, which equals the DFT length. Moreover, all N subcarriers are used for data transmission. Furthermore, we assume a frequency-selective channel model that stays constant over the transmission of P frames. In contrast to the model presented in Sect. 3, here, we assume that the P frames are divided into K groups of Q blocks (Q corresponds to the spreading factor), \(P = K\cdot Q\).

Accordingly, the received signal in the frequency domain is given by

$$\begin{aligned} \tilde{{\varvec{{{\mathcal {Y}}}}}} = \tilde{{\varvec{{{\mathcal {H}}}}}}\bullet _{2,4}^{1,2}\tilde{{\varvec{{{\mathcal {X}}}}}} + \tilde{{\varvec{{{\mathcal {N}}}}}} = \tilde{{\varvec{{{\mathcal {Y}}}}}}_0+ \tilde{{\varvec{{{\mathcal {N}}}}}} \in {\mathbb {C}}^{N \times M_{\mathrm{R}}\times K\times Q}, \end{aligned}$$
(14)

where \(\tilde{{\varvec{{{\mathcal {H}}}}}} \in {\mathbb {C}}^{N \times N \times M_{\mathrm{R}} \times M_{{\mathrm{T}}}}\) is the channel tensor and \(\tilde{{\varvec{{{\mathcal {X}}}}}} \in {\mathbb {C}}^{N \times M_{{\mathrm{T}}} \times K \times Q}\) is the signal tensor. The tensor \(\tilde{{\varvec{{{\mathcal {N}}}}}}\in {\mathbb {C}}^{N \times M_{\mathrm{R}}\times K\times Q}\) contains additive white Gaussian noise and \(\tilde{{\varvec{{{\mathcal {Y}}}}}}_0\in {\mathbb {C}}^{N \times M_{\mathrm{R}}\times K\times Q}\) is the noiseless received signal.

4.1 Channel tensor

In this section, we use the model of the channel tensor \(\tilde{{\varvec{{{\mathcal {H}}}}}}\) defined in Eq. (8). Moreover, we have defined the generalized unfolding \(\left[ \tilde{{\varvec{{{\mathcal {H}}}}}} \right] _{([1,3],[2,4])}\) in Eq. (9). Using a permutation matrix, it can be shown that the generalized unfolding \([\tilde{{\varvec{{{\mathcal {H}}}}}}]_{([1,3],[4,2])}\) of the channel is equal to

$$\begin{aligned}{}[\tilde{{\varvec{{{\mathcal {H}}}}}}]_{([1,3],[4,2])} = \bar{{\varvec{{H}}}} \diamond ({\varvec{{I}}}_N\otimes {\varvec{{1}}}_{M_{{\mathrm{T}}}}^{\mathrm{T}}), \end{aligned}$$
(15)

where \(\bar{{\varvec{{H}}}} = \begin{bmatrix} \tilde{{\varvec{{H}}}}_{\mathrm{R}}^{(1)}&\ldots&\tilde{{\varvec{{H}}}}_{\mathrm{R}}^{(M_{{\mathrm{T}}})} \end{bmatrix}\cdot {\varvec{{P}}} = {\tilde{{\varvec{{H}}}}}\cdot {\varvec{{P}}} \in {\mathbb {C}}^{M_{\mathrm{R}} \times M_{{\mathrm{T}}}N}.\) The permutation matrix \({\varvec{{P}}} \in {\mathbb {R}}^{NM_{{\mathrm{T}}} \times M_{{\mathrm{T}}}N }\) reorders the columns such that the faster increasing index is \(M_{{\mathrm{T}}}\) instead of N, and it is defined via \([\tilde{{\varvec{{{\mathcal {H}}}}}}]_{([1,3],[4,2])}=[\tilde{{\varvec{{{\mathcal {H}}}}}}]_{([1,3],[2,4])}\cdot {\varvec{{P}}}\). Recall that the matrices \(\tilde{{\varvec{{H}}}}\in {\mathbb {C}}^{M_{\mathrm{R}} \times NM_{{\mathrm{T}}}}\) and \(\tilde{{\varvec{{H}}}}_{\mathrm{R}}^{(m_{{\mathrm{T}}})}\in {\mathbb {C}}^{M_{\mathrm{R}} \times N}\) are defined in Eq. (10). The structure of the four-way channel tensor in the frequency domain \(\tilde{{\varvec{{{\mathcal {H}}}}}}\) and its unfoldings are derived in “Appendix.”
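The role of the permutation matrix \({\varvec{{P}}}\) and the structure in Eq. (15) can be illustrated numerically. In the NumPy sketch below, the explicit construction of \({\varvec{{P}}}\), the helper names, and the random channel are assumptions for illustration; the unfoldings are taken with the first index of each group varying fastest.

```python
import numpy as np

def khatri_rao(A, B):
    """Column-wise Kronecker product; the row index of B varies fastest."""
    return np.einsum('ic,jc->ijc', A, B).reshape(A.shape[0] * B.shape[0], -1)

def column_permutation(N, M_T):
    """P moves column n + N*m_T (N fastest) to column m_T + M_T*n (M_T fastest)."""
    P = np.zeros((N * M_T, M_T * N))
    for n in range(N):
        for m in range(M_T):
            P[n + N * m, m + M_T * n] = 1.0
    return P

M_T, M_R, N = 2, 3, 4
rng = np.random.default_rng(3)
H_R = rng.standard_normal((M_T, M_R, N)) + 1j * rng.standard_normal((M_T, M_R, N))

# Channel tensor with H[n, n, m_R, m_T] = H_R[m_T][m_R, n] and zeros elsewhere
H = np.einsum('ab,mrb->abrm', np.eye(N), H_R)

P = column_permutation(N, M_T)
H_tilde = np.hstack([H_R[m] for m in range(M_T)])               # Eq. (10)
H_bar = H_tilde @ P                                             # M_T index fastest

unf_24 = H.transpose(2, 0, 3, 1).reshape(M_R * N, M_T * N)      # ([1,3],[2,4])
unf_42 = H.transpose(2, 0, 1, 3).reshape(M_R * N, N * M_T)      # ([1,3],[4,2])
rhs_15 = khatri_rao(H_bar, np.kron(np.eye(N), np.ones((1, M_T))))
print(np.allclose(unf_42, unf_24 @ P), np.allclose(unf_42, rhs_15))
```

The same matrix \({\varvec{{P}}}\) also maps \({\varvec{{1}}}_{M_{{\mathrm{T}}}}^{\mathrm{T}}\otimes {\varvec{{I}}}_{N}\) to \({\varvec{{I}}}_N\otimes {\varvec{{1}}}_{M_{{\mathrm{T}}}}^{\mathrm{T}}\), which is why Eqs. (9) and (15) have the same form.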

4.2 Data transmission

We can impose a CP structure on the transmit signal tensor if we assume Khatri–Rao-coded symbols [19, 20]. The coding is proportional to the number of transmit antennas if we use a spreading factor \(Q=M_{{\mathrm{T}}}\), for each subcarrier \(n = 1,2,\ldots , N\). Hence, the generalized unfolding of the signal tensor is

$$\begin{aligned}{}[\tilde{{\varvec{{{\mathcal {X}}}}}}]_{([2,1],[4,3])}&= \begin{bmatrix} \tilde{{\varvec{{S}}}}_1\diamond {\varvec{{C}}}_1&\tilde{{\varvec{{S}}}}_2\diamond {\varvec{{C}}}_2&\ldots&\tilde{{\varvec{{S}}}}_N\diamond {\varvec{{C}}}_N \end{bmatrix}^{\mathrm{T}} \\&= {\varvec{{I}}}_{ M_{{\mathrm{T}}}N}(\bar{{\varvec{{S}}}}\diamond \bar{{\varvec{{C}}}})^{\mathrm{T}}, \end{aligned}$$
(16)

where the matrix \(\tilde{{\varvec{{S}}}}_n \in {\mathbb {C}}^{K \times M_{{\mathrm{T}}}}\) contains modulated data symbols and \({\varvec{{C}}}_n \in {\mathbb {C}}^{Q \times M_{{\mathrm{T}}}}\) is a Vandermonde coding matrix as defined in [19]. The matrices \(\bar{{\varvec{{S}}}} = \begin{bmatrix} \tilde{{\varvec{{S}}}}_1&\ldots&\tilde{{\varvec{{S}}}}_N \end{bmatrix} \in {\mathbb {C}}^{K\times M_{{\mathrm{T}}}N}\) and \(\bar{{\varvec{{C}}}} = \begin{bmatrix} {\varvec{{C}}}_1&\ldots&{\varvec{{C}}}_N \end{bmatrix} \in {\mathbb {C}}^{Q\times M_{{\mathrm{T}}}N}\) contain all symbol and coding matrices for each subcarrier, respectively. Note that \(\bar{{\varvec{{S}}}}=\tilde{{\varvec{{S}}}}\cdot {\varvec{{P}}}\), where the matrix \(\tilde{{\varvec{{S}}}}\) is defined in Eq. (11) and \({\varvec{{P}}} \in {\mathbb {R}}^{NM_{{\mathrm{T}}} \times M_{{\mathrm{T}}}N }\) is the above-mentioned permutation matrix that reorders the columns such that the faster increasing index is \(M_{{\mathrm{T}}}\) instead of N. Moreover, we assume that \(\tilde{{\varvec{{S}}}}\) contains pilot symbols as explained after Eq. (11). As shown in [19] and as directly follows from (16), the tensor \([\tilde{{\varvec{{{\mathcal {X}}}}}}]_{([2,1],3,4)}\) satisfies the following CP decomposition:

$$\begin{aligned}{}[\tilde{{\varvec{{{\mathcal {X}}}}}}]_{([2,1],3,4)} = {\varvec{{{\mathcal {I}}}}}_{3,M_{{\mathrm{T}}}N} \times _1 {\varvec{{I}}}_{M_{{\mathrm{T}}}N} \times _2 \bar{{\varvec{{S}}}} \times _3 \bar{{\varvec{{C}}}}. \end{aligned}$$
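A small numerical example may help to see how the Khatri–Rao-coded signal tensor is assembled and why its unfolding has the form of Eq. (16). In the NumPy sketch below, the coding matrix is assumed to be the \(M_{{\mathrm{T}}}\times M_{{\mathrm{T}}}\) DFT matrix, which is one Vandermonde matrix that satisfies \({\varvec{{C}}}_n^{\mathrm{H}}{\varvec{{C}}}_n=M_{{\mathrm{T}}}{\varvec{{I}}}_{M_{{\mathrm{T}}}}\); the exact construction in [19] may differ. The symbols are random QPSK symbols and all names are hypothetical.

```python
import numpy as np

def khatri_rao(A, B):
    """Column-wise Kronecker product; the row index of B varies fastest."""
    return np.einsum('ic,jc->ijc', A, B).reshape(A.shape[0] * B.shape[0], -1)

rng = np.random.default_rng(4)
N, M_T, K = 4, 2, 3
Q = M_T                                       # spreading factor

# Assumed coding matrix: DFT matrix (Vandermonde, C^H C = M_T * I)
C_n = np.exp(-2j * np.pi * np.outer(np.arange(Q), np.arange(M_T)) / M_T)

def random_qpsk(rows, cols):
    bits = rng.integers(0, 2, size=(2, rows, cols)) * 2 - 1
    return (bits[0] + 1j * bits[1]) / np.sqrt(2)

S_blocks = [random_qpsk(K, M_T) for _ in range(N)]   # one K x M_T matrix per subcarrier
C_blocks = [C_n for _ in range(N)]
S_bar = np.hstack(S_blocks)                          # K x M_T*N
C_bar = np.hstack(C_blocks)                          # Q x M_T*N

# Element-wise definition: X[n, m_T, k, q] = S_n[k, m_T] * C_n[q, m_T]
X = np.stack([np.einsum('km,qm->mkq', S, C) for S, C in zip(S_blocks, C_blocks)])

X_unf = X.reshape(N * M_T, K * Q)                    # ([2,1],[4,3]) unfolding
print(np.allclose(X_unf, khatri_rao(S_bar, C_bar).T))
```

With \(Q=M_{{\mathrm{T}}}\), each symbol is spread over Q blocks, so the tensor carries \(KM_{{\mathrm{T}}}N\) symbols in \(KQ\) frames.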

4.3 Receiver design

Using Eqs. (2), (3), and (14), the noiseless received signal can be expressed as

$$\begin{aligned}{}[\tilde{{\varvec{{{\mathcal {Y}}}}}}_0]_{([1,2],[4,3])}=[\tilde{{\varvec{{{\mathcal {H}}}}}}]_{([1,3],[4,2])}\cdot [\tilde{{\varvec{{{\mathcal {X}}}}}}]_{([2,1],[4,3])}. \end{aligned}$$

Inserting the corresponding unfoldings of the channel and the signal tensor in Eqs. (15) and (16), respectively, the noiseless received signal in the frequency domain is given by

$$\begin{aligned}{}[\tilde{{\varvec{{{\mathcal {Y}}}}}}_0]_{([1,2],[4,3])}= \left( \bar{{\varvec{{H}}}} \diamond ({\varvec{{I}}}_N\otimes {\varvec{{1}}}_{M_{{\mathrm{T}}}}^{\mathrm{T}})\right) \cdot (\bar{{\varvec{{S}}}}\diamond \bar{{\varvec{{C}}}})^{\mathrm{T}}. \end{aligned}$$

The above equation represents an unfolding of a four-way tensor with a CP structure. Therefore, the noiseless received signal tensor can be expressed as

$$\begin{aligned} \tilde{{\varvec{{{\mathcal {Y}}}}}}_0 = {{\varvec{{{\mathcal {I}}}}}_{4,M_{{\mathrm{T}}}N}}\times _1 ({\varvec{{I}}}_N\otimes {\varvec{{1}}}_{M_{{\mathrm{T}}}}^{\mathrm{T}}) \times _2 \bar{{\varvec{{H}}}} \times _3 \bar{{\varvec{{S}}}}\times _4 \bar{{\varvec{{C}}}} { \in {\mathbb {C}}^{N \times M_{\mathrm{R}}\times K\times Q}}. \end{aligned}$$
(17)

Equation (17) represents the received signal in the frequency domain for all N subcarriers, \(M_{\mathrm{R}}\) receive antennas, and P frames after the removal of the cyclic prefix. Depending on the available a priori knowledge at the receiver side, channel estimation, symbol estimation, or joint channel and symbol estimation can be performed.

Let us compare the MIMO-OFDM tensor model and the Khatri–Rao-coded MIMO-OFDM tensor model in Eqs. (12) and (17), respectively. First, the factor matrices in these equations have different index orderings. In Eq. (12), the faster increasing index along the columns of the factor matrices is N, whereas in Eq. (17), it is \(M_{{\mathrm{T}}}\). We use the tilde (\(\sim\)) and the bar (−) to distinguish the different index orderings of the factor matrices. Recall that we have defined a permutation matrix \({\varvec{{P}}}\) that accounts for the reordering of the columns of the factor matrices. Moreover, Eq. (17) has an additional tensor dimension (the four-mode) corresponding to the coding technique and the spreading factor Q. Furthermore, taking into account the permutation matrix \({\varvec{{P}}}\), we get Eq. (12) from Eq. (17) for \(Q = 1\) and \(\bar{{\varvec{{C}}}}= {\varvec{{1}}}_{M_{{\mathrm{T}}}N}^{\mathrm{T}}\) (i.e., no coding and the spreading factor equals one).

Using Eq. (17), the channel and the data symbols can be jointly estimated from the ([1, 4], [3, 2]) generalized unfolding of the noise-corrupted received signal

$$\begin{aligned}{}[\tilde{{\varvec{{{\mathcal {Y}}}}}}]_{([1,4],[3,2])}\approx \left( \bar{{\varvec{{C}}}} \diamond ({\varvec{{I}}}_N\otimes {\varvec{{1}}}_{M_{{\mathrm{T}}}}^{\mathrm{T}})\right) \cdot (\bar{{\varvec{{H}}}}\diamond \bar{{\varvec{{S}}}})^{\mathrm{T}}. \end{aligned}$$

Under the assumption that \(Q=M_{{\mathrm{T}}}\), \(\left( \bar{{\varvec{{C}}}} \diamond ({\varvec{{I}}}_N\otimes {\varvec{{1}}}_{M_{{\mathrm{T}}}}^{\mathrm{T}})\right) \in {\mathbb {C}}^{NQ \times M_{{\mathrm{T}}}N}\) is a block-diagonal, left-invertible matrix that is known at the receiver. Using the properties of the coding matrices defined in [19], i.e., \({\varvec{{C}}}_n^{\mathrm{H}}{\varvec{{C}}}_n=M_{{\mathrm{T}}}{\varvec{{I}}}_{M_{{\mathrm{T}}}}\), we have

$$\begin{aligned} \bar{{\varvec{{Y}}}}\triangleq \frac{1}{M_{{\mathrm{T}}}}\left( \bar{{\varvec{{C}}}} \diamond ({\varvec{{I}}}_N\otimes {\varvec{{1}}}_{M_{{\mathrm{T}}}}^{\mathrm{T}})\right) ^{\mathrm{H}}\cdot [\tilde{{\varvec{{{\mathcal {Y}}}}}}]_{([1,4],[3,2])} \approx (\bar{{\varvec{{H}}}}\diamond \bar{{\varvec{{S}}}})^{\mathrm{T}}. \end{aligned}$$

After transposition, we have \(\bar{{\varvec{{Y}}}}^{\mathrm{T}}\approx \bar{{\varvec{{H}}}}\diamond \bar{{\varvec{{S}}}}\), i.e., \(\bar{{\varvec{{Y}}}}^{\mathrm{T}}\) is approximately the Khatri–Rao product of the channel matrix and the data symbol matrix. Therefore, the channel and the data symbols can be jointly estimated based on the LS-KRF as in [30].
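The decoding step and the subsequent Khatri–Rao factorization can be sketched in NumPy as follows. The sketch assumes a noiseless received tensor, a DFT coding matrix as one instance satisfying \({\varvec{{C}}}_n^{\mathrm{H}}{\varvec{{C}}}_n=M_{{\mathrm{T}}}{\varvec{{I}}}_{M_{{\mathrm{T}}}}\), and random factor matrices; the function `ls_krf` is a hypothetical helper that factorizes each column via a rank-one truncated SVD, in the spirit of the LS-KRF referred to in [30].

```python
import numpy as np

def khatri_rao(A, B):
    """Column-wise Kronecker product; the row index of B varies fastest."""
    return np.einsum('ic,jc->ijc', A, B).reshape(A.shape[0] * B.shape[0], -1)

def ls_krf(X, I, J):
    """Factor X ~ A kr B (A: I x R, B: J x R) by a rank-one SVD per column."""
    R = X.shape[1]
    A = np.empty((I, R), dtype=complex)
    B = np.empty((J, R), dtype=complex)
    for r in range(R):
        U, s, Vh = np.linalg.svd(X[:, r].reshape(I, J))   # column r equals a_r b_r^T
        A[:, r] = np.sqrt(s[0]) * U[:, 0]
        B[:, r] = np.sqrt(s[0]) * Vh[0, :]
    return A, B

rng = np.random.default_rng(5)
N, M_T, M_R, K = 4, 2, 3, 6
Q, R = M_T, M_T * N

def crandn(*shape):
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

H_bar, S_bar = crandn(M_R, R), crandn(K, R)
C_n = np.exp(-2j * np.pi * np.outer(np.arange(Q), np.arange(M_T)) / M_T)  # assumed code
C_bar = np.tile(C_n, (1, N))
A = np.kron(np.eye(N), np.ones((1, M_T)))                 # I_N kron 1^T

Y0 = np.einsum('nc,rc,kc,qc->nrkq', A, H_bar, S_bar, C_bar)   # noiseless Eq. (17)
Y_unf = Y0.transpose(3, 0, 1, 2).reshape(Q * N, M_R * K)      # ([1,4],[3,2]) unfolding
Y_bar = khatri_rao(C_bar, A).conj().T @ Y_unf / M_T           # decoding step
H_hat, S_hat = ls_krf(Y_bar.T, M_R, K)
print(np.allclose(khatri_rao(H_hat, S_hat), khatri_rao(H_bar, S_bar)))
```

The product of the estimated factors matches the true Khatri–Rao product, while each estimated column is only determined up to a complex scaling factor.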

Using the LS-KRF, the matrices \(\bar{{\varvec{{H}}}}\) and \(\bar{{\varvec{{S}}}}\) can be identified up to one complex scaling factor ambiguity per column. Hence, the estimated matrices satisfy the following relations:

$$\begin{aligned}&\hat{\bar{{\varvec{{H}}}}} = \bar{{\varvec{{H}}}}\cdot {\varvec{{\Lambda }}} \quad \mathrm{and} \quad \hat{\bar{{\varvec{{S}}}}} = \bar{{\varvec{{S}}}}\cdot {\varvec{{\Lambda }}}^{-1}, \end{aligned}$$
(18)

where \({\varvec{{\Lambda }}} \in {\mathbb {C}}^{M_{{\mathrm{T}}} N\times M_{{\mathrm{T}}}N}\) is a diagonal matrix with diagonal elements equal to the \(M_{{\mathrm{T}}}N\) complex scaling ambiguities. The simplest way to resolve the scaling ambiguity is to assume knowledge of one row of the matrix \(\bar{{\varvec{{S}}}} \in {\mathbb {C}}^{K\times M_{{\mathrm{T}}}N}\). This corresponds to \(M_{{\mathrm{T}}}N\) pilot symbols, i.e., one pilot symbol per transmit antenna and subcarrier. Since traditional MIMO-OFDM communication systems use fewer pilot symbols than \(M_{{\mathrm{T}}}N\), we propose to use the same number of pilot symbols as these traditional systems and exploit the channel correlation between adjacent subcarriers in order to estimate the scaling matrix. We transmit pilot symbols on positions with equidistant spacing in the frequency and the time domain. With the prior knowledge of the pilot symbols and their positions, we can obtain an initial channel estimate as in traditional MIMO-OFDM systems (see Sect. 3). We denote this pilot-based channel estimate by \(\tilde{{\varvec{{{\mathcal {H}}}}}}_{\mathrm{{p}}}\) (or \({\bar{{\varvec{{H}}}}}_{\mathrm{{p}}}\) in matrix form). The pilot-based channel estimate is then used to estimate the scaling ambiguity \({\varvec{{\Lambda }}}\) in Eq. (18) as

$$\begin{aligned} \hat{{\varvec{{\Lambda }}}} = \mathrm{{diag}}\left( \frac{1}{M_{\mathrm{R}}}\sum _{m_{\mathrm{R}}= 1}^{M_{\mathrm{R}}}\hat{\bar{{\varvec{{H}}}}}_{(m_{\mathrm{R}},.)}\oslash {\bar{{\varvec{{H}}}}}_{\mathrm{{p}}(m_{\mathrm{R}},.)}\right) . \end{aligned}$$

By multiplying the solution of the LS-KRF with the diagonal matrix \(\hat{{\varvec{{\Lambda }}}}\), the scaling ambiguity in Eq. (18) is resolved and the data symbols can be demodulated. Note that the proposed Khatri–Rao receiver estimates the channel and the symbols in a semi-blind fashion. First, the channel and the symbols are jointly estimated without any a priori information. The pilot-based channel estimate is then used to resolve the scaling ambiguity affecting the columns of \(\hat{\bar{{\varvec{{H}}}}}\) and \(\hat{\bar{{\varvec{{S}}}}}\). Therefore, the optimal length and repetition of the piloting sequences are the same as for traditional OFDM systems. We summarize the steps of the proposed Khatri–Rao (KR) receiver in Algorithm 1.

[Algorithm 1 (figure a): Khatri–Rao (KR) receiver]
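The resolution of the column-wise scaling ambiguity in Eq. (18) can be illustrated in isolation. In the NumPy sketch below, the ambiguous estimates are simulated directly from Eq. (18) instead of being produced by an LS-KRF, the pilot-based channel estimate is taken to be error-free for the check, and the function name `estimate_scaling` is hypothetical.

```python
import numpy as np

def estimate_scaling(H_hat, H_p):
    """Average the element-wise ratio H_hat / H_p over the receive antennas (rows)."""
    return np.mean(H_hat / H_p, axis=0)

rng = np.random.default_rng(6)
M_R, K, R = 3, 6, 8                       # R stands for M_T * N

def crandn(*shape):
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

H_bar, S_bar = crandn(M_R, R), crandn(K, R)
lam = crandn(R)                           # one complex scaling factor per column

H_hat = H_bar * lam                       # H_bar * Lambda, as in Eq. (18)
S_hat = S_bar / lam                       # S_bar * Lambda^{-1}, as in Eq. (18)

H_p = H_bar.copy()                        # idealized pilot-based channel estimate
lam_hat = estimate_scaling(H_hat, H_p)

S_rec = S_hat * lam_hat                   # symbols after removing the ambiguity
H_rec = H_hat / lam_hat                   # channel after removing the ambiguity
print(np.allclose(S_rec, S_bar), np.allclose(H_rec, H_bar))
```

With an imperfect pilot-based estimate, the averaging over the \(M_{\mathrm{R}}\) receive antennas reduces the error in each estimated scaling factor, but the recovered symbols are then only approximate.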

Furthermore, the channel estimate resulting from the KR receiver can be used for channel tracking in future transmission frames if the channel has not changed drastically. If the channel estimate is used for tracking, it could be improved by means of an additional LS estimate from \([\tilde{{\varvec{{{\mathcal {Y}}}}}}]_{([3,4,1],[2])}\) with the knowledge of the estimated symbols projected onto the finite alphabet \(\Omega\), i.e., \({Q({{\bar{{\varvec{{S}}}}}})}=\mathrm{{proj}}({{\bar{{\varvec{{S}}}}}})\). The finite alphabet \(\Omega\) depends on the modulation type and the modulation order \(M_o\). The resulting LS estimate of the channel is

$$\begin{aligned} \hat{\bar{{\varvec{{H}}}}}_{\mathrm{{LS}}} ^{\mathrm{T}}= \left( ({\varvec{{I}}}_N\otimes {\varvec{{1}}}_{M_{{\mathrm{T}}}}^{\mathrm{T}})\diamond \bar{{\varvec{{C}}}}\diamond {Q({\bar{{\varvec{{S}}}}})}\right) ^{+}\cdot [\tilde{{\varvec{{{\mathcal {Y}}}}}}]_{([3,4,1],[2])}. \end{aligned}$$

However, we can also use this improved channel estimate to further improve the performance of the KR receiver. Using this updated channel estimate, an improved estimate of the diagonal scaling matrix \(\hat{{\varvec{{\Lambda }}}}\) can be calculated and, with that, an enhanced estimate of the symbols, \(\hat{\bar{{\varvec{{S}}}}}_{\mathrm{{LS}}}\), using Eq. (18). Note that, instead of computing just one LS estimate of the channel and the symbols, the performance can be further enhanced with additional iterations, leading to an iterative receiver. Note also that the symbol matrix \(\hat{\bar{{\varvec{{S}}}}}_{\mathrm{{LS}}}\) can be estimated in the least squares sense from the three-mode unfolding of Eq. (17), but the estimation of \(\hat{{\varvec{{\Lambda }}}}\) is computationally cheaper. The KR receiver with its enhancement via LS is summarized in Algorithm 2.

[Algorithm 2 (figure b): KR receiver with LS enhancement (KR+LS)]

Due to the additional LS-based estimates, the KR+LS algorithm has higher computational complexity than the KR algorithm.

5 Khatri–Rao cross-coding MIMO-OFDM

In Sect. 4, we have proposed a tensor model for KR-coded MIMO-OFDM systems that introduces an additional CP-like structure to the signal tensor. The additional CP-like structure of the signal tensor is achieved by means of a simplified Khatri–Rao coding. However, using such a Khatri–Rao coding, we add additional spreading that reduces the spectral efficiency of the system. To overcome this issue, in this section we propose to keep the CP structure of the signal tensor proposed in Sect. 4, but introduce a cross-coding approach, where the known Khatri–Rao coding matrices \({\varvec{{C}}}_1,\ldots , {\varvec{{C}}}_N\) are replaced by symbol matrices containing useful information symbols to be transmitted.

As in Sect. 4, the received signal in the frequency domain after the removal of the cyclic prefix is given by

$$\begin{aligned} \tilde{{\varvec{{{\mathcal {Y}}}}}} = \tilde{{\varvec{{{\mathcal {H}}}}}}\bullet _{4,2}^{2,1}\tilde{{\varvec{{{\mathcal {X}}}}}} + \tilde{{\varvec{{{\mathcal {N}}}}}} = \tilde{{\varvec{{{\mathcal {Y}}}}}}_0+ \tilde{{\varvec{{{\mathcal {N}}}}}} \in {\mathbb {C}}^{N \times M_{\mathrm{R}}\times K\times Q}. \end{aligned}$$
(19)

Likewise, the \(P = KQ\) frames are divided into K groups of Q blocks, where Q is the “spreading factor.” We model the channel tensor \(\tilde{{\varvec{{{\mathcal {H}}}}}}\) according to Eq. (8). Details regarding this model are also provided in “Appendix.” In this section, we make use of the generalized unfolding \([\tilde{{\varvec{{{\mathcal {H}}}}}}]_{([1,3],[4,2])} = \bar{{\varvec{{H}}}} \diamond ({\varvec{{I}}}_N\otimes {\varvec{{1}}}_{M_{{\mathrm{T}}}}^{\mathrm{T}})\) that is defined in (15). The generalized unfolding ([2, 1], [4, 3]) of the signal tensor is given by

$$\begin{aligned}{}[\tilde{{\varvec{{{\mathcal {X}}}}}}]_{([2,1],[4,3])}&= \begin{bmatrix} {\bar{{\varvec{{S}}}}^{(1)}_1}\diamond \bar{{\varvec{{S}}}}^{(2)}_1&{\bar{{\varvec{{S}}}}^{(1)}_2}\diamond \bar{{\varvec{{S}}}}^{(2)}_2&\ldots&{\bar{{\varvec{{S}}}}^{(1)}_N}\diamond \bar{{\varvec{{S}}}}^{(2)}_N \end{bmatrix}^{\mathrm{T}} \\&= {\varvec{{I}}}_{ M_{{\mathrm{T}}}N}(\bar{{\varvec{{S}}}}^{(1)}\diamond \bar{{\varvec{{S}}}}^{(2)})^{\mathrm{T}}, \end{aligned}$$
(20)

where the matrices \({\bar{{\varvec{{S}}}}^{(1)}_n} \in {\mathbb {C}}^{K \times M_{{\mathrm{T}}}}\) and \(\bar{{\varvec{{S}}}}^{(2)}_n \in {\mathbb {C}}^{Q \times M_{{\mathrm{T}}}}\) are the first and second symbol matrices, both of which carry information symbols. The first symbol matrix \(\bar{{\varvec{{S}}}}^{(1)}\) follows the structure of the symbol matrix in Sect. 4 and is composed of a pilot part and a data part (cf. Eq. (11)). The second symbol matrix, on the other hand, contains only data symbols, except for its first row, which contains known symbols (e.g., a row vector of ones). We refer to this transmission scheme as cross-coded MIMO-OFDM, due to the fact that \({\bar{{\varvec{{S}}}}^{(1)}_n}\) plays the role of a random KR coding with respect to \({\bar{{\varvec{{S}}}}^{(2)}_n}\) and vice versa. Let us define the block matrices \(\bar{{\varvec{{S}}}}^{(1)} = \begin{bmatrix} {\bar{{\varvec{{S}}}}^{(1)}_1}&\ldots&{\bar{{\varvec{{S}}}}^{(1)}_N} \end{bmatrix} \in {\mathbb {C}}^{K\times M_{{\mathrm{T}}}N}\) and \(\bar{{\varvec{{S}}}}^{(2)} = \begin{bmatrix} \bar{{\varvec{{S}}}}^{(2)}_1&\ldots&\bar{{\varvec{{S}}}}^{(2)}_N \end{bmatrix} \in {\mathbb {C}}^{Q\times M_{{\mathrm{T}}}N}\). From (20), the tensor \([\tilde{{\varvec{{{\mathcal {X}}}}}}]_{([2,1],3,4)}\) satisfies the following CP decomposition

$$\begin{aligned}{}[\tilde{{\varvec{{{\mathcal {X}}}}}}]_{([2,1],3,4)} = {\varvec{{{\mathcal {I}}}}}_{3,M_{{\mathrm{T}}} N} \times _1 {\varvec{{I}}}_{M_{{\mathrm{T}}}N} \times _2 \bar{{\varvec{{S}}}}^{(1)} \times _3 \bar{{\varvec{{S}}}}^{(2)}. \end{aligned}$$

Using Eqs. (2) and (19), the noiseless received signal is given by

$$\begin{aligned}{}[\tilde{{\varvec{{{\mathcal {Y}}}}}}_0]_{([1,2],[4,3])}=[\tilde{{\varvec{{{\mathcal {H}}}}}}]_{([1,3],[4,2])}\cdot [\tilde{{\varvec{{{\mathcal {X}}}}}}]_{([2,1],[4,3])}. \end{aligned}$$
(21)

Inserting (15) and (20) into (21), we obtain

$$\begin{aligned}{}[\tilde{{\varvec{{{\mathcal {Y}}}}}}_0]_{([1,2],[4,3])}= \left( \bar{{\varvec{{H}}}} \diamond ({\varvec{{I}}}_N\otimes {\varvec{{1}}}_{M_{{\mathrm{T}}}}^{\mathrm{T}})\right) \cdot (\bar{{\varvec{{S}}}}^{(1)}\diamond \bar{{\varvec{{S}}}}^{(2)})^{\mathrm{T}}, \end{aligned}$$

or, alternatively, using the n-mode product notation

$$\begin{aligned} \tilde{{\varvec{{{\mathcal {Y}}}}}}_0 = {{\varvec{{{\mathcal {I}}}}}_{4,M_{{\mathrm{T}}}N}}\times _1 ({\varvec{{I}}}_N\otimes {\varvec{{1}}}_{M_{{\mathrm{T}}}}^{\mathrm{T}}) \times _2 \bar{{\varvec{{H}}}} \times _3 \bar{{\varvec{{S}}}}^{(1)}\times _4 \bar{{\varvec{{S}}}}^{(2)} {\in {\mathbb {C}}^{N \times M_{\mathrm{R}}\times K\times Q}}. \end{aligned}$$
(22)

Depending on the available a priori knowledge at the receiver side, channel estimation, symbol estimation, or joint channel and symbol estimation can be performed. In contrast to the KR-coded system, where a known coding matrix is used, this knowledge is not available in the cross-coded MIMO-OFDM system, which makes the receiver design more challenging. A joint channel and symbol estimation now involves the estimation of three factor matrices from the noisy version of the four-way CP model (22). From the three-mode, four-mode, and two-mode unfoldings of \(\tilde{{\varvec{{{\mathcal {Y}}}}}}\) in (19), and using (22), we can obtain the LS equations for estimating \(\bar{{\varvec{{S}}}}^{(1)}\), \(\bar{{\varvec{{S}}}}^{(2)}\), and \(\bar{{\varvec{{H}}}}\), respectively:

$$\begin{aligned} \bar{{\varvec{{S}}}}^{(1)}&= \left[ \tilde{{{\varvec{{{\mathcal {Y}}}}}}} \right] _{(3)}\cdot \left[ {\left( \bar{{\varvec{{S}}}}^{(2)}\diamond \bar{{\varvec{{H}}}}\diamond ({\varvec{{I}}}_N\otimes {\varvec{{1}}}_{M_{{\mathrm{T}}}}^{\mathrm{T}})\right) ^{\mathrm{T}}}\right] ^{+}, \end{aligned}$$
(23)
$$\begin{aligned} \bar{{\varvec{{S}}}}^{(2)}&= \left[ \tilde{{{\varvec{{{\mathcal {Y}}}}}}} \right] _{(4)}\cdot \left[ \left( \bar{{\varvec{{S}}}}^{(1)}\diamond \bar{{\varvec{{H}}}}\diamond ({\varvec{{I}}}_N\otimes {\varvec{{1}}}_{M_{{\mathrm{T}}}}^{\mathrm{T}})\right) ^{\mathrm{T}}\right] ^{+} \end{aligned}$$
(24)
$$\begin{aligned} \bar{{\varvec{{H}}}}&= \left[ \tilde{{{\varvec{{{\mathcal {Y}}}}}}} \right] _{(2)}\cdot \left[ \left( \bar{{\varvec{{S}}}}^{(2)}\diamond \bar{{\varvec{{S}}}}^{(1)}\diamond ({\varvec{{I}}}_N\otimes {\varvec{{1}}}_{M_{{\mathrm{T}}}}^{\mathrm{T}})\right) ^{\mathrm{T}}\right] ^{+} \end{aligned}$$
(25)

We adopt a three-step ALS algorithm for estimating the symbol and channel matrices from the noisy versions of (23)–(25). However, it is known that there is no guarantee of convergence if we initialize the ALS algorithm randomly. To overcome this issue, we propose to use the pilot-based channel estimate \(\bar{{\varvec{{H}}}}_{\mathrm{{p}}}\) to obtain initial estimates of the matrices \(\bar{{\varvec{{S}}}}^{(1)}\) and \(\bar{{\varvec{{S}}}}^{(2)}\) based on LS-KRF. Such a channel estimate is obtained from the pilot symbols in \(\bar{{\varvec{{S}}}}^{(1)}\) and the first row of \(\bar{{\varvec{{S}}}}^{(2)}\), which contains known symbols. From the ([3, 4], [1, 2]) generalized unfolding of the noisy received signal tensor \(\tilde{{\varvec{{{\mathcal {Y}}}}}}\), we get

$$\begin{aligned} \left[ \tilde{{\varvec{{{\mathcal {Y}}}}}} \right] _{([3,4],[1,2])} \approx \left[ \bar{{\varvec{{S}}}}^{(2)}\diamond \bar{{\varvec{{S}}}}^{(1)}\right] \cdot \left[ \bar{{\varvec{{H}}}}_{\mathrm{{p}}}\diamond ({\varvec{{I}}}_N\otimes {\varvec{{1}}}_{M_{{\mathrm{T}}}}^{\mathrm{T}})\right] ^{\mathrm{T}}. \end{aligned}$$

Given \(\bar{{\varvec{{H}}}}_{\mathrm{{p}}}\) and \(M_{\mathrm{R}}\ge M_{{\mathrm{T}}}\), we apply the LS-KRF to \(\left[ {\tilde{{\varvec{{{\mathcal {Y}}}}}}} \right] _{([3,4],[1,2])}\cdot \left[ \left( \bar{{\varvec{{H}}}}_{\mathrm{{p}}}\diamond ({\varvec{{I}}}_N\otimes {\varvec{{1}}}_{M_{{\mathrm{T}}}}^{\mathrm{T}})\right) ^{\mathrm{T}}\right] ^{+}\approx \left[ \bar{{\varvec{{S}}}}^{(2)}\diamond \bar{{\varvec{{S}}}}^{(1)}\right]\) and obtain \(\hat{\bar{{\varvec{{S}}}}}^{(1)}\) and \(\hat{\bar{{\varvec{{S}}}}}^{(2)}\). However, the matrices \(\hat{\bar{{\varvec{{S}}}}}^{(1)}\) and \(\hat{\bar{{\varvec{{S}}}}}^{(2)}\) are estimated up to one complex scaling ambiguity per column. We exploit the first row of the matrix \({\bar{{\varvec{{S}}}}}^{(2)}\) to estimate this ambiguity (recall that the elements of the first row of the matrix \({\bar{{\varvec{{S}}}}}^{(2)}\) are set to one). After resolving the scaling ambiguity, we propose to iterate between Eqs. (23)–(25) to enhance the accuracy of the receiver. Hence, we propose two receivers for the cross-coded MIMO-OFDM systems, namely cross-coded Khatri–Rao (CC-KR) and cross-coded Khatri–Rao+alternating least squares (CC-KR+ALS). These two algorithms are summarized in Algorithm 3 and Algorithm 4, respectively. The CC-KR receiver exploits the LS-KRF to compute an estimate of the symbol matrices \(\bar{{\varvec{{S}}}}^{(1)}\) and \(\bar{{\varvec{{S}}}}^{(2)}\), assuming that \(M_{\mathrm{R}}\ge M_{{\mathrm{T}}}\), the first row of the matrix \(\bar{{\varvec{{S}}}}^{(2)}\) contains only ones, and a pilot-based channel estimate \(\bar{{\varvec{{H}}}}_{\mathrm{{p}}}\) is already available. Note that the initial steps of the CC-KR+ALS and the CC-KR receivers are the same. As for the subsequent steps, for the CC-KR+ALS receiver, the channel matrix and both symbol matrices are estimated using ALS. 
The algorithm is stopped if it exceeds the maximum number of iterations, which is set to 5, if it reaches a predefined minimum of the cost function \(\left\| \tilde{{\varvec{{{\mathcal {Y}}}}}} - {{\varvec{{{\mathcal {I}}}}}_{4,M_{{\mathrm{T}}}N}}\times _1 ({\varvec{{I}}}_N\otimes {\varvec{{1}}}_{M_{{\mathrm{T}}}}^{\mathrm{T}}) \times _2 \hat{\bar{{\varvec{{H}}}}} \times _3 \hat{\bar{{\varvec{{S}}}}}^{(1)}\times _4 \hat{\bar{{\varvec{{S}}}}}^{(2)}\right\| _{\mathrm{H}}^2/\left\| \tilde{{\varvec{{{\mathcal {Y}}}}}}\right\| _{\mathrm{H}}^2\), or if the value of the cost function has not changed within two consecutive iterations. The CC-KR+ALS algorithm has a higher computational complexity than the CC-KR algorithm due to the additional ALS iterations, as shown in Algorithm 4.
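The LS-KRF step with the subsequent scaling correction can be sketched as follows. This is a minimal, noiseless illustration in plain Python and not the authors' implementation: the function names `rank_one_factors` and `ls_krf` are hypothetical, the toy symbol matrices are arbitrary, and a simple power iteration is used here in place of a truncated SVD to obtain the dominant rank-one term. Each column of the Khatri–Rao product is reshaped into a \(K \times Q\) rank-one matrix, factorized, and rescaled such that the first row of the second factor equals one.

```python
import math

def khatri_rao(A, B):
    """Column-wise Kronecker product (A and B have the same number of columns)."""
    return [[a * b for a, b in zip(row_a, row_b)] for row_a in A for row_b in B]

def rank_one_factors(M, iters=50):
    """Return (a, b) with M ~ a b^T, using power iteration instead of an SVD."""
    rows, cols = len(M), len(M[0])
    # Start from the conjugate of the strongest row, which lies in the row space.
    strongest = max(M, key=lambda row: sum(abs(x) ** 2 for x in row))
    v = [x.conjugate() for x in strongest]
    for _ in range(iters):
        norm = math.sqrt(sum(abs(x) ** 2 for x in v))
        v = [x / norm for x in v]
        a = [sum(M[i][j] * v[j] for j in range(cols)) for i in range(rows)]              # M v
        v = [sum(M[i][j].conjugate() * a[i] for i in range(rows)) for j in range(cols)]  # M^H a
    norm = math.sqrt(sum(abs(x) ** 2 for x in v))
    v = [x / norm for x in v]
    a = [sum(M[i][j] * v[j] for j in range(cols)) for i in range(rows)]
    return a, [x.conjugate() for x in v]

def ls_krf(X, K, Q):
    """Factorize X ~ S2 (Khatri-Rao) S1 column by column; first row of S2 is one."""
    R = len(X[0])
    S1 = [[0j] * R for _ in range(K)]
    S2 = [[0j] * R for _ in range(Q)]
    for r in range(R):
        # Column r equals s2_r (x) s1_r, so the reshaped K x Q matrix is s1_r s2_r^T.
        M = [[X[q * K + k][r] for q in range(Q)] for k in range(K)]
        a, b = rank_one_factors(M)
        alpha = b[0]                      # scaling ambiguity, resolved via the known first row
        for k in range(K):
            S1[k][r] = a[k] * alpha
        for q in range(Q):
            S2[q][r] = b[q] / alpha
    return S1, S2

# Toy example (arbitrary values, assumed for illustration): K = 2, Q = 3, three columns.
S1_true = [[1 + 1j, 1 - 1j, -1 + 1j], [-1 - 1j, 1 + 1j, 1 - 1j]]
S2_true = [[1, 1, 1], [1j, -1, 1 - 1j], [-1, 1j, 2]]
S1_hat, S2_hat = ls_krf(khatri_rao(S2_true, S1_true), K=2, Q=3)
```

In the noiseless case the factors are recovered up to floating-point error; with noise, each column yields the best rank-one fit in the LS sense, which is why the subsequent ALS iterations can still improve the estimates.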

Based on Eqs. (23)–(25) and to ensure the parameter estimation identifiability, Algorithms 3 and 4 have to satisfy the following conditions related to the system parameters,

$$\begin{aligned} M_{\mathrm{R}}Q \ge M_{{\mathrm{T}}}, \quad M_{\mathrm{R}}K \ge M_{{\mathrm{T}}}, \quad KQ \ge M_{{\mathrm{T}}}. \end{aligned}$$
(26)

These conditions establish trade-offs involving the space, time, and coding diversities to ensure a unique recovery of the channel and the symbols. More specifically, decreasing the number of receive antennas can be compensated by an increase in the number of groups K or the number of blocks Q that define the cross-coding scheme in order to ensure joint channel and symbol identifiability.
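The three inequalities in (26) are easy to check for a given configuration. The following sketch is only an illustration; the function name `satisfies_identifiability` and the example parameter values are assumptions, not part of the proposed algorithms.

```python
def satisfies_identifiability(M_T, M_R, K, Q):
    """Check the three conditions in (26) for given system parameters."""
    return M_R * Q >= M_T and M_R * K >= M_T and K * Q >= M_T

# With M_T = 4 and M_R = 2, a single group (K = 1) violates M_R * K >= M_T ...
single_group = satisfies_identifiability(M_T=4, M_R=2, K=1, Q=4)
# ... whereas increasing the number of groups to K = 2 restores all three conditions.
two_groups = satisfies_identifiability(M_T=4, M_R=2, K=2, Q=4)
```

Note that these conditions concern the LS updates in (23)–(25); the LS-KRF-based initialization described above additionally assumes \(M_{\mathrm{R}}\ge M_{{\mathrm{T}}}\).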

6 Simulation results

In this section, we evaluate the performance of the proposed receivers for MIMO-OFDM systems using Monte Carlo simulations. First, we compare the performance of ZF, ILSP, and RLSP, using 5000 realizations. We consider a \(2 \times 2\) OFDM system, with K frames, and \(N= 128\) subcarriers. The pilot symbols are transmitted on every third subcarrier such that \(\Delta F = 3\) and only during the first frame, i.e., \(\Delta K = K\). Using these pilots, we obtain a pilot-based channel estimate with which we initialize all of the algorithms. The transmitted data symbols are independent and are drawn from a 4-ary quadrature amplitude modulation (4-QAM) constellation. The frequency selective propagation channel is modeled according to the 3rd Generation Partnership Project (3GPP) Pedestrian A channel (Ped A) [31]. The duration of the cyclic prefix is 32 samples, and the weighting factor of the recursive LS is \(\alpha = 1\). The maximum number of iterations for the iterative algorithm is set to 7.
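The SER used as the performance metric counts hard decisions that differ from the transmitted symbols. A minimal sketch of this computation is given below, assuming a unit-energy 4-QAM constellation \((\pm 1 \pm j)/\sqrt{2}\) and nearest-neighbor decisions; the function names, the normalization, and the example values are illustrative assumptions rather than the simulation code behind the figures.

```python
import math

# Unit-energy 4-QAM constellation (normalization assumed for illustration).
QAM4 = [complex(re, im) / math.sqrt(2) for re in (1, -1) for im in (1, -1)]

def project_to_qam4(z):
    """Nearest-neighbor hard decision onto the 4-QAM constellation."""
    return min(QAM4, key=lambda s: abs(z - s))

def symbol_error_rate(estimates, transmitted):
    """Fraction of hard decisions that differ from the transmitted symbols."""
    errors = sum(project_to_qam4(e) != t for e, t in zip(estimates, transmitted))
    return errors / len(transmitted)

# Four transmitted symbols and noisy soft estimates (arbitrary example values).
transmitted = [QAM4[0], QAM4[1], QAM4[2], QAM4[3]]
estimates = [0.9 + 0.8j, 0.7 - 0.2j, 0.1 + 0.6j, -0.5 - 0.9j]
ser = symbol_error_rate(estimates, transmitted)  # the third estimate is decided wrongly
```

The same projection is the kind of finite-alphabet step that gives the "with projection" receivers (ILSP, RLSP) their name.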

Fig. 3 SER versus \(E_b/N_0\) for \(N=128\), \(Q=2\), \(M_{{\mathrm{T}}} =2\), \(M_{\mathrm{R}}=2\), \(\Delta K = K\), \(\Delta F = 4\) and different numbers of blocks K

In Fig. 3, we compare the SER performance of the traditional frequency domain ZF receiver, the proposed Khatri–Rao (KR) receiver (see Algorithm 1) and the proposed Khatri–Rao receiver with one additional LS iteration (see Algorithm 2) for different numbers of transmitted blocks. In this case, note that the KR and the KR+LS receivers benefit from the increased number of frames as the channel has been kept constant during the \(P=Q\cdot K\) frames. Moreover, as the number of frames increases, the advantages of the enhancement via LS become more pronounced.

Fig. 4 SER comparison for different numbers of transmit and receive antennas

Moreover, the SER comparison between the ZF and the Khatri–Rao-coded algorithms for \(N=128\), \(Q=M_{{\mathrm{T}}}\), \(K=2\), \(\Delta K =2\), \(\Delta F=4\), and different numbers of antennas is depicted in Fig. 4. The KR and KR-LS receivers benefit from an increased number of transmit antennas due to the increased spreading factor, \(Q = M_{{\mathrm{T}}}\). The performance enhancement with the additional LS estimate is achieved for \(K > 2\). However, the KR receiver outperforms the ZF one even without the LS enhancement. We can observe that, for the Khatri–Rao-coded algorithms, i.e., the KR and the KR-LS, the receiver performance improves as the number of antennas increases. However, as shown in Table 2, the computational complexity of the receiver increases linearly, since more rank-one approximations must be computed.

Fig. 5 SER versus \(E_b/N_0\) for \(N=128\) subcarriers, \(K=10\) blocks, \(\Delta K = 10\), while \(P=10\) (\(K=5\) and \(Q=2\)) for the Khatri–Rao-coded MIMO-OFDM

In Fig. 5, we depict the SERs for the uncoded and the Khatri–Rao-coded MIMO-OFDM systems. The KR receiver has an accuracy similar to that of the ILSP and the RLSP algorithms [17], which improves with increasing SNR. The KR+LS receiver outperforms the ILSP algorithm and the KR algorithm in terms of SER. Recall that the KR-coded OFDM model in Eq. (17) has a richer tensor structure than the OFDM model in Eq. (12) due to the coding. The KR algorithm and the KR-LS algorithm effectively exploit this structure to estimate the channel and the symbols. Note that the KR-LS algorithm computes an improved estimate of the scaling matrix. Therefore, KR-LS leads to lower SER levels than the ILSP and KR algorithms.

Fig. 6 SER for a \(2\times 2\) cross-coded OFDM system with parameters \(N=128\), \(Q=2\), K, \(\Delta K\), \(\Delta F\), and the symbols are drawn from a 4-QAM modulation. The parameters K, \(\Delta K\) and \(\Delta F\) are indicated in the legend

In Fig. 6, we provide an SER comparison for two scenarios. For both scenarios, we assume \(Q=2\), and the symbols are drawn from a 4-QAM modulation. Moreover, \(K=5\), \(\Delta F=10\), and \(\Delta K= 5\), for the first scenario, whereas for the second scenario \(K=3\), \(\Delta F=5\), and \(\Delta K= 3\). Hence, in the first scenario we estimate more symbols than in the second scenario, using fewer pilot symbols. As expected, we achieve a lower SER if more pilot symbols are used because they lead to a more accurate initial pilot-based channel estimate. Moreover, in Fig. 6 we see that the CC-KR+ALS receiver outperforms the CC-KR receiver. Thus, we benefit from the additional iterations and from exploiting the complete tensor structure. In contrast to CC-KR, CC-KR+ALS also estimates the channel matrix. Furthermore, the accuracy gain of the CC-KR+ALS receiver is more pronounced if we initialize the CC-KR+ALS with a less accurate pilot-based channel estimate (the gain is more pronounced for the solid lines than for the dashed lines in Fig. 6).

Fig. 7 SER for \(4\times 4\) KR-coded OFDM, cross-coded OFDM, and traditional OFDM systems

Finally, in Fig. 7, we depict the SER performance for a \(4\times 4\) MIMO system, considering the following receivers: (i) ILSP receiver [17], (ii) RLSP receiver [17], (iii) KR receiver (Algorithm 1), (iv) KR-LS receiver (Algorithm 2), (v) CC-KR receiver (Algorithm 3), and (vi) CC-KR+ALS (Algorithm 4). To ensure a fair comparison in terms of spectral efficiency, the following parameters were chosen for the different receivers: The KR-coded OFDM system assumes \(N=128\), \(\Delta F = 10\), \(K=2\), \(\Delta K = 2\), \(Q=4\), \(P=KQ=8\) and the symbols are modulated using 16-QAM. For the cross-coded OFDM system we assume \(N=128\), \(\Delta F = 10\), \(K=2\), \(\Delta K = 2\), \(Q=4\), \(P=KQ=8\) and the symbols are drawn from a BPSK modulation. The OFDM system assumes \(N=128\), \(\Delta F = 10\), \(K=8\), \(\Delta K = 8\), and BPSK symbols. We see that the CC-KR receiver outperforms the ILSP and RLSP receivers from [17]. In addition, the KR and KR-LS receivers for KR-coded OFDM have different slopes than the uncoded OFDM and the cross-coded OFDM, exhibiting a better performance, as expected.

Table 2 Computational cost of compared algorithms

In Table 2, we show the computational complexity of the compared algorithms. We take into account the main computational efforts, i.e., the computation of matrix inverses. For a matrix \({\varvec{{A}}} \in {\mathbb {C}}^{N \times M}\), we consider the cost of \({\mathcal {O}}\left( M^3\right)\) for its inversion, and \({\mathcal {O}}\left( NM^2 \right)\) for the computation of its rank-one SVD approximation. The Khatri–Rao factorization-based algorithms, i.e., the KR and the CC-KR algorithms, have the lowest computational effort. This is due to the fact that they compute \(NM_T\) independent rank-one matrix approximations, while the remaining algorithms (ZF, RLSP, and ILSP) require iterations and/or the inversion of large matrices. Compared to the KR coding, the proposed CC-KR has similar complexity if \(Q = M_R\). On the other hand, in the proposed CC-KR receiver, two data symbol matrices are transmitted (\({\varvec{{S}}}^{(1)}\) and \({\varvec{{S}}}^{(2)}\)), increasing the spectral efficiency of the MIMO-OFDM system.

7 Conclusion and discussion

In this paper, we have presented a tensor model for MIMO-OFDM systems using the double contraction between a channel tensor and a transmit signal tensor. The use of double contractions allows us to derive explicit CP-like, or Tucker-like, tensor models for the received signal, which are exploited for a joint channel and symbol estimation using semi-blind algorithms. The proposed model is a very general and flexible way of describing the received signal in MIMO-OFDM systems for all subcarriers jointly.

We have also proposed Khatri–Rao-coded MIMO-OFDM models and the corresponding semi-blind receivers based on the derived explicit CP-like tensor structure of the data model. In particular, the proposed KR-coded receivers, namely KR, KR+LS, CC-KR, and CC-KR+ALS, achieve a better performance in terms of the symbol error rate than the state-of-the-art schemes from the literature (ZF, ILSP, and RLSP). Also, the Khatri–Rao-based receivers (KR and CC-KR) can benefit from parallel processing, thus having a lower computational processing delay than the competitors. In addition, we have improved the performance of the Khatri–Rao-based receivers by means of an additional LS iteration (KR+LS) and an ALS procedure (CC-KR+ALS). Note that the Khatri–Rao coding strategy (KR and KR+LS) has a lower spectral efficiency than the uncoded MIMO-OFDM system. To overcome this limitation, we have proposed a cross-coded Khatri–Rao strategy (CC-KR and CC-KR+ALS algorithms), where the “coding matrix” contains useful data symbols. For this cross-coded system, two receivers have been proposed.

A natural perspective of this work is an extension of the proposed semi-blind receivers to other multi-carrier techniques such as universal filtered multi-carrier (UFMC) and FBMC modulation [16], relay-assisted systems, and multi-user systems. In the case of a multi-user system, the proposed CC-KR algorithm, and possibly the CC-KR+ALS algorithm, can be used where the transmitted data symbols of multiple users are used as “coding matrices” to improve the total spectral efficiency of the system.

Availability of data and materials

Data sharing is not applicable to this article.

Change history

  • 10 February 2023

    Missing Open Access funding information has been added in the Funding Note.

Notes

  1. In our simulations, we use the pilot-based channel estimate obtained in the time domain.

Abbreviations

ALS: Alternating least squares
BTD: Block term decomposition
CC-KR: Cross-coded Khatri–Rao
CP: Canonical PARAFAC decomposition
DFT: Discrete Fourier transform
FBMC: Filter bank-based multi-carrier
ILSP: Iterative least squares with projection
ISI: Inter-symbol interference
KR: Khatri–Rao receiver
KR-LS: Khatri–Rao least-squares receiver
LS: Least squares
LS-KRF: Least-squares Khatri–Rao factorization
MIMO: Multiple-input multiple-output
OFDM: Orthogonal frequency division multiplexing
PARAFAC: Parallel factors decomposition
QAM: Quadrature amplitude modulation
RLSP: Recursive least squares with projection
ZF: Zero forcing

References

  1. R.A. Harshman, M.E. Lundy, Uniqueness proof for a family of models sharing features of Tucker’s three-mode factor analysis and PARAFAC/CANDECOMP. Psychometrika 61(1), 133–154 (1996)

  2. R.A. Harshman, PARAFAC2: Mathematical and Technical Notes. CLA Working Papers in Phonetics (University Microfilms, Ann Arbor, Michigan, No. 10,085, 1972), pp. 30–44

  3. K. Naskovska, Y. Cheng, A.L.F. de Almeida, M. Haardt, Efficient computation of the PARAFAC2 decomposition via generalized tensor contractions, in Proceedings of 52nd Asilomar Conference on Signals, Systems, and Computers, Pacific Grove, CA (2018), pp. 323–327

  4. A.L.F. de Almeida, G. Favier, L.R. Ximenes, Space-time-frequency (STF) MIMO communication systems with blind receiver based on a generalized PARATUCK2 model. IEEE Trans. Signal Process. 61(8), 1895–1909 (2013)

  5. J. Zhang, A. Nimr, K. Naskovska, M. Haardt, Enhanced tensor based semi-blind estimation algorithm for relay-assisted MIMO systems, in Proceedings of the 12th International Conference on Latent Variable Analysis and Signal Separation (LVA/ICA), eds. by E. Vincent, A. Yeredor, Z. Koldovsky, P. Tichavsky (2015), pp. 64–72

  6. L.R. Ximenes, G. Favier, A.L.F. de Almeida, Y.C.B. Silva, PARAFAC-PARATUCK semi-blind receivers for two-hop cooperative MIMO relay systems. IEEE Trans. Signal Process. 62, 3604–3615 (2014)

  7. T. Hwang, C. Yang, G. Wu, S. Li, G. Ye Li, OFDM and its wireless applications: a survey. IEEE Trans. Veh. Technol. 58(4), 1673–1694 (2009)

  8. B. Farhang-Boroujeny, OFDM versus filter bank multicarrier. IEEE Signal Process. Mag. 28(3), 92–112 (2011)

  9. M. Speth, S.A. Fechtel, G. Fock, H. Meyr, Optimum receiver design for wireless broad-band systems using OFDM. I. IEEE Trans. Commun. 47(11), 1668–1677 (1999)

  10. I. Barhumi, G. Leus, M. Moonen, Optimal training design for MIMO OFDM systems in mobile wireless channels. IEEE Trans. Signal Process. 5, 1615–1624 (2003)

  11. D. Hu, L. Yang, Y. Shi, L. He, Optimal pilot sequence design for channel estimation in MIMO OFDM systems. IEEE Commun. Lett. 10, 1–3 (2006)

  12. A.L.F. de Almeida, G. Favier, Unified tensor model for space-frequency spreading-multiplexing (SFSM) MIMO communication systems. EURASIP J. Adv. Signal Process. (2013). https://doi.org/10.1186/1687-6180-2013-48

  13. G. Favier, A.L.F. de Almeida, Tensor space-time-frequency coding with semi-blind receivers for MIMO wireless communication systems. IEEE Trans. Signal Process. 62(22), 5987–6002 (2014)

  14. K. Liu, J.P.C.L. da Costa, H.C. So, A.L.F. de Almeida, Semi-blind receivers for joint symbol and channel estimation in space-time-frequency MIMO-OFDM systems. IEEE Trans. Signal Process. 61(21), 5444–5457 (2013)

  15. B. Sokal, P.R.B. Gomes, A.L.F. de Almeida, M. Haardt, Tensor-based receiver for joint channel, data, and phase-noise estimation in MIMO-OFDM systems. IEEE J. Sel. Top. Signal Process. 15, 803–815 (2021)

  16. E. Kofidis, A tensor-based approach to joint channel estimation/data detection in flexible multicarrier MIMO systems. IEEE Trans. Signal Process. 68, 3179–3193 (2020)

  17. K. Naskovska, M. Haardt, A.L.F. de Almeida, Generalized tensor contractions for an improved receiver design in MIMO-OFDM systems, in Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) (2018), pp. 3186–3190

  18. K. Naskovska, M. Haardt, A.L.F. de Almeida, Generalized tensor contraction with application to Khatri-Rao coded MIMO OFDM systems, in Proceedings IEEE 7th International Workshop on Computational Advances in Multi-Sensor Adaptive Processing (CAMSAP) (2017), pp. 286–290

  19. N.D. Sidiropoulos, R.S. Budampati, Khatri-Rao space-time codes. IEEE Trans. Signal Process. 50(10), 2396–2407 (2002)

  20. A.L.F. de Almeida, G. Favier, Double Khatri-Rao space-time-frequency coding using semi-blind PARAFAC based receiver. IEEE Signal Process. Lett. 20, 471–474 (2013)

  21. T.G. Kolda, B.W. Bader, Tensor decompositions and applications. SIAM Rev. 51, 455–500 (2009)

  22. M. Haardt, F. Roemer, G. Del Galdo, Higher-order SVD based subspace estimation to improve the parameter estimation accuracy in multi-dimensional harmonic retrieval problems. IEEE Trans. Signal Process. 56, 3198–3213 (2008)

  23. A. Cichocki, D. Mandic, A. Phan, C. Caiafa, G. Zhou, Q. Zhao, L. De Lathauwer, Tensor decompositions for signal processing applications: from two-way to multiway. IEEE Signal Process. Mag. 32, 145–163 (2015)

  24. X. Luciani, L. Albera, Semi-algebraic canonical decomposition of multi-way arrays and joint eigenvalue decomposition, in Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP) (2011), pp. 4104–4107

  25. F. Roemer, C. Schroeter, M. Haardt, A semi-algebraic framework for approximate CP decompositions via joint matrix diagonalization and generalized unfoldings, in Proceedings of the 46th Asilomar Conference on Signals, Systems, and Computers (2012), pp. 2023–2027

  26. P. Comon, Tensors: a brief introduction. IEEE Signal Process. Mag. 31(3), 44–53 (2014)

  27. L. De Lathauwer, Decompositions of a higher-order tensor in block terms-part II: definitions and uniqueness. SIAM J. Matrix Anal. Appl. (SIMAX) 30, 1033–1066 (2008)

  28. S. Talwar, M. Viberg, A. Paulraj, Blind estimation of multiple co-channel digital signals using an antenna array. IEEE Signal Process. Lett. 1(2), 29–31 (1994)

  29. S. Talwar, M. Viberg, A. Paulraj, Blind separation of synchronous co-channel digital signals using an antenna array. I. Algorithms. IEEE Trans. Signal Process. 44(5), 1184–1197 (1996)

  30. F. Roemer, M. Haardt, Tensor-based channel estimation (TENCE) for two-way relaying with multiple antennas and spatial reuse, in Proceedings IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Taipei, Taiwan (2009), pp. 3641–3644

  31. ITU-R Recommendation M.1225, Guidelines for evaluation of radio transmission technologies for IMT-2000, Recommendation (1997)

Funding

Open Access funding enabled and organized by Projekt DEAL. The authors would like to thank Coordenação de Aperfeiçoamento de Pessoal de Nível Superior-Brasil (CAPES)-Finance Code 001 and CNPq (Proc. 306616/2016-5). This work is also supported by CAPES/PROBRAL Proc. 88887.144009/2017-00, CAPES/PRINT Proc. 88887.311965/2018-00.

Author information

Contributions

KN was responsible for the document writing, proposal of the algorithms, and elaboration of the experimental results. BS also contributed to the writing of the manuscript and review. ALFdA and MH were responsible for the review of the manuscript and provided ideas and discussions on the main contributions of this manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Bruno Sokal.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

The picture materials quoted in this article have no copyright requirements.

Competing interests

The authors declare that they have no competing interests.

Experimental section

Not applicable.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

1.1 Derivation of the four-way channel tensor in the frequency domain and its unfoldings

Let us assume a MIMO-OFDM system with \(M_{{\mathrm{T}}}\) transmit antennas and \(M_{\mathrm{R}}\) receive antennas. Such a system is depicted in Fig. 2, for \(M_{{\mathrm{T}}}=2\) and \(M_{\mathrm{R}}=3\). As shown in Sect. 3, we can define a four-way channel tensor \(\tilde{{\varvec{{{\mathcal {H}}}}}}\in {\mathbb {C}}^{N \times N\times M_{\mathrm{R}} \times M_{{\mathrm{T}}}}\) by concatenating the channel tensors for each transmit antenna, i.e., \(\tilde{{\varvec{{{\mathcal {H}}}}}}_{\mathrm{R}}^{(m_{{\mathrm{T}}})}\in {\mathbb {C}}^{N \times N\times M_{\mathrm{R}}}\) along the four-mode. The tensors \(\tilde{{\varvec{{{\mathcal {H}}}}}}_{\mathrm{R}}^{(m_{{\mathrm{T}}})}\in {\mathbb {C}}^{N \times N\times M_{\mathrm{R}}}\) contain the channel vectors for the \(m_{{\mathrm{T}}}\)th transmit antenna and all receive antennas as defined in Eq. (6), for \(m_{{\mathrm{T}}}=1,\ldots M_{{\mathrm{T}}}\). Recall that these tensors have a CP structure, i.e., \(\tilde{{\varvec{{{\mathcal {H}}}}}}_{\mathrm{R}}^{(m_{{\mathrm{T}}})} ={\varvec{{{\mathcal {I}}}}}_{3,N}\times _3\tilde{{\varvec{{H}}}}_{\mathrm{R}}^{(m_{{\mathrm{T}}})}\), for \(m_{{\mathrm{T}}}=1,\ldots M_{{\mathrm{T}}}\). The matrices \(\tilde{{\varvec{{H}}}}_{\mathrm{R}}^{(m_{{\mathrm{T}}})}\) (\(m_{{\mathrm{T}}}=1,\ldots M_{{\mathrm{T}}}\)) are defined in Eq. (7). Hence, the four-way channel tensor is

$$\begin{aligned} \tilde{{\varvec{{{\mathcal {H}}}}} }&=\begin{bmatrix} \tilde{{\varvec{{{\mathcal {H}}}}}}_{\mathrm{R}}^{(1)}&\sqcup _4&\tilde{{\varvec{{{\mathcal {H}}}}}}_{\mathrm{R}}^{(2)}&\sqcup _4&\ldots&\tilde{{\varvec{{{\mathcal {H}}}}}}_{\mathrm{R}}^{(M_{{\mathrm{T}}})} \end{bmatrix} \end{aligned}$$

We can rewrite this concatenation by means of an outer product with a pinning vector \({\varvec{{e}}}_{m_{{\mathrm{T}}}}\), i.e., \(\tilde{{\varvec{{{\mathcal {H}}}}} } = \sum _{m_{{\mathrm{T}}}=1}^{M_{{\mathrm{T}}}} \tilde{{\varvec{{{\mathcal {H}}}}}}_{\mathrm{R}}^{(m_{{\mathrm{T}}})} \circ {\varvec{{e}}}_{m_{{\mathrm{T}}}}\). Moreover, if we substitute the CP structure of the tensor \(\tilde{{\varvec{{{\mathcal {H}}}}}}_{\mathrm{R}}^{(m_{{\mathrm{T}}})}\), we get

$$\begin{aligned} \tilde{{\varvec{{{\mathcal {H}}}}} } = \sum _{m_{{\mathrm{T}}}=1}^{M_{{\mathrm{T}}}} \left( {\varvec{{{\mathcal {I}}}}}_{3,N}\times _1{\varvec{{I}}}_N\times _2{\varvec{{I}}}_N\times _3\tilde{{\varvec{{H}}}}_{\mathrm{R}}^{(m_{{\mathrm{T}}})}\right) \circ {\varvec{{e}}}_{m_{{\mathrm{T}}}}. \end{aligned}$$

Replacing the outer product by an n-mode product, we have

$$\begin{aligned} \tilde{{\varvec{{{\mathcal {H}}}}} } =\sum _{m_{{\mathrm{T}}}=1}^{M_{{\mathrm{T}}}} {\varvec{{{\mathcal {D}}}}}\times _1{\varvec{{I}}}_N\times _2{\varvec{{I}}}_N\times _3\tilde{{\varvec{{H}}}}_{\mathrm{R}}^{(m_{{\mathrm{T}}})}\times _4 {\varvec{{e}}}_{m_{{\mathrm{T}}}}, \end{aligned}$$
(27)

where \({\varvec{{{\mathcal {D}}}}}_{(.,.,.,1)}={\varvec{{{\mathcal {I}}}}}_{3,N}\). Note that the tensor \({\varvec{{{\mathcal {D}}}}}\in {\mathbb {R}}^{N\times N\times N\times 1}\) is a four-way tensor, but its four-mode is a singleton dimension. We can define this tensor in terms of a Kronecker product, which yields \({\varvec{{{\mathcal {D}}}}} = {\varvec{{{\mathcal {I}}}}}_{4,1} \otimes {\varvec{{{\mathcal {I}}}}}_{3,N}\). Equation (27) represents a very special BTD where the block terms are equivalent in all modes, except the three-mode and the four-mode. Next, we can replace the sum in (27) with a block diagonal core tensor and factor matrices partitioned accordingly.

$$\begin{aligned} \tilde{{\varvec{{{\mathcal {H}}}}} }&= \mathop {\mathrm{{blkdiag}}}\left( {\varvec{{{\mathcal {I}}}}}_{4,1} \otimes {\varvec{{{\mathcal {I}}}}}_{3,N}\right) _{m_{{\mathrm{T}}}=1}^{M_{{\mathrm{T}}}}\times _1\begin{bmatrix} {{\varvec{{I}}}_N}&\ldots&{{\varvec{{I}}}_N} \end{bmatrix}\\&\quad \times _2\begin{bmatrix} {{\varvec{{I}}}_N}&\ldots&{{\varvec{{I}}}_N} \end{bmatrix} \times _3 \underbrace{\begin{bmatrix} \tilde{{\varvec{{H}}}}_{\mathrm{R}}^{(1)}&\ldots&\tilde{{\varvec{{H}}}}_{\mathrm{R}}^{(M_{{\mathrm{T}}})} \end{bmatrix}}_{\tilde{{\varvec{{H}}}}}\times _4\begin{bmatrix} {{\varvec{{e}}}_1}&\ldots&{{\varvec{{e}}}_{M_{{\mathrm{T}}}}} \end{bmatrix} \end{aligned}$$

Further, we rewrite the block diagonal structure and the partitioned factor matrices using Kronecker products

$$\begin{aligned} \tilde{{\varvec{{{\mathcal {H}}}}}}&=({\varvec{{{\mathcal {I}}}}}_{4,M_{{\mathrm{T}}}}\otimes {\varvec{{{\mathcal {I}}}}}_{3,N}) \times _1 ({\varvec{{1}}}_{M_{{\mathrm{T}}}}^{\mathrm{T}}\otimes {\varvec{{I}}}_{N}) \times _2 ({\varvec{{1}}}_{M_{{\mathrm{T}}}}^{\mathrm{T}}\otimes {\varvec{{I}}}_{N}) \times _3 \tilde{{\varvec{{H}}}} \times _4 {\varvec{{I}}}_{M_{{\mathrm{T}}}}. \end{aligned}$$
(28)

This last equation explicitly reveals the structure of the channel tensor \(\tilde{{\varvec{{{\mathcal {H}}}}}}\). Exploiting this structure, we can define any of the tensor unfoldings. For the generalized unfolding \(\left[ \tilde{{\varvec{{{\mathcal {H}}}}}} \right] _{([1,3],[2,4])}\), from Eq. (28), we get

$$\begin{aligned} \left[ \tilde{{\varvec{{{\mathcal {H}}}}}} \right] _{([1,3],[2,4])} =&\left[ \tilde{{\varvec{{H}}}}\otimes ({\varvec{{1}}}_{M_{{\mathrm{T}}}}^{\mathrm{T}}\otimes {\varvec{{I}}}_{N})\right] \\&{\left[ {\varvec{{{\mathcal {I}}}}}_{4,M_{{\mathrm{T}}}}\otimes {\varvec{{{\mathcal {I}}}}}_{3,N} \right] _{([1,3],[2,4])}\left[ {\varvec{{I}}}_{M_{{\mathrm{T}}}} \otimes {\varvec{{1}}}_{M_{{\mathrm{T}}}}^{\mathrm{T}}\otimes {\varvec{{I}}}_{N} \right] ^{\mathrm{T}}} \end{aligned}$$
(29)

Next, we have

$$\begin{aligned} \left[ {\varvec{{{\mathcal {I}}}}}_{4,M_{{\mathrm{T}}}}\otimes {\varvec{{{\mathcal {I}}}}}_{3,N} \right] _{([1,3],[2,4])}\left[ {\varvec{{I}}}_{M_{{\mathrm{T}}}} \otimes {\varvec{{1}}}_{M_{{\mathrm{T}}}}^{\mathrm{T}}\otimes {\varvec{{I}}}_{N} \right] ^{\mathrm{T}} = {\varvec{{I}}}_{N{M_{{\mathrm{T}}}}} \diamond {\varvec{{I}}}_{N M_{{\mathrm{T}}}} \end{aligned}$$

for the second part in (29). Recognize that \({\varvec{{I}}}_{N{M_{{\mathrm{T}}}}} \diamond {\varvec{{I}}}_{NM_{{\mathrm{T}}}}={\varvec{{J}}}_{NM_{{\mathrm{T}}}}\) is a selection matrix that converts a Kronecker product into a Khatri–Rao product. Using this property, (29) becomes

$$\begin{aligned} \left[ \tilde{{\varvec{{{\mathcal {H}}}}}} \right] _{([1,3],[2,4])} = \tilde{{\varvec{{H}}}}\diamond ({\varvec{{1}}}_{M_{{\mathrm{T}}}}^{\mathrm{T}}\otimes {\varvec{{I}}}_{N}). \end{aligned}$$
(30)
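The selection-matrix property used to pass from (29) to (30), namely \(({\varvec{{A}}}\otimes {\varvec{{B}}})\cdot ({\varvec{{I}}}\diamond {\varvec{{I}}}) = {\varvec{{A}}}\diamond {\varvec{{B}}}\), can be verified on a small example. The sketch below uses plain Python lists; the toy sizes, the integer entries of the channel factor, and the helper names are assumptions chosen for illustration.

```python
def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def kron(A, B):
    """Kronecker product of two matrices stored as lists of rows."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def khatri_rao(A, B):
    """Column-wise Kronecker product (A and B have the same number of columns)."""
    return [[a * b for a, b in zip(row_a, row_b)] for row_a in A for row_b in B]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

# Toy sizes (assumed): N = 2 subcarriers, M_T = 2 transmit and M_R = 2 receive antennas.
N, M_T, M_R = 2, 2, 2
R = N * M_T
H_tilde = [[1, 2, 3, 4], [5, 6, 7, 8]]      # M_R x N*M_T, arbitrary values
B = kron([[1] * M_T], identity(N))          # 1^T (x) I_N, size N x N*M_T

J = khatri_rao(identity(R), identity(R))    # selection matrix, size R^2 x R
kron_times_J = matmul(kron(H_tilde, B), J)  # Kronecker product followed by selection
kr = khatri_rao(H_tilde, B)                 # right-hand side of (30)
selection_holds = kron_times_J == kr
```

Column r of the selection matrix picks the column of the Kronecker product in which both factors contribute their r-th column, which is exactly the r-th column of the Khatri–Rao product.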

Moreover, the generalized unfolding \(\left[ \tilde{{\varvec{{{\mathcal {H}}}}}} \right] _{([1,3],[4,2])}\) can also be derived directly from Eq. (28). However, simplifying the final result is not straightforward because N is the faster rising index along the columns of the factor matrix \(\tilde{{\varvec{{H}}}}\) in Eq. (28), whereas \(M_{{\mathrm{T}}}\) varies faster than N along the columns of the generalized unfolding \(\left[ \tilde{{\varvec{{{\mathcal {H}}}}}} \right] _{([1,3],[4,2])}\). Therefore, we derive this generalized unfolding by means of a permutation matrix \({\varvec{{P}}} \in {\mathbb {R}}^{NM_{{\mathrm{T}}} \times M_{{\mathrm{T}}}N}\). The permutation matrix \({\varvec{{P}}}\) reorders the columns such that the faster increasing index is \(M_{{\mathrm{T}}}\) instead of N and is defined through the relation \([\tilde{{\varvec{{{\mathcal {H}}}}}}]_{([1,3],[4,2])}=[\tilde{{\varvec{{{\mathcal {H}}}}}}]_{([1,3],[2,4])}\cdot {\varvec{{P}}}\). Hence,

$$\begin{aligned}{}[\tilde{{\varvec{{{\mathcal {H}}}}}}]_{([1,3],[4,2])}= \left[ \tilde{{\varvec{{H}}}}\diamond ({\varvec{{1}}}_{M_{{\mathrm{T}}}}^{\mathrm{T}}\otimes {\varvec{{I}}}_{N})\right] \cdot {\varvec{{P}}}. \end{aligned}$$
(31)

Considering that the permutation matrix \({\varvec{{P}}}\) reorders the columns in Eq. (31) and the Khatri–Rao product is a column-wise operator (the Khatri–Rao product is the column-wise Kronecker product), the following equality holds

$$\begin{aligned}{}[\tilde{{\varvec{{{\mathcal {H}}}}}}]_{([1,3],[4,2])}&= \left[ \tilde{{\varvec{{H}}}}\diamond ({\varvec{{1}}}_{M_{{\mathrm{T}}}}^{\mathrm{T}}\otimes {\varvec{{I}}}_{N})\right] \cdot {\varvec{{P}}} \\&= \left[ \tilde{{\varvec{{H}}}}\cdot {\varvec{{P}}}\right] \diamond \left[ ({\varvec{{1}}}_{M_{{\mathrm{T}}}}^{\mathrm{T}}\otimes {\varvec{{I}}}_{N})\cdot {\varvec{{P}}}\right] . \end{aligned}$$

Finally, using \(({\varvec{{1}}}_{M_{{\mathrm{T}}}}^{\mathrm{T}}\otimes {\varvec{{I}}}_{N})\cdot {\varvec{{P}}}= ({\varvec{{I}}}_{N}\otimes {\varvec{{1}}}_{M_{{\mathrm{T}}}}^{\mathrm{T}})\) and defining \(\bar{{\varvec{{H}}}}=\tilde{{\varvec{{H}}}}\cdot {\varvec{{P}}}\), we get \([\tilde{{\varvec{{{\mathcal {H}}}}}}]_{([1,3],[4,2])} = \bar{{\varvec{{H}}}}\diamond ({\varvec{{I}}}_{N}\otimes {\varvec{{1}}}_{M_{{\mathrm{T}}}}^{\mathrm{T}}).\)
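The two properties of the permutation matrix used above can be checked numerically. The following sketch is a minimal illustration under assumed toy sizes; the function name `column_permutation` and the test matrix are hypothetical. It constructs \({\varvec{{P}}}\) such that column \(m N + n\) (N fastest) is moved to position \(n M_{{\mathrm{T}}} + m\) (\(M_{{\mathrm{T}}}\) fastest), using zero-based indices.

```python
def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def kron(A, B):
    """Kronecker product of two matrices stored as lists of rows."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def khatri_rao(A, B):
    """Column-wise Kronecker product (A and B have the same number of columns)."""
    return [[a * b for a, b in zip(row_a, row_b)] for row_a in A for row_b in B]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def column_permutation(N, M_T):
    """P such that X @ P moves column m*N + n of X to position n*M_T + m."""
    size = N * M_T
    P = [[0] * size for _ in range(size)]
    for m in range(M_T):
        for n in range(N):
            P[m * N + n][n * M_T + m] = 1
    return P

# Toy sizes (assumed for illustration).
N, M_T = 3, 2
ones_row = [[1] * M_T]
P = column_permutation(N, M_T)

# (1^T (x) I_N) P = I_N (x) 1^T
B = kron(ones_row, identity(N))
first_property = matmul(B, P) == kron(identity(N), ones_row)

# (H (Khatri-Rao) B) P = (H P) (Khatri-Rao) (B P), since P only reorders columns.
H_tilde = [[10 * r + c for c in range(N * M_T)] for r in range(2)]
second_property = matmul(khatri_rao(H_tilde, B), P) == khatri_rao(matmul(H_tilde, P), matmul(B, P))
```

Both checks rely only on the fact that a column permutation acts on each column independently, which is why it commutes with the column-wise Khatri–Rao product.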

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Naskovska, K., Sokal, B., de Almeida, A.L.F. et al. Using tensor contractions to derive the structure of slice-wise multiplications of tensors with applications to space–time Khatri–Rao coding for MIMO-OFDM systems. EURASIP J. Adv. Signal Process. 2022, 109 (2022). https://doi.org/10.1186/s13634-022-00937-5

  • DOI: https://doi.org/10.1186/s13634-022-00937-5

Keywords