CP decomposition approach to blind separation for DS-CDMA system using a new performance index
EURASIP Journal on Advances in Signal Processing volume 2014, Article number: 128 (2014)
Abstract
In this paper, we present a canonical polyadic (CP) tensor decomposition isolating the scaling matrix. This has two major implications: (i) the problem conditioning shows up explicitly and could be controlled through a constraint on the so-called coherences and (ii) a performance criterion concerning the factor matrices can be exactly calculated and is more realistic than performance metrics used in the literature. Two new algorithms optimizing the CP decomposition based on gradient descent are proposed. This decomposition is illustrated by an application to direct-sequence code division multiple access (DS-CDMA) systems; computer simulations are provided and demonstrate the good behavior of these algorithms, compared to others in the literature.
1 Introduction
Blind source separation consists in estimating unknown signals observed from their mixture without knowing any information about them, except mild properties such as their independence. Early work on blind source separation was initiated by Jutten and Hérault [1, 2] in the case of an instantaneous mixture. More recently, the use of multilinear algebra methods has attracted attention in several areas such as data mining, signal processing, and particularly wireless communication systems, among others. Wireless communication data can sometimes be viewed as components of a higher-order tensor (order strictly larger than 2). Solving the source separation problem then means finding a decomposition of this tensor and determining its parameters. One of the most popular tensor decompositions is the canonical polyadic decomposition (CP), also known as parallel factor analysis (PARAFAC), which can be seen as an analog of the matrix singular value decomposition (SVD), since it decomposes the tensor into a sum of rank-one components [3–5]. This decomposition has been exploited and generalized in several works for solving different signal processing problems [6, 7], such as multi-array, multi-sensor signal processing. The interest of the CP decomposition lies in its uniqueness under certain conditions. Typical algorithms for finding the CP components include alternating least squares (ALS) and descent algorithms [8, 9], which do not isolate the scaling factor matrix. Herein, we propose two new optimization algorithms for CP tensor decomposition, which isolate the scaling matrix in the optimization process and offer the possibility to monitor the conditioning.
It is well known that loading matrices are identified up to column scaling. This indeterminacy is complicated to take into account, given that the product of all scaling matrices must equal the identity. For this reason, only approximate performance indices have been used so far, ignoring this last constraint. One can, however, ask whether it is possible to calculate the exact performance index: this is our second contribution. The present paper develops preliminary results that appeared in [10] and includes performances obtained in the frame of direct-sequence code division multiple access (DS-CDMA) blind multiuser detection and estimation.
The rest of this paper is organized as follows. Section 2 presents notation, definitions, and properties of third-order tensors, and then states the exact CP decomposition problem. In Section 3, the low-rank approximation is formulated; existence and uniqueness of this decomposition are also investigated. ALS and the two proposed algorithms are presented in Section 4. Section 5 is dedicated to the new performance criterion, with a focus on an exact performance index calculation. In Section 6, we show the usefulness of our algorithms and the performances obtained. An application of these algorithms to CDMA transmission is then provided to illustrate their effectiveness.
2 Notation and preliminaries
2.1 Notations and definitions
Let us first introduce some essential notation. Scalars are denoted by lowercase letters, e.g., a. Vectors are denoted by boldface lowercase letters, e.g., a; matrices are denoted by boldface capital letters, e.g., A. Higher-order tensors are denoted by boldface Euler script letters, e.g., $\mathcal{T}$. The p-th column of a matrix A is denoted a_{ p }, the (i,j) entry of a matrix A is denoted by A_{ ij }, and element (i,j,k) of a third-order tensor is denoted by T_{ ijk }. 1 will represent a vector containing ones, and I the identity matrix.
Definition 2.1
The scalar product of two same-sized tensors $\mathcal{X},\mathcal{Y}\in {\mathbb{C}}^{{I}_{1}\times {I}_{2}\times \cdots \times {I}_{N}}$ is defined as:
$$\langle \mathcal{X},\mathcal{Y}\rangle = \sum_{i_1,\cdots,i_N} X_{i_1\cdots i_N}^{*}\, Y_{i_1\cdots i_N},$$
where $X_{i_1\cdots i_N}$ denotes the $(i_1,\cdots,i_N)$ entry of the $N$th-order tensor $\mathcal{X}$.
Definition 2.2.
The outer (tensor) product of two tensors $\mathcal{X}\in {\mathbb{C}}^{{I}_{1}\times {I}_{2}\times \cdots \times {I}_{N}}$ and $\mathcal{Y}\in {\mathbb{C}}^{{J}_{1}\times {J}_{2}\times \cdots \times {J}_{M}}$, of orders N and M respectively, is denoted by $\mathcal{Z}=\mathcal{X}\otimes\mathcal{Y}\in {\mathbb{C}}^{{I}_{1}\times \cdots \times {I}_{N}\times {J}_{1}\times \cdots \times {J}_{M}}$ and defined by:
$$Z_{i_1\cdots i_N\, j_1\cdots j_M} = X_{i_1\cdots i_N}\, Y_{j_1\cdots j_M}.$$
The symbol ⊗ represents the tensor outer product. The outer product of two tensors is another tensor, whose order is the sum of the orders of the two former tensors. This is a generalization of the outer product of two vectors, which itself yields a matrix (second-order tensor). The outer product of three vectors $\mathbf{a}\in {\mathbb{C}}^{I}$, $\mathbf{b}\in {\mathbb{C}}^{J}$, and $\mathbf{c}\in {\mathbb{C}}^{K}$ yields a third-order decomposable tensor $\mathcal{Z}=\mathbf{a}\otimes\mathbf{b}\otimes\mathbf{c}\in {\mathbb{C}}^{I\times J\times K}$, where Z_{ ijk }=a_{ i }b_{ j }c_{ k }.
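As an illustration (not from the paper; names are ours), the outer product of three vectors can be sketched in NumPy:

```python
import numpy as np

# Outer product of three vectors a, b, c: Z[i, j, k] = a[i] * b[j] * c[k],
# i.e., a third-order decomposable (rank-1) tensor.
rng = np.random.default_rng(0)
a = rng.standard_normal(3) + 1j * rng.standard_normal(3)   # a in C^I
b = rng.standard_normal(4) + 1j * rng.standard_normal(4)   # b in C^J
c = rng.standard_normal(5) + 1j * rng.standard_normal(5)   # c in C^K

Z = np.einsum('i,j,k->ijk', a, b, c)   # shape (I, J, K)
```

The `einsum` subscripts mirror the index equation Z_{ijk} = a_i b_j c_k directly.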
Definition 2.3.
The rank of an arbitrary tensor $\mathcal{T}\in {\mathbb{C}}^{{I}_{1}\times {I}_{2}\times \dots \times {I}_{N}}$, denoted by $R=\operatorname{rank}(\mathcal{T})$, is the minimal number of rank-1 tensors whose linear combination yields $\mathcal{T}$.
Decomposable tensors thus have a rank equal to 1.
Definition 2.4.
The Kruskal rank, or k-rank, of a matrix is the largest number j such that every set of j columns is linearly independent.
Definition 2.5.
The Frobenius norm of a tensor $\mathcal{T}\in {\mathbb{C}}^{{I}_{1}\times {I}_{2}\times \dots \times {I}_{N}}$ is defined as:
$$\|\mathcal{T}\| = \sqrt{\langle\mathcal{T},\mathcal{T}\rangle} = \left(\sum_{i_1,\dots,i_N} \left|T_{i_1\cdots i_N}\right|^2\right)^{1/2}.$$
Definition 2.6.
The Khatri-Rao product of two matrices with the same number of columns, $\mathbf{A}=\left[{\mathbf{a}}_{1},{\mathbf{a}}_{2},\cdots,{\mathbf{a}}_{F}\right]\in {\mathbb{C}}^{I\times F}$ and $\mathbf{B}=\left[{\mathbf{b}}_{1},{\mathbf{b}}_{2},\cdots,{\mathbf{b}}_{F}\right]\in {\mathbb{C}}^{J\times F}$, is defined as the column-wise Kronecker product:
$$\mathbf{A}\odot\mathbf{B} = \left[\mathbf{a}_1\boxtimes\mathbf{b}_1,\ \mathbf{a}_2\boxtimes\mathbf{b}_2,\ \cdots,\ \mathbf{a}_F\boxtimes\mathbf{b}_F\right]\in {\mathbb{C}}^{IJ\times F},$$
where ⊠ is the Kronecker product.
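A minimal NumPy sketch of this definition (function name ours, not the paper's):

```python
import numpy as np

def khatri_rao(A, B):
    """Column-wise Kronecker product of A (I x F) and B (J x F) -> (I*J) x F."""
    I, F = A.shape
    J, F2 = B.shape
    assert F == F2, "A and B must have the same number of columns"
    # stack a[:, f] (x) b[:, f] for each column f
    return np.einsum('if,jf->ijf', A, B).reshape(I * J, F)

A = np.arange(6).reshape(3, 2).astype(float)
B = np.arange(8).reshape(4, 2).astype(float)
K = khatri_rao(A, B)   # shape (12, 2); column f equals kron(A[:, f], B[:, f])
```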
Definition 2.7.
The coherence of a collection V={v_{1},⋯, v_{ r }} of unit-norm vectors is defined as the maximal value of the modulus of the cross scalar products. In other words, the coherence of the collection V is defined as:
$$\mu(V) = \max_{p\neq q} \left|\mathbf{v}_p^H \mathbf{v}_q\right|.$$
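This quantity is straightforward to compute from the Gram matrix of the columns; a short sketch with illustrative names:

```python
import numpy as np

def coherence(V):
    """Coherence of the unit-norm columns of V: max over p != q of |v_p^H v_q|."""
    G = V.conj().T @ V                 # Gram matrix of the columns
    off = G - np.diag(np.diag(G))      # zero out the diagonal
    return np.abs(off).max()

rng = np.random.default_rng(1)
V = rng.standard_normal((8, 3))
V /= np.linalg.norm(V, axis=0)         # normalize columns to unit norm
mu = coherence(V)                      # 0 <= mu <= 1; mu = 0 iff columns orthonormal
```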
Definition 2.8.
Let $\mathbf{A}\in {\mathbb{C}}^{I\times J}$; then $\text{vec}\left\{\mathbf{A}\right\}\in {\mathbb{C}}^{K}$ denotes the column vector obtained by stacking the columns of A:
$$\text{vec}\left\{\mathbf{A}\right\}_{i+(j-1)I} = A_{ij},$$
where K = IJ.
2.2 Preliminaries
A tensor of order d is an object defined on a product of d linear spaces. Once the bases of these spaces are fixed, a d-th order tensor can be represented by a d-way array (a hypermatrix) of coordinates [4]. The order of a tensor thus corresponds to the number of indices of the associated array. We are interested in decomposing a third-order tensor $\mathcal{T}$ as:
$$\mathcal{T} = \sum_{r=1}^{R} \lambda_r\, \mathcal{D}(r),$$
where $\mathcal{D}\left(r\right)$ are decomposable tensors, that is, $\mathcal{D}\left(r\right)=\mathbf{a}_r\otimes\mathbf{b}_r\otimes\mathbf{c}_r$. Denote by ()^{T} matrix transposition; λ_{ r } are real positive scaling factors, λ= [λ_{1},⋯,λ_{ R }]^{T}, and R is the tensor rank. Vectors a_{ r } (resp. b_{ r } and c_{ r }) live in a linear space of dimension I (resp. dimension J and K). Equivalently, decomposition (7) can be rewritten as a function of array entries:
$$T_{ijk} = \sum_{r=1}^{R} \lambda_r\, A_{ir}\, B_{jr}\, C_{kr},$$
where A_{ ir } (resp. B_{ jr } and C_{ kr }) denote the entries of vector a_{ r } (resp. b_{ r } and c_{ r }). The above decomposition is called the canonical polyadic (CP) decomposition of tensor $\mathcal{T}$. The model (7) can be written in a compact form using the Khatri-Rao product, as:
$$\mathbf{T}_1^{I,KJ} = \mathbf{A}\,\boldsymbol{\Lambda}\,(\mathbf{C}\odot\mathbf{B})^{T},$$
where ${\mathbf{\text{T}}}_{1}^{I,\phantom{\rule{0.3em}{0ex}}\mathit{\text{KJ}}}\left(\text{resp.}\phantom{\rule{2.77626pt}{0ex}}{\mathbf{\text{T}}}_{2}^{J,\phantom{\rule{0.3em}{0ex}}\mathit{\text{KI}}}\phantom{\rule{2.77626pt}{0ex}}\text{and}\phantom{\rule{2.77626pt}{0ex}}{\mathbf{\text{T}}}_{3}^{K,\phantom{\rule{0.3em}{0ex}}\mathit{\text{JI}}}\right)$ is the matrix of size I × KJ (resp. J × KI and K × JI) obtained by unfolding the array of size I × J × K in the first mode (resp. the second and third modes), and Λ is the R × R diagonal matrix defined as Λ = Diag{λ_{1},…, λ_{ R }}; see [5] for further details on matrix unfoldings.
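The entrywise model and the mode-1 unfolding identity can be checked numerically. The sketch below (ours, not the paper's code) assumes one particular index-ordering convention for the unfolding:

```python
import numpy as np

def khatri_rao(X, Y):
    # column-wise Kronecker product: (I x F), (J x F) -> (I*J x F)
    I, F = X.shape
    J, _ = Y.shape
    return np.einsum('if,jf->ijf', X, Y).reshape(I * J, F)

rng = np.random.default_rng(2)
I, J, K, R = 4, 5, 6, 3
A = rng.standard_normal((I, R)); A /= np.linalg.norm(A, axis=0)
B = rng.standard_normal((J, R)); B /= np.linalg.norm(B, axis=0)
C = rng.standard_normal((K, R)); C /= np.linalg.norm(C, axis=0)
lam = rng.uniform(1.0, 2.0, R)                # positive scaling factors

# Entrywise CP model: T[i,j,k] = sum_r lam[r] A[i,r] B[j,r] C[k,r]
T = np.einsum('r,ir,jr,kr->ijk', lam, A, B, C)

# Mode-1 unfolding T1 of size I x KJ, assuming the (k, j) column ordering:
T1 = T.transpose(0, 2, 1).reshape(I, K * J)
T1_model = A @ np.diag(lam) @ khatri_rao(C, B).T
```

With this ordering, `T1` and `T1_model` coincide, which is the compact form above.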
The explicit writing of decomposable tensors as given in (8) is subject to scale indeterminacies. In the tensor literature, optimization of the CP decomposition (8) has been carried out without isolating the scaling factor Λ, which is generally included in one of the loading matrices, so that Λ = I. In [5], Kolda and Bader proposed to reduce the indeterminacies by normalizing the vectors and storing the norms in Λ. Our first proposal is to pull the factors λ_{ r } outside the product and calculate the optimal value of the scaling factor, which makes it possible to monitor the conditioning of the problem. Scaling indeterminacies are then reduced to unit-modulus factors but are not completely fixed, hence the difficulty in estimating the identification error of the loading matrices A, B, and C. Our second proposal (Section 5) is to calculate the 3R complex phases (which reduce to signs in the real case).
3 Existence and uniqueness
The goal is to identify all parameters in the right-hand side of (8), given the whole array $\mathcal{T}$. According to existing results [11–13], a third-order tensor (i.e., a three-way array) of rank R can be uniquely represented as a sum of R rank-1 tensors, under certain conditions. Kruskal demonstrated that condition (9) is sufficient for uniqueness of the CP decomposition [13], where k_{ A } is the k-rank of A. This means that matrices A, B, and C are unique up to permutation and (complex) scaling of their columns, under Kruskal's condition:
$$k_A + k_B + k_C \ge 2R + 2.$$
For uniqueness, Harshman showed that it is sufficient to have A and B of full rank, and C of k-rank ≥ 2 [3]. When 1 < R ≤ 2, the Kruskal and Harshman conditions are equivalent. For R > 2, Kruskal's condition may be satisfied even when Harshman's is not, and this condition is claimed to be only sufficient for R > 3 [14]. However, observations are actually corrupted by noise, so that (8) does not hold exactly.
3.1 Lowrank approximation
In practice, the exact CP decomposition always exists, but possibly with a very large rank. Hence, it may not be physically meaningful and, additionally, it is generally not unique. It is consequently preferred to fit a multilinear model of lower rank, $R<\operatorname{rank}\,\mathcal{T}$, fixed in advance, so that we have to deal with an approximation problem. To estimate the parameters of the decomposition, we need to minimize the following cost function:
$$\Upsilon(\mathbf{A},\mathbf{B},\mathbf{C};\boldsymbol{\Lambda}) = \left\|\mathcal{T} - (\mathbf{A},\mathbf{B},\mathbf{C})\cdot\boldsymbol{\Lambda}\right\|^{2}.$$
By convention, (A,B,C)·Λ denotes the rank-R tensor of coordinates $\sum _{r=1}^{R}{\lambda}_{r}{A}_{\mathit{\text{ir}}}{B}_{\mathit{\text{jr}}}{C}_{\mathit{\text{kr}}}$. Minimizing error (10) means finding the best rank-R approximation of $\mathcal{T}$ and its CP decomposition. The cost function (10) can also be written in three equivalent compact forms with respect to the three loading matrices:
$$\Upsilon = \left\|\mathbf{T}_1^{I,KJ}-\mathbf{A}\boldsymbol{\Lambda}(\mathbf{C}\odot\mathbf{B})^{T}\right\|^{2} = \left\|\mathbf{T}_2^{J,KI}-\mathbf{B}\boldsymbol{\Lambda}(\mathbf{C}\odot\mathbf{A})^{T}\right\|^{2} = \left\|\mathbf{T}_3^{K,JI}-\mathbf{C}\boldsymbol{\Lambda}(\mathbf{B}\odot\mathbf{A})^{T}\right\|^{2}.$$
3.2 Conditioning of the problem
Assuming that the matrices A, B, and C are given, we can calculate the optimal value of the scaling factor Λ. This is done by expanding the Frobenius norm in (10), which is a quadratic form in the entries of Λ, and canceling the gradient with respect to λ (see details in Appendix 1). The following linear system is then obtained:
$$\mathbf{G}\,\boldsymbol{\lambda} = \mathbf{f},$$
where f is the R-dimensional vector defined by the contraction ${f}_{r}=\sum _{\mathit{\text{ijk}}}{T}_{\mathit{\text{ijk}}}{A}_{\mathit{\text{ir}}}^{\ast}{B}_{\mathit{\text{jr}}}^{\ast}{C}_{\mathit{\text{kr}}}^{\ast}$, and G represents the R × R Gram matrix defined by:
$${G}_{\mathit{\text{pq}}}=\sum _{\mathit{\text{ijk}}}{A}_{\mathit{\text{ip}}}{B}_{\mathit{\text{jp}}}{C}_{\mathit{\text{kp}}}{A}_{\mathit{\text{iq}}}^{\ast}{B}_{\mathit{\text{jq}}}^{\ast}{C}_{\mathit{\text{kq}}}^{\ast}.$$
In view of matrix G, we can see that coherences play a role in the conditioning of the problem. From Equations 11 and 12, and since the diagonal entries of G equal 1, it is clear that imposing cross scalar products of the form ${\mathbf{a}}_{p}^{H}{\mathbf{a}}_{q}$ to have a modulus strictly smaller than 1 is more likely to lead to acceptable conditioning. Also note that scalar products do not appear individually in (12) but only through their products, since the entries of G can also be written as ${G}_{\mathit{\text{pq}}}={\mathbf{a}}_{p}^{H}{\mathbf{a}}_{q}\phantom{\rule{0.3em}{0ex}}{\mathbf{b}}_{p}^{H}{\mathbf{b}}_{q}\phantom{\rule{0.3em}{0ex}}{\mathbf{c}}_{p}^{H}{\mathbf{c}}_{q}$. This statement has deeper implications, particularly for the existence and uniqueness of the solution to Problem (10), as subsequently elaborated.
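The optimal scaling step can be verified numerically; a real-valued sketch (illustrative names, not the paper's code) that recovers known scaling factors from an exactly rank-R tensor:

```python
import numpy as np

rng = np.random.default_rng(3)
I, J, K, R = 4, 5, 6, 3
A = rng.standard_normal((I, R)); A /= np.linalg.norm(A, axis=0)
B = rng.standard_normal((J, R)); B /= np.linalg.norm(B, axis=0)
C = rng.standard_normal((K, R)); C /= np.linalg.norm(C, axis=0)
lam_true = rng.uniform(1.0, 2.0, R)
T = np.einsum('r,ir,jr,kr->ijk', lam_true, A, B, C)

# G_pq = (a_p^T a_q)(b_p^T b_q)(c_p^T c_q): entrywise product of the three Grams;
# its diagonal entries are 1 because the columns have unit norm.
G = (A.T @ A) * (B.T @ B) * (C.T @ C)
f = np.einsum('ijk,ir,jr,kr->r', T, A, B, C)   # f_r = <T, a_r o b_r o c_r>
lam = np.linalg.solve(G, f)                     # solve G lam = f
```

Since the data are exactly rank-R here, solving the system returns the true scaling factors.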
3.3 Existence
According to the results in [15, 16], the infimum of (10) may not be reachable. In fact, the set of tensors of rank at most ξ is not closed if ξ > 1. Examples of this lack of closedness have been provided in the literature [15, 16], which suffice to prove the statement. In other words, it may happen that, for a given tensor and any rank-r approximation of it, there always exists another, better rank-r approximation.
This is the reason why the authors proposed the constraint below, which ensures the existence of a minimum. Define the three coherences μ_{ A }, μ_{ B }, and μ_{ C } associated with the matrices A, B, and C, respectively. It has indeed been shown in [15, 17] that, under the constraint:
the infimum of (10) is reached. It may be seen that this condition already gives a quantitative bound on the conditioning of (11), because the coherences bound the off-diagonal entries of matrix G, which has ones on its diagonal.
3.4 Uniqueness
The uniqueness of the tensor decomposition can be ensured by using a sufficient condition based on Kruskal’s theorem (9), previously mentioned.
According to the lemma reported in [17, 18], the inequality ${k}_{A}\ge \frac{1}{{\mu}_{A}}$ holds between the k-rank and the coherence, as long as k_{ A } is strictly smaller than the column rank of A. Substituting this inequality into Equation 9 leads to the following sufficient uniqueness condition:
$$\frac{1}{\mu_A}+\frac{1}{\mu_B}+\frac{1}{\mu_C} \ge 2R+2.$$
4 Optimization for CP decomposition
In Section 3, we presented CP for three-way tensors. Various optimization algorithms exist to compute the CP decomposition without constraint, such as ALS or descent algorithms [7, 8, 19]. We now present optimization algorithms to compute the CP decomposition (10) under the constraint of unit-norm columns for the loading matrices.
4.1 Alternating least squares algorithm
The ALS algorithm was proposed for CP computation by Carroll and Chang in [20] and Harshman in [3], and it remains the workhorse algorithm today, mainly owing to its ease of implementation [21]. ALS is hence the classical solution to minimize the cost function (10), despite its lack of a convergence proof. This iterative algorithm alternates among the estimations of matrices A, B, and C.
The principle of the ALS algorithm is to convert a nonlinear optimization problem into three independent linear least squares (LS) problems. In the first step, one of the three matrices, say A, is updated while the other two (B and C) are fixed to their values obtained in previous estimation steps. The estimate of A is given by:
$$\hat{\mathbf{A}} = \mathbf{T}_1^{I,KJ}\left[(\mathbf{C}\odot\mathbf{B})^{T}\right]^{\dagger},$$
where $\mathbf{T}_1^{I,KJ}$ is the unfolding matrix of size I × KJ defined in Section 2.2, and ()^{†} is the Moore-Penrose pseudo-inverse. By symmetry, the expressions are similar for $\hat{\mathbf{B}}$ and $\hat{\mathbf{C}}$.
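One ALS sweep can be sketched as follows (our illustrative code, with an assumed unfolding convention; the scaling matrix is absorbed into the factors, as in classical ALS):

```python
import numpy as np

def khatri_rao(X, Y):
    # column-wise Kronecker product: (I x F), (J x F) -> (I*J x F)
    I, F = X.shape
    J, _ = Y.shape
    return np.einsum('if,jf->ijf', X, Y).reshape(I * J, F)

def als_step(T, A, B, C):
    """One ALS sweep for T of shape (I, J, K): three linear LS updates."""
    I, J, K = T.shape
    T1 = T.transpose(0, 2, 1).reshape(I, K * J)   # mode-1 unfolding, I x KJ
    A = T1 @ np.linalg.pinv(khatri_rao(C, B).T)
    T2 = T.transpose(1, 2, 0).reshape(J, K * I)   # mode-2 unfolding, J x KI
    B = T2 @ np.linalg.pinv(khatri_rao(C, A).T)
    T3 = T.transpose(2, 1, 0).reshape(K, J * I)   # mode-3 unfolding, K x JI
    C = T3 @ np.linalg.pinv(khatri_rao(B, A).T)
    return A, B, C

# On an exactly rank-R tensor, a sweep from consistent factors reproduces T.
rng = np.random.default_rng(4)
I, J, K, R = 4, 5, 6, 2
A0 = rng.standard_normal((I, R))
B0 = rng.standard_normal((J, R))
C0 = rng.standard_normal((K, R))
T = np.einsum('ir,jr,kr->ijk', A0, B0, C0)
A1, B1, C1 = als_step(T, A0, B0, C0)
```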
4.2 Proposed algorithms
Our optimization problem consists in minimizing the squared error Υ under a collection of 3R constraints, namely:
$$\|\mathbf{a}_r\| = \|\mathbf{b}_r\| = \|\mathbf{c}_r\| = 1, \qquad 1\le r\le R.$$
Therefore, we need to find three matrices A, B, and C with unit-norm columns which minimize (10) under (16). Stack these three matrices in an (I+J+K) × R matrix denoted by X. For convenience, the objective can then also be written Υ(X,Λ).
The computation of the loading matrices is performed by minimizing the quadratic cost function (10). One generates a series of iterates X(k) = [A(k)^{T}, B(k)^{T}, C(k)^{T}]^{T}, $k\in \mathbb{N}$, with initial value X(0) arbitrarily chosen. Generally, the algorithm consists of choosing, at the k-th iteration, a point X(k + 1) along a direction D(k) lying in the half-space defined by the gradient of the objective function Υ, i.e., a descent direction verifying [22]:
$$\mathrm{Re}\left\{\operatorname{trace}\left(\nabla\Upsilon^{H}\,\mathbf{D}(k)\right)\right\} < 0.$$
One possibility is to choose the direction opposite to the gradient:
$$\mathbf{D}(k) = -\nabla\Upsilon\left(\mathbf{X}(k)\right).$$
The gradient components ∇Υ_{ A } (size I × R), ∇Υ_{ B } (J × R), and ∇Υ_{ C } (K × R) can be stacked into a single matrix of size (I+J+K) × R:
$$\nabla\Upsilon = \left[\nabla\Upsilon_{A}^{T},\ \nabla\Upsilon_{B}^{T},\ \nabla\Upsilon_{C}^{T}\right]^{T}.$$
Since the objective function (10) is real but its arguments are complex, its gradient can be computed in the sense of [23, 24]. Using the quadratic form proposed in [23], the objective function can be expanded into:
$$\Upsilon = \|\mathcal{T}\|^{2} - 2\,\mathrm{Re}\left\{\operatorname{trace}\left(\mathbf{N}^{H}\mathbf{A}\right)\right\} + \operatorname{trace}\left(\mathbf{A}\mathbf{M}\mathbf{A}^{H}\right),$$
where ${M}_{\mathit{\text{pq}}}=\sum _{\mathit{\text{jk}}}{\lambda}_{p}{\lambda}_{q}^{\ast}{B}_{\mathit{\text{jp}}}{B}_{\mathit{\text{jq}}}^{\ast}{C}_{\mathit{\text{kp}}}{C}_{\mathit{\text{kq}}}^{\ast}$ and ${N}_{\mathit{\text{ip}}}=\sum _{\mathit{\text{jk}}}{T}_{\mathit{\text{ijk}}}{\lambda}_{p}^{\ast}{B}_{\mathit{\text{jp}}}^{\ast}{C}_{\mathit{\text{kp}}}^{\ast}$. Thus, the gradient of Υ with respect to A is:
$$\nabla\Upsilon_{\mathbf{A}} = \mathbf{A}\mathbf{M} - \mathbf{N},$$
where M is of size R × R and N of size I × R. The gradients of Υ with respect to B and C are similar, provided M and N are redefined accordingly (for the gradients with respect to B and C, N has size J × R and K × R, respectively, while M is always R × R).
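A real-valued sketch of this gradient (illustrative names; in the real case, the plain derivative is twice the complex-gradient convention used above, but both vanish at the same points):

```python
import numpy as np

def grad_A(T, A, B, C, lam):
    """Gradient of the CP cost with respect to A: A M - N (real case)."""
    M = np.outer(lam, lam) * (B.T @ B) * (C.T @ C)      # R x R
    N = np.einsum('ijk,p,jp,kp->ip', T, lam, B, C)      # I x R
    return A @ M - N

# At an exact fit, the gradient vanishes.
rng = np.random.default_rng(6)
I, J, K, R = 4, 5, 6, 3
A = rng.standard_normal((I, R)); A /= np.linalg.norm(A, axis=0)
B = rng.standard_normal((J, R)); B /= np.linalg.norm(B, axis=0)
C = rng.standard_normal((K, R)); C /= np.linalg.norm(C, axis=0)
lam = rng.uniform(1.0, 2.0, R)
T = np.einsum('r,ir,jr,kr->ijk', lam, A, B, C)
g = grad_A(T, A, B, C, lam)
```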
The difficulty that arises in constrained optimization is to ensure that the move remains within the feasible set defined by the constraints. In the following subsections, we propose two versions of our algorithm, with two different ways of calculating the scale factor Λ.
Descent algorithms are also characterized by the step taken in the chosen direction. Various methods exist for step selection; the most widely used are backtracking and Armijo rules [22]. To compute the step size ℓ(k) in Algorithm 1 and Algorithm 2, we use a popular inexact line search method, simple and quite effective: the backtracking line search. It depends on two fixed constants α, β with 0 < α < 0.5 and 0 < β < 1.
Backtracking algorithm
1. Given a descent direction D for Υ, α ∈ (0, 0.5), β ∈ (0, 1).
2. Initialization: ℓ = 1.
3. while Υ(X + ℓD; Λ) > Υ(X; Λ) + αℓ∇Υ^{T}D
4.   ℓ = βℓ.
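The steps above can be sketched in code (a generic line search on a toy objective; names are ours):

```python
import numpy as np

def backtracking(cost, grad, X, D, alpha=0.3, beta=0.5):
    """Backtracking line search along a descent direction D."""
    step = 1.0
    g_dot_d = np.sum(grad * D)     # <grad, D>, negative for a descent direction
    # shrink the step until the Armijo sufficient-decrease condition holds
    while cost(X + step * D) > cost(X) + alpha * step * g_dot_d:
        step *= beta
    return step

# Toy usage on f(x) = ||x||^2 with steepest-descent direction D = -grad.
f = lambda x: np.sum(x ** 2)
x0 = np.array([3.0, -4.0])
g = 2 * x0
step = backtracking(f, g, x0, -g)
```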
4.2.1 Algorithm 1
In recent work on CP tensor decomposition, optimization of the objective function (10) has been made without explicitly considering the factor Λ. More precisely, in most contributions the scaling factor is absorbed into the loading matrices, so that Λ may be set to the identity. The first solution we propose is to use a projected gradient algorithm while calculating Λ as the product of the normalizing factors of matrices A, B, and C:
$$\boldsymbol{\Lambda} = \boldsymbol{\Lambda}_{A} \boxdot \boldsymbol{\Lambda}_{B} \boxdot \boldsymbol{\Lambda}_{C},$$
where ⊡ is the Hadamard product, Λ_{ A }=Diag{∥a_{1}∥,⋯,∥a_{ R }∥}, with similar definitions for Λ_{ B } and Λ_{ C }. This approach, which we call 'Algorithm 1', can be described by the pseudocode below:
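The normalization step of this approach can be sketched as follows (our illustrative code): column norms are pulled out of the factors and accumulated in the diagonal of Λ, leaving the CP model unchanged.

```python
import numpy as np

def normalize_factors(A, B, C):
    """Pull column norms out of A, B, C; Lambda is their entrywise product."""
    na = np.linalg.norm(A, axis=0)
    nb = np.linalg.norm(B, axis=0)
    nc = np.linalg.norm(C, axis=0)
    return A / na, B / nb, C / nc, na * nb * nc   # last output: diag of Lambda

rng = np.random.default_rng(8)
A = rng.standard_normal((4, 3))
B = rng.standard_normal((5, 3))
C = rng.standard_normal((6, 3))
An, Bn, Cn, lam = normalize_factors(A, B, C)
```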
4.2.2 Algorithm 2
The other approach is to consider Λ as an additional variable. By canceling the gradient of Υ(X,Λ) with respect to Λ, one obtains Equation 11, which can be solved for Λ when X is fixed. This gives the algorithms below.
4.2.2.0 Stopping criterion
Convergence is usually considered attained at the k-th iteration when the error between tensor $\mathcal{T}$ and the tensor reconstructed from the estimated loading matrices does not change significantly between iterations k and k + 1.
However, in the complex case, the phases of the entries of the loading matrices found at the end of the algorithm, as defined by the stopping criterion above, differ from the original ones. To remedy this problem, we propose a new performance criterion that minimizes the distance between the original and the estimated matrices. Although this criterion is not usable when the actual loading matrices are unknown, it still makes it possible to assess the performances effectively attained.
5 Performance criterion
As highlighted in Section 2, there is always an indeterminacy in the representation of decomposable tensors, characterized in the CP decomposition by 3R complex numbers of unit modulus. In order to better understand this problem, let a, b, and c denote the r-th columns of matrices A, B, and C, respectively, with 1 ≤ r ≤ R. Furthermore, let $\widehat{\mathbf{a}}$, $\widehat{\mathbf{b}}$, and $\widehat{\mathbf{c}}$ denote the corresponding columns of the estimated matrices entering the CP decomposition. We seek to minimize the distance:
$$\delta = \min_{\varphi,\psi,\chi}\left\{\left\|\mathbf{a}-e^{\jmath\varphi}\widehat{\mathbf{a}}\right\|^{2}+\left\|\mathbf{b}-e^{\jmath\psi}\widehat{\mathbf{b}}\right\|^{2}+\left\|\mathbf{c}-e^{\jmath\chi}\widehat{\mathbf{c}}\right\|^{2}\right\}.$$
In the literature, only approximate performance indices have been used so far, neglecting the relation between the angles φ, ψ, and χ given by:
$$\exp\left(\jmath(\varphi+\psi+\chi)\right) = 1.$$
Our contribution herein consists in finding the exact minimum distance (23) under this angular constraint, by calculating the 3R optimal phases affecting the columns of the estimated loading matrices.
Setting the derivatives of Equation 23 with respect to φ and ψ to zero yields a system of two equations. Finding the 3R phases then amounts to solving, for each fixed set of three columns:
where x = φ − α and y = ψ − β. After some trigonometric manipulations described in Appendix 2, the solutions can be obtained by rooting a polynomial of degree 6 in a single variable. By substituting the admissible values of φ into system (24), the corresponding values of ψ are obtained. The calculation of the minimum distance (23) is carried out for all possible permutations. We end up with the following performance criterion:
$$E = \min_{\sigma\in\Pi}\ \sum_{r=1}^{R} \delta\left(\mathbf{x}_r;\ \widehat{\mathbf{x}}_{\sigma(r)}\right),$$
where Π is the set of permutations of {1,2,⋯, R}. When the permutation acts in too large a dimension, greedy versions can be used to limit the exhaustive search over the permutation set.
Denote by π(i), 1 ≤ i ≤ R!, the permutations in Π. The overall algorithm to compute the performance criterion is summarized as follows:

1. For 1 ≤ i ≤ R! do
2. Calculate the 3R optimal phases affecting the columns of the estimated loading matrices:
(a) Permute the columns of the three estimated matrices according to the permutation π(i): ${\hat{\mathbf{A}}}_{\pi \left(i\right)}$, ${\hat{\mathbf{B}}}_{\pi \left(i\right)}$, and ${\hat{\mathbf{C}}}_{\pi \left(i\right)}$;
(b) For each r-th column of the loading matrices and estimated matrices, do:
- Set x = φ − α and y = ψ − β, and solve the polynomial of degree 6 in the single variable x (equivalently φ, since α is fixed):
$$c_0 + c_1\cos(2x) + c_2\cos^{2}(2x) + c_3\cos^{3}(2x) + c_4\cos^{4}(2x) + c_5\cos^{5}(2x) + c_6\cos^{6}(2x) = 0;$$
- Substitute x into $\sin y=\frac{{\rho}_{\mathbf{a}}}{{\rho}_{\mathbf{b}}}\sin x$ to obtain y, and consequently ψ;
- Calculate χ from exp(ȷ(φ + ψ + χ)) = 1;
- Calculate the minimum distance δ:
$$\delta\left(\mathbf{x};\widehat{\mathbf{x}}\right)=\min_{\varphi,\psi,\chi}\left\{\left\|\mathbf{a}-e^{\jmath\varphi}\widehat{\mathbf{a}}\right\|^{2}+\left\|\mathbf{b}-e^{\jmath\psi}\widehat{\mathbf{b}}\right\|^{2}+\left\|\mathbf{c}-e^{\jmath\chi}\widehat{\mathbf{c}}\right\|^{2}\right\};$$
- Save the results: distance(i) = δ, phase_{ φ }(i) = φ, phase_{ ψ }(i) = ψ, and phase_{ χ }(i) = χ;
3. End do.
4. Choose the 3R angles which return the smallest distance δ.
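As a hedged numerical substitute for the polynomial rooting (a grid search over the constrained phases, with χ = −(φ + ψ) enforcing the angular constraint; names are ours), the index can be sketched as:

```python
import numpy as np
from itertools import permutations

def perf_index(A, B, C, Ah, Bh, Ch, n_grid=90):
    """Phase-constrained distance, minimized over column permutations and phases."""
    R = A.shape[1]
    phis = np.linspace(-np.pi, np.pi, n_grid, endpoint=False)
    best = np.inf
    for perm in permutations(range(R)):
        total = 0.0
        for r, p in enumerate(perm):
            d_best = np.inf
            for phi in phis:
                ea = np.linalg.norm(A[:, r] - np.exp(1j * phi) * Ah[:, p]) ** 2
                for psi in phis:
                    chi = -(phi + psi)   # enforces exp(j(phi+psi+chi)) = 1
                    d = (ea
                         + np.linalg.norm(B[:, r] - np.exp(1j * psi) * Bh[:, p]) ** 2
                         + np.linalg.norm(C[:, r] - np.exp(1j * chi) * Ch[:, p]) ** 2)
                    d_best = min(d_best, d)
            total += d_best
        best = min(best, total)
    return best

# Estimates equal to the truth up to admissible phases -> index near 0.
rng = np.random.default_rng(5)
R = 2
A = rng.standard_normal((4, R)) + 1j * rng.standard_normal((4, R)); A /= np.linalg.norm(A, axis=0)
B = rng.standard_normal((5, R)) + 1j * rng.standard_normal((5, R)); B /= np.linalg.norm(B, axis=0)
C = rng.standard_normal((6, R)) + 1j * rng.standard_normal((6, R)); C /= np.linalg.norm(C, axis=0)
phi0 = rng.uniform(-np.pi, np.pi, R); psi0 = rng.uniform(-np.pi, np.pi, R)
Ah = A * np.exp(-1j * phi0)
Bh = B * np.exp(-1j * psi0)
Ch = C * np.exp(1j * (phi0 + psi0))    # chi0 = -(phi0 + psi0)
err = perf_index(A, B, C, Ah, Bh, Ch)
```

The residual is limited by the grid resolution; the paper's polynomial rooting gives the exact minimizer instead.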
6 Simulation results
To evaluate the efficiency and behavior of the proposed algorithms with the new performance criterion, two experiments are made: the first one for random loading matrices and the second one for a DS-CDMA system. In all experiments, the results are obtained from 100 Monte Carlo runs. At each run and for every SNR value, a new noise realization is drawn. The stopping criterion chosen for all experiments is Υ^{(n)} < ε and |Υ^{(n)} − Υ^{(n−1)}| < ε, where ε is a threshold below which the global minimum is considered to be reached, and n is the current iteration. In the following simulations, we take ε = 10^{−6}.
6.1 Example 1: random loading matrices
In order to illustrate the behavior and performances of the proposed algorithms, we first report simulation results run on random tensors. In a first stage, we compare 'Algorithm 2', 'Algorithm 1', and the plain gradient descent algorithm, named herein 'Algorithm 0'. We analyze the matrix estimation errors obtained through the performance criterion proposed in Section 5. Two scenarios are analyzed using random tensors: (i) one tensor of rank 2 with dimensions 3 × 3 × 3 and (ii) another tensor of rank 5 with dimensions 8 × 8 × 8. Loading matrices are initialized with random-valued columns, and the two tensors are corrupted by additive Gaussian noise.

Figures 1 and 2 depict the matrix estimation errors involved in (25) as a function of SNR. As expected, the results of Algorithm 0 show poor performance when compared with those obtained with Algorithms 1 and 2 (curves with diamonds and circles). This supports the idea that our algorithms isolating the scaling matrix are attractive. Furthermore, we verify that when the phase constraints are neglected in the calculation of the performance measure, the results are significantly more optimistic, particularly at high SNR (curves of the same color), which supports the interest of using our performance index defined in (25).

In order to show the significant difference between Algorithms 1 and 2, we examine the convergence speed of the tensor reconstruction as a function of the number of iterations, since the final error is about the same (cf. Figures 1 and 2). We consider the case of a 3 × 3 × 3 tensor of rank 2, constructed from three 3 × 2 Gaussian matrices drawn randomly. The results, presented in Figure 3, show that in all our experiments Algorithm 2 converged in fewer iterations, while Algorithm 1 and Algorithm 0 require more iterations to reach convergence.
6.2 Example 2: application To DSCDMA system
In this example, we place ourselves in a blind context. We assume that the receiver has no knowledge of either the spreading codes or the symbol sequences. Classically, blind telecommunications techniques are based on some a priori knowledge, such as temporal properties of the transmitted signals or spatial properties of the receiver [25–27].
Recently, algebraic tensor methods have received considerable attention in signal processing [2]. It also turns out that multilinear algebra tools are often more powerful than their matrix equivalents. Sidiropoulos et al. were the first to adopt tensor approaches in the telecommunications field, in 2000 [12]. They observed that the samples of a CDMA signal received by an array of antennas can be stored in a cube, each dimension corresponding to a form of diversity (code diversity, temporal diversity, and spatial diversity). They thus showed that the deterministic blind separation problem of CDMA signals can be solved by the CP decomposition [28].
In this example, we propose to apply the CP decomposition algorithms detailed in Section 4, together with the new performance criterion, to the DS-CDMA technique. A comparison with the ALS algorithm is then made.
6.2.1 Tensor modeling
We consider R users, each with one transmitting antenna, transmitting their signals simultaneously to an array of K receiving antennas. In other words, we consider a multiuser SIMO communication system.
For example, assume user r transmits a sequence s_{ r } of J symbols. These symbols are spread by a CDMA code c_{ r } of length I, uniquely allocated to user r and assumed to be unknown at the receiver. These codes are binary sequences taking values in {−1, 1} and may be non-orthogonal. The spread sequence propagates along a single path in a memoryless channel and is received by the antenna array under the angle of arrival θ_{ r }. Each of the K antennas receives a signal Y_{ k }(t) of size J × I. Our approach to detection and separation of the received signals is to exploit the multilinear algebraic structure of these signals using the new performance criterion. We observe these signals during a time span of length J T_{ s }, where T_{ s } is the symbol period. The signals Y_{ k }(t) are sampled at the chip period T_{ c } = T_{ s }/I, where I is the spreading factor. Therefore, each antenna provides a set of IJ samples that can be arranged in a third-order tensor $\mathcal{Y}\in {\mathbb{C}}^{I\times J\times K}$. The component Y_{ ijk } of this tensor, corresponding to the sample of the overall signal received by the k-th antenna at chip time i of the j-th symbol period, can be written as follows:
$$Y_{ijk} = \sum_{r=1}^{R} A_{kr}\, C_{ir}\, S_{jr},$$
where the complex scalar A_{ kr } = β_{ r }a_{ k }(θ_{ r }), with β_{ r } the fading coefficient of the r-th user and a_{ k }(θ_{ r }) the response of antenna k at the angle of arrival θ_{ r }.
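A noiseless instance of this tensor model can be sketched as follows (our illustrative code; the half-wavelength uniform linear array response is an assumption, not specified in the paper):

```python
import numpy as np

# Received DS-CDMA tensor Y[i, j, k] = sum_r A[k, r] C[i, r] S[j, r]
# (chip i, symbol j, antenna k); sizes match the simulation setup below.
rng = np.random.default_rng(7)
I, J, K, R = 10, 20, 4, 4                       # spreading length, symbols, antennas, users
C = rng.choice([-1.0, 1.0], size=(I, R))        # binary spreading codes in {-1, 1}
S = (rng.choice([-1, 1], (J, R))
     + 1j * rng.choice([-1, 1], (J, R))) / np.sqrt(2)   # QPSK symbols
beta = rng.uniform(0.5, 1.5, R)                 # fading coefficients
theta = rng.uniform(-np.pi / 3, np.pi / 3, R)   # angles of arrival
k_idx = np.arange(K)[:, None]
A = beta * np.exp(1j * np.pi * k_idx * np.sin(theta))   # assumed ULA response, K x R

Y = np.einsum('kr,ir,jr->ijk', A, C, S)         # noiseless received tensor (I, J, K)
```

Blind separation then amounts to the CP decomposition of `Y` into its R user contributions.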
Separating the received signals is then equivalent to decomposing the tensor $\mathcal{Y}$ into a sum of R contributions, where R represents the number of active users in the system. To calculate the CP decomposition of $\mathcal{Y}$, we resort to Algorithm 2 presented in Section 4 with the performance index (25). The detection and separation of the matrix S of transmitted symbols is performed using the following objective function:
$$\Upsilon = \left\|\mathcal{Y} - (\mathbf{C},\mathbf{S},\mathbf{A})\cdot\boldsymbol{\Lambda}\right\|^{2},$$
where a_{ r }, c_{ r }, and s_{ r } are normalized vectors; the scaling ambiguities on the estimated symbols are eliminated by normalizing each symbol sequence by its norm and calculating the exact scaling factor Λ. To correct the phases of the three estimated matrices, we use the exact performance index (28) seen in Section 5.
6.2.2 Simulation
In this experiment, we present the performance of the receiver algorithms (Algorithm 1 and Algorithm 2 with the new performance criterion), which blindly estimate the symbol matrix S.
We consider R = 4 users communicating simultaneously in the same bandwidth. Each user transmits a sequence of J = 20 consecutive symbols and is uniquely assigned a spreading sequence of length I = 10. The user symbols are generated from an i.i.d. distribution and are modulated using a pseudorandom quaternary phase shift keying (QPSK) sequence. The signal is transmitted to a receiver with K = 4 antennas. The angles of arrival are described in Table 1.

In Figure 4, we illustrate the ability of the blind tensor receiver (Algorithm 2 with the new performance criterion) and of the ALS receiver to extract the symbols from noisy data. Using Monte Carlo simulations, we represent the evolution of the bit error rate (BER) as a function of the signal-to-noise ratio (SNR). The results shown in Figure 4 indicate that the proposed receiver (Algorithm 2) performs better, as it converges faster than the ALS algorithm. This is tied to the normalization of the factor matrices and the exact calculation of the scaling factor. Moreover, we can see that the BER of Algorithm 2 evaluated without our performance criterion is overly optimistic, which confirms the interest of our study.
The impact of the factor $\frac{K}{R}$, which represents the number of receiving antennas per user, is illustrated in Figure 5. In this figure, curve 1 corresponds to K = 4 antennas and R = 2 users; curve 2 to K = 4 and R = 4; and curve 3 to K = 2 and R = 5. The angles of arrival are given in Table 2. The overall system performance is enhanced when the factor $\frac{K}{R}$ increases, indicating the importance of spatial diversity.
7 Conclusions
In this paper, we have shown in Section 3.2 that, in CP tensor decompositions, the scaling matrix Λ takes an optimal value governed by a Gram matrix controlling the conditioning of the problem. This shows that bounding the coherences ensures a minimal conditioning. We have described several algorithms able to compute the minimal polyadic decomposition of three-way arrays. The two proposed algorithms, Algorithm 1 and Algorithm 2, which involve a separate explicit calculation of the scaling matrix Λ, have been described and tested. Contrary to the performance measures used in the literature, which are optimistic by construction, the performance index calculated herein is more realistic, as it takes the angular constraint into account. An application of the CP decomposition with the exact performance criterion to the DS-CDMA system has been presented. Finally, computer simulations have been performed in the context of a SIMO-CDMA system, in order to demonstrate both the good performance of the proposed algorithms, compared to the ALS algorithm, and their usefulness in CDMA systems. The assessment of our algorithms does not rely solely on the reconstruction error and the convergence speed; it also takes into account the error in the estimated loading matrices and the BER in the CDMA application.
Appendices
Appendix 1
Detailed Λ optimal calculation
We intend to calculate the optimal value of Λ which minimizes the following expression:
By developing, it leads to:
where $G_{pq}=\sum_{ijk}A_{ip}B_{jp}C_{kp}A_{iq}^{\ast}B_{jq}^{\ast}C_{kq}^{\ast}$ and $f_{q}=\sum_{ijk}T_{ijk}A_{iq}^{\ast}B_{jq}^{\ast}C_{kq}^{\ast}$. By canceling the derivative of Υ with respect to λ, we find the following linear system:
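Assuming the resulting system is the normal equations of the quadratic cost, $\sum_{q}G_{qp}\lambda_{q}=f_{p}$, its solution can be sketched in numpy as follows; `T`, `A`, `B`, `C` denote the tensor and the (normalized) factor matrices, and the function name is hypothetical:

```python
import numpy as np

def optimal_scaling(T, A, B, C):
    """Solve the normal equations for the scaling vector lambda (sketch)."""
    # G[p,q] = sum_ijk A[i,p] B[j,p] C[k,p] conj(A[i,q]) conj(B[j,q]) conj(C[k,q]),
    # i.e. the Hadamard product of the three conjugated Gram matrices.
    G = (A.T @ A.conj()) * (B.T @ B.conj()) * (C.T @ C.conj())
    # f[q] = sum_ijk T[i,j,k] conj(A[i,q]) conj(B[j,q]) conj(C[k,q])
    f = np.einsum('ijk,iq,jq,kq->q', T, A.conj(), B.conj(), C.conj())
    # Stationarity of the cost gives sum_q G[q,p] lambda[q] = f[p].
    return np.linalg.solve(G.T, f)
```

For exact (noiseless) data built from unit-norm factor columns, this recovers the true scaling factors whenever G is invertible.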
Appendix 2
Performance criterion details
In this appendix, we explain in more detail how performance index δ is obtained, and in particular how the phases (φ, ψ, χ) are calculated. Setting χ = −φ − ψ [2π], equation (23) can be rewritten as:
where $\mathbf{a}^{H}\widehat{\mathbf{a}}\stackrel{\text{def}}{=}\rho_{\mathbf{a}}\,e^{\jmath\alpha}$, $\mathbf{b}^{H}\widehat{\mathbf{b}}\stackrel{\text{def}}{=}\rho_{\mathbf{b}}\,e^{\jmath\beta}$, and $\mathbf{s}^{H}\widehat{\mathbf{s}}\stackrel{\text{def}}{=}\rho_{\mathbf{s}}\,e^{\jmath\gamma}$. Stationary points are given by the solutions of the trigonometric system in, e.g., the variables $x=\varphi-\alpha$ and $y=\psi-\beta$ as unknowns:
The first simplification is achieved by noting that
implies
Now, using trigonometric identities, we can rewrite the first equation of the trigonometric system
as
Letting $\cos y=\sqrt{1-\sin^{2}y}$ and $\sin y=\frac{\rho_{\mathbf{a}}}{\rho_{\mathbf{b}}}\sin x$, we obtain
The goal of the next step is to eliminate the square root and to rewrite the equation in terms of the variables $\sin x$ or $\cos x$. To this end, we square both sides of this equation and use trigonometric identities such as $\cos^{2}x=\frac{1+\cos(2x)}{2}$, $\sin^{2}x=\frac{1-\cos(2x)}{2}$, $\cos x\sin x=\frac{\sin(2x)}{2}$, and $\cos^{2}(2x)+\sin^{2}(2x)=1$. After simplification, we obtain
In the same way as above, we square both sides of the resulting equation twice to eliminate the square roots. Finally, we get an equation of degree six of the form
with
and
and
Solving this sixth-degree equation yields $x$. Substituting $x$ into $\sin y=\frac{\rho_{\mathbf{a}}}{\rho_{\mathbf{b}}}\sin x$ then yields $y$.
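Numerically, the stationary phases can also be located without forming the sixth-degree polynomial. Assuming the criterion reduces, up to constants, to maximizing $f(x,y)=\rho_{\mathbf{a}}\cos x+\rho_{\mathbf{b}}\cos y+\rho_{\mathbf{s}}\cos(x+y+c)$ for some constant phase $c$ (a form consistent with the stationary relation $\rho_{\mathbf{a}}\sin x=\rho_{\mathbf{b}}\sin y$ above, but an assumption of this sketch), a brute-force grid search recovers the maximizer:

```python
import numpy as np

def optimal_phases(rho_a, rho_b, rho_s, c, n=2000):
    """Grid-search the maximizer of f(x, y) on [-pi, pi)^2 (sketch)."""
    x = np.linspace(-np.pi, np.pi, n, endpoint=False)
    y = x
    # Evaluate f on the whole 2-D grid by broadcasting.
    f = (rho_a * np.cos(x)[:, None] + rho_b * np.cos(y)[None, :]
         + rho_s * np.cos(x[:, None] + y[None, :] + c))
    i, j = np.unravel_index(np.argmax(f), f.shape)
    return x[i], y[j]
```

At the returned point, the relation $\rho_{\mathbf{a}}\sin x\approx\rho_{\mathbf{b}}\sin y$ holds up to the grid resolution, which provides a cheap numerical check of the algebraic derivation.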
References
 1.
Jutten C, Hérault J: Blind separation of sources, part I: an adaptive algorithm based on neuromimetic architecture. Signal Process 1991, 24: 1–20. 10.1016/0165-1684(91)90079-X
 2.
Comon P, Jutten C: Handbook of Blind Source Separation, Independent Component Analysis and Applications. Academic Press, Oxford, UK, Burlington, USA; 2010. ISBN: 978-0-12-374726-6
 3.
Harshman RA: Foundations of the Parafac procedure: models and conditions for an “explanatory” multimodal factor analysis. UCLA Working Papers in Phonetics (University Microfilms, Ann Arbor, Michigan, No. 10,085) 1970, 16: 1–84.
 4.
Comon P: Tensors: a brief introduction. IEEE Sig. Proc. Mag. 2014, 31(3): 44–53. Special issue on BSS. hal-00923279
 5.
Kolda TG, Bader BW: Tensor decompositions and applications. SIAM Rev. 2009, 51(3): 455–500. 10.1137/07070111X
 6.
Comon P: Tensor decompositions, state of the art and applications. In IMA Conf. Mathematics in Signal Processing. Clarendon Press, Oxford, UK; 2000: 1–24.
 7.
de Almeida ALF, Favier G, Mota JCM: The constrained trilinear decomposition with application to MIMO wireless communication systems. In GRETSI’07. Colloque GRETSI, Troyes; 2007.
 8.
Comon P, Luciani X, de Almeida ALF: Tensor decompositions, alternating least squares and other tales. J. Chemometrics 2009, 23: 393–405. 10.1002/cem.1236
 9.
Sorber L, Van Barel M, De Lathauwer L: Optimization-based algorithms for tensor decompositions: canonical polyadic decomposition, decomposition in rank-(Lr,Lr,1) terms and a new generalization. SIAM J. Optimization 2013, 23(2): 695–720. 10.1137/120868323
 10.
Comon P, Minaoui K, Rouijel A, Aboutajdine D: Performance index for tensor polyadic decompositions. In 21st EUSIPCO Conference. IEEE, Marrakech, Morocco; 2013.
 11.
ten Berge JMF, Sidiropoulos ND: On uniqueness in CANDECOMP/PARAFAC. Psychometrika 2002, 67(3): 399–409. 10.1007/BF02294992
 12.
Sidiropoulos ND, Bro R, Giannakis GB: Parallel factor analysis in sensor array processing. IEEE Trans. Sig. Proc. 2000, 48(8): 2377–2388. 10.1109/78.852018
 13.
Kruskal JB: Three-way arrays: rank and uniqueness of trilinear decompositions. Linear Algebra Appl. 1977, 18: 95–138. 10.1016/0024-3795(77)90069-6
 14.
ten Berge JMF, Sidiropoulos ND: On uniqueness in CANDECOMP/PARAFAC. Psychometrika 2002, 67(3): 399–409. 10.1007/BF02294992
 15.
Lim LH, Comon P: Blind multilinear identification. IEEE Trans. Inf. Theory 2014, 60(2): 1260–1280.
 16.
Comon P, Lim LH: Sparse representations and low-rank tensor approximation. Research Report ISRN I3S//RR-2011-01-FR, I3S, Sophia-Antipolis, France (February 2011). http://hal.archives-ouvertes.fr/docs/00/70/34/94/PDF/RR1102P.COMON.pdf
 17.
Lim LH, Comon P: Multiarray signal processing: tensor decomposition meets compressed sensing. Comptes Rendus Mécanique de l’Académie des Sci. 2010, 338(6): 311–320.
 18.
Gribonval R, Nielsen M: Sparse representations in unions of bases. IEEE Trans. Inf. Theory 2003, 49(13): 3320–3325.
 19.
Acar E, Dunlavy DM, Kolda TG: A scalable optimization approach for fitting canonical tensor decompositions. J. Chemometrics 2011, 25: 67–86. 10.1002/cem.1335
 20.
Carroll J, Chang J: Analysis of individual differences in multidimensional scaling via an N-way generalization of “Eckart-Young” decomposition. Psychometrika 1970, 35: 283–319. 10.1007/BF02310791
 21.
Smilde A, Bro R, Geladi P: Multi-Way Analysis. Wiley, Chichester, UK; 2004.
 22.
Boyd S, Vandenberghe L: Convex Optimization. Cambridge University Press, New York, USA; 2004. ISBN: 978-0-521-83378-3
 23.
Comon P: Estimation multivariable complexe. Traitement du Signal 1986, 3(2): 97–101.
 24.
Hjorungnes A, Gesbert D: Complex-valued matrix differentiation: techniques and key results. IEEE Trans. Signal Process. 2007, 55(6): 2740–2746.
 25.
Moulines E, Duhamel P, Cardoso JF, Mayrargue S: Subspace methods for the blind identification of multichannel FIR filters. IEEE Trans. Signal Process. 1995, 43: 516–525. 10.1109/78.348133
 26.
Godard DN: Self-recovering equalization and carrier tracking in two-dimensional data communication systems. IEEE Trans. Commun. 1980, 28: 1867–1875. 10.1109/TCOM.1980.1094608
 27.
van der Veen AJ, Paulraj A: An analytical constant modulus algorithm. IEEE Trans. Signal Proc. 1996, 44: 1136–1155. 10.1109/78.502327
 28.
de Almeida ALF, Favier G, Mota JCM: PARAFAC-based unified tensor modeling for wireless communication systems with application to blind multiuser equalization. Signal Process. 2007, 87(2): 337–351. 10.1016/j.sigpro.2005.12.014
Additional information
Competing interests
The authors declare that they have no competing interests.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Cite this article
Rouijel, A., Minaoui, K., Comon, P. et al. CP decomposition approach to blind separation for DSCDMA system using a new performance index. EURASIP J. Adv. Signal Process. 2014, 128 (2014) doi:10.1186/168761802014128
Keywords
 CP decomposition
 Tensor
 DS-CDMA
 Blind separation
 Optimization