
CP decomposition approach to blind separation for DS-CDMA system using a new performance index


In this paper, we present a canonical polyadic (CP) tensor decomposition isolating the scaling matrix. This has two major implications: (i) the problem conditioning shows up explicitly and can be controlled through a constraint on the so-called coherences and (ii) a performance criterion concerning the factor matrices can be calculated exactly and is more realistic than the performance metrics used in the literature. Two new gradient-descent algorithms optimizing the CP decomposition are proposed. The decomposition is illustrated by an application to direct-sequence code division multiple access (DS-CDMA) systems; computer simulations demonstrate the good behavior of these algorithms compared to others in the literature.

1 Introduction

Blind source separation consists of estimating unknown signals observed from their mixture without knowing any information about them, except mild properties such as their independence. Early work on blind source separation was initiated by Jutten and Hérault [1, 2] in the case of an instantaneous mixture. More recently, the use of multi-linear algebra methods has attracted attention in several areas such as data mining, signal processing, and particularly wireless communication systems. Wireless communication data can sometimes be viewed as components of a high-order tensor (order strictly larger than 2). Solving the problem of source separation then means finding a decomposition of this tensor and determining its parameters. One of the most popular tensor decompositions is the canonical polyadic (CP) decomposition, also known as parallel factor analysis (PARAFAC), which can be seen as an analog of the matrix singular value decomposition (SVD), since it decomposes the tensor into a sum of rank-one components [3–5]. This decomposition has been exploited and generalized in several works for solving different signal processing problems [6, 7] such as multi-array multi-sensor signal processing. The interest of the CP decomposition lies in its uniqueness under certain conditions. Typical algorithms for finding the CP components include alternating least squares (ALS) and descent algorithms [8, 9], which do not isolate the scaling factor matrix. Herein, we propose two new optimization algorithms for CP tensor decomposition, which isolate the scaling matrix in the optimization process and offer the possibility to monitor the conditioning.

It is well known that loading matrices are identified up to column scaling. This indeterminacy is complicated to take into account, given that the product of all scaling matrices must be equal to the identity. For this reason, only approximate performance indices have been used so far, obtained by ignoring this last constraint. However, one may ask whether it is possible to calculate the exact performance index: this is our second contribution. The present paper develops preliminary results that appeared in [10] and includes performance results obtained in the framework of direct-sequence code division multiple access (DS-CDMA) blind multiuser detection and estimation.

The rest of this paper is organized as follows. Section 2 presents notation, definitions, and properties of third-order tensors, and the exact CP decomposition problem is then stated. In Section 3, the low-rank approximation is formulated. Existence and uniqueness of this decomposition are also investigated. ALS and the two proposed algorithms are presented in Section 4. Section 5 is dedicated to the new performance criterion with a focus on an exact performance index calculation. In Section 6, we show the usefulness of our algorithms and the performances obtained. An application of these algorithms to CDMA transmission is then provided to illustrate the effectiveness of the latter.

2 Notation and preliminaries

2.1 Notations and definitions

Let us first introduce some essential notation. Scalars are denoted by lowercase letters, e.g., $a$. Vectors are denoted by boldface lowercase letters, e.g., $\mathbf{a}$; matrices are denoted by boldface capital letters, e.g., $\mathbf{A}$. Higher-order tensors are denoted by boldface Euler script letters, e.g., $\boldsymbol{\mathcal{T}}$. The $p$th column of a matrix $\mathbf{A}$ is denoted $\mathbf{a}_p$, the $(i,j)$ entry of a matrix $\mathbf{A}$ is denoted by $A_{ij}$, and element $(i,j,k)$ of a third-order tensor $\boldsymbol{\mathcal{T}}$ is denoted by $T_{ijk}$. $\mathbf{1}$ represents a vector of ones, and $\mathbf{I}$ the identity matrix.

Definition 2.1

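The scalar product of two same-sized tensors $\boldsymbol{\mathcal{X}}, \boldsymbol{\mathcal{Y}} \in \mathbb{C}^{I_1 \times I_2 \times \cdots \times I_N}$ is defined as:

$$\langle \boldsymbol{\mathcal{X}}, \boldsymbol{\mathcal{Y}} \rangle = \sum_{i_1=1}^{I_1} \cdots \sum_{i_N=1}^{I_N} X^{*}_{i_1,\dots,i_N}\, Y_{i_1,\dots,i_N}, \tag{1}$$

where $X_{i_1,\dots,i_N}$ is the $(i_1,\dots,i_N)$ entry of the $N$th-order tensor $\boldsymbol{\mathcal{X}}$ and $^{*}$ denotes complex conjugation, so that $\langle \boldsymbol{\mathcal{X}}, \boldsymbol{\mathcal{X}} \rangle$ is real and nonnegative.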

Definition 2.2.

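The outer (tensor) product between two tensor arrays $\boldsymbol{\mathcal{X}} \in \mathbb{C}^{I_1 \times I_2 \times \cdots \times I_N}$ and $\boldsymbol{\mathcal{Y}} \in \mathbb{C}^{J_1 \times J_2 \times \cdots \times J_M}$ of orders $N$ and $M$, respectively, is denoted by $\boldsymbol{\mathcal{Z}} = \boldsymbol{\mathcal{X}} \circ \boldsymbol{\mathcal{Y}} \in \mathbb{C}^{I_1 \times \cdots \times I_N \times J_1 \times \cdots \times J_M}$ and defined by:

$$Z_{i_1,\dots,i_N,j_1,\dots,j_M} = X_{i_1,\dots,i_N}\, Y_{j_1,\dots,j_M}. \tag{2}$$

The symbol $\circ$ represents the tensor outer product. The outer product of two tensors is another tensor, the order of which is given by the sum of the orders of the two former tensors. Equation 2 is a generalization of the concept of outer product of two vectors, which itself yields a matrix (second-order tensor). The outer product of three vectors $\mathbf{a} \in \mathbb{C}^{I}$, $\mathbf{b} \in \mathbb{C}^{J}$, and $\mathbf{c} \in \mathbb{C}^{K}$ yields a third-order decomposable tensor $\boldsymbol{\mathcal{Z}} = \mathbf{a} \circ \mathbf{b} \circ \mathbf{c} \in \mathbb{C}^{I \times J \times K}$, where $Z_{ijk} = a_i b_j c_k$.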
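As an illustration of Definition 2.2, the short NumPy sketch below builds a decomposable third-order tensor from three vectors and checks that the order of an outer product is the sum of the orders. The variable names and sample values are illustrative assumptions, not taken from the paper.

```python
import numpy as np

# Illustrative vectors (values chosen arbitrarily for this example).
a = np.array([1.0, 2.0])          # I = 2
b = np.array([0.0, 1.0, -1.0])    # J = 3
c = np.array([2.0, 1j])           # K = 2

# Decomposable (rank-1) tensor: Z[i, j, k] = a[i] * b[j] * c[k].
Z = np.einsum("i,j,k->ijk", a, b, c)

# Outer product of a matrix (order 2) with a vector (order 1): order 3.
X = np.outer(a, b)
W = np.multiply.outer(X, c)
```

Here `W` and `Z` hold the same entries, since the outer product is associative.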

Definition 2.3.

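The rank of an arbitrary tensor $\boldsymbol{\mathcal{T}} \in \mathbb{C}^{I_1 \times I_2 \times \cdots \times I_N}$, denoted by $R = \operatorname{rank}(\boldsymbol{\mathcal{T}})$, is the minimal number of rank-1 tensors that yield $\boldsymbol{\mathcal{T}}$ in a linear combination.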

Decomposable tensors have thus a rank equal to 1.

Definition 2.4.

The Kruskal rank, or krank, of a matrix is the largest number $j$ such that every set of $j$ columns is linearly independent.

Definition 2.5.

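The Frobenius norm of a tensor $\boldsymbol{\mathcal{T}} \in \mathbb{C}^{I_1 \times I_2 \times \cdots \times I_N}$ is defined as:

$$\| \boldsymbol{\mathcal{T}} \|_F = \sqrt{\langle \boldsymbol{\mathcal{T}}, \boldsymbol{\mathcal{T}} \rangle}. \tag{3}$$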

Definition 2.6.

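The Khatri-Rao product between two matrices with the same number of columns, $\mathbf{A} = [\mathbf{a}_1, \mathbf{a}_2, \dots, \mathbf{a}_F] \in \mathbb{C}^{I \times F}$ and $\mathbf{B} = [\mathbf{b}_1, \mathbf{b}_2, \dots, \mathbf{b}_F] \in \mathbb{C}^{J \times F}$, is defined as the column-wise Kronecker product:

$$\mathbf{A} \odot \mathbf{B} = [\mathbf{a}_1 \otimes \mathbf{b}_1, \dots, \mathbf{a}_F \otimes \mathbf{b}_F] \in \mathbb{C}^{IJ \times F}, \tag{4}$$

where $\otimes$ is the Kronecker product.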

Definition 2.7.

The coherence of a collection $V = \{\mathbf{v}_1, \dots, \mathbf{v}_r\}$ of unit-norm vectors is defined as the maximal value of the modulus of cross scalar products. In other words, the coherence of the collection $V$ is defined as:

$$\mu_V = \max_{p \neq q} | \mathbf{v}_p^H \mathbf{v}_q |. \tag{5}$$
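Definition 2.7 translates directly into a few lines of NumPy. The sketch below is only an illustration: the function name `coherence` is an assumption, and the columns are normalized inside the function so that the unit-norm requirement of the definition is met for any input.

```python
import numpy as np

def coherence(V):
    """Largest modulus of the cross scalar products between unit-normalized columns of V."""
    V = np.asarray(V, dtype=complex)
    V = V / np.linalg.norm(V, axis=0, keepdims=True)   # enforce unit-norm columns
    gram = np.abs(V.conj().T @ V)                      # |v_p^H v_q| for all pairs
    np.fill_diagonal(gram, 0.0)                        # discard the p = q terms
    return float(gram.max())

mu_orth = coherence(np.eye(3)[:, :2])                         # orthogonal columns
mu_close = coherence(np.array([[1.0, 1.0], [0.0, 1e-3]]))     # nearly collinear columns
```

Orthogonal columns give a coherence of 0, while nearly collinear columns give a value just below 1.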

Definition 2.8.

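Let $\mathbf{A} \in \mathbb{C}^{I \times J}$; then $\operatorname{vec}\{\mathbf{A}\} \in \mathbb{C}^{K}$ denotes the column vector defined by:

$$[\operatorname{vec}\{\mathbf{A}\}]_{i + (j-1) I} = A_{ij}, \tag{6}$$

where $K = IJ$.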
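Definition 2.8 stacks the columns of $\mathbf{A}$ on top of each other, which corresponds to column-major flattening. A minimal sketch, with the helper name `vec` and the sample matrix chosen for illustration (indices are 0-based in the code, so entry $(i,j)$ lands at position $i + jI$):

```python
import numpy as np

def vec(A):
    """Column-major vectorization: stacks the columns of A."""
    return np.asarray(A).flatten(order="F")

A = np.arange(6).reshape(2, 3)   # I = 2 rows, J = 3 columns
v = vec(A)
```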

2.2 Preliminaries

A tensor of order $d$ is an object defined on a product of $d$ linear spaces. Once the bases of these spaces are fixed, a $d$th-order tensor can be represented by a $d$-way array (a hypermatrix) of coordinates [4]. The order of a tensor thus corresponds to the number of indices of the associated array. We are interested in decomposing a third-order tensor $\boldsymbol{\mathcal{T}}$ as:

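$$\boldsymbol{\mathcal{T}} = \sum_{r=1}^{R} \lambda_r\, \boldsymbol{\mathcal{D}}(r), \tag{7}$$

where $\boldsymbol{\mathcal{D}}(r)$ are decomposable tensors, that is, $\boldsymbol{\mathcal{D}}(r) = \mathbf{a}_r \circ \mathbf{b}_r \circ \mathbf{c}_r$. Denote by $(\cdot)^T$ matrix transposition, by $\lambda_r$ real positive scaling factors, $\boldsymbol{\lambda} = [\lambda_1, \dots, \lambda_R]^T$, and by $R$ the tensor rank. Vectors $\mathbf{a}_r$ (resp. $\mathbf{b}_r$ and $\mathbf{c}_r$) live in a linear space of dimension $I$ (resp. dimension $J$ and $K$). Equivalently, decomposition (7) can be rewritten as a function of array entries:

$$T_{ijk} = \sum_{r=1}^{R} \lambda_r A_{ir} B_{jr} C_{kr}, \quad i \in \{1,\dots,I\},\ j \in \{1,\dots,J\},\ k \in \{1,\dots,K\}, \tag{8}$$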

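where $A_{ir}$ (resp. $B_{jr}$ and $C_{kr}$) denote the entries of vector $\mathbf{a}_r$ (resp. $\mathbf{b}_r$ and $\mathbf{c}_r$). The above decomposition is called the canonical polyadic (CP) decomposition of tensor $\boldsymbol{\mathcal{T}}$. The model (7) can be written in a compact form using the Khatri-Rao product, as:

$$\mathbf{T}^{(1)}_{I,KJ} = \mathbf{A} \boldsymbol{\Lambda} (\mathbf{C} \odot \mathbf{B})^T, \quad \mathbf{T}^{(2)}_{J,KI} = \mathbf{B} \boldsymbol{\Lambda} (\mathbf{C} \odot \mathbf{A})^T, \quad \mathbf{T}^{(3)}_{K,JI} = \mathbf{C} \boldsymbol{\Lambda} (\mathbf{B} \odot \mathbf{A})^T,$$

where $\mathbf{T}^{(1)}_{I,KJ}$ (resp. $\mathbf{T}^{(2)}_{J,KI}$ and $\mathbf{T}^{(3)}_{K,JI}$) is the matrix of size $I \times KJ$ (resp. $J \times KI$ and $K \times JI$) obtained by unfolding the array $\boldsymbol{\mathcal{T}}$ of size $I \times J \times K$ in the first mode (resp. the second mode and the third mode), and $\boldsymbol{\Lambda}$ is the $R \times R$ diagonal matrix defined as $\boldsymbol{\Lambda} = \operatorname{Diag}\{\lambda_1, \dots, \lambda_R\}$; see [5] for further details on matrix unfoldings.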
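The three unfolded forms can be checked numerically. The NumPy sketch below builds a small CP tensor from (8) and verifies each identity; the helper names, the sizes, and in particular the column ordering used for the unfoldings (chosen here to match the row ordering of the Khatri-Rao products) are assumptions made for this illustration.

```python
import numpy as np

def khatri_rao(U, V):
    """Column-wise Kronecker product: column r is kron(U[:, r], V[:, r])."""
    m, R = U.shape
    n, R2 = V.shape
    assert R == R2, "both matrices need the same number of columns"
    return np.einsum("ur,vr->uvr", U, V).reshape(m * n, R)

def cp_tensor(A, B, C, lam):
    """T[i, j, k] = sum_r lam[r] * A[i, r] * B[j, r] * C[k, r], as in (8)."""
    return np.einsum("r,ir,jr,kr->ijk", lam, A, B, C)

rng = np.random.default_rng(0)
I, J, K, R = 4, 3, 5, 2
A, B, C = (rng.standard_normal((n, R)) for n in (I, J, K))
lam = np.array([2.0, 0.5])
T = cp_tensor(A, B, C, lam)

# Mode-n unfoldings; the column index runs as k*J + j, k*I + i and j*I + i.
T1 = T.transpose(0, 2, 1).reshape(I, K * J)
T2 = T.transpose(1, 2, 0).reshape(J, K * I)
T3 = T.transpose(2, 1, 0).reshape(K, J * I)
```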

The explicit writing of decomposable tensors as given in (8) is subject to scale indeterminacies. In the tensor literature, optimization of the CP decomposition (8) has been carried out without isolating the scaling factor $\boldsymbol{\Lambda}$, which is generally included in one of the loading matrices, so that $\boldsymbol{\Lambda} = \mathbf{I}$. In [5], Kolda and Bader proposed to reduce the indeterminacies by normalizing the vectors and storing the norms in $\boldsymbol{\Lambda}$. Our first proposal is to pull the factors $\lambda_r$ outside the product and calculate the optimal value of the scaling factor, which makes it possible to monitor the conditioning of the problem. Scaling indeterminacies are then reduced to factors of unit modulus but are not completely fixed, hence the difficulty in estimating the identification error of the loading matrices $\mathbf{A}$, $\mathbf{B}$, and $\mathbf{C}$. Our second proposal (Section 5) is to calculate the $3R$ complex phases (reducing to signs in the real case).

3 Existence and uniqueness

The goal is to identify all parameters in the right-hand side of (8), given the whole array $\boldsymbol{\mathcal{T}}$. According to existing results [11–13], a third-order tensor (i.e., a three-way array) of rank $R$ can be uniquely represented as a sum of $R$ rank-1 tensors, under certain conditions. Kruskal has demonstrated that condition (9) is sufficient for uniqueness of the CP decomposition [13], where $k_A$ is the krank of $\mathbf{A}$. This means that matrices $\mathbf{A}$, $\mathbf{B}$, and $\mathbf{C}$ are unique up to permutation and (complex) scaling of their columns, under Kruskal's condition:

$$k_A + k_B + k_C \ge 2R + 2. \tag{9}$$

For uniqueness, Harshman has shown that it is sufficient to have $\mathbf{A}$ and $\mathbf{B}$ of full rank, and $\mathbf{C}$ of krank $\ge 2$ [3]. When $1 < R \le 2$, the Kruskal and Harshman conditions are equivalent. For $R > 2$, Kruskal's condition may be satisfied even when Harshman's is not, and this condition is claimed to be only sufficient for $R > 3$ [14]. However, observations are actually corrupted by noise, so that (8) does not hold exactly.
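Kruskal's condition (9) can be checked numerically for small factor matrices. The sketch below computes the krank of Definition 2.4 by brute force over column subsets, which is only practical for small $R$; the function names and the tolerance are assumptions made for this illustration.

```python
import itertools
import numpy as np

def krank(M, tol=1e-10):
    """Largest j such that every set of j columns of M is linearly independent."""
    M = np.asarray(M)
    n_cols = M.shape[1]
    k = 0
    for j in range(1, n_cols + 1):
        subsets = itertools.combinations(range(n_cols), j)
        if all(np.linalg.matrix_rank(M[:, list(s)], tol=tol) == j for s in subsets):
            k = j
        else:
            break
    return k

def kruskal_condition(A, B, C):
    """Sufficient uniqueness condition (9): k_A + k_B + k_C >= 2R + 2."""
    R = A.shape[1]
    return krank(A) + krank(B) + krank(C) >= 2 * R + 2

# Rank 2 but krank 1: the first two columns are collinear.
M = np.array([[1.0, 2.0, 0.0],
              [0.0, 0.0, 1.0]])

rng = np.random.default_rng(0)
ok_R2 = kruskal_condition(*(rng.standard_normal((3, 2)) for _ in range(3)))
ok_R4 = kruskal_condition(*(rng.standard_normal((3, 4)) for _ in range(3)))
```

For generic $3 \times 3 \times 3$ factors, the condition holds for $R = 2$ (kranks $2+2+2 \ge 6$) but fails for $R = 4$ (kranks $3+3+3 < 10$).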

3.1 Low-rank approximation

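In practice, the exact CP decomposition always exists but with a very large rank. Hence, it may not be physically meaningful, and additionally, it is generally not unique. It is consequently preferred to fit a multi-linear model of lower rank, $R < \operatorname{rank}(\boldsymbol{\mathcal{T}})$, fixed in advance, so that we have to deal with an approximation problem. To estimate the parameters of the decomposition, we need to minimize the following cost function:

$$\Upsilon(\mathbf{A}, \mathbf{B}, \mathbf{C}, \boldsymbol{\Lambda}) = \| \boldsymbol{\mathcal{T}} - (\mathbf{A}, \mathbf{B}, \mathbf{C}) \cdot \boldsymbol{\Lambda} \|_F^2. \tag{10}$$

By convention, $(\mathbf{A}, \mathbf{B}, \mathbf{C}) \cdot \boldsymbol{\Lambda}$ denotes the tensor of rank $R$ with coordinates $\sum_{r=1}^{R} \lambda_r A_{ir} B_{jr} C_{kr}$. Minimizing the error (10) means finding the best rank-$R$ approximation of $\boldsymbol{\mathcal{T}}$ and its CP decomposition. The cost function (10) can also be written in three equivalent compact forms with respect to the three loading matrices:

$$\Upsilon(\mathbf{A}, \mathbf{B}, \mathbf{C}, \boldsymbol{\Lambda}) = \| \mathbf{T}^{(1)}_{I,KJ} - \mathbf{A} \boldsymbol{\Lambda} (\mathbf{C} \odot \mathbf{B})^T \|_F^2 = \| \mathbf{T}^{(2)}_{J,KI} - \mathbf{B} \boldsymbol{\Lambda} (\mathbf{C} \odot \mathbf{A})^T \|_F^2 = \| \mathbf{T}^{(3)}_{K,JI} - \mathbf{C} \boldsymbol{\Lambda} (\mathbf{B} \odot \mathbf{A})^T \|_F^2.$$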

3.2 Conditioning of the problem

Assuming that the matrices $\mathbf{A}$, $\mathbf{B}$, and $\mathbf{C}$ are given, we calculate the optimal value of the scaling factor $\boldsymbol{\Lambda}$. This can be done by expanding the Frobenius norm in (10), which is a quadratic form in the entries of $\boldsymbol{\Lambda}$, and canceling the gradient with respect to $\boldsymbol{\lambda}$ (see details in Appendix 1). The following linear system is then obtained:

$$\mathbf{G} \boldsymbol{\lambda} = \mathbf{f}, \tag{11}$$

where $\mathbf{f}$ is the $R$-dimensional vector defined by the contraction $f_r = \sum_{ijk} T_{ijk} A^{*}_{ir} B^{*}_{jr} C^{*}_{kr}$, and $\mathbf{G}$ is the $R \times R$ Gram matrix defined by:

$$G_{pq} = (\mathbf{a}_p \otimes \mathbf{b}_p \otimes \mathbf{c}_p)^H (\mathbf{a}_q \otimes \mathbf{b}_q \otimes \mathbf{c}_q). \tag{12}$$

In view of matrix $\mathbf{G}$, we can see that coherences play a role in the conditioning of the problem. From Equations 11 and 12, and since the diagonal entries of $\mathbf{G}$ are equal to 1, it is indeed clear that imposing cross scalar products of the form $\mathbf{a}_p^H \mathbf{a}_q$ to have a modulus strictly smaller than 1 makes an acceptable conditioning more likely. Also note that scalar products do not appear individually in (12) but through their products, since the entries of $\mathbf{G}$ can also be written as $G_{pq} = (\mathbf{a}_p^H \mathbf{a}_q)(\mathbf{b}_p^H \mathbf{b}_q)(\mathbf{c}_p^H \mathbf{c}_q)$. This statement has deeper implications, particularly for the existence and uniqueness of the solution to Problem (10), as subsequently elaborated.
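To make the role of $\mathbf{G}$ concrete, the sketch below forms the Gram matrix as the entry-wise product of the three factor Gram matrices and solves the linear system for $\boldsymbol{\lambda}$. It is restricted to real-valued data for simplicity; the function name `optimal_scaling` and the test sizes are assumptions made for this illustration.

```python
import numpy as np

def optimal_scaling(T, A, B, C):
    """Solve G lam = f for unit-norm-column factors (real-valued sketch)."""
    # G_pq = (a_p^T a_q)(b_p^T b_q)(c_p^T c_q): entry-wise product of Gram matrices.
    G = (A.T @ A) * (B.T @ B) * (C.T @ C)
    # f_r = sum_ijk T_ijk A_ir B_jr C_kr.
    f = np.einsum("ijk,ir,jr,kr->r", T, A, B, C)
    return np.linalg.solve(G, f), np.linalg.cond(G)

rng = np.random.default_rng(0)
I, J, K, R = 4, 4, 4, 3
A, B, C = (rng.standard_normal((n, R)) for n in (I, J, K))
A, B, C = (M / np.linalg.norm(M, axis=0) for M in (A, B, C))   # unit-norm columns
lam_true = np.array([3.0, 1.5, 0.5])
T = np.einsum("r,ir,jr,kr->ijk", lam_true, A, B, C)

lam_hat, cond_G = optimal_scaling(T, A, B, C)
```

On noiseless data the true scaling factors are recovered, and `cond_G` gives a direct reading of the conditioning: it grows as the columns of the factors become more coherent.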

3.3 Existence

According to the results in [15, 16], the infimum of (10) may not be reachable. In fact, the set of tensors of rank at most $\xi$ is not closed if $\xi > 1$. Examples of this lack of closedness have been provided in the literature [15, 16], which suffice to prove it. In other words, it may happen that for a given tensor, and for any rank-$r$ approximation of it, there always exists another better rank-$r$ approximation.

This is the reason why the authors proposed the constraint below, which ensures existence of a minimum. Define the three coherences $\mu_A$, $\mu_B$, and $\mu_C$ associated with the matrices $\mathbf{A}$, $\mathbf{B}$, and $\mathbf{C}$, respectively. It has indeed been shown in [15, 17] that under the constraint:

$$\mu_A\, \mu_B\, \mu_C \le \frac{1}{R-1}, \tag{13}$$

the infimum of (10) is reached. It may be seen that this condition already gives a quantitative bound on the conditioning of (11), because coherences bound the off-diagonal entries of matrix $\mathbf{G}$, which has ones on its diagonal.

3.4 Uniqueness

The uniqueness of the tensor decomposition can be ensured by using a sufficient condition based on Kruskal’s theorem (9), previously mentioned.

According to the lemma reported in [17, 18], an inequality holds between krank and coherence, namely $k_A \ge 1/\mu_A$, as long as $k_A$ is strictly smaller than the column rank of $\mathbf{A}$. Including this inequality in Equation 9 leads to the following sufficient uniqueness condition:

$$\mu_A^{-1} + \mu_B^{-1} + \mu_C^{-1} \ge 2(R+1). \tag{14}$$
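Both coherence-based conditions are cheap to evaluate. The sketch below checks the existence bound on the product of coherences, read here as a non-strict inequality, and the sufficient uniqueness condition on the sum of inverse coherences. The function names and the handling of the degenerate cases ($R = 1$, zero coherence) are assumptions made for this illustration.

```python
import numpy as np

def coherence(M):
    """Largest modulus of cross scalar products between unit-normalized columns."""
    M = np.asarray(M, dtype=complex)
    M = M / np.linalg.norm(M, axis=0, keepdims=True)
    gram = np.abs(M.conj().T @ M)
    np.fill_diagonal(gram, 0.0)
    return float(gram.max())

def existence_condition(A, B, C):
    """Bound on the product of coherences, mu_A * mu_B * mu_C <= 1 / (R - 1)."""
    R = A.shape[1]
    if R == 1:
        return True                      # no cross terms to bound
    return coherence(A) * coherence(B) * coherence(C) <= 1.0 / (R - 1)

def uniqueness_condition(A, B, C):
    """Sufficient condition 1/mu_A + 1/mu_B + 1/mu_C >= 2 (R + 1)."""
    R = A.shape[1]
    inverses = [np.inf if mu == 0.0 else 1.0 / mu for mu in map(coherence, (A, B, C))]
    return sum(inverses) >= 2 * (R + 1)

orth = np.eye(3)[:, :2]                            # orthonormal columns, coherence 0
close = np.array([[1.0, 1.0], [0.0, 1e-3]])        # nearly collinear columns
```

With orthonormal factors both conditions hold; with nearly collinear columns in all three factors the uniqueness condition fails, since the sum of inverse coherences is close to 3.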

4 Optimization for CP decomposition

In Section 3, we presented CP for three-way tensors. Various optimization algorithms exist to compute the CP decomposition without constraint, such as ALS or descent algorithms [7, 8, 19]. We subsequently present optimization algorithms to compute the CP decomposition (10) under the constraint of unit-norm columns of the loading matrices.

4.1 Alternating least squares algorithm

The ALS algorithm was proposed for CP computation by Carroll and Chang in [20] and Harshman in [3] and remains the workhorse algorithm today, mainly owing to its ease of implementation [21]. ALS is hence the classical solution to minimize the cost function (10), despite its lack of convergence proof. This iterative algorithm alternates among the estimation of matrices $\mathbf{A}$, $\mathbf{B}$, and $\mathbf{C}$.

The principle of the ALS algorithm is to convert a nonlinear optimization problem into three independent linear least squares (LS) problems. In the first step, one of the three matrices, say $\mathbf{A}$, is updated while the two others ($\mathbf{B}$ and $\mathbf{C}$) are fixed to their values obtained in previous estimation steps. The estimate of $\mathbf{A}$ is given by:

$$\hat{\mathbf{A}} = \mathbf{T}_{I,KJ} \left( (\mathbf{C} \odot \mathbf{B})^T \right)^{\dagger}, \tag{15}$$

where $\mathbf{T}_{I,KJ}$ is the unfolding matrix of size $I \times JK$ defined in Section 2.2, and $(\cdot)^{\dagger}$ is the Moore-Penrose pseudo-inverse. By symmetry, the expressions are similar for $\hat{\mathbf{B}}$ and $\hat{\mathbf{C}}$.
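The ALS update of $\mathbf{A}$ is a single pseudo-inverse applied to the first-mode unfolding. The sketch below illustrates this step on noiseless data with the scaling absorbed into the factors ($\boldsymbol{\Lambda} = \mathbf{I}$); the function names and the unfolding order are assumptions chosen to be mutually consistent, not details taken from the paper.

```python
import numpy as np

def khatri_rao(U, V):
    """Column-wise Kronecker product: column r is kron(U[:, r], V[:, r])."""
    m, R = U.shape
    n, _ = V.shape
    return np.einsum("ur,vr->uvr", U, V).reshape(m * n, R)

def als_update_A(T, B, C):
    """One ALS step: least-squares estimate of A with B and C held fixed."""
    I, J, K = T.shape
    T1 = T.transpose(0, 2, 1).reshape(I, K * J)       # first-mode unfolding, I x KJ
    return T1 @ np.linalg.pinv(khatri_rao(C, B).T)

rng = np.random.default_rng(0)
I, J, K, R = 5, 4, 3, 2
A, B, C = (rng.standard_normal((n, R)) for n in (I, J, K))
T = np.einsum("ir,jr,kr->ijk", A, B, C)               # noiseless tensor, Lambda = I

A_hat = als_update_A(T, B, C)
```

With exact $\mathbf{B}$ and $\mathbf{C}$ and no noise, the update returns $\mathbf{A}$ itself; the updates of $\mathbf{B}$ and $\mathbf{C}$ follow by permuting the roles of the modes.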

4.2 Proposed algorithms

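Our optimization problem consists in minimizing the squared error $\Upsilon$ under a collection of $3R$ constraints, namely:

$$\min_{\mathbf{A}, \mathbf{B}, \mathbf{C}, \boldsymbol{\Lambda}} \Big\| \boldsymbol{\mathcal{T}} - \sum_{r=1}^{R} \lambda_r\, \mathbf{a}_r \circ \mathbf{b}_r \circ \mathbf{c}_r \Big\|_F^2 \quad \text{subject to} \quad \|\mathbf{a}_r\| = \|\mathbf{b}_r\| = \|\mathbf{c}_r\| = 1, \quad 1 \le r \le R. \tag{16}$$

Therefore, we need to find three matrices $\mathbf{A}$, $\mathbf{B}$, and $\mathbf{C}$ with unit-norm columns which minimize (16). Stack these three matrices in an $(I+J+K) \times R$ matrix denoted by $\mathbf{X}$. For the sake of convenience, the objective can then also be written $\Upsilon(\mathbf{X}, \boldsymbol{\Lambda})$.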

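The computation of loading matrices is performed by minimizing the quadratic cost function (10). One generates a series of iterates $\mathbf{X}(k) = [\mathbf{A}(k)^T, \mathbf{B}(k)^T, \mathbf{C}(k)^T]^T$, $k \in \mathbb{N}$, with initial value $\mathbf{X}(0)$ arbitrarily chosen. Generally, the algorithm consists of choosing at the $k$th iteration a point $\mathbf{X}(k+1)$ along a direction, defined by the matrix $\mathbf{D}(k)$, lying in the half space defined by the gradient of the objective function $\Upsilon$, that is, a direction which satisfies [22]:

$$\operatorname{vec}\{\nabla \Upsilon(\mathbf{X}(k))\}^T \operatorname{vec}\{\mathbf{D}(k)\} < 0. \tag{17}$$

One possibility is to choose the direction opposite to the gradient:

$$\mathbf{D}(k) = -\nabla \Upsilon(\mathbf{X}(k)). \tag{18}$$

The gradient components $\nabla \Upsilon_A$ (size $I \times R$), $\nabla \Upsilon_B$ ($J \times R$), and $\nabla \Upsilon_C$ ($K \times R$) can be stacked into a single matrix of size $(I+J+K) \times R$:

$$\mathbf{D} = -\begin{bmatrix} \nabla \Upsilon_A \\ \nabla \Upsilon_B \\ \nabla \Upsilon_C \end{bmatrix}. \tag{19}$$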

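Since the objective function (10) is real but its arguments are complex, its gradient can be computed in the sense of [23, 24]. Using the quadratic form proposed in [23], the objective function can be expanded into:

$$\Upsilon(\mathbf{A}, \mathbf{B}, \mathbf{C}; \boldsymbol{\Lambda}) = \| \boldsymbol{\mathcal{T}} \|^2 + \sum_{i} \sum_{p=1}^{R} \sum_{q=1}^{R} A_{ip} M_{pq} A^{*}_{iq} - \sum_{i} \Big( \sum_{q=1}^{R} N_{iq} A^{*}_{iq} + \sum_{p=1}^{R} A_{ip} N^{*}_{ip} \Big), \tag{20}$$

where $M_{pq} = \sum_{jk} \lambda_p \lambda_q B_{jp} B^{*}_{jq} C_{kp} C^{*}_{kq}$ and $N_{ip} = \sum_{jk} T_{ijk} \lambda_p B^{*}_{jp} C^{*}_{kp}$. Thus, the gradient of $\Upsilon$ with respect to $\mathbf{A}$ is:

$$\nabla \Upsilon_A = 2 \mathbf{A} \mathbf{M} - 2 \mathbf{N}, \tag{21}$$

where $\mathbf{M}$ is of size $R \times R$ and $\mathbf{N}$ of size $I \times R$. The gradients of $\Upsilon$ with respect to $\mathbf{B}$ and $\mathbf{C}$ are similar, provided that matrices $\mathbf{M}$ and $\mathbf{N}$ are defined accordingly (for the gradient of $\Upsilon$ with respect to $\mathbf{B}$ and $\mathbf{C}$, the dimension of matrix $\mathbf{N}$ is $J \times R$ and $K \times R$, respectively, while the dimension of $\mathbf{M}$ is always $R \times R$).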
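In the real-valued case, the expression $2\mathbf{A}\mathbf{M} - 2\mathbf{N}$ is the ordinary gradient of the cost with respect to $\mathbf{A}$, which can be confirmed by finite differences. The sketch below does this on random data; the function names, sizes, and difference step are assumptions made for this check.

```python
import numpy as np

def cost(T, A, B, C, lam):
    """Squared reconstruction error (10) for real-valued data."""
    return np.sum((T - np.einsum("r,ir,jr,kr->ijk", lam, A, B, C)) ** 2)

def grad_A(T, A, B, C, lam):
    """Gradient 2 A M - 2 N with M and N as defined after (20), real case."""
    M = np.outer(lam, lam) * (B.T @ B) * (C.T @ C)        # M_pq, size R x R
    N = np.einsum("ijk,jp,kp->ip", T, B, C) * lam         # N_ip, size I x R
    return 2 * A @ M - 2 * N

rng = np.random.default_rng(0)
I, J, K, R = 3, 4, 5, 2
T = rng.standard_normal((I, J, K))
A, B, C = (rng.standard_normal((n, R)) for n in (I, J, K))
lam = np.array([1.5, 0.7])

G = grad_A(T, A, B, C, lam)

# Central finite differences, entry by entry.
h = 1e-6
G_fd = np.zeros_like(A)
for i in range(I):
    for r in range(R):
        E = np.zeros_like(A)
        E[i, r] = h
        G_fd[i, r] = (cost(T, A + E, B, C, lam) - cost(T, A - E, B, C, lam)) / (2 * h)

max_err = np.max(np.abs(G - G_fd))
```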

The difficulty that arises in constrained optimization is to make sure that the move remains within the feasible set defined by the constraints. In the following subsections, we propose two versions of our algorithm, with two different ways of calculating the scale factor $\boldsymbol{\Lambda}$.

Descent algorithms are also characterized by the step taken in the chosen direction. There are various methods for step selection; the most widely used are backtracking and Armijo [22]. To compute the stepsize $\ell(k)$ in Algorithm 1 and Algorithm 2, we use backtracking line search, a popular inexact line search method that is very simple and quite effective. It depends on two fixed constants $\alpha$, $\beta$ with $0 < \alpha < 0.5$ and $0 < \beta < 1$.

Backtracking algorithm

  1. Given a descent direction $\mathbf{D}$ for $\Upsilon$, $\alpha \in (0, 0.5)$, $\beta \in (0, 1)$.

  2. Initialization: $\ell = 1$.

  3. While $\Upsilon(\mathbf{X} + \ell \mathbf{D}; \boldsymbol{\Lambda}) > \Upsilon(\mathbf{X}; \boldsymbol{\Lambda}) + \alpha\, \ell\, \nabla \Upsilon^T \mathbf{D}$,

  4. $\ell = \beta \ell$.
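The backtracking steps above can be sketched as a short function. The signature, the default constants, and the `max_halvings` safeguard are assumptions added for this illustration; the toy cost $\|\mathbf{X}\|_F^2$ is used only to make the behavior easy to follow.

```python
import numpy as np

def backtracking(cost, X, D, grad, alpha=0.3, beta=0.5, max_halvings=50):
    """Shrink the step until the sufficient-decrease test of step 3 is passed."""
    step = 1.0
    slope = np.vdot(grad, D).real          # vec{grad}^T vec{D}; negative for a descent direction
    f0 = cost(X)
    for _ in range(max_halvings):          # safeguard against an endless loop (assumption)
        if cost(X + step * D) <= f0 + alpha * step * slope:
            break
        step *= beta
    return step

# Toy problem: cost(X) = ||X||_F^2, with gradient 2 X and direction D = -gradient.
X = np.array([[1.0, -2.0], [0.5, 3.0]])
grad = 2 * X
step = backtracking(lambda Z: np.sum(Z ** 2), X, -grad, grad)
```

For this quadratic cost the full step $\ell = 1$ overshoots to $-\mathbf{X}$ and is rejected, while $\ell = 0.5$ lands exactly on the minimizer and is accepted.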

4.2.1 Algorithm 1

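In recent work on CP tensor decomposition, optimization of the objective function (10) was carried out without explicitly considering the factor $\boldsymbol{\Lambda}$. More precisely, in most contributions, the scaling factor is integrated into the loading matrices, so that $\boldsymbol{\Lambda}$ may be set to the identity. The first solution we propose is to use a projected gradient algorithm while calculating $\boldsymbol{\Lambda}$ as the product of the normalizing factors of matrices $\mathbf{A}$, $\mathbf{B}$, and $\mathbf{C}$:

$$\boldsymbol{\Lambda}(k) = \boldsymbol{\Lambda}(k-1) \boxdot \boldsymbol{\Lambda}_A \boxdot \boldsymbol{\Lambda}_B \boxdot \boldsymbol{\Lambda}_C, \tag{22}$$

where $\boxdot$ is the Hadamard product, $\boldsymbol{\Lambda}_A = \operatorname{Diag}\{\|\mathbf{a}_1\|, \dots, \|\mathbf{a}_R\|\}$, and similar definitions hold for $\boldsymbol{\Lambda}_B$ and $\boldsymbol{\Lambda}_C$. This approach, which we call 'Algorithm 1', can be described by the pseudo-code below: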
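One plausible reading of Algorithm 1, for real-valued data, can be sketched as follows: a gradient step with backtracking on the three factors, a projection onto unit-norm columns, and the multiplicative update of the scaling factors by the column norms. The function names, the iteration count, the backtracking constants, and the initialization $\boldsymbol{\lambda} = \mathbf{1}$ are assumptions made for this illustration, not details given in the paper.

```python
import numpy as np

def reconstruct(A, B, C, lam):
    return np.einsum("r,ir,jr,kr->ijk", lam, A, B, C)

def cost(T, A, B, C, lam):
    return np.sum((T - reconstruct(A, B, C, lam)) ** 2)

def gradients(T, A, B, C, lam):
    """Gradients of the cost w.r.t. A, B, C (real case), from the residual tensor."""
    E = reconstruct(A, B, C, lam) - T
    gA = 2 * np.einsum("ijk,jr,kr->ir", E, B, C) * lam
    gB = 2 * np.einsum("ijk,ir,kr->jr", E, A, C) * lam
    gC = 2 * np.einsum("ijk,ir,jr->kr", E, A, B) * lam
    return gA, gB, gC

def algorithm1(T, A, B, C, n_iter=200, alpha=0.3, beta=0.5):
    A, B, C = (M / np.linalg.norm(M, axis=0) for M in (A, B, C))
    lam = np.ones(A.shape[1])                      # assumed initialization
    history = [cost(T, A, B, C, lam)]
    for _ in range(n_iter):
        gA, gB, gC = gradients(T, A, B, C, lam)
        slope = -sum(np.sum(g ** 2) for g in (gA, gB, gC))
        step = 1.0
        for _halving in range(60):                 # backtracking line search
            trial = cost(T, A - step * gA, B - step * gB, C - step * gC, lam)
            if trial <= history[-1] + alpha * step * slope:
                break
            step *= beta
        A, B, C = A - step * gA, B - step * gB, C - step * gC
        nA, nB, nC = (np.linalg.norm(M, axis=0) for M in (A, B, C))
        A, B, C = A / nA, B / nB, C / nC           # projection onto unit-norm columns
        lam = lam * nA * nB * nC                   # update of the scaling factors as in (22)
        history.append(cost(T, A, B, C, lam))
    return A, B, C, lam, history

rng = np.random.default_rng(0)
true = [M / np.linalg.norm(M, axis=0) for M in (rng.standard_normal((4, 2)) for _ in range(3))]
T = reconstruct(*true, np.array([3.0, 1.5]))
init = [rng.standard_normal((4, 2)) for _ in range(3)]
A_est, B_est, C_est, lam_est, hist = algorithm1(T, *init)
```

Moving the column norms into $\boldsymbol{\lambda}$ leaves the reconstructed tensor unchanged, so the projection does not undo the decrease obtained by the gradient step.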

4.2.2 Algorithm 2

The other approach is to consider $\boldsymbol{\Lambda}$ as an additional variable. By canceling the gradient of $\Upsilon(\mathbf{X}, \boldsymbol{\Lambda})$ with respect to $\boldsymbol{\Lambda}$, one obtains Equation 11, which can be solved for $\boldsymbol{\Lambda}$ when $\mathbf{X}$ is fixed. This gives the algorithm below.

Stopping criterion

The convergence is usually considered to be obtained at the $k$th iteration when the error between tensor $\boldsymbol{\mathcal{T}}$ and the tensor reconstructed from the estimated loading matrices does not significantly change between iterations $k$ and $k+1$.
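A possible reading of Algorithm 2 together with this stopping rule, again for real-valued data, is sketched below: after each projected gradient step, the scaling factors are recomputed by solving $\mathbf{G}\boldsymbol{\lambda} = \mathbf{f}$, and the loop stops when the cost changes by less than a threshold. The function names, the threshold `eps`, the iteration cap, and the backtracking constants are assumptions made for this illustration; the sketch does not enforce positivity of the scaling factors.

```python
import numpy as np

def reconstruct(A, B, C, lam):
    return np.einsum("r,ir,jr,kr->ijk", lam, A, B, C)

def cost(T, A, B, C, lam):
    return np.sum((T - reconstruct(A, B, C, lam)) ** 2)

def gradients(T, A, B, C, lam):
    E = reconstruct(A, B, C, lam) - T
    gA = 2 * np.einsum("ijk,jr,kr->ir", E, B, C) * lam
    gB = 2 * np.einsum("ijk,ir,kr->jr", E, A, C) * lam
    gC = 2 * np.einsum("ijk,ir,jr->kr", E, A, B) * lam
    return gA, gB, gC

def optimal_lambda(T, A, B, C):
    """Least-squares scaling for fixed unit-norm factors: solve G lam = f."""
    G = (A.T @ A) * (B.T @ B) * (C.T @ C)
    f = np.einsum("ijk,ir,jr,kr->r", T, A, B, C)
    return np.linalg.solve(G, f)

def algorithm2(T, A, B, C, eps=1e-6, max_iter=500, alpha=0.3, beta=0.5):
    A, B, C = (M / np.linalg.norm(M, axis=0) for M in (A, B, C))
    lam = optimal_lambda(T, A, B, C)
    history = [cost(T, A, B, C, lam)]
    for _ in range(max_iter):
        gA, gB, gC = gradients(T, A, B, C, lam)
        slope = -sum(np.sum(g ** 2) for g in (gA, gB, gC))
        step = 1.0
        for _halving in range(60):                 # backtracking line search
            trial = cost(T, A - step * gA, B - step * gB, C - step * gC, lam)
            if trial <= history[-1] + alpha * step * slope:
                break
            step *= beta
        moved = (A - step * gA, B - step * gB, C - step * gC)
        A, B, C = (M / np.linalg.norm(M, axis=0) for M in moved)
        lam = optimal_lambda(T, A, B, C)           # scaling treated as a separate variable
        history.append(cost(T, A, B, C, lam))
        if abs(history[-1] - history[-2]) < eps:   # cost no longer changes significantly
            break
    return A, B, C, lam, history

rng = np.random.default_rng(0)
true = [M / np.linalg.norm(M, axis=0) for M in (rng.standard_normal((4, 2)) for _ in range(3))]
T = reconstruct(*true, np.array([3.0, 1.5]))
init = [rng.standard_normal((4, 2)) for _ in range(3)]
A_est, B_est, C_est, lam_est, hist = algorithm2(T, *init)
```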

However, in the complex case, the phases of the entries of the loading matrices found at the end of the algorithm, as defined by the stopping criterion above, differ from the original ones. To remedy this problem, we propose a new performance criterion that minimizes the distance between the original and the estimated matrices. Although this criterion is not usable when the actual loading matrices are unknown, it still allows the performance actually attained to be assessed.

5 Performance criterion

As highlighted in Section 2, there is always an indeterminacy in the representation of decomposable tensors, characterized in the CP decomposition by $3R$ complex numbers of unit modulus. In order to better understand this problem, let $\mathbf{a}$, $\mathbf{b}$, and $\mathbf{c}$ denote the $r$th column of matrices $\mathbf{A}$, $\mathbf{B}$, and $\mathbf{C}$, respectively, with $1 \le r \le R$. Furthermore, $\hat{\mathbf{a}}$, $\hat{\mathbf{b}}$, and $\hat{\mathbf{c}}$ denote one column of the estimated matrices entering the CP decomposition. We seek to minimize a distance:

$$\delta(\mathbf{x}; \hat{\mathbf{x}}) = \min_{\varphi, \psi, \chi} \| \mathbf{a} e^{\jmath \varphi} - \hat{\mathbf{a}} \|^2 + \| \mathbf{b} e^{\jmath \psi} - \hat{\mathbf{b}} \|^2 + \| \mathbf{c} e^{\jmath \chi} - \hat{\mathbf{c}} \|^2. \tag{23}$$

In the literature, only approximate performance indices have been used so far, obtained by neglecting the relation between the angles $\varphi$, $\psi$, and $\chi$ given by:

$$\exp\big( \jmath (\varphi + \psi + \chi) \big) = 1.$$

Our contribution herein consists in finding the exact minimum distance (23) under this angular constraint, by calculating the $3R$ optimal phases affecting the columns of the estimated loading matrices.

Setting the derivatives of Equation 23 with respect to $\varphi$ and $\psi$ to zero yields a system of two equations. Once the two sets of three columns (true and estimated) are fixed, finding the phases means solving:

$$\rho_a \sin x + \rho_c \sin(\varphi + \psi + \gamma) = 0, \qquad \rho_b \sin y + \rho_c \sin(\varphi + \psi + \gamma) = 0, \tag{24}$$

where $x = \varphi - \alpha$ and $y = \psi - \beta$, and where the moduli $\rho_a$, $\rho_b$, $\rho_c$ and the angles $\alpha$, $\beta$, $\gamma$ are defined in Appendix 2. After some trigonometric manipulations described in Appendix 2, the solutions can be obtained by rooting a polynomial of degree 6 in a single variable depending on $\varphi$. By substituting the admissible values of $\varphi$ into system (24), the corresponding values of $\psi$ are obtained. The calculation of the minimum distance (23) is done for all possible permutations. We end up with the following performance criterion:

$$E(\boldsymbol{\mathcal{T}}; \mathbf{A}, \mathbf{B}, \mathbf{C}, \boldsymbol{\Lambda}) = \min_{\pi \in \Pi} \sum_{r=1}^{R} \delta\big( \mathbf{x}_r; \hat{\mathbf{x}}_{\pi(r)} \big), \tag{25}$$

where $\Pi$ is the set of permutations of $\{1, 2, \dots, R\}$. When $R$ is too large for an exhaustive search over the permutation set, greedy versions can be used instead.

Denote by $\pi^{(i)}$, $1 \le i \le R!$, the permutations in $\Pi$. The overall algorithm to compute the performance criterion is summarized as follows:

  1. For $1 \le i \le R!$ do

  2. Calculate the $3R$ optimal phases affecting the columns of the estimated loading matrices:

     (a) Permute the columns of the three estimated matrices according to the permutation $\pi^{(i)}$: $\hat{\mathbf{A}}_{\pi^{(i)}}$, $\hat{\mathbf{B}}_{\pi^{(i)}}$, and $\hat{\mathbf{C}}_{\pi^{(i)}}$;

     (b) For each $r$th column of the loading matrices and of the estimated matrices, do:

       • Set $x = \varphi - \alpha$ and $y = \psi - \beta$ and solve the polynomial equation of degree 6 in $\cos(2x)$, which depends on the single unknown $\varphi$:

         $$c_0 + c_1 \cos(2x) + c_2 \cos^2(2x) + c_3 \cos^3(2x) + c_4 \cos^4(2x) + c_5 \cos^5(2x) + c_6 \cos^6(2x) = 0.$$

       • Substitute $x$ into $\sin y = \frac{\rho_a}{\rho_b} \sin x$ to obtain $y$ and consequently $\psi$;

       • Calculate $\chi$ from $\exp(\jmath(\varphi + \psi + \chi)) = 1$;

       • Calculate the minimum distance $\delta$:

         $$\delta(\mathbf{x}; \hat{\mathbf{x}}) = \min_{\varphi, \psi, \chi} \| \mathbf{a} e^{\jmath \varphi} - \hat{\mathbf{a}} \|^2 + \| \mathbf{b} e^{\jmath \psi} - \hat{\mathbf{b}} \|^2 + \| \mathbf{c} e^{\jmath \chi} - \hat{\mathbf{c}} \|^2.$$

       • Save the results: distance$(i) = \delta$, phase$_\varphi(i) = \varphi$, phase$_\psi(i) = \psi$, and phase$_\chi(i) = \chi$;

  3. End do.

  4. Choose the $3R$ angles which return the smallest distance $\delta$.
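The effect of the angular constraint can be explored numerically. The sketch below does not root the degree-6 polynomial; as a simpler substitute it minimizes the distance over $(\varphi, \psi)$ with $\chi = -(\varphi + \psi)$ by successive grid refinement, which is a heuristic, and then searches all column permutations. All function names, grid sizes, and test phases are assumptions made for this illustration.

```python
import itertools
import numpy as np

def delta_constrained(a, b, c, a_hat, b_hat, c_hat, n=60, rounds=8):
    """Grid-refinement minimization of the distance under phi + psi + chi = 0 (mod 2 pi)."""
    za, zb, zc = np.vdot(a, a_hat), np.vdot(b, b_hat), np.vdot(c, c_hat)   # a^H a_hat, ...
    const = sum(np.linalg.norm(v) ** 2 for v in (a, b, c, a_hat, b_hat, c_hat))

    def delta(phi, psi):
        # chi = -(phi + psi) enforces the angular constraint.
        cross = (np.exp(-1j * phi) * za + np.exp(-1j * psi) * zb
                 + np.exp(1j * (phi + psi)) * zc)
        return const - 2.0 * np.real(cross)

    phi0, psi0, half = 0.0, 0.0, np.pi
    for _ in range(rounds):
        phis = phi0 + np.linspace(-half, half, n)
        psis = psi0 + np.linspace(-half, half, n)
        vals = delta(phis[:, None], psis[None, :])
        i, j = np.unravel_index(np.argmin(vals), vals.shape)
        phi0, psi0 = phis[i], psis[j]
        half *= 4.0 / n                         # zoom in around the best grid point
    return float(delta(phi0, psi0)), (phi0, psi0, -(phi0 + psi0))

def delta_unconstrained(a, b, c, a_hat, b_hat, c_hat):
    """Same distance when the three phases are optimized independently."""
    const = sum(np.linalg.norm(v) ** 2 for v in (a, b, c, a_hat, b_hat, c_hat))
    moduli = abs(np.vdot(a, a_hat)) + abs(np.vdot(b, b_hat)) + abs(np.vdot(c, c_hat))
    return float(const - 2.0 * moduli)

def performance_index(A, B, C, A_hat, B_hat, C_hat):
    """Exhaustive search over the R! column permutations, as in (25)."""
    best = np.inf
    for perm in itertools.permutations(range(A.shape[1])):
        total = sum(delta_constrained(A[:, r], B[:, r], C[:, r],
                                      A_hat[:, p], B_hat[:, p], C_hat[:, p])[0]
                    for r, p in enumerate(perm))
        best = min(best, total)
    return best

rng = np.random.default_rng(0)
def unit_cols(rows, R):
    M = rng.standard_normal((rows, R)) + 1j * rng.standard_normal((rows, R))
    return M / np.linalg.norm(M, axis=0)

A, B, C = unit_cols(3, 2), unit_cols(4, 2), unit_cols(5, 2)

# Phases 0.3, 0.4, 0.5 do not sum to 0 modulo 2 pi: the constraint is violated.
d_con, angles = delta_constrained(A[:, 0], B[:, 0], C[:, 0],
                                  A[:, 0] * np.exp(0.3j), B[:, 0] * np.exp(0.4j),
                                  C[:, 0] * np.exp(0.5j))
d_unc = delta_unconstrained(A[:, 0], B[:, 0], C[:, 0],
                            A[:, 0] * np.exp(0.3j), B[:, 0] * np.exp(0.4j),
                            C[:, 0] * np.exp(0.5j))

# Estimates equal to the true factors up to a column swap and admissible phases.
phi, psi = np.array([0.7, -1.1]), np.array([0.2, 2.0])
E = performance_index(A, B, C,
                      A[:, ::-1] * np.exp(1j * phi), B[:, ::-1] * np.exp(1j * psi),
                      C[:, ::-1] * np.exp(-1j * (phi + psi)))
```

In the first experiment the unconstrained distance is zero although the estimated triple cannot be aligned with the true one under the constraint, which illustrates why neglecting the constraint gives optimistic values.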

6 Simulation results

To evaluate the efficiency and behavior of the proposed algorithms with the new performance criterion, two experiments are carried out: the first one for random loading matrices and the second one for a DS-CDMA system. In all experiments, the results are obtained from 100 Monte Carlo runs. At each iteration and for every SNR value, a new noise realization is drawn. The stopping criterion chosen for all experiments is $\Upsilon(n) < \varepsilon$ and $|\Upsilon(n) - \Upsilon(n-1)| < \varepsilon$, where $\varepsilon$ is a threshold below which the global minimum is considered to be reached, and $n$ is the current iteration. In the following simulations, we take $\varepsilon = 10^{-6}$.

6.1 Example 1: random loading matrices

In order to illustrate the behavior and performance of the proposed algorithms, we first report simulation results obtained on random tensors. In a first stage, we compare 'Algorithm 2', 'Algorithm 1', and the gradient descent algorithm named herein 'Algorithm 0'. We analyze the matrix estimation errors obtained through the performance criterion proposed in Section 5. Two scenarios are analyzed using random tensors: (i) one tensor of rank 2 with dimensions 3 × 3 × 3 and (ii) another tensor of rank 5 with dimensions 8 × 8 × 8. Loading matrices are initialized with random-valued columns, and the two tensors are corrupted by additive Gaussian noise.

Figures 1 and 2 depict the matrix estimation errors defined in (25) as a function of SNR. As expected, Algorithm 0 performs poorly compared to Algorithms 1 and 2 (curves with diamonds and circles). This supports the idea that our algorithms isolating the scaling matrix are attractive. Furthermore, we check that when the phase constraints are neglected in the calculation of the performance measure, the results are significantly more optimistic, particularly at high SNR (curves of the same color), which supports the interest of using the performance index defined in Section 5.

In order to show the difference between Algorithms 1 and 2, we examine the convergence speed of the tensor reconstruction as a function of the number of iterations, since the final error is about the same (cf. Figures 1 and 2). We consider the case of a 3 × 3 × 3 tensor of rank 2, constructed from three 3 × 2 Gaussian matrices drawn randomly. The results are presented in Figure 3 and show that, in all our experiments, Algorithm 2 converged faster in terms of number of iterations, while Algorithm 1 and Algorithm 0 required more iterations to reach convergence.

Figure 1

Matrix estimation errors, with a random tensor of size 8 × 8 × 8 and rank 5. (a) Matrix A. (b) Matrix B. (c) Matrix S.

Figure 2

Sum of matrix estimation errors, with a random tensor of size 3 × 3 × 3 and rank 5. Note the asymptote depending on the maximum number of iterations executed.

Figure 3

Typical example of reconstruction error (10) as a function of the number of iterations, for a tensor of size 3 × 3 × 3 and rank 5.

6.2 Example 2: application to DS-CDMA system

In this example, we place ourselves in a blind context. We assume that the receiver has knowledge of neither the spreading codes nor the symbol sequences. Classically, blind techniques in telecommunications are based on some a priori knowledge, such as temporal properties of the transmitted signals or spatial properties of the receiver [25–27].

Recently, algebraic tensor methods have received considerable attention in signal processing [2]. It also turns out that multilinear algebra tools are often more powerful than their matrix equivalents. Sidiropoulos et al. were the first to adopt tensor approaches in the telecommunications field, in 2000 [12]. They observed that the samples of a CDMA signal received by an array of antennas can be stored in a cube, each dimension corresponding to a diversity (coding diversity, temporal diversity, and spatial diversity). Thus, they showed that the deterministic blind separation problem of CDMA signals can be solved by the CP decomposition [28].

In this example, we propose to apply the CP decomposition algorithms as detailed in Section 4 with the new performance criterion on the DS-CDMA technique. A comparison with the ALS algorithm is then made.

6.2.1 Tensor modeling

We consider R users with one transmitting antenna, transmitting simultaneously their signals to an array of K receiving antennas. In other words, we consider a communication system of type ‘multiuser SIMO’.

For example, assume that user $r$ transmits the symbol sequence $\mathbf{s}_r$ of length $J$. These symbols are spread by a CDMA code $\mathbf{c}_r$ of length $I$, uniquely allocated to user $r$ and assumed to be unknown at the receiver. These codes are binary sequences taking values in $\{-1, 1\}$ and may be non-orthogonal. The spread sequence propagates along a single path in a memoryless channel and is received by the antenna array under the angle of arrival $\theta_r$. Each of the $K$ antennas receives a signal $Y_k(t)$ of size $J \times I$. Our approach for detection and separation of the received signals is to exploit the multilinear algebraic structure of these signals using the new performance criterion. We observe these signals during a time span of length $J T_s$, where $T_s$ is the symbol period. The signals $Y_k(t)$ are sampled at the chip period $T_c = T_s / I$, where $I$ is the spreading factor. Therefore, each antenna provides a set of $IJ$ samples, and the samples of all antennas can be arranged in a tensor of order 3, denoted $\boldsymbol{\mathcal{Y}} \in \mathbb{C}^{I \times J \times K}$. The component $Y_{ijk}$ of this tensor, which corresponds to the sample of the overall signal received by the $k$th antenna at chip $i$ of the $j$th symbol period, can be written as follows:

$$Y_{ijk} = \sum_{r=1}^{R} A_{kr} S_{jr} C_{ir}, \quad i \in \{1,\dots,I\},\ j \in \{1,\dots,J\},\ k \in \{1,\dots,K\}, \tag{26}$$

where the complex scalar $A_{kr} = \beta_r a_k(\theta_r)$, with $\beta_r$ the fading coefficient of the $r$th user and $a_k(\theta_r)$ the response of antenna $k$ at the angle of arrival $\theta_r$.
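The trilinear model of the received samples can be simulated in a few lines. In the sketch below, the sizes follow the scenario of Section 6.2.2, but the angles of arrival, the unit fading coefficients, and the half-wavelength uniform linear array response are hypothetical choices, since the text does not specify them.

```python
import numpy as np

rng = np.random.default_rng(1)
R, I, J, K = 4, 10, 20, 4            # users, code length, symbols per user, antennas

C = rng.choice([-1.0, 1.0], size=(I, R))                       # spreading codes in {-1, 1}
qpsk = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)
S = rng.choice(qpsk, size=(J, R))                              # QPSK symbol sequences

theta = np.deg2rad([-40.0, -10.0, 20.0, 50.0])                 # hypothetical angles of arrival
beta = np.ones(R)                                              # hypothetical fading coefficients
k = np.arange(K)[:, None]
A = beta * np.exp(1j * np.pi * k * np.sin(theta))              # assumed half-wavelength ULA response

# Noiseless received tensor: Y[i, j, k] = sum_r A[k, r] * S[j, r] * C[i, r].
Y = np.einsum("kr,jr,ir->ijk", A, S, C)
```

Any unfolding of `Y` has rank at most $R$, which is the structure exploited by the CP decomposition to separate the users.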

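Separating the received signals is then equivalent to decomposing the tensor $\boldsymbol{\mathcal{Y}}$ into a sum of $R$ contributions, where $R$ represents the number of active users in the system. To calculate the CP decomposition of $\boldsymbol{\mathcal{Y}}$, we resort to Algorithm 2 presented in Section 4 with performance index (25). The detection and separation of the matrix $\mathbf{S}$ of transmitted symbols is carried out using the following objective function:

$$\Upsilon(\mathbf{A}, \mathbf{C}, \mathbf{S}, \boldsymbol{\Lambda}) = \Big\| \boldsymbol{\mathcal{Y}} - \sum_{r=1}^{R} \lambda_r\, \mathbf{a}_r \circ \mathbf{c}_r \circ \mathbf{s}_r \Big\|_F^2, \tag{27}$$

where $\mathbf{a}_r$, $\mathbf{c}_r$, and $\mathbf{s}_r$ are normalized vectors; the scaling ambiguities on the estimated symbols are eliminated by normalizing each symbol sequence by its norm and calculating the exact scaling factor $\boldsymbol{\Lambda}$. To correct the phases of the three estimated matrices, we use the exact performance index (28), in the sense of Section 5:

$$E(\boldsymbol{\mathcal{T}}; \mathbf{A}, \mathbf{C}, \mathbf{S}, \boldsymbol{\Lambda}) = \min_{\pi \in \Pi} \sum_{r=1}^{R} \min_{\varphi, \psi, \chi} \| \mathbf{a}_r e^{\jmath \varphi} - \hat{\mathbf{a}}_{\pi(r)} \|^2 + \| \mathbf{c}_r e^{\jmath \psi} - \hat{\mathbf{c}}_{\pi(r)} \|^2 + \| \mathbf{s}_r e^{\jmath \chi} - \hat{\mathbf{s}}_{\pi(r)} \|^2. \tag{28}$$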

6.2.2 Simulation

In this experiment, we present the performance of the receiver algorithms (Algorithm 1 and Algorithm 2 with the new performance criterion), which blindly estimate the symbol matrix $\mathbf{S}$.

We consider $R = 4$ users communicating simultaneously in the same bandwidth. Each user transmits a sequence of $J = 20$ consecutive symbols and is uniquely assigned a spreading sequence of length $I = 10$. The user symbols are generated from an i.i.d. distribution and are modulated using a pseudo-random quaternary phase shift keying (QPSK) sequence. The signal is transmitted to a receiver with $K = 4$ antennas. The angles of arrival are described in Table 1.

Figure 4 illustrates the ability of the blind tensorial receiver Algorithm 2 using the new performance criterion, and of the ALS receiver, to extract noisy data. Using Monte Carlo simulations, we plot the bit error rate (BER) as a function of the signal-to-noise ratio (SNR). The results shown in Figure 4 indicate that the performance of the proposed receiver Algorithm 2 is better, since it converges faster than the ALS algorithm. This is tied to the normalization of the factor matrices and the exact calculation of the scaling factor. Moreover, we can see that the BER of Algorithm 2 without our performance criterion is very optimistic, which supports the interest of our study.

Table 1 Angles of arrival for four users

Figure 4

Bit error rate (BER) versus SNR results for scenario 1: R = 4, K = 4, I = 10, and J = 20.

The impact of the ratio $K/R$, which represents the number of receiving antennas per user, is illustrated in Figure 5. In this figure, the first curve corresponds to $K = 4$ antennas and $R = 2$ users, the second to $K = 4$ and $R = 4$, and the third to $K = 2$ and $R = 5$. The angles of arrival are described in Table 2. The overall system performance improves as the ratio $K/R$ increases, indicating the importance of spatial diversity.

Figure 5

Bit error rate (BER) versus SNR results. Receiver performance for {two users, four antennas}, {four users, four antennas}, and {five users, two antennas}.

Table 2 Angles of arrival for three scenarios

7 Conclusions

In this paper, we have shown in Section 3.2 that, in CP tensor decompositions, the optimal value of the scale matrix $\boldsymbol{\Lambda}$ is obtained from a linear system whose Gram matrix controls the conditioning of the problem. This shows that bounding coherences makes it possible to ensure a minimal conditioning. We have described several algorithms able to compute the minimal polyadic decomposition of three-way arrays. The two proposed algorithms, Algorithm 1 and Algorithm 2, which involve a separate explicit calculation of the scale matrix $\boldsymbol{\Lambda}$, have been described and tested. Contrary to the performance measures used in the literature, which are optimistic by construction, the performance index calculated herein is more realistic because it takes the angular constraint into account. An application of the CP decomposition with the exact performance criterion to a DS-CDMA system has been presented. Finally, computer simulations have been performed in the context of a SIMO-CDMA system, in order to demonstrate both the good performance of the proposed algorithms compared to ALS and their usefulness in CDMA systems. The assessment of our algorithms does not rely solely on the reconstruction error and the convergence speed; it also takes into account the error in the estimated loading matrices and the BER in the case of the CDMA application.


Appendix 1

Detailed Λ optimal calculation

We intend to calculate the optimal value of Λ = Diag(λ) which minimizes the following expression:

$$
\Upsilon(A,B,C,\Lambda) = \left\| T - (A,B,C)\cdot\Lambda \right\|_F^2 .
$$

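Expanding the norm leads to:

$$
\begin{aligned}
\Upsilon(A,B,C,\Lambda) ={}& \|T\|^2 - \sum_{ijk}\sum_{p} T_{ijk}\,\lambda_p^{*} A_{ip}^{*} B_{jp}^{*} C_{kp}^{*} - \sum_{ijk}\sum_{q} T_{ijk}^{*}\,\lambda_q A_{iq} B_{jq} C_{kq} \\
& + \sum_{pq}\sum_{ijk} \lambda_p^{*} A_{ip}^{*} B_{jp}^{*} C_{kp}^{*}\,\lambda_q A_{iq} B_{jq} C_{kq} \\
={}& \|T\|^2 - \sum_{p} \lambda_p^{*} f_p - \sum_{q} \lambda_q f_q^{*} + \sum_{pq} \lambda_p^{*} \lambda_q G_{pq},
\end{aligned}
$$

where $G_{pq} = \sum_{ijk} A_{ip}^{*} B_{jp}^{*} C_{kp}^{*} A_{iq} B_{jq} C_{kq}$ and $f_q = \sum_{ijk} T_{ijk} A_{iq}^{*} B_{jq}^{*} C_{kq}^{*}$. By canceling the derivative of Υ with respect to $\lambda^{*}$, we find the following linear system:

$$
G\,\lambda = f .
$$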
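The calculation of this appendix can be checked numerically. The sketch below is illustrative only and assumes complex factor matrices A, B, C with R columns each; it builds G and f as defined above (with the conjugations that the complex case requires) and solves a linear system of the form Gλ = f. The names `optimal_scaling` and `cp_tensor` are hypothetical helpers, not functions from the paper.

```python
import numpy as np

def cp_tensor(lam, A, B, C):
    """Tensor with entries sum_p lam_p A_ip B_jp C_kp."""
    return np.einsum("p,ip,jp,kp->ijk", lam, A, B, C)

def optimal_scaling(T, A, B, C):
    """Scaling vector lam minimising ||T - cp_tensor(lam, A, B, C)||_F^2."""
    # G_pq = sum_ijk conj(A_ip B_jp C_kp) A_iq B_jq C_kq
    #      = (A^H A)_pq (B^H B)_pq (C^H C)_pq  (entrywise product)
    G = (A.conj().T @ A) * (B.conj().T @ B) * (C.conj().T @ C)
    # f_p = sum_ijk T_ijk conj(A_ip B_jp C_kp)
    f = np.einsum("ijk,ip,jp,kp->p", T, A.conj(), B.conj(), C.conj())
    # Stationarity with respect to conj(lam) gives G lam = f
    return np.linalg.solve(G, f), G, f

# Illustrative data: a rank-3 complex tensor with known scaling factors
rng = np.random.default_rng(1)
A, B, C = (rng.standard_normal((n, 3)) + 1j * rng.standard_normal((n, 3)) for n in (4, 5, 6))
lam_true = rng.standard_normal(3) + 1j * rng.standard_normal(3)
T = cp_tensor(lam_true, A, B, C)
lam_hat, G, f = optimal_scaling(T, A, B, C)
```

On noiseless data the solver returns the scaling factors used to build the tensor; on noisy data it returns the least-squares scaling for the given A, B and C. The conditioning of G directly affects the accuracy of this solve.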


Appendix 2

Performance criterion details

In this appendix, we explain in more detail how to obtain the performance index δ, and in particular how the phases (φ, ψ, χ) are calculated. Setting χ = −φ − ψ [2π], equation (23) can be rewritten as:

$$
\delta = \|a\|^2 + \|\hat{a}\|^2 + \|b\|^2 + \|\hat{b}\|^2 + \|s\|^2 + \|\hat{s}\|^2 - 2\rho_a \cos(\varphi - \alpha) - 2\rho_b \cos(\psi - \beta) - 2\rho_s \cos(\varphi + \psi + \gamma),
$$

where $a^{H}\hat{a} \overset{\text{def}}{=} \rho_a e^{\jmath\alpha}$, $b^{H}\hat{b} \overset{\text{def}}{=} \rho_b e^{\jmath\beta}$ and $s^{H}\hat{s} \overset{\text{def}}{=} \rho_s e^{\jmath\gamma}$. Stationary points are given by the solutions of the following trigonometric system, written e.g. with $x = \varphi - \alpha$ and $y = \psi - \beta$ as unknowns:

$$
\begin{aligned}
\rho_a \sin x + \rho_s \sin(\varphi + \psi + \gamma) &= 0, \\
\rho_b \sin y + \rho_s \sin(\varphi + \psi + \gamma) &= 0 .
\end{aligned}
$$

The first simplification is achieved by noting that

$$
-\rho_s \sin(\varphi + \psi + \gamma) = \rho_a \sin x = \rho_b \sin y,
$$

so that

$$
\sin y = \frac{\rho_a}{\rho_b} \sin x .
$$

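Now, using trigonometric identities, we can rewrite the first equation of the trigonometric system as

$$
\rho_s \sin(\varphi + \psi + \gamma) = \rho_s \sin(x + y + \alpha + \beta + \gamma) = -\rho_a \sin x,
$$

that is,

$$
\begin{aligned}
-\rho_a \sin x = \rho_s \big[ & \sin x \cos y \cos(\alpha+\beta+\gamma) + \sin y \cos x \cos(\alpha+\beta+\gamma) \\
& + \cos x \cos y \sin(\alpha+\beta+\gamma) - \sin x \sin y \sin(\alpha+\beta+\gamma) \big].
\end{aligned}
$$

Letting $\cos y = \sqrt{1 - \sin^2 y}$ and $\sin y = \frac{\rho_a}{\rho_b}\sin x$, we obtain

$$
\begin{aligned}
-\rho_a \sin x ={}& \rho_s \frac{\rho_a}{\rho_b} \sin x \cos x \cos(\alpha+\beta+\gamma) - \rho_s \frac{\rho_a}{\rho_b} \sin^2 x \, \sin(\alpha+\beta+\gamma) \\
& + \big[ \rho_s \sin x \cos(\alpha+\beta+\gamma) + \rho_s \cos x \sin(\alpha+\beta+\gamma) \big] \sqrt{1 - \Big(\frac{\rho_a}{\rho_b}\Big)^{2} \sin^2 x } .
\end{aligned}
$$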

The goal of the next step is to eliminate the square root and to rewrite the equation in terms of sin x or cos x only. To this end, we square both sides of this equation and use trigonometric identities such as $\cos^2 x = \frac{1+\cos(2x)}{2}$, $\sin^2 x = \frac{1-\cos(2x)}{2}$, $\cos x \sin x = \frac{\sin(2x)}{2}$, and $\cos^2(2x) + \sin^2(2x) = 1$. Thus, after simplification we obtain

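$$
\begin{aligned}
& \frac{\rho_a^2}{2} + \frac{1}{2}\Big(\frac{\rho_s\rho_a}{\rho_b}\Big)^{2} - \frac{\rho_s^2}{2}
+ \Big[ -\frac{\rho_a^2}{2} - \frac{1}{2}\Big(\frac{\rho_s\rho_a}{\rho_b}\Big)^{2} + \frac{\rho_s^2}{2}\cos^2(\alpha+\beta+\gamma) - \frac{\rho_s^2}{2}\sin^2(\alpha+\beta+\gamma) \Big] \cos(2x) \\
& - \rho_s^2 \cos(\alpha+\beta+\gamma)\sin(\alpha+\beta+\gamma) \sqrt{1-\cos^2(2x)} \\
& + \Big[ \frac{\rho_s\rho_a^2}{\rho_b}\cos(\alpha+\beta+\gamma) \sqrt{1-\cos^2(2x)} - \frac{\rho_s\rho_a^2}{\rho_b}\sin(\alpha+\beta+\gamma)\,\big(1-\cos(2x)\big) \Big] \sqrt{\frac{1-\cos(2x)}{2}} = 0 .
\end{aligned}
$$

Here sin(2x) and sin x have been written as $\sqrt{1-\cos^2(2x)}$ and $\sqrt{(1-\cos(2x))/2}$, respectively, and the factor 1/2 of the identity $\cos x \sin x = \sin(2x)/2$ is included in the coefficients; the signs of the square roots are immaterial after the subsequent squarings.

In the same way as above, we square both sides of the resulting equation twice to eliminate the square roots. Finally, we get an equation of degree six in cos(2x) of the form

$$
c_0 + c_1 \cos(2x) + c_2 \cos^2(2x) + c_3 \cos^3(2x) + c_4 \cos^4(2x) + c_5 \cos^5(2x) + c_6 \cos^6(2x) = 0,
$$

with

$$
\begin{aligned}
c_0 &= c_1'^{\,2} - c_5'^{\,2}, \\
c_1 &= 2 c_1' c_4' - 2 c_5' c_7', \\
c_2 &= 2 c_1' c_3' + c_4'^{\,2} - 2 c_5' c_6' - c_7'^{\,2} + c_5'^{\,2}, \\
c_3 &= 2 c_1' c_2' + 2 c_3' c_4' - 2 c_6' c_7' + 2 c_5' c_7', \\
c_4 &= c_3'^{\,2} + 2 c_2' c_4' - c_6'^{\,2} + 2 c_5' c_6' + c_7'^{\,2}, \\
c_5 &= 2 c_2' c_3' + 2 c_6' c_7', \\
c_6 &= c_2'^{\,2} + c_6'^{\,2},
\end{aligned}
$$

where

$$
\begin{aligned}
c_1' &= c_1''^{\,2} + c_3''^{\,2} - \tfrac{1}{2} c_4''^{\,2} - \tfrac{1}{2} c_5''^{\,2}, \\
c_2' &= -\tfrac{1}{2} c_4''^{\,2} + \tfrac{1}{2} c_5''^{\,2}, \\
c_3' &= c_2''^{\,2} - c_3''^{\,2} + \tfrac{1}{2} c_4''^{\,2} - \tfrac{3}{2} c_5''^{\,2}, \\
c_4' &= 2 c_1'' c_2'' + \tfrac{1}{2} c_4''^{\,2} + \tfrac{3}{2} c_5''^{\,2}, \\
c_5' &= 2 c_1'' c_3'' + c_4'' c_5'', \\
c_6' &= c_4'' c_5'', \\
c_7' &= 2 c_2'' c_3'' - 2 c_4'' c_5'',
\end{aligned}
$$

and

$$
\begin{aligned}
c_1'' &= \tfrac{1}{2}\rho_a^2 + \tfrac{1}{2}\Big(\frac{\rho_s\rho_a}{\rho_b}\Big)^{2} - \tfrac{1}{2}\rho_s^2, \\
c_2'' &= -\tfrac{1}{2}\rho_a^2 - \tfrac{1}{2}\Big(\frac{\rho_s\rho_a}{\rho_b}\Big)^{2} + \rho_s^2\Big(\cos^2(\alpha+\beta+\gamma) - \tfrac{1}{2}\Big), \\
c_3'' &= -\rho_s^2 \cos(\alpha+\beta+\gamma)\sin(\alpha+\beta+\gamma), \\
c_4'' &= \frac{\rho_s\rho_a^2}{\rho_b}\cos(\alpha+\beta+\gamma), \\
c_5'' &= \frac{\rho_s\rho_a^2}{\rho_b}\sin(\alpha+\beta+\gamma).
\end{aligned}
$$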

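Solving the sixth-degree equation for cos(2x) yields x. Substituting x into $\sin y = \frac{\rho_a}{\rho_b}\sin x$ then yields y. Since the successive squarings introduce spurious solutions, the candidate pairs (x, y) are finally compared through the value of δ, and the pair giving the smallest value is retained.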
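The procedure of this appendix can be sketched in a few lines. The code below is a hedged illustration rather than the authors' implementation: it assumes ρ_a, ρ_b, ρ_s > 0, writes the degree-six polynomial as P(u)² − (1 − u²)Q(u)² with u = cos(2x), where P and Q are the cubic and quadratic polynomials obtained before the last squaring (with the factors of the identity cos x sin x = sin(2x)/2 absorbed into the coefficients), and selects among all candidate pairs (x, y) the one with the smallest phase-dependent part of δ. The function name `best_phases` and the tolerance `tol` are hypothetical.

```python
import numpy as np

def best_phases(rho_a, rho_b, rho_s, alpha, beta, gamma, tol=1e-6):
    """Return (value, phi, psi, chi) minimising the phase-dependent part of delta."""
    theta = alpha + beta + gamma
    r = rho_a / rho_b
    K = rho_s * rho_a**2 / rho_b
    # Coefficients of the equation obtained after the first squaring
    c1 = 0.5 * rho_a**2 + 0.5 * (rho_s * r)**2 - 0.5 * rho_s**2
    c2 = -0.5 * rho_a**2 - 0.5 * (rho_s * r)**2 + rho_s**2 * (np.cos(theta)**2 - 0.5)
    c3 = -rho_s**2 * np.cos(theta) * np.sin(theta)
    c4 = K * np.cos(theta)
    c5 = K * np.sin(theta)
    # P(u) + sin(2x) Q(u) = 0 with u = cos(2x); coefficients, highest power first
    P = np.array([-0.5 * c4**2 + 0.5 * c5**2,
                  c2**2 - c3**2 + 0.5 * c4**2 - 1.5 * c5**2,
                  2 * c1 * c2 + 0.5 * c4**2 + 1.5 * c5**2,
                  c1**2 + c3**2 - 0.5 * c4**2 - 0.5 * c5**2])
    Q = np.array([c4 * c5, 2 * c2 * c3 - 2 * c4 * c5, 2 * c1 * c3 + c4 * c5])
    # Degree-six polynomial P^2 - (1 - u^2) Q^2
    poly = np.convolve(P, P) - np.convolve([-1.0, 0.0, 1.0], np.convolve(Q, Q))

    def cost(x, y):
        # Phase-dependent part of delta, expressed in x = phi - alpha, y = psi - beta
        return (-2 * rho_a * np.cos(x) - 2 * rho_b * np.cos(y)
                - 2 * rho_s * np.cos(x + y + theta))

    best = None
    for u in np.roots(poly):
        if abs(u.imag) > tol or abs(u.real) > 1 + tol:
            continue  # keep only real roots that are valid cosines
        a = 0.5 * np.arccos(np.clip(u.real, -1.0, 1.0))
        for x in (a, -a, a + np.pi, -a + np.pi):  # all x with cos(2x) = u
            sin_y = r * np.sin(x)
            if abs(sin_y) > 1 + tol:
                continue
            y0 = np.arcsin(np.clip(sin_y, -1.0, 1.0))
            for y in (y0, np.pi - y0):  # both branches of arcsin
                value = cost(x, y)
                if best is None or value < best[0]:
                    best = (value, x + alpha, y + beta)
    value, phi, psi = best
    return value, phi, psi, -phi - psi
```

A call such as `best_phases(1.3, 0.9, 1.1, 0.4, -1.0, 2.2)` returns the minimal phase-dependent part of δ together with φ, ψ and χ = −φ − ψ; adding the six squared norms gives δ itself. Spurious roots need no explicit filtering here, since every candidate is evaluated on the true cost and only the smallest value is kept.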


References

  1. Jutten C, Hérault J: Blind separation of sources, part I: an adaptive algorithm based on neuromimetic architecture. Signal Process 1991, 24: 1-20. 10.1016/0165-1684(91)90079-X

  2. Comon P, Jutten C: Handbook of Blind Source Separation, Independent Component Analysis and Applications. Academic Press, Oxford UK, Burlington USA; 2010. ISBN: 978-0-12-374726-6

  3. Harshman RA: Foundations of the Parafac procedure: models and conditions for an “explanatory” multi-modal factor analysis. UCLA working papers in phonetics (University Microfilms, Ann Arbor, Michigan, No. 10,085) 1970, 16: 1-84.

  4. Comon P: Tensors: a brief introduction. IEEE Sig. Proc. Mag 2014, 31(3):44-53. Special issue on BSS. hal-00923279

  5. Kolda TG, Bader BW: Tensor decompositions and applications. SIAM Rev 2009, 51(3):455-500. 10.1137/07070111X

  6. Comon P: Tensor decompositions, state of the art and applications. In IMA Conf. Mathematics in Signal Processing. Clarendon Press, Oxford, UK; 2000:1-24.

  7. Almeida ALFD, Favier G, Mota JCM: The constrained trilinear decomposition with application to MIMO wireless communication systems. In GRETSI’07. Colloque GRETSI. Troyes; 2007.

  8. Comon P, Luciani X, De Almeida ALF: Tensor decompositions, alternating least squares and other tales. J. Chemometrics 2009, 23: 393-405. 10.1002/cem.1236

  9. Sorber L, Van Barel M, De Lathauwer L: Optimization-based algorithms for tensor decompositions: canonical polyadic decomposition, decomposition in rank-s terms and a new generalization. SIAM J. Optimization 2013, 23(2):695-720. 10.1137/120868323

  10. Comon P, Minaoui K, Rouijel A, Aboutajdine D: Performance index for tensor polyadic decompositions. In 21st EUSIPCO Conference. IEEE, Marrakech, Morocco; 2013.

  11. ten Berge JMF, Sidiropoulos ND: On uniqueness in candecomp/parafac. Psychometrika 2002, 67(3):399-409. 10.1007/BF02294992

  12. Sidiropoulos ND, Bro R, Giannakis GB: Parallel factor analysis in sensor array processing. IEEE Trans. Sig. Proc 2000, 48(8):2377-2388. 10.1109/78.852018

  13. Kruskal JB: Three-way arrays: rank and uniqueness of trilinear decompositions. Linear Algebra Appl 1977, 18: 95-138. 10.1016/0024-3795(77)90069-6

  14. ten Berge JMF, Sidiropoulos ND: On uniqueness in candecomp/parafac. Psychometrika 2002, 67(3):399-409. 10.1007/BF02294992

  15. Lim L-H, Comon P: Blind multilinear identification. IEEE Trans. Inf. Theory 2014, 60(2):1260-1280.

  16. Comon P, Lim L-H: Sparse representations and low-rank tensor approximation. Research Report ISRN I3S//RR-2011-01-FR, I3S, Sophia-Antipolis, France (February 2011).

  17. Lim L-H, Comon P: Multiarray signal processing: tensor decomposition meets compressed sensing. Compte-Rendus Mécanique de l’Academie des Sci 2010, 338(6):311-320.

  18. Gribonval R, Nielsen M: Sparse representations in unions of bases. IEEE Trans. Inf. Theory 2003, 49(13):3320-3325.

  19. Acar E, Dunlavy DM, Kolda TG: A scalable optimization approach for fitting canonical tensor decompositions. J. Chemometrics 2011, 25: 67-86. 10.1002/cem.1335

  20. Carroll J, Chang J: Analysis of individual differences in multidimensional scaling via an n-way generalization of “eckart-young” decomposition. Psychometrika 1970, 35: 283-319. 10.1007/BF02310791

  21. Smilde A, Bro R, Geladi P: Multi-Way Analysis. Wiley, Chichester UK; 2004.

  22. Boyd S, Vandenberghe L: Convex Optimization. Cambridge University Press, the United States of America, New York; 2004. ISBN: 978-521-83378-3 hardback

  23. Comon P: Estimation multivariable complexe. Traitement du Signal 1986, 3(2):97-101.

  24. Hjorungnes A, Gesbert D: Complex valued matrix differentiation: techniques and key results. IEEE Trans. Signal Process 2007, 55(6):2740-2746.

  25. Moulines E, Duhamel P, Cardoso JF, Mayrargue S: Subspace methods for the blind identification of multichannel fir filters. IEEE Trans. Signal Process 1995, 43: 516-525. 10.1109/78.348133

  26. Godard DN: Self-recovering equalization and carrier tracking in two-dimensional data communication systems. IEEE Trans. Commun 1980, 28: 1867-1875. 10.1109/TCOM.1980.1094608

  27. van der Veen AJ, Paulraj A: An analytical constant modulus algorithm. IEEE Trans. Signal Proc 1996, 44: 1136-1155. 10.1109/78.502327

  28. de Almeida ALF, Favier G, Mota JCM: Parafac-based unified tensor modeling for wireless communication systems with application to blind multiuser equalization. Signal Process 2007, 87(2):337-351. 10.1016/j.sigpro.2005.12.014

Author information


Corresponding author

Correspondence to Awatif Rouijel.

Additional information

Competing interests

The authors declare that they have no competing interests.

Rights and permissions

Open Access. This article is distributed under the terms of the Creative Commons Attribution 2.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Rouijel, A., Minaoui, K., Comon, P. et al. CP decomposition approach to blind separation for DS-CDMA system using a new performance index. EURASIP J. Adv. Signal Process. 2014, 128 (2014).
