Local low-rank approach to nonlinear matrix completion

This paper deals with a matrix-completion problem in which each column vector of the matrix belongs to a low-dimensional differentiable manifold (LDDM), with the target matrix being high or full rank. To solve this problem, algorithms based on polynomial mapping and matrix-rank minimization (MRM) have been proposed; such methods assume that each column vector of the target matrix is generated as a vector in a low-dimensional linear subspace (LDLS) and mapped by a $p$-th order polynomial, and that the matrix whose column vectors are the $d$-th order monomial features of the target column vectors is rank deficient. However, a large number of columns and observed values are needed to solve the MRM problem exactly with this approach when $p$ is large. This paper therefore proposes a new method that obtains the solution by minimizing the ranks of local submatrices, without transforming the target matrix, so as to achieve high estimation accuracy even when the number of columns is small. The method is based on the assumption that an LDDM can be approximated locally by an LDLS: tangent hyperplanes of dimension equal to that of the manifold are assumed to exist at each column of the matrix, and the sum of the ranks of the local submatrices formed by the nearest neighborhoods of each column is minimized. Numerical examples show that the proposed algorithm offers higher completion accuracy than other low-rank approaches when each column vector is given by a $p$-th order polynomial mapping of a latent feature; in particular, the proposed method is suitable when the order $p$ and the dimension of the latent space are high.


Introduction
This paper deals with the following completion problem for a matrix $X \in \mathbb{R}^{M \times N}$ on a low-dimensional differentiable manifold (LDDM) $\mathcal{M}_r$:

$$\text{Find } X = [x_1 \; x_2 \; \cdots \; x_N] \;\; \text{subject to} \;\; (X)_{m,n} = (X^{(0)})_{m,n} \;\; \text{for } (m,n) \in \Omega, \quad x_i \in \mathcal{M}_r \;\; \text{for all } i \in I, \tag{1}$$

where the $(m,n)$-th element of a matrix is denoted by $(\cdot)_{m,n}$, $I$ is an index set defined as $I = \{1, 2, \cdots, N\}$, and $\mathcal{M}_r \subset \mathbb{R}^M$, $\Omega$, and $X^{(0)}$ denote an unknown $r$-dimensional differentiable manifold, a given index set, and a given observed matrix, respectively. If $\mathcal{M}_r$ is an unknown low-dimensional linear subspace (LDLS), then this is a low-rank matrix-completion problem, and many algorithms have been proposed [5,6,7,8,9,10] to obtain solutions to it with high estimation accuracy. The low-rank matrix-completion problem has various applications in signal processing, including collaborative filtering [1], low-order model fitting and system identification [2], image inpainting [3], and human-motion recovery [4], all of which are formulated as signal-recovery or estimation problems. However, in most practical applications the column vectors of the matrix do not belong to an LDLS, i.e., $\mathcal{M}_r$ is not an LDLS, and these algorithms then fail to achieve high performance.

As an example, a matrix is of high rank when its column vectors lie on a union of linear subspaces (UoLS); several methods have been proposed to solve this high-rank matrix-completion problem [11,12,13,14], all of which are based on subspace clustering [15]. In particular, [14] proposed an algebraic-variety approach known as variety-based matrix completion (VMC), which exploits the fact that the monomial features of each column vector belong to an LDLS when the column vectors belong to a UoLS. This approach poses a rank-minimization problem over the Gram matrix of the monomial features and relaxes it into rank minimization of a polynomial-kernel matrix. Unfortunately, these algorithms recover a matrix only when $\mathcal{M}_r$ can be approximately divided into several LDLSs and do not work well otherwise.
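To make the setting concrete, the following sketch builds a small hypothetical instance of problem (1): columns sampled from a one-dimensional manifold embedded in $\mathbb{R}^3$, with a random observation set $\Omega$. All names and parameter values here are illustrative and not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
M, N = 3, 50

# Columns sampled from a 1-dimensional manifold (a curve embedded in R^3):
# the resulting matrix X0 is full rank, so this is not a plain low-rank
# completion problem even though the latent dimension r equals 1.
t = rng.uniform(0.0, 2.0 * np.pi, N)
X0 = np.vstack([np.cos(t), np.sin(t), np.cos(t) * np.sin(t)])

# Observation index set Omega: each entry observed with sampling rate q.
q = 0.6
Omega = rng.random((M, N)) < q
X_obs = np.where(Omega, X0, np.nan)   # unobserved entries marked as NaN
print(np.linalg.matrix_rank(X0))      # 3: full rank despite r = 1
```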
To solve the matrix-completion problem on a general LDDM, some nonlinear methods have been proposed [16,17]. These methods are based on a kind of kernel principal component analysis [18], which assumes that the dimension of the subspace spanned by the nonlinearly mapped column vectors is low. They formulate the matrix-completion problem as a low-rank approximation problem for the kernel matrix, in common with [14]; however, they require a large number of observed entries to solve the problem, and their completion accuracy declines when the number of observed entries is small.
In the present paper, a new method is proposed that uses neither monomial features nor the kernel method to achieve high completion accuracy. Based on an idea similar to that of locally linear embedding [19,20], this paper assumes that an LDDM can be approximated locally by an LDLS, because there exist tangent hyperplanes whose dimension equals that of the manifold. The matrix-completion problem is then formulated as one of minimizing the rank of each local submatrix of $X$ whose columns are nearest neighbors of one another. This paper is organized as follows. Section 2 introduces related work. Section 3 proposes a local low-rank approach (LLRA) for solving the matrix-completion problem on an LDDM, and the convergence properties of the proposed algorithm are shown in Section 4. Finally, Section 5 presents numerical examples illustrating that the proposed algorithm achieves higher accuracy than other low-rank approaches.

Related works
Here we focus on matrix-completion algorithms based on matrix-rank minimization (MRM) on an unknown manifold $\mathcal{M}_r$. Subsection 2.1 introduces algorithms for the case where $\mathcal{M}_r$ is an $r$-dimensional linear subspace; Subsection 2.2 then describes algorithms that use the polynomial kernel for a UoLS and an LDDM.

Matrix-rank Minimization for Linear Subspace
Most algorithms for matrix completion deal with the case where the manifold $\mathcal{M}_r$ is an LDLS [5,6,7,9]. In this case, since the dimension $r$ is unknown, they formulate the matrix-completion problem as the following MRM problem to estimate $r$ and restore $X$ simultaneously:

$$\underset{X}{\text{Minimize}} \;\; \mathrm{rank}(X) \;\; \text{subject to} \;\; (X)_{m,n} = (X^{(0)})_{m,n} \;\; \text{for } (m,n) \in \Omega. \tag{2}$$

Since this problem is generally NP-hard, several surrogate functions have been proposed, such as the nuclear norm [5], the truncated nuclear norm [9], and the Schatten norm [7]. These algorithms recover $X$ well if $X$ can be approximated by a low-rank matrix.
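As a reference point for the nuclear-norm surrogate of (2), the following is a minimal sketch of matrix completion by iterative singular-value thresholding in the spirit of [5]; the step parameters and stopping rule are illustrative choices, not the settings used in the cited works.

```python
import numpy as np

def svt_complete(X_obs, Omega, tau=1.0, n_iter=500):
    """Nuclear-norm matrix completion by singular-value thresholding.

    X_obs : matrix holding the observed values (entries outside Omega ignored)
    Omega : boolean mask of observed entries
    """
    X = np.where(Omega, X_obs, 0.0)
    for _ in range(n_iter):
        # Proximal step for the nuclear norm: shrink all singular values.
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        X = U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt
        # Project back onto the observation constraint of (2).
        X[Omega] = X_obs[Omega]
    return X
```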

High-rank Matrix Completion with the Kernel Method
To recover a high-rank matrix whose columns belong to a UoLS or an LDDM, some algorithms have been proposed that minimize the rank of its kernel matrix [14,16,17].
In [14], the authors focused on a matrix-completion problem on a union of $d$ linear subspaces $\bigcup_{k=1}^{d} S_k$, where $S_k$ denotes an LDLS of dimension $r$ or lower. Since the matrix rank is high or full in this problem, the MRM approach does not achieve high performance. To solve this matrix-completion problem, an algebraic-variety-model approach was proposed, based on the fact that the monomial features of each column vector $x_i \in \bigcup_{k=1}^{d} S_k$ belong to an LDLS. Here, the monomial features $\psi_d(x)$ of a vector $x$ are defined as the vector collecting all monomials of degree $d$ in the entries of $x$, so the matrix $\psi_d(X) = [\psi_d(x_1) \; \cdots \; \psi_d(x_N)]$ is rank deficient, and the high-rank matrix-completion problem is formulated as follows:

$$\underset{X}{\text{Minimize}} \;\; \mathrm{rank}\big(\psi_d(X)\big) \;\; \text{subject to} \;\; (X)_{m,n} = (X^{(0)})_{m,n} \;\; \text{for } (m,n) \in \Omega.$$
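The rank deficiency of $\psi_d(X)$ is easy to observe numerically. The sketch below enumerates degree-$d$ monomials with itertools (one common convention for $\psi_d$; the exact ordering and weighting in [14] may differ) and checks the rank for columns drawn from a union of two lines in $\mathbb{R}^3$.

```python
import numpy as np
from itertools import combinations_with_replacement

def monomial_features(x, d):
    """All degree-d monomials of the entries of x (one common convention)."""
    return np.array([np.prod([x[i] for i in idx])
                     for idx in combinations_with_replacement(range(len(x)), d)])

# Columns drawn from a union of two 1-dimensional subspaces in R^3.
rng = np.random.default_rng(1)
u, v = rng.standard_normal(3), rng.standard_normal(3)
X = np.column_stack([c * (u if j % 2 else v)
                     for j, c in enumerate(rng.standard_normal(20))])

Psi = np.column_stack([monomial_features(X[:, j], 2) for j in range(X.shape[1])])
print(np.linalg.matrix_rank(X), np.linalg.matrix_rank(Psi))  # 2 and 2: Psi is
# rank deficient although it has 6 rows
```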
This problem can be solved efficiently by replacing $\psi_d(X)$ with a polynomial-kernel Gram matrix and using the Schatten-norm-minimization algorithm [7]; the details are presented in [14]. Another approach to the high-rank matrix-completion problem was proposed in [17]. The matrix $\psi_d(X)$ is rank deficient when each column vector $x_i$ is given by a polynomial mapping of latent features $y_i \in \mathbb{R}^r$ ($r \ll M < N$),

$$x_i = F_p(y_i) = U_p \, \phi_p(y_i), \tag{3}$$

with polynomial coefficients $U_p \in \mathbb{R}^{M \times \binom{r+p}{p}}$ and order $p \ll M$, where $\phi_p(y_i)$ collects the monomials of $y_i$ up to degree $p$, because $R = \mathrm{rank}(\psi_d(X))$ satisfies $R \le \binom{r+pd}{pd}$ and $R < \binom{M+d}{d}$ if $r, p \ll M < N$. Therefore the matrix $\psi_d(X)$ can be approximated by a low-rank matrix. [17] proposed a high-rank matrix-completion algorithm using matrix factorization in the same way as [14]; however, this algorithm requires a large number of observed entries and does not recover the matrix when only a few are present. The algorithm restores $[\psi_d(x_1) \cdots \psi_d(x_N)]$ uniquely only if the sample number $N$ and the sampling rate $q = |\Omega| / (MN)$ satisfy an inequality given in [17]. Hence, we expect the matrix-completion accuracy to worsen when $p$ and $r$ are high and $N$ is small. Therefore, this paper proposes a new approach that makes use of neither monomial features nor the kernel method, but is instead based on the assumption that an LDDM can be approximated locally by an LDLS, so as to achieve high completion accuracy even with a small $q$ and a small number of samples $N$.
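The kernel substitution mentioned above rests on the identity $\psi_d(x)^T \psi_d(y) = (x^T y)^d$, which holds when each monomial is scaled by the square root of its multinomial coefficient; whether [14] uses exactly this weighting is an assumption here. The sketch below verifies the identity numerically, showing that the Gram matrix can be formed without materializing the features.

```python
import numpy as np
from itertools import combinations_with_replacement
from math import factorial, prod, sqrt

def weighted_monomial_features(x, d):
    """Degree-d monomials scaled so that <psi(x), psi(y)> = (x^T y)^d."""
    feats = []
    for idx in combinations_with_replacement(range(len(x)), d):
        # Multinomial coefficient of this monomial in the expansion of (x^T y)^d.
        counts = [idx.count(i) for i in set(idx)]
        coeff = factorial(d) // prod(factorial(c) for c in counts)
        feats.append(sqrt(coeff) * prod(x[i] for i in idx))
    return np.array(feats)

x, y = np.random.default_rng(2).standard_normal((2, 4))
lhs = weighted_monomial_features(x, 3) @ weighted_monomial_features(y, 3)
print(np.isclose(lhs, (x @ y) ** 3))  # True
```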

Local Low-dimensional Model
First, in order to consider how the LDDM is structured when the matrix $X$ is given, this paper assumes that some columns $x_j$ can be approximated by vectors within the set of tangent vectors $x + U(x)$ at $x$ on the LDDM $\mathcal{M}_r$. Here, $U(x)$ is defined as

$$U(x) = \left\{ J(x)\, u : u \in \mathbb{R}^r, \; \|u\|_2 \le \epsilon \right\}, \tag{4}$$

where $\epsilon > 0$ denotes an upper bound on the radius of an $r$-dimensional hyperball and $J(x)$ denotes a Jacobian matrix defined as

$$J(x) = \left[ \frac{\partial \phi_{\lambda,m}^{-1}}{\partial u_l} \big( \phi_\lambda(x) \big) \right]_{m = 1, \ldots, M; \; l = 1, \ldots, r}. \tag{5}$$

Here, $\phi_\lambda : U_\lambda \to \tilde{U}_\lambda$ and $\phi_\lambda^{-1} = [\phi_{\lambda,1}^{-1} \cdots \phi_{\lambda,M}^{-1}]^T : \tilde{U}_\lambda \to U_\lambda$ denote a chart and its inverse for an index $\lambda$, with an open set $U_\lambda$ that includes $x$ satisfying $\bigcup_\lambda U_\lambda = \mathcal{M}_r$ and $\tilde{U}_\lambda \subset \mathbb{R}^r$. Then, we consider that each $x_j$ in the set $\{x_1, x_2, \cdots, x_N\}$ can be approximated by a vector belonging to $\bigcup_{i \ne j} (x_i + U(x_i))$ for all $j \in I$. In other words, we assume that we have the following non-empty index set $I_i$ for $i \in I$:

$$I_i = \left\{ j \in I \setminus \{i\} : \| x_j - x_i - z_{i,j} \|_2^2 \le \eta \;\; \text{for some } z_{i,j} \in U(x_i) \right\}, \tag{6}$$

where $\eta > 0$ denotes an upper bound on the squared Euclidean distance between $x_j - x_i$ and a vector $z_{i,j} \in U(x_i)$. In this case, the rank of the matrix formed by the vectors $\{z_{i,j}\}_{j \in I_i}$ is less than or equal to $r$ because $\mathrm{rank}(J(x_i)) = r$. Figure 1 illustrates the construction of each $z_{i,j}$; from the figure, it is apparent that $z_{i,j} \in U(x_i)$ can be obtained for suitable parameters $\epsilon$ and $\eta$. Therefore, the matrix-completion problem (1) on an arbitrary LDDM can be replaced by the problem of finding sets $I_i$ and $Z_i$ that satisfy (6), together with the missing entries of the matrix $X$, using the sets of tangent vectors $x_i + U(x_i)$ and given parameters $\epsilon$ and $\eta$.

Next, we consider how to find $z_{i,j}$ and $I_i$. To simplify the explanation below, we redefine each column $z_{i,j}$ as

$$z_{i,j} = d_{i,j} \left( x_j - x_i - e_{i,j} \right), \tag{7}$$

where $e_{i,j}$ denotes an error vector satisfying $\|e_{i,j}\|_2^2 \le \eta$ and $d_{i,j} \in \{0, 1\}$ denotes a variable for which finding $d_{i,j}$ is equivalent to finding $I_i$. In order to find a suitable solution for $d_{i,j}$, this paper formulates the following maximization problem:

$$\underset{\{z_{i,j}\}, \{d_{i,j}\}}{\text{Maximize}} \;\; \sum_{(i,j) \in I^2} d_{i,j} \;\; \text{subject to} \;\; z_{i,j} \in U(x_i). \tag{8}$$

Since the problem (8) cannot be solved when the LDDM $\mathcal{M}_r$ is unknown (as is often the case in actual problems), because $U(x_i)$ is then also unknown, this paper reformulates the constraint condition $z_{i,j} \in U(x_i)$ as two constraint conditions: 1. $\mathrm{rank}(Z_i) \le r$, because the span of $U(x_i)$ is an $r$-dimensional linear subspace; and 2. $z_{i,j}$ lies within an ellipsoid whose shape is determined by $J(x_i)$ and $\epsilon$.
Because it is difficult to estimate the radius of this ellipsoid, since each $J(x_i)$ is arbitrary and unknown, this paper uses the Euclidean distance of $x_j - x_i$ and assigns a hyperball radius $\epsilon_i$ to each $x_i$. Thus, this paper reformulates the problem (8) with the given parameters $r$ and $\{\epsilon_i\}_{i \in I}$ as

$$\underset{X, \{Z_i\}, \{d_{i,j}\}}{\text{Maximize}} \;\; \sum_{(i,j) \in I^2} d_{i,j} \;\; \text{subject to} \;\; \mathrm{rank}(Z_i) \le r, \;\; \left( \|x_j - x_i\|_2^2 - \epsilon_i \right) d_{i,j} \le 0, \tag{9}$$

where $Z_i \in \mathbb{R}^{M \times N}$ is the matrix whose $j$-th column vector is $z_{i,j}$, and the second constraint condition restricts the selected neighbors of $x_i$ to the hyperball of radius $\epsilon_i$. Thus, we obtain a formulation for finding the $z_{i,j}$ on each $U(x_i)$ and the set $I_i$ without knowledge of $J(x)$. However, it is difficult to solve the problem (9) for $X$, $\{Z_i\}_{i \in I}$, and $\{d_{i,j}\}_{(i,j) \in I^2}$ simultaneously, owing to the condition $\mathrm{rank}(Z_i) \le r$; moreover, in actual applications a suitable dimension $r$ may not be found. To address these issues, Subsection 3.2 explains how to obtain the solution using an MRM technique.
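As a small illustration of the neighborhood rule in (9), the sketch below collects, for each column $i$, the indices $j$ whose columns fall inside the hyperball of radius $\epsilon_i$; the function name and vectorized form are ours.

```python
import numpy as np

def local_neighborhoods(X, eps):
    """For each column i, return indices j != i with ||x_j - x_i||_2^2 <= eps[i]."""
    N = X.shape[1]
    # Pairwise squared Euclidean distances between columns (N x N matrix).
    sq_dist = np.sum((X[:, :, None] - X[:, None, :]) ** 2, axis=0)
    return [np.flatnonzero((sq_dist[i] <= eps[i]) & (np.arange(N) != i))
            for i in range(N)]
```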

Local Low-rank-approximation Algorithm
For simplicity, the technique used to obtain the solution for $Z_i$ and $d_{i,j}$ with a given matrix $X$ is described in the first part of this section.
First, we consider how to estimate the dimension $r$ of the LDDM with a given $X$. We could estimate $r$ simply by principal-component analysis if we had the $d_{i,j}$; however, the higher the rank of the matrix $Z_i$, the smaller the number of solutions with $d_{i,j} = 1$ becomes in the solution of (9). There is thus a trade-off between the dimension $r$ and the number of indices with $d_{i,j} = 1$. Therefore, this paper formulates the following problem:

$$\underset{X, \{Z_i\}, \{d_{i,j}\}}{\text{Minimize}} \;\; \sum_{i \in I} \left( \alpha \, \mathrm{rank}(Z_i) - (1 - \alpha) \sum_{j \in I} d_{i,j} \right) \;\; \text{subject to the constraints of (9)}, \tag{10}$$

where $0 \le \alpha \le 1$ denotes a given trade-off parameter, namely the ratio of the decrease in the rank of $Z_i$ to the sum of the $d_{i,j}$. Because solving the problem (10) is NP-hard due to $\mathrm{rank}(Z_i)$, this paper relaxes it to

$$\underset{X, \{Z_i\}, \{d_{i,j}\}}{\text{Minimize}} \;\; f_{\beta,\gamma}\big(X, \{Z_i\}, \{d_{i,j}\}\big), \tag{11}$$

where $f_{\beta,\gamma}$, defined in (12), is obtained from (10) by replacing each $\mathrm{rank}(Z_i)$ with the nuclear norm $\|Z_i\|_*$ and adding quadratic penalty terms weighted by given parameters $\beta, \gamma \ge 0$. Here, $\|\cdot\|_*$ denotes the nuclear norm, $\mathbf{1}_N \in \mathbb{R}^N$ denotes the vector whose elements are all 1, $D_i$ denotes a diagonal matrix whose diagonal elements $(D_i)_{j,j}$ each equal $d_{i,j}$, and $\mathrm{trace}(Y)$ denotes the sum of all diagonal elements of $Y$. In order to solve problem (11), we repeat the update schemes (13) until a termination condition on $Z_i$ and $d_{i,j}$ is satisfied: $Z_i$ is updated by the matrix-shrinkage operator $T_\tau$ for the nuclear-norm-minimization problem [5], and $d_{i,j}$ is updated through the saturation function $\mathrm{sat}(c) = \max(0, \min(1, c))$, where $\langle a, b \rangle$ denotes the inner product of $a$ and $b$. Each step of (13) minimizes the objective function (12) with respect to $Z_i$ or $d_{i,j}$.

Finally, let us consider the problem (1) in terms of the problem (11). The objective function (12) includes the variable $X$, which we can update simply by minimizing (12) with respect to $X$ under the observation constraint:

$$\underset{X}{\text{Minimize}} \;\; f_{\beta,\gamma}\big(X, \{Z_i\}, \{d_{i,j}\}\big) \;\; \text{subject to} \;\; (X)_{m,n} = (X^{(0)})_{m,n} \;\; \text{for } (m,n) \in \Omega. \tag{15}$$

The problem (15) is solved in the same way as the schemes (13), by repeating the schemes (13) augmented with a step that minimizes $f_{\beta,\gamma}$ under the constraint $(X)_{m,n} = (X^{(0)})_{m,n}$ for $(m,n) \in \Omega$. Since the objective function (12) is quadratic in $\mathrm{vec}(X) = x$, the resulting quadratically constrained quadratic program has a closed-form solution for given $Z_i$ and $d_{i,j}$, expressed in terms of the identity matrix $I_{M,M} \in \mathbb{R}^{M \times M}$, the Kronecker product $\otimes$, the Hadamard product $\circ$, and a graph Laplacian $L \in \mathbb{R}^{N \times N}$ built from the weights $d_{i,j}$.
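The alternating structure of the schemes (13) can be sketched as follows. The exact displays (12)-(14) are not reproduced here, so the objective behind this sketch is our reconstruction: a nuclear-norm term $\alpha \|Z_i\|_*$, a quadratic fit $\beta \sum_j d_{i,j} \|x_j - x_i - z_{i,j}\|_2^2$, a reward $(1-\alpha) \sum_j d_{i,j}$, and a regularizer $\gamma \sum_j d_{i,j}^2$; treat it as a plausible instantiation rather than the authors' exact updates.

```python
import numpy as np

def shrink(Z, tau):
    """Matrix-shrinkage operator T_tau: soft-threshold the singular values [5]."""
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def local_update(X, i, alpha=0.5, beta=10.0, gamma=1.0, n_iter=50):
    """Alternating Z_i / d_ij updates for one column i (reconstructed objective)."""
    M, N = X.shape
    diffs = X - X[:, [i]]            # j-th column holds x_j - x_i
    d = np.ones(N)
    d[i] = 0.0                       # a column is not its own neighbor
    for _ in range(n_iter):
        # Z_i-step: proximal (shrinkage) step on alpha * ||Z_i||_* around the
        # currently selected differences.
        Z = shrink(diffs * d, alpha / (2.0 * beta))
        # d_ij-step: stationary point of the terms linear and quadratic in d_ij,
        # clipped by sat(c) = max(0, min(1, c)).
        resid = np.sum((diffs - Z) ** 2, axis=0)
        d = np.clip(((1.0 - alpha) - beta * resid) / (2.0 * gamma), 0.0, 1.0)
        d[i] = 0.0
    return Z, d
```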

Convergence analysis
This section presents the convergence property of Algorithm 1.
First, let us define the following schemes with regard to the $t$-th iteration of the second (inner) iteration statements in Algorithm 1, writing $u(t_1, t_2)$ for the objective value evaluated at the iterates $Z_i^{(t_1)}$ and $d_{i,j}^{(t_2)}$, for $t, t_1, t_2 \ge 0$ and a given $X \in \mathbb{R}^{M \times N}$.
Lemma 1. For $t \ge 0$ and a given $X \in \mathbb{R}^{M \times N}$, the $d^{(t)}_{i,j}$ generated by the update schemes (19) satisfies the KKT conditions of problem (17): it is the closed-form optimal solution of the convex quadratic-minimization problem with linear constraints for fixed $Z_i$, and it satisfies the constraint conditions. Each sequence $\{Z_i^{(t)}\}$ and $\{d_{i,j}^{(t)}\}$ therefore converges to a limit point.

Proof. From Theorem 1 of [9], for any $X \in \mathbb{R}^{M \times N}$, the pair $(\bar{Z}_i, \bar{D}_i)$ produced by the shrinkage and saturation steps, where $y_{i,j}$ denotes the $j$-th column of the corresponding shrinkage target, is the optimal solution for (17). □
Next, let us define the following schemes with regard to the $k$-th iteration of the first (outer) iteration statements in Algorithm 1 with $\beta^{(k)}$, $\gamma^{(k)}$, and $r^{(k)}$ for $k \ge 0$, where $\bar{d}_{i,j}$ and $\bar{Z}_i$ are the $t$-th elements of the sequences obtained by the schemes (19) with $\beta^{(k)}$, $\gamma^{(k)}$, $r^{(k)}$, $X^{(k)}$, and a vector $c^{(k)}$. Here, the graph Laplacian $L^{(k)}$ is constructed from a matrix $\bar{D}^{(k)} \in \mathbb{R}^{N \times N}$ whose elements are given by $(\bar{D}^{(k)})_{i,j} = \bar{d}^{(k)}_{i,j}$ for $(i,j) \in I^2$.

Proof. The claim follows since any vector $a \in \mathrm{kernel}(L^{(k)})$ satisfies the corresponding equalities for the $d_{i,j}$ generated by the schemes (19) and (20). □

Now, let us describe the properties of the sequences generated by Algorithm 1, replacing the linear constraint condition $(X^{(k)})_{m,n} = (X^{(0)})_{m,n}$ for $(m,n) \in \Omega$ with $A x^{(k)} = b$, where $b \in \mathbb{R}^{|\Omega|}$ denotes the vector of observed values $\{(X^{(0)})_{m,n}\}_{(m,n) \in \Omega}$ and $A \in \{0,1\}^{|\Omega| \times MN}$ denotes a selector matrix.

Proof. $x^{(k+1)}$ satisfies the KKT conditions for $v(k, k+1, k, k)$ with $A x^{(k+1)} = b$ and $\|Q_{i,j} x^{(k+1)}\|_2^2 - \epsilon_i \le 0$ for $(i,j) \in I^2$, where $Q_{i,j} \in \mathbb{R}^{M \times MN}$ denotes a matrix defined as $Q_{i,j} = q_{i,j}^T \otimes I_{M,M}$ and $q_{i,j} \in \mathbb{R}^N$ is defined such that its $i$-th element is 1, its $j$-th element is $-1$, and the others are 0. The second equality in the resulting bound uses the fact that $A(x^{(k)} - x^{(k+1)}) = 0$, and $\mu^{(k+1)}_{i,j} \ge 0$ denotes the KKT multiplier for the condition $\|Q_{i,j} x^{(k+1)}\|_2^2 - \epsilon_i \le 0$. Since the sequence $\{v(k,k,k,k)\}$ generated by (20) converges to a limit point, the sequences $\{Z_i\}$ and $\{d_{i,j}\}$ converge to limit points $\bar{Z}_i$ and $\bar{d}_{i,j}$ with a fixed $\bar{X}$, from Lemma 1. □
Finally, some improvements to Algorithm 1 are offered in this section. First, the dimension of the LDDM is unknown in actual applications, although Algorithm 1 requires a suitable $r$. In order to solve this issue, we adopt a method that estimates the dimension $r$ from the singular-value ratio $\sigma_r / \sigma_1$, just as [9] did, for each column $i \in I$. Second, we consider ways to reduce the computational complexity. Two key possibilities are considered: one is to ignore the quadratic constraint condition $(\|x_j - x_i\|_2^2 - \epsilon_i) d_{i,j} \le 0$ when updating $X$, and the other is to update $X$ only for the columns in the $i$-th neighborhood, for example by minimizing only the $i$-th Frobenius-norm term of (18) with respect to the column $x_i$, which is expected to work like a stochastic gradient-descent algorithm. Furthermore, this paper utilizes a single parameter $\beta$, taken as a maximum over $i \in I$, because the update schemes (19) need to yield limit points for $Z_i$ and $d_{i,j}$ only once for each $i \in I$, from Lemma 2. Thus, this paper proposes a heuristic algorithm for reducing the calculation time, shown as Algorithm 2, whose parameters satisfy $1 > \alpha^{(0)} \ge \alpha^{(1)} \ge \cdots \ge \alpha_{\min} > 0$ for $k = 0, 1, \cdots, k_{\max}$ and $\delta > 0$, just as in [9].
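The singular-value-ratio rule for estimating $r$ can be sketched as below; the function name is ours, and the counting convention is our reading of the step $r \leftarrow \arg\max_r \, \sigma_r \ge \alpha^{(k)} \sigma_1$ in Algorithm 2.

```python
import numpy as np

def estimate_rank(Z, alpha):
    """Largest r such that sigma_r >= alpha * sigma_1."""
    s = np.linalg.svd(Z, compute_uv=False)   # singular values, descending
    return int(np.sum(s >= alpha * s[0])) if s[0] > 0 else 0
```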

Simulation results and discussion
This section presents several numerical examples for the matrix-completion problem (1). In this section, the $i$-th column of $X^{(0)}$ is generated by $F_p : \mathbb{R}^r \to \mathbb{R}^M$ with the mapping function (3).

[Algorithm 2: LLRA using a stochastic-gradient-descent-like algorithm (LLRASGD). Its outer loop sets $k \leftarrow 0$ and repeats; for each $i \in I$ it estimates the rank as $r \leftarrow \arg\max_r \, \sigma_r \ge \alpha^{(k)} \sigma_1$ and then performs the local updates.]
The results are shown in Tables 1, 2, and 3 for $q \in \{0.2, 0.3, 0.4\}$ and $r \in \{2, 3, 4, 5, 6\}$. As can be seen, the estimation accuracy of LLRASGD is better than that of the other algorithms, especially for $r = 5, 6$ with $q \in \{0.2, 0.3, 0.4\}$ and $d = 3, 5, 7$, and for $r = 3, 4, 5, 6$ with $q = 0.2$. Figures 2, 3, and 4 compare all the algorithms with $q = 0.3$. In Figs. 2 and 3, the MSEs of LLRASGD tend to degrade less than those of the other algorithms. From this result, the proposed method is more effective when the missing rate or the latent dimension is high.

Conclusion
This paper proposed a local low-rank approach (LLRA) for a matrix-completion problem in which the columns of the matrix belong to an LDDM, and presented the convergence properties of this approach. The proposed method is based on the idea that tangent hyperplanes of dimension equal to that of the LDDM exist at each column of the matrix. It is assumed that each hyperplane is of low dimension, and the sum of the ranks of the local submatrices formed by the nearest neighborhoods of each column is minimized. Numerical examples show that the proposed algorithm offers higher accuracy for matrix completion than other algorithms in the case where each column vector is given by a $p$-th order polynomial mapping of a latent feature. In particular, the proposed method is suitable when the order $p$ and the dimension of the latent space are high.