Matrix completion via modified Schatten 2/3-norm
EURASIP Journal on Advances in Signal Processing volume 2023, Article number: 62 (2023)
Abstract
Low-rank matrix completion is an active topic in machine learning, with wide use in image processing, recommendation systems and subspace clustering. However, traditional methods approximate the rank function with the nuclear norm, which yields only suboptimal solutions. Inspired by the closed-form formulation of \(L_{2/3}\) regularization, we propose a new truncated Schatten 2/3-norm to approximate the rank function. The proposed regularizer takes full account of prior rank information and achieves a more accurate approximation of the rank function. Based on this regularizer, we propose a new low-rank matrix completion model, and a fast and efficient algorithm is designed to solve it. In addition, a rigorous mathematical analysis of the convergence of the proposed algorithm is provided. Finally, the proposed model and method are evaluated on synthetic data and recommender system datasets. All results show that the proposed algorithm achieves comparable recovery performance while being faster and more efficient than state-of-the-art methods.
1 Introduction
The problem of recovering a low-rank or approximately low-rank matrix from incomplete observations, namely low-rank matrix completion (LRMC), has attracted significant attention in recent years. It is a central issue in computer vision and machine learning, and arises in various practical applications, such as recommender systems [1, 2], motion capture [3], video denoising [4], subspace clustering [5], and hyperspectral imaging [6]. Roughly, the methods for LRMC can be classified into two categories: low-rank matrix factorization methods and rank minimization methods. In this work, we focus only on the latter category, because factorization-based algorithms rely heavily on a prespecified rank [7], which is difficult to estimate in advance in some real applications.
It is well known that the rank function is nonconvex and discontinuous. Therefore, the rank minimization problem is NP-hard and difficult to optimize. To alleviate this problem, many researchers have suggested relaxing the rank function and considering the nuclear norm instead. Theoretical analysis shows that the nuclear norm, i.e., the sum of the singular values of the matrix, is the tightest convex lower bound of the rank [8]. Candès and Recht [9] have proven that, if the observed entries of the matrix are sampled uniformly at random and the matrix satisfies the restricted isometry property condition, the target low-rank matrix can be exactly recovered by nuclear norm minimization. Because of this, nuclear norm minimization has become popular and is accepted as a very powerful method for solving low-rank problems. Over the past decades, a variety of algorithms with strong theoretical guarantees have been proposed to solve the nuclear norm-based model, such as singular value thresholding (SVT) [10], the accelerated proximal gradient with line search algorithm (APGL) [11], Soft-Impute [12] and its accelerated version (AIS-Impute) [13]. Nevertheless, the nuclear norm relaxation is too loose to approximate the rank function well. Thus, the algorithms mentioned above may yield only suboptimal performance in practice. One important reason is that the nuclear norm treats all singular values equally. Intuitively, large singular values should shrink less, and small singular values should shrink more. A further improvement is therefore required.
A very natural idea is to use nonconvex surrogate functions to approximate the rank function. Representative nonconvex surrogate functions include the Schatten p-norm \((0< p < 1)\) [14], capped-\(l_{1}\) norm [15], log-sum penalty (LSP) [16], smoothly clipped absolute deviation (SCAD) [17], transformed \(l_{1}\) penalty [18, 19], and Laplace [20]. Empirical results demonstrate that these nonconvex surrogate functions can achieve better performance than their convex counterpart. However, the resultant optimization problem is nonconvex, nonsmooth, and non-Lipschitz, and solving it efficiently is a big challenge. To this end, a number of algorithms, such as iteratively reweighted nuclear norm (IRNN) [21], fast nonconvex low-rank learning (FaNCL) [22], matrix completion based on nonconvex relaxation (MCNC) [23], the double nonconvex nonsmooth rank (DNNR) relaxation-based method [24], and a block-wise model dubbed differentiable low-rank learning (DLRL) [25], have been proposed to solve nonconvex low-rank approximation problems.
Another parallel line of research considers the different contributions of different rank components, with weighted nuclear norm minimization (WNNM) [26, 27] being the most representative. Compared with traditional nuclear norm minimization, the WNNM scheme assigns different weights to different singular values so as to decrease the penalty on larger singular values. To achieve better recovery performance, weighted Schatten p-norm minimization (WSNM) [28] was proposed to solve the LRMC problem. By setting appropriate values for the weights and p, WNNM can be viewed as a special case of WSNM. The WNNM and WSNM models have been successfully applied to typical low-level vision tasks, such as image denoising and background subtraction [27, 28]. However, neither WNNM nor WSNM takes a priori rank information into consideration. The variance of the data distribution within the target rank does not need to be minimized, which means that only the singular values in the residual ranks need to be minimized. Along this line of research, the truncated nuclear norm (TNN) [29] and partial sum of singular values (PSSV) [30] have been proposed for low-rank matrix recovery problems. Indeed, TNN and PSSV can be regarded as concrete examples of WNNM and WSNM. Although TNN can achieve a more accurate and robust approximation to the rank function, it still suffers from some drawbacks. More specifically, the algorithms for solving traditional TNN-based models are time-consuming, and a prespecified parameter is difficult to estimate in advance. Recent studies [31,32,33,34,35] have partially addressed these issues.
In this work, we continue this line of study. Our aim is to establish a novel continuous but nonconvex regularizer, namely truncated Schatten 2/3-norm minimization with reweighting strategy (TSNMR), for the LRMC problem. Subsequently, a more accurate and flexible model with TSNMR is built. As will be seen later, the proposed model fully considers the prior rank information and achieves a robust approximation to the rank function. Furthermore, its solution can be analytically expressed in a thresholding form. Based on this finding, a computationally efficient optimization method is designed for solving matrix completion problems. The contributions of this work are highlighted as follows:

1. By virtue of the ideas of TNN and WSNM, a novel continuous but nonconvex regularizer, namely TSNMR, is proposed for the LRMC problem. Armed with it, a more accurate and flexible model is obtained. Meanwhile, the properties of TSNMR are analysed, and its closed-form solution is derived from a thresholding operator. With this finding, the resultant optimization model becomes more tractable.
2. An efficient and fast optimization algorithm with inexact proximal steps and Nesterov’s acceleration rules is designed to optimize the proposed model. A rigorous mathematical proof shows that any accumulation point of the generated sequence is a first-order stationary point.
3. We apply the proposed TSNMR model to solve some typical low-rank matrix completion problems, e.g., image inpainting.
4. Experimental results on synthetic data and color images demonstrate that the proposed model can achieve performance superior to that of state-of-the-art models.
The rest of this paper is organized as follows. Section 2 briefly reviews some related works. Section 3 presents the proposed model and develops its optimization method with rigorous convergence guarantees. Section 4 reports and analyzes the experimental results. Finally, several concluding remarks are provided in Sect. 5.
Notations: Some notations used in this paper are listed in Table 1.
2 Background
In this section, we briefly introduce the closed-form thresholding formula for \(L_{2/3}\) regularization and some widely used nonconvex low-rank regularizers.
2.1 Thresholding formulas for \(L_{2/3}\) regularization
The \(L_{2/3}\) regularization model was recently proposed by Xu et al. [37] for solving the image deconvolution problem. It is believed that \(L_{2/3}\) regularization is more effective than \(L_{1/2}\) regularization [36] in many practical applications. Mathematically, the \(L_{2/3}\) regularization model can be represented as
where \(a \ge 0\) is a constant in \({\mathbb {R}}\). It follows from [37] that the solutions of (1) can be analytically expressed by
where
and \(\gamma = 2/3(3(2\lambda )^{3})^{1/4}\).
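The thresholding rule can be sketched numerically. The snippet below is a minimal sketch, assuming the scalar problem is \(\min _{x} \frac{1}{2}(x - a)^{2} + \lambda |x|^{2/3}\); this scaling is an assumption, chosen because it reproduces the threshold \(\gamma\) stated above. The function name `l23_threshold` is hypothetical, and the hyperbolic-cosine expression is the standard closed-form root of the quartic stationarity equation rather than a transcription of this paper's formulas.

```python
import numpy as np

def l23_threshold(a, lam):
    """Elementwise minimizer of 0.5*(x - a)**2 + lam*|x|**(2/3) (assumed scaling, lam > 0)."""
    a = np.atleast_1d(np.asarray(a, dtype=float))
    s = 2.0 * lam                                # same problem as (x - a)^2 + s*|x|^(2/3)
    gamma = (2.0 / 3.0) * (3.0 * s**3) ** 0.25   # inputs with |a| <= gamma are set to zero
    out = np.zeros_like(a)
    big = np.abs(a) > gamma
    t = np.abs(a[big])
    # Closed-form root of y^4 - t*y + s/3 = 0, where x = y^3
    phi = np.arccosh(27.0 * t**2 / (16.0 * s**1.5))
    A = (2.0 / np.sqrt(3.0)) * s**0.25 * np.sqrt(np.cosh(phi / 3.0))
    y = 0.5 * (A + np.sqrt(2.0 * t / A - A**2))
    out[big] = np.sign(a[big]) * y**3
    return out
```

Inputs whose magnitude does not exceed \(\gamma\) are mapped exactly to zero, which is what makes the operator sparsity-inducing.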
2.2 Existing nonconvex low-rank regularizers for low-rank matrix approximation (LRMA)
(1) Weighted nuclear norm. With the aim of improving the flexibility of nuclear norm minimization, Gu et al. [27] proposed the weighted nuclear norm (WNN), which can be represented as \(\Arrowvert X \Arrowvert _{w, *} = \sum _{i} w_{i}\sigma _{i}(X)\),
where \(X \in {\mathbb {R}}^{m \times n}\), \(\sigma _{i}(X)\) is the ith singular value of X, \(w = [w_{1}, w_{2}, \cdots , w_{n}]^{T}\), and \(w_{1} \ge w_{2} \ge \cdots \ge w_{n} \ge 0\). Therefore, the WNNM model is obtained, and it can be solved by the weighted nuclear norm proximal (WNNP) operator
However, it is difficult to solve (5) due to the nonconvexity of WNNM. Fortunately, theoretical analysis of (5) reveals that it is actually a quadratic programming problem with linear constraints. Thus, the globally optimal solution of (5) can be obtained in closed form.
Lemma 1
([27]) Suppose that \(W \in {\mathbb {R}}^{m \times n}\) admits singular value decomposition (SVD) as \(U\Sigma V^{T}\), where \(\Sigma = \textrm{Diag}(\sigma )\), \(\sigma = [\sigma _{1}, \sigma _{2}, \cdots , \sigma _{r}]^{T}\), and \(\sigma _{1} \ge \sigma _{2} \ge \cdots \ge \sigma _{r} \ge 0\). The global solution to
is given by
where \(\Sigma '_{ii} = \max (\Sigma _{ii} - w_{i}/2, 0)\).
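The shrinkage rule of Lemma 1 is straightforward to apply once the SVD is available. The following is a minimal sketch that only implements the stated rule \(\Sigma '_{ii} = \max (\Sigma _{ii} - w_{i}/2, 0)\); the function name `weighted_svt` is hypothetical, and the weight vector is assumed to have one entry per singular value.

```python
import numpy as np

def weighted_svt(W, w):
    """Shrink the i-th singular value of W by w[i]/2 and clip at zero."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    s_new = np.maximum(s - np.asarray(w, dtype=float) / 2.0, 0.0)
    # Rebuild the matrix from the original singular vectors and the shrunk values
    return (U * s_new) @ Vt
```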
(2) Weighted Schatten p-norm. Inspired by the Schatten p-norm and WNN, Xie et al. [28] proposed the weighted Schatten p-norm (WSN), which can be represented as
WSN can be seen as a generalization of WNN that approximates the rank function better than WNN. With this relaxation, the WSNM model is obtained. To handle such models efficiently, one needs to consider the following nonconvex optimization problem.
Intuitively, solving (9) is nontrivial due to the nonconvexity and nonsmoothness of the objective function. However, the following lemma shows that the optimal solution of (9) can be obtained by solving r independent subproblems, where r is the rank of W.
Lemma 2
([28]) Suppose that \(W \in {\mathbb {R}}^{m \times n}\) admits SVD as \(U\Sigma V^{T}\), the optimal solution to
is given by
where
It follows from [28] that (12) can be decoupled into r independent subproblems, and these subproblems can be effectively solved by the generalized soft-thresholding (GST) algorithm (for more details about GST, please refer to [28]).
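For readers unfamiliar with GST, the scalar iteration can be sketched as below. This is a minimal sketch of the generic generalized soft-thresholding scheme, assuming the scalar problem \(\min _{x} \frac{1}{2}(x - y)^{2} + \lambda |x|^{p}\) with \(0< p < 1\); the scaling of the subproblem in [28] may differ by a constant factor, and the function name `gst` and the iteration count are hypothetical choices.

```python
import numpy as np

def gst(y, lam, p, n_iter=30):
    """Approximate elementwise minimizer of 0.5*(x - y)**2 + lam*|x|**p, 0 < p < 1."""
    y = np.atleast_1d(np.asarray(y, dtype=float))
    base = 2.0 * lam * (1.0 - p)
    # Threshold below which the global minimizer is exactly zero
    tau = base ** (1.0 / (2.0 - p)) + lam * p * base ** ((p - 1.0) / (2.0 - p))
    x = np.zeros_like(y)
    keep = np.abs(y) > tau
    mag = np.abs(y[keep])
    z = mag.copy()
    for _ in range(n_iter):
        # Fixed-point iteration on the stationarity condition z - mag + lam*p*z**(p-1) = 0
        z = mag - lam * p * z ** (p - 1.0)
    x[keep] = np.sign(y[keep]) * z
    return x
```

For \(p = 2/3\) the threshold computed here coincides with the closed-form \(L_{2/3}\) threshold under the same scaling, so the iteration can be replaced by the analytic formula in that case.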
Compared with the NNM models, the WNNM and WSNM models fully consider the differences between singular values and achieve a better approximation to the rank function. Nevertheless, these models do not take a priori rank information into consideration in practical applications. Thus, they are still not accurate enough for solving real LRMC problems.
3 The proposed model and its optimization method
In this section, we first introduce the definition of TSNMR and then establish the low-rank matrix completion model. By analysing the properties of TSNMR, we propose an optimization method for the resultant model and analyse its convergence. Furthermore, we also discuss the adaptive regularization parameter.
3.1 Problem formulation
In this work, we devise a novel continuous but nonconvex surrogate function, namely truncated Schatten 2/3-norm minimization with reweighting strategy (TSNMR). More precisely, the TSNMR is defined as
where \(X \in {\mathbb {R}}^{m \times n}\), r is the target rank, \(q = \min \{m, n\}\), \(\epsilon _{r + 1} \ge \epsilon _{r + 2} \ge \cdots \ge \epsilon _{q} > 0\) are sufficiently small positive numbers that avoid division by 0, and \(C > 0\) is a constant. The proposed TSNMR not only takes into consideration the importance of different rank components, but also fully considers the prior rank information.
Obviously, the function \(\psi _{2/3,\alpha }^{\epsilon }(t) = Ct^{2/3}/(t + \epsilon )^{2/3 - \alpha }\) is concave for any \(\alpha \in (0, 2/3]\) and \(\epsilon > 0\). With the change of parameters \(\alpha\) and \(\epsilon\), it is easy to verify that
where \(C = 1\). Therefore, with the proper choices of \(\alpha\) and \(\epsilon _{i}\), we have
In other words, if we set \(\alpha \rightarrow 0^{+}\) and \(\epsilon \rightarrow 0^{+}\), then \(P_{p, \alpha }^{\epsilon }(X)\) degenerates to the TNN in [29] and [30].
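To make the role of the target rank concrete, the penalty can be evaluated numerically. The snippet below is a minimal sketch, assuming the regularizer sums \(\psi _{2/3,\alpha }^{\epsilon }\) over the singular values beyond the r largest ones; this summation form is an assumption inferred from the surrounding definitions, and the names `psi` and `tsnmr_penalty` are hypothetical.

```python
import numpy as np

def psi(t, alpha, eps, C=1.0):
    """psi(t) = C * t^(2/3) / (t + eps)^(2/3 - alpha), as defined in the text."""
    return C * t ** (2.0 / 3.0) / (t + eps) ** (2.0 / 3.0 - alpha)

def tsnmr_penalty(X, r, alpha, eps, C=1.0):
    """Assumed form: sum psi over the singular values after the r largest ones."""
    s = np.linalg.svd(X, compute_uv=False)   # singular values in descending order
    return float(np.sum(psi(s[r:], alpha, eps, C)))
```

Under this assumed form, a matrix whose rank does not exceed r incurs (numerically) almost no penalty, and for \(\alpha = 2/3\) the penalty reduces to a truncated sum of \(\sigma _{i}^{2/3}\).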
Armed with TSNMR, in this paper we mainly focus on the following low-rank minimization problem, which can be formulated as
where \(\lambda > 0\) is a given parameter, \(\Omega\) denotes the set of locations of the observed entries, and \({\mathcal {P}}_{\Omega }\) denotes the orthogonal projector onto the span of matrices vanishing outside of \(\Omega\), i.e., \([{\mathcal {P}}_{\Omega }(X)]_{ij} = X_{ij}\) if \((i, j) \in \Omega\) and \([{\mathcal {P}}_{\Omega }(X)]_{ij} = 0\) otherwise.
3.2 Solving scheme
Directly solving the nonconvex and nonsmooth optimization problem (16) is difficult. To make this issue tractable, we first define the following quadratic function
where \(f(Y) = (1/2) \Arrowvert {\mathcal {P}}_{\Omega }(Y) - {\mathcal {P}}_{\Omega }(M) \Arrowvert _{F}^{2}\) and \(X, Y \in {\mathbb {R}}^{m \times n}\). For any \(\mu > 0\), it is easy to see that \(F(X) = F_{\lambda , \mu }(X, X)\). In what follows, we show that any global minimizer of F(X) is also a global minimizer of \(F_{\lambda , \mu }(X, Y)\). The following lemma addresses this issue.
Lemma 3
Assume that \(\mu \le 1/L_{f}\) and \(X^{*}\) is the global minimizer of F(X), then we have
Proof
Considering the objective function F(X) at \(X = X^{*}\), we have
Although f is possibly nonconvex, from the assumption that f is differentiable with \(L_{f}\)-Lipschitz continuous gradient, we obtain [40, 41]
Substituting (20) into (19) and setting \(Y = X^{*}\), we have
We complete the proof. \(\square\)
By Lemma 3, we can conclude that the global minimizer of optimization problem (16) can be obtained by computing the optimal solution of \(F_{\lambda , \mu }(X, Y)\) in optimization problem (17). Using basic algebra, we obtain that
where \(B_{\mu }(Y) = Y - \mu \nabla f(Y)\). Ignoring the constant terms of (22), the global minimizer of \(F_{\lambda , \mu }(X, Y)\) can be obtained by solving the following optimization problem
Now the crucial issue is how to obtain the global minimizer of optimization problem (23). To this end, we extend the aforementioned well-known \(L_{2/3}\) regularization to solve the resultant nonconvex optimization problem. In the next section, we show that its global optimal solution can be obtained in closed form.
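The gradient step \(B_{\mu }(Y)\) has a particularly simple form for the data-fitting term used here, since \(\nabla f(Y) = {\mathcal {P}}_{\Omega }(Y - M)\) and hence \(L_{f} = 1\). A minimal sketch follows, assuming \(\Omega\) is stored as a boolean array `mask` of the same shape as M; the function names are hypothetical.

```python
import numpy as np

def grad_f(Y, M, mask):
    """Gradient of f(Y) = 0.5 * ||P_Omega(Y) - P_Omega(M)||_F^2, i.e. P_Omega(Y - M)."""
    return mask * (Y - M)

def gradient_step(Y, M, mask, mu):
    """B_mu(Y) = Y - mu * grad f(Y)."""
    return Y - mu * grad_f(Y, M, mask)
```

With \(\mu = 1\), the step simply overwrites the observed entries of Y with those of M and leaves the unobserved entries unchanged.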
3.3 Optimization
In this subsection, we will exploit an efficient and fast optimization method to optimize problem (16). The main obstacle in this method is how to solve the optimization problem (23). As mentioned above, owing to the nonconvexity of TSNMR, this problem is much more challenging. To this end, we first show that the global optimal solution of such problem can be efficiently achieved. In order to better address this issue, we introduce the following lemma.
Lemma 4
(von Neumann [42, 43]) For any matrices A and B in \({\mathbb {R}}^{m \times n}\), let \(\sigma (A)\) and \(\sigma (B)\) denote the singular value vectors of A and B, respectively. Then \(\textrm{tr}(A^{T}B) \le \sigma (A)^{T}\sigma (B)\).
Equality holds iff A and B admit a simultaneous SVD with common U and \(V^{T}\), i.e., \(A = U\textrm{Diag}(\sigma (A))V^{T}\) and \(B = U\textrm{Diag}(\sigma (B))V^{T}\).
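Lemma 4 can be checked numerically on small matrices. The sketch below computes the gap between the sum of products of singular values and the trace inner product, and builds a matrix that shares the singular vectors of A so that the gap vanishes; the function names `von_neumann_gap` and `align_to` are hypothetical.

```python
import numpy as np

def von_neumann_gap(A, B):
    """sigma(A)^T sigma(B) - tr(A^T B); nonnegative by the von Neumann trace inequality."""
    sa = np.linalg.svd(A, compute_uv=False)
    sb = np.linalg.svd(B, compute_uv=False)
    return float(sa @ sb - np.trace(A.T @ B))

def align_to(A, B):
    """Matrix with the singular values of B and the singular vectors of A."""
    U, _, Vt = np.linalg.svd(A, full_matrices=False)
    sb = np.linalg.svd(B, compute_uv=False)
    return (U * sb) @ Vt
```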
By means of von Neumann’s lemma, we establish the following theorem, which reveals that the global minimizer of optimization problem (23) can be obtained in closed form.
Theorem 3.1
Suppose that \(\lambda > 0\) and \(B = Y - \mu \nabla f(Y)\) admits the SVD \(U\textrm{Diag}(\sigma )V^{T}\). Let \(B = {\hat{B}} + {\tilde{B}} = {\hat{U}}\textrm{Diag}({\hat{\sigma }}){\hat{V}}^{T} + {\tilde{U}}\textrm{Diag}({\tilde{\sigma }}){\tilde{V}}^{T}\), where \({\hat{\sigma }} = (\sigma _{1}, \cdots , \sigma _{r}, 0, \cdots , 0)\), \({\tilde{\sigma }} = (0, \cdots , 0, \sigma _{r + 1}, \cdots , \sigma _{q})\), \({\hat{U}}\) and \({\hat{V}}\) are the singular vector matrices corresponding to the r largest singular values, and \({\tilde{U}}\) and \({\tilde{V}}\) are those corresponding to the \((r + 1)\)th to the last singular values. Then, the optimal solutions to
are given by
where \(\lambda ' = 2\lambda \mu C/(\sigma _{i}(Y) + \epsilon _{i})^{2/3 - \alpha }\), and \(\textrm{prox}_{\lambda , P_{2/3,\alpha }^{\epsilon }(\cdot )}(B) = {\hat{B}} + {\tilde{U}}\left( \textrm{Diag}(H_{\lambda }({\tilde{\sigma }}))\right) {\tilde{V}}^{T}\) with
Proof
Assume that \(\tau = \lambda \mu\) and X admits SVD as \(U'\textrm{Diag}(\sigma ')V'^{T}\). Note that
where \(\phi _{i} = (C \sigma _{i}'^{2/3})/((\sigma _{i}(Y) + \epsilon _{i})^{2/3 - \alpha })\).
By applying the von Neumann trace inequality in Lemma 4, we obtain that \(\langle X, B \rangle\) reaches its maximum value \(\sum _{i = 1}^{q}\sigma _{i}'\sigma _{i}\) if \(U = U'\) and \(V = V'\). Therefore, we get
Moreover, Eq. (30) can be further rewritten as
It is easy to observe that Eq. (31) decouples into simple scalar problems, one for each \(\sigma _{i}'\) independently. Thus, by using the first-order optimality condition and the closed-form thresholding formula for \(L_{2/3}\) regularization, we obtain
Hence, the global optimal solutions to (26) can be achieved as
where \(\lambda ' = 2\lambda \mu C/(\sigma _{i}(Y) + \epsilon _{i})^{2/3 - \alpha }\), and
which are the desired results. We complete the proof. \(\square\)
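The structure of Theorem 3.1 (keep the r leading singular values, threshold the rest) can be sketched as follows. This is a minimal sketch, assuming the subproblem is scaled as \(\min _{X} \frac{1}{2}\Arrowvert X - B \Arrowvert _{F}^{2} + \sum _{i > r}\tau _{i}\sigma _{i}(X)^{2/3}\), where the positive weights \(\tau _{i}\) stand in for the reweighted quantities of the theorem with the weights frozen at Y. The exact constants of the paper's equations may differ, and the names `l23_shrink`, `truncated_prox` and `tau` are hypothetical.

```python
import numpy as np

def l23_shrink(sig, tau):
    """Elementwise minimizer of 0.5*(x - sig)**2 + tau*x**(2/3) for sig >= 0, tau > 0 (arrays)."""
    s = 2.0 * tau                                   # same as (x - sig)^2 + s*x^(2/3)
    gamma = (2.0 / 3.0) * (3.0 * s**3) ** 0.25      # values at or below gamma map to zero
    out = np.zeros_like(sig)
    big = sig > gamma
    t, sb = sig[big], s[big]
    phi = np.arccosh(27.0 * t**2 / (16.0 * sb**1.5))
    A = (2.0 / np.sqrt(3.0)) * sb**0.25 * np.sqrt(np.cosh(phi / 3.0))
    out[big] = (0.5 * (A + np.sqrt(2.0 * t / A - A**2))) ** 3
    return out

def truncated_prox(B, r, tau):
    """Keep the r largest singular values of B; apply L_{2/3} thresholding to the others."""
    U, sig, Vt = np.linalg.svd(B, full_matrices=False)
    tail = sig[r:]
    # Assumption: tau is a scalar or one weight per tail singular value
    tau_vec = np.broadcast_to(np.asarray(tau, dtype=float), tail.shape).copy()
    new = sig.copy()
    new[r:] = l23_shrink(tail, tau_vec)
    return (U * new) @ Vt
```

With a very large weight every tail singular value is zeroed, and the result is the best rank-r approximation of B.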
As can be seen from Theorem 3.1, solving the optimization problem (23) involves a full SVD step. For any matrix \(B \in {\mathbb {R}}^{m \times n}\), computing its SVD takes \(O(mn^{2})\) time. Therefore, when B is large, directly computing its SVD may be time-consuming. Fortunately, from (29) in Theorem 3.1, we only need to compute the singular values larger than \(\gamma\), which can be done more efficiently by using a partial SVD. Specifically, we first employ the power method [44, 45] to obtain an orthogonal matrix \(Q \in {\mathbb {R}}^{m \times t}\), and then perform the SVD on a much smaller matrix. Inspired by [22, 44], and [46], we establish the following lemma to address this issue.
Lemma 5
Assume that B has \({\hat{r}} \le t\) singular values that are larger than \(\gamma\), and let \(U_{{\hat{r}}}\textrm{Diag}(\sigma _{{\hat{r}}})V_{{\hat{r}}}^{T}\) be the rank-\({\hat{r}}\) SVD of B. Then there exists an orthonormal matrix \(Q \in {\mathbb {R}}^{m \times t}\) \((t \ll n)\), such that
(1) \(span(U_{{\hat{r}}}) \subseteq span(Q)\), and
(2) \(prox_{\lambda , P_{p,\alpha }^{\epsilon }(\cdot )}(B) = Q prox_{\lambda , P_{p,\alpha }^{\epsilon }(\cdot )}(Q^{T}B)\).
Proof
The proof follows the footsteps of Proposition 1 in [44], and we omit it here. \(\square\)
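The idea behind Lemma 5 can be sketched with a generic randomized power iteration. This is a minimal sketch of the general technique, not the specific routine of [44]; the function names, the random start, and the default of three power iterations are hypothetical choices.

```python
import numpy as np

def power_range_finder(B, t, n_iter=3, seed=0):
    """Return Q (m x t, orthonormal columns) approximating the leading left singular subspace of B."""
    rng = np.random.default_rng(seed)
    Q, _ = np.linalg.qr(B @ rng.standard_normal((B.shape[1], t)))
    for _ in range(n_iter):
        # One power iteration: multiply by B B^T, then re-orthonormalize
        Q, _ = np.linalg.qr(B @ (B.T @ Q))
    return Q

def partial_svd(B, t, n_iter=3, seed=0):
    """Approximate rank-t SVD of B computed from the small t x n matrix Q^T B."""
    Q = power_range_finder(B, t, n_iter, seed)
    Us, s, Vt = np.linalg.svd(Q.T @ B, full_matrices=False)
    return Q @ Us, s, Vt
```

The SVD is taken of a \(t \times n\) matrix instead of an \(m \times n\) one, which is where the saving comes from when \(t \ll n\).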
Since the partial SVD strategy is employed to compute the proximal operator in (27), the results may be inexact, meaning that
where \(\xi _{k}\) denotes the error in the proximal operator at the kth iteration.
With the representation (27), the TSNMR-based algorithm for solving problem (16) is proposed in Algorithm 1.
In convex optimization, Nesterov’s acceleration rules are commonly used to speed up the convergence of first-order methods. Recently, this acceleration strategy has been successfully extended to nonconvex optimization problems [47,48,49,50]. In this work, we integrate Nesterov’s acceleration strategy with our proposed algorithm. As can be seen from Algorithm 1, the accelerated iterate is obtained in step 8. Since the TSNMR lacks convexity, a monitor is needed to ensure that the objective value F achieves a sufficient decrease (step 9). Specifically, if \(V_{k}\) is a good extrapolation, this iterate is accepted (step 12); otherwise, we discard it (step 10). To make Nesterov’s acceleration strategy more efficient, an alternative choice of the momentum stepsize [51] is employed (step 10 and step 12). When \(F(X_{k})\) is larger than \(F(V_{k})\), such a scheme provides the opportunity to further exploit acceleration by enlarging the momentum \(\beta\). Owing to Nesterov’s acceleration technique, the number of iterations of Algorithm 1 is greatly reduced.
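The interplay between extrapolation and the monitor can be sketched as follows. This is not a transcription of Algorithm 1: it is a minimal sketch of a monitored accelerated proximal gradient loop, assuming \(\mu = 1\), the momentum \(\beta = (k - 1)/(k + 2)\), and a simple accept/reject test on the objective. For brevity, the proximal step is a stand-in that soft-thresholds the singular values beyond the r largest (a truncated nuclear norm penalty) instead of the \(L_{2/3}\) thresholding of Theorem 3.1, and the adaptive momentum rule of [51] is omitted. All function names are hypothetical.

```python
import numpy as np

def pssv_prox(B, r, tau):
    """Stand-in proximal step: keep the r largest singular values, soft-threshold the rest."""
    U, s, Vt = np.linalg.svd(B, full_matrices=False)
    s[r:] = np.maximum(s[r:] - tau, 0.0)
    return (U * s) @ Vt

def objective(X, M, mask, r, lam):
    """Data-fitting term on observed entries plus lam times the truncated nuclear norm."""
    s = np.linalg.svd(X, compute_uv=False)
    return 0.5 * np.sum((mask * (X - M)) ** 2) + lam * np.sum(s[r:])

def accelerated_completion(M, mask, r, lam, mu=1.0, n_iter=100):
    X_prev = np.zeros_like(M)
    X = np.zeros_like(M)
    history = [objective(X, M, mask, r, lam)]
    for k in range(1, n_iter + 1):
        beta = (k - 1) / (k + 2)
        V = X + beta * (X - X_prev)                  # extrapolated (accelerated) iterate
        # Monitor: keep V only if it does not increase the objective
        Y = V if objective(V, M, mask, r, lam) <= history[-1] else X
        B = Y - mu * mask * (Y - M)                  # gradient step on the data-fitting term
        X_prev, X = X, pssv_prox(B, r, mu * lam)     # proximal step
        history.append(objective(X, M, mask, r, lam))
    return X, history
```

Because the stand-in proximal step is solved exactly and \(\mu \le 1/L_{f}\), the recorded objective values in `history` do not increase from one iteration to the next, which is the behaviour the monitor is meant to guarantee.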
The last issue is how to choose the regularization parameter \(\lambda\), which plays an important role in a regularization problem. In general, it is hard to select an optimal \(\lambda\). Following the idea in [36], in this paper we tune the regularization parameter at the kth iteration as
where \(r_{0}\) is the rank of the optimal solution of problem (16). Accordingly, the regularization parameter \(\lambda\) is selected adaptively, and Algorithm 1 does not require the regularization parameter to be chosen manually during iteration.
3.4 Convergence analysis
In this subsection, we will discuss the convergence of our proposed algorithm. First, we introduce some definitions that will be useful in this paper.
Definition 3.2
([52]) The Fréchet subdifferential of H at x is
where \(H: {\mathbb {R}}^{d} \rightarrow (-\infty , +\infty ]\) is a proper extended real-valued function. The limiting subdifferential of H at x is \(\partial H(x) = \{u: \exists x_{k} \rightarrow x, H(x_{k}) \rightarrow H(x), {\hat{\partial }}H(x_{k}) \ni u_{k} \rightarrow u\), as \(k \rightarrow \infty \}\).
Definition 3.3
([52]) x is a critical point of H iff \(0 \in \partial H(x)\).
Inspired by the pioneering works in [41, 47, 49, 52], we present the following lemma, which shows that \(X_{k}\) satisfies a sufficient decrease condition similar to Lemma 1 in [53]. Its proof is provided in the Supplementary Material.
Lemma 6
If \(\{\xi _{k}\}\) is a decreasing sequence and \(\sum _{k = 1}^{K}\xi _{k} < \infty\), we have
In what follows, we give the following theorem to show that the Algorithm 1 achieves a bounded sequence making the objective function monotonically decreasing. The proof can be found in the Supplementary Material.
Theorem 3.4
Let the sequence \(\{X_{k}\}\) be generated by Algorithm 1 with \(\mu \le 1/L_{f}\). If \(\xi _{k} \le \delta \Arrowvert X_{k} - Y_{k} \Arrowvert _{F}^{2}\) for all \(k \in {\mathbb {N}}\), where \(\delta \le 1/2 - \mu L_{f}/2\), then we have
(1) \(\{X_{k}\}\) is bounded, and has at least one limit point.
(2) The objective function F is monotonically decreasing.
(3) \(\sum _{k = 1}^{+\infty } \Arrowvert X_{k} - Y_{k} \Arrowvert _{F}^{2} < +\infty\), which implies that \(\lim _{k \rightarrow +\infty }\Arrowvert X_{k} - Y_{k} \Arrowvert _{F}^{2} = 0\).
4 Experiments
To illustrate the effectiveness of the proposed algorithm, in this section we conduct two types of experiments, on synthetic data and on recommendation datasets. Specifically, we compare the proposed method with the following state-of-the-art matrix completion methods.
(1) APGL [11] A nuclear norm-based matrix completion method that uses the accelerated proximal gradient algorithm to solve the matrix completion problem.
(2) AIS-Impute [13] A nuclear norm-based matrix completion method that uses the accelerated and inexact Soft-Impute to solve the large-scale matrix completion problem.
(3) ASD [54] A decomposition-based method that uses the alternating steepest descent algorithm to solve the matrix completion problem.
(4) IRNN-TNN [21] A nonconvex matrix completion method that uses the iteratively reweighted nuclear norm algorithm to solve the matrix completion problem.
(5) FaNCL-LSP [22] A nonconvex matrix completion method that uses an accelerated scheme to solve the matrix completion problem.
(6) DNNR(\(p=2/3\)) [24] A matrix completion method based on double nonconvex nonsmooth rank relaxations.
(7) DLRL [25] A nonconvex back-propagation method that uses the multi-Schatten-p norm surrogate function to solve the matrix completion problem.
In the following experiments, the parameters of these algorithms are set according to the recommendations of the original papers. For our algorithm, we set \(\mu = 1.95\), \(\beta = (k - 1)/(k + 2)\). All the algorithms are implemented in MATLAB R2014a on a Windows Server 2008 system with an Intel Xeon E5-2680 v4 CPU (3 cores, 2.4 GHz) and 256 GB memory.
4.1 Synthetic data
The synthetic matrices \(M \in {\mathbb {R}}^{m \times n}\) with rank r are generated as \(M = M_{L}M_{R} + N\), where the entries of the random matrices \(M_{L} \in {\mathbb {R}}^{m \times r}\) and \(M_{R} \in {\mathbb {R}}^{r \times n}\) are sampled i.i.d. from the standard normal distribution N(0, 1), and the entries of N are sampled from N(0, 0.1). In the following tests, we set \(m = n\) and \(r = 5\). The symbol \(\Omega\) stands for the locations of the observations, which are sampled uniformly at random. We let \(sr = \vert \Omega \vert /mn\) denote the sample ratio.
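The data generation just described can be sketched as follows. This is a minimal sketch, assuming that the second argument of N(0, 0.1) is a variance (the text does not say whether it is a variance or a standard deviation) and that \(\Omega\) is drawn without replacement; the function name `make_synthetic` and its arguments are hypothetical.

```python
import numpy as np

def make_synthetic(m, n, r, sr, noise_var=0.1, seed=0):
    """Return the noisy matrix M, the clean low-rank part, and a boolean mask for Omega."""
    rng = np.random.default_rng(seed)
    M_L = rng.standard_normal((m, r))
    M_R = rng.standard_normal((r, n))
    clean = M_L @ M_R
    # Assumption: 0.1 is interpreted as the noise variance
    noise = np.sqrt(noise_var) * rng.standard_normal((m, n))
    n_obs = int(round(sr * m * n))
    idx = rng.choice(m * n, size=n_obs, replace=False)   # uniform sampling of Omega
    mask = np.zeros(m * n, dtype=bool)
    mask[idx] = True
    return clean + noise, clean, mask.reshape(m, n)
```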
The performance of all algorithms is evaluated by: (i) the normalized mean squared error \(NMSE = \Arrowvert {\mathcal {P}}_{\Omega ^{\bot }} (X - M_{L}M_{R}) \Arrowvert _{F}/\Arrowvert {\mathcal {P}}_{\Omega ^{\bot }} (M_{L}M_{R}) \Arrowvert _{F}\), where X is the recovered matrix and \(\Omega ^{\bot }\) denotes the unobserved positions; (ii) the rank of X; and (iii) the running time. We vary m in the range \(\{1000, 2000, 3000, 5000\}\). For each algorithm, we repeat the experiment 5 times and report its average NMSE, rank and running time.
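The NMSE defined above is computed on the unobserved positions only and against the noise-free product \(M_{L}M_{R}\). A minimal sketch, assuming \(\Omega\) is stored as a boolean array `mask`; the function name `nmse` is hypothetical.

```python
import numpy as np

def nmse(X, M_clean, mask):
    """Relative Frobenius error between X and M_clean on the unobserved entries."""
    unobs = ~np.asarray(mask, dtype=bool)
    return float(np.linalg.norm((X - M_clean)[unobs]) / np.linalg.norm(M_clean[unobs]))
```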
We report the average NMSE, rank, and running time in Table 2. As can be seen from Table 2, although all algorithms attain satisfactory results, the proposed algorithm achieves the lowest NMSE value on all problems, which indicates its excellent recovery accuracy. In terms of running time, the proposed algorithm is the fastest among all algorithms. Specifically, it is 2 and 8 times faster than ASD and FaNCL-LSP, respectively. We also observe that as the matrix size increases, the advantages of our algorithm become more pronounced. In addition, the proposed algorithm can still solve large-scale low-rank matrix completion problems efficiently. For example, the running time of the proposed algorithm for solving the problem with \(m = 10^{5}\), \(sr = 0.12\%\) is within 1163.7 s (NMSE is smaller than 0.0140), while the other algorithms either cannot solve it at all or cannot obtain satisfactory results within this time. Therefore, taking both accuracy and convergence speed into consideration, the proposed algorithm has the best recovery performance among these algorithms.
4.2 Recommendation
In this section, the Jester and MovieLens datasets are used to further demonstrate the effectiveness of the proposed method. Six datasets are considered, namely Jester-1, Jester-2, Jester-3, Jester-all, MovieLens-100K, and MovieLens-1M, whose characteristics are shown in Table 3. The Jester datasets are collected from a joke recommendation system and are stored in three Excel files with the following characteristics.
(1) Jester-1: 24,983 users who have rated 36 or more jokes;
(2) Jester-2: 23,500 users who have rated 36 or more jokes;
(3) Jester-3: 24,938 users who have rated between 15 and 35 jokes.
The MovieLens datasets are collected from the MovieLens website, and these datasets are characterized as follows.

(1) MovieLens-100K: 100,000 ratings for 1682 movies by 943 users;
(2) MovieLens-1M: 1 million ratings for 3900 movies by 6040 users.
The Jester-1, Jester-2, and Jester-3 datasets are combined to form the Jester-all dataset. In the experiment, we follow the setup in [35], which is to randomly select \(50\%\) of the observations for training and use the remaining \(50\%\) for testing. We use the root mean squared error (RMSE) and running time to evaluate the recovery performance of the algorithms. The RMSE is defined as \(RMSE = \sqrt{\Arrowvert {\mathcal {P}}_{{\bar{\Omega }}} (X - M) \Arrowvert _{F}^{2}/ \vert {\bar{\Omega }} \vert _{1}}\), where \({\bar{\Omega }}\) is the test set, \(\vert {\bar{\Omega }} \vert _{1}\) is its number of entries, and X is the recovered matrix. The test of each algorithm is repeated 5 times.
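The evaluation protocol can be sketched as follows. This is a minimal sketch, assuming the observed ratings are marked by a boolean array `mask` and that the split is drawn uniformly at random over the observed entries; the function names `split_observed` and `rmse` are hypothetical.

```python
import numpy as np

def split_observed(mask, train_frac=0.5, seed=0):
    """Randomly split the observed positions into disjoint train and test masks."""
    rng = np.random.default_rng(seed)
    rows, cols = np.nonzero(mask)
    perm = rng.permutation(rows.size)
    n_train = int(round(train_frac * rows.size))
    train = np.zeros(mask.shape, dtype=bool)
    test = np.zeros(mask.shape, dtype=bool)
    train[rows[perm[:n_train]], cols[perm[:n_train]]] = True
    test[rows[perm[n_train:]], cols[perm[n_train:]]] = True
    return train, test

def rmse(X, M, test):
    """Root mean squared error between X and M over the test positions."""
    diff = (X - M)[test]
    return float(np.sqrt(np.sum(diff**2) / diff.size))
```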
The recovery results in terms of RMSE and running time are shown in Table 4. From Table 4, we can see that the proposed algorithm has the best performance, that is, it achieves the smallest RMSE value on all problems. In addition, the proposed algorithm is the fastest among all algorithms. As the size of the datasets increases, some algorithms cannot produce recovery results in a satisfactory time, while our algorithm runs on all six datasets. This again indicates that the proposed algorithm performs well for low-rank matrix completion.
5 Conclusions
In this paper, we proposed a new nonconvex regularizer for low-rank minimization problems. This regularizer is better able to induce low rank, and the resulting optimization problem has a closed-form solution. Based on the proposed regularizer, we proposed a more reasonable matrix completion model. Meanwhile, we designed an efficient optimization algorithm based on the first-order gradient method to solve the proposed model. It is simple to use and more suitable for large-scale low-rank matrix completion problems. The rationality of the proposed model and the efficiency of the algorithm are verified on a series of synthetic data and recommender system datasets. All results show that the proposed algorithm achieves comparable recovery performance while being faster and more efficient than state-of-the-art methods.
Availability of data and materials
Please contact any of the authors for data and materials.
References
H. Steck, Training and testing of recommender systems on data missing not at radom, in Proc. 16th ACM SIGKDD Int. Conf. Knowl. Discovery Data Mining, Washiongton, DC, USA, Jul. (2010), pp. 713–22
X. Luo, M. Zhou, S. Li, Y. Xia, Q. Zhu, A nonnegative latent factor model for largescale sparse matrices in recommender systems via alternating direction method. IEEE Trans. Neural Netw. Learn. Syst. 27(3), 579–592 (2016)
G. Xia, H. Sun, B. Chen, Q. Liu, L. Feng, G. Zhang, R. Hang, Nonlinear lowrank matrix completion for human motion recovery. IEEE Trans. Image Process. 27(6), 3011–3024 (2018)
H. Ji, C. Liu, Z. Shen, Y. Xu, Robust video denoising using low rank matrix completion, in Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit., Jun. (2010), pp. 1791–1798
G. Liu, Z. Lin, S. Yan, J. Sun, Y. Yu, Y. Ma, Robust recovery of subspace structures by lowrank representation. IEEE Trans. Pattern Anal. Mach. Intell. 35(1), 171–184 (2013)
Y. Xie, Y. Qu, D. Tao, W. Wu, Q. Yuan, W. Zhang, Hyperspectral image restoration via iteratively regularized weighted schattenp norm minimization. IEEE Trans. Geosci. Remote Sens. 54(8), 4642–4659 (2016)
Z. Wen, W. Yin, Y. Zhang, Solving a lowrank factorization model for matrix completion by a nonlinear successive overrelaxation algorithm. Math. Program. Comput. 4(4), 333–361 (2012)
B. Recht, M. Fazel, P.A. Parrilo, Guaranteed minimumrank solutions of linear matrix equations via nuclear norm minimization. SIAM Rev. 52(3), 471–501 (2010)
E.J. Candès, B. Recht, Exact matrix completion via convex optimization. Found. Comput. Math. 9(6), 717–772 (2009)
J.F. Cai, E.J. Candès, Z. Shen, A singular value thresholding algorithm for matrix completion. SIAM J. Optim. 20(4), 1956–1982 (2010)
K.C. Toh, S. Yun, An accelerated proximal gradient algorithm for nuclear norm regularized linear least squares problems. Pacific J. Optim. 6, 615–640 (2010)
R. Mazumder, T. Hastie, R. Tibshirani, Spectral regularization algorithms for learning large incomplete matrices. J. Mach. Learn. Res. 11, 2287–2322 (2010)
Q. Yao, J.T. Kwok, Accelerated inexact soft-impute for fast large-scale matrix completion, in Proceedings of 24th International Joint Conference on Artificial Intelligence, (2015), pp. 4002–4008
F. Nie, H. Wang, H. Huang, C. Ding, Joint schatten p-norm and l_{p}-norm robust matrix completion for missing value recovery. Knowl. Inf. Syst. 42(3), 525–544 (2015)
T. Zhang, Analysis of multi-stage convex relaxation for sparse regularization. J. Mach. Learn. Res. 11, 1081–1107 (2010)
E.J. Candès, M.B. Wakin, S.P. Boyd, Enhancing sparsity by reweighted l_{1} minimization. J. Fourier Anal. Appl. 14(5–6), 877–905 (2008)
J. Fan, R. Li, Variable selection via nonconcave penalized likelihood and its oracle properties. J. Am. Stat. Assoc. 96(456), 1348–1360 (2001)
S. Zhang, J. Xin, Minimization of transformed L_{1} penalty: theory, difference of convex function algorithm, and robust application in compressed sensing. Math. Program. 169(1–2), 307–336 (2018)
Z. Wang, D. Hu, X. Luo, W. Wang, J. Wang, W. Chen, Performance guarantees of transformed Schatten-1 regularization for exact low-rank matrix recovery. Int. J. Mach. Learn. Cyber. 12, 3379–3395 (2021)
J. Weston, A. Elisseeff, B. Schölkopf, M. Tipping, Use of the zero-norm with linear models and kernel methods. J. Mach. Learn. Res. 3, 1439–1461 (2003)
C. Lu, J. Tang, S. Yan, Z. Lin, Nonconvex nonsmooth low rank minimization via iteratively reweighted nuclear norm. IEEE Trans. Image Process. 25(2), 829–839 (2016)
Q. Yao, J.T. Kwok, T. Wang, T.Y. Liu, Large-scale low-rank matrix learning with nonconvex regularizers. IEEE Trans. Pattern Anal. Mach. Intell. 41(11), 2628–2643 (2019)
F. Nie, Z. Hu, X. Li, Matrix completion based on nonconvex low-rank approximation. IEEE Trans. Image Process. 28(5), 2378–2388 (2019)
H. Zhang, C. Gong, J. Qian, B. Zhang, C. Xu, J. Yang, Efficient recovery of low-rank matrix via double nonconvex nonsmooth rank minimization. IEEE Trans. Neural Netw. Learn. Syst. 30(10), 2916–2925 (2019)
Z. Chen, J. Yao, J. Xiao, S. Wang, Efficient and differentiable low-rank matrix completion with back propagation. IEEE Trans. Multimed. 25, 228–242 (2023)
S. Gu, L. Zhang, W. Zuo, X. Feng, Weighted nuclear norm minimization with application to image denoising, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Jun. (2014), pp. 2862–2869
S. Gu, Q. Xie, D. Meng, W. Zuo, X. Feng, L. Zhang, Weighted nuclear norm minimization and its applications to low level vision. Int. J. Comput. Vis. 121(2), 183–208 (2017)
Y. Xie, S. Gu, Y. Liu, W. Zuo, W. Zhang, L. Zhang, Weighted schatten p-norm minimization for image denoising and background subtraction. IEEE Trans. Image Process. 25(10), 4842–4857 (2016)
Y. Hu, D. Zhang, J. Ye, X. Li, X. He, Fast and accurate matrix completion via truncated nuclear norm regularization. IEEE Trans. Pattern Anal. Mach. Intell. 35(9), 2117–2130 (2013)
T.H. Oh, Y.W. Tai, J.C. Bazin, H. Kim, I.S. Kweon, Partial sum minimization of singular values in robust PCA: algorithm and applications. IEEE Trans. Pattern Anal. Mach. Intell. 38(4), 744–758 (2016)
X. Su, Y. Wang, X. Kang, R. Tao, Nonconvex truncated nuclear norm minimization based on adaptive bisection method. IEEE Trans. Circuits Syst. Video Technol. 29(11), 3159–3172 (2019)
Q. Liu, Z. Lai, Z. Zhou, F. Kuang, Z. Jin, A truncated nuclear norm regularization method based on weighted residual error for matrix completion. IEEE Trans. Image Process. 25(1), 316–330 (2016)
C. Lee, E. Lam, Computationally efficient truncated nuclear norm minimization for high dynamic range imaging. IEEE Trans. Image Process. 25(9), 4145–4157 (2016)
T. Saeedi, M. Rezghi, A novel enriched version of truncated nuclear norm regularization for matrix completion of inexact observed data. IEEE Trans. Knowl. Data Eng. 34(2), 519–530 (2022)
J. Zheng, M. Qin, X. Zhou, J. Mao, H. Yu, Efficient implementation of truncated reweighting low-rank matrix approximation. IEEE Trans. Ind. Inform. 16(1), 488–500 (2020)
Z. Xu, X. Chang, F. Xu, H. Zhang, L_{1/2} regularization: a thresholding representation theory and a fast solver. IEEE Trans. Neural Netw. Learn. Syst. 23(7), 1013–1027 (2012)
W. Cao, J. Sun, Z. Xu, Fast image deconvolution using closed-form thresholding formulas of l_{q}(q = 1/2, 2/3) regularization. J. Vis. Commun. Image Represent. 24(1), 1529–1542 (2013)
B. Chen, H. Sun, J. Xia, L. Feng, B. Li, Human motion recovery utilizing truncated schatten p-norm and kinematic constraints. Inf. Sci. 450, 80–108 (2018)
C. Wen, W. Qian, Q. Zhang, F. Cao, Algorithms of matrix recovery based on truncated schatten p-norm. Int. J. Mach. Learn. Cyber. 12, 1557–1570 (2021)
Y. Nesterov, Introductory Lectures on Convex Optimization: A Basic Course, vol. 87 (Springer, New York, 2013)
T. Sun, H. Jiang, L. Cheng, Convergence of proximal iteratively reweighted nuclear norm algorithm for image processing. IEEE Trans. Image Process. 26(2), 5632–5644 (2017)
E.M. de Sá, Exposed faces and duality for symmetric and unitarily invariant norms. Linear Algebra Appl. 197, 429–450 (1994)
L. Mirsky, A trace inequality of John von Neumann. Monatshefte Math. 79(4), 303–306 (1975)
T.H. Oh, Y. Matsushita, Y. Tai, H. Kim, I.S. Kweon, Fast randomized singular value thresholding for lowrank optimization. IEEE Trans. Pattern Anal. Mach. Intell. 40(2), 376–391 (2018)
Z. Wang, M.J. Lai, Z. Lu, W. Fan, H. Davulcu, J. Ye, Orthogonal rank-one matrix pursuit for low rank matrix completion. SIAM J. Sci. Comput. 37(1), A488–A514 (2015)
Z. Wang, Y. Liu, X. Luo, J. Wang, C. Gao, D. Peng, W. Chen, Large-scale affine matrix rank minimization with a novel nonconvex regularizer. IEEE Trans. Neural Netw. Learn. Syst. 33(9), 4661–4675 (2022)
H. Li, Z. Lin, Accelerated proximal gradient methods for nonconvex programming, in Proceedings of Advances in neural information processing systems, (2015), pp. 379–387
S. Ghadimi, G. Lan, Accelerated gradient methods for nonconvex nonlinear and stochastic programming. Math. Program. 156(1–2), 59–99 (2016)
Q. Yao, J.T. Kwok, F. Gao, W. Chen, T.Y. Liu, Efficient inexact proximal gradient algorithm for nonconvex problems, in Proc. 26th Int. Joint Conf. Artif. Intell., Aug. (2017), pp. 3308–3314
B. Gu, Z. Huo, H. Huang, Inexact proximal gradient methods for nonconvex and nonsmooth optimization, in Proceedings 32nd AAAI Conference on Artificial Intelligence, (2018), pp. 3093–3100
Q. Li, Y. Zhou, Y. Liang, P.K. Varshney, Convergence analysis of proximal gradient with momentum for nonconvex optimization, in Proceedings 34th International Conference on Machine Learning, (2017), pp. 2111–2119
H. Attouch, J. Bolte, B.F. Svaiter, Convergence of descent methods for semi-algebraic and tame problems: proximal algorithms, forward-backward splitting, and regularized Gauss-Seidel methods. Math. Program. 137(1–2), 91–129 (2013)
P. Gong, C. Zhang, Z. Lu, J.Z. Huang, J. Ye, A general iterative shrinkage and thresholding algorithm for nonconvex regularized optimization problems, in Proceedings 30th International Conference on Machine Learning, (2013), pp. 37–45
J. Tanner, K. Wei, Low rank matrix completion by alternating steepest descent methods. Appl. Comput. Harmon. Anal. 40, 417–420 (2016)
Acknowledgements
The authors would like to thank the editors and referees for their valuable comments, which improved the presentation of this paper.
Funding
This work was supported by the Natural Science Foundation of Ningxia (2020AAC03254).
Ethics declarations
Competing interests
The authors declare that they have no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Ha, J., Li, C., Luo, X. et al. Matrix completion via modified schatten 2/3-norm. EURASIP J. Adv. Signal Process. 2023, 62 (2023). https://doi.org/10.1186/s13634-023-01027-w
DOI: https://doi.org/10.1186/s13634-023-01027-w