Article

A Gradient-Based Algorithm with Nonmonotone Line Search for Nonnegative Matrix Factorization

Wenbo Li and Xiaolu Shi
1 School of Sciences, Xi’an University of Technology, Xi’an 710054, China
2 Beijing Mechanical Equipment Institute, Beijing 100039, China
* Author to whom correspondence should be addressed.
Symmetry 2024, 16(2), 154; https://doi.org/10.3390/sym16020154
Submission received: 2 November 2023 / Revised: 14 December 2023 / Accepted: 27 December 2023 / Published: 29 January 2024
(This article belongs to the Special Issue Advanced Optimization Methods and Their Applications)

Abstract:
In this paper, we first develop an active set identification technique, and then we propose a modified nonmonotone line search rule, in which a new parameter formula is introduced to control the degree of nonmonotonicity of the line search. Using the modified line search and the active set identification technique, we propose a globally convergent method for solving the NMF problem based on the alternating nonnegative least squares framework. In addition, a larger step size technique is exploited to accelerate convergence. Finally, a large number of numerical experiments are carried out on synthetic and image datasets, and the results show that the proposed method is effective in terms of computing speed and solution quality.

1. Introduction

As a typical nonnegative data dimensionality reduction technique, nonnegative matrix factorization (NMF) [1,2,3,4,5] can efficiently mine hidden information from data, so it has gradually been applied to the study of high-dimensional data. As a data reduction technique, it appears in many applications, such as image processing [2], text mining [6], blind source separation [7], clustering [8], music analysis [9], and hyperspectral image unmixing [10], to name a few. Generally speaking, the fundamental NMF problem can be stated as follows: given an $m \times n$ data matrix $V = (V_{ij})$ with $V_{ij} \ge 0$ and a predetermined positive integer $r < \min(m, n)$, NMF aims to find two nonnegative matrices $W \in \mathbb{R}_+^{m \times r}$ and $H \in \mathbb{R}_+^{r \times n}$ such that
$$V \approx WH. \qquad (1)$$
A visual illustration of NMF is shown in Figure 1.
One of the most commonly used models of NMF (1) is
$$\min_{W,H} f(W,H) \equiv \frac{1}{2}\|V - WH\|_F^2 \quad \text{subject to } W \ge 0,\ H \ge 0, \qquad (2)$$
where $\|\cdot\|_F$ is the Frobenius norm.
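As a quick illustration of the objective in (2), the following NumPy sketch (our own illustration; the paper's experiments are in MATLAB, and these helper names are ours) evaluates $f(W,H)$ and the relative error $\|V - WH\|_F / \|V\|_F$ reported in Section 4:

import numpy as np

def nmf_objective(V, W, H):
    # f(W, H) = (1/2) * ||V - W H||_F^2, the cost in (2)
    R = V - W @ H
    return 0.5 * np.sum(R * R)

def relative_error(V, W, H):
    # ||V - W H||_F / ||V||_F, the residual reported in Tables 1 and 2
    return np.linalg.norm(V - W @ H, 'fro') / np.linalg.norm(V, 'fro')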
The projected Barzilai-Borwein (PBB) algorithm is regarded as a popular and effective method for solving (2); it originates from the step size rule of Barzilai and Borwein [11]. In recent years, a large number of studies [12,13,14,15,16] have shown that the PBB algorithm is very effective for solving optimization problems. Because of its simple computations and high efficiency, it has attracted attention across various disciplines, and research results based on the PBB have been widely applied in the field of NMF (see [17,18,19,20,21]).
In view of the perfect symmetry of the interaction between $W$ and $H$, we focus on the updating of the matrix $W$ based on the PBB algorithm. Let $H_k$ denote the approximation of $H$ after the $k$th update, and consider
$$f(W, H_k) = \frac{1}{2}\|V - WH_k\|_F^2. \qquad (3)$$
At each step for solving (3), there are three different updates:
$$W_{k+1} = \arg\min_{W \ge 0} f(W, H_k); \qquad (4)$$
$$W_{k+1} = \arg\min_{W \ge 0} f(W, H_k) + \langle \nabla f(W), W - W_k \rangle + \frac{L_{W_k}}{2}\|W - W_k\|_F^2; \qquad (5)$$
$$W_{k+1} = \arg\min_{W \ge 0} \langle \nabla f(W_k), W - W_k \rangle + \frac{L_{W_k}}{2}\|W - W_k\|_F^2, \qquad (6)$$
where $L_{W_k} > 0$ and $\nabla f(W) = \nabla_W f(W, H_k)$.
The original cost function (4) is the most frequently used form in the PBB method for NMF and has been widely and deeply researched [17,20,21,22,23]. However, the major disadvantage of (4) is that it is not strongly convex [24,25,26,27,28], so one can only hope to find a stationary point, rather than a global or local minimizer. To overcome this drawback, a proximal modification of the cost function (4) is presented in [18,19], namely, the proximal cost function (5).
At present, the proximal cost function (5) has been used with the PBB method for NMF in [18,19]. Since the cost function (5) is a strongly convex quadratic optimization problem that is bounded below, the subproblem (5) has a unique minimizer. In [18], the authors present a quadratic regularization nonmonotone PBB algorithm to solve (5) and establish its global convergence under mild conditions. Recently, (5) was revisited in [19] for the monotone PBB method, which is also shown to converge globally to a stationary point of (3); the numerical experiments there show that the monotone PBB method can outperform the nonmonotone one under certain conditions. However, when solving problems (4) and (5), the existing PBB-based gradient methods converge slowly due to the nonnegativity constraints. Therefore, this work develops a new fast NMF algorithm.
In this paper, we introduce a prox-linear approximation of $f(W, H_k)$ at $W_k$ based on $\nabla f(W)$, namely, the cost function (6). We then propose an active set identification technique. Next, we present a modified nonmonotone line search technique to improve the efficiency of the nonmonotone line search, in which a new parameter formula is presented to control the degree of nonmonotonicity, and thus improve both the possibility of finding the global optimal solution and the convergence speed. By using the active set identification strategy and the modified nonmonotone line search, a globally convergent method is proposed to solve (6) based on the alternating nonnegative least squares framework. In particular, in each iteration, the identification technique is used to determine active and free variables. We take $(D_t)_{ij} = 0$ or $(D_t)_{ij} = -(Z_t)_{ij}$ to update some active variables, while using a projected Barzilai-Borwein method to update the free variables and the remaining active variables. The calculation speed is further improved by using a larger step size. Finally, numerical experiments on synthetic and image data demonstrate that the proposed algorithm is effective.
This paper is organized in the following manner. In Section 2, we introduce our estimation of the active set and put forward an efficient NMF algorithm. Section 3 presents the global convergence results of this method. The experimental results are given in Section 4. Finally, Section 5 concludes the paper.

2. A Fast PBB Algorithm

In this section, we present an efficient algorithm for solving the NMF problem and show that it is well-defined. Let us first recall some known results about the objective function $f(W, H_k)$.
Lemma 1 
([29]). The following two statements are valid.
(i) The objective function $f(W, H_k)$ of (3) is convex.
(ii) The gradient
$$\nabla_W f(W, H_k) = (WH_k - V)(H_k)^T$$
is Lipschitz continuous with the constant $L_W = \|H_k (H_k)^T\|_2$.
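In code, the gradient and Lipschitz constant of Lemma 1 are direct transcriptions of the formulas above; this NumPy sketch (function names are ours) assumes V, W, and Hk are arrays of compatible shape:

import numpy as np

def grad_W(V, W, Hk):
    # nabla_W f(W, H_k) = (W H_k - V) (H_k)^T
    return (W @ Hk - V) @ Hk.T

def lipschitz_W(Hk):
    # L_W = ||H_k (H_k)^T||_2, the spectral norm (largest singular value)
    return np.linalg.norm(Hk @ Hk.T, 2)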
To facilitate the discussion, we focus on (6) and rewrite it. Note that the cost function (7) is closely related to the one in Xu et al. [30], with the following difference: in our cost function (7) the matrix $U$ is $W_t$, whereas in [30] the matrix $U$ is an extrapolation point based on $W_t$:
$$\min_{W \ge 0} \varphi(U, W) := \langle \nabla f(U), W - U \rangle + \frac{L_W}{2}\|W - U\|_F^2, \qquad (7)$$
where the matrix $U \ge 0$ is fixed.
According to (ii) of Lemma 1, $\varphi(U, W)$ is strongly convex in $W$ for any given $U$. In each iteration, we first solve the following strongly convex quadratic minimization problem to obtain $Z_t$:
$$\min_{W \ge 0} \varphi(W_t, W). \qquad (8)$$
Because the objective function of problem (8) is strongly convex, its solution is unique and has the closed form
$$Z_t = P\Big[W_t - \frac{1}{L_W}\nabla_W f(W_t, H_k)\Big], \qquad (9)$$
where the operator $P[X]$ projects all negative entries of $X$ onto zero.
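Since $P[\cdot]$ only zeroes out negative entries, the closed-form solution (9) is a one-line projected gradient step; a minimal sketch reusing grad_W and lipschitz_W from above:

import numpy as np

def project(X):
    # P[X]: set every negative entry of X to zero
    return np.maximum(X, 0.0)

def prox_step(V, W, Hk):
    # Z_t = P[ W_t - (1/L_W) * nabla_W f(W_t, H_k) ], Equation (9)
    return project(W - grad_W(V, W, Hk) / lipschitz_W(Hk))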
Let $W_{t+1} = Z_t + D_t$, where $D_t$ is the direction obtained by (23) with $\alpha_t$ being the BB step size [11]; then the convergence of $\{W_{t+1}\}$ cannot be guaranteed. Therefore, a globalization strategy based on the modified Armijo line search [31] has been proposed: we seek a step size $\lambda_t$ such that
$$f(Z_t + \lambda_t D_t) \le \max_{0 \le j \le \min\{t, M-1\}} f(Z_{t-j}) + \gamma \lambda_t \langle \nabla f(Z_t), D_t \rangle, \qquad (10)$$
where $M > 0$. Owing to the maximum function, a good function value obtained in some iteration may be discarded, and the numerical performance depends largely on the selection of $M$ in some cases (see [32]).
To overcome these shortcomings and obtain a large step size in each iteration, we present a modified nonmonotone line search rule. The modified line search is as follows: for the current iterate $Z_t$ and the search direction $D_t$ at $Z_t$, we select $\eta_t \in [\eta_{min}, \eta_{max}]$ with $0 < \eta_{min} < \eta_{max} < 1$, $\gamma_t \in [\gamma_{min}, \gamma_{max}(1 - \eta_{max})]$ with $\gamma_{max} < 1$ and $0 < \gamma_{min} < (1 - \eta_{max})\gamma_{max}$, $0 \le \mu < 1$, and $s > 1$, and find a $\lambda_t$ satisfying the following inequality:
$$S_{t+1} \le S_t + \gamma_t \lambda_t \Big[\langle \nabla f(Z_t), D_t \rangle + \frac{\mu}{\alpha_t}\|D_t\|^2\Big], \qquad (11)$$
where $S_t$ is defined as
$$S_t = \begin{cases} f(W_0), & \text{if } t = 0, \\ f(W_t) + \eta_{t-1}(S_{t-1} - f(W_t)), & \text{if } t \ge 1. \end{cases} \qquad (12)$$
Similar to $M$ in (10), the selection of $\eta_t$ in (12) is an important factor in determining the degree of nonmonotonicity (see [33]). Thus, to improve the efficiency of the nonmonotone line search, Ahookhosh et al. [34] choose a varying value for the parameter $\eta_t$ by using a simple formula. Later, Nosratipour et al. [35] argued that $\eta_t$ should be related to a suitable criterion measuring the distance to the optimal solution. Thus, they defined $\eta_t$ by
$$\eta_t = 1 - e^{-\|\nabla f(Z_t)\|}. \qquad (13)$$
However, we found that if the iterative sequence $\{Z_t\}$ is trapped in a narrow curved valley, $\|\nabla f(Z_t)\|$ can approach zero, which yields $\eta_t \approx 0$, so the nonmonotone line search reduces to the standard Armijo line search, which is inefficient owing to the generation of very short or zigzagging steps. To overcome this drawback, we suggest the following $\eta_t$:
$$\eta_t = \frac{2}{\pi}\arctan\big(|f(Z_t) - f(Z_{t-1})|\big). \qquad (14)$$
Obviously, $|f(Z_t) - f(Z_{t-1})|$ is large when the function value decreases rapidly, and then $\eta_t$ is also large, so the nonmonotone strategy is stronger. Conversely, when $f(Z_t)$ is close to the optimal value, $|f(Z_t) - f(Z_{t-1})|$ tends toward zero, and then $\eta_t$ also tends toward zero, so the nonmonotone rule becomes weaker and tends to a monotone rule.
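A small sketch of the two quantities discussed above: the proposed $\eta_t$ from (14) and the reference value $S_t$ from (12); the function names are ours.

import numpy as np

def eta_arctan(f_curr, f_prev):
    # eta_t = (2/pi) * arctan(|f(Z_t) - f(Z_{t-1})|), Equation (14)
    return (2.0 / np.pi) * np.arctan(abs(f_curr - f_prev))

def update_S(S_prev, f_W, eta_prev):
    # S_t = f(W_t) + eta_{t-1} * (S_{t-1} - f(W_t)), Equation (12) for t >= 1
    return f_W + eta_prev * (S_prev - f_W)

A large function-value change pushes eta_arctan toward 1, so S_t stays close to S_{t-1} (strongly nonmonotone), while eta_arctan near 0 makes S_t track f(W_t) and recovers a nearly monotone rule.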
As was observed in [16], an active set method can enhance the efficiency of a locally convergent algorithm and reduce the computational cost. Hereinafter, we introduce an active set identification technique to approximate the support of the solution. In our context, the active set is the subset of zero components of $Z^*$; that is, we introduce the active set $L$ as the index set corresponding to the zero components. Meanwhile, the inactive set $F$ is the support of $Z^*$.
Definition 1. 
Let $\Omega = \{Z \in \mathbb{R}^{m \times r} : Z \ge 0\}$ and let $Z^*$ be a stationary point of (3). We define the active set as follows:
$$L = \{ij : Z^*_{ij} = 0\}. \qquad (15)$$
We further define the inactive set $F$ as the complement of $L$,
$$F = I \setminus L, \qquad (16)$$
where $I = \{11, 12, \ldots, 1r, 21, 22, \ldots, 2r, \ldots, m1, m2, \ldots, mr\}$.
Then, for any $Z_t \in \Omega$, we define the approximations $L(Z_t)$ and $F(Z_t)$ of $L$ and $F$, respectively, by
$$L(Z_t) = \Big\{ij : (Z_t)_{ij} \le \frac{1}{\alpha_t}\nabla f(Z_t)_{ij}\Big\}, \qquad (17)$$
$$F(Z_t) = I \setminus L(Z_t), \qquad (18)$$
where $\alpha_t$ is the BB step size. For simplicity, we abbreviate $L(Z_t)$ and $F(Z_t)$ as $L_t$ and $F_t$, respectively. Similar to Lemma 1 in [21], if strict complementarity is satisfied at $Z^*$, then $L(Z_t)$ coincides with the active set when $Z_t$ is sufficiently close to $Z^*$.
In order to obtain a good estimate of the active set, we further subdivide it into two sets:
$$L_1(Z_t) = \{ij \in L(Z_t) : \nabla f(Z_t)_{ij} \ge c\} \qquad (19)$$
and
$$L_2(Z_t) = \{ij \in L(Z_t) : \nabla f(Z_t)_{ij} < c\}, \qquad (20)$$
where $c > 0$ is a constant.
Obviously, $L_2(Z_t)$ is the index set of variables associated with the first-order necessary condition; therefore, we have reason to set the variables with indices in $L_2(Z_t)$ to 0. In addition, because $L_1(Z_t)$ is an index set that does not satisfy the first-order necessary condition, we further subdivide $L_1(Z_t)$ into two subsets:
$$\bar L_1(Z_t) = \{ij : ij \in L_1(Z_t) \text{ and } (Z_t)_{ij} = 0\} \qquad (21)$$
and
$$\tilde L_1(Z_t) = \{ij : ij \in L_1(Z_t) \text{ and } (Z_t)_{ij} \ne 0\}. \qquad (22)$$
For a variable with index in $\bar L_1(Z_t)$, we consider a direction component of the form 0; for a variable with index in $\tilde L_1(Z_t)$, we consider a direction component of the form $-(Z_t)_{ij}$, so as to improve the corresponding component. Thus, through the above discussion, we define the direction in the following compact form:
$$(D_t)_{ij} = \begin{cases} 0, & \text{if } ij \in \bar L_1(Z_t), \\ -(Z_t)_{ij}, & \text{if } ij \in \tilde L_1(Z_t), \\ \big(P[Z_t - \alpha_t \nabla f(Z_t)] - Z_t\big)_{ij}, & \text{if } ij \in L_2(Z_t) \cup F(Z_t), \end{cases} \qquad (23)$$
where $\alpha_t$ is the BB stepsize.
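The direction (23) can be assembled entrywise with boolean masks. The following NumPy sketch is our illustration under the reconstructed set definitions (17)–(22); the threshold value c = 1e-6 is an assumption, not a value from the paper:

import numpy as np

def direction(Z, G, alpha, c=1e-6):
    # Z: current iterate Z_t; G: gradient of f at Z_t; alpha: BB step size
    active = Z <= G / alpha                  # L(Z_t), Equation (17)
    L1 = active & (G >= c)                   # L_1(Z_t), Equation (19)
    L1_bar = L1 & (Z == 0)                   # indices in (21): direction 0
    L1_tilde = L1 & (Z != 0)                 # indices in (22): direction -Z
    D = np.maximum(Z - alpha * G, 0.0) - Z   # PBB direction on L_2(Z_t) and F(Z_t)
    D[L1_bar] = 0.0
    D[L1_tilde] = -Z[L1_tilde]
    return D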
Finally, we let
$$W_{t+1} = Z_t + \lambda_t D_t, \qquad (24)$$
where $\lambda_t$ is the step size found by the nonmonotone line search (11).
It is known from [36] that a larger step size can significantly accelerate the convergence rate of the algorithm, so by adding a relaxation factor $s$ to the update rule (24), we modify it as
$$W_{t+1} = Z_t + s\lambda_t D_t \qquad (25)$$
for a relaxation factor $s > 1$. We show by the numerical experiments in Section 4.4 that the optimal parameter $s$ in (25) is $s = 1.7$.
Based on the above discussion, we develop a nonmonotone projected Barzilai-Borwein method based on the active set strategy proposed above and outline it in Algorithm 1. We can follow a similar procedure for updating $H$.
Algorithm 1 Nonmonotone projected Barzilai-Borwein algorithm (NMPBB).
Step 1. Initialize $\alpha_0 = 1$, $\eta_0 \in (0, 1)$; choose parameters $\eta_t \in [\eta_{min}, \eta_{max}]$, $\gamma_t \in [\gamma_{min}, \gamma_{max}(1 - \eta_{max})]$, $\alpha_{max} > \alpha_{min} > 0$, $\mu \in [0, 1)$, $\rho \in (0, 1)$, $s > 1$, $L_W = \|H_k (H_k)^T\|_2$, and $W_0 = W_k$. Set $t = 0$.
Step 2. If $\|P[W_t - \nabla f(W_t)] - W_t\| = 0$, stop.
Step 3. Compute $Z_t = P[W_t - \frac{1}{L_W}\nabla_W f(W_t, H_k)]$.
Step 4. Compute $S_t$ by (12) and compute $D_t$ by (23).
Step 5. Nonmonotone line search. Let $m_t$ be the smallest nonnegative integer $m$ satisfying
$$S_{t+1} \le S_t + \gamma_t \rho^m \Big[\langle \nabla f(Z_t), D_t \rangle + \frac{\mu}{\alpha_t}\|D_t\|^2\Big], \qquad (26)$$
where $D_t = P[Z_t - \alpha_t \nabla f(Z_t)] - Z_t$. Set $\lambda_t = \rho^{m_t}$ and calculate $W_{t+1} = Z_t + s\lambda_t D_t$.
Step 6. Calculate $X_t = W_{t+1} - Z_t$ and $Y_t = \nabla f(W_{t+1}) - \nabla f(Z_t)$. If $\langle X_t, Y_t \rangle \le 0$, set $\alpha_{t+1} = \alpha_{max}$; otherwise, set $\alpha_{t+1} = \min\{\alpha_{max}, \max\{\alpha_{min}, \langle X_t, X_t \rangle / \langle X_t, Y_t \rangle\}\}$.
Step 7. Set $t = t + 1$ and go to Step 2.
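To make the flow of Algorithm 1 concrete, here is a compact Python sketch of the inner loop that updates W for a fixed H_k, reusing the helpers sketched earlier (project, grad_W, lipschitz_W, direction). It is a simplified reading, not a verbatim transcription: the acceptance test drops the 1/(1 - eta_t) factor from the equivalent form (28) of (26), the backtracking is capped and the iterate re-projected for robustness (our additions), and eta_t follows the arctan rule (14), whereas Section 4.2 uses the recursion (53).

import numpy as np

def nmpbb_update_W(V, W, Hk, max_iter=100, s=1.7, rho=0.25, gamma=1e-3,
                   mu=0.0, a_min=1e-20, a_max=1e20, tol=1e-8):
    f = lambda X: 0.5 * np.linalg.norm(V - X @ Hk, 'fro') ** 2
    L = lipschitz_W(Hk)
    alpha, S, f_prev = 1.0, f(W), f(W)
    for _ in range(max_iter):
        G = grad_W(V, W, Hk)
        if np.linalg.norm(project(W - G) - W) <= tol:        # Step 2: stationarity check
            break
        Z = project(W - G / L)                               # Step 3: Equation (9)
        GZ = grad_W(V, Z, Hk)
        D = direction(Z, GZ, alpha)                          # Step 4: Equation (23)
        dec = np.sum(GZ * D) + (mu / alpha) * np.sum(D * D)
        lam = 1.0
        for _ in range(30):                                  # Step 5: backtracking (cap is ours)
            if f(Z + s * lam * D) <= S + gamma * lam * dec:  # simplified form of (26)
                break
            lam *= rho
        W_new = project(Z + s * lam * D)                     # Equation (25); projection added for safety
        X, Y = W_new - Z, grad_W(V, W_new, Hk) - GZ          # Step 6: BB step size
        sty = np.sum(X * Y)
        alpha = a_max if sty <= 0 else min(a_max, max(a_min, np.sum(X * X) / sty))
        eta = (2 / np.pi) * np.arctan(abs(f(W_new) - f_prev))  # Equation (14)
        S = f(W_new) + eta * (S - f(W_new))                  # Equation (12)
        f_prev, W = f(W_new), W_new
    return W

The full method alternates this update with the symmetric update of H inside the alternating nonnegative least squares framework.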
Remark 1. 
According to (11) and the definition of $S_t$, we obtain
$$(1 - \eta_t) f(Z_t + s\lambda_t D_t) \le (1 - \eta_t) S_t + \gamma_t \lambda_t \Big[\langle \nabla f(Z_t), D_t \rangle + \frac{\mu}{\alpha_t}\|D_t\|^2\Big]. \qquad (27)$$
Since $\eta_t < 1$, we find that (11) is equivalent to
$$f(Z_t + s\lambda_t D_t) \le S_t + \frac{1}{1 - \eta_t}\gamma_t \lambda_t \Big[\langle \nabla f(Z_t), D_t \rangle + \frac{\mu}{\alpha_t}\|D_t\|^2\Big]. \qquad (28)$$
If $\gamma_{min}$ and $\gamma_{max}$ are close to 0 and 1, respectively, and $\mu = 0$, then (11) reduces to Gu's line search in [33] with $\gamma_t = \gamma(1 - \eta_t)$ and $\gamma \in [\frac{\gamma_{min}}{1 - \eta_t}, \gamma_{max}]$, which implies that Gu's line search condition in [33] can be regarded as a special case of (11). In addition, when $\mu = 0$ and $\eta_t = 0$, the line search rule (11) reduces to the Armijo line search rule.
Next, we prove that the improved nonmonotone line search is well-defined. Before presenting this fact, we define the scaled projected gradient direction by
$$D_\alpha(W) = P[W - \alpha \nabla f(W)] - W$$
for all $\alpha > 0$ and $W \ge 0$. The next Lemma 2 is very important in our proof.
Lemma 2 
([37]). For each $\alpha \in (0, \alpha_{max}]$ and $W \ge 0$,
(i) $\langle \nabla f(W), D_\alpha(W) \rangle \le -\frac{1}{\alpha}\|D_\alpha(W)\|^2 \le -\frac{1}{\alpha_{max}}\|D_\alpha(W)\|^2$;
(ii) $W$ is a stationary point of (3) if and only if $D_\alpha(W) = 0$.
The following lemma states that $D_t = 0$ if and only if the iterate $Z_t$ is a stationary point of problem (3).
Lemma 3.
Let $D_t$ be calculated by (23). Then $D_t = 0$ if and only if $Z_t$ is a stationary point of problem (3).
Proof. 
Let $(D_t)_{ij} = 0$. It is obvious that $(Z_t)_{ij}$ satisfies the stationarity conditions of problem (3) when $ij \in \bar L_1(Z_t)$. If $ij \in \tilde L_1(Z_t)$, we have
$$0 = (D_t)_{ij} = -(Z_t)_{ij} \le \frac{1}{\alpha_t}\nabla f(Z_t)_{ij}.$$
The above inequality implies that $\nabla f(Z_t)_{ij} \ge 0$. By the KKT conditions, $(Z_t)_{ij}$ satisfies the stationarity conditions of problem (3). If $(D_t)_{ij} = 0$ for $ij \in L_2(Z_t) \cup F(Z_t)$, then by (ii) of Lemma 2, $(Z_t)_{ij}$ satisfies the stationarity conditions of problem (3).
Conversely, assume that $Z_t$ is a stationary point of (3). From the KKT conditions, (17) and (18), we have
$$\bar L_t = \{ij : (Z_t)_{ij} = 0\}, \quad \bar F_t = \{ij : (Z_t)_{ij} > 0\}.$$
By the definition of $(D_t)_{ij}$, we have $(D_t)_{ij} = 0$ for all $ij \in L_1(Z_t)$, and then, from (ii) of Lemma 2, $(D_t)_{ij} = 0$ for all $ij \in L_2(Z_t)$. Therefore, $(D_t)_{ij} = 0$ for all $ij \in \bar L_t$. For the other case, since $\nabla f(Z_t)_{ij} = 0$ for $ij \in \bar F_t$ and $(Z_t)_{ij}$ is a feasible point, the definition of $(D_t)_{ij}$ gives $(D_t)_{ij} = 0$ for $ij \in \bar F_t$. □
The next Lemma 4 is very important in our proof.
Lemma 4. 
Let $\{Z_t\}$ be the sequence produced by Algorithm 1. Then
$$\langle \nabla f(Z_t), D_t(Z_t) \rangle \le -\frac{1}{\alpha_t}\|D_t(Z_t)\|^2, \qquad (29)$$
$$\|D_t(Z_t)\| \le \alpha_t \|\nabla f(Z_t)\|. \qquad (30)$$
Proof. 
By (23), we know
$$(D_t)_{ij} = \begin{cases} 0, & \text{if } ij \in \bar L_1(Z_t), \\ -(Z_t)_{ij}, & \text{if } ij \in \tilde L_1(Z_t), \\ \big(P[Z_t - \alpha_t \nabla f(Z_t)] - Z_t\big)_{ij}, & \text{if } ij \in L_2(Z_t) \cup F(Z_t). \end{cases}$$
If $ij \in \bar L_1(Z_t)$, it is obvious that $\langle \nabla f(Z_t)_{ij}, (D_t(Z_t))_{ij} \rangle \le -\frac{1}{\alpha_t}\|(D_t(Z_t))_{ij}\|^2$ holds.
If $ij \in L_2(Z_t) \cup F(Z_t)$, from (i) of Lemma 2, we have
$$\langle \nabla f(Z_t)_{ij}, (D_t(Z_t))_{ij} \rangle \le -\frac{1}{\alpha_t}\|(D_t(Z_t))_{ij}\|^2. \qquad (31)$$
Thus, we now only need to prove that
$$\langle \nabla f(Z_t)_{ij}, (D_t(Z_t))_{ij} \rangle \le -\frac{1}{\alpha_t}\|(D_t(Z_t))_{ij}\|^2, \quad ij \in \tilde L_1(Z_t). \qquad (32)$$
If $(D_t(Z_t))_{ij} = 0$, the inequality (32) holds. If $(D_t(Z_t))_{ij} \ne 0$, then for all $ij \in \tilde L_1(Z_t)$, from (17) and (22), we have
$$(D_t(Z_t))_{ij} = -(Z_t)_{ij} \quad \text{and} \quad (Z_t)_{ij} \le \frac{1}{\alpha_t}\nabla f(Z_t)_{ij}, \qquad (33)$$
which leads to
$$\langle \nabla f(Z_t)_{ij}, (D_t(Z_t))_{ij} \rangle \le -\frac{1}{\alpha_t}\|(D_t(Z_t))_{ij}\|^2, \quad ij \in \tilde L_1(Z_t).$$
The above deduction implies that the inequality (32) holds for $ij \in \tilde L_1(Z_t)$. Combining (31)–(33), we obtain that (29) holds. By the Cauchy–Schwarz inequality, (30) follows from (29). □
The following lemma is borrowed from Lemma 3 of [18].
Lemma 5
([18]). Suppose that Algorithm 1 generates $\{Z_t\}$ and $\{W_t\}$. Then
$$f(Z_t) \le f(W_t) - \frac{L_W}{2}\|Z_t - W_t\|^2. \qquad (34)$$
Now, we show a nice property of our line search.
Lemma 6.
Suppose that Algorithm 1 generates the sequences $\{Z_t\}$ and $\{W_t\}$. Then
$$f(W_t) \le S_t. \qquad (35)$$
Proof. 
Based on the definition of $S_t$, we have
$$S_t - S_{t-1} = f(W_t) + \eta_{t-1}(S_{t-1} - f(W_t)) - S_{t-1} = (1 - \eta_{t-1})(f(W_t) - S_{t-1}) \le 0, \qquad (36)$$
where the last inequality follows from Lemma 2 and $\mu \in [0, 1)$. From $1 - \eta_{t-1} > 0$, it follows that $f(W_t) - S_{t-1} \le 0$, i.e., $f(W_t) \le S_{t-1}$.
Therefore, if $\eta_{t-1} \ne 0$, from (12) we have
$$S_t - f(W_t) = f(W_t) + \eta_{t-1}(S_{t-1} - f(W_t)) - f(W_t) = \eta_{t-1}(S_{t-1} - f(W_t)) \ge 0, \qquad (37)$$
where the last inequality follows from (36). Thus, (37) indicates that
$$f(W_t) \le S_t.$$
In addition, if $\eta_{t-1} = 0$, we have $f(W_t) = S_t$. □
It follows from Lemma 6 that
$$f(W_t) \le S_t \le \cdots \le S_0 = f(W_0). \qquad (38)$$
In addition, for any initial iterate $W_0 \ge 0$, Algorithm 1 generates sequences $\{Z_t\}$ and $\{W_t\}$ that are both contained in the level set
$$\mathcal{L}(W_0) = \{W \mid f(W) \le f(W_0),\ W \ge 0\}. \qquad (39)$$
Again, from Lemma 6, the theorem below is easily obtained.
Theorem 1.
Assume that the level set $\mathcal{L}(W_0)$ is bounded. Then the sequence $\{S_t\}$ is convergent.
Proof.
First, we show that $\{W_t\} \subseteq \mathcal{L}(W_0)$. According to (35), we have
$$f(W_t) \le S_t \le S_{t-1} \le \cdots \le S_0 = f(W_0), \quad \forall t \in \mathbb{N}.$$
Therefore, $\{W_t\} \subseteq \mathcal{L}(W_0)$ for all $t \in \mathbb{N}$.
From (39), there exists $\tau$ such that
$$\tau \le f(W_{t+n}) \le S_{t+n} \le S_{t-1+n} \le \cdots \le S_{t+1} \le S_t, \quad \forall n \in \mathbb{N};$$
that is, the sequence $\{S_t\}$ has a lower bound. Since the sequence $\{S_t\}$ is nonincreasing, it is convergent. □
Next, we show that the line search (11) is well-defined.
Theorem 2.
Suppose that Algorithm 1 generates the sequences $\{Z_t\}$ and $\{W_t\}$. Then step 5 of Algorithm 1 is well-defined.
Proof. 
We prove that the line search stops after a finite number of steps. To establish a contradiction, suppose that no $\lambda_t$ satisfying (26) exists; then, for all sufficiently large positive integers $m$, according to Lemmas 5 and 6, we have
$$S_{t+1} > S_t + \gamma_t \rho^m \Big[\langle \nabla f(Z_t), D_t \rangle + \frac{\mu}{\alpha_t}\|D_t\|^2\Big]. \qquad (40)$$
According to (40) and the definition of $S_t$, we have
$$(1 - \eta_t) f(Z_t + s\rho^m D_t) > (1 - \eta_t) S_t + \gamma_t \rho^m \Big[\langle \nabla f(Z_t), D_t \rangle + \frac{\mu}{\alpha_t}\|D_t\|^2\Big].$$
Since $\eta_t < 1$, we find that (40) is equivalent to
$$f(Z_t + s\rho^m D_t) > S_t + \frac{1}{1 - \eta_t}\gamma_t \rho^m \Big[\langle \nabla f(Z_t), D_t \rangle + \frac{\mu}{\alpha_t}\|D_t\|^2\Big].$$
From Lemmas 5 and 6, we have
$$f(Z_t + s\rho^m D_t) > f(Z_t) + \frac{\gamma_t \rho^m}{1 - \eta_t}\Big[\langle \nabla f(Z_t), D_t \rangle + \frac{\mu}{\alpha_t}\|D_t\|^2\Big].$$
Due to $\langle \nabla f(Z_t), D_t \rangle + \frac{\mu}{\alpha_t}\|D_t\|^2 \ge \langle \nabla f(Z_t), D_t \rangle$, we thus have
$$f(Z_t + s\rho^m D_t) - f(Z_t) > \frac{1}{1 - \eta_t}\gamma_t \rho^m \langle \nabla f(Z_t), D_t \rangle.$$
According to the mean value theorem, there is a $\theta_t \in (0, 1)$ such that
$$s\rho^m \langle \nabla f(Z_t + \theta_t s\rho^m D_t), D_t \rangle > \frac{1}{1 - \eta_t}\gamma_t \rho^m \langle \nabla f(Z_t), D_t \rangle,$$
that is,
$$\langle \nabla f(Z_t + \theta_t s\rho^m D_t) - \nabla f(Z_t), D_t \rangle > \Big(\frac{\gamma_t}{s(1 - \eta_t)} - 1\Big)\langle \nabla f(Z_t), D_t \rangle.$$
Letting $m \to \infty$, we find that
$$\Big(\frac{\gamma_t}{s(1 - \eta_t)} - 1\Big)\langle \nabla f(Z_t), D_t \rangle \le 0.$$
Since $0 < \frac{\gamma_t}{1 - \eta_t} < 1 < s$, this gives $\langle \nabla f(Z_t), D_t \rangle \ge 0$, which is inconsistent with the fact that $\langle \nabla f(Z_t), D_t \rangle < 0$. Therefore, step 5 of Algorithm 1 is well-defined. □

3. Convergence Analysis

In this part, we prove the global convergence of NMPBB. To this end, we first present the following result.
Lemma 7. 
Suppose that Algorithm 1 generates the step size $\lambda_t$. If $W_{t+1}$ is not a stationary point of (3), then there is a constant $\tilde\lambda > 0$ such that $\lambda_t \ge \tilde\lambda$.
Proof. 
For the resulting step size $\lambda_t$, if $\lambda_t$ does not satisfy (26), namely,
$$f(Z_t + s\lambda_t D_t) > S_t + \frac{1}{1 - \eta_t}\gamma_t \lambda_t \Big[\langle \nabla f(Z_t), D_t \rangle + \frac{\mu}{\alpha_t}\|D_t\|^2\Big] \ge S_t + \frac{1}{1 - \eta_t}\gamma_t \lambda_t \langle \nabla f(Z_t), D_t \rangle \ge f(Z_t) + \frac{1}{1 - \eta_t}\gamma_t \lambda_t \langle \nabla f(Z_t), D_t \rangle,$$
where the final inequality follows from Lemmas 5 and 6, then
$$f(Z_t + s\lambda_t D_t) - f(Z_t) > \frac{1}{1 - \eta_t}\gamma_t \lambda_t \langle \nabla f(Z_t), D_t \rangle. \qquad (42)$$
By the mean value theorem, we can find a $\theta_t \in (0, 1)$ such that
$$f(Z_t + s\lambda_t D_t) - f(Z_t) = s\lambda_t \langle \nabla f(Z_t + \theta_t s\lambda_t D_t), D_t \rangle = s\lambda_t \langle \nabla f(Z_t), D_t \rangle + s\lambda_t \langle \nabla f(Z_t + \theta_t s\lambda_t D_t) - \nabla f(Z_t), D_t \rangle \le s\lambda_t \langle \nabla f(Z_t), D_t \rangle + s^2 L_W \lambda_t^2 \|D_t\|^2, \qquad (43)$$
where $L_W > 0$ is the Lipschitz constant of $\nabla f$.
Substituting the last inequality of (43) into (42), we find that
$$\lambda_t \ge \frac{s(1 - \eta_t) - \gamma_t}{L_W s^2 \alpha_{max}(1 - \eta_t)}.$$
From $\eta_t \in [\eta_{min}, \eta_{max}]$ and $\gamma_t \in [\gamma_{min}, \gamma_{max}(1 - \eta_{max})]$, we have
$$\lambda_t \ge \frac{s(1 - \eta_{max}) - \gamma_{max}}{L_W s^2 \alpha_{max}(1 - \eta_{min})} := \tilde\lambda. \qquad \Box$$
Lemma 8. 
Assume that Algorithm 1 generates the sequence $\{W_t\}$ and that the level set $\mathcal{L}(W_0)$ is bounded. Then:
(i)
$$\lim_{t\to\infty} S_t = \lim_{t\to\infty} f(W_t).$$
(ii) There is a positive constant $\delta$ such that
$$S_t - f(W_{t+1}) \ge \delta\|D_t\|^2.$$
Proof. 
(i) By the definition of $S_{t+1}$, for $t \ge 1$ we have
$$S_{t+1} - S_t = (1 - \eta_t)(f(W_{t+1}) - S_t).$$
Since $\eta_{max} \in (0, 1)$ and $\eta_t \in [\eta_{min}, \eta_{max}]$ for all $t$,
$$1 - \eta_{min} \ge 1 - \eta_t \ge 1 - \eta_{max} > 0.$$
According to Theorem 1, as $t \to \infty$,
$$\lim_{t\to\infty} \frac{1}{1 - \eta_{max}}(S_{t+1} - S_t) = \lim_{t\to\infty} \frac{1}{1 - \eta_{min}}(S_{t+1} - S_t) = 0,$$
which implies that
$$\lim_{t\to\infty}\big(f(W_{t+1}) - S_t\big) = 0.$$
(ii) From (11) and (i) of Lemma 2, we have
$$S_t - f(W_{t+1}) \ge -\frac{1}{1 - \eta_t}\gamma_t \lambda_t \Big[\langle \nabla f(Z_t), D_t \rangle + \frac{\mu}{\alpha_t}\|D_t\|^2\Big] \ge \frac{\gamma_{min}}{1 - \eta_{min}} \cdot \frac{\lambda_t}{\alpha_t}(1 - \mu)\|D_t\|^2 \ge \frac{\gamma_{min}\tilde\lambda(1 - \mu)}{(1 - \eta_{min})\alpha_{max}}\|D_t\|^2 = \delta\|D_t\|^2,$$
where $\delta = \frac{\gamma_{min}\tilde\lambda(1 - \mu)}{(1 - \eta_{min})\alpha_{max}}$. □
The global convergence of Algorithm 1 is proved by the theorem below.
Theorem 3.
Suppose that Algorithm 1 generates the sequences $\{Z_t\}$ and $\{W_t\}$. Then
$$\lim_{t\to\infty}\|D_t\| = 0.$$
Proof. 
According to (ii) of Lemma 8, we have
$$S_t - f(W_{t+1}) \ge \delta\|D_t\|^2 \ge 0, \quad \forall t \in \mathbb{N}.$$
Then, based on (i) of Lemma 8, letting $t \to \infty$, we obtain
$$\lim_{t\to\infty}\|D_t\| = 0. \qquad \Box$$
Combining Theorem 3, Lemma 3, and (25), we obtain the main convergence result as follows.
Theorem 4.
Assume that the level set $\mathcal{L}(W_0)$ is bounded. Then any accumulation point of the sequence $\{W_t\}$ generated by Algorithm 1 is a stationary point of (3).

4. Numerical Experiments

In the following, using synthetic datasets and real-world datasets (the ORL and Yale image databases; both are available in MATLAB format at http://www.cad.zju.edu.cn/home/dengcai/Data/FaceData.html (accessed on 26 December 2023)), we present numerical experiments comparing the performance of NMPBB with that of five other efficient methods: NeNMF [29], the projected BB method APBB2 [17] (the code is available at http://homepages.umflint.edu/~lxhan/software.html (accessed on 26 December 2023)), QRPBB [18], hierarchical alternating least squares (HALS) [38], and the block coordinate descent (BCD) method [39]. All reported numerical results were obtained using MATLAB v8.1 (R2013a) on a Lenovo laptop.

4.1. Stopping Criterion

According to the Karush-Kuhn-Tucker (KKT) conditions for constrained optimization, $(W^{(k)}, H^{(k)})$ is a stationary point of the NMF problem (2) if and only if $\nabla^P_W f(W, H) = 0$ and $\nabla^P_H f(W, H) = 0$ are simultaneously satisfied, where
$$[\nabla^P_W f(W, H)]_{ij} = \begin{cases} [\nabla_W f(W, H)]_{ij}, & \text{if } W_{ij} > 0, \\ \min\{0, [\nabla_W f(W, H)]_{ij}\}, & \text{if } W_{ij} = 0, \end{cases} \qquad (51)$$
and $\nabla^P_H f(W^{(k)}, H^{(k)})$ is defined analogously. Hence, we employ the stopping criterion below, which is also used in [40]:
$$\big\|[\nabla^P_W f(W^{(k)}, H^{(k)}),\ \nabla^P_H f(W^{(k)}, H^{(k)})^T]\big\|_F \le \epsilon \big\|[\nabla^P_W f(W^{(1)}, H^{(1)}),\ \nabla^P_H f(W^{(1)}, H^{(1)})^T]\big\|_F, \qquad (52)$$
where $\epsilon > 0$ is a tolerance. When employing the stopping criterion (52), we need to pay attention to the scaling degrees of freedom of the NMF solution, as discussed in [41].
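The projected gradient stopping test (52) can be coded directly from the definitions above; a sketch with names of our choosing:

import numpy as np

def proj_grad_norm(V, W, H):
    # Frobenius norm of the stacked projected gradients [grad_W^P f, (grad_H^P f)^T]
    GW = (W @ H - V) @ H.T                           # nabla_W f(W, H)
    GH = W.T @ (W @ H - V)                           # nabla_H f(W, H)
    PW = np.where(W > 0, GW, np.minimum(GW, 0.0))    # Equation (51)
    PH = np.where(H > 0, GH, np.minimum(GH, 0.0))    # analogous projection for H
    return np.sqrt(np.linalg.norm(PW, 'fro')**2 + np.linalg.norm(PH, 'fro')**2)

# Stopping rule (52): stop when
#   proj_grad_norm(V, W_k, H_k) <= eps * proj_grad_norm(V, W_1, H_1)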

4.2. Synthetic Data

In this subsection, the NMPBB method and the other three ANLS-based methods are first tested on synthetic datasets. Since the matrix V in this test is constructed to be a low-rank matrix, we write it as V = LR, where L and R are generated by the MATLAB commands max(0, randn(m, r)) and max(0, randn(r, n)), respectively.
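In Python, the same construction reads as follows (mirroring the MATLAB commands above; the seed and the problem size chosen here are our illustration):

import numpy as np

rng = np.random.default_rng(0)                        # fixed seed, our choice
m, n, r = 200, 100, 10                                # first problem size in Table 1
L_fac = np.maximum(0.0, rng.standard_normal((m, r)))  # L = max(0, randn(m, r))
R_fac = np.maximum(0.0, rng.standard_normal((r, n)))  # R = max(0, randn(r, n))
V = L_fac @ R_fac                                     # low-rank nonnegative test matrix V = LR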
For NMPBB, we adopt the following parameters in the subsequent experiments:
$$\alpha_{max} = 10^{20}, \quad \alpha_{min} = 10^{-20}, \quad \rho = 0.25, \quad \gamma = 10^{-3}.$$
These settings are identical to those of APBB2 and QRPBB. We take $s = 1.7$ for NMPBB (the reason for selecting the relaxation factor $s = 1.7$ is given in Section 4.4) and $tol = 10^{-8}$ for all compared algorithms. In addition, for NMPBB we choose $\eta_0 = 0.15$ and update $\eta_t$ by the following recursive formula:
$$\eta_t = \begin{cases} \dfrac{\eta_0}{2}, & \text{if } t = 1, \\[4pt] \dfrac{\eta_{t-1} + \eta_{t-2}}{2}, & \text{if } t \ge 2. \end{cases} \qquad (53)$$
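The recursion simply averages the two previous values; a tiny sketch showing the first few terms from $\eta_0 = 0.15$ (since $\eta_t + \eta_{t-1}/2$ is invariant, the iterates settle toward $(\eta_0 + 2\eta_1)/3 = 0.1$):

etas = [0.15]                    # eta_0
etas.append(etas[0] / 2.0)       # eta_1 = eta_0 / 2 = 0.075
for t in range(2, 7):
    etas.append((etas[t - 1] + etas[t - 2]) / 2.0)   # Equation (53) for t >= 2
print(etas)  # [0.15, 0.075, 0.1125, 0.09375, 0.103125, 0.0984375, 0.10078125]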
We set the maximum number of iterations of all algorithms to 50,000. All other parameters of APBB2, NeNMF, and QRPBB are kept at their default values.
For each problem considered, we randomly generated 10 different starting points, and the average results over these starting points are presented in Table 1. The item iter is the number of iterations required to meet the termination condition (52); niter is the total number of sub-iterations for solving $W$ and $H$; residual is the relative error $\|V - W_k H_k\|_F / \|V\|_F$; pgn is the final value of the projected gradient norm $\|[\nabla^P_H f(W_k, H_k), \nabla^P_W f(W_k, H_k)]\|_F$; and time is the CPU time in seconds.
Table 1 clearly indicates that all methods met the convergence condition within a reasonable number of iterations. Table 1 also shows that our NMPBB needs the least execution time and the smallest number of sub-iterations among all methods, particularly for large-scale problems.
Since the NMPBB method is closely related to the QRPBB method, and since the hierarchical ALS (HALS) algorithm, which uses coordinate descent to solve the NMF subproblems, is known to be among the most effective methods on most occasions, we further compare NMPBB, QRPBB, HALS, and BCD. The four methods are compared on eight randomly generated problems corrupted by independent Gaussian noise with a signal-to-noise ratio of 30 dB; in Figure 2, Figure 3 and Figure 4, each run is terminated when the stopping criterion (52) is satisfied with $\epsilon = 10^{-8}$ or the number of iterations exceeds 30. Figure 2 shows the objective function value versus the number of iterations. From Figure 2, for most of the test problems, we conclude that NMPBB decreases the objective function much more quickly than the other three methods within 30 iterations. This may be because NMPBB exploits an efficient modified nonmonotone line search and adds a relaxation factor $s$ to the update rules of $W_{t+1}$ and $H_{t+1}$; hence, NMPBB significantly outperforms the other three methods. Figure 3 shows the relative residual error versus the number of iterations, and Figure 4 shows the relative residual error versus CPU time. The results in Figure 3 and Figure 4 are consistent with those in Figure 2.

4.3. Image Data

The ORL image database is a collection of 400 face images of 40 individuals, with 10 images per individual. The dataset includes variations in lighting, facial expressions (open or closed eyes, smiling or not smiling), and facial details (glasses or no glasses). Some subjects have multiple photos taken at different times. All images were taken against a dark homogeneous background with the subjects in an upright frontal position (with tolerance for some side movement). The images are represented by the rows of the matrix $V$, which has 400 rows and 1024 columns.
The Yale face database was created at the Yale Center for Computational Vision and Control. It consists of 165 gray-scale images of 15 individuals, with 11 images per subject. The facial images were captured under different lighting conditions (left-light, center-light, right-light), with various facial expressions (normal, happy, sad, surprised, and winking), and with or without glasses. The images are represented by the rows of the matrix $V$, which has 165 rows and 1024 columns.
For all the databases, we used randomly generated starting points with $\epsilon = 10^{-8}$ in (52), the maximum number of iterations (maxit) for all algorithms was set to 50,000, and the average results are presented in Table 2. From Table 2, we conclude that the QRPBB method converges in fewer iterations and less CPU time than APBB2 and NeNMF, and that, compared with QRPBB, our NMPBB method requires only about a quarter of the CPU time to satisfy the prescribed tolerance. Although the residuals of NMPBB are not the smallest among all algorithms on all databases, the pgn results show that the solutions found by NMPBB are closer to stationary points.

4.4. The Importance of Relaxation Factor s

In the following, we experimentally examine the importance of the relaxation factor $s$ used in the update rules of $W_{t+1}$ and $H_{t+1}$. We run NMPBB with $s = 0.1, 0.3, 0.7, 1.0, 1.3, 1.7, 1.9$ on the same synthetic datasets as in Section 4.2. We set the maximum number of iterations to 30, and all other parameters have the same values as in Section 4.2. Figure 5 shows the relative residual error versus run time. From Figure 5, we can see that the relaxation factor fails to accelerate convergence when $s < 1$, while increasing $s$ significantly accelerates convergence when $1 < s < 2$. For NMPBB, $s = 1.7$ appears to be the best among the tested values in terms of convergence speed; hence, $s = 1.7$ is used in NMPBB in all experiments.

5. Conclusions

In this paper, a prox-linear quadratic regularization objective function is presented, whose prox-linear term leads to strongly convex quadratic subproblems. Then, we propose a new line search technique based on the idea of [33]. Based on the new line search, we put forward a globally convergent method with a larger step size to solve the subproblems. Finally, a series of numerical results shows that the method is a promising tool for NMF.
Symmetric nonnegative matrix factorization is a special but important class of NMF which has found numerous applications in data analysis such as various clustering tasks. Therefore, a direction for future research would be to extend the proposed algorithm to solve symmetric nonnegative matrix factorization problems.

Author Contributions

W.L.: supervision, methodology, formal analysis, writing—original draft, writing—review and editing. X.S.: software, data curation, conceptualization, visualization, formal analysis, writing—original draft. All authors have read and agreed to the published version of the manuscript.

Funding

This work is supported by the National Natural Science Foundation of China under grant No. 12201492.

Data Availability Statement

The datasets generated or analyzed during this study are available in the face databases in matlab format at http://www.cad.zju.edu.cn/home/dengcai/Data/FaceData.html (accessed on 26 December 2023).

Conflicts of Interest

The authors declare that they have no conflicts of interest.

References

1. Gong, P.H.; Zhang, C.S. Efficient nonnegative matrix factorization via projected Newton method. Pattern Recognit. 2012, 45, 3557–3565.
2. Lee, D.D.; Seung, H.S. Learning the parts of objects by non-negative matrix factorization. Nature 1999, 401, 788–791.
3. Lee, D.D.; Seung, H.S. Algorithms for non-negative matrix factorization. Adv. Neural Inf. Process. Syst. 2001, 13, 556–562.
4. Kim, D.; Sra, S.; Dhillon, I.S. Fast Newton-type methods for the least squares nonnegative matrix approximation problem. SIAM Int. Conf. Data Min. 2007, 1, 38–51.
5. Paatero, P.; Tapper, U. Positive matrix factorization: A non-negative factor model with optimal utilization of error estimates of data values. Environmetrics 1994, 5, 111–126.
6. Ding, C.; Li, T.; Peng, W. On the equivalence between non-negative matrix factorization and probabilistic latent semantic indexing. Comput. Stat. Data Anal. 2008, 52, 3913–3927.
7. Chan, T.H.; Ma, W.K.; Chi, C.Y.; Wang, Y. A convex analysis framework for blind separation of nonnegative sources. IEEE Trans. Signal Process. 2008, 56, 5120–5134.
8. Ding, C.; He, X.; Simon, H. On the equivalence of nonnegative matrix factorization and spectral clustering. SIAM Int. Conf. Data Min. (SDM'05) 2005, 606–610.
9. Févotte, C.; Bertin, N.; Durrieu, J.L. Nonnegative matrix factorization with the Itakura-Saito divergence: With application to music analysis. Neural Comput. 2009, 21, 793–830.
10. Ma, W.K.; Bioucas-Dias, J.; Chan, T.H.; Gillis, N.; Gader, P.; Plaza, A.; Ambikapathi, A.; Chi, C.Y. A signal processing perspective on hyperspectral unmixing. IEEE Signal Process. Mag. 2014, 31, 67–81.
11. Barzilai, J.; Borwein, J.M. Two-point step size gradient methods. IMA J. Numer. Anal. 1988, 8, 141–148.
12. Dai, Y.H.; Liao, L.Z. R-linear convergence of the Barzilai-Borwein gradient method. IMA J. Numer. Anal. 2002, 22, 1–10.
13. Raydan, M. On the Barzilai and Borwein choice of steplength for the gradient method. IMA J. Numer. Anal. 1993, 13, 321–326.
14. Raydan, M. The Barzilai and Borwein gradient method for the large-scale unconstrained minimization problem. SIAM J. Optim. 1997, 7, 26–33.
15. Xiao, Y.H.; Hu, Q.J. Subspace Barzilai-Borwein gradient method for large-scale bound constrained optimization. Appl. Math. Optim. 2008, 58, 275–290.
16. Xiao, Y.H.; Hu, Q.J.; Wei, Z.X. Modified active set projected spectral gradient method for bound constrained optimization. Appl. Math. Model. 2011, 35, 3117–3127.
17. Han, L.X.; Neumann, M.; Prasad, U. Alternating projected Barzilai-Borwein methods for nonnegative matrix factorization. Electron. Trans. Numer. Anal. 2009, 36, 54–82.
18. Huang, Y.K.; Liu, H.W.; Zhou, S.S. Quadratic regularization projected alternating Barzilai-Borwein method for nonnegative matrix factorization. Data Min. Knowl. Discov. 2015, 29, 1665–1684.
19. Huang, Y.K.; Liu, H.W.; Zhou, S.S. An efficient monotone projected Barzilai-Borwein method for nonnegative matrix factorization. Appl. Math. Lett. 2015, 45, 12–17.
20. Li, X.L.; Liu, H.W.; Zheng, X.Y. Non-monotone projection gradient method for non-negative matrix factorization. Comput. Optim. Appl. 2012, 51, 1163–1171.
21. Liu, H.W.; Li, X. Modified subspace Barzilai-Borwein gradient method for non-negative matrix factorization. Comput. Optim. Appl. 2013, 55, 173–196.
22. Bonettini, S. Inexact block coordinate descent methods with application to non-negative matrix factorization. IMA J. Numer. Anal. 2011, 31, 1431–1452.
23. Zdunek, R.; Cichocki, A. Fast nonnegative matrix factorization algorithms using projected gradient approaches for large-scale problems. Comput. Intell. Neurosci. 2008, 2008, 939567.
24. Bai, J.C.; Bian, F.M.; Chang, X.K.; Du, L. Accelerated stochastic Peaceman-Rachford method for empirical risk minimization. J. Oper. Res. Soc. China 2023, 11, 783–807.
25. Bai, J.C.; Han, D.R.; Sun, H.; Zhang, H.C. Convergence on a symmetric accelerated stochastic ADMM with larger stepsizes. CSIAM Trans. Appl. Math. 2022, 3, 448–479.
26. Bai, J.C.; Hager, W.W.; Zhang, H.C. An inexact accelerated stochastic ADMM for separable convex optimization. Comput. Optim. Appl. 2022, 81, 479–518.
27. Bai, J.C.; Li, J.C.; Xu, F.M.; Zhang, H.C. Generalized symmetric ADMM for separable convex optimization. Comput. Optim. Appl. 2018, 70, 129–170.
28. Bai, J.C.; Zhang, H.C.; Li, J.C. A parameterized proximal point algorithm for separable convex optimization. Optim. Lett. 2018, 12, 1589–1608.
29. Guan, N.Y.; Tao, D.C.; Luo, Z.G.; Yuan, B. NeNMF: An optimal gradient method for nonnegative matrix factorization. IEEE Trans. Signal Process. 2012, 60, 2882–2898.
30. Xu, Y.Y.; Yin, W.T. A globally convergent algorithm for nonconvex optimization based on block coordinate update. J. Sci. Comput. 2017, 72, 700–734.
31. Zhang, H.C.; Hager, W.W. A nonmonotone line search technique and its application to unconstrained optimization. SIAM J. Optim. 2004, 14, 1043–1056.
32. Dai, Y.H. On the nonmonotone line search. J. Optim. Theory Appl. 2002, 112, 315–330.
33. Gu, N.Z.; Mo, J.T. Incorporating nonmonotone strategies into the trust region method for unconstrained optimization. Comput. Math. Appl. 2008, 55, 2158–2172.
34. Ahookhosh, M.; Amini, K.; Bahrami, S. A class of nonmonotone Armijo-type line search method for unconstrained optimization. Optimization 2012, 61, 387–404.
35. Nosratipour, H.; Borzabadi, A.H.; Fard, O.S. On the nonmonotonicity degree of nonmonotone line searches. Calcolo 2017, 54, 1217–1242.
36. Glowinski, R. Numerical Methods for Nonlinear Variational Problems; Springer: New York, NY, USA, 1984.
37. Birgin, E.G.; Martinez, J.M.; Raydan, M. Nonmonotone spectral projected gradient methods on convex sets. SIAM J. Optim. 2000, 10, 1196–1211.
38. Cichocki, A.; Zdunek, R.; Amari, S.I. Hierarchical ALS algorithms for nonnegative matrix and 3D tensor factorization. Lect. Notes Comput. Sci. 2007, 4666, 169–176.
39. Xu, Y.Y.; Yin, W.T. A block coordinate descent method for regularized multi-convex optimization with applications to nonnegative tensor factorization and completion. SIAM J. Imaging Sci. 2013, 6, 1758–1789.
40. Lin, C.J. Projected gradient methods for non-negative matrix factorization. Neural Comput. 2007, 19, 2756–2779.
41. Gillis, N. The why and how of nonnegative matrix factorization. arXiv 2015, arXiv:1401.5226.
Figure 1. Visualization illustration of NMF.
Figure 2. Objective value versus iteration on the random problem $\min_{W,H\ge0}\frac{1}{2}\|V-WH\|_F^2$.
Figure 3. Residual value versus iteration on the random problem $\min_{W,H\ge0}\frac{1}{2}\|V-WH\|_F^2$.
Figure 4. Residual value versus CPU time on the random problem $\min_{W,H\ge0}\frac{1}{2}\|V-WH\|_F^2$.
Figure 5. Residual value versus CPU time on the random problem $\min_{W,H\ge0}\frac{1}{2}\|V-WH\|_F^2$.
Table 1. Experimental results on synthetic datasets.

(m, n, r)         Alg     Iter      Niter       Pgn            Time     Residual
(200, 100, 10)    NeNMF   153.3     6073.7      3.44 × 10^-5   0.25     0.4596
                  APBB2   171.9     2442.8      2.76 × 10^-5   0.26     0.4596
                  QRPBB   158.0     1476.4      2.66 × 10^-5   0.19     0.4596
                  NMPBB   50.3      496.4       2.50 × 10^-5   0.09     0.4596
(100, 500, 20)    NeNMF   1946.7    83,561.7    1.62 × 10^-4   14.46    0.4257
                  APBB2   2798.7    48,444.2    1.31 × 10^-4   15.77    0.4257
                  QRPBB   2365.7    26,052.7    1.32 × 10^-4   8.49     0.4258
                  NMPBB   625.4     7400.4      1.31 × 10^-4   2.67     0.4257
(500, 300, 25)    NeNMF   687.3     28,304.9    3.73 × 10^-4   7.30     0.4496
                  APBB2   456.5     8077.3      3.20 × 10^-4   5.00     0.4496
                  QRPBB   436.6     5452.2      3.26 × 10^-4   3.31     0.4496
                  NMPBB   135.1     1958.1      2.77 × 10^-4   1.46     0.4496
(700, 700, 30)    NeNMF   183.4     6638.0      1.04 × 10^-3   3.45     0.4588
                  APBB2   161.5     3438.7      8.83 × 10^-4   4.56     0.4588
                  QRPBB   153.0     2191.9      9.11 × 10^-4   2.78     0.4588
                  NMPBB   60.7      936.4       8.41 × 10^-4   1.05     0.4588
(1000, 500, 30)   NeNMF   221.0     7685.5      1.05 × 10^-3   4.22     0.4578
                  APBB2   180.4     3513.8      8.62 × 10^-4   4.52     0.4578
                  QRPBB   162.8     2195.5      9.41 × 10^-4   2.63     0.4578
                  NMPBB   60.5      937.4       9.17 × 10^-4   1.50     0.4578
(600, 1000, 40)   NeNMF   1139.0    43,519.6    1.69 × 10^-3   33.86    0.4515
                  APBB2   554.4     9117.8      1.40 × 10^-3   18.90    0.4515
                  QRPBB   434.2     5963.0      1.52 × 10^-3   9.69     0.4515
                  NMPBB   143.4     2489.9      1.22 × 10^-3   3.77     0.4515
(1000, 600, 40)   NeNMF   644.5     25,379.5    1.68 × 10^-3   20.00    0.4518
                  APBB2   723.3     12,948.1    1.41 × 10^-3   26.16    0.4518
                  QRPBB   536.5     7686.2      1.31 × 10^-3   12.55    0.4518
                  NMPBB   137.8     2262.7      1.18 × 10^-3   3.53     0.4518
(1000, 2000, 50)  NeNMF   330.8     12,081.3    4.98 × 10^-3   25.35    0.4574
                  APBB2   240.3     4783.6      4.29 × 10^-3   23.41    0.4574
                  QRPBB   252.8     4264.2      3.84 × 10^-3   18.29    0.4574
                  NMPBB   79.1      1558.7      4.10 × 10^-3   6.12     0.4574
(2000, 2000, 50)  NeNMF   172.3     6796.9      8.25 × 10^-3   18.96    0.4629
                  APBB2   147.6     3734.1      7.30 × 10^-3   24.92    0.4629
                  QRPBB   149.0     2524.7      5.83 × 10^-3   16.43    0.4629
                  NMPBB   57.1      1089.3      5.75 × 10^-3   6.81     0.4629
(3000, 1000, 60)  NeNMF   485.7     17,642.4    8.79 × 10^-3   63.10    0.4555
                  APBB2   396.3     7386.3      7.29 × 10^-3   64.50    0.4555
                  QRPBB   380.3     6049.4      6.77 × 10^-3   48.12    0.4555
                  NMPBB   116.2     2141.4      5.81 × 10^-3   16.55    0.4555
(5000, 1000, 70)  NeNMF   1036.9    50,207.5    1.65 × 10^-2   344.92   0.4540
                  APBB2   1397.7    23,570.8    1.47 × 10^-2   433.55   0.4540
                  QRPBB   1307.3    20,456.8    1.36 × 10^-2   304.93   0.4540
                  NMPBB   281.7     5639.0      1.22 × 10^-2   76.28    0.4540
Table 2. Experimental results on Yale and ORL datasets.

(m, n, r)         Alg     Iter        Niter        Pgn            Time      Residual
(165, 1024, 25)   NeNMF   3735.1      178,254.1    4.41 × 10^-1   65.78     0.1930
                  APBB2   3079.6      97,375.7     6.42 × 10^-2   78.75     0.1930
                  QRPBB   2711.1      54,215.7     6.16 × 10^-2   42.25     0.1931
                  NMPBB   1019.2      24,063.1     2.60 × 10^-2   16.57     0.1930
(400, 1024, 25)   NeNMF   13,613.4    836,034.3    7.71 × 10^-2   349.62    0.1117
                  APBB2   9430.6      446,361.6    6.88 × 10^-2   474.26    0.1117
                  QRPBB   7593.5      213,178.5    7.05 × 10^-2   205.26    0.1117
                  NMPBB   1982.7      41,597.0     6.25 × 10^-2   34.22     0.1117