Article

A Proximal Algorithm with Convergence Guarantee for a Nonconvex Minimization Problem Based on Reproducing Kernel Hilbert Space

Hong-Xia Dou 1 and Liang-Jian Deng 2
1 School of Science, Xihua University, Chengdu 610039, China
2 School of Mathematical Sciences, University of Electronic Science and Technology of China, Chengdu 611731, China
* Author to whom correspondence should be addressed.
Symmetry 2021, 13(12), 2393; https://doi.org/10.3390/sym13122393
Submission received: 3 November 2021 / Revised: 24 November 2021 / Accepted: 6 December 2021 / Published: 12 December 2021

Abstract:
The underlying function in a reproducing kernel Hilbert space (RKHS) may be degraded by outliers or deviations, resulting in a symmetry ill-posed problem. This paper proposes a nonconvex minimization model with an $\ell_0$ quasi-norm, based on RKHS, to describe this degraded problem. The underlying function in the RKHS can be represented as a linear combination of reproducing kernels and their coefficients; thus, we estimate the related coefficients in the nonconvex minimization problem. An efficient algorithm is designed to solve the given nonconvex problem by means of a mathematical program with equilibrium constraints (MPEC) and a proximal-based strategy. We prove theoretically that the sequences generated by the designed algorithm converge to the nonconvex problem's local optimal solutions. Numerical experiments also demonstrate the effectiveness of the proposed method.

1. Introduction

The reproducing kernel Hilbert space (RKHS, denoted as $\mathcal{H}$) has been widely studied [1,2,3,4,5,6,7]. Its most important property is that functions in an RKHS can be linearly represented by the reproducing kernel function. In addition, many studies have analyzed the properties of univariate or bivariate functions in an RKHS. In discrete form, these functions can usually be regarded as signals or images, which allows one to build optimization models and solve application problems such as image super-resolution and image restoration.
In general, the Hilbert space can be considered as $\mathcal{H} \subset L^2(P)$, where $P$ is a probability measure on a subset $\mathcal{X} \subset \mathbb{R}$ and $\mathcal{H}$ is complete for a class of real-valued functions $f:\mathcal{X}\to\mathbb{R}$ with $\|f\|_{L^2(P)} < \infty$. Moreover, the reproducing kernel $\kappa$ of $\mathcal{H}$ is defined by two properties: (1) for any $x \in \mathcal{X}$, the function $\kappa(\cdot,x)$ belongs to $\mathcal{H}$; (2) the function $\kappa$ has the so-called reproducing property, that is, $f(x) = \langle f, \kappa(\cdot,x)\rangle_{\mathcal{H}}$ for all $f \in \mathcal{H}$, where $\langle\cdot,\cdot\rangle_{\mathcal{H}}$ denotes the associated inner product. By this relation, we can obtain the Gram matrix $K \in \mathbb{R}^{n\times n}$ by discretizing the reproducing kernel $\kappa(\cdot,x)$ at $n$ sample points, which leads to the following discrete formulation that accounts for a bias $e \in \mathbb{R}^n$:
$$ g = K\alpha + e, \tag{1} $$
where $\alpha \in \mathbb{R}^n$ is the coefficient vector we need to estimate and $K$ is a real symmetric matrix.
In the real world, the underlying function $g$ is generally polluted by outliers and Gaussian perturbations, which leads to the following symmetry ill-posed problem:
$$ y = K\alpha + e + u + \sigma, \tag{2} $$
where $u \in \mathbb{R}^n$ can be viewed as outliers and $\sigma \in \mathbb{R}^n$ stands for Gaussian perturbation. Our final goal is to accurately estimate the coefficients $\alpha$ and $e$ from the known $y$ and $K$. After obtaining $\alpha$ and $e$, we can recover the underlying function $g$ via (1). Note that solving problem (2) is quite a challenging task, since the variables $\alpha$, $e$, $u$, and $\sigma$ in (2) are all unknown, which leads to an ill-posed problem.
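To make the degradation model (2) concrete, the following Python/NumPy sketch simulates an observation $y$ on a uniform grid. The Gaussian (RBF) kernel, the grid, the noise levels, and all variable names are illustrative assumptions of ours; the paper does not specify a particular kernel here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed setup: n sample points on [-1, 1] and a Gaussian (RBF) kernel.
n, bandwidth = 64, 0.2
x = np.linspace(-1.0, 1.0, n)
K = np.exp(-(x[:, None] - x[None, :]) ** 2 / (2 * bandwidth ** 2))  # symmetric Gram matrix

# Illustrative ground-truth coefficients and bias.
alpha_true = rng.standard_normal(n)
e_true = 0.1 * rng.standard_normal(n)
g = K @ alpha_true + e_true                    # clean signal, Equation (1)

# Sparse outliers u (about 10% of entries) and Gaussian perturbation sigma.
u = np.zeros(n)
idx = rng.choice(n, size=n // 10, replace=False)
u[idx] = rng.uniform(-5.0, 5.0, size=idx.size)
sigma = 0.05 * rng.standard_normal(n)

y = g + u + sigma                              # degraded observation, Equation (2)
```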
In [8], Papageorgiou et al. considered that the function $g$ can be linearly represented by a coefficient $\alpha$ and a constant $c$ as $g = K\alpha + c\mathbf{1}$, and proposed a kernel regularized orthogonal matching pursuit (KROMP) method to solve the resulting nonconvex problem. However, the KROMP method has two weaknesses: first, the constant $c$ is neither general nor flexible; second, the convergence of its algorithm is not guaranteed theoretically. Therefore, this paper establishes a nonconvex optimization model for the degraded problem (2) in RKHS, designs an algorithm whose convergence can be guaranteed, and finally shows the effectiveness of the proposed method in simulation experiments.
Regularized modeling is a promising way to deal with ill-posed problems. In fact, the variable $u$ representing outliers is generally sparse, which motivates us to formulate a sparsity-based regularization model. In particular, the $\ell_0$ quasi-norm, which counts the non-zero elements of a vector, is an ideal metric to describe this sparsity. Therefore, the nonconvex minimization problem for solving the ill-posed problem (2) can be written as follows:
$$ \min_{\alpha, e, u \in \mathbb{R}^n} \Phi(\alpha, e, u) = \frac{\mu}{2}\|K\alpha + e + u - y\|_2^2 + \frac{\lambda_1}{2}\|\alpha\|_2^2 + \frac{\lambda_2}{2}\|e\|_2^2 + \|u\|_0, \tag{3} $$
where $\mu$, $\lambda_1$, and $\lambda_2$ are positive parameters. The first term in (3) is derived from the Gaussian perturbation $\sigma$ under the framework of maximum a posteriori (MAP) estimation. The second and third terms are regularization terms that encode the underlying priors on $\alpha$ and $e$. The last term is the $\ell_0$ quasi-norm, which encodes the sparsity prior on the outlier $u$. Note that the regularization model (3) is a nonconvex minimization problem due to the nonconvexity of the $\ell_0$ term. The $\ell_0$ term is often replaced approximately by other terms (e.g., the convex $\ell_1$ norm, or hard thresholding [9,10,11,12]) for simpler computation and convergence guarantees. However, such replacements lose accuracy in describing sparsity, which may lead to unreasonable outcomes.
Nonconvex minimization problems usually raise the following difficulties: (1) Can the designed algorithm effectively solve the minimization model? (2) Can the convergence of the designed algorithm be guaranteed? (3) Does the initial value affect the convergence of the designed algorithm? Many studies have therefore been devoted to overcoming these weaknesses of nonconvex problems.
Recently, it has been shown that nonconvex problems can be reformulated as equivalent minimization problems via the mathematical program with equilibrium constraints (MPEC), which can then be solved effectively by classical algorithms [13,14,15]. For instance, Yuan et al. [14] proposed an equivalent biconvex MPEC formulation for the $\ell_0$ quasi-norm in a nonconvex minimization problem. In addition, proximal alternating algorithms have been widely used to solve nonconvex and nonsmooth problems [16,17,18,19,20,21,22,23,24]. In [18], Bolte et al. proposed the proximal alternating linearized minimization (PALM) framework for nonconvex and nonsmooth minimization problems and gave a convergence analysis of the algorithm.
In this paper, we focus on the above-mentioned difficulties for the nonconvex minimization problem (3) and design an efficient algorithm with a theoretical convergence guarantee. Simple and representative examples are employed to verify the effectiveness of the proposed method. The contributions of this work can be summarized as follows: (1) a new nonconvex minimization model based on RKHS; (2) a convergence guarantee for the algorithm designed for the nonconvex problem.
The outline of this paper is as follows. Section 2 presents the detailed algorithm for the nonconvex minimization problem (3). Section 3 provides the convergence analysis of the algorithm. Numerical results are reported in Section 4. Finally, conclusions are drawn in Section 5.

2. The Solution for the Nonconvex Minimization Problem

Based on the MPEC lemma for the $\ell_0$ quasi-norm (see [14] for more details), the nonconvex minimization problem (3) can be equivalently reformulated as the following model:
$$ \min_{\alpha, e, u, v \in \mathbb{R}^n} \Phi(\alpha, e, u, v) = \frac{\mu}{2}\|K\alpha + e + u - y\|_2^2 + \frac{\lambda_1}{2}\|\alpha\|_2^2 + \frac{\lambda_2}{2}\|e\|_2^2 + \langle \mathbf{1}, \mathbf{1} - v \rangle + \iota_{[0,1]}(v) \quad \text{s.t.}\ \ v \odot |u| = 0, \tag{4} $$
where $\odot$ denotes point-wise multiplication and $\iota_{[0,1]}(v)$ is the indicator function that constrains the elements of $v$ to lie in $[0,1]$. The constrained minimization problem (4) can be rewritten as the following unconstrained minimization problem by a penalty strategy:
$$ \Phi(\alpha, e, u, v) = H(\alpha, e, u, v) + f_1(\alpha) + f_2(e) + f_3(u) + f_4(v), \tag{5} $$
where $H(\alpha, e, u, v) = \frac{\mu}{2}\|K\alpha + e + u - y\|_2^2 + \frac{\beta}{2}\|v \odot |u|\|_2^2$, $f_1(\alpha) = \frac{\lambda_1}{2}\|\alpha\|_2^2$, $f_2(e) = \frac{\lambda_2}{2}\|e\|_2^2$, $f_3(u) = 0$, and $f_4(v) = \langle \mathbf{1}, \mathbf{1} - v \rangle + \iota_{[0,1]}(v)$, with $\beta > 0$ the penalty parameter.
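For concreteness, the penalized objective (5) can be evaluated directly. The short Python sketch below is our own illustrative helper (not from the paper); it assumes $v$ is already feasible, so the indicator term contributes zero, and it is handy for checking the monotone decrease established later in Lemma 1.

```python
import numpy as np

def penalized_objective(alpha, e, u, v, K, y, mu, lam1, lam2, beta):
    """Evaluate Phi in Equation (5) = H + f1 + f2 + f3 + f4 (finite part only)."""
    H = 0.5 * mu * np.sum((K @ alpha + e + u - y) ** 2) \
        + 0.5 * beta * np.sum((v * np.abs(u)) ** 2)
    f1 = 0.5 * lam1 * np.sum(alpha ** 2)
    f2 = 0.5 * lam2 * np.sum(e ** 2)
    f3 = 0.0                                   # f3(u) = 0
    f4 = np.sum(1.0 - v)                       # <1, 1 - v>; indicator of [0,1] assumed zero
    return H + f1 + f2 + f3 + f4
```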
We utilize a proximal-based algorithm to deal with the unconstrained problem (5) by alternately solving for each variable, which leads to the following subproblems:
$$ \alpha^{k+1} \in \arg\min_{\alpha} \ \langle \alpha - \alpha^k, \nabla_{\alpha} H(\alpha^k, e^k, u^k, v^k)\rangle + f_1(\alpha) + \frac{\tau_1}{2}\|\alpha - \alpha^k\|_2^2, \tag{6} $$
$$ e^{k+1} \in \arg\min_{e} \ \langle e - e^k, \nabla_{e} H(\alpha^{k+1}, e^k, u^k, v^k)\rangle + f_2(e) + \frac{\tau_2}{2}\|e - e^k\|_2^2, \tag{7} $$
$$ u^{k+1} \in \arg\min_{u} \ H(\alpha^{k+1}, e^{k+1}, u, v^k) + f_3(u) + \frac{\delta_1}{2}\|u - u^k\|_2^2, \tag{8} $$
$$ v^{k+1} \in \arg\min_{v} \ H(\alpha^{k+1}, e^{k+1}, u^{k+1}, v) + f_4(v) + \frac{\delta_2}{2}\|v - v^k\|_2^2, \tag{9} $$
where the related parameters are all positive, i.e., $\delta_1 > 0$, $\delta_2 > 0$, and $\tau_1 = \gamma_1 \|K^T K\|_F$, $\tau_2 = \gamma_2 \|I_m^T I_m\|_F$, where $I_m \in \mathbb{R}^{m \times m}$ is the identity matrix, $\|\cdot\|_F$ denotes the Frobenius norm, and $\gamma_1 > 1$, $\gamma_2 > 1$.
In particular, the above subproblems are all convex, and their closed-form solutions can be easily calculated as follows:
$$ \alpha^{k+1} = (\lambda_1 + \tau_1)^{-1}\left[\mu K^T(y - K\alpha^k - e^k - u^k) + \tau_1 \alpha^k\right], \tag{10} $$
$$ e^{k+1} = (\lambda_2 + \tau_2)^{-1}\left[\mu (y - K\alpha^{k+1} - e^k - u^k) + \tau_2 e^k\right], \tag{11} $$
$$ u^{k+1} = \frac{\mu\,(y - K\alpha^{k+1} - e^{k+1}) + \delta_1 u^k}{\mu + \beta\, v^k \odot v^k + \delta_1}, \tag{12} $$
$$ v^{k+1} = \min\left\{1, \max\left\{0, \frac{\mathbf{1} + \delta_2 v^k}{\beta\, u^{k+1} \odot u^{k+1} + \delta_2}\right\}\right\}. \tag{13} $$
We iteratively and alternately update $\alpha^{k+1}$, $e^{k+1}$, $u^{k+1}$, and $v^{k+1}$ according to (10)–(13), where the divisions in (12) and (13) are element-wise. The final algorithm for the nonconvex minimization problem (3) is summarized in Algorithm 1; an illustrative implementation sketch in Python follows the listing.
In Algorithm 1, "Maxiter" denotes the maximum number of iterations and "rho" denotes the relative error between adjacent iterations. When the iteration stops, the final underlying function is estimated via relation (1).
Algorithm 1: The algorithm to minimize problem (5)
Input: observation y, Gram matrix K, positive parameters μ, λ1, λ2, δ1, δ2, β,
          and γ1 > 1, γ2 > 1.
Initialize: start with any (α^0, e^0, u^0, v^0), rho = 1, k = 0.
While k < Maxiter and rho > 10^{-4}
      (1) Compute α^{k+1} by Equation (10).
      (2) Compute e^{k+1} by Equation (11).
      (3) Compute u^{k+1} by Equation (12).
      (4) Compute v^{k+1} by Equation (13).
      (5) Update the penalty parameter β by β^{k+1} = 1.1 β^k.
      (6) Calculate the relative error rho = ‖g^{k+1} − g^k‖_2 / ‖g^k‖_2,
           where g^{k+1} = Kα^{k+1} + e^{k+1} and g^k = Kα^k + e^k.
      (7) k ← k + 1.
Endwhile
Output: α^*, and e^*
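The following Python/NumPy sketch mirrors Algorithm 1 with the closed-form updates (10)–(13). The function name, default parameter values, and the stopping tolerance are illustrative choices of ours, not values fixed by the paper.

```python
import numpy as np

def proximal_rkhs(y, K, mu=10.0, lam1=1e-2, lam2=1e-1, beta=0.01,
                  gamma1=1.1, gamma2=11.0, delta1=1.0, delta2=1.0,
                  max_iter=500, tol=1e-4):
    """Sketch of Algorithm 1: alternating proximal updates (10)-(13)."""
    n = y.size
    tau1 = gamma1 * np.linalg.norm(K.T @ K, 'fro')   # tau_1 = gamma_1 ||K^T K||_F
    tau2 = gamma2 * np.sqrt(n)                       # tau_2 = gamma_2 ||I_n||_F
    alpha = np.zeros(n); e = np.zeros(n); u = np.zeros(n); v = np.ones(n)
    g_old = K @ alpha + e
    for k in range(max_iter):
        # (10) alpha-update
        alpha = (mu * K.T @ (y - K @ alpha - e - u) + tau1 * alpha) / (lam1 + tau1)
        # (11) e-update
        e = (mu * (y - K @ alpha - e - u) + tau2 * e) / (lam2 + tau2)
        # (12) u-update (element-wise division)
        u = (mu * (y - K @ alpha - e) + delta1 * u) / (mu + beta * v * v + delta1)
        # (13) v-update, projected onto [0, 1]
        v = np.clip((1.0 + delta2 * v) / (beta * u * u + delta2), 0.0, 1.0)
        # step (5): increase the penalty parameter
        beta *= 1.1
        # step (6): relative change of the recovered signal g = K alpha + e
        g_new = K @ alpha + e
        rho = np.linalg.norm(g_new - g_old) / max(np.linalg.norm(g_old), 1e-12)
        g_old = g_new
        if rho < tol:
            break
    return alpha, e, u, v
```

Together with the simulation sketch below Equation (2), a call such as `alpha_hat, e_hat, _, _ = proximal_rkhs(y, K)` yields the estimated underlying function $g = K\hat{\alpha} + \hat{e}$.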

3. Convergence Analysis

For notational simplicity, we use $\|\cdot\|$ uniformly to denote: (1) the Frobenius norm $\|\cdot\|_F$ if the argument is a matrix; (2) the $\ell_2$-norm $\|\cdot\|_2$ if the argument is a vector. Denote $z^k = (\alpha^k, e^k, u^k, v^k)$, whose domain is $\mathbb{R}^n \times \mathbb{R}^n \times \mathbb{R}^n \times \mathbb{R}^n$.
Lemma 1.
Let the bounded sequence $\{z^k\}_{k\in\mathbb{N}}$ be generated by the designed algorithm. Then the sequence $\{\Phi(z^k)\}_{k\in\mathbb{N}}$ decreases sufficiently, in the sense that:
$$ \Phi(z^k) - \Phi(z^{k+1}) \geq \rho_1 \|z^{k+1} - z^k\|^2, \tag{14} $$
where $\rho_1 = \min\left\{\frac{\tau_1 - L_1}{2}, \frac{\tau_2 - L_2}{2}, \frac{\delta_1}{2}, \frac{\delta_2}{2}\right\}$. Note that $\tau_1 - L_1 > 0$ and $\tau_2 - L_2 > 0$, where $L_1$ is the Lipschitz constant of $\nabla_\alpha H$ with respect to $\alpha$, and $L_2$ is that of $\nabla_e H$ with respect to $e$.
Proof. 
Since $H(\alpha, e, u, v)$ is Lipschitz differentiable with respect to $\alpha$ and $e$, respectively, there exist positive constants $L_1$ and $L_2$ such that:
$$ H(\alpha^{k+1}, e^k, u^k, v^k) \leq H(\alpha^k, e^k, u^k, v^k) + \langle \alpha^{k+1} - \alpha^k, \nabla_\alpha H(\alpha^k, e^k, u^k, v^k)\rangle + \frac{L_1}{2}\|\alpha^{k+1} - \alpha^k\|^2, \tag{15} $$
$$ H(\alpha^{k+1}, e^{k+1}, u^k, v^k) \leq H(\alpha^{k+1}, e^k, u^k, v^k) + \langle e^{k+1} - e^k, \nabla_e H(\alpha^{k+1}, e^k, u^k, v^k)\rangle + \frac{L_2}{2}\|e^{k+1} - e^k\|^2. \tag{16} $$
$\alpha$-subproblem: Based on the designed algorithm, $\alpha^{k+1}$ is a minimizer of the $\alpha$-subproblem at the $(k+1)$-th iteration, so we have:
$$ \langle \alpha^{k+1} - \alpha^k, \nabla_\alpha H(\alpha^k, e^k, u^k, v^k)\rangle + f_1(\alpha^{k+1}) + \frac{\tau_1}{2}\|\alpha^{k+1} - \alpha^k\|^2 \leq f_1(\alpha^k). \tag{17} $$
Combining (15) and (17), we have:
$$ H(\alpha^{k+1}, e^k, u^k, v^k) + f_1(\alpha^{k+1}) \leq H(\alpha^k, e^k, u^k, v^k) + f_1(\alpha^k) + \frac{L_1 - \tau_1}{2}\|\alpha^{k+1} - \alpha^k\|^2. \tag{18} $$
$e$-subproblem: Based on the designed algorithm, $e^{k+1}$ is a minimizer of the $e$-subproblem at the $(k+1)$-th iteration, so we have:
$$ \langle e^{k+1} - e^k, \nabla_e H(\alpha^{k+1}, e^k, u^k, v^k)\rangle + f_2(e^{k+1}) + \frac{\tau_2}{2}\|e^{k+1} - e^k\|^2 \leq f_2(e^k). \tag{19} $$
Combining (16) and (19), we have:
$$ H(\alpha^{k+1}, e^{k+1}, u^k, v^k) + f_2(e^{k+1}) \leq H(\alpha^{k+1}, e^k, u^k, v^k) + f_2(e^k) + \frac{L_2 - \tau_2}{2}\|e^{k+1} - e^k\|^2. \tag{20} $$
$u$-subproblem: Based on the designed algorithm, $u^{k+1}$ is a minimizer of the $u$-subproblem at the $(k+1)$-th iteration, so we have:
$$ H(\alpha^{k+1}, e^{k+1}, u^k, v^k) + f_3(u^k) \geq H(\alpha^{k+1}, e^{k+1}, u^{k+1}, v^k) + f_3(u^{k+1}) + \frac{\delta_1}{2}\|u^{k+1} - u^k\|^2. \tag{21} $$
$v$-subproblem: Based on the designed algorithm, $v^{k+1}$ is a minimizer of the $v$-subproblem at the $(k+1)$-th iteration, so we have:
$$ H(\alpha^{k+1}, e^{k+1}, u^{k+1}, v^k) + f_4(v^k) \geq H(\alpha^{k+1}, e^{k+1}, u^{k+1}, v^{k+1}) + f_4(v^{k+1}) + \frac{\delta_2}{2}\|v^{k+1} - v^k\|^2. \tag{22} $$
According to (18) and (20)–(22), we have:
$$
\begin{aligned}
\Phi(z^k) - \Phi(z^{k+1}) ={} & H(\alpha^k, e^k, u^k, v^k) + f_1(\alpha^k) - H(\alpha^{k+1}, e^k, u^k, v^k) - f_1(\alpha^{k+1}) \\
& + H(\alpha^{k+1}, e^k, u^k, v^k) + f_2(e^k) - H(\alpha^{k+1}, e^{k+1}, u^k, v^k) - f_2(e^{k+1}) \\
& + H(\alpha^{k+1}, e^{k+1}, u^k, v^k) + f_3(u^k) - H(\alpha^{k+1}, e^{k+1}, u^{k+1}, v^k) - f_3(u^{k+1}) \\
& + H(\alpha^{k+1}, e^{k+1}, u^{k+1}, v^k) + f_4(v^k) - H(\alpha^{k+1}, e^{k+1}, u^{k+1}, v^{k+1}) - f_4(v^{k+1}) \\
\geq{} & \frac{\tau_1 - L_1}{2}\|\alpha^{k+1} - \alpha^k\|^2 + \frac{\tau_2 - L_2}{2}\|e^{k+1} - e^k\|^2 + \frac{\delta_1}{2}\|u^{k+1} - u^k\|^2 + \frac{\delta_2}{2}\|v^{k+1} - v^k\|^2 \\
\geq{} & \rho_1\|z^{k+1} - z^k\|^2,
\end{aligned}
\tag{23}
$$
where $\rho_1 = \min\left\{\frac{\tau_1 - L_1}{2}, \frac{\tau_2 - L_2}{2}, \frac{\delta_1}{2}, \frac{\delta_2}{2}\right\}$. Thus, the conclusion of Lemma 1 holds. □
Lemma 2.
Let the bounded sequence $\{z^k\}_{k\in\mathbb{N}}$ be generated by the designed algorithm. Then:
$$ \|d^{k+1}\| \leq \rho_2 \|z^{k+1} - z^k\|, \tag{24} $$
where $d^{k+1} \in \partial\Phi(z^{k+1})$ and $\rho_2 = 4\max\{\mu\|K^TK\| + \tau_1,\ \mu\|K\| + \mu + \tau_2,\ \mu\|K\| + \mu + \delta_1,\ 2\beta M^2 + \delta_2\}$, where the constant $M > 0$ is a bound on the sequence $\{z^k\}_{k\in\mathbb{N}}$ and $\partial$ denotes the subdifferential operator.
Proof. 
Obviously, $\alpha^{k+1}$ satisfies the first-order optimality condition of the $\alpha$-subproblem, since it is the solution of that subproblem at the $(k+1)$-th iteration. Similarly for the other variables, we have:
$$
\begin{aligned}
0 &\in \nabla_\alpha H(\alpha^k, e^k, u^k, v^k) + \partial f_1(\alpha^{k+1}) + \tau_1(\alpha^{k+1} - \alpha^k), \\
0 &\in \nabla_e H(\alpha^{k+1}, e^k, u^k, v^k) + \partial f_2(e^{k+1}) + \tau_2(e^{k+1} - e^k), \\
0 &\in \nabla_u H(\alpha^{k+1}, e^{k+1}, u^{k+1}, v^k) + \partial f_3(u^{k+1}) + \delta_1(u^{k+1} - u^k), \\
0 &\in \nabla_v H(\alpha^{k+1}, e^{k+1}, u^{k+1}, v^{k+1}) + \partial f_4(v^{k+1}) + \delta_2(v^{k+1} - v^k).
\end{aligned}
\tag{25}
$$
Since the objective function $\Phi$ is continuous and (sub)differentiable with respect to each variable, we have:
$$
\begin{aligned}
\partial_\alpha \Phi(\alpha^{k+1}, e^{k+1}, u^{k+1}, v^{k+1}) &= \nabla_\alpha H(\alpha^{k+1}, e^{k+1}, u^{k+1}, v^{k+1}) + \partial f_1(\alpha^{k+1}), \\
\partial_e \Phi(\alpha^{k+1}, e^{k+1}, u^{k+1}, v^{k+1}) &= \nabla_e H(\alpha^{k+1}, e^{k+1}, u^{k+1}, v^{k+1}) + \partial f_2(e^{k+1}), \\
\partial_u \Phi(\alpha^{k+1}, e^{k+1}, u^{k+1}, v^{k+1}) &= \nabla_u H(\alpha^{k+1}, e^{k+1}, u^{k+1}, v^{k+1}) + \partial f_3(u^{k+1}), \\
\partial_v \Phi(\alpha^{k+1}, e^{k+1}, u^{k+1}, v^{k+1}) &= \nabla_v H(\alpha^{k+1}, e^{k+1}, u^{k+1}, v^{k+1}) + \partial f_4(v^{k+1}).
\end{aligned}
\tag{26}
$$
Since the sequence $\{z^k\}_{k\in\mathbb{N}}$ is bounded, there exists $M > 0$ such that $\|z^k\| < M$ holds. Then, combining (25) and (26), we have:
$$
\begin{aligned}
\|d^{k+1}\| \leq{} & \|\nabla_\alpha H(\alpha^{k+1}, e^{k+1}, u^{k+1}, v^{k+1}) - \nabla_\alpha H(\alpha^k, e^k, u^k, v^k) - \tau_1(\alpha^{k+1} - \alpha^k)\| \\
& + \|\nabla_e H(\alpha^{k+1}, e^{k+1}, u^{k+1}, v^{k+1}) - \nabla_e H(\alpha^{k+1}, e^k, u^k, v^k) - \tau_2(e^{k+1} - e^k)\| \\
& + \|\nabla_u H(\alpha^{k+1}, e^{k+1}, u^{k+1}, v^{k+1}) - \nabla_u H(\alpha^{k+1}, e^{k+1}, u^{k+1}, v^k) - \delta_1(u^{k+1} - u^k)\| + \|\delta_2(v^{k+1} - v^k)\| \\
={} & \|\mu K^T(K\alpha^{k+1} + e^{k+1} + u^{k+1}) - \mu K^T(K\alpha^k + e^k + u^k) - \tau_1(\alpha^{k+1} - \alpha^k)\| \\
& + \|\mu(K\alpha^{k+1} + e^{k+1} + u^{k+1}) - \mu(K\alpha^{k+1} + e^k + u^k) - \tau_2(e^{k+1} - e^k)\| \\
& + \|\beta\, v^{k+1}\odot v^{k+1}\odot u^{k+1} - \beta\, v^k\odot v^k\odot u^{k+1} - \delta_1(u^{k+1} - u^k)\| + \|\delta_2(v^{k+1} - v^k)\| \\
\leq{} & \mu\|K^TK\|\,\|\alpha^{k+1} - \alpha^k\| + \mu\|K\|\,\|e^{k+1} - e^k\| + \mu\|K\|\,\|u^{k+1} - u^k\| + \tau_1\|\alpha^{k+1} - \alpha^k\| \\
& + \mu\|e^{k+1} - e^k\| + \mu\|u^{k+1} - u^k\| + \tau_2\|e^{k+1} - e^k\| \\
& + \beta\|v^{k+1}\odot v^{k+1}\odot u^{k+1} - v^k\odot v^k\odot u^{k+1}\| + \delta_1\|u^{k+1} - u^k\| + \delta_2\|v^{k+1} - v^k\| \\
\leq{} & \big(\mu\|K^TK\| + \tau_1\big)\|\alpha^{k+1} - \alpha^k\| + \big(\mu\|K\| + \mu + \tau_2\big)\|e^{k+1} - e^k\| \\
& + \big(\mu\|K\| + \mu + \delta_1\big)\|u^{k+1} - u^k\| + \big(2\beta M^2 + \delta_2\big)\|v^{k+1} - v^k\| \\
\leq{} & \rho_2\|z^{k+1} - z^k\|,
\end{aligned}
\tag{27}
$$
where $\rho_2 = 4\max\{\mu\|K^TK\| + \tau_1,\ \mu\|K\| + \mu + \tau_2,\ \mu\|K\| + \mu + \delta_1,\ 2\beta M^2 + \delta_2\}$. Thus, Equation (24) holds. □
Lemma 3.
Let the bounded sequence $\{z^k\}_{k\in\mathbb{N}}$ be generated by the designed algorithm and the initial variable $z^0$ be bounded. Then the sequence $\{\Phi(z^k)\}_{k\in\mathbb{N}}$ is bounded.
Proof. 
Obviously, the continuous function $\Phi$ is proper and coercive, since $\Phi(z) \to \infty$ whenever $\|z\| \to \infty$. Thus, the sequence $\{\Phi(z^k)\}$ is bounded because the sequence $\{z^k\}$ generated by the designed algorithm is bounded. □
Lemma 4.
The function $\Phi(z)$ in Equation (5) is a Kurdyka–Łojasiewicz (KŁ) function (the definition of a KŁ function and some examples can be found in [18]).
Proof. 
According to the definition and the examples of KŁ functions in [18,25], $f_1$, $f_2$, $f_3$, and $H$ are polynomial functions, which are real analytic, and $f_4$ is semi-algebraic. Thus, $\Phi(z)$ is a KŁ function. □
Theorem 1.
Let the sequence $\{z^k\}_{k\in\mathbb{N}}$ generated by the designed algorithm be bounded. Then the sequence $\{z^k\}_{k\in\mathbb{N}}$ converges to a critical point.
Proof. 
Since the sequence $\{z^k\}_{k\in\mathbb{N}}$ is bounded, there exists a subsequence $\{z^{k_q}\}_{q\in\mathbb{N}}$ converging to a point $\tilde{z}$, i.e., $\lim_{q\to\infty} z^{k_q} = \tilde{z}$. Since the function $\Phi(z)$ is continuous, the sequence $\{\Phi(z^{k_q})\}_{q\in\mathbb{N}}$ converges to $\Phi(\tilde{z})$, i.e., $\lim_{q\to\infty}\Phi(z^{k_q}) = \Phi(\tilde{z})$.
According to Lemmas 1–3, the sequence $\{\Phi(z^k)\}_{k\in\mathbb{N}}$ also converges. Thus, the whole sequence $\{\Phi(z^k)\}_{k\in\mathbb{N}}$ and the subsequence $\{\Phi(z^{k_q})\}_{q\in\mathbb{N}}$ converge to the same value:
$$ \lim_{k\to\infty}\Phi(z^k) = \Phi(\tilde{z}). $$
If there is an index $\tilde{k}$ such that $\Phi(z^{\tilde{k}}) = \Phi(\tilde{z})$, then by Lemma 1 the sequence $\{z^k\}_{k\in\mathbb{N}}$ stops at a stationary point and the corresponding function value no longer changes, i.e., $\Phi(z^{\tilde{k}+1}) = \Phi(z^{\tilde{k}})$. Thus, the limit is a critical point of $\{z^k\}$ and the conclusion of Theorem 1 is established.
The next part proves that Theorem 1 still holds when such an index $\tilde{k}$ does not exist.
Based on Lemma 1, for any $k > 0$ we have:
$$ \Phi(\tilde{z}) < \Phi(z^k). \tag{28} $$
Combining (28) with the limit above, for any $\eta > 0$ there exists $k_1 \in \mathbb{N}$ such that the following inequality holds when $k > k_1$:
$$ \Phi(z^k) < \Phi(\tilde{z}) + \eta. \tag{29} $$
Denote the set of limit points of the sequence $\{z^k\}$ by $\vartheta(z^0)$, and let $\mathrm{dist}(z^k, \vartheta(z^0))$ denote the distance between the point $z^k$ and the set $\vartheta(z^0)$, i.e.,
$$ \mathrm{dist}\big(z^k, \vartheta(z^0)\big) = \min\{\|z^k - \hat{z}\| : \hat{z} \in \vartheta(z^0)\}. \tag{30} $$
Based on Lemma 2, $\Phi(z)$ is continuous and we have $0 \in \partial\Phi(\tilde{z})$, since $\|d^{k+1}\| \to 0$ as $k \to \infty$. Moreover, for any $\epsilon > 0$ there exists an index $k_2 \in \mathbb{N}$ such that, when $k > k_2$:
$$ \mathrm{dist}\big(z^k, \vartheta(z^0)\big) < \epsilon. \tag{31} $$
Set $l = \max\{k_1, k_2\}$; then, for each $k > l$, $z^k$ belongs to the set:
$$ \Omega = \{z : \mathrm{dist}\big(z, \vartheta(z^0)\big) < \epsilon\} \cap \{z : \Phi(\tilde{z}) < \Phi(z) < \Phi(\tilde{z}) + \eta\}. $$
By Lemma 4, $\Phi(z)$ is a KŁ function on the domain $\Omega$. Thus, there is a concave function $\varphi$ such that:
$$ \varphi'\big(\Phi(z^k) - \Phi(\tilde{z})\big)\,\mathrm{dist}\big(0, \partial\Phi(z^k)\big) \geq 1. \tag{32} $$
Moreover, based on the concavity of $\varphi$, we have:
$$ \varphi\big(\Phi(z^{k+1}) - \Phi(\tilde{z})\big) \leq \varphi\big(\Phi(z^k) - \Phi(\tilde{z})\big) + \varphi'\big(\Phi(z^k) - \Phi(\tilde{z})\big)\big(\Phi(z^{k+1}) - \Phi(z^k)\big). \tag{33} $$
Denote $\Theta_{p,q} = \varphi\big(\Phi(z^p) - \Phi(\tilde{z})\big) - \varphi\big(\Phi(z^q) - \Phi(\tilde{z})\big)$ for all nonnegative integers $p, q \in \mathbb{N}$. Combining Equation (33) with Lemmas 1 and 2, we have:
$$ \Theta_{k,k+1} \geq \varphi'\big(\Phi(z^k) - \Phi(\tilde{z})\big)\big(\Phi(z^k) - \Phi(z^{k+1})\big) \geq \varphi'\big(\Phi(z^k) - \Phi(\tilde{z})\big)\,\rho_1\|z^k - z^{k+1}\|^2 \geq \frac{\rho_1\|z^k - z^{k+1}\|^2}{\rho_2\|z^k - z^{k-1}\|}, \tag{34} $$
which can be rewritten as follows:
$$ \|z^k - z^{k+1}\|^2 \leq \frac{\rho_2}{\rho_1}\,\Theta_{k,k+1}\,\|z^k - z^{k-1}\|. \tag{35} $$
Using the well-known inequality $2\sqrt{ab} \leq a + b$ for any $a, b > 0$, we have:
$$ 2\|z^k - z^{k+1}\| \leq 2\sqrt{\frac{\rho_2}{\rho_1}\,\Theta_{k,k+1}\,\|z^k - z^{k-1}\|} \leq \|z^k - z^{k-1}\| + \frac{\rho_2}{\rho_1}\,\Theta_{k,k+1}. \tag{36} $$
Summing (36) over $i = l+1, \ldots, k$ yields the following inequality:
$$ 2\sum_{i=l+1}^{k}\|z^i - z^{i+1}\| \leq \sum_{i=l+1}^{k}\|z^i - z^{i-1}\| + \frac{\rho_2}{\rho_1}\sum_{i=l+1}^{k}\Theta_{i,i+1} \leq \sum_{i=l+1}^{k}\|z^i - z^{i+1}\| + \|z^{l+1} - z^l\| + \frac{\rho_2}{\rho_1}\,\Theta_{l+1,k+1}, \tag{37} $$
that is:
$$ \sum_{i=l+1}^{k}\|z^i - z^{i+1}\| \leq \|z^{l+1} - z^l\| + \frac{\rho_2}{\rho_1}\,\Theta_{l+1,k+1}. \tag{38} $$
According to the definition of $\Theta$, we have $\lim_{k\to\infty}\Theta_{l+1,k+1} = \varphi\big(\Phi(z^{l+1}) - \Phi(\tilde{z})\big)$. Thus,
$$ \lim_{k\to\infty}\sum_{i=1}^{k}\|z^i - z^{i+1}\| = \sum_{i=1}^{l}\|z^i - z^{i+1}\| + \lim_{k\to\infty}\sum_{i=l+1}^{k}\|z^i - z^{i+1}\| \leq \sum_{i=1}^{l}\|z^i - z^{i+1}\| + \|z^{l+1} - z^l\| + \frac{\rho_2}{\rho_1}\,\varphi\big(\Phi(z^{l+1}) - \Phi(\tilde{z})\big) < +\infty. \tag{39} $$
Thus, $\{z^k\}$ has finite length and is therefore a Cauchy sequence. By the completeness of the underlying space, the Cauchy sequence $\{z^k\}$ converges. Hence, the sequence $\{z^k\}$ generated by the designed algorithm converges to a critical point $z^* = (\alpha^*, e^*, u^*, v^*)$. Moreover, the convergence of the sequence generated by the designed algorithm is guaranteed for any initial value. □

4. Numerical Results

In this section, we conduct some simple simulation examples to show the effectiveness of the proposed method. We choose $f$ as the ground-truth function (with discrete form $\mathbf{f}$) and add Gaussian noise and sparse outliers to generate the observation. The proposed method is compared with the kernel-based regression using orthogonal matching pursuit (KROMP) method [8], whose parameters are selected according to the ranges given in that paper. The parameters of the proposed method are set empirically to $\mu = 10$, $\lambda_1 \in [10^{-3}, 10^{-2}]$, $\lambda_2 \in [10^{-3}, 10]$, $\beta = 0.01$, $\gamma_1 = 1.1$, and $\gamma_2 = 10\gamma_1$; better visual and numerical results can be obtained by tuning the parameters more carefully.
The relative error (ReErr) is used for quantitative evaluation; it is a commonly used index to measure the quality of restoration and is defined as:
$$ \mathrm{ReErr} = \frac{\|\mathbf{f} - \mathbf{g}\|_2}{\|\mathbf{f}\|_2}, \tag{40} $$
where $\mathbf{g}$ is the restoration result estimated by each method. Experiments are implemented in MATLAB (R2016a) on a desktop with 16 GB RAM and an Intel(R) Core(TM) i5-4590 CPU @ 3.30 GHz.
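For reference, the ReErr in (40) is a one-line computation; the helper below is our own Python sketch (the experiments in the paper were run in MATLAB).

```python
import numpy as np

def relative_error(f_true, g_est):
    """ReErr = ||f - g||_2 / ||f||_2, computed on the flattened arrays."""
    f_true, g_est = np.ravel(f_true), np.ravel(g_est)
    return np.linalg.norm(f_true - g_est) / np.linalg.norm(f_true)
```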
Example 1.
The bivariate continuous function $f$ is given as follows:
$$ f(x, y) = 50\,\mathrm{sinc}\big(\pi\sqrt{x^2 + y^2}\big), \tag{41} $$
where $x \in [-1, 1]$, $y \in [-1, 1]$ (21 discrete points are taken in each dimension) and $\mathrm{sinc}(x) = \frac{\sin(\pi x)}{\pi x}$. After discretization of $f$, 20 dB Gaussian noise and 10% outliers were added to obtain the final degraded data.
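The degraded data of Example 1 can be generated along the following lines. The SNR-based noise scaling, the outlier magnitudes, and the exact reading of the sinc scaling in (41) are our assumptions about details the text leaves open.

```python
import numpy as np

rng = np.random.default_rng(1)

# 21 x 21 grid on [-1, 1]^2 and the ground-truth surface of Example 1.
t = np.linspace(-1.0, 1.0, 21)
X, Y = np.meshgrid(t, t)
R = np.sqrt(X ** 2 + Y ** 2)
F = 50.0 * np.sinc(R)   # np.sinc(r) = sin(pi r)/(pi r); reading (41) as 50*sin(pi r)/(pi r)

# 20 dB Gaussian noise: noise power = signal power / 10^(SNR/10).
signal_power = np.mean(F ** 2)
noise = np.sqrt(signal_power / 10 ** (20 / 10)) * rng.standard_normal(F.shape)

# 10% sparse outliers with magnitudes on the order of the signal range (assumed).
outliers = np.zeros(F.size)
idx = rng.choice(F.size, size=F.size // 10, replace=False)
outliers[idx] = rng.uniform(-50.0, 50.0, size=idx.size)

F_degraded = F + noise + outliers.reshape(F.shape)
```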
To present the experimental results of the different examples, we show the ground-truth data, the degraded data polluted by noise and outliers, and the restored outcomes obtained by the proposed method and the KROMP method, respectively.
Example 2.
The bivariate continuous function $f$ is given as follows:
$$ f(x, y) = \frac{\sin\big(\sqrt{x^2 + y^2}\big)}{\sqrt{x^2 + y^2}}, \tag{42} $$
where $x \in [-8, 8]$, $y \in [-8, 8]$ (31 discrete points are taken in each dimension). After discretization of $f$, 10 dB Gaussian noise and 10% outliers were added to obtain the final degraded data.
From Figure 1 and Figure 2, although the shape of the ground-truth in Example 2 is similar to that of Example 1 (see Figure 1a and Figure 2a), the degree of degradation in Example 2 is much greater than in Example 1, since the function values and the level of noise pollution differ (see Figure 1b and Figure 2b). The proposed method produces clearly restored outcomes and effectively recovers the original data (see Figure 1c and Figure 2c), whereas the restored outcomes of the KROMP method still contain obvious noise residuals (Figure 1d and Figure 2d). This also demonstrates the effectiveness of the proposed method.
Example 3.
The bivariate continuous function $f$ is given as follows:
$$ f(x, y) = x\,\exp\big(-x^2 - y^2\big), \tag{43} $$
where $x \in [-2, 2]$, $y \in [-2, 2]$ (21 discrete points are taken in each dimension). After discretization of $f$, 10 dB Gaussian noise and 5% outliers were added to obtain the final degraded data.
It can be seen that, even in the case of extremely heavy pollution, as in Example 3 (Figure 3b), the proposed method obtains more accurate recovered data (Figure 3c) than the KROMP method (Figure 3d). In addition, the relative error (ReErr) results for Example 1 to Example 3 are reported in Table 1, with the better results shown in bold. The proposed method clearly achieves smaller relative errors than the KROMP method, which verifies its effectiveness.

5. Conclusions

In this paper, we proposed a new nonconvex model to deal with a challenging symmetry ill-posed problem. An efficient algorithm for solving the given nonconvex problem was designed by means of MPEC and proximal-based regularization. Theoretically, the bounded sequence generated by the designed algorithm is guaranteed to converge to a local optimal solution of the nonconvex problem. Furthermore, the convergence of the designed algorithm is guaranteed for any initial value. The numerical experiments show that the proposed method achieves better restoration results: it obtains smaller relative errors than the benchmark KROMP approach, and it can interpolate more mesh points and significantly reduce noise.

Author Contributions

Conceptualization, software and writing—original draft preparation, H.-X.D. and L.-J.D.; methodology and validation, formal analysis and writing—review and editing, H.-X.D. All authors have read and agreed to the published version of the manuscript.

Funding

This research is supported by Scientific Research Startup Fund of Xihua University (RZ2000002862).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data in this study were generated by the authors.

Acknowledgments

The authors thank the reviewers for their comments, which have improved the content of this manuscript.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Tanaka, K. Generation of point sets by convex optimization for interpolation in reproducing kernel Hilbert spaces. Numer. Algorithms 2020, 84, 1049–1079.
2. Mo, Y.; Qian, T.; Mai, W.X.; Chen, Q.H. The AFD methods to compute Hilbert transform. Appl. Math. Lett. 2015, 45, 18–24.
3. Karvonen, T.; Särkkä, S.; Tanaka, K. Kernel-based interpolation at approximate Fekete points. Numer. Algorithms 2020, 84, 1049–1079.
4. Silalahi, D.D.; Midi, H.; Arasan, J.; Mustafa, M.S.; Caliman, J.P. Kernel partial least square regression with high resistance to multiple outliers and bad leverage points on near-infrared spectral data analysis. Symmetry 2021, 13, 547.
5. Deng, L.J.; Guo, W.H.; Huang, T.Z. Single-image super-resolution via an iterative reproducing kernel Hilbert space method. IEEE Trans. Circuits Syst. Video Technol. 2015, 26, 2001–2014.
6. Li, X.Y.; Wu, B.Y. A new reproducing kernel collocation method for nonlocal fractional boundary value problems with non-smooth solutions. Appl. Math. Lett. 2018, 86, 194–199.
7. Wu, Q.; Li, Y.; Xue, W. A kernel recursive maximum Versoria-like criterion algorithm for nonlinear channel equalization. Symmetry 2019, 11, 1067.
8. Papageorgiou, G.; Bouboulis, P.; Theodoridis, S. Robust kernel-based regression using orthogonal matching pursuit. In Proceedings of the IEEE International Workshop on Machine Learning for Signal Processing, Southampton, UK, 22–25 September 2013; pp. 1–6.
9. Chen, S.S.; Donoho, D.L.; Saunders, M.A. Atomic decomposition by basis pursuit. SIAM Rev. 2001, 43, 129–159.
10. Donoho, D.L. For most large underdetermined systems of linear equations the minimal ℓ1-norm solution is also the sparsest solution. Commun. Pure Appl. Math. 2006, 59, 797–829.
11. Dong, B.; Zhang, Y. An efficient algorithm for ℓ0 minimization in wavelet frame based image restoration. J. Sci. Comput. 2013, 54, 350–368.
12. Zuo, W.M.; Meng, D.Y.; Zhang, L.; Feng, X.C.; Zhang, D. A generalized iterated shrinkage algorithm for non-convex sparse coding. In Proceedings of the IEEE International Conference on Computer Vision, Sydney, Australia, 1–8 December 2013; pp. 217–224.
13. Ye, J.J.; Zhu, D. New necessary optimality conditions for bilevel programs by combining the MPEC and value function approaches. SIAM J. Optim. 2010, 20, 1885–1905.
14. Yuan, G.Z.; Ghanem, B. ℓ0TV: A sparse optimization method for impulse noise image restoration. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 41, 352–364.
15. Yuan, G.Z.; Ghanem, B. An exact penalty method for binary optimization based on MPEC formulation. In Proceedings of the AAAI Conference on Artificial Intelligence, San Francisco, CA, USA, 4–9 February 2017.
16. Attouch, H.; Bolte, J.; Redont, P.; Soubeyran, A. Proximal alternating minimization and projection methods for nonconvex problems: An approach based on the Kurdyka–Łojasiewicz inequality. Math. Oper. Res. 2010, 35, 435–457.
17. Wang, Y.J.; Dong, M.M.; Xu, Y. A sparse rank-1 approximation algorithm for high-order tensors. Appl. Math. Lett. 2020, 102, 106140.
18. Bolte, J.; Sabach, S.; Teboulle, M. Proximal alternating linearized minimization for nonconvex and nonsmooth problems. Math. Program. 2014, 146, 459–494.
19. Sun, T.; Barrio, R.; Jiang, H.; Cheng, L.Z. Convergence rates of accelerated proximal gradient algorithms under independent noise. Numer. Algorithms 2019, 81, 631–654.
20. Hu, W.; Zheng, W.; Yu, G. A unified proximity algorithm with adaptive penalty for nuclear norm minimization. Symmetry 2019, 11, 1277.
21. An, Y.; Zhang, Y.; Guo, H.; Wang, J. Compressive sensing based three-dimensional imaging method with electro-optic modulation for nonscanning laser radar. Symmetry 2020, 12, 748.
22. Ma, F. Convergence study on the proximal alternating direction method with larger step size. Numer. Algorithms 2020, 85, 399–425.
23. Pham, Q.M.; Lachmund, D.; Hào, D.N. Convergence of proximal algorithms with stepsize controls for non-linear inverse problems and application to sparse non-negative matrix factorization. Numer. Algorithms 2020, 85, 1255–1279.
24. Tiddeman, B.; Ghahremani, M. Principal component wavelet networks for solving linear inverse problems. Symmetry 2021, 13, 1083.
25. Xu, Y.Y.; Yin, W.T. A block coordinate descent method for regularized multiconvex optimization with applications to nonnegative tensor factorization and completion. SIAM J. Imaging Sci. 2013, 6, 1758–1789.
Figure 1. In Example 1, the ground-truth f was degraded by 20 dB Gaussian noise and 10% outliers, and the visual results of restoration by each method were obtained. (a) Ground-truth; (b) degraded; (c) the restored outcome by the proposed method; (d) the restored outcome by the KROMP method.
Figure 2. In Example 2, the real data f was degraded by 10 dB Gaussian noise and 10% outliers, and the visual results of restoration by each method were obtained. (a) Ground-truth; (b) degraded; (c) the restored outcome by the proposed method; (d) the restored outcome by the KROMP method.
Figure 3. In Example 3, the real data f was degraded by 10 dB Gaussian noise and 5% outliers, and the visual results of restoration by each method were obtained. (a) Ground-truth; (b) degraded; (c) the restored outcome by the proposed method; (d) the restored outcome by the KROMP method.
Table 1. The ReErr results from Example 1 to Example 3 (Bold: the best).

              KROMP     Proposed
Example 1     0.128     0.116
Example 2     0.257     0.159
Example 3     0.178     0.117
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
