1. Introduction
The bilevel optimization problem on Euclidean spaces has been shown to be NP-hard, and even verifying the local optimality of a feasible solution is in general NP-hard. Bilevel optimization problems are often nonconvex, which makes the computation of an optimal solution a challenging task. Thus, it is natural to consider bilevel optimization problems on Riemannian manifolds. Indeed, studying optimization problems on Riemannian manifolds has many advantages: some constrained optimization problems on Euclidean spaces can be seen as unconstrained ones from the Riemannian geometry viewpoint, and some nonconvex optimization problems in the Euclidean setting may become convex after introducing an appropriate Riemannian metric; see, for instance, [1,2]. The aim of this paper is to study the bilevel optimization problem on Riemannian manifolds.
In order to study the bilevel optimization problem on Riemannian manifolds, it is helpful to recall how bilevel optimization problems are solved in Euclidean spaces. One approach is to replace the lower-level problem by its KKT optimality conditions, which are necessary and sufficient under suitable assumptions. In a recent article [3], the authors presented the KKT reformulation of bilevel optimization problems on Riemannian manifolds. Moreover, it has been shown that global optimal solutions of the KKT reformulation correspond to global optimal solutions of the bilevel problem on Riemannian manifolds, provided the convex lower-level problem satisfies Slater's constraint qualification. On this basis, we consider a semivectorial bilevel optimization problem on Riemannian manifolds, i.e., one with a multiobjective lower-level problem. The Inexact Restoration (IR) algorithm [4,5] was introduced to solve constrained optimization problems; hence, once the semivectorial bilevel optimization problem is transformed into a single-level problem, it too can be solved by the IR algorithm as a constrained optimization problem.
For the convenience of the reader, let us first review the IR algorithm on Euclidean spaces. Each iteration of the IR algorithm consists of two phases: restoration and minimization. Consider the following nonlinear programming problem:
where $f:{\mathbb{R}}^{n}\to \mathbb{R}$ and $C:{\mathbb{R}}^{n}\to {\mathbb{R}}^{m}$ are continuously differentiable functions and the set $\Omega \subset {\mathbb{R}}^{n}$ is closed and convex. The algorithm generates iterates that are feasible with respect to $\Omega$, i.e., ${x}^{k}\in \Omega$ for all $k=0,1,2,\dots$.
In the restoration step, which is executed once per iteration, an intermediate point ${y}^{k}\in \Omega $ is found such that the infeasibility at ${y}^{k}$ is a fraction of the infeasibility at ${x}^{k}$. Immediately after restoration, we construct an approximation ${\pi}_{k}$ of the feasible region using the information available at ${y}^{k}$. In the minimization step, we compute a trial point ${z}^{k,i}\in {\pi}_{k}$ such that $f\left({z}^{k,i}\right)\ll f\left({y}^{k}\right)$ (the symbol ≪ means "sufficiently smaller than") and $\parallel {z}^{k,i}-{y}^{k}\parallel \le {\delta}_{k,i}$, where ${\delta}_{k,i}$ is a trust-region radius. The trial point ${z}^{k,i}$ is accepted as the new iterate if the value of a nonsmooth (exact penalty) merit function at ${z}^{k,i}$ is sufficiently smaller than its value at ${x}^{k}$. If ${z}^{k,i}$ is not acceptable, the trust-region radius is reduced.
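Schematically, one outer IR iteration can be sketched as follows. Here `restore`, `minimize_in_pi_k`, and `merit` are hypothetical problem-specific callables, and the acceptance test is a simplified stand-in for the merit-function criterion described above, not the exact rule of the algorithm.

```python
def ir_iteration(x_k, restore, minimize_in_pi_k, merit, delta0=1.0, max_tries=50):
    """One outer iteration of an Inexact Restoration scheme (sketch)."""
    # Restoration phase: y_k reduces the infeasibility of x_k.
    y_k = restore(x_k)
    delta = delta0
    for _ in range(max_tries):
        # Minimization phase: trial point from a linearized feasible set
        # built at y_k, within a trust-region radius delta.
        z = minimize_in_pi_k(y_k, delta)
        # Accept on sufficient decrease of the merit function (a simple
        # stand-in for the Ared >= 0.1 * Pred test used later).
        if merit(z) <= merit(x_k) - 0.1 * max(merit(x_k) - merit(y_k), 0.0):
            return z
        delta *= 0.5  # shrink the trust region and retry
    return y_k
```

For instance, for the toy problem of minimizing $f(x)=x^2$ subject to $x=1$, one can take `restore` to project onto the feasible set and `merit` to be $f(x)+|x-1|$.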
The IR algorithm is related to classical feasible methods for nonlinear programming, such as the generalized reduced gradient (GRG) method and the family of sequential gradient restoration algorithms. Several studies have examined the numerical behavior of the IR algorithm. For example, the method was applied to generally constrained problems in [6] with good results. In addition, an IR algorithm using a regularization strategy was proposed in [7], which effectively solved derivative-free optimization problems. IR algorithms are especially useful when there is some natural way to restore feasibility. One of the most successful applications of the IR algorithm is electronic structure calculation, as shown in [8]. Moreover, the IR algorithm has also been successfully applied to optimization problems with box constraints in [9] and to problems with multiobjective constraints under weighted-sum scalarization in [10]. For more applications, see [11,12].
Since the IR algorithm is important in applications, many researchers have tried to improve it from different angles. The restoration phase improves feasibility, while the minimization step improves optimality over a linear tangent approximation of the constraints. When a sufficient descent criterion does not hold, the trial point is modified in such a way that, eventually, acceptance occurs at a point that may be close to the output of the restoration (first) phase. The acceptance criterion may use merit functions [4,5] or filters [13]. The minimization step consists of an inexact (approximate) minimization of f subject to linear constraints; the restoration step likewise amounts to an inexact minimization of infeasibility subject to linear constraints. Therefore, the available algorithms for (large-scale) linearly constrained minimization can be fully exploited; see [14,15,16]. Furthermore, IR techniques for constrained optimization were improved, extended, and analyzed in [7,17,18,19], among others.
Inspired and motivated by the research works [4,10,20,21,22,23,24,25], we introduce a kind of bilevel programming on Riemannian manifolds with a multiobjective problem in the lower level, the so-called semivectorial bilevel programming. We transform the semivectorial bilevel program into a single-level program by using the KKT optimality conditions of the lower-level problem, which is convex and satisfies Slater's constraint qualification. Finally, we divide the single-level program into two stages, restoration and minimization, and give an IR algorithm for semivectorial bilevel programming. Under certain conditions, we analyze the well-definedness and convergence of the presented algorithm.
The remainder of this paper is organized as follows. In Section 2, some basic concepts, notation, and important results of Riemannian geometry are presented. In Section 3, we propose the semivectorial bilevel program on a Riemannian manifold, give its KKT reformulation, and then present an algorithm using the IR technique for solving semivectorial bilevel programs on Riemannian manifolds. In Section 4, its convergence properties are studied. Conclusions are given in Section 5.
2. Preliminaries
An m-dimensional Riemannian manifold is a pair $(M,g)$, where M stands for an m-dimensional smooth manifold and g stands for a smooth, symmetric, positive definite $(0,2)$-tensor field on M, called a Riemannian metric on M. If $(M,g)$ is a Riemannian manifold, then for any point $x\in M$, the restriction ${g}_{x}:{T}_{x}M\times {T}_{x}M\to \mathbb{R}$ is an inner product on the tangent space ${T}_{x}M$. The tangent bundle $TM$ over M is $TM:={\bigcup}_{x\in M}{T}_{x}M$, and a vector field on M is a section of the tangent bundle, i.e., a mapping $X:M\to TM$ such that, for any $x\in M$, $X\left(x\right)\equiv {X}_{x}\in {T}_{x}M$.
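As a small numerical illustration (not taken from the paper), the inner product induced by a Riemannian metric at a point x is computed from the positive definite matrix $G(x)=\left({g}_{ij}\left(x\right)\right)$ as ${u}^{T}G(x)v$. The diagonal metric below is only an illustrative choice (it is the one used in Example 1 later in the paper).

```python
import numpy as np

# A Riemannian metric assigns to each point x a positive definite matrix
# G(x); the inner product of tangent vectors u, v at x is u^T G(x) v.

def inner(u, v, G):
    # <u, v>_x = u^T G(x) v
    return float(u @ G @ v)

# Illustrative metric G = diag(1/x_i^2) on the positive orthant.
x = np.array([2.0, 1.0])
G = np.diag(1.0 / x ** 2)
u, v = np.array([1.0, 0.0]), np.array([1.0, 2.0])
print(inner(u, v, G))  # -> 0.25
```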
We denote by ${\langle \cdot ,\cdot \rangle}_{x}$ the scalar product on ${T}_{x}M$, with associated norm ${\parallel \cdot \parallel}_{x}$. The length of a tangent vector $v\in {T}_{x}M$ is ${\parallel v\parallel}_{x}={\langle v,v\rangle}_{x}^{1/2}$. Given a piecewise smooth curve $\gamma :[a,b]\subset \mathbb{R}\to M$ joining x to y, i.e., $\gamma \left(a\right)=x$ and $\gamma \left(b\right)=y$, its length is defined by $L\left(\gamma \right)={\int}_{a}^{b}{\parallel \dot{\gamma}\left(t\right)\parallel}_{\gamma \left(t\right)}\,\mathrm{d}t$, where $\dot{\gamma}$ denotes the first derivative of $\gamma$ with respect to t. Let x and y be two points in the Riemannian manifold $(M,g)$ and ${\Gamma}_{x,y}$ the set of all piecewise smooth curves joining x and y. The function $\mathrm{d}(x,y):={\inf}_{\gamma \in {\Gamma}_{x,y}}L\left(\gamma \right)$ is a distance on M, and the induced metric topology on M coincides with the topology of M as a manifold.
Let ∇ be the Levi–Civita connection associated with the Riemannian metric and $\gamma$ a smooth curve in M. A vector field X is said to be parallel along $\gamma :[0,1]\to M$ if ${\nabla}_{\dot{\gamma}}X=0$. If $\dot{\gamma}$ itself is parallel along $\gamma$ joining x to y, then we say that $\gamma$ is a geodesic, and in this case, $\parallel \dot{\gamma}\parallel$ is constant. When $\parallel \dot{\gamma}\parallel =1$, $\gamma$ is said to be normalized. A geodesic joining x to y in M is said to be minimal if its length equals $\mathrm{d}(x,y)$.
For $p\in M$, let ${V}_{p}:=\{v\in {T}_{p}M:{\gamma}_{v}\ \text{is defined on}\ [0,1]\}$, where ${\gamma}_{v}$ is the geodesic with ${\gamma}_{v}\left(0\right)=p$ and ${\dot{\gamma}}_{v}\left(0\right)=v$. The exponential mapping ${\mathrm{exp}}_{p}:{V}_{p}\to M$ is defined by ${\mathrm{exp}}_{p}\left(v\right)={\gamma}_{v}\left(1\right)$ for all $v\in {V}_{p}$. By the Hopf–Rinow theorem, if M is complete, then any pair of points in M can be joined by a minimal geodesic; moreover, $(M,\mathrm{d})$ is a complete metric space in which closed and bounded subsets are compact, and ${\mathrm{exp}}_{p}$ is well defined on the whole tangent space ${T}_{p}M$ for every $p\in M$. Clearly, a curve $\gamma :[0,1]\to M$ is a minimal geodesic joining p to q if and only if there exists a vector $v\in {T}_{p}M$ such that $\parallel v\parallel =\mathrm{d}(p,q)$ and $\gamma \left(t\right)={\mathrm{exp}}_{p}\left(tv\right)$ for each $t\in [0,1]$.
The gradient of a differentiable function $f:M\to \mathbb{R}$ with respect to the Riemannian metric g is the vector field $\mathrm{grad}f$ defined by $g(\mathrm{grad}f,X)=df\left(X\right)$, $\forall X\in TM$, where $df$ denotes the differential of the function f.
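In local coordinates, the defining identity $g(\mathrm{grad}f,X)=df\left(X\right)$ means that the coordinate vector of $\mathrm{grad}f$ solves $G(x)\,\mathrm{grad}f=\nabla f(x)$, where $G(x)=\left({g}_{ij}\right)$ is the metric matrix and $\nabla f$ the ordinary (Euclidean) gradient of the coordinate expression of f. A minimal sketch, using for concreteness the diagonal metric that appears in Example 1 below:

```python
import numpy as np

def riemannian_gradient(euclidean_grad, metric_matrix):
    """Coordinates of grad f: solve G(x) * (grad f) = df, i.e.
    grad f = G(x)^{-1} * (Euclidean gradient of f at x)."""
    return np.linalg.solve(metric_matrix, euclidean_grad)

# Example: the metric g = diag(1/x1^2, 1/x2^2) used later in the paper.
# For f(x) = x1 we have df = (1, 0), so grad f = (x1^2, 0).
x = np.array([2.0, 3.0])
G = np.diag(1.0 / x ** 2)
print(riemannian_gradient(np.array([1.0, 0.0]), G))  # -> [4. 0.]
```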
In a normal coordinate system around p, the geodesics through p are represented by lines passing through the origin. Moreover, the matrix $\left({g}_{ij}\right)$ associated with the bilinear form g at the point p in this orthonormal basis reduces to the identity matrix, and the Christoffel symbols vanish. Thus, for any smooth function $f:M\to \mathbb{R}$, in normal coordinates around p, we obtain
Now, consider a smooth function $f:M\to \mathbb{R}$ and the real-valued function ${T}_{p}M\ni v\mapsto {f}_{p}\left(v\right):=f\left({\mathrm{exp}}_{p}v\right)$ defined in a neighborhood of 0 in ${T}_{p}M$.
The Taylor–Young formula (for Euclidean spaces) applied to ${f}_{p}$ around the origin can be written in matrix form as
where
In other words, we have the following Taylor–Young expansion for f around p:
which holds in any coordinate system.
A set $A\subset M$ is said to be convex if it contains every geodesic segment whose end points lie in A; that is, $\gamma \left((1-t)a+tb\right)$ is in A whenever $x=\gamma \left(a\right)$ and $y=\gamma \left(b\right)$ are in A and $t\in [0,1]$. A function $f:M\to \mathbb{R}$ is said to be convex if its restriction to any geodesic curve $\gamma :[a,b]\to M$ is convex in the classical sense, i.e., the real function $f\circ \gamma :[a,b]\to \mathbb{R}$ is convex. Let ${P}_{A}$ denote the projection onto $A\subset M$; that is, for each $x\in M$,
For more details and complete information on the fundamentals of Riemannian geometry, see [1,26,27,28].
3. Inexact Restoration Algorithm
We study an optimistic bilevel program on an m-dimensional Riemannian manifold $(M,g)$ whose lower-level problem is a multiobjective problem, the so-called semivectorial bilevel program. The problem is formulated below:
where $F:M\to \mathbb{R}$ and $\mathrm{Sol}\left(\mathrm{MOP}\right)$ is the efficient solution set of the following multiobjective problem (MOP):
where $f=\{{f}_{1}\left(x\right),\dots ,{f}_{p}\left(x\right)\}:M\to {\mathbb{R}}^{p}$, $I:=\{1,\dots ,p\}$, $h:M\to {\mathbb{R}}^{n}$, and $D=\{x\in M:h(x)=0\}$ denotes the feasible set of the MOP.
Definition 1. Let $f:M\to {\mathbb{R}}^{p}$ be a vectorial function on a Riemannian manifold M. Then, f is said to be convex on M if, for every $x,y\in M$ and every geodesic segment $\gamma :[0,1]\to M$ joining x to y, i.e., $\gamma \left(0\right)=x$ and $\gamma \left(1\right)=y$, it holds that
The above definition is a natural extension of the notion of convexity in Euclidean space to the Riemannian context; see [29].
Definition 2. A point $x\in M$ is said to be a Pareto critical point of f on the Riemannian manifold M if, for any $v\in {T}_{x}M$, there are an index $i\in I$ and $u\in \mathrm{grad}{f}_{i}\left(x\right)$ such that
Definition 3. (a) A point ${x}^{*}\in M$ is a Pareto-optimal point of f on the Riemannian manifold M if there is no $x\in M$ with $f\left(x\right)\preceq f\left({x}^{*}\right)$. (b) A point ${x}^{*}\in M$ is a weak Pareto-optimal point of f on the Riemannian manifold M if there is no $x\in M$ with $f\left(x\right)\prec f\left({x}^{*}\right)$.
We know that criticality is a necessary but not sufficient condition for optimality. Under convexity of the vectorial function f, the following proposition shows that criticality is equivalent to weak optimality.
Proposition 1 ([29]). Let $f:M\to {\mathbb{R}}^{p}$ be a convex function given by $f=\{{f}_{1}\left(x\right),\dots ,{f}_{p}\left(x\right)\}$. A point $x\in M$ is a Pareto critical point of f if and only if it is a weak Pareto-optimal point of f.
We assume that the functions $f=\{{f}_{1}\left(x\right),\dots ,{f}_{p}\left(x\right)\}:M\to {\mathbb{R}}^{p}$ and $h:M\to {\mathbb{R}}^{n}$ are twice continuously differentiable and consider the weighted-sum scalarization of the MOP, as follows.
Let ${\omega}_{i}\ge 0$, $i=1,\dots ,p$, be such that ${\sum}_{i=1}^{p}{\omega}_{i}=1$:
Note that, if ${\omega}_{i}\ge 0$, $i=1,\dots ,p$, with ${\sum}_{i=1}^{p}{\omega}_{i}=1$, then the weak Pareto-optimal solution set of Problem (4) coincides with the union, over all such weight vectors, of the optimal solution sets of Problem (5). Meanwhile, if each ${f}_{i}:M\to \mathbb{R}$, $i=1,\dots ,p$, is convex on the Riemannian manifold, then the function ${\sum}_{i=1}^{p}{\omega}_{i}{f}_{i}\left(x\right)$ is also convex. Thus, the bilevel program (3)–(4) can be transformed into the following problem:
A strategy to solve the bilevel problem (6) on Riemannian manifolds is to replace the lower-level problem with its KKT conditions. When the lower-level problem is convex and satisfies Slater's constraint qualification, the global optimal solutions of the KKT reformulation correspond to the global optimal solutions of the bilevel problem on Riemannian manifolds; see Theorems 4.1 and 4.2 in [3].
In the following, we give the KKT reformulation of the semivectorial bilevel programming on Riemannian manifolds.
where
is a convex and compact set, $\mu \in {\mathbb{R}}^{n}$, and M is a complete m-dimensional Riemannian manifold.
We will adopt an IR method to solve the optimization problem in two stages, first pursuing feasibility and then optimality, while keeping a certain control over the feasibility already achieved. Consequently, the approach exploits the inherent minimization structure of the problem, especially in the feasibility phase, so that better solutions can be obtained. Moreover, in the feasibility phase of the IR strategy, the user is free to choose any method, as long as the restored iterate satisfies some mild assumptions [4,5].
For simplicity, we introduce the following notation:
and
We write briefly $s=(x,\omega ,\mu )\in M\times W\times {\mathbb{R}}^{n}$ and give the Jacobian of C as follows:
Thus, the semivectorial bilevel program can be reduced to:
Before giving a rigorous description of the algorithm, let us start with an overview of each step.
Restoration step: We apply any globally convergent optimization algorithm to solve the lower-level minimization problem parameterized by ${\omega}^{k}$. Once an approximate minimizer $\overline{x}$ and the corresponding estimated Lagrange multiplier vector $\overline{\mu}$ are obtained, we set ${z}^{k}=(\overline{x},{\omega}^{k},\overline{\mu})$ and compute the current set ${\pi}_{k}$ and the direction ${d}_{\mathrm{tan}}^{k}$.
Approximate linearized feasible region: The set ${\pi}_{k}$ is a linear approximation of the region described by KKT$\left(\overline{x}\right)$ containing ${z}^{k}=(\overline{x},{\omega}^{k},\overline{\mu})$. This auxiliary region is given by
Descent direction: Using the projection on Riemannian manifolds, the projection onto ${\pi}_{k}$ is represented as follows:
where $\eta >0$ is an arbitrary scaling parameter independent of k. It turns out that
which is a feasible descent direction on ${\pi}_{k}$.
Minimization step: The objective of the minimization step is to obtain ${v}^{k,i}\in {\pi}_{k}$ such that $L({v}^{k,i},{\lambda}^{k})<L({z}^{k},{\lambda}^{k})$ and ${v}^{k,i}\in {B}_{k,i}=\{v:\mathrm{d}(v,{z}^{k})\le {\delta}_{k,i}\}$, where ${\delta}_{k,i}$ is a trust-region radius. The first trial point at each iteration is obtained using a trust-region radius ${\delta}_{k,0}$. Successively smaller trust-region radii are tried until a point ${v}^{k,i}$ is found at which the merit function is sufficiently smaller than at ${s}^{k}$.
Merit function and penalty parameter: We use a variant of the sharp Lagrangian merit function, given by
where $\theta \in (0,1]$ is a penalty parameter that assigns different weights to the objective function and to feasibility. The choice of $\theta$ at each iteration depends on practical and theoretical considerations. Roughly speaking, we wish the merit function at the new point to be less than the merit function at the current point ${s}^{k}$. That is, we want ${\mathrm{Ared}}_{k,i}>0$, where ${\mathrm{Ared}}_{k,i}$ is the actual reduction of the merit function, defined by
However, a mere reduction of the merit function is not sufficient to guarantee convergence. In fact, we need a sufficient reduction, defined by the satisfaction of the following test:
where ${\mathrm{Pred}}_{k,i}$ is a positive predicted reduction of the merit function $\mathsf{\Psi}(s,\lambda ,\theta )$ between ${s}^{k}$ and ${v}^{k,i}$. It is defined by
The quantity ${\mathrm{Pred}}_{k,i}$ defined above can be nonpositive, depending on the value of the penalty parameter. Fortunately, if ${\theta}_{k,i}$ is small enough, ${\mathrm{Pred}}_{k,i}$ is arbitrarily close to $\left[\Vert C\left({s}^{k}\right)\Vert -\Vert C\left({z}^{k}\right)\Vert \right]$, which is necessarily nonnegative. Therefore, we can always choose ${\theta}_{k,i}\in (0,1]$ such that
When the criterion ${\mathrm{Ared}}_{k,i}\ge 0.1{\mathrm{Pred}}_{k,i}$ is satisfied, we accept ${v}^{k,i}$ as the new iterate. Otherwise, we reduce the trust-region radius.
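The merit function and the penalty-parameter update described above can be sketched as follows. The halving loop for θ is one simple way to realize "a θ small enough that the predicted reduction dominates half the feasibility gain"; it is a stand-in for the exact maximization rule of the algorithm.

```python
import numpy as np

def merit(L_val, C_val, theta):
    # Psi(s, lambda, theta) = theta * L(s, lambda) + (1 - theta) * ||C(s)||
    return theta * L_val + (1.0 - theta) * np.linalg.norm(C_val)

def choose_theta(theta_prev, dL, dC):
    """Find theta <= theta_prev with
    Pred(theta) = theta*dL + (1 - theta)*dC >= 0.5*dC, where
    dL = L(s^k, lam) - L(v^{k,i}, lam) and
    dC = ||C(s^k)|| - ||C(z^k)|| >= 0 (by the restoration assumption).
    Pred is affine in theta, so halving theta eventually succeeds."""
    theta = theta_prev
    while theta * dL + (1.0 - theta) * dC < 0.5 * dC and theta > 1e-16:
        theta *= 0.5
    return theta
```

For instance, with a Lagrangian increase `dL = -1.0` and feasibility gain `dC = 1.0`, the loop settles on θ = 0.25, the largest halved value with Pred(θ) ≥ 0.5.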
To establish IR methods for semivectorial bilevel programming on Riemannian manifolds, we adapt the IR method presented in [4]. In the presented algorithm, the parameters $\eta >0$, $N>0$, ${\theta}_{1}\in (0,1)$, ${\delta}_{min}>0$, ${\tau}_{1}>0$, and ${\tau}_{2}>0$ are given. The initial approximations ${s}^{0}\in M\times W\times {\mathbb{R}}^{n}$ and ${\lambda}^{0}\in {\mathbb{R}}^{m+n}$, as well as a sequence $\left\{{\omega}^{k}\right\}$ such that ${\sum}_{k=0}^{+\infty}{\omega}^{k}<+\infty$, are also given.
4. Convergence Results
Using the techniques for studying the convergence of IR algorithms in Euclidean spaces [20,22], the convergence results for the IR algorithm for semivectorial bilevel programming on Riemannian manifolds are given under the following assumptions. From now on, we assume that the semivectorial bilevel optimization problems on Riemannian manifolds satisfy assumptions ${H}_{1}$–${H}_{3}$ stated below:
 ${H}_{1}$
There exists ${L}_{1}>0$ such that, for all $(x,\omega ),(\overline{x},\overline{\omega})\in M\times W$, $\mu ,\overline{\mu}\in {\mathbb{R}}^{n}$, and $\xi \in [0,{\xi}_{max}]$,
 ${H}_{2}$
There exists ${L}_{2}>0$ such that, for all $x,\overline{x}\in M$,
 ${H}_{3}$
There exists $r\in [0,1)$, independent of k, such that the point ${z}^{k}=(\overline{x},{\omega}^{k},\overline{\mu})$ obtained in the restoration phase satisfies
where ${s}^{k}=({x}^{k},{\omega}^{k},{\mu}^{k})$. Moreover, if $C\left({s}^{k}\right)=0$, then ${z}^{k}={s}^{k}$.
Theorem 1 (Well-definedness). Under assumptions ${H}_{1}$–${H}_{3}$, IR Algorithm 1 for bilevel programming is well defined.
Algorithm 1: Inexact Restoration algorithm
Step 1. Define ${\theta}_{k}^{min}=\min \{1,{\theta}_{k-1},\dots ,{\theta}_{1}\}$, ${\theta}_{k}^{\mathrm{large}}=\min \{1,{\theta}_{k}^{min}+{\omega}^{k}\}$, and ${\theta}_{k,-1}={\theta}_{k}^{\mathrm{large}}$.
Step 2 (Restoration phase). Find an approximate minimizer $\overline{x}$ and multipliers $\overline{\mu}\in {\mathbb{R}}^{n}$ for the problem:
and define ${z}^{k}=(\overline{x},{\omega}^{k},\overline{\mu})$.
Step 3 (Direction). Compute
where ${P}_{k}$ is the projection on
and ${P}_{k}\left({\mathrm{exp}}_{{z}^{k}}\left(-\eta \,{\mathrm{grad}}_{s}L({z}^{k},{\lambda}^{k})\right)\right)$ is a solution of the following problem:
If ${z}^{k}={s}^{k}$ and ${d}_{\mathrm{tan}}^{k}=0$, then stop and return ${x}^{k}$ as a solution of Problem (7). Otherwise, set $i\leftarrow 0$ and choose ${\delta}_{k,0}\ge {\delta}_{min}$.
Step 4 (Minimization phase). If ${d}_{\mathrm{tan}}^{k}=0$, take ${v}^{k,i}={z}^{k}$. Otherwise, take ${t}_{\mathrm{break}}^{k,i}=\min \left\{1,\frac{{\delta}_{k,i}}{\Vert {d}_{\mathrm{tan}}^{k}\Vert}\right\}$ and find ${v}^{k,i}\in {\pi}_{k}$ such that, for some $0<t<{t}_{\mathrm{break}}^{k,i}$, we have
and $\mathrm{d}({v}^{k,i},{z}^{k})\le {\delta}_{k,i}$.
Step 5. If ${d}_{\mathrm{tan}}^{k}=0$, define ${\lambda}^{k,i}={\lambda}^{k}$. Otherwise, take ${\lambda}^{k,i}\in {\mathbb{R}}^{n+m}$ such that $\Vert {\lambda}^{k,i}\Vert \le N$.
Step 6. For all $\theta \in [0,1]$, define
Take ${\theta}_{k,i}$ as the maximum $\theta \in [0,{\theta}_{k,i-1}]$ that satisfies:
and define ${\mathrm{Pred}}_{k,i}={\mathrm{Pred}}_{k,i}\left({\theta}_{k,i}\right)$.
Step 7. If
then take
and finish the current $k$th iteration. Otherwise, choose ${\delta}_{k,i+1}\in [0.1{\delta}_{k,i},0.9{\delta}_{k,i}]$, set $i\leftarrow i+1$, and go to Step 4.
Proof. According to Step 6 and Step 7 of Algorithm 1, it can be calculated that
Through condition (12), we have
Then, from assumption ${H}_{3}$,
If $C\left({s}^{k}\right)\ne 0$, then, due to the continuity of C and ${\delta}_{k,i}\to 0$, we have $\Vert C\left({z}^{k}\right)-C\left({v}^{k,i}\right)\Vert \to 0$. Thus, there exists a positive constant ${\delta}_{k,i}$ such that
This means that the algorithm is well defined when $C\left({s}^{k}\right)\ne 0$.
If $C\left({s}^{k}\right)=0$, then ${s}^{k}$ is feasible. Since the algorithm does not terminate at the kth iteration, we know that ${d}_{\mathrm{tan}}^{k}\ne 0$. Therefore, we have
Combining this with condition (12), it follows that
and, independently of $\theta$, for all i, ${\theta}_{k,i}={\theta}_{k,-1}$. In view of inequality (13), when ${\delta}_{k,i}$ is sufficiently small, we obtain
Therefore, Algorithm 1 is well defined. □
The next theorem is an important tool for proving the convergence of Algorithm 1. We prove that the actual reduction ${\mathrm{Ared}}_{k,{i}^{*}}$, with ${i}^{*}$ the accepted value of i, achieved at each iteration necessarily tends to 0.
Theorem 2. Under assumptions ${H}_{1}$–${H}_{3}$, if Algorithm 1 generates an infinite sequence, then
The same results hold when ${\lambda}^{k}=0$ for all k.
Proof. Let us prove that ${\lim}_{k\to +\infty}{\mathrm{Ared}}_{k}=0$; i.e., we need to prove
that is,
namely,
where $\mathsf{\Psi}({s}^{k},{\theta}_{k})={\theta}_{k}L({s}^{k},{\lambda}^{k})+(1-{\theta}_{k})\Vert C\left({s}^{k}\right)\Vert$.
By contradiction, suppose that there are an infinite index set ${T}_{1}\subset \{0,1,2,\dots \}$ and a constant $\zeta >0$ such that, for any $k\in {T}_{1}$, we have
Let ${\mathsf{\Psi}}_{k}=\mathsf{\Psi}({s}^{k},{\theta}_{k})$; then
Equivalently,
where ${\zeta}_{k}>\zeta >0$ for all $k\in {T}_{1}$.
According to the definition of ${\theta}_{k,-1}$,
There is an upper bound $c>0$ such that
Combining inequalities (14) and (15), it follows that
Then, for all $k\ge 1$, we have
Since the series ${\sum}_{j=0}^{+\infty}2{\omega}^{j}$ converges and ${\zeta}_{j}$ is bounded away from zero, this implies that ${\mathsf{\Psi}}_{k}$ is unbounded, which is a contradiction. Thus, ${\lim}_{k\to +\infty}{\mathrm{Ared}}_{k}=0$. In a similar way, we can prove ${\lim}_{k\to +\infty}\Vert C\left({s}^{k}\right)\Vert =0$. □
Theorem 2 implies that the points generated by the IR algorithm for the KKT reformulation (7) eventually converge to a feasible point. Next, we prove that ${d}_{\mathrm{tan}}^{k}$ cannot be bounded away from zero under the following assumption ${H}_{4}$; this means that the points generated by the IR algorithm converge to a weak Pareto solution of Problem (7):
 ${H}_{4}$
There exists $\beta >0$, independent of k, such that
Theorem 3. Suppose that assumptions ${H}_{1}$, ${H}_{2}$, ${H}_{3}$, and ${H}_{4}$ hold. If $\left\{{s}^{k}\right\}$ is an infinite sequence generated by Algorithm 1 and $\left\{{z}^{k}\right\}$ is the sequence defined in the restoration phase of Algorithm 1, then:
 1. $\Vert C\left({s}^{k}\right)\Vert \to 0$.
 2. There exists a limit point ${s}^{*}$ of $\left\{{s}^{k}\right\}$.
 3. Every limit point of $\left\{{s}^{k}\right\}$ is a feasible point of the KKT reformulation (7).
 4. If, for all ω, a global solution of the lower-level problem is found, then any limit point $({x}^{*},{\omega}^{*})$ is feasible for the weighted semivectorial bilevel program (6).
 5. If ${s}^{*}$ is a limit point of $\left\{{s}^{k}\right\}$, there exists an infinite set $K\subset \mathbb{N}$ such that
Proof. The first two items follow from Theorem 2 and assumptions ${H}_{1}$–${H}_{3}$. Based on these, the third and fourth items hold. The fifth item follows from assumption ${H}_{4}$ and the first item. □
The above conclusions give the well-definedness and convergence of the algorithm proposed for semivectorial bilevel programming on Riemannian manifolds. Among the assumptions put forward in this paper, ${H}_{3}$ and ${H}_{4}$ concern the sequences generated by the IR algorithm. Therefore, it is worth establishing sufficient conditions that guarantee them. Two assumptions on the lower-level problem are given below, under which hypotheses ${H}_{3}$ and ${H}_{4}$ can be verified:
 ${H}_{5}$
For every solution $s=(x,\omega ,\mu )$ of $C(x,\omega ,\mu )=0$, the gradients $\mathrm{grad}{h}_{i}\left(x\right)$, $i=1,\dots ,n$, of the active lower-level constraints are linearly independent.
 ${H}_{6}$
For every solution $s=(x,\omega ,\mu )$ of $C(x,\omega ,\mu )=0$, the matrix:
is positive definite on the following set:
For convenience, to verify ${H}_{3}$ and ${H}_{4}$, we define the following matrix:
Lemma 1. The matrix ${D}^{\prime}\left(s\right)$ is nonsingular for any solution $s=(x,\omega ,\mu )$ of $C(x,\omega ,\mu )=0$.
Proof. Assume that there exist $u\in {\mathbb{R}}^{m}$ and $v\in {\mathbb{R}}^{p}$ such that
then we have
According to the assumptions ${H}_{5}$–${H}_{6}$ and Equalities (16) and (17), it follows that $u=0$ and $v=0$. This means that the matrix ${D}^{\prime}\left(s\right)$ is nonsingular for any solution $s=(x,\omega ,\mu )$ of $C(x,\omega ,\mu )=0$. □
Let ${D}^{\prime}\left(s\right)$ be defined on $M\times W\times {\mathbb{R}}^{n}$ and suppose that, for each $\omega \in W$, there is a solution $u\left(\omega \right)=\left(x\left(\omega \right),\mu \left(\omega \right)\right)$ of $C(x,\omega ,\mu )=0$ such that the function $v\left(\omega \right)=u\left(\omega \right)$ is continuous on W. Fixing the function $v\left(\omega \right)$, by Lemma 1 we can define a function ${\rm Y}\left(\omega \right)={D}^{\prime}{(\omega ,v\left(\omega \right))}^{-1}$ over the set W. Let $V(v\left(\omega \right),\alpha )=\{v\in M\times {\mathbb{R}}^{n}:\mathrm{d}(v,v\left(\omega \right))\le \alpha \}$. The following lemma can then be obtained.
Lemma 2. There exist $\alpha >0$ and $\beta >0$ such that, for all $\omega \in W$, $\Vert {\rm Y}\left(\omega \right)\Vert <\beta$, and for all $v\in V(v\left(\omega \right),\alpha )$, ${\rm Y}\left(\omega \right)$ coincides with the local inverse operator of ${D}^{\prime}(\omega ,\cdot )$.
Proof. Since ${D}^{\prime}(\omega ,v)$ is continuous in $(\omega ,v)$, $v\left(\omega \right)$ is continuous on W, and ${\rm Y}\left(\omega \right)$ is continuous with respect to $\omega \in W$, there exists $\beta >0$ such that, for all $\omega \in W$, $\Vert {\rm Y}\left(\omega \right)\Vert <\beta$.
For each fixed $\omega \in W$, the continuously differentiable operator $v\mapsto C(\omega ,v)$ satisfies the assumptions of the inverse function theorem at $v\left(\omega \right)$. Hence, there exists $\alpha >0$ such that $C(\omega ,\cdot )$ has a continuously differentiable local inverse operator $G\left(\omega \right):C(\omega ,V(v\left(\omega \right),\alpha ))\mapsto V(v\left(\omega \right),\alpha )$, and the Jacobian matrix ${\left[G\left(\omega \right)\right]}^{\prime}$ coincides with ${\rm Y}\left(\omega \right)$. This ends the proof. □
Finally, we state that ${H}_{3}$ and ${H}_{4}$ hold under assumptions ${H}_{5}$ and ${H}_{6}$. The next theorem summarizes this fact.
Theorem 4. Let $r\in [0,1)$ and $(\omega ,u)\in W\times M\times {\mathbb{R}}^{n}$ be such that $C(\omega ,u)\ne 0$. If assumptions ${H}_{5}$–${H}_{6}$ hold, then there exist $\beta >0$, $\omega \in W$, and $\overline{u}=(\overline{x},\overline{\mu})\in M\times {\mathbb{R}}^{n}$ such that
and
Proof. According to Lemmas 1 and 2, combining assumptions ${H}_{5}$ and ${H}_{6}$ and using Taylor expansions of the functions on Riemannian manifolds, the statement follows from the results of [20]. This ends the proof. □
Example 1. We consider the particular case $M={\mathbb{R}}_{+}^{2}:=\{({x}_{1},{x}_{2})\in {\mathbb{R}}^{2}:{x}_{1}>0,{x}_{2}>0\}$ with the metric g given in Cartesian coordinates $({x}_{1},{x}_{2})$ around the point $x\in M$ by the matrix:
In other words, for any vectors $u=({u}_{1},{u}_{2})$ and $v=({v}_{1},{v}_{2})$ in the tangent plane at $x\in M$, denoted by ${T}_{x}M$, which coincides with ${\mathbb{R}}^{2}$, we have
Let $a=({a}_{1},{a}_{2})\in M$ and $v=({v}_{1},{v}_{2})\in {T}_{a}M$. It is easy to see that the (minimizing) geodesic curve $t\mapsto \gamma \left(t\right)$ verifying $\gamma \left(0\right)=a$, $\dot{\gamma}\left(0\right)=v$ is given by
Hence, M is a complete Riemannian manifold. Furthermore, the (minimizing) geodesic segment $\gamma :[0,1]\to M$ joining the points $a=({a}_{1},{a}_{2})$ and $b=({b}_{1},{b}_{2})$, i.e., $\gamma \left(0\right)=a$, $\gamma \left(1\right)=b$, is given by ${\gamma}_{i}\left(t\right)={a}_{i}^{1-t}{b}_{i}^{t}$, $i=1,2$. Thus, the distance d on the metric space $(M,g)$ is given by
It follows easily that the closed ball $\mathbb{B}(a;R)$ centered at $a\in M$ of radius $R\ge 0$ verifies
thus, every closed rectangle $[{\rho}_{1},{\eta}_{1}]\times [{\rho}_{2},{\eta}_{2}]$ $({\rho}_{1}>0,{\rho}_{2}>0)$ is bounded in the metric space $(M,g)$ with the distance d. Next, we consider the functions $F:M\to \mathbb{R}$, $f:M\to {\mathbb{R}}^{2}$, and $h:M\to \mathbb{R}$ given for any $x\in M$ by
It is easy to see that, for $x\in M$ and any geodesic segment $\gamma :[0,1]\to M$ with $\gamma \left(0\right)=a$, $\gamma \left(1\right)=b$, the functions ${f}_{i}\left(x\right)$, $i=1,2$, and $h\left(x\right)$ are all convex on M with the Riemannian metric g. Moreover, the function $h\left(x\right)$ satisfies the Slater constraint qualification.
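A quick numerical sanity check (not part of the paper) of the closed-form geodesic above: on $M={\mathbb{R}}_{+}^{2}$ with $g=\mathrm{diag}(1/{x}_{1}^{2},1/{x}_{2}^{2})$, the curve ${\gamma}_{i}(t)={a}_{i}^{1-t}{b}_{i}^{t}$ has constant Riemannian speed, and its length matches $\mathrm{d}(a,b)=\Vert \log b-\log a\Vert$. The points `a`, `b` are arbitrary test data.

```python
import numpy as np

def geodesic(a, b, t):
    # gamma_i(t) = a_i^(1-t) * b_i^t
    return a ** (1.0 - t) * b ** t

def distance(a, b):
    # d(a, b) = || log(b) - log(a) ||_2 for this metric
    return np.linalg.norm(np.log(b) - np.log(a))

def speed(a, b, t, h=1e-6):
    # ||gamma'(t)||_{gamma(t)} with g = diag(1/x_i^2): divide each
    # component of the (central finite-difference) velocity by x_i.
    x = geodesic(a, b, t)
    v = (geodesic(a, b, t + h) - geodesic(a, b, t - h)) / (2.0 * h)
    return np.linalg.norm(v / x)

a, b = np.array([1.0, 2.0]), np.array([3.0, 5.0])
ts = np.linspace(0.0, 1.0, 201)
sp = np.array([speed(a, b, t) for t in ts])
# Trapezoidal Riemannian length of gamma; should equal d(a, b).
length = np.sum((sp[:-1] + sp[1:]) * (ts[1] - ts[0]) / 2.0)
```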
We then consider the corresponding KKT reformulation of the semivectorial bilevel programming on Riemannian manifolds:
By the definition of the gradient of a differentiable function with respect to the Riemannian metric g, taking ${\omega}_{1}=\frac{1}{3}$, ${\omega}_{2}=\frac{2}{3}$ (so that ${\omega}_{1}+{\omega}_{2}=1$) and $\mu ={(\frac{1}{2},\frac{3}{4})}^{T}\in {\mathbb{R}}^{2}$, we have
It is easy to see that the unique optimal solution of the KKT reformulation is $x=(\frac{3-\sqrt{7}}{4},\frac{3+\sqrt{7}}{4})$.
According to Algorithm 1, we first give the initial approximations ${s}^{0}\in M\times W\times {\mathbb{R}}^{2}$, ${\lambda}^{0}\in {\mathbb{R}}^{2}$, and a sequence $\left\{{\omega}^{k}\right\}$. In the restoration phase, we find an approximate minimizer $\overline{x}=(\overline{{x}_{1}},\overline{{x}_{2}})\in M$ and a multiplier $\overline{\mu}=(\overline{{\mu}_{1}},\overline{{\mu}_{2}})\in {\mathbb{R}}^{2}$ for the problem:
and define ${z}^{k}=(\overline{x},{\omega}^{k},\overline{\mu})$. We then compute the direction by using the exponential mapping and the projection defined on the Riemannian manifold M:
where $L({z}^{k},{\lambda}^{k})={x}_{1}+{\lambda}_{1}^{k}\left({\sum}_{i=1}^{2}{\omega}_{i}^{k}{\mathrm{grad}}_{s}{f}_{i}\left(\overline{x}\right)+{\mathrm{grad}}_{s}h\left(\overline{x}\right)\overline{\mu}\right)+{\lambda}_{2}^{k}h\left(\overline{x}\right)$. In the minimization phase, we first find ${v}^{k,i}$ such that $L({v}^{k,i},{\lambda}^{k})<L({z}^{k},{\lambda}^{k})$ and ${v}^{k,i}\in {B}_{k,i}=\{v:\mathrm{d}(v,{z}^{k})\le {\delta}_{k,i}\}$. Then, by computing the actual reduction ${\mathrm{Ared}}_{k,i}$ and the positive predicted reduction ${\mathrm{Pred}}_{k,i}$ of the merit function $\mathsf{\Psi}(s,\lambda ,\theta )$ and requiring ${\mathrm{Ared}}_{k,i}\ge 0.1{\mathrm{Pred}}_{k,i}$, we obtain a sequence $\left\{{s}^{k}\right\}$.
According to Theorems 3 and 4, the sequence $\left\{{s}^{k}\right\}$ generated by the IR method established in the present paper converges to a solution of the semivectorial bilevel programming on Riemannian manifolds.