An Active-Set Fischer–Burmeister Trust-Region Algorithm to Solve a Nonlinear Bilevel Optimization Problem

Elsobky, Bothina; Ashry, Gehan

doi:10.3390/fractalfract6080412

Open AccessArticle

An Active-Set Fischer–Burmeister Trust-Region Algorithm to Solve a Nonlinear Bilevel Optimization Problem

by

Bothina Elsobky

^* and

Gehan Ashry

Department of Mathematics, Faculty of Science, Alexandria University, Alexandria 5424041, Egypt

^*

Author to whom correspondence should be addressed.

Fractal Fract. 2022, 6(8), 412; https://doi.org/10.3390/fractalfract6080412

Submission received: 18 May 2022 / Revised: 1 July 2022 / Accepted: 18 July 2022 / Published: 27 July 2022

(This article belongs to the Section Numerical and Computational Methods)

Download Versions Notes

Abstract

:

In this paper, the Fischer–Burmeister active-set trust-region (FBACTR) algorithm is introduced to solve the nonlinear bilevel programming problems. In FBACTR algorithm, a Karush–Kuhn–Tucker (KKT) condition is used with the Fischer–Burmeister function to transform a nonlinear bilevel programming (NBLP) problem into an equivalent smooth single objective nonlinear programming problem. To ensure global convergence for the FBACTR algorithm, an active-set strategy is used with a trust-region globalization strategy. The theory of global convergence for the FBACTR algorithm is presented. To clarify the effectiveness of the proposed FBACTR algorithm, applications of mathematical programs with equilibrium constraints are tested.

Keywords:

a bilevel optimization problem; Fischer–Burmeister function; a Karush–Kuhn–Tucker conditions; active-set strategy; trust-region strategy; global convergence

MSC:

65Dxx; 65Kxx; 65Zxx

1. Introduction

The mathematical formulation for NBLP problem which we will consider it is

\begin{matrix} min_{v} & f_{u} (v, w) \\ s . t . & g_{u} (v, w) \leq 0, \\ min_{w} & f_{l} (v, w), \\ s . t . & g_{l} (v, w) \leq 0, \end{matrix}

(1)

where

v \in ℜ^{n_{1}}

and

w \in ℜ^{n_{2}}

. In our approach, the functions

f_{u} : ℜ^{n_{1} + n_{2}} \to ℜ

,

f_{l} : ℜ^{n_{1} + n_{2}} \to ℜ

,

g_{u} : ℜ^{n_{1} + n_{2}} \to ℜ^{m_{1}}

, and

g_{l} : ℜ^{n_{1} + n_{2}} \to ℜ^{m_{2}}

must have a twice continuously differentiable function at least.

The NBLP problem (1) is utilized so extensively in transaction network, resource allocation, finance budget, price control, etc., see [1,2,3,4]. The NBLP problem (1) has two levels of optimization problems, upper and lower levels. A decision maker with the upper level objective function

f_{u} (v, w)

takes the lead, and so he chooses the decision vector v. According to this, the decision maker with lower level objective function

f_{l} (v, w)

, chooses the decision vector w to optimize her objective, parameterized in v.

To obtain the solution of problem (1), number of different approaches have been offered, see (1), see [5,6,7,8,9]. In our method, we utilize one of these approaches to transforme NBLP problem (1) to a single level one by replacing the lower level optimization problem with its KKT conditions, see [10,11].

Utilizing KKT optimality conditions for the lower level problem, the NBLP problem (1) is reduced to the following single-objective optimization problem:

\begin{matrix} min_{v, w} & f_{u} (v, w) \\ s . t . & g_{u} (v, w) \leq 0, \\ \nabla_{w} f_{l} (v, w) + \nabla_{w} g_{l} (v, w) λ = 0, \\ g_{l} (v, w) \leq 0, \\ λ_{j} g_{l_{j}} (v, w) = 0, j = 1, \dots, m_{2}, \\ λ_{j} \geq 0, j = 1, \dots, m_{2}, \end{matrix}

(2)

where

λ \in ℜ^{m_{2}}

a multiplier vector which is associated with inequality constraint

g_{l} (v, w)

.

Problem (2) is non-differentiable and non-convex. Furthermore, the regularity assumption prerequisites to successfully handle smooth optimization problems are never satisfied. Following the smoothing method which is proposed by [2], we can introduce the FBACTR algorithm to solve the problem (2). Before introducing FBACTR algorithm, we need the following definition.

Definition 1.

A Fischer–Burmeister function is the function

Ψ (e, d) : ℜ^{2} \to ℜ

and it is defined by

Ψ (e, d) = e + d - \sqrt{e^{2} + d^{2}}

. A perturbed Fischer–Burmeister function is the function

ψ (e, d, \hat{ε}) : ℜ^{3} \to ℜ

and it is defined by

ψ (e, d, \hat{ε}) = e + d - \sqrt{e^{2} + d^{2} + \hat{ε}}

.

The Fischer–Burmeister function has the property that

Ψ (e, d) = 0

if and only if

e \geq 0

,

d \geq 0

, and

e d = 0

. It is non-differentiable at

e = d = 0

. Its perturbed variant satisfies

ψ (e, d, \hat{ε}) = 0

if and only if

e > 0

,

d > 0

, and

e d = \frac{\hat{ε}}{2}

for

\hat{ε} > 0

. This function is smooth with respect to e, d, for

\hat{ε} > 0

, and for more details see [12,13,14,15].

The next perturbed Fischer–Burmeister function is used to satisfy the asymptotic stability conditions, and allow the FBACTR algorithm to solve problem (2).

ψ (e, d, \hat{ε}) = \sqrt{e^{2} + d^{2} + \hat{ε}} - e - d .

(3)

Using the perturbed Fischer–Burmeister function (3), problem (2) can be approximated by:

\begin{matrix} min_{v, w} & f_{u} (v, w) \\ s . t . & g_{u} (v, w) \leq 0, \\ \nabla_{w} f_{l} (v, w) + \nabla_{w} g_{l} (v, w) λ = 0, \\ \sqrt{g_{l_{j}}^{2} + λ_{j}^{2} + \hat{ε}} - λ_{j} + g_{l_{j}} = 0, j = 1, \dots, m_{2} . \end{matrix}

(4)

The following notations are introduced to simplify our discussion. These notations are

n = n_{1} + n_{2} + m_{2}

,

x = {(v, w, λ)}^{T} \in ℜ^{n}

and

c (x) = (\nabla_{w} f_{l} (v, w) + \nabla_{w} g_{l} (v, w) λ,

\sqrt{g_{l_{j}}^{2} + λ_{j}^{2} + \hat{ε}} - λ_{j} + g_{l_{j}})^{T}, j = 1, \dots, m_{2}

. Hence problem (4) can be reduced as follows:

\begin{matrix} minimize & f_{u} (x) \\ subject to & g_{u} (x) \leq 0, \\ c (x) = 0, \end{matrix}

(5)

where

f_{u} : ℜ^{n} \to ℜ

,

g_{u} : ℜ^{n} \to ℜ^{m_{1}}

, and

c : ℜ^{n} \to ℜ^{n_{2} + m_{2}}

.

A set of indices of binding or violated inequality constraints at x is defined by

I (x) = {i : g_{u_{i}} (x) \geq 0}

. A regular point is the point

x_{*}

at which the vectors of the set

{\nabla c_{i} (x_{*}), i = 1, 2, \dots, n_{2} + m_{2}}

⋃

{\nabla g_{u_{i}} (x_{*}), i \in I (x_{*})}

are linearly independent.

A regular point

x_{*}

is KKT point of problem (5) if there exist Lagrange multiplier vectors

μ_{*} \in ℜ^{n_{2} + m_{2}}

and

λ_{*} \in ℜ^{m_{1}}

such that the following KKT conditions hold:

\begin{matrix} \nabla f_{u} (x_{*}) + \nabla c (x_{*}) μ_{*} + \nabla g_{u} (x_{*}) λ_{*} & = & 0, \end{matrix}

(6)

\begin{matrix} c (x_{*}) & = & 0, \end{matrix}

(7)

\begin{matrix} g_{u} (x_{*}) & \leq & 0, \end{matrix}

(8)

\begin{matrix} {(λ_{*})}_{i} g_{u_{i}} (x_{*}) & = & 0, i = 1, \dots, m_{1}, \end{matrix}

(9)

\begin{matrix} {(λ_{*})}_{i} & \geq & 0, i = 1, \dots, m_{1} . \end{matrix}

(10)

To solve the nonlinear single-objective constrained optimization problem (5), various approaches have been proposed; for more details, see [16,17,18,19,20,21,22].

An active-set strategy is utilized to reduce problem (5) to equality constrained optimization problem. The idea beyond the active-set method is to identify at every iteration, the active inequality constraints and treat them as equalities and this allows to utilize the improved methods which are used to solve the equality constrained problems, see [21,23,24]. Most of the methods that are used to solve the equality constrained problems, may not converge if the starting point is far away from the stationary point, so it is called a local method.

To ensure a convergence to the solution from any starting point, a trust-region strategy which is strongly global convergence can be induced. It is very important strategy to solve a smooth optimization. It is more robust when it deals with rounding errors. It does not require the objective function of the model be convex. For more details see [11,21,22,23,24,25,26,27,28,29,30,31,32].

To treat the difficult of having infeasible trust-region subproblem in FBACTR algorithm, a reduced Hessian technique which is suggested by [33,34] and used by [22,24,35] is utilized.

Under five assumptions, a theory of global convergence for FBACTR algorithm is proved. Moreover, numerical experiments display that FBACTR algorithm performers effectively and efficiently in pursuance.

We shall use the following notation and terminology. We use

∥ . ∥

to denote the Euclidean norm

{∥ . ∥}_{2}

. Subscript k refers to iteration indices. For example,

f_{u_{k}} \equiv f_{u} (x_{k})

,

g_{u_{k}} \equiv g_{u} (x_{k})

,

c_{k} \equiv c (x_{k})

,

Y_{k} \equiv Y (x_{k})

,

P_{k} \equiv P (x_{k})

,

\nabla_{x} ℓ_{k} \equiv \nabla_{x} ℓ (x_{k}, μ_{k})

, and so on to denote the function value at a particular point.

The rest of the paper is organized as follows. Section 2 is devoted to the description of an active-set trust-region algorithm to solve problem (5) and summarized to FBACTR algorithm to solve NBLP problem (1) is introduced. In Section 3 the analysis of the theory of global convergence of the FBACTR algorithm is presented. Section 4 contains an implementation of the FBACTR algorithm and the results of test problems. Finally, some further remarks are given in Section 5.

2. Active-Set with Trust-Region Technique

A detailed description for active-set with the trust-region strategy to solve problem (5) and summarized to FBACTR algorithm to solve problem (1) are introduced in this section.

Based on the active-set method which is suggested by [36] and used with [21,22,23,24], we define a 0–1 diagonal matrix

P (x) \in ℜ^{m_{1} \times m_{1}}

, whose diagonal entries are:

p_{i} (x) = \{\begin{matrix} 1 & i f g_{u_{i}} (x) \geq 0, \\ 0 & i f g_{u_{i}} (x) < 0 . \end{matrix}

(11)

Using the previous definition of the matrix

P (x)

, a smooth and simple function is utilized to replace problem (5) with the following simple problem

\begin{matrix} minimize & f_{u} (x) + \frac{r}{2} {∥ P (x) g_{u} (x) ∥}^{2} \\ subject to & c (x) = 0, \end{matrix}

(12)

where

r > 0

is a parameter, see [21,22,23]. The Lagrangian function associated with problem (12) is given by:

L (x, μ; r) = ℓ (x, μ) + \frac{r}{2} {∥ P (x) g_{u} (x) ∥}^{2},

(13)

where

ℓ (x, μ) = f_{u} (x) + μ^{T} c (x),

(14)

and

μ \in ℜ^{n_{2} + m_{2}}

represents a Lagrange multiplier vector which is associated with the constraint

c (x)

. A KKT point

(x_{*}, μ_{*})

for problem (12) is the point at which the following conditions are satisfied

\begin{matrix} \nabla ℓ (x_{*}, μ_{*}) + r \nabla g_{u} (x_{*}) P (x_{*}) g_{u} (x_{*}) & = & 0, \end{matrix}

(15)

\begin{matrix} h (x_{*}) & = & 0, \end{matrix}

(16)

where

\nabla ℓ (x_{*}, μ_{*}) = \nabla f_{u} (x_{*}) + \nabla c (x_{*}) μ_{*}

.

If the KKT point

(x_{*}, μ_{*})

satisfies conditions (6)–(10), we notice that it is also satisfies conditions (15) and (16), but the converse is not necessarily true. So, we design FBACTR algorithm in a way that, if

(x_{*}, μ_{*})

satisfies conditions (15) and (16), then it is also satisfies KKT conditions (6)–(10).

Various approaches which were proposed to solve the equality constrained are local methods. By local method, we mean a method such that if the starting point is sufficiently close to a solution, then under some reasonable assumptions the method is guaranteed by theory to converge to the solution. There is no guarantee that the local method converges starting from the remote. Globalizing a local method means modifying the method in such a way that is guaranteed to converge from any starting point without sacrificing its fast local rate of convergence. To ensure convergence from the remote, the trust-region technique is utilized.

2.1. A Trust-Region Technique

To solve problem (12) and to convergence from remote with any starting point, the trust-region strategy is used. A naive trust-region quadratic subproblem associated with problem (12) is:

\begin{matrix} minimize & q_{k} (s) = ℓ_{k} + \nabla_{x} ℓ_{k}^{T} s + \frac{1}{2} s^{T} H_{k} s + \frac{r}{2} ∥ P_{k} {(g_{u_{k}} + \nabla g_{u_{k}})}^{T} s {) ∥}^{2} \\ subject to & c_{k} + \nabla c_{k}^{T} s = 0, \\ ∥ s ∥ \leq δ_{k}, \end{matrix}

(17)

where

0 < δ_{k}

represents the trust-region radius and

H_{k}

is the Hessian matrix of the Lagrangian function (14) or an approximation to it.

Subproblem (17) may be infeasible because there may be no intersecting points between hyperplane of the linearized constraints

c (x) + \nabla c {(x)}^{T} s = 0

and the constraint

∥ s ∥ \leq δ_{k}

. Even if they intersect, there is no guarantee that this will keep true if

δ_{k}

is reduced, see [37]. To overcome this problem, a reduced Hessian technique which was suggested by [33,34] and used by [22,23,35] is used. In this technique, to obtain the trial step

s_{k}

, it is decomposed into two orthogonal components: the tangential component

s_{k}^{t}

to improve optimality and the normal component

s_{k}^{n}

to improve feasibility. To evaluate each of

s_{k}^{n}

and

s_{k}^{t}

, two unconstrained trust-region subproblems are solved.

To obtain the normal component $s^{n}$

To evaluate the normal component

s_{k}^{n}

, the following trust-region subproblem must be solved:

\begin{matrix} minimize & \frac{1}{2} {∥ c_{k} + \nabla c_{k}^{T} s^{n} ∥}^{2} \\ subject to & ∥ s^{n} ∥ \leq ζ δ_{k}, \end{matrix}

(18)

for some

ζ \in (0, 1)

.

Any method can be used to solve subproblem (18), as long as a fraction of the normal predicted decrease obtained by the Cauchy step

s_{k}^{n c p}

is less than or equal to the normal predicted decrease obtained by

s_{k}^{n}

. That is, the following condition must be held:

∥ c_{k} ∥^{2} - ∥ c_{k} + \nabla c_{k}^{T} s_{k}^{n} ∥^{2} \geq ϑ_{1} {∥ c_{k} ∥^{2} - ∥ c_{k} + \nabla c_{k}^{T} s_{k}^{n c p} ∥^{2}},

(19)

for some

ϑ_{1} \in (0, 1]

. The normal Cauchy step

s_{k}^{n c p}

is given by:

s_{k}^{n c p} = - τ_{k}^{n c p} \nabla c_{k} c_{k},

(20)

where the parameter

τ_{k}^{n c p}

is given by:

τ_{k}^{n c p} = \{\begin{matrix} \frac{∥ \nabla c_{k} c_{k} ∥^{2}}{∥ {(\nabla c_{k})}^{T} \nabla c_{k} c_{k} ∥^{2}} & i f \frac{∥ \nabla c_{k} c_{k} ∥^{3}}{∥ \nabla c_{k}^{T} \nabla c_{k} c_{k} {) ∥}^{2}} \leq δ_{k} \\ and ∥ \nabla c_{k}^{T} \nabla c_{k} c_{k}) ∥ > 0, \\ \frac{δ_{k}}{∥ \nabla c_{k} c_{k} ∥} & otherwise . \end{matrix}

(21)

A dogleg method is used to solve subproblem (18). It is very cheap if the Hessian is indefinite. The dogleg algorithm approximates the solution curve to subproblem (18) by piecewise linear function connecting the Newton point to the Cauchy point. For more details, see [35].

Once

s_{k}^{n}

is estimated, we will compute

s_{k}^{t} = Y_{k} {\bar{s}}_{k}^{t}

. A matrix

Y_{k}

is the matrix whose columns form a basis for the null space of

{(\nabla c_{k})}^{T}

.

To obtain the tangential component $s_{k}^{t}$ .

To evaluate the tangential component

s_{k}^{t}

, the following subproblem is solved by using the dogleg method

\begin{matrix} minimize & {(Y_{k}^{T} \nabla q_{k} (s_{k}^{n}))}^{T} {\bar{s}}^{t} + \frac{1}{2} {\bar{s}}^{t^{T}} Y_{k}^{T} B_{k} Y_{k} {\bar{s}}^{t} \\ subject to & ∥ Y_{k} {\bar{s}}^{t} ∥ \leq Δ_{k}, \end{matrix}

(22)

where

\nabla q_{k} (s_{k}^{n}) = \nabla_{x} ℓ_{k} + B_{k} s_{k}^{n} + r_{k} \nabla g_{u_{k}} P_{k} g_{u_{k}}

,

B_{k} = H_{k} + r_{k} \nabla g_{u_{k}} P_{k} \nabla g_{u_{k}}^{T}

, and

Δ_{k} = \sqrt{δ_{k}^{2} - {∥ s_{k}^{n} ∥}^{2}}

.

Since the dogleg method is used to solve the above subproblem, then a fraction of the tangential predicted decrease obtained by a tangential Cauchy step

{\bar{s}}_{k}^{t c p}

is less than or equal to the tangential predicted decrease which is obtained by tangential step

{\bar{s}}_{k}^{t}

. That is, the following conditions hold

q_{k} (s_{k}^{n}) - q_{k} (s_{k}^{n} + Y_{k} {\bar{s}}_{k}^{t}) \geq ϑ_{2} [q_{k} (s_{k}^{n}) - q_{k} (s_{k}^{n} + Y_{k} {\bar{s}}_{k}^{t c p})],

(23)

for some

ϑ_{2} \in (0, 1]

. The tangential Cauchy step

s_{k}^{t c p}

is defined as follows

{\bar{s}}_{k}^{t c p} = - τ_{k}^{t c p} Y_{k}^{T} \nabla q_{k} (s_{k}^{n}),

(24)

where the parameter

τ_{k}^{t c p}

is given by

τ_{k}^{t c p} = \{\begin{matrix} \frac{∥ Y_{k}^{T} \nabla q_{k} (s_{k}^{n}) ∥^{2}}{{(Y_{k}^{T} \nabla q_{k} (s_{k}^{n}))}^{T} {\bar{B}}_{k} Y_{k}^{T} \nabla q_{k} (s_{k}^{n})} & i f \frac{∥ Y_{k}^{T} \nabla q_{k} (s_{k}^{n}) ∥^{3}}{{(Y_{k}^{T} \nabla q_{k} (s_{k}^{n}))}^{T} {\bar{B}}_{k} Y_{k}^{T} \nabla q_{k} (s_{k}^{n})} \leq Δ_{k} \\ and {(Y_{k}^{T} \nabla q_{k} (s_{k}^{n}))}^{T} {\bar{B}}_{k} Y_{k}^{T} \nabla q_{k} (s_{k}^{n}) > 0, \\ \frac{Δ_{k}}{∥ Y_{k}^{T} \nabla q_{k} (s_{k}^{n}) ∥} & otherwise, \end{matrix}

(25)

such that

{\bar{B}}_{k} = Y_{k}^{T} B_{k} Y_{k}

.

To be decided whether the step

s_{k} = s_{k}^{n} + s_{k}^{t}

will be accepted or not, a merit function is needed to tie the objective function with the constraints in such a way that progress in the merit function means progress in solving the problem. The following augmented Lagrange function is used in FBACTR algorithm as a merit function,

Φ (x, μ; r; σ) = ℓ (x, μ) + \frac{r}{2} ∥ P (x) g_{u} {(x) ∥}^{2} + σ {∥ c (x) ∥}^{2},

(26)

where

σ > 0

is a penalty parameter.

To test whether the point

(x_{k} + s_{k}, μ_{k + 1})

will be taken in the next iterate, an actual reduction and a predicted reduction are defined.

The actual reduction

A r e d_{k}

in the merit function in moving from

(x_{k}, μ_{k})

to

(x_{k} + s_{k}, μ_{k + 1})

is defined as follows:

A r e d_{k} = Φ (x_{k}, μ_{k}; r_{k}; σ_{k}) - Φ (x_{k} + d_{k}, μ_{k + 1}; r_{k}; σ_{k}) .

A r e d_{k}

can also be written as follows:

A r e d_{k} = ℓ (x_{k}, μ_{k}) - ℓ (x_{k + 1}, μ_{k}) - Δ μ_{k}^{T} c_{k + 1} + \frac{r_{k}}{2} [∥ P_{k} g_{u} (x_{k}) ∥^{2} - ∥ P_{k + 1} g_{u_{k + 1}} ∥^{2}] + σ_{k} [∥ c_{k} ∥^{2} - ∥ c_{k + 1} ∥^{2}],

(27)

where

Δ μ_{k} = (μ_{k + 1} - μ_{k})

.

The predicted reduction in the merit function is defined to be:

\begin{matrix} P r e d_{k} & = & - {(\nabla_{x} ℓ (x_{k}, μ_{k}))}^{T} s_{k} - \frac{1}{2} s_{k}^{T} H_{k} s_{k} - Δ μ_{k}^{T} (c_{k} + \nabla c_{k}^{T} s_{k}) + \frac{r_{k}}{2} [∥ P_{k} g_{u_{k}} ∥^{2} - ∥ P_{k} (g_{u_{k}} + \nabla g_{u_{k}}^{T} s_{k}) ∥^{2}] \\ + σ_{k} [∥ c_{k} ∥^{2} - ∥ c_{k} + \nabla c_{k}^{T} s_{k} ∥^{2}] . \end{matrix}

(28)

P r e d_{k}

can be written as:

\begin{matrix} P r e d_{k} & = & q_{k} (0) - q_{k} (s_{k}) - Δ μ_{k}^{T} (c_{k} + \nabla c_{k}^{T} s_{k}) + σ_{k} [∥ c_{k} ∥^{2} - ∥ c_{k} + \nabla c_{k}^{T} s_{k} ∥^{2}] . \end{matrix}

(29)

To update the penalty parameter $σ_{k}$

To update the penalty parameter

σ_{k}

to ensure that

P r e d_{k} \geq 0

, the following schemeis used (see Algorithm 1):

Algorithm 1 To update the penalty parameter

σ_{k}

If

P r e d_{k} \leq \frac{σ_{k}}{2} [∥ c_{k} ∥^{2} - ∥ c_{k} + \nabla c_{k}^{T} μ_{k} s_{k} ∥^{2}],

(30)

then, set

σ_{k} = \frac{2 [q_{k} (s_{k}) - q_{k} (0) + Δ μ_{k}^{T} (c_{k} + \nabla c_{k}^{T} s_{k})]}{∥ c_{k} ∥^{2} - {∥ c_{k} + \nabla c_{k}^{T} s_{k} ∥}^{2}} + β_{0},

(31)

where

β_{0} > 0

is a small fixed constant. Else, set

σ_{k + 1} = max (σ_{k}, r_{k}^{2}) .

(32)

End if.

For more details, see [22].

To test the step $s_{k}$ and update $δ_{k}$

The framework to test the step

s_{k}

and update

δ_{k}

is clarified in the Algorithm 2.

Algorithm 2 (To test the step

s_{k}

and update

δ_{k}

)

Choose

0 < α_{1} < α_{2} < 1

,

0 < τ_{1} < 1 < τ_{2}

, and

δ_{m i n} \leq δ_{0} \leq δ_{m a x}

.

While

\frac{A r e d_{k}}{P r e d_{k}} \in (0, α_{1})

or

P r e d_{k} \leq 0

.

Set

δ_{k} = τ_{1} ∥ s_{k} ∥

.

Evaluate a new trial step

s_{k}

.

End while.

If

\frac{A r e d_{k}}{P r e d_{k}} \in [α_{1}, α_{2})

.

Set

x_{k + 1} = x_{k} + s_{k}

and

δ_{k + 1} = max (δ_{k}, δ_{m i n})

.

End if. If

\frac{A r e d_{k}}{P r e d_{k}} \in [α_{2}, 1]

.

Set

x_{k + 1} = x_{k} + s_{k}

and

δ_{k + 1} = min {δ_{m a x}, max {δ_{m i n}, τ_{2} δ_{k}}}

.

End if.

To update the positive parameter $r_{k}$

To update the positive parameter

r_{k}

, we use the following scheme (see Algorithm 3)

Algorithm 3 To update the positive parameter

r_{k}

If

\frac{1}{2} [q_{k} (s_{k}^{n}) - q_{k} (s_{k})] \leq ∥ \nabla g_{u} (x_{k}) P (x_{k}) g_{u} (x_{k}) ∥ min {∥ \nabla g_{u} (x_{k}) P (x_{k}) g_{u} (x_{k}) ∥, δ_{k}},

(33)

Set

r_{k + 1} = r_{k}

.

Else, set

r_{k + 1} = 2 r_{k}

.

End if.

For more details see, [25].

Finally, the algorithm stopped if the termination criteria

∥ Y_{k}^{T} \nabla_{x} ℓ_{k} ∥ + ∥ \nabla g_{u_{k}} P_{k} g_{u_{k}} ∥ + ∥ c_{k} ∥ \leq ε_{1}

or

∥ s_{k} ∥ \leq ε_{2}

, for some

ε_{1}, ε_{2} > 0

is satisfied.

A trust-region algorithm

The framework of the trust-region algorithm to solve subproblem (17) are summarized as follows (see Algorithm 4).

Algorithm 4 Trust-region algorithm

Step 0. (Initialization)

Starting with

x_{0}

. Evaluate

μ_{0}

and

P_{0}

. Set

r_{0} = 1

,

σ_{0} = 1

, and

β_{0} = 0.1

.

Choose

ε_{1}

,

ε_{2}

,

τ_{1}

,

τ_{2}

,

α_{1}

, and

α_{2}

such that

0 < ε_{1}

,

0 < ε_{2}

,

0 < τ_{1} < 1 < τ_{2}

,

and

0 < α_{1} < α_{2} < 1

.

Choose

δ_{m i n}

,

δ_{m a x}

, and

δ_{0}

such that

δ_{m i n} \leq δ_{0} \leq δ_{m a x}

. Set

k = 0

.

Step 1. If

∥ Y_{k}^{T} \nabla_{x} ℓ_{k} ∥ + ∥ \nabla g_{u_{k}} P_{k} g_{u_{k}} ∥ + ∥ c_{k} ∥ \leq ε_{1}

, then stop.
Step 2. (How to compute

s_{k}

)

(a) Evaluate the normal component

s_{k}^{n}

by solving subproblem (18).

(b) Evaluate the tangential component

{\bar{s}}_{k}^{t}

by solving subproblem (22).

(c) Set

s_{k} = s_{k}^{n} + Y_{k} {\bar{s}}_{k}^{t}

.

Step 3. If

∥ s_{k} ∥ \leq ε_{2}

, then stop.
Step 4. Set

x_{k + 1} = x_{k} + s_{k}

.
Step 5. Compute

P_{k + 1}

given by (11).
Step 6. Evaluate

μ_{k + 1}

by solving the following subproblem

m i n i m i z e ∥ \nabla f_{u_{k + 1}} + \nabla c_{k + 1} μ + r_{k} \nabla g_{u_{k + 1}} P_{k + 1} g_{u_{k + 1}} ∥^{2} .

(34)

Step 7. To update the penalty parameter

σ_{k}

, using Algorithm 1.
Step 8. To test the step

s_{k}

and update the radius

δ_{k}

, using Algorithm 2.
Step 9. To update the positive parameter

r_{k}

, using Algorithm 3.
Step 10. Set

k = k + 1

and go to Step 1.

The main steps for solving the NBLP problem (1) are clarified in the following algorithm.

2.2. Fischer–Burmeister Active-Set Trust-Region Algorithm

The framework to solve NBLP problem (1) is summarized in the Algorithm 5.

Algorithm 5 FBACTR algorithm

Step 1. Use KKT optimality conditions for the lower level of problem (1) and convert it to a single objective constrained optimization problem (2).
Step 2. Using Fischer–Burmeister function (3) with

ϵ = 0.001

to obtain the smooth problem (4).
Step 3. Summarize problem (4) to the form of nonlinear optimization problem (5).
Step 4. Use the active set strategy to reduce problem (5) to problem (12).
Step 5. Use trust-region Algorithm 4 to solve problem (12) and obtained approximate solution for problem (5) which is approximate solution for problem (1).

The next section is dedicated to the global convergence analysis for the active-set with the trust-region algorithm.

3. Global Convergence Analysis

Let

{(x_{k}, μ_{k})}

be the sequence of points generated by FBACTR Algorithm 5. Let

Ω \subseteq ℜ^{n}

be a convex set which is contained all iterates

x_{k} \in ℜ^{n}

and

x_{k} + s_{k} \in ℜ^{n}

.

Standard assumptions which are needed on the set

Ω

to demonstrate global convergence theory for FBACTR Algorithm 5 are stated in the following section.

3.1. A Standard Assumptions

The next standard assumptions are required to demonstrate the global convergence theory for the FBACTR Algorithm 5.

[ ${SA}_{1}$ .] Functions $f_{u} : ℜ^{n} \to ℜ$ , $g_{u} : ℜ^{n} \to ℜ_{1}^{m}$ , $f_{l} : ℜ^{n} \to ℜ^{n_{2}}$ , and $g_{l} : ℜ^{n} \to ℜ^{m_{2}}$ are twice continuously differentiable functions for all $x \in Ω$ .
[ ${SA}_{2}$ .] The sequence of the Lagrange multiplier vectors ${μ_{k}}$ is bounded.
[ ${SA}_{3}$ .] All of $c (x)$ , $\nabla c (x)$ , $\nabla^{2} c_{i} (x)$ for $i = 1, 2, \dots, n_{2} + m_{2}$ , $g_{u} (x)$ , $\nabla g_{u} (x)$ , $\nabla^{2} g_{u_{i}} (x)$ for $i = 1, 2, \dots, m_{1}$ , and ${(\nabla c {(x)}^{T} \nabla c (x))}^{- 1}$ are uniformly bounded on $Ω$ .
[ ${SA}_{4}$ .] The matrix $\nabla c (x)$ has full column rank.
[ ${SA}_{5}$ .] The sequence of Hessian matrices ${H_{k}}$ is bounded.

Some fundamental lemmas which are needed in the proof of the main theorem introduced in the following section.

3.2. Main Lemmas

Some basic lemmas which are required to demonstrate the main theorems are presented in this section.

Lemma 1.

Under standard assumption

S A_{1}

–

S A_{5}

and at any iteration k, there exists a positive constant

K_{1}

such that:

∥ s_{k}^{n} ∥ \leq K_{1} ∥ c_{k} ∥ .

(35)

Proof.

Since the normal component

s_{k}^{n}

is normal to the tangent space, then we have:

\begin{matrix} ∥ s_{k}^{n} ∥ & = & ∥ \nabla c_{k} {(\nabla c_{k}^{T} \nabla c_{k})}^{- 1} \nabla c_{k}^{T} s_{k} ∥ \\ = & ∥ \nabla c_{k} {(\nabla c_{k}^{T} \nabla c_{k})}^{- 1} [c_{k} + \nabla c_{k}^{T} s_{k} - c_{k}] ∥ \\ \leq & ∥ \nabla c_{k} {(\nabla c_{k}^{T} \nabla c_{k})}^{- 1} ∥ [∥ c_{k} + \nabla c_{k}^{T} s_{k} ∥ + ∥ c_{k} ∥] \\ \leq & ∥ \nabla c_{k} {(\nabla c_{k}^{T} \nabla c_{k})}^{- 1} ∥ ∥ c_{k} ∥, \end{matrix}

where

∥ c_{k} + \nabla c_{k}^{T} s_{k} ∥ \leq ∥ c_{k} ∥

. Using standard assumptions

S A_{1}

–

S A_{5}

, we have the desired result. □

Lemma 2.

Under standard assumptions

S A_{1}

and

S A_{3}

, the functions

P (x) g_{u} (x)

are Lipschitz continuous in Ω.

Proof.

See Lemma (4.1) in [36]. □

From Lemma 2, we conclude that

g_{u} {(x)}^{T} P (x) g_{u} (x)

is differentiable and

\nabla g_{u} (x) P (x) g_{u} (x)

is Lipschitz continuous in

Ω

.

Lemma 3.

At any iteration k, let

A (x_{k}) \in ℜ^{(m_{1}) \times (m_{1})}

be a diagonal matrix whose diagonal entries are:

{(a_{k})}_{i} = \{\begin{matrix} 1 & i f {(g_{u_{k}})}_{i} < 0 a n d {(g_{u_{k + 1}})}_{i} \geq 0, \\ - 1 & i f {(g_{u_{k}})}_{i} \geq 0 a n d {(g_{u_{k + 1}})}_{i} < 0, \\ 0 & o t h e r w i s e, \end{matrix}

(36)

where

i = 1, 2, \dots, m_{1}

. Then

P_{k + 1} = P_{k} + A_{k} .

(37)

Proof.

See Lemma (6.2) in [21]. □

Lemma 4.

Under standard assumptions

S A_{1}

and

S A_{3}

, there exists a positive constant

K_{2}

such that

∥ A_{k} g_{u_{k}} ∥ \leq K_{2} ∥ s_{k} ∥ .

(38)

Proof.

See Lemma (6.3) in [21]. □

Lemma 5.

Under standard assumptions

S A_{1}

–

S A_{5}

, there exists a positive constant

K_{3}

such that:

∣ A r e d_{k} - P r e d_{k} ∣ \leq K_{3} σ_{k} {∥ s_{k} ∥}^{2} .

(39)

Proof.

From (37) and (27) we have:

A r e d_{k} = ℓ (x_{k}, μ_{k}) - ℓ (x_{k + 1}, μ_{k}) - Δ μ_{k}^{T} c_{k + 1} + \frac{r_{k}}{2} [g_{u_{k}}^{T} P_{k} g_{u_{k}} - g_{u_{k + 1}}^{T} (P_{k} + A_{k}) g_{u_{k + 1}}] + σ_{k} [∥ c_{k} ∥^{2} - ∥ c_{k + 1} ∥^{2}] .

(40)

From (40), (28), and using Cauchy–Schwarz inequality, we have:

\begin{matrix} ∣ A r e d_{k} - P r e d_{k} ∣ & \leq & ∣ ℓ (x_{k}, μ_{k}) + \nabla_{x} ℓ {(x_{k}, μ_{k})}^{T} s_{k} - ℓ (x_{k + 1}, μ_{k}) ∣ + ∣ Δ μ_{k}^{T} [c_{k} + \nabla c_{k}^{T} s_{k} - c_{k + 1}] ∣ \\ + \frac{r_{k}}{2} ∣ ∥ P_{k} (g_{u_{k}} + \nabla g_{u_{k}}^{T} s_{k}) ∥^{2} - g_{u_{k + 1}}^{T} (P_{k} + A_{k}) g_{u_{k + 1}} ∣ + σ_{k} ∣ ∥ c_{k} + \nabla c_{k}^{T} s_{k} ∥^{2} - {∥ c_{k + 1} ∥}^{2} ∣ . \end{matrix}

Hence,

\begin{matrix} | A r e d_{k} - P r e d_{k} | & \leq & \frac{1}{2} ∣ s_{k}^{T} (H_{k} - \nabla^{2} ℓ (x_{k} + ξ_{1} s_{k}, μ_{k})) s_{k} ∣ + \frac{1}{2} ∣ s_{k}^{T} [\nabla^{2} c (x_{k} + ξ_{2} s_{k}) Δ μ_{k}] s_{k} ∣ \\ + \frac{r_{k}}{2} ∣ s_{k}^{T} [\nabla g_{u_{k}} P_{k} \nabla g_{u_{k}}^{T} - \nabla g_{u} (x_{k} + ξ_{4} s_{k}) P_{k} \nabla g_{u} {(x_{k} + ξ_{4} s_{k})}^{T}] s_{k} ∣ \\ + \frac{r_{k}}{2} ∣ s_{k}^{T} \nabla^{2} g_{u} (x_{k} + ξ_{4} s_{k}) P_{k} g_{u} (x_{k} + ξ_{4} s_{k}) s_{k} ∣ + \frac{r_{k}}{2} {∥ A_{k} [g_{u_{k}} + \nabla g_{u} {(x_{k} + ξ_{5} s_{k})}^{T} s_{k}] ∥}^{2} \\ + σ_{k} ∣ s_{k}^{T} [\nabla c_{k} \nabla c_{k}^{T} - \nabla c (x_{k} + ξ_{6} s_{k}) \nabla c {(x_{k} + ξ_{6} s_{k})}^{T}] s_{k} ∣ \\ + σ_{k} ∣ s_{k}^{T} \nabla^{2} c (x_{k} + ξ_{6} s_{k}) c (x_{k} + ξ_{6} s_{k}) s_{k} ∣, \end{matrix}

for some

ξ_{1}

,

ξ_{2}

,

ξ_{3}

,

ξ_{4}

,

ξ_{5}

, and

ξ_{6} \in (0, 1)

. Using standard assumptions

S A_{1}

–

S A_{5}

,

σ_{k} \geq r_{k}

,

σ_{k} \geq 1

, and inequality (38), we have:

∣ A r e d_{k} - P r e d_{k} ∣ \leq κ_{1} ∥ s_{k} ∥^{2} + κ_{2} σ_{k} ∥ s_{k} ∥^{2} ∥ c_{k} ∥ + κ_{3} σ_{k} {∥ s_{k} ∥}^{3},

(41)

where

κ_{1} > 0

,

κ_{2} > 0

, and

κ_{3} > 0

are constants and independent of the iteration k. From inequality (41),

σ_{k} \geq 1

,

∥ s_{k} ∥

, and

∥ c_{k} ∥

are uniformly bounded, we obtain the desired result. □

Lemma 6.

Under standard assumptions

S A_{1}

–

S A_{5}

, there exists a positive constant

K_{4}

such that:

∥ c_{k} ∥^{2} - ∥ c_{k} + \nabla c_{k}^{T} s_{k}^{n} ∥^{2} \geq K_{4} ∥ c_{k} ∥ min {δ_{k}, ∥ c_{k} ∥} .

(42)

Proof.

We consider two cases:

Firstly, from (20), if

s_{k}^{n c p} = - \frac{δ_{k}}{∥ \nabla c_{k} c_{k} ∥} (\nabla c_{k} c_{k})

and

δ_{k} ∥ \nabla c_{k}^{T} \nabla c_{k} c_{k} ∥^{2} \leq {∥ \nabla c_{k} c_{k} ∥}^{3}

, then we have:

\begin{matrix} ∥ c_{k} ∥^{2} - {∥ c_{k} + \nabla c_{k}^{T} s_{k}^{n c p} ∥}^{2} & = & - 2 {(\nabla c_{k} c_{k})}^{T} s_{k}^{n c p} - s_{k}^{{n c p}^{T}} \nabla c_{k} \nabla c_{k}^{T} s_{k}^{n c p} \\ = & 2 δ_{k} ∥ \nabla c_{k} c_{k} ∥ - \frac{δ_{k}^{2} {∥ \nabla c_{k}^{T} \nabla c_{k} c_{k} ∥}^{2}}{∥ \nabla c_{k} c_{k} ∥^{2}} \\ \geq & 2 δ_{k} ∥ \nabla c_{k} c_{k} ∥ - δ_{k} ∥ \nabla c_{k} c_{k} ∥ \\ \geq & δ_{k} ∥ \nabla c_{k} c_{k} ∥ . \end{matrix}

(43)

Secondly, from (20), if

s_{k}^{n c p} = - \frac{∥ \nabla c_{k} c_{k} ∥^{2}}{∥ \nabla c_{k}^{T} \nabla c_{k} c_{k} ∥^{2}} (\nabla c_{k} c_{k})

and

δ_{k} ∥ \nabla c_{k}^{T} \nabla c_{k} c_{k} ∥^{2} \geq {∥ \nabla c_{k} c_{k} ∥}^{3}

, then we have:

\begin{matrix} ∥ c_{k} ∥^{2} - {∥ c_{k} + \nabla c_{k}^{T} s_{k}^{n c p} ∥}^{2} & = & - 2 {(\nabla c_{k} c_{k})}^{T} s_{k}^{n c p} - s_{k}^{{n c p}^{T}} \nabla c_{k} \nabla c_{k}^{T} s_{k}^{n c p} \\ = & \frac{2 ∥ \nabla c_{k} c_{k} ∥^{4}}{∥ \nabla c_{k}^{T} \nabla c_{k} c_{k} ∥^{2}} - \frac{∥ \nabla c_{k} c_{k} ∥^{4}}{∥ \nabla c_{k}^{T} \nabla c_{k} c_{k} ∥^{2}} \\ = & \frac{∥ \nabla c_{k} c_{k} ∥^{4}}{∥ \nabla c_{k}^{T} \nabla c_{k} c_{k} ∥^{2}} \\ \geq & \frac{∥ \nabla c_{k} c_{k} ∥^{2}}{∥ \nabla c_{k}^{T} \nabla c_{k} c_{k} ∥^{2}} . \end{matrix}

(44)

Using standard assumption

S A_{3}

, we have

∥ \nabla c_{k} c_{k} ∥ \geq \frac{∥ c_{k} ∥}{∥ {(\nabla c_{k}^{T} \nabla c_{k})}^{- 1} \nabla c_{k} ∥}

. From inequalities (19), (43), (44), and using standard assumption

S A_{2}

, we obtain the desired result.

From Algorithm 1 and Lemma 6, we have, for all k:

P r e d_{k} \geq \frac{σ_{k}}{2} K_{4} ∥ c_{k} ∥ min {δ_{k}, ∥ c_{k} ∥} .

(45)

□

Lemma 7.

Under standard assumptions

S A_{1}

–

S A_{5}

, there exists a constant

K_{5} > 0

, such that:

\begin{matrix} q_{k} (s_{k}^{n}) - q_{k} (s_{k}^{n} + Y_{k} {\bar{s}}_{k}^{t}) \geq \frac{1}{2} K_{5} ∥ Y_{k}^{T} \nabla q_{k} (s_{k}^{n}) ∥ min {Δ_{k}, \frac{∥ Y_{k}^{T} \nabla q_{k} (s_{k}^{n}) ∥}{∥ {\bar{B}}_{k} ∥}}, \end{matrix}

(46)

where

{\bar{B}}_{k} = Y_{k}^{T} B_{k} Y_{k}

.

Proof.

We consider two cases:

Firstly, from (24), if

{\bar{s}}_{k}^{t c p} = - \frac{Δ_{k}}{∥ Y_{k}^{T} \nabla q_{k} (s_{k}^{n}) ∥} Y_{k}^{T} \nabla q_{k} (s_{k}^{n})

and

Δ_{k} {(Y_{k}^{T} \nabla q_{k} (s_{k}^{n}))}^{T} {\bar{B}}_{k} Y_{k}^{T} \nabla q_{k} (s_{k}^{n}) \leq {∥ Y_{k}^{T} \nabla q_{k} (s_{k}^{n}) ∥}^{3}

, then we have:

\begin{matrix} q_{k} (s_{k}^{n}) - q_{k} (s_{k}^{n} + Y_{k} {\bar{s}}_{k}^{t c p}) & = & - {(Y_{k}^{T} \nabla q_{k} (s_{k}^{n}))}^{T} {\bar{s}}_{k}^{t c p} - \frac{1}{2} {\bar{s}}_{k}^{{t c p}^{T}} {\bar{B}}_{k} {\bar{s}}_{k}^{t c p} \\ = & Δ_{k} ∥ Y_{k}^{T} \nabla q_{k} (s_{k}^{n}) ∥ \\ - \frac{Δ_{k}^{2}}{2 ∥ Y_{k}^{T} \nabla q_{k} (s_{k}^{n}) ∥^{2}} [{(Y_{k}^{T} \nabla q_{k} (s_{k}^{n}))}^{T} {\bar{B}}_{k} Y_{k}^{T} \nabla q_{k} (s_{k}^{n})] \\ \geq & Δ_{k} ∥ Y_{k}^{T} \nabla q_{k} (s_{k}^{n}) ∥ - \frac{1}{2} Δ_{k} ∥ Y_{k}^{T} \nabla q_{k} (s_{k}^{n}) ∥ \\ \geq & \frac{1}{2} Δ_{k} ∥ Y_{k}^{T} \nabla q_{k} (s_{k}^{n}) ∥ . \end{matrix}

(47)

Secondly, from (24), if

{\bar{s}}_{k}^{t c p} = - \frac{∥ Y_{k}^{T} \nabla q_{k} (s_{k}^{n}) ∥^{2}}{Y_{k}^{T} \nabla q_{k} (s_{k}^{n}))^{T} {\bar{B}}_{k} Y_{k}^{T} \nabla q_{k} (s_{k}^{n})} Y_{k}^{T} \nabla q_{k} (s_{k}^{n})

and

Δ_{k} {(Y_{k}^{T} \nabla q_{k} (s_{k}^{n}))}^{T} {\bar{B}}_{k} Y_{k}^{T} \nabla q_{k} (s_{k}^{n}) \geq {∥ Y_{k}^{T} \nabla q_{k} (s_{k}^{n}) ∥}^{3}

, then we have:

\begin{matrix} q_{k} (s_{k}^{n}) - q_{k} (s_{k}^{n} + Y_{k} {\bar{s}}_{k}^{t c p}) & = & - {(Y_{k}^{T} \nabla q_{k} (s_{k}^{n}))}^{T} {\bar{s}}_{k}^{t c p} - \frac{1}{2} {\bar{s}}_{k}^{{t c p}^{T}} {\bar{B}}_{k} {\bar{s}}_{k}^{t c p} \\ = & \frac{∥ Y_{k}^{T} \nabla q_{k} (s_{k}^{n}) ∥^{4}}{{(Y_{k}^{T} \nabla q_{k} (s_{k}^{n}))}^{T} {\bar{B}}_{k} Y_{k}^{T} \nabla q_{k} (s_{k}^{n})} \\ - \frac{∥ Y_{k}^{T} \nabla q_{k} (s_{k}^{n}) ∥^{4}}{2 {(Y_{k}^{T} \nabla q_{k} (s_{k}^{n}))}^{T} {\bar{B}}_{k} Y_{k}^{T} \nabla q_{k} (s_{k}^{n})} \\ = & \frac{∥ Y_{k}^{T} \nabla q_{k} (s_{k}^{n}) ∥^{4}}{2 {(Y_{k}^{T} \nabla q_{k} (s_{k}^{n}))}^{T} {\bar{B}}_{k} Y_{k}^{T} \nabla q_{k} (s_{k}^{n})} \\ \geq & \frac{∥ Y_{k}^{T} \nabla q_{k} (s_{k}^{n}) ∥^{2}}{2 ∥ {\bar{B}}_{k} ∥} . \end{matrix}

(48)

□

From inequalities (23), (47), (48), and using standard assumptions

S A_{1}

–

S A_{5}

, we obtain the desired result.

The next lemma shows that FBACTR algorithm cannot be looped infinitely without finding an acceptable step.

Lemma 8.

Under standard assumptions

S A_{1}

–

S A_{5}

, if there exists

ε > 0

such that

∥ c_{k} ∥ \geq ε

, then

\frac{A r e d_{k^{j}}}{P r e d_{k^{j}}} \geq α_{1}

for some finite j.

Proof.

Using (39), (45), and from

∥ c_{k} ∥ \geq ε

, we have:

|\frac{A r e d_{k}}{P r e d_{k}} - 1| = \frac{∣ A r e d_{k} - P r e d_{k} ∣}{P r e d_{k}} \leq \frac{2 K_{3} δ_{k}^{2}}{K_{4} ε min {ε, δ_{k}}} .

δ_{k^{j}}

becomes small as

s_{k^{j}}

gets rejected and eventually we will have:

|\frac{A r e d_{k^{j}}}{P r e d_{k^{j}}} - 1| \leq \frac{2 K_{3} δ_{k^{j}}}{K_{4} ε} .

Then, the acceptance rule will be met for finite j. This completes the proof. □

Lemma 9.

Under standard assumptions

S A_{1}

–

S A_{5}

and if jth trial step of iteration k satisfies

∥ s_{k^{j}} ∥ \leq min {\frac{(1 - α_{1}) K_{4}}{4 K_{3}}, 1} ∥ c_{k} ∥,

(49)

then it must be accepted.

Proof.

This lemma is proved by contradiction. In case of assuming that inequality (49) holds, the trial step

s_{k^{j}}

is rejected, by using inequalities (39), (45), and (49), then:

(1 - α_{1}) < \frac{∣ A r e d_{k^{j}} - P r e d_{k^{j}} ∣}{P r e d_{k^{j}}} < \frac{2 K_{3} {∥ s_{k^{j}} ∥}^{2}}{K_{4} ∥ c_{k} ∥ ∥ s_{k^{j}} ∥} \leq \frac{(1 - α_{1})}{2} .

This contradiction and therefore the lemma is proved.

□

Lemma 10.

Under standard assumptions

S A_{1}

–

S A_{5}

, there exists

δ_{k^{j}}

satisfies:

δ_{k^{j}} \geq min {\frac{δ_{m i n}}{β_{1}}, \frac{τ_{1} (1 - α_{1}) K_{4}}{4 K_{3}}, τ_{1}} ∥ c_{k} ∥,

(50)

for all trial steps j of any iteration k where

β_{1}

is a positive constant independent of k or j.

Proof.

At any trial iterate

k^{j}

of any iteration k, we consider two cases:

Firstly, if

j = 1

, then the step is accepted. That is

δ_{k}^{1} \geq δ_{m i n}

and take

β_{1} = {sup}_{x \in Ω} ∥ c_{k} ∥

, we have

δ_{k} \geq δ_{m i n} \geq \frac{δ_{m i n}}{β_{1}} ∥ c_{k} ∥ .

(51)

Secondly, if

j > 1

, then there is at least one trial step which is rejected. From Lemma 9, we have

∥ s_{k^{i}} ∥ > min {\frac{(1 - α_{1}) K_{4}}{4 K_{3}}, 1} ∥ c_{k} ∥,

for all trial steps

i = 1, 2, \dots j - 1

which are rejected. Since

s_{k^{i}}

is a trial step which is rejected, then from the previous inequality and Algorithm 2, we have:

δ_{k^{j}} = τ_{1} ∥ s_{k^{j - 1}} ∥ > τ_{1} min {\frac{(1 - α_{1}) K_{4}}{4 K_{3}}, 1} ∥ c_{k} ∥ .

From inequality (51) and the above inequality, we obtain the desired result.

The next lemma obviously shows that as long as

∥ c_{k} ∥

is bounded away from zero, the radius of the trust-region is bounded away from zero. □

Lemma 11.

Under standard assumptions

S A_{1}

–

S A_{5}

, if there exists

ε > 0

such that

∥ c_{k} ∥ \geq ε

. Then there exists

K_{6} > 0

such that:

δ_{k^{j}} \geq K_{6} .

Proof.

Let

K_{6} = ε min {\frac{δ_{m i n}}{β_{1}}, \frac{τ_{1} (1 - α_{1}) K_{4}}{4 K_{3}}, τ_{1}},

(52)

and using (50), the proof follows directly. □

In the next section, the iteration sequence convergence is studied when

r_{k} \to \infty

.

3.3. Convergence When the Positive Parameter $r_{k} \to \infty$

This section is devoted to the convergence of the iteration sequence when the positive parameter

r_{k}

goes to infinity.

Notice that, we do not require [

\nabla g_{u_{i}} (x)

,

i \in I (x)

] has full column rank in standard assumption

S A_{4}

, so, we may have other kinds of stationary points, which are defined in the following definitions.

Definition 2.

A feasible Fritz John (FFJ) point is a point

x_{*}

that satisfies the following FFJ conditions:

\begin{matrix} η_{*} \nabla f_{u} (x_{*}) + \nabla c (x_{*}) μ_{*} + \nabla g_{u} (x_{*}) λ_{*} & = & 0, \\ c (x_{*}) & = & 0, \\ P (x_{*}) g (x_{*}) & = & 0, \\ {(λ_{*})}_{i} g_{u_{i}} (x_{*}) & = & 0, i = 1, \dots, m_{1}, \\ η_{*}, {(λ_{*})}_{i} & \geq & 0, i = 1, \dots, m_{1} . \end{matrix}

where

η_{*}

,

μ_{*}

, and

λ_{*}

are not all zeros. For more details see [18].

If

η_{*} \neq 0

, then the point

(x_{*}, 1, \frac{μ_{*}}{η_{*}}, \frac{λ_{*}}{η_{*}})

is called a KKT point and FFJ conditions are called KKT conditions.

Definition 3.

An infeasible Fritz John (IFJ) point is a point

x_{*}

that satisfies the following IFJ conditions:

\begin{matrix} η_{*} \nabla f_{u} (x_{*}) + \nabla c (x_{*}) μ_{*} + \nabla g_{u} (x_{*}) λ_{*} & = & 0, \\ c (x_{*}) & = & 0, \\ \nabla g_{u} (x_{*}) P (x_{*}) g_{u} (x_{*}) & = & 0 b u t ∥ P (x_{*}) g_{u} (x_{*}) ∥ > 0, \\ {(λ_{*})}_{i} g_{u_{i}} (x_{*}) & \geq & 0, i = 1, \dots, m_{1}, \\ η_{*}, {(λ_{*})}_{i} & \geq & 0, i = 1, \dots, m_{1}, \end{matrix}

where

η_{*}

,

μ_{*}

, and

λ_{*}

are not all zeros. For more details see [18].

If

η_{*} \neq 0

, then the point

(x_{*}, 1, \frac{μ_{*}}{η_{*}}, \frac{λ_{*}}{η_{*}})

is called an infeasible KKT point and IFJ conditions are called infeasible KKT conditions.

Lemma 12.

Under standard assumptions

S A_{1}

–

S A_{5}

, a subsequence

{k_{i}}

of the sequence of the iteration satisfies IFJ conditions if the following conditions satisfied:

(i): ${lim}_{k_{i} \to \infty} c (x_{k_{i}}) = 0 .$
(ii): ${lim}_{k_{i} \to \infty} ∥ P_{k_{i}} g_{u} (x_{k_{i}}) ∥ > 0$ .
(iii): ${lim}_{k_{i} \to \infty} \{m i n_{s \in ℜ^{n - m_{1} + 1}} {∥ P_{k_{i}} (g_{u_{k_{i}}} + \nabla g_{u_{k_{i}}}^{T} Y_{k_{i}} {\bar{s}}^{t}) ∥}^{2}\} = {lim}_{k_{i} \to \infty} {∥ P_{k_{i}} g_{u_{k_{i}}} ∥}^{2} .$

Proof.

For simplification and without loss of generality, let

{k_{i}}

represents the whole sequence

{k}

. Assume that

{\tilde{s}}_{k}

is the solution of the subproblem

m i n i m i z e_{{\bar{s}}^{t}} {∥ P_{k} (g_{u_{k}} + \nabla g_{u_{k}}^{T} Y_{k} {\bar{s}}^{t}) ∥}^{2}

, then it satisfies the following equation:

Y_{k}^{T} \nabla g_{u_{k}} P_{k} g_{u_{k}} + Y_{k}^{T} \nabla g_{u_{k}} P_{k} \nabla g_{u_{k}}^{T} Y_{k} {\tilde{s}}_{k} = 0 .

(53)

It also satisfies the right hand side of Condition (iii). That is,

lim_{k \to \infty} {2 {\tilde{s}}_{k}^{T} Y_{k}^{T} \nabla g_{u_{k}} P_{k} g_{u_{k}} + {\tilde{s}}_{k}^{T} Y_{k}^{T} \nabla g_{u_{k}} P_{k} \nabla g_{u_{k}}^{T} Y_{k} {\tilde{s}}_{k}} = 0 .

(54)

We will consider two cases:

Firstly, if

{lim}_{k \to \infty} {\tilde{s}}_{k} = 0

, then from Equation (53) we have

{lim}_{k \to \infty} Y_{k}^{T} \nabla g_{u_{k}} P_{k} g_{u_{k}} = 0

.

Secondly, if

{lim}_{k \to \infty} {\tilde{s}}_{k} \neq 0

, then by multiplying Equation (53) from the left by

2 {\tilde{s}}_{k}^{T}

and subtract it from Equation (54), we have

{lim}_{k \to \infty} {∥ P_{k} \nabla g_{u_{k}}^{T} Y_{k} {\tilde{s}}_{k} ∥}^{2} = 0

. Hence

{lim}_{k \to \infty} Y_{k}^{T} \nabla g_{u_{k}} P_{k} g_{u_{k}} = 0

. That is, in two cases, we have

lim_{k \to \infty} Y_{k}^{T} \nabla g_{u_{k}} P_{k} g_{u_{k}} = 0 .

(55)

Since

{lim}_{k \to \infty} ∥ P_{k} g_{u_{k}} ∥ > 0

, then

{lim}_{k \to \infty} {(P_{k} g_{u_{k}})}_{i} \geq 0

, for

i = 1, \dots, m_{1}

and

{lim}_{k \to \infty} {(P_{k} g_{u_{k}})}_{i} > 0

, for some i. Let

{(λ_{k})}_{i} = {(P_{k} g_{u_{k}})}_{i}

,

i = 1, \dots, m_{1}

, then

{lim}_{k \to \infty} Y_{k}^{T} \nabla g_{u_{k}} λ_{k} = 0

. Hence, there exists a sequence of

{μ_{k}}

such that

{lim}_{k \to \infty} Y_{k}^{T} {\nabla c_{k} μ_{k} + \nabla g_{u_{k}} λ_{k}} = 0

. That is, IFJ conditions hold in the limit with

η_{*} = 0

, see Definition 3. □

Lemma 13.

Under standard assumptions,

S A_{1}

–

S A_{5}

, a subsequence

{k_{i}}

of the sequence of the iteration satisfies FFJ conditions if the following conditions are satisfied:

(i): ${lim}_{k_{i} \to \infty} c (x_{k_{i}}) = 0 .$
(ii): For all $k_{i}$ , $∥ P_{k_{i}} g_{u_{k_{i}}} ∥ > 0$ and ${lim}_{k_{i} \to \infty} P_{k_{i}} g_{u_{k_{i}}} = 0 .$
(iii): ${lim}_{k_{i} \to \infty} \{{min}_{s \in ℜ^{n - m_{1} + 1}} \frac{∥ P_{k_{i}} (g_{u_{k_{i}}} + \nabla g_{u_{k_{i}}}^{T} Y_{k_{i}} {\bar{s}}^{t}) ∥^{2}}{∥ P_{k_{i}} g_{u_{k_{i}}} ∥^{2}}\} = 1$ .

Proof.

For simplification and without loss of generality, let

{k_{i}}

represents the whole sequence

{k}

. Notice that the following equation,

lim_{k \to \infty} \{min_{ν \in ℜ^{n - m_{1} + 1}} \{∥ U_{k} + P_{k} \nabla g_{k}^{T} Y_{k} {ν ∥}^{2}\}\} = 1,

(56)

is equivalent to Condition (iii), where

U_{k}

is a unit vector in the direction of

P_{k} g_{u_{k}}

and

ν = \frac{{\bar{s}}^{t}}{∥ P_{k} g_{u_{k}} ∥}

. Let

{\tilde{ν}}_{k}

be a solution of the following problem:

min_{ν \in ℜ^{n - m_{1} + 1}} \{∥ U_{k} + P_{k} \nabla g_{k}^{T} Y_{k} {ν ∥}^{2}\} .

(57)

Hence,

Y_{k}^{T} \nabla g_{u_{k}} P_{k} \nabla g_{u_{k}}^{T} Y_{k} {\tilde{ν}}_{k} + Y_{k}^{T} \nabla g_{u_{k}} P_{k} U_{k} = 0 .

(58)

Now two cases are considered:

Firstly, if

{lim}_{k \to \infty} Y_{k} {\tilde{ν}}_{k} = 0

and using (58), then

{lim}_{k \to \infty} Y_{k}^{T} \nabla g_{u_{k}} P_{k} U_{k} = 0

.

Secondly, if

{lim}_{k \to \infty} Y_{k} {\tilde{ν}}_{k} \neq 0

, then from (56) and the fact that

{\tilde{ν}}_{k}

is a solution of problem (57) we have:

lim_{k \to \infty} {{\bar{ν_{k}}}^{T} Y_{k}^{T} \nabla g_{u_{k}} P_{k} \nabla g_{u_{k}}^{T} Y_{k} {\tilde{ν}}_{k} + 2 U_{k}^{T} P_{k} \nabla g_{u_{k}}^{T} Y_{k} {\tilde{ν}}_{k}} = 0 .

Multiplying Equation (58) from the left by

2 {\tilde{ν}}_{k}^{T}

and subtracting it from the above limit, we have the following equation:

{lim}_{k \to \infty} {\tilde{ν}}_{k}^{T} Y_{k}^{T} \nabla g_{u_{k}} P_{k} \nabla g_{u_{k}}^{T} Y_{k} {\tilde{ν}}_{k} = 0

. That is

{lim}_{k \to \infty} \{Y_{k} \nabla g_{u_{k}} P_{k} U_{k}\} = 0

. Hence in both cases, we have

{lim}_{k \to \infty} \{Y_{k} \nabla g_{u_{k}} P_{k} U_{k}\} = 0

. The remnant of the proof follows using cases similar to those in Lemma 12. □

Lemma 14.

Under standard assumptions

S A_{1}

–

S A_{5}

, if k represents the index of iteration at which

σ_{k}

is increased, then we have:

r_{k} {∥ c_{k} ∥}^{2} \leq K_{7},

(59)

where

K_{7}

is a positive constant.

Proof.

Since

σ_{k}

is increased, then from Algorithm 1 we have:

\begin{matrix} \frac{σ_{k}}{2} [∥ c_{k} ∥^{2} - ∥ c_{k} + \nabla c_{k}^{T} s_{k} ∥^{2}] & = & [q_{k} (s_{k}) - q_{k} (0) + Δ μ_{k}^{T} (c_{k} + \nabla c_{k}^{T} s_{k})] + \frac{β_{0}}{2} [∥ c_{k} ∥^{2} - ∥ c_{k} + \nabla c_{k}^{T} s_{k} ∥^{2}] . \end{matrix}

From (42), (50), and using the above equality, we have:

\begin{matrix} \frac{σ_{k}}{2} K_{4} {∥ c_{k} ∥}^{2} & min & {\frac{δ_{m i n}}{β_{1}}, \frac{τ_{1} (1 - α_{1}) K_{4}}{4 K_{3}}, τ_{1}, 1} \leq \nabla_{x} ℓ_{k}^{T} s_{k} + \frac{1}{2} s_{k}^{T} H_{k} s_{k} + Δ μ_{k}^{T} (c_{k} + \nabla c_{k}^{T} s_{k}) \\ + & \frac{r_{k}}{2} [∥ P_{k} (g_{u_{k}} + \nabla g_{u_{k}}^{T} s_{k}) ∥^{2} - ∥ P_{k} g_{u_{k}} ∥^{2}] + \frac{β_{0}}{2} [∥ c_{k} ∥^{2} - ∥ c_{k} + \nabla c_{k}^{T} s_{k} ∥^{2}] . \end{matrix}

However,

σ_{k} \geq r_{k}^{2}

, then:

\begin{matrix} \frac{r_{k}^{2}}{2} K_{4} {∥ c_{k} ∥}^{2} min {\frac{δ_{m i n}}{β_{1}}, \frac{τ_{1} (1 - α_{1}) K_{4}}{4 K_{3}}, τ_{1}, 1} & \leq & \nabla_{x} ℓ {(x_{k}, μ_{k})}^{T} s_{k} + \frac{1}{2} s_{k}^{T} H_{k} s_{k} + Δ μ_{k}^{T} (c_{k} + \nabla c_{k}^{T} s_{k}) \\ + & \frac{r_{k}}{2} [∥ P_{k} (g_{u_{k}} + \nabla g_{u_{k}}^{T} s_{k}) ∥^{2} + \frac{β_{0}}{2} {∥ c_{k} ∥}^{2} . \end{matrix}

Hence,

\begin{matrix} \frac{r_{k}}{2} K_{4} {∥ c_{k} ∥}^{2} min {\frac{δ_{m i n}}{β_{1}}, \frac{τ_{1} (1 - α_{1}) K_{4}}{4 K_{3}}, τ_{1}, 1} & \leq & \frac{1}{r_{k}} [\nabla_{x} ℓ_{k}^{T} s_{k} + \frac{1}{2} s_{k}^{T} H_{k} s_{k} + Δ μ_{k}^{T} (c_{k} + \nabla c_{k}^{T} s_{k}) \\ + & \frac{β_{0}}{2} ∥ c_{k} ∥^{2}] + \frac{1}{2} {∥ P_{k} (g_{u_{k}} + \nabla g_{u_{k}}^{T} s_{k}) ∥}^{2} \\ \leq & \frac{1}{r_{k}} [| \nabla_{x} ℓ_{k}^{T} s_{k} | + \frac{1}{2} | s_{k}^{T} H_{k} s_{k} | + | Δ μ_{k}^{T} (c_{k} + \nabla c_{k}^{T} s_{k}) | \\ + & \frac{β_{0}}{2} ∥ c_{k} ∥^{2}] + \frac{1}{2} {∥ P_{k} (g_{u_{k}} + \nabla g_{u_{k}}^{T} s_{k}) ∥}^{2} . \end{matrix}

From Cauchy–Schwarz inequality, standard assumptions

S A_{3}

–

S A_{5}

, and the fact that

∥ s_{k} ∥ \leq δ_{m a x}

, the proof is completed. □

Lemma 15.

Under standard assumptions

S A_{1}

–

S A_{5}

, if

r_{k} \to \infty

and there is an infinite subsequence

{k_{i}}

of the sequence of the iteration at which

σ_{k}

is increased, then:

lim_{k_{i} \to \infty} ∥ c_{k_{i}} ∥ = 0 .

(60)

Proof.

From lemma (59) and using

r_{k}

is unbounded, the proof is completed. □

Theorem 1.

Under standard assumptions

S A_{1}

–

S A_{5}

, if

r_{k} \to \infty

as

k \to \infty

, then

lim_{k \to \infty} ∥ c_{k} ∥ = 0 .

(61)

Proof.

See Theorem 4.18 [22]. □

Lemma 16.

Under standard assumptions

S A_{1}

–

S A_{5}

, if there exists a subsequence

{k_{j}}

of indices indexing iterates that satisfy

∥ P_{k} g_{u_{k}} ∥ \geq ε > 0

for all

k \in {k_{j}}

and

r_{k} \to \infty

as

k \to \infty

. Then a subsequence of the iteration sequence indexed

{k_{j}}

satisfies IFJ conditions in the limit.

Proof.

For simplification and without loss of generality, the total sequence

{k}

denotes to

{k_{j}}

. This lemma is proved by contradiction, therefor we suppose there is no subsequence of the sequence

{k}

satisfies IFJ conditions in the limit. Using Lemma 12, we have for all k,

| ∥ P_{k} g_{u_{k}} ∥^{2} - ∥ P_{k} (g_{u_{k}} + \nabla g_{u_{k}}^{T} Y_{k} {\bar{s}}_{k}^{t}) ∥^{2} | \geq ε_{1}

for some

ε_{1} > 0

. From (55), we have

∥ Y_{k} \nabla g_{u_{k}} P_{k} g_{u_{k}} ∥ \geq ε_{2}

, for some

ε_{2} > 0

, hence:

\begin{matrix} ∥ Y_{k}^{T} \nabla g_{u_{k}} P_{k} g_{u_{k}} + Y_{k}^{T} \nabla g_{u_{k}} P_{k} \nabla g_{u_{k}}^{T} s_{k}^{n} ∥ & \geq & ∥ Y_{k}^{T} \nabla g_{u_{k}} P_{k} g_{u_{k}} ∥ - ∥ Y_{k}^{T} \nabla g_{u_{k}} P_{k} \nabla g_{u_{k}}^{T} ∥ ∥ s_{k}^{n} ∥ \\ \geq & ε_{2} - K_{1} ∥ Y_{k}^{T} \nabla g_{u_{k}} P_{k} \nabla g_{u_{k}}^{T} ∥ ∥ c_{k} ∥ . \end{matrix}

Since

{∥ c_{k} ∥}

converges to zero and

∥ Y_{k}^{T} \nabla g_{u_{k}} P_{k} \nabla g_{u_{k}}^{T} ∥

is bounded, then we can write:

∥ Y_{k}^{T} \nabla g_{u_{k}} P_{k} g_{u_{k}} + Y_{k}^{T} \nabla g_{u_{k}} P_{k} \nabla g_{u_{k}}^{T} s_{k}^{n} ∥ \geq \frac{ε_{2}}{2}

. Therefore,

\begin{matrix} ∥ Y_{k}^{T} \nabla q_{k} (s_{k}^{n}) ∥ & \geq & r_{k} ∥ Y_{k}^{T} \nabla g_{u_{k}} P_{k} g_{u_{k}} + Y_{k}^{T} \nabla g_{u_{k}} P_{k} \nabla g_{u_{k}}^{T} s_{k}^{n} ∥ - ∥ Y_{k}^{T} \nabla_{x} ℓ_{k} + Y_{k}^{T} H_{k} s_{k}^{n} ∥ \\ \geq & r_{k} \frac{ε_{2}}{2} - ∥ Y_{k}^{T} \nabla_{x} ℓ_{k} + Y_{k}^{T} H_{k} s_{k}^{n} ∥ \\ \geq & r_{k} [\frac{ε_{2}}{2} - \frac{1}{r_{k}} ∥ Y_{k}^{T} \nabla_{x} ℓ_{k} + Y_{k}^{T} H_{k} s_{k}^{n} ∥] . \end{matrix}

From (46), we have:

\begin{matrix} q_{k} (s_{k}^{n}) - q_{k} (s_{k}) & \geq & \frac{K_{5}}{2} r_{k} [\frac{ε_{2}}{2} - \frac{1}{r_{k}} ∥ Y_{k}^{T} [\nabla_{x} ℓ_{k} + H_{k} s_{k}^{n}] ∥] min {Δ_{k}, \frac{\frac{ε_{2}}{2} - \frac{1}{r_{k}} ∥ Y_{k}^{T} [\nabla_{x} ℓ_{k} + H_{k} s_{k}^{n}] ∥}{∥ Y_{k}^{T} \nabla g_{u_{k}} P_{k} \nabla g_{u_{k}}^{T} Y_{k} ∥ + \frac{1}{r_{k}} ∥ Y_{k}^{T} H_{k} Y_{k} ∥}} . \end{matrix}

For a k sufficiently large, we have:

q_{k} (s_{k}^{n}) - q_{k} (s_{k}) \geq \frac{K_{5} ε_{2}}{4} r_{k} min {Δ_{k}, \frac{ε_{2}}{2 ∥ Y_{k}^{T} \nabla g_{u_{k}} P_{k} \nabla g_{u_{k}}^{T} Y_{k} ∥}} .

From Algorithm 3,

{r_{k}}

is boundless only if there exist an infinite subsequence of indices

{k_{i}}

, at which:

\frac{1}{2} [q_{k} (s_{k}^{n}) - q_{k} (s_{k})] < ∥ \nabla g_{u_{k}} P_{k} g_{u_{k}} ∥ min {∥ \nabla g_{u_{k}} P_{k} g_{u_{k}} ∥, δ_{k}} .

(62)

Since

r_{k} \to \infty

, therefore an infinite number of acceptable iterates at which (62) holds and from the way of updating

r_{k}

, we have

r_{k} \to \infty

as

k \to \infty

. This gives a contradiction unless

r_{k} δ_{k}

is bounded and hence

δ_{k} \to 0

. Therefore

∥ s_{k} ∥ \to 0

. We will consider two cases:

Firstly, if

∥ P_{k} g_{u_{k}} ∥^{2} - {∥ P_{k} (g_{u_{k}} + \nabla g_{u_{k}}^{T} Y_{k} {\bar{s}}_{k}^{t}) ∥}^{2} > ε_{1}

, then we have

r_{k} {∥ P_{k} g_{u_{k}} ∥^{2} - ∥ P_{k} (g_{u_{k}} + \nabla g_{u_{k}}^{T} Y_{k} {\bar{s}}_{k}^{t}) ∥^{2}} > r_{k} ε_{1} \to \infty .

(63)

Using (63) and standard assumptions

S A_{3} - S A_{5}

, we have

[q_{k} (s_{k}^{n}) - q_{k} (s_{k})] \to \infty

. That is, the left hand side of inequality (62) goes to infinity while the right hand side tends to zero and this is a contradiction in this case.

Secondly, if

∥ P_{k} g_{u_{k}} ∥^{2} - {∥ P_{k} (g_{u_{k}} + \nabla g_{u_{k}}^{T} Y_{k} {\bar{s}}_{k}^{t}) ∥}^{2} < - ε_{1}

, then

r_{k} {∥ P_{k} g_{u_{k}} ∥^{2} - ∥ P_{k} (g_{u_{k}} + \nabla g_{u_{k}}^{T} Y_{k} {\bar{s}}_{k}^{t}) ∥^{2} |} < - r_{k} ε_{1} \to - \infty,

where

r_{k} \to \infty

as

k \to \infty

and similar to the first case,

[q_{k} (s_{k}^{n}) - q_{k} (s_{k})] \to - \infty

. This is a contradiction with

[q_{k} (s_{k}^{n}) - q_{k} (s_{k})] > 0

. The lemma is proved. □

Lemma 17.

Under standard assumptions

S A_{1}

–

S A_{5}

, if

r_{k} \to \infty

as

k \to \infty

, and there exists a subsequence indexed

{k_{j}}

of iterates that satisfy

∥ P_{k} g_{u_{k}} ∥ > 0

for all

k \in {k_{j}}

and

{lim}_{k_{j} \to \infty} ∥ P_{k_{j}} g_{u_{k_{j}}} ∥ = 0

, then a subsequence of the sequence of iterates indexed

{k_{j}}

satisfies FFJ conditions in the limit.

Proof.

Without loss of generality, let

{k_{j}}

be the whole iteration sequence

{k}

to simplify. This lemma is proved by contradiction and so suppose that there is no subsequence that satisfies FFJ conditions in the limit. From condition (iii) of Lemma 13, for all k sufficiently large, there exists a constant

ε_{3} > 0

such that:

\frac{∣ ∥ P_{k} g_{u_{k}} ∥^{2} - {∥ P_{k} (g_{u_{k}} + \nabla g_{u_{k}}^{T} Y_{k} {\bar{s}}_{k}^{t}) ∥}^{2} ∣}{∥ P_{k} g_{u_{k}} ∥^{2}} \geq ε_{3} .

(64)

The following two cases are considered:

Firstly, if

lim i n f_{k \to \infty} \frac{{\bar{s}}_{k}^{t}}{∥ P_{k} g_{u_{k}} ∥} = 0

, then there is a contradiction with inequality (64).

Secondly, if

lim s u p_{k \to \infty} \frac{{\bar{s}}_{k}^{t}}{∥ P_{k} g_{u_{k}} ∥} = \infty

, then from subproblem (22) we have:

Y_{k}^{T} \nabla q_{k} (s_{k}^{n}) = - Y_{k}^{T} (B_{k} + υ_{k} I) Y_{k} {\bar{s}}_{k}^{t},

(65)

where

υ_{k} \geq 0

represents the Lagrange multiplier vector, which is associated with the constraint

∥ Y_{k} {\bar{s}}^{t} ∥ \leq Δ_{k}

. From (65) and (46), we have:

q_{k} (s_{k}^{n}) - q_{k} (s_{k}) \geq \frac{K_{5}}{2} ∥ Y_{k}^{T} \nabla q_{k} (s_{k}^{n}) ∥ min {Δ_{k}, \frac{∥ Y_{k}^{T} [\frac{1}{r_{k}} H_{k} + (\frac{υ_{k}}{r_{k}} I + \nabla g_{u_{k}} P_{k} \nabla g_{u_{k}}^{T})] Y_{k} {\bar{s}}_{k}^{t} ∥}{∥ Y_{k}^{T} (\frac{1}{r_{k}} H_{k} + \nabla g_{u_{k}} P_{k} \nabla g_{u_{k}}^{T}) Y_{k} ∥}} .

(66)

Since

r_{k} \to \infty

as

k \to \infty

, then there exists an infinite number of acceptable steps such that inequality (62) holds. However, inequality (62) can be written as follows:

\frac{1}{2} [q_{k} (s_{k}^{n}) - q_{k} (s_{k})] < β_{2}^{2} {∥ P_{k} g_{u_{k}} ∥}^{2},

(67)

where

β_{2} = s u p_{x \in Ω} ∥ Y_{k} \nabla g_{u_{k}} ∥

.

From (66) and (67), we have:

\frac{K_{5}}{2} ∥ Y_{k}^{T} \nabla q_{k} (s_{k}^{n}) ∥ min {\frac{Δ_{k}}{∥ P_{k} g_{u_{k}} ∥}, \frac{∥ Y_{k}^{T} [\frac{1}{r_{k}} H_{k} + (\frac{υ_{k}}{r_{k}} I + \nabla g_{u_{k}} P_{k} \nabla g_{u_{k}}^{T})] Y_{k} {\bar{s}}_{k}^{t} ∥}{∥ Y_{k}^{T} (\frac{1}{r_{k}} H_{k} + \nabla g_{u_{k}} P_{k} \nabla g_{u_{k}}^{T}) Y_{k} ∥ ∥ P_{k} g_{u_{k}} ∥}} < 2 β_{2}^{2} ∥ P_{k} g_{u_{k}} ∥ .

However, in previous inequality, the right hand side tends to zero as

k \to \infty

and also

{lim}_{k_{i} \to \infty} \frac{{\bar{s}}_{k_{i}}^{t}}{∥ P_{k_{i}} g_{k_{i}} ∥} = \infty

along the subsequence

{k_{i}}

. Therefore,

∥ Y_{k_{i}}^{T} \nabla q_{k_{i}} (s_{k_{i}}^{n}) ∥ \frac{∥ Y_{k_{i}}^{T} [\frac{1}{r_{k_{i}}} H_{k_{i}} + (\frac{υ_{k_{i}}}{r_{k_{i}}} I + \nabla g_{k_{i}} P_{k_{i}} \nabla g_{k_{i}}^{T})] Y_{k_{i}} {\bar{s}}_{k_{i}}^{t} ∥}{∥ Y_{k_{i}}^{T} (\frac{1}{r_{k_{i}}} H_{k_{i}} + \nabla g_{k_{i}} P_{k_{i}} \nabla g_{k_{i}}^{T}) Y_{k_{i}} ∥ ∥ P_{k_{i}} g_{k_{i}} ∥},

is bounded. That is, either

\frac{{\bar{s}}_{k_{i}}^{t}}{∥ P_{k_{i}} g_{k_{i}} ∥}

lies in the null space of

Y_{k_{i}}^{T} (\frac{υ_{k_{i}}}{r_{k_{i}}} I + \nabla g_{k_{i}} P_{k_{i}} \nabla g_{k_{i}}^{T}) Y_{k_{i}}^{T}

or

∥ Y_{k_{i}} \nabla q_{k_{i}} (s_{k_{i}}^{n}) ∥ \to 0

.

The first possibility occurs only when

\frac{υ_{k_{i}}}{r_{k_{i}}} \to 0

as

k_{i} \to \infty

and

\frac{{\bar{s}}_{k_{i}}^{t}}{∥ P_{k_{i}} g_{k_{i}} ∥}

lie in the null space of the matrix

Y_{k_{i}}^{T} \nabla g_{k_{i}} P_{k_{i}} \nabla g_{k_{i}}^{T} Y_{k_{i}}

which is contradicted with assumption (64). This means that, FFJ conditions are satisfied in the limit. As

k_{i} \to \infty

, the second possibility is,

∥ Y_{k_{i}} \nabla q_{k_{i}} (s_{k_{i}}^{n}) ∥ \to 0

and from (65), we have

∥ {\bar{s}}_{k_{i}}^{t} ∥ \to 0

which is contradicted with assumption (64). That is FFJ conditions are satisfied in the limit.

In the next section, the convergence of the sequence of the iteration sequence is studied when

r_{k}

bounded.

3.4. Global Convergence When $r_{k}$ Is Bounded

Our analysis in this section is continued supposing that

r_{k}

is bounded. Therefore, let

\bar{k}

be an integer at which

r_{k} = \bar{r} < \infty

for all

k \geq \bar{k}

. That is,

\frac{1}{2} [q_{k} (s_{k}^{n}) - q_{k} (s_{k})] \geq ∥ \nabla g_{u_{k}} P_{k} g_{u_{k}} ∥ min {∥ \nabla g_{u_{k}} P_{k} g_{u_{k}} ∥, δ_{k}} .

(68)

From assumptions

S A_{3}

and

S A_{5}

, and using (68), then for all k, there is a constant

β_{3} > 0

such that:

∥ B_{k} ∥ \leq β_{3}, ∥ Y_{k}^{T} B_{k} ∥ \leq β_{3}, a n d ∥ Y_{k}^{T} B_{k} Y_{k} ∥ \leq β_{3},

(69)

where

B_{k} = H_{k} + \bar{r} \nabla g_{u_{k}} P_{k} \nabla g_{u_{k}}^{T}

. □

Lemma 18.

Under standard assumptions

S A_{1}

–

S A_{5}

, there exists a constant

K_{8} > 0

such that:

q_{k} (0) - q_{k} (s_{k}^{n}) - Δ μ_{k}^{T} (c_{k} + \nabla c_{k}^{T} s_{k}) \geq - K_{8} ∥ c_{k} ∥ .

(70)

Proof.

Since

\begin{matrix} q_{k} (0) - q_{k} (s_{k}^{n}) & = & - \nabla_{x} ℓ_{k}^{T} s_{k}^{n} - \frac{1}{2} s_{k}^{n^{T}} H_{k} s_{k}^{n} + \frac{\bar{r}}{2} [∥ P_{k} g_{u_{k}} ∥^{2} - ∥ P_{k} (g_{u_{k}} + \nabla g_{u_{k}}^{T} s_{k}^{n}) ∥^{2}] \\ = & - {(\nabla_{x} ℓ_{k} + \bar{r} \nabla g_{u_{k}} P_{k} g_{u_{k}})}^{T} s_{k}^{n} - \frac{1}{2} {s_{k}^{n}}^{T} (H_{k} + \bar{r} \nabla g_{u_{k}} P_{k} \nabla g_{u_{k}}^{T}) s_{k}^{n} \\ = & - {(\nabla_{x} ℓ_{k} + \bar{r} \nabla g_{u_{k}} P_{k} g_{u_{k}})}^{T} s_{k}^{n} - \frac{1}{2} {s_{k}^{n}}^{T} B_{k} s_{k}^{n}, \end{matrix}

then we have:

\begin{matrix} q_{k} (0) - q_{k} (s_{k}^{n}) & - & Δ μ_{k}^{T} (c_{k} + \nabla c_{k}^{T} s_{k}) = - {(\nabla_{x} ℓ_{k} + \bar{r} \nabla g_{u_{k}} P_{k} g_{u_{k}})}^{T} s_{k}^{n} - \frac{1}{2} {s_{k}^{n}}^{T} B_{k} s_{k}^{n} - Δ μ_{k}^{T} (c_{k} + \nabla c_{k}^{T} s_{k}) \\ \geq & - ∥ \nabla_{x} ℓ_{k} ∥ ∥ s_{k}^{n} ∥ - \bar{r} ∥ \nabla g_{u_{k}} P_{k} g_{u_{k}} ∥ ∥ s_{k}^{n} ∥ - ∥ B_{k} ∥ ∥ s_{k}^{n} ∥^{2} - ∥ Δ μ_{k} ∥ ∥ c_{k} + \nabla c_{k}^{T} s_{k} ∥ \\ \geq & - [∥ \nabla_{x} ℓ_{k} ∥ + \bar{r} ∥ \nabla g_{u_{k}} P_{k} g_{u_{k}} ∥ + ∥ B_{k} ∥ ∥ s_{k}^{n} ∥] ∥ s_{k}^{n} ∥ - ∥ Δ μ_{k} ∥ ∥ \nabla c_{k} ∥ ∥ s_{k}^{n} ∥ . \end{matrix}

From inequality (35) and the fact that

Y_{k}^{T} \nabla c (x_{k}) = 0

, then we have:

\begin{matrix} q_{k} (0) - q_{k} (s_{k}^{n}) - Δ μ_{k}^{T} (c_{k} + \nabla c_{k}^{T} s_{k}) & \geq & [(∥ \nabla_{x} ℓ_{k} ∥ + \bar{r} ∥ \nabla g_{u_{k}} P_{k} g_{u_{k}} ∥ + ∥ B_{k} ∥ ∥ s_{k}^{n} ∥ + ∥ Δ μ_{k} ∥ ∥ \nabla c_{k} ∥) K_{1}] ∥ c_{k} ∥ . \end{matrix}

From standard assumptions

S A_{2}

,

S A_{3}

,

S A_{5}

, the fact that

∥ s_{k}^{n} ∥ \leq δ_{m a x}

, and using (69), then there exists

K_{8} > 0

, such that inequality (70) holds. □

Lemma 19.

Under standard assumptions

S A_{1}

–

S A_{5}

, then for all k we have:

\begin{matrix} P r e d_{k} & \geq & \frac{1}{2} K_{5} ∥ Y_{k}^{T} \nabla q_{k} (s_{k}^{n}) ∥ min {Δ_{k}, \frac{∥ Y_{k}^{T} \nabla q_{k} (s_{k}^{n}) ∥}{∥ {\bar{s}}_{k} ∥}} + ∥ \nabla g_{u_{k}} P_{k} g_{u_{k}} ∥ min {∥ \nabla g_{u_{k}} P_{k} g_{u_{k}} ∥, δ_{k}} \\ - K_{8} ∥ c_{k} ∥ + σ_{k} [∥ c_{k} ∥^{2} - ∥ c_{k} + \nabla c_{k}^{T} s_{k} ∥^{2}] . \end{matrix}

(71)

Proof.

From (29), we have:

\begin{matrix} P r e d_{k} & = & [q_{k} (s_{k}^{n}) - q_{k} (s_{k})] + [q_{k} (0) - q_{k} (s_{k}^{n}) - Δ μ_{k}^{T} (c_{k} + \nabla c_{k}^{T} s_{k})] + σ_{k} [∥ c_{k} ∥^{2} - ∥ c_{k} + \nabla c_{k}^{T} s_{k} ∥^{2}] \\ = & \frac{1}{2} [q_{k} (s_{k}^{n}) - q_{k} (s_{k})] + \frac{1}{2} [q_{k} (s_{k}^{n}) - q_{k} (s_{k})] \\ + [q_{k} (0) - q_{k} (s_{k}^{n}) - Δ μ_{k}^{T} (c_{k} + \nabla c_{k}^{T} s_{k})] + σ_{k} [∥ c_{k} ∥^{2} - ∥ c_{k} + \nabla c_{k}^{T} s_{k} ∥^{2}] . \end{matrix}

Using inequalities (46), (68), and (70), we obtain the desired result. □

Lemma 20.

Under standard assumptions

S A_{1}

–

S A_{5}

, if

∥ Y_{k}^{T} (\nabla_{x} ℓ_{k} + \bar{r} \nabla g_{u_{k}} P_{k} g_{u_{k}}) ∥ + ∥ \nabla g_{u_{k}} P_{k} g_{u_{k}} ∥ \geq ε > 0

and

∥ c_{k} ∥ \leq τ δ_{k}

where τ is a positive constant given by

τ \leq min \{\frac{ε}{6 β_{3} K_{1} δ_{m a x}}, \frac{\sqrt{3}}{2 K_{1}}, \frac{K_{5} ε}{24 K_{8}} min {\frac{2 ε}{3 δ_{m a x}}, 1}, \frac{ε}{4 K_{8}} min {\frac{ε}{2 δ_{m a x}}, 1}\},

(72)

then there exists a constant

K_{9} > 0

such that:

P r e d_{k} \geq K_{9} δ_{k} + σ_{k} [∥ c_{k} ∥^{2} - ∥ c_{k} + \nabla c_{k}^{T} s_{k} ∥^{2}] .

(73)

Proof.

Since

∥ Y_{k}^{T} (\nabla_{x} ℓ_{k} + \bar{r} \nabla g_{u_{k}} P_{k} g_{u_{k}}) ∥ + ∥ \nabla g_{u_{k}} P_{k} g_{u_{k}} ∥ \geq ε

, then we can say

∥ Y_{k}^{T} (\nabla_{x} ℓ_{k} + \bar{r} \nabla g_{u_{k}} P_{k} g_{u_{k}}) ∥ \geq \frac{ε}{2}

and

∥ \nabla g_{u_{k}} P_{k} g_{u_{k}} ∥ \geq \frac{ε}{2}

. We will consider two cases:

Firstly, if

∥ Y_{k}^{T} (\nabla_{x} ℓ_{k} + \bar{r} \nabla g_{u_{k}} P_{k} g_{u_{k}}) ∥ \geq \frac{ε}{2}

, then from inequalities (69), (35) and

∥ c_{k} ∥ \leq τ δ_{k}

, we have:

\begin{matrix} ∥ Y_{k}^{T} (\nabla_{x} ℓ_{k} + \bar{r} \nabla g_{u_{k}} P_{k} g_{u_{k}} + B_{k} s_{k}^{n}) ∥ & \geq & ∥ Y_{k}^{T} (\nabla_{x} ℓ_{k} + \bar{r} \nabla g_{u_{k}} P_{k} g_{u_{k}}) ∥ - ∥ Y_{k}^{T} B_{k} s_{k}^{n} ∥ \\ \geq & ∥ Y_{k}^{T} (\nabla_{x} ℓ_{k} + \bar{r} \nabla g_{u_{k}} P_{k} g_{u_{k}}) ∥ - β_{3} K_{1} ∥ c_{k} ∥ \\ \geq & \frac{ε}{2} - β_{3} K_{1} τ δ_{k} . \end{matrix}

However,

τ \leq \frac{ε}{6 β_{3} K_{1} δ_{m a x}}

, then

∥ Y_{k}^{T} (\nabla_{x} ℓ_{k} + \bar{r} \nabla g_{u_{k}} P_{k} g_{u_{k}} + A_{k} s_{k}^{n}) ∥ \geq \frac{ε}{3} .

(74)

From inequality (35), assumption

∥ c_{k} ∥ \leq τ δ_{k}

, and using the value of

τ

in (72), we have

∥ s_{k}^{n} ∥ \leq K_{1} ∥ c_{k} ∥ \leq K_{1} τ δ_{k} \leq K_{1} \frac{\sqrt{3}}{2 K_{1}} δ_{k} = \frac{\sqrt{3}}{2} δ_{k}

. That is,

Δ_{k}^{2} = δ_{k}^{2} - {∥ s_{k}^{n} ∥}^{2} \geq δ_{k}^{2} - \frac{3}{4} δ_{k}^{2} = \frac{1}{4} δ_{k}^{2}

. This means that,

Δ_{k} \geq \frac{1}{2} δ_{k} .

(75)

From inequalities (71), (74), (75), and assumption

∥ c_{k} ∥ \leq τ δ_{k}

, we have the following:

\begin{matrix} P r e d_{k} & \geq & \frac{1}{2} K_{5} ∥ Y_{k}^{T} (\nabla_{x} ℓ_{k} + \bar{r} \nabla g_{u_{k}} P_{k} g_{u_{k}} + B_{k} s_{k}^{n}) ∥ min {∥ Y_{k}^{T} (\nabla_{x} ℓ_{k} + \bar{r} \nabla g_{u_{k}} P_{k} g_{u_{k}} + B_{k} s_{k}^{n}) ∥, \frac{1}{2} δ_{k}} \\ - K_{8} ∥ c_{k} ∥ + σ_{k} [∥ c_{k} ∥^{2} - ∥ c_{k} + \nabla c_{k}^{T} s_{k} ∥^{2}] \\ \geq & \frac{K_{5} ε}{12} δ_{k} min {\frac{2 ε}{3 δ_{m a x}}, 1} - K_{8} τ δ_{k} + σ_{k} [∥ c_{k} ∥^{2} - ∥ c_{k} + \nabla c_{k}^{T} s_{k} ∥^{2}] . \end{matrix}

However,

τ \leq \frac{K_{5} ε}{24 K_{8}} min {\frac{2 ε}{3 δ_{m a x}}, 1}

, then we have

P r e d_{k} \geq \frac{K_{5} ε}{24} min {\frac{2 ε}{3 δ_{m a x}}, 1} δ_{k} + σ_{k} [∥ c_{k} ∥^{2} - ∥ c_{k} + \nabla c_{k}^{T} s_{k} ∥^{2}] .

Secondly, if

∥ \nabla g_{u_{k}} P_{k} g_{u_{k}} ∥ \geq \frac{ε}{2}

and using inequality (71), then

\begin{matrix} P r e d_{k} & \geq & ∥ \nabla g_{u_{k}} P_{k} g_{u_{k}} ∥ min {∥ \nabla g_{u_{k}} P_{k} g_{u_{k}} ∥, δ_{k}} - K_{8} ∥ c_{k} ∥ + σ_{k} [∥ c_{k} ∥^{2} - ∥ c_{k} + \nabla c_{k}^{T} s_{k} ∥^{2}] \\ \geq & \frac{ε}{2} min {\frac{ε}{2 δ_{max}}, 1} δ_{k} - K_{8} τ δ_{k} + σ_{k} [∥ c_{k} ∥^{2} - ∥ c_{k} + \nabla c_{k}^{T} s_{k} ∥^{2}] \\ \geq & \frac{ε}{4} min {\frac{ε}{2 δ_{max}}, 1} δ_{k} + σ_{k} [∥ c_{k} ∥^{2} - ∥ c_{k} + \nabla c_{k}^{T} s_{k} ∥^{2}], \end{matrix}

where

τ \leq \frac{ε}{4 K_{8}} min {\frac{ε}{2 δ_{max}}, 1}

. Let

K_{9} = min \{\frac{K_{5} ε}{24} min {\frac{2 ε}{3 δ_{m a x}}, 1}, \frac{ε}{4} min {\frac{ε}{2 δ_{max}}, 1}\}

, then the result follows.

From the previous lemma, we notice that either

∥ Y_{k}^{T} (\nabla_{x} ℓ_{k} + \bar{r} \nabla g_{u_{k}} P_{k} g_{u_{k}}) ∥ \geq \frac{ε}{2} > 0

or

∥ \nabla g_{u_{k}} P_{k} g_{u_{k}} ∥ \geq \frac{ε}{2} > 0

and

∥ c_{k} ∥ \leq τ δ_{k}

, where

τ

is given by (72) at any iteration k, the value of the penalty parameter

σ_{k}

is not needed to increase. That is the penalty parameter

σ_{k}

is increased only when

∥ c_{k} ∥ \geq τ δ_{k}

.

□

Lemma 21.

Under standard assumptions

S A_{1}

–

S A_{5}

, if

σ_{k}

is increased at kth iteration, then there is a positive constant

K_{10}

such that:

σ_{k} min {∥ c_{k} ∥, δ_{k}} \leq K_{10} .

(76)

Proof.

From Algorithm 1, we have:

\begin{matrix} \frac{σ_{k}}{2} [∥ c_{k} ∥^{2} - ∥ c_{k} + \nabla c_{k}^{T} s_{k} ∥^{2}] & = & [q_{k} (s_{k}) - q_{k} (s_{k}^{n})] + [q_{k} (s_{k}^{n}) - q_{k} (0)] + Δ μ_{k}^{T} (c_{k} + \nabla c_{k}^{T} s_{k}) \\ + \frac{β_{0}}{2} [∥ c_{k} ∥^{2} - ∥ c_{k} + \nabla c_{k}^{T} s_{k} ∥^{2}] \\ = & - \frac{1}{2} [q_{k} (s_{k}^{n}) - q_{k} (s_{k})] - \frac{1}{2} [q_{k} (s_{k}^{n}) - q_{k} (s_{k})] \\ + [q_{k} (s_{k}^{n}) - q_{k} (0) + Δ μ_{k}^{T} (c_{k} + \nabla c_{k}^{T} s_{k})] + \frac{β_{0}}{2} [∥ c_{k} ∥^{2} - ∥ c_{k} + \nabla c_{k}^{T} s_{k} ∥^{2}], \end{matrix}

where

σ_{k}

increased at any iteration and

r_{k} = \bar{r}

. From the previous equation, (42), (46), (68), and (70), we have

\begin{matrix} \frac{σ_{k}}{2} K_{4} ∥ c_{k} ∥ min {δ_{k}, ∥ c_{k} ∥} & \leq & - \frac{K_{5}}{2} ∥ Y_{k}^{T} \nabla q_{k} (s_{k}^{n}) ∥ min {Δ_{k}, \frac{∥ Y_{k}^{T} \nabla q_{k} (s_{k}^{n}) ∥}{∥ {\bar{B}}_{k} ∥}} \\ - ∥ \nabla g_{u_{k}} P_{k} g_{u_{k}} ∥ min {∥ \nabla g_{u_{k}} P_{k} g_{u_{k}} ∥, δ_{k}} + K_{8} ∥ c_{k} ∥ + \frac{β_{0}}{2} {∥ c_{k} ∥}^{2} \\ \leq & K_{8} ∥ c_{k} ∥ + \frac{β_{0}}{2} {∥ c_{k} ∥}^{2} . \end{matrix}

Using assumption

S A_{3}

, we get the desired result. □

Lemma 22.

Under standard assumptions

S A_{1}

–

S A_{5}

and at the jth trial iterate of any iteration k. If

σ_{k^{j}}

is increased, then there is a constant

K_{11} > 0

, such that

σ_{k^{j}} ∥ c_{k} ∥ \leq K_{11} .

(77)

Proof.

From (50) and (76), we get the desired result. □

Lemma 23.

Under standard assumptions

S A_{1}

–

S A_{5}

, if

σ_{k} \to \infty

, then

lim_{k_{i} \to \infty} ∥ c_{k_{i}} ∥ = 0,

(78)

where

{k_{i}}

is a subsequence indexes the iterates at which

σ_{k}

is increased.

Proof.

From Lemma 22 we obtain the desired result. □

3.5. Main Results for Global Convergence

In this section, main global convergence results for FBACTR algorithm are introduced.

Theorem 2.

Under standard assumptions

S A_{1}

–

S A_{5}

, the sequence of iterates which is generated by FBACTR algorithm satisfies

lim_{k \to \infty} ∥ c_{k} ∥ = 0 .

(79)

Proof.

This theorem is proved by contradiction and so we suppose that

{lim sup}_{k \to \infty} ∥ c_{k} ∥ \geq ε > 0

. This means that there exists an infinite subsequence of indices

{k_{j}}

indexing iterates that satisfy

∥ c_{k_{j}} ∥ \geq \frac{ε}{2}

. However, there exists an infinite sequence of acceptable steps from Lemma 8. Without loss of generality and to simplify, we suppose that all members of

{k_{j}}

are acceptable iterates. Now, two cases are considered:

Firstly, if

{σ_{k}}

is unbounded, then an infinite number of iterates

{k_{i}}

exists and at which the penalty parameter

σ_{k}

is increased. So, for k that is sufficiently large and from Lemma 23, let

{k_{i}}

and

{k_{j}}

be the two sequences which are not have common elements. Let

k_{ρ_{1}}

and

k_{ρ_{2}}

be two consecutive iterates at which

σ_{k}

is increased and

k_{ρ_{1}} < k < k_{ρ_{2}}

, where

k \in {k_{j}}

. The penalty parameter

σ_{k}

is the same for all iterates that lie between

k_{ρ_{1}}

and

k_{ρ_{2}}

. Since all the iterates of

{k_{j}}

are acceptable, then for all

k \in {k_{j}}

,

Φ_{k} - Φ_{k + 1} = A r e d_{k} \geq α_{1} P r e d_{k} .

Using inequality (45), we have:

\frac{Φ_{k} - Φ_{k + 1}}{σ_{k}} \geq \frac{α_{1} K_{4}}{2} ∥ c_{k} ∥ min {∥ c_{k} ∥, δ_{k}} .

Summing over all acceptable iterates that lie between

k_{ρ_{1}}

and

k_{ρ_{2}}

, we have:

\sum_{k = k_{ρ_{1}}}^{k_{ρ_{2}} - 1} \frac{Φ_{k} - Φ_{k + 1}}{σ_{k}} \geq \frac{α_{1} K_{4} ε}{4} min {\hat{K_{6}}, \frac{ε}{2}},

where

\hat{K_{6}}

is as

K_{6}

in (52)but

ε

is replaced by

\frac{ε}{2}

. Hence,

\frac{ℓ (x_{k_{ρ_{1}}}, μ_{k_{ρ_{1}}}; \bar{r}) - ℓ (x_{k_{ρ_{2}}}, μ_{k_{ρ_{2}}}; \bar{r})}{σ_{k_{ρ_{1}}}} + [∥ c_{k_{ρ_{1}}} ∥^{2} - ∥ c_{k_{ρ_{2}}} ∥^{2}] \geq \frac{α_{1} K_{4} ε}{4} min {\hat{K_{6}}, \frac{ε}{2}} .

Since

σ_{k} \to \infty

, then for

k_{ρ_{1}}

sufficiently large, we have:

\frac{∣ ℓ (x_{k_{ρ_{1}}}, μ_{k_{ρ_{1}}}; \bar{r}) - ℓ (x_{k_{ρ_{2}}}, μ_{k_{ρ_{2}}}; \bar{r}) ∣}{σ_{k_{ρ_{1}}}} < \frac{α_{1} K_{4} ε}{8} min {\hat{K_{6}}, \frac{ε}{2}} .

Therefore,

∥ c_{k_{ρ_{1}}} ∥^{2} - {∥ c_{k_{ρ_{2}}} ∥}^{2} \geq \frac{α_{1} K_{4} ε}{8} min {\hat{K_{6}}, \frac{ε}{2}} .

This leads to a contradiction with Lemma 23 unless

ε = 0

.

Secondly, If

{σ_{k}}

is bounded, then for all an integer

\tilde{k}

and

k \geq \tilde{k}

, we have

σ_{k} = \tilde{σ}

. Hence, for any

\hat{k} \in {k_{j}}

where

\hat{k} \geq \tilde{k}

and using (45), we have:

\begin{matrix} P r e d_{\hat{k}} & \geq & \frac{\tilde{σ} K_{4}}{2} ∥ c_{\hat{k}} ∥ min {δ_{\hat{k}}, ∥ c_{\hat{k}} ∥} \geq \frac{ε \tilde{σ} K_{4}}{4} min {\frac{ε}{2 δ_{m a x}}, 1} δ_{\hat{k}} . \end{matrix}

(80)

Then for any

\hat{k} \in {k_{j}}

, we have:

Φ_{\hat{k}} - Φ_{\hat{k} + 1} = A r e d_{\hat{k}} \geq α_{1} P r e d_{\hat{k}},

such that all the iterates of

{k_{j}}

are acceptable. From above inequality, inequality (80) and using Lemma 11 we have:

Φ_{\hat{k}} - Φ_{\hat{k} + 1} \geq \frac{α_{1} ε \tilde{σ} K_{4}}{4} min {\frac{ε}{2 δ_{m a x}}, 1} \hat{K_{6}} > 0 .

However, this is a contradiction of the fact that

{Φ_{k}}

is bounded when

{σ_{k}}

is bounded. Therefore, we have a contradiction in both cases. Hence the supposition is not correct and this proves the theorem. □

Theorem 3.

Under standard assumptions

S A_{1}

–

S A_{5}

, the sequence of iterates generated by FBACTR algorithm satisfies:

\underset{k \to \infty}{lim inf} [∥ Y_{k}^{T} \nabla_{x} ℓ_{k} ∥ + ∥ \nabla g_{u_{k}} P_{k} g_{u_{k}} ∥] = 0 .

(81)

Proof.

First, we prove that:

\underset{k \to \infty}{lim inf} [∥ Y_{k}^{T} (\nabla_{x} ℓ_{k} + \bar{r} \nabla g_{u_{k}} P_{k} g_{u_{k}}) ∥ + ∥ \nabla g_{u_{k}} P_{k} g_{u_{k}} ∥] = 0 .

(82)

The proof of (82) is by contradiction, so, for all k, assume that

∥ Y_{k}^{T} (\nabla_{x} ℓ_{k} + \bar{r} \nabla g_{u_{k}} P_{k} g_{u_{k}}) ∥ + ∥ \nabla g_{u_{k}} P_{k} g_{u_{k}} ∥ > ε

. Let

{k_{i}}

be an infinite subsequence at which

∥ c_{k_{i}} ∥ > τ δ_{k_{i}}

, where

τ

is defined in (72). However,

∥ c_{k} ∥ \to 0

, then

lim_{k_{i} \to \infty} δ_{k_{i}} = 0 .

Let

k^{j}

be any trial iterate belonging to

{k_{i}}

and we consider two cases:

Firstly, if

{σ_{k}}

is unbounded, then for the rejected trial step

j - 1

of iteration

k \in {k_{i}}

, we have

∥ c_{k} ∥ > τ δ_{k^{j}} = τ_{1} τ ∥ s_{k^{j - 1}} ∥

. Since the trial step

s_{k^{j - 1}}

is rejected and using inequalities (45) and (41), then

\begin{matrix} (1 - α_{1}) & \leq & \frac{| A r e d_{k^{j - 1}} - P r e d_{k^{j - 1}} |}{P r e d_{k^{j - 1}}} \\ \leq & \frac{[2 κ_{1} ∥ s_{k^{j - 1}} ∥ + 2 κ_{2} σ_{k^{j - 1}} ∥ s_{k^{j - 1}} ∥ ∥ c_{k} ∥ + 2 κ_{3} σ_{k^{j - 1}} ∥ s_{k^{j - 1}} ∥^{2}]}{σ_{k^{j - 1}} K_{4} min (τ_{1} τ, 1) ∥ c_{k} ∥} \\ \leq & \frac{2 κ_{1}}{σ_{k^{j - 1}} K_{4} τ_{1} τ min (τ_{1} τ, 1)} + \frac{2 κ_{2} τ_{1} τ + 2 κ_{3}}{K_{4} τ_{1} τ min (τ_{1} τ, 1)} ∥ s_{k^{j - 1}} ∥ . \end{matrix}

However,

{σ_{k}}

is unbounded, hence for all

k \geq \hat{k}

,

\hat{k}

is sufficiently large, we have:

σ_{k^{j - 1}} > \frac{4 κ_{1}}{K_{4} τ_{1} τ min (τ_{1} τ, 1) (1 - α_{1})} .

Therefore, for all

k \geq \hat{k}

, we have:

∥ s_{k^{j - 1}} ∥ \geq \frac{K_{4} τ_{1} τ min (τ_{1} τ, 1) (1 - α_{1})}{4 (κ_{2} τ_{1} τ + κ_{3})} .

From Algorithm 2, we have:

δ_{k^{j}} = τ_{1} ∥ s_{k^{j - 1}} ∥ \geq \frac{K_{4} τ_{1}^{2} τ min (τ_{1} τ, 1) (1 - α_{1})}{4 (κ_{2} τ_{1} τ + κ_{3})} .

This gives a contradiction and this leads to

δ_{k^{j}}

not being able to go to zero in this case.

Secondly, if the sequence

{σ_{k}}

is bounded, then there exists an integer

\bar{k}

and

\bar{σ}

such that for all

k \geq \bar{k}

,

σ_{k} = \bar{σ}

. Consider a trial step j of iteration

k \geq \bar{k}

and

∥ c_{k} ∥ > τ δ_{k^{j}}

, we consider three cases:

(i): If $j = 1$ , then $δ_{k^{j}} \geq δ_{min}$ , see Algorithm 2. This means that, $δ_{k^{j}}$ is bounded in this case;
(ii): If $j > 1$ , and $∥ c_{k^{l}} ∥ > τ δ_{k^{l}}$ for $l = 1, \dots, j$ , then for all rejected trial steps $l = 1, \dots, j - 1$ of iteration $k \geq \bar{k}$ , we have

$(1 - α_{1}) \leq \frac{| A r e d_{k^{l}} - P r e d_{k^{l}} |}{P r e d_{k^{l}}} \leq \frac{2 K_{6} ∥ s_{k^{l}} ∥}{K_{4} min (τ, 1) ∥ c_{k} ∥} .$

Hence,

$\begin{matrix} δ_{k^{j}} = τ_{1} ∥ s_{k^{j - 1}} ∥ & \geq & \frac{τ_{1} K_{4} min (τ, 1) (1 - α_{1}) ∥ c_{k} ∥}{2 K_{3}} \geq \frac{τ_{1} K_{4} min (τ, 1) (1 - α_{1}) τ}{2 K_{3}} δ_{k^{1}} \\ \geq & \frac{τ_{1} K_{4} min (τ, 1) (1 - α_{1}) τ}{2 K_{3}} δ_{min} . \end{matrix}$

That is, $δ_{k^{j}}$ is also bounded in this case.
(iii): If $j > 1$ and $∥ c_{k^{l}} ∥ > τ δ_{k^{l}}$ does not hold for all l, then there exists an integer $ϱ$ such that $∥ c_{k^{l}} ∥ > τ δ_{k^{l}}$ holds for $l = ϱ + 1, \dots, j$ and $∥ c_{k^{l}} ∥ \leq τ δ_{k^{l}}$ holds for all $l = 1, \dots, ϱ$ . As in case (ii), we can write:

$δ_{k^{j}} \geq \frac{τ_{1} K_{4} min (τ, 1) (1 - α_{1})}{2 K_{3}} ∥ c_{k} ∥ \geq \frac{τ_{1} K_{4} min (τ, 1) (1 - α_{1}) τ}{2 K_{3}} δ_{k^{ϱ + 1}} .$

(83)

From Algorithm 2, we have:

$δ_{k^{ϱ + 1}} \geq τ_{1} ∥ s_{k^{ϱ}} ∥ .$

(84)

From Lemma 20, if $∥ c_{k^{l}} ∥ \leq τ δ_{k^{l}}$ and $s_{k^{ϱ}}$ is rejected, then we have:

$(1 - α_{1}) \leq \frac{| A r e d_{k^{ϱ}} - P r e d_{k^{ϱ}} |}{P r e d_{k^{ϱ}}} \leq \frac{2 K_{6} \bar{r} ∥ s_{k^{ϱ}} ∥}{K_{9}} .$

That is,

$∥ s_{k^{ϱ}} ∥ \geq \frac{K_{9} (1 - α_{1})}{2 K_{6} \bar{σ}} .$

This implies that $∥ s_{k^{ϱ}} ∥$ is bounded and from (83) and (84) we have also $δ_{k^{j}}$ is bounded in this case. That is in three cases, we have $δ_{k^{j}}$ is bounded, but this leading to a contradiction. Hence, all the iterates satisfy $∥ c_{k} ∥ \leq τ δ_{k^{j}}$ for $k^{j}$ are sufficiently large. From Lemma 20, then the value of the penalty parameter is not needed to increase. Hence, ${σ_{k}}$ is bounded. Using Lemma 20 and for $k^{j} \geq \bar{k}$ , we have:

$Φ_{k^{j}} - Φ_{k^{j} + 1} = A r e d_{k^{j}} \geq α_{1} P r e d_{k^{j}} \geq α_{1} K_{9} δ_{k^{j}} .$

As $k \to \infty$ , then:

$lim_{k \to \infty} δ_{k^{j}} = 0 .$

(85)

That is the trust-region radius is not bounded below and this leading to a contradiction. Because at iteration $k^{j} > \bar{k}$ , if the previous step was accepted; i.e., at $j = 1$ , then $δ_{k^{1}} \geq δ_{min}$ . That is $δ_{k^{j}}$ is bounded in this case.

If

j > 1

, then there exists at least one rejected trial step. From Lemmas 5 and 20, then for the rejected trial step

s_{k^{j - 1}}

we have:

(1 - α_{1}) < \frac{\bar{σ} K_{3} {∥ s_{k^{j - 1}} ∥}^{2}}{K_{9} δ_{k^{j - 1}}} .

From Algorithm 2, we have:

δ_{k^{j}} = τ_{1} ∥ s_{k^{j - 1}} ∥ > \frac{τ_{1} K_{9} (1 - α_{1})}{\bar{σ} K_{3}} .

Hence

δ_{k^{j}}

is bounded and this contradicts (85). That is, the supposition is wrong and hence,

\underset{k \to \infty}{lim inf} [∥ Y_{k}^{T} (\nabla_{x} ℓ_{k} + \bar{r} \nabla g_{u_{k}} P_{k} g_{u_{k}}) ∥ + ∥ \nabla g_{u_{k}} P_{k} g_{u_{k}} ∥] = 0 .

That is, (81) holds and the proof is completed.

From the above two theorems, we conclude that, given any

ε > 0

, the algorithm terminates because

∥ Y_{k}^{T} \nabla_{x} ℓ_{k} ∥ + ∥ \nabla g_{u_{k}} P_{k} g_{u_{k}} ∥ + ∥ c_{k} ∥ < ε

, for some finite k. □

4. Numerical Results and Comparisons

In this section, we introduce an extensive variety of possible numeric NBLP problems to illustrate the validity of the proposed Algorithm FBACTR Algorithm 5 to solve the NBLP problem. The proposed algorithm FBACTR experimented on 16 benchmark examples given in [4,7,38,39,40].

Ten independent runs with a distinct initial value starting points for every test example are performed to observe the matchmaking of the result. Statistical results of all examples are briefed in Table 1 which displays that the results found by the FBACTR Algorithm 5 are approximate or equal to those by the compered algorithms in method [11] and the literature.

For comparison, the corresponding results of the mean number of iterations (iter), the mean number of function evaluations (nfunc), and the mean value of CPU time (CPUs) in seconds obtained by Methods in [11,41,42] respectively are included and summarized in Table 2. These results show that results of the FBACTR Algorithm 5 are approximate or equal to those of the compared algorithms in the literature.

It is evident from the results that our approach is able to handle NBLP problems even if the upper and the lower levels are convex or not and the computed results converge to the optimal solution which is similar or approximate to the optimal reported in the literature. Finally, it is obvious from the comparison between the solutions obtained using the FBACTR Algorithm 5 with those in the literature, that the FBACTR Algorithm 5 is capable of finding the optimal solution to some problems by a small number of iterations, a small number of function evaluations, and less time.

We offered the numerical results of FBACTR Algorithm 5 using MATLAB (R2013a) (8.2.0.701)64-bit(win64) and a starting point

x_{0} \in i n t (\tilde{F})

. The following parameter setting is used:

δ_{m i n} = 10^{- 4}

,

δ_{0} = m a x (∥ s_{0}^{c p} ∥, δ_{m i n})

,

δ_{m a x} = 10^{4} δ_{0}

,

α_{1} = 10^{- 3}

,

α_{2} = 0.8

,

τ_{1} = 0.5

,

τ_{2} = 2

,

ε_{1} = 10^{- 10}

, and

ε_{2} = 10^{- 12}

.

5. Conclusions

In this paper, the FBACTR Algorithm 5 is presented to solve the NBLP problem (1). A KKT condition is used with the Fischer–Burmeister function and an active-set strategy to convert the NBLP problem to an equivalent smooth equality constrained optimization problem. To ensure global convergence for the FBACTR algorithm, a trust-region globalization strategy is used.

A global convergence theory for the FBACTR algorithm is introduced and applications to mathematical programs with equilibrium constraints are provided to clarify the effectiveness of the proposed approach. Numerical results reflect the good behavior of the FBACTR algorithm and the computed results converge to the optimal solutions. It is clear from the comparison between the solutions obtained using the FBACTR algorithm with algorithms [11,41,42] that the FBACTR can find the optimal solution to some problems with a small number of iterations, small number of function evaluations, and in less time.

Test Problem 1 [41]:

\begin{matrix} min_{v} & f_{u} = w_{1}^{2} + w_{2}^{2} + v^{2} - 4 v \\ s . t . & 0 \leq v \leq 2, \\ min_{w} & f_{l} = w_{1}^{2} + 0.5 w_{2}^{2} + w_{1} w_{2} + \\ (1 - 3 v) w_{1} + (1 + v) w_{2}, \\ s . t . & 2 w_{1} + w_{2} - 2 v \leq 1, \\ w_{1} \geq 0, w_{2} \geq 0 . \end{matrix}

Test Problem 2 [41]:

\begin{matrix} min_{v} & f_{u} = w_{1}^{2} + w_{3}^{2} - w_{1} w_{3} - 4 w_{2} - 7 v_{1} + 4 v_{2} \\ s . t . & v_{1} + v_{2} \leq 1, \\ v_{1} \geq 0, v_{2} \geq 0 \\ min_{w} & f_{l} = w_{1}^{2} + 0.5 w_{2}^{2} + 0.5 w_{3}^{2} + w_{1} w_{2} + \\ (1 - 3 v_{1}) w_{1} + (1 + v_{2}) w_{2}, \\ s . t . & 2 w_{1} + w_{2} - w_{3} + v_{1} - 2 v_{2} + 2 \leq 0, \\ w_{1} \geq 0; w_{2} \geq 0 w_{3} \geq 0 . \end{matrix}

Test Problem 3 [41]:

\begin{matrix} min_{v} & f_{u} = 0.1 (v_{1}^{2} + v_{2}^{2}) - 3 w_{1} - 4 w_{2} + 0.5 (w_{1}^{2} + w_{2}^{2}) \\ s . t . \\ min_{w} & f_{l} = 0.5 (w_{1}^{2} + 5 w_{2}^{2}) - 2 w_{1} w_{2} - v_{1} w_{1} - v_{2} w_{2}, \\ s . t . & - 0.333 w_{1} + w_{2} - 2 \leq 0, \\ w_{1} - 0.333 w_{2} - 2 \leq 0, \\ w_{1} \geq 0, w_{2} \geq 0, \end{matrix}

Test Problem 4 [41]:

\begin{matrix} min_{v} & f_{u} = v_{1}^{2} - 2 v_{1} + v_{2}^{2} - 2 v_{2} + w_{1}^{2} + w_{2}^{2} \\ s . t . & v_{1} \geq 0, v_{2} \geq 0 \\ min_{w} & f_{l} = {(w_{1} - v_{1})}^{2} + {(w_{2} - v_{2})}^{2}, \\ s . t . & 0.5 \leq w_{1} \leq 1.5, \\ 0.5 \leq w_{2} \leq 1.5, \end{matrix}

Test Problem 5 [41]:

\begin{matrix} min_{v} & f_{u} = v^{2} + {(w - 10)}^{2} \\ s . t . & - v + w \leq 0, \\ 0 \leq v \leq 15, \\ min_{w} & f_{l} = {(v + 2 w - 30)}^{2}, \\ s . t . & v + w \leq 20, \\ 0 \leq w \leq 20, \end{matrix}

Test Problem 6 [41]:

\begin{matrix} min_{v} & f_{u} = {(v_{1} - 1)}^{2} + 2 w_{1}^{2} - 2 v_{1} \\ s . t . & v_{1} \geq 0, \\ min_{w} & f_{l} = {(2 w_{1} - 4)}^{2} + {(2 w_{2} - 1)}^{2} + v_{1} w_{1}, \\ s . t . & 4 v_{1} + 5 w_{1} + 4 w_{2} \leq 12, \\ - 4 v_{1} - 5 w_{1} + 4 w_{2} \leq - 4, \\ 4 v_{1} - 4 w_{1} + 5 w_{2} \leq 4, \\ - 4 v_{1} + 4 w_{1} + 5 w_{2} \leq 4, \\ w_{1} \geq 0, w_{2} \geq 0, \end{matrix}

Test Problem 7 [41]:

\begin{matrix} min_{v} & f_{u} = {(v - 5)}^{2} + {(2 w + 1)}^{2} \\ s . t . & v \geq 0, \\ min_{w} & f_{l} = {(2 w - 1)}^{2} - 1.5 v w, \\ s . t . & - 3 v + w \leq - 3, \\ v - 0.5 w \leq 4, \\ v + w \leq 7, \\ w \geq 0 . \end{matrix}

Test Problem 8 [41]:

\begin{matrix} min_{v} & f_{u} = v_{1}^{2} - 3 v_{1} + v_{2}^{2} - 3 v_{2} + w_{1}^{2} + w_{2}^{2} \\ s . t . & v_{1} \geq 0, v_{2} \geq 0, \\ min_{w} & f_{l} = {(w_{1} - v_{1})}^{2} + {(w_{2} - v_{2})}^{2}, \\ s . t . & 0.5 \leq w_{1} \leq 1.5, \\ 0.5 \leq w_{2} \leq 1.5, \end{matrix}

Test Problem 9 [3]:

\begin{matrix} min_{v} & f_{u} = 16 v^{2} + 9 w^{2} \\ s . t . & - 4 v + w \leq 0, \\ v \geq 0, \\ min_{w} & f_{l} = {(v + w - 20)}^{4}, \\ s . t . & 4 v + w - 50 \leq 0, \\ w \geq 0 . \end{matrix}

Test Problem 10 [3]:

\begin{matrix} min_{v} & f_{u} = v_{1}^{3} w_{1} + w_{2} \\ s . t . & 0 \leq v_{1} \leq 1, \\ min_{w} & f_{l} = - w_{2} \\ s . t . & v_{1} w_{1} \leq 10, \\ w_{1}^{2} + v_{1} w_{2} \leq 1, \\ w_{2} \geq 0 . \end{matrix}

Test Problem 11 [42]:

\begin{matrix} min_{v} & f_{u} = 2 v_{1} + 2 v_{2} - 3 w_{1} - 3 w_{2} - 60 \\ s . t . & v_{1} + v_{2} + w_{1} - 2 w_{2} \leq 40, \\ 0 \leq v_{1} \leq 50, \\ 0 \leq v_{2} \leq 50, \\ min_{w} & f_{l} = {(w_{1} - v_{1} + 20)}^{2} + {(w_{2} - v_{2} + 20)}^{2}, \\ s . t . & v_{1} - 2 w_{1} \geq 10, \\ v_{2} - 2 w_{2} \geq 10, \\ - 10 \leq w_{1} \leq 20, \\ - 10 \leq w_{2} \leq 20 . \end{matrix}

Test Problem 12 [3]:

\begin{matrix} min_{v} & f_{u} = {(v - 3)}^{2} + {(w - 2)}^{2} \\ s . t . & - 2 v + w - 1 \leq 0, \\ v - 2 w + 2 \leq 0, \\ v + 2 w - 14 \leq 0, \\ 0 \leq v \leq 8, \\ min_{w} & f_{l} = {(w - 5)}^{2} \\ s . t . & w \geq 0 . \end{matrix}

Test Problem 13 [42]:

\begin{matrix} min_{v} & f_{u} = - v_{1}^{2} - 3 v_{2}^{2} - 4 w_{1} + w_{2}^{2} \\ s . t . & v_{1}^{2} + 2 v_{2} \leq 4, \\ v_{1} \geq 0, v_{2} \geq 0, \\ min_{w} & f_{l} = 2 v_{1}^{2} + w_{1}^{2} - 5 w_{2}, \\ s . t . & v_{1}^{2} - 2 v_{1} + 2 v_{2}^{2} - 2 w_{1} + w_{2} \geq - 3, \\ v_{2} + 3 w_{1} - 4 w_{2} \geq 4, \\ w_{1} \geq 0, w_{2} \geq 0 . \end{matrix}

Test Problem 14 [42]:

\begin{matrix} min_{v} & f_{u} = {(v - 1)}^{2} + {(w - 1)}^{2} \\ s . t . & v \geq 0, \\ min_{w} & f_{l} = 0.5 w^{2} + 500 w - 50 v w \\ s . t . & y \geq 0 . \end{matrix}

Test Problem 15 [42]:

\begin{matrix} min_{v} & f_{u} = - 8 v_{1} - 4 v_{2} + 4 w_{1} - 40 w_{2} - 4 w_{3} \\ s . t . & v_{1} \geq 0, v_{2} \geq 0 \\ min_{w} & f_{l} = v_{1} + 2 v_{2} + w_{1} + w_{2} + 2 w_{3}, \\ s . t . & w_{2} + w_{3} - w_{1} \leq 1, \\ 2 v_{1} - w_{1} + 2 w_{2} - 0.5 w_{3} \leq 1, \\ 2 v_{2} + 2 w_{1} - w_{2} - 0.5 w_{3} \leq 1, \\ w_{i} \geq 0, i = 1, 2, 3 . \end{matrix}

Test Problem 16 [42]:

\begin{matrix} min_{v} & f_{u} = - 8 v_{1} - 4 v_{2} + 4 w_{1} - 40 w_{2} - 4 w_{3} \\ s . t . & v_{1} \geq 0, v_{2} \geq 0 \\ min_{w} & f_{l} = \frac{1 + v_{1} + v_{2} + 2 w_{1} - w_{2} + w_{3}}{6 + 2 v_{1} + w_{1} + w_{2} - 3 w_{3}}, \\ s . t . & - w_{1} + w_{2} + w_{3} + w_{4} = 1, \\ 2 v_{1} - w_{1} + 2 w_{2} - 0.5 w_{3} + w_{5} = 1, \\ 2 v_{2} + 2 w_{1} - w_{2} - 0.5 w_{3} + w_{6} = 1, \\ w_{i} \geq 0, i = 1, \dots, 6 . \end{matrix}

Author Contributions

B.E.: Conceptualization and software G.A.: formal analysis and writing. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Acknowledgments

The author would like to thank the anonymous referees for their valuable comments and suggestions which have helped to greatly improve this paper.

Conflicts of Interest

The authors declare no conflict of interest.

References

Bialas, W.; Karwan, M. On two-level optimization. IEEE Trans. Autom. Control. 1982, 27, 211–214. [Google Scholar] [CrossRef]
Dempe, S. Foundation of Bilevel Programming; Kluwer Academic: London, UK, 2002. [Google Scholar]
Gumus, H.; Flouda, A. Global Optimization of Nonlinear Bilevel Programming Problems. J. Glob. Optim. 2001, 20, 1–31. [Google Scholar] [CrossRef]
Muu, D.; Quy, N. A Global Optimization Method for Solving Convex Quadratic Bilevel Programming Problems. J. Glob. Optim. 2003, 26, 199–219. [Google Scholar] [CrossRef]
Abo-Elnaga, Y.; El-Shorbagy, M. Multi-Sine Cosine Algorithm for Solving Nonlinear Bilevel Programming Problems. Int. J. Comput. Intell. Syst. 2020, 13, 421–432. [Google Scholar] [CrossRef] [Green Version]
Abo-Elnaga, Y.; Nasr, S. Modified Evolutionary Algorithm and Chaotic Search for Bilevel Programming Problems. Symmetry 2020, 12, 767. [Google Scholar] [CrossRef]
Falk, J.; Liu, J. On bilevel programming, Part I: General nonlinear cases. Math. Program. 1995, 70, 47–72. [Google Scholar] [CrossRef]
Ma, L.; Wang, G. A Solving Algorithm for Nonlinear Bilevel Programing Problems Based on Human Evolutionary Model. Algorithms 2020, 13, 260. [Google Scholar] [CrossRef]
Savard, G.; Gauvin, J. The steepest descent direction for the nonlinear bilevel programming problem. Oper. Res. Lett. 1994, 15, 265–272. [Google Scholar] [CrossRef]
Edmunds, T.; Bard, J. Algorithms for nonlinear bilevel mathematical programs. IEEE Trans. Syst. Man Cybern. 1991, 21, 83–89. [Google Scholar] [CrossRef]
El-Sobky, B.; Ashry, G. An interior-point trust-region algorithm to solve a nonlinear bilevel programming problem. AIMS Math. 2022, 7, 5534–5562. [Google Scholar] [CrossRef]
Chen, J. The semismooth-related properties of a merit function and a descent method for the nonlinear complementarity problem. J. Glob. Optim. 2006, 36, 565–580. [Google Scholar] [CrossRef]
Chen, J. On some NCP-functions based on the generalized Fischer–Burmeister function. Asia-Pac. J. Oper. Res. 2007, 24, 401–420. [Google Scholar] [CrossRef] [Green Version]
Chen, J.; Pan, S. A family of NCP-functions and a descent method for the nonlinear complementarity problem. Comput. Optim. Appl. 2008, 40, 389–404. [Google Scholar] [CrossRef] [Green Version]
Facchinei, F.; Jiang, H.; Qi, L. A smoothing method for mathematical programming with equilibrium constraints. Math. Program. 1999, 85, 107–134. [Google Scholar] [CrossRef]
Byrd, R.; Hribar, M.; Nocedal, J. An interior point algorithm for largescale nonlinear programming. SIAM J. Optim. 1999, 9, 877–900. [Google Scholar] [CrossRef]
Byrd, R.; Gilbert, J.; Nocedal, J. A trust region method based on interior point techniques for nonlinear programming. Math. Program. 2000, 89, 149–185. [Google Scholar] [CrossRef] [Green Version]
Bazaraa, M.; Sherali, H.; Shetty, C. Nonlinear Programming Theory and Algorithms; John Wiley and Sons: Hoboken, NJ, USA, 2006. [Google Scholar]
Curtis, F.E.; Schenk, O.; Wachter, A. An interior-point algorithm for large-scale nonlinear optimization with inexact step computations. Siam J. Sci. Comput. 2010, 32, 3447–3475. [Google Scholar] [CrossRef]
Esmaeili, H.; Kimiaei, M. An efficient implementation of a trust-region method for box constrained optimization. J. Appl. Math. Comput. 2015, 48, 495–517. [Google Scholar] [CrossRef]
El-Sobky, B. A Multiplier active-set trust-region algorithm for solving constrained optimization problem. Appl. Math. Comput. 2012, 219, 127–157. [Google Scholar] [CrossRef]
El-Sobky, B. An active-set interior-point trust-region algorithm. Pac. J. Optim. 2018, 14, 125–159. [Google Scholar] [CrossRef] [Green Version]
El-Sobky, B.; Abotahoun, A. An active-set algorithm and a trust-region approach in constrained minimax problem. Comput. Appl. Math. 2018, 37, 2605–2631. [Google Scholar] [CrossRef]
El-Sobky, B.; Abotahoun, A. A trust-region Algorithm for Solving Mini-Max Problem. J. Comput. Math. 2018, 36, 881–902. [Google Scholar]
El-Sobky, B.; Abouel-Naga, Y. A penalty method with trust-region mechanism for nonlinear bilevel optimization problem. J. Comput. Appl. Math. 2018, 340, 360–374. [Google Scholar] [CrossRef]
El-Sobky, B.; Abo-Elnaga, Y.; Mousa, A.; El-Shorbagy, A. trust-region based penalty barrier algorithm for constrained nonlinear programming problems: An application of design of minimum cost canal sections. Mathematics 2021, 9, 1551. [Google Scholar] [CrossRef]
Kouri, D.; Heinkenschloss, M.; Ridzal, D.; van Waanders, B. A trust-region Algorithm with Adaptive Stochastic Collocation for PDE Optimization under Uncertainty. SIAM J. Sci. Comput. 2020, 35, 1847–1879. [Google Scholar] [CrossRef] [Green Version]
Li, N.; Xue, D.; Sun, W.; Wang, J. A stochastic trust-region method for unconstrained optimization problems. Math. Probl. Eng. 2019, 2019, 8095054. [Google Scholar] [CrossRef] [Green Version]
Niu, L.; Yuan, Y. A new trust region algorithm for nonlinear constrained optimization. J. Comput. Math. 2020, 28, 72–86. [Google Scholar]
Wang, X.; Yuan, Y. A trust region method based on a new affine scaling technique for simple bounded optimization. Optim. Methods Softw. 2013, 28, 871–888. [Google Scholar] [CrossRef]
Wang, X.; Yuan, Y. An augmented Lagrangian trust region method for equality constrained optimization. Optim. Methods Softw. 2015, 30, 559–582. [Google Scholar] [CrossRef]
Zeng, M.; Ni, Q. A new trust region method for nonlinear equations involving fractional mode. Pac. J. Optim. 2019, 15, 317–329. [Google Scholar]
Byrd, R. Robust trust-region methods for nonlinearly constrained optimization. In Proceedings of the Second SIAM Conference on Optimization, Houston, TX, USA, 18–20 May 1987. [Google Scholar]
Omojokun, E. Trust-Region Strategies for Optimization with Nonlinear Equality and Inequality Constraints. Ph.D. Thesis, Department of Computer Science, University of Colorado, Boulder, CO, USA, 1989. [Google Scholar]
El-Sobky, B.; Abouel-Naga, Y. Multi-objective optimal load flow problem with interior-point trust-region strategy. Electr. Power Syst. Res. 2017, 148, 127–135. [Google Scholar] [CrossRef]
Dennis, J.; El-Alem, M.; Williamson, K. A trust-region approach to nonlinear systems of equalities and inequalities. SIAM J. Optim. 1999, 9, 291–315. [Google Scholar] [CrossRef]
Dennis, J.; Heinkenschloss, M.; Vicente, L. trust-region interior-point SQP algorithms for a class of nonlinear programming problems. SIAM J. Control. Optim. 1998, 36, 1750–1794. [Google Scholar] [CrossRef] [Green Version]
Bard, J.F. Convex two-level optimization. Math. Program. 1988, 40, 15–27. [Google Scholar] [CrossRef]
Oduguwa, V.; Roy, R. Bi-level optimization using genetic algorithm. In Proceedings of the IEEE international Conference Artificial Intelligence Systems, Divnomorskoe, Russia, 5–10 September 2002; pp. 123–128. [Google Scholar]
Shimizu, K.; Aiyoshi, E. A new computational method for Stackelberg and min-max problems by use of a penalty method. IEEE Trans. Autom. Control 1981, 26, 460–466. [Google Scholar] [CrossRef]
Li, H.; Jiao, Y.; Zhang, L. Orthogonal genetic algorithm for solving quadratic bilevel programming problems. J. Syst. Eng. Electron. 2010, 21, 763–770. [Google Scholar] [CrossRef] [Green Version]
Wang, Y.; Jiao, Y.; Li, H. An evolutionary algorithm for solving nonlinear bilevel programming based on a new constraint-Handling scheme. IEEE Trans. Syst. Man Cybern. Part C 2005, 35, 221–232. [Google Scholar] [CrossRef]

Table 1. Comparisons of the results of FBACTR Algorithm 5 with the method [11] and methods in the reference.

Problem Name	$(v_{}, w_{})$ Method [11]	$f_{u}^{}$ $f_{l}^{}$ Method [11]	$(v_{}, w_{})$ FBACTR Algorithm 5	$f_{u}^{}$ $f_{l}^{}$ FBACTR Algorithm 5	$(v_{}, w_{})$ Ref.	$f_{u}^{}$ $f_{l}^{}$ Ref.
TP1	(0.8503, 0.0227,	−2.6764	(0.8465, 0.7695, 0)	−2.0772	(0.8438, 0.7657, 0)	−2.0769
	0.03589)	0.0332		−0.5919		−0.5863
TP2	(0.609, 0.391, 0,	0.6086	(0.6111, 0.3890, 0,	0.64013	(0.609, 0.391, 0,	0.6426
	0, 1.828)	1.6713	0, 1.8339)	1.6816	0, 1.828)	1.6708
TP3	(0.97, 3.14,	−8.92	(0.97, 3.14	−8.92	(0.97, 3.14,	−8.92
	2.6, 1.8)	−6.05	2.6, 1.8)	−6.05	2.6, 1.8)	−6.05
TP4	(0.5, 0.5, 0.5, 0.5)	−1	(0.5, 0.5, 0.5, 0.5)	−1	(0.5, 0.5, 0.5, 0.5)	−1
		0		0		0
TP5	(9.839, 10.059)	96.809	(9.9953, 9.9955)	99.907	(10.03, 9.969)	100.58
		0.0019		$1.8628 \times 10^{- 4}$		0.001
TP6	(1.6879, 0.8805, 0)	−1.3519	(1.8889, $8.8889 \times 10^{- 1}$ ,	−1.4074	NA	3.57
		7.4991	$6.8157 \times 10^{- 6}$ )	7.6172		2.4
TP7	(1, 0)	17	(1, 0)	17	(1, 0)	17
		1		1		1
TP8	(0.75, 0.75,	−2.25	(0.7513, 0.7513,	−2.2480	( $\sqrt{3} / 2$ , $\sqrt{3} / 2$ ,	−2.1962
	0.75, 0.75)	0	0.752, 0.752)	0	$\sqrt{3} / 2$ , $\sqrt{3} / 2$ )	0
TP9	(11.138, 5)	2209.8	(11.25, 5)	2250	(11.25, 5)	2250
		222.52		197.753		197.753
TP10	(1, 0, $6.6387 \times 10^{- 6}$ )	$6.6387 \times 10^{- 6}$	(1, 0, 1)	1	(1, 0, 1)	1
		$- 6.6387 \times 10^{- 6}$		−1		−1
TP11	(24.972, 29.653,	4.9101	(25, 30, 5, 10)	5	(25, 30, 5, 10)	5
	5.0238, 9.7565)	0.01332		0		0
TP12	(3, 5)	9	(3, 5)	9	(3, 5)	9
		0		0		0
TP13	(0, 1.7405,	−15.548	(0, 2, 1.875, 0.9063)	−12.68	(0, 2, 1.875, 0.9063)	−12.68
	1.8497, 0.9692)	−1.4247		−1.016		−1.016
TP14	(10.016, 0.81967)	81.328	(10, 0.011)	$8.1978 \times 10^{1}$	(10.04, 0.1429)	82.44
		−0.3359		0		0.271
TP15	(0, 0.9, 0, 0.6, 0.4)	−29.2	(0, 0.9, 0, 0.6, 0.4)	−29.2	(0, 0.9, 0, 0.6, 0.4)	−29.2
		3.2		3.2		3.2
TP16	(0, 0.9, 0, 0.6,	−29.2	(0, 0.9, 0, 0.6,	−29.2	(0, 0.9, 0, 0.6,	−29.2
	0.4, 0, 0, 0)	0.3148	0.4, 0, 0, 0)	0.3148	0.4, 0, 0, 0)	0.3148

Table 2. Comparisons of the results of FBACTR Algorithm 5 with method [11], method [41] and method [42] with respect to the number of iterations, the number of function evaluations, and time/s.

Problem Name	Iter Method [11]	nfunc Method [11]	CPUs Method [11]	Iter FBACTR Algorithm	nfunc FBACTR Algorithm	CPUs FBACTR Algorithm	CPUs Method [41]	CPUs Method [42]
TP1	11	12	1.43	10	13	1.62	1.734	-
TP2	10	14	1.987	9	12	1.87	2.375	-
TP3	6	8	2.9	7	8	2.52	3.315	11.854
TP4	10	14	1.68	12	13	1.92	1.576	-
TP5	6	9	1.635	6	7	1.523	1.825	5.888
TP6	6	11	4.1	8	10	3.95	4.689	25.332
TP7	12	13	1.9	11	12	1.652	1.769	-
TP8	10	11	1.002	11	12	0.953	1.124	-
TP9	10	13	1.95	8	10	1.87	-	-
TP10	5	7	2.987	5	6	3.31	-	-
TP11	9	12	3.742	10	13	3.632	-	37.308
TP12	8	9	1.23	7	9	1.33	-	-
TP13	5	7	2.1	5	8	1.998	-	14.42
TP14	6	8	2.12	5	6	1.97	-	4.218
TP15	5	6	20.512	6	7	20.125	-	45.39
TP16	5	7	40.319	4	5	35.21	-	107.55

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Elsobky, B.; Ashry, G. An Active-Set Fischer–Burmeister Trust-Region Algorithm to Solve a Nonlinear Bilevel Optimization Problem. Fractal Fract. 2022, 6, 412. https://doi.org/10.3390/fractalfract6080412

AMA Style

Elsobky B, Ashry G. An Active-Set Fischer–Burmeister Trust-Region Algorithm to Solve a Nonlinear Bilevel Optimization Problem. Fractal and Fractional. 2022; 6(8):412. https://doi.org/10.3390/fractalfract6080412

Chicago/Turabian Style

Elsobky, Bothina, and Gehan Ashry. 2022. "An Active-Set Fischer–Burmeister Trust-Region Algorithm to Solve a Nonlinear Bilevel Optimization Problem" Fractal and Fractional 6, no. 8: 412. https://doi.org/10.3390/fractalfract6080412

Article Menu

An Active-Set Fischer–Burmeister Trust-Region Algorithm to Solve a Nonlinear Bilevel Optimization Problem

Abstract

1. Introduction

2. Active-Set with Trust-Region Technique

2.1. A Trust-Region Technique

2.2. Fischer–Burmeister Active-Set Trust-Region Algorithm

3. Global Convergence Analysis

3.1. A Standard Assumptions

3.2. Main Lemmas

3.3. Convergence When the Positive Parameter $r_{k} \to \infty$

3.4. Global Convergence When $r_{k}$ Is Bounded

3.5. Main Results for Global Convergence

4. Numerical Results and Comparisons

5. Conclusions

Author Contributions

Funding

Institutional Review Board Statement

Informed Consent Statement

Acknowledgments

Conflicts of Interest

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI

Article Menu

An Active-Set Fischer–Burmeister Trust-Region Algorithm to Solve a Nonlinear Bilevel Optimization Problem

Abstract

1. Introduction

2. Active-Set with Trust-Region Technique

2.1. A Trust-Region Technique

2.2. Fischer–Burmeister Active-Set Trust-Region Algorithm

3. Global Convergence Analysis

3.1. A Standard Assumptions

3.2. Main Lemmas

3.3. Convergence When the Positive Parameter r k → ∞

3.4. Global Convergence When r k Is Bounded

3.5. Main Results for Global Convergence

4. Numerical Results and Comparisons

5. Conclusions

Author Contributions

Funding

Institutional Review Board Statement

Informed Consent Statement

Acknowledgments

Conflicts of Interest

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI

3.3. Convergence When the Positive Parameter $r_{k} \to \infty$

3.4. Global Convergence When $r_{k}$ Is Bounded