Improvement of Unconstrained Optimization Methods Based on Symmetry Involved in Neutrosophy

Stanimirović, Predrag S.; Ivanov, Branislav; Stanujkić, Dragiša; Katsikis, Vasilios N.; Mourtas, Spyridon D.; Kazakovtsev, Lev A.; Edalatpanah, Seyyed Ahmad

doi:10.3390/sym15010250

Open AccessArticle

Improvement of Unconstrained Optimization Methods Based on Symmetry Involved in Neutrosophy

by

Predrag S. Stanimirović

^1,2,*

,

Branislav Ivanov

³

,

Dragiša Stanujkić

³

,

Vasilios N. Katsikis

⁴

,

Spyridon D. Mourtas

^2,4

,

Lev A. Kazakovtsev

^2,5

and

Seyyed Ahmad Edalatpanah

⁶

¹

Faculty of Sciences and Mathematics, University of Niš, Višegradska 33, 18000 Niš, Serbia

²

Laboratory “Hybrid Methods of Modelling and Optimization in Complex Systems”, Siberian Federal University, Prosp. Svobodny 79, Krasnoyarsk 660041, Russia

³

Technical Faculty in Bor, University of Belgrade, Vojske Jugoslavije 12, 19210 Bor, Serbia

⁴

Department of Economics, Division of Mathematics and Informatics, National and Kapodistrian University of Athens, Sofokleous 1 Street, 10559 Athens, Greece

⁵

Institute of Informatics and Telecommunications, Reshetnev Siberian State University of Science and Technology, Prosp. Krasnoyarskiy Rabochiy 31, Krasnoyarsk 660037, Russia

⁶

Department of Applied Mathematics, Ayandegan Institute of Higher Education, Tonekabon P.O. Box 46818-53617, Mazandaran, Iran

^*

Author to whom correspondence should be addressed.

Symmetry 2023, 15(1), 250; https://doi.org/10.3390/sym15010250

Submission received: 1 December 2022 / Revised: 7 January 2023 / Accepted: 12 January 2023 / Published: 16 January 2023

(This article belongs to the Special Issue Nonlinear Analysis and Its Applications in Symmetry II)

Download

Browse Figures

Versions Notes

Abstract

:

The influence of neutrosophy on many fields of science and technology, as well as its numerous applications, are evident. Our motivation is to apply neutrosophy for the first time in order to improve methods for solving unconstrained optimization. Particularly, in this research, we propose and investigate an improvement of line search methods for solving unconstrained nonlinear optimization models. The improvement is based on the application of symmetry involved in neutrosophic logic in determining appropriate step size for the class of descent direction methods. Theoretical analysis is performed to show the convergence of proposed iterations under the same conditions as for the related standard iterations. Mutual comparison and analysis of generated numerical results reveal better behavior of the suggested iterations compared with analogous available iterations considering the Dolan and Moré performance profiles and statistical ranking. Statistical comparison also reveals advantages of the neutrosophic improvements of the considered line search optimization methods.

Keywords:

unconstrained optimization; neutrosophic logic systems; gradient descent methods; convergence

MSC:

90C70; 90C30; 65K05

1. Introduction, Preliminaries, and Motivation

We investigate applications of neutrosophic logic in determining an additional step size in gradient descent methods for solving the multivariate unconstrained optimization problem

min f (x), x \in R^{n},

(1)

in which the objective

f : R^{n} \to R

is uniformly convex and twice continuously differentiable.

The most general iteration aimed to solve (1) is the descent direction (DD) method

x_{k + 1} = x_{k} + t_{k} d_{k},

(2)

such that

x_{k + 1}

is the actual approximation,

x_{k}

is the former approximation,

t_{k} > 0

is a step size, and

d_{k}

is an appropriate search direction that satisfies the descent condition

g_{k}^{T} d_{k} < 0

, in which

g_{k} = \nabla f (x_{k})

stands for the gradient vector of the objective f. The most common choice is the antigradient direction

d_{k} = - g_{k}

, leading to the gradient descent (

G D

) iterations

x_{k + 1} = x_{k} - t_{k} g_{k},

(3)

in which the learning rate

t_{k}

is typically determined by an inexact line search procedure. The iterative rule of the general quasi-Newton (QN) class of iterations with line search

x_{k + 1} = x_{k} - t_{k} H_{k} g_{k}

(4)

utilizes an appropriate symmetric and positive definite estimation

B_{k}

of the Hessian

G_{k} = ▽^{2} f (x_{k})

and

H_{k} = B_{k}^{- 1}

[1]. The upgrade

B_{k + 1}

from

B_{k}

is established based on the QN characteristic

B_{k + 1} ς_{k} = ξ_{k}, such that ς_{k} = x_{k + 1} - x_{k}, ξ_{k} = g_{k + 1} - g_{k} .

(5)

Computation of the Hessian or its approximations that include matrix operations is time-consuming and prohibitive. Following the goal to make optimization methods efficient in solving large-scale problems, we use the simplest scalar Hessian’s approximation [2,3]:

B_{k} = γ_{k} I, γ_{k} > 0 .

(6)

In this paper, we are interested in the following iterative scheme

x_{k + 1} = x_{k} - γ_{k}^{- 1} t_{k} g_{k} .

(7)

Iterations (7) are introduced as improved gradient descent (

I G D

) methods. The roles of the additional step

γ_{k}

and the basic step length

t_{k}

are clearly separated and complement each other. The quantity

t_{k}

is defined as the output of an inexact line search methodology, while

γ_{k}

is calculated based on Taylor series of

f (x)

.

Diverse forms and improvements of the

I G D

iterative scheme (7) were suggested in [4,5,6,7,8]. The

S M

method proposed in [6] corresponds to the iteration

x_{k + 1} = x_{k} - t_{k} {(γ_{k}^{S M})}^{- 1} g_{k},

(8)

where

γ_{k}^{S M} > 0

is the gain parameter determined utilizing the Taylor approximation of

f (x_{k} - t_{k} {(γ_{k}^{S M})}^{- 1} g_{k})

, which results

γ_{k + 1}^{S M} = ℧ (2 γ_{k}^{S M} \frac{γ_{k}^{S M} Δ_{k} + t_{k} {∥ g_{k} ∥}^{2}}{t_{k}^{2} {∥ g_{k} ∥}^{2}}),

such that

f_{p} : = f (x_{p})

,

Δ_{k} : = f_{k + 1} - f_{k}

and

℧ (x) = \{\begin{matrix} x, & x > 0 \\ 1, & x \leq 0 . \end{matrix}

The modification of the

S M

method was defined as the transformation

M S M = M (S M)

[9]

x_{k + 1} = M (S M) (x_{k}) = x_{k} - t_{k} τ_{k} {(γ_{k}^{M S M})}^{- 1} g_{k},

(9)

where

t_{k} \in (0, 1)

is defined by the backtracking search,

τ_{k} = 1 + t_{k} - t_{k}^{2}

, and

γ_{k + 1}^{M S M} = ℧ (2 γ_{k}^{M S M} \frac{γ_{k}^{M S M} Δ_{k} + t_{k} τ_{k} {∥ g_{k} ∥}^{2}}{{(t_{k} τ_{k})}^{2} {∥ g_{k} ∥}^{2}}) .

(10)

We propose improvements of line search iterative rules for solving (1). The main idea is based on the application of neutrosophic logic in determining appropriate step length for various gradient descent rules. This idea is based on the hybridization principle proposed in [5,9,10], where an appropriate correction parameter

α_{k}

with a fixed value is used. A hybridization of the

S M

iterations (termed

H S M

) was introduced in [5] as the iterative rule

x_{k + 1} = H (S M) (x_{k}) = x_{k} - (η_{k} + 1) {(γ_{k}^{H S M})}^{- 1} t_{k} g_{k},

(11)

such that

η_{k}

is the correction quantity and

γ_{k}^{H S M}

is the gain value defined as

γ_{k + 1}^{H S M} = ℧ (2 γ_{k}^{H S M} \frac{γ_{k}^{H S M} Δ_{k} + (η_{k} + 1) t_{k} {∥ g_{k} ∥}^{2}}{{(η_{k} + 1)}^{2} t_{k}^{2} {∥ g_{k} ∥}^{2}}) .

The hybridizations of several

I G D

methods, including the

M S M

method, were proposed and investigated in [9,10]. An overview of methods derived by the hybridization of

I G D

iterations with the Picard–Mann, Ishikawa, and Khan iterative processes [11,12,13] was given in [14]. Some common fixed point results for fuzzy mappings were derived in [15]. A detailed numerical comparison between hybrid and nonhybrid

I G D

methodswas performed in [14]. Four gradient descent algorithms with adaptive step size were proposed and investigated in [16].

Our goal in this paper is to use an adaptive neutrosophic logic parameter

ν_{k}

instead of the fixed correction parameter

η_{k} + 1

in determining appropriate step sizes for various gradient descent methods. The parameter

ν_{k}

in each iteration will be determined on the basis of the neutrosophic logic controller (NLC).

Consider the universe

U

. The fuzzy set theory relies on a membership function

T (u) \in [0, 1], u \in U

[17]. In addition, a fuzzy set

N

over

U

is a set of ordered pairs

N = {〈 u, T (u) 〉 | u \in U} .

The intuitionistic fuzzy set (IFS) was established based on the nonmembership function

F (u) \in [0, 1], u \in U

[18]. Following the philosophy of using two opposing membership functions, an IFS

N

in

U

is defined as the set of ordered triples

N = {〈 u, T (u), F (u) 〉 | u \in U},

which are based on the independence of the members, that is

T (u), F (u) : U \to [0, 1]

and

0 \leq T (u) + F (u) \leq 1

.

The IFS theory was extended by Smarandache in [19] and Wang et al. [20]. The novelty is the introduction of the indeterminacy-membership function

I (u)

, which symbolizes hesitation in a decision-making process. As a result, elements of a set in the neutrosophic theory are defined by three individualistic membership functions [19,20] defined by the rules of symmetry: the truth-membership function

T (u)

, the indeterminacy-membership function

I (u)

, and the falsity-membership

F (u)

function. A single-valued neutrosophic set (SVNS)

N

over

U

is the set of neutrosophic numbers of the form

N = \{〈 u, T (u), I (u), F (u) 〉 | u \in U\} .

Values of the membership functions independently take values from

[0, 1]

, which initiates

T (u), I (u), F (u) : U \to [0, 1]

and

0 \leq T (u) + I (u) + F (u) \leq 3

.

A neutrosophic set is symmetric in nature since the indeterminacy

I

appears in the middle between the Truth

T

and False

F

[21,22]. Furthermore, a refined neutrosophic set with two indeterminacies

I_{1}

and

I_{2}

in the middle between

T

and

F

also includes a kind of symmetry [22]. In [23], the authors firstly introduced a normalized and a weighted symmetry measure of simplified neutrosophic sets and then proposed a neutrosophic multiple criteria decision-making method based on the introduced symmetry estimate.

Fuzzy logic (FL), intuitionistic fuzzy logic (IFL), and neutrosophic logic (NL) appear as efficient tools to handle mathematical models with uncertainty, fuzziness, ambiguity, inaccuracy, incomplete certainty, incompleteness, inconsistency, and redundancy. NL can be considered as one of the new theories based on the fundamental principles of neutrosophy, which actually belongs to the group of many-valued logics and actually represents an extension of FL. NL can also be considered as a new branch of logic that deals with the shortcomings of FL and classical logic, as well as IFL. Some of the disadvantages of FL, such as the failure to handle inconsistent information, are significantly reduced by applying NL. Truth and falsity in NL are independent, while in IFL they are dependent. Neutrosophic logic can manipulate both incomplete and inconsistent data. Thus, there is a need to explore the use of NL in various domains from medical treatment to the role of recommendation systems using new advanced computational intelligence techniques. An NL is a better choice than the FL and IFL in the representation of real-world data and their executions, because of the following reasons:

(a): FL and IFL systems neglect the importance of indeterminacy. A fuzzy logic controller (FLC) is based on membership and nonmembership of a particular element to a particular set and take into account the indeterminate nature of generated data.
(b): An FL or IFL system is further constrained by the fact that the sum of membership and nonmembership values is limited to 1. More details are available in [24].
(c): NL reasoning clearly distinguishes concepts of absolute truth and relative truth, assuming the existence of the absolute truth with assigned value $1^{+}$ .
(d): NL is applicable in the situation of overlapping regions of the fuzzy systems [25].

Neutrosophic sets (NS) have important applications for denoising, clustering, segmentation, and classification in numerous medical image-processing applications. A utilization of neutrosophic theory in denoising medical images and their segmentation was proposed in [26], such that a neutrosophic image is characterized by three membership sets. Several applications of neutrosophic systems were described in [27]. An application of neutrosophy in natural language processing and sentiment analysis was investigated in [22].

Our goal in the present paper is to improve some of the main gradient descent methods for solving unconstrained nonlinear optimization problems utilizing the advantages of neutrosophic systems. Principal results of the current investigation are emphasized as follows.

(1): We investigate applications of neutrosophic logic in determining an additional step size in line search methods for solving the unconstrained optimization problem.
(2): Applications of neutrosophic logic in multiple step-size methods for solving unconstrained optimization problems are described and investigated.
(3): Rigorous theoretical analysis is performed to show convergence of the proposed iterations under the same conditions as for the corresponding original methods.
(4): Numerical comparison between suggested algorithms given the corresponding available iterations considering the Dolan and Moré benchmarking and the statistical ranking is presented.

The remaining sections are developed according to the following arrangement. Optimization methods based on additional neutrosophic parameters are presented in Section 2. Convergence analysis is investigated in Section 3. Section 4 gives numerical experiments and comparisons. Section 4 gives numerical experiments and compares the MSM, SM, and GD methods with the neutrosophic extensions FMSM, FSM, and FGD methods, equipped with neutrosophic control. Moreover, the application of the new methods in regression analysis is given within this section. Some closing remarks and a vision of future investigation are presented in Section 5.

2. Fuzzy Optimization Methods

Fuzzy descent direction (

F D D

) iterations are defined as a modification of the

D D

iterations (2), as follows:

x_{k + 1} = Φ (D D) (x_{k}) = x_{k} + ν_{k} t_{k} d_{k},

(12)

where

ν_{k} > 0

is an appropriately defined fuzzy parameter. In general,

ν_{k}

should satisfy

ν_{k} \{\begin{matrix} < 1, & if Δ_{k} > 0, \\ = 1, & if Δ_{k} = 0, \\ > 1, & if Δ_{k} < 0 . \end{matrix}

(13)

The main idea used in (13) is to decrease the composite step size

ν_{k} t_{k}

of iterations (12) in the case where f increases and increase

ν_{k} t_{k}

in the case when f decreases.

We define the general fuzzy

Q N

(

F Q N

) iterative scheme with the line search as

x_{k + 1} = Φ (Q N) (x_{k}) = Φ (x_{k} - H_{k} g_{k}) = x_{k} - ν_{k} H_{k} g_{k},

(14)

The fuzzy

G D

method (

F G D

) is defined by

x_{k + 1} = Φ (G D) (x_{k}) = Φ (x_{k} - t_{k} g_{k}) = x_{k} - ν_{k} t_{k} g_{k} .

(15)

The fuzzy SM method (FSM) is defined as

x_{k + 1} = Φ (S M) (x_{k}) = x_{k} - ν_{k} t_{k} {(γ_{k}^{F S M})}^{- 1} g_{k},

(16)

where

γ_{k + 1}^{F S M} = ℧ (2 γ_{k}^{F S M} \frac{γ_{k}^{F S M} Δ_{k} + ν_{k} t_{k} {∥ g_{k} ∥}^{2}}{{(ν_{k} t_{k})}^{2} {∥ g_{k} ∥}^{2}}) .

(17)

Starting from (9) and (14), we define the fuzzy

M S M

method (

F M S M

) by

x_{k + 1} = Φ (M S M) (x_{k}) = x_{k} - ν_{k} t_{k} τ_{k} {(γ_{k}^{F M S M})}^{- 1} g_{k},

(18)

where

γ_{k + 1}^{F M S M} = ℧ (2 γ_{k}^{F M S M} \frac{γ_{k}^{F M S M} Δ_{k} + ν_{k} t_{k} τ_{k} {∥ g_{k} ∥}^{2}}{{(ν_{k} t_{k} τ_{k})}^{2} {∥ g_{k} ∥}^{2}}) .

(19)

Table 1 summarizes different steps utilized in the iterations utilized in this paper, in which the strike means absence of a suitable parameter.

Algorithm 1, restated from [6,28], is exploited to determine the step length

t_{k}

.

Algorithm 1 The backtracking inexact line search.

Input: Goal function

f (x)

, a vector

d_{k}

at

x_{k}

and real quantities

0 < σ < 0.5

,

β \in (0, 1)

.
1:

t = 1

2: While

f (x_{k} + t d_{k}) > f (x_{k}) + σ t g_{k}^{T} d_{k}

, perform

t : = t β

.
3: Output:

t_{k} = t

.

Algorithm 2 describes the general framework of the

F D D

class of methods.

Algorithm 2 Framework of

F D D

methods.

Input: Objective

f (x)

and an initial point

x_{0} \in dom (f)

.
1: Put

k = 0

,

ν_{0} = 1

, calculate

f (x_{0})

,

g_{0} = \nabla f (x_{0})

, and generate a descent direction

d_{0}

.
2: If stopping indicators are fulfilled, then stop; otherwise, go to the subsequent step.
3: (Backtracking) Determine

t_{k} \in (0, 1]

applying Algorithm 1.
4: Compute

x_{k + 1}

using (12).
5: Compute

f (x_{k + 1})

and generate descent vector

d_{k + 1}

.
6: (Score function) Compute

Δ_{k} : = f_{k + 1} - f_{k}

.
7: (Neutrosophistication) Compute

T (Δ_{k}), I (Δ_{k}), F (Δ_{k})

using appropriate membership functions.
8: Define neutrosophic inference engine.
9: (De-neutrosophistication) Compute

ν_{k} (Δ_{k})

using de-neutrosophication rule.
10:

k : = k + 1

and go to step 2.
11: Output:

{x_{k + 1}, f (x_{k + 1})}

.

It is worth mentioning that the general structure of fuzzy neutrosophic optimization methods follows the philosophy described in the diagram of Figure 1.

FMSM Method

To define the FMSM method, we need to define the steps Score function, neutrosophistication and de-neutrosophistication in Algorithm 2.

(1): Neutrosophication. Using three membership functions, neutrosophic logic maps the input $ϑ : = f (x_{k}) - f (x_{k + 1})$ into neutrosophic triplets $(T (ϑ), I (ϑ), F (ϑ))$ .
The truth-membership function is defined as the sigmoid function:

$T (ϑ) = 1 / (1 + e^{- c_{1} (ϑ - c_{2})}) .$

(20)

The parameter $c_{1}$ is responsible for its slope at the crossover point $ϑ = c_{2}$ . The falsity-membership function is the sigmoid function:

$F (ϑ) = 1 / (1 + e^{c_{1} (ϑ - c_{2})}) .$

(21)

The indeterminacy-membership function is the Gaussian function:

$I (ϑ) = e^{- \frac{{(ϑ - c_{2})}^{2}}{2 c_{1}^{2}}},$

(22)

where the parameter $c_{1}$ stands for the standard deviation, and the parameter $c_{2}$ is the mean. The neutrosophication of the crisp value $ϑ \in R$ used in the implementation is the transformation of $ϑ$ into $〈 ϑ : T (ϑ), I (ϑ), F (ϑ) 〉$ , where the membership functions are defined in (20)–(22).
Since the final goal is to minimize $f (x)$ , it is reasonable to use $Δ_{k}$ as a measure in the developed NLC. So, we consider the dynamic neutrosophic set (DNS) defined by $D : = \{〈 T (Δ_{k}), I (Δ_{k}), F (Δ_{k}) 〉; Δ_{k} \in R\}$ .
(2): Neutrosophic inference engine: The neutrosophic rule between the fuzzy input set $I$ and the fuzzy output set under the neutrosophic format $O = {T, I, F}$ is described by the following “IF–THEN” rules:

$\begin{array}{l} R_{1} : If I = P then O = {T, I, F} \\ R_{2} : If I = N then O = {T, I, F} . \end{array}$

The notations P and N stand for fuzzy sets and exactly indicate a positive and negative error, respectively. Using the unification $R = R_{1} \cup R_{2}$ , we obtain $O_{i} = I \circ R_{i}$ , $i = 1, 2$ , where ∘ symbolizes the fuzzy transformation. Furthermore, it follows that $κ_{I \circ R} (ζ) = κ_{I \circ R_{1}} ⋁ κ_{I \circ R_{2}}$ , $κ_{I \circ R} (ζ) = sup (κ_{I} ⋀ κ_{O_{i}})$ , and $i = 1, 2$ , where ⋀ (resp. ⋁) denotes the $(min, max, max)$ operator, (resp. $(max, min, min)$ operator). The process of turning the fuzzy outputs into a single, crisp output value is known as defuzzification. There are various defuzzification methods that can be used to perform this procedure. The centroid method, the weighted average method, and the max or mean–max membership principles are some popular defuzzification methods. In this study, the following defuzzification method, called centroid, is employed to obtain a vector of crisp outputs $ζ^{*} = [T (Δ_{k}), I (Δ_{k}), F (Δ_{k})] \in R^{3}$ of the fuzzy vector $ζ = {T (Δ_{k}), I (Δ_{k}), F (Δ_{k})}$ :

$ζ^{*} = \frac{\int_{O} ζ κ_{I \circ R} (ζ) d ζ}{\int_{O} κ_{I \circ R} (ζ) d ζ} .$

(23)
(3): De-neutrosophication. This step assumes conversion $〈 T (Δ_{k}), I (Δ_{k}), F (Δ_{k}) 〉 \to ν_{k} (Δ_{k}) \in R$ resulting in a single (crisp) value $ν_{k} (Δ_{k})$ .
The following de-neutrosophication rule is proposed to obtain the parameter $ν_{k} (Δ_{k})$ using the rule (24), which follows the constraints stated in (13):

$ν_{k} (Δ_{k}) = \{\begin{matrix} 1 - (T (Δ_{k}) + I (Δ_{k}) + F (Δ_{k})) / c_{1}, & Δ_{k} > 0 \\ 1, & Δ_{k} = 0 \\ 3 - (T (Δ_{k}) + I (Δ_{k}) + F (Δ_{k})), & Δ_{k} < 0 . \end{matrix}$

(24)

The parameter $c_{1} \geq 3$ maintains the lower limit $ν_{k} (Δ_{k}) < 1$ of $ν_{k} (Δ_{k})$ in the case $Δ_{k} > 0$ . Moreover, definition (24) assumes that the membership functions must satisfy $T (Δ_{k}) + I (Δ_{k}) + F (Δ_{k}) < 2$ in the case $Δ_{k} > 0$ .

For better understanding, the NLC structure decomposed by the neutrosophic rules is presented in the diagram of Figure 2. It is crucial to remember that the NLC controller structure was built specifically to solve the issues discussed in this paper, including the membership functions chosen, the number of fuzzy rules chosen, the defuzzification method chosen, and the de-neutrosophication method chosen. As a result, the NLC controller structure is heuristic, and different structures can be required for various applications.

The utilized settings for the NLC employed in all numerical experiments and graphs of this paper are presented in Table 2.

Our imperative requirement is

ν_{k} (Δ_{k}) \geq 0

. The fulfillment of this requirement immediately follows from the membership values

T (Δ_{k}), F (Δ_{k}), I (Δ_{k})

during the neutrosophication process, which are presented in Figure 3a. The NLC output value,

ν_{k} (Δ_{k})

, during the de-neutrosophication process is presented in Figure 3b.

Figure 3 clearly shows that (24) satisfies basic requirements imposed in (13). More precisely, graphs in Figure 3 show

1 - (T (Δ_{k}) + I (Δ_{k}) + F (Δ_{k})) / c_{1} < 1

in the case

Δ_{k} > 0

, and

3 - (T (Δ_{k}) + I (Δ_{k}) + F (Δ_{k})) \geq 1

in the case

Δ_{k} < 0

.

Remark 1.

During the iterations, the function decreases and tends to the minimum, so

{lim}_{k \to \infty} Δ_{k} = 0

, that is,

{lim}_{k \to \infty} ν_{k} (Δ_{k}) = 1

. This observation leads to the conclusion that the parameter

ν_{k} \to 1

decreases as we approach the minimum of the function, and thus the influence of neutrosophy on the gradient methods decreases. Such desirable behavior of

ν_{k} (Δ_{k})

was our intention.

Algorithm 3 is the algorithmic framework of the FMSM method.

Algorithm 3 Framework of

F M S M

method.

Input: Objective

f (x)

and appropriate initialization

x_{0} \in dom (f)

.
1: Put

k = 0

and compute

f (x_{0})

,

g_{0} = \nabla f (x_{0})

and take

γ_{0} = 1

,

ν_{0} = 1

.
2: If stopping criteria are satisfied, then stop; otherwise, go to the subsequent step.
3: (Backtracking) Find the step size

t_{k} \in (0, 1]

using Algorithm 1 utilizing the search direction

d_{k} = - ν_{k} τ_{k} {(γ_{k}^{F M S M})}^{- 1} g_{k}

.
4: Compute

x_{k + 1}

using (18).
5: Calculate

f (x_{k + 1})

and

g_{k + 1} = \nabla f (x_{k + 1})

.
6: Compute

γ_{k + 1}^{F M S M}

applying (19).
7: Compute

Δ_{k} : = f_{k + 1} - f_{k}

.
8: Compute

T (Δ_{k}), I (Δ_{k}), F (Δ_{k})

using (20)–(22), respectively.
9: Compute

ζ^{*} = [T (Δ_{k}), I (Δ_{k}), F (Δ_{k})]

using (23).
10: Compute

ν_{k} : = ν_{k} (Δ_{k})

using (24).
11: Put

k : = k + 1

, and go to Step 2.
12: Return

{x_{k + 1}, f (x_{k + 1})}

.

3. Convergence Analysis

The following assumptions are necessary, and the following auxiliary results are useful.

Assumption 1.

(1)

The level set

M = {x \in R^{n} | f (x) \leq f (x_{0})}

, defined around the initial iterate

x_{0}

of (2), is bounded.

(2)

The objective f is continuous and differentiable in a neighborhood

P

of

M

, and its gradient g is Lipschitz continuous, i.e., there exists

L > 0

, which satisfies

∥ g (v) - g (w) ∥ \leq L ∥ v - w ∥, \forall v, w \in P .

(25)

Several useful results from [28,29,30] and [31,32] are restated for completeness. Let

d_{k}

be chosen as a descent direction, and let the gradient

g (x)

fulfill the Lipschitz requirement (25). The step length

t_{k}

derived in the backtracking Algorithm 1 satisfies

t_{k} \geq min \{1, - \frac{β (1 - σ)}{L} \frac{g_{k}^{T} d_{k}}{∥ d_{k} ∥^{2}}\} .

(26)

The notation

f \in ℜ^{n}

(resp.

f \in ℑ^{n}

) is used to indicate that

f : R^{n} \to R

is twice continuously differentiable and uniformly convex (resp. uniformly convex) on

R^{n}

. From [31,32], it follows that Assumption 1 is satisfied if

f \in ℜ^{n}

.

Lemma 1

([31,32]). Assumption

f \in ℜ^{n}

implies the existence of real numbers m, M, such that

0 < m \leq 1 \leq M .

(27)

Moreover,

f (p)

possesses a unique minimum

p^{*}

, such that

\begin{matrix} {m ∥ q ∥}^{2} \leq q^{T} \nabla^{2} f (p) q \leq M {∥ q ∥}^{2}, \forall p, q \in R^{n}; \end{matrix}

(28)

\begin{matrix} \frac{1}{2} m ∥ p - p^{*} ∥^{2} \leq f (p) - f (p^{*}) \leq \frac{1}{2} M {∥ p - p^{*} ∥}^{2}, \forall p \in R^{n}; \end{matrix}

(29)

\begin{matrix} {m ∥ p - q ∥}^{2} \leq {(g (p) - g (q))}^{T} (p - q) \leq M {∥ p - q ∥}^{2}, \forall p, q \in R^{n} . \end{matrix}

(30)

For simplicity, denote the

S M

and

M S M

iterations as

x_{k + 1}^{(M) S M} = x_{k}^{(M) S M} - t_{k} ω_{k} {(γ_{k}^{(M) S M})}^{- 1} g_{k},

where

x_{k}^{(M) S M}

denotes

x_{k}^{S M}

(resp.

x_{k}^{M S M}

) in the case of the

S M

(resp.

M S M

) method and

ω_{k} = 1

(resp.

ω_{k} = τ_{k} : = 1 + t_{k} - t_{k}^{2}

) in the case of the

S M

(resp.

M S M

) method. Similarly, the

F S M

and

F M S M

iterations are denoted by the common notation

x_{k + 1}^{F (M) S M} = x_{k}^{F (M) S M} - ν_{k} t_{k} ω_{k} {(γ_{k}^{F (M) S M})}^{- 1} g_{k},

where

x_{k}^{F (M) S M}

denotes

x_{k}^{F S M}

(resp.

x_{k}^{F M S M}

) in the case of the

F S M

(resp.

F M S M

) method and

ω_{k} = 1

(resp.

ω_{k} = τ_{k}

) in the case of the

F S M

(resp.

F M S M

) method. Since the scalar matrix approximation of the Hessian enables to assume that f is twice continuously differentiable, instead of (28) and (27), we assume only the following bounds for

γ_{k}^{F (M) S M}

:

m \leq γ_{k}^{F (M) S M} \leq M, 0 < m \leq 1 \leq M, m, M \in R .

(31)

In addition,

f \in ℜ^{n}

reduces to

f \in ℑ^{n}

.

Lemma 2 estimates the iterative decreasing of f ensured by

S M

and

M S M

iterations.

Lemma 2

([6,9]). Let

f \in ℑ^{n}

and (31) be valid. Then, the

S M

sequence

{x_{k}}

produced by (8), and the

M S M

sequence

{x_{k}}

produced by (9), satisfy

f (x_{k}^{(M) S M}) - f (x_{k + 1}^{(M) S M}) \geq μ {∥ g_{k}^{} ∥}^{2},

(32)

such that

μ = min \{\frac{σ}{M}, \frac{σ (1 - σ)}{L} β\} .

(33)

Theorem 1 investigates the convergence of the

F M S M

and

F S M

iterative sequences.

Theorem 1.

Let

f \in ℑ^{n}

and (31) be valid. Under these conditions, the

F S M

sequence induced by (16), and the

F M S M

sequence induced by (18), satisfy

f (x_{k}^{F (M) S M}) - f (x_{k + 1}^{F (M) S M}) \geq μ_{ν_{k}} {∥ g_{k} ∥}^{2},

(34)

such that

μ_{ν_{k}} = min \{\frac{σ ν_{k}}{M}, \frac{σ (1 - σ)}{L} β\} .

(35)

Proof.

The

F S M

and

F M S M

iterations

x_{k + 1}^{F (M) S M} = x_{k}^{F (M) S M} - t_{k} ν_{k} ω_{k} {(γ_{k}^{F (M) S M})}^{- 1} g_{k}

are of the general

D D

pattern

x_{k + 1} = x_{k} + t_{k} d_{k}

in the case

d_{k} = - ν_{k} ω_{k} {(γ_{k}^{F (M) S M})}^{- 1} g_{k}

. According to the stopping condition used in Algorithm 1, it follows

f (x_{k}^{F (M) S M}) - f (x_{k + 1}^{F (M) S M}) \geq - σ t_{k} g_{k}^{T} d_{k}, \forall k \in N .

(36)

In the occurrence

t_{k} < 1

, using (36) with

d_{k} = - ν_{k} ω_{k} {(γ_{k}^{F (M) S M})}^{- 1} g_{k}

, one obtains

f (x_{k}^{F (M) S M}) - f (x_{k + 1}^{F (M) S M}) \geq - σ t_{k} g_{k}^{T} d_{k} = - σ t_{k} g_{k}^{T} (- ν_{k} ω_{k} {(γ_{k}^{F (M) S M})}^{- 1} g_{k}) .

(37)

Now, (26) implies

\begin{array}{l} t_{k} \geq - \frac{β (1 - σ)}{L} \cdot \frac{g_{k}^{T} d_{k}}{∥ d_{k} ∥^{2}} & = - \frac{β (1 - σ)}{L} \cdot \frac{g_{k}^{T} (- ν_{k} ω_{k} {(γ_{k}^{F (M) S M})}^{- 1} g_{k})}{{∥- ν_{k} ω_{k} {(γ_{k}^{F (M) S M})}^{- 1} g_{k}∥}^{2}} \\ = \frac{β (1 - σ)}{L} \cdot \frac{ν_{k} ω_{k} {(γ_{k}^{F (M) S M})}^{- 1} {∥ g_{k} ∥}^{2}}{ν_{k}^{2} ω_{k}^{2} {(γ_{k}^{F (M) S M})}^{- 2} {∥ g_{k} ∥}^{2}} \\ = \frac{β (1 - σ)}{L} \cdot \frac{γ_{k}^{F (M) S M}}{ν_{k} ω_{k}} . \end{array}

Now, (37), in conjunction with the last inequality, initiates

\begin{array}{l} f (x_{k}^{F (M) S M}) - f (x_{k + 1}^{F (M) S M}) & \geq σ t_{k} ν_{k} ω_{k} {(γ_{k}^{F (M) S M})}^{- 1} g_{k}^{T} g_{k} \\ \geq σ \frac{β (1 - σ)}{L} \cdot \frac{γ_{k}^{F (M) S M}}{ν_{k} ω_{k}} ν_{k} ω_{k} {(γ_{k}^{F (M) S M})}^{- 1} g_{k}^{T} g_{k} \\ \geq σ \frac{(1 - σ) β}{L} {∥ g_{k} ∥}^{2} . \end{array}

According to (31), in the occurrence

t_{k} = 1

, we conclude

\begin{array}{l} f (x_{k}^{F (M) S M}) - f (x_{k + 1}^{F (M) S M}) & \geq - σ g_{k}^{T} d_{k} = - σ g_{k}^{T} (- ν_{k} ω_{k} {(γ_{k}^{F (M) S M})}^{- 1} g_{k}) \\ = \frac{σ ν_{k}}{γ_{k}^{F (M) S M}} ∥ g_{k} ∥^{2} \\ \geq \frac{σ ν_{k}}{M} {∥ g_{k} ∥}^{2} . \end{array}

Starting from the above two inequalities, we obtain (34) in both possible situations,

t_{k} < 1

and

t_{k} = 1

, which completes the statement. □

Remark 2.

Based on (32) and (34), respectively, it follows

f (x_{k}^{F (M) S M}) - f (x_{k + 1}^{F (M) S M}) \in [μ_{ν_{k}} {∥ g_{k} ∥}^{2}, + \infty)

and

f (x_{k}^{(M) S M}) - f (x_{k + 1}^{(M) S M}) \in [μ ∥ g_{k} ∥^{2}, + \infty)

. According to (13), it follows

μ_{ν_{k}} \geq μ

if

f (x_{k + 1}^{F (M) S M}) < f (x_{k}^{F (M) S M})

. So,

f (x_{k}^{F (M) S M}) - f (x_{k + 1}^{F (M) S M}) \in [μ_{ν_{k}} {∥ g_{k} ∥}^{2}, + \infty) \subseteq [μ ∥ g_{k} ∥^{2}, + \infty)

. This means that values

f (x_{k}^{F (M) S M}) - f (x_{k + 1}^{F (M) S M})

belong to the interval with values greater than or equal to the interval which includes values

f (x_{k}^{(M) S M}) - f (x_{k + 1}^{(M) S M})

. Furthermore, it means that possibilities for the reduction of

f (x_{k + 1}^{F (M) S M})

compared with

f (x_{k}^{F (M) S M})

are greater than or equal to possibilities for the reduction of

f (x_{k + 1}^{(M) S M})

compared with

f (x_{k}^{(M) S M})

.

Theorem 2 confirms a linear convergence rate of the

F (M) S M

method for uniformly convex functions.

Theorem 2.

Let

f \in ℑ^{n}

and (31) be valid. If the iterates

{x_{k}}

are generated by Algorithm 3, it follows that

lim_{k \to \infty} ∥ g_{k}^{F (M) S M} ∥ = 0,

(38)

and

{x_{k}}

converges to

x^{*}

with the least linear convergence rate.

Proof.

The proof is analogous to [6] (Theorem 4.1). □

In Lemma 3, we investigate the convergence of the

F (M) S M

method on the class of quadratic strictly convex functions

f (x) = \frac{1}{2} x^{T} A x - b^{T} x,

(39)

wherein A is a real

n \times n

symmetric positive definite and

b \in R^{n}

. Denote by

λ_{1} \leq \dots \leq λ_{n}

the sorted eigenvalues of A. The gradient of (39) is given as

g_{k} = A x_{k} - b .

(40)

Lemma 3.

The eigenvalues of

f \in ℑ^{n}

defined in (39) by a positive definite symmetric matrix

A \in R^{n}

satisfy

λ_{1} \leq \frac{γ_{k + 1}^{F (M) S M}}{t_{k + 1}} \leq \frac{2 λ_{n}}{σ}, k \in N,

(41)

such that

γ_{k}^{F (M) S M}

is determined by (17) and (19), and

t_{k}

is defined in Algorithm 1.

Proof.

Simple calculation leads to

f (x_{k + 1}^{F (M) S M}) - f {(x_{k})}^{F (M) S M} = \frac{1}{2} x_{k + 1}^{T} A x_{k + 1} - b^{T} x_{k + 1} - \frac{1}{2} x_{k}^{T} A x_{k} + b^{T} x_{k} .

(42)

The replacement of (18) in (42) leads to

\begin{array}{l} f (x_{k + 1}^{F (M) S M}) - f (x_{k}^{F (M) S M}) & = \frac{1}{2} [x_{k} - ν_{k} t_{k} ω_{k} {(γ_{k}^{F (M) S M})}^{- 1} g_{k}]^{T} A [x_{k} - ν_{k} t_{k} ω_{k} {(γ_{k}^{F (M) S M})}^{- 1} g_{k}] \\ - b^{T} [x_{k} - ν_{k} t_{k} ω_{k} {(γ_{k}^{F (M) S M})}^{- 1} g_{k}] - \frac{1}{2} x_{k}^{T} A x_{k} + b^{T} x_{k} \\ = - \frac{1}{2} ν_{k} t_{k} ω_{k} {(γ_{k}^{F (M) S M})}^{- 1} x_{k}^{T} A g_{k} - \frac{1}{2} ν_{k} t_{k} ω_{k} {(γ_{k}^{F (M) S M})}^{- 1} g_{k}^{T} A x_{k} \\ + \frac{1}{2} {(ν_{k} t_{k} ω_{k})}^{2} {(γ_{k}^{F (M) S M})}^{- 2} g_{k}^{T} A g_{k} + ν_{k} t_{k} ω_{k} {(γ_{k}^{F (M) S M})}^{- 1} b^{T} g_{k} . \end{array}

Applying (40) in the previous equation, we conclude

\begin{array}{l} f (x_{k + 1}^{F (M) S M}) - f (x_{k}^{F (M) S M}) & = ν_{k} t_{k} ω_{k} {(γ_{k}^{F (M) S M})}^{- 1} [b^{T} g_{k} - x_{k}^{T} A g_{k}] + \frac{1}{2} {(ν_{k} t_{k} ω_{k})}^{2} {(γ_{k}^{F (M) S M})}^{- 2} g_{k}^{T} A g_{k} \\ = ν_{k} t_{k} ω_{k} {(γ_{k}^{F (M) S M})}^{- 1} [b^{T} - x_{k}^{T} A] g_{k} + \frac{1}{2} {(ν_{k} t_{k} ω_{k})}^{2} {(γ_{k}^{F (M) S M})}^{- 2} g_{k}^{T} A g_{k} \\ = - ν_{k} t_{k} ω_{k} {(γ_{k}^{F (M) S M})}^{- 1} g_{k}^{T} g_{k} + \frac{1}{2} {(ν_{k} t_{k} ω_{k})}^{2} {(γ_{k}^{F (M) S M})}^{- 2} g_{k}^{T} A g_{k} . \end{array}

(43)

After replacing (43) into (19), the parameter

γ_{k + 1}^{F (M) S M}

becomes

\begin{array}{l} γ_{k + 1}^{F (M) S M} & = 2 γ_{k}^{F (M) S M} \frac{γ_{k}^{F (M) S M} (f_{k + 1} - f_{k}) + ν_{k} t_{k} ω_{k} {∥ g_{k} ∥}^{2}}{{(ν_{k} t_{k} ω_{k})}^{2} {∥ g_{k} ∥}^{2}} \\ = 2 γ_{k}^{F (M) S M} \frac{- ν_{k} t_{k} ω_{k} ∥ g_{k} ∥^{2} + \frac{1}{2} {(ν_{k} t_{k} ω_{k})}^{2} {(γ_{k}^{F (M) S M})}^{- 1} g_{k}^{T} A g_{k} + ν_{k} t_{k} ω_{k} {∥ g_{k} ∥}^{2}}{{(ν_{k} (t_{k} ω_{k}))}^{2} {∥ g_{k} ∥}^{2}} \\ = 2 γ_{k}^{F (M) S M} \frac{\frac{1}{2} {(ν_{k} t_{k} ω_{k})}^{2} {(γ_{k}^{F (M) S M})}^{- 1} g_{k}^{T} A g_{k}}{{(ν_{k} t_{k} ω_{k})}^{2} {∥ g_{k} ∥}^{2}} \\ = \frac{g_{k}^{T} A g_{k}}{∥ g_{k} ∥^{2}} . \end{array}

The last identity implies that

γ_{k + 1}^{F (M) S M}

is the Rayleigh quotient of the real symmetric matrix A at

g_{k}

. So,

λ_{1} \leq γ_{k + 1}^{F (M) S M} \leq λ_{n}, k \in N .

(44)

The left inequality in (41) is implied by (44), due to

t_{k + 1} \in (0, 1]

. To verify the right inequality from (41), we use the upper limit imposed by the line search

t_{k} \geq \frac{β (1 - σ) γ_{k}}{L},

which implies

\frac{γ_{k + 1}^{F (M) S M}}{t_{k + 1}} < \frac{L}{β (1 - σ)} .

(45)

Taking into account (40), and the symmetricity of A, we derive

∥ g (x) - g (y) ∥ = ∥ A x - b - (A y - b) ∥ = ∥ A x - A y ∥ \leq ∥ A ∥ ∥ x - y ∥ = λ_{n} ∥ x - y ∥ .

Based on the last inequality, it is concluded that the constant L in (45) can be defined as the largest eigenvalue

λ_{n}

of A. Considering the backtracking parameters

σ \in (0, 0.5)

,

β \in (σ, 1)

, it is obtained that

\frac{γ_{k + 1}^{F (M) S M}}{t_{k + 1}} < \frac{L}{β (1 - σ)} = \frac{λ_{n}}{β (1 - σ)} < \frac{2 λ_{n}}{σ} .

(46)

Therefore, the right-hand side inequality in (41) is proved, and the proof is finished. □

In Theorem 3, we consider the convergence of the

F S M

and

F M S M

iterations under the supplemental presumption

λ_{n} < 2 λ_{1}

.

Theorem 3.

Let f be a strictly convex quadratic in (39). If the eigenvalues of A satisfy

λ_{n} < 2 λ_{1}

,

F S M

iterations (16) and

F M S M

iterations (18) fulfill

{(d_{i}^{k + 1})}^{2} \leq δ^{2} {(d_{i}^{k})}^{2},

(47)

wherein

δ = max \{1 - \frac{σ λ_{1}}{2 λ_{n}}, \frac{λ_{n}}{λ_{1}} - 1\},

(48)

and

lim_{k \to \infty} ∥ g_{k}^{F (M) S M} ∥ = 0 .

(49)

Proof.

Let

{x_{k}}

be the output of Algorithm 3 and

{v_{1}, \dots, v_{n}}

be orthonormal eigenvectors of A. In this case, for a random vector

x_{k}

in (40), there exist real constants

d_{1}^{k}, d_{2}^{k}, \dots, d_{n}^{k}

such that

g_{k} = \sum_{i = 1}^{n} d_{i}^{k} v_{i} .

(50)

Now, using (18), we have

\begin{array}{l} g_{k + 1} & = A x_{k + 1} - b \\ = A (x_{k} - ν_{k} t_{k} ω_{k} {(γ_{k}^{F (M) S M})}^{- 1} g_{k}) - b \\ = A x_{k} - b - ν_{k} t_{k} ω_{k} {(γ_{k}^{F (M) S M})}^{- 1} A g_{k} \\ = g_{k} - ν_{k} t_{k} ω_{k} {(γ_{k}^{F (M) S M})}^{- 1} A g_{k} \\ = (I - ν_{k} t_{k} ω_{k} {(γ_{k}^{F (M) S M})}^{- 1} A) g_{k} . \end{array}

Next, using the (50), we obtain

g_{k + 1} = \sum_{i = 1}^{n} d_{i}^{k + 1} v_{i} = \sum_{i = 1}^{n} (1 - ν_{k} t_{k} ω_{k} {(γ_{k}^{F (M) S M})}^{- 1} λ_{i}) d_{i}^{k} v_{i} .

(51)

To prove (47), it is enough to show that

|1 - \frac{λ_{i}}{{(ν_{k} t_{k} ω_{k})}^{- 1} γ_{k}^{F (M) S M}}| \leq δ

. Two cases are observable. Firstly, if

λ_{i} \leq \frac{γ_{k}^{F (M) S M}}{ν_{k} t_{k} ω_{k}}

using (41), we deduce

1 > \frac{λ_{i}}{ν_{k} t_{k} ω_{k})^{- 1} γ_{k}^{F (M) S M}} \geq \frac{σ λ_{1}}{2 λ_{n}} ⟹ 1 - \frac{λ_{i}}{{(ν_{k} t_{k} ω_{k})}^{- 1} γ_{k}^{F (M) S M}} \leq 1 - \frac{σ λ_{1}}{2 λ_{n}} \leq δ .

(52)

Now, let us examine another case

\frac{γ_{k}^{F (M) S M}}{ν_{k} t_{k} ω_{k}} < λ_{i}

. Since

1 < \frac{λ_{i}}{{(ν_{k} t_{k} ω_{k})}^{- 1} γ_{k}^{F (M) S M}} \leq \frac{λ_{n}}{λ_{1}},

(53)

it follows that

|1 - \frac{λ_{i}}{{(ν_{k} t_{k} ω_{k})}^{- 1} γ_{k}^{F (M) S M}}| \leq \frac{λ_{n}}{λ_{1}} - 1 \leq δ .

(54)

Now, we use the orthonormality of the eigenvectors

{v_{1}, \dots, v_{n}}

and (50) and obtain

∥ g_{k} ∥^{2} = \sum_{i = 1}^{n} {(d_{i}^{k})}^{2} .

(55)

Since (47) holds and

0 < δ < 1

, based on (55), it follows that (50) holds, which completes the proof. □

4. Numerical Experiments

In this section, we prove the numerical efficiency of the gradient methods based on a dynamic neutrosophic set (DNS). We consider six methods, of which three are

F M S M

,

F S M

, and

F G D

based on DNS, while the other three methods,

M S M

,

S M

, and

G D

, are well-known in the literature. To this aim, we perform competitions on standard test functions with given initial points from [33,34]. We compare the

M S M

,

S M

,

G D

,

F M S M

,

F S M

, and

F G D

methods on three criteria:

The CPU time in seconds—CPUts.
The number of iterative steps—NI.
The number of function evaluations—NFE.

The methods which participate in the competition are presented in Section 2 (Table 1). Test problems in ten dimensions [100, 500, 1000, 3000, 5000, 7000, 8000, 10,000, 15,000, 20,000] are considered. The codes are tested in MATLAB R2017a and an LAP (Intel(R) Core(TM) i3-6006U, up to 2.0 GHz, 8 GB Memory) with the Windows 10 Pro operating system.

Algorithms

M S M

,

S M

,

G D

,

F S M

,

F G D

, and

F M S M

are compared using the backtracking line search with parameters

σ = 0.0001

,

β = 0.8

and the stopping criterion

∥ g_{k} ∥ \leq ϵ and \frac{| Δ_{k} |}{1 + | f_{k} |} \leq δ,

where

ϵ = 10^{- 6}

and

δ = 10^{- 16}

. Specific parameters used only in the

F S M

,

F G D

, and

F M S M

methods are given in Table 2.

In the following, we give a double analysis of the obtained numerical results. One analysis of the numerical results is based on the Dolan–Moré performance profile, and the other on the ranking of the optimization methods.

4.1. Comparison Based on the Dolan–Moré Performance Profile

In this subsection, we give numerical results for the

F S M

,

F G D

, and

F M S M

methods and then compare them with the numerical results obtained for the

M S M

,

S M

, and

G D

methods.

Summarized numerical results for the competition (between

M S M

,

S M

,

G D

,

F S M

,

F G D

, and

F M S M

methods), obtained by testing 30 test functions (300 tests), are given in Table 3, Table 4 and Table 5. Table 3, Table 4 and Table 5 include numerical results obtained by monitoring the criteria NI, NFE, and CPUts.

The performance profiles given in [35] are applied to compare numerical results for the criteria CPUts, NI, and NFE, generated by considered methods. The method that achieves the best results generates the upper performance profile curve.

In Figure 4 (resp. Figure 5), we compare the performance profiles NI (resp. NFE) for the

M S M

,

S M

,

G D

,

F S M

,

F G D

, and

F M S M

methods based on numerical values included in Table 3 (resp. Table 4). A careful analysis reveals that the FMSM method solves

20.00

% of the test problems, with the least NI compared with

M S M

(33.33 %)

,

S M

(26.67 %)

,

F S M

(33.33 %)

, GD

(13.33 %)

, and

F G D

(10.00 %)

. From Figure 4, it is perceptible that the

F M S M

graph attains the top level first, which indicates that

F M S M

outperforms other methods with respect to NI.

From Figure 5, we see that the

F M S M

and

F S M

methods are more efficient than the

M S M

,

S M

,

G D

, and

F G D

methods, with respect to NFE, since they solve

F M S M

(10.00 %)

and

F M S

(33.33 %)

of the test problems with the least NFE compared with

M S M

(40.00 %)

, SM

(26.67 %)

,

G D

(13.33 %)

, and

F G D

(6.67 %)

. From Figure 5, it can be observed that the

F M S M

and

F S M

graphs first come to the top, so that

F M S M

and

F S M

are the winners relative to NFE. On the other hand, the slowest iterations are

G D

and

F G D

.

Figure 6 shows the performance profile of the considered methods based on the CPUts for the numerical values included in Table 5. The

F M S M

method solves

23.33

% of the test problems with the least CPUts compared with

M S M

(30.00 %)

,

S M

(23.33 %)

,

F S M

(23.33 %)

,

G D

(6.67 %)

, and

F G D

(0 %)

. According to Figure 6, the

F M S M

and

F S M

graphs achieve the upper limit level 1 first, which verifies their dominance considering CPUts. Moreover,

G D

and

F G D

are the slowest methods.

Based on the data involved in Table 3, Table 4 and Table 5 and graphs in Figure 4, Figure 5 and Figure 6, it is noticed that the

F M S M

and

F S M

methods achieved the best results compared with the

M S M

,

S M

,

G D

, and

F G D

methods, with respect to three basic criteria: NI, NFE, and CPUts.

Table 6 contains the average CPU time, average number of iterations, and the average number of function evaluations for all 300 numerical experiments. Minimal values are marked in bold.

The average results in Table 6 confirm that the average results for

F M S M

and

F S M

are smaller with respect to the corresponding values for

M S M

and

S M

relative to NI, NFE, and CPUts. Such observation leads us to conclude that the use of a dynamic neutrosophic set (DNS) in gradient methods enables an improvement in the numerical results.

4.2. Closer Examination of the Optimization Methods

A closer examination of the optimization methods is presented in this subsection. The optimization methods

G D

,

S M

,

M S M

,

F G D

,

F S M

, and

F M S M

are used to solve two test functions from Table 3, Table 4 and Table 5 under different initial conditions (ICs). These functions are the Extended Penalty and the Diagonal 6, while the ICs were set to IC1:

1.5 \cdot 1_{100}

, IC2:

- 1_{100}

, and IC3:

4.5 \cdot 1_{100}

for the former and IC1:

1.5 \cdot 1_{100}

, IC2:

2.5 \cdot 1_{100}

, and IC3:

3.5 \cdot 1_{100}

for the latter. It is important to note that

1_{100}

denotes a vector of ones with dimensions

100 \times 1

. The results of the optimization methods are depicted in Figure 7.

In the case of the Extended Penalty function, Figure 7a–c show, respectively, the convergence of the optimization methods with IC1, IC2 and IC3. Therein, the convergence of

F G D

and

F S M

are identical in the cases of IC1 and IC2, whereas the convergence of

F G D

is slightly faster than

G D

’s, and the convergence of

F S M

is slightly faster than

S M

’s in the case of IC3. The convergence of

F M S M

is faster than

M S M

’s in the cases of IC2 and IC3, but it is slower than the convergence of

F G D

and

F S M

in the case of IC1. Additionally,

F M S M

finds the function’s minimum point for all ICs with greater accuracy than the other methods.

In the case of the Diagonal 6 function, Figure 7d–f show, respectively, the convergence of the optimization methods with IC1, IC2, and IC3. Therein, the convergence of

G D

and

F G D

are identical for all ICs, whereas the convergence of

F S M

is faster than

S M

’s for all ICs. The convergence of

F M S M

is faster than

M S M

’s in the cases of IC1 and IC2 and slower in the case of IC3. However,

F M S M

finds the function’s minimum point in the cases of IC2 and IC3 with greater accuracy than the other methods, while

M S M

finds the function’s minimum point in the case of IC1 with greater accuracy than the other methods. Additionally,

G D

and

F G D

have the fastest convergence in the case of IC1, while

F S M

has the fastest convergence in the cases of IC2 and IC3.

In general, all the optimization methods presented here were able to find the minimum of the Extended Penalty and the Diagonal 6 functions. The ICs have a significant impact on the optimization methods’ accuracy and speed of convergence. However,

F G D

,

F S M

, and

F M S M

have faster convergence than

G D

,

S M

, and

M S M

, respectively, in most cases.

4.3. Ranking the Optimization Methods

In this subsection, the performances of the optimization methods

G D

,

S M

,

M S M

,

F G D

,

F S M

, and

F M S M

on solving the 30 test functions included in Table 3, Table 4 and Table 5 are ranked from best to worst, i.e., rank 1 to rank 6, respectively.After determining the rank for each test function for each method, it is necessary to calculate the final rank of the methods. The final rank of the methods is based on the average of the ranks obtained for each method in relation to the observed test functions. The method with the lowest average has the highest rank, i.e., rank 1, while the method with the highest average has the lowest rank, i.e., rank 6. We denote by

n_{m}

(resp.

n_{t f}

) the number of methods (resp. the number of test functions). Given a set of methods M and a set of functions F, the rank of the method x on the function y is defined by

r_{x, y}

. In our case,

r_{x, y}

stands rank method x for the observed test function y and can have rank 1 to rank 6. The average rank of method

x \in M

is calculated in the following way:

A R_{x} = \frac{\sum_{y \in F} r_{x, y}}{n_{t f}},

where

A R_{x}

represents the average of all ranks of the observed method x. The final average rank in our case is obtained when all average ranks are ranked from best to worst, i.e., rank 1 to rank 6, respectively.

Figure 8 shows the iterations’ performance rank of the optimization methods on 30 functions and their average iterations’ rank. Note that a method is regarded as rank 1 if it requires the fewest iterations out of all the considered methods. If a method has the second-fewest iterations compared with all the compared methods, it would be considered rank 2, and so on. Particularly, Figure 8a displays the number of functions in which each method is ranked as rank 1, rank 2, etc., while Figure 8b displays the final rank of the methods based on the average of the results presented in Figure 8a.

For example, in Figure 8a,

M S M

reached rank 1 in the same or a higher number of test functions than

F S M

and

F M S M

. However, because

M S M

achieved rank 6 in many more functions than

F S M

and

F M S M

in Figure 8b,

M S M

has an average rank 3,

F S M

an average rank 2, and

F M S M

an average rank 1. In other words,

F M S M

outperforms FSM and MSM in terms of iteration performance. Moreover, the fact that

F M S M

and

F S M

iterations outperform their corresponding original methods is another important discovery from Figure 8b.

Figure 9 shows the function evaluations performance ranking on 30 functions and their average rank. Note that a method is regarded as rank 1 if it requires the fewest number of function evaluations out of all the considered methods. If a method has the second-fewest function evaluations compared with all the compared methods, it would be considered rank 2, and so on. Particularly, Figure 9a displays the number of functions in which each method is ranked as rank 1, rank 2, etc., whereas Figure 9b displays the final function evaluation ranks of the methods based on the average of the results presented in Figure 9a.

M S M

achieved rank 1 positions in a higher number of functions than all the methods considered in Figure 9a, whereas FGD was considered rank 6 in a higher number of functions than all the methods that were considered. As a result,

M S M

has the average rank 1, and

F G D

takes the average rank 6 in Figure 9b. That is,

M S M

outperforms all the considered methods in terms of function evaluation performance. Moreover, the fact that

F S M

, the fuzzy method, outperforms the original

S M

method is another crucial discovery from Figure 9b.

Figure 10 shows the CPU time consumption performance rank of the optimization methods on 30 functions and their average rank. A method is of rank 1 if it requires the least amount of CPU time compared with all the methods considered. A method achieves rank 2 if it requires the second-least amount of CPU time compared with all the methods, and so on. Particularly, Figure 10a displays the number of functions in which each method is ranked as rank 1, rank 2, etc., whereas Figure 10b displays the final rank of the methods, based on the average of the results presented in Figure 10a.

M S M

is observed as rank 1 in a higher number of functions than all the methods considered in Figure 10a, whereas

F G D

was considered rank 6 in a higher number of functions than all the compared methods. As a result,

M S M

has an average rank 3 and

F G D

an average rank 6 in Figure 10b. If we look at Figure 10b, we can see that

F M S M

outperforms all the methods considered in terms of CPU time consumption performance.

To summarize, all the fuzzy methods work excellently in finding the minimum of the 30 functions. In general,

F M S M

has the best iteration performance,

M S M

has the best function evaluation performance, and

F M S M

has the best CPU time consumption performance.

We use the notation

M_{i} ≺ M_{j}

to signify that the method

M_{i}

is ranked better than

M_{j}

.

Figure 8b leads to the conclusion $F M S M ≺ F S M ≺ M S M ≺ S M ≺ G D ≺ F G D$ .
Figure 9b leads to the conclusion $M S M ≺ F S M ≺ S M ≺ F M S M ≺ G D ≺ F G D$ .
Figure 10b leads to the conclusion $F M S M ≺ S M ≺ M S M ≺ F S M ≺ G D ≺ F G D$ .

In general,

F M S M

has the best iteration performance,

M S M

has the best function evaluation performance, and

F M S M

has the best CPU time consumption performance. An interesting conclusion is

G D ≺ F G D

in the last positions according to all criteria. A particularly interesting observation is that the proposed fuzzy parameter

ν_{k}

improves the

S M

and

M S M

methods, but it is not suitable for

G D

. The logical conclusion is that the fuzzy parameter

ν_{k}

is not desirable to use in the role of an isolated parameter, but it is preferable to use it in combination with other scaling parameters.

4.4. Application of the Fuzzy Optimization Methods to Regression Analysis

Regression analysis is an important statistical tool commonly used in the fields of accounting, economics, management, physics, finance, and many more. This tool is used to study the interaction between independent and dependent variables of various data sets. The classical function of regression analysis is defined as

y = f (x_{1}, x_{2}, \dots, x_{k} + ϵ),

(56)

where

x_{i}, i = 1, 2, \dots, k, k > 0

are predictor variables, y is the response variable, and

ϵ

is the error. The linear regression function is obtained by a straight line relationship between y and x

y = a_{0} + a_{1} x_{1} + a_{2} x_{2} + \dots + a_{k} x_{k} + ϵ,

(57)

where

a_{0}

,

a_{1}

, …,

a_{k}

are the parameters of the regression. The main aim of regression analysis is to estimate the parameters

a_{0}

,

a_{1}

, …,

a_{k}

so that the error

ϵ

is minimized. However, the linear relationship rarely occurs. Thus, a nonlinear regression scheme is frequently used. In this paper, we considered the quadratic regression model. The least squares method is the most popular approach to fitting a regression line and is defined by

y = a_{0} + a_{1} x + a_{2} x^{2} .

(58)

The errors for a set of data

(x_{i}, y_{i}), i = 1, 2, \dots, n

are defined as follows

E_{i} (a) = y_{i} - (a_{0} + a_{1} x_{i} + a_{2} x_{i}^{2}), a = (a_{0}, a_{1}, a_{2}) .

(59)

The main goal would be to fit the “best” line through the data in order to minimize the sum of the residual error squares for all the available data

min_{a \in R^{3}} \sum_{i = 1}^{n} E_{i}^{2} (a), a = (a_{0}, a_{1}, a_{2}) .

(60)

The data set in Table 7 is a detailed description of people killed in traffic accidents in Serbia from 2012–2021. This set was considered based on the annual reports of the Agency for Traffic Safety of the Republic of Serbia. The ordinal number of the year of data collection is denoted by the x variable and the number of people killed in traffic accidents in Serbia is represented by the y variable. Moreover, only data from 2012–2020 would be considered for the data fitting, while data for 2021 would be reserved for the error analysis.

The least squares, FMSM, FSM, and FGD methods are used for fitting the regression models to the data collected. The least squares method is frequently used to solve overdetermined linear systems, which usually occurs when the given equations are greater than the number of unknowns [36]. The least squares method includes determining the best approximating line by comparing the total least squares error.

The approximate function for the nonlinear least squares method derived using the data in Table 7 is defined as follows:

f (x) = 0.5303030303031 x^{2} - 24.1030303030320 x + 685.1666666666750 .

(61)

For more details on how the approximate function (61) is calculated, see [36]. Let

x_{i}

denote the ordinal number of the year and

y_{i}

be the number of people killed in traffic accidents in that year. Then, the least squares method (58) is transformed into the following unconstrained minimization problems:

min_{a \in R^{3}} f (a) = min_{a \in R^{3}} \sum_{i = 1}^{n} E_{i}^{2} (a) = \sum_{i = 1}^{n} {(y_{i} - (a_{0} + a_{1} x_{i} + a_{2} x_{i}^{2}))}^{2}, a = (a_{0}, a_{1}, a_{2}) .

(62)

where n = 9, i.e., i has values from 1 to 9, corresponding to the years 2012 to 2020. The data from 2012–2020 are utilized to formulate the nonlinear quadratic model for the least square method and the corresponding test function of the unconstrained optimization problem. However, the data for 2021 are excluded from the unconstrained optimization function so that it could be used to compute the relative errors of the predicted data. The relative error is calculated using the following formula to measure the precision of a regression model:

R e l a t i v e E r r o r = \frac{| E x a c t v a l u e - A p p r o x i m a t e v a l u e |}{| E x a c t v a l u e |} .

(63)

The regression model with the least relative error is considered the best.

The application of the conjugate gradient method in regression analysis to the optimization problems in finding the regression parameters

a_{0}

,

a_{1}

, …,

a_{k}

was considered in [37,38,39,40]. To overcome the difficulty of computing the values of

a_{0}

,

a_{1}

, and

a_{2}

using the matrix inverse, the researchers employed the proposed FMSM, FSM, and FGD methods to solve the test function (62), and the result is presented in Table 8.

The statistics of people killed in traffic accidents in Serbia is estimated using the proposed FMSM, FSM, FGD, least squares, and trend line methods. The trend line is plotted based on the real data obtained from Table 7 using Microsoft Excel and is shown in Figure 11. The equation for the trend line is in the form of a nonlinear quadratic equation

y = 0.5303 x^{2} - 24.103 x + 685.17 .

(64)

If we compare the approximation functions (61) and (64), as well as the regression parameters from Table 8 obtained using the FMSM, FSM, and FGD methods, we can see that there are small differences in the values of the parameters

a_{0}

,

a_{1}

, and

a_{2}

.

The functions of the trend line (64) and the least square method (61) are compared with approximation functions from the FMSM, FSM, and FGD methods obtained by substituting the values of the parameters

a_{0}

,

a_{1}

, and

a_{2}

in (58) for the initial point (1,1,1).

The primary aim of regression analysis is to estimate the parameters

a_{0}

,

a_{1}

, …,

a_{k}

such that the error

ϵ

is minimized. From Table 9, the proposed FMSM, FSM, and FGD methods have similar relative errors compared with the least square and trend line methods.

Thus, we can conclude that the proposed FMSM, FSM, and FGD methods are applicable to real-life situations.

5. Conclusions

It is known that iterations for solving nonlinear unconstrained minimization are based on the step size defined by the inexact line search. Such step size enables just a sufficient decrease in the value of the objective function. However, after that, there are plenty of possibilities for future adjustments based on the behavior of the objective function. Our goal is to use additional step length parameters to improve convergence. One of these parameters is the

g a m m a_{k}

parameter, which is defined in previous works based on Taylor expansion of the objective function. The second parameter,

ν_{k}

, is defined in this paper using neutrosophic logic and the behavior of the objective function in two consecutive iterations. The enhancements of main line search iterations for solving unconstrained optimization are provided based on application of netrosophic logic. Using an appropriate neutrosophic logic, we propose an additional gain parameter

ν_{k}

to solve uncertainty in defining parameters of nonlinear optimization methods. The parameter arises as the output from an appropriately defined neutrosophic logic system, and it is usable in various gradient descent methods as a corrective step size.

Performed theoretical analysis reveals convergence of novel iterations under the same conditions as for corresponding original methods. Numerical comparison and statistical ranking point out better results generated by the proposed enhanced methods compared with some existing methods. Moreover, statistical measures reveal advantages of fuzzy and neutrosophic improvements compared with original line search optimization methods. Precisely, our numerical experience shows that the neutrosophic parameter

ν_{k}

is particularly efficient as an additional step size composed with previously defined parameters. Direct application of

ν_{k}

is not so effective.

Additional research includes several new directions. First of all, other strategies in neutrosophication and de-neutrosophication are possible, as well as other frameworks parallel to neutrosophic sets, known as picture fuzzy sets and spherical fuzzy sets, discussed in the following articles [41,42]. These can be discussed in future research.

Empirical evaluation shows high sensitivity of the results on the choice of the parameters that define the truth, falsity, and indeterminacy membership functions. Such experience confirms the assumption that a different configuration of parameters, as well as improvements in the neutrosophic logic engine, can lead to further improvements of defined methods. The possibility to define if–then rules in a more sophisticated way based on the history of the obtained values of

f (x)

remains an open topic for future research. Another topic of future study is the investigation of a neutrosophic approach to enhance stochastic optimization methods. In addition, positive definite matrices

B_{k}

are usable as more precise approximations of the Hessian compared with simplest diagonal approximations. Finally, continuous-time nonlinear optimization assumes time-varying scaling parameters inside a selected time interval.

Author Contributions

Conceptualization, P.S.S. and V.N.K.; methodology, P.S.S., V.N.K. and L.A.K.; software, B.I. and S.D.M.; validation, V.N.K., P.S.S., D.S. and L.A.K.; formal analysis, P.S.S., S.D.M. and D.S.; investigation, P.S.S., S.D.M., V.N.K. and L.A.K.; resources, B.I. and S.D.M.; data curation, B.I. and S.D.M.; writing—original draft preparation, P.S.S., D.S. and S.A.E.; writing—review and editing, P.S.S., S.D.M. and S.A.E.; visualization, B.I. and S.D.M. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Ministry of Science and Higher Education of the Russian Federation (Grant No. 075-15-2022-1121).

Data Availability Statement

Data and code will be provided on request to authors.

Acknowledgments

Predrag Stanimirović is supported by the Science Fund of the Republic of Serbia, (No. 7750185, Quantitative Automata Models: Fundamental Problems and Applications—QUAM). This work was supported by the Ministry of Science and Higher Education of the Russian Federation (Grant No. 075-15-2022-1121).

Conflicts of Interest

The authors declare no conflict of interest.

References

Sun, W.; Yuan, Y.-X. Optimization Theory and Methods: Nonlinear Programming; Springer: Berlin/Heidelberg, Germany, 2006. [Google Scholar]
Brezinski, C. A classification of quasi-Newton methods. Numer. Algorithms 2003, 33, 123–135. [Google Scholar] [CrossRef]
Nocedal, J.; Wright, S.J. Numerical Optimization; Springer: New York, NY, USA, 1999. [Google Scholar]
Petrović, M.J.; Stanimirović, P.S. Accelerated Double Direction method for solving unconstrained optimization problems. Math. Probl. Eng. 2014, 2014, 965104. [Google Scholar] [CrossRef] [Green Version]
Petrović, M.J.; Rakocević, V.; Kontrec, N.; Panić, S.; Ilić, D. Hybridization of accelerated gradient descent method. Numer. Algorithms 2018, 79, 769–786. [Google Scholar] [CrossRef]
Stanimirović, P.S.; Miladinović, M.B. Accelerated gradient descent methods with line search. Numer. Algorithms 2010, 54, 503–520. [Google Scholar] [CrossRef]
Stanimirović, P.S.; Milovanović, G.V.; Petrović, M.J. A transformation of accelerated double step size method for unconstrained optimization. Math. Probl. Eng. 2015, 2015, 283679. [Google Scholar] [CrossRef] [Green Version]
Petrović, M.J. An accelerated Double Step Size method in unconstrained optimization. Applied Math. Comput. 2015, 250, 309–319. [Google Scholar]
Ivanov, B.; Stanimirović, P.S.; Milovanović, G.V.; Djordjević, S.; Brajević, I. Accelerated multiple step-size methods for solving unconstrained optimization problems. Optim. Methods Softw. 2021, 36, 998–1029. [Google Scholar] [CrossRef]
Petrović, M.J.; Stanimirović, P.S.; Kontrec, N.; Mladenović, J. Hybrid modification of Accelerated Double Direction method. Math. Probl. Eng. 2018, 2018, 1523267. [Google Scholar] [CrossRef]
Picard, E. Memoire sur la theorie des equations aux derivees partielles et la methode des approximations successives. J. Math. Pures Appl. 1890, 6, 145–210. [Google Scholar]
Ishikawa, S. Fixed points by a new iteration method. Proc. Am. Math. Soc. 1974, 44, 147–150. [Google Scholar] [CrossRef]
Khan, S.H. A Picard-Mann hybrid iterative process. Fixed Point Theory Appl. 2013, 2013, 69. [Google Scholar] [CrossRef]
Rakočević, V.; Petrović, M.J. Comparative analysis of accelerated models for solving unconstrained optimization problems with application of Khan’s hybrid rule. Mathematics 2022, 10, 4411. [Google Scholar] [CrossRef]
Humaira, M.S.; Tunç, C. Fuzzy fixed point results via rational type contractions involving control functions in complex–valued metric spaces. Appl. Math. Inf. Sci. 2018, 12, 861–875. [Google Scholar] [CrossRef]
Vrahatis, M.N.; Androulakis, G.S.; Lambrinos, J.N.; Magoulas, G.D. A class of gradient unconstrained minimization algorithms with adaptive step-size. J. Comp. Appl. Math. 2000, 114, 367–386. [Google Scholar] [CrossRef] [Green Version]
Zadeh, L.A. Fuzzy sets. Inf. Control 1965, 8, 338–353. [Google Scholar] [CrossRef] [Green Version]
Atanassov, K.T. Intuitionistic fuzzy sets. Fuzzy Sets Syst. 1986, 20, 87–96. [Google Scholar] [CrossRef]
Smarandache, F. A Unifying Field in Logics, Neutrosophy: Neutrosophic Probability, Set and Logic; American Research Press: Rehoboth, NM, USA, 1999. [Google Scholar]
Wang, H.; Smarandache, F.; Zhang, Y.Q.; Sunderraman, R. Single valued neutrosophic sets. Multispace Multistruct. 2010, 4, 410–413. [Google Scholar]
Khalil, A.M.; Cao, D.; Azzam, A.; Smarandache, F.; Alharbi, W.R. Combination of the single-valued neutrosophic fuzzy set and the soft set with applications in decision-making. Symmetry 2020, 12, 1361. [Google Scholar] [CrossRef]
Mishra, K.; Kandasamy, I.; Kandasamy W.B., V.; Smarandache, F. A novel framework using neutrosophy for integrated speech and text sentiment analysis. Symmetry 2020, 12, 1715. [Google Scholar] [CrossRef]
Tu, A.; Ye, J.; Wang, B. Symmetry measures of simplified neutrosophic sets for multiple attribute decision-making problems. Symmetry 2018, 10, 144. [Google Scholar] [CrossRef] [Green Version]
Smarandache, F. Neutrosophic Logic—A Generalization of the Intuitionistic Fuzzy Logic. 25 January 2016. Available online: https://ssrn.com/abstract=2721587 (accessed on 1 September 2021).
Ansari, A.Q. From fuzzy logic to neutrosophic logic: A paradigme shift and logics. In Proceedings of the 2017 International Conference on Intelligent Communication and Computational Techniques (ICCT), Jaipur, India, 22–23 December 2017; pp. 11–15. [Google Scholar]
Guo, Y.; Cheng, H.D.; Zhang, Y. A new neutrosophic approach to image denoising. New Math. Nat. Comput. 2009, 5, 653–662. [Google Scholar] [CrossRef]
Christianto, V.; Smarandache, F. A Review of Seven Applications of Neutrosophic Logic: In Cultural Psychology, Economics Theorizing, Conflict Resolution, Philosophy of Science, etc. Multidiscip. Sci. J. 2019, 2, 128–137. [Google Scholar] [CrossRef] [Green Version]
Andrei, N. An acceleration of gradient descent algorithm with backtracking for unconstrained optimization. Numer. Algorithms 2006, 42, 63–73. [Google Scholar] [CrossRef]
Andrei, N. Relaxed Gradient Descent and a New Gradient Descent Methods for Unconstrained Optimization. Visited 29 November 2022. Available online: https://camo.ici.ro/neculai/newgrad.pdf (accessed on 1 September 2021).
Shi, Z.-J. Convergence of line search methods for unconstrained optimization. App. Math. Comput. 2004, 157, 393–405. [Google Scholar] [CrossRef]
Ortega, J.M.; Rheinboldt, W.C. Iterative Solution of Nonlinear Equation in Several Variables; Academic Press: New York, NY, USA; London, UK, 1970. [Google Scholar]
Rockafellar, R.T. Convex Analysis; Princeton University Press: Princeton, NJ, USA, 1970. [Google Scholar]
Andrei, N. An unconstrained optimization test functions collection. Adv. Model. Optim. 2008, 10, 147–161. [Google Scholar]
Bongartz, I.; Conn, A.R.; Gould, N.; Toint, P.L. CUTE: Constrained and unconstrained testing environments. ACM Trans. Math. Softw. 1995, 21, 123–160. [Google Scholar] [CrossRef] [Green Version]
Dolan, E.D.; Moré, J.J. Benchmarking optimization software with performance profiles. Math. Program. 2002, 91, 201–213. [Google Scholar] [CrossRef]
Dawahdeh, M.; Mamat, M.; Rivaie, M.; Sulaiman, I.M. Application of conjugate gradient method for solution of regression models. Int. J. Adv. Sci. Technol. 2020, 29, 1754–1763. [Google Scholar]
Moyi, A.U.; Leong, W.J.; Saidu, I. On the application of three-term conjugate gradient method in regression analysis. Int. J. Comput. Appl. 2014, 102, 1–4. [Google Scholar]
Sulaiman, I.M.; Bakar, N.A.; Mamat, M.; Hassan, B.A.; Malik, M.; Ahmed, A.M. A new hybrid conjugate gradient algorithm for optimization models and its application to regression analysis. Indones. J. Electr. Eng. Comput. Sci. 2021, 23, 1100–1109. [Google Scholar] [CrossRef]
Sulaiman, I.M.; Malik, M.; Awwal, A.M.; Kumam, P.; Mamat, M.; Al-Ahmad, S. On three–term conjugate gradient method for optimization problems with applications on COVID–19 model and robotic motion control. Adv. Contin. Discret. Model. 2022, 2022, 1. [Google Scholar] [CrossRef] [PubMed]
Sulaiman, I.M.; Mamat, M. A new conjugate gradient method with descent properties and its application to regression analysis. J. Numer. Anal. Ind. Appl. Math. 2020, 14, 25–39. [Google Scholar]
Mahmood, T.; Ullah, K.; Khan, Q.; Jan, N. An approach toward decision–making and medical diagnosis problems using the concept of spherical fuzzy sets. Neural Comput. Appl. 2019, 31, 7041–7053. [Google Scholar] [CrossRef]
Ullah, K. Picture fuzzy maclaurin symmetric mean operators and their applications in solving multiattribute decision–making problems. Math. Probl. Eng. 2021, 2021, 1098631. [Google Scholar] [CrossRef]

Figure 1. The general structure of the fuzzy optimization methods.

Figure 2. The NLC structure decomposed by the neutrosophic rules.

Figure 3. Neutrosophication (20)–(22) and de-neutrosophication (24) under the parameters in Table 2. (a) Neutrosophication. (b) De-neutrosophication.

Figure 4. NI performance profiles for the

M S M

,

S M

,

G D

,

F S M

,

F G D

, and

F M S M

methods.

Figure 4. NI performance profiles for the

M S M

,

S M

,

G D

,

F S M

,

F G D

, and

F M S M

methods.

Figure 5. NFE performance profiles for the

M S M

,

S M

,

G D

,

F S M

,

F G D

, and

F M S M

methods.

Figure 5. NFE performance profiles for the

M S M

,

S M

,

G D

,

F S M

,

F G D

, and

F M S M

methods.

Figure 6. CPUts performance profiles for the

M S M

,

S M

,

G D

,

F S M

,

F G D

, and

F M S M

methods.

Figure 6. CPUts performance profiles for the

M S M

,

S M

,

G D

,

F S M

,

F G D

, and

F M S M

methods.

Figure 7. Convergence of the optimization methods under different ICs. (a) Extended Penalty function with IC1. (b) Extended Penalty function with IC2. (c) Extended Penalty function with IC3. (d) Diagonal 6 function with IC1. (e) Diagonal 6 function with IC2. (f) Diagonal 6 function with IC3.

Figure 8. Iterations’ performance ranks of the optimization methods on 30 functions and their average rank. (a) Iterations’ performance. (b) Average of iterations’ performance.

Figure 9. Function evaluation performance ranks of the optimization methods on 30 functions and their average rank. (a) Function evaluations performance. (b) Average of function evaluation performance.

Figure 10. CPU time consumption performance ranks of the optimization methods on 30 functions and their average rank. (a) Time consumption’s performance. (b) Average of time consumption’s performance.

Figure 11. Nonlinear quadratic trend line for people killed in traffic accidents in Serbia.

Table 1. Parameters in gradient descent methods and neutrosophic modifications.

Method	Step Sizes
Method	First	Second	Third
GD	$t_{k}$	-	-
FGD	$ν_{k}$	$t_{k}$	-
SM	$t_{k}$	${(γ_{k}^{S M})}^{- 1}$	-
FSM	$ν_{k}$	$t_{k}$	${(γ_{k}^{S M})}^{- 1}$
MSM	$τ_{k}$	${(γ_{k}^{M S M})}^{- 1}$	-
FMSM	$ν_{k}$	$τ_{k}$	${(γ_{k}^{M S M})}^{- 1}$

Table 2. Recommended parameters in NLC.

Set	Membership Function		c $_{1}$	c $_{2}$	Weight
Input	Truth	Sigmoid	1	3	1
	Falsity	Sigmoid	1	3	1
	Indeterminacy	Gaussian	6	0	1
Output	(24)		3	-	1

Table 3. Summary of NI results for

M S M

,

S M

,

G D

,

F S M

,

F G D

, and

F M S M

.

Table 3. Summary of NI results for

M S M

,

S M

,

G D

,

F S M

,

F G D

, and

F M S M

.

Test Function	No. of Iterations
Test Function	MSM	FMSM	SM	FSM	GD	FGD
Extended Penalty Function	651	377	549	372	1255	1250
Perturbed Quadratic function	44,419	75,431	77,458	74,473	372,356	369,992
Raydan 1 function	12,965	12,437	15,913	11,035	58,743	58,594
Raydan 2 function	90	87	90	94	67	129
Diagonal 1 function	52,527	11,571	8955	12,189	41,208	42,290
Diagonal 2 function	26,215	24,866	30,912	29,957	543,249	543,054
Diagonal 3 function	7545	12,586	13,892	13,050	62,128	61,072
Hager function	28,073	800	839	817	3104	2956
Generalized Tridiagonal 1 function	290	440	270	376	656	665
Extended TET function	130	248	130	225	1974	1856
Extended quadratic penalty QP1 function	328	189	246	177	563	549
Extended quadratic penalty QP2 function	1538	2105	3302	3564	134,401	122,926
Quadratic QF2 function	44,911	14,203	83,957	11,488	409,859	411,364
Extended quadratic exponential EP1 function	87	100	64	109	496	528
Extended tridiagonal 2 function	568	421	419	415	1145	1099
Almost perturbed quadratic function	44,029	78,452	80,559	79,793	374,841	375,518
ENGVAL1 function (CUTE)	363	298	302	291	573	557
QUARTC function (CUTE)	185	216	246	211	524,612	524,612
Diagonal 6 function	90	87	90	95	67	129
Generalized quartic function	150	150	157	238	1453	1751
Diagonal 7 function	124	113	90	136	543	570
Diagonal 8 function	100	86	103	89	583	573
Diagonal 9 function	16,920	17,221	11,487	17,752	195,362	195,155
HIMMELH function (CUTE)	100	90	100	90	90	90
Extended Rosenbrock	50	50	50	50	50	50
Extended BD1 function (block diagonal)	189	204	191	223	650	682
NONDQUAR function (CUTE)	42	39	42	35	33	30
DQDRTIC function (CUTE)	827	635	1263	497	15,320	15,398
Extended Beale function	480	980	639	831	12,834	12,826
EDENSCH function (CUTE)	337	314	275	275	663	705

Table 4. Summary of NFE results for

M S M

,

S M

,

G D

,

F S M

,

F G D

, and

F M S M

.

Table 4. Summary of NFE results for

M S M

,

S M

,

G D

,

F S M

,

F G D

, and

F M S M

.

Test Function	No. of Funct. Evaluation
Test Function	MSM	FMSM	SM	FSM	GD	FGD
Extended Penalty Function	3527	2585	2394	2388	47,378	48,057
Perturbed quadratic function	257,063	438,335	439,924	423,195	16,171,466	16,069,927
Raydan 1 function	89,508	69,791	87,508	61,595	1,667,238	1,658,647
Raydan 2 function	190	233	190	235	144	291
Diagonal 1 function	526,958	56,914	47,874	58,155	1,615,828	1,664,760
Diagonal 2 function	158,515	144,005	171,300	166,567	1,086,508	1,086,118
Diagonal 3 function	41,528	71,024	76,336	70,540	2,407,025	2,364,254
Hager function	271,940	3402	3308	3165	56,824	54,818
Generalized tridiagonal 1 function	1012	1587	931	1445	10,867	11,432
Extended TET function	440	681	440	601	19,800	18,859
Extended quadratic penalty QP1 function	1918	1992	2507	1842	10,771	11,268
Extended quadratic penalty QP2 function	10,731	14,285	24,234	26,528	3,875,768	3,545,317
Quadratic QF2 function	245,407	102,882	465,615	80,626	19,072,367	19,141,623
Extended quadratic exponential EP1 function	807	604	587	830	13,643	14,852
Extended tridiagonal 2 function	2550	2123	2285	2111	9570	9464
Almost perturbed quadratic function	259,487	452,388	452,360	445,028	16,285,621	16,309,931
ENGVAL1 function (CUTE)	1974	2700	2098	2315	8787	8593
QUARTC function (CUTE)	420	492	542	472	1,049,274	1,049,304
Diagonal 6 function	229	335	229	263	158	332
Generalized quartic function	409	470	423	781	19,062	25,071
Diagonal 7 function	458	547	293	1094	3348	4286
Diagonal 8 function	326	462	980	612	3921	4078
Diagonal 9 function	141,781	90,948	71,353	89,023	8,449,946	8,455,412
HIMMELH Function (CUTE)	210	190	210	190	190	190
Extended Rosenbrock	110	110	110	110	110	110
Extended BD1 function (Block Diagonal)	558	696	598	691	7660	8452
NONDQUAR function (CUTE)	2084	2085	2057	2060	2500	2501
DQDRTIC function (CUTE)	4090	2805	6518	2542	395,014	400,147
Extended Beale function	2200	4720	3277	3416	207,852	208,551
EDENSCH function (CUTE)	1198	1213	956	872	9403	10,615

Table 5. Summary of CPUts results for

M S M

,

S M

,

G D

,

F S M

,

F G D

, and

F M S M

.

Table 5. Summary of CPUts results for

M S M

,

S M

,

G D

,

F S M

,

F G D

, and

F M S M

.

Test Function	CPU Time
Test Function	MSM	FMSM	SM	FSM	GD	FGD
Extended penalty function	3.734	1.969	1.969	1.844	17.672	19.078
Perturbed quadratic function	167.063	323.266	298.813	317.250	10,163.688	9771.406
Raydan 1 function	46.813	35.141	50.953	30.234	727.281	667.094
Raydan 2 function	0.453	0.281	0.281	0.344	0.250	0.531
Diagonal 1 function	522.703	86.500	59.297	99.953	1836.766	2091.281
Diagonal 2 function	236.531	228.188	271.094	276.281	2105.219	2158.156
Diagonal 3 function	75.484	172.250	139.859	157.594	3842.625	4025.688
Hager function	384.438	9.594	9.453	9.250	116.922	118.609
Generalized tridiagonal 1 function	2.656	3.188	2.000	3.797	11.641	14.875
Extended TET function	0.953	1.313	0.906	1.359	15.922	16.281
Extended quadratic penalty QP1 function	1.688	1.625	1.875	1.578	4.203	4.391
Extended quadratic penalty QP2 function	5.844	9.891	7.203	10.516	746.328	770.500
Quadratic QF2 function	124.344	47.875	243.688	35.359	7611.656	8436.359
Extended quadratic exponential EP1 function	0.969	0.594	0.469	1.109	5.281	7.297
Extended tridiagonal 2 function	1.906	1.313	1.609	1.266	3.359	3.766
Almost perturbed quadratic function	135.484	314.953	238.625	267.750	9271.016	13,902.047
ENGVAL1 function (CUTE)	2.031	1.797	1.844	1.828	4.125	4.422
QUARTC function (CUTE)	2.813	2.984	3.250	3.219	6253.828	8032.547
Diagonal 6 function	0.328	0.219	0.344	0.484	0.203	0.438
Generalized quartic function	0.344	0.266	0.438	0.625	6.766	11.922
Diagonal 7 function	0.953	0.797	0.531	1.813	3.672	4.406
Diagonal 8 function	0.781	0.922	1.797	1.047	5.578	4.469
Diagonal 9 function	249.875	74.484	53.234	77.219	2478.422	2705.781
HIMMELH function (CUTE)	0.797	0.594	0.781	0.797	0.609	0.641
Extended Rosenbrock	0.203	0.094	0.156	0.203	0.219	0.141
Extended BD1 function (block diagonal)	0.766	0.766	0.859	0.969	4.984	4.469
NONDQUAR function (CUTE)	7.266	8.891	7.797	9.047	9.406	10.406
DQDRTIC function (CUTE)	2.516	1.500	2.906	1.500	118.250	127.844
Extended Beale function	7.219	18.734	9.766	16.016	488.328	546.359
EDENSCH function (CUTE)	6.141	6.422	4.016	5.063	24.672	36.766

Table 6. Average numerical outcomes for 30 test functions tested on 10 numerical experiments.

Average Performances	MSM	FMSM	SM	FSM	GD	FGD
Average no. of iterations	9477.43	8493.20	11,086.33	8631.57	91,962.60	91,565.67
Average no. of funct. evaluation	67,587.60	49,020.13	62,247.90	48,309.73	2,416,934.77	2,406,242.00
Average CPU time (s)	66.44	45.21	47.19	44.51	1529.30	1783.27

Table 7. The number of people killed in traffic accidents in Serbia from 2012 to 2021.

Year	Number of Data (x)	The Number of People Killed in Traffic Accidents in Serbia (y)
2012	1	688
2013	2	650
2014	3	536
2015	4	599
2016	5	607
2017	6	579
2018	7	548
2019	8	534
2020	9	492
2021	10	521

Table 8. Test results for optimization of quadratic model for the FMSM, FSM, and FGD methods.

Method	Initial Point	NI	NFE	CPUts	Regression Parameters (a $_{0}$ , a $_{1}$ , a $_{2}$ )
Method	Initial Point	NI	NFE	CPUts	a $_{0}$	a $_{1}$	a $_{2}$
FMSM	(1,1,1)	28,998	119,898	1.484	685.166632504562	$- 24.1030144870845$	0.530301492634611
FSM	(1,1,1)	29,612	120,545	1.609	685.166666629541	$- 24.1030302889654$	0.530303029090458
FGD	(1,1,1)	173,004	7,861,471	35.125	685.161769964723	$- 24.1009143873562$	0.530114238129987
FMSM	(5,5,5)	29,791	126,449	1.750	685.166627004962	$- 24.102996538241$	0.530299060289809
FSM	(5,5,5)	29,504	119,706	1.406	685.166666659503	$- 24.1030303019929$	0.530303030290009
FGD	(5,5,5)	172,876	7,855,584	36.812	685.161745521808	$- 24.1009038359837$	0.530113219772043
FMSM	(−1,−1,−1)	29,259	120,695	1.484	685.166666761033	$- 24.1030303425383$	0.530303033790302
FSM	(−1,−1,−1)	29,513	119,912	1.328	685.166388359794	$- 24.1029100449169$	0.530292483042678
FGD	(−1,−1,−1)	173,698	7,893,030	37.797	685.161987072222	$- 24.1010082057947$	0.530122579942827

Table 9. Estimation point and relative errors for 2021 data.

Method	Estimation Point	Relative Error
FMSM	497.16664	0.045745419
FSM	497.16667	0.045745362
FGD	497.16405	0.045750384
Least Square	497.16667	0.045745361
Trend line	497.17000	0.045738964

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Stanimirović, P.S.; Ivanov, B.; Stanujkić, D.; Katsikis, V.N.; Mourtas, S.D.; Kazakovtsev, L.A.; Edalatpanah, S.A. Improvement of Unconstrained Optimization Methods Based on Symmetry Involved in Neutrosophy. Symmetry 2023, 15, 250. https://doi.org/10.3390/sym15010250

AMA Style

Stanimirović PS, Ivanov B, Stanujkić D, Katsikis VN, Mourtas SD, Kazakovtsev LA, Edalatpanah SA. Improvement of Unconstrained Optimization Methods Based on Symmetry Involved in Neutrosophy. Symmetry. 2023; 15(1):250. https://doi.org/10.3390/sym15010250

Chicago/Turabian Style

Stanimirović, Predrag S., Branislav Ivanov, Dragiša Stanujkić, Vasilios N. Katsikis, Spyridon D. Mourtas, Lev A. Kazakovtsev, and Seyyed Ahmad Edalatpanah. 2023. "Improvement of Unconstrained Optimization Methods Based on Symmetry Involved in Neutrosophy" Symmetry 15, no. 1: 250. https://doi.org/10.3390/sym15010250

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

Improvement of Unconstrained Optimization Methods Based on Symmetry Involved in Neutrosophy

Abstract

1. Introduction, Preliminaries, and Motivation

2. Fuzzy Optimization Methods

FMSM Method

3. Convergence Analysis

4. Numerical Experiments

4.1. Comparison Based on the Dolan–Moré Performance Profile

4.2. Closer Examination of the Optimization Methods

4.3. Ranking the Optimization Methods

4.4. Application of the Fuzzy Optimization Methods to Regression Analysis

5. Conclusions

Author Contributions

Funding

Data Availability Statement

Acknowledgments

Conflicts of Interest

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI