Article

Application of the Parabola Method in Nonconvex Optimization

1
Department of Applied Mathematics, Melentiev Energy Systems Institute, Lermontov St. 130, 664033 Irkutsk, Russia
2
Scientific and Educational Center “Artificial Intelligence Technologies”, Bauman Moscow State Technical University, 2nd Baumanskaya Str. 5, 105005 Moscow, Russia
*
Authors to whom correspondence should be addressed.
Algorithms 2024, 17(3), 107; https://doi.org/10.3390/a17030107
Submission received: 18 December 2023 / Revised: 19 February 2024 / Accepted: 21 February 2024 / Published: 1 March 2024
(This article belongs to the Special Issue Biology-Inspired Algorithms and Optimization)

Abstract

We consider the Golden Section and Parabola Methods for solving univariate optimization problems. For multivariate problems, we use these methods as line search procedures in combination with well-known zero-order methods such as the coordinate descent method, the Hooke and Jeeves method, and the Rosenbrock method. A comprehensive numerical comparison of the obtained versions of zero-order methods is given in the present work. The set of test problems includes nonconvex functions with a large number of local and global optimum points. Zero-order methods combined with the Parabola method demonstrate high performance and quite frequently find the global optimum even for large problems (up to 100 variables).

1. Introduction

Zero-order methods are usually applied to problems in which the function being optimized has no explicit analytic expression. To calculate the values of such functions, it is necessary to solve non-trivial auxiliary problems of computational mathematics and optimization. In some cases, the calculation of a single objective function value may take from a few minutes to several hours of continuous computer operation. Developing an effective global optimization method requires identifying certain properties of the objective function (and constraints), for example, determining a good estimate of the Lipschitz constant [1,2,3,4] or representing the function as a difference of two convex functions [5,6,7]. Such auxiliary problems do not always have a unique solution and are often not easy to solve. The effectiveness of global optimization methods often depends on the ability to quickly find a good local solution [8], which can significantly speed up some global optimization methods (for example, the branch-and-bound method).
The aim of the paper is to test several well-known zero-order methods on nonconvex global optimization problems using a special version of the multistart strategy. We check the ability of these methods not only to find a good local solution but also to find the global optimum of the problem. The multistart strategy is considered here as a starting case for population-based optimization; indeed, the multistart approach can be viewed as an overabundant population. Our aim is to check the efficiency of the suggested approach under this condition.
Special attention is given to the Parabola Method in the present work. The reason for such an investigation is the following: if the objective function is differentiable, then in a small neighborhood of an interior local minimum point, the objective function behaves like a convex function. In [9], a more general statement of this property based on the concept of local programming is given. A combination of this method with different zero-order methods in multivariate optimization problems demonstrates quite high performance and the qualities of a global optimization method. We tested several well-known zero-order multivariate optimization methods; nevertheless, this modification of the Parabola Method can be used as the line-search procedure in many other methods.
The paper is organized as follows. In Section 2, we give a description of the Golden Section and Parabola Methods and perform numerical experiments on a set of univariate optimization problems. Section 3 introduces the accelerated two-stage approach. A numerical comparison of zero-order methods in combination with the Golden Section and Parabola Methods is presented in Section 4. We conclude the paper with a short discussion of the results in Section 5 and conclusions in Section 6.

2. Univariate Optimization: Parabola Method

We consider the following optimization problem
$\min f(x)$, subject to $x \in P \subset \mathbb{R}$, $P = \{ x \in \mathbb{R} : \alpha \le x \le \beta \}$,   (1)
where $f : \mathbb{R} \to \mathbb{R}$ is a continuously differentiable function. It is assumed that f has a finite number of local minima over P, and only values of f are available. Problems with such an assumption are described in [10]. In this section, we consider two well-known univariate optimization methods: the Golden Section Method (GSM) [11] and the Quadratic Interpolation Method, also known as Powell's method or the Parabola Method (PM) [12]. Let us give brief descriptions of both methods.

2.1. The Golden Section Method

Step 0.
Choose the accuracy of the algorithm $\varepsilon > 0$. Set $\gamma = \frac{\sqrt{5} - 1}{2}$, $\alpha_0 = \alpha$, $\beta_0 = \beta$, $\lambda_0 = \alpha_0 + (1 - \gamma)(\beta_0 - \alpha_0)$, $\mu_0 = \alpha_0 + \gamma(\beta_0 - \alpha_0)$. Evaluate $f(\lambda_0)$ and $f(\mu_0)$. Set $k = 0$.
Step 1.
If $\beta_k - \alpha_k \le \varepsilon$, stop: $x^*$ is an $\varepsilon$-optimal point. Otherwise, if $f(\lambda_k) > f(\mu_k)$, go to Step 2; else, go to Step 3.
Step 2.
Set $\alpha_{k+1} = \lambda_k$, $\beta_{k+1} = \beta_k$, $\lambda_{k+1} = \mu_k$, $\mu_{k+1} = \alpha_{k+1} + \gamma(\beta_{k+1} - \alpha_{k+1})$. Evaluate $f(\mu_{k+1})$. Go to Step 4.
Step 3.
Set $\alpha_{k+1} = \alpha_k$, $\beta_{k+1} = \mu_k$, $\mu_{k+1} = \lambda_k$, $\lambda_{k+1} = \alpha_{k+1} + (1 - \gamma)(\beta_{k+1} - \alpha_{k+1})$. Evaluate $f(\lambda_{k+1})$. Go to Step 4.
Step 4.
Increase $k \leftarrow k + 1$, and go to Step 1.
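For illustration, here is a minimal Python sketch of the Golden Section Method exactly as stated in Steps 0–4; the function name, the returned evaluation counter, and the default accuracy are our own illustrative choices, not part of the original description.

```python
import math

def golden_section(f, a, b, eps=1e-3):
    """Golden Section Method (Steps 0-4) for a unimodal f on [a, b]."""
    gamma = (math.sqrt(5.0) - 1.0) / 2.0                # the golden-section constant
    lam = a + (1.0 - gamma) * (b - a)
    mu = a + gamma * (b - a)
    f_lam, f_mu = f(lam), f(mu)
    n_eval = 2                                          # Step 0: two initial evaluations
    while b - a > eps:                                  # Step 1: stopping test
        if f_lam > f_mu:                                # Step 2: drop the left part
            a, lam, f_lam = lam, mu, f_mu
            mu = a + gamma * (b - a)
            f_mu = f(mu)
        else:                                           # Step 3: drop the right part
            b, mu, f_mu = mu, lam, f_lam
            lam = a + (1.0 - gamma) * (b - a)
            f_lam = f(lam)
        n_eval += 1                                     # one new evaluation per iteration
    # once b - a <= eps, either interior point is eps-optimal
    return (lam, f_lam, n_eval) if f_lam <= f_mu else (mu, f_mu, n_eval)
```

The evaluation counter reproduces the "iterations plus two" relation noted later in this section.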

2.2. The Parabola Method

Step 0.
Choose the accuracy $\varepsilon > 0$. Choose points $x_1^0 < x_2^0 < x_3^0$ in $[\alpha, \beta]$ such that
$f(x_1^0) > f(x_2^0) < f(x_3^0)$.   (2)
Set $z = x_2^0$, $k = 0$.
Step 1.
Find the minimum $\bar{x}_k$ of the quadratic interpolation polynomial in the following way:
$\bar{x}_k = \frac{1}{2}\left( x_1^k + x_2^k - \frac{a_1^k}{a_2^k} \right)$,
$a_1^k = \frac{f(x_2^k) - f(x_1^k)}{x_2^k - x_1^k}$, $\quad a_2^k = \frac{1}{x_3^k - x_2^k}\left( \frac{f(x_3^k) - f(x_1^k)}{x_3^k - x_1^k} - a_1^k \right)$.
Step 2.
Check the stopping criterion
$|\bar{x}_k - z| \le \varepsilon$.   (3)
If (3) holds, then terminate the algorithm, and $x^* = \bar{x}_k$ is an $\varepsilon$-optimal point. Otherwise, go to Step 3.
Step 3.
Choose $\hat{x}^k = \arg\min \{ f(x) : x \in \{ x_1^k, x_3^k \} \}$.
Set $z = \bar{x}_k$. Denote the points $\hat{x}^k$, $x_2^k$, $\bar{x}_k$ in ascending order as $x_1^{k+1}$, $x_2^{k+1}$, $x_3^{k+1}$. Increase $k \leftarrow k + 1$, and go to Step 1.
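The steps above translate into the following short Python sketch; the interface (a starting three-point pattern passed explicitly, plus an iteration cap) is our own choice and not part of the original statement.

```python
def parabola_method(f, x1, x2, x3, eps=1e-3, max_iter=100):
    """Parabola Method (Steps 0-3); requires x1 < x2 < x3 with f(x1) > f(x2) < f(x3)."""
    f1, f2, f3 = f(x1), f(x2), f(x3)
    z = x2                                              # Step 0
    x_bar = x2
    for _ in range(max_iter):
        a1 = (f2 - f1) / (x2 - x1)                      # Step 1: divided differences
        a2 = ((f3 - f1) / (x3 - x1) - a1) / (x3 - x2)   # a2 > 0 for a three-point pattern
        x_bar = 0.5 * (x1 + x2 - a1 / a2)               # vertex of the interpolating parabola
        if abs(x_bar - z) <= eps:                       # Step 2: stopping criterion (3)
            break
        f_bar = f(x_bar)
        z = x_bar
        # Step 3: keep x2, x_bar and the better endpoint, reordered in ascending order
        x_hat, f_hat = (x1, f1) if f1 <= f3 else (x3, f3)
        (x1, f1), (x2, f2), (x3, f3) = sorted([(x_hat, f_hat), (x2, f2), (x_bar, f_bar)])
    return x_bar, f(x_bar)
```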
When f is unimodal (semi-strictly quasiconvex [13]) over P, both methods determine an approximate minimum point in a finite number of iterations. In the case of the Golden Section Method, the number of function evaluations is equal to the number of iterations plus two; in the case of the Parabola Method, the number of function evaluations is equal to the number of iterations plus three. It is well known that, in general, the Golden Section Method is more efficient than the Parabola Method. However, in the continuously differentiable case, the behavior of the Parabola Method can be improved.
Consider the following two examples from [14].
Example 1.
In problem (1), $f(x) = -(16x^2 - 24x + 5)e^{-x}$, $P = [1.9, 3.9]$. The Golden Section Method with $\varepsilon = 0.001$ finds the approximate solution $x^* = 2.867996734$ with $f(x^*) = -3.850450707$ in 16 iterations. The Parabola Method with the same $\varepsilon$ finds the approximate solution $x^* = 2.868823736$ with $f(x^*) = -3.850448184$ in six iterations.
Example 2.
In problem (1), $f(x) = 2(x - 3)^2 + e^{x^2/2}$, $P = [-3, 3]$. The Golden Section Method with $\varepsilon = 0.001$ finds the approximate solution $x^* = 1.590558077$ with $f(x^*) = 7.515924361$ in 19 iterations. The Parabola Method with the same $\varepsilon$ finds the approximate solution $x^* = 1.584929941$ with $f(x^*) = 7.516292947$ in 35 iterations.
In both examples, the objective functions are unimodal. In the first example, the Parabola Method worked two times faster, and in the second example, the Parabola Method worked about two times slower than the Golden Section Method. From a geometrical point of view, the objective function in the first example is more like a parabola than in the second example, and we are going to show how the efficiency of the Parabola Method can be improved in cases similar to Example 2.
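As a quick check of the two sketches above, the comparison of Example 2 can be reproduced as follows. The starting three-point pattern {0, 1.5, 3} for the Parabola Method is our own choice for illustration (the paper does not specify one), and the iteration counts obtained this way need not coincide with those reported above.

```python
import math

def f_ex2(x):                                   # objective of Example 2
    return 2.0 * (x - 3.0) ** 2 + math.exp(x * x / 2.0)

x_gsm, f_gsm, n_gsm = golden_section(f_ex2, -3.0, 3.0, eps=1e-3)
# {0, 1.5, 3} satisfies f(0) > f(1.5) < f(3), so it is a valid three-point pattern
x_pm, f_pm = parabola_method(f_ex2, 0.0, 1.5, 3.0, eps=1e-3)
print(x_gsm, f_gsm)     # both runs should land near x = 1.59, f = 7.516
print(x_pm, f_pm)
```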
We start by checking the efficiency of the Golden Section Method on a number of multimodal functions. Table 1 and Table 2 present the results for 17 test problems from [14].
The symbol g in the g/l column corresponds to the case when a global minimum point was determined, and the symbol l corresponds to the case when only a local minimum point was determined. The symbol $k_f$ denotes the number of function evaluations. Surprisingly enough, only in four problems was merely a local minimum determined; in all other problems, the Golden Section Method found global solutions.
The geometrical interpretation of the failure of the Golden Section Method to find a global minimum can be given on the basis of problems 7 and 11 from Table 2. In these cases, only local minimum points were determined, and, as can be seen from Figure 1 and Figure 2, the global minimum points are located closer to the endpoints of the feasible intervals. Hence, we can make the following assumption: univariate multimodal problems whose global minimum points are located more or less in the middle of the interval are treated more successfully by the Golden Section Method from the global optimization point of view. This topic needs further theoretical investigation and is not the subject of our paper.
Let us turn now to the Parabola Method. In order to start the method, property (2) must be satisfied. The points satisfying property (2) are known as a three-point pattern (TPP) in [11]. In order to find the TPP, we propose the following procedure.

2.3. TPP Procedure

Step 0.
Choose an integer $N > 0$, and calculate $\Delta = \frac{\beta - \alpha}{N}$. Set $k = 0$, $k_p = 0$.
Step 1.
Calculate $y_1^k = \alpha + k\Delta$, $y_2^k = \alpha + (k+1)\Delta$, $y_3^k = \alpha + (k+2)\Delta$.
Step 2.
If $f(y_1^k) > f(y_2^k) < f(y_3^k)$, then set $k_p = k_p + 1$, $\nu_1^{k_p} = y_1^k$, $\nu_2^{k_p} = y_2^k$, $\nu_3^{k_p} = y_3^k$.
Step 3.
If $y_3^k \le \beta - \Delta$, then increase $k \leftarrow k + 1$, and go to Step 1. Otherwise, stop.
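A single pass of the TPP procedure can be sketched in Python as follows. Unlike the literal step-by-step statement, this sketch caches the grid values so that each grid point is evaluated only once; the function name and return format are illustrative.

```python
def tpp(f, a, b, n):
    """One pass of the TPP procedure on [a, b] with N = n subdivision points."""
    delta = (b - a) / n                                  # Step 0
    values = [f(a + k * delta) for k in range(n + 1)]    # grid values, cached
    patterns = []
    for k in range(n - 1):                               # Steps 1-3: slide the window
        if values[k] > values[k + 1] < values[k + 2]:    # three-point pattern found
            patterns.append((a + k * delta, a + (k + 1) * delta, a + (k + 2) * delta))
    return patterns, values
```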
When the TPP procedure stops, we have $k_p$ three-point patterns, and in order to find the best solution, we can start the Parabola Method for all $k_p$ patterns. In practice, we used the TPP procedure several times. We started from a small enough value of N, say $N = 3$, and ran the TPP procedure. Then, we increased $N \leftarrow 2N$ and ran the TPP procedure again. When the number of three-point patterns was the same in both runs, we stopped; in this case, we say that the TPP procedure has stabilized. Otherwise, we increased N and ran the TPP procedure again, and so on. Since we assumed that the number of local minima of f over P is finite, such a repetition is also finite. Finally, we obtained $k_p$ TPP subintervals $[\nu_1^i, \nu_3^i]$ and the corresponding three-point patterns $\nu_1^i, \nu_2^i, \nu_3^i$, $i = 1, \ldots, k_p$. We cannot guarantee that each subinterval contains exactly one local minimum, but in practice this property holds rather often. Then, we ran the Parabola Method for each subinterval, found the corresponding local minima, and selected the best one as the solution. If $k_p = 0$, then no three-point patterns were detected. In this case, the Parabola Method is not applicable (it may fail to converge), and we switch to the Golden Section Method. We call this strategy a two-stage local search approach (two-stage approach for short).
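Putting the pieces together, the two-stage approach just described can be sketched as below, reusing tpp, parabola_method, and golden_section from the earlier sketches; the doubling cap max_doublings is a safeguard we add and is not part of the original description.

```python
def two_stage(f, a, b, eps=1e-3, n0=3, max_doublings=10):
    """Two-stage local search: stabilize the TPP count, then refine each subinterval."""
    n = n0
    patterns, _ = tpp(f, a, b, n)
    for _ in range(max_doublings):
        n *= 2                                           # N <- 2N
        finer, _ = tpp(f, a, b, n)
        stable = len(finer) == len(patterns)             # same number of patterns twice
        patterns = finer
        if stable:
            break
    if not patterns:                                     # k_p = 0: fall back to the GSM
        x, fx, _ = golden_section(f, a, b, eps)
        return [(x, fx)]
    return [parabola_method(f, y1, y2, y3, eps) for (y1, y2, y3) in patterns]
```

Sorting the returned list by the objective value gives the best local solution found.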
Let us apply the two-stage approach to solving problems 11 (Figure 1) and 7 (Figure 2) from Table 2, for which the Golden Section Method did not find the global minimum points, and then run the Parabola Method over each obtained subinterval.
Example 3.
Problem 11: $f(x) = -e^{-x}\sin(2\pi x)$, $P = [0, 4]$. Initial value $N = 3$. The TPP procedure was run four times. After the first run, one subinterval with one three-point pattern was obtained; after the second run, two subintervals with two three-point patterns were obtained; after the third run, four subintervals with four three-point patterns were obtained; and after the fourth run, four smaller subintervals with four three-point patterns were obtained again. The TPP procedure was stopped, and $k_p = 4$. The total number of function evaluations was equal to 24. Then, with the accuracy $\varepsilon = 0.001$, we ran the Parabola Method, which stopped after two additional function evaluations for each subinterval. The final subintervals, the three-point patterns (TPP), the corresponding starting parabolas, and the additional function evaluations ($k_f$) of the Parabola Method over each subinterval are given in Table 3.
Therefore, in total, after 32 function evaluations, all minima of the given function were determined. The geometrical interpretation of the results of the TPP procedure application corresponding to Table 3 is presented in Figure 3.
Example 4.
Problem 7: $f(x) = \sin(x) + \sin\frac{2x}{3}$, $P = [3.1, 20.4]$. Initial value $N = 3$. Three subintervals were determined after four runs of the TPP procedure, $k_p = 3$ (see Figure 4). The results of the TPP procedure and the Parabola Method are given in Table 4.
After 36 function evaluations, all minima were detected.
Now, we apply the two-stage approach to the problems from Table 1 and Table 2. The results are given in Table 5 and Table 6. Since problems 7 and 11 were considered in Examples 3 and 4, they are not included in Table 5. Problem 6 is described separately in Table 6.
In these tables, the Minima Structure column shows how many local and global minima each problem has, the $k_f$ TPP column shows the number of function evaluations performed until the TPP procedure stabilized, the TPP Subintervals column shows the subintervals obtained from the application of the TPP procedure, the $k_f$ GSM column shows the number of function evaluations of the Golden Section Method, the $k_f$ PM column shows the number of function evaluations of the Parabola Method, and the g/l column shows the type of the calculated point: g—global minimum, l—local minimum. For example, for problem 2, with two local minima and one global minimum, the TPP procedure found three subintervals after 12 function evaluations. The first subinterval contains a local minimum, the second subinterval contains the global minimum, and the third subinterval contains a local minimum. The minimum over the first subinterval was found by the Golden Section Method in 14 function evaluations and by the Parabola Method in 4 function evaluations. Both methods demonstrated the same results over the second and the third subintervals. Therefore, the total number of function evaluations spent by the two-stage approach was 12 + 3 × 14 = 54 with the Golden Section Method at the second stage and 12 + 3 × 4 = 24 with the Parabola Method.
Table 6 shows the results of the application of the two-stage approach to problem 6. The application of the TPP procedure resulted in 19 subintervals, and for each of them, the Golden Section Method and the Parabola Method were used; all global minima were found, as well as all interior local minima, and one local minimum was attained at an endpoint of the feasible interval.
We can see from the presented test results that the direct application of the two-stage approach may involve a lot of computation. If we are interested in finding a good local solution faster than with the pure Golden Section Method, we can use the following Accelerated Two-Stage Approach (ATSA).

3. Accelerated Two-Stage Approach

In this section, we propose a modification of the two-stage approach described in the previous section. We set the integer parameter N in the TPP procedure in advance and do not change it. When the TPP procedure is finished, $N + 1$ values of the objective function at the points $x_k = \alpha + k\Delta$, $k = 0, \ldots, N$, with $\Delta = \frac{\beta - \alpha}{N}$, are available. We determine the record value $f_{Rec} = \min\{ f(x_i) : i = 0, \ldots, N \}$ and a corresponding record point $x_{Rec} \in \mathrm{Arg\,min}\{ f(x_i) : i = 0, \ldots, N \}$. If the number $k_p$ of TPP subintervals is positive, $k_p > 0$, then we choose the TPP subinterval which contains the record point $x_{Rec}$ and run the Parabola Method over this TPP subinterval. Let $x_{PM}$ be the point found by the Parabola Method. We define the point $x^* = \arg\min\{ f(x_{Rec}), f(x_{PM}) \}$ and the corresponding objective value $f^* = f(x^*)$. We deliver the pair $(x^*, f^*)$ as the result of the Accelerated Two-Stage Approach. If the number $k_p$ is equal to zero, i.e., no TPP subintervals were detected, then we determine
$m \in \{ 0, \ldots, N - 1 \}: \; x_m \le x_{Rec} \le x_{m+1}$,   (4)
and run the Golden Section Method over the interval $[x_m, x_{m+1}]$. Let $x_{GSM}$ be the corresponding point determined by the Golden Section Method. As in the previous case, we define the point $x^* = \arg\min\{ f(x_{GSM}), f(x_{Rec}) \}$ and the value $f^* = f(x^*)$, and we deliver the pair $(x^*, f^*)$ as the result of the Accelerated Two-Stage Approach (ATSA).
Let us now give the description of the ATSA procedure.

The ATSA Procedure

Step 1.
Apply the TPP procedure. Let $k_p$ and $\Delta$ be the parameters calculated by the TPP procedure. Let $f_{Rec}$ be the record value over all calculated points and $x_{Rec}$ the point with $f(x_{Rec}) = f_{Rec}$.
Step 2.
If $k_p > 0$, then select the subinterval containing $x_{Rec}$ and run the Parabola Method over the selected subinterval, obtaining the point $x_{PM}$. Define the point $x^* = \arg\min\{ f(x_{Rec}), f(x_{PM}) \}$ and the value $f^* = f(x^*)$. Stop.
Step 3.
If $k_p = 0$, then determine the subinterval $[x_m, x_{m+1}]$ according to (4) and run the Golden Section Method, obtaining the point $x_{GSM}$. Define the point $x^* = \arg\min\{ f(x_{Rec}), f(x_{GSM}) \}$ and the value $f^* = f(x^*)$. Stop.
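The ATSA procedure can be sketched as follows, again on top of the earlier helpers. Two details are our own reading of the description and should be treated as assumptions: when $k_p = 0$ the record point is bracketed by the grid cell to its left, and the record itself is returned if it happens to lie outside every TPP subinterval.

```python
def atsa(f, a, b, n=3, eps=1e-3):
    """Accelerated Two-Stage Approach: one TPP pass, then one local refinement."""
    patterns, values = tpp(f, a, b, n)                  # Step 1
    delta = (b - a) / n
    k_rec = min(range(n + 1), key=lambda k: values[k])  # record point on the grid
    x_rec, f_rec = a + k_rec * delta, values[k_rec]
    if patterns:                                        # Step 2: k_p > 0, refine with the PM
        for (y1, y2, y3) in patterns:
            if y1 <= x_rec <= y3:
                x_pm, f_pm = parabola_method(f, y1, y2, y3, eps)
                return (x_pm, f_pm) if f_pm < f_rec else (x_rec, f_rec)
        return x_rec, f_rec                             # record outside every pattern
    # Step 3: k_p = 0, refine the bracketing grid cell with the GSM
    m = min(max(k_rec - 1, 0), n - 1)                   # one admissible m satisfying (4)
    x_gsm, f_gsm, _ = golden_section(f, a + m * delta, a + (m + 1) * delta, eps)
    return (x_gsm, f_gsm) if f_gsm < f_rec else (x_rec, f_rec)
```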
The results of testing the accelerated two-stage approach are given in Table 7. The integer parameter N for the TPP procedure was equal to 3. The GSM/PM column shows the number of function evaluations of the corresponding method. If, after the first stage, the number of TPP subintervals was equal to zero (as for problem 1, where $k_p = 0$), then the Golden Section Method was used, and the number of function evaluations was $k_f^{GSM} = 17$. If, after the first stage, the number of TPP subintervals was positive (as for problem 2, where $k_p = 2$), then the Parabola Method was used, and the number of function evaluations was $k_f^{PM} = 5$. The Total $k_f$ column shows the total number of function evaluations of the ATSA procedure, and the g/l column shows whether a global (g) or local (l) minimum was found.
We have to mention that the integer parameter N (the number of subdivision points in the TPP procedure) is the most crucial one. If, for example, N is equal to 20, then global minimum points were determined for all test problems. However, in this case, the numbers of objective function evaluations were rather large, more than several tens. The parameter N can be chosen according to the number of expected local minima. In the general case, we aim to find a good local minimum, and based on the results of our testing, we recommend choosing N between 5 and 10.
We can see from Table 7 that the accelerated two-stage approach finds solutions with lower computational effort compared to the pure Golden Section Method, while still finding global solutions in almost all the test problems. As for problems 12 and 17, some further modifications could be devised. Nevertheless, the current results are encouraging.

4. Numerical Comparison of Zero-Order Methods: Multivariate Optimization Problems

We tested and compared the following zero-order methods: the Hooke and Jeeves method combined with the ATSA, the Rosenbrock method with a discrete step in the search procedure, the Rosenbrock method combined with the ATSA, and the coordinate descent method combined with the ATSA. In the current section, we give a brief description of these methods. Many of them are described in monographs and review articles on optimization methods [10,11,15,16,17,18,19].
Hooke and Jeeves method. The pattern search method of Hooke and Jeeves consists of a sequence of exploratory moves about a base point that, if successful, are followed by pattern moves.
The exploratory moves acquire information about the function $f(x)$ in the neighborhood of the current base point $b_k = (x_1^k, \ldots, x_n^k)$. Each variable $x_i^k$, in turn, is given an increment $\varepsilon_i$ (first in the positive direction and then, if necessary, in the negative direction), and a check is made of the new function value. If any move results in a reduced function value, the new value of that variable is retained. After all the variables have been considered, a new base point $b_{k+1}$ is reached. If $b_{k+1} = b_k$, no function reduction has been achieved; the step length $\varepsilon_i$ is reduced, and the procedure is repeated. If $b_{k+1} \neq b_k$, a pattern move from $b_k$ is made.
A pattern move attempts to speed up the search by using the information already acquired about $f(x)$ to identify the best search direction. A move is made from $b_{k+1}$ in the direction $d = b_{k+1} - b_k$, since a move in this direction leads to a decrease in the function value. In this step, we use the ATSA to solve the univariate optimization problem $\min_{\lambda} f(b_{k+1} + \lambda d)$ and obtain a new point $p_k = b_{k+1} + \lambda d$. The search continues with a new sequence of exploratory moves about $p_k$. If the lowest function value obtained is less than $f(b_k)$, then a new base point $b_{k+2}$ has been reached. In this case, a second pattern move is made. If not, the pattern move from $b_{k+1}$ is abandoned, and we continue with a new sequence of exploratory moves about $b_{k+1}$.
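A compact sketch of this pattern search with a pluggable line search is given below; the bracket over which the ATSA is applied to the step length (here [0, 2]) and the default step-control constants are our own assumptions, not values taken from the paper.

```python
import numpy as np

def hooke_jeeves(f, x0, step=0.5, shrink=0.5, tol=1e-4, line_search=None, max_iter=1000):
    """Hooke and Jeeves pattern search with an optional line search for the pattern move."""
    def explore(base, h):
        x = base.copy()
        for i in range(len(x)):
            for s in (+h, -h):                           # exploratory move: +h, then -h
                trial = x.copy()
                trial[i] += s
                if f(trial) < f(x):
                    x = trial
                    break
        return x

    b = np.asarray(x0, dtype=float)                      # current base point
    for _ in range(max_iter):
        if step <= tol:
            break
        x = explore(b, step)                             # exploratory moves about the base
        if f(x) >= f(b):
            step *= shrink                               # no reduction: shrink the step
            continue
        while f(x) < f(b):                               # pattern moves while they pay off
            d = x - b
            t = 1.0 if line_search is None else line_search(lambda t: f(x + t * d))
            b, x = x, explore(x + t * d, step)           # new base; explore the pattern point
    return b, f(b)

# e.g. a Hooke-Jeeves-par style run with an assumed step bracket [0, 2]:
# hj_par = lambda f, x0: hooke_jeeves(f, x0, line_search=lambda phi: atsa(phi, 0.0, 2.0, n=5)[0])
```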
Rosenbrock method. The main idea of the method is to iteratively find the descending direction of a function along linearly independent and orthogonal directions. A successful step in the current direction leads to an increase in this step on the following iteration by means of a stretch coefficient ρ > 0 ; otherwise, the coefficient 0 < ρ < 1 is used to decrease the step. The search within the current direction system is implemented until all possibilities of function reduction are exhausted. If there are no successful directions, a new set of linearly independent and orthogonal directions is constructed by means of rotating the previous ones in an appropriate manner. To obtain a new direction system, the Gram–Schmidt procedure is used.
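The direction-rebuilding step is the only non-trivial piece of the Rosenbrock method, so we sketch it separately: given the current orthonormal directions and the total displacement made along each of them, a new orthonormal system is obtained by the Gram–Schmidt procedure. The function name and the handling of degenerate (zero) displacements are our own conventions.

```python
import numpy as np

def rosenbrock_directions(old_dirs, step_sums):
    """Rebuild orthonormal search directions from the successful displacements."""
    n = len(old_dirs)
    # a_i = sum_{j >= i} step_sums[j] * d_j  (classical Rosenbrock construction)
    a = [sum(step_sums[j] * old_dirs[j] for j in range(i, n)) for i in range(n)]
    new_dirs = []
    for i in range(n):
        v = a[i] - sum(np.dot(a[i], d) * d for d in new_dirs)   # Gram-Schmidt step
        norm = np.linalg.norm(v)
        # keep the old direction if the displacement along it degenerated to zero
        new_dirs.append(v / norm if norm > 1e-12 else old_dirs[i])
    return np.array(new_dirs)

# starting from the coordinate axes in 2-D: rosenbrock_directions(np.eye(2), np.array([s1, s2]))
```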
Iterative coordinate descent method. This method is a simplified version of the Hooke and Jeeves method. It uses only the analogue of exploratory moves with an appropriate step size obtained from the line search procedure along the current basis vector.
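Correspondingly, the coordinate descent variant reduces to a loop of univariate searches along the basis vectors; the search bracket [-bracket, bracket] for each coordinate step and the sweep-based stopping rule are our own illustrative choices.

```python
import numpy as np

def coordinate_descent(f, x0, line_search, bracket=1.0, tol=1e-6, max_sweeps=100):
    """Coordinate descent: one univariate line search per coordinate per sweep."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_sweeps):
        x_prev = x.copy()
        for i in range(len(x)):
            e = np.zeros_like(x)
            e[i] = 1.0
            phi = lambda t, x=x, e=e: f(x + t * e)       # restriction along the i-th axis
            x = x + line_search(phi, -bracket, bracket) * e
        if np.linalg.norm(x - x_prev) < tol:             # the whole sweep changed nothing
            break
    return x, f(x)

# coordinate descent-par style run: plug in the ATSA as the line search (parameters illustrative)
# cd_par = lambda f, x0: coordinate_descent(f, x0, lambda phi, a, b: atsa(phi, a, b, n=5)[0])
```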
Combined versions. We designed three modifications of the described methods to estimate the potential for finding the global optimum. The iterative coordinate descent method, the Hooke and Jeeves method, and the Rosenbrock method are combined with the ATSA and presented in this section. The following test problems were used to perform numerical experiments.
  • Branin function
    $f(x) = \left( x_2 - \frac{5.1 x_1^2}{4\pi^2} + \frac{5 x_1}{\pi} - 6 \right)^2 + 10\left( 1 - \frac{1}{8\pi} \right)\cos x_1 + 10$,
    $X = \{ x \in \mathbb{R}^2 : -5 \le x_1 \le 10, \ 0 \le x_2 \le 15 \}$.
    Global optima: $x^* = (3.141593, 2.275)$, $x^* = (-3.141593, 12.275)$, $f^* = 0.397667$.
  • Treccani function
    $f(x) = x_1^4 + 4x_1^3 + 4x_1^2 + x_2^2$,
    $X = \{ x \in \mathbb{R}^2 : -3 \le x_1 \le 3, \ -3 \le x_2 \le 3 \}$.
    Global optima: $x^* = (0, 0)$, $x^* = (-2, 0)$, $f^* = 0$.
  • Shubert function
    $f(x) = \left( \sum_{i=1}^{5} i\cos((i+1)x_1 + i) \right) \cdot \left( \sum_{i=1}^{5} i\cos((i+1)x_2 + i) \right)$,
    $X = \{ x \in \mathbb{R}^2 : -10 \le x_1 \le 10, \ -10 \le x_2 \le 10 \}$.
    The function has 760 local minimum points for $x \in X$, 18 of them are global optimum points, and $f^* = -186.730909$.
  • 3-hump camel function
    $f(x) = 2x_1^2 - 1.05x_1^4 + \frac{x_1^6}{6} - x_1 x_2 + x_2^2$,
    $X = \{ x \in \mathbb{R}^2 : -3 \le x_1 \le 3, \ -3 \le x_2 \le 3 \}$.
    This function has three minima for $x \in X$; one of them is global: $x^* = (0, 0)$, $f^* = 0$.
  • 6-hump camel function
    $f(x) = 4x_1^2 - 2.1x_1^4 + \frac{x_1^6}{3} + x_1 x_2 - 4x_2^2 + 4x_2^4$,
    $X = \{ x \in \mathbb{R}^2 : -3 \le x_1 \le 3, \ -1.5 \le x_2 \le 1.5 \}$.
    The 6-hump camel function has six minimum points for $x \in X$, two of them are global optima: $x^* = (0.089842, -0.712656)$, $x^* = (-0.089842, 0.712656)$, and $f^* = -1.031628$.
  • Rosenbrock function
    $f(x) = 100(x_1^2 - x_2)^2 + (x_1 - 1)^2$,
    $X = \{ x \in \mathbb{R}^2 : -5 \le x_1 \le 5, \ -5 \le x_2 \le 5 \}$.
    The only global minimum of the function is $x^* = (1, 1)$, and $f^* = 0$.
  • Levy-1 function
    $f(x) = \frac{\pi}{n}\left( 10\sin^2(\pi x_1) + \sum_{i=1}^{n-1} (x_i - 1)^2 \left[ 1 + 10\sin^2(\pi x_{i+1}) \right] + (x_n - 1)^2 \right)$,
    $X = \{ x \in \mathbb{R}^n : -10 \le x_i \le 10, \ i = 1, \ldots, n \}$.
    The function has approximately $5^n$ local minima for $x \in X$; the global minimum is $x_i = 1$, $i = 1, \ldots, n$, with $f^* = 0$.
  • Levy-2 function
    $f(x) = \frac{\pi}{n}\left( 10\sin^2(\pi y_1) + \sum_{i=1}^{n-1} (y_i - 1)^2 \left[ 1 + 10\sin^2(\pi y_{i+1}) \right] + (y_n - 1)^2 \right)$,
    $y_i = \frac{x_i - 1}{4} + 1$,
    $X = \{ x \in \mathbb{R}^n : -10 \le x_i \le 10, \ i = 1, \ldots, n \}$.
    As with the Levy-1 function, this function has approximately $5^n$ local minima for $x \in X$; the global minimum is $x_i = 1$, $i = 1, \ldots, n$, with $f^* = 0$.
  • Levy-3 function
    $f(x) = \frac{1}{10}\left( \sin^2(3\pi x_1) + \sum_{i=1}^{n-1} (x_i - 1)^2 \left[ 1 + \sin^2(3\pi x_{i+1}) \right] + (x_n - 1)^2 \left[ 1 + \sin^2(2\pi x_n) \right] \right)$,
    $X = \{ x \in \mathbb{R}^n : -10 \le x_i \le 10, \ i = 1, \ldots, n \}$.
    This function has approximately $30^n$ local minima for $x \in X$; the only global minimum is $x_i = 1$, $i = 1, \ldots, n$, with $f^* = 0$.
The results of the numerical experiments are presented in Table 8 and Table 9. The following notation is used: n is the number of variables; $f_{best}$ is the best value of the objective function found during the execution of the algorithm; $f^*$ is the optimal value of the objective function; $k_f$ is the number of function evaluations, averaged over all launches for the given problem; m is the number of launches that solved the problem successfully using the multistart procedure; M is the total number of randomly generated points. The multistart procedure launches the algorithm from each of the generated points. The designation m/M means that m launches of the algorithm out of M starting points resulted in a successful solution of the problem (the global minimum point was found). Coordinate descent–par is the coordinate descent method in combination with the ATSA; Hooke–Jeeves–par is the Hooke and Jeeves method combined with the ATSA; Rosenbrock–dis is the Rosenbrock method with a discrete step in the search procedure; Rosenbrock–par is the Rosenbrock method in combination with the ATSA.
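To make the experimental setting concrete, the sketch below defines the Levy-1 function from the list above and a simple multistart driver that counts successful launches. The success tolerance, random seed, and the exact way a launch is declared successful are our own assumptions and only approximate the protocol behind Tables 8 and 9.

```python
import numpy as np

def levy1(x):
    """Levy-1 test function (see the list above); global minimum f = 0 at x_i = 1."""
    x = np.asarray(x, dtype=float)
    s = 10.0 * np.sin(np.pi * x[0]) ** 2
    s += np.sum((x[:-1] - 1.0) ** 2 * (1.0 + 10.0 * np.sin(np.pi * x[1:]) ** 2))
    s += (x[-1] - 1.0) ** 2
    return np.pi / len(x) * s

def multistart(method, f, lo, hi, n_vars, n_starts, f_opt, tol=1e-4, seed=0):
    """Run `method` from n_starts random points; return (f_best, m, M) as in the tables."""
    rng = np.random.default_rng(seed)
    f_best, hits = np.inf, 0
    for _ in range(n_starts):
        x0 = rng.uniform(lo, hi, size=n_vars)
        _, f_end = method(f, x0)
        f_best = min(f_best, f_end)
        hits += int(f_end - f_opt < tol)                 # launch counted as successful
    return f_best, hits, n_starts

# e.g. a coordinate descent-par style run on Levy-1 with n = 5 (all parameters illustrative):
# print(multistart(cd_par, levy1, -10.0, 10.0, n_vars=5, n_starts=100, f_opt=0.0))
```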

5. Discussion of the Results

The coordinate descent method in combination with the ATSA proved to be an effective method in searching for the global optimum even in quite difficult optimization problems. For instance, notice the results for Levy functions. Despite the large number of local minima in the search area of these functions, the algorithm in most cases found the global minimum points. The coordinate descent method in combination with the ATSA demonstrated an acceptable number of function calculations and quite high accuracy of the best function value for all tested problems.
The Hooke and Jeeves method combined with the ATSA attained global optimum points in most tested problems but not as frequently as the coordinate descent method. Nevertheless, it is possible to obtain quite a high accuracy of the best-found solution for some problems. The price for this accuracy is a large number of objective function calculations due to the careful selection of start points in the ATSA.
The Rosenbrock method with the ATSA has the same obvious drawback as the Hooke and Jeeves method, namely, the number of function calculations is quite large. However, we can notice that the Rosenbrock method with a discrete step demonstrated acceptable performance in terms of the number of successful launches and function calculations.

6. Conclusions

We tested some of the well-known zero-order methods and added to their algorithms a line search based on the ATSA for univariate problems. Some of the tested problems, for instance, the Shubert problem and the Levy problems, are quite difficult in terms of searching for the global optimum; however, according to the numerical experiments, it is possible to use zero-order methods to find the global optimum with quite high accuracy and acceptable performance. The coordinate descent method combined with the ATSA deserves attention in terms of its ability to find a global optimum with high frequency in most tested problems.
It would be very interesting to combine the suggested Parabola Method with other zero-order methods like the Downhill Simplex, Genetic Algorithm, Particle Swarm Optimization, Cuckoo Search Algorithm, and the YUKI algorithm. We will continue our investigations and are working on a new paper devoted to such extension and testing.

Author Contributions

Conceptualization, A.K., O.K. and E.S.; Software, A.K.; Validation, A.K. and O.K.; Investigation, A.K. and O.K.; Methodology, E.S.; Formal analysis, E.S. and V.N.; Resources, V.N. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Pinter, J. Global Optimization in Action; Kluwer Academic Publishers: Dordrecht, The Netherlands, 1995.
2. Strongin, R.G.; Sergeev, Y.D. Global Optimization with Non-Convex Constraints: Sequential and Parallel Algorithms; Springer: New York, NY, USA, 2000.
3. Wood, G.R.; Zhang, B.P. Estimation of the Lipschitz constant of a function. J. Glob. Optim. 1998, 8, 91–103.
4. Oliveira, J.B. Evaluating Lipschitz Constants for Functions Given by Algorithms. Comput. Optim. Appl. 2000, 16, 215–229.
5. Tuy, H. Convex Analysis and Global Optimization; Kluwer Academic Publishers: Dordrecht, The Netherlands, 1998.
6. Lamar, B.W. A Method for Converting a Class of Univariate Functions into d.c. Functions. J. Glob. Optim. 1999, 15, 55–71.
7. Ginchev, I.; Gintcheva, D. Characterization and recognition of d.c. functions. J. Glob. Optim. 2013, 57, 633–647.
8. Horst, R.; Tuy, H. Global Optimization: Deterministic Approaches; Springer: Berlin/Heidelberg, Germany, 1996.
9. Polyak, B.T. Convexity of Nonlinear Image of a Small Ball with Application to Optimization. Set-Valued Anal. 2001, 9, 159–168.
10. Conn, A.; Scheinberg, K.; Vicente, L. Introduction to Derivative-Free Optimization; MPS-SIAM Series on Optimization; Society for Industrial and Applied Mathematics: Philadelphia, PA, USA, 2009.
11. Bazaraa, M.S.; Sherali, H.D.; Shetty, C.M. Nonlinear Programming: Theory and Algorithms, 3rd ed.; John Wiley & Sons: Hoboken, NJ, USA, 2006.
12. Eiselt, H.A.; Sandblom, C.-L. Operations Research: A Model-Based Approach; Springer: Berlin/Heidelberg, Germany, 2010; 447p.
13. Grouzeix, J.-P.; Martinez-Legaz, J.-E. (Eds.) Generalized Convexity, Generalized Monotonicity: Recent Results; Kluwer Academic Publishers: Dordrecht, The Netherlands, 1998; 469p.
14. Hansen, P.; Jaumard, B. Lipschitz Optimization. In Handbook on Global Optimization; Horst, R., Pardalos, P.M., Eds.; Kluwer Academic Publishers: Dordrecht, The Netherlands, 1994.
15. Walter, É. Numerical Methods and Optimization; Springer: Cham, Switzerland, 2014.
16. Snyman, J.A.; Wilke, D.N. Practical Mathematical Optimization, 2nd ed.; Springer: Cham, Switzerland, 2018.
17. Rios, L.; Sahinidis, N. Derivative-free optimization: A review of algorithms and comparison of software implementations. J. Glob. Optim. 2012, 56, 1247–1293.
18. Kramer, O.; Ciaurri, D.; Koziel, S. Derivative-Free Optimization. In Computational Optimization, Methods and Algorithms; Studies in Computational Intelligence; Koziel, S., Yang, X.S., Eds.; Springer: Berlin/Heidelberg, Germany, 2011; Volume 356, pp. 61–83.
19. Minoux, M. Mathematical Programming: Theory and Algorithms; John Wiley & Sons: Hoboken, NJ, USA, 1986.
Figure 1. $f(x) = -e^{-x}\sin(2\pi x)$: the global minimum $x^*$ and the local minimum determined by the GSM.
Figure 2. $f(x) = \sin(x) + \sin\frac{2x}{3}$: the global minimum $x^*$ and the local minimum determined by the GSM.
Figure 3. $f(x) = -e^{-x}\sin(2\pi x)$; the four subintervals of minima localization are shown in red.
Figure 4. $f(x) = \sin(x) + \sin\frac{2x}{3}$; the three subintervals of minima localization are shown in red.
Table 1. Test problems 1–5.

| No. | Function and Interval | Solution | g/l | Record | $k_f$ |
|---|---|---|---|---|---|
| 1 | $f(x) = \frac{x^6}{6} - \frac{52}{25}x^5 + \frac{39}{80}x^4 + \frac{71}{10}x^3 - \frac{79}{20}x^2 - x + \frac{1}{10}$, $x \in [-1.5, 11]$ | $f^* = -29763.23$, $x^* = 10$ | g | $f = -29763.23$, $x = 9.9996$ | 22 |
| 2 | $f(x) = \sin(x) + \sin\frac{10x}{3}$, $x \in [2.7, 7.5]$ | $f^* = -1.899599$, $x^* = 5.145735$ | g | $f = -1.899599$, $x = 5.145733$ | 20 |
| 3 | $f(x) = -\sum_{k=1}^{5} k\sin((k+1)x + k)$, $x \in [-10, 10]$ | $f^* = -12.031249$, $x^{*,1} = -6.774576$, $x^{*,2} = -0.491390$, $x^{*,3} = 5.791794$ | g | $f = -12.031234$, $x = -0.491707$ | 22 |
| 4 | $f(x) = (3x - 1.4)\sin(18x)$, $x \in [0, 1.2]$ | $f^* = -1.489072$, $x^* = 0.966086$ | l | $f = -0.4612863$, $x = 0.6291664$ | 17 |
| 5 | $f(x) = \sin(x) + \sin\frac{10x}{3} + \ln(x) - 0.84x + 3$, $x \in [2.7, 7.5]$ | $f^* = -1.601308$, $x^* = 5.199778$ | g | $f = -1.601308$, $x = 5.199901$ | 20 |
Table 2. Test problems 6–17.

| No. | Function and Interval | Solution | g/l | Record | $k_f$ |
|---|---|---|---|---|---|
| 6 | $f(x) = -\sum_{k=1}^{5} k\cos((k+1)x + k)$, $x \in [-10, 10]$ | $f^* = -14.50801$, $x^{*,1} = -7.083506$, $x^{*,2} = -0.800321$, $x^{*,3} = 5.482864$ | g | $f = -14.50780$, $x = 5.483102$ | 22 |
| 7 | $f(x) = \sin(x) + \sin\frac{2x}{3}$, $x \in [3.1, 20.4]$ | $f^* = -1.905961$, $x^* = 17.039199$ | l | $f = -1.215982$, $x = 5.361825$ | 22 |
| 8 | $f(x) = -x\sin(x)$, $x \in [0, 10]$ | $f^* = -7.916727$, $x^* = 7.978666$ | g | $f = -7.916727$, $x = 7.978632$ | 22 |
| 9 | $f(x) = 2\cos(x) + \cos(2x)$, $x \in [-1.57, 6.28]$ | $f^* = -1.500000$, $x^{*,1} = 2.094395$, $x^{*,2} = 4.188790$ | g | $f = -1.499999$, $x = 2.094447$ | 21 |
| 10 | $f(x) = \sin^3(x) + \cos^3(x)$, $x \in [0, 6.28]$ | $f^* = -1.000000$, $x^{*,1} = 3.141593$, $x^{*,2} = 4.712389$ | g | $f = -0.999999$, $x = 4.712287$ | 21 |
| 11 | $f(x) = -e^{-x}\sin(2\pi x)$, $x \in [0, 4]$ | $f^* = -0.788595$, $x^* = 0.224982$ | l | $f = -0.039202$, $x = 3.226282$ | 20 |
| 12 | $f(x) = \frac{x^2 - 5x + 6}{x^2 + 1}$, $x \in [-5, 5]$ | $f^* = -0.035534$, $x^* = 2.414214$ | g | $f = -0.035533$, $x = 2.414418$ | 22 |
| 13 | $f(x) = x^6 - 15x^4 + 27x^2 + 250$, $x \in [-4, 4]$ | $f^* = 7.000000$, $x^{*,1} = -3.0$, $x^{*,2} = 3.0$ | g | $f = 7.000001$, $x = 3.000009$ | 21 |
| 14 | $f(x) = -x + \sin(3x) - 1$, $x \in [0, 6.5]$ | $f^* = -7.815675$, $x^* = 5.872866$ | l | $f = -5.721279$, $x = 3.778193$ | 21 |
| 15 | $f(x) = \cos(x) - \sin(5x) + 1$, $x \in [0, 7]$ | $f^* = -0.952897$, $x^* = 2.839347$ | g | $f = -0.952896$, $x = 2.839196$ | 21 |
| 16 | $f(x) = -x e^{-\sin(3x)} + 1$, $x \in [-3, 2]$ | $f^* = -3.363290$, $x^* = 1.639062$ | g | $f = -3.363290$, $x = 1.638984$ | 20 |
| 17 | $f(x) = \ln(3x)\ln(2x) - 1$, $x \in [0.1, 7]$ | $f^* = -1.041100$, $x^* = 0.408248$ | g | $f = -1.041100$, $x = 0.408014$ | 21 |
Table 3. Results for Example 3.

| $i$ | Subinterval $[\nu_1^i, \nu_3^i]$ | TPP $\{\nu_1^i, \nu_2^i, \nu_3^i\}$ | Parabola | $k_f$ |
|---|---|---|---|---|
| 1 | $[0, 0.333]$ | $\{0, 0.166, 0.333\}$ | $15.206x^2 - 6.931x$ | 2 |
| 2 | $[1, 1.333]$ | $\{1, 1.166, 1.333\}$ | $5.590x^2 - 13.733x + 8.144$ | 2 |
| 3 | $[2, 2.333]$ | $\{2, 2.166, 2.333\}$ | $2.055x^2 - 9.159x + 10.101$ | 2 |
| 4 | $[3, 3.333]$ | $\{3, 3.166, 3.333\}$ | $0.755x^2 - 4.878x + 7.837$ | 2 |
Table 4. Results for Example 4.

| $i$ | Subinterval $[\nu_1^i, \nu_3^i]$ | TPP $\{\nu_1^i, \nu_2^i, \nu_3^i\}$ | Parabola | $k_f$ |
|---|---|---|---|---|
| 1 | $[4.542, 5.983]$ | $\{4.542, 5.263, 5.983\}$ | $0.486x^2 - 5.238x + 12.887$ | 4 |
| 2 | $[9.588, 11.029]$ | $\{9.588, 10.308, 11.029\}$ | $0.249x^2 - 5.181x + 26.728$ | 5 |
| 3 | $[16.075, 17.517]$ | $\{16.075, 16.796, 17.517\}$ | $0.638x^2 - 21.713x + 182.968$ | 3 |
Table 5. Application of the two-stage approach to problems 1–5, 8–10, and 12–17.

| Problem No. | Minima Structure | $k_f$ TPP | TPP Subintervals | $k_f$ GSM | $k_f$ PM | g/l |
|---|---|---|---|---|---|---|
| 1 | 2 local, 1 global | 12 | $[8.917, 11.0]$ | 16 | 6 | g |
| 2 | 2 local, 1 global | 12 | $[3.100, 3.900]$ | 14 | 4 | l |
|  |  |  | $[4.700, 5.500]$ | 14 | 4 | g |
|  |  |  | $[6.700, 7.500]$ | 14 | 4 | l |
| 3 | 17 local, 3 global | 24 | $[-7.500, -5.833]$ | 16 | 4 | g |
|  |  |  | $[-2.500, -0.833]$ | 16 | 5 | l |
|  |  |  | $[1.666, 3.333]$ | 16 | 6 | l |
|  |  |  | $[5.000, 6.667]$ | 16 | 5 | g |
| 4 | 4 local, 1 global | 24 | $[0.050, 0.150]$ | 10 | 3 | l |
|  |  |  | $[0.350, 0.450]$ | 10 | 3 | l |
|  |  |  | $[0.600, 0.700]$ | 10 | 2 | l |
|  |  |  | $[0.900, 1.000]$ | 10 | 3 | g |
| 5 | 2 local, 1 global | 12 | $[3.100, 3.900]$ | 14 | 5 | l |
|  |  |  | $[4.700, 5.500]$ | 14 | 4 | g |
|  |  |  | $[6.700, 7.500]$ | 14 | 4 | l |
| 8 | 1 local, 1 global | 12 | $[0.833, 2.500]$ | 16 | 5 | l |
|  |  |  | $[7.500, 9.167]$ | 16 | 4 | g |
| 9 | 1 local, 2 global | 6 | $[1.047, 3.663]$ | 17 | 8 | g |
| 10 | 1 local, 2 global | 6 | $[2.093, 4.187]$ | 16 | 4 | g |
| 12 | 1 local, 2 global | 6 | $[1.666, 5.000]$ | 17 | 14 | g |
| 13 | 1 local, 2 global | 12 | $[-3.333, -2.000]$ | 15 | 7 | g |
|  |  |  | $[-0.667, 0.667]$ | 15 | 1 | l |
|  |  |  | $[2.000, 3.333]$ | 15 | 7 | g |
| 14 | 3 local, 1 global | 24 | $[1.354, 1.896]$ | 14 | 4 | l |
|  |  |  | $[3.521, 4.063]$ | 14 | 1 | l |
|  |  |  | $[5.687, 6.229]$ | 14 | 4 | g |
| 15 | 5 local, 1 global | 6 | $[2.333, 4.667]$ | 17 | 6 | g |
| 16 | 1 local, 1 global | 24 | $[-1.750, -1.333]$ | 13 | 3 | l |
|  |  |  | $[1.375, 1.792]$ | 13 | 4 | g |
| 17 | 1 global | 12 | $[0.100, 1.250]$ | 15 | 19 | g |
Table 6. Application of the two-stage approach to problem 6.

| Problem No. | Minima Structure | $k_f$ TPP | TPP Subintervals | $k_f$ GSM | $k_f$ PM | g/l |
|---|---|---|---|---|---|---|
| 6 | 17 local, 3 global | 96 | $[-9.583, -9.167]$ | 13 | 3 | l |
|  |  |  | $[-8.541, -8.125]$ | 13 | 3 | l |
|  |  |  | $[-7.291, -6.875]$ | 13 | 3 | g |
|  |  |  | $[-6.250, -5.833]$ | 13 | 4 | l |
|  |  |  | $[-5.208, -4.791]$ | 13 | 4 | l |
|  |  |  | $[-4.166, -3.750]$ | 13 | 4 | l |
|  |  |  | $[-3.125, -2.708]$ | 13 | 4 | l |
|  |  |  | $[-2.291, -1.875]$ | 13 | 4 | l |
|  |  |  | $[-1.042, -0.625]$ | 13 | 4 | g |
|  |  |  | $[0.208, 0.625]$ | 13 | 4 | l |
|  |  |  | $[1.042, 1.458]$ | 13 | 4 | l |
|  |  |  | $[2.083, 2.500]$ | 13 | 3 | l |
|  |  |  | $[3.125, 3.542]$ | 13 | 4 | l |
|  |  |  | $[4.166, 4.583]$ | 13 | 2 | l |
|  |  |  | $[5.208, 5.625]$ | 13 | 4 | g |
|  |  |  | $[6.458, 6.875]$ | 13 | 3 | l |
|  |  |  | $[7.500, 7.917]$ | 13 | 2 | l |
|  |  |  | $[8.333, 8.750]$ | 13 | 4 | l |
|  |  |  | $[9.375, 9.792]$ | 13 | 2 | l |
Table 7. Results of application of the accelerated two-stage approach.

| Problem No. | Number of TPP Subintervals, $k_p$ | GSM/PM | Total $k_f$ | g/l |
|---|---|---|---|---|
| 1 | 0 | GSM, $k_f^{GSM} = 17$ | 23 | g |
| 2 | 2 | PM, $k_f^{PM} = 5$ | 11 | g |
| 3 | 2 | PM, $k_f^{PM} = 9$ | 15 | g |
| 4 | 2 | PM, $k_f^{PM} = 4$ | 10 | g |
| 5 | 2 | PM, $k_f^{PM} = 5$ | 11 | g |
| 6 | 1 | PM, $k_f^{PM} = 8$ | 14 | g |
| 7 | 2 | PM, $k_f^{PM} = 4$ | 10 | g |
| 8 | 2 | PM, $k_f^{PM} = 4$ | 10 | g |
| 9 | 1 | PM, $k_f^{PM} = 2$ | 8 | g |
| 10 | 1 | PM, $k_f^{PM} = 8$ | 14 | g |
| 11 | 1 | PM, $k_f^{PM} = 6$ | 10 | l |
| 12 | 1 | PM, $k_f^{PM} = 22$ | 28 | g |
| 13 | 2 | PM, $k_f^{PM} = 17$ | 23 | g |
| 14 | 2 | PM, $k_f^{PM} = 6$ | 10 | l |
| 15 | 1 | PM, $k_f^{PM} = 6$ | 10 | g |
| 16 | 0 | GSM, $k_f^{GSM} = 15$ | 21 | g |
| 17 | 1 | PM, $k_f^{PM} = 21$ | 26 | g |
Table 8. Coordinate descent method and Hooke–Jeeves method.

| Problem | $f_{best} - f^*$ (Coordinate descent–par) | $k_f$ | $m/M$ | $f_{best} - f^*$ (Hooke–Jeeves–par) | $k_f$ | $m/M$ |
|---|---|---|---|---|---|---|
| Branin | $10^{-4}$ | 1752 | 154/1000 | $10^{-4}$ | 18,117 | 64/200 |
| Treccani | $10^{-6}$ | 808 | 1000/1000 | $10^{-6}$ | 1318 | 688/1000 |
| Shubert | $10^{-6}$ | 1056 | 1000/1000 | $10^{-1}$ | 4451 | 33/1000 |
| 3-hump camel | $10^{-6}$ | 1061 | 673/1000 | $10^{-6}$ | 12,172 | 33/200 |
| 6-hump camel | $10^{-4}$ | 1828 | 1000/1000 | $10^{-2}$ | 13,515 | 1/200 |
| Rosenbrock | $10^{-3}$ | 37,768 | 95/1000 | $10^{-6}$ | 8306 | 26/200 |
| Levy-1 (n = 5) | $10^{-6}$ | 1323 | 735/1000 | $10^{-2}$ | 10,383 | 44/200 |
| Levy-1 (n = 50) | $10^{-6}$ | 12,543 | 780/1000 | $10^{-3}$ | 13,853 | 5/200 |
| Levy-1 (n = 100) | $10^{-6}$ | 25,535 | 750/1000 | $10^{0}$ | 14,950 | 0/200 |
| Levy-2 (n = 5) | $10^{-6}$ | 1339 | 1000/1000 | $10^{-3}$ | 49,081 | 14/50 |
| Levy-2 (n = 50) | $10^{-6}$ | 14,285 | 50/50 | $10^{0}$ | 35,751 | 12/50 |
| Levy-2 (n = 100) | $10^{-6}$ | 31,372 | 50/50 | $10^{0}$ | 40,945 | 11/50 |
| Levy-3 (n = 5) | $10^{-6}$ | 4864 | 1000/1000 | $10^{-1}$ | 33,975 | 3/50 |
| Levy-3 (n = 50) | $10^{-6}$ | 31,945 | 50/50 | $10^{0}$ | 16,244 | 11/50 |
| Levy-3 (n = 100) | $10^{-6}$ | 75,752 | 50/50 | $10^{0}$ | 46,882 | 0/50 |
Table 9. Rosenbrock methods.

| Problem | $f_{best} - f^*$ (Rosenbrock–dis) | $k_f$ | $m/M$ | $f_{best} - f^*$ (Rosenbrock–par) | $k_f$ | $m/M$ |
|---|---|---|---|---|---|---|
| Branin | $10^{-4}$ | 296 | 200/200 | $10^{-4}$ | 9840 | 12/200 |
| Treccani | $10^{-6}$ | 115 | 200/200 | $10^{-6}$ | 7817 | 53/200 |
| Shubert | $10^{-5}$ | 445 | 19/200 | $10^{-6}$ | 10,964 | 100/200 |
| 3-hump camel | $10^{-6}$ | 127 | 85/200 | $10^{-6}$ | 11,107 | 22/200 |
| 6-hump camel | $10^{-6}$ | 901 | 31/200 | $10^{-2}$ | 14,210 | 60/200 |
| Rosenbrock | $10^{-6}$ | 489 | 172/200 | $10^{-4}$ | 8201 | 15/200 |
| Levy-1 (n = 5) | $10^{-6}$ | 705 | 20/200 | $10^{0}$ | 33,418 | 1/200 |
| Levy-1 (n = 50) | $10^{-3}$ | 22,550 | 1/20 | $10^{0}$ | 241,667 | 0/20 |
| Levy-1 (n = 100) | $10^{-3}$ | 63,142 | 1/20 | $10^{0}$ | 576,160 | 0/20 |
| Levy-2 (n = 5) | $10^{-3}$ | 573 | 7/20 | $10^{0}$ | 39,955 | 1/20 |
| Levy-2 (n = 50) | $10^{-3}$ | 6765 | 3/20 | $10^{0}$ | 373,840 | 0/20 |
| Levy-2 (n = 100) | $10^{-3}$ | 17,145 | 4/20 | $10^{0}$ | 871,650 | 0/20 |
| Levy-3 (n = 5) | $10^{-1}$ | 33,975 | 4/50 | $10^{-1}$ | 49,301 | 1/20 |
| Levy-3 (n = 50) | $10^{-1}$ | 22,235 | 7/20 | $10^{0}$ | 398,060 | 0/20 |
| Levy-3 (n = 100) | $10^{0}$ | 42,040 | 1/20 | $10^{0}$ | 886,740 | 0/20 |
