1. Introduction
The task of locating the global minimum of a continuous and differentiable function $f: S \rightarrow R$, $S \subset R^{n}$, is defined as

x^{*} = \arg \min_{x \in S} f(x),   (1)

where the set $S$ is usually a hyper-rectangle:

S = [a_1, b_1] \otimes [a_2, b_2] \otimes \cdots \otimes [a_n, b_n].

A variety of practical problems from various research fields can be modeled as global optimization problems, such as problems from physics [1,2,3], chemistry [4,5,6], economics [7,8], medicine [9,10], etc. Many methods have been proposed to tackle the problem of Equation (1), such as controlled random search methods [11,12,13], simulated annealing methods [14,15,16], differential evolution methods [17,18], particle swarm optimization (PSO) methods [19,20,21], ant colony optimization [22,23], genetic algorithms [24,25,26], etc. Reviews of stochastic methods for global optimization problems can be found in the work of Pardalos et al. [27] or in the work of Fouskakis et al. [28].
The current work proposes a parallel implementation of the differential evolution (DE) method, which aims to speed up the optimization process of this particular method and also tries to make adequate use of modern computing structures with multicore architectures. The DE method initially generates a population of candidate solutions, which iteratively evolves through mutation and crossover in order to discover the global minimum of the objective function. The method has been applied in various research fields, such as electromagnetics [29], energy consumption problems [30], job shop scheduling [31], image segmentation [32], etc. The proposed method partitions the processing into independent structural units, such as threads, each of which acts independently. Furthermore, the new method proposes a way of communication between the different building blocks of parallel processing, as well as a process termination technique suitably modified for parallel processing.
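For reference, one generation of the classic DE scheme (the textbook DE/rand/1 mutation with binomial crossover) can be sketched as follows; this is a minimal illustration, and the identifiers (deGeneration, Agent) are ours rather than those of the implementation described later:

#include <random>
#include <vector>

using Agent = std::vector<double>;

// One generation of classic DE ("DE/rand/1/bin"): for every agent, a
// mutant vector is formed from three randomly chosen agents, mixed with
// the current agent through binomial crossover, and the better of the
// trial and the current agent survives.
void deGeneration(std::vector<Agent> &pop, std::vector<double> &fit,
                  double F, double CR, double (*f)(const Agent &),
                  std::mt19937 &rng) {
    const int NP = (int)pop.size(), n = (int)pop[0].size();
    std::uniform_real_distribution<double> u01(0.0, 1.0);
    std::uniform_int_distribution<int> pickAgent(0, NP - 1), pickDim(0, n - 1);
    for (int i = 0; i < NP; ++i) {
        // Indices of the three donors (distinctness checks omitted for brevity).
        int a = pickAgent(rng), b = pickAgent(rng), c = pickAgent(rng);
        int jrand = pickDim(rng);  // guarantees at least one mutated coordinate
        Agent trial = pop[i];
        for (int j = 0; j < n; ++j)
            if (u01(rng) < CR || j == jrand)
                trial[j] = pop[a][j] + F * (pop[b][j] - pop[c][j]);
        double ft = f(trial);
        if (ft < fit[i]) { pop[i] = trial; fit[i] = ft; }  // greedy selection
    }
}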
In the recent literature, several methods have been proposed that take full advantage of parallel processing, such as parallel optimization techniques [33,34,35], methods that exploit GPU architectures [36,37,38], etc. Weber et al. [39] proposed a parallel DE method for large-scale optimization problems using new search mechanisms for the individuals of the subpopulations. Chen et al. [40] proposed a parallel DE method for cluster optimization using modified genetic operators. Penas et al. [41] suggested an enhanced parallel asynchronous DE algorithm for problems in computational systems biology. Recently, Sui et al. [42] proposed a parallel compact DE method applied to image segmentation.
The proposed technique is a modified version of the parallel island methodology used with various evolutionary techniques [43,44]. In the proposed technique, the initial population of agents (candidate solutions) is divided into a series of independent subpopulations, and each subpopulation evolves independently on a parallel computing unit, such as a thread. The subpopulations periodically exchange information with each other, such as the lowest function value they have reached so far. The proposed technique uses a new differential weight calculation scheme, supports a number of different information exchange methods between the parallel computing units, and, furthermore, introduces a new termination method that takes advantage of the parallelism, so that the optimization terminates both promptly and reliably.
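To make the island structure concrete, the overall loop can be sketched with OpenMP as follows; this is a simplified illustration reusing the deGeneration sketch above, and the propagation and termination shown here are deliberately minimal rather than the actual implementation:

#include <omp.h>
#include <algorithm>
// (also requires the includes, the Agent type and deGeneration() from the previous sketch)

struct Island {
    std::vector<Agent> pop;   // this island's subpopulation of agents
    std::vector<double> fit;  // objective values of the agents
    std::mt19937 rng;         // private random generator, one per thread
};

void parallelDE(std::vector<Island> &isl, double F, double CR,
                double (*f)(const Agent &), int maxGens, int rate) {
    for (int gen = 0; gen < maxGens; ++gen) {
        // Every island evolves independently on its own computing unit.
        #pragma omp parallel for
        for (int k = 0; k < (int)isl.size(); ++k)
            deGeneration(isl[k].pop, isl[k].fit, F, CR, f, isl[k].rng);

        // Periodic "1 to 1" exchange: a random island sends its best agent
        // to another random island, replacing the receiver's worst agent.
        if (gen % rate == 0) {
            std::uniform_int_distribution<int> pk(0, (int)isl.size() - 1);
            Island &src = isl[pk(isl[0].rng)];
            Island &dst = isl[pk(isl[0].rng)];
            int iB = (int)(std::min_element(src.fit.begin(), src.fit.end()) - src.fit.begin());
            int iW = (int)(std::max_element(dst.fit.begin(), dst.fit.end()) - dst.fit.begin());
            if (src.fit[iB] < dst.fit[iW]) {
                dst.pop[iW] = src.pop[iB];
                dst.fit[iW] = src.fit[iB];
            }
        }
        // A termination rule adapted to parallelism would stop the loop as
        // soon as a sufficient portion of the islands report no further
        // improvement of their best value (bookkeeping omitted here).
    }
}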
The rest of this article is organized as follows: in Section 2, the original DE method as well as the proposed modifications are outlined; in Section 3, the experimental test functions from the relevant literature and the associated experimental results are listed; and finally, in Section 4, some conclusions and guidelines for future research are provided.
3. Experiments
In the following, the benchmark functions used in the experiments as well as the experimental results are presented.
3.1. Test Functions
To evaluate the ability of the proposed technique to find the global minimum of functions, a series of test functions from the relevant literature [54,55] were used; they are presented below.
Bent-cigar function. The function is

f(x) = x_1^2 + 10^6 \sum_{i=2}^{n} x_i^2

with the global minimum f(x^*) = 0. For the conducted experiments, the value n = 10 was used.
Bf1 function. The Bohachevsky 1 function is given by the equation

f(x) = x_1^2 + 2x_2^2 - \frac{3}{10}\cos(3\pi x_1) - \frac{4}{10}\cos(4\pi x_2) + \frac{7}{10}

with x \in [-100, 100]^2.
Bf2 function. The Bohachevsky 2 function is given by the equation

f(x) = x_1^2 + 2x_2^2 - \frac{3}{10}\cos(3\pi x_1)\cos(4\pi x_2) + \frac{3}{10}

with x \in [-50, 50]^2.
Branin function. The function is defined by

f(x) = \left(x_2 - \frac{5.1}{4\pi^2}x_1^2 + \frac{5}{\pi}x_1 - 6\right)^2 + 10\left(1 - \frac{1}{8\pi}\right)\cos(x_1) + 10

with -5 \le x_1 \le 10, 0 \le x_2 \le 15. The value of the global minimum is 0.397887, attained at three distinct minimizers.
CM function. The cosine mixture function is given by the equation

f(x) = \sum_{i=1}^{n} x_i^2 - \frac{1}{10}\sum_{i=1}^{n} \cos(5\pi x_i)

with x \in [-1, 1]^n. For the conducted experiments, the value n = 4 was used.
Discus function. The function is defined as

f(x) = 10^6 x_1^2 + \sum_{i=2}^{n} x_i^2

with global minimum f(x^*) = 0. For the conducted experiments, the value n = 10 was used.
Easom function. The function is given by the equation

f(x) = -\cos(x_1)\cos(x_2)\exp\left(-\left((x_1 - \pi)^2 + (x_2 - \pi)^2\right)\right)

with x \in [-100, 100]^2 and a global minimum of -1.0.
Exponential function. The function is given by

f(x) = -\exp\left(-\frac{1}{2}\sum_{i=1}^{n} x_i^2\right), \quad -1 \le x_i \le 1.

The global minimum is located at x^* = (0, 0, \ldots, 0) with a value of -1. In our experiments, we used this function with n = 4, 16, 64, and the corresponding instances are denoted by the labels EXP4, EXP16 and EXP64.
Griewank2 function. The function is given by

f(x) = 1 + \frac{1}{200}\sum_{i=1}^{2} x_i^2 - \prod_{i=1}^{2}\cos\left(\frac{x_i}{\sqrt{i}}\right), \quad x \in [-100, 100]^2.

The global minimum is located at (0, 0) with a value of 0.
Gkls function. f(x) = \text{Gkls}(x, n, w) is a function with w local minima, described in [56], with x \in [-1, 1]^n and n a positive integer between 2 and 100. The value of the global minimum is -1 and, in our experiments, we used n = 2, 3 and w = 50.
Hansen function. f(x) = \sum_{i=1}^{5} i\cos\left((i-1)x_1 + i\right)\sum_{j=1}^{5} j\cos\left((j+1)x_2 + j\right), x \in [-10, 10]^2. The global minimum of the function is -176.541793.
Hartman 3 function. The function is given by

f(x) = -\sum_{i=1}^{4} c_i \exp\left(-\sum_{j=1}^{3} a_{ij}\left(x_j - p_{ij}\right)^2\right)

with x \in [0, 1]^3 and

a = \begin{pmatrix} 3 & 10 & 30 \\ 0.1 & 10 & 35 \\ 3 & 10 & 30 \\ 0.1 & 10 & 35 \end{pmatrix}, \quad c = \begin{pmatrix} 1 \\ 1.2 \\ 3 \\ 3.2 \end{pmatrix}

and

p = \begin{pmatrix} 0.3689 & 0.117 & 0.2673 \\ 0.4699 & 0.4387 & 0.747 \\ 0.1091 & 0.8732 & 0.5547 \\ 0.03815 & 0.5743 & 0.8828 \end{pmatrix}.

The value of the global minimum is -3.862782.
Hartman 6 function.

f(x) = -\sum_{i=1}^{4} c_i \exp\left(-\sum_{j=1}^{6} a_{ij}\left(x_j - p_{ij}\right)^2\right)

with x \in [0, 1]^6 and

a = \begin{pmatrix} 10 & 3 & 17 & 3.5 & 1.7 & 8 \\ 0.05 & 10 & 17 & 0.1 & 8 & 14 \\ 3 & 3.5 & 1.7 & 10 & 17 & 8 \\ 17 & 8 & 0.05 & 10 & 0.1 & 14 \end{pmatrix}, \quad c = \begin{pmatrix} 1 \\ 1.2 \\ 3 \\ 3.2 \end{pmatrix}

and

p = \begin{pmatrix} 0.1312 & 0.1696 & 0.5569 & 0.0124 & 0.8283 & 0.5886 \\ 0.2329 & 0.4135 & 0.8307 & 0.3736 & 0.1004 & 0.9991 \\ 0.2348 & 0.1451 & 0.3522 & 0.2883 & 0.3047 & 0.665 \\ 0.4047 & 0.8828 & 0.8732 & 0.5743 & 0.1091 & 0.0381 \end{pmatrix}.

The value of the global minimum is -3.322368.
High conditioned elliptic function, defined as

f(x) = \sum_{i=1}^{n} \left(10^6\right)^{\frac{i-1}{n-1}} x_i^2

with global minimum f(x^*) = 0; the value n = 10 was used in the conducted experiments.
Potential function. The molecular conformation corresponding to the global minimum of the energy of N atoms interacting via the Lennard-Jones potential [57] was used as a test case here. The function to be minimized is given by:

V_{LJ}(r) = 4\epsilon\left[\left(\frac{\sigma}{r}\right)^{12} - \left(\frac{\sigma}{r}\right)^{6}\right].

In the experiments, two different cases were studied: N = 3 and N = 5.
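As an illustration, the total Lennard-Jones energy of a conformation might be computed as follows; this is a sketch in which \epsilon = \sigma = 1 and the 3N atomic coordinates are flattened into a single vector, both assumptions of this illustration rather than details taken from the benchmark:

#include <cmath>
#include <vector>

// Total Lennard-Jones energy: the sum over all atom pairs of
// 4 * ( r^-12 - r^-6 ), with the 3N atomic coordinates flattened into x.
double lennardJones(const std::vector<double> &x) {
    const int N = (int)x.size() / 3;
    double V = 0.0;
    for (int i = 0; i < N; ++i)
        for (int j = i + 1; j < N; ++j) {
            double r2 = 0.0;  // squared distance between atoms i and j
            for (int d = 0; d < 3; ++d) {
                double diff = x[3 * i + d] - x[3 * j + d];
                r2 += diff * diff;
            }
            double inv6 = 1.0 / (r2 * r2 * r2);  // (1/r)^6
            V += 4.0 * (inv6 * inv6 - inv6);     // 4[(1/r)^12 - (1/r)^6]
        }
    return V;
}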
Rastrigin function. The function is given by

f(x) = x_1^2 + x_2^2 - \cos(18x_1) - \cos(18x_2), \quad x \in [-1, 1]^2.
Shekel 5 function.

f(x) = -\sum_{i=1}^{5} \frac{1}{(x - a_i)(x - a_i)^T + c_i}

with x \in [0, 10]^4, a = \begin{pmatrix} 4 & 4 & 4 & 4 \\ 1 & 1 & 1 & 1 \\ 8 & 8 & 8 & 8 \\ 6 & 6 & 6 & 6 \\ 3 & 7 & 3 & 7 \end{pmatrix} and c = \begin{pmatrix} 0.1 \\ 0.2 \\ 0.2 \\ 0.4 \\ 0.4 \end{pmatrix}.

Shekel 7 function.

f(x) = -\sum_{i=1}^{7} \frac{1}{(x - a_i)(x - a_i)^T + c_i}

with x \in [0, 10]^4, a = \begin{pmatrix} 4 & 4 & 4 & 4 \\ 1 & 1 & 1 & 1 \\ 8 & 8 & 8 & 8 \\ 6 & 6 & 6 & 6 \\ 3 & 7 & 3 & 7 \\ 2 & 9 & 2 & 9 \\ 5 & 5 & 3 & 3 \end{pmatrix} and c = \begin{pmatrix} 0.1 \\ 0.2 \\ 0.2 \\ 0.4 \\ 0.4 \\ 0.6 \\ 0.3 \end{pmatrix}.

Shekel 10 function.

f(x) = -\sum_{i=1}^{10} \frac{1}{(x - a_i)(x - a_i)^T + c_i}

with x \in [0, 10]^4, a = \begin{pmatrix} 4 & 4 & 4 & 4 \\ 1 & 1 & 1 & 1 \\ 8 & 8 & 8 & 8 \\ 6 & 6 & 6 & 6 \\ 3 & 7 & 3 & 7 \\ 2 & 9 & 2 & 9 \\ 5 & 5 & 3 & 3 \\ 8 & 1 & 8 & 1 \\ 6 & 2 & 6 & 2 \\ 7 & 3.6 & 7 & 3.6 \end{pmatrix} and c = \begin{pmatrix} 0.1 \\ 0.2 \\ 0.2 \\ 0.4 \\ 0.4 \\ 0.6 \\ 0.3 \\ 0.7 \\ 0.5 \\ 0.5 \end{pmatrix}.
Sinusoidal function. The function is given by

f(x) = -\left(2.5\prod_{i=1}^{n}\sin(x_i - z) + \prod_{i=1}^{n}\sin\left(5(x_i - z)\right)\right), \quad 0 \le x_i \le \pi.

The global minimum is located at x_i = 2.09435 + z with f(x^*) = -3.5. For the conducted experiments, the cases of n = 4, 8 and z = \pi/6 were studied. The parameter z was used to shift the location of the global minimum [58].
Test2N function. This function is given by the equation

f(x) = \frac{1}{2}\sum_{i=1}^{n}\left(x_i^4 - 16x_i^2 + 5x_i\right), \quad x_i \in [-5, 5].

The function has 2^n local minima in the specified range and, in our experiments, we used n = 4, 5, 6, 7.
Test30N function. This function is given by

f(x) = \frac{1}{10}\sin^2(3\pi x_1)\sum_{i=2}^{n-1}\left((x_i - 2)^2\left(1 + \sin^2(3\pi x_{i+1})\right)\right) + (x_n - 2)^2\left(1 + \sin^2(2\pi x_n)\right)

with x \in [-10, 10]. The function has 30^n local minima in the specified range, and we used n = 3, 4 in the conducted experiments.
3.2. Experimental Results
To evaluate the performance of the modified version of the differential evolution technique, a series of experiments were performed in which the number of parallel computing units varied from 1 to 10. The freely available OpenMP library [59] was used for the parallelization, and the method was coded in ANSI C++ inside the OPTIMUS optimization package, available from https://github.com/itsoulos/OPTIMUS (accessed on 4 January 2023). All the experiments were conducted on an AMD Ryzen 5950X with 128 GB of RAM running the Debian Linux operating system. Every experiment was repeated 30 times, using a different seed for the random number generator each time, and averages were reported. The values of the parameters used in the DE algorithm are shown in Table 1. The parameter F (differential weight) was calculated as

F = -0.5 + 2R,

where R \in [0, 1] is a random number; this scheme, which was used in [60], succeeds in better exploring the search space of the objective function.
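In code form, this weight scheme amounts to one uniform draw per application, for instance (a minimal sketch consistent with the equation above):

#include <random>

// Draw F = -0.5 + 2R with R uniform in [0, 1], so that the differential
// weight F is uniformly distributed in [-0.5, 1.5].
double randomDifferentialWeight(std::mt19937 &rng) {
    std::uniform_real_distribution<double> R(0.0, 1.0);
    return -0.5 + 2.0 * R(rng);
}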
The experimental results for different numbers of threads for the test functions of the previous subsection are shown in Table 2. The numbers in the cells denote the average number of function calls for each test function. The number in parentheses stands for the fraction of runs in which the global optimum was successfully found; the absence of this number indicates that the global minimum was found in every independent run (100% success). At the end of the table, an additional row named AVERAGE shows the total number of function calls over all the test functions and the average success rate in locating the global minimum.
As can be seen, as the number of computational threads increased, the required number of function calls needed to locate the global minimum decreased, with no appreciable difference in the overall reliability of the method, which remained extremely high (99–100%). In addition, to show the dynamics of the proposed methodology, it was also used in the training of an artificial neural network [61] on a common benchmark problem from machine learning, the wine problem [62,63]. The sums of the execution times for 30 independent runs are displayed in Figure 1. As can be seen, as the number of network weights increased from w = 5 to w = 20, the gain from using multiple processing threads also increased, since the training time decreased noticeably.
In addition, to discover whether the different propagation techniques led to different behavior, additional experiments were performed using 10 processing threads. In each processing thread, as before, the population of each island was 20 agents. The results from these experiments are shown in Table 3. From the experimental results, it appears that in most cases there were no significant changes in the total number of function calls, except in the case of the "N to N" propagation: there was a significant reduction in function calls, but also a drop in the reliability of the technique from 99% to 91%. This may be because the exchange of the best values between all the islands locked the total population into local minima.
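For contrast with the "1 to 1" scheme sketched in the introduction, an "N to N" propagation, in which every island broadcasts its best agent to all the others, could look as follows (again an illustrative sketch reusing the Island type from the earlier sketch, not the actual implementation):

#include <algorithm>
#include <vector>
// (reuses the Island and Agent types from the sketch in the introduction)

// "N to N" propagation: every island broadcasts its best agent to every
// other island, replacing each receiver's worst agent. This spreads good
// solutions quickly, which speeds up convergence but can also lock the
// whole population into a local minimum, matching the reliability drop
// observed in Table 3.
void propagateNToN(std::vector<Island> &isl) {
    for (size_t s = 0; s < isl.size(); ++s) {
        int iB = (int)(std::min_element(isl[s].fit.begin(), isl[s].fit.end()) - isl[s].fit.begin());
        for (size_t d = 0; d < isl.size(); ++d) {
            if (d == s) continue;
            int iW = (int)(std::max_element(isl[d].fit.begin(), isl[d].fit.end()) - isl[d].fit.begin());
            if (isl[s].fit[iB] < isl[d].fit[iW]) {
                isl[d].pop[iW] = isl[s].pop[iB];
                isl[d].fit[iW] = isl[s].fit[iB];
            }
        }
    }
}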
Furthermore, the proposed method was compared against the original differential evolution method and two variants from the relevant literature, denoted as DERL and DELB [50]. The results from this comparison are shown in Table 4. As is evident, the proposed technique significantly outperformed the other modifications of the differential evolution method. This was largely due to the different differential weight calculation technique, but also to the proposed termination method: the differential weight scheme largely succeeded in providing a better exploration of the search space, while the new termination rule stopped the optimization process in time. Moreover, this termination technique was modified to perform well in parallel computing environments.
Moreover, a statistical comparison was performed for the proposed method with different numbers of processing threads, and the results are outlined in Figure 2. A statistical comparison of the results of the proposed method against the other variations of the DE method was also performed, and the corresponding plot is shown in Figure 3.
4. Conclusions
A new global optimization technique was presented in this manuscript, which can be performed in parallel computing environments. This method was based on the well-known differential evolutionary technique and partitioned the initial population of agents, so as to create independent populations executed on parallel computing units. The parallel units periodically exchanged the best values for the objective function with each other, and from the experiments carried out it was found that the most robust information exchange technique was the so-called “1 to 1”, where a randomly selected subpopulation exchanges information with another randomly selected subpopulation. Furthermore, a new termination method was proposed which could take full advantage of the parallel computing environment. With this termination rule, the decision to terminate the method could be efficiently made even by a small portion of the independent computing units.
From the experimental results, it appeared that the proposed technique could successfully find the global minimum in a series of problems from the relevant literature and, in fact, as the number of parallel processing units increased, the required number of function calls decreased. Furthermore, experiments on a difficult problem, the training of artificial neural networks, showed that the time required for the optimization process decreased dramatically as the number of threads increased.
However, much can still be done to improve the methodology, such as finding more effective ways of communication between the parallel processing units or formulating more efficient termination criteria that exploit parallel computing environments. In addition, the proposed parallelization could also be applied to other global optimization methods, such as genetic algorithms or particle swarm optimization.