
An Electric Fish-Based Arithmetic Optimization Algorithm for Feature Selection

Department of Mathematics, Faculty of Science, Zagazig University, Zagazig 44519, Egypt
Faculty of Computer Sciences and Informatics, Amman Arab University, Amman 11953, Jordan
Department of Computer, Damietta University, Damietta 34517, Egypt
State Key Laboratory for Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan 430079, China
Electrical Engineering Department, Faculty of Engineering, Fayoum University, Fayoum 63514, Egypt
Department of Information Technology, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, Riyadh 84428, Saudi Arabia
Artificial Intelligence Research Center (AIRC), Ajman University, Ajman 346, United Arab Emirates
Author to whom correspondence should be addressed.
Entropy 2021, 23(9), 1189;
Submission received: 11 August 2021 / Revised: 5 September 2021 / Accepted: 6 September 2021 / Published: 9 September 2021
(This article belongs to the Special Issue Entropy in Soft Computing and Machine Learning Algorithms)


With the widespread use of intelligent information systems, massive amounts of data containing many irrelevant, noisy, and redundant features are collected, and a large number of features must be handled. Introducing an efficient feature selection (FS) approach is therefore a challenging goal. In the last decade, various artificial methods and swarm models inspired by biological and social systems have been proposed to solve different problems, including FS. In this paper, an innovative approach is proposed based on a hybrid of two intelligent algorithms, electric fish optimization (EFO) and the arithmetic optimization algorithm (AOA), to boost the exploration stage of EFO so that it can handle high-dimensional FS problems with a remarkable convergence speed. The proposed EFOAOA is examined with eighteen datasets from different real-life applications. The EFOAOA results are compared with a set of recent state-of-the-art optimizers using a set of statistical metrics and the Friedman test. The comparisons show the positive impact of integrating the AOA operators into the EFO, as the proposed EFOAOA can identify the most important features with high accuracy and efficiency. Compared with the other FS methods, it obtained the lowest number of features and the highest accuracy in 50% and 67% of the datasets, respectively.

1. Introduction

The huge increase in data volume raises different challenges, such as irrelevant features, high dimensionality, and noisy data [1]. Such problems reduce the efficiency and accuracy of machine learning algorithms and lead to high computational costs. Feature selection (FS) approaches have been utilized to reduce computational costs and to boost classification accuracy [2]. FS methods are generally used to capture data properties by selecting a subset of relevant features [3]. Additionally, they remove unnecessary and noisy data [3]. FS methods have been widely employed in different fields, such as human action detection [4], text classification [5], COVID-19 CT images classification [6], neuromuscular disorders [7], data analytics problems [8], parameter estimation of biochemical systems [9], MR image segmentation [10,11], and other applications [12,13,14].
There are three types of FS approaches, called wrapper, filter, and embedded [15]. Wrapper-based methods use the learning technique to assess the selected features, whereas filter-based methods use the properties of the datasets. Embedded-based methods learn which features best contribute to the accuracy of the classification model while it is being created. Therefore, filter-based methods are more efficient and faster than wrapper-based methods. The optimal FS method can be considered as a method that can minimize the number of selected features and maximize the accuracy of the classifier [15].
Metaheuristic (MH) optimization algorithms have been widely adopted to address complex optimization tasks, including FS. In the literature, several MH algorithms have been used successfully to solve FS problems, such as particle swarm optimization (PSO) [16], differential evolution (DE) [17], genetic algorithms (GA) [18], grasshopper optimization algorithm (GOA) [19], gravitational search algorithm (GSA) [20], slime mould algorithm (SMA) [3], Harris hawk optimization (HHO) algorithm [21], marine predators algorithm (MPA) [22], Aquila optimizer [23], and others [8,9,10,11,23].
According to the no free lunch (NFL) theorem, no single algorithm can solve all problems. Thus, the hybridization concept is widely adopted to address several complex problems, including FS. Following the hybridization concept, we propose a new FS approach using a new variant of the electric fish optimization (EFO). The EFO is a recently proposed MH algorithm inspired by the natural behavior of the nocturnal electric fish [24]. It was evaluated on different complex optimization problems and showed significant performance, except in high-dimensional problems, where it converges slowly and gets stuck in local minima. Thus, we use a new optimizer, called the arithmetic optimization algorithm (AOA), to overcome the shortcomings of the traditional EFO algorithm. The AOA is inspired by mathematical operations and was proposed by Abualigah et al. [25]. It has been adopted in several applications, such as proton exchange membrane fuel cells by extreme learning machine [26], multilevel thresholding segmentation [27], real-world multiobjective problems [28], and damage assessment in FGM composite plates [29].
The proposed algorithm, EFOAOA, works by splitting the tested dataset into training and testing sets, which represent 70% and 30% of all the data, respectively. Then we set the initial values for a set of individuals that represent the solutions for the FS problem. To assess these individuals, their Boolean versions are computed, and the fitness value is computed based on the features corresponding to Boolean ones. The next process is to determine which individual has the best fitness value and to use it as the best individual. Then the operators of AOA and EFO compete in the exploration phase to discover the feasible region that contains the optimal solutions, which speeds up convergence towards the optimal solution. In the exploitation phase, the operators of the traditional EFO are used. The individuals are updated until the stop criteria are reached. Then the testing set is reduced by using only the features that correspond to ones in the binary version of the best individual, and the performance is computed using different measures. To the best of our knowledge, this is the first time that either EFO or a modified version of it has been used for feature selection problems.
Our main objectives and contributions can be summarized as follows:
  • Propose a new modified version of electric fish optimization that uses the operators of the arithmetic optimization algorithm to enhance its exploration ability.
  • Apply the enhanced EFOAOA as an alternative FS technique to remove irrelevant features, which improves classification efficiency and accuracy.
  • Use eighteen UCI datasets to assess the efficiency of the developed EFOAOA and compare it with well-known FS methods.
The rest of this paper is organized as follows: Section 2 reviews related FS methods from the literature. Section 3 presents the background of electric fish optimization and the arithmetic optimization algorithm. In Section 4, the stages of the developed method are presented. Section 5 presents the experimental results and their discussion. Finally, the conclusion and future work are given in Section 6.

2. Related Works

In this section, we highlight a number of MH algorithms that were developed for feature selection problems. Chaudhuri and Sahu [30] proposed a modified version of the crow search algorithm (CSA) for FS. They used a time-varying flight length to balance the search process (exploration and exploitation). They evaluated eight variants of the developed FS method on 20 UCI benchmark datasets, where it showed prominent performance.
A hybrid FS method based on the binary butterfly optimization algorithm (BOA) and information theory was proposed in [31]. The developed method, called information gain binary BOA (IG-bBOA), overcomes the shortcomings of the traditional BOA algorithm, and it achieved exceptional performance compared to several MH algorithms.
Maleki et al. [32] used the genetic algorithm as a feature selection method to improve the lung cancer disease classification process. A k-nearest-neighbors classifier was adopted to classify the stage of patients’ disease. The evaluation outcomes confirmed that GA improved classification accuracy.
Song et al. [33] introduced an FS approach based on a new variant of the PSO algorithm, called bare bones PSO. The main idea is to use a swarm initialization technique that depends on label correlation. Also, two operators, called the supplementary and deletion operators, are used to enhance the exploitation process. Moreover, to avoid getting stuck in local minima, they developed an adaptive flip mutation operator. The method was employed with kNN on several datasets, and it was compared to several MH algorithms to verify its performance.
In [34], the authors presented an FS approach, called GWORS, that combines the grey wolf optimizer (GWO) and rough sets for mammogram image analysis. The GWORS was compared to well-known FS methods, and it obtained competitive performance. Tubishat et al. [35] developed an FS method based on a dynamic salp swarm algorithm (SSA). The SSA was improved using two methods: the first updates the salps' positions, whereas the second improves the local search process of the traditional SSA. The developed SSA was applied with a kNN classifier, evaluated with well-known benchmark datasets, and compared to the traditional SSA and several MH algorithms.
Dhiman et al. [36] proposed a binary emperor penguin optimizer (EPO) for FS. They used twenty-five datasets to evaluate this approach with extensive comparisons to state-of-the-art methods. Overall, the binary version of the EPO showed superior performance compared to the original one. In [21], an FS method based on a hybrid of the HHO algorithm and simulated annealing was proposed. In [37], the genetic algorithm was used with Elastic Net for feature selection. Neggaz et al. [38] presented an FS approach using the Henry gas solubility optimization (HGSO) algorithm. Ewees et al. [3] proposed an FS technique using a hybrid of the slime mould algorithm and the firefly algorithm. It was evaluated with different datasets, including two high-dimensional QSAR datasets.
Yousri et al. [6] developed a new FS method to enhance COVID-19 CT images classification based on an improved cuckoo search (CS) optimization algorithm. The fractional-order (FO) calculus and heavy-tailed distributions are used to enhance the performance of the traditional CS algorithm. In general, many MH algorithms, including hybrid methods, have been developed for various FS applications, and they showed good performance compared to traditional methods, as described in [39,40].

3. Background

3.1. Electric Fish Optimization

Electric fish optimization (EFO) was proposed in [24] and is inspired by the natural behavior of the nocturnal electric fish. The $N$ electric fish (solutions) are initialized randomly within the search space, respecting its boundaries:
$$x_{ij} = x_{\min j} + \mathrm{rand} \, (x_{\max j} - x_{\min j}) \tag{1}$$
where $x_{ij}$ is position $j$ of solution $i$, and $x_{\max j}$ and $x_{\min j}$ are the maximum and minimum boundaries of dimension $j$, respectively.
In the EFO, as in nature, individuals with a higher frequency use active electrolocation, and the others use passive electrolocation. The frequency value lies between $f_{\min}$ and $f_{\max}$ and is determined by the fitness values:
$$f_i^t = f_{\min} + \left( \frac{\mathrm{fit}_{worst}^t - \mathrm{fit}_i^t}{\mathrm{fit}_{worst}^t - \mathrm{fit}_{best}^t} \right) (f_{\max} - f_{\min}) \tag{2}$$
where $\mathrm{fit}_i^t$ is the fitness value of solution $i$ at iteration $t$, $\mathrm{fit}_{worst}^t$ and $\mathrm{fit}_{best}^t$ are the worst and best fitness values obtained at iteration $t$, and $f_{\max}$ and $f_{\min}$ are the maximum and minimum frequency values.
The amplitude of solution $i$ ($A_i$) is determined as follows:
$$A_i^t = \alpha A_i^{t-1} + (1 - \alpha) f_i^t \tag{3}$$
where $\alpha$ is a constant in the range [0, 1].
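As an illustrative sketch (not the authors' implementation), Equations (1)–(3) can be written in plain Python as follows. The function names, the defaults `f_min=0.0`, `f_max=1.0`, and `alpha=0.5`, the handling of the case where all fitness values are equal, and the choice of initializing the amplitude to the frequency are assumptions made for this example.

```python
import random

def init_population(n, lower, upper, rng):
    """Eq. (1): x_ij = x_min_j + rand * (x_max_j - x_min_j)."""
    return [[lo + rng.random() * (hi - lo) for lo, hi in zip(lower, upper)]
            for _ in range(n)]

def frequencies(fitness, f_min=0.0, f_max=1.0):
    """Eq. (2) for minimization: best fitness maps to f_max, worst to f_min."""
    worst, best = max(fitness), min(fitness)
    if worst == best:
        # Degenerate case not covered by Eq. (2); the midpoint is an assumption.
        return [(f_min + f_max) / 2.0] * len(fitness)
    return [f_min + (worst - fit) / (worst - best) * (f_max - f_min)
            for fit in fitness]

def amplitudes(prev_amp, freq, alpha=0.5):
    """Eq. (3): A_i^t = alpha * A_i^(t-1) + (1 - alpha) * f_i^t."""
    return [alpha * a + (1.0 - alpha) * f for a, f in zip(prev_amp, freq)]

rng = random.Random(0)
pop = init_population(4, [0.0, 0.0, 0.0], [1.0, 1.0, 1.0], rng)
fit = [0.30, 0.10, 0.50, 0.20]   # made-up fitness values (lower is better)
freq = frequencies(fit)
amp = amplitudes(freq, freq)     # assumed: initial amplitude equals the frequency
```

With these made-up fitness values, the best solution (fitness 0.10) receives the highest frequency and the worst (0.50) the lowest, so the former is the most likely to enter the active phase.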

3.1.1. Active Electrolocation

The active range estimation is calculated as follows:
$$r_i = (x_{\max j} - x_{\min j}) A_i \tag{4}$$
To discover neighboring solutions within the active range, the distance between the current solution and the other solutions is required. The distance between solutions $i$ and $k$ is calculated as follows:
$$d_{ik} = \lVert x_i - x_k \rVert = \sqrt{\sum_{j=1}^{d} (x_{ij} - x_{kj})^2} \tag{5}$$
If at least one neighbor is found in the active space, Equation (6) is used; otherwise, Equation (7) is used:
$$x_{ij}^{cand} = x_{ij} + \varphi \, (x_{kj} - x_{ij}) \tag{6}$$
$$x_{ij}^{cand} = x_{ij} + \varphi \, r_i \tag{7}$$
where $k$ is a randomly selected solution, $\varphi$ is a random value in [−1, 1], and $x_{ij}^{cand}$ is the candidate position of solution $i$.
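The active phase of Equations (4)–(7) can be sketched as below. This is a simplified illustration under stated assumptions: the helper name `active_update` is hypothetical, only one randomly chosen dimension is perturbed, and the largest per-dimension range is used as the scalar radius for the neighbor test, neither of which is specified in the text above.

```python
import math
import random

def active_update(i, pop, amp, lower, upper, rng):
    """Candidate position for individual i under active electrolocation."""
    x_i = pop[i]
    candidate = list(x_i)
    # Eq. (5): Euclidean distance from individual i to every other individual.
    dist = {k: math.dist(x_i, x_k) for k, x_k in enumerate(pop) if k != i}
    # Eq. (4): active range per dimension, scaled by the amplitude of i.
    ranges = [(hi - lo) * amp[i] for lo, hi in zip(lower, upper)]
    # Assumption: a neighbour is any individual closer than the largest range.
    neighbours = [k for k, d in dist.items() if d <= max(ranges)]
    j = rng.randrange(len(x_i))      # assumed: perturb one random dimension
    phi = rng.uniform(-1.0, 1.0)
    if neighbours:
        k = rng.choice(neighbours)   # Eq. (6): move relative to a neighbour
        candidate[j] = x_i[j] + phi * (pop[k][j] - x_i[j])
    else:                            # Eq. (7): random step inside the range
        candidate[j] = x_i[j] + phi * ranges[j]
    return candidate

rng = random.Random(1)
pop = [[0.10, 0.20], [0.15, 0.25], [0.90, 0.90]]
amp = [0.5, 0.5, 0.5]
cand = active_update(0, pop, amp, [0.0, 0.0], [1.0, 1.0], rng)
```

In this toy population only the second individual lies inside the active range of the first, so the candidate can move by at most the per-dimension gap between those two individuals.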

3.1.2. Passive Electrolocation

The probability that solution $i$ perceives solution $k$ in the active space is determined as follows:
$$P_k = \frac{A_k / d_{ik}}{\sum_{j \in N_A} A_j / d_{ij}} \tag{8}$$
Using a selection technique such as roulette wheel selection, $K$ solutions are chosen from $N_A$ according to Equation (8). A source location ($x_{rj}$) is defined using Equation (9). The new positions are then produced using Equation (10):
$$x_{rj} = \frac{\sum_{k=1}^{K} A_k x_{kj}}{\sum_{k=1}^{K} A_k} \tag{9}$$
$$x_{ij}^{new} = x_{ij} + \varphi \, (x_{rj} - x_{ij}) \tag{10}$$
Although it is unusual, a solution with a higher frequency might still perform passive electrolocation. To account for this, Equation (11) is used to decide which parameter values are updated:
$$x_{ij}^{cand} = \begin{cases} x_{ij}^{new} & \mathrm{rand} > f_i \\ x_{ij} & \text{otherwise} \end{cases} \tag{11}$$
The final action of the passive phase is to change one parameter of solution $i$ using Equation (12), which increases the probability that a feature is changed:
$$x_{ij}^{cand} = x_{\min j} + \mathrm{rand} \, (x_{\max j} - x_{\min j}) \tag{12}$$
If parameter $j$ of solution $i$ exceeds the boundaries, it is reset to the violated limit as follows:
$$x_{ij}^{cand} = \begin{cases} x_{\min j} & x_{ij}^{cand} < x_{\min j} \\ x_{ij}^{cand} & x_{\min j} \le x_{ij}^{cand} \le x_{\max j} \\ x_{\max j} & x_{ij}^{cand} > x_{\max j} \end{cases} \tag{13}$$
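A compact sketch of the passive phase, Equations (8)–(13), is given below. It is illustrative only: the function name `passive_update`, the default `K=2`, sampling with replacement in the roulette wheel, and drawing a fresh random factor for every dimension are assumptions, not details taken from the text.

```python
import math
import random

def passive_update(i, pop, amp, freq, active_idx, dist_i, lower, upper, rng, K=2):
    """Candidate position for individual i under passive electrolocation."""
    x_i = pop[i]
    # Eq. (8): weight of each active individual k, proportional to A_k / d_ik.
    weights = [amp[k] / dist_i[k] for k in active_idx]
    # Roulette-wheel selection of K active individuals (with replacement,
    # which is a simplifying assumption).
    chosen = rng.choices(active_idx, weights=weights, k=K)
    total_amp = sum(amp[k] for k in chosen)
    candidate = list(x_i)
    for j in range(len(x_i)):
        # Eq. (9): amplitude-weighted source location x_rj.
        x_r = sum(amp[k] * pop[k][j] for k in chosen) / total_amp
        # Eq. (10): step relative to the source location.
        x_new = x_i[j] + rng.uniform(-1.0, 1.0) * (x_r - x_i[j])
        # Eq. (11): accept the new value only when rand > f_i.
        if rng.random() > freq[i]:
            candidate[j] = x_new
    # Eq. (12): re-draw one randomly chosen parameter inside its bounds.
    j = rng.randrange(len(x_i))
    candidate[j] = lower[j] + rng.random() * (upper[j] - lower[j])
    # Eq. (13): clamp every parameter back into [x_min_j, x_max_j].
    return [min(max(v, lo), hi) for v, lo, hi in zip(candidate, lower, upper)]

rng = random.Random(2)
pop = [[0.2, 0.8], [0.6, 0.4], [0.9, 0.1]]
amp = [0.3, 0.8, 0.6]
freq = [0.3, 0.8, 0.6]
active_idx = [1, 2]   # assumed: individuals 1 and 2 are in active mode
dist_0 = {k: math.dist(pop[0], pop[k]) for k in active_idx}
cand = passive_update(0, pop, amp, freq, active_idx, dist_0,
                      [0.0, 0.0], [1.0, 1.0], rng)
```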

3.2. Arithmetic Optimization Algorithm

The arithmetic optimization algorithm (AOA) is an optimizer that relies on arithmetic operations [25]. The improvement process starts by choosing the search mechanism based on Equation (14):
$$MOA(t) = Min + t \times \left( \frac{Max - Min}{T} \right) \tag{14}$$
where $t$ is the current iteration, which is in the range [1, $T$], and $Min$ and $Max$ are the smallest and largest values of this function. The mathematical formulation of the search mechanisms is given as follows.

3.2.1. Exploration Part

The exploration process is given in Equation (15). This search is performed when $\mathrm{rand} > MOA$, where $\mathrm{rand}$ is a random number and $MOA$ is given by Equation (14). The division (D) search is executed when $\mathrm{rand} < 0.5$; otherwise, the multiplication (M) search is executed:
$$x_{i,j}(t+1) = \begin{cases} best(x_j) \div (MOP + \epsilon) \times ((UB_j - LB_j) \times \mu + LB_j), & \mathrm{rand} < 0.5 \\ best(x_j) \times MOP \times ((UB_j - LB_j) \times \mu + LB_j), & \text{otherwise} \end{cases} \tag{15}$$
$$MOP(t) = 1 - \frac{t^{1/\alpha}}{T^{1/\alpha}} \tag{16}$$
where $x_{i,j}(t+1)$ is position $j$ of solution $i$ at iteration $t+1$, $best(x_j)$ is position $j$ of the best solution found so far, $UB_j$ and $LB_j$ are the upper and lower bounds of dimension $j$, and $\epsilon$ is a small constant. $\mu$ and $\alpha$ are parameters fixed to 0.5 and 5, respectively [3]. $t$ is the current iteration, and $T$ is the total number of iterations.
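To make the two schedules concrete, the following sketch evaluates MOA from Equation (14) and MOP from Equation (16) over a run of $T$ iterations. The bounds 0.2 and 0.9 for MOA and the value $\alpha = 5$ follow the values quoted in this section; the function names are chosen for this example only.

```python
def moa(t, T, moa_min=0.2, moa_max=0.9):
    """Eq. (14): linear increase of MOA with the iteration counter t."""
    return moa_min + t * (moa_max - moa_min) / T

def mop(t, T, alpha=5.0):
    """Eq. (16): MOP(t) = 1 - t^(1/alpha) / T^(1/alpha)."""
    return 1.0 - t ** (1.0 / alpha) / T ** (1.0 / alpha)

T = 50
schedule = [(t, moa(t, T), mop(t, T)) for t in (1, 25, 50)]
for t, moa_t, mop_t in schedule:
    print(f"t={t:2d}  MOA={moa_t:.3f}  MOP={mop_t:.3f}")
```

Because MOA grows while MOP shrinks, exploration (chosen when rand > MOA) becomes less likely over time, and the step size of all four operators decreases towards zero at the last iteration.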

3.2.2. Exploitation Part

This search is executed when $\mathrm{rand} \le MOA$. The subtraction (S) search is executed when $\mathrm{rand} < 0.5$; otherwise, the addition (A) search is executed. The exploitation search, based on S and A, thus helps the algorithm avoid getting trapped in a local search region. The exploitation search mechanisms are expressed as follows:
$$x_{i,j}(t+1) = \begin{cases} best(x_j) - MOP \times ((UB_j - LB_j) \times \mu + LB_j), & \mathrm{rand} < 0.5 \\ best(x_j) + MOP \times ((UB_j - LB_j) \times \mu + LB_j), & \text{otherwise} \end{cases} \tag{17}$$
To summarize, the AOA begins with random solutions generated within the given constraints. The search operators then iteratively try to reach the optimal solution under these constraints, and the solutions are improved mainly by following the best global solution found so far. A transition function (named MOA), which increases linearly from 0.2 to 0.9, is employed to balance the search mechanisms. The exploration operators are applied when $\mathrm{rand} > MOA$; otherwise, the exploitation operators are used. Within each search part, the operators are chosen randomly. Finally, the AOA stops when the termination criterion is met.
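The AOA position update described in this subsection can be sketched as a single function. This is a minimal illustration rather than the reference implementation: the name `aoa_update`, the value of `EPS`, and the final clamping to the bounds are assumptions, and the current MOA and MOP values are passed in as plain numbers.

```python
import random

EPS = 1e-12  # assumed value for the small constant epsilon in Eq. (15)

def aoa_update(best, lower, upper, moa_t, mop_t, rng, mu=0.5):
    """New position built around the best solution with the D, M, S, A operators."""
    new = []
    for b, lo, hi in zip(best, lower, upper):
        scale = (hi - lo) * mu + lo
        if rng.random() > moa_t:               # exploration, Eq. (15)
            if rng.random() < 0.5:
                v = b / (mop_t + EPS) * scale  # division (D)
            else:
                v = b * mop_t * scale          # multiplication (M)
        else:                                  # exploitation, Eq. (17)
            if rng.random() < 0.5:
                v = b - mop_t * scale          # subtraction (S)
            else:
                v = b + mop_t * scale          # addition (A)
        new.append(min(max(v, lo), hi))        # assumed: clamp to the bounds
    return new

rng = random.Random(0)
best = [0.3, 0.7]
early = aoa_update(best, [0.0, 0.0], [1.0, 1.0], moa_t=0.2, mop_t=0.5, rng=rng)
late = aoa_update(best, [0.0, 0.0], [1.0, 1.0], moa_t=1.0, mop_t=0.0, rng=rng)
```

With `moa_t=1.0` the exploration branch is never taken, and with `mop_t=0.0` the subtraction and addition steps have zero length, so `late` reproduces the best solution unchanged.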

4. Proposed FS Method

The framework of the developed FS technique, which improves the efficiency of EFO using the operators of AOA, is given in Figure 1.
The main target of using AOA is to enhance the exploration ability of EFO since it has the largest influence on its ability to discover the feasible region that contains the optimal solutions. The proposed FS approach, named EFOAOA, begins with dividing the data into training and testing sets, which represent 70% and 30%, respectively. Then the random values for N individuals are assigned, and for each of them, the fitness value is computed. Then the individual that has the best fitness value is used as the best individual. After this process, the updating of the solution is performed using the operators of EFO in the exploitation phase, while during the exploration phase, either the operators of AOA or traditional EFO are used according to random probability. The process of updating individuals is performed again until the stop conditions are reached. Thereafter, the testing set is reduced according to the best individual and the performance of the developed EFOAOA as FS is evaluated using different metrics. The details of the EFOAOA are given in the following paragraphs.

4.1. First Stage

At this stage, the initial individuals, which represent the population of solutions, are generated. This process is formulated as:
$$X_{i,j} = (UB_j - LB_j) \times \mathrm{rand} + LB_j, \quad i = 1, 2, \ldots, N, \quad j = 1, 2, \ldots, D \tag{18}$$
where $UB_j$ and $LB_j$ are the upper and lower boundaries of the $j$th dimension, $N$ is the total number of individuals, $D$ is the dimension of each solution (i.e., the total number of features), and $\mathrm{rand} \in [0, 1]$ is a random number.

4.2. Second Stage

The main aim of this part of the developed EFOAOA is to update the individuals until the stop conditions are reached. This is achieved through a set of steps; the first step is to convert each individual $X_i$ into a binary individual using the following equation:
$$BX_{i,j} = \begin{cases} 1 & \text{if } X_{i,j} \ge 0.5 \\ 0 & \text{otherwise} \end{cases} \tag{19}$$
The next step is to use the training features that correspond to ones in $BX_{i,j}$ to train the KNN classifier and compute the fitness value, which is defined as:
$$Fit_i = \alpha \times \gamma + (1 - \alpha) \times \left( \frac{|BX_{i,j}|}{D} \right) \tag{20}$$
In Equation (20), $\gamma$ is the classification error of the KNN classifier and $|BX_{i,j}|$ is the total number of ones (i.e., relevant features). $\alpha \in [0, 1]$ is the factor that balances the two parts of the fitness value. KNN is used because of its simplicity and efficiency and because it has only one parameter. In addition, it has provided better performance than most other classifiers in different applications, since it simply stores the data of the training set.
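The binarization of Equation (19) and the fitness of Equation (20) can be illustrated with a small, dependency-free sketch. Several details are assumptions made for the example and are not stated in the text: the value `alpha=0.9`, `k=3`, the use of a separate evaluation split to measure the KNN error, the penalty of 1.0 for an empty feature subset, and the toy data.

```python
import math
from collections import Counter

def binarize(x, threshold=0.5):
    """Eq. (19): 1 where the continuous value reaches the threshold, else 0."""
    return [1 if v >= threshold else 0 for v in x]

def knn_error(train_X, train_y, eval_X, eval_y, mask, k=3):
    """Error rate of a plain k-nearest-neighbour vote on the selected columns."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 1.0  # assumed penalty: an empty feature subset is the worst case
    train_P = [[row[j] for j in cols] for row in train_X]
    errors = 0
    for row, label in zip(eval_X, eval_y):
        p = [row[j] for j in cols]
        nearest = sorted(range(len(train_P)),
                         key=lambda n: math.dist(p, train_P[n]))[:k]
        vote = Counter(train_y[n] for n in nearest).most_common(1)[0][0]
        errors += int(vote != label)
    return errors / len(eval_y)

def fitness(x, train_X, train_y, eval_X, eval_y, alpha=0.9, k=3):
    """Eq. (20): alpha * error + (1 - alpha) * (selected features / D)."""
    mask = binarize(x)
    gamma = knn_error(train_X, train_y, eval_X, eval_y, mask, k)
    return alpha * gamma + (1.0 - alpha) * sum(mask) / len(mask)

# Toy data: feature 0 separates the classes, feature 1 is noise.
train_X = [[0.0, 0.9], [0.1, 0.1], [0.2, 0.5], [0.9, 0.8], [1.0, 0.2], [0.8, 0.4]]
train_y = [0, 0, 0, 1, 1, 1]
eval_X = [[0.05, 0.7], [0.95, 0.3]]
eval_y = [0, 1]
fit_one = fitness([0.8, 0.1], train_X, train_y, eval_X, eval_y)  # keeps feature 0
fit_all = fitness([0.8, 0.9], train_X, train_y, eval_X, eval_y)  # keeps both
```

Both subsets classify the toy evaluation points correctly, so the second term of Equation (20) decides: the individual that keeps only one feature obtains the smaller (better) fitness.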
The following step is to identify the best individual $X_b$, which has the smallest fitness value $Fit_b$. Then the frequency ($f_i$) and amplitude ($A_i$) are computed for each $X_i$ using Equations (2) and (3), respectively. According to the value of $f_i$, the individuals are updated using either the active phase (i.e., $f_i > \mathrm{rand}$) or the passive phase (i.e., $f_i \le \mathrm{rand}$). During the active phase, the operators of the traditional EFO are used to update the individuals as given in Equations (4)–(7). Meanwhile, inside the passive phase, the operators of AOA and EFO compete to improve the individuals according to the following formula:
$$X_{i,j} = \begin{cases} \text{Eq. (8)–Eq. (13)} & \text{if } Pro \ge 0.5 \\ \text{Eq. (14)–Eq. (17)} & \text{otherwise} \end{cases} \tag{21}$$
where $Pro \in [0, 1]$ is a random probability that determines whether AOA (i.e., Equations (14)–(17)) or EFO (i.e., Equations (8)–(13)) is used to update $X_{i,j}$. If the updated $X_{i,j}$ has a better fitness value than the old one, it is kept; otherwise, the old one is retained. Then the stop conditions are checked; if they are satisfied, the best individual $X_b$ is returned from this stage.
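The competition between the two operator sets and the greedy acceptance rule can be sketched as follows. The function name `hybrid_passive_step` is hypothetical, the two operators are passed in as callables, and the stand-in operators and the sphere objective used in the usage lines are placeholders for illustration, not the EFO or AOA operators themselves.

```python
import random

def hybrid_passive_step(x_old, fit_old, efo_passive, aoa_step, fitness, rng):
    """Eq. (21) followed by greedy selection (minimization)."""
    pro = rng.random()
    # Pro >= 0.5 selects the EFO passive operators, otherwise the AOA operators.
    candidate = efo_passive(x_old) if pro >= 0.5 else aoa_step(x_old)
    fit_new = fitness(candidate)
    # Keep the update only when it improves the fitness value.
    if fit_new < fit_old:
        return candidate, fit_new
    return x_old, fit_old

def sphere(x):
    """Placeholder objective: sum of squares."""
    return sum(v * v for v in x)

rng = random.Random(3)
shrink = lambda x: [0.5 * v for v in x]  # stand-in for the EFO passive operators
grow = lambda x: [2.0 * v for v in x]    # stand-in for the AOA operators
x_new, f_new = hybrid_passive_step([1.0, -2.0], 5.0, shrink, grow, sphere, rng)
```

Whichever operator is drawn, the returned fitness is never worse than the old one, which is the property the acceptance rule is meant to guarantee.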

4.3. Third Stage

In this stage, the testing set is reduced by selecting only the features that correspond to ones in the binary version of $X_b$. The reduced testing set is then passed to the trained classifier (KNN), which predicts its output. The next process is to evaluate the quality of the output using different metrics. The steps of EFOAOA are given in Algorithm 1.
Algorithm 1. Steps of EFOAOA
1. Input: the dataset with D features, the number of individuals (N),
the number of iterations ( t_max ), and the parameters of EFOAOA
First Stage
2. Split the data into two parts (i.e., training and testing).
3. Construct the population X using Equation (18).
Second Stage
4.  t = 1
5. While ( t < t_max )
6. Convert each X_i into its binary version using Equation (19).
7. Compute the fitness value for each X_i based on the training set as in Equation (20).
8. Find the best individual X_b.
9. Update X using Equation (21).
10.  t = t + 1
11. EndWhile
Third Stage
12. Reduce the testing set based on the selected features from X_b.
13. Evaluate the performance using different measures.
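The first and third stages of Algorithm 1 are simple data manipulations and can be sketched as below. The helper names `split_train_test` and `reduce_features`, the shuffling with a fixed seed, and the rounding of the split point are assumptions of this example; the text only specifies the 70%/30% proportions.

```python
import random

def split_train_test(rows, labels, train_ratio=0.7, seed=0):
    """First stage: shuffle the samples and split them 70% / 30%."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    cut = int(round(train_ratio * len(idx)))
    train, test = idx[:cut], idx[cut:]
    return ([rows[i] for i in train], [labels[i] for i in train],
            [rows[i] for i in test], [labels[i] for i in test])

def reduce_features(rows, best_binary):
    """Third stage: keep only the columns whose bit is 1 in the best individual."""
    keep = [j for j, bit in enumerate(best_binary) if bit == 1]
    return [[row[j] for j in keep] for row in rows]

rows = [[float(i), float(i) * 2.0, float(i) % 3.0] for i in range(10)]
labels = [i % 2 for i in range(10)]
train_X, train_y, test_X, test_y = split_train_test(rows, labels)
reduced_test = reduce_features(test_X, [1, 0, 1])  # made-up best individual
```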

4.4. Complexity of EFOAOA

The time complexity of EFOAOA depends on the complexity of EFO and AOA, which are given in Equations (22) and (23), respectively:
$$O(\mathrm{EFO}) = \begin{cases} O(t_{max} \times N \times D) & \text{in best case} \\ O(t_{max} \times N^2 \times D) & \text{in worst case} \end{cases} \tag{22}$$
$$O(\mathrm{AOA}) = O(t_{max} \times N \times D) \tag{23}$$
So, the complexity of EFOAOA can be represented as:
$$O(\mathrm{EFOAOA}) = \begin{cases} O(K_p (t_{max} \times N \times D) + (1 - K_p)(t_{max} \times N \times D)) & \text{in best case} \\ O(K_p (t_{max} \times N^2 \times D) + (1 - K_p)(t_{max} \times N \times D)) & \text{in worst case} \end{cases} \tag{24}$$
$$O(\mathrm{EFOAOA}) = \begin{cases} O(t_{max} \times N \times D) & \text{in best case} \\ O((t_{max} \times N \times D)(K_p (N - 1) + 1)) & \text{in worst case} \end{cases} \tag{25}$$
where $K_p$ stands for the fraction of solutions updated using the operators of EFO.
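For readers who want the intermediate step, the worst-case expression can be simplified by factoring out the common term. The derivation below treats $K_p$ as a weight in [0, 1], which is an interpretation suggested by the factor $(1 - K_p)$ rather than something stated explicitly:

```latex
\begin{aligned}
K_p \,(t_{max} N^2 D) + (1 - K_p)\,(t_{max} N D)
  &= t_{max} N D \,\bigl( K_p N + 1 - K_p \bigr) \\
  &= t_{max} N D \,\bigl( K_p (N - 1) + 1 \bigr).
\end{aligned}
```

Setting $K_p = 0$ (only AOA operators) recovers $t_{max} N D$, and $K_p = 1$ (only EFO operators) recovers the EFO worst case $t_{max} N^2 D$, so the hybrid lies between the two.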

5. Experimental Results

This section evaluates the performance of the developed EFOAOA method over eighteen benchmark datasets. In addition, we compare the proposed EFOAOA with ten FS algorithms.

5.1. Dataset Description and Parameter Setting

The description of the eighteen UCI datasets is listed in Table 1. From this table, it can be observed that these datasets are collected from different real-life applications and have different characteristics, such as different numbers of samples, features, and classes. Moreover, the developed EFOAOA is compared with ten methods, namely EFO, AOA, MRFO, bGWO, HGSO, MPA, TLBO, SGA, WOA, and SSA. The parameters of each algorithm are assigned based on its original work. The common parameters of these methods are the number of iterations and the number of individuals, which are set to 50 and 20, respectively. In addition, each of these methods is run 25 times to make the comparison fair. The comparison is based on six measures: the average, worst (MAX), best (MIN), and standard deviation (Std) of the fitness value, the number of selected features, and the accuracy (Acc).

5.2. Results and Discussion

Table 2 lists the feature selection results of all methods in terms of the average fitness value. From this table, we can notice that the proposed EFOAOA obtained the best average in eight out of 18 datasets (i.e., Breastcancer, D2, D4, D6, D8, D10, D12, and D17), and it was ranked first. The AOA method achieved the second rank with three out of 18 datasets (i.e., D3, D9, and D18), followed by TLBO and MRFO with two datasets each, although the average of TLBO over all datasets was better than that of MRFO. The SGA recorded the worst results overall. Figure 2 illustrates the performance of the EFOAOA based on the average fitness values. In addition, the developed EFOAOA is more stable than the other FS methods in terms of fitness value, as can be seen from the standard deviation bars.
The standard deviation of the fitness function for all methods is recorded in Table 3. The developed EFOAOA approach showed good stability in all datasets compared to the other approaches, and it obtained the lowest Std value in four out of 18 datasets (i.e., D1, D5, D8, and D17), followed by TLBO and EFO with the lowest Std values in three datasets each; both MRFO and MPA also showed good stability. The WOA recorded the worst results in this measure.
The best results of the fitness values are recorded in Table 4. From this table, we can see that the proposed EFOAOA showed competitive results with the TLBO method. The EFOAOA obtained the best results in three datasets (i.e., D3, D4, and D15), whereas the TLBO reached the minimum value in four datasets (i.e., D7, D12, D16, and D17), and both showed the same Min results in the D10 dataset. The worst method was the EFO.
Regarding the worst results of the fitness values, as in Table 5, the proposed EFOAOA is superior to the other compared methods and achieved the best results in 39% of all datasets (i.e., D4, D5, D6, D9, D11, D13, and D15), and it showed competitive results in the rest of the datasets. The AOA achieved the second rank by obtaining the best results in 28% of the datasets (i.e., D3, D8, D10, D17, and D18), followed by TLBO and MPA with 22% each. The remaining methods were ranked in the following sequence: MRFO, HGSO, SSA, EFO, bGWO, and WOA, whereas the SGA showed the worst results.
Furthermore, the best number of selected features is recorded in Table 6. In this measure, the algorithms should select the most relevant features while achieving the highest accuracy values. As shown in Table 6, the EFOAOA obtained the lowest number of features in 50% of the datasets. The WOA obtained the second rank with the lowest number of features in 44% of the datasets, followed by MRFO, bGWO, MPA, HGSO, TLBO, AOA, SSA, and EFO, whereas the SGA recorded the largest number of features among all methods.
The results of the proposed EFOAOA and all the compared methods in terms of accuracy are recorded in Table 7. The results of this measure show that the proposed EFOAOA is superior to the other compared approaches. It achieved the highest accuracy in 39% of all datasets and obtained the same accuracy as the other methods in 28% of all datasets. This result indicates that the EFOAOA is able to select the most relevant features while preserving the classification accuracy. The second rank was recorded by MRFO, followed by AOA, MPA, TLBO, SSA, HGSO, EFO, and bGWO, whereas the WOA showed the lowest accuracy in all datasets. Figure 3 indicates that the proposed EFOAOA obtained the highest accuracy on average over all datasets, and it is also more stable than the other methods, as can be observed from the standard deviation bars.
Moreover, for further analysis of the results of the developed method, the Friedman test is used. This is a non-parametric test that provides a statistical value indicating whether there is a significant difference between the developed method and the other methods. Table 8 shows the Friedman rank test results of the compared methods based on the accuracy, the number of selected features, and the fitness value. In this table, the EFOAOA obtained the first rank, followed by MPA, MRFO, AOA, TLBO, and SSA. In terms of the average fitness value, the developed EFOAOA takes the second rank after HGSO. In addition, the developed EFOAOA provides the best mean rank in terms of the number of selected features.
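As a rough illustration of how Friedman mean ranks such as those in Table 8 are obtained, the sketch below ranks the methods on every dataset, averages the ranks, and computes the Friedman chi-square statistic without tie correction. The accuracy values are made up for the example and do not come from the paper; the function names are hypothetical.

```python
def rank_row(values, higher_is_better=True):
    """Ranks within one dataset (1 = best); tied values share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i],
                   reverse=higher_is_better)
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2.0 + 1.0
        for p in range(pos, end + 1):
            ranks[order[p]] = avg
        pos = end + 1
    return ranks

def friedman(table, higher_is_better=True):
    """Mean rank per method and the Friedman chi-square statistic."""
    n, k = len(table), len(table[0])  # n datasets, k methods
    rows = [rank_row(r, higher_is_better) for r in table]
    mean_ranks = [sum(r[j] for r in rows) / n for j in range(k)]
    chi2 = 12.0 * n / (k * (k + 1)) * (
        sum(m * m for m in mean_ranks) - k * (k + 1) ** 2 / 4.0)
    return mean_ranks, chi2

# Made-up accuracies: rows are datasets, columns are three methods.
table = [[0.95, 0.90, 0.85],
         [0.97, 0.93, 0.91],
         [0.88, 0.89, 0.80]]
mean_ranks, chi2 = friedman(table)
```

A lower mean rank indicates a better method when `higher_is_better=True`; for measures such as the fitness value or the number of selected features, the flag would be set to `False`.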

5.3. Comparison with Other FS Techniques

In this part, the results of the developed EFOAOA are compared with well-known FS approaches that depend on MH techniques. These FS approaches include two whale optimization algorithms [41,42], binary bat algorithm (BBA) [43], enhanced GWO (EGWO) [44], BGOA [45,46], PSO, biogeography-based optimization (BBO), two binary GWO algorithms, namely bGWO1 and bGWO2 [47], AGWO [44], satin bird optimizer (SBO) [44], and enhanced crow search algorithm (ECSA) [48].
Table 9 lists the classification accuracy of the developed EFOAOA and the other methods. From these results, it can be seen that the developed EFOAOA improves the classification accuracy on all tested datasets except D5, D7, and D14, where PSO, bGWO2, and BGOA are the best, respectively. This indicates the high ability of EFOAOA to select the relevant features while preserving the classification accuracy.
To sum up, the previous results show an obvious enhancement in solving feature selection problems when using the proposed EFOAOA method. The EFO is improved considerably by using the operators of the AOA in its structure. Therefore, the EFOAOA can be considered an efficient and effective optimization algorithm for solving feature selection problems.

6. Conclusions and Future Work

With the aim of providing an efficient feature selection (FS) optimizer, this paper proposed an innovative variant of the electric fish optimization (EFO) algorithm by integrating the operators of the arithmetic optimization algorithm (AOA) into the exploration phase of EFO. The EFO has a drawback when handling high-dimensional optimization problems; to address it, a hybrid variant named EFOAOA was proposed. The proposed EFOAOA was applied to eighteen different real-life datasets to tackle the FS optimization task. The EFOAOA was compared with the basic EFO, AOA, MRFO, bGWO, HGSO, MPA, TLBO, SGA, WOA, and SSA through a set of statistical metrics, namely the average, worst, best, and standard deviation of the fitness function and the classification accuracy, as well as the non-parametric Friedman test. The comparisons revealed that the EFOAOA obtained the best average fitness in eight out of 18 datasets, the lowest Std value in four out of 18, and the lowest number of features in 50% of the datasets. The proposed EFOAOA also obtained the highest accuracy on average over all datasets; this result indicates that the EFOAOA was able to select the most relevant features while preserving the classification accuracy. Also, the operators of the AOA play an essential role in improving the exploration stage of the original EFO algorithm.
For future work, the proposed EFOAOA will be examined in several applications, including image segmentation, parameter estimation, and time-series forecasting. We will also further improve the EFOAOA stages by using different techniques, such as chaotic maps and opposition-based learning.

Author Contributions

M.A.E. and R.A.I.: Conceptualization, supervision, methodology, formal analysis, resources, data curation, and writing—original draft preparation. L.A.: Conceptualization, supervision, methodology, formal analysis, resources, data curation, and writing—original draft preparation. D.Y.: Writing—review. M.A.A.A.-q.: Writing—review and editing. S.A.: Writing—review and editing, project administration, formal analysis, and funding acquisition. A.A.E.: Supervision, resources, formal analysis, methodology, and writing—review and editing. All authors have read and agreed to the published version of the manuscript.


Funding

This research was funded by the Deanship of Scientific Research at Princess Nourah bint Abdulrahman University through the Fast-track Research Funding Program.

Data Availability Statement

The data are available from



Conflicts of Interest

The authors declare no conflict of interest.


  1. Tubishat, M.; Idris, N.; Shuib, L.; Abushariah, M.A.; Mirjalili, S. Improved salp swarm algorithm based on opposition based learning and novel local search algorithm for feature selection. Expert Syst. Appl. 2020, 145, 113122. [Google Scholar] [CrossRef]
  2. Hancer, E.; Xue, B.; Karaboga, D.; Zhang, M. A binary ABC algorithm based on advanced similarity scheme for feature selection. Appl. Soft Comput. 2015, 36, 334–348. [Google Scholar] [CrossRef]
  3. Ewees, A.A.; Abualigah, L.; Yousri, D.; Algamal, Z.Y.; Al-Qaness, M.A.A.; Ibrahim, R.A.; Elaziz, M.A. Improved slime mould algorithm based on firefly algorithm for feature selection: A case study on QSAR model. Eng. Comput. 2021, 1–15. [Google Scholar] [CrossRef]
  4. Al-Qaness, M.A.A. Device-free human micro-activity recognition method using WiFi signals. Geo-Spat. Inf. Sci. 2019, 22, 128–137. [Google Scholar] [CrossRef]
  5. Dahou, A.; Elaziz, M.A.; Zhou, J.; Xiong, S. Arabic sentiment classification using convolutional neural network and differential evolution algorithm. Comput. Intell. Neurosci. 2019, 2019, 2537689. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  6. Yousri, D.; Elaziz, M.A.; Abualigah, L.; Oliva, D.; Al-Qaness, M.A.; Ewees, A.A. COVID-19 X-ray images classification based on enhanced fractional-order cuckoo search optimizer using heavy-tailed distributions. Appl. Soft Comput. 2020, 101, 107052. [Google Scholar] [CrossRef]
  7. Benazzouz, A.; Guilal, R.; Amirouche, F.; Hadj Slimane, Z.E. EMG feature selection for diagnosis of neuromuscular disorders. In Proceedings of the 2019 International Conference on Networking and Advanced Systems (ICNAS), Annaba, Algeria, 26–27 June 2019; pp. 1–5. [Google Scholar]
  8. Cheng, S.; Ma, L.; Lu, H.; Lei, X.; Shi, Y. Evolutionary computation for solving search-based data analytics problems. Artif. Intell. Rev. 2020, 54, 1321–1348. [Google Scholar] [CrossRef]
  9. Nobile, M.S.; Tangherloni, A.; Rundo, L.; Spolaor, S.; Besozzi, D.; Mauri, G.; Cazzaniga, P. Computational intelligence for parameter estimation of biochemical systems. In Proceedings of the 2018 IEEE Congress on Evolutionary Computation (CEC), Rio de Janeiro, Brazil, 8–13 July 2018; pp. 1–8. [Google Scholar]
  10. Rundo, L.; Tangherloni, A.; Cazzaniga, P.; Nobile, M.S.; Russo, G.; Gilardi, M.C.; Vitabile, S.; Mauri, G.; Besozzi, D.; Militello, C. A novel framework for MR image segmentation and quantification by using MedGA. Comput. Methods Programs Biomed. 2019, 176, 159–172. [Google Scholar] [CrossRef] [PubMed]
  11. Ortiz, A.; Górriz, J.; Ramírez, J.; Salas-González, D.; Llamas-Elvira, J. Two fully-unsupervised methods for MR brain image segmentation using SOM-based strategies. Appl. Soft Comput. 2013, 13, 2668–2682. [Google Scholar] [CrossRef]
  12. Ibrahim, R.A.; Ewees, A.; Oliva, D.; Elaziz, M.A.; Lu, S. Improved salp swarm algorithm based on particle swarm optimization for feature selection. J. Ambient. Intell. Humaniz. Comput. 2018, 10, 3155–3169. [Google Scholar] [CrossRef]
  13. El Aziz, M.A.; Hassanien, A.E. Modified cuckoo search algorithm with rough sets for feature selection. Neural Comput. Appl. 2016, 29, 925–934. [Google Scholar] [CrossRef]
  14. El Aziz, M.A.; Moemen, Y.S.; Hassanien, A.E.; Xiong, S. Toxicity risks evaluation of unknown FDA biotransformed drugs based on a multi-objective feature selection approach. Appl. Soft Comput. 2019, 97, 105509. [Google Scholar] [CrossRef]
  15. Ibrahim, R.A.; Elaziz, M.A.; Ewees, A.A.; El-Abd, M.; Lu, S. New feature selection paradigm based on hyper-heuristic technique. Appl. Math. Model. 2021, 98, 14–37. [Google Scholar] [CrossRef]
  16. Xue, B.; Zhang, M.; Member, S.; Browne, W.N. Particle swarm optimization for feature selection in classification: A multi-objective approach. IEEE Trans. Cybern. 2013, 43, 1656–1671. [Google Scholar] [CrossRef] [PubMed]
  17. Hancer, E. Differential evolution for feature selection: A fuzzy wrapper–filter approach. Soft Comput. 2018, 23, 5233–5248. [Google Scholar] [CrossRef]
  18. Tsai, C.-F.; Eberle, W.; Chu, C.-Y. Genetic algorithms in feature and instance selection. Knowl.-Based Syst. 2012, 39, 240–247. [Google Scholar] [CrossRef]
  19. Sayed, G.I.; Hassanien, A.E.; Azar, A.T. Feature selection via a novel chaotic crow search algorithm. Neural Comput. Appl. 2017, 31, 171–188. [Google Scholar] [CrossRef]
  20. Taradeh, M.; Mafarja, M.; Heidari, A.A.; Faris, H.; Aljarah, I.; Mirjalili, S.; Fujita, H. An evolutionary gravitational search-based feature selection. Inf. Sci. 2019, 497, 219–239. [Google Scholar] [CrossRef]
  21. Abdel-Basset, M.; Ding, W.; El-Shahat, D. A hybrid Harris Hawks optimization algorithm with simulated annealing for feature selection. Artif. Intell. Rev. 2020, 54, 593–637. [Google Scholar] [CrossRef]
  22. Sahlol, A.T.; Yousri, D.; Ewees, A.A.; Al-Qaness, M.A.A.; Damasevicius, R.; Elaziz, M.A. COVID-19 image classification using deep features and fractional-order marine predators algorithm. Sci. Rep. 2020, 10, 15364. [Google Scholar] [CrossRef]
  23. Abualigah, L.; Yousri, D.; Elaziz, M.A.; Ewees, A.A.; Al-Qaness, M.A.; Gandomi, A.H. Aquila Optimizer: A novel meta-heuristic optimization algorithm. Comput. Ind. Eng. 2021, 157, 107250. [Google Scholar] [CrossRef]
  24. Yilmaz, S.; Sen, S. Electric fish optimization: A new heuristic algorithm inspired by electrolocation. Neural Comput. Appl. 2019, 32, 11543–11578. [Google Scholar] [CrossRef]
  25. Abualigah, L.; Diabat, A.; Mirjalili, S.; Elaziz, M.A.; Gandomi, A.H. The arithmetic optimization algorithm. Comput. Methods Appl. Mech. Eng. 2021, 376, 113609. [Google Scholar] [CrossRef]
  26. Xu, Y.-P.; Tan, J.-W.; Zhu, D.-J.; Ouyang, P.; Taheri, B. Model identification of the proton exchange membrane fuel cells by extreme learning machine and a developed version of arithmetic optimization algorithm. Energy Rep. 2021, 7, 2332–2342. [Google Scholar] [CrossRef]
  27. Abualigah, L.; Diabat, A.; Sumari, P.; Gandomi, A. A Novel Evolutionary arithmetic optimization algorithm for multilevel thresholding segmentation of COVID-19 CT images. Processes 2021, 9, 1155. [Google Scholar] [CrossRef]
  28. Premkumar, M.; Jangir, P.; Kumar, B.S.; Sowmya, R.; Alhelou, H.H.; Abualigah, L.; Yildiz, A.R.; Mirjalili, S. A New arithmetic optimization algorithm for solving real-world multiobjective CEC-2021 constrained optimization problems: Diversity analysis and validations. IEEE Access 2021, 9, 84263–84295. [Google Scholar] [CrossRef]
  29. Khatir, S.; Tiachacht, S.; Le Thanh, C.; Ghandourah, E.; Mirjalili, S.; Wahab, M.A. An improved artificial neural network using arithmetic optimization algorithm for damage assessment in FGM composite plates. Compos. Struct. 2021, 273, 114287. [Google Scholar] [CrossRef]
  30. Chaudhuri, A.; Sahu, T.P. Feature selection using binary crow search algorithm with time varying flight length. Expert Syst. Appl. 2020, 168, 114288. [Google Scholar] [CrossRef]
  31. Sadeghian, Z.; Akbari, E.; Nematzadeh, H. A hybrid feature selection method based on information theory and binary butterfly optimization algorithm. Eng. Appl. Artif. Intell. 2020, 97, 104079. [Google Scholar] [CrossRef]
  32. Maleki, N.; Zeinali, Y.; Niaki, S.T.A. A k-NN method for lung cancer prognosis with the use of a genetic algorithm for feature selection. Expert Syst. Appl. 2020, 164, 113981. [Google Scholar] [CrossRef]
  33. Song, X.-F.; Zhang, Y.; Gong, D.-W.; Sun, X.-Y. Feature selection using bare-bones particle swarm optimization with mutual information. Pattern Recognit. 2020, 112, 107804. [Google Scholar] [CrossRef]
  34. Sathiyabhama, B.; Kumar, S.U.; Jayanthi, J.; Sathiya, T.; Ilavarasi, A.K.; Yuvarajan, V.; Gopikrishna, K. A novel feature selection framework based on grey wolf optimizer for mammogram image analysis. Neural Comput. Appl. 2021, 1–20. [Google Scholar] [CrossRef]
  35. Aljarah, I.; Habib, M.; Faris, H.; Al-Madi, N.; Heidari, A.A.; Mafarja, M.; Elaziz, M.A.; Mirjalili, S. A dynamic locality multi-objective salp swarm algorithm for feature selection. Comput. Ind. Eng. 2020, 147, 106628. [Google Scholar] [CrossRef]
  36. Dhiman, G.; Oliva, D.; Kaur, A.; Singh, K.K.; Vimal, S.; Sharma, A.; Cengiz, K. BEPO: A novel binary emperor penguin optimizer for automatic feature selection. Knowl.-Based Syst. 2020, 211, 106560. [Google Scholar] [CrossRef]
  37. Amini, F.; Hu, G. A two-layer feature selection method using genetic algorithm and elastic net. Expert Syst. Appl. 2020, 166, 114072. [Google Scholar] [CrossRef]
  38. Neggaz, N.; Houssein, E.H.; Hussain, K. An efficient henry gas solubility optimization for feature selection. Expert Syst. Appl. 2020, 152, 113364. [Google Scholar] [CrossRef]
  39. Rostami, M.; Berahmand, K.; Nasiri, E.; Forouzandeh, S. Review of swarm intelligence-based feature selection methods. Eng. Appl. Artif. Intell. 2021, 100, 104210. [Google Scholar] [CrossRef]
  40. Agrawal, P.; Abutarboush, H.F.; Ganesh, T.; Mohamed, A.W. Metaheuristic algorithms on feature selection: A survey of one decade of research (2009–2019). IEEE Access 2021, 9, 26766–26791. [Google Scholar] [CrossRef]
  41. Mafarja, M.; Mirjalili, S. Whale optimization approaches for wrapper feature selection. Appl. Soft Comput. 2018, 62, 441–453. [Google Scholar] [CrossRef]
  42. El Aziz, M.A.; Ewees, A.A.; Hassanien, A.E. Multi-objective whale optimization algorithm for content-based image retrieval. Multimed. Tools Appl. 2018, 77, 26135–26172. [Google Scholar] [CrossRef]
  43. Nakamura, R.Y.M.; Pereira, L.A.M.; Costa, K.A.; Rodrigues, D.; Papa, J.P.; Yang, X.-S. BBA: A binary bat algorithm for feature selection. In Proceedings of the 2012 25th SIBGRAPI Conference on Graphics, Patterns and Images, Ouro Puerto, Brazil, 22–25 August 2012; pp. 291–297. [Google Scholar]
  44. Arora, S.; Singh, H.; Sharma, M.; Sharma, S.; Anand, P. A new hybrid algorithm based on grey wolf optimization and crow search algorithm for unconstrained function optimization and feature selection. IEEE Access 2019, 7, 26343–26361. [Google Scholar] [CrossRef]
  45. Mafarja, M.; Aljarah, I.; Heidari, A.A.; Hammouri, A.I.; Faris, H.; Al-Zoubi, A.; Mirjalili, S. Evolutionary population dynamics and grasshopper optimization approaches for feature selection problems. Knowl.-Based Syst. 2018, 145, 25–45. [Google Scholar] [CrossRef] [Green Version]
  46. Ewees, A.A.; Elaziz, M.A.; Houssein, E.H. Improved grasshopper optimization algorithm using opposition-based learning. Expert Syst. Appl. 2018, 112, 156–172. [Google Scholar] [CrossRef]
  47. Emary, E.; Zawbaa, H.M.; Hassanien, A.E. Binary grey wolf optimization approaches for feature selection. Neurocomputing 2016, 172, 371–381. [Google Scholar] [CrossRef]
  48. Ouadfel, S.; Elaziz, M.A. Enhanced crow search algorithm for feature selection. Expert Syst. Appl. 2020, 159, 113572. [Google Scholar] [CrossRef]
Figure 1. The different stages of the EFOAOA method as an FS approach. The first stage contains the dataset, the second stage contains the updating process, and the third stage contains the evaluation process.
Figure 2. Average of fitness value over all the tested datasets.
Figure 3. Average of accuracy measure over all the tested datasets.
Table 1. Description of datasets.
Dataset | Number of Instances | Number of Classes | Number of Features | Data Category
Breastcancer (D1) | 699 | 2 | 9 | Biology
BreastEW (D2) | 569 | 2 | 30 | Biology
CongressEW (D3) | 435 | 2 | 16 | Politics
Exactly (D4) | 1000 | 2 | 13 | Biology
Exactly2 (D5) | 1000 | 2 | 13 | Biology
HeartEW (D6) | 270 | 2 | 13 | Biology
IonosphereEW (D7) | 351 | 2 | 34 | Electromagnetic
KrvskpEW (D8) | 3196 | 2 | 36 | Game
Lymphography (D9) | 148 | 2 | 18 | Biology
M-of-n (D10) | 1000 | 2 | 13 | Biology
PenglungEW (D11) | 73 | 2 | 325 | Biology
SonarEW (D12) | 208 | 2 | 60 | Biology
SpectEW (D13) | 267 | 2 | 22 | Biology
tic-tac-toe (D14) | 958 | 2 | 9 | Game
Vote (D15) | 300 | 2 | 16 | Politics
WaveformEW (D16) | 5000 | 3 | 40 | Physics
WaterEW (D17) | 178 | 3 | 13 | Chemistry
Zoo (D18) | 101 | 6 | 16 | Artificial
Table 2. Average of fitness values for all methods.
Table 3. Standard deviation of fitness values for all methods.
Table 4. Best fitness values for all methods.
Table 5. Worst fitness values for all methods.
Table 6. Number of selected features for all methods.
Table 7. Accuracy results for all methods.
Table 8. Mean rank obtained using Friedman test for FS methods.
No. of selected features | 3.944 | 9.222 | 5.222 | 4.916 | 5.583 | 4.861 | 5.333 | 4.500 | 10.416 | 4.444 | 7.555
Fitness value | 3.388 | 9.277 | 5.333 | 7.888 | 4.833 | 3.222 | 4.166 | 3.555 | 9.888 | 7.3889 | 7.055
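The mean ranks in Table 8 allow a simple consistency check. When k methods are ranked on every dataset, the ranks on each dataset sum to k(k+1)/2, so the mean ranks across datasets must sum to the same value. With the 11 compared methods (EFOAOA plus the ten competitors), this is 66. The sketch below shows how mean ranks can be computed and checks the two printed rows against that identity. The helper names are hypothetical, average ranks for ties are an assumption, and the method order of the columns is not stated here, so the values are left unlabeled.

```python
def average_ranks(values):
    """Rank values in ascending order (1 = smallest); ties share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = shared
        i = j + 1
    return ranks


def friedman_mean_ranks(scores):
    """scores[d][m] is the score of method m on dataset d (lower is better)."""
    per_dataset = [average_ranks(row) for row in scores]
    n, k = len(scores), len(scores[0])
    return [sum(r[m] for r in per_dataset) / n for m in range(k)]


# Rows of Table 8 as printed (column order of the methods not stated here).
features_row = [3.944, 9.222, 5.222, 4.916, 5.583, 4.861,
                5.333, 4.500, 10.416, 4.444, 7.555]
fitness_row = [3.388, 9.277, 5.333, 7.888, 4.833, 3.222,
               4.166, 3.555, 9.888, 7.3889, 7.055]

k = len(features_row)
expected_sum = k * (k + 1) / 2  # 66 for 11 methods

features_gap = abs(sum(features_row) - expected_sum)
fitness_gap = abs(sum(fitness_row) - expected_sum)
```

Both rows sum to 66 up to rounding of the printed values, which is consistent with 11 ranked methods.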
Table 9. Comparison with other existing FS methods.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Ibrahim, R.A.; Abualigah, L.; Ewees, A.A.; Al-qaness, M.A.A.; Yousri, D.; Alshathri, S.; Abd Elaziz, M. An Electric Fish-Based Arithmetic Optimization Algorithm for Feature Selection. Entropy 2021, 23, 1189.

