Article

Improved Binary Grasshopper Optimization Algorithm for Feature Selection Problem

Gui-Ling Wang, Shu-Chuan Chu, Ai-Qing Tian, Tao Liu and Jeng-Shyang Pan
1 College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao 266590, China
2 College of Science and Engineering, Flinders University, Adelaide 5042, Australia
3 Department of Information Management, Chaoyang University of Technology, Taichung 413, China
* Author to whom correspondence should be addressed.
These authors contributed equally to this work.
Entropy 2022, 24(6), 777; https://doi.org/10.3390/e24060777
Submission received: 9 April 2022 / Revised: 24 May 2022 / Accepted: 29 May 2022 / Published: 31 May 2022

Abstract

The migration and predation behavior of grasshoppers inspires the grasshopper optimization algorithm (GOA), which can be applied to practical problems. The binary grasshopper optimization algorithm (BGOA) is used for binary problems. To improve the algorithm's exploration capability and the quality of its solutions, this paper modifies the step size in BGOA. The step size is expanded, and three new transfer functions are proposed based on this improvement. To demonstrate the effectiveness of the algorithm, a comparative experiment with BGOA, particle swarm optimization (PSO), and the binary gray wolf optimizer (BGWO) is conducted. The improved algorithm is tested on 23 benchmark test functions, and the Wilcoxon rank-sum and Friedman tests are used to verify its validity. The results indicate that the optimized algorithm significantly outperforms the others on most functions. For the application, this paper selects 23 UCI datasets for feature selection. The improved algorithm yields higher accuracy with fewer selected features.

1. Introduction

Recent years have witnessed a spurt in the development of informatics, and the data scale of applications such as statistical analysis and data mining is growing ever larger. Accordingly, the number of features obtained from a dataset is also increasing. However, some features may be irrelevant or redundant, independent of the final classification goal [1]. Therefore, it is necessary to reduce the dimensionality of the data and obtain representative features before the classification task. Data preprocessing can smooth out noisy and incomplete data, detect redundancy, and improve robustness. As an essential preprocessing step, feature selection (FS) can effectively clean and remove useless data features [2]. Thus, FS plays an essential role in dimensionality reduction and in improving classification performance.
FS is an effective strategy to reduce dimensionality and eliminate noisy and unreliable data [3]. It refers to finding feature-related subsets from a large set of attributes. There are 2^N − 1 possible feature subsets in a dataset with N features. Davies proved that the search for the smallest subset of features is an NP-hard problem, which means there is no guarantee of finding an optimal solution other than an exhaustive search [4,5]. However, when the number of features is large, an exhaustive search cannot be applied in practice because of the large amount of computation. Therefore, researchers are committed to using heuristic search algorithms to find suboptimal solutions. Many studies have attempted to model feature selection as a combinatorial optimization problem. The objective function can be classification accuracy or some other criterion that considers the best trade-off between the number of extracted features and efficiency [6].
Meta-heuristic algorithms are used to find optimal or satisfactory solutions to complex optimization problems [7,8,9]. Their principles are derived from the behaviors observed in biological, physical, and other systems. In 1991, an Italian scholar proposed the theory of ant colony optimization (ACO) [10]. Since then, swarm intelligence has been formally proposed as a theory. Swarm intelligence takes advantage of group information and has been extensively used in optimization problems. In 1995, some scholars presented the particle swarm optimization (PSO) algorithm [11], and research on this subject then advanced rapidly. The cat swarm optimization algorithm, based on feline predation strategies, was introduced in 2006 [12]. In 2010, fish migration optimization (FMO) emerged, which integrated migration and swim models into the optimization process [13]. In 2017, Saremi et al. proposed the grasshopper optimization algorithm (GOA) [14]. GOA solves optimization problems by mathematically modeling and simulating the behavior of grasshopper swarms in nature. Compared with other existing algorithms, GOA has higher search efficiency and faster convergence speed. It has also solved the continuous problem of finding the best shape for a 52-bar truss and a 3-bar truss. Over recent years, more sophisticated algorithms have been put forward, such as the sparrow search algorithm (SSA) [15], the seagull optimization algorithm (SOA) [16], quasi-affine transformation evolution with external archive (QUATRE-EAR) [17], and the polar bear optimization algorithm (PBO) [18].
Nonetheless, many optimization problems, such as FS, are discrete. Conventional methods cannot satisfy practical needs, so binary algorithms are needed to solve such problems. To date, scholars have proposed many binary algorithms and achieved fruitful results. Among them, the well-known PSO algorithm and its binary variants have been applied to feature selection [19,20,21,22]. A binary whale optimization algorithm was presented to handle discrete problems [23]. The binary fish migration optimization algorithm (ABFMO) [24] and the improved binary symbiotic organism search algorithm (IBSOS) using transfer functions have also been used to solve the FS problem [25]. Accordingly, the pigeon-inspired optimization algorithm (PIO) and the gray wolf optimization algorithm (GWO) have been improved for better application in feature selection [26,27,28]. The pigeon-inspired optimization algorithm simulates the homing behavior of pigeons. Based on the binary pigeon-inspired optimization algorithm (BPIO), Tian et al. proposed an improved binary pigeon-inspired optimization (IBPIO) [29]; they offered a new speed update equation and achieved excellent results. Additionally, the binary approach enabled GWO to be applied to discrete problems [30,31]. A novel binary gray wolf optimizer (BGWO) added a new parameter update equation to enhance the search capability [32], and its authors gave five transfer functions for feature selection on UCI datasets. Beyond that, the binary version of GOA has also been used to solve the FS problem [33]. Hichem et al. proposed a novel binary GOA (NBGOA) by modeling position vectors as binary vectors [34]. Pinto et al. [35] developed a binary GOA based on the percentile concept for solving the multidimensional knapsack problem (MKP). Moreover, BGOA-M, a binary GOA based on the mutation operator, was introduced for the FS problem [36].
The sigmoid transfer function is a common method for converting algorithms to binary versions [37,38]. Some scholars suggested an improved binary EO (BEO) for FS problems using the sigmoid function [39]. The authors of [6] presented a binary MPA (BMPA) and its improved versions using sigmoid and eight transfer functions. In the binary grasshopper optimization algorithm (BGOA), the authors used the sigmoid transfer function to map the continuous space to a binary one [36], and it has been applied successfully in feature selection. However, the original BGOA has a weakness: the position conversion probability covers only a small range, which cannot satisfy the exploration requirement of the algorithm. Thus, this paper presents an improved BGOA to avoid this situation. In the first place, the improved BGOA optimizes the step size of the original BGOA. Secondly, two sigmoid-based and one V-shaped transfer function are proposed based on the new step size. To evaluate the effectiveness of the improved algorithm, 23 well-known datasets are used in experiments. For the performance analysis, the improved BGOA is compared with BGOA, BPSO, and BGWO. Experiments show that the proposed algorithm performs better than the original BGOA on the FS problem. The main contributions are as follows:
  • The range of step size variables in the original BGOA is optimized.
  • Three new transfer functions and two position conversion formulas are proposed based on the new step size.
  • The efficiency of the improved algorithm is examined by several experiments on 23 benchmark functions [40].
  • The improved algorithm achieves satisfactory results in feature selection application.
The rest of this paper is organized as follows. Section 2 presents the preliminaries, covering GOA and the original BGOA. Section 3 presents the improved version of BGOA. Section 4 shows the effect of the improved BGOA on 23 benchmark functions. Section 5 describes the application of the improved BGOA to feature selection. Section 6 analyzes the results of feature selection. Section 7 concludes with a discussion.

2. Preliminaries

GOA has been successfully applied to continuous problems, and its binary variants have also been gradually refined. This section introduces the standard GOA and the BGOA based on the sigmoid transfer function.

2.1. GOA

Grasshoppers are incompletely metamorphosed insects whose life cycle consists of three stages: egg, nymph, and adult. Grasshoppers are a worldwide agricultural pest and generally occur individually. Nevertheless, they are swarming organisms prone to periodic population outbreaks and can migrate over long distances. Grasshoppers are usually observed in the nymph and adult stages. Adult grasshoppers have strong hind legs, which allows them to cause tremendous damage to agriculture, forestry, and animal husbandry. They are adept at jumping and flying, with a wide range of movement. In addition to migration, grasshoppers are also characterized by their predation process. Nature-inspired optimization algorithms have two phases: exploration and exploitation. Exploration is a large-scale search that prevents falling into a local optimum, while exploitation is a small-scale search to find the optimal solution [41,42]. Grasshoppers can instinctively perform these two steps to find a target. Furthermore, according to the grasshopper's characteristics, GOA has a unique adaptive mechanism that can effectively regulate the global and local search process with high search accuracy. This phenomenon is mathematically modeled by Saremi et al. [43]:
X_i = S_i + G_i + A_i,    (1)
where $X_i$ represents the current position of the i-th grasshopper, $S_i$ represents the social interaction between individuals, $G_i$ is the gravitational influence, and $A_i$ is the wind influence. Each operator is multiplied by a random number from 0 to 1 to enhance randomness, as shown in Equation (2):
X_i = k_1 S_i + k_2 G_i + k_3 A_i.    (2)
The details of $S_i$, the social interaction operator, are as follows:
S_i = \sum_{j=1, j \neq i}^{N} s(d_{ij}) \, \hat{d}_{ij},    (3)
where $d_{ij}$ is the distance between the i-th and j-th grasshoppers, the function s calculates the intensity of social interaction, and $\hat{d}_{ij} = \frac{x_j - x_i}{d_{ij}}$ is the unit vector from the i-th to the j-th grasshopper, with $x_i$ and $x_j$ representing the positions of the i-th and j-th grasshoppers, respectively. The s function is defined in Equation (4):
s(a) = f e^{-\frac{a}{l}} - e^{-a},    (4)
where f is the attraction intensity and l is the attractive length scale. A negative value of s indicates repulsion, a positive value indicates attraction between grasshoppers, and 0 means they are in their comfort zone. The value of f is 0.5 and the value of l is 1.5. When two grasshoppers are too far apart, the force no longer exists; therefore, the distance has to be normalized. In the original paper, gravity is not taken into account, and the wind direction is toward the best solution. The final position formula is shown in Equation (5). The pseudocode of GOA is given in Algorithm 1.
X_i^d = c \left( \sum_{j=1, j \neq i}^{N} c \, \frac{ub_d - lb_d}{2} \, s\!\left(\left|X_j^d - X_i^d\right|\right) \frac{x_j - x_i}{d_{ij}} \right) + \hat{T}_d.    (5)
Algorithm 1: Pseudocode of GOA
1: Initialize C_max, C_min (the two extreme values of parameter c), Max_iter (the maximum number of iterations), and N (the population of grasshoppers)
2: Initialize the position of each grasshopper X_i (i = 1, 2, ..., N)
3: Set the best solution as Target
4: while t ≤ Max_iter do
5:     Update c with Equation (6)
6:     for each agent do
7:         Normalize the distances between individuals to [1, 4]
8:         Update X_i using Equation (5)
9:         Update Target if a better value is obtained
10:     end for
11:     t = t + 1
12: end while
13: Output Target
Here $ub_d$ and $lb_d$ are the upper and lower bounds of the d-th dimension, respectively, $\hat{T}_d$ is the value of the best solution found so far in the d-th dimension, and N is the population size. The calculation of c in Equation (5) is explained below:
c = C_{max} - t \, \frac{C_{max} - C_{min}}{T},    (6)
where $C_{max}$ is the maximum value, $C_{min}$ is the minimum value, t is the current iteration, and T is the total number of iterations. It is easy to see that c becomes smaller as the number of iterations increases. The outer c narrows the search area around the target as the iterations proceed, while the inner c reduces the attraction or repulsion between grasshoppers. In this paper, $C_{max} = 1$ and $C_{min} = 0.00001$.
From Equation (5), we know that the new position of the i-th grasshopper is related not only to its current position but also to the current positions of all other grasshoppers and the interaction forces between individuals. The adaptive mechanism of the algorithm balances the global and local search, giving it an excellent ability to seek the optimum.
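For readers who prefer code, the position update of Equations (5) and (6) can be sketched as follows. This is a minimal illustration, not the authors' implementation; the helper names (s_func, update_positions), the normalization of distances into [1, 4), and the clipping to the search bounds are our assumptions.

```python
import numpy as np

def s_func(r, f=0.5, l=1.5):
    """Social interaction strength, Equation (4): s(r) = f*exp(-r/l) - exp(-r)."""
    return f * np.exp(-r / l) - np.exp(-r)

def update_positions(X, target, t, T, ub, lb, c_max=1.0, c_min=1e-5):
    """One GOA iteration over a population X of shape (N, dim), following Equation (5)."""
    c = c_max - t * (c_max - c_min) / T          # Equation (6)
    N, dim = X.shape
    X_new = np.empty_like(X)
    for i in range(N):
        social = np.zeros(dim)
        for j in range(N):
            if j == i:
                continue
            dist = np.linalg.norm(X[j] - X[i])
            dist_norm = 1.0 + (dist % 3.0)       # normalize distance into [1, 4) as in Algorithm 1
            unit = (X[j] - X[i]) / (dist + 1e-12)
            social += c * (ub - lb) / 2.0 * s_func(dist_norm) * unit
        X_new[i] = c * social + target           # wind blows toward the best solution T_d
    return np.clip(X_new, lb, ub)
```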

2.2. BGOA

The search space of GOA is continuous; thus, positions can move freely. However, in binary space, each position can only take the value 0 or 1. Mafarja et al. used the sigmoid transfer function to implement the binary conversion:
T(\Delta X_t) = \frac{1}{1 + e^{-\Delta X_t}},    (7)
where $\Delta X_t$ is the first term of Equation (5), similar to the velocity variable in the PSO algorithm, and is called the step size. The absolute value of $\Delta X_t$ can be regarded as the distance between the updated position of the grasshopper and the target position in the d-th dimension. A conversion probability is obtained from the transfer function, and the grasshopper's position is then updated through Equations (7) and (8):
X_{t+1}^d = \begin{cases} 1, & \text{if } r_1 < T(\Delta X_{t+1}^d) \\ 0, & \text{if } r_1 \geq T(\Delta X_{t+1}^d), \end{cases}    (8)
where $r_1$ is a random number in [0, 1] and $X_{t+1}^d$ is the position in the d-th dimension after the t-th iteration.
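As a concrete illustration of Equations (7) and (8), the following sketch binarizes a step-size vector with the sigmoid transfer function. It assumes delta_x holds the first term of Equation (5); the function name is ours, not from the paper.

```python
import numpy as np

def bgoa_binary_update(delta_x, rng=np.random.default_rng()):
    """Map a continuous step size to a 0/1 position via Equations (7) and (8)."""
    prob = 1.0 / (1.0 + np.exp(-delta_x))     # Equation (7): sigmoid conversion probability
    r1 = rng.random(delta_x.shape)            # r1 ~ U[0, 1] per dimension
    return (r1 < prob).astype(int)            # Equation (8): 1 if r1 < T, else 0
```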

3. Analysis and the Improvement of Binary Grasshopper Optimization Algorithm

The standard GOA algorithm and its modified versions work well on continuous problems. Feature selection can be seen as a binary problem of selecting an appropriate 0/1 string. The length of the string equals the total number of features in the original dataset, where 0 represents an unselected attribute and 1 a selected attribute. Additionally, the transfer function is a common and classical method for converting a continuous space to a binary one [44].
The original BGOA used the step size and a transfer function for binary conversion and obtained specific results. From the analysis in Section 2, we know that parameter c, and hence the step size, becomes smaller as the number of iterations increases. After debugging the code and inspecting the values, the range of the step size is found to be [−0.3, 0.4], which means that the conversion probability always falls in a small part of [0, 1]. The curve is shown in Figure 1. Beyond that, the parameter $r_1$ in Equation (8) is a random number; thus, the narrow probability range may not be conducive to position updates in the early exploration stage. The ideal result is that the individuals in the population can randomly transform their positions in the binary pattern. To avoid this situation, this paper improves the performance of BGOA by modifying the transfer function. This manuscript proposes a new step-size variable and three improved transfer functions. The first two transfer functions are based on the sigmoid function, and the third is a V-shaped function.
The step size is modified to consider both the range of population positions and the uniformity of particles falling around 0.5 to ensure fairness. When the step size takes a value close to 6, the conversion probability is nearly 1. Therefore, the step size $\Delta X$ is enlarged by a factor of 20, changing its range to [−6, 6]. The new transfer functions are proposed based on this new range. These transfer functions have two extremes close to 0 and 1 on [−6, 6], which provides strong randomness in converting the binary position, and the values on both sides of the 0 point are evenly distributed. Here, B is set to 20.
When the grasshopper updates its position, the transition probability is obtained according to Equation (9), which we refer to as BGOAS1:
S_1(B \Delta X_t) = \frac{1}{1 + e^{-\frac{17}{15}\left(1 + B \Delta X_t\right)}}.    (9)
Equation (10) is called BGOAS2:
S_2(B \Delta X_t) = \frac{15}{4} \cdot \frac{1}{1 + e^{-\frac{\pi}{3}\left(B \Delta X_t\right)}},    (10)
and Equation (11) is called BGOAV:
V(B \Delta X_t) = 2 \cdot \frac{2}{3} \cdot \frac{\left|\tanh\left(B \Delta X_t\right)\right|}{\tanh 4}.    (11)
The new position is derived according to Equation (12) or Equation (13):
X_{t+1}^d = \begin{cases} 1, & \text{if } r < S_1(B \Delta X_t) \text{ or } r < S_2(B \Delta X_t) \\ 0, & \text{if } r \geq S_1(B \Delta X_t) \text{ or } r \geq S_2(B \Delta X_t), \end{cases}    (12)
X_{t+1}^d = \begin{cases} \left(X_t^d\right)^{-1}, & \text{if } r < V(B \Delta X_t) \\ X_t^d, & \text{if } r \geq V(B \Delta X_t). \end{cases}    (13)
Here $\left(X_t^d\right)^{-1}$ denotes the complement of the binary bit $X_t^d$. From the above description, we can see that the new position of a grasshopper depends on the current positions of all grasshoppers and is finally derived from the position conversion probability. Compared with the existing BGOA, the methods proposed in this paper have better exploration ability and randomness. The new transfer functions are shown in Figure 2 and Figure 3. The pseudocode of the new BGOA is displayed in Algorithm 2.
Algorithm 2: Pseudocode of the new BGOA
1: Initialize C_max, C_min (the two extreme values of parameter c), Max_iter (the maximum number of iterations), and N (the population of grasshoppers)
2: Set the best solution as Target
3: while t ≤ Max_iter do
4:     Update c with Equation (6)
5:     for each agent do
6:         Normalize the distances between individuals to [1, 4]
7:         Calculate the conversion probability using Equation (9), (10), or (11)
8:         Update X_i using Equation (12) or Equation (13)
9:         Update Target if a better value is obtained
10:     end for
11:     t = t + 1
12: end while
13: Output Target
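To make Equations (9)–(13) concrete, the sketch below implements the three proposed transfer functions and the corresponding binary position updates in Python. It is a minimal sketch under our reading of Equations (9)–(11), not the authors' code; function and variable names are ours, and B = 20 as stated in the text.

```python
import numpy as np

B = 20  # step-size scaling factor proposed in this paper

def s1(x):
    """Sigmoid-based transfer function S1, Equation (9)."""
    return 1.0 / (1.0 + np.exp(-(17.0 / 15.0) * (1.0 + x)))

def s2(x):
    """Sigmoid-based transfer function S2, Equation (10)."""
    return (15.0 / 4.0) / (1.0 + np.exp(-(np.pi / 3.0) * x))

def v(x):
    """V-shaped transfer function, Equation (11)."""
    return 2.0 * (2.0 / 3.0) * np.abs(np.tanh(x)) / np.tanh(4.0)

def update_s(position, delta_x, transfer, rng=np.random.default_rng()):
    """S-shaped update, Equation (12): set each bit to 1 with probability transfer(B*delta_x)."""
    prob = transfer(B * delta_x)              # values above 1 behave as probability 1
    r = rng.random(position.shape)
    return (r < prob).astype(int)

def update_v(position, delta_x, rng=np.random.default_rng()):
    """V-shaped update, Equation (13): flip each bit with probability V(B*delta_x)."""
    prob = v(B * delta_x)
    flip = rng.random(position.shape) < prob
    return np.where(flip, 1 - position, position)
```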

4. Experimental Results

The validity of the new algorithm is verified in this section. There are many excellent test suites, such as the benchmarks of the BBOB workshop, which support algorithm developers and practitioners by automating benchmarking experiments for black-box optimization algorithms [45,46,47]. This manuscript uses 23 benchmark test functions to demonstrate the effectiveness of the proposed algorithm. Among them, f1–f7 are unimodal benchmark functions, f8–f13 are multimodal benchmark functions, and f14–f23 are fixed-dimension benchmark functions. The details of each test function are presented in Table 1, Table 2 and Table 3: Space is the search space of the population, Dim is the function's dimension, and TM is the theoretical optimum. The parameter settings required by the algorithms are listed in Table 4.
The improved algorithm is compared with BGOA, BPSO, and BGWO. The mean and standard deviation (std) of the results on the test functions are given in Table 5 and Table 6. If the improved algorithm performs better than or the same as the original one, the result is shown in bold. For example, for f12, the values obtained by BGOAS1, BGOAS2, and BGOAV are smaller than that of BGOA, so the first three values are shown in bold.
As can be seen, the improved algorithm has an obvious advantage over BGOA, BPSO, and BGWO in the mean fitness values obtained on the first 13 test functions, which indicates that the improved algorithm is more effective for high-dimensional problems. On the fixed-dimension functions, the values obtained by the six algorithms are almost the same, illustrating that the improved strategy is not especially advantageous for low-dimensional problems.
For the unimodal test functions, there is only one optimal solution, so they effectively check the convergence rate of the algorithms. Table 5 and Table 6 show that the proposed BGOAV outperforms the compared algorithms on all seven unimodal functions, with the smallest mean and std. On f2, f3, and f6, BGOAS1 and BGOAS2 also reach the best values among the considered algorithms.
Functions f8–f13 are multimodal test functions with many local optima, and they are suitable for testing an algorithm's ability to avoid local optima. BGOAS1, BGOAS2, and BGOAV perform well on these functions. BGOAV outperforms the other algorithms in both the mean and the standard deviation of the results. For f9 and f11, BGOAS1, BGOAS2, and BGOAV reach the theoretical optimum. For f8 and f12, the proposed methods are closer to the optimum than BPSO and BGWO. Moreover, for f13, the best result is obtained by BGOAV, BPSO, and BGWO. In other words, the proposed step-size strategy produces good results and prevents the algorithms from falling into local optima.
Functions f14–f23 are the fixed-dimension functions. It is evident from the results that the mean and standard deviation obtained by all algorithms are almost the same; only on f20 does BGOA obtain a value closer to the theoretical one. This shows that on the fixed-dimension functions the new algorithm has no special advantage over BGOA, BPSO, and BGWO, owing to the low dimensionality and simple structure of these functions, whereas the improved strategies are better suited to high-dimensional and complex problems.
To judge whether the results of the improved strategies differ from the best results of the other algorithms, the Wilcoxon rank-sum test and the Friedman test were performed at a 5% significance level. The null hypothesis is that there are no significant differences between the algorithms; if the p-value is smaller than 0.05, the null hypothesis is rejected. Table 7 shows that for f1–f4, the p-values obtained by BGOAS1, BGOAV, BGWO, and BPSO are smaller than 0.05, which means there is a significant difference between BGOAS1, BGOAV, and BGOA. The data in Table 8 show that the p-value is not greater than 0.05 for f1–f3 and f5–f13, which can be considered strong evidence against the null hypothesis and suggests a significant difference between these algorithms. This result illustrates that the new algorithm is superior to BGOA, BPSO, and BGWO on these 12 functions. Overall, the improved methods outperform the compared algorithms.
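For reference, both statistical tests can be run with SciPy as in the sketch below. The per-run fitness arrays are placeholders for illustration, not data from the paper.

```python
import numpy as np
from scipy.stats import ranksums, friedmanchisquare

# Hypothetical per-run fitness values (5 independent runs) for one benchmark function.
bgoa_s1 = np.array([10.0, 9.0, 11.0, 10.0, 12.0])
bgoa    = np.array([10.0, 12.0, 11.0, 9.0, 8.0])
bpso    = np.array([4.0, 5.0, 4.0, 6.0, 4.0])
bgwo    = np.array([3.0, 4.0, 2.0, 4.0, 3.0])

# Wilcoxon rank-sum test: pairwise comparison against the proposed variant.
stat, p = ranksums(bgoa_s1, bgoa)
print(f"rank-sum vs BGOA: p = {p:.4f}")

# Friedman test: joint comparison of all algorithms over the same runs.
stat, p = friedmanchisquare(bgoa_s1, bgoa, bpso, bgwo)
print(f"Friedman: p = {p:.4f}")
```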
It is easy to see that the improved strategies promote the exploration and exploitation of BGOA and heighten the competitiveness of the algorithm in finding optimal solutions. In the next section, this paper applies the improved algorithm to a real problem to study its practicability for FS.

5. Application of Feature Selection

Feature selection is a major function in the pre-processing stage of data mining. It can remove irrelevant and redundant data from the dataset [48]. Researchers usually pursue methods with high precision and few selected features. In this section, the improved strategies (BGOAS1, BGOAS2, BGOAV) are exploited in feature selection for classification problems. It is found that the improved strategies obtain better results and yield more accurate subsets of features.
Twenty-three datasets, each with different attributes and numbers of instances, are selected from the UCI machine learning repository [49]. In addition, this paper uses a wrapper-based method for feature selection. Detailed information on the 23 datasets is given in Table 9.
The K-Nearest Neighbor (KNN) algorithm is the most commonly used classification algorithm in data mining [50]. KNN is a supervised learning method with a simple mechanism: given a test sample, find the K nearest training samples according to some distance metric, and then use these K "neighbors" to make a prediction. Typically, majority voting is used, assigning the test sample to the class that occurs most frequently among its K neighbors. The distance between samples is generally measured by the Euclidean or Manhattan distance [51]. The computational method is shown in Equation (14):
L_p(x, y) = \left( \sum_{i=1}^{n} \left| x_i - y_i \right|^p \right)^{\frac{1}{p}},    (14)
where p is an adjustable constant. When p = 1, L denotes the Manhattan distance, and when p = 2, L denotes the Euclidean distance; x and y represent two different instances in the set, with $x_i$ and $y_i$ being their i-th components.
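A direct translation of Equation (14) into code might look as follows; the function name is ours.

```python
import numpy as np

def minkowski_distance(x, y, p=2):
    """L_p distance of Equation (14): p=1 gives Manhattan, p=2 gives Euclidean."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    return np.sum(np.abs(x - y) ** p) ** (1.0 / p)

# Example: Euclidean vs. Manhattan distance between two 3-feature instances.
print(minkowski_distance([1, 2, 3], [4, 6, 3], p=2))  # 5.0
print(minkowski_distance([1, 2, 3], [4, 6, 3], p=1))  # 7.0
```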
The basic idea of cross-validation is to split the original data into training and testing sets [52]: the former is used to train the model, and the testing set is used for validation. K-fold cross-validation divides all samples into K equally sized subsets and then traverses the K subsets in turn. Each time, the current subset is used as the validation set and all other samples as the training set to train and evaluate the model. Finally, the average of the K evaluations is taken as the final evaluation criterion of the model. 20 is the maximum value of K; generally, 10 is sufficient [53].
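As an illustration of the 10-fold procedure described above, the following sketch estimates a KNN error rate with scikit-learn. The K = 10 neighbors and 10 folds mirror the settings reported later in this paper, but the dataset and the snippet itself are ours and serve only as a stand-in.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)       # stand-in for one of the UCI datasets

knn = KNeighborsClassifier(n_neighbors=10)        # K = 10 neighbors
scores = cross_val_score(knn, X, y, cv=10)        # 10-fold cross-validation
error_rate = 1.0 - scores.mean()                  # average error over the 10 folds
print(f"10-fold CV error rate: {error_rate:.4f}")
```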
The error rate and accuracy of classification are crucial evaluation indicators in classification prediction. This paper uses Equation (15) as the fitness function:
Fitness = \mu \, errate(KNN) + (1 - \mu) \left( \frac{SeF}{AlF} \right),    (15)
where errate(KNN) denotes the classification error rate after K-fold cross-validation, as explained in Equation (16). The parameter μ is usually taken as 0.99, SeF is the number of features in the selected subset, and AlF is the total number of features in the dataset:
errate(KNN) = \frac{Enum}{Enum + Cornum},    (16)
where Enum and Cornum are the numbers of incorrectly and correctly classified samples, respectively.
It can be seen from Equation (15) that the fitness function aims to find the combination of features with the maximum classification performance and the minimum number of selected features. The task is converted into a minimization problem by using the error rate instead of the classification accuracy and the selected-feature ratio instead of the unselected-feature ratio.
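As a quick numerical check of Equations (15) and (16), a minimal sketch (names are ours) follows.

```python
def errate(e_num, cor_num):
    """Equation (16): error rate from the numbers of wrong and correct classifications."""
    return e_num / (e_num + cor_num)

def fitness(error_rate, n_selected, n_total, mu=0.99):
    """Equation (15): weighted sum of error rate and selected-feature ratio."""
    return mu * error_rate + (1.0 - mu) * n_selected / n_total

# Example: 5 errors out of 100 samples, 8 of 30 features selected.
print(fitness(errate(5, 95), 8, 30))   # about 0.0522
```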
The wrapper-based method for feature selection directly uses the performance of the final model as the evaluation criterion for feature subsets. In this paper, KNN is used as the classifier to assess the goodness of the selected features. The improved BGOAS1, BGOAS2, and BGOAV are used as search methods that adaptively search the feature space to achieve a higher feature evaluation criterion. A single dimension in the search space represents a feature, so a grasshopper's position represents a combination of features, i.e., a solution. Note that a higher feature evaluation criterion corresponds to a smaller fitness value in Equation (15).
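The wrapper evaluation described above can be sketched as follows: a binary grasshopper position selects columns of the dataset, and the KNN 10-fold error rate on that subset feeds Equation (15). The dataset, variable names, and empty-subset guard are assumptions for illustration only.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

def evaluate_position(position, X, y, mu=0.99):
    """Wrapper fitness of one binary position: KNN 10-fold error plus feature ratio."""
    mask = position.astype(bool)
    if not mask.any():                      # guard: an empty subset cannot be classified
        return 1.0
    knn = KNeighborsClassifier(n_neighbors=10)
    acc = cross_val_score(knn, X[:, mask], y, cv=10).mean()
    return mu * (1.0 - acc) + (1.0 - mu) * mask.sum() / mask.size

X, y = load_breast_cancer(return_X_y=True)            # stand-in dataset
rng = np.random.default_rng(0)
position = rng.integers(0, 2, X.shape[1])              # a random 0/1 feature mask
print(evaluate_position(position, X, y))
```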

6. Results of Feature Selection

The improved algorithms, BGOA, BPSO, and BGWO are applied to feature selection. All population sizes are set to 30, the number of iterations is 100, and each algorithm is run 15 times on each dataset. The value of K in KNN is taken as 10. Table 10 shows the feature selection fitness values, Table 11 records the numbers of selected features, and Table 12 describes the accuracy of feature selection. The Wilcoxon rank-sum and Friedman tests on the mean accuracy and fitness values are given in Table 13 and Table 14.
Table 10 shows that the new strategies have great advantages: the improved strategies obtain better results than the original BGOA on 15 datasets, and on the Air, Austra, Breast, and Segmentation datasets the new strategies outperform BPSO. Only on 5 datasets does the original BGOA obtain a better value, getting the best result on Appendicitis, Breast, Bupa, Diabetes, and Glass. The numbers of selected features presented in Table 11 also support the claim that the improved algorithm performs better than the compared algorithms. It is worth mentioning that BGOAS1 selects the smallest feature subset on 14 datasets, and on all 23 datasets the improved algorithm obtains results better than or equal to those of BGOA, BPSO, and BGWO. The accuracy of feature selection is shown in Table 12. BGOAS1 achieves exceptionally high accuracy on eight datasets, including Balancescale, Bupa, Cloud, Diabetes, and Heartstatlog; compared with the original algorithm, the accuracy is improved by about 3%. Accordingly, BGOAS2 and BGOAV obtain higher accuracy than BGOA on 10 and 6 datasets, respectively. Among them, the accuracy on the Vowel dataset reaches 1. On the Air, Appendicitis, Breast, WDBC, and Zoo datasets, the performance of the six algorithms is comparable.
Table 13 lists the Wilcoxon rank-sum test comparing the new strategies with the original BGOA. The values for the Air, Cleve, Segmentation, Thyroid, and Ecoli datasets are smaller than 5%. From the Friedman test in Table 14, the values for the WDBC, Bupa, Segmentation, Jain, Vowel, and Sonar datasets are smaller than 0.05. Therefore, it can be considered that there are significant differences between these algorithms. The results in these tables demonstrate the validity and feasibility of the improved algorithm.

7. Discussion

The binary grasshopper optimization algorithm solves discrete problems such as feature selection. This paper presented three improved versions of the binary grasshopper optimization algorithm for feature selection. A new step-size variable and three transfer functions were introduced to improve the algorithm's exploration capability in binary space. In addition, several tests on 23 benchmark functions were conducted to certify the algorithm's feasibility, and the improved algorithm shows preferable performance on high-dimensional functions. Subsequently, simulation experiments for feature selection were conducted on 23 UCI datasets, adopting KNN and 10-fold cross-validation to address the wrapper-based feature selection problem. The improved algorithms are more competitive than the original BGOA, BPSO, and BGWO regarding fitness values and selected subsets.
It should be noted that the method of this paper has so far been applied only to feature selection; it could also address other binary combinatorial optimization problems, such as task scheduling and the traveling salesman problem. Apart from that, many excellent benchmarks from the BBOB workshop may be very effective for the further improvement of BGOA, so more in-depth studies using such benchmarks to examine the algorithm will be conducted in the future. Finally, the improved algorithm does not perform well on low-dimensional functions, and the binary conversion increases the computing time; future work is expected to shorten the running time of the algorithm and to improve its ability to solve low-dimensional problems.

Author Contributions

Conceptualization, G.-L.W.; formal analysis, A.-Q.T.; methodology, T.L.; resources, S.-C.C.; writing—original draft, G.-L.W.; writing—review & editing, J.-S.P. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Publicly available datasets were analyzed in this study. This data can be found here: http://archive.ics.uci.edu/ml/index.php, (accessed on 5 September 2021).

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Alasadi, S.A.; Bhaya, W.S. Review of data preprocessing techniques in data mining. J. Eng. Appl. Sci. 2017, 12, 4102–4107. [Google Scholar]
  2. Hinchey, M.G.; Sterritt, R.; Rouff, C. Swarms and Swarm Intelligence. Computer 2007, 40, 111–113. [Google Scholar] [CrossRef]
  3. Gopika, N.; ME, A.M.K. Correlation based feature selection algorithm for machine learning. In Proceedings of the 2018 3rd International Conference on Communication and Electronics Systems (ICCES), Coimbatore, India, 15–16 October 2018; pp. 692–695. [Google Scholar]
  4. Kmimech, H.; Telmoudi, A.J.; Sliman, L.; Nabli, L. Genetic-based approach for minimum initial marking estimation in labeled Petri nets. IEEE Access 2020, 8, 22854–22861. [Google Scholar] [CrossRef]
  5. Bjerkevik, H.B.; Botnan, M.B.; Kerber, M. Computing the interleaving distance is NP-hard. Found. Comput. Math. 2019, 20, 1–35. [Google Scholar] [CrossRef] [Green Version]
  6. Ghaemi, M.; Feizi-Derakhshi, M.R. Feature selection using forest optimization algorithm. Pattern Recognit. 2016, 60, 121–129. [Google Scholar] [CrossRef]
  7. Sun, X.X.; Pan, J.S.; Chu, S.C.; Hu, P.; Tian, A.Q. A novel pigeon-inspired optimization with QUasi-Affine TRansformation evolutionary algorithm for DV-Hop in wireless sensor networks. Int. J. Distrib. Sens. Netw. 2020, 16, 1–15. [Google Scholar] [CrossRef]
  8. Song, P.C.; Chu, S.C.; Pan, J.S.; Yang, H. Simplified Phasmatodea population evolution algorithm for optimization. Complex Intell. Syst. 2021, 1–19. [Google Scholar] [CrossRef]
  9. Sun, L.; Koopialipoor, M.; Armaghani, D.J.; Tarinejad, R.; Tahir, M. Applying a meta-heuristic algorithm to predict and optimize compressive strength of concrete samples. Eng. Comput. 2019, 37, 1133–1145. [Google Scholar] [CrossRef]
  10. Stützle, T.; Dorigo, M. ACO algorithms for the traveling salesman problem. Evol. Algorithms Eng. Comput. Sci. 1999, 4, 163–183. [Google Scholar]
  11. Eberhart, R.; Kennedy, J. Particle swarm optimization. In Proceedings of the ICNN’95-International Conference on Neural Networks, Perth, WA, Australia, 27 November–1 December 1995; Volume 4, pp. 1942–1948. [Google Scholar]
  12. Chu, S.C.; Tsai, P.W.; Pan, J.S. Cat swarm optimization. In Pacific Rim International Conference on Artificial Intelligence; Springer: Berlin/Heidelberg, Germany, 2006; Volume 4099, pp. 854–858. [Google Scholar]
  13. Pan, J.S.; Tsai, P.W.; Liao, Y.B. Fish migration optimization based on the fishy biology. In Proceedings of the 2010 Fourth International Conference on Genetic and Evolutionary Computing, Shenzhen, China, 13–15 December 2010; pp. 783–786. [Google Scholar]
  14. Mirjalili, S.Z.; Mirjalili, S.; Saremi, S.; Faris, H.; Aljarah, I. Grasshopper optimization algorithm for multi-objective optimization problems. Appl. Intell. 2018, 48, 805–820. [Google Scholar] [CrossRef]
  15. Xue, J.; Shen, B. A novel swarm intelligence optimization approach: Sparrow search algorithm. Syst. Sci. Control Eng. 2020, 8, 22–34. [Google Scholar] [CrossRef]
  16. Dhiman, G.; Kumar, V. Seagull optimization algorithm: Theory and its applications for large-scale industrial engineering problems. Knowl.-Based Syst. 2019, 165, 169–196. [Google Scholar] [CrossRef]
  17. Meng, Z.; Pan, J.S. QUasi-Affine TRansformation Evolution with External ARchive (QUATRE-EAR): An enhanced structure for differential evolution. Knowl.-Based Syst. 2018, 155, 35–53. [Google Scholar] [CrossRef]
  18. Połap, D.; Woźniak, M. Polar bear optimization algorithm: Meta-heuristic with fast population movement and dynamic birth and death mechanism. Symmetry 2017, 9, 203. [Google Scholar] [CrossRef] [Green Version]
  19. Putri, D.A.; Kristiyanti, D.A.; Indrayuni, E.; Nurhadi, A.; Hadinata, D.R. Comparison of Naive Bayes Algorithm and Support Vector Machine using PSO Feature Selection for Sentiment Analysis on E-Wallet Review. J. Phys. Conf. Ser. 2020, 1641, 012085. [Google Scholar] [CrossRef]
  20. Tran, B.; Xue, B.; Zhang, M. A new representation in PSO for discretization-based feature selection. IEEE Trans. Cybern. 2017, 48, 1733–1746. [Google Scholar] [CrossRef] [PubMed]
  21. Tawhid, M.A.; Dsouza, K.B. Hybrid binary bat enhanced particle swarm optimization algorithm for solving feature selection problems. Appl. Comput. Inform. 2020, 6, 117–136. [Google Scholar] [CrossRef]
  22. Amoozegar, M.; Minaei-Bidgoli, B. Optimizing multi-objective PSO based feature selection method using a feature elitism mechanism. Expert Syst. Appl. 2018, 113, 499–514. [Google Scholar] [CrossRef]
  23. Hussien, A.G.; Hassanien, A.E.; Houssein, E.H.; Amin, M.; Azar, A.T. New binary whale optimization algorithm for discrete optimization problems. Eng. Optim. 2020, 52, 945–959. [Google Scholar] [CrossRef]
  24. Pan, J.S.; Hu, P.; Chu, S.C. Binary fish migration optimization for solving unit commitment. Energy 2021, 226, 120329. [Google Scholar] [CrossRef]
  25. Du, Z.G.; Pan, J.S.; Chu, S.C.; Chiu, Y.J. Improved Binary Symbiotic Organism Search Algorithm With Transfer Functions for Feature Selection. IEEE Access 2020, 8, 225730–225744. [Google Scholar] [CrossRef]
  26. Emary, E.; Zawbaa, H.M.; Grosan, C.; Hassenian, A.E. Feature subset selection approach by gray-wolf optimization. In Afro-European Conference for Industrial Advancement; Springer: Berlin/Heidelberg, Germany, 2015; Volume 334, pp. 1–13. [Google Scholar]
  27. Blanco, A.L.; Chaparro, N.; Rojas-Galeano, S. An urban pigeon-inspired optimiser for unconstrained continuous domains. In Proceedings of the 2019 8th Brazilian Conference on Intelligent Systems (BRACIS), Salvador, Brazil, 15–18 October 2019; pp. 521–526. [Google Scholar]
  28. Bolaji, A.L.; Babatunde, B.S.; Shola, P.B. Adaptation of binary pigeon-inspired algorithm for solving multidimensional knapsack problem. In Soft Computing: Theories and Applications; Springer: Berlin/Heidelberg, Germany, 2018; Volume 583, pp. 743–751. [Google Scholar]
  29. Pan, J.S.; Tian, A.Q.; Chu, S.C.; Li, J.B. Improved binary pigeon-inspired optimization and its application for feature selection. Appl. Intell. 2021, 51, 8661–8679. [Google Scholar] [CrossRef]
  30. Moazzami, M.; Hosseini, S.J.a.D.; Shahinzadeh, H.; Gharehpetian, G.B.; Moradi, J. SCUC Considering Loads and Wind Power Forecasting Uncertainties Using Binary Gray Wolf Optimization Method. Majlesi J. Electr. Eng. 2018, 12, 15–24. [Google Scholar]
  31. Safaldin, M.; Otair, M.; Abualigah, L. Improved binary gray wolf optimizer and SVM for intrusion detection system in wireless sensor networks. J. Ambient Intell. Humaniz. Comput. 2021, 12, 1559–1576. [Google Scholar] [CrossRef]
  32. Hu, P.; Pan, J.S.; Chu, S.C. Improved binary grey wolf optimizer and its application for feature selection. Knowl.-Based Syst. 2020, 195, 105746. [Google Scholar] [CrossRef]
  33. Meraihi, Y.; Gabis, A.B.; Mirjalili, S.; Ramdane-Cherif, A. Grasshopper optimization algorithm: Theory, variants, and applications. IEEE Access 2021, 9, 50001–50024. [Google Scholar] [CrossRef]
  34. Hichem, H.; Elkamel, M.; Rafik, M.; Mesaaoud, M.T.; Ouahiba, C. A new binary grasshopper optimization algorithm for feature selection problem. J. King Saud-Univ.-Comput. Inf. Sci. 2019, 34, 316–328. [Google Scholar] [CrossRef]
  35. Pinto, H.; Peña, A.; Valenzuela, M.; Fernández, A. A binary grasshopper algorithm applied to the knapsack problem. In Computer Science On-Line Conference; Springer: Berlin/Heidelberg, Germany, 2018; pp. 132–143. [Google Scholar]
  36. Mafarja, M.; Aljarah, I.; Faris, H.; Hammouri, A.I.; Ala’M, A.Z.; Mirjalili, S. Binary grasshopper optimisation algorithm approaches for feature selection problems. Expert Syst. Appl. 2019, 117, 267–286. [Google Scholar] [CrossRef]
  37. Roodschild, M.; Sardiñas, J.G.; Will, A. A new approach for the vanishing gradient problem on sigmoid activation. Prog. Artif. Intell. 2020, 9, 351–360. [Google Scholar] [CrossRef]
  38. Koyuncu, I. Implementation of high speed tangent sigmoid transfer function approximations for artificial neural network applications on FPGA. Adv. Electr. Comput. Eng. 2018, 18, 79–86. [Google Scholar] [CrossRef]
  39. Gao, Y.; Zhou, Y.; Luo, Q. An efficient binary equilibrium optimizer algorithm for feature selection. IEEE Access 2020, 8, 140936–140963. [Google Scholar] [CrossRef]
  40. Jamil, M.; Yang, X.S. A literature survey of benchmark functions for global optimisation problems. Int. J. Math. Model. Numer. Optim. 2013, 4, 150–194. [Google Scholar] [CrossRef] [Green Version]
  41. Tian, A.Q.; Chu, S.C.; Pan, J.S.; Cui, H.; Zheng, W.M. A compact pigeon-inspired optimization for maximum short-term generation mode in cascade hydroelectric power station. Sustainability 2020, 12, 767. [Google Scholar] [CrossRef] [Green Version]
  42. Tian, A.Q.; Chu, S.C.; Pan, J.S.; Liang, Y. A novel pigeon-inspired optimization based MPPT technique for PV systems. Processes 2020, 8, 356. [Google Scholar] [CrossRef] [Green Version]
  43. Saremi, S.; Mirjalili, S.; Lewis, A. Grasshopper optimisation algorithm: Theory and application. Adv. Eng. Softw. 2017, 105, 30–47. [Google Scholar] [CrossRef] [Green Version]
  44. Islam, M.J.; Li, X.; Mei, Y. A time-varying transfer function for balancing the exploration and exploitation ability of a binary PSO. Appl. Soft Comput. 2017, 59, 182–196. [Google Scholar] [CrossRef]
  45. Hansen, N.; Auger, A.; Ros, R.; Mersmann, O.; Tušar, T.; Brockhoff, D. COCO: A Platform for Comparing Continuous Optimizers in a Black-Box Setting. Optim. Methods Softw. 2021, 36, 114–144. [Google Scholar] [CrossRef]
  46. Finck, S.; Hansen, N.; Ros, R.; Auger, A. Real-Parameter Black-Box Optimization Benchmarking 2009: Presentation of the Noiseless Functions; Technical report; Citeseer: Princeton, NJ, USA, 2010. [Google Scholar]
  47. Hansen, N.; Auger, A.; Ros, R.; Finck, S.; Pošík, P. Comparing results of 31 algorithms from the black-box optimization benchmarking BBOB-2009. In Proceedings of the 12th Annual Conference Companion on Genetic and Evolutionary Computation, Portland, OR, USA, 7–11 July 2010; pp. 1689–1696. [Google Scholar]
  48. Hidaka, S.; Smith, L.B. Packing: A geometric analysis of feature selection and category formation. Cogn. Syst. Res. 2011, 12, 1–18. [Google Scholar] [CrossRef] [Green Version]
  49. Tanveer, M.; Gautam, C.; Suganthan, P.N. Comprehensive evaluation of twin SVM based classifiers on UCI datasets. Appl. Soft Comput. 2019, 83, 105617. [Google Scholar] [CrossRef]
  50. Chuang, L.Y.; Chang, H.W.; Tu, C.J.; Yang, C.H. Improved binary PSO for feature selection using gene expression data. Comput. Biol. Chem. 2008, 32, 29–38. [Google Scholar] [CrossRef]
  51. Emary, E.; Zawbaa, H.M.; Hassanien, A.E. Binary grey wolf optimization approaches for feature selection. Neurocomputing 2016, 172, 371–381. [Google Scholar] [CrossRef]
  52. Browne, M.W. Cross-validation methods. J. Math. Psychol. 2000, 44, 108–132. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  53. Jung, Y. Multiple predicting K-fold cross-validation for model selection. J. Nonparametr. Stat. 2018, 30, 197–215. [Google Scholar] [CrossRef]
Figure 1. Curve for the range of values.
Figure 2. BGOAS1 and BGOAS2 transfer functions.
Figure 3. BGOAV transfer function.
Table 1. Unimodal test functions.
Num | Name | Space | Dim | TM
1 | Sphere | [−100, 100] | 30 | 0
2 | Schwefel's function 2.21 | [−10, 10] | 30 | 0
3 | Schwefel's function 1.2 | [−100, 100] | 30 | 0
4 | Schwefel's function 2.221 | [−100, 100] | 30 | 0
5 | Rosenbrock | [−30, 30] | 30 | 0
6 | Step | [−100, 100] | 30 | 0
7 | Dejong's noisy | [−1.28, 1.28] | 30 | 0
Table 2. Multimodal test functions.
Num | Name | Space | Dim | TM
8 | Schwefel | [−500, 500] | 30 | −12,569
9 | Rastringin | [−5.12, 5.12] | 30 | 0
10 | Ackley | [−32, 32] | 30 | 0
11 | Griewank | [−600, 600] | 30 | 0
12 | Generalized penalized 1 | [−50, 50] | 30 | 0
13 | Generalized penalized 2 | [−50, 50] | 30 | 0
Table 3. Fixed-dimension test functions.
Num | Name | Space | Dim | TM
14 | Fifth of Dejong | [−65, 65] | 2 | 1
15 | Kowalik | [−5, 5] | 4 | 0.00030
16 | Six-hump camel back | [−5, 5] | 6 | −1.0316
17 | Branins | [−5, 5] | 2 | 0.398
18 | Goldstein–Price | [−2, 2] | 2 | 3
19 | Hartman 1 | [0, 10] | 3 | −3.86
20 | Hartman 2 | [0, 1] | 6 | −3.32
21 | Shekel 1 | [0, 1] | 4 | −10.1532
22 | Shekel 2 | [0, 1] | 4 | −10.4028
23 | Shekel 3 | [0, 1] | 4 | −10.5363
Table 4. Parameters and values.
Parameter | Value
Cmax | 1
Cmin | 0.00004
C1 | 2
C2 | 2
wmax | 0.9
wmin | 0.2
SigmaMax | 1
SigmaMin | 0.1
Vmax | 6
Vmin | −6
popnum | 30
Max_iter | 500
Table 5. The result of mean values.
Functions | BGOA_S1 | BGOA_S2 | BGOA_V | BGOA | BPSO | BGWO
f1 | 10.0000 | 9.0000 | 7.8000 | 10.0000 | 4.6000 | 3.2000
f2 | 0.0000 | 0.8000 | 0.0000 | 0.6000 | 0.0000 | 0.0000
f3 | 0.4000 | 0.0000 | 0.0000 | 0.2000 | 0.0000 | 0.0000
f4 | 0.2000 | 0.6000 | 0.0000 | 0.2000 | 0.0000 | 0.2000
f5 | 20.8000 | 2.4000 | 0.0000 | 20.0000 | 0.0000 | 0.0000
f6 | 2.8500 | 1.2500 | 1.2500 | 2.4500 | 1.2500 | 1.2500
f7 | 0.0062 | 0.2078 | 0.0016 | 1.0064 | 0.0055 | 0.0002
f8 | −4.2074 | −4.2074 | −4.2074 | −4.2074 | −3.8708 | −3.8708
f9 | 0.8000 | 0.6000 | 0.0000 | 0.4000 | 0.0000 | 0.2000
f10 | 1.0267 | 1.0267 | 8.88 × 10^−16 | 0.3422 | 8.88 × 10^−16 | 8.88 × 10^−16
f11 | 0.0861 | 0.0197 | 0.0000 | 0.0394 | 0.0000 | 0.0000
f12 | 4.1233 | 4.1862 | 4.1233 | 4.4846 | 4.1233 | 4.1233
f13 | 0.0400 | 0.0400 | 1.35 × 10^−32 | 0.0600 | 1.35 × 10^−32 | 1.35 × 10^−32
f14 | 12.6705 | 12.6705 | 12.6705 | 12.6705 | 12.6705 | 12.6705
f15 | 0.1484 | 0.1484 | 0.1484 | 0.1484 | 0.1484 | 0.1484
f16 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000
f17 | 27.7029 | 27.7029 | 27.7029 | 27.7029 | 27.7029 | 27.7029
f18 | 600.0000 | 600.0000 | 600.0000 | 600.0000 | 600.0000 | 600.0000
f19 | −0.3348 | −0.3348 | −0.3348 | −0.3348 | −0.3348 | −0.3348
f20 | −0.1343 | −0.1196 | −0.1657 | −0.0989 | −0.1657 | −0.1469
f21 | −4.2205 | −5.0552 | −5.0552 | −5.0552 | −5.0552 | −5.0552
f22 | −3.4172 | −5.0877 | −5.0877 | −5.0877 | −5.0877 | −5.0877
f23 | −5.1285 | −5.1285 | −5.1285 | −5.1285 | −5.1285 | −5.1285
If the improved algorithm works better than or the same as the original BGOA, then we put the good result in bold font.
Table 6. The result of std values.
Functions | BGOA_S1 | BGOA_S2 | BGOA_V | BGOA | BPSO | BGWO
f1 | 1.4142 | 1.2247 | 0.8367 | 1.4142 | 0.8944 | 1.4832
f2 | 0.0000 | 0.4472 | 0.0000 | 0.5477 | 0.0000 | 0.0000
f3 | 0.5477 | 0.0000 | 0.0000 | 0.4472 | 0.0000 | 0.0000
f4 | 0.5477 | 0.5477 | 0.0000 | 0.4472 | 0.0000 | 0.4472
f5 | 44.3080 | 2.1909 | 0.0000 | 44.7214 | 0.0000 | 0.0000
f6 | 0.8944 | 0.0000 | 0.0000 | 1.0954 | 0.0000 | 0.0000
f7 | 0.0046 | 0.4485 | 0.0018 | 1.4092 | 0.0053 | 0.0001
f8 | 0.4609 | 0.4609 | 0.0000 | 0.0000 | 0.0000 | 0.0000
f9 | 0.4472 | 0.5477 | 0.0000 | 0.5477 | 0.0000 | 0.4472
f10 | 0.9373 | 0.9373 | 0.0000 | 0.7653 | 0.0000 | 0.0000
f11 | 0.0887 | 0.0441 | 0.0000 | 0.0540 | 0.0000 | 0.0000
f12 | 0.0000 | 0.1405 | 0.0000 | 0.4947 | 0.0000 | 0.0000
f13 | 0.0548 | 0.0548 | 0.0000 | 0.0548 | 0.0000 | 0.0000
f14 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000
f15 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000
f16 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000
f17 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000
f18 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000
f19 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000
f20 | 0.0195 | 0.0215 | 0.0000 | 0.0272 | 0.0000 | 0.0615
f21 | 1.8663 | 2.2858 | 0.0000 | 2.2858 | 0.0000 | 0.0000
f22 | 1.8676 | 0.0000 | 0.0000 | 2.2877 | 0.0000 | 0.0000
f23 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000
If the improved algorithm works better than or the same as the original BGOA, then we put the good result in bold font.
Table 7. p-value of Wilcoxon rank-sum test.
Functions | BGOA_S1 | BGOA_S2 | BGOA_V | BPSO | BGWO
f1 | 0.0556 | 0.7619 | 0.0556 | 0.0079 | 0.0079
f2 | 0.5238 | 0.2063 | 0.0476 | 0.0476 | 0.0476
f3 | 0.7143 | 0.2063 | 0.0476 | 0.0476 | 0.0476
f4 | 0.5238 | 1 | 0.1667 | 0.1667 | 1
f5 | 1 | 1 | 1 | 1 | 1
f6 | 0.1667 | 1 | 1 | 1 | 1
f7 | 0.6905 | 0.8413 | 1 | 0.6905 | 0.6905
f8 | 1 | 1 | 0.4444 | 0.4444 | 0.4444
f9 | 1 | 1 | 0.4444 | 0.4444 | 0.4444
f10 | 0.5238 | 1 | 1 | 1 | 1
f11 | 0.0476 | 0.4444 | 1 | 1 | 1
f12 | 0.2857 | 0.5238 | 0.1667 | 0.1667 | 0.1667
f13 | 1 | 0.5238 | 0.4444 | 0.4444 | 0.4444
f14 | 1 | 1 | 1 | 1 | 1
f15 | 1 | 1 | 1 | 1 | 1
f16 | 1 | 1 | 1 | 1 | 1
f17 | 1 | 1 | 1 | 1 | 1
f18 | 1 | 1 | 1 | 1 | 1
f19 | 1 | 1 | 1 | 1 | 1
f20 | 0.3810 | 0.6825 | 0.1667 | 0.1667 | 1
f21 | 1 | 1 | 1 | 1 | 1
f22 | 1 | 1 | 1 | 1 | 1
f23 | 1 | 1 | 1 | 1 | 1
Table 8. The results of Friedman test.
Friedman | Sum of Squares | Degree of Freedom | Mean Squares | p-Value
f1 | 73.2 | 5 | 14.62 | 0.0006
f2 | 77.3 | 5 | 15.46 | 0.0004
f3 | 70.7 | 5 | 14.14 | 0.0011
f4 | 0 | 5 | 0 | 1
f5 | 80.3 | 5 | 16.06 | 0.0003
f6 | 72.1 | 5 | 14.42 | 0.0008
f7 | 73.1 | 5 | 14.62 | 0.0009
f8 | 69.7 | 5 | 13.94 | 0.0011
f9 | 78.1 | 5 | 15.62 | 0.0004
f10 | 72.8 | 5 | 14.56 | 0.0008
f11 | 71.5 | 5 | 14.3 | 0.0010
f12 | 73 | 5 | 14.6 | 0.0008
f13 | 78 | 5 | 15.6 | 0.0004
f14 | 0 | 5 | 0 | 1
f15 | 0 | 5 | 0 | 1
f16 | 0 | 5 | 0 | 1
f17 | 0 | 5 | 0 | 1
f18 | 0 | 5 | 0 | 1
f19 | 0 | 5 | 0 | 1
f20 | 1.55 | 5 | 0.31 | 0.4159
f21 | 6 | 5 | 1.2 | 0.0752
f22 | 1.5 | 5 | 0.3 | 0.4159
f23 | 1.5 | 5 | 0.3 | 0.4159
Table 9. The details of datasets.
S.no. | Datasets | Instances | Number of Classes (k) | Features of Each Class (d) | Size of Classes
1 | Air | 359 | 3 | 64 | 107, 103, 149
2 | Appendicitis | 106 | 2 | 7 | 21, 85
3 | Austra | 690 | 2 | 14 | 395, 295
4 | Balancescale | 625 | 3 | 4 | 49, 288, 288
5 | Blood | 748 | 2 | 4 | 570, 178
6 | Breast | 277 | 2 | 9 | 196, 81
7 | Breast_gy | 277 | 2 | 9 | 196, 81
8 | Bupa | 345 | 2 | 6 | 145, 200
9 | Cleve | 296 | 2 | 13 | 160, 139
10 | Cloud | 1024 | 2 | 10 | 627, 403
11 | Diabetes | 768 | 8 | 2 | 268, 500
12 | Ecoli | 336 | 8 | 8 | 143, 77, 2, 2, 259, 20, 5, 52
13 | Glass | 214 | 6 | 9 | 29, 76, 70, 17, 13, 9
14 | Heartstatlog | 270 | 2 | 13 | 150, 120
15 | Jain | 373 | 2 | 2 | 276, 97
16 | phoneme | 5404 | 2 | 5 | 15, 863, 818
17 | Robotnavigation | 5456 | 4 | 2 | 582, 620, 972, 205, 329
18 | Seeds | 210 | 3 | 7 | 70, 70, 70
19 | segmentation | 210 | 7 | 18 | 30, 30, 30, 30, 30, 30, 30
20 | Sonar | 208 | 2 | 60 | 97, 111
21 | Thyroid | 215 | 3 | 5 | 150, 35, 30
22 | Vowel | 871 | 6 | 3 | 72, 89, 172, 151, 207, 180
23 | zoo | 101 | 7 | 16 | 41, 20, 5, 13, 4, 7, 10
Table 10. The result of fitness value.
Dataset | BGOA_S1 | BGOA_S2 | BGOA_V | BGOA | BPSO | BGWO
Air | 0.07380 | 0.05781 | 0.08481 | 0.07234 | 0.08249 | 0.07068
Appendicitis | 0.13522 | 0.13996 | 0.14689 | 0.09244 | 0.14039 | 0.13982
Austra | 0.30696 | 0.31836 | 0.32391 | 0.32767 | 0.32356 | 0.32189
Balancescale | 0.22665 | 0.15280 | 0.19935 | 0.17176 | 0.18391 | 0.17874
WDBC | 0.04557 | 0.04067 | 0.04772 | 0.04700 | 0.05628 | 0.05696
Blood | 0.23635 | 0.23644 | 0.23634 | 0.23404 | 0.22684 | 0.23165
Breast | 0.24452 | 0.24101 | 0.25303 | 0.23613 | 0.24856 | 0.24672
Breast_gy | 0.20910 | 0.20922 | 0.22714 | 0.23690 | 0.21706 | 0.21775
Bupa | 0.33957 | 0.32704 | 0.36654 | 0.31300 | 0.33118 | 0.32752
Cleve | 0.17600 | 0.18140 | 0.20421 | 0.19038 | 0.17440 | 0.18164
Cloud | 0.00514 | 0.01155 | 0.01852 | 0.01020 | 0.01450 | 0.01450
Diabetes | 0.24661 | 0.25597 | 0.27359 | 0.24301 | 0.26049 | 0.26189
Segmentation | 0.11190 | 0.10644 | 0.11683 | 0.12310 | 0.11451 | 0.11996
Thyroid | 0.05067 | 0.05067 | 0.06083 | 0.08206 | 0.06728 | 0.06933
Heartstatlog | 0.16154 | 0.16343 | 0.19129 | 0.16682 | 0.15647 | 0.17342
Ecoli | 0.15588 | 0.13980 | 0.16796 | 0.16516 | 0.15877 | 0.16499
Glass | 0.57879 | 0.57890 | 0.60825 | 0.56943 | 0.57092 | 0.57003
Jain | 0.01000 | 0.01000 | 0.02855 | 0.05000 | 0.05000 | 0.05000
Vowel | 0.14382 | 0.14220 | 0.14974 | 0.17170 | 0.17422 | 0.17374
Seeds | 0.04764 | 0.05504 | 0.06730 | 0.04882 | 0.06160 | 0.06160
Sonar | 0.18299 | 0.18421 | 0.21231 | 0.17410 | 0.20257 | 0.19467
Balance scale | 0.21900 | 0.17074 | 0.25572 | 0.19357 | 0.19930 | 0.19926
Zoo | 0.04597 | 0.04576 | 0.05720 | 0.03232 | 0.05771 | 0.06155
If the improved algorithm works better than or the same as the original BGOA, then we put the good result in bold font.
Table 11. The number of selected features.
Dataset | BGOA_S1 | BGOA_S2 | BGOA_V | BGOA | BPSO | BGWO
Air | 33.66667 | 33.33333 | 31.00000 | 32.66667 | 36.00000 | 40.33333
Appendicitis | 5.66667 | 5.33333 | 1.00000 | 6.00000 | 8.33333 | 6.66667
Austra | 2.66667 | 4.33333 | 3.00000 | 2.33333 | 2.00000 | 2.00000
Balancescale | 2.33333 | 6.00000 | 0.33333 | 4.00000 | 4.66667 | 5.66667
WDBC | 3.33333 | 4.00000 | 3.66667 | 4.00000 | 4.00000 | 4.00000
Blood | 12.00000 | 15.00000 | 16.66667 | 3.33333 | 12.33333 | 13.66667
Breast | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.33333
Breast_gy | 3.33333 | 5.33333 | 4.66667 | 5.33333 | 4.00000 | 4.00000
Bupa | 2.33333 | 5.33333 | 5.66667 | 2.66667 | 4.00000 | 3.00000
Cleve | 3.33333 | 4.00000 | 4.66667 | 4.00000 | 3.66667 | 4.00000
Cloud | 5.00000 | 6.00000 | 6.33333 | 6.00000 | 6.33333 | 6.00000
Diabetes | 1.00000 | 2.33333 | 3.33333 | 1.33333 | 1.66667 | 1.66667
Segmentation | 4.00000 | 5.66667 | 4.00000 | 3.66667 | 2.33333 | 3.33333
Thyroid | 7.66667 | 10.00000 | 8.33333 | 8.00000 | 8.00000 | 8.33333
Heartstatlog | 2.66667 | 2.66667 | 2.33333 | 1.66667 | 1.66667 | 2.00000
Ecoli | 5.66667 | 8.00000 | 8.33333 | 5.00000 | 6.66667 | 6.66667
Glass | 5.00000 | 5.33333 | 5.33333 | 4.66667 | 4.33333 | 4.66667
Jain | 3.00000 | 3.66667 | 5.33333 | 2.66667 | 3.66667 | 5.00000
Vowel | 2.00000 | 2.00000 | 1.66667 | 2.00000 | 2.00000 | 2.00000
Seeds | 3.00000 | 3.00000 | 3.00000 | 3.00000 | 3.00000 | 3.00000
Sonar | 2.00000 | 2.66667 | 3.33333 | 2.00000 | 2.33333 | 2.33333
Balancescale | 13.66667 | 25.66667 | 28.33333 | 32.66667 | 26.66667 | 31.66667
Zoo | 3.33333 | 4.00000 | 2.66667 | 4.00000 | 4.00000 | 4.00000
If the improved algorithm works better than or the same as the original BGOA, then we put the good result in bold font.
Table 12. The accuracy of feature selection.
Dataset | BGOA_S1 | BGOA_S2 | BGOA_V | BGOA | BPSO | BGWO
Air | 0.93077 | 0.94687 | 0.91923 | 0.95072 | 0.94277 | 0.95877
Appendicitis | 0.95714 | 0.95714 | 0.94286 | 0.98571 | 0.96667 | 0.95714
Austra | 0.86726 | 0.86488 | 0.85595 | 0.92024 | 0.86726 | 0.86786
Balancescale | 0.69162 | 0.68275 | 0.67306 | 0.67012 | 0.67696 | 0.68247
WDBC | 0.77947 | 0.85576 | 0.80789 | 0.87183 | 0.85904 | 0.86448
Blood | 0.95801 | 0.96397 | 0.95741 | 0.97002 | 0.96239 | 0.96402
Breast | 0.76126 | 0.76118 | 0.76127 | 0.75364 | 0.76122 | 0.76055
Breast_gy | 0.75675 | 0.76254 | 0.74965 | 0.78263 | 0.76175 | 0.76368
Bupa | 0.79140 | 0.79465 | 0.77693 | 0.76623 | 0.79491 | 0.78833
Cleve | 0.66261 | 0.67639 | 0.63761 | 0.70561 | 0.68356 | 0.69033
Cloud | 0.82611 | 0.82143 | 0.79865 | 0.82389 | 0.84206 | 0.83310
Diabetes | 0.99581 | 0.99070 | 0.98466 | 0.99628 | 0.99351 | 0.99351
Segmentation | 0.75595 | 0.74860 | 0.72870 | 0.76832 | 0.74115 | 0.74626
Thyroid | 0.89127 | 0.89810 | 0.88667 | 0.89381 | 0.90286 | 0.89810
Heartstatlog | 0.95556 | 0.95556 | 0.94444 | 0.93556 | 0.95111 | 0.95333
Ecoli | 0.84123 | 0.84113 | 0.81326 | 0.84464 | 0.86228 | 0.84444
Glass | 0.84976 | 0.86649 | 0.83804 | 0.86123 | 0.86546 | 0.86141
Jain | 0.41873 | 0.41937 | 0.39159 | 0.41619 | 0.42048 | 0.42921
Vowel | 1.00000 | 1.00000 | 0.97958 | 1.00000 | 1.00000 | 1.00000
Seeds | 0.86483 | 0.86647 | 0.85885 | 0.87189 | 0.86924 | 0.86974
Sonar | 0.95476 | 0.94825 | 0.93683 | 0.96365 | 0.95270 | 0.95270
Balancescale | 0.81746 | 0.81825 | 0.79032 | 0.84540 | 0.81016 | 0.82286
Zoo | 0.78721 | 0.83763 | 0.74843 | 0.84887 | 0.84285 | 0.84288
If the improved algorithm works better than or the same as the original BGOA, then we put the good result in bold font.
Table 13. The result of Wilcoxon rank-sum test.
Dataset | BGOA_S1 | BGOA_S2 | BGOA_V | BPSO | BGWO
Air | 0.1746 | 0.4444 | 0.0079 | 0.0397 | 0.0079
Appendicitis | 0.5397 | 0.4762 | 0.6508 | 0.8095 | 0.2460
Austra | 0.7460 | 0.3095 | 0.8413 | 1.0000 | 0.4206
Balancescale | 0.5714 | 1.0000 | 0.6905 | 0.0952 | 0.0317
WDBC | 0.5873 | 1.0000 | 0.5714 | 0.4524 | 0.1746
Blood | 1.0000 | 0.0079 | 0.1508 | 0.1429 | 0.4603
Breast | 0.5714 | 0.4206 | 0.7302 | 0.0952 | 0.5873
Breast_gy | 1.0000 | 0.1667 | 0.6905 | 0.1508 | 0.3095
Bupa | 0.7460 | 0.2540 | 0.3095 | 0.0079 | 0.0079
Cleve | 0.7460 | 0.1508 | 0.0317 | 0.0079 | 0.0079
Cloud | 0.8571 | 0.3095 | 0.2381 | 0.8413 | 0.8413
Diabetes | 0.1508 | 0.4206 | 1.0000 | 0.1508 | 0.1508
Segmentation | 0.1349 | 0.0079 | 0.1508 | 0.0079 | 0.0317
Thyroid | 0.0714 | 1.0000 | 0.1190 | 0.4762 | 0.1190
Heartstatlog | 0.5476 | 0.6825 | 0.4365 | 0.0079 | 0.0159
Ecoli | 0.0635 | 0.5079 | 0.0556 | 0.0079 | 0.0079
Glass | 0.1349 | 0.8413 | 0.4206 | 0.0476 | 0.8016
Jain | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 1.0000
Vowel | 0.5476 | 0.3968 | 1.0000 | 0.0079 | 0.0079
Seeds | 0.9524 | 0.7063 | 0.2302 | 0.1190 | 0.0397
Sonar | 0.1984 | 0.5476 | 0.0952 | 0.0556 | 0.0317
Balancescale | 0.1508 | 0.3095 | 0.1032 | 0.0079 | 0.0079
Zoo | 0.4921 | 1.0000 | 1.0000 | 0.0476 | 0.2063
Table 14. The result of Friedman test.
Dataset | Sum of Squares | Degree of Freedom | Mean Squares | p-Value
Air | 5.2 | 2 | 2.6 | 0.2548
Appendicitis | 0.4 | 2 | 0.2 | 0.0916
Austra | 2.8 | 2 | 1.4 | 0.8557
Balancescale | 2.8 | 2 | 1.4 | 0.0823
WDBC | 0.7 | 2 | 0.35 | 0.0382
Blood | 3.6 | 2 | 1.8 | 0.4060
Breast | 1.9 | 2 | 0.95 | 0.1132
Breast_gy | 0.4 | 2 | 0.2 | 0.8995
Bupa | 3.6 | 2 | 1.8 | 0.0427
Cleve | 0 | 2 | 0 | 0.1257
Cloud | 0 | 2 | 0 | 0.1018
Diabetes | 0.3 | 2 | 0.15 | 0.4830
Segmentation | 4.9 | 2 | 2.45 | 0.0342
Thyroid | 4.8 | 2 | 2.4 | 0.0513
Heartstatlog | 0 | 2 | 0 | 0.6151
Ecoli | 0.4 | 2 | 0.2 | 0.2311
Glass | 1.2 | 2 | 0.6 | 0.0663
Jain | 2.8 | 2 | 1.4 | 0.0174
Vowel | 3.6 | 2 | 1.8 | 0.0427
Seeds | 2.8 | 2 | 1.4 | 0.6151
Sonar | 0.4 | 2 | 0.2 | 0.0427
Balancescale | 3.6 | 2 | 1.8 | 0.1546
Zoo | 1.9 | 2 | 0.95 | 0.0513
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
