Article

Intelligent Multi-Strategy Hybrid Fuzzy K-Nearest Neighbor Using Improved Hybrid Sine Cosine Algorithm

by Chengfeng Zheng 1, Mohd Shareduwan Mohd Kasihmuddin 1,*, Mohd. Asyraf Mansor 2, Ju Chen 1 and Yueling Guo 1

1 School of Mathematical Sciences, Universiti Sains Malaysia, Penang 11800, Malaysia
2 School of Distance Education, Universiti Sains Malaysia, Penang 11800, Malaysia
* Author to whom correspondence should be addressed.
Mathematics 2022, 10(18), 3368; https://doi.org/10.3390/math10183368
Submission received: 8 August 2022 / Revised: 4 September 2022 / Accepted: 5 September 2022 / Published: 16 September 2022
(This article belongs to the Special Issue Statistical Methods in Data Science and Applications)

Abstract

The sine cosine algorithm is a simple and effective population-based optimization method proposed in recent years and studied in many works in the literature. Based on the basic principle of the sine cosine algorithm, this paper studies the main parameters affecting its performance, integrates a reverse learning mechanism, adds an elite opposition solution and forms the hybrid sine cosine algorithm (hybrid SCA). Combining the fuzzy k-nearest neighbor method with the hybrid SCA, this paper numerically simulates two-class and multi-class datasets, obtains a large number of numerical results and analyzes them. The hybrid SCA FKNN proposed in this paper achieves good classification and prediction accuracy on 10 different types of datasets. Compared with SCA FKNN, LSCA FKNN, BA FKNN, PSO FKNN and SSA FKNN, its prediction accuracy is significantly improved. In the Wilcoxon signed rank test against SCA FKNN and LSCA FKNN, the null hypothesis (significance level 0.05) is rejected, and the two classifiers have significantly different accuracies.

1. Introduction

The swarm intelligence (SI) algorithm has gained attention from researchers in various fields of science. SI is currently used to provide solutions to various optimization problems. Applications of swarm intelligence include material technology [1,2,3], biological system modeling [3], train assembly, high-performance graphics cards [4], path planning [5] and robot control [6,7]. SI is based on the collective behavior of elements that self-organize in order to converge towards the solution of the optimization problem. Examples of popular SI algorithms include particle swarm optimization (PSO) [8], the artificial bee colony (ABC) [9], the gravitational search algorithm (GSA) [10] and the whale optimization algorithm (WOA) [11]. One of the main challenges in SI is the lack of profound theoretical analysis; a solid mathematical foundation is needed to assess robustness, computational complexity and parameter settings. All of these analyses are required to ensure that SI avoids converging to a local minimum solution, since a local minimum affects the optimality of the solution to the respective optimization problem.
The sine cosine algorithm (SCA) [12] is an SI algorithm proposed by Mirjalili in 2016. SCA was inspired by a mathematical model of the sine and cosine functions, whose oscillations are used to make the solution converge towards the optimum. The random and adaptive parameters in the algorithm balance exploration and exploitation during the search. Advantages of SCA include very few parameters, easy implementation, a simple structure, a fast convergence speed, strong parallelism and universality and good performance in practical applications. It has therefore attracted extensive attention from scholars in recent years.
In [13], an SCA with a nonlinearly decreasing conversion parameter was proposed, in which the change in parameter r1 is controlled by a parabolic function and an exponential function, respectively. The experimental results show that adjusting r1 with the exponential function better balances the global and local exploration of the solution. The study in [14] proposed a method of combining quantum computing with SCA by using quantum bits, rotation gates and a NOT gate, where each gate has its own specialization in exploring the solution; the proposed SCA was reported to be very effective and accurate. The study in [15] introduced reverse learning to generate reverse solutions for the current individual, which expands the exploration of the solution space. The study in [16] proposed a hybrid gray wolf optimizer with SCA that uses the sine cosine update formula to improve the moving direction and speed of the head wolf; by incorporating the benefits of SCA, the hybrid algorithm improves drastically in terms of exploration and exploitation. In another interesting work [17], SCA was combined with a differential evolution algorithm (DEA), where SCA was reported to help DEA jump out of local optimal regions. However, the applications of SCA in the above literature failed to classify more general datasets; the test of a robust metaheuristic depends on the ability of the algorithm to classify non-biased datasets.
In 2017, Rizk-Allah [18] proposed a sine cosine algorithm based on a multi-orthogonal search strategy (MOSCA) to solve engineering design problems. MOSCA addresses the unbalanced exploration and premature convergence of the conventional SCA: it uses SCA during the exploration phase and a multi-orthogonal search strategy to find the optimal solution in the search space, and was reported to converge faster with higher solution accuracy. In another development, Elaziz et al. [15] proposed an SCA based on reverse learning (OBSCA). A reverse learning strategy is an important method for enhancing the performance of stochastic optimization algorithms. By greedily selecting between the current solution and its reverse solution according to the objective function value, OBSCA enhances the diversity of the population and improves the ability of the algorithm to approach the global optimum. The reported experiments highlight the robustness of OBSCA in terms of convergence.
Long et al. [19] proposed an improved SCA (ISCA) for solving high-dimensional optimization problems. Inspired by PSO, ISCA introduces inertia weight to improve the convergence accuracy and speed of SCA. At the same time, it adopts a reverse learning strategy to generate the initial individuals, improving the diversity and quality of the population. The experimental results showed that, compared with the basic SCA, ISCA has better optimization performance on high-dimensional test functions. In 2018, Nenavath and Jatoth [20] proposed a hybrid SCA-DE algorithm based on differential evolution to solve optimization and target tracking problems; the experimental results show that SCA-DE has higher convergence accuracy and a faster convergence speed than basic SCA. In 2021, Wu et al. [21] combined an LSCA method with the FKNN method to solve biomedical problems; compared with other methods, the proposed LSCA obtained acceptable results, but its accuracy still requires improvement.
In this paper, we capitalize on the mathematical properties of SCA to balance global and local exploration during the search process. This is achieved by adaptively changing the amplitude of the sine and cosine functions until SCA converges towards the global optimum. In addition, reverse learning is used to provide a jump mechanism so that SCA can avoid unwanted local solutions. Both methods are integrated into a fuzzy k-nearest neighbor (FKNN) classifier capable of classifying real-life datasets. Thus, the contributions of this paper are as follows:
  • In this paper, reverse learning is implemented into the SCA model to form a hybrid SCA. In this context, the adaptive weight coupled with reverse learning alters the position of the solution towards the global solution.
  • The proposed hybrid SCA is implemented into a fuzzy k-nearest neighbor classifier (hybrid SCA-FKNN). In this context, the proposed hybrid SCA-FKNN has the ability to avoid local convergence by jumping out of the current non-optimal solution.
  • The performance of the proposed hybrid SCA-FKNN is tested on various real-life datasets and evaluated according to various performance metrics, such as accuracy, precision, sensitivity, specificity, the Matthews correlation coefficient and the Wilcoxon signed rank test. In addition, the proposed hybrid SCA-FKNN is compared with existing state-of-the-art classifiers.
The rest of this paper is organized as follows. Section 2 introduces in detail the content of the SCA model and FKNN classifier. Section 3 presents in detail the process of forming the hybrid SCA FKNN model based on the SCA model and FKNN classifier by adding a parameter adjustment and reverse learning mechanism. Section 4 introduces and analyzes 10 different data sets and evaluation indicators. Section 5 shows the prediction and classification results of 10 types of data sets under five models with extensive comparison analysis. Section 6 describes conclusions and further research.

2. Background

2.1. Sine Cosine Algorithm

The sine cosine algorithm is based on the mathematical characteristics of the sine and cosine functions and updates individuals through their oscillations. In SCA, assume a population of size n in j-dimensional space; in each iteration, the position of the i-th individual is updated as
$$X_i^j(t+1) = \begin{cases} X_i^j(t) + r_1 \times \sin(r_2) \times \left| r_3 X_{best}^j - X_i^j(t) \right|, & r_4 < 0.5 \\ X_i^j(t) + r_1 \times \cos(r_2) \times \left| r_3 X_{best}^j - X_i^j(t) \right|, & r_4 \geq 0.5 \end{cases} \quad (1)$$
where $X_i^j(t)$ is the position of the i-th individual in dimension j at iteration t; $X_{best}^j$ is the optimal position found so far in dimension j; $r_2$, $r_3$ and $r_4$ are uniformly distributed random numbers with $r_2 \in [0, 2\pi]$, $r_3 \in [0, 2]$ and $r_4 \in [0, 1]$; and $r_1$ is the control parameter
$$r_1 = a \left( 1 - \frac{t}{MaxFEs} \right) \quad (2)$$
where $MaxFEs$ represents the maximum number of iterations and a is a constant equal to 2.
The fluctuation amplitudes of $r_1 \times \sin(r_2)$ and $r_1 \times \cos(r_2)$ (the sine and cosine parameters) gradually attenuate as the iterations increase. When their values lie in $(1, 2]$ or $[-2, -1)$, the algorithm performs a global search of the solution space; when they lie in $[-1, 1]$, the algorithm performs local exploitation. The SCA algorithm flow is shown in Figure 1.
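To make the update concrete, the following Python sketch implements Equations (1) and (2) for a whole population. The function and variable names are ours, and drawing r2–r4 per dimension is one common implementation choice rather than a detail fixed by the paper.

```python
import numpy as np

def sca_step(X, X_best, t, max_fes, a=2.0):
    """One iteration of the basic SCA update (Equations (1) and (2)).

    X : (n, d) population, X_best : (d,) best position found so far.
    A sketch, not the authors' reference implementation.
    """
    r1 = a * (1.0 - t / max_fes)                   # Equation (2): linear decay
    r2 = np.random.uniform(0.0, 2.0 * np.pi, X.shape)
    r3 = np.random.uniform(0.0, 2.0, X.shape)
    r4 = np.random.uniform(0.0, 1.0, X.shape)
    dist = np.abs(r3 * X_best - X)                 # |r3 * X_best - X_i(t)|
    sine_move = X + r1 * np.sin(r2) * dist
    cosine_move = X + r1 * np.cos(r2) * dist
    return np.where(r4 < 0.5, sine_move, cosine_move)   # Equation (1)
```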

2.2. Fuzzy K-Nearest Neighbors (FKNN)

As one of the simplest classifiers, KNN infers the class of a sample from the classes of the K training samples closest to the sample to be classified. By default, each of the K neighbors carries the same weight, which is often not the case. The KNN algorithm (nearest neighbor method) was first proposed by Cover and Hart in 1967. Many researchers have since conducted in-depth theoretical research and development due to the low error rate of the nearest neighbor method, making it one of the important methods of pattern classification.
The fuzzy k-nearest neighbor (FKNN) algorithm was proposed by Keller et al. in 1985 [22]. They assigned different weighting coefficients to the k nearest neighbors and then used a fuzzy decision-making rule that takes the class label with the largest coefficient as the category of the test data. Because distance-based weight coefficients are used, the recognition performance is improved. Nevertheless, the choice of the neighborhood parameter K has a great impact on the recognition performance, so choosing appropriate parameters plays an important role in improving classification accuracy.
The fuzzy KNN algorithm builds on the KNN algorithm. It has the advantages of high calculation accuracy and no assumptions on the input data, and it is a relatively mature classifier. For a dataset, the membership of each training sample to each class is calculated by Equation (3):
$$U_{i,k} = \begin{cases} 0.51 + \dfrac{n_k}{K} \cdot 0.49, & k = Y_i \\ \dfrac{n_k}{K} \cdot 0.49, & k \neq Y_i \end{cases} \quad (3)$$
where $i = 1, 2, \ldots, N$ indexes the training samples and N is the number of training samples; $k = 1, 2, \ldots, M$ indexes the classes and M is the number of classes. $U_{i,k}$ is the membership of the i-th sample to the k-th class, K is the preset number of nearest neighbors, $Y_i$ is the class of the i-th training sample and $n_k$ is the number of the i-th sample's K nearest neighbors that belong to class k. The membership of a test sample should then satisfy the following:
$$U_k(x) = \frac{\sum_{j=1}^{K} U_{I_j,k} \left\| x - x_{I_j} \right\|^{-2/(m-1)}}{\sum_{j=1}^{K} \left\| x - x_{I_j} \right\|^{-2/(m-1)}} \quad (4)$$
where x is the test sample, $U_k(x)$ is the membership of the test sample to class k, $j = 1, 2, \ldots, K$ indexes the test sample's nearest neighbors, $I_j$ is the index in the training set of the j-th nearest neighbor, $U_{I_j,k}$ is the membership degree calculated by Equation (3) and $\| x - x_{I_j} \|$ is the distance measure. The fuzzy strength m controls the weight of each neighbor in the membership calculation, with $m \in (1, \infty)$. The predicted class is
$$C(x) = \arg\max_k U_k(x) \quad (5)$$
The calculation steps of FKNN are shown in Figure 2.
FKNN solves the problems of multivariate classification and distance weighting, while the SCA deals with the low search efficiency for the optimal solution that remains after distance weighting.
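The following Python sketch illustrates Equations (3)-(5): fuzzy label initialisation from the training data, followed by a fuzzy-weighted vote for a test point. All names, and the eps guard against zero distances, are our assumptions rather than the authors' code.

```python
import numpy as np

def init_memberships(X_train, y, K, n_classes):
    """Fuzzy label initialisation of the training set (Equation (3))."""
    N = len(y)
    U = np.zeros((N, n_classes))
    D = np.linalg.norm(X_train[:, None, :] - X_train[None, :, :], axis=2)
    for i in range(N):
        nbrs = np.argsort(D[i])[1:K + 1]           # K nearest, excluding self
        for k in range(n_classes):
            n_k = np.sum(y[nbrs] == k)             # neighbours in class k
            U[i, k] = 0.49 * n_k / K + (0.51 if y[i] == k else 0.0)
    return U

def fknn_predict(x, X_train, U, K=3, m=2.0, eps=1e-12):
    """Fuzzy-weighted vote for one test point (Equations (4) and (5))."""
    d = np.linalg.norm(X_train - x, axis=1)
    idx = np.argsort(d)[:K]                        # neighbours I_1..I_K
    w = 1.0 / (d[idx] ** (2.0 / (m - 1.0)) + eps)  # distance weights
    U_x = (U[idx] * w[:, None]).sum(axis=0) / w.sum()   # Equation (4)
    return int(np.argmax(U_x))                     # Equation (5)
```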

3. The Proposed Method

At the end of an iteration, SCA conducts a small neighborhood search near the current global optimal location and constantly tries to update the optimal solution. If the search process is far from the theoretical optimum, it is difficult for the algorithm to converge to the global optimal solution in a short time. Current research therefore roughly follows two ways to improve the convergence speed and accuracy of SCA: one is to improve the convergence speed by modifying Equation (1); the other is to improve the accuracy by adding reverse learning.
In terms of parameter adjustment, reference [23] introduced an adaptive weight coefficient adjustment mechanism, which achieved good results in jumping out of local convergence. Based on this parameter adjustment mechanism and combined with reverse learning, this paper forms a multi-strategy improved version of the sine cosine algorithm.
The combination of the swarm intelligence algorithm and the reverse learning strategy further expands the search scope of the population, thus alleviating its slow convergence speed and insufficient accuracy.

3.1. The Weight Factor

In this part, an adaptive weight w is used, which makes the individual's position have a greater impact on its moving direction and distance and effectively improves the exploitation ability of the algorithm. The value $w_{t+1}$ in a later iteration is roughly 100 times that of the previous iteration $w_t$, yielding an obvious stepped search. The mathematical model [23] of w is
$$w = \mu \times \sinh\left( 1 - \frac{20t}{MaxFEs} \right)^8 \quad (6)$$
where $\mu$ is the weight factor; in most cases, $\mu = 0.5$. Adding the weight w to the sine cosine update of Equation (1), we obtain:
$$X_i^j(t+1) = \begin{cases} w \left( X_i^j(t) + r_1 \times \sin(r_2) \times \left| r_3 X_{best}^j - X_i^j(t) \right| \right), & r_4 < 0.5 \\ w \left( X_i^j(t) + r_1 \times \cos(r_2) \times \left| r_3 X_{best}^j - X_i^j(t) \right| \right), & r_4 \geq 0.5 \end{cases} \quad (7)$$
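A minimal sketch of the adaptive weight follows. Because the grouping of Equation (6) is ambiguous in the extracted source, the code reads it as w = μ·sinh(1 − 20t/MaxFEs)^8; treat that reading, and the names, as assumptions.

```python
import numpy as np

def adaptive_weight(t, max_fes, mu=0.5):
    """Adaptive weight w of Equation (6), read here as
    w = mu * sinh(1 - 20*t/max_fes)**8 (the grouping is an assumption)."""
    return mu * np.sinh(1.0 - 20.0 * t / max_fes) ** 8

# Equation (7) then simply scales the basic move of Equation (1), e.g.
# X_new = adaptive_weight(t, max_fes) * sca_step(X, X_best, t, max_fes)
```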

3.2. Reverse Learning

In SCA, the individuals of the population rely only on the current optimal solution to update their own state, so the algorithm is likely to fall into a local optimum and become unable to find a satisfactory solution. At this point, it is necessary to carry out a local mutation on the individual: the individual reflects on its previous state in light of the current results so as to increase the probability of escaping from the local area. The formula of reflective learning is
$$X_i' = X_i^s + \omega \otimes \left( X_i^s - X_i^t \right) \quad (8)$$
where $X_i^t$ is the position of individual i at the t-th iteration; $X_i^s$ is the position after executing Equation (7); $X_i'$ is the new position generated through the reflection process; $\omega$ is a learning factor with $\omega \in [-1, 1]$; and $\otimes$ indicates element-wise (dot) multiplication.
In order to prevent too much randomness in the reflection process, the learning factor $\omega$ is controlled by Equation (9). At the same time, in order to avoid degrading the learning ability and to enhance the convergence of the algorithm, greedy selection is used to keep the better of the positions before and after reflection.
$$\omega = C^{-t/MaxFEs} \times \cos(r_5) \quad (9)$$
where $r_5$ is a random number on $[0, \pi]$ and C is a constant; the effect is best when C = 100.
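A sketch of the reflection step of Equations (8) and (9) follows; it assumes the reading ω = C^(−t/MaxFEs)·cos(r5), which is consistent with the stated range ω ∈ [−1, 1], and all names are ours.

```python
import numpy as np

def reflect(X_s, X_t, t, max_fes, C=100.0):
    """Reflective learning (Equations (8) and (9)).

    X_s : positions after the weighted move of Equation (7).
    X_t : positions at the start of iteration t.
    omega = C**(-t/max_fes) * cos(r5) keeps omega in [-1, 1]; the sign
    of the exponent is our reconstruction of the garbled source.
    """
    r5 = np.random.uniform(0.0, np.pi, X_s.shape)
    omega = C ** (-t / max_fes) * np.cos(r5)   # Equation (9)
    return X_s + omega * (X_s - X_t)           # Equation (8), elementwise
```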
In order to reduce the possibility of the algorithm deviating from the global optimal position, the exploitation of excellent individuals is strengthened. Searching the space around them is necessary, and this improvement increases the efficiency of the algorithm and its ability to explore new solutions. This paper therefore integrates an elite reverse learning strategy into SCA: the information of the elite population is used to search the space of elite individuals and their reverse solutions.
The specific operations [24] were as follows:
  • The individuals in the population were ranked by $fitness$, and the top 10% were selected to form the elite population $X^{best}$;
  • For each elite individual $X_i^{best} \in X^{best}$, the boundary $[lb_j^i, ub_j^i]$ and the dynamic boundary $[\min(lb_j^i), \max(ub_j^i)]$ were calculated;
  • The dynamic elite reverse population $X^{best'}$ of individual $X_i^{best}$ was generated according to Equation (10);
  • If a reverse individual in $X^{best'}$ exceeded the dynamic boundary $[\min(lb_j^i), \max(ub_j^i)]$, it was replaced by a new individual randomly generated within the boundary;
  • The top 50% of $[X^{best}, X^{best'}]$ were selected for the next generation according to $fitness$;
  • Steps 2–5 were repeated until the stop condition was reached, and the algorithm ended.
The elite inverse solution is defined in D-dimensional space. Let $X^{best} = (x_1, x_2, \ldots, x_D)$ be an elite individual and $X^{best'} = (x_1', x_2', \ldots, x_D')$ its inverse solution. The inverse solution is defined as
$$x_i' = k \left( lb_i + ub_i \right) - x_i \quad (10)$$
where $k \in [0, 1]$ is a uniformly distributed random number. Multiple inverse solutions of an elite individual can be generated by varying this coefficient.
The generated elite inverse solutions add useful information that helps the population converge to the global optimum, strengthen the exploration of the neighborhood around the optimal individual and improve the local exploitation ability of the algorithm.
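The six-step elite opposition procedure above can be sketched in Python as follows; the dynamic boundary is taken over the elite subpopulation, and all names are our assumptions.

```python
import numpy as np

def elite_opposition(pop, fit, fit_fn, frac=0.1):
    """Dynamic elite opposition-based learning (Equation (10)).

    pop : (n, d) population; fit : (n,) fitness values (higher is better);
    fit_fn : callable evaluating one individual. A sketch of the six-step
    procedure in Section 3.2, not the authors' code.
    """
    order = np.argsort(fit)[::-1]                       # best first
    n_elite = max(1, int(frac * len(pop)))
    elites = pop[order[:n_elite]]                       # step 1: top 10%
    lb, ub = elites.min(axis=0), elites.max(axis=0)     # step 2: dynamic bounds
    k = np.random.uniform(0, 1, elites.shape)
    opp = k * (lb + ub) - elites                        # step 3: Equation (10)
    out = (opp < lb) | (opp > ub)                       # step 4: repair
    opp[out] = np.random.uniform(lb, ub, elites.shape)[out]
    union = np.vstack([elites, opp])                    # step 5: keep top 50%
    scores = np.array([fit_fn(x) for x in union])
    return union[np.argsort(scores)[::-1][: len(union) // 2]]
```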

3.3. The Proposed Hybrid SCA FKNN Model

In this paper, $Fitness_i$ is equal to ACC, the accuracy of the FKNN classification obtained by k-fold cross-validation; five-fold validation was used. After combining the hybrid SCA and FKNN, the pseudocode of the hybrid SCA FKNN is shown in Algorithm 1. Figure 3 and Figure 4 show the operation flow of the whole hybrid SCA FKNN method in detail.
Algorithm 1: The hybrid SCA-FKNN.
while $t < MaxFEs$ do
   update $r_1, r_2, r_3, r_4$
   $w_t = \mu \sinh(1 - 20t/MaxFEs)^8$
   if $r_4 < 0.5$ then
      $X_i^s = w_t (X_i(t) + r_1 \times \sin(r_2) \times | r_3 X_{best} - X_i(t) |)$
      $X_i' = X_i^s + \omega \otimes (X_i^s - X_i^t)$
      if $f(X_i^s) > BF$ then
         if $f(X_i^s) < f(X_i')$ then
            $X_i^{t+1} = X_i'$
            $X_{best} = X_i^{t+1}$
            $BF = f(X_i')$
         else
            $X_i^{t+1} = X_i^s$
            $X_{best} = X_i^{t+1}$
            $BF = f(X_i^s)$
         end if
      end if
      for $i = 1$ to $SupN$ do
         generate random $k$, $X_i'(t+1) = k(lb + ub) - X_i(t+1)$
      end for
      put all $X_i'$ into the training dataset as elite opposition solutions
   else
      $X_i^s = w_t (X_i(t) + r_1 \times \cos(r_2) \times | r_3 X_{best} - X_i(t) |)$
      proceed as in the sine branch above
   end if
end while
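As a sketch of the fitness evaluation $f(\cdot)$ in Algorithm 1, the snippet below scores a candidate training set by five-fold cross-validated accuracy. Purely for brevity it uses scikit-learn's plain KNN as a stand-in for FKNN; substituting the FKNN sketch from Section 2.2 would be needed to reproduce the paper's setup.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.neighbors import KNeighborsClassifier

def fitness(X, y, k=3, folds=5, seed=0):
    """Fitness = mean five-fold cross-validated accuracy (the BF of
    Algorithm 1). Plain KNN stands in for FKNN here for brevity."""
    skf = StratifiedKFold(n_splits=folds, shuffle=True, random_state=seed)
    accs = []
    for tr, te in skf.split(X, y):
        clf = KNeighborsClassifier(n_neighbors=k).fit(X[tr], y[tr])
        accs.append(clf.score(X[te], y[te]))
    return float(np.mean(accs))
```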

4. Experiment and Discussion

4.1. Experiment Setup

In this section, the components of the experiment are described in detail. The purpose of the experiment is to show that the hybrid SCA FKNN method proposed in this paper can be applied to both two-class and multi-class datasets and achieves good numerical results, with strong adaptability and accuracy. To ensure reproducibility, the experimental setup is given below.

4.2. Benchmark Datasets

The datasets used in this paper are all open-source datasets; for details, please visit https://archive.ics.uci.edu/ml/index.php (accessed on 30 April 2022). For all datasets, the features were normalized to $[-1, 1]$ using maximum-minimum normalization. In order to better show the numerical results of the proposed method, this paper focused on numerical experiments on two-class and multi-class datasets.
To demonstrate the wide adaptability of the proposed method, 10 datasets were selected from different application scenarios and different data types. They cover a variety of use cases of practical significance, such as medical treatment, daily necessities and automobiles. As can be seen from Table 1 and Table 2, the sample sizes, features and categories of the 10 datasets cover different levels, so the effectiveness of the proposed method is verified from various angles, under different conditions and use scenarios.
For two-class datasets, this paper considered the following three datasets. Their basic properties, such as the number of label categories, sample size and number of features, are shown in Table 1.
For multi-class datasets, in order to further verify the effectiveness of the proposed method, this paper retrieved the following seven datasets from the open data platform of the University of California for numerical experiments. The contents of these datasets are described in Table 2. They involve multiple areas of life, which makes the verification of the method more convincing, and they range from few to many categories and from small-sample to large-sample data.
Table 2 describes the basic information of the seven multi-category datasets, including the category count, sample number, feature number and numbers of positive and negative samples: the caesarian section classification dataset, the Indian liver patient dataset (ILPD), the glass identification dataset, the user knowledge modeling dataset, the breast tissue dataset, the car dataset and the QCM sensor alcohol dataset. The content of these datasets covers many aspects of real life, with a wide range and complex data types: there are both large-sample and small-sample datasets, and both multi-feature and few-feature datasets.

4.3. Performance Metrics

The evaluation indicators used in the numerical experiments include the classification accuracy (ACC), sensitivity, precision, specificity and the Matthews correlation coefficient (MCC). Sensitivity measures the ability of the model to identify positive examples. Precision is the fraction of samples predicted as positive that are truly positive. Specificity measures the ability of the model to identify negative examples. MCC takes values in $[-1, 1]$, while the other indicators take values in $[0, 1]$; the larger an indicator, the better the performance of the model under that indicator.
For multi-class datasets, the class of interest is taken as the positive class and the other classes as the negative class. The values of the relevant evaluation indicators were then computed for each dataset.
Following [25,26,27], the true positives (TP) are the positive instances classified correctly, the false negatives (FN) are the positive instances classified incorrectly, the true negatives (TN) are the negative instances classified correctly and the false positives (FP) are the negative instances classified incorrectly. The basic confusion matrix of TP, FN, TN and FP is shown in Table 3.
By referring to the papers [25,26,27], this paper lists the evaluation indicators of accuracy (ACC), sensitivity, precision, specificity and Matthews correlation coefficient (MCC) as follows:
$$ACC = \frac{TP + TN}{TP + TN + FN + FP}$$
$$Precision = \frac{TP}{TP + FP}$$
$$Sensitivity = \frac{TP}{TP + FN}$$
$$Specificity = \frac{TN}{FP + TN}$$
$$MCC = \frac{TP \cdot TN - FP \cdot FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}}$$
The above evaluation indicators can comprehensively evaluate the performance of the proposed model.
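For reference, the five indicators can be computed directly from the confusion-matrix counts of Table 3, as in this sketch (function and key names are ours):

```python
import math

def classification_metrics(tp, fn, fp, tn):
    """Compute ACC, precision, sensitivity, specificity and MCC from the
    confusion-matrix counts of Table 3."""
    acc = (tp + tn) / (tp + tn + fn + fp)
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)          # true-positive rate
    specificity = tn / (fp + tn)          # true-negative rate
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {"ACC": acc, "Precision": precision, "Sensitivity": sensitivity,
            "Specificity": specificity, "MCC": mcc}

# Example with a toy confusion matrix
print(classification_metrics(tp=40, fn=10, fp=5, tn=45))
```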

4.4. Baseline Methods

In order to verify the hybrid SCA FKNN model, a large number of data experiments were conducted. Firstly, this paper tested the two-class datasets and made a numerical comparison against nine metaheuristic-based algorithms (LSCA [21], SCA [12], PSO [28], SSA [29], SA [30], BA [31], CGSCA [32], mSCA [33] and CESCA [34]), fixing M = 2 and K = 3, to verify the advantages of the proposed method. In order to ensure the fairness of the numerical experiments, each experiment was repeated five times on the same machine, and the average value and standard deviation of each model were analyzed. Each experiment used five-fold cross-validation, and the average of the five folds was taken for performance evaluation.

4.5. Experimental Design

All numerical experiments were run in MATLAB 2017. All experiments were conducted on the same workstation, an Intel Core i5 (Intel, Santa Clara, CA, USA) with 8 GB RAM running Windows 11 (Microsoft, Redmond, WA, USA), to avoid the impact of hardware differences during the simulations.

5. Results and Discussion

5.1. Numerical Results for Two-Classes Datasets

For the following three two-class datasets, the hybrid SCA FKNN method proposed in this paper is compared and analyzed against eight metaheuristic algorithms. Its superior performance on the evaluation metrics shows the advantages of the proposed method and further verifies its effectiveness.

5.1.1. Experimental Results on the Bupa Dataset

The numerical results of the hybrid SCA FKNN compared with the other models on the Bupa dataset are shown in Table 4. Ten repeated numerical experiments were carried out, and their average values and standard deviations are listed in Table 4. The hybrid SCA FKNN model proposed in this paper achieves the best results on four of the evaluation indicators, and its ACC is approximately 8.4–25.1% higher than the comparison models. Although the standard deviation of LSCA-FKNN is lower than that of hybrid SCA FKNN in most cases, the numerical results of hybrid SCA FKNN are significantly better than LSCA FKNN in terms of average evaluation index values.
In order to show the overall benchmarking analysis results of each model, Figure 5 draws a bar graph of the performance of each model, draws the average value of 10 repeated experiments of each model and adds the standard deviation of repeated experiments as the error line. Figure 5 is a visual display of Table 4. As shown in Figure 5, except for precision, good results have been achieved in sensitivity, specificity and MCC. It can be clearly seen from the figure that the hybrid SCA FKNN method has better numerical results and stronger stability.

5.1.2. Experimental Results on the Hepatitis Dataset

As with the Bupa dataset above, Table 5 shows the benchmarking results on the hepatitis dataset against the other models. The hybrid SCA FKNN model proposed in this paper achieves the best results on three of the evaluation indicators. In terms of sensitivity and precision, the hybrid SCA FKNN performs worse than BA FKNN and CGSCA FKNN, but it is significantly better in terms of ACC, specificity and MCC. The hybrid SCA FKNN model obtains better results on ACC, which are approximately 15.2–19.6% higher than the comparison models.
Similarly, in order to further visualize the comparison of the five types of evaluation indicators, Figure 6 shows the performance of each model more intuitively. For this data set, the data size of the positive samples is small, at only 32 data, so it has a great impact on the sensitivity and MCC. In general, the hybrid SCA FKNN model has competitive advantages.
ACC is an index used to describe the accuracy of the model. The higher the value of ACC, the better the prediction result of the model. For the current hepatitis dataset, Table 5 is obtained according to the comparison with the numerical results in paper [21]. As shown in Table 5 and Figure 6, the hybrid-FKNN model proposed in this paper obtains better results in ACC, which are approximately 15.2–19.6% higher than the comparison model. However, its variance is high and its stability is poor. The data results are greatly affected by random data sampling.

5.1.3. Experimental Results on the SPECT Dataset

Similarly, Table 6 shows the benchmarking results on the SPECT dataset for the evaluation indicators of ACC, sensitivity, precision and MCC. In terms of sensitivity and precision, the hybrid SCA FKNN performs worse than BA FKNN and LSCA FKNN. BA FKNN has a high sensitivity and low specificity, with a marked gap between its ability to recognize positive and negative cases. However, in terms of ACC, specificity and MCC, the hybrid SCA FKNN is significantly better.
Figure 7 shows the performance of the various models on the different evaluation indicators. The sensitivity, specificity and MCC of the models vary greatly, which may be due to the small number of negative samples (only 55); cross validation therefore has a great impact on sensitivity and MCC. The figure shows that the hybrid SCA FKNN method is quite competitive on the SPECT dataset.
For the current SPECT dataset, Table 6 is likewise obtained by the hybrid SCA FKNN. As shown in Table 6, the hybrid SCA FKNN model obtains better results in ACC, approximately 12.1–14.3% higher than the comparison models. As shown in Figure 7, apart from sensitivity and precision, the best results were also achieved in specificity and MCC, and the sensitivity and precision results remain satisfactory. The numerical results are quite competitive.
To further verify the effectiveness of the proposed method, Appendix A reports the experimental results obtained on the hepatitis, Bupa and SPECT datasets under different maximum iteration counts and different numbers of cross-validation folds, for researchers' reference.

5.2. Numerical Results for Multi-Classes Datasets

For the seven datasets from different application scenarios and of different types described above, the following results were obtained. As shown in Table 7, for multi-class datasets the prediction accuracy lies between 0.65 and 0.90, and the hybrid SCA FKNN method still achieves good results. It predicts well on datasets with many or few features and with many or few samples, showing strong adaptability.
In order to better demonstrate the effectiveness of the method proposed in this paper, this paper compares the numerical results of the above seven different data sets calculated by the LSCA FKNN [21], SCA FKNN [12], PSO FKNN [28], BA FKNN [31] and SSA FKNN [29] methods. The relevant numerical comparison results are shown in Table 8, Table 9, Table 10, Table 11, Table 12, Table 13 and Table 14. From the five evaluation indicators, the hybrid SCA FKNN method proposed in this paper achieved good numerical results under seven datasets.
For the current caesarian section classification dataset, Table 8 is also obtained from hybrid SCA FKNN. As shown in Table 8, the hybrid SCA FKNN model achieved better results in ACC, sensitivity and MCC, which were approximately 5.2–21.7% higher than the comparison model in ACC.
For the Indian liver patient dataset (ILPD), Table 9 shows the data results obtained by the hybrid SCA FKNN method and other numerical models. As shown in Table 9, the hybrid SCA FKNN model achieved better results in ACC, precision, specificity and MCC, which were approximately 10.9–23.2% higher than the comparison model in ACC.
For the glass dataset, Table 10 shows the data results obtained by the hybrid SCA FKNN method and the other numerical models. As shown in Table 10, the hybrid SCA FKNN model achieved better results in ACC, precision, sensitivity, specificity and MCC, with ACC approximately 8.6–46.3% higher than the comparison models.
For the user knowledge modeling dataset, Table 11 shows the data results obtained by the hybrid SCA FKNN method and the other numerical models. As shown in Table 11, the hybrid SCA FKNN model achieved better results in ACC, precision, sensitivity and specificity, with ACC approximately 7.2–66.2% higher than the comparison models.
For the breast tissue dataset, Table 12 shows the data results obtained by the hybrid SCA FKNN method and the other numerical models. As shown in Table 12, the hybrid SCA FKNN model achieved better results in precision, sensitivity, specificity and MCC, while its ACC was slightly lower than that of SSA-FKNN but higher than the other comparison models.
For the car dataset, Table 13 shows the data results obtained by the hybrid SCA FKNN method and other numerical models. As shown in Table 13, the hybrid SCA FKNN model achieved better results in ACC, specificity and MCC, which were approximately 1.3–8.6% higher than the comparison model in ACC.
For the QCM sensor alcohol dataset, Table 14 shows the data results obtained by the hybrid SCA FKNN method and other numerical models. As shown in Table 14, the hybrid SCA FKNN model achieved better results in ACC, precision, sensitivity and MCC, which were twice as large as the results of PSO FKNN and BA FKNN in ACC.
In order to better verify the proposed method, this paper analyzed the run process on five of the datasets. The change in best fitness reflects the convergence of the run: if the best fitness stops changing during the cycle, the prediction results in general no longer change, and if it stabilizes after only a few cycles, the method converges quickly.
From Figure 8, it can be seen that the optimal fitness will start to be in a stable state at approximately 12–23 iterations; that is, in the next cycle, the data of the training set will not be optimized, and the method will be in a convergent state. As shown in Figure 8, after the first iteration, the best fitness of the current training set will be obtained, and the current best fitness will not be equal to 0. The value of this best fitness gradually stabilizes with the increase in the number of iterations, but it does not start from 0. For the car dataset, QCM sensor alcohol dataset and glass identification dataset, the best fitness converges quickly and converges to a stable value at approximately the 10th iteration. For the ILPD and the breast tissue dataset, it also converges to a stable value at approximately the 15th iteration.
With the increase in the number of iterations, the best fitness gradually increases on the car dataset. When the number of iterations reaches 15, the best fitness gradually stabilizes to 0.8543. With the update and optimization of the car training dataset, the data prediction result of the car test set reaches 0.8514. For the QCM sensor alcohol dataset, when the number of iterations reaches 10, the optimal fitness reaches a stable state at 0.8400, and the data prediction result of the QCM sensor alcohol test set reaches 0.8520. For the glass identification dataset, when the number of iterations reaches 12, the optimal fitness reaches a stable state at 0.7256, and the data prediction result of the glass identification test set reaches 0.7364. The classification accuracies of the above three data sets in the cycle process, best fitness and test data sets are roughly the same, which shows that this method can find convergent data points in the simulation process effectively and obtain more accurate results.
For the ILPD, when the number of iterations reaches 20, the optimal fitness reaches a stable state at 0.7256, and the data prediction result of the ILPD test set reaches 0.6609. For the breast tissue dataset, when the number of iterations reaches 15, the optimal fitness reaches a stable state at 0.7256, and the data prediction result of the breast tissue test set reaches 0.6475. For the above two datasets, although the accuracy of the method is slightly lower than that of best fitness, the method achieves better results in precision and stability.
According to Figure 8, it can be clearly seen that the method proposed in this paper has a good convergence effect and achieves the purpose of prediction results when fewer cycles are required.
For a further comparison with LSCA-FKNN and SCA-FKNN over all 10 datasets, this paper ran the Wilcoxon signed rank test for hybrid SCA FKNN. The null hypothesis is that there is no difference between the accuracies of the two classifiers. Based on the accuracy results reported in this paper, we reject the null hypothesis (significance level 0.05) and accept that the classifiers have significantly different accuracies. This result confirms the advantages of hybrid SCA FKNN.
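A sketch of the significance test follows, using scipy.stats.wilcoxon on paired per-dataset accuracies; the numbers shown are placeholders for illustration only, not the paper's results.

```python
from scipy.stats import wilcoxon

# Paired accuracies of two classifiers over the same datasets
# (placeholder values for illustration only).
acc_hybrid = [0.78, 0.95, 0.89, 0.70, 0.80, 0.78, 0.86, 0.66, 0.88, 0.90]
acc_lsca = [0.62, 0.82, 0.76, 0.67, 0.74, 0.72, 0.80, 0.61, 0.85, 0.84]

stat, p_value = wilcoxon(acc_hybrid, acc_lsca)
print(f"W = {stat}, p = {p_value:.4f}")
if p_value < 0.05:
    print("Reject the null hypothesis: the accuracies differ significantly.")
```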

6. Conclusions

The hybrid SCA FKNN algorithm proposed in this paper is based on the improved SCA combined with reverse learning and an FKNN classifier; it is a further combination of a swarm intelligence algorithm and a classifier. This multi-strategy hybrid algorithm further optimizes the sine cosine algorithm, makes it easier to jump out of local convergence and obtains more accurate numerical solutions. In implementing the hybrid SCA into FKNN, this paper mainly uses the FKNN classifier to compute the prediction accuracy by cross validation as the current best fitness and iteratively optimizes the training dataset in order to obtain a more accurate classification. In this way, the training set population can be optimized until it cannot be improved further and the numerical value converges, which greatly improves the accuracy of the numerical results. After comparing the numerical results of the hybrid SCA FKNN method with five other methods on 10 datasets, and through the Wilcoxon signed rank test against SCA-FKNN and LSCA-FKNN, the numerical results were shown to be significantly improved.
In the next step, we will further consider logic mining to improve the method [27,35,36,37,38] and integrate multiple patterns to optimize swarm intelligence algorithms and obtain a more efficient method. In the process of using FKNN to calculate fitness, the method repeatedly calculates pairwise distances, requiring a large number of numerical calculations and a large amount of time; a more efficient classifier will therefore be considered to reduce the time cost. We will also further optimize the model in combination with spiral Gaussian mutation [39] to improve the accuracy.

Author Contributions

C.Z.: conceptualization, methodology, software, data curation, writing—original draft preparation; M.S.M.K.: supervision, visualization, investigation, review; M.A.M.: review, editing, supervision; J.C.: writing, validation; Y.G.: review, editing. All authors have read and agreed to the published version of the manuscript.

Funding

This research is fully funded and supported by Universiti Sains Malaysia, Short Term Grant, 304/PMATHS/6315655.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Please refer to data availability statements in the UC Irvine Machine Learning Repository at https://archive.ics.uci.edu/ml/ (accessed on 30 April 2022).

Acknowledgments

We would like to acknowledge “Universiti Sains Malaysia, Short Term Grant, 304/PMATHS/6315655” for the support and funding.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

Notation: Explanation
Hybrid SCA: The hybrid algorithm proposed based on the sine cosine algorithm and reverse learning
SCA: Sine cosine algorithm
LSCA: The linear population size reduction sine and cosine algorithm
PSO: Particle swarm optimization
BA: Bat algorithm
SSA: Sparrow search algorithm
SA: Salp swarm algorithm
CGSCA: Cauchy and Gaussian sine cosine optimization
FKNN: Fuzzy k-nearest neighbor

Appendix A

Table A1. Numerical results for three two-class datasets at different MaxFEs with K = 5-fold cross-validation.

MaxFEs  Dataset            Metric  ACC     Precision  Sensitivity  Specificity  MCC
5       Hepatitis dataset  avg     0.8280  0.5417     0.2142       0.9583       0.2426
                           std     0.0660  0.1021     0.1241       0.0417       0.1102
        Bupa dataset       avg     0.6663  0.6245     0.4947       0.7913       0.2985
                           std     0.0366  0.0625     0.0949       0.0367       0.0738
        SPECT dataset      avg     0.5934  0.5667     0.7164       0.4515       0.1978
                           std     0.1287  0.1115     0.2763       0.1400       0.2842
10      Hepatitis dataset  avg     0.8017  0.6667     0.2564       0.9420       0.2785
                           std     0.0501  0.0946     0.0943       0.1004       0.0954
        Bupa dataset       avg     0.6979  0.7000     0.5122       0.8364       0.3720
                           std     0.0483  0.0486     0.0643       0.0303       0.0514
        SPECT dataset      avg     0.5934  0.5515     0.7655       0.4344       0.2248
                           std     0.1077  0.0973     0.1998       0.1534       0.2218
20      Hepatitis dataset  avg     0.8526  0.5833     0.4500       0.9472       0.4149
                           std     0.0412  0.0174     0.0500       0.0273       0.0540
        Bupa dataset       avg     0.7576  0.6500     0.5909       0.8409       0.4429
                           std     0.0582  0.0284     0.0299       0.0283       0.0303
        SPECT dataset      avg     0.6127  0.5681     0.7524       0.4601       0.2353
                           std     0.1038  0.0951     0.0999       0.0706       0.0937
Table A2. Numerical results for three two-class datasets with different-fold cross-validation when MaxFEs = 5.

Fold Cross-Validation  Dataset            Metric  ACC     Precision  Sensitivity  Specificity  MCC
K = 3                  Hepatitis dataset  avg     0.8065  0.5000     0.2001       0.9872       0.2418
                                          std     0.0559  0.1421     0.0854       0.0222       0.1120
                       Bupa dataset       avg     0.6338  0.6775     0.5295       0.8276       0.3751
                                          std     0.0692  0.1268     0.1104       0.0599       0.1517
                       SPECT dataset      avg     0.6617  0.6553     0.7456       0.5931       0.3701
                                          std     0.1165  0.0998     0.2663       0.1623       0.2376
K = 5                  Hepatitis dataset  avg     0.7957  0.5222     0.2762       0.9338       0.2673
                                          std     0.0186  0.1347     0.1288       0.0426       0.0877
                       Bupa dataset       avg     0.6663  0.6245     0.4950       0.7913       0.2985
                                          std     0.0366  0.0625     0.0949       0.0367       0.0738
                       SPECT dataset      avg     0.5934  0.5667     0.7164       0.4515       0.1978
                                          std     0.1287  0.1115     0.2763       0.1400       0.2842
K = 8                  Hepatitis dataset  avg     0.8065  0.7222     0.2050       0.9725       0.3000
                                          std     0.0645  0.2546     0.0556       0.0242       0.0343
                       Bupa dataset       avg     0.6717  0.6310     0.4786       0.8075       0.3033
                                          std     0.0544  0.0938     0.0879       0.0456       0.1204
                       SPECT dataset      avg     0.5826  0.5176     0.8714       0.4172       0.3164
                                          std     0.1134  0.2038     0.1384       0.2307       0.1509

References

  1. Wang, C.-N.; Yang, F.-C.; Nguyen, V.T.T.; Nguyen, Q.M.; Huynh, N.T.; Huynh, T.T. Optimal design for compliant mechanism flexure hinges: Bridge-type. Micromachines 2021, 12, 1304.
  2. Nguyen, T.V.; Huynh, N.-T.; Vu, N.-C.; Kieu, V.N.; Huang, S.-C. Optimizing compliant gripper mechanism design by employing an effective bi-algorithm: Fuzzy logic and ANFIS. Microsyst. Technol. 2021, 27, 3389–3412.
  3. Chau, N.L.; Dao, T.-P.; Nguyen, V.T.T. Optimal design of a dragonfly-inspired compliant joint for camera positioning system of nanoindentation tester based on a hybrid integration of Jaya-ANFIS. Math. Probl. Eng. 2018, 2018, 8546095.
  4. Liu, H.; Wen, Z.; Cai, W. FastPSO: Towards efficient swarm intelligence algorithm on GPUs. In Proceedings of the ICPP 2021: 50th International Conference on Parallel Processing, Lemont, IL, USA, 9–12 August 2021.
  5. Du, Y.; Chen, W.; Fan, B. Research and application of swarm intelligence algorithm in path planning. Electron. Meas. Technol. 2016, 39, 65–70.
  6. Duan, H.; Qiao, P. Pigeon-inspired optimization: A new swarm intelligence optimizer for air robot path planning. Int. J. Intell. Comput. Cybern. 2014, 7, 24–37.
  7. Xu, H.; Guan, H.; Liang, A.; Yan, X. A multi-robot pattern formation algorithm based on distributed swarm intelligence. In Proceedings of the 2010 Second International Conference on Computer Engineering and Applications, Bali, Indonesia, 19–21 March 2010; IEEE: Piscataway, NJ, USA, 2010; Volume 1, pp. 71–75.
  8. Verma, O.P.; Gupta, S.; Goswami, S.; Jain, S. Opposition based modified particle swarm optimization algorithm. In Proceedings of the 2017 8th International Conference on Computing, Communication and Networking Technologies (ICCCNT), Delhi, India, 3–5 July 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 1–6.
  9. Wang, X.; Li, Z.Y.; Xu, G.Y.; Yan, W. Artificial bee colony algorithm based on chaos local search operator. J. Comput. Appl. 2012, 32, 1033–1036.
  10. Xu, Y.; Zhou, J.; Xue, X.; Fu, W.; Li, C. An adaptively fast fuzzy fractional order PID control for pumped storage hydro unit using improved gravitational search algorithm. Energy Convers. Manag. 2016, 111, 67–78.
  11. Kaveh, A.; Dadras, A. A novel meta-heuristic optimization algorithm: Thermal exchange optimization. Adv. Eng. Softw. 2017, 110, 69–84.
  12. Mirjalili, S. SCA: A sine cosine algorithm for solving optimization problems. Knowl.-Based Syst. 2016, 96, 120–133.
  13. Liu, Y.; Ma, L. Sine cosine algorithm with nonlinear decreasing conversion parameter. Comput. Eng. Appl. 2017, 53, 1–5.
  14. Cong, C.; Liang, M.; Yong, L. Quantum sine cosine algorithm for function optimization. Appl. Res. Comput. 2017, 34, 3214–3218.
  15. Abd Elaziz, M.; Oliva, D.; Xiong, S. An improved opposition-based sine cosine algorithm for global optimization. Expert Syst. Appl. 2017, 90, 484–500.
  16. Singh, N.; Singh, S.B. A novel hybrid GWO-SCA approach for optimization problems. Eng. Sci. Technol. Int. J. 2017, 20, 1586–1601.
  17. Abd Elaziz, M.E.; Ewees, A.A.; Oliva, D.; Duan, P.; Xiong, S. A hybrid method of sine cosine algorithm and differential evolution for feature selection. In Proceedings of the International Conference on Neural Information Processing, Long Beach, CA, USA, 4–9 December 2017; Springer: Cham, Switzerland, 2017; pp. 145–155.
  18. Rizk-Allah, R.M. Hybridizing sine cosine algorithm with multi-orthogonal search strategy for engineering design problems. J. Comput. Des. Eng. 2017, 5, 249–273.
  19. Long, W.; Wu, T.; Liang, X.; Xu, S. Solving high-dimensional global optimization problems using an improved sine cosine algorithm. Expert Syst. Appl. 2019, 123, 108–126.
  20. Nenavath, H.; Jatoth, R.K. Hybridizing sine cosine algorithm with differential evolution for global optimization and object tracking. Appl. Soft Comput. 2018, 62, 1019–1043.
  21. Wu, S.; Mao, P.; Li, R.; Cai, Z.; Chen, X. Evolving fuzzy k-nearest neighbors using an enhanced sine cosine algorithm: Case study of lupus nephritis. Comput. Biol. Med. 2021, 135, 104582.
  22. Keller, J.M.; Gray, M.R.; Givens, J.A. A fuzzy k-nearest neighbor algorithm. IEEE Trans. Syst. Man Cybern. 1985, 4, 580–585.
  23. Lin, J.; He, Q. Mixed strategy to improve sine cosine algorithm. Appl. Res. Comput. 2020, 37, 6.
  24. Wachowiak, M.P.; Smolíková, R.; Zheng, Y.; Zurada, J.M.; Elmaghraby, A.S. An approach to multimodal biomedical image registration utilizing particle swarm optimization. IEEE Trans. Evol. Comput. 2004, 8, 289–301.
  25. Faris, H.; Mafarja, M.M.; Heidari, A.A.; Aljarah, I.; Fujita, H. An efficient binary salp swarm algorithm with crossover scheme for feature selection problems. Knowl.-Based Syst. 2018, 154, 43–67.
  26. Jha, K.; Saha, S. Incorporation of multimodal multiobjective optimization in designing a filter based feature selection technique. Appl. Soft Comput. 2021, 98, 106823.
  27. Kasihmuddin, M.S.M.; Jamaludin, S.Z.M.; Mansor, M.A.; Wahab, H.A.; Ghadzi, S.M.S. Supervised learning perspective in logic mining. Mathematics 2022, 10, 915.
  28. Yang, G. A modified particle swarm optimizer algorithm. In Proceedings of the 2007 8th International Conference on Electronic Measurement and Instruments, Warsaw, Poland, 1–3 May 2007; IEEE: Piscataway, NJ, USA, 2007; Volume 2, pp. 675–679.
  29. Xue, J.; Shen, B. A novel swarm intelligence optimization approach: Sparrow search algorithm. Syst. Sci. Control Eng. 2020, 8, 22–34.
  30. Mirjalili, S.; Gandomi, A.H.; Mirjalili, S.Z.; Saremi, S.; Faris, H.; Mirjalili, S.M. Salp swarm algorithm: A bio-inspired optimizer for engineering design problems. Adv. Eng. Softw. 2017, 114, 163–191.
  31. Mirjalili, S.; Mirjalili, S.M.; Yang, X.S. Binary bat algorithm. Neural Comput. Appl. 2014, 25, 663–681.
  32. Kumar, N.; Hussain, I.; Singh, B.; Panigrahi, B. Single sensor-based MPPT of partially shaded PV system for battery charging by using Cauchy and Gaussian sine cosine optimization. IEEE Trans. Energy Convers. 2017, 32, 983–992.
  33. Gupta, S.; Deep, K. A hybrid self-adaptive sine cosine algorithm with opposition based learning. Expert Syst. Appl. 2019, 119, 210–230.
  34. Shan, W.; Qiao, Z.; Heidari, A.A.; Chen, H.; Turabieh, H.; Teng, Y. Double adaptive weights for stabilization of moth flame optimizer: Balance analysis, engineering cases, and medical diagnosis. Knowl.-Based Syst. 2020, 214, 106728.
  35. Jamaludin, S.Z.M.; Romli, N.A.; Kasihmuddin, M.S.M.; Baharum, A.; Mansor, M.A.; Marsani, M.F. Novel logic mining incorporating log linear approach. J. King Saud Univ.-Comput. Inf. Sci. 2022.
  36. Jamaludin, S.Z.M.; Kasihmuddin, M.S.M.; Ismail, A.I.M.; Mansor, M.A.; Basir, M.F.M. Energy based logic mining analysis with Hopfield neural network for recruitment evaluation. Entropy 2021, 23, 40.
  37. Zamri, N.E.; Mansor, M.A.; Kasihmuddin, M.S.M.; Alway, A.; Jamaludin, S.Z.M.; Alzaeemi, S.A. Amazon employees resources access data extraction via clonal selection algorithm and logic mining approach. Entropy 2020, 22, 596.
  38. Alway, A.; Zamri, N.E.; Kasihmuddin, M.S.M.; Mansor, A.; Sathasivam, S. Palm oil trend analysis via logic mining with discrete Hopfield neural network. Pertanika J. Sci. Technol. 2020, 28, 967–981.
  39. Zhou, W.; Wang, P.; Heidari, A.A.; Zhao, X.; Chen, H. Spiral Gaussian mutation sine cosine algorithm: Framework and comprehensive performance optimization. Expert Syst. Appl. 2022, 209, 118372.
Figure 1. The flow chart of SCA.
Figure 2. The flow chart of FKNN.
Figure 3. The flow chart of the hybrid SCA FKNN.
Figure 4. Numerical simulation diagram of the whole process.
Figure 5. Classification performance of each model on the Bupa dataset.
Figure 6. Classification performance of each model on the hepatitis dataset.
Figure 7. Classification performance of each model on the SPECT dataset.
Figure 8. The process of best fitness change with increasing number of iterations on multi-class datasets.
Table 1. The two-class-dataset-related information.

Dataset    Categories  Samples  Features  Positive  Negative
Bupa       2           345      6         145       200
Hepatitis  2           155      19        32        123
SPECT      2           267      22        212       55
Table 2. The multi-class-dataset-related information.

Dataset                                    Categories  Samples  Features  Positive        Negative
Caesarian section classification dataset   2           80       4         34              46
Indian liver patient dataset (ILPD)        2           583      10        415             167
Glass identification dataset               7           214      9         69 (class 1)    145 (other classes)
User knowledge modeling dataset            4           403      5         102 (class 1)   301 (other classes)
Breast tissue dataset                      6           106      9         20 (class 1)    86 (other classes)
Car dataset                                4           1728     6         1209 (class 1)  519 (other classes)
QCM sensor alcohol dataset                 5           125      15        24 (class 3)    101 (other classes)
Table 3. The basic confusion matrix.

Actual Class  Predicted Positive     Predicted Negative
Positive      True Positive (TP)     False Negative (FN)
Negative      False Positive (FP)    True Negative (TN)
Table 4. Results of the hybrid-FKNN and comparison models on the Bupa dataset. (Bold indicates the best in comparison method).

Algorithm        Metric  ACC     Precision  Sensitivity  Specificity  MCC
Hybrid SCA-FKNN  avg     0.7799  0.7015     0.6412       0.8791       0.4728
                 std     0.0143  0.0284     0.0299       0.0283       0.0303
LSCA-FKNN        avg     0.6232  0.6674     0.5687       0.8393       0.2465
                 std     0.0199  0.0338     0.0175       0.0177       0.0367
SCA-FKNN         avg     0.6175  0.5494     0.4645       0.7968       0.2946
                 std     0.0383  0.0439     0.0487       0.0178       0.0546
PSO-FKNN         avg     0.6686  0.6531     0.4851       0.8047       0.3105
                 std     0.0266  0.0344     0.0379       0.0231       0.0472
BA-FKNN          avg     0.6056  0.5600     0.4693       0.7121       0.1920
                 std     0.0292  0.0479     0.0423       0.0292       0.0694
SSA-FKNN         avg     0.6377  0.5862     0.5667       0.6923       0.2601
                 std     0.0586  0.0069     0.1233       0.0327       0.0895
SA-FKNN          avg     0.6721  0.6444     0.4511       0.8087       0.3131
                 std     0.0142  0.0340     0.0128       0.0415       0.0279
CGSCA-FKNN       avg     0.6600  0.6486     0.4711       0.7981       0.2939
                 std     0.0193  0.0356     0.0195       0.0308       0.0382
Table 5. Results of the hybrid-FKNN and comparison models on the hepatitis dataset. (Bold indicates the best in comparison method).

Algorithm        Metric  ACC     Precision  Sensitivity  Specificity  MCC
Hybrid SCA-FKNN  avg     0.9465  0.3638     0.4072       0.9392       0.4342
                 std     0.0569  0.0937     0.0845       0.0312       0.0945
LSCA-FKNN        avg     0.8191  0.4566     0.3760       0.9192       0.3276
                 std     0.0296  0.0874     0.0676       0.0303       0.0648
SCA-FKNN         avg     0.8051  0.4236     0.3512       0.9167       0.3009
                 std     0.0323  0.1061     0.0745       0.0301       0.0875
PSO-FKNN         avg     0.7742  0.4641     0.3376       0.9217       0.3118
                 std     0.0378  0.0969     0.0727       0.0230       0.0788
BA-FKNN          avg     0.8172  0.4333     0.4692       0.8305       0.3593
                 std     0.0233  0.0702     0.0903       0.0267       0.0812
SSA-FKNN         avg     0.8750  0.2975     0.3333       0.9333       0.3846
                 std     0.0432  0.0379     0.1925       0.0087       0.0098
SA-FKNN          avg     0.8076  0.4115     0.3280       0.9097       0.2771
                 std     0.0242  0.0913     0.0751       0.0266       0.0797
CGSCA-FKNN       avg     0.8033  0.5533     0.3944       0.9067       0.3744
                 std     0.0199  0.0740     0.0666       0.0266       0.0582
Table 6. Results of the hybrid SCA-FKNN and comparison models on the SPECT dataset. (Bold indicates the best result among the compared methods.)
Algorithm        Metric  ACC     Precision  Sensitivity  Specificity  MCC
Hybrid SCA-FKNN  avg     0.8936  0.8620     0.7157       0.5220       0.4436
                 std     0.0195  0.0845     0.0610       0.0227       0.0605
LSCA-FKNN        avg     0.7593  0.8538     0.8730       0.4094       0.2588
                 std     0.0191  0.0187     0.0357       0.0561       0.0656
SCA-FKNN         avg     0.7297  0.7953     0.8601       0.3427       0.1759
                 std     0.0449  0.0283     0.0278       0.0707       0.0718
PSO-FKNN         avg     0.7615  0.8405     0.8541       0.3866       0.2079
                 std     0.0351  0.0216     0.0308       0.0637       0.0975
BA-FKNN          avg     0.7585  0.7098     0.9270       0.1049       0.0259
                 std     0.0287  0.0064     0.0164       0.0493       0.0702
SSA-FKNN         avg     0.6471  0.5455     0.8571       0.4443       0.3228
                 std     0.0899  0.0951     0.0825       0.0673       0.0943
SA-FKNN          avg     0.7658  0.8195     0.8920       0.2891       0.1031
                 std     0.0164  0.0176     0.0234       0.1013       0.0637
CGSCA-FKNN       avg     0.7546  0.8497     0.8330       0.4308       0.2034
                 std     0.0229  0.0140     0.0238       0.0587       0.0709
Table 7. Results of the hybrid SCA-FKNN on multi-class datasets.
Dataset                           Metric  ACC     Precision  Sensitivity  Specificity  MCC
Caesarian section classification  avg     0.7026  0.7197     0.8336       0.7049       0.4694
                                  std     0.0901  0.0500     0.0160       0.0927       0.0747
Indian liver patient (ILPD)       avg     0.7953  0.7876     0.8255       0.3788       0.1621
                                  std     0.0342  0.0379     0.0853       0.0541       0.0437
Glass identification              avg     0.7827  0.9016     0.9526       0.8347       0.4734
                                  std     0.0355  0.0501     0.0376       0.0751       0.1037
User knowledge modeling           avg     0.8606  0.9545     0.9709       0.9185       0.9042
                                  std     0.0267  0.0408     0.0163       0.0864       0.0319
Breast tissue                     avg     0.6554  0.9667     0.8883       0.9770       0.6969
                                  std     0.0727  0.0577     0.0459       0.1443       0.1163
Car                               avg     0.8807  0.9188     0.9973       0.9823       0.8312
                                  std     0.0916  0.0612     0.0716       0.0982       0.1616
QCM sensor alcohol                avg     0.9043  0.8501     0.8568       0.8477       0.8562
                                  std     0.0939  0.0838     0.0719       0.0973       0.1008
Table 8. Results of the hybrid SCA-FKNN and comparison models on the caesarian section classification dataset. (Bold indicates the best result among the compared methods.)
Algorithm        Metric  ACC     Precision  Sensitivity  Specificity  MCC
Hybrid SCA-FKNN  avg     0.7026  0.7197     0.8336       0.7049       0.4694
                 std     0.0901  0.0500     0.0160       0.0927       0.0747
LSCA-FKNN        avg     0.6677  0.8148     0.5741       0.7095       0.3923
                 std     0.0955  0.1197     0.0986       0.0965       0.1295
SCA-FKNN         avg     0.6667  0.6766     0.6349       0.7333       0.3626
                 std     0.0722  0.2384     0.0755       0.0882       0.0414
PSO-FKNN         avg     0.6675  0.8194     0.7505       0.7250       0.4045
                 std     0.0701  0.1138     0.1220       0.0992       0.0939
BA-FKNN          avg     0.5625  0.5361     0.7424       0.2500       0.2655
                 std     0.0625  0.0804     0.0957       0.0443       0.0995
SSA-FKNN         avg     0.5775  0.6528     0.5370       0.5952       0.2381
                 std     0.0523  0.0241     0.0656       0.0591       0.0817
Table 9. Results of the hybrid SCA-FKNN and comparison models on the Indian liver patient dataset (ILPD). (Bold indicates the best result among the compared methods.)
Algorithm        Metric  ACC     Precision  Sensitivity  Specificity  MCC
Hybrid SCA-FKNN  avg     0.7953  0.7876     0.8255       0.3788       0.1621
                 std     0.0342  0.0379     0.0853       0.0541       0.0437
LSCA-FKNN        avg     0.6912  0.6961     0.9856       0.1078       0.0712
                 std     0.0191  0.0123     0.0220       0.0309       0.0257
SCA-FKNN         avg     0.6455  0.7440     0.8882       0.2058       0.1038
                 std     0.0222  0.0732     0.0964       0.0821       0.0674
PSO-FKNN         avg     0.7173  0.7187     0.9967       0.0963       0.0953
                 std     0.0139  0.0162     0.0058       0.0107       0.0641
BA-FKNN          avg     0.7108  0.7209     0.9534       0.0865       0.0532
                 std     0.0085  0.0076     0.0034       0.0012       0.0125
SSA-FKNN         avg     0.7092  0.7215     0.9695       0.0502       0.0644
                 std     0.0028  0.0140     0.0277       0.0457       0.0112
Table 10. Results of the hybrid SCA-FKNN and comparison models on the glass dataset. (Bold indicates the best result among the compared methods.)
Algorithm        Metric  ACC     Precision  Sensitivity  Specificity  MCC
Hybrid SCA-FKNN  avg     0.7827  0.9016     0.9526       0.8347       0.4734
                 std     0.0355  0.0501     0.0376       0.0751       0.1037
LSCA-FKNN        avg     0.6589  0.5961     0.6759       0.7618       0.4242
                 std     0.0355  0.2470     0.1530       0.0546       0.1972
SCA-FKNN         avg     0.6654  0.6078     0.8614       0.7028       0.5339
                 std     0.0355  0.0453     0.0558       0.0337       0.0394
PSO-FKNN         avg     0.6047  0.6429     0.6923       0.7727       0.4587
                 std     0.0968  0.0415     0.0994       0.0646       0.0950
BA-FKNN          avg     0.5349  0.3333     0.5000       0.6429       0.3187
                 std     0.1005  0.0907     0.0874       0.0707       0.0587
SSA-FKNN         avg     0.7209  0.7778     0.9091       0.6875       0.6209
                 std     0.0880  0.0128     0.1905       0.1168       0.2040
Table 11. Results of the hybrid SCA-FKNN and comparison models on the user knowledge modeling dataset. (Bold indicates the best result among the compared methods.)
Algorithm        Metric  ACC     Precision  Sensitivity  Specificity  MCC
Hybrid SCA-FKNN  avg     0.8606  0.9545     0.9709       0.9185       0.9042
                 std     0.0267  0.0408     0.0163       0.0864       0.0319
LSCA-FKNN        avg     0.7901  0.9286     0.9286       0.9808       0.9093
                 std     0.0317  0.0299     0.0469       0.0127       0.0388
SCA-FKNN         avg     0.7977  0.9107     0.9639       0.9678       0.9143
                 std     0.0744  0.0233     0.0313       0.0138       0.0400
PSO-FKNN         avg     0.7713  0.8860     0.9434       0.9444       0.8876
                 std     0.0681  0.0218     0.0246       0.0059       0.0395
BA-FKNN          avg     0.5179  0.6324     0.5404       0.8189       0.3828
                 std     0.0890  0.1101     0.3625       0.1877       0.2242
SSA-FKNN         avg     0.8025  0.9437     0.7593       0.9541       0.7018
                 std     0.0377  0.0150     0.0590       0.0153       0.0629
Table 12. Results of the hybrid SCA-FKNN and comparison models on the breast tissue dataset. (Bold indicates the best result among the compared methods.)
Algorithm        Metric  ACC     Precision  Sensitivity  Specificity  MCC
Hybrid SCA-FKNN  avg     0.6554  0.9667     0.8883       0.9770       0.6969
                 std     0.0727  0.0577     0.0459       0.1443       0.1163
LSCA-FKNN        avg     0.5397  0.8667     0.6389       0.9333       0.6154
                 std     0.0727  0.2309     0.1273       0.1155       0.2682
SCA-FKNN         avg     0.5373  0.7500     0.8167       0.9024       0.6471
                 std     0.0550  0.0012     0.1243       0.0117       0.0937
PSO-FKNN         avg     0.5238  0.6583     0.8333       0.7500       0.5677
                 std     0.0991  0.0366     0.0787       0.0605       0.0879
BA-FKNN          avg     0.4928  0.5843     0.6875       0.6742       0.4731
                 std     0.0727  0.0473     0.0887       0.0751       0.0949
SSA-FKNN         avg     0.6667  0.9487     0.8196       0.6667       0.5657
                 std     0.0825  0.0888     0.0239       0.0774       0.0865
Table 13. Results of the hybrid SCA-FKNN and comparison models on the car dataset. (Bold indicates the best result among the compared methods.)
Algorithm        Metric  ACC     Precision  Sensitivity  Specificity  MCC
Hybrid SCA-FKNN  avg     0.8807  0.9188     0.9973       0.9823       0.8312
                 std     0.0916  0.0612     0.0716       0.0982       0.1616
LSCA-FKNN        avg     0.8691  0.9450     0.9467       0.8388       0.7922
                 std     0.0161  0.0372     0.0289       0.1106       0.0454
SCA-FKNN         avg     0.8168  0.8789     0.9559       0.9428       0.7818
                 std     0.0134  0.0095     0.0057       0.0271       0.0110
PSO-FKNN         avg     0.7645  0.7718     0.9977       0.3179       0.4601
                 std     0.0159  0.0964     0.0434       0.0530       0.0064
BA-FKNN          avg     0.7107  0.7342     0.9468       0.6591       0.6596
                 std     0.0859  0.1090     0.0350       0.1100       0.1226
SSA-FKNN         avg     0.8618  0.9461     0.8816       0.8307       0.6653
                 std     0.0854  0.0939     0.0269       0.0819       0.0945
Table 14. Results of the hybrid SCA-FKNN and comparison models on the QCM sensor alcohol dataset. (Bold indicates the best result among the compared methods.)
Algorithm        Metric  ACC     Precision  Sensitivity  Specificity  MCC
Hybrid SCA-FKNN  avg     0.9043  0.8501     0.8568       0.8477       0.8562
                 std     0.0939  0.0838     0.0719       0.0973       0.1008
LSCA-FKNN        avg     0.7600  0.7917     0.8327       0.8189       0.7269
                 std     0.0367  0.0908     0.0823       0.0925       0.1415
SCA-FKNN         avg     0.7702  0.7333     0.7333       0.8841       0.7479
                 std     0.0693  0.0887     0.1082       0.0275       0.1270
PSO-FKNN         avg     0.4400  0.5095     0.6656       0.6794       0.2208
                 std     0.0400  0.0744     0.0672       0.0531       0.0580
BA-FKNN          avg     0.3067  0.6567     0.2611       0.7804       0.3568
                 std     0.0231  0.0333     0.0674       0.0756       0.0632
SSA-FKNN         avg     0.8133  0.6640     0.7714       0.8682       0.6117
                 std     0.1007  0.0856     0.0960       0.0304       0.1126