Article

Experimental Study and Comparison of Imbalance Ensemble Classifiers with Dynamic Selection Strategy

1 School of Science, Dalian Maritime University, Dalian 116026, China
2 School of Artificial Intelligence and Big Data, Henan University of Technology, Zhengzhou 450001, China
* Authors to whom correspondence should be addressed.
Entropy 2021, 23(7), 822; https://doi.org/10.3390/e23070822
Submission received: 4 May 2021 / Revised: 18 June 2021 / Accepted: 24 June 2021 / Published: 28 June 2021

Abstract

Imbalance ensemble classification is one of the most essential and practical strategies for improving decision performance in data analysis. A growing body of literature on ensemble techniques for imbalance learning has emerged in recent years, and various extensions of imbalanced classification methods have been established from different points of view. The present study reviews the state-of-the-art ensemble classification algorithms for dealing with imbalanced datasets and offers a comprehensive analysis of incorporating the dynamic selection of base classifiers into classification. By running 14 existing ensemble algorithms incorporating dynamic selection on 56 datasets, the experimental results reveal that classical algorithms with a dynamic selection strategy deliver a practical way to improve the classification performance for both binary class and multi-class imbalanced datasets. In addition, by combining patch learning with dynamic selection ensemble classification, a patch-ensemble classification method is designed, which utilizes the misclassified samples to train patch classifiers to increase the diversity of base classifiers. The experimental results indicate that the designed method has a certain potential for improving the performance of multi-class imbalanced classification.

1. Introduction

Data imbalance is ubiquitous in classification problems. It occurs when the numbers of instances in different classes are significantly out of proportion. The minority classes with fewer instances usually contain the essential information, which has been observed in broad application areas, such as medical diagnosis [1,2,3,4,5,6], sentiment or image classification [7,8], fault identification [9,10], etc. Many typical classifiers may generate unsatisfactory results because they concentrate on global accuracy while ignoring the identification performance for minority samples.
Accurately identifying minority samples among majority samples is therefore a significant task for attaining essential information. With the rising prominence of machine learning and artificial intelligence, as well as the continuous emergence of new problems and technologies involving imbalanced data, imbalanced classification methods have received wide attention and rapid development because of their promising prospects [11,12,13,14,15,16,17,18]. Improving classification algorithms, introducing cost-sensitive strategies, and using balancing algorithms are common approaches in imbalanced learning.
One of the most widely used strategies is ensemble learning [19,20,21], which combines classification algorithms with data processing techniques or cost-sensitive solutions. The advantage of ensemble learning in dealing with imbalanced data is that it applies a combination strategy to the outputs of multiple base classifiers, so that the classifier group can capture the skewed distribution between data categories as much as possible. It has seen great success in improving the identification of minority class samples for imbalanced classification problems. During the training process, ensemble classifiers can be constructed in diverse and flexible ways: a single classification algorithm can be used to build homogeneous ensemble classifiers, whereas different classification algorithms can be combined to obtain heterogeneous classification systems [22].
The existing body of research on the application of ensemble models suggests that ensemble classification algorithms for imbalanced data have great practical potential [23,24,25,26]. To further improve classification efficiency, multiple effective approaches [27,28] have been developed from several perspectives. In more recent studies, one of the most promising strategies is dynamic selection [29], in which the most competent classifier, or an ensemble of classifiers, is selected by estimating each classifier's competence level in the classification pool. The benefit of this approach is that different unknown samples are identified by choosing different optimal classifiers, so each base classifier can be regarded as an expert for particular samples in the classification space. Dynamic selection classifiers exhibit higher accuracy than traditional combination approaches in several real-world problems, such as face recognition [30] and text verification [31].
Since the dynamic selection strategy has emerged as an interesting technique for extending ensemble algorithms, we are motivated to provide a comprehensive review of the latest ensemble classification algorithms for imbalanced datasets and to offer a detailed experimental comparison of state-of-the-art ensemble algorithms incorporating dynamic selection. This study is arranged as follows. We collect several sources of imbalanced datasets and discuss several pertinent evaluation indexes in Section 2. In Section 3, we categorically summarize the latest progress in imbalance ensemble algorithms. In an attempt to improve the classification performance for both binary class and multi-class imbalanced datasets, an experimental scheme adopting a dynamic selection strategy is designed, and a series of experimental comparisons supporting the role of imbalance ensemble classification with dynamic selection is conducted, in Section 4. Section 5 presents a patch-ensemble classification method that combines patch learning with dynamic selection, and Section 6 draws conclusions.

2. Imbalanced Datasets and Evaluation Measures

In this section, the publicly accessible data sources and the comparison criteria used in the related literature are summarized.

2.1. Sources of Imbalanced Datasets

Although imbalanced datasets exist widely, most researchers generally select experimental datasets from public access databases to compare the performance of classification models. Table 1 provides a summary of several databases used in the literature.
The UCI database covers 376 datasets used for machine learning classification tasks, including binary class and multi-class datasets. OpenML is an online machine learning platform for sharing or organizing data, machine learning algorithms, and experiments, with a total of 3121 datasets. KEEL includes a module named imbalanced data for imbalance learning containing multi-class imbalanced datasets. In the Defect Prediction database, a series of typical imbalanced datasets concerning network fault detection are available.

2.2. Evaluation of Imbalanced Classification

Owing to the skewed distribution of imbalanced data, overall accuracy and precision are not sufficient to measure the recognition performance of a model for the minority classes, especially when the imbalance is extremely high. The G-mean [32] and F-measure [33] are typical evaluation measures in imbalance learning, which describe the categorization situation from different perspectives. During the experimental process, a comprehensive evaluation relying on multiple indicators should be employed to examine the effectiveness of a classification model. Suppose n is the number of classes; the following performance measures of imbalanced classification are suitable for both binary class and multi-class classification.
(I) MAvA:
\mathrm{MAvA} = \frac{1}{n} \sum_{i=1}^{n} Acc_i,
where Acc_i is the accuracy of the i-th class;
(II) G-mean:
\text{G-mean} = \sqrt[n]{\prod_{i=1}^{n} Acc_i};
(III) Precision:
\mathrm{Precision} = \frac{1}{n} \cdot \sum_{i=1}^{n} \frac{TP_i}{TP_i + FP_i},
where TP_i denotes the number of correctly classified samples in the i-th class, and FP_i is the number of instances misclassified into the i-th class;
(IV) F-measure:
\text{F-measure} = \frac{2}{n} \cdot \sum_{i=1}^{n} \frac{\mathrm{Precision}_i \cdot \mathrm{Recall}_i}{\mathrm{Precision}_i + \mathrm{Recall}_i},
where \mathrm{Recall}_i = \frac{TP_i}{TP_i + FN_i}, and FN_i denotes the number of samples in the i-th class which are misclassified into other class labels.
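To make these four measures concrete, the following sketch computes them from a multi-class confusion matrix. The NumPy-based implementation and the function name imbalance_metrics are illustrative assumptions rather than code from the cited works.

```python
import numpy as np

def imbalance_metrics(conf):
    """Compute MAvA, G-mean, macro precision, and macro F-measure
    from an n x n confusion matrix (rows = true class, columns = predicted class)."""
    conf = np.asarray(conf, dtype=float)
    tp = np.diag(conf)                         # correctly classified samples per class
    fn = conf.sum(axis=1) - tp                 # samples of class i predicted as another class
    fp = conf.sum(axis=0) - tp                 # samples of other classes predicted as class i
    acc = tp / (tp + fn)                       # per-class accuracy Acc_i (recall)
    precision = tp / (tp + fp)
    mava = acc.mean()                          # MAvA: arithmetic mean of Acc_i
    gmean = np.prod(acc) ** (1.0 / len(acc))   # G-mean: geometric mean of Acc_i
    macro_precision = precision.mean()
    fmeasure = np.mean(2 * precision * acc / (precision + acc))  # (2/n) * sum P_i R_i / (P_i + R_i)
    return mava, gmean, macro_precision, fmeasure

# Worked example with a 3-class confusion matrix
print(imbalance_metrics([[50, 2, 1],
                         [5, 20, 3],
                         [2, 1, 4]]))
```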

3. Ensemble Approaches for Imbalanced Classification

The challenges and prevalence of imbalanced classification have resulted in much research in this area. An effective solution is to design classifiers based on ensemble approaches [34,35,36,37,38,39,40,41,42]. Ensemble schemes for imbalanced classification have been developed from the perspectives of data decomposition, cost-sensitive schemes, sampling methods, and dynamic selection. These methods can achieve desirable results for binary imbalanced classification problems. However, multi-class classification involving multiple minority classes has to face the complexity of the internal structure of multi-class imbalanced datasets, and the differences among the decision boundaries of the classes bring more difficulties to the classification task and require additional effort. Therefore, multi-class imbalanced classification [43,44,45,46,47,48] has continually attracted attention in the machine learning field. This part summarizes the state-of-the-art imbalance ensemble classification algorithms for binary class and multi-class imbalanced datasets. Some of the pertinent methods are shown in Figure 1 and later elucidated in our comparative study.

3.1. Imbalanced Learning Based on Ensemble Classification and a Decomposition Strategy

The most notable characteristic of multi-class classification is the diversity of categories compared with binary classification. A common method is to convert a multi-class dataset into several binary datasets via decomposition, for example, OVA [49], OVO [50], AO [51], and OAHO [52].
OVA is a relatively straightforward decomposition strategy: each category in turn is marked as the positive class and all other classes as the negative class, on which a binary classifier is trained. If the original dataset has n classes, n binary classifiers are obtained. OVO trains one binary classifier for each pair of classes, so a total of n(n-1)/2 classifiers are required. Based on OVA and OVO, Garcia-Pedrajas et al. [51] proposed the AO strategy, which employs OVA to obtain the two most likely labels (l_i, l_j), and the predictions of the OVO classifiers related to l_i and l_j are then used to make the final prediction. OAHO sorts the classes in descending order of the number of instances, {l_1, l_2, ..., l_n}, where l_1 is the class with the largest number of samples. The training process starts from l_1 and sequentially regards the current class as the positive class and all lower-ranked classes as the negative class. Binary classifiers are trained on the resulting sub-datasets.
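As a rough illustration of how these decompositions relabel the data, the sketch below builds the binary sub-problems for OVA and OAHO from a label vector. It is a simplified reading of the strategies described above, not code from the cited works, and the function names are our own.

```python
import numpy as np

def ova_splits(y):
    """One-vs-All (OVA): one binary problem per class (class c = 1, all other classes = 0)."""
    return {c: np.where(y == c, 1, 0) for c in np.unique(y)}

def oaho_splits(y):
    """One-Against-Higher-Order (OAHO): classes are sorted by size in descending order;
    each problem takes the current class as positive and all lower-ranked classes as negative."""
    classes, counts = np.unique(y, return_counts=True)
    order = classes[np.argsort(-counts)]            # l_1 is the class with the most samples
    problems = {}
    for i, c in enumerate(order[:-1]):
        keep = np.isin(y, order[i:])                # current class plus all lower-ranked classes
        problems[c] = (keep, np.where(y[keep] == c, 1, 0))
    return problems

y = np.array([0, 0, 0, 0, 1, 1, 2])
print(ova_splits(y))    # three binary label vectors
print(oaho_splits(y))   # two sub-problems on progressively smaller subsets
```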
Data decomposition is an easily applicable conversion method for multi-class problems and has been combined with ensemble classification in succession. For instance, to address the multi-class imbalance ensemble classification issue, Ghanem et al. [53] combined AO and the PRMs-IM algorithm [54] to design the MultiIMAO classifier, which effectively demonstrated that the data decomposition strategy can enhance classification performance to a certain extent. In addition, Bi et al. [55] incorporated the OVA, OVO, and OAHO methods into PRMs-IM classification, named MultiIMOVA, MultiIMOVO, and MultiIMOAHO, respectively, to further investigate their cooperative performance. Dietterich and Bakiri proposed the ECOC decomposition method [48] to classify multi-class datasets using error-correcting output coding. On the basis of ECOC, the ImECOC method developed in [56] simultaneously considered the between-class and within-class imbalance when classifying imbalanced datasets, and assigned different weights to different binary classifiers. According to different encoding methods, ImECOC was further extended to ImECOCOVA, ImECOCsparse, ImECOCdense [34], and so on.
The above classification approaches provide a basic starting point for discovering the potential synergy between ensembles for imbalanced data and data decomposition strategies.

3.2. Imbalanced Learning Based on Ensemble Classification and Cost-Sensitive Scheme

By considering the cost of misclassifications, numerous effective cost-sensitive ensemble algorithms have been developed [57], which favor the minority class by assigning different weights to different samples. Ensemble classification incorporating cost-sensitive schemes can produce more robust performance than merely combining multiple classifiers. The representative algorithm for binary classification is AdaBoost (Adaptive Boosting) [46], proposed by Freund and Schapire, in which weak classifiers are integrated into a stronger classifier by iteratively updating sample weights. Given a training dataset {(x_1, y_1), (x_2, y_2), ..., (x_N, y_N)}, the weight updating rule in AdaBoost was defined as:
D_{t+1}(i) = \frac{D_t(i)\,\exp\left(-\alpha_t y_i h_t(x_i)\right)}{Z_t},
where the initial weight is D_1(i) = 1/N, \alpha_t is the weight parameter of the t-th weak classifier, h_t is the weak classifier in the t-th iteration, and Z_t is a normalization factor. The output of AdaBoost was defined as:
H(x) = \mathrm{sign}\left( \sum_{t=1}^{T} \alpha_t h_t(x) \right).
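The update rule and the final decision can be expressed compactly in code. The sketch below is a minimal binary AdaBoost assuming labels in {-1, +1} and decision stumps from scikit-learn; these choices are illustrative assumptions, not the exact setup of the cited works.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def adaboost_fit(X, y, T=10):
    """Minimal binary AdaBoost; labels y must be in {-1, +1}."""
    N = len(y)
    D = np.full(N, 1.0 / N)                    # initial weights D_1(i) = 1/N
    ensemble = []
    for _ in range(T):
        h = DecisionTreeClassifier(max_depth=1).fit(X, y, sample_weight=D)
        pred = h.predict(X)
        err = D[pred != y].sum()
        if err == 0 or err >= 0.5:             # stop when the stump is perfect or no better than chance
            break
        alpha = 0.5 * np.log((1 - err) / err)  # weight parameter alpha_t
        D = D * np.exp(-alpha * y * pred)      # weight update rule above
        D /= D.sum()                           # Z_t normalizes D_{t+1} to a distribution
        ensemble.append((alpha, h))
    return ensemble

def adaboost_predict(ensemble, X):
    """H(x) = sign(sum_t alpha_t h_t(x))."""
    score = sum(alpha * h.predict(X) for alpha, h in ensemble)
    return np.sign(score)
```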
To tackle multi-class classification problems, a series of AdaBoost extensions were presented. Both AdaBoost.M1 and SAMME [58] extended AdaBoost in terms of updating weights and the combination strategy of classifiers. AdaC2.M1 [59] inherited the general learning framework of AdaBoost.M1 except that it introduced misclassification costs into the weight update formula. The optimal cost setting in AdaC2.M1 was determined by employing a genetic algorithm. AdaBoost.NC [60] was an algorithm emphasizing ensemble diversity in the training process, in which a weight update rule with a penalty term was introduced. The PIBoost classifier [61] based on the ensemble method and cost-sensitivity scheme dealt with multi-class imbalanced data via a series of binary weak-learners and a margin-based exponential loss function. In addition, cost-sensitive schemes and data balancing algorithms have synergistic effects for handling imbalanced datasets in ensemble learning.

3.3. Imbalanced Learning Based on Ensemble Classification and Sampling Methods

To relieve the impact of imbalanced training data on the classification model, a large number of ensemble algorithms have been improved by incorporating data-balancing algorithms. The training set is re-sampled before constructing the classification model so that the imbalance rate among the various categories approaches equilibrium. In general, ensemble methods with balancing algorithms mainly combine classifiers with under-sampling, over-sampling, or mixed sampling methods.
Under-sampling is a common method for reducing the proportion of majority class samples in imbalanced data and improving predictive performance. Seiffert et al. [62] proposed the RUSBoost algorithm by combining under-sampling with boosting; it randomly removes instances from the majority classes until the desired proportion is achieved. Galar et al. [63] presented the EUSBoost algorithm based on RUSBoost and evolutionary under-sampling, in which the diversity of base classifiers is promoted because more subclassifiers are embedded in the ensemble process. In [64], Luo et al. presented an XGBoost classification method based on bagging to handle classification problems involving imbalanced data, in which the bagging procedure was designed with random under-sampling; the method synthesizes new samples in a sufficiently small neighborhood of minority samples, which avoids introducing noisy samples near the classification boundary.
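The sketch below illustrates the basic under-sampling-plus-bagging idea shared by these methods: each iteration randomly removes majority class samples until the classes are balanced, then trains a base learner. It is a generic illustration, not the exact RUSBoost, EUSBoost, or XGBoost-bagging procedure.

```python
import numpy as np
from collections import Counter
from sklearn.tree import DecisionTreeClassifier

def underbagging_fit(X, y, n_estimators=10, seed=0):
    """Each iteration trains a tree on a class-balanced subset obtained by random under-sampling."""
    rng = np.random.default_rng(seed)
    minority_size = min(Counter(y).values())
    ensemble = []
    for _ in range(n_estimators):
        idx = np.concatenate([
            rng.choice(np.flatnonzero(y == c), size=minority_size, replace=False)
            for c in np.unique(y)
        ])
        ensemble.append(DecisionTreeClassifier().fit(X[idx], y[idx]))
    return ensemble

def underbagging_predict(ensemble, X):
    """Majority vote over the ensemble."""
    votes = np.stack([clf.predict(X) for clf in ensemble])
    return np.apply_along_axis(lambda col: Counter(col).most_common(1)[0][0], 0, votes)
```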
In terms of over-sampling, Ramentol et al. [65] constructed a new synthetic minority over-sampling technique based on rough set theory and the lower approximation of the minority sample subset. Similarly, Douzas et al. [66] proposed an effective ensemble algorithm based on K-means clustering and SMOTE, stipulating that new data be generated in crucial areas of the space produced by clustering. The imbalance ratio, as well as the average distance among minority samples, was used as the assessment criterion to determine whether new instances should be generated.
When the imbalance ratio approaches 1 after the sampling process, the dataset achieves equilibrium. In addition, the classification accuracy is an alternative measure of whether the data strike a balance between different classes. This was adopted by Lu et al. for designing HSBagging [67], in which a pre-processing step using both random under-sampling and SMOTE is conducted at each bagging iteration; the classifier uses the predictive accuracy on out-of-bag instances to choose the optimal sampling rate for SMOTE and random under-sampling. Among classical imbalanced learning methods based on ensemble classification and sampling, UnderBagging and SMOTEBagging achieve excellent performance [19]; however, HSBagging was demonstrated in [67] to show a better classification performance than either UnderBagging or SMOTEBagging alone.
Some other related works based on ensemble classification and sampling methods also contribute to resolving imbalanced classification. For example, Ahmed et al. [68] applied hybrid sampling in the RSYNBagging classifier, which considered the diversification of imbalanced data. Additionally, the ADASYNBagging algorithm was coined by incorporating the ADASYN over-sampling algorithm [69] into bagging. Although most of the above work was aimed at binary class datasets, it provides a solid foundation for the classification of multi-class imbalanced datasets. Wei et al. [70] put forward a SMOTE-based decision-tree classifier that modified a binary classification algorithm to handle multi-class imbalance problems effectively.

3.4. Imbalanced Learning Based on Ensemble Classification and Dynamic Selection

With the extensive application of ensemble approaches, designing more efficient ensemble classification algorithms has become an important issue. Compared with static ensemble algorithms, dynamic selection ensemble algorithms [71,72,73,74,75,76,77,78,79,80,81,82,83] have been shown to effectively improve the F-measure and G-mean values. A dynamic selection ensemble algorithm predicts the label of a test sample by evaluating the capability level of each classifier and selecting the most capable classifier or set of classifiers. In the process of dynamic ensemble classification (Figure 2), each test sample or each subset can select the optimal classification model. Generally, the selection of classifiers is realized by estimating the classification ability in the local region of the test sample or by calculating the prediction accuracy of the classifiers.
A function to evaluate the classification capability can be considered as a tool to assist in the selection of classifiers. For example, Garcia et al. [84] constructed a capability function by calculating the classification score of each base classifier and selected out the top nine classifiers with the highest capability values. Specifically, the selection structure for base classifiers was a key component in their dynamic selection model. The classification accuracy can be regarded as another measure for the selection of classifiers. The approach proposed by Lin et al. [85] used a selective ensemble model to deal with multi-class classification by choosing the classifier with the highest classification accuracy in the local region of the test sample. Mendialdua et al. [86] established a more intensive approach to select classifiers for each pair of classes. The model attempted to extract the best classifier in every sub-dataset of OVO. They demonstrated that OVO and dynamic selection have a positive synergy during classification, which enabled the extension of decomposition algorithms to dynamic ensemble selection strategies. Woloszynski et al. [87] used a probability model to evaluate the classification ability of the base classifiers and introduced a random reference classifier in the process of ensemble classification. The probability of the correct classification of the random reference classifier was employed as the measure of the competence of the actual classifier, which combined the dynamic selection of the classifiers with the probabilistic method.
In addition to the selection manner of classifiers, base classifiers’ generalization ability also has an extremely important impact on dynamic selection results. Cruz et al. [88] developed an improved dynamic selection approach. In the first stage, prototype selection techniques were applied to the training data to reduce the overlap between classes. In the next generalization process, a local adaptive K-nearest neighbor algorithm was adopted to minimize the influence of noisy samples on the competency area. Meanwhile, they demonstrated that the distribution of imbalanced datasets would directly affect selecting the optimal classifier during the dynamic selection process, and datasets with a complex structure would result in poor classification. Focusing on the complex structure of imbalanced data, Brun et al. [89] selected the classifier trained on the data subset whose complexity is similar to the neighborhood of the test instances. They also conducted an in-depth consideration and analysis of the data structure in the field of dynamic ensemble learning. Cruz et al. developed a novel dynamic selection framework [90] which extracted meta-features from the training data for training meta-classifiers to judge whether the base classifier had the sufficient ability to classify test instances.
Although each of the strategies mentioned above has its own merits and improves the performance in the design of classifiers, there is still room for improvement in terms of performance optimization by designing dynamic selection in multi-class imbalanced classification. Inspired by the related literature, this study focuses on the investigation of the classification performance of classic multi-classification algorithms combining dynamic selection approaches.

4. Experimental Comparison of Multi-Class Imbalanced Classifiers by Incorporating Dynamic Selection

Due to the significant advantages of ensemble classification algorithms in dealing with class imbalance, this study merges dynamic selection with popular ensemble classification algorithms for multi-class datasets, aiming to verify the effectiveness of dynamic selection.

4.1. Experimental Procedure

We employ homogeneous classifiers to generate the candidate classifier pool, and the 14 multi-class imbalance ensemble classification algorithms in Figure 1 are employed as base classifiers. By combining the above-mentioned base classifiers with the dynamic selection process proposed in [84], dynamic ensemble classifiers are designed for both binary class and multi-class imbalanced datasets. The overall process of dynamic selection is shown in Figure 2. The capability evaluation function from [84] is defined as follows.
Given a test sample x_i and a classifier h, the classification capability of classifier h for x_i is calculated as:
F_h(x_i) = \sum_{t=1}^{k} I\!\left(h(x_i^t)\right) \times w_i^t, \qquad I\!\left(h(x_i^t)\right) = \begin{cases} 0, & h(x_i^t) \neq y_t, \\ 1, & h(x_i^t) = y_t, \end{cases}
where k is the number of nearest neighbor instances of x_i in the training data; x_i^t is the t-th neighbor instance of x_i; y_t is the true label of instance x_i^t; and w_i^t = \frac{1}{1 + \exp(\lambda m)} is the weight of the nearest neighbor sample x_i^t, where \lambda is a scaling coefficient and m is the number of samples with the same class as x_i^t. Obviously, the more samples share the class of x_i^t, the lower the weight, which indirectly increases the weight of minority samples. Meanwhile, if the predicted label of a nearest neighbor sample is identical to its true label, I returns 1; otherwise, I returns 0. The classification capability of classifier h for x_i is thus reflected by its classification performance on the neighboring samples. Taking AdaBoostNC as the base classifier, the procedure of the dynamic AdaBoostNC model is described in Algorithm 1. The other 13 dynamic ensemble algorithms are constructed similarly.
Algorithm 1: Dynamic AdaBoostNC classifier.
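Algorithm 1 appears as an image in the original article, so only its caption is reproduced here. As a rough, hedged stand-in, the sketch below combines the capability function defined above with top-N selection and majority voting; the function names, the neighbor search, and the parameter defaults (k = 9, N = 9, following Section 4.1) are our own reading of the procedure rather than the published algorithm.

```python
import numpy as np
from collections import Counter
from sklearn.neighbors import NearestNeighbors

def capability(clf, x, X_train, y_train, k=9, lam=0.1):
    """F_h(x): weighted count of the k nearest training neighbors of x that clf classifies correctly."""
    nn = NearestNeighbors(n_neighbors=k).fit(X_train)
    idx = nn.kneighbors(x.reshape(1, -1), return_distance=False)[0]
    score = 0.0
    for t in idx:
        m = np.sum(y_train[idx] == y_train[t])            # neighbors sharing the class of x_i^t
        w = 1.0 / (1.0 + np.exp(lam * m))                 # w_i^t: rarer classes receive larger weights
        correct = clf.predict(X_train[t].reshape(1, -1))[0] == y_train[t]
        score += w * float(correct)                       # I(.) = 1 only when the prediction matches y_t
    return score

def dynamic_predict(pool, x, X_train, y_train, N=9, k=9):
    """Rank the pool by capability for x, keep the top N classifiers, and majority-vote their outputs."""
    ranked = sorted(pool, key=lambda c: capability(c, x, X_train, y_train, k), reverse=True)
    votes = [c.predict(x.reshape(1, -1))[0] for c in ranked[:N]]
    return Counter(votes).most_common(1)[0][0]
```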
The datasets for the experiments come from the four repositories mentioned in Table 1, including 56 datasets, 32 of which are suitable for binary classification, and the other 24 datasets are available for multi-class classification. These data are closely related to the fields of life sciences, medical research, bank credit evaluation, network fault detection, etc. More information concerning these datasets is offered in Table 2 and Table 3. As can be seen from these tables, the experimental datasets are diverse in the number of attributes (4∼40), classes (2∼10), continuous or discrete attributes, class distribution, and imbalance ratio (1.003∼853).
In this study, 14 groups of comparative experiments between dynamic ensemble algorithms and state-of-the-art methods are carried out on both binary and multi-class imbalanced datasets. In the experiments, the nearest neighbor number k is set to 9, the dynamic selection parameter N is set to 9 (both settings follow reference [84]), and the number of base classifiers p is set to 16. Five-fold cross-validation is carried out for all methods on each dataset, and the average values over 5 runs are calculated to obtain the predictive performance of the designed methods.

4.2. Experiments Results for Binary Class Datasets

This study first examines the performance of the designed dynamic models on 32 binary class datasets. To illustrate the performance of the classifiers more distinctly, we compare the MAvA, G-mean, precision, and F-measure results of the classic algorithms before and after adding the dynamic selection process (Figure 3), where the dashed line at "Number of datasets = 16" marks half of the number of binary datasets, and each point represents the number of datasets on which prediction results are improved after dynamic selection. If a point is above the dashed line, the dynamic selection effect on the corresponding index is satisfactory, i.e., the dynamic selection ensemble classifier performs better than the original algorithm on most datasets, and vice versa.
We observe from Figure 3 that, by incorporating the dynamic selection strategy, the results of 12 classical imbalanced classification algorithms are improved under the MAvA, precision, and F-measure indicators, which means that better results are obtained on more than half of the binary datasets. The effect of dynamic selection is not obvious for MHDDTECOC and HDDTOVA: their points fall below the dashed line under the MAvA and F-measure indicators, indicating that improvements appear only on individual datasets. Apart from these two algorithms, the results of the other 12 classical imbalanced classification algorithms lie above the dashed line, which indicates that incorporating dynamic selection can promote predictive performance (MAvA, precision, and F-measure) on binary class datasets.

4.3. Experiments Results for Multi-Class Imbalanced Datasets

In this study, the above-mentioned classifiers are tested on 24 multi-class imbalanced datasets, where the dashed line at “Number of datasets = 12” means half of the number of multi-class datasets. The multi-class results are shown in Figure 4, and the main observations are enumerated as follows:
(I)
The effect of dynamic selection on MCHDDT and HDDTOVA is not satisfactory (their points for three indicators are below the dashed line), whereas the performance of the other 12 dynamic selection ensemble classifiers is better than that of the original algorithms. The MAvA, precision, and F-measure indicators of the improved classifiers are distinctly better after dynamic selection. However, due to the extremely low representation of the minority samples, the recognition rate for a minority category may drop sharply, resulting in a lower G-mean value. In fact, 11 of the classifiers demonstrate favorable G-mean results on more than half of the multi-class datasets after being combined with dynamic selection. In this regard, it is evident that dynamic selection models show more favorable characteristics on multi-class imbalanced datasets than on binary data.
(II)
By combining dynamic selection, MultiImOVA, MultiImOVO, MultiImOAHO, and MultiImAO show exceptional performance; on the whole, their predictive performance is effectively improved compared with using a single classification algorithm. This further validates the conclusion that data decomposition techniques and dynamic selection have a positive synergy during classification [86]. In particular, the classification MAvA values of the above four algorithms on all 56 datasets are shown in Figure 5, Figure 6, Figure 7 and Figure 8.
The results indicate that dynamic selection provides a potential strategy for dealing with both binary class and multi-class imbalanced datasets. Moreover, for each dataset, we counted the number of classification algorithms whose classification results improve after dynamic selection. As shown in Table 4, Table 5, Table 6 and Table 7, regardless of the structure of the data (both binary class and multi-class imbalanced data), most of the classification algorithms classify imbalanced data better after employing dynamic selection. The results reveal that incorporating dynamic selection can relieve the impact of imbalanced training data on classifier performance. Therefore, a dynamic selection ensemble algorithm can be a potential solution for the imbalanced classification problem.

5. Patch-Ensemble Classification for Imbalanced Data

Dynamic selection strategies and ensemble classification algorithms have a synergistic effect in classification. Training a dynamic selection ensemble classification scheme with excellent performance is usually an uncertain task, which depends on previous experience and a trial-and-error experiment process. Patch Learning (PL), proposed by Wu et al. [91], is a new machine learning strategy for solving the fitting problem in classification. As shown in Figure 9, patch learning is a combination of parallel and serial models, which focuses on the misclassified samples during the training procedure, and enhances the classification diversity with the construction of multiple local classifiers [91]. In this study, a patch-ensemble classification method is designed for classifying imbalanced data, which connects patch learning with a dynamic selection ensemble classification scheme.

5.1. Patch Learning

Patch learning consists mainly of the following three steps [91]:
(I)
Train an initial global model using all the training data;
(II)
Select the incorrectly classified training data to construct several local patch models;
(III)
The correctly classified training samples are utilized to update the global model.
For a new testing sample, PL first determines whether the sample belongs to a patch, in which case the corresponding local patch model is selected for the classification task; otherwise, the global model is employed.
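The three training steps and this prediction rule can be summarized in a short sketch. It assumes each patch stores the region of the misclassified samples it was trained on as a center and radius, which is one simple way to realize the "belongs to a patch" test and is not prescribed by [91].

```python
import numpy as np

class PatchModel:
    """A local model attached to the region of previously misclassified samples."""
    def __init__(self, clf, X_patch):
        self.clf = clf
        self.center = X_patch.mean(axis=0)
        self.radius = np.max(np.linalg.norm(X_patch - self.center, axis=1))

    def covers(self, x):
        return np.linalg.norm(x - self.center) <= self.radius

def patch_predict(global_clf, patches, x):
    """Use the first covering patch model if any, otherwise fall back to the global model."""
    for patch in patches:
        if patch.covers(x):
            return patch.clf.predict(x.reshape(1, -1))[0]
    return global_clf.predict(x.reshape(1, -1))[0]
```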

5.2. Patch-Ensemble Classifier

In classification, a classifier tries to learn the decision boundaries between two or more classes. If a set of training samples all belong to the same class, we can directly detect whether a new testing sample belongs to this class during the prediction phase; One-Class SVM is well suited to this problem [92]. In patch learning, when a testing sample must choose between the global classifier and several patch classifiers, it is necessary to detect the similarity between the sample and the patches. The patch-ensemble classification method designed in this paper therefore uses an existing imbalance ensemble classification algorithm as the global classifier and One-Class SVM as the patch classifier. The specific process is shown in Algorithm 2.
In the training process, the number of patch classifiers is determined by the number of classes misclassified by the global classifier. To ensure that the global classifier maintains its best classification effect on multi-class imbalanced datasets, we weaken the boundary between the global and patch training data, allowing the training samples of the global classifier to partly overlap with those of the patch classifiers. During the testing process, the distances between a new sample and the center points of the training classes are calculated and used as the selection condition to dynamically choose a patch classifier or the global classifier for the testing data.
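A simplified version of this procedure is sketched below: a global classifier (any scikit-learn-style estimator can stand in for the imbalance ensemble algorithm) is trained first, one One-Class SVM patch is fitted per misclassified class, and test samples are routed by their distance to the class centers. The distance threshold and the overlap choice are illustrative assumptions mirroring the description above, not the exact published procedure.

```python
import numpy as np
from sklearn.svm import OneClassSVM

def fit_patch_ensemble(global_clf, X, y):
    """Train the global classifier, then one One-Class SVM patch per misclassified class."""
    global_clf.fit(X, y)
    wrong = global_clf.predict(X) != y
    patches, centers = {}, {}
    for c in np.unique(y[wrong]):                        # classes the global model gets wrong
        X_c = X[y == c]                                  # allow overlap with the global training data
        patches[c] = OneClassSVM(gamma="scale", nu=0.1).fit(X_c)
        centers[c] = X_c.mean(axis=0)
    return global_clf, patches, centers

def predict_patch_ensemble(global_clf, patches, centers, x, threshold=1.0):
    """Route x to the nearest patch if it is close enough, otherwise use the global model."""
    if patches:
        c = min(centers, key=lambda k: np.linalg.norm(x - centers[k]))
        close = np.linalg.norm(x - centers[c]) <= threshold
        if close and patches[c].predict(x.reshape(1, -1))[0] == 1:
            return c                                     # the patch claims x as class c
    return global_clf.predict(x.reshape(1, -1))[0]
```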

5.3. Experiments and Analysis

To explore the effectiveness of patch-ensemble classifiers for imbalanced classification, the patch-ensemble classifier is compared with the classical imbalance ensemble algorithms. We choose the AdaC2M1 algorithm as the global classifier and One-Class SVM as the patch classifier; the relevant experimental data are detailed in Table 2 and Table 3. The performance of the designed patch-ensemble classifier is evaluated by five-fold cross-validation on multiple datasets, and its results are compared with the top five of the 14 classical imbalance ensemble algorithms. To show the results more clearly, the classification results of all methods are ranked as a whole, and the number of times each algorithm achieves a dominant classification result over the 56 datasets is counted under the MAvA, precision, G-mean, and F-measure indexes.
Algorithm 2: Patch ensemble classifier.
As shown in Table 8, Table 9, Table 10 and Table 11, the proposed patch-ensemble classification method achieves better MAvA, precision, and F-measure values on more than half of the datasets. If only the top two results are considered, the same conclusion can be drawn. Under the G-mean indicator, the proposed method obtains the optimal or suboptimal classification result on 12 imbalanced datasets, which shows a certain degree of classification potential compared with the other methods.

6. Conclusions

In this study, we reviewed state-of-the-art ensemble classification algorithms for imbalanced data and compared the performance of 14 existing multi-class imbalanced classification algorithms by incorporating a dynamic selection strategy. By applying these dynamic imbalance ensemble classifiers to 56 public datasets, the experimental results demonstrate that the dynamic ensemble classification methods obtain significantly better MAvA, precision, G-mean, and F-measure performances than the original 14 algorithms. In particular, dynamic multi-class ensemble classifiers have the potential to achieve an ideal identification performance. We also designed a patch ensemble classification method, which uses misclassified samples to train patch classifiers for increasing classification diversity. Experiments showed that this method has a certain classification potential for multi-class imbalanced classification. In future work, we will consider the imbalance ratio to further improve the classification performance and tackle practical problems under its guidance.

Author Contributions

Writing—original draft and model visualization, D.Z.; writing—review and editing, X.W.; data curation, D.Z.; methodology and supervision, L.W.; funding acquisition, Y.M. and X.W.; writing—review and supervision, L.W. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Natural Science Foundation of China under Grants Nos. 61803065 and 62006071 and the Science and Technology Project of the Science and Technology department of Henan province (Nos. 212102210149 and 202102210338).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The raw experimental data are available at the following websites: https://archive.ics.uci.edu/ml/index.php, https://www.openml.org, https://sci2s.ugr.es/keel/imbalanced.php, and http://tunedit.org/repo/PROMISE/DefectPrediction (all accessed on 24 June 2021).

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Fasihi, M.; Nadimi-Shahraki, M.H.; Jannesari, A. Multi-Class cardiovascular diseases diagnosis from electrocardiogram signals using 1-D convolution neural network. In Proceedings of the 2020 IEEE 21st International Conference on Information Reuse and Integration for Data Science (IRI), IEEE Computer Society, Las Vegas, NV, USA, 11–13 August 2020; pp. 372–378. [Google Scholar] [CrossRef]
  2. Kuo, K.M.; Talley, P.; Kao, Y.; Huang, C.H. A multi-class classification model for supporting the diagnosis of type II diabetes mellitus. PeerJ 2020, 8, 9920. [Google Scholar] [CrossRef] [PubMed]
  3. Hosni, M.; García-Mateos, G.; Gea, J.M.C.; Idri, A.; Fernández-Alemán, J.L. A mapping study of ensemble classification methods in lung cancer decision support systems. Med Biol. Eng. Comput. 2020, 1–17. [Google Scholar] [CrossRef] [PubMed]
  4. Wang, S.; Elkin, M.E.; Zhu, X. Imbalanced learning for hospital readmission prediction using national readmission database. In Proceedings of the 2020 IEEE International Conference on Knowledge Graph (ICKG), Nanjing, China, 9–11 August 2020; pp. 116–122. [Google Scholar] [CrossRef]
  5. Piir, G.; Sild, S.; Maran, U. Binary and multi-class classification for androgen receptor agonists, antagonists and binders. Chemosphere 2020, 128313. [Google Scholar] [CrossRef] [PubMed]
  6. Sah, A.K.; Mishra, A.; Reddy, U.S. Machine learning approach for feature interpretation and classification of genetic mutations leading to tumor and cancer. In Advances in Electrical and Computer Technologies; Springer: Berlin/Heidelberg, Germany, 2020; pp. 387–395. [Google Scholar] [CrossRef]
  7. García-Mendoza, C.V.; Gambino, O.J.; Villarreal-Cervantes, M.G.; Calvo, H. Evolutionary optimization of ensemble learning to determine sentiment polarity in an unbalanced multiclass corpus. Entropy 2020, 22, 1020. [Google Scholar] [CrossRef] [PubMed]
  8. Bargshady, G.; Zhou, X.; Deo, R.C.; Soar, J.; Whittaker, F.; Wang, H. Ensemble neural network approach detecting pain intensity from facial expressions. Artif. Intell. Med. 2020, 109, 101954. [Google Scholar] [CrossRef]
  9. Yilmaz, I.; Masum, R.; Siraj, A. Addressing imbalanced data problem with generative adversarial network for intrusion detection. In Proceedings of the 2020 IEEE 21st International Conference on Information Reuse and Integration for Data Science (IRI), Las Vegas, NV, USA, 11–13 August 2020; pp. 25–30. [Google Scholar] [CrossRef]
  10. Ducharlet, K.; Travé-Massuyès, L.; Lann, M.V.L.; Miloudi, Y. A multi-phase iterative approach for anomaly detection and its agnostic evaluation. In International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems; Springer: Berlin/Heidelberg, Germany, 2020; pp. 505–517. [Google Scholar] [CrossRef]
  11. Shi, Z. Improving k-nearest neighbors algorithm for imbalanced data classification. IOP Conf. Ser. Mater. Sci. Eng. 2020, 719, 12072. [Google Scholar] [CrossRef]
  12. Wang, C.; Yang, Y. Nearest neighbor with double neighborhoods algorithm for imbalanced classification. Int. J. Appl. Math. 2020, 50, 1–13. Available online: http://www.iaeng.org/IJAM/issues_v50/issue_1/IJAM_50_1_21.pdf (accessed on 3 November 2020).
  13. Fu, G.H.; Wu, Y.J.; Zong, M.J.; Yi, L.Z. Feature selection and classification by minimizing overlap degree for class-imbalanced data in metabolomics. Chemom. Intell. Lab. Syst. 2020, 196, 103906. [Google Scholar] [CrossRef]
  14. Peng, Y.; Li, C.; Wang, K.; Gao, Z.; Yu, R. Examining imbalanced classification algorithms in predicting real-time traffic crash risk. Accid. Anal. Prev. 2020, 144, 105610. [Google Scholar] [CrossRef]
  15. Krawczyk, B. Learning from imbalanced data: Open challenges and future directions. Prog. Artif. Intell. 2016, 5, 221–232. [Google Scholar] [CrossRef] [Green Version]
  16. Wang, D.; Zhang, X.; Chen, H.; Zhou, Y.; Cheng, F. A sintering state recognition framework to integrate prior knowledge and hidden information considering class imbalance. IEEE Trans. Ind. Electron. 2020, 1. [Google Scholar] [CrossRef]
  17. Charte, F.; Rivera, A.J.; del Jesus, M.J.; Herrera, F. Addressing imbalance in multilabel classification: Measures and random resampling algorithms. Neurocomputing 2015, 163, 3–16. [Google Scholar] [CrossRef]
  18. Tao, X.; Li, Q.; Ren, C.; Guo, W.; He, Q.; Liu, R.; Zou, J. Affinity and class probability-based fuzzy support vector machine for imbalanced data sets. Neural Netw. 2020, 122, 289–307. [Google Scholar] [CrossRef]
  19. Galar, M.; Fernandez, A.; Barrenechea, E.; Bustince, H.; Herrera, F. A review on ensembles for the class imbalance problem: Bagging-, boosting-, and hybrid-based approaches. IEEE Trans. Syst. 2011, 42, 463–484. [Google Scholar] [CrossRef]
  20. Herrera, F.; Charte, F.; Rivera, A.J.; del Jesus, M.J. Ensemble-based classifiers. In Multilabel Classification; Springer: Berlin/Heidelberg, Germany, 2016; pp. 101–113. [Google Scholar] [CrossRef]
  21. Dong, X.; Yu, Z.; Cao, W.; Shi, Y.; Ma, Q. A survey on ensemble learning. Front. Comput. Sci. 2020, 14, 1–18. [Google Scholar] [CrossRef]
  22. Vranjković, V.S.; Struharik, R.J.; Novak, L.A. Hardware acceleration of homogeneous and heterogeneous ensemble classifiers. Microprocess. Microsyst. 2015, 39, 782–795. [Google Scholar] [CrossRef]
  23. Guan, H.; Zhang, Y.; Cheng, H.D.; Tang, X. Bounded–abstaining classification for breast tumors in imbalanced ultrasound images. Int. J. Appl. Math. Comput. Sci. 2020, 30, 325–336. [Google Scholar] [CrossRef]
  24. Sun, J.; Li, H.; Fujita, H.; Fu, B.; Ai, W. Class-imbalanced dynamic financial distress prediction based on Adaboost-SVM ensemble combined with SMOTE and time weighting. Inf. Fusion 2020, 54, 128–144. [Google Scholar] [CrossRef]
  25. Jiang, X.; Ge, Z. Data augmentation classifier for imbalanced fault classification. IEEE Trans. Autom. Sci. Eng. 2020, 1–12. [Google Scholar] [CrossRef]
  26. Korkmaz, S. Deep learning-based imbalanced data classification for drug discovery. J. Chem. Inf. Model. 2020, 60, 4180–4190. [Google Scholar] [CrossRef]
  27. Ho, T.K.; Hull, J.J.; Srihari, S.N. Decision combination in multiple classifier systems. IEEE Trans. Pattern Anal. Mach. Intell. 1994, 16, 66–75. [Google Scholar] [CrossRef] [Green Version]
  28. Nguyen, T.T.; Dang, M.T.; Baghel, V.A.; Luong, A.V.; McCall, J.; Liew, A.W. Evolving interval-based representation for multiple classifier fusion. Knowl.-Based Syst. 2020, 106034. [Google Scholar] [CrossRef]
  29. Yuan, P.; Wang, B.; Mao, Z. Using multiple classifier behavior to develop a dynamic outlier ensemble. Int. J. Mach. Learn. Cybern. 2020, 1–13. [Google Scholar] [CrossRef]
  30. Huang, C.; Li, Y.; Chen, C.L.; Tang, X. Deep imbalanced learning for face recognition and attribute prediction. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 42, 2781–2794. [Google Scholar] [CrossRef] [Green Version]
  31. Ma, S.; Xu, J.; Zhang, C. Automatic identification of cited text spans: A multi-classifier approach over imbalanced dataset. Scientometrics 2018, 116, 1303–1330. [Google Scholar] [CrossRef]
  32. Kubat, M.; Holte, R.; Matwin, S. Learning when negative examples abound. In European Conference on Machine Learning; Springer: Berlin/Heidelberg, Germany, 1997; pp. 146–153. [Google Scholar] [CrossRef] [Green Version]
  33. Hripcsak, G.; Rothschild, A.S. Agreement, the f-measure, and reliability in information retrieval. J. Am. Med. Inform. Assoc. 2005, 12, 296–298. [Google Scholar] [CrossRef] [PubMed]
  34. Zhang, C.; Bi, J.; Xu, S.; Ramentol, E.; Fan, G.; Qiao, B.; Fujita, H. Multi-imbalance: An open-source software for multi-class imbalance learning. Knowl.-Based Syst. 2019, 174, 137–143. [Google Scholar] [CrossRef]
  35. Vong, C.M.; Du, J. Accurate and efficient sequential ensemble learning for highly imbalanced multi-class data. Neural Netw. 2020, 128, 268–278. [Google Scholar] [CrossRef]
  36. Ramentol, E.; Vluymans, S.; Verbiest, N.; Caballero, Y.; Bello, R.; Cornelis, C.; Herrera, F. IFROWANN: Imbalanced fuzzy-rough ordered weighted average nearest neighbor classification. IEEE Trans. Fuzzy Syst. 2014, 23, 1622–1637. [Google Scholar] [CrossRef]
  37. Xu, C.; Zhu, G. Semi-supervised learning algorithm based on linear lie group for imbalanced multi-class classification. Neural Process. Lett. 2020, 52, 1–21. [Google Scholar] [CrossRef]
  38. Zhang, Z.L.; Luo, X.G.; González, S.; García, S.; Herrera, F. DRCW-ASEG: One-versus-one distance-based relative competence weighting with adaptive synthetic example generation for multi-class imbalanced datasets. Neurocomputing 2018, 285, 176–187. [Google Scholar] [CrossRef]
  39. Hoens, T.R.; Qian, Q.; Chawla, N.V.; Zhou, Z.H. Building decision trees for the multi-class imbalance problem. In Pacific-Asia Conference on Knowledge Discovery and Data Mining; Springer: Berlin/Heidelberg, Germany, 2012; pp. 122–134. [Google Scholar] [CrossRef] [Green Version]
  40. Duan, H.; Wei, Y.; Liu, P.; Yin, H. A novel ensemble framework based on k-means and resampling for imbalanced data. Appl. Sci. 2020, 10, 1684. [Google Scholar] [CrossRef] [Green Version]
  41. Kunakorntum, I.; Hinthong, W.; Phunchongharn, P. A synthetic minority based on probabilistic distribution (SyMProD) oversampling for imbalanced datasets. IEEE Access 2020, 8, 114692–114704. [Google Scholar] [CrossRef]
  42. Guo, H.; Zhou, J.; Wu, C.A. Ensemble learning via constraint projection and undersampling technique for class-imbalance problem. Soft Comput. 2019, 1–17. [Google Scholar] [CrossRef]
  43. Abdi, L.; Hashemi, S. To combat multi-class imbalanced problems by means of over-sampling techniques. IEEE Trans. Knowl. Data Eng. 2015, 28, 238–251. [Google Scholar] [CrossRef]
  44. Li, Y.J.; Guo, H.X.; Liu, X.; Li, Y.; Li, J.L. Adapted ensemble classification algorithm based on multiple classifier system and feature selection for classifying multi-class imbalanced data. Knowl.-Based Syst. 2016, 94, 88–104. [Google Scholar] [CrossRef]
  45. Sahare, M.; Gupta, H. A review of multi-class classification for imbalanced data. Int. J. Adv. Comput. Res. 2012, 2, 160–164. Available online: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.300.8687&rep=rep1&type=pdf (accessed on 7 February 2020).
  46. Freund, Y.; Schapire, R.E. A decision-theoretic generalization of on-line learning and an application to boosting. J. Comput. Syst. Sci. 1997, 55, 119–139. [Google Scholar] [CrossRef] [Green Version]
  47. Johnson, J.M.; Khoshgoftaar, T.M. Survey on deep learning with class imbalance. J. Big Data 2019, 6, 27. [Google Scholar] [CrossRef]
  48. Dietterich, T.G.; Bakiri, G. Error-correcting output codes: A general method for improving multiclass inductive learning programs. In The Ninth National Conference on Artificial Intelligence; AAAI Press: Palo Alto, CA, USA, 1991; pp. 572–577. Available online: https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.53.9478&rep=rep1&type=pdf (accessed on 10 March 2020).
  49. Rifkin, R.; Klautau, A. In defense of one-vs-all classification. J. Mach. Learn. Res. 2004, 5, 101–141. [Google Scholar] [CrossRef]
  50. Hastie, T.; Tibshirani, R. Classification by pairwise coupling. Adv. Neural Inf. Process. Syst. 1998, 507–513. Available online: https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.7.572&rep=rep1&type=pdf (accessed on 15 January 2020). [CrossRef]
  51. Garcia-Pedrajas, N.; Ortiz-Boyer, D. Improving multiclass pattern recognition by the combination of two strategies. IEEE Trans. Pattern Anal. Mach. Intell. 2006, 28, 1001–1006. [Google Scholar] [CrossRef] [PubMed]
  52. Murphey, Y.L.; Wang, H.; Ou, G.; Feldkamp, L.A. OAHO: An effective algorithm for multi-class learning from imbalanced data. In Proceedings of the 2007 International Joint Conference on Neural Networks, Orlando, FL, USA, 12–17 August 2007; pp. 406–411. [Google Scholar] [CrossRef]
  53. Ghanem, A.S.; Venkatesh, S.; West, G. Multi-class pattern classification in imbalanced data. In Proceedings of the 2010 20th International Conference on Pattern Recognition, Istanbul, Turkey, 23–26 August 2010; pp. 2881–2884. [Google Scholar] [CrossRef] [Green Version]
  54. Ghanem, A.S.; Venkatesh, S.; West, G. Learning in imbalanced relational data. In Proceedings of the 2008 19th International Conference on Pattern Recognition, Tampa, FL, USA, 8–11 December 2008; pp. 1–4. [Google Scholar] [CrossRef] [Green Version]
  55. Bi, J.; Zhang, C. An empirical comparison on state-of-the-art multi-class imbalance learning algorithms and a new diversified ensemble learning scheme. Knowl.-Based Syst. 2018, 158, 81–93. [Google Scholar] [CrossRef]
  56. Liu, X.Y.; Li, Q.Q.; Zhou, Z.H. Learning imbalanced multi-class data with optimal dichotomy weights. In Proceedings of the 2013 IEEE 13th International Conference on Data Mining, Dallas, TX, USA, 7–10 December 2013; pp. 478–487. [Google Scholar] [CrossRef] [Green Version]
  57. Chen, L.; Zhu, Y. A composite cost-sensitive neural network for imbalanced classification. In Proceedings of the 2020 39th Chinese Control Conference (CCC), Shenyang, China, 27–29 July 2020; pp. 7264–7268. [Google Scholar] [CrossRef]
  58. Hastie, T.; Rosset, S.; Zhu, J.; Zou, H. Multi-class adaboost. Stat. Its Interface 2009, 2, 349–360. [Google Scholar] [CrossRef] [Green Version]
  59. Sun, Y.; Kamel, M.S.; Wang, Y. Boosting for learning multiple classes with imbalanced class distribution. In Proceedings of the Sixth International Conference on Data Mining (ICDM’06), Hong Kong, China, 18–22 December 2006; pp. 592–602. [Google Scholar] [CrossRef] [Green Version]
  60. Wang, S.; Chen, H.; Yao, X. Negative correlation learning for classification ensembles. In Proceedings of the The 2010 International Joint Conference on Neural Networks, Barcelona, Spain, 18–23 July 2010; pp. 1–8. [Google Scholar] [CrossRef]
  61. Fernández-Baldera, A.; Baumela, L. Multi-class boosting with asymmetric binary weak-learners. Pattern Recognit. 2014, 47, 2080–2090. [Google Scholar] [CrossRef]
  62. Seiffert, C.; Khoshgoftaar, T.M.; Hulse, J.V.; Napolitano, A. RUSBoost: A hybrid approach to alleviating class imbalance. IEEE Trans. Syst. 2010, 40, 185–197. [Google Scholar] [CrossRef]
  63. Galar, M.; Fernández, A.; Barrenechea, E.; Herrera, F. EUSBoost: Enhancing ensembles for highly imbalanced data-sets by evolutionary undersampling. Pattern Recognit. 2013, 46, 3460–3471. [Google Scholar] [CrossRef]
  64. Luo, R.S.; Dian, S.Y.; Wang, C.; Cheng, P.; Tang, Z.D.; Yu, Y.M.; Wang, S.X. Bagging of xgboost classifiers with random under-sampling and tomek link for noisy label-imbalanced data. In IOP Conference Series: Materials Science and Engineering; IOP Publishing: Bristol, UK, 2018; p. 012004. [Google Scholar] [CrossRef] [Green Version]
  65. Ramentol, E.; Caballero, Y.; Bello, R.; Herrera, F. SMOTE-RSB*: A hybrid preprocessing approach based on oversampling and undersampling for high imbalanced data-sets using SMOTE and rough sets theory. Knowl. Inf. Syst. 2012, 33, 245–265. [Google Scholar] [CrossRef]
  66. Douzas, G.; Bacao, F.; Last, F. Improving imbalanced learning through a heuristic oversampling method based on k-means and SMOTE. Inf. Sci. 2018, 465, 1–20. [Google Scholar] [CrossRef] [Green Version]
  67. Lu, Y.; Cheung, Y.M.; Tang, Y.Y. Hybrid sampling with bagging for class imbalance learning. In Pacific-Asia Conference on Knowledge Discovery and Data Mining; Springer: Berlin/Heidelberg, Germany, 2016; pp. 14–26. [Google Scholar] [CrossRef]
  68. Ahmed, S.; Mahbub, A.; Rayhan, F.; Jani, R.; Shatabda, S.; Farid, D.M. Hybrid methods for class imbalance learning employing bagging with sampling techniques. In Proceedings of the 2017 2nd International Conference on Computational Systems and Information Technology for Sustainable Solution (CSITSS), Bengaluru, India, 21–23 December 2017; pp. 1–5. [Google Scholar] [CrossRef]
  69. He, H.; Bai, Y.; Garcia, E.A.; Li, S. ADASYN: Adaptive synthetic sampling approach for imbalanced learning. In Proceedings of the 2008 IEEE World Congress on Computational Intelligence, Hong Kong, China, 1–8 June 2008; pp. 1322–1328. [Google Scholar] [CrossRef] [Green Version]
Figure 1. Multi-class imbalance ensemble classification.
Figure 2. The process of dynamic ensemble classification.
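To make the flow in Figure 2 concrete, the following minimal sketch (an illustration under stated assumptions, not the exact experimental pipeline of this study) uses scikit-learn together with the DESlib package: a pool of base classifiers is trained first, and a dynamic ensemble selection technique then chooses the most competent pool members for each query sample.

from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from deslib.des.knora_e import KNORAE  # KNORA-Eliminate, one widely used DES technique

# Toy imbalanced binary data; any binary or multi-class dataset is handled the same way.
X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)
# Hold out part of the training data as the dynamic selection (DSEL) set.
X_fit, X_dsel, y_fit, y_dsel = train_test_split(X_train, y_train, stratify=y_train, random_state=0)

# Step 1: train a pool of base classifiers.
pool = BaggingClassifier(DecisionTreeClassifier(max_depth=5), n_estimators=30, random_state=0)
pool.fit(X_fit, y_fit)

# Step 2: fit a dynamic ensemble selection method over the trained pool.
des = KNORAE(pool_classifiers=pool, random_state=0)
des.fit(X_dsel, y_dsel)

# Step 3: for each test sample, the most competent base classifiers are selected and combined.
print("Test accuracy with dynamic selection:", des.score(X_test, y_test))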
Figure 3. The total number of binary-class imbalanced datasets on which prediction results improved after adding dynamic selection.
Figure 4. The total number of multi-class imbalanced datasets on which prediction results improved after adding dynamic selection.
Figure 5. The MAvA results of MultiImAO for 56 datasets.
Figure 6. The MAvA results of MultiImOVO for 56 datasets.
Figure 7. The MAvA results of MultiImOVA for 56 datasets.
Figure 8. The MAvA results of MultiImOAHO for 56 datasets.
Figure 9. Patch classifier.
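Figure 9 illustrates the patch classifier. The sketch below is only one plausible realization under explicit assumptions (scikit-learn; a one-class SVM to delimit the patch region), not the authors' exact procedure: a base ensemble is trained, the training samples it misclassifies are collected, and a patch classifier fitted on those samples takes over for queries that fall into that region.

from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import OneClassSVM
from sklearn.tree import DecisionTreeClassifier

def fit_patch_ensemble(X, y):
    # Base ensemble trained on the full training set.
    base = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
    wrong = base.predict(X) != y                        # training samples the base ensemble misclassifies
    region = OneClassSVM(gamma="scale").fit(X[wrong])   # delimit the "patch region" around those samples
    patch = DecisionTreeClassifier(max_depth=5, random_state=0).fit(X[wrong], y[wrong])
    return base, region, patch

def predict_patch_ensemble(base, region, patch, X):
    y_pred = base.predict(X)
    in_region = region.predict(X) == 1                  # +1 means the query falls inside the patch region
    y_pred[in_region] = patch.predict(X[in_region])     # the patch classifier overrides the base prediction there
    return y_pred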
Table 1. Sources of imbalanced datasets.
Name | Source
UCI | https://archive.ics.uci.edu/ml/index.php (accessed on 9 November 2019)
OpenML | https://www.openml.org (accessed on 20 March 2020)
KEEL | https://sci2s.ugr.es/keel/imbalanced.php (accessed on 7 January 2020)
DefectPrediction | http://tunedit.org/repo/PROMISE/DefectPrediction (accessed on 10 April 2020)
Table 2. Description of binary datasets.
Datasets | Number of Data | Number of Attributes | Continuous Attributes | Discrete Attributes | Classes Distribution | Classes | Imbalance Ratio
transfusion | 748 | 4 | 4 | 0 | 178, 570 | 2 | 3.202
heart | 270 | 13 | 13 | 0 | 150, 120 | 2 | 1.250
chess | 3196 | 35 | 35 | 0 | 1669, 1527 | 2 | 1.093
sick | 2800 | 27 | 27 | 0 | 2629, 171 | 2 | 15.374
redwinequality | 691 | 11 | 11 | 0 | 681, 10 | 2 | 68.100
ar1 | 121 | 29 | 29 | 0 | 112, 9 | 2 | 12.444
ar3 | 63 | 29 | 29 | 0 | 55, 8 | 2 | 6.875
ar4 | 107 | 29 | 29 | 0 | 87, 20 | 2 | 4.350
ar5 | 36 | 29 | 29 | 0 | 28, 8 | 2 | 3.500
ar6 | 101 | 29 | 29 | 0 | 86, 15 | 2 | 5.733
cm1_req | 89 | 8 | 0 | 8 | 69, 20 | 2 | 3.450
jEdit_4.0_4.2 | 274 | 8 | 0 | 8 | 140, 134 | 2 | 1.045
jEdit_4.2_4.3 | 369 | 8 | 0 | 8 | 165, 204 | 2 | 1.236
kc2 | 522 | 21 | 7 | 14 | 415, 107 | 2 | 3.879
kc3 | 458 | 39 | 14 | 25 | 415, 43 | 2 | 9.651
mc2 | 161 | 39 | 15 | 24 | 109, 52 | 2 | 2.096
mw1 | 403 | 37 | 13 | 24 | 372, 31 | 2 | 12.000
pc1_req | 320 | 8 | 0 | 8 | 213, 107 | 2 | 1.991
banknote-authentication | 1372 | 4 | 4 | 0 | 762, 610 | 2 | 1.249
blood-transfusion-service-center | 748 | 4 | 0 | 4 | 570, 178 | 2 | 3.202
breast_w | 699 | 9 | 0 | 9 | 458, 241 | 2 | 1.900
climate-model-simulation-crashes | 540 | 20 | 17 | 3 | 46, 494 | 2 | 10.739
diabetes | 768 | 8 | 2 | 6 | 268, 500 | 2 | 1.866
ilpd | 583 | 9 | 5 | 4 | 416, 167 | 2 | 2.491
monks-problems1 | 556 | 6 | 0 | 6 | 272, 284 | 2 | 1.044
monks-problems2 | 601 | 6 | 0 | 6 | 300, 301 | 2 | 1.003
monks-problems3 | 554 | 6 | 0 | 6 | 275, 279 | 2 | 1.015
mozilla4 | 15,545 | 4 | 0 | 4 | 10,437, 5108 | 2 | 2.043
pc1 | 1109 | 21 | 17 | 4 | 77, 1032 | 2 | 13.403
pc3 | 1563 | 37 | 14 | 23 | 160, 1403 | 2 | 8.769
pc4 | 1458 | 37 | 12 | 25 | 178, 1280 | 2 | 7.191
phoneme | 5404 | 5 | 5 | 0 | 3818, 1586 | 2 | 2.407
Table 3. Description of multi-class imbalanced datasets.
Datasets | Number of Data | Number of Attributes | Continuous Attributes | Discrete Attributes | Classes Distribution | Classes | Imbalance Ratio
wine | 178 | 13 | 11 | 2 | 59, 71, 48 | 3 | 1.479
hayesroth | 132 | 4 | 0 | 4 | 51, 51, 30 | 3 | 1.700
contraceptive | 1473 | 9 | 0 | 9 | 629, 333, 511 | 3 | 1.889
penbased | 1100 | 16 | 0 | 16 | 114, 114, 106, 114, 106, 105, 115, 105, 106, 115 | 10 | 1.095
newthyroid | 215 | 5 | 4 | 1 | 150, 35, 30 | 3 | 5.000
dermatology | 366 | 34 | 0 | 34 | 112, 61, 72, 49, 52, 20 | 6 | 5.600
balance | 625 | 4 | 0 | 4 | 49, 288, 288 | 3 | 5.878
glass | 214 | 9 | 9 | 0 | 70, 76, 17, 13, 9, 29 | 6 | 8.444
auto | 406 | 7 | 1 | 6 | 254, 73, 79 | 3 | 3.479
yeast | 1484 | 9 | 9 | 0 | 288, 480, 626, 35, 30, 20, 5 | 7 | 125.200
thyroid | 720 | 21 | 6 | 15 | 17, 37, 666 | 3 | 39.176
lymphography | 148 | 18 | 0 | 18 | 57, 37, 18, 10, 8, 8, 8, 2 | 8 | 28.500
ecoli | 336 | 7 | 7 | 0 | 143, 77, 2, 2, 35, 20, 5, 52 | 8 | 71.500
pageblocks | 548 | 10 | 4 | 6 | 492, 33, 3, 8, 12 | 5 | 164.000
shuttle | 2175 | 9 | 0 | 9 | 1706, 2, 6, 338, 123 | 5 | 853.000
wan_2 | 750 | 2 | 2 | 0 | 100, 150, 250, 250 | 4 | 2.500
zoo | 101 | 16 | 0 | 16 | 41, 20, 5, 13, 4, 8, 10 | 7 | 10.250
hayes | 132 | 5 | 0 | 5 | 51, 51, 30 | 3 | 1.700
waveform | 5000 | 40 | 40 | 0 | 1692, 1653, 1655 | 3 | 1.024
auto_205 | 205 | 25 | 18 | 7 | 54, 32, 27, 67, 22, 3 | 6 | 22.333
car | 1728 | 6 | 6 | 0 | 1210, 384, 69, 65 | 4 | 18.615
vehicle | 846 | 18 | 18 | 0 | 217, 217, 216, 196 | 4 | 1.107
balance-scale | 625 | 4 | 0 | 4 | 49, 288, 288 | 3 | 5.878
vowel | 990 | 10 | 10 | 0 | 180, 180, 180, 90, 180, 180 | 6 | 2.000
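In Tables 2 and 3, the imbalance ratio is the size of the largest class divided by the size of the smallest class. A short sketch (plain NumPy, with illustrative names) of how the "Classes Distribution" and "Imbalance Ratio" columns can be derived from a label vector:

import numpy as np

def class_distribution_and_imbalance_ratio(y):
    # Count the samples in each class and compute largest-class / smallest-class.
    _, counts = np.unique(y, return_counts=True)
    return counts.tolist(), counts.max() / counts.min()

# Example reproducing the "newthyroid" row: classes of size 150, 35, and 30 give a ratio of 5.0.
y = np.repeat(["normal", "hyper", "hypo"], [150, 35, 30])
dist, ratio = class_distribution_and_imbalance_ratio(y)
print(dist, round(ratio, 3))   # counts appear in sorted label order: [35, 30, 150]; ratio = 5.0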
Table 4. Numbers of improved classification algorithms after dynamic selection under the MAvA index.
Datasets | Number | Datasets | Number | Datasets | Number
wine | 4 | ar3 | 8 | shuttle | 6
hayesroth | 7 | ar4 | 10 | wan_2 | 7
contraceptive | 10 | ar5 | 9 | zoo | 14
penbased | 12 | ar6 | 12 | transfusion | 14
newthyroid | 10 | cm1_req | 10 | waveform | 14
dermatology | 10 | jEdit_4.0_4.2 | 6 | chess | 13
balance | 6 | jEdit_4.2_4.3 | 11 | auto_205 | 13
glass | 12 | kc2 | 7 | car | 11
auto | 9 | kc3 | 9 | vehicle | 9
yeast | 8 | mc2 | 13 | heart | 13
thyroid | 12 | mw1 | 9 | hayes | 11
lymphography | 10 | pc1_req | 5 | sick | 13
ecoli | 11 | balance-scale | 7 | redwinequality | 8
pageblocks | 7 | banknote-authentication | 2 | ar1 | 12
blood-transfusion-service-center | 9 | breast_w | 9 | climate-model-simulation-crashes | 14
diabetes | 7 | ilpd | 10 | vowel | 14
monks-problems1 | 2 | monks-problems2 | 10 | monks-problems3 | 2
mozilla4 | 10 | pc1 | 9 | pc3 | 6
pc4 | 10 | phoneme | 12 |  | 
Table 5. Numbers of improved classification algorithms after dynamic selection under the G-mean index.
Datasets | Number | Datasets | Number | Datasets | Number
wine | 8 | ar3 | 11 | shuttle | 10
hayesroth | 9 | ar4 | 11 | wan_2 | 7
contraceptive | 10 | ar5 | 5 | zoo | 14
penbased | 12 | ar6 | 13 | transfusion | 11
newthyroid | 11 | cm1_req | 12 | waveform | 13
dermatology | 11 | jEdit_4.0_4.2 | 6 | chess | 12
balance | 8 | jEdit_4.2_4.3 | 13 | auto_205 | 13
glass | 9 | kc2 | 10 | car | 10
auto | 9 | kc3 | 10 | vehicle | 9
yeast | 9 | mc2 | 14 | heart | 8
thyroid | 13 | mw1 | 11 | hayes | 12
lymphography | 0 | pc1_req | 8 | sick | 11
ecoli | 8 | balance-scale | 8 | redwinequality | 12
pageblocks | 12 | banknote-authentication | 5 | ar1 | 12
blood-transfusion-service-center | 9 | breast_w | 9 | climate-model-simulation-crashes | 13
diabetes | 7 | ilpd | 10 | vowel | 14
monks-problems1 | 2 | monks-problems2 | 10 | monks-problems3 | 2
mozilla4 | 10 | pc1 | 9 | pc3 | 7
pc4 | 10 | phoneme | 12 |  | 
Table 6. Numbers of improved classification algorithms after dynamic selection under the Precision index.
Datasets | Number | Datasets | Number | Datasets | Number
wine | 4 | ar3 | 7 | shuttle | 7
hayesroth | 7 | ar4 | 11 | wan_2 | 9
contraceptive | 10 | ar5 | 6 | zoo | 14
penbased | 13 | ar6 | 13 | transfusion | 11
newthyroid | 9 | cm1_req | 10 | waveform | 10
dermatology | 5 | jEdit_4.0_4.2 | 5 | chess | 11
balance | 4 | jEdit_4.2_4.3 | 11 | auto_205 | 12
glass | 11 | kc2 | 8 | car | 13
auto | 9 | kc3 | 8 | vehicle | 9
yeast | 10 | mc2 | 12 | heart | 13
thyroid | 10 | mw1 | 9 | hayes | 13
lymphography | 11 | pc1_req | 7 | sick | 11
ecoli | 11 | balance-scale | 6 | redwinequality | 9
pageblocks | 5 | banknote-authentication | 7 | ar1 | 11
blood-transfusion-service-center | 9 | breast_w | 9 | climate-model-simulation-crashes | 13
diabetes | 6 | ilpd | 10 | vowel | 14
monks-problems1 | 2 | monks-problems2 | 10 | monks-problems3 | 2
mozilla4 | 10 | pc1 | 8 | pc3 | 5
pc4 | 9 | phoneme | 12 |  | 
Table 7. Numbers of improved classification algorithms after dynamic selection under the F-measure index.
Datasets | Number | Datasets | Number | Datasets | Number
wine | 4 | ar3 | 7 | shuttle | 7
hayesroth | 7 | ar4 | 9 | wan_2 | 7
contraceptive | 9 | ar5 | 8 | zoo | 9
penbased | 13 | ar6 | 11 | transfusion | 12
newthyroid | 10 | cm1_req | 10 | waveform | 5
dermatology | 5 | jEdit_4.0_4.2 | 4 | chess | 12
balance | 10 | jEdit_4.2_4.3 | 12 | auto_205 | 11
glass | 11 | kc2 | 9 | car | 12
auto | 9 | kc3 | 8 | vehicle | 9
yeast | 10 | mc2 | 11 | heart | 12
thyroid | 11 | mw1 | 8 | hayes | 8
lymphography | 7 | pc1_req | 5 | sick | 10
ecoli | 13 | balance-scale | 11 | redwinequality | 11
pageblocks | 7 | banknote-authentication | 11 | ar1 | 11
blood-transfusion-service-center | 9 | breast_w | 9 | climate-model-simulation-crashes | 12
diabetes | 7 | ilpd | 10 | vowel | 14
monks-problems1 | 2 | monks-problems2 | 10 | monks-problems3 | 2
mozilla4 | 10 | pc1 | 9 | pc3 | 6
pc4 | 10 | phoneme | 12 |  | 
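The four evaluation indices used in Tables 4 to 7 and in the comparison tables that follow are MAvA, G-mean, Precision, and F-measure. The sketch below computes them under the conventions commonly used for multi-class imbalance (an assumption stated here, not taken verbatim from the paper): MAvA as the arithmetic mean of the per-class recalls, G-mean as their geometric mean, and macro-averaged Precision and F-measure via scikit-learn.

import numpy as np
from sklearn.metrics import confusion_matrix, precision_score, f1_score

def mava_and_gmean(y_true, y_pred):
    cm = confusion_matrix(y_true, y_pred)
    recalls = np.diag(cm) / cm.sum(axis=1)     # per-class recall (class-wise accuracy)
    return recalls.mean(), float(np.prod(recalls)) ** (1.0 / len(recalls))

# Small worked example with three classes.
y_true = [0, 0, 0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 0, 1, 1, 1, 2, 0]
mava, gmean = mava_and_gmean(y_true, y_pred)
precision = precision_score(y_true, y_pred, average="macro", zero_division=0)
f_measure = f1_score(y_true, y_pred, average="macro", zero_division=0)
print(round(mava, 3), round(gmean, 3), round(precision, 3), round(f_measure, 3))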
Table 8. MAvA results.
Result | Patch Ensemble | A | B | C | D | E | F | G | H | I | J | K | L | M | N
Times of the optimal MAvA | 7 | 1 | 3 | 8 | 1 | 1 | 3 | 2 | 4 | 3 | 7 | 8 | 8 | 7 | 5
Times of the second MAvA | 7 | 3 | 8 | 3 | 8 | 1 | 4 | 3 | 6 | 2 | 5 | 4 | 3 | 5 | 1
Times of the third MAvA | 7 | 1 | 12 | 1 | 4 | 4 | 7 | 4 | 6 | 4 | 2 | 2 | 3 | 2 | 1
Total | 21 | 5 | 23 | 12 | 13 | 6 | 14 | 9 | 16 | 9 | 14 | 14 | 14 | 14 | 7
1. The 14 classical imbalanced classification algorithms are AdaBoostNC (A), AdaC2M1 (B), FuzzyImbECOC (C), MHDDTECOC (D), HDDTOVA (E), ImECOCdense (F), ImECOCOVA (G), ImECOCsparse (H), MCHDDT (I), MultiIMAO (J), MultiIMOAHO (K), MultiIMOVA (L), MultiIMOVO (M), and Piboost (N). 2. "Times of the optimal MAvA" is the number of times an algorithm achieved the best classification result over the 56 datasets; "Times of the second MAvA" is the number of times it achieved the second-best result, and so on. 3. Bold numbers indicate classification results equal to or better than those of the Patch Ensemble classifier.
Table 9. G-mean results.
Result | Patch Ensemble | A | B | C | D | E | F | G | H | I | J | K | L | M | N
Times of the optimal G-mean | 6 | 3 | 5 | 8 | 0 | 0 | 2 | 2 | 3 | 1 | 10 | 10 | 9 | 9 | 3
Times of the second G-mean | 6 | 3 | 5 | 3 | 7 | 1 | 6 | 5 | 8 | 3 | 7 | 5 | 5 | 7 | 0
Times of the third G-mean | 6 | 1 | 9 | 1 | 6 | 4 | 11 | 7 | 8 | 5 | 4 | 5 | 4 | 4 | 2
Total | 18 | 7 | 19 | 12 | 13 | 5 | 19 | 14 | 19 | 9 | 21 | 20 | 18 | 20 | 5
Bold numbers indicate classification results equal to or better than those of the Patch Ensemble classifier.
Table 10. Precision results.
Result | Patch Ensemble | A | B | C | D | E | F | G | H | I | J | K | L | M | N
Times of the optimal Precision | 4 | 2 | 4 | 3 | 4 | 2 | 0 | 0 | 0 | 3 | 0 | 1 | 0 | 0 | 8
Times of the second Precision | 10 | 2 | 6 | 3 | 2 | 0 | 3 | 0 | 1 | 2 | 0 | 0 | 0 | 0 | 1
Times of the third Precision | 8 | 4 | 8 | 0 | 4 | 4 | 2 | 1 | 5 | 3 | 0 | 0 | 2 | 0 | 1
Total | 22 | 8 | 18 | 6 | 10 | 6 | 5 | 1 | 6 | 8 | 0 | 1 | 2 | 0 | 10
Bold numbers indicate classification results equal to or better than those of the Patch Ensemble classifier.
Table 11. F-measure results.
Result | Patch Ensemble | A | B | C | D | E | F | G | H | I | J | K | L | M | N
Times of the optimal F-measure | 3 | 3 | 2 | 4 | 3 | 5 | 0 | 0 | 0 | 5 | 0 | 0 | 0 | 0 | 12
Times of the second F-measure | 7 | 2 | 7 | 2 | 8 | 2 | 3 | 1 | 2 | 3 | 0 | 0 | 0 | 0 | 1
Times of the third F-measure | 9 | 2 | 10 | 1 | 7 | 4 | 0 | 0 | 2 | 6 | 1 | 0 | 0 | 1 | 0
Total | 19 | 7 | 19 | 7 | 18 | 11 | 3 | 1 | 4 | 14 | 1 | 0 | 0 | 1 | 13
Bold numbers indicate classification results equal to or better than those of the Patch Ensemble classifier.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
