Article

Coal Gangue Recognition during Coal Preparation Using an Adaptive Boosting Algorithm

1 School of Mechanical Electronic & Information Engineering, China University of Mining & Technology, Beijing 100083, China
2 Key Laboratory of Intelligent Mining and Robotics, Ministry of Emergency Management, Beijing 100083, China
3 Beijing Railway Electrification School, Beijing 102202, China
* Author to whom correspondence should be addressed.
Minerals 2023, 13(3), 329; https://doi.org/10.3390/min13030329
Submission received: 2 February 2023 / Revised: 19 February 2023 / Accepted: 24 February 2023 / Published: 26 February 2023
(This article belongs to the Special Issue Characterization, Processing and Utilization of Coal)

Abstract

The recognition of coal and gangue is the premise and foundation of coal gangue intelligent sorting. Adaptive boosting (AdaBoost) algorithm-based coal gangue identification has not been studied in depth. This paper proposed a coal gangue image recognition algorithm and a strong classifier based on the AdaBoost algorithm with a genetic algorithm (GA)-optimized support vector machine (SVM). One thousand coal gangue images were collected on-site and expanded to five thousand via rotation and exposure adjustment. The 12 gray-level gradient co-occurrence matrix texture features of the images were extracted to construct a feature vector, establishing the training dataset and test dataset. Selection of the SVM kernel function, the GA optimization parameter setting, and the base classifier number was discussed. The coal gangue image recognition effects of the AdaB-GA-SVM classifier and the other strong classifiers with different base SVM classifiers were investigated. The results indicated that the recognition accuracy of GA-SVM was the best when the kernel function of SVM was RBF and the population number, crossover probability, and mutation probability were 80, 0.9, and 0.005, respectively. The AdaB-GA-SVM classifier has excellent identification and effective classification performance with the highest accuracy of 95%, a precision rate of 92.8%, recall rate of 97.3%, and KS values of 0.79.

1. Introduction

Gangue sorting is one of the crucial processes in coal preparation because raw coal is mixed with gangue impurities during production. Gangue has a low calorific value and releases toxic gases, such as SO2, CO, CO2, and NOx, when burned, which reduces coal quality, pollutes the environment, and hinders the clean and efficient use of coal [1]. The existing coal gangue sorting methods are mainly manual and mechanical. As shown in Figure 1, when raw coal enters the raw coal preparation workshop, iron and sundries are removed by the iron remover, and the coal then passes to the raw coal classification screen. Material with a particle size of less than 50 mm goes directly to mechanical separation. Material with a particle size of more than 50 mm is manually sorted to remove visible gangue and part of the sundries, then crushed to a qualified particle size (less than 50 mm) and further separated by other mechanical methods, such as a moving screen jig. In manual sorting, illustrated in Figure 2, workers identify gangue in the coal flow by eye and pick it out by hand. The labor intensity is high and the working environment is poor: workers can easily inhale fine particles despite wearing protective masks and can easily be injured by the high-speed belt or scraper conveyor, which seriously affects their health and poses great safety risks [2]. To keep sorting workers away from this harsh environment, intelligent coal gangue sorting equipment, especially coal gangue sorting robots, has received considerable attention in the industry [3,4,5,6]. Coal gangue recognition is the foundation of intelligent sorting and a crucial technology for coal gangue sorting robots.
Investigations of coal gangue identification can be traced back to the 1960s, and more than 10 identification methods have been proposed, such as the γ-ray, X-ray, photoelectric, and infrared methods [7,8,9,10,11]. Despite numerous accomplishments, these methods have bottlenecks, such as limited application scenarios, radiation hazards, and low recognition accuracy. With the development of image processing technology and machine learning, research has shifted toward coal gangue image recognition. Wang, Li, and Yang [12] investigated coal gangue response characteristics under different illuminations and used an SVM classifier to identify coal and gangue. Zhao et al. [13] constructed a PSO-SVM model to recognize coal and gangue. Su et al. [14] and Pu et al. [15] introduced transfer learning for the identification of coal and gangue based on a convolutional neural network (CNN). Li et al. [16] proposed a hierarchical framework for coal gangue detection based on CNNs. McCoy and Auret [17] reviewed machine learning applications in mineral processing. Hou [18] established a coal gangue separation system based on the differences between coal and gangue in surface texture and grayscale features, and proposed a method combining image feature extraction with a feed-forward artificial neural network.
Alfarzaeai et al. [19] created a CNN-based model called CGR-CNN that uses thermal images as standard images for coal gangue recognition. Lei et al. [3] constructed a CNN-based deep visual network, the fast coal classification net (FCCN), and implemented a visual coal classification detection algorithm for coal gangue sorting robots. Liu, Li, et al. [1] studied coal gangue detection based on an enhanced YOLOv4. Li et al. [20] studied a coal gangue detection and recognition algorithm based on deformable convolution YOLOv3 (DCN-YOLOv3). Yan et al. [21] studied an intelligent classification method for coal gangue based on multispectral imaging technology and target detection with YOLOv5.1.
In the coal preparation line, the state of gangue in the raw coal flow varies: it may be exposed on top of the coal or partially or fully covered by pulverized coal. Furthermore, extracting the features of a coal gangue image is challenging because of the harsh site conditions of low illumination and high dust. The coal gangue recognition methods in the literature therefore have limited applicability, and further study is required on the segmentation, enhancement, feature extraction, and recognition of coal gangue images. A recognition algorithm based on deep learning does not require hand-crafted features, but training the classifier needs a large number of labeled samples and a large amount of computation; thus, the requirements for lightweight and real-time performance pose a significant challenge to its field application. Li et al. [22] investigated the effects of illuminance and external moisture on the grayscale and texture features of coal and gangue images, which provided an essential guide for the image-based identification of coal and gangue under working conditions. Li and Gong [23] studied a preprocessing model for low-quality images of coal and gangue based on a bilateral filtering joint enhancement algorithm.
In summary, coal gangue recognition has been investigated extensively in the literature, but a satisfactory practical solution has not yet been provided. Wang, Li, and Yang [12] and Li et al. [4] studied coal gangue recognition models based on a support vector machine (SVM). However, the sensitivity of SVM to noise and its difficult parameter adjustment led to low accuracy and insufficient robustness. There are few studies on the use of ensemble algorithms to improve the accuracy of SVM coal gangue classification.
Therefore, the novel contributions of this paper are as follows:
  • To address the shortcomings of the SVM classifier for coal gangue recognition, this paper used a genetic optimization algorithm to mitigate its noise sensitivity and difficult parameter adjustment, used an adaptive boosting (AdaBoost) algorithm to enhance its recognition accuracy, and constructed a coal gangue recognition and classification model;
  • The indices of the gray-level gradient co-occurrence matrix (GGCM) were introduced to characterize the features of coal and gangue, and a coal-gangue sample dataset was constructed from coal and gangue images obtained by experiment and on-site to verify the performance of the proposed algorithm.

2. Principle and Theory

This section presents the proposed method and its algorithm. This study used an SVM classifier as the AdaBoost integration base classifier, and a genetic algorithm (GA) was employed to optimize the SVM parameters.

2.1. SVM Algorithm and SVM Classifier

SVM is a machine learning method based on statistical learning theory, proposed by Vapnik and Chervonenkis [24]. It is widely used in text classification, handwritten character recognition, and image classification owing to its excellent generalization performance and ability to process high-dimensional data. The fundamental principle is to find an optimal hyperplane that meets the classification requirements and has the largest margin, as presented in Figure 3.
Let $D = \{(x_i, y_i) \mid i = 1, 2, \ldots, N,\; x_i \in \mathbb{R}^{n},\; y_i \in \{-1, 1\}\}$, where $x_i$ is the sample to be classified and $y_i$ is the label of $x_i$. The classification plane $(\omega, b)$ can be described using the following linear equation:
$$\omega^{T} x + b = 0 \tag{1}$$
where $\omega$ denotes the normal vector to the classification plane, and $b$ indicates the displacement term. For any sample in a linearly separable sample set,
$$\begin{cases} \omega^{T} x_i + b \ge +1, & y_i = +1 \\ \omega^{T} x_i + b \le -1, & y_i = -1 \end{cases} \tag{2}$$
Equation (2) can be abbreviated as
$$y_i \left( \omega^{T} x_i + b \right) \ge 1 \tag{3}$$
Among all the samples, those closest to the classification plane satisfy the equality in Equation (2) and are called support vectors. The sum of the distances $\gamma$ from two support vectors of opposite classes to the hyperplane, i.e., the margin, is
$$\gamma = \frac{2}{\|\omega\|} \tag{4}$$
The hyperplane with the largest distance $\gamma$ is the optimal hyperplane. The classification problem is thus transformed into the problem of finding the optimal hyperplane, namely, finding the optimal parameters $\omega$ and $b$ in Equation (1) under the constraints
$$\max_{\omega, b} \frac{2}{\|\omega\|} \quad \text{s.t.} \quad y_i \left( \omega^{T} x_i + b \right) \ge 1, \quad i = 1, 2, \ldots, N \tag{5}$$
or, equivalently,
$$\min_{\omega, b} \frac{1}{2} \|\omega\| \quad \text{s.t.} \quad y_i \left( \omega^{T} x_i + b \right) \ge 1, \quad i = 1, 2, \ldots, N \tag{6}$$
For calculation convenience, the square root in the norm in Equation (6) is removed by squaring it, which does not change the minimizer:
$$\min_{\omega, b} \frac{1}{2} \|\omega\|^{2} \quad \text{s.t.} \quad y_i \left( \omega^{T} x_i + b \right) \ge 1, \quad i = 1, 2, \ldots, N \tag{7}$$
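Introducing Lagrange multipliers into the above problem gives the Lagrange function
$$L(\omega, b, \alpha) = \frac{1}{2} \|\omega\|^{2} + \sum_{i=1}^{N} \alpha_i \left( 1 - y_i \left( \omega^{T} x_i + b \right) \right) \tag{8}$$
where the Lagrange multipliers satisfy $\alpha_i \ge 0$. The problem of finding the optimal classification plane then comes down to solving
$$\max_{\alpha} \left( \sum_{i=1}^{N} \alpha_i - \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i \alpha_j y_i y_j x_i^{T} x_j \right) \quad \text{s.t.} \quad \alpha_i \ge 0, \quad \sum_{i=1}^{N} \alpha_i y_i = 0, \quad i = 1, 2, \ldots, N \tag{9}$$
For the nonlinear classification problem, the kernel function $K(x_i, x_j)$ is introduced, and the classification problem is reduced to
$$\max_{\alpha} \left( \sum_{i=1}^{N} \alpha_i - \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i \alpha_j y_i y_j K(x_i, x_j) \right) \quad \text{s.t.} \quad 0 \le \alpha_i \le C, \quad \sum_{i=1}^{N} \alpha_i y_i = 0, \quad i = 1, 2, \ldots, N \tag{10}$$
where $C$ denotes the penalty factor. The final classification discriminant function is defined as
$$f(x) = \operatorname{sgn} \left( \sum_{i=1}^{N} \alpha_i y_i K(x_i, x) + b \right) \tag{11}$$
where $\operatorname{sgn}(\cdot)$ is the sign function. When its argument is greater than 0, it returns 1; otherwise, it returns −1, matching the two labels in $D$.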
The loss function of SVM is an unbounded convex function, so a misclassified sample far from the classification plane can incur a very large penalty. While this ensures the confidence of the classification results, it makes the SVM exceptionally susceptible to noise. In addition, SVM is sensitive to its parameters, which are complex to tune, and its classification accuracy is unstable. A coal gangue recognition model based on the basic SVM algorithm therefore lacks accuracy and robustness, which limits its effect on coal gangue recognition.
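As a minimal sketch of how the discriminant function in Equation (11) is evaluated, the following Python snippet computes the kernel sum for an RBF kernel with parameter g and maps the result onto the labels 1 and −1. The support vectors, multipliers, bias, and g below are hypothetical values chosen only for illustration; they are not fitted to any coal gangue data.

```python
import math

def rbf_kernel(u, v, g):
    """RBF kernel K(u, v) = exp(-g * ||u - v||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-g * sq_dist)

def svm_decide(x, support_vectors, labels, alphas, b, g):
    """Return +1 or -1 from sgn(sum_i alpha_i * y_i * K(x_i, x) + b)."""
    score = sum(a * y * rbf_kernel(sv, x, g)
                for sv, y, a in zip(support_vectors, labels, alphas))
    return 1 if score + b > 0 else -1

# Hypothetical values for illustration only (not a trained model).
support_vectors = [(0.0, 0.0), (2.0, 2.0)]
labels = [1, -1]
alphas = [1.0, 1.0]          # satisfies sum_i alpha_i * y_i = 0
b, g = 0.0, 0.5

near_first = svm_decide((0.2, 0.1), support_vectors, labels, alphas, b, g)
near_second = svm_decide((1.9, 2.2), support_vectors, labels, alphas, b, g)
print(near_first, near_second)
```

A point close to the positive support vector is assigned 1, and a point close to the negative one is assigned −1, because the RBF kernel decays quickly with distance.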

2.2. GA Optimization and SVM Base Classifier Construction

The introduction of parameter optimization algorithms, such as the GA algorithm [25], particle swarm optimization (PSO) algorithm [26], and ant colony algorithm [27], can adaptively search the global optimization and effectively reduce the difficulty of the parameter tuning of the SVM model. This paper introduced GA into the SVM algorithm to optimize the penalty factor and kernel function parameters, and the GA-SVM model was created.
GA is a method for searching for the optimal solution by simulating the natural evolutionary process. This algorithm converts problem-solving into a process that is similar to the crossover and mutation of chromosomal genes in biological evolution. Owing to its strong robustness, it is widely used in combination optimization, machine learning, signal processing, adaptive control, and artificial life. The optimization process of the GA is as follows:
  (1) Set the evolutionary iteration counter t = 0 and the maximum number of evolutionary iterations T, and randomly generate M individuals as the initial population P(0);
  (2) Calculate the fitness of each individual in the population P(t);
  (3) Obtain the next generation's population P(t + 1) through selection, crossover, and mutation of population P(t);
  (4) Judge whether the termination condition is reached. If t < T, set t = t + 1 and return to Step (2); if t = T, terminate the evolution;
  (5) Take the individual with the greatest fitness obtained in the evolution process as the optimal solution, and construct the SVM base classifier using the optimal parameters.
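The optimization process above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function name `run_ga`, the single-point crossover, the bit-flip mutation, and the toy fitness function are all assumptions introduced here, and the toy fitness stands in for the cross-validated SVM accuracy used in the paper.

```python
import random

def toy_fitness(bits):
    """Hypothetical non-negative fitness: peaks when the decoded value is 170."""
    value = int("".join(map(str, bits)), 2)
    return 255 - abs(value - 170)

def run_ga(fitness, n_bits=8, pop_size=20, generations=30,
           p_cross=0.9, p_mut=0.01, seed=0):
    """Maximise `fitness` over fixed-length bit strings with a basic GA."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        scores = [fitness(ind) for ind in pop]
        # Roulette-wheel selection: parents are drawn with probability
        # proportional to fitness (tiny offset avoids all-zero weights).
        weights = [s + 1e-12 for s in scores]
        parents = rng.choices(pop, weights=weights, k=pop_size)
        nxt = []
        for i in range(0, pop_size, 2):
            a, b = parents[i][:], parents[(i + 1) % pop_size][:]
            if rng.random() < p_cross:           # single-point crossover
                cut = rng.randint(1, n_bits - 1)
                a, b = a[:cut] + b[cut:], b[:cut] + a[cut:]
            nxt.extend([a, b])
        # Bit-flip mutation applied independently to every gene.
        pop = [[bit ^ 1 if rng.random() < p_mut else bit for bit in ind]
               for ind in nxt[:pop_size]]
        # Keep the fittest individual seen in any generation.
        best = max(pop + [best], key=fitness)
    return best

best = run_ga(toy_fitness)
print(best, toy_fitness(best))
```

Tracking the best individual across all generations mirrors the final step of the procedure, in which the individual with the greatest fitness found during the whole evolution is taken as the solution.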

2.3. AdaB-GA-SVM Classifier Construction

The AdaBoost algorithm was employed to obtain a strong classifier model, referred to as the AdaB-GA-SVM model, that improves the anti-noise performance of the SVM algorithm. AdaBoost is the most representative boosting algorithm and was proposed by Freund and Schapire [28]. It is widely used for classification in various fields because it combines separately trained base classifiers into a stable and general model. Dou, Chen, and Yue [29] proposed a multi-classification algorithm based on AdaBoost, which exhibited an excellent remote sensing image classification performance. Zhang et al. [30] employed the AdaBoost algorithm to integrate SVM base classifiers into strong classifiers and achieved higher classification accuracy for different dataset sizes. The AdaBoost algorithm flow is as follows:
(1) Initialize the weights of the sample set D:
$$\omega_i^{1} = \frac{1}{N}, \quad i = 1, 2, \ldots, N \tag{12}$$
(2) Let the number of iterations be M. For t = 1, 2, …, M:
  • Train the GA-SVM classifier using the sample set D with weights $\omega_i^{t}$ and obtain a base classifier $f_t(x)$;
  • Calculate the classification error rate $e_t$ and the weight of the classifier $\lambda_t$:
    $$e_t = \sum_{i=1}^{N} \omega_i^{t} \, I\left( y_i \ne f_t(x_i) \right) \tag{13}$$
    $$\lambda_t = \frac{1}{2} \ln \left( \frac{1 - e_t}{e_t} \right) \tag{14}$$
    where $I(y_i \ne f_t(x_i))$ is an indicator function, which returns 1 when the prediction of the base classifier $f_t(x)$ is inconsistent with the sample label $y_i$; otherwise, it returns 0.
  • Update the weights of the training samples to $\omega_i^{t+1}$ according to the prediction of the base classifier $f_t(x)$:
    $$\omega_i^{t+1} = \begin{cases} \dfrac{\omega_i^{t} \exp(-\lambda_t)}{Z_t}, & f_t(x_i) = y_i \\[2ex] \dfrac{\omega_i^{t} \exp(\lambda_t)}{Z_t}, & f_t(x_i) \ne y_i \end{cases} \tag{15}$$
    where $Z_t = \sum_{j=1}^{N} \omega_j^{t} \exp\left( -\lambda_t y_j f_t(x_j) \right)$ is the normalization factor that makes the updated weights sum to 1.
(3) Build the final strong classifier as
$$F(x) = \sum_{t=1}^{M} \lambda_t f_t(x) = F_{M-1}(x) + \lambda_M f_M(x) \tag{16}$$
Generally, the step size and the maximum number of iterations together determine the fitting effect of the AdaBoost algorithm, and the constructed strong classifier is given by the following equation:
$$F(x) = F_{M-1}(x) + \nu \lambda_M f_M(x) \tag{17}$$
where $\nu$ denotes the learning rate (step size), $0 < \nu \le 1$. The classification discriminant function $g(x)$ is defined as
$$g(x) = \operatorname{sgn}\left( F(x) \right) \tag{18}$$
where $\operatorname{sgn}(\cdot)$ is the same as in Equation (11).
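To make the weight update concrete, the sketch below carries out a single AdaBoost round for five samples with labels in {−1, 1}. The function name `adaboost_round` and the prediction vector are hypothetical; the predictions stand in for the output of a trained GA-SVM base classifier.

```python
import math

def adaboost_round(weights, y_true, y_pred):
    """One AdaBoost round: return (error e_t, classifier weight lambda_t, new weights)."""
    e_t = sum(w for w, y, p in zip(weights, y_true, y_pred) if y != p)
    if not 0.0 < e_t < 1.0:
        raise ValueError("e_t must lie strictly between 0 and 1")
    lam = 0.5 * math.log((1.0 - e_t) / e_t)
    # For labels in {-1, +1}, y * p is +1 when correct and -1 when wrong,
    # so correct samples are scaled by exp(-lam) and wrong ones by exp(+lam).
    unnorm = [w * math.exp(-lam * y * p)
              for w, y, p in zip(weights, y_true, y_pred)]
    z = sum(unnorm)                      # normalisation factor
    return e_t, lam, [u / z for u in unnorm]

y_true = [1, 1, -1, -1, 1]
y_pred = [1, 1, -1, -1, -1]   # hypothetical base-classifier output; last sample is wrong
w0 = [1.0 / 5] * 5            # uniform initial weights 1/N
e_t, lam, w1 = adaboost_round(w0, y_true, y_pred)
print(e_t, lam, w1)
```

With one of five equally weighted samples misclassified, the error rate is 0.2 and the classifier weight is 0.5 ln 4 ≈ 0.693. After normalization, the misclassified sample carries a weight of 0.5 and each correctly classified sample a weight of 0.125, so the next base classifier concentrates on the sample that was missed.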

3. Materials and Methods

3.1. Coal and Gangue Image Collection and Preprocessing

The coal gangue image samples were collected from the Wuyuan Coal preparation plant. Considering the impact of environmental factors, such as light and coal dust, on the quality of the collected coal-gangue image, the coal gangue image data were collected at different periods to increase the model’s generalization performance.
After the screening, cutting, and labeling of the coal and gangue images collected on-site, 500 images each of coal and gangue were obtained, for a total of 1000 samples. A total of 800 samples were selected as the training set and the remaining 200 as the test set. During selection, the ratio of coal images to gangue images was kept at 1:1 so that the sample data remained balanced. To enrich the dataset, the training and test sets were expanded to 4000 and 1000 samples, respectively, through left–right rotation and exposure adjustment.
The coal gangue images were preprocessed by gray conversion, gamma function correction, and image enhancement. After preprocessing, the gray-level of each coal-gangue image was 256, the size was 2000 × 2000, and the format was PNG. Figure 4 demonstrates the effect comparison of the coal gangue images before and after preprocessing.
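The following sketch illustrates the kind of augmentation and gamma correction described above on an 8-bit grayscale array. It is a sketch under assumptions rather than the authors' pipeline: the paper does not state the gamma values or exactly how the rotation and exposure adjustment were done, so the horizontal mirror, the gamma values 0.7 and 1.5, and the function names are all hypothetical.

```python
import numpy as np

def gamma_correct(gray, gamma):
    """Apply out = 255 * (in / 255) ** gamma to an 8-bit grayscale image."""
    norm = gray.astype(np.float64) / 255.0
    return np.clip(np.rint(255.0 * norm ** gamma), 0, 255).astype(np.uint8)

def augment(gray):
    """Return the original image, a mirrored copy, and two re-exposed copies."""
    return [
        gray,
        np.fliplr(gray),              # assumed reading of "left-right rotation"
        gamma_correct(gray, 0.7),     # gamma < 1 brightens (assumed value)
        gamma_correct(gray, 1.5),     # gamma > 1 darkens (assumed value)
    ]

rng = np.random.default_rng(0)
# Random array standing in for a grayscale coal or gangue image.
image = rng.integers(0, 256, size=(8, 8), dtype=np.uint8)
variants = augment(image)
print(len(variants), variants[0].shape)
```

Each variant keeps the size and bit depth of the input, so the same feature extraction can be applied to original and augmented images alike.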

3.2. Gray-Level Gradient Co-Occurrence Matrix Texture Feature Extraction

The gray-level co-occurrence matrix (GLCM) is a matrix function of image pixel distance and angle. It reflects the comprehensive information of an image in direction, adjacent interval, and change range through the correlation between the gray levels of two pixels with a certain distance and direction in the image [31]. The generation principle is described as follows. Starting from the pixel point (x, y) of the gray value i, move a certain distance δ at an angle θ along the matrix construction direction toward the pixel point (x + Δx, y + Δy) of the gray value j, as presented in Figure 5, and calculate the number of pixel pairs with the relative position relationship in the whole image to obtain the joint probability distribution P(i, j) of pixel pair (i, j); construct a square matrix with the joint probability distribution P(i, j) of all pixel pairs (i, j), then normalize the square matrix by the total number of (i, j) combination, and finally obtain the GLCM. Δx and Δy are determined by spacing δ and angle θ, and θ indicates the generation direction of GLCM.
The addition of the gradient information of the image to the GLCM constitutes the GGCM, which can contain the texture primitives and their arrangement information. This reflects the relationship between the gray-level and the image pixel point gradient (or edge). The gray-level of each pixel is the basis of an image, and the gradient is the element of the image edge contour [32]. The GGCM is expressed as follows:
$$\left\{ H(x, y);\; x = 0, 1, \ldots, L_f - 1;\; y = 0, 1, \ldots, L_g - 1 \right\} \tag{19}$$
where $H(x, y)$ is the element of the GGCM, representing the number of pixels with grayscale $x$ in the normalized grayscale image $F(i, j)$ and gradient $y$ in the normalized gradient image $G(i, j)$; $L_f$ is the maximum gray-level of the grayscale image; and $L_g$ is the maximum gradient level of the gradient image. The GGCM is normalized as follows:
$$\hat{H}(x, y) = \frac{H(x, y)}{\sum_{x=0}^{L_f - 1} \sum_{y=0}^{L_g - 1} H(x, y)} \tag{20}$$
Based on the normalized GGCM $\hat{H}(x, y)$, the 15 grayscale gradient texture features of the coal gangue image can be extracted using the formulas in Table A1.
Xue et al. [33] analyzed the importance of the 15 coal gangue image texture features using the random forest model, as presented in Figure 6. According to this importance ranking, the top 12 of the 15 texture features were selected to construct the feature vector $x = [T_1, T_3, T_4, T_5, T_6, T_7, T_8, T_9, T_{10}, T_{13}, T_{14}, T_{15}]$. Coal images were labeled 1, and gangue images were labeled −1. Figure A1 presents the texture features of 100 groups of coal gangue image samples; the abscissa represents the sample serial number, and the ordinate represents the feature value of the sample. These 12 GGCM texture features were extracted from the coal gangue image training and test sets to provide the training dataset $\{(x_i, y_i) \mid x_i = [T_{1i}, T_{3i}, T_{4i}, T_{5i}, T_{6i}, T_{7i}, T_{8i}, T_{9i}, T_{10i}, T_{13i}, T_{14i}, T_{15i}],\; y_i \in \{-1, 1\},\; i = 1, 2, \ldots, 4000\}$ and the test dataset $\{(x_j, y_j) \mid x_j = [T_{1j}, T_{3j}, T_{4j}, T_{5j}, T_{6j}, T_{7j}, T_{8j}, T_{9j}, T_{10j}, T_{13j}, T_{14j}, T_{15j}],\; y_j \in \{-1, 1\},\; j = 1, 2, \ldots, 1000\}$ for the subsequent model training and testing.
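A normalized GGCM and a handful of the Table A1 features can be computed along the following lines. This is a hedged sketch: the paper does not specify the gradient operator or the number of quantization levels, so the central-difference gradient from `np.gradient`, the 16 gray and 16 gradient levels, and the function names are assumptions made for illustration. The entropy is computed with the usual leading minus sign.

```python
import numpy as np

def ggcm(gray, n_gray=16, n_grad=16):
    """Normalised gray-level gradient co-occurrence matrix of a 2-D image."""
    gray = gray.astype(np.float64)
    gy, gx = np.gradient(gray)                 # central differences (assumed operator)
    grad = np.hypot(gx, gy)                    # gradient magnitude
    # Quantise both images to integer levels 0..n-1.
    f = np.minimum((gray / (gray.max() + 1e-12) * n_gray).astype(int), n_gray - 1)
    g = np.minimum((grad / (grad.max() + 1e-12) * n_grad).astype(int), n_grad - 1)
    h = np.zeros((n_gray, n_grad))
    np.add.at(h, (f.ravel(), g.ravel()), 1.0)  # count (gray, gradient) pairs
    return h / h.sum()

def texture_features(h):
    """A few of the Table A1 features computed from the normalised GGCM."""
    x = np.arange(h.shape[0])[:, None]         # gray level index
    y = np.arange(h.shape[1])[None, :]         # gradient level index
    nz = h[h > 0]                              # skip zeros to avoid log(0)
    return {
        "T5_energy": float(np.sum(h ** 2)),
        "T6_gray_average": float(np.sum(x * h)),
        "T7_gradient_average": float(np.sum(y * h)),
        "T13_mixed_entropy": float(-np.sum(nz * np.log(nz))),
        "T14_inertia": float(np.sum((x - y) ** 2 * h)),
    }

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(32, 32))    # stand-in for a preprocessed image
h = ggcm(image)
features = texture_features(h)
print(features)
```

The remaining features in Table A1 follow the same pattern of weighted sums over the rows and columns of the normalized matrix.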

3.3. Classification Model Training Process

The training steps of the AdaB-GA-SVM model are as follows:
(1) Input the coal gangue training dataset $(x_i, y_i)$, i = 1, 2, …, 4000, set the initial weights $\omega_i^{1} = 1/4000$, and construct a weighted training set $(\omega_i^{t} x_i, y_i)$;
(2) Set the value ranges of the penalty factor C and the parameter g of RBF-SVM to [0, 100] and [0, 10], respectively, and convert C and g into chromosomes by 8-bit binary coding. According to the parameter tuning results in Section 4.2, the initial population size, crossover probability, and mutation probability of GA were set to 80, 0.9, and 0.0005, respectively, and the number of evolutionary iterations was set to 100. The roulette selection method was adopted;
(3) Using the weighted training dataset and taking the average recognition accuracy Acc of four-fold cross-validation as the chromosome's fitness, apply selection, crossover, and mutation to the current population to generate the next generation and calculate the fitness of each individual;
(4) Judge whether the number of iterations has been reached. If not, return to step (3); otherwise, select the individual with the highest fitness across all generations to obtain the final GA-SVM base classifier $f_t(x)$;
(5) Calculate the error rate $e_t$ of $f_t(x)$ and its weight $\lambda_t$, and update the weights of the sample data to $\omega_i^{t+1}$ according to the prediction of $f_t(x)$;
(6) Loop through steps (2)–(5) until all 20 GA-SVM base classifiers are obtained, and construct the final classifier F(x) using Equation (16).
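The binary coding in step (2) can be illustrated with a short decoder. The sketch assumes, hypothetically, that each chromosome concatenates 8 bits for C and 8 bits for g and that the integer value is mapped linearly onto the stated range; the paper does not spell out the layout, and the function name is introduced here only for illustration.

```python
def decode_chromosome(bits, c_range=(0.0, 100.0), g_range=(0.0, 10.0)):
    """Map a 16-bit chromosome (8 bits for C, 8 bits for g) to real values."""
    if len(bits) != 16:
        raise ValueError("expected 16 bits")

    def to_real(part, lo, hi):
        value = int("".join(str(b) for b in part), 2)   # integer in 0..255
        return lo + (hi - lo) * value / 255.0           # linear map onto [lo, hi]

    return to_real(bits[:8], *c_range), to_real(bits[8:], *g_range)

# Hypothetical chromosome: C gene = 10000000 (128), g gene = 00000001 (1).
c_value, g_value = decode_chromosome([1, 0, 0, 0, 0, 0, 0, 0,
                                      0, 0, 0, 0, 0, 0, 0, 1])
print(c_value, g_value)
```

With 8 bits, C is resolved in steps of about 0.39 and g in steps of about 0.039. An all-zero gene decodes to C = 0 or g = 0, which a practical implementation would probably need to exclude or offset; the paper does not say how this case was handled.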
Classification and identification models are usually evaluated using four indicators, namely, accuracy Acc, precision rate P, recall rate R, and F1 score, which are defined as follows:
$$Acc = \frac{TP + TN}{TP + TN + FP + FN} \tag{21}$$
$$P = \frac{TP}{TP + FP} \tag{22}$$
$$R = \frac{TP}{TP + FN} \tag{23}$$
$$F1 = \frac{2 \times P \times R}{P + R} \tag{24}$$
where TP is the number of true-positive cases (actual coal predicted as coal); FP, the number of false-positive cases (actual gangue predicted as coal); TN, the number of true-negative cases (actual gangue predicted as gangue); and FN, the number of false-negative cases (actual coal predicted as gangue).
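As a quick worked example of the four indicators, the snippet below evaluates them for a hypothetical confusion matrix. The counts are made up for illustration and are not results from the paper.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return (accuracy, precision, recall, F1) from confusion-matrix counts."""
    acc = (tp + tn) / (tp + tn + fp + fn)
    p = tp / (tp + fp)
    r = tp / (tp + fn)
    f1 = 2 * p * r / (p + r)
    return acc, p, r, f1

# Hypothetical counts for illustration only.
acc, p, r, f1 = classification_metrics(tp=90, fp=10, tn=85, fn=15)
print(round(acc, 3), round(p, 3), round(r, 3), round(f1, 3))
```

For these counts the accuracy is 175/200 = 0.875, the precision is 90/100 = 0.9, the recall is 90/105 ≈ 0.857, and the F1 score is 180/205 ≈ 0.878.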

3.4. Experimental Configuration

The classifier was trained and tested on the coal gangue image sample set in Python 3.7.6 on the Windows 10 Professional operating system. The SVM classifier was generated with the SVM function, and the GA-SVM base classifiers were integrated by calling the AdaBoost classifier function in Sklearn [34].

4. Results and Discussion

4.1. Kernel Function Selection

Coal gangue image recognition is a nonlinear classification problem and therefore requires a kernel function. Different kernel functions have a significant impact on the SVM classification accuracy. The commonly used kernel functions include the polynomial, RBF, and sigmoid kernel functions [35]. Existing research shows that the SVM classifier based on the RBF kernel function has good applicability and is more suitable for classification problems in a multidimensional vector space. The RBF kernel function parameter g will not increase the spatial complexity in a particular range. SVM models with different kernel functions were constructed, trained, and tested with the coal gangue image dataset via four-fold cross-validation. The results are presented in Table 1. It can be seen from Table 1 that the accuracy reached 83% when the kernel function of the SVM was RBF, higher than that of the polynomial and sigmoid kernels, although the recognition time was slightly longer than that of the SVM classifiers based on the other kernel functions. Therefore, the SVM model based on the RBF kernel function is used in the remainder of this paper.
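For reference, the three kernels compared here are commonly written as follows. The paper does not write the kernels out, so this LIBSVM-style parameterization is an assumption: g is the kernel parameter used throughout this paper, while the offset r and the degree d are symbols introduced here only for illustration.

$$K_{\mathrm{poly}}(x_i, x_j) = \left( g \, x_i^{T} x_j + r \right)^{d}$$
$$K_{\mathrm{RBF}}(x_i, x_j) = \exp\left( -g \, \| x_i - x_j \|^{2} \right)$$
$$K_{\mathrm{sigmoid}}(x_i, x_j) = \tanh\left( g \, x_i^{T} x_j + r \right)$$

In this form the RBF kernel has a single parameter g in addition to the penalty factor C, which keeps the GA search space two-dimensional.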

4.2. Genetic Algorithm Parameter Tuning

In this paper, GA was used to optimize parameter g and penalty factor C of the SVM model based on the RBF kernel function. During optimization, the GA parameter settings, such as population size, crossover probability, and mutation probability, significantly impacted the optimization effect of GA. In general, the population size range was 20–100, the crossover probability range was 0.4–0.99, the mutation probability range was 0.0001–0.1, and the number of evolutionary iterations was 100–500.
The GA parameters, namely population size, crossover probability, and mutation probability, were studied using a 3-factor 9-level orthogonal test to obtain the best optimization effect on the SVM model. The orthogonal table and test results are presented in Table 2. It can be seen that the optimal population size, crossover probability, and mutation probability of GA were 80, 0.9, and 0.0005, respectively.

4.3. Number of Base Classifiers

The number of base classifiers has a significant impact on the recognition accuracy of the model. When the AdaBoost adaptive ensemble algorithm integrates too few base classifiers, the recognition accuracy cannot reach the desired level. When it integrates too many, overfitting may occur, which results in poor generalization ability of the trained model.
The recognition effect of the AdaB-GA-SVM model with different numbers of base classifiers on the coal gangue images was investigated. Figure 7 presents the relationship between the coal gangue image recognition accuracy of AdaB-GA-SVM and the number of integrated base classifiers. It can be seen from Figure 7 that when the number of integrated base classifiers was 20, the classification accuracies of the AdaB-GA-SVM classifier on the training and test sets were the highest, reaching 95.3% and 95.1%, respectively. The small gap between the two indicates no overfitting and good generalization ability.

4.4. Classification Model Training Results and Evaluation

Figure 8 shows the variation curve of the highest fitness of all individuals during the training process of each of the 20 base classifiers. The training process data of a GA-SVM base classifier ft(x) are shown in Table A2, in which the parameters in the red box are those of the final selected base classifier. Table A3 presents the accuracy, the penalty factor c, the parameter g, the error rate et, the classifier weight λt, and the training time of the obtained 20 GA-SVM base classifiers.
Figure 8 shows that the maximum individual fitness curves of the base classifiers differed during training, as did the number of iterations needed for the fitness to stabilize. However, after 100 iterations, the maximum individual fitness of each base classifier was stable and above 90.9%.
The AdaB-GA-SVM model was tested with the test dataset, and the results are presented in Table 3. The test results of the GA-SVM model are also listed for comparison. Figure 9 presents the KS curve of AdaB-GA-SVM. It can be seen that the KS value was 0.79.
Table 3 demonstrates that the accuracy Acc, the precision P, and the F1 value of the AdaB-GA-SVM model increased by 4%, 7.6%, and 4.5% compared with the GA-SVM model, reaching 95.1%, 92.8%, and 95%, respectively. During coal preparation, industry specialists focus more on the precision rate of gangue. The gangue precision rate of the GA-SVM model was 85.2%, and the recall rate was 96.6%; the gangue precision rate of the AdaB-GA-SVM model was 92.8%, and the recall rate was 97.3%, indicating that the recognition model proposed in this paper performs better. The KS value of 0.79 shown in Figure 9 indicates that the AdaB-GA-SVM model performs well in coal gangue identification.
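The paper does not define the KS value; assuming it is the usual Kolmogorov-Smirnov statistic, i.e., the largest gap between the true-positive rate and the false-positive rate over all score thresholds, it can be sketched as follows. The function name, scores, and labels are hypothetical and serve only to illustrate the calculation.

```python
def ks_statistic(scores, labels, positive=1):
    """Maximum gap between TPR and FPR over all score thresholds."""
    n_pos = sum(1 for y in labels if y == positive)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes are required")
    best = 0.0
    for thr in sorted(set(scores), reverse=True):
        # Samples with score >= thr are predicted positive at this threshold.
        tp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == positive)
        fp = sum(1 for s, y in zip(scores, labels) if s >= thr and y != positive)
        best = max(best, abs(tp / n_pos - fp / n_neg))
    return best

# Hypothetical classifier scores and true labels for six samples.
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
labels = [1, 1, -1, 1, -1, -1]
ks = ks_statistic(scores, labels)
print(round(ks, 3))
```

For these six samples the largest gap is 2/3, reached when two of the three positives and none of the negatives lie above the threshold. A value close to 1 means the scores separate the two classes almost completely, while a value near 0 means they barely separate them.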

4.5. Comparison with Other SVM Base Classifiers

To compare the recognition of the AdaBoost model with different base classifiers, the base classifiers such as SVM, GS-SVM, and PSO-SVM and the corresponding adaptive enhancement classification models were also trained. Then, the SVM, GS-SVM, PSO-SVM, and GA-SVM classification models before and after AdaBoost were tested using the aforementioned test dataset. The classification accuracy and recognition runtime are presented in Table 4.
Table 4 demonstrates that after adopting the AdaBoost adaptive enhancement algorithm, the recognition accuracy of the SVM, GS-SVM, PSO-SVM, and GA-SVM models all improved. The accuracy of the AdaB-GA-SVM classifier was the highest, up to 95%, which was 11%, 9%, and 5% higher than that of AdaB-SVM (84%), AdaB-GS-SVM (86%), and AdaB-PSO-SVM (90%), respectively. The running time of each classification model had a different degree of increment before and after integration enhancement. The recognition time of the classifier after adaptive enhancement was equivalent, and the recognition running time of AdaB-GA-SVM was about 0.124 s.

5. Conclusions

Coal gangue identification is the foundation of realizing coal gangue intelligent sorting in coal preparation. Coal gangue identification based on the adaptive boosting algorithm has not been studied in depth in the literature. This paper proposed an adaptive enhancement recognition algorithm and classification model using AdaB-GA-SVM based on coal gangue images. The main conclusions are as follows:
(1) The coal gangue image data were collected on-site, the gray-level gradient co-occurrence matrix texture features were extracted, and the coal gangue image dataset was constructed. The AdaB-GA-SVM classification model proposed in this paper was trained and tested. The results indicated that the model had a precision rate of 92.8% for gangue, a recall rate of 97.3%, and a KS value of 0.79, suggesting that the AdaB-GA-SVM model has excellent classification and identification performance and good generalization ability in coal gangue identification.
(2) The coal gangue identification effects of the proposed algorithm with other SVM base classifiers, such as SVM, GS-SVM, and PSO-SVM, were compared and analyzed. The results indicated that the accuracy of every enhanced classification model improved. The AdaB-GA-SVM classifier had the highest accuracy of 95%, which was 5% to 11% higher than that of the AdaB-SVM, AdaB-GS-SVM, and AdaB-PSO-SVM classifiers, with equivalent runtimes.
(3) Image texture features and classification algorithms significantly impact the effect of coal gangue identification. More texture feature extraction methods and machine learning algorithms, such as the improved local ternary pattern [36], XGBoost (eXtreme Gradient Boosting) [37,38], and deep learning algorithms, will be further studied for coal gangue recognition.

Author Contributions

Conceptualization, G.X. and X.Q.; methodology, G.X., P.H. and X.Q.; software, P.H. and X.Q.; validation, G.X., P.H. and X.Q.; formal analysis, G.X., P.H. and X.Q.; investigation, G.X., P.H., S.L., X.Q. and S.G.; data curation, P.H.; writing—original draft preparation, P.H., S.L., S.H. and S.G.; writing—review and editing, G.X. and S.G.; visualization, P.H., X.Q. and S.H.; supervision, G.X.; project administration, G.X.; All authors have read and agreed to the published version of the manuscript.

Funding

This work was funded by the National Key Basic Research and Development Program Fund project (Grant No. 2014CB046306) and the National Natural Science Foundation of China Fund Projection (Grant No. 61673385).

Data Availability Statement

Available upon request.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

Figure A1. Texture features of sample of coal-gangue image.
Table A1. The 15 grayscale gradient texture features and their calculation formulas.

| No. | Texture Feature | Calculation Formula |
|---|---|---|
| 1 | large gradient advantage | $T_1 = \sum_{x=0}^{L_f-1}\sum_{y=0}^{L_g-1} y^2\,\hat{H}(x,y) \Big/ \sum_{x=0}^{L_f-1}\sum_{y=0}^{L_g-1} \hat{H}(x,y)$ |
| 2 | small gradient advantage | $T_2 = \sum_{x=0}^{L_f-1}\sum_{y=0}^{L_g-1} \frac{\hat{H}(x,y)}{(y+1)^2} \Big/ \sum_{x=0}^{L_f-1}\sum_{y=0}^{L_g-1} \hat{H}(x,y)$ |
| 3 | gray distribution nonuniformity | $T_3 = \sum_{x=0}^{L_f-1}\Big[\sum_{y=0}^{L_g-1} \hat{H}(x,y)\Big]^2 \Big/ \sum_{x=0}^{L_f-1}\sum_{y=0}^{L_g-1} \hat{H}(x,y)$ |
| 4 | gradient distribution nonuniformity | $T_4 = \sum_{y=0}^{L_g-1}\Big[\sum_{x=0}^{L_f-1} \hat{H}(x,y)\Big]^2 \Big/ \sum_{x=0}^{L_f-1}\sum_{y=0}^{L_g-1} \hat{H}(x,y)$ |
| 5 | energy | $T_5 = \sum_{x=0}^{L_f-1}\sum_{y=0}^{L_g-1} \hat{H}^2(x,y)$ |
| 6 | gray average | $T_6 = \sum_{x=0}^{L_f-1} x \sum_{y=0}^{L_g-1} \hat{H}(x,y)$ |
| 7 | gradient average | $T_7 = \sum_{y=0}^{L_g-1} y \sum_{x=0}^{L_f-1} \hat{H}(x,y)$ |
| 8 | gray mean square error | $T_8 = \Big[\sum_{x=0}^{L_f-1} (x - T_6)^2 \sum_{y=0}^{L_g-1} \hat{H}(x,y)\Big]^{1/2}$ |
| 9 | gradient mean square error | $T_9 = \Big[\sum_{y=0}^{L_g-1} (y - T_7)^2 \sum_{x=0}^{L_f-1} \hat{H}(x,y)\Big]^{1/2}$ |
| 10 | correlation | $T_{10} = \sum_{x=0}^{L_f-1}\sum_{y=0}^{L_g-1} (x - T_6)(y - T_7)\,\hat{H}(x,y)$ |
| 11 | gray-level entropy | $T_{11} = -\sum_{x=0}^{L_f-1}\sum_{y=0}^{L_g-1} \hat{H}(x,y)\,\log \sum_{y=0}^{L_g-1} \hat{H}(x,y)$ |
| 12 | gradient entropy | $T_{12} = -\sum_{y=0}^{L_g-1}\sum_{x=0}^{L_f-1} \hat{H}(x,y)\,\log \sum_{x=0}^{L_f-1} \hat{H}(x,y)$ |
| 13 | mixed entropy | $T_{13} = -\sum_{x=0}^{L_f-1}\sum_{y=0}^{L_g-1} \hat{H}(x,y)\,\log \hat{H}(x,y)$ |
| 14 | inertia | $T_{14} = \sum_{x=0}^{L_f-1}\sum_{y=0}^{L_g-1} (x - y)^2\,\hat{H}(x,y)$ |
| 15 | inverse difference moment | $T_{15} = \sum_{x=0}^{L_f-1}\sum_{y=0}^{L_g-1} \frac{\hat{H}(x,y)}{1 + (x - y)^2}$ |
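To make the Table A1 formulas concrete, the following minimal sketch evaluates a subset of them (T5 to T9 and T13 to T15) for a matrix that is already normalized. It assumes `H[x][y]` holds the normalized gray-gradient co-occurrence value for gray level `x` and gradient level `y`, uses the natural logarithm, and treats `0 log 0` as zero; the function name and the 2 × 2 toy matrix are hypothetical and are not taken from the article.

```python
import math


def ggcm_features(H):
    """Evaluate several Table A1 features for a normalized matrix H[x][y]."""
    Lf, Lg = len(H), len(H[0])
    # Marginals: gray_marg[x] sums over gradient levels, grad_marg[y] over gray levels.
    gray_marg = [sum(H[x]) for x in range(Lf)]
    grad_marg = [sum(H[x][y] for x in range(Lf)) for y in range(Lg)]
    cells = [(x, y, H[x][y]) for x in range(Lf) for y in range(Lg)]

    t6 = sum(x * gray_marg[x] for x in range(Lf))   # gray average
    t7 = sum(y * grad_marg[y] for y in range(Lg))   # gradient average
    return {
        "T5": sum(h * h for _, _, h in cells),                                   # energy
        "T6": t6,
        "T7": t7,
        "T8": math.sqrt(sum((x - t6) ** 2 * gray_marg[x] for x in range(Lf))),   # gray MSE
        "T9": math.sqrt(sum((y - t7) ** 2 * grad_marg[y] for y in range(Lg))),   # gradient MSE
        "T13": -sum(h * math.log(h) for _, _, h in cells if h > 0),              # mixed entropy
        "T14": sum((x - y) ** 2 * h for x, y, h in cells),                       # inertia
        "T15": sum(h / (1 + (x - y) ** 2) for x, y, h in cells),                 # inverse difference moment
    }


# Hypothetical 2 x 2 normalized matrix with all mass on the diagonal.
toy_H = [[0.5, 0.0],
         [0.0, 0.5]]
toy_features = ggcm_features(toy_H)
```

For this toy matrix the inertia is zero and the inverse difference moment is one, because every nonzero entry lies on the diagonal where `x` equals `y`.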
Table A2. Process data during the training of a certain GA-SVM base classifier $f_t(x)$.

|  |  | 1 | 2 | 3 | 4 | 46 | 97 | 98 | 99 | 100 |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Acc | 0.9013 | 0.9005 | 0.8982 | 0.9013 | 0.9009 | 0.9004 | 0.9004 | 0.9101 | 0.9018 |
|  | C | 32.560 | 87.607 | 71.395 | 88.029 | 25.495 | 14.757 | 27.174 | 60.671 | 52.160 |
|  | g | 1.405 | 6.562 | 0.027 | 6.561 | 2.870 | 6.015 | 6.739 | 3.300 | 23.737 |
| 2 | Acc | 0.8924 | 0.8888 | 0.8911 | 0.8926 | 0.9108 | 0.9013 | 0.9044 | 0.8960 | 0.8906 |
|  | C | 83.454 | 92.829 | 84.333 | 35.835 | 60.278 | 3.282 | 78.138 | 39.206 | 28.448 |
|  | g | 3.944 | 1.337 | 2.899 | 2.457 | 52.739 | 7.124 | 2.966 | 6.233 | 6.873 |
| 79 | Acc | 0.8968 | 0.8960 | 0.9105 | 0.8946 | 0.9031 | 0.8986 | 0.8991 | 0.9009 | 0.9039 |
|  | C | 50.863 | 54.997 | 53.990 | 72.738 | 84.658 | 45.364 | 84.817 | 84.817 | 65.289 |
|  | g | 8.624 | 8.624 | 2.312 | 9.072 | 5.033 | 7.015 | 1.391 | 5.189 | 0.999 |
| 80 | Acc | 0.8977 | 0.9035 | 0.8964 | 0.8964 | 0.9018 | 0.9022 | 0.9098 | 0.8960 | 0.9012 |
|  | C | 10.203 | 54.631 | 54.631 | 29.729 | 72.700 | 52.399 | 72.750 | 3.401 | 35.384 |
|  | g | 8.163 | 3.749 | 8.291 | 3.134 | 1.884 | 6.040 | 2.011 | 4.009 | 2.291 |
Note: The parameters in the red box are those of the final selected base classifier with highest fitness.
Table A3. Training results of the 20 GA-SVM base classifiers.

| Base Classifier | Accuracy | C | g | e_t | λ_t | Train Time |
|---|---|---|---|---|---|---|
| f1(x) | 0.91 | 51.73 | 3.16 | 0.18 | 1.475 | 15 min 26 s |
| f2(x) | 0.91 | 66.54 | 2.82 | 0.34 | 0.635 | 17 min 42 s |
| f3(x) | 0.91 | 55.30 | 3.08 | 0.39 | 0.444 | 18 min 21 s |
| f4(x) | 0.91 | 57.36 | 2.94 | 0.40 | 0.418 | 17 min 36 s |
| f5(x) | 0.91 | 94.35 | 2.52 | 0.40 | 0.400 | 17 min 20 s |
| f6(x) | 0.91 | 55.41 | 3.02 | 0.43 | 0.279 | 17 min 30 s |
| f7(x) | 0.91 | 63.79 | 2.87 | 0.47 | 0.099 | 17 min 43 s |
| f8(x) | 0.91 | 54.26 | 2.82 | 0.47 | 0.119 | 17 min 25 s |
| f9(x) | 0.91 | 56.29 | 2.98 | 0.48 | 0.077 | 16 min 30 s |
| f10(x) | 0.91 | 44.73 | 3.26 | 0.41 | 0.361 | 17 min 35 s |
| f11(x) | 0.91 | 62.02 | 2.86 | 0.45 | 0.187 | 17 min 22 s |
| f12(x) | 0.91 | 61.47 | 2.85 | 0.48 | 0.064 | 17 min 42 s |
| f13(x) | 0.91 | 61.73 | 2.86 | 0.45 | 0.188 | 17 min 16 s |
| f14(x) | 0.91 | 67.35 | 2.77 | 0.45 | 0.182 | 18 min 13 s |
| f15(x) | 0.91 | 83.86 | 2.02 | 0.46 | 0.156 | 17 min 43 s |
| f16(x) | 0.91 | 66.99 | 2.79 | 0.47 | 0.110 | 15 min 58 s |
| f17(x) | 0.91 | 67.71 | 2.80 | 0.49 | 0.009 | 17 min 22 s |
| f18(x) | 0.91 | 93.30 | 2.51 | 0.44 | 0.233 | 17 min 23 s |
| f19(x) | 0.91 | 69.78 | 2.76 | 0.45 | 0.201 | 16 min 47 s |
| f20(x) | 0.91 | 68.63 | 1.94 | 0.50 | 0.001 | 17 min 35 s |
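The error and weight columns of Table A3 can be cross-checked with a short sketch. It assumes the standard AdaBoost weight rule, λ_t = ln((1 − e_t)/e_t), which is an assumption here because the article's own formula does not appear in this section. The tabulated weights agree with that rule only approximately, plausibly because e_t is printed to two decimals. The vote pattern and the label convention (+1 and −1 for the two classes) are hypothetical.

```python
import math

# (e_t, lambda_t) pairs copied from Table A3, in order f1(x) .. f20(x).
TABLE_A3 = [
    (0.18, 1.475), (0.34, 0.635), (0.39, 0.444), (0.40, 0.418), (0.40, 0.400),
    (0.43, 0.279), (0.47, 0.099), (0.47, 0.119), (0.48, 0.077), (0.41, 0.361),
    (0.45, 0.187), (0.48, 0.064), (0.45, 0.188), (0.45, 0.182), (0.46, 0.156),
    (0.47, 0.110), (0.49, 0.009), (0.44, 0.233), (0.45, 0.201), (0.50, 0.001),
]


def classifier_weight(error):
    """Assumed AdaBoost rule: weight of a base classifier with weighted error `error`."""
    return math.log((1.0 - error) / error)


# Largest gap between the assumed rule and the tabulated weights.
max_gap = max(abs(classifier_weight(e) - lam) for e, lam in TABLE_A3)


def strong_classify(votes, weights):
    """Weighted vote: sign of sum(lambda_t * f_t(x)), with votes in {+1, -1}."""
    score = sum(w * v for v, w in zip(votes, weights))
    return 1 if score >= 0 else -1


weights = [lam for _, lam in TABLE_A3]
# Hypothetical sample: f1..f5 vote +1, the remaining 15 classifiers vote -1.
votes = [1] * 5 + [-1] * 15
decision = strong_classify(votes, weights)
```

In this hypothetical vote the five early classifiers outweigh the other fifteen (combined weight 3.372 against 2.266), which illustrates why low-error base classifiers dominate the boosted decision.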

References

  1. Liu, Q.; Li, J.; Li, Y.; Gao, M. Recognition Methods for Coal and Coal Gangue Based on Deep Learning. IEEE Access 2021, 9, 77599–77610.
  2. Sun, Z.; Huang, L.; Jia, R. Coal and Gangue Separating Robot System Based on Computer Vision. Sensors 2021, 21, 1349.
  3. Lei, S.; Xiao, X.; Zhang, M.; Dai, J. Visual classification method based on CNN for coal-gangue sorting robots. In Proceedings of the 2020 5th International Conference on Automation, Control and Robotics Engineering (CACRE), Dalian, China, 18–20 September 2020; pp. 543–547.
  4. Li, M.; Duan, Y.; He, X.; Yang, M. Image positioning and identification method and system for coal and gangue sorting robot. Int. J. Coal Prep. Util. 2020, 42, 1759–1777.
  5. Liu, P.; Qiao, X.; Zhang, X. Stability sensitivity for a cable-based coal–gangue picking robot based on grey relational analysis. Int. J. Adv. Robot. Syst. 2021, 18, 1–12.
  6. Wang, Z.; Xie, S.; Chen, G.; Chi, W.; Ding, Z.; Wang, P. An Online Flexible Sorting Model for Coal and Gangue Based on Multi-Information Fusion. IEEE Access 2021, 9, 90816–90827.
  7. Eshaq, R.M.A.; Hu, E.; Li, M.; Alfarzaeai, M.S. Separation Between Coal and Gangue Based on Infrared Radiation and Visual Extraction of the YCbCr Color Space. IEEE Access 2020, 8, 55204–55220.
  8. Guo, Y.; Wang, X.; Wang, S.; Hu, K.; Wang, W. Identification Method of Coal and Coal Gangue Based on Dielectric Characteristics. IEEE Access 2021, 9, 9845–9854.
  9. Hu, F.; Zhou, M.; Yan, P.; Bian, K.; Dai, R. Multispectral Imaging: A New Solution for Identification of Coal and Gangue. IEEE Access 2019, 7, 169697–169704.
  10. Zhang, Y.; Zhu, H.; Zhu, J.; Ou, Z.; Shen, T.; Sun, J.; Feng, A. Experimental study on separation of lumpish coal and gangue using X-ray. Energy Sources Part A Recovery Util. Environ. Eff. 2021, 9, 1–13.
  11. Zou, L.; Yu, X.; Li, M.; Lei, M.; Yu, H. Nondestructive Identification of Coal and Gangue via Near-infrared Spectroscopy based on Improved Broad Learning. IEEE Trans. Instrum. Meas. 2020, 69, 8043–8052.
  12. Wang, J.; Li, L.; Yang, S. Experimental study on gray and texture features extraction of coal and gangue image under different illuminance. J. China Coal Soc. 2018, 43, 3051–3061. (In Chinese)
  13. Zhao, Y.; Wang, S.; Guo, Y.; Cheng, G.; He, L.; Wang, W. The identification of coal and gangue and the prediction of the degree of coal metamorphism based on the EDXRD principle and the PSO-SVM model. Gospod. Surowcami Miner. 2022, 38, 113–129.
  14. Su, L.; Cao, X.; Ma, H.; Li, Y. Research on Coal Gangue Identification by Using Convolutional Neural Network. In Proceedings of the 2018 2nd IEEE Advanced Information Management, Communicates, Electronic and Automation Control Conference (IMCEC), Xi’an, China, 25–27 May 2018; pp. 810–814.
  15. Pu, Y.; Apel, D.B.; Szmigiel, A.; Chen, J. Image Recognition of Coal and Coal Gangue Using a Convolutional Neural Network and Transfer Learning. Energies 2019, 12, 1735.
  16. Li, D.; Zhang, Z.; Xu, Z.; Xu, L.; Meng, G.; Li, Z.; Chen, S. An Image-Based Hierarchical Deep Learning Framework for Coal and Gangue Detection. IEEE Access 2019, 7, 184686–184699.
  17. McCoy, J.; Auret, L. Machine learning applications in minerals processing: A review. Miner. Eng. 2019, 132, 95–109.
  18. Hou, W. Identification of Coal and Gangue by Feed-forward Neural Network Based on Data Analysis. Int. J. Coal Prep. Util. 2017, 39, 33–43.
  19. Alfarzaeai, M.S.; Niu, Q.; Zhao, J.; Eshaq, R.M.A.; Hu, E. Coal/Gangue Recognition Using Convolutional Neural Networks and Thermal Images. IEEE Access 2020, 8, 76780–76789.
  20. Li, D.; Wang, G.; Zhang, Y.; Wang, S. Coal gangue detection and recognition algorithm based on deformable convolution YOLOv3. IET Image Process. 2022, 16, 134–144.
  21. Yan, P.; Sun, Q.; Yin, N.; Hua, L.; Shang, S.; Zhang, C. Detection of coal and gangue based on improved YOLOv5.1 which embedded scSE module. Measurement 2022, 188, 110530.
  22. Li, M.; He, X.; Duan, Y.; Yang, M. Experimental study on the influence of external factors on image features of coal and gangue. Int. J. Coal Prep. Util. 2021, 42, 2770–2787.
  23. Li, N.; Gong, X. An Image Preprocessing Model of Coal and Gangue in High Dust and Low Light Conditions Based on the Joint Enhancement Algorithm. Comput. Intell. Neurosci. 2021, 2021, 1–10.
  24. Vapnik, V.N.; Chervonenkis, A. A note on one class of perceptrons. Autom. Remote Control 1964, 25, 821–837.
  25. Goldberg, D.E. Genetic algorithms in search, optimization, and machine learning. Choice Rev. 1989, 27, 39–45.
  26. Kennedy, J.; Eberhart, R. Particle Swarm Optimization. In Proceedings of the ICNN’95 - International Conference on Neural Networks, Perth, Australia, 27 November–1 December 1995; pp. 1942–1948.
  27. Dorigo, M.; Caro, G.D. Ant colony optimization: A new meta-heuristic. In Proceedings of the 1999 Congress on Evolutionary Computation-CEC99 (Cat. No. 99TH8406), Washington, DC, USA, 6–9 July 1999; pp. 1470–1477.
  28. Freund, Y.; Schapire, R.E. Experiments with a New Boosting Algorithm. In Proceedings of the 13th International Conference on Machine Learning, Bari, Italy, 3–6 July 1996; pp. 148–156.
  29. Dou, P.; Chen, Y.; Yue, H. Remote-sensing imagery classification using multiple classification algorithm-based AdaBoost. Int. J. Remote. Sens. 2018, 39, 619–639.
  30. Zhang, Y.; Ni, M.; Zhang, C.; Liang, S.; Fang, S.; Li, R.; Tan, Z. Research and Application of AdaBoost Algorithm Based on SVM. In Proceedings of the 2019 IEEE 8th Joint International Information Technology and Artificial Intelligence Conference (ITAIC), Chongqing, China, 24–26 May 2019; pp. 662–666.
  31. Hong, J. Gray-gradient co-occurrence matrix texture analysis method. Acta Autom. Sin. 1984, 10, 22–25.
  32. Rezaei, M.; Saberi, M.; Ershad, S.F. Texture classification approach based on combination of random threshold vector technique and co-occurrence matrixes. In Proceedings of the 2011 International Conference on Computer Science and Network Technology (ICCSNT), Harbin, China, 24–26 December 2011; Volume 4, pp. 2303–2306.
  33. Xue, G.; Li, X.; Qian, X. Coal-gangue image recognition in fully-mechanized caving face based on random forest. Ind. Mine Autom. 2020, 46, 57–62. (In Chinese)
  34. Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V.; et al. Scikit-learn: Machine Learning in Python. J. Mach. Learn. Res. 2011, 12, 2825–2830.
  35. Cho, G.-S.; Gantulga, N.; Choi, Y.-W. A comparative study on multi-class SVM & kernel function for land cover classification in a KOMPSAT-2 image. KSCE J. Civ. Eng. 2017, 21, 1894–1904.
  36. Fekri-Ershad, S. Bark texture classification using improved local ternary patterns and multilayer neural network. Expert Syst. Appl. 2020, 158, 113509.
  37. Chen, T.; Guestrin, C. XGBoost: A Scalable Tree Boosting System. In Proceedings of the KDD’16: 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 785–794.
  38. Zhou, M.; Lai, W. Coal gangue recognition based on spectral imaging combined with XGBoost. PLoS ONE 2023, 18, e0279955.
Figure 1. The block diagram of raw coal’s preparation process.
Figure 2. The picture of the manual sorting scene and the sorted coal gangue. (a) Manual sorting; (b) sorted coal gangue.
Figure 3. Basic principle of SVM classification.
Figure 4. Effect comparison of the coal and gangue images before and after preprocessing: (a) raw image, (b) gray conversion, (c) gamma correction, and (d) enhancement.
Figure 5. Generation principle of the gray-level co-occurrence matrix.
Figure 6. Importance of the texture features of the coal gangue image [33].
Figure 7. The variation curve of the coal gangue image recognition accuracy of AdaB-GA-SVM with a different number of integrated base classifiers.
Figure 8. Variation of the highest fitness among all individuals during the training of each of the 20 base classifiers.
Figure 9. The KS curve of AdaB-GA-SVM.
Table 1. The accuracy and runtime of coal-gangue identification by SVM models with different kernel functions.

| Kernel Function | Accuracy Rate | Runtime (s) |
|---|---|---|
| Polynomial | 73% | 0.0078 |
| RBF | 83% | 0.0142 |
| Sigmoid | 82% | 0.0128 |
Table 2. Orthogonal table $L_{81}(9 \times 7^2)$ design for GA parameter settings and experimental results.

| No. | Population Size | Crossover Probability | Mutation Probability | Accuracy |
|---|---|---|---|---|
| 1 | 30 | 0.4 | 0.01 | 0.892 |
| 2 | 20 | 0.4 | 0.0001 | 0.884 |
| 80 | 70 | 0.5 | 0.01 | 0.897 |
| 81 | 80 | 0.8 | 0.05 | 0.889 |
| k1 | 0.8976 | 0.8995 | 0.8977 |  |
| k2 | 0.9016 | 0.8958 | 0.9047 |  |
| k3 | 0.8936 | 0.8997 | 0.8996 |  |
| k4 | 0.9012 | 0.9001 | 0.8999 |  |
| k5 | 0.9 | 0.8967 | 0.8966 |  |
| k6 | 0.9031 | 0.9029 | 0.9004 |  |
| k7 | 0.9038 | 0.9017 | 0.8997 |  |
| k8 | 0.8963 |  |  |  |
| k9 | 0.8976 |  |  |  |
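The k rows of Table 2 can be read with a simple range analysis, sketched below. The sketch assumes each k_i is the mean accuracy of the runs at level i of a factor, which is the usual convention for orthogonal experiments but is not stated in this section. It reports only level indices, since the mapping from level index to parameter value is not listed here. The dictionary and function names are hypothetical.

```python
# k_i values copied from Table 2 (level 1 first).
LEVEL_MEANS = {
    "population size": [0.8976, 0.9016, 0.8936, 0.9012, 0.9, 0.9031, 0.9038, 0.8963, 0.8976],
    "crossover probability": [0.8995, 0.8958, 0.8997, 0.9001, 0.8967, 0.9029, 0.9017],
    "mutation probability": [0.8977, 0.9047, 0.8996, 0.8999, 0.8966, 0.9004, 0.8997],
}


def range_analysis(means):
    """Return (1-based index of the best level, range between best and worst level)."""
    best_level = max(range(len(means)), key=means.__getitem__) + 1
    return best_level, max(means) - min(means)


summary = {factor: range_analysis(means) for factor, means in LEVEL_MEANS.items()}
```

Under that assumption the best levels are k7 for population size, k6 for crossover probability, and k2 for mutation probability, and population size shows the largest range (about 0.010), which suggests it influences accuracy slightly more than the other two factors.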
Table 3. The results of the GA-SVM model and the AdaB-GA-SVM model when tested with the aforementioned test set.

| Evaluation Indicator | GA-SVM | AdaB-GA-SVM |
|---|---|---|
| TP | 426 | 464 |
| FP | 74 | 36 |
| FN | 15 | 13 |
| TN | 485 | 487 |
| Acc | 0.911 | 0.951 |
| P | 0.852 | 0.928 |
| R | 0.966 | 0.973 |
| F1 | 0.905 | 0.950 |
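The derived indicators in Table 3 follow from the four confusion-matrix counts. The sketch below recomputes them with the usual definitions of accuracy, precision, recall, and F1; the function and variable names are illustrative, and the counts are the ones listed in Table 3.

```python
def confusion_metrics(tp, fp, fn, tn):
    """Return (accuracy, precision, recall, F1) from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1


# Counts from Table 3.
ga_svm = confusion_metrics(tp=426, fp=74, fn=15, tn=485)
adab_ga_svm = confusion_metrics(tp=464, fp=36, fn=13, tn=487)

# Round to three decimals for comparison with the table.
ga_svm_rounded = tuple(round(v, 3) for v in ga_svm)
adab_ga_svm_rounded = tuple(round(v, 3) for v in adab_ga_svm)
```

Both sets of counts sum to 1000 test samples. Boosting removes 38 false positives (74 to 36) but only 2 false negatives (15 to 13), which is why precision rises much more than recall.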
Table 4. Coal gangue recognition accuracy and recognition runtime of different base classifiers before and after adaptive boosting.

| Base Classifier | Accuracy before (%) | Accuracy after (%) | Recognition Runtime before (s) | Recognition Runtime after (s) |
|---|---|---|---|---|
| SVM | 83 | 84 | 0.0142 | 0.076 |
| GS-SVM | 85 | 86 | 0.0121 | 0.104 |
| PSO-SVM | 86 | 90 | 0.0139 | 0.171 |
| GA-SVM | 91 | 95 | 0.0173 | 0.124 |
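The comparison drawn from Table 4 is simple arithmetic, sketched here for convenience. The dictionary layout and variable names are hypothetical; the numbers are the ones in the table.

```python
# base classifier -> (accuracy before %, accuracy after %, runtime before s, runtime after s)
TABLE_4 = {
    "SVM": (83, 84, 0.0142, 0.076),
    "GS-SVM": (85, 86, 0.0121, 0.104),
    "PSO-SVM": (86, 90, 0.0139, 0.171),
    "GA-SVM": (91, 95, 0.0173, 0.124),
}

# Accuracy gain from adaptive boosting, in percentage points.
boost_gain = {name: after - before for name, (before, after, _, _) in TABLE_4.items()}

# Lead of the boosted GA-SVM over each other boosted classifier, in percentage points.
best_after = TABLE_4["GA-SVM"][1]
lead_over = {
    name: best_after - after
    for name, (_, after, _, _) in TABLE_4.items()
    if name != "GA-SVM"
}
```

The lead of 5 to 11 percentage points quoted in the conclusions corresponds to the entries of `lead_over`. The boosted runtimes all stay below 0.2 s, although each is several times the runtime of its single base classifier.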
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Xue, G.; Hou, P.; Li, S.; Qian, X.; Han, S.; Gao, S. Coal Gangue Recognition during Coal Preparation Using an Adaptive Boosting Algorithm. Minerals 2023, 13, 329. https://doi.org/10.3390/min13030329
