Deep Learning Based Granularity Detection Network for Mine Dump Materials

Cai, Zhen; Lei, Shaogang; Lu, Xiaojuan

doi:10.3390/min12040424


by Zhen Cai ^1,2, Shaogang Lei ^2,3,* and Xiaojuan Lu ^2,3

¹ School of Public Policy & Management, China University of Mining & Technology, Xuzhou 221116, China

² Engineering Research Center of Ministry of Education for Mine Ecological Restoration, China University of Mining & Technology, Xuzhou 221116, China

³ School of Environment and Spatial Informatics, China University of Mining & Technology, Xuzhou 221116, China

* Author to whom correspondence should be addressed.

Minerals 2022, 12(4), 424; https://doi.org/10.3390/min12040424

Submission received: 7 February 2022 / Revised: 20 March 2022 / Accepted: 28 March 2022 / Published: 30 March 2022

(This article belongs to the Section Mineral Exploration Methods and Applications)


Abstract: The granularity distribution of mine dump materials has received extensive attention as an essential research basis for dump stability and mine land reclamation. Image analysis is widely used as the fastest and most efficient method to obtain the granularity distribution of dump materials. This article proposes a deep learning-based approach for detecting and identifying the granularity of two mine dump materials, conglomerate and clay. First, a Conglomerate and Clay Dataset (CCD) is proposed to study the granularity of the mine dump. A typical study area is selected for field sampling, and the sampled conglomerate and clay are photographed and labeled. In addition, this article proposes a keypoint-based detection algorithm for conglomerate and clay. The algorithm considers the scale variation of conglomerate and clay in orthophoto images and adopts center point detection to avoid the difficulty of localization. On this basis, dense convolution is introduced in feature extraction to reduce computational redundancy and make detection more efficient. Finally, the granularity distributions of conglomerate and clay are obtained from the deep learning-based detection results by geometric calculation. The proposed algorithm is validated on the CCD dataset, and the experiments demonstrate its effectiveness and its applicability to the granularity analysis of mine dump material.

Keywords: mine dump material; granularity distribution; object detection; deep learning

1. Introduction

China is the largest producer and consumer of coal globally, and coal, as the primary energy source in China, has made significant contributions to national economic development [1]. Mine dumps are large piles of stripped strata from open-pit mining and are prone to a series of disasters such as slope instability, soil erosion, and environmental pollution because of their heterogeneous and loose structure [2,3]. Currently, a total of 8.84 × 10³ hm² of disturbed land and 1.63 × 10⁴ hm² of mine dumps have been generated by open-pit mining in China, and these areas grow by 8–9% annually [4,5]. The granularity distribution of the dump materials has received extensive attention as an essential research basis for dump stability and the reclamation of mine land [6].

Research on the granularity of mine dump materials mainly adopts indoor experiments and field measurements. Although indoor experiments are simple and easy to control, they are small in scale, differ significantly from the actual situation, and are generally used to study the properties of materials with different granularity [7]. Therefore, field measurements are still the most accurate way to reflect the granularity distribution of mine dump materials. The granularity of dump materials in the field is mainly obtained by direct measurement, sieving, and image analysis. Direct measurement and sieving are suitable for small quantities of material with high accuracy requirements [8]; they are the most convenient methods in engineering but are time-consuming and cannot reflect the overall distribution of the materials [9,10]. Image analysis is now the most commonly used field method because it can measure the granularity distribution of dump materials over a large area quickly, efficiently, and accurately, while being portable and contact-free [11,12]. In addition, the 2D (area) based material size detection method is more accurate than the 3D (weight) based one [13].

As shown in Figure 1, applying the traditional watershed algorithm to material detection in the mine dump requires first converting the RGB image to a gray-scale image, then binarizing it using Otsu's method [14] and removing the noise using Gaussian filtering. A distance transform is then applied to obtain the foreground object region. Finally, a simple geometric transformation is adopted to obtain the foreground objects and perform object segmentation. Although the watershed algorithm can accurately detect the mine dump material, it is limited by its underlying morphological model, which is computationally complex, sensitive to noise, and generalizes poorly [15]. With the development of deep learning, computer vision has advanced in image classification, object detection, and semantic segmentation. Because they are data-driven, deep learning-based computer vision algorithms can address these problems of traditional algorithms.
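The binarization step of the traditional pipeline can be illustrated with a minimal sketch of Otsu's method. This is only an assumption-laden illustration in pure Python on a flat list of 8-bit gray values; the function names `otsu_threshold` and `binarize` are hypothetical, and the distance transform and watershed steps are not shown.

```python
def otsu_threshold(gray_values, levels=256):
    """Return the threshold t that maximizes the between-class variance.

    Pixels with value <= t are treated as background, the rest as foreground.
    """
    hist = [0] * levels
    for v in gray_values:
        hist[v] += 1
    total = len(gray_values)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg = 0      # number of background pixels so far
    sum_bg = 0    # sum of background gray values so far
    for t in range(levels):
        w_bg += hist[t]
        if w_bg == 0:
            continue
        w_fg = total - w_bg
        if w_fg == 0:
            break
        sum_bg += t * hist[t]
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(gray_values, threshold):
    """Map each gray value to 1 (foreground) or 0 (background)."""
    return [1 if v > threshold else 0 for v in gray_values]


# Toy example: four dark pixels and four bright pixels.
pixels = [10, 12, 11, 13, 200, 205, 198, 202]
t = otsu_threshold(pixels)
mask = binarize(pixels, t)
```

On real images this step would normally be delegated to an image-processing library; the sketch only shows why the threshold separates the two intensity groups.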

Deep learning is rapidly evolving, benefiting from convolutional neural networks. AlexNet [16] is a first-generation convolutional neural network that won the image classification competition at ILSVRC 2012. It consists of five convolutional layers and three fully connected layers containing tens of millions of parameters. It was the first to use the ReLU function as the activation unit of the convolutional output, obtaining much better results than the previous Sigmoid function, and the first to use the Dropout method, which randomly discards neurons during training to mitigate model overfitting. VGG [17] was proposed to study the relationship between network depth and model performance, and convolutional neural networks with 16–19 layers were constructed. VGG increases the network depth by replacing the large convolutional kernels in AlexNet with several successive small convolutional kernels, and its experiments found that multiple small convolutions lead to superior performance. The GoogLeNet family [18,19,20] also builds on AlexNet; it has 22 layers but far fewer parameters than VGG. The performance of a convolutional network is affected by its depth and width: the deeper and wider it is, the better the performance tends to be. However, this also increases the number of parameters. GoogLeNet uses the Inception module to increase depth and width while limiting the growth in parameters. ResNet [21] pioneered the deep residual network structure. Instead of learning the mapping between input and output directly, as other networks do, the residual structure learns the residual with respect to the input, with an identity mapping that lets the network be extended to greater depth after reaching the optimal solution. The network performance is then fine-tuned around this optimum, which to a certain extent solves the problem of network performance degradation caused by increasing depth.

Intuitively, a semantic segmentation algorithm [22,23,24,25] assigns each pixel to a specific category and thus captures the location and size of the mine dump material entirely. However, as shown in Figure 2, semantic segmentation requires a large number of pixel-level labels, so using it for material detection consumes considerable human and material resources. Because the material is tightly packed and would require a large number of dense labels, semantic segmentation is even less suitable for material detection. Object detection requires simpler annotation (bounding boxes) than semantic segmentation and can still accurately represent location and category [26]. Currently, deep learning-based object detectors are classified into two-stage and one-stage detectors. The two-stage detectors originate from R-CNN [27], which first introduced deep learning and machine learning to object detection with some success. Fast R-CNN [28] built on it and proposed region of interest pooling (RoI pooling) to reduce repeated computations. Faster R-CNN [29] proposed a region proposal network (RPN) that uses anchors for sampling instead of the costly selective search (SS) algorithm. Subsequent two-stage detectors continued to improve the network structure of Faster R-CNN. Mask R-CNN [30] introduced a new instance segmentation branch to assist object localization and recognition. Cascade R-CNN [31] proposed a cascade head that dynamically updates the IoU thresholds to solve the mismatch problem in detectors. The training process is as important as model construction in object detection: Libra R-CNN [32] proposed a balanced detector structure that addresses imbalance in sampling, feature extraction, and loss. The representative single-stage object detection algorithms are YOLO [33,34,35] and SSD [36,37]. The YOLO algorithm proposed a more efficient feature extraction network, DarkNet, to improve detection efficiency and performs regression and classification based on anchors. The SSD algorithm proposed a multi-level prediction structure and performs regression on default boxes. RetinaNet [38] proposed Focal loss to reduce the accuracy gap between single-stage and two-stage detectors. However, most of the algorithms mentioned above are anchor-based: they rely on a priori anchor boxes and are therefore less generalizable and more sensitive to hyperparameters. Recently, anchor-free algorithms have been proposed to solve this problem. FCOS [39] pioneered a fully convolutional anchor-free network that predicts the bounding box by dense regression. CornerNet [40] adopted corner point detection to localize the object. Anchor-free algorithms have since been reported to outperform anchor-based approaches in detection speed and accuracy.

In this article, an anchor-free detection algorithm is proposed to accurately obtain the granularity of mine dump materials. Our algorithm builds on the existing object detection algorithm CenterNet [41]. It uses the center of the material to localize the material and classifies the different materials with the detector head. On top of CenterNet, we introduce a backbone network improved with dense convolution to detect multi-scale material efficiently in the feature extraction stage. Since supervised deep learning is data-driven, a conglomerate and clay detection dataset is proposed to train the model. Finally, a simple geometric transformation of the detection results outputs the granularity information of the material.

Our contributions can be summarized as follows:

We produced a Conglomerate and Clay Dataset (CCD) for training a deep learning-based material detection model. The material was sampled in the field and post-processed indoors to obtain a dataset that reflects the material distribution within the sampled area.
We propose a deep learning-based anchor-free material detection algorithm. The algorithm localizes the material by detecting the center point and classifies different materials with the head of the detector. The granularity information of the material is obtained by applying a simple geometric transformation to the detection results.
We introduce dense convolution to make the detector efficient. Feature reuse through dense concatenation achieves superior performance with fewer parameters and computations in the feature extraction stage.

2. Materials and Methods

Two typical mine dump profiles of conglomerate and clay are selected, and different granularity materials are collected continuously. In addition, we propose a deep learning-based conglomerate and clay detector. The detector is based on the anchor-free algorithm and introduces dense convolution in feature extraction to efficiently and accurately detect conglomerate and clay in the mine dump. Finally, the granularity information of the material is obtained by a simple geometric transformation. The details of materials and methods will be described below.

2.1. Materials

The mine dump materials are taken from the Shengli No. 1 Open-pit Mine in the northwest suburb of Xilinhot City, Inner Mongolia Autonomous Region. Conglomerate and clay are the primary constituent materials of the mine dump. Therefore, two profiles of conglomerate and clay in the dump area were selected, and materials of different granularity were collected continuously from the top to the bottom of the profiles for photographing.

2.2. Methods

To implement deep learning-based object detection, a conglomerate and clay dataset (CCD) is produced in this article. As shown in Figure 3, the material is first sampled in the field. We select a typical mine dump profile as the study area; the material in this area is mainly conglomerate and clay. Successive samples of conglomerate and clay of different granularity are then taken from the top to the bottom of the slope and brought indoors for processing. A 15 cm × 15 cm iron frame is placed flat on the ground to ensure that the collected material is photographed entirely. To complete the image data collection accurately and quickly, a pan-tilt camera is used to photograph the placed conglomerate and clay. We adjust the pan-tilt camera so that the lens is perpendicular to the iron frame at a vertical distance of 1.5 m. We then randomly arrange material of different granularity in the frame and take orthophotos, generating 400 images of conglomerate and 400 images of clay as the training set. To verify that the proposed algorithm obtains accurate detection results for materials of different scales, we use the sieving method when making the validation set. To construct a validation set containing 100 images, we performed five samplings (m = 1, 2, 3, 4, 5), and each sampling captured 20 images as the validation subset of that sampling. For each sampling, we used 7 sieves (10 mm, 15 mm, 20 mm, 25 mm, 30 mm, 40 mm, 50 mm) to determine the granularity ground truth of the sampled material. After obtaining all the images, we use the software LabelMe to label the training and validation sets with bounding boxes.
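The sieve classes used for the ground truth can be sketched as a simple binning of particle sizes. The following is a minimal illustration, assuming a count-based distribution and half-open size classes; the article does not state whether the sieving result is tallied by count or by mass, and the names `sieve_class_labels` and `size_distribution` are hypothetical.

```python
from bisect import bisect_right

# Sieve apertures in millimetres, as listed in the text.
SIEVES_MM = [10, 15, 20, 25, 30, 40, 50]


def sieve_class_labels(sieves):
    """Build labels such as '<10', '10-15', ..., '>=50'."""
    labels = [f"<{sieves[0]}"]
    labels += [f"{lo}-{hi}" for lo, hi in zip(sieves, sieves[1:])]
    labels.append(f">={sieves[-1]}")
    return labels


def size_distribution(sizes_mm, sieves=SIEVES_MM):
    """Fraction of particles per size class (by count, an assumption)."""
    counts = [0] * (len(sieves) + 1)
    for s in sizes_mm:
        # A particle of exactly 10 mm is retained on the 10 mm sieve,
        # so it falls into the '10-15' class.
        counts[bisect_right(sieves, s)] += 1
    n = len(sizes_mm)
    return dict(zip(sieve_class_labels(sieves), (c / n for c in counts)))


# Hypothetical particle sizes in millimetres.
dist = size_distribution([8.0, 12.5, 17.0, 22.0, 22.5, 35.0, 41.0, 55.0])
```

The same binning can be applied to the detected granularities so that predictions and sieve ground truth are compared class by class.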

This article proposes a deep learning-based algorithm for conglomerate and clay detection based on the CCD dataset. The proposed detector is anchor-free: it detects the center point of the material and regresses the transverse and longitudinal diameters to accomplish material localization and classification. The overall framework of the detector is shown in Figure 4. The detector is divided into two parts: (1) the feature extraction part, which uses a convolutional neural network (CNN) to extract a deep representation of the input image; and (2) the detection head part, which performs regression and classification on the extracted deep features to obtain the location and category of each object. In the feature extraction part, the proposed method utilizes an hourglass-shaped convolutional neural network. Two stacked hourglass networks are shown in Figure 4, where information from multiple resolutions is combined by nearest-neighbor upsampling and element-wise summation across branches of different resolutions. When the final output resolution is reached, three consecutive 1 × 1 convolutions are applied to output the final prediction, and each convolution block uses the residual structure [21] for feature extraction. One block is a standard 7 × 7 convolution, and the remaining blocks output 256-dimensional features. The two hourglass networks are stacked end-to-end to provide repeated bottom-up and top-down processing, which gives more refined information for the final output heatmap. Building on the hourglass network, we propose a dense hourglass network to obtain multi-scale features. The dense hourglass network uses convolutional and max-pooling layers to downsample the image (the encoding phase); once the features have been downsampled by a factor of 32, the network starts upsampling with cross-scale feature fusion (the decoding phase). In the decoding phase, to integrate information from multiple scales, dense connections up-sample all low-resolution features and fuse them with the features from the encoding phase to obtain a feature map with the same resolution as the input image.

With the deep learning approach, we obtain sufficient semantic features in the feature extraction stage. Subsequently, we propose an anchor-free detector for the precise localization and classification of material. As shown in Figure 5, the object detection problem is converted into a center point prediction problem: the object is represented by its center point, and the rectangular box of the object is obtained by predicting the offset of the center point and the width and height of the object. The heatmap represents the categorical information, and each category generates a separate heatmap. For each heatmap, when a coordinate contains the center point of an object, a keypoint is generated at that location, and we use a Gaussian circle to represent the keypoint. As shown in Figure 5a, the heatmap branch predicts a heatmap with values in the range [0, 1] that give the confidence that a material is present, where 0 means the point is not a material center and 1 means it is a material center. To detect smaller objects, the feature map is the same size as the original input image. To reduce the penalty for points near the center of the object, we set a Gaussian kernel in the loss function. The kernel function is as follows:

k(x, x') = e^{-\frac{\| x - x' \|^{2}}{2 \sigma^{2}}}

(1)

where $x$ represents the center of the kernel function, $x'$ represents the point at which the Gaussian value is evaluated, and $\sigma$ controls the range of action of the Gaussian kernel function: the larger its value, the larger the local range of influence of the kernel.
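To make the Gaussian kernel of Equation (1) concrete, the following minimal sketch renders a small ground-truth heatmap around given center points. It assumes 2D pixel coordinates and, where Gaussians overlap, keeps the per-pixel maximum; that overlap rule and the function names are assumptions for illustration, not taken from the article.

```python
import math


def gaussian_kernel(x, x_prime, sigma):
    """Equation (1): exp(-||x - x'||^2 / (2 * sigma^2)) for 2D points."""
    d2 = (x[0] - x_prime[0]) ** 2 + (x[1] - x_prime[1]) ** 2
    return math.exp(-d2 / (2.0 * sigma ** 2))


def render_heatmap(height, width, centers, sigma):
    """Build a heatmap with a Gaussian bump at each (col, row) center."""
    heat = [[0.0] * width for _ in range(height)]
    for cx, cy in centers:
        for row in range(height):
            for col in range(width):
                value = gaussian_kernel((cx, cy), (col, row), sigma)
                # Assumption: overlapping Gaussians keep the larger value.
                heat[row][col] = max(heat[row][col], value)
    return heat


# One material center in the middle of a 5 x 5 map.
heat = render_heatmap(5, 5, centers=[(2, 2)], sigma=1.0)
```

The center pixel receives the value 1 and its neighbours decay smoothly, which is what softens the penalty for predictions that land close to, but not exactly on, a center point.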

For a bounding box $(x_{1}, y_{1}, x_{2}, y_{2})$, its center point is calculated as:

p = (\frac{x_{1} + x_{2}}{2}, \frac{y_{1} + y_{2}}{2})

(2)

where p represents the center point coordinate. The offset branch is dedicated to predicting the center point coordinate offset. In the size branch, we output the granularity information. The particle size in the horizontal and vertical directions is calculated in the following way:

w = x_{2} - x_{1}

(3)

h = y_{2} - y_{1}

(4)

On this basis, we define the average particle size $\frac{w + h}{2}$ as the granularity of each material. However, the granularity obtained in this way is the particle size of the material in the image, not the true particle size. Therefore, a simple geometric transformation is designed in this article as follows:

g_{material} = \frac{l_{frame}}{l_{frame'}} g_{material'}

(5)

where $g_{material}$ and $g_{material'}$ represent the true particle size and the particle size in the image, and $l_{frame}$ and $l_{frame'}$ represent the true iron frame size and the iron frame size in the image.
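Equations (2) to (5) can be combined into a short worked example. The sketch below assumes bounding boxes in pixel coordinates and a frame side length measured in pixels in the same image; the 500-pixel frame length and the example box are hypothetical values chosen only for illustration.

```python
def box_center(x1, y1, x2, y2):
    """Equation (2): center point of a bounding box."""
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def granularity_px(x1, y1, x2, y2):
    """Equations (3) and (4), then the average particle size (w + h) / 2."""
    w = x2 - x1
    h = y2 - y1
    return (w + h) / 2.0


def to_true_size(g_image, frame_true, frame_image):
    """Equation (5): scale an image-space size by the frame-size ratio."""
    return frame_true / frame_image * g_image


FRAME_MM = 150.0   # the 15 cm iron frame from the text
frame_px = 500.0   # hypothetical side length of the frame in the image
box = (100.0, 120.0, 180.0, 180.0)  # hypothetical detection (x1, y1, x2, y2)

center = box_center(*box)
g_img = granularity_px(*box)                     # (80 + 60) / 2 pixels
g_mm = to_true_size(g_img, FRAME_MM, frame_px)   # millimetres
```

With these numbers each pixel corresponds to 0.3 mm, so a 70-pixel average size maps to roughly 21 mm.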

Thus, the heatmap branch is employed to predict the location of the center point, the offset branch is utilized to predict the offset of the center point coordinates, and the size branch is applied to output the granularity information of the material. To train our proposed algorithm, the corresponding loss function needs to be designed. The loss function is devised as follows:

L_{total} = L_{heatmap} + \lambda_{off} L_{off} + \lambda_{size} L_{size}

(6)

where $L_{total}$ represents the total training loss, $L_{heatmap}$ represents the loss of the heatmap branch, $L_{off}$ represents the loss of the offset branch, and $L_{size}$ represents the loss of the size branch. Moreover, to balance the multi-task losses, two hyperparameters $\lambda_{off}$ and $\lambda_{size}$ are introduced as the weights of the offset and size losses, respectively.

The heat map branch is designed with a loss function based on Focal loss [38] in order to obtain the location of the center point better. The loss function is as follows:

L_{heatmap} = -\frac{1}{N} \sum_{xyc} \begin{cases} (1 - \hat{Y}_{xyc})^{\alpha} \log(\hat{Y}_{xyc}) & \text{if } Y_{xyc} = 1 \\ (1 - Y_{xyc})^{\beta} (\hat{Y}_{xyc})^{\alpha} \log(1 - \hat{Y}_{xyc}) & \text{otherwise} \end{cases}

(7)

where $\alpha$ and $\beta$ are the hyperparameters of Focal loss, $N$ is the number of center points in the image, $\hat{Y}_{xyc}$ is the predicted heatmap value, and $Y_{xyc}$ is the ground-truth heatmap value generated with the Gaussian kernel, following the CenterNet formulation [41]; $x$, $y$, and $c$ index the position and category channel of the feature map. When $Y_{xyc} = 1$, $(1 - \hat{Y}_{xyc})^{\alpha}$ is used to adjust the weight of hard and easy cases. If $\hat{Y}_{xyc}$ is close to 1, the point is easy to detect and its weight is lower. When $\hat{Y}_{xyc}$ is close to 0, the center point is more difficult to detect and its weight is higher. Similarly, in the otherwise case, $(\hat{Y}_{xyc})^{\alpha}$ is used to adjust the weights of the hard and easy cases, and $(1 - Y_{xyc})^{\beta}$ is used to adjust the weight according to the distance from the center point, so that the weight is reduced when the point is close to the center point.
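The behaviour of the heatmap loss can be checked numerically with a small sketch. It assumes the CenterNet-style reading of Equation (7), in which the case distinction is made on a ground-truth heatmap (`target`) and the logarithms are taken on the predicted heatmap (`pred`), with a leading minus sign so that the loss is positive. The values alpha = 2 and beta = 4 are commonly used defaults and are assumptions here, since the article does not state them.

```python
import math


def heatmap_focal_loss(pred, target, alpha=2.0, beta=4.0, eps=1e-12):
    """Penalty-reduced focal loss over 2D heatmaps given as nested lists."""
    pos_loss = 0.0
    neg_loss = 0.0
    num_pos = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            p = min(max(p, eps), 1.0 - eps)  # keep log() finite
            if t == 1.0:
                # Center point: hard (low-confidence) predictions weigh more.
                pos_loss += (1.0 - p) ** alpha * math.log(p)
                num_pos += 1
            else:
                # Non-center point: penalty shrinks near a center (t close to 1).
                neg_loss += (1.0 - t) ** beta * p ** alpha * math.log(1.0 - p)
    return -(pos_loss + neg_loss) / max(num_pos, 1)


# Toy 2 x 2 ground truth with one center and one Gaussian-softened neighbour.
target = [[0.0, 0.5], [1.0, 0.0]]
good_pred = [[0.01, 0.3], [0.95, 0.01]]
bad_pred = [[0.6, 0.6], [0.2, 0.6]]

loss_good = heatmap_focal_loss(good_pred, target)
loss_bad = heatmap_focal_loss(bad_pred, target)
```

A prediction that is confident at the true center and quiet elsewhere yields a much smaller loss than one that misses the center and fires on the background.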

However, the spatial resolution of the output feature map of the backbone network is one-fourth of the original input image. Each pixel on the output feature map corresponds to a 4 × 4 region of the original image, which introduces a significant error. An additional prediction of a center point offset is required to eliminate this error. The loss function of the offset is as follows:

L_{off} = \frac{1}{N} \sum_{p} |\hat{O}_{\tilde{p}} - (p - \tilde{p})|

(8)

where $\hat{O}_{\tilde{p}}$ denotes the offset value predicted by the network, $p$ denotes the center point coordinates, and $\tilde{p}$ denotes the approximate integer coordinates of the center point after scaling. The whole process uses the $L1$ loss to calculate the offset loss of the positive sample block.
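A short worked example shows what the offset branch has to recover. The sketch assumes that the center point is first divided by the output stride of 4 and then truncated to an integer cell, so that the regression target is the fractional remainder; this reading of p and its integer approximation, and the function names, are assumptions for illustration.

```python
def offset_target(center, stride=4):
    """Return the integer feature-map cell and the sub-cell offset."""
    scaled = (center[0] / stride, center[1] / stride)
    cell = (int(scaled[0]), int(scaled[1]))
    offset = (scaled[0] - cell[0], scaled[1] - cell[1])
    return cell, offset


def l1_offset_loss(pred_offsets, target_offsets):
    """Equation (8): mean absolute offset error over the center points."""
    n = max(len(target_offsets), 1)
    total = 0.0
    for (px, py), (tx, ty) in zip(pred_offsets, target_offsets):
        total += abs(px - tx) + abs(py - ty)
    return total / n


# A center at (141, 150) in the input image maps to (35.25, 37.5).
cell, off = offset_target((141.0, 150.0))
loss = l1_offset_loss([(0.2, 0.5)], [off])
```

Without the offset, the center would be reported at cell (35, 37), that is (140, 148) in the input image, which is up to three pixels away from the true center in each direction.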

To predict the granularity, we designed a loss function for granularity prediction as follows:

L_{size} = \frac{1}{N} \sum_{k=1}^{N} |\hat{S}_{p_{k}} - s_{k}|

(9)

where $N$ denotes the number of materials, $k$ denotes the $k$-th material, $\hat{S}_{p_{k}}$ denotes the predicted particle size, and $s_{k}$ denotes the ground truth of the particle size calculated as in Equations (3) and (4).
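The size loss of Equation (9) and the weighted sum of Equation (6) can be sketched together. The sizes are treated as scalars per material, and the weights `lambda_off = 1.0` and `lambda_size = 0.1` are placeholder values chosen for illustration; the article does not report the values it used.

```python
def l1_size_loss(pred_sizes, true_sizes):
    """Equation (9): mean absolute error between predicted and true sizes."""
    n = max(len(true_sizes), 1)
    return sum(abs(p - s) for p, s in zip(pred_sizes, true_sizes)) / n


def total_loss(l_heatmap, l_off, l_size, lambda_off=1.0, lambda_size=0.1):
    """Equation (6): weighted sum of the three branch losses."""
    return l_heatmap + lambda_off * l_off + lambda_size * l_size


# Hypothetical predictions and ground truth for two materials (in pixels).
size_loss = l1_size_loss([68.0, 42.0], [70.0, 40.0])

# Hypothetical heatmap and offset losses combined with the size loss.
combined = total_loss(l_heatmap=0.5, l_off=0.05, l_size=size_loss)
```

Because the size loss is measured in pixels and is typically much larger than the other two terms, a small weight on it keeps the three branches on a comparable scale.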

In addition, the hourglass-type network structure can be stacked. In this article, we use two hourglass-type networks: the output of the first is used to assist training, and the output of the second is the final result, so that the cascaded structure may obtain higher accuracy. As shown in Figure 4, the dashed line is the supervised signal and the solid line is the data stream.

To make the model more robust to different sizes and shapes of material in different lighting and scenes, each image is sampled by randomly adopting the following strategies: (1) Adding random offsets to the brightness, contrast, saturation, and hue of the image; (2) Randomly cropping a region from the image and then resizing it to a fixed size; (3) Randomly performing horizontal and vertical flips; (4) Randomly rotating the image by 90 degrees; and (5) Randomly translating the image horizontally or vertically.
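The geometric strategies (3) and (4) must transform the bounding boxes together with the image. The following minimal sketch shows this for flips and a clockwise 90-degree rotation on an image stored as a nested list, with boxes given as (x1, y1, x2, y2) pixel-edge coordinates. The 0.5 application probability and all function names are assumptions; the photometric, cropping, and translation strategies are not shown.

```python
import random


def hflip(image, boxes):
    """Mirror the image left-right and update the boxes."""
    width = len(image[0])
    flipped = [row[::-1] for row in image]
    new_boxes = [(width - x2, y1, width - x1, y2) for x1, y1, x2, y2 in boxes]
    return flipped, new_boxes


def vflip(image, boxes):
    """Mirror the image top-bottom and update the boxes."""
    height = len(image)
    flipped = image[::-1]
    new_boxes = [(x1, height - y2, x2, height - y1) for x1, y1, x2, y2 in boxes]
    return flipped, new_boxes


def rot90_cw(image, boxes):
    """Rotate the image 90 degrees clockwise and update the boxes."""
    height = len(image)
    rotated = [list(col) for col in zip(*image[::-1])]
    new_boxes = [(height - y2, x1, height - y1, x2) for x1, y1, x2, y2 in boxes]
    return rotated, new_boxes


def augment(image, boxes, rng):
    """Apply each geometric transform with probability 0.5 (an assumption)."""
    for transform in (hflip, vflip, rot90_cw):
        if rng.random() < 0.5:
            image, boxes = transform(image, boxes)
    return image, boxes


# 2 x 3 toy image; the box covers the single pixel holding the value 1.
toy_image = [[1, 2, 3], [4, 5, 6]]
toy_boxes = [(0, 0, 1, 1)]
aug_image, aug_boxes = augment(toy_image, toy_boxes, random.Random(0))
```

Whatever combination is drawn, the box still encloses the same pixel after augmentation, which is the property the labels must keep.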

3. Results and Discussion

This section will show the dataset produced in this article and some details of the problems in producing the dataset. Then, the algorithmic model proposed in this paper is validated on this dataset, and comparison experiments and ablation experiments demonstrate the effectiveness and superiority of the algorithm proposed in this article.

3.1. Datasets

The CCD dataset in this article is produced from images we took indoors. First, we selected a mine dump in a specific area for sample collection. The collected material samples were put into iron frames for photographing and image acquisition. To calculate the final true particle size, a 15 cm square iron frame is used as the size reference. To ensure the same scale of the objects within the images, the distance between the camera and the sample is set to 1.5 m for all images. To obtain the ground truth of the real particle size, we use a soil sieve to measure the particle size distribution of the material when making the validation set. In production, we take out a part of the material each time, shoot 20 images as the validation subset for that collection, and calculate the particle size distribution of this part of the material. In total, we made five acquisitions to obtain 100 validation images, and the particle size distribution of each acquisition is shown in Table 1.

The proposed dataset CCD contains 500 conglomerate images and 500 clay images: 400 conglomerate training images, 400 clay training images, 100 conglomerate validation images, and 100 clay validation images. Figure 6 shows our dataset.

3.2. Implementation Details

The proposed algorithm and the experiments in this article are implemented in the deep learning framework PyTorch [42]. All experiments were performed on a computer with one Quadro M4000 GPU. The Hourglass Net part of the backbone network is pre-trained on ImageNet. For a fair comparison, the basic hyperparameters of the compared algorithms are kept consistent. In our experiments, we set the batch size to 16 and the initial learning rate to 2 × 10⁻². Training runs for a total of 140 epochs to reach convergence. The learning rate follows a warm-up schedule [43] and decreases between the 90th and 120th epochs. In addition, following the basic settings of CenterNet [41], we only detect one keypoint at the center of each object together with the width and height of the material. During testing, the number of images for each detection is 8 to ensure a fair comparison. Before detection, each input image is resized to 512 × 512 to ensure a consistent image size.
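One possible reading of this schedule is sketched below: a linear warm-up followed by step decays at epochs 90 and 120. The warm-up length of 5 epochs, the decay factor of 0.1, and the choice of step decay are all assumptions made for illustration; the article only gives the initial learning rate, the total of 140 epochs, and the epochs between which the rate decreases.

```python
def learning_rate(epoch, base_lr=2e-2, warmup_epochs=5,
                  milestones=(90, 120), gamma=0.1):
    """Learning rate for a zero-based epoch index.

    warmup_epochs, milestones as step points, and gamma are hypothetical.
    """
    if epoch < warmup_epochs:
        # Linear warm-up from base_lr / warmup_epochs up to base_lr.
        return base_lr * (epoch + 1) / warmup_epochs
    drops = sum(1 for m in milestones if epoch >= m)
    return base_lr * gamma ** drops


# Learning rate over the 140 training epochs.
schedule = [learning_rate(e) for e in range(140)]
```

Under these assumptions the rate rises to 2 × 10⁻² during the first five epochs, stays there until epoch 90, and is divided by ten at epochs 90 and 120.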

3.3. Evaluation Metrics

To evaluate the accuracy of the proposed algorithm on the object detection task, we use the mean Average Precision (mAP) as the evaluation metric. mAP is calculated by averaging the Average Precision (AP) values over all object classes. For object detection, the larger the AP and mAP, the better the accuracy of the detection model. mAP is calculated as follows:

mAP = \frac{1}{n} \sum_{i=1}^{n} AP_{i}

(10)

where $i$ indexes the category, $n$ denotes the total number of categories, and $AP_{i}$ is the average precision of the $i$-th category. The value of AP is the area under the Precision–Recall (P–R) curve for each category. Precision expresses the proportion of detected objects that are true positives, while recall expresses the proportion of ground-truth objects that are correctly detected. P and R are calculated as follows:

P = \frac{TP}{TP + FP}

(11)

R = \frac{TP}{TP + FN}

(12)

where $TP$, $FP$, and $FN$ denote the numbers of true positives, false positives, and false negatives, respectively.
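The metrics can be illustrated with a compact sketch. Recall is taken here as TP / (TP + FN), the standard definition, and AP is computed as the area under the precision envelope of the P–R curve; the exact interpolation used in the experiments is not stated, so this choice and the function names are assumptions. Matching detections to ground truth by IoU is assumed to have been done beforehand.

```python
def precision_recall(tp, fp, fn):
    """Precision and recall from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def average_precision(scored_hits, num_gt):
    """Area under the P-R curve for one class.

    scored_hits: list of (confidence, is_true_positive) per detection.
    num_gt: number of ground-truth objects of this class.
    """
    ranked = sorted(scored_hits, key=lambda d: d[0], reverse=True)
    tp = fp = 0
    points = []  # (recall, precision) after each detection
    for _, hit in ranked:
        if hit:
            tp += 1
        else:
            fp += 1
        points.append((tp / num_gt, tp / (tp + fp)))

    ap = 0.0
    prev_recall = 0.0
    for i, (recall, _) in enumerate(points):
        # Precision envelope: best precision at this or any higher recall.
        max_precision = max(p for _, p in points[i:])
        ap += (recall - prev_recall) * max_precision
        prev_recall = recall
    return ap


def mean_average_precision(ap_per_class):
    """Equation (10): mean of the per-class AP values."""
    return sum(ap_per_class) / len(ap_per_class)


p, r = precision_recall(tp=8, fp=2, fn=2)
ap = average_precision([(0.9, True), (0.8, False), (0.7, True)], num_gt=2)
m_ap = mean_average_precision([0.8, 0.6])
```

In the toy example the second-ranked detection is a false positive, so the precision at full recall drops to 2/3 and the AP falls below 1.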

3.4. Results

3.4.1. Ablation Study

Impact of dense convolution on the algorithm. Since the particle size distribution of the material contained in the picture is not uniform, a densely connected hourglass network (Dense-Hg-Center) is proposed in this paper to adapt to the multi-scale material, and dense convolution is introduced in the hourglass network (Hg-Center) to improve the detection accuracy. To verify the effectiveness of dense convolution, we compare the error change of the hourglass network before and after the introduction of dense convolution. The experimental results are shown in Table 2.

The results show that Dense-Hg-Center reduces the error by 1.14% and the standard deviation by 12.84 compared to Hg-Center on the conglomerate data set. On the clay data set, Dense-Hg-Center reduces the error by 0.19% and the standard deviation by 25.4.

Mixed training and separate training. Since CCD contains two different material categories, we designed both mixed training and separate training experiments to explore their influence on each other. In mixed training, the 400 conglomerate training images and 400 clay training images are shuffled together to train the model. For testing, the trained model (Dense-Hg-Center-Mix) is validated on the 100 conglomerate validation images and 100 clay validation images to calculate the error. Separate training involves training two models (Dense-Hg-Center-Sep) on the conglomerate training set and the clay training set, respectively, and then testing them on the corresponding validation sets to output the errors. The experimental results are shown in Table 3.

The experimental results indicate that the mixed training strategy is superior to separate training: the different material objects help the model extract discriminative features between them. On the conglomerate validation set, mixed training reduces the error by 0.91% and the standard deviation by 7.12 compared to separate training. On the clay validation set of CCD, mixed training reduces the error by 0.35% and the standard deviation by 13.93.

3.4.2. Comparison Study

In the same experimental environment, we compare the traditional algorithm watershed, the deep learning object detection algorithm Faster R-CNN, and YOLOv3. Watershed is a segmentation algorithm based on the traditional image algorithm, so it does not include the evaluation metric of object detection mAP. Faster R-CNN is the most classical two-stage algorithm, and it maintains an advanced level in detection accuracy. YOLOv3 is the most commonly used single-stage object detection algorithm, which has the optimal detection speed and accuracy. In our comparison experiments, we compare the mAP accuracy on the object detection task and the error comparison on the granularity detection, respectively. The experimental results are shown in Table 4 and Table 5. The visualization of the experimental results on both conglomerate and clay validation sets is shown in Figure 7.

Table 4 shows the object detection and granularity detection comparison experiments performed on the conglomerate validation set of CCD. mAP0.5 and mAP0.75 are the mAP values at IoU thresholds of 0.5 and 0.75, respectively. mAP small, mAP medium, and mAP large are the mAP values for small, medium, and large material. For the granularity detection task, the table reports the total error, the standard deviation, and the values obtained in each particle size range. In the object detection task, the proposed algorithm achieves the highest overall mAP, 66.90%. In the final granularity detection, the proposed algorithm outperforms both the traditional algorithm and the other deep learning-based algorithms, with an error of only 2.10% and a standard deviation of only 83.48.
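The IoU thresholds behind mAP0.5 and mAP0.75 can be illustrated with a small Python sketch. It assumes axis-aligned boxes given as `(x1, y1, x2, y2)`; the helper names and the example boxes are hypothetical, and the full mAP computation (ranking by confidence, averaging precision over recall) is not shown.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match(pred_box, gt_box, threshold):
    """A prediction counts as correct when its IoU reaches the threshold."""
    return iou(pred_box, gt_box) >= threshold


# Hypothetical prediction shifted by 5 pixels from a 40 x 40 ground-truth box.
pred, gt = (10, 10, 50, 50), (15, 15, 55, 55)
print(round(iou(pred, gt), 3), is_match(pred, gt, 0.5), is_match(pred, gt, 0.75))
```

In this example the same detection is accepted at the 0.5 threshold and rejected at 0.75, which is why mAP0.75 is the stricter measure of localization quality.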

As shown in Table 5, on the clay validation set the proposed algorithm again achieves the highest overall mAP on the object detection task, 66.50%. In the final granularity detection, the proposed algorithm outperforms both the traditional algorithm and the other deep learning-based algorithms, with an error of only 2.84% and a standard deviation of only 111.82.

As shown in Table 6 and Figure 8, we further compare the predicted values of the proposed algorithm with the ground truth. Table 6 lists the predicted grain size for eight randomly sampled pieces of conglomerate and seven randomly sampled pieces of clay, together with the physically measured particle size of each piece as ground truth. The error of the proposed algorithm on the sampled material is consistent with the experimental results in Table 4 and Table 5. Figure 8 shows the sampled materials with their detection results and some of the prediction results.
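To make the comparison in Table 6 concrete, the following Python sketch computes the absolute difference and a relative error for each sampled piece. The values are copied from Table 6; the relative error used here is an assumption chosen for illustration and is not necessarily the error metric reported in Table 4 and Table 5.

```python
# Values copied from Table 6: ground truth (GT) and predicted granularity (PG).
table6 = {
    "conglomerate": {
        "gt": [44.52, 38.74, 34.38, 51.70, 9.86, 7.52, 46.78, 14.34],
        "pg": [47.37, 33.81, 31.13, 53.14, 9.57, 7.83, 49.29, 15.86],
    },
    "clay": {
        "gt": [50.94, 12.32, 23.84, 18.76, 21.10, 13.82, 22.44],
        "pg": [52.23, 13.31, 27.47, 15.91, 25.82, 14.96, 26.17],
    },
}


def per_sample_errors(gt, pg):
    """Return (absolute difference, relative error in %) for each sample."""
    return [(abs(p - g), 100.0 * abs(p - g) / g) for g, p in zip(gt, pg)]


for material, values in table6.items():
    errors = per_sample_errors(values["gt"], values["pg"])
    mean_rel = sum(rel for _, rel in errors) / len(errors)
    print(f"{material}: mean relative error {mean_rel:.2f}% over {len(errors)} samples")
```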

To measure the operational efficiency of the proposed algorithm, we use frames per second (FPS). On the proposed dataset, Faster R-CNN achieves 17 FPS and YOLOv3 achieves 20 FPS. The proposed algorithm with dense connections achieves 22 FPS, which makes it superior to Faster R-CNN in both accuracy and efficiency. In addition, the algorithm in this paper is less efficient than YOLOv3 but achieves better detection accuracy.
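A frames-per-second figure of this kind is usually obtained by timing repeated inference calls. The Python sketch below shows one hedged way to do so; `measure_fps`, `dummy_infer`, and the synthetic frames are hypothetical stand-ins, not the measurement code used in the experiments.

```python
import time


def measure_fps(infer, images, warmup=2):
    """Return frames per second of `infer` over `images` after a short warm-up."""
    for image in images[:warmup]:
        infer(image)  # warm-up calls are not timed
    start = time.perf_counter()
    for image in images:
        infer(image)
    elapsed = time.perf_counter() - start
    return len(images) / elapsed if elapsed > 0 else float("inf")


def dummy_infer(image):
    """Hypothetical stand-in for a detector's forward pass."""
    time.sleep(0.001)
    return sum(image)


frames = [[0.0] * 64 for _ in range(20)]
fps = measure_fps(dummy_infer, frames)
print(f"{fps:.1f} FPS")
```

For a GPU model, the timed region would also need to wait for the device to finish before the clock is read, otherwise the measured FPS is optimistic.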

4. Conclusions

This article presents a deep learning-based granularity detection algorithm that obtains the granularity distribution of material collected from a mine dump, as an application of deep learning to mine dump analysis. We first build a conglomerate and clay detection dataset: the material is sampled in the study area and photographed, its particle size is obtained by a screening method, and the resulting images are labeled. The proposed algorithm is trained and validated on this dataset. The algorithm is anchor-free, which improves its generalization ability: it detects the center point of each piece of material and returns the size and granularity information of the material. To handle the large scale variation of the material, a densely connected hourglass network is used as the feature extractor, with dense convolution performing multi-scale feature fusion. The results on the proposed dataset show that dense convolution improves the detection accuracy of the model and reduces the particle size detection error. We also compared mixed training with separate training; the experiments show that mixed training is more effective, indicating that training on both materials improves the discriminative features learned for each. The algorithm can be applied to large-scale remote sensing images of mine dump materials to provide a reference for dump stability monitoring and land reclamation.

Author Contributions

Conceptualization, Z.C. and S.L.; methodology, Z.C.; software, Z.C.; validation, X.L., S.L. and Z.C.; formal analysis, S.L.; investigation, Z.C.; resources, S.L.; data curation, Z.C.; writing—original draft preparation, Z.C.; writing—review and editing, Z.C.; visualization, X.L.; supervision, S.L.; project administration, X.L.; funding acquisition, S.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Key Project of Joint Funds of the National Natural Science Foundation of China, Grant No. U1903209; National Key Research and Development Program of China, Grant No. 2016YFC0501107.

Data Availability Statement

Data sharing is not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

Lei, H.; Peng, Z.; Yigang, H.; Yang, Z. Vegetation and soil restoration in refuse dumps from open pit coal mines. Ecol. Eng. 2016, 94, 638–646.
Bian, Z.; Miao, X.; Lei, S.; Chen, S.E.; Wang, W.; Struthers, S. The challenges of reusing mining and mineral-processing wastes. Science 2012, 337, 702–703.
Zhang, L.; Wang, J.; Feng, Y. Life cycle assessment of opencast coal mine production: A case study in Yimin mining area in China. Environ. Sci. Pollut. Res. 2018, 25, 8475–8486.
Xia, H. Ecological rehabilitation and phytoremediation with four grasses in oil shale mined land. Chemosphere 2004, 54, 345–353.
Li, S.; Di, X.; Wu, D.; Zhang, J. Effects of sewage sludge and nitrogen fertilizer on herbage growth and soil fertility improvement in restoration of the abandoned opencast mining areas in Shanxi, China. Environ. Earth Sci. 2013, 70, 3323–3333.
Upadhyay, O.; Sharma, D.; Singh, D. Factors affecting stability of waste dumps in mines. Int. J. Surface Min. Reclam. Environ. 1990, 4, 95–99.
Tovele, G.; Han, L.; Shu, J.S. Variation of Open-Pit Waste Dump Specimens under Effective Pressure Influence. Front. Earth Sci. 2021, 8, 704.
Yellishetty, M.; Darlington, W.J. Effects of monsoonal rainfall on waste dump stability and respective geo-environmental issues: A case study. Environ. Earth Sci. 2011, 63, 1169–1177.
Wang, J.; Zhang, M.; Bai, Z.; Guo, L. Multi-fractal characteristics of the particle distribution of reconstructed soils and the relationship between soil properties and multi-fractal parameters in an opencast coal-mine dump in a loess area. Environ. Earth Sci. 2015, 73, 4749–4762.
Wang, G.; Kong, X.; Gu, Y.; Yang, C. Research on slope stability analysis of super-high dumping site based on cellular automaton. Procedia Eng. 2011, 12, 248–253.
Shrivastava, S.; Deb, D.; Bhattacharjee, S. Prediction of Particle Size Distribution Curves of Dump Materials Using Convolutional Neural Networks. Rock Mech. Rock Eng. 2022, 55, 471–479.
Zhang, S.; Liu, W. Application of aerial image analysis for assessing particle size segregation in dump leaching. Hydrometallurgy 2017, 171, 99–105.
Hamzeloo, E.; Massinaei, M.; Mehrshad, N. Estimation of particle size distribution on an industrial conveyor belt using image analysis and neural networks. Powder Technol. 2014, 261, 185–190.
Pooja, K.; Rajesh, R. Image segmentation: A survey. In Recent Advances in Mathematics, Statistics and Computer Science; World Scientific: Singapore, 2016; pp. 521–527.
Vincent, L. Morphological algorithms. In Mathematical Morphology in Image Processing; CRC Press: Boca Raton, FL, USA, 2018; pp. 255–288.
Krizhevsky, A.; Sutskever, I.; Hinton, G.E. Imagenet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. 2012, 25, 1097–1105.
Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv 2014, arXiv:1409.1556.
Szegedy, C.; Liu, W.; Jia, Y.; Sermanet, P.; Reed, S.; Anguelov, D.; Erhan, D.; Vanhoucke, V.; Rabinovich, A. Going deeper with convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 1–9.
Szegedy, C.; Vanhoucke, V.; Ioffe, S.; Shlens, J.; Wojna, Z. Rethinking the inception architecture for computer vision. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 26 June–1 July 2016; pp. 2818–2826.
Szegedy, C.; Ioffe, S.; Vanhoucke, V.; Alemi, A.A. Inception-v4, inception-resnet and the impact of residual connections on learning. In Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, San Francisco, CA, USA, 4–10 February 2017.
He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
Long, J.; Shelhamer, E.; Darrell, T. Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 3431–3440.
Ronneberger, O.; Fischer, P.; Brox, T. U-net: Convolutional networks for biomedical image segmentation. In Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention, Munich, Germany, 5–9 October 2015; pp. 234–241.
Chen, L.C.; Papandreou, G.; Kokkinos, I.; Murphy, K.; Yuille, A.L. Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 40, 834–848.
Iwaszenko, S.; Róg, L. Application of Deep Learning in Petrographic Coal Images Segmentation. Minerals 2021, 11, 1265.
Zhao, Z.Q.; Zheng, P.; Xu, S.T.; Wu, X. Object detection with deep learning: A review. IEEE Trans. Neural Netw. Learn. Syst. 2019, 30, 3212–3232.
Girshick, R.; Donahue, J.; Darrell, T.; Malik, J. Rich feature hierarchies for accurate object detection and semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014; pp. 580–587.
Girshick, R. Fast r-cnn. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 13–16 December 2015; pp. 1440–1448.
Ren, S.; He, K.; Girshick, R.; Sun, J. Faster r-cnn: Towards real-time object detection with region proposal networks. Adv. Neural Inf. Process. Syst. 2015, 28, 91–99.
He, K.; Gkioxari, G.; Dollár, P.; Girshick, R. Mask r-cnn. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 2961–2969.
Cai, Z.; Vasconcelos, N. Cascade r-cnn: Delving into high quality object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 6154–6162.
Pang, J.; Chen, K.; Shi, J.; Feng, H.; Ouyang, W.; Lin, D. Libra r-cnn: Towards balanced learning for object detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 16–20 June 2019; pp. 821–830.
Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You only look once: Unified, real-time object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 779–788.
Redmon, J.; Farhadi, A. YOLO9000: Better, faster, stronger. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 7263–7271.
Redmon, J.; Farhadi, A. Yolov3: An incremental improvement. arXiv 2018, arXiv:1804.02767.
Liu, W.; Anguelov, D.; Erhan, D.; Szegedy, C.; Reed, S.; Fu, C.Y.; Berg, A.C. Ssd: Single shot multibox detector. In Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands, 8–16 October 2016; pp. 21–37.
Fu, C.Y.; Liu, W.; Ranga, A.; Tyagi, A.; Berg, A.C. Dssd: Deconvolutional single shot detector. arXiv 2017, arXiv:1701.06659.
Lin, T.Y.; Goyal, P.; Girshick, R.; He, K.; Dollár, P. Focal loss for dense object detection. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 2980–2988.
Tian, Z.; Shen, C.; Chen, H.; He, T. Fcos: Fully convolutional one-stage object detection. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Korea, 27 October–3 November 2019; pp. 9627–9636.
Law, H.; Deng, J. Cornernet: Detecting objects as paired keypoints. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 734–750.
Zhou, X.; Wang, D.; Krähenbühl, P. Objects as points. arXiv 2019, arXiv:1904.07850.
Paszke, A.; Gross, S.; Massa, F.; Lerer, A.; Bradbury, J.; Chanan, G.; Killeen, T.; Lin, Z.; Gimelshein, N.; Antiga, L.; et al. Pytorch: An imperative style, high-performance deep learning library. Adv. Neural Inf. Process. Syst. 2019, 32, 8026–8037.
Imamura, K.; Sumita, E. Long warm-up and self-training: Training strategies of NICT-2 NMT system at WAT-2019. In Proceedings of the 6th Workshop on Asian Translation, Hong Kong, China, 4 November 2019; pp. 141–146.

Figure 1. Illustration of the watershed algorithm. Image is the original input image. Thresh is the result of grayscale binarization. Dist is the result of the distance function processing. Marker extracts the foreground information. Contour is the result of finding the objects. Mask is the final segmentation result.

Figure 2. Illustration of semantic segmentation and object detection annotation. Semantic segmentation requires pixel-level annotation while object detection requires only bounding boxes.

Figure 3. Illustration of the full process for producing a dataset. Firstly, the material collection is performed in a specific mine dump. Secondly, post-processing of the collected material is performed indoors. Finally, the data are annotated.

Figure 4. Illustration of deep learning-based algorithm for conglomerate and clay detection. The algorithm is divided into a feature extraction stage and a detection head part.

Figure 5. Illustration of output of the three branches of the proposed algorithm. (a) The location of the center point of the heatmap branch output. (b) Offset branches are used to predict the centroid position offset. (c) The size branch is used to predict the target width and height.

Figure 6. The illustration of our proposed dataset CCD.

Figure 7. The visualization of our comparison experiments.

Figure 8. The illustration of sampling materials and their detection and prediction results.

Table 1. Material particle size distribution of the validation set produced for each of the five samples.

Part	Material	<10	10–15	15–20	20–25	25–30	30–40	40–50	>50
1	conglomerate	3.50%	1.75%	0.00%	0.00%	19.30%	31.58%	36.74%	7.02%
1	clay	8.82%	0.00%	2.94%	17.65%	29.41%	20.59%	17.65%	2.94%
2	conglomerate	0.00%	13.64%	31.82%	9.09%	27.27%	13.64%	0.00%	4.55%
2	clay	12.50%	12.50%	8.33%	25.00%	12.50%	16.67%	12.50%	0.00%
3	conglomerate	4.17%	16.67%	16.67%	8.33%	20.83%	20.83%	12.50%	0.00%
3	clay	0.00%	14.81%	25.93%	14.81%	11.11%	33.33%	0.00%	0.00%
4	conglomerate	3.85%	0.00%	30.77%	30.77%	19.23%	11.54%	3.85%	0.00%
4	clay	0.00%	3.13%	6.25%	18.75%	31.25%	31.25%	6.25%	3.13%
5	conglomerate	11.11%	3.70%	14.81%	7.41%	14.81%	33.33%	11.11%	3.70%
5	clay	7.14%	14.29%	7.14%	17.86%	7.14%	39.29%	7.14%	0.00%
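A distribution over these size ranges can be produced by simple binning. The Python sketch below is illustrative only: it assumes the percentages are count fractions, that each range includes its lower bound and excludes its upper bound, and that sizes are expressed in the same unit as the bin edges. The example input reuses the eight conglomerate ground-truth values from Table 6.

```python
from bisect import bisect_right

# Bin edges and labels taken from the column headers of Table 1.
EDGES = [10, 15, 20, 25, 30, 40, 50]
LABELS = ["<10", "10–15", "15–20", "20–25", "25–30", "30–40", "40–50", ">50"]


def size_distribution(sizes):
    """Return the percentage of particles falling into each size range."""
    counts = [0] * len(LABELS)
    for size in sizes:
        # Assumption: a size equal to an edge goes into the higher range.
        counts[bisect_right(EDGES, size)] += 1
    total = len(sizes)
    if total == 0:
        return {label: 0.0 for label in LABELS}
    return {label: 100.0 * c / total for label, c in zip(LABELS, counts)}


# Conglomerate ground-truth sizes from Table 6, used as example input.
sizes = [44.52, 38.74, 34.38, 51.70, 9.86, 7.52, 46.78, 14.34]
for label, percent in size_distribution(sizes).items():
    print(f"{label}\t{percent:.2f}%")
```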

Table 2. Ablation experimental results for dense convolution.

Material	Network	Error	Standard Deviation
conglomerate	Hg-Center	2.29%	96.32
conglomerate	Dense-Hg-Center	2.10%	83.48
clay	Hg-Center	3.98%	137.22
clay	Dense-Hg-Center	2.84%	111.82

Table 3. Ablation experimental results for training strategy.

Material	Network	Error	Standard Deviation
conglomerate	Dense-Hg-Center-Sep	2.45%	90.60
conglomerate	Dense-Hg-Center-Mix	2.10%	83.48
clay	Dense-Hg-Center-Sep	3.75%	125.75
clay	Dense-Hg-Center-Mix	2.84%	111.82

Table 4. Comparative experiments performed on the conglomerate validation set in CCD.

Task	Evaluation Metrics	Watershed	Faster R-CNN	YOLOv3	Ours
object detection task	mAP	-	12.10%	58.70%	66.90%
	mAP0.5	-	35.10%	98.70%	98.60%
	mAP0.75	-	5.00%	64.60%	82.30%
	mAP small	-	0.00%	45.10%	40.30%
	mAP medium	-	8.30%	58.10%	66.90%
	mAP large	-	38.90%	71.90%	78.40%
granularity detection task	error	5.34%	10.53%	2.78%	2.10%
	standard deviation	434.88	486.43	104.15	83.48
	<10	14.61%	0.00%	2.04%	3.01%
	10–15	28.53%	0.03%	11.97%	12.77%
	15–20	23.18%	4.00%	22.86%	23.34%
	20–25	14.47%	23.17%	25.60%	22.37%
	25–30	8.98%	29.63%	12.63%	13.04%
	30–40	6.79%	26.09%	13.47%	13.84%
	40–50	3.09%	10.61%	7.52%	7.62%
	>50	0.34%	6.48%	3.91%	4.01%

Table 5. Comparative experiments performed on the clay validation set in CCD.

Task	Evaluation Metrics	Watershed	Faster R-CNN	YOLOv3	Ours
object detection task	mAP	-	10.00%	56.30%	66.50%
	mAP0.5	-	34.10%	97.20%	97.70%
	mAP0.75	-	2.70%	61.90%	76.80%
	mAP small	-	0.00%	36.00%	34.30%
	mAP medium	-	7.70%	56.10%	66.50%
	mAP large	-	29.40%	68.20%	80.70%
granularity detection task	error	7.99%	8.21%	3.59%	2.84%
	standard deviation	416.35	449.29	127.39	111.82
	<10	11.76%	0.00%	2.37%	2.20%
	10–15	30.55%	0.20%	9.23%	7.19%
	15–20	24.66%	5.60%	24.59%	24.48%
	20–25	14.32%	26.74%	25.28%	20.62%
	25–30	8.79%	30.99%	13.68%	17.09%
	30–40	7.94%	21.51%	16.78%	18.89%
	40–50	1.84%	9.39%	5.31%	6.34%
	>50	0.14%	5.57%	2.77%	3.19%

Table 6. Comparison between physical measured value and predicted value. GT represents the ground truth of the material. PG indicates the predicted granularity value.

	Conglomerate Sample								Clay Sample
	1	2	3	4	5	6	7	8	1	2	3	4	5	6	7
GT	44.52	38.74	34.38	51.70	9.86	7.52	46.78	14.34	50.94	12.32	23.84	18.76	21.10	13.82	22.44
PG	47.37	33.81	31.13	53.14	9.57	7.83	49.29	15.86	52.23	13.31	27.47	15.91	25.82	14.96	26.17

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
