Rock Classification in a Vanadiferous Titanomagnetite Deposit Based on Supervised Machine Learning

Shin, Youngjae; Shin, Seungwook

doi:10.3390/min12040461

by Youngjae Shin and Seungwook Shin *

Mineral Resources Research Division, Korea Institute of Geoscience and Mineral Resources, Daejeon 34132, Korea

* Author to whom correspondence should be addressed.

Minerals 2022, 12(4), 461; https://doi.org/10.3390/min12040461

Submission received: 7 March 2022 / Revised: 6 April 2022 / Accepted: 7 April 2022 / Published: 10 April 2022

(This article belongs to the Special Issue GIS, AI, and Modelling of Mineralization Process and Prospectivity)

Abstract:

As the potential locations of undiscovered ore deposits become deeper, a technique for predicting promising areas in the subsurface media has become necessary. Geoscience data on a wide range of underground media can be obtained through geophysical field exploration, but integration and interpretation of multi-geophysical data are difficult because of differences in spatial resolution. We developed a rock classifier that can predict promising vanadiferous titanomagnetite deposits from multi-geophysical data using supervised machine learning. Vanadiferous titanomagnetite ores are the main source of vanadium, which can be used in large-scale energy storage systems. Model training was conducted using rock samples from drilling cores, and the density of rock samples was used as a criterion for data labeling. We employed the support vector machine, random forest, extreme gradient boosting, LightGBM, and deep neural network for supervised learning, and the accuracy of all methods was 0.95 or greater. We applied trained models to three-dimensional geophysical field data to predict ore body locations. These candidate regions were distributed in the northeast of the geophysical survey area, and some classified areas were verified using a geological map.

Keywords:

supervised machine learning; geophysical survey; rock property measurements; titanomagnetite

1. Introduction

Mineral exploration is the first step in mine development. As the number of discovered deposits has decreased, effective exploration methods are becoming increasingly important for further development [1,2,3]. Comprehensive geoscience approaches, including geophysical, geological, and geochemical surveys, as well as drilling, contribute to the identification of ore deposits. Surveys are conducted depending on the type and location of the ore deposit, and potential ore zones are identified through the analysis of integrated data [4].

Geophysical surveys provide a wide range of information about the subsurface media. These surveys aim to detect anomalous signals generated by geophysical sources and include seismic, magnetic, electromagnetic, electrical resistivity, and induced polarization surveys [5]. To increase the success of exploration, multi-geophysical surveys, which include two or more survey types, are conducted with consideration of the physical properties of the target mineral. In the case of magnetite ore, airborne magnetic surveys are useful because magnetite is strongly magnetic, and electrical resistivity and induced polarization surveys can also be performed to investigate local structures in detail. However, the interpretation of multi-geophysical data is challenging. Signals recorded from each geophysical source have different spatial resolutions, complicating quantitative analysis. Furthermore, joint inversion of multiple geophysical data types requires optimization of the objective function and regularization, and initial model selection based on various geophysical properties may be difficult [6,7].

Deeper target ore bodies are associated with more complex geological models, and machine learning (ML) has become an attractive alternative for processing such geoscience data [8,9,10]. Because ML solves problems such as classification and regression using a mathematical model learned from a large amount of data rather than hand-crafted algorithms, it can efficiently predict candidate ore bodies without physical intervention [11]. Most previous ML studies related to mineral exploration conducted mineral prospectivity mapping (MPM) [4,12,13,14]. An ML-based MPM predicts prospective ore zones using ML models trained with geoscience data. However, this method is effective only for brownfield sites where abundant geoscience data are available, and where “prospective” and “non-prospective” zones have been investigated prior to classification. Exploring potential ore zones using geophysical data for a greenfield site where data are scarce can compensate for the limitation of existing MPMs, and the application of ML to multi-geophysical data can alleviate the problem of interpreting data with different spatial resolutions. However, geophysical properties obtained through field surveys do not by themselves indicate whether a location hosts ore, which results in a lack of labeled data. Without labeled data, supervised ML of the target lithology is not possible.

In this paper, we describe an ML-based rock classifier trained on geophysical properties that were measured in a laboratory experiment on samples from drilling cores, and we apply the trained model to multi-geophysical field data. The study area is the Gonamsan intrusion in South Korea. We used electrical resistivity, chargeability, and magnetic susceptibility as features for supervised learning, which correspond to the inversion results of the electrical resistivity, induced polarization, and magnetic surveys, respectively. The ML models used for training include support vector machine (SVM), random forest (RF), extreme gradient boosting (XGB), LightGBM (LGBM), and deep neural network (DNN) models. We validate our trained models using multi-geophysical data collected in nearby areas.

2. Study Area

The study area is the Gonamsan intrusion (851–873 Ma) in the northeastern region of South Korea. The main target is the Yeoncheon vanadiferous titanomagnetite (TM) deposits, which are located within the Gonamsan intrusion. These TM deposits have been mined since 1934, and are currently being exploited by Samyang Resources in the Gwanin magnetite mine. Although investigations using geoscience approaches have been performed over several decades [15,16], the subsurface structures have not been fully delineated due to the limitations of exploration technology and high cost. Therefore, an additional survey will contribute to the detection of unidentified ore deposits.

Figure 1 shows a geological map of the Gonamsan intrusion. The intrusion extends about 3 km in the north-south direction and 1.5 km in the east-west direction. The rock types in the intrusion include vanadiferous titanomagnetite ore (Fe-Ti-V), oxide gabbro, monzogabbro-monzodiorite, and quartz-monzodiorite [17]. Our target is an orthomagmatic deposit differentiated from alkali gabbro magmas in the middle Proterozoic. Because mafic rocks are readily distinguished from surrounding rocks based on their geophysical properties, a rock classifier can help to determine the locations of ore bodies in the subsurface media from geophysical field data.

3. Methods

3.1. Rock Samples from Drilling Cores and Magnetite Mine

In the Gonamsan intrusion, the Korea Institute of Geoscience and Mineral Resources (KIGAM) conducted a total of seven drilling operations in 2019 and 2020. The locations of the wells are shown in Figure 1; they are near the Gwanin magnetite mine. The drilling depth was about 300 m. We obtained 541 rock samples at different depths from drilling cores and classified their lithofacies into low-grade ore, gabbro, quartz-monzodiorite, metamorphic, and dyke. Because drilling cores do not contain high-grade ore (HO), we added 55 ore rock samples from the Gwanin magnetite mine.

We used density as the criterion for labeling data in the multi-class classification problem, as magnetite has greater density than surrounding rocks. We measured the dry density of rock samples by the buoyancy method using water-saturated and dehydrated grain mass [18]. The first class consists of rock samples containing HO, with a density above 4.40 g/cm³. The average density of HO from the Gwanin magnetite mine was 4.57 g/cm³, with a small standard deviation (0.06 g/cm³). Thus, we selected 4.40 g/cm³, which is around the minimum density of HO, as the criterion for the first class. The second class contains candidate ore (CO), with a density ranging from 3.50 to 4.40 g/cm³. The value of 3.50 g/cm³ is the upper boundary of the density range of gabbro [19]. Because our targets are differentiated from alkali gabbro magmas, the potential ore zone is expected to have greater density than gabbro. The third class, host rock (HOST), has a density of 3.50 g/cm³ or less, and its physical properties are most distinct from those of ore deposits. The number of rock samples analyzed in the HO, CO, and HOST classes was 56, 125, and 415, respectively, for a total of 596. Figure 2 shows classified rock samples in each density class.
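The labeling rule described above can be written as a short function. This is a minimal sketch: the thresholds come from the text, while the function name and the example densities are hypothetical, and the handling of values exactly on a boundary follows our reading of "above 4.40" and "3.50 or less".

```python
# Density thresholds in g/cm^3, taken from the text.
HO_MIN_DENSITY = 4.40    # strictly above this value: high-grade ore (HO)
HOST_MAX_DENSITY = 3.50  # at or below this value: host rock (HOST)


def label_by_density(density_g_cm3: float) -> str:
    """Return the class label (HO, CO, or HOST) for a dry density in g/cm^3."""
    if density_g_cm3 > HO_MIN_DENSITY:
        return "HO"
    if density_g_cm3 > HOST_MAX_DENSITY:
        return "CO"
    return "HOST"


# Illustrative densities only; these are not measurements from the study.
example_densities = [4.57, 3.90, 3.50, 2.75]
example_labels = [label_by_density(d) for d in example_densities]
```

Because the label depends only on density, the three geophysical properties remain available as independent input features.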

3.2. Laboratory Experiment

KIGAM conducted geophysical field exploration, including electrical resistivity, induced polarization, and airborne magnetic surveys, from 2019 to 2021. Through inversion of the geophysical data resulting from such surveys, electrical resistivity, chargeability, and magnetic susceptibility can be calculated.

To use the geophysical properties of drilling cores as features for training, we measured the electrical resistivity, chargeability, and magnetic susceptibility of all 596 rock samples [17]. Resistivity (ρ) is the material property in Ohm’s law that quantifies the opposition to direct current; it is calculated from the measured resistance together with the cross-sectional area and length of the rock sample. As metals generally have high electrical conductivity, i.e., low resistivity, the resistivity distribution is useful for the exploration of magnetite deposits [20]. Chargeability (mV/V) represents the overvoltage effect of induced polarization. When the current between the electrodes is switched off, the voltage does not immediately reach zero, and an overvoltage remains for a short time due to the effect of polarization [21]. In the time domain, the chargeability is theoretically the ratio of the overvoltage remaining immediately after the current is cut off to the voltage measured while the current was flowing, but obtaining the overvoltage at the moment at which the current is cut off is difficult. Therefore, the integral of the decaying voltage over a specific time period is generally used, which is known as the apparent chargeability (ms). Magnetic susceptibility (dimensionless) is a physical property defined by the ratio of the magnetization of the rock to the magnetic field strength and is an important property for the exploration of magnetite, as basic and ultra-basic rocks have high magnetic susceptibility [17]. Figure 3 shows the distribution of geophysical properties using histograms of logarithmic values. The geophysical properties span a wide range even after log transformation, especially electrical resistivity and magnetic susceptibility.
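The definitions above can be made concrete with a small numerical sketch. It assumes the standard relations ρ = R·A/L for a sample of uniform cross-section, chargeability as overvoltage divided by primary voltage, and apparent chargeability as the time integral of the decay curve normalized by the primary voltage. The function names and the input values are hypothetical and do not reproduce the laboratory procedure of [17].

```python
def resistivity_ohm_m(voltage_v, current_a, area_m2, length_m):
    """Resistivity from Ohm's law: rho = (V / I) * A / L."""
    resistance_ohm = voltage_v / current_a
    return resistance_ohm * area_m2 / length_m


def chargeability_mv_per_v(overvoltage_v, primary_v):
    """Theoretical chargeability: overvoltage over primary voltage, in mV/V."""
    return 1000.0 * overvoltage_v / primary_v


def apparent_chargeability_ms(times_ms, decay_v, primary_v):
    """Trapezoidal integral of the decay curve divided by the primary voltage.

    With the decay and primary voltages in the same unit and time in ms,
    the result has units of ms.
    """
    area = sum(
        0.5 * (v0 + v1) * (t1 - t0)
        for t0, t1, v0, v1 in zip(times_ms, times_ms[1:], decay_v, decay_v[1:])
    )
    return area / primary_v


# Illustrative inputs only (not measured values).
rho = resistivity_ohm_m(voltage_v=2.0, current_a=0.001, area_m2=0.002, length_m=0.1)
m_app = apparent_chargeability_ms([0.0, 10.0, 20.0], [0.02, 0.01, 0.0], primary_v=1.0)
```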

The geophysical properties of the HO, CO, and HOST classes are shown in Figure 4. Higher density classes have lower electrical resistivity, along with higher chargeability and magnetic susceptibility. These trends are consistent with the characteristics of magnetite ore outlined in the previous paragraph. Therefore, the measured geophysical properties can be used as features for supervised learning to classify rock samples.

3.3. Data Preprocessing

To ensure that training is unbiased, the skewness and kurtosis of the data must be checked [22]. Table 1 shows the skewness and kurtosis of the raw and transformed training data. As the raw data for electrical resistivity and magnetic susceptibility are quite skewed, we applied log transformation to these parameters, while square root transformation was applied to chargeability to remove its relatively mild skewness. The absolute skewness of all transformed data was reduced after each transformation, and kurtosis was acceptable.
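The effect of these transformations on skewness can be illustrated as follows. This is a sketch under stated assumptions: the data are synthetic (a log-normal stand-in for resistivity and a gamma-distributed stand-in for chargeability), the log base is assumed to be 10, and skewness and excess kurtosis are computed from population moments, which may differ from the estimator behind Table 1.

```python
import math
import random


def central_moment(values, order):
    """Population central moment of the given order."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** order for v in values) / len(values)


def skewness(values):
    """Moment-based skewness: m3 / m2^1.5."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    """Moment-based excess kurtosis: m4 / m2^2 - 3."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3.0


# Synthetic stand-ins for the measured properties (assumption, not study data).
rng = random.Random(0)
raw_resistivity = [rng.lognormvariate(6.0, 1.5) for _ in range(500)]
raw_chargeability = [rng.gammavariate(2.0, 5.0) for _ in range(500)]

# Log transform for the strongly skewed feature, square root for the mild one.
log_resistivity = [math.log10(v) for v in raw_resistivity]
sqrt_chargeability = [math.sqrt(v) for v in raw_chargeability]

for name, raw, transformed in (
    ("resistivity", raw_resistivity, log_resistivity),
    ("chargeability", raw_chargeability, sqrt_chargeability),
):
    print(name, round(skewness(raw), 2), "->", round(skewness(transformed), 2))
```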

We split the data into a training dataset (80%) and a test dataset (20%). The number of rock samples for the HO, CO, and HOST classes of the training dataset was 45, 100, and 331, respectively. To compensate for the imbalance of data, we oversampled HO and CO data and undersampled HOST data, which made the number of samples in each class equal to 150. The Synthetic Minority Oversampling Technique (SMOTE) based on the k-nearest neighbor algorithm was used to create synthetic HO and CO data [23].
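The resampling step can be sketched in a few lines. The code below is a simplified SMOTE-style interpolation written for illustration; it is not the implementation used in the study or in any library. The class sizes (45, 100, 331) and the target of 150 come from the text, whereas the feature values, the choice of k = 5, and all function names are hypothetical.

```python
import math
import random


def smote_like_oversample(samples, n_target, k=5, rng=None):
    """Grow a class to n_target samples by interpolating between a random
    sample and one of its k nearest neighbours within the same class."""
    if rng is None:
        rng = random.Random(0)
    result = list(samples)
    while len(result) < n_target:
        base = rng.choice(samples)
        neighbours = sorted(
            (s for s in samples if s is not base),
            key=lambda s: math.dist(base, s),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to other
        result.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return result


def undersample(samples, n_target, rng=None):
    """Randomly keep n_target samples of the majority class."""
    if rng is None:
        rng = random.Random(0)
    return rng.sample(samples, n_target)


rng = random.Random(42)


def random_points(n, centre):
    """Synthetic three-feature samples around a class centre (illustrative)."""
    return [tuple(rng.gauss(c, 1.0) for c in centre) for _ in range(n)]


train = {
    "HO": random_points(45, (1.0, 3.0, 0.0)),
    "CO": random_points(100, (2.5, 2.0, -1.5)),
    "HOST": random_points(331, (4.0, 1.0, -3.0)),
}
balanced = {
    "HO": smote_like_oversample(train["HO"], 150, rng=rng),
    "CO": smote_like_oversample(train["CO"], 150, rng=rng),
    "HOST": undersample(train["HOST"], 150, rng=rng),
}
```

Resampling is applied to the training set only, so the test set keeps the original class proportions.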

3.4. ML for Rock Classification

To generate a rock classifier using ML methods, we devised SVM, RF, XGB, and LGBM models using the Scikit-learn library [24], and a DNN model with the Keras package [25]. Figure 5 shows schematic diagrams of each method.

SVM solves classification and regression problems using hyperplanes determined from the maximum margin between groups [26]. Soft margins were used to tolerate samples that fall inside or on the wrong side of the margin, and kernel tricks were used for nonlinear classification. These tricks replace the inner product with kernel functions to reduce the computational cost incurred when mapping low-dimensional spaces into high-dimensional spaces. The kernel functions included linear, polynomial, and Gaussian radial basis functions.

RF is an ensemble learning method based on a decision tree model [27]. The decision tree model has a layered structure with edges and nodes and performs classification or regression by breaking a dataset down into smaller subsets (Figure 5b). However, a model based on a single decision tree is vulnerable to overfitting, complicating its application to other datasets. RF, in contrast, can derive a generalized solution through voting on results from multiple tree models. To reduce the correlations between tree models, bagging (bootstrap aggregating) and randomized node optimization are used.

XGB is a representative gradient boosting method (GBM) that provides parallel computation [28]. While the bagging method collects values separately from each tree model, GBM fits each new model to the residuals of the previous models and reduces the errors therein using gradient descent algorithms. LGBM is an improved GBM algorithm that employs leaf-wise (vertical) growth in the tree model [29]. This algorithm can reduce the calculation time and memory requirements while retaining good accuracy. The overfitting issue occurring with leaf-wise growth can be alleviated through the selection of appropriate hyperparameters.

DNN is a type of artificial neural network that contains two or more hidden layers [30]. Figure 5c illustrates the basic structure of the DNN, which consists of input, hidden, and output layers. The nodes of each layer are connected to those of the adjacent layer through weights and biases with activation functions. The activation function types include sigmoid, hyperbolic tangent, and rectified linear unit (ReLU); we selected a function according to the type of problem and training data. The network is trained by updating each weight and bias while reducing errors between the target and output through back-propagation. The structure with multiple hidden layers contributes to the solution of complex non-linear problems.
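To make the layer structure of the DNN concrete, the sketch below runs a single forward pass through two hidden layers with ReLU activation and a three-node softmax output, one node per class. The weights, biases, layer widths, and input values are arbitrary illustrative numbers, not the trained network or the architecture in Table 2; in practice they would be learned by back-propagation in Keras.

```python
import math


def relu(values):
    """Rectified linear unit applied element-wise."""
    return [max(0.0, v) for v in values]


def softmax(values):
    """Softmax with max-shift for numerical stability."""
    shift = max(values)
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def dense(inputs, weights, biases):
    """Fully connected layer; weights[j] holds the input weights of node j."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


# Arbitrary illustrative parameters: 3 features -> 4 nodes -> 4 nodes -> 3 classes.
W1 = [[0.5, -0.2, 0.1], [-0.3, 0.8, 0.4], [0.2, 0.2, -0.6], [0.7, -0.5, 0.3]]
B1 = [0.0, 0.1, -0.1, 0.05]
W2 = [[0.3, -0.1, 0.2, 0.4], [-0.6, 0.5, 0.1, 0.0], [0.2, 0.2, -0.3, 0.1], [0.1, -0.4, 0.6, 0.2]]
B2 = [0.0, 0.0, 0.1, -0.1]
W3 = [[0.4, -0.2, 0.3, 0.1], [-0.1, 0.6, -0.3, 0.2], [0.2, 0.1, 0.5, -0.4]]
B3 = [0.0, 0.0, 0.0]

# Scaled (log resistivity, sqrt chargeability, log susceptibility); hypothetical.
features = [-1.2, 0.8, 1.5]
hidden1 = relu(dense(features, W1, B1))
hidden2 = relu(dense(hidden1, W2, B2))
probabilities = softmax(dense(hidden2, W3, B3))  # one value each for HO, CO, HOST
```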

3.5. Optimizing the Hyperparameters of ML Methods

Each ML method has hyperparameters, which have fixed values during training. Because the performance of the model is affected by the values of these hyperparameters, they should be optimized according to the dataset. Table 2 shows the optimal hyperparameters for each ML method obtained from the grid search algorithm with five-fold cross-validation based on accuracy. The grid search algorithm compared the scores of all combinations defined by the user and identified the optimal hyperparameter set. Other hyperparameters were set to the default values of software packages. For DNN, the number of nodes and activation functions in the last layer were fixed to 3 and the softmax function, respectively.
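The grid search with five-fold cross-validation can be sketched without any ML library. In the example below, a k-nearest-neighbour classifier stands in for the five models of the study purely to keep the code short. The hyperparameter names (`k`, `metric_p`), the grid values, and the synthetic data are assumptions and are unrelated to Table 2.

```python
import itertools
import random
from collections import Counter


def knn_predict(train_X, train_y, x, k, metric_p):
    """Stand-in classifier: majority vote among the k nearest neighbours."""
    ranked = sorted(
        (sum(abs(a - b) ** metric_p for a, b in zip(row, x)), label)
        for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def cross_val_accuracy(X, y, params, n_folds=5):
    """Mean accuracy over n_folds interleaved folds."""
    scores = []
    for fold in range(n_folds):
        held_out = list(range(fold, len(X), n_folds))
        held = set(held_out)
        tr_X = [X[i] for i in range(len(X)) if i not in held]
        tr_y = [y[i] for i in range(len(X)) if i not in held]
        hits = sum(knn_predict(tr_X, tr_y, X[i], **params) == y[i] for i in held_out)
        scores.append(hits / len(held_out))
    return sum(scores) / n_folds


def grid_search(X, y, grid, n_folds=5):
    """Score every combination in the grid and return the best one."""
    names = sorted(grid)
    best_params, best_score = None, -1.0
    for combo in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, combo))
        score = cross_val_accuracy(X, y, params, n_folds)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Synthetic, shuffled three-class data (illustrative only).
rng = random.Random(1)
X, y = [], []
for label, centre in (("HO", 0.0), ("CO", 3.0), ("HOST", 6.0)):
    for _ in range(30):
        X.append(tuple(rng.gauss(centre, 1.0) for _ in range(3)))
        y.append(label)
order = list(range(len(X)))
rng.shuffle(order)
X = [X[i] for i in order]
y = [y[i] for i in order]

grid = {"k": [1, 3, 5], "metric_p": [1, 2]}
best_params, best_score = grid_search(X, y, grid)
```

The cost grows with the product of the grid sizes, which is why the search is restricted to a user-defined set of combinations.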

4. Results

4.1. Validation of ML Methods

Using the optimal hyperparameters listed in Table 2, we evaluated the ML methods via metrics including accuracy, recall, precision, and F1 score, as shown in Table 3. For all metrics, DNN had higher scores than the other tested methods, but most other methods had scores greater than 0.95. Therefore, the rock samples could be classified with supervised ML based on the geophysical properties obtained through the laboratory experiment. The limitations of this evaluation include the small number of samples and the imbalance among classes. The total number of samples in the test dataset was 120, including 11 for HO, 25 for CO, and 84 for HOST. The HO rock samples in the test dataset were properly classified by all ML methods, indicating that they are readily distinguished from other classes.
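For reference, the four metrics can be computed from true and predicted labels as sketched below. The averaging scheme for the multi-class precision, recall, and F1 score is not stated in the text; macro averaging (an unweighted mean over classes) is assumed here, and the labels in the example are hypothetical rather than the test-set predictions of the study.

```python
def macro_scores(y_true, y_pred, labels):
    """Accuracy plus macro-averaged precision, recall, and F1 score."""
    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(labels)
    return {
        "accuracy": sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Hypothetical labels: one CO sample is misclassified as HOST.
y_true = ["HO", "HO", "CO", "CO", "CO"] + ["HOST"] * 5
y_pred = ["HO", "HO", "CO", "CO", "HOST"] + ["HOST"] * 5
scores = macro_scores(y_true, y_pred, labels=("HO", "CO", "HOST"))
```

With imbalanced classes such as those in the test dataset, macro averaging weights the small HO class as heavily as the large HOST class, which accuracy alone does not.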

4.2. Application to Geophysical Field Data

We applied the trained models to the geophysical field data collected by KIGAM in 2019. An extensive airborne magnetic survey was conducted first, which covered a rectangular area (3 km × 5 km) including the Gonamsan intrusion. From the magnetic anomalies, we determined the most appropriate area for electrical resistivity and induced polarization surveys in consideration of accessibility; survey locations are represented by red dashed lines in Figure 1. The electrical resistivity and induced polarization surveys were conducted on three parallel profile lines using a SuperSting R8/IP (Advanced Geosciences, Cedar Park, TX, USA) instrument, and three-dimensional (3D) inversion was performed to obtain geophysical properties. The details of each inversion process are beyond the scope of this paper. Figure 6 illustrates the 3D inverted electrical resistivity, chargeability, and magnetic susceptibility data obtained from each survey. Originally, the magnetic inversion covered a larger region, which was downsized to the local area to match the scales of the other two surveys. The anomalous area, which is common among the three sets of inversion results, is located in the northeast of the exploration area and has relatively low electrical resistivity, but high chargeability and magnetic susceptibility.

Before applying the trained models to field data, it is necessary to examine the correlations between geophysical properties obtained by inversion of field data. As described in Section 3.2, higher density classes of rock samples have lower electrical resistivity and higher chargeability and magnetic susceptibility. However, because the resolutions of the inversion results differ, the correlations between the inverted geophysical properties may not be consistent with those between the geophysical properties of rock samples. Table 4 shows the correlations between the transformed geophysical properties of rock samples and field data; mean and standard deviation values are also provided. The statistics for rock samples were obtained from the 596 samples used in training, and those for field data were obtained from 382,084 points of the inversion results. The correlations in Table 4 for the rock samples and the field data are not identical, but they have the same sign and do not differ greatly.
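The comparison described above amounts to computing pairwise correlation coefficients for each dataset and checking whether their signs agree. The sketch below assumes Pearson correlation, since the text does not name the coefficient, and uses small hypothetical value lists rather than the data behind Table 4.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def correlation_table(columns):
    """Correlation for every unordered pair of named feature columns."""
    names = list(columns)
    return {
        (a, b): pearson(columns[a], columns[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }


def same_sign(table_a, table_b):
    """True if every feature pair has the same correlation sign in both tables."""
    return all(table_a[key] * table_b[key] > 0 for key in table_a)


# Hypothetical transformed features (not the study data).
rock = {
    "log_resistivity": [4.0, 3.2, 2.5, 1.8, 1.0],
    "sqrt_chargeability": [1.0, 1.6, 2.1, 2.9, 3.5],
    "log_susceptibility": [-3.0, -2.2, -1.9, -1.0, -0.4],
}
field = {
    "log_resistivity": [3.5, 3.1, 2.9, 2.6, 2.2],
    "sqrt_chargeability": [1.2, 1.3, 1.9, 1.8, 2.4],
    "log_susceptibility": [-2.5, -2.6, -2.0, -1.7, -1.5],
}
rock_table = correlation_table(rock)
field_table = correlation_table(field)
signs_agree = same_sign(rock_table, field_table)
```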

Figure 7 shows the classification results for the survey area, which were obtained using five trained models with inverted geophysical properties as input data. No areas were classified as HO in any of the analyses, and few areas were predicted to be CO. Most of the areas classified as CO were located in the northeast, corresponding to the location of an anomaly in the inversion results shown in Figure 6. The area classified as CO varied among the five ML models, with the largest area being obtained by RF and the smallest by SVM. The classification results can serve as reference information for selecting locations for further drilling or exploration.
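One simple way to compare the models quantitatively is to tally the fraction of inversion cells assigned to each class. The sketch below uses hypothetical per-cell predictions for two models; the counts are invented for illustration and do not correspond to Figure 7.

```python
from collections import Counter

CLASSES = ("HO", "CO", "HOST")


def class_fractions(predicted_labels):
    """Fraction of grid cells assigned to each class."""
    counts = Counter(predicted_labels)
    total = len(predicted_labels)
    return {label: counts.get(label, 0) / total for label in CLASSES}


# Hypothetical predictions for 100 grid cells per model.
predictions_by_model = {
    "SVM": ["HOST"] * 97 + ["CO"] * 3,
    "RF": ["HOST"] * 90 + ["CO"] * 10,
}
fractions = {model: class_fractions(p) for model, p in predictions_by_model.items()}

# Model with the largest predicted CO fraction.
largest_co = max(fractions, key=lambda model: fractions[model]["CO"])
```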

To verify the ML results, we compared the top view of the classification results with the geological map, as shown in Figure 8. The exploration area where the electrical resistivity and induced polarization surveys were conducted is located on the border between quartz-monzodiorite and monzogabbro-monzodiorite, and the eastern area is generally more mafic than the western one. Although verification using surface data does not guarantee the distribution in the subsurface, most areas predicted to be CO at the surface are located in the monzogabbro-monzodiorite, which is consistent with the distribution shown on the geological map.

5. Discussion

5.1. Difference between Laboratory Experiment and Inversion Results

We generated a rock classifier using ML methods with the geophysical properties of drilling cores obtained by a laboratory experiment and predicted promising areas for ore within the survey area using inverted geophysical properties as input data. However, predictions in the survey area are limited in that the geophysical properties obtained from the two sample groups are fundamentally different. The geophysical properties of drilling cores are measured directly through a laboratory experiment, but the inverted properties are estimated from signals recorded at the surface, resulting in differences in resolution and accuracy. Nevertheless, the correlations between the features of the two sample types are similar, as delineated in Table 4, and the same scaler was applied for data processing during training, such that the classified areas correspond to the anomaly observed in geophysical field data. In future research, to compensate for the problems caused by the differences between these two groups, we will adopt a deep learning technique such as domain adaptation [31], which can handle data with differing domains.
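Applying the same scaler to both sample groups means that the scaling parameters are fitted on the training data only and then reused unchanged for the field data. The sketch below assumes a standardization (z-score) scaler, as the type of scaler is not specified in the text; the function names and feature values are hypothetical.

```python
import statistics


def fit_standardizer(rows):
    """Column-wise mean and population standard deviation of the training rows."""
    columns = list(zip(*rows))
    means = [statistics.fmean(column) for column in columns]
    stds = [statistics.pstdev(column) for column in columns]
    return means, stds


def apply_standardizer(rows, means, stds):
    """Scale rows with previously fitted parameters (no refitting)."""
    return [
        tuple((value - m) / s for value, m, s in zip(row, means, stds))
        for row in rows
    ]


# Hypothetical transformed features: (log resistivity, sqrt chargeability, log susceptibility).
train_rows = [(1.0, 10.0, -3.0), (2.0, 20.0, -2.0), (3.0, 30.0, -1.0)]
field_rows = [(2.0, 20.0, -2.0), (3.0, 10.0, -1.5)]

means, stds = fit_standardizer(train_rows)            # fitted on training data only
train_scaled = apply_standardizer(train_rows, means, stds)
field_scaled = apply_standardizer(field_rows, means, stds)  # same parameters reused
```

Refitting the scaler on the field data would shift the inputs relative to the decision boundaries learned during training.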

5.2. Accuracy of Inversion Results for Field Exploration

The inverted geophysical properties illustrated in Figure 6 are not unique solutions. These values were calculated using non-linear optimization solutions derived from field data obtained through electrical resistivity, induced polarization, and airborne magnetic surveys [32]. The quality of the inversion results depends on various factors, including the number of data samples acquired, location of the survey line, performance of equipment, inversion algorithm, and exploration environment. Thus, for the inverted values to serve as representative geophysical properties in the survey area, improving the accuracy and resolution of inversion results is essential.

5.3. Training Materials from Drilling Cores

The total number of rock samples used for training is 596, which is insufficient to cover areas around the Gonamsan intrusion. The reliability of training can be improved by investigating diverse rock lithologies with large numbers of rock samples. As KIGAM plans drilling operations near the Gonamsan intrusion, we will obtain additional data in the future.

We labeled our data based on density. As our ultimate aim is the identification of areas that may contain ore, the grade of ore is a reasonable criterion. However, measuring the grade of ore for all rock samples was impractical, due to both the cost and time requirement. Thus, readers should interpret the results carefully, given that density was used as the criterion for labeling data.

5.4. Verification of the Classification Results from Field Data

We applied the trained ML models to geophysical field data and verified the results by comparing the top view with a geological map. However, because our target is a promising area in the subsurface, another method is needed to verify the subsurface composition associated with the classification results. The most reliable method to verify composition is drilling, but drilling in all areas is unrealistic. An alternative verification method is the creation of geological models for the survey area and the generation of synthetic data through numerical modeling. The geological model can be determined by geoscience approaches, including geophysical, geological, and geochemical surveys with drilling. As synthetic data are labeled, the trained model can be verified using synthetic data prior to its application to field data. Verification using synthetic data is possible if the geological model is accurate, i.e., if there is a small difference between the field and synthetic data.

6. Conclusions

We generated a rock classifier using supervised ML to investigate vanadiferous titanomagnetite ore deposits. The training materials were rock samples from drilling cores, and geophysical properties including electrical resistivity, chargeability, and magnetic susceptibility were used as features for training. Those properties were obtained through laboratory measurements and were labeled by density into HO, CO, and HOST classes.

We used SVM, RF, XGB, LGBM, and DNN for supervised ML, and optimized the hyperparameters of each method using a grid search algorithm. With the test dataset, the accuracy of DNN was highest, at 0.97, and all methods had values of 0.95 or greater. The trained model was applied to field exploration data acquired by electrical resistivity, induced polarization, and airborne magnetic surveys. The classification results of the trained models contained no areas of HO and few of CO. We verified the classified areas through comparison with a geological map at the surface. Most areas classified as CO are located on the eastern side of the boundary of monzodiorites, which is consistent with the distribution shown on the geological map.

The rock classifier that we generated can predict the distribution of promising ore zones in subsurface media and help guide the selection of locations for future drilling or exploration. However, our method has several problems and limitations that remain to be solved. First, the geophysical properties of rock samples and of field data are obtained by different methods, which can cause differences in resolution and accuracy. To reduce the gap between the two domains, additional techniques such as domain adaptation are required. The second problem is the lack of training samples and the quality of the inversion results of field data. Adding rock samples of various lithologies through additional drilling and improving the quality of the inversion results would increase the reliability of the trained model. The third problem is the lack of verification of the classification results in the subsurface media. Synthetic data based on a geological model can help to verify the trained model before it is applied to field data.

Author Contributions

Conceptualization, Y.S. and S.S.; methodology, Y.S. and S.S.; formal analysis, Y.S.; investigation, Y.S. and S.S.; resources, Y.S. and S.S.; writing—original draft preparation, Y.S.; writing—review and editing, Y.S. and S.S.; visualization, Y.S.; supervision, S.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research was also supported by the basic research project of the Korea Institute of Geoscience and Mineral Resources, Daejeon, South Korea, funded by the Ministry of Science and ICT of Korea (GP2020-007) and the Energy Efficiency & Resources of the Korea Institute of Energy Technology Evaluation and Planning (KETEP) grant funded by the Korea government Ministry of Trade, Industry and Energy (No. 20216110100040).

Data Availability Statement

Not applicable.

Acknowledgments

We appreciate anonymous reviewers and the associated editor for the constructive comments to improve the quality of the manuscript.

Conflicts of Interest

The authors declare no conflict of interest.

References

Tichauer, R.; Martins, A.C.; Silva, R.S.; De Tomi, G. The role of geophysics in enhancing mine planning decision-making in small-scale mining. R. Soc. Open Sci. 2020, 7, 200384. [Google Scholar] [CrossRef] [PubMed]
Webb, S.J.; Scheiber-Enslin, S.E.; Cole, J. The importance of large scale geophysical investigations for mineral exploration. In Ore Deposits: Origin, Exploration, and Exploitation; Sophie, D., Laurence, R., Eds.; Wiley: Hoboken, NJ, USA, 2019; pp. 209–223. [Google Scholar]
Blain, C. Fifty-year trends in minerals discovery-commodity and ore-type targets. Explor. Min. Geol. 2000, 9, 1–11. [Google Scholar] [CrossRef]
Qin, Y.; Liu, L.; Wu, W. Machine learning-based 3D modeling of mineral prospectivity mapping in the Anqing Orefield, Eastern China. Nat. Resour. Res. 2021, 30, 3099–3120. [Google Scholar] [CrossRef]
Frasheri, A.; Lubonja, L.; Alikaj, P. On the application of geophysics in the exploration for copper and chrome ores in Albania. Geophys. Prospect. 1995, 43, 743–757. [Google Scholar] [CrossRef]
Pace, F.; Godio, A.; Santilano, A.; Comina, C. Joint optimization of geophysical data using multi-objective swarm intelligence. Geophys. J. Int. 2019, 218, 1502–1521. [Google Scholar] [CrossRef]
Sun, J.; Li, Y. Joint inversion of multiple geophysical data using guided fuzzy c-means clustering. Geophysics 2016, 81, ID37–ID57. [Google Scholar] [CrossRef]
Abedi, M.; Norouzi, G.H.; Bahroudi, A. Support vector machine for multi-classification of mineral prospectivity areas. Comput. Geosci. 2012, 46, 272–283. [Google Scholar] [CrossRef]
Carranza, E.J.M.; Laborte, A.G. Random forest predictive modeling of mineral prospectivity with small number of prospects and data with missing values in Abra (Philippines). Comput. Geosci. 2015, 74, 60–70. [Google Scholar] [CrossRef]
Rodriguez-Galiano, V.; Sanchez-Castillo, M.; Chica-Olmo, M.; Chica-Rivas, M.J.O.G.R. Machine learning predictive models for mineral prospectivity: An evaluation of neural networks, random forest, regression trees and support vector machines. Ore Geol. Rev. 2015, 71, 804–818. [Google Scholar] [CrossRef]
Granek, J. Application of Machine Learning Algorithms to Mineral Prospectivity Mapping. Ph.D. Thesis, University of British Columbia, Vancouver, BC, Canada, 21 January 2017.
Fu, G.; Lü, Q.; Yan, J.; Farquharson, C.G.; Qi, G.; Zhang, K.; Zhang, Y.; Wang, H.; Luo, F. 3D mineral prospectivity modeling based on machine learning: A case study of the Zhuxi tungsten deposit in northeastern Jiangxi Province, South China. Ore Geol. Rev. 2021, 131, 104010.
Lachaud, A. Analysis of Machine Learning Mineral Prospectivity Models at a Project-Scale using Scarce Training Dataset. Ph.D. Thesis, University of British Columbia, Vancouver, BC, Canada, 31 March 2021.
McMillan, M.; Haber, E.; Peters, B.; Fohring, J. Mineral prospectivity mapping using a VNet convolutional neural network. Lead. Edge 2021, 40, 99–105.
Cho, J.D.; Bang, K.Y. A Report of the Magnetic Survey on the Titanomagnetite Ore Bodies of the Mt. Gonam Area; Korea Research Institute of Geoscience and Mineral Resources: Daejeon, Korea, 1980; Volume 8, pp. 149–164.
Kee, W.S.; Cho, D.L.; Kim, B.C.; Jin, K.M. Geological Report of the Pocheon Sheet (1:50,000); Korea Institute of Geoscience and Mineral Resources: Daejeon, Korea, 2005; p. 66.
Shin, S.; Cho, S.; Kim, E.; Lee, J. Geophysical Properties of Precambrian Igneous Rocks in the Gwanin Vanadiferous Titanomagnetite Deposit, Korea. Minerals 2021, 11, 1031.
Bieniawski, Z.T.; Bernede, M.J. Suggested methods for determining the uniaxial compressive strength and deformability of rock materials: Part 1. Suggested method for determining deformability of rock materials in uniaxial compression. Int. J. Rock Mech. Min. Sci. Geomech. Abstr. 1979, 16, 138–140.
Telford, W.M.; Geldart, L.P.; Sheriff, R.E. Applied Geophysics; Cambridge University Press: Cambridge, UK, 1990.
Herman, R. An introduction to electrical resistivity in geophysics. Am. J. Phys. 2001, 69, 943–952.
Oldenburg, D.W.; Li, Y. Inversion of induced polarization data. Geophysics 1994, 59, 1327–1341.
Wang, M.; Deng, W. Mitigating bias in face recognition using skewness-aware reinforcement learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Washington, DC, USA, 14–19 June 2020.
Chawla, N.V.; Bowyer, K.W.; Hall, L.O.; Kegelmeyer, W.P. SMOTE: Synthetic minority over-sampling technique. J. Artif. Intell. Res. 2002, 16, 321–357.
Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Duchesnay, E.; et al. Scikit-learn: Machine learning in Python. J. Mach. Learn. Res. 2011, 12, 2825–2830.
Chollet, F. Keras, 2015, GitHub, GitHub Repository. Available online: https://github.com/fchollet/keras (accessed on 9 April 2022).
Cherkassky, V.; Ma, Y. Practical selection of SVM parameters and noise estimation for SVM regression. Neural Netw. 2004, 17, 113–126.
Liaw, A.; Wiener, M. Classification and regression by randomForest. R News 2002, 2, 18–22.
Chen, T.; Guestrin, C. XGBoost: A Scalable Tree Boosting System. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, New York, NY, USA, 13–17 August 2016.
Ke, G.; Meng, Q.; Finley, T.; Wang, T.; Chen, W.; Ma, W.; Ye, Q.; Liu, T.Y. Lightgbm: A highly efficient gradient boosting decision tree. In Proceedings of the Advances in Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017.
Liu, W.; Wang, Z.; Liu, X.; Zeng, N.; Liu, Y.; Alsaadi, F.E. A survey of deep neural network architectures and their applications. Neurocomputing 2017, 234, 11–26.
Pan, S.J.; Tsang, I.W.; Kwok, J.T.; Yang, Q. Domain adaptation via transfer component analysis. IEEE Trans. Neural Netw. Learn. Syst. 2010, 22, 199–210.
Oldenburg, D.W.; Pratt, D.A. Geophysical inversion for mineral exploration: A decade of progress in theory and practice. In Proceedings of the 5th Decennial International Conference on Mineral Exploration (Exploration 07), Toronto, ON, Canada, 9–12 September 2007.

Figure 1. Geological map of the area surrounding the Gonamsan intrusion (modified from Kee et al. [16]). Numbers below the well symbols indicate the installation year and the number of wells. The red dashed lines represent the three parallel lines of the electrical resistivity and induced polarization survey.

Figure 2. Rock samples classified by density: (a) high-grade ore (HO) (>4.4 g/cm³), (b) candidate ore deposit (CO) (3.5–4.4 g/cm³), and (c) host rock (HOST) (<3.5 g/cm³).

Figure 3. Histograms of the log values of the (a) electrical resistivity, (b) chargeability, and (c) magnetic susceptibility of rock samples.

Figure 4. Box plots of the (a) electrical resistivity, (b) chargeability, and (c) magnetic susceptibility of rock samples classified as high-grade ore (HO), candidate ore deposit (CO), and host rocks (HOST).

Figure 5. Schematic images of machine learning (ML) methods: (a) support vector machine (SVM), (b) random forest (RF), and (c) deep neural network (DNN).

Figure 6. The 3D inversion results from the geophysical field survey: (a) electrical resistivity, (b) chargeability, and (c) magnetic susceptibility.

Figure 7. Classification results for the survey area, obtained by applying the trained ML models to the inverted geophysical properties as features: (a) SVM, (b) RF, (c) extreme gradient boosting (XGB), (d) LightGBM (LGBM), and (e) DNN.

Figure 8. Comparison between the geological map and classification results obtained with trained ML methods: (a) SVM, (b) RF, (c) XGB, (d) LGBM and (e) DNN. The areas colored in yellow are candidates for ore deposits.

Table 1. Skewness and kurtosis values for the raw and transformed resistivity, chargeability, and magnetic susceptibility data. Log transformation was applied to resistivity and magnetic susceptibility, and square root transformation to chargeability.

Property	Skewness (Raw)	Skewness (Transformed)	Kurtosis (Raw)	Kurtosis (Transformed)
Resistivity (Ωm)	5.28	−0.42	43.42	−1.22
Chargeability (ms)	0.61	0.01	−0.12	−1.40
Magnetic susceptibility	1.84	−0.26	2.37	−1.37
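
The skewness and kurtosis columns in Table 1 can be reproduced in a few lines. The following is a minimal sketch, assuming population (biased) moment estimators and excess kurtosis (a normal distribution scores 0), which the negative kurtosis values in the table suggest; the article does not state which estimator was used. The function names and the sample values are hypothetical and are not data from the study.

```python
import math


def central_moment(values, order):
    """Population central moment of the given order."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** order for v in values) / len(values)


def skewness(values):
    """Third standardized moment (biased estimator)."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    """Fourth standardized moment minus 3 (biased estimator)."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3.0


# Hypothetical resistivity readings (ohm-m) with a long right tail.
raw = [12.0, 15.0, 20.0, 35.0, 60.0, 110.0, 400.0, 2500.0, 9000.0, 30000.0]
logged = [math.log10(v) for v in raw]  # log transform, as applied to resistivity

for name, data in (("raw", raw), ("log10", logged)):
    print(f"{name}: skewness={skewness(data):.2f}, "
          f"excess kurtosis={excess_kurtosis(data):.2f}")
```

On right-skewed data such as this, the log transform pulls in the long tail, so the skewness of the transformed values is much closer to zero than that of the raw values. The same functions apply unchanged to a square root transform.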

Table 2. Optimal hyperparameters for the ML methods.

Model	Optimal Hyperparameter Set
SVM	‘C’ = 100, ‘gamma’ = 1, ‘kernel’ = rbf
RF	‘n_estimators’ = 100, ‘max_depth’ = 5, ‘max_features’ = 3, ‘min_samples_leaf’ = 3, ‘min_samples_split’ = 8
XGB	‘n_estimators’ = 50, ‘learning_rate’ = 1, ‘max_depth’ = 7, ‘gamma’ = 1, ‘colsample_bytree’ = 0.6
LGBM	‘n_estimators’ = 500, ‘colsample_bytree’ = 0.9, ‘max_depth’ = 5, ‘num_leaves’ = 30, ‘subsample’ = 0.2
DNN	‘n_hidden_layer’ = 2, ‘n_hidden_nodes’ = (100, 100), ‘activation_function’ = (Elu, Elu), ‘optimizer’ = Adam, ‘drop_out_ratio’ = 0, ‘learning_rate’ = 0.01

Table 3. Evaluation results for the tested ML methods.

Model	Accuracy	Recall	Precision	F1-Score
SVM	0.950	0.957	0.942	0.950
RF	0.967	0.956	0.974	0.964
XGB	0.958	0.971	0.948	0.958
LGBM	0.967	0.956	0.974	0.964
DNN	0.975	0.969	0.978	0.974
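
The scores in Table 3 follow from per-class counts of correct and incorrect predictions. The sketch below assumes macro averaging over the three classes (HO, CO, HOST); the article does not state the averaging scheme. The function name `macro_scores` and the example labels are hypothetical and chosen only for illustration.

```python
def macro_scores(y_true, y_pred, labels):
    """Accuracy plus macro-averaged precision, recall, and F1-score."""
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)

    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Hypothetical labels: one high-grade ore sample is misclassified as candidate ore.
y_true = ["HO", "HO", "CO", "CO", "HOST", "HOST"]
y_pred = ["HO", "CO", "CO", "CO", "HOST", "HOST"]
scores = macro_scores(y_true, y_pred, labels=["HO", "CO", "HOST"])
print({name: round(value, 3) for name, value in scores.items()})
```

Macro averaging weights each rock class equally regardless of how many samples it contains, which matters when the classes are imbalanced.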

Table 4. Correlations, means, and standard deviations (SD) of the transformed electrical resistivity (LOGRESI), chargeability (SQRTCHAR), and magnetic susceptibility (LOGSI) for drilling-core rock samples and field data.

Variable	Mean	SD	Correlation with LOGRESI	Correlation with SQRTCHAR	Correlation with LOGSI
Rock samples
LOGRESI	2.75	1.43	1.00
SQRTCHAR	12.99	7.58	−0.78	1.00
LOGSI	−1.74	1.21	−0.85	0.85	1.00
Field data
LOGRESI	4.02	0.61	1.00
SQRTCHAR	2.00	2.25	−0.36	1.00
LOGSI	−1.40	0.63	−0.41	0.61	1.00
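
Summary statistics of the kind listed in Table 4 can be computed as follows. This is a minimal sketch, assuming the sample standard deviation and Pearson's correlation coefficient; the article does not name the estimators. The helper names and the input values are hypothetical and are not measurements from the study.

```python
import math
import statistics


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def summarize(columns):
    """Return (mean, SD) per variable and the lower-triangle correlations."""
    names = list(columns)
    stats = {
        name: (statistics.fmean(columns[name]), statistics.stdev(columns[name]))
        for name in names
    }
    corr = {
        (a, b): pearson(columns[a], columns[b])
        for i, a in enumerate(names)
        for b in names[: i + 1]
    }
    return stats, corr


# Hypothetical transformed properties for five samples.
samples = {
    "LOGRESI": [4.1, 3.2, 2.0, 1.1, 0.6],
    "SQRTCHAR": [3.0, 8.0, 14.0, 19.0, 22.0],
    "LOGSI": [-3.2, -2.5, -1.6, -0.9, -0.4],
}
stats, corr = summarize(samples)
for name, (mean, sd) in stats.items():
    print(f"{name}: mean={mean:.2f}, SD={sd:.2f}")
for (a, b), r in corr.items():
    print(f"r({a}, {b}) = {r:.2f}")
```

Running the same summary on rock-sample values and on inverted field values gives a direct way to compare the two feature distributions before applying a trained classifier.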

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
