Machine Learning-Based Lithological Mapping from ASTER Remote-Sensing Imagery

Bahrami, Hazhir; Esmaeili, Pouya; Homayouni, Saeid; Pour, Amin Beiranvand; Chokmani, Karem; Bahroudi, Abbas

doi:10.3390/min14020202

by Hazhir Bahrami ¹, Pouya Esmaeili ², Saeid Homayouni ¹, Amin Beiranvand Pour ³,*, Karem Chokmani ¹ and Abbas Bahroudi ²

¹ Centre Eau Terre Environnement, Institut National de la Recherche Scientifique, Québec, QC G1K 9A9, Canada

² School of Mining Engineering, College of Engineering, University of Tehran, Tehran 1417935840, Iran

³ Institute of Oceanography and Environment (INOS), Higher Institution Center of Excellence (HICoE) in Marine Science, Universiti Malaysia Terengganu (UMT), Kuala Nerus 21030, Terengganu, Malaysia

* Author to whom correspondence should be addressed.

Minerals 2024, 14(2), 202; https://doi.org/10.3390/min14020202

Submission received: 21 December 2023 / Revised: 11 February 2024 / Accepted: 12 February 2024 / Published: 16 February 2024

(This article belongs to the Special Issue Geological, Structural, Geochemical, Hyperspectral, and Geostatistical Modeling for Mineral Exploration)

Abstract:

Accurately mapping lithological features is essential for geological surveys and the exploration of mineral resources. Remote-sensing images have been widely used to extract information about mineralized alteration zones due to their cost-effectiveness and potential for being widely applied. Automated methods, such as machine-learning algorithms, for lithological mapping using satellite imagery have also received attention. This study aims to map lithologies and minerals indirectly through machine-learning algorithms using advanced spaceborne thermal emission and reflection radiometer (ASTER) remote-sensing data. The capabilities of several machine-learning (ML) algorithms were evaluated for lithological mapping, including random forest (RF), support vector machine (SVM), gradient boosting (GB), extreme gradient boosting (XGB), and a deep-learning artificial neural network (ANN). These methods were applied to ASTER imagery of the Sar-Cheshmeh copper mining region of Kerman Province, in southern Iran. First, several spectral features that were extracted from ASTER bands were used as input data. Second, correlation coefficients between the original spectral bands and features were extracted. The random forest feature importance (RF’s FI) was subsequently computed, and features with less importance were removed. Finally, the remaining features were given to the models as input data in the second scenario. Accuracy assessments were performed for lithological classes in the study region, including Sar-Cheshmeh porphyry, quartz eye, late fine porphyry, hornblende dike, granodiorite, feldspar dike, biotite dike, andesite, and alluvium. The overall accuracy results of lithological mapping showed that ML-based algorithms without feature extraction have the highest accuracy. The overall accuracy percentages for ML-based algorithms without conducting feature extraction were 84%, 85%, 80%, 82%, and 80% for RF, SVM, GB, XGB, and ANN, respectively. 
The results of this study would be of great interest to geologists for lithological mapping and mineral exploration, particularly for selecting appropriate ML-based techniques to be implemented in similar regions.

Keywords:

lithological mapping; remote sensing; ASTER; mineral identification; exploration geology; machine learning; deep learning

1. Introduction

The concepts of geological units and lithological mapping in geology are closely related; however, they are distinct from one another. On one hand, different geological units can be categorized based on their characteristics, such as composition, age, and origin, including rock layers, formations, and other distinct bodies of rock. On the other hand, lithological mapping provides information about the distribution and geological history of the Earth’s crust, together with its characteristics [1]. Therefore, it plays a significant role in bedrock surveys and mineral exploration [2]. In many cases, ore deposits have been first discovered on the ground by recognizing hydrothermally altered host rocks. To understand the distribution, properties, and characteristics of different rock types within a particular area, regional lithological maps can play a significant role in lithological mapping as part of geology and mineral exploration [1]. Acquiring such lithological maps is time-consuming, needs intensive fieldwork, and can be challenging in cases where the study area is difficult to reach.

Remote sensing is one of the valuable approaches for lithological mapping to explore commercially viable mineral resources. Remote-sensing mapping techniques can locate high potential zones for ore mineralization in a vast area by recognizing hydrothermally altered rocks, while consuming less time and money and achieving a higher accuracy than ground-based field surveys [3,4]. However, the medium and coarse spatial and spectral resolution of remote-sensing data may make the implementation of such systems difficult. The technical characteristics of multispectral and hyperspectral remote-sensing sensors are crucial for lithological mapping and mineral exploration [5,6,7]. Sensors that are equipped with hyperspectral technology are capable of simultaneously acquiring images with 100 to 200 contiguous spectral bands, allowing for a unique combination of spectrally contiguous images [8]. A substantial amount of spectral information can be derived from satellite-based hyperspectral data, allowing mineral compositions to be determined from the spectra [9]. Yet, in addition to spectral confusion, difficulties in data processing, relatively narrow swath widths, and atmospheric interference, high-resolution hyperspectral images are prone to spectral interference [10]. Thus, a single pixel in the image provides coverage for a large ground surface area (e.g., 1000 m²), making selections of pure pixel spectra for training samples in the supervised classifier difficult and challenging; furthermore, the lithological classification accuracy is potentially low as a result [11,12]. Hyperspectral data are often not openly available or are costly.

Thanks to the availability of high-resolution multispectral data, such as SPOT and GF-2, the problem of low classification accuracy has been solved to a certain extent. Despite an impressive capability of showing structural and textural features, high-spatial-resolution multispectral satellite data have a narrow spectral range. There are only a few visible and near-infrared bands and a marked absence of other spectral bands such as short wave and thermal infrared. Most high-resolution images are also costly and not publicly available.

The advanced spaceborne thermal emission and reflection radiometer (ASTER) sensor can identify lithological units and hydrothermal alteration mineral zones [13,14,15,16,17]. ASTER (Ministry of Economy, Trade, and Industry, Tokyo, Japan) provides worldwide coverage with high revisit times (16 days) at a relatively high spatial resolution (15–90 m). Using this technique, it is possible to identify imagery that is free from cloud cover or that is seasonal in order to minimize the effects of vegetation. It has been demonstrated that the remote identification of iron oxide minerals can be easily achieved using ASTER’s visible and near-infrared (VNIR) bands [14,18,19,20]. The fundamental absorption features of Al–O–H, Mg–O–H, Si–O–H, and CO₃ for identifying hydrothermal alteration minerals (e.g., phyllosilicates, sorosilicate, and carbonates) can be detected using the shortwave infrared (SWIR) bands of ASTER [21,22,23,24]. Furthermore, ASTER’s thermal infrared bands (TIR) can distinguish silicate lithological groups through the emissivity spectra that are derived from Si–O–Si stretching vibrations [15,25,26,27].

To map lithological units and identify alteration mineral zones, several image processing algorithms, namely band math, minimum noise fraction, spectral angle mapper, principal component analysis, false color composite, and matched filter, have been commonly applied to ASTER data [5,8,9,11,28]. The results from these conventional algorithms contain some drawbacks, such as unclassified and misclassified units, which are challenging. Hence, these techniques typically might reduce the accuracy of lithological and alteration mapping [29,30]. Recently, machine-learning (ML) algorithms have been more effective than conventional classification methods when classifying geological targets [31,32]. ML, which is a sub-domain of artificial intelligence, is a data-driven technique that helps to extract useful information and recognize patterns in data with minimal human involvement [33,34,35]. ML algorithms have several advantages, especially in automatically solving the most complex nonlinear problems, and are more robust in handling missing data than traditional image-processing methods [33,36]. Particular attention is devoted to the task of supervised lithology classification for the prediction of classes representing the spatial distributions of geological materials.

Some researchers, such as Bachri et al. [37] and Cracknell and Reading [33], have assessed and evaluated applications of ML algorithms in geological mapping using remote-sensing imagery. They showed considerable potential in various areas, such as mapping lithological units and the identification of alteration zones that are associated with a variety of ore mineralization processes [37]. Extensively applied ML algorithms in geology and mineral mapping include support vector machines (SVM) [33], artificial neural networks (ANNs) [33], random forest (RF) [38], maximum likelihood classifier (MLC) [38], k-nearest neighbors (k-NN) [33], and naïve Bayes (NB) [33]. Advancement in ML algorithms for image processing based on satellite data has considerably assisted in enhancing the detection of lithological and structural features, and in identifying alteration zones for mineral exploration. Lithological mapping could be made more feasible by using state-of-the-art ML algorithms like gradient boosting (GB), extreme gradient boosting (XGB), and artificial neural networks (ANNs).

A neural network is an artificial intelligence algorithm that is capable of analyzing patterns, learning tasks, and solving problems like humans [39]. The ANN is widely used to solve complex problems in diverse fields, including regression and classification problems [40]. The performance of ANNs depends on several key parameters, such as activation functions, loss functions, optimizers, hidden layers, the number of nodes, and regularization layers [41]. The GB [42] is a sequential ensemble learning technique where the model’s performance improves over iterations [43]. This method creates the model in a stage-wise fashion. It generalizes the model by enabling the optimization of an arbitrary differentiable loss function [43,44]. The XGB algorithm is an extended version of the gradient boosting algorithm. It is designed to enhance an ML model’s performance and speed. Xiong et al. [45] analyzed deep-learning algorithms and big data in skarn-type (sedimentary–igneous intrusion contacts) iron mineralization in China. Their results showed a strong spatial relationship between known mineralization areas, which were mapped prospectively by a deep-learning method. Elahi et al. [46] investigated the potential of two ML algorithms, SVM and ANN, using Sentinel-2 optical data for lithological mapping in Pakistan. They reported an overall accuracy of 95.78% and 95.73% for SVM and ANN, respectively. Utilizing ML methods of XGB and ANN algorithms on ASTER data has a high potential and great advantages for lithological mapping and mineral exploration.

The study’s main objective is to propose an approach for identifying the most optimized and efficient ML approach for lithological mapping using ASTER remote-sensing data. This research compares several traditional machine-learning algorithms, such as RF and SVM, to novel ensemble machine learning techniques, such as GB, XGB, and deep-learning ANN, for the spatial modeling of lithological units. It also aims to find the most relevant features and spectral regions for lithological mapping. This study represents an inclusive evaluation of RF, SVM, GB, XGB, and deep ANN algorithms for lithological mapping using ASTER data. The models can provide geologists with accurate lithological mapping and mineral exploration, especially when applied to similar regions.

2. Geology of the Study Area

The current study focuses on mineral exploration in Iran. Most of the country is semi-arid with sparse, mainly herbaceous vegetation on surfaces that are well exposed. This makes remote-sensing-based geological mapping an ideal method of study [47]. The Sar-Cheshmeh copper mining region in Kerman Province (southeast Iran) was selected as a case study (Figure 1A,B). The Sar-Cheshmeh porphyry copper deposit is considered the second largest global deposit of this metal, the most important in Iran, and has been exploited since ancient times. It contains roughly 1200 million tonnes of ore with an average grade of 1.2% copper, 0.03% molybdenum, 3.9 g/t Ag, and 0.11 g/t Au [48]. It is the first time that ML-based techniques (RF, SVM, GB, XGB, and deep-learning ANN) have been used for lithological mapping using ASTER remote-sensing data in the Sar-Cheshmeh copper mining region (Figure 1). The study area is 160 km southeast of Kerman City (55.865556° E, 29.946111° N) and south of the Urmia-Dokhtar volcanic belt (Figure 1A). It is located in an area of Eocene volcanic rock and Oligo-Miocene subvolcanic granitoid rock.

It is believed that the oldest host rocks of the Sar-Cheshmeh porphyry copper deposit are derived from the Eocene volcanogenic complex [49], which consists of the following: pyroxene trachybasalt, potassic and shoshonitic pyroxene andesite [50], less abundant andesite, agglomerate, tuff, and tuffaceous sandstone. During the Oligocene–Miocene transition (~23 Ma), granitoid phases such as quartz diorite, quartz monzonite, and granodiorite were intruded into these rocks. These granitoid rocks are cut by intramineral porphyry dikes composed of hornblende porphyry, feldspar porphyry, and biotite porphyry. The Sar-Cheshmeh copper deposit is placed in Eocene volcanic rocks, where a Miocene sub-volcanic granitoid unit intruded into andesitic host rocks [48]. Porphyry copper mineralization in this area is associated with well-developed zones of hydrothermal phyllic, argillic, propylitic, silicification, and jarositic alteration zones.

The deposit is located at an average altitude of 2620 m asl and its highest altitude reaches 3280 m asl. Generally speaking, the regional climate is characterized by cold, snowy, and windy winters, and mild summers. The temperature ranges from −15 to +35 °C. The average rainfall is reported to be 250 mm or less per year. As a result, the surface of the earth is well exposed, given that there is little or no vegetation cover, which makes the remote-sensing approach very suitable.

3. Materials and Methods

3.1. ASTER Data Characteristics and Preprocessing

ASTER is a moderate spatial and spectral resolution instrument on the Terra satellite platform, which observes the Earth’s surface through various electromagnetic wavelengths from visible to thermal infrared [51]. This sensor has 14 separate bands: (1) 3 bands in the visible and near-infrared (VNIR) (0.52 to 0.86 μm) with a spatial resolution of 15 m (i.e., Bands 1, 2, and 3), (2) 6 bands in the shortwave infrared (SWIR) (1.60 to 2.43 μm) with a spatial resolution of 30 m (i.e., Bands 4, 5, 6, 7, 8, and 9), and (3) 5 bands in the thermal infrared (TIR) with a spatial resolution of 90 m (i.e., Bands 10, 11, 12, 13, and 14) [52]. This study used ASTER Level 2 surface reflectance VNIR and crosstalk-corrected SWIR (AST_07XT) and the surface radiance TIR (AST_09T) datasets (ASTER Level 0: raw data; Level 1A: calibration of Level 0 and conversion to units of radiance; Level 1B: conversion of radiance to at-sensor reflectance; Level 2: application of atmospheric correction to obtain surface reflectance). The image was acquired on 20 May 2006. AST_07XT includes two product files that have been atmospherically corrected for the VNIR and SWIR derived from the Level 1B data. AST_09T data are atmospherically corrected and provide surface-leaving radiance at the 90 m spatial resolution for ASTER thermal bands. It contains surface-emitted and surface-reflected components. These products are freely available on NASA’s Earthdata website (https://earthdata.nasa.gov (accessed on 18 September 2021)). Orthorectification was applied to AST_07XT and AST_09T with ENVI’s ASTER Preprocessing Toolkit using a group of ground control points. Using the bilinear method, the 30 m SWIR and 90 m TIR bands were resampled to 15 m to match the VNIR data. Finally, all bands were stacked.

3.2. Mineral Spectral Characteristics

As a result of vibrational overtones, electronic transitions, charge transfer, and conduction, many minerals have diagnostic absorption features in the solar-reflected spectral region (0.3–2.5 μm) [19]. There is a prominent Al–OH absorption feature at 2.2 μm and a less intense one at 2.35 μm that are characteristic of phyllically altered rocks that contain sericite. An advanced argillic alteration is characterized by kaolinite and alunite with Al–OH absorption lines at 2.165 μm and 2.2 μm, respectively. Chlorite, epidote, and calcite are commonly present in propylitically (chemically) altered rocks, with Fe, Mg–OH, and CO₃ absorption features from 2.1 to 2.3 μm (Figure 2A) [53]. Minerals containing iron oxides and hydroxides, such as limonite and hematite, tend to exhibit spectral absorption features between 0.4 and 1.1 μm of the electromagnetic spectrum (Figure 2B) [20].

Using ASTER SWIR bands for lithology and mineral mapping of lithological units, Yamaguchi and Naito [54] proposed several spectral indices using a linear combination of reflectance in each ASTER SWIR band, including the kaolinite index, alunite index, calcite index, and montmorillonite index. Considering the spectral absorption characteristics of vegetation, minerals, and rocks in the various bands of ASTER data, Ninomiya [55,56] proposed a vegetation index and several mineralogic indices utilizing VNIR and SWIR, as well as several lithologic indices using TIR spectra such as the stabilized vegetation index (SVI), OH-bearing altered minerals index (OHI), the quartz index (QI), and the carbonate index (CI). Features extracted from ASTER bands that were used in this study are summarized in Table 1.
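As an illustration of how such band-ratio indices are computed, the following minimal sketch uses the forms commonly attributed to Ninomiya for the OH-bearing mineral, quartz, carbonate, and mafic indices. The exact definitions adopted in this study are those of Table 1, which is not reproduced here, so the formulas, the function name `ninomiya_indices`, and the random input arrays are assumptions for demonstration only.

```python
import numpy as np

def ninomiya_indices(b):
    """Compute four band-ratio indices from a dict {band number: 2-D array}.

    Assumed forms (not taken from this article's Table 1):
      OHI = (B7 / B6) * (B4 / B6)      OH-bearing altered minerals (SWIR)
      QI  = (B11 * B11) / (B10 * B12)  quartz (TIR)
      CI  = B13 / B14                  carbonate (TIR)
      MI  = B12 / B13                  mafic (TIR)
    """
    eps = 1e-12  # avoids division by zero on masked or empty pixels
    return {
        "OHI": (b[7] / (b[6] + eps)) * (b[4] / (b[6] + eps)),
        "QI": (b[11] * b[11]) / (b[10] * b[12] + eps),
        "CI": b[13] / (b[14] + eps),
        "MI": b[12] / (b[13] + eps),
    }

# Hypothetical 4 x 4 pixel scene with 14 co-registered bands.
rng = np.random.default_rng(0)
bands = {k: rng.uniform(0.1, 1.0, size=(4, 4)) for k in range(1, 15)}
indices = ninomiya_indices(bands)
```

Each index is a per-pixel array with the same shape as the input bands, so it can be stacked with the original bands as an additional feature layer.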

3.3. Implementation of Machine-Learning (ML) Algorithms

Spectral features were extracted from the ASTER bands and used as input for the ML algorithms. The most important features were identified using random forest (RF) feature importance (FI) together with Pearson’s correlation coefficients (r) among all original spectral bands and features. The random forest (RF), support vector machine (SVM), deep-learning ANN, gradient boosting (GB), and extreme gradient boosting (XGB) methods were selected for lithological mapping. A total of 33 features and bands were considered and were supplied to the ML algorithms through two scenarios. In the first scenario, all features were used as the inputs to the algorithms. In the second scenario, Pearson’s correlation coefficients between all original spectral bands and features were extracted, and the RF’s FI was then computed. For every pair of features with an absolute correlation greater than 0.9, the feature of lesser importance was removed. The remaining 17 features were utilized as the model’s input data. ML algorithms were implemented using the open-source Python packages Scikit-learn (version 1.0) (https://scikit-learn.org/ (accessed on 5 April 2023)) and Keras (version 2.3.0) (https://keras.io/).
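The pruning rule of the second scenario can be sketched as follows. This is one literal reading of the description (for every pair with |r| > 0.9, drop the less important member), not the authors' code; the function name `prune_correlated`, the synthetic data, and the importance values are hypothetical. In practice the importances could come from the `feature_importances_` attribute of a fitted Scikit-learn `RandomForestClassifier`.

```python
import numpy as np

def prune_correlated(X, importances, threshold=0.9):
    """Return the column indices kept after removing, from every pair of
    columns with |Pearson r| > threshold, the column of lesser importance."""
    r = np.corrcoef(X, rowvar=False)  # feature-by-feature correlation matrix
    n_features = X.shape[1]
    dropped = set()
    for i in range(n_features):
        for j in range(i + 1, n_features):
            if abs(r[i, j]) > threshold:
                dropped.add(i if importances[i] < importances[j] else j)
    return [k for k in range(n_features) if k not in dropped]

# Synthetic example: column 1 is a near-copy of column 0, column 2 is independent.
rng = np.random.default_rng(42)
a = rng.normal(size=200)
b = rng.normal(size=200)
X = np.column_stack([a, a + 0.01 * rng.normal(size=200), b])
importances = np.array([0.5, 0.3, 0.2])  # hypothetical RF feature importances

kept = prune_correlated(X, importances)
```

Here column 1 is removed because it is almost perfectly correlated with column 0 and has the lower importance, while the uncorrelated column 2 is retained.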

The samples were divided through a stratified train–test split. For each class, 35% of the samples were used as test data, and the remaining 65% were used as training data. Hyperparameter tuning of each machine-learning algorithm was conducted through grid search cross-validation (GridSearchCV), an existing function in Scikit-learn (Python) that searches a specified grid for the optimal hyperparameter values of a given model. The accuracy assessment followed two steps. First, all spectral bands and features were given as the model’s input. Second, the random forest feature importance was computed for all spectral bands and features. For each pair of features with an absolute correlation greater than 0.9, the feature with the higher importance was preserved, and the feature of lesser importance was removed. Finally, the remaining features were provided as input to the ML algorithms. The flowchart of the methodology that was applied in this study is illustrated in Figure 3.
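The split and tuning procedure described above might look like the following sketch. The synthetic dataset, the parameter grid, and the three-fold setting are assumptions chosen only to keep the example small; the grids used in the study are those listed in the appendix tables.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the labeled pixels (3 classes, 10 features).
X, y = make_classification(n_samples=600, n_features=10, n_informative=6,
                           n_classes=3, random_state=0)

# Stratified split: each class contributes about 35% of its samples to the test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.35, stratify=y, random_state=0)

# Hypothetical grid; every combination is scored by cross-validation on the training set.
param_grid = {"n_estimators": [50, 100], "max_depth": [5, None]}
search = GridSearchCV(RandomForestClassifier(random_state=0), param_grid, cv=3)
search.fit(X_train, y_train)

best_params = search.best_params_
test_accuracy = search.score(X_test, y_test)  # accuracy of the refitted best model
```

Keeping the test set out of the grid search means that `test_accuracy` is measured on data the tuned model has never seen.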

3.3.1. Random Forest

Random forest, which was developed by Breiman [57], is an ensemble tree-based learning algorithm and a powerful non-parametric technique for solving various data mining problems. RF is less affected by outlier data than decision trees (DTs) and can handle various input data without overfitting the dataset [58]. An RF consists of many DTs, each fitted to a randomly selected subset of the training dataset. The DT method’s main problem is that it tends to fit closely to the training data; in other words, DT has an overfitting problem [59]. RF averages the trees’ outputs to improve accuracy in regression problems and takes a majority vote in classification problems [44]. Thus, RF mitigates the DT’s problem of overfitting the training data. The parameters that were provided as input to the GridSearchCV are shown in Table A1.

3.3.2. Support Vector Machines

Support vector machines (SVMs), formally described by Cortes and Vapnik [60], are powerful supervised machine-learning algorithms used for classification and regression problems [61]. In the original formulation, the SVM model tries to find a hyperplane that separates the training dataset into a predefined number of classes. The decision boundary is obtained during training as the optimal separating hyperplane, which minimizes the number of misclassifications. Learning occurs through an iterative procedure that finds the optimal decision boundary to separate the training samples, conceivably in a high-dimensional space [62]. The resulting hyperplane is an (n − 1)-dimensional subspace of an n-dimensional space. The decision boundary is specified by a subset of the training samples, called support vectors. In SVM, nonlinear kernel functions are frequently used to transform input data into a high-dimensional space and make them more separable [63]. A radial basis function (RBF) is an excellent choice for transforming input data prior to the implementation of nonlinear models [44]. The details of grid search parameters are shown in Table A2.
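The effect of a nonlinear kernel can be illustrated with a small sketch on synthetic data that no straight line can separate. The concentric-circle dataset, the scaling step, and the `C` and `gamma` values are assumptions for demonstration and are unrelated to the ASTER samples or to the grid in Table A2.

```python
from sklearn.datasets import make_circles
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Two classes arranged as concentric rings: not linearly separable in 2-D.
X, y = make_circles(n_samples=400, factor=0.3, noise=0.05, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.35, stratify=y, random_state=0)

# Linear kernel: the decision boundary is a straight line in the input space.
linear_model = make_pipeline(StandardScaler(), SVC(kernel="linear"))
linear_acc = linear_model.fit(X_tr, y_tr).score(X_te, y_te)

# RBF kernel: implicitly maps the inputs to a higher-dimensional space.
rbf_model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
rbf_acc = rbf_model.fit(X_tr, y_tr).score(X_te, y_te)

# Only the support vectors (a subset of the training samples) define the boundary.
n_support = rbf_model.named_steps["svc"].n_support_
```

On data of this kind the RBF model should separate the rings almost perfectly, whereas the linear model cannot do much better than chance.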

3.3.3. Deep-Learning ANN

ANN methods try to model problems using interconnected artificial neurons, loosely analogous to those of the human brain, to solve machine-learning problems [44]. An ANN is a feed-forward multilayer perceptron consisting of an input layer, one or more hidden layers, and an output layer. In neural networks, a layer’s neurons can be connected to neurons in other layers, but not to other neurons within the same layer [64]. In a fully connected ANN, each neuron is connected to all neurons in the previous and following layers. Each connection has its own weight.

Two main characteristics of each ANN are its architecture and the manner in which it learns. The main issue in determining the ANN architecture is selecting the appropriate number of hidden layers and the number of neurons. Several methods propose how the number of these hidden layers and neurons can be selected [65,66]. In this study, we have determined the number of the neurons using Equation (1):

N_{n} = \sqrt{(m + 2) N} + 2 \sqrt{\frac{N}{m + 2}}

(1)

where N_n is the number of neurons in each layer, N is the number of input neurons, and m is the number of layers. We closely scrutinized various activation functions for the deep ANN model, including ReLU (rectified linear unit), Sigmoid, Tanh (hyperbolic tangent function), and Linear [67]. Early stopping was used to avoid overfitting in the deep ANN model. Therefore, 20% of the training data were selected as the validation data that were used in the learning process.
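Equation (1) can be evaluated directly. The sketch below applies it to the two input sizes mentioned in the text (33 and 17 features); the choice of m = 3 layers is an assumption made only for this illustration, since the number of layers actually used is not stated in this section.

```python
import math

def neurons_per_layer(n_inputs, n_layers):
    """Equation (1): N_n = sqrt((m + 2) * N) + 2 * sqrt(N / (m + 2))."""
    return (math.sqrt((n_layers + 2) * n_inputs)
            + 2 * math.sqrt(n_inputs / (n_layers + 2)))

n_full = neurons_per_layer(n_inputs=33, n_layers=3)     # all bands and features
n_reduced = neurons_per_layer(n_inputs=17, n_layers=3)  # after feature selection
```

With these assumed values the formula gives roughly 18 neurons per layer for 33 inputs and roughly 13 for 17 inputs, after rounding to the nearest integer.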

3.3.4. Gradient Boosting

The gradient boosting (GB) decision tree is a variant of ensemble learning [68], where multiple weak predictive models are generated, then combined and weighted in a function approximating or predicting the output variable from the input variable ensemble. Boosting and bagging are two prevalent types of ensemble learning. “Boosting” is defined as the process of combining multiple weak learners into a strong learner, thereby reducing model bias. “Bagging” refers to bootstrap aggregation, the process by which the variance of the model is reduced, while simultaneously avoiding the overfitting of the final ML model. Variance reduction is accomplished by generating multiple decision trees from subsets randomly drawn from the data. GB is employed in classification and regression problems in the same manner as RF [69]. GB usually involves three steps: (1) establishing a loss function, which should be optimized; (2) generating weak learners (typically decision trees) to make a prediction; and (3) creating an additive model to include weak learners in a manner that minimizes the loss function [44]. GB trains many models in a sequential and additive way.
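The three steps listed above can be sketched with a deliberately simplified booster. The example uses a squared-error loss on a one-dimensional regression problem because the residual is then exactly the negative gradient. It is not the classifier configuration used in the study, and the function name, tree depth, learning rate, and synthetic data are assumptions.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def fit_gradient_boosting(X, y, n_stages=50, learning_rate=0.1):
    """Stage-wise additive model for the loss 0.5 * (y - F(x))^2."""
    baseline = float(np.mean(y))            # initial constant prediction
    prediction = np.full(len(y), baseline)
    trees, losses = [], []
    for _ in range(n_stages):
        residuals = y - prediction          # negative gradient of the loss
        tree = DecisionTreeRegressor(max_depth=2, random_state=0)
        tree.fit(X, residuals)              # weak learner fitted to the residuals
        prediction = prediction + learning_rate * tree.predict(X)
        trees.append(tree)
        losses.append(float(np.mean((y - prediction) ** 2)))
    return baseline, trees, losses

# Synthetic noisy sine curve as the regression target.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(300, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=300)

baseline, trees, losses = fit_gradient_boosting(X, y)
```

Each stage adds a small correction to the running prediction, so the training loss recorded in `losses` does not increase from one stage to the next.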

XGB was created to implement GB, which is an algorithm that is highly impressive and flexible. Tianqi Chen (Carnegie Mellon University) devised XGB software (in Python Scikit-learn 1.0 version) to be compatible across various platforms (C++, Python, R; http://datascience.la/xgboost-workshop-and-meetup-talk-with-tianqi-chen/ (accessed on 8 March 2022)). All codes are provided on Github (https://github.com/szilard/benchm-ml (accessed on 12 April 2023)). XGB is fast compared to the other implementations of GB [68]. The details of the parameters that were selected for the GridSearchCV are shown in Table A3.

3.3.5. Accuracy Assessment

A confusion matrix is widely used for classification assessment to evaluate an algorithm’s performance. The confusion matrix routinely reports the number of true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN) (Table A4).

\mathrm{Recall} = \frac{TP}{TP + FN}

(2)

\mathrm{Precision} = \frac{TP}{TP + FP}

(3)

\mathrm{F1\text{-}score} = 2 \times \frac{\mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} = \frac{2 \times TP}{2 \times TP + FP + FN}

(4)

\mathrm{Accuracy} = \frac{TP + TN}{TP + FP + FN + TN}

(5)

Here, TP is the number of items for which both the ground truth data and the machine-learning algorithm indicate a positive label for the test data. FP is the number of items for which the true label was negative, but the algorithm incorrectly predicted it as being positive. FN is the number of items for which the true label was positive, but the algorithm incorrectly predicted it as being negative. Finally, TN is the number of items for which the true label and the predicted label are both negative. Based on these definitions, three criteria were selected to assess each class (Equations (2)–(4)). Moreover, one criterion was chosen to assess each method’s overall accuracy (Equation (5)), which is computed by dividing the number of correctly classified samples by the total number of samples [70].
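For a multi-class problem, these quantities can be read off a confusion matrix one class at a time, treating that class as positive and all others as negative. The sketch below does this for a hypothetical three-class matrix; the counts are invented for illustration and are not results from this study.

```python
import numpy as np

def per_class_metrics(cm):
    """cm[i, j] = number of samples of true class i predicted as class j."""
    cm = np.asarray(cm, dtype=float)
    tp = np.diag(cm)               # correctly labeled samples of each class
    fp = cm.sum(axis=0) - tp       # predicted as the class but truly another
    fn = cm.sum(axis=1) - tp       # truly the class but predicted as another
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    f1 = 2 * tp / (2 * tp + fp + fn)
    accuracy = tp.sum() / cm.sum()  # correct samples over all samples
    return precision, recall, f1, accuracy

# Hypothetical confusion matrix for three classes with 10 test samples each.
cm = [[8, 2, 0],
      [1, 7, 2],
      [0, 1, 9]]
precision, recall, f1, accuracy = per_class_metrics(cm)
```

In this example the per-class recalls are 0.8, 0.7, and 0.9, and the overall accuracy is 24/30 = 0.8.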

4. Analysis and Results

4.1. Extraction of Features from ASTER

It is necessary to evaluate the predictive capability of lithological features to obtain more accurate lithological maps because some features may have a negative effect on the ML models that have been generated. Moreover, an existing strong correlation between these features will also decrease the models’ performance. Figure 4A,B show the correlations between the selected features. As can be seen in Figure 4A, the intra-group correlation between VNIR bands is high (0.92–0.97). The intra-group correlations within the SWIR and TIR bands are also high (SWIR: 0.84–0.98; TIR: 0.98–1.00). The intra-group correlations for VNIR, SWIR, and TIR bands are shown respectively as 3 × 3, 6 × 6, and 5 × 5 blocks along the diagonal of the correlation matrix (Figure 4A); the inter-group correlations are the large, off-diagonal groups of estimates. The blocks of inter-group correlations, particularly between TIR (Bands 10–14) and the VNIR and SWIR bands, were mostly weak and negative (−0.38 to −0.52).

Since the number of features was comparatively large and the resulting visualization of the correlations between features was not clear, we rearranged the variables to emphasize the strongest groupings, as shown in Figure 4B. It should be noted that since the correlations between VNIR and SWIR bands were especially high, we selected five bands from the groups as being representative of the original VNIR (Band 1), SWIR (Bands 5 and 6), and TIR bands (Bands 10 and 11) among the indices that are featured in Figure 4B.

Figure 5 shows the FI (feature importance) values ranked from least (<0.02: chlorite, ASTER Band 3) to most (close to 0.06: mafic, quartz) important. ASTER Band 10 (TIR) is the third most important feature. Many of the remaining bands cluster at the center of this ranking (FI = 0.03), namely Bands 4, 7, 8, and 9 in the SWIR (shortwave infrared), and Bands 13 and 14 in the TIR (thermal infrared).

The two most important random forest features (Figure 5) are depicted in Figure 6A,B. Sar-Cheshmeh porphyry and quartz eye classes have low values of the mafic index, while granodiorite and andesite have relatively high values of the mafic index. Granodiorite and andesite also have comparatively high values of the quartz index. Two typical image transform algorithms were applied to the data, including principal component analysis (PCA) [71] and independent component analysis (ICA) [72,73], and compared with the two most important features (mafic and quartz). In addition to improving images, these techniques reduce the spectral redundancy of the image [74]. Whether these transformations are suitable for lithological mapping can be further explored. Using PCA, the raw, frequently intercorrelated variables in the high-dimension multivariate dataset were reduced to a smaller number of more easily interpretable orthogonal (uncorrelated) composite variables or components [75].
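The way PCA turns intercorrelated variables into uncorrelated components can be sketched with a singular value decomposition of the centered data. The six synthetic "bands" generated from two latent signals are an assumption for illustration and do not represent the ASTER scene.

```python
import numpy as np

def pca(X, n_components):
    """Project the centered data onto its leading principal axes via SVD."""
    Xc = X - X.mean(axis=0)                       # center every variable
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ Vt[:n_components].T             # component scores per sample
    explained = (S ** 2) / np.sum(S ** 2)         # variance fraction per component
    return scores, explained[:n_components]

# Six correlated variables built from two latent signals plus small noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 2))
mixing = rng.normal(size=(2, 6))
X = latent @ mixing + 0.05 * rng.normal(size=(500, 6))

scores, explained = pca(X, n_components=3)
score_cov = np.cov(scores, rowvar=False)          # off-diagonal entries vanish
```

The covariance matrix of the component scores is diagonal, which is the sense in which the components are uncorrelated, and the variance fractions are ordered from largest to smallest.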

ICA assumes that at most one of the sub-components comprising the mixed signal is Gaussian, and it uses higher-order statistics to separate signals and extract features. The default calculation of ICA yields a number of components that can (should) be equal to the number of source variables. The relative importance of these components is difficult to determine, given that they do not have a ranking [75]. Yet, when compared to PCA, ICA can provide more spectral information, which could enhance lithological differentiation in the geological context of remote sensing [74]. The RGB image compositions of PCA and ICA are shown in Figure 7A,B, respectively. Red, green, and blue designate respectively PC1, PC2, and PC6 (Figure 7A). According to the analysis, PC1, PC6, and PC2 have the highest RF FI, respectively. In Figure 7B, IC1 is R, IC2 is G, and IC4 is B. While PCA does not clearly distinguish the litho-contacts at unit boundaries, ICA performs well. The results presented in Figure 7 suggest that biotite dike and andesite can be better contrasted using the band combinations that were obtained using both ICA and PCA transformations.

4.2. Training Sample Selection

The 7338 samples, representing nine lithology types, were designated as testing areas. The testing area covered about 35% of all samples. Moreover, the accuracy of the training samples was validated by field observation and the analysis of microscopic thin sections. The number of test samples for each class is given in Table 2. Note that andesite and feldspar dike have the highest and lowest numbers of test samples, respectively. The training area was introduced to the RF, SVM, GB, XGB, and deep-learning ANN classifiers. The scenarios that are referred to in the table are the bands alone (Scenario 1 or S1) and the bands plus other features (Scenario 2 or S2).

4.3. Rock Type Classifications and Accuracy Analysis

The results of lithological classification for the Sar-Cheshmeh copper mining region using the ML algorithms and ASTER imagery are shown in Figure 8. Regarding the geology map of the study area (see Figure 8A), the image maps that were derived from SVM and ANN algorithms (see Figure 8C,F) display better qualitative matches to each class of lithology compared to other ML algorithms.

The five models and their two associated scenarios exhibited similar overall accuracies (Table 2), ranging from a high of 0.85 (SVM-S1) to a low of 0.79 (GB-S2), with an average (±SD) of 0.82 ± 0.02. Scenario 1 differs from Scenario 2 in that the former includes only the remote-sensing dataset, while the latter adds the features. All ML algorithms predict the quartz eye with an acceptable accuracy, especially RF and XGB. The overall accuracy of lithological classification using bands and features was 0.84, 0.85, 0.80, 0.82, and 0.82 for RF, SVM, GB, XGB, and deep-learning ANN, respectively. The overall classification accuracy after applying RF’s FI was 0.83, 0.84, 0.79, 0.80, and 0.80 for RF, SVM, GB, XGB, and deep-learning ANN, respectively (see Table 2). The results showed that classification using RF FI is slightly less accurate than classification using all features. The difference between the two input datasets is no more than about 2% for all algorithms. Overall, RF and SVM had the highest precision, recall, and F1-score in nearly all classes under either scenario.

Among the nine classes, the prediction of alluvium using the RF model with RF’s FI as the input exhibits the greatest precision (Scenario 2: 0.98), despite a sample size of only 131. Furthermore, the RF model with RF’s FI predicts alluvium better than the other models and their respective scenarios (Table 2). The worst estimated precision is exhibited by the biotite dike (Scenario 1: 0.20) for predictions made with the ANN (Table 2, n = 87). The two aforementioned mineral classes have low sample sizes, which may account for their respective performances, but at least they could be estimated in terms of precision, recall, and F1-scores. Feldspar dike exhibited the absolute worst performance in that these metrics could not be computed, given its sample size (n = 13). Thus, variation in sample size may be a crucial determinant.

Setting aside sample size (n), and despite high average correlations among the metrics, individual measurements of precision and recall may be more informative than accuracy in determining the performance of the models and their associated scenarios. Indeed, the mapping of alluvium, granodiorite, and andesite exhibited the best precision according to their rankings (precision > 0.9 for almost all algorithms; see Table 2). Granodiorite and andesite likewise exhibited the greatest recall (Friedman test: p < 0.0001), while granodiorite had the highest overall mean rank for F1-scores (Friedman test: p < 0.0001). Feldspar dike consistently exhibited the poorest performance across model scenarios for all three metrics.
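The per-class metrics discussed above can be derived directly from a confusion matrix. The sketch below is a minimal illustration under stated assumptions: rows are taken as true classes and columns as predicted classes, the three-class matrix is invented (its rare third class has 13 samples purely for illustration), and the function name `per_class_metrics` is hypothetical. It shows why precision and F1 become undefined for a class that a model never predicts.

```python
import numpy as np


def per_class_metrics(conf):
    """Precision, recall and F1 per class from a square confusion matrix.

    Rows are true classes and columns are predicted classes (an assumed
    convention). Undefined ratios (0/0) are returned as NaN.
    """
    conf = np.asarray(conf, dtype=float)
    tp = np.diag(conf)
    predicted = conf.sum(axis=0)  # column totals: pixels assigned to each class
    actual = conf.sum(axis=1)     # row totals: reference pixels in each class
    with np.errstate(divide="ignore", invalid="ignore"):
        precision = tp / predicted
        recall = tp / actual
        f1 = 2 * precision * recall / (precision + recall)
    overall_accuracy = tp.sum() / conf.sum()
    return precision, recall, f1, overall_accuracy


# Invented 3-class example; the third class is rare and is never predicted.
conf = np.array([
    [90, 10, 0],
    [5, 95, 0],
    [6, 7, 0],
])
precision, recall, f1, oa = per_class_metrics(conf)
print("precision:", precision)
print("recall:", recall)
print("F1:", f1)
print("overall accuracy:", oa)
```

In this toy case the overall accuracy stays high even though the rare class is entirely missed, which is one reason per-class precision and recall can be more informative than accuracy alone.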

In terms of the consistency of the precision estimates within mineral classes (omitting the feldspar dike), the ranking among the model-scenarios (10 categories) was significant and moderately to strongly concordant (Kendall’s W = 0.736, p < 0.0001). Indeed, model precision (mean ranks) could be ordered across the mineral classes as follows: RF-S2 = RF-S1 > SVM-S1 = SVM-S2 > XGB-S1 > XGB-S2 = GB-S1 = GB-S2 > ANN-S1 > ANN-S2. In other words, RF (Scenario 2) performed the best, while ANN (Scenario 2) performed the worst in terms of overall model precision; with the exception of RF, Scenario 1 gave the better performance for the other models. Ranking recall estimates in the same manner (within mineral classes across models) results in a different and much less concordant (W = 0.590, p < 0.0001) ordering of model scenarios: SVM-S2 = SVM-S1 > ANN-S1 = RF-S1 > RF-S2 = XGB-S1 > ANN-S2 > GB-S1 > XGB-S2 = GB-S2. With respect to F1-scores, there is almost perfect agreement in the ordering of average ranks (W = 0.969, p < 0.0001) among the mineral classes. The F1-scores of the model scenarios decreased from a high of SVM-S1 as follows: SVM-S1 = SVM-S2 > RF-S1 = RF-S2 > XGB-S1 = ANN-S1 > ANN-S2 = GB-S1 = XGB-S2 = GB-S2; again, a better performance was observed for Scenario 1.
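The concordance analysis can be illustrated with a short sketch. Here each lithological class acts as a "judge" that ranks the model-scenarios by a metric, Kendall's W measures how well those rankings agree, and the Friedman test supplies the p-value. The precision values below are simulated, the function `kendalls_w` is a hypothetical helper without tie correction, and SciPy's `rankdata` and `friedmanchisquare` are assumed as the underlying routines; the statistical software actually used in this study is not stated.

```python
import numpy as np
from scipy.stats import friedmanchisquare, rankdata


def kendalls_w(scores):
    """Kendall's W for an (m judges x n items) matrix, without tie correction."""
    scores = np.asarray(scores, dtype=float)
    m, n = scores.shape
    ranks = rankdata(scores, axis=1)  # rank the items within each judge
    rank_sums = ranks.sum(axis=0)
    s = ((rank_sums - rank_sums.mean()) ** 2).sum()
    return 12.0 * s / (m ** 2 * (n ** 3 - n))


# Simulated precision values: rows are lithological classes (judges),
# columns are model-scenarios (items). All numbers are made up.
rng = np.random.default_rng(1)
model_effect = np.linspace(0.70, 0.90, 5)
precision = model_effect + rng.normal(scale=0.03, size=(8, 5))

w = kendalls_w(precision)
stat, p_value = friedmanchisquare(*precision.T)
print(f"Kendall's W = {w:.3f}, Friedman chi-square = {stat:.2f}, p = {p_value:.4g}")
```

Without ties, the two statistics are linked by chi-square = m(n - 1)W, where m is the number of judges and n the number of ranked items, so W can be read as a normalized Friedman statistic ranging from 0 (no agreement) to 1 (identical rankings).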

Figure 8 summarizes the results of the ML algorithms without considering RF FI. This visualization map showed that SVM was better trained than the other ML algorithms. Figure 9 shows the evaluation criteria of the testing samples by utilizing SVM and all bands and features as the input.

4.4. Assessment of Specific Spectral Regions as an Input to ML Algorithms

The results of utilizing specific spectral regions as input data to RF and SVM can be seen in Table 3 and Table 4, respectively. The ASTER VNIR bands showed a low capability of mapping lithological units in RF, whereas the ASTER SWIR and TIR bands showed a greater potential than the VNIR bands. The overall accuracy of ASTER TIR bands is higher than that of ASTER SWIR bands using RF (Table 3). The individual performance metrics for the RF model displayed increasing concordance across the bands based on their respective rankings in each mineral class, equal to 0.799 for precision, 0.827 for recall, and 0.901 for F1-scores. Consistent with expectations, the performance among the mineral classes also differed within each band. The precision was the highest in granodiorite and lowest in hornblende dike, despite the lower degree of concordance exhibited by this metric. Recall was the highest for andesite and lowest for the biotite dike, very consistently across bands. A similar degree of concordance across bands was exhibited by F1-scores.

SVM results showed a very similar accuracy for the three different ASTER spectral regions, although TIR was about 5% lower than the VNIR and SWIR estimates (Table 4). Despite having the same accuracy (0.61), VNIR (0.61) was judged to be slightly better than SWIR, as well as TIR (0.59). For a more objective determination of accuracy, each of the three performance metrics was ranked across the three bands and over the nine separate mineral classes. Mean rank precision, recall, and F1-scores could not be distinguished among VNIR, SWIR, and TIR responses (p ≥ 0.823). Given their zero performance metrics under SVM, we determined whether significant differences among bands remained undetected following the removal of andesite, together with the biotite and feldspar dike classes. The reanalysis of the remaining ranked mean metrics did not reveal any underlying differences in the bands.
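A comparison of spectral regions of this kind can be set up by training the same classifier on different column subsets. The following sketch assumes that the 14 ASTER nadir bands are stacked in their usual order (bands 1 to 3 VNIR, 4 to 9 SWIR, 10 to 14 TIR); the data are synthetic, and the dictionary `REGIONS`, the RF settings, and the 35% test split are illustrative choices rather than the configuration used for Tables 3 and 4.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# 0-based column indices of each spectral region in the assumed band stack.
REGIONS = {
    "VNIR": list(range(0, 3)),
    "SWIR": list(range(3, 9)),
    "TIR": list(range(9, 14)),
}

# Synthetic placeholder data: 600 pixels x 14 bands, 3 made-up classes.
rng = np.random.default_rng(2)
y = rng.integers(0, 3, size=600)
X = rng.normal(size=(600, 14)) + y[:, None] * np.linspace(0.2, 1.0, 14)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.35, stratify=y, random_state=0
)

# Train one model per spectral region and score it on the same test pixels.
region_accuracy = {}
for name, cols in REGIONS.items():
    model = RandomForestClassifier(n_estimators=100, random_state=0)
    model.fit(X_train[:, cols], y_train)
    predictions = model.predict(X_test[:, cols])
    region_accuracy[name] = accuracy_score(y_test, predictions)

for name, acc in region_accuracy.items():
    print(f"{name}: overall accuracy = {acc:.2f}")
```

The accuracies printed here reflect only how the synthetic data were generated and say nothing about real ASTER bands. Keeping the train/test split fixed across regions ensures that differences in accuracy come from the inputs and not from sample selection.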

4.5. Effect of the Number of Training Samples in ML Algorithms on Overall Accuracy

The number of training data samples is important in determining classification accuracy [76]. It is crucial to obtain optimum classification results using the appropriate number of training samples [77]. In most cases, ML algorithms require a suitable number of training samples, so the performance of different ML algorithms when the training dataset is reduced is worth considering. This analysis investigated the effects of the number of testing samples (training and testing datasets) on the precision, recall, F1-score, and overall accuracy when testing includes 15%, 20%, 25%, 30%, 35%, and 40% of the dataset as an example of ML algorithm performance (Table 5).

The overall accuracy of RF and SVM for different testing sample sizes was calculated for both testing and training datasets (Figure 10A,B). Generally, there was a slight overall reduction in the testing accuracy of about 1.2 percentage points when the testing sample size increased (from 84.9% to 83.7%). The overall accuracy for all RF training models shows a maximum value of 1 (Figure 10A), meaning that the model predictions and the original lithology classes are identical. Thus, the model has achieved a reasonably good (although not the best possible) understanding of the training dataset.

It is essential to remember, however, that a very high training accuracy does not necessarily mean that a model is good [78]. The test accuracy of all models differs from the training accuracy by roughly 15%, which can be a sign of overfitting in this model. Furthermore, the effects of testing sample size on the overall accuracy of both training and testing using SVM can be seen in Figure 10B. Figure 10B shows that the overall accuracy of training increased as the testing sample size increased. However, the testing accuracy tends to be lower with a larger testing sample size.
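The experiment described in this section can be sketched as a loop over test fractions that records both training and testing accuracy. The example below uses a synthetic dataset from scikit-learn's `make_classification` and an RF with arbitrary settings, so the numbers it prints are not comparable to Table 5 or Figure 10; only the test fractions (15% to 40%) are taken from the text.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the labelled pixels (not the data of this study).
X, y = make_classification(
    n_samples=1500, n_features=14, n_informative=8,
    n_classes=4, n_clusters_per_class=1, random_state=0,
)

test_fractions = [0.15, 0.20, 0.25, 0.30, 0.35, 0.40]
results = []
for frac in test_fractions:
    # Stratified split keeps the class proportions similar in both subsets.
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=frac, stratify=y, random_state=0
    )
    rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
    results.append((frac, rf.score(X_tr, y_tr), rf.score(X_te, y_te)))

for frac, train_acc, test_acc in results:
    gap = train_acc - test_acc
    print(f"test={frac:.2f}  train={train_acc:.3f}  test={test_acc:.3f}  gap={gap:.3f}")
```

The gap between training and testing accuracy is the quantity to watch: an unpruned random forest typically fits its training data almost perfectly, so a training accuracy near 1 by itself says little about how well the model generalizes.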

4.6. DEM Assessment as an Additional Feature to Input Data

This section assesses the effects of adding a digital elevation model (DEM) as an input to the ML algorithms. Generally, adding DEM slightly improved the overall accuracy (Figure 11). The most notable results were for RF, which was improved by about 1.6% compared to the other algorithms.

5. Discussion

One factor that must be considered in lithological mapping is the specific spectral reflectance that is associated with each mineral. In other words, lithological units consist of a mixture of spectral reflections from minerals that make up their composition [79,80]. Pixel size is yet another factor that must be considered when classifying lithological units. It should be noted that different datasets generate pixels of different sizes. Even in a single sensor, different spectral regions have different spatial resolutions. In this study, ASTER bands were resampled to the resolution of VNIR bands (i.e., 15 m). According to the results of FI, the TIR bands and the indices that were associated with these bands have higher importance. However, the TIR resolution is coarser than that of other spectral regions (e.g., VNIR and SWIR). Including thermal data at a lower resolution may lead to more accurate lithological mapping. Yet, it should be noted that some classes were narrow and elongated in the study area. Classifying these classes may therefore be difficult because, at low spatial resolution, two or more lithological classes mix in one pixel. High spatial resolution can equally pose a problem owing to too many detailed objects [81]. Another parameter that can affect the overall accuracy of ML algorithms is how training and testing samples are selected. There was some uncertainty in selecting training samples based on the visual interpretation of geological maps [82]. Yet, the training and testing samples were chosen randomly in this study. In addition, the training of the models was repeated with different selections of training samples, and the accuracy was averaged across these runs to ensure that the overall accuracy reported in this study is stable.

This study investigated the accuracy of five ML algorithms (RF, SVM, GB, XGB, and a deep-learning ANN) to map lithological units in the Sar-Cheshmeh copper mining region and compared their overall accuracy to one another. A comparison with other studies of ML algorithms in lithological mapping has been made herein. Shebl et al. [83] utilized Sentinel-2 multispectral data and radiometric data to assess the potential of SVM to classify 13 lithological classes in Egypt, including igneous, metamorphic, and sedimentary rocks. Their dataset contained from 955 to 3397 observations per class for training the model. They reported an overall accuracy of between 0.756 and 0.857. Nugroho et al. [84] investigated the potential of several forms of remote-sensing imagery, including Sentinel-2, ALOS PALSAR, and DEM, together with geophysical data (magnetic and electromagnetic), to map lithology in Indonesia using an RF algorithm. Their number of training samples per class was between 14 and 337. They reported an accuracy of 0.73 to 0.81 for lithology classes. Bachri et al. [82] utilized several forms of remote-sensing data, including Landsat 8 OLI, DEM, and ALOS PALSAR, to assess lithological mapping in Morocco. They reported an overall accuracy of 0.85 using the SVM ML algorithm. Manap and San [81] reported that adding SAR and DEM data improved the model’s overall accuracy by roughly 10%. They also reported that SVM and ANN were more accurate than RF. In the current study, SVM outperformed other ML algorithms with respect to overall accuracy in the Sar-Cheshmeh copper mining region. Adding DEM could not significantly improve the overall accuracy of ML algorithms (<2%).

According to Table 2, the number of training samples for minor and major classes is considerably different in this study, so the imbalance ratio is quite high. The number of features also reached 33 (both bands and indices). This clarifies that in this study, a complicated problem was confronted, as seen from the accuracy that was reported for the class of feldspar dike by almost all models given that it had the lowest number of sampling data points. Considering the above conditions, it is clear that the results of all ML algorithms provided a relatively high accuracy. One way to improve the accuracy of the ML algorithms in this study is simply by adding more data to train the model. It also should be noted that adding data is not always the best case in ML algorithms. Adding additional data, such as geophysical measurements, may improve the accuracy, as has been well stated by Nugroho et al. [84]. One limitation of such ML algorithms is the presence of vegetation in the area, affecting the spectral information fed back to the sensor and, therefore, classification accuracy [85]. However, the case study region in this analysis was arid, lacking extensive vegetation. The presence of vegetation in the study area can be addressed by utilizing synthetic aperture radar (SAR) coverage (depending on the type and height of vegetation).

6. Conclusions

This study used five ML algorithms, namely RF, SVM, GB, XGB, and a deep-learning ANN together with ASTER multispectral datasets, to evaluate the accuracy of lithological mapping over the Sar-Cheshmeh copper mining region in southeast Iran. Two scenarios were considered in this study. First, ASTER spectral bands and several features were provided as the input data to the models. We then applied the RF’s FI to all features. Considering the features with an absolute correlation greater than 0.90, those with less importance were removed, and the remaining features were provided as the models’ input data. Among the selected ML algorithms in this study, the SVM model has a higher accuracy than all other classification models in lithological mapping. The overall accuracy of the SVM model was 85%. The results of RF FI showed that ASTER TIR data have greater importance than other ASTER bands. The results also showed that combining all features without considering RF’s FI offered slightly better classification accuracy. The overall accuracy of lithological mapping using all bands and features revealed that SVM has the highest overall accuracy (0.85) compared to other ML algorithms. The results further showed that adding additional information, such as DEM (digital elevation model), can slightly improve the overall accuracy. Increasing the testing size also can lead to a decrease in the test’s overall accuracy. Among all classes, alluvium was detected well, while feldspar dike exhibited a lower accuracy. The results showed that ML algorithms can map lithology by utilizing ASTER data, which are significantly cost-effective, saving both time and resources in fieldwork. Nevertheless, a definite statement might require some ground observations.
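The feature-selection step summarized above (removing the less important member of each pair of features whose absolute correlation exceeds 0.90) can be sketched as follows. This is one possible reading of the procedure, not the exact implementation used in this study: the greedy ordering, the function name `drop_correlated_by_importance`, and the three synthetic features are all assumptions made for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def drop_correlated_by_importance(X, importances, threshold=0.90):
    """Return the column indices kept after pruning highly correlated pairs.

    Columns are visited from most to least important, and a column is kept
    only if its |Pearson r| with every already-kept column is at most
    `threshold`. In each correlated pair the less important column is dropped.
    """
    corr = np.abs(np.corrcoef(X, rowvar=False))
    order = np.argsort(importances)[::-1]  # most important first
    kept = []
    for j in order:
        if all(corr[j, k] <= threshold for k in kept):
            kept.append(int(j))
    return sorted(kept)


# Synthetic example: f1 is a near-duplicate of f0, and f2 is unrelated noise.
rng = np.random.default_rng(3)
n = 500
y = rng.integers(0, 2, size=n)
f0 = y + rng.normal(scale=0.5, size=n)
f1 = f0 + rng.normal(scale=0.05, size=n)
f2 = rng.normal(size=n)
X = np.column_stack([f0, f1, f2])

rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
kept = drop_correlated_by_importance(X, rf.feature_importances_)
print("importances:", rf.feature_importances_)
print("kept columns:", kept)
```

Visiting features in order of decreasing importance means that a feature already discarded can never cause another one to be dropped, which avoids removing both members of a correlated pair.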

Author Contributions

Conceptualization, H.B., P.E., S.H. and A.B.P.; methodology, H.B., P.E., S.H., A.B.P., K.C. and A.B.; software, H.B.; validation, H.B., S.H. and K.C.; formal analysis, H.B., A.B.P. and S.H.; resources, A.B. and A.B.P.; writing – original draft preparation, H.B., S.H. and P.E.; writing – review and editing, H.B., S.H., A.B.P., K.C., P.E. and A.B.; visualization, H.B. and A.B.P.; supervision, S.H., A.B.P. and K.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Data are contained within the article.

Acknowledgments

We express our profound gratitude to the National Iranian Copper Industries Company (NICICO) for providing the ground data essential to conducting this research.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

Table A1. GridSearch parameters used in the RF model.

Parameters	Description	Grid Search Values
n_estimators	No. of trees in the forest	5, 10, 50, 75, 100, 200, 500, 1000
max_depth	The maximum depth of the trees	2, 3, 5, 10, 15, 20, 50
min_samples_split	The minimum number of samples required to split an internal node	2, 3, 5, 8, 10, 15, 20
min_samples_leaf	The minimum number of samples required to be at a leaf node	1, 2, 3, 5, 8, 10

Table A2. GridSearchCV parameters set for the SVM.

Parameters	Description	Grid Search Values
Kernel	Specifies the kernel type to be used in the algorithm	‘linear’, ‘rbf’
Gamma	Kernel coefficient for ‘rbf’, ‘poly’ and ‘sigmoid’	0.001, 0.01, 0.1, 0.2, 0.3, 0.5, 0.8, 1, 3
C	Penalty parameter	1, 3, 5, 10, 20, 50, 100, 500
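As an illustration of how the grid in Table A2 might be passed to scikit-learn's `GridSearchCV`, the sketch below runs the search on a small synthetic dataset. The grid values are copied from Table A2; everything else (the `StandardScaler` step, the three-fold cross-validation, accuracy scoring, and the synthetic data) is an assumption of this example and not a description of the settings used in this study.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Grid values copied from Table A2; the "svc__" prefix addresses the SVC
# step inside the pipeline.
param_grid = {
    "svc__kernel": ["linear", "rbf"],
    "svc__gamma": [0.001, 0.01, 0.1, 0.2, 0.3, 0.5, 0.8, 1, 3],
    "svc__C": [1, 3, 5, 10, 20, 50, 100, 500],
}

# Small synthetic dataset so that the search finishes quickly.
X, y = make_classification(
    n_samples=150, n_features=14, n_informative=6,
    n_classes=3, n_clusters_per_class=1, random_state=0,
)

# Scaling inside the pipeline is refit on each training fold, which keeps
# the validation folds out of the scaling statistics.
search = GridSearchCV(
    make_pipeline(StandardScaler(), SVC()),
    param_grid=param_grid, cv=3, scoring="accuracy",
)
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```

Because the linear kernel ignores `gamma`, a single dictionary repeats identical linear fits for every `gamma` value; passing a list of two dictionaries (one per kernel) would avoid the redundant combinations.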

Table A3. GridSearch parameters for the GB and XGB algorithms.

Parameters	Description	Grid Search Values
learning_rate	Shrinks the contribution of each tree	0.001, 0.005, 0.01, 0.05, 0.1, 0.15, 0.3, 0.5, 1
n_estimators	The number of boosting stages to conduct	10, 25, 50, 70, 100, 200, 500, 1000, 2000
max_depth	Limits the number of nodes in the tree	2, 3, 5, 7, 10, 15, 20, 25
max_features	The number of features to consider when searching for the best split	‘auto’, ‘sqrt’, ‘log2’

Table A4. A confusion matrix example.

		True Condition
	Total Population	Condition Positive	Condition Negative
Predicted Condition	Predicted Condition Positive	True Positive	False Positive
Predicted Condition	Predicted Condition Negative	False Negative	True Negative

References

El-Omairi, M.A.; El Garouani, A. A review on advancements in lithological mapping utilizing machine learning algorithms and remote sensing data. Heliyon 2023, 9, 20168. [Google Scholar] [CrossRef]
Cardoso-Fernandes, J.; Teodoro, A.C.; Lima, A.; Perrotta, M.; Roda-Robles, E. Detecting Lithium (Li) mineralizations from space: Current research and future perspectives. Appl. Sci. 2020, 10, 1785. [Google Scholar] [CrossRef]
Sabins, F.F. Remote sensing for mineral exploration. Ore Geol. Rev. 1999, 14, 157–183. [Google Scholar] [CrossRef]
Peyghambari, S.; Zhang, Y. Hyperspectral remote sensing in lithological mapping, mineral exploration, and environmental geology: An updated review. J. Appl. Remote Sens. 2021, 15, 031501. [Google Scholar] [CrossRef]
Pour, A.B.; Rahmani, O.; Parsa, M. Multispectral Remote Sensing Satellite Data for Mineral and Hydrocarbon Exploration: Big Data Processing and Deep Fusion Learning Techniques; MDPI: Basel, Switzerland, 2023. [Google Scholar]
Pour, A.B.; Zoheir, B.; Pradhan, B.; Hashim, M. Multispectral and Hyperspectral Remote Sensing Data for Mineral Exploration and Environmental Monitoring of Mined Areas; MDPI: Basel, Switzerland, 2021. [Google Scholar]
Abd El-Wahed, M.; Zoheir, B.; Pour, A.B.; Kamh, S. Shear-related gold ores in the Wadi Hodein Shear Belt, South Eastern Desert of Egypt: Analysis of remote sensing, field and structural data. Minerals 2021, 11, 474. [Google Scholar] [CrossRef]
Qian, S.-E. Hyperspectral Satellites and System Design; CRC Press: Boca Raton, FL, USA, 2020. [Google Scholar]
Clark, R.N.; Swayze, G.A.; Livo, K.E.; Kokaly, R.F.; Sutley, S.J.; Dalton, J.B.; McDougal, R.R.; Gent, C.A. Imaging spectroscopy: Earth and planetary remote sensing with the USGS Tetracorder and expert systems. J. Geophys. Res. Planets 2003, 108, 5131. [Google Scholar] [CrossRef]
Chen, Y.; Wang, Y.; Zhang, F.; Dong, Y.; Song, Z.; Liu, G. Remote sensing for lithology mapping in vegetation-covered regions: Methods, challenges, and opportunities. Minerals 2023, 13, 1153. [Google Scholar] [CrossRef]
Ye, B.; Tian, S.; Ge, J.; Sun, Y. Assessment of WorldView-3 data for lithological mapping. Remote Sens. 2017, 9, 1132. [Google Scholar] [CrossRef]
Li, N.; Huang, X.; Zhao, H.; Qiu, X.; Deng, K.; Jia, G.; Li, Z.; Fairbairn, D.; Gong, X. A combined quantitative evaluation model for the capability of hyperspectral imagery for mineral mapping. Sensors 2019, 19, 328. [Google Scholar] [CrossRef] [PubMed]
Bedell, R. Geological mapping with ASTER satellite: New global satellite data that is a significant leap in remote sensing geologic and alteration mapping. Spec. Publ. Geol. Soc. Nevada 2001, 33, 329–334. [Google Scholar]
Pour, A.B.; Hashim, M. Identification of hydrothermal alteration minerals for exploring of porphyry copper deposit using ASTER data, SE Iran. J. Asian Earth Sci. 2011, 42, 1309–1323. [Google Scholar] [CrossRef]
Pour, A.B.; Sekandari, M.; Rahmani, O.; Crispini, L.; Läufer, A.; Park, Y.; Hong, J.K.; Pradhan, B.; Hashim, M.; Hossain, M.S. Identification of phyllosilicates in the antarctic environment using ASTER satellite data: Case study from the Mesa Range, Campbell and Priestley Glaciers, Northern Victoria Land. Remote Sens. 2020, 13, 38. [Google Scholar] [CrossRef]
Shirazi, A.; Hezarkhani, A.; Beiranvand Pour, A.; Shirazy, A.; Hashim, M. Neuro-Fuzzy-AHP (NFAHP) technique for copper exploration using Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) and geological datasets in the Sahlabad mining area, east Iran. Remote Sens. 2022, 14, 5562. [Google Scholar] [CrossRef]
Yousefi, M.; Tabatabaei, S.H.; Rikhtehgaran, R.; Pour, A.B.; Pradhan, B. Application of Dirichlet process and support vector machine techniques for mapping alteration zones associated with porphyry copper deposit using ASTER remote sensing imagery. Minerals 2021, 11, 1235. [Google Scholar] [CrossRef]
Pour, A.B.; Hashim, M. The application of ASTER remote sensing data to porphyry copper and epithermal gold deposits. Ore Geol. Rev. 2012, 44, 1–9. [Google Scholar] [CrossRef]
Clark, R.N.; King, T.V.; Klejwa, M.; Swayze, G.A.; Vergo, N. High spectral resolution reflectance spectroscopy of minerals. J. Geophys. Res. Solid Earth 1990, 95, 12653–12680. [Google Scholar] [CrossRef]
Hunt, G.R. Spectral signatures of particulate minerals in the visible and near infrared. Geophysics 1977, 42, 501–513. [Google Scholar] [CrossRef]
Clark, R.N. Spectroscopy of Rocks and Minerals, and Principles of Spectroscopy; Wiley: Hoboken, NJ, USA, 2020. [Google Scholar]
Cloutis, E.A.; Hawthorne, F.C.; Mertzman, S.A.; Krenn, K.; Craig, M.A.; Marcino, D.; Methot, M.; Strong, J.; Mustard, J.F.; Blaney, D.L. Detection and discrimination of sulfate minerals using reflectance spectroscopy. Icarus 2006, 184, 121–157. [Google Scholar] [CrossRef]
Crowley, J.K.; Vergo, N. Near-infrared reflectance spectra of mixtures of kaolin-group minerals: Use in clay mineral studies. Clays Clay Miner. 1988, 36, 310–316. [Google Scholar] [CrossRef]
Beiranvand Pour, A.; Park, Y.; Crispini, L.; Läufer, A.; Kuk Hong, J.; Park, T.-Y.S.; Zoheir, B.; Pradhan, B.; Muslim, A.M.; Hossain, M.S. Mapping listvenite occurrences in the damage zones of Northern Victoria Land, Antarctica using ASTER satellite remote sensing data. Remote Sens. 2019, 11, 1408. [Google Scholar] [CrossRef]
Ninomiya, Y.; Fu, B. Regional lithological mapping using ASTER-TIR data: Case study for the Tibetan Plateau and the surrounding area. Geosciences 2016, 6, 39. [Google Scholar] [CrossRef]
Salisbury, J.; D’Aria, D. Emissivity of terrestrial materials in the 8–14 μm atmospheric window. SPIE Milest. Ser. MS 1997, 134, 481–504. [Google Scholar] [CrossRef]
Ninomiya, Y.; Fu, B. Thermal infrared multispectral remote sensing of lithology and mineralogy based on spectral properties of materials. Ore Geol. Rev. 2019, 108, 54–72. [Google Scholar] [CrossRef]
Adcock, C.T.; Haber, D.A.; Burnley, P.C.; Malchow, R.L.; Hausrath, E.M. Modeling gamma radiation exposure rates using geologic and remote sensing data to locate radiogenic anomalies. J. Environ. Radioact. 2019, 208, 106038. [Google Scholar] [CrossRef]
Kumar, C. Developing Innovative Spectral and Machine Learning Methods for Mineral and Lithological Classification Using Multi-Sensor Datasets; Michigan Technological University: Houghton, MI, USA, 2020. [Google Scholar]
Shanmugam, S.; SrinivasaPerumal, P. Spectral matching approaches in hyperspectral image processing. Int. J. Remote Sens. 2014, 35, 8217–8251. [Google Scholar] [CrossRef]
Thompson, S.; Fueten, F.; Bockus, D. Mineral identification using artificial neural networks and the rotating polarizer stage. Comput. Geosci. 2001, 27, 1081–1089. [Google Scholar] [CrossRef]
Waske, B.; Benediktsson, J.A.; Árnason, K.; Sveinsson, J.R. Mapping of hyperspectral AVIRIS data using machine-learning algorithms. Can. J. Remote Sens. 2009, 35, S106–S116. [Google Scholar] [CrossRef]
Cracknell, M.J.; Reading, A.M. Geological mapping using remote sensing data: A comparison of five machine learning algorithms, their response to variations in the spatial distribution of training data and the use of explicit spatial information. Comput. Geosci. 2014, 63, 22–33. [Google Scholar] [CrossRef]
Zuo, R. Machine learning of mineralization-related geochemical anomalies: A review of potential methods. Nat. Resour. Res. 2017, 26, 457–464. [Google Scholar] [CrossRef]
Mohri, M.; Rostamizadeh, A.; Talwalkar, A. Foundations of Machine Learning; MIT Press: Cambridge, MA, USA, 2018. [Google Scholar]
Kanevski, M.; Pozdnoukhov, A.; Timonin, V. Machine Learning for Spatial Environmental Data: Theory, Applications, and Software; EPFL Press: Lausanne, Switzerland, 2009. [Google Scholar]
Gupta, P.; Venkatesan, M. Mineral identification using unsupervised classification from hyperspectral data. In Emerging Research in Data Engineering Systems and Computer Communications, Proceedings of the CCODE 2019, Islamabad, Pakistan, 6–7 March 2019; Springer: Singapore, 2020; pp. 259–268. [Google Scholar]
He, J.; Harris, J.; Sawada, M.; Behnia, P. A comparison of classification algorithms using Landsat-7 and Landsat-8 data for mapping lithology in Canada’s Arctic. Int. J. Remote Sens. 2015, 36, 2252–2276. [Google Scholar] [CrossRef]
Haykin, S. Neural Networks: A Comprehensive Foundation; Prentice Hall PTR: Bridge, NJ, USA, 1998. [Google Scholar]
Bahrami, H.; McNairn, H.; Mahdianpari, M.; Homayouni, S. A Meta-Analysis of Remote Sensing Technologies and Methodologies for Crop Characterization. Remote Sens. 2022, 14, 5633. [Google Scholar] [CrossRef]
Hastie, T.; Tibshirani, R.; Friedman, J.H. The Elements of Statistical Learning: Data Mining, Inference, and Prediction; Springer: Berlin/Heidelberg, Germany, 2009; Volume 2. [Google Scholar]
Ridgeway, G. Generalized Boosted Models: A guide to the gbm package. Update 2007, 1, 2007. [Google Scholar]
Boehmke, B.; Greenwell, B.M. Hands-On Machine Learning with R; CRC Press: Boca Raton, FL, USA, 2019. [Google Scholar]
Dangeti, P. Statistics for Machine Learning; Packt Publishing, Ltd.: Birmingham, UK, 2017. [Google Scholar]
Xiong, Y.; Zuo, R.; Carranza, E.J.M. Mapping mineral prospectivity through big data analytics and a deep learning algorithm. Ore Geol. Rev. 2018, 102, 811–817. [Google Scholar] [CrossRef]
Elahi, F.; Muhammad, K.; Din, S.U.; Khan, M.F.A.; Bashir, S.; Hanif, M. Lithological Mapping of Kohat Basin in Pakistan Using Multispectral Remote Sensing Data: A Comparison of Support Vector Machine (SVM) and Artificial Neural Network (ANN). Appl. Sci. 2022, 12, 12147. [Google Scholar] [CrossRef]
Pour, B.; Hashim, M.; Marghany, M. Using spectral mapping techniques on short wave infrared bands of ASTER remote sensing data for alteration mineral mapping in SE Iran. Int. J. Phys. Sci. 2011, 6, 917–929. [Google Scholar]
Aftabi, A.; Atapour, H. Alteration geochemistry of volcanic rocks around Sarcheshmeh porphyry copper deposit, Rafsanjan, Kerman, Iran: Implications for regional exploration. Resour. Geol. 2011, 61, 76–90. [Google Scholar] [CrossRef]
Waterman, G.C.; Hamilton, R. The Sar Cheshmeh porphyry copper deposit. Econ. Geol. 1975, 70, 568–576. [Google Scholar] [CrossRef]
Atapour, H.; Aftabi, A. The geochemistry of gossans associated with Sarcheshmeh porphyry copper deposit, Rafsanjan, Kerman, Iran: Implications for exploration and the environment. J. Geochem. Explor. 2007, 93, 47–65. [Google Scholar] [CrossRef]
Sheikhrahimi, A.; Pour, A.B.; Pradhan, B.; Zoheir, B. Mapping hydrothermal alteration zones and lineaments associated with orogenic gold mineralization using ASTER data: A case study from the Sanandaj-Sirjan Zone, Iran. Adv. Space Res. 2019, 63, 3315–3332. [Google Scholar] [CrossRef]
Sabbaghi, H.; Moradzadeh, A. ASTER spectral analysis for host rock associated with porphyry copper-molybdenum mineralization. J. Geol. Soc. India 2018, 91, 627–638. [Google Scholar] [CrossRef]
Pour, A.B.; Hashim, M. Hydrothermal alteration mapping from Landsat-8 data, Sar Cheshmeh copper mining district, south-eastern Islamic Republic of Iran. J. Taibah Univ. Sci. 2015, 9, 155–166. [Google Scholar] [CrossRef]
Yamaguchi, Y.; Naito, C. Spectral indices for lithologic discrimination and mapping by using the ASTER SWIR bands. Int. J. Remote Sens. 2003, 24, 4311–4323.
Ninomiya, Y. A stabilized vegetation index and several mineralogic indices defined for ASTER VNIR and SWIR data. In Proceedings of the (IGARSS 2003) IEEE International Geoscience and Remote Sensing Symposium, Toulouse, France, 21–25 July 2003; pp. 1552–1554.
Ninomiya, Y. Advanced remote lithologic mapping in ophiolite zone with ASTER multispectral thermal infrared data. In Proceedings of the International Geoscience and Remote Sensing Symposium, Toulouse, France, 21–25 July 2003; pp. 1561–1563.
Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32.
Mahdianpari, M.; Salehi, B.; Mohammadimanesh, F.; Motagh, M. Random forest wetland classification using ALOS-2 L-band, RADARSAT-2 C-band, and TerraSAR-X imagery. ISPRS J. Photogramm. Remote Sens. 2017, 130, 13–31.
Albon, C. Machine Learning with Python Cookbook: Practical Solutions from Preprocessing to Deep Learning; O’Reilly Media, Inc.: Newton, MA, USA, 2018.
Cortes, C.; Vapnik, V. Support-vector networks. Mach. Learn. 1995, 20, 273–297.
Bahrami, H.; Homayouni, S.; McNairn, H.; Hosseini, M.; Mahdianpari, M. Regional crop characterization using multi-temporal optical and synthetic aperture radar earth observations data. Can. J. Remote Sens. 2022, 48, 258–277.
Mountrakis, G.; Im, J.; Ogole, C. Support vector machines in remote sensing: A review. ISPRS J. Photogramm. Remote Sens. 2011, 66, 247–259.
Wang, L. Support Vector Machines: Theory and Applications; Springer: Berlin/Heidelberg, Germany, 2005; Volume 177.
Vasilev, I.; Slater, D.; Spacagna, G.; Roelants, P.; Zocca, V. Python Deep Learning: Exploring Deep Learning Techniques and Neural Network Architectures with Pytorch, Keras, and TensorFlow; Packt Publishing, Ltd.: Birmingham, UK, 2019.
Huang, G.-B. Learning capability and storage capacity of two-hidden-layer feedforward networks. IEEE Trans. Neural Netw. 2003, 14, 274–281.
Madhiarasan, M.; Deepa, S. Comparative analysis on hidden neurons estimation in multi layer perceptron neural networks for wind speed forecasting. Artif. Intell. Rev. 2017, 48, 449–471.
Patterson, J.; Gibson, A. Deep Learning: A Practitioner’s Approach; O’Reilly Media, Inc.: Newton, MA, USA, 2017.
Brownlee, J. XGBoost with Python: Gradient Boosted Trees with XGBoost and Scikit-Learn; Machine Learning Mastery: San Juan, Spain, 2016.
Bahrami, H.; Homayouni, S.; Safari, A.; Mirzaei, S.; Mahdianpari, M.; Reisi-Gahrouei, O. Deep learning-based estimation of crop biophysical parameters using multi-source and multi-temporal remote sensing observations. Agronomy 2021, 11, 1363.
Story, M.; Congalton, R.G. Accuracy assessment: A user’s perspective. Photogramm. Eng. Remote Sens. 1986, 52, 397–399.
Pearson, K. LIII. On lines and planes of closest fit to systems of points in space. Lond. Edinb. Dublin Philos. Mag. J. Sci. 1901, 2, 559–572.
Herault, J.; Jutten, C. Space or time adaptive signal processing by neural network models. AIP Conf. Proc. 1986, 151, 206–211.
Comon, P. Independent component analysis, a new concept? Signal Process. 1994, 36, 287–314.
Kumar, C.; Shetty, A.; Raval, S.; Sharma, R.; Ray, P.C. Lithological discrimination and mapping using ASTER SWIR Data in the Udaipur area of Rajasthan, India. Procedia Earth Planet. Sci. 2015, 11, 180–188.
Yang, J.; Cheng, Q. A comparative study of independent component analysis with principal component analysis in geological objects identification, Part I: Simulations. J. Geochem. Explor. 2015, 149, 127–135.
Ramezan, C.A.; Warner, T.A.; Maxwell, A.E.; Price, B.S. Effects of training set size on supervised machine-learning land-cover classification of large-area high-resolution remotely sensed data. Remote Sens. 2021, 13, 368.
Kumar, C.; Chatterjee, S.; Oommen, T.; Guha, A. Automated lithological mapping by integrating spectral enhancement techniques and machine learning algorithms using AVIRIS-NG hyperspectral data in Gold-bearing granite-greenstone rocks in Hutti, India. Int. J. Appl. Earth Obs. Geoinf. 2020, 86, 102006.
Yoon, H. Finding unexpected test accuracy by cross validation in machine learning. Int. J. Comput. Sci. Netw. Secur. 2021, 21, 549–555.
Nefeslioglu, H.A.; San, B.T.; Gokceoglu, C.; Duman, T. An assessment on the use of Terra ASTER L3A data in landslide susceptibility mapping. Int. J. Appl. Earth Obs. Geoinf. 2012, 14, 40–60.
San, B.T. An evaluation of SVM using polygon-based random sampling in landslide susceptibility mapping: The Candir catchment area (western Antalya, Turkey). Int. J. Appl. Earth Obs. Geoinf. 2014, 26, 399–412.
Manap, H.S.; San, B.T. Data Integration for Lithological Mapping Using Machine Learning Algorithms. Earth Sci. Inform. 2022, 15, 1841–1859.
Bachri, I.; Hakdaoui, M.; Raji, M.; Teodoro, A.C.; Benbouziane, A. Machine learning algorithms for automatic lithological mapping using remote sensing data: A case study from Souk Arbaa Sahel, Sidi Ifni Inlier, Western Anti-Atlas, Morocco. ISPRS Int. J. Geo-Inf. 2019, 8, 248.
Shebl, A.; Abdellatif, M.; Hissen, M.; Abdelaziz, M.I.; Csámer, Á. Lithological mapping enhancement by integrating Sentinel 2 and gamma-ray data utilizing support vector machine: A case study from Egypt. Int. J. Appl. Earth Obs. Geoinf. 2021, 105, 102619.
Nugroho, H.; Wikantika, K.; Bijaksana, S.; Saepuloh, A. Integration of remote sensing and geophysical data to enhance lithological mapping utilizing the Random Forest classifier: A case study from Komopa, Papua Province, Indonesia. J. Degrad. Min. Lands Manag. 2023, 10, 4417.
San, B.T.; Süzen, M.L. Evaluation of cross-track illumination in EO-1 Hyperion imagery for lithological mapping. Int. J. Remote Sens. 2011, 32, 7873–7889.

Figure 1. (A) The geographical location of the Sar-Cheshmeh area in the Urumia-Dokhtar magmatic belt of southern Iran, and (B) lithological map of the Sar-Cheshmeh copper deposit [41].

Figure 2. (A) Laboratory spectra of epidote, calcite, muscovite, kaolinite, chlorite, and alunite. (B) Laboratory spectra of montmorillonite, jarosite, hematite, and goethite [13].

Figure 3. Flowchart of the methodology that was applied in this study.

Figure 4. (A) Pearson’s correlation matrix between ASTER VNIR, SWIR, and TIR bands; (B) Pearson’s correlations between features and bands (some bands were rearranged to better emphasize the correlations).

Figure 5. RF feature importance values and their rankings.

Figure 6. The mafic index (A) and quartz index (B) were extracted from ASTER bands; (C) Lithological map of the Sar-Cheshmeh copper deposit.

Figure 7. (A) RGB combinations from PCA: R, PC1; G, PC2; B, PC6. (B) RGB combinations for ICA: R, IC1; G, IC2; B, IC4.

Figure 8. The geological maps of the Sar-Cheshmeh copper mining region (A), including results of lithological classification that were derived from (B) RF, (C) SVM, (D) GB, (E) XGB, and (F) deep-learning ANN.

Figure 9. The results of classification using SVM with all bands and features. Note that the abbreviation P in this figure refers to porphyry.

Figure 10. Effects of testing sample size on training and testing accuracy for (A) RF and (B) SVM.

Figure 11. Overall accuracy of various ML algorithms with and without the inclusion of DEM (digital elevation model).

Table 1. List of the spectral features that were used in this study.

Band Indices	Feature Name	Band Indices	Feature Name
$\frac{B_{4}}{B_{5}}$	Alteration	$\frac{B_{8} + B_{6}}{B_{7}}$	Dolomite
$\frac{B_{4} + B_{6}}{B_{5}}$	Alunite/Kaolinite/Pyrophyllite	$\frac{B_{5}}{B_{6}}$	Host Rock
$\frac{B_{6}}{B_{8}}$	Amphibole	$\frac{B_{7}}{B_{5}}$	Kaolinitic
$\frac{B_{6} + B_{9}}{B_{8}}$	Amphibole/MgOH	$\frac{B_{7}}{B_{5}} \times \frac{B_{7}}{B_{8}}$	Alunite Index
$\frac{B_{13}}{B_{14}}$	Carbonate	$\frac{B_{7}}{B_{6}}$	Muscovite
$\frac{B_{7} + B_{9}}{B_{8}}$	Carbonate/Chlorite/Epidote	$\frac{B_{14}}{B_{12}}$	Quartz-Rich Rocks
$B_{3 N} \times \frac{B_{2}}{B_{1}^{2}}$	Chlorophyll Vegetation Index (CVI)	$\frac{B_{6}}{B_{8}} \times \frac{B_{9}}{B_{8}}$	Calcite Index
$\frac{B_{3}}{B_{2}} \times \frac{B_{1}}{B_{2}}$	Stabilized Vegetation Index (SVI)	$\frac{B_{11} \times B_{11}}{B_{10} \times B_{12}}$	Quartz Index (QI)
$\frac{B_{7}}{B_{6}} \times \frac{B_{4}}{B_{6}}$	OH-bearing Altered Minerals Index	$\frac{B_{12}}{B_{13}}$	Mafic Index
$\frac{B_{8}}{B_{6}} \times \frac{B_{4}}{B_{5}}$	Kaolinite Index
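
The band ratios in Table 1 can be evaluated per pixel with a few lines of code. The following is a minimal sketch rather than the authors' implementation: the function names, the dictionary layout (ASTER band number mapped to a pixel value), and the synthetic pixel values are assumptions made for illustration, and only a handful of the listed indices are shown.

```python
def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator != 0 else float("nan")


def spectral_indices(b):
    """Compute a subset of the Table 1 indices for one pixel.

    `b` maps an ASTER band number (1-14) to that pixel's value.
    """
    return {
        "alteration": safe_ratio(b[4], b[5]),                    # B4 / B5
        "dolomite": safe_ratio(b[8] + b[6], b[7]),               # (B8 + B6) / B7
        "kaolinite_index": safe_ratio(b[8], b[6]) * safe_ratio(b[4], b[5]),
        "quartz_index": safe_ratio(b[11] * b[11], b[10] * b[12]),
        "mafic_index": safe_ratio(b[12], b[13]),                 # B12 / B13
    }


# Synthetic pixel (band n has value n); NOT real ASTER data.
example_pixel = {band: float(band) for band in range(1, 15)}
indices = spectral_indices(example_pixel)
for name, value in indices.items():
    print(f"{name}: {value:.4f}")
```

For a full scene the same arithmetic would normally be applied to whole band arrays at once; the per-pixel form is used here only to keep the mapping to Table 1 explicit.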

Table 2. The results of accuracy assessments for the testing area. 1st scenario: by using all bands and features, 2nd scenario: by applying feature selection.

Test Samples (n)	Lithological Class	Accuracy Criteria	RF		SVM		GB		XGB		ANN	
			1st Scenario	2nd Scenario	1st Scenario	2nd Scenario	1st Scenario	2nd Scenario	1st Scenario	2nd Scenario	1st Scenario	2nd Scenario
131	Alluvium	precision	0.96	0.98	0.91	0.95	0.92	0.96	0.95	0.93	0.91	0.90
		recall	0.92	0.86	0.94	0.94	0.85	0.78	0.92	0.80	0.95	0.83
		F1-score	0.94	0.92	0.92	0.94	0.89	0.86	0.94	0.86	0.93	0.87
4073	Andesite	precision	0.86	0.85	0.90	0.89	0.83	0.83	0.84	0.83	0.85	0.88
		recall	0.94	0.95	0.91	0.92	0.93	0.92	0.93	0.92	0.93	0.87
		F1-score	0.90	0.90	0.91	0.90	0.87	0.87	0.88	0.87	0.89	0.88
87	Biotite Dike	precision	0.70	0.79	0.53	0.53	0.46	0.47	0.69	0.54	0.20	0.37
		recall	0.16	0.17	0.41	0.36	0.18	0.16	0.21	0.15	0.01	0.11
		F1-score	0.26	0.28	0.46	0.43	0.26	0.24	0.32	0.23	0.02	0.18
13	Feldspar Dike	precision	0.00	0.00	0.15	0.00	0.00	0.00	0.00	0.00	0.00	0.00
		recall	0.00	0.00	0.15	0.00	0.00	0.00	0.00	0.00	0.00	0.00
		F1-score	0.00	0.00	0.15	0.00	0.00	0.00	0.00	0.00	0.00	0.00
819	Granodiorite	precision	0.93	0.93	0.95	0.94	0.91	0.90	0.91	0.91	0.92	0.91
		recall	0.94	0.92	0.95	0.96	0.90	0.89	0.93	0.91	0.94	0.92
		F1-score	0.93	0.93	0.95	0.95	0.90	0.89	0.92	0.91	0.93	0.92
1514	Hornblende Dike	precision	0.73	0.72	0.70	0.70	0.66	0.63	0.68	0.64	0.70	0.61
		recall	0.57	0.56	0.68	0.64	0.52	0.52	0.54	0.50	0.49	0.60
		F1-score	0.64	0.63	0.69	0.67	0.58	0.57	0.60	0.56	0.58	0.61
189	Late Fine Porphyry	precision	0.78	0.78	0.81	0.79	0.74	0.79	0.74	0.71	0.69	0.65
		recall	0.76	0.75	0.78	0.76	0.66	0.66	0.71	0.60	0.83	0.74
		F1-score	0.77	0.77	0.80	0.78	0.70	0.72	0.73	0.65	0.75	0.69
111	Quartz Eye	precision	0.85	0.91	0.83	0.83	0.84	0.82	0.90	0.85	0.81	0.79
		recall	0.75	0.73	0.90	0.91	0.53	0.58	0.70	0.75	0.84	0.75
		F1-score	0.79	0.81	0.86	0.87	0.65	0.68	0.79	0.79	0.82	0.77
401	Sar-Cheshmeh Porphyry	precision	0.73	0.72	0.72	0.70	0.69	0.66	0.72	0.68	0.61	0.55
		recall	0.72	0.70	0.72	0.73	0.66	0.60	0.68	0.67	0.75	0.76
		F1-score	0.73	0.71	0.72	0.72	0.67	0.63	0.70	0.68	0.67	0.64
	Total	accuracy	0.84	0.83	0.85	0.84	0.80	0.79	0.82	0.80	0.82	0.80
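
The F1-scores in Table 2 are the harmonic mean of the precision and recall in the rows above them. The short check below recomputes three RF first-scenario cells from the tabulated precision and recall; the function name is an illustrative choice, and because the inputs are already rounded to two decimals, other cells may differ from the published F1 by about 0.01.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; defined as 0 when both are 0."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs copied from Table 2, RF, 1st scenario.
rf_first_scenario = {
    "Alluvium": (0.96, 0.92),
    "Biotite Dike": (0.70, 0.16),
    "Feldspar Dike": (0.00, 0.00),
}

recomputed = {
    name: round(f1_score(p, r), 2) for name, (p, r) in rf_first_scenario.items()
}
print(recomputed)
```

The Biotite Dike row shows why F1 is reported alongside precision: a precision of 0.70 with a recall of 0.16 still yields a low F1, because the harmonic mean is dominated by the smaller of the two values.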

Table 3. Performance of VNIR, SWIR, and TIR bands in identifying the nine lithological classes using RF. The values in boldface refer to the highest value of precision, recall, and F1-score for each class.

	ASTER VNIR Bands			ASTER SWIR Bands			ASTER TIR Bands
	Precision	Recall	F1-Score	Precision	Recall	F1-Score	Precision	Recall	F1-Score
Alluvium	0.17	0.04	0.06	0.72	0.39	0.50	0.78	0.45	0.57
Andesite	0.62	0.84	0.72	0.73	0.91	0.81	0.75	0.90	0.82
Biotite Dike	0.00	0.00	0.00	1.00	0.02	0.04	0.45	0.11	0.18
Feldspar Dike	0.00	0.00	0.00	1.00	0.08	0.14	1.00	0.15	0.27
Granodiorite	0.55	0.44	0.49	0.82	0.71	0.76	0.76	0.53	0.62
Hornblende Dike	0.25	0.15	0.19	0.50	0.33	0.40	0.52	0.42	0.46
Late Fine Porphyry	0.06	0.02	0.02	0.55	0.26	0.36	0.64	0.57	0.60
Quartz Eye	0.33	0.04	0.07	0.68	0.12	0.20	0.72	0.30	0.42
Sar-Cheshmeh Porphyry	0.17	0.07	0.10	0.55	0.48	0.51	0.60	0.57	0.59
accuracy	0.55			0.70			0.70
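
The per-subsystem experiment in Table 3 restricts the classifier input to one spectral region at a time. The sketch below shows one way such input subsets could be formed, using the standard ASTER band layout (VNIR bands 1 to 3, SWIR bands 4 to 9, TIR bands 10 to 14). The dictionary, the function name, and the synthetic pixel are assumptions for illustration and do not reproduce the authors' code.

```python
# Standard ASTER band grouping by subsystem (band 3 is the nadir band 3N).
ASTER_SUBSYSTEMS = {
    "VNIR": range(1, 4),    # bands 1-3
    "SWIR": range(4, 10),   # bands 4-9
    "TIR": range(10, 15),   # bands 10-14
}


def subset_features(pixel_bands, subsystem):
    """Return the values of one pixel for the bands of a single subsystem.

    `pixel_bands` maps an ASTER band number (1-14) to that pixel's value.
    """
    return [pixel_bands[band] for band in ASTER_SUBSYSTEMS[subsystem]]


# Synthetic pixel (band n has value n); NOT real ASTER data.
pixel = {band: float(band) for band in range(1, 15)}
for name in ASTER_SUBSYSTEMS:
    print(name, subset_features(pixel, name))
```

Each subset would then be passed to the same classifier and evaluated separately, which is what allows the three column groups of the table to be compared.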

Table 4. Performance of VNIR, SWIR, and TIR bands in identifying the nine lithological classes using SVM. The values in boldface refer to the highest value of precision, recall, and F1-score for each class.

	ASTER VNIR Bands			ASTER SWIR Bands			ASTER TIR Bands
	Precision	Recall	F1-Score	Precision	Recall	F1-Score	Precision	Recall	F1-Score
Alluvium	0.00	0.00	0.00	0.00	0.00	0.00	0.00	0.00	0.00
Andesite	0.59	0.99	0.74	0.60	0.99	0.75	0.59	0.98	0.74
Biotite Dike	0.00	0.00	0.00	0.00	0.00	0.00	0.00	0.00	0.00
Feldspar Dike	0.00	0.00	0.00	0.00	0.00	0.00	0.00	0.00	0.00
Granodiorite	0.88	0.45	0.59	0.83	0.46	0.60	0.65	0.13	0.22
Hornblende Dike	0.70	0.03	0.05	0.56	0.02	0.04	0.30	0.02	0.05
Late Fine Porphyry	0.00	0.00	0.00	0.00	0.00	0.00	0.67	0.21	0.32
Quartz Eye	0.57	0.11	0.18	0.67	0.05	0.10	0.88	0.06	0.12
Sar-Cheshmeh Porphyry	0.48	0.04	0.07	0.41	0.13	0.20	0.50	0.25	0.34
Accuracy	0.61			0.61			0.59
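
A useful reference point for the accuracies in Table 4 is the score of a classifier that always predicts the most frequent class. The calculation below uses the per-class test counts listed in Table 2 and assumes, without confirmation from the captions, that the same test split was used for the single-subsystem experiments; the variable names are illustrative.

```python
# Test sample counts per class, copied from the first column of Table 2.
test_counts = {
    "Alluvium": 131,
    "Andesite": 4073,
    "Biotite Dike": 87,
    "Feldspar Dike": 13,
    "Granodiorite": 819,
    "Hornblende Dike": 1514,
    "Late Fine Porphyry": 189,
    "Quartz Eye": 111,
    "Sar-Cheshmeh Porphyry": 401,
}

total = sum(test_counts.values())
majority_class = max(test_counts, key=test_counts.get)

# Accuracy obtained by labelling every test pixel as the majority class.
baseline_accuracy = test_counts[majority_class] / total
print(majority_class, total, round(baseline_accuracy, 3))
```

Under that assumption, the andesite-only baseline is about 0.555, so the SVM accuracies of 0.59 to 0.61 with a single subsystem sit only slightly above it. This is consistent with the near-1.0 andesite recall and the zero scores for several minority classes in the table.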

Table 5. The effects of increasing the testing sample size on the accuracy of RF.

	15%			20%			25%
	Precision	Recall	F1-Score	Precision	Recall	F1-Score	Precision	Recall	F1-Score
Alluvium	1.000	0.946	0.972	0.986	0.973	0.980	0.989	0.947	0.967
Andesite	0.864	0.952	0.906	0.872	0.950	0.909	0.872	0.948	0.908
Biotite Dike	0.667	0.108	0.186	0.667	0.160	0.258	0.611	0.177	0.275
Feldspar Dike	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000
Granodiorite	0.935	0.940	0.937	0.921	0.947	0.934	0.927	0.951	0.939
Hornblende Dike	0.751	0.610	0.673	0.753	0.607	0.672	0.749	0.608	0.671
Late Fine Porphyry	0.800	0.790	0.795	0.796	0.796	0.796	0.797	0.785	0.791
Quartz Eye	0.884	0.792	0.835	0.873	0.762	0.814	0.882	0.759	0.816
Sar-Cheshmeh Porphyry	0.767	0.709	0.737	0.751	0.751	0.751	0.729	0.745	0.737
Accuracy	0.849			0.851			0.849
	30%			35%			40%
	Precision	Recall	F1-Score	Precision	Recall	F1-Score	Precision	Recall	F1-Score
Alluvium	0.946	0.946	0.946	0.938	0.931	0.935	0.965	0.913	0.938
Andesite	0.864	0.948	0.904	0.855	0.949	0.900	0.855	0.950	0.900
Biotite Dike	0.682	0.200	0.309	0.722	0.149	0.248	0.727	0.162	0.264
Feldspar Dike	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000
Granodiorite	0.931	0.936	0.933	0.927	0.930	0.929	0.923	0.926	0.925
Hornblende Dike	0.737	0.586	0.653	0.732	0.572	0.642	0.739	0.576	0.647
Late Fine Porphyry	0.778	0.759	0.769	0.788	0.746	0.766	0.749	0.759	0.754
Quartz Eye	0.902	0.779	0.836	0.844	0.730	0.783	0.870	0.740	0.800
Sar-Cheshmeh Porphyry	0.744	0.753	0.749	0.733	0.718	0.725	0.745	0.707	0.726
Accuracy	0.843			0.836			0.837
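
The hold-out fractions in Table 5 (15% to 40%) can be reproduced in outline with a simple split routine. The sketch below is a stand-alone, per-class (stratified) split on synthetic labels; whether the authors stratified their split is not stated, so that choice, the function name, the seed, and the label counts are all assumptions.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction, seed=0):
    """Return (train_idx, test_idx), holding out ~test_fraction of each class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train_idx, test_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label][:]
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return train_idx, test_idx


# Synthetic, imbalanced labels; NOT the study's samples.
labels = ["andesite"] * 80 + ["granodiorite"] * 20
split_sizes = {}
for fraction in (0.15, 0.20, 0.25, 0.30, 0.35, 0.40):
    train, test = stratified_split(labels, fraction)
    split_sizes[fraction] = (len(train), len(test))
    print(f"test fraction {fraction:.2f}: train={len(train)}, test={len(test)}")
```

In an experiment like the one summarized in the table, a classifier would be refit on each training subset and scored on the matching test subset, so that any drop in accuracy can be attributed to the smaller training set.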

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
