Deep Learning Approach on Prediction of Soil Consolidation Characteristics

Kim, Mintae; Senturk, Muharrem A.; Tan, Rabia K.; Ordu, Ertugrul; Ko, Junyoung

doi:10.3390/buildings14020450


by Mintae Kim ¹, Muharrem A. Senturk ², Rabia K. Tan ³, Ertugrul Ordu ⁴ and Junyoung Ko ⁵,*

¹ School of Civil, Environmental, and Architectural Engineering, Korea University, Seoul 02841, Republic of Korea

² Department of Computer Engineering, Yeditepe University, Istanbul 34755, Turkey

³ Department of Computer Engineering, Tekirdağ Namik Kemal University, Tekirdağ 59860, Turkey

⁴ Department of Civil Engineering, Tekirdağ Namik Kemal University, Tekirdağ 59860, Turkey

⁵ Department of Civil Engineering, Chungnam National University, Daejeon 34134, Republic of Korea

* Author to whom correspondence should be addressed.

Buildings 2024, 14(2), 450; https://doi.org/10.3390/buildings14020450

Submission received: 5 December 2023 / Revised: 2 February 2024 / Accepted: 4 February 2024 / Published: 6 February 2024

(This article belongs to the Special Issue Advances in Foundation Engineering for Building Structures)


Abstract:

Artificial neural network models, crucial for accurate predictions, should be meticulously designed for specific problems using deep learning-based algorithms. In this study, we compare four distinct deep learning-based artificial neural network architectures to evaluate their performance in predicting soil consolidation characteristics. The consolidation features of fine-grained soil have a significant impact on the stability of structures, particularly in terms of long-term stability. Precise prediction of soil consolidation under planned structures is vital for effective foundation design. The compression index (C_c) is an important parameter used in predicting consolidation settlement in soils. Therefore, this study examines the use of deep learning techniques, which are types of artificial neural network algorithms with deep layers, in predicting the compression index (C_c) in geotechnical engineering. Four neural network models with different architectures and hyperparameters were developed and evaluated using performance metrics such as mean absolute percentage error (MAPE), mean squared error (MSE), root mean squared error (RMSE), and coefficient of determination (R²). The dataset contains 916 samples with variables such as natural water content (w), liquid limit (LL), plasticity index (PI), and compression index (C_c). This approach allows the results of soil consolidation tests to be estimated more quickly and at lower cost, albeit as predictions rather than measurements. The findings demonstrate that deep learning models are an effective tool for predicting the consolidation of fine-grained soil and offer significant opportunities for applications in geotechnical engineering. This study contributes to a more accurate prediction of soil consolidation, which is critical for the long-term stability of structural designs.

Keywords:

compression index; deep learning; multilayer perceptron; convolutional neural network

1. Introduction

Geotechnical engineering frequently encounters complex problems that necessitate sophisticated mathematical models [1,2]. However, despite the numerous models that have been developed, they often fall short of fully capturing the intricacies of geotechnical problems. Fortunately, artificial intelligence (AI) techniques have proven to be powerful tools for tackling challenging issues specific to geotechnical engineering. Isik and Ozden demonstrated a significant example by estimating geotechnical problems using artificial neural networks [3]. A research investigation explored the application of soft calculation techniques for estimating the consolidation characteristics of fine-grained soil [4]. In addition, an extensive study inquired into the applications of AI within this field [5]. Moreover, the geotechnical properties of soils underwent analysis through the utilization of Geographic Information Systems (GIS) and artificial neural networks (ANN) [6,7,8]. As the field continues to explore more applications of AI, new techniques will be developed to more effectively address these challenges. It is evident that the utilization of computer science and AI in geotechnical engineering, as well as in other fields, will continue to expand and generate interest [9,10,11,12,13,14,15].

The selection and design of an appropriate and reliable foundation system require a crucial step, which is the identification and characterization of soil layers present in soil profiles, along with their properties [16]. One of the key factors for engineering applications in fine-grained soils is their consolidation properties. These properties encompass index properties such as dry density, void ratio, water content, and consistency limits, as well as compression index, swelling index, and overconsolidation ratio [17,18]. Consolidation is a soil mechanics process that occurs over time when soil undergoes a reduction in volume due to the expulsion of water from the void spaces between soil particles. This process is initiated by an increase in vertical stress, typically caused by structural loads [19]. Determining how well soil layers settle under the additional weight of planned structures is a vital aspect of foundation design [20,21]. It is important to note that all consolidation properties and index properties of soils, including compression index, swelling index, and overconsolidation ratio, can only be determined through laboratory tests [22]. Each encountered soil layer exhibits distinct index and consolidation characteristics, which depend on its composition [23]. For this purpose, field drilling operations are conducted to extract the vertical profile of the soil, and both disturbed and undisturbed samples are obtained for laboratory experiments. This situation underscores the need for methods that can efficiently determine the parameters obtained from consolidation tests in a shorter time and with a closer approximation to reality, all while minimizing costs. The objective of these field drilling operations is to gather representative soil samples, enabling the evaluation of variations in different soil parameters as one moves deeper into a soil cross section. 
The data obtained from laboratory experiments, coupled with the results from field drilling operations, collectively contribute to identifying the fundamental soil parameters essential for accurate calculations and estimations of the anticipated range of settlement values [24,25,26,27].

Techniques based on machine learning have witnessed significant growth in popularity over the last decade for their ability to make accurate predictions and have been applied to various real-world problems [28]. In geotechnical engineering, the comprehension and prediction of consolidation properties are crucial for ensuring the stability and durability of structures. As mentioned above, AI techniques, especially neural networks, have had an important role in predicting ground behavior in geotechnical engineering. AI technology is a field that aims to enable machines to learn like humans. The ability to learn the results that machines are not specifically programmed with is called machine learning (ML), which is a type of artificial intelligence [29,30,31]. The ML technique includes many different algorithms, but some machine learning algorithms that imitate the human brain are referred to as artificial neural networks. Deep learning (DL), which covers the techniques we use, is a name used for artificial neural network algorithms consisting of many layers. Unlike ML, which involves some human intervention, deep learning involves complete self-learning without human intervention. The relationship between AI and its subfields is shown in Figure 1.

ML is a subset of AI focused on the development of algorithms and models that enable computers to learn from data and improve their performance on a specific task without being explicitly programmed. In addition, a subfield of ML known as DL draws inspiration from the functioning of the human brain. In a DL network, each layer utilizes the output of the layer below it for feature extraction and manipulation [32]. These layers are sequentially designed, allowing the creation of complex models capable of learning from data [33]. DL excels in handling large datasets, which is one of its key advantages over typical ML algorithms. While ML methods often involve problem partitioning and management, DL approaches aim to solve problems end to end. Moreover, DL algorithms automatically extract features from data, eliminating the need for human feature engineering required in traditional ML methods. Consequently, DL techniques offer faster and more advanced capabilities overall.

Recently, the DL technique has made significant advancements in various fields, including voice and image identification, natural language processing, and scientific research. Deep learning models exhibit remarkable accuracy and speed in tasks such as identification, classification, detection, inference, and segmentation [34]. As a result, it has become an indispensable tool for researchers, businesses, and organizations seeking to extract insights and make predictions from large volumes of data. Artificial neural networks (ANNs) provide a comprehensive explanation of what deep learning is and how it works, mainly because DL refers to the algorithms of deep neural networks (DNN), which have multiple hidden layers. Therefore, understanding various types of neural networks (NNs) is crucial for gaining a better understanding of the applied DL methods. These DL-driven approaches hold significant potential in formulating predictive models specifically designed for geotechnical engineering dilemmas. As a result, there exists a critical demand for the advancement of fresh and more exact DL-oriented predictive models, enabling their versatile deployment across a multitude of geotechnical scenarios and research realms [35].

Table 1 lists the input parameters and algorithms used in previous studies to predict the compression index. Although many of these studies examined how the input factors or the number of hidden layers affect model performance, few report on the other aspects of model design, such as the architecture, hyperparameters, and configuration.

Therefore, in this study, deep learning-based models are employed and evaluated for predicting soil consolidation characteristics, which significantly affect the structural stability of buildings and the long-term performance of structures. This study develops and assesses four deep learning models with different architectures and hyperparameters to predict the compression index from natural water content, liquid limit, and plasticity index. The objective of this study is to leverage deep learning techniques to predict soil consolidation characteristics accurately and efficiently, and the results show that deep learning models can be effective tools for this purpose.

2. Materials and Model Development

2.1. Dataset

A dataset of 916 samples was assembled from data provided by Ongun [39], Satyanarayana and Satyanarayana [40], Kahraman [41], and Kalantary and Kordnaeij [42]. Figure 2 plots the 916 samples on the plasticity chart for soil classification. As shown in the figure, the model developed in this study is limited to the range of these data. Therefore, it should be noted that the fine-grained soil in this study does not encompass all fine-grained soil; rather, it is confined to the data range shown in Figure 2.

In addition, it is essential to consider crucial factors for understanding the complicated consolidation processes while facilitating the evaluation of their structural stability. Since variations in water content have a direct influence on the compressibility and deformation potential, natural water content (w) was selected as one of the input parameters. Also, the liquid limit (LL) represents the critical threshold at which soil changes from a plastic to a liquid state, and the plasticity index (PI) provides information about the soil’s ability to undergo deformation and change in consistency when subjected to moisture variations. In addition, the compression index (C_c) is a crucial value that describes the compressibility behavior of soil and allows for the determination of the compression potential and the compression rate. Therefore, the input parameters of the dataset consisted of natural water content, liquid limit, and plasticity index; the output was determined as the compression index. Table 2 shows the sample dataset of parameters.

Figure 3 presents the data using frequency histograms, illustrating the distribution of the dataset. In addition, Table 3 displays the descriptive statistics of the parameters in the dataset.

A total of 916 samples were divided into three dataset groups, consisting of training, validation, and test sets. The training phase of the employed methods utilized the complete set of 738 samples from the training dataset group. In the validation phase, 90 samples from the validation group were utilized, and during the testing phase, 88 samples from the test dataset group were employed. However, the limitations of our dataset are rooted in its diverse composition, encompassing various regions and depths. While this diversity contributes to the dataset’s richness with approximately 900 entries and correlates well with the observed high coefficient, it introduces limitations in terms of representativeness. The dataset may not fully capture the broad spectrum of geotechnical conditions worldwide, impacting the model’s ability to generalize beyond the specific range and variation present in the training data.
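The three-way partition described above can be sketched as follows. This is a minimal illustration using only the Python standard library; the source does not state how samples were assigned to each group, so the seeded random shuffle, the function name `split_dataset`, and the placeholder records are assumptions.

```python
import random

def split_dataset(samples, n_train=738, n_val=90, n_test=88, seed=0):
    """Shuffle once, then cut into training, validation, and test subsets."""
    if n_train + n_val + n_test != len(samples):
        raise ValueError("subset sizes must add up to the dataset size")
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test

# Placeholder records standing in for the 916 (w, LL, PI, C_c) rows.
records = list(range(916))
train, val, test = split_dataset(records)
print(len(train), len(val), len(test))  # 738 90 88
```

Keeping the test subset untouched until the final evaluation is what allows the reported test metrics to be read as an estimate of performance on unseen samples.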

2.2. Model Development

2.2.1. Artificial Neural Network

Compared to other AI algorithms, DL methods such as ANNs consistently deliver superior results. These networks have been developed with inspiration from the human brain, specifically the biological structures of neuron cells and neuron networks. In ANNs, these biological elements are expressed mathematically, with a neuron cell corresponding to a mathematical expression known as a perceptron. The NNs formed by biological neuron cells are represented as the output of many perceptrons, which serve as inputs to the subsequent perceptron [43] (see Figure 4).

In fact, the learning process in the human brain, characterized by changes in electrical signal values stored by neurons and the formation of new neuron connections, can be likened to updating the weight values and eliminating perceptrons with low weight values in ANNs. Consequently, the learning scenario in an artificial neural network revolves around determining the most suitable weight values for the inputs. Today, there is a wide selection of artificial neural networks to choose from.

2.2.2. Multilayer Perceptron

A multilayer perceptron (MLP) is a type of ANN consisting of multiple layers of interconnected nodes or units called neurons. It is one of the foundational architectures used in DL methods. Fully connected layers are a crucial component of MLP networks. Dense layers refer to layers that are fully connected, meaning the perceptrons within these layers are mathematically connected to the perceptrons in the subsequent layers. The parameters of fully connected layers include the number of units (or perceptrons) and the activation function. When fully connected layers are used in the intermediate or hidden layers of a neural network design, an activation function called rectified linear unit (ReLU) is commonly employed. On the other hand, if fully connected layers are present in the output layer, activation functions such as softmax or sigmoid are preferred depending on the nature of the problem. The number of outputs required for the problem determines the total number of units within the output layer.
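The forward pass through fully connected layers can be illustrated with a small sketch in plain Python. The weights, biases, layer sizes, and scaled input values below are arbitrary illustrative numbers, not the trained parameters of any model in this study, and the helper name `dense` is hypothetical.

```python
import math

def relu(x):
    return max(0.0, x)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def dense(inputs, weights, biases, activation):
    """One fully connected layer: every unit receives every input."""
    outputs = []
    for unit_weights, bias in zip(weights, biases):
        z = sum(w * x for w, x in zip(unit_weights, inputs)) + bias
        outputs.append(activation(z))
    return outputs

# Illustrative (untrained) parameters: 3 inputs -> 2 hidden units -> 1 output.
hidden_w = [[0.5, -0.2, 0.1], [-0.3, 0.8, 0.4]]
hidden_b = [0.0, 0.1]
output_w = [[1.0, -1.0]]
output_b = [0.0]

x = [0.4, 0.6, 0.3]  # hypothetical scaled (w, LL, PI) values
hidden = dense(x, hidden_w, hidden_b, relu)               # ReLU in the hidden layer
prediction = dense(hidden, output_w, output_b, sigmoid)   # sigmoid in the output layer
print(hidden, prediction)
```

Because the output layer has a single unit, the network returns one value per sample, which matches a problem with a single predicted quantity such as the compression index.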

2.2.3. Convolutional Neural Network

A convolutional neural network (CNN) is a neural network variant specifically designed for processing digital image data. Convolutional neural networks consist of fully connected layers, pooling layers, and convolution layers in their architecture. The convolution layers play a crucial role in extracting features from visual data using linear algebra processes. The important parameters of convolution layers include filters, kernel size, strides, padding, and an optional activation function. Pooling layers, such as maximum pooling and average pooling, help retain common features while reducing the overall network size. In addition, fully connected layers are often added at the end of convolutional neural networks to aid in tasks like classification.
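The roles of the kernel, stride, and pooling parameters can be seen in a one-dimensional sketch. This is a simplification chosen for brevity: image networks use two-dimensional kernels, and the signal and kernel values here are arbitrary. As in most deep learning libraries, the "convolution" is computed as a sliding dot product without flipping the kernel, and no padding is applied.

```python
def conv1d(signal, kernel, stride=1):
    """Sliding dot product of the kernel over the signal (no padding)."""
    k = len(kernel)
    return [
        sum(kernel[j] * signal[i + j] for j in range(k))
        for i in range(0, len(signal) - k + 1, stride)
    ]

def max_pool1d(signal, pool_size=2):
    """Keep the largest value in each non-overlapping window."""
    return [
        max(signal[i:i + pool_size])
        for i in range(0, len(signal) - pool_size + 1, pool_size)
    ]

signal = [1.0, 2.0, 4.0, 3.0, 0.0, 1.0]
kernel = [1.0, -1.0]                 # responds to local decreases in the signal
features = conv1d(signal, kernel)    # [-1.0, -2.0, 1.0, 3.0, -1.0]
pooled = max_pool1d(features)        # [-1.0, 3.0]
print(features, pooled)
```

The pooled output is shorter than the feature map, which illustrates how pooling reduces the network size while retaining the strongest responses.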

2.2.4. Development of the Deep Learning Models

Four deep learning models were developed, referred to as M4SAB, M3SAB, M3SRM, and C4LRM. Each name encodes, in order, the configuration, number of layers, activation function, optimizer algorithm, and loss function. These models had distinct architectures and hyperparameters, which are mathematical variables that define the structure and settings of deep learning models. By utilizing these configurations, deep learning models acquire the capability to solve problems. The term ‘learning’ denotes the process in which the variables, known as weights, within deep learning models converge to their optimal values. Figure 5 depicts the schematic architectures of the developed models; a detailed description of the model development follows below.

M4SAB, M3SAB, and M3SRM all utilized multilayer perceptrons, which are a type of deep learning network. In multilayer perceptrons, all layers are referred to as dense or fully connected layers. C4LRM, on the other hand, combined a multilayer perceptron with a convolutional network, which is another type of deep learning network commonly used for image data analysis and feature extraction. The merging of these networks is achieved through a layer called Flatten, which reduces the layer size to a vector size. M4SAB and C4LRM both employed four-layered neural network models, while M3SAB and M3SRM used three-layered neural network models.

One of the critical steps in creating neural networks is selecting the activation function. The hidden layers in a neural network are the layers located between the input and output layers. Generally, the rectified linear unit (ReLU) activation function is commonly used in hidden layers for both multilayer perceptrons and convolutional neural networks. This general practice was followed in all the models. However, the choice of activation functions for the final layers depends on the specifics of the problem at hand. In the M4SAB, M3SAB, and M3SRM models, the sigmoid activation function was employed in the final layers, whereas C4LRM utilized the linear activation function.

In addition, the optimizer algorithm is the most crucial hyperparameter used to calculate the optimal values of variables, known as weights, in deep learning models. In the M4SAB and M3SAB models, the Adam optimizer algorithm, an adaptive moment estimation technique [44], was chosen. In the M3SRM and C4LRM models, the root-mean-squared propagation (RMSProp) optimizer algorithm, which is a gradient descent optimization algorithm, was utilized. These choices were made based on heuristics.

Moreover, the selection of the loss function is another crucial hyperparameter. The loss function was utilized to quantify the differences or errors between the predicted results and the actual outcomes of deep learning models. In the M4SAB and M3SAB models, we chose the Binary Cross Entropy loss function based on heuristic decisions [45,46]. In contrast, for M3SRM and C4LRM, a loss function called mean squared error was employed [32].
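The two loss functions can be compared on a few values with a short sketch. The target and predicted values are hypothetical. Binary cross-entropy is only defined for targets in the range [0, 1], so the example assumes compression index values in that range; the source does not state whether the targets were scaled.

```python
import math

def mean_squared_error(actual, predicted):
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def binary_cross_entropy(actual, predicted, eps=1e-7):
    total = 0.0
    for a, p in zip(actual, predicted):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(a * math.log(p) + (1.0 - a) * math.log(1.0 - p))
    return total / len(actual)

actual = [0.20, 0.35, 0.50]      # hypothetical C_c targets, assumed to lie in [0, 1]
predicted = [0.25, 0.30, 0.55]   # hypothetical model outputs

print(mean_squared_error(actual, predicted))
print(binary_cross_entropy(actual, predicted))
print(binary_cross_entropy(actual, actual))  # smallest possible value, but not zero
```

With continuous targets, binary cross-entropy is still minimized when the prediction equals the target, but its minimum is greater than zero, so its absolute value is harder to interpret than that of the mean squared error.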

Furthermore, the number of epochs serves as a significant hyperparameter in neural network models. An epoch represents each iteration that the data undergo during the training process of the neural network. For all models, we chose to set the number of epochs to 100. This decision was made to facilitate the evaluation of models within a specific timeframe, enabling us to observe the learning progress demonstrated by each model over the same duration. Table 4 shows the summary of model developments.
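The naming scheme of the four models can be expanded into explicit configurations with a small lookup. The letter codes below are inferred from the model descriptions in this section rather than stated explicitly in the source, so they should be read as an assumption; the function name `decode_model_name` is hypothetical.

```python
# Letter codes inferred from the model descriptions in this section (an assumption).
CONFIGURATION = {"M": "multilayer perceptron", "C": "convolutional + multilayer perceptron"}
OUTPUT_ACTIVATION = {"S": "sigmoid", "L": "linear"}
OPTIMIZER = {"A": "Adam", "R": "RMSProp"}
LOSS = {"B": "binary cross-entropy", "M": "mean squared error"}

def decode_model_name(name):
    """Split a five-character model name into its configuration fields."""
    if len(name) != 5:
        raise ValueError("expected a five-character name such as 'M4SAB'")
    config, layers, activation, optimizer, loss = name
    return {
        "configuration": CONFIGURATION[config],
        "layers": int(layers),
        "hidden_activation": "ReLU",  # used in the hidden layers of all four models
        "output_activation": OUTPUT_ACTIVATION[activation],
        "optimizer": OPTIMIZER[optimizer],
        "loss": LOSS[loss],
        "epochs": 100,  # identical for all four models
    }

for model_name in ("M4SAB", "M3SAB", "M3SRM", "C4LRM"):
    print(model_name, decode_model_name(model_name))
```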

Table 5 presents the results of the sensitivity analysis aimed at determining which input parameter has the most significant impact on the compression index. The method employed for sensitivity analysis is known as Input Perturbation. This technique entails altering each input feature of a given sample individually and observing the resulting effect on the model’s output. Through this process, the impact or importance of each feature was assessed. C4LRM appears to exhibit a balanced distribution of the three inputs’ effects on the output, as depicted in the table below.
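The input perturbation procedure can be sketched as follows. The source does not give the perturbation size or how the effects were normalized, so the 5% relative change, the normalization to shares that sum to one, and the linear `toy_model` standing in for a trained network are all assumptions.

```python
def perturbation_sensitivity(model, sample, delta=0.05):
    """Share of the total output change caused by perturbing each input in turn."""
    baseline = model(sample)
    changes = []
    for i in range(len(sample)):
        perturbed = list(sample)
        perturbed[i] *= (1.0 + delta)  # alter one feature, keep the others fixed
        changes.append(abs(model(perturbed) - baseline))
    total = sum(changes)
    return [c / total for c in changes] if total else [0.0] * len(sample)

def toy_model(x):
    """Hypothetical stand-in for a trained network mapping (w, LL, PI) to C_c."""
    w, ll, pi = x
    return 0.006 * w + 0.003 * ll + 0.001 * pi

shares = perturbation_sensitivity(toy_model, [30.0, 45.0, 20.0])
print(shares)
```

In practice the shares would be averaged over many samples, since the effect of a perturbation in a nonlinear network depends on the sample at which it is applied.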

2.3. Runtime Environment

Figure 6 illustrates the proposed workflow, providing a visual representation of the research methodology. In the data collection process, the input and output variables needed for consolidation prediction are combined into a single file. For the data preparation phase, our dataset, which is represented as a single file, was divided into distinct sections for training, validation, and testing purposes in the models. During the model definition phase, all models were implemented in the software using their specific neural network architectures and parameters. The training and validation processes of the models encompass essential operations such as training and validating the neural networks within the software, illustrated as flow diagrams in the figure below.

As described above, this study utilized four distinct models in order to conduct experiments. These models were developed using Python (version 3.9.15), a versatile and user-friendly programming language widely used in machine learning and data science [47]. The implementation of the models was achieved through the utilization of the Keras deep learning application programming interface (Keras 3 API) [48], which is built on top of TensorFlow and serves as a Python library designed for high-level neural network tasks [49]. TensorFlow (version 2.10.0), an open-source platform frequently used for developing and deploying machine learning models, was also utilized. The experiments were performed using Google Colab [50], a free cloud-based platform that provides access to various hardware resources such as CPUs, GPUs, and TPUs. These resources are crucial for the efficient execution of deep learning algorithms. By adopting this approach, the researchers were able to conduct their experiments cost-effectively, eliminating the need for expensive hardware or software licenses.

2.4. Evaluation Metrics

2.4.1. Mean Absolute Percentage Error

The performance of a predictive model can be evaluated by examining its mean absolute percentage error (MAPE). As shown in Equation (1), MAPE is computed by dividing the absolute difference between each actual value and its predicted value by the actual value and then averaging these ratios over the n samples. A lower MAPE value indicates that the model predicts the dependent variable more accurately.

\mathrm{MAPE} = \frac{1}{n} \sum_{i=1}^{n} \frac{|\mathrm{Actual}_{i} - \mathrm{Pred}_{i}|}{\mathrm{Actual}_{i}}

(1)
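Equation (1) can be evaluated with a few lines of Python. The sample values are hypothetical. The optional multiplication by 100 is an assumption: the MAPE values reported later in this article are of a magnitude that suggests they are expressed as percentages, whereas Equation (1) as written yields a fraction.

```python
def mape(actual, predicted, as_percent=True):
    """Mean absolute percentage error following Equation (1)."""
    # Each actual value must be nonzero; compression index values are positive.
    ratios = [abs(a - p) / a for a, p in zip(actual, predicted)]
    value = sum(ratios) / len(ratios)
    return 100.0 * value if as_percent else value

actual = [0.20, 0.40]      # hypothetical measured C_c values
predicted = [0.25, 0.30]   # hypothetical predictions
print(mape(actual, predicted))                    # about 25 (percent)
print(mape(actual, predicted, as_percent=False))  # about 0.25 (fraction)
```

Because each error is divided by the actual value, samples with a small compression index contribute disproportionately to MAPE.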

2.4.2. Mean Squared Error

Mean squared error (MSE) is the average of the squared differences between the actual and predicted values, as described in Equation (2). The closer the MSE value is to zero, the more accurate the model. Because the errors are squared, MSE penalizes large deviations more heavily than MAPE does.

\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} {(\mathrm{Actual}_{i} - \mathrm{Pred}_{i})}^{2}

(2)

2.4.3. Root Mean Squared Error

Root mean squared error (RMSE) is a statistical measure used to evaluate the accuracy of a prediction model. It is calculated as the square root of the mean of the squares of the differences between actual and predicted values, as described in Equation (3), and is therefore expressed in the same units as the predicted variable. A low RMSE value indicates that the model fits the data well and has high prediction performance.

\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} {(\mathrm{Actual}_{i} - \mathrm{Pred}_{i})}^{2}}

(3)
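Equations (2) and (3) differ only by a square root, which the following sketch makes explicit. The sample values are hypothetical.

```python
import math

def mse(actual, predicted):
    """Mean squared error following Equation (2)."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean squared error following Equation (3)."""
    return math.sqrt(mse(actual, predicted))

actual = [0.20, 0.40, 0.60]      # hypothetical measured C_c values
predicted = [0.30, 0.40, 0.50]   # hypothetical predictions
print(mse(actual, predicted), rmse(actual, predicted))
```

Since RMSE is a monotonic function of MSE, the two metrics always rank models in the same order; RMSE is simply easier to read because it shares the units of the compression index.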

2.4.4. Coefficient of Determination

The R-squared statistic, also known as the coefficient of determination (R²), is utilized to evaluate the goodness of fit of a regression model to the data. As shown in Equation (4), it is calculated by subtracting from 1 the ratio of the sum of squared residuals (SSR) to the total sum of squares (SST). In this equation, SSR is the sum of the squared differences between the actual and predicted values, while SST is the sum of the squared differences between the actual values and their mean.

R^{2} = 1 - \frac{\mathrm{SSR}}{\mathrm{SST}} = 1 - \frac{\sum_{i=1}^{n} {(\mathrm{Actual}_{i} - \mathrm{Pred}_{i})}^{2}}{\sum_{i=1}^{n} {(\mathrm{Actual}_{i} - \mathrm{Mean}(\mathrm{Actual}))}^{2}}

(4)
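Equation (4) can be computed directly from its two sums, as in the sketch below. The sample values are hypothetical.

```python
def r_squared(actual, predicted):
    """Coefficient of determination following Equation (4)."""
    mean_actual = sum(actual) / len(actual)
    ssr = sum((a - p) ** 2 for a, p in zip(actual, predicted))  # squared residuals
    sst = sum((a - mean_actual) ** 2 for a in actual)           # total sum of squares
    return 1.0 - ssr / sst

actual = [0.20, 0.40, 0.60]      # hypothetical measured C_c values
predicted = [0.30, 0.40, 0.50]   # hypothetical predictions
print(r_squared(actual, predicted))         # about 0.75
print(r_squared(actual, [0.40, 0.40, 0.40]))  # about 0: no better than the mean
```

A model that always predicts the mean of the actual values scores approximately zero, and a perfect model scores one, which gives R² a natural reference scale that MSE and RMSE lack.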

These metrics were used in the process of assessing and measuring the effectiveness of the models. For a problem that involves three float-type inputs and a float-type output, the problem is considered as a regression problem. Some commonly used metrics for regression models include MAPE, MSE, RMSE, and the coefficient of determination. The selected metrics were used to provide insight into the accuracy and precision of the models and to assess their overall effectiveness in solving the regression problem at hand.

3. Results and Discussion

3.1. Results of the Training and Validation Process

An epoch refers to a complete cycle of presenting the entire training or validation dataset to the network for learning or evaluation, respectively. During each training epoch, the network processes the training data, adjusts its parameters using optimization techniques like gradient descent, and gradually refines its predictive capabilities. Validation epochs involve feeding the validation dataset through the trained network to assess its performance on unseen data. Monitoring the network’s behavior over multiple epochs provides insights into its learning progress and convergence, helping to determine when to stop training to prevent overfitting or achieve optimal generalization.
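One common way to decide when to stop training from the per-epoch validation loss is a patience rule, sketched below. This is an illustration of the general idea only: the models in this study were trained for a fixed 100 epochs, and the loss values, the patience of three epochs, and the function name `best_epoch` are hypothetical.

```python
def best_epoch(val_losses, patience=5):
    """Return (stop_epoch, best_epoch), both 1-indexed, using a patience rule."""
    best_loss = float("inf")
    best = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            best_loss, best = loss, epoch   # validation loss improved
        elif epoch - best >= patience:
            return epoch, best              # no improvement for `patience` epochs
    return len(val_losses), best

# Hypothetical validation losses: improvement stalls after the fourth epoch.
val_losses = [0.90, 0.60, 0.40, 0.35, 0.36, 0.37, 0.38, 0.40, 0.41]
print(best_epoch(val_losses, patience=3))  # (7, 4)
```

A rising validation loss while the training loss keeps falling is the usual sign of overfitting, which is why the validation curve rather than the training curve drives the stopping decision.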

The training and validation results of the models, assessed using the MAPE metric, are presented in Figure 7a,b.

It can be concluded that models with lower MAPE values display better performance when compared to their counterparts. As a result, the comparative analysis highlights the M4SAB model as the least effective, evident from its significantly higher MAPE value in comparison to the other models. Upon closer inspection, M3SRM outperforms M4SAB but still falls short when compared to M3SAB and C4LRM, though with minor differences. Importantly, the M3SAB and C4LRM models deliver closely aligned outcomes, establishing them as the most proficient models. In the validation process, lower MAPE values in models indicate superior performance. Consequently, the M4SAB model emerges as the least successful, as indicated by its notably higher MAPE value compared to the other models. Further analysis of the graph reveals that the M3SAB, M3SRM, and C4LRM models exhibit closely similar results with subtle variations, occasionally marked by sporadic upward fluctuations.

The training results of the models, evaluated through the MSE metric, are depicted in Figure 8a. Models with lower MSE values indicate higher efficacy compared to their counterparts. Consequently, the M4SAB model stands out as the least effective, given its significantly higher MSE value in comparison to the other models. Further analysis reveals M3SRM’s superior performance relative to the M4SAB model, although with a slight gap from the M3SAB and C4LRM models. The M3SAB and C4LRM models exhibit closely aligned results, establishing them as the most proficient models in this context. Furthermore, the validation results of the models, assessed based on the mean squared error (MSE) metric, are displayed in Figure 8b. Lower MSE values in models signify superior performance within the group. Thus, the M4SAB model emerges as the least effective, characterized by a notably higher MSE value compared to its counterparts. The graphical representation demonstrates that the M3SAB, M3SRM, and C4LRM models produce closely aligned outcomes with minor differences, occasionally accompanied by sporadic upward deviations.

Figure 9a,b shows the R² values obtained after the training and validating process for 100 iterations.

Models with higher R² values indicate superior performance. In the training results, the M4SAB model is identified as the least effective due to its notably lower R² value compared to the other models, while the M3SAB, M3SRM, and C4LRM models show closely similar results, with slight variances and occasional downward trends. The validation results follow the same pattern: the M4SAB model is the least successful due to its substantially lower R² value, and the M3SAB, M3SRM, and C4LRM models demonstrate closely similar results with minor variations and intermittent downward fluctuations.

3.2. Results of the Test Dataset

The test results of output prediction value and comparisons between 88 samples of the test dataset (actual values) and predicted values from developed models are presented in Figure 10, Figure 11, Figure 12 and Figure 13, respectively. The prediction results of the models were the values calculated by the neural networks performing the regression operation. The performance of the developed models was assessed using evaluation metrics. The results obtained from the evaluation of the regression models indicate the performance and effectiveness of each model in predicting the consolidation data.

The evaluation metrics revealed that the M4SAB model performed less successfully, exhibiting a MAPE value of 71.60, an MSE value of 0.021, an RMSE value of 0.144, and an R² value of 0.776. M3SAB, which modified the network architecture of M4SAB (e.g., the number of layers), demonstrated significantly improved performance in the prediction of consolidation characteristics, achieving a MAPE value of 29.29, an MSE value of 0.017, an RMSE value of 0.132, and an R² value of 0.905. While the M3SRM model shared the same network structure as the M3SAB model, it employed root-mean-squared propagation as the optimizer algorithm and mean squared error as the loss function, resulting in significant differences in the evaluation metrics compared to the M3SAB model. The M3SRM model yielded a MAPE value of 52.99, an MSE value of 0.059, an RMSE value of 0.245, and an R² value of 0.735. The C4LRM model combined a multilayer perceptron with a convolutional network, an alternative deep learning structure commonly used for analyzing image data and extracting features. This integration resulted in the C4LRM model achieving the most successful prediction performance. The MAPE, MSE, RMSE, and R² values for the C4LRM model were 25.95, 0.013, 0.116, and 0.905, respectively. Table 6 presents the summary of the evaluation results for each model developed using different techniques. The C4LRM model demonstrated the highest level of accuracy and precision among all the models, as evidenced by the lowest values of MAPE, MSE, and RMSE and the highest value of the coefficient of determination.

Evaluating a regression model is vital for assessing its effectiveness and enhancing precision. Various techniques, such as data preprocessing, adjusting the network architecture, tuning weights, and optimizing hyperparameters, can be employed to improve the performance of regression models. Statistical analysis of the experimental results reveals that the C4LRM model outperforms all other models, with the M3SAB model closely approaching its performance. In contrast, the M4SAB and M3SRM models exhibit comparatively lower performance. Based on these results, no single design choice (the number of layers, activation function, optimizer algorithm, or loss function) played a dominant role in model performance on its own. Instead, the overall performance depended on selecting a suitable combination of these choices.

In summary, the discussion highlights the importance of evaluating and comparing regression models using appropriate metrics. The findings suggest that the C4LRM model, which has the most suitable configuration among those tested, is the most successful model, followed by M3SAB, while the M4SAB and M3SRM models demonstrate relatively lower performance. These conclusions provide valuable insights for further model refinement and optimization.

4. Conclusions

In this study, deep learning techniques were employed to predict consolidation in geotechnical engineering. Various methods were applied to enhance the performance of regression models, including data preprocessing, adjustments to the network architecture, and tuning of weights and hyperparameters. These techniques contributed to improving the precision and effectiveness of the models.

This study involved the development of four artificial neural network models, referred to as M4SAB, M3SAB, M3SRM, and C4LRM, using deep learning techniques for consolidation prediction. The M4SAB, M3SAB, and M3SRM models were based on the multilayer perceptron architecture. The C4LRM model combined a multilayer perceptron with a convolutional layer that serves as its first layer. The input parameters of the dataset were natural water content (w), liquid limit (LL), and plasticity index (PI), with the compression index (C_c) as the output. The dataset, consisting of 916 samples, was divided into three groups for training, validation, and testing.
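A three-way split of this kind can be sketched as follows. The total of 916 samples and the 88 test samples come from the text; the validation size of 88, the random shuffling, and the seed are assumptions made only for illustration, and the function name `split_indices` is hypothetical.

```python
import random

def split_indices(n_samples, n_val, n_test, seed=0):
    """Shuffle sample indices and split them into train, validation, and test groups."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # reproducible shuffle
    test = indices[:n_test]
    val = indices[n_test:n_test + n_val]
    train = indices[n_test + n_val:]
    return train, val, test

# 916 samples and 88 test samples as reported; n_val=88 is an assumed value.
train_idx, val_idx, test_idx = split_indices(916, n_val=88, n_test=88)
print(len(train_idx), len(val_idx), len(test_idx))
```

Keeping the test indices fixed across all four models ensures that the metrics are computed on the same held-out samples.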

During the training and validation processes, the effectiveness of the models was evaluated using four criteria: mean absolute percentage error (MAPE), mean squared error (MSE), root mean squared error (RMSE), and coefficient of determination (R²). Statistical analysis revealed that the C4LRM model demonstrated the most successful performance compared to the other models. Notably, the M3SAB model performed very similarly to C4LRM, indicating comparable predictive abilities. Conversely, the M4SAB and M3SRM models displayed relatively lower levels of performance.

Furthermore, the four neural network models were assessed using the MAPE, MSE, RMSE, and R² evaluation metrics in the testing phase, which confirmed C4LRM as the best-performing model. The assessment yielded a MAPE of 25.95, an MSE of 0.013, an RMSE of 0.116, and an R² of 0.905 for the C4LRM model. In particular, the C4LRM model exhibited the lowest MAPE, MSE, and RMSE values, and its R² value was matched only by M3SAB. A distinctive feature of the C4LRM model is the convolutional layer used as its first layer, which extracts features from the inputs before they are passed to the remaining layers.
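To make the idea of a convolutional first layer on tabular inputs more concrete, the sketch below applies a one-dimensional convolution to a single (w, LL, PI) sample and feeds the result into a linear dense layer. This is a generic illustration under stated assumptions, not the authors' C4LRM implementation: the kernel size, all weights, and the single dense layer are hypothetical, the functions `conv1d_valid` and `dense_linear` are written here for demonstration, and the input is left unscaled for simplicity.

```python
def conv1d_valid(x, kernel, bias=0.0):
    """1-D convolution (cross-correlation form, no padding) of a feature vector with one kernel."""
    k = len(kernel)
    return [sum(x[i + j] * kernel[j] for j in range(k)) + bias
            for i in range(len(x) - k + 1)]

def dense_linear(x, weights, biases):
    """Fully connected layer with a linear (identity) activation."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]

# First row of Table 2: w, LL, PI.
sample = [36.5, 74.0, 21.8]

# Hypothetical parameters chosen only to show the data flow.
kernel = [0.5, -0.25]
dense_weights = [[0.01, 0.02]]
dense_biases = [0.0]

features = conv1d_valid(sample, kernel)   # one value per pair of adjacent inputs
output = dense_linear(features, dense_weights, dense_biases)
print(features, output)
```

With three inputs and a kernel of size two, the convolution produces two features, each mixing a pair of adjacent inputs (w with LL, and LL with PI). In a trained network the kernel and dense weights would be learned rather than fixed.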

These findings indicate that model performance was not governed primarily by any single factor among the number of layers, activation function, optimizer algorithm, and loss function. Rather, performance depended on selecting a suitable combination of these settings. Overall, this study highlights the promising role of deep learning methods in consolidation prediction and demonstrates their potential to improve decision-making and optimize resource allocation in geotechnical engineering for the stability of structures.

Author Contributions

Conceptualization, M.K. and E.O.; methodology, M.K., R.K.T. and J.K.; software, M.A.S. and R.K.T.; validation, M.K. and M.A.S.; formal analysis, M.A.S. and R.K.T.; investigation, M.K. and J.K.; resources, E.O. and R.K.T.; data curation, M.K. and J.K.; writing—original draft preparation, M.K.; writing—review and editing, M.K., M.A.S. and J.K.; visualization, M.K. and M.A.S.; supervision, M.K.; project administration, M.K.; funding acquisition, J.K. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. NRF-2022R1C1C1011477).

Data Availability Statement

Data will be made available on request.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Azzouz, A.S.; Krizek, R.J.; Corotis, R.B. Regression analysis of soil compressibility. Soils Found. 1976, 16, 19–29.
2. Sari, P.T.K.; Firmansyah, Y.K. The empirical correlation using linear regression of compression index for surabaya soft soil. In Proceedings of the 2013 World Congress on Advances in Structural Engineering and Mechanics, ASEM13, Jeju, Republic of Korea, 8–12 September 2013; pp. 3008–3019.
3. Isik, F.; Ozden, G. Estimating compaction parameters of fine-and coarse-grained soils by means of artificial neural networks. Environ. Earth Sci. 2013, 69, 2287–2297.
4. Nguyen, M.D.; Pham, B.T.; Ho, L.S.; Ly, H.B.; Le, T.T.; Qi, C.; Le, V.M.; Le, L.M.; Prakash, I.; Bui, D.T. Soft-computing techniques for prediction of soils consolidation coefficient. Catena 2020, 195, 104802.
5. Baghbani, A.; Choudhury, T.; Costa, S.; Reiner, J. Application of artificial intelligence in geotechnical engineering: A state-of-the-art review. Earth-Sci. Rev. 2022, 228, 103991.
6. Qader, Z.B.; Karabash, Z.; Cabalar, A.F. Analyzing geotechnical characteristics of soils in Erbil via GIS and ANNs. Sustainability 2023, 15, 4030.
7. Jolfaei, S.; Lakirouhani, A. Sensitivity analysis of effective parameters in borehole failure, using neural network. Adv. Civ. Eng. 2022, 2022, 4958004.
8. Lakirouhani, A.; Jolfaei, S. Hydraulic fracturing breakdown pressure and prediction of maximum horizontal in situ stress. Adv. Civ. Eng. 2023, 2023, 8180702.
9. Zhang, W.; Li, H.; Li, Y.; Liu, H.; Chen, Y.; Ding, X. Application of deep learning algorithms in geotechnical engineering: A short critical review. Artif. Intell. Rev. 2021, 54, 5633–5673.
10. Kim, M.; Okuyucu, O.; Ordu, E.; Ordu, S.; Arslan, O.; Ko, J. Prediction of undrained shear strength by the GMDH-type neural network using SPT-value and soil physical properties. Materials 2022, 15, 6385.
11. Wang, Z.Z. Deep learning for geotechnical reliability analysis with multiple uncertainties. J. Geotech. Geoenviron. Eng. 2022, 148, 06022001.
12. Rizvi, Z.H.; Akhtar, S.J.; Sabeeh, W.T.; Wuttke, F. Effective thermal conductivity of unsaturated soils based on deep learning algorithm. In Proceedings of the 2nd International Conference on Energy Geotechnics, La Jolla, CA, USA, 10–13 April 2022; p. 04006.
13. Xu, J.J.; Zhang, H.; Tang, C.S.; Cheng, Q.; Liu, B.; Shi, B. Automatic soil desiccation crack recognition using deep learning. Geotechnique 2022, 72, 337–349.
14. Khatti, J.; Grover, K.S. Prediction of compaction parameters for fine-grained soil: Critical comparison of the deep learning and standalone models. J. Rock Mech. Geotech. Eng. 2023, 15, 3010–3038.
15. Kim, M.; Ordu, S.; Arslan, O.; Ko, J. Prediction of California bearing ratio (CBR) for coarse- and fine-grained soils using the GMDH-model. Geomech. Eng. 2023, 33, 183–194.
16. Terzaghi, K. Principles of Soil Mechanics: A Summary of Experimental Studies of Clay and Sand; McGraw-Hill: New York, NY, USA, 1926.
17. Bowles, J.E. Physical and Geotechnical Properties of Soils; McGraw-Hill: New York, NY, USA, 1984.
18. Das, B.M.; Sobhan, K. Principles of Geotechnical Engineering, SI ed.; Cengage Learning: Boston, MA, USA, 2014.
19. Casagrande, A.; Fadum, R.E. Notes on Soil Testing for Engineering Purposes: Soil Mech; Harvard University: Cambridge, MA, USA, 1940.
20. Das, B.M.; Sivakugan, N. Introduction to Geotechnical Engineering; Cengage Learning: Boston, MA, USA, 2015.
21. Duncan, J.M.; Wright, S.G.; Brandon, T.L. Soil Strength and Slope Stability; John Wiley & Sons: Hoboken, NJ, USA, 2014.
22. Germaine, J.T.; Germaine, A.V. Geotechnical Laboratory Measurements for Engineers; John Wiley & Sons: Hoboken, NJ, USA, 2009; pp. 140–160.
23. Raju, P.N.; Pandian, N.S.; Nagaraj, T.S. Analysis and estimation of the coefficient of consolidation. Geotech. Test. J. 1995, 18, 252–258.
24. Carrier, W.D. Consolidation parameters derived from index tests. Geotechnique 1985, 35, 211–213.
25. Uzielli, M.; Lacasse, S.; Nadim, F.; Phoon, K.K. Soil variability analysis for geotechnical practice. In Proceedings of the Second International Workshop on Characterization and Engineering Properties of Natural Soils, Singapore, 29 November–1 December 2006; pp. 1653–1752.
26. Dagdeviren, U.; Demir, A.S.; Kurnaz, T.F. Evaluation of the compressibility parameters of soils using soft computing methods. Soil Mech. Found. Eng. 2018, 55, 173–180.
27. Lacasse, S.; Nadim, F.; Rahim, A.; Guttormsen, T.R. Statistical description of characteristic soil properties. In Proceedings of the Offshore Technology Conference, OTC 2077, Houston, TX, USA, 30 April–3 May 2007; p. 19117.
28. Owusu-Boadu, B.; Nti, I.K.; Nyarko-Boateng, O.; Aning, J.; Boafo, V. Academic performance modelling with machine learning based on cognitive and non-cognitive features. Appl. Comput. Syst. 2021, 26, 122–131.
29. Fujita, H.; Fournier-Viger, P.; Ali, M. Editorial of special issue on emerging topics in applied intelligence. Appl. Intell. 2022, 52, 16991–16992.
30. Shafighfard, T.; Bagherzadeh, F.; Rizi, R.A.; Yoo, D. Data-driven compressive strength prediction of steel fiber reinforced concrete (SFRC) subjected to elevated temperatures using stacked machine learning algorithms. J. Mater. Res. Technol-JMRT 2022, 21, 3777–3794.
31. Fujita, H.; Selamat, A.; Lin, J.C.; Ali, M. Advances and Trends in Artificial Intelligence. In Proceedings of the 34th International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems, IEA/AIE 2021, Kuala Lumpur, Malaysia, 26–29 July 2021. Part II. Lecture Notes in Artificial Intelligence 2021.
32. Şeker, A.; Diri, B.; Balık, H.H. A review about deep learning methods and application. Gazi J. Eng. Sci. 2017, 3, 47–64.
33. Sarle, W.S. Neural Network FAQ, Part 1 of 7: Introduction, Periodic Posting to the Usenet Newsgroup Comp.Ai.Neural-Nets. 1997. Available online: ftp://ftp.sas.com/pub/neural/FAQ.html (accessed on 3 February 2024).
34. LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444.
35. Wang, Q.; Xie, X.; Shahrour, I. Deep learning model for shield tunneling advance rate prediction in mixed ground condition considering past operations. IEEE Access 2020, 8, 215310–215326.
36. Park, H.I.; Lee, S.R. Evaluation of the compression index of soils using an artificial neural network. Comput. Geotech. 2011, 38, 472–481.
37. Kurnaz, T.F.; Dagdeviren, U.; Yildiz, M.; Ozkan, O. Prediction of compressibility parameters of the soils using artificial neural network. SpringerPlus 2016, 5, 1–11.
38. Kabeta, W.F.; Feyessa, F.F.; Keneni, Y.F. Numerical modelling for prediction of compression index from soil index properties in Jimma town, Ethiopia. U.Porto J. Eng. 2022, 8, 102–120.
39. Ongun, Y.A. Determination of Ankara Clay Compression Index [Ankara Kilinin Sıkışma Indisinin Belirlenmesi Üzerine Bir Araştırma]. Master’s Thesis, Gazi University, Ankara, Turkey, 2005.
40. Satyanarayana, B.; Satyanarayana, C.N.V.R. Development of empirical equation for compressibility of marine clays. In Proceedings of the Indian Geotechnical Conference, GEOtrendz, Mumbai, India, 16–18 December 2010; Volume 16, pp. 885–886.
41. Kahraman, E. Statistical Analysis of Consolidation Parameters with Data Set Increased [Konsolidasyon Özelliklerinin Arttırılmış veri seti ile Istatistiksel Analizi]. Master’s Thesis, Istanbul Technical University, Institute of Science and Technology, İstanbul, Turkey, 2012.
42. Kalantary, F.; Kordnaeij, A. Prediction of compression index using artificial neural network. Sci. Res. Essays 2012, 7, 2835–2848.
43. Abdullah, W. Perceptron: An Introduction to Artificial Neural Networks. Available online: https://medium.com/@wasiqabdullah222/perceptron-an-introduction-to-artificial-neural-networks-e8c8749bca11 (accessed on 10 September 2023).
44. Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. In Proceedings of the 3rd International Conference for Learning Representations, ICLR 2015, San Diego, CA, USA, 7–9 May 2015.
45. Ruby, U.; Yendapalli, V. Binary cross entropy with deep learning technique for image classification. Int. J. Adv. Trends Comput. Sci. Eng. 2020, 9, 5393–5397.
46. Zhang, P.; Yin, Z.Y.; Jin, Y.F.; Chan, T.H.; Gao, F.P. Intelligent modelling of clay compressibility using hybrid meta-heuristic and machine learning algorithms. Geosci. Front. 2021, 12, 441–452.
47. Van Rossum, G.; Drake, F.L. Python 3 Reference Manual; CreateSpace: Scotts Valley, CA, USA, 2009.
48. Chollet, F. Keras. Available online: https://github.com/fchollet/keras (accessed on 11 November 2023).
49. Abadi, M.; Barham, P.; Chen, J.; Chen, Z.; Davis, A.; Dean, J.; Devin, M.; Ghemawat, S.; Irving, G.; Isard, M.; et al. TensorFlow: A system for large-scale machine learning. In Proceedings of the 12th USENIX Symposium on Operating Systems Design and Implementation, OSDI 16, Savannah, GA, USA, 2–4 November 2016; pp. 265–283.
50. Bisong, E. Building Machine Learning and Deep Learning Models on Google Cloud Platform; Apress: Berkeley, CA, USA, 2019.

Figure 1. Diagram of relationships between subfields of AI.

Figure 2. All 916 samples in the plasticity chart.

Figure 3. Histograms of the distribution of the dataset.

Figure 4. Comparison between biological neurons and artificial neurons (perceptron) [43].

Figure 5. Schematic architectures of the developed models: (a) M4SAB; (b) M3SAB and M3SRM; (c) C4LRM.

Figure 6. Flowchart of the runtime environment.

Figure 7. MAPE results: (a) training; (b) validation process.

Figure 8. MSE results: (a) training; (b) validation process.

Figure 9. Coefficient of determination results: (a) training; (b) validation process.

Figure 10. Results of M4SAB model: (a) test results; (b) comparison between actual and predicted values using M4SAB.

Figure 11. Results of M3SAB model: (a) test results; (b) comparison between actual and predicted values using M3SAB.

Figure 12. Results of M3SRM model: (a) test results; (b) comparison between actual and predicted values using M3SRM.

Figure 13. Results of C4LRM model: (a) test results; (b) comparison between actual and predicted values using C4LRM.

Table 1. Geotechnical parameters and used algorithms in previous studies.

Reference	Geotechnical Parameter	Used Algorithm
Park and Lee (2011) [36]	Water content, w (%)	Artificial neural network (ANN)
	Void ratio, e (-)
	Liquid limit, LL (%)
	Plasticity index, PI (%)
	Specific gravity, G_s (-)
	Weight percentage of sand, W_sand (%)
	Weight percentage of silt, W_silt (%)
	Weight percentage of clay, W_clay (%)
	Compression index, C_c (-)
Kurnaz et al. (2016) [37]	Water content, w (%)	Artificial neural network (ANN)
	Void ratio, e (-)
	Liquid limit, LL (%)
	Plasticity index, PI (%)
	Compression index, C_c (-)
Kabeta et al. (2022) [38]	Liquid limit, LL (%)	Artificial neural network (ANN) and Regression analysis (LR)
	Plastic limit, PL (%)
	Plasticity index, PI (%)
	Compression index, C_c (-)

Table 2. Sample dataset of parameters.

No. of Data	Water Content, w (%)	Liquid Limit, LL (%)	Plasticity Index, PI (%)	Compression Index, C_c (-)
1	36.5	74.0	21.8	0.348
2	25.7	36.5	22.7	0.214
3	34.6	33.5	20.1	0.183
4	43.5	60.0	31.2	0.427
5	45.0	60.5	33.3	0.423
…	…	…	…	…
912	80.0	95.0	34.0	0.880
913	77.0	91.0	32.0	0.840
914	75.0	100.0	34.0	0.830
915	76.0	71.0	31.0	0.800
916	75.0	63.0	30.0	0.760

Table 3. Descriptive statistics of parameters.

Parameter	Minimum	Maximum	Mean	Standard Deviation
Water content, w (%)	10.2	99	34.103	13.645
Liquid limit, LL (%)	0	103	48.074	17.165
Plasticity index, PI (%)	0	96.6	24.156	12.609
Compression index, C_c (-)	0.05	0.996	0.288	0.174

Table 4. Summary of model development.

Model	Configuration	No. of Layers	Activation Function	Optimizer Algorithm	Loss Function	No. of Epochs
M4SAB	Multilayer perceptron (MLP)	4	Sigmoid	Adam optimizer algorithm	Binary Cross Entropy	100
M3SAB	Multilayer perceptron (MLP)	3	Sigmoid	Adam optimizer algorithm	Binary Cross Entropy	100
M3SRM	Multilayer perceptron (MLP)	3	Sigmoid	Root-mean-squared propagation	Mean Squared Error	100
C4LRM	MLP + Convolutional network	4	Linear	Root-mean-squared propagation	Mean Squared Error	100
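The configurations in Table 4 can be captured in a small data structure, which makes it easy to see which settings differ between any two models. The sketch below is illustrative only: it reads the blank cells of the table as merged cells that repeat the value above them (an assumption), takes the C4LRM configuration to be an MLP combined with a convolutional network as described in the text, and the names `ModelConfig` and `differing_fields` are hypothetical.

```python
from dataclasses import dataclass, fields

@dataclass(frozen=True)
class ModelConfig:
    name: str
    configuration: str
    n_layers: int
    activation: str
    optimizer: str
    loss: str
    epochs: int

# Values transcribed from Table 4; blank cells are assumed to repeat the row above.
CONFIGS = [
    ModelConfig("M4SAB", "MLP", 4, "sigmoid", "Adam", "binary cross entropy", 100),
    ModelConfig("M3SAB", "MLP", 3, "sigmoid", "Adam", "binary cross entropy", 100),
    ModelConfig("M3SRM", "MLP", 3, "sigmoid", "RMSprop", "mean squared error", 100),
    ModelConfig("C4LRM", "MLP + convolutional network", 4, "linear", "RMSprop",
                "mean squared error", 100),
]
BY_NAME = {cfg.name: cfg for cfg in CONFIGS}

def differing_fields(a, b):
    """Return the names of the settings (other than the model name) that differ."""
    return [f.name for f in fields(a)
            if f.name != "name" and getattr(a, f.name) != getattr(b, f.name)]

print(differing_fields(BY_NAME["M4SAB"], BY_NAME["M3SAB"]))
print(differing_fields(BY_NAME["M3SAB"], BY_NAME["M3SRM"]))
print(differing_fields(BY_NAME["M3SRM"], BY_NAME["C4LRM"]))
```

Under this reading, M4SAB and M3SAB differ only in the number of layers, and M3SAB and M3SRM differ only in optimizer and loss function, so those two pairs act as single-factor or two-factor comparisons. C4LRM differs from M3SRM in configuration, layer count, and activation at once, so its gain cannot be attributed to one setting.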

Table 5. Results of sensitivity analysis of input variables.

Model	Water Content, w (%)	Liquid Limit, LL (%)	Plasticity Index, PI (%)
M4SAB	59.26	10.35	84.28
M3SAB	66.20	22.94	42.61
M3SRM	18.82	58.36	57.49
C4LRM	51.96	58.22	53.91

Table 6. Evaluation metrics on results of predicted values.

Model	MAPE	MSE	RMSE	R²
M4SAB	71.60	0.021	0.144	0.776
M3SAB	29.29	0.017	0.132	0.905
M3SRM	52.99	0.059	0.245	0.735
C4LRM	25.95	0.013	0.116	0.905
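The values in Table 6 can be ranked and cross-checked with a few lines of code. The sketch below uses only the numbers reported in the table; the 0.003 tolerance for comparing RMSE with the square root of MSE is an assumption that allows for rounding of the published values to three decimals.

```python
import math

# Values transcribed from Table 6.
RESULTS = {
    "M4SAB": {"MAPE": 71.60, "MSE": 0.021, "RMSE": 0.144, "R2": 0.776},
    "M3SAB": {"MAPE": 29.29, "MSE": 0.017, "RMSE": 0.132, "R2": 0.905},
    "M3SRM": {"MAPE": 52.99, "MSE": 0.059, "RMSE": 0.245, "R2": 0.735},
    "C4LRM": {"MAPE": 25.95, "MSE": 0.013, "RMSE": 0.116, "R2": 0.905},
}

# Best model per error metric (lower is better).
best_by_error = {metric: min(RESULTS, key=lambda m: RESULTS[m][metric])
                 for metric in ("MAPE", "MSE", "RMSE")}

# Models sharing the highest R2 (higher is better).
top_r2 = max(v["R2"] for v in RESULTS.values())
tied_on_r2 = sorted(m for m, v in RESULTS.items() if v["R2"] == top_r2)

# Consistency check: reported RMSE should be close to sqrt(reported MSE).
rmse_gap = {m: abs(math.sqrt(v["MSE"]) - v["RMSE"]) for m, v in RESULTS.items()}

print(best_by_error)
print(tied_on_r2)
print({m: round(g, 4) for m, g in rmse_gap.items()})
```

C4LRM has the lowest value on all three error metrics, while C4LRM and M3SAB share the highest R² of 0.905, so the ranking between those two models rests on the error metrics rather than on R².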


© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
