BiLSTM for Predicting Post-Construction Subsoil Settlement under Embankment: Advancing Sustainable Infrastructure

Wang, Liyang; Li, Taifeng; Wang, Pengcheng; Liu, Zhenyu; Zhang, Qianli

doi:10.3390/su152014708


by Liyang Wang, Taifeng Li *, Pengcheng Wang, Zhenyu Liu and Qianli Zhang

Railway Engineering Research Institute, China Academy of Railway Sciences Co., Ltd., Beijing 100081, China

* Author to whom correspondence should be addressed.

Sustainability 2023, 15(20), 14708; https://doi.org/10.3390/su152014708

Submission received: 6 September 2023 / Revised: 25 September 2023 / Accepted: 8 October 2023 / Published: 10 October 2023

(This article belongs to the Special Issue AI in Action: Advancing Infrastructure Inspection, Monitoring, and Management for Sustainable and Resilient Performance)


Abstract:

The load and settlement histories of stage-constructed embankments provide critical insights into long-term surface behavior under embankment loading. However, these data often remain underutilized in predicting post-construction settlement in the absence of geotechnical subsoil characterization. To address this limitation, the current study integrates bidirectional long short-term memory (BiLSTM) into a three-phase framework: data preparation, model construction, and performance evaluation. In the data preparation phase, the feature vector comprises basal pressure, pressure increments, time intervals, and prior settlement values to facilitate a rolling forecast. To manage unevenly spaced data, an Akima spline standardizes the desired time intervals. The model’s efficacy is validated using observational data from two distinct construction case studies, each featuring diverse soil conditions. BiLSTM proves effective in identifying key attributes from load and settlement data during the staged construction process. Compared to traditional curve-fitting methods, the BiLSTM model exhibits superior performance, robustness, and adaptability to varying soil conditions. Additionally, the model demonstrates low sensitivity to the range of post-construction data, allowing for a data collection period reduction—from six months to three—without compromising prediction accuracy (relative error = 0.92%). These advantages not only optimize resource allocation but also contribute to broader sustainability objectives.

Keywords:

embankments; settlement; neural networks; monitoring; staged construction

1. Introduction

Over the past few decades, extensive construction of highway and railway projects has occurred on compressible soil deposits within transportation infrastructure [1]. Accurate prediction of subsoil settlement is not only a significant challenge for project management in large-scale construction but also a cornerstone of sustainable development [2,3]. Despite the utilization of advanced geotechnologies and rational calculations [4], achieving reliable predictions remains elusive. Field instrumentation, particularly via settlement gauge or plate readings at locations anticipated to experience maximum subsidence [5], provides reliable data. Analysis of this data enables the assessment of foundation soil stability, consolidation behavior, and, ultimately, more sustainable long-term planning by minimizing the risks of post-construction settlement.

Several established observational methods exist for predicting future settlement behavior based on historical settlement records. Notable among these are the hyperbolic method, the Asaoka method [6], and the exponential curve method. These graphical methods primarily employ regression analysis of observed settlement–time–fill height data from the post-construction phase while often neglecting data gathered during staged embankment construction. Even when combined models are utilized, the adaptability and predictive accuracy of these traditional methods can be questionable across diverse scenarios. More recent advancements include techniques such as the Richards model [7], the SVM model [6], the back-analysis method based on a genetic algorithm [8], and the gray forecasting model [9]. These advanced methods generally offer improvements over traditional approaches. However, they either dismiss the valuable observational data collected during the staged construction phase—thus requiring extensive post-construction training data—or prove to be time-consuming.

Another significant category of forecasting approaches stems from analytical and numerical modeling techniques [10,11,12,13,14]. These methods, while comprehensive, often necessitate a large number of parameters to be determined [15] or are constrained to specific scenarios. Given these limitations, there persists a substantial need for a more efficient predictive model. Such a model would ideally leverage comprehensive construction information and minimize the volume of observational data required for robust settlement prediction, thereby contributing to sustainability by reducing the need for excessive material use or rework.

Deep learning algorithms, a subset of machine learning techniques, have demonstrated exceptional performance in geotechnical engineering, primarily due to their superior feature learning and expression capabilities [16,17,18,19]. Specifically, recurrent neural networks (RNNs) constitute a specialized category within artificial neural networks (ANNs). These networks feature connections between nodes that form a directed graph along a temporal sequence, enabling them to exhibit temporal dynamic behavior [20]. RNNs use internal memory to process variable-length sequential inputs, making them particularly apt for settlement prediction [21,22,23]. This suitability arises because the subsoil’s settlement response is closely linked to both the current conditions and the history of previous loads and settlements during the construction phase.

Despite these advantages, RNNs possess limitations, chiefly their capacity to learn only short-term memory information, thereby neglecting memory stored over more extended periods. To mitigate the vanishing gradient problem that often occurs during RNN training, long short-term memory (LSTM) networks have been developed, incorporating a specialized gate mechanism [24]. An LSTM unit typically comprises a cell, an input gate, an output gate, and a forget gate. One significant advantage of LSTM over RNNs is its relative insensitivity to time gap lengths, as the cell retains values over arbitrary intervals. This feature allows LSTM to more effectively predict unevenly spaced time series, especially when dealing with irregularly observed data from field monitoring.

Long short-term memory (LSTM) and its variant bidirectional LSTM (BiLSTM) have found extensive applications in civil engineering. These include modeling soil-structure interface shear behavior [25], predicting landslide displacement [26], and estimating maximum ground settlement induced via shield tunneling [27]. In the case of predicting ground settlement in cemented karst regions [28], LSTM algorithms incorporate prior knowledge such as soil and groundwater profiles, tunnel geometry, and tunneling operation parameters. However, the effectiveness of LSTM in capturing the impact of cemented karst caves on ground settlement largely depends on the dataset size.

Another application of BiLSTM focuses on predicting long-term subsoil settlement under embankment loading. The history of load and settlement in stage-constructed embankments provides valuable insights into future deformation behavior. Specifically, the changes in subsoil settlement as embankment load is sustained or increased reveal key mechanical properties of the subsoil. Accordingly, this paper introduces a predictive model based on BiLSTM that leverages observed data both during and after the embankment construction phase. The model consistently outperforms standard LSTM [29]. Notably, this BiLSTM-based approach offers advantages over models tailored for tunneling-induced ground settlement [28]. It requires fewer preliminary data, since it relies on the load-settlement history throughout both the construction and post-construction phases and does not depend on strata properties. Moreover, the model achieves robust time series forecasting by incorporating routinely collected data from the construction phase—data often overlooked in earlier studies and professional practices.

The structure of this article unfolds as follows: Initially, the methodology for both standard curve-fitting methods and a BiLSTM-based algorithm is delineated. Following this, two case histories are presented, encompassing field instrumentation, soil conditions, and monitoring specifics. The article then shifts to data preprocessing methods, illustrating the conversion of raw data into valid input sequences. Finally, the paper details the model construction process and evaluates the performance of BiLSTM-based models in comparison to curve-fitting approaches.

2. Methods

2.1. Traditional Approaches

Several types of curve-fitting have historically been employed to estimate post-construction settlement trends. The technique enjoys broad acceptance due to its ease of implementation [30]. For this study, three particular curve-fitting methods serve as baseline models to represent conventional approaches. While these fitting models offer long-term settlement predictions, their results can occasionally deviate significantly from field monitoring data. A noted limitation is that accurate predictions usually require at least six months of data following the completion of fill placement [31]. Physically based models, although related to soft soil consolidation, are heavily reliant on certain assumptions. Moreover, the applicability of these models is restricted when it comes to diverse site conditions or soil subjected to ground improvement techniques. The first curve-fitting model considered [32] in this context is given by

S_{t} = S_{0} + \frac{t - t_{0}}{α + β (t - t_{0})}

(1)

where t_{0} is the beginning time, S_{0} is the settlement at t_{0}, and α and β are fitting parameters to be determined. The degree of consolidation (U) is described as the ratio of the ground surface settlement at any given time t to the final settlement. The model is pertinent for scenarios where 60% < U < 90%, assuming that the rate of settlement in the consolidating layer aligns with the average rate of pore pressure dissipation under one-dimensional consolidation conditions [33]. A second curve-fitting model under consideration is formulated as follows:

S_{t} = (S_{f} - S_{d}) (1 - α e^{- β t}) + S_{d}

(2)

where S_{t}, S_{f}, and S_{d} are the settlement at t, the final settlement upon completion of soil consolidation, and the immediate settlement, respectively. Here, α and β are also fitting parameters to be identified. It is worth noting that S_{d} and α are interdependent, and α can occasionally be calculated directly using a theoretical solution. This model adheres to Terzaghi’s one-dimensional consolidation theory. Such two-parameter models consider primary consolidation but neglect secondary consolidation, thereby potentially underestimating long-term settlement.

The third fitting model introduces greater flexibility as a three-parameter model [34], represented by

S_{t} = m_{0} [1 - \frac{1}{1 + (t / m_{2})^{m_{1}}}]

(3)

Here, m_{0}, m_{1}, and m_{2} are the fitting parameters requiring identification. The nonlinear least-squares problems inherent in these three types of curve-fitting are commonly solved with the damped least-squares (DLS) method [35].
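As a minimal sketch, Equations (1)–(3) can be written as plain Python functions. The function names and the parameter values used below are illustrative assumptions, not values fitted to the case studies. In practice the parameters would be identified by a damped least-squares routine (for example, `scipy.optimize.curve_fit` with `method="lm"`), which is not shown here.

```python
import math

def hyperbolic(t, s0, t0, alpha, beta):
    """Equation (1): S_t = S_0 + (t - t_0) / (alpha + beta * (t - t_0))."""
    dt = t - t0
    return s0 + dt / (alpha + beta * dt)

def exponential(t, s_f, s_d, alpha, beta):
    """Equation (2): S_t = (S_f - S_d) * (1 - alpha * exp(-beta * t)) + S_d."""
    return (s_f - s_d) * (1.0 - alpha * math.exp(-beta * t)) + s_d

def three_parameter(t, m0, m1, m2):
    """Equation (3): S_t = m_0 * [1 - 1 / (1 + (t / m_2) ** m_1)]."""
    return m0 * (1.0 - 1.0 / (1.0 + (t / m2) ** m1))

# Made-up parameters (settlement in mm, time in days), for illustration only.
hyper_limit = 50.0 + 1.0 / 0.01                             # Eq. (1) tends to S_0 + 1/beta
hyper_late = hyperbolic(1e9, s0=50.0, t0=0.0, alpha=0.5, beta=0.01)
expo_start = exponential(0.0, s_f=120.0, s_d=20.0, alpha=1.0, beta=0.02)
expo_late = exponential(1e4, s_f=120.0, s_d=20.0, alpha=1.0, beta=0.02)
half_way = three_parameter(30.0, m0=100.0, m1=1.5, m2=30.0)  # t = m_2 gives m_0 / 2
```

The limiting values make the role of each parameter visible: Equation (1) approaches S_0 + 1/β, Equation (2) approaches S_f, and Equation (3) reaches half of m_0 at t = m_2.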

2.2. Data-Driven Modeling

BiLSTM represents an advanced form of recurrent neural network (RNN) architecture designed to capture long-term dependencies in both forward and reverse temporal sequences [36]. BiLSTM integrates two LSTM elements: one processes input data in the forward direction, while the other handles it in the reverse direction [37]. Unlike traditional LSTM, which uses only forward sequential information, BiLSTM incorporates both forward and reverse temporal data by concatenating the outputs of hidden layers. Although BiLSTM doubles the number of weights and biases relative to LSTM, the trade-off is justified for settlement prediction tasks with limited datasets and features, as the added complexity does not significantly hinder computational efficiency.
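The doubling of weights and biases can be checked with a short count. The sketch below assumes the standard LSTM parameterization used in Equations (9)–(12) (one W, U, and b per gate or candidate), four input features as in Equation (5), and 64 hidden units as listed in Table 2. The helper names are hypothetical.

```python
def lstm_param_count(input_dim, units):
    # Four blocks (input gate, output gate, forget gate, cell candidate), each with
    # W (units x input_dim), U (units x units), and b (units).
    return 4 * (units * input_dim + units * units + units)

def bilstm_param_count(input_dim, units):
    # The forward and backward directions each hold their own parameter set.
    return 2 * lstm_param_count(input_dim, units)

lstm_params = lstm_param_count(input_dim=4, units=64)
bilstm_params = bilstm_param_count(input_dim=4, units=64)
print(lstm_params, bilstm_params)
```

Under these assumptions the first bidirectional layer holds a few tens of thousands of parameters, which is small by deep learning standards and consistent with the statement that the added complexity has little effect on computational cost.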

Regarding input management for BiLSTM, the algorithm can handle large or redundant datasets by transforming them into a more concise set of features, known as the feature vector. This process, termed feature extraction, aims to retain essential information while facilitating generalization [25]. To enhance the generalization capability of a BiLSTM-based predictive model, selected features should encompass both system-specific properties and elements related to the construction sequence. For example, the relationship between soil foundation settlement and construction phases is evident in Figure 1a.

The algorithm will include basal pressure p_{t} and the incremental basal pressure {∆ p}_{t}. Moreover, the construction rate significantly influences immediate settlement, prompting the inclusion of the time interval {∆ T}_{t} for basal pressure increments. As settlement performance at a given time step is highly dependent on the previous step, the preceding settlement value S_{t - 1} is also considered for predictive modeling. Therefore, the general expression for the BiLSTM-based predictive model incorporates these variables:

S_{t} = f (p_{t}, {∆ p}_{t}, {∆ T}_{t}, S_{t - 1})

(4)

where S_{t} denotes the predicted settlement at the tth time step. The variable {∆ T}_{t} is informed by the slope of the load and settlement time series curves. Specifically, steep slopes call for smaller {∆ T}_{t} values, while flatter slopes permit larger values. This tailored approach ensures that {∆ T}_{t} values are both informative and nonredundant, thereby contributing to an effective training process for the BiLSTM-based model.

During data preprocessing, unevenly spaced time series may require interpolation for effective model training. The Akima spline technique can be employed to adjust datapoints with large spacing [38], ensuring that the derived time interval {∆ T}_{t} is consistent with other datapoints. This adjustment is particularly important when the native time interval {∆ T}_{t} is too elongated to be integrated with equal weighting in the model’s training phase.
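A minimal sketch of this resampling step is given below, assuming SciPy's `Akima1DInterpolator` as the spline implementation (the paper does not name the library it used). The monitoring record and the 6-day grid are hypothetical values chosen for illustration.

```python
import numpy as np
from scipy.interpolate import Akima1DInterpolator

# Hypothetical, unevenly spaced monitoring record (days, settlement in mm).
days = np.array([0.0, 3.0, 6.0, 9.0, 16.0, 30.0, 60.0, 90.0])
settlement = np.array([0.0, 4.0, 9.0, 13.0, 20.0, 27.0, 33.0, 36.0])

akima = Akima1DInterpolator(days, settlement)

# Resample on a regular 6-day grid inside the observed range (no extrapolation).
grid = np.arange(days[0], days[-1] + 1e-9, 6.0)
resampled = akima(grid)
```

The spline passes through every observed point, so the resampled series only fills the gaps between readings and does not alter the measurements themselves.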

The feature vector x_{t} can be represented as

x_{t} = {[\begin{matrix} p_{t} \\ {∆ p}_{t} \\ {∆ T}_{t} \\ S_{t - 1} \end{matrix}]}_{4 \times 1}

(5)

To establish a sequence of equal-length input data d_{t}, a moving window approach is employed:

d_{t} = {[x_{t - l}, \dots, x_{t - 2}, x_{t - 1}]}_{4 \times l}^{T} (t > l)

(6)

where l signifies the prescribed length of the time series, set at 5. Two distinct datasets are defined: D for training data and D^{'} for testing data. They are given by:

D = \{d_{t 0 - l}, \dots, d_{t 0 - 1}\}

(7)

D^{'} = \{d_{t 0}, \dots, d_{T}\}

(8)

where t_{0} marks the cutoff time specific to a training set, and T indicates the total monitoring days.
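The feature vector of Equation (5) and the moving window of Equation (6) can be sketched as follows. The function names and the toy series are assumptions made for illustration. The window is taken literally from Equation (6), so it ends at x_{t - 1}.

```python
def build_features(pressure, days, settlement):
    """Return x_t = [p_t, dp_t, dT_t, S_{t-1}] for t = 1 .. n-1 (Equation (5))."""
    rows = []
    for t in range(1, len(pressure)):
        rows.append([
            pressure[t],
            pressure[t] - pressure[t - 1],   # incremental basal pressure
            days[t] - days[t - 1],           # time interval
            settlement[t - 1],               # preceding settlement
        ])
    return rows

def make_windows(rows, l=5):
    """Return every window of l consecutive feature vectors (Equation (6))."""
    return [rows[end - l:end] for end in range(l, len(rows) + 1)]

# Toy record: basal pressure (kPa), elapsed days, settlement (mm); hypothetical values.
pressure = [0, 20, 40, 60, 60, 80, 100, 100, 100, 100]
days = [0, 6, 12, 18, 24, 30, 36, 66, 96, 126]
settlement = [0, 5, 11, 18, 22, 30, 39, 44, 47, 49]

rows = build_features(pressure, days, settlement)
windows = make_windows(rows, l=5)
```

Each window is an l × 4 block. Splitting the list of windows at the index that corresponds to the cutoff time t_{0} gives the training set D and the testing set D^{'}.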

For model evaluation, k-fold cross-validation with gaps, known as GapKFold, is utilized [39]; see Figure 1b. A 5-fold cross-validation technique helps to identify issues such as selection bias and overfitting while providing insights into the model’s generalization capability [40]. To avoid data leakage, gaps are instituted between the training, validation, and test sets, with the gap size equal to the time step [39].
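The idea of leaving gaps around each test block can be sketched with a small hand-written splitter. This is an illustrative assumption about how such a split may be built, not the API of the GapKFold implementation cited in [39]. The function name, the sample count of 60, and the gap of 5 are chosen for the example.

```python
def gap_kfold_indices(n_samples, n_splits=5, gap=5):
    """Yield (train, test) index lists with `gap` samples dropped on each side of the test block."""
    base, extra = divmod(n_samples, n_splits)
    start = 0
    for fold in range(n_splits):
        stop = start + base + (1 if fold < extra else 0)
        test = list(range(start, stop))
        # Keep only samples that lie more than `gap` positions away from the test block.
        train = [i for i in range(n_samples) if i < start - gap or i >= stop + gap]
        yield train, test
        start = stop

folds = list(gap_kfold_indices(n_samples=60, n_splits=5, gap=5))
```

Because consecutive windows overlap by l - 1 steps, dropping a number of samples equal to the time step on each side keeps any training window from sharing observations with a test window.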

Figure 2 illustrates the architecture of the BiLSTM model, highlighting shared memory cell structures for both forward and backward directions. The weight matrices and bias vector for data processed in the forward direction are designated as {}^{f}W, {}^{f}U, and {}^{f}b. Correspondingly, {}^{b}W, {}^{b}U, and {}^{b}b are used for the backward direction. The text focuses on detailing the calculation procedure for forward directional data to clarify the BiLSTM algorithm’s functioning. For the sake of brevity, the computational steps for the backward direction are not presented. Despite differences in the bias vectors and weight matrices between the two directions, their computational methods remain analogous.

The input x_{t} initially traverses the first hidden layer. The outputs at the input, output, and forget gates are mathematically represented by Equations (9)–(11). The cell input activation vector {}^{f}{\tilde{c}}_{t} and cell state vector {}^{f}c_{t} are subsequently determined, as depicted by Equations (12) and (13). These vectors employ the Hadamard product, denoted by ⊙. The hidden layer output {}^{f}h_{t} is then formulated as per Equation (14).

{}^{f}i_{t} = σ ({}^{f}W^{i} x_{t} + {}^{f}U^{i} {}^{f}h_{t - 1} + {}^{f}b^{i})

(9)

{}^{f}o_{t} = σ ({}^{f}W^{o} x_{t} + {}^{f}U^{o} {}^{f}h_{t - 1} + {}^{f}b^{o})

(10)

{}^{f}f_{t} = σ ({}^{f}W^{f} x_{t} + {}^{f}U^{f} {}^{f}h_{t - 1} + {}^{f}b^{f})

(11)

{}^{f}{\tilde{c}}_{t} = \tanh ({}^{f}W^{c} x_{t} + {}^{f}U^{c} {}^{f}h_{t - 1} + {}^{f}b^{c})

(12)

{}^{f}c_{t} = {}^{f}f_{t} ⊙ {}^{f}c_{t - 1} + {}^{f}i_{t} ⊙ {}^{f}{\tilde{c}}_{t}

(13)

{}^{f}h_{t} = {}^{f}o_{t} ⊙ \tanh ({}^{f}c_{t})

(14)

For sequential data in the backward direction, {}^{b}h_{t} can be calculated by substituting {}^{f}W, {}^{f}U, and {}^{f}b with {}^{b}W, {}^{b}U, and {}^{b}b in Equations (9)–(14). The first hidden layer’s final output, {}_{1}h_{t}, is obtained by concatenating {}^{f}h_{t} and {}^{b}h_{t}, as defined by Equation (15).

{}_{1}h_{t} = {}^{f}h_{t} ⨁ {}^{b}h_{t}

(15)
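The forward pass of one bidirectional layer can be sketched in NumPy. The code follows the standard LSTM recurrence that Equations (9)–(14) describe, with the cell state carried over from the preceding step, and concatenates the two directions as in Equation (15). The random weights, the eight hidden units, and all function names are illustrative assumptions. This is not the trained model.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_params(input_dim, units, rng):
    """Random W (units x input_dim), U (units x units), b (units) for blocks i, o, f, c."""
    return {
        g: (rng.normal(scale=0.1, size=(units, input_dim)),
            rng.normal(scale=0.1, size=(units, units)),
            np.zeros(units))
        for g in "iofc"
    }

def lstm_pass(xs, params, units):
    """Apply Equations (9)-(14) along a sequence and return the hidden state at every step."""
    h = np.zeros(units)
    c = np.zeros(units)
    hs = []
    for x in xs:
        pre = {g: W @ x + U @ h + b for g, (W, U, b) in params.items()}
        i = sigmoid(pre["i"])            # input gate, Eq. (9)
        o = sigmoid(pre["o"])            # output gate, Eq. (10)
        f = sigmoid(pre["f"])            # forget gate, Eq. (11)
        c_tilde = np.tanh(pre["c"])      # cell input activation, Eq. (12)
        c = f * c + i * c_tilde          # cell state with element-wise products, Eq. (13)
        h = o * np.tanh(c)               # hidden output, Eq. (14)
        hs.append(h)
    return np.stack(hs)

def bilstm_layer(xs, fwd, bwd, units):
    """Concatenate forward states with time-aligned backward states (Equation (15))."""
    h_fwd = lstm_pass(xs, fwd, units)
    h_bwd = lstm_pass(xs[::-1], bwd, units)[::-1]
    return np.concatenate([h_fwd, h_bwd], axis=1)

rng = np.random.default_rng(0)
window = rng.random((5, 4))              # l = 5 steps, 4 features (placeholder data)
fwd = init_params(4, 8, rng)
bwd = init_params(4, 8, rng)
out = bilstm_layer(window, fwd, bwd, units=8)
```

The output has one row per time step and twice the hidden width, which is the input the second hidden layer receives in place of x_{t}.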

Transitioning from the first to the second hidden layer, the final output {}_{2}h_{t} is computed based on Equations (9)–(15). Here, x_{t} is replaced by {}_{1}h_{t}, and the parameters are updated with those derived from the second hidden layer at the preceding time step. The relevant weight matrices and bias vectors from the second hidden layer are also utilized.

The final stage involves linking the second hidden layer to the output layer. A single output layer is incorporated, as the current settlement is most pertinent to the settlement at the preceding step. At the tth step, the output y of the BiLSTM-based predictive model is expressed by Equation (16), utilizing the weight matrix W and bias vector b of the output layer. The linear output activation function is applied to directly return the weighted sum of the input without modification. Thus, the mathematical structure of the BiLSTM-based predictive model is fully delineated in Equations (9)–(16), yielding a single settlement value at each step.

y = W \times {}_{2}h_{t} + b

(16)

As illustrated in Figure 1c, the developed BiLSTM-based model integrates a rolling forecast model [41]. Rolling forecast is a technique that continuously generates future estimates based on historical data. This approach proves useful for monitoring soil foundation settlement, as it allows for ongoing recalibration based on real-time observations.
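A rolling forecast can be sketched as a loop that feeds each prediction back into the next feature vector as the preceding settlement. In the sketch below, `toy_predict` is a hypothetical stand-in for the trained network, the numbers are invented, and the window is assumed to consist of the most recent l feature vectors including the current one. None of these details are taken from the paper.

```python
def rolling_forecast(history, last_settlement, future_loads, predict, l=5):
    """Predict one step at a time, reusing each prediction as S_{t-1} for the next step."""
    rows = [list(r) for r in history]        # each row: [p, dp, dT, S_prev]
    s_prev = last_settlement
    predictions = []
    for p, dp, dT in future_loads:
        rows.append([p, dp, dT, s_prev])     # feature vector for the new step
        s_prev = predict(rows[-l:])          # most recent l feature vectors
        predictions.append(s_prev)
    return predictions

def toy_predict(window):
    # Stand-in model: continue with half of the latest settlement increment.
    return window[-1][3] + 0.5 * (window[-1][3] - window[-2][3])

history = [
    [60.0, 0.0, 6.0, 18.0],
    [80.0, 20.0, 6.0, 22.0],
    [100.0, 20.0, 6.0, 30.0],
    [100.0, 0.0, 30.0, 39.0],
    [100.0, 0.0, 30.0, 47.0],
]
preds = rolling_forecast(history, last_settlement=49.0,
                         future_loads=[(100.0, 0.0, 30.0)] * 3,
                         predict=toy_predict)
```

When a new field reading arrives, it can replace the corresponding predicted value before the loop continues, which is what allows the ongoing recalibration described above.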

It should be noted that the BiLSTM model is case-specific. To achieve satisfactory performance on new datasets, complete model redesign and retraining may be necessary. This need for recalibration can act as a barrier to the broad application of such predictive models compared to their physics-based counterparts. One potential solution involves the development of an adaptive algorithm in conjunction with BiLSTM. This algorithm could then be integrated into a user-friendly graphical interface, thus making it accessible for practitioners who are not familiar with this tool. Although such an interface is not discussed here, the current study demonstrates that the BiLSTM framework enables accurate settlement prediction while requiring fewer post-construction observational data than conventional curve-fitting methods.

3. Field Observations

For validation, two real-world cases from the field monitoring program are presented. The first case examines two test sections with similar subsoil conditions. The natural soil foundation in this case primarily consists of moderately and slightly compressible soils; thus, no ground improvement measures were taken prior to embankment construction. The second case focuses on a test section where the soil foundation was enhanced using deep mixing columns (DMCs). This section features a natural soil profile with a thick layer of clay, creating the potential for hazardous, excessive settlement during the construction of high-speed rail infrastructure.

3.1. Site Overview and Monitoring Protocols

3.1.1. Case Study 1

Field tests were conducted on a heavy-haul railway in China. With a design speed of 120 km/h, the railway encountered substantial stress variations due to the construction stages of its embankment and the characteristics of the moderately and slightly compressible subsoil. Monitoring subsoil settlement during and after embankment filling is thus essential. Of the six locations initially selected for instrumentation, only two are discussed here, given their spatial proximity and similar subsoil profiles. Figure 3 illustrates the field-testing schematic for these sections. The gradients for the upper and lower embankment slopes are set at 1:1.5 and 1:1.75, respectively.

A single point settlement gauge, featuring a maximum displacement of 200 mm and a sensitivity of 0.05 mm, was used to monitor subsoil subsidence during fill placement, as depicted in Figure 4. Settlement measurements took place every three days following the initiation of fill placement. After the completion of the embankment construction, readings occurred weekly for about a month, once every two weeks for two subsequent months, and then shifted to a monthly schedule. The two monitored locations had fill heights of either 14.1 m or 12.7 m. These gauges were strategically positioned on the subsoil surface directly below the embankment axis to capture maximum anticipated subsidence. The objective was to document rate and elevation changes resulting from subsoil settlement due to stress variations introduced by fill placement. Laboratory consolidation tests conducted on field-collected specimens revealed that the topsoil has a coefficient of compressibility ranging from 0.1 to 0.4 MPa⁻¹ at stress levels between 100 and 200 kPa.

3.1.2. Case Study 2

This case is associated with a high-speed railway. As shown in Figure 5, the examined site (Section C) features a thick layer of clay without a stable underlying stratum. To manage post-construction settlement and ensure the high-speed line’s safe operation, embankments at this site were strengthened using basal reinforcement techniques, including a gravel cushion with geosynthetic material, and deep mixing columns (DMCs).

The fill material was composed of several layers: 0.7 m of graded gravels at the top, followed by 2.3 m of lime-treated clay, 1.9 m of highly weathered adamellite, and a gravel cushion at the base. DMCs were arranged at intervals of either 1.2 m (at the shoulder) or 1.4 m (along the embankment slope), each with a total length of 13.5 m and a diameter of 0.5 m. Load and settlement time series data for this site were sourced from a technical report by Tongji University; however, detailed soil property information was not available, as the authors did not carry out the geotechnical instrumentation.

3.2. Data Preparation

The test sections’ load and settlement time series are depicted in Figure 6. For clarity, not all observed values are included in the graph. Data points from Section C in Case 2 correspond to the right y-axis and top x-axis. Increases in basal pressure led to elevated subsoil settlement at all test locations. Section A and Section B finished with settlements of approximately 187 mm and 117 mm, respectively. In Case 2, Section C exhibited a final settlement of around 124 mm. The Akima spline methodology generates data points at significant shifts in the load and settlement time series.

In Case 1, the variable {∆ T}_{t} is set to 6 days for the construction stage and 30 days for the time interval. This setup results in a total of 65 data points for Section A and 63 for Section B. In contrast, Case 2 involves a more rapid embankment filling process and a shorter observation period. Here, {∆ T}_{t} is defined as 3 days for the construction stage and 15 days for the time interval, yielding a total of 62 data points.

Data normalization, which rescales original data values to a range between 0 and 1, offers several benefits. These include a significant reduction in model training time, a decrease in internal covariate shift, and the provision of unbiased neural networks, as each feature undergoes normalization while preserving its contribution. The study employs min–max normalization for load and settlement time series data, as expressed by

ε_{t}^{'} = \frac{ε_{t} - ε_{m i n}}{ε_{m a x} - ε_{m i n}}

(17)

Here, ε_{t} is the original value, ε_{t}^{'} is the normalized value, and ε_{m a x} and ε_{m i n} are the maximum and minimum values of the original data series, respectively. To evaluate the influence of dataset size on the BiLSTM-based predictive model, additional observed data ranging from 3 to 6 months post-embankment construction are included in the neural network training.
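Equation (17) and its inverse can be sketched in a few lines. The helper names and the sample values are illustrative assumptions. The inverse transform is included because predictions made on the normalized scale have to be mapped back to millimetres before they are compared with field readings.

```python
def min_max_fit(values):
    """Return (minimum, maximum) of the original series."""
    return min(values), max(values)

def min_max_transform(values, lo, hi):
    """Equation (17): rescale values to the range [0, 1]."""
    return [(v - lo) / (hi - lo) for v in values]

def min_max_inverse(scaled, lo, hi):
    """Map normalized values back to the original units."""
    return [s * (hi - lo) + lo for s in scaled]

settlement_mm = [0.0, 20.0, 50.0, 100.0]     # hypothetical readings
lo, hi = min_max_fit(settlement_mm)
scaled = min_max_transform(settlement_mm, lo, hi)
restored = min_max_inverse(scaled, lo, hi)
```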

Gray relational analysis (GRA) is utilized to assess the correlation between input variables and embankment settlement. According to [26], GRA values range from 0 to 1.0. A value greater than 0.6 indicates a close correlation, a range between 0.50 and 0.60 signifies correlation, and a value below 0.5 implies a lack of correlation. The distribution of GRA for each influencing factor in relation to subsoil settlement is presented in Table 1. Notably, the GRA values for all input variables in relation to settlement exceed 0.5, thereby validating the reasonableness of the chosen variables.
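The paper does not give the formula it used for GRA. The sketch below assumes the commonly used formulation with min–max scaled sequences and a distinguishing coefficient of 0.5. The function names, the factor names, and the data are hypothetical.

```python
def min_max(seq):
    lo, hi = min(seq), max(seq)
    return [(v - lo) / (hi - lo) for v in seq]

def gray_relational_grades(reference, factors, rho=0.5):
    """Return the gray relational grade of each factor with respect to the reference series."""
    ref = min_max(reference)
    deltas = {name: [abs(r - v) for r, v in zip(ref, min_max(seq))]
              for name, seq in factors.items()}
    flat = [d for ds in deltas.values() for d in ds]
    d_min, d_max = min(flat), max(flat)      # extremes taken over all factors and steps
    return {name: sum((d_min + rho * d_max) / (d + rho * d_max) for d in ds) / len(ds)
            for name, ds in deltas.items()}

# Hypothetical series; "scaled_copy" is proportional to the reference by construction.
settlement = [0.0, 10.0, 20.0, 30.0, 40.0]
grades = gray_relational_grades(settlement, {
    "scaled_copy": [0.0, 5.0, 10.0, 15.0, 20.0],
    "time_interval": [6.0, 6.0, 6.0, 30.0, 30.0],
})
```

A factor that follows the reference series exactly receives a grade of 1.0, and grades fall toward 0 as the scaled series diverge, which is the scale the thresholds quoted above refer to.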

4. BiLSTM Prediction Model

4.1. Model Construction

The study employs 5-fold cross-validation, wherein the original sample is divided into five equally sized subsamples, as illustrated in Figure 7. A single subsample serves as test data for model validation, while the remaining four act as training data. This cross-validation cycle repeats five times, each time using a different subsample for testing. The results from these five validations are then aggregated to produce a single estimate. Notably, the mean square error (MSE) for each epoch regression is calculated based on the five test subsets, not just one validation set. In stratified 5-fold cross-validation, partitions are selected so that the mean response value is approximately the same across all partitions [41]. The MSE, representing the mean of the squared differences between the target variable and predicted values, serves as the regression loss function during model training. It is defined by

MSE = \frac{1}{5 n} \sum_{i = 1}^{5 n} {(S_{i}^{'} - S_{i})}^{2}

(18)

where n is the total number of data points in each fold, and S_{i}^{'} and S_{i} are the predicted and observed settlements, respectively.

The training process employs a BiLSTM-based predictive model, implemented using the Keras package in Python, with TensorFlow serving as the backend. Hyperparameter optimization is conducted via grid search [26]. Guided by performance metrics evaluated via cross-validation on the training set, this algorithm (Figure 8) identifies optimal settings [42]. The proposed model features two hidden layers and a single dense layer, classifying it as a type of stacked LSTM.

The complexity of the neural network rises with the addition of hidden layers. Three hidden layers lead to overfitting, while a single layer results in underfitting. The length of the input sequence, or time step, plays a crucial role, as it determines the scope of historical data fed into the model. The time step ranges from 1 to 10, and for robust performance, the model is trained 10 times at each step with varying initial weights and biases. Figure 8 provides a statistical summary of the analysis.

Both short and long time steps present challenges: a shorter time step restricts a comprehensive understanding of the input, whereas a longer one complicates the extraction of useful features by the neural network. Consequently, the optimal length for the input sequence is identified as 5.

Table 2 outlines the configurations for the BiLSTM-based predictive model. The optimal number of hidden neurons in each hidden layer is determined to be 64, with ReLU serving as the activation function. The Adam optimizer, leveraging the benefits of RMSProp and AdaGrad, updates the weight matrices and bias vectors [43]. Given that a larger batch size can improve gradient estimates but may extend training time, a batch size of 20 is chosen for this study. A total of 1500 epochs prove adequate for training convergence, with early stopping rules applied after a maximum of 20 iterations without MSE improvement [44].
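The early stopping rule can be sketched independently of any framework. The loss sequence below is synthetic and the function name is hypothetical. The patience of 20 and the limit of 1500 epochs are the values stated above. In Keras, the same behavior is usually obtained with the `EarlyStopping` callback and its `patience` argument.

```python
def early_stopping(losses, patience=20, max_epochs=1500):
    """Return (best_epoch, best_loss, last_epoch) for a stream of per-epoch MSE values."""
    best_loss, best_epoch, last_epoch = float("inf"), -1, -1
    for epoch, loss in zip(range(max_epochs), losses):
        last_epoch = epoch
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch      # improvement: remember this epoch
        elif epoch - best_epoch >= patience:
            break                                    # no improvement for `patience` epochs
    return best_epoch, best_loss, last_epoch

# Synthetic curve: the loss improves for 100 epochs and then stays at a worse plateau.
synthetic_losses = [1.0 / (e + 1) for e in range(100)] + [0.02] * 200
best_epoch, best_loss, last_epoch = early_stopping(synthetic_losses)
```

With this synthetic curve, training stops 20 epochs after the last improvement, well before the 1500-epoch limit.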

MSE convergence across various datasets is presented in Figure 9a,b for Case 1, and in Figure 9c for Case 2. Each subplot in Figure 9 demonstrates the BiLSTM performance during the 1-fold to 5-fold cross-training cycles [45]. The training time for a single model is approximately 15 min.

4.2. Model Evaluation

Figure 10 compares BiLSTM-predicted settlements with observed settlements across three instrumented sections, post-data normalization. The coefficient of determination (R²) is computed separately for two field cases within each scenario [46,47,48,49]. Overall, the predictions from the trained BiLSTM models align closely with observed data, as all R² values exceed 0.99. This high correlation underscores the model’s effectiveness [50,51]. Notably, normalized values near 1.0 exhibit exact matches in two cases, indicating that the proposed models can accurately capture the final settlements critical to engineering applications. In regions with smaller settlements, the BiLSTM model tends to overestimate the values. This trend arises primarily because the initial series of input samples d_{t} contain a higher proportion of null values, leading to less accurate estimates.

Figure 11 presents the BiLSTM-based model’s predicted settlement time series, six months after completing fill placement. The model demonstrates robust capability for long-term settlement predictions. For performance evaluation, absolute and relative error indicators are defined as:

δ = S_{f}^{'} - S_{f}

(19)

ε = \frac{|δ|}{S_{f}}

(20)

where δ and ε represent the absolute and relative errors in settlement prediction, respectively; S_{f} and S_{f}^{'} denote the observed and predicted final settlements, respectively. Performance comparisons among BiLSTM, LSTM, and three curve-fitting methods are summarized in Appendix A and illustrated in Figure 12.
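Equations (19) and (20) amount to the following arithmetic. The function names and the two settlement values are hypothetical and serve only to show the calculation.

```python
def absolute_error(predicted_final, observed_final):
    """Equation (19): signed difference between predicted and observed final settlement."""
    return predicted_final - observed_final

def relative_error(predicted_final, observed_final):
    """Equation (20): magnitude of the error as a fraction of the observed final settlement."""
    return abs(absolute_error(predicted_final, observed_final)) / observed_final

# Hypothetical values in mm.
delta = absolute_error(185.5, 187.0)
eps = relative_error(185.5, 187.0)
print(f"delta = {delta:.1f} mm, eps = {100 * eps:.2f}%")
```

A negative δ indicates that the final settlement is underpredicted, while ε is always non-negative.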

Figure 12 exhibits the mean subsoil settlement values, along with error bars for the BiLSTM and mean values for other methods. Generally, a larger dataset minimizes prediction uncertainties. However, the BiLSTM models exhibit less sensitivity to initial sample size variations, which is attributed to the comprehensive data they utilize, particularly construction-phase observations. For Case 1 and Case 2, the relative errors in BiLSTM predictions remain under 1% and 2%, respectively, showcasing both accuracy and generalization capabilities on unknown datasets. In contrast, curve-fitting methods offer less reliable predictions and display notable performance variations across different scenarios.

In conclusion, the trends in Figure 12 demonstrate BiLSTM’s resilience to sample size variations and its superior accuracy compared to traditional curve-fitting methods. BiLSTM therefore offers a robust approach for time series forecasting in embankment management, achieving both accuracy and resource optimization. A subsequent goal is to develop a user-friendly graphical interface for professionals unfamiliar with neural networks.

5. Conclusions

A BiLSTM-based predictive model was developed for estimating post-construction subsoil settlement under embankment loads, marking an early application of deep learning for long-term settlement forecasts with limited site data. The model utilizes preprocessed field data from two rail lines: one on natural soils and another on DMC-improved soils with basal reinforcement.

The model learns effectively from both load and settlement histories during the construction phase, even when observations are irregularly spaced, which yields a reliable predictor of long-term post-construction subsoil settlement. BiLSTM captures the effects of staged construction on settlement over time. Compared with traditional curve-fitting methods, the BiLSTM-based approach is superior in accuracy, generalization, and robustness. The model is also insensitive to the range of post-construction data used as input: even a limited 3-month data set from the post-construction phase suffices for highly accurate settlement predictions.

In summary, BiLSTM proves effective for time series forecasting in embankment management, allowing for a shorter observation period without sacrificing accuracy. This also aligns with sustainability goals by optimizing resource use. Further research should explore BiLSTM’s applicability to subsoil conditions involving established ground improvement methods like vacuum-assisted preloading, unloading, and prefabricated vertical drains. A future objective includes integrating the BiLSTM algorithm into a user-friendly graphical interface for professionals not versed in neural networks.

Author Contributions

Methodology, P.W.; Formal analysis, L.W.; Investigation, L.W.; Resources, P.W.; Data curation, Z.L.; Writing—original draft, L.W.; Writing—review & editing, T.L.; Supervision, Q.Z.; Project administration, T.L.; Funding acquisition, T.L. All authors have read and agreed to the published version of the manuscript.

Funding

The study was supported by the National Key R&D Program “Transportation Infrastructure” project (2022YFB2603400).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

Table A1. Model Predictions for Section A.

	3 Months after Construction		4 Months after Construction		5 Months after Construction		6 Months after Construction
	$δ$ (mm)	$ε$ (%)	$δ$ (mm)	$ε$ (%)	$δ$ (mm)	$ε$ (%)	$δ$ (mm)	$ε$ (%)
Bi-LSTM models	0.37	0.68	−0.11	0.20	0.05	0.10	−0.06	0.11
LSTM models	0.45	0.83	−0.32	0.58	−0.21	0.42	−0.09	0.17
Three-parameter curve-fitting	−0.59	1.10	−0.79	1.48	−0.39	0.72	−0.23	0.43
Two-parameter curve-fitting 1	−2.77	5.16	−2.39	4.46	−1.86	3.47	−1.53	2.85
Two-parameter curve-fitting 2	−4.85	9.05	−4.23	7.90	−3.45	6.44	−2.93	5.47

Table A2. Model Predictions for Section B.

	3 Months after Construction		4 Months after Construction		5 Months after Construction		6 Months after Construction
	$δ$ (mm)	$ε$ (%)	$δ$ (mm)	$ε$ (%)	$δ$ (mm)	$ε$ (%)	$δ$ (mm)	$ε$ (%)
Bi-LSTM models	−0.63	0.54	0.74	0.63	0.15	0.13	−0.03	0.03
LSTM models	−0.52	0.45	1.20	1.02	0.54	0.47	0.11	0.10
Three-parameter curve-fitting	0.32	0.27	1.57	1.33	1.17	0.99	0.26	0.22
Two-parameter curve-fitting 1	−3.48	2.95	−2.65	2.25	−2.08	1.77	−1.85	1.57
Two-parameter curve-fitting 2	−5.52	4.68	−4.66	3.95	−3.94	3.34	−3.50	2.97

Table A3. Model Predictions for Section C.

	3 Months after Construction		4 Months after Construction		5 Months after Construction		6 Months after Construction
	$δ$ (mm)	$ε$ (%)	$δ$ (mm)	$ε$ (%)	$δ$ (mm)	$ε$ (%)	$δ$ (mm)	$ε$ (%)
Bi-LSTM models	−2.28	1.83	−2.85	2.28	−1.64	1.31	−0.49	0.40
LSTM models	−2.63	2.11	−3.12	2.50	−1.71	1.37	1.23	1.00
Three-parameter curve-fitting	−3.17	2.54	−3.98	3.19	−1.94	1.56	3.03	2.43
Two-parameter curve-fitting 1	9.51	7.63	5.89	4.72	3.80	3.05	4.61	3.70
Two-parameter curve-fitting 2	6.20	4.97	2.25	1.81	1.12	0.90	3.13	2.51

References

1. Zhou, S.; Wang, B.; Shan, Y. Review of Research on High-Speed Railway Subgrade Settlement in Soft Soil Area. Railw. Eng. Sci. 2020, 28, 129–145.
2. Esen, A.F.; Woodward, P.K.; Laghrouche, O.; Connolly, D.P. Long-Term Performance Assessment of a Geosynthetic-Reinforced Railway Substructure. Sustainability 2023, 15, 9364.
3. Zhuang, Y.; Song, X.; Wang, K. Ground Reaction of Lightly Overconsolidated Subsoil in Reinforced Piled Embankment under Cyclic Loads. Sustainability 2023, 15, 619.
4. Wei, X.; Zhang, L.; Yang, H.Q.; Zhang, L.; Yao, Y.P. Machine Learning for Pore-Water Pressure Time-Series Prediction: Application of Recurrent Neural Networks. Geosci. Front. 2021, 12, 453–467.
5. Yu, F.; Li, S.; Dai, Z.; Li, J.; Chen, S. Stability Control of Staged Filling Construction on Soft Subsoil Using Hyperbolic Settlement Prediction Method: A Case Study of a Tidal Flat in China. Adv. Civ. Eng. 2020, 2020, 8899843.
6. Siddiqui, F.; Sargent, P.; Montague, G. Data-Based Modeling Approaches for Short-Term Prediction of Embankment Settlement Using Magnetic Extensometer Time-Series Data. Int. J. Geomech. 2022, 22, 04021269.
7. Huang, C.; Li, Q.; Wu, S.; Li, J.; Xu, X. Application of the Richards Model for Settlement Prediction Based on a Bidirectional Difference-Weighted Least-Squares Method. Arab. J. Sci. Eng. 2018, 43, 5057–5065.
8. Park, H.I.; Kim, K.S.; Kim, H.Y. Field Performance of a Genetic Algorithm in the Settlement Prediction of a Thick Soft Clay Deposit in the Southern Part of the Korean Peninsula. Eng. Geol. 2015, 196, 150–157.
9. Chen, P.-Y.; Yu, H.-M. Foundation Settlement Prediction Based on a Novel NGM Model. Math. Probl. Eng. 2014, 2014, 242809.
10. Chai, J.C.; Miura, N.; Kirekawa, T.; Hino, T. Settlement Prediction for Soft Ground Improved by Columns. Proc. Inst. Civ. Eng.-Ground Improv. 2010, 163, 109–119.
11. Li, Z.; Luo, Z.; Wang, Q.; Du, J.; Lu, W.; Ning, D. A Three-Dimensional Fluid-Solid Model, Coupling High-Rise Building Load and Groundwater Abstraction, for Prediction of Regional Land Subsidence. Hydrogeol. J. 2019, 27, 1515–1526.
12. Shi, X.-Q.; Xue, Y.-Q.; Ye, S.-J.; Wu, J.-C.; Zhang, Y.; Yu, J. Characterization of Land Subsidence Induced by Groundwater Withdrawals in Su-Xi-Chang Area, China. Environ. Geol. 2006, 52, 27.
13. Duncan, J.M. Limitations of Conventional Analysis of Consolidation Settlement. J. Geotech. Eng. 1993, 119, 1333–1359.
14. Mesri, G.; Choi, Y.K. Settlement Analysis of Embankments on Soft Clays. J. Geotech. Eng. 1985, 111, 441–464.
15. Raja, M.N.A.; Shukla, S.K. Predicting the Settlement of Geosynthetic-Reinforced Soil Foundations Using Evolutionary Artificial Intelligence Technique. Geotext. Geomembr. 2021, 49, 1280–1293.
16. Zhang, W.; Li, H.; Li, Y.; Liu, H.; Chen, Y.; Ding, X. Application of Deep Learning Algorithms in Geotechnical Engineering: A Short Critical Review. Artif. Intell. Rev. 2021, 54, 5633–5673.
17. Zhang, W.; Wu, C.; Li, Y.; Wang, L.; Samui, P. Assessment of Pile Drivability Using Random Forest Regression and Multivariate Adaptive Regression Splines. Georisk Assess. Manag. Risk Eng. Syst. Geohazards 2019, 15, 27–40.
18. Zhang, W.; Wu, C.; Zhong, H.; Li, Y.; Wang, L. Prediction of Undrained Shear Strength Using Extreme Gradient Boosting and Random Forest Based on Bayesian Optimization. Geosci. Front. 2021, 12, 469–477.
19. Goh, A.T.C.; Zhang, R.H.; Wang, W.; Wang, L.; Liu, H.L.; Zhang, W.G. Numerical Study of the Effects of Groundwater Drawdown on Ground Settlement for Excavation in Residual Soils. Acta Geotech. 2020, 15, 1259–1272.
20. Zhang, P.; Yin, Z.Y.; Jin, Y.F.; Sheil, B. Physics-Constrained Hierarchical Data-Driven Modelling Framework for Complex Path-Dependent Behaviour of Soils. Int. J. Numer. Anal. Methods Geomech. 2022, 46, 1831–1850.
21. Kirts, S.; Nam, B.H.; Panagopoulos, O.P.; Xanthopoulos, P. Settlement Prediction Using Support Vector Machine (SVM)-Based Compressibility Models: A Case Study. Int. J. Civ. Eng. 2019, 17, 1547–1557.
22. Zhang, G.; Xiang, X.; Tang, H. Time Series Prediction of Chimney Foundation Settlement by Neural Networks. Int. J. Geomech. 2011, 11, 154–158.
23. Zhu, M.; Li, S.; Wei, X.; Wang, P. Prediction and Stability Assessment of Soft Foundation Settlement of the Fishbone-Shaped Dike Near the Estuary of the Yangtze River Using Machine Learning Methods. Sustainability 2021, 13, 3744.
24. Wang, K.; Sun, W.C. A Multiscale Multi-Permeability Poroplasticity Model Linked by Recursive Homogenizations and Deep Learning. Comput. Methods Appl. Mech. Eng. 2018, 334, 337–380.
25. Zhang, P.; Yang, Y.; Yin, Z.-Y. BiLSTM-Based Soil–Structure Interface Modeling. Int. J. Geomech. 2021, 21, 04021096.
26. Yang, B.; Yin, K.; Lacasse, S.; Liu, Z. Time Series Analysis and Long Short-Term Memory Neural Network to Predict Landslide Displacement. Landslides 2019, 16, 677–694.
27. Li, L.; Gong, X.; Gan, X. Prediction of Maximum Ground Settlement Induced by Shield Tunneling Based on Recurrent Neural Network. China Civ. Eng. J. 2020, 51, 13–19.
28. Zhang, N.; Zhou, A.; Pan, Y.; Shen, S.-L. Measurement and Prediction of Tunnelling-Induced Ground Settlement in Karst Region by Using Expanding Deep Learning Method. Measurement 2021, 183, 109700.
29. Siami-Namini, S.; Tavakoli, N.; Namin, A.S. The Performance of LSTM and BiLSTM in Forecasting Time Series. In Proceedings of the 2019 IEEE International Conference on Big Data (Big Data), Los Angeles, CA, USA, 9–12 December 2019; pp. 3285–3292.
30. Chen, D.Y.; Fan, H.G.; Shen, J.M.; Cheng, J.; Shen, X.L.; Chen, Z.P. Research of Curve Fitting Method on the Measured Settlement of Tanks. Procedia Eng. 2015, 130, 400–407.
31. NRA of China. TB 10001-2016 Code for Design of Railway Earth Structure; China Railway Publishing House Co., Ltd.: Beijing, China, 2017.
32. Al-Shamrani, M.A. Applying the Hyperbolic Method and Cα/Cc Concept for Settlement Prediction of Complex Organic-Rich Soil Formations. Eng. Geol. 2005, 77, 17–34.
33. Sridharan, A.; Murthy, N.S.; Prakash, K. Rectangular Hyperbola Method of Consolidation Analysis. Geotechnique 1987, 37, 355–368.
34. Luo, Q.; Cheng, M.; Wang, T.; Zhu, J. Three-Parameter Power Function Model for Prediction of Post-Construction Settlement of Medium Compressive Soil Foundation. J. Beijing Jiaotong Univ. 2020, 44, 93–100.
35. Wang, X.; Liu, X.; Chen, L.; Hu, H. Deep-Learning Damped Least Squares Method for Inverse Kinematics of Redundant Robots. Measurement 2021, 171, 108821.
36. Yin, J.; Ning, C.; Tang, T. Data-Driven Models for Train Control Dynamics in High-Speed Railways: LAG-LSTM for Train Trajectory Prediction. Inf. Sci. 2022, 600, 377–400.
37. Yang, M.; Wang, J. Adaptability of Financial Time Series Prediction Based on BiLSTM. Procedia Comput. Sci. 2022, 199, 18–25.
38. Zhu, Y.; Han, X. C2 Interpolation T-Splines. Comput. Methods Appl. Mech. Eng. 2020, 362, 112835.
39. Bergmeir, C.; Benítez, J.M. On the Use of Cross-Validation for Time Series Predictor Evaluation. Inf. Sci. 2012, 191, 192–213.
40. Zhang, P.; Wu, H.-N.; Chen, R.-P.; Chan, T.H.T. Hybrid Meta-Heuristic and Machine Learning Algorithms for Tunneling-Induced Settlement Prediction: A Comparative Study. Tunn. Undergr. Space Technol. 2020, 99, 103383.
41. Wang, S.; Wang, X.; Wang, S.; Wang, D. Bi-Directional Long Short-Term Memory Method Based on Attention Mechanism and Rolling Update for Short-Term Load Forecasting. Int. J. Electr. Power Energy Syst. 2019, 109, 470–479.
42. Li, G.; Wang, W.; Zhang, W.; Wang, Z.; Tu, H.; You, W. Grid Search Based Multi-Population Particle Swarm Optimization Algorithm for Multimodal Multi-Objective Optimization. Swarm Evol. Comput. 2021, 62, 100843.
43. Hou, S.; Liu, Y. Early Warning of Tunnel Collapse Based on Adam-Optimised Long Short-Term Memory Network and TBM Operation Parameters. Eng. Appl. Artif. Intell. 2022, 112, 104842.
44. Prechelt, L. Automatic Early Stopping Using Cross Validation: Quantifying the Criteria. Neural Netw. 1998, 11, 761–767.
45. Wang, Z.Z.; Xiao, C.; Goh, S.H.; Deng, M.-X. Metamodel-Based Reliability Analysis in Spatially Variable Soils Using Convolutional Neural Networks. J. Geotech. Geoenviron. Eng. 2021, 147, 04021003.
46. Chen, W.H.; Qu, S.J.; Lin, L.B.; Luo, Q.; Wang, T.F. Ensemble learning methods for shear strength prediction of fly ash-amended soils with lignin reinforcement. J. Mater. Civ. Eng. 2023, 35, 04023022.
47. Chen, W.H.; Luo, Q.; Liu, J.K.; Wang, T.F.; Wang, L.Y. Modeling of frozen soil-structure interface shear behavior by supervised deep learning. Cold Reg. Sci. Technol. 2022, 200, 103589.
48. Wang, T.; Ma, H.; Liu, J.; Luo, Q.; Wang, Q.; Zhan, Y. Assessing frost heave susceptibility of gravelly soils based on multivariate adaptive regression splines model. Cold Reg. Sci. Technol. 2021, 181, 103182.
49. Wang, T.F.; Chen, W.H.; Li, T.F.; Connolly, D.P.; Luo, Q.; Liu, K.W.; Zhang, W.S. Surrogate-assisted uncertainty modeling of embankment settlement. Comput. Geotech. 2023, 159, 105498.
50. Shi, C.; Fan, Z.; Connolly, D.P.; Jing, G.; Markine, V.; Guo, Y. Railway ballast performance: Recent advances in the understanding of geometry, distribution and degradation. Transp. Geotech. 2023, 41, 101042.
51. Chen, Y.; Tan, L.; Xiao, N.; Liu, K.; Jia, P.; Zhang, W. The hydro-mechanical characteristics and micro-structure of loess enhanced by microbially induced carbonate precipitation. Geomech. Energy Environ. 2023, 34, 100469.

Figure 1. Modeling Strategy (GapKFold: k-fold Cross-Validation with Gaps).

Figure 2. BiLSTM Model Architecture with Two Hidden Layers.

Figure 3. Cross-Sections of Examined Embankments in a Heavy-Haul Railway.

Figure 4. In-Field Single-Point Settlement Gauges.

Figure 5. Cross-Section of DMC-Supported, Geosynthetically Reinforced Embankment of a High-Speed Line.

Figure 6. Load-Settlement History in Test Sections.

Figure 7. BiLSTM Model Design Flowchart.

Figure 8. Influence of Hidden Layers and Time Step on MSE Performance.

Figure 9. BiLSTM Model Training Process: (a) Section A; (b) Section B; (c) Section C.

Figure 10. BiLSTM and Observed Settlements in Test Sections.

Figure 11. BiLSTM-Predicted Settlement Time Histories for Test Sections.

Figure 12. Effect of Input Data Size on Settlement Predictions (dashed lines are observed settlements).

Table 1. GRA distribution in Test Sections.

Sections	$p_{t}$	${∆ p}_{t}$	$S_{t - 1}$	${∆ T}_{t}$
A	0.93	0.58	0.97	0.71
B	0.90	0.55	0.96	0.71
C	0.89	0.52	0.98	0.80
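For readers unfamiliar with grey relational analysis (GRA), the sketch below shows one common formulation (Deng's grey relational grade with a distinguishing coefficient of 0.5) for ranking how closely each input series tracks a reference series. Whether the article used this exact variant, the min-max normalization, and the coefficient value are all assumptions; the function names and sample series are hypothetical.

```python
from typing import Dict, List, Sequence


def min_max(series: Sequence[float]) -> List[float]:
    """Scale a series to [0, 1]; a constant series maps to all zeros."""
    lo, hi = min(series), max(series)
    if hi == lo:
        return [0.0 for _ in series]
    return [(v - lo) / (hi - lo) for v in series]


def grey_relational_grades(
    reference: Sequence[float],
    factors: Dict[str, Sequence[float]],
    rho: float = 0.5,  # distinguishing coefficient (assumed value)
) -> Dict[str, float]:
    """Grey relational grade of each factor series against the reference."""
    for name, values in factors.items():
        if len(values) != len(reference):
            raise ValueError(f"series '{name}' has a different length")
    ref = min_max(reference)
    # Pointwise absolute differences after normalization.
    diffs = {
        name: [abs(r - f) for r, f in zip(ref, min_max(values))]
        for name, values in factors.items()
    }
    flat = [d for ds in diffs.values() for d in ds]
    d_min, d_max = min(flat), max(flat)
    if d_max == 0.0:
        return {name: 1.0 for name in diffs}
    grades = {}
    for name, ds in diffs.items():
        # Grey relational coefficient at each point, then averaged.
        coeffs = [(d_min + rho * d_max) / (d + rho * d_max) for d in ds]
        grades[name] = sum(coeffs) / len(coeffs)
    return grades


# Hypothetical series, for illustration only.
settlement = [0.0, 10.0, 20.0, 30.0, 40.0]
grades = grey_relational_grades(
    settlement,
    {"same_trend": [0.0, 1.0, 2.0, 3.0, 4.0], "opposite_trend": [4.0, 3.0, 2.0, 1.0, 0.0]},
)
print(grades)
```

Under this formulation a grade close to 1 means the factor follows the reference closely, which is how the high values for $p_{t}$ and $S_{t - 1}$ in Table 1 would be read.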

Table 2. BiLSTM Model Configurations.

Configuration	Value
Architecture	4-64(ReLU)-64(ReLU)-1(linear)
Optimizer	Adam
Batch size	20
Epoch	1500 (early stopping applied)
Learning rate	0.0001
Validation frequency	1
Timestep	5
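The four-feature input and the timestep of 5 in Table 2 imply that each training sample is a short window of consecutive records. The sketch below shows one way such windows could be assembled from raw load and settlement histories. The helper names, the zero left-padding of early windows, and the sample data are assumptions for illustration and are not confirmed by the article.

```python
from typing import List, Sequence, Tuple

# One record per observation: (p_t, delta_p_t, S_{t-1}, delta_T_t), as in Table 1.
Feature = Tuple[float, float, float, float]
ZERO: Feature = (0.0, 0.0, 0.0, 0.0)


def make_features(
    pressure: Sequence[float], settlement: Sequence[float], days: Sequence[float]
) -> Tuple[List[Feature], List[float]]:
    """Build per-step features and targets from raw histories."""
    if not (len(pressure) == len(settlement) == len(days)):
        raise ValueError("all input histories must have the same length")
    feats: List[Feature] = []
    targets: List[float] = []
    for i in range(1, len(pressure)):
        feats.append((
            pressure[i],                    # basal pressure p_t
            pressure[i] - pressure[i - 1],  # pressure increment
            settlement[i - 1],              # prior settlement S_{t-1}
            days[i] - days[i - 1],          # time interval
        ))
        targets.append(settlement[i])       # settlement to predict, S_t
    return feats, targets


def build_windows(
    feats: Sequence[Feature], targets: Sequence[float], timestep: int = 5
) -> List[Tuple[List[Feature], float]]:
    """Pair each target with the last `timestep` records, zero-padded on the left."""
    if len(feats) != len(targets):
        raise ValueError("features and targets must have the same length")
    samples = []
    for t in range(len(feats)):
        start = t - timestep + 1
        pad = max(0, -start)  # number of missing early records
        window = [ZERO] * pad + list(feats[max(0, start): t + 1])
        samples.append((window, targets[t]))
    return samples


# Hypothetical staged-loading history (kPa, mm, days), for illustration only.
pressure = [0.0, 20.0, 40.0, 40.0, 60.0, 60.0, 60.0, 60.0]
settlement = [0.0, 2.0, 5.0, 7.0, 11.0, 13.0, 14.0, 14.5]
days = [0.0, 10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0]

feats, targets = make_features(pressure, settlement, days)
samples = build_windows(feats, targets, timestep=5)
print(len(samples), len(samples[0][0]), samples[0][0][:2])
```

With this layout the earliest windows consist mostly of padding. In a rolling forecast, the predicted settlement for one step would be fed back as the prior-settlement feature of the next step.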

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
