Article

Prediction of Chloride Diffusion Coefficient in Concrete Based on Machine Learning and Virtual Sample Algorithm

Fei-Yu Zhou, Ning-Jing Tao, Yu-Rong Zhang and Wei-Bin Yuan
1 College of Civil Engineering, Zhejiang University of Technology, Hangzhou 310023, China
2 Zhejiang Key Laboratory of Civil Engineering Structures and Disaster Prevention and Mitigation Technology, Hangzhou 310023, China
* Author to whom correspondence should be addressed.
Sustainability 2023, 15(24), 16896; https://doi.org/10.3390/su152416896
Submission received: 17 October 2023 / Revised: 7 December 2023 / Accepted: 13 December 2023 / Published: 15 December 2023
(This article belongs to the Special Issue Resource Utilization of Solid Waste in Cement-Based Materials)

Abstract: The durability degradation of reinforced concrete is mainly caused by chloride ingress. Previous studies have used the component parameters of concrete to predict chloride diffusion by machine learning (ML), but the relationship between the microstructure and the macroscopic parameters of concrete needs to be further clarified. In this study, a multi-layer perceptron (MLP) and a support vector machine (SVM) were used to establish prediction models for the chloride diffusion coefficient in concrete, especially for solid waste concrete. A database of concrete pore parameters and chloride diffusion coefficients was generated by a virtual sample generation algorithm based on the Gaussian mixture model (GMM-VSG). It is shown that both MLP and SVM make good predictions; data preprocessed with the normalization method were more suitable for the MLP model, whereas data preprocessed with the standardization method were better suited to the SVM model.

1. Introduction

Given the advantages of combining steel and concrete, reinforced concrete (RC) has become the main component of modern structures. However, concrete production consumes a large amount of energy in the construction industry. Recently, sustainable materials such as solid waste concrete have been widely used to save energy and reduce carbon emissions. At the same time, an RC structure built in a severe environment may be damaged by freeze–thaw cycles, water seepage, salt corrosion, carbonation, etc., which can shorten its service life. Ingredients such as fly ash, slag, silica fume and basalt fiber are therefore added to concrete to improve the overall performance of RC structures.
Concrete is an inhomogeneous porous material with internal pores of varying shapes and sizes. External media, such as water, CO2 and chloride ions, can enter the interior of the concrete through microscopic cracks or pores [1,2,3,4]. Chloride attack is a main cause of structural deterioration of reinforced concrete, as it can depassivate the steel bars and cause the protective layer to spall, resulting in serious degradation of reinforced concrete structures and huge economic losses [5,6]. The more slowly chloride penetrates the pores, the lower the chloride diffusion coefficient of the concrete. Previous research showed that there is a relationship between the microstructural parameters (such as porosity, pore size distribution, pore connectivity and path tortuosity) and the chloride diffusion coefficient of concrete [7,8,9]. Some mathematical–physical models or empirical formulas have been established for specific circumstances. For instance, Jin et al. [10] presented an empirical model to predict the long-term migration behavior of chloride in concrete that considered the influences of the water–cement ratio, time, binding effect, temperature, relative humidity and concrete deterioration; the reliability of the model was validated against long-term measurements of chloride concentration in concrete exposed to the marine environment reported in the literature. Qi et al. [11] proposed a variable-coefficient sulfate–chloride transport model for concrete based on the law of conservation of mass, Fick's second law and the theory of porous media, which was verified by comparison with experimental results. Shazali et al. [12] evaluated the effect of chloride binding, in terms of the isotherm formulations, on the time to corrosion activation resulting from chloride transport in saturated concrete through nonlinear finite element analyses.
It should be mentioned that concrete is a heterogeneous material which has pores with different shapes and sizes, so the idealized or simplified mathematical–physical models cannot accurately describe the relationship between the complex microstructure and macroscopic properties of concrete [13,14,15].
Machine learning (ML) refers to computational algorithms designed to simulate human intelligence by learning from given cases; it can find patterns in and derive value from large amounts of data. With the development of artificial intelligence technology, ML has been widely used in many areas. Its ability to make accurate predictions from existing data eliminates complex calculations, making ML particularly suitable for predicting chloride diffusion in concrete.
Researchers have used ML methods to solve the problem of chloride attack. For example, Cai et al. [16] built a database containing 642 groups of concrete with free chloride concentration, in which prediction approaches were established based on linear regression (LR), Gaussian process regression (GPR), support vector machine (SVM), multilayer perceptron artificial neural network (MLP-ANN) and random forest (RF) models; the input variables included mix proportion, environmental conditions and exposure time. Liu et al. [17] established a database containing 653 groups of chloride diffusion coefficients, in which a prediction model was proposed based on an artificial neural network (ANN) and validated by statistical analyses; a total of 13 parameters, including concrete components, experimental process, concrete mechanical properties and others, were selected as input variables. Tran [18] presented prediction models based on extreme learning machine (ELM), SVM, K-nearest neighbors (KNN), light gradient boosting (LGB), extreme gradient boosting (XGB), RF, gradient boosting (GB) and AdaBoost (AdB), for which a database of concrete components and chloride diffusion coefficients was established. Liu et al. [19] proposed a hybrid intelligent prediction model combining RF and the least squares support vector machine (LSSVM) algorithm to predict the chloride penetration resistance of high-performance concrete based on 100 sets of experimental data with 12 mix input parameters; the proposed model can also provide a basis for optimizing the concrete mix ratio. Jin et al. [20] established an ANN prediction model considering nine characteristic factors that might affect the chloride penetration resistance of recycled aggregate concrete. The feasibility of the model was verified by analyzing its performance and simulation results on an out-of-sample data set, and sensitivity analysis of the input parameters was performed to obtain the influence index of each variable affected by the recycled coarse aggregate characteristics.
The prediction results obtained from ML methods can be affected significantly by the size of the database [21]. Among the described ML methods, ANN and SVM are the most widely used. It is worth noting that MLP is a relatively basic ANN model, which is robust and not sensitive to missing data, but it requires a large amount of data and may easily become trapped in local optima [22,23]. SVM is accurate, stable and suitable for small samples, but it is sensitive to noisy data [24,25].
The size and quality of the database have significant influences on the prediction results of ML methods [26]. When predicting chloride diffusivity from the microstructural parameters of concrete using ML methods, a large number of experiments is required; due to time and cost constraints, the amount of effective data available is very limited. Recently, researchers [21,27] have tended to utilize virtual data generation techniques to generate sufficiently large databases of good quality. The generated virtual database can be used with ML methods to build general small-sample frameworks, and thus the accuracy of the model can be improved [28,29,30].
In most of the existing literature, the components of concrete were selected as the input variables, and the prediction models were suitable only for specific concretes; the connection between the microstructure and the macroscopic parameters of concrete remained unclear. Therefore, in this paper, data on concrete with fly ash, slag, silica fume and basalt fiber (some of which are solid waste concretes) were collected from our published papers [31,32,33,34,35,36,37,38,39] and used for ML. The Gaussian mixture model virtual sample generation algorithm (GMM-VSG) [40] was applied to expand the number of samples, which improves the accuracy of the prediction results. MLP [41] and SVM [42] models with normalization and standardization preprocessing methods were proposed to predict the chloride diffusion coefficient in concrete.

2. Machine Learning Methods

2.1. MLP

MLP is a type of ANN that imitates, in parallel, the connections and transmissions among neurons in the biological brain; its smallest unit is called the artificial neuron [41]. As shown in Figure 1, the basic structure of the MLP model contains an input layer, hidden layers and an output layer, each with a number of neurons, and the neurons are connected with each other through weights. The hyperparameters of the MLP model can be determined by different search strategies, such as grid search [43,44], random search and Bayesian search.
In this paper, the selected MLP model has two hidden layers with 16 neurons each, based on Ref. [45]. Let the input-layer variables be x1, …, xn; the output of a single neuron is then given by:
y = f\left(\sum_{i=1}^{n} w_i x_i + b\right) = f(W^{T}X + b)    (1)
where f(·) is the activation function, wi is the weight of the i-th input variable and b is the bias.
The common activation functions include the Sigmoid, Tanh and ReLU functions, as shown in Figure 2. The Sigmoid function is a smooth approximation of the threshold unit and can be used for binary classification. The Tanh function has a shape similar to the Sigmoid function, but it can distinguish different patterns more easily owing to the greater variation of its output. However, both functions can cause the gradient to vanish and hinder model training. The derivative of the ReLU function is constant for positive inputs, which avoids the vanishing-gradient problem. Hence, the ReLU function was selected as the model activation function.
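As an illustration only, and not the authors' implementation, the following sketch builds an MLP regressor with the architecture described above (two hidden layers of 16 neurons each, ReLU activation) using scikit-learn; the training data are random placeholders.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X = rng.random((200, 6))                                   # placeholder: 6 input variables per sample
y = X @ rng.random(6) + 0.05 * rng.standard_normal(200)    # placeholder target values

# Two hidden layers with 16 neurons each and ReLU activation, as in Section 2.1
mlp = MLPRegressor(hidden_layer_sizes=(16, 16), activation="relu",
                   max_iter=3000, random_state=0)
mlp.fit(X, y)
print(mlp.predict(X[:3]))
```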

2.2. SVM

SVM was first presented by Vapnik based on statistical learning theory [42]; it minimizes the possibility of misclassification, and a schematic diagram of an SVM model is shown in Figure 3. The training process of an SVM is a quadratic optimization problem given by Refs. [46,47]:
\max \; Q(a) = \sum_{i=1}^{N} a_i - \frac{1}{2}\sum_{i,j=1}^{N} a_i a_j y_i y_j K(x_i, x_j)
\text{subject to} \;\; \sum_{i=1}^{N} a_i y_i = 0, \quad \xi_i \ge 0, \quad 0 \le a_i \le C, \quad i = 1, 2, 3, \ldots, N    (2)
where a = (a1, a2, …, aN) are the Lagrange multipliers, K(·,·) is the kernel function, C is a regularization parameter, N is the cardinality of the training set and ξ is the slack variable. Normally, radial basis functions (RBFs), polynomial functions and Sigmoid functions can be used as kernels, but the selected function needs to meet various criteria, such as simplicity and high generalization capability, and its parameters need to be tuned. In this study, the Gaussian kernel (a type of RBF) was used, that is:
K(x_i, x_j) = \exp\left(-\frac{\|x_i - x_j\|^2}{\gamma}\right)    (3)
The training set can be defined as:
X = \{(x_i, y_i)\}, \quad x_i \in R^{n}, \quad i = 1, 2, 3, \ldots, N    (4)
where xi is the feature vector for the classifier and yi is the label, which belongs to {+1, −1}.
The mathematical model for SVM can be expressed as:
f(x) = \mathrm{sgn}\left(\sum_{i=1}^{\#SV} a_i^{*} y_i K(x_i, x) + b\right)    (5)
where #SV is the cardinality of the support vector set, ai* is the optimal Lagrange multiplier and b is the bias, which can be evaluated from the optimal solution.
According to previous research [48], a hyperparameter optimization method was used for the SVM model.
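A minimal sketch of this approach, assuming scikit-learn's regression variant SVR with a Gaussian (RBF) kernel and a grid search over C and gamma; the parameter ranges and data are illustrative assumptions, not the authors' settings.

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(0)
X = rng.random((200, 6))                                   # placeholder feature matrix
y = X @ rng.random(6) + 0.05 * rng.standard_normal(200)    # placeholder target values

# RBF (Gaussian) kernel SVR with C and gamma tuned by grid search
grid = GridSearchCV(SVR(kernel="rbf"),
                    {"C": [0.1, 1, 10, 100], "gamma": [0.01, 0.1, 1.0]},
                    cv=10, scoring="r2")
grid.fit(X, y)
print(grid.best_params_, round(grid.best_score_, 3))
```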

2.3. GMM-VSG

GMM-VSG is a density estimation algorithm [40] based on the Gaussian mixture model, which can approximate arbitrary nonlinear density functions by adjusting the weights of its components, thereby changing the probability density function of the mixture model. The steps of GMM-VSG are shown in Figure 4.
If the samples X = {x1, …, xn} are generated by a Gaussian mixture distribution P composed of G components, the mixture likelihood function of P to be maximized is given by:
L_M(\theta_1, \ldots, \theta_G; \gamma_1, \ldots, \gamma_n \mid x) = \prod_{i=1}^{n} \sum_{k=1}^{G} \pi_k f_k(x_i \mid \theta_k), \quad \pi_k \ge 0, \;\; \sum_{k=1}^{G} \pi_k = 1    (6)
where θk is the parameter of the k-th component, γi is the probability that xi is fitted by each component and πk is the weight parameter.
The density function fk(·) can be further expressed as:
f_k(x_i \mid \mu_k, \Sigma_k) = \frac{\exp\left\{-\frac{1}{2}(x_i - \mu_k)^{T}\Sigma_k^{-1}(x_i - \mu_k)\right\}}{(2\pi)^{p/2}\,|\Sigma_k|^{1/2}}    (7)
where μk is the expectation (mean vector) and Σk is the covariance matrix of the k-th component.
Furthermore, the Gaussian mixture distribution can be represented by the density function, that is:
P(x \mid \theta) = \sum_{k=1}^{G} \pi_k f_k(x_i \mid \mu_k, \Sigma_k)    (8)
The parameters of the Gaussian mixture model can be solved by the EM algorithm. In the E-step, the posterior probability of the latent variables is computed from the initial values or the parameters of the last iteration. In the M-step, the new parameters are obtained by maximizing the likelihood function. The specific calculation steps are as follows:
Q(\theta, \theta^{(i-1)}) = E\left(\log L(\theta \mid X, Z)\right) = \int \log L(\theta \mid X, Z)\, f(Z \mid X, \theta^{(i-1)})\, dZ    (9)
\theta^{*} = \theta^{(i)} = \arg\max_{\theta} Q(\theta, \theta^{(i-1)})    (10)
When data are limited, it is difficult to balance model complexity and fitting ability during model selection. Based on the concept of entropy, the Akaike information criterion (AIC) and the Bayesian information criterion (BIC) provide standards that weigh the complexity of the estimated model against the goodness of fit to the data, as defined in Equations (11) and (12).
AIC = 2k - 2\ln(L)    (11)
BIC = k\ln(n) - 2\ln(L)    (12)
where k is the number of model parameters and n is the number of samples.
The penalty term of BIC accounts for the number of samples, which makes it more suitable for small samples. Hence, BIC was selected as the information criterion in this paper.
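The following sketch illustrates the core idea of GMM-based virtual sample generation under stated assumptions: candidate mixtures are compared by BIC, the best model is sampled, and rows with negative values are discarded as in Section 3.1.1. The full GMM-VSG algorithm of Ref. [40] contains additional steps not reproduced here, and the data are placeholders.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
real = rng.random((194, 7))                       # placeholder for the 194 real rows (6 inputs + D)

# Fit candidate mixture models and keep the one with the lowest BIC
candidates = [GaussianMixture(n_components=g, random_state=0).fit(real) for g in range(1, 8)]
best = min(candidates, key=lambda m: m.bic(real))

virtual, _ = best.sample(1000)                    # draw 1000 virtual samples
virtual = virtual[(virtual >= 0).all(axis=1)]     # discard physically impossible negative rows
print(best.n_components, virtual.shape)
```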

3. Data Collection and Performance Evaluation

3.1. Database

3.1.1. Data Sources

To predict the value of D (the chloride diffusion coefficient), 118 sets of data were selected from our previous tests (which include exposure tests in a marine environment, simulated experiments in a laboratory environment and porosity tests using mercury intrusion porosimetry) as the first group of samples, and a database containing 194 sets of data (including the 118 sets in the first group) collected from the published literature [31,32,33,34,35,36,37,38,39] was created as the second group of samples. In addition, the GMM-VSG method proposed by Shen and Qian [40] was used to expand the second group by generating 1000 sets of data; after eliminating two sets containing negative values, a total of 998 sets of data were obtained for the third group.

3.1.2. Input Variable Selection

Six parameters that may affect the D value are selected as input variables. They can be classified into two categories: (1) concrete microscopic parameters, including porosity and the contribution porosities of pores in different size ranges (d < 20 nm, 20 nm < d < 50 nm, 50 nm < d < 200 nm, d > 200 nm); (2) a control variable (i.e., exposure age).
Among them, the contribution porosity is the percentage of pores in each size range multiplied by the total porosity. Exposure age is chosen as the control variable to distinguish cases in which concretes with different admixtures show the same contribution porosity at different ages; a sketch of how such a feature table could be assembled is given below.
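A hypothetical sketch of how the six input variables and the target D could be organized as a feature table; the column names are illustrative only, the numeric rows loosely follow the mean values in Tables 1 and 2, and the exposure ages are invented for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "porosity": [5.47, 7.02],        # total porosity (%)
    "phi_lt20nm": [1.61, 1.91],      # contribution porosity, d < 20 nm (%)
    "phi_20_50nm": [1.61, 2.42],     # contribution porosity, 20 nm < d < 50 nm (%)
    "phi_50_200nm": [1.41, 1.79],    # contribution porosity, 50 nm < d < 200 nm (%)
    "phi_gt200nm": [0.83, 0.90],     # contribution porosity, d > 200 nm (%)
    "exposure_age": [90, 180],       # exposure age (days), control variable
    "D": [2.99, 4.28],               # chloride diffusion coefficient (1e-12 m^2/s)
})
X, y = df.drop(columns="D"), df["D"]  # inputs and target for the ML models
```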

3.1.3. Statistical Characteristics of Data

Statistical analysis on each group of data is conducted. Table 1, Table 2 and Table 3 show the statistical parameters of each data group, including maximum value, minimum value, average value, median value, standard deviation, kurtosis and skewness.
Taking the porosity as an example, the minimum value of the first group is the same as that of the second group, while the maximum value of the second group is larger than that of the first group. In addition, the maximum value of the second group is the same as that of the third group, while the minimum value of the third group is smaller than that of the second group. The standard deviation of the third group is the largest, followed by the second group and then the first group. The main reason is that the data in the first group come from the same authors, resulting in smaller dispersion.

3.1.4. Variable Correlation

Variable correlation analysis is conducted on each group of data, as shown in Figure 5. It can be seen that the correlation of variables in the second group differs partially from that in the first group. In addition, the data obtained from the VSG algorithm in the third group show a better variable correlation.

3.2. Data Preprocessing

It is necessary to preprocess the input and output variables to improve the training accuracy of the model. Specific methods, including standardization and normalization preprocessing, were used to treat the data and train the model. This allowed numerical problems due to the magnitude differences of input data to be avoided. For the MLP model, the saturation of neurons can be prevented, which can improve the convergence speed [49].
The normalization preprocessing method is defined as:
y = \frac{x - x_{\min}}{x_{\max} - x_{\min}}    (13)
where xmin is the minimum value, xmax is the maximum value and x is the sample value to be normalized.
After normalization preprocessing, the results need to be reverse normalized, that is:
x = y\,(x_{\max} - x_{\min}) + x_{\min}    (14)
The standardization preprocessing method is defined as:
y = \frac{x - \mu}{\sigma}    (15)
where μ is the average value and σ is the standard deviation.
Data treated by either the standardization or the normalization preprocessing method are transformed linearly, so the scale changes without altering the order of the samples. With the standardization preprocessing method, the standard deviation and mean of each variable are rescaled to 1 and 0, respectively, while with the normalization preprocessing method the values of each variable are rescaled to the range between 0 and 1. Thus, the standardization preprocessing method better maintains the spacing of the samples, whereas the normalization preprocessing method is more susceptible to outliers.
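As a minimal illustration of the two preprocessing options, the scikit-learn MinMaxScaler and StandardScaler correspond to Equations (13)–(15); the values below are arbitrary examples, not data from this study.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

X = np.array([[3.19, 0.41], [5.47, 2.99], [10.85, 10.04]])   # illustrative values only

norm = MinMaxScaler().fit(X)
X_norm = norm.transform(X)                 # normalization, Eq. (13): each column rescaled to [0, 1]
X_back = norm.inverse_transform(X_norm)    # reverse normalization, Eq. (14)

X_std = StandardScaler().fit_transform(X)  # standardization, Eq. (15): zero mean, unit variance
print(X_norm.round(2), X_std.round(2), sep="\n")
```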

3.3. Partition of Training and Testing Set

Ten-fold cross-validation is used for the training data, as it can improve the accuracy [50] and generalization ability of the model [51,52] and helps avoid overfitting. As shown in Figure 6, the data are divided into ten parts; in each round, one fold is used as the validation set and the remaining folds are used as the training set. The optimal hyperparameters for the training model can then be obtained. In addition, 10% of all data are randomly selected for testing.
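A sketch of this partition scheme, assuming scikit-learn utilities: 10% of the data is held out for testing and 10-fold cross-validation is run on the remainder. The data below are placeholders, not the actual database.

```python
import numpy as np
from sklearn.model_selection import train_test_split, KFold, cross_val_score
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X = rng.random((998, 6))                      # placeholder for the expanded (third) group
y = X @ rng.random(6)

# Hold out 10% for testing, then run 10-fold cross-validation on the rest
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.1, random_state=0)
cv = KFold(n_splits=10, shuffle=True, random_state=0)
scores = cross_val_score(MLPRegressor(hidden_layer_sizes=(16, 16), max_iter=1000),
                         X_tr, y_tr, cv=cv, scoring="r2")
print(scores.mean().round(3))
```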

3.4. Performance Evaluation

Statistical parameters, including the coefficient of determination R2, the mean absolute error MAE and the mean square error MSE, are evaluated for the reliability analysis. These parameters are determined by:
R^2 = 1 - \frac{\sum_{i=1}^{m}(\hat{y}_i - y_i)^2}{\sum_{i=1}^{m}(y_i - \bar{y})^2}    (16)
MAE = \frac{1}{m}\sum_{i=1}^{m}\left|y_i - \hat{y}_i\right|    (17)
MSE = \frac{1}{m}\sum_{i=1}^{m}(y_i - \hat{y}_i)^2    (18)
where m is the number of samples, and ŷi, yi and ȳ represent the predicted value, the true value and the average of the true values, respectively.
The OBJ indicator proposed by Golafshani et al. [53] is adopted to compare the performance of different models, that is:
OBJ = \frac{MAE + MSE}{1 + R^2}    (19)
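For reference, a short sketch computing R2, MAE, MSE and the OBJ indicator of Equation (19) with scikit-learn metrics; the values are illustrative only.

```python
import numpy as np
from sklearn.metrics import r2_score, mean_absolute_error, mean_squared_error

y_true = np.array([2.3, 4.1, 0.9, 3.5])      # illustrative measured D values
y_pred = np.array([2.1, 4.4, 1.1, 3.2])      # illustrative predicted D values

r2 = r2_score(y_true, y_pred)
mae = mean_absolute_error(y_true, y_pred)
mse = mean_squared_error(y_true, y_pred)
obj = (mae + mse) / (1 + r2)                 # OBJ indicator, Eq. (19)
print(round(r2, 3), round(mae, 3), round(mse, 3), round(obj, 3))
```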

4. Results and Discussion

4.1. MLP Model

The measured values of D were taken as the horizontal coordinate and the predicted values as the vertical coordinate. The prediction results obtained by the MLP model with the normalization and standardization preprocessing methods are plotted in Figure 7.
As seen from Figure 7, only a small number of points in the first and second groups fall within the ±15% error bars, while more points in the third group do, regardless of the data preprocessing method. Moreover, the mean value of the single-point error in the three groups is 1.17, 1.42 and 0.95 when using the normalization preprocessing method, indicating that the model trained on the third group of data is more stable than those trained on the other groups, while the prediction accuracy of the second group is worse than that of the first group.
It should be mentioned that the MLP model has a good prediction performance when the values of MAE, MSE and OBJ are low or R2 is close to 1. The prediction evaluation indicators of the MLP model are listed in Table 4. The data of the third group show a better performance than those of the other two groups whether the normalization or the standardization preprocessing method is used, which reveals that the GMM-VSG algorithm can help improve the prediction accuracy of the MLP model. Although the R2 of the second group is higher than that of the first group, its MAE and MSE are larger. The reason may be that the increase in the amount of data makes the prediction more accurate, while the second group comes from multiple different experiments and therefore contains more test errors, which increases the proportion of noise. Furthermore, the OBJ value of the MLP model with normalization preprocessing is lower than that with standardization preprocessing in the third group; thus, the normalization preprocessing method is more suitable for MLP. This is mainly because the distribution of the data set becomes more uniform after using the GMM-VSG algorithm and the impact of noisy values is reduced; after normalization, the gradient descent becomes stable, the convergence of the MLP model becomes faster and the neurons do not saturate.
The loss function measures the deviation between the real value and the predicted value and can be used to evaluate the performance of the model. Figure 8 shows the relationship between the loss value and the number of iterations. The loss value first decreased rapidly and then leveled off in all three groups. The loss curves oscillated in the first and second groups but remained stable in the third group; the oscillations are caused by outliers in the dataset, which affect the accuracy of the MLP model. In general, the loss value of the data trained by the standardization preprocessing method was larger than that trained by the normalization preprocessing method, and its curves oscillated more strongly. Furthermore, the loss value of the data trained by the normalization preprocessing method oscillated significantly only around the 800th and 2700th iterations in the first two groups and remained stable in the third group, staying below 0.007 after 3000 iterations, whereas the loss value of the data trained by the standardization preprocessing method was about 0.091 after 3000 iterations for the third group. This also indicates that the normalization preprocessing method is more robust and better suited to the MLP model.

4.2. SVM Model

The prediction results obtained by the SVM model with the normalization and standardization preprocessing methods are shown in Figure 9. Similar to the MLP model, few data points of the first and second groups fall within the ±15% error bars, while most points of the third group do, regardless of the preprocessing method. The mean value of the single-point error in the three groups is 1.31, 1.43 and 1.03 with the standardization preprocessing method, indicating that the SVM model trained on the third group is more reliable than those trained on the other two groups. As the SVM model is more sensitive to noisy values, the effect of the data increment is less obvious, and the prediction performance of the second group is much worse than that of the first group.
The prediction evaluation indicators of the SVM model are shown in Table 5. As expected, the data of the third group show a better performance whether the normalization or the standardization preprocessing method is used. It is interesting that, in the third group, the OBJ value of the SVM model with standardization preprocessing is lower than that of the MLP model with normalization preprocessing. This is because SVM is a distance-based algorithm and the standardization preprocessing approach better maintains the spacing of the samples.

4.3. Comparisons

The comparisons of the two machine learning methods with the standardization and/or normalization preprocessing approaches are shown in Figure 10. In general, the SVM model can make more accurate predictions, while the MLP model is more stable. Because the hyperplane of the SVM is determined by the support vectors, under a fixed regularization parameter, noisy values can cause an unreasonable division, so SVM is more sensitive to noise. Since MLP updates its weights through multiple backpropagation passes, it is less sensitive to noisy values.
In the first and second group, the OBJ values obtained from the MLP model with standardization and normalization preprocessing method are close, while the OBJ value obtained from the SVM model with normalization preprocessing method is lower than that with the standardization preprocessing method. In the third group, both the MLP and SVM models can produce good predictions, and the SVM model with standardization preprocessing method is the most accurate one.

5. Conclusions and Prospects

5.1. Conclusions

In this paper, a reliable and accurate model for predicting the chloride diffusion coefficient in concrete has been established. A total of 194 sets of data were collected from the existing literature, and an expanded virtual database was obtained using the GMM-VSG algorithm. The following conclusions can be drawn:
  • The connection between macroscopic properties and microstructure of concrete can be assessed by machine learning methods.
  • The MLP and SVM models built by the virtual database are capable of predicting the chloride diffusion coefficients in concrete. The R2 obtained from MLP and SVM models is 0.95 (by normalization) and 0.97 (by standardization), respectively. The OBJ value obtained by the MLP and SVM model is 0.68 (by normalization) and 0.30 (by standardization), respectively.
  • The expanded data set produced by the GMM-VSG algorithm helps to improve the accuracy of the MLP and SVM models, and the improvement is greater for the SVM model, probably because the increase in the amount of data weakens the impact of noisy values.
  • The normalization preprocessing method is more suitable for the MLP model, while the standardization preprocessing method is better adapted to the SVM model. This may be because standardization preserves the sample spacing better, which benefits the distance-based SVM, while normalization prevents oversaturation of the MLP neurons; the normalization preprocessing method is also more robust for MLP training than the standardization preprocessing method.

5.2. Prospects

In general, the MLP and SVM models used in this paper are still limited by other factors, such as the quality and quantity of the database and the selection of hyperparameters. The generalizability and reliability of the established models have not been fully assessed in this paper. In our future work, microstructural parameters will be used to predict the water and gas permeability coefficients of concrete through machine learning methods.

Author Contributions

Conceptualization, F.-Y.Z. and N.-J.T.; methodology, F.-Y.Z. and N.-J.T.; software, F.-Y.Z. and N.-J.T.; validation, F.-Y.Z. and N.-J.T.; formal analysis, F.-Y.Z. and N.-J.T.; investigation, F.-Y.Z. and N.-J.T.; resources, Y.-R.Z.; data curation, F.-Y.Z. and N.-J.T.; writing—original draft preparation, F.-Y.Z. and N.-J.T.; writing—review and editing, Y.-R.Z. and W.-B.Y.; visualization, F.-Y.Z. and N.-J.T.; supervision, Y.-R.Z. and W.-B.Y.; project administration, W.-B.Y.; funding acquisition, W.-B.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Research Funding of Zhejiang University of Technology, grant number SH1060210983.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Lin, S.Z.; Li, H.M.; Zhang, H.Y. Experimental study on frost resistance durability and service life prediction of normal cement concrete. Adv. Mater. Res. 2011, 368–373, 2425–2429. [Google Scholar] [CrossRef]
  2. Chen, X.; Li, Z.Y.; Jin, W.L.; Zhang, Y.; Yao, C.J. Chloride ion ingress distribution within an alternate wetting-drying marine environment area. Sci. China Technol. Sci. 2012, 55, 970–996. [Google Scholar]
  3. Basheer, L.; Kropp, J.; Cleland, D.J. Assessment of the durability of concrete from its permeation properties: A review. Constr. Build. Mater. 2001, 15, 93–103. [Google Scholar] [CrossRef]
  4. Evangelista, L.; Brito, J.D. Durability performance of concrete made with fine recycled concrete aggregates. Cem. Concr. Compos. 2010, 32, 9–14. [Google Scholar] [CrossRef]
  5. Dunker, K.F.; Rabbat, B.G. Why America’s bridges are crumbling. Sci. Am. 1993, 266, 66–72. [Google Scholar] [CrossRef]
  6. Hong, N.F. Development and difficulty of forecast for corrosion and durability of concrete. Concrete 2006, 10, 10–12+16. (In Chinese) [Google Scholar]
  7. Zhang, M.H.; Li, H. Pore structure and chloride permeability of concrete containing nano-particles for pavement. Constr. Build. Mater. 2011, 25, 608–616. [Google Scholar] [CrossRef]
  8. Moon, H.Y.; Kim, H.S. Relationship between average pore diameter and chloride diffusivity in various concretes. Constr. Build. Mater. 2006, 20, 725–732. [Google Scholar] [CrossRef]
  9. Luo, R.; Cai, Y.B.; Wang, C.Y.; Huang, X. Study of chloride binding and diffusion in GGBS concrete. Cem. Concr. Res. 2003, 33, 1–7. [Google Scholar] [CrossRef]
  10. Jin, L.; Yu, G.; Wang, Z.; Fan, T. Developing a model for chloride transport through concrete considering the key factors. Case Stud. Constr. Mater. 2022, 17, e01168. [Google Scholar] [CrossRef]
  11. Qi, D.; Zheng, H.; Zhang, L.; Sun, G.; Yang, H.; Li, Y. Numerical simulation on diffusion reaction behavior of concrete under sulfate chloride coupled attack. Constr. Build. Mater. 2023, 405, 133237. [Google Scholar] [CrossRef]
  12. Shazali, M.A.; Rahman, M.K.; Al-Gadhib, A.H.; Balunch, M.H. Transport modeling of chlorides with binding in concrete. Arab. J. Sci. Eng. 2012, 37, 469–479. [Google Scholar] [CrossRef]
  13. Weiss, T.; Mareš, J.; Slavík, M.; Bruthans, J. A microdestructive method using dye-coated-probe to visualize capillary, diffusion and evaporation zones in porous materials. Sci. Total Environ. 2020, 704, 135339. [Google Scholar] [CrossRef] [PubMed]
  14. Wernert, V.; Nguyen, K.L.; Levitz, P.; Coasne, B.; Denoyel, R. Impact of surface diffusion on transport through porous materials. J. Chromatogr. A 2022, 1665, 462823. [Google Scholar] [CrossRef] [PubMed]
  15. Gao, Y.W.; Pastrana, A.P.C.; Manogharan, G.; van Duin, A.C.T. Molecular dynamics study of melting, diffusion, and sintering of cementite chromia core–shell particles. Comput. Mater. Sci. 2021, 199, 110721. [Google Scholar] [CrossRef]
  16. Cai, R.; Han, T.H.; Liao, W.Y.; Huang, J.; Li, D.W.; Kumar, A.; Ma, H.Y. Prediction of surface chloride concentration of marine concrete using ensemble machine learning. Cem. Concr. Res. 2020, 136, 106164. [Google Scholar] [CrossRef]
  17. Liu, Q.F.; Iqbal, M.F.; Yang, J.; Lu, X.Y.; Zhang, P.; Rauf, M. Prediction of chloride diffusivity in concrete using artificial neural network: Modelling and performance evaluation. Constr. Build. Mater. 2021, 268, 121082. [Google Scholar] [CrossRef]
  18. Tran, V.Q. Machine learning approach for investigating chloride diffusion coefficient of concrete containing supplementary cementitious materials. Constr. Build. Mater. 2022, 328, 127103. [Google Scholar] [CrossRef]
  19. Liu, Y.; Cao, Y.; Wang, L.; Chen, Z.S.; Qin, Y. Prediction of the durability of high-performance concrete using an integrated RF-LSSVM model. Constr. Build. Mater. 2022, 356, 129232. [Google Scholar] [CrossRef]
  20. Jin, L.; Dong, T.; Fan, T.; Duan, J.; Yu, H.L.; Jiao, P.F.; Zhang, W.B. Prediction of the chloride diffusivity of recycled aggregate concrete using artificial neural network. Mater. Today Commun. 2022, 32, 104137. [Google Scholar] [CrossRef]
  21. Polyzotis, N.; Roy, S.; Whang, S.E.; Zinkevich, M. Data management challenges in production machine learning. In Proceedings of the 2017 ACM International Conference on Management of Data, Chicago, IL, USA, 14–19 May 2017; pp. 1723–1726. [Google Scholar]
  22. Wang, J.C. A neural network initialization method based on machine learning. Comput. Res. Dev. 1997, 8, 41–46. (In Chinese) [Google Scholar]
  23. Niu, X.X.; Yang, C.L.; Wang, H.C.; Wang, Y.Y. Investigation of ANN and SVM based on limited samples for performance and emissions prediction of a CRDI-assisted marine diesel engine. Appl. Therm. Eng. 2017, 111, 1353–1364. [Google Scholar] [CrossRef]
  24. Ping, Y. Support Vector Machine Based Clustering and Text Classification Research. Ph.D. Thesis, Beijing University of Posts and Telecommunications, Beijing, China, 2012. (In Chinese). [Google Scholar]
  25. Akande, K.O.; Owolabi, T.O.; Twaha, S.; Olatunji, S.O. Performance comparison of SVM and ANN in predicting compressive strength of concrete. IOSR J. Comput. Eng. 2014, 16, 88–94. [Google Scholar] [CrossRef]
  26. Yang, J.; Yu, X.; Xie, Z.; Zhang, J. A novel virtual sample generation method based on Gaussian distribution. Knowl. Based Syst. 2011, 24, 740–748. [Google Scholar] [CrossRef]
  27. Gong, H.; Chen, Z.; Zhu, Q.; He, Y. A Monte Carlo and PSO based virtual sample generation method for enhancing the energy prediction and energy optimization on small data problem: An empirical study of petrochemical industries. Appl. Energy 2017, 197, 405–415. [Google Scholar] [CrossRef]
  28. Dong, Q.Y.; Bai, S.W.; Wang, Z.; Zhao, X.Y.; Yang, S.S.; Ren, N.Q. Virtual sample generation empowers machine learning-based effluent prediction in constructed wetlands. J. Environ. Manag. 2023, 346, 118961. [Google Scholar] [CrossRef] [PubMed]
  29. Sang, K.H.; Yin, X.Y.; Zhang, F.C. Machine learning seismic reservoir prediction method based on virtual sample generation. Pet. Sci. 2021, 18, 1662–1674. [Google Scholar] [CrossRef]
  30. Lin, L.S.; Lin, Y.S.; Li, D.C.; Liu, Y.H. Improved learning performance for small datasets in high dimensions by new dual-net model for non-linear interpolation virtual sample generation. Decis. Support Syst. 2023, 172, 113996. [Google Scholar] [CrossRef]
  31. Zhang, X.Q. Study on the Mechanism of Chloride Ion Diffusion in Double-Doped Fly Ash and Slag Concrete. Ph.D. Thesis, Nanjing University of Technology, Nanjing, China, 2015. (In Chinese). [Google Scholar]
  32. Zhang, Y.R.; Wu, S.Y.; Ma, X.Q.; Fang, L.C.; Zhang, J.Z. Effects of additives on water permeability and chloride diffusivity of concrete under marine tidal environment. Constr. Build. Mater. 2022, 320, 126217. [Google Scholar] [CrossRef]
  33. Zhang, J.Z.; Fang, R.H.; Lv, M.; Zhang, Y.R.; Cao, Y.H.; Gao, Y.H. Time dependent microstructure evolution of fly ash concrete in the natural tidal environment. J. Nat. Disasters 2019, 28, 9–16. (In Chinese) [Google Scholar]
  34. Zhang, J.Z.; Jin, T.; He, Y.C.; Yu, W.L.; Gao, Y.H.; Zhang, Y.R. Time dependent correlation of permeability of fly ash concrete under natural tidal environment. Eur. J. Environ. Civ. Eng. 2022, 26, 8477–8501. [Google Scholar] [CrossRef]
  35. Zhang, J.Z.; Zhou, X.Y.; Zhao, J.; Wang, M.; Gao, Y.H.; Zhang, Y.R. Similarity of chloride diffusivity of concrete exposed to different environments. ACI Mater. J. 2020, 117, 27–37. [Google Scholar]
  36. Zhang, J.Z.; Wang, M.; Zhou, X.Y.; Yu, W.L.; Gao, Y.H.; Zhang, Y.R. Exploring the emerging evolution trends of probabilistic service life prediction of reinforced concrete structures in the chloride environment by scientometric analysis. Adv. Civ. Eng. 2021, 2021, 8883142. [Google Scholar] [CrossRef]
  37. Gao, Y.H.; Shao, X.J.; Zhang, Y.R.; Fang, R.H.; Zhang, J.Z. Permeability dependency of fly ash concrete in natural tidal environment. J. Hydroelectr. Eng. 2021, 40, 214–222. (In Chinese) [Google Scholar]
  38. Gao, Y.H.; Guo, B.L.; Wang, M.; Zhang, Y.R.; Zhang, J.Z. Stable time and mechanism of concrete permeability in natural tidal environment. J. Hydroelectr. Eng. 2022, 41, 50–62. (In Chinese) [Google Scholar]
  39. Zhang, J.Z.; Wu, J.; Zhang, Y.R.; Wang, J.D. Time-varying relationship between pore structures and chloride diffusivity of concrete under the simulated tidal environment. Eur. J. Environ. Civ. Eng. 2022, 26, 501–518. [Google Scholar] [CrossRef]
  40. Shen, L.J.; Qian, Q. A virtual sample generation algorithm supporting machine learning with a small-sample dataset: A case study for rubber materials. Comput. Mater. Sci. 2022, 211, 111475. [Google Scholar] [CrossRef]
  41. Rosenblatt, F. The perceptron: A probabilistic model for information storage and organization in the brain. Psychol. Rev. 1958, 65, 386–408. [Google Scholar] [CrossRef]
  42. Cortes, C.; Vapnik, V. Support-Vector Networks. Mach. Learn. 1995, 20, 273–297. [Google Scholar] [CrossRef]
  43. Wakjira, T.G.; Rahmzadeh, A.; Alam, M.S.; Tremblay, R. Explainable machine learning based efficient prediction tool for lateral cyclic response of post-tensioned base rocking steel bridge piers. Structures 2022, 44, 947–964. [Google Scholar] [CrossRef]
  44. Wakjira, T.G.; Abushanab, A.; Ebead, U.; Alnahhal, W. FAI: Fast, accurate, and intelligent approach and prediction tool for flexural capacity of FRP-RC beams based on super-learner machine learning model. Mater. Today Commun. 2022, 33, 104461. [Google Scholar] [CrossRef]
  45. Zhang, Y.R.; Yu, W.L.; Ma, X.Q.; Luo, T.Y.; Wang, J.J. Prediction of chloride concentration in fly ash concrete based on deep learning. J. Beijing Univ. Technol. 2023, 49, 205–212. (In Chinese) [Google Scholar]
  46. Adıgüzel, E.; Subaşı, N.; Mumc, T.V.; Ersoy, A. The effect of the marble dust to the efficiency of photovoltaic panels efficiency by SVM. Energy Rep. 2023, 9, 66–76. [Google Scholar] [CrossRef]
  47. Santos, C.E.d.S.; Sampaio, R.C.; Coelho, L.d.S.; Bestarsd, G.A.; Llanos, C.H. Multi-objective adaptive differential evolution for SVM/SVR hyperparameters selection. Pattern Recognit. 2021, 10, 107649. [Google Scholar] [CrossRef]
  48. Wang, L.L.; Yu, W.L.; Zhang, Y.R. Prediction of chloride concentration in fly ash concrete based on support vector regression. Zhejiang Architecture, (In Chinese, accepted.).
  49. Liu, X.T. BP neural network input layer data normalization study. Mech. Eng. Autom. 2010, 3, 122–126. (In Chinese) [Google Scholar]
  50. Vu, H.L.; Ng, K.T.W.; Richter, A.; An, C. Analysis of input set characteristics and variances on k-fold cross validation for a Recurrent Neural Network model on waste disposal rate estimation. J. Environ. Manag. 2022, 311, 114869. [Google Scholar] [CrossRef] [PubMed]
  51. Nguyen, X.C.; Nguyen, T.T.H.; La, D.D.; Kumar, G.; Rene, E.R.; Nguyen, D.D.; Chang, S.W.; Chung, W.J.; Nguyen, X.H.; Nguyen, V.K. Development of machine learning–based models to forecast solid waste generation in residential areas: A case study from Vietnam. Resour. Conserv. Recycl. 2021, 167, 105381. [Google Scholar] [CrossRef]
  52. Garre, A.; Ruiz, M.C.; Hontoria, E. Application of Machine Learning to support production planning of a food industry in the context of waste generation under uncertainty. Oper. Res. Perspect. 2020, 7, 100147. [Google Scholar] [CrossRef]
  53. Golafshani, E.M.; Behnood, A.; Arashpour, M. Predicting the compressive strength of normal and High-Performance Concretes using ANN and ANFIS hybridized with Grey Wolf Optimizer. Constr. Build. Mater. 2020, 232, 117266. [Google Scholar] [CrossRef]
Figure 1. The basic structure of the MLP model.
Figure 2. Activation functions: (a) Sigmoid; (b) Tanh; (c) ReLU.
Figure 3. Schematic diagram of an SVM model. (The red and blue dots indicate the samples that need to be classified.)
Figure 4. Schematic diagram of the GMM-VSG algorithm.
Figure 5. Variable correlation of each data group: (a) first group; (b) second group; (c) third group.
Figure 6. 10-fold cross-validation.
Figure 7. Prediction results of the MLP model: (a) first group; (b) second group; (c) third group.
Figure 8. Relationship between loss value and number of iterations: (a) first group; (b) second group; (c) third group.
Figure 9. Prediction results of the SVM model: (a) first group; (b) second group; (c) third group.
Figure 10. Comparisons of the OBJ values obtained from the MLP and SVM models with the standardization and/or normalization preprocessing methods.
Table 1. Statistical parameters of the first group.

| Statistical Parameter | Porosity (%) | Contribution Porosity <20 nm (%) | Contribution Porosity 20–50 nm (%) | Contribution Porosity 50–200 nm (%) | Contribution Porosity >200 nm (%) | D (10⁻¹² m²/s) |
|---|---|---|---|---|---|---|
| Maximum | 10.85 | 3.10 | 2.87 | 5.26 | 4.01 | 10.04 |
| Minimum | 3.19 | 0.91 | 0.23 | 0.20 | 0.21 | 0.41 |
| Mean | 5.47 | 1.61 | 1.61 | 1.41 | 0.83 | 2.99 |
| Median | 5.23 | 1.63 | 1.57 | 1.14 | 0.65 | 2.27 |
| Standard Error | 1.41 | 0.42 | 0.36 | 0.92 | 0.65 | 2.22 |
| Kurtosis | 1.77 | 0.43 | 2.29 | 3.25 | 6.75 | 0.33 |
| Skewness | 1.11 | 0.44 | −0.02 | 1.64 | 2.36 | 1.04 |
Table 2. Statistical parameters of the second group.

| Statistical Parameter | Porosity (%) | Contribution Porosity <20 nm (%) | Contribution Porosity 20–50 nm (%) | Contribution Porosity 50–200 nm (%) | Contribution Porosity >200 nm (%) | D (10⁻¹² m²/s) |
|---|---|---|---|---|---|---|
| Maximum | 18.49 | 4.72 | 8.21 | 6.97 | 5.18 | 14.55 |
| Minimum | 3.19 | 0.38 | 0.23 | 0.08 | 0.00 | 0.41 |
| Mean | 7.02 | 1.91 | 2.42 | 1.79 | 0.90 | 4.28 |
| Median | 5.74 | 1.77 | 1.70 | 1.35 | 0.65 | 3.20 |
| Standard Error | 3.49 | 0.73 | 1.61 | 1.42 | 0.76 | 3.42 |
| Kurtosis | 1.07 | 2.14 | 0.96 | 1.94 | 6.94 | −0.05 |
| Skewness | 1.34 | 1.28 | 1.39 | 1.52 | 2.27 | 0.94 |
Table 3. Statistical parameters of the third group.

| Statistical Parameter | Porosity (%) | Contribution Porosity <20 nm (%) | Contribution Porosity 20–50 nm (%) | Contribution Porosity 50–200 nm (%) | Contribution Porosity >200 nm (%) | D (10⁻¹² m²/s) |
|---|---|---|---|---|---|---|
| Maximum | 18.49 | 4.72 | 8.21 | 6.97 | 5.18 | 14.55 |
| Minimum | 2.89 | 0.38 | 0.23 | 0.08 | 0.00 | 0.41 |
| Mean | 7.45 | 1.95 | 2.58 | 1.96 | 0.96 | 4.67 |
| Median | 6.07 | 1.77 | 1.75 | 1.52 | 0.71 | 3.75 |
| Standard Error | 3.58 | 0.69 | 1.61 | 1.49 | 0.76 | 3.52 |
| Kurtosis | 0.33 | 1.86 | 0.20 | 1.54 | 5.65 | −0.51 |
| Skewness | 1.12 | 1.22 | 1.15 | 1.41 | 2.03 | 0.74 |
Table 4. Prediction evaluation indicators of the MLP model.

| Evaluation Indicator | First Group (Normalization) | First Group (Standardization) | Second Group (Normalization) | Second Group (Standardization) | Third Group (Normalization) | Third Group (Standardization) |
|---|---|---|---|---|---|---|
| MAE | 1.07 | 1.03 | 1.11 | 1.10 | 0.62 | 0.66 |
| MSE | 1.87 | 1.75 | 1.67 | 1.86 | 0.71 | 0.73 |
| R² | 0.62 | 0.65 | 0.71 | 0.77 | 0.95 | 0.95 |
| OBJ | 1.81 | 1.69 | 1.54 | 1.67 | 0.68 | 0.71 |
Table 5. Prediction evaluation indicators of the SVM model.

| Evaluation Indicator | First Group (Normalization) | First Group (Standardization) | Second Group (Normalization) | Second Group (Standardization) | Third Group (Normalization) | Third Group (Standardization) |
|---|---|---|---|---|---|---|
| MAE | 0.70 | 1.16 | 1.03 | 1.37 | 0.59 | 0.29 |
| MSE | 1.07 | 1.93 | 3.00 | 3.69 | 0.81 | 0.30 |
| R² | 0.76 | 0.57 | 0.68 | 0.60 | 0.93 | 0.97 |
| OBJ | 1.00 | 1.97 | 2.41 | 3.61 | 0.73 | 0.30 |