A Prediction Model of Coal Seam Roof Water Abundance Based on PSO-GA-BP Neural Network

Dai, Xue; Li, Xiaoqin; Zhang, Yuguang; Li, Wenping; Meng, Xiangsheng; Li, Liangning; Han, Yanbo

doi:10.3390/w15234117


by Xue Dai ¹, Xiaoqin Li ¹,*, Yuguang Zhang ², Wenping Li ¹, Xiangsheng Meng ¹, Liangning Li ¹ and Yanbo Han ¹

¹ School of Resources and Geosciences, China University of Mining and Technology, Xuzhou 221116, China

² National Investment Hami Energy Development Co., Ltd., Hami 839000, China

* Author to whom correspondence should be addressed.

Water 2023, 15(23), 4117; https://doi.org/10.3390/w15234117

Submission received: 22 October 2023 / Revised: 14 November 2023 / Accepted: 22 November 2023 / Published: 28 November 2023

(This article belongs to the Special Issue The Research on Effects of Coal Mining on Groundwater Environment and System)


Abstract: With the gradual increase of coal production capacity, water hazards in coal seam roofs are becoming increasingly prominent. Accurate and effective prediction of the water abundance of the roof aquifer from limited hydrogeological data is critical for identifying the key areas for the prevention and control of coal seam roof water damage and for reducing the incidence of such accidents in coal mines. In this paper, we establish a prediction model for the water abundance of the roof aquifer using a PSO-GA-BP neural network. The model is based on five key factors: aquifer thickness, permeability coefficient, core recovery, number of sandstone and mudstone interbedded layers, and fold fluctuation. It integrates the genetic algorithm (GA) into the particle swarm optimization (PSO) algorithm, with PSO serving as the primary approach, and uses adaptive inertia weights and a secondary optimization of the weights and thresholds of the backpropagation (BP) neural network to minimize the output error. This prediction model is applied to hydrogeology and coal mine production for the first time. The model is trained using 100 data samples collected with the Surfer 13 software, and these samples are used to predict the unit water inflow. The model is then compared with the traditional FAHP method and with the BP and GA-BP neural network models to assess its performance. The study found that the PSO-GA-BP neural network model predicts aquifer water abundance with higher precision. The root mean square error (RMSE) of the test set is 8.7 × 10⁻⁴ and the fit is 0.9999, indicating minimal error relative to the actual sample values. According to the prediction results of the test set, the water abundance of the Dannanhu No. 7 coal mine in Hami is zoned, and the overall difference between the results and the actual values is small, which verifies the reliability of the model. According to the water abundance zoning, areas of strong water abundance are mainly concentrated in the third partition. This study provides a new method for predicting aquifer water abundance, improves the prediction accuracy, reduces the cost of coal mine production, and provides a scientific evaluation method and a theoretical basis for the prevention and control of water disasters in coal seam roofs.

Keywords:

water abundance; particle swarm optimization algorithm; genetic algorithm; BP neural network; FAHP

1. Introduction

The Cretaceous and Jurassic coal seams in western China are the main sources of coal production in China. These Cretaceous and Jurassic strata exhibit obvious weak cementation characteristics, including low strength, easy weathering, and mud disintegration upon encountering water [1,2,3]. With increasing coal production capacity, water damage from the roof of coal seams has become increasingly significant, posing a severe safety risk. Roof water damage is strongly linked to the water abundance of the roof aquifer, and the unit water inflow of a borehole is the most intuitive hydrogeological parameter for determining water abundance. Therefore, accurate and efficient prediction of the unit water inflow of the roof aquifer can identify the principal areas for preventing and controlling roof water damage in coal seams and support timely, scientific, and effective prevention and remedial measures, significantly reducing the incidence of roof water damage accidents in coal mines [4,5]. Numerous studies have investigated the prediction of aquifer water abundance, including the use of conventional methods [5,6,7,8,9,10,11,12,13], as well as combining these methods with GIS data management and spatial analysis to classify areas of varying water abundance [14,15,16,17,18,19]. These conventional methods, along with other linear mathematical methods, are strongly affected by site data and human factors; the calculated index weights are subjective and the errors are large, so it is difficult to represent the aquifer’s water abundance accurately. Therefore, some scholars have suggested employing artificial neural networks to forecast the water abundance of aquifers [20,21,22,23]. Neural networks possess self-adaptive, self-learning, and fault-tolerant capabilities, which effectively address this issue. Among these, the BP neural network is the most widely used. However, the BP neural network suffers from several shortcomings, including a tendency to fall into local optima, slow convergence, and poor prediction accuracy, so the aquifer water abundance it predicts may contain errors. The genetic algorithm (GA) has been used to optimize the BP network to overcome the limitation of local optima and enhance the prediction accuracy of aquifer water abundance [24,25,26]. The GA has a strong global optimization ability. Nonetheless, it lacks memory, which may cause optimal points to be missed and ultimately lead to suboptimal water abundance predictions.

To improve water abundance prediction, we introduce the particle swarm optimization (PSO) algorithm into the GA-BP neural network. The PSO algorithm converges slowly in its later stage and may converge to a local extreme point, but it has a memory capability: it retains the individual and global optimal solutions, which compensates for the shortcomings of the GA. In the PSO-GA algorithm, the population is first optimized by PSO, and the GA then carries out crossover and mutation operations on the particles during the particle swarm iteration. By combining the advantages of both algorithms, the weights and thresholds of the BP neural network are optimized. The solution obtained is then substituted into the BP neural network for subsequent calculations to minimize the output error.

The PSO-GA-BP neural network prediction model is applied to hydrogeology and mine safety for the first time. It addresses the low prediction accuracy of the BP neural network and the lack of memory of the GA-BP neural network, which can cause good solutions to be lost. It also overcomes limitations of traditional prediction methods, such as strong dependence on site data and human factors and large errors, thereby improving prediction accuracy and reducing production cost. This is of great significance for safe coal mine production.

2. Study Area

The Dannanhu No. 7 coal mine is part of the Dannanhu mining area in Hami, Xinjiang. It is situated in the southern part of Hami City (Figure 1a,c), within the Nanhu Gobi region. The terrain is high in the north and south and low in the center. The surface is mostly covered by Quaternary and Neogene deposits, with the Xishanyao Formation of the Middle Jurassic occasionally exposed in the southwestern area of the mine.

Based on the drill hole data, the strata in the area have developed chronologically with the Upper Carboniferous Wutongwuzi Formation (C₂wt) as the oldest layer, followed by the Lower Jurassic Sangonghe Formation (J₁s), and the Middle Jurassic Xishanyao Formation (J₂x) and Toutunhe Formation (J₂t) in the middle. The Neogene Pliocene Putaogou Formation (N₂p) and Quaternary Formation (Q₄) are observed, and the coal-bearing strata is the Middle Jurassic Xishanyao Formation (J₂x).

Within the Dannanhu No. 7 Coal Mine area, there are Quaternary clay aquifers and sand-mudstone aquifers at the bottom of the Toutunhe Formation in the Middle Jurassic, sand-mudstone aquifers at the base of the upper Xishanyao Formation and in the middle part of the Xishanyao Formation in the Middle Jurassic.

The aquifers are the Quaternary permeable layer, the Neogene Pliocene Putaogou Formation fracture pore weak aquifer, the Middle Jurassic Toutunhe Formation fracture pore aquifer, and the Middle Jurassic Xishanyao Formation fracture pore aquifer. The Xishanyao Formation fracture pore aquifer is further subdivided into the No. 3 coal seam roof aquifer and the No. 3~7 coal seam fracture pore aquifer. The study mainly analyzes the water abundance of the No. 3 coal seam roof aquifer, which has the largest thickness (Figure 1d).

First, a correlation analysis was carried out on the main control factors of aquifer water abundance, and factors that were highly correlated with others were removed. Then, the remaining main control factors were used to calculate and predict the unit water inflow. Finally, the water abundance prediction zones were determined. The research route of this paper is shown in Figure 2.

3. The Primary Determinants of Aquifer Water Abundance

3.1. Analysis of the Main Controlling Factors of Water Abundance

To effectively assess aquifer-water yield, the impact of the main controlling factors on water abundance was analyzed in six aspects. These include aquifer thickness, sand (gravel)-mud ratio, permeability coefficient, core recovery rate, sand-mudstone interlayer number, and fold fluctuation degree [5,14,27]. Such analysis is informed by previous scholarship.

The primary control factors require data collection through drilling and pumping tests, so the data are scattered and not easy to obtain. To derive more consistent and dependable information from the limited data, spatial interpolation, specifically Kriging interpolation, was applied in this paper. Contour maps of each control factor were then created.

(1) Aquifer thickness

The size of an aquifer’s water-storage space is represented by its thickness. The thickness of the aquifer directly influences the amount of water present, with thicker aquifers indicating higher water content. The study area’s northeast features a large aquifer thickness, averaging about 95 m (Figure 3a).

(2) Sand (gravel)-mud ratio

The alternating layers of sand and mudstone frequently appear in the roof strata surrounding coal seams. The sand-mud ratio signifies the thickness ratio of sand (gravel) to mudstone (siltstone) in the aquifer. If the thickness of the aquifer remains relatively constant, and the strata is less affected by tectonism, the greater the thickness of sandstone in the stratum, and the more abundant the aquifer’s water content. By analyzing hydrogeological borehole data, it was found that the higher the sand (gravel)-mud ratio, the greater the aquifer’s permeability coefficient. This suggests that the aquifer possesses a stronger permeability. Furthermore, the thickness of sandstone is greater than that of mudstone in the northwestern section of the study area, resulting in a larger effective aquifer thickness in that region and indicating a good water abundance nature. For instance, in hole S1, the ratio of sand (gravel)-mud was 0.9172, and the permeability coefficient was 0.276 m/d. The rock interval consisting of sand (gravel) is extensive with a considerable water quantity, as depicted in Figure 3b.

(3) Permeability coefficient

The permeability coefficient reflects the aquifer’s permeability, which can reflect the conditions of replenishment, runoff, and discharge. The greater the permeability coefficient value, the stronger the aquifer’s permeability. The coefficient of permeability is predominantly associated with the level of rock fissure development: the more developed the fissures, the greater the degree of connectivity between them, and the lower the degree of filling within the fissures. As a result, the rock’s permeability is stronger and the coefficient of permeability is subsequently larger. The composition of the aquifer roof within the third coal layer at Dannanhu No. 7 Coal Mine encompasses fine sandstone, coarse sandstone, conglomerate-bearing coarse sandstone, mudstone, and sandy mudstone. The analysis of data from hydrogeological borehole pumping tests revealed permeability coefficients ranging between 0.015 and 0.276 m/d, giving a permeability grade that ranged from weak to medium permeability (Figure 3c).

(4) Core recovery

The core recovery reflects the degree of fissure development in the strata and is measured by the ratio of the length of core taken to the drilling footage. The lower the core recovery rate is, the higher the degree of fracture development is, and the more broken the rock is. This suggests a larger water storage space per unit volume of aquifer, better groundwater runoff conditions, and a higher water content. The study area’s core recovery ranges from 0.66 to 0.96 (Figure 3d).

(5) Sand-mudstone interlayer number

The number of sandstone and mudstone interlayers reflects how aquifers and aquifuges overlap, and thus the strength of the hydraulic connection. Mudstone (siltstone) weakens the water abundance of the aquifer: for the same aquifer thickness, more sandstone and mudstone interlayers indicate more aquifuges, and the water abundance of the aquifer is weakened accordingly. In the northwest of the study area, the number of sandstone and mudstone interlayers is the smallest and the sand (gravel)-mud ratios are all greater than 0.5, indicating that the effective aquifer thickness is larger and the water abundance is greater there (Figure 3e).

(6) Fold fluctuation degree

The folds in the well field, especially the core and both sides of the syncline, have developed fissures. The surface water from the two wings converges to the middle and seeps down into groundwater. The groundwater flows along the layers and slopes to create a catchment space in the syncline, which favors groundwater recharge by the syncline structure. Generally speaking, increased fold undulation enhances water catchment efficacy and augments water abundance (Figure 3f).

F_{ud} = H - H_{min}

(1)

where F_{ud} denotes the fold fluctuation degree, H denotes the elevation of the bottom surface of the aquifer, and H_{min} denotes the minimum elevation of the bottom surface of the aquifer.
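Formula (1) can be illustrated with a minimal sketch. The function name and the elevation values below are hypothetical and are not data from the study.

```python
def fold_fluctuation(floor_elevations):
    """Return F_ud = H - H_min for each aquifer-floor elevation H (Formula (1))."""
    h_min = min(floor_elevations)
    return [h - h_min for h in floor_elevations]

# Hypothetical aquifer-floor elevations in metres (illustration only).
elevations = [412.0, 398.5, 405.0, 420.5]
f_ud = fold_fluctuation(elevations)
print(f_ud)  # [13.5, 0.0, 6.5, 22.0]
```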

3.2. Correlation Analysis of Factors Controlling Water Abundance

Correlation analysis is a statistical method used to judge the degree of correlation between two variables. The corresponding indicator is the correlation coefficient (r). The range of the correlation coefficient value is −1 ≤ r ≤ 1; the greater the absolute value of the correlation coefficient, the higher the degree of correlation between the two factors under control. A negative value indicates a negative correlation, while a positive value indicates a positive correlation [28,29].

In this study, the extent of correlation between the two variables controlling water abundance was assessed using the Pearson correlation coefficient method, utilizing the following formula:

r = \frac{\sum_{i=1}^{n} (X_{i} - \bar{X})(Y_{i} - \bar{Y})}{\sqrt{\sum_{i=1}^{n} (X_{i} - \bar{X})^{2}} \sqrt{\sum_{i=1}^{n} (Y_{i} - \bar{Y})^{2}}}

(2)

where n is the number of samples, X_{i} and Y_{i} refer to the ith sample values of X and Y, and \bar{X} and \bar{Y} are the mean values of the samples. When |r| ≤ 0.2, the two control factors are either extremely weakly correlated or uncorrelated. When 0.2 < |r| ≤ 0.4, the two control factors are weakly correlated. When 0.4 < |r| ≤ 0.6, the two control factors are moderately correlated. When 0.6 < |r| ≤ 0.8, the two control factors are highly correlated. When 0.8 < |r| ≤ 1, the two control factors are extremely highly correlated.

This paper uses 0.7 as the threshold. When the correlation between two factors exceeds 0.7, one of them is excluded and the other is retained for further analysis. The correlation coefficient between the permeability coefficient and the sand (gravel)-mud ratio is greater than 0.7 (Table 1), so one of these two factors must be eliminated. Because the permeability coefficient is less correlated with the remaining four control factors than the sand (gravel)-mud ratio is, the sand (gravel)-mud ratio indicator is excluded.
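The screening step can be sketched as follows, using the standard Pearson coefficient and the 0.7 threshold stated above. The function names and the borehole values are hypothetical and chosen only for illustration; they are not the values in Table 1.

```python
import math

def pearson_r(x, y):
    """Standard Pearson correlation coefficient of two equal-length samples."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def highly_correlated_pairs(factors, threshold=0.7):
    """Return (name_a, name_b, r) for every factor pair with |r| above the threshold."""
    names = list(factors)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson_r(factors[a], factors[b])
            if abs(r) > threshold:
                pairs.append((a, b, r))
    return pairs

# Hypothetical borehole values (illustration only, not data from the study).
factors = {
    "sand_mud_ratio": [0.35, 0.50, 0.62, 0.80, 0.92],
    "permeability":   [0.02, 0.08, 0.12, 0.21, 0.28],
    "core_recovery":  [0.90, 0.70, 0.96, 0.66, 0.85],
}
flagged = highly_correlated_pairs(factors)
for a, b, r in flagged:
    print(f"{a} vs {b}: r = {r:.3f}")
```

With these made-up values only the sand-mud ratio and permeability pair is flagged, so one of the two would be dropped before modelling.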

4. Model Establishment and Application

4.1. Principle of PSO-GA-BP Neural Network

The PSO algorithm concept originates from the study of how birds forage in flocks, which allows for optimal adaptation to the search behavior of particles in complex systems. In the PSO algorithm, particles possess solely two attributes: velocity and position. Their position and velocity are updated based on two extreme values: individual extreme value (P_best) and group extreme value (G_best). During the search process, particles are interconnected and share information with each other. The solution that every particle searches for individually is referred to as the individual extreme value, and the optimal individual extreme value within the particle swarm serves as the current global optimal solution. The particle swarm continues to iterate, updating velocity and position until the optimal solution that meets the termination condition is ultimately achieved [30,31].

Assuming that N particles form a swarm in an M-dimensional target search space, each particle attribute can be regarded as an M-dimensional vector, and the position of the ith particle can be expressed as:

x_{i} = (x_{i1}, x_{i2}, \dots, x_{iM}), i = 1, 2, \dots, N

The velocity of the ith particle can be expressed as:

v_{i} = (v_{i1}, v_{i2}, \dots, v_{iM}), i = 1, 2, \dots, N

The optimal position searched by the ith particle is the individual extreme value, which is expressed as:

P_{best} = (P_{i1}, P_{i2}, \dots, P_{iM}), i = 1, 2, \dots, N

The global optimal position searched by the particle swarm is the group extreme value, which is recorded as:

G_{best} = (P_{g1}, P_{g2}, \dots, P_{gM})

After searching for individual and group extremes, the particles update their speed and position according to the Formulas (3) and (4):

v_{i+1} = \omega v_{i} + c_{1} r_{1} (P_{best} - x_{i}) + c_{2} r_{2} (G_{best} - x_{i}), i = 1, 2, \dots, N

(3)

x_{i+1} = x_{i} + v_{i+1}

(4)

where ω is the inertia weight, which provides a balance between local and global search; v_{i} and v_{i+1} are the current and post-update velocities of each iteration, respectively; c_{1} and c_{2} are the learning factors, which are used to control the movement of particles in each iteration, and both are positive; r_{1} and r_{2} are random numbers in the range [0, 1]; x_{i} and x_{i+1} are the current and post-update positions of each iteration, respectively; and N is the number of particles.

The main steps of particle swarm optimization are as follows:

(1) Initialize the particle swarm. Initialize the parameters of the particle swarm, including the population size, particle position, velocity, individual extreme value, and group extreme value;
(2) Calculate the particle fitness values. Select a fitness function suited to the problem to be solved, and use it to calculate the fitness value of each particle in the initial swarm;
(3) Update the individual extreme value and the group extreme value. Compare the fitness value calculated in the previous step with the individual extreme value of the particle. If the fitness value is better, it becomes the individual optimal position of the particle; otherwise the original individual extreme value is kept until a better one appears. Then compare the individual extreme value with the group extreme value. If the individual extreme value is better, it becomes the global optimal position of the particle swarm; otherwise the original group extreme value is kept until a better individual extreme value appears;
(4) Update the position and velocity of the particles according to Formulas (3) and (4);
(5) Check the termination condition. If the termination condition set for the algorithm is not met, return to step 2; if it is met, proceed to the next step;
(6) Output the group extreme value, which is regarded as the global optimal solution of the particle swarm optimization.
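The six steps above can be sketched in a few lines of Python. This is a minimal illustration of a basic particle swarm, not the authors' implementation: the function names, the sphere test function, the fixed inertia weight, the parameter defaults, and the velocity clamping are all assumptions made for the example, and the random numbers r_{1} and r_{2} are drawn per dimension.

```python
import random

def pso(fitness, dim, n_particles=20, n_iter=100,
        w=0.7, c1=1.5, c2=1.5, v_max=1.0, bounds=(-5.0, 5.0), seed=0):
    """Minimise `fitness` with a basic particle swarm (Formulas (3) and (4))."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Step 1: initialise positions, velocities and the extreme values.
    x = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    v = [[rng.uniform(-v_max, v_max) for _ in range(dim)] for _ in range(n_particles)]
    p_best = [xi[:] for xi in x]
    p_best_fit = [fitness(xi) for xi in x]  # Step 2: initial fitness values
    g = min(range(n_particles), key=p_best_fit.__getitem__)
    g_best, g_best_fit = p_best[g][:], p_best_fit[g]

    for _ in range(n_iter):  # Step 5: stop after a fixed iteration budget
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Step 4: velocity update (3), clamped to [-v_max, v_max] ...
                vel = (w * v[i][d]
                       + c1 * r1 * (p_best[i][d] - x[i][d])
                       + c2 * r2 * (g_best[d] - x[i][d]))
                v[i][d] = max(-v_max, min(v_max, vel))
                # ... followed by the position update (4).
                x[i][d] += v[i][d]
            fit = fitness(x[i])
            # Step 3: update the individual and group extreme values.
            if fit < p_best_fit[i]:
                p_best[i], p_best_fit[i] = x[i][:], fit
                if fit < g_best_fit:
                    g_best, g_best_fit = x[i][:], fit
    return g_best, g_best_fit  # Step 6: output the group extreme value

def sphere(p):
    """Toy objective with its minimum of 0 at the origin."""
    return sum(t * t for t in p)

best_x, best_f = pso(sphere, dim=3)
print(best_x, best_f)
```

In the setting of this paper, the position vector would hold the weights and thresholds of the BP neural network and the fitness would be the network's output error.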

The PSO-GA-BP prediction model integrates the GA into the PSO algorithm, with the PSO algorithm serving as the primary component, to optimize the weights and thresholds of the BP neural network and thereby reduce the output error of the BP neural network (Figure 4).

Firstly, initialize the structure parameters of the BP neural network. The number of nodes in the input, hidden, and output layers must be determined according to the sample data. Weight and threshold for the BP neural network must also be established, with careful encoding of each parameter.

Particle population initialization. Initialize the parameters necessary for the PSO algorithm and the genetic algorithm, including the crossover probability (P_cross), mutation probability (P_mutation), maximum particle velocity (V_max), inertia weight ω, learning factors c_{1} and c_{2}, and iteration number N. Encode the particle velocity and position using the same rules as above. Set the particle population size to U. Then calculate the fitness value of each particle with the fitness function, and find and update the individual extreme value and the population extreme value.

Particle swarm iteration. Initially, the particle swarm optimization algorithm is used for the optimization iteration. The algorithm updates the particle velocity and position, arranges the fitness values in ascending order, and then divides the particles into three equal parts: U₁ denotes the well-adapted particles, U₂ the average ones, and U₃ the poorly adapted ones. Following the genetic algorithm, the well-adapted subpopulation U₁ is copied directly to the next generation, while the average subpopulation U₂ undergoes particle crossover with the crossover probability, using a velocity crossover operator and a position crossover operator. The fitness values of the particles before and after crossover are compared, and the better-adapted particles are copied to the next iteration. The velocities and positions of the poorly adapted subpopulation U₃ are randomly re-initialized with the mutation probability, and the particles obtained by mutation are put back into the particle population. The fitness value of each particle in the population is then recalculated and compared with its previous value to update P_best and G_best. These steps are repeated until the best fitness value reaches the convergence accuracy or the maximum number of iterations is reached, and the globally optimal solution is then output.

The primary parameter of the PSO algorithm is the inertia weight ω, which significantly affects the algorithm’s convergence performance. A larger value of the inertia weight results in stronger global search capabilities of the particle swarm, minimizing chances of falling into the local extreme point. Conversely, a smaller value of the inertia weight allows for faster convergence of the algorithm due to the stronger local convergence capabilities of the particle swarm. This paper uses adaptive inertia weights to monitor the real-time motion state of the particle population during each round of iteration in the PSO algorithm. Consequently, each particle’s inertia weights in the population are dynamically adjusted according to the motion state, thus diminishing the number of unproductive iterations whilst enhancing the PSO algorithm’s convergence performance [32,33,34].

Run the BP neural network. Replace the global optimal solution acquired in the preceding step into the BP neural network as the initial weights and thresholds of the BP neural network. Subsequently, execute the BP neural network to obtain the conclusive output of the PSO-GA-BP neural network.
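The genetic part of the particle swarm iteration described above (rank by fitness, split into three subpopulations, copy, cross over, mutate) can be sketched as follows. This is only an illustration under stated assumptions: the text does not give the form of the crossover operators, so arithmetic crossover with the next particle in the same subpopulation is assumed, and the function name `ga_step`, the probabilities, and the bounds are hypothetical.

```python
import random

def ga_step(swarm, fitness, p_cross=0.8, p_mut=0.1,
            bounds=(-5.0, 5.0), v_max=1.0, rng=random):
    """One GA pass over a swarm of (position, velocity) pairs; lower fitness is better."""
    ranked = sorted(swarm, key=lambda pv: fitness(pv[0]))
    third = len(ranked) // 3
    u1, u2, u3 = ranked[:third], ranked[third:2 * third], ranked[2 * third:]

    next_swarm = [(x[:], v[:]) for x, v in u1]  # U1: copied unchanged

    for k, (x, v) in enumerate(u2):  # U2: crossover with probability p_cross
        mate_x, mate_v = u2[(k + 1) % len(u2)]
        child_x, child_v = x[:], v[:]
        if rng.random() < p_cross:
            a = rng.random()  # assumed arithmetic crossover on position and velocity
            child_x = [a * s + (1 - a) * t for s, t in zip(x, mate_x)]
            child_v = [a * s + (1 - a) * t for s, t in zip(v, mate_v)]
        # Keep whichever of parent and child has the better fitness.
        if fitness(child_x) < fitness(x):
            next_swarm.append((child_x, child_v))
        else:
            next_swarm.append((x[:], v[:]))

    lo, hi = bounds
    for x, v in u3:  # U3: random re-initialisation with probability p_mut
        if rng.random() < p_mut:
            x = [rng.uniform(lo, hi) for _ in x]
            v = [rng.uniform(-v_max, v_max) for _ in v]
        next_swarm.append((x[:], v[:]))
    return next_swarm

def sphere(p):
    """Toy objective used only for this demonstration."""
    return sum(t * t for t in p)

rng = random.Random(1)
swarm = [([rng.uniform(-5, 5) for _ in range(2)], [0.0, 0.0]) for _ in range(9)]
new_swarm = ga_step(swarm, sphere, rng=rng)
print(len(new_swarm))  # 9
```

Because U₁ is copied unchanged, the best particle is never lost in this step, which is how the hybrid scheme keeps the memory that a plain GA lacks.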

4.2. Case Analysis

Following the aforementioned analysis, five control factors were used to evaluate the water abundance of the No. 3 coal roof aquifer, so the input layer comprised five neurons, and the unit water inflow was designated as the output layer. The initial number of neurons in the hidden layer, b, was determined using the empirical Formula (5), where n represents the number of input units, m represents the number of output units, and a is a constant within the range of [1, 10]. In this case, a is taken as 1.

b = \sqrt{n + m} + a

(5)
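As a quick check of Formula (5) with the values stated above (n = 5 input units, m = 1 output unit, a = 1), the initial hidden layer size works out as follows. How the result is rounded to a whole number of neurons is not stated in the text, so no rounding is applied here.

```latex
b = \sqrt{n + m} + a = \sqrt{5 + 1} + 1 \approx 3.45
```

Over the full range of a in [1, 10], the same formula gives values of b between about 3.45 and 12.45.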

The PSO-GA algorithm was applied to optimize the weights and thresholds of the BP neural network. Initially, the genetic algorithm was used to find suitable optimal weights and thresholds for the hidden layers generated by the PSO algorithm each time, with the following parameters: the population size was 20, the maximum number of iterations was 20, and the lower and upper bounds of the iterative objective function were 1 × 10⁻⁹ and 1 × 10⁻⁷. Then, the PSO algorithm was used to continue the optimization for better classification and prediction, and circle mapping was introduced for further optimization. Based on experience, the PSO algorithm converged by the second iteration, so the number of PSO iterations was set to 2, the number of particles to 5, the maximum velocity to 6, and the learning factors c_{1} and c_{2} both to 2.

To assess the model’s credibility, a sample data set comprising 100 sets of control factor data, including measured data, was obtained from the contour plots drawn in the previous section. The first 91 sets were used as the training set, while the remaining 9 sets were used as the test set to predict the unit water inflow. After training, the PSO-GA-BP neural network model achieved a root mean square error (RMSE) of 1.9 × 10⁻⁴ on the training data, 8.7 × 10⁻⁴ on the test data, and 5.1 × 10⁻⁴ overall. Furthermore, the model exhibited a training set fit of 0.9999.
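The 91/9 split and the RMSE metric can be sketched as follows. The sample indices and the inflow values are placeholders invented for the example, not the study's data.

```python
import math

def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)

# Placeholder sample indices: the first 91 sets train the model, the last 9 test it.
samples = list(range(100))
train, test = samples[:91], samples[91:]

# Hypothetical unit water inflow values (illustration only).
actual = [0.050, 0.080, 0.120, 0.140]
predicted = [0.051, 0.079, 0.121, 0.139]
error = rmse(actual, predicted)
print(len(train), len(test), round(error, 6))  # 91 9 0.001
```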

5. Discussion

To verify the superior performance of the PSO-GA-BP neural network model for water abundance prediction, the study conducted a comparative analysis with the traditional prediction method FAHP and other neural network models, including the BP and GA-BP neural network.

5.1. FAHP

Due to disregarding the imprecision of human subjective judgement, AHP frequently encounters issues of incongruity between judgement consistency and matrix consistency, as well as difficulties in conducting consistency tests during its application. Hence, FAHP is preferred as the conventional prediction approach for analysis. To ensure the consistency of forecasted results, we apply FAHP to anticipate and examine test set data. We assign five controlling factors for determining water abundance: aquifer thickness, permeability coefficient, core recovery, sand (gravel)-mudstone interlayer number, and fold fluctuation degree, which are then compared in pairs on a scale from 0.1 to 0.9 to generate the fuzzy complementary judgement matrix A.

A = a_{i j} =  [\begin{matrix} 0.50 & 0.40 & 0.40 & 0.60 & 0.70 \\ 0.60 & 0.50 & 0.50 & 0.60 & 0.60 \\ 0.60 & 0.50 & 0.50 & 0.60 & 0.60 \\ 0.40 & 0.40 & 0.40 & 0.50 & 0.60 \\ 0.30 & 0.40 & 0.40 & 0.40 & 0.50 \end{matrix}]

w_{i} = \frac{1}{n (n - 1)} (\sum_{j = 1}^{n} a_{i j} + \frac{n}{2} - 1), (i = 1, 2, \dots n)

(6)

W^{*} = {(w_{i j}^{*})}_{n \times n}

(7)

w_{i j}^{*} = \frac{w_{i}}{w_{i} + w_{j}} (i = 1, 2, \dots, n; j = 1, 2, \dots, n)

(8)

Applying matrix A and Formula (6), we achieve row sum normalization to determine the weights of each evaluation index w_{i} and acquire the weight vector W = (0.21, 0.22, 0.22, 0.19, 0.18)^{T}. To evaluate the plausibility of the matrix A and the weight vector W, we must also perform a consistency test. According to Formulas (7) and (8), the characteristic matrix W^{*} of the fuzzy complementary judgement matrix A can be expressed as:

W^{*} = {(w_{i j}^{*})}_{n \times n} =  [\begin{matrix} 0.50 & 0.49 & 0.49 & 0.52 & 0.54 \\ 0.51 & 0.50 & 0.50 & 0.53 & 0.55 \\ 0.51 & 0.50 & 0.50 & 0.53 & 0.55 \\ 0.48 & 0.47 & 0.47 & 0.50 & 0.52 \\ 0.46 & 0.45 & 0.45 & 0.48 & 0.50 \end{matrix}]

According to Equation (9), the compatibility index I is calculated as 0.1, and the fuzzy complementary judgment matrix A satisfies the consistency requirement:

I (A, W^{*}) = \frac{1}{n^{2}} \sum_{i = 1}^{n} \sum_{j = 1}^{n} |a_{i j} + w_{i j}^{*} - 1|

(9)
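The FAHP calculation can be reproduced with a short sketch that applies Formulas (6) to (9) exactly as written to the matrix A given above. The function names are hypothetical.

```python
A = [
    [0.50, 0.40, 0.40, 0.60, 0.70],
    [0.60, 0.50, 0.50, 0.60, 0.60],
    [0.60, 0.50, 0.50, 0.60, 0.60],
    [0.40, 0.40, 0.40, 0.50, 0.60],
    [0.30, 0.40, 0.40, 0.40, 0.50],
]

def fahp_weights(a):
    """Formula (6): weights from the row sums of a fuzzy complementary matrix."""
    n = len(a)
    return [(sum(row) + n / 2 - 1) / (n * (n - 1)) for row in a]

def characteristic_matrix(w):
    """Formulas (7) and (8): w*_ij = w_i / (w_i + w_j)."""
    return [[wi / (wi + wj) for wj in w] for wi in w]

def compatibility_index(a, w_star):
    """Formula (9): mean of |a_ij + w*_ij - 1| over all entries."""
    n = len(a)
    return sum(abs(a[i][j] + w_star[i][j] - 1)
               for i in range(n) for j in range(n)) / n ** 2

w = fahp_weights(A)
w_star = characteristic_matrix(w)
index = compatibility_index(A, w_star)
print([round(wi, 3) for wi in w])
print(round(index, 2))  # 0.1
```

The unrounded weights come out as about 0.205, 0.215, 0.215, 0.19, and 0.175, which agree with the reported W to within rounding, and the compatibility index is about 0.10.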

The unit inflow value is derived from the dimensionless value of each assessment index and the calculated weight of each index using Equation (10):

V = \sum_{i=1}^{n} w_{i} y_{i}

(10)

where y_{i} is the dimensionless value of the ith indicator. After comparison with the actual values, the predicted water abundance classes of sample numbers 6, 7, and 9 do not match the actual values, and the accuracy of the model prediction is 67% (Table 2).

5.2. Other Neural Network Prediction Models

To compare and analyze the prediction accuracy of the PSO-GA-BP neural network prediction model with other neural network prediction models, the same sample data set as the PSO-GA-BP model was selected, the first 91 sample data sets were the model training set, and the remaining 9 sample data sets (measured data) were the test set for the BP neural network and GA-BP neural network training.

Firstly, the number of hidden layer nodes of the BP and GA-BP neural networks was selected by training over the candidates given by the empirical formula; after many training runs, the mean square error was minimized to 2.43 × 10⁻⁵ when the number of hidden layer nodes was 7. The structure of the BP neural network was therefore 5-7-1, and the length of the genetic algorithm coding was calculated according to Equations (11)–(13):

S_{1} = A \times B + B \times C

(11)

S_{2} = B + C

(12)

S_{3} = S_{1} + S_{2}

(13)

where A, B, and C are the number of neurons in the input, hidden, and output layers, respectively, S₁ is the number of neural network weights, which is 42 in this paper, S₂ is the number of thresholds, which is 8 in this paper, and S₃ is the length of the genetic algorithm coding, which is 50 in this paper.
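Equations (11)–(13) can be checked for the 5-7-1 structure with a few lines of code. The function name is hypothetical.

```python
def ga_coding_length(n_input, n_hidden, n_output):
    """Equations (11)-(13): number of weights, number of thresholds, coding length."""
    s1 = n_input * n_hidden + n_hidden * n_output  # weights
    s2 = n_hidden + n_output                       # thresholds
    return s1, s2, s1 + s2

print(ga_coding_length(5, 7, 1))  # (42, 8, 50)
```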

The BP neural network was constructed according to the optimal number of hidden layer nodes obtained above, the maximum number of network training times was 1000, the learning rate was 0.01, and the minimum error of the training target was 1 × 10⁻⁵. The weights and thresholds of the BP neural network were optimized using genetic algorithms, and the parameters of the genetic algorithms were initialized by setting the population size to 30, the maximum number of iterations to 50, the crossover probability to 0.8, and the mutation probability to 0.1.

According to the training and test results, the RMSE of the BP neural network was 5.95 × 10⁻³, and the RMSE of the GA-BP neural network was 1.53 × 10⁻³. Both models classified the water abundance of all test samples correctly (100% accuracy compared with the actual values), but their errors were comparatively large and could readily cause misclassification of the water abundance grade. The prediction accuracy of unit water inflow plays an important role in the prevention and control of roof water hazards, especially for weakly cemented strata, which have low strength and readily turn to mud and disintegrate on contact with water. Only by minimizing the prediction error can the hazard of coal seam mining be kept to a minimum. The actual and predicted values of the test set samples for each method, together with the corresponding errors, are compared in Figure 5, and the prediction results and errors of the three models on the test set are detailed in Table 3.

Based on the comparison results, the PSO-GA-BP neural network model exhibited the highest prediction accuracy, with a maximum error of 1.14 × 10⁻³ and a minimum error of 6.66 × 10⁻⁸. The model fits the measured sample values well, as shown by a relatively straight error curve and a narrow error interval. The GA-BP model ranked second, with a maximum error of 5.26 × 10⁻³ and a minimum error of 1.03 × 10⁻⁴. The BP neural network model was the least accurate, with a maximum error of 1.65 × 10⁻² and a minimum error of 6.25 × 10⁻⁴, and it exhibited a wider error interval.
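The ranking above can be reproduced from the error columns of Table 3. The sketch below takes the absolute value of each signed error and reports the smallest and largest magnitude per model; the variable names are illustrative. Because the BP error column in Table 3 is rounded to five decimal places, the BP minimum comes out as 6.3 × 10⁻⁴ here rather than the 6.25 × 10⁻⁴ quoted in the text.

```python
# Signed prediction errors of the nine test samples, copied from Table 3, in L/(s·m).
errors = {
    "BP": [-0.00063, 0.00460, 0.01655, -0.00104, 0.00211,
           0.00438, 0.00684, -0.00352, 0.01629],
    "GA-BP": [2.86e-4, -2.39e-3, 5.26e-3, -1.65e-3, 1.89e-3,
              -1.23e-3, -1.01e-3, -7.06e-4, 1.03e-4],
    "PSO-GA-BP": [-4.27e-5, 1.65e-5, 6.66e-8, -1.14e-3, 3.57e-5,
                  -3.53e-5, -5.01e-6, -8.88e-5, -6.86e-5],
}

# (minimum, maximum) absolute error for each model.
error_range = {}
for model, values in errors.items():
    magnitudes = [abs(v) for v in values]
    error_range[model] = (min(magnitudes), max(magnitudes))

# Rank the models from smallest to largest maximum absolute error.
ranked = sorted(error_range, key=lambda m: error_range[m][1])

for model in ranked:
    smallest, largest = error_range[model]
    print(f"{model}: min |error| = {smallest:.2e}, max |error| = {largest:.2e}")
```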

5.3. Prediction Zoning of Water Abundance

In this paper, the unit water influx values predicted by the PSO-GA-BP neural network model for the test set were classified with the natural breaks method in ArcGIS to zone the water abundance of the Dannanhu No. 7 Coal Mine in Hami (Figure 6). The class break values were 0.048, 0.087, 0.117, and 0.146. On this basis, the areas of stronger and strongest water abundance are mainly distributed in the north of the study area, that is, in the third partition, with a small part at the northern end of the first and second partitions and at the intersection of the fourth and fifth partitions.
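Four break values divide the unit water influx range into five classes. The sketch below assigns a class number to each value with a binary search over the break values. It assumes that each break value is the inclusive upper bound of its class and numbers the classes 1 (lowest) to 5 (highest); both conventions are illustrative assumptions, and the sample inputs are the PSO-GA-BP predictions from Table 3.

```python
from bisect import bisect_left

# Class break values from the text, in L/(s·m).
BREAKS = [0.048, 0.087, 0.117, 0.146]


def abundance_class(q):
    """Return a class number from 1 (lowest) to 5 (highest) for unit water influx q.

    Assumption: a value equal to a break falls in the lower class.
    """
    return bisect_left(BREAKS, q) + 1


# PSO-GA-BP predictions for test samples 1-9 (Table 3).
predicted = [0.1658, 0.0401, 0.0098, 0.1875, 0.0198, 0.0405, 0.0077, 0.1194, 0.0315]
classes = [abundance_class(q) for q in predicted]
print(classes)  # [5, 1, 1, 5, 1, 1, 1, 4, 1]
```

The zoning map itself is drawn over the whole mine area, so this point-wise classification only illustrates how the break values are applied.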

The area of strong water abundance is consistent with the distribution of aquifer thickness, is similar to the areas of maximum permeability coefficient and fold fluctuation, and lies close to the area of higher core recovery, indicating that water abundance is positively correlated with these four main influencing factors. It also lies close to the area with the fewest sand-mudstone interlayers, indicating that water abundance is negatively correlated with this factor.

To verify the model’s reliability, we took the actual unit water influx values of the same test set samples and divided the mine area’s water abundance using the boundary values obtained from the model predictions. The resulting zoning map based on the actual unit water inflow values (Figure 6b) shows slightly larger areas of strong, stronger, weak, and weaker water abundance. However, the overall difference is minimal, which supports the model’s reliability.

6. Conclusions and Forecast

To address roof water hazards, a prediction model for the water abundance of a coal seam roof aquifer was developed using a PSO-GA-BP neural network. The model considers the five primary factors that influence aquifer water abundance and integrates the genetic algorithm into the PSO algorithm, continually searching for the optimal weights and thresholds of the hidden layer generated by the PSO algorithm. With the particle swarm optimization algorithm as the main body, an adaptive inertia weight is adopted to optimize the weights and thresholds of the BP neural network and thereby minimize the output error.

Compared with the traditional FAHP method and the BP and GA-BP neural network prediction models, this model does not require a subjective evaluation of each control factor, and it compensates for the tendency of the BP neural network to fall into a local optimum and of the GA-BP neural network to miss the optimum point. The inclusion of an adaptive inertia weight in the PSO algorithm results in higher prediction accuracy and a more favorable forecasting outcome. Furthermore, the main control factors used in the model can be obtained from every construction borehole without geophysical surveying, which significantly reduces cost.

According to the verified test set data, the model’s predicted values closely match the actual values: the maximum error is only 1.14 × 10⁻³, and the prediction accuracy is as high as 99.99%. Using the predicted unit water influx values, the water abundance of the 3-coal roof aquifer was partitioned, showing that areas with greater water abundance are concentrated in the third partition. Comparing the partition obtained from the predicted values with that obtained from the actual unit water inflow values shows that the prediction accuracy on the test set is high and that the water abundance partition is consistent with the actual situation.

The prediction model still has shortcomings. Because it uses only a small number of main control factors, it remains unclear whether more control factors would yield better predictions or whether a different combination of main control factors would be more suitable. Further research is needed to determine whether the model can be optimized further.

Author Contributions

Conceptualization, X.D. and X.L.; methodology, X.L.; software, X.D.; validation, Y.Z. and X.M.; formal analysis, Y.Z.; investigation, X.M.; resources, Y.H.; data curation, L.L.; writing—original draft preparation, X.D. and X.L.; writing—review and editing, Y.Z. and L.L.; visualization, W.L.; supervision, W.L.; project administration, X.L.; funding acquisition, X.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (Grant No. 42372316, No. 41602309) and China Coal’s “Enlisting and Leading” project (No. 2022JB01). The recipient of all funding is Xiaoqin Li.

Data Availability Statement

No new data were created or analyzed in this study. Data are contained within the article.

Acknowledgments

The authors express their gratitude to everyone that provided assistance for the present study.

Conflicts of Interest

Author Yuguang Zhang was employed by National Investment Hami Energy Development Co., Ltd. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figure 1. Study area. (a) Location of Hami City, Xinjiang; (b) The partition map of the Dannanhu No. 7 Coal Mine; (c) Location of the Dannanhu No. 7 Coal Mine in Hami City; (d) Schematic diagram of main coal seam and aquifer.

Figure 2. Thesis workflow chart.

Figure 3. Contour map of water abundance control factors. (a) Aquifer thickness; (b) Sand (gravel)-mud ratio; (c) Permeability coefficient; (d) Core recovery; (e) Sand-mudstone interlayer number; (f) Fold fluctuation degree.

Figure 4. Flowchart of PSO-GA algorithm to optimize BP neural network.

Figure 5. Analysis of neural network prediction results. (a) Comparison between actual value and the predicted value of test set; (b) Comparison of prediction errors of test set.

Figure 6. Water abundance zoning of coal roof aquifer. (a) The data comes from the test set predictions; (b) The data comes from the actual value.

Table 1. Pearson’s correlation coefficients between control factors.

	Aquifer Thickness	Sand (Gravel)-Mud Ratio	Permeability Coefficient	Core Recovery	Sand-Mudstone Interlayer Number	Fold Fluctuation Degree
Aquifer thickness	1
Sand (gravel)-mud ratio	0.105	1
Permeability coefficient	0.129	0.781 *	1
Core recovery	0.136	0.113	−0.097	1
Sand-mudstone interlayer number	−0.055	−0.617	−0.601	−0.118	1
Fold fluctuation degree	−0.182	0.556	0.549	0.520	−0.152	1

Note: * indicates that the correlation is significant at the 0.05 level (two-tailed; p < 0.05).

Table 2. FAHP evaluation and prediction of water abundance.

Sample Number	Actual Value (L/(s·m))	Actual Water Abundance Class	Predicted Value (L/(s·m))	Predicted Water Abundance Class	Error (L/(s·m))
1	0.1659	Medium	0.1537	Medium	−0.0122
2	0.04	Weak	0.0997	Weak	0.0597
3	0.0097	Weak	0.0684	Weak	0.0587
4	0.1887	Medium	0.1434	Medium	−0.0453
5	0.0198	Weak	0.0968	Weak	0.0770
6	0.0406	Weak	0.1087	Medium	0.0681
7	0.0077	Weak	0.1034	Medium	0.0957
8	0.1196	Medium	0.1128	Medium	−0.0068
9	0.0316	Weak	0.1132	Medium	0.0816

Table 3. Results of neural network prediction (unit: L/(s·m)).

Sample Number	Actual Value	BP	BP Error	GA-BP	GA-BP Error	PSO-GA-BP	PSO-GA-BP Error
1	0.1659	0.1653	−0.00063	0.1662	2.86 × 10⁻⁴	0.1658	−4.27 × 10⁻⁵
2	0.0401	0.0447	0.00460	0.0377	−2.39 × 10⁻³	0.0401	1.65 × 10⁻⁵
3	0.0098	0.0263	0.01655	0.0150	5.26 × 10⁻³	0.0098	6.66 × 10⁻⁸
4	0.1887	0.1870	−0.00104	0.1876	−1.65 × 10⁻³	0.1875	−1.14 × 10⁻³
5	0.0198	0.0219	0.00211	0.0217	1.89 × 10⁻³	0.0198	3.57 × 10⁻⁵
6	0.0406	0.0449	0.00438	0.0393	−1.23 × 10⁻³	0.0405	−3.53 × 10⁻⁵
7	0.0077	0.0146	0.00684	0.0067	−1.01 × 10⁻³	0.0077	−5.01 × 10⁻⁶
8	0.1195	0.1160	−0.00352	0.1188	−7.06 × 10⁻⁴	0.1194	−8.88 × 10⁻⁵
9	0.0316	0.0479	0.01629	0.0317	1.03 × 10⁻⁴	0.0315	−6.86 × 10⁻⁵

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
