Article

Forest-Fire-Risk Prediction Based on Random Forest and Backpropagation Neural Network of Heihe Area in Heilongjiang Province, China

1 College of Forestry, Northeast Forestry University, Harbin 150040, China
2 Heilongjiang Shengshan National Nature Reserve Service Center, Heihe 164300, China
3 School of Electrical Engineering, Heilongjiang University, Harbin 150080, China
* Author to whom correspondence should be addressed.
Forests 2023, 14(2), 170; https://doi.org/10.3390/f14020170
Submission received: 2 December 2022 / Revised: 13 January 2023 / Accepted: 15 January 2023 / Published: 17 January 2023
(This article belongs to the Section Forest Ecology and Management)

Abstract

Forest fires are important factors that influence and restrict the development of forest ecosystems. In this paper, forest-fire-risk prediction was studied based on random forest (RF) and backpropagation neural network (BPNN) algorithms. The Heihe area of Heilongjiang Province is one of the key forest areas and forest-fire-prone areas in China. Based on daily historical forest-fire data from 1995 to 2015, daily meteorological data, topographic data and basic geographic information data, the main forest-fire driving factors were first analyzed by using RF importance characteristic evaluation and logistic stepwise regression. Then, the prediction models were established by using the two machine learning methods. Furthermore, the goodness of fit of the models was tested using the receiver operating characteristic test method. Finally, the fire-risk grades were divided by applying the kriging method. The results showed that 11 driving factors were significantly correlated with forest-fire occurrence, and days after the last rain, daily average relative humidity, daily maximum temperature, daily average water vapor pressure, daily minimum relative humidity and distance to settlement had a high correlation with the risk of forest-fire occurrence. The prediction accuracy of the two algorithms in regard to fire points was higher than that for nonfire points. The overall prediction accuracy and goodness of fit of the RF and BPNN algorithms were similar. The two methods were both suitable for forest-fire occurrence prediction. The high-fire-risk zones were mainly concentrated in the northwestern and central parts of the Heihe area.

1. Introduction

Forest ecosystems are related to the global carbon cycle and biochemical cycle, and the damage caused by forest fires to forest ecosystems is particularly serious [1]. Forest-fire forecasting is necessary for the management and control of forest fires [2,3]. The factors causing forest fires include meteorological factors, topography, fire sources and human activities. Determining the main driving factors is the key to establishing an effective prediction model of forest-fire occurrence [4,5,6]. Meanwhile, under the global climate crisis, extreme weather occurs frequently, and climate change has been reported to increase fire frequency and intensity [7,8,9,10,11]. The relationship between these factors and forest-fire occurrence is complex and possibly nonlinear.
Research on forest-fire prediction emerged in the 1920s, and the United States, Canada, Australia, Russia and other countries have attached great importance to it [5,12,13,14,15]. Many approaches have been proposed, including probabilistic, deterministic and empirical ones. Mining historical fire data with mathematical and statistical methods to build a spatio-temporal prediction model linking forest fires and their driving factors is one of the most commonly used approaches to forest-fire occurrence prediction. The most widely used model for simulating fire behavior is the Rothermel model [16], a semi-physical model that has been applied in many areas with good prediction accuracy. Based on available meteorological conditions and environmental observations, a multi-agent forest-fire decision support system integrating prediction, detection and management was proposed in Ref. [17]. A point process framework was developed in Ref. [18] for wildfire ignitions observed in Mediterranean France; the model revealed significant covariate effects in southern France and pointed to the influence of abnormally high temperatures and low precipitation on the risk of fire occurrence. Applying GIS and remote sensing techniques, wildland fire risk was mapped in several regions of Spain and an effective fire management system was proposed in Ref. [19]. A new machine learning method based on artificial neural networks was proposed in Ref. [20] to build a GIS database of tropical forest fires and a spatial model of forest-fire risk in Lam Dong Province, Vietnam. The incidence of fire in the Valmiki Tiger Reserve (Himalayan foothills) was studied with remote sensing techniques in Ref. [21]. In Ref. [22], the spatio-temporal change of fire risk and danger potential was studied for northwestern Turkey using Landsat imagery. In Ref. [23], a fire risk map was generated with GIS technologies and evaluated using vegetation, topographic and human factors for the Yeşilova Forestry Enterprise in Kahramanmaraş, Turkey. Taking anthropogenic and physical factors into account, a forest-fire risk model for the northwestern Anatolia region of Turkey was analyzed in Ref. [24] based on GIS, remote sensing and the analytic hierarchy process.
Throughout the fire management decision-making process, it is important to understand the spatial distribution of fires and to identify the human and environmental factors that contribute to fire occurrence in different regions and at different scales [25]. Identifying the main drivers of fire occurrence is also essential for understanding the spatial pattern of wildfires and implementing effective fire management [26]. For the Mediterranean region, two linear models, based on fire occurrence probability and fire frequency, respectively, were proposed in Ref. [25] to assess the most important human and/or biophysical drivers. A brief fire history of eastern Kentucky, USA, was reconstructed in Ref. [27], which found that elevation and slope were significantly related to fire occurrence. A prediction model for the probability of lightning-fire occurrence was studied in Ref. [28]. An integrated dynamic spatio-temporal prediction of forest-fire occurrence in the western United States was performed in Ref. [29]. Geographically weighted regression was used in Ref. [30] to predict fire density and the spatial patterns of fire occurrence at regional and national levels in Mexico. A fire danger index model for north Lebanon was developed from meteorological indices in Ref. [31].
Traditional linear regression models are usually insufficient to reveal such complex relationships. Machine learning methods have therefore been widely used [32,33]; they can handle complex interactions among variables and fit nonlinear functions [34,35,36,37,38,39], and they have higher explanatory ability than traditional regression methods. The random forest (RF) algorithm is considered one of the best classification algorithms [40,41], and the backpropagation neural network (BPNN) is one of the most widely used neural networks. Both have been applied in a wide range of settings [42,43,44,45,46,47,48,49,50]. A machine learning methodology was developed for the spatial prediction of forest fires, with a case study of tropical forest fires in Lao Cai, Vietnam [37]. A Bayesian network model was used [38] to study the effects of temperature, relative humidity, wind speed, distance from settlements, tree species, distance from roads and other factors on the occurrence of forest fires. Traditional multiple linear regression and RF were applied to analyze fire occurrence at the European scale [39] based on fire density and various physical, socioeconomic and demographic variables; the RF model showed higher predictive ability than multiple linear regression. Using the maximum entropy algorithm with physical and human variables, human-caused fire occurrence was predicted [10] for northeastern Spain. The predictive abilities of a logistic regression model and a neural network algorithm were compared [42] for wasteland fire occurrence in central Portugal, and the neural network algorithm had higher prediction accuracy. Using artificial neural networks and support vector machines, two forest-fire occurrence prediction algorithms were developed and tested based only on cumulative precipitation and relative humidity [43]. A hybrid machine learning algorithm was proposed in Ref. [44] to map forest-fire susceptibility in northern Morocco. The mapping of forest-fire susceptibility zones with different machine learning algorithms has been a hot topic [45,46].
In recent years, machine learning methods have also been widely studied for forest-fire prediction in China. Using the BPNN algorithm, the daily, monthly and seasonal changes in forest-fire occurrence in Guangdong Province were predicted from meteorological factors in Ref. [47]. The main forest-fire driving factors in Shanxi Province were analyzed with the RF algorithm [48]. Forest-fire prediction models based on meteorological factors were established with the RF algorithm [51]. These studies found that the RF algorithm was superior to the logistic regression model in forest-fire prediction ability [48,49]. Forest-fire occurrence in Fujian Province was predicted from meteorological factors with the RF algorithm [50]. The risk of forest fires in southwestern China was predicted in Ref. [51] using the ant-miner algorithm. Convolutional neural networks were applied to predict forest-fire susceptibility in Yunnan Province in Ref. [52]. Although a variety of machine learning methods have been applied and developed for forest-fire occurrence prediction, they are not yet widely adopted in China, and comparisons among different machine learning methods are lacking.
The Heihe area is a large forest-fire prevention area in Heilongjiang Province with a high frequency and intensity of forest fires [53]. At present, research on forest fires in Heilongjiang Province has mainly focused on the Daxing’an Mountains, and little research on forest fires in the Heihe area has been reported. Research on forest-fire prediction in this area is of great significance for enhancing the forest-fire defense capability of northern Heilongjiang Province. Since many fire decisions are made at the zoning level, the ability to make daily predictions of fires in a given area is useful for many fire management applications [27]. Motivated by this analysis, and to identify forest-fire-prone areas more accurately, this study (i) compared the predictive performance of the RF and BPNN algorithms for forest fires in the Heihe area; (ii) analyzed the main forest-fire driving factors by combining RF importance characteristic evaluation and logistic stepwise regression; (iii) established probability prediction models of forest-fire occurrence, validated with the receiver operating characteristic test method; and (iv) divided the forest-fire-risk grades to provide technical support for scientific and effective fire prevention and suppression in this area.

2. Study Area and Data

2.1. Study Area

The Heihe area is located on the northeastern border of China, between 124°45′ E and 129°18′ E and between 47°42′ N and 51°03′ N, and spans the Daxing’an Mountains and Xiaoxing’an Mountains from north to south. In terms of structure, function, nature and status, the Heihe area is an important part of the natural ecosystem of the Daxing’an and Xiaoxing’an Mountains. It has the complete ecological characteristics of boreal temperate forests and a unique ecological location, with a forest coverage of 48.2%. The topography of the study area is shown in Figure 1. The forest types are broadleaf forest, coniferous forest and conifer-broadleaf forest. From 1995 to 2015, there were more than 700 forest fires, with an average annual burned area of approximately 1200 hm². Forest fires were mainly concentrated in March, April, May, June, September and October; only a few occurred in July and August, and none occurred in January, February or December. The mean temperature was about 0.8 °C, the mean wind speed was about 2.5 m s−1, and the mean precipitation was about 1.5 mm.

2.2. Data

The daily historical forest-fire data of the Heihe area were collected for 1995 to 2015. The largest fire in the record was a lightning fire that burned 185,698 hm². The causes of the forest fires are given in Figure 2.
When establishing the full sample of forest-fire data, ArcGIS 10.2 software was used to randomly select non-ignition points in time and space at a 1:1 ratio to ignition points [54]. Fire points were assigned “1”, and nonfire points were assigned “0”. The full sample contained 1418 data points.
The meteorological data were the daily records of four national meteorological stations in the Heihe area from 1995 to 2015, downloaded from the China Meteorological Data Sharing Network (http://www.cma.gov.cn/ accessed on 3 March 2016). They include daily average relative humidity (%), daily average wind speed (m s−1), daily average temperature (°C), daily average water vapor pressure (hPa), daily average air pressure (hPa), daily maximum temperature (°C), daily maximum wind speed (m s−1), daily minimum temperature (°C), daily minimum relative humidity (%), daily sunshine hours (h) and daily precipitation (mm). In addition, the number of days after the last rain was calculated from the meteorological data. In data processing, MATLAB software was used to calculate the distance between each sample point and each of the four meteorological stations from their geographic coordinates. The data of the nearest meteorological station were used as the meteorological data of each fire point or nonfire point, and the meteorological values were extracted according to the occurrence time of that point.
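The station matching and the "days after the last rain" variable described above can be sketched as follows. This is a minimal illustration in Python rather than the MATLAB code used in the paper. The haversine distance, the station coordinates, the function names and the convention that a rainy day counts as day 0 are all assumptions made for the example.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def nearest_station(point, stations):
    """Return the name of the station closest to point = (lat, lon)."""
    return min(stations, key=lambda name: haversine_km(*point, *stations[name]))

def days_since_last_rain(daily_precip_mm, threshold_mm=0.0):
    """For each day, count the days elapsed since the most recent rainy day.

    Assumption: a day with precipitation above threshold_mm counts as 0.
    Days before the first rain in the series are returned as None.
    """
    out, counter = [], None
    for p in daily_precip_mm:
        if p > threshold_mm:
            counter = 0
        elif counter is not None:
            counter += 1
        out.append(counter)
    return out

# Hypothetical station coordinates (lat, lon), for illustration only.
stations = {"A": (50.25, 127.45), "B": (48.5, 126.2)}
print(nearest_station((50.0, 127.0), stations))
print(days_since_last_rain([0.0, 2.1, 0.0, 0.0, 0.4, 0.0]))
```

Each fire or nonfire point would then take the meteorological record of its nearest station on its own date.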
The basic geographic information data were collected from the 1:250,000 national basic geographic databases provided by the National Geographic Information Resources Directory System website (http://www.webmap.cn/ accessed on 28 February 2022). In data processing, the analysis tool in ArcGIS 10.2 software was used to calculate the distance (m) from the fire points or nonfire points to the settlements, roads and railways.
Topographic factors were derived from ASTER GDEM Digital Elevation Model (DEM) data at 30 m resolution, provided by the Geospatial Data Cloud of China (http://www.gscloud.cn/ accessed on 28 February 2022). The 3D analysis tool in ArcGIS 10.2 software was used to extract and calculate the elevation (m) and slope of fire points and nonfire points from the DEM data. The forest-fire driving factors considered in the models are listed in Table 1.
To avoid prediction errors caused by differences in order of magnitude, the forest-fire impact factors were normalized to the same scale. The max–min method was used to transform the sample data into the range [0, 1]:
$$x_i' = \frac{x_i - x_{\min}}{x_{\max} - x_{\min}}$$
where $x_i$ is the sample value, $x_i'$ is its normalized value, $x_{\min}$ is the minimum value and $x_{\max}$ is the maximum value. The algorithms were implemented in MATLAB, and the maps were drawn with ArcGIS 10.2.
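As a small illustration of the max–min normalization, the following Python sketch rescales one variable. The function name and the sample humidity values are hypothetical, and returning zeros for a constant column is an assumption, since the paper does not discuss that case.

```python
def min_max_normalize(values):
    """Rescale a list of numbers to [0, 1] with (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant column carries no information, so map it to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical daily average relative humidity values (%).
humidity = [35.0, 60.0, 85.0]
print(min_max_normalize(humidity))  # [0.0, 0.5, 1.0]
```

In practice the minimum and maximum would be computed per driving factor, so that each column is scaled independently.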

3. Methods

3.1. Random Forest Algorithm

The RF algorithm is a highly flexible machine learning algorithm whose basic unit is a decision tree. By combining multiple trees through ensemble learning, the RF algorithm is robust to noise and does not easily overfit or underfit [49]. Let there be $n$ forest-fire samples and $m$ forest-fire driving factors. In the RF algorithm, $n_{tree}$ sample sets are drawn from the full sample data by bootstrap sampling, and one classification tree is built on each of them. At each node of each tree, $m_{try}$ factors are randomly selected, and the variable with the strongest classification ability among them is used for branching. The classification result of the RF (strong classifier) is finally obtained by voting over the trees (weak classifiers). Let $m_{try} = \sqrt{m}$, and take $n_{tree}$ large enough for the overall error rate to stabilize [55].
Assume that the training set is $D = \{(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)\}$. The final classification decision is given as:
$$H(x) = \arg\max_{y \in Y} \sum_{i=1}^{n_{tree}} I(h_i(x) = y)$$
where $h_i(x)$ is the $i$-th weak classifier, $Y$ is the set of class labels, and $I(\cdot)$ is the indicator function.
For the given classification models $m_1(x), m_2(x), \ldots, m_k(x)$, the training data of each classifier are sampled from the original data $(x, y)$. The margin function is computed by
$$mg(x, y) = av_k\, I(m_k(x) = y) - \max_{j \neq y} av_k\, I(m_k(x) = j)$$
where $av_k$ denotes the average over the classifiers. Then, the generalization error is defined by
$$PE = P_{x,y}(mg(x, y) < 0)$$
The simple flowchart of the RF algorithm is shown in Figure 3.
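The voting rule, the margin function and the resulting error can be illustrated with a few lines of Python. This is only a sketch of those three formulas applied to hypothetical tree votes. It does not build the trees themselves, and the function names are chosen for the example.

```python
from collections import Counter

def majority_vote(tree_votes):
    """H(x): the class label chosen by the most trees."""
    return Counter(tree_votes).most_common(1)[0][0]

def margin(tree_votes, true_label, classes=(0, 1)):
    """mg(x, y): vote share of the true class minus the best share of any other class."""
    n = len(tree_votes)
    share = {c: sum(v == c for v in tree_votes) / n for c in classes}
    return share[true_label] - max(share[c] for c in classes if c != true_label)

def generalization_error(all_votes, labels):
    """Empirical PE: fraction of samples whose margin is negative."""
    wrong = sum(margin(v, y) < 0 for v, y in zip(all_votes, labels))
    return wrong / len(labels)

# Hypothetical votes from 5 trees for 3 samples (1 = fire, 0 = nonfire).
all_votes = [[1, 1, 0, 1, 0], [0, 0, 0, 1, 0], [1, 0, 0, 0, 1]]
labels = [1, 0, 1]
print([majority_vote(v) for v in all_votes])    # [1, 0, 0]
print(generalization_error(all_votes, labels))  # one of the three samples has a negative margin
```

With an odd number of trees and two classes, ties cannot occur, so the majority vote is always well defined.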

3.2. Backpropagation Neural Network Algorithm

The BPNN algorithm is a supervised learning algorithm with strong nonlinear mapping ability and anti-interference ability. The number of neuron nodes in the input layer of the BPNN equals the number of forest-fire driving factors. There can be one or more hidden layers; here, one hidden layer is selected. The number of neuron nodes in the hidden layer was set to 11 based on experience and experiment. The number of neuron nodes in the output layer is 1. Too many neurons increase the training time and the risk of overfitting. Furthermore, the transfer function, summation unit and activation function among the layers of the neural network are determined. The amount of forest-fire data in this paper is suitable for applying the BPNN. The simple flowchart of the BPNN is given in Figure 4.
Training samples are input and passed forward layer by layer. The error between the actual output and the expected output is computed at the output layer and back-propagated, and the weights and thresholds are modified layer by layer until the error falls within the given accuracy range. The BPNN can be optimized by changing the network topology, the learning rate, the initial weights and the thresholds. The main calculation formulas of the BPNN are as follows. The transfer function is
$$f(x) = \frac{1}{1 + e^{-x}}$$
The error function is
$$E_p = \frac{1}{2} \sum_l (t_l - O_l)^2$$
where $t_l$ is the expected output and $O_l$ is the output calculated by the network.
The outputs of the neuron nodes in the hidden layer are calculated by
$$y_i = f\Big(\sum_j w_{ij} x_j - \theta_i\Big)$$
where $w_{ij}$ is the connection weight between input neuron node $j$ and hidden neuron node $i$, and $\theta_i$ is the threshold of hidden node $i$.
The outputs of the neuron nodes in the output layer are calculated by
$$O_l = f\Big(\sum_i T_{li} y_i - \theta_l\Big)$$
where $T_{li}$ is the connection weight between hidden neuron node $i$ and output neuron node $l$, and $\theta_l$ is the threshold of output node $l$. Moreover, $w_{ij}$, $\theta_i$, $T_{li}$ and $\theta_l$ are corrected iteratively, with the step size set by the learning rate.
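To make the forward pass and the weight correction concrete, the sketch below implements one hidden layer with a sigmoid transfer function, a single output node and one gradient-descent update on the squared error. It is a pure-Python illustration, not the paper's MATLAB implementation. The two hidden nodes (the paper uses 11), the initial weights, the learning rate and the function names are assumptions.

```python
import math

def sigmoid(z):
    """Transfer function f(x) = 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w, theta_h, T, theta_o):
    """Return hidden outputs y_i and the single network output O."""
    y = [sigmoid(sum(wij * xj for wij, xj in zip(wi, x)) - th)
         for wi, th in zip(w, theta_h)]
    o = sigmoid(sum(Ti * yi for Ti, yi in zip(T, y)) - theta_o)
    return y, o

def train_step(x, t, w, theta_h, T, theta_o, lr=0.5):
    """One gradient-descent update of all weights and thresholds for E = (t - O)^2 / 2."""
    y, o = forward(x, w, theta_h, T, theta_o)
    delta_o = (o - t) * o * (1.0 - o)                      # error signal at the output node
    delta_h = [delta_o * Ti * yi * (1.0 - yi) for Ti, yi in zip(T, y)]  # back-propagated signals
    new_T = [Ti - lr * delta_o * yi for Ti, yi in zip(T, y)]
    new_theta_o = theta_o + lr * delta_o                   # thresholds enter with a minus sign
    new_w = [[wij - lr * dh * xj for wij, xj in zip(wi, x)]
             for wi, dh in zip(w, delta_h)]
    new_theta_h = [th + lr * dh for th, dh in zip(theta_h, delta_h)]
    return new_w, new_theta_h, new_T, new_theta_o

# Hypothetical normalized inputs, target and initial parameters.
x, t = [0.2, 0.9], 1.0
w, theta_h = [[0.1, -0.3], [0.4, 0.2]], [0.0, 0.1]
T, theta_o = [0.3, -0.2], 0.05

_, before = forward(x, w, theta_h, T, theta_o)
params = train_step(x, t, w, theta_h, T, theta_o)
_, after = forward(x, *params)
print(0.5 * (t - before) ** 2, 0.5 * (t - after) ** 2)
```

Repeating `train_step` over all training samples until the error falls below a chosen tolerance corresponds to the training loop described in the text.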

3.3. RF Importance Characteristic Evaluation and Logistic Stepwise Regression

In the RF algorithm, the bootstrap resampling method was adopted, and the samples that were not drawn each time accounted for approximately 36.8% of the total samples, which were called out-of-bag (OOB) data. Then, they were taken as the test sample data to obtain the OOB estimation of the model [56].
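The figure of approximately 36.8% follows from the fact that a given sample is missed by all $n$ bootstrap draws with probability $(1 - 1/n)^n$, which tends to $1/e \approx 0.368$ as $n$ grows. The short Python check below compares this formula with one simulated bootstrap sample. The sample size of 1418 is the full sample count reported in the paper and is used here only as an illustrative size; the function names and the seed are assumptions.

```python
import random

def oob_fraction_exact(n):
    """Probability that a given sample is never drawn in n draws with replacement."""
    return (1.0 - 1.0 / n) ** n

def oob_fraction_simulated(n, seed=0):
    """Share of samples left out of one simulated bootstrap sample of size n."""
    rng = random.Random(seed)
    drawn = {rng.randrange(n) for _ in range(n)}
    return 1.0 - len(drawn) / n

n = 1418
print(oob_fraction_exact(n))
print(oob_fraction_simulated(n))
```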
The RF algorithm can evaluate the importance of feature variables from the OOB error rate. The principle is as follows: the OOB error rate of each classification tree $t$ is calculated on its OOB data ($\mathrm{errOOB}_t$). To evaluate the importance of the feature variable $X_i$, the other variables are kept unchanged, the values of $X_i$ are randomly permuted, and the OOB error rate of each tree on the permuted OOB data ($\mathrm{errOOB}_{it}$) is calculated. The importance of the feature variable is the average increase in the OOB error rate caused by the permutation [49].
The importance (VI) of variable $X_i$ is calculated as [49]:
$$VI(X_i) = \frac{1}{n_{tree}} \sum_t (\mathrm{errOOB}_{it} - \mathrm{errOOB}_t)$$
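The permutation idea behind this importance score can be sketched in Python as follows. This is a simplification. It uses a single stand-in classifier and averages over repeated shuffles of one evaluation set, whereas the RF procedure averages over trees, each evaluated on its own OOB data. The classifier rule, the data and the function names are hypothetical.

```python
import random

def error_rate(predict, X, y):
    """Fraction of rows whose predicted label differs from the true label."""
    return sum(predict(row) != label for row, label in zip(X, y)) / len(y)

def permutation_importance(predict, X, y, feature, n_repeats=20, seed=0):
    """Average increase in error rate after shuffling one feature column."""
    rng = random.Random(seed)
    base = error_rate(predict, X, y)
    increases = []
    for _ in range(n_repeats):
        column = [row[feature] for row in X]
        rng.shuffle(column)
        # Rebuild the rows with only the chosen feature permuted.
        X_perm = [row[:feature] + [v] + row[feature + 1:] for row, v in zip(X, column)]
        increases.append(error_rate(predict, X_perm, y) - base)
    return sum(increases) / n_repeats

# Hypothetical stand-in classifier: fire (1) when feature 0 exceeds 0.5.
predict = lambda row: 1 if row[0] > 0.5 else 0
X = [[0.1, 0.9], [0.2, 0.1], [0.3, 0.7], [0.4, 0.3],
     [0.6, 0.8], [0.7, 0.2], [0.8, 0.6], [0.9, 0.4]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

print(permutation_importance(predict, X, y, feature=0))  # positive: the rule depends on it
print(permutation_importance(predict, X, y, feature=1))  # 0.0: the rule ignores it
```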
Logistic stepwise regression is a common method used to select independent variables for linear models. In this paper, the input variable index systems for the forest-fire prediction model were established by using logistic stepwise regression and RF importance characteristic evaluation.

3.4. Receiver Operating Characteristic Curve (ROC)

The receiver operating characteristic (ROC) curve is a commonly used accuracy evaluation tool [57,58]. The curve is plotted with the false-positive rate (the probability of judging an actual negative as positive) as the horizontal coordinate and the true-positive rate (the probability of judging an actual positive as positive) as the vertical coordinate. In this paper, the area under the ROC curve (AUC) was used as a measure of the predictive capability and the goodness of fit of the models. Usually, AUC > 0.7 is considered meaningful, and the closer the AUC value is to 1, the better the corresponding model fit [48].
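The AUC equals the probability that a randomly chosen fire point receives a higher predicted score than a randomly chosen nonfire point, with ties counted as one half. The Python sketch below computes it directly from that pairwise definition. The labels and scores are hypothetical, and this pairwise form is chosen for clarity rather than speed.

```python
def roc_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical predicted fire probabilities for three fire and three nonfire points.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
print(roc_auc(labels, scores))  # 8 of the 9 pairs are ranked correctly
```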

4. Results and Analysis

4.1. Forest-Fire Driving Factors

The temporal extent, the spatial extent and the chosen drivers all affect the accuracy of forest-fire occurrence prediction. Once the temporal and spatial scales are fixed, the selected forest-fire drivers directly determine the accuracy of the prediction models. Therefore, establishing the forest-fire drivers was the key to building an accurate forest-fire occurrence prediction model.
To obtain models with strong generalization performance, the full sample dataset was randomly divided into a training set (60%) and a validation set (40%) five times. The resulting five subsets of the data were used to extract the important feature variables.
Logistic stepwise regression was used to screen out the forest-fire driving factors with a significance level of p < 0.05 in each of the five subsets. The significant characteristic variables that appeared more than 3 times in the five stepwise regressions formed the logistic variable index system and entered the cross-validation analysis of the algorithms. Logistic regression was then performed on the full sample set to calculate the significance level of each variable in the logistic variable index system. The variables and their significance levels were: daily maximum temperature (p < 0.0001), days after the last rain (p < 0.0001), daily average relative humidity (p < 0.0001), daily average water vapor pressure (p < 0.0001), distance to settlement (p < 0.0001), daily minimum relative humidity (p < 0.001), elevation (p < 0.001) and daily average air pressure (p < 0.05). Logistic stepwise regression was implemented in R.
In the RF importance characteristic evaluation, the higher the scores of variables are, the greater the impact on forest-fire occurrence and the greater the importance. In the five subsets of data samples, according to the score of feature importance, the insignificant variables were removed to reconstruct the RF, and the set of variables with the smallest OOB error was selected as the variables of this sample set [54]. Combining the screening results of the five subsets of data samples, the significant feature variables that appeared more than 3 times together were screened out, as shown in Figure 5. The scores of important variables that appeared more than 3 times were screened out by importance ranking, and the variables were screened one by one in the full sample set with the criterion of minimum OOB error. Then, the RF variable index system (importance score) was obtained, including days after the last rain (2.95), daily average relative humidity (2.63), daily maximum temperature (1.61), daily precipitation (1.38), daily minimum relative humidity (1.16), daily average temperature (1.13), daily minimum temperature (1.09), daily average water vapor pressure (1.06), and distance to settlement (0.99). The RF importance characteristic evaluation was implemented by using MATLAB software.
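The rule of keeping only variables that recur across the five random subsets can be sketched as a simple count. The Python example below uses hypothetical selection results and variable names. The threshold is left as a parameter because "more than 3 times" could be read as at least 4 or, more loosely, at least 3.

```python
from collections import Counter

def stable_variables(selections, min_count):
    """Return the variables selected in at least min_count of the runs, sorted by name."""
    counts = Counter(v for sel in selections for v in set(sel))
    return sorted(v for v, c in counts.items() if c >= min_count)

# Hypothetical screening results from five random subsets.
runs = [
    {"days_after_rain", "avg_rh", "max_temp"},
    {"days_after_rain", "avg_rh", "elevation"},
    {"days_after_rain", "max_temp", "avg_rh"},
    {"days_after_rain", "avg_rh", "dist_settlement"},
    {"days_after_rain", "max_temp"},
]
print(stable_variables(runs, min_count=4))  # ['avg_rh', 'days_after_rain']
```

Screening on repeated subsets in this way favors variables whose significance does not depend on one particular random split.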
The forest-fire driving factors with a significance level of p < 0.001 in the logistic stepwise regression were also among the most important drivers in the RF importance characteristic evaluation. They could therefore be judged to be the most important factors influencing forest-fire occurrence and were used as an integrated variable index system. In total, three variable index systems of forest-fire factors were obtained, as shown in Table 2. In the RF variable index system, the importance of days after the last rain and daily average relative humidity was markedly greater than that of the other variables, followed by daily maximum temperature. In general, meteorological factors, especially humidity and temperature, had the greatest influence on the occurrence of forest fires. Human activity factors, such as distance to settlement, distance to road and distance to railway, and topographic factors were less influential than meteorological factors. No topographic factor was retained in the RF variable index system, and in the logistic variable index system the significance of elevation was lower than that of distance to settlement. In other words, ignitions were concentrated close to settlements.
According to Figure 2, the main cause of forest fires was human activity, accounting for about 67.4% of fires; burning stubble on agricultural land ranked first, followed by smoking, with hunting or cooking in the wild in third place. Lightning strikes and power lines were also relevant causes of forest fires, accounting for about 3.5%. Furthermore, about 27.5% of forest fires were unexplained. The forest-fire occurrence prediction in this paper therefore mainly concerned human-caused fires. The prediction accuracy and goodness of fit of the models are analyzed in the following subsection.

4.2. Prediction Accuracy and Goodness of Fit of the Models

The AUC value calculated by the ROC curve and the prediction accuracy were used to conduct the cross-validation analysis and the comparison between the RF and BPNN models under the three variable index systems. The full sample data were randomly divided into a 60% training set and a 40% validation set, and the models were trained by the training set and validated by the validation set under the three variable index systems. To provide a more objective evaluation of the models, the average prediction accuracies and average AUC values of the validation sets of 100 Monte Carlo experiments were taken as the prediction accuracy and the goodness of fit of the models, respectively.
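The repeated random-split evaluation can be sketched as below. This Python example averages validation accuracy over 100 random 60/40 splits. The threshold classifier is only a stand-in for the RF and BPNN models, and the data, seed and function names are hypothetical.

```python
import random

def split_indices(n, train_frac, rng):
    """Shuffle 0..n-1 and split into training and validation index lists."""
    idx = list(range(n))
    rng.shuffle(idx)
    cut = int(round(train_frac * n))
    return idx[:cut], idx[cut:]

def monte_carlo_accuracy(X, y, fit, n_runs=100, train_frac=0.6, seed=0):
    """Mean validation accuracy over n_runs random train/validation splits."""
    rng = random.Random(seed)
    accs = []
    for _ in range(n_runs):
        tr, va = split_indices(len(y), train_frac, rng)
        predict = fit([X[i] for i in tr], [y[i] for i in tr])
        accs.append(sum(predict(X[i]) == y[i] for i in va) / len(va))
    return sum(accs) / len(accs)

def fit_threshold(X_train, y_train):
    """Stand-in model: threshold feature 0 midway between the two class means."""
    m0 = [x[0] for x, t in zip(X_train, y_train) if t == 0]
    m1 = [x[0] for x, t in zip(X_train, y_train) if t == 1]
    cut = (sum(m0) / len(m0) + sum(m1) / len(m1)) / 2
    return lambda x: 1 if x[0] > cut else 0

# Hypothetical one-feature data set: 10 nonfire points followed by 10 fire points.
X = [[i / 20] for i in range(20)]
y = [0] * 10 + [1] * 10
print(monte_carlo_accuracy(X, y, fit_threshold))
```

Averaging over many splits reduces the dependence of the reported accuracy on any single random partition.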
Table 3 shows the prediction accuracy and AUC values of the two machine learning methods under the three variable index systems. The prediction accuracy of the RF algorithm was between 87.91% and 88.98% across the three variable index systems. The accuracy of predicting ignition points (“1”) was approximately 3.99%–4.49% higher than that of non-ignition points (“0”), and the highest accuracy was 91.24%. The lowest AUC was 0.946, and the highest AUC was 0.955.
The prediction accuracy of the BPNN algorithm across the three variable index systems was between 86.01% and 86.94%. The accuracy of predicting ignition points (“1”) was approximately 3.43%–6.03% higher than that of non-ignition points (“0”). The lowest AUC was 0.930, and the highest AUC was 0.939.
The prediction accuracies and AUC values of the two machine learning methods were similar under the three variable index systems. Both models had high goodness of fit and prediction accuracy, and both predicted ignition points more accurately than non-ignition points. Both are therefore suitable for predicting and analyzing the occurrence probability of forest fires.
The results showed that the prediction accuracy and AUC values of both algorithms were highest under the RF variable index system. In terms of prediction accuracy, the RF algorithm was 1.03%–2.41% higher than the BPNN algorithm. The RF algorithm under the RF variable index system had the highest overall prediction accuracy and goodness of fit.
However, the integrated variable index system, which relies only on the four main forest-fire drivers, achieved an average prediction accuracy close to that of the RF algorithm under the RF variable index system. It was therefore determined to be the best variable index system for forest-fire prediction in the Heihe area, and its variables were the main driving factors affecting the occurrence of forest fires. In summary, the RF algorithm based on the integrated variable index system was determined to be the best forest-fire prediction model. It should be noted, however, that the prediction accuracies of the two algorithms differ only slightly across the three variable index systems, and the proposed method does not bring much improvement across the three variable index systems.
According to the established probability prediction model, the probability of forest-fire occurrence was divided into five grades [59], as shown in Table 4.
The RF and BPNN algorithms based on the integrated variable index system were used to calculate the predicted probability for the full sample data, and the kriging method in ArcGIS 10.2 was used to interpolate these probabilities and classify the fire-risk grades. Figure 6 shows the ROC curves of the two models.
Figure 7 shows the distribution of forest-fire occurrence probability and fire-risk distribution based on the RF algorithm. Figure 8 shows the distribution of forest-fire occurrence probability and fire-risk distribution based on the BPNN algorithm. The general trends of the spatial distribution maps of forest fire occurrence probability corresponding to the two models obtained in this study are basically consistent, but some local differences are also apparent, which are related to the specific implementation mechanisms of each of the two models. The forest-fire-prone areas in the Heihe area were mainly concentrated in the northwestern and central areas.

5. Discussion

In the selection of forest-fire factors, days after the last rain, daily average relative humidity, daily maximum temperature, daily average water vapor pressure, daily minimum relative humidity and distance to settlement were the most significant factors in the logistic stepwise regression (significance level of p < 0.0001) and were also among the most important factors in the RF importance characteristic evaluation. They could therefore be judged as the most important factors affecting the occurrence of forest fires in the Heihe area, and forest-fire prediction models with high estimation accuracy could be obtained from these six driving factors. Daily minimum relative humidity, elevation, daily average temperature, daily average air pressure, daily minimum temperature and daily precipitation appeared in the variable selection process and had some influence on forest-fire occurrence. Days after the last rain, daily average relative humidity and daily maximum temperature, all meteorological factors, had the greatest influence on forest-fire occurrence, followed by distance to settlement; topographic factors had the least influence. Humidity variation and temperature variation had significant effects on forest-fire occurrence, which is consistent with previous research results [54]. Distance to settlement had a significant effect on forest-fire occurrence, which can be attributed to the highly intertwined agriculture and forestry in the Heihe area. According to Figure 2, the main cause of forest fires was human activity, which accounted for about 67.4% of forest fires. Lightning strikes and power lines were also relevant causes: 15 forest fires were caused by lightning and 10 by power lines. Furthermore, about 27.5% of forest fires were unexplained. Therefore, this paper mainly focused on human-caused fires.
Cross-validation in the three variable index systems showed that the prediction accuracies of the RF and BPNN algorithms differed little. Both achieved high prediction accuracy and goodness of fit, with average prediction accuracy between 86.01% and 88.98% and AUC values between 0.930 and 0.955, so both are suitable for forest-fire occurrence prediction. The fire-risk distribution results showed that forest-fire-prone areas were mainly concentrated in the northwestern and central parts of the Heihe area. At present, forest-fire-prone areas still lack effective forest-fire prediction and evaluation systems, and there is no effective technical support for resource allocation. In these areas, the local emergency management departments can act in advance, make overall arrangements, strictly manage fire sources, eliminate fire risks and issue forest-fire early warnings. Forest-fire-prone areas should also be equipped with more firefighting resources and fire towers, and fire prevention planning and the allocation of related materials can follow the predicted fire probability, which saves human, material and financial resources and helps improve the efficiency of forest-fire monitoring and prevention.
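The two quantities reported above, per-class prediction accuracy and AUC, can be illustrated with a minimal sketch. It is not the authors' evaluation code: the function names, the 0.5 decision threshold, the toy data and the choice of label 1 as the positive class are all assumptions made for illustration.

```python
from typing import List, Sequence


def class_accuracy(y_true: Sequence[int], y_pred: Sequence[int], label: int) -> float:
    """Share of samples whose true class is `label` that were predicted correctly."""
    hits: List[bool] = [p == t for t, p in zip(y_true, y_pred) if t == label]
    return sum(hits) / len(hits)


def overall_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Share of all samples that were predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive sample receives a higher
    score than a random negative sample (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: label 1 is treated as the positive class (assumption).
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.2, 0.1]
# Assumed decision threshold of 0.5 on the predicted probability.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

acc_pos = class_accuracy(y_true, y_pred, 1)
acc_neg = class_accuracy(y_true, y_pred, 0)
acc_all = overall_accuracy(y_true, y_pred)
auc = roc_auc(y_true, scores)
```

Accuracy depends on the chosen threshold, whereas this pairwise form of the AUC depends only on how the scores rank the samples, which is why the two measures are usually reported together.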

6. Conclusions

In this study, the influence of meteorological and topographic factors on forest-fire occurrence in the Heihe area was comprehensively analyzed using RF and BPNN. The main cause of fires in this area was human activity, and burning stubble on agricultural land was the main human-related cause. Days after the last rain, daily average relative humidity, daily maximum temperature and distance to settlement were the main driving factors affecting forest-fire occurrence. Notably, this study identified days after the last rain as an important driving factor. The relationship between forest-fire occurrence and meteorological changes was significant. Prediction based on the index systems of different driving factors showed that the prediction accuracy and goodness of fit of the RF algorithm were slightly higher than those of the BPNN algorithm, and both were suitable for predicting forest-fire occurrence probability. The fire-risk distribution results based on the prediction models showed that forest fires were prone to occur in the northwestern and central areas.
It was found that combining multiple forest-fire influencing factors could improve the accuracy of the prediction models [60]. This paper relied on meteorological, topographic and human activity factors to build the forest-fire occurrence probability prediction model, but it did not analyze the influence of forest-fuel types, fire sources and other factors on forest-fire occurrence, which is a shortcoming that also affects model accuracy. Moreover, the internal connection between forest-fire causes and forest-fire driving factors was not well analyzed. About 27.5% of fires are unexplained. To give more accurate prediction results, these forest-fire causes need to be further clarified, although investigating unexplained ignitions is difficult. In future work, forest-fire causes will be further clarified by visiting the local emergency management departments, and the prediction of possible forest-fire causes in this area will then be studied further.
In addition, using an intelligent optimization algorithm to optimize RF and BPNN is expected to further improve the prediction accuracy of the models [61]. Choosing the right model for forest-fire occurrence prediction is crucial for revealing the real spatial distribution pattern of forest fires. Because models and study areas differ, no single model can solve the problem perfectly, and it may be better to apply several suitable models at the same time and to compare and synthesize their results. These issues will also be considered in our future research.
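One simple way to compare and synthesize the outputs of several models is to average their predicted probabilities and to record how far they disagree at each location. The sketch below illustrates that idea only; it is not a method used in this study, and the function name, the equal default weights and the toy probabilities are assumptions.

```python
from typing import Dict, List, Optional, Sequence, Tuple


def combine_probabilities(
    model_probs: Dict[str, Sequence[float]],
    weights: Optional[Dict[str, float]] = None,
) -> Tuple[List[float], List[float]]:
    """Return (weighted mean probability, spread between models) per sample."""
    names = list(model_probs)
    n = len(model_probs[names[0]])
    if any(len(model_probs[m]) != n for m in names):
        raise ValueError("all models must score the same samples")
    if weights is None:
        # Assumption: with no other information, every model counts equally.
        weights = {m: 1.0 for m in names}
    total = sum(weights[m] for m in names)
    mean = [
        sum(weights[m] * model_probs[m][i] for m in names) / total
        for i in range(n)
    ]
    # Spread (max - min) flags samples on which the models disagree.
    spread = [
        max(model_probs[m][i] for m in names) - min(model_probs[m][i] for m in names)
        for i in range(n)
    ]
    return mean, spread


# Hypothetical fire-occurrence probabilities from two models at three locations.
probs = {"rf": [0.9, 0.2, 0.6], "bpnn": [0.7, 0.4, 0.5]}
mean_prob, disagreement = combine_probabilities(probs)
```

Locations with a large spread are those where the models disagree most, so they are natural candidates for closer inspection before a combined risk grade is assigned.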

Author Contributions

Methodology, C.G.; software, H.L.; validation, H.L.; resources, H.H.; writing—original draft preparation, C.G.; and writing—review and editing, C.G., H.L. and H.H. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the “Strategic International Scientific and Technological Innovation Cooperation Special Fund of National Key Research and Development Program of China, Grant Number 2018YFE0207800”, “Young Innovative Talents Training Program of Universities in Heilongjiang Province, Grant Number UNPYSCT-2020001” and “Heilongjiang University Outstanding Youth Fund, Grant Number JCL202101”.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Rodrigues, M.; Riva, J.; Fotheringham, S. Modeling the spatial variation of the explanatory factors of human-caused wildfires in Spain using geographically weighted logistic regression. Appl. Geogr. 2014, 48, 25–63.
  2. Ozbayoglu, A.M.; Bozer, R. Estimation of the burned area in forest fires using computational intelligence techniques. Procedia Comput. Sci. 2012, 12, 282–287.
  3. Wu, C.; Xu, W.H.; Huang, S.D.; Qin, M.M.; Wang, Q.H. Research progress of remote sensing for forest-fire monitoring. J. Southwest For. Univ. 2020, 40, 172–179.
  4. Somashekar, R.K.; Ravikumar, P.; Mohan-Kumar, C.N.; Prakash, K.L.; Nagaraja, B.C. Burnt area mapping of bandipur national park, India using IRS 1C/1D LISS III data. J. Indian Soc. Remote Sens. 2009, 37, 37–50.
  5. Cardille, J.A.; Ventura, S.J.; Turner, M.G. Environmental and social factors influencing wildfires in the upper midwest, United States. Ecol. Appl. 2001, 11, 111–127.
  6. Gao, C.; Lin, H.L.; Hu, H.Q.; Song, H. A review of models of forest fire occurrence prediction in China. Chin. J. Appl. Ecol. 2020, 31, 3227–3240.
  7. Li, W.; Jiang, Z.H.; Zhang, X.B.; Li, L.; Sun, Y. Additional risk in extreme precipitation in China from 1.5 °C to 2.0 °C global warming levels. Sci. Bull. 2018, 63, 228–234.
  8. Turco, M.; Llasat, M.C.; Hardenberg, J.V.; Provenzale, A. Impact of climate variability on summer fires in a Mediterranean environment (northeastern Iberian Peninsula). Clim. Chang. 2013, 116, 665–678.
  9. Gu, X.L.; Wu, Z.W.; Zhang, Y.J.; Yan, S.J.; Fu, J.J.; Du, L.H. Prediction research of the forest fire in Jiangxi province in the background of climate change. Acta Ecol. Sin. 2020, 40, 667–677.
  10. Martín, Y.; Zúñiga-Antón, M.; Rodrigues Mimbrero, M. Modelling temporal variation of fire-occurrence towards the dynamic prediction of human wildfire ignition danger in northeast Spain. Geomat. Nat. Hazards Risk 2019, 10, 385–411.
  11. Gao, B.; Shan, Z.H.; Cao, L.L.; Shan, Y.L.; Han, X.Y.; Wang, M.X.; Yin, S.N. Study on monthly dynamic change and occurrence prediction of forest fires in Daxing’an mountains. J. Cent. South Univ. For. Technol. 2021, 41, 53–62.
  12. Hering, A.S.; Bell, C.L.; Genton, M.G. Modeling spatio-temporal wildfire ignition point patterns. Environ. Ecol. Stat. 2009, 16, 225–250.
  13. Alonso-Betanzos, A.; Fontenla-Romero, O.; Guijarro-Berdiñas, B.; Hernández-Pereira, E.; Andrade, M.I.P.; Jiménez, E.; Soto, J.L.L.; Carballas, T. An intelligent system for forest fire risk prediction and fire fighting management in Galicia. Expert Syst. Appl. 2003, 25, 545–554.
  14. Pew, K.L.; Larsen, C.P.S. GIS analysis of spatial and temporal patterns of human-caused wildfires in the temperate rain forest of Vancouver island, Canada. For. Ecol. Manag. 2001, 140, 1–18.
  15. Minnich, R.A.; Bahre, C.J. Wildland fire and chaparral succession along the California-Baja California boundary. Int. J. Wildland Fire 1995, 5, 13–24.
  16. Rothermel, R.C. A Mathematical Model for Predicting Fire Spread in Wild Land Fuels; USDA Forest Service: Ogden, UT, USA, 1972; p. 115.
  17. Elmas, Ç.; Sönmez, Y. A data fusion framework with novel hybrid algorithm for multi-agent Decision Support System for Forest Fire. Expert Syst. Appl. 2011, 38, 9225–9236.
  18. Opitz, T.; Bonneu, F.; Gabriel, E. Point-process based Bayesian modeling of space-time structures of forest fire occurrences in Mediterranean France. Spat. Stat. 2020, 40, 100429.
  19. Chuvieco, E.; Aguado, I.; Yebra, M.; Nieto, H.; Salas, J.; Martín, M.P.; Vilar, L.; Martínez, J.; Martín, S.; Ibarra, P.; et al. Development of a framework for fire risk assessment using remote sensing and geographic information system technologies. Ecol. Model. 2010, 221, 46–58.
  20. Bui, D.T.; Le, H.V.; Hoang, N.D. GIS-based spatial prediction of tropical forest fire danger using a new hybrid machine learning method. Ecol. Inform. 2018, 48, 104–116.
  21. Murthy, K.K.; Sinha, S.K.; Kaul, R.; Vaidyanathan, S. A fine-scale state-space model to understand drivers of forest fires in the Himalayan foothills. For. Ecol. Manag. 2019, 432, 902–911.
  22. Sağlam, B.; Bilgili, E.; Durmaz, B.D.; Kadıoğulları, A.İ.; Küçük, Ö. Spatio-temporal analysis of forest fire risk and danger using LANDSAT imagery. Sensors 2008, 8, 3970–3987.
  23. Sivrikaya, F.; Sağlam, B.; Akay, A.E.; Bozali, N. Evaluation of forest fire risk with GIS. Pol. J. Environ. Stud. 2014, 23, 187–194.
  24. Akbulak, C.; Tatlı, H.; Aygun, G.; Sağlam, B. Forest fire risk analysis via integration of GIS, RS and AHP: The Case of Canakkale, Turkey. J. Hum. Sci. 2018, 15, 2127–2143.
  25. Elia, M.; Giannico, V.; Lafortezza, R.; Sanesi, G. Modeling fire ignition patterns in Mediterranean urban interface. Stoch. Environ. Res. Risk Assess. 2019, 33, 169–181.
  26. Camp, P.E.; Krawchuk, M.A. Spatially varying constraints of human-caused fire occurrence in British Columbia, Canada. Int. J. Wildland Fire 2017, 26, 219–229.
  27. Maingi, J.K.; Henry, M.C. Factors influencing wildfire occurrence and distribution in eastern Kentucky, USA. Int. J. Wildland Fire 2007, 16, 23–33.
  28. Anderson, K. A model to predict lightning-caused fire occurrences. Int. J. Wildland Fire 2002, 11, 163–172.
  29. Ager, A.A.; Barros, A.M.G.; Day, M.A.; Preisler, H.K.; Spies, T.A.; Bolte, J. Analyzing fine scale spatiotemporal drivers of wildfire in a forest landscape model. Ecol. Model. 2018, 384, 87–102.
  30. Monjarás-Vega, N.A.; Briones-Herrera, C.I.; Vega-Nieva, D.J.; Calleros-Flores, E.; Corral-Rivas, J.J.; López-Serrano, P.M.; Pompa-García, P.; Rodríguez-Trejo, D.A.; Carrillo-Parra, A.; González-Cabán, A.; et al. Predicting forest fire kernel density at multiple scales with geographically weighted regression in Mexico. Sci. Total Environ. 2020, 718, 137313.
  31. Hamadeh, N.; Karouni, A.; Daya, B.; Chauvet, P. Using correlative data analysis to develop weather index that estimates the risk of forest fires in Lebanon & Mediterranean: Assessment versus prevalent meteorological indices. Case Stud. Fire Saf. 2017, 7, 8–22.
  32. Milanović, S.; Kaczmarowski, J.; Ciesielski, M.; Trailović, Z.; Mielcarek, M.; Szczygieł, R.; Kwiatkowski, M.; Bałazy, R.; Zasada, M.; Milanović, S.D. Modeling and mapping of forest fire occurrence in the Lower Silesian Voivodeship of Poland based on Machine Learning methods. Forests 2023, 14, 46.
  33. Kalantar, B.; Ueda, N.; Idrees, M.O.; Janizadeh, S.; Ahmadi, K.; Shabani, F. Forest fire susceptibility prediction based on machine learning models with resampling algorithms on remote sensing data. Rem. Sens. 2020, 12, 3682.
  34. Zheng, Z.; Huang, W.; Li, S.N.; Zeng, Y.N. Forest fire spread simulating model using cellular automaton with extreme learning machine. Ecol. Model. 2017, 348, 33–43.
  35. Iban, M.C.; Sekertekin, A. Machine learning based wildfire susceptibility mapping using remotely sensed fire data and GIS: A case study of Adana and Mersin provinces, Turkey. Ecol. Inform. 2022, 69, 101647.
  36. Elia, M.; Este, M.D.; Ascoli, D.; Giannico, V.; Spano, G.; Ganga, A.; Colangelo, G.; Lafortezza, R.; Sanesi, G. Estimating the probability of wildfire occurrence in Mediterranean landscapes using artificial neural networks. Environ. Impact Assess. Rev. 2020, 85, 106474.
  37. Bui, D.T.; Hoang, N.D.; Samui, P. Spatial pattern analysis and prediction of forest fire using new machine learning approach of multivariate adaptive regression splines and differential flower pollination optimization: A case study at Lao Cai province (Viet Nam). J. Environ. Manag. 2019, 237, 476–487.
  38. Sevinc, V.; Kucuk, O.; Goltas, M. A Bayesian network model for prediction and analysis of possible forest fire causes. For. Ecol. Manag. 2020, 457, 117723.
  39. Oliveira, S.; Oehler, F.; San-Miguel-Ayanz, J.; Camia, A.; Pereira, J. Modeling spatial patterns of fire occurrence in Mediterranean Europe using Multiple Regression and Random Forest. For. Ecol. Manag. 2012, 275, 117–129.
  40. Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32.
  41. Cutler, D.R.; Edwards, T.J.; Beard, K.H.; Cutler, A.; Hess, H.T.; Gibson, J.; Lawler, J.J. Random forests for classification in ecology. Ecology 2007, 88, 2783–2792.
  42. Vasconcelos, M.J.; Silva, S.; Tome, M.; Alvim, M.; Pereira, J.C. Spatial prediction of fire ignition probabilities: Comparing Logistic regression and neural networks. Photogramm. Eng. Remote Sens. 2001, 67, 73–81.
  43. Sakr, G.E.; Elhajj, I.H.; Mitri, G. Efficient Forest fire occurrence prediction for developing countries using two weather parameters. Eng. Appl. Artif. Intell. 2011, 24, 888–894.
  44. Mohajane, M.; Costache, R.; Karimi, F.; Pham, Q.B.; Essahlaoui, A.; Nguyen, H.; Laneve, G.; Oudija, F. Application of remote sensing and machine learning algorithms for forest fire mapping in a Mediterranean area. Ecol. Indic. 2021, 129, 107869.
  45. Saha, S.; Bera, B.; Shit, P.K.; Bhattacharjee, S.; Sengupta, N. Prediction of forest fire susceptibility applying machine and deep learning algorithms for conservation priorities of forest resources. Remote Sens. Appl. Soc. Environ. 2023, 29, 100917.
  46. Achu, A.L.; Thomas, J.; Aju, C.D. Machine-learning modelling of fire susceptibility in a forest-agriculture mosaic landscape of southern India. Ecol. Inform. 2021, 64, 101348.
  47. Yang, J.B.; Ma, X.X. On the basis of artificial neural network to forecast the forest fire in Guangdong Province. Sci. Silvae Sin. 2005, 41, 127–132.
  48. Ma, W.Y.; Feng, Z.K.; Cheng, Z.X.; Wang, F.G. Study on driving factors and distribution pattern of forest fires in Shanxi Province. J. Cent. South Univ. For. Technol. 2020, 40, 57–69.
  49. Liang, H.L.; Lin, Y.R.; Yang, G.; Su, Z.W.; Wang, W.H.; Guo, F.T. Application of random forest algorithm on the forest fire prediction in Tahe area based on meteorological factors. Sci. Silvae Sin. 2016, 52, 89–98.
  50. Liang, H.L.; Guo, F.T.; Su, Z.W.; Wang, W.H.; Lin, F.F.; Lin, Y.R. Analysis of meteorological factors on forest fire occurrence of Fujian based on random forest algorithm. Fire Saf. Sci. 2015, 24, 191–200.
  51. Zheng, Z.; Gao, Y.H.; Yang, Q.Y.; Zou, B.; Xu, Y.J.; Chen, Y.Y.; Yang, S.Q.; Wang, Y.Q.; Wang, Z.W. Predicting Forest fire risk based on mining rules with ant-miner algorithm in cloud-rich areas. Ecol. Indic. 2020, 118, 106772.
  52. Zhang, G.; Wang, M.; Liu, K. Deep neural networks for global wildfire susceptibility modelling. Ecol. Indic. 2021, 127, 107735.
  53. Cui, Y.; Di, H.T.; Xing, Y.Q.; Chang, X.Q.; Shan, W. Spatial and temporal distributions of forest fires in Heilongjiang Province from 2001 to 2018 based on MODIS data. J. Nanjing For. Univ. (Nat. Sci. Ed.) 2021, 45, 205–211.
  54. Guo, F.T.; Su, Z.W.; Ma, X.Q.; Song, Y.H.; Sun, L.; Hu, H.Q.; Yang, T.T. Climatic and non-climatic factors driving lightning-induced fire in Tahe, Daxing’an Mountains. Acta Ecol. Sin. 2015, 35, 6439–6448.
  55. Fang, K.N.; Wu, J.B.; Zhu, J.P.; Xie, B.C. A review of technologies on random forests. Stat. Inf. Forum 2011, 26, 32–37.
  56. Liaw, A.; Wiener, M. Classification and regression by random forests. R News 2002, 2, 18–22.
  57. Catry, F.X.; Rego, F.C.; Bação, F.L.; Moreira, F. Modeling and mapping wildfire ignition risk in Portugal. Int. J. Wildland Fire 2009, 18, 921–931.
  58. Chang, Y.; Zhu, Z.L.; Bu, R.C.; Chen, H.W.; Feng, Y.T.; Li, Y.H.; Hu, Y.M.; Wang, Z.C. Predicting fire occurrence patterns with logistic regression in Heilongjiang Province, China. Landsc. Ecol. 2013, 28, 1989–2004.
  59. Deng, O.; Li, Y.Q.; Feng, Z.K.; Zhang, D.Y. Model and zoning of forest fire risk in Heilongjiang province based on spatial Logistic. Trans. Chin. Soc. Agric. Eng. 2012, 28, 200–205.
  60. Zhu, Z.; Zhao, F.; Wang, Q.H.; Gao, Z.L.; Deng, X.F.; Huang, P.G. Driving factors of forest fire and fire risk zoning in Kunming City. J. Zhejiang A F Univ. 2022, 39, 380–387.
  61. Wang, L.; Hao, R.Y.; Liu, W.; Wen, Z.M. A multi-factor forest fire risk rating prediction model based on particle swarm optimization algorithm and back-propagation neural network. J. For. Eng. 2019, 4, 137–144.
Figure 1. The topography of the study area.
Figure 2. The cause of the fire in the study area.
Figure 3. The simple flowchart of the RF algorithm.
Figure 4. The simple flowchart of the BPNN algorithm.
Figure 5. RF variable importance contrast: (a) Data 1; (b) Data 2; (c) Data 3; (d) Data 4; (e) Data 5; and (f) Full Data. x1 denotes days after the last rain; x2 denotes daily average relative humidity; x3 denotes daily maximum temperature; x4 denotes daily minimum relative humidity; x5 denotes daily average temperature; x6 denotes daily precipitation; x7 denotes distance to settlement; x8 denotes daily minimum temperature; x9 denotes daily average water vapor pressure.
Figure 6. ROC curve. (a) ROC curve of RF algorithm. (b) ROC curve of BPNN algorithm.
Figure 7. Forest-fire occurrence probability and risk distribution based on RF algorithm. (a) Forest-fire occurrence probability. (b) Forest-fire-risk distribution.
Figure 8. Forest-fire occurrence probability and risk distribution based on BPNN algorithm. (a) Forest-fire occurrence probability. (b) Forest-fire-risk distribution.
Table 1. The considered forest-fire driving factors in the models.
| Category | Factor | Data source | Resolution/Scale | Minimum value | Maximum value |
|---|---|---|---|---|---|
| Meteorological data | Daily average wind speed | China Meteorological Data Sharing Network (http://www.cma.gov.cn/ accessed on 3 March 2016) | 1 m s−1 | 3 | 7.4 |
| | Daily maximum wind speed | | 1 m s−1 | 2.2 | 12.6 |
| | Daily sunshine hours | | 1 h | 0 | 15.1 |
| | Daily average air pressure | | 1 hPa | 967.8 | 1012 |
| | Daily average temperature | | 1 °C | −30.6 | 24.7 |
| | Daily maximum temperature | | 1 °C | −24.5 | 32.2 |
| | Daily minimum temperature | | 1 °C | −37 | −19.4 |
| | Daily average water vapor pressure | | 1 hPa | 0.4 | 24.1 |
| | Daily average relative humidity | | 1% | 24 | 95 |
| | Daily minimum relative humidity | | 1% | 8 | 58 |
| | Daily precipitation | | 1 mm | 0 | 49.4 |
| | Days after the last rain | | 1 day | 0 | 62 |
| Basic geographic information data | Distance to settlement | National Geographic Information Resources Directory System website (http://www.webmap.cn/ accessed on 28 February 2022) | 1 m | 51.4 | 14,519.2 |
| | Distance to road | | 1 m | 1.3 | 8557.5 |
| | Distance to railway | | 1 m | 12.0 | 163,573.1 |
| Topographic data | Elevation | Geospatial Data Cloud of China (http://www.gscloud.cn/ accessed on 28 February 2022) | 1 m | 85 | 1022 |
| | Slope | | | 0 | 24.2 |
Table 2. Three variable index systems.
| Forest-fire factor | Logistic variable index system | RF variable index system | Integrated variable index system |
|---|---|---|---|
| Daily average wind speed | − | − | − |
| Daily maximum wind speed | − | − | − |
| Daily sunshine hours | − | − | − |
| Daily average air pressure | + | − | − |
| Daily average temperature | + | − | − |
| Daily maximum temperature | + | + | + |
| Daily minimum temperature | + | − | − |
| Daily average water vapor pressure | + | + | + |
| Daily average relative humidity | + | + | + |
| Daily minimum relative humidity | + | + | + |
| Daily precipitation | + | − | − |
| Days after the last rain | + | + | + |
| Distance to settlement | + | + | + |
| Distance to road | − | − | − |
| Distance to railway | − | − | − |
| Elevation | + | − | − |
| Slope | − | − | − |

“+” indicates that the factor appeared in the variable index system, and “−” indicates that the factor did not appear in the variable index system.
Table 3. Results of the cross-validations.
| Method | Accuracy/AUC | Logistic variable index system | RF variable index system | Integrated variable index system |
|---|---|---|---|---|
| RF algorithm | Accuracy, 0 | 86.44% | 86.75% | 85.86% |
| | Accuracy, 1 | 90.43% | 91.24% | 90.00% |
| | Accuracy, total | 88.42% | 88.98% | 87.91% |
| | AUC | 0.947 | 0.955 | 0.946 |
| BPNN algorithm | Accuracy, 0 | 83.00% | 85.18% | 85.18% |
| | Accuracy, 1 | 89.03% | 88.74% | 88.61% |
| | Accuracy, total | 86.01% | 86.94% | 86.88% |
| | AUC | 0.930 | 0.939 | 0.938 |

“0” indicates ignition; “1” indicates non-ignition.
Table 4. The standard of forest-fire-risk grade division.
| Forest-fire occurrence probability | Fire-risk grade | Description |
|---|---|---|
| 0~0.2 | I | Basically no fire |
| 0.2~0.4 | II | Not prone to fire |
| 0.4~0.6 | III | Possible fire |
| 0.6~0.8 | IV | Prone to fire |
| 0.8~1 | V | Extremely prone to fire |
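The grade division in Table 4 amounts to a lookup from a predicted probability to one of five classes, which can be sketched as follows. The function name is hypothetical. Because the table does not say which grade a boundary value such as 0.2 belongs to, assigning it to the higher grade is an assumption of this sketch.

```python
import bisect
from typing import Tuple

# Upper bounds of grades I-IV from Table 4; grade V covers the rest up to 1.
_UPPER_BOUNDS = [0.2, 0.4, 0.6, 0.8]
_GRADES = [
    ("I", "Basically no fire"),
    ("II", "Not prone to fire"),
    ("III", "Possible fire"),
    ("IV", "Prone to fire"),
    ("V", "Extremely prone to fire"),
]


def fire_risk_grade(probability: float) -> Tuple[str, str]:
    """Map a forest-fire occurrence probability in [0, 1] to its risk grade."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    # bisect_right places a value equal to a bound in the higher grade
    # (assumption: Table 4 leaves the boundary assignment open).
    return _GRADES[bisect.bisect_right(_UPPER_BOUNDS, probability)]


# Example: a predicted probability of 0.55 falls in grade III.
grade, description = fire_risk_grade(0.55)
```

Applied to every cell of an interpolated probability surface, such a lookup turns the probability surface into a five-class risk map.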
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Gao, C.; Lin, H.; Hu, H. Forest-Fire-Risk Prediction Based on Random Forest and Backpropagation Neural Network of Heihe Area in Heilongjiang Province, China. Forests 2023, 14, 170. https://doi.org/10.3390/f14020170
