1. Introduction
Floods are among the most serious natural disasters in the world, causing heavy casualties and economic losses every year [1,2]. In recent years, with global climate warming, the frequency and intensity of floods have increased. According to the “China Flood and Drought Disaster Prevention Bulletin” released by the Ministry of Water Resources (http://www.mwr.gov.cn (accessed on 12 April 2022)), floods in 2020 caused 230 deaths across the country and direct economic losses of CNY 266.98 billion. Therefore, timely and accurate flood monitoring and analysis of flood evolution trends are of great significance to disaster emergency management in China [3].
At present, remote sensing has gradually become the main means of flood monitoring owing to its wide coverage and short revisit period [4,5]. Depending on the detection method, it can be divided into optical remote sensing monitoring and radar remote sensing monitoring. However, since floods are usually accompanied by cloudy or rainy weather, the earth observation of optical remote sensing satellites is hindered, so it is difficult to obtain clear and cloud-free optical images [6]. In contrast, Synthetic Aperture Radar (SAR) plays an increasingly important role in flood monitoring because it can operate day and night in all weather and is not easily affected by clouds and rain [7,8].
To date, flood monitoring methods based on SAR data mainly include the thresholding-based method [9,10], the object-oriented method [11], the active contour method [12], and the machine learning method [13,14]. Among them, the thresholding-based method is the most widely used. Although it runs fast and its principle is simple [15], it struggles to meet accuracy requirements when the image grayscale is uneven or the flooded area is large [16]. The object-oriented method can exploit image features such as texture and shape and achieves good results, but the scale parameters for segmentation and classification depend on experience [17]. The active contour method can make full use of the color features and edge information of the image, but speckle noise and complex calculations hinder its application to flood monitoring in large basins [18]. In recent years, the machine learning method has been gradually applied to SAR-based flood monitoring and has achieved high extraction accuracy [19]. It can make full use of the feature information of images, and a trained model can be reused for multiple images of the same type, which makes it suitable for batch processing. Considering the large geographic extent of the study area, the noise in SAR images, and the complex and diverse flood scenes, we used the machine learning method to monitor large-scale floods.
As the available remote sensing data become more abundant and data volumes grow, offline processing takes a lot of time, making it difficult to meet the demands of disaster emergency monitoring. The emergence of cloud platforms such as Google Earth Engine (GEE) has alleviated the problem of long processing times and heavy computation for remote sensing images [20]. GEE is a cloud computing platform specialized in processing remote sensing images. It stores the main open-access remote sensing image datasets from the past 40 years, such as the Landsat, Sentinel, and MODIS series [21]. With its powerful computing capabilities, massive free remote sensing data, and many built-in algorithms, GEE provides important technical support for flood monitoring in large basins [22]. DeVries et al. [23] used all available Sentinel-1 images, combined with historical Landsat and other auxiliary data, to quickly extract inundation information during floods on the GEE platform. Qiu et al. [24] applied the Otsu thresholding method to 66 Sentinel-1 images on the GEE platform to study the floods in the Pearl River Basin from 2017 to 2020. Jia et al. [25] obtained the spatial and temporal distribution pattern of floods in the Chaohu Lake Basin from 2015 to 2020 based on Sentinel-1 images in the GEE platform, and analyzed the impact of floods on cropland and buildings. However, most existing studies focus on flooding in small basins, and flood monitoring in large basins is affected by mountain shadows.
In addition, owing to the uneven grayscale distribution in SAR images [18], the easy confusion of water bodies with mountain shadows [26], and limited computing resources [24], existing flood monitoring of large watersheds is not very accurate. Therefore, in this work, based on the GEE cloud computing platform, we used all Sentinel-1 images acquired during the flood period in the study area and first extracted the flood water body information with the support vector machine (SVM) method. Then, we proposed a novel function model to remove the mountain shadows from the flood maps. Finally, we post-processed the results to analyze the flood disasters in the MLB in 2020.
4. Discussion
4.1. Comparison of Different Methods
To validate the accuracy of the SVM model, we compared its results with those of the random forest (RF) method and the Otsu method. RF is one of the most popular machine learning algorithms; we set its number of trees to 100. The Otsu method is the most widely used thresholding-based method; we computed the Otsu threshold on the VH polarization of the Sentinel-1 images. The comparison of the three methods is shown in Table 4. The SVM model had the highest accuracy, followed by the RF model, and the Otsu method had the lowest accuracy; that is, the thresholding-based method was less accurate than both machine learning methods. This supports the use of the SVM model for extracting flood water bodies in this paper.
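For readers unfamiliar with the Otsu baseline, the following is a minimal sketch of the general technique: it picks the threshold that maximizes the between-class variance of a histogram. It is not the GEE implementation used in this paper. The function name `otsu_threshold`, the bin count, and the synthetic VH backscatter values (in dB) are assumptions made purely for illustration.

```python
def otsu_threshold(values, bins=256):
    """Return the threshold that maximizes between-class variance."""
    lo, hi = min(values), max(values)
    if lo == hi:
        return lo  # a flat image has no meaningful threshold
    width = (hi - lo) / bins
    hist = [0] * bins
    for v in values:
        # Clamp so that the maximum value falls into the last bin.
        hist[min(int((v - lo) / width), bins - 1)] += 1
    centers = [lo + (i + 0.5) * width for i in range(bins)]
    total = len(values)
    sum_all = sum(c * h for c, h in zip(centers, hist))

    best_var, best_t = -1.0, lo
    w0, sum0 = 0, 0.0
    for i in range(bins - 1):
        w0 += hist[i]                  # pixels at or below bin i
        sum0 += centers[i] * hist[i]
        w1 = total - w0                # pixels above bin i
        if w0 == 0 or w1 == 0:
            continue
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mean0 - mean1) ** 2
        if var_between > best_var:
            best_var = var_between
            best_t = lo + (i + 1) * width  # upper edge of bin i
    return best_t


# Synthetic, hypothetical VH backscatter (dB): dark water and brighter land.
water = [-26.0, -25.0, -24.0] * 100
land = [-16.0, -15.0, -14.0] * 200
threshold = otsu_threshold(water + land)
is_water = [v < threshold for v in water + land]
```

On clearly bimodal data like this the threshold falls in the gap between the two clusters. On real SAR scenes with uneven grayscale the histogram is rarely this clean, which is one reason a single global threshold can underperform trained classifiers.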
4.2. Analysis of Shadow Removal Model
This paper proposed a function model to remove mountain shadows from the water body extraction maps. To assess how much this model improved the flood monitoring capability, we analyzed it from qualitative and quantitative perspectives.
4.2.1. Qualitative Analysis
To qualitatively evaluate the ability of the function model to remove mountain shadows, we selected four geographic regions and show the details after shadow removal in Figure 9. The left column of Figure 9 shows the Landsat 8 optical remote sensing images of the four regions. The middle column contains the corresponding Sentinel-1 SAR images. The right column shows the result after removing the mountain shadows, in which blue represents the water body after shadow removal and green represents the shadows recognized by the function model. The shadows formed by tall mountains obscuring the radar beam were identified (regions 1 and 3), whereas the shadows caused by low mountains were only partially identified, and some were not extracted (regions 2 and 4). Overall, the linear function model proposed in this paper was effective at removing mountain shadows and could mitigate their interference in large-scale watershed flood monitoring.
4.2.2. Quantitative Analysis
We selected a geographic region of 1319 × 2058 pixels to quantitatively evaluate the effect of the shadow removal model. The selected region is in the west of the study area, and we randomly generated 5000 points within it to quantify the performance of the shadow model. The water body results before and after removing mountain shadows were compared with the actual water (Figure 10), and the comparison results are shown in Table 5. Before removing mountain shadows, the accuracy and kappa coefficient in the selected region were 93.06% and 0.9173, respectively. After removing mountain shadows, they were 95% and 0.9315, respectively, an improvement of 1.94 percentage points and 0.0142. The function model therefore helped to improve the flood monitoring ability.
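As a reminder of how the two evaluation indices are defined, the sketch below computes overall accuracy and Cohen's kappa from a confusion matrix and then reproduces the before/after differences quoted above. The confusion matrix is hypothetical and chosen only to make the arithmetic easy to follow; the counts behind Table 5 are not reproduced here. The function name `accuracy_and_kappa` is likewise an assumption.

```python
def accuracy_and_kappa(matrix):
    """Overall accuracy and Cohen's kappa from a square confusion matrix.

    matrix[i][j] is the number of samples of reference class i
    that were assigned to class j.
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(n)) / total
    # Chance agreement: product of row and column marginals per class.
    expected = sum(
        sum(matrix[i]) * sum(row[i] for row in matrix) for i in range(n)
    ) / (total * total)
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical 2-class example (water / non-water), not data from the paper.
example = [[45, 5],
           [10, 40]]
acc, kappa = accuracy_and_kappa(example)  # 0.85 and 0.70

# Differences between the values reported before and after shadow removal.
accuracy_gain = 95.00 - 93.06   # about 1.94 percentage points
kappa_gain = 0.9315 - 0.9173    # about 0.0142
```

Kappa discounts the agreement that would be expected by chance, so it is a stricter index than overall accuracy when one class dominates the sample points.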
4.3. Accuracy and Efficiency in the Flood Monitoring
The accuracy and kappa coefficient of the trained SVM model on the testing dataset were 97.77% and 0.9521, respectively, which demonstrated its effectiveness for flood monitoring in a large basin. However, mountains in the west of the MLB produce shadows that resemble water bodies, which hinders further accuracy improvement. Therefore, we proposed a shadow removal model, and the accuracy and kappa coefficient of flood monitoring after removing mountain shadows improved by 1.94 percentage points and 0.0142, respectively. Because floods develop rapidly and are usually accompanied by cloudy or rainy weather, accuracy assessment of flood monitoring results using optical remote sensing images is limited. Meanwhile, since the MLB is very large and contains many rivers and lakes, it is unrealistic to evaluate the accuracy of the SVM model and the mountain shadow removal model across the whole basin. Therefore, we selected several testing regions to quantitatively evaluate the accuracy of the models, but this approach carries uncertainty. This uncertainty hinders emergency flood monitoring and is itself difficult to study.
The purpose of flood monitoring research is practical application in emergency monitoring. For flood emergency monitoring in a large basin, the SVM model and the mountain shadow removal model can be used in combination to quickly extract the spatial and temporal distribution of floods. This provides a scientific basis for flood early warning, disaster relief, and post-disaster assessment, and has high production efficiency.
4.4. Inundation Analysis
To evaluate the losses caused by this major watershed flood, we overlaid land use data on the water body extraction results to analyze the land use categories inundated by the floods. The land use data were produced in 2020, the same year as the floods studied, which avoids discrepancies caused by land use changes between data years. As shown in Figure 11 and Table 6, a variety of land uses were affected by the basin floods, and a total of 8526 km² of land was inundated, mainly along the Yangtze River. Cropland was the most severely affected, with an affected area of 6160 km², accounting for 72.25% of the total inundated area; the inundated cropland was mainly distributed along the main stream of the Yangtze River and around Poyang Lake. The floods severely damaged agricultural production and caused serious losses to the MLB. Wetland and forest were the next most affected, with inundated areas of 985 km² and 677 km², respectively. The inundated wetland was mainly distributed around Dongting Lake and Poyang Lake, and the inundated forest was mainly distributed in the west and south of the study area. Grassland, buildings, and bare land were less affected, with inundated areas of 416 km², 182 km², and 106 km², respectively, each accounting for less than 5% of the total. The inundated grassland was mainly located in the west and south of the study area, while the inundated buildings were mainly located along the Wuhan and Nanjing sections of the Yangtze River.
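The shares quoted in this subsection follow directly from the per-class areas. The short sketch below recomputes them; the areas are the ones given in the text, while the dictionary layout and variable names are only illustrative.

```python
# Inundated area per land use class (km^2), as reported in the text.
inundated_km2 = {
    "cropland": 6160,
    "wetland": 985,
    "forest": 677,
    "grassland": 416,
    "buildings": 182,
    "bare land": 106,
}

total_km2 = sum(inundated_km2.values())  # 8526

# Share of the total inundated area per class, in percent.
share_pct = {
    name: 100 * area / total_km2 for name, area in inundated_km2.items()
}

for name, pct in sorted(share_pct.items(), key=lambda kv: -kv[1]):
    print(f"{name:10s} {inundated_km2[name]:5d} km^2  {pct:5.2f}%")
```

The six classes sum to the reported total of 8526 km², cropland comes out at 72.25%, and grassland, buildings, and bare land each fall below 5%.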
4.5. Disaster Analysis of the Typical Lakes
The MLB is vast and contains many rivers and lakes [40], so it is difficult to analyze the disaster across the whole basin. Therefore, we selected three typical lakes for disaster analysis: Taihu Lake, Poyang Lake, and East Dongting Lake, which are in the east, middle, and west of the study area, respectively.
4.5.1. Taihu Lake
Taihu Lake is the third largest freshwater lake in China, which is in the south of Jiangsu Province, with the geographical range of 30°55′~31°32′ N and 119°52′~120°36′ E.
The measured water level data could supplement SAR images for flood monitoring [41]. Therefore, this paper collected daily water level data from the Dongting Xishan hydrological station located in Taihu Lake, as shown in Figure 12. The water level of Taihu Lake was 3.22 m on 22 June and then rose rapidly, crossing the warning level of 3.8 m on 29 June and peaking at 4.75 m on 21 July. The water level then gradually decreased, falling back to the warning level on 13 August. Figure 13 and Figure 14 show the water area changes of Taihu Lake. The changes in water area were consistent with the trend in water level: the water area increased from 2289 km² on 16 June to 2344 km² on 22 July, an increase of 55 km², and then decreased to 2307 km² on 27 August, a decrease of 37 km².
4.5.2. Poyang Lake
Poyang Lake, the largest freshwater lake in China, is located in the north of Jiangxi Province, with a geographical range of 28°22′ N to 29°45′ N and 115°47′ E to 116°45′ E, and it plays an important role in flood storage and drought relief in the MLB [42]. The water area changes of Poyang Lake are shown in Figure 15. The water area showed a trend of “steep rise and slow fall” over the whole flooding period. It first increased rapidly from 2832 km² on 16 June to 3794 km² on 20 July, and then decreased slowly to 3574 km² on 6 September owing to the influence of high water levels in the Yangtze River. In terms of the temporal rate of change, from 16 June to 20 July, the water area increased by a total of 962 km², a daily average increase of 40 km²/d. From 20 July to 6 September, the water area decreased by a total of 220 km², a daily average decrease of 4.58 km²/d.
Figure 16 shows the spatial distribution maps of the water body in Poyang Lake. There was no flooding in the west, southwest, and east of Poyang Lake on 26 June 2020. On 8 July 2020, there was obvious flooding in the southwest of Poyang Lake, and a large part of the Yellow Lake flood storage area was inundated. On 20 July 2020, there was obvious inundation in the west and east of Poyang Lake. The slow decrease of the water area put great pressure on flood control and disaster relief in the MLB.
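The mean daily rates quoted for Poyang Lake can be checked with simple date arithmetic. The sketch below uses the areas and dates given in the text; the helper name `mean_daily_change` is hypothetical. Which start date was used for the rising phase is an assumption here: the 40 km²/d figure is reproduced with a 24-day window starting on 26 June (the first acquisition shown in Figure 16), whereas counting from 16 June gives roughly 28 km²/d.

```python
from datetime import date


def mean_daily_change(area_start, area_end, day_start, day_end):
    """Average change in water area per day (km^2/d) between two dates."""
    days = (day_end - day_start).days
    return (area_end - area_start) / days


peak_day = date(2020, 7, 20)

# Rising phase: 2832 km^2 -> 3794 km^2 (+962 km^2), two candidate windows.
rise_from_16_june = mean_daily_change(2832, 3794, date(2020, 6, 16), peak_day)
rise_from_26_june = mean_daily_change(2832, 3794, date(2020, 6, 26), peak_day)

# Falling phase: 3794 km^2 -> 3574 km^2 (-220 km^2) over 48 days.
fall = mean_daily_change(3794, 3574, peak_day, date(2020, 9, 6))

print(round(rise_from_16_june, 1))  # about 28.3
print(round(rise_from_26_june, 1))  # about 40.1
print(round(fall, 2))               # about -4.58
```

Under either window, the rise is several times faster than the fall, which is the “steep rise and slow fall” pattern described above.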
4.5.3. East Dongting Lake
Dongting Lake, the second largest freshwater lake in China, is located in the north of Hunan Province, with a geographical range of 28°30′ N~30°20′ N and 112°25′ E~113°15′ E, and is an important storage lake in the Yangtze River basin. However, the SAR satellite did not observe the west of Dongting Lake at the early stage of flooding, so East Dongting Lake was selected for disaster analysis in this paper.
Figure 17 shows the water area changes of East Dongting Lake. The trend was similar to that of Poyang Lake, that is, “steep rise and slow fall”. The water area first increased rapidly from 1015 km² on 19 June to 1614 km² on 18 July, an increase of 599 km², then continued to increase slowly to 1629 km² on 30 July, after which it decreased slowly to 1480 km² on 4 September.
Figure 18 shows the spatial distribution maps of the water body in East Dongting Lake from 19 June to 4 September 2020. At the beginning of the flood, the middle and south of East Dongting Lake were not inundated. As the water level rose, small lakes in the middle and south of East Dongting Lake merged and inundated a large exposed area. After that, the water area slowly declined, but the inundated areas did not show any significant recession.
4.6. Limitations and Implications
Although this paper adopted the SVM model to extract flood water body information on the GEE platform and proposed a function model to remove the influence of mountain shadows, thereby improving the accuracy of large-scale flood monitoring, some shortcomings remain. The first is the single data source. This paper used only Sentinel-1 images, which alone can hardly meet the requirements of large-scale flood monitoring. In the future, multi-source remote sensing images such as GF-3 and Landsat 8 can be combined. The second is the linear model used to remove mountain shadows. It is not ideal, and the model can be refined in the future to raise the removal rate of mountain shadows. In addition, with the development of machine learning, new algorithms are constantly being applied to flood monitoring [14,43]. In the future, we can test more algorithms to find the one most suitable for flood monitoring. Finally, for large-scale watershed flooding, it is difficult to grasp the evolution trend and characteristics of floods. This paper selected only three typical lakes to analyze flood disasters. More lakes and rivers can be selected for research in the future.
5. Conclusions
Based on the GEE cloud platform and Sentinel-1 SAR images, this paper used the SVM model to extract flood water bodies and then analyzed the flood disaster situation in the MLB, addressing the low accuracy of large-scale watershed flood extraction and the difficulty of removing mountain shadows. The results showed that: (1) The accuracy and kappa coefficient of the trained SVM model on the testing dataset were 97.77% and 0.9521, respectively. (2) Compared with the other three methods for removing mountain shadows, the sample-based linear function model had the best effect, with a shadow recognition rate of 75.46%. Applying the function model to the flood water body extraction maps could mitigate the interference of mountain shadows. (3) We analyzed flood disasters based on multi-temporal flood water body extraction maps. The flood inundated a total of 8526 km² of land, of which cropland was the most severely damaged, accounting for 72.25% of the total inundated area. The flood seriously damaged agricultural production in the MLB.
Remote sensing images are a record of surface information, and the information extracted from images differs from the actual surface information. Although the flood monitoring method in this paper has high accuracy, the bias of SAR images in recording surface information, errors in flood information extraction, and the difficulty of carrying out accuracy assessment on a large scale all lead to uncertainty in flood monitoring accuracy. We chose good-quality remote sensing images and an intelligent algorithm to reduce this uncertainty. In the future, accuracy assessment can be carried out in more areas to further reduce it.
The flood monitoring method and technical process in this paper can be used in actual flood monitoring with high production efficiency. They can provide important support for the emergency response and disaster relief work of relevant departments, help improve disaster emergency management capability, and provide a basis for subsequent studies such as flood development trends and post-disaster damage assessment.