An Improved Method for Rainfall Forecast Based on GNSS-PWV

Li, Longjiang; Zhang, Kefei; Wu, Suqin; Li, Haobo; Wang, Xiaoming; Hu, Andong; Li, Wang; Fu, Erjiang; Zhang, Minghao; Shen, Zhen

doi:10.3390/rs14174280

Open AccessArticle

An Improved Method for Rainfall Forecast Based on GNSS-PWV

by

Longjiang Li

¹,

Kefei Zhang

^1,2,*

,

Suqin Wu

¹,

Haobo Li

^2,3

,

Xiaoming Wang

⁴

,

Andong Hu

⁵,

Wang Li

⁶

,

Erjiang Fu

⁷,

Minghao Zhang

¹

and

Zhen Shen

¹

School of Environment Science and Spatial Informatics, China University of Mining and Technology, Xuzhou 221116, China

²

Satellite Positioning for Atmosphere, Climate and Environment (SPACE) Research Centre, RMIT University, Melbourne, VIC 3001, Australia

³

State Key Laboratory of Hydroscience and Engineering, Tsinghua University, Beijing 100084, China

⁴

Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100094, China

⁵

Cooperative Institute for Research in Environmental Sciences, University of Colorado Boulder, Boulder, CO 80309, USA

⁶

Faculty of Land Resources Engineering, Kunming University of Science and Technology, Kunming 650031, China

⁷

Bei-Stars Geospatial Information Innovation Institute, No. 1 Xinbei Road, Pukou District, Nanjing 211899, China

^*

Author to whom correspondence should be addressed.

Remote Sens. 2022, 14(17), 4280; https://doi.org/10.3390/rs14174280

Submission received: 6 July 2022 / Revised: 22 August 2022 / Accepted: 24 August 2022 / Published: 30 August 2022

(This article belongs to the Special Issue Advanced GNSS Remote Sensing Techniques for Meteorology and Climate Sciences)

Download

Browse Figures

Review Reports Versions Notes

Abstract

:

Global navigation satellite systems (GNSS) has been applied to the sounding of precipitable water vapor (PWV) due to its high accuracy and high spatiotemporal resolutions. PWV obtained from GNSS (GNSS-PWV) can be used to investigate extreme weather phenomena, such as the formation mechanism and prediction of rainfalls. In the study, a new, improved model for rainfall forecasting was developed based on GNSS data and rainfall data for the 9-year period from 2010 to 2018 at 66 stations located in the USA. The new model included three prediction factors—PWV value, PWV increase, maximum hourly PWV increase. The two key tasks involved for the development of the model were the determination of the thresholds for each prediction factor and the selection of the optimal strategy for using the three prediction factors together. For determining the thresholds, both critical success index (CSI) and true skill statistic (TSS) were tested, and results showed that TSS outperformed CSI for all rainfall events tested. Then, various strategies by combining the three prediction factors together were also tested, and results indicated that the best forecast result was from the case that any two of the prediction factors were over their own thresholds. Finally, the new model was evaluated using the GNSS data for the 2-year period from 2019 to 2020 at the above mentioned 66 stations, and the probability of detection (POD) and false-alarms rate (FAR) were adopted to measure the model performances. Over the 66 stations, the POD values ranged from 73% to 97% with the mean of 87%, and the FARs ranged from 26% to 77% with the mean of 53%. Moreover, it was also found that both POD and FAR values were related to the region of the station; e.g., the results at the stations that are located in humid regions were better than the ones located in dry regions. All these results suggest the feasibility and good performance of using GNSS-PWV for forecasting rainfall.

Keywords:

global navigation satellite systems (GNSS); precipitable water vapor (PWV); rainfall forecasting

1. Introduction

Water vapor is a very important element in the troposphere, where numerous extreme weather phenomena take place. It not only plays a significant role in hydrological cycle and dynamical process [1,2] but also is the major greenhouse gas [3,4]. Many weather events are linked to water vapor, such as rainfall, snow, and fog. Therefore, it is very important to investigate and forecast extreme weather for accurately monitoring water vapor in the troposphere. In addition to traditional methods of observing tropospheric water vapor (radiosonde, microwave radiometers, etc.), global navigation satellite systems (GNSS) has been widely used to retrieve precipitable water vapor (PWV) [5,6,7,8,9] since Bevis first proposed GNSS meteorology in 1992 [10].

GNSS signals are delayed and bent as they pass the atmosphere, which can be divided into tropospheric delay and ionospheric delay. The tropospheric delay mainly refers to zenith total delay (ZTD) [11], which consists of zenith hydrostatic delay (ZHD), predominantly caused by the dry part of the atmosphere, and zenith wet delay (ZWD), caused by water vapor in the atmosphere [12]. PWV can be converted from the ZWD by multiplying it by a conversion factor, which is a function of atmospheric weighted mean temperature (T_m) [13]. In the past three decades, the methods of retrieving PWV from GNSS (GNSS-PWV) have been given much attention, including the determination of the ZHD and T_m [14,15,16,17,18,19], and the accuracy of GNSS-PWV [20,21,22,23]. Nowadays, GNSS-PWV is used to investigate the evolution of several types of weather events [24,25,26,27,28] and climate change [29,30,31] due to its advantage of low cost and all-weather high resolution and high accuracy.

Rainfall is one of the most common weather events, and it significantly affects human life and property, especially heavy rainfalls. Thus, it is important to accurately forecast rainfall since only when a PWV value reaches a certain degree may rainfall occur [32]. Many studies have investigated the relationship between PWV values and rainfalls, aiming at using GNSS-PWV to improve the accuracy of rainfall forecasting. Champollion et al. [33] monitored the evolution of PWV during a heavy rainfall in southern France and evidenced ongoing accumulation of PWV before or through the heavy rainfall. Muller et al. [34] found that there was a tight relationship between tropical rainfall and PWV, which could be reproduced by a very simple, physically motivated two-layer model. Barindelli et al. [35] analyzed temporal variation in PWV and its relationship with the evolution of rainfall in a region of Italy and then found that PWV obviously increased before onset of rainfall and steeply decreased after rainfall. Guo et al. [36] investigated the relationship between PWV and heavy rainfall in Hubei Province, China. They found that PWV values reached a very high level before onset of heavy rainfall events, indicating that PWV can be used as a predictor for rainfall forecasting. Shi et al. [37] analyzed the relationship between PWV and rainfall over the Indo-China region and found that the relationship was mainly affected by the latitude and altitude of the station and the moon system. Sapucci et al. [38] evaluated high-frequency GNSS-PWV for intense rainfall events, and results showed a rapid increase in PWV before most intense rainfall events occurred. Kunkel et al. [39] found a positive correlation between PWV and extreme daily precipitation with over one-third statistically significant at 3104 stations in the U.S. Zhao et al. [40] used two-dimensional and three-dimensional PWV to investigate the relationship between PWV and rainfall, and results showed that both two- and three-dimensional PWV varied anomalously before the onset of rainfall. All of these studies verified a clear relationship between variation in PWV values and rainfall, and more specifically, rapid increase and/or large PWV values may lead to the onset of rainfall.

Based on the above studies, numerous threshold-based methods that used GNSS-PWV to predict rainfall were proposed in recent years. The methods mainly contain three steps: (1) Some prediction factors are calculated based on GNSS-PWV time series, e.g., PWV value, PWV increase, and rate of PWV increase; (2) the optimal threshold for each prediction factor is identified from several candidate thresholds; and (3) a rainfall event is forecasted if one or more prediction factors are over their optimal thresholds, and the event is forecasted correctly if it rains in the next n hours (i.e., the selected forecast window), and otherwise, the event is forecasted falsely. Many relative studies mainly investigate the types of the prediction factors and how to determine the optimal threshold for each of the prediction factors. Benevides et al. [41] proposed a one-predictor method using the rate of PWV variation, and the method caused both significant probability of detection (POD) and false-alarms rate (FAR) when the forecast window was 6 h. Yao et al. [42] proposed a new three-predictor method using PWV value, PWV increase, and rate of PWV increase as the predictors, whose POD and FAR were about 80% and 66%, respectively, in Zhejiang Province, China. Manandhar et al. [43] examined the above two methods using data from a tropical region and found that the PWV value was the major predictor in the region, which reduced the FAR by almost 17%. Then, based on the three-factor method, Li et al. [44] proposed a five-factor method by adding two more predictors (PWV decrease and rate of PWV decrease), which predicted 95% heavy rainfall in the Hong Kong region and reduced the FAR by 33% compared with the three-factor method. In addition to the aforementioned predictors, the other predictors obtained from GNSS products were also investigated. Manandhar et al. [45] used the first and second derivative of GNSS-PWV to predict rainfall in a tropic region, and results were similar to the ones in [43]. Zhao [11] and Li [46] used GNSS-ZTD time series for rainfall forecasting, which led to a result better than that in previous studies. Li et al. [47] also found that an optimal diurnal threshold outperformed an optimal monthly threshold in Hong Kong. Zhao et al. [48] used GNSS-PWV to forecast hourly rainfall based on a supervised learning algorithm, and the model archived a promising result. In the above-mentioned studies, several prediction factors were investigated for forecasting rainfall events, and significant results were shown. In addition to the prediction factors, how to determine the optimal threshold for each of the prediction factors was also investigated. There are mainly three methods for determining the optimal threshold. First, the empirical method was commonly used in previous studies. In the method, the optimal threshold for each prediction factor was the candidate value that leads to a suitable POD and FAR. Second, the percentile method was used by Zhao et al [11]. The method determines the optimal threshold by testing the candidate thresholds at certain percentiles. Third, the critical success index (CSI) method was used to forecast heavy rainfall events in [44]. The method regards the candidate threshold that leads to the maximum CSI as the optimal threshold.

It is difficult to determine the optimal threshold by the empirical method and also the percentile method since both POD and FAR decrease with the increase in the candidate threshold values. This means that different optimal thresholds may be selected even for the same station. Comparing the two methods, the CSI method is better for determining the optimal threshold. However, the CSI method was merely used to forecast heavy rainfall [44]. In this study, the method was tested for all types of rainfall events, and poor performance was found. Therefore, a new method using true skill statistic (TSS) to determine the optimal threshold was proposed. Furthermore, different combinations of prediction factors were also tested, which have not been investigated in previous studies, and the method was evaluated at 66 stations located in the USA.

The outline of this paper is as follows: Section 2 describes the data and the methods used in this study; Section 3 shows the construction of the improved model for rainfall forecasting proposed by this study; and the test results of the improved model are shown in Section 4, followed by discussions and conclusions, which are presented in the last section.

2. Data and Methodologies

2.1. Data

The four types of data used in this study included GNSS data from the International GNSS Service (IGS) and continuously operating reference stations (CORS) in the USA; sounding data from the Integrated Global Radiosonde Archive (IGRA); rainfall data from the Integrated Surface Database (ISD); and reanalysis data (ERA5) from the European Centre for Medium-Range Weather Forecast (ECMWF). The GNSS data were used to estimate GNSS-PWV, combining with ERA5 data. The rainfall data were used as the reference/truth of precipitation for the validation of the model prediction results. The sounding data were used as the reference to evaluate the accuracy of GNSS-PWV. The distribution of all stations used in this study is shown in Figure 1.

2.1.1. ISD

The ISD project was initiated by the National Climatic Data Center (NCDC) of the National Oceanic and Atmospheric Administration (NOAA) in 1998 for addressing the problem of complication in the usage of hourly surface-based meteorological observations. It aimed at merging hourly surface data collected from different organizations into a common ASCII character format and submitting to a same standardized quality control [49]. The ISD mainly includes three sections: a control data section, a mandatory data section, and an additional data section. The first section provides the information of stations, e.g., observing date, time, and location. The second section provides common meteorological parameters, e.g., wind, temperature, and pressure. The third section provides significant meteorological parameters and/or the parameters received with varying degrees of frequency, e.g., precipitation and snow depth. In this study, the data in the first section and precipitation data in the third section were used as the truth of rainfall.

The ISD contains billions of surface meteorological parameters for the period from 1900 to present from over about 30,000 stations around the world, and the data are available from https://www.ncei.noaa.gov/data/global-hourly/ (accessed on 14 January 2021). However, in this dataset, there are usually a large amount of missing data; thus, a checking process needs to be performed before the data were used. In this study, the data in the 11-year period from 2010 to 2020 were checked by the following steps: (1) the stations that had no observation in a year during the 11-year period were removed; (2) the stations whose missing data were over 40% of the whole data were removed; and (3) the stations whose missing data were over 40% of all data in the whole year were removed. After the checking process was carried out, only 1109 stations were left, and these stations were used as candidate rainfall stations of this study.

2.1.2. GNSS

In order to have as many GNSS stations that are co-located with the aforementioned candidate rainfall stations as possible, the GNSS data selected in this study are from three data sets provided by National Geodetic Survey (NGS), University Navigation by Satellite Timing and Ranging (NAVSTAR) Consortium (UNAVCO), and Scripps Institution of Oceanography’s Orbit and Permanent Array Center (SOPAC). The first data set was used as the major data source, and the other two data sets were used as the supplementary data source since there was considerable amount of missing data in each of the data sets. The above three organizations are detailed below.

NGS is one of the NOAA’s departments, which defines, maintains, and provides access to the National Spatial Reference System (NSRS) and also manages the NOAA CORS Network (NCN). The NCN includes GNSS carrier phase and code range measurements and station metadata, which are collected from hundreds of government, academic, and private organizations. The data from these different organizations are stored in a same format, a Receiver Independent Exchange Format (RINEX). The data are available free of charge from https://geodesy.noaa.gov/corsdata/ (accessed on 21 April 2022).

UNAVCO is a non-profit organization set up by the National Science Foundation (NSF) in the USA in 1984. It operates the Geodetic Facility for the Advancement of Geoscience (GAGE), a world data center, which provides access to GNSS data in RINEX over more than 10,000 stations around the world and other required data. The data are available free of charge from https://data.unavco.org/archive/gnss/ (accessed on 21 April 2022).

SOPAC mainly focuses on the analysis and archiving of precise GNSS data and data products since 1991 and provides the GARNER archive of GNSS, including data from the IGS, California Spatial Reference Center (CSRC) and California Real-Time Network (CRTN). The data are available free of charge from http://garner.ucsd.edu/pub/ (accessed on 21 April 2022).

From the three data sets, more than 8000 stations are available, but most of them are not continuously long-term or permanent GNSS stations. To ensure sufficient data used to investigate the relationship between GNSS-PWV and rainfall, for stations whose missing data were over 40% during the 11-year period studied, the station was discarded. After this checking process was carried out for all GNSS stations, 3812 stations were remained, and these candidate GNSS stations went through a further selection—only those that were co-located with rainfall stations would be finally selected for constructing and testing the proposed new model for rainfall forecast.

2.1.3. Radiosonde

As a high-quality radiosonde data set, IGRA is archived by NCDC, in which the observations of geopotential height, pressure, and water vapor pressure profiles at surface, standard, significant, and tropopause levels are provided twice per day over about 1500 unevenly distributed stations around the world. The data were available from ftp://ftp.ncdc.noaa.gov/pub/data/igra/derived/derived-por/ (accessed on 20 July 2021). Based on the layered observations, PWV can be calculated by the following approximation to the definition of the integral [50], which are named RS-PWV:

P W V = \frac{1}{g} \sum_{P_{1}}^{P_{2}} q d P

(1)

where P is the pressure in unit of hPa; P₁ and P₂ are the pressure at the lowest and highest levels, respectively; g is the acceleration of gravity in unit of m/s²; q is the specific humidity, which can be calculated through the following formula:

q = \frac{0.622 \times P_{w}}{P - 0.378 P_{w}}

(2)

where P_w is the water vapor pressure in unit of hPa.

Although the sounding data had passed through a strict quality control process before its release, it is ideally not directly used in the evaluation of GNSS-PWV, as some pre-processing steps still need to be carried out before RS-PWV is calculated. The procedure for the data set to be processed in this study is as follows. First, the checking method adopted in [51] was applied to the data since there were some missing and/or gross data in the sounding data from IGRA. Second, the height of the lowest layer in a sounding profile is usually different from the height of its co-located GNSS station. In this case, the values of meteorological parameters in the profile need to be extrapolated or interpolated to the height of the GNSS station. More details about the extrapolation or interpolation can be found in [17]. After the two pre-processing steps were performed, RS-PWV was calculated by Formula (1) and then used as the reference to compare with GNSS-PWV for the accuracy evaluation of GNSS-PWV.

2.1.4. ERA5

As the latest reanalysis data, ERA5 was provided by the ECMWF and contained several meteorological parameters values. The data were similar to radiosonde data, but they are in the three-dimensional grid-based format. Thus, ERA5 data can be used to calculate the ZHD and T_m for any station over the globe. In this study, pressure, temperature, water vapor pressure, and geopotential height for the period from 2010 to 2020 were acquired from https://doi.org/10.24381/cds.bd0915c6 (accessed on 15 June 2021) [52], and they were used to calculate the ZHD and T_m by the following formulas:

Z H D = \sum_{h_{s}}^{h_{t}} (k_{1} \frac{P_{d}}{T} + k_{1} \frac{M_{w}}{M_{d}} \frac{P_{w}}{T}) d h,

(3)

T_{m} = \frac{\sum_{h_{s}}^{h_{t}} \frac{P_{w}}{T} d h}{\sum_{h_{s}}^{h_{t}} \frac{P_{w}}{T^{2}} d h},

(4)

where h is the height (unit: mm); h_s and h_t are the heights at the bottom and top levels, respectively; P_d is the pressure for dry air (unit: hPa, P_d = P

-

P_w); P_w is the pressure of water vapor; T is the temperature (unit: K); k₁, M_w, and M_d are constant with the values of 77.604 K/mbar, 18.0152 kg/kmol, and 28.9644 kg/kmol, respectively [12]. Geopotential height was provided by ERA5, which needs to be converted to geodetic height. The conversion contains two steps:

(1): Geopotential height H was converted to geoid height h by [17]:

h = \frac{g H}{R_{φ} g_{φ} - g H},

(5)

where g = 9.80665 m/s²;

R_{φ}

and

g_{φ}

can be calculated by:

R_{φ} = \frac{6378.137}{1.006803 - 0.006706 \times \sin {(φ)}^{2}},

(6)

g_{φ} = 9.780325 \cdot \sqrt{\frac{1 + 0.00193185 \times \sin {(φ)}^{2}}{1 - 0.00669435 \times \sin {(φ)}^{2}}},

(7)

where

φ

is the latitude of the station.

(2): The above geoid height was converted to geodetic height using the official earth gravitational model 2008 (EGM 2008) [53].

Although ERA5 data cannot be used for real-time retrieval of GNSS-PWV, they were used to calculate the ZHD and T_m, which were used to calculate accurate ZWD and the conversion factor for an improved GNSS-PWV, and this study mainly investigated the relationship between GNSS-PWV and rainfall and the value of GNSS-PWV for weather forecasting.

2.2. Methodology

2.2.1. Selection of Co-Located Stations and Pre-Processing

In Section 2.1, 3812 GNSS stations and 1109 rainfall stations were initially selected as both types of candidate stations, and they need to be further selected for co-located ones. In this section, the distance between each GNSS station and each rainfall station was calculated, and if the distance was under 5 km, then the two stations were regarded as a pair of co-located stations, which would be used to investigate the relationship between GNSS-PWV and rainfall. After this process was performed for all the above initial candidate stations, 160 pairs of co-located stations were identified.

The next task was to check noneffective data or missing data for the 160 pairs of stations by the following two steps: (1) a rainfall event was discarded if there existed missing data before the onset of the rainfall event or the onset of the rainfall could not be correctly identified and (2) a station was excluded if there were less than 100 rainfall events during the 2-year period from 2019 to 2020 (the testing period). The second step was to mainly identify the stations that had either missing data or little rainfall. This is because (1) the results at the stations where the missing data were too much may not be trustworthy and (2) drizzling rainfall mainly occurs at the stations where there is little rainfall, and PWV variation is not significant before the onset of drizzling rain. After the above two-step checking process was completed, 66 pairs of co-located stations remained (see Figure 1 for their geographic distribution), and data from these stations were selected for use in this study.

2.2.2. Obtaining of GNSS-PWV

GNSS-PWV is generally obtained through a conversion from the ZTD, which is directly estimated from GNSS data processing. The GNSS observations (in the RINEX format) for the 11-year period over the aforementioned 66 pairs of co-located stations were collected from the webs described in Section 2.1.2. Then, these data were processed using the Bernese V5.2, and the precise point positioning (PPP) approach and precise orbit products from the IGS were adopted. More details for the data processing strategies are listed in Table 1.

After the ZTD over a GNSS station was estimated, its corresponding GNSS-PWV was obtained using the following procedure:

(1): Calculate the ZHD and T_m using Formulas (3) and (4), respectively, based on ERA5 reanalysis data at each of the four grid points around the GNSS station. Then, the four ZHD and T_m values are used to interpolate for the position of the GNSS station based on the double-linear interpolation [15,17].
(2): Calculate the ZWD by subtracting the ZHD from ZTD.
(3): Calculate the conversion factor using the following formula [10]:

K = \frac{10^{6}}{ρ (\frac{k_{3}}{T_{m}} + k_{2}^{'}) R_{v}},

(8)

where

ρ

is the density of liquid water, and

ρ =

1000 kg/m³;

R_{v}

is the specific gas constant for water vapor, and

R_{v}

= 461.5 J/(kg∙K);

k_{2}^{'}

= 17 K/mbar, and

k_{3}

= 377,600 K²/mbar. These values are obtained from [12].

(4): Calculate GNSS-PWV by multiplying the ZWD with K.

Through the above four steps, GNSS-PWVs for the 11-year period studied at the 66 GNSS stations were obtained. Of the 11-year GNSS-PWV results, the GNSS-PWVs for the period from 2010 to 2018 were used to construct the improved rainfall forecasting model proposed in this study, while the GNSS-PWVs from 2019 to 2020 were used to validate the new model.

2.2.3. Evaluation of GNSS-PWV

In this section, 10 of the 66 GNSS stations selected in Section 2.2.1 were used to evaluate the accuracy of GNSS-PWV since a radiosonde station co-located with each of the 10 GNSS stations was available (two stations with a <20 km distance were defined co-located [22]). Table 2 shows the statistical results of GNSS-PWV over each station compared with the reference of the RS-PWV at the co-located radiosonde station for the 11-year period from 2010 to 2020. The last column RMS indicates the overall accuracy.

The results in Table 2 show GNSS-PWV agreed well with RS-PWV over all the 10 stations. The biases of the 10 GNSS-PWVs ranged from −1.7 mm to 0 mm with the mean of −0.4 mm; the RMSs ranged from 1.2 mm to 3.0 mm with the mean of 1.7 mm, and the RMSs of nine stations (i.e., all stations except for ZMA1) were less than 3 mm, which is the threshold for the accuracy of PWV for meteorological research [54]. At the ZMA1 station, its RMS was about 3 mm, and its bias was also the largest (−1.7 mm) among the 10 stations. The GNSS-PWV and RS-PWV time series over the ZMA1 station in the 7-month period from November 2013 to May 2014 rather than the whole 11-year period, merely for a clear illustration, are shown in Figure 2, in which both time series present similar variation trends. Since our study mainly focused on the variation trend in GNSS-PWV time series before the onset of rainfall events rather than its absolute values or biases over the ZMA1 station, the result of this station was thus still used in the study.

2.2.4. Criteria for Evaluation of Forecasting Results

In previous studies mentioned in Section 1, some researchers used GNSS-PWV and its derivatives to predict rainfall events. For example, the three-predictor method proposed by Yao [42] predicted rainfall events using three commonly used prediction factors, which are PWV value, PWV increase, and the rate of PWV increase. If these prediction factors reach certain values—the so-called thresholds—a rainfall event will be forecasted. If a forecast result is the same as the truth, then the forecast is a correct forecast; otherwise, it is a false forecast. Therefore, a forecast result may be one of the following four possible results: (1) TP: the forecast result is rainfall, and it really rains within the next 12 h; (2) FP: the forecast result is rainfall, but it does not rain within the next 12 h; (3) FN: the forecast result is non-rainfall, but it rains within the next 12 h; (4) TN: the forecast result is non-rainfall, and it does not rain within the next 12 h.

For evaluating the forecasting results, the schematic 2 × 2 contingency table (Table 3) was used to count the number of TP, FP, FN, and TN.

Through the notations shown in Table 3, the formulas for POD, FAR, TSS, and CSI are defined as follows:

P O D = \frac{T P}{T P + F N}

(9)

F A R = \frac{F P}{F P + T P}

(10)

C S I = \frac{T P}{T P + F P + F N}

(11)

T S S = \frac{T P}{T P + F N} + \frac{T N}{T N + F P} - 1

(12)

The determination of the thresholds for each of the prediction factors is very important for the performance of the forecast model. In previous studies, the thresholds were determined by testing a large number of candidate threshold values, and the two quantities—the probability of detection (POD) and false-alarm rate (FAR)—are commonly used for the determination of the thresholds. The threshold values that result in the suitable POD and FAR are identified as the optimal thresholds. However, with the increase in the threshold values, both POD and FAR decrease (see the blue and red lines in Figure 3, respectively); hence, it is difficult to determine the optimal thresholds based on the POD and FAR only. Therefore, in this study, for simplifying the determination of the optimal threshold for each prediction factor, a new method based on the TSS was proposed.

In [44], CSI was used as a criterion to determine the optimal thresholds for heavy rainfall, and the results were promising. For determining the optimal threshold for a prediction factor in a month at a station, the threshold value that leads to the maximum CSI value among all candidate threshold values is regarded as the optimal threshold for the prediction factor of the month and the station. However, this method may not suit all rainfall events. In this section, the criterion of CSI was tested for determining the optimal thresholds in all rainfall events rather than heavy rainfall events.

For a prediction factor (i.e., any one of the three prediction factors of PWV value, PWV increase, and maximum hourly PWV increase), several values were selected initially as candidate thresholds. Then, corresponding to each of them, a group of TP, FP, FN, and TN in Table 3 can be counted, and the criteria of POD, FAR, CSI, and TSS were calculated by Formulas (9)–(12), respectively. Figure 3 shows the relationship between the criteria and candidate thresholds for the three prediction factors in January at the 724988-94704-NYDV station, and Figure 3a–c shows the results of the PWV value, PWV increase, and maximum hourly PWV increase, respectively. In Figure 3a (PWV value), the maximum CSI (cyan square) value of 32% was from the threshold of 14 mm, and the threshold’s corresponding POD (blue square) and FAR (red square) values were 51% and 49%, respectively. In Figure 3b, the maximum CSI (cyan square) value of 33% was from the threshold of 3 mm, and the threshold’s POD (blue square) and FAR (red square) values were 58% and 56%, respectively. In Figure 3c, the optimal CSI (cyan square) value of 34% was from the threshold of 2.1 mm/h, and the threshold’s POD (blue square) and FAR (red square) values were 45% and 43%, respectively. For all the three prediction factors, the results shown in Figure 3 suggest that the optimal threshold determined by CSI only is too high, which caused a small POD value. Therefore, it seemed that the criterion of CSI may be only applicable for heavy rainfall events, and it performed poorly for all rainfall events.

In machine learning classification, TSS [55], which is also named the Youden index [56], is used as the most common criterion for selecting the optimal threshold in a receiver operating characteristic (ROC) curve analysis [57]. As a summary measure of the ROC curve [58], the criterion determines the optimal threshold from several candidate thresholds by finding the maximum TSS [59,60]. In this study, for determining the optimal threshold for each prediction factors, the criterion of TSS was also tested, and the results are also shown in Figure 3, where the green lines were the TSS values corresponding to the candidate thresholds. In Figure 3a (PWV value), the green triangle represents the maximum TSS of 60%, and the corresponding optimal threshold was 7.1 mm. At the optimal threshold, the POD (blue triangle) and FAR (red triangle) values were 85% and 82%, respectively. In Figure 3a, the optimal threshold determined by TSS was less than the one determined by CSI (cyan square), and POD (blue triangle) and FAR (red triangle) values with the maximum TSS were over the ones (blue square and red square) with the maximum CSI. For other two prediction factors (Figure 3b,c), similar results were found; i.e., the optimal threshold determined by TSS was less than the one determined by CSI for each prediction factor. Although the FAR values (red triangles) with the maximum TSS values were also larger than the ones (red squares) with the maximum CSI values when a single prediction factor was used to forecast rainfall events, they can be reduced by using all the three prediction factors. Therefore, in this study, the criterion of TSS was used to determine the threshold for each of the three prediction factors for simplifying the determination of the thresholds.

In this study, the forecast time window selected was 12 h, which is determined by two principles: (1) world meteorological organization (WMO) suggests that the time window of short-range forecasting is 0~12 h [61]; and (2) 3, 6, 9, and 12 h time windows were all tested in this study, and the mean TSSs were 51%, 62%, 67%, and 71%, respectively, which indicated that the 12 h time window was the best.

3. Construction of New Model

The new model for rainfall forecasting was constructed based on the data at the aforementioned 66 pairs of co-located stations over the 9-year period from 2010 to 2018. For the determination of the prediction factors, the relationship between GNSS-PWV and rainfall is shown in Figure 4 for a comparison between GNSS-PWV variation and rainfall events in June 2015 at the 726140-54742-VTD7 station as an example. It was obvious that GNSS-PWV increased rapidly to a large value before the onset of each rainfall event. Therefore, PWV value, PWV increase, and maximum hourly PWV increase were adopted as the three prediction factors; see Table 4 for their definition.

3.1. Determination of Thresholds

According to the investigation in Section 2.2.4, instead of CSI, TSS was used to determine the threshold for each prediction factor for each month based on the data of the month over the 9-year period from 2010 to 2018. The threshold for each of the three prediction factors for each month and each station was determined using the following procedure:

(1): For the prediction factor, 12 values for a 12 h period before the onset of a rainfall were calculated according to their definition in Table 4; e.g., in Figure 5, a rainfall event occurred at 17:00, and the 12 values for the prediction factor in each hour in Period-1 were calculated for determining the prediction value of the rainfall event. Among the 12 values, the maximum was selected as the prediction factor value of the rainfall event. By repeating this step, the values of all rainfall events in the month were obtained for the prediction factor, and all these values were sorted from the minimum value to the maximum value.

(2): The minimum value for the prediction factor was regarded as the minimum candidate threshold, and the value that was at the position of 80% of the sorted data series obtained from step (1) was adopted as the maximum candidate threshold. The candidate thresholds can be determined by taking one threshold at a fixed interval from the minimum one to the maximum one (the fixed intervals were 1 mm, 0.2 mm, and 0.1 mm/h for PWV value, PWV increase, and maximum hourly PWV increase, respectively).
(3): For each of the above candidate thresholds, TP, FP, FN, and TN in Table 2 were counted in the month at the co-located station, and the TSS was calculated by Formula (12). Among all the candidate thresholds, the threshold that resulted in the largest TSS was determined as the optimal threshold and was used in the rainfall forecast.

Through the above three steps, the optimal threshold for a prediction factor for a month and a station was determined. For example, Figure 6 shows the PWV values before the onset of the 66 rainfall events in the month of January at the 726166-54781-VTBE station. The PWV values ranged from 4.9 mm (red triangle) to 33.4 mm (red square), and the value at the position of 80% (red dot) of the sorted data series was 16.9 mm. Therefore, the candidate thresholds for the PWV value of January of the station were selected from 4.9 mm to 16.9 mm with an interval of 1 mm. Then, for each of the candidate thresholds, TP, FP, FN, and TN shown in Table 3 were counted, and their corresponding TSS, POD, FAR, and CSI were calculated through the Formulas (9)–(12), respectively, and the results are shown in Table 5. In the table, the maximum TSS was 64.3%, and its corresponding threshold was 6.9 mm. Thus, the threshold of PWV value was 6.9 mm in January at the 726166-54781-VTBE station. Since there was one optimal threshold for a prediction factor for a month and a station, for the three prediction factors of 12 months at 66 stations, a total of 2376 optimal thresholds were determined.

3.2. Using All Prediction Factors Together

In Table 5, for a single prediction factor (PWV value), the POD resulting from the optimal threshold was 92.4%, but its corresponding FAR was as large as 81.3%. Thus, it needs to consider all the three prediction factors rather than only a single prediction factor. In this section, six strategies formed by satisfying different conditions from the three prediction factors were tested; see Table 6.

To identify the optimal strategy, the TSS values resulting from the six strategies were all calculated and results are shown in Figure 7. One can see that all the six results had a similar location dependence, i.e., the TSS values at the stations located in the eastern USA were larger than the ones located in the western USA. This may be due to the fact that the rainfall was more in the eastern region, which will be discussed in the next section.

Figure 7 also shows that the worst result is S1, and its TSS values ranged from 36% to 67% with the mean of 56%, mainly caused by the large FP. The TSS values of S2 ranged from 50% to 81% with the mean of 71%, which was slightly better than the ones in Subfigure S5. The results in Subfigure S3 were similar to S4. Moreover, the order of the results from the optimal to the worst was S2, S5, S6, S3, S4, and S1 (see the mean values). As a result, the strategy of S2 was adopted for the forecasting model.

According to Section 3.1 and Section 3.2, a new forecasting model was constructed using PWV values, PWV increase, and maximum hourly PWV increase as the prediction factors of the model, and the optimal thresholds for the three prediction factors were determined by the TSS values. A rainfall event would be predicted if any two of the prediction factors are over their optimal thresholds.

4. Results

The forecasting model developed in Section 3 was evaluated using the data at the 66 pairs of co-located stations for the 2-year period from 2019 to 2020, which was out-of-sample data. It is worthwhile mentioning that the forecast was valid for a 12 h time window; i.e., if a rainfall event was predicted when any two of the three prediction factors were over their optimal thresholds at a forecasting moment, and a rainfall event occurred within the next 12 h after the forecasting moment, then the prediction was the truth.

Figure 8a,b shows the POD and FAR values at the 66 rainfall stations, respectively. The POD values ranged from 73% to 97% with the mean of 87%, and the FAR values ranged from 27% to 77% with the mean of 53%. In terms of the mean POD and FAR values, the new model outperformed the ones in [38] with the POD value of 80% and FAR value of 66%. In addition, both POD and FAR values were location-dependent on the station. Specifically, the POD values increased from the west to the east. Of the 66 stations, only four stations had POD values under 80%, and all the four stations were located in the western (longitude < 114°W) USA. For the FAR result, the values at those stations that are located in the Alaska and the eastern USA were better than the ones located in the middle and western USA. There were 18 stations whose FAR values were over 60%, and all the stations were located in the middle and western USA. These results seemed to be dependent on the distribution of climate zones in the USA since all stations with bad results were located in arid regions.

For further investigation on the relationship between the two criteria of POD and FAR and rainfall, annual mean rainfall at each of the 66 stations was calculated, and results are shown in Figure 9. We can see that the annual mean rainfall at these stations ranged from 186 mm to over 2000 mm, and the annual mean rainfall values at the stations located in the middle and western USA were less than the ones located in the eastern USA, which was consistent with the results of POD and FAR. At most of the 18 stations where the FAR values were over 60%, their annual mean rainfall values were less than 1000 mm. This may be because very few rainfall events occurred at these stations, and their corresponding TP values in Formulas (9) and (12) were also small. Then, the 66 stations were separated into three groups from west to east according to the annual mean rainfall in Figure 9. The three groups were located in the western, middle, and eastern USA, with longitudes 180~100°W, 100~90°W, and 90~60°W, respectively. The mean POD and FAR values for each group calculated were shown in Table 7. From the table, the mean POD value in the west was the least, and the mean POD values in the other two groups were almost the same. However, the mean FAR in the east was the least. In summary, the new model developed in this study performed better in the eastern region than the middle and western regions.

5. Conclusions

In this study, an improved rainfall forecasting model was developed using GNSS-PWV time series and rainfall data for the 9-year period from 2010 to 2018 at 66 pairs of co-located stations in the USA. First, the raw data were pre-processed for quality assurance and identification of co-located data in both spatial and temporal domains. The GNSS-PWV at the 66 stations were retrieved using GNSS observations and also the aid of ERA5 reanalysis data. Since 10 GNSS stations were found co-located with local radiosonde stations, the GNSS-PWV from these stations were used to evaluate the data quality by comparing them with RS-PWV. Next, the criteria of CSI and TSS were tested for the determination of the thresholds for each of the three prediction factors of the forecasting model, and it was found that TSS outperformed CSI. As a result, the criterion of TSS was adopted to determine the threshold for each prediction factor for each month and each station. In the last step, six strategies formed by satisfying different conditions from the three prediction factors were investigated for the selection of the optimal strategy for rainfall forecasting. In the new, improved model for forecasting rainfall, a rainfall event was forecasted if any two of the prediction factors were over their optimal thresholds.

The new model was evaluated using the data in the 2-year period from 2019 to 2020 at the aforementioned 66 stations, and POD and FAR were used as the criteria of the evaluation. The POD values ranged from 73% to 97% with the mean of 87%, and the FAR values ranged from 27% to 77% with the mean of 53%. The results were acceptable by referring to the ones of same-type, three-factor model proposed by Yao, whose POD and FAR were about 80% and 66%, respectively. This means that the TSS method for determining the optimal threshold outperforms the empirical method adopted in previous studies, and the mean FAR of 53% further corroborated the fact that the formation of a rainfall event is a complex progress. In other words, this also illustrated the main limitations of this method; i.e., with the usage of PWV only (based on the threshold-based model), rainfall events may not be adequately detected. Even so, the new model developed based on GNSS-PWV and its derivatives still has the ability to be used as an alternative or complementary method to detect rainfall events, especially in small-scale regions, since it has the benefits of high POD scores, simple usage, low cost, application to all weather, etc. In addition, it was also found that both POD and FAR values were related to the location of the station. Specifically, the results at the stations that are located in the eastern USA were better than the ones in the middle and western USA, which was consistent with climate zones. From the annual mean rainfall at each station and the POD and FAR values, it was found that the new model in humid regions outperformed arid regions. The study simplified the determination of the thresholds for the prediction factors and validated the model constructed based on GNSS-PWV for rainfall forecast.

Author Contributions

Conceptualization, L.L., S.W. and K.Z.; methodology, L.L., H.L. and A.H.; software, L.L. and M.Z.; validation, L.L., X.W. and S.W.; formal analysis, L.L.; investigation, L.L. and E.F.; resources, L.L.; data curation, L.L. and W.L.; writing—original draft preparation, L.L.; writing—review and editing, L.L., S.W. and K.Z.; visualization, L.L. and Z.S.; supervision, K.Z. and S.W.; project administration, K.Z.; funding acquisition, K.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This work was funded by the National Natural Science Foundation of China (41730109, 41874040) and the Independent Innovation Project of “Double-First Class” Construction (2022ZZCX06).

Data Availability Statement

ISD data can be obtained from https://www.ncei.noaa.gov/data/global-hourly/ (accessed on 14 January 2021). GNSS data can be collected from https://geodesy.noaa.gov/corsdata/ (accessed on 21 April 2022), https://data.unavco.org/archive/gnss/ (accessed on 21 April 2022) and http://garner.ucsd.edu/pub/ (accessed on 21 April 2022). IGRA data can be downloaded from ftp://ftp.ncdc.noaa.gov/pub/data/igra/derived/derived-por/ (accessed on 20 July 2021). ERA5 data can be freely available from https://doi.org/10.24381/cds.bd0915c6 (accessed on 15 June 2021).

Acknowledgments

The authors would like to thank NGS, UNAVCO, and SOPAC for providing GNSS observations; IGS for precision orbit products; ECMWF for ERA5 reanalysis data; and NCDC for radiosonde data and rainfall data.

Conflicts of Interest

The authors declare no conflict of interest.

References

Allan, R.P. The role of water vapour in earth’s energy flows. Surv. Geophys. 2012, 33, 557–564. [Google Scholar] [CrossRef]
Singh, R.P.; Dey, S.; Sahoo, A.K.; Kafatos, M. Retrieval of water vapor using SSM/I and its relation with the onset of monsoon. Ann. Geophys. 2004, 22, 3079–3083. [Google Scholar] [CrossRef]
Zhang, K.; Manning, T.; Wu, S.; Rohm, W.; Silcock, D.; Choy, S. Capturing the signature of severe weather events in Australia using GPS measurements. IEEE J. Sel. Topics Appl. Earth Obs. Remote Sens. 2015, 8, 1839–1847. [Google Scholar] [CrossRef]
Chung, E.; Soden, B.; Sohn, B.J.; Shi, L. Upper-tropospheric moistening in response to anthropogenic warming. Proc. Natl. Acad. Sci. USA 2014, 111, 11636–11641. [Google Scholar] [CrossRef]
Bock, O.; Parracho, A. Consistency and representativeness of integrated water vapour from ground-based GPS observations and ERA-Interim reanalysis. Atmos. Chem. Phys. 2019, 19, 9453–9468. [Google Scholar] [CrossRef]
Li, X.; Dick, G.; Lu, C.; Ge, M.; Nilsson, T.; Ning, T.; Wickert, J.; Schuh, H. Multi-GNSS meteorology: Real-time retrieving of atmospheric water vapor from BeiDou, Galileo, GLONASS, and GPS observations. IEEE Trans. Geosci. Remote Sens. 2015, 53, 6385–6393. [Google Scholar] [CrossRef]
Wang, J.; Zhang, L. Climate applications of a global, 2-hourly atmospheric precipitable water dataset derived from IGS tropospheric products. J. Geod. 2009, 83, 209–217. [Google Scholar] [CrossRef]
Baltink, H.K.; Marel, H.; Hoeven, A.G.A. Integrated atmospheric water vapor estimates from a regional GPS network. J. Geophys. Res. Atmos. 2002, 107, ACL-3. [Google Scholar] [CrossRef]
Rocken, C.; Hove, T.V.; Johnson, J.; Solheim, F.; Ware, R.; Bevis, M.; Chiswell, S.; Businger, S. GPS/Storm—GPS sensing of atmospheric water vapor for meteorology. J. Atmos. Ocean. Technol. 1995, 12, 468–478. [Google Scholar] [CrossRef]
Bevis, M.; Businger, S.; Herring, T.A.; Rocken, C.; Anthes, R.A.; Ware, R.H. GPS meteorology: Remote sensing of atmospheric water vapor using the global positioning system. J. Geophys. Res. Atmos. 1992, 97, 15787–15801. [Google Scholar] [CrossRef]
Zhao, Q.; Liu, Y.; Ma, X.; Yao, W.; Yao, Y.; Li, X. An improved rainfall forecasting model based on GNSS observations. IEEE Trans. Geosc. Remote Sens. 2020, 58, 4891–4900. [Google Scholar] [CrossRef]
Davis, J.L.; Herring, T.A.; Shapiro, I.I.; Rogers, A.; Elgered, G. Geodesy by radio interferometry: Effects of atmospheric modeling errors on estimates of baseline length. Radio Sci. 1985, 20, 1593–1607. [Google Scholar] [CrossRef]
Bevis, M. GPS meteorology: Mapping zenith wet delays onto precipitable water. J. Appl. Meteorol. 1994, 33, 379–386. [Google Scholar] [CrossRef]
Zhang, W.; Zhang, H.; Liang, H.; Lou, Y.; Cai, Y.; Cao, Y.; Zhou, Y.; Liu, W. On the suitability of ERA5 in hourly GPS precipitable water vapor retrieval over China. J. Geod. 2019, 93, 1897–1909. [Google Scholar] [CrossRef]
Wang, X.; Zhang, K.; Wu, S.; He, C.; Cheng, Y.; Li, X. Determination of zenith hydrostatic delay and its impact on GNSS-derived integrated water vapor. Atmos. Meas. Tech. 2017, 10, 2807–2820. [Google Scholar] [CrossRef]
Jiang, P.; Ye, S.; Chen, D.; Liu, Y.; Xia, P. Retrieving precipitable water vapor data using GPS zenith delays and global reanalysis data in China. Remote Sens. 2016, 8, 389. [Google Scholar] [CrossRef]
Wang, X.; Zhang, K.; Wu, S.; Fan, S.; Cheng, Y. Water vapor—Weighted mean temperature and its impact on the determination of precipitable water vapor and its linear trend. J. Geophys. Res. Atmos. 2016, 121, 833–852. [Google Scholar] [CrossRef]
Sun, P.; Wu, S.; Zhang, K.; Wan, M.; Wang, R. A new global grid-based weighted mean temperature model considering vertical nonlinear variation. Atmos. Meas. Tech. 2021, 14, 2529–2542. [Google Scholar] [CrossRef]
Ding, M. A second generation of the neural network model for predicting weighted mean temperature. GPS Solut. 2020, 24, 61. [Google Scholar] [CrossRef]
Jade, S.; Shrungeshwara, T.S.; Anil, B. Water vapor study using MODIS and GPS data at 64 continuous GPS stations (2002—2017) in Indian subcontinent. J. Atmos. Sol.-Terr. Phys. 2019, 196, 105138. [Google Scholar] [CrossRef]
Chen, B.; Dai, W.; Liu, Z.; Wu, L.; Xia, P. Assessments of GMI-derived precipitable water vapor products over the south and east China seas using radiosonde and GNSS. Adv. Meteorol. 2018, 2018, 7161328. [Google Scholar] [CrossRef]
Gui, K.; Che, H.; Chen, Q.; Zeng, Z.; Liu, H.; Wang, Y.; Zheng, Y.; Sun, T.; Liao, T.; Wang, H.; et al. Evaluation of radiosonde, MODIS-NIR-Clear, and AERONET precipitable water vapor using IGS ground-based GPS measurements over China. Atmos. Res. 2017, 197, 461–473. [Google Scholar] [CrossRef]
Namaoui, H.; Kahlouche, S.; Belbachir, A.H.; Van Malderen, R.; Brenot, H.; Pottiaux, E. GPS water vapor and its comparison with radiosonde and ERA-Interim data in Algeria. Adv. Atmos. Sci. 2017, 34, 623–634. [Google Scholar] [CrossRef]
He, Q.; Zhang, K.; Wu, S.; Zhao, Q.; Wang, X.; Shen, Z.; Li, L.; Wan, M.; Liu, X. Real-time GNSS-derived PWV for typhoon characterizations: A case study for super typhoon Mangkhut in Hong Kong. Remote Sens. 2020, 12, 104. [Google Scholar] [CrossRef]
He, Q.; Shen, Z.; Wan, M.; Li, L. Precipitable water vapor converted from GNSS-ZTD and ERA5 datasets for the monitoring of tropical cyclones. IEEE Access 2020, 8, 87275–87290. [Google Scholar] [CrossRef]
Bonafoni, S.; Biondi, R.; Brenot, H.; Anthes, R. Radio occultation and ground-based GNSS products for observing, understanding and predicting extreme events: A review. Atmos. Res. 2019, 230, 104624. [Google Scholar] [CrossRef]
Zhao, Q.; Ma, X.; Yao, W.; Yao, Y. A new typhoon-monitoring method using precipitation water vapor. Remote Sens. 2019, 11, 2845. [Google Scholar] [CrossRef]
Vedel, H.; Huang, X.Y.; Haase, J.; Ge, M.; Calais, E. Impact of GPS zenith tropospheric delay data on precipitation forecasts in Mediterranean France and Spain. Geophys. Res. Lett. 2004, 31, L02102. [Google Scholar] [CrossRef]
Wang, X.; Zhang, K.; Wu, S.; Li, Z.; Cheng, Y.; Li, L.; Yuan, H. The correlation between GNSS-derived precipitable water vapor and sea surface temperature and its responses to El Niño—Southern Oscillation. Remote Sens. Environ. 2018, 216, 1–12. [Google Scholar] [CrossRef]
Zhao, Q.; Yao, Y.; Yao, W.Q.; Li, Z. Near-global GPS-derived PWV and its analysis in the El Niño event of 2014–2016. J. Atmos. Sol.-Terr. Phys. 2018, 179, 69–80. [Google Scholar] [CrossRef]
Zhu, D.; Zhang, K.; Yang, L.; Wu, S.; Li, L. Evaluation and calibration of MODIS near-infrared precipitable water vapor over China using GNSS observations and ERA-5 reanalysis dataset. Remote Sens. 2021, 13, 2761. [Google Scholar] [CrossRef]
Cao, Y.; Fang, Z.; Li, C.; Wang, Y. Test on monitoring meso-and micro-scale precipitation in Beijing using GPS data and satellite image. Plateau Meteorol. 2005, 24, 91–96. [Google Scholar]
Champollion, C.; Masson, F.; Van Baelen, J.; Walpersdorf, A.; Chéry, J.; Doerflinger, E. GPS monitoring of the tropospheric water vapor distribution and variation during the 9 September 2002 torrential precipitation episode in the Cévennes (southern France). J. Geophys. Res. Atmos. 2004, 109, D24102. [Google Scholar] [CrossRef]
Muller, C.J.; Back, L.E.; O’Gorman, P.A.; Emanuel, K.A. A model for the relationship between tropical precipitation and column water vapor. Geophys. Res. Lett. 2009, 36, L16804. [Google Scholar] [CrossRef]
Barindelli, S.; Realini, E.; Venuti, G.; Fermi, A.; Gatti, A. Detection of water vapor time variations associated with heavy rain in northern Italy by geodetic and low-cost GNSS receivers. Earth Planets Space 2018, 70, 28. [Google Scholar] [CrossRef]
Guo, Y.; Wang, J.; Wang, H.; Long, L.; Han, F. Indication of GPS precipitable water vapor for rainstorm forecast in Hubei. Meteorol. Sci. Technol. 2015, 43, 666–674. [Google Scholar]
Shi, C.; Zhang, W.; Cao, Y.; Lou, Y.; Liang, H.; Fan, L.; Satirapod, C.; Trakolful, C. Atmospheric water vapor climatological characteristics over Indo-China region based on BeiDou/GNSS and relationships with precipitation. Acta Geod. Cartogr. Sinica 2020, 49, 1112. [Google Scholar]
Sapucci, L.F.; Machado, L.A.T.; de Souza, E.M.; Campos, T.B. Global positioning system precipitable water vapour (GPS-PWV) jumps before intense rain events: A potential application to nowcasting. Meteorol. Appl. 2019, 26, 49–63. [Google Scholar] [CrossRef]
Kunkel, K.E.; Stevens, S.E.; Stevens, L.E.; Karl, T.R. Observed climatological relationships of extreme daily precipitation events with precipitable water and vertical velocity in the contiguous united states. Geophys. Res. Lett. 2020, 47, e2019GL086721. [Google Scholar] [CrossRef]
Zhao, Q.; Ma, X.; Yao, Y. Preliminary result of capturing the signature of heavy rainfall events using the 2-d-/4-d water vapour information derived from GNSS measurement in Hong Kong. Adv. Space Res. 2020, 66, 1537–1550. [Google Scholar] [CrossRef]
Benevides, P.; Catalao, J.; Miranda, P.M.A. On the inclusion of GPS precipitable water vapour in the nowcasting of rainfall. Nat. Hazards Earth Syst. Sci. 2015, 15, 2605–2616. [Google Scholar] [CrossRef]
Yao, Y.; Shan, L.; Zhao, Q. Establishing a method of short-term rainfall forecasting based on GNSS-derived PWV and its application. Sci. Rep. 2017, 7, 12465. [Google Scholar] [CrossRef] [PubMed]
Manandhar, S.; Lee, Y.H.; Meng, Y.S. GPS-PWV based improved long-term rainfall prediction algorithm for tropical regions. Remote Sens. 2019, 11, 2643. [Google Scholar] [CrossRef] [Green Version]
Li, H.; Wang, X.; Wu, S.; Zhang, K.; Chen, X.; Qiu, C.; Zhang, S.; Zhang, J.; Xie, M.; Li, L. Development of an improved model for prediction of short-term heavy precipitation based on GNSS-derived PWV. Remote Sens. 2020, 12, 4101. [Google Scholar] [CrossRef]
Manandhar, S.; Lee, Y.H.; Meng, Y.S.; Yuan, F.; Ong, J.T. GPS-derived PWV for rainfall nowcasting in tropical region. IEEE Trans. Geosci. Remote Sens. 2018, 56, 4835–4844. [Google Scholar] [CrossRef]
Li, H.; Wang, X.; Wu, S.; Zhang, K.; Chen, X.; Zhang, J.; Qiu, C.; Zhang, S.; Li, L. An improved model for detecting heavy precipitation using GNSS-derived zenith total delay measurements. IEEE J. Sel. Topics Appl. Earth Obs. Remote Sens. 2021, 14, 5392–5405. [Google Scholar] [CrossRef]
Li, H.; Wang, X.; Wu, S.; Zhang, K.; Fu, E.; Xu, Y.; Qiu, C.; Zhang, J.; Li, L. A new method for determining an optimal diurnal threshold of GNSS precipitable water vapor for precipitation forecasting. Remote Sens. 2021, 13, 1390. [Google Scholar] [CrossRef]
Zhao, Q.; Liu, Y.; Yao, W.; Yao, Y. Hourly rainfall forecast model using supervised learning algorithm. IEEE Trans. Geosci. Remote Sens. 2022, 60, 4100509. [Google Scholar] [CrossRef]
Smith, A.; Lott, N.; Vose, R. The integrated surface database: Recent developments and partnerships. Bull. Am. Meteorol. Soc. 2011, 92, 704–708. [Google Scholar] [CrossRef]
Zhang, Q.; Ye, J.; Zhang, S.; Han, F. Precipitable water vapor retrieval and analysis by multiple data sources: Ground-based GNSS, radio occultation, radiosonde, microwave satellite, and NWP reanalysis data. J. Sens. 2018, 2018, 3428303. [Google Scholar] [CrossRef]
Li, L.; Wu, S.; Zhang, K.; Wang, X.; Li, W.; Shen, Z.; Zhu, D.; He, Q.; Wan, M. A new zenith hydrostatic delay model for real-time retrievals of GNSS-PWV. Atmos. Meas. Tech. 2021, 14, 6379–6394. [Google Scholar] [CrossRef]
Hersbach, H.; Bell, B.; Berrisford, P.; Biavati, G.; Horányi, A.; Muñoz Sabater, J.; Nicolas, J.; Peubey, C.; Radu, R.; Rozum, I.; et al. ERA5 hourly data on pressure levels from 1979 to present. Copernicus Climate Change Service (C3S) Climate Data Store (CDS). 2018. Available online: https://cds.climate.copernicus.eu/cdsapp#!/dataset/10.24381/cds.bd0915c6?tab=overview (accessed on 5 July 2022).
Pavlis, N.K.; Holmes, S.A.; Kenyon, S.C.; Factor, J.K. The development and evaluation of the earth gravitational model 2008 (EGM2008). J. Geophys. Res. Solid Earth 2012, 117, B04406. [Google Scholar] [CrossRef]
Offiler, D.; Jones, J.; Bennit, G.; Vedel, H. EIG EUMETNET GNSS Water Vapour Programme (E-GVAP-II); Product Requirements Document; MetOffice: Brussel, Belgium, 2010.
Allouche, O.; Tsoar, A.; Kadmon, R. Assessing the accuracy of species distribution models: Prevalence, kappa and the true skill statistic (TSS). J. Appl. Ecol. 2006, 43, 1223–1232. [Google Scholar] [CrossRef]
Youden, W.J. Index for rating diagnostic tests. Cancer 1950, 3, 32–35. [Google Scholar] [CrossRef]
Bantis, L.E.; Nakas, C.T.; Reiser, B. Construction of confidence regions in the ROC space after the estimation of the optimal Youden index-based cut-off point. Biometrics 2014, 70, 212–223. [Google Scholar] [CrossRef]
Schisterman, E.F.; Faraggi, D.; Reiser, B.; Hu, J. Youden index and the optimal threshold for markers with mass at zero. Stat. Med. 2008, 27, 297–315. [Google Scholar] [CrossRef]
Berrar, D. Performance measures for binary classification. Encycl. Bioinf. Comput. Biol. 2019, 1, 546–560. [Google Scholar]
Ruopp, M.D.; Perkins, N.J.; Whitcomb, B.W.; Schisterman, E.F. Youden index and optimal cut-point estimated from observations affected by a lower limit of detection. Biom. J. J. Math. Methods Biosci. 2008, 50, 419–430. [Google Scholar] [CrossRef] [Green Version]
Schmid, F.; Wang, Y.; Harou, A. Guidelines for Nowcasting Techniques. World Meteorological Organization. 2017. Available online: https://library.wmo.int/doc_num.php?explnum_id=3795 (accessed on 1 August 2022).

Figure 1. Distribution of stations used in this study. The red quadrangles, cyan circles, and blue asterisks stand for radiosonde, GNSS, and rainfall stations, respectively.

Figure 2. GNSS-PWV (blue curves) and RS-PWV (red dots) time series over the ZMA1 station in the 7-month period from November 2013 to May 2014.

Figure 3. Percentage of POD, FAR, CSI, and TSS candidate thresholds at the 724988-94704-NYDV station in January over the period of 2010–2018. (a–c) The results of PWV value, PWV increase, and maximum hourly PWV increase, respectively. The triangles and squares are the results determined by TSS and CSI, respectively.

Figure 4. Variation in GNSS-PWV before rainfall events in June 2015 at the 726140-54742-VTD7 station.

Figure 5. Using the 12-h period before a rainfall event to determine the prediction factor value.

Figure 6. PWV values before onset of 66 rainfall events in January of a 9-year period at 726166-54781-VTBE station. The blue line represents the PWV values from the minimum value to the maximum value before onset of the 66 rainfall events. The red triangle, dot, and quadrangle represent the minimum value, the value at the position of 80%, and the maximum value, respectively.

Figure 7. TSS values at the 66 stations resulting from the six strategies S1, S2, S3, S4, S5, and S6 defined in Table 6.

Figure 8. (a) POD and (b) FAR at 66 rainfall stations from the data for the 2-year period from 2019 to 2020.

Figure 9. Annual mean rainfall of the 66 rainfall stations.

Table 1. Strategies adopted for GNSS data processing.

Parameter Name	Selection
Satellite	GPS
Processing method	PPP
Orbit and clock	IGS final products
A priori ZTD model	GPT + Saastamoinen model
Mapping function	Global mapping function (GMF)
Elevation cutoff	3°
Ionospheric delay	Ionosphere-free combination
Ocean tidal loading	FES2004
Elevation-dependent weight	cos(z)
Correction of antenna phase center variations	I14
Temporal resolution of ZTD	1 h

Table 2. Statistical results of GNSS-PWV over each GNSS station during the 11-year period studied using RS-PWV over the co-located radiosonde station as the reference.

GNSS Sta.	Lat (°)	Lon (°)	Ele (m)	Co-Located Radiosonde Sta.	Lat (°)	Lon (°)	Ele (m)	Dis (km)	Bias (mm)	STD (mm)	RMS (mm)
AC24	58.68	−156.65	36.4	USM00070326	58.68	−156.67	8.4	0.9	0.0	1.3	1.3
AC58	57.16	−170.22	19.4	USM00070308	57.15	−170.22	10.0	0.7	−0.1	1.3	1.3
AL35	33.17	−86.76	146.4	USM00072230	33.18	−86.78	174.0	2.8	0.0	2.3	2.3
ANC2	61.18	−149.98	57.9	USM00070273	61.16	−149.99	52.1	2.1	−0.4	1.2	1.2
BET1	60.79	−161.84	51.8	USM00070219	60.79	−161.84	33.0	0.3	−0.4	1.2	1.3
BRW1	71.28	−156.79	15.1	USM00070026	71.29	−156.78	11.9	0.7	−0.6	1.0	1.2
MIGD	45.03	−84.64	368.0	USM00072634	44.91	−84.72	447.7	14.5	−0.2	1.5	1.5
NCBE	34.72	−76.67	−29.3	USM00072305	34.78	−76.88	11.0	19.8	−0.7	2.2	2.3
P401	47.94	−124.56	36.2	USM00072797	47.93	−124.56	56.8	0.4	−0.4	1.5	1.5
ZMA1	25.82	−80.32	−8.0	USM00072202	25.75	−80.38	4.3	10.1	−1.7	2.5	3.0

Note: (1) Lat is latitude, (2) Lon is longitude, (3) Ele is Elevation, (4) Dis is the distance between the pair of co-located stations, and (5) Sta. is station.

Table 3. Schematic 2 × 2 contingency table for predicted events and observed events.

Truth		Forecast		Total
Truth		Yes	No	Total
Observed	Yes	TP	FN	TP + FN
Observed	No	FP	TN	FP + TN
Total		TP + FP	FN + TN	TP + FP + FN + TN

Table 4. Definition of the three prediction factors.

Prediction Factor	Definition	Remark
PWV value	Hourly GNSS-PWV value	Proposed by Yao et al. [42]
PWV increase	Maximum PWV before rainfall— Minimum PWV before the adjacent maximum PWV	Proposed by Yao et al. [42]
Maximum hourly PWV increase	Maximum hourly PWV increase between maximum and minimum PWVs before rainfall	Proposed in this study

Table 5. Statistic result for each candidate threshold of the prediction factor of PWV value in January at the 726166-54781-VTBE station.

Candidate Thresholds (mm)	TP	FP	FN	TN	TSS (%)	FAR (%)	POD (%)	CSI (%)
4.9	66	481	0	460	48.9	87.9	100.00	12.1
5.9	62	361	4	580	55.6	85.3	93.9	14.5
6.9	61	265	5	676	64.3	81.3	92.4	18.4
7.9	53	152	13	789	64.2	74.2	80.3	24.3
8.9	46	94	20	847	59.7	67.1	69.7	28.8
9.9	44	69	22	872	59.3	61.1	66.7	32.6
10.9	37	53	29	888	50.4	58.9	56.1	31.1
11.9	33	40	33	901	45.8	54.8	50.0	31.1
12.9	30	31	36	910	42.2	50.8	45.5	30.9
13.9	29	23	37	918	41.5	44.2	43.9	32.6
14.9	28	15	38	926	40.8	34.9	42.4	34.6
15.9	22	14	44	927	31.9	38.9	33.3	27.5
16.9	18	14	48	927	25.8	43.8	27.3	22.5

Table 6. Six strategies formed by satisfying different conditions from the three prediction factors.

Strategy	Condition
S1	PWV > TH1 or Inc > TH2 or Inr > TH3
S2	(PWV > TH1 and Inc > TH2) or (PWV > TH1 and Inr > TH3) or (Inc > TH2 and Inr > TH3)
S3	PWV > TH1 and Inc > TH2 and Inr > TH3
S4	PWV > TH1 or (Inc > TH2 and Inr > TH3)
S5	Inc > TH2 or (PWV > TH1 and Inr > TH3)
S6	Inr > TH3 or (PWV > TH1 and Inc > TH2)

Note: PWV, Inc, and Inr denote PWV value, PWV increase, and maximum hourly PWV increase, respectively; TH1, TH2, and TH3 denote their optimal thresholds.

Table 7. Mean POD and FAR values in each of the three parts that were separated according to the annual mean rainfall.

Longitude	Number of Stations	Mean POD	Mean FAR
180~100°W	21	83%	57%
100~90°W	16	90%	60%
90~60°W	30	89%	47%

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Li, L.; Zhang, K.; Wu, S.; Li, H.; Wang, X.; Hu, A.; Li, W.; Fu, E.; Zhang, M.; Shen, Z. An Improved Method for Rainfall Forecast Based on GNSS-PWV. Remote Sens. 2022, 14, 4280. https://doi.org/10.3390/rs14174280

AMA Style

Li L, Zhang K, Wu S, Li H, Wang X, Hu A, Li W, Fu E, Zhang M, Shen Z. An Improved Method for Rainfall Forecast Based on GNSS-PWV. Remote Sensing. 2022; 14(17):4280. https://doi.org/10.3390/rs14174280

Chicago/Turabian Style

Li, Longjiang, Kefei Zhang, Suqin Wu, Haobo Li, Xiaoming Wang, Andong Hu, Wang Li, Erjiang Fu, Minghao Zhang, and Zhen Shen. 2022. "An Improved Method for Rainfall Forecast Based on GNSS-PWV" Remote Sensing 14, no. 17: 4280. https://doi.org/10.3390/rs14174280

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

An Improved Method for Rainfall Forecast Based on GNSS-PWV

Abstract

1. Introduction

2. Data and Methodologies

2.1. Data

2.1.1. ISD

2.1.2. GNSS

2.1.3. Radiosonde

2.1.4. ERA5

2.2. Methodology

2.2.1. Selection of Co-Located Stations and Pre-Processing

2.2.2. Obtaining of GNSS-PWV

2.2.3. Evaluation of GNSS-PWV

2.2.4. Criteria for Evaluation of Forecasting Results

3. Construction of New Model

3.1. Determination of Thresholds

3.2. Using All Prediction Factors Together

4. Results

5. Conclusions

Author Contributions

Funding

Data Availability Statement

Acknowledgments

Conflicts of Interest

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI