Article

Estimation of the Peak over Threshold-Based Design Rainfall and Its Spatial Variability in the Upper Vistula River Basin, Poland

by Katarzyna Kołodziejczyk 1,*,† and Agnieszka Rutkowska 2,†

1 Faculty of Environmental and Energy Engineering, Cracow University of Technology, Warszawska 24, 31-155 Cracow, Poland
2 Department of Applied Mathematics, Faculty of Environmental Engineering and Land Surveying, University of Agriculture in Krakow, Balicka 253 C, 30-198 Cracow, Poland
* Author to whom correspondence should be addressed.
† These authors contributed equally to this work.
Water 2023, 15(7), 1316; https://doi.org/10.3390/w15071316
Submission received: 27 February 2023 / Revised: 21 March 2023 / Accepted: 23 March 2023 / Published: 27 March 2023
(This article belongs to the Special Issue Statistical Analysis in Hydrology: Methods and Applications)

Abstract

The proper assessment of design rainfalls with long return periods is very important because they are inputs for many flood studies. In this paper, daily design rainfall totals are estimated for 16 meteorological stations located in the Upper Vistula River Basin (UVB), Poland. The study material consists of historical series of daily rainfall totals from the period 1960–2021. The peak over threshold (POT) method is used, and the rainfall depth over threshold is assumed to follow the generalized Pareto distribution (GPD) with parameters estimated from Hill statistics. For comparison, the competing method based on annual maxima (AM) is applied. The AM are assumed to follow a theoretical distribution function selected by using the Akaike information criterion (AIC) from a family of seven candidate distributions, the parameters of which are estimated by using the maximum likelihood method. The two methods are compared by using the root mean square error (RMSE) and the mean deviation error (MDE) criteria. It is found that the POT-based method with the GPD and Hill estimators outperforms the AM-based method when considering the highest rainfall events. The confidence intervals of the design rainfalls, derived by using the Monte Carlo simulation method, reflect their large spatial diversity across the UVB. It is shown that the station’s altitude strongly correlates with the threshold, variance, and design rainfall depth of the GPD. This demonstrates the advantage of the GPD with Hill estimates, namely that it can accurately reflect the spatial properties of rainfall and its variability in the UVB. The results can be applied in water-management applications related to floods.

1. Introduction

Intense and long-duration precipitation has caused many catastrophic flood events in Poland, such as those seen in 1970, 1997, 2001 and 2010. In addition, as [1] indicates, the number of local flash floods resulting from heavy rainfalls has increased in Poland. Increasing urbanization and climate change have caused long periods of drought followed by heavy rainfall, which may further increase the frequency and intensity of floods in the future. Currently, more and more emphasis is placed not only on the drainage of rainwater but also on its retention, as well as on restoring the retention lost as a result of catchment sealing [2]. Because of this, the accurate estimation of high design rainfall depths is a pressing issue.
The proper modeling of extreme precipitation in the Upper Vistula River Basin (UVB), located in the southern part of Poland, is of high importance because the region is very flood prone. In the region, the estimation of design rainfall totals for long return periods is crucial for designing storm water drainage or land drainage systems, and preparing flood risk maps, as well as planning flood-control solutions [3]. The rationale for this study is that rivers in most of the region are very susceptible to flooding, the contribution of the surface runoff from this area to the total surface runoff of Poland is very high, and that, due to climate change, rainfall has become more and more intense in recent years.
Much attention should be given to the proper estimation of the distribution of daily rainfall totals because the right tail of the probability density function (PDF) is highly influenced by the observed high-precipitation quantiles. From the point of view of hydrology and meteorology, it is crucial to learn whether the tail of the PDF is heavy or light [4,5]. A heavy tail of the distribution indicates that the probability of occurrence of very high precipitation is larger than for a light tail.
The main objective of the paper was to conduct a very precise estimation of the daily design rainfall depth for various return periods. The approach was not based on the commonly used annual maxima (AM) method but on the peak over threshold (POT) method with the use of Hill statistics. In the POT-based method, all sample values that exceed a certain threshold level are accounted for. The daily precipitation depth over this threshold was assumed to follow the generalized Pareto distribution (GPD). The Hill estimators of the parameters were used in the GPD fit. The estimators can accurately reflect very high rainfall values, often observed in the UVB. We also addressed the issue of the spatial associations between orography and the variance of the GPD and the design rainfall depths. To assess the quality of the POT-based estimation with Hill statistics, we also performed the AM-based estimation by using the maximum likelihood method and compared the results. Different distribution functions, namely the generalized extreme value (GEV), two-parameter gamma (GA2), three-parameter gamma (Pearson type III, GA3), lognormal (LOG), two-parameter Weibull (WE2), three-parameter Weibull (WE3), and Gumbel (GUM), were considered in the AM-based method in this study.
In Poland, the Bogdanowicz–Stachý probabilistic maximum precipitation model for various return periods is often used [6,7,8]. This AM-based method makes use of the Weibull distribution, the parameters of which reflect some specific rainfall properties in three different regions of Poland. However, the model does not cover the southern, mountainous part of the UVB with very high precipitation totals. Moreover, recent climatic changes might also affect its suitability [9].
The POT approach enables an increase of the sample size in comparison to the AM-based method by considering all independent peak events over threshold and the omission of low values below threshold which are not valid in the estimation of large events. This is the main advantage of the POT approach in comparison with the AM approach. The disadvantages are the inclusion of subjective judgment into the POT sample selection and the GPD estimation, and the strong sensitivity of high quantile estimates to the threshold choice. Therefore, the main difficulties in selecting the POT sample are the choice of the threshold, which produces the lowest bias of quantile estimates, and the ensured independence between successive events [10,11].
The GPD has been used in various fields of study [12,13]. For instance, Madsen et al. [14,15,16] demonstrated that the POT-based GPD fit leads to a more accurate estimation of high quantiles compared to the AM-based GEV fit. Van Montfort and Witter [17] indicated that in the GPD fit, some POT series of daily rainfall show exponentially distributed peaks, with one or more outlying observations. The methodology of developing the Polish Atlas of Rains Intensities (PANDa) (see [2]) was also based on the GPD; however, the method of parameter estimation substantially differs from the method based on the Hill statistics in this paper. Various methods have been used in meteorology and hydrology to calibrate the GPD. For example, Madsen and Rosbjerg [18] compared the method of moments, probability-weighted moments, and maximum likelihood (ML) with regard to the precision of the 100-year flood quantile estimate. Martins et al. [19] used the ML estimates for the assessment of extreme monthly rainfall in Brazil. Singirankabo [20] applied the ML method to daily rainfall intensity in Rwanda. Martins and Stedinger [21] used the generalized maximum likelihood (GML) method, in which the ML and Bayesian approaches are combined, in extreme flood assessment. Yet another method of estimating the GPD parameters was adapted by Willems et al. to river discharges [22]. They used the Hill estimator of the γ parameter [23,24]. The Hill estimates were then successfully applied to river discharges and rainfall intensities in various world regions [11,25,26,27,28]. The main advantages of the method are that (i) the calibration of the threshold is based directly on sample values, which is crucial because of the large spatial diversity of rainfalls in the UVB, and (ii) the parameter estimation is weighted toward large rainfall events that play an important role in designing extremely heavy rainfalls. This method was used in the paper.

2. Materials and Methods

2.1. Study Area and Data

The Upper Vistula River Basin (UVB) has a total surface area of 47,053.51 km² (the Polish part; see Figure 1). It is diverse in terms of altitudes and geological structure (it includes the Carpathians, a part of the Carpathian Foredeep, parts of the Lublin Basin, the Upper Silesia Basin, the Cracow Monocline, the Nida Basin, and a part of the Świętokrzyskie Mountains). The spatial distribution of precipitation varies from the lowest in the north to the highest in the south (the Tatra Mountains). The meteorological conditions are influenced by polar maritime air masses in the west and continental air masses in the east; thus the climate is transitional [29,30,31]. The strong diversity in climate conditions influences the volume of precipitation and its extreme values. As indicated by Młyński et al. [32], the occurrence of precipitation is characterized by seasonality. In the UVB, the highest rainfall most often occurs in summer (from May to September). The region is very susceptible to flooding [33,34]. Additionally, as the UVB makes up approximately 30% of water resources in Poland [35], the region plays an important role in flood generation in a large part of the country [31].
The region is heavily influenced by anthropogenic activity [36]. It includes, e.g., a part of the Silesian urban area, which constitutes an important industrial region, as well as large cities: Kraków, Rzeszów, Tarnów, and many hydrotechnical structures (e.g. Wisła Czarne, Goczałkowice, Czorsztyn-Niedzica, Solina).
In the study, the stations located at the lowest altitudes are Rzeszów-Jasionka (200 m a.s.l.), Tarnów (209 m a.s.l.) and Pilzno (210 m a.s.l.), whereas Ochotnica Górna (620 m a.s.l.) and Kasprowy Wierch (1991 m a.s.l., the Tatra Mountains) are at the highest altitudes (Table 1).
The series of daily precipitation totals from the period between 1960 and 2021 from 16 meteorological stations located in the UVB were used for the analysis (Figure 1). The data were obtained from the public database of the Institute of Meteorology and Water Management, National Research Institute (IMGW-PIB). In days with snowfall, the rainfall equivalent is provided in the data. Therefore, rainfall and precipitation are equivalent in the paper.
The mean value of the annual maximum daily rainfall total in the period between 1960 and 2021 varied from 36.14 mm (Raków) to 83.01 mm (Kasprowy Wierch) while the mean daily precipitation ranged from 1.5 mm to 2.10 mm, apart from Kasprowy Wierch, for which it was 4.89 mm (see Table 1).

2.2. Research Methods

The POT-based method with Hill statistics was applied in the design rainfall estimation. Then, for comparative purposes, the AM-based method was also used. Next, the superior method was selected, with the higher rating given to the method that better estimated the highest rainfall depths. Afterward, the dependence between the distribution parameters and the orographic characteristics of the region where the rainfall stations are located was examined.
The methods are depicted in the flow diagram in Figure 2. The shapes on the left branch refer to the steps in the POT-based estimation while the shapes on the right branch refer to the AM-based estimation. The branches join when the methods are compared, the design rainfalls and their confidence intervals are estimated, and the GPD properties are associated with orographic characteristics. A detailed description of the methods is provided in the next sections.

2.3. The POT-Based Estimation

In the POT-based method, the daily rainfall depth $X$ over a certain threshold $x_t$ (where $X > x_t$) is assumed to follow the generalized Pareto distribution (GPD). The GPD can cover a wide range of tail weights, and therefore it is very useful in the extreme value analysis [10]. The cumulative distribution function (CDF) of the GPD equals
$$G(x) = \begin{cases} 1 - \left(1 + \gamma \dfrac{x - x_t}{\sigma}\right)^{-1/\gamma} & \text{if } \gamma \neq 0, \\[2mm] 1 - \exp\left(-\dfrac{x - x_t}{\sigma}\right) & \text{if } \gamma = 0, \end{cases} \tag{1}$$
where $\gamma \in \mathbb{R}$ is the shape parameter and $\sigma > 0$ is the scale parameter. The $\gamma$ parameter is known as the extreme value index. The scale parameter linearly depends on the threshold $x_t$, with the proportionality factor equal to the shape parameter, namely $\sigma = \gamma x_t$, provided that $x_t$ is sufficiently high [24]. A high, positive value of the $\gamma$ parameter (heavy tail of the distribution) indicates that the probability of the occurrence of very high precipitation is larger than if $\gamma = 0$ (light, exponential tail). The variance of the GPD equals
$$\mathrm{Var}_{\mathrm{GPD}} = \frac{\sigma^2}{(1 - \gamma)^2 (1 - 2\gamma)} \quad \left(\text{if } \gamma < \tfrac{1}{2}\right). \tag{2}$$
As the estimated maximum precipitation depths and their variability strongly depend on $\sigma$ and $\gamma$, the precise estimation of these parameters is highly important, especially in areas susceptible to heavy downpours and torrential storms.
The POT-based estimation consisted of several steps, the methods for which are described in the following subsections.

2.3.1. Selection of the POT Sample

In order to select the final POT sample, various threshold candidates had to be considered. First, all daily precipitation totals greater than or equal to the lowest annual maximum (AM) in the period between 1960 and 2021 (the initial value) were selected. Subsequently, to address the issue of the serial correlation of daily rainfall totals, Kendall’s τ coefficient was tested for significance [39] for a lag of one day. At the stations where the correlation was significantly different from zero (Sandomierz, Stróża, Kasprowy Wierch), the series were then modified by taking only the days separated by at least one day. Strictly speaking, if two rainfall depths from two consecutive days were included in the first sample, the lower value was omitted. This modification enabled the selection of a series with uncorrelated elements. No modification was carried out for stations with uncorrelated rainfall events. Then, for each station, the series was ordered from the lowest to the highest, $x_t \le \dots \le x_1$, with the lowest AM as the initial threshold value. This was the series of all threshold candidates.
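The thinning rule for serially correlated stations can be illustrated with a minimal Python sketch (the study itself was carried out in R). The function name `decluster` and the integer day index are hypothetical, and the handling of runs of three or more consecutive days, which the text does not specify, is an assumption: each event is compared only with the last event kept so far.

```python
def decluster(days, depths):
    """Thin a candidate series so that no two kept events fall on consecutive days.

    days   -- integer day indices (e.g., days since the start of the record),
              strictly increasing
    depths -- daily rainfall totals (mm) observed on those days
    Assumption: an event is compared only with the last kept event.
    """
    kept = []  # list of (day, depth) pairs
    for day, depth in zip(days, depths):
        if kept and day - kept[-1][0] == 1:
            # Consecutive with the last kept event: retain only the larger depth.
            if depth > kept[-1][1]:
                kept[-1] = (day, depth)
        else:
            kept.append((day, depth))
    return kept
```

For example, events of 10, 30, and 20 mm on three consecutive days reduce to the single 30 mm event under this rule.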
Afterward, starting from the lowest element $x_t$, the threshold was successively increased and, in each step, the Hill estimate $\hat{\gamma}$ of the shape parameter $\gamma$ of the GPD was computed [40]:
$$\hat{\gamma} = \frac{1}{t-1} \sum_{j=1}^{t-1} \ln x_j - \ln x_t. \tag{3}$$
The estimate $\hat{\gamma}$ is the mean excess of the log-transformed data over the threshold $x_t$. The asymptotic mean squared error was also computed in each step [40],
$$MSE = \frac{1}{t-1} \sum_{j=1}^{t-1} w_j \left( \ln \frac{x_j}{x_t} - \hat{\gamma} \ln \frac{t}{j} \right)^2, \tag{4}$$
assuming the Hill weights $w_j = 1 / \ln(t/j)$, $j = 1, 2, \dots, t-1$. The threshold $x_t$ for which the series of the $MSE$ achieved its minimum was the final, optimal threshold. The final choice of the threshold was a tradeoff between low thresholds, which cause a high bias, and high thresholds, which cause a high variance of $\hat{\gamma}$; therefore the selection of $x_t$ from the region of low fluctuation of $\hat{\gamma}$ was of high importance. The final POT sample comprised the POT events greater than or equal to the final threshold $x_t$. The appropriate choice of $x_t$ was verified in the Pareto QQ plot, where the points $(-\ln(p_j), \ln(x_j))$ for $p_j = \frac{j}{t+1}$ (the probability of exceedance of $x_j$) should lie along the regression line with the slope equal to $\hat{\gamma}$ if the true distribution of the rainfall depth above threshold is GPD. The main advantage of the Hill weights is that the regression line is pulled toward matching the largest rainfall events because the Hill estimation is focused on the events with a low probability of exceedance.
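The threshold search based on Equations (3) and (4) can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' R implementation: the function names are hypothetical, the Hill weights are taken as $1/\ln(t/j)$, and the lower bound `t_min` on the number of order statistics is an added safeguard that is not part of the original description.

```python
import math

def hill_gamma(x_desc, t):
    """Hill estimate (Equation (3)); x_desc is sorted in descending order,
    and the t-th largest value (1-based, t >= 2) acts as the threshold x_t."""
    x_t = x_desc[t - 1]
    mean_log = sum(math.log(x) for x in x_desc[: t - 1]) / (t - 1)
    return mean_log - math.log(x_t)

def hill_mse(x_desc, t):
    """Weighted mean squared error (Equation (4)) with weights 1 / ln(t / j)."""
    gamma_hat = hill_gamma(x_desc, t)
    x_t = x_desc[t - 1]
    total = 0.0
    for j in range(1, t):
        log_ratio = math.log(t / j)  # positive because j < t
        residual = math.log(x_desc[j - 1] / x_t) - gamma_hat * log_ratio
        total += residual ** 2 / log_ratio
    return total / (t - 1)

def select_threshold(values, t_min=10):
    """Return (x_t, gamma_hat, sigma_hat) for the t that minimizes the MSE.
    t_min is a hypothetical safeguard against very small samples."""
    x_desc = sorted(values, reverse=True)
    t_opt = min(range(t_min, len(x_desc) + 1), key=lambda t: hill_mse(x_desc, t))
    x_t = x_desc[t_opt - 1]
    gamma_hat = hill_gamma(x_desc, t_opt)
    return x_t, gamma_hat, gamma_hat * x_t  # sigma_hat = gamma_hat * x_t
```

In practice the minimum of the MSE series would be inspected together with the Pareto QQ plot, as described in the text, rather than accepted automatically.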
If the fit in the Pareto QQ plot is poor because the points for the highest rainfall events lie much below the regression line, the tail of the empirical distribution should be fitted to the GPD with $\gamma = 0$ (exponential distribution, EXP), where the estimate of the scale parameter equals [40]
$$\hat{\sigma} = \frac{1}{t-1} \sum_{j=1}^{t-1} x_j - x_t. \tag{5}$$
In this case, the optimal threshold $x_t$ was selected by minimizing the $MSE$ [40]:
$$MSE = \frac{1}{t-1} \sum_{j=1}^{t-1} w_j \left( x_j - x_t - \hat{\sigma} \ln \frac{t}{j} \right)^2. \tag{6}$$
The exponential QQ plot, consisting of the points $(-\ln(p_j), x_j)$, where $p_j = \frac{j}{t+1}$, was also used to assess the goodness of fit. The method was described in [40,41] and applied in [11,22,26].
The final estimates of the location, scale, and shape parameters of the GPD were $x_t$, $\hat{\sigma}$, $\hat{\gamma}$, where $\hat{\gamma}$ was derived from Equation (3) and $\hat{\sigma} = \hat{\gamma} x_t$, while the estimates of the location and scale parameters of the exponential distribution were $x_t$, $\hat{\sigma}$, where $\hat{\sigma}$ was derived from Equation (5).

2.3.2. Testing the Independence, Randomness, and Homogeneity

Many procedures for the POT series selection often suffer from some subjectivity; therefore, an objective criterion of independence, randomness, and identical distribution of the POT events should be included.
For various lags, the hypothesis that the Kendall’s τ serial correlation of the POT sample is zero [39] was verified. In order to check the randomness of the POT events, the method based on the dispersion index $\Psi = \sum_{i=1}^{n_y} \frac{(y_i - \bar{y})^2}{\bar{y}}$ was used to verify the hypothesis that the distribution of the number of exceedances over threshold per year is Poisson [10,42,43]. From a theoretical point of view, $\Psi$ should asymptotically follow a $\chi^2$ distribution with $n_y - 1$ degrees of freedom, where $y_i$, $\bar{y}$, and $n_y$ are the number of exceedances over threshold in year $i$, its mean value, and the number of years, respectively.
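A minimal Python sketch of the dispersion index check is given below. To stay within the standard library, the $\chi^2$ quantiles are obtained from the Wilson–Hilferty approximation, which is a substitution made here for illustration and not necessarily what the authors used; the function names are hypothetical.

```python
import math
from statistics import NormalDist, mean

def dispersion_index(counts):
    """Psi = sum((y_i - y_bar)^2) / y_bar for yearly exceedance counts."""
    y_bar = mean(counts)
    return sum((y - y_bar) ** 2 for y in counts) / y_bar

def chi2_quantile_wh(p, df):
    """Wilson-Hilferty approximation to the chi-square quantile (assumption:
    an approximation is acceptable; an exact quantile function could be used)."""
    z = NormalDist().inv_cdf(p)
    c = 2.0 / (9.0 * df)
    return df * (1.0 - c + z * math.sqrt(c)) ** 3

def poisson_dispersion_check(counts, alpha=0.05):
    """Return (Psi, True/False); True if Psi lies inside the two-sided
    (1 - alpha) chi-square interval with n_y - 1 degrees of freedom."""
    psi = dispersion_index(counts)
    df = len(counts) - 1
    lower = chi2_quantile_wh(alpha / 2.0, df)
    upper = chi2_quantile_wh(1.0 - alpha / 2.0, df)
    return psi, lower <= psi <= upper
```

With 62 years of data (61 degrees of freedom), the approximate upper 97.5% bound from this sketch is about 84.5.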
Afterward, all final POT series were tested for homogeneity in the parameter of location by using the Mann–Kendall (MK) test for trend [44,45].

2.3.3. Testing the Goodness of Fit to the GPD Distribution

In order to verify the hypothesis that the distribution of rainfall depth over threshold was GPD, the Anderson–Darling (AD) test [46] was used in its modified version, which accounts for the upper tail of the probability distribution function, namely [47],
$$ADU = \frac{t}{2} - 2 \sum_{j=1}^{t} G(x_{(j)}) - \sum_{j=1}^{t} \left( 2 - \frac{2j - 1}{t} \right) \ln\left(1 - G(x_{(j)})\right), \tag{7}$$
where $G$ is the cumulative distribution function of the GPD (see Equation (1)). The critical values $ADU_{crit}$ of the AD test were estimated by using the Monte Carlo method based on $N = 10^4$ simulations.
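The statistic in Equation (7) and a Monte Carlo critical value can be sketched in Python as follows. Two assumptions are made: the values $x_{(j)}$ are taken in ascending order, and the critical value is simulated under a fully specified null distribution (uniform probability-integral transforms), which ignores the effect of estimating the parameters and therefore simplifies whatever procedure the authors followed. Function names are hypothetical.

```python
import math
import random

def gpd_cdf(x, x_t, sigma, gamma):
    """GPD cumulative distribution function (Equation (1)) for x >= x_t."""
    z = (x - x_t) / sigma
    if gamma == 0:
        return 1.0 - math.exp(-z)
    return 1.0 - (1.0 + gamma * z) ** (-1.0 / gamma)

def adu_statistic(u_sorted):
    """Upper-tail Anderson-Darling statistic (Equation (7)).
    u_sorted holds G(x_(j)) in ascending order, each value in [0, 1)."""
    t = len(u_sorted)
    first = 2.0 * sum(u_sorted)
    second = sum((2.0 - (2 * j - 1) / t) * math.log(1.0 - u)
                 for j, u in enumerate(u_sorted, start=1))
    return t / 2.0 - first - second

def adu_critical(t, n_sim=10_000, alpha=0.05, seed=0):
    """Monte Carlo (1 - alpha) quantile of ADU for samples of size t,
    assuming a fully specified null distribution (simplification)."""
    rng = random.Random(seed)
    sims = sorted(adu_statistic(sorted(rng.random() for _ in range(t)))
                  for _ in range(n_sim))
    return sims[min(n_sim - 1, int((1.0 - alpha) * n_sim))]
```

The hypothesis of a GPD fit would not be rejected when the observed statistic is below the simulated critical value.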

2.3.4. Estimating the Design Rainfall Depth

Assuming that $\mu$ is the mean number of the POT elements per year, the return period of the rainfall depth $x$ is $T(x) = \frac{1}{\mu (1 - G(x))}$. By using this equation, the design rainfall of return period $T$ was derived from
$$x_{GPD}(T) = \begin{cases} x_t (T\mu)^{\gamma} & \text{if } \gamma > 0, \\ x_t + \sigma \ln(\mu T) & \text{if } \gamma = 0, \end{cases} \tag{8}$$
assuming that $\mu T > 1$.
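Equation (8) and the return period formula can be checked against each other with a short Python sketch. It relies on the relation $\sigma = \gamma x_t$ stated in Section 2.3, under which $1 - G(x) = (x / x_t)^{-1/\gamma}$ for $\gamma > 0$; the function names and the numerical values in the usage note are hypothetical.

```python
import math

def design_rainfall(T, mu, x_t, gamma=0.0, sigma=None):
    """Design rainfall depth for return period T (Equation (8))."""
    if mu * T <= 1:
        raise ValueError("Equation (8) requires mu * T > 1")
    if gamma > 0:
        return x_t * (mu * T) ** gamma
    if gamma == 0:
        if sigma is None:
            raise ValueError("sigma is required in the exponential case")
        return x_t + sigma * math.log(mu * T)
    raise ValueError("Equation (8) covers gamma >= 0 only")

def return_period(x, mu, x_t, gamma=0.0, sigma=None):
    """T(x) = 1 / (mu * (1 - G(x))), with sigma = gamma * x_t when gamma > 0."""
    if gamma > 0:
        exceedance = (x / x_t) ** (-1.0 / gamma)
    else:
        exceedance = math.exp(-(x - x_t) / sigma)
    return 1.0 / (mu * exceedance)
```

As a usage example with made-up parameters, `design_rainfall(100, mu=2.0, x_t=30.0, gamma=0.2)` evaluates $30 \cdot 200^{0.2}$, and passing the result to `return_period` recovers $T = 100$ up to rounding.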

2.4. The AM-Based Estimation

The steps were similar to those from the Section 2.3. The difference was in the method of estimation that made use of Hill statistics in the POT-based method and the maximum likelihood estimation in the AM-based method.

2.4.1. Testing the Homogeneity of AM

The AM series was tested for trends by using the MK test. This step checks whether the distribution of the annual maximum daily totals changed during the observation period because of climate change or other circumstances.

2.4.2. Parameter Estimation and Verification of the Type of the Distribution

The family of theoretical distribution function candidates, $\mathcal{F} = \{\text{GEV}, \text{GA2}, \text{GA3}, \text{LOG}, \text{WE2}, \text{WE3}, \text{GUM}\}$, was considered at each station, where the abbreviations stand for generalized extreme value, two-parameter gamma, three-parameter gamma, lognormal, two-parameter Weibull, three-parameter Weibull, and Gumbel, respectively. The parameters were estimated by using maximum likelihood estimation (MLE) [48,49]. The equations for the distribution functions are commonly used in the literature [50,51,52].
For every $F \in \mathcal{F}$, the null hypothesis that the distribution function of the AM is $F$ was verified at each station by using the upper-tail Anderson–Darling (ADU) test.

2.4.3. Selection of the Best Distribution of the AM and Estimation of Design Rainfall Depths

For each station, the best distribution $F$ was selected from all distributions for which the null hypothesis was not rejected. The final selection was based on the Akaike information criterion [53,54], namely on the minimization of $AIC = -2 \ln L + 2k$, where $L$ is the likelihood function and $k$ is the number of parameters of the theoretical distribution function. The main advantage of the use of the $AIC$ is that it meets the principle of parameter parsimony because it combines both the $L$-based fit and the number of parameters. Thus, models with many parameters are penalized and have less of a chance to be selected.
After the best distribution $F$ was selected, the design rainfall depths for various return periods were estimated.
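The AIC-based selection can be illustrated with a small Python sketch. The maximized log-likelihoods are assumed to be available from a separate fitting step, and the numbers in the usage note are invented for illustration; they are not results from the study.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: AIC = -2 ln L + 2 k."""
    return -2.0 * log_likelihood + 2.0 * n_params

def select_by_aic(fits):
    """Pick the candidate with the lowest AIC.

    fits maps a distribution name to (maximized log-likelihood, number of
    parameters) and should contain only candidates not rejected by the ADU test.
    """
    return min(fits, key=lambda name: aic(*fits[name]))
```

For instance, with hypothetical fits `{"GEV": (-250.0, 3), "GUM": (-250.5, 2)}`, the AIC values are 506 and 505, so the two-parameter Gumbel distribution is selected although its likelihood is slightly lower, which is the parsimony effect described above.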

2.5. Selection of a Better Method from the POT-Based and AM-Based Methods

The comparison was performed between the results of two methods of estimation of the highest design rainfalls: the POT-based method and the AM-based method, the former using the Hill estimation of the GPD from the POT sample (Section 2.3) and the latter using the MLE of the $F$ distribution that was selected by using the AIC (Section 2.4). The evaluation of each method was done by using the mean deviation error ($MDE$) and the root mean squared error ($RMSE$), where the observed and theoretical rainfall depths of the same return period of the GPD and $F$ distributions were compared. We have
$$MDE_{POT} = \frac{1}{\#S} \sum_{T \in S} \left( x_{POT}(T) - x_{GPD}(T) \right), \quad MDE_{AM} = \frac{1}{\#S} \sum_{T \in S} \left( x_{AM}(T) - x_{F}(T) \right),$$
$$RMSE_{POT} = \sqrt{\frac{1}{\#S} \sum_{T \in S} \left( x_{POT}(T) - x_{GPD}(T) \right)^2}, \quad RMSE_{AM} = \sqrt{\frac{1}{\#S} \sum_{T \in S} \left( x_{AM}(T) - x_{F}(T) \right)^2},$$
where $x_{POT}(T)$, $x_{AM}(T)$ are the observed quantiles of return period $T$ in the POT and AM series, respectively, while $x_{GPD}(T)$, $x_{F}(T)$ are the estimated quantiles of the return period $T$ of the GPD and $F$ distribution, respectively. The set $S$ consists of various return periods and $\#S$ is the number of its elements. Only long return periods were included in $S$ to account for the highest rainfalls, namely $S_1 = \{2, 5, 10, 12, 15, 20, 30, 60\}$ years (case 1) and $S_2 = \{15, 20, 30, 60\}$ years (case 2).
The $MDE$ and $RMSE$ were chosen because they reflect the bias and the variance of the quantile estimators. For example, if $MDE_{POT}$ and $RMSE_{POT}$ are near zero, the bias and variance of the estimator $x_{GPD}$ are low, and the design rainfall depths are estimated properly.
The POT-based and AM-based methods were assessed by first assigning the scores $d_{1,i}$, $d_{2,i}$ to each station $i$, namely
$$d_{1,i} = \begin{cases} 1 \text{ for POT and } 0 \text{ for AM} & \text{if } |MDE_{POT}| < |MDE_{AM}|, \\ 0 \text{ for POT and } 1 \text{ for AM} & \text{if } |MDE_{POT}| > |MDE_{AM}|, \\ 0 \text{ for POT and } 0 \text{ for AM} & \text{if } |MDE_{POT}| = |MDE_{AM}|, \end{cases}$$
$$d_{2,i} = \begin{cases} 1 \text{ for POT and } 0 \text{ for AM} & \text{if } RMSE_{POT} < RMSE_{AM}, \\ 0 \text{ for POT and } 1 \text{ for AM} & \text{if } RMSE_{POT} > RMSE_{AM}, \\ 0 \text{ for POT and } 0 \text{ for AM} & \text{if } RMSE_{POT} = RMSE_{AM}. \end{cases}$$
Then, for the AM-based method and the POT-based method separately, $d_{1,i}$ and $d_{2,i}$ were summed over all stations to $d_1 = \sum_{i=1}^{16} d_{1,i}$ and $d_2 = \sum_{i=1}^{16} d_{2,i}$. Lastly, for each method, the final score $d = d_1 + d_2$ was computed, and the higher rating was given to the method with the higher $d$. The procedure was repeated for $S_1$ and $S_2$.
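The scoring procedure can be sketched in Python as follows. The observed and estimated quantiles are taken as given inputs, because the way the empirical quantiles are read from the series is not detailed here; the function names, dictionary keys, and numbers in the usage note are hypothetical.

```python
import math

def mde(observed, estimated):
    """Mean deviation error over the return periods in S."""
    return sum(o - e for o, e in zip(observed, estimated)) / len(observed)

def rmse(observed, estimated):
    """Root mean squared error over the return periods in S."""
    return math.sqrt(sum((o - e) ** 2 for o, e in zip(observed, estimated))
                     / len(observed))

def station_score(err_pot, err_am):
    """Return (score for POT, score for AM); ties give 0 to both methods."""
    if abs(err_pot) < abs(err_am):
        return 1, 0
    if abs(err_pot) > abs(err_am):
        return 0, 1
    return 0, 0

def final_scores(stations):
    """Sum d_1 + d_2 over stations; each station is a dict with the keys
    'obs_pot', 'est_gpd', 'obs_am', 'est_f' (lists over the same set S)."""
    d_pot = d_am = 0
    for s in stations:
        for metric in (mde, rmse):
            p, a = station_score(metric(s["obs_pot"], s["est_gpd"]),
                                 metric(s["obs_am"], s["est_f"]))
            d_pot += p
            d_am += a
    return d_pot, d_am
```

With a single made-up station whose GPD quantiles deviate by ±1 mm and whose AM-based quantiles are 5 mm too low at every return period, the POT-based method collects both points.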

2.6. Estimation of the Design Rainfalls for Long Return Periods and Their Confidence Intervals

The design rainfall depths for various long return periods, $T \in \{50, 100, 150, 200\}$ years, were estimated at each station by using the distribution function selected with the methods from Section 2.5.
The $(1 - \alpha) \cdot 100\%$ confidence intervals ($CI$) for the design rainfall depths were derived next by using the Monte Carlo simulation method. For each station, $N = 10^4$ artificial series were randomly drawn from the distribution with the parameters estimated for that station, and the design rainfall was derived for each series. Then, the confidence limits were computed as the sample quantiles of order $\alpha/2$ and $1 - \alpha/2$ of the design rainfalls.
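One way to realize such a Monte Carlo interval for the GPD case with $\gamma > 0$ is sketched below in Python. Several choices are assumptions made for the sketch and are not taken from the paper: the threshold $x_t$ and the rate $\mu$ are held fixed, each artificial series has as many events as the original POT sample, and $\gamma$ is re-estimated in each series as the mean log-excess over $x_t$.

```python
import math
import random

def design_rainfall_ci(x_t, gamma_hat, mu, n_events, T,
                       n_sim=10_000, alpha=0.05, seed=0):
    """Parametric Monte Carlo confidence interval for x_GPD(T), gamma > 0.

    Assumptions: x_t and mu are fixed; sigma = gamma * x_t, so that
    1 - G(x) = (x / x_t) ** (-1 / gamma).
    """
    rng = random.Random(seed)
    sims = []
    for _ in range(n_sim):
        # Inverse-CDF sampling: U in (0, 1] gives x = x_t * U ** (-gamma).
        sample = [x_t * (1.0 - rng.random()) ** (-gamma_hat)
                  for _ in range(n_events)]
        # Re-estimate gamma as the mean log-excess over the fixed threshold.
        g = sum(math.log(x / x_t) for x in sample) / n_events
        sims.append(x_t * (mu * T) ** g)
    sims.sort()
    lower = sims[int(alpha / 2.0 * n_sim)]
    upper = sims[min(n_sim - 1, int((1.0 - alpha / 2.0) * n_sim))]
    return lower, upper
```

Re-estimating the threshold in every artificial series would widen the interval; the fixed-threshold version is used only to keep the example short.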

2.7. Associations between Distribution Characteristics and the Station’s Altitude

The Spearman correlation coefficient $r_S$ between the station’s altitude and the GPD characteristics, such as the threshold $x_t$, the variance (Equation (2)), and the design rainfall depth (Equation (8)), was derived. Then the hypothesis that the Spearman correlation $\rho_S$ is significantly greater than zero was verified in order to recognize the spatial relationships across the whole region. The impact of the spatially diverse $\gamma$ parameter values of the GPD on design rainfalls was also analyzed. It should be noted (Section 2.3) that $\gamma$ largely determines the shape of the tail of the GPD, and thus it controls the values of the design rainfalls.
The level of significance α = 0.05 was assumed in the whole paper. All calculations were carried out in the R program [55].
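Although the calculations in the study were performed in R, the Spearman coefficient from Section 2.7 can be illustrated with a short, dependency-free Python sketch. The function names are hypothetical, ties receive average ranks, and the significance test is omitted.

```python
import math

def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman correlation: the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)
```

As a usage example, the altitudes 200, 209, 620, and 1991 m a.s.l. paired with any strictly increasing set of design rainfall depths (invented for the example) give $r_S = 1$.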

3. Results

3.1. POT-Based Method

3.1.1. Selection of the POT Sample and Estimation of the GPD Parameters

The final threshold $x_t$ and the estimates $\hat{\gamma}$ and $\hat{\sigma}$ of the GPD parameters were derived by using the methods from Section 2.3. The final POT sample comprised rainfall depths greater than or equal to $x_t$. In the POT samples, the average number of events per year ranged from 1.19 for the Lesko station to 3.44 for Ochotnica Górna.
The values of $x_t$, $\hat{\gamma}$, and $\hat{\sigma}$ for each station are displayed in Table 2. It can be noted that $\gamma > 0$ was obtained for the stations located in the southern, mountainous parts of the UVB (Carpathians) and in the upland in the north, while $\gamma = 0$ was obtained for the stations in the central part of the UVB (Northern Sub-Carpathians on the border with the Carpathians). This shows that the elevation of the station above sea level affects the properties of the distribution’s tail.

3.1.2. Testing of Homogeneity, Independence, and Randomness

The p-values of the MK test varied from 0.054 to 0.96 among the 16 stations (see Table 3), which means that the null hypothesis of no trend was not rejected at any station. Hence, the homogeneity of each time series can be confirmed.
The results of testing the hypothesis that Kendall’s τ correlation coefficient for lag = 1 is zero are shown in Table 3 in the form of the p-values of the test. The conclusion can be drawn that the POT events are independent.
The values of the dispersion index Ψ varied from 59.66 to 84.44. All of the index values were within the 95% confidence interval (Figure 3). Therefore, it can be concluded that the distribution of the number of exceedances over threshold is Poisson and that the randomness of the POT events was confirmed.

3.1.3. The Anderson–Darling Test of Goodness of Fit

For each station, the ADU test statistic was lower than the critical value $ADU_{crit}$ (Table 4). The conclusion can be drawn that for all stations the distribution of the rainfall depth over threshold is GPD.
For 14 out of the 16 analyzed stations, there is a good fit to the GPD or to the exponential distribution. However, in the case of the Brynica and Rzeszów-Jasionka stations, the fit is weaker for higher precipitation values (the POT values are lower than the estimated values). The QQ plots for four exemplary stations (Stróża, Kasprowy Wierch, Chrzanów and Nowy Sącz) are shown in Figure 4. The congruence between the empirical and theoretical quantiles is evident in the plots, even for high rainfall depths.

3.1.4. The Design Rainfall Depth and Its Confidence Interval

For each station, the design rainfall depth was calculated from Equation (8) for various return periods. The results for $T = 100$ and $T = 200$ years are presented in Table 4. For both return periods, the lowest values were obtained for the Rzeszów-Jasionka station, and the highest for the Kasprowy Wierch station. The 95% confidence intervals of the design rainfalls are also shown in Table 4. They reflect the uncertainty of the design rainfall estimates.
The visual assessment of the CIs is easier on the lollipop charts (Figure 5), where $P_{T=100}$ and the lower and upper bounds of the CIs are depicted. A strong diversity and a high uncertainty of the estimated design rainfalls can be noted at the Kasprowy Wierch, Stróża, and Skoczów stations, where the CIs are wide (see Figure 5). In the case of the Kasprowy Wierch station, the uncertainty is approximately at the level of 20%. The narrowest confidence intervals were obtained for the Rzeszów-Jasionka, Chrzanów, Tarnów, Pilzno, and Nowy Sącz stations, which indicates the lowest uncertainty of the estimated rainfalls at these stations. It can be noted that the stations at lower altitudes have a lower uncertainty of design rainfall than the stations at higher altitudes. Thus, the spatial properties of the UVB are reflected in the uncertainty of design rainfall.
Figure 6 shows the intensity–frequency curves, where the dependence between design daily rainfall and recurrence period, as well as the comparison between design daily rainfalls for all stations, are depicted. For the Sandomierz (red line) and Raków (navy blue line) stations, in the case of short return periods ($T = 10$ years), the design rainfall value is relatively low. However, as the return period increases, the rainfall also increases, and for $T = 200$ years it exceeds 100 mm in both cases. The high values of the parameter $\gamma$ at the two stations are reflected in the strong increase of the intensity–frequency curve. The lowest increase was obtained for the Tarnów, Chrzanów and Nowy Sącz stations, which in turn reflects a low value of the parameter $\gamma$. For all return periods $T \in [10, 100]$, the highest values were obtained for the Kasprowy Wierch station. This station is located in the Carpathians (1991 m a.s.l.). Meanwhile, the lowest values were found for Rzeszów-Jasionka, the station at the lowest altitude (200 m a.s.l.). The issue of how the design rainfall depends on the altitude is discussed in detail in Section 3.4.

3.2. AM-Based Method

3.2.1. Selection of the AM Sample and Testing the Homogeneity

The verification was carried out by using the Mann–Kendall test, assuming the null hypothesis about the homogeneity of the data. The p-values of the MK test varied from 0.13 to 1.0 . For the Wisłok Wielki station, the null hypothesis of no trend was rejected (increasing trend), and hence, because the homogeneity of the AM time series could not be confirmed, the station was omitted from further considerations. For the other 15 stations, the homogeneity of the AM series was confirmed.

3.2.2. Verification of the Type of the Distribution

The null hypothesis that the time series comes from a theoretical distribution F was verified by using the ADU test. For most stations, there were several distributions accepted by the test, e.g., the GEV, GA2, GA3, LOG, WE3, and GUM distributions. Therefore, as the hypothesis testing results were not conclusive enough in selecting the best fit, the AIC criterion was used in the next step.

3.2.3. Selection of the Best Distribution of the AM by Using the AIC Criterion

The AIC values are displayed in Table 5. The final choices, indicated by the lowest AIC values, are marked with boxes. The GEV distribution provided the best fit at two stations, GA3 at four stations, LOG at three stations, WE3 at one station, and GUM at five stations. The GA2 and WE2 distributions were not selected for any station.
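The selection step amounts to taking, for each station, the distribution with the smallest AIC. The sketch below applies that rule to the values copied from Table 5 (Wisłok Wielki is left out because no AM distribution was estimated there) and counts how often each candidate is chosen. The variable names are illustrative only.

```python
from collections import Counter

DISTS = ("GEV", "GA2", "GA3", "LOG", "WE2", "WE3", "GUM")

# AIC values copied from Table 5, in the column order given by DISTS.
AIC = {
    "Raków": (483.79, 493.34, 470.83, 486.68, 510.92, 479.29, 486.86),
    "Sandomierz": (496.87, 507.19, 488.86, 498.82, 527.11, 503.14, 499.39),
    "Brynica": (499.35, 501.98, 500.70, 498.12, 518.05, 503.40, 497.84),
    "Wolbrom": (509.04, 508.94, 508.90, 507.14, 517.54, 509.70, 507.13),
    "Chrzanów": (504.91, 503.50, 504.31, 502.73, 509.68, 504.82, 502.93),
    "Tarnów": (533.53, 531.91, 533.07, 531.44, 538.21, 533.43, 531.55),
    "Pilzno": (507.91, 508.10, 506.97, 506.16, 517.38, 507.62, 506.02),
    "Rzeszów-Jasionka": (483.19, 481.69, 483.48, 481.67, 487.11, 485.27, 481.66),
    "Skoczów": (541.56, 544.23, 542.15, 540.52, 555.86, 544.30, 540.66),
    "Stróża": (535.30, 545.45, 537.62, 538.10, 561.31, 540.89, 538.87),
    "Nowy Sącz": (494.12, 494.73, 491.73, 492.74, 504.25, 490.52, 492.26),
    "Ochotnica Górna": (525.93, 540.82, 532.97, 531.81, 558.98, 539.42, 531.65),
    "Lesko": (467.97, 469.10, 466.80, 467.08, 481.15, 467.06, 466.00),
    "Kasprowy Wierch": (602.26, 605.08, 576.67, 601.08, 616.41, 601.40, 602.18),
    "Cisna": (501.26, 504.43, 499.48, 501.58, 517.20, 499.54, 499.92),
}

# Lowest AIC wins at each station
best = {station: DISTS[values.index(min(values))] for station, values in AIC.items()}
counts = Counter(best.values())

for station, dist in best.items():
    print(f"{station}: {dist}")
print(dict(counts))
```

At several stations the winning margin is a few hundredths of an AIC unit (for example LOG versus GUM at Wolbrom), so the ranking there should be read as a near tie rather than as strong evidence for one distribution.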

3.3. Selection of a Better Method from the POT and AM Based on the Comparison between Rainfall Depths Derived from the POT and AM Series

Using Equation (8) and return periods of 2, 5, 10, 12, 15, 20, 30, and 60 years (case 1) and of 15, 20, 30, and 60 years (case 2), the design daily rainfall depths were derived from the GPD (POT-based method) and from the theoretical distribution selected by using the AIC (AM-based method).
Figure 7 shows the design rainfall depths obtained from the GPD estimation (the POT-based method) and the GEV/GA3/LOG/WE3/GUM estimation (the AM-based method) for six exemplary stations. The GPD yields a better fit to the high observed values than the AM-based fit (Brynica, Skoczów, Stróża, Nowy Sącz, Kasprowy Wierch). Many of the highest peaks were poorly reproduced by the AM-based method, while the GPD fit was more accurate. A similar situation occurred at the other stations not shown here.
To quantify the efficiency of the POT method in comparison with the AM-based method, the RMSE and MDE criteria were used to measure the errors. The comparison was conducted by using Equations (11) and (12). The results are shown in Table 6.
In case 1, the final score d was 25 for the POT-based method and 5 for the AM-based method. In case 2, the final score d was likewise 25 to 5 in favor of the POT method.
It can be concluded that, from the point of view of the RMSE and MDE criteria, the POT method received the higher rating. Hence, the POT method with Hill estimates of the GPD is a better choice for estimating daily precipitation totals in the UVB than the AM-based method.
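The final scores can be re-derived from the flags in Table 6. The sketch below stores, for each station and case, the four flags in the table's column order (MDE POT, MDE AM, RMSE POT, RMSE AM) and sums them per method. It assumes that the final score d is simply the sum of the flags over stations and both criteria; Wisłok Wielki is omitted because it has no AM-based estimate. The names `SCORES` and `final_score` are illustrative.

```python
# Flags copied from Table 6: (case 1, case 2); each string lists
# MDE_POT, MDE_AM, RMSE_POT, RMSE_AM, where 1 marks the better fit.
SCORES = {
    "Raków": ("1010", "1010"),
    "Sandomierz": ("1010", "1010"),
    "Brynica": ("0110", "1010"),
    "Wolbrom": ("1001", "0101"),
    "Chrzanów": ("1010", "1010"),
    "Tarnów": ("1010", "1010"),
    "Pilzno": ("1010", "1010"),
    "Rzeszów-Jasionka": ("0101", "0101"),
    "Skoczów": ("1010", "1010"),
    "Stróża": ("1010", "1010"),
    "Nowy Sącz": ("1010", "1010"),
    "Ochotnica Górna": ("1010", "1010"),
    "Lesko": ("1010", "1010"),
    "Kasprowy Wierch": ("1010", "1010"),
    "Cisna": ("0110", "0110"),
}


def final_score(case):
    """Return (d_POT, d_AM) for case index 0 (case 1) or 1 (case 2)."""
    d_pot = sum(int(f[case][0]) + int(f[case][2]) for f in SCORES.values())
    d_am = sum(int(f[case][1]) + int(f[case][3]) for f in SCORES.values())
    return d_pot, d_am


case1 = final_score(0)
case2 = final_score(1)
print("case 1:", case1, "case 2:", case2)
```

Each case contains 15 stations and two criteria, so 30 points are distributed in total; the tally splits them 25 to 5 in both cases.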

3.4. Associations between Distribution Characteristics and Station’s Altitude

As evident in the results (Table 2), the parameters of the GPD differ between stations. Such differences can also be noted between design rainfall depths, and they are larger for longer return periods.
The sample Spearman correlation coefficient r_S between the station’s altitude and the design rainfall depth for return periods of 100, 150, and 200 years was calculated first. The scatterplots are shown in Figure 8. Inspection of the sample values and the scatterplots shows that the correlation becomes slightly weaker after omitting the Kasprowy Wierch station (marked in red). As this station strongly influences the relation, two cases were considered: (a) with and (b) without the Kasprowy Wierch station. The r_S values varied from 0.56 to 0.58 in (a) and dropped to 0.46–0.49 in (b). The blue and green areas represent the confidence region around the regression line at the confidence level of 95%.
The next question was whether the dependence holds across the whole region. It was confirmed, by using hypothesis testing, that ρ_S is significantly greater than zero for return periods of 100, 150, and 200 years in both cases (a) and (b). Thus, the positive relation between the altitude of the station and the design rainfall depth was confirmed.
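The rank correlation for T = 100 years can be recomputed from the altitudes in Table 1 and the P_{T=100} values in Table 4. The sketch below is a plain-Python Spearman coefficient (Pearson correlation of average ranks, so the two stations sharing an altitude of 295 m are handled as a tie). The helper names are illustrative, and the study's own computation may have treated ties differently.

```python
import math

# Stations 1-16 in the order of Table 1; Kasprowy Wierch is index 14.
ALTITUDE = [220, 217, 285, 370, 295, 209, 210, 200,
            295, 307, 292, 620, 420, 550, 1991, 540]            # [m a.s.l.], Table 1
P100 = [106.36, 120.95, 132.41, 124.62, 88.54, 111.14, 101.50, 79.85,
        171.88, 179.30, 91.27, 159.67, 91.34, 140.18, 271.52, 119.74]  # [mm], Table 4


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's r_S as the Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / math.sqrt(sxx * syy)


r_a = spearman(ALTITUDE, P100)                        # case (a): all stations
keep = [i for i in range(len(ALTITUDE)) if i != 14]   # drop Kasprowy Wierch
r_b = spearman([ALTITUDE[i] for i in keep], [P100[i] for i in keep])  # case (b)
print(round(r_a, 2), round(r_b, 2))
```

With the tabulated values, the two coefficients come out at roughly 0.57 and 0.48, which falls inside the ranges reported for cases (a) and (b).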
The sample correlation coefficient r_S between the station’s altitude and the threshold x_t equals 0.64 in (a) and 0.56 in (b), while that between the station’s altitude and Var_GPD equals 0.51 in (a) and 0.41 in (b) (see Figure 9). It was shown, by using hypothesis testing, that the correlation between the station’s altitude and both characteristics is significantly greater than zero.
The relation between the altitude and the γ parameter is also notable, because γ is greater than zero at the stations with higher altitude (north and south of the UVB) and equal to zero at lower altitudes (central part of the UVB). The Spearman correlation coefficient equals r_S = 0.48 in case (a) and 0.39 in case (b). Hypothesis testing shows that ρ_S is significantly greater than zero in (a); in (b), however, the test failed to reject the null hypothesis of no correlation.
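The text does not state which test was applied to ρ_S. One common choice, shown here purely as an assumption, is the one-sided t approximation t = r_S·sqrt((n − 2)/(1 − r_S²)) with n − 2 degrees of freedom. The sketch applies it to the γ correlations quoted above, with n = 16 stations in case (a) and n = 15 in case (b), using standard one-sided 5% critical values of Student's t. The function names are illustrative.

```python
import math

# One-sided 5% critical values of Student's t, keyed by degrees of freedom.
T_CRIT_05 = {13: 1.771, 14: 1.761}


def spearman_t(r_s, n):
    """t statistic of the large-sample approximation for Spearman's r_S."""
    return r_s * math.sqrt((n - 2) / (1.0 - r_s ** 2))


def significantly_positive(r_s, n):
    """True if H0: rho_S = 0 is rejected in favor of rho_S > 0 at the 5% level."""
    return spearman_t(r_s, n) > T_CRIT_05[n - 2]


gamma_a = significantly_positive(0.48, 16)  # case (a): all 16 stations
gamma_b = significantly_positive(0.39, 15)  # case (b): without Kasprowy Wierch
print(gamma_a, gamma_b)
```

Under this approximation the outcome for γ agrees with the text: the correlation is significant in case (a) but not in case (b). Other tests (for example exact permutation tests) can give different verdicts for borderline coefficients, so this is only an indicative check.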
Remembering that the uncertainty of design rainfall depth was also associated with the station’s altitude (Section 3.1.4), this proves that the GPD properly reflects the increase in the magnitude and variability of design rainfall with the increase of each station’s altitude in the region.
The conclusion can be drawn that a strong spatial diversity can be ascribed to the GPD characteristics and to the predicted rainfalls with long return periods.

4. Discussion

To the best of our knowledge, this method has not been used in Poland so far. The novelty of the study lies in the use of the Hill estimates of the GPD in a region of Poland that is very diverse with regard to rainfall pattern, ranging from moderate to torrential. The novelty also applies to the assessment of the dependence between orography and the characteristics of the GPD, which is important for the precision of the estimates of high design rainfall totals. This is the first study in the UVB in which the GPD with Hill estimates is fitted to the distribution of daily rainfall over threshold. Therefore, the comparison with other authors is limited to selected aspects.
The drawback of all POT-based estimation methods is the lack of a unique, universal procedure for choosing the threshold. However, the basic guidelines that have been developed over the last several decades should always be considered (see Introduction). The limitation of the POT-based method with Hill estimates is that the MSE-based optimal threshold, selected from the region of low fluctuation of γ̂, can sometimes be difficult to identify, and some experience is needed in analysing the relation between the MSE and potential thresholds. Nevertheless, the advantages of the POT-based method with the Hill estimates of the GPD, and its superiority over the AM-based method in the estimation of the largest rainfall quantiles, outweigh the disadvantages.
In [2,3,8], the POT method was also used for stations located in Poland. A detailed description of statistical tools and exemplary stations’ analysis was provided. However, it should be highlighted that the method of parameter estimation in [2,8] is MLE, which differs from the method used in this paper. Based only on plots in [8], as exact values were not provided there, it can be concluded that the design rainfalls computed by using the Hill statistics are somewhat higher than those obtained in [8].
A similar region (UVB) was studied by Młyński et al. [56], who chose the GEV as the best distribution of the annual maximum precipitation, based on data (43 years) from 51 rainfall stations. The recommendation was based on various selection criteria (e.g., the peak-weighted root mean square error). At the five stations common to both studies, the P_{T=10} values obtained by the authors of [56] are similar to the P_{T=10} values estimated in this paper.
The results of Mascaro are similar to the results obtained in this paper with regard to the γ parameter, which was greater than or equal to zero at various stations. However, a seasonal approach to the POT-based method with Hill statistics in the UVB might be considered in future work.
Onyutha and Willems [26] derived the daily design rainfall at stations located in the Lake Victoria Basin by using the POT method and the GPD, and compared various estimation methods in terms of capturing the tail behaviour of the GPD and the uncertainty of the high-intensity quantiles. Some similarity between the results can be observed, namely the “normal tail” of the GPD (γ near zero), found at all nine stations in [26] but only at five stations with relatively low altitude (200–300 m a.s.l.) in this study. However, it should be stressed that γ > 0 was estimated at the majority of stations in this study, in particular at all stations with altitude greater than 300 m a.s.l. This confirms that including the spatial factor in rainfall estimates in upland and mountainous areas is justified.

5. Conclusions

The estimation of design rainfalls for a wide range of return periods was carried out by using the POT-based method with the GPD distribution and the Hill estimates at 16 stations located in the UVB, a spatially diverse region vulnerable to flooding. The accurate estimation of the threshold, shape, and scale parameters of the GPD was shown to be a key point in the process because their values control the tail of the GPD.
After comparison between the results of the POT-based method (Hill estimates) with the AM-based method (MLE estimates + AIC), it was found that for 80% of the stations under study the priority should be given to the former method, which shows a very good performance of the POT-based method. The results show that the characteristics (threshold, variance) of the GPD vary between stations, increase with the station’s altitude, and that the shape parameter is greater than zero for stations located in the southern, mountainous part, and in the northern, upland region, while it is mostly equal to zero in the central part with a lower altitude. This highly influenced the values of the design rainfalls that increase with the station’s altitude. This indicates that orographic enhancement increased the design rainfall depth.
The conclusions can be drawn that the method is highly competitive with other methods in designing large rainfall events and that the GPD with Hill statistics is able to properly reflect the increase in the magnitude and variability of design rainfall with the increase in each station’s altitude in the region.
With regard to future plans, considering that only mountain and upland areas were studied, it would also be worthwhile to verify whether this method will accurately estimate high precipitation totals in lowland areas, characterized by a different precipitation pattern. The next issue is seasonality, which plays an important role in temporal rainfall scheme and thus should also be considered.
Overall, the method can be recommended for water-management applications related to floods in the UVB.

Author Contributions

Conceptualization, K.K. and A.R.; methodology, K.K. and A.R.; software, K.K. and A.R.; validation, K.K. and A.R.; formal analysis, K.K. and A.R.; investigation, K.K. and A.R.; resources, K.K. and A.R.; data curation, K.K.; writing—original draft preparation, K.K. and A.R.; writing—review and editing, K.K. and A.R.; visualization, K.K. and A.R.; supervision, K.K. and A.R.; project administration, K.K. and A.R.; funding acquisition, K.K. All authors have read and agreed to the published version of the manuscript.

Funding

The paper was financed from the funds of the Cracow University of Technology, Poland.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are openly available via Dane Publiczne IMGW-PIB https://danepubliczne.imgw.pl/ (accessed on 9 July 2022).

Acknowledgments

The paper was financed from the funds of the Cracow University of Technology, Poland. The authors kindly acknowledge IMGW–PIB for daily rainfall data. The support provided in a form of a subsidy from the Ministry of Science and Higher Education for the University of Agriculture in Krakow is also gratefully acknowledged.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Pińskwar, I.; Choryński, A.; Graczyk, D.; Kundzewicz, Z. Observed changes in extreme precipitation in Poland: 1991–2015 versus 1961–1990. Theor. Appl. Climatol. 2019, 135, 773–787.
2. Licznar, P.; Zaleski, J.; Burszta-Adamiak, E.; Gajda, W.; Jurczyk, A.; Lewandowski, R.; Mańczak, P.; Mikołajewski, K.; Oktawiec, M.; Ośródka, K.; et al. Metodyka Opracowania Polskiego Atlasu Natężeń Deszczów (PANDa); Seria Publikacji Naukowo-Badawczych—Instytut Meteorologii i Gospodarki Wodnej; Instytut Meteorologii i Gospodarki Wodnej—Państwowy Instytut Badawczy: Warszawa, Poland, 2020.
3. Burszta-Adamiak, E.; Licznar, P.; Zaleski, J. Criteria for identifying maximum rainfall determined by the peaks-over-threshold (POT) method under the Polish Atlas of Rainfall Intensities (PANDa) project. Meteorol. Hydrol. Water Manag. 2019, 7, 3–13.
4. Kozubowski, T.; Panorska, A.; Qeadan, F.; Gershunov, A.; Rominger, D. Testing Exponentiality Versus Pareto Distribution via Likelihood Ratio. Commun. Stat.-Simul. C. 2009, 38, 118–139.
5. Rutkowska, A.; Banasik, K. The shape parameter of the GEV and GP distributions of annual maxima and peak over threshold discharges—Statistical analysis. Science 2014, z. XX, 95–104.
6. Bogdanowicz, E.; Stachý, J. Maksymalne opady deszczu w Polsce: Charakterystyki projektowe. In Materiały Badawcze—Instytut Meteorologii i Gospodarki Wodnej: Hydrologia i Oceanologia; Instytut Meteorologii i Gospodarki Wodnej: Warszawa, Poland, 1998; Volume 23.
7. Bogdanowicz, E.; Stachý, J. Maximum rainfall in Poland—A design approach. IAHS-AISH P. 2002, 271, 15–18.
8. Bisaga, W.; Bryła, M.; Kaźmierczak, B.; Kielar, R.; Kitowski, M.; Marosz, M.; Miętek, B.; Tokarczyk, T.; Walczykiewicz, T.; Żelazny, M.; et al. Modele Probabilistyczne Opadów Maksymalnych o Określonym Czasie Trwania i Prawdopodobieństwie Przewyższenia—Projekt PMAXTP; IMGW-PIB: Warszawa, Poland, 2022.
9. Licznar, P.; Kotowski, A.; Siekanowicz-Grochowina, K.; Oktawiec, M.; Burszta-Adamiak, E. Empirical verification of Bogdanowicz-Stachý’s formula for design rainfall intensity calculations. Ochr. Sr. 2018, 40, 21–28.
10. Lang, M.; Ouarda, T.; Bobée, B. Towards operational guidelines for over-threshold modeling. J. Hydrol. 1999, 225, 103–117.
11. Rutkowska, A.; Willems, P.; Niedzielski, T. Relation between design floods based on daily maxima and daily means: Use of the Peak over Threshold approach in the Upper Nysa Kłodzka Basin (SW Poland). Geomat. Nat. Hazards Risk 2017, 8, 585–606.
12. Pickands, J. Statistical Inference Using Extreme Order Statistics. Ann. Stat. 1975, 3, 119–131.
13. Castillo, E.; Hadi, A.S. Fitting the Generalized Pareto Distribution to Data. J. Am. Stat. Assoc. 1997, 92, 1609–1620.
14. Madsen, H.; Rosbjerg, D.; Harremoës, P. Application of the Partial Duration Series Approach in the Analysis of Extreme Rainfalls. In Proceedings of the International Yokohama Symposium, Yokohama, Japan, 20–23 July 1993; Kundzewicz, Z., Rosbjerg, D., Simonovic, S., Takuchi, K., Eds.; IAHS Press: Wallingford, UK, 1993; Volume 213, pp. 257–266.
15. Madsen, H.; Pearson, C.P.; Rosbjerg, D. Comparison of annual maximum series and partial duration series methods for modeling extreme hydrologic events: 2. Regional modeling. Water Resour. Res. 1997, 33, 759–769.
16. Madsen, H.; Rasmussen, P.F.; Rosbjerg, D. Comparison of annual maximum series and partial duration series methods for modeling extreme hydrologic events: 1. At-site modeling. Water Resour. Res. 1997, 33, 747–757.
17. Montfort, M.A.J.V.; Witter, J.V. The Generalized Pareto distribution applied to rainfall depths. Hydrol. Sci. J. 1986, 31, 151–162.
18. Madsen, H.; Rosbjerg, D. Generalized least squares and empirical bayes estimation in regional partial duration series index-flood modeling. Water Resour. Res. 1997, 33, 771–781.
19. Martins, A.L.A.; Liska, G.R.; Beijo, L.A.; de Menezes, F.S.; Cirillo, M.A. Generalized Pareto distribution applied to the analysis of maximum rainfall events in Uruguaiana, RS, Brazil. SN Appl. Sci. 2020, 2, 1479.
20. Singirankabo, E.; Iyamuremye, E. Modelling extreme rainfall events in Kigali city using generalized Pareto distribution. Meteorol. Appl. 2022, 29, e2076.
21. Martins, E.S.; Stedinger, J.R. Generalized Maximum Likelihood Pareto-Poisson estimators for partial duration series. Water Resour. Res. 2001, 37, 2551–2557.
22. Willems, P.; Guillou, A.; Beirlant, J. Bias correction in hydrologic GPD based extreme value analysis by means of a slowly varying function. J. Hydrol. 2007, 338, 221–236.
23. Hill, B.M. A Simple General Approach to Inference About the Tail of a Distribution. Ann. Stat. 1975, 3, 1163–1174.
24. Beirlant, J.; Teugels, J.L.; Vynckier, P. Practical Analysis of Extreme Values; Leuven University Press: Leuven, Belgium, 1996; p. 170.
25. Boniphace, E.R.; Willems, P. Impact of dependence in river flow data on flood frequency analysis based on regression in quantile plots: Analysis and solutions. Water Resour. Res. 2011, 47, 1–62.
26. Onyutha, C.; Willems, P. Uncertainty in calibrating generalised Pareto distribution to rainfall extremes in Lake Victoria basin. Hydrol. Res. 2014, 46, 356–376.
27. Taye, M.T.; Willems, P. Influence of climate variability on representative QDF predictions of the upper Blue Nile basin. J. Hydrol. 2011, 411, 355–365.
28. Langousis, A.; Mamalakis, A.; Puliga, M.; Deidda, R. Threshold detection for the generalized Pareto distribution: Review of representative methods and application to the NOAA NCDC daily rainfall database. Water Resour. Res. 2016, 52, 2659–2681.
29. Niedźwiedź, T.; Obrębska-Starklowa, B. Klimat. In Dorzecze górnej Wisły, cz. I; Dynowska, I., Maciejewski, M., Eds.; PWN: Kraków, Poland, 1991; Volume I, pp. 68–84.
30. Niedźwiedź, T. Extreme precipitation events on the northern side of the Tatra Mountains. Geogr. Pol. 2006, 76, 15–24.
31. Pociask-Karteczka, J. The Upper Vistula Basin—A Geographical Overview. In Flood Risk in the Upper Vistula Basin; Kundzewicz, Z.W., Stoffel, M., Niedźwiedź, T., Wyżga, B., Eds.; Springer: Cham, Switzerland, 2016; pp. 3–21.
32. Młyński, D.; Cebulska, M.; Wałęga, A. Trends, Variability, and Seasonality of Maximum Annual Daily Precipitation in the Upper Vistula Basin, Poland. Atmosphere 2018, 9, 313.
33. Dynowska, I.; Maciejewski, M. Dorzecze górnej Wisły: Opracowanie Zbiorowe; PWN: Kraków, Poland, 1991; Number t. 1. (In Polish)
34. Kundzewicz, Z.; Stoffel, M.; Niedźwiedź, T.; Wyżga, B. Flood Risk in the Upper Vistula Basin; GeoPlanet: Earth and Planetary Sciences; Springer: Cham, Switzerland, 2016.
35. Majewski, W. General characteristics of the Vistula and its basin. Acta Energetica 2013, 2, 6–15.
36. Soja, R. Hydrological Aspects of Anthropopression in the Polish Carpathians; Prace Geograficzne—Polska Akademia Nauk, PAN IG i PZ: Warszawa, Poland, 2002; pp. 1–135.
37. Cebulska, M.; Szczepanek, R.; Twardosz, R.; i Gospodarki Przestrzennej, U.J.I.G. The Spatial Distribution of Precipitation in the Upper Vistula River Basin: Average Annual Precipitation (1952–1981); T. Kościuszko Cracow University of Technology, Faculty of Environmental Engineering: Kraków, Poland, 2013; p. 83.
38. IMGW. Dane Publiczne. 2022. Available online: https://danepubliczne.imgw.pl/ (accessed on 29 July 2022).
39. Ferguson, T.S.; Genest, C.; Hallin, M. Kendall’s Tau for Serial Dependence. Can. J. Stat. 2000, 28, 587–604.
40. Beirlant, J.; Dierckx, G.; Goegebeur, Y.; Matthys, G. Tail Index Estimation and an Exponential Regression Model. Extremes 1999, 2, 177–200.
41. Willems, P. Hydrological applications of extreme value analysis. In Hydrology in a Changing Environment; Wheater, H., Kirby, C., Eds.; John Wiley: Chichester, UK, 1998; Volume 3, pp. 15–25.
42. Cunnane, C. A note on the Poisson assumption in partial duration series models. Water Resour. Res. 1979, 15, 489–494.
43. Perry, J.N.; Mead, R. On the Power of the Index of Dispersion Test to Detect Spatial Pattern. Biometrics 1979, 35, 613–622.
44. Kendall, M. A New Measure of Rank Correlation. Biometrika 1938, 30, 81–93.
45. Mann, H.B. Nonparametric Tests Against Trend. Econometrica 1945, 13, 245–259.
46. Anderson, T.W.; Darling, D.A. Asymptotic Theory of Certain “Goodness of Fit” Criteria Based on Stochastic Processes. Ann. Math. Stat. 1952, 23, 193–212.
47. Sinclair, C.; Spurr, B.; Ahmad, M. Modified anderson darling test. Commun. Stat. Theory 1990, 19, 3677–3686.
48. Laio, F. Cramer von Mises and Anderson-Darling goodness of fit tests for extreme value distributions with unknown parameters. Water Resour. Res. 2004, 40, 1–10.
49. Stephens, M.A. Asymptotic Results for Goodness-of-Fit Statistics with Unknown Parameters. Ann. Stat. 1976, 4, 357–369.
50. Maity, R. Statistical Methods in Hydrology and Hydroclimatology; Springer Nature: Singapore, 2018; p. 444.
51. Jenkinson, A.F. The frequency distribution of the annual maximum (or minimum) values of meteorological elements. Q. J. R. Meteor. Soc. 1955, 81, 158–171.
52. Papalexiou, S.M.; Koutsoyiannis, D. Battle of extreme value distributions: A global survey on extreme daily rainfall. Water Resour. Res. 2013, 49, 187–201.
53. Akaike, H.; Petrov, B.N.; Csaki, F. Information Theory and an Extension of the Maximum Likelihood Principle. In Proceedings of the 2nd International Symposium on Information Theory; Petrov, B.N., Csaki, F., Eds.; Akademiai Kiado: Budapest, Hungary, 1973; pp. 267–281.
54. Laio, F.; Di Baldassarre, G.; Montanari, A. Model selection techniques for the frequency analysis of hydrological extremes. Water Resour. Res. 2009, 45, 1–11.
55. R Core Team. R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2021.
56. Młyński, D.; Wałęga, A.; Petroselli, A.; Tauro, F.; Cebulska, M. Estimating Maximum Daily Precipitation in the Upper Vistula Basin, Poland. Atmosphere 2019, 10, 43.
Figure 1. The location of the stations used in the study. 1. Raków, 2. Sandomierz, 3. Brynica, 4. Wolbrom, 5. Chrzanów, 6. Tarnów, 7. Pilzno, 8. Rzeszów-Jasionka, 9. Skoczów, 10. Stróża, 11. Nowy Sącz, 12. Ochotnica Górna, 13. Lesko, 14. Wisłok Wielki, 15. Kasprowy Wierch, 16. Cisna.
Figure 2. Flow diagram that visualizes the consecutive steps in the POT-based estimation (the left branch) and in the AM-based estimation (the right branch).
Figure 3. Values of the dispersion index ψ and limits of the confidence interval.
Figure 4. QQ plots for four exemplary stations. (a,b) Pareto QQ plot. (c,d) exponential QQ plot.
Figure 5. Confidence intervals of P_{T=100}.
Figure 6. IF curves for various return periods T, logarithmic scale.
Figure 7. Design rainfall depth computed using the POT-based and AM-based methods. The circles represent the sample values, while the continuous lines are theoretical distributions: GPD (blue line, POT-based method) and the theoretical distribution that was selected by using the AIC (red line, AM-based method).
Figure 8. The scatterplot showing the relation between the station’s altitude and design rainfall depth for return periods of 100, 150, and 200 years. (a) With the Kasprowy Wierch station (marked in red) (b) Without the Kasprowy Wierch station. The blue or green area represents the confidence region around the regression line and its upper and lower bands are confidence limits of the predicted rainfall depths at the confidence level of 95 % .
Figure 9. The scatterplot showing the relation between the station’s altitude and variance of the GPD. (a) Including the Kasprowy Wierch station marked in red. (b) Without the Kasprowy Wierch station. The blue or green area represents the confidence region around the regression line and its upper and lower bands are confidence limits of the predicted variance of rainfall depth at the confidence level of 95 % .
Table 1. The stations used in the study and their main characteristics [37,38].

| No. | Station | Longitude | Latitude | Altitude [m a.s.l.] | Mean dp (1) [mm] | Mean AM (2) [mm] |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | Raków | 21°03′00″ | 50°41′00″ | 220 | 1.63 | 36.14 |
| 2 | Sandomierz | 21°42′57″ | 50°41′48″ | 217 | 1.55 | 38.44 |
| 3 | Brynica | 19°00′00″ | 50°28′00″ | 285 | 2.03 | 41.53 |
| 4 | Wolbrom | 19°45′00″ | 50°23′00″ | 370 | 2.06 | 43.03 |
| 5 | Chrzanów | 19°23′00″ | 50°09′00″ | 295 | 2.10 | 42.60 |
| 6 | Tarnów | 20°59′04″ | 50°01′48″ | 209 | 1.97 | 49.71 |
| 7 | Pilzno | 21°18′00″ | 49°59′00″ | 210 | 1.98 | 46.64 |
| 8 | Rzeszów-Jasionka | 22°02′32″ | 50°06′39″ | 200 | 1.76 | 37.79 |
| 9 | Skoczów | 18°47′42″ | 49°47′56″ | 295 | 2.59 | 53.84 |
| 10 | Stróża | 19°56′00″ | 49°48′00″ | 307 | 2.54 | 51.37 |
| 11 | Nowy Sącz | 20°41′21″ | 49°37′38″ | 292 | 2.01 | 45.14 |
| 12 | Ochotnica Górna | 20°14′00″ | 49°32′00″ | 620 | 2.32 | 50.48 |
| 13 | Lesko | 22°20′30″ | 49°27′59″ | 420 | 2.24 | 42.96 |
| 14 | Wisłok Wielki | 21°59′57″ | 49°22′44″ | 550 | 2.64 | 49.95 |
| 15 | Kasprowy Wierch | 19°58′55″ | 49°13′57″ | 1991 | 4.89 | 83.01 |
| 16 | Cisna | 22°20′00″ | 49°13′00″ | 540 | 2.98 | 52.47 |

(1) Daily precipitation. (2) Annual maximum.
Table 2. The values of the x_t, γ̂, and σ̂ parameters for each station.

| Station | x_t | γ̂ | σ̂ | Distribution |
| --- | --- | --- | --- | --- |
| 1. Raków | 27.5 | 0.27 | 7.39 | GPD |
| 2. Sandomierz | 23.0 | 0.30 | 6.83 | GPD |
| 3. Brynica | 25.9 | 0.29 | 7.53 | GPD |
| 4. Wolbrom | 31.3 | 0.27 | 8.31 | GPD |
| 5. Chrzanów | 27.5 | 0.00 | 10.96 | EXP |
| 6. Tarnów | 28.0 | 0.00 | 14.85 | EXP |
| 7. Pilzno | 28.8 | 0.00 | 13.30 | EXP |
| 8. Rzeszów-Jasionka | 27.2 | 0.00 | 9.82 | EXP |
| 9. Skoczów | 38.5 | 0.28 | 10.96 | GPD |
| 10. Stróża | 32.0 | 0.31 | 9.92 | GPD |
| 11. Nowy Sącz | 32.0 | 0.00 | 10.94 | EXP |
| 12. Ochotnica Górna | 27.5 | 0.30 | 8.24 | GPD |
| 13. Lesko | 38.1 | 0.18 | 6.92 | GPD |
| 14. Wisłok Wielki | 34.7 | 0.25 | 8.83 | GPD |
| 15. Kasprowy Wierch | 52.3 | 0.30 | 16.62 | GPD |
| 16. Cisna | 42.1 | 0.20 | 8.60 | GPD |
Table 3. The p-values of the test of Kendall’s τ (column 2) and of the MK test (column 3), and the ADU test statistics with the critical values ADU_crit (column 4).

| Station | Kendall’s τ p-value (lag = 1) | MK p-value | ADU / ADU_crit |
| --- | --- | --- | --- |
| 1. Raków | 0.282 | 0.960 | 0.50/1.31 |
| 2. Sandomierz | 0.438 | 0.359 | 0.25/1.27 |
| 3. Brynica | 0.240 | 0.694 | 0.61/1.30 |
| 4. Wolbrom | 0.520 | 0.247 | 0.26/1.33 |
| 5. Chrzanów | 1.000 | 0.816 | 0.15/1.30 |
| 6. Tarnów | 0.425 | 0.887 | 0.10/1.30 |
| 7. Pilzno | 0.430 | 0.839 | 0.29/1.32 |
| 8. Rzeszów-Jasionka | 0.495 | 0.332 | 0.31/1.26 |
| 9. Skoczów | 0.054 | 0.578 | 0.27/1.30 |
| 10. Stróża | 0.505 | 0.218 | 0.11/1.34 |
| 11. Nowy Sącz | 0.179 | 0.374 | 0.14/1.31 |
| 12. Ochotnica Górna | 0.969 | 0.904 | 0.23/1.29 |
| 13. Lesko | 0.710 | 0.383 | 0.14/1.25 |
| 14. Wisłok Wielki | 0.332 | 0.054 | 0.19/1.35 |
| 15. Kasprowy Wierch | 0.715 | 0.185 | 0.27/1.31 |
| 16. Cisna | 0.744 | 0.121 | 0.20/1.32 |
Table 4. The design rainfall depths for return periods T = 100 years and T = 200 years, and their 95% confidence intervals.

| Station | P_{T=100} | CI of P_{T=100} | P_{T=200} | CI of P_{T=200} |
| --- | --- | --- | --- | --- |
| 1. Raków | 106.36 | (82.06; 142.38) | 128.14 | (95.39; 178.56) |
| 2. Sandomierz | 120.95 | (94.31; 158.27) | 148.57 | (112.33; 201.01) |
| 3. Brynica | 132.41 | (104.02; 172.61) | 161.98 | (123.52; 218.19) |
| 4. Wolbrom | 124.62 | (97.63; 163.78) | 149.82 | (113.61; 204.19) |
| 5. Chrzanów | 88.54 | (79.27; 99.05) | 96.14 | (85.72; 107.09) |
| 6. Tarnów | 111.14 | (99.05; 124.66) | 121.44 | (107.85; 136.63) |
| 7. Pilzno | 101.50 | (89.94; 113.98) | 110.72 | (97.70; 124.78) |
| 8. Rzeszów-Jasionka | 79.85 | (71.22; 89.45) | 86.65 | (76.91; 97.49) |
| 9. Skoczów | 171.88 | (131.77; 228.23) | 209.37 | (154.99; 288.61) |
| 10. Stróża | 179.30 | (138.73; 238.47) | 222.26 | (166.57; 306.31) |
| 11. Nowy Sącz | 91.27 | (81.81; 101.73) | 98.86 | (88.19; 110.65) |
| 12. Ochotnica Górna | 159.67 | (127.07; 203.82) | 196.51 | (152.23; 258.19) |
| 13. Lesko | 91.34 | (75.56; 113.02) | 103.60 | (83.38; 132.17) |
| 14. Wisłok Wielki | 140.18 | (112.20; 176.44) | 167.22 | (130.12; 216.68) |
| 15. Kasprowy Wierch | 271.52 | (211.75; 358.95) | 333.97 | (252.44; 457.28) |
| 16. Cisna | 119.74 | (98.44; 147.69) | 137.96 | (110.45; 175.07) |
Table 5. The AIC values for the GEV, GA2, GA3, LOG, WE2, WE3, and GUM distributions. The lowest AIC values (in boxes) show the distribution functions that provide the best fit.
Table 5. The AIC values for the GEV, GA2, GA3, LOG, WE2, WE3, and GUM distributions. The lowest AIC values (in boxes) show the distribution functions that provide the best fit.
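| Station | GEV | GA2 | GA3 | LOG | WE2 | WE3 | GUM |
|---|---|---|---|---|---|---|---|
| 1. Raków | 483.79 | 493.34 | [470.83] | 486.68 | 510.92 | 479.29 | 486.86 |
| 2. Sandomierz | 496.87 | 507.19 | [488.86] | 498.82 | 527.11 | 503.14 | 499.39 |
| 3. Brynica | 499.35 | 501.98 | 500.70 | 498.12 | 518.05 | 503.40 | [497.84] |
| 4. Wolbrom | 509.04 | 508.94 | 508.90 | 507.14 | 517.54 | 509.70 | [507.13] |
| 5. Chrzanów | 504.91 | 503.50 | 504.31 | [502.73] | 509.68 | 504.82 | 502.93 |
| 6. Tarnów | 533.53 | 531.91 | 533.07 | [531.44] | 538.21 | 533.43 | 531.55 |
| 7. Pilzno | 507.91 | 508.10 | 506.97 | 506.16 | 517.38 | 507.62 | [506.02] |
| 8. Rzeszów-Jasionka | 483.19 | 481.69 | 483.48 | 481.67 | 487.11 | 485.27 | [481.66] |
| 9. Skoczów | 541.56 | 544.23 | 542.15 | [540.52] | 555.86 | 544.30 | 540.66 |
| 10. Stróża | [535.30] | 545.45 | 537.62 | 538.10 | 561.31 | 540.89 | 538.87 |
| 11. Nowy Sącz | 494.12 | 494.73 | 491.73 | 492.74 | 504.25 | [490.52] | 492.26 |
| 12. Ochotnica Górna | [525.93] | 540.82 | 532.97 | 531.81 | 558.98 | 539.42 | 531.65 |
| 13. Lesko | 467.97 | 469.10 | 466.80 | 467.08 | 481.15 | 467.06 | [466.00] |
| 14. Wisłok Wielki | ne * | ne | ne | ne | ne | ne | ne |
| 15. Kasprowy Wierch | 602.26 | 605.08 | [576.67] | 601.08 | 616.41 | 601.40 | 602.18 |
| 16. Cisna | 501.26 | 504.43 | [499.48] | 501.58 | 517.20 | 499.54 | 499.92 |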
Note: * ne = the distribution function of AM was not estimated.
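The selection rule behind Table 5 (choose the candidate distribution with the lowest AIC) can be illustrated with a minimal Python sketch. The AIC values are copied from Table 5 for three stations only; the names `AIC_TABLE` and `best_distribution` are hypothetical and serve the illustration, not the authors' actual workflow.

```python
# AIC values copied from Table 5 for three stations (illustrative subset).
AIC_TABLE = {
    "Raków": {"GEV": 483.79, "GA2": 493.34, "GA3": 470.83, "LOG": 486.68,
              "WE2": 510.92, "WE3": 479.29, "GUM": 486.86},
    "Stróża": {"GEV": 535.30, "GA2": 545.45, "GA3": 537.62, "LOG": 538.10,
               "WE2": 561.31, "WE3": 540.89, "GUM": 538.87},
    "Lesko": {"GEV": 467.97, "GA2": 469.10, "GA3": 466.80, "LOG": 467.08,
              "WE2": 481.15, "WE3": 467.06, "GUM": 466.00},
}


def best_distribution(aic_by_dist):
    """Return (name, aic) for the candidate with the lowest AIC (hypothetical helper)."""
    name = min(aic_by_dist, key=aic_by_dist.get)
    return name, aic_by_dist[name]


# Apply the rule station by station.
selected = {station: best_distribution(values)
            for station, values in AIC_TABLE.items()}

for station, (name, aic) in selected.items():
    print(f"{station}: {name} (AIC = {aic:.2f})")
```

For these three rows the sketch selects GA3 for Raków, GEV for Stróża, and GUM for Lesko, which are the lowest values in the corresponding rows of Table 5.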
Table 6. The scores $d_1$, $d_2$ assigned to the stations based on the comparison between the MDE and RMSE from the POT and AM methods. The number 1 shows a better fit. The final score $d$ is shown under the table.
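| Station | Case 1: MDE POT | Case 1: MDE AM | Case 1: RMSE POT | Case 1: RMSE AM | Case 2: MDE POT | Case 2: MDE AM | Case 2: RMSE POT | Case 2: RMSE AM |
|---|---|---|---|---|---|---|---|---|
| 1. Raków | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 |
| 2. Sandomierz | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 |
| 3. Brynica | 0 | 1 | 1 | 0 | 1 | 0 | 1 | 0 |
| 4. Wolbrom | 1 | 0 | 0 | 1 | 0 | 1 | 0 | 1 |
| 5. Chrzanów | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 |
| 6. Tarnów | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 |
| 7. Pilzno | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 |
| 8. Rzeszów-J. | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 |
| 9. Skoczów | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 |
| 10. Stróża | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 |
| 11. Nowy Sącz | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 |
| 12. Ochotnica G. | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 |
| 13. Lesko | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 |
| 14. Wisłok Wielki | - | - | - | - | - | - | 0 | - |
| 15. Kasprowy W. | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 |
| 16. Cisna | 0 | 1 | 1 | 0 | 0 | 1 | 1 | 0 |

Case 1: $d_{POT} = 25$, $d_{AM} = 5$. Case 2: $d_{POT} = 25$, $d_{AM} = 5$.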
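The totals under Table 6 can be reproduced with a short Python sketch. It assumes that the final score of a method is the sum of that method's MDE and RMSE scores over all stations, which is consistent with the totals reported for Case 1; the names `CASE1` and `final_scores` are hypothetical. Station 14 (Wisłok Wielki) is left out because its Case 1 entries are blank.

```python
# Case 1 scores from Table 6, one string per station in the column order
# MDE POT, MDE AM, RMSE POT, RMSE AM. Station 14 is omitted (blank entries).
CASE1 = {
    "Raków": "1010", "Sandomierz": "1010", "Brynica": "0110",
    "Wolbrom": "1001", "Chrzanów": "1010", "Tarnów": "1010",
    "Pilzno": "1010", "Rzeszów-J.": "0101", "Skoczów": "1010",
    "Stróża": "1010", "Nowy Sącz": "1010", "Ochotnica G.": "1010",
    "Lesko": "1010", "Kasprowy W.": "1010", "Cisna": "0110",
}


def final_scores(rows):
    """Sum the POT and AM scores over both criteria (assumed aggregation rule)."""
    d_pot = sum(int(r[0]) + int(r[2]) for r in rows.values())
    d_am = sum(int(r[1]) + int(r[3]) for r in rows.values())
    return d_pot, d_am


d_pot, d_am = final_scores(CASE1)
print(d_pot, d_am)  # prints "25 5"
```

Under this assumption, the POT method scores 25 against 5 for the AM method, matching the Case 1 totals shown under the table.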

Kołodziejczyk, K.; Rutkowska, A. Estimation of the Peak over Threshold-Based Design Rainfall and Its Spatial Variability in the Upper Vistula River Basin, Poland. Water 2023, 15, 1316. https://doi.org/10.3390/w15071316
