A Refined Zenith Tropospheric Delay Model Based on a Generalized Regression Neural Network and the GPT3 Model in Europe

Wei, Min; Yu, Xuexiang; Ke, Fuyang; He, Xiangxiang; Xu, Keli

doi:10.3390/atmos14121727

by Min Wei ^1,2,3, Xuexiang Yu ^1,2,3,*, Fuyang Ke ^4,5, Xiangxiang He ^6 and Keli Xu ^1,2,3

¹ School of Geomatics, Anhui University of Science and Technology, Huainan 232001, China

² Key Laboratory of Aviation-Aerospace-Ground Cooperative Monitoring and Early Warning of Coal Mining-Induced Disasters of Anhui Higher Education Institutes, KLAHEI (KLAHEI18015), Anhui University of Science and Technology, Huainan 232001, China

³ Coal Industry Engineering Research Center of Mining Area Environmental and Disaster Cooperative Monitoring, Anhui University of Science and Technology, Huainan 232001, China

⁴ School of Software, Nanjing University of Information Science and Technology, Nanjing 210044, China

⁵ Wuxi Research Institute, Nanjing University of Information Science and Technology, Wuxi 214000, China

⁶ School of Earth Sciences and Engineering, Hohai University, Nanjing 211100, China

* Author to whom correspondence should be addressed.

Atmosphere 2023, 14(12), 1727; https://doi.org/10.3390/atmos14121727

Submission received: 22 October 2023 / Revised: 13 November 2023 / Accepted: 20 November 2023 / Published: 24 November 2023

(This article belongs to the Special Issue Atmospheric Data Prediction Using Statistical, and Machine Learning Approaches of Artificial Intelligence)


Abstract:

An accurate model of the Zenith Tropospheric Delay (ZTD) plays a crucial role in Global Navigation Satellite System (GNSS) precise positioning, water vapor retrieval, and meteorological research. Current empirical models (such as the GPT3 model) reflect only the approximate trend of ZTD and cannot accurately capture nonlinear changes such as rapid fluctuations. In recent years, machine learning methods have been increasingly applied to ZTD modeling and prediction, with good results. Using the ZTD products from 53 International GNSS Service (IGS) stations in Europe during 2021 as the dataset, a Generalized Regression Neural Network (GRNN) is employed to model IGS ZTD from spatiotemporal factors and GPT3 ZTD, yielding a refined GRNN model. To verify the performance of the model, its predictions are compared with two other ZTD values: one derived from the European Centre for Medium-Range Weather Forecasts Reanalysis 5 (ERA5) data and the other obtained from the GPT3 model. The results show that the bias of the GRNN refined model is almost 0 mm, and the average Root-Mean-Square Error (RMSE) and Mean Absolute Error (MAE) are 18.33 mm and 14.08 mm, respectively. Compared with ERA5 ZTD and GPT3 ZTD, the RMSE of GRNN ZTD is lower by 19.5% and 63.4%, respectively, and the MAE is lower by 24.8% and 67.1%, respectively. The GRNN refined model also reflects the rapid fluctuations of ZTD better than the other two models. In addition, the impact of spatial and temporal factors on modeling is discussed. The findings indicate that modeling accuracy in the central region of the modeling area surpasses that at the periphery by approximately 17.8%. Accuracy is lowest from June to October and is typically best from January to April. The largest difference in accuracy was observed at station OP71 (Paris, France), with the highest accuracy (9.51 mm) recorded in April and the lowest (24.00 mm) in September.

Keywords:

GNSS; Zenith Tropospheric Delay; Generalized Regression Neural Network; ERA5; IGS data; GPT3

1. Introduction

The tropospheric delay refers to the signal delay caused by signal propagation in the troposphere (the un-ionized part of the atmosphere below 50 km). The troposphere is a neutral atmospheric region in which signal propagation is frequency-independent, so inter-frequency differencing cannot mitigate the delay [1]. In Global Navigation Satellite System (GNSS) data processing, the delay along the signal path is generally calculated as the product of the Zenith Tropospheric Delay (ZTD) and a mapping function that depends on the elevation angle [2,3]. On the one hand, tropospheric delay is an important error source in GNSS navigation and positioning [4,5]; on the other hand, the water vapor information it contains is the basis for water vapor retrieval in GNSS meteorology [6,7]. Therefore, establishing a stable and reliable ZTD model is one of the current GNSS research hot spots.

ZTD models can be divided into two broad categories based on whether meteorological parameters are required. The first type, the classic ZTD model, requires measured meteorological parameters. Common examples include the Black, Saastamoinen, and Hopfield models [8]. However, GNSS monitoring stations in many areas are not equipped with meteorological sensors and cannot obtain meteorological data. To solve this problem, many scholars began to study the second type of model, the empirical model, which does not require meteorological parameters. Empirical models use a large amount of historical data to establish a mapping between various influencing factors and ZTD, from which ZTD is estimated. A series of such models have been proposed, such as the Global Pressure and Temperature (GPT) series of models [9,10,11], the University of New Brunswick (UNB) series of models [12], and the EGNOS model [13]. Empirical models are widely used; however, the Zenith Wet Delay (ZWD) component of ZTD remains difficult to simulate and to formulate with physical equations. Consequently, this category of model still struggles to meet the requirements of high-precision GNSS positioning, water vapor retrieval, and related applications [14].

It is very difficult to study the physical mechanism of ZTD. Data-driven models using machine learning methods have become a popular research topic without explicitly providing physical mechanisms [15]. Machine learning is widely used in data modeling and prediction because it can solve problems such as classification and prediction and has strong nonlinear fitting capabilities. In recent years, many scholars have used machine learning methods to conduct extensive research on regional ZTD modeling and related fields. Wang et al. used GPS network observation data and the back-propagation neural network (BPNN) algorithm to study a ZTD prediction model. The model exhibits accuracy at the centimeter level; nonetheless, there are specific stations demonstrating suboptimal predictive performance [16]. Xiao et al. proposed an improved BP neural network, overcoming the shortcomings of over-fitting and instability of the traditional BP neural network. Nevertheless, the model fails to consider the impact of factors such as station distribution and the dimensions of the modeled area on the experimental outcomes [17]. Yang et al. used an artificial neural network (ANN) to construct the correlation between the ZTD estimated by the GPT3 model and the ZTD calculated by GNSS observation data. The proposed model demonstrates superior performance in ZTD estimation compared to the Saastamoinen model and the GPT3 model. However, an investigation into the impacts of site quantity, site distribution, and training set data volume on the proposed model was not conducted [18]. Li et al. used the ZTD products provided by IGS and ERA5 data to develop a new regional ZTD model based on the least-squares support vector machine (LSSVM). However, the model’s accuracy is influenced by an inadequate number of training samples and the heightened variability in climatic conditions within the surrounding area [19]. 
Later, long short-term memory (LSTM) and Radial Basis Function (RBF) neural networks were used to correct the ZTD value estimated by the GPT3 model in the Antarctic region. The annual average RMSE of the ZTD calculated by the corrected model was 15.7 mm, which was 10.2 mm smaller than that of the GPT3 ZTD. However, in areas with fewer stations, the accuracy is lower than that of the GPT3 model [20]. Zhang et al. used the difference between the ZTD calculated from GNSS observation data and the ZTD calculated by a periodic function model as the input dataset to train an LSTM neural network. Using the estimated ZTD in static Precise Point Positioning (PPP) significantly shortens the convergence time compared with conventional PPP methods [21]. Xu et al. used ERA5 meteorological data to calculate the ZTD value at the measuring stations and proposed an improved tropospheric delay model based on the RBF neural network. The results showed that the accuracy of the ZTD estimated by the improved model was significantly improved, and that the improvement rate is related to the density of the measuring stations [22]. Li et al. used Random Forest (RF) to fit the relationship between the residual values of GPT3 ZTD, GNSS ZTD, and spatiotemporal information, and then established an improved model of GPT3 ZTD in mainland China. The improved model captures the rapid variations in the observed ZTD more accurately, and experimental results show that its RMSE is 1.83 cm [23].

While the GPT3 model has become a subject of extensive study, it continues to exhibit certain limitations, particularly when it comes to accurately modeling swift fluctuations in ZTD, frequently linked with extreme weather variations [24]. Additional ZTD computations can be conducted through the utilization of techniques capable of capturing nonlinear variations, with machine learning methods emerging as a primary focus of research. Within the realm of machine learning algorithms, Generalized Regression Neural Networks (GRNNs) are distinguished by their robust nonlinear mapping capabilities and rapid learning pace, endowing them with a significant advantage in capturing the swift variations in ZTD values. Hence, a GRNN refined ZTD model was developed to enhance the accuracy of ZTD estimates derived from the GPT3 model. This model was built using data sourced from 53 IGS stations located within the European region. The ZTD with high accuracy was also calculated using ERA5 data in the experiments for comparative analysis [25,26]. In this study, the primary objective is to develop an enhanced modeling approach capable of accurately capturing the rapid fluctuations in ZTD, and it is user-friendly and does not necessitate intricate preprocessing of prior data. The performance of this model will be thoroughly analyzed and rigorously evaluated. The results are anticipated to serve as a significant reference point in the fields of water vapor inversion, extreme weather forecasting, and precise point positioning (PPP).

The remainder of this article is organized as follows. Section 2 provides a detailed introduction to the data and methods used in this study. Section 3 uses experiments to verify the accuracy of the proposed model by comparing it with other models. Section 4 explores how spatial and temporal factors affect the modeling accuracy of the proposed model. Finally, a concluding summary is given in Section 5.

2. Dataset and Methods

2.1. Dataset

The experiment of this article is based on 53 IGS measuring stations in Europe. The distribution of measuring stations is shown in Figure 1. These measuring stations are distributed in the area [7° W–35° E, 39° N–59° N], most of which are located in the European plains. The elevations of the measuring stations are not very different, but there are differences in sea and land distribution. Moreover, the distribution of measuring stations in the central area is relatively dense. The experimental data adopt the 2021 complete ZTD products released by the IGS center at the selected 53 measuring stations, which are regarded as true values and are called IGS ZTD.

ERA5 data are the fifth-generation global meteorological reanalysis data, updated by the European Centre for Medium-Range Weather Forecasts (ECMWF), Reading, UK, in January 2019. The spatial resolution is 0.25° × 0.25°, and the temporal resolution is 1 h. The data are divided into ERA5 hourly data on single levels and ERA5 hourly data on pressure levels. The latter divides the atmosphere into 37 pressure layers in the vertical direction and provides meteorological parameters on the surface of each pressure layer, allowing changes in meteorological parameters to be described in more detail. Previous studies in the Chinese region have shown that the atmospheric temperature and pressure estimated from ERA5 hourly data on pressure levels agree well with the measured temperature and pressure. Moreover, the ZTD computed using these estimated meteorological parameters exhibits an RMSE of approximately 11.49 mm when compared to the GNSS-derived ZTD [26]. Thus, ERA5 hourly data on pressure levels are used to calculate the ZTD in this article. Compared with calculating ZTD from ERA5 data, the GPT3 model is simple and convenient to use. By inputting the longitude, latitude, geodetic height, and time information of a point into the model, the corresponding meteorological parameters can be obtained, and the corresponding ZTD can be calculated. The specific calculation methods for both are shown below.

2.2. Different ZTD Models

2.2.1. ZTD Model Based on ERA5 Data

Since there is almost no wet delay effect on the top layer and above, this part is solved using the Saastamoinen dry delay model, and the ZTD of the remaining layers is solved using the integral method. The specific formula is as follows [27,28,29]:

\mathrm{ZTD}_{top} = 0.0022768 \times \frac{P_{top}}{1 - 0.00266 \times \cos(2φ) - 0.00028 \times h_{top}}

(1)

\mathrm{ZTD}_{level} = 10^{-6} \int N \, dh = 10^{-6} \sum_{i=1}^{n-1} \frac{(N_{i} + N_{i+1}) \times (h_{i+1} - h_{i})}{2}

(2)

\mathrm{ZTD} = \mathrm{ZTD}_{top} + \mathrm{ZTD}_{level}

(3)

In the formula, P_{top} and h_{top} denote the air pressure and altitude of the top layer, φ denotes the latitude, n denotes the total number of layers of reanalysis data above the station, h_{i} denotes the altitude of the i-th layer of reanalysis data, and N denotes the atmospheric refractivity. The calculation formula is as follows:

N = \frac{k_{1} \times (P - e)}{T} + \frac{k_{2} \times e}{T} + \frac{k_{3} \times e}{T^{2}}

(4)

e = \frac{q \times P}{0.622 + 0.378 \times q}

(5)

In the formula, k_{1} = 77.604 K/hPa, k_{2} = 64.79 K/hPa, k_{3} = 73,754,630 K^{2}/hPa, P is the atmospheric pressure (hPa), e is the vapor pressure (hPa), q is the specific humidity, and T is the temperature (K).
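Equations (1)–(5) can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' implementation: the function names are hypothetical, the value used for k_3 (3.776 × 10^5 K²/hPa, the figure quoted for Eq. (9) in Section 2.2.2) is an assumption, and h_top is assumed to enter Eq. (1) in kilometres, as is usual for the Saastamoinen formula.

```python
import numpy as np

# Refraction constants. K1 and K2 are the values quoted in the text; K3 is an
# ASSUMPTION (the k_3 value quoted for Eq. (9) in Section 2.2.2).
K1 = 77.604    # K/hPa
K2 = 64.79     # K/hPa
K3 = 3.776e5   # K^2/hPa


def vapor_pressure(q, p):
    """Eq. (5): vapor pressure (hPa) from specific humidity q and pressure p (hPa)."""
    return q * p / (0.622 + 0.378 * q)


def refractivity(p, t, q):
    """Eq. (4): refractivity N from pressure (hPa), temperature (K), specific humidity."""
    e = vapor_pressure(q, p)
    return K1 * (p - e) / t + K2 * e / t + K3 * e / t**2


def ztd_from_profile(h_m, p, t, q, lat_deg):
    """Eqs. (1)-(3): ZTD in metres from a profile sorted by increasing height.

    h_m is in metres; Eq. (1) is evaluated with h_top in kilometres (assumed).
    """
    h_m, p, t, q = (np.asarray(a, dtype=float) for a in (h_m, p, t, q))
    n = refractivity(p, t, q)
    # Eq. (2): trapezoidal integration of N between adjacent levels.
    ztd_level = 1e-6 * np.sum((n[:-1] + n[1:]) * np.diff(h_m) / 2.0)
    # Eq. (1): Saastamoinen dry delay above the top level.
    denom = (1.0 - 0.00266 * np.cos(2.0 * np.radians(lat_deg))
             - 0.00028 * h_m[-1] / 1000.0)
    ztd_top = 0.0022768 * p[-1] / denom
    # Eq. (3): total delay.
    return float(ztd_top + ztd_level)
```

In practice the profile passed to `ztd_from_profile` would start at the station height, with the lowest level obtained by the vertical interpolation or extrapolation described in the next paragraph.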

The height of ERA5 data is the geopotential height, whereas the height of the IGS station is the geodetic height, so the geodetic height needs to be converted before the two can be compared. For this purpose, the geodetic height is converted into orthometric height [30,31]. After conversion, if the station elevation lies within the height range of the pressure layers of the reanalysis data, interpolation is used to obtain the meteorological parameters at the station height, and the ZTD value is then calculated using Equations (1)–(5). If the elevation is lower than the height of the lowest pressure layer, the meteorological parameters need to be extrapolated. In this article, the temperature is extrapolated using a vertical lapse rate of −6.5 K/km, the specific humidity is taken as the value at the lowest layer, and the air pressure is estimated using the barometric formula. The mathematical form is as follows:

P (h) = P_{0} \times e^{- ρ_{0} g h / P_{0}} = P_{0} \times e^{- h / H}

(6)

where P(h) represents the air pressure value at the measuring station height h, P_{0} represents the air pressure value at sea level, which is 1013.25 hPa, and H is a composite parameter called the “scale height”. For the specific derivation and calculation process, please refer to [32]. In addition, ERA5 data are grid data, which generally do not coincide with the locations of GNSS stations. Therefore, after calculating the ZTD values at the four grid points surrounding a station, the bilinear interpolation method is used for horizontal interpolation. Finally, the ZTD estimate at the station location is obtained, which is called ERA5 ZTD.
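The vertical extrapolation and horizontal interpolation steps can be illustrated as follows. This is a minimal sketch under stated assumptions: the function names are hypothetical, the scale height is computed from Eq. (6) as H = P_0 / (ρ_0 g) using a standard sea-level air density of 1.225 kg/m³ (a value not given in the article), and the four grid points are assumed to form an axis-aligned longitude/latitude cell.

```python
import math

P0_HPA = 1013.25   # sea-level pressure from the text (hPa)
RHO0 = 1.225       # ASSUMPTION: standard sea-level air density (kg/m^3)
G = 9.80665        # standard gravity (m/s^2)
# Scale height implied by Eq. (6): H = P0 / (rho0 * g), with P0 in Pa.
H_SCALE_M = P0_HPA * 100.0 / (RHO0 * G)


def pressure_at_height(h_m):
    """Eq. (6): pressure (hPa) at height h_m (m) above sea level."""
    return P0_HPA * math.exp(-h_m / H_SCALE_M)


def temperature_below_lowest_level(t_lowest_k, h_lowest_m, h_station_m):
    """Extend temperature downward with the -6.5 K/km lapse rate."""
    return t_lowest_k + 0.0065 * (h_lowest_m - h_station_m)


def bilinear(lon, lat, lon0, lon1, lat0, lat1, z00, z10, z01, z11):
    """Bilinear interpolation; zXY is the value at grid point (lonX, latY)."""
    u = (lon - lon0) / (lon1 - lon0)
    v = (lat - lat0) / (lat1 - lat0)
    return (z00 * (1 - u) * (1 - v) + z10 * u * (1 - v)
            + z01 * (1 - u) * v + z11 * u * v)
```

For a station in the middle of a 0.25° cell, `bilinear` returns the mean of the four corner ZTD values; at a corner it returns that corner's value unchanged.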

2.2.2. ZTD Based on the GPT3 Model

The GPT3 model is the latest version of the GPT series of models. Compared with the GPT2w model, it reduces the error of the discrete mapping function at specific low elevation angles. Based only on the geographical location, time, and other information of the measuring station, the GPT3 model can output meteorological parameters such as the atmospheric pressure, temperature, and water vapor pressure at the station. The derivation formula is as follows [33]:

\begin{aligned} r(t) = {} & A_{0} + A_{1} \times \cos\left(\frac{doy}{365.25} 2π\right) + B_{1} \times \sin\left(\frac{doy}{365.25} 2π\right) \\ & + A_{2} \times \cos\left(\frac{doy}{365.25} 4π\right) + B_{2} \times \sin\left(\frac{doy}{365.25} 4π\right) \end{aligned}

(7)

where r(t) is the meteorological parameter to be estimated, doy is the day of the year, A_{0} represents its average value, and (A_{1}, B_{1}) and (A_{2}, B_{2}) represent its annual and semi-annual amplitudes, respectively. After the required meteorological quantities are obtained at the four nearest sampling points, bilinear interpolation is used to obtain the meteorological parameters at the measuring station. The obtained meteorological parameters are then used to calculate the zenith hydrostatic delay (ZHD) and zenith wet delay (ZWD) using the Saastamoinen model and the Askne model, respectively [3,34]:

\mathrm{ZHD} = 0.0022768 \times \frac{P}{1 - 0.00266 \times \cos(2φ) - 0.00028 \times H}

(8)

\mathrm{ZWD} = 10^{-6} \times \left(k_{2}^{'} + \frac{k_{3}}{T_{m}}\right) \times \frac{R_{d}}{(λ + 1) \times g_{m}} \times e_{s}

(9)

In the formula, φ and H are the latitude and geodetic height of the measuring station; the average acceleration due to gravity is g_{m} = 9.80665 m/s^{2}; k_{2}^{'} and k_{3} are the atmospheric refraction constants, which are 16.52 K/hPa and 3.776 \times 10^{5} K^{2}/hPa, respectively; and P, T_{m}, R_{d}, λ, and e_{s} are the atmospheric pressure, atmospheric weighted average temperature, gas constant of dry air, water vapor pressure lapse rate, and water vapor pressure, respectively. Summing ZHD and ZWD gives the ZTD estimated by the GPT3 model, which is called GPT3 ZTD.
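The chain from Eq. (7) to GPT3 ZTD can be sketched as below. This is an illustrative sketch, not the official GPT3 code: the function names are hypothetical, the harmonic coefficients would come from the GPT3 grid files (not reproduced here), R_d = 287.05 J/(kg·K) is an assumed value that the article does not state, and the station height is assumed to enter Eq. (8) in kilometres.

```python
import math

K2_PRIME = 16.52   # K/hPa, from the text
K3 = 3.776e5       # K^2/hPa, from the text
G_M = 9.80665      # m/s^2, from the text
R_D = 287.05       # J/(kg K); ASSUMPTION: gas constant of dry air, not given in the text


def seasonal_value(doy, a0, a1, b1, a2, b2):
    """Eq. (7): mean value plus annual and semi-annual harmonics."""
    w = 2.0 * math.pi * doy / 365.25
    return (a0 + a1 * math.cos(w) + b1 * math.sin(w)
            + a2 * math.cos(2.0 * w) + b2 * math.sin(2.0 * w))


def zhd_saastamoinen(p_hpa, lat_deg, h_m):
    """Eq. (8): ZHD in metres; the height enters in kilometres (assumed)."""
    denom = (1.0 - 0.00266 * math.cos(2.0 * math.radians(lat_deg))
             - 0.00028 * h_m / 1000.0)
    return 0.0022768 * p_hpa / denom


def zwd_askne(e_hpa, tm_k, lam):
    """Eq. (9): ZWD in metres from vapor pressure (hPa), T_m (K) and lambda."""
    return 1e-6 * (K2_PRIME + K3 / tm_k) * R_D / ((lam + 1.0) * G_M) * e_hpa


def ztd_gpt3(p_hpa, e_hpa, tm_k, lam, lat_deg, h_m):
    """GPT3 ZTD as the sum of Eqs. (8) and (9)."""
    return zhd_saastamoinen(p_hpa, lat_deg, h_m) + zwd_askne(e_hpa, tm_k, lam)
```

With illustrative inputs of 10 hPa vapor pressure, T_m = 275 K, and λ = 3, `zwd_askne` gives a wet delay of roughly 0.1 m, which is the expected order of magnitude for mid-latitude conditions.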

2.3. ZTD Model Based on GRNN

GRNN is a radial basis function network model based on mathematical statistics. It is a forward-propagation neural network with strong nonlinear mapping capabilities and fast global convergence speed [35], and it does not require finding model parameters. This makes GRNN a powerful tool for regression, approximation, fitting, and prediction.

2.3.1. The Inputs and Outputs of the Model

Research shows that ZTD has obvious spatiotemporal variation characteristics [36,37]. Therefore, this article selects the latitude, longitude, and geodetic height of the measuring station, day of year, hour of day, and GPT3 ZTD as the input of the model, while ZTD is the output value of the model. The longitude, latitude, and geodetic height information of 53 IGS stations can be obtained from the IGS official website. The relationship between the input value and the output value is expressed as follows [38]:

\mathrm{ZTD}_{pre} = φ(lat, lon, height, doy, hod, \mathrm{ZTD}_{GPT3})

(10)

where doy and hod represent the day of the year and hour of the day, respectively, while lat, lon, and height represent the latitude, longitude, and geodetic height, respectively. \mathrm{ZTD}_{GPT3} indicates the ZTD calculated by the GPT3 model. The term “\mathrm{ZTD}_{pre}” has two distinct meanings: during modeling, it denotes the IGS ZTD at the training stations, while in the application of the model, it denotes the predicted ZTD at the test stations. φ() represents a regional ZTD refined model based on GRNN that combines spatiotemporal information and GPT3 ZTD, referred to as the GRNN refined model.

2.3.2. GRNN Model

As shown in Figure 2, GRNN is composed of an input layer, a pattern layer, a summation layer, and an output layer. The model structure is as follows [39]:

(1): Input layer

The input layer receives the input data and passes them directly to the pattern layer without processing. The number of nodes in the input layer is equal to the number of features of the input data, and each node corresponds to one feature.

(2): Pattern layer

The number of neurons in the pattern layer is equal to the number of learning samples, and each neuron corresponds to a different sample. The transfer function of a pattern layer neuron is:

p_{i} = \exp \{- \frac{{(X - X_{i})}^{T} (X - X_{i})}{2 δ^{2}}\}

(11)

where i = 1, 2, \dots, n, p_{i} is the output of the i-th neuron in the pattern layer, X is the network input variable, X_{i} is the learning sample corresponding to the i-th neuron, and δ is the hyperparameter of the model.

(3): Summation layer

In the summation layer, two distinct types of neurons are employed for the summation process, with one type being computed as follows:

\sum_{i = 1}^{n} \exp \{- \frac{{(X - X_{i})}^{T} (X - X_{i})}{2 δ^{2}}\}

(12)

It executes an arithmetic summation of the outputs originating from all pattern layer neurons. The connection weight between the pattern layer and each neuron is set to 1, and the transfer function employed is as follows:

S_{D} = \sum_{i = 1}^{n} p_{i}

(13)

Another calculation formula is expressed as follows:

\sum_{i = 1}^{n} Y_{i} \exp \{- \frac{{(X - X_{i})}^{T} (X - X_{i})}{2 δ^{2}}\}

(14)

It conducts a weighted summation of the outputs of all neurons within the pattern layer. The connection weight between the i-th pattern layer neuron and the j-th summation neuron is the j-th element of the i-th output sample Y_{i}, and the transfer function employed is as follows:

S_{N j} = \sum_{i = 1}^{n} y_{i j} p_{i}, j = 1, 2, \dots, k

(15)

(4): Output layer

The number of neurons in the output layer is equal to the number of features k of the output data in the learning sample. Each neuron divides the corresponding weighted sum S_{N j} of the summation layer by S_{D}. The output of neuron j corresponds to the j-th element of the estimation result \hat{Y}(X), that is:

y_{j} = \frac{S_{N j}}{S_{D}}, j = 1, 2, \dots, k

(16)
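Because a GRNN has no iterative training, Equations (11)–(16) amount to a kernel-weighted average of the training targets. The sketch below is an illustrative NumPy version, not the authors' implementation; the function name `grnn_predict` is hypothetical. One deliberate deviation is labeled in the code: the smallest squared distance is subtracted in each row before exponentiation, which leaves the ratio in Eq. (16) unchanged but avoids 0/0 when δ is small.

```python
import numpy as np


def grnn_predict(x_train, y_train, x_query, delta):
    """GRNN estimate for each query row.

    x_train: (n, d) learning samples X_i
    y_train: (n,) or (n, k) learning outputs Y_i
    x_query: (m, d) inputs X
    delta:   spread hyperparameter of Eq. (11)
    """
    x_train = np.asarray(x_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    x_query = np.atleast_2d(np.asarray(x_query, dtype=float))
    y2d = y_train.reshape(len(y_train), -1)            # (n, k)

    # Squared Euclidean distances (X - X_i)^T (X - X_i), shape (m, n).
    diff = x_query[:, None, :] - x_train[None, :, :]
    d2 = np.sum(diff**2, axis=2)
    # Numerical safeguard (not in the paper): a per-row shift cancels in Eq. (16).
    d2 -= d2.min(axis=1, keepdims=True)

    p = np.exp(-d2 / (2.0 * delta**2))                 # pattern layer, Eq. (11)
    s_d = p.sum(axis=1, keepdims=True)                 # summation layer, Eq. (13)
    s_n = p @ y2d                                      # summation layer, Eq. (15)
    y_hat = s_n / s_d                                  # output layer, Eq. (16)
    return y_hat[:, 0] if y_train.ndim == 1 else y_hat
```

This version builds the full (m, n, d) difference array, so for the several hundred thousand samples used in this study the queries would need to be processed in batches. A small δ makes the estimate follow the nearest training samples, while a large δ pulls it toward the mean of all training targets.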

2.3.3. Determine the Hyperparameters of the GRNN Model

Before the GRNN is trained, the hyperparameter δ in Equation (11) must be determined; δ denotes the diffusion (spread) value of the GRNN. The value of δ was determined a posteriori, and the training effect of various δ values was evaluated using 10-fold cross-validation, a technique widely used to evaluate the accuracy of machine learning models [40]. It randomly and evenly divides the dataset into 10 groups, 9 of which are used as training sets for model fitting, while the remaining group is used as a test set for model testing. This process is carried out 10 times so that every group serves as the test set once. Based on previous experience, this article trains the GRNN model with δ values ranging from 0.01 to 0.15 at an interval of 0.01. Root-Mean-Square Error (RMSE) measures the deviation between the predicted value and the true value and is sensitive to outliers. Therefore, the optimal δ value is the one that minimizes the RMSE between the model output and the test values in the 10-fold cross-validation. Figure 3 shows how the RMSE changes as δ varies between 0.01 and 0.15. As δ increases, the RMSE first decreases and then gradually increases. When δ = 0.03, the RMSE reaches its minimum value of 17.71 mm, so 0.03 is selected as the hyperparameter to train the final GRNN model.
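The grid search described above can be sketched as follows. This is a minimal sketch rather than the authors' code: the names `cv_rmse`, `select_delta`, and `DELTA_GRID` are hypothetical, the GRNN is a compact single-output version of Eqs. (11)–(16), and the inputs are assumed to have been scaled to comparable ranges beforehand, since the article does not describe its normalization.

```python
import numpy as np


def grnn_predict(x_train, y_train, x_query, delta):
    """Single-output GRNN, Eqs. (11)-(16), with a per-row shift for stability."""
    d2 = np.sum((x_query[:, None, :] - x_train[None, :, :])**2, axis=2)
    d2 -= d2.min(axis=1, keepdims=True)
    p = np.exp(-d2 / (2.0 * delta**2))
    return (p @ y_train) / p.sum(axis=1)


def cv_rmse(x, y, delta, n_folds=10, seed=0):
    """RMSE pooled over all held-out folds of a random n_folds split."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(x)), n_folds)
    sq_err = []
    for k, test_idx in enumerate(folds):
        # The other nine folds form the training set for this round.
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != k])
        pred = grnn_predict(x[train_idx], y[train_idx], x[test_idx], delta)
        sq_err.append((pred - y[test_idx])**2)
    return float(np.sqrt(np.mean(np.concatenate(sq_err))))


def select_delta(x, y, deltas):
    """Return the delta with the smallest cross-validated RMSE and all scores."""
    scores = {d: cv_rmse(x, y, d) for d in deltas}
    best = min(scores, key=scores.get)
    return best, scores


# Candidate values 0.01, 0.02, ..., 0.15, as in the text.
DELTA_GRID = [round(0.01 * i, 2) for i in range(1, 16)]
```

The same seed is used for every candidate δ, so all candidates are scored on identical folds and their RMSE values are directly comparable.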

3. Analysis and Results

To assess the performance of the GRNN refined model, this study uses the two calculation methods presented in Section 2.2.1 and Section 2.2.2 to compute ERA5 ZTD and GPT3 ZTD for comparison. Thus, there are three types of ZTD in the experiments of this article: ERA5 ZTD, GPT3 ZTD, and GRNN ZTD, derived from the ERA5 data, the GPT3 model, and the GRNN refined model, respectively. Three statistics, i.e., bias, RMSE, and Mean Absolute Error (MAE), are selected as comparison standards. Different statistics highlight different characteristics of the results [18]. Bias shows the systematic error, that is, the average signed deviation between the predicted value and the true value; RMSE measures the deviation between the predicted value and the true value and is more sensitive to outliers in the data; MAE measures the average absolute deviation between the predicted value and the true value. The more dispersed the residuals, the greater the MAE. The calculation formulas for the three statistics are as follows:

\mathrm{Bias} = \frac{1}{N} \sum_{i=1}^{N} (\mathrm{ZTD}_{i} - \mathrm{ZTD}_{i}^{M})

(17)

\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} {(\mathrm{ZTD}_{i} - \mathrm{ZTD}_{i}^{M})}^{2}}

(18)

\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} |\mathrm{ZTD}_{i} - \mathrm{ZTD}_{i}^{M}|

(19)

(19)

where \mathrm{ZTD}_{i} represents the true value, \mathrm{ZTD}_{i}^{M} represents the model value, and N is the number of samples.
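Equations (17)–(19) translate directly into NumPy. The short sketch below uses hypothetical function names and follows the sign convention of Eq. (17), in which the residual is the true value minus the model value.

```python
import numpy as np


def bias(ztd_true, ztd_model):
    """Eq. (17): mean signed residual (true minus model)."""
    d = np.asarray(ztd_true, dtype=float) - np.asarray(ztd_model, dtype=float)
    return float(np.mean(d))


def rmse(ztd_true, ztd_model):
    """Eq. (18): root-mean-square residual."""
    d = np.asarray(ztd_true, dtype=float) - np.asarray(ztd_model, dtype=float)
    return float(np.sqrt(np.mean(d**2)))


def mae(ztd_true, ztd_model):
    """Eq. (19): mean absolute residual."""
    d = np.asarray(ztd_true, dtype=float) - np.asarray(ztd_model, dtype=float)
    return float(np.mean(np.abs(d)))
```

For any residual series the RMSE is at least as large as the MAE, and the gap widens when a few residuals are much larger than the rest, which is why the two are reported together.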

3.1. Preprocessing for Different ZTD

Before model training, the sample data must be aligned in time. Although the IGS ZTD, GPT3 ZTD, and ERA5 ZTD for each station were obtained in Section 2, they differ in sequence length and temporal resolution, so they must be harmonized. For a given IGS station, the time resolution of the IGS ZTD is 5 min, and data are missing for several days in the year (the number of missing days and the missing dates differ from station to station). The GPT3 ZTD and ERA5 ZTD have full sequence lengths but temporal resolutions of 1 day and 1 h, respectively. The time resolutions of the three types of data are unified first. Since the GPT3 model has obvious annual and semi-annual cycles and is not sensitive to ZTD changes over a short time (within a few hours), the change between two adjacent GPT3 ZTD values can be considered uniform. Linear interpolation is therefore used to expand the GPT3 ZTD from 1 value per day to 24 values per day. Then, the ZTD value corresponding to each hour is extracted from the IGS ZTD, so that the time resolution of all three ZTDs is 1 h. Finally, the dates with missing data are determined from the IGS ZTD of each station, and the GPT3 ZTD and ERA5 ZTD on those dates are removed, so that the sequence lengths of the three types of data are consistent. After processing, a total of 421,800 records from the 53 IGS stations in 2021 are available.
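The three alignment steps can be sketched with NumPy as below. The function names and the time encoding are assumptions made for illustration: times are zero-based counts of minutes or hours since the start of the year, each daily GPT3 value is assumed to refer to 00:00 of its day, and the last day is held constant because no following value exists to interpolate toward.

```python
import numpy as np


def daily_to_hourly(daily_values):
    """Linearly interpolate one value per day to 24 values per day."""
    daily = np.asarray(daily_values, dtype=float)
    day_index = np.arange(len(daily), dtype=float)
    hour_index = np.arange(len(daily) * 24) / 24.0
    # np.interp holds the last value constant beyond the final day.
    return np.interp(hour_index, day_index, daily)


def pick_hourly(minute_of_year, values):
    """Keep only the samples that fall exactly on the hour (e.g. from 5-min data)."""
    minute_of_year = np.asarray(minute_of_year)
    keep = minute_of_year % 60 == 0
    return minute_of_year[keep] // 60, np.asarray(values)[keep]


def drop_days(hour_of_year, values, missing_days):
    """Remove all hourly samples that fall on the given zero-based days."""
    hour_of_year = np.asarray(hour_of_year)
    keep = ~np.isin(hour_of_year // 24, list(missing_days))
    return hour_of_year[keep], np.asarray(values)[keep]
```

Applying `drop_days` to the GPT3 and ERA5 series with the set of days missing from a station's IGS series leaves the three series with identical lengths and time stamps.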

In accordance with the categorization of station types illustrated in Figure 1, the consolidated dataset is segregated into two subsets: a training set and a validation set. Data from 43 stations represented by the red circular markers are employed for training the GRNN model, while data from the remaining 10 stations serve as the validation set for assessing the model’s accuracy. The specifics concerning the 10 validation stations can be found in Table 1.

3.2. Overall Performance of the GRNN Refined Model

After the GRNN model is trained with the optimized parameter, the performance of the refined GRNN model is first assessed using statistical indicators derived from both the training and validation sets. The residual histograms of the training set and the validation set are shown in Figure 4. The absolute value of the residuals is within 80 mm for the training set and within 60 mm for the validation set. Both sets of residuals follow an approximately normal distribution, indicating that the GRNN refined model effectively eliminates systematic deviations. In the training dataset, 52% of residuals fall within the −20 to 20 mm interval, and 86.4% fall within the −40 to 40 mm range. In the validation dataset, 74.75% of residuals fall within the −20 to 20 mm interval, and 96.15% fall within the −40 to 40 mm range.

Two types of accuracy, training accuracy and test (validation) accuracy, are defined in this experiment. Training accuracy is the accuracy of the GRNN ZTD relative to the IGS ZTD obtained by applying the 10-fold cross-validation method to the training set after the optimal parameter has been determined. Test accuracy is obtained by training the final model with the data in the training set, feeding the input parameters of the validation set into the trained model to obtain the refined ZTD, and comparing it with the IGS ZTD. To further verify the internal accuracy of the GRNN model, the training set is divided by station, and the training accuracy of each training station, that is, its RMSE, bias, and MAE, is calculated separately.

As shown in Figure 5, the RMSE of the training station is mostly distributed in the range of 20–30 mm; the MAE is distributed in the range of 15–25 mm; the bias fluctuates around 0 mm, indicating that the training results of the GRNN refined model have eliminated systematic errors. None of the three statistics of the training stations showed points with large errors, which demonstrated the good performance of the GRNN refined model during model training.

To test whether there is an “over-fitting” phenomenon in the model, this article compares the training accuracy and testing accuracy of the model, as shown in Table 2. The dataset is comprised of training data from 43 stations and validation data from 10 stations, resulting in 43 and 10 sets of respective statistical metrics. These metrics serve as the basis for deriving the mean, maximum, and minimum values for the three statistical indicators, which are used to assess the two types of accuracy. Table 2 summarizes the average values of the three statistical indicators of training accuracy and test accuracy, and the square brackets correspond to the maximum and minimum values, respectively. Overall, the three statistical indicators of training accuracy and test accuracy are not significantly different, which shows that the accuracy of the trained model has not declined, and there is no “over-fitting” phenomenon.

3.3. Evaluating the External Accuracy of the GRNN Refined Model

To validate the external accuracy (test accuracy), data from the 10 validation stations listed in Table 1 were employed. Due to the limited length of the article, 6 of the 10 validation stations were randomly selected for comparison of results, as shown in Figure 6.

As can be seen from Figure 6, the GPT3 ZTD of different verification stations can only reflect the general change trend of ZTD but cannot reflect the rapid fluctuation of ZTD. Compared with GPT3 ZTD, ERA5 ZTD and GRNN ZTD are not only more consistent with IGS ZTD but also consistent with the changing trend of IGS ZTD, which shows that they both have higher accuracy. The difference is that the peaks of ERA5 ZTD at rising and falling points are mostly higher than those of IGS ZTD, which indicates that there may be a positive bias in ERA5 ZTD compared with IGS ZTD. It is worth noting that ERA5 ZTD is closer to IGS ZTD than GRNN ZTD at many peaks. This may be due to ERA5 ZTD being calculated from 37-layer measured meteorological parameters, which can be closer to the true value during periods when ZTD changes rapidly.

To further analyze the three models, the ZTD obtained from GPT3, ERA5, and GRNN is compared with the IGS ZTD. The resulting deviations are denoted DGPT3, DERA5, and DGRNN, respectively, and the residual sequences at the six stations are shown in Figure 7. The figure shows that the range of variation increases in the order DGRNN, DERA5, DGPT3, although at some stations DGRNN and DERA5 differ little. DGRNN has the smallest fluctuation range, is distributed roughly evenly on both sides of 0 mm, and behaves similarly at all stations. By contrast, DGPT3 and DERA5 vary considerably from station to station, which shows that GRNN ZTD is more stable than GPT3 ZTD and ERA5 ZTD. In addition, in some periods at the BRUX, BUCU, VILL, and WROC stations, a large DGPT3 deviation is accompanied by a large DGRNN deviation at the same time. This is because the input parameters of the GRNN refined model include GPT3 ZTD, so errors in GPT3 ZTD carry over to GRNN ZTD to some extent. At the M0SE station, all three deviations fluctuate more than at the other stations. This is because M0SE is close to the ocean, where dramatic changes in water vapor lead to rapid and irregular fluctuations in ZTD. None of the three models captures this ZTD behavior accurately.
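The three statistics used throughout this section can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `residual_stats`, the sign convention (model minus IGS, chosen so that a model lying above IGS has a positive bias), and the sample values are all assumptions made for the example.

```python
import math

def residual_stats(model_ztd, igs_ztd):
    """Return (bias, rmse, mae) in the units of the inputs (mm here)."""
    if len(model_ztd) != len(igs_ztd) or not igs_ztd:
        raise ValueError("series must be non-empty and of equal length")
    # Assumed sign convention: residual = model value minus IGS reference.
    residuals = [m - r for m, r in zip(model_ztd, igs_ztd)]
    n = len(residuals)
    bias = sum(residuals) / n                           # mean signed error
    rmse = math.sqrt(sum(d * d for d in residuals) / n)  # root-mean-square error
    mae = sum(abs(d) for d in residuals) / n            # mean absolute error
    return bias, rmse, mae

# Made-up ZTD values in mm, for illustration only (not data from the paper).
igs = [2400.0, 2410.0, 2395.0, 2420.0]
grnn = [2403.0, 2406.0, 2397.0, 2419.0]
bias, rmse, mae = residual_stats(grnn, igs)
print(f"bias={bias:.2f} mm, rmse={rmse:.2f} mm, mae={mae:.2f} mm")
```

A series whose residuals scatter on both sides of zero, as described for DGRNN, gives a bias near 0 mm even when RMSE and MAE are well above zero, which is why the three indicators are reported together.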

The RMSE, bias, and MAE of the three models at 10 verification stations are shown in Figure 8. It can be seen that the accuracy of measuring stations close to the ocean is relatively poor, and this performance is particularly obvious on GPT3 ZTD. Although the ZTD calculated by ERA5 data has considerable accuracy, its stability is relatively poor compared to the GRNN refined model. In Figure 8, it can be seen from the color changes that the accuracy of ERA5 ZTD changes at several inland stations, while the GRNN ZTD is relatively stable. In general, GRNN ZTD has the highest accuracy and is the most stable. This reflects the superiority of the GRNN refined model and verifies its applicability in the entire research area.

Table 3 shows the three statistical values of the 10 verification stations and their average values. For GRNN ZTD, RMSE is between 16.40 and 20.37 mm, bias is between −0.67 and 0.66 mm, MAE is between 12.61 and 15.98 mm, and the average values are 18.33 mm, 0.07 mm, and 14.08 mm, respectively. In comparison to ERA5 ZTD and GPT3 ZTD, the proposed model achieves a significant improvement: bias is virtually eliminated, RMSE is reduced by 19.5% and 63.4%, and MAE is reduced by 24.8% and 67.1%, respectively. These results show that the overall error of the GRNN refined model is small and that its overall accuracy has clear advantages over the ERA5 and GPT3 models; of the three, GRNN ZTD agrees best with IGS ZTD. At the same time, at the M0SE station close to the ocean, the accuracy of GRNN ZTD is also lower than its overall accuracy. This reflects a common and difficult problem: near the ocean, water vapor accumulates and changes rapidly and irregularly, and the effect of this on ZTD modeling accuracy is difficult to eliminate.
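The quoted percentage reductions follow directly from the mean values in Table 3. The short check below recomputes them; the helper name `percent_reduction` is hypothetical, and the numbers are the Table 3 means.

```python
def percent_reduction(reference, refined):
    """Percentage by which `refined` is smaller than `reference`."""
    return 100.0 * (reference - refined) / reference

# Mean values over the 10 verification stations, from Table 3 (mm).
mean_rmse = {"GRNN": 18.33, "ERA5": 22.78, "GPT3": 50.08}
mean_mae = {"GRNN": 14.08, "ERA5": 18.72, "GPT3": 42.84}

for name in ("ERA5", "GPT3"):
    rmse_cut = percent_reduction(mean_rmse[name], mean_rmse["GRNN"])
    mae_cut = percent_reduction(mean_mae[name], mean_mae["GRNN"])
    print(f"vs {name}: RMSE -{rmse_cut:.1f}%, MAE -{mae_cut:.1f}%")
```

Each reduction is taken relative to the model being compared against (ERA5 or GPT3), which reproduces the 19.5% and 63.4% figures for RMSE and the 24.8% and 67.1% figures for MAE.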

4. Impact of Spatiotemporal Factors on Modeling Accuracy

Numerous factors influence modeling accuracy, including factors associated with the volume of training data within the model, the geographical placement of validation stations, and temporal fluctuations. Consequently, this section is dedicated to investigating the impacts of station spatial distribution and temporal variations on the accuracy of the GRNN refined model.

4.1. Impact of Spatial Factors on Modeling Accuracy

To explore the impact of the distribution of stations on modeling accuracy, this section redesigned an experiment: Selecting one station as the verification station, using the remaining 52 stations as training stations (using their data to train the model), and then using the trained GRNN refined model to verify the ZTD accuracy of this verification station. For example, the OBE4 station is used as the verification measuring station, the remaining 52 stations are used to train the model, and then the trained model is used to verify the ZTD accuracy of the OBE4 station. Experiments were conducted at 10 verification stations in sequence, and the results are shown in Figure 9. The subfigure in the upper left corner of Figure 9 shows the location distribution of the 53 stations, the red triangle icon represents the verification measurement station and the blue circular icon represents the training measurement station. The remaining three subfigures represent the bias, RMSE, and MAE of the 10 verification measurement stations, respectively. The darker the color, the higher the accuracy.
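The leave-one-station-out procedure described above can be sketched in a few lines. A GRNN prediction is a Gaussian-kernel weighted mean of the training targets, which is what `grnn_predict` implements. Everything else here is an assumption for illustration: the station names `STA1` to `STA3`, the two normalized input features, the toy target function, and the spread value `sigma=0.3` are hypothetical and are not the inputs or hyperparameters used in the paper.

```python
import math

def grnn_predict(train_x, train_y, query, sigma):
    """GRNN output: Gaussian-kernel weighted mean of the training targets."""
    weights = [
        math.exp(-sum((q - x) ** 2 for q, x in zip(query, row)) / (2.0 * sigma ** 2))
        for row in train_x
    ]
    total = sum(weights)
    if total == 0.0:  # query too far from every sample for this sigma
        return sum(train_y) / len(train_y)
    return sum(w * y for w, y in zip(weights, train_y)) / total

def leave_one_station_out(samples, sigma):
    """samples: {station: [(features, target), ...]} -> {station: RMSE}."""
    scores = {}
    for held_out in samples:
        # Train on every station except the one being verified.
        train_x = [f for s, rows in samples.items() if s != held_out for f, _ in rows]
        train_y = [t for s, rows in samples.items() if s != held_out for _, t in rows]
        errors = [grnn_predict(train_x, train_y, f, sigma) - t
                  for f, t in samples[held_out]]
        scores[held_out] = math.sqrt(sum(e * e for e in errors) / len(errors))
    return scores

def toy_target(features):
    """Hypothetical smooth 'ZTD' in mm; stands in for real IGS ZTD."""
    return 2400.0 + 50.0 * features[0] + 20.0 * features[1]

# Three hypothetical stations on a line (position 0, 0.5, 1), five epochs each.
samples = {
    name: [((pos, k / 4.0), toy_target((pos, k / 4.0))) for k in range(5)]
    for name, pos in (("STA1", 0.0), ("STA2", 0.5), ("STA3", 1.0))
}
scores = leave_one_station_out(samples, sigma=0.3)
for name, rmse in scores.items():
    print(f"{name}: RMSE = {rmse:.2f} mm")
```

In this constructed example the middle station is interpolated from neighbors on both sides, while the two end stations must be extrapolated, so the middle station gets the smallest RMSE. That mirrors the center-versus-edge behavior discussed in this section, but the toy data are not evidence for it.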

It can be seen from Figure 9 that the stations with small bias are almost all located inland or in areas with a dense distribution of stations, while the stations with large bias are close to the ocean. The GRAC station (6.921° E, 43.754° N) has the largest bias, −0.615 mm. This shows that proximity to the ocean has a greater impact on the bias of a station than the station distribution does. RMSE and MAE behave consistently at each station. Stations located at the edge of the modeling area have lower accuracy than stations located in its center. The most accurate station is OBE4 (11.277° E, 48.085° N), located in the center, with an RMSE and MAE of 17.36 mm and 13.39 mm, respectively; this is consistent with Table 3 in Section 3.3, where OBE4 also has the lowest RMSE and MAE. Further analysis shows that, in addition to the location of the verification station itself, the distribution of stations around it has an influence on accuracy that cannot be ignored. For example, although all four stations lie at the edge of the modeling area, BUCU (26.126° E, 44.464° N) and OP71 (2.335° E, 48.836° N) are less accurate than BRUX (4.359° E, 50.798° N) and VILL (3.952° W, 40.443° N). BRUX and VILL have a denser distribution of nearby stations than BUCU and OP71, which may have a positive impact on their accuracy.

Overall, in regional ZTD modeling, the location of a station within the network and the density of the sites surrounding it significantly influence its accuracy. Accuracy is higher for stations in the center of the modeling area or with dense surrounding sites than for stations at the edge of the modeling area or with sparse surrounding sites. The station with the smallest RMSE and MAE is OBE4, located at the center of the modeling area (17.36 mm and 13.39 mm, respectively); the station with the largest RMSE and MAE is M0SE, located at the edge of the modeling area (21.21 mm and 16.66 mm, respectively). However, it should also be noted that the verification accuracy of the WROC station, which lies in a relatively central area with densely distributed surrounding stations, deviates from this pattern. This shows that, in addition to the location of a station within the network, its accuracy is also affected by other spatial factors, such as latitude and elevation. The distance from the ocean also affects the accuracy of a station, because the closer a station is to the ocean, the more drastic the changes in water vapor.

4.2. Impact of Time Factors on Modeling Accuracy

Usually, climate conditions such as rainfall, temperature, humidity, etc. will change differently in an area at different times of the year. This will affect the ZTD change trend and fluctuation amplitude above the measuring station, and thus affect the accuracy of ZTD modeling. To verify whether the accuracy of the same model changes in different time periods, this section calculates the monthly MAE of 10 verification stations to reflect the error changes of each station from January to December. Figure 10 plots the relationship between the MAE of the 10 verification stations and each month. The horizontal axis in the figure represents the 10 verification measuring stations, and the vertical axis represents the monthly MAE of a certain measuring station from January to December. Each small grid represents the monthly average absolute deviation value of a certain measuring station in a certain month. The redder the color, the greater the error. It can be intuitively seen from the figure that the red squares are mainly concentrated from June to October, which shows that the monthly average absolute deviation of each station from June to October is larger than that in other months. That is, the accuracy of the model from June to October is worse than the accuracy in other months. In addition, it is noted that the number of red squares and white squares at stations close to the ocean, such as the BRUX, M0SE, and OP71 stations, is higher than that of other stations. This also confirms the conclusion in Section 4.1 that “the distance from the ocean is also a factor that affects the accuracy of the measurement station”, indicating that the modeling accuracy of the measurement station close to the ocean is relatively low.

For a more detailed examination of the temporal influence on model accuracy, the monthly MAE of each validation station is presented in Table 4. As can be seen from Table 4, the monthly MAE varies greatly, with the maximum value exceeding 20 mm and the minimum value below 10 mm. Moreover, the monthly MAE of almost every station first increases and then decreases over the year, with the maximum values appearing from June to October, which shows that the accuracy of the model is poor in those months. In contrast, the minimum monthly MAE generally appears from January to March, and at a few stations in April. It is known that the water vapor content is notably higher and more variable during the summer season. This directly affects the accuracy of ZWD, which is less precise in summer than in winter, and explains why ZTD has a larger MAE during the summer months.

In summary, the modeling accuracy of the GRNN refined model varies over time: accuracy from June to October (summer and autumn) is significantly lower than from January to March (winter).
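The seasonal pattern can be checked directly against the monthly MAE values in Table 4. The sketch below is only an illustration: the variable names are hypothetical, the values are transcribed from Table 4 (the OBE4 March entry rounded to two decimals), and the January to March and June to October windows are the ones named in the text.

```python
stations = ["BRUX", "BUCU", "GRAC", "LAMA", "M0SE",
            "OBE4", "OP71", "SPT0", "VILL", "WROC"]

# Rows are January..December; columns follow `stations` (Table 4, mm).
monthly_mae = [
    [11.93, 12.22, 11.16, 8.92, 13.93, 8.59, 11.41, 10.36, 10.29, 8.82],
    [14.81, 13.83, 11.59, 9.65, 12.04, 11.98, 15.18, 10.55, 12.48, 11.32],
    [11.95, 8.81, 7.96, 11.45, 10.66, 7.02, 12.46, 14.25, 12.62, 11.39],
    [8.99, 10.17, 10.03, 10.54, 15.99, 9.44, 9.51, 8.21, 13.01, 10.33],
    [14.23, 10.93, 17.02, 11.43, 18.04, 11.09, 14.72, 8.49, 14.74, 10.66],
    [17.56, 13.84, 12.03, 17.62, 19.04, 15.66, 16.28, 20.44, 15.15, 14.78],
    [14.37, 17.70, 17.70, 16.59, 17.45, 17.66, 12.25, 16.84, 14.31, 22.50],
    [15.92, 18.61, 18.01, 13.32, 19.07, 12.94, 18.01, 14.77, 18.74, 15.74],
    [18.56, 12.57, 15.66, 12.98, 17.60, 15.80, 24.00, 18.91, 17.26, 16.63],
    [20.18, 12.26, 13.71, 20.59, 19.36, 14.99, 20.09, 21.07, 18.73, 17.27],
    [14.13, 13.73, 13.61, 10.95, 14.69, 12.92, 13.87, 16.76, 10.81, 13.22],
    [15.52, 10.09, 11.74, 14.31, 13.93, 13.16, 16.50, 12.36, 13.23, 13.33],
]

def column(j):
    """Monthly MAE series (January..December) for station index j."""
    return [row[j] for row in monthly_mae]

# Spread between the worst and best month at each station.
spread = {s: max(column(j)) - min(column(j)) for j, s in enumerate(stations)}
# Mean MAE over January-March and over June-October.
winter = {s: sum(column(j)[0:3]) / 3 for j, s in enumerate(stations)}
summer_autumn = {s: sum(column(j)[5:10]) / 5 for j, s in enumerate(stations)}

widest = max(spread, key=spread.get)
print(f"largest monthly spread: {widest}, {spread[widest]:.2f} mm")
for s in stations:
    print(f"{s}: Jan-Mar {winter[s]:.2f} mm, Jun-Oct {summer_autumn[s]:.2f} mm")
```

With these values, the June to October mean exceeds the January to March mean at every one of the 10 stations, and the widest gap between the best and worst month is at OP71 (24.00 mm in September against 9.51 mm in April, a difference of 14.49 mm).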

5. Discussion

The newly released GPT3 model has better performance in terms of accuracy and stability than the GPT2w and GPT2 models. However, it primarily portrays the overarching ZTD trend and encounters challenges in accurately capturing rapid fluctuations within a brief time frame. This is a common limitation associated with empirical models. Hence, an improved GPT3 model built upon the foundation of the GRNN is proposed, called the GRNN refined model. It successfully addresses the limitation of the GPT3 model, which struggles to capture the swift variations in ZTD. For the purpose of more extensive comparative analysis, ERA5 ZTD values were additionally computed for 53 stations within the study area using ERA5 data. As depicted in Figure 6 and Figure 7, it becomes evident that the GRNN ZTD and ERA5 ZTD exhibit a consistent trend and effectively capture the rapid fluctuations in IGS ZTD. However, there is a noteworthy positive deviation of ERA5 from the IGS ZTD. It is noteworthy that the procedure for ERA5 ZTD entails a substantial level of complexity and demands a significant investment of time and computational resources. This holds true for both the initial data retrieval within the study area and the subsequent ZTD computation. The GRNN refined model is characterized by its relative simplicity, necessitating the inclusion of information pertaining to IGS stations, temporal data, and IGS ZTD for training within the model. This is an absolute advantage for near real-time water vapor inversion, PPP, PPP-RTK, and others. The GRNN refined model combines the advantages of the GPT3 model and the ERA5 data. It can not only reflect the real ZTD rapid fluctuation changes as well as the ERA5 ZTD but is also easy to calculate as the GPT3 ZTD. While the GRNN ZTD exhibits inferior performance compared to ERA5 ZTD at its peak, both models share a consistent trend and effectively capture the fluctuation in ZTD. 
Furthermore, the GRNN model substantially mitigates bias, whereas ERA5 still retains a non-negligible bias value. Consequently, it can be asserted that the GRNN model outperforms ERA5 in capturing the rapid fluctuations of ZTD.

The experimental outcomes clearly indicate that the GRNN ZTD outperformed the GPT3 ZTD and ERA5 ZTD. The GRNN ZTD effectively eliminated bias, demonstrating an RMSE of 18.33 mm and MAE of 14.08 mm. In comparison to ERA5 ZTD and GPT3 ZTD, RMSE exhibited reductions of 19.5% and 63.4%, respectively, while MAE demonstrated reductions of 24.8% and 67.1%, respectively. As depicted in Figure 8, it is evident that the GRNN refined model surpasses the other two models, particularly in the coastal region characterized by rapid and intricate water vapor fluctuations. This observation validates its robust and stable performance. Furthermore, the influence of two key factors, namely spatial and temporal variables, on modeling accuracy was examined. The findings clearly indicated that both temporal and spatial factors exert a notable impact on the accuracy of the modeling process. In the spatial context, sites positioned within the central area of the modeled region, or those surrounded by a denser distribution of neighboring sites, typically exhibit higher accuracy than sites located at the periphery of the modeled region. For example, the RMSE of BRUX, located at the edge of the region (19.33 mm), is 17.86% higher than that of OBE4, located at the center of the region (16.40 mm). Concerning temporal variation, modeling accuracy tends to be comparatively lower during the period from June to October than in other months. Notably, the largest difference between monthly MAE values is observed at station OP71, with a magnitude of 14.49 mm. Moreover, several conclusions derived from this research align with previous findings. Specifically, the spatial distribution of stations significantly influences the model’s accuracy, with greater precision achieved through a denser distribution. 
Additionally, the model’s accuracy is observed to be seasonally dependent, generally exhibiting lower accuracy during summer compared to winter.

Nonetheless, it is important to acknowledge that this study does have certain limitations. Firstly, the GRNN refined model still exhibits a non-negligible error in approximating the true value at its peak, consequently impacting the precision when applying ZTD for water vapor inversion or PPP, leaving room for improvement. Secondly, the study’s reliance on just one year of data may be considered insufficient for constructing a long-term and stable model. Subsequent research endeavors should prioritize expanding the temporal scope of experimental data and exploring the implementation of deep learning models with more intricate structures for ZTD modeling. This approach will enhance the model’s stability and ensure greater alignment with actual ZTD fluctuations. Simultaneously, it will bring the predicted model values, especially at peak points, into closer proximity to the true values, thus fulfilling the exacting requirements of high precision.

6. Conclusions

In this study, a revised GPT3 model for ZTD estimation in Europe based on the GRNN (the GRNN refined model) is proposed. The model uses the spatiotemporal information of the IGS stations and the GPT3 ZTD as input terms and the IGS ZTD as the output term, and the 10-fold cross-validation method is used to determine the hyperparameters of the model. The proposed model is trained and validated using ZTD data from 53 IGS stations in the European region for the year 2021 as the reference values. To assess the performance of the ZTD obtained from the GRNN refined model, comparative analyses were conducted with ERA5 ZTD and GPT3 ZTD, calculated using the ERA5 dataset and the GPT3 model, respectively. The results of the research are as follows:

(1): The accuracy of GRNN ZTD surpasses that of ERA5 ZTD and GPT3 ZTD. Specifically, the bias of GRNN ZTD is nearly 0 mm, and its RMSE and MAE are 18.33 mm and 14.08 mm, respectively. These values represent reductions of 19.5% and 24.8% relative to ERA5 ZTD, and of 63.4% and 67.1% relative to GPT3 ZTD. Moreover, the GRNN refined model demonstrates remarkable stability, notably outperforming the other two models, particularly in the coastal areas characterized by rapid and complex water vapor fluctuations.
(2): Both spatial and temporal factors exert a significant influence on modeling accuracy. In terms of spatial considerations, model accuracy is higher for stations situated in the central regions of the modeling area or surrounded by a dense distribution of neighboring stations than for stations positioned at the periphery of the modeling area or surrounded by a sparser distribution of stations. The experimental findings reveal that BRUX (positioned at the edge of the region) exhibits an RMSE of 19.33 mm, which is 17.86% higher than that of OBE4 (located in the central region), with an RMSE of 16.40 mm. Temporally, a consistent trend is observed at nearly all stations: the monthly MAE first increases and then decreases, with the maximum primarily occurring between June and October. Notably, the largest difference between monthly MAE values is recorded at station OP71, reaching 14.49 mm. This underscores the influence of temporal factors on modeling accuracy, with the model demonstrating lower accuracy during the summer months than in winter. This discrepancy can be attributed to the complexity of weather changes during the summer period.

In future research, in addition to increasing the amount of training data and replacing the training model, other methods to improve accuracy should be considered. On one hand, the incorporation of parameters encompassing atmospheric humidity into the model inputs is contemplated to bring the prediction results into closer alignment with actual values. On the other hand, methodological studies can incorporate sensitivity analyses, by making a small change in the inputs and checking how much the results change and which method provides better results.
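The sensitivity analysis mentioned above can be sketched as a one-at-a-time perturbation: nudge each input by a small relative step and record how much the output moves. This is only a sketch under stated assumptions. The function `sensitivity`, the input names, and `toy_model` are hypothetical; `toy_model` is a made-up linear stand-in, not the GRNN refined model, and its coefficients carry no physical meaning.

```python
def sensitivity(model, inputs, rel_step=0.01):
    """Change in model output when each input is nudged by rel_step (1% default)."""
    base = model(inputs)
    changes = {}
    for name, value in inputs.items():
        # Relative step; fall back to an absolute step for zero-valued inputs.
        step = rel_step * abs(value) if value != 0 else rel_step
        perturbed = dict(inputs, **{name: value + step})
        changes[name] = model(perturbed) - base
    return changes

def toy_model(x):
    """Hypothetical stand-in for a trained ZTD model (NOT the GRNN refined model)."""
    return 0.9 * x["gpt3_ztd_mm"] + 0.5 * x["humidity_pct"] - 0.02 * x["height_m"]

# Illustrative input values only.
inputs = {"gpt3_ztd_mm": 2400.0, "humidity_pct": 60.0, "height_m": 650.5}
changes = sensitivity(toy_model, inputs)
for name, delta in sorted(changes.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name}: output change {delta:+.4f} mm for a 1% increase")
```

Ranking the inputs by the size of the output change gives a first indication of which ones the prediction depends on most. For a nonlinear model such as a GRNN, the result depends on the operating point, so the perturbation would need to be repeated at several representative inputs.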

Author Contributions

Conceptualization, M.W.; methodology, M.W.; validation, M.W.; formal analysis, X.Y. and F.K.; data curation, M.W. and K.X.; writing—original draft preparation, M.W.; writing—review and editing, F.K., X.H. and K.X. All authors have read and agreed to the published version of the manuscript.

Funding

This research was financially supported by The Key Research and Development Program of Anhui Province (Grant No. 202104a07020014), The Major science and technology projects of Anhui Province (Grant No. 202103a05020026), The Natural Science Foundation of Jiangsu Province (Grant No. BK20211037), the Jiangsu Province Science and Technology project Social development project (Grant No. BE2021622), and The National Natural Science Foundation of China (Grant No. 41674036).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The ZTD data of IGS stations are available at ftp://igs.gnsswhu.cn/pub/gps/products/troposphere/new (accessed on 9 August 2023). The ERA5 data from ECMWF of the research region are available at https://cds.climate.copernicus.eu/cdsapp#!/dataset/reanalysis-era5-pressure-levels?tab=form (accessed on 22 August 2023). The codes associated with the GPT3 model can be found at https://vmf.geo.tuwien.ac.at (accessed on 5 July 2023).

Acknowledgments

The authors would like to acknowledge ECMWF for providing ERA5 data, IGS for providing ZTD data, and Landskron and Böhm for providing the GPT3 model.

Conflicts of Interest

The authors declare no conflict of interest.

Figure 1. IGS stations distribution map. The triangle represents the verification stations, and the circle represents the training stations.

Figure 2. The GRNN model structure for refining the GPT3 ZTD.

Figure 3. RMSE obtained by training GRNN models with different hyperparameters.

Figure 4. Histogram of the residuals for the training and validation sets in GRNN.

Figure 5. Accuracies at training sites of RMSE, bias, and MAE, respectively.

Figure 6. The plot shows the ZTD time series from IGS and three models in six sample stations.

Figure 7. Residual errors at the six sample sites for the three models.

Figure 8. Map showing bias, RMSE, and MAE at 10 validation stations for the three methods.

Figure 9. RMSE, bias, and MAE at the 10 verification sites. TS represents training stations, and VS represents verification stations.

Figure 10. A heatmap of the monthly MAE for 10 validation stations.

Table 1. Information of verification stations.

Station Name	Latitude (°)	Longitude (°)	Height (m)
BRUX	50.798	4.359	158.3
BUCU	44.464	26.126	143.2
GRAC	43.754	6.921	1319.8
LAMA	53.892	20.67	186.7
M0SE	41.893	12.493	120.6
OBE4	48.085	11.277	650.5
OP71	48.836	2.335	124.5
SPT0	57.715	12.891	219.9
VILL	40.443	−3.952	647.4
WROC	51.113	17.062	180.3

Table 2. Training accuracy and test accuracy of the GRNN refined model (unit: mm).

Statistical Indicators	Training Accuracy	Test Accuracy
Bias	0.02 [−0.91–0.88]	0.07 [−0.67–0.66]
RMSE	22.01 [16.77–26.5]	18.33 [16.40–20.37]
MAE	17.11 [13.2–20.75]	14.08 [12.61–15.98]

Table 3. The prediction accuracy results of the three models at 10 verification stations (unit: mm).

	GRNN ZTD			ERA5 ZTD			GPT3 ZTD
	Bias	RMSE	MAE	Bias	RMSE	MAE	Bias	RMSE	MAE
BRUX	−0.67	19.33	14.85	8.21	24.59	21.67	28.17	46.92	37.55
BUCU	0.33	16.73	12.90	7.11	20.55	17.28	18.88	35.27	28.85
GRAC	0.57	17.66	13.35	11.10	15.83	12.80	36.07	47.89	39.01
LAMA	0.66	17.52	13.19	7.93	15.95	12.91	83.03	89.38	83.75
M0SE	0.42	20.37	15.98	5.58	41.35	30.09	38.78	51.88	43.28
OBE4	−0.28	16.40	12.61	10.03	14.07	11.46	24.97	39.12	32.89
OP71	−0.01	20.02	15.36	7.12	33.24	27.50	15.68	37.61	30.86
SPT0	0.04	18.88	14.42	10.61	16.31	13.50	50.67	61.72	54.01
VILL	−0.24	18.57	14.28	17.01	20.44	17.92	1.37	32.79	26.85
WROC	−0.11	17.90	13.83	19.00	25.52	22.08	47.68	58.21	51.31
mean	0.07	18.33	14.08	10.36	22.78	18.72	34.53	50.08	42.84

Table 4. Monthly MAE of the GRNN refined model at 10 verification stations (unit: mm).

	BRUX	BUCU	GRAC	LAMA	M0SE	OBE4	OP71	SPT0	VILL	WROC
January	11.93	12.22	11.16	8.92	13.93	8.59	11.41	10.36	10.29	8.82
February	14.81	13.83	11.59	9.65	12.04	11.98	15.18	10.55	12.48	11.32
March	11.95	8.81	7.96	11.45	10.66	7.02	12.46	14.25	12.62	11.39
April	8.99	10.17	10.03	10.54	15.99	9.44	9.51	8.21	13.01	10.33
May	14.23	10.93	17.02	11.43	18.04	11.09	14.72	8.49	14.74	10.66
June	17.56	13.84	12.03	17.62	19.04	15.66	16.28	20.44	15.15	14.78
July	14.37	17.70	17.70	16.59	17.45	17.66	12.25	16.84	14.31	22.50
August	15.92	18.61	18.01	13.32	19.07	12.94	18.01	14.77	18.74	15.74
September	18.56	12.57	15.66	12.98	17.60	15.80	24.00	18.91	17.26	16.63
October	20.18	12.26	13.71	20.59	19.36	14.99	20.09	21.07	18.73	17.27
November	14.13	13.73	13.61	10.95	14.69	12.92	13.87	16.76	10.81	13.22
December	15.52	10.09	11.74	14.31	13.93	13.16	16.50	12.36	13.23	13.33
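
To compare months at a glance, the per-station values in Table 4 can be averaged across the 10 verification stations. The sketch below does this for two rows only (January and October) as an excerpt; the names `MONTHLY_MAE` and `station_mean` are hypothetical, and a simple unweighted mean over stations is an assumption rather than a statistic reported by the authors.

```python
from statistics import mean

# Excerpt of Table 4: monthly MAE (mm) per station, in the column order
# BRUX, BUCU, GRAC, LAMA, M0SE, OBE4, OP71, SPT0, VILL, WROC.
MONTHLY_MAE = {
    "January": [11.93, 12.22, 11.16, 8.92, 13.93, 8.59, 11.41, 10.36, 10.29, 8.82],
    "October": [20.18, 12.26, 13.71, 20.59, 19.36, 14.99, 20.09, 21.07, 18.73, 17.27],
}


def station_mean(month: str) -> float:
    """Unweighted mean MAE (mm) over all stations for the given month."""
    return mean(MONTHLY_MAE[month])


for month in MONTHLY_MAE:
    print(f"{month}: {station_mean(month):.2f} mm")
```

Extending `MONTHLY_MAE` with the remaining rows of Table 4 gives the full monthly profile.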

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
