Article

Retrieval of Atmospheric Water Vapor Content in the Environment from AHI/H8 Using Both Physical and Random Forest Methods—A Case Study for Typhoon Maria (201808)

1
Chinese Academy of Meteorological Sciences, China Meteorological Administration, Beijing 100081, China
2
Key Laboratory of Radiometric Calibration and Validation for Environmental Satellites, National Satellite Meteorological Center (National Center for Space Weather), China Meteorological Administration, Beijing 100081, China
3
Collaborative Innovation Center on Forecast and Evaluation of Meteorological Disasters, Nanjing University of Information Science and Technology, Nanjing 210044, China
4
Key Laboratory for Aerosol-Cloud-Precipitation, China Meteorological Administration, School of Atmospheric Physics, Nanjing University of Information Science and Technology, Nanjing 210044, China
5
Innovation Center for FengYun Meteorological Satellite (FYSIC), National Satellite Meteorological Center (National Center for Space Weather), China Meteorological Administration, Beijing 100081, China
6
Guangzhou Satellite Meteorological Ground Station, Guangdong Meteorological Administration, Guangzhou 510630, China
*
Author to whom correspondence should be addressed.
Remote Sens. 2023, 15(2), 498; https://doi.org/10.3390/rs15020498
Submission received: 19 December 2022 / Revised: 7 January 2023 / Accepted: 11 January 2023 / Published: 13 January 2023
(This article belongs to the Special Issue Remote Sensing of Clouds and Precipitation at Multiple Scales II)

Abstract

The advanced imagers onboard the new generation of geostationary satellites can provide multilayer atmospheric moisture information at unprecedented spatial and temporal resolutions. However, the physical retrieval algorithm (One-Dimensional Variational, 1DVAR) used for operational atmospheric water vapor products is run at reduced resolution because of its limited computational efficiency. In this study, a typical cost-efficient machine learning algorithm (Random Forest, RF) is adopted and compared with the physical retrieval algorithm for retrieving atmospheric moisture from measurements of the Advanced Himawari Imager (AHI) onboard the Himawari-8 satellite during typhoon Maria (201808). It is found that the RF-based algorithm has much higher computational efficiency and provides moisture retrievals with accuracy 35–45% better than that of 1DVAR, which means the retrieval can be conducted at full spatial resolution for potential operational application. Both the Global Forecast System (GFS) forecasts and the AHI measurements are necessary information for moisture retrievals; they provide added value for each other.

1. Introduction

Tropical cyclones (TCs), whether called hurricanes or typhoons, are typical high-impact weather events with strong winds and heavy precipitation that often cause severe damage to society. Notably, this damage has intensified over the past 50 years under a warming climate [1,2,3]. A typhoon usually weakens after landfall because less water vapor is supplied to the cyclone. The distribution of atmospheric moisture carried by a storm system or present in the surrounding environment is an important factor influencing its intensity and track [4,5,6]. The nearby and remote precipitation brought by a typhoon or by the circulation it induces requires a supply of moisture [7,8]. Research shows that different amounts of water vapor (WV) transport into two typhoons with similar intensities or tracks lead to completely different life cycles and disaster impacts [9,10], reflected in the intensity, location, and rainfall distribution [11]. Overall, precise descriptions of moisture fields are extremely important for better understanding, numerically simulating, and predicting TC development and movement.
Reanalysis datasets provide relatively reliable humidity fields and are widely used for understanding the mechanisms of typhoon development and movement [12,13]. They are commonly generated by combining various observations, including satellite remote sensing data, with numerical weather prediction models through an assimilation system [14,15]. The latest reanalysis dataset produced by the European Centre for Medium-Range Weather Forecasts (ECMWF), ERA5 (the fifth-generation ECMWF atmospheric reanalysis), has high accuracy and the highest spatio-temporal resolution among the reanalysis datasets, providing multilayer WV from 1000 hPa to 1 hPa at a temporal resolution of 1 h and a horizontal resolution of 0.25° [16]. However, compared with the current geostationary advanced imager observing systems, ERA5 provides atmospheric thermodynamic information at a much lower spatio-temporal resolution.
The Advanced Himawari Imager (AHI), onboard the Himawari-8 (H8) satellite launched in 2014, started a new era for the new-generation geostationary satellite observing system [17]. It provides full-disk infrared (IR) observations every 10 min with 2 km spatial resolution at nadir for the IR spectral bands over the East Asian region. H8 is in geostationary orbit with its nadir at 140.7°E, and the AHI has a clear resolution advantage over the reanalysis datasets. Observations with high spatio-temporal resolution are crucial for capturing fast-changing weather systems [18,19,20]. For example, Schmit et al. [21,22] show that the movement and development of clouds, convection, severe storms, and tropical cyclones are continuous even at 1 min intervals. Lee et al. [23] demonstrated that atmospheric profile products at 10 min intervals, derived from the AHI, are capable of depicting low-level moisture advection and increasing instability in the pre-landfall environment of a typhoon. Di et al. [24] pointed out that moisture spatial variations within 12 to 16 km resolution are non-negligible, especially over land and in the pre-convection environment, and that this information is an obvious indicator of several storm occurrences/development. In addition, evaluations of reanalysis datasets and regional model forecast products using WV measurements from advanced imagers onboard geostationary (GEO) weather satellites both show non-negligible moisture biases in the upper and lower troposphere [25,26]. Moreover, the reanalysis dataset is not a near real-time product: it becomes available only after a lag period, so it cannot be used for operational applications.
In contrast, high spatio-temporal resolution atmospheric thermodynamic information promptly retrieved from geostationary imager measurements (e.g., AHI), such as atmospheric vertical temperature profiles, atmospheric vertical moisture (VM) profiles, total precipitable water (TPW), layered precipitable water (LPW), and atmospheric stability indices, is irreplaceable for understanding the processes and mechanisms of rapidly developing convective systems (e.g., TCs), as well as for improving their prediction [27].
The current operational atmospheric profile retrieval algorithm applied to high-resolution GEO imagers, such as the AHI and the Advanced Baseline Imager (ABI), is the iterative one-dimensional variational (1DVAR) method [19,28]. Because of limited computational efficiency, the retrievals are performed not pixel by pixel but on a Field-of-Regard basis, where one Field-of-Regard contains a 5 × 5 pixel box, in order to meet the latency requirement for both full-disk and regional production [19,23]. For example, full-disk retrievals must be produced within 10 min to keep the temporal resolution of 10 min. This means that the spatial resolution of the atmospheric profile products is reduced from 2 km to 10 km, because processing the large volume of Level 1B data at pixel level would be too computationally expensive. A machine learning-based approach is an alternative that can fully use the original resolution of the sensor, since it can be applied online at low cost once it has been well trained offline. In addition, a machine learning algorithm sometimes performs better in atmospheric profile retrieval if it can reproduce the nonlinear relationship between the atmospheric parameters and the radiance observations [29,30,31]. The purpose of this study is to demonstrate the feasibility of using fast algorithms such as RF to produce atmospheric moisture information from advanced GEO imager measurements at the original high spatio-temporal resolution and with low latency. A typhoon case is chosen to compare 1DVAR and the machine learning-based algorithm (RF) in retrieving atmospheric moisture parameters at the original AHI spatio-temporal resolution.
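The resolution trade-off described above can be illustrated with a minimal Python sketch that groups pixels into 5 × 5 Field-of-Regard boxes. The function name `field_of_regard_means` and the choice of averaging the clear pixels in each box are assumptions made for illustration; the operational algorithm may combine the pixels differently.

```python
def field_of_regard_means(bt, clear, box=5):
    """Average clear-sky pixel values inside each box x box Field-of-Regard.

    bt    : 2-D list of brightness temperatures (K), one value per pixel
    clear : 2-D list of booleans, True where the cloud mask says clear
    Returns a 2-D list with one value per Field-of-Regard
    (None when the box contains no clear pixel).
    """
    n_rows, n_cols = len(bt), len(bt[0])
    out = []
    for r0 in range(0, n_rows - box + 1, box):
        row = []
        for c0 in range(0, n_cols - box + 1, box):
            vals = [bt[r][c]
                    for r in range(r0, r0 + box)
                    for c in range(c0, c0 + box)
                    if clear[r][c]]
            row.append(sum(vals) / len(vals) if vals else None)
        out.append(row)
    return out


# Toy 10 x 10 scene of 2 km pixels: 100 pixel-level values collapse
# to 4 Field-of-Regard values at roughly 10 km spacing.
bt = [[280.0 + r for _ in range(10)] for r in range(10)]
clear = [[True] * 10 for _ in range(10)]
clear[0][0] = False  # a cloudy pixel is skipped in the average
fors = field_of_regard_means(bt, clear)
```

In this toy scene the number of retrieval targets drops by a factor of 25, which is the saving that lets 1DVAR meet the latency requirement and the detail that a pixel-level RF retrieval keeps.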
In this study, the 1DVAR algorithm and the RF-based algorithm are both applied to retrieve atmospheric moisture products such as VM, TPW, and LPW from the AHI measurements during typhoon Maria (201808) under clear skies at original spatio-temporal resolution (2 km at nadir view). Since the thermodynamic information in the environment is associated with TC development such as intensification, the high spatio-temporal resolution atmospheric water vapor information from AHI can be used for better understanding the TC process. The specific algorithm and data used here are described in Section 2. Section 3 shows the retrieval results from the RF-based algorithm. Section 4 exhibits the comparisons of results between the physical retrieval algorithm and the RF-based algorithm. The summary and discussions are given in Section 5.

2. Data and Methods

2.1. Data

2.1.1. Forecasts

Forecasts from the Global Forecast System (GFS) developed at the National Centers for Environmental Prediction are used as input parameters in the RF-based algorithm and as background fields in the 1DVAR algorithm. The GFS provides dozens of atmospheric and land-soil variables, including temperatures, winds, precipitation, soil moisture, and atmospheric ozone concentration, at a temporal interval of three hours and a spatial resolution of 0.5°. The GFS variables adopted in the two retrieval algorithms include pressure at mean sea level (PRMSL_meansealevel), sea surface temperature (TMP_surface), and relative humidity and temperature (TMP) at each pressure level. Section 2.2 outlines the specific times of the data.

2.1.2. AHI Level1B

The AHI onboard H8, launched in 2014, has three visible, three near-infrared, and ten IR spectral bands with 0.5 km, 1 km, and 2 km spatial resolution, respectively [32], and provides full-disk measurements every 10 min. This study focuses on the retrieval of atmospheric water vapor profiles using AHI Level 1B measurements, and only IR measurements from the water vapor and window bands (bands 8–16) are used in the retrieval process. Regarding the 9.6 μm channel, Li et al. [33] show (Figure 1 in their paper) that it also contains information on low-level water vapor. Therefore, the 9.6 μm channel is included in building the model in this study. In addition, both retrieval algorithms are applied only over clear skies, so the AHI Level 2 cloud mask product is also needed for cloud screening. The cloud mask algorithm is described in [34], and its accuracy has been assessed by Wang [35]. The cloud mask product is a prerequisite for the operational retrieval algorithms of cloud macrophysical and microphysical properties and of atmospheric parameters [36,37].

2.1.3. ERA5

The ERA5, the fifth-generation reanalysis product generated by the ECMWF, assimilates various kinds of observations through its data assimilation system and provides a wide range of atmospheric parameters with high accuracy at a temporal resolution of 1 h and a horizontal resolution of 0.25° [16]. The accuracy of its temperature and WV profiles has been confirmed by various assessments against remote sensing measurements and radiosonde observations [25,38]. The ERA5 WV profiles are used as labels for the RF-based algorithm and as the (independent) validation reference for both algorithms. Specifically, specific humidity (q), mean sea level pressure (MSL), sea surface temperature (SST), the TPW (300–1000 hPa), and the LPW of the upper (300–700 hPa), middle (700–900 hPa), and lower (900–1000 hPa) layers, respectively, are to be retrieved. The TPW and LPW are calculated by the following equation:
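$$\mathrm{LPW} = \frac{1}{g}\sum_{k=1}^{n} \overline{q_k}\,\Delta P_k \tag{1}$$
where $\overline{q_k} = (q_k + q_{k-1})/2$ and $\Delta P_k = P_{k-1} - P_k$; $q_k$ is the specific humidity at level $k$; $P_k$ is the atmospheric pressure at level $k$; $g$ is the gravitational acceleration; and $n$ is the total number of atmospheric levels.
Table 1.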
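As a worked illustration of Equation (1), the following Python sketch integrates a specific humidity profile into layered and total precipitable water. It assumes that the layer-mean humidity is the average of the two bounding levels and that levels are ordered from the surface upward. The function name, the value of g, and the four-level profile are hypothetical and are not taken from the study.

```python
G = 9.80665  # standard gravity in m s^-2 (assumed value)


def layer_precipitable_water(q, p_hpa, p_bottom, p_top):
    """Integrate specific humidity between two pressure levels.

    q      : specific humidity (kg/kg) at each level
    p_hpa  : pressure (hPa) at each level, ordered from the surface upward
    Returns precipitable water in kg m^-2 (numerically equal to mm).
    """
    total = 0.0
    for k in range(1, len(q)):
        # Skip layers that are not fully inside [p_top, p_bottom].
        if p_hpa[k - 1] > p_bottom or p_hpa[k] < p_top:
            continue
        q_mean = 0.5 * (q[k] + q[k - 1])         # layer-mean humidity
        dp = (p_hpa[k - 1] - p_hpa[k]) * 100.0   # layer thickness in Pa
        total += q_mean * dp
    return total / G


# Hypothetical moist tropical profile on four levels.
p_levels = [1000.0, 900.0, 700.0, 300.0]
q_levels = [0.018, 0.012, 0.005, 0.0002]

lpw_lower = layer_precipitable_water(q_levels, p_levels, 1000.0, 900.0)
lpw_middle = layer_precipitable_water(q_levels, p_levels, 900.0, 700.0)
lpw_upper = layer_precipitable_water(q_levels, p_levels, 700.0, 300.0)
tpw = layer_precipitable_water(q_levels, p_levels, 1000.0, 300.0)
```

By construction, the three layer values add up to the TPW over 300–1000 hPa.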

2.1.4. Dropsonde

Dropsonde observations during typhoon Maria (201808) are also used to validate the retrievals independently. The dropsonde data were downloaded from https://doi.org/10.5281/zenodo.4671631 (accessed on 9 July 2018). These observations were obtained by releasing a downward-falling sounder from an aircraft; the sounder was equipped with a range of instruments, including a GPS antenna, a GPS receiver, and temperature and pressure sensors. These soundings have also been used in other studies [39].

2.2. Methods

2.2.1. Maria Typhoon

‘Maria’ was a super typhoon with a maximum intensity of level 17, a minimum central pressure of 925 hPa [40], and a long lifetime. It formed in the northwest Pacific Ocean on 4 July 2018 and caused severe economic damage [31]. There are several reasons for choosing this typhoon: (1) it was the most prominent typhoon of 2018 and has been studied most frequently, so a large amount of reference information is available; (2) it had a long duration, high intensity, a stronger impact than other typhoons, and a more pronounced rapid intensification phase, which gives it high research value; and (3) because it was strong and long-lasting, it carried more water vapor and drew more water vapor from the environmental field than other typhoons.

2.2.2. RF-Based Algorithm

The RF method is a typical ensemble machine learning model proposed by Breiman [41]. It has clear advantages in handling classification and prediction problems effectively and efficiently, which include but are not limited to the following: (1) training can be run in parallel, which greatly speeds up the algorithm, especially for large samples; (2) the importance ranking of features (input parameters) is given directly, which helps in understanding the model performance and the physical meaning behind the ranking; and (3) because it uses random sampling with replacement, the RF model has small variance and strong generalization ability. Owing to these advantages, the RF method has recently been applied in various studies and applications. For example, Turini et al. [42] and Kuehnlein et al. [43] established new approaches for rainfall estimation from high spatial and temporal resolution satellite measurements using RF, Lee et al. [30] used RF to obtain a highly accurate water vapor retrieval model, and Zhang et al. [44] found that RF can effectively improve the retrieval accuracy of TPW from thermal IR remote sensing observations.
The RF-based algorithm adopted in this study aims to retrieve atmospheric moisture products such as VM, TPW, and LPW during typhoon Maria (201808) from 4 to 11 July 2018. First, a dataset is formed that contains the inputs (the AHI measurements and the GFS forecasts) and the outputs (the ERA5). The AHI cloud mask product is used to identify clear-sky pixels for retrievals. Only clear-sky AHI pixels over ocean are used, so the retrievals are performed in the environmental region of the typhoon. The AHI and the GFS are temporally and spatially interpolated to the resolution of the ERA5 (1 h and 0.25° × 0.25°) using the nearest-distance method. To train and evaluate the RF-based retrieval model, a matchup dataset is developed by spatially and temporally collocating ERA5, GFS, and AHI radiance measurements during typhoon Maria (201808). The training and test datasets are developed in three steps. In the first step, for each ERA5 grid, the closest clear AHI pixel within a radius of 0.05° in latitude/longitude (about 5 km) is selected to form a match, and the GFS data are also spatially and temporally interpolated to each ERA5 grid. In the second step, to reduce the data volume, 1 out of every 4 ERA5 grids is picked to form the size-reduced ERA5-GFS-AHI matchup dataset (83,199 samples). In the third step, the size-reduced matchup dataset is randomly divided into training (70% of the samples) and testing (30% of the samples) subsets. Because the training and testing datasets cover the spatial and temporal variations during the typhoon period, the training is representative for this case. In addition, since only one quarter of the matchup dataset is used, the application to AHI measurements with 10 min temporal resolution and 2 km spatial resolution is considered somewhat independent.
Equation (1) is used for the variable conversion, and Table 2 outlines the specific input/output variables. As described above, the entire matchup dataset (83,199 samples; sample region: 100°E–180°E, 0°N–50°N) is randomly divided into training and test datasets containing 70% and 30% of the samples, respectively. The training dataset is used to determine several important hyperparameters of the algorithm, including the number of trees and the maximum depth of a tree. The final hyperparameters are listed in Table 3. The test dataset is used to evaluate the algorithm’s performance independently. Note that all the results shown in Section 3 and Section 4 are based on the test dataset. Again, the RF-based algorithm is applied only to AHI measurements over clear skies at the original spatial resolution of 2 km.
It is worth noting that the ERA5 is the best uniformly sampled atmospheric reanalysis; it represents the atmospheric diurnal variation very well and can be matched well with geostationary observations of high temporal resolution. The ERA5 dataset is of very good quality: it is the first reanalysis to use a 10-member ensemble to assess atmospheric uncertainties through the 4D-Var data assimilation system, and it assimilates many observations, so it can describe atmospheric conditions more accurately. The accuracy of the temperature and water vapor profiles has been confirmed by various assessments against remote sensing and radiosonde observations [25,37], and using ERA5 data as a reference to analyze vertical meteorological fields is highly advantageous [31]. The ERA5 data have much improved spatial and temporal resolutions compared with their predecessor, and the number of vertical levels has been increased from 60 to 137. Therefore, although the spatial resolution of the ERA5 data is relatively coarse, they are used as the reference truth in this study because of their high accuracy and uniform sampling. Moreover, since the purpose of this study is to retrieve water vapor information at high spatial and temporal resolutions, independent data are needed to validate the accuracy of the water vapor information obtained from AHI measurements using the RF and 1DVAR methodologies.
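The first matchup step can be sketched in Python as a nearest-neighbor search restricted to clear pixels. This is only an illustration under stated assumptions: the distance is measured in plain latitude/longitude degrees, and the function name, dictionary keys, and sample coordinates are hypothetical.

```python
import math


def match_clear_pixel(grid_lat, grid_lon, pixels, radius_deg=0.05):
    """Return the closest clear pixel within radius_deg of a grid point.

    pixels : iterable of dicts with keys "lat", "lon", "clear" (bool)
    Distance is measured in latitude/longitude degrees (an assumption).
    Returns None when no clear pixel lies inside the radius.
    """
    best, best_dist = None, radius_deg
    for px in pixels:
        if not px["clear"]:
            continue  # cloudy pixels never form a match
        dist = math.hypot(px["lat"] - grid_lat, px["lon"] - grid_lon)
        if dist <= best_dist:
            best, best_dist = px, dist
    return best


# Hypothetical pixels around an ERA5 grid point at 20.0N, 130.0E.
pixels = [
    {"id": "a", "lat": 20.01, "lon": 130.00, "clear": False},  # nearest, but cloudy
    {"id": "b", "lat": 20.02, "lon": 130.03, "clear": True},   # inside the radius
    {"id": "c", "lat": 20.10, "lon": 130.00, "clear": True},   # outside the radius
]
matched = match_clear_pixel(20.0, 130.0, pixels)
unmatched = match_clear_pixel(25.0, 130.0, pixels)
```

Grid points without any clear pixel inside the radius return `None` and would simply be left out of the matchup dataset.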
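The thinning and random division of the matchup dataset can be sketched as follows. The sketch assumes that "1 out of every 4" means keeping every fourth sample in storage order, which may differ from the spatial thinning actually used; the function name, the seed, and the integer stand-ins for samples are hypothetical.

```python
import random


def thin_and_split(samples, keep_every=4, train_fraction=0.7, seed=0):
    """Keep 1 of every keep_every samples, then split randomly into two sets."""
    thinned = samples[::keep_every]
    rng = random.Random(seed)   # fixed seed so the split is reproducible
    shuffled = thinned[:]       # copy, so the input order is left untouched
    rng.shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


matchups = list(range(4000))    # stand-ins for ERA5-GFS-AHI matchups
train_set, test_set = thin_and_split(matchups)
```

Because the two subsets are disjoint, statistics computed on the test subset are not influenced by samples seen during training, although neighboring grids from the same typhoon case remain correlated.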

2.2.3. 1DVAR

The physical retrieval algorithm for atmospheric moisture profiles adopts the 1DVAR iterative technique based on optimal estimation theory [19,38,45,46]. In the AHI retrieval, a regression with the AHI IR measurements and the GFS forecasts as predictors is used to generate the first guess, and 1DVAR is then performed by adjusting the atmospheric profile state, X, to minimize a cost function:
$$J(X) = \left[Y^m - F(X)\right]^T E^{-1} \left[Y^m - F(X)\right] + \left[X - X_b\right]^T \gamma B^{-1} \left[X - X_b\right]$$
where $X$ is the profile vector to be retrieved (containing the atmospheric temperature vertical profile, the moisture vertical profile, and the surface skin temperature [47]); $X_b$ is the first guess; $E$ and $B$ are the observational error covariance matrix of the AHI measurements and the background error covariance matrix of the first guess, respectively; the superscripts −1 and T denote the inverse and transpose of a matrix, respectively; $Y^m$ is the vector of AHI IR band measurements; $F$ is the radiative transfer model, which in this study is the Pressure-Layer Fast Algorithm for Atmospheric Transmittance (PFAAST) model [48,49]; and $\gamma$ is the regularization parameter or smoothing factor used for solution convergence [50]. The profile vector $X$ can be solved through an iterative approach (the quasi-Newton iteration is used in our algorithm):
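$$X_{n+1} = X_b + \left({F'_{X_n}}^{T} \cdot E^{-1} \cdot F'_{X_n} + \gamma B^{-1}\right)^{-1} \cdot {F'_{X_n}}^{T} \cdot E^{-1} \left[\left(Y^m - F(X_n)\right) + F'_{X_n}\left(X_n - X_b\right)\right]$$
where $F'_{X_n}$ is the Jacobian matrix of the radiative transfer model evaluated at $X_n$, and the iteration starts from the first guess $X_b$. A more detailed description of the algorithm can be found in the ABI legacy atmospheric profile (LAP) algorithm [19,28,46]. It should be noted that both the E and B matrices are taken to be diagonal for simplicity and computational efficiency.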
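To show how such an iteration behaves, the following Python sketch applies a Gauss-Newton form of the update to a deliberately tiny problem: a scalar state and two channels with diagonal error covariances. The forward model, its Jacobian, and all numbers are hypothetical stand-ins; this is not the PFAAST model or the operational configuration.

```python
import math


def forward(x):
    """Toy two-channel forward model (hypothetical): brightness temperature in K."""
    return [280.0 - 8.0 * math.log1p(x), 270.0 - 15.0 * math.log1p(x)]


def jacobian(x):
    """Derivative of each toy channel with respect to the scalar state."""
    return [-8.0 / (1.0 + x), -15.0 / (1.0 + x)]


def cost(x, y_obs, x_b, e_var, b_var, gamma=1.0):
    """Observation term plus regularized background term of the cost function."""
    obs = sum((y - f) ** 2 / e for y, f, e in zip(y_obs, forward(x), e_var))
    return obs + gamma * (x - x_b) ** 2 / b_var


def retrieve_1dvar(y_obs, x_b, e_var, b_var, gamma=1.0, max_iter=20, tol=1e-8):
    """Iterate the Gauss-Newton update, starting from the first guess x_b."""
    x = x_b
    for _ in range(max_iter):
        k = jacobian(x)
        f = forward(x)
        # Scalar analogue of (K^T E^-1 K + gamma B^-1)
        lhs = sum(ki * ki / e for ki, e in zip(k, e_var)) + gamma / b_var
        # Scalar analogue of K^T E^-1 [(y - F(x_n)) + K (x_n - x_b)]
        rhs = sum(ki / e * (y - fi + ki * (x - x_b))
                  for ki, e, y, fi in zip(k, e_var, y_obs, f))
        x_new = x_b + rhs / lhs
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x


x_true, x_b = 3.0, 2.0        # hypothetical true state and first guess
y_obs = forward(x_true)       # noise-free synthetic observations
e_var = [0.25, 0.25]          # diagonal E (variance per channel, K^2)
b_var = 1.0                   # diagonal B (background variance)
x_ret = retrieve_1dvar(y_obs, x_b, e_var, b_var)
```

In this toy setup the retrieved state lands between the first guess and the true state, close to the latter because the observation errors are small relative to the background error. Each iteration needs a forward-model and Jacobian evaluation, which is the main reason the physical method is costly at pixel level.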

2.3. Evaluation Criteria

Table 4 lists the metrics used in this study to evaluate both retrieval algorithms. Root mean square error (RMSE) and mean absolute error (MAE) are the two indicators used to measure the deviations between the retrievals and the true values. Mean absolute percentage error (MAPE) is a relative error indicator that depends on both the deviations and the true values. For these three indicators, values closer to 0 mean better retrievals. The correlation coefficient (R) quantifies the degree of linear correlation between the retrievals and the true values.
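The four indicators can be written compactly in Python. The definitions below are the standard textbook forms and are assumed to match those in Table 4; the function names and the three-value sample are hypothetical.

```python
import math


def rmse(ret, ref):
    """Root mean square error between retrievals and reference values."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(ret, ref)) / len(ref))


def mae(ret, ref):
    """Mean absolute error."""
    return sum(abs(a - b) for a, b in zip(ret, ref)) / len(ref)


def mape(ret, ref):
    """Mean absolute percentage error, in percent of the reference value."""
    return 100.0 * sum(abs(a - b) / abs(b) for a, b in zip(ret, ref)) / len(ref)


def pearson_r(ret, ref):
    """Pearson linear correlation coefficient."""
    n = len(ref)
    mean_a, mean_b = sum(ret) / n, sum(ref) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(ret, ref))
    var_a = sum((a - mean_a) ** 2 for a in ret)
    var_b = sum((b - mean_b) ** 2 for b in ref)
    return cov / math.sqrt(var_a * var_b)


reference = [10.0, 20.0, 40.0]   # hypothetical reference TPW values (mm)
retrieved = [12.0, 18.0, 44.0]   # hypothetical retrievals (mm)
scores = {
    "RMSE": rmse(retrieved, reference),
    "MAE": mae(retrieved, reference),
    "MAPE": mape(retrieved, reference),
    "R": pearson_r(retrieved, reference),
}
```

Because MAPE divides by the reference value, the same absolute error counts for more where the reference is small, which matters when comparing dry upper levels with moist lower levels.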

3. Evaluations of RF-Based Algorithm

The atmospheric moisture profile retrievals from the RF-based algorithm are evaluated against the ERA5 using the indicators listed in Section 2.3. The RF algorithm takes as inputs the AHI IR band brightness temperature measurements combined with the GFS forecasts (GFS + AHI), the same inputs as the 1DVAR algorithm. The retrieved products follow Schmit et al. [19] and include the atmospheric vertical moisture (VM) (moisture at each pressure level between 300 and 1000 hPa), the total precipitable water (TPW, integrated from 300 to 1000 hPa), and the layered precipitable water (LPW) (integrated from 300 to 700 hPa, 700 to 900 hPa, and 900 to 1000 hPa, respectively). Note that the evaluations are conducted only with the independent test dataset. The GFS moisture forecasts are also included in the evaluation to show the added value of the AHI IR band measurements over the GFS forecasts.
The RMSE and MAPE of the VM at pressure levels between 300 hPa and 1000 hPa from the GFS forecasts and the RF-based algorithm are shown in Figure 1a,b, respectively. Figure 1c,d are similar but for the LPW. The moisture RMSEs between 300 hPa and 1000 hPa are significantly reduced from the GFS forecasts to the AHI retrievals, resulting in a consistent improvement of the LPWs (upper, middle, and lower) from the combined AHI and GFS over the GFS alone. A consistent improvement is also found when the MAPE is used as the indicator. However, the MAPE is largest at 300 hPa and modest at the middle levels of the troposphere, which seems opposite to the RMSE results. This is because the moisture concentration at 300 hPa is usually quite small and increases approximately exponentially toward the surface, so it is reasonable that moisture retrievals at 300 hPa have a smaller absolute error but a larger relative error. Overall, with the AHI measurements combined with the GFS, the moisture retrievals from the RF-based algorithm are largely improved over the GFS forecasts, indicating the added value of the AHI to the forecast for applications. The specific error reductions or accuracy improvements (in percent) of the RF-based algorithm from the AHI are listed in Table 5, evaluated using the RMSE, MAPE, MAE, and R indicators, respectively. Interestingly, the accuracy improvements (in percent) of the retrievals are comparatively consistent among MAPE, RMSE, and MAE for all parameters.
Figure 1e provides the importance ranking of all the input variables. Overall, the GFS forecasts have a larger importance than the AHI measurements in the RF-based algorithm, indicating the importance of the background. The relatively important AHI measurements are the brightness temperatures from band 12 (centered at 9.6 μm) and the brightness temperature differences between band 14 (centered at 11.2 μm) and band 15 (centered at 12.3 μm). These measurements contain some information about moisture near the boundary layer, where the GFS moisture forecasts have relatively large errors (Lee et al., 2014). Interestingly, the accuracies of the moisture parameters from the RF-based algorithm with the AHI data alone are lower than those from GFS + AHI (not shown). This indicates that the GFS forecasts and the AHI measurements are both necessary information for moisture retrieval and that they provide added value for each other.
Figure 2 illustrates grid-to-grid comparisons of the VM retrievals at 300, 700, and 1000 hPa with the ERA5 based on the test dataset. As mentioned, since the spatial resolutions of the ERA5 and the retrievals are different, the retrievals need to be down-sampled to the temporal and spatial resolutions of the ERA5 for the grid-to-grid comparisons. The color represents the probability density of the samples, and the assessment indicators are given in the panels. The GFS moisture forecasts clearly show a relatively large inconsistency with the ERA5, especially in the upper troposphere. With the AHI measurements added, the VM retrievals from the RF-based algorithm agree with the ERA5 at grid level. Overall, the accuracy of the retrievals is lower in the upper troposphere than in the lower troposphere, which is consistent with, and dependent on, the accuracy of the GFS forecast itself.
The spatial distributions of the multilayer WV retrievals and the GFS forecasts at 00 UTC on 5 July 2018 are shown in Figure 3. The IR brightness temperature measurements for bands 9 (6.9 μm), 10 (7.3 μm), and 14 (11.2 μm) are shown in Figure 3a–c. Since both the RF-based and 1DVAR algorithms are applied only to clear AHI pixels, the retrieved moisture fields lie in the surrounding environment of typhoon Maria. Figure 3a–c shows that the retrieved moisture is located to the north and east of the typhoon. Figure 3d–l indicates that the patterns of the retrievals agree better with the ERA5 than with the GFS, especially to the north of the typhoon. The GFS moisture forecasts tend to be dry, while the retrievals tend to be wet. Overall, the accuracy is better in the lower troposphere.
Figure 4 shows the differences between the ERA5 and the retrievals/GFS: (a) (b) for the upper LPW, (e) (f) for the middle LPW, (i) (j) for the lower LPW, and (c) (d) for the VM at 300 hPa, (g) (h) for the VM at 700 hPa, and (k) (l) for the VM at 1000 hPa, respectively. Overall, the LPW and VM retrieval errors from the RF-based algorithm are smaller than those from the GFS forecasts. The retrieval errors from the RF-based algorithm are smaller for the lower layers and somewhat larger for the upper layers, which is consistent with the findings in Figure 2.
The dropsonde moisture observations are also used to assess the retrievals from the RF-based algorithm after temporal and spatial matching. A match requires the observation time difference between the dropsonde and the AHI to be less than 5 min and the spatial distance to be within 50 km. Limited by the observation time and the clear-sky condition, only two dropsonde observations over the ocean meet the requirements and are collocated. Figure 5 shows that, except in the upper troposphere, most of the retrievals from the RF-based algorithm are even closer to the dropsonde observations than the ERA5 is. Overall, the 1DVAR water vapor retrievals are close to the RF-based results in both cases in terms of TPW and LPW (right panels), while at significant pressure levels, 1DVAR is slightly better in the middle to upper troposphere and the RF-based model is better in the lower to middle troposphere in these two particular cases (left panels).
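The matching criteria for the dropsondes can be expressed as a short Python check. The great-circle (haversine) distance, the Earth radius, the function names, and the sample times and positions are all assumptions for illustration; the study does not state how the distance was computed.

```python
import math
from datetime import datetime

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed)


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2.0 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def is_collocated(sonde, scan, max_minutes=5.0, max_km=50.0):
    """True if time difference < max_minutes and distance <= max_km."""
    dt_min = abs((sonde["time"] - scan["time"]).total_seconds()) / 60.0
    dist = great_circle_km(sonde["lat"], sonde["lon"], scan["lat"], scan["lon"])
    return dt_min < max_minutes and dist <= max_km


# Hypothetical dropsonde and two AHI pixels (times and positions are made up).
sonde = {"time": datetime(2018, 7, 8, 5, 33), "lat": 22.0, "lon": 128.0}
scan_ok = {"time": datetime(2018, 7, 8, 5, 30), "lat": 22.2, "lon": 128.2}
scan_late = {"time": datetime(2018, 7, 8, 5, 40), "lat": 22.2, "lon": 128.2}

match_ok = is_collocated(sonde, scan_ok)
match_late = is_collocated(sonde, scan_late)
```

With a 10 min scan interval, a 5 min time window means that each dropsonde can pair with at most the nearest scan in time, which, together with the clear-sky requirement, helps explain why so few matches survive.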

4. Comparisons of Retrievals between RF-Based and 1DVAR Algorithms

The LPW retrievals from the RF-based and 1DVAR algorithms at different layers at 00 UTC on 5 July 2018 are compared in Figure 6, which shows the WV differences between the retrievals and the ERA5. Although both algorithms are applied to the same AHI measurements at pixel level under clear-sky conditions, the 1DVAR algorithm might fail to converge during iteration and outputs invalid results for these nonconvergent pixels [46]; therefore, the number of retrievals from the 1DVAR algorithm is distinctly smaller. In addition, the LPW retrievals from the RF-based algorithm are closer to the ERA5 than those from the 1DVAR. In particular, the overestimation of moisture is more noticeable in the 1DVAR algorithm, while it is visibly alleviated in the RF-based algorithm at all layers in this case.
In addition, comparisons between the retrievals and the ERA5 at grid level are analyzed for different cloud scenarios. With the help of the AHI cloud mask (2 km), the cloud coverage of each ERA5 grid (25 km) is determined, which makes it possible to compare the accuracies of the two sets of retrievals under different cloud coverages. In Table 6, four scenes, namely clear, partly cloudy, cloudy, and overcast, with 0–10%, 10–30%, 30–70%, and 70–100% cloud coverage, respectively, are considered in the comparisons. Using RMSE, MAPE, and MAE as indicators, the retrievals from the RF-based algorithm show improvements of 35–45% over the 1DVAR for all four cloud scenes. The performance of the 1DVAR algorithm is relatively weaker in the cloudy scenarios, while the accuracy of the RF-based algorithm has almost no scene dependency. Besides the retrieval accuracies, the computational efficiencies of the RF-based and 1DVAR algorithms are compared in this case study. The total time for retrieving one full disk of Himawari-8 data at the 10 min and 2 km resolutions using the RF-based algorithm is about 7 min, which is much faster than the 5.5 h required by the traditional 1DVAR algorithm (roughly a 47-fold speed-up).
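The scene classification can be sketched in Python as the fraction of cloudy 2 km pixels inside one ERA5 grid cell. The text does not say which scene a value exactly on a boundary belongs to, so the upper-inclusive thresholds below are an assumption, as are the function name and the 12 × 12 pixel cell.

```python
def cloud_scene(cloud_flags):
    """Classify one ERA5 grid cell from its cloud-mask pixels.

    cloud_flags : iterable of booleans, True where a 2 km pixel is cloudy
    Returns (coverage in percent, scene label). Upper bounds are treated
    as inclusive, which is an assumption.
    """
    flags = list(cloud_flags)
    coverage = 100.0 * sum(flags) / len(flags)
    if coverage <= 10.0:
        scene = "clear"
    elif coverage <= 30.0:
        scene = "partly cloudy"
    elif coverage <= 70.0:
        scene = "cloudy"
    else:
        scene = "overcast"
    return coverage, scene


# Hypothetical cell of 12 x 12 pixels (about 25 km across) with 36 cloudy pixels.
cell = [True] * 36 + [False] * 108
coverage, scene = cloud_scene(cell)
```

Grouping the grid-level errors by this label gives the kind of per-scene comparison reported in Table 6.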

5. Summary and Discussion

The advanced IR imagers onboard the new generation of geostationary satellites can provide multilayer moisture information at unprecedented spatial and temporal resolutions. This information is irreplaceable for understanding the processes and mechanisms of rapidly developing convective systems (e.g., TCs), and these measurements are also very useful for improving the simulation and prediction of TCs. However, the operational physical retrieval algorithm (1DVAR) is run at reduced spatial resolution because its computational efficiency is too limited to process the large data volume at pixel level. In this study, a typical cost-efficient machine learning algorithm, the RF-based algorithm, is developed and compared with the 1DVAR algorithm using measurements of the AHI onboard the H8 satellite during typhoon Maria (201808). It is found that:
  • The GFS forecasts and the AHI measurements are both necessary for moisture retrievals, and each adds information that the other lacks;
  • The accuracy of atmospheric water vapor retrievals using the RF-based algorithm with a representative training dataset is higher than that of the 1DVAR algorithm (e.g., a 35–45% improvement in this case);
  • The retrievals can be conducted operationally at full resolution with high computational efficiency when a machine learning-based algorithm is used instead of the 1DVAR, which makes real-time or near real-time quantitative applications of high spatio-temporal resolution satellite measurements possible.
It should be noted that building a representative training dataset is very important for applications of the RF model. Because limited computer resources constrain the processing of the large data volume (every 10 min with 2 km spatial resolution at nadir) for training, the RF model in this study is trained with only a limited set of independent data from this case. It is therefore appropriate for water vapor retrieval in this typhoon case, but its accuracy might be reduced when it is applied to another typhoon case. However, the main objective of this case study was to demonstrate our hypothesis: if the machine learning model is well trained, it can extract atmospheric water vapor content efficiently and effectively from high spatio-temporal resolution satellite observations for real-time or near real-time applications, with the advantage of processing large data volumes at low latency compared with the physically based 1DVAR methodology. For real-time or near real-time applications, a representative training dataset can be obtained either by routinely updating the training dataset or by building one that covers seasonal, regional, and diurnal variations. Compared with the 1DVAR algorithm, when the same input data are used, the advantages of the machine learning approach for atmospheric profile retrieval include high computational efficiency, stable retrievals (e.g., less scene dependence), no reliance on a radiative transfer model, and lower sensitivity to the calibration bias of the observations. The approach is very useful for near real-time applications of high-resolution (spatial, temporal, and spectral) satellite measurements with large data volumes, such as geostationary imager and sounder measurements [51]; such applications can be realized by training the model offline and applying it online.
In addition, multiple sources of data can easily be used together in a machine learning retrieval, provided they are temporally and spatially matched. Despite these advantages, machine learning should be used with caution for sounding retrieval, since it relies heavily on the representativeness of the training dataset, as mentioned above; in addition, the RF method lacks physical interpretability. Building a physically driven RF model is our future work, which includes, but is not limited to, understanding the sensitivity of each channel in the retrieval under various atmospheric conditions, using both temporal and spatial information in the RF-based retrieval model, and assessing the relative importance of other sources of measurements (if available) in RF-based atmospheric water vapor retrieval.
Since the retrieval accuracy and stability of the RF method depend strongly on the representativeness of the training dataset, the key to building a trustworthy model is a training dataset that covers all weather conditions and all seasons. Building such a training dataset should be the focus of future work on using machine learning for profile retrieval. The label data are also very important. The ERA5 provides good moisture label data, but its spatial and temporal resolutions are still not well matched to those of geostationary measurements; using a high-resolution regional reanalysis as label data should be considered in future work. Because this is a single case study, the evaluation numbers might be less representative, but that does not change the general conclusions and findings, for example, the first and third findings listed above.

Author Contributions

Conceptualization, D.D.; methodology, D.D., L.Z. and R.Z.; software, L.Z. and R.Z.; validation, W.B.; formal analysis, D.D., L.Z. and R.Z.; investigation, L.Z. and R.Z.; resources, L.Z. and Z.L.; data curation, L.Z. and Z.L.; writing—original draft preparation, D.D. and L.Z.; writing—review and editing, L.Z., R.Z. and W.B.; visualization, L.Z. and R.Z.; supervision, D.D.; project administration, D.D.; funding acquisition, D.D. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China, grant number 42105126 and National Natural Science Foundation of Jiangsu Province, grant number BK20210662.

Data Availability Statement

Not applicable.

Acknowledgments

All the data used in the study, including the AHI measurements, GFS forecasts from NCEP, ERA5 data, and dropsonde observations, are publicly available. The AHI measurements can be downloaded from the JAXA website (http://www.eorc.jaxa.jp/ptree/index.html), accessed from 4 July 2018 to 11 July 2018. The GFS forecasts can be downloaded from https://www.nco.ncep.noaa.gov/pmb/products/gfs/, accessed from 4 July 2018 to 11 July 2018. The ERA5 reanalysis data can be downloaded from (https://cds.climate.copernicus.eu/cdsapp#!/search?type=dataset), accessed from 4 July 2018 to 11 July 2018. The dropsonde data used to assess the retrievals can be downloaded from (https://doi.org/10.5281/zenodo.4671631), accessed on 9 July 2018.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Emanuel, K. Increasing destructiveness of tropical cyclones over the past 30 years. Nature 2005, 436, 686–688.
  2. Elsner, J.B.; Kossin, J.P.; Jagger, T.H. The increasing intensity of the strongest tropical cyclones. Nature 2008, 455, 92–95.
  3. Zhao, J.; Zhan, R.; Wang, Y. Different Responses of Tropical Cyclone Tracks Over the Western North Pacific and North Atlantic to Two Distinct Sea Surface Temperature Warming Patterns. Geophys. Res. Lett. 2020, 47, e2019GL086923.
  4. Liu, H.; Tang, W.; Zhao, L. Analysis of helicity and moisture during a rainstorm effected by 0808 typhoon. Sci. Meteorol. Sin. 2010, 30, 344–350.
  5. Chen, X.; Zhang, J.; Li, C.; Cao, L. Study on the low-frequency environmental influences on the rapid intensification of typhoon “Lekima”. J. Meteorol. Sci. 2020, 40, 114–122.
  6. Lyu, Q.; Zhang, Y.; Jiang, H.; Zheng, H.; Chen, D.; Hu, Y. Precipitation evolution during the rapid intensification event of typhoon “Lekima” (1909). J. Meteorol. Sci. 2020, 40, 136–142.
  7. Li, Y.; Zhang, Z.; Gao, G.; Cong, C.; Li, R. Impacts of typhoon circulation on the water vapor transportation of remote precipitation. J. Mar. Meteorol. 2017, 37, 111–117.
  8. Bueh, C.; Zhuge, A.; Xie, Z.; Gao, Z.; Lin, D. Water Vapor Transportation Features and Key Synoptic-scale Systems of the 7.20 Rainstorm in Henan Province in 2021. Chin. J. Atmos. Sci. 2022, 46, 725–744.
  9. Li, Y.; Chen, L.; Wang, J. The Diagnostic analysis on the characteristics of large scale circulation corresponding to the sustaining and decaying of tropical cyclone after it’s landfall. Acta Meteorol. Sin. 2004, 62, 167–179.
  10. Dai, Z.; Wu, H.; Jiang, Y.; Xia, M.; Zhu, X.; Qing, T. Characteristics of water vapor transport in precipitation difference between two landing typhoons of “Jelawat” and “Haiku”. J. Meteorol. Environ. 2018, 34, 16–24.
  11. Chen, Z.; Chen, L.; Li, Y. Diagnostic analysis of large-scale circulation features associated with strong and weak landfalling typhoon precipitation events. Acta Meteorol. Sin. 2009, 67, 840–850.
  12. Dee, D.P.; Uppala, S.M.; Simmons, A.J.; Berrisford, P.; Poli, P.; Kobayashi, S.; Andrae, U.; Balmaseda, M.A.; Balsamo, G.; Bauer, P.; et al. The ERA-Interim reanalysis: Configuration and performance of the data assimilation system. Q. J. R. Meteorol. Soc. 2011, 137, 553–597.
  13. Chen, H.; Mao, Z.; Chen, J. A comparative analysis of climatological characteristics of landing typhoons with and without atmospheric river in recent 30 years in China. Acta Meteorol. Sin. 2020, 78, 745–760.
  14. Fujiwara, M.; Polavarapu, S.; Jackson, D. A proposal of the SPARC Reanalysis/Analysis Intercomparison Project. SPARC Newsl. 2012, 38, 14–17.
  15. Fujiwara, M.; Wright, J.S.; Manney, G.L.; Gray, L.J.; Anstey, J.; Birner, T.; Davis, S.; Gerber, E.P.; Harvey, V.L.; Hegglin, M.I.; et al. Introduction to the SPARC Reanalysis Intercomparison Project (S-RIP) and overview of the reanalysis systems. Atmos. Chem. Phys. 2017, 17, 1417–1452.
  16. Hersbach, H.; Bell, B.; Berrisford, P.; Hirahara, S.; Horanyi, A.; Munoz-Sabater, J.; Nicolas, J.; Peubey, C.; Radu, R.; Schepers, D.; et al. The ERA5 global reanalysis. Q. J. R. Meteorol. Soc. 2020, 146, 1999–2049.
  17. Bessho, K.; Date, K.; Hayashi, M.; Ikeda, A.; Imai, T.; Inoue, H.; Kumagai, Y.; Miyakawa, T.; Murata, H.; Ohno, T.; et al. An Introduction to Himawari-8/9-Japan’s New-Generation Geostationary Meteorological Satellites. J. Meteorol. Soc. Jpn. 2016, 94, 151–183.
  18. Hu, X.; Huang, Y.; Lu, Q.; Zheng, J. Inversion of water vapor total using FY-3A near-infrared data. J. Appl. Meteorol. 2011, 22, 46–56.
  19. Schmit, T.J.; Li, J.; Lee, S.J.; Li, Z.; Dworak, R.; Lee, Y.-K.; Bowlan, M.; Gerth, J.; Martin, G.D.; Straka, W.; et al. Legacy Atmospheric Profiles and Derived Products From GOES-16: Validation and Applications. Earth Space Sci. 2019, 6, 1730–1748.
  20. Yin, R.; Han, W.; Gao, Z.; Li, J. Impact of High Temporal Resolution FY-4A Geostationary Interferometric Infrared Sounder (GIIRS) Radiance Measurements on Typhoon Forecasts: Maria (2018) Case with GRAPES Global 4D-Var Assimilation System. Geophys. Res. Lett. 2021, 48, e2021GL093672.
  21. Schmit, T.J.; Goodman, S.J.; Lindsey, D.T.; Rabin, R.M.; Bedka, K.M.; Gunshor, M.M.; Cintineo, J.L.; Velden, C.S.; Bachmeier, A.S.; Lindstrom, S.S.; et al. Geostationary Operational Environmental Satellite (GOES)-14 super rapid scan operations to prepare for GOES-R. J. Appl. Remote Sens. 2013, 7, 073462.
  22. Schmit, T.J.; Goodman, S.J.; Gunshor, M.M.; Sieglaff, J.; Heidinger, A.K.; Bachmeier, A.S.; Lindstrom, S.S.; Terborg, A.; Feltz, J.; Bah, K.; et al. Rapid Refresh Information of Significant Events: Preparing Users for the Next Generation of Geostationary Operational Satellites. Bull. Am. Meteorol. Soc. 2015, 96, 561–576.
  23. Lee, Y.-K.; Li, J.; Li, Z.; Schmit, T. Atmospheric temporal variations in the pre-landfall environment of typhoon Nangka (2015) observed by the Himawari-8 AHI. Asia-Pac. J. Atmos. Sci. 2017, 53, 431–443.
  24. Di, D.; Li, J.; Li, Z.; Li, J.; Schmit, T.J.; Menzel, W.P. Can Current Hyperspectral Infrared Sounders Capture the Small Scale Atmospheric Water Vapor Spatial Variations? Geophys. Res. Lett. 2021, 48, e2021GL095825.
  25. Xue, Y.; Li, J.; Li, Z.; Lu, R.; Gunshor, M.M.; Moeller, S.L.; Di, D.; Schmit, T.J. Assessment of Upper Tropospheric Water Vapor Monthly Variation in Reanalyses with Near-Global Homogenized 6.5-μm Radiances from Geostationary Satellites. J. Geophys. Res. Atmos. 2020, 125, e2020JD032695.
  26. Jiang, X.; Li, J.; Li, Z.; Xue, Y.; Di, D.; Wang, P.; Li, J. Evaluation of Environmental Moisture from NWP Models with Measurements from Advanced Geostationary Satellite Imager-A Case Study. Remote Sens. 2020, 12, 670.
  27. Schmit, T.J.; Li, J.; Gurka, J.J.; Goldberg, M.D.; Schrab, K.J.; Li, J.; Feltz, W.F. The GOES-R Advanced Baseline Imager and the Continuation of Current Sounder Products. J. Appl. Meteorol. Climatol. 2008, 47, 2696–2711.
  28. Lee, Y.-K.; Li, Z.; Li, J.; Schmit, T.J. Evaluation of the GOES-R ABI LAP Retrieval Algorithm Using the GOES-13 Sounder. J. Atmos. Ocean. Technol. 2014, 31, 3–19.
  29. Zhang, K.; Wu, C.; Li, J. Retrieval of Atmospheric Temperature and Moisture Vertical Profiles from Satellite Advanced Infrared Sounder Radiances with a New Regularization Parameter Selecting Method. J. Meteorol. Res. 2016, 30, 356–370.
  30. Lee, Y.; Han, D.; Ahn, M.-H.; Im, J.; Lee, S.J. Retrieval of Total Precipitable Water from Himawari-8 AHI Data: A Comparison of Random Forest, Extreme Gradient Boosting, and Deep Neural Network. Remote Sens. 2019, 11, 1741.
  31. Huang, Q.; Liang, W.L.; Huang, R. Comparative analysis of wind profile radar products and ERA5 reanalysis data. J. Meteorol. Res. Appl. 2021, 42, 83–88.
  32. Murata, H.; Takahashi, M.; Kosaka, Y. VIS and IR bands of Himawari-8/AHI compatible with those of MTSAT-2/Imager. MSC Tech. Note 2015, 60, 1–18.
  33. Li, J.; Schmidt, C.C.; Nelson, J.P. Estimation of Total Atmospheric Ozone from GOES Sounder Radiances with High Temporal Resolution. J. Atmos. Ocean. Technol. 2001, 18, 157–168.
  34. Min, M.; Wu, C.; Li, C.; Liu, H.; Xu, N.; Wu, X.; Chen, L.; Wang, F.; Sun, F.; Qin, D.; et al. Developing the Science Product Algorithm Testbed for Chinese Next-Generation Geostationary Meteorological Satellites: Fengyun-4 Series. J. Meteorol. Res. 2017, 31, 708–719.
  35. Wang, X.; Min, M.; Wang, F.; Guo, J.; Li, B.; Tang, S. Intercomparisons of Cloud Mask Products Among Fengyun-4A, Himawari-8, and MODIS. IEEE Trans. Geosci. Remote Sens. 2019, 57, 8827–8839.
  36. Min, M.; Li, J.; Wang, F.; Liu, Z.; Menzel, W.P. Retrieval of cloud top properties from advanced geostationary satellite imager measurements based on machine learning algorithms. Remote Sens. Environ. 2020, 239, 111616.
  37. Zhang, Y.; Li, J.; Li, Z.; Zheng, J.; Wu, D.; Zhao, H. FENGYUN-4A Advanced Geosynchronous Radiation Imager Layered Precipitable Water Vapor Products’ Comprehensive Evaluation Based on Quality Control System. Atmosphere 2022, 13, 290.
  38. Davis, S.M.; Hegglin, M.I.; Fujiwara, M.; Dragani, R.; Harada, Y.; Kobayashi, C.; Long, C.; Manney, G.L.; Nash, E.R.; Potter, G.L.; et al. Assessment of upper tropospheric and stratospheric water vapor and ozone in reanalyses as part of S-RIP. Atmos. Chem. Phys. 2017, 17, 12743–12778.
  39. Liu, C.-Y.; Chiu, C.-H.; Lin, P.-H.; Min, M. Comparison of Cloud-Top Property Retrievals from Advanced Himawari Imager, MODIS, CloudSat/CPR, CALIPSO/CALIOP, and Radiosonde. J. Geophys. Res. Atmos. 2020, 125, e2020JD032683.
  40. Heng, J.; Yang, S.; Gong, Y.; Gu, J.; Liu, H. Characteristics of the convective bursts and their relationship with the rapid intensification of Super Typhoon Maria (2018). Atmos. Ocean. Sci. Lett. 2020, 13, 146–154.
  41. Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32.
  42. Turini, N.; Thies, B.; Bendix, J. Estimating High Spatio-Temporal Resolution Rainfall from MSG1 and GPM IMERG Based on Machine Learning: Case Study of Iran. Remote Sens. 2019, 11, 2307.
  43. Kuehnlein, M.; Appelhans, T.; Thies, B.; Nauss, T. Precipitation Estimates from MSG SEVIRI Daytime, Nighttime, and Twilight Data with Random Forests. J. Appl. Meteorol. Climatol. 2014, 53, 2457–2480.
  44. Zhang, H.; Tang, B. Remote sensing retrieval of total precipitable water under clear-sky atmosphere from FY-4A AGRI data by combining physical mechanism and random forest algorithm. J. Remote Sens. 2021, 25, 1836–1847.
  45. Li, Z.; Li, J.; Menzel, W.P.; Schmit, T.J.; Nelson, J.P., III; Daniels, J.; Ackerman, S.A. GOES sounding improvement and applications to severe storm nowcasting. Geophys. Res. Lett. 2008, 35. Available online: https://www.webofscience.com/wos/woscc/full-record/WOS:000253063600007 (accessed on 18 December 2022).
  46. Li, J.; Schmit, T.J.; Jin, X.; Martin, G. GOES-R Advanced Baseline Imager (ABI) Algorithm Theoretical Basis Document for Legacy Atmospheric Moisture Profile, Legacy Atmospheric Temperature Profile, Total Precipitable Water, and Derived Atmospheric Stability Indices; NOAA NESDIS Center for Satellite Applications and Research, 2010; Volume 106. Available online: http://www.goes-r.gov/products/ATBDs/baseline/Sounding_LAP_v2.0_no_color.pdf (accessed on 18 December 2022).
  47. Hannon, S.; Strow, L.L.; McMillan, W.W. Atmospheric infrared fast transmittance models: A comparison of two approaches. In Proceedings of the SPIE’s 1996 International Symposium on Optical Science, Engineering, and Instrumentation, Denver, CO, USA, 31 October 1996.
  48. Rodgers, C.D. Retrieval of atmospheric temperature and composition from remote measurements of thermal radiation. Rev. Geophys. 1976, 14, 609–624.
  49. Li, J.; Nelson, J.P., III; Schmidt, C.C.; Schmit, T.J.; Menzel, W.P.; Seemann, S.; Eva, B. An approach to improvement temperature and moisture retrievals from the GOES Sounder measurements. In Proceedings of the 13th Conference on Satellite Meteorology and Oceanography, Norfolk, VA, USA, 19 September 2004.
  50. Li, J.; Wolf, W.W.; Menzel, W.P.; Zhang, W.J.; Huang, H.L.; Achtor, T.H. Global soundings of the atmosphere from ATOVS measurements: The algorithm and validation. J. Appl. Meteorol. 2000, 39, 1248–1268.
  51. Li, J.; Menzel, W.P.; Schmit, T.J.; Schmetz, J. Applications of geostationary hyperspectral infrared sounder observations—Progress, challenges, and future perspectives. Bull. Am. Meteorol. Soc. 2022, 103, E2733–E2755.
Figure 1. (a,c) RMSE and (b,d) MAPE evaluation results for the retrieved VM at different pressure levels and LPW at different atmospheric layers, respectively. (e) Importance ranking of the input variables in the RF-based algorithm.
Figure 2. The probability density of the VM derived from the GFS forecasts (a,c,e) and the retrievals from the RF-based algorithm (b,d,f) against the ERA5 at 300 hPa (upper), 700 hPa (middle), and 1000 hPa (bottom), respectively.
Figure 3. The brightness temperature patterns of (a) band 9 centered at 6.9 μm, (b) band 10 centered at 7.3 μm, and (c) band 14 centered at 11.2 μm at 00 UTC 5 July 2018, and the VM at 300 hPa, 700 hPa, and 1000 hPa, respectively, from (d,g,j) the GFS forecasts, (e,h,k) the RF-based algorithm, and (f,i,l) the ERA5.
Figure 4. Differences between the ERA5 and the retrievals/GFS of LPWs for (a,b) upper, (e,f) middle and (i,j) low layers, the retrievals/GFS of VM at (c,d) 300 hPa, (g,h) 700 hPa and (k,l) 1000 hPa, respectively. The retrievals are from the RF-based algorithm.
Figure 5. The multilayer VM and LPW retrievals or observations at 300 hPa to 1000 hPa from the RF-based algorithm (green square), the dropsonde (blue circle), the ERA5 (red triangle), and the 1DVAR algorithm (yellow triangle).
Figure 6. Differences between the ERA5 and the retrievals of water vapor at the upper (a,b), middle (c,d), and lower (e,f) layers by the 1DVAR (left) and the RF-based (right) algorithms, respectively. (Number denotes the number of retrievals from the 1DVAR or RF-based algorithm.)
Table 1. Overview of the AHI infrared bands.
| Band Number | Central Wavelength (μm) | Resolution at Nadir (km) |
|---|---|---|
| 8 | 6.2 | 2.0 |
| 9 | 6.9 | 2.0 |
| 10 | 7.3 | 2.0 |
| 11 | 8.6 | 2.0 |
| 12 | 9.6 | 2.0 |
| 13 | 10.4 | 2.0 |
| 14 | 11.2 | 2.0 |
| 15 | 12.3 | 2.0 |
| 16 | 13.3 | 2.0 |
Table 2. Input and output variables of the RF-based algorithm.
| Role | Source | Variable | Information Provided by the Variable |
|---|---|---|---|
| Input | GFS | PRMSL_meansealevel (hPa) | Minimum sea level pressure |
| Input | GFS | TMP_surface (K) | Sea surface temperature |
| Input | GFS | VM (1000 hPa/900 hPa/850 hPa/800 hPa/700 hPa/500 hPa/300 hPa) (g/kg) | Water vapor specific humidity at specific atmospheric pressure levels |
| Input | GFS | LPW (UP/MID/LOW) (mm) | Layered precipitable water |
| Input | GFS | TPW (mm) | Total precipitable water |
| Input | AHI | IRX0620 (K) | Brightness temperature of the AHI band 08, which provides upper tropospheric moisture information |
| Input | AHI | IRX0700 (K) | Brightness temperature of the AHI band 09, which provides middle to upper tropospheric moisture information |
| Input | AHI | IRX0730 (K) | Brightness temperature of the AHI band 10, which provides low to middle tropospheric moisture information |
| Input | AHI | IRX0860 (K) | Brightness temperature of the AHI band 11 |
| Input | AHI | IRX0960 (K) | Brightness temperature of the AHI band 12 |
| Input | AHI | IRX1040 (K) | Brightness temperature of the AHI band 13, which provides SST information |
| Input | AHI | IRX1120 (K) | Brightness temperature of the AHI band 14, which provides SST information |
| Input | AHI | IRX1230 (K) | Brightness temperature of the AHI band 15, which provides boundary layer moisture information |
| Input | AHI | IRX1330 (K) | Brightness temperature of the AHI band 16, which provides low level atmospheric temperature information |
| Input | AHI | IRX0620-IRX1120 (K) | Brightness temperature difference |
| Input | AHI | IRX0700-IRX1120 (K) | Brightness temperature difference |
| Input | AHI | IRX0730-IRX1120 (K) | Brightness temperature difference |
| Input | AHI | IRX1230-IRX1120 (K) | Brightness temperature difference |
| Output | ERA5 | MSL (hPa) | Minimum sea level pressure |
| Output | ERA5 | SST (K) | Sea surface temperature |
| Output | ERA5 | VM (1000 hPa/900 hPa/850 hPa/800 hPa/700 hPa/500 hPa/300 hPa) (g/kg) | Atmospheric water vapor mixing ratio at specific pressure levels |
| Output | ERA5 | LPW (UP/MID/LOW) (mm) | Layered precipitable water |
| Output | ERA5 | TPW (mm) | Total precipitable water |
Table 3. Hyperparameters used in the RF-based algorithm.
| Hyperparameter | Definition | Value |
|---|---|---|
| 1. n_estimators | the number of trees | 300 |
| 2. max_features | the maximum number of predictors | “auto” |
| 3. max_depth | the maximum depth of the tree | 30 |
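The parameter names in Table 3 coincide with those of scikit-learn's `RandomForestRegressor`, so the configuration could be written as in the sketch below. That the authors used scikit-learn is an assumption, and `RF_PARAMS` and `build_rf_model` are hypothetical names. One practical caveat: the `"auto"` option of `max_features` was removed in scikit-learn 1.3; for regressors it meant "use all features", which newer versions express as `1.0`.

```python
# Hyperparameters from Table 3. "auto" is written as 1.0 (all features),
# its meaning for random forest regressors in older scikit-learn releases.
RF_PARAMS = {
    "n_estimators": 300,   # number of trees
    "max_features": 1.0,   # fraction of predictors considered at each split
    "max_depth": 30,       # maximum depth of each tree
}


def build_rf_model(params=RF_PARAMS):
    """Return an untrained random forest regressor.

    Assumes scikit-learn is installed; it is imported lazily so that the
    configuration above can be inspected without it.
    """
    from sklearn.ensemble import RandomForestRegressor
    return RandomForestRegressor(**params)
```

The model returned by `build_rf_model()` would then be fitted offline on matched GFS/AHI inputs and ERA5 labels and applied online to new observations.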
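Table 4. Algorithm evaluation indicators.
| Evaluation Indicator | Calculation Formula | Range of Values | Optimum Value |
|---|---|---|---|
| RMSE | $\sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i^{*}-y_i\right)^{2}}$ | $[0, +\infty)$ | 0 |
| R | $\frac{\mathrm{Cov}\left(y_i, y_i^{*}\right)}{\sqrt{\mathrm{Var}\left(y_i\right)\,\mathrm{Var}\left(y_i^{*}\right)}}$ | $[-1, 1]$ | 1 |
| MAPE | $\frac{100\%}{n}\sum_{i=1}^{n}\left\lvert\frac{y_i^{*}-y_i}{y_i}\right\rvert$ | $[0, +\infty)$ | 0 |
| MAE | $\frac{1}{n}\sum_{i=1}^{n}\left\lvert y_i^{*}-y_i\right\rvert$ | $[0, +\infty)$ | 0 |
Note: $y_i^{*}$ denotes the retrievals and $y_i$ denotes the true values; $\mathrm{Cov}(y_i, y_i^{*})$ is the covariance of $y_i$ and $y_i^{*}$; $\mathrm{Var}(y_i)$ and $\mathrm{Var}(y_i^{*})$ are the variances of $y_i$ and $y_i^{*}$, respectively.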
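The four indicators of Table 4 can be computed with the Python standard library alone. The sketch below is illustrative and not the authors' code; the function names are hypothetical, and the inputs are assumed to be equal-length sequences of retrievals and true values, with no zero true values where MAPE is used.

```python
import math


def rmse(retrieved, truth):
    """Root mean square error between retrievals and true values."""
    n = len(truth)
    return math.sqrt(sum((r - t) ** 2 for r, t in zip(retrieved, truth)) / n)


def mae(retrieved, truth):
    """Mean absolute error."""
    return sum(abs(r - t) for r, t in zip(retrieved, truth)) / len(truth)


def mape(retrieved, truth):
    """Mean absolute percentage error in percent (true values must be nonzero)."""
    n = len(truth)
    return 100.0 / n * sum(abs((r - t) / t) for r, t in zip(retrieved, truth))


def pearson_r(retrieved, truth):
    """Correlation coefficient R = Cov / sqrt(Var * Var)."""
    n = len(truth)
    mean_r = sum(retrieved) / n
    mean_t = sum(truth) / n
    cov = sum((r - mean_r) * (t - mean_t) for r, t in zip(retrieved, truth))
    var_r = sum((r - mean_r) ** 2 for r in retrieved)
    var_t = sum((t - mean_t) ** 2 for t in truth)
    # The 1/n factors of covariance and variances cancel in the ratio.
    return cov / math.sqrt(var_r * var_t)


# Toy values, purely illustrative (not data from the study).
truth = [1.0, 2.0, 3.0]
retrieved = [1.0, 2.0, 4.0]
scores = {
    "RMSE": rmse(retrieved, truth),
    "MAE": mae(retrieved, truth),
    "MAPE": mape(retrieved, truth),
    "R": pearson_r(retrieved, truth),
}
```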
Table 5. Accuracy improvements of the retrievals from the RF-based algorithm using AHI+GFS as inputs over the GFS forecasts.
| Variable | RMSE (GFS) | RMSE (GFS+AHI) | RMSE Improvement (%) | MAPE (GFS) | MAPE (GFS+AHI) | MAPE Improvement (%) | MAE (GFS) | MAE (GFS+AHI) | MAE Improvement (%) | R (GFS) | R (GFS+AHI) | R Improvement (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| MSL | 4.32 | 0.95 | 78.01 | 0.29 | 0.06 | 79.31 | 2.89 | 0.65 | 77.51 | 0.79 | 0.98 | 24.05 |
| SST | 1.21 | 0.45 | 62.81 | 0.27 | 0.10 | 62.96 | 0.81 | 0.29 | 64.20 | 0.98 | 0.99 | 1.02 |
| VM_1000 hPa | 1.58 | 0.57 | 63.92 | 7.87 | 2.63 | 66.58 | 1.14 | 0.39 | 65.79 | 0.87 | 0.98 | 12.64 |
| VM_900 hPa | 2.63 | 1.03 | 60.84 | 19.37 | 7.44 | 61.59 | 1.92 | 0.73 | 61.98 | 0.59 | 0.93 | 57.63 |
| VM_850 hPa | 2.82 | 1.02 | 63.83 | 25.25 | 9.01 | 64.32 | 2.11 | 0.72 | 65.88 | 0.53 | 0.93 | 75.47 |
| VM_800 hPa | 2.78 | 1.08 | 61.15 | 28.97 | 11.30 | 60.99 | 2.20 | 0.78 | 64.55 | 0.51 | 0.91 | 78.43 |
| VM_700 hPa | 2.50 | 1.08 | 56.80 | 40.62 | 18.24 | 55.10 | 1.98 | 0.79 | 60.10 | 0.45 | 0.89 | 97.78 |
| VM_500 hPa | 1.58 | 0.70 | 55.70 | 69.83 | 33.30 | 52.32 | 1.24 | 0.52 | 58.06 | 0.40 | 0.89 | 122.5 |
| VM_300 hPa | 0.33 | 0.14 | 57.58 | 111.46 | 46.40 | 58.37 | 0.26 | 0.10 | 61.54 | 0.26 | 0.94 | 261.54 |
| TPW | 10.10 | 3.76 | 62.77 | 19.47 | 6.66 | 65.79 | 7.99 | 2.64 | 66.96 | 0.64 | 0.94 | 46.88 |
| LPW_LOW | 1.76 | 0.66 | 62.50 | 9.68 | 3.45 | 64.36 | 1.25 | 0.45 | 64.00 | 0.82 | 0.97 | 18.29 |
| LPW_MID | 4.96 | 1.89 | 61.90 | 24.35 | 8.97 | 63.16 | 3.86 | 1.35 | 65.03 | 0.56 | 0.93 | 66.07 |
| LPW_UP | 5.14 | 2.07 | 59.73 | 41.11 | 16.54 | 59.77 | 4.07 | 1.49 | 63.39 | 0.48 | 0.91 | 89.58 |
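The improvement columns of Table 5 appear to follow a simple relative-change definition. The sketch below reproduces the MSL row under that assumption; the formulas are inferred from the tabulated numbers rather than stated in the text, and the function names are hypothetical.

```python
def improvement_error(baseline, retrieval):
    """Relative reduction (%) of an error metric (RMSE, MAPE, MAE)."""
    return 100.0 * (baseline - retrieval) / baseline


def improvement_r(baseline, retrieval):
    """Relative increase (%) of the correlation coefficient R."""
    return 100.0 * (retrieval - baseline) / baseline


# MSL row of Table 5: GFS versus GFS+AHI.
msl_rmse_gain = round(improvement_error(4.32, 0.95), 2)
msl_r_gain = round(improvement_r(0.79, 0.98), 2)
```

With these definitions the MSL row gives 78.01% for RMSE and 24.05% for R, matching the table.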
Table 6. Retrieval accuracies of the 1DVAR and RF-based methods for different cloud scenarios.
| Scene (Cloud Coverage) | RMSE (mm), 1DVAR | RMSE (mm), RF | RMSE Improvement (%) | MAPE (%), 1DVAR | MAPE (%), RF | MAPE Improvement (%) | MAE (mm), 1DVAR | MAE (mm), RF | MAE Improvement (%) | R, 1DVAR | R, RF | R Improvement (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Clear (0–10%) | 1.17 | 0.70 | 40.17 | 6.24 | 3.44 | 44.87 | 0.92 | 0.49 | 46.74 | 0.71 | 0.86 | 17.44 |
| Partly Cloudy (10–30%) | 1.00 | 0.61 | 39.00 | 5.33 | 2.92 | 45.22 | 0.81 | 0.42 | 48.45 | 0.82 | 0.88 | 9.00 |
| Cloudy (30–70%) | 1.23 | 0.73 | 40.65 | 6.64 | 3.81 | 42.62 | 0.97 | 0.55 | 43.30 | 0.77 | 0.86 | 10.46 |
| Overcast (70–100%) | 1.11 | 0.71 | 36.04 | 6.18 | 3.53 | 42.88 | 0.90 | 0.50 | 44.44 | 0.72 | 0.85 | 15.29 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Zhu, L.; Zhou, R.; Di, D.; Bai, W.; Liu, Z. Retrieval of Atmospheric Water Vapor Content in the Environment from AHI/H8 Using Both Physical and Random Forest Methods—A Case Study for Typhoon Maria (201808). Remote Sens. 2023, 15, 498. https://doi.org/10.3390/rs15020498
