A Snowfall Detection Algorithm for Fengyun-3D Microwave Sounders with Differentiated Atmospheric Temperature Conditions

Ji, Qingwen; Ma, Ziqiang; Xu, Jintao; Yan, Songkun; Li, Xiaoqing

doi:10.3390/w15132315


by Qingwen Ji ¹, Ziqiang Ma ¹,*, Jintao Xu ¹, Songkun Yan ¹ and Xiaoqing Li ²,*

¹ Institute of Remote Sensing and Geographical Information System, School of Earth and Space Sciences, Peking University, Beijing 100871, China

² Innovation Center for Fengyun Meteorological Satellite, National Satellite Meteorological Center, China Meteorological Administration, Beijing 100081, China

* Authors to whom correspondence should be addressed.

Water 2023, 15(13), 2315; https://doi.org/10.3390/w15132315

Submission received: 30 April 2023 / Revised: 16 June 2023 / Accepted: 17 June 2023 / Published: 21 June 2023

(This article belongs to the Special Issue Causes and Reconstruction of Catastrophic Flash Flood Disasters: Investigation, Analysis, Modelling and Risk Management)


Abstract

Precipitation in different phases has varying effects on runoff. However, monitoring surface snowfall poses a significant challenge, highlighting the importance of developing a snowfall detection algorithm. The objective of this study is to develop a snowfall detection algorithm for the Microwave Temperature Sounder-2 (MWTS-II) and the Microwave Humidity Sounder-2 (MWHS-II) onboard the FY-3D satellite while considering differentiated atmospheric temperature conditions. The results show that: (1) The brightness temperature (TB) of MWTS Channel 3 is well-suited for pre-classifying atmospheric temperatures, and significant differences in TB distribution exist between the two pre-classification subsets. (2) Among the six machine learning classifiers examined, the random forest classifier exhibits favorable classification performance on both the validation set (accuracy: 0.76, recall: 0.76, F1 score: 0.75) and test set (accuracy: 0.80, recall: 0.44, F1 score: 0.44). (3) The application of the snowfall detection algorithm shows a reasonable spatial distribution and outperforms the IMERG and ERA5 snowfall data.

Keywords:

snowfall detection; Fengyun-3D; MWTS; MWHS

1. Introduction

The phase of precipitation has a significant impact on land hydrology, water and energy cycles, and water resource management. Rainfall contributes rapidly to runoff and can result in flash floods, while snowfall contributes to runoff with a delay and influences water yield [1]. Rainfall on snow can also increase the risk of floods [2,3]. Snowfall accumulates as a snowpack, which serves as an important natural reservoir and a crucial source of fresh water during the summer for many arid or semi-arid regions, such as the western United States. However, detecting and measuring snowfall using spaceborne observations poses considerable challenges in modern hydrometeorology [4].

Unlike rain particles, which are spherical and have a known density of 1 g/cm³, snow particles vary widely in shape, size, density, and structure [5]. Moreover, the heterogeneity of surface snow cover and weather systems in high-latitude regions further complicates the contribution of atmospheric snowfall and solid water particles to satellite-observed brightness temperature (TB). Additionally, light snowfall events account for the majority of all snowfall events and are more challenging to detect, placing higher demands on sensor sensitivity [6].

Compared with remote sensing sensors of visible and infrared frequencies, passive microwave radiometers have proven to be capable of snowfall detection due to their cloud-penetrating ability [7,8,9,10]. The scattering effect of solid/mixed-phase water particles leads to a decrease in TB in the high-frequency band [11]. However, the cloud microphysical processes associated with snowfall are complex, involving the generation, growth, collision, and melting of ice crystals and snowflakes. These processes depend on various atmospheric factors such as temperature, humidity, and pressure. Furthermore, the presence of supercooled cloud liquid water can cause TB to increase and obscure the TB depression signature resulting from ice scattering [12]. Therefore, empirical modeling methods are more commonly used to develop snowfall detection algorithms compared to physical modeling methods.

Machine learning classifiers have gained widespread popularity and are extensively employed in various classification problems. These algorithms possess the ability to model complex class signatures, can accommodate diverse input predictor data, and do not make assumptions about data distribution. Numerous studies have consistently demonstrated that machine learning algorithms tend to achieve higher accuracy compared to traditional parametric classifiers, especially when dealing with complex data characterized by a high-dimensional feature space, which entails numerous predictor variables [13,14,15]. These characteristics make machine learning classifiers highly promising for developing snowfall event detection algorithms with multi-channel TBs as the primary input.

Fengyun-3D (FY-3D) belongs to the second generation of China’s polar-orbiting meteorological satellites, equipped with advanced observation capabilities that provide important data support for weather forecasting and meteorological disaster detection on a global scale [16]. FY-3D carries advanced microwave sounders with comprehensive channel settings, enabling the observation of the vertical distribution of atmospheric temperature and humidity. These capabilities hold significant potential for snowfall detection [17]. The objective of this study is to develop a snowfall detection algorithm for the Microwave Temperature Sounder-2 (MWTS-II) and Microwave Humidity Sounder-2 (MWHS-II) onboard the FY-3D satellite. The paper is organized as follows. The FY-3D microwave sounding data, the gauge data, and the sample preparation are briefly introduced in Section 2. Section 3 describes the machine learning classifiers that were compared in the paper, as well as the evaluation methods employed to assess their performance. Section 4 analyzes TB distribution characteristics under snowfall and non-snowfall conditions, proposes a pre-classification strategy for atmospheric temperature, constructs a snowfall detection algorithm based on the optimal classifier after comparing various machine learning classifiers, and discusses the practical applications of this algorithm. The conclusions are presented in Section 5.

2. Materials

2.1. FY-3D Microwave Sounding Data

The MWTS-II onboard the FY-3D satellite is a passive cross-track scanning microwave sounder capable of temperature sounding under nearly all weather conditions. It has an observation angle of ±49.5°, a scan period of 8/3 s, a scan swath of 2250 km, and 13 channels at frequencies ranging from 50 to 60 GHz (Table 1). A total of 90 field-of-view (FOV) samples are taken along each scan line, and the spatial resolution is about 32 km near nadir. The instrument is mainly used to detect surface emissivity and the vertical distribution of atmospheric temperature [18,19].

The MWHS-II aboard the FY-3D satellite is used to detect both temperature and moisture under nearly all weather conditions with 15 channels on higher frequencies, including bands at 183 GHz (for humidity) and 118 GHz (Table 2) [18,20]. The MWHS is a cross-track scanning sounder with a scan angle of ±53.35° and a scan period of 2667 ms. During each scan, the instrument has a total of 98 FOVs and a scan swath of 2700 km. Its sounding capability covers the oxygen band at 118 GHz with a sub-satellite point resolution of 32 km, the water vapor band at 183 GHz with a 16-km resolution, and window parts of the spectrum at 89 and 150 GHz with a 32-km resolution.

2.2. NCEP ADP Gauge-Based Observations

The precipitation records utilized in this study were sourced from the National Centers for Environmental Prediction (NCEP) Automated Data Processing (ADP) Global Surface Observational Weather Data. These records served as the reference truth data for snowfall detection algorithms. The classification of the records into non-precipitation, rain, and snow categories was based on the World Meteorological Organization’s (WMO) present weather codes [21]. Specifically, records with codes 0 to 19 and 100 to 110 were classified as non-precipitation, while those with present weather codes 60–99 were classified as rainfall, and those with codes 70–79, 85, and 86 were classified as snowfall [22,23]. These records are collected from observations of more than 14,000 gauges over land. The NCEP ADP data were collected from the Global Telecommunication System (GTS) at intervals centered on the 3-hourly and 6-hourly analysis times of their global and regional models. Prior to usage, a quality control process was implemented to ensure the accuracy and consistency of the collected records across different sources. The dataset made available to the users encompasses all the quality-controlled observations. In addition, 2 m air temperature and 2 m dew point temperature in the NCEP dataset are set as input variables to supplement the surface environmental information.
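The code-to-category mapping described above can be sketched as a small lookup function. This is an illustrative sketch only: the function name is hypothetical, and because the quoted rainfall range (60–99) overlaps the snowfall codes, the sketch assumes that snowfall codes take precedence. Codes outside the ranges quoted in the text are left unclassified.

```python
from typing import Optional

# Snowfall codes as listed in the text: 70-79 plus codes 85 and 86.
SNOW_CODES = set(range(70, 80)) | {85, 86}


def classify_present_weather(code: int) -> Optional[str]:
    """Map a WMO present weather code to one of the three study categories.

    Assumption (not stated in the text): snowfall codes are checked first,
    because the quoted rainfall range 60-99 overlaps them.
    """
    if code in SNOW_CODES:
        return "snowfall"
    if 0 <= code <= 19 or 100 <= code <= 110:
        return "non-precipitation"
    if 60 <= code <= 99:
        return "rainfall"
    return None  # code not covered by the ranges quoted in the text
```

For example, `classify_present_weather(71)` yields `"snowfall"`, while `classify_present_weather(61)` yields `"rainfall"`.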

2.3. Data Matching between FY-3D Observations and NCEP Dataset

This study aimed to match one year of FY-3D observations with the NCEP dataset for the year 2019 over land to construct the training set. To achieve spatiotemporal matching between the NCEP dataset and the FY-3D MWTS and MWHS observations, the following strategies were employed. First, the MWTS and MWHS observations were aligned in time with the NCEP dataset by considering the observation time difference within half an hour. Second, for each NCEP record, the satellite observation with the smallest absolute difference in latitude and longitude was identified among the temporally matched observations. However, any matched samples with an absolute difference greater than 0.5° were removed from consideration. To illustrate the matching process using the MWHS observations, the average value of the sum of absolute spatial differences between the NCEP data and MWHS observations was found to be 0.134°. Additionally, 90% of the matched samples exhibited spatial differences within 0.2°. These spatial differences align well with the spatial resolution of the satellite observations.
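The two-step matching strategy can be illustrated with a short sketch. The names `Obs` and `match_gauge` are hypothetical, and two details are assumptions: the spatial distance is taken to be the sum of the absolute latitude and longitude differences, and the 0.5° cut-off is applied to that sum. Longitude wrap-around at ±180° is ignored for brevity.

```python
from datetime import datetime, timedelta
from typing import NamedTuple, Optional, Sequence


class Obs(NamedTuple):
    time: datetime
    lat: float
    lon: float


def match_gauge(gauge: Obs, fovs: Sequence[Obs],
                max_dt: timedelta = timedelta(minutes=30),
                max_deg: float = 0.5) -> Optional[int]:
    """Return the index of the best-matching satellite FOV, or None."""
    best_idx, best_dist = None, float("inf")
    for i, fov in enumerate(fovs):
        # Step 1: temporal filter (observation times within half an hour).
        if abs(fov.time - gauge.time) > max_dt:
            continue
        # Step 2: spatial distance, assumed to be |dlat| + |dlon| in degrees.
        dist = abs(fov.lat - gauge.lat) + abs(fov.lon - gauge.lon)
        if dist < best_dist:
            best_idx, best_dist = i, dist
    # Step 3: discard matches that are farther apart than the threshold.
    return best_idx if best_dist <= max_deg else None
```

A gauge record with no temporally matched observation, or whose nearest observation lies beyond the threshold, returns `None` and is dropped from the sample set.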

3. Methodology

3.1. Machine Learning Classifiers

This study compares the effectiveness of six different machine learning classifiers in developing algorithms for detecting snowfall events. To extend the binary classifiers for multiclass classification, the study adopts a “one-vs-rest” approach. For each of the three categories (rainfall, snowfall, and non-precipitation), a separate classifier is trained using a binary labeling scheme (yes/no). The classifiers used in the study are as follows:

(1): K-Nearest Neighbor (KNN): KNN is a simple supervised learning method that identifies the k nearest training samples based on a distance measure and makes predictions based on the information of these k “neighbors” [24].
(2): Logistic Regression: Logistic regression is a classical classification method that establishes a regression formula for determining the classification boundary based on available data [25].
(3): Decision Tree: The decision tree is a model for decision-making based on a tree structure that classifies the data set through multiple conditional discriminative processes [26].
(4): Random Forest: Random forest is an ensemble learning algorithm that combines multiple decision trees. It generates multiple decision trees from the given dataset, and each tree classifies the dataset once. The final classification result is determined by a majority vote among all the decision trees [27].
(5): Gradient Boosting: The gradient boosting classifier is an iterative algorithm that uses a decision tree as the base learner. It differs from the random forest classifier in the sampling method employed [28].
(6): Gaussian Naïve Bayes: The Gaussian Naïve Bayes classifier is a probabilistic approach based on Gaussian distribution. It assumes that each feature has independent ability to predict the output variable [29].
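The “one-vs-rest” labeling scheme can be sketched independently of any particular classifier. The helper names below are hypothetical, and the rule for combining the three binary outputs (taking the class with the highest "yes" score) is a common convention assumed here; the text does not state how the binary results were merged.

```python
from typing import Dict, List

CLASSES = ["rainfall", "snowfall", "non-precipitation"]


def binarize(labels: List[str], positive: str) -> List[int]:
    """Binary (yes/no) targets for one class against the other two."""
    return [1 if label == positive else 0 for label in labels]


def predict_one_vs_rest(scores: Dict[str, float]) -> str:
    """Pick the class whose binary classifier is most confident.

    `scores` maps each class to the "yes" score of its binary classifier;
    taking the maximum is an assumption, not the documented method.
    """
    return max(CLASSES, key=lambda name: scores[name])
```

Each of the three binary classifiers would be trained on the output of `binarize` for its own class, using the same input features.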

3.2. The Construction of Snowfall Detection Algorithm

The construction of the snowfall detection algorithm involves three main steps:

Initialization of Training Data: The training data is initialized based on spatiotemporally matched samples of FY-3D observations and the NCEP dataset. Each sample consists of a total of 30 input variables, including TBs from 13 channels of the MWTS, TBs of 15 channels from the MWHS, surface 2 m air temperature, and 2 m dew point temperature from the NCEP dataset. The matching process results in a total of 420,876 matches.

Pre-filtering and Pre-classification of Training Data: The training data is pre-filtered and pre-classified based on the distribution of key input variable values under three categories of weather conditions. The pre-filtering process follows two principles: (1) input samples with 2 m dew point temperature higher than 2 °C are labeled as non-snowfall samples, and (2) input samples with TBs from MWTS Channel 3 higher than 260 K are labeled as non-snowfall samples. The filtered training data is then pre-classified into a cold subset (TB < 247 K) and a warm subset (TB > 247 K) based on the TBs from MWTS Channel 3. After pre-filtering, 130,405 matching samples satisfying the conditions remain, with 51,914 samples in the cold subset and 78,491 samples in the warm subset.
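The pre-filtering and pre-classification rules reduce to three threshold comparisons, sketched below. The function and constant names are hypothetical. Placing a TB of exactly 247 K in the warm subset is an assumption, since the text leaves that boundary case open.

```python
from typing import Optional

DEW_POINT_MAX_C = 2.0    # pre-filter threshold on 2 m dew point (deg C)
TB_CH3_MAX_K = 260.0     # pre-filter threshold on MWTS Channel 3 TB (K)
TB_CH3_SPLIT_K = 247.0   # cold/warm split on MWTS Channel 3 TB (K)


def assign_subset(dew_point_c: float, tb_ch3_k: float) -> Optional[str]:
    """Return "cold" or "warm" for samples that pass pre-filtering.

    Samples failing either pre-filter rule are treated as non-snowfall and
    get None. A TB of exactly 247 K goes to the warm subset (assumption).
    """
    if dew_point_c > DEW_POINT_MAX_C or tb_ch3_k > TB_CH3_MAX_K:
        return None
    return "cold" if tb_ch3_k < TB_CH3_SPLIT_K else "warm"
```

Separating the filter from the model keeps the physical constraints explicit: a sample rejected here never reaches either classifier.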

Construction of Input Features and Training of the Snowfall Detection Model: The input features are constructed, and the snowfall detection model is trained separately on the cold and warm subsets. The initial training data in both subsets are normalized and processed using principal component analysis (PCA) to extract six principal components from the 30 input variables as the input features. The training dataset is divided into training and validation sets in a 7:3 ratio. Six different classifiers are used to train models on the training set, and the optimal classifier is selected based on performance metrics. The corresponding model of the optimal classifier is saved for snowfall detection.

3.3. Evaluation Metrics and Strategies

In the snowfall detection algorithm, accuracy, recall, and F1-score are used as evaluation metrics during the training process for comparing the performance of different machine learning classifiers. The three metrics are calculated using a 2 × 2 contingency table consisting of four numbers (hit, miss, false alarm, and correct negative) [30]. If the snowfall category is used as an example, a hit is defined as both the NCEP and the FY-3D snowfall detection algorithm detecting snowfall. A false alarm is when the FY-3D algorithm detects snowfall but the NCEP does not, while a miss is when the NCEP detects snowfall but the FY-3D detection algorithm does not. A correct negative is when both the NCEP and the FY-3D detection algorithm detect no snowfall. Specifically, the three metrics are calculated as follows:

$$\text{Accuracy} = \frac{\text{hit} + \text{correct negative}}{\text{hit} + \text{false alarm} + \text{miss} + \text{correct negative}} \tag{1}$$

$$\text{Recall} = \frac{\text{hit}}{\text{hit} + \text{miss}} \tag{2}$$

$$\text{F1-score} = \frac{2PR}{P + R} = \frac{2\,\text{hit}}{2\,\text{hit} + \text{false alarm} + \text{miss}} \tag{3}$$

where R represents recall and P represents precision. Precision is the proportion of hits among all samples that the detection algorithm predicts as positive, i.e., hit / (hit + false alarm).
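The second equality in Equation (3) follows from writing precision and recall in contingency-table terms. As a worked step, with the shorthand h, f, and m (introduced here only for brevity) standing for hit, false alarm, and miss:

```latex
\begin{align*}
P &= \frac{h}{h+f}, \qquad R = \frac{h}{h+m}, \\
\frac{2PR}{P+R}
  &= \frac{\dfrac{2h^{2}}{(h+f)(h+m)}}{\dfrac{h(h+m)+h(h+f)}{(h+f)(h+m)}}
   = \frac{2h^{2}}{h\,(2h+f+m)}
   = \frac{2h}{2h+f+m}.
\end{align*}
```

Correct negatives do not appear in the result, so the F1-score is insensitive to the size of the non-event class.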

Additionally, three dichotomous metrics commonly used for evaluating precipitation products were employed to assess the effectiveness of the algorithm when compared with other snowfall datasets: probability of detection (POD), false alarm ratio (FAR), and critical success index (CSI). These three metrics are calculated as follows:

$$\text{POD} = \frac{\text{hit}}{\text{hit} + \text{miss}} \tag{4}$$

$$\text{FAR} = \frac{\text{false alarm}}{\text{hit} + \text{false alarm}} \tag{5}$$

$$\text{CSI} = \frac{\text{hit}}{\text{hit} + \text{false alarm} + \text{miss}} \tag{6}$$
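Equations (1)–(6) all derive from the same four counts, which a short sketch makes explicit. The function name is hypothetical, and the sketch assumes every denominator is non-zero; a real pipeline would need a policy for empty categories.

```python
from typing import Dict


def contingency_metrics(hit: int, miss: int, false_alarm: int,
                        correct_negative: int) -> Dict[str, float]:
    """Compute Equations (1)-(6) from a 2 x 2 contingency table.

    Assumes all denominators are non-zero.
    """
    total = hit + miss + false_alarm + correct_negative
    return {
        "accuracy": (hit + correct_negative) / total,
        "recall": hit / (hit + miss),  # identical to POD
        "f1": 2 * hit / (2 * hit + false_alarm + miss),
        "pod": hit / (hit + miss),
        "far": false_alarm / (hit + false_alarm),
        "csi": hit / (hit + false_alarm + miss),
    }
```

Recall and POD share one formula, so they differ only in the context in which they are reported.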

4. Results and Discussion

4.1. Distribution of Key Input Variable Values under Three Weather Conditions

The dew point temperature, which combines surface air temperature, relative humidity, and surface air pressure, is a comprehensive index that plays a crucial role in precipitation phase identification [31]. In this study, all matched samples were classified based on the weather type indicated in the reference data, and the distribution of dew point temperatures was analyzed separately for each type (Figure 1). It was observed that the dew point temperatures for all snowfall samples were mainly concentrated around 0 °C and lower, with the highest value occurring around 2 °C. Conversely, the distribution of dew point temperatures was significantly higher for the two types of non-snowfall samples. Therefore, the dew point temperature serves as a pre-filtering criterion for the samples in the initial screening stage of input samples, and only samples with dew point temperature below 2 °C are input to the model for training.

The temperatures of the lower atmosphere and near-surface air play a crucial role in snowfall formation. Different temperature conditions result in significant variations in the vertical structure of hydrometeors during snowfall events, leading to distinct differences in the distribution of TBs of water vapor-sensitive channels. Consequently, it is necessary to pre-classify input samples based on temperature. Figure 2 illustrates the correlations among the observed TBs of all channels of the MWHS and MWTS, as well as the 2-m air temperature, and 2-m dew point temperature in the NCEP dataset.

The air temperature and dew point temperature exhibit a strong correlation with the window channel of the MWTS, where the correlation coefficient between the surface 2-m air temperature and the TB of the window channel exceeds 0.9. In comparison, the correlation between the observed TBs of the window channels (Channel 1 and Channel 10) of the MWHS and the surface 2-m air temperature is significantly weaker than that of the low-frequency window channels of the MWTS. This discrepancy arises from the higher sensitivity of the high-frequency band to scattering by hydrometeors. Among the 118 GHz channels, three groups (Channels 3 and 4, Channels 5 and 6, and Channels 7–9) each show very high internal agreement, with similar frequencies and similar peak heights of the channel weight functions within each group. The non-window channels, in contrast to the window channels, are influenced relatively less by surface-related signals such as snow cover. Among all non-window channels, MWTS Channel 3 displays the highest correlation with surface air temperature and dew point temperature. Therefore, this channel was selected as the basis for pre-classifying input samples.

Figure 3 shows the distribution of the TBs from MWTS Channel 3 in the samples of the three weather conditions. The distribution of TBs is observed to be similar in the subset of non-precipitation samples and the subset of rainfall samples, with values ranging between 225 and 280 K. The distribution exhibits a peak around 260 K. However, it is notable that the non-precipitation samples have a higher occurrence of TB values above 260 K compared to the other categories. In contrast, the snowfall samples display a distinct and significantly lower range of TB values, ranging from approximately 225 to 260 K. Based on this observation, a threshold of 260 K is chosen for the pre-filtering of the input dataset. Any samples with TBs above 260 K on the 52.8 GHz channel are directly classified as non-snowfall samples and excluded from the training dataset. Additionally, the threshold of 247 K is determined for pre-classification based on the peak value of the TB distribution in the snowfall samples. This threshold is used to divide the samples into a cold subset (TB < 247 K) and a warm subset (TB > 247 K). The subsequent feature construction and model training steps are performed separately on these two subsets.

4.2. Statistical Characterization of Input Variables for Pre-Classified Subsets

Figure 4 presents the distributions of two surface variables, namely surface air temperature and dew point temperature, for the separate subsets. In the cold subset, it is observed that the mean surface air temperature and dew point temperature are smaller for snowfall events compared to rainfall events, and both are smaller than for non-precipitation events. However, it is important to note that the surface temperature-related variables for all three events are relatively low, and the distribution ranges are very close when considering the standard deviation. This close overlap between different categories indicates the difficulty of distinguishing between the three categories based solely on surface temperature-related variables in the cold subset. On the other hand, in the warm subset, there are significant differences in temperature-related variables among the three events. Specifically, for snowfall events, the mean temperature is below zero, and both surface air temperature and dew point temperature are considerably lower compared to rainfall and non-precipitation events. It is worth mentioning that frontal and convective rains, which are common in continental regions, are associated with the movement of warm winds or the upward movement of surface thermal air masses. This leads to the highest mean surface air temperature and dew point temperature for rainfall events in the warm subset compared to the other event types.

Figure 5 illustrates the distributions of TBs from representative channels of the MWTS in the separate subsets. In the cold subset, TB differences between the three sample types are relatively smaller, but the window channel (e.g., MWTS Channel 1) shows higher mean TB for snowfall events due to surface reflection effects. Non-window channels exhibit comparable or slightly higher TB for snowfall and rainfall events compared to non-precipitation events. In the warm subset, the window channel’s TB aligns with surface temperatures, with significantly lower TB for snowfall events. In the middle and lower atmosphere, distinguishing TB differences becomes more challenging, while the upper atmosphere shows significantly lower TB for snowfall events due to colder ice crystal nuclei.

Figure 6 depicts the distributions of TBs for representative channels of the MWHS in the separate subsets. In the cold subset, snowfall conditions exhibit higher TB in the window channel and middle/lower atmosphere probe channels compared to rainfall and non-precipitation events. This suggests the dominance of liquid water emission in the lower atmosphere during snowfall events at lower temperatures. Snowfall events in this condition may originate from liquid/mixed-phase water condensate particles reaching the surface as solid form during falling due to low temperatures. In the warm subset, the lower atmosphere sounding channel shows lower TB distribution under snowfall events, indicating the scattering effect of ice particles in the lower atmosphere at relatively high temperatures. This suggests that the presence of sufficient solid water particles, such as ice/hail, is necessary for snowfall events in this case. In contrast, there is minimal TB difference in the middle and upper atmospheric sounding channels under different weather conditions.

4.3. Algorithm Training Results

PCA is utilized to extract information from the input variables due to the strong correlations between TBs of channels with similar frequencies. After standardizing the initial input variables, PCA is performed separately on each of the two subsets. The first principal component, which has the largest eigenvalue, accounts for a significant portion of the total variance. In the warm subset, it explains 40.3% of the total variance with an eigenvalue of 13.31, while in the cold subset, it explains 35.7% with an eigenvalue of 11.76. To select the most informative features, the first six principal components are chosen based on a cumulative total variance threshold of 90% for each subset.
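The selection rule based on cumulative variance can be sketched without any PCA library, given only the eigenvalues. The function names are hypothetical, and any eigenvalues passed in are placeholders rather than values from this study.

```python
from typing import List, Sequence


def explained_variance_ratios(eigenvalues: Sequence[float]) -> List[float]:
    """Share of total variance carried by each principal component."""
    total = sum(eigenvalues)
    return [value / total for value in eigenvalues]


def components_for_threshold(eigenvalues: Sequence[float],
                             threshold: float = 0.90) -> int:
    """Smallest number of leading components whose cumulative share
    of variance reaches `threshold`."""
    cumulative = 0.0
    ordered = sorted(eigenvalues, reverse=True)
    for count, ratio in enumerate(explained_variance_ratios(ordered), start=1):
        cumulative += ratio
        if cumulative >= threshold:
            return count
    return len(ordered)
```

With placeholder eigenvalues of 5.0, 3.0, 1.5, and 0.5, the cumulative shares are 0.50, 0.80, and 0.95, so three components are kept at a 90% threshold.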

The principal component scores, which represent the projected values of the original data onto each principal component, show that surface temperature and atmospheric temperature sounding channels (MWTS Channels 3–5) and atmospheric humidity sounding channels (MWHS Channels 6–9 and Channels 11–15) with lower peak heights of channel weight functions contribute the most to the PCA features (Table A1). However, in the warm subset, the water vapor detection channels and the temperature detection channels of the middle and upper atmosphere have relatively higher weights compared to the cold subset.

Table 3 displays the classification performances of the six classifiers, with the metrics representing the weighted mean values based on the sample size. The random forest and decision tree classifiers demonstrate significantly higher accuracy compared to the other methods. This is attributed to their ability to consider all input principal components in a balanced manner, making them more adaptable to datasets with imbalanced categories. For instance, in the cold subset, which consists of 51,914 samples, snowfall samples comprise approximately 50%, non-precipitation samples account for around 36%, and rainfall samples constitute approximately 14%. Considering the overall performance across the metrics, the random forest classifier is chosen as the classifier for the snowfall detection algorithm in this study.

4.4. Evaluation and Application Examples

The snowfall detection algorithm was evaluated using a test set consisting of FY-3D MWTS- and MWHS-observed TBs and NCEP surface variables from January to March 2020. A total of 29,870 valid samples were matched, including 5440 snowfall samples, 6961 rainfall samples, and 17,469 non-precipitation samples. By applying the one-vs-rest strategy, the performance of the model for each individual class was evaluated separately. Table 4 provides specific results for the classification performance of the model on the test set, highlighting the performance for each classification category. The algorithm showed good accuracy overall, but the recall and F1-score for the rainfall category were lower, which can be attributed to its significantly smaller sample size.
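Per-class evaluation under the one-vs-rest strategy amounts to building one contingency table per category from the same pair of label sequences. The sketch below uses a hypothetical function name and made-up labels; it is not the evaluation code used in the study.

```python
from typing import Dict, Sequence


def one_vs_rest_table(y_true: Sequence[str], y_pred: Sequence[str],
                      positive: str) -> Dict[str, int]:
    """Count hits, misses, false alarms and correct negatives for one class."""
    table = {"hit": 0, "miss": 0, "false_alarm": 0, "correct_negative": 0}
    for truth, pred in zip(y_true, y_pred):
        if truth == positive and pred == positive:
            table["hit"] += 1
        elif truth == positive:
            table["miss"] += 1
        elif pred == positive:
            table["false_alarm"] += 1
        else:
            table["correct_negative"] += 1
    return table
```

Because correct negatives count toward accuracy, a class can show high accuracy alongside low recall when most samples belong to the other categories.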

As discussed in Section 4.2, differentiating between rainfall and snowfall categories in the cold condition and between rainfall and non-precipitation categories in the warm condition poses challenges due to the distribution characteristics of the input data. In the cold subset of the test set, which includes 6890 data points, there are 3471 snowfall samples and only 713 rainfall samples. In the remaining samples outside the cold subset, there are 14,235 non-precipitation samples and 6248 rainfall samples. The substantial difference in sample size between the two categories increases the likelihood of misclassification, resulting in lower recall and F1-scores for the rainfall category.

The snowfall detection algorithm was evaluated against three existing snowfall products (CloudSat, ERA5, and IMERG) using NCEP data as the reference for classification performance. The evaluation employed POD, FAR, and CSI, which are commonly used in precipitation product evaluation. To overcome the limited spatial coverage of CloudSat data, the snowfall detection results of the algorithm, along with the three data sources, were matched with NCEP records over the entire year of 2020. The evaluated IMERG snowfall product classifies the precipitation phase as snow when the probability of liquid precipitation is less than 50% [32]. Table 5 presents the performance metrics for the algorithm’s snowfall detection and for these products. The algorithm identifies the snowfall category significantly more accurately than IMERG and ERA5. The algorithm’s POD for the snowfall category is comparable to that of the CloudSat product. However, when considering overall metrics, there is still a gap between the algorithm and the CloudSat product.

Overall, the proposed algorithm takes into account the distribution characteristics of channel brightness temperatures and surface variables under different temperature conditions and incorporates physical constraints. However, it does have limitations. The uneven number of training samples for each category is not fully addressed, and the algorithm tends to predict categories with more training samples, which can affect the accuracy of predicting snowfall and rainfall samples whose input feature values are similar to those of non-precipitation samples. This limitation highlights the challenge of accurately classifying these two categories and suggests the need for further improvement in handling imbalanced datasets and distinguishing similar categories.

In practical applications of the algorithm, it may not always be possible to have spatiotemporally matched NCEP site observation surface variable data for every satellite observation. To address this issue, the 2-m air temperature and 2-m dew point temperature in the input variables for the algorithm were provided by the ERA5 reanalysis data, which offers strong spatiotemporal continuity. Figure 7 illustrates an application example of the algorithm on 1 January 2019, displaying the daily distribution of snowfall over the continental U.S. The snowfall distribution is generated using data from the National Water Center’s U.S. National Snowfall Analysis Network. The snowfall detection results from the FY-3D satellite observations are also shown. The algorithmic input to these results is based on the observation time of MWHS samples, for which MWTS samples with observation time differences within 15 min are matched. Gauge-based data indicate two areas of concentrated snowfall events on that day, located within the longitude range of −110° to −100° E and latitude range of 30° to 40° N, as well as the longitude range of −100° to −70° E and latitude range of 40° to 45° N. Comparing the snowfall identification results for the same latitude and longitude ranges in Figure 7c,d,g and Figure 7e,f, it can be observed that the snowfall event detection algorithm also detects the occurrence of snowfall events in these regions. The snowfall detection results from FY-3D reveal that the coverage of non-snowfall samples is the highest, while the distribution of snowfall samples and rainfall samples is relatively concentrated. Additionally, many regions exhibit a mixed distribution of snowfall and rainfall samples.

5. Conclusions

In this study, a snowfall detection algorithm considering differentiated atmospheric temperature conditions is proposed for FY-3D MWTS and MWHS microwave sounding data. The algorithm is trained and validated using data from the whole year of 2019, tested using data from January to March 2020, and compared with existing snowfall products during the whole year of 2020. The main conclusions are given as follows:

(1): The low-frequency window channels of the MWTS show a strong correlation with surface 2 m air temperature, while the high-frequency window channels of the MWHS have a weaker correlation. Among the non-window channels, MWTS Channel 3 exhibits the highest correlation with surface air temperature and dew point temperature, making it a suitable proxy for atmospheric temperature conditions in the algorithm.
(2): The value distributions of key input variables differ significantly between the subsets corresponding to different atmospheric temperature conditions. Snowfall mechanisms vary with temperature conditions: snowfall caused by solid hydrometeors falling at higher atmospheric temperatures is dominated by scattering effects, whereas snowfall caused by mixed-phase or liquid water particles condensing during descent at lower atmospheric temperatures is dominated by emission effects.
(3): Six commonly used classifiers were compared, and the random forest approach performs best on the validation set, with accuracy, recall, and F1 score values of 0.76, 0.76, and 0.75, respectively. The snowfall detection algorithm, constructed using principal component analysis (PCA) and the random forest classifier, achieves high accuracy (0.797) but only moderate recall (0.437) for snowfall on the test set. The overall classification performance of the algorithm exceeds that of the ERA5 and IMERG snowfall datasets but still lags behind that of the CloudSat snowfall products.
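
The two-stage structure summarized in these conclusions, pre-classifying each sample by MWTS Channel 3 TB and then applying a subset-specific classifier, might be sketched as follows. This is only an illustration of the routing logic: the split value `TB_CH3_SPLIT_K`, the field name `mwts_ch3_tb`, and the two stub models are hypothetical placeholders, not values or models from the study, where each subset is presumably handled by PCA followed by a random forest.

```python
from typing import Callable, Dict, List, Sequence

Sample = Dict[str, float]
Classifier = Callable[[Sample], str]

# Hypothetical split value in kelvin for MWTS Channel 3 TB.
# The threshold actually used is not given in this section.
TB_CH3_SPLIT_K = 250.0


def detect(samples: Sequence[Sample],
           cold_model: Classifier,
           warm_model: Classifier,
           split_k: float = TB_CH3_SPLIT_K) -> List[str]:
    """Route each sample to the cold or warm model by its MWTS Channel 3 TB."""
    labels = []
    for sample in samples:
        model = cold_model if sample["mwts_ch3_tb"] < split_k else warm_model
        labels.append(model(sample))
    return labels


def cold_stub(sample: Sample) -> str:
    """Placeholder standing in for a model trained on the cold subset."""
    return "snowfall"


def warm_stub(sample: Sample) -> str:
    """Placeholder standing in for a model trained on the warm subset."""
    return "rainfall"


# Two made-up samples, one on each side of the hypothetical split.
demo = [{"mwts_ch3_tb": 240.0}, {"mwts_ch3_tb": 260.0}]
print(detect(demo, cold_stub, warm_stub))  # ['snowfall', 'rainfall']
```

Keeping the two models as interchangeable callables means each subset can use its own PCA transform and classifier without changing the routing step.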

The proposed snowfall detection algorithm takes into account snowfall occurrence mechanisms under different temperature conditions, leading to improved prediction accuracy on both the validation and test sets. This algorithm provides valuable support for obtaining surface snowfall information using FY-3D microwave sounding data.

Author Contributions

All authors contributed to the work presented in this paper. Q.J. and Z.M. developed the concept and methodology; Q.J. performed the algorithm construction, wrote the initial paper, and contributed to the research questions and data preprocessing; J.X. and S.Y. contributed to the analysis of the research data and optimization of the figures; Z.M. and X.L. contributed to the content and structure of the final paper as well as to the writing of the paper. All authors have read and agreed to the published version of the manuscript.

Funding

This study was funded by the National Key R&D Program of China (Grant No. 2021YFB3900400); the Second Tibetan Plateau Scientific Expedition and Research (STEP) program (Grant No. 2019QZKK0105); the International Research Center of Big Data for Sustainable Development Goals Youth Director Foundation (CBAS2022DF011); the National Natural Science Foundation of China (Grant No. 41901343); and the China Postdoctoral Science Foundation (Nos. 2018M630037 and 2019T120021).

Data Availability Statement

The NCEP ADP data can be downloaded from http://rda.ucar.edu/datasets/ds464.0/ (accessed on 1 March 2021). The FY-3D microwave sounding data can be downloaded from the Fengyun satellite remote sensing data service website http://satellite.nsmc.org.cn (accessed on 23 February 2022). The ERA5 data can be downloaded from the Climate Data Store website https://cds.climate.copernicus.eu/cdsapp#!/dataset/reanalysis-era5-single-levels (accessed on 13 January 2022).

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

Table A1. The principal component scores of the input variables in the first three principal components.

Variable	Cold Subset PC1	Cold Subset PC2	Cold Subset PC3	Warm Subset PC1	Warm Subset PC2	Warm Subset PC3
2 m Air.T	−0.197	0.009	0.177	−0.233	0.076	0.068
2 m Dewp.T	−0.116	−0.005	0.174	−0.231	0.079	0.064
MWTS-Ch1	−0.222	0.043	0.155	−0.220	0.087	0.093
MWTS-Ch2	−0.242	0.037	0.113	−0.239	0.085	0.056
MWTS-Ch3	−0.251	−0.008	−0.011	−0.248	0.035	−0.080
MWTS-Ch4	−0.214	−0.084	−0.134	−0.200	−0.074	−0.193
MWTS-Ch5	−0.155	−0.222	−0.184	−0.152	−0.229	−0.205
MWTS-Ch6	−0.072	−0.341	−0.132	−0.113	−0.308	−0.124
MWTS-Ch7	0.024	−0.363	−0.024	−0.085	−0.329	−0.038
MWTS-Ch8	0.077	−0.337	0.124	−0.060	−0.339	0.102
MWTS-Ch9	0.033	−0.306	0.243	−0.056	−0.300	0.226
MWTS-Ch10	−0.027	−0.176	0.351	−0.060	−0.189	0.341
MWTS-Ch11	−0.049	0.002	0.340	−0.059	−0.047	0.352
MWTS-Ch12	−0.049	0.099	0.260	−0.059	0.042	0.291
MWTS-Ch13	−0.052	0.129	0.180	−0.057	0.088	0.206
MWHS-Ch1	−0.223	0.051	0.178	−0.216	0.106	0.131
MWHS-Ch2	−0.015	−0.207	0.310	−0.057	−0.222	0.300
MWHS-Ch3	0.056	−0.336	0.152	−0.060	−0.330	0.134
MWHS-Ch4	0.064	−0.353	0.060	−0.073	−0.342	0.039
MWHS-Ch5	−0.114	−0.291	−0.193	−0.136	−0.251	−0.202
MWHS-Ch6	−0.188	−0.199	−0.195	−0.173	−0.180	−0.237
MWHS-Ch7	−0.283	0.006	0.034	−0.262	0.070	0.006
MWHS-Ch8	−0.281	0.019	0.066	−0.257	0.087	0.045
MWHS-Ch9	−0.257	0.045	0.133	−0.232	0.112	0.109
MWHS-Ch10	−0.238	0.054	0.125	−0.225	0.122	0.112
MWHS-Ch11	−0.155	−0.039	−0.149	−0.150	−0.032	−0.142
MWHS-Ch12	−0.185	0.002	−0.132	−0.194	0.025	−0.104
MWHS-Ch13	−0.205	0.031	−0.099	−0.222	0.071	−0.045
MWHS-Ch14	−0.219	0.037	−0.073	−0.232	0.089	−0.013
MWHS-Ch15	−0.228	0.041	−0.005	−0.236	0.111	0.040

References

Hunsaker, C.T.; Whitaker, T.W.; Bales, R.C. Snowmelt runoff and water yield along elevation and temperature gradients in California’s Southern Sierra Nevada. JAWRA J. Am. Water Resour. Assoc. 2012, 48, 667–678.
McCabe, G.J.; Clark, M.P.; Hay, L.E. Rain-on-snow events in the western United States. Bull. Am. Meteorol. Soc. 2007, 88, 319–328.
Sui, J.; Koehler, G. Rain-on-snow induced flood events in Southern Germany. J. Hydrol. 2001, 252, 205–220.
Levizzani, V.; Laviola, S.; Cattani, E. Detection and measurement of snowfall from space. Remote Sens. 2011, 3, 145–166.
Liu, G. Approximation of single scattering properties of ice and snow particles for high microwave frequencies. J. Atmos. Sci. 2004, 61, 2441–2456.
Doesken, N.J.; Judson, A. The Snow Booklet: A Guide to the Science, Climatology, and Measurement of Snow in the United States; Colorado State University Publications & Printing: Fort Collins, CO, USA, 1997.
Foster, J.L.; Skofronick-Jackson, G.; Meng, H.; Wang, J.R.; Riggs, G.; Kocin, P.J.; Johnson, B.T.; Cohen, J.; Hall, D.K.; Nghiem, S.V. Passive microwave remote sensing of the historic February 2010 snowstorms in the Middle Atlantic region of the USA. Hydrol. Process. 2012, 26, 3459–3471.
Kongoli, C.; Pellegrino, P.; Ferraro, R.R.; Grody, N.C.; Meng, H. A new snowfall detection algorithm over land using measurements from the Advanced Microwave Sounding Unit (AMSU). Geophys. Res. Lett. 2003, 30, 1756.
Liu, G.; Curry, J.A. Precipitation characteristics in Greenland-Iceland-Norwegian Seas determined by using satellite microwave data. J. Geophys. Res. Atmos. 1997, 102, 13987–13997.
Skofronick-Jackson, G.M.; Kim, M.J.; Weinman, J.A.; Chang, D.E. A physical model to determine snowfall over land by microwave radiometry. IEEE Trans. Geosci. Remote Sens. 2004, 42, 1047–1058.
Skofronick-Jackson, G.M.; Weinman, J.A.; Chang, D.-E. Observation of snowfall over land by microwave radiometry from space. In Proceedings of the IEEE International Geoscience and Remote Sensing Symposium, Toronto, ON, Canada, 24–28 June 2002; pp. 1866–1868.
Liu, G.; Seo, E.K. Detecting snowfall over land by satellite high-frequency microwave observations: The lack of scattering signature and a statistical approach. J. Geophys. Res. Atmos. 2013, 118, 1376–1387.
Ghimire, B.; Rogan, J.; Galiano, V.R.; Panday, P.; Neeti, N. An evaluation of bagging, boosting, and random forests for land-cover classification in Cape Cod, Massachusetts, USA. GIScience Remote Sens. 2012, 49, 623–643.
Lary, D.J.; Alavi, A.H.; Gandomi, A.H.; Walker, A.L. Machine learning in geosciences and remote sensing. Geosci. Front. 2016, 7, 3–10.
Maxwell, A.E.; Warner, T.A.; Fang, F. Implementation of machine-learning classification in remote sensing: An applied review. Int. J. Remote Sens. 2018, 39, 2784–2817.
Carminati, F.; Atkinson, N.; Candy, B.; Lu, Q. Insights into the microwave instruments onboard the Fengyun 3D satellite: Data quality and assimilation in the Met Office NWP system. Adv. Atmos. Sci. 2021, 38, 1379–1396.
Xu, J.; Ma, Z.; Hu, H.; Weng, F. A Cloud-Dependent 1DVAR Precipitation Retrieval Algorithm for FengYun-3D Microwave Soundings: A Case Study in Tropical Cyclone Mekkhala. IEEE Geosci. Remote Sens. Lett. 2023, 20, 1–5.
Hu, H.; Han, Y. Comparing the thermal structures of tropical cyclones derived from Suomi NPP ATMS and FY-3D microwave sounders. IEEE Trans. Geosci. Remote Sens. 2020, 59, 8073–8083.
Niu, Z.; Zhang, L.; Dong, P.; Weng, F.; Huang, W. Impact of assimilating FY-3D MWTS-2 upper air sounding data on forecasting Typhoon Lekima (2019). Remote Sens. 2021, 13, 1841.
Song, L.; Shen, F.; Shao, C.; Shu, A.; Zhu, L. Impacts of 3DEnVar-Based FY-3D MWHS-2 Radiance Assimilation on Numerical Simulations of Landfalling Typhoon Ampil (2018). Remote Sens. 2022, 14, 6037.
Dai, A. Temperature and pressure dependence of the rain-snow phase transition over land and ocean. Geophys. Res. Lett. 2008, 35.
Sims, E.M.; Liu, G. A parameterization of the probability of snow–rain transition. J. Hydrometeorol. 2015, 16, 1466–1477.
Xiong, W.; Tang, G.; Wang, T.; Ma, Z.; Wan, W. Evaluation of IMERG and ERA5 Precipitation-Phase Partitioning on the Global Scale. Water 2022, 14, 1122.
Dudani, S.A. The distance-weighted k-nearest-neighbor rule. IEEE Trans. Syst. Man Cybern. 1976, 4, 325–327.
Hosmer, D.W., Jr.; Lemeshow, S.; Sturdivant, R.X. Applied Logistic Regression; John Wiley & Sons: Hoboken, NJ, USA, 2013; Volume 398.
Myles, A.J.; Feudale, R.N.; Liu, Y.; Woody, N.A.; Brown, S.D. An introduction to decision tree modeling. J. Chemom. 2004, 18, 275–285.
Belgiu, M.; Drăguţ, L. Random forest in remote sensing: A review of applications and future directions. ISPRS J. Photogramm. Remote Sens. 2016, 114, 24–31.
Ferreira, A.J.; Figueiredo, M.A. Boosting algorithms: A review of methods, theory, and applications. In Ensemble Machine Learning: Methods and Applications; Springer: Berlin/Heidelberg, Germany, 2012; pp. 35–85.
Reddy, E.M.K.; Gurrala, A.; Hasitha, V.B.; Kumar, K.V.R. Introduction to Naive Bayes and a Review on Its Subtypes with Applications. In Bayesian Reasoning and Gaussian Processes for Machine Learning Applications; Chapman and Hall/CRC: New York, NY, USA, 2022; pp. 1–14.
Wilks, D.S. Statistical Methods in the Atmospheric Sciences; International Geophysics Series; Elsevier Inc.: Amsterdam, The Netherlands, 2006; Volume 100.
Behrangi, A.; Yin, X.; Rajagopal, S.; Stampoulis, D.; Ye, H. On distinguishing snowfall from rainfall using near-surface atmospheric information: Comparative analysis, uncertainties and hydrologic importance. Q. J. R. Meteorol. Soc. 2018, 144, 89–102.
Sadeghi, L.; Saghafian, B.; Moazami, S. Evaluation of IMERG and MRMS remotely sensed snowfall products. Int. J. Remote Sens. 2019, 40, 4175–4192.

Figure 1. Histogram of the dew point temperature distribution for the three weather categories.

Figure 2. Correlations between all MWTS and MWHS channels and the NCEP variables of 2 m air temperature in °C (2 m Air.T) and 2 m dew point temperature (2 m Dew point.T) in matched samples.

Figure 3. Histogram of the TB distribution of MWTS Channel 3 for the three weather categories.

Figure 4. Histogram of the distribution of mean surface 2 m air temperature and 2 m dew point temperature for different weather categories in separate subsets.

Figure 5. Histogram of the distribution of mean TBs for the representative MWTS channels for different weather categories in separate subsets.

Figure 6. Histogram of the distribution of mean TBs for the representative MWHS channels for different weather categories in separate subsets.

Figure 7. Application example of the snowfall detection algorithm on 1 January 2019. (a) Daily accumulated snowfall generated from data from the National Water Center’s U.S. National Snowfall Analysis Network. (b–g) Snowfall detection results based on the snowfall detection algorithm using FY-3D MWHS and MWTS observations.

Table 1. Key features of the FY-3D MWTS-II channels, including peak channel weighting functions calculated using the US standard atmospheric profile.

Channel No.	Center Frequency (GHz)	Polarization	Bandwidth (MHz)	NE∆T (K)	Peak Sounding Height (hPa)
1	50.30	QH	180	1.20	window
2	51.76	QH	400	0.75	window
3	52.8	QH	400	0.75	950
4	53.596	QH	400	0.75	700
5	54.40	QH	400	0.75	400
6	54.94	QH	400	0.75	250
7	55.50	QH	330	0.75	180
8	57.290344 (f₀)	QH	330	0.75	90
9	f₀ ± 0.217	QH	78	1.20	50
10	f₀ ± 0.3222 ± 0.048	QH	36	1.20	20
11	f₀ ± 0.3222 ± 0.022	QH	16	1.70	12
12	f₀ ± 0.3222 ± 0.010	QH	8	2.40	5
13	f₀ ± 0.3222 ± 0.0045	QH	3	3.60	2

Table 2. Key features of the FY-3D MWHS-II channels, including peak channel weighting functions calculated using the US standard atmospheric profile.

Channel No.	Center Frequency (GHz)	Polarization	Bandwidth (MHz)	NE∆T (K)	Peak Sounding Height (hPa)
1	89.0	QV	1500	1.0	window
2	118.75 ± 0.08	QH	20	3.6	30
3	118.75 ± 0.2	QH	100	2.0	60
4	118.75 ± 0.3	QH	165	1.6	100
5	118.75 ± 0.8	QH	200	1.6	230
6	118.75 ± 1.1	QH	200	1.6	350
7	118.75 ± 2.5	QH	200	1.6	900
8	118.75 ± 3.0	QH	1000	1.0	925
9	118.75 ± 5	QH	2000	1.0	950
10	150.0	QV	1500	1.0	window
11	183.31 ± 1	QH	500	1.0	450
12	183.31 ± 1.8	QH	700	1.0	500
13	183.31 ± 3	QH	1000	1.0	600
14	183.31 ± 4.55	QH	2000	1.0	700
15	183.31 ± 7	QH	2000	1.0	800

Table 3. Classification accuracy of different classifiers on the validation dataset.

Classifier	Accuracy	Recall	F1-Score
KNN	0.67	0.68	0.68
Logistic Regression	0.50	0.55	0.50
Random Forest	0.76	0.76	0.75
Decision Trees	0.73	0.73	0.73
Gradient Boosting	0.63	0.63	0.59
Gaussian Naïve Bayes	0.58	0.58	0.53

Table 4. Classification accuracy of the algorithm on the test set.

Weather Condition	Accuracy	Recall	F1-Score
Snowfall	0.797	0.437	0.439
Rainfall	0.767	0.109	0.179
Non-precipitation	0.592	0.799	0.696
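
Table 4 pairs a fairly high accuracy with a much lower recall for snowfall. One reading that would produce this pattern is per-class (one-vs-rest) scoring, in which correctly rejected samples of the other classes count toward accuracy. The sketch below assumes that reading, which this section does not confirm, and uses made-up confusion-matrix counts purely for illustration.

```python
def one_vs_rest_metrics(confusion, labels):
    """Per-class accuracy, recall, and F1 from a square confusion matrix.

    confusion[i][j] is the number of samples of true class i predicted as class j.
    """
    total = sum(sum(row) for row in confusion)
    metrics = {}
    for k, name in enumerate(labels):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp                 # true class k, predicted otherwise
        fp = sum(row[k] for row in confusion) - tp  # predicted k, true class differs
        tn = total - tp - fn - fp
        recall = tp / (tp + fn) if tp + fn else 0.0
        precision = tp / (tp + fp) if tp + fp else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        metrics[name] = {"accuracy": (tp + tn) / total, "recall": recall, "f1": f1}
    return metrics


# Hypothetical counts, not data from the study.
labels = ["snowfall", "rainfall", "non-precipitation"]
confusion = [
    [40, 10, 50],
    [10, 20, 70],
    [40, 10, 750],
]
scores = one_vs_rest_metrics(confusion, labels)
for name in labels:
    print(name, {key: round(value, 3) for key, value in scores[name].items()})
```

With these counts the snowfall class reaches an accuracy of 0.89 at a recall of only 0.40, because the many correctly rejected non-snowfall samples dominate the accuracy term.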

Table 5. Evaluation of snowfall detection algorithm and existing snowfall products using NCEP data.

Snowfall Dataset	POD	FAR	CSI
Snowfall event detected by FY-3D	0.608	0.589	0.324
CloudSat	0.616	0.139	0.560
ERA5	0.250	0.429	0.210
IMERG	0.200	0.191	0.191
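
The three scores in Table 5 are linked. Assuming the standard contingency-table definitions (POD = H/(H+M), FAR = F/(H+F), CSI = H/(H+M+F), with H hits, M misses, and F false alarms), which may differ from the definitions given in the methodology section, CSI follows from POD and FAR as 1/CSI = 1/POD + 1/(1 − FAR) − 1. The sketch below uses that identity as a consistency check on the tabulated values; the function and variable names are illustrative.

```python
def csi_from_pod_far(pod, far):
    """CSI implied by POD and FAR under the standard 2x2 contingency definitions."""
    return 1.0 / (1.0 / pod + 1.0 / (1.0 - far) - 1.0)


# (POD, FAR, reported CSI) copied from Table 5.
TABLE_5 = {
    "FY-3D": (0.608, 0.589, 0.324),
    "CloudSat": (0.616, 0.139, 0.560),
    "ERA5": (0.250, 0.429, 0.210),
    "IMERG": (0.200, 0.191, 0.191),
}

for name, (pod, far, csi) in TABLE_5.items():
    implied = csi_from_pod_far(pod, far)
    print(f"{name}: reported CSI = {csi:.3f}, implied CSI = {implied:.3f}")
```

Under these assumed definitions the implied and reported CSI values agree to within rounding for all four datasets.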

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

