Multivariable Characterization of Atmospheric Environment with Data Collected in Flight

Shakirova, Aliia; Nichman, Leonid; Belacel, Nabil; Nguyen, Cuong; Bliankinshtein, Natalia; Wolde, Mengistu; DiVito, Stephanie; Bernstein, Ben; Huang, Yi

doi:10.3390/atmos13101715

Open AccessArticle

Multivariable Characterization of Atmospheric Environment with Data Collected in Flight

by

Aliia Shakirova

¹,

Leonid Nichman

^2,*

,

Nabil Belacel

³

,

Cuong Nguyen

²,

Natalia Bliankinshtein

²,

Mengistu Wolde

²,

Stephanie DiVito

⁴,

Ben Bernstein

⁵ and

Yi Huang

¹

Department of Atmospheric and Oceanic Sciences, McGill University, Montreal, QC H3A 0B9, Canada

²

Flight Research Laboratory, Aerospace Research Center, National Research Council Canada, Ottawa, ON K1A 0R6, Canada

³

Digital Technologies Research Center, National Research Council Canada, Ottawa, ON K1A 0R6, Canada

⁴

Aviation Research Division, William J. Hughes Technical Center, Federal Aviation Administration, Atlantic City, NJ 08405, USA

⁵

Leading Edge Atmospherics, Longmont, CO 80503, USA

^*

Author to whom correspondence should be addressed.

Atmosphere 2022, 13(10), 1715; https://doi.org/10.3390/atmos13101715

Submission received: 4 August 2022 / Revised: 14 October 2022 / Accepted: 17 October 2022 / Published: 19 October 2022

(This article belongs to the Special Issue Feature Papers in Atmosphere Science)

Download

Browse Figures

Versions Notes

Abstract

:

The In-Cloud Icing and Large-drop Experiment (ICICLE) flight campaign, led by the United States Federal Aviation Administration, was conducted in the geographical region over US Midwest and Western Great Lakes, between January and March 2019, with the aim to collect atmospheric data and study the aircraft icing hazard. Measurements were taken onboard the National Research Council of Canada (NRC) Convair-580 aircraft, which was equipped with more than 40 in situ probes, sensors, and remote sensing instruments in collaboration with Environment and Climate Change Canada (ECCC). In each flight, aerosol, cloud microphysics, atmospheric and aircraft state data were collected. Atmospheric environment characterization is critical both for cloud studies and for operational decision making in flight. In this study, we use the advantage of multiple input parameters collected in-flight together with machine learning and clustering techniques to characterize the flight environment. Eleven parameters were evaluated for the classification of the sampled environment along the flight path. Namely, aerosol concentration, temperature, hydrometeor concentration, hydrometeor size, liquid water content, total water content, ice accretion rate, and radar parameters in the vicinity of the aircraft. In the analysis of selected flights, we were able to identify periods of supercooled liquid clouds, glaciated clouds, two types of mixed-phase clouds, and clear air conditions. This approach offers an alternative characterization of cloud boundaries and a complementary identification of flight periods with hazardous icing conditions.

Keywords:

clouds; mixed phase; classification; ICICLE; Convair-580; airborne; flight; troposphere

1. Introduction

Regulators, air traffic controllers, and pilots use atmospheric data for airborne operations. In the decision-making process, the interpretation of the incoming data may vary based on individuals’ experience [1]. New weather forecast and operational tools aim to overcome human factor biases and improve characterization of the hazardous flight environment via multiplatform analyses, i.e., airborne, spaceborne, ground, and models [2,3,4,5]. Such characterization of the atmospheric environment would highly depend on the selection of tools and the final application. Common cases of interpretation biases include that of clear-air or out-of-cloud environment definition and that of mixed phase clouds, where the ratios of sizes and numbers of hydrometeors play a key role. Meanwhile, atmospheric scientists distinguish between different types of clouds by the dominant physical processes involved in cloud evolution. All identified categories are often inconsistent due to the use of different approaches, e.g., selection of observations location, resolution and sensitivity. For example, Oktem and Romps [6] have used ground stereo cameras in a multi-year study to identify cloud spacing, i.e., clear air and other cloud parameters. On the other hand, for phase discrimination, satellite data is often used, incorporating algorithms and thresholds on polarization or reflectance at multiple wavelengths (e.g., [7,8,9]), these may not necessarily agree with cloud boundaries observations from the ground elevated research stations (e.g., [10]), especially in cases of multiple cloud layers. Some specific examples include spatial and temporal distribution of cumulus clouds, in particular cirrocumulus or altocumulus which have inherent discrepancies in characterization using remote sensing versus in situ. Moreover, the known limitations and accuracy of in situ probes [11] as well as the expertise of the operator, in many cases, will influence the outcome of such characterization efforts. In light of these limitations, a method for assessment of the flight environment without using any a priori criteria would be desirable.

Research is ongoing to develop a multi-sensor cloud phase algorithm [12,13,14,15]. Recently, in the second Buffalo Area Icing and Radar Study (BAIRS II) an effort was made to distinguish between liquid phase, mixed phase, and glaciated (no liquid) conditions [4]. That study used a “decision tree” dendrogram classification approach where the liquid water content (LWC) and total water content (TWC) data from the Nevzorov hot-wire probes, particle concentration from forward scattering spectrometer probe (FSSP), temperature data, and SNDI (Spheres, Needles, Dendrites, and Irregulars) crystal habit classification [16,17,18] from the precipitation imaging probe (PIP) and 2D-C probes were the principal inputs to determine cloud phase. Other researchers have used similar methods; however, these only include in situ sensors (e.g., [19,20]). Additional relevant data clustering methods include “hierarchical cluster analysis” [21], fuzzy j-means (FJM) [22] and clustering based Simulated Annealing Improved Artificial Fish Swarm Optimization Algorithm (SA-IAFSA) [23] to obtain the optimal number of clusters instead of defining it a priori.

In this study, we present the results from a number of classification approaches, applied to data collected in flight, in an effort to advance the development of an operational tool for flight environment classification. In the following section, we describe the environment where data were collected, the aircraft and the instrumentation that were used to collect the data, and focus on the description of common methods that were implemented in this study.

2. Materials and Methods

2.1. NRC Convair-580 Research Aircraft

Measurements for this study were collected onboard the National Research Council of Canada (NRC) Convair-580 aircraft, which was instrumented in collaboration with Environment and Climate Change Canada (ECCC). The NRC Convair-580 is a twin-engine, pressurized aircraft equipped with an array of in situ and remote sensing instruments, which allow measurements of the atmospheric state, aircraft state, aerosol, gas phase, cloud microphysics, and remote sensing parameters (Figure 1). These complementary measurements allow a multidimensional and comprehensive characterization of the flight environment. More details on the aircraft can be found in [24,25,26].

2.1.1. Aircraft Instruments

The exhaustive list of instruments integrated into NRC’s Convair-580 aircraft for the flight campaign is described in detail in [24]. In Table 1, we list the source instruments whose data was used as selected input features in the characterization approaches. The Table includes both the probes for in situ measurements and remote-sensing instruments.

Nevzorov

The Nevzorov sensor has constant-temperature hot-wire elements, which measure LWC and TWC in clouds. These elements differ in shape of their collectors: the LWC detector is mostly sensitive to liquid droplets and discriminates them from ice particles that instantly shatter when collide with the convex surface of the sensor, while the conical hollow collector of the TWC enables to catch both ice and liquid particles. The operational principles of both Nevzorov elements are described in [14,27]. The Nevzorov was mounted under the wing (Figure 1). In our analysis, we used LWC, TWC, and the ratio of the two, which is a good indicator of mixed phase conditions.

Rosemount Icing Detector

A Rosemount Icing Detector (RID) is mounted on an oscillating rod, which changes its natural vibration frequency as impinging ice droplets accrete and freeze on the sensor. The decrease in frequency is proportional to the increase in the accumulated mass and induces a voltage gain digitized to 1 mV precision. If the accreted ice exceeds a voltage threshold, the sensor’s tip is heated for a fixed time interval to remove the ice. Previous studies have shown that the instrument has insignificant response to ice crystals; hence, the RID can not only provide independent supercooled liquid water (SLW) detection, but also can be useful in segregating liquid and glaciated conditions. The detailed principle of RID operation and limitations are described in [28,29,30].

OAP and Scattering Probes

In this study, we used composite Particle Size Distributions (PSD) covering the full size-range of the hydrometeors (2 µm to 2 cm). Normally, these PSDs are composed of data from one scattering probe (e.g., CDP, FCDP, FSSP) one mid-size range Optical Array Probe (OAP), typically 2D-S, and one precipitation-size range OAP, typically HVPS-3, similar to PSDs presented in Nguyen et al. [31]. A detailed description of these common commercial probes can be found in Baumgardner et al. [32] and references therein. For images, a particle area equivalent diameter was used for sizing of particles that are captured fully within the detection array.

Rosemount Temperature Sensor

In this study, we use one of the 5 Rosemount Temperature sensors [33] installed on the aircraft. The accuracy of the static temperature measured by this sensor is known to be ±0.1 °C [33]. However, due to known sampling biases and spread, including the location near the fuselage, it is estimated to be ±1 °C.

UHSAS

The Ultra-High Sensitivity Aerosol Spectrometer [34] was integrated in a cabin rack and connected to a sampling isokinetic inlet located on top of the fuselage. The instrument recorded 1 Hz PSD data in the size range of 60 to 1000 nm. For our analysis, we used the total number concentration of aerosol counted by the instrument.

NAW

For remote sensing, we used the NRC’s Airborne W-band radar (NAW), which typically has three pointing direction options [31,35]. In this study, we used data from the side (horizontal) sensing direction. The most informative parameters were found to be Radar Reflectivity (Z), Differential Reflectivity (ZDR), and Linear Depolarization Ratio (LDR) (Table 1). We used the remote sensing data within the vicinity of the aircraft, at a distance of 250 m. This distance was selected to ensure the radar data are not affected by the close-range biases [31] while remaining within the same flight environment as experienced by the in situ probes (for more details, see Section 2.3.2). The assumption here is uniformity of cloud microstructure on a spatial scale of 250 m. This assumption works quite well for homogeneous cases. However, it does not work in the case of an inhomogeneous cloudy environment within a 250 m range. Such issues may occur on cloud edges; however, since here we use the side antenna which is leveled on the plane of the in situ instruments, the probability of lateral inhomogeneity is much lower.

Additional instruments, not listed here, were used for quality assessment of the characterization outcomes. Some of those include but not limited to Airborne Elastic Cloud Lidar (AECL, [36]), Cloud Particle Imager (CPI, [37]) and other OAP images, as well as GOES satellite channels and other data, which are available in the ICICLE project repository (see Data Availability Statement section).

2.2. ICICLE Flight Campaign

The data used in this study was collected during the In-Cloud Icing and Large-drop Experiment (ICICLE) flight campaign, led by the United States Federal Aviation Administration (FAA). It was conducted between January and March 2019, over the US Midwest and Western Great Lakes. An essential aim of this campaign was to study microphysical cloud properties in order to better understand how hazardous icing conditions (particularly freezing rain, freezing drizzle, and ice pellets) form and evolve over time [20,24]. In these flights, the observed cloud composition (i.e., PSD and phase) varied between ice particles of various sizes and shapes, freezing drizzle, and freezing rain, and supercooled cloud droplets. Other properties, such as ice accretion, LWC, and TWC, were also collected to characterize bulk cloud microphysical properties. Here, a subset of data is used for the selected two flight test cases with a rich variety of conditions. The following subsections and Figure 2, Figure 3 and Figure 4 provide an overview of those flight environments.

2.2.1. Flight 20, 23 February 2019

On the morning of 23 February 2019, a strong area of low pressure over the Texas Panhandle area was lifting north-eastward into Kansas, encroaching into the western side of an area of high pressure centered over New England. The advancement of the precipitation into southern Wisconsin led to the development of a deep layer dominated by ice crystals falling into a melting layer, then subsequently, into the near-surface sub-freezing layer to form classical freezing rain (FZRA) there during the morning hours [38]. These environments and the classical FZRA mechanism were the focus of this day’s mission. Vertical profiles were made at Janesville (KJVL) and Madison (KMSN), including missed approaches to document the environment essentially down to the surface. In addition, level flight legs were flown at select altitudes within the lower (3000–4000 ft) and upper sub-freezing layers (10,000 and 12,000 ft) between the two sites and in the vicinity. Later, the aircraft captured data at the higher altitudes to the west, near the Wisconsin-Iowa border, as well as during the return to the operations base at Rockford, IL, USA (KRFD; Figure 2, Figure 3 and Figure 4).

2.2.2. Flight 28, 5 March 2019

On the morning of 5 March 2019, an area of low pressure was present across the northern Great Lakes, with a developing cold front moving rapidly from Wisconsin into Michigan. West-northwesterly flow was present over this region, bringing cold temperatures and cold air advection across Lake Michigan. The mission was focused on the boundary layer; lake effect clouds over and immediately downwind of Lake Michigan, which had pockets of surface water temperatures on the order of 5 °C. The cellular stratocumulus lake-effect clouds had rather cold (−22 to −24 °C), liquid tops, as was evidenced by signals in the visible, long- and short-wave infrared satellite channels. There was a rapid conversion from liquid to snow from the cloud tops downward and cellular streaks of snow were present in reflectivity from the Grand Rapids, Michigan NEXRAD (not shown). The layer of higher, weaker, similarly cold, mixed-phase clouds was also evident over the southeastern part of Lake Michigan in satellite imagery (Figure 3).

The Convair-580 departed KRFD at ~11:54 UTC, climbed to ~8000 ft, then flew across the southern part of the lake, briefly sampling the upper cloud layer near 9500 ft, then descending through the boundary-layer stratocumulus to perform a missed approach at the South Haven airport at ~12:58 UTC. After performing a deep vertical profile while flying to the north along the eastern lake shore, the aircraft performed a missed approach at Muskegon at ~13:34 UTC. This was followed by a series of north-south and east-west runs across the eastern half of the lake and just inland, while cycling altitudes through the stratocumulus deck. Upon reaching the western-most point where sampling was possible, the aircraft climbed into clear air above the stratocumulus deck and returned to the base at KRFD, landing at ~16:28 UTC (Figure 2, Figure 3 and Figure 4).

2.3. Classification Approaches

In this study, we composed a list of selected variables, derived from in situ single-particle and bulk, as well as remote sensing measurements. Each of these variables on its own is not sufficient for characterization without the others; aerosol information is useful to cluster periods of clear-air and cloud-formation events; microphysical information makes use of hydrometeor features to characterize clouds but provides limited information about the physical processes. In many applications, such as remote sensing, knowledge of the microphysics is essential. One key parameter in remote sensing interpretation is the hydrometeor size distribution [31]. Here, we present clustering and other methods when the end goal is classification of the environment. For simplification of the discussion and the comparisons, we use the terms clustering and classification interchangeably. However, the clustering method becomes classification only after the subsequent step of manual attribution of class labels to the identified clusters.

2.3.1. Clustering Methods

Generally, clustering methods can be divided into two main groups: hierarchical and non-hierarchical clustering. Hierarchical clustering algorithms group data points and provide a natural graphical representation of data. The graphical representation resulting from hierarchical clustering is a dendrogram representing the nested grouping of data points and the similarity levels at which groupings change [39]. Non-hierarchical (also known as partitional) clustering methods yield a single partition of the data instead of a clustering structure. They perform the partition of data points into K clusters so that data in the same cluster are more similar to each other than those in different clusters. Partitional clustering methods can be categorized as either hard or fuzzy [22,40,41]. Hard (also known as crisp) clustering assigns each data point to a single cluster, while fuzzy clustering methods assign degrees of membership in several clusters to each data point. A fuzzy clustering can be transformed to a hard clustering by assigning each point to the cluster with the dominant degree of membership [22]. In the following subsections, we will present briefly the clustering methods used in our study.

K-Means

K-means clustering [42] is an iterative data-partitioning algorithm which assigns input data points to one of K clusters, where K is defined before the initialization of the algorithm. In this study, we use the k-means clustering function from MATLAB. The cluster centroid initialization is performed according to the K-means++ algorithm [43]. It is known that the k-means output can strongly depend on the initialization [44]. Various initial cluster centroid positions can eventually result in different k-means outcomes. To avoid such instability, we set the ‘Replicates’ attribute of the MATLAB (version R2022a) function to 100, repeating the clustering algorithm a hundred times, using new initial cluster centroid positions at each iteration.

K-means clustering serves as a primary tool for the multivariable characterization of cloud phase in this study. However, one must keep in mind that the k-means is an unsupervised learning algorithm that groups the input data into categories without labeling them. Instead of determining the cluster labels, k-means algorithm can detect internal structure in data and partition observations according to identified patterns. Unsupervised learning nature of this method has the advantages of simplicity and applicability to large datasets; however, the k-means algorithm does not come without drawbacks. The apparent shortcoming is the predetermined number of clusters K, the choice of which is generally not explicit. Furthermore, each entity is assigned to one cluster, which assumes well-defined class boundaries. Unfortunately, this is not the case in many real-life applications in which the boundaries between classes may overlap and the entities may belong to more than one cluster.

The optimal number of clusters in this method was established using the “elbow” location [45], when the algorithm is run several times with a range of K values (e.g., from 1 to 15) and for each K value the cost function is calculated. The inflection point of the resulting curve indicates the optimal number of clusters. It was found that the K value for two selected flights is 5. As mentioned above, the biggest drawback is the lack of labeled outputs. Therefore, a subsequent analysis of the clustering results is required to label the clusters. In consideration of this limitation, we implemented a manual inspection of the input parameters with individually assigned clusters and characterized the clusters according to their physical significance, followed by validation using complementary data.

Fuzzy Clustering

Fuzzy clustering methods have been proposed to address the limitations of traditional clustering methods. Particularly, fuzzy clustering methods have the advantage of being able to deal with the complex relationships between samples, which can help in reducing data noise. Fuzzy C-Means (FCM) is one of the most popular and frequently used fuzzy clustering methods [41]. The FCM method alternately optimizes membership degrees and centroids until the best clusters are found. In the FCM method, the fuzzy assignment of samples to each cluster is essentially based on the relative distance between one sample and all cluster centroids. The FCM method is a heuristic that aims to minimize the objective function by optimizing the membership degree and the centroids [41]. The FCM method has been proven to provide better solutions in many machine learning and pattern recognition applications in comparison to hard clustering methods, such as k-means. Furthermore, the Fuzzy J-means (FJM) method introduced by Belacel et al. [22] uses all possible centroids-to-pattern re-locations in order to construct move-defined neighborhoods. The membership values and centroids are calculated in the same way as in the FCM algorithm. Comparative results previously published indicate that the FJM method outperforms the FCM method for both simulated and real data sets [22,46].

To avoid the limitation of the predetermined clustering number, we have tried two approaches: The first one is based on empirical experiments, as in the case of k-means. The second approach applies the clustering based on hybrid metaheuristic SA-IAFSA algorithm, to obtain the number of clusters instead of defining it a priori.

Decision Tree Dendrogram Classification Method

In this paper, we validate the k-means algorithm outcomes using complementary data, e.g., AECL, NAX radar, and imagery observations and compare it to other classification approaches. One of these classification approaches was developed to characterize the flight conditions during the ICICLE campaign using aircraft in situ measurements with 1 s temporal resolution [20]. This method applies a set of conditional statements on water content and temperature data to partition observations into five primary categories: “liquid T < 0 °C”, “liquid T > 0 °C”, “mixed phase”, “glaciated”, and “clear air”. As such, classification is based on a flowchart, we will further refer to it by its common name, a Decision Tree (DT) classification [47,48]. The DT methodology in ICICLE is depicted in Figure 5. LWC and IWC data for the DT classification are extracted from Nevzorov sensors and the temperature is taken from static temperature measurements of one of the aircraft Rosemount temperature sensors. The ‘liquid T > 0 °C’ category may generally contain a portion of mixed phase clouds with melting ice. Here, only flight F20 had such flight segments of positive temperatures (Figure 4e); therefore we can expect a fraction of this category in DT to include mixed phase.

2.3.2. Data Preparation

Selecting and preparing the input parameters for multivariable characterization of atmospheric environment is a challenging task. In recent literature, a wide variety of in situ and remote sensing measurements have been used for similar classification objectives (see Section 1). There are numerous studies on input feature set selection in machine learning problems (e.g., [49,50,51]). Unfortunately, there is no universal procedure for input parameters selection that could be applied to most classification problems. Here, we use the assumption that coincident and collocated data from both in situ and remote-sensing instruments in the vicinity of the aircraft, which makes the atmospheric environment analysis, along the flight path, more coherent [31]. In addition, aerosol was included in the analysis as it is often involved in cloud processes e.g., as condensation and ice nuclei [52] while the static temperature plays a major role in cloud formation and phase determination, and it is often used in IWC parameterization (e.g., [53,54]).

Atmospheric data collected in flight with complex instrumentation, normally, has to pass numerous steps of data processing and corrections to reduce bias in the analysis. For example, OAP probes have corrections for out-of-focus particles, inter-arrival time correction for shattered particles, numerous options of sizing projected images of particles (e.g., [55,56]). Bulk probe data corrections include but not limited to reference-signal corrections, dead-time correction during deicing, collection efficiency corrections, latent heat release corrections [32,57]. Temperature can be corrected with a combination of readings from multiple sources. Airborne aerosol corrections include sampling efficiency, refractive index corrections, sizing corrections. Radar signal can include pointing vector correction, ground signal clutter correction, close-range bias correction, noise subtraction. Here, we show that with minimal to no data treatment, it is possible to achieve a valid approximation of the flight conditions. The steps of data preparation we have taken are listed below.

First, data from all the different probes and instruments were temporally aligned with matching temporal resolution of 1 s in the overlapping flight segments. The variables were then tested for interdependence and those with high interdependence across numerous flights were removed from the input set to enhance variable complementarity and eliminate redundant features which presumably do not provide much additional information. Next, we selected the optimal input parameters, which provide the most valuable information to determine the phase of the detected clouds. From the atmospheric physics perspective, most commonly measured parameters include size, shape, and concentration of hydrometeors; bulk water content; atmospheric state; aerosol; and radar measurements. Hence, for clustering objective, bulk parameters such as LWC and TWC from Nevzorov sensors, static temperature, FSSP and OAP droplet concentration, mean volume diameter (MVD), W-band radar reflectivity, LDR and ZDR, concentration of aerosols, and RID voltage signal were taken as inputs (Table 1). A correlation matrix of the selected input variables is depicted in Figure 6. Red color indicates higher correlation, while blue color indicates higher anti-correlation.

The next step in data preparation is to filter the input set, which can have biased values remaining as a result of instrument noise, icing, or malfunctions. For example, values of LWC and TWC from Nevzorov detectors were removed if LWC was larger than TWC or one of the variables was less than the sensitivity of the Nevzorov sensor. Such unphysical values of data may occur because LWC and TWC are measured independently by different detectors, therefore their values may not always obey the physics of LWC ≤ TWC, even if properly calibrated. LWC and TWC are calculated by subtracting measured power and a baseline corresponding to clear air power [14], when the latter might have some error that in turn could lead to negative values of LWC or TWC.

The hydrometeor number concentrations from a scattering probe and OAP (Figure 4c,d) are merged into a composite concentration, ranging from 2 µm to 2 cm in particle diameter. Concentration of the full-size spectrum were used for the derivation of the mean volume diameter (Equation (1)),

M V D = {(\frac{\sum N_{i} D_{i}^{3}}{\sum N_{i}})}^{\frac{1}{3}}

(1)

where N_i and D_i are the number concentration and middle point of the i-th size bin, respectively.

The size distribution of the hydrometeors has a direct impact on the W-band radar variables, recorded using the side-pointing antenna (see Section 2.1.1). In order to avoid unreliable and noisy signal, all three radar variables (i.e., Z, ZDR, LDR) were masked using a SNR threshold of 0 dB. Differential reflectivity exhibited close range biases within 1-km distance from the aircraft (not shown) and was calibrated using flight segments in small liquid drops, where ZDR is approximately 0 dB.

Further steps in data preparation addressed the issue of missing values. Keeping missing input values (e.g., NaN, Inf) induces large voids in each input parameters dataset, which will remain unclassified, a scenario that would lead to a larger data void in the clustering product because of the overlap of multiple input datasets. A common practice is to fill the gaps in observations with the mean of the non-missing values or some predefined constants. However, such data manipulation does not work well and in most cases will likely result in the ‘cloud’ class inflation during the time intervals when the instrument was legitimately not in operation, or the signal was below its sensitivity threshold and hence could not be detected. Rather than ignoring missing data in the input dataset or replacing it with mean values, we impute the missing values for each input variable, independently, according to the following procedure: LWC, TWC, aerosol concentration, MVD, and number particle concentration missing values were replaced with 0; measurement gaps from the RID were equated to the nearest non-missing value, temperature was subjected to linear interpolation of neighboring non-missing values; W-band radar reflectivity, ZDR and LDR voids were substituted with −35 dBZ, −5 dB and −40 dB, respectively.

The input variables (Table 1) are not commensurate and had to be normalized to the range 0–1 to avoid biases in the classification due to the span in orders of magnitudes between absolute values of different variables.

3. Results

In this section, we present two independent flight-case analyses as well as an intercomparison between several different classification approaches, such as k-means, DT, and fuzzy clustering. 2D-S imagery and lidar measurements were used for manual validation of the classification results.

3.1. Variety of Conditions in Flight

Two flights with different conditions of interest were selected for multivariable atmospheric environment characterization: flight F20 on 23 February 2019 and flight F28 on 5 March 2019 (see Section 2.2). Bernstein et al. [20] summarized the environmental conditions of interest, highlighting cases with high IWC, supercooled LWC, freezing drizzle and freezing rain, small drops, and mixed phase. Figure 7 depicts the percentages of occurrence of each of the five primary categories using the DT approach. Flight F20 is characterized by the significant IWC amount detected above the layer of freezing rain, shown as a dominant portion of glaciated conditions throughout the flight [38]. Flight F28 represents a case with a wide variety of sizes and shapes and the presence of supercooled liquid water below −20 °C.

3.2. Flight 20

The results of our analysis based on the probability distribution functions (PDF) and assigned clusters suggest that the blue cluster represents a clear-air environment as it has low signatures in the Nevzorov sensors and missing values in radar measurements. Moreover, the absence of imagery data from the 2D-S probe between 15:33 and 15:37 UTC supports the clear-air labeling in this time interval. Concentration of larger aerosol may be detected in this period by the optical probe and the aerosol instruments. Glaciated phase (green color) is notable for the low LWC/TWC ratio, RID signal below 3 V, temperatures below 0 °C, smaller concentration and larger MVD of the particles, and ZDR shifted to positive values. Mixed phase identified here by the algorithm (purple color) is characterized by LWC/TWC in the range 0 to 0.5, temperatures below and above the freezing point, and wider range of RID and number concentration. Interestingly, in many studies the choice of an LWC/TWC phase fraction threshold is dictated by the type and accuracy of the instrumentation, such as 0.1 and 0.9 [14]. There are two purely liquid phases observed during flight F20 corresponding to the red and yellow clusters. The main difference between these two liquid conditions is seen in number concentration, temperature and RID measurements. Yellow cluster is characterized by temperatures below 0 °C and oscillating RID signal indicating supercooled liquid water freezing and accretion, whereas red cluster implies liquid above 0 °C (Figure 8). It is worth noting that some studies show that pure liquid droplets larger than 100 µm may be misclassified by Nevzorov as mixed phase [58].

We have selected a couple of flight periods to examine cases with more scrutiny. Figure 9 shows the time series plots of the aircraft pressure-altitude and six input variables between 13:41 and 13:57 UTC during flight F20. The NRC Convair-580 aircraft climbed through four different cloud conditions during this time interval, from freezing rain layer (A), through a warm nose (B), into mixed phase and more ice-laden layers above (C, D). The K-means algorithm separated the supercooled liquid state into a yellow cluster, as seen in flight section ‘A’ in Figure 9. This separation is supported by the increasing RID reading and sub-zero temperature. Next, in period B (dominated by red cluster) we observe positive temperatures and mostly a small-droplet environment, also seen in the 2D-S imagery. In A and B, the LDR value is near the detection limit of −30 dB (not shown). An LDR reading in raining spherical drops is likely a leaked co-pol signal measured in cross-pol channel. In the subsequent transition from B to C, we see a detectable signal increase as large ice appears. The transition between mixed and glaciated environments (C and D in Figure 9) is characterized primarily by changes in RID readings, temperature, and LWC/TWC ratio while W-band radar LDR value in ice remains above the values seen in section A, B.

3.3. Flight 28

Figure 10 depicts the PDFs of each input variable recorded in flight F28 where color coding denotes the number of clusters assigned by the k-means algorithm. The low values of liquid and total water content, hydrometeor concentration and size, as well as distinct detached peaks in the W-band radar reflectivity, ZDR, and LDR indicate that the blue cluster signifies clear-air flight segments. LWC/TWC ratios differentiate glaciated phase from liquid and mixed conditions. Lower concentration and higher MVD suggest that the purple cluster, likely, represents an ice-dominated environment. Positive radar dBZ values and mean volume diameter larger than 100 μm indicates the presence of large ice aggregates within the glaciated class. It is interesting to note that the concentration of these samples has a multimodal distribution, implying different types of ice particles. Time series of concentration and size in the purple cluster and 2D-S images confirm this assumption, showing spheroids, columnar crystals, and aggregates of ice corresponding to the time intervals of glaciated conditions (Figure 11, Figure 12 and Figure 13). RID response greater than 4 V indicates the presence of supercooled liquid water in two of the clusters, which we refer to as “Mixed 2” and “liquid” (Figure 10). When liquid water is present, the increase in RID signal in the time series is notable (Figure 11), whereas flat or decreasing RID signal correspond to the absence of liquid water or a LWC below the sensitivity threshold of the detector at temperatures consistently below −10 °C throughout this flight. “Mixed 1” class identified is distinct from other classes for its lower values in the LWC and TWC sensors and in the RID detector and can be interpreted as mixed phase with greater portion of ice. Therefore, green and red clusters represent mixed phase clouds with different proportions of liquid and ice particles. Green cluster (Mixed 1) has also lower hydrometeor concentration compared to the red cluster (Mixed 2), indicating that the green cluster could be a transition phase between clear air and mixed phase.

After analyzing the individual variables, identifying the five clusters assigned by the k-means algorithm using physical interpretation (Figure 10 and Figure 11), and validating the labels with complementary information, we can now describe the overall flight conditions. Figure 12 shows the NRC Convair-580 aircraft pressure-altitude flight profile on 5 March 2019 during flight F28. The color-coded parts of the trajectory denote five clusters assigned by the k-means algorithm and further manually labeled. On ascent, the aircraft passed a clear air segment of the flight and made transition into a mixed phase, at 12:22 UTC. It entered the narrow interval of liquid prevalent conditions at cloud top at 12:33 UTC, followed by a descent into a mixed phase environment and a missed-approach maneuver in the glaciated cloud phase at 12:41 UTC. The NRC Convair-580 then ascended from the glaciated to mixed phase environment and subsequently from mixed phase to liquid layers and ultimately to the clear air environment. The almost symmetrical occurrence of clusters at different altitude levels, over ~1 h segment, over the east coast of Lake Michigan (Figure 2 and Figure 3), suggest a persistent cloud layer and composition. After these transitions, in the interval 13:10–13:22 UTC, we observe a few patchy periods of liquid water between clear air segments. It is discernible that the glaciated layers were present at the lower altitude, mostly below 850 hPa, whereas the mixed phase and the liquid conditions were observed near cloud tops during the porpoising maneuvers of flight F28 in the interval 14:00–15:18 UTC.

In a deeper analysis of the first missed approach segment, we notice a layered macro structure. Figure 13 shows the aircraft pressure-altitude flight profile, its corresponding W-band reflectivity, color-coded according to the identified clusters, and complementary lidar signal and depolarization ratio plots. The lidar depolarization ratio used here as an auxiliary qualitative tool for validation, since it was not calibrated for absolute values. In section A of this flight segment, identified as liquid by the k-means algorithm, we observe mostly 10 µm droplets, depicted as single dark pixels recorded by the 2D-S probe. Lidar’s low depolarization ratio is another indicator of the liquid phase observed at 3 km altitude. Next, in section B, radar reflectivity is increased, and the segment is classified mostly as “mixed 1”. Lidar plots indicate a transition between layers on descent, where a larger variety of ice particles is detected by the 2D-S probe. Lidar depolarization ratio is medium to high when the aircraft enters section C of the flight, classified as “mixed 2” by the k-means algorithm. The 2D-S probe images show 10 µm droplets together with larger 100 µm hydrometeors. At the warmer temperature, in section D (identified as “glaciated”), radar reflectivity is further increased in correlation with the 2D-S imagery of larger ice aggregates. The RID constant reading (Figure 11) indicates limited to no additional accretion and therefore we conclude prevailing ice composition.

Such validation analyses show that the k-means algorithm cluster assignment corresponds well to the physical cloud structure observed with the complementary tools in flight F28. From the clustering results, we can infer some valuable information on the characteristic range of instruments’ values corresponding to different cloud phases. For example, the glaciated phase in this flight has an LWC/TWC ratio below 0.25, large MVD in the range of 100–1000 μm, fairly flat (non-accreting) RID signal, and enhanced aerosol concentration about 10³ cm⁻³. As opposed to the glaciated environment, liquid conditions in flight F28 have lower MVD and radar reflectivity values, larger hydrometeor concentration and an LWC/TWC ratio greater than 0.8. Undoubtedly, future studies will need to combine observations from numerous flights with various conditions to obtain universal thresholds for each cloud category.

3.4. Classification Intercomparison

In the previous section, we focused on k-means classification however, different approaches for atmospheric classification exist for applications along the flight track (see Section 1). In this section, we compare the results from four such classification techniques using data from flight F28, namely: k-means, DT, fuzzy clustering, and image sphericity derived classification produced by ECCC and distributed with their dataset (see Data Availability Statement section). The latter method is based on particle size distributions for Liquid and Ice categories, classified using area equivalent diameter and maximal diameter for any imaged particle, a mixed phase is classified when both categories are present and clear-air is classified when none of the categories are present. In fuzzy clustering method for this flight, the optimal number of 4 clusters was determined through the SA-IAFSA algorithm (see Section 2.3) after the data preparation procedure. Figure 14a–d illustrates the intercomparison between time series of the four selected methods. The fractional occurrence of each cluster for a selected period is shown in Figure 14e.

Overall, in this flight, while fuzzy clustering captures well the transition between clear air, mixed and glaciated environments, it fails to capture the pure liquid phase with the same accuracy as k-means or DT algorithms. On the other hand, k-means clustering, and DT classification also have noticeable distinctions between them. For instance, one can notice the difference in clear-air classification, around 13:16 UTC where the k-means algorithm underestimated the clear-air class occurrence in comparison to DT, instead classifying some of the segments as mixed phase. This can be explained by the inclusion of the radar measurements in the input parameters set, an increase in LDR signal from −40 dB to −25 dB was found for the samples of mixed phase while other input parameters showed no significant difference between clear-air and mixed-phase around 13:16 UTC (Figure 11).

One other distinction of the k-means output in comparison to the DT classification is the segregation of two mixed-phase classes from glaciated conditions. To examine these discrepancies between the two methods with higher scrutiny, we selected a test case between 13:01 and 13:05 UTC in ascent above 800 hPa pressure-altitude level (Figure 15). We show the time-series of the NRC Convair-580 aircraft pressure-altitude flight profile color-coded with the classes assigned by the k-means and DT algorithms (Figure 15a,b) temporally aligned with lidar depolarization ratio (Figure 15c), and selected input variables time series (Figure 15d,e). Comparing mixed phase 1 (green cluster) in k-means with the corresponding glaciated (purple cluster) classified in DT in the period where the aircraft ascends from mixed-phase 2 (red cluster), we notice a significant increase in the complementary lidar depolarization information. The highly attenuated lidar depolarization ratio becomes less attenuated with medium to high depolarization ratio values. Another major change identified during the ascent is the decrease in the bigger diameter hydrometeors (>1 mm) (Figure 15d) and stronger fluctuations in their concentration. Other variations were observed also in a decrease in radar reflectivity in comparison to the glaciated phase earlier in the flight and a significant decrease in the aerosol concentration (Figure 15e).

The sphericity derived classification is in good agreement with the k-means clustering for clear-air and glaciated periods. However, the pure liquid phase is almost not present in the former method, instead it is classified as mixed phase (Figure 14e). This is due to the lowest manual threshold set to classify as ‘mixed’ cases when both ice and liquid classes contain concentration larger than zero. The challenge of setting such thresholds is addressed in depth by Korolev et al. [59] and it is discussed in the next section.

4. Discussion and Practical Applications

In cloud phase characterization, the scientific community is currently relying on a variety of definitions. Korolev et al. [59], in their review paper, stated that there is no consensus regarding what defines “liquid”, “mixed phase”, and “ice” conditions. As an example, they illustrated the complexity in defining extreme ratios between hydrometeor phases, spatial clustering of phases (i.e., two types: genuine and conditional volumetric mixed-phase), and longevity of the phases. They listed the known instrumental limitations (both in situ and remote) and concluded that in the future, only the synergy of in situ measurements with different platform remote sensing will lead to an advanced understanding of mixed-phase clouds across the globe, even though the mixed-phase definition might change.

Nguyen et al. [31] have implemented this approach and demonstrated its efficiency in unprecedented, collocated, airborne in situ and radar measurements, including in characterization of mixed phase and riming events in the Arctic clouds and showed agreement with models. In the study presented here, we used the same sampling methodology, focusing our effort on the vicinity of the aircraft. While simultaneous collection of data using multiple measurement techniques should potentially improve the characterization of clouds, the subjectivity inherent in interpreting results from multiple techniques also introduces the potential to misinterpret results and/or introduce bias in order to find agreement with predisposed opinions [59]. In order to minimize the aforementioned opinions, we tested a number of taxonomy techniques, which resulted in a split of the mixed phase category into Mixed1 and Mixed2, as predicted by Korolev et al. [59]. It should be noted that this analysis is based on a limited number of flight hours and while it requires further scrutiny, the advantage of the different capability of lidar and radar to detect various sizes of liquid droplets and ice crystals and the detection capability of in situ instruments in a convoluted product is profound and can provide a more consistent definition of various types of mixed-phases than the definitions derived from each individual instrument separately.

In some numerical models, the cloud phase is still separated by simplistic thresholds assumptions, e.g., temperature threshold, size threshold, or concentration threshold, into liquid and ice. However, the reality is much more convoluted with the presence of counterintuitive, small spheroidal ice, large rain drops, supercooled liquid (down to −40 °C) and high ice concentration (e.g., after ice multiplication events). Misclassified phase will essentially propagate into errors in the derived properties like the optical depth of the cloud as well as cloud longevity, timescale and nature of cloud chemistry taking place in a certain cloud phase.

Finally, it is worth noting that due to its multi-instrument nature, the presented classification approach produces useful output even in the absence of exceedingly sophisticated quality-control or corrections of input parameters that would normally be required for quantitative analysis of cloud microphysics measurements.

The classification of events is a common problem crossing many fields. The outcome of this study has an additional practical application for the classification of icing environments using an impartial approach. A development of cloud masking method based on the classified portions of the flight would enable determination of cloud boundaries and cloud types, subsequently paving the path for characterization of hazardous conditions via multiplatform analyses, i.e., airborne, spaceborne, ground, and models.

5. Conclusions

In this work, we have presented a multivariable characterization of the atmospheric environment with data collected in flight. The data was collected onboard the NRC Convair-580 aircraft in two selected flights during the ICICLE campaign in 2019. We implemented a multivariable classification approach to facilitate segregation of cloud analyses into different atmospheric state categories as well as to develop a tool for operational decision onboard the aircraft. The variety of sampling technologies installed on our aircraft (e.g., optical, electrical, vibrational, remote sensing, thermal) enabled us to validate the results of a common k-means clustering characterization approach with complementary independent measurements e.g., imagery and lidar. This new capability was compared with other advanced classification methods. The number of identified classes agrees with the lowest number of expected categories of the physical atmospheric content in these flight environments (e.g., clear-air, liquid, glaciated, mixed-phase/multimodal). An additional cluster was identified, indicating a split of the mixed-phase category. The pristine classes of clear-air and glaciated clouds have shown an excellent agreement with the ECCC phase separated datasets, which passed a manual expert evaluation of the phase.

The time demanding processing of individual datasets collected by each of the instruments and the subsequent quality assessment often requires high scrutiny. Our study shows that untreated and uncorrected data can be used in some methods to produce an adequate representation of the flight environment, validated with complementary tools, therefore potentially saving time and cost of processing.

This single-flight, individual, classification approach can be expanded, with more computing power, to include a greater number of flights for an offline study of sampling features across a higher variety of environments. The findings of this study have enabled the identification of the minimal number of key input parameters for objective classification of the environment. Those will be used in a development of a Convolutional Neural Network (CNN) approach for real-time implementation (e.g., [60]) in flight. However, the selection of the best methodology to use still strongly depends on the backend user application of the classification results.

Icing studies onboard NRC’s Convair-580 have been ongoing for decades (e.g., [4,19,61]) and will continue in the foreseen future. The methodology presented here has the potential to become an operational tool for the NRC aircraft fleet, which is heavily used in international atmospheric science missions and can be transferred to other aircraft that have a similar suite of common instruments onboard.

Combining this approach with ground, modeling, and satellite datasets in this project will help to develop more accurate modeling tools for weather forecasting and ground detection of icing hazard.

Author Contributions

Conceptualization, L.N. and M.W.; methodology, A.S., L.N. and N.B. (Nabil Belacel); formal analysis, A.S., N.B. (Natalia Bliankinshtein), C.N., N.B. (Nabil Belacel), B.B. and M.W.; funding acquisition and project administration, S.D.; writing—original draft preparation, A.S., L.N. and N.B. (Nabil Belacel); writing—review and editing, all coauthors; visualization, A.S.; supervision, Y.H., L.N. and M.W. All authors have read and agreed to the published version of the manuscript.

Funding

The flight portion of this research and data collection was funded by the US Federal Aviation Administration (FAA): Stephanie DiVito (Terminal Area Icing Weather Information for NextGen (TAIWIN) program lead) and Danny Sims (In-Flight Icing (IFI) program lead). The analysis portion of the study was funded by the MIDAIR (Multisensor-data Icing Detection using Airborne In-situ and Remote-sensing) project under National Research Council of Canada APDC (Aeronautical Product Development and Certification) program.

Data Availability Statement

Portions of the data used in this study were downloaded from the ICICLE Long Term Data Archive at UCAR/NCAR—Earth Observing Laboratory [EOL], 2019. https://data.eol.ucar.edu/master_lists/generated/icicle/ (accessed on 1 January 2022). For example: Sphericity derived classification was derived from PSDs of ice and liquid provided in Korolev, A., Heckman, I. 2020. NRC Convair 580 Composite Liquid and Ice Size Distribution Data. Version 1.0. UCAR/NCAR—Earth Observing Laboratory. https://doi.org/10.26023/7BZP-NGPY-WG0B (accessed on 1 January 2022). Aerosol data were taken from Nichman, L., Corbin, J., Bala, K. 2020. NRC Convair 580 Aerosol Data. Version 1.0. UCAR/NCAR—Earth Observing Laboratory. https://doi.org/10.26023/K15M-XS6N-5B0E (accessed on 1 January 2022). Lidar data taken from: Baibakov, K., Wolde, M. 2020. NRC Convair 580 AECL Lidar Data. Version 1.0. UCAR/NCAR—Earth Observing Laboratory. https://doi.org/10.26023/X3JA-PZSK-ME0F (accessed on 1 January 2022). W-band radar data taken from: Nguyen, C., Wolde, M. 2020. NRC Convair 580 Airborne W-Band (NAW) Radar Data and Imagery. Version 1.0. UCAR/NCAR—Earth Observing Laboratory. https://doi.org/10.26023/PBVG-0S4X-3D05 (accessed on 1 January 2022). 2D-S images can be downloaded from: Korolev, A., Heckman, I. 2020. NRC Convair 580 2DS Particle Images. Version 1.0. UCAR/NCAR—Earth Observing Laboratory. https://doi.org/10.26023/DREN-VTHA-0N0E (accessed on 1 January 2022). Further details and the raw data can be requested from the corresponding author.

Acknowledgments

Many people from NRC, ECCC, FAA, NCAR and LEA contributed to the successful completion of the flight campaign. We thank the engineering, operation, and managerial staff from NRC (especially Eric Roux, Daniel MacDonald, Carl Jones, Steve Ingram, and Yawo-Daniel Hoyi) who made the project possible by working long hours during instrument integration and field operations. We also acknowledge the ECCC team Alexei Korolev, Mike Harwood, Jason Iwachow, Ivan Heckman, and Igor Grishin for the integration of the ECCC instrumentation onboard, its operation and maintenance, data collection and postflight data processing. The authors would also like to acknowledge NCAR staff for operations and forecasting support during the campaign and Stephanie DiVito (FAA) and Danny Sims (FAA) for leading the flight campaign and securing funds under the TAIWIN and IFI programs. The funders contributed to the writing of the manuscript.

Conflicts of Interest

The authors declare no conflict of interest.

References

Jayde, M.K.; Blickensderfer, B.; Guinn, T.; Kleber, J.L. The Effects of Display Type, Weather Type, and Pilot Experience on Pilot Interpretation of Weather Products. Atmosphere 2021, 12, 143. [Google Scholar] [CrossRef]
Williams, E.R.; Donovan, M.F.; Smalley, D.J.; Hallowell, R.G.; Griffin, E.; Hood, K.T.; Bennett, B.J. The 2013 Buffalo Area Icing and Radar Study; Project Report ATC-419; MIT Lincoln Laboratory: Lexington, MA, USA, 2015. [Google Scholar]
Williams, E.R.; Smalley, D.J.; Donovan, M.F. Recommendations for ICICLE and Other Future Icing Hazard Campaigns Based on Experience with In Situ Validation and NEXRAD Dual Polarimetric Radar Assessments of Icing Conditions; Document 43PM-Wx-0182; MIT Lincoln Laboratory: Lexington, MA, USA, 2018. [Google Scholar]
Williams, E.R.; Donovan, M.F.; Smalley, D.J.; Kurdzo, J.M.; Bennett, B.J. The 2017 Buffalo Area Icing and Radar Study (BAIRS II); Report ATC-447; MIT Lincoln Laboratory: Lexington, MA, USA, 2020. [Google Scholar]
DiVito, S.; Riley, J.; Landolt, S.D.; Bernestein, B.; Green, S.; Bracken, J. An Update of the Federal Aviation Administration Terminal Area Icing Weather Information for NextGen (TAIWIN) Project. In Proceedings of the 21st Conference on Aviation, Range, and Aerospace Meteorology, American Meteorological Society 101st Annual Meeting, 10.1, Virtual, 14 January 2021; Available online: https://ams.confex.com/ams/101ANNUAL/meetingapp.cgi/Paper/384097 (accessed on 1 January 2022).
Öktem, R.; Romps, D.M. Prediction for cloud spacing confirmed using stereo cameras. J. Atmos. Sci. 2021, 78, 3717–3725. [Google Scholar] [CrossRef]
Cesana, G.; Chepfer, H.; Winker, D.; Getzewich, B.; Cai, X.; Jourdan, O.; Mioche, G.; Okamoto, H.; Hagihara, Y.; Noel, V.; et al. Using in situ airborne measurements to evaluate three cloud phase products derived from CALIPSO. J. Geophys. Res. Atmos. 2016, 121, 5788–5808. [Google Scholar] [CrossRef] [Green Version]
Hu, J.; Rosenfeld, D.; Zhu, Y.; Lu, X.; Carlin, J. Multi-channel Imager Algorithm (MIA): A novel cloud-top phase classification algorithm. Atmos. Res. 2021, 261, 105767. [Google Scholar] [CrossRef]
Smith, W.L., Jr.; Minnis, P.; Fleeger, C.; Spangenberg, D.; Palikonda, R.; Nguyen, L. Determining the Flight Icing Threat to Aircraft with Single-Layer Cloud Parameters Derived from Operational Satellite Data. J. Appl. Meteorol. Climatol. 2012, 51, 1794–1810. [Google Scholar] [CrossRef]
Doulgeris, K.-M.; Komppula, M.; Romakkaniemi, S.; Hyvärinen, A.-P.; Kerminen, V.-M.; Brus, D. In situ cloud ground-based measurements in the Finnish sub-Arctic: Intercomparison of three cloud spectrometer setups. Atmos. Meas. Tech. 2020, 13, 5129–5147. [Google Scholar] [CrossRef]
McFarquhar, G.M.; Baumgardner, D.; Bansemer, A.; Abel, S.J.; Crosier, J.; French, J.; Rosenberg, P.; Korolev, A.V.; Schwarzoenboeck, A.; Leroy, D.; et al. Processing of Ice Cloud in Situ Data Collected by Bulk Water, Scattering, and Imaging Probes: Fundamentals, Uncertainties, and Efforts toward Consistency. Meteorol. Monogr. 2017, 58, 11.1–11.33. [Google Scholar] [CrossRef] [Green Version]
Straka, J.M.; Zrnić, D.S.; Ryzhkov, A.V. Bulk Hydrometeor Classification and Quantification Using Polarimetric Radar Data: Synthesis of Relations. J. Appl. Meteorol. 2000, 39, 1341–1372. [Google Scholar] [CrossRef]
Liu, H.; Chandrasekar, V. Classification of Hydrometeors Based on Polarimetric Radar Measurements: Development of Fuzzy Logic and Neuro-Fuzzy Systems, and In Situ Verification. J. Atmos. Ocean. Technol. 2000, 17, 140–164. [Google Scholar] [CrossRef]
Korolev, A.V.; Isaac, G.A.; Cober, S.G.; Strapp, J.W.; Hallett, J. Microphysical characterization of mixed-phase clouds. Q. J. R. Meteorol. Soc. 2003, 129, 39–65. [Google Scholar] [CrossRef]
D’Alessandro, J.J.; McFarquhar, G.M.; Wu, W.; Stith, J.L.; Jensen, J.B.; Rauber, R.M. Characterizing the occurrence and spatial heterogeneity of liquid, ice and mixed phase low-level clouds over the Southern Ocean using in situ observations acquired during SOCRATES. J. Geophys. Res. Atmos. 2021, 126, 11. [Google Scholar] [CrossRef]
Holroyd, E.W. Some Techniques and Uses of 2D-C Habit Classification Software for Snow Particles. J. Atmos. Ocean. Technol. 1987, 4, 498–511. [Google Scholar] [CrossRef]
Korolev, A.; Sussman, B. A Technique for Habit Classification of Cloud Particles. J. Atmos. Ocean. Technol. 2000, 17, 1048–1057. [Google Scholar] [CrossRef]
Korolev, A. Reconstruction of the Sizes of Spherical Particles from Their Shadow Images. Part I: Theoretical Considerations. J. Atmos. Ocean. Technol. 2007, 24, 376–389. [Google Scholar] [CrossRef]
Cober, S.G.; Isaac, G.A.; Korolev, A.V.; Strapp, J.W. Assessing Cloud-Phase Conditions. J. Appl. Meteorol. 2001, 40, 1967–1983. [Google Scholar] [CrossRef]
Bernstein, B.; DiVito, S.; Riley, J.T.; Landolt, S.; Sims, D.; Haggerty, J.; Korolev, A.; Heckman, I.; Wolde, M.; Nichman, L.; et al. Overview of NRC Convair-580 In situ Flight Observations Made during ICICLE. In Proceedings of the 21st Conference on Aviation, Range, and Aerospace Meteorology, American Meteorological Society 101st Annual Meeting 10.4A, Virtual, 14 January 2021. [Google Scholar]
Dilmi, M.D.; Mallet, C.; Barthes, L.; Chazottes, A. Data-driven clustering of rain events: Microphysics information derived from macro-scale observations. Atmos. Meas. Tech. 2017, 10, 1557–1574. [Google Scholar] [CrossRef] [Green Version]
Belacel, N.; Hansen, P.; Mladenovic, N. Fuzzy J-Means: A New Heuristic for Fuzzy Clustering. Pattern Recognit. 2002, 35, 2193–2200. [Google Scholar] [CrossRef]
He, S.; Belacel, N.; Chan, A.; Hamam, H.; Bouslimani, Y. A hybrid artificial fish swarm simulated annealing optimization algorithm for automatic identification of clusters. Int. J. Inf. Tech. Decis. 2016, 15, 949–974. [Google Scholar] [CrossRef]
Bernstein, B.; DiVito, S.; Riley, J.T.; Landolt, S.; Haggerty, J.; Thompson, G.; Adriaansen, D.; Serke, D.; Kessinger, C.; Tessendorf, S.; et al. The In-Cloud Icing and Large-Drop Experiment Science and Operations Plans [DOT/FAA/TC-21/29]; Atlantic City International Airport, Federal Aviation Administration: Egg Harbor Township, NJ, USA, 2021. [Google Scholar] [CrossRef]
Wolde, M.; Nguyen, C.; Korolev, A.; Bastian, M. Characterization of the Pilot X-band radar responses to the HIWC environment during the Cayenne HAIC-HIWC 2015 Campaign. In Proceedings of the 8th AIAA Atmospheric and Space Environments Conference, AIAA 2016-4201, Washington, DC, USA, 13–17 June 2016; Available online: https://arc.aiaa.org/doi/10.2514/6.2016-4201 (accessed on 1 January 2022).
Nichman, L.; Bliankinshtein, N.; Wolde, M.; Davison, C.; Fuleki, D.; Orchard, D.; Nolan, C.; Keating, D.; Jackson, D.; Olvhamma, S.; et al. Advanced techniques for airborne measurements tested onboard NRC’s Convair-580 Aircraft. In Proceedings of the International Conference on Clouds and Precipitation (ICCP), Pune, India, 2–6 August 2021; Available online: https://tinyurl.com/iccpconvair (accessed on 1 January 2022).
Korolev, A.V.; Strapp, J.W.; Isaac, G.A.; Nevzorov, A.N. The Nevzorov Airborne Hot-Wire LWC-TWC Probe: Principle of Operation and Performance Characteristics. J. Atmos. Ocean. Technol. 1998, 15, 1495–1510. [Google Scholar] [CrossRef]
Baumgardner, D.; Rodi, A. Laboratory and wind tunnel evaluations of the Rosemount Icing Detector. J. Atmos. Ocean. Technol. 1989, 6, 971–979. [Google Scholar] [CrossRef]
Heymsfield, A.J.; Miloshevich, L.M. Evaluation of Liquid Water Measuring Instruments in Cold Clouds Sampled during FIRE. J. Atmos. Ocean. Technol. 1989, 6, 378–388. [Google Scholar] [CrossRef]
Cober, S.G.; Isaac, G.A.; Korolev, A.V. Assessing the Rosemount icing detector with in situ measurements. J. Atmos. Ocean. Technol. 2001, 18, 515–528. [Google Scholar] [CrossRef]
Nguyen, C.M.; Wolde, M.; Battaglia, A.; Nichman, L.; Bliankinshtein, N.; Haimov, S.; Bala, K.; Schuettemeyer, D. Coincident in situ and triple-frequency radar airborne observations in the Arctic. Atmos. Meas. Tech. 2022, 15, 775–795. [Google Scholar] [CrossRef]
Baumgardner, D.; Abel, S.J.; Axisa, D.; Cotton, R.; Crosier, J.; Field, P.; Gurganus, C.; Heymsfield, A.; Korolev, A.V.; Krämer, M.; et al. Cloud Ice Properties: In Situ Measurement Challenges. Meteorol. Monogr. 2017, 58, 9.1–9.23. [Google Scholar] [CrossRef] [Green Version]
Stickney, T.M.; Shedlov, M.W.; Thompson, D.I. Goodrich Total Temperature Sensors; Technical Report 5755; Revision, C., Ed.; Rosemount Aerospace Inc.: Burnsville, MN, USA, 1994. [Google Scholar]
Droplet Measurement Technologies. Ultra-High Sensitivity Aerosol Spectrometer (UHSAS), Operator Manual, DOC-0210, Rev E-4; Software Version 4.1.0; Droplet Measurement Technologies: Longmont, CO, USA, 2013. [Google Scholar]
Wolde, M.; Pazmany, A. NRC dual-frequency airborne radar for atmospheric research. In Proceedings of the 32nd Conference on Radar Meteorology, Albuquerque, NM, USA, 24–29 October 2005. P1R.9. [Google Scholar]
Baibakov, K.; Wolde, M.; Nguyen, C.; Korolev, A.; Wang, Z.; Wechsler, P. Performance of a compact elastic 355 nm airborne lidar in tropical and mid-latitude clouds. In Proceedings of the SPIE 10006, Lidar Technologies, Techniques, and Measurements for Atmospheric Remote Sensing XII, 100060C, Edinburgh, UK, 24 October 2016. [Google Scholar] [CrossRef]
Lawson, R.P.; Baker, B.A.; Schmitt, C.G.; Jensen, T.L. An overview of microphysical properties of Arctic clouds observed in May and July during FIRE ACE. J. Geophys. Res. 2001, 106, 14989–15014. [Google Scholar] [CrossRef] [Green Version]
Rugg, A.; Bernstein, B.C.; Haggerty, J.A.; Korolev, A.; Nguyen, C.; Wolde, M.; Heckman, I.; DiVito, S. High Ice Water Content Conditions Associated with Wintertime Elevated Convection in the Midwest. J. Appl. Meteorol. Climatol. 2022, 61, 559–575. [Google Scholar] [CrossRef]
Xu, D.; Tian, Y. A Comprehensive Survey of Clustering Algorithms. Ann. Data. Sci. 2015, 2, 165–193. [Google Scholar] [CrossRef] [Green Version]
Belacel, N.; Wang, C.; Cupelovic-Culf, M. Clustering: Unsupervised Learning in Large Biological Data. In Statistical Bioinformatics: A Guide for Life and Biomedical Science Researchers; Wiley Online Library: Hoboken, NJ, USA, 2010; Chapter 5; pp. 89–127. [Google Scholar] [CrossRef]
Bezdek, J.C. Pattern Recognition with Fuzzy Objective Function Algorithms; Plenum Press: New York, NY, USA, 1981. [Google Scholar] [CrossRef]
Lloyd, S.P. Least Squares Quantization in PCM. IEEE Trans. Inf. Theory 1982, 28, 129–137. [Google Scholar] [CrossRef] [Green Version]
David, A.; Vassilvitskii, S. K-Means++: The Advantages of Careful Seeding. In Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms; Society for Industrial and Applied Mathematics: Philadelphia, PA, USA, 2007; pp. 1027–1035. [Google Scholar] [CrossRef]
Fränti, P.; Sieranoja, S. How much can k-means be improved by using better initialization and repeats? Pattern Recognit. 2019, 93, 95–112. [Google Scholar] [CrossRef]
Tibshirani, R.; Walther, G.; Hastie, T. Estimating the number of clusters in a data set via the gap statistic. J. R. Stat. Soc. Ser. B (Stat. Methodol.) 2001, 63, 411–423. [Google Scholar] [CrossRef]
Belacel, N.; Čuperlović-Culf, M.; Laflamme, M.; Ouellette, R. Fuzzy J-Means and VNS methods for clustering genes from microarray data. Bioinformatics 2004, 22, 1690–1701. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Duda, R.; Hart, P.; Stork, D. Pattern Classification, 2nd ed.; Wiley: New York, NY, USA, 2001; ISBN 978-0-471-05669-0. [Google Scholar]
Brodley, C.E.; Utgoff, P.E. Multivariate Decision Trees. Mach. Learn. 1995, 19, 45–77. [Google Scholar] [CrossRef] [Green Version]
Kohavi, R.; John, G. Wrappers for feature selection. Artif. Intell. 1997, 97, 273–324. [Google Scholar] [CrossRef] [Green Version]
Blum, A.; Langley, P. Selection of relevant features and examples in machine learning. Artif. Intell. 1997, 97, 245–271. [Google Scholar] [CrossRef] [Green Version]
Guyon, I.; Elisseeff, A. An introduction to variable and feature selection. J. Mach. Learn. Res. 2003, 3, 1157–1182. [Google Scholar] [CrossRef]
Kreidenweis, S.M.; Petters, M.; Lohmann, U. 100 Years of Progress in Cloud Physics, Aerosols, and Aerosol Chemistry Research. Meteorol. Monogr. 2019, 59, 11.1–11.72. [Google Scholar] [CrossRef]
Fontaine, E.; Schwarzenboeck, A.; Leroy, D.; Delanoë, J.; Protat, A.; Dezitter, F.; Strapp, J.W.; Lilie, L.E. Statistical analysis of ice microphysical properties in tropical mesoscale convective systems derived from cloud radar and in situ microphysical observations. Atmos. Chem. Phys. 2020, 20, 3503–3553. [Google Scholar] [CrossRef] [Green Version]
Yachao, H.; McFarquhar, G.M.; Wu, W.; Huang, Y.; Schwarzenboeck, A.; Protat, A.; Korolev, A.; Rauber, R.M.; Wang, H. Dependence of Ice Microphysical Properties on Environmental Parameters: Results from HAIC-HIWC Cayenne Field Campaign. J. Atmos. Sci. 2021, 78, 2957–2981. [Google Scholar] [CrossRef]
Korolev, A.; Field, P.R. Assessment of the performance of the inter-arrival time algorithm to identify ice shattering artifacts in cloud particle probe measurements. Atmos. Meas. Tech. 2015, 8, 761–777. [Google Scholar] [CrossRef] [Green Version]
Wu, W.; McFarquhar, G.M. On the Impacts of Different Definitions of Maximum Dimension for Nonspherical Particles Recorded by 2D Imaging Probes. J. Atmos. Ocean. Technol. 2016, 33, 1057–1072. [Google Scholar] [CrossRef]
Mazin, I.P.; Korolev, A.V.; Heymsfield, A.; Isaac, G.A.; Cober, S.G. Thermodynamics of Icing Cylinder for Measurements of Liquid Water Content in Supercooled Clouds. J. Atmos. Ocean. Technol. 2001, 18, 543–558. [Google Scholar] [CrossRef]
Schwarzenboeck, A.; Mioche, G.; Armetta, A.; Herber, A.; Gayet, J.-F. Response of the Nevzorov hot wire probe in clouds dominated by droplet conditions in the drizzle size range. Atmos. Meas. Tech. 2009, 2, 779–788. [Google Scholar] [CrossRef] [Green Version]
Korolev, A.; McFarquhar, G.; Field, P.R.; Franklin, C.; Lawson, P.; Wang, Z.; Williams, E.; Abel, S.J.; Axisa, D.; Borrmann, S.; et al. Mixed-Phase Clouds: Progress and Challenges. Meteorol. Monogr. 2017, 58, 5.1–5.50. [Google Scholar] [CrossRef]
Sadouk, L. CNN Approaches for Time Series Classification. In Time Series Analysis-Data, Methods, and Applications; Ngan, C.-K., Ed.; IntechOpen: London, UK, 2019. [Google Scholar] [CrossRef] [Green Version]
Cober, S.G.; Isaac, G.A.; Strapp, J.W. Aircraft Icing Measurements in East Coast Winter Storms. J. Appl. Meteorol. Climatol. 1995, 34, 88–100. [Google Scholar] [CrossRef]

Figure 1. Illustration of the NRC Convair-580 aircraft instrumented for the ICICLE campaign in 2019, reproduced with permission from [24]. Top row: NRC’s radar system (W- and X-band) and its axial pointing options and underwing cloud probes. Middle row: Aerosol isokinetic sampling inlet and bulk water detectors installed on the fuselage. Bottom row: Underwing cloud and atmospheric-state probes. Bottom middle: Cabin configuration with operator stations.

Figure 2. Flight tracks of the NRC Convair-580 aircraft for two selected flights: F20 on 23 February 2019 and F28 on 5 March 2019. Color coding shows UTC time progression along the flight track.

Figure 3. GOES-16, 11.2 µm infrared satellite imagery of representative time windows, in the middle of the flights. (a) F20, (b) F28. The images were obtained from the Satellite Cloud and Radiation Property retrieval System (SatCORPS) Team, http://satcorps.larc.nasa.gov (accessed on 1 January 2021).

Figure 4. (a,b) Wind rose plots (top), (c,d) hydrometeor size distributions—pdf (bottom), (e) P-T profiles measured in flight (right) for two selected flights, F20 and F28.

Figure 5. Decision tree illustrating the methodology of the classification based on conditional statements on data from bulk water content and temperature sensors [20]. The full decision tree includes further sub-classification using particle diameters.

Figure 6. A correlation matrix of input variables in flight F20 (23 February 2019).

Figure 7. DT classes percentage histogram for flights F20 and F28. Reproduced with permission [20]. Definition of each class is according to the dendrogram in Figure 5.

Figure 8. Time series plots of in situ measurements collected in flight F20 on 23 February 2019. Color coding denotes different clusters assigned by k-means algorithm. From Top to Bottom, the panels illustrate probe measurement from the temperature sensor, LWC/TWC ratio, RID, combined FSSP and OAP concentration and MVD, and aerosol number concentration. Due to the 1 Hz temporal averaging cloud particle concentration and size often appear as bimodal on the fulltime scale of the flight.

Figure 9. (Top): Same as Figure 8, but for the selected time interval between 13:41 and 13:57 UTC on 23 February 2019 (F20). (Bottom): 2D-S validation images for four different cloud phases. 2D-S Image resolution is 10 µm and frame height represents ~1.3 mm.

Figure 10. Probability distribution functions of 11 input variables in flight F28 (5 March 2019), color coded by the k-means algorithm identified clusters.

Figure 11. Time series of in situ measurements collected in flight F28 on 5 March 2019. Color coding denotes different clusters assigned by the k-means algorithm. From Top to Bottom, the panels illustrate probe measurement from the temperature sensor, LWC/TWC ratio, RID, combined FSSP and OAP concentration and MVD, and aerosol number concentration.

Figure 12. Time series of aircraft pressure-altitude flight profile on 5 March 2019 (F28).

Figure 13. NRC Convair-580 aircraft pressure-altitude flight profile and W-band radar reflectivity (Top), lidar reflectivity and depolarization plots (Bottom), 2D-S images for 4 selected time intervals (Right) from 12:34 to 12:45 UTC on 5 March 2019 (F28).

Figure 14. Time series of aircraft pressure-altitude on 5 March 2019 (F28) color-coded according to class assignments by four methods: (a) k-means algorithm with manual labeling; (b) decision tree; (c) fuzzy clustering without manual labeling; (d) sphericity derived classification. In the bottom panel, (e) histogram for a selected segment of the flight (11:55–15:55), illustrating the differences in the occurrence of the classes in each method.

Figure 15. Test case time series (13:01–13:05 UTC) for intercomparison between k-means and DT algorithms classification of mixed and glaciated phases in F28: (a) aircraft pressure-altitude color-coded according to class assignments with k-means, (b) aircraft pressure-altitude color-coded according to class assignments with DT, (c) lidar depolarization ratio, (d) HVPS-3 MVD > 1 mm, (e) selected input parameters used for classification, color coded according to class assignments with k-means.

Table 1. List of selected parameters that were used for this study (see instrument description below).

#	Selected Parameter	Description	Source Instrument
1	LWC [g m⁻³]	Bulk in situ Cloud microphysics	Nevzorov
2	TWC [g m⁻³]	Bulk in situ Cloud microphysics	Nevzorov
3	LWC/TWC [a.u.]	Bulk in situ Cloud microphysics	Nevzorov
4	Icing periods [Hz]	Bulk in situ Cloud microphysics	Rosemount Icing Detector
5	Number concentration [cm⁻³]	Single particle in situ Cloud microphysics	OAP & Scattering probes
6	MVD [µm]	Single particle in situ Cloud microphysics	OAP & Scattering probes
7	Static temperature [°C]	Atmospheric state	Rosemount temperature sensor
8	Aerosol number concentration [cm⁻³]	Aerosol	UHSAS
9	Radar reflectivity [dBZ]	Remote sensing	NAW
10	ZDR [dB]	Remote sensing	NAW
11	LDR [dB]	Remote sensing	NAW

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Shakirova, A.; Nichman, L.; Belacel, N.; Nguyen, C.; Bliankinshtein, N.; Wolde, M.; DiVito, S.; Bernstein, B.; Huang, Y. Multivariable Characterization of Atmospheric Environment with Data Collected in Flight. Atmosphere 2022, 13, 1715. https://doi.org/10.3390/atmos13101715

AMA Style

Shakirova A, Nichman L, Belacel N, Nguyen C, Bliankinshtein N, Wolde M, DiVito S, Bernstein B, Huang Y. Multivariable Characterization of Atmospheric Environment with Data Collected in Flight. Atmosphere. 2022; 13(10):1715. https://doi.org/10.3390/atmos13101715

Chicago/Turabian Style

Shakirova, Aliia, Leonid Nichman, Nabil Belacel, Cuong Nguyen, Natalia Bliankinshtein, Mengistu Wolde, Stephanie DiVito, Ben Bernstein, and Yi Huang. 2022. "Multivariable Characterization of Atmospheric Environment with Data Collected in Flight" Atmosphere 13, no. 10: 1715. https://doi.org/10.3390/atmos13101715

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

Multivariable Characterization of Atmospheric Environment with Data Collected in Flight

Abstract

1. Introduction

2. Materials and Methods

2.1. NRC Convair-580 Research Aircraft

2.1.1. Aircraft Instruments

Nevzorov

Rosemount Icing Detector

OAP and Scattering Probes

Rosemount Temperature Sensor

UHSAS

NAW

2.2. ICICLE Flight Campaign

2.2.1. Flight 20, 23 February 2019

2.2.2. Flight 28, 5 March 2019

2.3. Classification Approaches

2.3.1. Clustering Methods

K-Means

Fuzzy Clustering

Decision Tree Dendrogram Classification Method

2.3.2. Data Preparation

3. Results

3.1. Variety of Conditions in Flight

3.2. Flight 20

3.3. Flight 28

3.4. Classification Intercomparison

4. Discussion and Practical Applications

5. Conclusions

Author Contributions

Funding

Data Availability Statement

Acknowledgments

Conflicts of Interest

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI