Article

Potato Leaf Area Index Estimation Using Multi-Sensor Unmanned Aerial Vehicle (UAV) Imagery and Machine Learning

1 Biological Systems Engineering, University of Wisconsin-Madison, Madison, WI 53706, USA
2 Department of Horticulture, University of Wisconsin-Madison, Madison, WI 53706, USA
* Author to whom correspondence should be addressed.
Remote Sens. 2023, 15(16), 4108; https://doi.org/10.3390/rs15164108
Submission received: 5 July 2023 / Revised: 11 August 2023 / Accepted: 20 August 2023 / Published: 21 August 2023
(This article belongs to the Special Issue Retrieving Leaf Area Index Using Remote Sensing)

Abstract

Potato holds significant importance as a staple food crop worldwide, particularly in addressing the needs of a growing population. Accurate estimation of the potato Leaf Area Index (LAI) plays a crucial role in predicting crop yield and facilitating precise management practices. Leveraging the capabilities of UAV platforms, we harnessed their efficiency in capturing multi-source, high-resolution remote sensing data. Our study focused on estimating potato LAI utilizing UAV-based digital red–green–blue (RGB) images, Light Detection and Ranging (LiDAR) points, and hyperspectral images (HSI). From these data sources, we computed four sets of indices and employed them as inputs for four different machine-learning regression models: Support Vector Regression (SVR), Random Forest Regression (RFR), Histogram-based Gradient Boosting Regression Tree (HGBR), and Partial Least-Squares Regression (PLSR). We assessed the accuracy of individual features as well as various combinations of feature levels. Among the three sensors, HSI exhibited the most promising results due to its rich spectral information, surpassing the performance of LiDAR and RGB. Notably, the fusion of multiple features outperformed any single component, with the combination of all features of all sensors achieving the highest R2 value of 0.782. HSI, especially when utilized in calculating vegetation indices, emerged as the most critical feature in the combination experiments. LiDAR played a relatively smaller role in potato LAI estimation compared to HSI and RGB. Additionally, we discovered that the RFR excelled at effectively integrating features.

1. Introduction

Potatoes are one of the most widely consumed staple foods [1]. They provide essential carbohydrates and valuable nutrients, contributing to the diets of millions worldwide [2]. Moreover, as a major agricultural crop, potatoes play a vital role in economies and global food security [3]. Monitoring the growth status of potato crops is of utmost significance for effective agricultural management. However, because the tubers develop underground, directly observing potato growth is challenging. To overcome this limitation, an efficient approach is to calculate various indices from the above-ground plant parts. One such vital index is the LAI, which serves as a reliable indicator of crop health and vigor. LAI measures the total one-sided green leaf area per unit of ground surface area. It is a crucial parameter reflecting canopy structure and biochemical status and is commonly used in the estimation of biomass [4], chlorophyll content [5], crop yield [6], and other agroecosystem studies [7]. By leveraging LAI measurements, farmers and researchers can gain valuable insights into the overall performance of potato crops, enabling informed decisions on irrigation, fertilization, and crop management practices.
The most traditional way to obtain accurate LAI values is to cut all the leaves within a unit area and measure their total area. Later, proximal sensing devices, such as the LAI-2000 (LI-COR, Inc., Lincoln, NE, USA) and Tracing Radiation and Architecture of Canopies (3rd Wave Engineering, Nepean, ON, Canada), were developed and have been used extensively to estimate LAI in practice. However, these methods are inefficient and costly, as they rely heavily on human labor for the measuring procedure. Remote sensing, enabled by sensing platforms at a wide range of scales, has the potential to conduct large-scale LAI measurement tasks.
Satellites and UAVs are the two most widely used remote sensing platforms. There are many applications of LAI assessment using satellite imagery. Liu et al. utilized Landsat imagery to compute vegetation indices (VIs) and constructed a semi-empirical model for estimating the LAI of various crops [8]. Sentinel-2 data have been used to compute 40 different VIs for investigating biophysical variables [9]. Satellites have an irreplaceable advantage in large-scale LAI estimation [10,11], but their low spatial and temporal resolution largely limits their application to precision agricultural management. Unmanned Aerial Vehicle (UAV)-based remote sensing, on the other hand, overcomes these limitations thanks to its flexible data collection schedule, data accessibility, and capability to carry multiple types of sensors. Leveraging the functionalities of the available sensors, a variety of UAV-based remote sensing platforms have been customized for different tasks in precision agriculture [12], such as crop classification and extraction [13,14] and the assessment of stress [15], height [16], yield [17], biomass [18], nitrogen content [19,20], and LAI [21,22]. The UAV-LiDAR system can provide a comprehensive representation of the vertical structure of the research target, making it widely applicable in forest research [23,24]. The UAV-RGB system is the most cost-effective and commonly used system. It captures RGB images primarily for classification (semantic segmentation) tasks [25,26]. When combined with photogrammetric methods, it can also generate point clouds for further analysis [27]. The UAV-hyperspectral system has numerous applications in agriculture, forestry, and other fields. Its most prominent application is the estimation of various parameters or traits based on spectral information [28,29].
As a typical active sensor, LiDAR emits pulses to measure the observation targets and outputs point clouds containing structural information. These points can be used to estimate canopy cover [30], height [31], and other features [32], and they can also support the calculation of LAI. Zhang et al. utilized LiDAR data to compute various height-based characteristics, achieving the highest R2 values of 0.679 (flowering stage) and 0.291 (mature stage) using five machine learning models. However, due to the lack of spectral information, these accuracies were lower by 0.083 and 0.327, respectively, than those obtained with features from HSI [33]. Similar results were observed in [34], where spectral features had a higher R2 than structural features by 0.246–0.3, provided that the samples were acquired in a similar region and at a similar time. When plant leaves are small, the limited sampling density means that LiDAR data are likely unable to fully capture leaf information.
Optical images are typically classified into categories, such as RGB images and multi-/hyper-spectral images, based on the number of channels they possess. While an RGB image consists of only three bands, it can still be used to calculate various features and extract valuable information about LAI. Yang et al. tested six VIs for estimating sugarcane LAI over the whole growing season, with the highest R2 reaching 0.668 [35]. Li et al. combined color and texture features of RGB images, achieving an R2 of 0.86 [36]. However, the low spectral resolution of RGB images makes them unable to capture features across a wider reflectance spectrum, placing them at a disadvantage relative to multi-/hyper-spectral data for crop growth monitoring [37].
The multi-/hyper-spectral images, containing abundant spectral information, reflect the physiological and biochemical information of crops and can be used to calculate widely recognized indices. Tao et al. evaluated hyperspectral indices and red-edge parameters on LAI estimation, with R2 ranging between 0.62 and 0.76 for four growth stages [38]. Gao et al. utilized eight spectral indices to retrieve the LAI of winter wheat with correlation coefficients ranging from 0.677 to 0.851 [39]. Ma et al. evaluated the performances of four methods for LAI estimation based on eight spectral indices, and the highest R2 of the validation set reached 0.74 [40]. These methods could achieve good estimation accuracy (R2 > 0.7), but the bands that constitute the VIs were still referenced from other studies, and no specific selection was made for the research target. Ma et al. conducted band selection based on four basic index types for cotton LAI estimation, and the highest R2 value reached 0.9 [41]. It remains to be discussed whether other vegetation types, such as potatoes, require band selection and whether more complex index types are suitable for band selection methods.
Despite these promising results, each data source has limitations in estimating LAI on its own. Therefore, many researchers have started exploring the combination of multiple data sources to estimate LAI. Luo et al. (2019) combined LiDAR and hyperspectral data to predict the LAI of maize, wheat, and vegetables, and the R2 of the combined features reached 0.829 [42]. Yue et al. (2018) fused the VIs and height from RGB and HSI to estimate above-ground biomass and LAI, with the highest R2 for LAI exceeding 0.9 [43]. The combination of multiple data sources has shown a significant improvement in accuracy compared to using a single data source. However, further research is needed to determine the enhancement in crop LAI estimation that can be achieved by integrating these three data sources.
LAI assessment methods for various crops based on image features such as vegetation indices have achieved good estimation accuracy, which is further enhanced by combining multiple data sources. However, two issues still need further analysis. The first is how to select the bands for VI calculation. Most studies use fixed-band VIs rather than accounting for the differences between plants in different growth states. Taking NDVI as an example, many studies calculate NDVI as an input feature, yet the band selections (λ1 and λ2) differ greatly [17,44]. The second is that few studies have reported how the features of the three data sources, RGB, LiDAR, and HSI, contribute to the performance of LAI estimation.
Thus, the specific objectives of this study are to (1) validate whether the band combinations used in previous studies are suitable for LAI estimation of potatoes, through comparative experiments between fixed and optimized band combinations; (2) investigate the performance of single and combined data sources in the LAI estimation of potatoes; and (3) examine the importance and contribution of features from different data sources in the combination experiments.
The article is organized as follows. After Section 1, field experiments, data acquisition and processing, and models with strategies are discussed in Section 2. Section 3 introduces the ground data and the comparison results of different feature combinations. Differences in ground data from two potato cultivars and the contribution and role of different data sources are discussed in Section 4. Finally, Section 5 concludes the work.

2. Materials and Methods

2.1. Field Experiments

The field experiment was conducted in 2021 at the University of Wisconsin (UW) Hancock Agricultural Research Station (HARS), a vegetable research farm located in the Central Sands area of Wisconsin. The whole field was 41 m wide by 78 m long, comprising a total of 32 individual plots. Each subplot comprised 8 rows, with each row measuring 7.6 m in length and 0.9 m in width.
The experiment followed a split-plot design with 4 replications, as shown in Figure 1. The field consisted of one strip without fertigation and one strip with fertigation. Within each strip, two different nitrogen (N) rates were randomly assigned to the entire plot, as specified in Table 1. Other production practices were implemented based on the recommendations provided by UW Extension [45]. Two cultivars, Snowden (chipping potato cultivar) and Colomba (yellow potato cultivar), were planted on 23 April and harvested on 15 September 2021. The sampling dates were scheduled for 30 June, 20 July, 29 July, 3 August, and 12 August.

2.2. Data Collection

RGB, LiDAR, and hyperspectral data were collected synchronously by three sensors mounted on a Matrice 600 Pro platform (DJI Technology Co., Shenzhen, China) under clear-sky conditions five times in the growing season: on 30 June, 20 July, 29 July, 3 August, and 12 August. RGB images were taken by a Cyber-shot DSC-RX1R II camera (Sony Group Corporation, Minato, Tokyo, Japan), and 3D point data were acquired by a VLP-16 sensor (Velodyne Lidar Inc., San Jose, CA, USA), which uses an array of 16 infrared (IR) lasers paired with IR detectors, each firing at 18.08 kHz. The HSI were taken by a Nano-Hyperspec sensor (Headwall Photonics Inc., Bolton, MA, USA), containing 640 pixels in each scan line with a pixel pitch of 7.4 μm. The details of the sensors are provided in Table 2. A Global Navigation Satellite System-aided Inertial Navigation System (GNSS/INS), VN-300 (VectorNav, Dallas, TX, USA), was integrated with the sensors to provide longitude, latitude, and attitude (yaw, pitch, and roll).
We used the LAI-2000 Plant Canopy Analyzer (LI-COR, Inc., Lincoln, NE, USA) to measure the LAI of the potato plants. It utilizes a “fisheye” optical sensor to calculate LAI by capturing light measurements both above and below the canopy. This device measures light interception at five different zenith angles simultaneously and then employs a radiative transfer model to compute the LAI.

2.3. Image Process and Features Calculation

In our methodology, we adopted a rigorous approach to ensure the accuracy of image processing. After collecting raw data from the three sensors, we used the GRYFN Processing Tool V 1.2.6 (West Lafayette, IN, USA) to preprocess them, including orthorectification, mosaicking, and geometric and radiometric correction. For HSI, the Hyperspec III software V 3.1.4 (Bolton, MA, USA) was also used for radiometric correction. Further corrections and processing were performed in Python. Our workflow follows a standardized process that draws on methods commonly used in similar studies [17]. We conducted a visual interpretation to assess the correctness of the processing results. To analyze the relationships between the images and LAI, various related features were extracted from the processed RGB, LiDAR, and HSI data. These features reduce the redundancy of the raw data and emphasize specific information about the plant. All the processing steps and features are described below, and the corresponding formulas are shown in Table 3.

2.3.1. RGB-Based Features

The separate RGB images collected from the RGB camera were orthorectified using the LiDAR-derived digital terrain model (DTM) and then mosaicked into one image in GRYFN. The RGB mosaic showed no noticeable distortion. Images of each plot were extracted from it using manually drawn plot boundaries, yielding 160 (32 plots × 5 dates) individual images. Before extracting the plot features, we used the balanced histogram thresholding (BHT) method in Python to automatically remove the background (shadow and soil).
Six features of each plot were calculated from the processed images: the mean pixel value of the red band (R), the mean pixel value of the green band (G), the mean pixel value of the blue band (B), and the mean normalized values of the red (Normalized_R), green (Normalized_G), and blue (Normalized_B) bands.
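As an illustration, the sketch below applies BHT to a greenness index and then derives the six features for one plot image. The study's code is not published, so the variable names, the 2G − R − B greenness index used for thresholding, and the overall structure are our assumptions rather than the authors' implementation.

```python
import numpy as np

def bht_threshold(hist):
    """Balanced histogram thresholding: repeatedly trim the heavier side
    of the histogram until the two sides balance; return the midpoint bin."""
    i_s, i_e = 0, len(hist) - 1
    i_m = (i_s + i_e) // 2
    w_l = int(hist[i_s:i_m + 1].sum())       # weight left of the midpoint
    w_r = int(hist[i_m + 1:i_e + 1].sum())   # weight right of the midpoint
    while i_s <= i_e:
        if w_r > w_l:                        # right side heavier: trim right
            w_r -= hist[i_e]
            i_e -= 1
            if (i_s + i_e) // 2 < i_m and i_m > 0:
                w_r += hist[i_m]
                w_l -= hist[i_m]
                i_m -= 1
        else:                                # left side heavier: trim left
            w_l -= hist[i_s]
            i_s += 1
            if (i_s + i_e) // 2 >= i_m and i_m + 1 < len(hist):
                w_l += hist[i_m + 1]
                w_r -= hist[i_m + 1]
                i_m += 1
    return i_m

def rgb_plot_features(rgb):
    """Six RGB features for one plot image of shape (H, W, 3)."""
    rgb = np.asarray(rgb, dtype=float)
    greenness = 2 * rgb[..., 1] - rgb[..., 0] - rgb[..., 2]  # assumed index
    hist, edges = np.histogram(greenness, bins=256)
    veg = rgb[greenness > edges[bht_threshold(hist)]]  # (N, 3) plant pixels
    r, g, b = veg.mean(axis=0)
    norm = (veg / veg.sum(axis=1, keepdims=True)).mean(axis=0)
    return {"R": r, "G": g, "B": b,
            "Normalized_R": norm[0], "Normalized_G": norm[1],
            "Normalized_B": norm[2]}
```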

2.3.2. LiDAR-Based Features

All the raw data collected from the LiDAR were georeferenced with the positions recorded by the GNSS/INS unit and converted into point clouds in GRYFN. Most LiDAR points were distributed within a reasonable range of potato plant heights. After separating the points of different plots by the plot boundaries, we used cloth simulation filtering (CSF) [46] to distinguish plant points from ground points in each plot. The basic idea of CSF is to assume that an inverted point cloud surface is covered with a rigid cloth; by analyzing the interactions between the cloth nodes and the LiDAR points, the cloth nodes can be used to simulate the ground. The extracted ground points were used to generate a DTM with 8 cm spatial resolution. The median height of the ground points in each grid cell is set as the height of that cell, while the height of cells without ground points is set to the mean of the surrounding cells. The vertical distances of plant points to the DTM are taken as the heights of the plant points. Besides the point cloud, GRYFN also outputs a median-filtered digital surface model (DSM). With the categorized points, plant heights, and DSM, we can calculate several features (a computational sketch follows the list below):
Height Percentile: The 50th, 75th, 90th, and 95th percentile height of plant points.
Canopy Volume: The number of points categorized as plants.
Canopy Cover: The ratio of the number of plant points to all the points.
Max Plant Height: The difference between the maximum height and the minimum height of DSM.
Plant Area Index (PAI): The plant area, i.e., the sum of plant area density (PAD) values multiplied by voxel volume per unit ground surface area. The index was calculated using the algorithm in [47], which scales and averages the returned LiDAR intensities for each pulse and applies the Beer–Lambert law to estimate the PAD.
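A minimal sketch of these structural features is given below, assuming the CSF step and DTM normalization above have already produced per-point plant heights; the PAI computation follows the intensity-based Beer–Lambert algorithm of [47] and is omitted here.

```python
import numpy as np

def lidar_plot_features(plant_heights, n_plant, n_all, dsm):
    """Structural features for one plot.
    plant_heights: heights of CSF-classified plant points above the DTM;
    n_plant, n_all: counts of plant points and of all points in the plot;
    dsm: the median-filtered digital surface model of the plot."""
    feats = {f"H{p}th": float(np.percentile(plant_heights, p))
             for p in (50, 75, 90, 95)}        # height percentiles
    feats["CanopyVolume"] = n_plant            # number of plant points
    feats["CanopyCover"] = n_plant / n_all     # plant points / all points
    feats["MaxPlantHeight"] = float(dsm.max() - dsm.min())
    return feats
```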

2.3.3. HSI-Based Features

The data collected from the hyperspectral scanner were orthorectified and georeferenced based on the position information of the GNSS/INS unit. Then, the raw digital numbers were calibrated to reflectance based on the metadata and calibration panels in Hyperspec III and GRYFN. The HSI exhibited standard green-vegetation spectral curves. After separating the reflectance map of each plot, the background was removed based on a threshold in Python: pixels with a reflectance below 0.15 at the 800 nm wavelength were removed. Images of each plot were also extracted using the plot boundaries. We calculated two types of features, VIs and statistical features, based on these images.
We selected six typical indices as input features. The NDVI [48] is the most widely used VI and can be used to monitor the phenology, quantity, and activity of vegetation. The two-band enhanced vegetation index (EVI2) not only maintains the advantages of EVI, improving linearity with biophysical vegetation properties and reducing saturation effects, but also does not require a blue band [49]. Because red-edge reflectance-based VIs are preferable for estimating crop LAI and other canopy architecture parameters and are more sensitive to leaf chlorophyll content [50], we selected four VIs related to the red-edge bands and chlorophyll: the red-edge chlorophyll index (CIrededge) [51], the green chlorophyll index (CIgreen) [51], the red-edge modified simple ratio index (MSRrededge) [52], and the MERIS terrestrial chlorophyll index (MTCI) [53].
In addition, the mean and standard deviation (Std) of each HSI band were calculated and used as input features for estimating LAI.
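The following sketch computes the six VIs of Table 3 from the masked reflectance pixels of one plot. The nearest-band lookup mirrors the substitution rule described in Section 2.4.1; the default wavelengths are illustrative placeholders, not the searched bands reported later.

```python
import numpy as np

def band(cube, wl, target_nm):
    """Mean plot reflectance of the band nearest to target_nm.
    cube: background-masked pixels of one plot, shape (n_pixels, 274);
    wl: NumPy array of the 274 band-center wavelengths in nm."""
    return cube[:, int(np.argmin(np.abs(wl - target_nm)))].mean()

def hsi_vis(cube, wl, red=670.0, green=550.0, rededge=710.0, nir=800.0):
    """Six VIs from Table 3 (default wavelengths are placeholders)."""
    r, g, re, n = (band(cube, wl, x) for x in (red, green, rededge, nir))
    return {
        "NDVI": (n - r) / (n + r),
        "EVI2": 2.5 * (n - r) / (n + 2.4 * r + 1),
        "CIrededge": n / re - 1,
        "CIgreen": n / g - 1,
        "MSRrededge": (n / re - 1) / np.sqrt(n / re + 1),
        "MTCI": (n - re) / (re - r),
    }
```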

2.4. Feature Search/Selection/Comparison Strategies

2.4.1. Grid Searching Bands and Fixed Bands

The HSI imagery, with 274 narrow bands, supplies more refined spectral information for constructing VIs, which are commonly calculated from fixed, broad spectral bands based on previous studies. For example, the NDVI is the ratio of the difference between the near-infrared (NIR) and red bands to the sum of these two bands, with commonly used band ranges of 770–890 nm for NIR and 630–690 nm for red. With the HSI imagery, however, there are 55 bands within the NIR range and 27 within the red range. Therefore, to locate the specific HSI bands with the best performance for estimating LAI, we applied a grid-searching method in the feature selection step. First, we divided the bands into five sets: blue, green, red, red-edge, and NIR. The spectral range of each set is slightly larger than the commonly used range to ensure that the best combination can be found; for example, the NIR range is set to 760–1000 nm. We then calculated the six groups of HSI-based VIs over these band sets, and the combination with the highest Pearson correlation coefficient (r) with LAI in each group was selected as the final output of that index. Furthermore, we compared the LAI-estimation performance of VIs calculated from the searched HSI bands with that of the fixed (experienced) bands. When a sensor wavelength did not exactly match the experienced wavelength, the band with the smallest wavelength difference was used as a substitute in the subsequent calculation.
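A sketch of this grid search for a two-band index such as NDVI is given below; the variable names are hypothetical, and selecting by the absolute value of the correlation is our assumption.

```python
import numpy as np
from itertools import product
from scipy.stats import pearsonr

def search_ndvi_bands(refl, lai, wl, red_rng=(620, 700), nir_rng=(760, 1000)):
    """Exhaustively search the red/NIR pair whose NDVI correlates best with LAI.
    refl: (n_plots, n_bands) mean plot reflectance; lai: (n_plots,) ground
    truth; wl: (n_bands,) band-center wavelengths in nm."""
    red_idx = np.flatnonzero((wl >= red_rng[0]) & (wl <= red_rng[1]))
    nir_idx = np.flatnonzero((wl >= nir_rng[0]) & (wl <= nir_rng[1]))
    best = (None, None, -1.0)
    for i, j in product(red_idx, nir_idx):
        ndvi = (refl[:, j] - refl[:, i]) / (refl[:, j] + refl[:, i])
        r, _ = pearsonr(ndvi, lai)
        if abs(r) > best[2]:
            best = (wl[i], wl[j], abs(r))    # keep the strongest |r|
    return best  # (red wavelength, NIR wavelength, |r|)
```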

2.4.2. Combination of VIs from Different Data Sources

The features derived from the RGB, LiDAR, and HSI data can be regarded as distinct subsets of the feature space, each with a unique focus. By combining these multi-source features, the input can be enriched, leading to improved estimation capabilities of the models. To assess the extent of this improvement, two sets of experiments were conducted: in the first, the three individual data sources were evaluated independently; in the second, the four combinations, RGB + LiDAR, RGB + HSI, LiDAR + HSI, and RGB + LiDAR + HSI, were evaluated. All inputs underwent standardization, and the parameters were determined through a combination of grid searching and manual adjustment.

2.4.3. Statistical Features Selection and Combination with VIs

The VIs are considered subsets of the original reflectance data and typically emphasize specific vegetation properties. In addition to VIs, statistical features are also important for evaluating the biological conditions of vegetation. The mean and standard deviation of each band were calculated as the original statistical features. However, these 548 features cannot be used directly for model training due to the presence of invalid and redundant information. To address this, we employed Recursive Feature Elimination with Cross-Validation (RFECV) on the original features. RFECV depends on the chosen estimator and requires feature importances as input. Here, we utilized Random Forest Regression (RFR) and Support Vector Regression (SVR) as base models to extract two feature subsets. Subsequently, we evaluated the performance of the four regression models using the two feature sets obtained from RFECV. This approach aimed to reduce redundancy and identify the features essential for model training and estimation. The selected features, and the combination of VIs with the selected features, were then evaluated.
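A minimal scikit-learn sketch of the RFR-based variant is shown below, with random stand-in data in place of the 548 per-band features; the step size and other settings are assumptions, as the paper reports only that RFECV was used.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import RFECV
from sklearn.model_selection import KFold

rng = np.random.default_rng(0)
X = rng.random((160, 548))   # stand-in for per-band means and stds
y = rng.random(160) * 5.0    # stand-in for measured LAI

selector = RFECV(
    estimator=RandomForestRegressor(n_estimators=100, random_state=0),
    step=20,                                   # features removed per round (assumed)
    cv=KFold(n_splits=5, shuffle=True, random_state=0),
    scoring="r2",
    min_features_to_select=5,
)
selector.fit(X, y)
print(selector.n_features_)                    # size of the retained subset
X_selected = selector.transform(X)             # reduced feature matrix
```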

2.5. Machine Learning Model

Four commonly used machine learning approaches, Support Vector Regression (SVR), Random Forest Regression (RFR), Histogram-based Gradient Boosting Regression Tree (HGBR), and Partial Least-Squares Regression (PLSR), were selected as learners to evaluate the features. SVR is a supervised learning model derived from support vector machines (SVMs). The algorithm fits a tube of width ε around the data, keeping as many points as possible inside the tube while penalizing the errors of points that fall outside it. Three parameters, the kernel, gamma, and the regularization parameter (C), were adjusted for different input features. RFR is a supervised ensemble learning algorithm that constructs several decision trees from the training data; the outputs of these trees are aggregated to produce the final estimate. HGBR greatly accelerates gradient boosting by binning continuous features into integer-valued groups instead of sorting continuous values. PLSR projects the input features and LAI into a new lower-dimensional space and applies a linear regressor to fit the data. In this study, all the regression models were implemented in Python using the scikit-learn library [54]. As effective parameter tuning plays a pivotal role in optimizing model performance, the optimal parameter values were obtained by grid search and manual adjustment. The names and explanations of the parameters are shown in Table 4.
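The sketch below wires the four regressors to a grid search in scikit-learn (version 1.0 or later); the candidate parameter grids and stand-in data are illustrative, since the paper reports tuning by grid search plus manual adjustment without publishing the exact ranges.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor, HistGradientBoostingRegressor
from sklearn.svm import SVR
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X, y = rng.random((160, 30)), rng.random(160) * 5.0  # stand-in features/LAI

# Each entry pairs an estimator with an illustrative parameter grid.
models = {
    "SVR": (make_pipeline(StandardScaler(), SVR()),
            {"svr__kernel": ["rbf", "linear"],
             "svr__C": [0.1, 1, 10, 100],
             "svr__gamma": ["scale", 0.01, 0.1]}),
    "RFR": (RandomForestRegressor(random_state=0),
            {"n_estimators": [100, 300], "max_depth": [None, 5, 10]}),
    "HGBR": (HistGradientBoostingRegressor(random_state=0),
             {"learning_rate": [0.05, 0.1], "max_iter": [100, 300]}),
    "PLSR": (PLSRegression(), {"n_components": [2, 5, 10]}),
}
fitted = {name: GridSearchCV(est, grid, cv=5, scoring="r2").fit(X, y)
          for name, (est, grid) in models.items()}
for name, gs in fitted.items():
    print(name, round(gs.best_score_, 3), gs.best_params_)
```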

2.6. Evaluation Metrics

The accuracy of the LAI estimation was evaluated using the coefficient of determination (R2), the root-mean-square error (RMSE), and the mean absolute error (MAE); their formulas are shown below. R2 is positively correlated with model accuracy, while RMSE and MAE are the opposite. In our experiments, we applied a 5-fold cross-validation strategy repeated 5 times, and the final evaluation metrics are the means of the 25 results.
$$R^2 = 1 - \frac{\sum_{i=1}^{n}(y_i - y_{i\_pre})^2}{\sum_{i=1}^{n}(y_i - \bar{y})^2}$$

$$RMSE = \sqrt{\frac{\sum_{i=1}^{n}(y_i - y_{i\_pre})^2}{n}}$$

$$MAE = \frac{\sum_{i=1}^{n}\left|y_i - y_{i\_pre}\right|}{n}$$

where $y_i$ is the measured value, $y_{i\_pre}$ is the estimated value, $\bar{y}$ is the mean of the measured values, and $n$ is the sample number.
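For reference, a sketch of this evaluation protocol in scikit-learn is shown below, using RFR and stand-in data; the scorer names map directly to R2, RMSE, and MAE.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import RepeatedKFold, cross_validate

rng = np.random.default_rng(0)
X, y = rng.random((160, 30)), rng.random(160) * 5.0  # stand-in features/LAI

cv = RepeatedKFold(n_splits=5, n_repeats=5, random_state=0)  # 25 folds total
scores = cross_validate(
    RandomForestRegressor(random_state=0), X, y, cv=cv,
    scoring=("r2", "neg_root_mean_squared_error", "neg_mean_absolute_error"),
)
print("R2  :", scores["test_r2"].mean())                      # mean of 25 runs
print("RMSE:", -scores["test_neg_root_mean_squared_error"].mean())
print("MAE :", -scores["test_neg_mean_absolute_error"].mean())
```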

3. Results

3.1. Ground Data Statistics

Colomba is an early-maturing variety, harvested about 90 days after planting, whereas Snowden matures late, approximately 110–120 days after planting. This discrepancy in maturation time is reflected in the distinct LAI distributions illustrated in Figure 2: Snowden exhibited a pattern of initial increase followed by a subsequent decline, while Colomba consistently displayed a downward trend. Additionally, Figure 2 provides time-series images depicting the growth progression of both cultivars.
In Figure 3, we also present the distribution of LAI for the two varieties under different N rates. For Colomba, LAI decreased over time under all four fertilization conditions. However, the magnitude of the LAI values was influenced by the fertilizer application rate: the LAI of the control group was lower than that of R1, which in turn was lower than the LAI values of R2 and R3. This pattern was more prominent in Snowden, where both the control group and R1 exhibited a declining LAI trend, while the LAI values of R2 and R3 increased when higher amounts of fertigation were applied. The effect of fertigation on LAI is significant and leads to a dataset with large variations; regression experiments on this dataset can therefore demonstrate the generality of the model to some extent.

3.2. Comparison of VIs with Searched and Fixed Bands

There are notable differences between the effectiveness of the searched bands and the fixed bands, especially in the red-edge and NIR ranges (Table 5). In general, the correlation coefficients of the searched bands were greatly increased, with the maximum improvement reaching 0.211. The NDVI and EVI2, commonly used to characterize “greenness”, demonstrated higher correlations than the others for both searched and fixed band combinations. After searching, the correlations of the two VIs composed of the red-edge band, CIrededge and MSRrededge, increased significantly, from 0.565 to 0.736 and from 0.548 to 0.759, respectively. These improvements provide evidence of the effectiveness of the optimal (searched) VIs for estimating LAI.
The searched bands differ from the fixed bands and generally have longer wavelengths. This indicates that vegetation with similar reflectance spectra can still differ in plant structure, biochemical composition, external environmental pressures, and diseases, all of which can impact spectral characteristics. As a result, relying solely on VI formulations based on fixed bands may not consistently yield optimal results, whereas grid searching can help identify suitable band combinations to a certain extent.

3.3. Combination of VIs from Different Data Sources

The evaluation results of single sources are shown in Table 6. The highest R2 for each data source is underlined, and the highest R2 among all data sources is in bold. Despite employing different data acquisition mechanisms, RGB and LiDAR achieved comparable regression accuracy. The RGB image-derived traits have a slight advantage in estimating potato LAI compared to those from LiDAR. With its extensive spectral information, the HSI features exhibited significantly higher R2 compared to the other two data sources.
In general, RFR was the most accurate and stable model. However, as an ensemble of decision trees, it incurs higher time costs than the other methods. SVR excels in runtime efficiency due to its lower training and prediction complexity, and its solid mathematical foundation and theoretical guarantees provide confidence in its performance. The HGBR model fell between RFR and SVR in terms of efficiency and accuracy, while PLSR was the fastest model.
Results of the four different combinations of the three data sources as predictors for estimating potato LAI are shown in Figure 4. The best performance (R2 = 0.775) was given by the RFR model with all features from LiDAR, RGB, and HSI, followed by 0.768, achieved by the RFR model with features from LiDAR and HSI. Notably, the combined data sources outperformed the individual ones: for example, the R2 of LiDAR + RGB was higher than that of LiDAR or RGB alone, and the R2 of LiDAR + RGB + HSI was higher than that of any single component. Combinations of multiple data sources encompass diverse feature spaces, providing the regression model with richer input information and thereby improving LAI estimation performance.
When examining the combinations with the lowest accuracy values (0.687, 0.734, 0.745, and 0.740), we can notice that the last three combinations, which incorporate HSI features, exhibit similar and notably higher accuracy levels compared to the first combination. This observation highlights the significance of hyperspectral information in estimating LAI with different modeling approaches.
Among the four models, RFR emerged as the most suitable algorithm for estimating LAI with combined features, as evidenced by its superior performance across all combinations. Each combination outperformed the single components individually. Notably, RFR achieved the highest R2 value of 0.775 when utilizing all three feature subsets. On the other hand, SVR, despite showing advantages in Table 6, did not perform well in this set of experiments. Its results exhibited an opposite trend compared to RFR, and in some combinations, the accuracy was even reduced. For instance, combinations such as LiDAR + HSI and LiDAR + RGB + HSI performed worse than using HSI alone. The results of HGBR show similarities to those of SVR, with both strengths and weaknesses observed across different combinations. Finally, PLSR exhibited the lowest accuracy among the models, with a difference of approximately 0.2 to 0.3 compared to other methods. In addition to accuracy, efficiency is indeed an important metric to consider. During the grid searching process for the best parameters, each feature combination requires parameter tuning, leading to variations in evaluation times for each model. RFR had the greatest uncertainty in evaluation time. On the other hand, PLSR and SVR were generally faster and more stable compared to RFR and HGBR.
From Table 6 and Figure 4, it can be observed that the R2 of the combinations involving all three data sources is slightly higher (by 0.009) compared to using HSI alone. Despite doubling the amount of data and workload, the overall improvement is marginal, indicating that the complementary information provided by RGB and LiDAR to HSI is limited in this context.

3.4. Combination of VIs and Selected Statistical Features

The evaluation metrics of the original statistical features and the selected features are shown in Table 7. The highest R2 values for the original or selected features are underlined, and the highest R2 value in the table is in bold. RFR-based and SVR-based RFECV selected 44 and 40 features, respectively. The selected and original features were fed into the evaluation models, and the selected ones achieved the same or even better performance, with the highest R2 reaching 0.766. The significant reduction in the number of features did not cause a significant decrease in accuracy, indicating that the insignificant features had been removed. Moreover, the refinement of the features brought a substantial reduction in evaluation time, by 15.22%, 56.52%, 88.71%, and 30% for the four models, respectively.
These statistical features combined with the VIs achieved reliable performance: when they were combined as input to the evaluation models, the best R2 improved from 0.778 to 0.782, as shown in Figure 5. Given that the RFR model achieved the highest accuracy in the previous experiments, we generated the importance of each feature based on the RFR model, as shown in Figure 6. As before, HSI was the most important data source, and the six VIs derived from it occupied the top six positions. The feature importances agree with the modeling results that the contributions of LiDAR and RGB were minimal. The importance of the LiDAR features varied considerably, with MaxPlantHeight in seventh position but the Height Percentiles and Canopy Cover among the ten least important features. The differences in importance among the RGB features were small: the green channel ranked slightly higher than the red, and the red slightly higher than the blue.

4. Discussion

Many widely used vegetation indices, such as the VIs in Table 3, were originally defined on broad wavelength ranges (such as the green, red, and NIR bands) when first proposed. As hyperspectral data become more prevalent, selecting the appropriate narrow bands for calculating these indices becomes critical. Researchers have addressed this challenge in studies seeking the optimal bands for specific vegetation indices, such as optimal band combinations for simple indices [41,56] and optimal bandwidths [57]. In this study, we performed grid searching for several more complex VIs and selected the band combination most correlated with LAI as the optimal bands. The optimal bands we discovered for potatoes are distinct from the commonly used fixed bands and even differ from the optimal bands identified in other studies. This highlights the importance of conducting a dedicated band selection process for different research targets when using hyperspectral data.
For feature combination, we conducted ablation experiments testing single-source features, pairwise combinations, and all features. HSI performed better than RGB, which in turn performed better than LiDAR. From our results, we find that the richer the spectral information in the data, the higher the prediction accuracy of LAI. Similar findings are presented in [58,59,60], which show that multi-/hyper-spectral imagery-derived models outperform LiDAR-derived models. In all four combination experiments, the accuracy of the combined features was higher than that of their components, indicating that multiple data sources can complement each other and improve LAI prediction performance. However, the improvement in accuracy was not substantial, in line with the findings in [59,60,61].
VIs are just a subset of the features that can be extracted from HSI, and a large amount of spectral information remains unused. Therefore, we extracted another type of feature, statistical features, to analyze the importance of each band. The results indicate that, compared to using all bands, the important bands selected by RFECV are more advantageous for estimating the target variable. This is consistent with the results of other studies [62,63] that also employ RFECV. It indicates that there is data redundancy when using hyperspectral information for parameter estimation, and removing unnecessary data can lead to better parameter estimates.
In precision agriculture, the estimation accuracy of LAI should be weighed against the cost of UAV sensors, image acquisition efficiency, data processing complexity, and other factors. Currently, RGB cameras have the lowest cost (~USD 3000) and the simplest data processing workflow. Their acceptable accuracy in our experiments (R2 = 0.726 in Table 6) indicates that RGB cameras are a good choice for LAI estimation under low cost and low technical requirements. Compared to RGB cameras, LiDAR devices cost slightly more (~USD 4000), and their powerful penetration capability may not be effectively utilized in agricultural fields, resulting in lower accuracy (R2 = 0.666 in Table 6). Therefore, we do not recommend using LiDAR as the sole sensor for LAI estimation. Hyperspectral sensors perform best in terms of accuracy (R2 = 0.766 in Table 6), but their prohibitive cost (~USD 50,000) and the requirement for specialized knowledge are two main disadvantages. In cases where high accuracy is demanded and the participants are knowledgeable in data processing, hyperspectral sensors are a very suitable choice. If sensors that support custom spectral channel settings become widely available, users could acquire only the bands sensitive to the research parameters, such as the optimal bands obtained in Table 5, further reducing the overall cost of using multispectral or hyperspectral sensors. In addition, our method can also be applied to other green vegetation, although the corresponding optimal bands may vary, necessitating a re-evaluation of band selection. The applicability of these methods to non-green vegetation requires further investigation.
While the current research emphasis is on deep learning, machine learning retains its advantages in this application. Deep learning is a data-driven approach that requires a substantial amount of training data to adequately estimate up to millions of parameters. It finds extensive use in tasks such as leaf classification [64,65] and crop classification [66,67], but its application is limited in regression tasks such as LAI estimation. This is because acquiring ground-truth LAI values is significantly more difficult than labeling data for classification tasks, and there are no large, publicly available LAI datasets for thoroughly training the parameters of deep networks. Hence, for small-scale (field) crop parameter estimation, mainstream machine learning methods and shallow neural networks offer performance that meets requirements. Moreover, machine learning methods have lower hardware demands, making them better suited for practical applications.

5. Conclusions

This study validates the necessity of band selection for potato LAI estimation by comparing experiments involving searched and fixed bands. In comparison to Vegetation Indices (VIs) calculated from fixed bands based on empirical knowledge, the correlation between LAI and optimized narrow-band VIs increased by 0.01 to 0.211. This provides evidence for the essentiality of optimal band selection for potatoes.
Additionally, a series of experiments using single and combined data sources was conducted to comprehensively analyze the accuracy, significance, and cost involved. The three data sources provide different types of features: structural features (LiDAR), general spectral features (RGB), and detailed spectral features (HSI). Among these, the VIs calculated from HSI outperformed the others, with the highest R2 reaching 0.766 (by the RFR), and the RGB features achieved better accuracy than those of LiDAR. The richness of spectral information in the data correlates with the accuracy of LAI estimation. In the ablation experiments, the VIs of HSI dominated the feature space, while adding features from RGB and LiDAR barely improved model accuracy. The experiments on statistical features demonstrated that hyperspectral data contain redundancy for LAI estimation tasks; removing this redundant information not only reduces computational complexity but also improves estimation accuracy to some extent.
HSI achieved the highest accuracy, but at the same time, the acquisition and processing cost is the highest. In practical applications, it is essential to consider multiple factors, such as budget constraints and accuracy requirements, to determine the sensor to use. Different situations and projects may call for various data sources, and it is crucial to strike a balance between the available resources and the desired level of information.

Author Contributions

Conceptualization, T.Y. and J.Z.; methodology, T.Y.; formal analysis, T.Y.; data curation, J.F. and Y.W.; writing—original draft preparation, T.Y.; writing—review and editing, T.Y., J.Z. and Y.W.; supervision, Z.Z.; project administration, Z.Z.; funding acquisition, Z.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the USDA National Institute of Food and Agriculture, Hatch project 7002632.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Ezekiel, R.; Singh, N.; Sharma, S.; Kaur, A. Beneficial Phytochemicals in Potato—A Review. Food Res. Int. 2013, 50, 487–496. [Google Scholar] [CrossRef]
  2. Campos, H.; Ortiz, O. (Eds.) The Potato Crop: Its Agricultural, Nutritional and Social Contribution to Humankind; Springer International Publishing: Cham, Switzerland, 2020; ISBN 978-3-030-28682-8. [Google Scholar]
  3. Devaux, A.; Goffart, J.-P.; Kromann, P.; Andrade-Piedra, J.; Polar, V.; Hareau, G. The Potato of the Future: Opportunities and Challenges in Sustainable Agri-Food Systems. Potato Res. 2021, 64, 681–720. [Google Scholar] [CrossRef] [PubMed]
  4. Dong, T.; Liu, J.; Qian, B.; He, L.; Liu, J.; Wang, R.; Jing, Q.; Champagne, C.; McNairn, H.; Powers, J.; et al. Estimating Crop Biomass Using Leaf Area Index Derived from Landsat 8 and Sentinel-2 Data. ISPRS J. Photogramm. Remote Sens. 2020, 168, 236–250. [Google Scholar] [CrossRef]
  5. Simic Milas, A.; Romanko, M.; Reil, P.; Abeysinghe, T.; Marambe, A. The Importance of Leaf Area Index in Mapping Chlorophyll Content of Corn under Different Agricultural Treatments Using UAV Images. Int. J. Remote Sens. 2018, 39, 5415–5431. [Google Scholar] [CrossRef]
  6. Baez-Gonzalez, A.D.; Kiniry, J.R.; Maas, S.J.; Tiscareno, M.L.; Macias C., J.; Mendoza, J.L.; Richardson, C.W.; Salinas G., J.; Manjarrez, J.R. Large-Area Maize Yield Forecasting Using Leaf Area Index Based Yield Model. Agron. J. 2005, 97, 418–425. [Google Scholar] [CrossRef]
  7. Duchemin, B.; Hadria, R.; Erraki, S.; Boulet, G.; Maisongrande, P.; Chehbouni, A.; Escadafal, R.; Ezzahar, J.; Hoedjes, J.C.B.; Kharrou, M.H.; et al. Monitoring Wheat Phenology and Irrigation in Central Morocco: On the Use of Relationships between Evapotranspiration, Crops Coefficients, Leaf Area Index and Remotely-Sensed Vegetation Indices. Agric. Water Manag. 2006, 79, 1–27. [Google Scholar] [CrossRef]
  8. Liu, J.; Pattey, E.; Jégo, G. Assessment of Vegetation Indices for Regional Crop Green LAI Estimation from Landsat Images over Multiple Growing Seasons. Remote Sens. Environ. 2012, 123, 347–358. [Google Scholar] [CrossRef]
  9. Kamenova, I.; Dimitrov, P. Evaluation of Sentinel-2 Vegetation Indices for Prediction of LAI, FAPAR and FCover of Winter Wheat in Bulgaria. Eur. J. Remote Sens. 2021, 54, 89–108. [Google Scholar] [CrossRef]
  10. Deng, F.; Chen, J.M.; Plummer, S.; Chen, M.; Pisek, J. Algorithm for Global Leaf Area Index Retrieval Using Satellite Imagery. IEEE Trans. Geosci. Remote Sens. 2006, 44, 2219–2229. [Google Scholar] [CrossRef]
  11. Xiao, Z.; Liang, S.; Wang, J.; Xiang, Y.; Zhao, X.; Song, J. Long-Time-Series Global Land Surface Satellite Leaf Area Index Product Derived from MODIS and AVHRR Surface Reflectance. IEEE Trans. Geosci. Remote Sens. 2016, 54, 5301–5318. [Google Scholar] [CrossRef]
  12. Aslan, M.F.; Durdu, A.; Sabanci, K.; Ropelewska, E.; Gültekin, S.S. A Comprehensive Survey of the Recent Studies with UAV for Precision Agriculture in Open Fields and Greenhouses. Appl. Sci. 2022, 12, 1047. [Google Scholar] [CrossRef]
  13. Pandey, A.; Jain, K. An Intelligent System for Crop Identification and Classification from UAV Images Using Conjugated Dense Convolutional Neural Network. Comput. Electron. Agric. 2022, 192, 106543. [Google Scholar] [CrossRef]
  14. Bouguettaya, A.; Zarzour, H.; Kechida, A.; Taberkit, A.M. Deep Learning Techniques to Classify Agricultural Crops through UAV Imagery: A Review. Neural Comput. Appl. 2022, 34, 9511–9536. [Google Scholar] [CrossRef]
  15. Barbedo, J.G.A. A Review on the Use of Unmanned Aerial Vehicles and Imaging Sensors for Monitoring and Assessing Plant Stresses. Drones 2019, 3, 40. [Google Scholar] [CrossRef]
  16. Xie, T.; Li, J.; Yang, C.; Jiang, Z.; Chen, Y.; Guo, L.; Zhang, J. Crop Height Estimation Based on UAV Images: Methods, Errors, and Strategies. Comput. Electron. Agric. 2021, 185, 106155. [Google Scholar] [CrossRef]
  17. Feng, L.; Zhang, Z.; Ma, Y.; Du, Q.; Williams, P.; Drewry, J.; Luck, B. Alfalfa Yield Prediction Using UAV-Based Hyperspectral Imagery and Ensemble Learning. Remote Sens. 2020, 12, 2028. [Google Scholar] [CrossRef]
  18. Bendig, J.; Bolten, A.; Bennertz, S.; Broscheit, J.; Eichfuss, S.; Bareth, G. Estimating Biomass of Barley Using Crop Surface Models (CSMs) Derived from UAV-Based RGB Imaging. Remote Sens. 2014, 6, 10395–10412. [Google Scholar] [CrossRef]
  19. Kou, J.; Duan, L.; Yin, C.; Ma, L.; Chen, X.; Gao, P.; Lv, X. Predicting Leaf Nitrogen Content in Cotton with UAV RGB Images. Sustainability 2022, 14, 9259. [Google Scholar] [CrossRef]
  20. Xu, X.; Fan, L.; Li, Z.; Meng, Y.; Feng, H.; Yang, H.; Xu, B. Estimating Leaf Nitrogen Content in Corn Based on Information Fusion of Multiple-Sensor Imagery from UAV. Remote Sens. 2021, 13, 340. [Google Scholar] [CrossRef]
  21. Hammond, K.; Kerry, R.; Jensen, R.R.; Spackman, R.; Hulet, A.; Hopkins, B.G.; Yost, M.A.; Hopkins, A.P.; Hansen, N.C. Assessing Within-Field Variation in Alfalfa Leaf Area Index Using UAV Visible Vegetation Indices. Agronomy 2023, 13, 1289. [Google Scholar] [CrossRef]
  22. Cheng, Q.; Xu, H.; Fei, S.; Li, Z.; Chen, Z. Estimation of Maize LAI Using Ensemble Learning and UAV Multispectral Imagery under Different Water and Fertilizer Treatments. Agriculture 2022, 12, 1267. [Google Scholar] [CrossRef]
  23. Liao, K.; Li, Y.; Zou, B.; Li, D.; Lu, D. Examining the Role of UAV Lidar Data in Improving Tree Volume Calculation Accuracy. Remote Sens. 2022, 14, 4410. [Google Scholar] [CrossRef]
  24. Liu, K.; Shen, X.; Cao, L.; Wang, G.; Cao, F. Estimating Forest Structural Attributes Using UAV-LiDAR Data in Ginkgo Plantations. ISPRS J. Photogramm. Remote Sens. 2018, 146, 465–482. [Google Scholar] [CrossRef]
  25. Onishi, M.; Ise, T. Explainable Identification and Mapping of Trees Using UAV RGB Image and Deep Learning. Sci. Rep. 2021, 11, 903. [Google Scholar] [CrossRef]
  26. Li, B.; Xu, X.; Han, J.; Zhang, L.; Bian, C.; Jin, L.; Liu, J. The Estimation of Crop Emergence in Potatoes by UAV RGB Imagery. Plant Methods 2019, 15, 15. [Google Scholar] [CrossRef]
  27. Weiss, M.; Baret, F. Using 3D Point Clouds Derived from UAV RGB Imagery to Describe Vineyard 3D Macro-Structure. Remote Sens. 2017, 9, 111. [Google Scholar] [CrossRef]
  28. Yan, Y.; Yang, J.; Li, B.; Qin, C.; Ji, W.; Xu, Y.; Huang, Y. High-Resolution Mapping of Soil Organic Matter at the Field Scale Using UAV Hyperspectral Images with a Small Calibration Dataset. Remote Sens. 2023, 15, 1433. [Google Scholar] [CrossRef]
  29. Sun, Q.; Gu, X.; Chen, L.; Xu, X.; Wei, Z.; Pan, Y.; Gao, Y. Monitoring Maize Canopy Chlorophyll Density under Lodging Stress Based on UAV Hyperspectral Imagery. Comput. Electron. Agric. 2022, 193, 106671. [Google Scholar] [CrossRef]
  30. Tang, H.; Armston, J.; Hancock, S.; Marselis, S.; Goetz, S.; Dubayah, R. Characterizing Global Forest Canopy Cover Distribution Using Spaceborne Lidar. Remote Sens. Environ. 2019, 231, 111262. [Google Scholar] [CrossRef]
  31. ten Harkel, J.; Bartholomeus, H.; Kooistra, L. Biomass and Crop Height Estimation of Different Crops Using UAV-Based Lidar. Remote Sens. 2020, 12, 17. [Google Scholar] [CrossRef]
  32. Mulugeta Aneley, G.; Haas, M.; Köhl, K. LIDAR-Based Phenotyping for Drought Response and Drought Tolerance in Potato. Potato Res. 2022. [Google Scholar] [CrossRef]
  33. Zhang, Y.; Yang, Y.; Zhang, Q.; Duan, R.; Liu, J.; Qin, Y.; Wang, X. Toward Multi-Stage Phenotyping of Soybean with Multimodal UAV Sensor Data: A Comparison of Machine Learning Approaches for Leaf Area Index Estimation. Remote Sens. 2023, 15, 7. [Google Scholar] [CrossRef]
  34. Yan, P.; Han, Q.; Feng, Y.; Kang, S. Estimating LAI for Cotton Using Multisource UAV Data and a Modified Universal Model. Remote Sens. 2022, 14, 4272. [Google Scholar] [CrossRef]
  35. Yang, Q.; Ye, H.; Huang, K.; Zha, Y.; Shi, L. Estimation of Leaf Area Index of Sugarcane Using Crop Surface Model Based on UAV Image. Trans. Chin. Soc. Agric. Eng. 2017, 33, 104–111. [Google Scholar]
  36. Li, S.; Yuan, F.; Ata-UI-Karim, S.T.; Zheng, H.; Cheng, T.; Liu, X.; Tian, Y.; Zhu, Y.; Cao, W.; Cao, Q. Combining Color Indices and Textures of UAV-Based Digital Imagery for Rice LAI Estimation. Remote Sens. 2019, 11, 1763. [Google Scholar] [CrossRef]
  37. Feng, H.; Tao, H.; Li, Z.; Yang, G.; Zhao, C. Comparison of UAV RGB Imagery and Hyperspectral Remote-Sensing Data for Monitoring Winter Wheat Growth. Remote Sens. 2022, 14, 3811. [Google Scholar] [CrossRef]
  38. Tao, H.; Feng, H.; Xu, L.; Miao, M.; Long, H.; Yue, J.; Li, Z.; Yang, G.; Yang, X.; Fan, L. Estimation of Crop Growth Parameters Using UAV-Based Hyperspectral Remote Sensing Data. Sensors 2020, 20, 1296. [Google Scholar] [CrossRef]
  39. Gao, L.; Yang, G.; Yu, H.; Xu, B.; Zhao, X.; Dong, J.; Ma, Y. Retrieving Winter Wheat Leaf Area Index Based on Unmanned Aerial… Trans. Chin. Soc. Agric. Eng. 2016, 32, 113–120. [Google Scholar] [CrossRef]
  40. Ma, J.; Wang, L.; Chen, P. Comparing Different Methods for Wheat LAI Inversion Based on Hyperspectral Data. Agriculture 2022, 12, 1353. [Google Scholar] [CrossRef]
  41. Ma, Y.; Zhang, Q.; Yi, X.; Ma, L.; Zhang, L.; Huang, C.; Zhang, Z.; Lv, X. Estimation of Cotton Leaf Area Index (LAI) Based on Spectral Transformation and Vegetation Index. Remote Sens. 2022, 14, 136. [Google Scholar] [CrossRef]
  42. Luo, S.; Wang, C.; Xi, X.; Nie, S.; Fan, X.; Chen, H.; Yang, X.; Peng, D.; Lin, Y.; Zhou, G. Combining Hyperspectral Imagery and LiDAR Pseudo-Waveform for Predicting Crop LAI, Canopy Height and above-Ground Biomass. Ecol. Indic. 2019, 102, 801–812. [Google Scholar] [CrossRef]
  43. Yue, J.; Feng, H.; Jin, X.; Yuan, H.; Li, Z.; Zhou, C.; Yang, G.; Tian, Q. A Comparison of Crop Parameters Estimation Using Images from UAV-Mounted Snapshot Hyperspectral Sensor and High-Definition Digital Camera. Remote Sens. 2018, 10, 1138. [Google Scholar] [CrossRef]
  44. van der Meij, B.; Kooistra, L.; Suomalainen, J.; Barel, J.M.; De Deyn, G.B. Remote Sensing of Plant Trait Responses to Field-Based Plant–Soil Feedback Using UAV-Based Optical Sensors. Biogeosciences 2017, 14, 733–749. [Google Scholar] [CrossRef]
  45. Bradford, B.Z.; Colquhoun, J.B.; Chapman, S.A.; Gevens, A.J.; Groves, R.L.; Heider, D.J.; Nice, G.R.W.; Ruark, M.D.; Wang, Y. Commercial Vegetable Production in Wisconsin—2023; University of Wisconsin–Madison: Madison, WI, USA, 2023. [Google Scholar]
  46. Zhang, W.; Qi, J.; Wan, P.; Wang, H.; Xie, D.; Wang, X.; Yan, G. An Easy-to-Use Airborne LiDAR Data Filtering Method Based on Cloth Simulation. Remote Sens. 2016, 8, 501. [Google Scholar] [CrossRef]
  47. Arnqvist, J.; Freier, J.; Dellwik, E. Robust Processing of Airborne Laser Scans to Plant Area Density Profiles. Biogeosciences 2020, 17, 5939–5952. [Google Scholar] [CrossRef]
  48. Rouse, J.W.; Haas, R.H.; Deering, D.W.; Schell, J.A.; Harlan, J.C. Monitoring the Vernal Advancement and Retrogradation (Green Wave Effect) of Natural Vegetation; NASA: Washington, DC, USA, 1974.
  49. Jiang, Z.; Huete, A.R.; Didan, K.; Miura, T. Development of a Two-Band Enhanced Vegetation Index without a Blue Band. Remote Sens. Environ. 2008, 112, 3833–3845. [Google Scholar] [CrossRef]
  50. Dong, T.; Liu, J.; Shang, J.; Qian, B.; Ma, B.; Kovacs, J.M.; Walters, D.; Jiao, X.; Geng, X.; Shi, Y. Assessment of Red-Edge Vegetation Indices for Crop Leaf Area Index Estimation. Remote Sens. Environ. 2019, 222, 133–143. [Google Scholar] [CrossRef]
  51. Gitelson, A.A.; Gritz, Y.; Merzlyak, M.N. Relationships between Leaf Chlorophyll Content and Spectral Reflectance and Algorithms for Non-Destructive Chlorophyll Assessment in Higher Plant Leaves. J. Plant Physiol. 2003, 160, 271–282. [Google Scholar] [CrossRef]
  52. Wu, C.; Niu, Z.; Tang, Q.; Huang, W. Estimating Chlorophyll Content from Hyperspectral Vegetation Indices: Modeling and Validation. Agric. For. Meteorol. 2008, 148, 1230–1241. [Google Scholar] [CrossRef]
  53. Dash, J.; Curran, P.J. The MERIS Terrestrial Chlorophyll Index. Int. J. Remote Sens. 2004, 25, 5403–5413. [Google Scholar] [CrossRef]
  54. Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V.; et al. Scikit-Learn: Machine Learning in Python. J. Mach. Learn. Res. 2011, 12, 2825–2830. [Google Scholar]
  55. Gitelson, A.A.; Viña, A.; Ciganda, V.; Rundquist, D.C.; Arkebauer, T.J. Remote Estimation of Canopy Chlorophyll Content in Crops. Geophys. Res. Lett. 2005, 32. [Google Scholar] [CrossRef]
  56. Tian, Y.-C.; Gu, K.-J.; Chu, X.; Yao, X.; Cao, W.-X.; Zhu, Y. Comparison of Different Hyperspectral Vegetation Indices for Canopy Leaf Nitrogen Concentration Estimation in Rice. Plant Soil 2014, 376, 193–209. [Google Scholar] [CrossRef]
  57. Liang, L.; Huang, T.; Di, L.; Geng, D.; Yan, J.; Wang, S.; Wang, L.; Li, L.; Chen, B.; Kang, J. Influence of Different Bandwidths on LAI Estimation Using Vegetation Indices. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2020, 13, 1494–1502. [Google Scholar] [CrossRef]
  58. Zhang, F.; Hassanzadeh, A.; Kikkert, J.; Pethybridge, S.J.; van Aardt, J. Evaluation of Leaf Area Index (LAI) of Broadacre Crops Using UAS-Based LiDAR Point Clouds and Multispectral Imagery. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2022, 15, 4027–4044. [Google Scholar] [CrossRef]
  59. Jayaraj, P. Estimation of Leaf Area Index (LAI) in Maize Planting Experiments Using LiDAR and Hyperspectral Data Acquired from a UAV Platform. Master’s Thesis, Purdue University, West Lafayette, IN, USA, 2023. [Google Scholar]
  60. Dilmurat, K.; Sagan, V.; Moose, S. AI-Driven Maize Yield Forecasting Using Unmanned Aerial Vehicle-Based Hyperspectral and LiDAR Data Fusion. ISPRS Ann. Photogramm. Remote Sens. Spat. Inf. Sci. 2022, V-3–2022, 193–199. [Google Scholar] [CrossRef]
  61. Zhu, W.; Sun, Z.; Huang, Y.; Yang, T.; Li, J.; Zhu, K.; Zhang, J.; Yang, B.; Shao, C.; Peng, J.; et al. Optimization of Multi-Source UAV RS Agro-Monitoring Schemes Designed for Field-Scale Crop Phenotyping. Precis. Agric. 2021, 22, 1768–1802. [Google Scholar] [CrossRef]
  62. Barbosa, B.D.S.; Ferraz, G.A.e.S.; Costa, L.; Ampatzidis, Y.; Vijayakumar, V.; dos Santos, L.M. UAV-Based Coffee Yield Prediction Utilizing Feature Selection and Deep Learning. Smart Agric. Technol. 2021, 1, 100010. [Google Scholar] [CrossRef]
  63. Wu, J.; Zheng, D.; Wu, Z.; Song, H.; Zhang, X. Prediction of Buckwheat Maturity in UAV-RGB Images Based on Recursive Feature Elimination Cross-Validation: A Case Study in Jinzhong, Northern China. Plants 2022, 11, 3257. [Google Scholar] [CrossRef] [PubMed]
  64. Aslan, M.F. Comparative Analysis of CNN Models and Bayesian Optimization-Based Machine Learning Algorithms in Leaf Type Classification. Balk. J. Electr. Comput. Eng. 2023, 11, 13–24. [Google Scholar] [CrossRef]
  65. Tan, L.; Lu, J.; Jiang, H. Tomato Leaf Diseases Classification Based on Leaf Images: A Comparison between Classical Machine Learning and Deep Learning Methods. AgriEngineering 2021, 3, 542–558. [Google Scholar] [CrossRef]
  66. Zhong, L.; Hu, L.; Zhou, H. Deep Learning Based Multi-Temporal Crop Classification. Remote Sens. Environ. 2019, 221, 430–443. [Google Scholar] [CrossRef]
  67. Wang, Y.; Zhang, Z.; Feng, L.; Ma, Y.; Du, Q. A New Attention-Based CNN Approach for Crop Mapping Using Time Series Sentinel-2 Images. Comput. Electron. Agric. 2021, 184, 106090. [Google Scholar] [CrossRef]
Figure 1. Detail of Field Experiment.
Figure 2. Box Plots of LAI Distributions on Five Sampling Dates and Example RGB Images.
Figure 3. The LAI Distribution and Temporal Trends under Different N Rates. (a) Colomba; (b) Snowden.
Figure 4. Accuracy and Scatterplot of Different Feature Combinations from Three Data Sources. (a) LiDAR + RGB; (b) LiDAR + HSI; (c) RGB + HSI; (d) LiDAR + RGB + HSI.
Figure 5. Accuracy and Scatterplot of the Combination of VIs and Statistical Features.
Figure 6. Feature Importance of the Combination of VIs and Statistical Features.
Table 1. Nitrogen Application Treatments and Schedule.

| Name | Seasonal Total N Rate | Planting (23 April) | Emergence/Hilling (12 May) | Tuber Initiation (2 June) | Fertigation (30 June) | Fertigation (10 July) | Fertigation (20 July) | Fertigation (30 July) |
|------|-----|----|----|-----|----|----|----|----|
| C    | 37  | 37 | -  | -   | -  | -  | -  | -  |
| R1   | 287 | 37 | 85 | 165 | -  | -  | -  | -  |
| R2   | 287 | 37 | 85 | 30  | 34 | 34 | 34 | 34 |
| R3   | 392 | 37 | 85 | 134 | 34 | 34 | 34 | 34 |

Unit: kg/ha.
Table 2. Sensor Descriptions.

| Sensor | Description |
|--------|-------------|
| RGB camera | Sony Cyber-shot DSC-RX1R II; 42 MP full-frame sensor; 35 mm F2 lens |
| LiDAR unit | Velodyne VLP-16; 100 m range; 905 nm infrared (IR) lasers; dual returns |
| Hyperspectral scanner | Headwall Nano-Hyperspec; 274 bands with 2.2 nm spectral resolution (B1–B274); visible–near-infrared range (400–1000 nm) |
Table 3. Image Features Derived from Three Data Sources.

| Sensor | Name | Definition |
|--------|------|------------|
| LiDAR | MaxPlantHeight | H_max − H_min |
| LiDAR | H50th, H75th, H90th, H95th | The 50th, 75th, 90th, and 95th percentile height values |
| LiDAR | Canopy Volume | Sum(H_allpoints) |
| LiDAR | Canopy Cover | Number_non-ground / Number_allpoints |
| LiDAR | Plant Area Index | Plant area / ground surface area |
| RGB | R | Mean(DN_R) |
| RGB | G | Mean(DN_G) |
| RGB | B | Mean(DN_B) |
| RGB | Normalized_R | Mean(DN_R / (DN_R + DN_G + DN_B)) |
| RGB | Normalized_G | Mean(DN_G / (DN_R + DN_G + DN_B)) |
| RGB | Normalized_B | Mean(DN_B / (DN_R + DN_G + DN_B)) |
| Hyperspectral | NDVI | (R_NIR − R_Red) / (R_NIR + R_Red) |
| Hyperspectral | EVI2 | 2.5 × (R_NIR − R_Red) / (R_NIR + 2.4 × R_Red + 1) |
| Hyperspectral | CIrededge | R_NIR / R_RedEdge − 1 |
| Hyperspectral | CIgreen | R_NIR / R_Green − 1 |
| Hyperspectral | MSRrededge | (R_NIR / R_RedEdge − 1) / (R_NIR / R_RedEdge + 1)^(1/2) |
| Hyperspectral | MTCI | (R_NIR − R_RedEdge) / (R_RedEdge − R_Red) |
| Hyperspectral | Mean | Mean value of each band |
| Hyperspectral | Std | Standard deviation of each band |
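For concreteness, the following is a minimal sketch of how the six hyperspectral VIs in Table 3 can be computed, assuming per-plot mean reflectances have already been extracted at the green, red, red-edge, and NIR band positions. The function name and inputs are illustrative, not the authors' code.

```python
import numpy as np

def vegetation_indices(r_green, r_red, r_rededge, r_nir):
    """Six hyperspectral VIs from Table 3, computed from canopy
    reflectances (scalars or NumPy arrays of per-plot means)."""
    ndvi = (r_nir - r_red) / (r_nir + r_red)
    # Two-band EVI: 2.5 * (NIR - Red) / (NIR + 2.4 * Red + 1)
    evi2 = 2.5 * (r_nir - r_red) / (r_nir + 2.4 * r_red + 1)
    ci_rededge = r_nir / r_rededge - 1
    ci_green = r_nir / r_green - 1
    msr_rededge = (r_nir / r_rededge - 1) / np.sqrt(r_nir / r_rededge + 1)
    mtci = (r_nir - r_rededge) / (r_rededge - r_red)
    return {"NDVI": ndvi, "EVI2": evi2, "CIrededge": ci_rededge,
            "CIgreen": ci_green, "MSRrededge": msr_rededge, "MTCI": mtci}

# Example with plausible canopy reflectances (green, red, red edge, NIR):
print(vegetation_indices(0.08, 0.05, 0.20, 0.45))
```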
Table 4. The Names and Explanations of Parameters of Four Regression Models.

| Model | Hyperparameter | Explanation |
|-------|----------------|-------------|
| SVR | C | Regularization strength (squared L2 penalty) |
| SVR | gamma | Kernel coefficient of the RBF kernel |
| RFR | n_estimators | Number of trees |
| RFR | max_features | Number of features to consider when looking for the best split |
| RFR | min_samples_leaf | Minimum number of samples required at a leaf node |
| RFR | random_state | Pseudo-random number generator controlling the bootstrapping and feature sampling |
| HGBR | max_iter | Maximum number of iterations of the boosting process |
| HGBR | learning_rate | Learning rate (shrinkage) |
| HGBR | max_leaf_nodes | Maximum number of leaves for each tree |
| HGBR | random_state | Pseudo-random number generator controlling the subsampling in the binning process |
| PLSR | n_components | Number of components to keep |
| PLSR | tol | Tolerance used as the convergence criterion in the power method |
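The four regressors and their Table 4 hyperparameters map directly onto scikit-learn [54] estimators. Below is a minimal instantiation sketch; the numeric values are placeholders for illustration, not the tuned values from this study.

```python
from sklearn.svm import SVR
from sklearn.ensemble import RandomForestRegressor, HistGradientBoostingRegressor
from sklearn.cross_decomposition import PLSRegression

# Hyperparameter values below are placeholders, not the study's tuned settings.
models = {
    "SVR": SVR(kernel="rbf", C=10.0, gamma="scale"),
    "RFR": RandomForestRegressor(n_estimators=200, max_features="sqrt",
                                 min_samples_leaf=2, random_state=0),
    "HGBR": HistGradientBoostingRegressor(max_iter=200, learning_rate=0.1,
                                          max_leaf_nodes=31, random_state=0),
    "PLSR": PLSRegression(n_components=10, tol=1e-6),
}
# Each model is then fit on the feature matrix X and measured LAI y,
# e.g., models["RFR"].fit(X_train, y_train).
```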
Table 5. Comparison of Wavelengths of Searched Bands and Fixed Bands.

| Bands | VI | Green (nm) | Red (nm) | Red Edge (nm) | NIR (nm) | r | Ref. |
|-------|----|------------|----------|---------------|----------|---|------|
| Searched | NDVI | - | 677.968 | - | 826.143 | 0.830 | - |
| Searched | EVI2 | - | 708.930 | - | 773.065 | 0.823 | - |
| Searched | CIrededge | - | - | 726.622 | 901.337 | 0.736 | - |
| Searched | CIgreen | 551.908 | - | - | 914.606 | 0.585 | - |
| Searched | MSRrededge | - | - | 717.776 | 901.337 | 0.759 | - |
| Searched | MTCI | - | 631.525 | 713.353 | 797.393 | 0.549 | - |
| Fixed | NDVI | - | 670 (669.121) | - | 800 (799.604) | 0.820 | [43] |
| Fixed | EVI2 | - | 670 (669.121) | - | 800 (799.604) | 0.736 | [43] |
| Fixed | CIrededge | - | - | 710 (708.930) | 800 (799.604) | 0.565 | [55] |
| Fixed | CIgreen | 550 (549.696) | - | - | 800 (799.604) | 0.519 | [55] |
| Fixed | MSRrededge | - | - | 705 (704.507) | 750 (750.950) | 0.548 | [52] |
| Fixed | MTCI | - | 681 (680.179) | 708 (708.930) | 753 (753.161) | 0.505 | [53] |
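The "searched bands" rows imply an exhaustive two-band search that maximizes the correlation between each VI and measured LAI. A sketch of that idea, shown for NDVI only, assuming plot-level reflectance spectra and measured LAI; the wavelength search windows are illustrative assumptions, not the paper's stated ranges.

```python
import numpy as np
from scipy.stats import pearsonr

def search_ndvi_bands(spectra, wavelengths, lai):
    """Exhaustive two-band search for the red/NIR pair whose NDVI
    correlates best with measured LAI.
    spectra: (n_plots, n_bands) reflectance; lai: (n_plots,)."""
    red_idx = np.flatnonzero((wavelengths >= 600) & (wavelengths < 720))
    nir_idx = np.flatnonzero((wavelengths >= 720) & (wavelengths <= 1000))
    best = (None, None, 0.0)
    for i in red_idx:
        for j in nir_idx:
            ndvi = (spectra[:, j] - spectra[:, i]) / (spectra[:, j] + spectra[:, i])
            r, _ = pearsonr(ndvi, lai)
            if abs(r) > best[2]:
                best = (wavelengths[i], wavelengths[j], abs(r))
    return best  # (red wavelength, NIR wavelength, |r|)
```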
Table 6. Evaluation Accuracy of Single Source Features.

| Source | Evaluation Model | R2 | RMSE | MAE | Evaluation Time (s) |
|--------|------------------|----|------|-----|---------------------|
| RGB | RFR | 0.668 | 0.826 | 0.649 | 4.93 |
| RGB | SVR | 0.726 | 0.751 | 0.592 | 0.19 |
| RGB | HGBR | 0.627 | 0.876 | 0.685 | 1.75 |
| RGB | PLSR | 0.638 | 0.862 | 0.698 | 0.07 |
| LiDAR | RFR | 0.666 | 0.824 | 0.643 | 4.93 |
| LiDAR | SVR | 0.552 | 0.958 | 0.750 | 0.10 |
| LiDAR | HGBR | 0.633 | 0.865 | 0.683 | 1.75 |
| LiDAR | PLSR | 0.640 | 0.856 | 0.693 | 0.07 |
| HSI | RFR | 0.766 | 0.688 | 0.539 | 2.20 |
| HSI | SVR | 0.762 | 0.694 | 0.544 | 0.15 |
| HSI | HGBR | 0.764 | 0.691 | 0.554 | 2.68 |
| HSI | PLSR | 0.738 | 0.729 | 0.598 | 0.09 |
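The accuracy metrics in Tables 6 and 7 can be reproduced from model predictions with scikit-learn's standard metric functions; a minimal helper, assuming y_true and y_pred are 1-D arrays of measured and predicted LAI:

```python
import numpy as np
from sklearn.metrics import r2_score, mean_squared_error, mean_absolute_error

def evaluate(y_true, y_pred):
    """R2, RMSE, and MAE as reported in Tables 6 and 7."""
    return {"R2": r2_score(y_true, y_pred),
            "RMSE": np.sqrt(mean_squared_error(y_true, y_pred)),
            "MAE": mean_absolute_error(y_true, y_pred)}
```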
Table 7. Evaluation Accuracy of Original Statistical Features and Selected Features.

| Selection Method | Base Model | Evaluation Model | Number of Features | R2 | RMSE | Evaluation Time (s) |
|------------------|------------|------------------|--------------------|----|------|---------------------|
| RFECV | RFR | RFR | 44 | 0.766 | 0.686 | 2.34 |
| RFECV | SVR | SVR | 40 | 0.763 | 0.694 | 0.10 |
| RFECV | SVR | HGBR | 40 | 0.746 | 0.713 | 1.66 |
| RFECV | RFR | PLSR | 44 | 0.747 | 0.717 | 0.07 |
| - | - | RFR | 548 | 0.759 | 0.697 | 2.76 |
| - | - | SVR | 548 | 0.763 | 0.695 | 0.23 |
| - | - | HGBR | 548 | 0.750 | 0.712 | 14.70 |
| - | - | PLSR | 548 | 0.741 | 0.721 | 0.10 |
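For reference, a sketch of recursive feature elimination with cross-validation (RFECV) as used in Table 7, with RFR as the base model; the step, cv, and scoring settings, as well as the synthetic placeholder data, are assumptions, since they are not given in the table.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import RFECV

rng = np.random.default_rng(0)
X = rng.random((120, 548))   # placeholder for the 548 stacked statistical features
y = rng.random(120)          # placeholder for measured LAI

selector = RFECV(
    estimator=RandomForestRegressor(n_estimators=100, random_state=0),
    step=10,        # number of features removed per iteration (assumed)
    cv=5,           # cross-validation folds (assumed)
    scoring="r2",
)
X_selected = selector.fit_transform(X, y)
print(selector.n_features_, X_selected.shape)  # e.g., 548 -> 44 features in Table 7
```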