1. Introduction
The development of unmanned aerial vehicles (UAVs) and spectral cameras has led to a breakthrough in the field of remote sensing (RS) by enabling the collection of imagery at specific wavelengths with unprecedented spatial resolution [1,2]. Remotely sensed multispectral images have been used in precision agriculture to assess nitrogen [3] and chlorophyll content [4], growth, yield, health status, and disease [1,5]. To this end, several manufacturers have developed multispectral cameras specifically for UAV applications. Most cameras have been designed with a separate sensor for each band (examples are the MicaSense RedEdge, Tetracam MCA and DJI multispectral camera systems). These solutions have been developed with user friendliness in mind: they come with, or are compatible with, integrated global navigation satellite system (GNSS) receivers and irradiance sensors, and have built-in compatibility with the most common image processing software packages. A possible drawback is that high user-friendliness requirements have led to simplifications in the workflow. These simplifications affect the recommended data capture procedure as well as the image processing stage, possibly reducing the accuracy of the radiometric calibration [6]. This reduction in accuracy has, however, not yet been quantified.
Multispectral images need to be converted into reflectance data before they can be interpreted or used as input for calculating vegetation indices (VIs). This process, called radiometric calibration, has repeatedly been identified as one of the main technological barriers to using UAVs for remote sensing [7]. Many different approaches to radiometric calibration have been proposed, but only some are implemented in commercial software packages or are readily available through open source software [8,9,10,11]. Using commercial software has drawbacks: software developers usually limit users' insight into the specifics of the processing pipeline [12]. Furthermore, actively developed packages are continually updated, and each update can change (part of) the calibration workflow. This complicates the comparison of datasets processed with different versions or packages. Commercial software is, however, by far the fastest and easiest way to convert raw images into a usable reflectance map. Open source packages are usually developed with less emphasis on user friendliness and therefore require a higher level of expertise, raising the barrier to entry for potential users. They are also usually updated less frequently and are more likely to contain bugs or errors in their workflow. Unlike commercial software, however, they provide complete insight into the process. One example of an open source radiometric calibration solution is the Python package developed by MicaSense at https://github.com/micasense/imageprocessing (accessed on 24 February 2023) [10]. The scientific literature also describes several radiometric calibration methods that are not readily available (e.g., [13,14]). These methods require a custom implementation and may depend on specific sensors that are not available to all users. Intuitively, more complex methods require fewer assumptions than simplified methods and should therefore lead to more accurate calibrations. However, the extent of the improvement achievable with more complex algorithms remains unclear.
The performance of radiometric calibration methods also depends on the circumstances at the moment of data collection. Some methods might be well suited to dealing with varying illumination conditions but less suited to clear-sky conditions, or vice versa. For example, the MicaSense Dual Camera system comes with a DLS2 irradiance sensor, which measures the solar irradiance as well as the solar incidence angle. However, the manufacturer discourages applying corrections based on this sensor in clear-sky conditions. Few studies take the impact of meteorological conditions on method performance into account when comparing methods. In this article, we test and compare five different radiometric calibration methods, ranging from very user-friendly but possibly simplistic to more complex methods that require additional field measurements and processing effort. These methods are evaluated with probably the most widespread multispectral camera (the MicaSense RedEdge) and are tested on a sunny and an overcast dataset. Before providing the full description of the methods (Section 3) and the results (Section 4), we first give a more detailed background on radiometric calibration (Section 2).
5. Discussion
In this article, we compared the radiometric calibration performance of the two most commonly used commercial software packages, Pix4D Fields and Agisoft Metashape (the P4D-SP and AM-SP methods), with that of more advanced methods. As mentioned in the introduction, the commercial software packages require less expert knowledge than the other methods to convert raw images into usable reflectance maps. The downside of these solutions is that the user has little insight into, or control over, the process. In addition, our results show that the two packages differ in how they have implemented radiometric calibration, even though the raw datasets were identical and both developers reference the same methods in their manuals [8,9].
For both datasets, the AM-SP method clearly showed a higher bias than the P4D-SP method. The bias differences were consistent across the bands, except for the band at 705 nm, where the bias of the AM-SP method was lower than that of the P4D-SP method. It is, however, important to note that not just the calibration algorithm but also the mosaicking algorithm differs between the two packages. Visually, slight displacements of pixels were noticeable between the orthophotos generated by Pix4D Fields and Metashape, which explains a portion of the difference in accuracy between the two methods as well. However, as these displacements were smaller than one pixel width (<3 cm), this effect is likely to have been small. Our results suggest that Pix4D might currently be the better of the two packages regarding radiometric calibration. However, both packages are actively developed, and future updates could change this [30].
The ELM-MP method performed best of the tested methods under both conditions in terms of RMSE and bias. This indicates that more elaborate calibration methods do in fact improve calibration accuracy, because they account for more variables and require fewer assumptions than simplified methods. For example, by using reference images of the RRTs taken at mission height, we can correct for scattering and absorption by the atmosphere along the optical path between the surface and the sensor. Our results indicate that this improves the accuracy of the calibration workflow, at the cost of increased processing complexity. In our case, the flight height was relatively low (29 m). We expect that the advantage of imaging the RRTs from the dedicated mission height will grow with increasing flight altitude, provided the reference panels are large enough, or in conditions with more atmospheric scattering (e.g., in the presence of aerosols). Scattering and absorption effects are ignored when imaging a reference target at ground level, as was done for the P4D, AM and MS-SP methods. Another advantage is the use of multiple panels instead of just one. Multiple panels allow saturation in the panels to be detected, as saturation causes deviations from the linear relationship between at-sensor radiance and surface reflectance [35]. The use of multiple panels therefore adds accuracy at the cost of a more labour-intensive workflow in the field.
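The core of such a multi-panel empirical line calibration can be sketched in a few lines: fit a per-band linear relationship between the image values extracted over each panel and the panels' known reflectances, then use the fit residuals to spot a panel that deviates from the line (e.g., through saturation). A minimal sketch with hypothetical panel values; the function name and tolerance are our own and not part of any package:

```python
import numpy as np

def empirical_line_fit(panel_values, panel_reflectances, residual_tol=0.02):
    """Fit a per-band empirical line: reflectance = gain * value + offset.

    panel_values: mean image values (DN or at-sensor radiance) extracted
        over each reference panel in one band.
    panel_reflectances: known reflectance of each panel in that band.
    Returns the fitted (gain, offset) and a boolean flag per panel; a
    residual above residual_tol suggests a panel that deviates from the
    line, e.g., through saturation.
    """
    values = np.asarray(panel_values, dtype=float)
    refl = np.asarray(panel_reflectances, dtype=float)
    gain, offset = np.polyfit(values, refl, deg=1)
    residuals = refl - (gain * values + offset)
    flagged = np.abs(residuals) > residual_tol
    return gain, offset, flagged

# Hypothetical values for one band: six gray panels spanning 5-40%
# reflectance, imaged well below sensor saturation.
gain, offset, flagged = empirical_line_fit(
    [2000, 9000, 16000, 23000, 30000, 37000],
    [0.05, 0.12, 0.19, 0.26, 0.33, 0.40],
)
surface_reflectance = gain * 12000 + offset  # -> 0.15 for this fit
```

With a well-exposed set of panels the residuals are near zero; a saturated panel would stand out with a residual well above the tolerance, which is what makes the multi-panel setup self-checking.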
Still, as mentioned, the differences between most methods are small. For most applications, the calibration results from Pix4D Fields would be sufficiently accurate, and the added value of more complex calibration methods depends on the accuracy requirements of a given research question. Especially if vegetation indices are the intended end product, the more user-friendly solutions are suitable [42]. For more complex research questions involving the analysis of individual bands, or when combining datasets from different locations or acquired in different weather conditions, our study indicates that the ELM-MP method can lead to more reliable results.
One of the main sources of uncertainty in the P4D-SP, AM-SP and MS workflows is that the calibration panel needs to be imaged at ground level. We therefore hypothesized that applying the ELM with the six gray RRTs to orthophotos generated using the single RRT would improve accuracy. For the dataset acquired in clear-sky conditions, applying an additional ELM correction to the orthophoto produced by the AM-SP method removed a large amount of bias for all bands except the one at 740 nm. This was expected, since corrections based on the gray RRTs are determined from an image taken by the UAV at mission height and are assumed to be more accurate than the reference image taken at ground level. The ELM is especially suited to removing biases that may remain after the AM-SP method, since the added correction is highly homogeneous over the entire orthophoto. Its effect on the RMSE score is limited, since all samples are corrected equally. The results of the AM-MP method on the overcast dataset were, however, unexpected: there, the method added bias instead of reducing it. A possible explanation lies in the way orthophotos are calculated. A given pixel in an orthophoto generated by Metashape is extracted from an individual image. During a mapping mission, the UAV usually makes multiple passes over the RRTs. It is therefore possible that there is a time gap between the spectrometer measurements of the RRTs and the image(s) used for the targets in the final orthophoto. When conditions are unstable, a large difference in incoming irradiance is possible as well. When applying the ELM to individual images, as in the ELM-MP method, a representative image of the reference targets can be selected close in time to the spectrometer measurements.
Baugh and Groeneveld (2007) showed that the relationship between remotely sensed DN and surface reflectance is in fact linear, and therefore calibrations should remain reliable outside the reflectance range of the RRTs [21]. Still, Aasen et al. (2018) recommended using targets that cover the expected reflectance range of the subject(s) of interest [18]. Our darkest RRT had an average reflectance of 5% (Figure 3), which was higher than the vegetation reflectance in the VIS spectrum (Table A1), so most of the area of interest fell outside the RRTs' reflectance range. Using a darker panel might therefore further improve the calibrations.
For assessing the accuracy of radiometrically calibrated images, we used spectrometer measurements taken at ground level. Calibrating the spectrometer before a measurement took between 10 and 20 s. In overcast conditions, this time window is long enough for the irradiance to change substantially during the measurement process, or to otherwise affect the sensor calibration or measurement [43]. While care was taken to avoid measurements during periods of noticeably varying irradiation, it is impossible to be certain that varying irradiance did not affect the ground truth measurements, which might explain part of the added RMSE and bias in overcast conditions compared to clear-sky conditions. This reasoning applies to the measurements of the gray RRTs as well, possibly affecting the calibration procedure itself. A possible improvement to this workflow would be to use an irradiance sensor at ground level and correct the reference spectrometer measurements for changes in solar irradiance.
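Such an irradiance correction could, to a first approximation, rescale each apparent reflectance by the ratio of the ground-level irradiance logged at white-reference calibration time to that logged at measurement time. A minimal sketch of this hypothetical workflow (function and variable names are our own, not from any instrument software):

```python
import numpy as np

def correct_for_irradiance_drift(reflectance, e_at_calibration, e_at_measurement):
    """Rescale apparent reflectance for irradiance drift.

    A field spectrometer reports reflectance relative to a white
    reference measured at calibration time. If the irradiance changes
    between calibration and measurement, the apparent reflectance is
    biased by the same factor; logging the irradiance at both moments
    with a ground-level sensor allows a first-order correction
    (hypothetical workflow, not part of our actual setup).
    """
    reflectance = np.asarray(reflectance, dtype=float)
    ratio = float(e_at_calibration) / float(e_at_measurement)
    return reflectance * ratio

# If the irradiance dropped by 10% after calibration, the raw readings
# underestimate reflectance by the same factor.
corrected = correct_for_irradiance_drift([0.18, 0.36], 1000.0, 900.0)
```

In overcast conditions, where irradiance can change within the 10-20 s calibration window, even this first-order rescaling could remove a substantial part of the measurement bias.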
It is well known that cloud coverage alters the irradiance spectrum [44]. This has biological implications, as plant reflectance changes with varying cloud coverage [45]. This should be taken into account when comparing the datasets from overcast and clear-sky conditions. More recently, Mamaghani et al. (2019) showed that the reflectance of plants is less variable in overcast than in clear-sky conditions, likely due to an increased influence of shadows within the canopy under direct light [46,47,48]. Because of this, they recommended taking images in overcast or diffuse light conditions and using GSDs of 8 cm or higher, and found the absolute radiometric correction workflow to be sufficiently accurate. These results were obtained in a stationary setup with a MicaSense RedEdge-3 camera. In our case, radiometric corrections were less reliable in overcast than in clear-sky conditions. We therefore do not recommend following the advice of Mamaghani et al. when applying any of the radiometric calibration methods tested in this study to UAV missions, unless a highly reliable radiometric calibration method can be developed.
The different land cover classes were chosen so that our datasets covered a large range of intensities across the spectrum (Figure 11). The plant classes (beans, corn and grass) showed higher relative variations, due to their movement in the wind and the constantly changing shadows within the canopies. Shadow correction techniques could reduce these effects and make the method more reliable [49,50].
The AM-SP, MS and ELM-MP methods unexpectedly performed better with DLS2 corrections disabled in overcast conditions. Clearly, the sensor is highly sensitive to its orientation with respect to the sun. Correspondence with MicaSense revealed that these fluctuations are expected in stable weather conditions, and that disabling DLS2 corrections in such cases is the recommended practice. A method for correcting the raw DLS2 measurements is proposed on the MicaSense GitHub page, but as Figure 7 shows, it was not able to remove the orientation effect. A better correction for the relative tilt of the DLS2 measurements could lead to an improvement in accuracy [51]. However, the specifics of the different irradiance parameters in the image metadata are not disclosed, so they would have to be determined empirically. Alternatively, recently developed illumination estimation techniques that operate without irradiance sensors, based on image keypoints found during the photogrammetry process, could further improve the results [37,47,48].
Our hardware setup might also have induced some inaccuracies. MicaSense recommends ensuring that the up-welling radiance sensor (the camera) and the down-welling irradiance sensor (the DLS2) are always at reciprocal angles [18]. In our setup, however, the camera was mounted on a gimbal, ensuring that each image was taken pointing nadir. Because of this, we could not use the more accurate viewing angles determined during image stitching in the photogrammetry software and had to rely on the likely less accurate IMU of the DLS2. A solution could be to mount the irradiance sensor on a high-precision gimbal.
The resampling of the hyperspectral ground measurements was based on the available spectral information of the camera, namely the central wavelengths and bandwidths. Ideally, this would be done by convolving the spectral response curves with the hyperspectral measurements, as sensors are not equally sensitive to every wavelength within a given band [15]. However, this was not possible, as the spectral response curves of our camera are not disclosed by the manufacturer and we did not have access to specialized equipment to determine them ourselves.
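A commonly used middle ground between a flat average over the nominal bandwidth and the true (undisclosed) response curves is to approximate each band's response as a Gaussian built from the central wavelength and the bandwidth taken as full width at half maximum (FWHM). A sketch of that approximation; the helper is our own and the band values are nominal, for illustration only:

```python
import numpy as np

def resample_band(wavelengths, spectrum, center_nm, fwhm_nm):
    """Resample a hyperspectral measurement to one broad camera band.

    Approximates the unknown spectral response curve as a Gaussian
    centred on the band's central wavelength with the stated FWHM,
    then returns the response-weighted mean of the spectrum.
    """
    wl = np.asarray(wavelengths, dtype=float)
    spec = np.asarray(spectrum, dtype=float)
    # Convert FWHM to the standard deviation of a Gaussian.
    sigma = fwhm_nm / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    response = np.exp(-0.5 * ((wl - center_nm) / sigma) ** 2)
    return float(np.sum(response * spec) / np.sum(response))

# A hypothetical red band (668 nm centre, 14 nm FWHM) applied to a flat
# 25% reflectance spectrum sampled at 1 nm; any normalized response
# leaves a flat spectrum unchanged, which makes this a handy sanity check.
wl = np.arange(400.0, 901.0, 1.0)
band_value = resample_band(wl, np.full_like(wl, 0.25), 668.0, 14.0)  # -> 0.25
```

The Gaussian is only an assumption about the response shape; for spectra that vary slowly within a band it gives results close to a simple bandwidth average, while for steep spectral features (e.g., the red edge) the two can diverge noticeably.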
During the processing, we did not account for bidirectional reflectance distribution function (BRDF) effects or other scene reflectance effects, such as shadow or topography corrections, for several reasons. The main reason is that we wanted to focus on the radiometric calibration process itself. As these scene reflectance corrections would influence the intermediary results, comparing the different methods would become more challenging. Furthermore, the commercial solutions Pix4D Fields and Agisoft Metashape do not include these corrections in their workflows, so methods that do include BRDF corrections would have an advantage, assuming the corrections are accurate. An optimal calibration workflow would include such corrections, provided they are sufficiently reliable. This reliability is likely to depend on flight conditions.
While our study focused on a specific multispectral camera, our results concerning the use of multiple radiometric reference targets, the taking of reference images at mission altitude, and the effect of clouds on the reliability of reflectance maps can be extrapolated to other sensors. Several studies have determined the spectral consistency between different multispectral sensors [52,53,54]. While good correlations between the raw images of different sensors have been found, errors of 2–5% are expected, depending on the wavelength. Furthermore, the measuring altitude influences this error. This emphasizes the importance of good radiometric calibration.